Professional Documents
Culture Documents
Naive Set Theory Notes
Naive Set Theory Notes
Notation:
{} enclose a set.
{1, 2, 3} = {3, 2, 2, 1, 3} because a set is not defined by order or multiplicity.
{0, 2, 4, . . .} = {x|x is an even natural number} because two ways of writing
a set are equivalent.
is the empty set.
x A denotes x is an element of A.
N = {0, 1, 2, . . .} are the natural numbers.
Z = {. . . , 2, 1, 0, 1, 2, . . .} are the integers.
Q = {m
n |m, n Z and n 6= 0} are the rational numbers.
R are the real numbers.
Axiom 1.1. Axiom of Extensionality Let A, B be sets. If (x)x A iff x B
then A = B.
Definition 1.1 (Subset). Let A, B be sets. Then A is a subset of B, written
A B iff (x) if x A then x B.
Theorem 1.1. If A B and B A then A = B.
Proof. Let x be arbitrary.
Because A B if x A then x B
Because B A if x B then x A
Hence, x A iff x B, thus A = B.
Definition 1.2 (Union). Let A, B be sets. The Union A B of A and B is
defined by x A B if x A or x B.
Theorem 1.2. A (B C) = (A B) C
Proof. Let x be arbitrary.
x A (B C) iff x A or x B C
iff x A or (x B or x C)
iff x A or x B or x C
iff (x A or x B) or x C
iff x A B or x C
iff x (A B) C
Definition 1.3 (Intersection). Let A, B be sets. The intersection A B of A
and B is defined by a A B iff x A and x B
Theorem 1.3. A (B C) = (A B) (A C)
Proof. Let x be arbitrary. Then x A (B C) iff x A and x B C
iff x A and (x B or x C)
iff (x A and x B) or (x A and x C)
iff x A B or x A C
iff x (A B) (A C)
1
Definition 1.4 (Set Theoretic Difference). Let A, B be sets. The set theoretic
difference A \ B is defined by x A \ B iff x A and x 6 B.
Theorem 1.4. A \ (B C) = (A \ B) (A \ C)
Proof. Let x be arbitrary.
Then x A \ (B C) iff x A and x 6 B C
iff x A and not (x B or x C)
iff x A and (x 6 B and x 6 inC)
iff x A and x 6 B and x 6 inC
iff x A and x 6 B and x A and x 6 inC
iff x (A \ B) and x (A \ C)
iff x (A \ B) (A \ C)
Provisional definition of function: Let A, B be sets. Then f is a function
from A to B written f : A B if f assigns a unique element b B to each
a A.
But what is the meaning of assign?
Basic idea is to consider f : R R defined by f (x) = x2 . We shall identify
f with its graph. To generalize this to arbitrary sets A and B we first need the
concept of an ordered pair. IE, a mathematical object ha, bi satisfying
ha, bi = hc, di iff a = c and b = d. In particular, h1, 2i =
6 {1, 2}.
Definition 1.5 (Cartesian Product). If A, B are sets, then their Cartesian
Product A B = {ha, bi |a A, b B}
Definition 1.6 (Function). f is a function from A to B iff f A B and for
each a A, there exists a unique b B such that ha, bi f .
In this case, the unique value b is called the value of f at a, and we write
f (a) = b.
It only remains to define ha, bi in terms of set theory.
Definition 1.7 (Ordered Pair). ha, bi = {{a}, {a, b}}
Theorem 1.5. ha, bi = hc, di iff a = c and b = d.
Proof. Clearly if a = c and b = d then ha, bi = {{a}, {a, b}} = {{c}, {c, d}} =
hc, di
1. Suppose a = b. Then {{a}, {a, b}} = {{a}, {a, a}} = {{a}, {a}} = {{a}}
Since {{a}} = {{c}, {c, d}} we must have {a} = {c} and {a} = {c, d}. So
a = c = d, in particular, a = c and b = d.
2. Suppose c = d. Then similarly a = b = c so a = c and b = d.
3. Suppose a 6= b and c 6= d.
Since {{a}, {a, b}} = {{c}, {c, d}} we must have {a} = {c} or {a} = {c, d}.
The latter is clearly impossible, so a = c.
Similarly, {a, b} = {c, d} or {a, b} = {c}. The latter is clearly impossible,
and as a = c, then b = d.
2
Theorem 1.9. N Z
Proof. We can list the elements of Z: 0, 1, 1, 2, 2, . . .
Theorem 1.10. N Q
Proof. We proceed in two steps.
First, we prove that N Q+ = {q Q : q > 0} and consider the following
1
2
3
1
1
1
2
3
2
infinite matrix 1 .
2 %
2 .
3
13 % 23 .
3
Proceed through the matrix along the indicated route adding rational numbers to your list if they have not already occurred.
Second, we declare that N Q. In the first part, we showed that there exists
a bijection f : N Q+ hence we can list Q by 0, f (1), f (1), . . ..
Definition 1.14 (Power Set). If A is any set, then its power set is P(A) =
{B : B A}, so P({1, 2, . . . , n})is of size 2n .
Theorem 1.11 (Cantor). N 6 P(N)
Proof. This method of proof is called the diagonal argument.
We must show that there does not exist a bijection f : N P(N).
Let f : N P(N) be any function.
So, we shall prove that f is not a surjection. Hence, we must find a set
S N such that n N f (n) 6= S.
We do this via a time and motion study
For each n N we must make the nth decision. That is, if n S.
We perform the nth task, that is, ensure f (n) 6= S.
We make the nth decision so that it accomplishes the nth task, ie, n S iff
n 6 f (n).
Clearly, f (n) and S will differ on whether they contain n, thus, n
N f (n) 6= S, and so f is not a surjection.
In general, unattributed Theorems are due to Cantor.
Theorem 1.12. If A is any set, then A 6 P(A)
Proof. Suppose that f : A P(A) is any function. Consider the set S = {a
A : a 6 f (a)}.
Then, a S iff a 6 f (a), so f (a) = S. Hence, f is not a surjection.
Definition 1.15 (). Let A and B be sets.
A B iff there exists an injection f : A B.
A B iff A B and A 6 B
Theorem 1.13. For any set A, A P(A)
Proof. We can define an injection f : A P(A) by f (a) = {a}.
Thus, A P(A)
By Cantors Theorem, A 6 P(A), thus A P(A).
4
N
N
P(N)
f (0)+1
f (n)+1
, . . . , pn
, . . .}, where
n, then k(c) = c
We can define an injection f : (0, 1) (0, 1) (0, 1) such that f (r) = 12 , r .
Next we define an injection g : (0, 1) (0, 1) (0, 1) as follows.
If r = 0.r0 r1 . . . rn . . . and s = 0.s0 s1 . . . sn . . ., then,
g(hr, si) = 0.r0 s0 r1 s1 . . . rn sn . . ..
By the Cantor-Bernstein Theorem, (0, 1) (0, 1) (0, 1)
Definition 1.21 (Sym(N)). Sym(N) = {f NN : f is a bijection }
Theorem 1.28. Sym(N) P(N)
Proof. Since Sym(N) NN P(N), we have Sym(N) P(N).
Next, we define a function f : P(N) Sym(N) by S fS where fS (2n) =
2n + 1 and fS (2n + 1) = 2n if n S and fS (2n) = 2n and fS (2n + 1) = 2n + 1
otherwise.
This is an injection, and so P(N) Sym(N), so, by Cantor-Bernstein,
P(N) Sym(N)
Relations
11
Definition 2.12 (Isomorphism). Let hA, <i and hB, i be partial orders. A
map f : A B is an isomorphism if the following conditions are satisfied:
1. f is a bijection
2. (a, b A)a < b iff f (a) f (b)
In this case, we say that hA, <i an hB, i are isomorphic, and write hA, <i
=
hB, i
Theorem 2.4. hZ, <i
= hZ, >i
Proof. Define f : Z Z by f (z) = z, then a < b iff a > b iff f (a) > f (b).
Thus, f is an isomorphism
Notation - Let D be the strict divisibility relation on N+ . that is, aDb
a < b&a|b
Theorem 2.5. hN+ , Di
6 hP(N), i
=
Proof. N+ 6 P(N), so no bijection exists.
Theorem 2.6. hR \ {0}, <i
6 hR, <i
=
Proof. Suppose f : R \ {0} R is an isomorphism.
For each n 1 let rn = f ( n1 )
Then, r1 > r2 > . . . rk > . . . f (1)
Let s be the greatest lower bound of {rn : n 1}.
t R \ {0} such that f (t) = s
Clearly, t < 0, thus 2t > t
f ( 2t ) > s, hence n 1 such that rn < f ( 2t ).
But then 2t < n1 and f ( n1 ) < f ( 2t ), so a contradiction.
Definition 2.13. For each prime p, let Z[1/p] = { pan : a Z, n 0}
Definition 2.14 (Dense Linear Order without Endpoints). A linear order
hD, <i is a dense linear order without endpoints, or DLO, if the following hold:
1. if a, b D and a < b then c D such that a < c < b
2. if a D then b D such that a < b
3. if a D then c D such that c < a
Theorem 2.7. For each prime p, hZ[1/p], <i is a DLO.
Proof. Clearly hZ[1/p], <i is a linear order without endpoints.
Let a, b Z[1/p] with a < b. Then, we can express a = pcn and b =
Adjusting c or d if necessary, we can suppose a = pcn and b = pdn .
d
So then a = pcn < c+1
pn pn = b
1
So, let r = pcn + pn+1
= cp+1
pn+1 Z[1/p]
Then, a < r < b
12
d
pm .
Theorem 2.8. If hA, <i and hB, i are countable DLOs, then hA, <i
= hB, i
Proof. This is called the Back and Forth Argument.
Let A = {an : n N} and B = {bn : n N}
First define A0 = {a0 } and B0 = {b0 } and f0 = {ha0 , b0 i
Suppose inductively that we have constructed An , Bn , fn such that An is a
finite subset of A such that a0 , . . . , an An , Bn is similarly related to B, and
fn : An Bn is an order preserving bijection.
We construct An+1 , Bn+1 , fn+1 in two steps.
1. If an+1 An , then A0n = An , Bn0 = Bn and fn0 = fn .
If an+1
/ An , then say c0 < c1 < . . . , an+1 < . . . < cm where An =
{c0 , . . . , cm }
d B such that f (c0 ) < . . . < d < . . . < f (cm ), so we define A0n =
An {an+1 }, Bn0 = Bn {d} and fn0 = fn {han+1 , di
2. If bn+1 Bn0 , then An+1 = A0n , Bn+1 = Bn0 and fn+1 = fn0
Else, d0 < . . . < bn+1 < . . . < d` , where Bn0 = {d0 , . . . , d` }.
Then, e A such that (fn0 )1 (d0 ) < . . . < e < . . . < (fn0 )1 (d` )
And so, we define An+1 = A0n {e}, Bn+1 = Bn0 {bn+1 } and fn+1 =
fn0 {he, bn+a i}
Finally, let f = n fn , then f is an order preserving bijection between A
and B.
Corollary 2.9.
1. hQ, <i
= hQ \ {0}, <i
2. hZ[1/2], <i
= hZ[1/3], <i
3. If hD, <i is a countable DLO, then hD, <i
= hQ, <i
Definition 2.15 (S extends R). Suppose R and S are binary relations on the
set A. We say S extends R if R S
Theorem 2.10. If hA, i is a partial order, then there exists a binary relation
< extending such that hA, <i is a linear order.
Propositional Logic
Definition 3.1 (Formal Language). We define our formal alphabet as the following symbols
1. The sentence connectives , , , ,
2. The punctuation symbols (, )
13
14
4. (( )) = F if () = () = F
(( )) = T else.
5. (( )) = F if () = T and () = F
(( )) = T else.
6. (( )) = T if () = v()
(( )) = F else.
Definition 3.7 (Proper Initial Segment). If ha0 , . . . , an i is a finite sequence,
then a proper initial segment has the form ha0 , . . . , as i where 0 s < n
Lemma 3.2. Any proper initial segment of a wff has more left than right parentheses. Thus, no proper initial segment of a wff is a wff.
Proof. We argue by induction on the length n 1 of the wff .
First, suppose n = 1, then is a sentence symbol, say, As . As As has no
proper initial segments, the result holds vacuously.
Suppose n > 1, and the result holds for all wffs of length less than n.
Then, has the form ( ), ( ), (), ( ), ( ) for some shorter
wffs and .
Since all the cases are similar, we just consider the case when is ( ).
The proper initial segments are (,(0 where 0 is a not necessarily proper
initial segment of , (, ( 0 where 0 is a not necessarily proper initial
segment of .
So, using the inductive hypothesis and the previous theorem, we see that
the result also holds for .
Theorem 3.3 (Unique Readability Theorem). If is a wff of length greater
than one, then there exists exactly one way of expressing in the form (), (
), ( ), ( ), ( ) for shorter wffs , .
Proof. Suppose is both ( ) and ( ).
Then, ( ) = ( ), and so deleting the first parenthesis, ) = ).
Suppose 6= , then without loss of generality, is a proper initial segment
of .
By the lemma, is not a wff. This is a contradiction, so = .
Hence, deleting , , we have ) = ). So =
The other cases are similar.
From now on, we will use common sense in writing proofs, and may omit
parentheses when there is no ambiguity. These are not wffs, however, just
representations of them.
Definition 3.8 (Satisfiable). Let : L {T, F } be a truth assignment.
1. If is a wff, then satisfies if () = T .
2. If is a set of wffs, then satisfies if () = T for all
15
Theorem 3.8 (clearly false). If hA, <i is a countable infinite linear order then
A has a least element.
What goes wrong with an attempted proof by compactness? It turns out to
be impossible to translate this into a set of wffs.
Definition 3.15 (Distinct Representatives). Suppose that S is a set.
Suppose hSi : i Ii is an indexed collection of not necessarily distinct subsets
of S.
A system of distinct representatives is a choice of elements xi Si such that
if i 6= j then xi 6= xj .
Theorem 3.9 (Halls Matching Theorem 1935). Let S be any set and n N+ .
Let hs1 , . . . , sn i be an indexed collection of subsets of S.
Then, the necessary and sufficient condition for the existence of a system of
distinct representatives is
For any 1 k n and choice of k distinct indices i1 , . . . , ik , then we have
|Si1 . . . Sik | k
Theorem 3.10 (clearly false attempt at an infinite Halls theorem). As above,
but cross out all reference to n.
Counterexample is S = N.
Theorem 3.11 (Infinite Version of Halls Matching Theorem). Let S be any
set.
Let hS: i N+ i be an indexed collection of finite subsets of S.
Then, a necessary and sufficient condition for the existence of a system of
distinct representatives is for every 1 k and choice of k distinct indices,
i1 , . . . ik we have |Si1 . . . Sik | k
Proof. Clearly, if a system exists, then the condition holds.
Conversely, suppose the condition holds.
We work with the propositional language with sentence symbols Cn,x for
x Sn and n 1.
Let be the set of wffs of the following form
1. (Cn,x Cm,x ), x Sn Sm
2. (Cn,x Cn,y ), x 6= y Sn
3. For each n 1, letting Sn = {x1 , . . . xtn } we add Cn,x . . . Cn,xtn
By 2 and 3 we choose exactly one element from each set, and by 1 they are
distinct.
It only remains to check that is satisfiable.
By compactness, it is enough to check that is finitely satisfiable.
Let 0 be any finite subset of .
Let n1 , . . . nN be the finitely many indices mentioned in 0 .
18
20
S
Clearly w satisfies n = ` `
Thus, T is an infinite, finitely branching tree. By Konigs Lemma, there
exists an infinite branch B through T .
S
Let n B Levn (T ) and define = n n is a truth assignment which
satisfies .
4.1
Definition 4.8 (A structure for a first order language). A structure A for the
first order language L consists of
1. A nonempty set A, the universe of A
2. For each n-place predicate symbol P , an n-ary relation P A A A
. . . A = An
3. For each constant symbol c, an element cA A
4. For each n-ary function symbol, an n-ary operation f A : An A
24
nt
26
By compactness, is satisfiable.
Then, A satisfies Th(N) and yet s(x) = c > 1A + . . . + 1A for any number
{z
}
|
n
31
32
4.2
33
34
1. x(x = x)
2. (x)(y)(x = y y = x)
3. (x)(y)(z)((x = y y = z) x = z)
4. For each n-ary preducate symbol P , x1 , . . . , xn , y1 , . . . , yn ((x1 = y1
. . . xn = yn ) P x1 . . . xn = P y1 . . . yn )
5. For each n-ary function symbol f , x1 , . . . , xn , y1 , . . . yn ((x1 = y1
. . . xn = yn ) f x1 . . . xn = f y1 . . . yn )
Similarly, since is deductively closed and xy(x = y y = x) , we
have that if t1 , t2 are terms, then (t1 = t2 t2 = t1 ) .
Lemma 4.30. We construct a structure A for our language L + as follows:
Let T be the set of terms of L + and define a binary relation E on T by t1 Et2
iff (t1 = t2 ) .
Then E is an equivalence relation.
Proof. Let t T . Then (t = t) , and so tEt for all t, so E is reflexive.
Suppose that t1 Et2 . Then (t1 = t2 ) , and so (t2 = t1 ) , so t2 Et2 , so
E is symmetric.
Transitivity is similar.
For each t T , we define [t] = {s T |tEs}, then we define A = {[t]|t T }.
For any n-ary predicate symbol, we define an n-ary relation P A on A by
h[t1 ], . . . , [tn ]i P A iff P t1 , . . . , tn .
Lemma 4.31. P A is well defined.
Proof. Suppose that [s1 ] = [t1 ], . . . , [sn ] = [tn ]. We must show that P s1 , . . . , sn
iff P t1 , . . . , tn . By hypothesis (si = ti ) for each i.
Since ((si = ti . . . sn = tn ) (P s1 , . . . , sn P t1 , . . . , tn )) , the
result follows.
For each constant symbol c, we define cA = [c], and for each n-ary function symbol f , we define an n-ary operation f A on A by f A ([t1 ], . . . , [tn ]) =
[f t1 , . . . , tn ]. As above, f A is well-defined.
Finally, we define s : V A by s(x) = [x].
Lemma 4.32. For every term t, s(t) = [t].
Proof. By definition this is true when t is a constant symbol or a variable.
Suppose that t = f t1 , . . . , tn . By induction, we assume that s(ti ) = [ti ]
for each i. Thus s(f t1 , . . . , tn ) = f A (
s(t1 ), . . . , s(tn )) = f A ([t1 ], . . . , [tn ]) =
[f t1 , . . . , tn ].
Lemma 4.33 (Substitution Lemma). If the term t is substitutable for x in ,
then A |= tx [s] iff A |= [s(x|
s(t))].
35
36
Definition 4.22. Let A , B be structures for the first order language L . Then
A and B are elementarily equivalent, written A B iff for every sentence
L , A |= iff B |= .
Remark 4.5. A ' B implies A B, but the converse is false (eg, nonstandard arithmetic.)
Definition 4.23 (Complete Set). A consistent set of sentences is said to be
complete iff for every sentence , either T ` or T ` .
Definition 4.24 (Theory). A theory is a set of sentences. A consistant theory
is complete if it is as a set of sentences.
Theorem 4.38. Let T be a complete theory in a first order langague L . If A
and B are models of T , then A B.
Proof. Let be any sentence. Then T ` or T ` . Suppose that T ` . By
soundness if T is true then is true. So A |= and B |= .
On the other hand, if T ` , then by soundness we have A |= and
B |= .
Theorem 4.39 (Los-Vaught Test). Let T be a consistent theory in a countable
language L . Suppose
1. T has no finite models
2. If A are countably infinite models of T , then A ' B.
Then T is complete.
Proof. Suppose that T satisfies the two conditions. Assume for contradiction
that T is not complete. Then there is a sentence such that T 6` and T 6` .
By reduction ad absurdum, T {} and T {} are both consistent. BY
completeness, there exist countable models A and B of T {} and T {}
respectively. By the first property, they are countably infinite, and by the
second, A ' B, contradiction.
Corollary 4.40. TDLO is complete.
Proof. TDLO has no finite models. Also we have seen by the back and forth
argument that any two countable dense linear orders are isomorphism, and so
by L-V, we are done.
Corollary 4.41. hQ, <i hR, <i
Proof. Both are models of the complete theory TDLO .
The rationals are a linear order win which every conceivable finite configuration is realized. We next seek a graph theoretic analogue: the infinite random
graph.
37
Definition 4.25. For k 1, let Pk be the dfirst order sentence in the language
of graphs which says that if x1 , . . . , xk , y1 , . . . , yk are distinct vertices, there exists a vertex z such that z 6= xi , yi for all i, z is joined to xi for all i, and zis
not joined to yi for all i.
We define the theory TR to be the theory {(x)(y)(x 6= y)} {Pk |k 1}.
Theorem 4.42. TR is consistant.
Proof. By soundness, it is enough to show that it is satisfiable.
Consider the graph = hN, Ei where n is joind to m iff 2n appears in the
binary expansion of m, or vice versa.
We claim that satisfies Pk for all k 1. So let n1 , . . . , nk , m1 , . . . , mk
be distinct natural numbers. Let ` max{m1 , . . . , mk , n1 , . . . , nk }. Then z =
2n1 + . . . + 2nk + 2` is joined to all of n1 , . . . , nk , and none of the m1 , . . . , mk .
Question: Why is TR called the theory of the random graph?
Answer: Consider a graph = hN, E i defined as follows. For each pair
n 6= m, we toss a coin. If heads, we join, if tails we dont.
Proposition 4.43. is a model of TR with probability 1.
Proof. Suppose that x1 , . . . , xk , y1 , . . . , yk N are distinct. Fix some z
/F =
{x1 , . . . , xk , y1 , . . . , yk }. Then the probability that z works is 22k , and so the
probability that z doesnt work is 1 22k . Let N \ F = {z1 , . . . , zn , . . .}.
Then the probability that none of the z1 , . . . , zn works is (1 22k )n , and
limn (1 22k )n = 0, and so with probability 1, one of them will work.
Question: Why is TR claled the theory of THE random graph?
Theorem 4.44. If G, H are countably infinite models of TR , then G ' H.
Proof. By the back and forth argument.
Theorem 4.45. TR is complete.
Proof. L-V.
And now for something (apparently) completely different.
Well study the basics of finite random graph theory.
Definition 4.26. Gn is the set of all graphs with vertex set {1, . . . , n}.
Remark 4.6. Each G Gn is uniquely deterined by the edge set E P(T )
n
where T = {{k, `}|1 k < ` n}. Hence |Gn | = 2|T | = 2( 2 ) = 2n(n1)/2 .
Definition 4.27. If is any sentence in the language of graph theory, then
Probn () = |{GG|Gnn:G|=}|
|
This is the same probability as if we joined each pair of vertices independently
with probability 1/2.
38
Theorem 4.46 (Zero-One Law). Let be any sentence in the langauge of graph
theory. THen either limn Probn () = 1 or limn Probn () = 0.
We will first prove the following special case:
Theorem 4.47. For each k 1, limn Probn (Pk ) = 1.
Proof. Let n 2k + 1. Suppose that G Gn and G 6|= Pk . Then there exists
X = {x1 , . . . , xk , y1 , . . . , yk } such that no z in {1, . . . , n} \ X is joined to pre
2k n2k
cisely the first k. For fixed X, probability of this event is 1 21
.
So summing over all possible 2k subsets of X, we see that Probn (Pk ) =
n2k
n
1 2k
1
40