Mathematical Logic
(Math 570)
Lecture Notes
Lou van den Dries
Fall Semester 2007
Contents

1 Preliminaries
1.1 Mathematical Logic: a brief overview
1.2 Sets and Maps

2 Basic Concepts of Logic
2.1 Propositional Logic
2.2 Completeness for Propositional Logic
2.3 Languages and Structures
2.4 Variables and Terms
2.5 Formulas and Sentences
2.6 Models
2.7 Logical Axioms and Rules; Formal Proofs

3 The Completeness Theorem
3.1 Another Form of Completeness
3.2 Proof of the Completeness Theorem
3.3 Some Elementary Results of Predicate Logic
3.4 Equational Classes and Universal Algebra

4 Some Model Theory
4.1 Löwenheim–Skolem; Vaught's Test
4.2 Elementary Equivalence and Back-and-Forth
4.3 Quantifier Elimination
4.4 Presburger Arithmetic
4.5 Skolemization and Extension by Definition

5 Computability, Decidability, and Incompleteness
5.1 Computable Functions
5.2 The Church–Turing Thesis
5.3 Primitive Recursive Functions
5.4 Representability
5.5 Decidability and Gödel Numbering
5.6 Theorems of Gödel and Church
5.7 A more explicit incompleteness theorem
5.8 Undecidable Theories
Chapter 1
Preliminaries
We start with a brief overview of mathematical logic as covered in this course.
Next we review some basic notions from elementary set theory, which provides
a medium for communicating mathematics in a precise and clear way. In this
course we develop mathematical logic using elementary set theory as given,
just as one would do with other branches of mathematics, like group theory or
probability theory.
For more on the course material, see
Shoenfield, J. R., Mathematical Logic, Reading, Addison-Wesley, 1967.
For additional material in Model Theory we refer the reader to
Chang, C. C. and Keisler, H. J., Model Theory, New York, North-Holland, 1990,
Poizat, B., A Course in Model Theory, Springer, 2000,
and for additional material on Computability, to
Rogers, H., Theory of Recursive Functions and Effective Computability, McGraw-Hill, 1967.
1.1 Mathematical Logic: a brief overview
Aristotle identified some simple patterns in human reasoning, and Leibniz dreamt
of reducing reasoning to calculation. As a mathematical subject, however, logic
is relatively recent: the 19th-century pioneers were Bolzano, Boole, Cantor,
Dedekind, Frege, Peano, C. S. Peirce, and E. Schröder. From our perspective
we see their work as leading to boolean algebra, set theory, propositional logic,
and predicate logic, as clarifying the foundations of the natural and real number
systems, and as introducing suggestive symbolic notation for logical operations.
Also, their activity led to the view that logic + set theory can serve as a basis for
all of mathematics. This era did not produce theorems in mathematical logic
of any real depth,¹ but it did bring crucial progress of a conceptual nature,
and the recognition that logic as used in mathematics obeys mathematical rules
that can be made fully explicit.
In the period 1900–1950 important new ideas came from Russell, Zermelo,
Hausdorff, Hilbert, Löwenheim, Ramsey, Skolem, Lusin, Post, Herbrand, Gödel,
Tarski, Church, Kleene, Turing, and Gentzen. They discovered the first real
theorems in mathematical logic, with those of Gödel having a dramatic impact.
Hilbert (in Göttingen), Lusin (in Moscow), Tarski (in Warsaw and Berkeley),
and Church (in Princeton) had many students and collaborators, who made up
a large part of that generation and the next in mathematical logic. Most of
these names will be encountered again during the course.
The early part of the 20th century was also marked by the so-called

foundational crisis in mathematics.
A strong impulse for developing mathematical logic came from the attempts
during these times to provide solid foundations for mathematics. Mathematical
logic has now taken on a life of its own, and also thrives on many interactions
with other areas of mathematics and computer science.
In the second half of the last century, logic as pursued by mathematicians
gradually branched into four main areas: model theory, computability theory (or
recursion theory), set theory, and proof theory. The topics in this course are
part of the common background of mathematicians active in any of these areas.
What distinguishes mathematical logic within mathematics is that
statements about mathematical objects
are taken seriously as mathematical objects in their own right. More generally,
in mathematical logic we formalize (formulate in a precise mathematical way)
notions used informally by mathematicians such as:
• property
• statement (in a given language)
• structure
• truth (what it means for a given statement to be true in a given structure)
• proof (from a given set of axioms)
• algorithm
¹ In the case of set theory one could dispute this. Even so, the main influence of set theory
on the rest of mathematics was to enable simple constructions of great generality, like cartesian
products, quotient sets and power sets, and this involves only very elementary set theory.
Once we have mathematical definitions of these notions, we can try to prove
theorems about these formalized notions. If done with imagination, this process
can lead to unexpected rewards. Of course, formalization tends to caricature
the informal concepts it aims to capture, but no harm is done if this is kept
firmly in mind.
Example. The notorious Goldbach Conjecture asserts that every even integer
greater than 2 is a sum of two prime numbers. With the understanding that
the variables range over N = {0, 1, 2, . . .}, and that 0, 1, +, ·, < denote the
usual arithmetic operations and relations on N, this assertion can be expressed
formally as

(GC)  ∀x[(1+1 < x ∧ even(x)) → ∃p∃q(prime(p) ∧ prime(q) ∧ x = p+q)]

where even(x) abbreviates ∃y(x = y + y) and prime(p) abbreviates

1 < p ∧ ∀r∀s(p = r·s → (r = 1 ∨ s = 1)).
The expression GC is an example of a formal statement (also called a sentence)
in the language of arithmetic, which has symbols 0, 1, +, ·, < to denote arithmetic
operations and relations, in addition to logical symbols like =, ∧, ∨, ¬, →, ∀, ∃,
and variables x, y, z, p, q, r, s.
The Goldbach Conjecture asserts that this particular sentence GC is true in
the structure (N; 0, 1, +, ·, <). (No proof of the Goldbach Conjecture is known.)
It also makes sense to ask whether the sentence GC is true in the structure

(R; 0, 1, +, ·, <).

(It's not, as is easily verified. That the question makes sense does not mean
that it is of any interest.)
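As a throwaway computational aside (not part of the formal development), the finitely many small instances of GC are easy to check by brute force; verifying instances of course proves nothing about the sentence GC itself:

```python
# Brute-force check of Goldbach's assertion for small even numbers.
# An illustrative sketch only: it tests finitely many instances.

def is_prime(p: int) -> bool:
    """prime(p): 1 < p and p = r*s forces r = 1 or s = 1."""
    if p < 2:
        return False
    return all(p % r != 0 for r in range(2, int(p**0.5) + 1))

def goldbach_witness(x: int):
    """Return (p, q) with p, q prime and x = p + q, or None."""
    for p in range(2, x // 2 + 1):
        if is_prime(p) and is_prime(x - p):
            return (p, x - p)
    return None

# every even x with 2 < x <= 1000 has a witness
assert all(goldbach_witness(x) is not None for x in range(4, 1001, 2))
```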
A century of experience gives us confidence that all classical number-theoretic
results, old or new, proved by elementary methods or by sophisticated algebra
and analysis, can be proved from the Peano axioms for arithmetic.² However,
in our present state of knowledge, GC might be true in (N; 0, 1, +, ·, <), but not
provable from those axioms. (On the other hand, once you know what exactly
we mean by

provable from the Peano axioms,

you will see that if GC is provable from those axioms, then GC is true in
(N; 0, 1, +, ·, <), and that if GC is false in (N; 0, 1, +, ·, <), then its negation
¬GC is provable from those axioms.)
The point of this example is simply to make the reader aware of the notions
“true in a given structure” and “provable from a given set of axioms,” and their
difference. One objective of this course is to figure out the connections (and
disconnections) between these notions.
² Here we do not count as part of classical number theory some results like Ramsey's
Theorem that can be stated in the language of arithmetic, but are arguably more in the spirit
of logic and combinatorics.
Some highlights (1900–1950)
The results below are among the most frequently used facts of mathematical
logic. The terminology used in stating these results might be unfamiliar, but
that should change during the course. What matters is to get some preliminary
idea of what we are aiming for. As will become clear during the course, each of
these results has stronger versions, on which applications often depend, but in
this overview we prefer simple statements over strength and applicability.
We begin with two results that are fundamental in model theory. They
concern the notion of model of Σ where Σ is a set of sentences in a language
L. At this stage we only say by way of explanation that a model of Σ is a
mathematical structure in which all sentences of Σ are true. For example, if Σ
is the (infinite) set of axioms for fields of characteristic zero in the language of
rings, then a model of Σ is just a field of characteristic zero.

Theorem of Löwenheim and Skolem. If Σ is a countable set of sentences
in some language and Σ has a model, then Σ has a countable model.
Compactness Theorem (Gödel, Mal'cev). Let Σ be a set of sentences in some
language. Then Σ has a model if and only if each finite subset of Σ has a model.
The next result goes a little beyond model theory by relating the notion of
“model of Σ” to that of “provability from Σ”:
Completeness Theorem (Gödel, 1930). Let Σ be a set of sentences in some
language L, and let σ be a sentence in L. Then σ is provable from Σ if and only
if σ is true in all models of Σ.
In our treatment we shall obtain the first two theorems as byproducts of the
Completeness Theorem and its proof. In the case of the Compactness Theorem
this reflects history, but the theorem of Löwenheim and Skolem predates the
Completeness Theorem. The Löwenheim–Skolem and Compactness theorems
do not mention the notion of provability, and thus model theorists often prefer
to bypass Completeness in establishing these results; see for example Poizat's
book.
Here is an important early result on a specific arithmetic structure:
Theorem of Presburger and Skolem. Each sentence in the language of the
structure (Z; 0, 1, +, −, <) that is true in this structure is provable from the
axioms for ordered abelian groups with least positive element 1, augmented, for
each n = 2, 3, 4, . . ., by an axiom that says that for every a there is a b such that

a = nb or a = nb + 1 or . . . or a = nb + 1 + · · · + 1 (with n disjuncts in total).

Moreover, there is an algorithm that, given any sentence in this language as
input, decides whether this sentence is true in (Z; 0, 1, +, −, <).
Note that in (Z; 0, 1, +, −, <) we have not included multiplication among the
primitives; accordingly, nb stands for b + · · · + b (with n summands).
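The n-th extra axiom just says that every integer lies within distance n − 1 of a multiple of n. A quick sketch of that elementary fact (not, of course, of Presburger's decision procedure, which is far more involved):

```python
# For each n >= 2, every integer a can be written as a = n*b + r with
# 0 <= r < n; then the r-th disjunct of the n-th axiom holds for a.
# Python's divmod floors toward minus infinity, so r lands in range(n)
# even for negative a.

def witness(a: int, n: int) -> tuple[int, int]:
    """Return (b, r) with a == n*b + r and 0 <= r < n."""
    b, r = divmod(a, n)
    return b, r

for n in range(2, 6):
    for a in range(-20, 21):
        b, r = witness(a, n)
        assert a == n * b + r and 0 <= r < n
```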
When we do include multiplication, the situation changes dramatically:
Incompleteness and undecidability of arithmetic (Gödel–Church, 1930s).
One can construct a sentence in the language of arithmetic that is true in the
structure (N; 0, 1, +, ·, <), but not provable from the Peano axioms.
There is no algorithm that, given any sentence in this language as input,
decides whether this sentence is true in (N; 0, 1, +, ·, <).
Here “there is no algorithm” is used in the mathematical sense of
there cannot exist an algorithm,
not in the weaker colloquial sense of “no algorithm is known.” This theorem
is intimately connected with the clarification of notions like computability and
algorithm in which Turing played a key role.
In contrast to these incompleteness and undecidability results on (sufficiently
rich) arithmetic, we have

Tarski's theorem on the field of real numbers (1930–1950). Every sentence
in the language of arithmetic that is true in the structure

(R; 0, 1, +, ·, <)

is provable from the axioms for ordered fields augmented by the axioms

– every positive element is a square,
– every odd-degree polynomial has a zero.

There is also an algorithm that decides for any given sentence in this language
as input, whether this sentence is true in (R; 0, 1, +, ·, <).
1.2 Sets and Maps
We shall use this section as an opportunity to fix notations and terminologies
that are used throughout these notes, and throughout mathematics. In a few
places we shall need more set theory than we introduce here, for example,
ordinals and cardinals. The following little book is a good place to read about
these matters. (It also contains an axiomatic treatment of set theory starting
from scratch.)
Halmos, P. R., Naive set theory, New York, Springer, 1974.
In an axiomatic treatment of set theory as in the book by Halmos all assertions
about sets below are proved from a few simple axioms. In such a treatment the
notion of set itself is left undefined, but the axioms about sets are suggested
by thinking of a set as a collection of mathematical objects, called its elements
or members. To indicate that an object x is an element of the set A we write
x ∈ A, in words: x is in A (or: x belongs to A). To indicate that x is not in A we
write x ∉ A. We consider the sets A and B as the same set (notation: A = B)
if and only if they have exactly the same elements. We often introduce a set
via the bracket notation, listing or indicating inside the brackets its elements.
For example, {1, 2, 7} is the set with 1, 2, and 7 as its only elements. Note that
{1, 2, 7} = {2, 7, 1}, and {3, 3} = {3}: the same set can be described in many
different ways. Don't confuse an object x with the set {x} that has x as its
only element: for example, the object x = {0, 1} is a set that has exactly two
elements, namely 0 and 1, but the set {x} = {{0, 1}} has only one element,
namely x.
Here are some important sets that the reader has probably encountered
previously.
Examples.
(1) The empty set: ∅ (it has no elements).
(2) The set of natural numbers: N = {0, 1, 2, 3, . . .}.
(3) The set of integers: Z = {. . . , −2, −1, 0, 1, 2, . . .}.
(4) The set of rational numbers: Q.
(5) The set of real numbers: R.
(6) The set of complex numbers: C.
Remark. Throughout these notes m and n always denote natural numbers.
For example, “for all m ...” will mean “for all m ∈ N...”.
If all elements of the set A are in the set B, then we say that A is a subset of
B (and write A ⊆ B). Thus the empty set ∅ is a subset of every set, and each
set is a subset of itself. We often introduce a set A in our discussions by defining
A to be the set of all elements of a given set B that satisfy some property P.
Notation:

A := {x ∈ B : x satisfies P} (hence A ⊆ B).
Let A and B be sets. Then we can form the following sets:
(a) A ∪ B := {x : x ∈ A or x ∈ B} (union of A and B);
(b) A ∩ B := {x : x ∈ A and x ∈ B} (intersection of A and B);
(c) A \ B := {x : x ∈ A and x ∉ B} (difference of A and B);
(d) A × B := {(a, b) : a ∈ A and b ∈ B} (cartesian product of A and B).
Thus the elements of A × B are the so-called ordered pairs (a, b) with a ∈ A
and b ∈ B. The key property of ordered pairs is that we have (a, b) = (c, d) if
and only if a = c and b = d. For example, you may think of R × R as the set
of points (a, b) in the xy-plane of coordinate geometry.
We say that A and B are disjoint if A ∩ B = ∅, that is, they have no element
in common.
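Python's built-in set type implements these operations directly, which gives a quick sandbox for the definitions above (a throwaway illustration, not part of the formal development):

```python
A = {1, 2, 7}
B = {2, 3}

assert A | B == {1, 2, 3, 7}   # union
assert A & B == {2}            # intersection
assert A - B == {1, 7}         # difference

# cartesian product: the set of ordered pairs (a, b)
product = {(a, b) for a in A for b in B}
assert (1, 3) in product and len(product) == len(A) * len(B)

# A and B are not disjoint, since A & B is nonempty
assert not A.isdisjoint(B)
```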
Remark. In a definition such as we just gave: “We say that . . . if —,” the
meaning of “if” is actually “if and only if.” We committed a similar abuse
of language earlier in defining set inclusion by the phrase “If —, then we say
that . . . .” We shall continue such abuse, in accordance with tradition, but
only in similarly worded definitions. Also, we shall often write “iff” or “⇔” to
abbreviate “if and only if.”
Maps
Definition. A map is a triple f = (A, B, Γ) of sets A, B, Γ such that Γ ⊆ A × B
and for each a ∈ A there is exactly one b ∈ B with (a, b) ∈ Γ; we write f(a) for
this unique b, and call it the value of f at a (or the image of a under f).³ We
call A the domain of f, B the codomain of f, and Γ the graph of f.⁴ We
write f : A → B to indicate that f is a map with domain A and codomain B,
and in this situation we also say that f is a map from A to B.
Among the many synonyms of map are

mapping, assignment, function, operator, transformation.

Typically, “function” is used when the codomain is a set of numbers of some
kind, “operator” when the elements of domain and codomain are themselves
functions, and “transformation” is used in geometric situations where domain
and codomain are equal. (We use equal as synonym for the same or identical;
also coincide is a synonym for being the same.)
Examples.
(1) Given any set A we have the identity map 1_A : A → A defined by
1_A(a) = a for all a ∈ A.
(2) Any polynomial f(X) = a_0 + a_1 X + · · · + a_n X^n with real coefficients
a_0, . . . , a_n gives rise to a function x ↦ f(x) : R → R. We often use the
“maps to” symbol ↦ in this way to indicate the rule by which to each x
in the domain we associate its value f(x).
Definition. Given f : A → B and g : B → C we have a map g ◦ f : A → C
defined by (g ◦ f)(a) = g(f(a)) for all a ∈ A. It is called the composition of g
and f.
Definition. Let f : A → B be a map. It is said to be injective if for all a_1 ≠ a_2
in A we have f(a_1) ≠ f(a_2). It is said to be surjective if for each b ∈ B there
exists a ∈ A such that f(a) = b. It is said to be bijective (or a bijection) if it is
both injective and surjective. For X ⊆ A we put

f(X) := {f(x) : x ∈ X} ⊆ B (direct image of X under f).

(There is a notational conflict here when X is both a subset of A and an element
of A, but it will always be clear from the context when f(X) is meant to be the
direct image of X under f; some authors resolve the conflict by denoting this
direct image by f[X] or in some other way.) We also call f(A) = {f(a) : a ∈ A}
the image of f. For Y ⊆ B we put

f⁻¹(Y) := {x ∈ A : f(x) ∈ Y} ⊆ A (inverse image of Y under f).

Thus surjectivity of our map f is equivalent to f(A) = B.
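For maps between finite sets these properties can be tested by brute force. The sketch below (my own encoding, not from the notes) represents a map f : A → B as a Python dict whose key set is A:

```python
def is_injective(f: dict) -> bool:
    """No two distinct elements of the domain share a value."""
    return len(set(f.values())) == len(f)

def is_surjective(f: dict, codomain: set) -> bool:
    """Every element of the codomain occurs as a value f(a)."""
    return set(f.values()) == codomain

f = {1: 'a', 2: 'b', 3: 'a'}        # a map {1,2,3} -> {'a','b'}
assert not is_injective(f)          # since f(1) = f(3)
assert is_surjective(f, {'a', 'b'})

# direct and inverse images
X = {1, 2}
assert {f[x] for x in X} == {'a', 'b'}             # f(X)
assert {x for x in f if f[x] in {'a'}} == {1, 3}   # inverse image of {'a'}
```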
³ Sometimes we shall write fa instead of f(a) in order to cut down on parentheses.
⁴ Other words for “domain” and “codomain” are “source” and “target”, respectively.
If f : A → B is a bijection then we have an inverse map f⁻¹ : B → A given by

f⁻¹(b) := the unique a ∈ A such that f(a) = b.

Note that then f⁻¹ ◦ f = 1_A and f ◦ f⁻¹ = 1_B. Conversely, if f : A → B and
g : B → A satisfy g ◦ f = 1_A and f ◦ g = 1_B, then f is a bijection with f⁻¹ = g.
(The attentive reader will notice that we just introduced a potential conflict of
notation: for bijective f : A → B and Y ⊆ B, both the inverse image of Y
under f and the direct image of Y under f⁻¹ are denoted by f⁻¹(Y); no harm
is done, since these two subsets of A coincide.)
It follows from the definition of “map” that f : A → B and g : C → D are
equal (f = g) if and only if A = C, B = D, and f(x) = g(x) for all x ∈ A. We
say that g : C → D extends f : A → B if A ⊆ C, B ⊆ D, and f(x) = g(x) for
all x ∈ A.⁵
Definition. A set A is said to be finite if there exists n and a bijection
f : {1, . . . , n} → A.
Here we use {1, . . . , n} as a suggestive notation for the set {m : 1 ≤ m ≤ n}.
For n = 0 this is just ∅. If A is finite there is exactly one such n (although if
n > 1 there will be more than one bijection f : {1, . . . , n} → A); we call this
unique n the number of elements of A or the cardinality of A, and denote it by
|A|. A set which is not finite is said to be infinite.
Definition. A set A is said to be countably infinite if there is a bijection N → A.
It is said to be countable if it is either finite or countably infinite.
Example. The sets N, Z and Q are countably infinite, but the infinite set R
is not countably infinite. Every infinite set has a countably infinite subset.
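To see that Z is countably infinite, one explicit bijection N → Z alternates between nonnegative and negative integers; a small sketch:

```python
def nat_to_int(n: int) -> int:
    """A bijection N -> Z sending 0, 1, 2, 3, 4, ... to 0, -1, 1, -2, 2, ..."""
    return n // 2 if n % 2 == 0 else -(n + 1) // 2

# the first seven values hit every integer in [-3, 3] exactly once
values = [nat_to_int(n) for n in range(7)]
assert sorted(values) == [-3, -2, -1, 0, 1, 2, 3]
assert len(set(values)) == len(values)   # no repeats on this range
```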
One of the standard axioms of set theory, the Power Set Axiom, says:

For any set A, there is a set whose elements are exactly the subsets of A.

Such a set of subsets of A is clearly uniquely determined by A, is denoted
by P(A), and is called the power set of A. If A is finite, so is P(A), and
|P(A)| = 2^|A|. Note that a ↦ {a} : A → P(A) is an injective map. However,
there is no surjective map A → P(A):

Cantor's Theorem. Let S : A → P(A) be a map. Then the set

{a ∈ A : a ∉ S(a)} (a subset of A)

is not an element of S(A).
Proof. Suppose otherwise. Then {a ∈ A : a ∉ S(a)} = S(b) where b ∈ A.
Assuming b ∈ S(b) yields b ∉ S(b), a contradiction. Thus b ∉ S(b); but then
b ∈ S(b), again a contradiction. This concludes the proof.
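For finite sets the diagonal set in Cantor's argument can be computed explicitly. The sketch below (the particular map S is an arbitrary choice of mine) takes a map S : A → P(A), encoded as a dict of frozensets, and exhibits a subset of A missed by S:

```python
from itertools import chain, combinations

A = {0, 1, 2}
# some map S : A -> P(A)
S = {0: frozenset(), 1: frozenset({0, 1}), 2: frozenset({1, 2})}

# the diagonal set {a in A : a not in S(a)}
diagonal = frozenset(a for a in A if a not in S[a])
assert diagonal not in S.values()   # Cantor: S is never surjective

# indeed P(A) has 2^|A| = 8 elements, while S(A) has at most 3
powerset = [frozenset(c) for c in
            chain.from_iterable(combinations(sorted(A), r)
                                for r in range(len(A) + 1))]
assert len(powerset) == 2 ** len(A)
```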
⁵ We also say “g : C → D is an extension of f : A → B” or “f : A → B is a restriction of
g : C → D.”
Let I and A be sets. Then there is a set whose elements are exactly the maps
f : I → A, and this set is denoted by A^I. For I = {1, . . . , n} we also write A^n
instead of A^I. Thus an element of A^n is a map a : {1, . . . , n} → A; we usually
think of such an a as the n-tuple (a(1), . . . , a(n)), and we often write a_i instead
of a(i). So A^n can be thought of as the set of n-tuples (a_1, . . . , a_n) with each
a_i ∈ A. For n = 0 the set A^n has just one element, the empty tuple.
An n-ary relation on A is just a subset of A^n, and an n-ary operation on
A is a map from A^n into A. Instead of “1-ary” we usually say “unary”, and
instead of “2-ary” we can say “binary”. For example, {(a, b) ∈ Z² : a < b} is a
binary relation on Z, and integer addition is the binary operation (a, b) ↦ a + b
on Z.
Definition. {a_i}_{i∈I} or (a_i)_{i∈I} denotes a family of objects a_i indexed by the set
I, and is just a suggestive notation for a set {(i, a_i) : i ∈ I}, not to be confused
with the set {a_i : i ∈ I}. (There may be repetitions in the family, that is, it
may happen that a_i = a_j for distinct indices i, j ∈ I, but such repetition is not
reflected in the set {a_i : i ∈ I}. For example, if I = N and a_n = a for all n, then
{(i, a_i) : i ∈ I} = {(i, a) : i ∈ N} is countably infinite, but {a_i : i ∈ I} = {a}
has just one element.) For I = N we usually say “sequence” instead of “family”.
Given any family (A_i)_{i∈I} of sets (that is, each A_i is a set) we have a set

⋃_{i∈I} A_i := {x : x ∈ A_i for some i ∈ I},

the union of the family. If I is finite and each A_i is finite, then so is the union
above and

|⋃_{i∈I} A_i| ≤ Σ_{i∈I} |A_i|.

If I is countable and each A_i is countable then ⋃_{i∈I} A_i is countable.
Given any family (A_i)_{i∈I} of sets we have a set

∏_{i∈I} A_i := {(a_i)_{i∈I} : a_i ∈ A_i for all i ∈ I},

the product of the family. One axiom of set theory, the Axiom of Choice, is a
bit special, but we shall use it a few times. It says that for any family (A_i)_{i∈I}
of nonempty sets there is a family (a_i)_{i∈I} such that a_i ∈ A_i for all i ∈ I, that
is, ∏_{i∈I} A_i ≠ ∅.
Words
Definition. Let A be a set. Think of A as an alphabet of letters. A word of
length n on A is an n-tuple (a_1, . . . , a_n) of letters a_i ∈ A; because we think
of it as a word (string of letters) we shall write this tuple instead as a_1 . . . a_n
(without parentheses or commas). There is a unique word of length 0 on A,
called the empty word and written ε. Given a word a = a_1 . . . a_n of length n ≥ 1 on A,
the first letter (or first symbol) of a is by definition a_1, and the last letter (or
last symbol) of a is a_n. The set of all words on A is denoted A*:

A* = ⋃_n A^n (disjoint union).

Logical expressions like formulas and terms will be introduced later as words of
a special form on suitable alphabets. When A ⊆ B we can identify A* with a
subset of B*, and this will be done whenever convenient.
Definition. Given words a = a_1 . . . a_m and b = b_1 . . . b_n on A of length m and
n respectively, we define their concatenation ab ∈ A*:

ab = a_1 . . . a_m b_1 . . . b_n.

Thus ab is a word on A of length m + n. Concatenation is a binary operation
on A* that is associative: (ab)c = a(bc) for all a, b, c ∈ A*, with ε as two-sided
identity: εa = aε = a for all a ∈ A*, and with two-sided cancellation: for all
a, b, c ∈ A*, if ab = ac, then b = c, and if ac = bc, then a = b.
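Python strings over a fixed alphabet model A* faithfully, with + as concatenation and "" as the empty word, so the three laws above can be spot-checked directly (an illustration only; the laws themselves are proved by easy induction on length):

```python
a, b, c = "ab", "", "ba"   # words on the alphabet {'a', 'b'}; b is empty

assert (a + b) + c == a + (b + c)      # associativity
assert "" + a == a + "" == a           # empty word is a two-sided identity
assert len(a + c) == len(a) + len(c)   # lengths add under concatenation

# cancellation: ab == ac forces b == c (strings compare letterwise)
x, y, z = "q", "rs", "rs"
assert (x + y == x + z) == (y == z)
```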
Equivalence Relations and Quotient Sets
Given a binary relation R on a set A it is often more suggestive to write aRb
instead of (a, b) ∈ R.
Definition. An equivalence relation on a set A is a binary relation ∼ on A such
that for all a, b, c ∈ A:
(i) a ∼ a (reflexivity);
(ii) a ∼ b implies b ∼ a (symmetry);
(iii) (a ∼ b and b ∼ c) implies a ∼ c (transitivity).
Example. Given any n we have the equivalence relation “congruence modulo
n” on Z defined as follows: for any a, b ∈ Z we have

a ≡ b mod n ⟺ a − b = nc for some c ∈ Z.

For n = 0 this is just equality on Z.
Let ∼ be an equivalence relation on the set A. The equivalence class a_∼ of an
element a ∈ A is defined by a_∼ = {b ∈ A : a ∼ b} (a subset of A). For a, b ∈ A
we have a_∼ = b_∼ if and only if a ∼ b, and a_∼ ∩ b_∼ = ∅ if and only if a ≁ b. The
quotient set of A by ∼ is by definition the set of equivalence classes:

A/∼ = {a_∼ : a ∈ A}.

This quotient set is a partition of A, that is, it is a collection of pairwise disjoint
nonempty subsets of A whose union is A. (Collection is a synonym for set; we
use it here because we don't like to say “set of . . . subsets . . . ”.) Every partition
of A is the quotient set A/∼ for a unique equivalence relation ∼ on A. Thus
equivalence relations on A and partitions of A are just different ways to describe
the same situation.
In the previous example (congruence modulo n) the equivalence classes are
called congruence classes modulo n (or residue classes modulo n) and the
corresponding quotient set is often denoted Z/nZ.
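For a fixed modulus the residue classes can be computed on any finite window of Z; a small sketch showing that they do partition the window:

```python
n = 3
window = set(range(-6, 7))    # a finite piece of Z

# residue classes modulo n, restricted to the window
classes = [{a for a in window if a % n == r} for r in range(n)]

# pairwise disjoint, nonempty, and their union is the whole window
assert all(c for c in classes)
assert all(classes[i].isdisjoint(classes[j])
           for i in range(n) for j in range(i + 1, n))
assert set().union(*classes) == window

# a ≡ b mod n iff a - b is a multiple of n
assert all((a % n == b % n) == ((a - b) % n == 0)
           for a in window for b in window)
```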
Remark. Readers familiar with some abstract algebra will note that the
construction in the example above is a special case of a more general construction:
that of a quotient of a group with respect to a normal subgroup.
Posets
A partially ordered set (short: poset) is a pair (P, ≤) consisting of a set P and
a partial ordering ≤ on P, that is, ≤ is a binary relation on P such that for all
p, q, r ∈ P:
(i) p ≤ p (reflexivity);
(ii) if p ≤ q and q ≤ p, then p = q (antisymmetry);
(iii) if p ≤ q and q ≤ r, then p ≤ r (transitivity).
If in addition we have for all p, q ∈ P,
(iv) p ≤ q or q ≤ p,
then we say that ≤ is a linear order on P, or that (P, ≤) is a linearly ordered
set.⁶
Each of the sets N, Z, Q, R comes with its familiar linear order on it.
As an example, take any set A and its collection P(A) of subsets. Then

X ≤ Y :⟺ X ⊆ Y (for subsets X, Y of A)

defines a poset (P(A), ≤), also referred to as the power set of A ordered by
inclusion. This is not a linearly ordered set if A has more than one element.
Finite linearly ordered sets are determined “up to unique isomorphism” by their
size: if (P, ≤) is a linearly ordered set and |P| = n, then there is a unique map
ι : P → {1, . . . , n} such that for all p, q ∈ P we have: p ≤ q ⟺ ι(p) ≤ ι(q).
This map ι is a bijection.
Let (P, ≤) be a poset. Here is some useful notation. For x, y ∈ P we set

x ≥ y :⟺ y ≤ x,
x < y :⟺ y > x :⟺ x ≤ y and x ≠ y.

Note that (P, ≥) is also a poset. A least element of P is a p ∈ P such that p ≤ x
for all x ∈ P; a largest element of P is defined likewise, with ≥ instead of ≤.
Of course, P can have at most one least element; therefore we can refer to the
least element of P, if P has a least element; likewise, we can refer to the largest
element of P, if P has a largest element.

⁶ One also uses the term total order instead of linear order.
A minimal element of P is a p ∈ P such that there is no x ∈ P with x < p;
a maximal element of P is defined likewise, with > instead of <. If P has a
least element, then this element is also the unique minimal element of P; some
posets, however, have more than one minimal element. The reader might want
to prove the following result to get a feeling for these notions:

If P is finite and nonempty, then P has a maximal element, and there is a linear
order ≤′ on P that extends ≤ in the sense that

p ≤ q ⟹ p ≤′ q, for all p, q ∈ P.

(Hint: use induction on |P|.)
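The second claim is the familiar fact that every finite partial order has a linear extension; computationally this is a topological sort. A sketch for a small poset given by its "elements below" data (the encoding and example are my own):

```python
from graphlib import TopologicalSorter   # standard library, Python 3.9+

# the poset of subsets of {1, 2} ordered by inclusion, as a dict
# mapping each element to the set of its immediate predecessors
below = {
    frozenset():       set(),
    frozenset({1}):    {frozenset()},
    frozenset({2}):    {frozenset()},
    frozenset({1, 2}): {frozenset({1}), frozenset({2})},
}

linear = list(TopologicalSorter(below).static_order())

# the resulting linear order extends the partial order:
# p <= q in the poset implies p comes no later than q in `linear`
for q, preds in below.items():
    for p in preds:
        assert linear.index(p) < linear.index(q)
```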
Let X ⊆ P. A lower bound (respectively, upper bound) of X in P is an element l ∈
P (respectively, an element u ∈ P), such that l ≤ x for all x ∈ X (respectively,
x ≤ u for all x ∈ X).
We often tacitly consider X as a poset in its own right, by restricting the
given partial ordering of P to X. More precisely this means that we consider
the poset (X, ≤_X) where the partial ordering ≤_X on X is defined by

x ≤_X y ⟺ x ≤ y (x, y ∈ X).

Thus we can speak of least, largest, minimal, and maximal elements of a set
X ⊆ P, when the ambient poset (P, ≤) is clear from the context. For example,
when X is the collection of nonempty subsets of a set A and X is ordered by
inclusion, then the minimal elements of X are the singletons {a} with a ∈ A.
We call X a chain in P if (X, ≤_X) is linearly ordered.
Occasionally we shall use the following fact about posets (P, ≤).
Zorn's Lemma. Suppose P is nonempty and every nonempty chain in P has
an upper bound in P. Then P has a maximal element.
For a further discussion of Zorn’s Lemma and its proof using the Axiom of
Choice we refer the reader to Halmos’s book on set theory.
Chapter 2
Basic Concepts of Logic
2.1 Propositional Logic
Propositional logic is the fragment of logic where we construct new statements
from given statements using so-called connectives like “not”, “or” and “and”.
The truth value of such a new statement is then completely determined by the
truth values of the given statements. Thus, given any statements p and q, we
can form the three statements

¬p (the negation of p, pronounced as “not p”),
p ∨ q (the disjunction of p and q, pronounced as “p or q”),
p ∧ q (the conjunction of p and q, pronounced as “p and q”).

This leads to more complicated combinations like ¬(p ∧ (¬q)). We shall regard
¬p as true if and only if p is not true; also, p ∨ q is defined to be true if and
only if p is true or q is true (including the possibility that both are true), and
p ∧ q is deemed to be true if and only if p is true and q is true. Instead of “not
true” we also say “false”. We now introduce a formalism that makes this into
mathematics.
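These truth-value stipulations are exactly the boolean operations of any programming language; a throwaway sketch tabulating them:

```python
from itertools import product

# truth tables for "not", "or", "and" under the stipulations above
for p, q in product([True, False], repeat=2):
    assert (not p) == (p is False)
    assert (p or q) == (p is True or q is True)    # inclusive "or"
    assert (p and q) == (p is True and q is True)

# e.g. the combination ¬(p ∧ ¬q) is true except when p holds and q fails
assert [not (p and not q) for p, q in product([True, False], repeat=2)] \
       == [True, False, True, True]
```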
We start with the five distinct symbols

⊤ ⊥ ¬ ∨ ∧

to be thought of as true, false, not, or, and and, respectively. These symbols are
fixed throughout the course, and are called propositional connectives. In this
section we also fix a set A whose elements will be called propositional atoms (or
just atoms), such that no propositional connective is an atom. It may help the
reader to think of an atom a as a variable for which we can substitute arbitrary
statements, assumed to be either true or false.
A proposition on A is a word on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} that can
be obtained by applying the following rules:
(i) each atom a ∈ A (viewed as a word of length 1) is a proposition on A;
(ii) ⊤ and ⊥ (viewed as words of length 1) are propositions on A;
(iii) if p and q are propositions on A, then the concatenations ¬p, ∨pq and ∧pq
are propositions on A.
For the rest of this section “proposition” means “proposition on A”, and p, q, r
(sometimes with subscripts) will denote propositions.
Example. Suppose a, b, c are atoms. Then ∧∨¬ab¬c is a proposition. This
follows from the rules above: a is a proposition, so ¬a is a proposition, hence
∨¬ab as well; also ¬c is a proposition, and thus ∧∨¬ab¬c is a proposition.
We defined “proposition” using the suggestive but vague phrase “can be
obtained by applying the following rules”. The reader should take such an
informal description as shorthand for a completely explicit definition, which in
the case at hand is as follows:
A proposition is a word w on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} for which there
is a sequence w_1, . . . , w_n of words on that same alphabet, with n ≥ 1, such that
w = w_n and for each k ∈ {1, . . . , n}, either w_k ∈ A ∪ {⊤, ⊥} (where each element
in the last set is viewed as a word of length 1), or there are i, j ∈ {1, . . . , k − 1}
such that w_k is one of the concatenations ¬w_i, ∨w_i w_j, ∧w_i w_j.
We let Prop(A) denote the set of propositions.
Remark. Having the connectives ∨ and ∧ in front of the propositions they
“connect” rather than in between is called prefix notation or Polish notation.
This is theoretically elegant, but for the sake of readability we usually write p ∨ q
and p ∧ q to denote ∨pq and ∧pq respectively, and we also use parentheses and
brackets if this helps to clarify the structure of a proposition. So the proposition
in the example above could be denoted by [(¬a) ∨ b] ∧ (¬c), or even by (¬a ∨ b) ∧ ¬c,
since we shall agree that ¬ binds stronger than ∨ and ∧ in this informal way
of indicating propositions. Because of the informal nature of these conventions,
we don’t have to give precise rules for their use; it’s enough that each actual
use is clear to the reader.
The intended structure of a proposition—how we think of it as built up
from atoms via connectives—is best exhibited in the form of a tree, a two-dimensional
array, rather than as a one-dimensional string. Such trees, however,
occupy valuable space on the printed page, and are typographically demanding.
Fortunately, our “official” prefix notation does uniquely determine the intended
structure of a proposition: that is what the next lemma amounts to.
Lemma 2.1.1 (Unique Readability). If p has length 1, then either p = ⊤, or
p = ⊥, or p is an atom. If p has length > 1, then its first symbol is either ¬,
or ∨, or ∧. If the first symbol of p is ¬, then p = ¬q for a unique q. If the first
symbol of p is ∨, then p = ∨qr for a unique pair (q, r). If the first symbol of p
is ∧, then p = ∧qr for a unique pair (q, r).
(Note that we used here our convention that p, q, r denote propositions.) Only
the last two claims are worth proving in print; the others should require only a
moment’s thought. For now we shall assume this lemma without proof. At the
end of this section we establish more general results which are also needed later
in the course.
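Unique readability is exactly what makes prefix notation mechanically parsable. As an illustration (not part of the notes; the helper names and the ASCII stand-ins T, F, !, |, & for ⊤, ⊥, ¬, ∨, ∧ are my own), a short Python sketch that recovers the unique parse tree of a proposition:

```python
# Parse a proposition in prefix (Polish) notation into a nested tuple.
# ASCII stand-ins: 'T' for ⊤, 'F' for ⊥, '!' for ¬, '|' for ∨, '&' for ∧;
# atoms are lowercase letters. All names here are illustrative only.

ARITY = {'T': 0, 'F': 0, '!': 1, '|': 2, '&': 2}

def parse(word, i=0):
    """Return (tree, next_position) for the proposition starting at index i."""
    head = word[i]
    n = ARITY.get(head, 0)          # atoms have arity 0
    args, j = [], i + 1
    for _ in range(n):
        sub, j = parse(word, j)     # by unique readability this split is forced
        args.append(sub)
    node = (head, *args) if args else head
    return node, j

def parse_prop(word):
    tree, j = parse(word)
    assert j == len(word), "trailing symbols: not a proposition"
    return tree

# The example from the text: ∧∨¬ab¬c, i.e. (¬a ∨ b) ∧ ¬c.
print(parse_prop("&|!ab!c"))   # ('&', ('|', ('!', 'a'), 'b'), ('!', 'c'))
```

The parse never backtracks: because no proper prefix of a proposition is a proposition, each recursive call is forced, which is the computational content of the lemma.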
Remark. Rather than thinking of a proposition as a statement, it’s better
viewed as a function whose arguments and values are statements: replacing the
atoms in a proposition by specific mathematical statements like “2 × 2 = 4”,
“π² < 7”, and “every even integer > 2 is the sum of two prime numbers”, we
obtain again a mathematical statement.
We shall use the following notational conventions: p → q denotes ¬p ∨ q, and
p ↔ q denotes (p → q) ∧ (q → p). By recursion on n we define
p_1 ∨ . . . ∨ p_n :=
    ⊥    if n = 0,
    p_1    if n = 1,
    p_1 ∨ p_2    if n = 2,
    (p_1 ∨ . . . ∨ p_{n−1}) ∨ p_n    if n > 2.
Thus p ∨ q ∨ r stands for (p ∨ q) ∨ r. We call p_1 ∨ . . . ∨ p_n the disjunction of
p_1, . . . , p_n. The reason that for n = 0 we take this disjunction to be ⊥ is that
we want a disjunction to be true if and only if (at least) one of the disjuncts is
true.
Similarly, the conjunction p_1 ∧ . . . ∧ p_n of p_1, . . . , p_n is defined by replacing
everywhere ∨ by ∧ and ⊥ by ⊤ in the definition of p_1 ∨ . . . ∨ p_n.
Definition. A truth assignment is a map t : A → {0, 1}. We extend such a t
to t̂ : Prop(A) → {0, 1} by requiring
(i) t̂(⊤) = 1
(ii) t̂(⊥) = 0
(iii) t̂(¬p) = 1 − t̂(p)
(iv) t̂(p ∨ q) = max(t̂(p), t̂(q))
(v) t̂(p ∧ q) = min(t̂(p), t̂(q))
Note that there is exactly one such extension t̂ by unique readability. To simplify
notation we often write t instead of t̂. The array below is called a truth table.
It shows on each row below the top row how the two leftmost entries t(p) and
t(q) determine t(¬p), t(p ∨ q), t(p ∧ q), t(p → q) and t(p ↔ q).
p  q  ¬p  p ∨ q  p ∧ q  p → q  p ↔ q
0 0 1 0 0 1 1
0 1 1 1 0 1 0
1 0 0 1 0 0 0
1 1 0 1 1 1 1
Let t : A → {0, 1}. Note that t(p → q) = 1 if and only if t(p) ≤ t(q), and that
t(p ↔ q) = 1 if and only if t(p) = t(q).
Suppose a_1, . . . , a_n are the distinct atoms that occur in p, and we know how
p is built up from those atoms. Then we can compute in a finite number of steps
t(p) from t(a_1), . . . , t(a_n). In particular, t(p) = t′(p) for any t′ : A → {0, 1} such
that t(a_i) = t′(a_i) for i = 1, . . . , n.
Definition. We say that p is a tautology (notation: ⊨ p) if t(p) = 1 for all
t : A → {0, 1}. We say that p is satisfiable if t(p) = 1 for some t : A → {0, 1}.
Thus ⊤ is a tautology, and p ∨ ¬p, p → (p ∨ q) are tautologies for all p and
q. By the remark preceding the definition one can verify whether any given p
with exactly n distinct atoms in it is a tautology by computing 2^n numbers and
checking that these numbers all come out 1. (To do this accurately by hand is
already cumbersome for n = 5, but computers can handle somewhat larger n.
Fortunately, other methods are often efficient for special cases.)
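The brute-force 2^n check just described is easily mechanized. In the Python sketch below (the tuple encoding and every name are my own, not from the notes), propositions are nested tuples, atoms are strings, and 'T', 'F', '!', '|', '&' stand in for ⊤, ⊥, ¬, ∨, ∧:

```python
from itertools import product

# Propositions as nested tuples: atoms are strings, 'T'/'F' are ⊤/⊥,
# ('!', p), ('|', p, q), ('&', p, q) are ¬p, p ∨ q, p ∧ q.

def atoms(p):
    if isinstance(p, tuple):
        return set().union(*(atoms(q) for q in p[1:]))
    return set() if p in ('T', 'F') else {p}

def value(t, p):
    """The unique extension t-hat of the truth assignment t (a dict)."""
    if p == 'T': return 1
    if p == 'F': return 0
    if not isinstance(p, tuple): return t[p]
    if p[0] == '!': return 1 - value(t, p[1])
    if p[0] == '|': return max(value(t, p[1]), value(t, p[2]))
    if p[0] == '&': return min(value(t, p[1]), value(t, p[2]))

def is_tautology(p):
    """Check all 2^n assignments to the n atoms of p."""
    avars = sorted(atoms(p))
    return all(value(dict(zip(avars, bits)), p) == 1
               for bits in product((0, 1), repeat=len(avars)))

# p ∨ ¬p is a tautology; p ∧ ¬p is not (it is not even satisfiable).
print(is_tautology(('|', 'p', ('!', 'p'))))   # True
print(is_tautology(('&', 'p', ('!', 'p'))))   # False
```

The clauses of `value` mirror (i)–(v) of the definition above, and `is_tautology` is literally the 2^n-row truth-table check.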
Remark. Note that ⊨ p ↔ q iff t(p) = t(q) for all t : A → {0, 1}. We call p
equivalent to q if ⊨ p ↔ q. Note that “equivalent to” defines an equivalence
relation on Prop(A). The lemma below gives a useful list of equivalences. We
leave it to the reader to verify them.
Lemma 2.1.2. For all p, q, r we have the following equivalences:
(1) ⊨ (p ∨ p) ↔ p, ⊨ (p ∧ p) ↔ p
(2) ⊨ (p ∨ q) ↔ (q ∨ p), ⊨ (p ∧ q) ↔ (q ∧ p)
(3) ⊨ (p ∨ (q ∨ r)) ↔ ((p ∨ q) ∨ r), ⊨ (p ∧ (q ∧ r)) ↔ ((p ∧ q) ∧ r)
(4) ⊨ (p ∨ (q ∧ r)) ↔ ((p ∨ q) ∧ (p ∨ r)), ⊨ (p ∧ (q ∨ r)) ↔ ((p ∧ q) ∨ (p ∧ r))
(5) ⊨ (p ∨ (p ∧ q)) ↔ p, ⊨ (p ∧ (p ∨ q)) ↔ p
(6) ⊨ (¬(p ∨ q)) ↔ (¬p ∧ ¬q), ⊨ (¬(p ∧ q)) ↔ (¬p ∨ ¬q)
(7) ⊨ (p ∨ ¬p) ↔ ⊤, ⊨ (p ∧ ¬p) ↔ ⊥
(8) ⊨ ¬¬p ↔ p
Items (1), (2), (3), (4), (5), and (6) are often referred to as the idempotent
law, commutativity, associativity, distributivity, the absorption law, and the De
Morgan law, respectively. Note the left-right symmetry in (1)–(7): the so-called
duality of propositional logic. We shall return to this issue in the more algebraic
setting of boolean algebras.
Some notation: let (p_i)_{i∈I} be a family of propositions with finite index set
I, choose a bijection k ↦ i(k) : {1, . . . , n} → I and set

⋁_{i∈I} p_i := p_{i(1)} ∨ · · · ∨ p_{i(n)},    ⋀_{i∈I} p_i := p_{i(1)} ∧ · · · ∧ p_{i(n)}.

If I is clear from context we just write ⋁_i p_i and ⋀_i p_i instead. Of course, the
notations ⋁_{i∈I} p_i and ⋀_{i∈I} p_i can only be used when the particular choice of
bijection of {1, . . . , n} with I does not matter; this is usually the case, because
the equivalence class of p_{i(1)} ∨ · · · ∨ p_{i(n)} does not depend on this choice, and
the same is true for the equivalence class of p_{i(1)} ∧ · · · ∧ p_{i(n)}.
Next we deﬁne “model of Σ” and “tautological consequence of Σ”.
Definition. Let Σ ⊆ Prop(A). By a model of Σ we mean a truth assignment
t : A → {0, 1} such that t(p) = 1 for all p ∈ Σ. We say that a proposition p is
a tautological consequence of Σ (written Σ ⊨ p) if t(p) = 1 for every model t of
Σ. Note that ⊨ p is the same as ∅ ⊨ p.
Lemma 2.1.3. Let Σ ⊆ Prop(A) and p, q ∈ Prop(A). Then
(1) Σ ⊨ p ∧ q ⇐⇒ Σ ⊨ p and Σ ⊨ q.
(2) Σ ⊨ p =⇒ Σ ⊨ p ∨ q.
(3) Σ ∪ {p} ⊨ q ⇐⇒ Σ ⊨ p → q.
(4) If Σ ⊨ p and Σ ⊨ p → q, then Σ ⊨ q. (Modus Ponens.)
Proof. We will prove (3) here and leave the rest as an exercise.
(⇒) Assume Σ ∪ {p} ⊨ q. To derive Σ ⊨ p → q we consider any model
t : A → {0, 1} of Σ, and need only show that then t(p → q) = 1. If t(p) = 1
then t(Σ ∪ {p}) ⊆ {1}, hence t(q) = 1 and thus t(p → q) = 1. If t(p) = 0 then
t(p → q) = 1 by definition.
(⇐) Assume Σ ⊨ p → q. To derive Σ ∪ {p} ⊨ q we consider any model
t : A → {0, 1} of Σ ∪ {p}, and need only derive that t(q) = 1. By assumption
t(p → q) = 1, and in view of t(p) = 1, this gives t(q) = 1 as required.
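For a finite Σ over finitely many atoms, tautological consequence is likewise decidable by enumerating assignments. A self-contained sketch (the encoding and all names are mine; p → q is expanded as ¬p ∨ q per our convention):

```python
from itertools import product

# Nested-tuple propositions: ('!', p) is ¬p, ('|', p, q) is p ∨ q,
# ('&', p, q) is p ∧ q; imp(p, q) writes out p → q as ¬p ∨ q.

def imp(p, q): return ('|', ('!', p), q)

def value(t, p):
    if not isinstance(p, tuple):
        return {'T': 1, 'F': 0}.get(p, t.get(p))
    op, args = p[0], [value(t, q) for q in p[1:]]
    return {'!': lambda a: 1 - a, '|': max, '&': min}[op](*args)

def consequence(sigma, p, atom_list):
    """Sigma |= p over the given (finite) list of atoms:
    every model of Sigma is a model of p."""
    for bits in product((0, 1), repeat=len(atom_list)):
        t = dict(zip(atom_list, bits))
        if all(value(t, q) == 1 for q in sigma) and value(t, p) != 1:
            return False
    return True

# Item (4), modus ponens, semantically: {p, p -> q} |= q.
print(consequence(['p', imp('p', 'q')], 'q', ['p', 'q']))        # True
# Item (3): Sigma ∪ {p} |= q  iff  Sigma |= p -> q, with Sigma = {p}... here ∅:
print(consequence(['p'], 'q', ['p', 'q']) ==
      consequence([], imp('p', 'q'), ['p', 'q']))                # True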
We ﬁnish this section with the promised general result on unique readability.
We also establish facts of similar nature that are needed later.
Definition. Let F be a set of symbols with a function a : F → N (called the
arity function). A symbol f ∈ F is said to have arity n if a(f) = n. A word on
F is said to be admissible if it can be obtained by applying the following rules:
(i) If f ∈ F has arity 0, then f viewed as a word of length 1 is admissible.
(ii) If f ∈ F has arity m > 0 and t_1, . . . , t_m are admissible words on F, then
the concatenation f t_1 . . . t_m is admissible.
Below we just write “admissible word” instead of “admissible word on F”. Note
that the empty word is not admissible, and that the last symbol of an admissible
word cannot be of arity > 0.
Example. Take F = A ∪ {⊤, ⊥, ¬, ∨, ∧} and define arity : F → N by
arity(x) = 0 for x ∈ A ∪ {⊤, ⊥}, arity(¬) = 1, arity(∨) = arity(∧) = 2.
Then the set of admissible words is just Prop(A).
Lemma 2.1.4. Let t_1, . . . , t_m and u_1, . . . , u_n be admissible words and w any
word on F such that t_1 . . . t_m w = u_1 . . . u_n. Then m ≤ n, t_i = u_i for i =
1, . . . , m, and w = u_{m+1} . . . u_n.
Proof. By induction on the length of u_1 . . . u_n. If this length is 0, then m = n =
0 and w is the empty word. Suppose the length is > 0, and assume the lemma
holds for smaller lengths. Note that n > 0. If m = 0, then the conclusion of the
lemma holds, so suppose m > 0. The first symbol of t_1 equals the first symbol
of u_1. Say this first symbol is h ∈ F with arity k. Then t_1 = h a_1 . . . a_k and
u_1 = h b_1 . . . b_k where a_1, . . . , a_k and b_1, . . . , b_k are admissible words. Cancelling
the first symbol h gives

a_1 . . . a_k t_2 . . . t_m w = b_1 . . . b_k u_2 . . . u_n.
(Caution: any of k, m − 1, n − 1 could be 0.) We have length(b_1 . . . b_k u_2 . . . u_n) =
length(u_1 . . . u_n) − 1, so the induction hypothesis applies. It yields k + m − 1 ≤
k + n − 1 (so m ≤ n), a_1 = b_1, . . . , a_k = b_k (so t_1 = u_1), t_2 = u_2, . . . , t_m = u_m,
and w = u_{m+1} . . . u_n.
Here are two immediate consequences that we shall use:
1. Let t_1, . . . , t_m and u_1, . . . , u_n be admissible words such that t_1 . . . t_m =
u_1 . . . u_n. Then m = n and t_i = u_i for i = 1, . . . , m.
2. Let t and u be admissible words and w a word on F such that tw = u.
Then t = u and w is the empty word.
Lemma 2.1.5 (Unique Readability).
Each admissible word equals f t_1 . . . t_m for a unique tuple (f, t_1, . . . , t_m) where
f ∈ F has arity m and t_1, . . . , t_m are admissible words.
Proof. Suppose f t_1 . . . t_m = g u_1 . . . u_n where f, g ∈ F have arity m and n
respectively, and t_1, . . . , t_m, u_1, . . . , u_n are admissible words on F. We have to
show that then f = g, m = n and t_i = u_i for i = 1, . . . , m. Observe first that
f = g, since f and g are the first symbols of two equal words. After cancelling
the first symbol of both words, the first consequence of the previous lemma leads
to the desired conclusion.
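The computation behind these lemmas (scan left to right, keeping count of how many admissible subwords are still owed) can be sketched in a few lines; the particular F, its arities, and all names below are invented for illustration:

```python
# Greedy scanning of admissible words over a set F with an arity function,
# in the spirit of Lemmas 2.1.4-2.1.6. Symbols are single characters; the
# sample F below has a binary f, a unary g, and two constants c, d.

ARITY = {'f': 2, 'g': 1, 'c': 0, 'd': 0}

def span(w, i):
    """Length of the unique admissible word occurring in w at position i
    (0-based here; the text counts positions from 1). Existence and
    uniqueness are Lemma 2.1.6."""
    j, pending = i, 1                 # pending = admissible subwords still owed
    while pending > 0:
        pending += ARITY[w[j]] - 1    # w[j] fills one slot and opens a(w[j]) new ones
        j += 1
    return j - i

w = "fgcfdd"                          # the tree f(g(c), f(d, d))
assert span(w, 0) == len(w)           # the whole word is admissible
print([w[i:i + span(w, i)] for i in range(len(w))])
# ['fgcfdd', 'gc', 'c', 'fdd', 'd', 'd']
```

Listing the admissible subword at every position also illustrates exercise 5 below: the occurrences printed are pairwise either nested or disjoint, never partially overlapping.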
Given words v, w ∈ F* and i ∈ {1, . . . , length(w)}, we say that v occurs in
w at starting position i if w = w_1 v w_2 where w_1, w_2 ∈ F* and w_1 has length
i − 1. (For example, if f, g ∈ F are distinct, then the word fgf has exactly
two occurrences in the word fgfgf, one at starting position 1, and the other
at starting position 3; these two occurrences overlap, but such overlapping is
impossible with admissible words, see exercise 5 at the end of this section.)
Given w = w_1 v w_2 as above, and given v′ ∈ F*, the result of replacing v in w at
starting position i by v′ is by definition the word w_1 v′ w_2.
Lemma 2.1.6. Let w be an admissible word and 1 ≤ i ≤ length(w). Then there
is a unique admissible word that occurs in w at starting position i.
Proof. We prove existence by induction on length(w). Uniqueness then follows
from the fact stated just before Lemma 2.1.5. Clearly w is an admissible word
occurring in w at starting position 1. Suppose i > 1. Then we write w =
f t_1 . . . t_n where f ∈ F has arity n > 0, and t_1, . . . , t_n are admissible words, and
we take j ∈ {1, . . . , n} such that

1 + length(t_1) + · · · + length(t_{j−1}) < i ≤ 1 + length(t_1) + · · · + length(t_j).

Now apply the inductive assumption to t_j.
Remark. Let w = f t_1 . . . t_n where f ∈ F has arity n > 0, and t_1, . . . , t_n are
admissible words. Put l_j := 1 + length(t_1) + · · · + length(t_j) for j = 0, . . . , n (so
l_0 = 1). Suppose l_{j−1} < i ≤ l_j, 1 ≤ j ≤ n, and let v be the admissible word
that occurs in w at starting position i. Then the proof of the last lemma shows
that this occurrence is entirely inside t_j, that is, i − 1 + length(v) ≤ l_j.
Corollary 2.1.7. Let w be an admissible word and 1 ≤ i ≤ length(w). Then
the result of replacing the admissible word v in w at starting position i by an
admissible word v′ is again an admissible word.
This follows by a routine induction on length(w), using the last remark.
Exercises. In the exercises below, A = {a_1, . . . , a_n}, |A| = n.
(1) (Disjunctive Normal Form) Each p is equivalent to a disjunction
p_1 ∨ · · · ∨ p_k where each disjunct p_i is a conjunction a_1^{ε_1} ∧ . . . ∧ a_n^{ε_n} with all
ε_j ∈ {−1, 1}, and where for an atom a we put a^1 := a and a^{−1} := ¬a.
(2) (Conjunctive Normal Form) Same as the last problem, except that the signs ∨ and ∧
are interchanged, as well as the words “disjunction” and “conjunction,” and also
the words “disjunct” and “conjunct.”
(3) To each p associate the function f_p : {0, 1}^A → {0, 1} defined by f_p(t) = t(p).
(Think of a truth table for p where the 2^n rows correspond to the 2^n truth
assignments t : A → {0, 1}, and the column under p records the values t(p).)
Then for every function f : {0, 1}^A → {0, 1} there is a p such that f = f_p.
(4) Let ∼ be the equivalence relation on Prop(A) given by
p ∼ q :⇐⇒ ⊨ p ↔ q.
Then the quotient set Prop(A)/∼ is finite; determine its cardinality as a function
of n = |A|.
(5) Let w be an admissible word and 1 ≤ i < i′ ≤ length(w). Let v and v′ be the
admissible words that occur at starting positions i and i′ respectively in w. Then
these occurrences are either nonoverlapping, that is, i − 1 + length(v) < i′, or the
occurrence of v′ is entirely inside that of v, that is,
i′ − 1 + length(v′) ≤ i − 1 + length(v).
2.2 Completeness for Propositional Logic
In this section we introduce a proof system for propositional logic, state the
completeness of this proof system, and then prove this completeness.
As in the previous section we ﬁx a set A of atoms, and the conventions of
that section remain in force.
A propositional axiom is by definition a proposition that occurs in the list below,
for some choice of p, q, r:
1. ⊤
2. p → (p ∨ q); p → (q ∨ p)
3. ¬p → (¬q → ¬(p ∨ q))
4. (p ∧ q) → p; (p ∧ q) → q
5. p → (q → (p ∧ q))
6. (p → (q → r)) → ((p → q) → (p → r))
7. p → (¬p → ⊥)
8. (¬p → ⊥) → p
Each of items 2–8 describes infinitely many propositional axioms. That is why
we do not call these items axioms, but axiom schemes. For example, if a, b ∈ A,
then a → (a ∨ ⊥) and b → (b ∨ (a ∧ b)) are distinct propositional axioms,
and both are instances of axiom scheme 2. It is easy to check that all propositional
axioms are tautologies.
Here is our single rule of inference for propositional logic:
Modus Ponens (MP): from p and p → q, infer q.
In the rest of this section Σ denotes a set of propositions, that is, Σ ⊆ Prop(A).
Definition. A formal proof, or just proof, of p from Σ is a sequence p_1, . . . , p_n
with n ≥ 1 and p_n = p, such that for k = 1, . . . , n:
(i) either p_k ∈ Σ,
(ii) or p_k is a propositional axiom,
(iii) or there are i, j ∈ {1, . . . , k − 1} such that p_k can be inferred from p_i and
p_j by MP.
If there exists a proof of p from Σ, then we write Σ ⊢ p, and say Σ proves p.
For Σ = ∅ we also write ⊢ p instead of Σ ⊢ p.
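Whether a given sequence is a formal proof from Σ is a purely mechanical question: each line must lie in Σ, be a propositional axiom, or follow from two earlier lines by MP. A checker sketch (all names mine; recognizing the eight axiom schemes is routine but lengthy, so it is delegated to a caller-supplied predicate, stubbed out in the demo):

```python
def is_formal_proof(lines, sigma, is_axiom):
    """Check clauses (i)-(iii) of the definition: every line is in sigma,
    is an axiom (per the supplied predicate), or follows by modus ponens
    from earlier lines q and q -> p."""
    for k, p in enumerate(lines):
        ok = p in sigma or is_axiom(p)
        if not ok:
            earlier = lines[:k]
            # MP: look for earlier lines q and ('->', q, p)
            ok = any(('->', q, p) in earlier for q in earlier)
        if not ok:
            return False
    return len(lines) >= 1

# Propositions as nested tuples, with ('->', p, q) written out directly.
# A three-line proof of q from sigma = {p, p -> q} using MP:
sigma = {'p', ('->', 'p', 'q')}
proof = ['p', ('->', 'p', 'q'), 'q']
print(is_formal_proof(proof, sigma, is_axiom=lambda p: False))   # True
print(is_formal_proof(['q'], sigma, is_axiom=lambda p: False))   # False
```

Decidability of this relation (for decidable Σ) is what later makes proofs amenable to Gödel numbering in Chapter 5.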
Lemma 2.2.1. ⊢ p → p.
Proof. The proposition p → ((p → p) → p) is a propositional axiom by axiom
scheme 2. By axiom scheme 6,
[p → ((p → p) → p)] → [(p → (p → p)) → (p → p)]
is a propositional axiom. Applying MP to these two axioms yields
⊢ (p → (p → p)) → (p → p).
Since p → (p → p) is also a propositional axiom by scheme 2, we can apply MP
again to obtain ⊢ p → p.
Proposition 2.2.2. If Σ ⊢ p, then Σ ⊨ p.
This should be clear from earlier facts that we stated and which the reader
was asked to verify. The converse is true but less obvious:
Theorem 2.2.3 (Completeness, first form).
Σ ⊢ p ⇐⇒ Σ ⊨ p
Remark. There is some arbitrariness in our choice of axioms and rule, and thus
in our notion of formal proof. This is in contrast to the definition of ⊨, which
merely formalizes the basic underlying idea of propositional logic as stated in
the introduction to the previous section. However, the equivalence of ⊢ and
⊨ (Completeness Theorem) shows that our choice of axioms and rule yields a
complete proof system. Moreover, this equivalence has consequences which can
be stated in terms of ⊨ alone. An example is the Compactness Theorem.
Theorem 2.2.4 (Compactness of Propositional Logic). If Σ ⊨ p then there is
a finite subset Σ_0 of Σ such that Σ_0 ⊨ p.
It is convenient to prove ﬁrst a variant of the Completeness Theorem.
Definition. We say that Σ is inconsistent if Σ ⊢ ⊥, and otherwise (that is, if
Σ ⊬ ⊥) we call Σ consistent.
Theorem 2.2.5 (Completeness, second form).
Σ is consistent if and only if Σ has a model.
From this second form of the Completeness Theorem we easily obtain an
alternative form of the Compactness of Propositional Logic:
We ﬁrst show that the second form of the Completeness Theorem implies the
ﬁrst form. For this we need a technical lemma that will also be useful later in
the course.
Lemma 2.2.7 (Deduction Lemma). Suppose Σ ∪ {p} ⊢ q. Then Σ ⊢ p → q.
Proof. By induction on proofs.
If q is a propositional axiom, then Σ ⊢ q, and since q → (p → q) is a
propositional axiom, MP yields Σ ⊢ p → q. If q ∈ Σ ∪ {p}, then either q ∈ Σ,
in which case the same argument as before gives Σ ⊢ p → q, or q = p and then
Σ ⊢ p → q since ⊢ p → p by the lemma above.
Now assume that q is obtained by MP from r and r → q, where Σ ∪ {p} ⊢ r
and Σ ∪ {p} ⊢ r → q, and where we assume inductively that Σ ⊢ p → r and
Σ ⊢ p → (r → q). Then we obtain Σ ⊢ p → q from the propositional axiom
[p → (r → q)] → [(p → r) → (p → q)]
by applying MP twice.
Corollary 2.2.8. Σ ⊢ p if and only if Σ ∪ {¬p} is inconsistent.
Proof. (⇒) Assume Σ ⊢ p. Since p → (¬p → ⊥) is a propositional axiom, we
can apply MP twice to get Σ ∪ {¬p} ⊢ ⊥. Hence Σ ∪ {¬p} is inconsistent.
(⇐) Assume Σ ∪ {¬p} is inconsistent. Then Σ ∪ {¬p} ⊢ ⊥, and so by the
Deduction Lemma we have Σ ⊢ ¬p → ⊥. Since (¬p → ⊥) → p is a propositional
axiom, MP yields Σ ⊢ p.
Corollary 2.2.9. The second form of Completeness (Theorem 2.2.5) implies
the first form (Theorem 2.2.3).
Proof. Assume the second form of Completeness holds, and that Σ ⊨ p. We
want to show that then Σ ⊢ p. From Σ ⊨ p it follows that Σ ∪ {¬p} has
no model. Hence by the second form of Completeness, the set Σ ∪ {¬p} is
inconsistent. Then by Corollary 2.2.8 we have Σ ⊢ p.
Definition. We say that Σ is complete if Σ is consistent, and for each p either
Σ ⊢ p or Σ ⊢ ¬p.
Completeness as a property of a set of propositions should not be confused
with the completeness of our proof system as expressed by the Completeness
Theorem. (It is just a historical accident that we use the same word.)
Below we use Zorn’s Lemma to show that any consistent set of propositions
can be extended to a complete set of propositions.
Lemma 2.2.10 (Lindenbaum). Suppose Σ is consistent. Then Σ ⊆ Σ′ for some
complete Σ′ ⊆ Prop(A).
Proof. Let P be the collection of all consistent subsets of Prop(A) that contain
Σ. In particular Σ ∈ P. We consider P as partially ordered by inclusion. Any
totally ordered subcollection {Σ_i : i ∈ I} of P with I ≠ ∅ has an upper bound
in P, namely ⋃{Σ_i : i ∈ I}. (To see this it suffices to check that ⋃{Σ_i : i ∈ I}
is consistent. Suppose otherwise, that is, suppose ⋃{Σ_i : i ∈ I} ⊢ ⊥. Since a
proof can use only finitely many of the propositions in ⋃{Σ_i : i ∈ I}, there exists
i ∈ I such that Σ_i ⊢ ⊥, contradicting the consistency of Σ_i.)
Thus by Zorn’s lemma P has a maximal element Σ′. We claim that then Σ′
is complete. For any p, if Σ′ ⊬ p, then by Corollary 2.2.8 the set Σ′ ∪ {¬p} is
consistent, hence ¬p ∈ Σ′ by maximality of Σ′, and thus Σ′ ⊢ ¬p.
Suppose A is countable. For this case we can give a proof of Lindenbaum’s
Lemma without using Zorn’s Lemma as follows.
Proof. Because A is countable, Prop(A) is countable. Take an enumeration
(p_n)_{n∈N} of Prop(A). We construct an increasing sequence Σ = Σ_0 ⊆ Σ_1 ⊆ . . .
of consistent subsets of Prop(A) as follows. Given a consistent Σ_n ⊆ Prop(A)
we define

Σ_{n+1} := Σ_n ∪ {p_n} if Σ_n ⊢ p_n,    Σ_{n+1} := Σ_n ∪ {¬p_n} if Σ_n ⊬ p_n,

so Σ_{n+1} remains consistent by earlier results. Thus Σ_∞ := ⋃{Σ_n : n ∈ N}
is consistent and also complete: for any n either p_n ∈ Σ_{n+1} ⊆ Σ_∞ or ¬p_n ∈
Σ_{n+1} ⊆ Σ_∞.
Define the truth assignment t_Σ : A → {0, 1} by
t_Σ(a) = 1 if Σ ⊢ a, and t_Σ(a) = 0 otherwise.
Lemma 2.2.11. Suppose Σ is complete. Then for each p we have
Σ ⊢ p ⇐⇒ t_Σ(p) = 1.
In particular, t_Σ is a model of Σ.
Proof. We proceed by induction on the number of connectives in p. If p is an
atom or p = ⊤ or p = ⊥, then the equivalence follows immediately from the
definitions. It remains to consider the three cases below.
Case 1: p = ¬q, and (inductive assumption) Σ ⊢ q ⇐⇒ t_Σ(q) = 1.
(⇒) Suppose Σ ⊢ p. Then t_Σ(p) = 1: Otherwise, t_Σ(q) = 1, so Σ ⊢ q by the
inductive assumption; since q → (p → ⊥) is a propositional axiom, we can apply
MP twice to get Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose t_Σ(p) = 1. Then t_Σ(q) = 0, so Σ ⊬ q, and thus Σ ⊢ p by
completeness of Σ.
Case 2: p = q ∨ r, Σ ⊢ q ⇐⇒ t_Σ(q) = 1, and Σ ⊢ r ⇐⇒ t_Σ(r) = 1.
(⇒) Suppose that Σ ⊢ p. Then t_Σ(p) = 1: Otherwise, t_Σ(p) = 0, so t_Σ(q) = 0
and t_Σ(r) = 0, hence Σ ⊬ q and Σ ⊬ r, and thus Σ ⊢ ¬q and Σ ⊢ ¬r by
completeness of Σ; since ¬q → (¬r → ¬p) is a propositional axiom, we can apply
MP twice to get Σ ⊢ ¬p, which in view of the propositional axiom p → (¬p → ⊥)
and MP yields Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose t_Σ(p) = 1. Then t_Σ(q) = 1 or t_Σ(r) = 1. Hence Σ ⊢ q or Σ ⊢ r.
Using MP and the propositional axioms q → p and r → p we obtain Σ ⊢ p.
Case 3: p = q ∧ r, Σ ⊢ q ⇐⇒ t_Σ(q) = 1, and Σ ⊢ r ⇐⇒ t_Σ(r) = 1.
We leave this case as an exercise.
We can now finish the proof of Completeness (second form):
Suppose Σ is consistent. Then by Lindenbaum’s Lemma Σ is a subset of a
complete set Σ′ of propositions. By the previous lemma, such a Σ′ has a model,
and such a model is also a model of Σ.
The converse—if Σ has a model, then Σ is consistent—is left to the reader.
Application to coloring infinite graphs. What follows is a standard use
of compactness of propositional logic, one of many. Let (V, E) be a graph, by
which we mean here that V is a set (of vertices) and E (the set of edges) is a
binary relation on V that is irreflexive and symmetric, that is, for all v, w ∈ V
we have (v, v) ∉ E, and if (v, w) ∈ E, then (w, v) ∈ E. Let some n ≥ 1 be
given. Then an n-coloring of (V, E) is a function c : V → {1, . . . , n} such that
c(v) ≠ c(w) for all (v, w) ∈ E: neighboring vertices should have different colors.
Suppose for every finite V_0 ⊆ V there is an n-coloring of (V_0, E_0), where
E_0 := E ∩ (V_0 × V_0). We claim that there exists an n-coloring of (V, E).
Proof. Take A := V × {1, . . . , n} as the set of atoms, and think of an atom (v, i)
as representing the statement that v has color i. Thus for (V, E) to have an
n-coloring means that the following set Σ ⊆ Prop(A) has a model:

Σ := {(v, 1) ∨ · · · ∨ (v, n) : v ∈ V} ∪ {¬((v, i) ∧ (v, j)) : v ∈ V, 1 ≤ i < j ≤ n}
     ∪ {¬((v, i) ∧ (w, i)) : (v, w) ∈ E, 1 ≤ i ≤ n}.
The assumption that all finite subgraphs of (V, E) are n-colorable yields that
every finite subset of Σ has a model. Hence by compactness Σ has a model.
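For a finite graph, Σ itself is finite and its models correspond exactly to the n-colorings, which can be checked by enumeration. A sketch (the vertex set, edge set, and all names are chosen for illustration) encoding the 4-cycle with n = 2:

```python
from itertools import product

# Sigma for 2-colorability of the 4-cycle, checked by brute force.
# Atoms are pairs (v, i): "vertex v has color i".
V = [0, 1, 2, 3]
E = {(0, 1), (1, 2), (2, 3), (3, 0)}     # one orientation per edge suffices here
n = 2
ATOMS = [(v, i) for v in V for i in range(1, n + 1)]

def is_model(t):
    # (v,1) ∨ ... ∨ (v,n): every vertex gets at least one color
    if not all(any(t[(v, i)] for i in range(1, n + 1)) for v in V):
        return False
    # ¬((v,i) ∧ (v,j)) for i < j: at most one color per vertex
    if any(t[(v, i)] and t[(v, j)] for v in V
           for i in range(1, n + 1) for j in range(i + 1, n + 1)):
        return False
    # ¬((v,i) ∧ (w,i)) for edges: neighboring vertices differ
    return not any(t[(v, i)] and t[(w, i)]
                   for (v, w) in E for i in range(1, n + 1))

models = [t for bits in product((0, 1), repeat=len(ATOMS))
          for t in [dict(zip(ATOMS, bits))] if is_model(t)]
colorings = [{v: i for (v, i), b in t.items() if b} for t in models]
print(colorings)   # the two proper 2-colorings of the 4-cycle
```

The three comment lines mirror the three families of propositions in Σ; compactness is exactly what lifts this finite check to infinite graphs.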
Exercises.
(1) Let (P, ≤) be a poset. Then there is a linear order ≤′ on P that extends ≤. (Hint:
use the compactness theorem and the fact that this is true when P is finite.)
(2) Suppose Σ ⊆ Prop(A) is such that for each truth assignment t : A → {0, 1} there
is p ∈ Σ with t(p) = 1. Then there are p_1, . . . , p_n ∈ Σ such that p_1 ∨ · · · ∨ p_n is a
tautology. (The interesting case is when A is infinite.)
2.3 Languages and Structures
Propositional Logic captures only one aspect of mathematical reasoning. We
also need the capability to deal with predicates (also called relations), variables,
and the quantifiers “for all” and “there exists.” We now begin setting up a
framework for Predicate Logic (or First-Order Logic), which has these additional
features and has a claim on being a complete logic for mathematical reasoning.
Definition. A language¹ L is a disjoint union of:
(i) a set L^r of relation symbols; each R ∈ L^r has associated arity a(R) ∈ N;
(ii) a set L^f of function symbols; each F ∈ L^f has associated arity a(F) ∈ N.
An m-ary relation or function symbol is one that has arity m. Instead of “0-ary”,
“1-ary”, “2-ary” we say “nullary”, “unary”, “binary”. A constant symbol
is a function symbol of arity 0.
Examples.
(1) The language L_Gr = {1, ⁻¹, ·} of groups has constant symbol 1, unary
function symbol ⁻¹, and binary function symbol ·.
(2) The language L_Ab = {0, −, +} of (additive) abelian groups has constant
symbol 0, unary function symbol −, and binary function symbol +.
(3) The language L_O = {<} has just one binary relation symbol <.
(4) The language L_OAb = {<, 0, −, +} of ordered abelian groups.
(5) The language L_Rig = {0, 1, +, ·} of rigs (or semirings) has constant symbols
0 and 1, and binary function symbols + and ·.
(6) The language L_Ri = {0, 1, −, +, ·} of rings. The symbols are those of the
previous example, plus the unary function symbol −.
From now on, let L denote a language.
Definition. A structure 𝒜 for L (or L-structure) is a triple

𝒜 = (A; (R^𝒜)_{R∈L^r}, (F^𝒜)_{F∈L^f})

consisting of:
(i) a nonempty set A, the underlying set of 𝒜;²

¹What we call here a language is also known as a signature, or a vocabulary.
²It is also called the universe of 𝒜; we prefer less grandiose terminology.
(ii) for each m-ary R ∈ L^r a set R^𝒜 ⊆ A^m (an m-ary relation on A), the
interpretation of R in 𝒜;
(iii) for each n-ary F ∈ L^f an operation F^𝒜 : A^n → A (an n-ary operation
on A), the interpretation of F in 𝒜.
Remark. The interpretation of a constant symbol c of L is a function
c^𝒜 : A^0 → A.
Since A^0 has just one element, c^𝒜 is uniquely determined by its value at this
element; we shall identify c^𝒜 with this value, so c^𝒜 ∈ A.
Given an L-structure 𝒜, the relations R^𝒜 on A (for R ∈ L^r) and operations F^𝒜
on A (for F ∈ L^f) are called the primitives of 𝒜. When 𝒜 is clear from context
we often omit the superscript 𝒜 in denoting the interpretation of a symbol of L
in 𝒜. The reader is supposed to keep in mind the distinction between symbols
of L and their interpretation in an L-structure, even if we use the same notation
for both.
Examples.
(1) Each group is considered as an L_Gr-structure by interpreting the symbols
1, ⁻¹, and · as the identity element of the group, its group inverse, and its
group multiplication, respectively.
(2) Let 𝒜 = (A; 0, −, +) be an abelian group; here 0 ∈ A is the zero element
of the group, and − : A → A and + : A² → A denote the group operations
of 𝒜. We consider 𝒜 as an L_Ab-structure by taking as interpretations of
the symbols 0, − and + of L_Ab the group operations 0, − and + on A.
(We took here the liberty of using the same notation for possibly entirely
different things: + is an element of the set L_Ab, but also denotes in this
context its interpretation as a binary operation on the set A. Similarly
with 0 and −.) In fact, any set A in which we single out an element, a
unary operation on A, and a binary operation on A, can be construed as
an L_Ab-structure if we choose to do so.
(3) (N; <) is an L_O-structure where we interpret < as the usual ordering
relation on N. Similarly for (Z; <), (Q; <) and (R; <). (Here we take
even more notational liberties, by letting < denote five different things: a
symbol of L_O, and the usual orderings of N, Z, Q, and R respectively.)
Again, any nonempty set A equipped with a binary relation on it can be
viewed as an L_O-structure.
(4) (Z; <, 0, −, +) and (Q; <, 0, −, +) are both L_OAb-structures.
(5) (N; 0, 1, +, ·) is an L_Rig-structure.
(6) (Z; 0, 1, −, +, ·) is an L_Ri-structure.
Let ℬ be an L-structure with underlying set B, and let A be a nonempty subset
of B such that F^ℬ(A^n) ⊆ A for every n-ary function symbol F of L. Then A is
the underlying set of an L-structure 𝒜 defined by letting

F^𝒜 := F^ℬ|A^n : A^n → A, for n-ary F ∈ L^f,
R^𝒜 := R^ℬ ∩ A^m, for m-ary R ∈ L^r.
Definition. Such an L-structure 𝒜 is said to be a substructure of ℬ, notation:
𝒜 ⊆ ℬ. We also say in this case that ℬ is an extension of 𝒜, or extends 𝒜.
Examples.
(1) (Z; 0, 1, −, +, ·) ⊆ (Q; 0, 1, −, +, ·) ⊆ (R; 0, 1, −, +, ·)
(2) (N; <, 0, 1, +, ·) ⊆ (Z; <, 0, 1, +, ·)
Definition. Let 𝒜 = (A; . . .) and ℬ = (B; . . .) be L-structures.
A homomorphism h : 𝒜 → ℬ is a map h : A → B such that
(i) for each m-ary R ∈ L^r and each (a_1, . . . , a_m) ∈ A^m we have
(a_1, . . . , a_m) ∈ R^𝒜 =⇒ (ha_1, . . . , ha_m) ∈ R^ℬ;
(ii) for each n-ary F ∈ L^f and each (a_1, . . . , a_n) ∈ A^n we have
h(F^𝒜(a_1, . . . , a_n)) = F^ℬ(ha_1, . . . , ha_n).
Replacing =⇒ in (i) by ⇐⇒ yields the notion of a strong homomorphism. An
embedding is an injective strong homomorphism; an isomorphism is a bijective
strong homomorphism. An automorphism of 𝒜 is an isomorphism 𝒜 → 𝒜.
If 𝒜 ⊆ ℬ, then the inclusion a ↦ a : A → B is an embedding 𝒜 → ℬ.
Conversely, a homomorphism h : 𝒜 → ℬ yields a substructure h(𝒜) of ℬ with
underlying set h(A), and if h is an embedding we have an isomorphism a ↦
h(a) : 𝒜 → h(𝒜).
If i : 𝒜 → ℬ and j : ℬ → 𝒞 are homomorphisms (strong homomorphisms,
embeddings, isomorphisms, respectively), then so is j ◦ i : 𝒜 → 𝒞. The identity
map 1_A on A is an automorphism of 𝒜. If i : 𝒜 → ℬ is an isomorphism then so
is the map i⁻¹ : ℬ → 𝒜. Thus the automorphisms of 𝒜 form a group Aut(𝒜)
under composition with identity 1_A.
Examples.
1. Let 𝒜 = (Z; 0, −, +). Then k ↦ −k is an automorphism of 𝒜.
2. Let 𝒜 = (Z; <). The map k ↦ k + 1 is an automorphism of 𝒜 with
inverse given by k ↦ k − 1.
If 𝒜 and ℬ are groups (viewed as structures for the language L_Gr), then a
homomorphism h : 𝒜 → ℬ is exactly what in algebra is called a homomorphism
from the group 𝒜 to the group ℬ. Likewise with rings, and other kinds of
algebraic structures.
A congruence on the L-structure 𝒜 is an equivalence relation ∼ on its underlying
set A such that
(i) if R ∈ L^r is m-ary and a_1 ∼ b_1, . . . , a_m ∼ b_m, then
(a_1, . . . , a_m) ∈ R^𝒜 ⇐⇒ (b_1, . . . , b_m) ∈ R^𝒜;
(ii) if F ∈ L^f is n-ary and a_1 ∼ b_1, . . . , a_n ∼ b_n, then
F^𝒜(a_1, . . . , a_n) ∼ F^𝒜(b_1, . . . , b_n).
Note that a strong homomorphism h : 𝒜 → ℬ yields a congruence ∼_h on 𝒜 as
follows: for a_1, a_2 ∈ A we put
a_1 ∼_h a_2 ⇐⇒ h(a_1) = h(a_2).
Given a congruence ∼ on the L-structure 𝒜 we obtain an L-structure 𝒜/∼ (the
quotient of 𝒜 by ∼) as follows:
(i) the underlying set of 𝒜/∼ is the quotient set A/∼;
(ii) the interpretation of an m-ary R ∈ L^r in 𝒜/∼ is the m-ary relation
{(a_1^∼, …, a_m^∼) : (a_1, …, a_m) ∈ R^𝒜}
on A/∼;
(iii) the interpretation of an n-ary F ∈ L^f in 𝒜/∼ is the n-ary operation
(a_1^∼, …, a_n^∼) ↦ F^𝒜(a_1, …, a_n)^∼
on A/∼.
Note that then we have a strong homomorphism a ↦ a^∼ : 𝒜 → 𝒜/∼.
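As a concrete, entirely illustrative check of this construction, here is a Python sketch of a quotient. The structure (Z_6; 0, +) and the congruence a ∼ b iff a ≡ b mod 3 are my own finite stand-ins, not examples from the notes, and all names (`cls`, `plus_q`, etc.) are hypothetical.

```python
# Sketch: quotient of a finite structure by a congruence.
# Structure: (Z6; 0, +); congruence: a ~ b iff a == b mod 3.
A = list(range(6))

def plus(a, b):  # the interpretation F^A of +
    return (a + b) % 6

def cls(a):  # the ~-class a~ of a, represented as a frozenset
    return frozenset(x for x in A if x % 3 == a % 3)

def is_congruence():
    # the class of a + b must depend only on the classes of a and b
    return all(
        cls(plus(a, b)) == cls(plus(a2, b2))
        for a in A for b in A
        for a2 in cls(a) for b2 in cls(b)
    )

def plus_q(c1, c2):
    # the quotient operation on classes, well defined by the check above
    a, b = next(iter(c1)), next(iter(c2))
    return cls(plus(a, b))

quotient_set = {cls(a) for a in A}  # A/~ has 3 elements
```

The map `cls` itself is then the strong homomorphism a ↦ a^∼ from the definition.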
Products. Let (ℬ_i)_{i∈I} be a family of L-structures, ℬ_i = (B_i; …) for i ∈ I.
The product
∏_{i∈I} ℬ_i
is defined to be the L-structure ℬ whose underlying set is the product set
∏_{i∈I} B_i, and where the basic relations and functions are defined coordinatewise:
for m-ary R ∈ L^r and elements b_1 = (b_{1i}), …, b_m = (b_{mi}) ∈ ∏_{i∈I} B_i,
(b_1, …, b_m) ∈ R^ℬ ⟺ (b_{1i}, …, b_{mi}) ∈ R^{ℬ_i} for all i ∈ I,
and for n-ary F ∈ L^f and b_1 = (b_{1i}), …, b_n = (b_{ni}) ∈ ∏_{i∈I} B_i,
F^ℬ(b_1, …, b_n) := (F^{ℬ_i}(b_{1i}, …, b_{ni}))_{i∈I}.
For j ∈ I the projection map to the j-th factor is the homomorphism
∏_{i∈I} ℬ_i → ℬ_j, (b_i) ↦ b_j.
This product construction makes it possible to combine several homomorphisms
with a common domain into a single one: if for each i ∈ I we have a homomorphism
h_i : 𝒜 → ℬ_i, we obtain a homomorphism
h = (h_i) : 𝒜 → ∏_{i∈I} ℬ_i, h(a) := (h_i(a))_{i∈I}.
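The coordinatewise definitions can be sketched in Python for the product of two finite structures carrying one binary function symbol. The example structures (Z_2; +) and (Z_3; +) and all names below are my own illustration, not taken from the notes.

```python
from itertools import product

# Two finite structures for a language with one binary function symbol F:
# B1 = (Z2; +) and B2 = (Z3; +).
B1 = list(range(2)); F1 = lambda a, b: (a + b) % 2
B2 = list(range(3)); F2 = lambda a, b: (a + b) % 3

# Underlying set of the product: all pairs (b1, b2).
B = list(product(B1, B2))

def F(x, y):
    # F interpreted coordinatewise, as in the definition
    return (F1(x[0], y[0]), F2(x[1], y[1]))

def proj1(x):
    # projection to the first factor
    return x[0]
```

The test below also confirms the claim in the text that the projection is a homomorphism: it commutes with the interpreted operation.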
Exercises.
(1) Let G be a group viewed as a structure for the language of groups. Each normal
subgroup N of G yields a congruence ≡_N on G by
a ≡_N b ⟺ aN = bN,
and each congruence on G equals ≡_N for a unique normal subgroup N of G.
(2) Consider a strong homomorphism h : 𝒜 → ℬ of L-structures. Then we have an
isomorphism from 𝒜/∼_h onto h(𝒜) given by a^{∼_h} ↦ h(a).
2.4 Variables and Terms
Throughout this course
Var = {v_0, v_1, v_2, …}
is a countably infinite set of symbols whose elements will be called variables;
we assume that v_m ≠ v_n for m ≠ n, and that no variable is a function or
relation symbol in any language. We let x, y, z (sometimes with subscripts or
superscripts) denote variables, unless indicated otherwise.
Remark. Chapters 2–4 go through if we take as our set Var of variables any
inﬁnite (possibly uncountable) set; in model theory this can even be convenient.
For this more general Var we still insist that no variable is a function or relation
symbol in any language. In the few cases in chapters 2–4 that this more general
setup requires changes in proofs, this will be pointed out.
The results in Chapter 5 on undecidability presuppose a numbering of the
variables; our Var = {v_0, v_1, v_2, …} comes equipped with such a numbering.
Definition. An L-term is a word on the alphabet L^f ∪ Var obtained as follows:
(i) each variable (viewed as a word of length 1) is an L-term;
(ii) whenever F ∈ L^f is n-ary and t_1, …, t_n are L-terms, then the
concatenation Ft_1 … t_n is an L-term.
Note: constant symbols of L are L-terms of length 1, by clause (ii) for n = 0.
The L-terms are the admissible words on the alphabet L^f ∪ Var where each
variable has arity 0. Thus "unique readability" is available.
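Unique readability says that, once the arity of each symbol is known, a prefix word parses as a term in exactly one way. A minimal recursive-descent sketch in Python; the arity table is a made-up example signature, not anything fixed in the notes.

```python
# Parse an L-term written as a prefix word (a list of symbols),
# given the arity of each function symbol; variables have arity 0.
ARITY = {"+": 2, "*": 2, "-": 1, "x": 0, "y": 0, "z": 0}  # example signature

def parse(word, i=0):
    """Return (tree, next_index); a tree is (symbol, tuple_of_subtrees)."""
    s = word[i]
    args, j = [], i + 1
    for _ in range(ARITY[s]):   # read exactly arity-many subterms
        sub, j = parse(word, j)
        args.append(sub)
    return (s, tuple(args)), j

def parse_term(word):
    tree, j = parse(word)
    assert j == len(word), "trailing symbols: not a term"
    return tree

tree = parse_term(list("*+x-yz"))   # the term (x + (-y)) * z in prefix form
```

Because each symbol consumes exactly as many subterms as its arity, the parse is forced at every step; this is the content of unique readability.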
We often write t(x_1, …, x_n) to indicate an L-term t in which no variables
other than x_1, …, x_n occur. Whenever we use this notation we assume tacitly
that x_1, …, x_n are distinct. Note that we do not require that each of x_1, …, x_n
actually occurs in t(x_1, …, x_n). (This is like indicating a polynomial in the
indeterminates x_1, …, x_n by p(x_1, …, x_n), where one allows that some of these
indeterminates do not actually occur in the polynomial p.)
If a term is written as an admissible word, then it may be hard to see how
it is built up from subterms. In practice we shall therefore use parentheses
and brackets in denoting terms, and avoid prefix notation if tradition dictates
otherwise.
Example. The word ·+x−yz is an L_Ri-term. For easier reading we indicate
this term instead by (x + (−y)) · z or even (x − y)z.
Definition. Let 𝒜 be an L-structure and t = t(x) be an L-term where x =
(x_1, …, x_m). Then we associate to the ordered pair (t, x) a function
t^𝒜 : A^m → A as follows:
(i) If t is the variable x_i, then t^𝒜(a) = a_i for a = (a_1, …, a_m) ∈ A^m.
(ii) If t = Ft_1 … t_n where F ∈ L^f is n-ary and t_1, …, t_n are L-terms, then
t^𝒜(a) = F^𝒜(t_1^𝒜(a), …, t_n^𝒜(a)) for a ∈ A^m.
This inductive definition is justified by unique readability. Note that if ℬ is a
second L-structure and 𝒜 ⊆ ℬ, then t^𝒜(a) = t^ℬ(a) for t as above and a ∈ A^m.
Example. Consider R as a ring in the usual way, and let t(x, y, z) be the
L_Ri-term (x−y)z. Then the function t^R : R^3 → R is given by t^R(a, b, c) = (a−b)c.
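The two clauses defining t^𝒜 translate directly into a recursive evaluator. A Python sketch with my own nested-tuple representation of terms (a variable is a string, a compound term is a pair of a symbol and its subterms); the term below is the example (x − y)z over the integers.

```python
# Evaluate a term tree in a structure. A term is either a variable name
# (a string) or a pair (F, subterms) for a function symbol F.
def evaluate(t, interp, env):
    """interp: symbol -> Python function; env: variable -> element of A."""
    if isinstance(t, str):          # clause (i): a variable
        return env[t]
    F, args = t                     # clause (ii): t = F t1 ... tn
    return interp[F](*(evaluate(s, interp, env) for s in args))

# The L_Ri-term (x + (-y)) * z, and the ring of integers as interpretation:
term = ("*", (("+", ("x", ("-", ("y",)))), "z"))
ring = {"+": lambda a, b: a + b, "-": lambda a: -a, "*": lambda a, b: a * b}
```

For instance, `evaluate(term, ring, {"x": 5, "y": 2, "z": 3})` computes t^Z(5, 2, 3) = (5 − 2)·3.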
A term is said to be variable-free if no variables occur in it. Let t be a
variable-free L-term and 𝒜 an L-structure. Then the above gives a nullary function
t^𝒜 : A^0 → A, identified as usual with its value at the unique element of A^0, so
t^𝒜 ∈ A. In other words, if t is a constant symbol c, then t^𝒜 = c^𝒜 ∈ A, where
c^𝒜 is as in the previous section, and if t = Ft_1 … t_n with n-ary F ∈ L^f and
variable-free L-terms t_1, …, t_n, then t^𝒜 = F^𝒜(t_1^𝒜, …, t_n^𝒜).
Definition. Let t be an L-term, let x_1, …, x_n be distinct variables, and let
τ_1, …, τ_n be L-terms. Then t(τ_1/x_1, …, τ_n/x_n) is the word obtained by
replacing all occurrences of x_i in t by τ_i, simultaneously for i = 1, …, n. If t is
given in the form t(x_1, …, x_n), then we write t(τ_1, …, τ_n) as a shorthand for
t(τ_1/x_1, …, τ_n/x_n).
The easy proof of the next lemma is left to the reader.
Lemma 2.4.1. Suppose t is an L-term, x_1, …, x_n are distinct variables, and
τ_1, …, τ_n are L-terms. Then t(τ_1/x_1, …, τ_n/x_n) is an L-term. If τ_1, …, τ_n are
variable-free and t = t(x_1, …, x_n), then t(τ_1, …, τ_n) is variable-free.
We urge the reader to do exercise (1) below and thus acquire the confidence that
these formal term substitutions do correspond to actual function substitutions.
In the definition of t(τ_1/x_1, …, τ_n/x_n) the "replacing" should be simultaneous,
because it can happen that for t′ := t(τ_1/x_1) we have t′(τ_2/x_2) ≠ t(τ_1/x_1, τ_2/x_2).
(Here t, τ_1, τ_2 are L-terms and x_1, x_2 are distinct variables.)
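The need for simultaneity can be checked mechanically. In the Python sketch below (my own term representation: a variable is a string, a compound term a pair of a symbol and its subterms), take t = +x_1x_2, τ_1 = x_2 and τ_2 = x_1: substituting one after the other collapses both variables to x_1, while simultaneous substitution merely swaps them.

```python
# Terms: a variable name (string) or (F, subterms).
def subst(t, sigma):
    """Simultaneously replace each variable x of t by sigma[x] (if present)."""
    if isinstance(t, str):
        return sigma.get(t, t)
    F, args = t
    return (F, tuple(subst(s, sigma) for s in args))

t = ("+", ("x1", "x2"))
simultaneous = subst(t, {"x1": "x2", "x2": "x1"})          # +x2x1
sequential = subst(subst(t, {"x1": "x2"}), {"x2": "x1"})   # +x1x1
```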
Generators. Let ℬ be an L-structure, let G ⊆ B, and assume also that L has
a constant symbol or that G ≠ ∅. Then the set
{t^ℬ(g_1, …, g_m) : t(x_1, …, x_m) is an L-term and g_1, …, g_m ∈ G} ⊆ B
is the underlying set of some 𝒜 ⊆ ℬ, and this 𝒜 is clearly a substructure of any
𝒜′ ⊆ ℬ with G ⊆ A′. We call this 𝒜 the substructure of ℬ generated by G; if
𝒜 = ℬ, then we say that ℬ is generated by G. If (a_i)_{i∈I} is a family of elements of
B, then "generated by (a_i)" means "generated by G" where G = {a_i : i ∈ I}.
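For a finite structure, the generated substructure can be computed by closing G, together with the interpreted constants, under the basic operations. A Python sketch (my own names, using (Z_6; 0, +) as an illustrative finite structure):

```python
from itertools import product

def generated(constants, functions, G):
    """Least set containing G and the constants and closed under the
    interpreted functions; functions is a list of (arity, f) pairs."""
    S = set(G) | set(constants)
    changed = True
    while changed:
        changed = False
        for arity, f in functions:
            for args in product(sorted(S), repeat=arity):
                v = f(*args)
                if v not in S:
                    S.add(v)
                    changed = True
    return S

# Substructure of (Z6; 0, +) generated by {2}:
sub = generated([0], [(2, lambda a, b: (a + b) % 6)], {2})
```

Here the closure {0, 2, 4} is exactly the set of values t^ℬ(g_1, …, g_m) from the definition, since every such value arises by finitely many applications of the basic operations to generators and constants.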
Exercises.
(1) Let t(x_1, …, x_m) and τ_1(y_1, …, y_n), …, τ_m(y_1, …, y_n) be L-terms. Then the
L-term t*(y_1, …, y_n) := t(τ_1(y_1, …, y_n), …, τ_m(y_1, …, y_n)) has the property that
if 𝒜 is an L-structure and a = (a_1, …, a_n) ∈ A^n, then
(t*)^𝒜(a) = t^𝒜(τ_1^𝒜(a), …, τ_m^𝒜(a)).
(2) For every L_Ab-term t(x_1, …, x_n) there are integers k_1, …, k_n such that for every
abelian group 𝒜 = (A; 0, −, +),
t^𝒜(a_1, …, a_n) = k_1a_1 + ⋯ + k_na_n, for all (a_1, …, a_n) ∈ A^n.
Conversely, for any integers k_1, …, k_n there is an L_Ab-term t(x_1, …, x_n) such
that in every abelian group 𝒜 = (A; 0, −, +) the above displayed identity holds.
(3) For every L_Ri-term t(x_1, …, x_n) there is a polynomial
P(x_1, …, x_n) ∈ Z[x_1, …, x_n]
such that for every commutative ring ℛ = (R; 0, 1, −, +, ·),
t^ℛ(r_1, …, r_n) = P(r_1, …, r_n), for all (r_1, …, r_n) ∈ R^n.
Conversely, for any polynomial P(x_1, …, x_n) ∈ Z[x_1, …, x_n] there is an
L_Ri-term t(x_1, …, x_n) such that in every commutative ring ℛ = (R; 0, 1, −, +, ·) the
above displayed identity holds.
(4) Let 𝒜 and ℬ be L-structures, h : 𝒜 → ℬ a homomorphism, and t = t(x_1, …, x_n)
an L-term. Then
h(t^𝒜(a_1, …, a_n)) = t^ℬ(ha_1, …, ha_n), for all (a_1, …, a_n) ∈ A^n.
(If 𝒜 ⊆ ℬ and h : 𝒜 → ℬ is the inclusion, this gives t^𝒜(a_1, …, a_n) = t^ℬ(a_1, …, a_n)
for all (a_1, …, a_n) ∈ A^n.)
(5) Consider the L-structure 𝒜 = (N; 0, 1, +, ·) where L = L_Rig.
(a) Is there an L-term t(x) such that t^𝒜(0) = 1 and t^𝒜(1) = 0?
(b) Is there an L-term t(x) such that t^𝒜(n) = 2^n for all n ∈ N?
(c) Find all the substructures of 𝒜.
2.5 Formulas and Sentences
Besides variables we also introduce the eight distinct logical symbols
⊤  ⊥  ¬  ∨  ∧  =  ∃  ∀
The first five of these we already met when discussing propositional logic. None
of these eight symbols is a variable, or a function or relation symbol of any
language. Below L denotes a language. To distinguish the logical symbols from
those in L, the latter are often referred to as the nonlogical symbols.
Definition. The atomic L-formulas are the following words on the alphabet
L ∪ Var ∪ {⊤, ⊥, =}:
(i) ⊤ and ⊥,
(ii) Rt_1 … t_m, where R ∈ L^r is m-ary and t_1, …, t_m are L-terms,
(iii) = t_1t_2, where t_1 and t_2 are L-terms.
The L-formulas are the words on the larger alphabet
L ∪ Var ∪ {⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀}
obtained as follows:
(i) every atomic L-formula is an L-formula;
(ii) if ϕ, ψ are L-formulas, then so are ¬ϕ, ∨ϕψ and ∧ϕψ;
(iii) if ϕ is an L-formula and x is a variable, then ∃xϕ and ∀xϕ are L-formulas.
Note that all L-formulas are admissible words on the alphabet
L ∪ Var ∪ {⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀},
where =, ∃ and ∀ are given arity 2 and the other symbols have the arities
assigned to them earlier. This fact makes the results on unique readability
applicable to L-formulas. (However, not all admissible words on this alphabet
are L-formulas: the word ∃xx is admissible but not an L-formula.)
The notational conventions introduced in the section on propositional logic
go through, with the role of propositions there taken over by formulas here. (For
example, given L-formulas ϕ and ψ we shall write ϕ ∨ ψ to indicate ∨ϕψ, and
ϕ → ψ to indicate ¬ϕ ∨ ψ.)
The reader should distinguish between different ways of using the symbol =.
Sometimes it denotes one of the eight formal logical symbols, but we also use it
to indicate equality of mathematical objects in the way we have done already
many times. The context should always make it clear what our intention is in
this respect without having to spell it out. To increase readability we usually
write an atomic formula = t_1t_2 as t_1 = t_2 and its negation ¬ = t_1t_2 as t_1 ≠ t_2,
where t_1, t_2 are L-terms. The logical symbol = is treated just as a binary relation
symbol, but its interpretation in a structure will always be the equality relation
on its underlying set. This will become clear later.
Definition. Let ϕ be a formula of L. Written as a word on the alphabet above
we have ϕ = s_1 … s_m. A subformula of ϕ is a subword of the form s_i … s_k,
where 1 ≤ i ≤ k ≤ m, which also happens to be a formula of L.
An occurrence of a variable x in ϕ at the j-th place (that is, s_j = x) is said
to be a bound occurrence if ϕ has a subformula s_is_{i+1} … s_k with i ≤ j ≤ k that
is of the form ∃xψ or ∀xψ. If an occurrence is not bound, then it is said to be a
free occurrence.
At this point the reader is invited to do the first exercise at the end of this
section, which gives another useful characterization of subformulas.
Example. In the formula (∃x(x = y)) ∧ x = 0, where x and y are distinct, the
first two occurrences of x are bound, the third is free, and the only occurrence
of y is free. (Note: the formula is actually the string ∧∃x = xy = x0, and the
occurrences of x and y are really the occurrences in this string.)
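Free variables can be computed by recursion on the formula tree: a quantifier removes its variable from the free variables of its body. A Python sketch with my own tuple representation of formulas, applied to the example formula above:

```python
# Formulas as tuples: ("atom", v1, v2, ...) listing the variables occurring
# in an atomic formula, ("not", p), ("and"/"or", p, q),
# ("exists"/"forall", x, p).  fv returns the set of free variables.
def fv(phi):
    op = phi[0]
    if op == "atom":
        return set(phi[1:])
    if op == "not":
        return fv(phi[1])
    if op in ("and", "or"):
        return fv(phi[1]) | fv(phi[2])
    if op in ("exists", "forall"):
        return fv(phi[2]) - {phi[1]}   # x is bound inside the quantifier
    raise ValueError(op)

# (exists x (x = y)) and x = 0: x occurs both bound and free.
phi = ("and", ("exists", "x", ("atom", "x", "y")), ("atom", "x"))
```

Note this computes free *variables*, not free *occurrences*: x is in fv(phi) because of its third occurrence, even though its first two occurrences are bound.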
Definition. A sentence is a formula in which all occurrences of variables are
bound occurrences.
We write ϕ(x_1, …, x_n) to indicate a formula ϕ such that all variables that occur
free in ϕ are among x_1, …, x_n. In using this notation it is understood that
x_1, …, x_n are distinct variables, but it is not required that each of x_1, …, x_n
occurs free in ϕ. (This is like indicating a polynomial in the indeterminates
x_1, …, x_n by p(x_1, …, x_n), where one allows that some of these indeterminates
do not actually occur in p.)
Definition. Let ϕ be an L-formula, let x_1, …, x_n be distinct variables, and
let t_1, …, t_n be L-terms. Then ϕ(t_1/x_1, …, t_n/x_n) is the word obtained by
replacing all the free occurrences of x_i in ϕ by t_i, simultaneously for i = 1, …, n.
If ϕ is given in the form ϕ(x_1, …, x_n), then we write ϕ(t_1, …, t_n) as a shorthand
for ϕ(t_1/x_1, …, t_n/x_n).
We have the following lemma whose routine proof is left to the reader.
Lemma 2.5.1. Suppose ϕ is an L-formula, x_1, …, x_n are distinct variables,
and t_1, …, t_n are L-terms. Then ϕ(t_1/x_1, …, t_n/x_n) is an L-formula. If
t_1, …, t_n are variable-free and ϕ = ϕ(x_1, …, x_n), then ϕ(t_1, …, t_n) is an
L-sentence.
In the definition of ϕ(t_1/x_1, …, t_n/x_n) the "replacing" should be simultaneous,
because it can happen that ϕ(t_1/x_1)(t_2/x_2) ≠ ϕ(t_1/x_1, t_2/x_2).
Let 𝒜 be an L-structure with underlying set A, and let C ⊆ A. We extend L to
a language L_C by adding a constant symbol c for each c ∈ C, called the name
of c. These names are symbols not in L. We make 𝒜 into an L_C-structure
by keeping the same underlying set and interpretations of symbols of L, and
by interpreting each name c as the element c ∈ C. The L_C-structure thus
obtained is indicated by 𝒜_C. Hence for each variable-free L_C-term t we have
a corresponding element t^{𝒜_C} of A, which for simplicity of notation we denote
instead by t^𝒜. All this applies in particular to the case C = A, where in L_A we
have a name a for each a ∈ A.
Definition. We can now define what it means for an L_A-sentence σ to be true
in the L-structure 𝒜 (notation: 𝒜 ⊨ σ, also read as 𝒜 satisfies σ or σ holds in
𝒜, or σ is valid in 𝒜). First we consider atomic L_A-sentences:
(i) 𝒜 ⊨ ⊤, and 𝒜 ⊭ ⊥;
(ii) 𝒜 ⊨ Rt_1 … t_m if and only if (t_1^𝒜, …, t_m^𝒜) ∈ R^𝒜, for m-ary R ∈ L^r and
variable-free L_A-terms t_1, …, t_m;
(iii) 𝒜 ⊨ t_1 = t_2 if and only if t_1^𝒜 = t_2^𝒜, for variable-free L_A-terms t_1, t_2.
We extend the definition inductively to arbitrary L_A-sentences as follows:
(i) Suppose σ = ¬σ_1. Then 𝒜 ⊨ σ if and only if 𝒜 ⊭ σ_1.
(ii) Suppose σ = σ_1 ∨ σ_2. Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ σ_1 or 𝒜 ⊨ σ_2.
(iii) Suppose σ = σ_1 ∧ σ_2. Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ σ_1 and 𝒜 ⊨ σ_2.
(iv) Suppose σ = ∃xϕ(x). Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ ϕ(a) for some a ∈ A.
(v) Suppose σ = ∀xϕ(x). Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ ϕ(a) for all a ∈ A.
Even if we just want to define 𝒜 ⊨ σ for L-sentences σ, one can see that if
σ has the form ∃xϕ(x) or ∀xϕ(x), the inductive definition above forces us to
consider L_A-sentences ϕ(a). This is why we introduced names. We didn't say so
explicitly, but "inductive" refers here to induction with respect to the number of
logical symbols in σ. For example, the fact that ϕ(a) has fewer logical symbols
than ∃xϕ(x) is crucial for the above to count as a definition.
It is easy to check that for an L_A-sentence σ = ∃x_1 … ∃x_n ϕ(x_1, …, x_n),
𝒜 ⊨ σ ⟺ 𝒜 ⊨ ϕ(a_1, …, a_n) for some (a_1, …, a_n) ∈ A^n,
and that for an L_A-sentence σ = ∀x_1 … ∀x_n ϕ(x_1, …, x_n),
𝒜 ⊨ σ ⟺ 𝒜 ⊨ ϕ(a_1, …, a_n) for all (a_1, …, a_n) ∈ A^n.
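Over a finite structure the truth definition is directly executable: the quantifier clauses become loops over A. A Python sketch with my own encoding of formulas (atoms are predicates on an environment of variable assignments, standing in for the names of elements), checking ∀x∃y (x + y = 0) in Z_3:

```python
def holds(A, phi, env=None):
    """phi: ("atom", pred) with pred(env) -> bool, ("not", p),
    ("and", p, q), ("or", p, q), ("exists", x, p), ("forall", x, p)."""
    env = dict(env or {})
    op = phi[0]
    if op == "atom":
        return phi[1](env)
    if op == "not":
        return not holds(A, phi[1], env)
    if op == "and":
        return holds(A, phi[1], env) and holds(A, phi[2], env)
    if op == "or":
        return holds(A, phi[1], env) or holds(A, phi[2], env)
    x, body = phi[1], phi[2]
    if op == "exists":     # clause (iv): some witness a in A
        return any(holds(A, body, {**env, x: a}) for a in A)
    if op == "forall":     # clause (v): every a in A
        return all(holds(A, body, {**env, x: a}) for a in A)
    raise ValueError(op)

# In (Z3; 0, +): every element has an additive inverse.
A = range(3)
sigma = ("forall", "x", ("exists", "y",
         ("atom", lambda e: (e["x"] + e["y"]) % 3 == 0)))
```

Binding a quantified variable to an element of A in the environment plays exactly the role of passing to the sentence ϕ(a) with a name a.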
Definition. Given an L_A-formula ϕ(x_1, …, x_n) we let ϕ^𝒜 be the following
subset of A^n:
ϕ^𝒜 = {(a_1, …, a_n) : 𝒜 ⊨ ϕ(a_1, …, a_n)}.
The formula ϕ(x_1, …, x_n) is said to define the set ϕ^𝒜 in 𝒜. A set S ⊆ A^n
is said to be definable in 𝒜 if S = ϕ^𝒜 for some L_A-formula ϕ(x_1, …, x_n). If
moreover ϕ can be chosen to be an L-formula, then S is said to be 0-definable
in 𝒜.
Examples.
(1) The set {r ∈ R : r < √2} is 0-definable in (R; <, 0, 1, +, −, ·): it is defined
by the formula (x^2 < 1+1) ∨ (x < 0). (Here x^2 abbreviates the term x · x.)
(2) The set {r ∈ R : r < π} is definable in (R; <, 0, 1, +, −, ·): it is defined by
the formula x < π.
We now single out formulas by certain syntactical conditions. These conditions
have semantic counterparts in terms of the behaviour of these formulas under
various kinds of homomorphisms, as shown in some exercises below.
An L-formula is said to be quantifier-free if it has no occurrences of ∃ and
no occurrences of ∀. An L-formula is said to be existential if it has the form
∃x_1 … ∃x_m ϕ with distinct x_1, …, x_m and a quantifier-free L-formula ϕ. An
L-formula is said to be universal if it has the form ∀x_1 … ∀x_m ϕ with distinct
x_1, …, x_m and a quantifier-free L-formula ϕ. An L-formula is said to be positive
if it has no occurrences of ¬ (but it can have occurrences of ⊥).
Exercises.
(1) Let ϕ and ψ be L-formulas; put sf(ϕ) := set of subformulas of ϕ.
(a) If ϕ is atomic, then sf(ϕ) = {ϕ}.
(b) sf(¬ϕ) = {¬ϕ} ∪ sf(ϕ).
(c) sf(ϕ ∨ ψ) = {ϕ ∨ ψ} ∪ sf(ϕ) ∪ sf(ψ), and sf(ϕ ∧ ψ) = {ϕ ∧ ψ} ∪ sf(ϕ) ∪ sf(ψ).
(d) sf(∃xϕ) = {∃xϕ} ∪ sf(ϕ), and sf(∀xϕ) = {∀xϕ} ∪ sf(ϕ).
(2) If t(x_1, …, x_n) is an L_A-term and a_1, …, a_n ∈ A, then
t(a_1, …, a_n)^𝒜 = t^𝒜(a_1, …, a_n).
(3) Suppose that S_1 ⊆ A^n and S_2 ⊆ A^n are defined in 𝒜 by the L_A-formulas
ϕ_1(x_1, …, x_n) and ϕ_2(x_1, …, x_n), respectively. Then:
(a) S_1 ∪ S_2 is defined in 𝒜 by (ϕ_1 ∨ ϕ_2)(x_1, …, x_n).
(b) S_1 ∩ S_2 is defined in 𝒜 by (ϕ_1 ∧ ϕ_2)(x_1, …, x_n).
(c) A^n \ S_1 is defined in 𝒜 by ¬ϕ_1(x_1, …, x_n).
(d) S_1 ⊆ S_2 ⟺ 𝒜 ⊨ ∀x_1 … ∀x_n (ϕ_1 → ϕ_2).
(4) Let π : A^{m+n} → A^m be the projection map given by
π(a_1, …, a_{m+n}) = (a_1, …, a_m),
and for S ⊆ A^{m+n} and a ∈ A^m, put
S(a) := {b ∈ A^n : (a, b) ∈ S} (a section of S).
Suppose that S ⊆ A^{m+n} is defined in 𝒜 by the L_A-formula ϕ(x, y) where x =
(x_1, …, x_m) and y = (y_1, …, y_n). Then ∃y_1 … ∃y_n ϕ(x, y) defines in 𝒜 the subset
π(S) of A^m, and ∀y_1 … ∀y_n ϕ(x, y) defines in 𝒜 the set
{a ∈ A^m : S(a) = A^n}.
(5) The following sets are 0-definable in the corresponding structures:
(a) The ordering relation {(m, n) ∈ N^2 : m < n} in (N; 0, +).
(b) The set {2, 3, 5, 7, …} of prime numbers in the semiring 𝒜 = (N; 0, 1, +, ·).
(c) The set {2^n : n ∈ N} in the semiring 𝒜.
(d) The set {a ∈ R : f is continuous at a} in (R; <, f) where f : R → R is any
function.
(6) Let the symbols of L be a binary relation symbol < and a unary relation symbol
U. Then there is an L-sentence σ such that for all X ⊆ R we have
(R; <, X) ⊨ σ ⟺ X is finite.
(7) Let 𝒜 ⊆ ℬ. Then we consider L_A to be a sublanguage of L_B in such a way
that each a ∈ A has the same name in L_A as in L_B. This convention is in force
throughout these notes.
(a) For each variable-free L_A-term t we have t^𝒜 = t^ℬ.
(b) If the L_A-sentence σ is quantifier-free, then 𝒜 ⊨ σ ⟺ ℬ ⊨ σ.
(c) If σ is an existential L_A-sentence, then 𝒜 ⊨ σ ⟹ ℬ ⊨ σ.
(d) If σ is a universal L_A-sentence, then ℬ ⊨ σ ⟹ 𝒜 ⊨ σ.
(8) Suppose h : 𝒜 → ℬ is a homomorphism of L-structures. For each L_A-term t,
let t^h be the L_B-term obtained from t by replacing each occurrence of a name
a of an element a ∈ A by the name ha of the corresponding element ha ∈ B.
Similarly, for each L_A-formula ϕ, let ϕ^h be the L_B-formula obtained from ϕ by
replacing each occurrence of a name a of an element a ∈ A by the name ha of
the corresponding element ha ∈ B. Note that if ϕ is a sentence, so is ϕ^h. Then:
(a) if t is a variable-free L_A-term, then h(t^𝒜) = (t^h)^ℬ;
(b) if σ is a positive L_A-sentence without ∀-symbol, then 𝒜 ⊨ σ ⟹ ℬ ⊨ σ^h;
(c) if σ is a positive L_A-sentence and h is surjective, then 𝒜 ⊨ σ ⟹ ℬ ⊨ σ^h;
(d) if σ is an L_A-sentence and h is an isomorphism, then 𝒜 ⊨ σ ⟺ ℬ ⊨ σ^h.
In particular, isomorphic L-structures satisfy exactly the same L-sentences.
2.6 Models
In the rest of this chapter L is a language, 𝒜 is an L-structure (with underlying
set A), and, unless indicated otherwise, t is an L-term, ϕ, ψ, and θ are
L-formulas, σ is an L-sentence, and Σ is a set of L-sentences. We drop the prefix
L in "L-term" and "L-formula" and so on, unless this would cause confusion.
Definition. We say that 𝒜 is a model of Σ or Σ holds in 𝒜 (denoted 𝒜 ⊨ Σ)
if 𝒜 ⊨ σ for each σ ∈ Σ.
To discuss examples it is convenient to introduce some notation. Suppose L
contains (at least) the constant symbol 0 and the binary function symbol +.
Given any terms t_1, …, t_n we define the term t_1 + ⋯ + t_n inductively as follows:
it is the term 0 if n = 0, the term t_1 if n = 1, and the term (t_1 + ⋯ + t_{n−1}) + t_n
for n > 1. We write nt for the term t + ⋯ + t with n summands; in particular, 0t
and 1t denote the terms 0 and t respectively. Suppose L contains the constant
symbol 1 and the binary function symbol · (the multiplication sign). Then we
have similar notational conventions for t_1 ⋯ t_n and t^n; in particular, for n = 0
both stand for the term 1, and t^1 is just t.
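These inductive conventions are easy to implement. A Python sketch (my own tuple representation of terms: a variable is a string, a compound term is a pair of a symbol and its subterms) that builds the term nt and evaluates it over the integers:

```python
def n_times(n, t):
    """Build the term nt = t + ... + t with n summands, following the
    convention: 0t is the constant 0, 1t is t itself."""
    if n == 0:
        return ("0", ())
    if n == 1:
        return t
    return ("+", (n_times(n - 1, t), t))

def evaluate(t, interp, env):
    """Recursive term evaluation, as in Section 2.4."""
    if isinstance(t, str):
        return env[t]
    F, args = t
    return interp[F](*(evaluate(s, interp, env) for s in args))

Z = {"0": lambda: 0, "+": lambda a, b: a + b}
term = n_times(3, "x")        # the term ((x + x) + x)
```

Evaluating nt at an integer a then gives n·a in the usual sense, which is what makes the abbreviation n1 in the field axioms below behave as expected.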
Examples. Fix three distinct variables x, y, z.
(1) Groups are the L_Gr-structures that are models of
Gr := {∀x(x · 1 = x ∧ 1 · x = x), ∀x(x · x^{−1} = 1 ∧ x^{−1} · x = 1),
∀x∀y∀z((x · y) · z = x · (y · z))}.
(2) Abelian groups are the L_Ab-structures that are models of
Ab := {∀x(x + 0 = x), ∀x(x + (−x) = 0), ∀x∀y(x + y = y + x),
∀x∀y∀z((x + y) + z = x + (y + z))}.
(3) Torsion-free abelian groups are the L_Ab-structures that are models of
Ab ∪ {∀x(nx = 0 → x = 0) : n = 1, 2, 3, …}.
(4) Rings are the L_Ri-structures that are models of
Ri := Ab ∪ {∀x∀y∀z((x · y) · z = x · (y · z)), ∀x(x · 1 = x ∧ 1 · x = x),
∀x∀y∀z(x · (y + z) = x · y + x · z ∧ (x + y) · z = x · z + y · z)}.
(5) Fields are the L_Ri-structures that are models of
Fl := Ri ∪ {∀x∀y(x · y = y · x), 1 ≠ 0, ∀x(x ≠ 0 → ∃y(x · y = 1))}.
(6) Fields of characteristic 0 are the L_Ri-structures that are models of
Fl(0) := Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, 7, 11, …}.
(7) Given a prime number p, fields of characteristic p are the L_Ri-structures
that are models of Fl(p) := Fl ∪ {p1 = 0}.
(8) Algebraically closed fields are the L_Ri-structures that are models of
ACF := Fl ∪ {∀u_1 … ∀u_n ∃x(x^n + u_1x^{n−1} + … + u_n = 0) : n = 2, 3, 4, 5, …}.
Here u_1, u_2, u_3, … is some fixed infinite sequence of distinct variables,
distinct also from x, and u_ix^{n−i} abbreviates u_i · x^{n−i}, for i = 1, …, n.
(9) Algebraically closed fields of characteristic 0 are the L_Ri-structures that are
models of ACF(0) := ACF ∪ {n1 ≠ 0 : n = 2, 3, 5, 7, 11, …}.
(10) Given a prime number p, algebraically closed fields of characteristic p are
the L_Ri-structures that are models of ACF(p) := ACF ∪ {p1 = 0}.
Definition. We say that σ is a logical consequence of Σ (written Σ ⊨ σ) if σ
is true in every model of Σ.
Example. It is well known that in any ring R we have x · 0 = 0 for all x ∈ R.
This can now be expressed as Ri ⊨ ∀x(x · 0 = 0).
We defined what it means for a sentence σ to hold in a given structure 𝒜. We
now extend this to arbitrary formulas.
First define an 𝒜-instance of a formula ϕ = ϕ(x_1, …, x_m) to be an
L_A-sentence of the form ϕ(a_1, …, a_m) with a_1, …, a_m ∈ A. Of course ϕ can also
be written as ϕ(y_1, …, y_n) for another sequence of variables y_1, …, y_n; for
example, y_1, …, y_n could be obtained by permuting x_1, …, x_m, or it could be
x_1, …, x_m, x_{m+1}, obtained by adding a variable x_{m+1}. Thus for the above to
count as a definition of "𝒜-instance," the reader should check that these different
ways of specifying variables (including at least the variables occurring free
in ϕ) give the same 𝒜-instances.
Definition. A formula ϕ is said to be valid in 𝒜 (notation: 𝒜 ⊨ ϕ) if all its
𝒜-instances are true in 𝒜.
The reader should check that if ϕ = ϕ(x_1, …, x_m), then
𝒜 ⊨ ϕ ⟺ 𝒜 ⊨ ∀x_1 … ∀x_m ϕ.
We also extend the notion of "logical consequence of Σ" to formulas.
Definition. We say that ϕ is a logical consequence of Σ (notation: Σ ⊨ ϕ) if
𝒜 ⊨ ϕ for all models 𝒜 of Σ.
One should not confuse the notion of "logical consequence of Σ" with that of
"provable from Σ." We shall give a definition of provable from Σ in the next
section. The two notions will turn out to be equivalent, but that is hardly
obvious from their definitions: we shall need much of the next chapter to prove
this equivalence, which is called the Completeness Theorem for Predicate Logic.
We finish this section with two basic facts:
Lemma 2.6.1. Let α(x_1, …, x_m) be an L_A-term, and recall that α defines a
map α^𝒜 : A^m → A. Let t_1, …, t_m be variable-free L_A-terms, with t_i^𝒜 = a_i ∈ A
for i = 1, …, m. Then α(t_1, …, t_m) is a variable-free L_A-term, and
α(t_1, …, t_m)^𝒜 = α(a_1, …, a_m)^𝒜 = α^𝒜(t_1^𝒜, …, t_m^𝒜).
This follows by a straightforward induction on α.
Lemma 2.6.2. Let t_1, …, t_m be variable-free L_A-terms with t_i^𝒜 = a_i ∈ A
for i = 1, …, m. Let ϕ(x_1, …, x_m) be an L_A-formula. Then the L_A-formula
ϕ(t_1, …, t_m) is a sentence and
𝒜 ⊨ ϕ(t_1, …, t_m) ⟺ 𝒜 ⊨ ϕ(a_1, …, a_m).
Proof. To keep notations simple we give the proof only for m = 1 with t = t_1
and x = x_1. We proceed by induction on the number of logical symbols in ϕ(x).
Suppose that ϕ is atomic. The case where ϕ is ⊤ or ⊥ is obvious. Assume ϕ
is Rα_1 … α_m where R ∈ L^r is m-ary and α_1(x), …, α_m(x) are L_A-terms. Then
ϕ(t) = Rα_1(t) … α_m(t) and ϕ(a) = Rα_1(a) … α_m(a). We have 𝒜 ⊨ ϕ(t) iff
(α_1(t)^𝒜, …, α_m(t)^𝒜) ∈ R^𝒜, and also 𝒜 ⊨ ϕ(a) iff (α_1(a)^𝒜, …, α_m(a)^𝒜) ∈ R^𝒜.
As α_i(t)^𝒜 = α_i(a)^𝒜 for all i by the previous lemma, we have 𝒜 ⊨ ϕ(t) iff
𝒜 ⊨ ϕ(a). The case that ϕ(x) is α(x) = β(x) is handled the same way.
It is also clear that the desired property is inherited by disjunctions,
conjunctions and negations of formulas ϕ(x) that have the property. Suppose now
that ϕ(x) = ∃y ψ.
Case y ≠ x: Then ψ = ψ(x, y), ϕ(t) = ∃yψ(t, y) and ϕ(a) = ∃yψ(a, y). As
ϕ(t) = ∃yψ(t, y), we have 𝒜 ⊨ ϕ(t) iff 𝒜 ⊨ ψ(t, b) for some b ∈ A. By the
inductive hypothesis the latter is equivalent to 𝒜 ⊨ ψ(a, b) for some b ∈ A,
hence equivalent to 𝒜 ⊨ ∃yψ(a, y). As ϕ(a) = ∃yψ(a, y), we conclude that
𝒜 ⊨ ϕ(t) iff 𝒜 ⊨ ϕ(a).
Case y = x: Then x does not occur free in ϕ(x) = ∃xψ. So ϕ(t) = ϕ(a) = ϕ
is an L_A-sentence, and 𝒜 ⊨ ϕ(t) ⟺ 𝒜 ⊨ ϕ(a) is obvious.
When ϕ(x) = ∀y ψ, one can proceed exactly as above by distinguishing
two cases.
2.7 Logical Axioms and Rules; Formal Proofs
In this section we introduce a proof system for predicate logic and state its
completeness. We then derive as a consequence the compactness theorem and
some of its corollaries. The completeness is proved in the next chapter.
A propositional axiom of L is by definition a formula that for some ϕ, ψ, θ occurs
in the list below:
1. ⊤
2. ϕ → (ϕ ∨ ψ); ϕ → (ψ ∨ ϕ)
3. ¬ϕ → (¬ψ → ¬(ϕ ∨ ψ))
4. (ϕ ∧ ψ) → ϕ; (ϕ ∧ ψ) → ψ
5. ϕ → (ψ → (ϕ ∧ ψ))
6. (ϕ → (ψ → θ)) → ((ϕ → ψ) → (ϕ → θ))
7. ϕ → (¬ϕ → ⊥)
8. (¬ϕ → ⊥) → ϕ
Each of items 2–8 is a scheme describing infinitely many axioms. Note that
this list is the same as the list in Section 2.2 except that instead of propositions
p, q, r we have formulas ϕ, ψ, θ.
The logical axioms of L are the propositional axioms of L and the equality
and quantifier axioms of L as defined below.
Definition. The equality axioms of L are the following formulas:
(i) x = x,
(ii) x = y → y = x,
(iii) (x = y ∧ y = z) → x = z,
(iv) (x_1 = y_1 ∧ … ∧ x_m = y_m ∧ Rx_1 … x_m) → Ry_1 … y_m,
(v) (x_1 = y_1 ∧ … ∧ x_n = y_n) → Fx_1 … x_n = Fy_1 … y_n,
with the following restrictions on the variables and symbols of L: x, y, z are
distinct in (ii) and (iii); in (iv), x_1, …, x_m, y_1, …, y_m are distinct and R ∈ L^r
is m-ary; in (v), x_1, …, x_n, y_1, …, y_n are distinct, and F ∈ L^f is n-ary. Note
that (i) represents an axiom scheme rather than a single axiom, since different
variables x give different formulas x = x. Likewise with (ii)–(v).
Let x and y be distinct variables, and let ϕ(y) be the formula ∃x(x ≠ y).
Then ϕ(y) is valid in all 𝒜 with |A| > 1, but ϕ(x/y) is invalid in all 𝒜. Thus
substituting x for the free occurrences of y does not always preserve validity. To
get rid of this anomaly, we introduce the following restriction on substitutions
of a term t for free occurrences of y.
Definition. We say that t is free for y in ϕ if no variable in t can become bound
upon replacing the free occurrences of y in ϕ by t; more precisely: whenever x
is a variable in t, there are no occurrences of subformulas in ϕ of the form
∃xψ or ∀xψ that contain an occurrence of y that is free in ϕ.
Note that if t is variable-free, then t is free for y in ϕ. We remark that "free
for" abbreviates "free to be substituted for." In exercise 2 the reader is asked to
show that, with this restriction, substitution of a term for the free occurrences
of a variable does preserve validity.
Definition. The quantifier axioms of L are the formulas ϕ(t/y) → ∃yϕ and
∀yϕ → ϕ(t/y), where t is free for y in ϕ.
These axioms have been chosen to have the following property.
Proposition 2.7.1. The logical axioms of L are valid in every L-structure.
We first prove this for the propositional axioms of L. Let α_1, …, α_n be distinct
propositional atoms not in L. Let p = p(α_1, …, α_n) ∈ Prop{α_1, …, α_n}. Let
ϕ_1, …, ϕ_n be formulas and let p(ϕ_1, …, ϕ_n) be the word obtained by replacing
each occurrence of α_i in p by ϕ_i for i = 1, …, n. One checks easily that
p(ϕ_1, …, ϕ_n) is a formula.
Lemma 2.7.2. Suppose ϕ_i = ϕ_i(x_1, …, x_m) for 1 ≤ i ≤ n and let a_1, …, a_m ∈
A. Define a truth assignment t : {α_1, …, α_n} → {0, 1} by t(α_i) = 1 iff
𝒜 ⊨ ϕ_i(a_1, …, a_m). Then p(ϕ_1, …, ϕ_n) is an L-formula and
p(ϕ_1, …, ϕ_n)(a_1/x_1, …, a_m/x_m) = p(ϕ_1(a_1, …, a_m), …, ϕ_n(a_1, …, a_m)),
t(p(α_1, …, α_n)) = 1 ⟺ 𝒜 ⊨ p(ϕ_1(a_1, …, a_m), …, ϕ_n(a_1, …, a_m)).
In particular, if p is a tautology, then 𝒜 ⊨ p(ϕ_1, …, ϕ_n).
Proof. Easy induction on p. We leave the details to the reader.
Definition. An L-tautology is a formula of the form p(ϕ_1, …, ϕ_n) for some
tautology p(α_1, …, α_n) ∈ Prop{α_1, …, α_n} and some formulas ϕ_1, …, ϕ_n.
By Lemma 2.7.2 all L-tautologies are valid in all L-structures. The propositional
axioms of L are L-tautologies, so all propositional axioms of L are valid in all
L-structures. It is easy to check that all equality axioms of L are valid in all
L-structures. In exercise 3 below the reader is asked to show that all quantifier
axioms of L are valid in all L-structures. This finishes the proof of Proposition
2.7.1.
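Whether p(α_1, …, α_n) is a tautology is decidable by checking all 2^n truth assignments. A Python sketch (my own connective encoding, with → written as "imp"), verifying an instance of axiom scheme 2:

```python
from itertools import product

def truth(p, t):
    """p: atom name (str), ("not", q), ("or", q, r), ("and", q, r),
    or ("imp", q, r); t: dict atom -> 0/1."""
    if isinstance(p, str):
        return t[p]
    if p[0] == "not":
        return 1 - truth(p[1], t)
    if p[0] == "or":
        return max(truth(p[1], t), truth(p[2], t))
    if p[0] == "and":
        return min(truth(p[1], t), truth(p[2], t))
    if p[0] == "imp":   # q -> r abbreviates (not q) or r
        return max(1 - truth(p[1], t), truth(p[2], t))
    raise ValueError(p[0])

def is_tautology(p, atoms):
    # check every truth assignment t on the given atoms
    return all(truth(p, dict(zip(atoms, vals))) == 1
               for vals in product([0, 1], repeat=len(atoms)))

ax2 = ("imp", "p", ("or", "p", "q"))   # axiom scheme 2: p -> (p or q)
```

By Lemma 2.7.2, any formula obtained by substituting L-formulas for the atoms of such a tautology is valid in every L-structure.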
Next we introduce rules for deriving new formulas from given formulas.
Definition. The logical rules of L are the following:
(i) Modus Ponens (MP): From ϕ and ϕ → ψ, infer ψ.
(ii) Generalization Rule (G): If the variable x does not occur free in ϕ, then
(a) from ϕ → ψ, infer ϕ → ∀xψ;
(b) from ψ → ϕ, infer ∃xψ → ϕ.
A key property of the logical rules is that their application preserves validity.
Here is a more precise statement of this fact, to be verified by the reader.
(i) If 𝒜 ⊨ ϕ and 𝒜 ⊨ ϕ → ψ, then 𝒜 ⊨ ψ.
(ii) Suppose x does not occur free in ϕ. Then
(a) if 𝒜 ⊨ ϕ → ψ, then 𝒜 ⊨ ϕ → ∀xψ;
(b) if 𝒜 ⊨ ψ → ϕ, then 𝒜 ⊨ ∃xψ → ϕ.
Definition. A formal proof, or just proof, of ϕ from Σ is a sequence ϕ_1, …, ϕ_n
of formulas with n ≥ 1 and ϕ_n = ϕ, such that for k = 1, …, n:
(i) either ϕ_k ∈ Σ,
(ii) or ϕ_k is a logical axiom,
(iii) or there are i, j ∈ {1, …, k − 1} such that ϕ_k can be inferred from ϕ_i and
ϕ_j by MP, or from ϕ_i by G.
If there exists a proof of ϕ from Σ, then we write Σ ⊢ ϕ and say Σ proves ϕ.
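The definition makes "being a proof from Σ" a mechanically checkable property of a finite sequence. Here is a Python sketch of a checker for the Modus Ponens part only (the rule G and the logical axiom schemes are omitted, and the formula encoding as nested tuples is my own; axioms are simply passed in as a set):

```python
# Check a proof sequence in which each step is in sigma, in the given
# set of axioms, or follows from two earlier steps by Modus Ponens.
# Formulas are hashable tuples; an implication is ("imp", phi, psi).
def check_proof(steps, sigma, axioms):
    for k, phi in enumerate(steps):
        if phi in sigma or phi in axioms:
            continue   # clauses (i) and (ii)
        if any(steps[j] == ("imp", steps[i], phi)
               for i in range(k) for j in range(k)):
            continue   # clause (iii), MP from steps i and j
        return False
    return True

# With Sigma = {p, p -> q}, the sequence p, p -> q, q is a proof of q.
sigma = {("p",), ("imp", ("p",), ("q",))}
proof = [("p",), ("imp", ("p",), ("q",)), ("q",)]
```

That such checking is purely syntactic, making no reference to structures, is exactly what makes the Completeness Theorem below nontrivial.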
Proposition 2.7.3. If Σ ⊢ ϕ, then Σ ⊨ ϕ.
This follows easily from earlier facts that we stated and which the reader was
asked to verify. The converse is more interesting, and due to Gödel (1930):
Theorem 2.7.4 (Completeness Theorem of Predicate Logic).
Σ ⊢ ϕ ⇐⇒ Σ ⊨ ϕ
Remark. Our choice of proof system, and thus our notion of formal proof,
is somewhat arbitrary. However, the equivalence of ⊢ and ⊨ (Completeness
Theorem) justifies our choice of logical axioms and rules and shows in particular
that no further logical axioms and rules are needed. Moreover, this equivalence
has consequences that can be stated in terms of ⊨ alone. An example is the
important Compactness Theorem.
Theorem 2.7.5 (Compactness Theorem). If Σ ⊨ σ, then there is a finite subset
Σ0 of Σ such that Σ0 ⊨ σ.
The Compactness Theorem has many consequences. Here is one.
Corollary 2.7.6. Suppose σ is an LRi-sentence that holds in all fields of
characteristic 0. Then there exists a natural number N such that σ is true in all
fields of characteristic p > N.
Proof. By assumption,

Fl(0) = Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, . . .} ⊨ σ.

Then by Compactness, there is N ∈ N such that

Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, . . . , n ≤ N} ⊨ σ.

It follows that σ is true in all fields of characteristic p > N.
The converse of this corollary fails, see exercise 8 below. Note that Fl(0) is
infinite. Could there be an alternative finite set of axioms whose models are
exactly the fields of characteristic 0?
Corollary 2.7.7. There is no finite set of LRi-sentences whose models are
exactly the fields of characteristic 0.
Proof. Suppose there is such a finite set of sentences {σ1, . . . , σN}. Let σ :=
σ1 ∧ · · · ∧ σN. Then the models of σ are just the fields of characteristic 0. By the
previous result σ holds in some field of characteristic p > 0. Contradiction!
Exercises. All but the last one to be done without using Theorem 2.7.4 or
2.7.5. Actually, Exercise (4) will be used in proving Theorem 2.7.4.
(1) Let L = {R} where R is a binary relation symbol, and let 𝔄 = (A; R) be a finite
L-structure (i.e. the set A is finite). Then there exists an L-sentence σ such that
the models of σ are exactly the L-structures isomorphic to 𝔄. (In fact, for an
arbitrary language L, two finite L-structures are isomorphic iff they satisfy the
same L-sentences.)
(2) If t is free for y in ϕ and ϕ is valid in 𝔄, then ϕ(t/y) is valid in 𝔄.
(3) Suppose t is free for y in ϕ = ϕ(x1, . . . , xn, y). Then:
(i) Each 𝔄-instance of the quantifier axiom ϕ(t/y) → ∃yϕ has the form

ϕ(a1, . . . , an, τ) → ∃yϕ(a1, . . . , an, y)

with a1, . . . , an ∈ A and τ a variable-free LA-term.
(ii) The quantifier axiom ϕ(t/y) → ∃yϕ is valid in 𝔄. (Hint: use Lemma 2.6.2.)
(iii) The quantifier axiom ∀yϕ → ϕ(t/y) is valid in 𝔄.
(4) If ϕ is an L-tautology, then ⊢ ϕ.
(5) Σ ⊢ ϕi for i = 1, . . . , n ⇐⇒ Σ ⊢ ϕ1 ∧ · · · ∧ ϕn.
(6) If Σ ⊢ ϕ → ψ and Σ ⊢ ψ → ϕ, then Σ ⊢ ϕ ↔ ψ.
(7) ⊢ ¬∃xϕ ↔ ∀x¬ϕ and ⊢ ¬∀xϕ ↔ ∃x¬ϕ.
(8) Indicate an LRi-sentence that is true in the field of real numbers, but false in all
fields of positive characteristic.
(9) Let σ be an LAb-sentence which holds in all nontrivial torsion-free abelian groups.
Then there exists N ∈ N such that σ is true in all groups Z/pZ where p is a
prime number and p > N.
Chapter 3
The Completeness Theorem
The main aim of this chapter is to prove the Completeness Theorem. As a
byproduct we also derive some more elementary facts about predicate logic.
The last section contains some of the basics of universal algebra, which we can
treat here rather efficiently using our construction of a so-called term-model in
the proof of the Completeness Theorem.
Conventions on the use of L, 𝔄, t, ϕ, ψ, θ, σ and Σ are as in the beginning
of Section 2.6.
3.1 Another Form of Completeness
It is convenient to prove first a variant of the Completeness Theorem.
Definition. We say that Σ is consistent if Σ ⊬ ⊥, and otherwise (that is, if
Σ ⊢ ⊥), we call Σ inconsistent.
Theorem 3.1.1 (Completeness Theorem, second form).
Σ is consistent if and only if Σ has a model.
We first show that this second form of the Completeness Theorem implies the
first form. This will be done through a series of technical lemmas, which are
also useful later in this chapter.
Lemma 3.1.2. Suppose Σ ⊢ ϕ. Then Σ ⊢ ∀xϕ.
Proof. From Σ ⊢ ϕ and the L-tautology ϕ → (¬∀xϕ → ϕ) we obtain Σ ⊢
¬∀xϕ → ϕ by MP. Then by G we have Σ ⊢ ¬∀xϕ → ∀xϕ. Using the L-
tautology (¬∀xϕ → ∀xϕ) → ∀xϕ and MP we get Σ ⊢ ∀xϕ.
Lemma 3.1.3 (Deduction Lemma). Suppose Σ ∪ {σ} ⊢ ϕ. Then Σ ⊢ σ → ϕ.
Proof. By induction on the length of a proof of ϕ from Σ ∪ {σ}.
The cases where ϕ is a logical axiom, or ϕ ∈ Σ ∪ {σ}, or ϕ is obtained by
MP are treated just as in the proof of the Deduction Lemma of Propositional
Logic.
Suppose that ϕ is obtained by part (a) of G, so ϕ is ϕ1 → ∀xψ where x does
not occur free in ϕ1 and Σ ∪ {σ} ⊢ ϕ1 → ψ, and where we assume inductively
that Σ ⊢ σ → (ϕ1 → ψ). We have to argue that then Σ ⊢ σ → (ϕ1 → ∀xψ).
From the L-tautology

(σ → (ϕ1 → ψ)) → ((σ ∧ ϕ1) → ψ)

and MP we get Σ ⊢ (σ ∧ ϕ1) → ψ. Since x does not occur free in σ ∧ ϕ1 this
gives Σ ⊢ (σ ∧ ϕ1) → ∀xψ, by G. Using the L-tautology

((σ ∧ ϕ1) → ∀xψ) → (σ → (ϕ1 → ∀xψ))

and MP this gives Σ ⊢ σ → (ϕ1 → ∀xψ).
The case that ϕ is obtained by part (b) of G is left to the reader.
Corollary 3.1.4. Suppose Σ ∪ {σ1, . . . , σn} ⊢ ϕ. Then Σ ⊢ σ1 ∧ . . . ∧ σn → ϕ.
We leave the proof as an exercise.
Corollary 3.1.5. Σ ⊢ σ if and only if Σ ∪ {¬σ} is inconsistent.
The proof is just like that of the corresponding fact of Propositional Logic.
Lemma 3.1.6. Σ ⊢ ∀yϕ if and only if Σ ⊢ ϕ.
Proof. (⇐) This is Lemma 3.1.2. For (⇒), assume Σ ⊢ ∀yϕ. We have the
quantifier axiom ∀yϕ → ϕ, so by MP we get Σ ⊢ ϕ.
Corollary 3.1.7. Σ ⊢ ∀y1 . . . ∀ynϕ if and only if Σ ⊢ ϕ.
Corollary 3.1.8. The second form of the Completeness Theorem implies the
first form, Theorem 2.7.4.
Proof. Assume the second form of the Completeness Theorem holds, and that
Σ ⊨ ϕ. It suffices to show that then Σ ⊢ ϕ. From Σ ⊨ ϕ we obtain Σ ⊨
∀y1 . . . ∀ynϕ where ϕ = ϕ(y1, . . . , yn), and so Σ ∪ {¬σ} has no model, where
σ is the sentence ∀y1 . . . ∀ynϕ. But then by the second form of the Completeness
Theorem Σ ∪ {¬σ} is inconsistent. Then by Corollary 3.1.5 we have Σ ⊢ σ and
thus by Corollary 3.1.7 we get Σ ⊢ ϕ.
We finish this section with another form of the Compactness Theorem:
Theorem 3.1.9 (Compactness Theorem, second form).
If each finite subset of Σ has a model, then Σ has a model.
This follows from the second form of the Completeness Theorem.
3.2 Proof of the Completeness Theorem
We are now going to prove Theorem 3.1.1. Since (⇐) is clear, we focus our
attention on (⇒), that is, given a consistent set of sentences Σ we must show
that Σ has a model. This job will be done in a series of lemmas. Unless we say
so explicitly, we do not assume in those lemmas that Σ is consistent.
Lemma 3.2.1. Suppose Σ ⊢ ϕ and t is free for x in ϕ. Then Σ ⊢ ϕ(t/x).
Proof. From Σ ⊢ ϕ we get Σ ⊢ ∀xϕ by Lemma 3.1.2. Then MP together with
the quantifier axiom ∀xϕ → ϕ(t/x) gives Σ ⊢ ϕ(t/x) as required.
Lemma 3.2.2. Suppose Σ ⊢ ϕ, and t1, . . . , tn are terms whose variables do not
occur bound in ϕ. Then Σ ⊢ ϕ(t1/x1, . . . , tn/xn).
Proof. Take distinct variables y1, . . . , yn that do not occur in ϕ or t1, . . . , tn,
and that are distinct from x1, . . . , xn. Use Lemma 3.2.1 n times in succession to
obtain Σ ⊢ ψ where ψ = ϕ(y1/x1, . . . , yn/xn). Apply Lemma 3.2.1 again n times
to get Σ ⊢ ψ(t1/y1, . . . , tn/yn). To finish, observe that ψ(t1/y1, . . . , tn/yn) =
ϕ(t1/x1, . . . , tn/xn).
Lemma 3.2.3.
(1) For each L-term t we have ⊢ t = t.
(2) Let t, t′ be L-terms and Σ ⊢ t = t′. Then Σ ⊢ t′ = t.
(3) Let t1, t2, t3 be L-terms and Σ ⊢ t1 = t2 and Σ ⊢ t2 = t3. Then Σ ⊢ t1 = t3.
(4) Let R ∈ Lr be m-ary and let t1, t′1, . . . , tm, t′m be L-terms such that Σ ⊢
ti = t′i for i = 1, . . . , m and Σ ⊢ Rt1 . . . tm. Then Σ ⊢ Rt′1 . . . t′m.
(5) Let F ∈ Lf be n-ary, and let t1, t′1, . . . , tn, t′n be L-terms such that Σ ⊢ ti =
t′i for i = 1, . . . , n. Then Σ ⊢ Ft1 . . . tn = Ft′1 . . . t′n.
Proof. For (1), use the equality axiom x = x and apply Lemma 3.2.2. For
(2), take an equality axiom x = y → y = x and apply Lemma 3.2.2 to get
⊢ t = t′ → t′ = t. Then MP gives Σ ⊢ t′ = t.
For (3), take an equality axiom (x = y ∧ y = z) → x = z and apply Lemma
3.2.2 to get ⊢ (t1 = t2 ∧ t2 = t3) → t1 = t3. By the assumptions and Exercise 5
in Section 2.7 we have Σ ⊢ t1 = t2 ∧ t2 = t3. Then MP yields Σ ⊢ t1 = t3.
To prove (4) we first use the same Exercise 5 to get

Σ ⊢ t1 = t′1 ∧ . . . ∧ tm = t′m ∧ Rt1 . . . tm.

Take an equality axiom x1 = y1 ∧ . . . ∧ xm = ym ∧ Rx1 . . . xm → Ry1 . . . ym and
apply Lemma 3.2.2 to obtain

Σ ⊢ t1 = t′1 ∧ . . . ∧ tm = t′m ∧ Rt1 . . . tm → Rt′1 . . . t′m.

Then MP gives Σ ⊢ Rt′1 . . . t′m. Part (5) is obtained in a similar way by using
an equality axiom x1 = y1 ∧ . . . ∧ xn = yn → Fx1 . . . xn = Fy1 . . . yn.
Definition. Let TL be the set of variable-free L-terms. We define a binary
relation ∼Σ on TL by

t1 ∼Σ t2 :⇐⇒ Σ ⊢ t1 = t2.
Parts (1), (2) and (3) of the last lemma yield the following.
Lemma 3.2.4. The relation ∼Σ is an equivalence relation on TL.
Definition. Suppose L has at least one constant symbol. Then TL is non-
empty. We define the L-structure 𝔄Σ as follows:
(i) Its underlying set is AΣ := TL/∼Σ. Let [t] denote the equivalence class of
t ∈ TL with respect to ∼Σ.
(ii) If R ∈ Lr is m-ary, then R^{𝔄Σ} ⊆ (AΣ)^m is given by

([t1], . . . , [tm]) ∈ R^{𝔄Σ} :⇐⇒ Σ ⊢ Rt1 . . . tm  (t1, . . . , tm ∈ TL).

(iii) If F ∈ Lf is n-ary, then F^{𝔄Σ} : (AΣ)^n → AΣ is given by

F^{𝔄Σ}([t1], . . . , [tn]) = [Ft1 . . . tn]  (t1, . . . , tn ∈ TL).
Remark. The reader should verify that this counts as a definition, i.e., does
not introduce ambiguity. (Use parts (4) and (5) of Lemma 3.2.3.)
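Whether Σ ⊢ t1 = t2 holds is in general undecidable, but when Σ is a finite set of equations between variable-free terms, parts (1)–(5) of Lemma 3.2.3 are precisely the closure rules of congruence closure, so the classes [t] can then be computed on any finite set of terms. A union-find sketch in Python, with ground terms represented as nested tuples such as ('F', 'a'); the representation and all names are illustrative assumptions, not from the notes.

```python
def subterms(t, acc):
    """Collect t and all of its subterms into the set acc."""
    acc.add(t)
    if isinstance(t, tuple):
        for a in t[1:]:
            subterms(a, acc)
    return acc

def congruence_classes(equations, terms):
    """Close ground equations under reflexivity, symmetry, transitivity
    and congruence (parts (1)-(5) of Lemma 3.2.3); returns a function
    mapping each term to a representative of its class."""
    univ = set()
    for s, t in equations:
        subterms(s, univ); subterms(t, univ)
    for t in terms:
        subterms(t, univ)
    parent = {t: t for t in univ}
    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path compression
            t = parent[t]
        return t
    def union(s, t):
        parent[find(s)] = find(t)
    for s, t in equations:
        union(s, t)
    changed = True
    while changed:  # iterate the congruence rule to a fixpoint
        changed = False
        apps = [t for t in univ if isinstance(t, tuple)]
        for s in apps:
            for t in apps:
                if (s[0] == t[0] and len(s) == len(t) and find(s) != find(t)
                        and all(find(a) == find(b) for a, b in zip(s[1:], t[1:]))):
                    union(s, t)
                    changed = True
    return find

# With the single equation a = b: Fa and Fb fall in one class, Fa and Fc do not.
find = congruence_classes([('a', 'b')], [('F', 'a'), ('F', 'b'), ('F', 'c')])
assert find(('F', 'a')) == find(('F', 'b'))
assert find(('F', 'a')) != find(('F', 'c'))
```

The fixpoint loop is exactly part (5) of Lemma 3.2.3 applied until nothing new can be merged, which is why the quotient in the definition above is well defined.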
Corollary 3.2.5. Suppose L has a constant symbol, and Σ is consistent. Then
(1) for each t ∈ TL we have t^{𝔄Σ} = [t];
(2) for each atomic σ we have: Σ ⊢ σ ⇐⇒ 𝔄Σ ⊨ σ.
Proof. Part (1) follows by an easy induction. Let σ be Rt1 . . . tm where R ∈ Lr
is m-ary and t1, . . . , tm ∈ TL. Then

Σ ⊢ Rt1 . . . tm ⇔ ([t1], . . . , [tm]) ∈ R^{𝔄Σ} ⇔ 𝔄Σ ⊨ Rt1 . . . tm,

where the last "⇔" follows from the definition of ⊨ together with part (1). Now
suppose that σ is t1 = t2 where t1, t2 ∈ TL. Then

Σ ⊢ t1 = t2 ⇔ [t1] = [t2] ⇔ t1^{𝔄Σ} = t2^{𝔄Σ} ⇔ 𝔄Σ ⊨ t1 = t2.

We also have Σ ⊢ ⊤ ⇔ 𝔄Σ ⊨ ⊤. So far we haven't used the assumption that Σ
is consistent, but now we do. The consistency of Σ means that Σ ⊬ ⊥. We also
have 𝔄Σ ⊭ ⊥ by definition of ⊨. Thus Σ ⊢ ⊥ ⇔ 𝔄Σ ⊨ ⊥.
If the equivalence in part (2) of this corollary holds for all σ (not only for atomic
σ), then 𝔄Σ ⊨ Σ, so we would have found a model of Σ, and be done. But
clearly this equivalence can only hold for all σ if Σ has the property that for
each σ, either Σ ⊢ σ or Σ ⊢ ¬σ. This property is of interest for other reasons
as well, and deserves a name:
Definition. We say that Σ is complete if Σ is consistent, and for each σ either
Σ ⊢ σ or Σ ⊢ ¬σ.
Example. Let L = LAb, Σ := Ab (the set of axioms for abelian groups), and
σ the sentence ∃x(x ≠ 0). Then Σ ⊬ σ since the trivial group doesn't satisfy
σ. Also Σ ⊬ ¬σ, since there are nontrivial abelian groups and σ holds in such
groups. Thus Σ is not complete.
Completeness is a strong property and it can be hard to show that a given
set of axioms is complete. The set of axioms for algebraically closed fields of
characteristic 0 is complete (see the end of Section 4.3).
A key fact about completeness needed in this chapter is that any consistent
set of sentences extends to a complete set of sentences:
Lemma 3.2.6 (Lindenbaum). Suppose Σ is consistent. Then Σ ⊆ Σ′ for some
complete set of L-sentences Σ′.
The proof uses Zorn’s Lemma, and is just like that of the corresponding fact of
Propositional Logic in Section 1.2.
Completeness of Σ does not guarantee that the equivalence of part (2) of
Corollary 3.2.5 holds for all σ. Completeness is only a necessary condition
for this equivalence to hold for all σ; another necessary condition is “to have
witnesses”:
Definition. A Σ-witness for the sentence ∃xϕ(x) is a term t ∈ TL such that
Σ ⊢ ϕ(t). We say that Σ has witnesses if there is a Σ-witness for every sentence
∃xϕ(x) proved by Σ.
Theorem 3.2.7. Let L have a constant symbol, and suppose Σ is consistent.
Then the following two conditions are equivalent:
(i) For each σ we have: Σ ⊢ σ ⇔ 𝔄Σ ⊨ σ.
(ii) Σ is complete and has witnesses.
In particular, if Σ is complete and has witnesses, then 𝔄Σ is a model of Σ.
Proof. It should be clear that (i) implies (ii). For the converse, assume (ii). We
use induction on the number of logical symbols in σ to obtain (i). We already
know that (i) holds for atomic sentences. The cases that σ = ¬σ1, σ = σ1 ∨ σ2,
and σ = σ1 ∧ σ2 are treated just as in the proof of the corresponding Lemma
2.2.11 for Propositional Logic. It remains to consider two cases:
Case σ = ∃xϕ(x):
(⇒) Suppose that Σ ⊢ σ. Because we are assuming that Σ has witnesses we
have a t ∈ TL such that Σ ⊢ ϕ(t). Then by the inductive hypothesis 𝔄Σ ⊨ ϕ(t).
So by Lemma 2.6.2 we have an a ∈ AΣ such that 𝔄Σ ⊨ ϕ(a). Therefore
𝔄Σ ⊨ ∃xϕ(x), hence 𝔄Σ ⊨ σ.
(⇐) Assume 𝔄Σ ⊨ σ. Then there is an a ∈ AΣ such that 𝔄Σ ⊨ ϕ(a). Choose
t ∈ TL such that [t] = a. Then t^{𝔄Σ} = a, hence 𝔄Σ ⊨ ϕ(t) by Lemma 2.6.2.
Applying the inductive hypothesis we get Σ ⊢ ϕ(t). This yields Σ ⊢ ∃xϕ(x) by
MP and the quantifier axiom ϕ(t) → ∃xϕ(x).
Case σ = ∀xϕ(x): This is similar to the previous case but we also need the
result from Exercise 7 in Section 2.7 that ⊢ ¬∀xϕ ↔ ∃x¬ϕ.
We call attention to some new notation in the next lemmas: the symbol ⊢L is
used to emphasize that we are dealing with formal provability within L.
Lemma 3.2.8. Let Σ be a set of L-sentences, c a constant symbol not in L,
and Lc := L ∪ {c}. Let ϕ(y) be an L-formula and suppose Σ ⊢Lc ϕ(c). Then
Σ ⊢L ϕ(y).
Proof. (Sketch) Take a proof of ϕ(c) from Σ in the language Lc, and take a
variable z different from all variables occurring in that proof, and also such that
z ≠ y. Replace in every formula in this proof each occurrence of c by z. Check
that one obtains in this way a proof of ϕ(z/y) in the language L from Σ. So
Σ ⊢L ϕ(z/y), and hence by Lemma 3.2.1 we have Σ ⊢L ϕ(z/y)(y/z), that is,
Σ ⊢L ϕ(y).
Lemma 3.2.9. Assume Σ is consistent and Σ ⊢ ∃yϕ(y). Let c be a constant
symbol not in L. Put Lc := L ∪ {c}. Then Σ ∪ {ϕ(c)} is a consistent set of
Lc-sentences.
Proof. Suppose not. Then Σ ∪ {ϕ(c)} ⊢Lc ⊥. By the Deduction Lemma (3.1.3)
Σ ⊢Lc ϕ(c) → ⊥. Then by Lemma 3.2.8 we have Σ ⊢L ϕ(y) → ⊥. By G we have
Σ ⊢L ∃yϕ(y) → ⊥. Applying MP yields Σ ⊢ ⊥, contradicting the consistency
of Σ.
Lemma 3.2.10. Suppose Σ is consistent. Let σ1 = ∃x1ϕ1(x1), . . . , σn =
∃xnϕn(xn) be such that Σ ⊢ σi for every i = 1, . . . , n. Let c1, . . . , cn be
distinct constant symbols not in L. Put L′ := L ∪ {c1, . . . , cn} and Σ′ =
Σ ∪ {ϕ1(c1), . . . , ϕn(cn)}. Then Σ′ is a consistent set of L′-sentences.
Proof. The previous lemma covers the case n = 1. The general case follows by
induction on n.
In the next lemma we use a superscript “w” for “witness.”
Lemma 3.2.11. Suppose Σ is consistent. For each L-sentence σ = ∃xϕ(x) such
that Σ ⊢ σ, let cσ be a constant symbol not in L, such that if σ′ is a different
L-sentence of the form ∃x′ϕ′(x′) provable from Σ, then cσ ≠ cσ′. Put

L^w := L ∪ {cσ : σ = ∃xϕ(x) is an L-sentence such that Σ ⊢ σ},
Σ^w := Σ ∪ {ϕ(cσ) : σ = ∃xϕ(x) is an L-sentence such that Σ ⊢ σ}.

Then Σ^w is a consistent set of L^w-sentences.
Proof. Suppose not. Then Σ^w ⊢ ⊥. Take a proof of ⊥ from Σ^w and let
cσ1, . . . , cσn be constant symbols in L^w \ L such that this proof is a proof
of ⊥ in the language L ∪ {cσ1, . . . , cσn} from Σ ∪ {ϕ1(cσ1), . . . , ϕn(cσn)}, where
σi = ∃xiϕi(xi) for 1 ≤ i ≤ n. So Σ ∪ {ϕ1(cσ1), . . . , ϕn(cσn)} is an inconsistent
set of L ∪ {cσ1, . . . , cσn}-sentences. This contradicts Lemma 3.2.10.
Lemma 3.2.12. Let (Ln) be an increasing sequence of languages: L0 ⊆ L1 ⊆
L2 ⊆ . . . , and denote their union by L∞. Let Σn be a consistent set of Ln-
sentences, for each n, such that Σ0 ⊆ Σ1 ⊆ Σ2 ⊆ . . . . Then the union Σ∞ :=
⋃n Σn is a consistent set of L∞-sentences.
Proof. Suppose that Σ∞ ⊢ ⊥. Take a proof of ⊥ from Σ∞. Then we can choose
n so large that this is actually a proof of ⊥ from Σn in Ln. This contradicts the
consistency of Σn.
Suppose the language L∗ extends L, let 𝔄 be an L-structure, and let 𝔄∗ be an
L∗-structure. Then 𝔄 is said to be a reduct of 𝔄∗ (and 𝔄∗ an expansion of 𝔄)
if 𝔄 and 𝔄∗ have the same underlying set and the same interpretations of the
symbols of L. For example, (N; 0, +) is a reduct of (N; <, 0, 1, +, ·). Note that
any L∗-structure 𝔄∗ has a unique reduct to an L-structure, which we indicate
by 𝔄∗|L. A key fact (to be verified by the reader) is that if 𝔄 is a reduct of
𝔄∗, then t^𝔄 = t^{𝔄∗} for all variable-free LA-terms t, and

𝔄 ⊨ σ ⇐⇒ 𝔄∗ ⊨ σ

for all LA-sentences σ.
We can now prove Theorem 3.1.1.
Proof. Let Σ be a consistent set of L-sentences. We construct a sequence (Ln)
of languages and a sequence (Σn) where each Σn is a consistent set of Ln-
sentences. We begin by setting L0 = L and Σ0 = Σ. Given the language Ln
and the consistent set of Ln-sentences Σn, put

Ln+1 := Ln if n is even, and Ln+1 := Ln^w if n is odd;

choose a complete set of Ln-sentences Σ′n ⊇ Σn, and put

Σn+1 := Σ′n if n is even, and Σn+1 := Σn^w if n is odd.

Here Ln^w and Σn^w are obtained from Ln and Σn in the same way that L^w and
Σ^w are obtained from L and Σ in Lemma 3.2.11. Note that Ln ⊆ Ln+1, and
Σn ⊆ Σn+1.
By the previous lemma the set Σ∞ of L∞-sentences is consistent. It is also
complete. To see this, let σ be an L∞-sentence. Take n even and so large that σ
is an Ln-sentence. Then Σn+1 ⊢ σ or Σn+1 ⊢ ¬σ, and thus Σ∞ ⊢ σ or Σ∞ ⊢ ¬σ.
We claim that Σ∞ has witnesses. To see this, let σ = ∃xϕ(x) be an L∞-
sentence such that Σ∞ ⊢ σ. Now take n to be odd and so large that σ is
an Ln-sentence and Σn ⊢ σ. Then by construction of Σn+1 = Σn^w we have
Σn+1 ⊢ ϕ(cσ), so Σ∞ ⊢ ϕ(cσ).
It follows from Theorem 3.2.7 that Σ∞ has a model, namely 𝔄Σ∞. Put
𝔄 := 𝔄Σ∞|L. Then 𝔄 ⊨ Σ. This concludes the proof of the Completeness
Theorem (second form).
Exercises.
(1) Suppose Σ is consistent. Then Σ is complete if and only if every two models of
Σ satisfy the same sentences.
(2) Let L have just a constant symbol c, a unary relation symbol U and a unary
function symbol f, and suppose that Σ ⊢ Ufc, and that f does not occur in the
sentences of Σ. Then Σ ⊢ ∀xUx.
3.3 Some Elementary Results of Predicate Logic
Here we obtain some generalities of predicate logic: Equivalence and Equality
Theorems, Variants, and Prenex Form. In some proofs we shall take advantage
of the fact that the Completeness Theorem is now available.
Lemma 3.3.1 (Distribution Rule).
(i) Suppose Σ ⊢ ϕ → ψ. Then Σ ⊢ ∃xϕ → ∃xψ and Σ ⊢ ∀xϕ → ∀xψ.
(ii) Suppose Σ ⊢ ϕ ↔ ψ. Then Σ ⊢ ∃xϕ ↔ ∃xψ and Σ ⊢ ∀xϕ ↔ ∀xψ.
Proof. We only do (i), since (ii) then follows easily. Let 𝔄 be a model of Σ. By
the Completeness Theorem it suffices to show that then 𝔄 ⊨ ∃xϕ → ∃xψ and
𝔄 ⊨ ∀xϕ → ∀xψ. We shall prove 𝔄 ⊨ ∃xϕ → ∃xψ and leave the other part
to the reader. We have 𝔄 ⊨ ϕ → ψ. Choose variables y1, . . . , yn such that
ϕ = ϕ(x, y1, . . . , yn) and ψ = ψ(x, y1, . . . , yn). We need only show that then
for all a1, . . . , an ∈ A

𝔄 ⊨ ∃xϕ(x, a1, . . . , an) → ∃xψ(x, a1, . . . , an).

Suppose 𝔄 ⊨ ∃xϕ(x, a1, . . . , an). Then 𝔄 ⊨ ϕ(a0, a1, . . . , an) for some a0 ∈ A.
From 𝔄 ⊨ ϕ → ψ we obtain 𝔄 ⊨ ϕ(a0, . . . , an) → ψ(a0, . . . , an), hence 𝔄 ⊨
ψ(a0, . . . , an), and thus 𝔄 ⊨ ∃xψ(x, a1, . . . , an).
Theorem 3.3.2 (Equivalence Theorem). Suppose Σ ⊢ ϕ ↔ ϕ′, and let ψ′ be
obtained from ψ by replacing some occurrence of ϕ as a subformula of ψ by ϕ′.
Then ψ′ is again a formula and Σ ⊢ ψ ↔ ψ′.
Proof. By induction on the number of logical symbols in ψ. If ψ is atomic, then
necessarily ψ = ϕ and ψ′ = ϕ′, and the desired result holds trivially.
Suppose that ψ = ¬θ. Then either ψ = ϕ and ψ′ = ϕ′, and the desired
result holds trivially, or the occurrence of ϕ we are replacing is an occurrence
in θ. Then the inductive hypothesis gives Σ ⊢ θ ↔ θ′, where θ′ is obtained by
replacing that occurrence (of ϕ) by ϕ′. Then ψ′ = ¬θ′ and the desired result
follows easily. The cases ψ = ψ1 ∨ ψ2 and ψ = ψ1 ∧ ψ2 are left as exercises.
Suppose that ψ = ∃xθ. The case ψ = ϕ (and thus ψ′ = ϕ′) is trivial. Suppose
ψ ≠ ϕ. Then the occurrence of ϕ we are replacing is an occurrence inside θ.
So by inductive hypothesis we have Σ ⊢ θ ↔ θ′. Then by the distribution rule
Σ ⊢ ∃xθ ↔ ∃xθ′. The proof is similar if ψ = ∀xθ.
Definition. We say ϕ1 and ϕ2 are Σ-equivalent if Σ ⊢ ϕ1 ↔ ϕ2. (In case Σ = ∅,
we just say equivalent.) One verifies easily that Σ-equivalence is an equivalence
relation on the set of L-formulas.
Given a family (ϕi)i∈I of formulas with finite index set I we choose a bijection
k ↦ i(k) : {1, . . . , n} → I and set

⋁i∈I ϕi := ϕi(1) ∨ · · · ∨ ϕi(n),  ⋀i∈I ϕi := ϕi(1) ∧ · · · ∧ ϕi(n).

If I is clear from context we just write ⋁i ϕi and ⋀i ϕi instead. Of course, these
notations ⋁i∈I ϕi and ⋀i∈I ϕi can only be used when the particular choice of
bijection of {1, . . . , n} with I does not matter; this is usually the case because
the equivalence class of ϕi(1) ∨ · · · ∨ ϕi(n) does not depend on this choice, and
the same is true with "∧" instead of "∨".
Definition. A variant of a formula is obtained by successive replacements of
the following type:
(i) replace an occurrence of a subformula ∃xϕ by ∃yϕ(y/x);
(ii) replace an occurrence of a subformula ∀xϕ by ∀yϕ(y/x),
where y is free for x in ϕ and y does not occur free in ϕ.
Lemma 3.3.3. A formula is equivalent to any of its variants.
Proof. By the Equivalence Theorem (3.3.2) it suffices to show ⊢ ∃xϕ ↔ ∃yϕ(y/x)
and ⊢ ∀xϕ ↔ ∀yϕ(y/x) where y is free for x in ϕ and does not occur free in ϕ.
We prove the first equivalence, leaving the second as an exercise. Applying G
to the quantifier axiom ϕ(y/x) → ∃xϕ gives ⊢ ∃yϕ(y/x) → ∃xϕ. Similarly we
get ⊢ ∃xϕ → ∃yϕ(y/x) (use that ϕ = ϕ(y/x)(x/y) by the assumption on y).
An application of Exercise 6 of Section 2.7 finishes the proof.
Definition. A formula in prenex form is a formula Q1x1 . . . Qnxnϕ where
x1, . . . , xn are distinct variables, each Qi ∈ {∃, ∀}, and ϕ is quantifier-free. We
call Q1x1 . . . Qnxn the prefix, and ϕ the matrix of the formula. Note that a
quantifier-free formula is in prenex form; this is the case n = 0.
We leave the proof of the next lemma as an exercise. Instead of "occurrence of
... as a subformula" we say "part ...". In this lemma Q denotes a quantifier,
that is, Q ∈ {∃, ∀}, and Q′ denotes the other quantifier: ∃′ = ∀ and ∀′ = ∃.
Lemma 3.3.4. The following prenex transformations always change a formula
into an equivalent formula:
(1) replace the formula by one of its variants;
(2) replace a part ¬Qxψ by Q′x¬ψ;
(3) replace a part (Qxψ) ∨ θ by Qx(ψ ∨ θ), where x is not free in θ;
(4) replace a part ψ ∨ Qxθ by Qx(ψ ∨ θ), where x is not free in ψ;
(5) replace a part (Qxψ) ∧ θ by Qx(ψ ∧ θ), where x is not free in θ;
(6) replace a part ψ ∧ Qxθ by Qx(ψ ∧ θ), where x is not free in ψ.
Remark. Note that the free variables of a formula (those that occur free in the
formula) do not change under prenex transformations.
Theorem 3.3.5 (Prenex Form). Every formula can be changed into one in
prenex form by a finite sequence of prenex transformations. In particular, each
formula is equivalent to one in prenex form.
Proof. By induction on the number of logical symbols. Atomic formulas are
already in prenex form. To simplify notation, write ϕ ⟹pr ψ to indicate
that ψ can be obtained from ϕ by a finite sequence of prenex transformations.
Assume inductively that

ϕ1 ⟹pr Q1x1 . . . Qmxmψ1,  ϕ2 ⟹pr Qm+1y1 . . . Qm+nynψ2,

where Q1, . . . , Qm, . . . , Qm+n ∈ {∃, ∀}, x1, . . . , xm are distinct, y1, . . . , yn are
distinct, and ψ1 and ψ2 are quantifier-free.
Then for ϕ := ¬ϕ1, we have

ϕ ⟹pr ¬Q1x1 . . . Qmxmψ1.

Applying m prenex transformations of type (2) we get

¬Q1x1 . . . Qmxmψ1 ⟹pr Q′1x1 . . . Q′mxm¬ψ1,

hence ϕ ⟹pr Q′1x1 . . . Q′mxm¬ψ1.
Next, let ϕ := ϕ1 ∨ ϕ2. The assumptions above yield

ϕ ⟹pr (Q1x1 . . . Qmxmψ1) ∨ (Qm+1y1 . . . Qm+nynψ2).

Replacing here the RHS (right-hand side) by a variant we may assume that
{x1, . . . , xm} ∩ {y1, . . . , yn} = ∅, and also that no xi occurs free in ψ2 and no
yj occurs free in ψ1. Applying m + n times prenex transformations of types (3)
and (4) we obtain

(Q1x1 . . . Qmxmψ1) ∨ (Qm+1y1 . . . Qm+nynψ2) ⟹pr
Q1x1 . . . QmxmQm+1y1 . . . Qm+nyn(ψ1 ∨ ψ2).

Hence ϕ ⟹pr Q1x1 . . . QmxmQm+1y1 . . . Qm+nyn(ψ1 ∨ ψ2). Likewise, to deal
with ϕ1 ∧ ϕ2, we apply prenex transformations of types (5) and (6).
Next, let ϕ := ∃xϕ1. Applying prenex transformations of type (1) we can
assume x1, . . . , xm are different from x. Then ϕ ⟹pr ∃xQ1x1 . . . Qmxmψ1, and
the RHS is in prenex form. The case ϕ := ∀xϕ1 is similar.
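The induction in this proof is in effect an algorithm for computing a prenex form. A sketch in Python, under assumed conventions of ours: a formula is a nested tuple whose head is one of 'not', 'or', 'and', 'exists', 'forall', and anything else counts as atomic; transformation (1) is implemented by renaming every bound variable to a fresh name, after which rules (2)–(6) apply without side conditions.

```python
import itertools

OPS = {'not', 'or', 'and', 'exists', 'forall'}
DUAL = {'exists': 'forall', 'forall': 'exists'}

def rename(phi, env, fresh):
    """Transformation (1): pass to a variant whose bound variables are
    distinct fresh names, so rules (3)-(6) below always apply."""
    if not (isinstance(phi, tuple) and phi and phi[0] in OPS):
        if isinstance(phi, tuple):  # atomic, e.g. ('P', 'x'): rename variables
            return tuple(env.get(s, s) for s in phi)
        return env.get(phi, phi)
    if phi[0] == 'not':
        return ('not', rename(phi[1], env, fresh))
    if phi[0] in ('or', 'and'):
        return (phi[0], rename(phi[1], env, fresh), rename(phi[2], env, fresh))
    q, x, body = phi
    y = next(fresh)
    return (q, y, rename(body, {**env, x: y}, fresh))

def pull(phi):
    """Rules (2)-(6): move quantifiers outward; returns (prefix, matrix)."""
    if not (isinstance(phi, tuple) and phi and phi[0] in OPS):
        return [], phi
    if phi[0] == 'not':  # rule (2): replace  not Qx psi  by  Q'x not psi
        prefix, matrix = pull(phi[1])
        return [(DUAL[q], x) for q, x in prefix], ('not', matrix)
    if phi[0] in ('or', 'and'):  # rules (3)-(6); bound variables are disjoint
        p1, m1 = pull(phi[1])
        p2, m2 = pull(phi[2])
        return p1 + p2, (phi[0], m1, m2)
    q, x, body = phi
    prefix, matrix = pull(body)
    return [(q, x)] + prefix, matrix

def prenex(phi):
    fresh = (f'v{n}' for n in itertools.count())
    prefix, matrix = pull(rename(phi, {}, fresh))
    for q, x in reversed(prefix):
        matrix = (q, x, matrix)
    return matrix

# Rule (2) in action:  not exists x P(x)  ~>  forall v0 (not P(v0))
assert prenex(('not', ('exists', 'x', ('P', 'x')))) == \
    ('forall', 'v0', ('not', ('P', 'v0')))
```

Because `rename` standardizes the bound variables apart first, pulling one disjunct's prefix past the other disjunct never captures a variable, which is exactly the role of transformation (1) in the proof above.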
We finish this section with results on equalities. Note that by Corollary 2.1.7,
the result of replacing an occurrence of an L-term τ in t by an L-term τ′ is an
L-term t′.
Proposition 3.3.6. Let τ and τ′ be L-terms such that Σ ⊢ τ = τ′, and let t′ be
obtained from t by replacing an occurrence of τ in t by τ′. Then Σ ⊢ t = t′.
Proof. We proceed by induction on terms. First note that if t = τ, then t′ = τ′.
This fact takes care of the case that t is a variable. Suppose t = Ft1 . . . tn
where F ∈ Lf is n-ary and t1, . . . , tn are L-terms, and assume t ≠ τ. Using the
facts on admissible words at the end of Section 2.1, including exercise 5, we see
that t′ = Ft′1 . . . t′n where for some i ∈ {1, . . . , n} we have t′j = tj for all j ≠ i,
j ∈ {1, . . . , n}, and t′i is obtained from ti by replacing an occurrence of τ in ti
by τ′. Inductively we can assume that Σ ⊢ t1 = t′1, . . . , Σ ⊢ tn = t′n, so by part
(5) of Lemma 3.2.3 we have Σ ⊢ t = t′.
An occurrence of an L-term τ in ϕ is said to be inside quantifier scope if the
occurrence is inside an occurrence of a subformula ∃xψ or ∀xψ in ϕ. An occurrence
of an L-term τ in ϕ is said to be outside quantifier scope if the occurrence
is not inside quantifier scope.
Proposition 3.3.7. Let τ and τ′ be L-terms such that Σ ⊢ τ = τ′, and let ϕ′ be
obtained from ϕ by replacing an occurrence of τ in ϕ outside quantifier scope by
τ′. Then Σ ⊢ ϕ ↔ ϕ′.
Proof. First take care of the case that ϕ is atomic by a similar argument as for
terms, and then proceed by the usual kind of induction on formulas.
Exercises.
(1) Let P be a unary relation symbol, Q be a binary relation symbol, and x, y distinct
variables. Use prenex transformations to put
∀x∃y(P(x) ∧ Q(x, y)) → ∃x∀y(Q(x, y) → P(y))
into prenex form.
(2) Let (ϕi)i∈I be a family of formulas with finite index set I. Then

⊢ (∃x ⋁i ϕi) ←→ (⋁i ∃xϕi),  ⊢ (∀x ⋀i ϕi) ←→ (⋀i ∀xϕi).
(3) If ϕ1(x1, . . . , xm) and ϕ2(x1, . . . , xm) are existential formulas, then

(ϕ1 ∨ ϕ2)(x1, . . . , xm), (ϕ1 ∧ ϕ2)(x1, . . . , xm)

are equivalent to existential formulas ϕ^∨_{12}(x1, . . . , xm) and ϕ^∧_{12}(x1, . . . , xm). The
same holds with "existential" replaced by "universal".
(4) A formula is said to be unnested if each atomic subformula has the form Rx1 . . . xm
with m-ary R ∈ Lr ∪ {⊤, ⊥, =} and distinct variables x1, . . . , xm, or the form
Fx1 . . . xn = xn+1 with n-ary F ∈ Lf and distinct variables x1, . . . , xn+1. (This
allows ⊤ and ⊥ as atomic subformulas of unnested formulas.) Then:
(i) for each term t(x1, . . . , xm) and variable y ∉ {x1, . . . , xm} the formula

t(x1, . . . , xm) = y

is equivalent to an unnested existential formula θ1(x1, . . . , xm, y), and also to an
unnested universal formula θ2(x1, . . . , xm, y).
(ii) each atomic formula ϕ(y1, . . . , yn) is equivalent to an unnested existential
formula ϕ1(y1, . . . , yn), and also to an unnested universal formula ϕ2(y1, . . . , yn).
(iii) each formula ϕ(y1, . . . , yn) is equivalent to an unnested formula ϕu(y1, . . . , yn).
3.4 Equational Classes and Universal Algebra
The term-structure 𝔄Σ introduced in the proof of the Completeness Theorem
also plays a role in what is called universal algebra. This is a general setting
for constructing mathematical objects by generators and relations. Free groups,
tensor products of various kinds, polynomial rings, and so on, are all special
cases of a single construction in universal algebra.
In this section we fix a language L that has only function symbols, including
at least one constant symbol. So L has no relation symbols. Instead of "L-
structure" we say "L-algebra", and 𝔄, 𝔅 denote L-algebras. A substructure of
𝔄 is also called a subalgebra of 𝔄, and a quotient algebra of 𝔄 is an L-algebra
𝔄/∼ where ∼ is a congruence on 𝔄. We call 𝔄 trivial if |A| = 1. There is up
to isomorphism exactly one trivial L-algebra.
An L-identity is an L-sentence

∀x (s1(x) = t1(x) ∧ · · · ∧ sn(x) = tn(x)),  x = (x1, . . . , xm),

where x1, . . . , xm are distinct variables and ∀x abbreviates ∀x1 . . . ∀xm, and
where s1, t1, . . . , sn, tn are L-terms.
Given a set Σ of L-identities we define a Σ-algebra to be an L-algebra that
satisfies all identities in Σ; in other words, a Σ-algebra is the same as a model
of Σ. To such a Σ we associate the class Mod(Σ) of all Σ-algebras. A class 𝒞 of
L-algebras is said to be equational if there is a set Σ of L-identities such that
𝒞 = Mod(Σ).
Examples. With L = LGr, Gr is a set of L-identities, and Mod(Gr), the class
of groups, is the corresponding equational class of L-algebras. With L = LRi,
Ri is a set of L-identities, and Mod(Ri), the class of rings, is the corresponding
equational class of L-algebras. If one adds to Ri the identity ∀x∀y(xy = yx)
expressing the commutative law, then the corresponding class is the class of
commutative rings.
Theorem 3.4.1 (G. Birkhoff). Let 𝒞 be a class of L-algebras. Then the class 𝒞
is equational if and only if the following conditions are satisfied:
(1) closure under isomorphism: if 𝔄 ∈ 𝒞 and 𝔄 ≅ 𝔅, then 𝔅 ∈ 𝒞;
(2) the trivial L-algebra belongs to 𝒞;
(3) every subalgebra of any algebra in 𝒞 belongs to 𝒞;
(4) every quotient algebra of any algebra in 𝒞 belongs to 𝒞;
(5) the product of any family (𝔄i) of algebras in 𝒞 belongs to 𝒞.
It is easy to see that if 𝒞 is equational, then conditions (1)–(5) are satisfied. (For
(3) and (4) one can also appeal to Exercises 7 and 8 of Section 2.5.) Towards
a proof of the converse, we need some universal-algebraic considerations that
are of interest beyond the connection to Birkhoff's theorem.
For the rest of this section we fix a set Σ of L-identities. Associated to Σ is the
term algebra 𝔄Σ whose elements are the equivalence classes [t] of variable-free
L-terms t, where two such terms s and t are equivalent iff Σ ⊢ s = t.
Lemma 3.4.2. 𝔄Σ is a Σ-algebra.
Proof. Consider an identity

∀x (s1(x) = t1(x) ∧ · · · ∧ sn(x) = tn(x)),  x = (x1, . . . , xm),

in Σ, let j ∈ {1, . . . , n} and put s = sj and t = tj. Let a1, . . . , am ∈ AΣ and put
𝔄 = 𝔄Σ. It suffices to show that then s(a1, . . . , am)^𝔄 = t(a1, . . . , am)^𝔄. Take
variable-free L-terms α1, . . . , αm such that a1 = [α1], . . . , am = [αm]. Then by
part (1) of Corollary 3.2.5 we have a1 = α1^𝔄, . . . , am = αm^𝔄, so

s(a1, . . . , am)^𝔄 = s(α1, . . . , αm)^𝔄,  t(a1, . . . , am)^𝔄 = t(α1, . . . , αm)^𝔄

by Lemma 2.6.1. Also, by part (1) of Corollary 3.2.5,

s(α1, . . . , αm)^𝔄 = [s(α1, . . . , αm)],  t(α1, . . . , αm)^𝔄 = [t(α1, . . . , αm)].

Now Σ ⊢ s(α1, . . . , αm) = t(α1, . . . , αm), so [s(α1, . . . , αm)] = [t(α1, . . . , αm)],
and thus s(a1, . . . , am)^𝔄 = t(a1, . . . , am)^𝔄, as desired.
Actually, we are going to show that 𝒜_Σ is a so-called initial Σ-algebra. An initial Σ-algebra is a Σ-algebra 𝒜 such that for any Σ-algebra ℬ there is a unique homomorphism 𝒜 → ℬ. For example, the trivial group is an initial Gr-algebra, and the ring of integers is an initial Ri-algebra.

Suppose 𝒜 and ℬ are both initial Σ-algebras. Then there is a unique isomorphism 𝒜 → ℬ. To see this, let i and j be the unique homomorphisms 𝒜 → ℬ and ℬ → 𝒜, respectively. Then we have homomorphisms j ∘ i : 𝒜 → 𝒜 and i ∘ j : ℬ → ℬ, respectively, so necessarily j ∘ i = id_A and i ∘ j = id_B, so i and j are isomorphisms. So if there is an initial Σ-algebra, it is unique up to unique isomorphism.
Lemma 3.4.3. 𝒜_Σ is an initial Σ-algebra.

Proof. Let ℬ be any Σ-algebra. Note that if s, t ∈ T_L and [s] = [t], then Σ ⊢ s = t, so s^ℬ = t^ℬ. Thus we have a map

A_Σ → B,   [t] ↦ t^ℬ.

It is easy to check that this map is a homomorphism 𝒜_Σ → ℬ. By Exercise 4 in Section 2.4 it is the only such homomorphism.
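The map [t] ↦ t^ℬ is just evaluation of variable-free terms. As a concrete sketch (the tuple representation and dictionary of interpretations below are my own notation, not from the notes), one can represent a term as a nested tuple ('f', t_1, ..., t_n) and an L-algebra as a dict assigning each symbol its interpretation; evaluation commutes with every operation by construction, which is exactly why it is a homomorphism out of the term algebra:

```python
def evaluate(term, algebra):
    """Interpret a variable-free term in the given algebra: the map t |-> t^B."""
    op, *args = term
    # Apply the interpretation of the outermost symbol to the values of the subterms.
    return algebra[op](*(evaluate(a, algebra) for a in args))

# Example: the language {1, *} of monoids, interpreted in (Z; 1, *).
int_monoid = {"one": lambda: 1, "mul": lambda a, b: a * b}
t = ("mul", ("one",), ("mul", ("one",), ("one",)))  # the term 1 * (1 * 1)
print(evaluate(t, int_monoid))  # -> 1
```

Evaluating the same term in a different algebra for the same language (say, strings under concatenation) gives the unique homomorphism into that algebra instead.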
Free algebras. Let I be an index set in what follows. Let 𝒜 be a Σ-algebra and (a_i)_{i∈I} an I-indexed family of elements of A. Then 𝒜 is said to be a free Σ-algebra on (a_i) if for every Σ-algebra ℬ and I-indexed family (b_i) of elements of B there is exactly one homomorphism h : 𝒜 → ℬ such that h(a_i) = b_i for all i ∈ I. We also express this by "(𝒜, (a_i)) is a free Σ-algebra". Finally, 𝒜 itself is sometimes referred to as a free Σ-algebra if there is a family (a_j)_{j∈J} in A such that (𝒜, (a_j)) is a free Σ-algebra.
As an example, take L = L_Ri and cRi := Ri ∪ {∀x∀y(xy = yx)}, where x, y are distinct variables. So the cRi-algebras are just the commutative rings. Let Z[X_1, ..., X_n] be the ring of polynomials in distinct indeterminates X_1, ..., X_n over Z. For any commutative ring R and elements b_1, ..., b_n ∈ R we have a unique ring homomorphism Z[X_1, ..., X_n] → R that sends X_i to b_i for i = 1, ..., n, namely the evaluation map (or substitution map)

Z[X_1, ..., X_n] → R,   f(X_1, ..., X_n) ↦ f(b_1, ..., b_n).

Thus Z[X_1, ..., X_n] is a free commutative ring on (X_i)_{1≤i≤n}.
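As a hedged illustration of the evaluation map f(X_1, ..., X_n) ↦ f(b_1, ..., b_n), here is a small sketch; the dictionary representation of polynomials (exponent tuples to integer coefficients) and the function name are my own choices, not from the notes:

```python
def evaluate_poly(poly, b):
    """Apply the substitution map f(X_1,...,X_n) |-> f(b_1,...,b_n)."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for bi, e in zip(b, exps):
            term *= bi ** e  # substitute b_i for X_i
        total += term
    return total

# f = 2*X1**2*X2 - 3*X2 + 5, evaluated at (X1, X2) = (2, 3) in R = Z:
f = {(2, 1): 2, (0, 1): -3, (0, 0): 5}
print(evaluate_poly(f, (2, 3)))  # 2*4*3 - 3*3 + 5 = 20
```

Since this map respects addition and multiplication of polynomials, it is the unique ring homomorphism of the text.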
For a simpler example, let L = L_Mo := {1, ·} ⊆ L_Gr be the language of monoids, and consider

Mo := {∀x(1·x = x ∧ x·1 = x), ∀x∀y∀z((x·y)·z = x·(y·z))},

where x, y, z are distinct variables. A monoid, or semigroup with identity, is a model 𝒜 = (A; 1, ·) of Mo, and we call 1 ∈ A the identity of the monoid 𝒜, and · its product operation.

Let E* be the set of words on an alphabet E, and consider E* as a monoid by taking the empty word as its identity and the concatenation operation (v, w) ↦ vw as its product operation. Then E* is a free monoid on the family (e)_{e∈E} of words of length 1, because for any monoid ℬ and elements b_e ∈ B (for e ∈ E) we have a unique monoid homomorphism E* → ℬ that sends each e ∈ E to b_e, namely,

e_1 ... e_n ↦ b_{e_1} · ... · b_{e_n}.
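The universal property of E* can be run directly: given images b_e of the letters in a target monoid B, the induced map on words is the unique homomorphism just described. A minimal sketch (the helper name and the choice of target monoid (N; 0, +) are mine):

```python
from functools import reduce

def induced_hom(images, unit, mul):
    """The unique monoid homomorphism E* -> B with e |-> images[e]."""
    def h(word):
        # e_1 ... e_n  |->  b_{e_1} * ... * b_{e_n}; the empty word goes to the unit.
        return reduce(mul, (images[e] for e in word), unit)
    return h

h = induced_hom({"a": 2, "b": 3}, 0, lambda x, y: x + y)  # B = (N; 0, +)
print(h("abba"))  # 2 + 3 + 3 + 2 = 10
print(h(""))      # empty word -> identity of B, i.e. 0
```

Being a homomorphism means h(vw) = h(v) + h(w) for all words v, w, which holds here because reduce over a concatenation splits as a sum.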
Remark. If 𝒜 and ℬ are both free Σ-algebras on (a_i) and (b_i) respectively, with the same index set I, and g : 𝒜 → ℬ and h : ℬ → 𝒜 are the unique homomorphisms such that g(a_i) = b_i and h(b_i) = a_i for all i, then (h ∘ g)(a_i) = a_i for all i, so h ∘ g = id_A, and likewise g ∘ h = id_B, so g is an isomorphism with inverse h. Thus, given I, there is, up to unique isomorphism preserving I-indexed families, at most one free Σ-algebra on an I-indexed family of its elements.
We shall now construct free Σ-algebras as initial algebras by working in an extended language. Let L_I := L ∪ {c_i : i ∈ I} be the language L augmented by new constant symbols c_i for i ∈ I, where new means that c_i ∉ L for i ∈ I and c_i ≠ c_j for distinct i, j ∈ I. So an L_I-algebra (ℬ, (b_i)) is just an L-algebra ℬ equipped with an I-indexed family (b_i) of elements of B. Let Σ_I be Σ viewed as a set of L_I-identities. Then a free Σ-algebra on an I-indexed family of its elements is just an initial Σ_I-algebra. In particular, the Σ_I-algebra 𝒜_{Σ_I} is a free Σ-algebra on ([c_i]). Thus, up to unique isomorphism of Σ_I-algebras, there is a unique free Σ-algebra on an I-indexed family of its elements.
Let (𝒜, (a_i)_{i∈I}) be a free Σ-algebra. Then 𝒜 is generated by (a_i). To see why, let ℬ be the subalgebra of 𝒜 generated by (a_i). Then we have a unique homomorphism h : 𝒜 → ℬ such that h(a_i) = a_i for all i ∈ I, and then the composition

𝒜 → ℬ → 𝒜

is necessarily id_A, so ℬ = 𝒜.
Let ℬ be any Σ-algebra, and take any family (b_j)_{j∈J} in B that generates ℬ. Take a free Σ-algebra (𝒜, (a_j)_{j∈J}), and take the unique homomorphism h : (𝒜, (a_j)) → (ℬ, (b_j)). Then h(t^𝒜(a_{j_1}, ..., a_{j_n})) = t^ℬ(b_{j_1}, ..., b_{j_n}) for all L-terms t(x_1, ..., x_n) and j_1, ..., j_n ∈ J, so h(A) = B, and thus h induces an isomorphism 𝒜/∼_h ≅ ℬ. We have shown:

Every Σ-algebra is isomorphic to a quotient of a free Σ-algebra.

This fact can sometimes be used to reduce problems on Σ-algebras to the case of free Σ-algebras; see the next subsection for an example.
Proof of Birkhoff's theorem. Let us say that a class 𝒞 of L-algebras is closed if it has properties (1)–(5) listed in Theorem 3.4.1. Assume 𝒞 is closed; we have to show that then 𝒞 is equational. Indeed, let Σ(𝒞) be the set of L-identities

∀x(s(x) = t(x))

that are true in all algebras of 𝒞. It is clear that each algebra in 𝒞 is a Σ(𝒞)-algebra, and it remains to show that every Σ(𝒞)-algebra belongs to 𝒞. Here is the key fact from which this will follow:

Claim. If 𝒜 is an initial Σ(𝒞)-algebra, then 𝒜 ∈ 𝒞.

To prove this claim we take 𝒜 := 𝒜_{Σ(𝒞)}. For s, t ∈ T_L such that s = t does not belong to Σ(𝒞) we pick ℬ_{s,t} ∈ 𝒞 such that ℬ_{s,t} ⊨ s ≠ t, and we let h_{s,t} : 𝒜 → ℬ_{s,t} be the unique homomorphism, so h_{s,t}([s]) ≠ h_{s,t}([t]). Let ℬ := ∏ ℬ_{s,t} where the product is over all pairs (s, t) as above, and let h : 𝒜 → ℬ be the homomorphism given by h(a) = (h_{s,t}(a)). Note that ℬ ∈ 𝒞. Then h is injective. To see why, let s, t ∈ T_L be such that [s] ≠ [t] in A = A_{Σ(𝒞)}. Then s = t does not belong to Σ(𝒞), so h_{s,t}([s]) ≠ h_{s,t}([t]), and thus h([s]) ≠ h([t]). This injectivity gives 𝒜 ≅ h(𝒜) ⊆ ℬ, so 𝒜 ∈ 𝒞. This finishes the proof of the claim.
Now, every Σ(𝒞)-algebra is isomorphic to a quotient of a free Σ(𝒞)-algebra, so it remains to show that free Σ(𝒞)-algebras belong to 𝒞. Let 𝒜 be a free Σ(𝒞)-algebra on (a_i)_{i∈I}. Let 𝒞_I be the class of all L_I-algebras (ℬ, (b_i)) with ℬ ∈ 𝒞. It is clear that 𝒞_I is closed as a class of L_I-algebras. Now, (𝒜, (a_i)) is easily seen to be an initial Σ(𝒞_I)-algebra. By the claim above, applied to 𝒞_I in place of 𝒞, we obtain (𝒜, (a_i)) ∈ 𝒞_I, and thus 𝒜 ∈ 𝒞.
Generators and relations. Let G be any set. Then we have a Σ-algebra 𝒜 with a map ι : G → A such that for any Σ-algebra ℬ and any map j : G → B there is a unique homomorphism h : 𝒜 → ℬ such that h ∘ ι = j; in other words, 𝒜 is free as a Σ-algebra on (ιg)_{g∈G}. Note that if (𝒜′, ι′) (with ι′ : G → A′) has the same universal property as (𝒜, ι), then the unique homomorphism h : 𝒜 → 𝒜′ such that h ∘ ι = ι′ is an isomorphism, so this universal property determines the pair (𝒜, ι) up to unique isomorphism. So there is no harm in calling (𝒜, ι) the free Σ-algebra on G. Note that 𝒜 is generated as an L-algebra by ιG.

Here is a particular way of constructing the free Σ-algebra on G. Take the language L_G := L ∪ G (disjoint union) with the elements of G as constant symbols. Let Σ(G) be Σ considered as a set of L_G-identities. Then 𝒜 := 𝒜_{Σ(G)} as a Σ-algebra with the map g ↦ [g] : G → A_{Σ(G)} is the free Σ-algebra on G.
Next, let R be a set of sentences s(g) = t(g) where s(x_1, ..., x_n) and t(x_1, ..., x_n) are L-terms and g = (g_1, ..., g_n) ∈ G^n (with n depending on the sentence). We wish to construct the Σ-algebra generated by G with R as set of relations.¹ This object is described up to isomorphism in the next lemma.

Lemma 3.4.4. There is a Σ-algebra 𝒜(G, R) with a map i : G → A(G, R) such that:

(1) 𝒜(G, R) ⊨ s(ig) = t(ig) for all s(g) = t(g) in R;
(2) for any Σ-algebra ℬ and map j : G → B with ℬ ⊨ s(jg) = t(jg) for all s(g) = t(g) in R, there is a unique homomorphism h : 𝒜(G, R) → ℬ such that h ∘ i = j.

Proof. Let Σ(R) := Σ ∪ R, viewed as a set of L_G-sentences, let 𝒜(G, R) := 𝒜_{Σ(R)}, and define i : G → A(G, R) by i(g) = [g]. As before one sees that the universal property of the lemma is satisfied.

¹The use of the term "relations" here has nothing to do with n-ary relations on sets.
Chapter 4
Some Model Theory
In this chapter we first derive the Löwenheim-Skolem Theorem. Next we develop some basic methods related to proving completeness of a given set of axioms: Vaught's Test, back-and-forth, quantifier elimination. Each of these methods, when applicable, achieves a lot more than just establishing completeness.
4.1 Löwenheim-Skolem; Vaught's Test

Below, the cardinality of a structure is defined to be the cardinality of its underlying set. In this section we have the same conventions concerning L, 𝒜, t, ϕ, ψ, θ, σ and Σ as in the beginning of Section 2.6, unless specified otherwise.

Theorem 4.1.1 (Countable Löwenheim-Skolem Theorem). Suppose L is countable and Σ has a model. Then Σ has a countable model.
Proof. Since Var is countable, the hypothesis that L is countable yields that the set of L-sentences is countable. Hence the language

L ∪ {c_σ : Σ ⊢ σ, where σ is an L-sentence of the form ∃xϕ(x)}

is countable; that is, adding witnesses keeps the language countable. The union of countably many countable sets is countable, hence the set L_∞ constructed in the proof of the Completeness Theorem is countable. It follows that there are only countably many variable-free L_∞-terms, hence 𝒜_{Σ_∞} is countable, and thus its reduct 𝒜_{Σ_∞}|_L is a countable model of Σ.
Remark. The above proof is the first time that we used the countability of Var = {v_0, v_1, v_2, ...}. As promised in Section 2.4, we shall now indicate why the Downward Löwenheim-Skolem Theorem goes through without assuming that Var is countable.

Suppose that Var is uncountable. Take a countably infinite subset Var′ ⊆ Var. Then each sentence is equivalent to one whose variables are all from Var′. By replacing each sentence in Σ by an equivalent one all of whose variables are from Var′, we obtain a countable set Σ′ of sentences such that Σ and Σ′ have the same models. As in the proof above, we obtain a countable model of Σ′, working throughout in the setting where only variables from Var′ are used in terms and formulas. This model is a countable model of Σ.
The following test can be useful in showing that a set of axioms Σ is complete.

Proposition 4.1.2 (Vaught's Test). Let L be countable, and suppose Σ has a model, and that all countable models of Σ are isomorphic. Then Σ is complete.

Proof. Suppose Σ is not complete. Then there is σ such that Σ ⊬ σ and Σ ⊬ ¬σ. Hence by the Löwenheim-Skolem Theorem there is a countable model 𝒜 of Σ such that 𝒜 ⊨ ¬σ, and there is a countable model ℬ of Σ such that ℬ ⊨ σ. We have 𝒜 ≅ ℬ, 𝒜 ⊨ ¬σ and ℬ ⊨ σ, contradiction.
Example. Let L = ∅, so the L-structures are just the nonempty sets. Let Σ = {σ_1, σ_2, ...} where

σ_n = ∃x_1 ... ∃x_n ⋀_{1≤i<j≤n} x_i ≠ x_j.

The models of Σ are exactly the infinite sets. All countable models of Σ are countably infinite and hence isomorphic to N. Thus by Vaught's Test Σ is complete.
In this example the hypothesis of Vaught's Test is trivially satisfied. In other cases it may require work to check this hypothesis. One general method in model theory, Back-and-Forth, is often used to verify the hypothesis of Vaught's Test. The proof of the next theorem shows Back-and-Forth in action. To formulate that theorem we define a totally ordered set to be a structure (A; <) for the language L_O that satisfies the following axioms (where x, y, z are distinct variables):

∀x(x ≮ x),   ∀x∀y∀z((x < y ∧ y < z) → x < z),   ∀x∀y(x < y ∨ x = y ∨ y < x).

Let To be the set consisting of these three axioms. A totally ordered set is said to be dense if it satisfies in addition the axiom

∀x∀y(x < y → ∃z(x < z ∧ z < y)),

and it is said to have no endpoints if it satisfies the axiom

∀x∃y∃z(y < x ∧ x < z).

So (Q; <) and (R; <) are dense totally ordered sets without endpoints. Let To(dense without endpoints) be To augmented by these two extra axioms.
Theorem 4.1.3 (Cantor). Any two countable dense totally ordered sets without
endpoints are isomorphic.
Proof. Let (A; <) and (B; <) be countable dense totally ordered sets without endpoints. So A = {a_n : n ∈ N} and B = {b_n : n ∈ N}. We define by recursion a sequence (α_n) in A and a sequence (β_n) in B as follows: put α_0 := a_0 and β_0 := b_0. Let n > 0, and suppose we have distinct α_0, ..., α_{n−1} in A and distinct β_0, ..., β_{n−1} in B such that for all i, j < n we have α_i < α_j ⟺ β_i < β_j. Then we define α_n ∈ A and β_n ∈ B as follows:

Case 1: n is even. (Here we go forth.) First take k ∈ N minimal such that a_k ∉ {α_0, ..., α_{n−1}}; then take l ∈ N minimal such that b_l is situated with respect to β_0, ..., β_{n−1} as a_k is situated with respect to α_0, ..., α_{n−1}, that is, l is minimal such that for i = 0, ..., n − 1 we have: α_i < a_k ⟺ β_i < b_l, and α_i > a_k ⟺ β_i > b_l. (The reader should check that such an l exists: that is where density and "no endpoints" come in.) Put α_n := a_k and β_n := b_l.

Case 2: n is odd. (Here we go back.) First take l ∈ N minimal such that b_l ∉ {β_0, ..., β_{n−1}}; next take k ∈ N minimal such that a_k is situated with respect to α_0, ..., α_{n−1} as b_l is situated with respect to β_0, ..., β_{n−1}, that is, k is minimal such that for i = 0, ..., n − 1 we have: α_i < a_k ⟺ β_i < b_l, and α_i > a_k ⟺ β_i > b_l. Put β_n := b_l and α_n := a_k.

One proves easily by induction on n that then a_n ∈ {α_0, ..., α_{2n}} and b_n ∈ {β_0, ..., β_{2n}}. Thus we have a bijection α_n ↦ β_n : A → B, and this bijection is an isomorphism (A; <) → (B; <).
In combination with Vaught’s Test this gives
Corollary 4.1.4. To(dense without endpoints) is complete.
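The recursion in Cantor's proof is executable for finitely many stages. The sketch below is my own code, not from the notes: `enum_q` and `situated_like` are hypothetical helpers, and the two enumerations of Q (one scaled by 1/3) stand in for the countable dense orders (A; <) and (B; <). Finitely many steps of course only produce a finite partial isomorphism, not the full isomorphism.

```python
from fractions import Fraction

def enum_q(scale):
    """An enumeration of Q; different scales give different enumerations."""
    yield Fraction(0)
    n = 1
    while True:
        for q in range(1, n + 1):
            for p in range(1, n + 1):
                f = Fraction(p, q) * scale
                yield f
                yield -f
        n += 1

def situated_like(x, xs, y, ys):
    """y relates to ys as x relates to xs (the 'situated' condition in the proof)."""
    return all((a < x) == (b < y) and (a > x) == (b > y)
               for a, b in zip(xs, ys))

def back_and_forth(A, B, steps):
    """Run finitely many stages of the construction from the proof."""
    alphas, betas = [], []
    for n in range(steps):
        if n % 2 == 0:   # Case 1, forth: minimal unused a_k, matching b_l
            src, dst, xs, ys = A, B, alphas, betas
        else:            # Case 2, back: minimal unused b_l, matching a_k
            src, dst, xs, ys = B, A, betas, alphas
        x = next(v for v in src if v not in xs)
        y = next(v for v in dst if situated_like(x, xs, v, ys))
        xs.append(x)
        ys.append(y)
    return list(zip(alphas, betas))

A = [x for _, x in zip(range(400), enum_q(Fraction(1)))]
B = [x for _, x in zip(range(400), enum_q(Fraction(1, 3)))]
pairs = back_and_forth(A, B, 10)
# The finite map alpha_n -> beta_n built so far is order-preserving:
print(all((a1 < a2) == (b1 < b2)
          for a1, b1 in pairs for a2, b2 in pairs))  # -> True
```

Density and "no endpoints" are what guarantee the inner `next(...)` always finds a witness; here 400 enumerated rationals suffice for ten stages.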
In the results below we let κ denote an infinite cardinal. Recall that the set of ordinals λ < κ has cardinality κ. We have the following generalization of the Löwenheim-Skolem theorem.

Theorem 4.1.5 (Generalized Löwenheim-Skolem Theorem). Suppose L has size at most κ and Σ has an infinite model. Then Σ has a model of cardinality κ.
Proof. Let {c_λ}_{λ<κ} be a family of κ new constant symbols that are not in L and are pairwise distinct (that is, c_λ ≠ c_µ for λ < µ < κ). Let L′ = L ∪ {c_λ : λ < κ} and let Σ′ = Σ ∪ {c_λ ≠ c_µ : λ < µ < κ}. We claim that Σ′ is consistent. To see this it suffices to show that, given any finite set Λ ⊆ κ, the set of L′-sentences

Σ_Λ := Σ ∪ {c_λ ≠ c_µ : λ, µ ∈ Λ, λ ≠ µ}

has a model. Take an infinite model 𝒜 of Σ. We make an L′-expansion 𝒜_Λ of 𝒜 by interpreting distinct c_λ's with λ ∈ Λ by distinct elements of A, and interpreting the c_λ's with λ ∉ Λ arbitrarily. Then 𝒜_Λ is a model of Σ_Λ, which establishes the claim.

Note that L′ also has size at most κ. The same arguments we used in proving the countable version of the Löwenheim-Skolem Theorem show that then Σ′ has a model ℬ′ = (ℬ, (b_λ)_{λ<κ}) of cardinality at most κ. We have b_λ ≠ b_µ for λ < µ < κ, hence ℬ is a model of Σ of cardinality κ.
The next proposition is Vaught's Test for arbitrary languages and cardinalities.

Proposition 4.1.6. Suppose L has size at most κ, Σ has a model and all models of Σ are infinite. Suppose also that any two models of Σ of cardinality κ are isomorphic. Then Σ is complete.

Proof. Let σ be an L-sentence and suppose that Σ ⊬ σ and Σ ⊬ ¬σ. We will derive a contradiction. First, Σ ⊬ σ means that Σ ∪ {¬σ} has a model. Similarly Σ ⊬ ¬σ means that Σ ∪ {σ} has a model. These models must be infinite since they are models of Σ, so by the Generalized Löwenheim-Skolem Theorem Σ ∪ {¬σ} has a model 𝒜 of cardinality κ, and Σ ∪ {σ} has a model ℬ of cardinality κ. By assumption 𝒜 ≅ ℬ, contradicting that 𝒜 ⊨ ¬σ and ℬ ⊨ σ.
We now discuss in detail one particular application of this generalized Vaught Test. Fix a field F. A vector space over F is an abelian (additively written) group V equipped with a scalar multiplication operation

F × V → V : (λ, v) ↦ λv

such that for all λ, µ ∈ F and all v, w ∈ V,

(i) (λ + µ)v = λv + µv,
(ii) λ(v + w) = λv + λw,
(iii) 1v = v,
(iv) (λµ)v = λ(µv).

Let L_F be the language of vector spaces over F: it extends the language L_Ab = {0, −, +} of abelian groups with unary function symbols f_λ, one for each λ ∈ F; a vector space V over F is viewed as an L_F-structure by interpreting each f_λ as the function v ↦ λv : V → V. One easily specifies a set Σ_F of sentences whose models are exactly the vector spaces over F. Note that Σ_F is not complete since the trivial vector space satisfies ∀x(x = 0) but F viewed as vector space over F does not. Moreover, if F is finite, then we also have nontrivial finite vector spaces. From a model-theoretic perspective finite structures are somewhat exceptional, so we are going to restrict attention to infinite vector spaces over F. Let x_1, x_2, ... be a sequence of distinct variables and put

Σ_F^∞ := Σ_F ∪ {∃x_1 ... ∃x_n ⋀_{1≤i<j≤n} x_i ≠ x_j : n = 2, 3, ...}.

So the models of Σ_F^∞ are exactly the infinite vector spaces over F. Note that if F itself is infinite then each nontrivial vector space over F is infinite.
We will need the following facts about vector spaces V and W over F. (Proofs can be found in various texts.)

Fact.
(a) V has a basis B, that is, B ⊆ V, and for each vector v ∈ V there is a unique family (λ_b)_{b∈B} of scalars (elements of F) such that {b ∈ B : λ_b ≠ 0} is finite and v = Σ_{b∈B} λ_b·b.
(b) Any two bases B and C of V have the same cardinality.
(c) If V has basis B and W has basis C, then any bijection B → C extends uniquely to an isomorphism V → W.
(d) Let B be a basis of V. Then |V| = |B|·|F| provided either |F| or |B| is infinite. If both are finite then |V| = |F|^|B|.
Theorem 4.1.7. Σ_F^∞ is complete.

Proof. Take κ > |F|. In particular L_F has size at most κ. Let V be a vector space over F of cardinality κ. Then a basis of V must also have size κ by property (d) above. Thus any two vector spaces over F of cardinality κ have bases of cardinality κ and thus are isomorphic. It follows by the Generalized Vaught Test that Σ_F^∞ is complete.
Remark. Theorem 4.1.7 and Exercise 3 imply for instance that if F = R then all nontrivial vector spaces over F satisfy exactly the same sentences in L_F.
With the generalized Vaught Test we can also prove that ACF(0) (whose models are the algebraically closed fields of characteristic 0) is complete. The proof is similar, with "transcendence bases" taking over the role of bases. The relevant definitions and facts are as follows.

Let K be a field with subfield k. A subset B of K is said to be algebraically independent over k if for all distinct b_1, ..., b_n ∈ B we have p(b_1, ..., b_n) ≠ 0 for all nonzero polynomials p(x_1, ..., x_n) ∈ k[x_1, ..., x_n], where x_1, ..., x_n are distinct variables. A transcendence basis of K over k is a set B ⊆ K such that B is algebraically independent over k and K is algebraic over its subfield k(B).

Fact.
(a) K has a transcendence basis over k;
(b) any two transcendence bases of K over k have the same size;
(c) if K is algebraically closed with transcendence basis B over k and K′ is also an algebraically closed field extension of k with transcendence basis B′ over k, then any bijection B → B′ extends to an isomorphism K → K′;
(d) if K is uncountable and |K| > |k|, then |K| = |B| for each transcendence basis B of K over k.
Applying this with k = Q and k = F_p for prime numbers p, we obtain that any two algebraically closed fields of the same characteristic and the same uncountable size are isomorphic. Using Vaught's Test for models of size ℵ_1 this yields:

Theorem 4.1.8. The set ACF(0) of axioms for algebraically closed fields of characteristic zero is complete. Likewise, ACF(p) is complete for each prime number p.

If the hypothesis of Vaught's Test (or its generalization) is satisfied, then many things follow of which completeness is only one; it goes beyond the scope of these notes to develop this large chapter of model theory, which goes under the name of "categoricity in power".
Exercises.
(1) Let L = {U} where U is a unary relation symbol. Consider the L-structure (Z; N). Give an informative description of a complete set of L-sentences true in (Z; N). (A description like {σ : (Z; N) ⊨ σ} is not informative. An explicit, possibly infinite, list of axioms is required. Hint: Make an educated guess and try to verify it using Vaught's Test or one of its variants.)
(2) Let Σ_1 and Σ_2 be sets of L-sentences such that no symbol of L occurs in both Σ_1 and Σ_2. Suppose Σ_1 and Σ_2 have infinite models. Then Σ_1 ∪ Σ_2 has a model.
(3) Let L = {S} where S is a unary function symbol. Consider the L-structure (Z; S) where S(a) = a + 1 for a ∈ Z. Give an informative description of a complete set of L-sentences true in (Z; S).
4.2 Elementary Equivalence and Back-and-Forth

In the rest of this chapter we relax notation, and just write ϕ(a_1, ..., a_n) for an L_A-formula ϕ(a_1, ..., a_n), where 𝒜 = (A; ...) is an L-structure, ϕ(x_1, ..., x_n) an L_A-formula, and (a_1, ..., a_n) ∈ A^n.

In this section 𝒜 and ℬ denote L-structures. We say that 𝒜 and ℬ are elementarily equivalent (notation: 𝒜 ≡ ℬ) if they satisfy the same L-sentences. Thus by the previous section (Q; <) ≡ (R; <), and any two infinite vector spaces over a given field F are elementarily equivalent.
A partial isomorphism from 𝒜 to ℬ is a bijection γ : X → Y with X ⊆ A and Y ⊆ B such that

(i) for each m-ary R ∈ L^r and a_1, ..., a_m ∈ X,

(a_1, ..., a_m) ∈ R^𝒜 ⟺ (γa_1, ..., γa_m) ∈ R^ℬ;

(ii) for each n-ary f ∈ L^f and a_1, ..., a_n, a_{n+1} ∈ X,

f^𝒜(a_1, ..., a_n) = a_{n+1} ⟺ f^ℬ(γa_1, ..., γa_n) = γ(a_{n+1}).
Example: suppose 𝒜 = (A; <) and ℬ = (B; <) are totally ordered sets, and a_1, ..., a_N ∈ A and b_1, ..., b_N ∈ B are such that a_1 < a_2 < ... < a_N and b_1 < b_2 < ... < b_N; then the map a_i ↦ b_i : {a_1, ..., a_N} → {b_1, ..., b_N} is a partial isomorphism from 𝒜 to ℬ.
A back-and-forth system from 𝒜 to ℬ is a collection Γ of partial isomorphisms from 𝒜 to ℬ such that

(i) ("Forth") for each γ ∈ Γ and a ∈ A there is a γ′ ∈ Γ such that γ′ extends γ and a ∈ domain(γ′);
(ii) ("Back") for each γ ∈ Γ and b ∈ B there is a γ′ ∈ Γ such that γ′ extends γ and b ∈ codomain(γ′).

If Γ is a back-and-forth system from 𝒜 to ℬ, then Γ^{−1} := {γ^{−1} : γ ∈ Γ} is a back-and-forth system from ℬ to 𝒜. We call 𝒜 and ℬ back-and-forth equivalent (notation: 𝒜 ≡_bf ℬ) if there exists a nonempty back-and-forth system from 𝒜 to ℬ. Hence 𝒜 ≡_bf 𝒜, and if 𝒜 ≡_bf ℬ, then ℬ ≡_bf 𝒜.
Cantor's proof in the previous section generalizes as follows:

Proposition 4.2.1. Suppose 𝒜 and ℬ are countable and 𝒜 ≡_bf ℬ. Then 𝒜 ≅ ℬ.

Proof. Let Γ be a nonempty back-and-forth system from 𝒜 to ℬ. We proceed as in the proof of Cantor's theorem, and construct a sequence (γ_n) of partial isomorphisms from 𝒜 to ℬ such that each γ_{n+1} extends γ_n, A = ⋃_n domain(γ_n) and B = ⋃_n codomain(γ_n). Then the map A → B that extends each γ_n is an isomorphism 𝒜 → ℬ.
In applying this proposition and the next one in a concrete situation, the key is to guess a back-and-forth system. That is where insight and imagination (and experience) come in. In the following result we do not need to assume countability.

Proposition 4.2.2. If 𝒜 ≡_bf ℬ, then 𝒜 ≡ ℬ.

Proof. Suppose Γ is a nonempty back-and-forth system from 𝒜 to ℬ. By induction on the number of logical symbols in unnested formulas ϕ(y_1, ..., y_n) one shows that for each γ ∈ Γ and a_1, ..., a_n ∈ domain(γ) we have

𝒜 ⊨ ϕ(a_1, ..., a_n) ⟺ ℬ ⊨ ϕ(γa_1, ..., γa_n).

For n = 0 and using Exercise 4 of Section 3.3 this yields 𝒜 ≡ ℬ.
Exercises.
(1) Define a finite restriction of a bijection γ : X → Y to be a map γ|E : E → γ(E) with finite E ⊆ X. If Γ is a back-and-forth system from 𝒜 to ℬ, so is the set of finite restrictions of members of Γ.
(2) If 𝒜 ≡_bf ℬ and ℬ ≡_bf 𝒞, then 𝒜 ≡_bf 𝒞.
4.3 Quantifier Elimination

In this section x = (x_1, ..., x_n) is a tuple of distinct variables and y is a single variable distinct from x_1, ..., x_n.

Definition. Σ has quantifier elimination (QE) if every L-formula ϕ(x) is Σ-equivalent to a quantifier-free (short: q-free) L-formula ϕ_qf(x).

By taking n = 0 in this definition we see that if Σ has QE, then every L-sentence is Σ-equivalent to a q-free L-sentence.
Remark. Suppose Σ has QE. Let L′ be a language such that L′ ⊇ L and all symbols of L′ \ L are constant symbols. (This includes the case L′ = L.) Then every set Σ′ of L′-sentences with Σ′ ⊇ Σ has QE. (To see this, check first that each L′-formula ϕ′(x) has the form ϕ(c, x) for some L-formula ϕ(u, x), where u = (u_1, ..., u_m) is a tuple of distinct variables distinct from x_1, ..., x_n, and c = (c_1, ..., c_m) is a tuple of constant symbols in L′ \ L.)
A basic conjunction in L is by definition a conjunction of finitely many atomic and negated atomic L-formulas. Each q-free L-formula ϕ(x) is equivalent to a disjunction ϕ_1(x) ∨ ... ∨ ϕ_k(x) of basic conjunctions ϕ_i(x) in L ("disjunctive normal form").

Lemma 4.3.1. Suppose that for every basic conjunction θ(x, y) in L there is a q-free L-formula θ_qf(x) such that

Σ ⊢ ∃yθ(x, y) ↔ θ_qf(x).

Then Σ has QE.
Proof. Let us say that an L-formula ϕ(x) has Σ-QE if it is Σ-equivalent to a q-free L-formula ϕ_qf(x). Note that if the L-formulas ϕ_1(x) and ϕ_2(x) have Σ-QE, then ¬ϕ_1(x), (ϕ_1 ∨ ϕ_2)(x), and (ϕ_1 ∧ ϕ_2)(x) have Σ-QE.

Next, let ϕ(x) = ∃yψ(x, y), and suppose inductively that the L-formula ψ(x, y) has Σ-QE. Hence ψ(x, y) is Σ-equivalent to a disjunction ⋁_i ψ_i(x, y) of basic conjunctions ψ_i(x, y) in L, with i ranging over some finite index set. In view of the equivalence of ∃y⋁_i ψ_i(x, y) with ⋁_i ∃yψ_i(x, y) we obtain

Σ ⊢ ϕ(x) ↔ ⋁_i ∃yψ_i(x, y).

Each ∃yψ_i(x, y) has Σ-QE, by hypothesis, so ϕ(x) has Σ-QE.

Finally, let ϕ(x) = ∀yψ(x, y), and suppose inductively that the L-formula ψ(x, y) has Σ-QE. This case reduces to the previous case since ϕ(x) is equivalent to ¬∃y¬ψ(x, y).
Remark. In the structure (R; <, 0, 1, +, −, ·) the formula

ϕ(a, b, c) := ∃y(ay² + by + c = 0)

is equivalent to the q-free formula

(b² − 4ac ≥ 0 ∧ a ≠ 0) ∨ (a = 0 ∧ b ≠ 0) ∨ (a = 0 ∧ b = 0 ∧ c = 0).

Note that this equivalence gives an effective test for the existence of a y with a certain property, which avoids in particular having to check an infinite number of values of y (even uncountably many in the case above). This illustrates the kind of property QE is.
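The displayed equivalence can literally be run as code: the quantifier-free right-hand side decides, with no search over y, whether the quadratic has a real root. A minimal sketch (the function name is mine):

```python
def has_real_root(a, b, c):
    """The q-free equivalent of  ∃y(a*y**2 + b*y + c == 0)  over the reals."""
    return (b * b - 4 * a * c >= 0 and a != 0) or \
           (a == 0 and b != 0) or \
           (a == 0 and b == 0 and c == 0)

print(has_real_root(1, 0, -2))  # y**2 = 2 has a root: True
print(has_real_root(1, 0, 1))   # y**2 = -1 has none:  False
print(has_real_root(0, 0, 5))   # 5 = 0 never holds:   False
```

The three disjuncts are exactly the discriminant case, the linear case, and the trivial case of the formula above.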
Lemma 4.3.2. Suppose Σ has QE and ℬ and 𝒞 are models of Σ with a common substructure 𝒜 (we do not assume 𝒜 ⊨ Σ). Then ℬ and 𝒞 satisfy the same L_A-sentences.

Proof. Let σ be an L_A-sentence. We have to show ℬ ⊨ σ ⇔ 𝒞 ⊨ σ. Write σ as ϕ(a) with ϕ(x) an L-formula and a ∈ A^n. Take a q-free L-formula ϕ_qf(x) that is Σ-equivalent to ϕ(x). Then ℬ ⊨ σ iff ℬ ⊨ ϕ_qf(a) iff 𝒜 ⊨ ϕ_qf(a) (by Exercise 7 of Section 2.5) iff 𝒞 ⊨ ϕ_qf(a) (by the same exercise) iff 𝒞 ⊨ σ.
Corollary 4.3.3. Suppose Σ has a model, has QE, and there exists an L-structure that can be embedded into every model of Σ. Then Σ is complete.

Proof. Take an L-structure 𝒜 that can be embedded into every model of Σ. Let ℬ and 𝒞 be any two models of Σ. So 𝒜 is isomorphic to a substructure of ℬ and of 𝒞. Then by a slight rewording of the proof of Lemma 4.3.2 (considering only L-sentences), we see that ℬ and 𝒞 satisfy the same L-sentences. It follows that Σ is complete.

Remark. We have seen that Vaught's Test can be used to prove completeness. The above corollary gives another way of establishing completeness, and is often applicable when the hypothesis of Vaught's Test is not satisfied. Completeness is only one of the nice consequences of QE, and the easiest one to explain at this stage. The main impact of QE is rather that it gives access to the structural properties of definable sets. This will be reflected in exercises at the end of this section. Applications of model theory to other areas of mathematics often involve QE as a key step.
Theorem 4.3.4. To(dense without endpoints) has QE.

Proof. Let (x, y) = (x_1, ..., x_n, y) be a tuple of n + 1 distinct variables, and consider a basic conjunction ϕ(x, y) in L_O. By Lemma 4.3.1 it suffices to show that ∃yϕ(x, y) is equivalent in To(dense without endpoints) to a q-free formula ψ(x). We may assume that each conjunct of ϕ is of one of the following types:

y = x_i,   x_i < y,   y < x_i   (1 ≤ i ≤ n).

To justify this, observe that if we had instead a conjunct y ≠ x_i then we could replace it by (y < x_i) ∨ (x_i < y) and use the fact that ∃y(ϕ_1(x, y) ∨ ϕ_2(x, y)) is equivalent to ∃yϕ_1(x, y) ∨ ∃yϕ_2(x, y). Similarly, a negation ¬(y < x_i) can be replaced by the disjunction y = x_i ∨ x_i < y, and likewise with negations ¬(x_i < y). Also conjuncts in which y does not appear can be eliminated because

⊢ ∃y(ψ(x) ∧ θ(x, y)) ↔ ψ(x) ∧ ∃yθ(x, y).

Suppose that we have a conjunct y = x_i in ϕ(x, y); so ϕ(x, y) is equivalent to y = x_i ∧ ϕ′(x, y), where ϕ′(x, y) is a basic conjunction in L_O. Then ∃yϕ(x, y) is equivalent to ϕ′(x, x_i), and we are done. So we can assume also that ϕ(x, y) has no conjuncts of the form y = x_i.

After all these reductions, and after rearranging conjuncts, we can assume that ϕ(x, y) is a conjunction

⋀_{i∈I} x_i < y ∧ ⋀_{j∈J} y < x_j

where I, J ⊆ {1, ..., n} and where we allow I or J to be empty. Up till this point we did not need the density and "no endpoints" axioms, but these come in now: ∃yϕ(x, y) is equivalent in To(dense without endpoints) to the formula

⋀_{i∈I, j∈J} x_i < x_j.
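The final elimination step of this proof is easy to mechanize. In the sketch below (my own representation, not from the notes) the reduced conjunction ⋀_{i∈I} x_i < y ∧ ⋀_{j∈J} y < x_j is given just by the index sets I and J, and the output is the list of q-free conjuncts x_i < x_j; an empty list stands for the formula "true", which matches the cases where I or J is empty (there density and "no endpoints" make ∃y trivially satisfiable).

```python
def eliminate_exists_y(I, J):
    """Return ∃y(⋀ x_i < y ∧ ⋀ y < x_j) as a list of q-free conjuncts."""
    # One conjunct x_i < x_j per pair (i, j); sorted for a deterministic order.
    return [f"x{i} < x{j}" for i in sorted(I) for j in sorted(J)]

print(eliminate_exists_y({1, 2}, {3}))  # ['x1 < x3', 'x2 < x3']
print(eliminate_exists_y(set(), {3}))   # [] : equivalent to "true"
```

Combined with the disjunctive-normal-form and conjunct-rewriting steps of the proof, this yields a full quantifier elimination procedure for dense linear orders without endpoints.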
We mention without proof two examples of QE, and give a complete proof for a third example in the next section. The following theorem is due to Tarski and (independently) to Chevalley. It dates from around 1950.

Theorem 4.3.5. ACF has QE.

Clearly, ACF is not complete, since it says nothing about the characteristic: it doesn't prove 1 + 1 = 0, nor does it prove 1 + 1 ≠ 0. However, ACF(0), which contains additional axioms forcing the characteristic to be 0, is complete by 4.3.3 and the fact that the ring of integers embeds in every algebraically closed field of characteristic 0. Tarski also established the following more difficult theorem, which is one of the key results in real algebraic geometry. (His original proof is rather long; there is a shorter one due to A. Seidenberg, and an elegant short proof by A. Robinson using a combination of basic algebra and elementary model theory.)

Definition. RCF is a set of axioms true in the ordered field (R; <, 0, 1, −, +, ·) of real numbers. In addition to the ordered field axioms, it has the axiom ∀x(x > 0 → ∃y(x = y²)) (x, y distinct variables), and for each odd n > 1 the axiom

∀x_1 ... ∀x_n ∃y (y^n + x_1·y^{n−1} + ... + x_n = 0)

where x_1, ..., x_n, y are distinct variables.
Theorem 4.3.6. RCF has QE and is complete.
Exercises. In (5) and (6), an L-theory is a set T of L-sentences such that for all L-sentences σ, if T ⊢ σ, then σ ∈ T. An axiomatization of an L-theory T is a set Σ of L-sentences such that T = {σ : σ is an L-sentence and Σ ⊢ σ}.
(1) The subsets of C definable in (C; 0, 1, −, +, ·) are exactly the finite subsets of C and their complements in C. (Hint: use the fact that ACF has QE.)
(2) The subsets of R definable in the model (R; <, 0, 1, −, +, ·) of RCF are exactly the finite unions of intervals of all kinds (including degenerate intervals with just one point). (Hint: use the fact that RCF has QE.)
(3) Let Eq_∞ be a set of axioms in the language {∼} (where ∼ is a binary relation symbol) that say:
(i) ∼ is an equivalence relation;
(ii) every equivalence class is infinite;
(iii) there are infinitely many equivalence classes.
Then Eq_∞ admits QE and is complete. (It is also possible to use Vaught's test to prove completeness.)
(4) Suppose that a set Σ of L-sentences has QE. Let the language L′ extend L by new symbols of arity 0, and let Σ′ ⊇ Σ be a set of L′-sentences. Then Σ′ (as a set of L′-sentences) also has QE.
(5) Suppose the L-theory T has QE. Then T has an axiomatization consisting of sentences ∀x∃yϕ(x, y) and ∀xψ(x) where ϕ(x, y) and ψ(x) are q-free. (Hint: let Σ be the set of L-sentences provable from T that have the indicated form; show that Σ has QE, and is an axiomatization of T.)
(6) Assume the L-theory T has built-in Skolem functions, that is, for each basic conjunction ϕ(x, y) there are L-terms t_1(x), . . . , t_k(x) such that
T ⊢ ∃yϕ(x, y) → ϕ(x, t_1(x)) ∨ · · · ∨ ϕ(x, t_k(x)).
Then T has QE, for every formula ϕ(x, y) there are L-terms t_1(x), . . . , t_k(x) such that T ⊢ ∃yϕ(x, y) → ϕ(x, t_1(x)) ∨ · · · ∨ ϕ(x, t_k(x)), and T has an axiomatization consisting of sentences ∀xψ(x) where ψ(x) is q-free.
4.4 Presburger Arithmetic
In this section we consider in some detail one example of a set of axioms that has QE, namely "Presburger Arithmetic." Essentially, this is a complete set of axioms for ordinary arithmetic of integers without multiplication, that is, the axioms are true in (Z; 0, 1, +, −, <), and prove every sentence true in this structure. There is a mild complication in trying to obtain this completeness via QE: one can show (exercise) that for any q-free formula ϕ(x) in the language {0, 1, +, −, <} there is an N ∈ N such that either (Z; 0, 1, +, −, <) ⊨ ϕ(n) for all n > N, or (Z; 0, 1, +, −, <) ⊨ ¬ϕ(n) for all n > N. In particular, formulas such as ∃y(x = y + y) and ∃y(x = y + y + y) are not Σ-equivalent to any q-free formula in this language, for any set Σ of axioms true in (Z; 0, 1, +, −, <).
To overcome this obstacle to QE we augment the language {0, 1, +, −, <} by new unary relation symbols P_1, P_2, P_3, P_4, . . . to obtain the language L_PrA of Presburger Arithmetic (named after the Polish logician Presburger, who was a student of Tarski). We expand (Z; 0, 1, +, −, <) to the L_PrA-structure
Z̃ = (Z; 0, 1, +, −, <, Z, 2Z, 3Z, 4Z, . . . ),
that is, P_n is interpreted as the set nZ. This structure satisfies the set PrA of Presburger Axioms, which consists of the following sentences:
(i) the axioms of Ab for abelian groups;
(ii) the axioms of To expressing that < is a total order;
(iii) ∀x∀y∀z(x < y → x + z < y + z) (translation invariance of <);
(iv) 0 < 1 ∧ ¬∃y(0 < y < 1) (discreteness axiom);
(v) ∀x∃y ⋁_{0≤r<n} x = ny + r1, n = 1, 2, 3, . . . (division with remainder);
(vi) ∀x (P_n x ↔ ∃y(x = ny)), n = 1, 2, 3, . . . (defining axioms for P_1, P_2, . . . ).
Here we have fixed distinct variables x, y, z for definiteness. In (v) and in the rest of this section r ranges over integers. Note that (v) and (vi) are infinite lists of axioms. Here are some elementary facts about models of PrA:
Proposition 4.4.1. Let 𝒜 = (A; 0, 1, +, −, <, P_1^𝒜, P_2^𝒜, P_3^𝒜, . . . ) ⊨ PrA. Then
(1) There is a unique embedding Z̃ → 𝒜; it sends k ∈ Z to k1 ∈ A.
(2) Given any n > 0 we have P_n^𝒜 = n𝒜, where we regard 𝒜 as an abelian group, and 𝒜/n𝒜 has exactly n elements, namely 0 + n𝒜, . . . , (n − 1)1 + n𝒜.
(3) For any n > 0 and a ∈ A, exactly one of a, a + 1, . . . , a + (n − 1)1 lies in n𝒜.
(4) 𝒜 is torsion-free as an abelian group.
Theorem 4.4.2. PrA has QE.
Proof. Let (x, y) = (x_1, . . . , x_n, y) be a tuple of n + 1 distinct variables, and consider a basic conjunction ϕ(x, y) in L_PrA. By Lemma 4.3.1 it suffices to show that ∃yϕ(x, y) is PrA-equivalent to a q-free formula ψ(x). We may assume that each conjunct of ϕ is of one of the following types, for some integer m ≥ 1 and L_PrA-term t(x):
my = t(x),  my < t(x),  t(x) < my,  P_n(my + t(x)).
To justify this assumption observe that if we had instead a conjunct my ≠ t(x) then we could replace it by (my < t(x)) ∨ (t(x) < my) and use the fact that ∃y(ϕ_1(x, y) ∨ ϕ_2(x, y)) is equivalent to ∃yϕ_1(x, y) ∨ ∃yϕ_2(x, y). Similarly a negation ¬P_n(my + t(x)) can be replaced by the disjunction
P_n(my + t(x) + 1) ∨ . . . ∨ P_n(my + t(x) + (n − 1)1).
Also conjuncts in which y does not appear can be eliminated, because
⊢ ∃y(ψ(x) ∧ θ(x, y)) ←→ ψ(x) ∧ ∃y θ(x, y).
Since PrA ⊢ P_n(z) ↔ P_{rn}(rz) for r > 0, we can replace P_n(my + t(x)) by P_{rn}(rmy + rt(x)). Also, for r > 0 we can replace my = t(x) by rmy = rt(x), and likewise with my < t(x) and t(x) < my. We can therefore assume that all conjuncts have the same "coefficient" m in front of the variable y. After all
these reductions, and after rearranging conjuncts, ϕ(x, y) has the form
⋀_{h∈H} my = t_h(x)  ∧  ⋀_{i∈I} t_i(x) < my  ∧  ⋀_{j∈J} my < t_j(x)  ∧  ⋀_{k∈K} P_{n(k)}(my + t_k(x))
where m > 0 and H, I, J, K are disjoint finite index sets. We allow some of these index sets to be empty, in which case the corresponding conjunction can be left out.
Suppose that H ≠ ∅, say h′ ∈ H. Then the formula ∃yϕ(x, y) is PrA-equivalent to
P_m(t_{h′}(x)) ∧ ⋀_{h∈H} t_h(x) = t_{h′}(x) ∧ ⋀_{i∈I} t_i(x) < t_{h′}(x) ∧ ⋀_{j∈J} t_{h′}(x) < t_j(x) ∧ ⋀_{k∈K} P_{n(k)}(t_{h′}(x) + t_k(x)).
For the rest of the proof we assume that H = ∅.
To understand what follows, it may help to focus on the model Z̃, although the arguments go through for arbitrary models of PrA. Fix any value a ∈ Z^n of x. Consider the system of linear congruences (with "unknown" y)
P_{n(k)}(my + t_k(a)),  (k ∈ K),
which in more familiar notation would be written as
my + t_k(a) ≡ 0 mod n(k),  (k ∈ K).
The solutions in Z of this system form a union of congruence classes modulo N := ∏_{k∈K} n(k), where as usual we put N = 1 for K = ∅. This suggests replacing y successively by Nz, 1 + Nz, . . . , (N − 1)1 + Nz. Our precise claim is that ∃yϕ(x, y) is PrA-equivalent to the formula θ(x) given by
⋁_{r=0}^{N−1} ( ⋀_{k∈K} P_{n(k)}((mr)1 + t_k(x)) ∧ ∃z( ⋀_{i∈I} t_i(x) < m(r1 + Nz) ∧ ⋀_{j∈J} m(r1 + Nz) < t_j(x) ) ).
We prove this equivalence with θ(x) as follows. Suppose
𝒜 = (A, . . .) ⊨ PrA,  a = (a_1, . . . , a_n) ∈ A^n.
We have to show that 𝒜 ⊨ ∃yϕ(a, y) if and only if 𝒜 ⊨ θ(a). So let b ∈ A be such that 𝒜 ⊨ ϕ(a, b). Division with remainder yields a c ∈ A and an r such that b = r1 + Nc and 0 ≤ r ≤ N − 1. Note that then for k ∈ K,
mb + t_k(a) = m(r1 + Nc) + t_k(a) = (mr)1 + (mN)c + t_k(a) ∈ n(k)𝒜,
and since (mN)c ∈ n(k)𝒜 (as n(k) divides N), also (mr)1 + t_k(a) ∈ n(k)𝒜, so 𝒜 ⊨ P_{n(k)}((mr)1 + t_k(a)). Also,
t_i(a) < m(r1 + Nc) for every i ∈ I,
m(r1 + Nc) < t_j(a) for every j ∈ J.
Therefore 𝒜 ⊨ θ(a), with ∃z witnessed by c. For the converse, suppose that the disjunct of θ(a) indexed by a certain r ∈ {0, . . . , N − 1} is true in 𝒜, with ∃z witnessed by c ∈ A. Then put b = r1 + Nc, and we get 𝒜 ⊨ ϕ(a, b). This proves the claimed equivalence.
Now that we have proved the claim we have reduced to the situation (after changing notation) where H = K = ∅ (i.e. no equations and no congruences). So ϕ(x, y) now has the form
⋀_{i∈I} t_i(x) < my  ∧  ⋀_{j∈J} my < t_j(x).
If J = ∅ or I = ∅ then PrA ⊢ ∃yϕ(x, y) ↔ ⊤. This leaves the case where both I and J are nonempty. So suppose 𝒜 ⊨ PrA and that A is the underlying set of 𝒜. For each value a ∈ A^n of x there is i_0 ∈ I such that t_{i_0}(a) is maximal among the t_i(a) with i ∈ I, and a j_0 ∈ J such that t_{j_0}(a) is minimal among the t_j(a) with j ∈ J. Moreover each interval of m successive elements of A contains an element of m𝒜. Therefore ∃yϕ(x, y) is equivalent in 𝒜 to the disjunction over all pairs (i_0, j_0) ∈ I × J of the q-free formula
⋀_{i∈I} t_i(x) ≤ t_{i_0}(x) ∧ ⋀_{j∈J} t_{j_0}(x) ≤ t_j(x) ∧ ⋁_{r=1}^{m} ( P_m(t_{i_0}(x) + r1) ∧ (t_{i_0}(x) + r1 < t_{j_0}(x)) ).
This completes the proof. Note that L_PrA does not contain the relation symbol ≤; we just write t ≤ t′ to abbreviate (t < t′) ∨ (t = t′).
Remark. It now follows from Corollary 4.3.3 that PrA is complete: it has QE and Z̃ can be embedded in every model.
Discussion. The careful reader will have noticed that the elimination procedure in the proof above is constructive: it describes an algorithm that, given any basic conjunction ϕ(x, y) in L_PrA as input, constructs a q-free formula ψ(x) of L_PrA such that PrA ⊢ ∃yϕ(x, y) ↔ ψ(x). In view of the equally constructive proof of Lemma 4.3.1, this yields an algorithm that, given any L_PrA-formula ϕ(x) as input, constructs a q-free L_PrA-formula ϕ^qf(x) such that PrA ⊢ ϕ(x) ↔ ϕ^qf(x). (Thus PrA has effective QE.)
In particular, this last algorithm constructs for any L_PrA-sentence σ a q-free L_PrA-sentence σ^qf such that PrA ⊢ σ ↔ σ^qf. Since we also have an obvious algorithm that, given any q-free L_PrA-sentence σ^qf, checks whether σ^qf is true in Z̃, this yields an algorithm that, given any L_PrA-sentence σ, checks whether σ is true in Z̃. Thus the structure Z̃ is decidable. (A precise definition of decidability will be given in the next chapter.) The algorithms above can easily be implemented by computer programs.
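For instance, the "obvious algorithm" for q-free sentences can be sketched as follows; the nested-tuple encoding of variable-free terms and sentences is a hypothetical choice made for this sketch only:

```python
# Evaluate a variable-free q-free L_PrA-sentence in the structure
# Z~ = (Z; 0, 1, +, -, <, Z, 2Z, 3Z, ...).

def eval_term(t):
    if isinstance(t, int):                     # a constant term k1
        return t
    op = t[0]
    if op == '+':
        return eval_term(t[1]) + eval_term(t[2])
    if op == '-':                              # unary minus
        return -eval_term(t[1])
    raise ValueError(op)

def eval_sentence(phi):
    op = phi[0]
    if op == '<':
        return eval_term(phi[1]) < eval_term(phi[2])
    if op == '=':
        return eval_term(phi[1]) == eval_term(phi[2])
    if op == 'P':                              # P_n(t): the value of t lies in nZ
        return eval_term(phi[2]) % phi[1] == 0
    if op == 'not':
        return not eval_sentence(phi[1])
    if op == 'and':
        return eval_sentence(phi[1]) and eval_sentence(phi[2])
    if op == 'or':
        return eval_sentence(phi[1]) or eval_sentence(phi[2])
    raise ValueError(op)
```

Combined with an effective QE procedure producing σ^qf from σ, running such an evaluator on σ^qf decides any L_PrA-sentence σ in Z̃.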
Let some L-structure 𝒜 be given, and suppose we have an algorithm for deciding whether any given L-sentence is true in 𝒜. Even if this algorithm can be implemented by a computer program, this does not guarantee that the program is of practical use, or feasible: on some moderately small inputs it might have to run for 10^100 years before producing an output. This bad behaviour is not at all unusual: no (classical, sequential) algorithm for deciding the truth of L_PrA-sentences in Z̃ is feasible in a precise technical sense. Results of this kind belong to complexity theory; this is an area where mathematics (logic, number theory, . . . ) and computer science interact.
There do exist feasible integer linear programming algorithms that decide the truth in Z̃ of sentences of a special form, and this shows another (very practical) side of complexity theory.
A positive impact of QE is that it yields structural properties of definable sets, as we discuss next for Z̃.
Definition. Let d be a positive integer. An arithmetic progression of modulus d is a set of the form
{k ∈ Z : k ≡ r mod d, α < k < β},
where r ∈ {0, . . . , d − 1}, α, β ∈ Z ∪ {−∞, +∞}, and α < β.
We leave the proof of the next lemma to the reader.
Lemma 4.4.3. Arithmetic progressions have the following properties.
(1) If P, Q ⊆ Z are arithmetic progressions of moduli d and e respectively, then P ∩ Q is an arithmetic progression of modulus lcm(d, e).
(2) If P ⊆ Z is an arithmetic progression, then Z ∖ P is a finite union of arithmetic progressions.
(3) Let T be the collection of all finite unions of arithmetic progressions. Then T contains with any two sets X, Y also X ∪ Y , X ∩ Y , and X ∖ Y .
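Part (1) of the lemma can be checked mechanically in the unbounded case (α = −∞, β = +∞), where a progression is a single congruence class. The helper below is a sketch using brute-force search, adequate only for small moduli:

```python
from math import gcd

def intersect(r, d, s, e):
    """Intersect {k : k = r mod d} with {k : k = s mod e}.
    Returns (t, m) with the intersection equal to {k : k = t mod m},
    where m = lcm(d, e), or None if the two classes are disjoint."""
    g = gcd(d, e)
    m = d * e // g                       # lcm(d, e)
    if (r - s) % g != 0:
        return None                      # CRT condition fails: no common element
    for t in range(m):                   # brute force; fine for small d, e
        if t % d == r % d and t % e == s % e:
            return (t, m)
```

For example, the integers that are ≡ 1 mod 4 and ≡ 3 mod 6 are exactly those ≡ 9 mod 12, a progression of modulus lcm(4, 6) = 12, as the lemma asserts.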
Corollary 4.4.4. Let S ⊆ Z. Then
S is definable in Z̃ ⇐⇒ S is a finite union of arithmetic progressions.
Proof. (⇐) It suffices to show that each arithmetic progression is definable in Z̃; this is straightforward and left to the reader. (⇒) By QE and Lemma 4.4.3 it suffices to show that each atomic L_PrA-formula ϕ(x) defines in Z̃ a finite union of arithmetic progressions. Every atomic formula ϕ(x) different from ⊤ and ⊥ has the form t_1(x) < t_2(x), or the form t_1(x) = t_2(x), or the form P_d(t(x)), where t_1(x), t_2(x) and t(x) are L_PrA-terms. The first two kinds reduce to t(x) > 0 and t(x) = 0 respectively (by subtraction). It follows that we may assume that ϕ(x) has the form kx + l1 > 0, or the form kx + l1 = 0, or the form P_d(kx + l1), where k, l ∈ Z. Considering cases (k = 0; k ≠ 0 and k ≡ 0 mod d; and so on), we see that such a ϕ(x) defines an arithmetic progression.
Exercises.
(1) Prove that 2Z cannot be defined in the structure (Z; 0, 1, +, −, <) by a q-free formula of the language {0, 1, +, −, <}.
4.5 Skolemization and Extension by Definition
In this section L is a sublanguage of L′, Σ a set of L-sentences, and Σ′ a set of L′-sentences with Σ ⊆ Σ′.
Definition. Σ′ is said to be conservative over Σ (or a conservative extension of Σ) if for every L-sentence σ,
Σ′ ⊢_{L′} σ ⇐⇒ Σ ⊢_L σ.
Here (⇒) is the significant direction; (⇐) is automatic.
Remarks
(1) Suppose Σ′ is conservative over Σ. Then Σ is consistent if and only if Σ′ is consistent.
(2) If each model of Σ has an L′-expansion to a model of Σ′, then Σ′ is conservative over Σ. (This follows easily from the Completeness Theorem.)
Proposition 4.5.1. Let ϕ(x_1, . . . , x_n, y) be an L-formula. Let f_ϕ be an n-ary function symbol not in L, and put L′ := L ∪ {f_ϕ} and
Σ′ := Σ ∪ {∀x_1 . . . ∀x_n(∃yϕ(x_1, . . . , x_n, y) → ϕ(x_1, . . . , x_n, f_ϕ(x_1, . . . , x_n)))}.
Then Σ′ is conservative over Σ.
Proof. Let 𝒜 be any model of Σ. By remark (2) it suffices to obtain an L′-expansion 𝒜′ of 𝒜 that makes the new axiom about f_ϕ true. We choose a function f_ϕ^𝒜 : A^n → A as follows. For any (a_1, . . . , a_n) ∈ A^n, if there is a b ∈ A such that 𝒜 ⊨ ϕ(a_1, . . . , a_n, b), then we let f_ϕ^𝒜(a_1, . . . , a_n) be such an element b; if no such b exists, we let f_ϕ^𝒜(a_1, . . . , a_n) be an arbitrary element of A. Thus
𝒜′ ⊨ ∀x_1 . . . ∀x_n(∃yϕ(x_1, . . . , x_n, y) → ϕ(x_1, . . . , x_n, f_ϕ(x_1, . . . , x_n))),
as desired.
Remark. A function f_ϕ^𝒜 as in the proof is called a Skolem function in 𝒜 for the formula ϕ(x_1, . . . , x_n, y). It yields a "witness" for each relevant n-tuple.
Definition. Given an L-formula ϕ(x_1, . . . , x_m), let R_ϕ be an m-ary relation symbol not in L, and put L_ϕ := L ∪ {R_ϕ} and Σ_ϕ := Σ ∪ {ρ_ϕ}, where ρ_ϕ is
∀x_1 . . . ∀x_m(ϕ(x_1, . . . , x_m) ↔ R_ϕ(x_1, . . . , x_m)).
The sentence ρ_ϕ is called the defining axiom for R_ϕ. We call Σ_ϕ an extension of Σ by a definition for the relation symbol R_ϕ.
Remark. Each model 𝒜 of Σ expands uniquely to a model of Σ_ϕ. We denote this expansion by 𝒜_ϕ. Every model of Σ_ϕ is of the form 𝒜_ϕ for a unique model 𝒜 of Σ.
Proposition 4.5.2. Let ϕ = ϕ(x_1, . . . , x_m) be as above. Then we have:
(1) Σ_ϕ is conservative over Σ.
(2) For each L_ϕ-formula ψ(y), where y = (y_1, . . . , y_n), there is an L-formula ψ*(y), called a translation of ψ(y), such that Σ_ϕ ⊢ ψ(y) ↔ ψ*(y).
(3) Suppose 𝒜 ⊨ Σ and S ⊆ A^m. Then S is 0-definable in 𝒜 if and only if S is 0-definable in 𝒜_ϕ, and the same with definable instead of 0-definable.
Proof. (1) is clear from the remark preceding the proposition, and (3) is immediate from (2). To prove (2) we observe that by the Equivalence Theorem (3.3.2) it suffices to prove it for formulas ψ(y) = R_ϕ t_1(y) . . . t_m(y), where the t_i are L-terms. In this case we can take
∃u_1 . . . ∃u_m(u_1 = t_1(y) ∧ . . . ∧ u_m = t_m(y) ∧ ϕ(u_1/x_1, . . . , u_m/x_m))
as ψ*(y), where the variables u_1, . . . , u_m do not appear in ϕ and are not among y_1, . . . , y_n.
Definition. Suppose ϕ(x, y) is an L-formula, where (x, y) = (x_1, . . . , x_m, y) is a tuple of m + 1 distinct variables, such that Σ ⊢ ∀x_1 . . . ∀x_m ∃!yϕ(x, y), where ∃!yϕ(x, y) abbreviates ∃y(ϕ(x, y) ∧ ∀z(ϕ(x, z) → y = z)), with z a variable not occurring in ϕ and not among x_1, . . . , x_m, y. Let f_ϕ be an m-ary function symbol not in L, and put L′ := L ∪ {f_ϕ} and Σ′ := Σ ∪ {γ_ϕ}, where γ_ϕ is
∀x_1 . . . ∀x_m ϕ(x, f_ϕ(x)).
The sentence γ_ϕ is called the defining axiom for f_ϕ. We call Σ′ an extension of Σ by a definition of the function symbol f_ϕ.
Remark. Each model 𝒜 of Σ expands uniquely to a model of Σ′. We denote this expansion by 𝒜′. Every model of Σ′ is of the form 𝒜′ for a unique model 𝒜 of Σ. Proposition 4.5.2 goes through when L_ϕ, Σ_ϕ, and 𝒜_ϕ are replaced by L′, Σ′, and 𝒜′, respectively. We leave the proof of this as an exercise. (Hint: reduce the analogue of (2) to the case of an unnested formula.)
Before defining the next concept we consider a simple case. Let (A; <) be a totally ordered set, A ≠ ∅. By a definition of (A; <) in a structure 𝓑 we mean an injective map δ : A → B^k, with k ∈ N, such that
(i) δ(A) ⊆ B^k is definable in 𝓑.
(ii) The set {(δ(a), δ(b)) : a < b in A} ⊆ (B^k)^2 = B^{2k} is definable in 𝓑.
For example, we have a definition δ : Z → N^2 of the ordered set (Z; <) of integers in the additive monoid (N; 0, +) of natural numbers, given by δ(n) = (n, 0) and δ(−n) = (0, n). (We leave it to the reader to check this.)
It can be shown with tools a little beyond the scope of these notes that no infinite totally ordered set can be defined in the field of complex numbers.
In order to extend this notion to arbitrary structures 𝒜 = (A; . . . ) we use the following notation and terminology. Let X, Y be sets, f : X → Y a map, and S ⊆ X^n. Then the f-image of S is the subset
f(S) := {(f(x_1), . . . , f(x_n)) : (x_1, . . . , x_n) ∈ S}
of Y^n. Also, given k ∈ N, we use the bijection
((y_11, . . . , y_1k), . . . , (y_n1, . . . , y_nk)) ↦ (y_11, . . . , y_1k, . . . , y_n1, . . . , y_nk)
from (Y^k)^n to Y^{nk} to identify these two sets.
Definition. A definition of an L-structure 𝒜 in a structure 𝓑 is an injective map δ : A → B^k, with k ∈ N, such that
(i) δ(A) ⊆ B^k is definable in 𝓑.
(ii) For each m-ary R ∈ L^r the set δ(R^𝒜) ⊆ (B^k)^m = B^{mk} is definable in 𝓑.
(iii) For each n-ary f ∈ L^f the set δ(graph of f^𝒜) ⊆ (B^k)^{n+1} = B^{(n+1)k} is definable in 𝓑.
Remark. Here 𝓑 is a structure for a language L* that may have nothing to do with the language L. Replacing everywhere "definable" by "0-definable", we get the notion of a 0-definition of 𝒜 in 𝓑.
A more general way of viewing a structure 𝒜 as in some sense living inside a structure 𝓑 is to allow δ to be an injective map from A into B^k/E for some equivalence relation E on B^k that is definable in 𝓑, and imposing suitable conditions. Our special case corresponds to E = equality on B^k. (We do not develop this idea further here: the right setting for it would be many-sorted structures, rather than our one-sorted structures.)
Recall that by Lagrange's "four squares" theorem we have
N = {a^2 + b^2 + c^2 + d^2 : a, b, c, d ∈ Z}.
It follows that the inclusion map N → Z is a 0-definition of (N; 0, +, ·, <) in (Z; 0, 1, +, −, ·). The bijection
a + bi ↦ (a, b) : C → R^2  (a, b ∈ R)
is a 0-definition of the field (C; 0, 1, +, −, ·) of complex numbers in the field (R; 0, 1, +, −, ·) of real numbers.
On the other hand, there is no definition of the field of real numbers in the field of complex numbers: this follows from the fact, stated earlier without proof, that there is no definition of any infinite totally ordered set in the field of complex numbers. (A special case says that R, considered as a subset of C, is not definable in the field of complex numbers; this follows easily from the fact that ACF admits QE, see exercise 1.) Indeed, one can show that the only fields definable in the field of complex numbers are finite fields and fields isomorphic to the field of complex numbers itself.
Proposition 4.5.3. Let δ : A → B^k be a 0-definition of the L-structure 𝒜 in the L*-structure 𝓑. Let x_1, . . . , x_n be distinct variables (viewed as ranging over A), and let x_11, . . . , x_1k, . . . , x_n1, . . . , x_nk be nk distinct variables (viewed as ranging over B). Then we have a map that assigns to each L-formula ϕ(x_1, . . . , x_n) an L*-formula δϕ(x_11, . . . , x_1k, . . . , x_n1, . . . , x_nk) such that
δ(ϕ^𝒜) = (δϕ)^𝓑 ⊆ B^{nk}.
In particular, for n = 0 the map above assigns to each L-sentence σ an L*-sentence δσ such that 𝒜 ⊨ σ ⇐⇒ 𝓑 ⊨ δσ.
Chapter 5
Computability, Decidability,
and Incompleteness
In this chapter we prove Gödel's famous Incompleteness Theorem. Consider the structure N := (N; 0, S, +, ·, <), where S : N → N is the successor function. A simple form of the incompleteness theorem is as follows.
Let Σ be a computable set of sentences in the language of N and true in N. Then there exists a sentence σ in that language such that N ⊨ σ, but Σ ⊬ σ.
In other words, no computable set of axioms in the language of N and true in N can be complete, hence the name Incompleteness Theorem. The only unexplained terminology here is "computable." Intuitively, "Σ is computable" means that there is an algorithm to recognize whether any given sentence in the language of N belongs to Σ. (It seems reasonable to require this of an axiom system for N.) Thus we begin this chapter by developing the notion of computability. The interest of this notion is tied to the Church–Turing Thesis as explained in Section 5.2, and goes far beyond incompleteness. For example, computability plays a role in combinatorial group theory (Higman's Theorem) and in certain diophantine questions (Hilbert's 10th problem), not to mention its role in the ideological underpinnings of computer science.
5.1 Computable Functions
First some notation. We let µx(..x..) denote the least x ∈ N for which ..x.. holds. Here ..x.. is some condition on natural numbers x. For example, µx(x^2 > 7) = 3. We will only use this notation when the meaning of ..x.. is clear and the set {x ∈ N : ..x..} is nonempty. For a ∈ N we also let µx_{<a}(..x..) be the least x < a in N such that ..x.. holds if there is such an x, and if there is no such x we put µx_{<a}(..x..) := a. For example, µx_{<4}(x^2 > 3) = 2 and µx_{<2}(x > 5) = 2.
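In Python these two search operators can be transcribed directly (a sketch; `cond` stands for the condition ..x..):

```python
from itertools import count

def mu(cond):
    """mu x (..x..): least x in N for which cond(x) holds.
    Only terminates when such an x exists."""
    for x in count():
        if cond(x):
            return x

def mu_below(a, cond):
    """mu x_{<a}(..x..): least x < a with cond(x), and a if there is none."""
    for x in range(a):
        if cond(x):
            return x
    return a
```

The examples above become `mu(lambda x: x * x > 7)`, `mu_below(4, lambda x: x * x > 3)` and `mu_below(2, lambda x: x > 5)`, returning 3, 2 and 2 respectively.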
Definition. For R ⊆ N^n, we define χ_R : N^n → N by
χ_R(a) = 1 if a ∈ R, and χ_R(a) = 0 if a ∉ R.
Think of such R as an n-ary relation on N. We call χ_R the characteristic function of R, and often write R(a_1, . . . , a_n) instead of (a_1, . . . , a_n) ∈ R.
Example. χ_<(m, n) = 1 iff m < n, and χ_<(m, n) = 0 iff m ≥ n.
Definition. For i = 1, . . . , n we define I^n_i : N^n → N by I^n_i(a_1, . . . , a_n) = a_i. These functions are called coordinate functions.
Definition. The computable functions (or recursive functions) are the functions from N^n to N (for n = 0, 1, 2, . . .) obtained by inductively applying the following rules:
(R1) + : N^2 → N, · : N^2 → N, χ_≤ : N^2 → N, and the coordinate functions I^n_i (for each n and i = 1, . . . , n) are computable.
(R2) If G : N^m → N is computable and H_1, . . . , H_m : N^n → N are computable, then so is the function F = G(H_1, . . . , H_m) : N^n → N defined by
F(a) = G(H_1(a), . . . , H_m(a)).
(R3) If G : N^{n+1} → N is computable, and for all a ∈ N^n there exists x ∈ N such that G(a, x) = 0, then the function F : N^n → N given by
F(a) = µx(G(a, x) = 0)
is computable.
A relation R ⊆ N^n is said to be computable (or recursive) if its characteristic function χ_R : N^n → N is computable.
Example. If F : N^3 → N and G : N^2 → N are computable, then so is the function H : N^4 → N defined by H(x_1, x_2, x_3, x_4) = F(G(x_1, x_4), x_2, x_4). This follows from (R2) by noting that H(x) = F(G(I^4_1(x), I^4_4(x)), I^4_2(x), I^4_4(x)), where x = (x_1, x_2, x_3, x_4). We shall use this device from now on in many proofs, but only tacitly. (The reader should of course notice when we do so.)
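The three generating rules can themselves be written as small Python combinators; the helper names below (proj, compose, mu_op) are ours, not the text's, and the sketch is only meant to make the inductive definition concrete:

```python
from itertools import count

def chi_le(m, n):                  # the basic function chi_<= of (R1)
    return 1 if m <= n else 0

def proj(n, i):                    # coordinate function I^n_i (1-based i)
    return lambda *a: a[i - 1]

def compose(G, *Hs):               # rule (R2): F = G(H_1, ..., H_m)
    return lambda *a: G(*(H(*a) for H in Hs))

def mu_op(G):                      # rule (R3): F(a) = mu x (G(a, x) = 0)
    def F(*a):
        for x in count():
            if G(*a, x) == 0:
                return x
    return F

# The example above: H(x1, x2, x3, x4) = F(G(x1, x4), x2, x4).
def make_H(F, G):
    return compose(F, compose(G, proj(4, 1), proj(4, 4)), proj(4, 2), proj(4, 4))
```

For instance, with F(a, b, c) = a + b + c and G(a, b) = a·b, make_H gives H(1, 2, 3, 4) = F(G(1, 4), 2, 4) = 10, exactly the composition pattern used tacitly in the proofs below.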
From (R1), (R2) and (R3) we derive further rules for obtaining computable
functions. This is mostly an exercise in programming.
Lemma 5.1.1. Let H_1, . . . , H_m : N^n → N and R ⊆ N^m be computable. Then R(H_1, . . . , H_m) ⊆ N^n is computable, where for a ∈ N^n we put
R(H_1, . . . , H_m)(a) ⇐⇒ R(H_1(a), . . . , H_m(a)).
Proof. Observe that χ_{R(H_1,...,H_m)} = χ_R(H_1, . . . , H_m). Now apply (R2).
Lemma 5.1.2. The functions χ_≥ and χ_= on N^2 are computable, as is the constant function c^n_0 : N^n → N where c^n_0(a) = 0 for all n and a ∈ N^n.
Proof. The function χ_≥ is computable because
χ_≥(m, n) = χ_≤(n, m) = χ_≤(I^2_2(m, n), I^2_1(m, n)),
which enables us to apply (R1) and (R2). Similarly, χ_= is computable:
χ_=(m, n) = χ_≤(m, n) · χ_≥(m, n).
To handle c^n_0 we observe that
c^n_0(a) = µx(I^{n+1}_{n+1}(a, x) = 0).
For k ∈ N we define the constant function c^n_k : N^n → N by c^n_k(a) = k.
Lemma 5.1.3. Every constant function c^n_k is computable.
Proof. This is true for k = 0 (and all n) by Lemma 5.1.2. Next, observe that
c^n_{k+1}(a) = µx(c^n_k(a) < x) = µx(χ_≥(c^{n+1}_k(a, x), I^{n+1}_{n+1}(a, x)) = 0)
for a ∈ N^n, so the general result follows by induction on k.
Let P, Q be n-ary relations on N. Then we can form the n-ary relations
¬P := N^n ∖ P,  P ∨ Q := P ∪ Q,  P ∧ Q := P ∩ Q,
P → Q := (¬P) ∨ Q,  P ↔ Q := (P → Q) ∧ (Q → P)
on N.
Lemma 5.1.4. Suppose P, Q are computable. Then ¬P, P ∨ Q, P ∧ Q, P → Q and P ↔ Q are also computable.
Proof. Let a ∈ N^n. Then ¬P(a) iff χ_P(a) = 0 iff χ_P(a) = c^n_0(a), so χ_{¬P}(a) = χ_=(χ_P(a), c^n_0(a)). Hence ¬P is computable by (R2) and Lemma 5.1.2. Next, the relation P ∧ Q is computable since χ_{P∧Q} = χ_P · χ_Q. By De Morgan's Law, P ∨ Q = ¬(¬P ∧ ¬Q). Thus P ∨ Q is computable. The rest is clear.
Lemma 5.1.5. The binary relations <, ≤, =, >, ≥, ≠ on N are computable.
Proof. The relations ≥, ≤ and = have already been taken care of by Lemma 5.1.2 and (R1). The remaining relations are complements of these three, so by Lemma 5.1.4 they are also computable.
Lemma 5.1.6. (Definition by Cases) Let R_1, . . . , R_k ⊆ N^n be computable such that for each a ∈ N^n exactly one of R_1(a), . . . , R_k(a) holds, and suppose that G_1, . . . , G_k : N^n → N are computable. Then G : N^n → N given by
G(a) = G_i(a) whenever R_i(a) holds (i = 1, . . . , k)
is computable.
Proof. This follows from G = G_1 · χ_{R_1} + · · · + G_k · χ_{R_k}.
Lemma 5.1.7. (Definition by Cases) Let R_1, . . . , R_k ⊆ N^n be computable such that for each a ∈ N^n exactly one of R_1(a), . . . , R_k(a) holds. Let P_1, . . . , P_k ⊆ N^n be computable. Then the relation P ⊆ N^n defined by
P(a) ⇐⇒ P_i(a) whenever R_i(a) holds (i = 1, . . . , k)
is computable.
Proof. Use that P = (P_1 ∧ R_1) ∨ · · · ∨ (P_k ∧ R_k).
Lemma 5.1.8. Let R ⊆ N^{n+1} be computable such that for all a ∈ N^n there exists x ∈ N with (a, x) ∈ R. Then the function F : N^n → N given by
F(a) = µxR(a, x)
is computable.
Proof. Note that F(a) = µx(χ_{¬R}(a, x) = 0) and apply (R3).
Here is a nice consequence of 5.1.5 and 5.1.8.
Lemma 5.1.9. Let F : N^n → N. Then F is computable if and only if its graph (a subset of N^{n+1}) is computable.
Proof. Let R ⊆ N^{n+1} be the graph of F. Then for all a ∈ N^n and b ∈ N,
R(a, b) ⇐⇒ F(a) = b,  and  F(a) = µxR(a, x),
from which the lemma follows immediately.
Lemma 5.1.10. If R ⊆ N^{n+1} is computable, then the function F_R : N^{n+1} → N defined by F_R(a, y) = µx_{<y}R(a, x) is computable.
Proof. Use that F_R(a, y) = µx(R(a, x) or x = y).
Some notation: below we use the bold symbol ∃ as shorthand for "there exists a natural number"; likewise, we use the bold symbol ∀ to abbreviate "for all natural numbers." These abbreviation symbols should not be confused with the logical symbols ∃ and ∀.
Lemma 5.1.11. Suppose R ⊆ N^{n+1} is computable. Let P, Q ⊆ N^{n+1} be the relations defined by
P(a, y) ⇐⇒ ∃x_{<y} R(a, x),
Q(a, y) ⇐⇒ ∀x_{<y} R(a, x),
for (a, y) = (a_1, . . . , a_n, y) ∈ N^{n+1}. Then P and Q are computable.
Proof. Using the notation and results from Lemma 5.1.10, we note that P(a, y) iff F_R(a, y) < y. Hence χ_P(a, y) = χ_<(F_R(a, y), y). For Q, note that Q(a, y) iff not ∃x_{<y} ¬R(a, x).
Lemma 5.1.12. The function ∸ : N^2 → N defined by
a ∸ b = a − b if a ≥ b, and a ∸ b = 0 if a < b,
is computable.
Proof. Use that a ∸ b = µx(b + x = a or a < b).
The results above imply easily that many familiar functions are computable. But is the exponential function n ↦ 2^n computable? It certainly is in the intuitive sense: we know how to compute (in principle) its value at any given argument. It is not that obvious from what we have proved so far that it is computable in our precise sense. We now develop some coding tricks due to Gödel that enable us to prove routinely that functions like 2^x are computable according to our definition of "computable function".
Definition. Define the function Pair : N^2 → N by
Pair(x, y) := (x + y)(x + y + 1)/2 + x.
We call Pair the pairing function.
Lemma 5.1.13. The function Pair is bijective and computable.
Proof. Exercise.
Definition. Since Pair is a bijection we can define functions Left, Right : N → N by
Pair(x, y) = a ⇐⇒ Left(a) = x and Right(a) = y.
The reader should check that Left(a), Right(a) ≤ a for a ∈ N, and Left(a) < a if 0 < a ∈ N.
Lemma 5.1.14. The functions Left and Right are computable.
Proof. Use 5.1.9 in combination with
Left(a) = µx(∃y_{<a+1} Pair(x, y) = a)  and  Right(a) = µy(∃x_{<a+1} Pair(x, y) = a).
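A direct transcription, with Left and Right computed by the bounded search from the proof of Lemma 5.1.14 (a sketch; for large inputs one would invert Pair arithmetically instead):

```python
def pair(x, y):
    # Pair(x, y) = (x + y)(x + y + 1)/2 + x
    return (x + y) * (x + y + 1) // 2 + x

def left(a):
    # mu x ( there is y < a+1 with Pair(x, y) = a )
    for x in range(a + 1):
        if any(pair(x, y) == a for y in range(a + 1)):
            return x

def right(a):
    # mu y ( there is x < a+1 with Pair(x, y) = a )
    for y in range(a + 1):
        if any(pair(x, y) == a for x in range(a + 1)):
            return y
```

One can check bijectivity on an initial segment: the pairs (x, y) with x + y < 10 are sent exactly onto {0, 1, . . . , 54}, one diagonal x + y = s at a time.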
For a, b, c ∈ Z we have (by definition): a ≡ b mod c ⇐⇒ a − b ∈ cZ.
Lemma 5.1.15. The ternary relation a ≡ b mod c on N is computable.
Proof. Use that for a, b, c ∈ N we have
a ≡ b mod c ⇐⇒ (∃x_{<a+1} a = x·c + b  or  ∃x_{<b+1} b = x·c + a).
We can now introduce Gödel's function β : N^2 → N.
Definition. For a, i ∈ N we let β(a, i) be the remainder of Left(a) upon division by 1 + (i + 1)·Right(a), that is,
β(a, i) := µx(x ≡ Left(a) mod 1 + (i + 1)·Right(a)).
Proposition 5.1.16. The function β is computable, and β(a, i) ≤ a ∸ 1 for all a, i ∈ N. For any sequence (a_0, . . . , a_{n−1}) of natural numbers there exists a ∈ N such that β(a, i) = a_i for i < n.
Proof. The computability of β is clear from earlier results. We have β(a, i) ≤ Left(a) ≤ a ∸ 1.
Let a_0, . . . , a_{n−1} be natural numbers. Take an N ∈ N such that a_i ≤ N for all i < n and N is a multiple of every prime number less than n. We claim that then 1 + N, 1 + 2N, . . . , 1 + nN are relatively prime. To see this, suppose p is a prime number such that p | 1 + iN and p | 1 + jN (1 ≤ i < j ≤ n); then p divides their difference (j − i)N, but p does not divide N, hence p | j − i. Since j − i < n this makes p a prime less than n, so p | N by the choice of N, a contradiction.
By the Chinese Remainder Theorem there exists an M such that
M ≡ a_0 mod 1 + N,
M ≡ a_1 mod 1 + 2N,
. . .
M ≡ a_{n−1} mod 1 + nN.
Put a := Pair(M, N); then Left(a) = M and Right(a) = N, and thus β(a, i) = a_i as required.
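Gödel's β-function is easy to transcribe. The sketch below inverts Pair arithmetically (via the triangular-number formula, an implementation shortcut rather than the bounded search of Lemma 5.1.14), and then illustrates the proposition by finding an encoding of a small sequence with the unbounded µ-search, which terminates by the proposition itself:

```python
from itertools import count
from math import isqrt

def left_right(a):
    s = (isqrt(8 * a + 1) - 1) // 2      # s = x + y, recovered from a
    x = a - s * (s + 1) // 2
    return x, s - x                      # (Left(a), Right(a))

def beta(a, i):
    x, y = left_right(a)
    return x % (1 + (i + 1) * y)         # remainder of Left(a) mod 1+(i+1)Right(a)

def encode(seq):
    """Least a with beta(a, i) = seq[i] for all i (exists by Prop. 5.1.16)."""
    for a in count():
        if all(beta(a, i) == v for i, v in enumerate(seq)):
            return a
```

For example a = Pair(10, 3) = 101 encodes the sequence (2, 3), since 10 mod 4 = 2 and 10 mod 7 = 3; `encode` finds the least such code rather than the CRT witness of the proof, which is usually much larger.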
Remark. Proposition 5.1.16 shows that we can use β to encode a sequence of numbers a_0, . . . , a_{n−1} in terms of a single number a. We use this as follows to show that the function n ↦ 2^n is computable.
If a_0, . . . , a_n are natural numbers such that a_0 = 1 and a_{i+1} = 2a_i for all i < n, then necessarily a_n = 2^n. Hence by Proposition 5.1.16 we have β(a, n) = 2^n where
a := µx(β(x, 0) = 1 and ∀i_{<n} β(x, i + 1) = 2β(x, i)),
that is,
2^n = β(µx(β(x, 0) = 1 and ∀i_{<n} β(x, i + 1) = 2β(x, i)), n).
It follows that n ↦ 2^n is computable.
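The recipe of this remark runs as written, given an implementation of β (we reuse the arithmetic inversion of Pair from the previous sketch; the outer search is the unbounded µ-operator, so this is only practical for very small n):

```python
from itertools import count
from math import isqrt

def beta(a, i):
    s = (isqrt(8 * a + 1) - 1) // 2          # invert Pair: s = x + y
    x = a - s * (s + 1) // 2                 # x = Left(a), s - x = Right(a)
    return x % (1 + (i + 1) * (s - x))

def pow2(n):
    # a := mu x ( beta(x,0) = 1 and for all i < n: beta(x,i+1) = 2*beta(x,i) )
    for a in count():
        if beta(a, 0) == 1 and all(beta(a, i + 1) == 2 * beta(a, i) for i in range(n)):
            return beta(a, n)
```

The point is not efficiency but definability: pow2 uses only β and µ, the ingredients already shown to be computable.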
The above suggests a general method, which we develop next. To each sequence (a_1, . . . , a_n) of natural numbers we assign a sequence number, denoted ⟨a_1, . . . , a_n⟩, and defined to be the least natural number a such that β(a, 0) = n (the length of the sequence) and β(a, i) = a_i for i = 1, . . . , n. For n = 0 this gives ⟨⟩ = 0, where ⟨⟩ is the sequence number of the empty sequence. We define the length function lh : N → N by lh(a) = β(a, 0), so lh is computable. Observe that lh(⟨a_1, . . . , a_n⟩) = n.
Put (a)_i := β(a, i + 1). The function (a, i) ↦ (a)_i : N^2 → N is computable, and (⟨a_1, . . . , a_n⟩)_i = a_{i+1} for i < n. Finally, let Seq ⊆ N denote the set of sequence numbers. The set Seq is computable since
a ∈ Seq ⇐⇒ ∀x_{<a} (lh(x) ≠ lh(a) or ∃i_{<lh(a)} (x)_i ≠ (a)_i).
Lemma 5.1.17. For any n, the function (a_1, . . . , a_n) ↦ ⟨a_1, . . . , a_n⟩ : N^n → N is computable, and a_i < ⟨a_1, . . . , a_n⟩ for (a_1, . . . , a_n) ∈ N^n and i = 1, . . . , n.
Proof. Use ⟨a_1, . . . , a_n⟩ = µa(β(a, 0) = n, β(a, 1) = a_1, . . . , β(a, n) = a_n), and apply Lemmas 5.1.8, 5.1.4 and 5.1.16.
Lemma 5.1.18.
(1) The function In : N^2 → N defined by

  In(a, i) = µx(lh(x) = i and ∀j<i (x)_j = (a)_j)

is computable and In(⟨a_1, . . . , a_n⟩, i) = ⟨a_1, . . . , a_i⟩ for all a_1, . . . , a_n ∈ N
and i ≤ n.
(2) The function ∗ : N^2 → N defined by

  a ∗ b = µx(lh(x) = lh(a) + lh(b) and ∀i<lh(a) (x)_i = (a)_i
                                    and ∀j<lh(b) (x)_{lh(a)+j} = (b)_j)

is computable and ⟨a_1, . . . , a_m⟩ ∗ ⟨b_1, . . . , b_n⟩ = ⟨a_1, . . . , a_m, b_1, . . . , b_n⟩ for
all a_1, . . . , a_m, b_1, . . . , b_n ∈ N.
Definition. For F : N^{n+1} → N, let F̄ : N^{n+1} → N be given by

  F̄(a, b) = ⟨F(a, 0), . . . , F(a, b − 1)⟩   (a ∈ N^n).

Note that F̄(a, 0) = ⟨⟩ = 0.
Lemma 5.1.19. Let F : N^{n+1} → N. Then F is computable if and only if F̄ is
computable.

Proof. Suppose F is computable. Then F̄ is computable since

  F̄(a, b) = µx(lh(x) = b and ∀i<b (x)_i = F(a, i)).

In the other direction, suppose F̄ is computable. Then F is computable since
F(a, b) = (F̄(a, b + 1))_b.
84 CHAPTER 5. COMPUTABILITY
Given G : N^{n+2} → N there is a unique function F : N^{n+1} → N such that

  F(a, b) = G(a, b, F̄(a, b))   (a ∈ N^n).

This will be clear if we express the requirement on F as follows:

  F(a, 0) = G(a, 0, 0),   F(a, b + 1) = G(a, b + 1, ⟨F(a, 0), . . . , F(a, b)⟩).

The next result is important because it allows us to introduce a computable
function by recursion on its values at smaller arguments.

Proposition 5.1.20. Let G and F be as above and suppose G is computable.
Then F is computable.

Proof. Note that

  F̄(a, b) = µx(Seq(x) and lh(x) = b and ∀i<b (x)_i = G(a, i, In(x, i)))

for all a ∈ N^n and b ∈ N. It follows that F̄ is computable, and thus by the
previous lemma F is computable.
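This "course-of-values" scheme can be simulated directly. In the Python sketch below (ours, not the notes'), a plain tuple of the earlier values stands in for the coded history F̄(a, b), and the parameter tuple a is collapsed to a single argument for simplicity.

```python
def course_of_values(G):
    """Return the unique F with F(a, b) = G(a, b, (F(a,0), ..., F(a,b-1))),
    a Python tuple of earlier values standing in for the code F̄(a, b)."""
    def F(a, b):
        hist = ()
        for i in range(b + 1):
            hist += (G(a, i, hist),)   # each value may look back at all earlier ones
        return hist[-1]
    return F

# Example: Fibonacci numbers, whose recursion reaches back two steps.
fib = course_of_values(lambda a, b, h: b if b < 2 else h[-1] + h[-2])
```

Here fib(0, 10) evaluates to 55, the tenth Fibonacci number, illustrating that recursion on all smaller values reduces to a bounded loop over the history.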
Definition. Let A : N^n → N and B : N^{n+2} → N be given. Let a range over
N^n, and define the function F : N^{n+1} → N by

  F(a, 0) = A(a),
  F(a, b + 1) = B(a, b, F(a, b)).

We say that F is obtained from A and B by primitive recursion.
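In executable form the scheme is a bounded loop. The sketch below is ours (the parameter tuple a is passed as Python varargs); it returns the F determined by A and B.

```python
def primitive_recursion(A, B):
    """F(a, 0) = A(a);  F(a, b + 1) = B(a, b, F(a, b))."""
    def F(*args):
        *a, b = args
        val = A(*a)                 # base case F(a, 0)
        for i in range(b):
            val = B(*a, i, val)     # step F(a, i + 1) = B(a, i, F(a, i))
        return val
    return F

# Example: addition, by primitive recursion on the second argument.
add = primitive_recursion(lambda a: a, lambda a, i, val: val + 1)
```

For instance add(3, 4) unfolds the loop four times, giving 7; multiplication arises the same way with A(a) = 0 and B(a, i, val) = val + a.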
Proposition 5.1.21. Suppose A, B, and F are as above, and A and B are
computable. Then F is computable.

Proof. Define G : N^{n+2} → N by

  G(a, b, c) = A(a)                    if c = 0,
               B(a, b ∸ 1, (c)_{b ∸ 1})  if c > 0.

Clearly, G is computable. We claim that

  F(a, b) = G(a, b, F̄(a, b)).

This claim yields the computability of F, by Proposition 5.1.20. We have
F(a, 0) = A(a) = G(a, 0, 0) = G(a, 0, F̄(a, 0)) and

  F(a, b + 1) = B(a, b, F(a, b)) = B(a, b, (F̄(a, b + 1))_b) = G(a, b + 1, F̄(a, b + 1)).

The claim follows.
Proposition 5.1.20 will be applied over and over again in the later section on
Gödel numbering, but in combination with definitions by cases. As a simple
example of such an application, let G : N → N and H : N^2 → N be computable.
There is clearly a unique function F : N^2 → N such that for all a, b ∈ N

  F(a, b) = F(a, G(b))  if G(b) < b,
            H(a, b)     otherwise.

In particular F(a, 0) = H(a, 0). We claim that F is computable.
According to Proposition 5.1.20 this claim will follow if we can specify a
computable function K : N^3 → N such that F(a, b) = K(a, b, F̄(a, b)) for all
a, b ∈ N. Such a function K is given by

  K(a, b, c) = (c)_{G(b)}  if G(b) < b,
               H(a, b)     otherwise.
Exercises.
(1) The set of prime numbers is computable.
(2) The Fibonacci numbers are the natural numbers F_n defined recursively by F_0 = 0,
F_1 = 1, and F_{n+2} = F_{n+1} + F_n. The function n → F_n : N → N is computable.
(3) If f_1, . . . , f_n : N^m → N are computable and X ⊆ N^n is computable, then
f^{−1}(X) ⊆ N^m is computable, where f := (f_1, . . . , f_n) : N^m → N^n.
(4) If f : N → N is computable and surjective, then there is a computable function
g : N → N such that f ◦ g = id_N.
(5) If f : N → N is computable and strictly increasing, then f(N) ⊆ N is computable.
(6) All computable functions and relations are definable in N.
(7) Let F : N^n → N, and define

  ⟨F⟩ : N → N,  ⟨F⟩(a) := F((a)_0, . . . , (a)_{n−1}),

so F(a_1, . . . , a_n) = ⟨F⟩(⟨a_1, . . . , a_n⟩) for all a_1, . . . , a_n ∈ N. Then F is
computable iff ⟨F⟩ is computable. (Hence n-variable computability reduces to
1-variable computability.)
Let T be a collection of functions F : N^m → N for various m. We say that T is
closed under composition if for all G : N^m → N in T and all H_1, . . . , H_m : N^n → N
in T, the function F = G(H_1, . . . , H_m) : N^n → N is in T. We say that T is
closed under minimalization if for every G : N^{n+1} → N in T such that for all
a ∈ N^n there exists x ∈ N with G(a, x) = 0, the function F : N^n → N given by
F(a) = µx(G(a, x) = 0) is in T. We say that a relation R ⊆ N^n is in T if its
characteristic function χ_R is in T.
(8) Suppose T contains the functions mentioned in (R1), and is closed under
composition and minimalization. All lemmas and propositions of this section go
through with computable replaced by in T.
5.2 The Church-Turing Thesis
The computable functions as defined in the last section are also computable
in the informal sense that for each such function F : N^n → N there is an
algorithm that on any input a ∈ N^n stops after a finite number of steps and
produces an output F(a). An algorithm is given by a finite list of instructions,
a computer program, say. These instructions should be deterministic (leave
nothing to chance or choice). We deliberately neglect physical constraints of
space and time: imagine that the program that implements the algorithm has
unlimited access to time and space to do its work on any given input.
Let us write "calculable" for this intuitive, informal, idealized notion of
computable. The Church-Turing Thesis asserts:

  each calculable function F : N → N is computable.

The corresponding assertion for functions N^n → N follows, because the result
of Exercise 7 in Section 5.1 is clearly also valid for "calculable" instead
of "computable." Call a set P ⊆ N calculable if its characteristic function is
calculable.
While the Church-Turing Thesis is not a precise mathematical statement, it
is an important guiding principle, and has never failed in practice: any function
that any competent person has ever recognized as being calculable has turned
out to be computable, and the informal grounds for calculability have always
translated routinely into an actual proof of computability. Here is a heuristic
(informal) argument that might make the Thesis plausible.
Let an algorithm be given for computing F : N → N. We can assume
that on any input a ∈ N this algorithm consists of a finite sequence of steps,
numbered from 0 to n, say, where at each step i it produces a natural number
a_i, with a_0 = a as starting number. It stops after step n with a_n = F(a).
We assume that for each i < n the number a_{i+1} is calculated by some fixed
procedure from the earlier numbers a_0, . . . , a_i, that is, we have a calculable
function G : N → N such that a_{i+1} = G(⟨a_0, . . . , a_i⟩) for all i < n. The
algorithm should also tell us when to stop, that is, we should have a calculable
P ⊆ N such that ¬P(⟨a_0, . . . , a_i⟩) for i < n and P(⟨a_0, . . . , a_n⟩). Since G and
P describe only single steps in the algorithm for F, it is reasonable to assume
that they at least are computable. Once this is agreed to, one can show easily
that F is computable as well; see the exercise below.
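The simulation implicit in this argument is easy to write down. In the Python sketch below (ours), a tuple of the numbers produced so far stands in for the sequence number ⟨a_0, . . . , a_i⟩; G maps that history to the next number, and P is the stopping predicate.

```python
def run(G, P, a):
    """Iterate a_{i+1} = G(<a_0, ..., a_i>) until P holds of the history,
    then return the last number produced, i.e. F(a)."""
    history = (a,)
    while not P(history):
        history += (G(history),)
    return history[-1]

# Toy example: keep doubling the last number until it exceeds 100.
result = run(lambda h: 2 * h[-1], lambda h: h[-1] > 100, 3)
```

Starting from 3 this produces 3, 6, 12, 24, 48, 96, 192 and stops, returning 192. The point of the heuristic argument is that when G and P are computable, so is the function computed by this loop.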
A skeptical reader may find this argument dubious, but Turing gave in 1936
a compelling informal analysis of what functions F : N → N are calculable
in principle, and this has led to general acceptance of the Thesis. In addition,
various alternative formalizations of the informal notion of calculable function
have been proposed, using various kinds of machines, formal systems, and so
on. They have all turned out to be equivalent in the sense of defining the same
class of functions on N, namely the computable functions.
The above is only a rather narrow version of the Church-Turing Thesis, but
it suffices for our purpose. There are various refinements and more ambitious
versions. Also, our Church-Turing Thesis does not characterize mathematically
the intuitive notion of algorithm, only the notion of a function computable by an
algorithm that produces for each input from N an output in N.
Exercises.
(1) Let G : N → N and P ⊆ N be given. Then there is for each a ∈ N at most one
finite sequence a_0, . . . , a_n of natural numbers such that a_0 = a, for all i < n we
have a_{i+1} = G(⟨a_0, . . . , a_i⟩) and ¬P(⟨a_0, . . . , a_i⟩), and P(⟨a_0, . . . , a_n⟩). Suppose
that for each a ∈ N there is such a finite sequence a_0, . . . , a_n, and put F(a) := a_n,
thus defining a function F : N → N. If G and P are computable, so is F.
5.3 Primitive Recursive Functions
This section is not really needed in the rest of this chapter, but it may throw light
on some issues relating to computability. One such issue is the condition, in Rule
(R3) for generating computable functions, that for all a ∈ N^n there exists y ∈ N
such that G(a, y) = 0. This condition is not constructive: it could be satisfied
for a certain G without us ever knowing it. We shall now argue informally that
it is impossible to generate in a fully constructive way exactly the computable
functions. Such a constructive generation process would presumably enable
us to enumerate effectively a sequence of algorithms α_0, α_1, α_2, . . . such that
each α_n computes a (computable) function f_n : N → N, and such that every
computable function f : N → N occurs in the sequence f_0, f_1, f_2, . . . , possibly
more than once. Now consider the function f_diag : N → N defined by

  f_diag(n) = f_n(n) + 1.

Then f_diag is clearly computable in the intuitive sense, but f_diag ≠ f_n for all n,
in violation of the Church-Turing Thesis.
This way of producing a new function f_diag from a sequence (f_n) is called
diagonalization.¹ The same basic idea applies in other cases, and is used in a
more sophisticated form in the proof of Gödel's incompleteness theorem.
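In executable terms (a sketch of ours, with Python functions standing in for the enumerated algorithms): given any enumeration n ↦ f_n, the diagonal function differs from every f_n at the argument n.

```python
def diagonalize(enum):
    """Given enum : n -> f_n (each f_n a function N -> N),
    return f_diag with f_diag(n) = f_n(n) + 1."""
    return lambda n: enum(n)(n) + 1

# Toy enumeration: f_n(x) = n * x.  f_diag disagrees with every f_n at n.
f_diag = diagonalize(lambda n: (lambda x, n=n: n * x))
```

Here f_diag(n) = n² + 1 ≠ n² = f_n(n) for every n, so f_diag is absent from the enumeration even though it is clearly computable from it.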
Here is a class of computable functions that can be generated constructively.
The primitive recursive functions are the functions f : N^n → N obtained
inductively as follows:
(PR1) The nullary function N^0 → N with value 0, the unary successor function
S, and all coordinate functions I^n_i are primitive recursive.
(PR2) If G : N^m → N is primitive recursive and H_1, . . . , H_m : N^n → N are
primitive recursive, then G(H_1, . . . , H_m) is primitive recursive.
(PR3) If F : N^{n+1} → N is obtained by primitive recursion from primitive
recursive functions G : N^n → N and H : N^{n+2} → N, then F is primitive
recursive.
¹ Perhaps it would be better to call it anti-diagonalization.
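The inductive clauses (PR1)-(PR3) can be read as a small programming language. The Python sketch below (ours) builds addition and multiplication from the basic functions using only composition and primitive recursion.

```python
def succ(x):            # (PR1): the successor function S
    return x + 1

def proj(n, i):         # (PR1): the coordinate function I^n_i (1-indexed)
    return lambda *a: a[i - 1]

def compose(G, *Hs):    # (PR2): G(H_1, ..., H_m)
    return lambda *a: G(*(H(*a) for H in Hs))

def primrec(G, H):      # (PR3): F(a, 0) = G(a); F(a, b+1) = H(a, b, F(a, b))
    def F(*args):
        *a, b = args
        val = G(*a)
        for i in range(b):
            val = H(*a, i, val)
        return val
    return F

# add(x, 0) = x;        add(x, y + 1) = S(add(x, y))
add = primrec(proj(1, 1), compose(succ, proj(3, 3)))
# mul(x, 0) = 0;        mul(x, y + 1) = add(x, mul(x, y))
mul = primrec(lambda x: 0, compose(add, proj(3, 1), proj(3, 3)))
```

The constant function x ↦ 0 used for mul is itself primitive recursive by (PR1) and (PR2); writing it as a lambda is just a shortcut of this sketch.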
A relation R ⊆ N^n is said to be primitive recursive if its characteristic function
χ_R is primitive recursive. As the next two lemmas show, the computable
functions that one ordinarily meets with are primitive recursive. In the rest of this
section x ranges over N^m with m depending on the context, and y over N.

Lemma 5.3.1. The following functions and relations are primitive recursive:
(i) each constant function c^n_m;
(ii) the binary operations +, ·, and (x, y) → x^y on N;
(iii) the predecessor function Pd : N → N given by Pd(x) = x ∸ 1, the unary
relation {x ∈ N : x > 0}, the function ∸ : N^2 → N;
(iv) the binary relations ≥, ≤ and = on N.
Proof. The function c^0_m is obtained from c^0_0 by applying (PR2) m times with
G = S. Next, c^n_m is obtained by applying (PR2) with G = c^0_m (with k = 0 and
t = n). The functions in (ii) are obtained by the usual primitive recursions. It is
also easy to write down primitive recursions for the functions in (iii), in the order
they are listed. For (iv), note that χ_≥(x, y + 1) = χ_{>0}(x) · χ_≥(Pd(x), y).
Lemma 5.3.2. With the possible exceptions of Lemmas 5.1.8 and 5.1.9, all
lemmas and propositions in Section 5.1 go through with "computable" replaced
by "primitive recursive."

Proof. To obtain the primitive recursive version of Lemma 5.1.10, note that

  F_R(a, 0) = 0,
  F_R(a, y + 1) = F_R(a, y)·χ_R(a, F_R(a, y)) + (y + 1)·χ_{¬R}(a, F_R(a, y)).

A consequence of the primitive recursive version of Lemma 5.1.10 is the following
"restricted minimalization scheme" for primitive recursive functions:

  if R ⊆ N^{n+1} and H : N^n → N are primitive recursive, and for all a ∈ N^n
  there exists x < H(a) such that R(a, x), then the function F : N^n → N given
  by F(a) = µx R(a, x) is primitive recursive.

The primitive recursive versions of Lemmas 5.1.11–5.1.15 and Proposition 5.1.16
now follow easily. In particular, the function β is primitive recursive. Also, the
proof of Proposition 5.1.16 yields:

  There is a primitive recursive function B : N → N such that, whenever
  n < N, a_0 < N, . . . , a_{n−1} < N (n, a_0, . . . , a_{n−1}, N ∈ N),
  then for some a < B(N) we have β(a, i) = a_i for i = 0, . . . , n − 1.

Using this fact and restricted minimalization, it follows that the unary relation
Seq, the unary function lh, and the binary functions (a, i) → (a)_i, In and ∗ are
primitive recursive.
Let a function F : N^{n+1} → N be given. Then F̄ : N^{n+1} → N satisfies the
primitive recursion F̄(a, 0) = 0 and F̄(a, b + 1) = F̄(a, b) ∗ ⟨F(a, b)⟩. It follows
that if F is primitive recursive, so is F̄. The converse is obvious. Suppose also
that G : N^{n+2} → N is primitive recursive, and F(a, b) = G(a, b, F̄(a, b)) for all
(a, b) ∈ N^{n+1}; then F̄ satisfies the primitive recursion

  F̄(a, 0) = 0,   F̄(a, b + 1) = F̄(a, b) ∗ ⟨G(a, b, F̄(a, b))⟩,

so F̄ (and hence F) is primitive recursive.
The Ackermann Function. By diagonalization we can produce a computable
function that is not primitive recursive, but the so-called Ackermann function
does more, and plays a role in several contexts. First we define inductively a
sequence A_0, A_1, A_2, . . . of primitive recursive functions A_n : N → N:

  A_0(y) = y + 1,
  A_{n+1}(0) = A_n(1),
  A_{n+1}(y + 1) = A_n(A_{n+1}(y)).

Thus A_0 = S and A_{n+1} ◦ A_0 = A_n ◦ A_{n+1}. One verifies easily that A_1(y) = y + 2
and A_2(y) = 2y + 3 for all y. We define the Ackermann function A : N^2 → N
by A(n, y) := A_n(y).
Lemma 5.3.3. The function A is computable, and strictly increasing in each
variable. Also, for all n and x, y:
(i) A_n(x + y) ≥ A_n(x) + y;
(ii) n ≥ 1 ⟹ A_{n+1}(y) > A_n(y) + y;
(iii) A_{n+1}(y) ≥ A_n(y + 1);
(iv) 2A_n(y) < A_{n+2}(y);
(v) x < y ⟹ A_n(x + y) ≤ A_{n+2}(y).
Proof. We leave it as an exercise at the end of this section to show that A is
computable. Assume inductively that A_0, . . . , A_n are strictly increasing and
A_0(y) < A_1(y) < · · · < A_n(y) for all y. Then

  A_{n+1}(y + 1) = A_n(A_{n+1}(y)) ≥ A_0(A_{n+1}(y)) > A_{n+1}(y),

so A_{n+1} is strictly increasing. Next we show that A_{n+1}(y) > A_n(y) for all y:
A_{n+1}(0) = A_n(1), so A_{n+1}(0) > A_n(0) and A_{n+1}(0) > 1, so A_{n+1}(y) > y + 1
for all y. Hence A_{n+1}(y + 1) = A_n(A_{n+1}(y)) > A_n(y + 1).
Inequality (i) follows easily by induction on n, and a second induction on y.
For inequality (ii), we proceed again by induction on (n, y): Using A_1(y) =
y + 2 and A_2(y) = 2y + 3, we obtain A_2(y) > A_1(y) + y. Let n > 1, and assume
inductively that A_n(y) > A_{n−1}(y) + y. Then A_{n+1}(0) = A_n(1) > A_n(0) + 0,
and

  A_{n+1}(y + 1) = A_n(A_{n+1}(y)) ≥ A_n(y + 1 + A_n(y))
                ≥ A_n(y + 1) + A_n(y) > A_n(y + 1) + y + 1.

In (iii) we proceed by induction on y. We have equality for y = 0. Assuming
inductively that (iii) holds for a certain y we obtain

  A_{n+1}(y + 1) = A_n(A_{n+1}(y)) ≥ A_n(A_n(y + 1)) ≥ A_n(y + 2).

Note that (iv) holds for n = 0. For n > 0 we have by (i), (ii) and (iii):

  A_n(y) + A_n(y) ≤ A_n(y + A_n(y)) < A_n(A_{n+1}(y)) = A_{n+1}(y + 1) ≤ A_{n+2}(y).

Note that (v) holds for n = 0. Assume (v) holds for a certain n. Let x < y + 1.
We can assume inductively that if x < y, then A_{n+1}(x + y) ≤ A_{n+3}(y), and we
want to show that

  A_{n+1}(x + y + 1) ≤ A_{n+3}(y + 1).

Case 1. x = y. Then

  A_{n+1}(x + y + 1) = A_{n+1}(2x + 1) = A_n(A_{n+1}(2x))
  ≤ A_{n+2}(2x) < A_{n+2}(A_{n+3}(x)) = A_{n+3}(y + 1).

Case 2. x < y. Then

  A_{n+1}(x + y + 1) = A_n(A_{n+1}(x + y)) ≤ A_{n+2}(A_{n+3}(y)) = A_{n+3}(y + 1).
Below we put |x| := x_1 + · · · + x_m for x = (x_1, . . . , x_m) ∈ N^m.
Proposition 5.3.4. Given any primitive recursive function F : N^m → N there
is an n = n(F) such that F(x) ≤ A_n(|x|) for all x ∈ N^m.

Proof. Call an n = n(F) with the property above a bound for F. The nullary
constant function with value 0, the successor function S, and each coordinate
function I^m_i (1 ≤ i ≤ m), has bound 0. Next, assume F = G(H_1, . . . , H_k) where
G : N^k → N and H_1, . . . , H_k : N^m → N are primitive recursive, and assume
inductively that n(G) and n(H_1), . . . , n(H_k) are bounds for G and H_1, . . . , H_k.
By part (iv) of the previous lemma we can take N ∈ N such that n(G) ≤ N,
and Σ_i H_i(x) ≤ A_{N+1}(|x|) for all x. Then

  F(x) = G(H_1(x), . . . , H_k(x)) ≤ A_N(Σ_i H_i(x)) ≤ A_N(A_{N+1}(|x|)) ≤ A_{N+2}(|x|).

Finally, assume that F : N^{m+1} → N is obtained by primitive recursion from
the primitive recursive functions G : N^m → N and H : N^{m+2} → N, and assume
inductively that n(G) and n(H) are bounds for G and H. Take N ∈ N such
that n(G) ≤ N + 3 and n(H) ≤ N. We claim that N + 3 is a bound for F:
F(x, 0) = G(x) ≤ A_{N+3}(|x|), and by part (v) of the lemma above,

  F(x, y + 1) = H(x, y, F(x, y)) ≤ A_N(|x| + y + A_{N+3}(|x| + y))
             ≤ A_{N+2}(A_{N+3}(|x| + y)) = A_{N+3}(|x| + y + 1).
Consider the function A* : N → N defined by A*(n) = A(n, n). Then A* is
computable, and for any primitive recursive function F : N → N we have
F(y) < A*(y) for all y > n(F), where n(F) is a bound for F. In particular, A*
is not primitive recursive. Hence A is computable but not primitive recursive.
The recursion in "primitive recursion" involves only one variable; the other
variables just act as parameters. The Ackermann function is defined by a
recursion involving both variables:

  A(0, y) = y + 1,  A(x + 1, 0) = A(x, 1),  A(x + 1, y + 1) = A(x, A(x + 1, y)).

This kind of double recursion is therefore more powerful in some ways than what
can be done in terms of primitive recursion and composition.
Exercises.
(1) The graph of the Ackermann function is primitive recursive. (It follows that the
Ackermann function is recursive, and it gives an example showing that Lemma 5.1.9
fails with “primitive recursive” in place of “computable”.)
5.4 Representability
Let L be a numerical language, that is, L contains the constant symbol 0 and
the unary function symbol S. We let S^n0 denote the term S . . . S0 in which S
appears exactly n times. So S^00 is the term 0, S^10 is the term S0, and so on.
Our key example of a numerical language is

  L(N) := {0, S, +, ·, <}  (the language of N).

Here N is the following set of nine axioms, where we fix two distinct variables
x and y for the sake of definiteness:

  N1  ∀x (Sx ≠ 0)
  N2  ∀x∀y (Sx = Sy → x = y)
  N3  ∀x (x + 0 = x)
  N4  ∀x∀y (x + Sy = S(x + y))
  N5  ∀x (x · 0 = 0)
  N6  ∀x∀y (x · Sy = x · y + x)
  N7  ∀x ¬(x < 0)
  N8  ∀x∀y (x < Sy ↔ x < y ∨ x = y)
  N9  ∀x∀y (x < y ∨ x = y ∨ y < x)

These axioms are clearly true in N. The fact that N is finite will play a role
later. It is a very weak set of axioms, but strong enough to prove numerical
facts like

  SS0 + SSS0 = SSSSS0,  ∀x (x < SS0 → (x = 0 ∨ x = S0)).
Lemma 5.4.1. For each n,

  N ⊢ x < S^{n+1}0 ↔ (x = 0 ∨ · · · ∨ x = S^n0).

Proof. By induction on n. For n = 0, N ⊢ x < S0 ↔ x = 0 by axioms N8 and
N7. Assume n > 0 and N ⊢ x < S^n0 ↔ (x = 0 ∨ · · · ∨ x = S^{n−1}0). Use axiom
N8 to conclude that N ⊢ x < S^{n+1}0 ↔ (x = 0 ∨ · · · ∨ x = S^n0).
To give an impression how weak N is we consider some of its models:

Some models of N.
(1) We usually refer to N as the standard model of N.
(2) Another model of N is N[x] := (N[x]; . . . ), where 0, S, +, · are interpreted
as the zero polynomial, as the unary operation of adding 1 to a polynomial,
and as addition and multiplication of polynomials in N[x], and where < is
interpreted as follows: f(x) < g(x) iff f(n) < g(n) for all large enough n.
(3) A more bizarre model of N: (R^{≥0}; . . . ) with the usual interpretations of
0, S, +, ·, in particular S(r) := r + 1, and with < interpreted as the binary
relation <_N on R^{≥0}: r <_N s ⟺ (r, s ∈ N and r < s) or s ∉ N.

Example (2) shows that N ⊬ ∀x∃y (x = 2y ∨ x = 2y + S0), since in N[x] the
element x is not in 2N[x] ∪ (2N[x] + 1); in other words, N cannot prove "every
element is even or odd." In example (3) the binary relation <_N on R^{≥0} is not
even a total order. One useful fact about models of N is that they all contain
the so-called standard model N in a unique way:
Lemma 5.4.2. Suppose 𝒜 ⊨ N. Then there is a unique homomorphism
ι : N → 𝒜. This homomorphism ι is an embedding, and for all a ∈ A and n,
(i) if a <^𝒜 ι(n), then a = ι(m) for some m < n;
(ii) if a ∉ ι(N), then ι(n) <^𝒜 a.

As to the proof, note that for any homomorphism ι : N → 𝒜 and all n we
must have ι(n) = (S^n0)^𝒜. Hence there is at most one such homomorphism. It
remains to show that the map n → (S^n0)^𝒜 : N → A is an embedding ι : N → 𝒜
with properties (i) and (ii). We leave this as an exercise to the reader.
Definition. Let L be a numerical language, and Σ a set of L-sentences. A
relation R ⊆ N^m is said to be Σ-representable if there is an L-formula ϕ(x_1, . . . , x_m)
such that for all (a_1, . . . , a_m) ∈ N^m we have
(i) R(a_1, . . . , a_m) ⟹ Σ ⊢ ϕ(S^{a_1}0, . . . , S^{a_m}0)
(ii) ¬R(a_1, . . . , a_m) ⟹ Σ ⊢ ¬ϕ(S^{a_1}0, . . . , S^{a_m}0)
Such a ϕ(x_1, . . . , x_m) is said to represent R in Σ or to Σ-represent R. Note that if
ϕ(x_1, . . . , x_m) Σ-represents R and Σ is consistent, then for all (a_1, . . . , a_m) ∈ N^m

  R(a_1, . . . , a_m) ⟺ Σ ⊢ ϕ(S^{a_1}0, . . . , S^{a_m}0),
  ¬R(a_1, . . . , a_m) ⟺ Σ ⊢ ¬ϕ(S^{a_1}0, . . . , S^{a_m}0).
A function F : N^m → N is Σ-representable if there is a formula ϕ(x_1, . . . , x_m, y)
of L such that for all (a_1, . . . , a_m) ∈ N^m we have

  Σ ⊢ ϕ(S^{a_1}0, . . . , S^{a_m}0, y) ↔ y = S^{F(a_1,...,a_m)}0.
Such a ϕ(x_1, . . . , x_m, y) is said to represent F in Σ or to Σ-represent F.
An L-term t(x_1, . . . , x_m) is said to represent the function F : N^m → N in Σ if
Σ ⊢ t(S^{a_1}0, . . . , S^{a_m}0) = S^{F(a)}0 for all a = (a_1, . . . , a_m) ∈ N^m. Note that then
the function F is Σ-represented by the formula t(x_1, . . . , x_m) = y.
Proposition 5.4.3. Let L be a numerical language, Σ a set of L-sentences such
that Σ ⊢ S0 ≠ 0, and R ⊆ N^m a relation. Then

  R is Σ-representable ⟺ χ_R is Σ-representable.
Proof. (⇐) Assume χ_R is Σ-representable and let ϕ(x_1, . . . , x_m, y) be an
L-formula Σ-representing it. We show that ψ(x_1, . . . , x_m) := ϕ(x_1, . . . , x_m, S0)
Σ-represents R. Let (a_1, . . . , a_m) ∈ R; then χ_R(a_1, . . . , a_m) = 1. Hence

  Σ ⊢ ϕ(S^{a_1}0, . . . , S^{a_m}0, y) ↔ y = S0,

so Σ ⊢ ϕ(S^{a_1}0, . . . , S^{a_m}0, S0), that is, Σ ⊢ ψ(S^{a_1}0, . . . , S^{a_m}0).
Similarly, (a_1, . . . , a_m) ∉ R implies Σ ⊢ ¬ϕ(S^{a_1}0, . . . , S^{a_m}0, S0). (Here we need
Σ ⊢ S0 ≠ 0.)
(⇒) Conversely, assume R is Σ-representable and let ϕ(x_1, . . . , x_m) be an
L-formula Σ-representing it. We show that

  ψ(x_1, . . . , x_m, y) := (ϕ(x_1, . . . , x_m) ∧ y = S0) ∨ (¬ϕ(x_1, . . . , x_m) ∧ y = 0)

Σ-represents χ_R.
Let (a_1, . . . , a_m) ∈ R; then Σ ⊢ ϕ(S^{a_1}0, . . . , S^{a_m}0). Hence,

  Σ ⊢ [(ϕ(S^{a_1}0, . . . , S^{a_m}0) ∧ y = S0) ∨ (¬ϕ(S^{a_1}0, . . . , S^{a_m}0) ∧ y = 0)] ↔ y = S0,

i.e. Σ ⊢ ψ(S^{a_1}0, . . . , S^{a_m}0, y) ↔ y = S0.
And similarly for (a_1, . . . , a_m) ∉ R: Σ ⊢ ψ(S^{a_1}0, . . . , S^{a_m}0, y) ↔ y = 0.
Theorem 5.4.4 (Representability). Each computable function F : N^n → N is
N-representable. Each computable relation R ⊆ N^m is N-representable.
Proof. By Proposition 5.4.3 we need only consider the case of functions. We
make the following three claims:
(R1) + : N^2 → N, · : N^2 → N, χ_≤ : N^2 → N, and the coordinate function
I^n_i (for each n and i = 1, . . . , n) are N-representable.
(R2) If G : N^m → N and H_1, . . . , H_m : N^n → N are N-representable, then so
is F = G(H_1, . . . , H_m) : N^n → N defined by

  F(a) = G(H_1(a), . . . , H_m(a)).
(R3) If G : N^{n+1} → N is N-representable, and for all a ∈ N^n there exists
x ∈ N such that G(a, x) = 0, then the function F : N^n → N given by
F(a) = µx(G(a, x) = 0) is N-representable.
(R1): The proof of this claim has six parts.
(i) The formula x_1 = x_2 represents {(a, b) ∈ N^2 : a = b} in N:
Let a, b ∈ N. If a = b then obviously N ⊢ S^a0 = S^b0. Suppose that
a ≠ b. Then for every model 𝒜 of N we have 𝒜 ⊨ S^a0 ≠ S^b0, by
Lemma 5.4.2 and its proof. Hence N ⊢ S^a0 ≠ S^b0.
(ii) The term x_1 + x_2 represents + : N^2 → N in N:
Let a + b = c where a, b, c ∈ N. By Lemma 5.4.2 and its proof we
have 𝒜 ⊨ S^a0 + S^b0 = S^c0 for each model 𝒜 of N. It follows that
N ⊢ S^a0 + S^b0 = S^c0.
(iii) The term x_1 · x_2 represents · : N^2 → N in N:
The proof is similar to that of (ii).
(iv) The formula x_1 < x_2 represents {(a, b) ∈ N^2 : a < b} in N:
The proof is similar to that of (i).
(v) χ_≤ : N^2 → N is N-representable:
By (i) and (iv), the formula x_1 < x_2 ∨ x_1 = x_2 represents the set
{(a, b) ∈ N^2 : a ≤ b} in N. So by Proposition 5.4.3, χ_≤ : N^2 → N is
N-representable.
(vi) For n ≥ 1 and 1 ≤ i ≤ n, the term t^n_i(x_1, . . . , x_n) := x_i represents
the function I^n_i : N^n → N in N. This is obvious.
(R2): Let H_1, . . . , H_m : N^n → N be N-represented by ϕ_i(x_1, . . . , x_n, y_i) and
let G : N^m → N be N-represented by ψ(y_1, . . . , y_m, z) where z is distinct
from x_1, . . . , x_n and y_1, . . . , y_m.
Claim: F = G(H_1, . . . , H_m) is N-represented by

  θ(x_1, . . . , x_n, z) := ∃y_1 . . . ∃y_m ((⋀_{i=1}^{m} ϕ_i(x_1, . . . , x_n, y_i)) ∧ ψ(y_1, . . . , y_m, z)).

Put a = (a_1, . . . , a_n) and let c = F(a). We have to show that

  N ⊢ θ(S^a0, z) ↔ z = S^c0,

where S^a0 := (S^{a_1}0, . . . , S^{a_n}0). Let b_i = H_i(a) and put b = (b_1, . . . , b_m).
Then F(a) = G(b) = c. Therefore, N ⊢ ψ(S^b0, z) ↔ z = S^c0 and

  N ⊢ ϕ_i(S^a0, y_i) ↔ y_i = S^{b_i}0   (i = 1, . . . , m).

Argue in models to conclude: N ⊢ θ(S^a0, z) ↔ z = S^c0.
(R3): Let G : N^{n+1} → N be such that for all a ∈ N^n there exists b ∈ N with
G(a, b) = 0. Define F : N^n → N by F(a) = µb(G(a, b) = 0). Suppose
that G is N-represented by ϕ(x_1, . . . , x_n, y, z). We claim that the formula

  ψ(x_1, . . . , x_n, y) := ϕ(x_1, . . . , x_n, y, 0) ∧ ∀w(w < y → ¬ϕ(x_1, . . . , x_n, w, 0))

N-represents F. Let a ∈ N^n and let b = F(a). Then G(a, i) ≠ 0 for i < b
and G(a, b) = 0. Therefore, N ⊢ ϕ(S^a0, S^b0, z) ↔ z = 0 and, for i < b,
G(a, i) ≠ 0 and N ⊢ ϕ(S^a0, S^i0, z) ↔ z = S^{G(a,i)}0. By arguing in models
using Lemma 5.4.2 we obtain N ⊢ ψ(S^a0, y) ↔ y = S^b0, as claimed.
Remark. The converse of this theorem is also true, and is plausible from the
Church-Turing Thesis. We shall prove the converse in the next section.
Exercises. In the exercises below, L is a numerical language and Σ is a set of
L-sentences.
(1) Suppose Σ ⊢ S^m0 ≠ S^n0 whenever m ≠ n. If a function F : N^m → N is
Σ-represented by the L-formula ϕ(x_1, . . . , x_m, y), then the graph of F, as a relation
of arity m + 1 on N, is Σ-represented by ϕ(x_1, . . . , x_m, y). (This result applies to
Σ = N, since N ⊢ S^m0 ≠ S^n0 whenever m ≠ n.)
(2) Suppose Σ ⊇ N. Then the set of all Σ-representable functions F : N^m → N
(m = 0, 1, 2, . . . ) is closed under composition and minimalization.
5.5 Decidability and Gödel Numbering
Definition. An L-theory T is a set of L-sentences closed under provability, that
is, whenever T ⊢ σ, then σ ∈ T.

Examples.
(1) Given a set Σ of L-sentences, the set Th(Σ) := {σ : Σ ⊢ σ} of theorems
of Σ, is an L-theory. If we need to indicate the dependence on L we write
Th_L(Σ) for Th(Σ). We say that Σ axiomatizes an L-theory T (or is an
axiomatization of T) if T = Th(Σ). For Σ = ∅ we also refer to Th_L(Σ) as
predicate logic in L.
(2) Given an L-structure 𝒜, the set Th(𝒜) := {σ : 𝒜 ⊨ σ} is also an
L-theory, called the theory of 𝒜. Note that the theory of 𝒜 is automatically
complete.
(3) Given any class 𝒦 of L-structures, the set

  Th(𝒦) := {σ : 𝒜 ⊨ σ for all 𝒜 ∈ 𝒦}

is an L-theory, called the theory of 𝒦. For example, for L = L_Ri, and 𝒦
the class of finite fields, Th(𝒦) is the set of L-sentences that are true in all
finite fields.
The decision problem for a given L-theory T is to find an algorithm to
decide for any L-sentence σ whether or not σ belongs to T. Since we have not
(yet) defined the concept of algorithm, this is just an informal description at
this stage. One of our goals in this section is to define a formal counterpart,
called "decidability." In the next section we show that the L(N)-theory Th(N) is
undecidable; by the Church-Turing Thesis, this means that the decision problem
for Th(N) has no solution. (This result is a version of Church's Theorem, and
is closely related to the Incompleteness Theorem.)
Unless we specify otherwise we assume in the rest of this chapter that the
language L is finite. This is done for simplicity, and at the end of the remaining
sections we indicate how to avoid this assumption. We shall number the terms
and formulas of L in such a way that various statements about these formulas
and about formal proofs in this language can be translated "effectively" into
equivalent statements about natural numbers expressible by sentences in L(N).
Recall that v_0, v_1, v_2, . . . are our variables. We assign to each symbol

  s ∈ L ∪ {logical symbols} ∪ {v_0, v_1, v_2, . . . }

a symbol number SN(s) ∈ N as follows: SN(v_i) := 2i, and to each remaining
symbol (in the finite set L ∪ {logical symbols}) we assign an odd natural number
as symbol number, subject to the condition that different symbols have different
symbol numbers.
Definition. The Gödel number ⌜t⌝ of an L-term t is defined recursively:

  ⌜t⌝ = ⟨SN(v_i)⟩                  if t = v_i,
  ⌜t⌝ = ⟨SN(F), ⌜t_1⌝, . . . , ⌜t_n⌝⟩  if t = Ft_1 . . . t_n.

The Gödel number ⌜ϕ⌝ of an L-formula ϕ is given recursively by

  ⌜ϕ⌝ = ⟨SN(⊤)⟩                    if ϕ = ⊤,
  ⌜ϕ⌝ = ⟨SN(⊥)⟩                    if ϕ = ⊥,
  ⌜ϕ⌝ = ⟨SN(=), ⌜t_1⌝, ⌜t_2⌝⟩        if ϕ = (t_1 = t_2),
  ⌜ϕ⌝ = ⟨SN(R), ⌜t_1⌝, . . . , ⌜t_m⌝⟩  if ϕ = Rt_1 . . . t_m,
  ⌜ϕ⌝ = ⟨SN(¬), ⌜ψ⌝⟩               if ϕ = ¬ψ,
  ⌜ϕ⌝ = ⟨SN(∨), ⌜ϕ_1⌝, ⌜ϕ_2⌝⟩        if ϕ = ϕ_1 ∨ ϕ_2,
  ⌜ϕ⌝ = ⟨SN(∧), ⌜ϕ_1⌝, ⌜ϕ_2⌝⟩        if ϕ = ϕ_1 ∧ ϕ_2,
  ⌜ϕ⌝ = ⟨SN(∃), ⌜x⌝, ⌜ψ⌝⟩           if ϕ = ∃xψ,
  ⌜ϕ⌝ = ⟨SN(∀), ⌜x⌝, ⌜ψ⌝⟩           if ϕ = ∀xψ.
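The recursion for terms can be tried out concretely. In the Python sketch below (entirely ours), a prime-power code stands in for the sequence numbers ⟨. . .⟩ of Section 5.1, and the odd symbol numbers are an arbitrary hypothetical assignment; only the shape of the recursion matches the definition above.

```python
def nth_prime(k):
    """k-th prime, 0-indexed: 2, 3, 5, 7, ..."""
    count, n = 0, 1
    while count <= k:
        n += 1
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n

def seq(*xs):
    """Prime-power stand-in for the sequence number <x_1, ..., x_n>:
    length as the exponent of 2, entries (shifted by 1) at 3, 5, 7, ..."""
    code = 2 ** len(xs)
    for i, x in enumerate(xs):
        code *= nth_prime(i + 1) ** (x + 1)
    return code

SN = {'0': 1, 'S': 3, '+': 5, '*': 7, '<': 9, '=': 11}  # hypothetical odd codes

def gnum(t):
    """Gödel number of a term written as nested tuples, e.g. ('S', ('0',));
    a variable v_i is the string 'v{i}' and gets symbol number 2i."""
    if isinstance(t, str):               # a variable v_i
        return seq(2 * int(t[1:]))
    op, *args = t
    return seq(SN[op], *(gnum(s) for s in args))
```

For example gnum('v0') = ⟨0⟩ = 2·3 = 6, and distinct terms receive distinct codes, which is all the coding is required to do.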
Lemma 5.5.1. The following subsets of N are computable:
(1) Vble := {⌜x⌝ : x is a variable}
(2) Term := {⌜t⌝ : t is an L-term}
(3) AFor := {⌜ϕ⌝ : ϕ is an atomic L-formula}
(4) For := {⌜ϕ⌝ : ϕ is an L-formula}
Proof. (1) a ∈ Vble iff a = ⟨2b⟩ for some b ≤ a.
(2) a ∈ Term iff a ∈ Vble or a = ⟨SN(F), ⌜t_1⌝, . . . , ⌜t_n⌝⟩ for some function
symbol F of L of arity n and L-terms t_1, . . . , t_n with Gödel numbers < a.
We leave (3) to the reader.
(4) We have For(a) ⟺

  For((a)_1)                 if a = ⟨SN(¬), (a)_1⟩,
  For((a)_1) and For((a)_2)  if a = ⟨SN(∨), (a)_1, (a)_2⟩
                             or a = ⟨SN(∧), (a)_1, (a)_2⟩,
  Vble((a)_1) and For((a)_2) if a = ⟨SN(∃), (a)_1, (a)_2⟩
                             or a = ⟨SN(∀), (a)_1, (a)_2⟩,
  AFor(a)                    otherwise.

So For is computable.
In the next two lemmas, x ranges over variables, ϕ and ψ over Lformulas,
and t and τ over Lterms.
Lemma 5.5.2. The function Sub : N³ → N defined by

  Sub(a, b, c) =
    c                                            if Vble(a) and a = b,
    ⟨(a)₀, Sub((a)₁, b, c), …, Sub((a)ₙ, b, c)⟩   if a = ⟨(a)₀, …, (a)ₙ⟩ with n > 0,
                                                 (a)₀ ≠ SN(∃) and (a)₀ ≠ SN(∀),
    ⟨SN(∃), (a)₁, Sub((a)₂, b, c)⟩                if a = ⟨SN(∃), (a)₁, (a)₂⟩ and (a)₁ ≠ b,
    ⟨SN(∀), (a)₁, Sub((a)₂, b, c)⟩                if a = ⟨SN(∀), (a)₁, (a)₂⟩ and (a)₁ ≠ b,
    a                                            otherwise

is computable, and satisfies

  Sub(⌜t⌝, ⌜x⌝, ⌜τ⌝) = ⌜t(τ/x)⌝  and  Sub(⌜ϕ⌝, ⌜x⌝, ⌜τ⌝) = ⌜ϕ(τ/x)⌝.
Proof. Exercise.
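Sub manipulates codes, but its case distinctions are exactly those of the usual substitution t(τ/x), ϕ(τ/x) on syntax trees: replace free occurrences of x, leave occurrences under a quantifier binding x alone. A tree-level sketch, using the same nested-tuple representation of syntax assumed in the earlier code illustration:

```python
# Tree-level analogue of Sub: phi(tau/x), substitution of tau for the
# free occurrences of the variable x (syntax as nested tuples).

def sub(phi, x, tau):
    """Substitute tau for free occurrences of variable x in term/formula phi."""
    if phi == x:                                   # a free occurrence of x
        return tau
    if isinstance(phi, tuple):
        op = phi[0]
        if op in ('∃', '∀'):
            bound, body = phi[1], phi[2]
            if bound == x:                         # x is bound here: no change
                return phi
            return (op, bound, sub(body, x, tau))
        return (op,) + tuple(sub(p, x, tau) for p in phi[1:])
    return phi                                     # any other symbol: unchanged

x, y = ('v', 0), ('v', 1)
phi = ('∧', ('=', x, y), ('∃', x, ('=', x, x)))    # (v0 = v1) ∧ ∃v0 (v0 = v0)
# Only the free occurrences of v0 are replaced by v1:
assert sub(phi, x, y) == ('∧', ('=', y, y), ('∃', x, ('=', x, x)))
```

Composing this with the Gödel numbering gives exactly the identity Sub(⌜ϕ⌝, ⌜x⌝, ⌜τ⌝) = ⌜ϕ(τ/x)⌝ of the lemma.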
Lemma 5.5.3. The following relations on N are computable:
(1) Fr := {(⌜ϕ⌝, ⌜x⌝) : x occurs free in ϕ} ⊆ N²
(2) FrSub := {(⌜ϕ⌝, ⌜x⌝, ⌜τ⌝) : τ is free for x in ϕ} ⊆ N³
(3) PrAx := {⌜ϕ⌝ : ϕ is a propositional axiom} ⊆ N
(4) Eq := {⌜ϕ⌝ : ϕ is an equality axiom} ⊆ N
(5) Quant := {⌜ψ⌝ : ψ is a quantifier axiom} ⊆ N
(6) MP := {(⌜ϕ₁⌝, ⌜ϕ₁ → ϕ₂⌝, ⌜ϕ₂⌝) : ϕ₁, ϕ₂ are L-formulas} ⊆ N³
(7) Gen := {(⌜ϕ⌝, ⌜ψ⌝) : ψ follows from ϕ by the generalization rule} ⊆ N²
(8) Sent := {⌜ϕ⌝ : ϕ is a sentence} ⊆ N

Proof. The usual inductive or explicit description of each of these notions translates easily into a description of its "Gödel image" that establishes computability of this image. As an example, note that

  Sent(a) ⇐⇒ For(a) and ∀i < a ¬Fr(a, i),

so (8) follows from (1) and earlier results.
98 CHAPTER 5. COMPUTABILITY
In the rest of this section Σ is a set of L-sentences. Put

  ⌜Σ⌝ := {⌜σ⌝ : σ ∈ Σ},

and call Σ computable if ⌜Σ⌝ is computable.
Definition. We define Prf_Σ to be the "set of all Gödel numbers of proofs from Σ", i.e.

  Prf_Σ := {⟨⌜ϕ₁⌝, …, ⌜ϕₙ⌝⟩ : ϕ₁, …, ϕₙ is a proof from Σ}.

So every element of Prf_Σ is of the form ⟨⌜ϕ₁⌝, …, ⌜ϕₙ⌝⟩ where n ≥ 1 and every ϕₖ is either in Σ, or a logical axiom, or obtained from some ϕᵢ, ϕⱼ with 1 ≤ i, j < k by Modus Ponens, or obtained from some ϕᵢ with 1 ≤ i < k by Generalization.
Lemma 5.5.4. If Σ is computable, then Prf_Σ is computable.

Proof. This is because a is in Prf_Σ iff Seq(a) and lh(a) ≠ 0 and for every k < lh(a), either (a)ₖ ∈ ⌜Σ⌝ ∪ PrAx ∪ Eq ∪ Quant, or ∃i, j < k : MP((a)ᵢ, (a)ⱼ, (a)ₖ), or ∃i < k : Gen((a)ᵢ, (a)ₖ).
Definition. A relation R ⊆ Nⁿ is said to be computably generated if there is a computable relation Q ⊆ N^{n+1} such that for all a ∈ Nⁿ we have

  R(a) ⇔ ∃x Q(a, x).

"Recursively enumerable" is also used for "computably generated."
Remark. Every computable relation is obviously computably generated. We
leave it as an exercise to check that the union and intersection of two computably
generated n-ary relations on N are computably generated. The complement of
a computably generated subset of N is not always computably generated, as we
shall see later.
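The closure properties mentioned in the Remark can be made concrete: if R(a) ⇔ ∃x P(a, x) and S(a) ⇔ ∃x Q(a, x) with P, Q computable, then R ∪ S has the computable witness relation P ∨ Q, while for R ∩ S one searches for a single number coding a *pair* of witnesses. A Python sketch; the Cantor-style pairing and the toy sets are assumptions made purely for illustration:

```python
# Union and intersection of computably generated sets via witness relations.

def unpair(z):
    """Inverse of the Cantor pairing (a, b) -> (a+b)(a+b+1)/2 + a."""
    x = 0
    while (x + 1) * (x + 2) // 2 <= z:
        x += 1
    a = z - x * (x + 1) // 2
    return a, x - a

def witness_union(P, Q):
    """Computable witness relation for R ∪ S."""
    return lambda a, x: P(a, x) or Q(a, x)

def witness_inter(P, Q):
    """Computable witness relation for R ∩ S: z codes a pair of witnesses."""
    return lambda a, z: P(a, unpair(z)[0]) and Q(a, unpair(z)[1])

# Toy example: R = even numbers (witness x with a = 2x),
#              S = multiples of 3 (witness x with a = 3x).
P = lambda a, x: a == 2 * x
Q = lambda a, x: a == 3 * x

inter = witness_inter(P, Q)
assert any(inter(12, z) for z in range(200))       # 12 ∈ R ∩ S
assert not any(inter(8, z) for z in range(200))    # 8 ∉ S, so 8 ∉ R ∩ S
```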
Lemma 5.5.5. If Σ is computable, then Th(Σ) is computably generated.
Proof. Apply Lemma 5.5.4 and the fact that for all a ∈ N,

  a ∈ ⌜Th(Σ)⌝ ⇐⇒ ∃b (Prf_Σ(b) and a = (b)_{lh(b)∸1} and Sent(a)).
Definition. An L-theory T is said to be computably axiomatizable if T has a computable axiomatization.² We say that T is decidable if T is computable, and undecidable otherwise. (Thus "T is decidable" means the same thing as "T is computable," but for L-theories "decidable" is more widely used than "computable".)
Proposition 5.5.6 (Negation Theorem). Let A ⊆ Nⁿ and suppose A and its complement ¬A are computably generated. Then A is computable.
² Instead of "computably axiomatizable," also "recursively axiomatizable" and "effectively axiomatizable" are used.
Proof. Let P, Q ⊆ N^{n+1} be computable such that for all a ∈ Nⁿ we have

  A(a) ⇐⇒ ∃x P(a, x),   ¬A(a) ⇐⇒ ∃x Q(a, x).

Then there is for each a ∈ Nⁿ an x ∈ N such that (P ∨ Q)(a, x). The computability of A follows by noting that for all a ∈ Nⁿ we have

  A(a) ⇐⇒ P(a, μx (P ∨ Q)(a, x)).
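The μ-operator in this proof is an unbounded search that is guaranteed to halt, because one of the two witness relations must eventually fire. A sketch; the "squares" example is an assumption chosen so both A and its complement have simple witness relations:

```python
# Negation Theorem as an algorithm: decide membership in A from
# semi-decision (witness) relations for A and for its complement.

def decide(P, Q, a):
    """P witnesses a ∈ A, Q witnesses a ∉ A; since one of the two holds,
    some witness exists and the search μx (P ∨ Q)(a, x) terminates."""
    x = 0
    while not (P(a, x) or Q(a, x)):
        x += 1
    return P(a, x)

# Toy example: A = the set of squares.
P = lambda a, x: x * x == a                        # witnesses a ∈ A
Q = lambda a, x: x * x < a < (x + 1) * (x + 1)     # witnesses a ∉ A
assert decide(P, Q, 49) is True
assert decide(P, Q, 50) is False
```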
Proposition 5.5.7. Every complete and computably axiomatizable L-theory is decidable.

Proof. Let T be a complete L-theory with computable axiomatization Σ. Then T = Th(Σ) is computably generated. Now observe:

  a ∉ ⌜T⌝ ⇐⇒ a ∉ Sent or ⟨SN(¬), a⟩ ∈ ⌜T⌝
         ⇐⇒ a ∉ Sent or ∃b (Prf_Σ(b) and (b)_{lh(b)∸1} = ⟨SN(¬), a⟩).

Hence the complement of ⌜T⌝ is computably generated. Thus T is decidable by the Negation Theorem.
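This proof is at bottom an algorithm: enumerate all consequences of Σ; by completeness and consistency, exactly one of σ, ¬σ eventually appears, and we answer accordingly. A toy Python sketch, with an abstract theorem enumerator standing in for the proof enumeration; the string encoding of "sentences" is an assumption for the example:

```python
# Deciding a complete theory from an enumeration of its theorems.

from itertools import count

def neg(sigma):
    return sigma[4:] if sigma.startswith('not ') else 'not ' + sigma

def decide(sigma, enumerate_theorems):
    """enumerate_theorems(n) yields the n-th theorem; by completeness,
    sigma or neg(sigma) appears at some stage, so the loop halts."""
    for n in count():
        thm = enumerate_theorems(n)
        if thm == sigma:
            return True
        if thm == neg(sigma):
            return False

# Toy "complete theory": all true facts of the form 'k is even' / 'not k is even'.
def theorems(n):
    return f'{n} is even' if n % 2 == 0 else f'not {n} is even'

assert decide('4 is even', theorems) is True
assert decide('7 is even', theorems) is False
```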
Relaxing the assumption of a finite language. Much of the above does not really need the assumption that L is finite. In the discussion below we only assume that L is countable, so the case of finite L is included. First, we assign to each symbol

  s ∈ {v₀, v₁, v₂, …} ∪ {logical symbols}

its symbol number SN(s) ∈ N as follows: SN(vᵢ) := 2i, and for s = ⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀, respectively, put

  SN(s) = 1, 3, 5, 7, 9, 11, 13, 15, respectively.

This part of our numbering of symbols is independent of L.
By a numbering of L we mean an injective function L → N that assigns to each s ∈ L an odd natural number SN(s) > 15 such that if s is a relation symbol, then SN(s) ≡ 1 mod 4, and if s is a function symbol, then SN(s) ≡ 3 mod 4. Such a numbering of L is said to be computable if the sets

  SN(L) = {SN(s) : s ∈ L} ⊆ N,   {(SN(s), arity(s)) : s ∈ L} ⊆ N²

are computable. (So if L is finite, then every numbering of L is computable.)
Given a numbering of L we use it to assign to each L-term t and each L-formula ϕ its Gödel number ⌜t⌝ and ⌜ϕ⌝, just as we did earlier in the section. Suppose a computable numbering of L is given, with the corresponding Gödel numbering of L-terms and L-formulas. Then Lemmas 5.5.1, 5.5.2, and 5.5.3 go through, as a diligent reader can easily verify.
Let also a set Σ of L-sentences be given. Define ⌜Σ⌝ := {⌜σ⌝ : σ ∈ Σ}, and call Σ computable if ⌜Σ⌝ is a computable subset of N. We define Prf_Σ to be the set of all Gödel numbers of proofs from Σ, and for an L-theory T we define the notions of T being computably axiomatizable, T being decidable, and T being undecidable, all just as we did earlier in this section. (Note, however, that the definitions of these notions are all relative to our given computable numbering of L; for finite L different choices of numbering of L yield equivalent notions of Σ being computable, T being computably axiomatizable, and T being decidable.) It is now routine to check that Lemmas 5.5.4, 5.5.5, and Proposition 5.5.7 go through.
Exercises.
(1) Let a and b denote positive real numbers. Call a computable if there are computable functions f, g : N → N such that for all n > 0,

  g(n) ≠ 0 and |a − f(n)/g(n)| < 1/n.

Then:
(i) every positive rational number is computable, and e is computable;
(ii) if a and b are computable, so are a + b, ab, and 1/a, and if in addition a > b, then a − b is also computable;
(iii) a is computable if and only if the binary relation Rₐ on N defined by

  Rₐ(m, n) ⇐⇒ n > 0 and m/n < a

is computable. (Hint: use the Negation Theorem.)
(2) A nonempty S ⊆ N is computably generated iff there is a computable function f : N → N such that S = f(N). Moreover, if S is infinite and computably generated, then f can be chosen injective.
(3) If f : N → N is computable and f(x) > x for all x ∈ N, then f(N) is computable.
(4) Every infinite computably generated subset of N has an infinite computable subset.
(5) A function F : Nⁿ → N is computable iff its graph is computably generated.
(6) Suppose Σ is a computable and consistent set of sentences in the numerical language L. Then every Σ-representable relation R ⊆ Nⁿ is computable.
(7) A function F : Nᵐ → N is computable if and only if it is Σ-representable for some finite, consistent set Σ ⊇ N of sentences in some numerical language L ⊇ L(N).
The last exercise gives an alternative characterization of "computable function."
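For part (i) of exercise (1): e = Σ 1/k!, and the tail beyond the N-th term is below 2/(N+1)!, so taking enough terms yields a rational within 1/n of e. A sketch using exact rational arithmetic (the particular tail bound and stopping rule are one convenient choice, not the only one):

```python
# Rational approximations witnessing that e is computable:
# |e - f(n)/g(n)| < 1/n, using the tail bound  sum_{k>N} 1/k! < 2/(N+1)!.

from fractions import Fraction
from math import factorial, e as euler

def approx_e(n):
    """Return a fraction within 1/n of e (for n > 0)."""
    N = 0
    while Fraction(2, factorial(N + 1)) >= Fraction(1, n):
        N += 1
    return sum(Fraction(1, factorial(k)) for k in range(N + 1))

for n in (1, 10, 1000):
    assert abs(float(approx_e(n)) - euler) < 1 / n
```

The numerator and denominator of approx_e(n) play the roles of f(n) and g(n) in the exercise.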
5.6 Theorems of Gödel and Church

In this section we assume that the finite language L extends L(N).

Theorem 5.6.1 (Church). No consistent L-theory extending N is decidable.
Before giving the proof we record the following consequence:

Corollary 5.6.2 (Weak form of Gödel's Incompleteness Theorem). Each computably axiomatizable L-theory extending N is incomplete.

Proof. Immediate from 5.5.7 and Church's Theorem.

We will indicate in the next section how to construct for any consistent computable set of L-sentences Σ ⊇ N an L-sentence σ such that Σ ⊬ σ and Σ ⊬ ¬σ.
(The corollary above only says that such a sentence exists.)
For the proof of Church’s Theorem we need a few lemmas.
Lemma 5.6.3. The function Num : N → N defined by Num(a) = ⌜Sᵃ0⌝ is computable.

Proof. Num(0) = ⌜0⌝ and Num(a + 1) = ⟨SN(S), Num(a)⟩.
Let P ⊆ A² be any binary relation on a set A. For a ∈ A, we let P(a) ⊆ A be given by the equivalence P(a)(b) ⇔ P(a, b).

Lemma 5.6.4 (Cantor). Given any P ⊆ A², its antidiagonal Q ⊆ A defined by

  Q(b) ⇐⇒ ¬P(b, b)

is not of the form P(a) for any a ∈ A.

Proof. Suppose Q = P(a), where a ∈ A. Then Q(a) iff P(a, a). But by definition, Q(a) iff ¬P(a, a), a contradiction.

This is essentially Cantor's proof that no f : A → P(A) can be surjective. (Use P(a, b) :⇔ b ∈ f(a); then P(a) = f(a).)
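On a finite set the lemma is easy to watch in action: the antidiagonal of any table differs from every row. A minimal sketch (the particular relation P below is arbitrary):

```python
# Cantor's antidiagonal on a finite table: Q differs from every row P(a).

A = range(5)
P = [[bool((a * b + a) % 2) for b in A] for a in A]   # an arbitrary P ⊆ A²

Q = [not P[b][b] for b in A]                           # Q(b) ⟺ not P(b, b)
for a in A:
    assert Q[a] != P[a][a]                             # Q differs from row a
    assert Q != P[a]                                   # ... so Q is not P(a)
```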
Definition. Let Σ be a set of L-sentences. We fix a variable x (e.g. x = v₀) and define the binary relation P_Σ ⊆ N² by

  P_Σ(a, b) ⇐⇒ Sub(a, ⌜x⌝, Num(b)) ∈ ⌜Th(Σ)⌝.

For an L-formula ϕ(x) and a = ⌜ϕ(x)⌝, we have

  Sub(⌜ϕ(x)⌝, ⌜x⌝, ⌜Sᵇ0⌝) = ⌜ϕ(Sᵇ0)⌝,

so

  P_Σ(a, b) ⇐⇒ Σ ⊢ ϕ(Sᵇ0).
Lemma 5.6.5. Suppose Σ ⊇ N is consistent. Then each computable set X ⊆ N is of the form X = P_Σ(a) for some a ∈ N.

Proof. Let X ⊆ N be computable. Then X is Σ-representable by Theorem 5.4.4, say by the formula ϕ(x), i.e. X(b) ⇒ Σ ⊢ ϕ(Sᵇ0), and ¬X(b) ⇒ Σ ⊢ ¬ϕ(Sᵇ0). So X(b) ⇔ Σ ⊢ ϕ(Sᵇ0) (using consistency to get "⇐"). Take a = ⌜ϕ(x)⌝; then X(b) iff Σ ⊢ ϕ(Sᵇ0) iff P_Σ(a, b), that is, X = P_Σ(a).
Proof of Church's Theorem. Let Σ ⊇ N be consistent. We have to show that then Th(Σ) is undecidable, that is, ⌜Th(Σ)⌝ is not computable. Suppose that ⌜Th(Σ)⌝ is computable. Then the antidiagonal Q_Σ ⊆ N of P_Σ is computable:

  b ∈ Q_Σ ⇔ (b, b) ∉ P_Σ ⇔ Sub(b, ⌜x⌝, Num(b)) ∉ ⌜Th(Σ)⌝.

By Lemma 5.6.4, Q_Σ is not among the P_Σ(a). Therefore by Lemma 5.6.5, Q_Σ is not computable, a contradiction. This concludes the proof.
By Lemma 5.5.5 the subset ⌜Th(N)⌝ of N is computably generated. But this set is not computable:

Corollary 5.6.6. Th(N) and Th(∅) (in the language L(N)) are undecidable.

Proof. The undecidability of Th(N) is a special case of Church's Theorem. Let ⋀N be the sentence N1 ∧ ⋯ ∧ N9. Then, for any L(N)-sentence σ,

  N ⊢ σ ⇐⇒ ∅ ⊢ ⋀N → σ,

that is, for all a ∈ N,

  a ∈ ⌜Th(N)⌝ ⇐⇒ a ∈ Sent and ⟨SN(∨), ⟨SN(¬), ⌜⋀N⌝⟩, a⟩ ∈ ⌜Th(∅)⌝.

Therefore, if Th(∅) were decidable, then Th(N) would be decidable; but Th(N) is undecidable. So Th(∅) is undecidable.
Discussion. We have seen that N is quite weak. A very strong set of axioms in the language L(N) is PA (1st order Peano Arithmetic). Its axioms are those of N together with all induction axioms, that is, all sentences of the form

  ∀x[(ϕ(x, 0) ∧ ∀y (ϕ(x, y) → ϕ(x, Sy))) → ∀y ϕ(x, y)]

where ϕ(x, y) is an L(N)-formula, x = (x₁, …, xₙ), and ∀x stands for ∀x₁ … ∀xₙ.
Note that PA is consistent, since it has N = (N; <, 0, S, +, ·) as a model. Also PA is computable (exercise). Thus by the theorems above, Th(PA) is undecidable and incomplete. To appreciate the significance of this result, one needs a little background knowledge, including some history.
Over a century of experience has shown that number-theoretic assertions can be expressed by sentences of L(N), admittedly in an often contorted way. (That is, we know how to construct for any number-theoretic statement a sentence σ of L(N) such that the statement is true if and only if N ⊨ σ. In most cases we just indicate how to construct such a sentence, since an actual sentence would be too unwieldy without abbreviations.)
What is more important, we know from experience that any established fact of classical number theory—including results obtained by sophisticated analytic and algebraic methods—can be proved from PA, in the sense that PA ⊢ σ for the sentence σ expressing that fact. Thus before Gödel's Incompleteness Theorem it seemed natural to conjecture that PA is complete. (Apparently it was not widely recognized at the time that completeness of PA would have had
the astonishing consequence that Th(N) is decidable.) Of course, the situation
cannot be remedied by adding new axioms to PA, at least if we insist that the
axioms are true in N and that we have eﬀective means to tell which sentences
are axioms. In this sense, the Incompleteness Theorem is pervasive.
5.7 A more explicit incompleteness theorem

In Section 5.6 we obtained Gödel's Incompleteness Theorem as an immediate corollary of Church's theorem. In this section, we prove the incompleteness theorem in the more explicit form stated in the introduction to this chapter.
In this section L ⊇ L(N) is a finite language, and Σ is a set of L-sentences. We also fix two distinct variables x and y.
We shall indicate how to construct, for any computable consistent Σ ⊇ N, a formula ϕ(x) of L(N) with the following properties:

(i) N ⊢ ϕ(Sᵐ0) for each m;
(ii) Σ ⊬ ∀xϕ(x).

Note that then the sentence ∀xϕ(x) is true in N but not provable from Σ. Here is a sketch of how to make such a sentence. Assume for simplicity that L = L(N) and N ⊨ Σ. The idea is to construct sentences σ and σ′ such that

  (1) N ⊨ σ ↔ σ′;  and  (2) N ⊨ σ′ ⇐⇒ Σ ⊬ σ.

From (1) and (2) we get N ⊨ σ ⇐⇒ Σ ⊬ σ. Assuming that N ⊭ σ produces a contradiction. Hence σ is true in N, and thus(!) not provable from Σ.
How to implement this strange idea? To take care of (2), one might guess that σ′ = ∀x ¬pr(x, S^{⌜σ⌝}0), where pr(x, y) is a formula representing in N the binary relation Pr ⊆ N² defined by

  Pr(m, n) ⇐⇒ m is the Gödel number of a proof from Σ of a sentence with Gödel number n.

But how do we arrange (1)? Since σ′ := ∀x ¬pr(x, S^{⌜σ⌝}0) depends on σ, the solution is to apply the fixed-point lemma below to ρ(y) := ∀x ¬pr(x, y).
This finishes our sketch. What follows is a rigorous implementation.
Lemma 5.7.1 (Fixpoint Lemma). Suppose Σ ⊇ N. Then there is for every L-formula ρ(y) an L-sentence σ such that Σ ⊢ σ ↔ ρ(Sⁿ0) where n = ⌜σ⌝.
Proof. The function (a, b) ↦ Sub(a, ⌜x⌝, Num(b)) : N² → N is computable by Lemma 5.5.2. Hence by the representability theorem it is N-representable. Let sub(x₁, x₂, y) be an L(N)-formula representing it in N. We can assume that the variable x does not occur in sub(x₁, x₂, y). Then for all a, b in N,

  N ⊢ sub(Sᵃ0, Sᵇ0, y) ↔ y = Sᶜ0, where c = Sub(a, ⌜x⌝, Num(b)).   (1)
Now let ρ(y) be an L-formula. Define θ(x) := ∃y(sub(x, x, y) ∧ ρ(y)) and let m = ⌜θ(x)⌝. Let σ := θ(Sᵐ0), and put n = ⌜σ⌝. We claim that

  Σ ⊢ σ ↔ ρ(Sⁿ0).

Indeed,

  n = ⌜σ⌝ = ⌜θ(Sᵐ0)⌝ = Sub(⌜θ(x)⌝, ⌜x⌝, ⌜Sᵐ0⌝) = Sub(m, ⌜x⌝, Num(m)).

So by (1),

  N ⊢ sub(Sᵐ0, Sᵐ0, y) ↔ y = Sⁿ0.   (2)

We have

  σ = θ(Sᵐ0) = ∃y(sub(Sᵐ0, Sᵐ0, y) ∧ ρ(y)),

so by (2) we get Σ ⊢ σ ↔ ∃y(y = Sⁿ0 ∧ ρ(y)). Hence, Σ ⊢ σ ↔ ρ(Sⁿ0).
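The trick θ(x) := ∃y(sub(x, x, y) ∧ ρ(y)) with σ := θ(Sᵐ0), m = ⌜θ(x)⌝, is the same self-application that makes quines work. A string-level analogue, with Python expressions standing in for formulas, repr for numerals, and eval for truth; this correspondence is an informal illustration, not part of the formal development:

```python
# String-level fixpoint: build a "sentence" s whose value is rho applied
# to s itself, mirroring  sigma <-> rho(S^n 0)  with  n = ⌜sigma⌝.

def subst(template, value):
    """Replace the placeholder X by the quoted value (the role of Sub)."""
    return template.replace('X', repr(value))

def rho(y):                        # any "formula" in one free variable
    return f'my code has {len(y)} characters'

theta = 'rho(subst(X, X))'         # theta(x) = rho(sub(x, x))
s = subst(theta, theta)            # the fixed point: s "talks about" s

assert eval(s) == rho(s)           # s's value is rho of s itself
```

Evaluating s runs rho(subst(theta, theta)), and subst(theta, theta) rebuilds s, which is exactly the self-reference the lemma formalizes.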
Theorem 5.7.2. Suppose Σ is consistent, computable, and proves all axioms of N. Then there exists an L(N)-formula ϕ(x) such that N ⊢ ϕ(Sᵐ0) for each m, but Σ ⊬ ∀xϕ(x).

Proof. Consider the relation Pr_Σ ⊆ N² defined by

  Pr_Σ(m, n) ⇐⇒ m is the Gödel number of a proof from Σ of an L-sentence with Gödel number n.

Since Σ is computable, Pr_Σ is computable. Hence Pr_Σ is representable in N. Let pr_Σ(x, y) be an L(N)-formula representing Pr_Σ in N, and hence in Σ. Because Σ is consistent we have for all m, n:

  Σ ⊢ pr_Σ(Sᵐ0, Sⁿ0) ⇐⇒ Pr_Σ(m, n)   (1)
  Σ ⊢ ¬pr_Σ(Sᵐ0, Sⁿ0) ⇐⇒ ¬Pr_Σ(m, n)   (2)

Let ρ(y) be the L(N)-formula ∀x ¬pr_Σ(x, y). Lemma 5.7.1 (with L = L(N) and Σ = N) provides an L(N)-sentence σ such that N ⊢ σ ↔ ρ(S^{⌜σ⌝}0). It follows that Σ ⊢ σ ↔ ρ(S^{⌜σ⌝}0), that is,

  Σ ⊢ σ ↔ ∀x ¬pr_Σ(x, S^{⌜σ⌝}0).   (3)

Claim: Σ ⊬ σ. Assume towards a contradiction that Σ ⊢ σ; let m be the Gödel number of a proof of σ from Σ, so Pr_Σ(m, ⌜σ⌝). Because of (3) we also have Σ ⊢ ∀x ¬pr_Σ(x, S^{⌜σ⌝}0), so Σ ⊢ ¬pr_Σ(Sᵐ0, S^{⌜σ⌝}0), which by (2) yields ¬Pr_Σ(m, ⌜σ⌝), a contradiction. This establishes the claim.
Now put ϕ(x) := ¬pr_Σ(x, S^{⌜σ⌝}0). We now show:
(i) N ⊢ ϕ(Sᵐ0) for each m. Because Σ ⊬ σ, no m is the Gödel number of a proof of σ from Σ. Hence ¬Pr_Σ(m, ⌜σ⌝) for each m, which by the defining property of pr_Σ yields N ⊢ ¬pr_Σ(Sᵐ0, S^{⌜σ⌝}0) for each m, that is, N ⊢ ϕ(Sᵐ0) for each m.
(ii) Σ ⊬ ∀xϕ(x). This is because of the Claim and Σ ⊢ σ ↔ ∀xϕ(x), by (3).
Corollary 5.7.3. Suppose that Σ is computable and true in the L-expansion N* of N. Then there exists an L(N)-formula ϕ(x) such that N ⊢ ϕ(Sⁿ0) for each n, but Σ ∪ N ⊬ ∀xϕ(x).

(Note that then ∀xϕ(x) is true in N* but not provable from Σ.) To obtain this corollary, apply the theorem above to Σ ∪ N in place of Σ.
Exercises.
(1) Let N* be an L-expansion of N. Then the set of Gödel numbers of L-sentences true in N* is not definable in N*.
This result (of Tarski) is known as the undefinability of truth. It strengthens the special case of Church's theorem which says that the set of Gödel numbers of L-sentences true in N* is not computable. Tarski's result follows easily from the fixpoint lemma, as does the more general result in the next exercise.
(2) Suppose Σ ⊇ N is consistent. Then the set ⌜Th(Σ)⌝ is not Σ-representable, and there is no truth definition for Σ. Here a truth definition for Σ is an L-formula true(y) such that for all L-sentences σ,

  Σ ⊢ σ ↔ true(Sⁿ0), where n = ⌜σ⌝.
5.8 Undecidable Theories
Church's theorem says that any consistent theory containing a certain basic amount of integer arithmetic is undecidable. How about theories like Th(Fl) (the theory of fields), and Th(Gr) (the theory of groups)? An easy way to prove the undecidability of such theories is due to Tarski: he noticed that if N is definable in some model of a theory T, then T is undecidable. The aim of this section is to establish this result and indicate some applications. In order not to distract from this theme by boring details, we shall occasionally replace a proof by an appeal to the Church-Turing Thesis. (A conscientious reader will replace these appeals by proofs until reaching a level of skill that makes constructing such proofs a predictable routine.)
In this section, L and L′ are finite languages, Σ is a set of L-sentences, and Σ′ is a set of L′-sentences.
Lemma 5.8.1. Let L ⊆ L′ and Σ ⊆ Σ′.
(1) Suppose Σ′ is conservative over Σ. Then

  Th_L(Σ) is undecidable ⇒ Th_{L′}(Σ′) is undecidable.

(2) Suppose L = L′ and Σ′ ∖ Σ is finite. Then

  Th(Σ′) is undecidable ⇒ Th(Σ) is undecidable.

(3) Suppose all symbols of L′ ∖ L are constant symbols. Then

  Th_L(Σ) is undecidable ⇐⇒ Th_{L′}(Σ) is undecidable.

(4) Suppose Σ′ extends Σ by a definition. Then

  Th_L(Σ) is undecidable ⇐⇒ Th_{L′}(Σ′) is undecidable.
Proof. (1) In this case we have for all a ∈ N,

  a ∈ ⌜Th_L(Σ)⌝ ⇐⇒ a ∈ Sent_L and a ∈ ⌜Th_{L′}(Σ′)⌝.

It follows that if Th_{L′}(Σ′) is decidable, so is Th_L(Σ).
(2) Write Σ′ = {σ₁, …, σ_N} ∪ Σ, and put σ′ := σ₁ ∧ ⋯ ∧ σ_N. Then for each L-sentence σ we have Σ′ ⊢ σ ⇐⇒ Σ ⊢ σ′ → σ, so for all a ∈ N,

  a ∈ ⌜Th(Σ′)⌝ ⇐⇒ a ∈ Sent and ⟨SN(∨), ⟨SN(¬), ⌜σ′⌝⟩, a⟩ ∈ ⌜Th(Σ)⌝.

It follows that if Th(Σ) is decidable then so is Th(Σ′).
(3) Let c₀, …, cₙ be the distinct constant symbols of L′ ∖ L. Given any L′-sentence σ we define the L-sentence σ′ as follows: take k ∈ N minimal such that σ contains no variable vₘ with m ≥ k, replace each occurrence of cᵢ in σ by v_{k+i} for i = 0, …, n, and let ϕ(vₖ, …, v_{k+n}) be the resulting L-formula (so σ = ϕ(c₀, …, cₙ)); then σ′ := ∀vₖ … ∀v_{k+n} ϕ(vₖ, …, v_{k+n}). An easy argument using the completeness theorem shows that

  Σ ⊢_{L′} σ ⇐⇒ Σ ⊢_L σ′.

By the Church-Turing Thesis there is a computable function a ↦ a′ : N → N such that ⌜σ⌝′ = ⌜σ′⌝ for all L′-sentences σ; we leave it to the reader to replace this appeal to the Church-Turing Thesis by a proof. Then, for all a ∈ N:

  a ∈ ⌜Th_{L′}(Σ)⌝ ⇐⇒ a ∈ Sent_{L′} and a′ ∈ ⌜Th_L(Σ)⌝.

This yields the ⇐ direction of (3); the converse holds by (1).
(4) The ⇒ direction holds by (1). For the ⇐ we use an algorithm (see ...) that computes for each L′-sentence σ an L-sentence σ* such that Σ′ ⊢ σ ↔ σ*. By the Church-Turing Thesis there is a computable function a ↦ a* : N → N such that ⌜σ⌝* = ⌜σ*⌝ for all L′-sentences σ. Hence, for all a ∈ N,

  a ∈ ⌜Th_{L′}(Σ′)⌝ ⇐⇒ a ∈ Sent_{L′} and a* ∈ ⌜Th_L(Σ)⌝.

This yields the ⇐ direction of (4).
Remark. We cannot drop the assumption L = L′ in (2): take L = ∅, Σ = ∅, L′ = L(N) and Σ′ = ∅. Then Th_{L′}(Σ′) is undecidable by Corollary 5.6.6, but Th_L(Σ) is decidable (exercise).
Definition. An L-structure A is said to be strongly undecidable if for every set Σ of L-sentences such that A ⊨ Σ, Th(Σ) is undecidable.

So A is strongly undecidable iff every L-theory of which A is a model is undecidable.

Example. N = (N; <, 0, S, +, ·) is strongly undecidable. To see this, let Σ be a set of L(N)-sentences such that N ⊨ Σ. We have to show that Th(Σ) is undecidable. Now N ⊨ Σ ∪ N. By Church's Theorem Th(Σ ∪ N) is undecidable, hence Th(Σ) is undecidable by part (2) of Lemma 5.8.1.
The following result is an easy application of part (3) of the previous lemma.

Lemma 5.8.2. Let c₀, …, cₙ be distinct constant symbols not in L, and let (A, a₀, …, aₙ) be an L(c₀, …, cₙ)-expansion of the L-structure A. Then

  (A, a₀, …, aₙ) is strongly undecidable ⇒ A is strongly undecidable.
Theorem 5.8.3 (Tarski). Suppose the L-structure A is definable in the L′-structure B and A is strongly undecidable. Then B is strongly undecidable.

Proof. (Sketch) By the previous lemma (with L′ and B instead of L and A), we can reduce to the case that we have a 0-definition δ of A in B. One can show that in this case we have (1) an algorithm that computes for any L-sentence σ an L′-sentence δσ, and (2) a finite set ∆ of L′-sentences true in B, with the following properties:
(i) A ⊨ σ ⇐⇒ B ⊨ δσ;
(ii) for each set Σ′ of L′-sentences with B ⊨ Σ′ and each L-sentence σ,

  Σ ⊢ σ ⇐⇒ Σ′ ∪ ∆ ⊢ δσ, where Σ := {s : s is an L-sentence and Σ′ ∪ ∆ ⊨ δs}.

Let Σ′ be a set of L′-sentences such that B ⊨ Σ′; we need to show that Th_{L′}(Σ′) is undecidable. Suppose towards a contradiction that Th_{L′}(Σ′) is decidable. Then Th_{L′}(Σ′ ∪ ∆) is decidable, by part (2) of Lemma 5.8.1, so we have an algorithm for deciding whether any given L′-sentence is provable from Σ′ ∪ ∆. Take Σ as in (ii) above. Then A ⊨ Σ by (i), and by (ii) we obtain an algorithm for deciding whether any given L-sentence is provable from Σ. But Th_L(Σ) is undecidable by assumption, and we have a contradiction.
Corollary 5.8.4. Th(Ri) is undecidable; in other words, the theory of rings is undecidable.

Proof. It suffices to show that the ring (Z; 0, 1, +, −, ·) of integers is strongly undecidable. Using Lagrange's theorem that

  N = {a² + b² + c² + d² : a, b, c, d ∈ Z},

we see that the inclusion map N → Z defines N in the ring of integers, so by Tarski's Theorem the ring of integers is strongly undecidable.

For the same reason, the theory of commutative rings, of integral domains, and more generally, the theory of any class of rings that has the ring of integers among its members is undecidable.
Fact. The set Z ⊆ Q is 0-definable in the field (Q; 0, 1, +, −, ·) of rational numbers, and thus the ring of integers is definable in the field of rational numbers.
We shall take this here on faith. The only known proof uses nontrivial results about quadratic forms, and is due to Julia Robinson.

Corollary 5.8.5. The theory Th(Fl) of fields is undecidable. The theory of any class of fields that has the field of rationals among its members is undecidable.
Exercises.
(1) Argue informally, using the Church-Turing Thesis, that Th(ACF) is decidable. You can use the fact that ACF has QE.
Below we let a, b, c denote integers. We say that a divides b (notation: a | b) if ax = b for some integer x, and we say that c is a least common multiple of a and b if a | c, b | c, and c | x for every integer x such that a | x and b | x. Recall that if a and b are not both zero, then they have a unique positive least common multiple, and that if a and b are coprime (that is, there is no integer x > 1 with x | a and x | b), then they have ab as a least common multiple.
(2) The structure (Z; 0, 1, +, |) is strongly undecidable, where | is the binary relation of divisibility on Z. Hint: Show that if b + a is a least common multiple of a and a + 1, and b − a is a least common multiple of a and a − 1, then b = a². Use this to define the squaring function in (Z; 0, 1, +, |), and then show that multiplication is 0-definable in (Z; 0, 1, +, |).
(3) Consider the group G of bijective maps Z → Z, with composition as the group multiplication. Then G (as a model of Gr) is strongly undecidable. Hint: let s be the element of G given by s(x) = x + 1. Check that if g ∈ G commutes with s, then g = sᵃ for some a. Next show that

  a | b ⇐⇒ sᵇ commutes with each g ∈ G that commutes with sᵃ.

Use these facts to specify a definition of the structure (Z; 0, 1, +, |) in the group G.
Thus by Tarski's theorem, the theory of groups is undecidable. In fact, the theory of any class of groups that has the group G of the exercise above among its members is undecidable. On the other hand, Th(Ab), the theory of abelian groups, is known to be decidable (Szmielew).
(4) Let L = {F} have just a binary function symbol F. Then predicate logic in L (that is, Th_L(∅)) is undecidable.
It can also be shown that predicate logic in the language whose only symbol is a
binary relation symbol is undecidable. On the other hand, predicate logic in the
language whose only symbol is a unary function symbol is decidable.
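The hint in exercise (2) above rests on the identities lcm(a, a+1) = a² + a and lcm(a, a−1) = a² − a (consecutive integers are coprime), which force b = a². A quick numerical sanity check of this squaring trick:

```python
# Exercise (2)'s hint: since gcd(a, a±1) = 1, the positive least common
# multiples are lcm(a, a+1) = a² + a and lcm(a, a-1) = a² - a, forcing b = a².

from math import lcm   # requires Python 3.9+

for a in range(2, 200):
    b = lcm(a, a + 1) - a           # b + a = lcm(a, a+1)
    assert b == a * a               # ... so b is the square of a
    assert b - a == lcm(a, a - 1)   # and b - a = lcm(a, a-1), as the hint says
```

This is why the squaring function, and from it multiplication, becomes definable from 0, 1, + and divisibility alone.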
To do
• improve title page
• improve or delete index
• more exercises (from homework, exams)
• footnotes pointing to alternative terminology, etc.
• brief discussion of classes at end of "Sets and Maps"?
• at the end of results without proof?
• brief discussion on P=NP in connection with propositional logic
• section(s) on boolean algebra, including Stone representation, Lindenbaum-Tarski algebras, etc.
• section on equational logic? (boolean algebras, groups, as examples)
• solution to a problem by Erdős via compactness theorem, and other simple applications of compactness
• include "equality theorem"
• translation of one language in another (needed in connection with Tarski theorem in last section)
• more details on back-and-forth in connection with unnested formulas
• extra elementary model theory (universal classes, model-theoretic criteria for qe, etc., application to ACF, maybe extra section on RCF, Ax's theorem)
• on computability: a few extra results on c.e. sets, and exponential diophantine result
• basic framework for many-sorted logic
Contents
1 Preliminaries 1.1 Mathematical Logic: a brief overview . . . . . . . . . . . . . . . . 1.2 Sets and Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Basic Concepts of Logic 2.1 Propositional Logic . . . . . . . . . . . . . 2.2 Completeness for Propositional Logic . . . 2.3 Languages and Structures . . . . . . . . . 2.4 Variables and Terms . . . . . . . . . . . . 2.5 Formulas and Sentences . . . . . . . . . . 2.6 Models . . . . . . . . . . . . . . . . . . . . 2.7 Logical Axioms and Rules; Formal Proofs 3 The 3.1 3.2 3.3 3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 5 13 13 19 24 28 30 35 37 43 43 44 49 53 59 59 64 65 69 73
Completeness Theorem Another Form of Completeness . . . . . . . . Proof of the Completeness Theorem . . . . . Some Elementary Results of Predicate Logic . Equational Classes and Universal Algebra . .
4 Some Model Theory 4.1 L¨wenheimSkolem; Vaught’s Test . . . . . . o 4.2 Elementary Equivalence and BackandForth 4.3 Quantiﬁer Elimination . . . . . . . . . . . . . 4.4 Presburger Arithmetic . . . . . . . . . . . . . 4.5 Skolemization and Extension by Deﬁnition . .
5 Computability, Decidability, and Incompleteness 5.1 Computable Functions . . . . . . . . . . . . . . . . 5.2 The ChurchTuring Thesis . . . . . . . . . . . . . . 5.3 Primitive Recursive Functions . . . . . . . . . . . . 5.4 Representability . . . . . . . . . . . . . . . . . . . 5.5 Decidability and G¨del Numbering . . . . . . . . . o 5.6 Theorems of G¨del and Church . . . . . . . . . . . o 5.7 A more explicit incompleteness theorem . . . . . . i
77 . 77 . 86 . 87 . 91 . 95 . 100 . 103
. . . .8 Undecidable Theories . . . . . . . . . .5. . . . . . . . . . . 105 .
For additional material in Model Theory we refer the reader to Chang. and Keisler. C. Cantor.S. B. their activity led to the view that logic + set theory can serve as a basis for 1 . predicate logic.. R. like group theory or probability theory. Reading. H. propositional logic. see Shoenﬁeld. Model Theory. J. Boole. Schr¨der.. as clarifying the foundations of the natural and real number systems. Peirce. to Rogers. C. New York. McGrawHill. and Leibniz dreamt of reducing reasoning to calculation. H. For more on the course material. As a mathematical subject. Poizat. AddisonWesley.Chapter 1 Preliminaries We start with a brief overview of mathematical logic as covered in this course. 1967. From our perspective o we see their work as leading to boolean algebra. In this course we develop mathematical logic using elementary set theory as given. and as introducing suggestive symbolic notation for logical operations. Frege. Springer. 1990.. 2000. Theory of Recursive Functions and Eﬀective Computability.1 Mathematical Logic: a brief overview Aristotle identiﬁed some simple patterns in human reasoning. Also. Next we review some basic notions from elementary set theory. C. 1. NorthHolland. which provides a medium for communicating mathematics in a precise and clear way. Peano. just as one would do with other branches of mathematics. A Course in Model Theory. J. set theory. Dedekind. and for additional material on Computability.. however. logic is relatively recent: the 19th century pioneers were Bolzano. Mathematical Logic. and E. 1967.
and Gentzen. o o Tarski. Hilbert. quotient sets and power sets. . Kleene. like cartesian products. Ramsey. 1 but it did bring crucial progress of a conceptual nature. Skolem. Church. In the period 19001950 important new ideas came from Russell. logic as pursued by mathematicians gradually branched into four main areas: model theory. Lusin (in Moscow). The topics in this course are part of the common background of mathematicians active in any of these areas. Post. The early part of the 20th century was also marked by the socalled f oundational crisis in mathematics. They discovered the ﬁrst real theorems in mathematical logic. set theory. Lusin. Tarski (in Warsaw and Berkeley). and proof theory. PRELIMINARIES all of mathematics. who made up a large part of that generation and the next in mathematical logic. and also thrives on many interactions with other areas of mathematics and computer science. In the second half of the last century. o and Church (in Princeton) had many students and collaborators. More generally. o Hilbert (in G¨ttingen). G¨del. Turing. computability theory (or recursion theory). A strong impulse for developing mathematical logic came from the attempts during these times to provide solid foundations for mathematics. and the recognition that logic as used in mathematics obeys mathematical rules that can be made fully explicit. What distinguishes mathematical logic within mathematics is that statements about mathematical objects are taken seriously as mathematical objects in their own right. with those of G¨del having a dramatic impact. Hausdorﬀ. in mathematical logic we formalize (formulate in a precise mathematical way) notions used informally by mathematicians such as: • property • statement (in a given language) • structure • truth (what it means for a given statement to be true in a given structure) • proof (from a given set of axioms) • algorithm 1 In the case of set theory one could dispute this. Even so. 
This era did not produce theorems in mathematical logic of any real depth. Zermelo.2 CHAPTER 1. the main inﬂuence of set theory on the rest of mathematics was to enable simple constructions of great generality. and this involves only very elementary set theory. Most of these names will be encountered again during the course. L¨wenheim. Mathematical logic has now taken on a life of its own. Herbrand.
Once we have mathematical definitions of these notions, we can try to prove theorems about these formalized notions. If done with imagination, this process can lead to unexpected rewards. Of course, formalization tends to caricature the informal concepts it aims to capture, but no harm is done if this is kept firmly in mind.

Example. The notorious Goldbach Conjecture asserts that every even integer greater than 2 is a sum of two prime numbers. With the understanding that the variables range over N = {0, 1, 2, . . . }, and that 0, 1, +, ·, < denote the usual arithmetic operations and relations on N, this assertion can be expressed formally as

(GC)  ∀x[(1 + 1 < x ∧ even(x)) → ∃p∃q(prime(p) ∧ prime(q) ∧ x = p + q)]

where even(x) abbreviates ∃y(x = y + y) and prime(p) abbreviates 1 < p ∧ ∀r∀s(p = r · s → (r = 1 ∨ s = 1)). The expression GC is an example of a formal statement (also called a sentence) in the language of arithmetic, which has symbols 0, 1, +, ·, < to denote arithmetic operations and relations, in addition to logical symbols like =, ∧, ∨, ¬, →, ∃, ∀, and variables x, y, z, p, q, r, s.

The Goldbach Conjecture asserts that this particular sentence GC is true in the structure (N, 0, 1, +, ·, <). (No proof of the Goldbach Conjecture is known.) It also makes sense to ask whether the sentence GC is true in the structure (R, 0, 1, +, ·, <). (It's not, as is easily verified. That the question makes sense does not mean that it is of any interest.)

A century of experience gives us confidence that all classical number-theoretic results—old or new, proved by elementary methods or by sophisticated algebra and analysis—can be proved from the Peano axioms for arithmetic.2 However, in our present state of knowledge, GC might be true in (N, 0, 1, +, ·, <), but not provable from those axioms. (On the other hand, once you know what exactly we mean by provable from the Peano axioms, you will see that if GC is provable from those axioms, then GC is true in (N, 0, 1, +, ·, <), and that if GC is false in (N, 0, 1, +, ·, <), then its negation ¬GC is provable from those axioms.) The point of this example is simply to make the reader aware of the notions "true in a given structure" and "provable from a given set of axioms," and their difference. One objective of this course is to figure out the connections (and disconnections) between these notions.

2 Here we do not count as part of classical number theory some results like Ramsey's Theorem that can be stated in the language of arithmetic, but are arguably more in the spirit of logic and combinatorics.
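The Goldbach Conjecture is easy to test mechanically for small even numbers, which illustrates the difference between verifying finitely many instances and having a proof. A small Python sketch (the helper names are ours, chosen for illustration):

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def goldbach_witness(x: int):
    """Return primes (p, q) with p + q = x, or None if there are none.
    GC asserts that a witness exists for every even x > 2."""
    for p in range(2, x // 2 + 1):
        if is_prime(p) and is_prime(x - p):
            return (p, x - p)
    return None

# Checking finitely many even numbers is not a proof, but no
# counterexample is known:
assert all(goldbach_witness(x) is not None for x in range(4, 1001, 2))
```

No finite amount of such checking settles whether GC is true in (N, 0, 1, +, ·, <), let alone whether it is provable from the Peano axioms.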
Some highlights (1900-1950)

The results below are among the most frequently used facts of mathematical logic. The terminology used in stating these results might be unfamiliar, but that should change during the course. What matters is to get some preliminary idea of what we are aiming for. As will become clear during the course, each of these results has stronger versions, on which applications often depend, but in this overview we prefer simple statements over strength and applicability.

We begin with two results that are fundamental in model theory. They concern the notion of model of Σ, where Σ is a set of sentences in a language L. At this stage we only say by way of explanation that a model of Σ is a mathematical structure in which all sentences of Σ are true. For example, if Σ is the (infinite) set of axioms for fields of characteristic zero in the language of rings, then a model of Σ is just a field of characteristic zero.

Theorem of Löwenheim and Skolem. If Σ is a countable set of sentences in some language and Σ has a model, then Σ has a countable model.

Compactness Theorem (Gödel, Mal'cev). Let Σ be a set of sentences in some language. Then Σ has a model if and only if each finite subset of Σ has a model.

The next result goes a little beyond model theory by relating the notion of "model of Σ" to that of "provability from Σ":

Completeness Theorem (Gödel, 1930). Let Σ be a set of sentences in some language L, and let σ be a sentence in L. Then σ is provable from Σ if and only if σ is true in all models of Σ.

The Löwenheim-Skolem and Compactness theorems do not mention the notion of provability, and thus model theorists often prefer to bypass Completeness in establishing these results; see for example Poizat's book. In our treatment we shall obtain the first two theorems as byproducts of the Completeness Theorem and its proof. In the case of the Compactness Theorem this reflects history, but the theorem of Löwenheim and Skolem predates the Completeness Theorem.

Here is an important early result on a specific arithmetic structure:

Theorem of Presburger and Skolem. Each sentence in the language of the structure (Z, 0, 1, +, −, <) that is true in this structure is provable from the axioms for ordered abelian groups with least positive element 1, augmented, for each n = 2, 3, 4, . . . , by an axiom that says that for every a there is a b such that a = nb or a = nb + 1 or . . . or a = nb + 1 + · · · + 1 (with n disjuncts in total). Here nb stands for b + · · · + b (with n summands). Moreover, there is an algorithm that, given any sentence in this language as input, decides whether this sentence is true in (Z, 0, 1, +, −, <).

Note that in (Z, 0, 1, +, −, <) we have not included multiplication among the primitives. When we do include multiplication, the situation changes dramatically:
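For a fixed n, the augmenting axiom in the theorem of Presburger and Skolem just expresses division with remainder: every a ∈ Z equals nb + r for some b and some remainder r ∈ {0, . . . , n − 1}, one disjunct per remainder. A quick Python sanity check of this fact on a window of integers (the function name is ours):

```python
def presburger_axiom_holds(n: int, a: int) -> bool:
    """The axiom for n asserts: for every a there is b with
    a = n*b + r for some remainder r in {0, ..., n-1}
    (one disjunct per remainder).  Over Z this is exactly
    division with remainder."""
    b, r = divmod(a, n)  # Python guarantees 0 <= r < n for n > 0
    return a == n * b + r and 0 <= r < n

# Spot-check the axioms for n = 2, ..., 10 on a window of integers:
assert all(presburger_axiom_holds(n, a)
           for n in range(2, 11)
           for a in range(-50, 51))
```

Of course this only checks instances of the axioms; the content of the theorem is that these axioms prove every true sentence of the structure.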
Incompleteness and Undecidability of Arithmetic (Gödel-Church, 1930's). One can construct a sentence in the language of arithmetic that is true in the structure (N, 0, 1, +, ·, <), but not provable from the Peano axioms. There is no algorithm that, given any sentence in this language as input, decides whether this sentence is true in (N, 0, 1, +, ·, <).

Here "there is no algorithm" is used in the mathematical sense of there cannot exist an algorithm, not in the weaker colloquial sense of "no algorithm is known." This theorem is intimately connected with the clarification of notions like computability and algorithm, in which Turing played a key role.

In contrast to these incompleteness and undecidability results on (sufficiently rich) arithmetic, we have Tarski's theorem on the field of real numbers (1930-1950): Every sentence in the language of arithmetic that is true in the structure (R, 0, 1, +, ·, <) is provable from the axioms for ordered fields augmented by the axioms

• every odd degree polynomial has a zero,
• every positive element is a square.

There is also an algorithm that decides for any given sentence in this language as input, whether this sentence is true in (R, 0, 1, +, ·, <).

1.2 Sets and Maps

We shall use this section as an opportunity to fix notations and terminologies that are used throughout these notes, and throughout mathematics. In a few places we shall need more set theory than we introduce here, for example, ordinals and cardinals. The following little book is a good place to read about these matters. (It also contains an axiomatic treatment of set theory starting from scratch.)

Halmos, P. R., Naive Set Theory, Springer, New York, 1974.

In an axiomatic treatment of set theory as in the book by Halmos all assertions about sets below are proved from a few simple axioms. In such a treatment the notion of set itself is left undefined, but the axioms about sets are suggested by thinking of a set as a collection of mathematical objects, called its elements or members. To indicate that an object x is an element of the set A we write x ∈ A, in words: x is in A (or: x belongs to A). To indicate that x is not in A we write x ∉ A. We consider the sets A and B as the same set (notation: A = B) if and only if they have exactly the same elements. We often introduce a set via the bracket notation, listing or indicating inside the brackets its elements.
For example, {1, 2, 7} is the set with 1, 2, and 7 as its only elements. Note that {1, 2, 7} = {2, 1, 7, 7} and {3, 3} = {3}: the same set can be described in many different ways. Don't confuse an object x with the set {x} that has x as its only element: for example, the object x = {0, 1} is a set that has exactly two elements, namely 0 and 1, but the set {x} = {{0, 1}} has only one element, namely x.

Here are some important sets that the reader has probably encountered previously.

Examples.
(1) The empty set: ∅ (it has no elements).
(2) The set of natural numbers: N = {0, 1, 2, 3, . . . }.
(3) The set of integers: Z = {. . . , −2, −1, 0, 1, 2, . . . }.
(4) The set of rational numbers: Q.
(5) The set of real numbers: R.
(6) The set of complex numbers: C.

Remark. Throughout these notes m and n always denote natural numbers. For example, "for all m . . . " will mean "for all m ∈ N . . . ".

If all elements of the set A are in the set B, then we say that A is a subset of B (and write A ⊆ B). Thus the empty set ∅ is a subset of every set, and each set is a subset of itself.

We often introduce a set A in our discussions by defining A to be the set of all elements of a given set B that satisfy some property P. Notation: A := {x ∈ B : x satisfies P} (hence A ⊆ B).

Remark. In a definition such as we just gave ("We say that · · · if —") the meaning of "if" is actually "if and only if." We committed a similar abuse of language earlier in defining set inclusion by the phrase "If —, then we say that · · · ." We shall continue such abuse, in accordance with tradition, but only in similarly worded definitions. Also, we shall often write "iff" or "⇔" to abbreviate "if and only if."

Let A and B be sets. Then we can form the following sets:
(a) A ∪ B := {x : x ∈ A or x ∈ B} (union of A and B);
(b) A ∩ B := {x : x ∈ A and x ∈ B} (intersection of A and B);
(c) A ∖ B := {x : x ∈ A and x ∉ B} (difference of A and B);
(d) A × B := {(a, b) : a ∈ A and b ∈ B} (cartesian product of A and B).

We say that A and B are disjoint if A ∩ B = ∅, that is, they have no element in common.

The elements of A × B are the so-called ordered pairs (a, b) with a ∈ A and b ∈ B. The key property of ordered pairs is that we have (a, b) = (c, d) if and only if a = c and b = d. For example, you may think of R × R as the set of points (a, b) in the xy-plane of coordinate geometry.
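The operations (a)-(d) have direct analogues for finite sets in most programming languages; a small Python illustration, using the built-in set type, where listing order and repetition are likewise irrelevant:

```python
A = {1, 2, 7}
B = {2, 3}

assert A == {2, 1, 7, 7}              # order and repetition are irrelevant
assert A | B == {1, 2, 3, 7}          # union A ∪ B
assert A & B == {2}                   # intersection A ∩ B
assert A - B == {1, 7}                # difference A ∖ B
AxB = {(a, b) for a in A for b in B}  # cartesian product A × B
assert len(AxB) == len(A) * len(B)

# x versus {x}: the set {0, 1} has two elements, {{0, 1}} has one
x = frozenset({0, 1})
assert len(x) == 2 and len({x}) == 1
```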
Maps

Definition. A map is a triple f = (A, B, Γ) of sets A, B, Γ such that Γ ⊆ A × B and for each a ∈ A there is exactly one b ∈ B with (a, b) ∈ Γ; we write f(a) for this unique b, and call it the value of f at a (or the image of a under f).3 We call A the domain of f, B the codomain of f, and Γ the graph of f.4 We write f : A → B to indicate that f is a map with domain A and codomain B, and in this situation we also say that f is a map from A to B.

Among the many synonyms of map are mapping, assignment, function, operator, transformation. Typically, "function" is used when the codomain is a set of numbers of some kind, "operator" when the elements of domain and codomain are themselves functions, and "transformation" is used in geometric situations where domain and codomain are equal.

Examples.
(1) Given any set A we have the identity map 1A : A → A defined by 1A(a) = a for all a ∈ A.
(2) Any polynomial f(X) = a0 + a1X + · · · + anX^n with real coefficients a0, . . . , an gives rise to a function x ↦ f(x) : R → R. We often use the "maps to" symbol ↦ in this way to indicate the rule by which to each x in the domain we associate its value f(x).

Definition. Given f : A → B and g : B → C we have a map g ◦ f : A → C defined by (g ◦ f)(a) = g(f(a)) for all a ∈ A. It is called the composition of g and f.

Definition. Let f : A → B be a map. It is said to be injective if for all a1 ≠ a2 in A we have f(a1) ≠ f(a2). It is said to be surjective if for each b ∈ B there exists a ∈ A such that f(a) = b. It is said to be bijective (or a bijection) if it is both injective and surjective.

Definition. Let f : A → B be a map. For X ⊆ A we put f(X) := {f(x) : x ∈ X} ⊆ B (direct image of X under f). (There is a notational conflict here when X is both a subset of A and an element of A; some authors resolve the conflict by denoting this direct image by f[X] or in some other way, but it will always be clear from the context when f(X) is meant to be the direct image of X under f.) We also call f(A) = {f(a) : a ∈ A} the image of f. Thus surjectivity of our map f is equivalent to f(A) = B. For Y ⊆ B we put f−1(Y) := {x ∈ A : f(x) ∈ Y} ⊆ A (inverse image of Y under f).

(We use equal as synonym for the same or identical; also coincide is a synonym for being the same.)

3 Sometimes we shall write fa instead of f(a) in order to cut down on parentheses.
4 Other words for "domain" and "codomain" are "source" and "target", respectively.
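For finite sets, direct and inverse images are easy to compute; a Python sketch (the helper names direct_image and inverse_image are ours):

```python
def direct_image(f, X):
    """f(X) = {f(x) : x in X}."""
    return {f(x) for x in X}

def inverse_image(f, A, Y):
    """f^{-1}(Y) = {x in A : f(x) in Y}; the domain A must be given."""
    return {x for x in A if f(x) in Y}

A = {-2, -1, 0, 1, 2}
square = lambda x: x * x              # the map x ↦ x^2 on A

assert direct_image(square, A) == {0, 1, 4}          # the image f(A)
assert inverse_image(square, A, {1, 4}) == {-2, -1, 1, 2}
assert inverse_image(square, A, {3}) == set()        # nothing maps to 3
```

Note that the inverse image needs the domain as an explicit argument, exactly because a map in the sense above carries its domain with it.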
If f : A → B is a bijection then we have an inverse map f−1 : B → A given by f−1(b) := the unique a ∈ A such that f(a) = b. Note that then f−1 ◦ f = 1A and f ◦ f−1 = 1B. Conversely, if f : A → B and g : B → A satisfy g ◦ f = 1A and f ◦ g = 1B, then f is a bijection with f−1 = g. (The attentive reader will notice that we just introduced a potential conflict of notation: for bijective f : A → B and Y ⊆ B, both the inverse image of Y under f and the direct image of Y under f−1 are denoted by f−1(Y); since these two subsets of A coincide, no harm is done.)

It follows from the definition of "map" that f : A → B and g : C → D are equal (f = g) if and only if A = C, B = D, and f(x) = g(x) for all x ∈ A. We say that g : C → D extends f : A → B if A ⊆ C, B ⊆ D, and f(x) = g(x) for all x ∈ A.5

Definition. A set A is said to be finite if there exists n and a bijection f : {1, . . . , n} → A. Here we use {1, . . . , n} as a suggestive notation for the set {m : 1 ≤ m ≤ n}. For n = 0 this is just ∅. If A is finite there is exactly one such n (although if n > 1 there will be more than one bijection f : {1, . . . , n} → A); we call this unique n the number of elements of A or the cardinality of A, and denote it by |A|. A set which is not finite is said to be infinite.

Definition. A set A is said to be countably infinite if there is a bijection N → A. It is said to be countable if it is either finite or countably infinite. The sets N, Z and Q are countably infinite, but the infinite set R is not countably infinite. Every infinite set has a countably infinite subset.

One of the standard axioms of set theory, the Power Set Axiom, says: for any set A, there is a set whose elements are exactly the subsets of A. Such a set of subsets of A is clearly uniquely determined by A, is called the power set of A, and is denoted by P(A). If A is finite, so is P(A), and |P(A)| = 2^|A|. Note that a ↦ {a} : A → P(A) is an injective map. However, there is no surjective map A → P(A):

Cantor's Theorem. Let S : A → P(A) be a map. Then the set {a ∈ A : a ∉ S(a)} (a subset of A) is not an element of S(A).

Proof. Suppose otherwise. Then {a ∈ A : a ∉ S(a)} = S(b) where b ∈ A. Assuming b ∈ S(b) yields b ∉ S(b), a contradiction. Thus b ∉ S(b), but then b ∈ S(b), again a contradiction. This concludes the proof.

5 We also say "g : C → D is an extension of f : A → B" or "f : A → B is a restriction of g : C → D."
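For a finite set A the diagonal argument in Cantor's Theorem can be checked exhaustively: for every map S : A → P(A), the set {a ∈ A : a ∉ S(a)} is missed by S. A Python sketch for a three-element A (the helper name diagonal is ours):

```python
from itertools import combinations, product

A = (0, 1, 2)
# P(A): all 2^3 = 8 subsets of A
power_set = [frozenset(c) for r in range(len(A) + 1)
             for c in combinations(A, r)]

def diagonal(S):
    """Cantor's set {a in A : a not in S(a)} for a map S : A -> P(A)."""
    return frozenset(a for a in A if a not in S[a])

# For each of the 8^3 maps S : A -> P(A), the diagonal set is not in
# the image of S; in particular no such S is surjective.
for values in product(power_set, repeat=len(A)):
    S = dict(zip(A, values))
    assert diagonal(S) not in set(S.values())
```

The finite check also matches the counting fact |P(A)| = 2^|A| > |A|, but the proof above works for arbitrary sets, where counting is of no help.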
Let I and A be sets. Then there is a set whose elements are exactly the maps f : I → A; this set is denoted by A^I. For I = {1, . . . , n} we also write A^n instead of A^I. Thus an element of A^n is a map a : {1, . . . , n} → A; we usually think of such an a as the n-tuple (a(1), . . . , a(n)), and we often write ai instead of a(i). So A^n can be thought of as the set of n-tuples (a1, . . . , an) with each ai ∈ A. For n = 0 the set A^n has just one element, the empty tuple.

An n-ary relation on A is just a subset of A^n, and an n-ary operation on A is a map from A^n into A. Instead of "1-ary" we usually say "unary", and instead of "2-ary" we can say "binary". For example, {(a, b) ∈ Z^2 : a < b} is a binary relation on Z, and integer addition is the binary operation (a, b) ↦ a + b on Z.

{ai}i∈I or (ai)i∈I denotes a family of objects ai indexed by the set I, and is just a suggestive notation for a set {(i, ai) : i ∈ I}, not to be confused with the set {ai : i ∈ I}. There may be repetitions in the family, that is, it may happen that ai = aj for distinct indices i, j ∈ I, but such repetition is not reflected in the set {ai : i ∈ I}. For example, if I = N and an = a for all n, then {(i, ai) : i ∈ I} = {(i, a) : i ∈ N} is countably infinite, but {ai : i ∈ I} = {a} has just one element. For I = N we usually say "sequence" instead of "family".

Given any family (Ai)i∈I of sets (that is, each Ai is a set) we have a set

⋃i∈I Ai := {x : x ∈ Ai for some i ∈ I},

the union of the family. If I is finite and each Ai is finite, then so is the union above, and |⋃i∈I Ai| ≤ Σi∈I |Ai|. If I is countable and each Ai is countable then ⋃i∈I Ai is countable.

Given any family (Ai)i∈I of sets we have a set

∏i∈I Ai := {(ai)i∈I : ai ∈ Ai for all i ∈ I},

the product of the family. One axiom of set theory, the Axiom of Choice, is a bit special, but we shall use it a few times. It says that for any family (Ai)i∈I of nonempty sets there is a family (ai)i∈I such that ai ∈ Ai for all i ∈ I, that is, ∏i∈I Ai ≠ ∅.

Words

Definition. Let A be a set. Think of A as an alphabet of letters. A word of length n on A is an n-tuple (a1, . . . , an) of letters ai ∈ A; because we think of it as a word (string of letters) we shall write this tuple instead as a1 . . . an (without parentheses or commas). There is a unique word of length 0 on A, called the empty word and written ε. Given a word a = a1 . . . an of length n ≥ 1 on A, the first letter (or first symbol) of a is by definition a1, and the last letter (or last symbol) of a is an.
The set of all words on A is denoted by A*:

A* = ⋃n A^n (disjoint union).

Given words a = a1 . . . am and b = b1 . . . bn on A of length m and n respectively, we define their concatenation ab ∈ A*: ab = a1 . . . am b1 . . . bn. Thus ab is a word on A of length m + n. Concatenation is a binary operation on A* that is associative: (ab)c = a(bc) for all a, b, c ∈ A*; with ε as two-sided identity: εa = aε = a for all a ∈ A*; and with two-sided cancellation: for all a, b, c ∈ A*, if ab = ac, then b = c, and if ac = bc, then a = b. When A ⊆ B we can identify A* with a subset of B*, and this will be done whenever convenient.

Logical expressions like formulas and terms will be introduced later as words of a special form on suitable alphabets.

Equivalence Relations and Quotient Sets

Given a binary relation R on a set A it is often more suggestive to write aRb instead of (a, b) ∈ R.

Definition. An equivalence relation on a set A is a binary relation ∼ on A such that for all a, b, c ∈ A:
(i) a ∼ a (reflexivity);
(ii) a ∼ b implies b ∼ a (symmetry);
(iii) (a ∼ b and b ∼ c) implies a ∼ c (transitivity).

Example. Given any n we have the equivalence relation "congruence modulo n" on Z defined as follows: for any a, b ∈ Z we have

a ≡ b mod n ⇐⇒ a − b = nc for some c ∈ Z.

For n = 0 this is just equality on Z.

Definition. Let ∼ be an equivalence relation on the set A. The equivalence class a∼ of an element a ∈ A is defined by a∼ = {b ∈ A : a ∼ b} (a subset of A). For a, b ∈ A we have a∼ = b∼ if and only if a ∼ b, and a∼ ∩ b∼ = ∅ if and only if a ≁ b.

Definition. The quotient set of A by ∼ is by definition the set of equivalence classes:

A/∼ = {a∼ : a ∈ A}.

This quotient set is a partition of A: it is a collection of pairwise disjoint nonempty subsets of A whose union is A. (Collection is a synonym for set; we use it here because we don't like to say "set of subsets of A".) Every partition of A is the quotient set A/∼ for a unique equivalence relation ∼ on A.
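For congruence modulo n one can compute the equivalence classes on a finite window of Z and watch them form a partition; a Python sketch (we restrict Z to a finite window, since all of Z cannot be enumerated, and the helper name cls is ours):

```python
n = 3
window = range(-9, 10)   # a finite window of Z, for illustration

def cls(a):
    """The congruence class of a modulo n, restricted to the window."""
    return frozenset(b for b in window if (a - b) % n == 0)

classes = {cls(a) for a in window}     # the quotient set, restricted

assert cls(1) == cls(4)                # 1 ≡ 4 mod 3: equal classes
assert cls(1).isdisjoint(cls(2))       # inequivalent: disjoint classes
assert len(classes) == n               # three residue classes
# The classes partition the window: pairwise disjoint, union everything
assert sum(len(c) for c in classes) == len(window)
```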
Thus equivalence relations on A and partitions of A are just different ways to describe the same situation. In the previous example (congruence modulo n) the equivalence classes are called congruence classes modulo n (or residue classes modulo n), and the corresponding quotient set is often denoted Z/nZ.

Remark. Readers familiar with some abstract algebra will note that the construction in this example is a special case of a more general construction—that of a quotient of a group with respect to a normal subgroup.

Posets

A partially ordered set (short: poset) is a pair (P, ≤) consisting of a set P and a partial ordering ≤ on P, that is, ≤ is a binary relation on P such that for all p, q, r ∈ P:
(i) p ≤ p (reflexivity);
(ii) if p ≤ q and q ≤ p, then p = q (antisymmetry);
(iii) if p ≤ q and q ≤ r, then p ≤ r (transitivity).

If in addition we have for all p, q ∈ P:
(iv) p ≤ q or q ≤ p,
then we say that ≤ is a linear order on P, or that (P, ≤) is a linearly ordered set.6

Here is some useful notation. For x, y ∈ P we set

x ≥ y :⇐⇒ y ≤ x,    x < y :⇐⇒ y > x :⇐⇒ x ≤ y and x ≠ y.

Note that (P, ≥) is also a poset, with ≥ instead of ≤.

As an example, take any set A and its collection P(A) of subsets. Then X ≤ Y :⇐⇒ X ⊆ Y (for subsets X, Y of A) defines a poset (P(A), ≤), also referred to as the power set of A ordered by inclusion. This is not a linearly ordered set if A has more than one element. Each of the sets N, Z, Q, R comes with its familiar linear order on it.

Finite linearly ordered sets are determined "up to unique isomorphism" by their size: if (P, ≤) is a linearly ordered set and |P| = n, then there is a unique map ι : P → {1, . . . , n} such that for all p, q ∈ P we have: p ≤ q ⇐⇒ ι(p) ≤ ι(q). This map ι is a bijection.

Let (P, ≤) be a poset. A least element of P is a p ∈ P such that p ≤ x for all x ∈ P; a largest element of P is defined likewise, with ≥ instead of ≤. Of course, P can have at most one least element; therefore we can refer to the least element of P, if P has one, and likewise we can refer to the largest element of P.

6 One also uses the term total order instead of linear order.
A minimal element of P is a p ∈ P such that there is no x ∈ P with x < p; a maximal element of P is defined likewise, with > instead of <. If P has a least element, then this element is also the unique minimal element of P. Some posets, however, have more than one minimal element: for example, when X is the collection of nonempty subsets of a set A and X is ordered by inclusion, then the minimal elements of X are the singletons {a} with a ∈ A.

The reader might want to prove the following result to get a feeling for these notions: If P is finite and nonempty, then P has a maximal element, and there is a linear order ≤′ on P that extends ≤ in the sense that p ≤ q =⇒ p ≤′ q, for all p, q ∈ P. (Hint: use induction on |P|.)

Let X ⊆ P. We often tacitly consider X as a poset in its own right, by restricting the given partial ordering of P to X. More precisely this means that we consider the poset (X, ≤X) where the partial ordering ≤X on X is defined by

x ≤X y ⇐⇒ x ≤ y    (x, y ∈ X).

Thus we can speak of least, largest, minimal, and maximal elements of a set X ⊆ P, when the ambient poset (P, ≤) is clear from the context. We call X a chain in P if (X, ≤X) is linearly ordered. A lowerbound (respectively, upperbound) of X in P is an element l ∈ P (respectively, an element u ∈ P) such that l ≤ x for all x ∈ X (respectively, x ≤ u for all x ∈ X).

Occasionally we shall use the following fact about posets (P, ≤):

Zorn's Lemma. Suppose P is nonempty and every nonempty chain in P has an upperbound in P. Then P has a maximal element.

For a further discussion of Zorn's Lemma and its proof using the Axiom of Choice we refer the reader to Halmos's book on set theory.
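The example of the nonempty subsets of a set A ordered by inclusion can be checked mechanically for a small A: there are several minimal elements (the singletons) but no least element, while A itself is the unique maximal element. A Python sketch, using the proper-subset comparison on frozensets for the strict order (the helper names are ours):

```python
from itertools import combinations

A = (0, 1, 2)
# X: the nonempty subsets of A, ordered by inclusion
X = [frozenset(c) for r in range(1, len(A) + 1)
     for c in combinations(A, r)]

def is_minimal(x):
    return not any(y < x for y in X)   # '<' is proper inclusion here

def is_maximal(x):
    return not any(x < y for y in X)

minimals = [x for x in X if is_minimal(x)]
assert sorted(tuple(m) for m in minimals) == [(0,), (1,), (2,)]  # singletons
assert [x for x in X if is_maximal(x)] == [frozenset(A)]         # only A
# No least element: no x is contained in every y
assert not any(all(x <= y for y in X) for x in X)
```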
Chapter 2

Basic Concepts of Logic

2.1 Propositional Logic

Propositional logic is the fragment of logic where we construct new statements from given statements using so-called connectives like "not", "or" and "and". The truth value of such a new statement is then completely determined by the truth values of the given statements. Thus, given any statements p and q, assumed to be either true or false, we can form the three statements

¬p (the negation of p, pronounced as "not p"),
p ∨ q (the disjunction of p and q, pronounced as "p or q"),
p ∧ q (the conjunction of p and q, pronounced as "p and q").

We shall regard ¬p as true if and only if p is not true. (Instead of "not true" we also say "false".) Also, p ∨ q is defined to be true if and only if p is true or q is true (including the possibility that both are true), and p ∧ q is deemed to be true if and only if p is true and q is true. This leads to more complicated combinations like ¬(p ∧ (¬q)).

We now introduce a formalism that makes this into mathematics. We start with the five distinct symbols

⊤  ⊥  ¬  ∨  ∧

to be thought of as true, false, not, or, and and, respectively. These symbols are fixed throughout the course, and are called propositional connectives. In this section we also fix a set A whose elements will be called propositional atoms (or just atoms), such that no propositional connective is an atom. It may help the reader to think of an atom a as a variable for which we can substitute arbitrary statements.

A proposition on A is a word on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} that can be obtained by applying the following rules:
(i) each atom a ∈ A (viewed as a word of length 1) is a proposition on A;
(ii) ⊤ and ⊥ (viewed as words of length 1) are propositions on A;
(iii) if p and q are propositions on A, then the concatenations ¬p, ∨pq and ∧pq are propositions on A.

Remark. We defined "proposition" using the suggestive but vague phrase "can be obtained by applying the following rules". The reader should take such an informal description as shorthand for a completely explicit definition, which in the case at hand is as follows: A proposition on A is a word w on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} for which there is a sequence w1, . . . , wn of words on that same alphabet, with n ≥ 1, such that w = wn and for each k ∈ {1, . . . , n}, either wk ∈ A ∪ {⊤, ⊥} (where each element in the last set is viewed as a word of length 1), or there are i, j ∈ {1, . . . , k − 1} such that wk is one of the concatenations ¬wi, ∨wiwj, ∧wiwj.

We let Prop(A) denote the set of propositions on A. For the rest of this section "proposition" means "proposition on A", and p, q, r (sometimes with subscripts) will denote propositions.

Example. Suppose a, b, c are atoms. Then ∧∨¬ab¬c is a proposition. This follows from the rules above: a is a proposition, so ¬a is a proposition, hence ∨¬ab as well; also ¬c is a proposition, and thus ∧∨¬ab¬c is a proposition.

Having the connectives ∨ and ∧ in front of the propositions they "connect", rather than in between, is called prefix notation or Polish notation. This is theoretically elegant, but for the sake of readability we usually write p ∨ q and p ∧ q to denote ∨pq and ∧pq respectively, and we also use parentheses and brackets if this helps to clarify the structure of a proposition. So the proposition in the example above could be denoted by [(¬a) ∨ b] ∧ (¬c), or even by (¬a ∨ b) ∧ ¬c, since we shall agree that ¬ binds stronger than ∨ and ∧. Because of the informal nature of these conventions, we don't have to give precise rules for their use; it's enough that each actual use is clear to the reader.

The intended structure of a proposition—how we think of it as built up from atoms via connectives—is best exhibited in the form of a tree, a two-dimensional array, rather than as a one-dimensional string. Such trees, however, occupy valuable space on the printed page, and are typographically demanding. Fortunately, our "official" prefix notation does uniquely determine the intended structure of a proposition: that is what the next lemma amounts to.

Lemma 2.1.1 (Unique Readability). If p has length 1, then either p = ⊤, or p = ⊥, or p is an atom. If p has length > 1, then its first symbol is either ¬, or ∨, or ∧. If the first symbol of p is ¬, then p = ¬q for a unique q. If the first symbol of p is ∨, then p = ∨qr for a unique pair (q, r). If the first symbol of p is ∧, then p = ∧qr for a unique pair (q, r). (Note that we used here our convention that p, q, r denote propositions.)
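The recursive structure of prefix notation also makes propositions easy to recognize mechanically: read one symbol, then recursively read as many complete propositions as its arity demands. A Python sketch, with ASCII stand-ins T, F, ~, v, & for ⊤, ⊥, ¬, ∨, ∧ (the function name is ours):

```python
def is_proposition(word, atoms):
    """Decide whether word (a string of symbols) is a proposition
    in prefix (Polish) notation over the given set of atoms."""
    def scan(i):
        # return the index just past one complete proposition at i
        if i >= len(word):
            raise ValueError
        s = word[i]
        if s in atoms or s in ("T", "F"):   # atoms, top, bottom: arity 0
            return i + 1
        if s == "~":                        # negation: arity 1
            return scan(i + 1)
        if s in ("v", "&"):                 # or / and: arity 2
            return scan(scan(i + 1))
        raise ValueError                    # unknown symbol
    try:
        # a proposition is exactly one complete parse, with no leftovers
        return scan(0) == len(word)
    except ValueError:
        return False

atoms = {"a", "b", "c"}
assert is_proposition("&v~ab~c", atoms)    # the example (~a v b) & ~c
assert not is_proposition("v~a", atoms)    # missing second disjunct
assert not is_proposition("", atoms)       # the empty word
```

That scan consumes a proposition in exactly one way is the computational content of unique readability.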
For now we shall assume this lemma without proof. At the end of this section we establish more general results which are needed also later in the course.

Remark. Rather than thinking of a proposition as a statement, it's better viewed as a function whose arguments and values are statements: replacing the atoms in a proposition by specific mathematical statements like "2 × 2 = 4", "π² < 7", and "every even integer > 2 is the sum of two prime numbers", we obtain again a mathematical statement.

We shall use the following notational conventions: p → q denotes ¬p ∨ q, and p ↔ q denotes (p → q) ∧ (q → p).

Definition. By recursion on n we define

p1 ∨ . . . ∨ pn =
  ⊥ if n = 0,
  p1 if n = 1,
  p1 ∨ p2 if n = 2,
  (p1 ∨ . . . ∨ pn−1) ∨ pn if n > 2.

Thus p ∨ q ∨ r stands for (p ∨ q) ∨ r. We call p1 ∨ . . . ∨ pn the disjunction of p1, . . . , pn. The reason that for n = 0 we take this disjunction to be ⊥ is that we want a disjunction to be true if and only if (at least) one of the disjuncts is true. Similarly, the conjunction p1 ∧ . . . ∧ pn of p1, . . . , pn is defined by replacing everywhere ∨ by ∧ and ⊥ by ⊤ in the definition of p1 ∨ . . . ∨ pn.

Definition. A truth assignment is a map t : A → {0, 1}. We extend such a t to t̂ : Prop(A) → {0, 1} by requiring

(i) t̂(⊤) = 1,
(ii) t̂(⊥) = 0,
(iii) t̂(¬p) = 1 − t̂(p),
(iv) t̂(p ∨ q) = max(t̂(p), t̂(q)),
(v) t̂(p ∧ q) = min(t̂(p), t̂(q)).

Note that there is exactly one such extension t̂, by unique readability. To simplify notation we often write t instead of t̂.

The array below is called a truth table. It shows on each row below the top row how the two leftmost entries t(p) and t(q) determine t(¬p), t(p ∨ q), t(p ∧ q), t(p → q), and t(p ↔ q).

p  q | ¬p  p∨q  p∧q  p→q  p↔q
0  0 |  1   0    0    1    1
0  1 |  1   1    0    1    0
1  0 |  0   1    0    0    0
1  1 |  0   1    1    1    1

Note that t(p → q) = 1 if and only if t(p) ≤ t(q), and that t(p ↔ q) = 1 if and only if t(p) = t(q).

Suppose a1, . . . , an are the distinct atoms that occur in p, and we know how p is built up from those atoms. Then we can compute in a finite number of steps t(p) from t(a1), . . . , t(an).
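Clauses (i)-(v) translate directly into a recursive evaluator on prefix notation; by unique readability the recursion consumes each proposition in exactly one way. A Python sketch, with ASCII stand-ins T, F, ~, v, & for ⊤, ⊥, ¬, ∨, ∧ (the function name is ours):

```python
def value(word, t):
    """Extend a truth assignment t (a dict atom -> 0/1) to a
    proposition in prefix notation, following clauses (i)-(v)."""
    def ev(i):
        # return (value, index just past the proposition starting at i)
        s = word[i]
        if s == "T": return 1, i + 1           # clause (i)
        if s == "F": return 0, i + 1           # clause (ii)
        if s == "~":                           # clause (iii)
            v, j = ev(i + 1)
            return 1 - v, j
        if s in ("v", "&"):                    # clauses (iv), (v)
            v1, j = ev(i + 1)
            v2, k = ev(j)
            return (max if s == "v" else min)(v1, v2), k
        return t[s], i + 1                     # an atom: use t
    v, j = ev(0)
    assert j == len(word)   # unique readability: exactly one parse
    return v

t = {"a": 1, "b": 0, "c": 0}
# (~a v b) & ~c  with a=1, b=0, c=0:  (0 v 0) & 1 = 0
assert value("&v~ab~c", t) == 0
t["b"] = 1
assert value("&v~ab~c", t) == 1    # now (0 v 1) & 1 = 1
```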
Note that = p is the same as ∅ = p. . and the same is true for the equivalence class of pi(1) ∧ · · · ∧ pi(n) . i∈I i∈I pi := pi(1) ∧ · · · ∧ pi(n) . . . Note that = p ↔ q iﬀ t(p) = t(q) for all t : A → {0.) Remark. . We shall return to this issue in the more algebraic setting of boolean algebras. n} with I does not matter. = (p ∧ q) ↔ (q ∧ p) (3) = (p ∨ (q ∨ r)) ↔ ((p ∨ q) ∨ r). . (5). . but computers can handle somewhat larger n. associativity. 1}. commutativity. distributivity. t(an ). Deﬁnition. We call p equivalent to q if = p ↔ q. . . By the remark preceding the deﬁnition one can verify whether any given p with exactly n distinct atoms in it is a tautology by computing 2n numbers and checking that these numbers all come out 1. Some notation: let (pi )i∈I be a family of propositions with ﬁnite index set I. = (p ∧ (q ∨ r)) ↔ (p ∧ q) ∨ (p ∧ r) (5) = (p ∨ (p ∧ q)) ↔ p. . 1}. Note the leftright symmetry in (1)–(7) : the socalled duality of propositional logic. 1} such that t(p) = 1 for all p ∈ Σ. . We say that a proposition p is a tautological consequence of Σ (written Σ = p) if t(p) = 1 for every model t of Σ. this is usually the case. q. Lemma 2.1. n. choose a bijection k → i(k) : {1.16 CHAPTER 2. n} → I and set pi := pi(1) ∨ · · · ∨ pi(n) . The lemma below gives a useful list of equivalences. BASIC CONCEPTS OF LOGIC t(p) from t(a1 ). Note that “equivalent to” deﬁnes an equivalence relation on Prop(A). We leave it to the reader to verify them. By a model of Σ we mean a truth assignment t : A → {0. the absorption law . because the equivalence class of pi(1) ∨ · · · ∨ pi(n) does not depend on this choice. For all p. In particular. . (To do this accurately by hand is already cumbersome for n = 5. = (p ∧ (p ∨ q)) ↔ p (6) = (¬(p ∨ q)) ↔ (¬p ∧ ¬q). 1} such that t(ai ) = t (ai ) for i = 1.2. Next we deﬁne “model of Σ” and “tautological consequence of Σ”. (3). Fortunately. Thus is a tautology. and (6) are often referred to as the idempotent law . Let Σ ⊆ Prop(A). (4). 
r we have the following equivalences: (1) = (p ∨ p) ↔ p. other methods are often eﬃcient for special cases. = (p ∧ ¬p) ↔ ⊥ (8) = ¬¬p ↔ p Items (1). Deﬁnition. t(p) = t (p) for any t : A → {0. . . . (2). respectively. . . = (p ∧ (q ∧ r)) ↔ ((p ∧ q) ∧ r) (4) = (p ∨ (q ∧ r)) ↔ (p ∨ q) ∧ (p ∨ r). = (¬(p ∧ q)) ↔ (¬p ∨ ¬q) (7) = (p ∨ ¬p) ↔ . = (p ∧ p) ↔ p (2) = (p ∨ q) ↔ (q ∨ p). We say that p is satisﬁable if t(p) = 1 for some t : A → {0. . 1}. and the De Morgan law . the notations i∈I pi and i∈I pi can only be used when the particular choice of bijection of {1. p → (p ∨ q) are tautologies for all p and q. We say that p is a tautology (notation: = p) if t(p) = 1 for all t : A → {0. and p ∨ ¬p. If I is clear from context we just write i pi and i pi instead. Of course.
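The 2^n-assignment check described above is easy to mechanize. The following sketch is our own illustration, not part of the notes: the tuple encoding of propositions and all function names are assumptions. It extends a truth assignment to arbitrary propositions exactly as in rules (iv) and (v), and tests tautology-hood by brute force over all truth assignments to the atoms occurring in p.

```python
from itertools import product

# Propositions are encoded as nested tuples (an illustrative encoding):
#   ('atom', a), ('top',), ('bot',), ('not', p), ('or', p, q), ('and', p, q)

def atoms(p):
    """The set of atoms occurring in the proposition p."""
    if p[0] == 'atom':
        return {p[1]}
    return set().union(*(atoms(q) for q in p[1:])) if len(p) > 1 else set()

def value(p, t):
    """Extend the truth assignment t (a dict atom -> 0/1) to p."""
    op = p[0]
    if op == 'atom': return t[p[1]]
    if op == 'top':  return 1
    if op == 'bot':  return 0
    if op == 'not':  return 1 - value(p[1], t)
    if op == 'or':   return max(value(p[1], t), value(p[2], t))  # rule (iv)
    if op == 'and':  return min(value(p[1], t), value(p[2], t))  # rule (v)
    raise ValueError(op)

def is_tautology(p):
    """Check all 2^n truth assignments to the atoms of p."""
    ats = sorted(atoms(p))
    return all(value(p, dict(zip(ats, bits))) == 1
               for bits in product((0, 1), repeat=len(ats)))

def implies(p, q):          # p -> q abbreviates (not p) or q
    return ('or', ('not', p), q)

a = ('atom', 'a'); b = ('atom', 'b')
assert is_tautology(('or', a, ('not', a)))       # p ∨ ¬p
assert is_tautology(implies(a, ('or', a, b)))    # p → (p ∨ q)
assert not is_tautology(('or', a, b))
```

As the text notes, this brute-force method is exponential in the number of atoms; it is meant only to make the definition concrete.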
Lemma 2.1.3. Let Σ ⊆ Prop(A) and p, q ∈ Prop(A). Then:
(1) Σ ⊨ p ∧ q ⇐⇒ Σ ⊨ p and Σ ⊨ q;
(2) Σ ⊨ p ⇒ Σ ⊨ p ∨ q;
(3) Σ ∪ {p} ⊨ q ⇐⇒ Σ ⊨ p → q;
(4) if Σ ⊨ p and Σ ⊨ p → q, then Σ ⊨ q. (Modus Ponens.)

Proof. We will prove (3) here and leave the rest as exercise.
(⇒) Assume Σ ∪ {p} ⊨ q. To derive Σ ⊨ p → q we consider any model t : A → {0, 1} of Σ, and need only show that then t(p → q) = 1. If t(p) = 0 then t(p → q) = 1 by definition. If t(p) = 1 then t(Σ ∪ {p}) ⊆ {1}, hence t(q) = 1 and thus t(p → q) = 1.
(⇐) Assume Σ ⊨ p → q. To derive Σ ∪ {p} ⊨ q we consider any model t : A → {0, 1} of Σ ∪ {p}, and need only derive that t(q) = 1. By assumption t(p → q) = 1, and in view of t(p) = 1 this gives t(q) = 1 as required.

We finish this section with the promised general result on unique readability. We also establish facts of a similar nature that are needed later.

Definition. Let F be a set of symbols with a function a : F → N (called the arity function). A symbol f ∈ F is said to have arity n if a(f) = n. A word on F is said to be admissible if it can be obtained by applying the following rules:
(i) If f ∈ F has arity 0, then f viewed as a word of length 1 is admissible.
(ii) If f ∈ F has arity m > 0 and t1, . . . , tm are admissible words on F, then the concatenation f t1 . . . tm is admissible.

Below we just write "admissible word" instead of "admissible word on F". Note that the empty word is not admissible, and that the last symbol of an admissible word cannot be of arity > 0.

Example. Take F = A ∪ {⊤, ⊥, ¬, ∨, ∧} and define arity : F → N by arity(x) = 0 for x ∈ A ∪ {⊤, ⊥}, arity(¬) = 1, arity(∨) = arity(∧) = 2. Then the set of admissible words is just Prop(A).

Lemma 2.1.4. Let t1, . . . , tm and u1, . . . , un be admissible words and w any word on F such that t1 . . . tm w = u1 . . . un. Then m ≤ n, ti = ui for i = 1, . . . , m, and w = um+1 · · · un.

Proof. By induction on the length of u1 . . . un. If this length is 0, then m = n = 0 and w is the empty word. Suppose the length is > 0, and assume the lemma holds for smaller lengths. Note that n > 0. If m = 0, then the conclusion of the lemma holds, so suppose m > 0. The first symbol of t1 equals the first symbol of u1. Say this first symbol is h ∈ F with arity k. Then t1 = h a1 . . . ak and u1 = h b1 . . . bk where a1, . . . , ak and b1, . . . , bk are admissible words. Cancelling the first symbol h gives

a1 . . . ak t2 . . . tm w = b1 . . . bk u2 . . . un.

(Caution: any of k, m − 1, n − 1 could be 0.) We have length(b1 . . . bk u2 . . . un) = length(u1 . . . un) − 1, so the induction hypothesis applies. It yields k + m − 1 ≤ k + n − 1 (so m ≤ n), a1 = b1, . . . , ak = bk (so t1 = u1), t2 = u2, . . . , tm = um, and w = um+1 · · · un.

Here are two immediate consequences that we shall use:
1. Let t and u be admissible words and w a word on F such that tw = u. Then t = u and w is the empty word.
2. Let t1, . . . , tm and u1, . . . , un be admissible words such that t1 . . . tm = u1 . . . un. Then m = n and ti = ui for i = 1, . . . , m.

Lemma 2.1.5 (Unique Readability). Each admissible word equals f t1 . . . tm for a unique tuple (f, t1, . . . , tm) where f ∈ F has arity m and t1, . . . , tm are admissible words.

Proof. Suppose f t1 . . . tm = g u1 . . . un, where f, g ∈ F have arity m and n respectively, and t1, . . . , tm and u1, . . . , un are admissible words. We have to show that then f = g, m = n, and ti = ui for i = 1, . . . , m. Observe first that f = g, since f and g are the first symbols of two equal words. After cancelling the first symbol of both words, the second consequence of the previous lemma leads to the desired conclusion.

Given words v, w ∈ F* and i ∈ {1, . . . , length(w)}, we say that v occurs in w at starting position i if w = w1 v w2 where w1, w2 ∈ F* and w1 has length i − 1. Given w = w1 v w2 as above, and given v′ ∈ F*, the result of replacing v in w at starting position i by v′ is by definition the word w1 v′ w2.

Remark. Distinct occurrences of a word in another word can overlap in general. (For example, the word f g f has exactly two occurrences in the word f g f g f, where f, g ∈ F are distinct: one at starting position 1, and the other at starting position 3; these two occurrences overlap.) But such overlapping is impossible with admissible words; see exercise 5 at the end of this section.

Lemma 2.1.6. Let w be an admissible word and 1 ≤ i ≤ length(w). Then there is a unique admissible word that occurs in w at starting position i.

Proof. We prove existence by induction on length(w). Clearly w is an admissible word occurring in w at starting position 1. Suppose i > 1. Then we write w = f t1 . . . tn, where f ∈ F has arity n > 0 and t1, . . . , tn are admissible words, and we take j ∈ {1, . . . , n} such that

1 + length(t1) + · · · + length(tj−1) < i ≤ 1 + length(t1) + · · · + length(tj).

Now apply the inductive assumption to tj. Uniqueness then follows from the fact stated just before Lemma 2.1.5.

Remark. Let w = f t1 . . . tn, where f ∈ F has arity n > 0 and t1, . . . , tn are admissible words. Put lj := 1 + length(t1) + · · · + length(tj) for j = 0, . . . , n (so l0 = 1). Suppose lj−1 < i ≤ lj, where 1 ≤ j ≤ n, and let v be the admissible word that occurs in w at starting position i. Then the proof of the last lemma shows that this occurrence is entirely inside tj, that is, i − 1 + length(v) ≤ lj.
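The inductive definitions above translate directly into a parser. The sketch below is our own illustration (the symbol names and the dict-based arity function are assumptions, not notation from the notes): it reads an admissible word from the left and recovers the unique tuple (f, t1, . . . , tm) of Lemma 2.1.5.

```python
def parse(word, arity):
    """Parse one admissible word starting at position 0 of `word`
    (a list of symbols); return (tree, number_of_symbols_consumed).
    The tree is (f, subtree_1, ..., subtree_m) with m = arity[f]."""
    if not word:
        raise ValueError("the empty word is not admissible")
    f = word[0]
    pos = 1
    subtrees = []
    for _ in range(arity[f]):
        t, used = parse(word[pos:], arity)
        subtrees.append(t)
        pos += used
    return (f, *subtrees), pos

def is_admissible(word, arity):
    """A word is admissible iff parsing succeeds and consumes it all."""
    try:
        _, used = parse(word, arity)
        return used == len(word)
    except (ValueError, KeyError):
        return False

# Example arity function, as in the example after the definition:
# atoms a, b and T (for ⊤), F (for ⊥) are nullary, ~ (¬) unary,
# v (∨) and & (∧) binary.  The ASCII names are our own stand-ins.
ar = {'a': 0, 'b': 0, 'T': 0, 'F': 0, '~': 1, 'v': 2, '&': 2}
assert is_admissible(list('v~ab'), ar)      # ∨¬ab, i.e. (¬a) ∨ b
assert not is_admissible(list('va'), ar)    # ∨ needs two admissible words
assert parse(list('v~ab'), ar)[0] == ('v', ('~', ('a',)), ('b',))
```

That `parse` never needs to backtrack is exactly the content of Lemmas 2.1.4–2.1.6: the admissible word starting at each position is unique.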
Corollary 2.1.7. Let w be an admissible word and 1 ≤ i ≤ length(w), and let v be the admissible word that occurs in w at starting position i. Then the result of replacing v in w at starting position i by an admissible word v′ is again an admissible word. This follows by a routine induction on length(w), using the last remark.

Exercises. In the exercises below, A = {a1, . . . , an}, so |A| = n.

(1) (Disjunctive Normal Form) Each p is equivalent to a disjunction p1 ∨ · · · ∨ pk where each disjunct pi is a conjunction a1^ε1 ∧ · · · ∧ an^εn with all εj ∈ {−1, 1}, where for an atom a we put a^1 := a and a^(−1) := ¬a.
(2) (Conjunctive Normal Form) Same as the last problem, except that the signs ∨ and ∧ are interchanged, as well as the words "disjunction" and "conjunction," and also the words "disjunct" and "conjunct."
(3) To each p associate the function fp : {0, 1}^A → {0, 1} defined by fp(t) = t(p). (Think of a truth table for p where the 2^n rows correspond to the 2^n truth assignments t : A → {0, 1}, and the column under p records the values t(p).) Then for every function f : {0, 1}^A → {0, 1} there is a p such that f = fp.
(4) Let ∼ be the equivalence relation on Prop(A) given by p ∼ q :⇐⇒ ⊨ p ↔ q. Then the quotient set Prop(A)/∼ is finite; determine its cardinality as a function of n = |A|.
(5) Let w be an admissible word and 1 ≤ i < i′ ≤ length(w). Let v and v′ be the admissible words that occur at starting positions i and i′ respectively in w. Then these occurrences are either nonoverlapping, that is, i − 1 + length(v) < i′, or the occurrence of v′ is entirely inside that of v, that is, i′ − 1 + length(v′) ≤ i − 1 + length(v).

2.2 Completeness for Propositional Logic

In this section we introduce a proof system for propositional logic, state the completeness of this proof system, and then prove this completeness. As in the previous section we fix a set A of atoms, and the conventions of that section remain in force.

A propositional axiom is by definition a proposition that occurs in the list below, for some choice of p, q, r:

1. ⊤
2. p → (p ∨ q); p → (q ∨ p)
3. ¬p → (¬q → ¬(p ∨ q))
4. (p ∧ q) → p; (p ∧ q) → q
5. p → (q → (p ∧ q))
6. (p → (q → r)) → ((p → q) → (p → r))
7. p → (¬p → ⊥)
8. (¬p → ⊥) → p

Each of items 2–8 describes infinitely many propositional axioms; that is why we do not call these items axioms, but axiom schemes. For example, if a, b ∈ A, then a → (a ∨ ⊥) and b → (b ∨ (¬a ∧ ¬b)) are distinct propositional axioms, and both instances of axiom scheme 2. It is easy to check that all propositional axioms are tautologies.

Here is our single rule of inference for propositional logic:

Modus Ponens (MP): from p and p → q, infer q.

In the rest of this section Σ denotes a set of propositions, that is, Σ ⊆ Prop(A).

Definition. A formal proof, or just proof, of p from Σ is a sequence p1, . . . , pn with n ≥ 1 and pn = p, such that for k = 1, . . . , n:
(i) either pk ∈ Σ,
(ii) or pk is a propositional axiom,
(iii) or there are i, j ∈ {1, . . . , k − 1} such that pk can be inferred from pi and pj by MP.

If there exists a proof of p from Σ, then we write Σ ⊢ p, and say Σ proves p. For Σ = ∅ we also write ⊢ p instead of Σ ⊢ p.

Lemma 2.2.1. ⊢ p → p.

Proof. The proposition p → ((p → p) → p) is a propositional axiom by axiom scheme 2, and

{p → ((p → p) → p)} → {(p → (p → p)) → (p → p)}

is a propositional axiom by axiom scheme 6. Applying MP to these two axioms yields (p → (p → p)) → (p → p). Since p → (p → p) is also a propositional axiom by scheme 2, we can apply MP again to obtain ⊢ p → p.

Proposition 2.2.2. If Σ ⊢ p, then Σ ⊨ p.

Proof. This should be clear from earlier facts that we stated and which the reader was asked to verify.

The converse is true but less obvious:

Theorem 2.2.3 (Completeness, first form). Σ ⊢ p ⇐⇒ Σ ⊨ p.
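The definition of a formal proof is mechanically checkable, step by step. The sketch below is our own illustration: to keep it short it takes the allowed axiom instances as an explicit set instead of implementing axiom-scheme matching, and encodes p → q as a tuple ('->', p, q). It verifies the MP structure of the derivation of p → p from Lemma 2.2.1.

```python
def checks_as_proof(seq, sigma, axioms):
    """Check that seq is a formal proof from sigma: every entry is in
    sigma, is an allowed axiom instance, or follows from two earlier
    entries by Modus Ponens (from p and ('->', p, q), infer q)."""
    for k, p in enumerate(seq):
        if p in sigma or p in axioms:
            continue
        if any(r[0] == '->' and r[1] in seq[:k] and r[2] == p
               for r in seq[:k] if isinstance(r, tuple)):
            continue
        return False
    return len(seq) >= 1

def imp(p, q):
    return ('->', p, q)

# The derivation of ⊢ p → p from Lemma 2.2.1, with its two scheme-2
# instances and one scheme-6 instance supplied as explicit axioms.
p = 'p'
ax1 = imp(p, imp(imp(p, p), p))                       # instance of scheme 2
ax2 = imp(ax1, imp(imp(p, imp(p, p)), imp(p, p)))     # instance of scheme 6
ax3 = imp(p, imp(p, p))                               # instance of scheme 2
proof = [ax1, ax2, imp(imp(p, imp(p, p)), imp(p, p)), ax3, imp(p, p)]
assert checks_as_proof(proof, sigma=set(), axioms={ax1, ax2, ax3})
```

A full checker would also verify that each listed axiom really is an instance of one of schemes 1–8; that matching step is omitted here.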
Remark. There is some arbitrariness in our choice of axioms and rule, and thus in our notion of formal proof. This is in contrast to the definition of ⊨, which merely formalizes the basic underlying idea of propositional logic as stated in the introduction to the previous section. However, the equivalence of ⊢ and ⊨ (Completeness Theorem) shows that our choice of axioms and rule yields a complete proof system. Moreover, this equivalence has consequences which can be stated in terms of ⊨ alone. An example is the Compactness Theorem.

Theorem 2.2.4 (Compactness of Propositional Logic). If Σ ⊨ p, then there is a finite subset Σ0 of Σ such that Σ0 ⊨ p.

Definition. We say that Σ is inconsistent if Σ ⊢ ⊥, and otherwise (that is, if Σ ⊬ ⊥) we call Σ consistent.

It is convenient to prove first a variant of the Completeness Theorem:

Theorem 2.2.5 (Completeness, second form). Σ is consistent if and only if Σ has a model.

From this second form of the Completeness Theorem we obtain easily an alternative form of the Compactness of Propositional Logic:

Corollary 2.2.6. Σ has a model ⇐⇒ every finite subset of Σ has a model.

We first show that the second form of the Completeness Theorem implies the first form. For this we need a technical lemma that will also be useful later in the course.

Lemma 2.2.7 (Deduction Lemma). Suppose Σ ∪ {p} ⊢ q. Then Σ ⊢ p → q.

Proof. By induction on proofs. If q is a propositional axiom, then, since q → (p → q) is a propositional axiom, MP yields Σ ⊢ p → q. If q ∈ Σ ∪ {p}, then either q ∈ Σ, in which case the same argument as before gives Σ ⊢ p → q, or q = p, and then Σ ⊢ p → q since ⊢ p → p by the lemma above. Now assume that q is obtained by MP from r and r → q, where Σ ∪ {p} ⊢ r and Σ ∪ {p} ⊢ r → q, and where we assume inductively that Σ ⊢ p → r and Σ ⊢ p → (r → q). Then we obtain Σ ⊢ p → q from the propositional axiom

(p → (r → q)) → ((p → r) → (p → q))

by applying MP twice.

Corollary 2.2.8. Σ ⊢ p if and only if Σ ∪ {¬p} is inconsistent.

Proof. (⇒) Assume Σ ⊢ p. Since p → (¬p → ⊥) is a propositional axiom, we can apply MP twice to get Σ ∪ {¬p} ⊢ ⊥. Hence Σ ∪ {¬p} is inconsistent.
(⇐) Assume Σ ∪ {¬p} is inconsistent. Then Σ ∪ {¬p} ⊢ ⊥, and so by the Deduction Lemma we have Σ ⊢ ¬p → ⊥. Since (¬p → ⊥) → p is a propositional axiom, MP yields Σ ⊢ p.

Corollary 2.2.9. The second form of Completeness (Theorem 2.2.5) implies the first form (Theorem 2.2.3).

Proof. Assume the second form of Completeness holds, and that Σ ⊨ p. We want to show that then Σ ⊢ p. From Σ ⊨ p it follows that Σ ∪ {¬p} has no model. Hence by the second form of Completeness, the set Σ ∪ {¬p} is inconsistent. Then by Corollary 2.2.8 we have Σ ⊢ p.

Definition. We say that Σ is complete if Σ is consistent, and for each p either Σ ⊢ p or Σ ⊢ ¬p.

Completeness as a property of a set of propositions should not be confused with the completeness of our proof system as expressed by the Completeness Theorem. (It is just a historical accident that we use the same word.) Below we use Zorn's Lemma to show that any consistent set of propositions can be extended to a complete set of propositions.

Lemma 2.2.10 (Lindenbaum). Suppose Σ is consistent. Then Σ ⊆ Σ′ for some complete Σ′ ⊆ Prop(A).

Proof. Let P be the collection of all consistent subsets of Prop(A) that contain Σ. In particular Σ ∈ P. We consider P as partially ordered by inclusion. Any totally ordered subcollection {Σi : i ∈ I} of P with I ≠ ∅ has an upper bound in P, namely ⋃{Σi : i ∈ I}. (To see this it suffices to check that ⋃{Σi : i ∈ I} is consistent. Suppose otherwise, that is, suppose ⋃{Σi : i ∈ I} ⊢ ⊥. Since a proof can use only finitely many of the propositions in ⋃{Σi : i ∈ I}, there exists i ∈ I such that Σi ⊢ ⊥, contradicting the consistency of Σi.) Thus by Zorn's lemma P has a maximal element Σ′. We claim that then Σ′ is complete. For any p, if Σ′ ⊬ p, then by Corollary 2.2.8 the set Σ′ ∪ {¬p} is consistent, hence ¬p ∈ Σ′ by maximality of Σ′, and thus Σ′ ⊢ ¬p.

Remark. Suppose A is countable. Because A is countable, Prop(A) is countable. For this case we can give a proof of Lindenbaum's Lemma without using Zorn's Lemma as follows. Take an enumeration (pn)n∈N of Prop(A). We construct an increasing sequence Σ = Σ0 ⊆ Σ1 ⊆ . . . of consistent subsets of Prop(A) as follows. Given a consistent Σn ⊆ Prop(A) we define

Σn+1 := Σn ∪ {pn} if Σn ⊢ pn,  Σn+1 := Σn ∪ {¬pn} if Σn ⊬ pn,

so Σn+1 remains consistent by earlier results. Thus Σ∞ := ⋃{Σn : n ∈ N} is consistent and also complete: for any n either pn ∈ Σn+1 ⊆ Σ∞ or ¬pn ∈ Σn+1 ⊆ Σ∞.

Definition. Define the truth assignment tΣ : A → {0, 1} by tΣ(a) = 1 if Σ ⊢ a, and tΣ(a) = 0 otherwise.
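The stepwise construction in the remark can be carried out concretely when A is finite. The sketch below is our own illustration, and it makes one substitution the notes justify but do not make: it tests satisfiability instead of provability, which is legitimate by the second form of the Completeness Theorem and is decidable for finitely many atoms. All names and the tuple encoding are assumptions.

```python
from itertools import product

# Propositions as nested tuples: ('atom', a), ('not', p), ('or', p, q), ('and', p, q).

def value(p, t):
    op = p[0]
    if op == 'atom': return t[p[1]]
    if op == 'not':  return 1 - value(p[1], t)
    if op == 'or':   return max(value(p[1], t), value(p[2], t))
    return min(value(p[1], t), value(p[2], t))   # 'and'

def has_model(sigma, all_atoms):
    """Search the finitely many truth assignments for a model of sigma."""
    return any(all(value(p, dict(zip(all_atoms, bits))) == 1 for p in sigma)
               for bits in product((0, 1), repeat=len(all_atoms)))

def complete(sigma, props, all_atoms):
    """Lindenbaum-style completion: run through the listed propositions,
    adding pn when the result stays satisfiable and ¬pn otherwise."""
    sigma = list(sigma)
    for p in props:
        sigma.append(p if has_model(sigma + [p], all_atoms) else ('not', p))
    return sigma

A = ['a', 'b']
a = ('atom', 'a'); b = ('atom', 'b')
sigma = complete([('or', a, b)], [a, b, ('and', a, b)], A)
assert has_model(sigma, A)    # the completion is still satisfiable
```

For infinite A the enumeration (pn) still exists but the satisfiability test is no longer a finite search; that is exactly where the Zorn's Lemma argument, or the enumeration argument with ⊢, takes over.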
Lemma 2.2.11. Suppose Σ is complete. Then for each p we have

Σ ⊢ p ⇐⇒ tΣ(p) = 1.

Proof. We proceed by induction on the number of connectives in p. If p is an atom or p = ⊤ or p = ⊥, then the equivalence follows immediately from the definitions. It remains to consider the three cases below.

Case 1: p = ¬q, and (inductive assumption) Σ ⊢ q ⇐⇒ tΣ(q) = 1.
(⇒) Suppose that Σ ⊢ p. Then tΣ(p) = 1: Otherwise, tΣ(p) = 0, so tΣ(q) = 1, hence Σ ⊢ q, and since q → (p → ⊥) is a propositional axiom, we can apply MP twice to get Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose tΣ(p) = 1. Then tΣ(q) = 0, so Σ ⊬ q, and thus Σ ⊢ ¬q by completeness of Σ, that is, Σ ⊢ p.

Case 2: p = q ∨ r, Σ ⊢ q ⇐⇒ tΣ(q) = 1, and Σ ⊢ r ⇐⇒ tΣ(r) = 1.
(⇒) Suppose Σ ⊢ p. Then tΣ(p) = 1: Otherwise, tΣ(q) = 0 and tΣ(r) = 0, so Σ ⊬ q and Σ ⊬ r, and thus Σ ⊢ ¬q and Σ ⊢ ¬r by completeness of Σ. Since ¬q → (¬r → ¬p) is a propositional axiom, we can apply MP twice to get Σ ⊢ ¬p, which in view of the propositional axiom p → (¬p → ⊥) and MP yields Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose tΣ(p) = 1. Then tΣ(q) = 1 or tΣ(r) = 1, hence Σ ⊢ q or Σ ⊢ r. Using MP and the propositional axioms q → p and r → p we obtain Σ ⊢ p.

Case 3: p = q ∧ r. We leave this case as an exercise.

We can now finish the proof of Completeness (second form): Suppose Σ is consistent. Then by Lindenbaum's Lemma Σ is a subset of a complete set Σ′ of propositions. By the previous lemma, tΣ′ is a model of Σ′, and such a model is also a model of Σ. The converse, that if Σ has a model then Σ is consistent, is left to the reader.

Application to coloring infinite graphs. What follows is a standard use of compactness of propositional logic, one of many. Let (V, E) be a graph, by which we mean here that V is a set (of vertices) and E (the set of edges) is a binary relation on V that is irreflexive and symmetric: for all v, w ∈ V we have (v, v) ∉ E, and if (v, w) ∈ E, then (w, v) ∈ E. Let some n ≥ 1 be given. Then an n-coloring of (V, E) is a function c : V → {1, . . . , n} such that c(v) ≠ c(w) for all (v, w) ∈ E: neighboring vertices should have different colors.

Take A := V × {1, . . . , n} as the set of atoms, and think of an atom (v, i) as representing the statement that v has color i. Thus for (V, E) to have an n-coloring means that the following set Σ ⊆ Prop(A) has a model:

Σ := {(v, 1) ∨ · · · ∨ (v, n) : v ∈ V}
  ∪ {¬((v, i) ∧ (v, j)) : v ∈ V, 1 ≤ i < j ≤ n}
  ∪ {¬((v, i) ∧ (w, i)) : (v, w) ∈ E, 1 ≤ i ≤ n}.

We claim: if for every finite V0 ⊆ V there is an n-coloring of (V0, E0), where E0 := E ∩ (V0 × V0), then there exists an n-coloring of (V, E).

Proof. The assumption that all finite subgraphs of (V, E) are n-colorable yields that every finite subset of Σ has a model. Hence by compactness Σ has a model, and such a model yields an n-coloring of (V, E).
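For a finite graph the set Σ above is finite, and its satisfiability can be checked directly. The sketch below is our own illustration (names are assumptions): it builds the three groups of constraints from the text as predicates on a truth assignment t : V × {1, . . . , n} → {0, 1} and searches for a model, which is the same as searching for an n-coloring.

```python
from itertools import product

def sigma_constraints(V, E, n):
    """The three groups of constraints from the text, as predicates on a
    truth assignment t (a dict from atoms (v, i) to 0/1)."""
    cons = []
    for v in V:                                  # v gets at least one color
        cons.append(lambda t, v=v: any(t[(v, i)] for i in range(1, n + 1)))
    for v in V:                                  # v gets at most one color
        for i in range(1, n + 1):
            for j in range(i + 1, n + 1):
                cons.append(lambda t, v=v, i=i, j=j: not (t[(v, i)] and t[(v, j)]))
    for (v, w) in E:                             # neighbors differ in color
        for i in range(1, n + 1):
            cons.append(lambda t, v=v, w=w, i=i: not (t[(v, i)] and t[(w, i)]))
    return cons

def has_n_coloring(V, E, n):
    atoms = [(v, i) for v in V for i in range(1, n + 1)]
    cons = sigma_constraints(V, E, n)
    return any(all(c(dict(zip(atoms, bits))) for c in cons)
               for bits in product((0, 1), repeat=len(atoms)))

# A triangle is 3-colorable but not 2-colorable.
V = [1, 2, 3]
E = [(1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2)]  # symmetric, irreflexive
assert has_n_coloring(V, E, 3)
assert not has_n_coloring(V, E, 2)
```

The point of the compactness argument in the text is precisely that this finite check, carried out for every finite subgraph, already settles the infinite case.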
Exercises.
(1) Let (P, ≤) be a poset. Then there is a linear order ≤′ on P that extends ≤. (Hint: use the compactness theorem and the fact that this is true when P is finite.)
(2) Suppose Σ ⊆ Prop(A) is such that for each truth assignment t : A → {0, 1} there is p ∈ Σ with t(p) = 1. Then there are p1, . . . , pn ∈ Σ such that p1 ∨ · · · ∨ pn is a tautology.

2.3 Languages and Structures

Propositional Logic captures only one aspect of mathematical reasoning. We also need the capability to deal with predicates (also called relations), variables, and the quantifiers "for all" and "there exists." We now begin setting up a framework for Predicate Logic (or First-Order Logic), which has these additional features and has a claim on being a complete logic for mathematical reasoning.

Definition. A language¹ L is a disjoint union of:
(i) a set Lr of relation symbols; each R ∈ Lr has associated arity a(R) ∈ N;
(ii) a set Lf of function symbols; each F ∈ Lf has associated arity a(F) ∈ N.

An m-ary relation or function symbol is one that has arity m. Instead of "0-ary", "1-ary", "2-ary" we say "nullary", "unary", "binary". A constant symbol is a function symbol of arity 0.

Examples.
(1) The language LGr = {1, ⁻¹, ·} of groups has constant symbol 1, unary function symbol ⁻¹, and binary function symbol ·.
(2) The language LAb = {0, −, +} of (additive) abelian groups has constant symbol 0, unary function symbol −, and binary function symbol +.
(3) The language LO = {<} has just one binary relation symbol <.
(4) The language LOAb = {<, 0, −, +} of ordered abelian groups.
(5) The language LRig = {0, 1, +, ·} of rigs (or semirings) has constant symbols 0 and 1, and binary function symbols + and ·.
(6) The language LRi = {0, 1, −, +, ·} of rings. The symbols are those of the previous example, plus the unary function symbol −.

From now on, let L denote a language.

Definition. A structure A for L (or L-structure) is a triple

A = (A; (R^A)R∈Lr, (F^A)F∈Lf)

consisting of:
(i) a nonempty set A, the underlying set of A;²
(ii) for each m-ary R ∈ Lr a set R^A ⊆ A^m (an m-ary relation on A), the interpretation of R in A;
(iii) for each n-ary F ∈ Lf an operation F^A : A^n → A (an n-ary operation on A), the interpretation of F in A.

¹ What we call here a language is also known as a signature, or a vocabulary.
² It is also called the universe of A; we prefer less grandiose terminology.
The relations R^A on A (for R ∈ Lr) and the operations F^A on A (for F ∈ Lf) are called the primitives of A. When A is clear from context we often omit the superscript A in denoting the interpretation of a symbol of L in A. The interpretation of a constant symbol c of L is a function c^A : A^0 → A. Since A^0 has just one element, c^A is uniquely determined by its value at this element; we shall identify c^A with this value, so c^A ∈ A.

Examples.
(1) Each group is considered as an LGr-structure by interpreting the symbols 1, ⁻¹, and · as the identity element of the group, its group inverse, and its group multiplication, respectively.
(2) Let A = (A; 0, −, +) be an abelian group; here 0 ∈ A is the zero element of the group, and − : A → A and + : A² → A denote the group operations of A. We consider A as an LAb-structure by taking as interpretations of the symbols 0, − and + of LAb the group operations 0, − and + on A. (We took here the liberty of using the same notation for possibly entirely different things: + is an element of the set LAb, but also denotes in this context its interpretation as a binary operation on the set A. Similarly with 0 and −.) The reader is supposed to keep in mind the distinction between symbols of L and their interpretation in an L-structure, even if we use the same notation for both.
(3) (N; <) is an LO-structure where we interpret < as the usual ordering relation on N. Similarly for (Z; <), (Q; <), and (R; <). (Here we take even more notational liberties, by letting < denote five different things: a symbol of LO, and the usual orderings of N, Z, Q, and R, respectively.) In fact, any nonempty set A equipped with a binary relation on it can be viewed as an LO-structure. Likewise, any set A in which we single out an element, a unary operation on A, and a binary operation on A, can be construed as an LAb-structure if we choose to do so.
(4) (Z; <, 0, −, +) and (Q; <, 0, −, +) are both LOAb-structures.
(5) (N; 0, 1, +, ·) is an LRig-structure.
(6) (Z; 0, 1, −, +, ·) is an LRi-structure.

Let B be an L-structure with underlying set B, and let A be a nonempty subset of B such that F^B(A^n) ⊆ A for every n-ary function symbol F of L. Then A is the underlying set of an L-structure A defined by letting

R^A := R^B ∩ A^m for m-ary R ∈ Lr,  F^A := F^B|A^n : A^n → A for n-ary F ∈ Lf.

Such an L-structure A is said to be a substructure of B, notation: A ⊆ B. We also say in this case that B is an extension of A, or extends A.
Examples.
(1) (Z; 0, 1, −, +, ·) ⊆ (Q; 0, 1, −, +, ·) ⊆ (R; 0, 1, −, +, ·).
(2) (N; 0, 1, +, ·) ⊆ (Z; 0, 1, +, ·).

Definition. Let A = (A; . . .) and B = (B; . . .) be L-structures.
A homomorphism h : A → B is a map h : A → B such that
(i) for each m-ary R ∈ Lr and each (a1, . . . , am) ∈ A^m we have (a1, . . . , am) ∈ R^A ⇒ (ha1, . . . , ham) ∈ R^B;
(ii) for each n-ary F ∈ Lf and each (a1, . . . , an) ∈ A^n we have h(F^A(a1, . . . , an)) = F^B(ha1, . . . , han).
Replacing ⇒ in (i) by ⇐⇒ yields the notion of a strong homomorphism. An embedding is an injective strong homomorphism; an isomorphism is a bijective strong homomorphism. An automorphism of A is an isomorphism A → A.

If A ⊆ B, then the inclusion a ↦ a : A → B is an embedding A → B. Conversely, a homomorphism h : A → B yields a substructure h(A) of B with underlying set h(A), and if h is an embedding we have an isomorphism a ↦ h(a) : A → h(A).

If i : A → B and j : B → C are homomorphisms (strong homomorphisms, embeddings, isomorphisms, respectively), then so is j ∘ i : A → C. If i : A → B is an isomorphism, then so is the map i⁻¹ : B → A. The identity map 1A on A is an automorphism of A. Thus the automorphisms of A form a group Aut(A) under composition, with identity 1A.

Examples.
(1) If A and B are groups (viewed as structures for the language LGr), then a homomorphism h : A → B is exactly what in algebra is called a homomorphism from the group A to the group B. Likewise with rings, and other kinds of algebraic structures.
(2) Let A = (Z; <). The map k ↦ k + 1 is an automorphism of A, with inverse given by k ↦ k − 1.
(3) Let A = (Z; 0, −, +). Then k ↦ −k is an automorphism of A.

Definition. A congruence on the L-structure A is an equivalence relation ∼ on its underlying set A such that
(i) if R ∈ Lr is m-ary and a1 ∼ b1, . . . , am ∼ bm, then (a1, . . . , am) ∈ R^A ⇐⇒ (b1, . . . , bm) ∈ R^A;
(ii) if F ∈ Lf is n-ary and a1 ∼ b1, . . . , an ∼ bn, then F^A(a1, . . . , an) ∼ F^A(b1, . . . , bn).
Note that a strong homomorphism h : A → B yields a congruence ∼h on A as follows: for a1, a2 ∈ A we put

a1 ∼h a2 ⇐⇒ h(a1) = h(a2).

Given a congruence ∼ on the L-structure A we obtain an L-structure A/∼ (the quotient of A by ∼) as follows:
(i) the underlying set of A/∼ is the quotient set A/∼;
(ii) the interpretation of an m-ary R ∈ Lr in A/∼ is the m-ary relation {(a1∼, . . . , am∼) : (a1, . . . , am) ∈ R^A} on A/∼;
(iii) the interpretation of an n-ary F ∈ Lf in A/∼ is the n-ary operation (a1∼, . . . , an∼) ↦ F^A(a1, . . . , an)∼ on A/∼.
Note that then we have a strong homomorphism a ↦ a∼ : A → A/∼.

Products. Let (Bi)i∈I be a family of L-structures, Bi = (Bi; . . .) for i ∈ I. The product ∏i∈I Bi is defined to be the L-structure B whose underlying set is the product set ∏i∈I Bi, and where the basic relations and functions are defined coordinatewise: for m-ary R ∈ Lr and elements b1 = (b1i), . . . , bm = (bmi) ∈ ∏i∈I Bi,

(b1, . . . , bm) ∈ R^B ⇐⇒ (b1i, . . . , bmi) ∈ R^Bi for all i ∈ I,

and for n-ary F ∈ Lf and b1 = (b1i), . . . , bn = (bni) ∈ ∏i∈I Bi,

F^B(b1, . . . , bn) := (F^Bi(b1i, . . . , bni))i∈I.

For j ∈ I the projection map to the jth factor is the homomorphism

∏i∈I Bi → Bj, (bi) ↦ bj.

This product construction makes it possible to combine several homomorphisms with a common domain into a single one: if for each i ∈ I we have a homomorphism hi : A → Bi, we obtain a homomorphism

h = (hi) : A → ∏i∈I Bi,  h(a) := (hi(a))i∈I.
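As a concrete illustration of the coordinatewise definitions (our own sketch; representing an LAb-structure as a dict of Python functions is an assumption, not notation from the notes), here is the product of two finite abelian groups, with a check that the projection to the first factor commutes with the interpretations, i.e., is a homomorphism.

```python
from itertools import product as cartesian

def z_mod(m):
    """The abelian group Z/mZ as an LAb-structure: underlying set plus
    interpretations of the symbols 0, -, +."""
    return {'set': list(range(m)),
            'zero': 0,
            'neg': lambda a, m=m: (-a) % m,
            'add': lambda a, b, m=m: (a + b) % m}

def prod(B1, B2):
    """Coordinatewise product of two LAb-structures."""
    return {'set': [(a, b) for a in B1['set'] for b in B2['set']],
            'zero': (B1['zero'], B2['zero']),
            'neg': lambda p: (B1['neg'](p[0]), B2['neg'](p[1])),
            'add': lambda p, q: (B1['add'](p[0], q[0]), B2['add'](p[1], q[1]))}

B = prod(z_mod(2), z_mod(3))
assert B['zero'] == (0, 0)
# Projection to the first factor is a homomorphism: it commutes with +.
for p, q in cartesian(B['set'], repeat=2):
    assert B['add'](p, q)[0] == z_mod(2)['add'](p[0], q[0])
```

Since LAb has no relation symbols, the relation clause of the definition is vacuous here; for a language with relations one would also intersect each R coordinatewise as in the displayed equivalence above.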
Exercises.
(1) Let G be a group viewed as a structure for the language of groups. Each normal subgroup N of G yields a congruence ≡N on G by

a ≡N b ⇐⇒ aN = bN,

and each congruence on G equals ≡N for a unique normal subgroup N of G.
(2) Consider a strong homomorphism h : A → B of L-structures. Then we have an isomorphism from A/∼h onto h(A) given by a∼h ↦ h(a).

2.4 Variables and Terms

Throughout this course

Var = {v0, v1, v2, . . .}

is a countably infinite set of symbols whose elements will be called variables. Unless indicated otherwise, we assume that vm ≠ vn for m ≠ n, and that no variable is a function or relation symbol in any language.

We let x, y, z (sometimes with subscripts or superscripts) denote variables, unless indicated otherwise.

Remark. Chapters 2–4 go through if we take as our set Var of variables any infinite (possibly uncountable) set; in model theory this can even be convenient. In the few cases in chapters 2–4 that this more general setup requires changes in proofs, this will be pointed out. For this more general Var we still insist that no variable is a function or relation symbol in any language. The results in Chapter 5 on undecidability presuppose a numbering of the variables; our Var = {v0, v1, v2, . . .} comes equipped with such a numbering.

Definition. An L-term is a word on the alphabet Lf ∪ Var obtained as follows:
(i) each variable (viewed as a word of length 1) is an L-term;
(ii) whenever F ∈ Lf is n-ary and t1, . . . , tn are L-terms, then the concatenation F t1 . . . tn is an L-term.

Note: constant symbols of L are L-terms of length 1, by clause (ii) for n = 0. The L-terms are exactly the admissible words on the alphabet Lf ∪ Var where each variable is given arity 0. Thus "unique readability" is available.

We often write t(x1, . . . , xn) to indicate an L-term t in which no variables other than x1, . . . , xn occur. Whenever we use this notation we assume tacitly that x1, . . . , xn are distinct. Note that we do not require that each of x1, . . . , xn actually occurs in t(x1, . . . , xn). (This is like indicating a polynomial in the indeterminates x1, . . . , xn by p(x1, . . . , xn), where one allows that some of these indeterminates do not actually occur in the polynomial p.)

If a term is written as an admissible word, then it may be hard to see how it is built up from subterms. In practice we shall therefore use parentheses and brackets in denoting terms, and avoid prefix notation if tradition dictates otherwise.
Example. The word · + x − y z is an LRi-term. For easier reading we indicate this term instead by (x + (−y)) · z, or even (x − y)z.

A term is said to be variable-free if no variables occur in it.

Definition. Let t be a variable-free L-term and A an L-structure. We define t^A ∈ A as follows: if t is a constant symbol c, then t^A = c^A ∈ A, where c^A is as in the previous section, and if t = F t1 . . . tn with n-ary F ∈ Lf and variable-free L-terms t1, . . . , tn, then t^A = F^A(t1^A, . . . , tn^A).

Definition. Let A be an L-structure and let t = t(x) be an L-term, where x = (x1, . . . , xm). Then we associate to the ordered pair (t, x) a function t^A : A^m → A as follows:
(i) If t is the variable xi, then t^A(a) = ai for a = (a1, . . . , am) ∈ A^m.
(ii) If t = F t1 . . . tn, where F ∈ Lf is n-ary and t1, . . . , tn are L-terms, then t^A(a) = F^A(t1^A(a), . . . , tn^A(a)) for a ∈ A^m.
This inductive definition is justified by unique readability. In particular, if t is variable-free, the above gives a nullary function t^A : A^0 → A, identified as usual with its value at the unique element of A^0. Note that if B is a second L-structure and A ⊆ B, then t^A(a) = t^B(a) for t as above and a ∈ A^m.

Example. Consider R as a ring in the usual way, and let t(x, y, z) be the LRi-term (x − y)z. Then the function t^R : R³ → R is given by t^R(a, b, c) = (a − b)c.

Definition. Suppose t is an L-term, let x1, . . . , xn be distinct variables, and let τ1, . . . , τn be L-terms. Then t(τ1/x1, . . . , τn/xn) is the word obtained by replacing all occurrences of xi in t by τi, simultaneously for i = 1, . . . , n. If t is given in the form t(x1, . . . , xn), then we write t(τ1, . . . , τn) as a shorthand for t(τ1/x1, . . . , τn/xn).

In the definition of t(τ1/x1, . . . , τn/xn) the "replacing" should be simultaneous, because it can happen that for t′ := t(τ1/x1) we have t′(τ2/x2) ≠ t(τ1/x1, τ2/x2), where x1, x2 are distinct variables and τ1, τ2 are L-terms.

The easy proof of the next lemma is left to the reader. We also urge the reader to do exercise (1) below and thus acquire the confidence that these formal term substitutions do correspond to actual function substitutions.

Lemma 2.4.1. Suppose t is an L-term, x1, . . . , xn are distinct variables, and τ1, . . . , τn are L-terms. Then t(τ1/x1, . . . , τn/xn) is an L-term. If τ1, . . . , τn are variable-free and t = t(x1, . . . , xn), then t(τ1, . . . , τn) is variable-free.

Generators. Let B be an L-structure, let G ⊆ B, and assume also that L has a constant symbol or that G ≠ ∅. Then the set

{t^B(g1, . . . , gm) : t(x1, . . . , xm) is an L-term and g1, . . . , gm ∈ G} ⊆ B

is the underlying set of some A ⊆ B, and this A is clearly a substructure of any A′ ⊆ B with G ⊆ A′. We call this A the substructure of B generated by G; if A = B, then we say that B is generated by G. If (ai)i∈I is a family of elements of B, then "generated by (ai)" means "generated by G" where G = {ai : i ∈ I}.
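The inductive definition of t^A is a few lines of code. In this sketch (our own illustration; the tuple encoding of terms and all names are assumptions) we evaluate the LRi-term (x − y)z in the ring Z, matching the example above with R replaced by Z.

```python
def eval_term(t, struct, env):
    """Evaluate a term in a structure.  Terms are ('var', x) or
    (F, t1, ..., tn); struct maps each function symbol F to an n-ary
    Python function (constants are 0-ary); env maps variables to elements."""
    if t[0] == 'var':
        return env[t[1]]                               # clause (i)
    F, *args = t
    return struct[F](*(eval_term(s, struct, env) for s in args))  # clause (ii)

# The ring Z as an LRi-structure; constant symbols are 0-ary functions.
Z = {'0': lambda: 0, '1': lambda: 1,
     'neg': lambda a: -a,
     '+': lambda a, b: a + b,
     '*': lambda a, b: a * b}

# The LRi-term (x - y)z, written in prefix form as · + x − y z.
x, y, z = ('var', 'x'), ('var', 'y'), ('var', 'z')
t = ('*', ('+', x, ('neg', y)), z)

assert eval_term(t, Z, {'x': 5, 'y': 2, 'z': 3}) == 9   # (5 - 2)·3
```

Unique readability is what makes this recursion well defined: each term decomposes as (F, t1, . . . , tn) in exactly one way.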
Exercises.
(1) Let t(x1, . . . , xm) and τ1(y1, . . . , yn), . . . , τm(y1, . . . , yn) be L-terms. Then the L-term

t*(y1, . . . , yn) := t(τ1(y1, . . . , yn), . . . , τm(y1, . . . , yn))

has the property that if A is an L-structure and a = (a1, . . . , an) ∈ A^n, then

(t*)^A(a) = t^A(τ1^A(a), . . . , τm^A(a)).

(2) For every LAb-term t(x1, . . . , xn) there are integers k1, . . . , kn such that in every abelian group A = (A; 0, −, +),

t^A(a1, . . . , an) = k1a1 + · · · + knan, for all (a1, . . . , an) ∈ A^n.

Conversely, for any integers k1, . . . , kn there is an LAb-term t(x1, . . . , xn) such that in every abelian group A = (A; 0, −, +) the above displayed identity holds.
(3) For every LRi-term t(x1, . . . , xn) there is a polynomial P(x1, . . . , xn) ∈ Z[x1, . . . , xn] such that in every commutative ring R = (R; 0, 1, −, +, ·),

t^R(r1, . . . , rn) = P(r1, . . . , rn), for all (r1, . . . , rn) ∈ R^n.

Conversely, for any polynomial P(x1, . . . , xn) ∈ Z[x1, . . . , xn] there is an LRi-term t(x1, . . . , xn) such that in every commutative ring R the above displayed identity holds.
(4) Let A and B be L-structures, h : A → B a homomorphism, and t = t(x1, . . . , xn) an L-term. Then

h(t^A(a1, . . . , an)) = t^B(ha1, . . . , han), for all (a1, . . . , an) ∈ A^n.

(If A ⊆ B and h : A → B is the inclusion, this gives t^A(a1, . . . , an) = t^B(a1, . . . , an) for all (a1, . . . , an) ∈ A^n.)
(5) Consider the L-structure A = (N; 0, 1, +, ·), where L = LRig.
(a) Is there an L-term t(x) such that t^A(0) = 1 and t^A(1) = 0?
(b) Is there an L-term t(x) such that t^A(n) = 2n for all n ∈ N?
(c) Find all the substructures of A.

2.5 Formulas and Sentences

Besides variables we also introduce the eight distinct logical symbols

⊤ ⊥ ¬ ∨ ∧ = ∃ ∀

The first five of these we already met when discussing propositional logic. None of these eight symbols is a variable, or a function or relation symbol of any language. Below L denotes a language. To distinguish the logical symbols from those in L, the latter are often referred to as the nonlogical symbols.
(If A ⊆ B and h : A → B is the inclusion. . . . . . . Then ` ´ h tA (a1 . . for any polynomial P (x1 . Below L denotes a language. 0. . To distinguish the logical symbols from those in L. this gives tA (a1 . . . . . rn ).) (5) Consider the Lstructure A = (N. =}: . yn ) := t(τ1 (y1 . . . .
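Exercise (4) states that a homomorphism h : A → B commutes with term evaluation: h(t^A(a1, …, an)) = t^B(ha1, …, han). A minimal sketch, reusing the hypothetical tuple encoding of terms (a structure's interpretation is just a dict from symbol names to Python functions):

```python
def eval_term(t, interp, assign):
    """Evaluate a term in a structure: interp maps function-symbol names to
    Python functions, assign maps variables to elements of the universe."""
    if isinstance(t, str):                      # variable
        return assign[t]
    f, *args = t
    return interp[f](*(eval_term(a, interp, assign) for a in args))

# The semiring N = (N; 0, 1, +, *):
N = {'0': lambda: 0, '1': lambda: 1,
     '+': lambda a, b: a + b, '*': lambda a, b: a * b}
t = ('+', ('1',), ('*', 'x', 'x'))              # the term 1 + x·x
print(eval_term(t, N, {'x': 3}))                # 10

# Exercise (4) in miniature: h(n) = n mod 5 is a homomorphism N -> Z/5Z,
# and h(t^N(3)) equals t^(Z/5Z)(h(3)):
Z5 = {'0': lambda: 0, '1': lambda: 1,
      '+': lambda a, b: (a + b) % 5, '*': lambda a, b: (a * b) % 5}
assert eval_term(t, N, {'x': 3}) % 5 == eval_term(t, Z5, {'x': 3 % 5})
```

The assertion at the end is one instance of the general identity; the inductive proof of the identity mirrors the recursive structure of `eval_term` exactly.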
ψ are Lformulas. The context should always make it clear what our intention is in this respect without having to spell it out. ∨. ⊥. (iii) if ϕ is a Lformula and x is a variable. (ii) if ϕ. The Lformulas are the words on the larger alphabet L ∪ Var ∪ { . ∃. . . =. tm are Lterms. . the ﬁrst two occurrences of x are bound. In the formula ∃x(x = y) ∧ x = 0. ∧. (iii) = t1 t2 . where R ∈ Lr is mary and t1 .2. which gives another useful characterization of subformulas. ∀}. This will become clear later. but we also use it to indicate equality of mathematical objects in the way we have done already many times. then so are ¬ϕ. . given Lformulas ϕ and ψ we shall write ϕ ∨ ψ to indicate ∨ϕψ. and ϕ → ψ to indicate ¬ϕ ∨ ψ. sk with i ≤ j ≤ k that is of the form ∃xψ or ∀xψ. t2 are Lterms. then ∃xϕ and ∀xϕ are Lformulas. ∃ and ∀ are given arity 2 and the other symbols have the arities assigned to them earlier. Note that all Lformulas are admissible words on the alphabet L ∪ Var ∪ { . sm .) The notational conventions introduced in the section on propositional logic go through. At this point the reader is invited to do the ﬁrst exercise at the end of this section. . FORMULAS AND SENTENCES (i) and ⊥. . the third is free. ∧. ∨ϕψ and ∧ϕψ.5. and the occurrences of x and y are really the occurrences in this string. The logical symbol = is treated just as a binary relation symbol. where =. sj = x) is said to be a bound occurrence if ϕ has a subformula si si+1 . sk where 1 ≤ i ≤ k ≤ m which also happens to be a formula of L. ∃.) The reader should distinguish between diﬀerent ways of using the symbol =. A subformula of ϕ is a subword of the form si . Written as a word on the alphabet above we have ϕ = s1 . ∨. . where t1 . . ⊥. where x and y are distinct. (Note: the formula is actually the string ∧∃x = xy = x0. . . =. . . Deﬁnition. not all admissible words on this alphabet are Lformulas: the word ∃xx is admissible but not an Lformula. ¬. 
Sometimes it denotes one of the eight formal logical symbols. This fact makes the results on unique readability applicable to Lformulas. tm . Let ϕ be a formula of L. (ii) Rt1 . with the role of propositions there taken over by formulas here.) . ∀} 31 obtained as follows: (i) every atomic Lformula is an Lformula. ¬. To increase readability we usually write an atomic formula = t1 t2 as t1 = t2 and its negation ¬ = t1 t2 as t1 = t2 . An occurrence of a variable x in ϕ at the jth place (that is. but its interpretation in a structure will always be the equality relation on its underlying set. and the only occurrence of y is free. Example. (For example. where t1 and t2 are Lterms. (However. If an occurrence is not bound then it is said to be a free occurrence.
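The distinction between free and bound occurrences can be computed recursively over the formula. Here is a sketch, again with a hypothetical tuple encoding (not from the notes): atomic formulas are tuples whose tail consists of terms, and ('exists', x, ϕ), ('forall', x, ϕ), ('not', ϕ), ('and', ϕ, ψ), ('or', ϕ, ψ) build compound formulas.

```python
def term_vars(t):
    """Variables occurring in a term (string = variable, tuple = compound)."""
    return {t} if isinstance(t, str) else {v for s in t[1:] for v in term_vars(s)}

def free_vars(phi):
    """The set of variables with at least one free occurrence in phi."""
    op = phi[0]
    if op in ('exists', 'forall'):
        return free_vars(phi[2]) - {phi[1]}     # x is bound inside its scope
    if op == 'not':
        return free_vars(phi[1])
    if op in ('and', 'or'):
        return free_vars(phi[1]) | free_vars(phi[2])
    return {v for t in phi[1:] for v in term_vars(t)}   # atomic

# The example from the text: in  ∃x(x = y) ∧ x = 0  the first occurrences of x
# are bound, the last one is free, and the only occurrence of y is free:
phi = ('and', ('exists', 'x', ('=', 'x', 'y')), ('=', 'x', ('0',)))
print(sorted(free_vars(phi)))   # ['x', 'y']
```

A sentence is then a formula ϕ with `free_vars(phi)` empty, matching the definition below.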
. . or σ is valid in A). xn be distinct variables. . . In using this notation it is understood that x1 . . Then ϕ(t1 /x1 . but it is not required that each of x1 . . Lemma 2. . . tn /xn ) is the word obtained by replacing all the free occurences of xi in ϕ by ti . . let x1 . . . . . Then A = σ if and only if A = ϕ(a) for some a ∈ A. .5. and let t1 . . . . (iii) Suppose σ = σ1 ∧ σ2 . . In the deﬁnition of ϕ(t1 /x1 . where one allows that some of these indeterminates do not actually occur in p. (ii) Suppose σ = σ1 ∨ σ2 . tn /xn ). .1. t2 . tn ) as a shorthand for ϕ(t1 /x1 . tA ) ∈ RA . xn ) to indicate a formula ϕ such that all variables that occur free in ϕ are among x1 .32 CHAPTER 2. . . . If t1 . . . . which for simplicity of notation we denote instead by tA . . tn be Lterms. . . . and let C ⊆ A. . because it can happen that ϕ(t1 /x1 )(t2 /x2 ) = ϕ(t1 /x1 . . . . Let A be an Lstructure with underlying set A. Hence for each variablefree LC term t we have a corresponding element tAC of A. xn are distinct variables. tn /xn ) the “replacing” should be simultaneous. . BASIC CONCEPTS OF LOGIC Deﬁnition. . tm if and only if (tA . . and by interpreting each name c as the element c ∈ C. These names are symbols not in L. . . Then ϕ(t1 /x1 . . . and t1 . . . also read as A satisﬁes σ or σ holds in A. If ϕ is given in the form ϕ(x1 . All this applies in particular to the case C = A. Suppose ϕ is an Lformula. . (iii) A = t1 = t2 if and only if tA = tA . . . . We have the following lemma whose routine proof is left to the reader. . xn by p(x1 . Then A = σ if and only if A = σ1 or A = σ2 . A sentence is a formula in which all occurrences of variables are bound occurrences. . . Then A = σ if and only if A = σ1 and A = σ2 . . . . then ϕ(t1 . tn are Lterms. (iv) Suppose σ = ∃xϕ(x). tn /xn ) is an Lformula. . . . xn are distinct variables. . and A = ⊥. The LC structure thus obtained is indicated by AC . . (This is like indicating a polynomial in the indeterminates x1 . . . . . 
First we consider atomic LA sentences: (i) A = . We make A into an LC structure by keeping the same underlying set and interpretations of symbols of L. . . where in LA we have a name a for each a ∈ A. . tn are variablefree and ϕ = ϕ(x1 . . x1 . then we write ϕ(t1 . and m 1 variablefree LA terms t1 . . . . for variablefree LA terms t1 . . . xn . . . . We extend L to a language LC by adding a constant symbol c for each c ∈ C. xn ). We can now deﬁne what it means for an LA sentence σ to be true in the Lstructure A (notation: A = σ. xn ). xn ). . . . . . We write ϕ(x1 . xn occurs free in ϕ. . . called the name of c.) Deﬁnition. tn ) is an Lsentence. . simultaneously for i = 1. t2 /x2 ). . . Then A = σ if and only if A σ1 . 1 2 We extend the deﬁnition inductively to arbitrary LA sentences as follows: (i) Suppose σ = ¬σ1 . . tm . . for mary R ∈ Lr . . n. (ii) A = Rt1 . Deﬁnition. . . . . Let ϕ be an Lformula. .
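For a finite structure the inductive truth definition above can be executed directly: instead of extending the language by names a for elements, one carries a variable assignment, and the quantifier clauses range over the (finite) universe. A sketch under the same hypothetical encoding as before:

```python
def holds(phi, structure, assign):
    """Tarski-style truth in a finite structure. structure = (universe,
    function interpretations, relation interpretations); assign maps the
    free variables of phi to elements of the universe."""
    A, funcs, rels = structure
    op = phi[0]
    if op == 'exists':
        return any(holds(phi[2], structure, {**assign, phi[1]: a}) for a in A)
    if op == 'forall':
        return all(holds(phi[2], structure, {**assign, phi[1]: a}) for a in A)
    if op == 'not':
        return not holds(phi[1], structure, assign)
    if op == 'or':
        return holds(phi[1], structure, assign) or holds(phi[2], structure, assign)
    if op == 'and':
        return holds(phi[1], structure, assign) and holds(phi[2], structure, assign)
    def ev(t):                                   # term evaluation
        return assign[t] if isinstance(t, str) else funcs[t[0]](*map(ev, t[1:]))
    if op == '=':
        return ev(phi[1]) == ev(phi[2])
    return tuple(map(ev, phi[1:])) in rels[op]   # relation symbol

# Z/4Z as an additive group: every element has an additive inverse.
Z4 = (range(4), {'+': lambda a, b: (a + b) % 4, '0': lambda: 0}, {})
sigma = ('forall', 'x', ('exists', 'y', ('=', ('+', 'x', 'y'), ('0',))))
print(holds(sigma, Z4, {}))   # True
```

Note that this is only viable because the universe is finite; the definition in the text makes sense for arbitrary structures precisely because it is a definition by induction on formulas, not an algorithm.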
xm and a quantiﬁerfree Lformula ϕ. Then A = σ if and only if A = ϕ(a) for all a ∈ A. . an ) for some (a1 . (Here x2 abbreviates the term x · x. . √ (1) The set {r ∈ R : r < 2} is 0deﬁnable in (R. ∀xm ϕ with distinct x1 . and sf(∀xϕ) = {∀xϕ} ∪ sf(ϕ). . This is why we introduced names. an ) : A = ϕ(a1 . . . Examples. Given an LA formula ϕ(x1 . . . 1. an )} The formula ϕ(x1 . <. . Exercises. . 1. An Lformula is said to be universal if it has the form ∀x1 . <. A = σ ⇐⇒ A = ϕ(a1 . sf(ϕ ∨ ψ) = {ϕ ∨ ψ} ∪ sf(ϕ) ∪ sf(ψ). . . . −. . but “inductive” refers here to induction with respect to the number of logical symbols in σ. xm and a quantiﬁerfree Lformula ϕ. the fact that ϕ(a) has fewer logical symbols than ∃xϕ(x) is crucial for the above to count as a deﬁnition.5. . . . FORMULAS AND SENTENCES 33 (v) Suppose σ = ∀xϕ(x). An Lformula is said to be existential if it has the form ∃x1 . and that for an LA sentence σ = ∀x1 . . . If ϕ is atomic. . We now single out formulas by certain syntactical conditions. . ∀xn ϕ(x1 . an ) ∈ An .2. . xn ) is said to deﬁne the set ϕA in A. . . . . +. . 0.) (2) The set {r ∈ R : r < π} is deﬁnable in (R. xn ) we let ϕA be the following subset of An : ϕA = {(a1 . an ) ∈ An . 0. . . . . ∃xm ϕ with distinct x1 . an ) for all (a1 . . . An Lformula is said to be quantiﬁerfree if it has no occurrences of ∃ and no occurrences of ∀. . the inductive deﬁnition above forces us to consider LA sentences ϕ(a). then sf(ϕ) = {ϕ}. . sf(∃xϕ) = {∃xϕ} ∪ sf(ϕ). Even if we just want to deﬁne A = σ for Lsentences σ. (1) Let (a) (b) (c) (d) ϕ and ψ be Lformulas. . . . . xn ). . xn ). . ·): it is deﬁned by the formula (x2 < 1 + 1) ∨ (x < 0). . . ·): it is deﬁned by the formula x < π. . If moreover ϕ can be chosen to be an Lformula then S is said to be 0deﬁnable in A. . . . sf(¬ϕ) = {¬ϕ} ∪ sf(ϕ). . ∃xn ϕ(x1 . +. . xn ). These conditions have semantic counterparts in terms of the behaviour of these formulas under various kinds of homomorphisms. . . . . . . 
put sf(ϕ) := set of subformulas of ϕ. It is easy to check that for an LA sentence σ = ∃x1 . For example. −. . An Lformula is said to be positive if it has no occurrences of ¬ (but it can have occurrences of ⊥). We didn’t say so explicitly. A = σ ⇐⇒ A = ϕ(a1 . . Deﬁnition. A set S ⊆ An is said to be deﬁnable in A if S = ϕA for some LA formula ϕ(x1 . . and sf(ϕ ∧ ψ) = {ϕ ∧ ψ} ∪ sf(ϕ) ∪ sf(ψ). one can see that if σ has the form ∃xϕ(x) or ∀xϕ(x). . as shown in some exercises below. . . .
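The inductive clauses for sf(ϕ) translate directly into a recursion. A sketch with the hypothetical tuple encoding used above:

```python
def sf(phi):
    """The set of subformulas of phi, following the inductive clauses:
    sf of an atomic formula is {phi}, and each connective or quantifier
    contributes itself plus the subformulas of its immediate parts."""
    op = phi[0]
    if op in ('exists', 'forall'):
        return {phi} | sf(phi[2])
    if op == 'not':
        return {phi} | sf(phi[1])
    if op in ('and', 'or'):
        return {phi} | sf(phi[1]) | sf(phi[2])
    return {phi}   # atomic

phi = ('and', ('exists', 'x', ('=', 'x', 'y')), ('not', ('=', 'x', ('0',))))
print(len(sf(phi)))   # 5 subformulas: phi, ∃x(x=y), x=y, ¬(x=0), x=0
```

Since formulas are encoded as nested tuples (hashable), a Python set can hold them, and repeated subformulas are automatically counted once.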
put S(a) := {b ∈ An : (a. (6) Let the symbols of L be a binary relation symbol < and a unary relation symbol U . The set {2n : n ∈ N} in the semiring N . . an ). y) deﬁnes in A the set {a ∈ Am : S(a) = An }. Then there is an Lsentence σ such that for all X ⊆ R we have (R. 3. . . ∀yn ϕ(x. (7) Let A ⊆ B. 0. . . an ∈ A. b) ∈ S} (a section of S). X) = σ ⇐⇒ X is ﬁnite. . (8) Suppose h : A −→ B is a homomorphism of Lstructures. . . . . . then A = σ ⇔ B = σ. This convention is in force throughout these notes. Suppose that S ⊆ Am+n is deﬁned in A by the LA formula ϕ(x. am+n ) = (a1 . . . then A = σ ⇒ B = σ (d) If σ is a universal LA sentence. . . ∃yn ϕ(x. (b) S1 ∩ S2 is deﬁned in A by (ϕ1 ∧ ϕ2 )(x1 . . . For each LA term t. let ϕh be the LB formula obtained from ϕ by replacing each occurrence of a name a of an element a ∈ A by the name ha of the corresponding element ha ∈ B. . (a) For each variable free LA term t we have tA = tB . 1. . . f ) where f : R → R is any function. 0. <. . . ` (d) S1 ⊆ S2 ⇐⇒ A = ∀x1 . . . and ∀y1 . then B = σ ⇒ A = σ. let th be the LB term obtained from t by replacing each occurrence of a name a of an element a ∈ A by the name ha of the corresponding element ha ∈ B. . <. xn ) respectively. . . . . (4) Let π : Am+n → Am be the projection map given by π(a1 . . The set {a ∈ R : f is continuous at a} in (R. . . . . yn ). ∀xn ϕ1 → ϕ2 ). . so is ϕh . xn ) and ϕ2 (x1 . +. y) deﬁnes in A the subset π(S) of Am . (b) If the LA sentence σ is quantiﬁerfree. xn ) is an LA term and a1 . . n) ∈ N2 : m < n} in (N.} of prime numbers in the semiring N = (N.34 CHAPTER 2. 7. . for each LA formula ϕ. The set {2. . . . and for S ⊆ Am+n and a ∈ Am . . . . Then: (a) S1 ∪ S2 is deﬁned in A by (ϕ1 ∨ ϕ2 )(x1 . BASIC CONCEPTS OF LOGIC (2) If t(x1 . . an )A = tA (a1 . xn ). xn ). . . 5. xm ) and y = (y1 . y) where x = (x1 . (5) The (a) (b) (c) (d) following sets are 0deﬁnable in the corresponding structures: The ordering relation {(m. Then ∃y1 . . . . Similarly. . 
(c) If σ is an existential LA sentence. . (c) An S1 is deﬁned in A by ¬ϕ1 (x1 . . xn ). am ). Then we consider LA to be a sublanguage of LB in such a way that each a ∈ A has the same name in LA as in LB . . . . ·). . . . then t(a1 . +). (3) Suppose that S1 ⊆ An and S2 ⊆ An are deﬁned in A by the LA formulas ϕ1 (x1 . . . Note that if ϕ is a sentence. Then: . .
h σ is a positive LA sentence without ∀symbol. the term t1 if n = 1.6 Models In the rest of this chapter L is a language. . MODELS (a) (b) (c) (d) if if if if 35 t is a variablefree LA term. then A = σ ⇒ B = σh . ∀x∀y∀z((x + y) + z = x + (y + z))} (3) Torsionfree abelian groups are the LAb structures that are models of Ab ∪ {∀x(nx = 0 → x = 0) : n = 1.} (4) Rings are the LRi structures that are models of Ri := Ab ∪ {∀x∀y∀z (x · y) · z = x · (y · z) . ∀x x · 1 = x ∧ 1 · x = x . We say that A is a model of Σ or Σ holds in A (denoted A = Σ) if A = σ for each σ ∈ Σ. ∀x∀y∀z((x · y) · z = x · (y · z))} (2) Abelian groups are the LAb structures that are models of Ab := {∀x(x + 0 = x). and. Then we have similar notational conventions for t1 · . then h(tA ) = tB . · tn and tn .6. ϕ. A is an Lstructure (with underlying set A). z.2. σ is an LA sentence and h is an isomorphism. and θ are Lformulas. . . . To discuss examples it is convenient to introduce some notation. 0t and 1t denote the terms 0 and t respectively. . We write nt for the term t+· · ·+t with n summands. in particular. Suppose L contains the constant symbol 1 and the binary function symbol · (the multiplication sign). unless this would cause confusion. ∀x(x · x−1 = 1 ∧ x−1 · x = 1). 3. Given any terms t1 . . σ is an Lsentence. isomorphic Lstructures satisfy exactly the same Lsentences. y. Deﬁnition. (1) Groups are the LGr structures that are models of Gr := {∀x(x · 1 = x ∧ 1 · x = x). and the term (t1 + · · · + tn−1 ) + tn for n > 1. In particular. in particular. and t1 is just t. Fix three distinct variables x. ∀x∀y∀z (x · (y + z) = x · y + x · z ∧ (x + y) · z = x · z + y · z) } . We drop the preﬁx L in “Lterm” and “Lformula” and so on. tn we deﬁne the term t1 + · · · + tn inductively as follows: it is the term 0 if n = 0. . Suppose L contains (at least) the constant symbol 0 and the binary function symbol +. ∀x∀y(x + y = y + x). . and Σ is a set of Lsentences. σ is a positive LA sentence and h is surjective. 
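The convention for t1 + · · · + tn (the term 0 for n = 0, t1 for n = 1, and (t1 + · · · + t_{n−1}) + tn for n > 1), and hence for nt = t + · · · + t with n summands, is easy to sketch on the tuple encoding; this is only an illustration of the bracketing convention, not part of the notes:

```python
def tsum(ts):
    """The term t1 + ... + tn as defined in the text: the term 0 for n = 0,
    t1 for n = 1, and (t1 + ... + t_{n-1}) + tn for n > 1 (left-nested)."""
    if not ts:
        return ('0',)           # empty sum is the constant term 0
    acc = ts[0]
    for t in ts[1:]:
        acc = ('+', acc, t)     # attach the next summand on the right
    return acc

# nt is tsum([t] * n); e.g. 3x is ((x + x) + x):
print(tsum(['x'] * 3))   # ('+', ('+', 'x', 'x'), 'x')
```

With this in hand, a sentence like ∀x(nx = 0 → x = 0) from the torsion-free group axioms is just a formula built over `tsum(['x'] * n)`.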
2. . ∀x(x + (−x) = 0). Examples. 2. unless indicated otherwise. then A = σ ⇒ B = σh . t is an Lterm. for n = 0 both stand for the term 1. then A = σ ⇔ B = σh . ψ.
xm+1 . . . .36 CHAPTER 2. 11. . obtained by adding a variable xm+1 . Deﬁnition. xm ). Of course ϕ can also be written as ϕ(y1 . ﬁelds of characteristic p are the LRi structures that are models of Fl(p) := Fl ∪ {p1 = 0}. The reader should check that if ϕ = ϕ(x1 .” the reader should check that these diﬀerent ways of specifying variables (including at least the variables occurring free in ϕ) give the same Ainstances. We say that ϕ is a logical consequence of Σ (notation: Σ = ϕ) if A = ϕ for all models A of Σ. algebraically closed ﬁelds of characteristic p are the LRi structures that are models of ACF(p) := ACF ∪ {p1 = 0}. . 3. . 1 = 0. . . . yn ) for another sequence of variables y1 . (9) Algebraically closed ﬁelds of characteristic 0 are the LRi structures that are models of ACF(0) := ACF ∪ {n1 = 0 : n = 2. n. . . am ) with a1 . . 4. . . 5. . then A = ϕ ⇐⇒ A = ∀x1 . . for i = 1. . . 3. . 5.} Here u1 . First deﬁne an Ainstance of a formula ϕ = ϕ(x1 . for example. Deﬁnition. . We say that σ is a logical consequence of Σ (written Σ = σ) if σ is true in every model of Σ. . 11. We also extend the notion of “logical consequence of Σ” to formulas. . 7. . ∀xm ϕ. . . We deﬁned what it means for a sentence σ to hold in a given structure A. yn . . 5. is some ﬁxed inﬁnite sequence of distinct variables. (8) Algebraically closed ﬁelds are the LRi structures that are models of ACF := Fl∪{∀u1 . am ∈ A. 3. BASIC CONCEPTS OF LOGIC (5) Fields are the LRi structures that are models of Fl = Ri ∪ {∀x∀y(x · y = y · x). . . . . . . . . This can now be expressed as Ri = ∀x( x · 0 = 0). . u3 . . . xm ) to be an LA sentence of the form ϕ(a1 . .} (7) Given a prime number p. Deﬁnition. . . . . . (10) Given a prime number p. It is wellknown that in any ring R we have x · 0 = 0 for all x ∈ R. or it could be x1 . distinct also from x. ∀x x = 0 → ∃y (x · y = 1) } (6) Fields of characteristic 0 are the LRi structures that are models of Fl(0) := Fl ∪ {n1 = 0 : n = 2. 7. Example. xm . u2 . . . . 
and ui xn−i abbreviates ui · xn−i . . A formula ϕ is said to be valid in A (notation: A = ϕ) if all its Ainstances are true in A. We now extend this to arbitrary formulas. . . . Thus for the above to count as a deﬁnition of “Ainstance.+un = 0) : n = 2. . . . yn could be obtained by permuting x1 .}. y1 . . . . . ∀un ∃x(xn +u1 xn−1 +. . xm . .
One should not confuse the notion of "logical consequence of Σ" with that of "provable from Σ." We shall give a definition of provable from Σ in the next section. The two notions will turn out to be equivalent, but that is hardly obvious from their definitions: we shall need much of the next chapter to prove this equivalence, which is called the Completeness Theorem for Predicate Logic.

We finish this section with two basic facts:

Lemma 2.6.1. Let α(x1, …, xm) be an L_A term, and recall that α defines a map α^A : A^m → A. Let t1, …, tm be variable-free L_A terms, with t_i^A = a_i ∈ A for i = 1, …, m. Then α(t1, …, tm) is a variable-free L_A term, and

α(t1, …, tm)^A = α(a1, …, am)^A = α^A(t1^A, …, tm^A).

This follows by a straightforward induction on α.

Lemma 2.6.2. Let t1, …, tm be variable-free L_A terms with t_i^A = a_i ∈ A for i = 1, …, m. Let ϕ(x1, …, xm) be an L_A formula. Then the L_A formula ϕ(t1, …, tm) is a sentence and

A ⊨ ϕ(t1, …, tm) ⟺ A ⊨ ϕ(a1, …, am).

Proof. To keep notations simple we give the proof only for m = 1, with t = t1 and x = x1. We proceed by induction on the number of logical symbols in ϕ(x). Suppose that ϕ is atomic. The case where ϕ is ⊤ or ⊥ is obvious. Assume ϕ is Rα1 … αm where R ∈ L^r is m-ary and α1(x), …, αm(x) are L_A terms. Then ϕ(t) = Rα1(t) … αm(t) and ϕ(a) = Rα1(a) … αm(a). We have A ⊨ ϕ(t) iff (α1(t)^A, …, αm(t)^A) ∈ R^A, and also A ⊨ ϕ(a) iff (α1(a)^A, …, αm(a)^A) ∈ R^A. As αi(t)^A = αi(a)^A for all i by the previous lemma, we have A ⊨ ϕ(t) iff A ⊨ ϕ(a). The case that ϕ(x) is α(x) = β(x) is handled the same way. It is also clear that the desired property is inherited by disjunctions, conjunctions and negations of formulas ϕ(x) that have the property.

Suppose now that ϕ(x) = ∃y ψ.

Case y ≠ x: Then ψ = ψ(x, y), ϕ(t) = ∃yψ(t, y) and ϕ(a) = ∃yψ(a, y). As ϕ(t) = ∃yψ(t, y), we have A ⊨ ϕ(t) iff A ⊨ ψ(t, b) for some b ∈ A. By the inductive hypothesis the latter is equivalent to A ⊨ ψ(a, b) for some b ∈ A, hence equivalent to A ⊨ ∃yψ(a, y). As ϕ(a) = ∃yψ(a, y), we conclude that A ⊨ ϕ(t) iff A ⊨ ϕ(a).

Case y = x: Then x does not occur free in ϕ(x) = ∃xψ. So ϕ(t) = ϕ(a) = ϕ is an L_A sentence, and A ⊨ ϕ(t) ⟺ A ⊨ ϕ(a) is obvious.

When ϕ(x) = ∀y ψ one proceeds exactly as above by distinguishing the same two cases.
2.7
Logical Axioms and Rules; Formal Proofs
In this section we introduce a proof system for predicate logic and state its completeness. From completeness we then derive the Compactness Theorem and some of its corollaries. Completeness itself is proved in the next chapter.
A propositional axiom of L is by definition a formula that for some ϕ, ψ, θ occurs in the list below:

1. ⊤
2. ϕ → (ϕ ∨ ψ);  ϕ → (ψ ∨ ϕ)
3. ¬ϕ → (¬ψ → ¬(ϕ ∨ ψ))
4. (ϕ ∧ ψ) → ϕ;  (ϕ ∧ ψ) → ψ
5. ϕ → (ψ → (ϕ ∧ ψ))
6. (ϕ → (ψ → θ)) → ((ϕ → ψ) → (ϕ → θ))
7. ϕ → (¬ϕ → ⊥)
8. (¬ϕ → ⊥) → ϕ

Each of items 2–8 is a scheme describing infinitely many axioms. Note that this list is the same as the list in Section 2.2 except that instead of propositions p, q, r we have formulas ϕ, ψ, θ. The logical axioms of L are the propositional axioms of L together with the equality and quantifier axioms of L as defined below.

Definition. The equality axioms of L are the following formulas:
(i) x = x,
(ii) x = y → y = x,
(iii) (x = y ∧ y = z) → x = z,
(iv) (x1 = y1 ∧ … ∧ xm = ym ∧ Rx1 … xm) → Ry1 … ym,
(v) (x1 = y1 ∧ … ∧ xn = yn) → Fx1 … xn = Fy1 … yn,
with the following restrictions on the variables and symbols of L: x, y, z are distinct in (ii) and (iii); in (iv), x1, …, xm, y1, …, ym are distinct and R ∈ L^r is m-ary; in (v), x1, …, xn, y1, …, yn are distinct, and F ∈ L^f is n-ary.

Note that (i) represents an axiom scheme rather than a single axiom, since different variables x give different formulas x = x. Likewise with (ii)–(v).

Let x and y be distinct variables, and let ϕ(y) be the formula ∃x(x = y). Then ϕ(y) is valid in all A with |A| > 1, but ϕ(x/y) is invalid in all A. Thus substituting x for the free occurrences of y does not always preserve validity. To get rid of this anomaly, we introduce the following restriction on substitutions of a term t for free occurrences of y.

Definition. We say that t is free for y in ϕ if no variable of t becomes bound upon replacing the free occurrences of y in ϕ by t; more precisely: whenever x is a variable in t, no subformula of ϕ of the form ∃xψ or ∀xψ contains an occurrence of y that is free in ϕ.

Note that if t is variable-free, then t is free for y in ϕ.
We remark that “free for” abbreviates “free to be substituted for.” In exercise 2 the reader is asked to show that, with this restriction, substitution of a term for the free occurrences of a variable does preserve validity.
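The "free for" condition can be checked mechanically by walking the formula and tracking which variables are bound at each point. A sketch on the hypothetical tuple encoding (not from the notes); it reproduces the anomaly from the text, where x is not free for y in ∃x(x = y) but a fresh variable z is:

```python
def term_vars(t):
    """Variables occurring in a term (string = variable, tuple = compound)."""
    return {t} if isinstance(t, str) else {v for s in t[1:] for v in term_vars(s)}

def free_for(t, y, phi, bound=frozenset()):
    """Is the term t free for y in phi? I.e. no variable of t may be captured
    by a quantifier when t replaces a free occurrence of y."""
    op = phi[0]
    if op in ('exists', 'forall'):
        x, body = phi[1], phi[2]
        if x == y:
            return True            # occurrences of y below here are bound anyway
        return free_for(t, y, body, bound | {x})
    if op == 'not':
        return free_for(t, y, phi[1], bound)
    if op in ('and', 'or'):
        return free_for(t, y, phi[1], bound) and free_for(t, y, phi[2], bound)
    # atomic: if y occurs (freely) here, no variable of t may be bound here
    if any(y in term_vars(s) for s in phi[1:]):
        return not (term_vars(t) & bound)
    return True

phi = ('exists', 'x', ('=', 'x', 'y'))
print(free_for('x', 'y', phi), free_for('z', 'y', phi))   # False True
```

Note also that any variable-free term passes the check, in line with the remark in the text.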
Deﬁnition. The quantiﬁer axioms of L are the formulas ϕ(t/y) → ∃yϕ and ∀yϕ → ϕ(t/y) where t is free for y in ϕ. These axioms have been chosen to have the following property. Proposition 2.7.1. The logical axioms of L are valid in every Lstructure. We ﬁrst prove this for the propositional axioms of L. Let α1 , . . . , αn be distinct propositional atoms not in L. Let p = p(α1 , . . . , αn ) ∈ Prop{α1 , . . . , αn }. Let ϕ1 , . . . , ϕn be formulas and let p(ϕ1 , . . . , ϕn ) be the word obtained by replacing each occurrence of αi in p by ϕi for i = 1, . . . , n. One checks easily that p(ϕ1 , . . . , ϕn ) is a formula. Lemma 2.7.2. Suppose ϕi = ϕi (x1 , . . . , xm ) for 1 ≤ i ≤ n and let a1 , . . . , am ∈ A. Deﬁne a truth assignment t : {α1 , . . . , αn } −→ {0, 1} by t(αi ) = 1 iﬀ A = ϕi (a1 , . . . , am ). Then p(ϕ1 , . . . , ϕn ) is an Lformula and p(ϕ1 , . . . , ϕn )(a1 /x1 , . . . , am /xm ) = p(ϕ1 (a1 , . . . , am ), . . . , ϕn (a1 , . . . , am )), t(p(α1 , . . . , αn )) = 1 ⇐⇒ A =p(ϕ1 (a1 , . . . , am ), . . . , ϕn (a1 , . . . , am )). In particular, if p is a tautology, then A = p(ϕ1 , . . . , ϕn ). Proof. Easy induction on p. We leave the details to the reader. Deﬁnition. An Ltautology is a formula of the form p(ϕ1 , . . . , ϕn ) for some tautology p(α1 , . . . , αn ) ∈ Prop{α1 , . . . , αn } and some formulas ϕ1 , . . . , ϕn . By Lemma 2.7.2 all Ltautologies are valid in all Lstructures. The propositional axioms of L are Ltautologies, so all propositional axioms of L are valid in all Lstructures. It is easy to check that all equality axioms of L are valid in all Lstructures. In exercise 3 below the reader is asked to show that all quantiﬁer axioms of L are valid in all Lstructures. This ﬁnishes the proof of Proposition 2.7.1. Next we introduce rules for deriving new formulas from given formulas. Deﬁnition. The logical rules of L are the following: (i) Modus Ponens (MP): From ϕ and ϕ → ψ, infer ψ. 
(ii) Generalization Rule (G): If the variable x does not occur free in ϕ, then (a) from ϕ → ψ, infer ϕ → ∀xψ; (b) from ψ → ϕ, infer ∃xψ → ϕ. A key property of the logical rules is that their application preserves validity. Here is a more precise statement of this fact, to be veriﬁed by the reader. (i) If A = ϕ and A = ϕ → ψ, then A = ψ. (ii) Suppose x does not occur free in ϕ. Then (a) if A = ϕ → ψ, then A = ϕ → ∀xψ; (b) if A = ψ → ϕ, then A = ∃xψ → ϕ. Deﬁnition. A formal proof , or just proof , of ϕ from Σ is a sequence ϕ1 , . . . , ϕn of formulas with n ≥ 1 and ϕn = ϕ, such that for k = 1, . . . , n:
Proof. . . .7. Contradiction! . Moreover. Suppose σ is an LRi sentence that holds in all ﬁelds of characteristic 0. The Compactness Theorem has many consequences.} = σ. n ≤ N } = σ. However the equivalence of and = (Completeness Theorem) justiﬁes our choice of logical axioms and rules and shows in particular that no further logical axioms and rules are needed. 5. By the previous result σ holds in some ﬁeld of characteristic p > 0. There is no ﬁnite set of LRi sentences whose models are exactly the ﬁelds of characteristic 0. Let σ := σ1 ∧ · · · ∧ σN . Note that Fl(0) is inﬁnite. and thus our notion of formal proof is somewhat arbitrary. (ii) or ϕk is a logical axiom. (iii) or there are i. 3. Fl(0) = Fl ∪ {n1 = 0 : n = 2. then Σ = ϕ.7. . k − 1} such that ϕk can be inferred from ϕi and ϕj by MP. this equivalence has consequences that can be stated in terms of = alone. . there is N ∈ N such that Fl ∪ {n1 = 0 : n = 2. Then by Compactness. The converse is more interesting.7. σN }. Then the models of σ are just the ﬁelds of characteristic 0. If Σ ϕ. .7. If Σ = σ then there is a ﬁnite subset Σ0 of Σ such that Σ0 = σ.40 CHAPTER 2. . see exercise 8 below. Proposition 2. . Suppose there is such a ﬁnite set of sentences {σ1 . or from ϕi by G. Could there be an alternative ﬁnite set of axioms whose models are exactly the ﬁelds of characteristic 0? Corollary 2. By assumption. 3. Here is one. . . then we write Σ ϕ and say Σ proves ϕ.6. j ∈ {1. The converse of this corollary fails. An example is the important Compactness Theorem. and due to G¨del (1930): o Theorem 2.5 (Compactness Theorem). This follows easily from earlier facts that we stated and which the reader was asked to verify. . Theorem 2. Corollary 2.4 (Completeness Theorem of Predicate Logic). . Then there exists a natural number N such that σ is true in all ﬁelds of characteristic p > N . Proof. Σ ϕ ⇐⇒ Σ = ϕ Remark. It follows that σ is true in all ﬁelds of characteristic p > N . . . .7.3. Our choice of proof system. 5. 
BASIC CONCEPTS OF LOGIC (i) either ϕk ∈ Σ.7. . If there exists a proof of ϕ from Σ.
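The clauses defining a formal proof can be turned into a checker: each step must belong to Σ, be a logical axiom, or follow from earlier steps by a rule. The sketch below (hypothetical encoding, implications as ('->', ϕ, ψ)) covers clauses (i)–(iii) with Modus Ponens only; rule G is omitted to keep it short:

```python
def check_proof(seq, sigma, is_logical_axiom):
    """Check that seq = (phi_1, ..., phi_n), n >= 1, is a formal proof from
    sigma: each step is in sigma, a logical axiom, or follows from two
    earlier steps by Modus Ponens. (Rule G is not handled in this sketch.)"""
    for k, phi in enumerate(seq):
        ok = phi in sigma or is_logical_axiom(phi)
        if not ok:
            # MP: some earlier psi = seq[i] and earlier seq[j] = psi -> phi
            ok = any(seq[j] == ('->', seq[i], phi)
                     for i in range(k) for j in range(k))
        if not ok:
            return False
    return len(seq) >= 1

# A two-step toy proof: from p and p -> q, infer q by MP.
proof = ['p', ('->', 'p', 'q'), 'q']
print(check_proof(proof, {'p', ('->', 'p', 'q')}, lambda f: False))   # True
```

Checking a given sequence is thus a purely mechanical, decidable matter; finding a proof is the hard part, and that asymmetry is what makes the Completeness Theorem substantial.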
(ii) The quantiﬁer axiom ϕ(t/y) → ∃yϕ is valid in A. y). (3) Suppose t is free for y in ϕ = ϕ(x1 . (1) Let L = {R} where R is a binary relation symbol. xn . . then (5) Σ (6) If Σ (7) ϕi for i = 1. . . . (8) Indicate an LRi sentence that is true in the ﬁeld of real numbers. but false in all ﬁelds of positive characteristic. τ ) → ∃yϕ(a1 . y) with a1 . e.2. and let A = (A. two ﬁnite Lstructures are isomorphic iﬀ they satisfy the same Lsentences. .5.4. then Σ ¬∃xϕ ↔ ∀x¬ϕ and ¬∀xϕ ↔ ∃x¬ϕ. ϕ ↔ ψ. an . Exercise (4) will be used in proving Theorem 2. . (In fact. .) (iii) The quantiﬁer axiom ∀yϕ → ϕ(t/y) is valid in A.7. an . All but the last one to be done without using Theorem 2. R) be a ﬁnite Lstructure (i. . . .4 or 2. Actually. . . then ϕ(t/y) is valid in A. an ∈ A and τ a variablefree LA term. (9) Let σ be an LAb sentence which holds in all nontrivial torsion free abelian groups. .7.2.7. .) (2) If t is free for y in ϕ and ϕ is valid in A. . Then there exists an Lsentence σ such that the models of σ are exactly the Lstructures isomorphic to A. ψ → ϕ. . Then: (i) Each Ainstance of the quantiﬁer axiom ϕ(t/y) → ∃yϕ has the form ϕ(a1 . FORMAL PROOFS 41 Exercises. LOGICAL AXIOMS AND RULES. . Then there exists N ∈ N such that σ is true in all groups Z/pZ where p is a prime number and p > N . for an arbitrary language L. n ⇐⇒ Σ ϕ → ψ and Σ ϕ. .6. . .7. . (Hint: use Lemma 2. ϕ1 ∧ · · · ∧ ϕn . (4) If ϕ is an Ltautology. the set A is ﬁnite).
Chapter 3 The Completeness Theorem The main aim of this chapter is to prove the Completeness Theorem.second form). if It is convenient to prove ﬁrst a variant of the Completeness Theorem. Proof. Then Σ ∀xϕ. Theorem 3. Lemma 3. From Σ ϕ and the Ltautology ϕ → (¬∀xϕ → ϕ) we obtain Σ ¬∀xϕ → ϕ by MP. Σ is consistent if and only if Σ has a model. Suppose Σ ϕ.1 Another Form of Completeness ⊥. which we can treat here rather eﬃciently using our construction of a socalled termmodel in the proof of the Completeness Theorem.6. ψ. The last section contains some of the basics of universal algebra. As a byproduct we also derive some more elementary facts about predicate logic. t. σ and Σ are as in the beginning of Section 2. Suppose Σ ∪ {σ} ϕ. and otherwise (that is. 3. We ﬁrst show that this second form of the Completeness Theorem implies the ﬁrst form. 43 . The cases where ϕ is a logical axiom. which are also useful later in this Chapter. A. Using the Ltautology (¬∀xϕ → ∀xϕ) → ∀xϕ and MP we get Σ ∀xϕ.2. Then by G we have Σ ¬∀xϕ → ∀xϕ. This will be done through a series of technical lemmas. ϕ. or ϕ ∈ Σ ∪ {σ} or ϕ is obtained by MP are treated just as in the proof of the Deduction Lemma of Propositional Logic. Deﬁnition. Lemma 3. we call Σ inconsistent.1. We say that Σ is consistent if Σ Σ ⊥).1.3 (Deduction Lemma). Conventions on the use of L.1 (Completeness Theorem . Proof. θ.1. By induction on the length of a proof of ϕ from Σ ∪ {σ}. Then Σ σ → ϕ.
Σ σ if and only if Σ ∪ {¬σ} is inconsistent.1. Then Σ σ1 ∧ . . Corollary 3.9 (Compactness Theorem .1. . This job will be done in a series of lemmas. ϕ. σn } We leave the proof as an exercise. and where we assume inductively that Σ σ → (ϕ1 → ψ). ∀yn ϕ.44 CHAPTER 3. ∀yn ϕ where ϕ = ϕ(y1 . For (⇒). .1. by G. We ﬁnish this section with another form of the Compactness Theorem: Theorem 3. . From Σ = ϕ we obtain Σ = ∀y1 .second form).1. .4. We have to argue that then Σ σ → (ϕ1 → ∀xψ). Σ ∀yϕ if and only if Σ ϕ. Corollary 3. . But then by the 2nd form of the Completeness Theorem Σ ∪ {¬σ} is inconsistent. From the Ltautology σ → (ϕ1 → ψ) → (σ ∧ ϕ1 ) → ψ and MP we get Σ (σ∧ϕ1 ) → ψ.2 Proof of the Completeness Theorem We are now going to prove Theorem 3. The proof is just like that of the corresponding fact of Propositional Logic. The case that ϕ is obtained by part (b) of G is left to the reader.2.1. ∀yϕ. . 3. assume Σ quantiﬁer axiom ∀yϕ → ϕ. we do not assume in those lemmas that Σ is consistent. (⇐) This is Lemma 3.1. ∧ σn → ϕ. If each ﬁnite subset of Σ has a model. Suppose Σ ∪ {σ1 . . yn ). THE COMPLETENESS THEOREM Suppose that ϕ is obtained by part (a) of G. . Unless we say so.7 we get Σ ϕ. This follows from the second form of the Completeness Theorem. Using the Ltautology (σ ∧ ϕ1 ) → ∀xψ → σ → (ϕ1 → ∀xψ) and MP this gives Σ σ → (ϕ1 → ∀xψ).1. we focus our attention on (⇒). ∀yn ϕ if and only if Σ ϕ.7.5 we have Σ σ and thus by Corollary 3. then Σ has a model. It suﬃces to show that then Σ ϕ. Theorem 2. Corollary 3.5. so ϕ is ϕ1 → ∀xψ where x does not occur free in ϕ1 and Σ ∪ {σ} ϕ1 → ψ. Assume the second form of the Completeness Theorem holds.7.1. Since (⇐) is clear. Since x does not occur free in σ∧ϕ1 this gives Σ (σ∧ϕ1 ) → ∀xψ. Lemma 3. Then by Corollary 3. Corollary 3.8. .6. . given a consistent set of sentences Σ we must show that Σ has a model. . The second form of the Completeness Theorem implies the ﬁrst form. .4. 
and so Σ ∪ {¬σ} has no model where σ is the sentence ∀y1 .1.1. . . that is. . Σ ∀y1 .1. We have the Proof. Proof. so by MP we get Σ ϕ. and that Σ = ϕ. .
. By the assumptions and Exercise 5 in Section 2. . . Then Σ ϕ(t/x). . .2. ∧ xn = yn → F x1 . tm .2.2. yn . Take an equality axiom x1 = y1 ∧ .4. 45 Proof. . . . . tm → Rt1 .2. . The relation ∼Σ is an equivalence relation on TL . . . PROOF OF THE COMPLETENESS THEOREM Lemma 3. . . ∧ xm = ym ∧ Rx1 . tn /yn ) = ϕ(t1 /x1 . t3 be Lterms and Σ t1 = t2 and Σ t2 = t3 . . . .1 n times in succession to obtain Σ ψ where ψ = ϕ(y1 /x1 . Then MP gives Σ Rt1 . n. Suppose Σ occur bound in ϕ. .2. . take an equality axiom (x = y ∧ y = z) → (x = z) and apply Lemma 3. . (5) Let F ∈ Lf be nary. . Then Σ ϕ. . . . Then Σ F t1 .2. To prove (4) we ﬁrst use the same Exercise 5 to get Σ t1 = t1 ∧ . (4) Let R ∈ Lr be mary and let t1 . . yn /xn ). Proof. tm . tm . . To ﬁnish. . Then Σ t1 = t3 . Lemma 3. . . tn /xn ). tm . . . Then MP yields Σ t1 = t3 . . . t1 . ym and apply Lemma 3. .1. observe that ψ(t1 /y1 . . take an equality axiom x = y → y = x and apply Lemma 3. Suppose Σ ϕ and t is free for x in ϕ. . . ∧ tm = tm ∧ Rt1 . Use Lemma 3. We deﬁne a binary relation ∼Σ on TL by t1 ∼Σ t2 ⇐⇒ Σ t1 = t2 . . tm be Lterms such that Σ ti = ti for i = 1. . tn . xn = F y1 . . . . (1) For each Lterm t we have t = t. . . m and Σ Rt1 . (3) Let t1 . t2 . . . .2 to get (t1 = t2 ∧ t2 = t3 ) → t1 = t3 . tn . .2. and let t1 .1 again n times to get Σ ψ(t1 /y1 . (2) and (3) of the last lemma yield the following. . yn that do not occur in ϕ or t1 . Part (5) is obtained in a similar way by using an equality axiom x1 = y1 ∧ . tn and that are distinct from x1 . . ∧ tm = tm ∧ Rt1 . . . Apply Lemma 3. tn be Lterms such that Σ ti = ti for i = 1. . From Σ ϕ we get Σ ∀xϕ by Lemma 3. For (1). tm . For (3). .1. . . . xm → Ry1 .2. Lemma 3. . . xn . Then MP gives Σ t = t. Parts (1).3. . use the equality axiom x = x and apply Lemma 3. tm . . . . For (2). .2. Deﬁnition. .3. Then MP together with the quantiﬁer axiom ∀xϕ → ϕ(t/x) gives Σ ϕ(t/x) as required. tn /xn ). .2 to obtain Σ t1 = t1 ∧ .2.2.2.7 we have Σ t1 = t2 ∧ t2 = t3 . 
. . tn /yn ). . . . . t1 . Then Σ Rt1 . t be Lterms and Σ t = t . and t1 . . .2 to get t = t → t = t. Let TL be the set of variablefree Lterms. Lemma 3. Proof. . . Then Σ t = t. . tn are terms whose variables do not ϕ(t1 /x1 . . . .2. . . . Take distinct variables y1 . tn = F t1 . . .2. (2) Let t.
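The relation ∼Σ is not only an equivalence relation but a congruence: by the lemma above, equating arguments forces the compound terms to be equated as well, which is what makes the operations on the quotient well defined. A sketch (not from the notes) that partitions a finite set of ground terms, given some equations, into the classes of the generated congruence:

```python
def classes(terms, equations):
    """Partition a finite list of ground terms (tuple encoding, constants as
    0-ary tuples) by the equivalence generated by the given equations, closed
    under congruence: if corresponding arguments are equated, so are the
    compound terms. Every subterm of a listed term must itself be listed."""
    parent = {t: t for t in terms}
    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t
    def union(a, b):
        parent[find(a)] = find(b)
    for a, b in equations:
        union(a, b)
    changed = True
    while changed:                       # naive congruence closure
        changed = False
        for s in terms:
            for t in terms:
                if (isinstance(s, tuple) and isinstance(t, tuple)
                        and s[0] == t[0] and len(s) == len(t)
                        and find(s) != find(t)
                        and all(find(a) == find(b)
                                for a, b in zip(s[1:], t[1:]))):
                    union(s, t)
                    changed = True
    groups = {}
    for t in terms:
        groups.setdefault(find(t), []).append(t)
    return list(groups.values())

# With c = d given, congruence forces f(c) ~ f(d): two classes remain.
terms = [('c',), ('d',), ('f', ('c',)), ('f', ('d',))]
print(len(classes(terms, [(('c',), ('d',))])))   # 2
```

The term model A_Σ constructed in the text is exactly this quotient taken over all variable-free terms, with ∼Σ generated by the equalities provable from Σ.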
Definition. Let [t] denote the equivalence class of t ∈ TL with respect to ∼Σ.

Suppose L has at least one constant symbol. Then TL is nonempty, and we define the L-structure AΣ as follows:
(i) its underlying set is AΣ := TL/∼Σ;
(ii) if R ∈ L^r is m-ary, then R^{AΣ} ⊆ (AΣ)^m is given by
    ([t1], ..., [tm]) ∈ R^{AΣ} :⇐⇒ Σ ⊢ Rt1...tm    (t1, ..., tm ∈ TL);
(iii) if F ∈ L^f is n-ary, then F^{AΣ}: (AΣ)^n → AΣ is given by
    F^{AΣ}([t1], ..., [tn]) = [Ft1...tn]    (t1, ..., tn ∈ TL).
The reader should verify that this counts as a definition, i.e. does not introduce ambiguity. (Use parts (4) and (5) of Lemma 3.2.3.)

Corollary 3.2.5. Suppose L has a constant symbol and Σ is consistent. Then
(1) for each t ∈ TL we have t^{AΣ} = [t];
(2) for each atomic σ we have: Σ ⊢ σ ⇐⇒ AΣ ⊨ σ.

Proof. Part (1) follows by an easy induction. For (2), let σ be Rt1...tm, where R ∈ L^r is m-ary and t1, ..., tm ∈ TL. Then
    Σ ⊢ Rt1...tm ⇔ ([t1], ..., [tm]) ∈ R^{AΣ} ⇔ AΣ ⊨ Rt1...tm.
Now suppose that σ is t1 = t2, where t1, t2 ∈ TL. Then
    Σ ⊢ t1 = t2 ⇔ [t1] = [t2] ⇔ t1^{AΣ} = t2^{AΣ} ⇔ AΣ ⊨ t1 = t2,
where the last "⇔" follows from the definition of ⊨ together with part (1). We also have Σ ⊢ ⊤ ⇔ AΣ ⊨ ⊤. So far we haven't used the assumption that Σ is consistent, but now we do: the consistency of Σ means that Σ ⊬ ⊥, and AΣ ⊭ ⊥ by definition of ⊨, so Σ ⊢ ⊥ ⇔ AΣ ⊨ ⊥.

Remark. If the equivalence in part (2) of this corollary holds for all σ (not only for atomic σ), then AΣ ⊨ Σ, so we would have found a model of Σ, and be done. But clearly this equivalence can only hold for all σ if Σ has the property that for each σ, either Σ ⊢ σ or Σ ⊢ ¬σ. This property is of interest for other reasons as well, and deserves a name:

Definition. We say that Σ is complete if Σ is consistent, and for each σ either Σ ⊢ σ or Σ ⊢ ¬σ.

Completeness is a strong property, and it can be hard to show that a given set of axioms is complete. (The set of axioms for algebraically closed fields of characteristic 0 is complete; see the end of Section 4.3.)

Example. Let L = LAb and Σ := Ab (the set of axioms for abelian groups), and let σ be the sentence ∃x(x ≠ 0). Then Σ ⊬ σ, since the trivial group does not satisfy σ. Also Σ ⊬ ¬σ, since there are nontrivial abelian groups and σ holds in such groups. Thus Σ is not complete.

A key fact about completeness needed in this chapter is that any consistent set of sentences extends to a complete set of sentences:
Theorem 3.2.6 (Lindenbaum). Suppose Σ is consistent. Then Σ ⊆ Σ′ for some complete set of L-sentences Σ′.

Proof. The proof uses Zorn's Lemma, and is just like that of the corresponding fact of Propositional Logic in Section 2.2.

Completeness is only a necessary condition for the equivalence in part (2) of Corollary 3.2.5 to hold for all σ; another necessary condition is "to have witnesses":

Definition. A Σ-witness for the sentence ∃xϕ(x) is a term t ∈ TL such that Σ ⊢ ϕ(t). We say that Σ has witnesses if there is a Σ-witness for every sentence ∃xϕ(x) proved by Σ.

Lemma 3.2.7. Suppose L has a constant symbol and Σ is consistent. Then the following two conditions are equivalent:
(i) for each σ we have: Σ ⊢ σ ⇔ AΣ ⊨ σ;
(ii) Σ is complete and has witnesses.
In particular, if Σ is complete and has witnesses, then AΣ is a model of Σ.

Proof. It should be clear that (i) implies (ii). For the converse, assume (ii). We use induction on the number of logical symbols in σ to obtain (i). We already know that (i) holds for atomic sentences. The cases σ = ¬σ1, σ = σ1 ∨ σ2, and σ = σ1 ∧ σ2 are treated just as in the proof of the corresponding Lemma 2.2.11 for Propositional Logic. It remains to consider two cases.

Case σ = ∃xϕ(x): (⇒) Suppose that Σ ⊢ σ. Because we are assuming that Σ has witnesses, we have a t ∈ TL such that Σ ⊢ ϕ(t). Then by the inductive hypothesis AΣ ⊨ ϕ(t), hence AΣ ⊨ σ.
(⇐) Assume AΣ ⊨ σ. Then there is an a ∈ AΣ such that AΣ ⊨ ϕ(a). Choose t ∈ TL such that [t] = a. Then t^{AΣ} = a, hence AΣ ⊨ ϕ(t). Applying the inductive hypothesis we get Σ ⊢ ϕ(t). This yields Σ ⊢ ∃xϕ(x) by MP and the quantifier axiom ϕ(t) → ∃xϕ(x).

Case σ = ∀xϕ(x): This is similar to the previous case, but we also need the result from Exercise 7 in Section 2.7 that ⊢ ¬∀xϕ ↔ ∃x¬ϕ.

We call attention to some new notation in the next lemmas: the symbol ⊢_L is used to emphasize that we are dealing with formal provability within L.

Lemma 3.2.8. Let c be a constant symbol not in L, and Lc := L ∪ {c}. Let ϕ(y) be an L-formula and suppose Σ ⊢_{Lc} ϕ(c). Then Σ ⊢_L ϕ(y).

Proof. (Sketch) Take a proof of ϕ(c) from Σ in the language Lc, and take a variable z different from all variables occurring in that proof, and also such that z ≠ y. Replace in every formula in this proof each occurrence of c by z. Check that one obtains in this way a proof of ϕ(z/y) in the language L from Σ. So Σ ⊢_L ϕ(z/y), and hence by Lemma 3.2.1 we have Σ ⊢_L ϕ(z/y)(y/z), that is, Σ ⊢_L ϕ(y).
Lemma 3.2.9. Let c be a constant symbol not in L and put Lc := L ∪ {c}. Assume Σ is consistent and Σ ⊢ ∃yϕ(y). Then Σ ∪ {ϕ(c)} is a consistent set of Lc-sentences.

Proof. Suppose not. Then Σ ∪ {ϕ(c)} ⊢_{Lc} ⊥, so by the Deduction Lemma Σ ⊢_{Lc} ϕ(c) → ⊥. Then by Lemma 3.2.8 we have Σ ⊢_L ϕ(y) → ⊥. By G we have Σ ⊢_L ∃yϕ(y) → ⊥. Applying MP yields Σ ⊢ ⊥, contradicting the consistency of Σ.

Lemma 3.2.10. Suppose Σ is consistent. Let σ1 = ∃x1ϕ1(x1), ..., σn = ∃xnϕn(xn) be such that Σ ⊢ σi for every i = 1, ..., n. Let c1, ..., cn be distinct constant symbols not in L. Put L′ := L ∪ {c1, ..., cn} and Σ′ := Σ ∪ {ϕ1(c1), ..., ϕn(cn)}. Then Σ′ is a consistent set of L′-sentences.

Proof. The previous lemma covers the case n = 1. The general case follows by induction on n.

In the next lemma we use a superscript "w" for "witness."

Lemma 3.2.11. Suppose Σ is consistent. For each L-sentence σ = ∃xϕ(x) such that Σ ⊢ σ, let cσ be a constant symbol not in L, such that if σ′ is a different L-sentence of the form ∃x′ϕ′(x′) provable from Σ, then cσ ≠ cσ′. Put
    Lw := L ∪ {cσ : σ = ∃xϕ(x) is an L-sentence such that Σ ⊢ σ},
    Σw := Σ ∪ {ϕ(cσ) : σ = ∃xϕ(x) is an L-sentence such that Σ ⊢ σ}.
Then Σw is a consistent set of Lw-sentences.

Proof. Suppose not. Then Σw ⊢ ⊥. Take a proof of ⊥ from Σw and let cσ1, ..., cσn be constant symbols in Lw \ L, where σi = ∃xiϕi(xi) for 1 ≤ i ≤ n, such that this proof is a proof of ⊥ in the language L ∪ {cσ1, ..., cσn} from Σ ∪ {ϕ1(cσ1), ..., ϕn(cσn)}. So Σ ∪ {ϕ1(cσ1), ..., ϕn(cσn)} is an inconsistent set of L ∪ {cσ1, ..., cσn}-sentences. This contradicts Lemma 3.2.10.

Lemma 3.2.12. Let (Ln) be an increasing sequence of languages: L0 ⊆ L1 ⊆ L2 ⊆ ..., and denote their union by L∞. Let Σn be a consistent set of Ln-sentences, for each n, such that Σ0 ⊆ Σ1 ⊆ Σ2 ⊆ .... Then the union Σ∞ := ⋃n Σn is a consistent set of L∞-sentences.

Proof. Suppose that Σ∞ ⊢ ⊥. Take a proof of ⊥ from Σ∞. Then we can choose n so large that this is actually a proof of ⊥ from Σn in Ln. This contradicts the consistency of Σn.

Definition. Suppose the language L* extends L, let A be an L-structure, and let A* be an L*-structure. Then A is said to be a reduct of A* (and A* an expansion of A) if A and A* have the same underlying set and the same interpretations of the symbols of L. For example, (N, 0, +) is a reduct of (N, <, 0, 1, +, ·). Note that any L*-structure A* has a unique reduct to an L-structure, which we indicate
by A*|L. A key fact (to be verified by the reader) is that if A is a reduct of A*, then t^A = t^{A*} for all variable-free L_A-terms t, and
    A ⊨ σ ⇐⇒ A* ⊨ σ    for all L_A-sentences σ.

We can now prove the second form of the Completeness Theorem from Section 3.1.

Proof. We construct an increasing sequence (Ln) of languages and a sequence (Σn), where each Σn is a consistent set of Ln-sentences. We begin by setting L0 := L and Σ0 := Σ. Given the language Ln and the consistent set of Ln-sentences Σn, put
    Ln+1 := Ln^w and Σn+1 := Σn^w if n is even;
    Ln+1 := Ln and Σn+1 := Σ′n if n is odd, where (using Theorem 3.2.6) Σ′n ⊇ Σn is a complete set of Ln-sentences.
Here Ln^w and Σn^w are obtained from Ln and Σn in the same way that Lw and Σw are obtained from L and Σ in Lemma 3.2.11.

Note that Ln ⊆ Ln+1 and Σn ⊆ Σn+1. By the previous lemma the set Σ∞ of L∞-sentences is consistent. It is also complete. To see this, let σ be an L∞-sentence, and take n odd and so large that σ is an Ln-sentence. Then Σn+1 ⊢ σ or Σn+1 ⊢ ¬σ, and thus Σ∞ ⊢ σ or Σ∞ ⊢ ¬σ.

We claim that Σ∞ has witnesses. To see this, let σ = ∃xϕ(x) be an L∞-sentence such that Σ∞ ⊢ σ. Take n even and so large that σ is an Ln-sentence and Σn ⊢ σ. Then by construction of Σn+1 = Σn^w we have Σn+1 ⊢ ϕ(cσ), so Σ∞ ⊢ ϕ(cσ).

It follows from Lemma 3.2.7 that Σ∞ has a model, namely AΣ∞. Put A := AΣ∞|L. Then A ⊨ Σ. This concludes the proof of the Completeness Theorem (second form).

In some proofs we shall take advantage of the fact that the Completeness Theorem is now available.

Exercises.
(1) Suppose Σ is consistent. Then Σ is complete if and only if any two models of Σ satisfy the same sentences.
(2) Let L have just a constant symbol c, a unary relation symbol U, and a unary function symbol f, and suppose that Σ ⊢ Ufc and that f does not occur in the sentences of Σ. Then Σ ⊢ ∀xUx.

3.3 Some Elementary Results of Predicate Logic

Here we obtain some generalities of predicate logic: Equivalence and Equality Theorems, Variants, and Prenex Form.
Lemma 3.3.1 (Distribution Rule).
(i) Suppose Σ ⊢ ϕ → ψ. Then Σ ⊢ ∃xϕ → ∃xψ and Σ ⊢ ∀xϕ → ∀xψ.
(ii) Suppose Σ ⊢ ϕ ↔ ψ. Then Σ ⊢ ∃xϕ ↔ ∃xψ and Σ ⊢ ∀xϕ ↔ ∀xψ.

Proof. We only do (i), since (ii) then follows easily. By the Completeness Theorem it suffices to show that A ⊨ ∃xϕ → ∃xψ and A ⊨ ∀xϕ → ∀xψ for every model A of Σ. Let A be a model of Σ; then A ⊨ ϕ → ψ. We shall prove A ⊨ ∃xϕ → ∃xψ and leave the other part to the reader. Choose variables y1, ..., yn such that ϕ = ϕ(x, y1, ..., yn) and ψ = ψ(x, y1, ..., yn). We need only show that then for all a1, ..., an ∈ A,
    A ⊨ ∃xϕ(x, a1, ..., an) → ∃xψ(x, a1, ..., an).
Suppose A ⊨ ∃xϕ(x, a1, ..., an). Then A ⊨ ϕ(a0, a1, ..., an) for some a0 ∈ A. From A ⊨ ϕ → ψ we obtain A ⊨ ϕ(a0, a1, ..., an) → ψ(a0, a1, ..., an), hence A ⊨ ψ(a0, a1, ..., an), and thus A ⊨ ∃xψ(x, a1, ..., an).

Definition. We say that ϕ1 and ϕ2 are Σ-equivalent if Σ ⊢ ϕ1 ↔ ϕ2. (In case Σ = ∅, we just say equivalent.) One verifies easily that Σ-equivalence is an equivalence relation on the set of L-formulas.

Theorem 3.3.2 (Equivalence Theorem). Suppose Σ ⊢ ϕ ↔ ϕ′, and let ψ′ be obtained from ψ by replacing some occurrence of ϕ as a subformula of ψ by ϕ′. Then ψ′ is again a formula and Σ ⊢ ψ ↔ ψ′.

Proof. By induction on the number of logical symbols in ψ. If ψ is atomic, then necessarily ψ = ϕ and ψ′ = ϕ′, and the desired result holds trivially. Suppose that ψ = ¬θ. The case ψ = ϕ (and thus ψ′ = ϕ′) is trivial, so we may assume that the occurrence of ϕ we are replacing is an occurrence in θ. Then the inductive hypothesis gives Σ ⊢ θ ↔ θ′, where θ′ is obtained by replacing that occurrence of ϕ by ϕ′. Then ψ′ = ¬θ′ and the desired result follows easily. The cases ψ = ψ1 ∨ ψ2 and ψ = ψ1 ∧ ψ2 are left as exercises. Suppose that ψ = ∃xθ. Then either ψ = ϕ and ψ′ = ϕ′, and the desired result holds trivially, or the occurrence of ϕ we are replacing is an occurrence inside θ. In the latter case the inductive hypothesis yields Σ ⊢ θ ↔ θ′, where θ′ is obtained by replacing that occurrence of ϕ by ϕ′, so by the Distribution Rule Σ ⊢ ∃xθ ↔ ∃xθ′. The proof is similar if ψ = ∀xθ.

Notation. Given a family (ϕi)i∈I of formulas with finite index set I, we choose a bijection k ↦ i(k): {1, ..., n} → I and set
    ⋁_{i∈I} ϕi := ϕ_{i(1)} ∨ ··· ∨ ϕ_{i(n)},    ⋀_{i∈I} ϕi := ϕ_{i(1)} ∧ ··· ∧ ϕ_{i(n)}.
Of course, these notations ⋁_{i∈I} ϕi and ⋀_{i∈I} ϕi can only be used when the particular choice of bijection of {1, ..., n} with I does not matter; this is usually the case because the equivalence class of ϕ_{i(1)} ∨ ··· ∨ ϕ_{i(n)} does not depend on this choice, and the same is true with "∧" instead of "∨". If I is clear from context we just write ⋁_i ϕi and ⋀_i ϕi instead.
Definition. A variant of a formula is obtained by successive replacements of the following types:
(i) replace an occurrence of a subformula ∃xϕ by ∃yϕ(y/x);
(ii) replace an occurrence of a subformula ∀xϕ by ∀yϕ(y/x);
where in each case y is free for x in ϕ and y does not occur free in ϕ.

Lemma 3.3.3. A formula is equivalent to any of its variants.

Proof. By the Equivalence Theorem (3.3.2) it suffices to show that ⊢ ∃xϕ ↔ ∃yϕ(y/x) and ⊢ ∀xϕ ↔ ∀yϕ(y/x), where y is free for x in ϕ and does not occur free in ϕ. We prove the first equivalence, leaving the second as an exercise. Applying G to the quantifier axiom ϕ(y/x) → ∃xϕ gives ⊢ ∃yϕ(y/x) → ∃xϕ. Similarly we get ⊢ ∃xϕ → ∃yϕ(y/x) (use that ϕ = ϕ(y/x)(x/y) by the assumption on y). An application of Exercise 6 of Section 2.7 finishes the proof.

Definition. A formula in prenex form is a formula Q1x1 ... Qnxn ϕ, where x1, ..., xn are distinct variables, each Qi ∈ {∃, ∀}, and ϕ is quantifier-free. We call Q1x1 ... Qnxn the prefix and ϕ the matrix of the formula. Note that a quantifier-free formula is in prenex form: this is the case n = 0. In particular, atomic formulas are already in prenex form.

In the next lemma Q denotes a quantifier, and Q* denotes the other quantifier: ∃* = ∀ and ∀* = ∃. Instead of "occurrence of ... as a subformula" we say "part ...".

Lemma 3.3.4. The following prenex transformations always change a formula into an equivalent formula:
(1) replace the formula by one of its variants;
(2) replace a part ¬Qxψ by Q*x¬ψ;
(3) replace a part (Qxψ) ∨ θ by Qx(ψ ∨ θ), where x is not free in θ;
(4) replace a part ψ ∨ Qxθ by Qx(ψ ∨ θ), where x is not free in ψ;
(5) replace a part (Qxψ) ∧ θ by Qx(ψ ∧ θ), where x is not free in θ;
(6) replace a part ψ ∧ Qxθ by Qx(ψ ∧ θ), where x is not free in ψ.

We leave the proof of this lemma as an exercise. Note that the free variables of a formula (those that occur free in the formula) do not change under prenex transformations.

Theorem 3.3.5 (Prenex Form). Every formula can be changed into one in prenex form by a finite sequence of prenex transformations. In particular, each formula is equivalent to one in prenex form.

Proof. By induction on the number of logical symbols. To simplify notation, write ϕ =⇒pr ψ to indicate that ψ can be obtained from ϕ by a finite sequence of prenex transformations. Assume inductively that
    ϕ1 =⇒pr Q1x1 ... Qmxm ψ1,    ϕ2 =⇒pr Qm+1y1 ... Qm+nyn ψ2,
where Q1, ..., Qm+n ∈ {∃, ∀}, x1, ..., xm are distinct, y1, ..., yn are distinct, and ψ1 and ψ2 are quantifier-free.
Then for ϕ := ¬ϕ1 we have ϕ =⇒pr ¬Q1x1 ... Qmxm ψ1. Applying m prenex transformations of type (2) we get
    ¬Q1x1 ... Qmxm ψ1 =⇒pr Q1*x1 ... Qm*xm ¬ψ1,
hence ϕ =⇒pr Q1*x1 ... Qm*xm ¬ψ1, and the right-hand side is in prenex form.

Next, let ϕ := ϕ1 ∨ ϕ2. The assumptions above yield
    ϕ =⇒pr (Q1x1 ... Qmxm ψ1) ∨ (Qm+1y1 ... Qm+nyn ψ2).
Replacing here the RHS (right-hand side) by a variant, we may assume that {x1, ..., xm} ∩ {y1, ..., yn} = ∅, and also that no xi occurs free in ψ2 and no yj occurs free in ψ1. Applying m + n prenex transformations of types (3) and (4) we obtain
    (Q1x1 ... Qmxm ψ1) ∨ (Qm+1y1 ... Qm+nyn ψ2) =⇒pr Q1x1 ... Qmxm Qm+1y1 ... Qm+nyn (ψ1 ∨ ψ2),
hence ϕ =⇒pr Q1x1 ... Qmxm Qm+1y1 ... Qm+nyn (ψ1 ∨ ψ2). Similarly, to deal with ϕ1 ∧ ϕ2, we apply prenex transformations of types (5) and (6).

Next, let ϕ := ∃xϕ1. Applying prenex transformations of type (1) we can assume that x1, ..., xm are different from x. Then ϕ =⇒pr ∃xQ1x1 ... Qmxm ψ1, and the right-hand side is in prenex form. The case ϕ := ∀xϕ1 is similar.

We finish this section with results on equalities.

An occurrence of an L-term τ in ϕ is said to be inside quantifier scope if the occurrence is inside an occurrence of a subformula ∃xψ or ∀xψ in ϕ. An occurrence of an L-term τ in ϕ is said to be outside quantifier scope if the occurrence is not inside quantifier scope. Note that, by the facts on admissible words in Chapter 2, the result of replacing an occurrence of an L-term τ in an L-term t by an L-term τ′ is again an L-term t′.

Proposition 3.3.6. Let τ and τ′ be L-terms such that Σ ⊢ τ = τ′, and let t′ be obtained from t by replacing an occurrence of τ in t by τ′. Then Σ ⊢ t = t′.

Proof. We proceed by induction on terms. First note that if t = τ, then t′ = τ′, and Σ ⊢ t = t′ by assumption. This fact takes care of the case that t is a variable. Suppose t = Ft1...tn, where F ∈ L^f is n-ary and t1, ..., tn are L-terms, and assume t ≠ τ. Using the facts on admissible words, including exercise 5, we see that t′ = Ft′1...t′n, where for some i ∈ {1, ..., n} we have t′j = tj for all j ≠ i, and t′i is obtained from ti by replacing an occurrence of τ in ti by τ′. Inductively we can assume that Σ ⊢ t1 = t′1, ..., Σ ⊢ tn = t′n. Then Σ ⊢ t = t′ by part (5) of Lemma 3.2.3.
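The proof of Theorem 3.3.5 above is effectively an algorithm: recursively put subformulas in prenex form, rename bound variables to variants, and pull quantifiers across ¬, ∨ and ∧ by transformations (2)-(6). The following Python sketch is not from the notes: the tuple encoding of formulas and the fresh-variable scheme are our own assumptions, and ∧ is handled exactly like ∨.

```python
# Formula AST: ('atom', name, vars), ('not', φ), ('or', φ, ψ), ('and', φ, ψ),
# ('E', x, φ) for ∃xφ, ('A', x, φ) for ∀xφ.  Variables are strings.
import itertools

fresh = (f"z{n}" for n in itertools.count())   # globally fresh variable names

def rename(phi, old, new):     # capture-free because `new` is globally fresh
    tag = phi[0]
    if tag == 'atom':
        return ('atom', phi[1], tuple(new if v == old else v for v in phi[2]))
    if tag == 'not':
        return ('not', rename(phi[1], old, new))
    if tag in ('or', 'and'):
        return (tag, rename(phi[1], old, new), rename(phi[2], old, new))
    q, x, body = phi
    if x == old:               # `old` is re-bound here; leave the body alone
        return phi
    return (q, x, rename(body, old, new))

def prenex(phi):
    tag = phi[0]
    if tag == 'atom':
        return phi
    if tag == 'not':           # transformation (2): ¬Qxψ => Q*x¬ψ
        inner = prenex(phi[1])
        if inner[0] in 'EA':
            q, x, body = inner
            return (('A' if q == 'E' else 'E'), x, prenex(('not', body)))
        return ('not', inner)
    if tag in ('or', 'and'):   # (1) and (3)-(6): rename to a variant, pull out
        left, right = prenex(phi[1]), prenex(phi[2])
        if left[0] in 'EA':
            q, x, body = left
            y = next(fresh)
            return (q, y, prenex((tag, rename(body, x, y), right)))
        if right[0] in 'EA':
            q, x, body = right
            y = next(fresh)
            return (q, y, prenex((tag, left, rename(body, x, y))))
        return (tag, left, right)
    q, x, body = phi           # quantifier case of the induction
    return (q, x, prenex(body))

phi = ('or', ('E', 'x', ('atom', 'P', ('x',))), ('A', 'x', ('atom', 'Q', ('x',))))
print(prenex(phi))   # ∃z0 ∀z1 (P(z0) ∨ Q(z1)), as in the ϕ1 ∨ ϕ2 case above
```

Renaming with a globally fresh variable is what licenses transformations (3)-(6): the pulled-out variable is then never free in the other disjunct or conjunct.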
Proposition 3.3.7. Let τ and τ′ be L-terms such that Σ ⊢ τ = τ′, and let ϕ′ be obtained from ϕ by replacing an occurrence of τ in ϕ outside quantifier scope by τ′. Then Σ ⊢ ϕ ↔ ϕ′.

Proof. First take care of the case that ϕ is atomic by a similar argument as for terms, and then proceed by the usual kind of induction on formulas.

Exercises.
(1) Let P be a unary relation symbol, Q a binary relation symbol, and x, y distinct variables. Use prenex transformations to put
    ∀x∃y(P(x) ∧ Q(x, y)) → ∃x∀y(Q(x, y) → P(y))
into prenex form.
(2) Let (ϕi)i∈I be a family of formulas with finite index set I. Then
    ⊢ ∃x ⋁_{i∈I} ϕi ↔ ⋁_{i∈I} ∃xϕi,    ⊢ ∀x ⋀_{i∈I} ϕi ↔ ⋀_{i∈I} ∀xϕi.
(3) If ϕ1(x1, ..., xm) and ϕ2(x1, ..., xm) are existential formulas, then (ϕ1 ∨ ϕ2)(x1, ..., xm) and (ϕ1 ∧ ϕ2)(x1, ..., xm) are equivalent to existential formulas ϕ∨(x1, ..., xm) and ϕ∧(x1, ..., xm), respectively. The same holds with "existential" replaced by "universal".
(4) A formula is said to be unnested if each atomic subformula has the form Rx1...xm with m-ary R ∈ L^r ∪ {⊤, ⊥, =} and distinct variables x1, ..., xm, or the form Fx1...xn = xn+1 with n-ary F ∈ L^f and distinct variables x1, ..., xn+1. (This allows ⊤ and ⊥ as atomic subformulas of unnested formulas.) Then:
(i) for each term t(x1, ..., xm) and variable y ∉ {x1, ..., xm}, the formula t(x1, ..., xm) = y is equivalent to an unnested existential formula θ1(x1, ..., xm, y), and also to an unnested universal formula θ2(x1, ..., xm, y);
(ii) each atomic formula ϕ(y1, ..., yn) is equivalent to an unnested existential formula ϕ1(y1, ..., yn), and also to an unnested universal formula ϕ2(y1, ..., yn);
(iii) each formula ϕ(y1, ..., yn) is equivalent to an unnested formula ϕu(y1, ..., yn).

3.4 Equational Classes and Universal Algebra

The term-structure AΣ introduced in the proof of the Completeness Theorem also plays a role in what is called universal algebra. This is a general setting for constructing mathematical objects by generators and relations. Free groups, polynomial rings, tensor products of various kinds, and so on, are all special cases of a single construction in universal algebra.

In this section we fix a language L that has only function symbols, including at least one constant symbol. So L has no relation symbols. Instead of "L-structure" we say "L-algebra", and A, B denote L-algebras. A substructure of
A is also called a subalgebra of A, and a quotient algebra of A is an L-algebra A/∼, where ∼ is a congruence on A. We call A trivial if |A| = 1. There is up to isomorphism exactly one trivial L-algebra.

Definition. An L-identity is an L-sentence
    ∀x (s1(x) = t1(x) ∧ ··· ∧ sn(x) = tn(x)),
where x = (x1, ..., xm), x1, ..., xm are distinct variables, ∀x abbreviates ∀x1 ... ∀xm, and s1, ..., sn, t1, ..., tn are L-terms.

Given a set Σ of L-identities, we define a Σ-algebra to be an L-algebra that satisfies all identities in Σ; in other words, a Σ-algebra is the same as a model of Σ. To such a Σ we associate the class Mod(Σ) of all Σ-algebras. A class C of L-algebras is said to be equational if there is a set Σ of L-identities such that C = Mod(Σ).

Examples. With L = LGr, Gr is a set of L-identities, and Mod(Gr), the class of groups, is the corresponding equational class of L-algebras. With L = LRi, Ri is a set of L-identities, and Mod(Ri), the class of rings, is the corresponding equational class of L-algebras. If one adds to Ri the identity ∀x∀y(xy = yx) expressing the commutative law, then the corresponding class is the class of commutative rings.

Theorem 3.4.1 (Birkhoff). Let C be a class of L-algebras. Then the class C is equational if and only if the following conditions are satisfied:
(1) closure under isomorphism: if A ∈ C and A ≅ B, then B ∈ C;
(2) the trivial L-algebra belongs to C;
(3) every subalgebra of any algebra in C belongs to C;
(4) every quotient algebra of any algebra in C belongs to C;
(5) the product of any family (Ai) of algebras in C belongs to C.

It is easy to see that if C is equational, then conditions (1)-(5) are satisfied. (For (3) and (4) one can also appeal to Exercises 7 and 8 of Section 2.5.) Towards a proof of the converse, we need some universal-algebraic considerations that are of interest beyond the connection to Birkhoff's theorem.

For the rest of this section we fix a set Σ of L-identities. Associated to Σ is the term algebra AΣ, whose elements are the equivalence classes [t] of variable-free L-terms t, where two such terms s and t are equivalent iff Σ ⊢ s = t.

Lemma 3.4.2. AΣ is a Σ-algebra.

Proof. Consider an identity ∀x (s1(x) = t1(x) ∧ ··· ∧ sn(x) = tn(x)) in Σ, with x = (x1, ..., xm); let j ∈ {1, ..., n} and put s = sj and t = tj. Let a1, ..., am ∈ AΣ and put A = AΣ. It suffices to show that then s(a1, ..., am)^A = t(a1, ..., am)^A. Take
variable-free L-terms α1, ..., αm such that a1 = [α1], ..., am = [αm]. Then by part (1) of Corollary 3.2.5 we have a1 = α1^A, ..., am = αm^A, so
    s(a1, ..., am)^A = s(α1, ..., αm)^A = [s(α1, ..., αm)],    t(a1, ..., am)^A = t(α1, ..., αm)^A = [t(α1, ..., αm)].
Now Σ ⊢ s(α1, ..., αm) = t(α1, ..., αm), so [s(α1, ..., αm)] = [t(α1, ..., αm)], and thus s(a1, ..., am)^A = t(a1, ..., am)^A, as desired.

Note that if s, t ∈ TL and [s] = [t], then Σ ⊢ s = t.

Actually, we are going to show that AΣ is a so-called initial Σ-algebra.

Definition. An initial Σ-algebra is a Σ-algebra A such that for every Σ-algebra B there is exactly one homomorphism A → B.

Lemma 3.4.3. Suppose A and B are both initial Σ-algebras. Then there is a unique isomorphism A → B.

Proof. Let i and j be the unique homomorphisms A → B and B → A, respectively. Then we have homomorphisms j ∘ i: A → A and i ∘ j: B → B, so necessarily j ∘ i = idA and i ∘ j = idB, so i and j are isomorphisms. So if there is an initial Σ-algebra, it is unique up-to-unique-isomorphism.

Lemma 3.4.4. AΣ is an initial Σ-algebra.

Proof. Let B be any Σ-algebra. If s, t ∈ TL and [s] = [t], then Σ ⊢ s = t, so s^B = t^B. Thus we have a map AΣ → B, [t] ↦ t^B. It is easy to check that this map is a homomorphism AΣ → B. By Exercise 4 in Section 2.4 it is the only such homomorphism.

For example, the trivial group is an initial Gr-algebra, and the ring of integers is an initial Ri-algebra.

Free algebras. Let I be an index set in what follows. Let A be a Σ-algebra and (ai)i∈I an I-indexed family of elements of A. Then A is said to be a free Σ-algebra on (ai) if for every Σ-algebra B and every I-indexed family (bi) of elements of B there is exactly one homomorphism h: A → B such that h(ai) = bi for all i ∈ I. We also express this by "(A, (ai)) is a free Σ-algebra". A itself is sometimes referred to as a free Σ-algebra if there is a family (aj)j∈J in A such that (A, (aj)) is a free Σ-algebra.

As an example, take L = LRi and cRi := Ri ∪ {∀x∀y(xy = yx)}, where x, y are distinct variables. So the cRi-algebras are just the commutative rings. Let Z[X1, ..., Xn] be the ring of polynomials in distinct indeterminates X1, ..., Xn over Z. For any commutative ring R and elements b1, ..., bn ∈ R we have a
unique ring homomorphism Z[X1, ..., Xn] → R that sends Xi to bi for i = 1, ..., n, namely the evaluation map (or substitution map)
    f(X1, ..., Xn) ↦ f(b1, ..., bn).
Thus Z[X1, ..., Xn] is a free commutative ring on (Xi)1≤i≤n.

For a simpler example, let L = LMo := {1, ·} ⊆ LGr be the language of monoids, and consider
    Mo := {∀x(1 · x = x ∧ x · 1 = x), ∀x∀y∀z((xy)z = x(yz))},
where x, y, z are distinct variables. A monoid, or semigroup with identity, is a model A = (A, 1, ·) of Mo; we call 1 ∈ A the identity of the monoid A, and · its product operation. Let E* be the set of words on an alphabet E, and consider E* as a monoid by taking the empty word as its identity and the concatenation operation (v, w) ↦ vw as its product operation. Then E* is a free monoid on the family (e)e∈E of words of length 1, because for any monoid B and elements be ∈ B (for e ∈ E) we have a unique monoid homomorphism E* → B that sends each e ∈ E to be, namely e1...en ↦ be1 ··· ben.

Remark. Let (A, (ai)i∈I) be a free Σ-algebra. Then A is generated by (ai). To see why, let B be the subalgebra of A generated by (ai). Then we have a unique homomorphism h: A → B such that h(ai) = ai for all i ∈ I, and then the composition A → B → A (h followed by the inclusion) is necessarily idA, so B = A.

If A and B are both free Σ-algebras on (ai) and (bi) respectively, with the same index set I, and g: A → B and h: B → A are the unique homomorphisms such that g(ai) = bi and h(bi) = ai for all i, then (h ∘ g)(ai) = ai for all i, so h ∘ g = idA, and likewise g ∘ h = idB; so g is an isomorphism with inverse h. Thus, given I, there is, up to unique isomorphism preserving I-indexed families, at most one free Σ-algebra on an I-indexed family of its elements.

We shall now construct free Σ-algebras as initial algebras by working in an extended language. Let LI := L ∪ {ci : i ∈ I} be the language L augmented by new constant symbols ci for i ∈ I, where new means that ci ∉ L for i ∈ I and ci ≠ cj for distinct i, j ∈ I. So an LI-algebra (B, (bi)) is just an L-algebra B equipped with an I-indexed family (bi) of elements of B. Let ΣI be Σ viewed as a set of LI-identities. Then a free Σ-algebra on an I-indexed family of its elements is just an initial ΣI-algebra. In particular, the ΣI-algebra AΣI is a free Σ-algebra on ([ci]). Thus, given I, there is a free Σ-algebra on an I-indexed family of its elements, and it is unique up to unique isomorphism of ΣI-algebras.
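The universal property of the free monoid E* described above is directly computable: given any monoid B and an assignment e ↦ be, the unique extending homomorphism is forced to multiply out the letters of a word. A minimal Python sketch (our own function names; B is assumed to be given by its identity element and product operation):

```python
def free_monoid_hom(assignment, identity, op):
    """The unique monoid homomorphism from the free monoid E* (words over E
    under concatenation, identity the empty word) to a monoid B, extending
    e -> assignment[e]: the word e1...en must map to b_e1 * ... * b_en."""
    def h(word):
        val = identity
        for e in word:
            val = op(val, assignment[e])
        return val
    return h

# Take B = (N, +, 0) and send a -> 2, b -> 3:
h = free_monoid_hom({'a': 2, 'b': 3}, 0, lambda x, y: x + y)
v, w = 'ab', 'ba'
print(h(v + w) == h(v) + h(w), h('') == 0)   # True True: h respects · and 1
```

Uniqueness is visible in the code: once h is required to be a homomorphism with the given values on one-letter words, the loop body is the only possible definition.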
Proof of Birkhoff's theorem. Let us say that a class C of L-algebras is closed if it has properties (1)-(5) listed in Theorem 3.4.1. Assume C is closed; we have to show that then C is equational. Let Σ(C) be the set of L-identities ∀x (s(x) = t(x)) that are true in all algebras of C. It is clear that each algebra in C is a Σ(C)-algebra, and it remains to show that every Σ(C)-algebra belongs to C.

Here is the key fact from which this will follow:

Claim. If A is an initial Σ(C)-algebra, then A ∈ C.

To prove this claim we take A := AΣ(C). Let s, t ∈ TL be such that [s] ≠ [t] in A = AΣ(C). Then s = t does not belong to Σ(C), so we may pick Bs,t ∈ C such that Bs,t ⊭ s = t, and we let hs,t: A → Bs,t be the unique homomorphism; then hs,t([s]) = s^{Bs,t} ≠ t^{Bs,t} = hs,t([t]). Let B := ∏ Bs,t, where the product is over all pairs (s, t) as above, and let h: A → B be the homomorphism given by h(a) = (hs,t(a)). Then h is injective. This injectivity gives A ≅ h(A) ⊆ B. Note that B ∈ C, so h(A) ∈ C, and thus A ∈ C. This finishes the proof of the claim.

Now, every Σ(C)-algebra is isomorphic to a quotient of a free Σ(C)-algebra; indeed, for any Σ take a Σ-algebra B, take any family (bj)j∈J in B that generates B, take a free Σ-algebra (A, (aj)j∈J), and take the unique homomorphism h: (A, (aj)) → (B, (bj)). Then h(t^A(aj1, ..., ajn)) = t^B(bj1, ..., bjn) for all L-terms t(x1, ..., xn) and j1, ..., jn ∈ J, so h(A) = B, and thus h induces an isomorphism A/∼h ≅ B. We have shown: every Σ-algebra is isomorphic to a quotient of a free Σ-algebra. (This fact can sometimes be used to reduce problems on Σ-algebras to the case of free Σ-algebras; see the next subsection for an example.)

Since C is closed under quotients and isomorphism, it remains to show that free Σ(C)-algebras belong to C. Let A be a free Σ(C)-algebra on (ai)i∈I, and let CI be the class of all LI-algebras (B, (bi)) with B ∈ C. It is clear that CI is closed as a class of LI-algebras, and (A, (ai)) is easily seen to be an initial Σ(CI)-algebra. By the claim above, applied to CI in place of C, we obtain (A, (ai)) ∈ CI, and thus A ∈ C.

Generators and relations. Let G be any set. Take a free Σ-algebra (A, (ιg)g∈G); so A is free as a Σ-algebra on (ιg)g∈G, where we write ι for the map g ↦ ιg: G → A. Then we have a Σ-algebra A with a map ι: G → A such that for any Σ-algebra B and any map j: G → B there is a unique homomorphism h: A → B such that h ∘ ι = j. Note that A is generated as an L-algebra by ιG. Note also that if (A′, ι′) (with ι′: G → A′) has the same universal property as (A, ι), then the unique homomorphism h: A → A′ such that h ∘ ι = ι′ is an isomorphism, so this universal property determines the pair (A, ι) up-to-unique-isomorphism. So there is no harm in calling (A, ι) the free Σ-algebra on G. Here is a particular way of constructing the free Σ-algebra on G. Take the language LG := L ∪ G (disjoint union) with the elements of G as constant
symbols. Let Σ(G) be Σ considered as a set of LG-identities. Then AΣ(G), viewed as a Σ-algebra together with the map g ↦ [g]: G → AΣ(G), is the free Σ-algebra on G.

Next, let R be a set of LG-sentences s(g) = t(g), where s(x1, ..., xn) and t(x1, ..., xn) are L-terms and g = (g1, ..., gn) ∈ G^n (with n depending on the sentence). We wish to construct the Σ-algebra generated by G with R as set of relations.¹ This object is described up-to-isomorphism in the next lemma.

Lemma 3.4.5. There is a Σ-algebra A(G, R) with a map i: G → A(G, R) such that:
(1) A(G, R) ⊨ s(ig) = t(ig) for all s(g) = t(g) in R;
(2) for any Σ-algebra B and map j: G → B with B ⊨ s(jg) = t(jg) for all s(g) = t(g) in R, there is a unique homomorphism h: A(G, R) → B such that h ∘ i = j.

Proof. Let Σ(R) := Σ ∪ R, viewed as a set of LG-sentences, let A(G, R) := AΣ(R), and define i: G → A(G, R) by i(g) = [g]. As before one sees that the universal property of the lemma is satisfied.

¹The use of the term "relations" here has nothing to do with n-ary relations on sets.
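For a concrete instance of Lemma 3.4.5, take Σ = Mo, G = {a, b}, and the single relation a·b = b·a. The Σ-algebra A(G, R) is then the free commutative monoid on two generators: every word over G is congruent to its alphabetically sorted form, so sorted words are normal forms. The Python sketch below is our own illustration (the encoding of elements as sorted strings and all names are assumptions, not the notes' construction):

```python
# Σ = Mo, G = {'a', 'b'}, R = {a·b = b·a}.  With this single relation every
# word over G is equivalent to its sorted form, so A(G, R) can be realized
# as the set of sorted words, i.e. the free commutative monoid N^2.

def normal_form(word):
    return ''.join(sorted(word))

def op(v, w):                      # the product of A(G, R) on normal forms
    return normal_form(v + w)

identity = ''                      # the empty word
i = lambda g: g                    # the map i: G -> A(G, R)

def universal_map(j, b_identity, b_op):
    """The unique homomorphism h: A(G, R) -> B with h o i = j.  It is well
    defined only if B satisfies the relation, i.e. j['a'] and j['b'] commute
    in B, matching the hypothesis of part (2) of the lemma."""
    def h(word):
        val = b_identity
        for g in word:
            val = b_op(val, j[g])
        return val
    return h

# B = (Z, +, 0) is commutative, so any assignment j qualifies:
h = universal_map({'a': 2, 'b': 5}, 0, lambda x, y: x + y)
v, w = op(i('a'), i('b')), op(i('b'), i('a'))
print(v == w)                      # True: the relation ab = ba holds in A(G, R)
print(h(op(v, w)) == h(v) + h(w))  # True: h is a homomorphism
```

Choosing a noncommutative B together with a j whose values do not commute would violate the hypothesis of part (2), and indeed h would then fail to be well defined on normal forms.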
Chapter 4

Some Model Theory

In this chapter we first derive the Löwenheim-Skolem Theorem. Next we develop some basic methods related to proving completeness of a given set of axioms: Vaught's Test, back-and-forth, quantifier elimination. Each of these methods, when applicable, achieves a lot more than just establishing completeness.

4.1 Löwenheim-Skolem; Vaught's Test

In this section we have the same conventions concerning L, t, ϕ, ψ, θ, σ and Σ as in the beginning of Section 2.6, unless specified otherwise. Below, the cardinality of a structure is defined to be the cardinality of its underlying set.

Theorem 4.1.1 (Countable Löwenheim-Skolem Theorem). Suppose L is countable and Σ has a model. Then Σ has a countable model.

Proof. Since Var is countable, the hypothesis that L is countable yields that the set of L-sentences is countable. The union of countably many countable sets is countable; hence the language
    L ∪ {cσ : Σ ⊢ σ, where σ is an L-sentence of the form ∃xϕ(x)}
is countable, that is, adding witnesses keeps the language countable. Hence the set L∞ constructed in the proof of the Completeness Theorem is countable. It follows that there are only countably many variable-free L∞-terms, hence AΣ∞ is countable, and thus its reduct AΣ∞|L is a countable model of Σ.

Remark. The above proof is the first time that we used the countability of Var = {v0, v1, v2, ...}. As promised earlier, we shall now indicate why the Downward Löwenheim-Skolem Theorem goes through without assuming that Var is countable. Suppose that Var is uncountable. Take a countably infinite subset Var′ ⊆ Var. Then each sentence is equivalent to one whose variables are all from Var′. By replacing each sentence in Σ by an equivalent one all of whose variables are
from Var′, we obtain a countable set Σ′ of sentences such that Σ and Σ′ have the same models. As in the proof above, we obtain a countable model of Σ′, working throughout in the setting where only variables from Var′ are used in terms and formulas. This model is a countable model of Σ.

The following test can be useful in showing that a set of axioms Σ is complete.

Proposition 4.1.2 (Vaught's Test). Let L be countable, suppose Σ has a model, and suppose that all countable models of Σ are isomorphic. Then Σ is complete.

Proof. Suppose Σ is not complete. Then there is σ such that Σ ⊬ σ and Σ ⊬ ¬σ. Hence by the Löwenheim–Skolem Theorem there is a countable model A of Σ such that A ⊨ ¬σ, and there is a countable model B of Σ such that B ⊨ σ. We have A ≅ B, contradiction.

Example. Let L = ∅, so the L-structures are just the nonempty sets. Let Σ = {σ2, σ3, ...} where σn := ∃x1 ... ∃xn ⋀_{1≤i<j≤n} xi ≠ xj. The models of Σ are exactly the infinite sets. All countable models of Σ are countably infinite and hence isomorphic. Thus by Vaught's Test Σ is complete.

In this example the hypothesis of Vaught's Test is trivially satisfied. In other cases it may require work to check this hypothesis. One general method in model theory, Back-and-Forth, is often used to verify the hypothesis of Vaught's Test. The proof of the next theorem shows Back-and-Forth in action. To formulate that theorem we define a totally ordered set to be a structure (A, <) for the language L_O that satisfies the following axioms (where x, y, z are distinct variables):

∀x ¬(x < x),
∀x∀y∀z((x < y ∧ y < z) → x < z),
∀x∀y(x < y ∨ x = y ∨ y < x).

Let To be the set consisting of these three axioms. A totally ordered set is said to be dense if it satisfies in addition the axiom ∀x∀y(x < y → ∃z(x < z ∧ z < y)), and it is said to have no endpoints if it satisfies the axiom ∀x∃y∃z(y < x ∧ x < z). Let To(dense without endpoints) be To augmented by these two extra axioms. So (Q, <) and (R, <) are dense totally ordered sets without endpoints.

Theorem 4.1.3 (Cantor). Any two countable dense totally ordered sets without endpoints are isomorphic.
Proof. Let (A, <) and (B, <) be countable dense totally ordered sets without endpoints. So A = {an : n ∈ N} and B = {bn : n ∈ N}. We define by recursion a sequence (αn) in A and a sequence (βn) in B as follows: put α0 := a0 and β0 := b0. Let n > 0, and suppose we have distinct α0, ..., α(n−1) in A and distinct β0, ..., β(n−1) in B such that for all i, j < n we have αi < αj ⟺ βi < βj. Then we define αn ∈ A and βn ∈ B as follows:

Case 1: n is even. (Here we go forth.) First take k ∈ N minimal such that ak ∉ {α0, ..., α(n−1)}, then take l ∈ N minimal such that bl is situated with respect to β0, ..., β(n−1) as ak is situated with respect to α0, ..., α(n−1), that is, l is minimal such that for i = 0, ..., n − 1 we have: αi < ak ⟺ βi < bl, and αi > ak ⟺ βi > bl. (The reader should check that such an l exists: that is where density and "no endpoints" come in.) Put αn := ak and βn := bl.

Case 2: n is odd. (Here we go back.) First take l ∈ N minimal such that bl ∉ {β0, ..., β(n−1)}, next take k ∈ N minimal such that ak is situated with respect to α0, ..., α(n−1) as bl is situated with respect to β0, ..., β(n−1), that is, k is minimal such that for i = 0, ..., n − 1 we have: αi < ak ⟺ βi < bl, and αi > ak ⟺ βi > bl. Put βn := bl and αn := ak.

One proves easily by induction on n that then an ∈ {α0, ..., α2n} and bn ∈ {β0, ..., β2n}. Thus we have a bijection αn ↦ βn : A → B, and this bijection is an isomorphism (A, <) → (B, <).

In combination with Vaught's Test this gives:

Corollary 4.1.4. To(dense without endpoints) is complete.

In the results below we let κ denote an infinite cardinal. Recall that the set of ordinals λ < κ has cardinality κ. We have the following generalization of the Löwenheim–Skolem theorem.

Theorem 4.1.5 (Generalized Löwenheim–Skolem Theorem). Suppose L has size at most κ and Σ has an infinite model. Then Σ has a model of cardinality κ.

Proof. Let {cλ}λ<κ be a family of κ new constant symbols that are not in L and are pairwise distinct (that is, cλ ≠ cμ for λ < μ < κ). Let L′ = L ∪ {cλ : λ < κ} and let Σ′ = Σ ∪ {cλ ≠ cμ : λ < μ < κ}. Note that L′ also has size at most κ. We claim that Σ′ is consistent. To see this it suffices to show that, given any finite set Λ ⊆ κ, the set of L′-sentences ΣΛ := Σ ∪ {cλ ≠ cμ : λ, μ ∈ Λ, λ ≠ μ} has a model. Take an infinite model A of Σ. We make an L′-expansion AΛ of A by interpreting distinct cλ's with λ ∈ Λ by distinct elements of A, and interpreting the cλ's with λ ∉ Λ arbitrarily. Then AΛ is a model of ΣΛ, which establishes the claim. The same arguments we used in proving the countable version of the Löwenheim–Skolem Theorem show that then Σ′ has a model B′ = (B, ..., (bλ)λ<κ) of cardinality at most κ. We have bλ ≠ bμ for λ < μ < κ, hence B is a model of Σ of cardinality κ.
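To see the recursion in Cantor's proof at work, here is a small computational sketch. It is not part of the formal development: the two countable dense orders are taken to be Q and the dyadic rationals Z[1/2], the enumerations and the number of steps are arbitrary illustrative choices, and all function names are ours. The even ("forth") and odd ("back") steps are carried out literally, always taking least-index elements.

```python
from fractions import Fraction
from itertools import count
from math import gcd

def rationals():
    """Enumerate Q without repeats, by height |p| + q."""
    for h in count(1):
        for q in range(1, h + 1):
            p = h - q
            for num in ([p, -p] if p else [0]):
                if gcd(abs(num), q) == 1:
                    yield Fraction(num, q)

def dyadics():
    """Enumerate the dyadic rationals Z[1/2] without repeats."""
    seen = set()
    for h in count(0):
        for e in range(h + 1):
            for num in range(-h, h + 1):
                f = Fraction(num, 2 ** e)
                if f not in seen:
                    seen.add(f)
                    yield f

def situated_same(a, alphas, b, betas):
    """b sits with respect to betas exactly as a sits with respect to alphas."""
    return all((x < a) == (y < b) and (x > a) == (y > b)
               for x, y in zip(alphas, betas))

alpha, beta = [next(rationals())], [next(dyadics())]
for n in range(1, 16):
    if n % 2 == 0:   # forth: least-index unused a_k, then a matching b_l
        a = next(x for x in rationals() if x not in alpha)
        b = next(y for y in dyadics()
                 if y not in beta and situated_same(a, alpha, y, beta))
    else:            # back: least-index unused b_l, then a matching a_k
        b = next(y for y in dyadics() if y not in beta)
        a = next(x for x in rationals()
                 if x not in alpha and situated_same(b, beta, x, alpha))
    alpha.append(a)
    beta.append(b)

# the finite map alpha[i] -> beta[i] is order-preserving, as in the proof
assert all((alpha[i] < alpha[j]) == (beta[i] < beta[j])
           for i in range(16) for j in range(16))
```

Membership in `alpha`/`beta` plays the role of "already used"; density and the absence of endpoints guarantee that each `next(...)` search terminates, exactly as in the existence check left to the reader in the proof.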
The next proposition is Vaught's Test for arbitrary languages and cardinalities.

Proposition 4.1.6. Suppose L has size at most κ, Σ has a model, and all models of Σ are infinite. Suppose also that any two models of Σ of cardinality κ are isomorphic. Then Σ is complete.

Proof. Let σ be an L-sentence and suppose that Σ ⊬ σ and Σ ⊬ ¬σ. We will derive a contradiction. First, Σ ⊬ σ means that Σ ∪ {¬σ} has a model; similarly Σ ⊬ ¬σ means that Σ ∪ {σ} has a model. These models must be infinite since they are models of Σ, so by the Generalized Löwenheim–Skolem Theorem Σ ∪ {¬σ} has a model A of cardinality κ, and Σ ∪ {σ} has a model B of cardinality κ. By assumption A ≅ B, contradicting that A ⊨ ¬σ and B ⊨ σ.

We now discuss in detail one particular application of this generalized Vaught Test. Fix a field F. A vector space over F is an abelian (additively written) group V equipped with a scalar multiplication operation F × V → V : (λ, v) ↦ λv such that for all λ, μ ∈ F and all v, w ∈ V:
(i) (λ + μ)v = λv + μv;
(ii) λ(v + w) = λv + λw;
(iii) 1v = v;
(iv) (λμ)v = λ(μv).

Let L_F be the language of vector spaces over F: it extends the language L_Ab = {0, −, +} of abelian groups with unary function symbols fλ, one for each λ ∈ F. A vector space V over F is viewed as an L_F-structure by interpreting each fλ as the function v ↦ λv : V → V. One easily specifies a set Σ_F of sentences whose models are exactly the vector spaces over F. Note that Σ_F is not complete, since the trivial vector space satisfies ∀x(x = 0) but F viewed as vector space over F does not. Note also that if F itself is infinite then each nontrivial vector space over F is infinite; if F is finite, then we have also nontrivial finite vector spaces. From a model-theoretic perspective finite structures are somewhat exceptional, so we are going to restrict attention to infinite vector spaces over F. Let x1, x2, ... be a sequence of distinct variables and put

Σ∞_F := Σ_F ∪ {∃x1 ... ∃xn ⋀_{1≤i<j≤n} xi ≠ xj : n = 2, 3, ...}.

So the models of Σ∞_F are exactly the infinite vector spaces over F.

We will need the following facts about vector spaces V and W over F. (Proofs can be found in various texts.)

Fact. (a) V has a basis B, that is, B ⊆ V, and for each vector v ∈ V there is a unique family (λb)b∈B of scalars (elements of F) such that {b ∈ B : λb ≠ 0} is finite and v = Σb∈B λb b.
(b) Any two bases B and C of V have the same cardinality.
(c) If V has basis B and W has basis C, then any bijection B → C extends uniquely to an isomorphism V → W.
(d) Let B be a basis of V. Then |V| = |B| · |F| provided either |F| or |B| is infinite. If both are finite then |V| = |F|^|B|.

Theorem 4.1.7. Σ∞_F is complete.

Proof. Take a κ > |F|. In particular L_F has size at most κ. Let V be a vector space over F of cardinality κ. Then a basis of V must also have size κ by property (d) above. Thus any two vector spaces over F of cardinality κ have bases of cardinality κ and thus are isomorphic. It follows by the Generalized Vaught Test that Σ∞_F is complete.

Theorem 4.1.7 and Exercise 3 imply for instance that if F = R then all nontrivial vector spaces over F satisfy exactly the same sentences in L_F.

With the generalized Vaught Test we can also prove that ACF(0) (whose models are the algebraically closed fields of characteristic 0) is complete. The proof is similar, with "transcendence bases" taking over the role of bases. The relevant definitions and facts are as follows. Let K be a field with subfield k. A subset B of K is said to be algebraically independent over k if for all distinct b1, ..., bn ∈ B we have p(b1, ..., bn) ≠ 0 for all nonzero polynomials p(x1, ..., xn) ∈ k[x1, ..., xn], where x1, ..., xn are distinct variables. A transcendence basis of K over k is a set B ⊆ K such that B is algebraically independent over k and K is algebraic over its subfield k(B).

Fact. (a) K has a transcendence basis over k.
(b) Any two transcendence bases of K over k have the same size.
(c) If K is algebraically closed with transcendence basis B over k, and K′ is also an algebraically closed field extension of k with transcendence basis B′ over k, then any bijection B → B′ extends to an isomorphism K → K′.
(d) If K is uncountable and |K| > |k|, then |K| = |B| for each transcendence basis B of K over k.

Applying this with k = Q and k = F_p for prime numbers p, we obtain that any two algebraically closed fields of the same characteristic and the same uncountable size are isomorphic. Using Vaught's Test for models of size ℵ1 this yields:

Theorem 4.1.8. The set ACF(0) of axioms for algebraically closed fields of characteristic zero is complete. Likewise, ACF(p) is complete for each prime number p.

Remark. If the hypothesis of Vaught's Test (or its generalization) is satisfied — a property which goes under the name of "categoricity in power" — then many things follow of which completeness is only one; it goes beyond the scope of these notes to develop this large chapter of model theory.
Exercises.
(1) Let L = {U} where U is a unary relation symbol. Consider the L-structure (Z, N). Give an informative description of a complete set of L-sentences true in (Z, N). (A description like {σ : (Z, N) ⊨ σ} is not informative. An explicit, possibly infinite, list of axioms is required. Hint: make an educated guess and try to verify it using Vaught's Test or one of its variants.)
(2) Let Σ1 and Σ2 be sets of L-sentences such that no symbol of L occurs in both Σ1 and Σ2. Suppose Σ1 and Σ2 have infinite models. Then Σ1 ∪ Σ2 has a model.
(3) Let L = {S} where S is a unary function symbol. Consider the L-structure (Z, S) where S(a) = a + 1 for a ∈ Z. Give an informative description of a complete set of L-sentences true in (Z, S).

4.2 Elementary Equivalence and Back-and-Forth

In the rest of this chapter we relax notation: for ϕ(x1, ..., xn) an L_A-formula and a1, ..., an ∈ A, we just write ϕ(a1, ..., an) for the corresponding L_A-sentence. In this section A and B denote L-structures. We say that A and B are elementarily equivalent (notation: A ≡ B) if they satisfy the same L-sentences. Thus by the previous section (Q, <) ≡ (R, <), and any two infinite vector spaces over a given field F are elementarily equivalent.

A partial isomorphism from A to B is a bijection γ : X → Y with X ⊆ A and Y ⊆ B such that
(i) for each m-ary R ∈ L^r and a1, ..., am ∈ X: (a1, ..., am) ∈ R^A ⟺ (γa1, ..., γam) ∈ R^B;
(ii) for each n-ary f ∈ L^f and a1, ..., an, a(n+1) ∈ X: f^A(a1, ..., an) = a(n+1) ⟺ f^B(γa1, ..., γan) = γ(a(n+1)).

Example: suppose A = (A, <) and B = (B, <) are totally ordered sets, and a1, ..., aN ∈ A and b1, ..., bN ∈ B are such that a1 < a2 < ... < aN and b1 < b2 < ... < bN; then the map ai ↦ bi : {a1, ..., aN} → {b1, ..., bN} is a partial isomorphism from A to B.

A back-and-forth system from A to B is a collection Γ of partial isomorphisms from A to B such that
(i) ("Forth") for each γ ∈ Γ and a ∈ A there is a γ′ ∈ Γ such that γ′ extends γ and a ∈ domain(γ′);
(ii) ("Back") for each γ ∈ Γ and b ∈ B there is a γ′ ∈ Γ such that γ′ extends γ and b ∈ codomain(γ′).

If Γ is a back-and-forth system from A to B, then Γ⁻¹ := {γ⁻¹ : γ ∈ Γ} is a back-and-forth system from B to A. We call A and B back-and-forth equivalent
(notation: A ≡bf B) if there exists a nonempty back-and-forth system from A to B.

Exercises.
(1) Define a finite restriction of a bijection γ : X → Y to be a map γ|E : E → γ(E) with finite E ⊆ X. If Γ is a back-and-forth system from A to B, so is the set of finite restrictions of members of Γ.
(2) A ≡bf A; if A ≡bf B, then B ≡bf A; and if A ≡bf B and B ≡bf C, then A ≡bf C.

In the following result we do not need to assume countability.

Proposition 4.2.1. If A ≡bf B, then A ≡ B.

Proof. Let Γ be a nonempty back-and-forth system from A to B. By induction on the number of logical symbols in unnested formulas ϕ(y1, ..., yn) one shows that for each γ ∈ Γ and a1, ..., an ∈ domain(γ) we have A ⊨ ϕ(a1, ..., an) ⟺ B ⊨ ϕ(γa1, ..., γan). For n = 0 and using Exercise 4 of Section 3.3 this yields A ≡ B.

Cantor's proof in the previous section generalizes as follows:

Proposition 4.2.2. Suppose A and B are countable and A ≡bf B. Then A ≅ B.

Proof. We proceed as in the proof of Cantor's theorem. Suppose Γ is a nonempty back-and-forth system from A to B, and construct a sequence (γn) of partial isomorphisms from A to B such that each γ(n+1) extends γn, A = ⋃n domain(γn) and B = ⋃n codomain(γn). Then the map A → B that extends each γn is an isomorphism A → B.

Remark. In applying this proposition and the previous one in a concrete situation, the key is to guess a back-and-forth system. That is where insight and imagination (and experience) come in.

4.3 Quantifier Elimination

In this section x = (x1, ..., xn) is a tuple of distinct variables and y is a single variable distinct from x1, ..., xn.

Definition. Σ has quantifier elimination (QE) if every L-formula ϕ(x) is Σ-equivalent to a quantifier-free (short: q-free) L-formula ϕqf(x).

By taking n = 0 in this definition we see that if Σ has QE, then every L-sentence is Σ-equivalent to a q-free L-sentence.

Remark. Suppose Σ has QE. Let L′ be a language such that L′ ⊇ L and all symbols of L′ ∖ L are constant symbols. (This includes the case L′ = L.) Then every set Σ′ of L′-sentences with Σ′ ⊇ Σ has QE. (To see this, check first that each L′-formula ϕ′(x) has the form ϕ(c, x) for some L-formula ϕ(u, x), where u = (u1, ..., um) is a tuple of distinct variables distinct from x1, ..., xn, and c = (c1, ..., cm) is a tuple of constant symbols in L′ ∖ L.)
A basic conjunction in L is by definition a conjunction of finitely many atomic and negated atomic L-formulas. Each q-free L-formula ϕ(x) is equivalent to a disjunction ϕ1(x) ∨ ... ∨ ϕk(x) of basic conjunctions ϕi(x) in L ("disjunctive normal form").

Lemma 4.3.1. Suppose that for every basic conjunction θ(x, y) in L there is a q-free L-formula θqf(x) such that Σ ⊢ ∃yθ(x, y) ↔ θqf(x). Then Σ has QE.

Proof. Let us say that an L-formula ϕ(x) has Σ-QE if it is Σ-equivalent to a q-free L-formula ϕqf(x). Note that if the L-formulas ϕ1(x) and ϕ2(x) have Σ-QE, then ¬ϕ1(x), (ϕ1 ∨ ϕ2)(x), and (ϕ1 ∧ ϕ2)(x) have Σ-QE. Next, let ϕ(x) = ∃yψ(x, y), and suppose inductively that the L-formula ψ(x, y) has Σ-QE. Hence ψ(x, y) is Σ-equivalent to a disjunction ⋁i ψi(x, y) of basic conjunctions ψi(x, y), with i ranging over some finite index set. Each ∃yψi(x, y) is, by hypothesis, Σ-equivalent to a q-free formula. In view of the equivalence of ∃y ⋁i ψi(x, y) with ⋁i ∃yψi(x, y) we obtain that ϕ(x) has Σ-QE. Finally, let ϕ(x) = ∀yψ(x, y), and suppose inductively that the L-formula ψ(x, y) has Σ-QE. This case reduces to the previous case since ϕ(x) is equivalent to ¬∃y¬ψ(x, y), so ϕ(x) has Σ-QE.

Example. In the structure (R; <, 0, 1, +, −, ·) the formula ϕ(a, b, c) := ∃y(ay² + by + c = 0) is equivalent to the q-free formula

(b² − 4ac ≥ 0 ∧ a ≠ 0) ∨ (a = 0 ∧ b ≠ 0) ∨ (a = 0 ∧ b = 0 ∧ c = 0).

This illustrates the kind of property QE is. Note that this equivalence gives an effective test for the existence of a y with a certain property, which avoids in particular having to check an infinite number of values of y (even uncountably many in the case above).

Lemma 4.3.2. Suppose Σ has QE, and B and C are models of Σ with a common substructure A (we do not assume A ⊨ Σ). Then B and C satisfy the same L_A-sentences.

Proof. Let σ be an L_A-sentence. We have to show B ⊨ σ ⇔ C ⊨ σ. Write σ as ϕ(a) with ϕ(x) an L-formula and a ∈ A^n. Take a q-free L-formula ϕqf(x) that is Σ-equivalent to ϕ(x). Then B ⊨ σ iff B ⊨ ϕqf(a) iff A ⊨ ϕqf(a) (by Exercise 7 of Section 2.5) iff C ⊨ ϕqf(a) (by the same exercise) iff C ⊨ σ.
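The disjunctive-normal-form step used in the proof of Lemma 4.3.1 has an exact propositional analogue, which can be sketched computationally as follows. This is our illustration, not part of the notes: a truth-table construction, chosen for transparency rather than efficiency, with function names that are ours.

```python
from itertools import product

def to_dnf(f, variables):
    """Truth-table DNF: one basic conjunction per satisfying assignment.
    Returns a list of conjunctions, each a list of (variable, value) literals."""
    return [[(v, val) for v, val in zip(variables, vals)]
            for vals in product([False, True], repeat=len(variables))
            if f(*vals)]

def eval_dnf(dnf, assignment):
    # a disjunction of conjunctions of literals
    return any(all(assignment[v] == val for v, val in conj) for conj in dnf)

f = lambda p, q, r: (p and q) or r          # a sample quantifier-free "formula"
dnf = to_dnf(f, ["p", "q", "r"])

# the DNF agrees with f on every assignment
assert all(eval_dnf(dnf, dict(zip(["p", "q", "r"], vals))) == f(*vals)
           for vals in product([False, True], repeat=3))
```

For first-order q-free formulas the same rewriting is done symbolically (distributing ∧ over ∨ after pushing negations inward), but the equivalence being exploited is the same.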
Corollary 4.3.3. Suppose Σ has a model, and there exists an L-structure that can be embedded into every model of Σ. Then Σ is complete.

Proof. Let B and C be any two models of Σ. Take an L-structure A that can be embedded into every model of Σ. So A is isomorphic to a substructure of B and of C. Then, by a slight rewording of the proof of Lemma 4.3.2 (considering only L-sentences), we see that B and C satisfy the same L-sentences. It follows that Σ is complete.

We have seen that Vaught's Test can be used to prove completeness. The above corollary gives another way of establishing completeness, and is often applicable when the hypothesis of Vaught's Test is not satisfied. Completeness is only one of the nice consequences of QE, and the easiest one to explain at this stage. The main impact of QE is rather that it gives access to the structural properties of definable sets. This will be reflected in exercises at the end of this section. Applications of model theory to other areas of mathematics often involve QE as a key step.

Theorem 4.3.4. To(dense without endpoints) has QE.

Proof. Let (x, y) = (x1, ..., xn, y) be a tuple of n + 1 distinct variables, and consider a basic conjunction ϕ(x, y) in L_O. By Lemma 4.3.1 it suffices to show that ∃yϕ(x, y) is equivalent in To(dense without endpoints) to a q-free formula ψ(x). We may assume that each conjunct of ϕ is of one of the following types: y = xi, xi < y, y < xi (1 ≤ i ≤ n). To justify this, observe that a negation ¬(y < xi) can be replaced by the disjunction y = xi ∨ xi < y, and likewise with negations ¬(xi < y); and if we had instead a conjunct y ≠ xi then we could replace it by (y < xi) ∨ (xi < y) and use the fact that ∃y(ϕ1(x, y) ∨ ϕ2(x, y)) is equivalent to ∃yϕ1(x, y) ∨ ∃yϕ2(x, y). Also conjuncts in which y does not appear can be eliminated because ∃y(ψ(x) ∧ θ(x, y)) ⟷ ψ(x) ∧ ∃yθ(x, y).

Suppose that we have a conjunct y = xi in ϕ(x, y). Then ϕ(x, y) is equivalent to y = xi ∧ ϕ′(x, y), where ϕ′(x, y) has no conjuncts of the form y = xi, and ∃yϕ(x, y) is equivalent to ϕ′(x, xi), and we are done. So we can assume also that ϕ(x, y) has no conjuncts of the form y = xi. After all these reductions, and after rearranging conjuncts, we can assume that ϕ(x, y) is a conjunction

⋀_{i∈I} xi < y ∧ ⋀_{j∈J} y < xj,

where I, J ⊆ {1, ..., n} and where we allow I or J to be empty. Up till this point we did not need the density and "no endpoints" axioms, but these come in now: ∃yϕ(x, y) is equivalent in To(dense without endpoints) to the formula ⋀_{i∈I, j∈J} xi < xj, and we are done.
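The final elimination step is easy to test over (Q, <). The following sketch — our illustration, with arbitrarily chosen index sets and sample values — decides ∃y(⋀_{i∈I} xi < y ∧ ⋀_{j∈J} y < xj) by exhibiting a midpoint witness, and checks it against the q-free conjunction ⋀ xi < xj:

```python
from fractions import Fraction
from itertools import product

def exists_y(xs, I, J):
    """Decide (Q,<) |= ∃y(⋀_{i∈I} x_i < y ∧ ⋀_{j∈J} y < x_j) directly,
    using a midpoint witness when both kinds of bound are present."""
    lo = max((xs[i] for i in I), default=None)   # strongest lower bound
    hi = min((xs[j] for j in J), default=None)   # strongest upper bound
    if lo is None or hi is None:
        return True               # "no endpoints": some y lies beyond any bound
    if lo < hi:
        y = (lo + hi) / 2         # density: the midpoint is a witness
        return all(xs[i] < y for i in I) and all(y < xs[j] for j in J)
    return False

def qfree(xs, I, J):
    """The eliminated form: ⋀_{i∈I, j∈J} x_i < x_j."""
    return all(xs[i] < xs[j] for i in I for j in J)

I, J = [0, 1], [2]
samples = [tuple(map(Fraction, t)) for t in product(range(4), repeat=3)]
assert all(exists_y(xs, I, J) == qfree(xs, I, J) for xs in samples)
```

The two branches returning `True` without a witness search correspond exactly to the "no endpoints" axioms, and the midpoint to density — the two places where the proof above invokes the extra axioms.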
We mention without proof two examples of QE. The following theorem is due to Tarski and (independently) to Chevalley.

Theorem 4.3.5. ACF has QE.

Clearly, ACF is not complete, since it says nothing about the characteristic: it doesn't prove 1 + 1 = 0, nor does it prove 1 + 1 ≠ 0. However, ACF(0), which contains additional axioms forcing the characteristic to be 0, is complete by 4.3.3 and the fact that the ring of integers embeds in every algebraically closed field of characteristic 0. (It is also possible to use Vaught's test to prove completeness.)

Tarski also established the following more difficult theorem, which is one of the key results in real algebraic geometry. It dates from around 1950. (His original proof is rather long; there is a shorter one due to A. Seidenberg, and an elegant short proof by A. Robinson using a combination of basic algebra and elementary model theory.) Here RCF is a set of axioms true in the ordered field (R; <, 0, 1, +, −, ·) of real numbers. In addition to the ordered field axioms, it has the axiom ∀x(x > 0 → ∃y(x = y²)) (x, y are distinct variables) and for each odd n > 1 the axiom

∀x1 ... ∀xn ∃y (yⁿ + x1 yⁿ⁻¹ + ... + xn = 0),

where x1, ..., xn, y are distinct variables.

Theorem 4.3.6. RCF has QE and is complete.

Definition. An L-theory is a set T of L-sentences such that for all L-sentences σ, if T ⊢ σ, then σ ∈ T. An axiomatization of an L-theory T is a set Σ of L-sentences such that T = {σ : σ is an L-sentence and Σ ⊢ σ}.

Exercises. In (5) and (6), T is an L-theory.
(1) The subsets of C definable in (C; 0, 1, +, −, ·) are exactly the finite subsets of C and their complements in C. (Hint: use the fact that ACF has QE.)
(2) The subsets of R definable in the model (R; <, 0, 1, +, −, ·) of RCF are exactly the finite unions of intervals of all kinds (including degenerate intervals with just one point). (Hint: use the fact that RCF has QE.)
(3) Let Eq∞ be a set of axioms in the language {∼} (where ∼ is a binary relation symbol) that say: (i) ∼ is an equivalence relation; (ii) every equivalence class is infinite; (iii) there are infinitely many equivalence classes. Then Eq∞ admits QE and is complete.
(4) Suppose that a set Σ of L-sentences has QE. Let the language L′ extend L by new symbols of arity 0, and let Σ′ ⊇ Σ be a set of L′-sentences. Then Σ′ (as a set of L′-sentences) also has QE.
(5) Suppose the L-theory T has QE. Then T has an axiomatization consisting of sentences ∀x∃yϕ(x, y) and ∀xψ(x) where ϕ(x, y) and ψ(x) are q-free. (Hint: let Σ be the set of L-sentences provable from T that have the indicated form; show that Σ has QE, and is an axiomatization of T.)
(6) Assume the L-theory T has built-in Skolem functions, that is, for every basic conjunction ϕ(x, y) there are L-terms t1(x), ..., tk(x) such that T ⊢ ∃yϕ(x, y) → ϕ(x, t1(x)) ∨ ... ∨ ϕ(x, tk(x)). Then T has QE, and T has an axiomatization consisting of sentences ∀xψ(x) where ψ(x) is q-free.
4.4 Presburger Arithmetic

In this section we consider in some detail one example of a set of axioms that has QE, namely "Presburger Arithmetic" (named after the Polish logician Presburger, who was a student of Tarski). Essentially, this is a complete set of axioms for ordinary arithmetic of integers without multiplication: the axioms are true in (Z; 0, 1, +, −, <), and prove every sentence true in this structure.

There is a mild complication in trying to obtain this completeness via QE: one can show (exercise) that for any q-free formula ϕ(x) in the language {0, 1, +, −, <} there is an N ∈ N such that either (Z; 0, 1, +, −, <) ⊨ ϕ(n) for all n > N, or (Z; 0, 1, +, −, <) ⊨ ¬ϕ(n) for all n > N. In particular, for any set Σ of axioms true in (Z; 0, 1, +, −, <), formulas such as ∃y(x = y + y) and ∃y(x = y + y + y) are not Σ-equivalent to any q-free formula in this language. To overcome this obstacle to QE we augment the language {0, 1, +, −, <} by new unary relation symbols P1, P2, P3, P4, ..., to obtain the language L_PrA of Presburger Arithmetic. We expand (Z; 0, 1, +, −, <) to the L_PrA-structure

Z̃ = (Z; 0, 1, +, −, <, Z, 2Z, 3Z, 4Z, ...),

that is, Pn is interpreted as the set nZ.

This structure satisfies the set PrA of Presburger Axioms, which consists of the following sentences:
(i) the axioms of Ab for abelian groups;
(ii) the axioms of To expressing that < is a total order;
(iii) ∀x∀y∀z(x < y → x + z < y + z) (translation invariance of <);
(iv) 0 < 1 ∧ ¬∃y(0 < y ∧ y < 1) (discreteness axiom);
(v) ∀x∃y ⋁_{0≤r<n} x = ny + r1 (division with remainder), for n = 1, 2, 3, ...;
(vi) ∀x(Pn x ↔ ∃y(x = ny)) (defining axioms for P1, P2, P3, ...), for n = 1, 2, 3, ....

Note that (v) and (vi) are infinite lists of axioms. Here we have fixed distinct variables x, y, z for definiteness. In (v) and in the rest of this section r ranges over integers.

Here are some elementary facts about models of PrA:

Proposition 4.4.1. Let A = (A; 0, 1, +, −, <, P1^A, P2^A, ...) ⊨ PrA. Then:
(1) There is a unique embedding Z̃ → A; it sends k ∈ Z to k1 ∈ A.
(2) Given any n > 0 we have Pn^A = nA, where we regard A as an abelian group, and A/nA has exactly n elements, namely 0 + nA, 1 + nA, ..., (n − 1)1 + nA.
(3) For any n > 0 and a ∈ A, exactly one of the a, a + 1, ..., a + (n − 1)1 lies in nA.
(4) A is torsion-free as an abelian group.

Theorem 4.4.2. PrA has QE.

Proof. Let (x, y) = (x1, ..., xn, y) be a tuple of n + 1 distinct variables, and consider a basic conjunction ϕ(x, y) in L_PrA. By Lemma 4.3.1 it suffices to show that ∃yϕ(x, y) is PrA-equivalent to a q-free formula ψ(x). We may assume that each conjunct of ϕ is of one of the following types, for some integer m ≥ 1 and L_PrA-term t(x):

my = t(x),  my < t(x),  t(x) < my,  Pn(my + t(x)).

To justify this assumption, observe that if we had instead a conjunct my ≠ t(x) then we could replace it by (my < t(x)) ∨ (t(x) < my) and use the fact that ∃y(ϕ1(x, y) ∨ ϕ2(x, y)) is equivalent to ∃yϕ1(x, y) ∨ ∃yϕ2(x, y); and likewise with negations of my < t(x) and t(x) < my. Similarly a negation ¬Pn(my + t(x)) can be replaced by the disjunction Pn(my + t(x) + 1) ∨ ... ∨ Pn(my + t(x) + (n − 1)1). Also conjuncts in which y does not appear can be eliminated because ∃y(ψ(x) ∧ θ(x, y)) ⟷ ψ(x) ∧ ∃yθ(x, y).

We can therefore assume that all conjuncts have the same "coefficient" m in front of the variable y: for r > 0 we can replace my = t(x) by rmy = rt(x), and likewise with my < t(x) and t(x) < my; since PrA ⊢ Pn(z) ↔ Prn(rz) for r > 0, we can replace Pn(my + t(x)) by Prn(rmy + rt(x)).

After all these reductions, and after rearranging conjuncts, ϕ(x, y) has the form

⋀_{h∈H} my = t_h(x) ∧ ⋀_{i∈I} t_i(x) < my ∧ ⋀_{j∈J} my < t_j(x) ∧ ⋀_{k∈K} P_{n(k)}(my + t_k(x)),

where m > 0 and H, I, J, K are disjoint finite index sets. We allow some of these index sets to be empty, in which case the corresponding conjunction can be left out.

Suppose that H ≠ ∅, say h ∈ H. Then the formula ∃yϕ(x, y) is PrA-equivalent to

P_m(t_h(x)) ∧ ⋀_{h′∈H} t_h(x) = t_{h′}(x) ∧ ⋀_{i∈I} t_i(x) < t_h(x) ∧ ⋀_{j∈J} t_h(x) < t_j(x) ∧ ⋀_{k∈K} P_{n(k)}(t_h(x) + t_k(x)).

For the rest of the proof we assume that H = ∅.
The formula ϕ(x, y) now has the form

⋀_{i∈I} t_i(x) < my ∧ ⋀_{j∈J} my < t_j(x) ∧ ⋀_{k∈K} P_{n(k)}(my + t_k(x)).

To understand what follows, it may help to focus on the model Z̃, although the arguments go through for arbitrary models of PrA. So suppose A ⊨ PrA and that A is the underlying set of A. Fix any value a = (a1, ..., an) ∈ A^n of x, and consider the system of congruences (with "unknown" y)

P_{n(k)}(my + t_k(a))  (k ∈ K),

which in more familiar notation would be written as

my + t_k(a) ≡ 0 mod n(k)  (k ∈ K).

The solutions in Z of this system form a union of congruence classes modulo N := ∏_{k∈K} n(k), where as usual we put N = 1 for K = ∅. This suggests replacing y successively by Nz, 1 + Nz, ..., (N − 1)1 + Nz. Our precise claim is that ∃yϕ(x, y) is PrA-equivalent to the formula θ(x) given by

⋁_{r=0}^{N−1} ( ⋀_{k∈K} P_{n(k)}((mr)1 + t_k(x)) ∧ ∃z( ⋀_{i∈I} t_i(x) < m(r1 + Nz) ∧ ⋀_{j∈J} m(r1 + Nz) < t_j(x) ) ).

We prove this equivalence with θ(x) as follows. We have to show that A ⊨ ∃yϕ(a, y) if and only if A ⊨ θ(a). So let b ∈ A be such that A ⊨ ϕ(a, b). Division with remainder yields a c ∈ A and an r such that b = r1 + Nc and 0 ≤ r ≤ N − 1. Note that then for k ∈ K,

mb + t_k(a) = m(r1 + Nc) + t_k(a) = (mr)1 + (mN)c + t_k(a) ∈ n(k)A,

and so A ⊨ P_{n(k)}((mr)1 + t_k(a)). Also t_i(a) < m(r1 + Nc) for every i ∈ I, and m(r1 + Nc) < t_j(a) for every j ∈ J. Therefore A ⊨ θ(a), with ∃z witnessed by c. For the converse, suppose that the disjunct of θ(a) indexed by a certain r ∈ {0, ..., N − 1} is true in A, with ∃z witnessed by c ∈ A. Then put b = r1 + Nc and we get A ⊨ ϕ(a, b). This proves the claimed equivalence.

Now that we have proved the claim we have reduced to the situation (after changing notation) where H = K = ∅ (i.e., no equations and no congruences). If J = ∅ or I = ∅ then PrA ⊢ ∃yϕ(x, y). This leaves the case where both I and J are nonempty. For each value a ∈ A^n of x there is an i0 ∈ I such that t_{i0}(a) is maximal among the t_i(a) with i ∈ I, and a j0 ∈ J such that t_{j0}(a) is minimal among the t_j(a) with j ∈ J. Moreover each interval of m successive elements of A contains an element of mA. Therefore ∃yϕ(x, y) is equivalent in A to the disjunction over all pairs (i0, j0) ∈ I × J of the q-free formula

⋀_{i∈I} t_i(x) ≤ t_{i0}(x) ∧ ⋀_{j∈J} t_{j0}(x) ≤ t_j(x) ∧ ⋁_{r=1}^{m} ( P_m(t_{i0}(x) + r1) ∧ t_{i0}(x) + r1 < t_{j0}(x) ).

(Note that L_PrA does not contain the relation symbol ≤; we just write t ≤ t′ to abbreviate (t < t′) ∨ (t = t′).) This completes the proof.

The careful reader will have noticed that the elimination procedure in the proof above is constructive: it describes an algorithm that, given any basic conjunction ϕ(x, y) in L_PrA as input, constructs a q-free formula ψ(x) of L_PrA such that PrA ⊢ ∃yϕ(x, y) ↔ ψ(x). In view of the equally constructive proof of Lemma 4.3.1, this yields an algorithm that, given any L_PrA-formula ϕ(x) as input, constructs a q-free L_PrA-formula ϕqf(x) such that PrA ⊢ ϕ(x) ↔ ϕqf(x). (Thus PrA has effective QE.)
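The congruence-splitting claim in the proof can be checked mechanically in Z̃ for a concrete basic conjunction. In the sketch below — our illustration, not part of the notes — we take m = 2, one lower bound x < 2y, one upper bound 2y < x + 10, and one congruence P6(2y + x), so that N = 6 and y is replaced by r + 6z; the search bounds are crude but cover every possible witness, so both sides are computed exactly.

```python
def phi(x, y):                    # x < 2y  ∧  2y < x + 10  ∧  P6(2y + x)
    return x < 2*y < x + 10 and (2*y + x) % 6 == 0

def exists_y(x):                  # exact: any witness has 2y in (x, x + 10)
    return any(phi(x, y) for y in range(x // 2 - 2, x // 2 + 8))

def theta(x):                     # eliminated form of the proof: y = r + 6z
    def exists_z(r):              # 2(r + 6z) must lie in (x, x + 10)
        return any(x < 2*(r + 6*z) < x + 10
                   for z in range(x // 12 - 3, x // 12 + 4))
    return any((2*r + x) % 6 == 0 and exists_z(r) for r in range(6))

# the claimed PrA-equivalence, checked pointwise in Z
assert all(exists_y(x) == theta(x) for x in range(-60, 60))
assert exists_y(4) and not exists_y(3)    # for this phi it holds iff x is even
```

The z-free conjunct `(2*r + x) % 6 == 0` is exactly the disjunct's P_{n(k)}((mr)1 + t_k(x)) part; pulling it out of the ∃z is what makes the remaining existential a pure inequality problem.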
It now follows from Corollary 4.3.3 that PrA is complete: it has QE, and Z̃ can be embedded in every model.

Let some L-structure A be given, and suppose we have an algorithm for deciding whether any given L-sentence is true in A; then A is said to be decidable. (A precise definition of decidability will be given in the next Chapter.) The structure Z̃ is decidable. In particular, the algorithm above constructs for any L_PrA-sentence σ a q-free L_PrA-sentence σqf such that PrA ⊢ σ ↔ σqf. Since we also have an obvious algorithm that, given any q-free L_PrA-sentence σqf, checks whether σqf is true in Z̃, this yields an algorithm that, given any L_PrA-sentence σ, checks whether σ is true in Z̃.

Discussion. The algorithms above can easily be implemented by computer programs. Even if an algorithm can be implemented by a computer program, however, this does not guarantee that the program is of practical use, or feasible: on some moderately small inputs it might have to run for 10¹⁰⁰ years before producing an output. This bad behaviour is not at all unusual: no (classical, sequential) algorithm for deciding the truth of L_PrA-sentences in Z̃ is feasible in a precise technical sense. Results of this kind belong to complexity theory. There do exist feasible integer linear programming algorithms that decide the truth in Z̃ of sentences of a special form; this is an area where mathematics (logic, number theory, ...) and computer science interact, and this shows another (very practical) side of complexity theory.

Remark. A positive impact of QE is that it yields structural properties of definable sets, as we discuss next for Z̃.
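The "obvious algorithm" for q-free sentences amounts to evaluating variable-free terms and atoms in Z̃. A minimal sketch follows (the tuple encoding of terms is an ad hoc choice made for this illustration; it is not the Gödel numbering introduced later in the notes):

```python
# variable-free terms and atoms as nested tuples:
# ("0",), ("1",), ("+", s, t), ("-", t), ("<", s, t), ("=", s, t), ("P", n, t)
def val(t):
    """Value of a variable-free L_PrA term in Z."""
    op = t[0]
    if op == "0": return 0
    if op == "1": return 1
    if op == "+": return val(t[1]) + val(t[2])
    if op == "-": return -val(t[1])
    raise ValueError(op)

def holds(a):
    """Truth of an atomic variable-free sentence in the expanded structure."""
    op = a[0]
    if op == "<": return val(a[1]) < val(a[2])
    if op == "=": return val(a[1]) == val(a[2])
    if op == "P": return val(a[2]) % a[1] == 0     # P_n(t): n divides t
    raise ValueError(op)

three = ("+", ("1",), ("+", ("1",), ("1",)))       # the term 1 + (1 + 1)
assert holds(("P", 3, three))                      # P_3(1 + 1 + 1) is true
assert not holds(("<", ("1",), ("0",)))            # 1 < 0 is false
```

Extending `holds` through ¬, ∧, ∨ by structural recursion decides every q-free L_PrA-sentence, which combined with effective QE gives the decision procedure for Z̃ described above.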
Definition. Let d be a positive integer. An arithmetic progression of modulus d is a set of the form

{k ∈ Z : k ≡ r mod d, α < k < β},

where r ∈ {0, ..., d − 1}, α, β ∈ Z ∪ {−∞, +∞}, α < β.

We leave the proof of the next lemma to the reader.

Lemma 4.4.3. Arithmetic progressions have the following properties.
(1) If P, Q ⊆ Z are arithmetic progressions of moduli d and e respectively, then P ∩ Q is an arithmetic progression of modulus lcm(d, e).
(2) If P ⊆ Z is an arithmetic progression, then Z ∖ P is a finite union of arithmetic progressions.
(3) Let P be the collection of all finite unions of arithmetic progressions. Then P contains with any two sets X, Y also X ∪ Y, X ∩ Y, and X ∖ Y.

Corollary 4.4.4. Let S ⊆ Z. Then

S is definable in Z̃ ⟺ S is a finite union of arithmetic progressions.

Proof. (⇒) By QE and Lemma 4.4.3 it suffices to show that each atomic L_PrA-formula ϕ(x) defines in Z̃ a finite union of arithmetic progressions. Every atomic formula ϕ(x) different from ⊤ and ⊥ has the form t1(x) < t2(x), or the form t1(x) = t2(x), or the form Pd(t(x)), where t1(x), t2(x) and t(x) are L_PrA-terms. The first two kinds reduce to t(x) > 0 and t(x) = 0 respectively (by subtraction). It follows that we may assume that ϕ(x) has the form kx + l1 > 0, or the form kx + l1 = 0, or the form Pd(kx + l1), where k, l ∈ Z. Considering cases (k = 0; k ≠ 0 and k ≡ 0 mod d; and so on), we see that such a ϕ(x) defines an arithmetic progression. (⇐) It suffices to show that each arithmetic progression is definable in Z̃; this is straightforward and left to the reader.

Exercises.
(1) Prove that 2Z cannot be defined in the structure (Z; 0, 1, +, −, <) by a q-free formula of the language {0, 1, +, −, <}.

4.5 Skolemization and Extension by Definition

In this section L is a sublanguage of L′, Σ a set of L-sentences, and Σ′ a set of L′-sentences with Σ ⊆ Σ′.

Definition. Σ′ is said to be conservative over Σ (or a conservative extension of Σ) if for every L-sentence σ:

Σ ⊢_L σ ⟺ Σ′ ⊢_{L′} σ.
Here (⇒) is the significant direction; (⇐) is automatic.

Remarks.
(1) Suppose Σ′ is conservative over Σ. Then Σ is consistent if and only if Σ′ is consistent.
(2) If each model of Σ has an L′-expansion to a model of Σ′, then Σ′ is conservative over Σ. (This follows easily from the Completeness Theorem.)

Proposition 4.5.1. Let ϕ(x1, . . . , xn, y) be an L-formula. Let fϕ be an n-ary function symbol not in L, and put

  L′ := L ∪ {fϕ},  Σ′ := Σ ∪ {∀x1 · · · ∀xn (∃y ϕ(x1, . . . , xn, y) → ϕ(x1, . . . , xn, fϕ(x1, . . . , xn)))}.

Then Σ′ is conservative over Σ.

Proof. Let A be any model of Σ. By remark (2) it suffices to obtain an L′-expansion A′ of A that makes the new axiom about fϕ true. We choose a function fϕ^A : A^n → A as follows. For any (a1, . . . , an) ∈ A^n: if there is a b ∈ A such that A ⊨ ϕ(a1, . . . , an, b), then we let fϕ^A(a1, . . . , an) be such an element b; and if no such b exists, we let fϕ^A(a1, . . . , an) be an arbitrary element of A (recall that A ≠ ∅). Then the L′-expansion A′ := (A, fϕ^A) of A satisfies

  A′ ⊨ ∀x1 · · · ∀xn (∃y ϕ(x1, . . . , xn, y) → ϕ(x1, . . . , xn, fϕ(x1, . . . , xn))),

as desired.

Remark. A function fϕ^A as in the proof is called a Skolem function in A for the formula ϕ(x1, . . . , xn, y). It yields a "witness" for each relevant n-tuple.

Definition. Given an L-formula ϕ(x1, . . . , xm), let Rϕ be an m-ary relation symbol not in L, and put Lϕ := L ∪ {Rϕ} and Σϕ := Σ ∪ {ρϕ}, where ρϕ is

  ∀x1 · · · ∀xm (ϕ(x1, . . . , xm) ↔ Rϕ(x1, . . . , xm)).

The sentence ρϕ is called the defining axiom for Rϕ. We call Σϕ an extension of Σ by a definition for the relation symbol Rϕ. (For example, if < belongs to L and ϕ(x1, x2) is the formula x1 < x2 ∨ x1 = x2, then ρϕ introduces a relation symbol Rϕ with the usual meaning of ≤.)

Remark. Each model A of Σ expands uniquely to a model of Σϕ. We denote this expansion by Aϕ. Every model of Σϕ is of the form Aϕ for a unique model A of Σ.

Proposition 4.5.2. Let ϕ = ϕ(x1, . . . , xm) be as above. Then we have:
(1) Σϕ is conservative over Σ.
(2) For each Lϕ-formula ψ(y), where y = (y1, . . . , yn), there is an L-formula ψ*(y), called a translation of ψ(y), such that Σϕ ⊢ ψ(y) ↔ ψ*(y).
(3) Suppose A ⊨ Σ and S ⊆ A^m. Then S is 0-definable in A if and only if S is 0-definable in Aϕ, and the same with "definable" instead of "0-definable".
Proof. (1) is clear from the remark preceding the proposition, and (3) is immediate from (2). To prove (2) we observe that by the Equivalence Theorem (3.3.2) it suffices to prove it for formulas ψ(y) = Rϕ(t1(y), . . . , tm(y)), where the ti are L-terms. In this case we can take

  ∃u1 · · · ∃um (u1 = t1(y) ∧ . . . ∧ um = tm(y) ∧ ϕ(u1/x1, . . . , um/xm))

as ψ*(y), where the variables u1, . . . , um do not appear in ϕ and are not among y1, . . . , yn.

Definition. Suppose ϕ(x, y) is an L-formula, where (x, y) = (x1, . . . , xm, y) is a tuple of m + 1 distinct variables, such that

  Σ ⊢ ∀x1 · · · ∀xm ∃! y ϕ(x, y),

where ∃! y ϕ(x, y) abbreviates

  ∃y (ϕ(x, y) ∧ ∀z (ϕ(x, z) → y = z)),

with z a variable not occurring in ϕ and not among x1, . . . , xm, y. Let fϕ be an m-ary function symbol not in L and put L′ := L ∪ {fϕ} and Σ′ := Σ ∪ {γϕ}, where γϕ is

  ∀x1 · · · ∀xm ϕ(x, fϕ(x)).

The sentence γϕ is called the defining axiom for fϕ. We call Σ′ an extension of Σ by a definition of the function symbol fϕ.

Remark. Each model A of Σ expands uniquely to a model of Σ′. We denote this expansion by A′. Every model of Σ′ is of the form A′ for a unique model A of Σ.

Proposition 4.5.3. Proposition 4.5.2 goes through when Lϕ, Σϕ, and Aϕ are replaced by L′, Σ′, and A′.

We leave the proof of this as an exercise. (Hint: reduce the analogue of (2) to the case of an unnested formula.)

Before defining the next concept we consider a simple case. Let X, Y be sets, f : X → Y a map, and S ⊆ X^n. Then the f-image of S is the subset

  f(S) := {(f(x1), . . . , f(xn)) : (x1, . . . , xn) ∈ S}

of Y^n. Also, given k ∈ N, we use the bijection

  ((y11, . . . , y1k), . . . , (yn1, . . . , ynk)) → (y11, . . . , y1k, . . . , yn1, . . . , ynk)

from (Y^k)^n to Y^{nk} to identify these two sets.

Definition. Let (A, <) be a totally ordered set. By a definition of (A, <) in a structure B we mean an injective map δ : A → B^k, with k ∈ N, such that
(i) δ(A) ⊆ B^k is definable in B;
(ii) the set {(δ(a), δ(b)) : a < b in A} ⊆ (B^k)^2 = B^{2k} is definable in B.

For example, we have a definition δ : Z → N^2 of the ordered set (Z, <) of integers in the additive monoid (N, 0, +), given by δ(n) = (n, 0) and δ(−n) = (0, n) for n ∈ N. (We leave it to the reader to check this.) It can be shown with tools a little beyond the scope of these notes that no infinite totally ordered set can be defined in the field of complex numbers.
Definition. A definition of an L-structure A in a structure B is an injective map δ : A → B^k, with k ∈ N, such that
(i) δ(A) ⊆ B^k is definable in B;
(ii) for each m-ary R ∈ L^r the set δ(R^A) ⊆ (B^k)^m = B^{mk} is definable in B;
(iii) for each n-ary f ∈ L^f the set δ(graph of f^A) ⊆ (B^k)^{n+1} = B^{(n+1)k} is definable in B.

Here B is a structure for a language L* that may have nothing to do with the language L. Replacing everywhere "definable" by "0-definable", we get the notion of a 0-definition of A in B.

The bijection

  a + bi → (a, b) : C → R^2  (a, b ∈ R)

is a 0-definition of the field (C; 0, 1, +, −, ·) of complex numbers in the field (R; 0, 1, +, −, ·) of real numbers. Recall that by Lagrange's "four squares" theorem we have

  N = {a^2 + b^2 + c^2 + d^2 : a, b, c, d ∈ Z}.

It follows that the inclusion map N → Z is a 0-definition of (N; 0, 1, +, ·, <) in the ring (Z; 0, 1, +, −, ·).

On the other hand, there is no definition of the field of real numbers in the field of complex numbers: this follows from the fact, stated earlier without proof, that there is no definition of any infinite totally ordered set in the field of complex numbers. (A special case says that R, considered as a subset of C, is not definable in the field of complex numbers; this follows easily from the fact that ACF admits QE, see exercise 1.) Indeed, one can show that the only fields definable in the field of complex numbers are finite fields and fields isomorphic to the field of complex numbers itself.

Remark. A more general way of viewing a structure A as in some sense living inside a structure B is to allow δ to be an injective map from A into B^k/E for some equivalence relation E on B^k that is definable in B, imposing suitable conditions analogous to (i)–(iii). Our special case corresponds to E = equality on B^k. (We do not develop this idea here further: the right setting for it would be many-sorted structures, rather than our one-sorted structures.)

Proposition 4.5.4. Let δ : A → B^k be a 0-definition of the L-structure A in the L*-structure B. Let x1, . . . , xn be distinct variables (viewed as ranging over A), and let x11, . . . , x1k, . . . , xn1, . . . , xnk be nk distinct variables (viewed as ranging over B). Then we have a map that assigns to each L-formula ϕ(x1, . . . , xn) an L*-formula δϕ(x11, . . . , x1k, . . . , xn1, . . . , xnk) such that

  δ(ϕ^A) = (δϕ)^B ⊆ B^{nk}.

In particular, for n = 0 the map above assigns to each L-sentence σ an L*-sentence δσ such that A ⊨ σ ⇐⇒ B ⊨ δσ.
Chapter 5

Computability, Decidability, and Incompleteness

In this chapter we prove Gödel's famous Incompleteness Theorem. A simple form of the incompleteness theorem is as follows. Consider the structure

  N := (N; 0, S, +, ·, <),

where S : N → N is the successor function. Let Σ be a computable set of sentences in the language of N and true in N. Then there exists a sentence σ in that language such that N ⊨ σ but Σ ⊬ σ. In other words, no computable set of axioms in the language of N and true in N can be complete; hence the name Incompleteness Theorem.

The only unexplained terminology here is "computable." Intuitively, "Σ is computable" means that there is an algorithm to recognize whether any given sentence in the language of N belongs to Σ. (It seems reasonable to require this of an axiom system for N.) Thus we begin this chapter with developing the notion of computability. The interest of this notion is tied to the Church–Turing Thesis as explained in Section 5.2. The notion of computability goes far beyond incompleteness: for example, computability plays a role in combinatorial group theory (Higman's Theorem) and in certain diophantine questions (Hilbert's 10th problem), not to mention its role in the ideological underpinnings of computer science.

5.1 Computable Functions

First some notation. Here . . . x . . . is some condition on natural numbers x. We let µx(. . . x . . .) denote the least x ∈ N for which . . . x . . . holds. We will only use this notation when the meaning of . . . x . . . is clear and the set {x ∈ N : . . . x . . .} is nonempty. For a ∈ N we also let µx<a(. . . x . . .) be the least x < a in N such that . . . x . . . holds, and if there is no such x we put µx<a(. . . x . . .) := a. For example, µx(x^2 > 7) = 3, µx<4(x^2 > 3) = 2, and µx<2(x > 5) = 2.
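The µ-notation is easy to make concrete in code. The following sketch is illustrative only (the helper names mu and mu_lt are mine, not from the notes); mu assumes, as the notes do, that a witness exists.

```python
def mu(cond, bound=10**6):
    """Least x in N with cond(x); the notes assume such an x exists.
    The bound merely guards the unbounded search in this sketch."""
    x = 0
    while not cond(x):
        x += 1
        if x > bound:
            raise ValueError("no witness found below bound")
    return x

def mu_lt(a, cond):
    """Least x < a with cond(x), and a itself if there is none."""
    for x in range(a):
        if cond(x):
            return x
    return a

assert mu(lambda x: x**2 > 7) == 3
assert mu_lt(4, lambda x: x**2 > 3) == 2
assert mu_lt(2, lambda x: x > 5) == 2
```

The convention µx<a(. . .) := a when no witness exists is what makes bounded search total, which matters in the lemmas below.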
Definition. For i = 1, . . . , n we define I_i^n : N^n → N by I_i^n(a1, . . . , an) = ai. These functions are called coordinate functions.

Definition. The computable functions (or recursive functions) are the functions from N^n to N (for n = 0, 1, 2, . . .) obtained by inductively applying the following rules:

(R1) + : N^2 → N, · : N^2 → N, χ≤ : N^2 → N, and the coordinate functions I_i^n (for each n and i = 1, . . . , n) are computable.
(R2) If G : N^m → N is computable and H1, . . . , Hm : N^n → N are computable, then so is the function F = G(H1, . . . , Hm) : N^n → N defined by F(a) = G(H1(a), . . . , Hm(a)).
(R3) If G : N^{n+1} → N is computable and for all a ∈ N^n there exists x ∈ N such that G(a, x) = 0, then the function F : N^n → N given by F(a) = µx(G(a, x) = 0) is computable.

Example. If F : N^3 → N and G : N^2 → N are computable, then so is the function H : N^4 → N defined by H(x1, x2, x3, x4) = F(G(x1, x2), x3, x4). This follows from (R2) by noting that

  H(x) = F(G(I_1^4(x), I_2^4(x)), I_3^4(x), I_4^4(x)),  where x = (x1, x2, x3, x4).

From (R1), (R2) and (R3) we derive further rules for obtaining computable functions. This is mostly an exercise in programming.

Definition. For R ⊆ N^n, we define χR : N^n → N by

  χR(a) = 1 if a ∈ R,  χR(a) = 0 if a ∉ R.

Think of such R as an n-ary relation on N, and often write R(a1, . . . , an) instead of (a1, . . . , an) ∈ R. We call χR the characteristic function of R. A relation R ⊆ N^n is said to be computable (or recursive) if its characteristic function χR : N^n → N is computable.

Lemma 5.1.1. The functions χ≥ and χ= on N^2 are computable.

Proof. The function χ≥ is computable because

  χ≥(m, n) = χ≤(n, m) = χ≤(I_2^2(m, n), I_1^2(m, n)),

which enables us to apply (R1) and (R2). Similarly, χ= is computable: χ=(m, n) = χ≤(m, n) · χ≥(m, n).

Lemma 5.1.2. Let H1, . . . , Hm : N^n → N and R ⊆ N^m be computable. Then R(H1, . . . , Hm) ⊆ N^n is computable, where for a ∈ N^n we put

  R(H1, . . . , Hm)(a) ⇐⇒ R(H1(a), . . . , Hm(a)).

Proof. Observe that χ_{R(H1,...,Hm)} = χR(H1, . . . , Hm). Now apply (R2).

We shall use this device from now on in many proofs, but only tacitly. (The reader should of course notice when we do so.)
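The identities in the proofs above — χ≥(m, n) = χ≤(n, m) and χ= as a product — can be mirrored directly in a small sketch (the helper names chi, chi_le, etc. are mine):

```python
def chi(R):
    """Characteristic function of a relation R (modeled as a predicate)."""
    return lambda *a: 1 if R(*a) else 0

chi_le = chi(lambda m, n: m <= n)                   # basic function from (R1)
chi_ge = lambda m, n: chi_le(n, m)                  # chi_>=(m, n) = chi_<=(n, m)
chi_eq = lambda m, n: chi_le(m, n) * chi_ge(m, n)   # chi_= as a product

assert chi_ge(5, 3) == 1 and chi_ge(2, 3) == 0
assert chi_eq(4, 4) == 1 and chi_eq(4, 5) == 0
```

Note that only composition of already-available characteristic functions is used, matching rules (R1) and (R2).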
Definition. For k ∈ N we define the constant function c_k^n : N^n → N by c_k^n(a) = k.

Lemma 5.1.3. Every constant function c_k^n is computable.

Proof. To handle c_0^n we observe that c_0^n(a) = µx(I_{n+1}^{n+1}(a, x) = 0) and apply (R3). Next, observe that

  c_{k+1}^n(a) = µx(c_k^n(a) < x) = µx( χ≥(c_k^{n+1}(a, x), I_{n+1}^{n+1}(a, x)) = 0 ),

and the general result follows by induction on k.

Lemma 5.1.4. Let P, Q be n-ary relations on N, and suppose P, Q are computable. Then we can form the n-ary relations

  ¬P := N^n \ P,  P ∨ Q := P ∪ Q,  P ∧ Q := P ∩ Q,  P → Q := (¬P) ∨ Q,  P ↔ Q := (P → Q) ∧ (Q → P)

on N, and ¬P, P ∨ Q, P ∧ Q, P → Q and P ↔ Q are also computable.

Proof. The relation P ∧ Q is computable since χ_{P∧Q} = χP · χQ. Let a ∈ N^n. Then ¬P(a) iff χP(a) = 0 iff χP(a) = c_0^n(a), so χ_{¬P}(a) = χ=(χP(a), c_0^n(a)). Hence ¬P is computable by (R2) and Lemma 5.1.3. By De Morgan's Law, P ∨ Q = ¬(¬P ∧ ¬Q). Thus P ∨ Q is computable. The rest is clear.

Lemma 5.1.5. The binary relations <, >, ≥, ≤, = on N are computable.

Proof. The relations ≥, ≤ and = have already been taken care of by Lemma 5.1.1 and (R1). The remaining relations are complements of these (for example, χ<(m, n) = 1 iff m < n, and χ<(m, n) = 0 iff m ≥ n), so by Lemma 5.1.4 they are also computable.

Lemma 5.1.6. (Definition by Cases) Let R1, . . . , Rk ⊆ N^n be computable such that for each a ∈ N^n exactly one of R1(a), . . . , Rk(a) holds, and suppose that G1, . . . , Gk : N^n → N are computable. Then G : N^n → N given by

  G(a) = G1(a) if R1(a), . . . , G(a) = Gk(a) if Rk(a)

is computable.

Proof. This follows from G = G1 · χ_{R1} + · · · + Gk · χ_{Rk}. This is true for k = 0 (and all n) by Lemma 5.1.3 and (R1), so the general result follows by induction on k.

Lemma 5.1.7. (Definition by Cases) Let R1, . . . , Rk ⊆ N^n be computable such that for each a ∈ N^n exactly one of R1(a), . . . , Rk(a) holds, and let P1, . . . , Pk ⊆ N^n be computable. Then the relation P ⊆ N^n defined by

  P(a) ⇐⇒ P1(a) if R1(a), . . . , Pk(a) if Rk(a)

is computable.

Proof. Use that P = (P1 ∧ R1) ∨ · · · ∨ (Pk ∧ Rk).

Some notation: below we use the bold symbol ∃ as shorthand for "there exists a natural number"; likewise, we use the bold symbol ∀ to abbreviate "for all natural numbers." These abbreviation symbols should not be confused with the logical symbols ∃ and ∀.

Lemma 5.1.8. Let R ⊆ N^{n+1} be computable such that for all a ∈ N^n there exists x ∈ N with (a, x) ∈ R. Then the function F : N^n → N given by F(a) = µxR(a, x) is computable.

Proof. Note that F(a) = µx(χ_{¬R}(a, x) = 0) and apply (R3).

Here is a nice consequence of 5.1.8.

Lemma 5.1.9. Let F : N^n → N. Then F is computable if and only if its graph (a subset of N^{n+1}) is computable.

Proof. Let R ⊆ N^{n+1} be the graph of F. Then for all (a, y) = (a1, . . . , an, y) ∈ N^{n+1},

  R(a, y) ⇐⇒ F(a) = y,

from which the lemma follows immediately: if F is computable, then χR(a, y) = χ=(F(a), y) is computable; and if R is computable, then F(a) = µxR(a, x) is computable by Lemma 5.1.8.

Lemma 5.1.10. If R ⊆ N^{n+1} is computable, then the function FR : N^{n+1} → N defined by

  FR(a, y) = µx<y R(a, x)

is computable.

Proof. Use that FR(a, y) = µx(R(a, x) or x = y).

Lemma 5.1.11. Suppose R ⊆ N^{n+1} is computable, and let P, Q ⊆ N^{n+1} be the relations defined by

  P(a, y) ⇐⇒ ∃x<y R(a, x),  Q(a, y) ⇐⇒ ∀x<y R(a, x).

Then P and Q are computable.

Proof. Using the notation and results from Lemma 5.1.10, we note that P(a, y) iff FR(a, y) < y. Hence χP(a, y) = χ<(FR(a, y), y), so P is computable. For Q, note that ¬Q(a, y) iff ∃x<y ¬R(a, x), so Q is computable as well.
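The bounded search FR and the bounded quantifier of the last two lemmas can be played out in code; FR(a, y) < y holds exactly when a witness below y exists, as in the proof above. The names F_R and exists_below are mine:

```python
def F_R(R):
    """F_R(a, y) = least x < y with R(a, x), and y itself if there is none."""
    def f(a, y):
        for x in range(y):
            if R(a, x):
                return x
        return y
    return f

def exists_below(R):
    """chi_P(a, y) for P(a, y) <=> some x < y satisfies R(a, x)."""
    f = F_R(R)
    return lambda a, y: 1 if f(a, y) < y else 0

R = lambda a, x: x * x > a          # a sample computable relation
assert F_R(R)(10, 5) == 4           # 4*4 > 10, and no smaller x works
assert exists_below(R)(10, 4) == 0  # no witness below 4
assert exists_below(R)(10, 5) == 1
```

The design point is that both functions are total: the search is bounded by y, so no unbounded µ is needed here.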
The results above imply easily that many familiar functions are computable. But is the exponential function n → 2^n computable? It certainly is in the intuitive sense: we know how to compute (in principle) its value at any given argument. It is not that obvious from what we have proved so far that it is computable in our precise sense. We now develop some coding tricks due to Gödel that enable us to prove routinely that functions like 2^x are computable according to our definition of "computable function".

Lemma 5.1.12. The function −̇ : N^2 → N defined by

  a −̇ b = a − b if a ≥ b,  a −̇ b = 0 if a < b,

is computable.

Proof. Use that a −̇ b = µx(b + x = a or a < b).

Definition. Define the function Pair : N^2 → N by

  Pair(x, y) := ((x + y)(x + y + 1))/2 + x.

We call Pair the pairing function.

Lemma 5.1.13. The function Pair is bijective and computable. (Exercise.)

Since Pair is a bijection we can define functions Left, Right : N → N by

  Pair(x, y) = a ⇐⇒ Left(a) = x and Right(a) = y.

The reader should check that Left(a), Right(a) ≤ a for a ∈ N, and Left(a) < a if 0 < a ∈ N.

Lemma 5.1.14. The functions Left and Right are computable.

Proof. Use Lemmas 5.1.8 and 5.1.11 in combination with

  Left(a) = µx(∃y<a+1 Pair(x, y) = a)  and  Right(a) = µy(∃x<a+1 Pair(x, y) = a).
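The pairing function and its inverses are easy to experiment with; the brute-force searches below mirror the µ-definitions of Left and Right rather than aiming for efficiency:

```python
def pair(x, y):
    """Pair(x, y) = (x + y)(x + y + 1)/2 + x."""
    return (x + y) * (x + y + 1) // 2 + x

def left(a):
    """Least x such that pair(x, y) = a for some y < a + 1."""
    return next(x for x in range(a + 1) for y in range(a + 1) if pair(x, y) == a)

def right(a):
    """Least y such that pair(x, y) = a for some x < a + 1."""
    return next(y for y in range(a + 1) for x in range(a + 1) if pair(x, y) == a)

assert pair(2, 3) == 17 and left(17) == 2 and right(17) == 3
assert all(left(pair(x, y)) == x and right(pair(x, y)) == y
           for x in range(8) for y in range(8))
```

The round-trip check in the last line is, in effect, a finite test of the bijectivity claimed in Lemma 5.1.13.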
For a, b, c ∈ Z we have (by definition): a ≡ b mod c ⇐⇒ a − b ∈ cZ.

Lemma 5.1.15. The ternary relation a ≡ b mod c on N is computable.

Proof. Use that for a, b, c ∈ N we have

  a ≡ b mod c ⇐⇒ ∃x<a+1 (a = x · c + b) or ∃x<b+1 (b = x · c + a).

We can now introduce Gödel's function β : N^2 → N.

Definition. For a, i ∈ N we let β(a, i) be the remainder of Left(a) upon division by 1 + (i + 1) Right(a), that is,

  β(a, i) := µx( x ≡ Left(a) mod 1 + (i + 1) Right(a) ).

The computability of β is clear from earlier results. We have β(a, i) ≤ Left(a) ≤ a −̇ 1, that is, β(a, i) ≤ a −̇ 1 for all a, i ∈ N.

Proposition 5.1.16. For any sequence (a0, . . . , a_{n−1}) of natural numbers there exists a ∈ N such that β(a, i) = ai for i < n.

Proof. Let a0, . . . , a_{n−1} be natural numbers. Then we take an N ∈ N such that ai ≤ N for all i < n and N is a multiple of every prime number less than n. We claim that then 1 + N, 1 + 2N, . . . , 1 + nN are relatively prime. To see this, suppose p is a prime number such that p | 1 + iN and p | 1 + jN (1 ≤ i < j ≤ n). Then p divides their difference (j − i)N, but p does not divide N, hence p | j − i < n; but then p divides N after all, a contradiction. By the Chinese Remainder Theorem there exists an M such that

  M ≡ a0 mod 1 + N,  M ≡ a1 mod 1 + 2N,  . . . ,  M ≡ a_{n−1} mod 1 + nN.

Put a := Pair(M, N); then Left(a) = M and Right(a) = N, and since ai ≤ N < 1 + (i + 1)N, the remainder of M upon division by 1 + (i + 1)N is ai. Thus β(a, i) = ai as required.

Proposition 5.1.16 shows that we can use β to encode a sequence of numbers a0, . . . , a_{n−1} in terms of a single number a. We use this as follows to show that the function n → 2^n is computable. If a0, . . . , an are natural numbers such that a0 = 1 and a_{i+1} = 2a_i for all i < n, then necessarily an = 2^n. Hence

  2^n = β(a, n)  where a := µx( β(x, 0) = 1 and ∀i<n β(x, i + 1) = 2β(x, i) ).

It follows that n → 2^n is computable.
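Gödel's β-function, and the coding number a whose existence Proposition 5.1.16 guarantees, can be found by brute-force search in small cases. The helper names (unpair, beta, code_of) are mine; the search in code_of terminates precisely because the proposition guarantees a code exists:

```python
def unpair(a):
    """Invert the pairing function: return (Left(a), Right(a))."""
    s = 0
    while (s + 1) * (s + 2) // 2 <= a:   # s = x + y for the unique preimage
        s += 1
    x = a - s * (s + 1) // 2
    return x, s - x

def beta(a, i):
    """beta(a, i): remainder of Left(a) upon division by 1 + (i+1)*Right(a)."""
    left_a, right_a = unpair(a)
    return left_a % (1 + (i + 1) * right_a)

def code_of(seq):
    """Least a with beta(a, i) = seq[i] for all i < len(seq)."""
    a = 0
    while not all(beta(a, i) == v for i, v in enumerate(seq)):
        a += 1
    return a

a = code_of([2, 1])
assert beta(a, 0) == 2 and beta(a, 1) == 1
```

The search grows quickly with the sequence entries (the proof's Chinese Remainder construction produces large M), so this is only viable for tiny examples.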
The above suggests a general method, which we develop next.

Definition. To each sequence (a1, . . . , an) of natural numbers we assign a sequence number, denoted ⟨a1, . . . , an⟩, and defined to be the least natural number a such that β(a, 0) = n (the length of the sequence) and β(a, i) = ai for i = 1, . . . , n. For n = 0 this gives ⟨⟩ = 0, where ⟨⟩ is the sequence number of the empty sequence. Note that ai < ⟨a1, . . . , an⟩ for i = 1, . . . , n.

Lemma 5.1.17. For any n, the function (a1, . . . , an) → ⟨a1, . . . , an⟩ : N^n → N is computable.

Proof. Use that ⟨a1, . . . , an⟩ = µa( β(a, 0) = n and β(a, 1) = a1 and . . . and β(a, n) = an ), and apply Lemmas 5.1.4 and 5.1.8.

Definition. We define the length function lh : N → N by lh(a) = β(a, 0), and we put (a)_i := β(a, i + 1). Then lh(⟨a1, . . . , an⟩) = n, so lh is computable, and (⟨a1, . . . , an⟩)_i = a_{i+1} for i < n; the function (a, i) → (a)_i : N^2 → N is computable. Finally, let Seq ⊆ N denote the set of sequence numbers. The set Seq is computable since

  a ∈ Seq ⇐⇒ ∀x<a ( lh(x) ≠ lh(a) or ∃i<lh(a) (x)_i ≠ (a)_i ).

Lemma 5.1.18.
(1) The function In : N^2 → N defined by

  In(a, i) = µx( lh(x) = i and ∀j<i (x)_j = (a)_j )

is computable, and In(⟨a1, . . . , an⟩, i) = ⟨a1, . . . , ai⟩ for all a1, . . . , an ∈ N and i ≤ n.
(2) The function ∗ : N^2 → N defined by

  a ∗ b = µx( lh(x) = lh(a) + lh(b) and ∀i<lh(a) (x)_i = (a)_i and ∀j<lh(b) (x)_{lh(a)+j} = (b)_j )

is computable, and ⟨a1, . . . , am⟩ ∗ ⟨b1, . . . , bn⟩ = ⟨a1, . . . , am, b1, . . . , bn⟩ for all a1, . . . , am, b1, . . . , bn ∈ N.
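The defining properties of lh, (a)_i, In and ∗ can be modeled with Python tuples standing in for the actual β-based codes; this sketch illustrates only the algebra of Lemmas 5.1.17–5.1.18, not the numerical coding itself:

```python
def seq(*xs):
    """Stand-in for the sequence number <a1, ..., an>."""
    return tuple(xs)

def lh(a):
    """lh(<a1, ..., an>) = n."""
    return len(a)

def ith(a, i):
    """(a)_i = a_{i+1} for i < lh(a)."""
    return a[i]

def In(a, i):
    """In(<a1, ..., an>, i) = <a1, ..., ai>."""
    return a[:i]

def star(a, b):
    """Concatenation: <a1,...,am> * <b1,...,bn> = <a1,...,am,b1,...,bn>."""
    return a + b

assert lh(seq(7, 8, 9)) == 3
assert ith(seq(7, 8, 9), 0) == 7
assert In(seq(7, 8, 9), 2) == seq(7, 8)
assert star(seq(1), seq(2, 3)) == seq(1, 2, 3)
```

Every identity checked here is one of the equations the lemmas assert for genuine sequence numbers.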
Definition. For F : N^{n+1} → N, let F̄ : N^{n+1} → N be given by

  F̄(a, b) = ⟨F(a, 0), . . . , F(a, b − 1)⟩  (a ∈ N^n).

Note that F̄(a, 0) = ⟨⟩ = 0 and F̄(a, b + 1) = F̄(a, b) ∗ ⟨F(a, b)⟩.

Lemma 5.1.19. Let F : N^{n+1} → N. Then F is computable if and only if F̄ is computable.

Proof. Suppose F is computable. Then F̄ is computable since

  F̄(a, b) = µx( lh(x) = b and ∀i<b (x)_i = F(a, i) ).

In the other direction, suppose F̄ is computable. Then F is computable since

  F(a, b) = (F̄(a, b + 1))_b.

The next result is important because it allows us to introduce computable functions by recursion on their values at smaller arguments. Given G : N^{n+2} → N there is a unique function F : N^{n+1} → N such that

  F(a, b) = G(a, b, F̄(a, b))  (a ∈ N^n).

Proposition 5.1.20. Let G and F be as above and suppose G is computable. Then F is computable.

Proof. We claim that

  F̄(a, b) = µx( Seq(x) and lh(x) = b and ∀i<b (x)_i = G(a, i, In(x, i)) )

for all a ∈ N^n and b ∈ N. This claim yields the computability of F̄, and thus by the previous lemma F is computable. The claim follows: x = F̄(a, b) satisfies the condition, since In(F̄(a, b), i) = F̄(a, i) and hence (F̄(a, b))_i = F(a, i) = G(a, i, F̄(a, i)); and induction on i shows that any x satisfying the condition has the same length and entries as F̄(a, b), hence equals F̄(a, b).

Proposition 5.1.21. Let A : N^n → N and B : N^{n+2} → N be given, and define the function F : N^{n+1} → N by

  F(a, 0) = A(a),  F(a, b + 1) = B(a, b, F(a, b)).

We say that F is obtained from A and B by primitive recursion. Suppose A, B, and F are as above, and A and B are computable. Then F is computable.

Proof. Define G : N^{n+2} → N by

  G(a, b, c) = A(a) if c = 0,  G(a, b, c) = B(a, b −̇ 1, (c)_{b −̇ 1}) if c > 0.

Clearly, G is computable. We claim that F(a, b) = G(a, b, F̄(a, b)) for all a and b. This will be clear if we check the two cases: F̄(a, 0) = 0, so G(a, 0, F̄(a, 0)) = A(a) = F(a, 0); and F̄(a, b + 1) > 0, so

  G(a, b + 1, F̄(a, b + 1)) = B(a, b, (F̄(a, b + 1))_b) = B(a, b, F(a, b)) = F(a, b + 1).

Hence by Proposition 5.1.20, F is computable.
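The primitive recursion scheme of Proposition 5.1.21 is easy to realize directly (primrec is my name for it); multiplication obtained from addition is the classic instance:

```python
def primrec(A, B):
    """Return F with F(a, 0) = A(a) and F(a, b + 1) = B(a, b, F(a, b))."""
    def F(a, b):
        value = A(a)
        for i in range(b):
            value = B(a, i, value)
        return value
    return F

# a * 0 = 0 and a * (b + 1) = (a * b) + a
mult = primrec(lambda a: 0, lambda a, b, prev: prev + a)

assert mult(6, 7) == 42
assert mult(5, 0) == 0
```

The loop replaces the recursion on b; it visits exactly the values F(a, 0), . . . , F(a, b), in the same order the defining equations generate them.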
Proposition 5.1.20 will be applied over and over again in the later section on Gödel numbering, but in combination with definitions by cases. As a simple example of such an application, let G : N → N and H : N^2 → N be computable. There is clearly a unique function F : N^2 → N such that for all a, b ∈ N:

  F(a, b) = H(a, F(a, G(b))) if G(b) < b,  F(a, b) = H(a, b) otherwise.

In particular F(a, 0) = H(a, 0). We claim that F is computable. According to Proposition 5.1.20 this claim will follow if we can specify a computable function K : N^3 → N such that F(a, b) = K(a, b, F̄(a, b)) for all a, b ∈ N. Such a function K is given by

  K(a, b, c) = H(a, (c)_{G(b)}) if G(b) < b,  K(a, b, c) = H(a, b) otherwise.

Exercises.
(1) The set of prime numbers is computable.
(2) The Fibonacci numbers are the natural numbers Fn defined recursively by F0 = 0, F1 = 1, and F_{n+2} = F_{n+1} + F_n. The function n → Fn : N → N is computable.
(3) If f1, . . . , fn : N^m → N are computable and X ⊆ N^n is computable, then f^{−1}(X) ⊆ N^m is computable, where f := (f1, . . . , fn) : N^m → N^n.
(4) If f : N → N is computable and surjective, then there is a computable function g : N → N such that f ◦ g = id_N.
(5) If f : N → N is computable and strictly increasing, then f(N) ⊆ N is computable.
(6) All computable functions and relations are definable in N.
(7) Let F : N^n → N, and define F̂ : N → N by

  F̂(a) := F((a)_0, . . . , (a)_{n−1}).

Then F is computable iff F̂ is computable. (Hence n-variable computability reduces to 1-variable computability.)
(8) Let F be a collection of functions F : N^m → N for various m. We say that a relation R ⊆ N^n is in F if its characteristic function χR is in F. We say that F is closed under composition if for all G : N^m → N in F and all H1, . . . , Hm : N^n → N in F, the function F = G(H1, . . . , Hm) : N^n → N is in F. We say that F is closed under minimalization if for every G : N^{n+1} → N in F such that for all a ∈ N^n there exists x ∈ N with G(a, x) = 0, the function F : N^n → N given by F(a) = µx(G(a, x) = 0) is in F. Suppose F contains the functions mentioned in (R1), and is closed under composition and minimalization. Then all lemmas and propositions of this section go through with "computable" replaced by "in F".
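Exercise (2) (the Fibonacci function) is a typical use of recursion on the coded history F̄(b) = ⟨F(0), . . . , F(b − 1)⟩ in the sense of Proposition 5.1.20; in this sketch the history is a plain tuple rather than a sequence number:

```python
def course_of_values(G):
    """Return F with F(b) = G((F(0), ..., F(b - 1)))."""
    def F(b):
        history = []
        for _ in range(b + 1):
            history.append(G(tuple(history)))
        return history[b]
    return F

# F0 = 0, F1 = 1, F_{n+2} = F_{n+1} + F_n, each value read off the history
fib = course_of_values(lambda h: len(h) if len(h) < 2 else h[-1] + h[-2])

assert [fib(n) for n in range(8)] == [0, 1, 1, 2, 3, 5, 8, 13]
```

The single function G sees the whole history, which is exactly what makes recursions that reach back more than one step (like Fibonacci) fit the scheme.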
5.2 The Church–Turing Thesis

The computable functions as defined in the last section are also computable in the informal sense that for each such function F : N^n → N there is an algorithm that on any input a ∈ N^n stops after a finite number of steps and produces an output F(a). We deliberately neglect physical constraints of space and time: imagine that the program that implements the algorithm has unlimited access to time and space to do its work on any given input.

An algorithm is given by a finite list of instructions, a computer program, say. These instructions should be deterministic (leave nothing to chance or choice). This intuitive notion of algorithm is not a precise mathematical one, but Turing gave in 1936 a compelling informal analysis of what functions F : N → N are calculable in principle, that is, by algorithms. Let us write "calculable" for this intuitive, informal, idealized notion of computable, and call a set P ⊆ N calculable if its characteristic function is calculable. The Church–Turing Thesis asserts:

  each calculable function F : N → N is computable.

The corresponding assertion for functions N^n → N follows, because the result of Exercise 7 in Section 5.1 is clearly also valid for "calculable" instead of "computable." While the Church–Turing Thesis is not a precise mathematical statement, it is an important guiding principle, and has never failed in practice: any function that any competent person has ever recognized as being calculable has turned out to be computable, and the informal grounds for calculability have always translated routinely into an actual proof of computability. In addition, various alternative formalizations of the informal notion of calculable function have been proposed, using various kinds of machines, formal systems, and so on. They all have turned out to be equivalent in the sense of defining the same class of functions on N, namely the computable functions, and this has led to general acceptance of the Thesis.

Here is a heuristic (informal) argument that might make the Thesis plausible. Let an algorithm be given for computing F : N → N. We can assume that on any input a ∈ N this algorithm consists of a finite sequence of steps, numbered from 0 to n, where at each step i it produces a natural number ai, with a0 = a as starting number. We assume that for each i < n the number a_{i+1} is calculated by some fixed procedure from the earlier numbers a0, . . . , ai, that is, we have a calculable function G : N → N such that a_{i+1} = G(⟨a0, . . . , ai⟩) for all i < n. The algorithm should also tell us when to stop; that is, we should have a calculable P ⊆ N such that ¬P(⟨a0, . . . , ai⟩) for all i < n, and P(⟨a0, . . . , an⟩). It stops after step n with an = F(a). Since G and P describe only single steps in the algorithm for F, it is reasonable to assume that they at least are computable. Once this is agreed to, one can show easily that F is computable as well; see the exercise below.
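The heuristic can itself be played out in code: an "algorithm" is a step function G acting on the coded history, together with a stopping predicate P. The names run, G, P and the toy doubling algorithm are mine, chosen only to illustrate the shape of the argument:

```python
def run(G, P, a):
    """a0 = a; a_{i+1} = G(history); stop at the first history satisfying P."""
    history = [a]
    while not P(history):
        history.append(G(history))
    return history[-1]

# toy algorithm: keep doubling the last number until it reaches at least 100
G = lambda h: 2 * h[-1]
P = lambda h: h[-1] >= 100

assert run(G, P, 3) == 192     # 3, 6, 12, 24, 48, 96, 192
assert run(G, P, 100) == 100   # already stopped at step 0
```

Note that only the single-step data (G, P) are specified; the function computed by run is determined by them, exactly as in the heuristic argument.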
A skeptical reader may find this argument dubious, but it suffices for our purpose. The above is only a rather narrow version of the Church–Turing Thesis; there are various refinements and more ambitious versions. In particular, our Church–Turing Thesis does not characterize mathematically the intuitive notion of algorithm, only the notion of function computable by an algorithm that produces for each input from N an output in N.

Exercises.
(1) Let G : N → N and P ⊆ N be given. Then there is for each a ∈ N at most one finite sequence a0, . . . , an of natural numbers such that a0 = a, for all i < n we have a_{i+1} = G(⟨a0, . . . , ai⟩) and ¬P(⟨a0, . . . , ai⟩), and P(⟨a0, . . . , an⟩). Suppose that for each a ∈ N there is such a finite sequence a0, . . . , an, and put F(a) := an, thus defining a function F : N → N. If G and P are computable, so is F.

5.3 Primitive Recursive Functions

This section is not really needed in the rest of this chapter, but it may throw light on some issues relating to computability. One such issue is the condition, in Rule (R3) for generating computable functions, that for all a ∈ N^n there exists y ∈ N such that G(a, y) = 0. This condition is not constructive: it could be satisfied for a certain G without us ever knowing it.

We shall now argue informally that it is impossible to generate in a fully constructive way exactly the computable functions. Such a constructive generation process would presumably enable us to enumerate effectively a sequence of algorithms α0, α1, α2, . . . , such that each αn computes a (computable) function fn : N → N, and such that every computable function f : N → N occurs in the sequence f0, f1, f2, . . . , possibly more than once. Now consider the function fdiag : N → N defined by

  fdiag(n) = fn(n) + 1.

Then fdiag is clearly computable in the intuitive sense, but fdiag ≠ fn for all n; so fdiag would be a calculable function that is not computable, in violation of the Church–Turing Thesis. This way of producing a new function fdiag from a sequence (fn) is called diagonalization.¹ The same basic idea applies in other cases, and is used in a more sophisticated form in the proof of Gödel's incompleteness theorem.

¹Perhaps it would be better to call it antidiagonalization.

Here is a class of computable functions that can be generated constructively: the primitive recursive functions are the functions f : N^n → N obtained inductively as follows:

(PR1) The nullary function N^0 → N with value 0, the unary successor function S, and all coordinate functions I_i^n are primitive recursive.
(PR2) If G : N^m → N is primitive recursive and H1, . . . , Hm : N^n → N are primitive recursive, then G(H1, . . . , Hm) is primitive recursive.
(PR3) If F : N^{n+1} → N is obtained by primitive recursion from primitive recursive functions G : N^n → N and H : N^{n+2} → N, then F is primitive recursive.
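The diagonalization step above can be seen in miniature with a finite stand-in for the enumeration f0, f1, f2, . . . (the list fs below is an arbitrary sample, of course not an actual enumeration of all computable functions):

```python
fs = [lambda n: 0,        # f0: constant zero
      lambda n: n + 7,    # f1
      lambda n: n * n]    # f2

def fdiag(n):
    return fs[n](n) + 1   # differs from f_n at the argument n

assert [fdiag(n) for n in range(3)] == [1, 9, 5]
assert all(fdiag(n) != fs[n](n) for n in range(3))
```

The second assertion is the whole point: fdiag disagrees with every listed function at its own index, so it cannot occur in the list.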
Definition. A relation R ⊆ N^n is said to be primitive recursive if its characteristic function χR is primitive recursive.

As the next two lemmas show, the computable functions that one ordinarily meets with are primitive recursive. In the rest of this section x ranges over N^m with m depending on the context, and y over N.

Lemma 5.3.1. The following functions and relations are primitive recursive:
(i) each constant function c_m^n;
(ii) the binary operations +, · and (x, y) → x^y on N;
(iii) the predecessor function Pd : N → N given by Pd(x) = x −̇ 1, the function −̇ : N^2 → N, and the unary relation {x ∈ N : x > 0};
(iv) the binary relations ≥, ≤ and = on N.

Proof. The function c_m^0 is obtained from c_0^0 by applying (PR2) m times with G = S, and c_m^n is obtained from c_m^0 by applying (PR2). The functions in (ii) are obtained by the usual primitive recursions. It is also easy to write down primitive recursions for the functions in (iii), in the order they are listed. For (iv), note that χ≥(x, 0) = 1 and

  χ≥(x, y + 1) = χ>0(x) · χ≥(Pd(x), y),

and that ≤ and = reduce to ≥ as before.

With the possible exceptions of Lemmas 5.1.8 and 5.1.9, all Lemmas and Propositions in Section 5.1 go through with "computable" replaced by "primitive recursive." To obtain the primitive recursive version of Lemma 5.1.10, note that FR(a, 0) = 0 and

  FR(a, y + 1) = FR(a, y) · χR(a, FR(a, y)) + (y + 1) · χ¬R(a, FR(a, y)).

A consequence of the primitive recursive version of Lemma 5.1.10 is the following "restricted minimalization scheme" for primitive recursive functions: if R ⊆ N^{n+1} and H : N^n → N are primitive recursive, and for all a ∈ N^n there exists x < H(a) such that R(a, x), then the function F : N^n → N given by F(a) = µxR(a, x) is primitive recursive.

Using this fact and restricted minimalization, the function β is primitive recursive. It follows that the unary relation Seq, the unary function lh, the binary functions (a, i) → (a)_i and In, and the operation ∗ are primitive recursive. In particular, the proof of Proposition 5.1.16 yields: there is a primitive recursive function B : N → N such that, whenever n < N, a0 < N, . . . , a_{n−1} < N (n, N ∈ N), then for some a < B(N) we have β(a, i) = ai for i = 0, . . . , n − 1.

Let a function F : N^{n+1} → N be given. Then F̄ : N^{n+1} → N satisfies the primitive recursion

  F̄(a, 0) = 0,  F̄(a, b + 1) = F̄(a, b) ∗ ⟨F(a, b)⟩.

It follows that if F is primitive recursive, then so is F̄; the converse is obvious, since F(a, b) = (F̄(a, b + 1))_b.
Suppose also that G : N^{n+2} → N is primitive recursive and that F(a, b) = G(a, b, F̄(a, b)) for all (a, b) ∈ N^{n+1}. Then F̄ satisfies the primitive recursion F̄(a, 0) = 0 and F̄(a, b + 1) = F̄(a, b) ∗ ⟨G(a, b, F̄(a, b))⟩, so F̄ (and hence F) is primitive recursive.

The Ackermann Function. By diagonalization we can produce a computable function that is not primitive recursive, but the so-called Ackermann function does more, and plays a role in several contexts. First we define inductively a sequence A_0, A_1, A_2, ... of primitive recursive functions A_n : N → N:

A_0(y) = y + 1, A_{n+1}(0) = A_n(1), A_{n+1}(y + 1) = A_n(A_{n+1}(y)).

Thus A_0 = S and A_{n+1} ∘ A_0 = A_n ∘ A_{n+1}. We define the Ackermann function A : N² → N by A(n, y) := A_n(y). One verifies easily that A_1(y) = y + 2 and A_2(y) = 2y + 3 for all y. The function A is computable; we leave it as an exercise at the end of this section to show this.

Lemma 5.3.3. Each A_n is strictly increasing, A_0(y) < A_1(y) < ··· < A_n(y) for all y, and A is strictly increasing in each variable. Moreover, for all n, x, y:
(i) A_n(x + y) ≥ A_n(x) + y;
(ii) n ≥ 1 ⟹ A_{n+1}(y) > A_n(y) + y;
(iii) A_{n+1}(y) ≥ A_n(y + 1);
(iv) 2A_n(y) < A_{n+2}(y);
(v) x < y ⟹ A_n(x + y) ≤ A_{n+2}(y).

Proof. Assume inductively that A_0, A_1, ..., A_n are strictly increasing and A_0(y) < A_1(y) < ··· < A_n(y) for all y. Then A_{n+1}(y + 1) = A_n(A_{n+1}(y)) ≥ A_0(A_{n+1}(y)) > A_{n+1}(y), so A_{n+1} is strictly increasing. Next we show that A_{n+1}(y) > A_n(y) for all y: A_{n+1}(0) = A_n(1), so A_{n+1}(0) > A_n(0) and A_{n+1}(0) > 1, and thus A_{n+1}(y) > y + 1 for all y. Hence A_{n+1}(y + 1) = A_n(A_{n+1}(y)) > A_n(y + 1).

Inequality (i) follows easily by induction on n and a second induction on y. For inequality (ii), we proceed again by induction on (n, y). Using A_1(y) = y + 2 and A_2(y) = 2y + 3, we obtain A_2(y) > A_1(y) + y. Let n > 1, and assume inductively that A_n(y) > A_{n−1}(y) + y. Then A_{n+1}(0) = A_n(1) > A_n(0) + 0, and A_{n+1}(y + 1) = A_n(A_{n+1}(y)) ≥ A_n(y + 1 + A_n(y)) ≥ A_n(y + 1) + A_n(y) > A_n(y + 1) + y + 1.
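The double recursion defining A can be run directly. A small sketch (plain Python recursion with memoization, feasible only for tiny n) checks the closed forms A_1(y) = y + 2 and A_2(y) = 2y + 3 derived above:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def A(n, y):
    """A(0, y) = y + 1, A(n+1, 0) = A(n, 1), A(n+1, y+1) = A(n, A(n+1, y))."""
    if n == 0:
        return y + 1
    if y == 0:
        return A(n - 1, 1)
    return A(n - 1, A(n, y - 1))

# The closed forms verified in the text:
assert all(A(1, y) == y + 2 for y in range(20))
assert all(A(2, y) == 2 * y + 3 for y in range(15))
# Growth explodes quickly: A(3, y) = 2^(y+3) - 3.
assert A(3, 3) == 2 ** 6 - 3
```

Even A(4, 2) is already astronomically large, which is the informal content of Proposition 5.3.4 below: the A_n eventually dominate every primitive recursive function.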
In (iii) we proceed by induction on y. We have equality for y = 0. Assuming inductively that (iii) holds for a certain y, we obtain A_{n+1}(y + 1) = A_n(A_{n+1}(y)) ≥ A_n(A_n(y + 1)) ≥ A_n(y + 2).

Note that (iv) holds for n = 0. For n > 0 we have, by (i), (ii) and (iii): A_n(y) + A_n(y) ≤ A_n(y + A_n(y)) < A_n(A_{n+1}(y)) = A_{n+1}(y + 1) ≤ A_{n+2}(y).

Note that (v) holds for n = 0. Assume (v) holds for a certain n, and let x < y + 1; we want to show that A_{n+1}(x + y + 1) ≤ A_{n+3}(y + 1).

Case 1: x = y. Then A_{n+1}(x + y + 1) = A_{n+1}(2x + 1) = A_n(A_{n+1}(2x)) ≤ A_{n+2}(2x) < A_{n+2}(A_{n+3}(x)) = A_{n+3}(y + 1).

Case 2: x < y. We can assume inductively that if x < y, then A_{n+1}(x + y) ≤ A_{n+3}(y). Then A_{n+1}(x + y + 1) = A_n(A_{n+1}(x + y)) ≤ A_{n+2}(A_{n+3}(y)) = A_{n+3}(y + 1).

Below we put |x| := x_1 + ··· + x_m for x = (x_1, ..., x_m) ∈ N^m.

Proposition 5.3.4. Given any primitive recursive function F : N^m → N there is an n = n(F) such that F(x) ≤ A_n(|x|) for all x ∈ N^m. Call an n = n(F) with this property a bound for F.

Proof. The nullary constant function with value 0, the successor function S, and each coordinate function I^m_i (1 ≤ i ≤ m) has bound 0.

Next, assume F = G(H_1, ..., H_k), where G : N^k → N and H_1, ..., H_k : N^m → N are primitive recursive, and assume inductively that n(G) and n(H_1), ..., n(H_k) are bounds for G and H_1, ..., H_k. By part (iv) of the previous lemma we can take N ∈ N such that n(G) ≤ N and Σ_i H_i(x) ≤ A_{N+1}(|x|) for all x. Then F(x) = G(H_1(x), ..., H_k(x)) ≤ A_N(Σ_i H_i(x)) ≤ A_N(A_{N+1}(|x|)) ≤ A_{N+2}(|x|).

Finally, assume that F : N^{m+1} → N is obtained by primitive recursion from the primitive recursive functions G : N^m → N and H : N^{m+2} → N, and assume inductively that n(G) and n(H) are bounds for G and H. Take N ∈ N such that n(G) ≤ N + 3 and n(H) ≤ N. We claim that N + 3 is a bound for F. Indeed, F(x, 0) = G(x) ≤ A_{N+3}(|x|), and by part (v) of the lemma above, F(x, y + 1) = H(x, y, F(x, y)) ≤ A_N(|x| + y + A_{N+3}(|x| + y)) ≤ A_{N+2}(A_{N+3}(|x| + y)) = A_{N+3}(|x| + y + 1).
Consider the function A* : N → N defined by A*(n) = A(n, n). Then A* is computable, and for any primitive recursive function F : N → N we have F(y) < A*(y) for all y > n(F), where n(F) is a bound for F. In particular, A* is not primitive recursive, and it gives an example showing that Lemma 5.1.9 fails with "primitive recursive" in place of "computable." Hence A is computable but not primitive recursive.

The Ackermann function is defined by a recursion involving both variables:

A(0, y) = y + 1, A(x + 1, 0) = A(x, 1), A(x + 1, y + 1) = A(x, A(x + 1, y)).

The recursion in "primitive recursion" involves only one variable; the other variables just act as parameters. This kind of double recursion is therefore more powerful in some ways than what can be done in terms of primitive recursion and composition.

Exercises. (1) The graph of the Ackermann function is primitive recursive. (It follows that the Ackermann function is recursive.)

5.4 Representability

Let L be a numerical language, that is, L contains the constant symbol 0 and the unary function symbol S. Our key example of a numerical language is L(N) := {0, S, +, ·, <} (the language of N). We let S^n 0 denote the term S . . . S0 in which S appears exactly n times. So S^0 0 is the term 0, S^1 0 is the term S0, and so on.

Here N is the following set of nine axioms, where we fix two distinct variables x and y for the sake of definiteness:

N1 ∀x (Sx ≠ 0)
N2 ∀x∀y (Sx = Sy → x = y)
N3 ∀x (x + 0 = x)
N4 ∀x∀y (x + Sy = S(x + y))
N5 ∀x (x · 0 = 0)
N6 ∀x∀y (x · Sy = x · y + x)
N7 ∀x ¬(x < 0)
N8 ∀x∀y (x < Sy ↔ (x < y ∨ x = y))
N9 ∀x∀y (x < y ∨ x = y ∨ y < x)

These axioms are clearly true in N. It is a very weak set of axioms, but strong enough to prove numerical facts like SS0 + SSS0 = SSSSS0. The fact that N is finite will play a role later.

Lemma 5.4.1. For each n, N ⊢ x < S^{n+1} 0 ↔ (x = 0 ∨ ··· ∨ x = S^n 0). In particular, N ⊢ ∀x (x < SS0 → (x = 0 ∨ x = S0)).
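The numeral S^n 0 is just a string of n occurrences of S followed by 0, so facts like N ⊢ SS0 + SSS0 = SSSSS0 can be made concrete at the level of terms. A minimal Python sketch (outside the notes' formalism):

```python
def numeral(n):
    """The term S^n 0: n copies of the symbol S followed by 0."""
    return "S" * n + "0"

assert numeral(0) == "0"
assert numeral(3) == "SSS0"
# N proves SS0 + SSS0 = SSSSS0; on numerals, addition of the values
# corresponds to concatenation of the S-prefixes:
assert numeral(2 + 3) == "S" * 5 + "0"
```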
Proof. By induction on n. For n = 0, N ⊢ x < S0 ↔ x = 0 by axioms N8 and N7. Assume n > 0 and N ⊢ x < S^n 0 ↔ (x = 0 ∨ ··· ∨ x = S^{n−1} 0). Use axiom N8 to conclude that N ⊢ x < S^{n+1} 0 ↔ (x = 0 ∨ ··· ∨ x = S^n 0).

To give an impression how weak N is, we consider some of its models.

Some models of N. (1) We usually refer to N as the standard model of N.
(2) Another model of N is N[x] := (N[x]; ...), where 0, S, +, · are interpreted as the zero polynomial, as the unary operation of adding 1 to a polynomial, and as addition and multiplication of polynomials in N[x], and where < is interpreted as follows: f(x) < g(x) iff f(n) < g(n) for all large enough n.
(3) A more bizarre model of N: (R^{≥0}; ...), with the usual interpretations of 0, S, +, · (in particular S(r) := r + 1), and with < interpreted as the binary relation <_N on R^{≥0}: r <_N s ⇔ (r, s ∈ N and r < s) or s ∉ N.

Example (2) shows that N ⊬ ∀x∃y (x = 2y ∨ x = 2y + S0), since in N[x] the element x is not in 2N[x] ∪ (2N[x] + 1); in other words, N cannot prove "every element is even or odd." In example (3) the binary relation <_N on R^{≥0} is not even a total order.

One useful fact about models of N is that they all contain the so-called standard model N in a unique way:

Lemma 5.4.2. Suppose A ⊨ N. Then there is a unique homomorphism ι : N → A. This homomorphism ι is an embedding, and for all a ∈ A and all n:
(i) if a <^A ι(n), then a = ι(m) for some m < n;
(ii) if a ∉ ι(N), then ι(n) <^A a.

As to the proof, note that for any homomorphism ι : N → A and all n we must have ι(n) = (S^n 0)^A. Hence there is at most one such homomorphism. It remains to show that the map n ↦ (S^n 0)^A : N → A is an embedding ι : N → A with properties (i) and (ii). We leave this as an exercise to the reader.

Definition. Let L be a numerical language, and Σ a set of L-sentences. A relation R ⊆ N^m is said to be Σ-representable if there is an L-formula ϕ(x_1, ..., x_m) such that for all (a_1, ..., a_m) ∈ N^m we have (i) R(a_1, . . .
, a_m) ⟹ Σ ⊢ ϕ(S^{a_1} 0, ..., S^{a_m} 0), and (ii) ¬R(a_1, ..., a_m) ⟹ Σ ⊢ ¬ϕ(S^{a_1} 0, ..., S^{a_m} 0). Such a ϕ(x_1, ..., x_m) is said to represent R in Σ, or to Σ-represent R. Note that if ϕ(x_1, ..., x_m) Σ-represents R and Σ is consistent, then for all (a_1, ..., a_m) ∈ N^m:

R(a_1, ..., a_m) ⟺ Σ ⊢ ϕ(S^{a_1} 0, ..., S^{a_m} 0),
¬R(a_1, ..., a_m) ⟺ Σ ⊢ ¬ϕ(S^{a_1} 0, ..., S^{a_m} 0).
A function F : N^m → N is Σ-representable if there is a formula ϕ(x_1, ..., x_m, y) of L such that for all (a_1, ..., a_m) ∈ N^m we have Σ ⊢ ϕ(S^{a_1} 0, ..., S^{a_m} 0, y) ↔ y = S^{F(a_1,...,a_m)} 0.
Such a ϕ(x_1, ..., x_m, y) is said to represent F in Σ, or to Σ-represent F.
An L-term t(x_1, ..., x_m) is said to represent the function F : N^m → N in Σ if Σ ⊢ t(S^{a_1} 0, ..., S^{a_m} 0) = S^{F(a)} 0 for all a = (a_1, ..., a_m) ∈ N^m. Note that then the function F is Σ-represented by the formula t(x_1, ..., x_m) = y.

Proposition 5.4.3. Let L be a numerical language, Σ a set of L-sentences such that Σ ⊢ S0 ≠ 0, and R ⊆ N^m a relation. Then R is Σ-representable ⟺ χ_R is Σ-representable.

Proof. (⇐) Assume χ_R is Σ-representable and let ϕ(x_1, ..., x_m, y) be an L-formula Σ-representing it. We show that ψ(x_1, ..., x_m) := ϕ(x_1, ..., x_m, S0) Σ-represents R. Let (a_1, ..., a_m) ∈ R; then χ_R(a_1, ..., a_m) = 1. Hence Σ ⊢ ϕ(S^{a_1} 0, ..., S^{a_m} 0, y) ↔ y = S0,
so Σ ⊢ ϕ(S^{a_1} 0, ..., S^{a_m} 0, S0), that is, Σ ⊢ ψ(S^{a_1} 0, ..., S^{a_m} 0). Similarly, (a_1, ..., a_m) ∉ R implies Σ ⊢ ¬ϕ(S^{a_1} 0, ..., S^{a_m} 0, S0). (Here we need Σ ⊢ S0 ≠ 0.)

(⇒) Conversely, assume R is Σ-representable and let ϕ(x_1, ..., x_m) be an L-formula Σ-representing it. We show that ψ(x_1, ..., x_m, y) := (ϕ(x_1, ..., x_m) ∧ y = S0) ∨ (¬ϕ(x_1, ..., x_m) ∧ y = 0) Σ-represents χ_R. Let (a_1, ..., a_m) ∈ R; then Σ ⊢ ϕ(S^{a_1} 0, ..., S^{a_m} 0). Hence,
Σ ⊢ [(ϕ(S^{a_1} 0, ..., S^{a_m} 0) ∧ y = S0) ∨ (¬ϕ(S^{a_1} 0, ..., S^{a_m} 0) ∧ y = 0)] ↔ y = S0,
i.e., Σ ⊢ ψ(S^{a_1} 0, ..., S^{a_m} 0, y) ↔ y = S0. And similarly, for (a_1, ..., a_m) ∉ R, Σ ⊢ ψ(S^{a_1} 0, ..., S^{a_m} 0, y) ↔ y = 0.

Theorem 5.4.4 (Representability). Each computable function F : N^n → N is N-representable. Each computable relation R ⊆ N^m is N-representable.

Proof. By Proposition 5.4.3 we need only consider the case of functions. We make the following three claims:

(R1) + : N² → N, · : N² → N, χ_≤ : N² → N, and the coordinate functions I^n_i (for each n and i = 1, ..., n) are N-representable.

(R2) If G : N^m → N and H_1, ..., H_m : N^n → N are N-representable, then so is F = G(H_1, ..., H_m) : N^n → N defined by F(a) = G(H_1(a), ..., H_m(a)).
(R3)
If G : N^{n+1} → N is N-representable, and for all a ∈ N^n there exists x ∈ N such that G(a, x) = 0, then the function F : N^n → N given by F(a) = µx(G(a, x) = 0)
is N-representable.

(R1): The proof of this claim has six parts.
(i) The formula x_1 = x_2 represents {(a, b) ∈ N² : a = b} in N. Let a, b ∈ N. If a = b, then obviously N ⊢ S^a 0 = S^b 0. Suppose that a ≠ b. Then for every model A of N we have A ⊨ S^a 0 ≠ S^b 0, by Lemma 5.4.2 and its proof. Hence N ⊢ S^a 0 ≠ S^b 0.
(ii) The term x_1 + x_2 represents + : N² → N in N. Let a + b = c, where a, b, c ∈ N. By Lemma 5.4.2 and its proof we have A ⊨ S^a 0 + S^b 0 = S^c 0 for each model A of N. It follows that N ⊢ S^a 0 + S^b 0 = S^c 0.
(iii) The term x_1 · x_2 represents · : N² → N in N. The proof is similar to that of (ii).
(iv) The formula x_1 < x_2 represents {(a, b) ∈ N² : a < b} in N. The proof is similar to that of (i).
(v) χ_≤ : N² → N is N-representable. By (i) and (iv), the formula x_1 < x_2 ∨ x_1 = x_2 represents the set {(a, b) ∈ N² : a ≤ b} in N. So by Proposition 5.4.3, χ_≤ : N² → N is N-representable.
(vi) For n ≥ 1 and 1 ≤ i ≤ n, the term t^n_i(x_1, ..., x_n) := x_i represents the function I^n_i : N^n → N in N. This is obvious.

(R2): Let H_1, ..., H_m : N^n → N be N-represented by ϕ_i(x_1, ..., x_n, y_i), and let G : N^m → N be N-represented by ψ(y_1, ..., y_m, z), where z is distinct from x_1, ..., x_n and y_1, ..., y_m. Claim: F = G(H_1, ..., H_m) is N-represented by
θ(x_1, ..., x_n, z) := ∃y_1 ... ∃y_m ((⋀_{i=1}^{m} ϕ_i(x_1, ..., x_n, y_i)) ∧ ψ(y_1, ..., y_m, z)).
Put a = (a_1, ..., a_n) and let c = F(a). We have to show that N ⊢ θ(S^a 0, z) ↔ z = S^c 0, where S^a 0 := (S^{a_1} 0, ..., S^{a_n} 0). Let b_i = H_i(a) and put b = (b_1, ..., b_m). Then F(a) = G(b) = c. Therefore, N ⊢ ψ(S^b 0, z) ↔ z = S^c 0 and N ⊢ ϕ_i(S^a 0, y_i) ↔ y_i = S^{b_i} 0 (i = 1, ..., m).
Argue in models to conclude: N ⊢ θ(S^a 0, z) ↔ z = S^c 0.
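Claim (R3) below uses the minimalization operator µ, that is, unbounded search for the least zero. A Python sketch (the function G here is an illustrative toy, chosen so that a zero always exists):

```python
def mu(pred):
    """mu x (pred(x)): least x in N with pred(x); loops forever if none exists."""
    x = 0
    while not pred(x):
        x += 1
    return x

# F(a) = mu x (G(a, x) = 0), as in (R3).  Toy G with a guaranteed zero:
def G(a, x):
    return (x - a % 5) ** 2

def F(a):
    return mu(lambda x: G(a, x) == 0)

assert F(12) == 2   # least x with G(12, x) = 0 is 12 mod 5
assert F(3) == 3
```

The hypothesis in (R3) that a zero exists for every a is exactly what guarantees the search terminates.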
(R3): Let G : N^{n+1} → N be such that for all a ∈ N^n there exists b ∈ N with G(a, b) = 0. Define F : N^n → N by F(a) = µb(G(a, b) = 0). Suppose that G is N-represented by ϕ(x_1, ..., x_n, y, z). We claim that the formula

ψ(x_1, ..., x_n, y) := ϕ(x_1, ..., x_n, y, 0) ∧ ∀w(w < y → ¬ϕ(x_1, ..., x_n, w, 0))

N-represents F. Let a ∈ N^n and let b = F(a). Then G(a, i) ≠ 0 for i < b and G(a, b) = 0. Therefore N ⊢ ϕ(S^a 0, S^b 0, z) ↔ z = 0, and for i < b, N ⊢ ϕ(S^a 0, S^i 0, z) ↔ z = S^{G(a,i)} 0. By arguing in models using Lemma 5.4.2 we obtain N ⊢ ψ(S^a 0, y) ↔ y = S^b 0, as claimed.

The converse of this theorem is also true, and is plausible from the Church–Turing Thesis. We shall prove the converse in the next section.

Exercises. In the exercises below, L is a numerical language and Σ is a set of L-sentences.
(1) Suppose Σ ⊢ S^m 0 ≠ S^n 0 whenever m ≠ n. Then the set of all Σ-representable functions F : N^m → N (m = 0, 1, 2, ...) is closed under composition and minimalization. (This result applies to Σ = N, since N ⊢ S^m 0 ≠ S^n 0 whenever m ≠ n.)
(2) Suppose Σ ⊇ N. If a function F : N^m → N is Σ-represented by the L-formula ϕ(x_1, ..., x_m, y), then the graph of F, as a relation of arity m + 1 on N, is Σ-represented by ϕ(x_1, ..., x_m, y).

5.5 Decidability and Gödel Numbering

Definition. An L-theory T is a set of L-sentences closed under provability, that is, whenever T ⊢ σ, then σ ∈ T.

Examples.
(1) Given a set Σ of L-sentences, the set Th(Σ) := {σ : Σ ⊢ σ} of theorems of Σ is an L-theory. If we need to indicate the dependence on L we write Th_L(Σ) for Th(Σ). For Σ = ∅ we also refer to Th_L(Σ) as predicate logic in L. We say that Σ axiomatizes an L-theory T (or is an axiomatization of T) if T = Th(Σ).
(2) Given an L-structure A, the set Th(A) := {σ : A ⊨ σ} is also an L-theory, called the theory of A. Note that the theory of A is automatically complete.
(3) Given any class K of L-structures, the set Th(K) := {σ : A ⊨ σ for all A ∈ K} is an L-theory, called the theory of K. For example, for L = L_Ri and K the class of finite fields, Th(K) is the set of L-sentences that are true in all finite fields.
The decision problem for a given L-theory T is to find an algorithm to decide for any L-sentence σ whether or not σ belongs to T. Since we have not (yet) defined the concept of algorithm, this is just an informal description at this stage. One of our goals in this section is to define a formal counterpart, called "decidability." In the next section we show that the L(N)-theory Th(N) is undecidable; by the Church–Turing Thesis, this means that the decision problem for Th(N) has no solution. (This result is a version of Church's Theorem, and is closely related to the Incompleteness Theorem.)

Unless we specify otherwise, we assume in the rest of this chapter that the language L is finite. This is done for simplicity, and at the end of the remaining sections we indicate how to avoid this assumption.

We shall number the terms and formulas of L in such a way that various statements about these formulas and about formal proofs in this language can be translated "effectively" into equivalent statements about natural numbers expressible by sentences in L(N).

Definition. Recall that v_0, v_1, v_2, ... are our variables. We assign to each symbol s ∈ L ∪ {logical symbols} ∪ {v_0, v_1, v_2, ...} a symbol number SN(s) ∈ N as follows: SN(v_i) := 2i, and to each remaining symbol (in the finite set L ∪ {logical symbols}) we assign an odd natural number as symbol number, subject to the condition that different symbols have different symbol numbers.

The Gödel number ⌜t⌝ of an L-term t is defined recursively:

⌜t⌝ = ⟨SN(v_i)⟩ if t = v_i,
⌜t⌝ = ⟨SN(F), ⌜t_1⌝, ..., ⌜t_n⌝⟩ if t = F t_1 ... t_n.

The Gödel number ⌜ϕ⌝ of an L-formula ϕ is given recursively by

⌜ϕ⌝ = ⟨SN(⊤)⟩ if ϕ = ⊤,
⌜ϕ⌝ = ⟨SN(⊥)⟩ if ϕ = ⊥,
⌜ϕ⌝ = ⟨SN(=), ⌜t_1⌝, ⌜t_2⌝⟩ if ϕ = (t_1 = t_2),
⌜ϕ⌝ = ⟨SN(R), ⌜t_1⌝, ..., ⌜t_m⌝⟩ if ϕ = R t_1 ... t_m,
⌜ϕ⌝ = ⟨SN(¬), ⌜ψ⌝⟩ if ϕ = ¬ψ,
⌜ϕ⌝ = ⟨SN(∨), ⌜ϕ_1⌝, ⌜ϕ_2⌝⟩ if ϕ = ϕ_1 ∨ ϕ_2,
⌜ϕ⌝ = ⟨SN(∧), ⌜ϕ_1⌝, ⌜ϕ_2⌝⟩ if ϕ = ϕ_1 ∧ ϕ_2,
⌜ϕ⌝ = ⟨SN(∃), ⌜x⌝, ⌜ψ⌝⟩ if ϕ = ∃x ψ,
⌜ϕ⌝ = ⟨SN(∀), ⌜x⌝, ⌜ψ⌝⟩ if ϕ = ∀x ψ.

Lemma 5.5.1. The following subsets of N are computable:
(1) Vble := {⌜x⌝ : x is a variable}
(2) Term := {⌜t⌝ : t is an L-term}
(3) AFor := {⌜ϕ⌝ : ϕ is an atomic L-formula}
(4) For := {⌜ϕ⌝ : ϕ is an L-formula}
Proof. (1) a ∈ Vble iff a = ⟨2b⟩ for some b ≤ a. (2) a ∈ Term iff a ∈ Vble or a = ⟨SN(F), ⌜t_1⌝, ..., ⌜t_n⌝⟩ for some function symbol F of L of arity n and L-terms t_1, ..., t_n with Gödel numbers < a. We leave (3) to the reader. (4) We have For(a) iff: AFor(a); or For((a)_1) if a = ⟨SN(¬), (a)_1⟩; or For((a)_1) and For((a)_2) if a = ⟨SN(∨), (a)_1, (a)_2⟩ or a = ⟨SN(∧), (a)_1, (a)_2⟩; or Vble((a)_1) and For((a)_2) if a = ⟨SN(∃), (a)_1, (a)_2⟩ or a = ⟨SN(∀), (a)_1, (a)_2⟩. So For is computable.

In the next two lemmas, x ranges over variables, ϕ and ψ over L-formulas, and t and τ over L-terms.

Lemma 5.5.2. The following relations on N are computable:
(1) Fr := {(⌜ϕ⌝, ⌜x⌝) : x occurs free in ϕ} ⊆ N²
(2) FrSub := {(⌜ϕ⌝, ⌜x⌝, ⌜τ⌝) : τ is free for x in ϕ} ⊆ N³
(3) PrAx := {⌜ϕ⌝ : ϕ is a propositional axiom} ⊆ N
(4) Eq := {⌜ϕ⌝ : ϕ is an equality axiom} ⊆ N
(5) Quant := {⌜ψ⌝ : ψ is a quantifier axiom} ⊆ N
(6) MP := {(⌜ϕ_1⌝, ⌜ϕ_1 → ϕ_2⌝, ⌜ϕ_2⌝) : ϕ_1, ϕ_2 are L-formulas} ⊆ N³
(7) Gen := {(⌜ϕ⌝, ⌜ψ⌝) : ψ follows from ϕ by the generalization rule} ⊆ N²
(8) Sent := {⌜ϕ⌝ : ϕ is a sentence} ⊆ N

Proof. The usual inductive or explicit description of each of these notions translates easily into a description of its "Gödel image" that establishes computability of this image. As an example, note that Sent(a) ⟺ For(a) and ∀i<a ¬Fr(a, i), so (8) follows from (1) and earlier results.

Lemma 5.5.3. The function Sub : N³ → N defined by

Sub(a, b, c) =
  c, if Vble(a) and a = b;
  ⟨(a)_0, Sub((a)_1, b, c), ..., Sub((a)_n, b, c)⟩, if a = ⟨(a)_0, ..., (a)_n⟩ with n > 0, (a)_0 ≠ SN(∃) and (a)_0 ≠ SN(∀);
  ⟨SN(∃), (a)_1, Sub((a)_2, b, c)⟩, if a = ⟨SN(∃), (a)_1, (a)_2⟩ and (a)_1 ≠ b;
  ⟨SN(∀), (a)_1, Sub((a)_2, b, c)⟩, if a = ⟨SN(∀), (a)_1, (a)_2⟩ and (a)_1 ≠ b;
  a, otherwise,

is computable, and satisfies Sub(⌜t⌝, ⌜x⌝, ⌜τ⌝) = ⌜t(τ/x)⌝ and Sub(⌜ϕ⌝, ⌜x⌝, ⌜τ⌝) = ⌜ϕ(τ/x)⌝.
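The lemmas above take the sequence coding ⟨a_0, ..., a_{n−1}⟩ with its computable Seq, lh, (a)_i and ∗ as given. The notes build such a coding from Gödel's β function; the Python sketch below uses a different (prime-power) coding just to make decode-after-encode concrete, and its helper names are illustrative rather than the notes' notation:

```python
def primes():
    """Generate 2, 3, 5, ... by trial division."""
    found, k = [], 2
    while True:
        if all(k % p != 0 for p in found):
            found.append(k)
            yield k
        k += 1

def encode(seq):
    """<a0, ..., a_{n-1}> as 2^(a0+1) * 3^(a1+1) * ...; the +1 keeps zero entries visible."""
    code, gen = 1, primes()
    for a in seq:
        code *= next(gen) ** (a + 1)
    return code

def at(code, i):
    """The decoding function (a)_i: exponent of the i-th prime in code, minus 1."""
    gen = primes()
    for _ in range(i + 1):
        p = next(gen)
    e = 0
    while code % p == 0:
        code //= p
        e += 1
    return e - 1

c = encode([4, 0, 7])
assert [at(c, i) for i in range(3)] == [4, 0, 7]
```

Any coding with these computability properties works equally well for the Gödel numbering of terms and formulas.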
In the rest of this section Σ is a set of L-sentences. Put ⌜Σ⌝ := {⌜σ⌝ : σ ∈ Σ}, and call Σ computable if ⌜Σ⌝ is computable.

Definition. We define Prf_Σ to be the "set of all Gödel numbers of proofs from Σ," i.e., Prf_Σ := {⟨⌜ϕ_1⌝, ..., ⌜ϕ_n⌝⟩ : ϕ_1, ..., ϕ_n is a proof from Σ}. So every element of Prf_Σ codes a sequence ϕ_1, ..., ϕ_n where n ≥ 1 and every ϕ_k is either in Σ, or a logical axiom, or obtained from some ϕ_i, ϕ_j with 1 ≤ i, j < k by Modus Ponens, or obtained from some ϕ_i with 1 ≤ i < k by Generalization.

Lemma 5.5.4. If Σ is computable, then Prf_Σ is computable.

Proof. This is because a is in Prf_Σ iff Seq(a) and lh(a) ≠ 0 and for every k < lh(a), either (a)_k ∈ ⌜Σ⌝ ∪ PrAx ∪ Eq ∪ Quant, or ∃i, j < k : MP((a)_i, (a)_j, (a)_k), or ∃i < k : Gen((a)_i, (a)_k).

Definition. A relation R ⊆ N^n is said to be computably generated if there is a computable relation Q ⊆ N^{n+1} such that for all a ∈ N^n we have R(a) ⇔ ∃x Q(a, x). ("Recursively enumerable" is also used for "computably generated.")

Remark. Every computable relation is obviously computably generated. We leave it as an exercise to check that the union and intersection of two computably generated n-ary relations on N are computably generated. The complement of a computably generated subset of N is not always computably generated, as we shall see later.

Proposition 5.5.6 (Negation Theorem). Let A ⊆ N^n and suppose A and ¬A are computably generated. Then A is computable.

Definition. An L-theory T is said to be computably axiomatizable if T has a computable axiomatization.² We say that T is decidable if ⌜T⌝ is computable, and undecidable otherwise. (Thus "T is decidable" means the same thing as "⌜T⌝ is computable," but for L-theories "decidable" is more widely used than "computable.")

Lemma 5.5.5. If Σ is computable, then Th(Σ) is computably generated.

² Instead of "computably axiomatizable," also "recursively axiomatizable" and "effectively axiomatizable" are used.
Proof. Apply Lemma 5.5.4 and the fact that for all a ∈ N: a ∈ ⌜Th(Σ)⌝ ⟺ ∃b (Prf_Σ(b) and a = (b)_{lh(b)−1} and Sent(a)).
5. ⊥. Proof. ¬. µx(P ∨ Q)(a. v1 . Proposition 5.5. a . as a diligent reader can easily verify. 11. 5. Much of the above does not really need the assumption that L is ﬁnite.1.) Given a numbering of L we use it to assign to each Lterm t and each Lformula ϕ its G¨del number t and ϕ . respectively. Then T = Th(Σ) is computably generated.3 go through. 99 Then there is for each a ∈ Nn an x ∈ N such that (P ∨ Q)(a. In the discussion below we only assume that L is countable. Let T be a complete Ltheory with computable axiomatization Σ. . with the corresponding G¨del o numbering of Lterms and Lformulas. v2 . ∃. =. / ˙ Hence the complement of T is computably generated. ∀. x). 13.7. (So if L is ﬁnite. and for s = . ∨. First. Then Lemmas 5. By a numbering of L we mean an injective function L → N that assigns to each s ∈ L an odd natural number SN(s) > 15 such that if s is a relation symbol. put SN(s) = 1. . Relaxing the assumption of a ﬁnite language. } {logical symbols} its symbol number SN(s) ∈ N as follows: SN(vi ) := 2i. x). and 5. {(SN(s). arity(s)) : s ∈ L} ⊆ N2 are computable. Q ⊆ Nn+1 be computable such that for all a ∈ Nn we have A(a) ⇐⇒ ∃ xP (a. x). This part of our numbering of symbols is independent of L. Every complete and computably axiomatizable Ltheory is decidable. ¬A(a) ⇐⇒ ∃ xQ(a. just as we did earlier in the section. DECIDABILITY AND GODEL NUMBERING Proof. o Suppose a computable numbering of L is given. Let P. a ∈ T / / ⇐⇒ a ∈ Sent or ∃ b Prf Σ (b) and (b)lh(b)−1 = SN(¬). 7. Such a numbering of L is said to be computable if the sets SN(L) = {SN(s) : s ∈ L}) ⊆ N. so the case of ﬁnite L is included. respectively. x)). 15.5. . we assign to each symbol s ∈ {v0 .5.2. The computability of A follows by noting that for all a ∈ Nn we have A(a) ⇐⇒ P (a. and if s is a function symbol. then every numbering of L is computable.5. then SN(s) ≡ 1 mod 4. 9. Now observe: a ∈ T ⇐⇒ a ∈ Sent or SN(¬). . Thus T is decidable by the Negation Theorem. 3.5. ∧.¨ 5. 
then SN(s) ≡ 3 mod 4.
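The proof of the Negation Theorem above amounts to a dovetailed search: given witness relations for membership and for non-membership, search x upward until one side fires. A Python sketch (P and Q are illustrative placeholder predicates, not part of the notes):

```python
def decide(a, P, Q):
    """Decide a ∈ A when A(a) iff ∃x P(a,x) and ¬A(a) iff ∃x Q(a,x)."""
    x = 0
    while True:                 # terminates: some x satisfies P or Q
        if P(a, x):
            return True         # witness that a is in A
        if Q(a, x):
            return False        # witness that a is not in A
        x += 1

# Toy instance: A = even numbers, witnessed by exhibiting a/2 or (a-1)/2.
P = lambda a, x: 2 * x == a
Q = lambda a, x: 2 * x + 1 == a
assert decide(10, P, Q) is True
assert decide(7, P, Q) is False
```

The hypothesis that A and its complement are both computably generated is exactly what guarantees the loop halts on every input.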
Let also a set Σ of L-sentences be given. Define ⌜Σ⌝ := {⌜σ⌝ : σ ∈ Σ}, and call Σ computable if ⌜Σ⌝ is a computable subset of N. We define Prf_Σ to be the set of all Gödel numbers of proofs from Σ, and for an L-theory T we define the notions of T being computably axiomatizable, T being decidable, and T being undecidable, all just as we did earlier in this section. It is now routine to check that Lemmas 5.5.4 and 5.5.5, the Negation Theorem 5.5.6, and Proposition 5.5.7 go through. (Note, however, that the definitions of these notions are all relative to our given computable numbering of L. For finite L, different choices of numbering of L yield equivalent notions of Σ being computable, T being computably axiomatizable, and T being decidable.)

Exercises.
(1) Let a and b denote positive real numbers. Call a computable if there are computable functions f, g : N → N such that for all n > 0, g(n) ≠ 0 and |a − f(n)/g(n)| < 1/n. Then:
(i) every positive rational number is computable;
(ii) if a and b are computable, so are a + b, ab, and 1/a, and if in addition a > b, then a − b is also computable;
(iii) a is computable if and only if the binary relation R_a on N defined by R_a(m, n) ⟺ (n > 0 and m/n < a) is computable.
(2) A nonempty S ⊆ N is computably generated iff there is a computable function f : N → N such that S = f(N). Moreover, if S is infinite and computably generated, then f can be chosen injective.
(3) If f : N → N is computable and f(x) > x for all x ∈ N, then f(N) is computable.
(4) Every infinite computably generated subset of N has an infinite computable subset.
(5) A function F : N^n → N is computable iff its graph is computably generated. (Hint: use the Negation Theorem.)
(6) Suppose Σ is a computable and consistent set of sentences in the numerical language L. Then every Σ-representable relation R ⊆ N^n is computable.
(7) A function F : N^m → N is computable if and only if it is Σ-representable for some finite, consistent set Σ ⊇ N of sentences in some numerical language L ⊇ L(N). The last exercise gives an alternative characterization of "computable function."

5.6 Theorems of Gödel and Church

In this section we assume that the finite language L extends L(N).

Theorem 5.6.1 (Church). No consistent L-theory extending N is decidable.
Before giving the proof we record the following consequence:

Corollary 5.6.2 (Weak form of Gödel's Incompleteness Theorem). Each computably axiomatizable L-theory extending N is incomplete.

Proof. Immediate from Proposition 5.5.7 and Church's Theorem. We will indicate in the next section how to construct, for any consistent computable set of L-sentences Σ ⊇ N, an L-sentence σ such that Σ ⊬ σ and Σ ⊬ ¬σ. (The corollary above only says that such a sentence exists.)

For the proof of Church's Theorem we need a few lemmas.

Lemma 5.6.3. The function Num : N → N defined by Num(a) = ⌜S^a 0⌝ is computable.

Proof. Num(0) = ⌜0⌝ and Num(a + 1) = ⟨SN(S), Num(a)⟩.

Definition. Let P ⊆ A² be a binary relation on a set A. For a ∈ A, we let P(a) ⊆ A be given by the equivalence P(a)(b) ⇔ P(a, b).

Lemma 5.6.4 (Cantor). Given any P ⊆ A², its antidiagonal Q ⊆ A defined by Q(b) ⟺ ¬P(b, b) is not of the form P(a) for any a ∈ A.

Proof. Suppose Q = P(a), where a ∈ A. Then Q(a) iff P(a, a). But by definition, Q(a) iff ¬P(a, a), a contradiction. (This is essentially Cantor's proof that no f : A → P(A) can be surjective: use P(a, b) :⇔ b ∈ f(a).)

Let Σ be a set of L-sentences. We fix a variable x (e.g., x = v_0) and define the binary relation P^Σ ⊆ N² by

P^Σ(a, b) ⟺ Sub(a, ⌜x⌝, Num(b)) ∈ ⌜Th(Σ)⌝.

For an L-formula ϕ(x) and a = ⌜ϕ(x)⌝, we have Sub(⌜ϕ(x)⌝, ⌜x⌝, ⌜S^b 0⌝) = ⌜ϕ(S^b 0)⌝, so P^Σ(a, b) ⟺ Σ ⊢ ϕ(S^b 0).

Lemma 5.6.5. Suppose Σ ⊇ N is consistent. Then each computable set X ⊆ N is of the form P^Σ(a) for some a ∈ N.

Proof. Let X ⊆ N be computable. Then X is Σ-representable by Theorem 5.4.4, say by the formula ϕ(x); that is, X(b) ⇒ Σ ⊢ ϕ(S^b 0), and ¬X(b) ⇒ Σ ⊢ ¬ϕ(S^b 0). So X(b) ⇔ Σ ⊢ ϕ(S^b 0) (using consistency to get "⇐"). Take a = ⌜ϕ(x)⌝; then X(b) iff Σ ⊢ ϕ(S^b 0) iff P^Σ(a, b), that is, X = P^Σ(a).
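The antidiagonal argument of Lemma 5.6.4 can be checked mechanically on finite relations: for any P on a finite set A, the antidiagonal disagrees with row P(a) at position a. A Python sketch over randomly sampled relations:

```python
import random

# For any binary relation P on a finite set A, the antidiagonal
# Q(b) := not P(b, b) differs from every row P(a) = {b : P(a, b)}.
A = range(6)
for _ in range(500):
    P = {(i, j): random.random() < 0.5 for i in A for j in A}
    Q = {b: not P[(b, b)] for b in A}
    for a in A:
        row = {b: P[(a, b)] for b in A}
        assert Q != row          # they disagree at b = a
```

In the proof of Church's Theorem, A is N and the rows P^Σ(a) enumerate all computable sets; the antidiagonal then cannot be computable.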
Proof of Church's Theorem. Let Σ ⊇ N be consistent. We have to show that Th(Σ) is undecidable. Suppose that ⌜Th(Σ)⌝ is computable. Then the antidiagonal Q^Σ ⊆ N of P^Σ is computable: b ∈ Q^Σ ⇔ (b, b) ∉ P^Σ ⇔ Sub(b, ⌜x⌝, Num(b)) ∉ ⌜Th(Σ)⌝. By Lemma 5.6.4, Q^Σ is not among the P^Σ(a), so by Lemma 5.6.5, Q^Σ is not computable, a contradiction. Therefore ⌜Th(Σ)⌝ is not computable, that is, Th(Σ) is undecidable. This concludes the proof.

The undecidability of Th(N) is a special case of Church's Theorem. By Lemma 5.5.5 the subset ⌜Th(N)⌝ of N is computably generated. But this set is not computable:

Corollary 5.6.6. Th(N) and Th(∅) (in the language L(N)) are undecidable.

Proof. Let ∧N be the sentence N1 ∧ ··· ∧ N9. Then, for any L(N)-sentence σ, N ⊢ σ ⟺ ∅ ⊢ ∧N → σ, that is, for all a ∈ N, a ∈ ⌜Th(N)⌝ ⟺ a ∈ Sent and ⟨SN(∨), ⟨SN(¬), ⌜∧N⌝⟩, a⟩ ∈ ⌜Th(∅)⌝. Hence, if Th(∅) were decidable, then Th(N) would be decidable. But Th(N) is undecidable. So Th(∅) is undecidable.

Discussion. To appreciate the significance of this result, one needs a little background knowledge, including some history. Over a century of experience has shown that number theoretic assertions can be expressed by sentences of L(N), admittedly in an often contorted way. That is, we know how to construct for any number theoretic statement a sentence σ of L(N) such that the statement is true if and only if N ⊨ σ. (In most cases we just indicate how to construct such a sentence, since an actual sentence would be too unwieldy without abbreviations.)

We have seen that N is quite weak. A very strong set of axioms in the language L(N) is PA (1st order Peano Arithmetic). Its axioms are those of N together with all induction axioms, that is, all sentences of the form

∀x [ (ϕ(x, 0) ∧ ∀y (ϕ(x, y) → ϕ(x, Sy))) → ∀y ϕ(x, y) ]

where ϕ(x, y) is an L(N)-formula, x = (x_1, ..., x_n), and ∀x stands for ∀x_1 ... ∀x_n. Note that PA is consistent, since it has N = (N; 0, S, +, ·, <) as a model. Also PA is computable (exercise). Thus by the theorems above, Th(PA) is undecidable and incomplete. What is more important, we know from experience that any established fact of classical number theory—including results obtained by sophisticated analytic and algebraic methods—can be proved from PA. Thus before Gödel's Incompleteness Theorem it seemed natural to conjecture that PA is complete. (Apparently it was not widely recognized at the time that completeness of PA would have had
the astonishing consequence that Th(N) is decidable.)

5.7 A more explicit incompleteness theorem

In Section 5.6 we obtained Gödel's Incompleteness Theorem as an immediate corollary of Church's theorem. In this section, we prove the incompleteness theorem in the more explicit form stated in the introduction to this chapter. We shall indicate how to construct, for any computable consistent Σ ⊇ N, a formula ϕ(x) of L(N) with the following properties:
(i) N ⊢ ϕ(S^m 0) for each m;
(ii) Σ ⊬ ∀xϕ(x).
Note that then the sentence ∀xϕ(x) is true in N but not provable from Σ. Of course, the situation cannot be remedied by adding new axioms to PA, at least if we insist that the axioms are true in N and that we have effective means to tell which sentences are axioms. In this sense, the Incompleteness Theorem is pervasive.

Here is a sketch of how to make such a sentence. Assume for simplicity that L = L(N) and N ⊨ Σ. The idea is to construct sentences σ and σ′ such that (1) N ⊨ σ ↔ σ′, and (2) N ⊨ σ′ ⟺ Σ ⊬ σ. From (1) and (2) we get N ⊨ σ ⟺ Σ ⊬ σ. Assuming that N ⊨ ¬σ produces a contradiction. Hence σ is true in N, and thus(!) not provable from Σ.

How to implement this strange idea? To take care of (2), one might guess that σ′ = ∀x¬pr(x, S^{⌜σ⌝} 0), where pr(x, y) is a formula representing in N the binary relation Pr ⊆ N² defined by

Pr(m, n) ⟺ m is the Gödel number of a proof from Σ of a sentence with Gödel number n.

But how do we arrange (1)? Since σ′ := ∀x¬pr(x, S^{⌜σ⌝} 0) depends on σ, the solution is to apply the fixedpoint lemma below to ρ(y) := ∀x¬pr(x, y). This finishes our sketch. What follows is a rigorous implementation.

In this section L ⊇ L(N) is a finite language, and Σ is a set of L-sentences. We also fix two distinct variables x and y.

Lemma 5.7.1 (Fixpoint Lemma). Suppose Σ ⊇ N. Then there is for every L-formula ρ(y) an L-sentence σ such that Σ ⊢ σ ↔ ρ(S^n 0), where n = ⌜σ⌝.

Proof. The function (a, b) → Sub(a, ⌜x⌝, Num(b)) : N² → N is computable by Lemma 5.5.3. Hence by the representability theorem it is N-representable. Let sub(x_1, x_2, y) be an L(N)-formula representing it in N; we can assume that the variable x does not occur in sub(x_1, x_2, y). Then for all a, b in N,

N ⊢ sub(S^a 0, S^b 0, y) ↔ y = S^c 0, where c = Sub(a, ⌜x⌝, Num(b)). (1)
S σ 0). Σ σ ↔ ρ(S n 0).1 (with L = L(N) and Σ = N) provides an L(N)sentence σ such that N σ ↔ ρ(S σ 0). x. σ ↔ ∃y(y = S n 0 ∧ ρ(y)). S n 0) ⇐⇒ ¬PrΣ (m. y) be an L(N)formula representing PrΣ in N. Deﬁne θ(x):= ∃y(sub(x. n) ⇐⇒ m is the G¨del number of a proof from Σ o of an Lsentence with G¨del number n. a contradiction. S m 0. N ϕ(S m 0) for each m. y) ↔ y = S n 0 (2) σ ↔ ρ(S n 0). S n 0) ⇐⇒ PrΣ (m. so by (2) we get Σ Theorem 5. so PrΣ (m.2. that is Σ σ ↔ ∀x¬prΣ (x. x . Lemma 5. S m 0 ) = Sub(m.7. that is. S σ 0) for each m. Because of (3) we also o have Σ ∀x¬prΣ (x. which by (2) yields ¬PrΣ (m. Let σ := θ(S m 0). o Since Σ is computable. N We have sub(S m 0. σ ). Hence PrΣ is representable in N. S σ 0) (3) Claim: Σ σ. Hence ¬PrΣ (m. and hence in Σ. . n = σ = θ(S m 0) = Sub( θ(x) . computable. σ = θ(S m 0) = ∃y(sub(S m 0. n) Σ ¬prΣ (S m 0. n: Σ prΣ (S m 0.7. PrΣ is computable. Suppose Σ is consistent. x . Let prΣ (x. and put n = σ . COMPUTABILITY Now let ρ(y) be an Lformula. So by (1). S m 0. We now show : (i) N ϕ(S m 0) for each m. and proves all axioms of N. y) ∧ ρ(y)) and let m = θ(x) .104 CHAPTER 5. Because Σ is consistent we have for all m. This establishes the claim. Because Σ σ. y). S σ 0). y) ∧ ρ(y)). so Σ ¬prΣ (S m 0. It follows that Σ σ ↔ ρ(S σ 0). but Σ ∀xϕ(x). We claim that Σ Indeed. Assume towards a contradiction that Σ σ. no m is the G¨del number o of a proof of σ from Σ. n) (1) (2) Let ρ(y) be the L(N)formula ∀x¬prΣ (x. let m be the G¨del number of a proof of σ from Σ. Num(m)). Now put ϕ(x) := ¬prΣ (x. σ ) for each m. Consider the relation PrΣ ⊆ N2 deﬁned by PrΣ (m. which by the deﬁning property of prΣ yields N ¬prΣ (S m 0. Then there exists an L(N)formula ϕ(x) such that N ϕ(S m 0) for each m. S σ 0). σ ). Hence. Proof.
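The proof of the Fixpoint Lemma has a well-known programming analogue: a quine-style construction in which the diagonal operation a ↦ Sub(a, Num(a)) becomes "substitute a string's own quotation into itself". The Python sketch below is my own illustration, not part of the notes; the names diag, theta and sigma are mine, and the resulting "sentence" sigma ends up referring to its own code.

```python
import ast, re

def diag(s: str) -> str:
    """Diagonalization: substitute the quotation of s into s itself
    (the analogue of a -> Sub(a, Num(a)) in the lemma's proof)."""
    return s.format(repr(s))

# theta plays the role of the formula with one free slot; here the
# "predicate" is simply the English phrase "is not provable".
theta = "The sentence diag({0}) is not provable."
sigma = diag(theta)

# sigma quotes theta; evaluating diag on that quotation reproduces sigma
# itself, so sigma "speaks about" its own code.
quoted = re.search(r"diag\((.*)\)", sigma).group(1)
assert diag(ast.literal_eval(quoted)) == sigma
```

Printing sigma shows the self-referential sentence in full; the assertion checks the fixpoint property: the expression sigma mentions evaluates back to sigma.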
(ii) Σ ⊬ ∀xϕ(x). This is because of the Claim and Σ ⊢ σ ↔ ∀xϕ(x), which holds by (3).

Corollary 5.7.3. Suppose Σ is computable and true in the L-expansion N* of N. Then there exists an L(N)-formula ϕ(x) such that N ⊨ ϕ(S^n 0) for each n, but Σ ∪ N ⊬ ∀xϕ(x). (Note that then ∀xϕ(x) is true in N* but not provable from Σ.)

To obtain this corollary, apply the theorem above to Σ ∪ N in place of Σ.

Exercises.
(1) Let N* be an L-expansion of N. Then the set of Gödel numbers of L-sentences true in N* is not definable in N*. This result (of Tarski) is known as the undefinability of truth. It strengthens the special case of Church's theorem which says that the set of Gödel numbers of L-sentences true in N* is not computable. Tarski's result follows easily from the fixpoint lemma, as does the more general result in the next exercise.
(2) Suppose Σ ⊇ N is consistent. Then the set ⌜Th(Σ)⌝ of Gödel numbers of sentences provable from Σ is not Σ-representable, and there is no truth definition for Σ. Here a truth definition for Σ is an L-formula true(y) such that for all L-sentences σ, Σ ⊢ σ ↔ true(S^n 0), where n = ⌜σ⌝.

5.8 Undecidable Theories

Church's theorem says that any consistent theory containing a certain basic amount of integer arithmetic is undecidable. How about theories like Th(Fl) (the theory of fields) and Th(Gr) (the theory of groups)? An easy way to prove the undecidability of such theories is due to Tarski: he noticed that if N is definable in some model of a theory T, then T is undecidable. The aim of this section is to establish this result and indicate some applications. In order not to distract from this theme by boring details, we shall occasionally replace a proof by an appeal to the Church–Turing Thesis. (A conscientious reader will replace these appeals by proofs until reaching a level of skill that makes constructing such proofs a predictable routine.)

In this section, L and L′ are finite languages, Σ is a set of L-sentences, and Σ′ is a set of L′-sentences.

Lemma 5.8.1. Let L ⊆ L′ and Σ ⊆ Σ′.
(1) Suppose Σ′ is conservative over Σ. Then:
    Th_L(Σ) is undecidable =⇒ Th_L′(Σ′) is undecidable.
(2) Suppose L = L′ and Σ′ ∖ Σ is finite. Then:
    Th_L(Σ′) is undecidable =⇒ Th_L(Σ) is undecidable.
(3) Suppose all symbols of L′ ∖ L are constant symbols. Then:
    Th_L(Σ) is undecidable ⇐⇒ Th_L′(Σ) is undecidable.
(4) Suppose Σ′ extends Σ by a definition. Then:
    Th_L(Σ) is undecidable ⇐⇒ Th_L′(Σ′) is undecidable.
Proof. (1) In this case we have for all a ∈ N:

    a ∈ Th_L(Σ) ⇐⇒ a ∈ Sent_L and a ∈ Th_L′(Σ′).

It follows that if Th_L′(Σ′) is decidable, then so is Th_L(Σ).

(2) Write Σ′ = {σ1, ..., σN} ∪ Σ, and put σ̃ := σ1 ∧ · · · ∧ σN. Then for each L-sentence σ we have Σ′ ⊢ σ ⇐⇒ Σ ⊢ σ̃ → σ. Hence, for all a ∈ N:

    a ∈ Th_L(Σ′) ⇐⇒ a ∈ Sent_L and SN(∨)(SN(¬)(⌜σ̃⌝), a) ∈ Th_L(Σ).

It follows that if Th_L(Σ) is decidable, then so is Th_L(Σ′).

(3) Let c0, ..., cn be the distinct constant symbols of L′ ∖ L. Given any L′-sentence σ we define the L-sentence σ̃ as follows: take k ∈ N minimal such that σ contains no variable vm with m ≥ k, replace each occurrence of ci in σ by vk+i for i = 0, ..., n, and let ϕ(vk, ..., vk+n) be the resulting L-formula (so σ = ϕ(c0, ..., cn)); then σ̃ := ∀vk · · · ∀vk+n ϕ(vk, ..., vk+n). An easy argument using the completeness theorem shows that Σ ⊢L′ σ ⇐⇒ Σ ⊢L σ̃. By the Church–Turing Thesis there is a computable function a ↦ ã : N → N such that (⌜σ⌝)˜ = ⌜σ̃⌝ for all L′-sentences σ; we leave it to the reader to replace this appeal to the Church–Turing Thesis by a proof. Hence, for all a ∈ N:

    a ∈ Th_L′(Σ) ⇐⇒ a ∈ Sent_L′ and ã ∈ Th_L(Σ).

This yields the ⇐ direction of (3); the ⇒ direction holds by (1).

(4) The ⇒ direction holds by (1). For the ⇐ we use an algorithm (see Section 4.5) that computes for each L′-sentence σ an L-sentence σ* such that Σ′ ⊢ σ ↔ σ*. By the Church–Turing Thesis there is a computable function a ↦ a* : N → N such that (⌜σ⌝)* = ⌜σ*⌝ for all L′-sentences σ. Hence, for all a ∈ N:

    a ∈ Th_L′(Σ′) ⇐⇒ a ∈ Sent_L′ and a* ∈ Th_L(Σ).

This yields the ⇐ direction of (4).

Remark. We cannot drop the assumption L = L′ in (2): take L = ∅, Σ = ∅, L′ = L(N), and Σ′ = ∅. Then Th_L′(Σ′) is undecidable by the corollary to Church's theorem in Section 5.6, but Th_L(Σ) is decidable (exercise).

Definition. An L-structure A is said to be strongly undecidable if for every set Σ of L-sentences such that A ⊨ Σ, Th(Σ) is undecidable.
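The decidability notions above all come down to the existence of algorithms, and one direction is always easy: verifying that a given finite object is a correct proof is mechanical, which is why relations like Pr_Σ in the previous section are computable when Σ is. The toy checker below is my own illustration, not from the notes: formulas are strings or ('->', p, q) tuples, and modus ponens is the only rule.

```python
def is_proof(proof, axioms, conclusion) -> bool:
    """Decide whether `proof` (a finite list of formulas) derives
    `conclusion` from `axioms`, each step being an axiom or following
    from two earlier steps by modus ponens.  This always terminates:
    'is a correct proof' is decidable even when 'is provable' is not."""
    derived = []
    for f in proof:
        by_mp = any(('->', p, f) in derived for p in derived)
        if f in axioms or by_mp:
            derived.append(f)
        else:
            return False
    return bool(derived) and derived[-1] == conclusion

axioms = {'A', ('->', 'A', 'B'), ('->', 'B', 'C')}
assert is_proof(['A', ('->', 'A', 'B'), 'B', ('->', 'B', 'C'), 'C'], axioms, 'C')
assert not is_proof(['B', 'C'], axioms, 'C')
```

Enumerating all candidate proofs and running such a check makes the set of provable sentences computably enumerable; deciding membership outright is exactly what fails in the undecidability results of this section.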
So A is strongly undecidable iff every L-theory of which A is a model is undecidable.

Example. N = (N; <, 0, S, +, ·) is strongly undecidable. To see this, let Σ be a set of L(N)-sentences such that N ⊨ Σ. Then N ⊨ Σ ∪ N. By Church's Theorem Th(Σ ∪ N) is undecidable, hence Th(Σ) is undecidable by part (2) of Lemma 5.8.1.

The following result is an easy application of part (3) of the previous lemma.

Lemma 5.8.2. Let c0, ..., cn be distinct constant symbols not in L, and let (A, a0, ..., an) be an L(c0, ..., cn)-expansion of the L-structure A. Then:

    (A, a0, ..., an) is strongly undecidable =⇒ A is strongly undecidable.

Theorem 5.8.3 (Tarski). Suppose the L-structure A is definable in the L′-structure B, and A is strongly undecidable. Then B is strongly undecidable.

Proof. (Sketch) By the previous lemma (with L′ and B instead of L and A), we can reduce to the case that we have a 0-definition δ of A in B. One can show that in this case we have
(1) an algorithm that computes for any L-sentence σ an L′-sentence δσ, and
(2) a finite set ∆ of L′-sentences true in B,
with the following properties:
(i) A ⊨ σ ⇐⇒ B ⊨ δσ;
(ii) for each set Σ′ of L′-sentences with B ⊨ Σ′ and each L-sentence σ,
    Σ ⊢ σ ⇐⇒ Σ′ ∪ ∆ ⊢ δσ, where Σ := {s : s is an L-sentence and Σ′ ∪ ∆ ⊨ δs}.

Let Σ′ be a set of L′-sentences such that B ⊨ Σ′. We have to show that Th_L′(Σ′) is undecidable. Suppose towards a contradiction that Th_L′(Σ′) is decidable. Then Th_L′(Σ′ ∪ ∆) is decidable by part (2) of Lemma 5.8.1, so we have an algorithm for deciding whether any given L′-sentence is provable from Σ′ ∪ ∆, and by (ii) we obtain an algorithm for deciding whether any given L-sentence is provable from Σ. Take Σ as in (ii) above. Then A ⊨ Σ by (i). But Th_L(Σ) is undecidable by assumption, and we have a contradiction.

Corollary 5.8.4. Th(Ri) is undecidable; in other words, the theory of rings is undecidable. For the same reason the theory of commutative rings, the theory of integral domains, and more generally the theory of any class of rings that has the ring of integers among its members, is undecidable.

Proof. It suffices to show that the ring (Z; 0, 1, +, −, ·) of integers is strongly undecidable. Using Lagrange's theorem that N = {a² + b² + c² + d² : a, b, c, d ∈ Z}, we see that the inclusion map N → Z defines N in the ring of integers, so by Tarski's Theorem the ring of integers is strongly undecidable.
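Lagrange's four-square theorem is what makes N definable inside the ring of integers; small instances are easy to confirm mechanically. The brute-force check below is an illustration I added (the function name is mine, not the notes').

```python
from itertools import product

def is_sum_of_four_squares(n: int) -> bool:
    """Does n = a^2 + b^2 + c^2 + d^2 for some integers a, b, c, d?"""
    r = int(n ** 0.5) + 1  # |a|, ..., |d| <= sqrt(n); signs do not matter
    return any(a*a + b*b + c*c + d*d == n
               for a, b, c, d in product(range(r), repeat=4))

# In the ring of integers, the formula ∃a∃b∃c∃d (y = a·a + b·b + c·c + d·d)
# therefore defines the subset N of Z.
assert all(is_sum_of_four_squares(n) for n in range(200))
```

Lagrange's theorem guarantees the check succeeds for every n; the code only confirms an initial segment, of course.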
Fact. The set Z ⊆ Q is 0-definable in the field (Q; 0, 1, +, −, ·) of rational numbers.

This fact is due to Julia Robinson. The only known proof uses nontrivial results about quadratic forms; we shall take this here on faith. Thus the ring of integers is definable in the field of rational numbers, and so, by Tarski's theorem, the field of rationals is strongly undecidable.

Corollary 5.8.5. The theory Th(Fl) of fields is undecidable. The theory of any class of fields that has the field of rationals among its members is undecidable.

Exercises.
(1) Argue informally, using the Church–Turing Thesis, that Th(ACF) is decidable. You can use the fact that ACF has QE.
(2) The structure (Z; 0, 1, +, |) is strongly undecidable; here | is the binary relation of divisibility on Z. Below we let a, b, c denote integers. We say that a divides b (notation: a | b) if ax = b for some integer x, and we say that c is a least common multiple of a and b if a | c, b | c, and c | x for every integer x such that a | x and b | x. Recall that if a and b are not both zero, then they have a unique positive least common multiple, and that if a and b are coprime (that is, there is no integer x > 1 with x | a and x | b), then they have ab as a least common multiple. Hint: Show that if b + a is a least common multiple of a and a + 1, and b − a is a least common multiple of a and a − 1, then b = a². Use this to define the squaring function in (Z; 0, 1, +, |), and then show that multiplication is 0-definable in (Z; 0, 1, +, |).
(3) Consider the group G of bijective maps Z → Z, with composition as the group multiplication. Then G (as a model of Gr) is strongly undecidable. Hint: let s be the element of G given by s(x) = x + 1. Check that if g ∈ G commutes with s, then g = s^a for some a. Next show that a | b ⇐⇒ s^b commutes with each g ∈ G that commutes with s^a. Use these facts to specify a definition of the structure (Z; 0, 1, +, |) in the group G.

Thus by Tarski's theorem the theory of groups is undecidable; in fact, the theory of any class of groups that has the group G of the exercise above among its members is undecidable. On the other hand, Th(Ab), the theory of abelian groups, is known to be decidable (Szmielew).

(4) Let L = {F} have just a binary function symbol. Then predicate logic in L (that is, Th_L(∅)) is undecidable. It can also be shown that predicate logic in the language whose only symbol is a binary relation symbol is undecidable. On the other hand, predicate logic in the language whose only symbol is a unary function symbol is decidable.
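The arithmetic identity behind the hint in exercise (2) can be confirmed by brute force over a finite range. The sketch below is my own illustration, not part of the notes; it uses the observation that, for a and b not both zero, c is a least common multiple in the exercise's sense exactly when |c| = lcm(a, b).

```python
from math import lcm

def is_lcm_Z(c: int, a: int, b: int) -> bool:
    """c is a least common multiple of a and b over Z: a | c, b | c,
    and c | x for every common multiple x of a and b.  For a, b not
    both zero this holds exactly when |c| equals lcm(a, b)."""
    return abs(c) == lcm(a, b) if (a, b) != (0, 0) else c == 0

# Hint of exercise (2): if b+a is a least common multiple of a and a+1,
# and b-a is a least common multiple of a and a-1, then b = a^2.
for a in range(-30, 31):
    for b in range(-1000, 1001):
        if is_lcm_Z(b + a, a, a + 1) and is_lcm_Z(b - a, a, a - 1):
            assert b == a * a
```

Once squaring is definable, one standard route to multiplication is the identity 2xy = (x + y)² − x² − y², which is how the exercise's last step is usually completed.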
To do?

• improve titlepage
• improve or delete index
• more exercises (from homework, exams)
• footnotes pointing to alternative terminology, etc.
• brief discussion of classes at end of "Sets and Maps"?
• at the end, a list of results without proof?
• brief discussion on P=NP in connection with propositional logic
• section(s) on boolean algebras, including Stone representation, Lindenbaum–Tarski algebras, etc.
• section on equational logic? (boolean algebras, groups, etc., as examples)
• basic framework for many-sorted logic
• include "equality theorem"
• translation of one language in another (needed in connection with Tarski theorem in last section)
• solution to a problem by Erdős via compactness theorem, and other simple applications of compactness
• more details on back-and-forth in connection with unnested formulas
• extra elementary model theory (universal classes, model-theoretic criteria for qe, application to ACF, maybe extra section on RCF, Ax's theorem, etc.)
• On computability: a few extra results on c.e. sets, and the exponential diophantine result