Mathematical Logic

(Math 570)
Lecture Notes
Lou van den Dries
Fall Semester 2007
Contents
1 Preliminaries
1.1 Mathematical Logic: a brief overview
1.2 Sets and Maps
2 Basic Concepts of Logic
2.1 Propositional Logic
2.2 Completeness for Propositional Logic
2.3 Languages and Structures
2.4 Variables and Terms
2.5 Formulas and Sentences
2.6 Models
2.7 Logical Axioms and Rules; Formal Proofs
3 The Completeness Theorem
3.1 Another Form of Completeness
3.2 Proof of the Completeness Theorem
3.3 Some Elementary Results of Predicate Logic
3.4 Equational Classes and Universal Algebra
4 Some Model Theory
4.1 Löwenheim-Skolem; Vaught's Test
4.2 Elementary Equivalence and Back-and-Forth
4.3 Quantifier Elimination
4.4 Presburger Arithmetic
4.5 Skolemization and Extension by Definition
5 Computability, Decidability, and Incompleteness
5.1 Computable Functions
5.2 The Church-Turing Thesis
5.3 Primitive Recursive Functions
5.4 Representability
5.5 Decidability and Gödel Numbering
5.6 Theorems of Gödel and Church
5.7 A more explicit incompleteness theorem
5.8 Undecidable Theories
Chapter 1
Preliminaries
We start with a brief overview of mathematical logic as covered in this course.
Next we review some basic notions from elementary set theory, which provides
a medium for communicating mathematics in a precise and clear way. In this
course we develop mathematical logic using elementary set theory as given,
just as one would do with other branches of mathematics, like group theory or
probability theory.
For more on the course material, see
Shoenfield, J. R., Mathematical Logic, Reading, Addison-Wesley, 1967.
For additional material in Model Theory we refer the reader to
Chang, C. C. and Keisler, H. J., Model Theory, New York, North-Holland, 1990,
Poizat, B., A Course in Model Theory, Springer, 2000,
and for additional material on Computability, to
Rogers, H., Theory of Recursive Functions and Effective Computability, McGraw-Hill, 1967.
1.1 Mathematical Logic: a brief overview
Aristotle identified some simple patterns in human reasoning, and Leibniz dreamt
of reducing reasoning to calculation. As a mathematical subject, however, logic
is relatively recent: the 19th century pioneers were Bolzano, Boole, Cantor,
Dedekind, Frege, Peano, C.S. Peirce, and E. Schröder. From our perspective
we see their work as leading to boolean algebra, set theory, propositional logic,
predicate logic, as clarifying the foundations of the natural and real number
systems, and as introducing suggestive symbolic notation for logical operations.
Also, their activity led to the view that logic + set theory can serve as a basis for
all of mathematics. This era did not produce theorems in mathematical logic
of any real depth,[1] but it did bring crucial progress of a conceptual nature,
and the recognition that logic as used in mathematics obeys mathematical rules
that can be made fully explicit.
In the period 1900-1950 important new ideas came from Russell, Zermelo,
Hausdorff, Hilbert, Löwenheim, Ramsey, Skolem, Lusin, Post, Herbrand, Gödel,
Tarski, Church, Kleene, Turing, and Gentzen. They discovered the first real
theorems in mathematical logic, with those of Gödel having a dramatic impact.
Hilbert (in Göttingen), Lusin (in Moscow), Tarski (in Warsaw and Berkeley),
and Church (in Princeton) had many students and collaborators, who made up
a large part of that generation and the next in mathematical logic. Most of
these names will be encountered again during the course.
The early part of the 20th century was also marked by the so-called
foundational crisis in mathematics.
A strong impulse for developing mathematical logic came from the attempts
during these times to provide solid foundations for mathematics. Mathematical
logic has now taken on a life of its own, and also thrives on many interactions
with other areas of mathematics and computer science.
In the second half of the last century, logic as pursued by mathematicians
gradually branched into four main areas: model theory, computability theory (or
recursion theory), set theory, and proof theory. The topics in this course are
part of the common background of mathematicians active in any of these areas.
What distinguishes mathematical logic within mathematics is that
statements about mathematical objects
are taken seriously as mathematical objects in their own right. More generally,
in mathematical logic we formalize (formulate in a precise mathematical way)
notions used informally by mathematicians such as:
• property
• statement (in a given language)
• structure
• truth (what it means for a given statement to be true in a given structure)
• proof (from a given set of axioms)
• algorithm
[1] In the case of set theory one could dispute this. Even so, the main influence of set theory
on the rest of mathematics was to enable simple constructions of great generality, like cartesian
products, quotient sets and power sets, and this involves only very elementary set theory.
Once we have mathematical definitions of these notions, we can try to prove
theorems about these formalized notions. If done with imagination, this process
can lead to unexpected rewards. Of course, formalization tends to caricature
the informal concepts it aims to capture, but no harm is done if this is kept
firmly in mind.
Example. The notorious Goldbach Conjecture asserts that every even integer
greater than 2 is a sum of two prime numbers. With the understanding that
the variables range over N = {0, 1, 2, . . .}, and that 0, 1, +, ·, < denote the
usual arithmetic operations and relations on N, this assertion can be expressed
formally as
(GC)   ∀x[(1 + 1 < x ∧ even(x)) → ∃p∃q(prime(p) ∧ prime(q) ∧ x = p + q)]
where even(x) abbreviates ∃y(x = y + y) and prime(p) abbreviates
1 < p ∧ ∀r∀s(p = r · s → (r = 1 ∨ s = 1)).
The expression GC is an example of a formal statement (also called a sentence)
in the language of arithmetic, which has symbols 0, 1, +, ·, < to denote arithmetic
operations and relations, in addition to logical symbols like =, ∧, ∨, ¬, →, ∀, ∃,
and variables x, y, z, p, q, r, s.
The Goldbach Conjecture asserts that this particular sentence GC is true in
the structure (N; 0, 1, +, ·, <). (No proof of the Goldbach Conjecture is known.)
It also makes sense to ask whether the sentence GC is true in the structure
(R; 0, 1, +, ·, <).
(It's not, as is easily verified. That the question makes sense does not mean
that it is of any interest.)
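For readers who want to experiment, the predicates even(x) and prime(p) translate directly into a program; the small Python sketch below (with ad hoc helper names) checks the Goldbach Conjecture for small even numbers.

    def is_prime(p):
        # prime(p): 1 < p and p has no factorization p = r*s with r, s > 1
        if p < 2:
            return False
        return all(p % d != 0 for d in range(2, int(p**0.5) + 1))

    def goldbach_witness(x):
        # Search for primes p, q with x = p + q, as in the sentence GC.
        for p in range(2, x - 1):
            if is_prime(p) and is_prime(x - p):
                return p, x - p
        return None

    # Check GC for the even numbers 4, 6, ..., 100.
    assert all(goldbach_witness(x) is not None for x in range(4, 101, 2))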
A century of experience gives us confidence that all classical number-theoretic
results—old or new, proved by elementary methods or by sophisticated algebra
and analysis—can be proved from the Peano axioms for arithmetic.[2] However,
in our present state of knowledge, GC might be true in (N; 0, 1, +, ·, <), but not
provable from those axioms. (On the other hand, once you know what exactly
we mean by
provable from the Peano axioms,
you will see that if GC is provable from those axioms, then GC is true in
(N; 0, 1, +, ·, <), and that if GC is false in (N; 0, 1, +, ·, <), then its negation
¬GC is provable from those axioms.)
The point of this example is simply to make the reader aware of the notions
“true in a given structure” and “provable from a given set of axioms,” and their
difference. One objective of this course is to figure out the connections (and
disconnections) between these notions.
[2] Here we do not count as part of classical number theory some results like Ramsey's
Theorem that can be stated in the language of arithmetic, but are arguably more in the spirit
of logic and combinatorics.
Some highlights (1900–1950)
The results below are among the most frequently used facts of mathematical
logic. The terminology used in stating these results might be unfamiliar, but
that should change during the course. What matters is to get some preliminary
idea of what we are aiming for. As will become clear during the course, each of
these results has stronger versions, on which applications often depend, but in
this overview we prefer simple statements over strength and applicability.
We begin with two results that are fundamental in model theory. They
concern the notion of model of Σ where Σ is a set of sentences in a language
L. At this stage we only say by way of explanation that a model of Σ is a
mathematical structure in which all sentences of Σ are true. For example, if Σ
is the (infinite) set of axioms for fields of characteristic zero in the language of
rings, then a model of Σ is just a field of characteristic zero.
Theorem of Löwenheim and Skolem. If Σ is a countable set of sentences
in some language and Σ has a model, then Σ has a countable model.
Compactness Theorem (Gödel, Mal'cev). Let Σ be a set of sentences in some
language. Then Σ has a model if and only if each finite subset of Σ has a model.
The next result goes a little beyond model theory by relating the notion of
“model of Σ” to that of “provability from Σ”:
Completeness Theorem (Gödel, 1930). Let Σ be a set of sentences in some
language L, and let σ be a sentence in L. Then σ is provable from Σ if and only
if σ is true in all models of Σ.
In our treatment we shall obtain the first two theorems as byproducts of the
Completeness Theorem and its proof. In the case of the Compactness Theorem
this reflects history, but the theorem of Löwenheim and Skolem predates the
Completeness Theorem. The Löwenheim-Skolem and Compactness theorems
do not mention the notion of provability, and thus model theorists often prefer
to bypass Completeness in establishing these results; see for example Poizat’s
book.
Here is an important early result on a specific arithmetic structure:
Theorem of Presburger and Skolem. Each sentence in the language of the
structure (Z; 0, 1, +, −, <) that is true in this structure is provable from the
axioms for ordered abelian groups with least positive element 1, augmented, for
each n = 2, 3, 4, . . ., by an axiom that says that for every a there is a b such that
a = nb or a = nb + 1 or . . . or a = nb + 1 + · · · + 1 (with n disjuncts in total).
Moreover, there is an algorithm that, given any sentence in this language as
input, decides whether this sentence is true in (Z; 0, 1, +, −, <).
Note that in (Z; 0, 1, +, −, <) we have not included multiplication among the
primitives; accordingly, nb stands for b + · · · + b (with n summands).
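For instance, for n = 2 the added axiom says that every a equals 2b or 2b + 1 for some b, that is, every integer is even or odd; for n = 3 it says that every a is of the form 3b, 3b + 1, or 3b + 1 + 1.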
When we do include multiplication, the situation changes dramatically:
Incompleteness and undecidability of arithmetic (Gödel-Church, 1930s).
One can construct a sentence in the language of arithmetic that is true in the
structure (N; 0, 1, +, ·, <), but not provable from the Peano axioms.
There is no algorithm that, given any sentence in this language as input,
decides whether this sentence is true in (N; 0, 1, +, ·, <).
Here “there is no algorithm” is used in the mathematical sense of
there cannot exist an algorithm,
not in the weaker colloquial sense of “no algorithm is known.” This theorem
is intimately connected with the clarification of notions like computability and
algorithm in which Turing played a key role.
In contrast to these incompleteness and undecidability results on (sufficiently
rich) arithmetic, we have
Tarski’s theorem on the field of real numbers (1930-1950). Every sentence
in the language of arithmetic that is true in the structure
(R; 0, 1, +, ·, <)
is provable from the axioms for ordered fields augmented by the axioms
- every positive element is a square,
- every odd degree polynomial has a zero.
There is also an algorithm that decides for any given sentence in this language
as input, whether this sentence is true in (R; 0, 1, +, ·, <).
1.2 Sets and Maps
We shall use this section as an opportunity to fix notations and terminologies
that are used throughout these notes, and throughout mathematics. In a few
places we shall need more set theory than we introduce here, for example, or-
dinals and cardinals. The following little book is a good place to read about
these matters. (It also contains an axiomatic treatment of set theory starting
from scratch.)
Halmos, P. R., Naive set theory, New York, Springer, 1974
In an axiomatic treatment of set theory as in the book by Halmos all assertions
about sets below are proved from a few simple axioms. In such a treatment the
notion of set itself is left undefined, but the axioms about sets are suggested
by thinking of a set as a collection of mathematical objects, called its elements
or members. To indicate that an object x is an element of the set A we write
x ∈ A, in words: x is in A (or: x belongs to A). To indicate that x is not in A we
write x ∉ A. We consider the sets A and B as the same set (notation: A = B)
if and only if they have exactly the same elements. We often introduce a set
via the bracket notation, listing or indicating inside the brackets its elements.
For example, {1, 2, 7} is the set with 1, 2, and 7 as its only elements. Note that
{1, 2, 7} = {2, 7, 1}, and {3, 3} = {3}: the same set can be described in many
different ways. Don't confuse an object x with the set {x} that has x as its
only element: for example, the object x = {0, 1} is a set that has exactly two
elements, namely 0 and 1, but the set {x} = {{0, 1}} has only one element,
namely x.
Here are some important sets that the reader has probably encountered
previously.
Examples.
(1) The empty set: ∅ (it has no elements).
(2) The set of natural numbers: N = {0, 1, 2, 3, . . .}.
(3) The set of integers: Z = {. . . , −2, −1, 0, 1, 2, . . .}.
(4) The set of rational numbers: Q.
(5) The set of real numbers: R.
(6) The set of complex numbers: C.
Remark. Throughout these notes m and n always denote natural numbers.
For example, “for all m ...” will mean “for all m ∈ N...”.
If all elements of the set A are in the set B, then we say that A is a subset of
B (and write A ⊆ B). Thus the empty set ∅ is a subset of every set, and each
set is a subset of itself. We often introduce a set A in our discussions by defining
A to be the set of all elements of a given set B that satisfy some property P.
Notation:
A := {x ∈ B : x satisfies P} (hence A ⊆ B).
Let A and B be sets. Then we can form the following sets:
(a) A ∪ B := {x : x ∈ A or x ∈ B} (union of A and B);
(b) A ∩ B := {x : x ∈ A and x ∈ B} (intersection of A and B);
(c) A \ B := {x : x ∈ A and x ∉ B} (difference of A and B);
(d) A × B := {(a, b) : a ∈ A and b ∈ B} (cartesian product of A and B).
Thus the elements of A × B are the so-called ordered pairs (a, b) with a ∈ A
and b ∈ B. The key property of ordered pairs is that we have (a, b) = (c, d) if
and only if a = c and b = d. For example, you may think of R × R as the set
of points (a, b) in the xy-plane of coordinate geometry.
We say that A and B are disjoint if A ∩ B = ∅, that is, they have no element
in common.
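For concreteness, the constructions (a)-(d) can be tried out on finite sets in Python; the snippet below is only an illustration of the definitions.

    from itertools import product

    A = {1, 2, 7}
    B = {2, 3}

    union        = A | B               # A ∪ B = {1, 2, 3, 7}
    intersection = A & B               # A ∩ B = {2}
    difference   = A - B               # A \ B = {1, 7}
    cartesian    = set(product(A, B))  # A × B, a set of ordered pairs (a, b)

    assert union == {1, 2, 3, 7} and intersection == {2} and difference == {1, 7}
    assert (1, 3) in cartesian and len(cartesian) == len(A) * len(B)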
Remark. In a definition such as we just gave: "We say that . . . if —," the
meaning of "if" is actually "if and only if." We committed a similar abuse
of language earlier in defining set inclusion by the phrase "If —, then we say
that . . . ." We shall continue such abuse, in accordance with tradition, but
only in similarly worded definitions. Also, we shall often write "iff" or "⇔" to
abbreviate "if and only if."
Maps
Definition. A map is a triple f = (A, B, Γ) of sets A, B, Γ such that Γ ⊆ A × B
and for each a ∈ A there is exactly one b ∈ B with (a, b) ∈ Γ; we write f(a) for
this unique b, and call it the value of f at a (or the image of a under f).[3] We
call A the domain of f, and B the codomain of f, and Γ the graph of f.[4] We
write f : A → B to indicate that f is a map with domain A and codomain B,
and in this situation we also say that f is a map from A to B.
Among the many synonyms of map are
mapping, assignment, function, operator, transformation.
Typically, "function" is used when the codomain is a set of numbers of some
kind, "operator" when the elements of domain and codomain are themselves
functions, and "transformation" is used in geometric situations where domain
and codomain are equal. (We use equal as synonym for the same or identical;
also coincide is a synonym for being the same.)
Examples.
(1) Given any set A we have the identity map 1_A : A → A defined by 1_A(a) = a
for all a ∈ A.
(2) Any polynomial f(X) = a₀ + a₁X + · · · + aₙXⁿ with real coefficients
a₀, . . . , aₙ gives rise to a function x ↦ f(x) : R → R. We often use the
"maps to" symbol ↦ in this way to indicate the rule by which to each x
in the domain we associate its value f(x).
Definition. Given f : A → B and g : B → C we have a map g ◦ f : A → C
defined by (g ◦ f)(a) = g(f(a)) for all a ∈ A. It is called the composition of g
and f.
Definition. Let f : A → B be a map. It is said to be injective if for all a₁ ≠ a₂
in A we have f(a₁) ≠ f(a₂). It is said to be surjective if for each b ∈ B there
exists a ∈ A such that f(a) = b. It is said to be bijective (or a bijection) if it is
both injective and surjective. For X ⊆ A we put
f(X) := {f(x) : x ∈ X} ⊆ B (direct image of X under f).
(There is a notational conflict here when X is both a subset of A and an element
of A, but it will always be clear from the context when f(X) is meant to be
the direct image of X under f; some authors resolve the conflict by denoting this
direct image by f[X] or in some other way.) We also call f(A) = {f(a) : a ∈ A}
the image of f. For Y ⊆ B we put
f⁻¹(Y) := {x ∈ A : f(x) ∈ Y} ⊆ A (inverse image of Y under f).
Thus surjectivity of our map f is equivalent to f(A) = B.
[3] Sometimes we shall write fa instead of f(a) in order to cut down on parentheses.
[4] Other words for "domain" and "codomain" are "source" and "target", respectively.
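As a small illustration of these definitions, the Python sketch below represents a map by its graph (a dict) and computes direct and inverse images; the function names are ad hoc.

    # A map f : A -> B given by its graph, as a Python dict.
    A = {1, 2, 3, 4}
    B = {'a', 'b', 'c'}
    f = {1: 'a', 2: 'a', 3: 'b', 4: 'c'}     # f(1) = 'a', and so on

    def direct_image(f, X):
        # f(X) = {f(x) : x in X}
        return {f[x] for x in X}

    def inverse_image(f, Y):
        # f^{-1}(Y) = {x in A : f(x) in Y}
        return {x for x in f if f[x] in Y}

    assert direct_image(f, {1, 2, 3}) == {'a', 'b'}
    assert inverse_image(f, {'a'}) == {1, 2}
    # f is surjective since f(A) = B, but not injective since f(1) = f(2).
    assert direct_image(f, A) == B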
If f : A → B is a bijection then we have an inverse map f⁻¹ : B → A given by
f⁻¹(b) := the unique a ∈ A such that f(a) = b.
Note that then f⁻¹ ◦ f = 1_A and f ◦ f⁻¹ = 1_B. Conversely, if f : A → B and
g : B → A satisfy g ◦ f = 1_A and f ◦ g = 1_B, then f is a bijection with f⁻¹ = g.
(The attentive reader will notice that we just introduced a potential conflict of
notation: for bijective f : A → B and Y ⊆ B, both the inverse image of Y
under f and the direct image of Y under f⁻¹ are denoted by f⁻¹(Y); no harm
is done, since these two subsets of A coincide.)
It follows from the definition of "map" that f : A → B and g : C → D are
equal (f = g) if and only if A = C, B = D, and f(x) = g(x) for all x ∈ A. We
say that g : C → D extends f : A → B if A ⊆ C, B ⊆ D, and f(x) = g(x) for
all x ∈ A.[5]
Definition. A set A is said to be finite if there exists n and a bijection
f : {1, . . . , n} → A.
Here we use {1, . . . , n} as a suggestive notation for the set {m : 1 ≤ m ≤ n}.
For n = 0 this is just ∅. If A is finite there is exactly one such n (although if
n > 1 there will be more than one bijection f : {1, . . . , n} → A); we call this
unique n the number of elements of A or the cardinality of A, and denote it by
|A|. A set which is not finite is said to be infinite.
Definition. A set A is said to be countably infinite if there is a bijection N → A.
It is said to be countable if it is either finite or countably infinite.
Example. The sets N, Z and Q are countably infinite, but the infinite set R
is not countably infinite. Every infinite set has a countably infinite subset.
One of the standard axioms of set theory, the Power Set Axiom, says:
For any set A, there is a set whose elements are exactly the subsets of A.
Such a set of subsets of A is clearly uniquely determined by A, is denoted
by P(A), and is called the power set of A. If A is finite, so is P(A) and
|P(A)| = 2^|A|. Note that a ↦ {a} : A → P(A) is an injective map. However,
there is no surjective map A → P(A):
Cantor's Theorem. Let S : A → P(A) be a map. Then the set
{a ∈ A : a ∉ S(a)} (a subset of A)
is not an element of S(A).
Proof. Suppose otherwise. Then {a ∈ A : a ∉ S(a)} = S(b) where b ∈ A.
Assuming b ∈ S(b) yields b ∉ S(b), a contradiction. Thus b ∉ S(b); but then
b ∈ S(b), again a contradiction. This concludes the proof.
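The diagonal set in Cantor's Theorem can be computed explicitly for a small example; the Python sketch below (an illustration only) does this for a map S from a three-element set into its power set.

    A = {0, 1, 2}
    # Some map S : A -> P(A), represented as a dict of frozensets.
    S = {0: frozenset({0, 1}), 1: frozenset(), 2: frozenset({0, 2})}

    diagonal = frozenset({a for a in A if a not in S[a]})   # {a in A : a not in S(a)}
    print(diagonal)                                          # frozenset({1})
    assert diagonal not in S.values()                        # it is not in S(A)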
[5] We also say "g : C → D is an extension of f : A → B" or "f : A → B is a restriction of
g : C → D."
Let I and A be sets. Then there is a set whose elements are exactly the maps
f : I → A, and this set is denoted by A^I. For I = {1, . . . , n} we also write Aⁿ
instead of A^I. Thus an element of Aⁿ is a map a : {1, . . . , n} → A; we usually
think of such an a as the n-tuple (a(1), . . . , a(n)), and we often write aᵢ instead
of a(i). So Aⁿ can be thought of as the set of n-tuples (a₁, . . . , aₙ) with each
aᵢ ∈ A. For n = 0 the set Aⁿ has just one element — the empty tuple.
An n-ary relation on A is just a subset of Aⁿ, and an n-ary operation on
A is a map from Aⁿ into A. Instead of "1-ary" we usually say "unary", and
instead of "2-ary" we can say "binary". For example, {(a, b) ∈ Z² : a < b} is a
binary relation on Z, and integer addition is the binary operation (a, b) ↦ a + b
on Z.
Definition. {aᵢ}_{i∈I} or (aᵢ)_{i∈I} denotes a family of objects aᵢ indexed by the set
I, and is just a suggestive notation for a set {(i, aᵢ) : i ∈ I}, not to be confused
with the set {aᵢ : i ∈ I}. (There may be repetitions in the family, that is, it
may happen that aᵢ = aⱼ for distinct indices i, j ∈ I, but such repetition is not
reflected in the set {aᵢ : i ∈ I}. For example, if I = N and aₙ = a for all n, then
{(i, aᵢ) : i ∈ I} = {(i, a) : i ∈ N} is countably infinite, but {aᵢ : i ∈ I} = {a}
has just one element.) For I = N we usually say "sequence" instead of "family".
Given any family (Aᵢ)_{i∈I} of sets (that is, each Aᵢ is a set) we have a set
⋃_{i∈I} Aᵢ := {x : x ∈ Aᵢ for some i ∈ I},
the union of the family. If I is finite and each Aᵢ is finite, then so is the union
above and
|⋃_{i∈I} Aᵢ| ≤ ∑_{i∈I} |Aᵢ|.
If I is countable and each Aᵢ is countable then ⋃_{i∈I} Aᵢ is countable.
Given any family (Aᵢ)_{i∈I} of sets we have a set
∏_{i∈I} Aᵢ := {(aᵢ)_{i∈I} : aᵢ ∈ Aᵢ for all i ∈ I},
the product of the family. One axiom of set theory, the Axiom of Choice, is a
bit special, but we shall use it a few times. It says that for any family (Aᵢ)_{i∈I}
of nonempty sets there is a family (aᵢ)_{i∈I} such that aᵢ ∈ Aᵢ for all i ∈ I, that
is, ∏_{i∈I} Aᵢ ≠ ∅.
Words
Definition. Let A be a set. Think of A as an alphabet of letters. A word of
length n on A is an n-tuple (a₁, . . . , aₙ) of letters aᵢ ∈ A; because we think
of it as a word (string of letters) we shall write this tuple instead as a₁ . . . aₙ
(without parentheses or commas). There is a unique word of length 0 on A, the
empty word, written ε. Given a word a = a₁ . . . aₙ of length n ≥ 1 on A,
the first letter (or first symbol) of a is by definition a₁, and the last letter (or
last symbol) of a is aₙ. The set of all words on A is denoted A*:
A* = ⋃ₙ Aⁿ (disjoint union).
Logical expressions like formulas and terms will be introduced later as words of
a special form on suitable alphabets. When A ⊆ B we can identify A* with a
subset of B*, and this will be done whenever convenient.
Definition. Given words a = a₁ . . . aₘ and b = b₁ . . . bₙ on A of length m and
n respectively, we define their concatenation ab ∈ A*:
ab = a₁ . . . aₘb₁ . . . bₙ.
Thus ab is a word on A of length m + n. Concatenation is a binary operation
on A* that is associative: (ab)c = a(bc) for all a, b, c ∈ A*, with ε as two-sided
identity: εa = aε = a for all a ∈ A*, and with two-sided cancellation: for all
a, b, c ∈ A*, if ab = ac, then b = c, and if ac = bc, then a = b.
Equivalence Relations and Quotient Sets
Given a binary relation R on a set A it is often more suggestive to write aRb
instead of (a, b) ∈ R.
Definition. An equivalence relation on a set A is a binary relation ∼ on A such
that for all a, b, c ∈ A:
(i) a ∼ a (reflexivity);
(ii) a ∼ b implies b ∼ a (symmetry);
(iii) (a ∼ b and b ∼ c) implies a ∼ c (transitivity).
Example. Given any n we have the equivalence relation “congruence modulo
n” on Z defined as follows: for any a, b ∈ Z we have
a ≡ b mod n ⇐⇒ a −b = nc for some c ∈ Z.
For n = 0 this is just equality on Z.
Let ∼ be an equivalence relation on the set A. The equivalence class a_∼ of an
element a ∈ A is defined by a_∼ = {b ∈ A : a ∼ b} (a subset of A). For a, b ∈ A
we have a_∼ = b_∼ if and only if a ∼ b, and a_∼ ∩ b_∼ = ∅ if and only if a ≁ b. The
quotient set of A by ∼ is by definition the set of equivalence classes:
A/∼ = {a_∼ : a ∈ A}.
This quotient set is a partition of A, that is, it is a collection of pairwise disjoint
nonempty subsets of A whose union is A. (Collection is a synonym for set; we
use it here because we don't like to say "set of . . . subsets . . .".) Every partition
of A is the quotient set A/∼ for a unique equivalence relation ∼ on A. Thus
equivalence relations on A and partitions of A are just different ways to describe
the same situation.
In the previous example (congruence modulo n) the equivalence classes are
called congruence classes modulo n (or residue classes modulo n) and the
corresponding quotient set is often denoted Z/nZ.
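As an illustration, the Python sketch below exhibits Z/3Z on a finite window of integers: the residue classes are pairwise disjoint and cover the window, as a partition should.

    n = 3
    sample = range(-6, 7)   # a finite window of Z, for display purposes only

    # The equivalence class of a consists of all b in the window with a ≡ b mod n.
    def residue_class(a):
        return frozenset(b for b in sample if (a - b) % n == 0)

    classes = {residue_class(a) for a in sample}
    assert len(classes) == n                               # three residue classes
    # The classes are pairwise disjoint and their union is the whole window.
    assert sum(len(c) for c in classes) == len(list(sample))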
Remark. Readers familiar with some abstract algebra will note that the con-
struction in the example above is a special case of a more general construction—
that of a quotient of a group with respect to a normal subgroup.
Posets
A partially ordered set (short: poset) is a pair (P, ≤) consisting of a set P and
a partial ordering ≤ on P, that is, ≤ is a binary relation on P such that for all
p, q, r ∈ P:
(i) p ≤ p (reflexivity);
(ii) if p ≤ q and q ≤ p, then p = q (antisymmetry);
(iii) if p ≤ q and q ≤ r, then p ≤ r (transitivity).
If in addition we have for all p, q ∈ P,
(iv) p ≤ q or q ≤ p,
then we say that ≤ is a linear order on P, or that (P, ≤) is a linearly ordered
set. (One also uses the term total order instead of linear order.)
Each of the sets N, Z, Q, R comes with its familiar linear order on it.
As an example, take any set A and its collection P(A) of subsets. Then
X ≤ Y :⇐⇒ X ⊆ Y (for subsets X, Y of A)
defines a poset (P(A), ≤), also referred to as the power set of A ordered by
inclusion. This is not a linearly ordered set if A has more than one element.
Finite linearly ordered sets are determined "up to unique isomorphism" by their
size: if (P, ≤) is a linearly ordered set and |P| = n, then there is a unique map
ι : P → {1, . . . , n} such that for all p, q ∈ P we have: p ≤ q ⇐⇒ ι(p) ≤ ι(q).
This map ι is a bijection.
Let (P, ≤) be a poset. Here is some useful notation. For x, y ∈ P we set
x ≥ y :⇐⇒ y ≤ x,
x < y :⇐⇒ y > x :⇐⇒ x ≤ y and x ≠ y.
Note that (P, ≥) is also a poset. A least element of P is a p ∈ P such that p ≤ x
for all x ∈ P; a largest element of P is defined likewise, with ≥ instead of ≤.
Of course, P can have at most one least element; therefore we can refer to the
least element of P, if P has a least element; likewise, we can refer to the largest
element of P, if P has a largest element.
A minimal element of P is a p ∈ P such that there is no x ∈ P with x < p;
a maximal element of P is defined likewise, with > instead of <. If P has a
least element, then this element is also the unique minimal element of P; some
posets, however, have more than one minimal element. The reader might want
to prove the following result to get a feeling for these notions:
If P is finite and nonempty, then P has a maximal element, and there is a linear
order ≤′ on P that extends ≤ in the sense that
p ≤ q =⇒ p ≤′ q, for all p, q ∈ P.
(Hint: use induction on |P|.)
Let X ⊆ P. A lowerbound (respectively, upperbound) of X in P is an element l ∈
P (respectively, an element u ∈ P), such that l ≤ x for all x ∈ X (respectively,
x ≤ u for all x ∈ X).
We often tacitly consider X as a poset in its own right, by restricting the
given partial ordering of P to X. More precisely this means that we consider
the poset (X, ≤_X) where the partial ordering ≤_X on X is defined by
x ≤_X y ⇐⇒ x ≤ y (x, y ∈ X).
Thus we can speak of least, largest, minimal, and maximal elements of a set
X ⊆ P, when the ambient poset (P, ≤) is clear from the context. For example,
when X is the collection of nonempty subsets of a set A and X is ordered by
inclusion, then the minimal elements of X are the singletons {a} with a ∈ A.
We call X a chain in P if (X, ≤_X) is linearly ordered.
Occasionally we shall use the following fact about posets (P, ≤).
Zorn’s Lemma. Suppose P is nonempty and every nonempty chain in P has
an upperbound in P. Then P has a maximal element.
For a further discussion of Zorn’s Lemma and its proof using the Axiom of
Choice we refer the reader to Halmos’s book on set theory.
Chapter 2
Basic Concepts of Logic
2.1 Propositional Logic
Propositional logic is the fragment of logic where we construct new statements
from given statements using so-called connectives like “not”, “or” and “and”.
The truth value of such a new statement is then completely determined by the
truth values of the given statements. Thus, given any statements p and q, we
can form the three statements
¬p (the negation of p, pronounced as "not p"),
p ∨ q (the disjunction of p and q, pronounced as "p or q"),
p ∧ q (the conjunction of p and q, pronounced as "p and q").
This leads to more complicated combinations like ¬(p ∧ (¬q)). We shall regard
¬p as true if and only if p is not true; also, p ∨ q is defined to be true if and
only if p is true or q is true (including the possibility that both are true), and
p ∧ q is deemed to be true if and only if p is true and q is true. Instead of "not
true" we also say "false". We now introduce a formalism that makes this into
mathematics.
We start with the five distinct symbols
⊤  ⊥  ¬  ∨  ∧
to be thought of as true, false, not, or, and and, respectively. These symbols are
fixed throughout the course, and are called propositional connectives. In this
section we also fix a set A whose elements will be called propositional atoms (or
just atoms), such that no propositional connective is an atom. It may help the
reader to think of an atom a as a variable for which we can substitute arbitrary
statements, assumed to be either true or false.
A proposition on A is a word on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} that can
be obtained by applying the following rules:
(i) each atom a ∈ A (viewed as a word of length 1) is a proposition on A;
(ii) ⊤ and ⊥ (viewed as words of length 1) are propositions on A;
(iii) if p and q are propositions on A, then the concatenations ¬p, ∨pq and ∧pq
are propositions on A.
For the rest of this section "proposition" means "proposition on A", and p, q, r
(sometimes with subscripts) will denote propositions.
Example. Suppose a, b, c are atoms. Then ∧∨¬ab¬c is a proposition. This
follows from the rules above: a is a proposition, so ¬a is a proposition, hence
∨¬ab as well; also ¬c is a proposition, and thus ∧∨¬ab¬c is a proposition.
We defined "proposition" using the suggestive but vague phrase "can be ob-
tained by applying the following rules". The reader should take such an infor-
mal description as shorthand for a completely explicit definition, which in the
case at hand is as follows:
A proposition is a word w on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} for which there
is a sequence w₁, . . . , wₙ of words on that same alphabet, with n ≥ 1, such that
w = wₙ and for each k ∈ {1, . . . , n}, either wₖ ∈ A ∪ {⊤, ⊥} (where each element
in the last set is viewed as a word of length 1), or there are i, j ∈ {1, . . . , k − 1}
such that wₖ is one of the concatenations ¬wᵢ, ∨wᵢwⱼ, ∧wᵢwⱼ.
We let Prop(A) denote the set of propositions.
Remark. Having the connectives ∨ and ∧ in front of the propositions they
“connect” rather than in between, is called prefix notation or Polish notation.
This is theoretically elegant, but for the sake of readability we usually write p∨q
and p ∧ q to denote ∨pq and ∧pq respectively, and we also use parentheses and
brackets if this helps to clarify the structure of a proposition. So the proposition
in the example above could be denoted by [(¬a) ∨ b] ∧ (¬c), or even by (¬a ∨ b) ∧ ¬c
since we shall agree that ¬ binds stronger than ∨ and ∧ in this informal way
of indicating propositions. Because of the informal nature of these conventions,
we don’t have to give precise rules for their use; it’s enough that each actual
use is clear to the reader.
The intended structure of a proposition—how we think of it as built up
from atoms via connectives—is best exhibited in the form of a tree, a two-
dimensional array, rather than as a one-dimensional string. Such trees, however,
occupy valuable space on the printed page, and are typographically demanding.
Fortunately, our “official” prefix notation does uniquely determine the intended
structure of a proposition: that is what the next lemma amounts to.
Lemma 2.1.1 (Unique Readability). If p has length 1, then either p = ⊤, or
p = ⊥, or p is an atom. If p has length > 1, then its first symbol is either ¬,
or ∨, or ∧. If the first symbol of p is ¬, then p = ¬q for a unique q. If the first
symbol of p is ∨, then p = ∨qr for a unique pair (q, r). If the first symbol of p
is ∧, then p = ∧qr for a unique pair (q, r).
(Note that we used here our convention that p, q, r denote propositions.) Only
the last two claims are worth proving in print, the others should require only a
moment’s thought. For now we shall assume this lemma without proof. At the
end of this section we establish more general results which are needed also later
in the course.
Remark. Rather than thinking of a proposition as a statement, it’s better
viewed as a function whose arguments and values are statements: replacing the
atoms in a proposition by specific mathematical statements like "2 · 2 = 4",
"π² < 7", and "every even integer > 2 is the sum of two prime numbers", we
obtain again a mathematical statement.
We shall use the following notational conventions: p → q denotes ¬p ∨ q, and
p ↔ q denotes (p → q) ∧ (q → p). By recursion on n we define
p₁ ∨ · · · ∨ pₙ =
   ⊥                              if n = 0,
   p₁                             if n = 1,
   p₁ ∨ p₂                        if n = 2,
   (p₁ ∨ · · · ∨ pₙ₋₁) ∨ pₙ       if n > 2.
Thus p ∨ q ∨ r stands for (p ∨ q) ∨ r. We call p₁ ∨ · · · ∨ pₙ the disjunction of
p₁, . . . , pₙ. The reason that for n = 0 we take this disjunction to be ⊥ is that
we want a disjunction to be true if and only if (at least) one of the disjuncts is
true.
Similarly, the conjunction p₁ ∧ · · · ∧ pₙ of p₁, . . . , pₙ is defined by replacing
everywhere ∨ by ∧ and ⊥ by ⊤ in the definition of p₁ ∨ · · · ∨ pₙ.
Definition. A truth assignment is a map t : A → {0, 1}. We extend such a t
to t̂ : Prop(A) → {0, 1} by requiring
(i) t̂(⊤) = 1
(ii) t̂(⊥) = 0
(iii) t̂(¬p) = 1 − t̂(p)
(iv) t̂(p ∨ q) = max(t̂(p), t̂(q))
(v) t̂(p ∧ q) = min(t̂(p), t̂(q))
Note that there is exactly one such extension t̂ by unique readability. To simplify
notation we often write t instead of t̂. The array below is called a truth table.
It shows on each row below the top row how the two leftmost entries t(p) and
t(q) determine t(¬p), t(p ∨ q), t(p ∧ q), t(p → q) and t(p ↔ q).

p  q  ¬p  p ∨ q  p ∧ q  p → q  p ↔ q
0  0   1    0      0      1      1
0  1   1    1      0      1      0
1  0   0    1      0      0      0
1  1   0    1      1      1      1

Let t : A → {0, 1}. Note that t(p → q) = 1 if and only if t(p) ≤ t(q), and that
t(p ↔ q) = 1 if and only if t(p) = t(q).
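To see the recursion behind t̂ in action, here is a small Python sketch that evaluates propositions written in our prefix notation; the encoding of the connectives by ASCII characters is ad hoc, and unique readability is what makes the recursion well defined.

    # Propositions in prefix (Polish) notation over one-character atoms,
    # with 'T' for ⊤, 'F' for ⊥, '~' for ¬, '|' for ∨, '&' for ∧.
    def evaluate(p, t):
        """Return the value of p in {0, 1} under the truth assignment t (a dict on atoms)."""
        def go(i):
            # Evaluate the unique proposition starting at position i;
            # return (value, position just past it).
            c = p[i]
            if c == 'T':
                return 1, i + 1
            if c == 'F':
                return 0, i + 1
            if c == '~':
                v, j = go(i + 1)
                return 1 - v, j
            if c in '|&':
                v1, j = go(i + 1)
                v2, k = go(j)
                return (max if c == '|' else min)(v1, v2), k
            return t[c], i + 1           # c is an atom
        value, end = go(0)
        assert end == len(p)             # the whole word was consumed
        return value

    # The proposition from the earlier example, (¬a ∨ b) ∧ ¬c, in prefix form: ∧∨¬ab¬c
    assert evaluate('&|~ab~c', {'a': 0, 'b': 0, 'c': 0}) == 1
    assert evaluate('&|~ab~c', {'a': 1, 'b': 0, 'c': 0}) == 0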
Suppose a₁, . . . , aₙ are the distinct atoms that occur in p, and we know how
p is built up from those atoms. Then we can compute in a finite number of steps
t(p) from t(a₁), . . . , t(aₙ). In particular, t(p) = t′(p) for any t′ : A → {0, 1} such
that t(aᵢ) = t′(aᵢ) for i = 1, . . . , n.
Definition. We say that p is a tautology (notation: ⊨ p) if t(p) = 1 for all
t : A → {0, 1}. We say that p is satisfiable if t(p) = 1 for some t : A → {0, 1}.
Thus ⊤ is a tautology, and p ∨ ¬p, p → (p ∨ q) are tautologies for all p and
q. By the remark preceding the definition one can verify whether any given p
with exactly n distinct atoms in it is a tautology by computing 2ⁿ numbers and
checking that these numbers all come out 1. (To do this accurately by hand is
already cumbersome for n = 5, but computers can handle somewhat larger n.
Fortunately, other methods are often efficient for special cases.)
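The 2ⁿ-step check just described is easy to mechanize. In the Python sketch below a proposition with n atoms is represented simply by a function from {0, 1}ⁿ to {0, 1} (in the spirit of exercise 3 at the end of this section), and all 2ⁿ truth assignments are tried.

    from itertools import product

    def truth_table_check(f, n):
        """f takes n arguments in {0, 1}; classify f as tautology, satisfiable, or unsatisfiable."""
        values = [f(*t) for t in product((0, 1), repeat=n)]   # 2**n rows of the truth table
        if all(values):
            return 'tautology'
        return 'satisfiable' if any(values) else 'unsatisfiable'

    # p ∨ ¬p is a tautology; p ∧ ¬p is unsatisfiable; p → (p ∨ q) is a tautology.
    assert truth_table_check(lambda p: max(p, 1 - p), 1) == 'tautology'
    assert truth_table_check(lambda p: min(p, 1 - p), 1) == 'unsatisfiable'
    assert truth_table_check(lambda p, q: max(1 - p, max(p, q)), 2) == 'tautology'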
Remark. Note that ⊨ p ↔ q iff t(p) = t(q) for all t : A → {0, 1}. We call p
equivalent to q if ⊨ p ↔ q. Note that "equivalent to" defines an equivalence
relation on Prop(A). The lemma below gives a useful list of equivalences. We
leave it to the reader to verify them.
Lemma 2.1.2. For all p, q, r we have the following equivalences:
(1) ⊨ (p ∨ p) ↔ p, ⊨ (p ∧ p) ↔ p
(2) ⊨ (p ∨ q) ↔ (q ∨ p), ⊨ (p ∧ q) ↔ (q ∧ p)
(3) ⊨ (p ∨ (q ∨ r)) ↔ ((p ∨ q) ∨ r), ⊨ (p ∧ (q ∧ r)) ↔ ((p ∧ q) ∧ r)
(4) ⊨ (p ∨ (q ∧ r)) ↔ ((p ∨ q) ∧ (p ∨ r)), ⊨ (p ∧ (q ∨ r)) ↔ ((p ∧ q) ∨ (p ∧ r))
(5) ⊨ (p ∨ (p ∧ q)) ↔ p, ⊨ (p ∧ (p ∨ q)) ↔ p
(6) ⊨ (¬(p ∨ q)) ↔ (¬p ∧ ¬q), ⊨ (¬(p ∧ q)) ↔ (¬p ∨ ¬q)
(7) ⊨ (p ∨ ¬p) ↔ ⊤, ⊨ (p ∧ ¬p) ↔ ⊥
(8) ⊨ ¬¬p ↔ p
Items (1), (2), (3), (4), (5), and (6) are often referred to as the idempotent
law, commutativity, associativity, distributivity, the absorption law, and the De
Morgan law, respectively. Note the left-right symmetry in (1)–(7): the so-called
duality of propositional logic. We shall return to this issue in the more algebraic
setting of boolean algebras.
Some notation: let (pᵢ)_{i∈I} be a family of propositions with finite index set
I, choose a bijection k ↦ i(k) : {1, . . . , n} → I and set
⋁_{i∈I} pᵢ := p_{i(1)} ∨ · · · ∨ p_{i(n)},    ⋀_{i∈I} pᵢ := p_{i(1)} ∧ · · · ∧ p_{i(n)}.
If I is clear from context we just write ⋁ᵢ pᵢ and ⋀ᵢ pᵢ instead. Of course, the
notations ⋁_{i∈I} pᵢ and ⋀_{i∈I} pᵢ can only be used when the particular choice of
bijection of {1, . . . , n} with I does not matter; this is usually the case, because
the equivalence class of p_{i(1)} ∨ · · · ∨ p_{i(n)} does not depend on this choice, and
the same is true for the equivalence class of p_{i(1)} ∧ · · · ∧ p_{i(n)}.
Next we define “model of Σ” and “tautological consequence of Σ”.
Definition. Let Σ ⊆ Prop(A). By a model of Σ we mean a truth assignment
t : A → {0, 1} such that t(p) = 1 for all p ∈ Σ. We say that a proposition p is
a tautological consequence of Σ (written Σ ⊨ p) if t(p) = 1 for every model t of
Σ. Note that ⊨ p is the same as ∅ ⊨ p.
Lemma 2.1.3. Let Σ ⊆ Prop(A) and p, q ∈ Prop(A). Then
(1) Σ ⊨ p ∧ q ⇐⇒ Σ ⊨ p and Σ ⊨ q.
(2) Σ ⊨ p =⇒ Σ ⊨ p ∨ q.
(3) Σ ∪ {p} ⊨ q ⇐⇒ Σ ⊨ p → q.
(4) If Σ ⊨ p and Σ ⊨ p → q, then Σ ⊨ q. (Modus Ponens.)
Proof. We will prove (3) here and leave the rest as exercise.
(⇒) Assume Σ ∪ {p} ⊨ q. To derive Σ ⊨ p → q we consider any model
t : A → {0, 1} of Σ, and need only show that then t(p → q) = 1. If t(p) = 1
then t(Σ ∪ {p}) ⊆ {1}, hence t(q) = 1 and thus t(p → q) = 1. If t(p) = 0 then
t(p → q) = 1 by definition.
(⇐) Assume Σ ⊨ p → q. To derive Σ ∪ {p} ⊨ q we consider any model
t : A → {0, 1} of Σ ∪ {p}, and need only derive that t(q) = 1. By assumption
t(p → q) = 1 and in view of t(p) = 1, this gives t(q) = 1 as required.
We finish this section with the promised general result on unique readability.
We also establish facts of similar nature that are needed later.
Definition. Let F be a set of symbols with a function a : F → N (called the
arity function). A symbol f ∈ F is said to have arity n if a(f) = n. A word on
F is said to be admissible if it can be obtained by applying the following rules:
(i) If f ∈ F has arity 0, then f viewed as a word of length 1 is admissible.
(ii) If f ∈ F has arity m > 0 and t₁, . . . , tₘ are admissible words on F, then
the concatenation ft₁ . . . tₘ is admissible.
Below we just write "admissible word" instead of "admissible word on F". Note
that the empty word is not admissible, and that the last symbol of an admissible
word cannot be of arity > 0.
Example. Take F = A ∪ {⊤, ⊥, ¬, ∨, ∧} and define arity : F → N by
arity(x) = 0 for x ∈ A ∪ {⊤, ⊥}, arity(¬) = 1, arity(∨) = arity(∧) = 2.
Then the set of admissible words is just Prop(A).
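The rules (i) and (ii) translate into a one-pass recognizer for admissible words. The Python sketch below is only an illustration (symbols are single characters and the arity function is a dict; these conventions are ad hoc); its recursive descent follows the unique readability results proved next.

    def is_admissible(word, arity):
        """Check whether `word` (a string of one-character symbols) is admissible
        with respect to the arity function `arity` (a dict: symbol -> arity)."""
        def parse(i):
            # Try to read one admissible word starting at position i;
            # return the position just past it, or None on failure.
            if i >= len(word) or word[i] not in arity:
                return None
            j = i + 1
            for _ in range(arity[word[i]]):   # read as many arguments as the arity demands
                j = parse(j)
                if j is None:
                    return None
            return j
        return parse(0) == len(word)

    # The arity function for Prop(A) with atoms a, b, c, as in the example above.
    arity = {'a': 0, 'b': 0, 'c': 0, 'T': 0, 'F': 0, '~': 1, '|': 2, '&': 2}
    assert is_admissible('&|~ab~c', arity)        # ∧∨¬ab¬c from the earlier example
    assert not is_admissible('|a', arity)         # ∨ needs two arguments
    assert not is_admissible('ab', arity)         # a word on F, but not admissible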
Lemma 2.1.4. Let t₁, . . . , tₘ and u₁, . . . , uₙ be admissible words and w any
word on F such that t₁ . . . tₘw = u₁ . . . uₙ. Then m ≤ n, tᵢ = uᵢ for i =
1, . . . , m, and w = uₘ₊₁ . . . uₙ.
Proof. By induction on the length of u₁ . . . uₙ. If this length is 0, then m = n =
0 and w is the empty word. Suppose the length is > 0, and assume the lemma
holds for smaller lengths. Note that n > 0. If m = 0, then the conclusion of the
lemma holds, so suppose m > 0. The first symbol of t₁ equals the first symbol
of u₁. Say this first symbol is h ∈ F with arity k. Then t₁ = ha₁ . . . aₖ and
u₁ = hb₁ . . . bₖ where a₁, . . . , aₖ and b₁, . . . , bₖ are admissible words. Cancelling
the first symbol h gives
a₁ . . . aₖt₂ . . . tₘw = b₁ . . . bₖu₂ . . . uₙ.
(Caution: any of k, m − 1, n − 1 could be 0.) We have length(b₁ . . . bₖu₂ . . . uₙ) =
length(u₁ . . . uₙ) − 1, so the induction hypothesis applies. It yields k + m − 1 ≤
k + n − 1 (so m ≤ n), a₁ = b₁, . . . , aₖ = bₖ (so t₁ = u₁), t₂ = u₂, . . . , tₘ = uₘ,
and w = uₘ₊₁ . . . uₙ.
Here are two immediate consequences that we shall use:
1. Let t₁, . . . , tₘ and u₁, . . . , uₙ be admissible words such that t₁ . . . tₘ =
u₁ . . . uₙ. Then m = n and tᵢ = uᵢ for i = 1, . . . , m.
2. Let t and u be admissible words and w a word on F such that tw = u.
Then t = u and w is the empty word.
Lemma 2.1.5 (Unique Readability).
Each admissible word equals ft₁ . . . tₘ for a unique tuple (f, t₁, . . . , tₘ) where
f ∈ F has arity m and t₁, . . . , tₘ are admissible words.
Proof. Suppose ft₁ . . . tₘ = gu₁ . . . uₙ where f, g ∈ F have arity m and n
respectively, and t₁, . . . , tₘ, u₁, . . . , uₙ are admissible words on F. We have to
show that then f = g, m = n and tᵢ = uᵢ for i = 1, . . . , m. Observe first that
f = g since f and g are the first symbols of two equal words. After cancelling
the first symbol of both words, the first consequence of the previous lemma leads
to the desired conclusion.
Given words v, w ∈ F* and i ∈ {1, . . . , length(w)}, we say that v occurs in
w at starting position i if w = w₁vw₂ where w₁, w₂ ∈ F* and w₁ has length
i − 1. (For example, if f, g ∈ F are distinct, then the word fgf has exactly
two occurrences in the word fgfgf, one at starting position 1, and the other
at starting position 3; these two occurrences overlap, but such overlapping is
impossible with admissible words, see exercise 5 at the end of this section.)
Given w = w₁vw₂ as above, and given v′ ∈ F*, the result of replacing v in w at
starting position i by v′ is by definition the word w₁v′w₂.
Lemma 2.1.6. Let w be an admissible word and 1 ≤ i ≤ length(w). Then there
is a unique admissible word that occurs in w at starting position i.
Proof. We prove existence by induction on length(w). Uniqueness then follows
from the fact stated just before Lemma 2.1.5. Clearly w is an admissible word
occurring in w at starting position 1. Suppose i > 1. Then we write w =
ft₁ . . . tₙ where f ∈ F has arity n > 0, and t₁, . . . , tₙ are admissible words, and
we take j ∈ {1, . . . , n} such that
1 + length(t₁) + · · · + length(tⱼ₋₁) < i ≤ 1 + length(t₁) + · · · + length(tⱼ).
Now apply the inductive assumption to tⱼ.
Remark. Let w = ft₁ . . . tₙ where f ∈ F has arity n > 0, and t₁, . . . , tₙ are
admissible words. Put lⱼ := 1 + length(t₁) + · · · + length(tⱼ) for j = 0, . . . , n (so
l₀ = 1). Suppose lⱼ₋₁ < i ≤ lⱼ, 1 ≤ j ≤ n, and let v be the admissible word
that occurs in w at starting position i. Then the proof of the last lemma shows
that this occurrence is entirely inside tⱼ, that is, i − 1 + length(v) ≤ lⱼ.
Corollary 2.1.7. Let w be an admissible word and 1 ≤ i ≤ length(w). Then
the result of replacing the admissible word v in w at starting position i by an
admissible word v′ is again an admissible word.
This follows by a routine induction on length(w), using the last remark.
Exercises. In the exercises below, A = {a₁, . . . , aₙ}, |A| = n.
(1) (Disjunctive Normal Form) Each p is equivalent to a disjunction p₁ ∨ · · · ∨ pₖ
where each disjunct pᵢ is a conjunction a₁^ε₁ ∧ · · · ∧ aₙ^εₙ with all εⱼ ∈ {−1, 1} and
where for an atom a we put a¹ := a and a⁻¹ := ¬a.
(2) (Conjunctive Normal Form) Same as last problem, except that the signs ∨ and ∧
are interchanged, as well as the words "disjunction" and "conjunction," and also
the words "disjunct" and "conjunct."
(3) To each p associate the function f_p : {0, 1}^A → {0, 1} defined by f_p(t) = t(p).
(Think of a truth table for p where the 2ⁿ rows correspond to the 2ⁿ truth
assignments t : A → {0, 1}, and the column under p records the values t(p).)
Then for every function f : {0, 1}^A → {0, 1} there is a p such that f = f_p.
(4) Let ∼ be the equivalence relation on Prop(A) given by
p ∼ q :⇐⇒ ⊨ p ↔ q.
Then the quotient set Prop(A)/∼ is finite; determine its cardinality as a function
of n = |A|.
(5) Let w be an admissible word and 1 ≤ i < i′ ≤ length(w). Let v and v′ be the
admissible words that occur at starting positions i and i′ respectively in w. Then
these occurrences are either nonoverlapping, that is, i − 1 + length(v) < i′, or the
occurrence of v′ is entirely inside that of v, that is,
i′ − 1 + length(v′) ≤ i − 1 + length(v).
2.2 Completeness for Propositional Logic
In this section we introduce a proof system for propositional logic, state the
completeness of this proof system, and then prove this completeness.
As in the previous section we fix a set A of atoms, and the conventions of
that section remain in force.
A propositional axiom is by definition a proposition that occurs in the list below,
for some choice of p, q, r:
1. ⊤
2. p → (p ∨ q); p → (q ∨ p)
3. ¬p → (¬q → ¬(p ∨ q))
4. (p ∧ q) → p; (p ∧ q) → q
5. p → (q → (p ∧ q))
6. (p → (q → r)) → ((p → q) → (p → r))
7. p → (¬p → ⊥)
8. (¬p → ⊥) → p
Each of items 2–8 describes infinitely many propositional axioms. That is why
we do not call these items axioms, but axiom schemes. For example, if a, b ∈ A,
then a → (a ∨ ⊥) and b → (b ∨ (a ∧ b)) are distinct propositional axioms,
and both instances of axiom scheme 2. It is easy to check that all propositional
axioms are tautologies.
Here is our single rule of inference for propositional logic:
Modus Ponens (MP): from p and p → q, infer q.
In the rest of this section Σ denotes a set of propositions, that is, Σ ⊆ Prop(A).
Definition. A formal proof, or just proof, of p from Σ is a sequence p₁, . . . , pₙ
with n ≥ 1 and pₙ = p, such that for k = 1, . . . , n:
(i) either pₖ ∈ Σ,
(ii) or pₖ is a propositional axiom,
(iii) or there are i, j ∈ {1, . . . , k − 1} such that pₖ can be inferred from pᵢ and
pⱼ by MP.
If there exists a proof of p from Σ, then we write Σ ⊢ p, and say Σ proves p.
For Σ = ∅ we also write ⊢ p instead of Σ ⊢ p.
Lemma 2.2.1. ⊢ p → p.
Proof. The proposition p → ((p → p) → p) is a propositional axiom by axiom
scheme 2. By axiom scheme 6,
[p → ((p → p) → p)] → [(p → (p → p)) → (p → p)]
is a propositional axiom. Applying MP to these two axioms yields
⊢ (p → (p → p)) → (p → p).
Since p → (p → p) is also a propositional axiom by scheme 2, we can apply MP
again to obtain ⊢ p → p.
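For instance, written out as a sequence p₁, . . . , p₅ in the sense of the definition of formal proof above, the argument just given becomes:
1. p → ((p → p) → p)    (axiom scheme 2)
2. [p → ((p → p) → p)] → [(p → (p → p)) → (p → p)]    (axiom scheme 6)
3. (p → (p → p)) → (p → p)    (MP, from 1 and 2)
4. p → (p → p)    (axiom scheme 2)
5. p → p    (MP, from 4 and 3)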
Proposition 2.2.2. If Σ ⊢ p, then Σ ⊨ p.
This should be clear from earlier facts that we stated and which the reader
was asked to verify. The converse is true but less obvious:
Theorem 2.2.3 (Completeness - first form).
Σ ⊢ p ⇐⇒ Σ ⊨ p
Remark. There is some arbitrariness in our choice of axioms and rule, and thus
in our notion of formal proof. This is in contrast to the definition of ⊨, which
merely formalizes the basic underlying idea of propositional logic as stated in
the introduction to the previous section. However, the equivalence of ⊢ and
⊨ (Completeness Theorem) shows that our choice of axioms and rule yields a
complete proof system. Moreover, this equivalence has consequences which can
be stated in terms of ⊨ alone. An example is the Compactness Theorem.
Theorem 2.2.4 (Compactness of Propositional Logic). If Σ ⊨ p then there is
a finite subset Σ₀ of Σ such that Σ₀ ⊨ p.
It is convenient to prove first a variant of the Completeness Theorem.
Definition. We say that Σ is inconsistent if Σ ⊢ ⊥, and otherwise (that is, if
Σ ⊬ ⊥) we call Σ consistent.
Theorem 2.2.5 (Completeness - second form).
Σ is consistent if and only if Σ has a model.
From this second form of the Completeness Theorem we obtain easily an al-
ternative form of the Compactness of Propositional Logic:
Corollary 2.2.6. Σ has a model ⇐⇒ every finite subset of Σ has a model.
We first show that the second form of the Completeness Theorem implies the
first form. For this we need a technical lemma that will also be useful later in
the course.
Lemma 2.2.7 (Deduction Lemma). Suppose Σ ∪ {p} ⊢ q. Then Σ ⊢ p → q.
Proof. By induction on proofs.
If q is a propositional axiom, then Σ ⊢ q, and since q → (p → q) is a
propositional axiom, MP yields Σ ⊢ p → q. If q ∈ Σ ∪ {p}, then either q ∈ Σ
in which case the same argument as before gives Σ ⊢ p → q, or q = p and then
Σ ⊢ p → q since ⊢ p → p by the lemma above.
Now assume that q is obtained by MP from r and r → q, where Σ ∪ {p} ⊢ r
and Σ ∪ {p} ⊢ r → q and where we assume inductively that Σ ⊢ p → r and
Σ ⊢ p → (r → q). Then we obtain Σ ⊢ p → q from the propositional axiom
(p → (r → q)) → ((p → r) → (p → q))
by applying MP twice.
Corollary 2.2.8. Σ ⊢ p if and only if Σ ∪ {¬p} is inconsistent.
Proof. (⇒) Assume Σ ⊢ p. Since p → (¬p → ⊥) is a propositional axiom, we
can apply MP twice to get Σ ∪ {¬p} ⊢ ⊥. Hence Σ ∪ {¬p} is inconsistent.
(⇐) Assume Σ ∪ {¬p} is inconsistent. Then Σ ∪ {¬p} ⊢ ⊥, and so by the
Deduction Lemma we have Σ ⊢ ¬p → ⊥. Since (¬p → ⊥) → p is a propositional
axiom, MP yields Σ ⊢ p.
Corollary 2.2.9. The second form of Completeness (Theorem 2.2.5) implies
the first form (Theorem 2.2.3).
Proof. Assume the second form of Completeness holds, and that Σ ⊨ p. We
want to show that then Σ ⊢ p. From Σ ⊨ p it follows that Σ ∪ {¬p} has
no model. Hence by the second form of Completeness, the set Σ ∪ {¬p} is
inconsistent. Then by Corollary 2.2.8 we have Σ ⊢ p.
Definition. We say that Σ is complete if Σ is consistent, and for each p either
Σ ⊢ p or Σ ⊢ ¬p.
Completeness as a property of a set of propositions should not be confused
with the completeness of our proof system as expressed by the Completeness
Theorem. (It is just a historical accident that we use the same word.)
Below we use Zorn’s Lemma to show that any consistent set of propositions
can be extended to a complete set of propositions.
Lemma 2.2.10 (Lindenbaum). Suppose Σ is consistent. Then Σ ⊆ Σ′ for some
complete Σ′ ⊆ Prop(A).
Proof. Let P be the collection of all consistent subsets of Prop(A) that contain
Σ. In particular Σ ∈ P. We consider P as partially ordered by inclusion. Any
totally ordered subcollection {Σᵢ : i ∈ I} of P with I ≠ ∅ has an upper bound
in P, namely ⋃{Σᵢ : i ∈ I}. (To see this it suffices to check that ⋃{Σᵢ : i ∈ I}
is consistent. Suppose otherwise, that is, suppose ⋃{Σᵢ : i ∈ I} ⊢ ⊥. Since a
proof can use only finitely many of the axioms in ⋃{Σᵢ : i ∈ I}, there exists
i ∈ I such that Σᵢ ⊢ ⊥, contradicting the consistency of Σᵢ.)
Thus by Zorn's lemma P has a maximal element Σ′. We claim that then Σ′
is complete. For any p, if Σ′ ⊬ p, then by Corollary 2.2.8 the set Σ′ ∪ {¬p} is
consistent, hence ¬p ∈ Σ′ by maximality of Σ′, and thus Σ′ ⊢ ¬p.
Suppose A is countable. For this case we can give a proof of Lindenbaum's
Lemma without using Zorn's Lemma as follows.
Proof. Because A is countable, Prop(A) is countable. Take an enumeration
(pₙ)_{n∈N} of Prop(A). We construct an increasing sequence Σ = Σ₀ ⊆ Σ₁ ⊆ . . .
of consistent subsets of Prop(A) as follows. Given a consistent Σₙ ⊆ Prop(A)
we define
Σₙ₊₁ = Σₙ ∪ {pₙ} if Σₙ ⊢ pₙ, and Σₙ₊₁ = Σₙ ∪ {¬pₙ} if Σₙ ⊬ pₙ,
so Σₙ₊₁ remains consistent by earlier results. Thus Σ′ := ⋃{Σₙ : n ∈ N}
is consistent and also complete: for any n either pₙ ∈ Σₙ₊₁ ⊆ Σ′ or ¬pₙ ∈
Σₙ₊₁ ⊆ Σ′.
Define the truth assignment t_Σ : A → {0, 1} by
t_Σ(a) = 1 if Σ ⊢ a, and t_Σ(a) = 0 otherwise.
Lemma 2.2.11. Suppose Σ is complete. Then for each p we have
Σ ⊢ p ⇐⇒ t_Σ(p) = 1.
In particular, t_Σ is a model of Σ.
Proof. We proceed by induction on the number of connectives in p. If p is an
atom or p = ⊤ or p = ⊥, then the equivalence follows immediately from the
definitions. It remains to consider the three cases below.
Case 1: p = ¬q, and (inductive assumption) Σ ⊢ q ⇐⇒ t_Σ(q) = 1.
(⇒) Suppose Σ ⊢ p. Then t_Σ(p) = 1: Otherwise, t_Σ(q) = 1, so Σ ⊢ q by the
inductive assumption; since q → (p → ⊥) is a propositional axiom, we can apply
MP twice to get Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose t_Σ(p) = 1. Then t_Σ(q) = 0, so Σ ⊬ q, and thus Σ ⊢ p by
completeness of Σ.
Case 2: p = q ∨ r, Σ ⊢ q ⇐⇒ t_Σ(q) = 1, and Σ ⊢ r ⇐⇒ t_Σ(r) = 1.
(⇒) Suppose that Σ ⊢ p. Then t_Σ(p) = 1: Otherwise, t_Σ(p) = 0, so t_Σ(q) = 0
and t_Σ(r) = 0, hence Σ ⊬ q and Σ ⊬ r, and thus Σ ⊢ ¬q and Σ ⊢ ¬r by
completeness of Σ; since ¬q → (¬r → ¬p) is a propositional axiom, we can apply
MP twice to get Σ ⊢ ¬p, which in view of the propositional axiom p → (¬p → ⊥)
and MP yields Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose t_Σ(p) = 1. Then t_Σ(q) = 1 or t_Σ(r) = 1. Hence Σ ⊢ q or Σ ⊢ r.
Using MP and the propositional axioms q → p and r → p we obtain Σ ⊢ p.
Case 3: p = q ∧ r, Σ ⊢ q ⇐⇒ t_Σ(q) = 1, and Σ ⊢ r ⇐⇒ t_Σ(r) = 1.
We leave this case as an exercise.
We can now finish the proof of Completeness (second form):
Suppose Σ is consistent. Then by Lindenbaum's Lemma Σ is a subset of a
complete set Σ′ of propositions. By the previous lemma, such a Σ′ has a model,
and such a model is also a model of Σ.
The converse—if Σ has a model, then Σ is consistent—is left to the reader.
Application to coloring infinite graphs. What follows is a standard use
of compactness of propositional logic, one of many. Let (V, E) be a graph, by
which we mean here that V is a set (of vertices) and E (the set of edges) is a
binary relation on V that is irreflexive and symmetric, that is, for all v, w ∈ V
we have (v, v) ∉ E, and if (v, w) ∈ E, then (w, v) ∈ E. Let some n ≥ 1 be
given. Then an n-coloring of (V, E) is a function c : V → {1, . . . , n} such that
c(v) ≠ c(w) for all (v, w) ∈ E: neighboring vertices should have different colors.
Suppose for every finite V₀ ⊆ V there is an n-coloring of (V₀, E₀), where
E₀ := E ∩ (V₀ × V₀). We claim that there exists an n-coloring of (V, E).
Proof. Take A := V × {1, . . . , n} as the set of atoms, and think of an atom (v, i)
as representing the statement that v has color i. Thus for (V, E) to have an
n-coloring means that the following set Σ ⊆ Prop(A) has a model:
Σ := {(v, 1) ∨ · · · ∨ (v, n) : v ∈ V}
  ∪ {¬((v, i) ∧ (v, j)) : v ∈ V, 1 ≤ i < j ≤ n}
  ∪ {¬((v, i) ∧ (w, i)) : (v, w) ∈ E, 1 ≤ i ≤ n}.
The assumption that all finite subgraphs of (V, E) are n-colorable yields that
every finite subset of Σ has a model. Hence by compactness Σ has a model.
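For a finite graph the set Σ is itself finite, and the encoding can be tested directly. The Python sketch below is only a brute-force illustration (not an efficient satisfiability procedure): it checks the three groups of constraints of Σ for a small graph, searches all truth assignments for a model, and reads off a coloring.

    from itertools import product

    V = [0, 1, 2, 3]
    E = {(0, 1), (1, 2), (2, 3), (3, 0)}          # a 4-cycle
    E = E | {(w, v) for (v, w) in E}              # make the edge relation symmetric
    n = 2                                          # number of colors

    atoms = [(v, i) for v in V for i in range(1, n + 1)]

    def is_model(t):
        # t : atoms -> {0,1}; check the three groups of propositions in Σ.
        every_vertex_colored = all(any(t[(v, i)] for i in range(1, n + 1)) for v in V)
        at_most_one_color = all(not (t[(v, i)] and t[(v, j)])
                                for v in V
                                for i in range(1, n + 1) for j in range(i + 1, n + 1))
        neighbors_differ = all(not (t[(v, i)] and t[(w, i)])
                               for (v, w) in E for i in range(1, n + 1))
        return every_vertex_colored and at_most_one_color and neighbors_differ

    # Brute-force search for a model of Σ, i.e. an n-coloring of (V, E).
    for values in product((0, 1), repeat=len(atoms)):
        t = dict(zip(atoms, values))
        if is_model(t):
            coloring = {v: i for (v, i) in atoms if t[(v, i)]}
            print(coloring)    # prints a proper 2-coloring of the 4-cycle
            break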
Exercises.
(1) Let (P, ≤) be a poset. Then there is a linear order ≤′ on P that extends ≤. (Hint:
use the compactness theorem and the fact that this is true when P is finite.)
(2) Suppose Σ ⊆ Prop(A) is such that for each truth assignment t : A → {0, 1} there
is p ∈ Σ with t(p) = 1. Then there are p₁, . . . , pₙ ∈ Σ such that p₁ ∨ · · · ∨ pₙ is a
tautology. (The interesting case is when A is infinite.)
2.3 Languages and Structures
Propositional Logic captures only one aspect of mathematical reasoning. We
also need the capability to deal with predicates (also called relations), variables,
and the quantifiers “for all” and “there exists.” We now begin setting up a
framework for Predicate Logic (or First-Order Logic), which has these additional
features and has a claim on being a complete logic for mathematical reasoning.
Definition. A language¹ L is a disjoint union of:
(i) a set L^r of relation symbols; each R ∈ L^r has associated arity a(R) ∈ N;
(ii) a set L^f of function symbols; each F ∈ L^f has associated arity a(F) ∈ N.
An m-ary relation or function symbol is one that has arity m. Instead of "0-ary", "1-ary", "2-ary" we say "nullary", "unary", "binary". A constant symbol is a function symbol of arity 0.
Examples.
(1) The language L_Gr = {1, ⁻¹, ·} of groups has constant symbol 1, unary function symbol ⁻¹, and binary function symbol ·.
(2) The language L_Ab = {0, −, +} of (additive) abelian groups has constant symbol 0, unary function symbol −, and binary function symbol +.
(3) The language L_O = {<} has just one binary relation symbol <.
(4) The language L_OAb = {<, 0, −, +} of ordered abelian groups.
(5) The language L_Rig = {0, 1, +, ·} of rigs (or semirings) has constant symbols 0 and 1, and binary function symbols + and ·.
(6) The language L_Ri = {0, 1, −, +, ·} of rings. The symbols are those of the previous example, plus the unary function symbol −.
From now on, let L denote a language.
Definition. A structure 𝒜 for L (or L-structure) is a triple

  ( A; (R^𝒜)_{R ∈ L^r}, (F^𝒜)_{F ∈ L^f} )

consisting of:
(i) a nonempty set A, the underlying set of 𝒜;²
(ii) for each m-ary R ∈ L^r a set R^𝒜 ⊆ A^m (an m-ary relation on A), the interpretation of R in 𝒜;
(iii) for each n-ary F ∈ L^f an operation F^𝒜 : A^n → A (an n-ary operation on A), the interpretation of F in 𝒜.

¹ What we call here a language is also known as a signature, or a vocabulary.
² It is also called the universe of 𝒜; we prefer less grandiose terminology.
Remark. The interpretation of a constant symbol c of L is a function c^𝒜 : A^0 → A. Since A^0 has just one element, c^𝒜 is uniquely determined by its value at this element; we shall identify c^𝒜 with this value, so c^𝒜 ∈ A.
Given an L-structure 𝒜, the relations R^𝒜 on A (for R ∈ L^r), and operations F^𝒜 on A (for F ∈ L^f), are called the primitives of 𝒜. When 𝒜 is clear from context we often omit the superscript 𝒜 in denoting the interpretation of a symbol of L in 𝒜. The reader is supposed to keep in mind the distinction between symbols of L and their interpretation in an L-structure, even if we use the same notation for both.
Examples.
(1) Each group is considered as an L_Gr-structure by interpreting the symbols 1, ⁻¹, and · as the identity element of the group, its group inverse, and its group multiplication, respectively.
(2) Let 𝒜 = (A; 0, −, +) be an abelian group; here 0 ∈ A is the zero element of the group, and − : A → A and + : A^2 → A denote the group operations of 𝒜. We consider 𝒜 as an L_Ab-structure by taking as interpretations of the symbols 0, − and + of L_Ab the group operations 0, − and + on A. (We took here the liberty of using the same notation for possibly entirely different things: + is an element of the set L_Ab, but also denotes in this context its interpretation as a binary operation on the set A. Similarly with 0 and −.) In fact, any set A in which we single out an element, a unary operation on A, and a binary operation on A, can be construed as an L_Ab-structure if we choose to do so.
(3) (N; <) is an L_O-structure where we interpret < as the usual ordering relation on N. Similarly for (Z; <), (Q; <) and (R; <). (Here we take even more notational liberties, by letting < denote five different things: a symbol of L_O, and the usual orderings of N, Z, Q, and R respectively.) Again, any nonempty set A equipped with a binary relation on it can be viewed as an L_O-structure.
(4) (Z; <, 0, −, +) and (Q; <, 0, −, +) are both L_OAb-structures.
(5) (N; 0, 1, +, ·) is an L_Rig-structure.
(6) (Z; 0, 1, −, +, ·) is an L_Ri-structure.
Let ℬ be an L-structure with underlying set B, and let A be a nonempty subset of B such that F^ℬ(A^n) ⊆ A for every n-ary function symbol F of L. Then A is the underlying set of an L-structure 𝒜 defined by letting

  F^𝒜 := F^ℬ|_{A^n} : A^n → A,  for n-ary F ∈ L^f,
  R^𝒜 := R^ℬ ∩ A^m,  for m-ary R ∈ L^r.
Definition. Such an L-structure 𝒜 is said to be a substructure of ℬ, notation: 𝒜 ⊆ ℬ. We also say in this case that ℬ is an extension of 𝒜, or extends 𝒜.
Examples.
(1) (Z; 0, 1, −, +, ·) ⊆ (Q; 0, 1, −, +, ·) ⊆ (R; 0, 1, −, +, ·)
(2) (N; <, 0, 1, +, ·) ⊆ (Z; <, 0, 1, +, ·)
Definition. Let 𝒜 = (A; . . .) and ℬ = (B; . . .) be L-structures.
A homomorphism h : 𝒜 → ℬ is a map h : A → B such that
(i) for each m-ary R ∈ L^r and each (a_1, . . . , a_m) ∈ A^m we have
  (a_1, . . . , a_m) ∈ R^𝒜 =⇒ (ha_1, . . . , ha_m) ∈ R^ℬ;
(ii) for each n-ary F ∈ L^f and each (a_1, . . . , a_n) ∈ A^n we have
  h(F^𝒜(a_1, . . . , a_n)) = F^ℬ(ha_1, . . . , ha_n).
Replacing =⇒ in (i) by ⇐⇒ yields the notion of a strong homomorphism. An embedding is an injective strong homomorphism; an isomorphism is a bijective strong homomorphism. An automorphism of 𝒜 is an isomorphism 𝒜 → 𝒜.
If 𝒜 ⊆ ℬ, then the inclusion a ↦ a : A → B is an embedding 𝒜 → ℬ. Conversely, a homomorphism h : 𝒜 → ℬ yields a substructure h(𝒜) of ℬ with underlying set h(A), and if h is an embedding we have an isomorphism a ↦ h(a) : 𝒜 → h(𝒜).
If i : 𝒜 → ℬ and j : ℬ → 𝒞 are homomorphisms (strong homomorphisms, embeddings, isomorphisms, respectively), then so is j ∘ i : 𝒜 → 𝒞. The identity map 1_A on A is an automorphism of 𝒜. If i : 𝒜 → ℬ is an isomorphism, then so is the map i⁻¹ : ℬ → 𝒜. Thus the automorphisms of 𝒜 form a group Aut(𝒜) under composition, with identity 1_A.
Examples.
1. Let 𝒜 = (Z; 0, −, +). Then k ↦ −k is an automorphism of 𝒜.
2. Let 𝒜 = (Z; <). The map k ↦ k + 1 is an automorphism of 𝒜 with inverse given by k ↦ k − 1.
If 𝒜 and ℬ are groups (viewed as structures for the language L_Gr), then a homomorphism h : 𝒜 → ℬ is exactly what in algebra is called a homomorphism from the group 𝒜 to the group ℬ. Likewise with rings, and other kinds of algebraic structures.
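For finite structures, conditions (i) and (ii) in the definition of a homomorphism can be checked mechanically. The following sketch is our own illustration (the dictionary-based representation of a structure and the name is_homomorphism are assumptions, not notation from the notes); it tests whether a map between two finite L_Ab-structures is a homomorphism, optionally in the strong sense.

```python
from itertools import product

def is_homomorphism(A, B, h, strong=False):
    """A, B: dicts with 'set', 'rels' (name -> set of tuples), 'funs' (name -> dict: arg tuple -> value).
    h: dict mapping A['set'] into B['set'].  Arities are read off the relations (a simplification)."""
    for name, RA in A['rels'].items():
        RB = B['rels'][name]
        arity = len(next(iter(RA))) if RA else 0
        for a in product(A['set'], repeat=arity):
            image = tuple(h[x] for x in a)
            if (a in RA and image not in RB) or (strong and a not in RA and image in RB):
                return False
    for name, FA in A['funs'].items():
        FB = B['funs'][name]
        for args, val in FA.items():
            if h[val] != FB[tuple(h[x] for x in args)]:   # h(F^A(a)) = F^B(h a)
                return False
    return True

# (Z/4Z; 0, -, +) -> (Z/2Z; 0, -, +), k -> k mod 2, is a homomorphism of L_Ab-structures.
Z4 = {'set': range(4), 'rels': {},
      'funs': {'0': {(): 0}, '-': {(k,): (-k) % 4 for k in range(4)},
               '+': {(a, b): (a + b) % 4 for a in range(4) for b in range(4)}}}
Z2 = {'set': range(2), 'rels': {},
      'funs': {'0': {(): 0}, '-': {(k,): (-k) % 2 for k in range(2)},
               '+': {(a, b): (a + b) % 2 for a in range(2) for b in range(2)}}}
print(is_homomorphism(Z4, Z2, {k: k % 2 for k in range(4)}))   # True
```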
A congruence on the L-structure 𝒜 is an equivalence relation ∼ on its underlying set A such that
(i) if R ∈ L^r is m-ary and a_1 ∼ b_1, . . . , a_m ∼ b_m, then
  (a_1, . . . , a_m) ∈ R^𝒜 ⇐⇒ (b_1, . . . , b_m) ∈ R^𝒜;
(ii) if F ∈ L^f is n-ary and a_1 ∼ b_1, . . . , a_n ∼ b_n, then
  F^𝒜(a_1, . . . , a_n) ∼ F^𝒜(b_1, . . . , b_n).
Note that a strong homomorphism h : 𝒜 → ℬ yields a congruence ∼_h on 𝒜 as follows: for a_1, a_2 ∈ A we put

  a_1 ∼_h a_2 ⇐⇒ h(a_1) = h(a_2).
Given a congruence ∼ on the L-structure 𝒜 we obtain an L-structure 𝒜/∼ (the quotient of 𝒜 by ∼) as follows (below, a∼ denotes the equivalence class of a ∈ A):
(i) the underlying set of 𝒜/∼ is the quotient set A/∼;
(ii) the interpretation of an m-ary R ∈ L^r in 𝒜/∼ is the m-ary relation
  {(a_1∼, . . . , a_m∼) : (a_1, . . . , a_m) ∈ R^𝒜}
on A/∼;
(iii) the interpretation of an n-ary F ∈ L^f in 𝒜/∼ is the n-ary operation
  (a_1∼, . . . , a_n∼) ↦ F^𝒜(a_1, . . . , a_n)∼
on A/∼.
Note that then we have a strong homomorphism a ↦ a∼ : 𝒜 → 𝒜/∼.
Products. Let (ℬ_i)_{i∈I} be a family of L-structures, ℬ_i = (B_i; . . .) for i ∈ I. The product

  ∏_{i∈I} ℬ_i

is defined to be the L-structure ℬ whose underlying set is the product set ∏_{i∈I} B_i, and where the basic relations and functions are defined coordinatewise: for m-ary R ∈ L^r and elements b_1 = (b_{1i}), . . . , b_m = (b_{mi}) ∈ ∏_{i∈I} B_i,

  (b_1, . . . , b_m) ∈ R^ℬ ⇐⇒ (b_{1i}, . . . , b_{mi}) ∈ R^{ℬ_i} for all i ∈ I,

and for n-ary F ∈ L^f and b_1 = (b_{1i}), . . . , b_n = (b_{ni}) ∈ ∏_{i∈I} B_i,

  F^ℬ(b_1, . . . , b_n) := ( F^{ℬ_i}(b_{1i}, . . . , b_{ni}) )_{i∈I}.
For j ∈ I the projection map to the jth factor is the homomorphism

  ∏_{i∈I} ℬ_i → ℬ_j,  (b_i) ↦ b_j.
This product construction makes it possible to combine several homomorphisms with a common domain into a single one: if for each i ∈ I we have a homomorphism h_i : 𝒜 → ℬ_i, we obtain a homomorphism

  h = (h_i) : 𝒜 → ∏_{i∈I} ℬ_i,  h(a) := (h_i(a))_{i∈I}.
Exercises.
(1) Let G be a group viewed as a structure for the language of groups. Each normal subgroup N of G yields a congruence ≡_N on G by
  a ≡_N b ⇐⇒ aN = bN,
and each congruence on G equals ≡_N for a unique normal subgroup N of G.
(2) Consider a strong homomorphism h : 𝒜 → ℬ of L-structures. Then we have an isomorphism from 𝒜/∼_h onto h(𝒜) given by a∼_h ↦ h(a).
2.4 Variables and Terms
Throughout this course

  Var = {v_0, v_1, v_2, . . . }

is a countably infinite set of symbols whose elements will be called variables; we assume that v_m ≠ v_n for m ≠ n, and that no variable is a function or relation symbol in any language. We let x, y, z (sometimes with subscripts or superscripts) denote variables, unless indicated otherwise.
Remark. Chapters 2–4 go through if we take as our set Var of variables any
infinite (possibly uncountable) set; in model theory this can even be convenient.
For this more general Var we still insist that no variable is a function or relation
symbol in any language. In the few cases in chapters 2–4 that this more general
set-up requires changes in proofs, this will be pointed out.
The results in Chapter 5 on undecidability presuppose a numbering of the variables; our Var = {v_0, v_1, v_2, . . . } comes equipped with such a numbering.
Definition. An L-term is a word on the alphabet L^f ∪ Var obtained as follows:
(i) each variable (viewed as a word of length 1) is an L-term;
(ii) whenever F ∈ L^f is n-ary and t_1, . . . , t_n are L-terms, then the concatenation Ft_1 . . . t_n is an L-term.
Note: constant symbols of L are L-terms of length 1, by clause (ii) for n = 0. The L-terms are the admissible words on the alphabet L^f ∪ Var where each variable has arity 0. Thus "unique readability" is available.
We often write t(x_1, . . . , x_n) to indicate an L-term t in which no variables other than x_1, . . . , x_n occur. Whenever we use this notation we assume tacitly that x_1, . . . , x_n are distinct. Note that we do not require that each of x_1, . . . , x_n actually occurs in t(x_1, . . . , x_n). (This is like indicating a polynomial in the indeterminates x_1, . . . , x_n by p(x_1, . . . , x_n), where one allows that some of these indeterminates do not actually occur in the polynomial p.)
If a term is written as an admissible word, then it may be hard to see how
it is built up from subterms. In practice we shall therefore use parentheses
and brackets in denoting terms, and avoid prefix notation if tradition dictates
otherwise.
Example. The word ·+x−yz is an L_Ri-term. For easier reading we indicate this term instead by (x + (−y)) · z or even (x − y)z.
Definition. Let 𝒜 be an L-structure and t = t(x) an L-term where x = (x_1, . . . , x_m). Then we associate to the ordered pair (t, x) a function t^𝒜 : A^m → A as follows:
(i) If t is the variable x_i, then t^𝒜(a) = a_i for a = (a_1, . . . , a_m) ∈ A^m.
(ii) If t = Ft_1 . . . t_n where F ∈ L^f is n-ary and t_1, . . . , t_n are L-terms, then t^𝒜(a) = F^𝒜(t_1^𝒜(a), . . . , t_n^𝒜(a)) for a ∈ A^m.
This inductive definition is justified by unique readability. Note that if ℬ is a second L-structure and 𝒜 ⊆ ℬ, then t^𝒜(a) = t^ℬ(a) for t as above and a ∈ A^m.
Example. Consider R as a ring in the usual way, and let t(x, y, z) be the L_Ri-term (x − y)z. Then the function t^R : R^3 → R is given by t^R(a, b, c) = (a − b)c.
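The inductive clauses (i) and (ii) defining t^𝒜 are exactly a recursive evaluator. The sketch below is our own illustration (the nested-tuple representation of terms and the names eval_term and ring_R are assumptions made for this example); it evaluates the term (x − y)z of the example above in the ring of real numbers.

```python
def eval_term(t, funs, a):
    """t: ('var', i) or (F, t1, ..., tn); funs: function symbol -> Python function;
    a: tuple of values for the variables x_1, ..., x_m (index i-1 holds a_i)."""
    if t[0] == 'var':
        return a[t[1] - 1]                                        # clause (i): x_i evaluates to a_i
    F, args = t[0], t[1:]
    return funs[F](*(eval_term(s, funs, a) for s in args))        # clause (ii)

# The L_Ri-term (x - y)z, i.e. the prefix word  . + x - y z
t = ('*', ('+', ('var', 1), ('-', ('var', 2))), ('var', 3))
ring_R = {'+': lambda u, v: u + v, '-': lambda u: -u, '*': lambda u, v: u * v,
          '0': lambda: 0, '1': lambda: 1}
print(eval_term(t, ring_R, (5, 2, 3)))   # (5 - 2) * 3 = 9
```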
A term is said to be variable-free if no variables occur in it. Let t be a variable-free L-term and 𝒜 an L-structure. Then the above gives a nullary function t^𝒜 : A^0 → A, identified as usual with its value at the unique element of A^0, so t^𝒜 ∈ A. In other words, if t is a constant symbol c, then t^𝒜 = c^𝒜 ∈ A, where c^𝒜 is as in the previous section, and if t = Ft_1 . . . t_n with n-ary F ∈ L^f and variable-free L-terms t_1, . . . , t_n, then t^𝒜 = F^𝒜(t_1^𝒜, . . . , t_n^𝒜).
Definition. Let t be an L-term, let x_1, . . . , x_n be distinct variables, and let τ_1, . . . , τ_n be L-terms. Then t(τ_1/x_1, . . . , τ_n/x_n) is the word obtained by replacing all occurrences of x_i in t by τ_i, simultaneously for i = 1, . . . , n. If t is given in the form t(x_1, . . . , x_n), then we write t(τ_1, . . . , τ_n) as a shorthand for t(τ_1/x_1, . . . , τ_n/x_n).
The easy proof of the next lemma is left to the reader.
Lemma 2.4.1. Suppose t is an L-term, x_1, . . . , x_n are distinct variables, and τ_1, . . . , τ_n are L-terms. Then t(τ_1/x_1, . . . , τ_n/x_n) is an L-term. If τ_1, . . . , τ_n are variable-free and t = t(x_1, . . . , x_n), then t(τ_1, . . . , τ_n) is variable-free.
We urge the reader to do exercise (1) below and thus acquire the confidence that
these formal term substitutions do correspond to actual function substitutions.
In the definition of t(τ_1/x_1, . . . , τ_n/x_n) the "replacing" should be simultaneous, because it can happen that for t′ := t(τ_1/x_1) we have t′(τ_2/x_2) ≠ t(τ_1/x_1, τ_2/x_2). (Here t, τ_1, τ_2 are L-terms and x_1, x_2 are distinct variables.)
Generators. Let ℬ be an L-structure, let G ⊆ B, and assume also that L has a constant symbol or that G ≠ ∅. Then the set

  {t^ℬ(g_1, . . . , g_m) : t(x_1, . . . , x_m) is an L-term and g_1, . . . , g_m ∈ G} ⊆ B

is the underlying set of some 𝒜 ⊆ ℬ, and this 𝒜 is clearly a substructure of any 𝒜′ ⊆ ℬ with G ⊆ A′. We call this 𝒜 the substructure of ℬ generated by G; if 𝒜 = ℬ, then we say that ℬ is generated by G. If (a_i)_{i∈I} is a family of elements of B, then "generated by (a_i)" means "generated by G" where G = {a_i : i ∈ I}.
Exercises.
(1) Let t(x_1, . . . , x_m) and τ_1(y_1, . . . , y_n), . . . , τ_m(y_1, . . . , y_n) be L-terms. Then the L-term t′(y_1, . . . , y_n) := t(τ_1(y_1, . . . , y_n), . . . , τ_m(y_1, . . . , y_n)) has the property that if 𝒜 is an L-structure and a = (a_1, . . . , a_n) ∈ A^n, then
  (t′)^𝒜(a) = t^𝒜(τ_1^𝒜(a), . . . , τ_m^𝒜(a)).
(2) For every L_Ab-term t(x_1, . . . , x_n) there are integers k_1, . . . , k_n such that for every abelian group 𝒜 = (A; 0, −, +),
  t^𝒜(a_1, . . . , a_n) = k_1 a_1 + · · · + k_n a_n, for all (a_1, . . . , a_n) ∈ A^n.
Conversely, for any integers k_1, . . . , k_n there is an L_Ab-term t(x_1, . . . , x_n) such that in every abelian group 𝒜 = (A; 0, −, +) the above displayed identity holds.
(3) For every L_Ri-term t(x_1, . . . , x_n) there is a polynomial
  P(x_1, . . . , x_n) ∈ Z[x_1, . . . , x_n]
such that for every commutative ring ℛ = (R; 0, 1, −, +, ·),
  t^ℛ(r_1, . . . , r_n) = P(r_1, . . . , r_n), for all (r_1, . . . , r_n) ∈ R^n.
Conversely, for any polynomial P(x_1, . . . , x_n) ∈ Z[x_1, . . . , x_n] there is an L_Ri-term t(x_1, . . . , x_n) such that in every commutative ring ℛ = (R; 0, 1, −, +, ·) the above displayed identity holds.
(4) Let 𝒜 and ℬ be L-structures, h : 𝒜 → ℬ a homomorphism, and t = t(x_1, . . . , x_n) an L-term. Then
  h(t^𝒜(a_1, . . . , a_n)) = t^ℬ(ha_1, . . . , ha_n), for all (a_1, . . . , a_n) ∈ A^n.
(If 𝒜 ⊆ ℬ and h : 𝒜 → ℬ is the inclusion, this gives t^𝒜(a_1, . . . , a_n) = t^ℬ(a_1, . . . , a_n) for all (a_1, . . . , a_n) ∈ A^n.)
(5) Consider the L-structure 𝒜 = (N; 0, 1, +, ·) where L = L_Rig.
(a) Is there an L-term t(x) such that t^𝒜(0) = 1 and t^𝒜(1) = 0?
(b) Is there an L-term t(x) such that t^𝒜(n) = 2^n for all n ∈ N?
(c) Find all the substructures of 𝒜.
2.5 Formulas and Sentences
Besides variables we also introduce the eight distinct logical symbols
  ⊤  ⊥  ¬  ∨  ∧  =  ∃  ∀
The first five of these we already met when discussing propositional logic. None
of these eight symbols is a variable, or a function or relation symbol of any
language. Below L denotes a language. To distinguish the logical symbols from
those in L, the latter are often referred to as the non-logical symbols.
Definition. The atomic L-formulas are the following words on the alphabet L ∪ Var ∪ {⊤, ⊥, =}:
(i) ⊤ and ⊥,
(ii) Rt_1 . . . t_m, where R ∈ L^r is m-ary and t_1, . . . , t_m are L-terms,
(iii) = t_1 t_2, where t_1 and t_2 are L-terms.
The L-formulas are the words on the larger alphabet

  L ∪ Var ∪ {⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀}

obtained as follows:
(i) every atomic L-formula is an L-formula;
(ii) if ϕ, ψ are L-formulas, then so are ¬ϕ, ∨ϕψ and ∧ϕψ;
(iii) if ϕ is an L-formula and x is a variable, then ∃xϕ and ∀xϕ are L-formulas.
Note that all L-formulas are admissible words on the alphabet

  L ∪ Var ∪ {⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀},

where =, ∃ and ∀ are given arity 2 and the other symbols have the arities assigned to them earlier. This fact makes the results on unique readability applicable to L-formulas. (However, not all admissible words on this alphabet are L-formulas: the word ∃xx is admissible but not an L-formula.)
The notational conventions introduced in the section on propositional logic go through, with the role of propositions there taken over by formulas here. (For example, given L-formulas ϕ and ψ we shall write ϕ ∨ ψ to indicate ∨ϕψ, and ϕ → ψ to indicate ¬ϕ ∨ ψ.)
The reader should distinguish between different ways of using the symbol =.
Sometimes it denotes one of the eight formal logical symbols, but we also use it
to indicate equality of mathematical objects in the way we have done already
many times. The context should always make it clear what our intention is in
this respect without having to spell it out. To increase readability we usually
write an atomic formula = t_1 t_2 as t_1 = t_2, and its negation ¬ = t_1 t_2 as t_1 ≠ t_2, where t_1, t_2 are L-terms. The logical symbol = is treated just as a binary relation symbol, but its interpretation in a structure will always be the equality relation on its underlying set. This will become clear later.
Definition. Let ϕ be a formula of L. Written as a word on the alphabet above we have ϕ = s_1 . . . s_m. A subformula of ϕ is a subword of the form s_i . . . s_k, where 1 ≤ i ≤ k ≤ m, which also happens to be a formula of L.
An occurrence of a variable x in ϕ at the j-th place (that is, s_j = x) is said to be a bound occurrence if ϕ has a subformula s_i s_{i+1} . . . s_k with i ≤ j ≤ k that is of the form ∃xψ or ∀xψ. If an occurrence is not bound, then it is said to be a free occurrence.
At this point the reader is invited to do the first exercise at the end of this
section, which gives another useful characterization of subformulas.
Example. In the formula (∃x(x = y)) ∧ x = 0, where x and y are distinct, the first two occurrences of x are bound, the third is free, and the only occurrence of y is free. (Note: the formula is actually the string ∧∃x = xy = x0, and the occurrences of x and y are really the occurrences in this string.)
Definition. A sentence is a formula in which all occurrences of variables are
bound occurrences.
We write ϕ(x_1, . . . , x_n) to indicate a formula ϕ such that all variables that occur free in ϕ are among x_1, . . . , x_n. In using this notation it is understood that x_1, . . . , x_n are distinct variables, but it is not required that each of x_1, . . . , x_n occurs free in ϕ. (This is like indicating a polynomial in the indeterminates x_1, . . . , x_n by p(x_1, . . . , x_n), where one allows that some of these indeterminates do not actually occur in p.)
Definition. Let ϕ be an L-formula, let x_1, . . . , x_n be distinct variables, and let t_1, . . . , t_n be L-terms. Then ϕ(t_1/x_1, . . . , t_n/x_n) is the word obtained by replacing all the free occurrences of x_i in ϕ by t_i, simultaneously for i = 1, . . . , n. If ϕ is given in the form ϕ(x_1, . . . , x_n), then we write ϕ(t_1, . . . , t_n) as a shorthand for ϕ(t_1/x_1, . . . , t_n/x_n).
We have the following lemma whose routine proof is left to the reader.
Lemma 2.5.1. Suppose ϕ is an L-formula, x_1, . . . , x_n are distinct variables, and t_1, . . . , t_n are L-terms. Then ϕ(t_1/x_1, . . . , t_n/x_n) is an L-formula. If t_1, . . . , t_n are variable-free and ϕ = ϕ(x_1, . . . , x_n), then ϕ(t_1, . . . , t_n) is an L-sentence.
In the definition of ϕ(t_1/x_1, . . . , t_n/x_n) the "replacing" should be simultaneous, because it can happen that ϕ(t_1/x_1)(t_2/x_2) ≠ ϕ(t_1/x_1, t_2/x_2).
Let 𝒜 be an L-structure with underlying set A, and let C ⊆ A. We extend L to a language L_C by adding a constant symbol c for each c ∈ C, called the name of c. These names are symbols not in L. We make 𝒜 into an L_C-structure by keeping the same underlying set and interpretations of symbols of L, and by interpreting each name c as the element c ∈ C. The L_C-structure thus obtained is indicated by 𝒜_C. Hence for each variable-free L_C-term t we have a corresponding element t^{𝒜_C} of A, which for simplicity of notation we denote instead by t^𝒜. All this applies in particular to the case C = A, where in L_A we have a name a for each a ∈ A.
Definition. We can now define what it means for an L_A-sentence σ to be true in the L-structure 𝒜 (notation: 𝒜 ⊨ σ, also read as 𝒜 satisfies σ or σ holds in 𝒜, or σ is valid in 𝒜). First we consider atomic L_A-sentences:
(i) 𝒜 ⊨ ⊤, and 𝒜 ⊭ ⊥;
(ii) 𝒜 ⊨ Rt_1 . . . t_m if and only if (t_1^𝒜, . . . , t_m^𝒜) ∈ R^𝒜, for m-ary R ∈ L^r and variable-free L_A-terms t_1, . . . , t_m;
(iii) 𝒜 ⊨ t_1 = t_2 if and only if t_1^𝒜 = t_2^𝒜, for variable-free L_A-terms t_1, t_2.
We extend the definition inductively to arbitrary L_A-sentences as follows:
(i) Suppose σ = ¬σ_1. Then 𝒜 ⊨ σ if and only if 𝒜 ⊭ σ_1.
(ii) Suppose σ = σ_1 ∨ σ_2. Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ σ_1 or 𝒜 ⊨ σ_2.
(iii) Suppose σ = σ_1 ∧ σ_2. Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ σ_1 and 𝒜 ⊨ σ_2.
(iv) Suppose σ = ∃xϕ(x). Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ ϕ(a) for some a ∈ A.
(v) Suppose σ = ∀xϕ(x). Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ ϕ(a) for all a ∈ A.
Even if we just want to define 𝒜 ⊨ σ for L-sentences σ, one can see that if σ has the form ∃xϕ(x) or ∀xϕ(x), the inductive definition above forces us to consider L_A-sentences ϕ(a). This is why we introduced names. We didn't say so explicitly, but "inductive" refers here to induction with respect to the number of logical symbols in σ. For example, the fact that ϕ(a) has fewer logical symbols than ∃xϕ(x) is crucial for the above to count as a definition.
It is easy to check that for an L_A-sentence σ = ∃x_1 . . . ∃x_n ϕ(x_1, . . . , x_n),

  𝒜 ⊨ σ ⇐⇒ 𝒜 ⊨ ϕ(a_1, . . . , a_n) for some (a_1, . . . , a_n) ∈ A^n,

and that for an L_A-sentence σ = ∀x_1 . . . ∀x_n ϕ(x_1, . . . , x_n),

  𝒜 ⊨ σ ⇐⇒ 𝒜 ⊨ ϕ(a_1, . . . , a_n) for all (a_1, . . . , a_n) ∈ A^n.
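Over a finite structure the truth definition can be executed directly: the quantifier clauses become loops over A, and names of elements are just the elements themselves. The following sketch is our own illustration, under the assumption that formulas and terms are encoded as nested tuples; it is not the notes' formalism, only a rendering of clauses (i)–(v) and the atomic cases in Python.

```python
def holds(A, phi):
    """A: dict with 'set', 'rels' (name -> set of tuples), 'funs' (name -> Python function).
    phi: ('top',), ('bot',), ('rel', R, t1, ...), ('eq', t1, t2), ('not', p), ('or', p, q),
    ('and', p, q), ('exists', x, p), ('forall', x, p).
    Terms: ('name', a), ('var', x), or (F, t1, ...); free variables are replaced by names via subst."""
    def ev(t):
        if t[0] == 'name':
            return t[1]
        return A['funs'][t[0]](*(ev(s) for s in t[1:]))
    def subst(p, x, a):                      # replace free occurrences of x by the name of a
        if p[0] in ('exists', 'forall') and p[1] == x:
            return p                         # x is bound here: leave untouched
        if p[0] == 'var' and p[1] == x:
            return ('name', a)
        return tuple(subst(q, x, a) if isinstance(q, tuple) else q for q in p)
    op = phi[0]
    if op == 'top':    return True
    if op == 'bot':    return False
    if op == 'rel':    return tuple(ev(t) for t in phi[2:]) in A['rels'][phi[1]]
    if op == 'eq':     return ev(phi[1]) == ev(phi[2])
    if op == 'not':    return not holds(A, phi[1])
    if op == 'or':     return holds(A, phi[1]) or holds(A, phi[2])
    if op == 'and':    return holds(A, phi[1]) and holds(A, phi[2])
    if op == 'exists': return any(holds(A, subst(phi[2], phi[1], a)) for a in A['set'])
    if op == 'forall': return all(holds(A, subst(phi[2], phi[1], a)) for a in A['set'])

# In (Z/5Z; 0, +), every element has an additive inverse: forall x exists y (x + y = 0).
Z5 = {'set': range(5), 'rels': {}, 'funs': {'+': lambda a, b: (a + b) % 5, '0': lambda: 0}}
sigma = ('forall', 'x', ('exists', 'y', ('eq', ('+', ('var', 'x'), ('var', 'y')), ('0',))))
print(holds(Z5, sigma))   # True
```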
Definition. Given an L_A-formula ϕ(x_1, . . . , x_n) we let ϕ^𝒜 be the following subset of A^n:

  ϕ^𝒜 = {(a_1, . . . , a_n) : 𝒜 ⊨ ϕ(a_1, . . . , a_n)}.

The formula ϕ(x_1, . . . , x_n) is said to define the set ϕ^𝒜 in 𝒜. A set S ⊆ A^n is said to be definable in 𝒜 if S = ϕ^𝒜 for some L_A-formula ϕ(x_1, . . . , x_n). If moreover ϕ can be chosen to be an L-formula, then S is said to be 0-definable in 𝒜.
Examples.
(1) The set {r ∈ R : r < √2} is 0-definable in (R; <, 0, 1, +, −, ·): it is defined by the formula (x^2 < 1 + 1) ∨ (x < 0). (Here x^2 abbreviates the term x · x.)
(2) The set {r ∈ R : r < π} is definable in (R; <, 0, 1, +, −, ·): it is defined by the formula x < π.
We now single out formulas by certain syntactical conditions. These conditions
have semantic counterparts in terms of the behaviour of these formulas under
various kinds of homomorphisms, as shown in some exercises below.
An L-formula is said to be quantifier-free if it has no occurrences of ∃ and
no occurrences of ∀. An L-formula is said to be existential if it has the form
∃x_1 . . . ∃x_m ϕ with distinct x_1, . . . , x_m and a quantifier-free L-formula ϕ. An L-formula is said to be universal if it has the form ∀x_1 . . . ∀x_m ϕ with distinct x_1, . . . , x_m and a quantifier-free L-formula ϕ. An L-formula is said to be positive if it has no occurrences of ¬ (but it can have occurrences of ⊥).
Exercises.
(1) Let ϕ and ψ be L-formulas; put sf(ϕ) := set of subformulas of ϕ.
(a) If ϕ is atomic, then sf(ϕ) = {ϕ}.
(b) sf(¬ϕ) = {¬ϕ} ∪ sf(ϕ).
(c) sf(ϕ ∨ ψ) = {ϕ ∨ ψ} ∪ sf(ϕ) ∪ sf(ψ), and sf(ϕ ∧ ψ) = {ϕ ∧ ψ} ∪ sf(ϕ) ∪ sf(ψ).
(d) sf(∃xϕ) = {∃xϕ} ∪ sf(ϕ), and sf(∀xϕ) = {∀xϕ} ∪ sf(ϕ).
(2) If t(x_1, . . . , x_n) is an L_A-term and a_1, . . . , a_n ∈ A, then
  t(a_1, . . . , a_n)^𝒜 = t^𝒜(a_1, . . . , a_n).
(3) Suppose that S_1 ⊆ A^n and S_2 ⊆ A^n are defined in 𝒜 by the L_A-formulas ϕ_1(x_1, . . . , x_n) and ϕ_2(x_1, . . . , x_n) respectively. Then:
(a) S_1 ∪ S_2 is defined in 𝒜 by (ϕ_1 ∨ ϕ_2)(x_1, . . . , x_n).
(b) S_1 ∩ S_2 is defined in 𝒜 by (ϕ_1 ∧ ϕ_2)(x_1, . . . , x_n).
(c) A^n \ S_1 is defined in 𝒜 by ¬ϕ_1(x_1, . . . , x_n).
(d) S_1 ⊆ S_2 ⇐⇒ 𝒜 ⊨ ∀x_1 . . . ∀x_n(ϕ_1 → ϕ_2).
(4) Let π : A^{m+n} → A^m be the projection map given by
  π(a_1, . . . , a_{m+n}) = (a_1, . . . , a_m),
and for S ⊆ A^{m+n} and a ∈ A^m, put
  S(a) := {b ∈ A^n : (a, b) ∈ S}  (a section of S).
Suppose that S ⊆ A^{m+n} is defined in 𝒜 by the L_A-formula ϕ(x, y) where x = (x_1, . . . , x_m) and y = (y_1, . . . , y_n). Then ∃y_1 . . . ∃y_n ϕ(x, y) defines in 𝒜 the subset π(S) of A^m, and ∀y_1 . . . ∀y_n ϕ(x, y) defines in 𝒜 the set
  {a ∈ A^m : S(a) = A^n}.
(5) The following sets are 0-definable in the corresponding structures:
(a) The ordering relation {(m, n) ∈ N^2 : m < n} in (N; 0, +).
(b) The set {2, 3, 5, 7, . . .} of prime numbers in the semiring A = (N; 0, 1, +, ·).
(c) The set {2^n : n ∈ N} in the semiring A.
(d) The set {a ∈ R : f is continuous at a} in (R; <, f), where f : R → R is any function.
(6) Let the symbols of L be a binary relation symbol < and a unary relation symbol U. Then there is an L-sentence σ such that for all X ⊆ R we have
  (R; <, X) ⊨ σ ⇐⇒ X is finite.
(7) Let 𝒜 ⊆ ℬ. Then we consider L_A to be a sublanguage of L_B in such a way that each a ∈ A has the same name in L_A as in L_B. This convention is in force throughout these notes.
(a) For each variable-free L_A-term t we have t^𝒜 = t^ℬ.
(b) If the L_A-sentence σ is quantifier-free, then 𝒜 ⊨ σ ⇔ ℬ ⊨ σ.
(c) If σ is an existential L_A-sentence, then 𝒜 ⊨ σ ⇒ ℬ ⊨ σ.
(d) If σ is a universal L_A-sentence, then ℬ ⊨ σ ⇒ 𝒜 ⊨ σ.
(8) Suppose h : 𝒜 → ℬ is a homomorphism of L-structures. For each L_A-term t, let t^h be the L_B-term obtained from t by replacing each occurrence of a name a of an element a ∈ A by the name ha of the corresponding element ha ∈ B. Similarly, for each L_A-formula ϕ, let ϕ^h be the L_B-formula obtained from ϕ by replacing each occurrence of a name a of an element a ∈ A by the name ha of the corresponding element ha ∈ B. Note that if ϕ is a sentence, so is ϕ^h. Then:
(a) if t is a variable-free L_A-term, then h(t^𝒜) = (t^h)^ℬ;
(b) if σ is a positive L_A-sentence without ∀-symbol, then 𝒜 ⊨ σ ⇒ ℬ ⊨ σ^h;
(c) if σ is a positive L_A-sentence and h is surjective, then 𝒜 ⊨ σ ⇒ ℬ ⊨ σ^h;
(d) if σ is an L_A-sentence and h is an isomorphism, then 𝒜 ⊨ σ ⇔ ℬ ⊨ σ^h.
In particular, isomorphic L-structures satisfy exactly the same L-sentences.
2.6 Models
In the rest of this chapter L is a language, 𝒜 is an L-structure (with underlying set A), and, unless indicated otherwise, t is an L-term, ϕ, ψ, and θ are L-formulas, σ is an L-sentence, and Σ is a set of L-sentences. We drop the prefix L in "L-term" and "L-formula" and so on, unless this would cause confusion.
Definition. We say that 𝒜 is a model of Σ, or Σ holds in 𝒜 (denoted 𝒜 ⊨ Σ), if 𝒜 ⊨ σ for each σ ∈ Σ.
To discuss examples it is convenient to introduce some notation. Suppose L contains (at least) the constant symbol 0 and the binary function symbol +. Given any terms t_1, . . . , t_n we define the term t_1 + · · · + t_n inductively as follows: it is the term 0 if n = 0, the term t_1 if n = 1, and the term (t_1 + · · · + t_{n−1}) + t_n for n > 1. We write nt for the term t + · · · + t with n summands; in particular, 0t and 1t denote the terms 0 and t respectively. Suppose L contains the constant symbol 1 and the binary function symbol · (the multiplication sign). Then we have similar notational conventions for t_1 · · · t_n and t^n; in particular, for n = 0 both stand for the term 1, and t^1 is just t.
Examples. Fix three distinct variables x, y, z.
(1) Groups are the L_Gr-structures that are models of
  Gr := {∀x(x · 1 = x ∧ 1 · x = x), ∀x(x · x⁻¹ = 1 ∧ x⁻¹ · x = 1), ∀x∀y∀z((x · y) · z = x · (y · z))}
(2) Abelian groups are the L_Ab-structures that are models of
  Ab := {∀x(x + 0 = x), ∀x(x + (−x) = 0), ∀x∀y(x + y = y + x), ∀x∀y∀z((x + y) + z = x + (y + z))}
(3) Torsion-free abelian groups are the L_Ab-structures that are models of
  Ab ∪ {∀x(nx = 0 → x = 0) : n = 1, 2, 3, . . .}
(4) Rings are the L_Ri-structures that are models of
  Ri := Ab ∪ {∀x∀y∀z((x · y) · z = x · (y · z)), ∀x(x · 1 = x ∧ 1 · x = x), ∀x∀y∀z(x · (y + z) = x · y + x · z ∧ (x + y) · z = x · z + y · z)}
(5) Fields are the L_Ri-structures that are models of
  Fl := Ri ∪ {∀x∀y(x · y = y · x), 1 ≠ 0, ∀x(x ≠ 0 → ∃y(x · y = 1))}
(6) Fields of characteristic 0 are the L_Ri-structures that are models of
  Fl(0) := Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, 7, 11, . . .}
(7) Given a prime number p, fields of characteristic p are the L_Ri-structures that are models of Fl(p) := Fl ∪ {p1 = 0}.
(8) Algebraically closed fields are the L_Ri-structures that are models of
  ACF := Fl ∪ {∀u_1 . . . ∀u_n ∃x(x^n + u_1 x^{n−1} + . . . + u_n = 0) : n = 2, 3, 4, 5, . . .}
Here u_1, u_2, u_3, . . . is some fixed infinite sequence of distinct variables, distinct also from x, and u_i x^{n−i} abbreviates u_i · x^{n−i}, for i = 1, . . . , n.
(9) Algebraically closed fields of characteristic 0 are the L_Ri-structures that are models of ACF(0) := ACF ∪ {n1 ≠ 0 : n = 2, 3, 5, 7, 11, . . .}.
(10) Given a prime number p, algebraically closed fields of characteristic p are the L_Ri-structures that are models of ACF(p) := ACF ∪ {p1 = 0}.
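For a finite structure, being a model of a finite axiom set such as Ab can be checked by brute force, since each universal quantifier ranges over finitely many elements. The sketch below is ours (the helper name is_abelian_group is an assumption, not from the notes); it checks the four Ab axioms for Z/6Z.

```python
from itertools import product

def is_abelian_group(A, zero, neg, add):
    """Check the four Ab axioms, letting each universal quantifier range over the finite set A."""
    return (all(add(x, zero) == x for x in A)                              # forall x (x + 0 = x)
            and all(add(x, neg(x)) == zero for x in A)                     # forall x (x + (-x) = 0)
            and all(add(x, y) == add(y, x) for x, y in product(A, A))      # commutativity
            and all(add(add(x, y), z) == add(x, add(y, z))
                    for x, y, z in product(A, A, A)))                      # associativity

n = 6
print(is_abelian_group(range(n), 0, lambda x: (-x) % n, lambda x, y: (x + y) % n))  # True
```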
Definition. We say that σ is a logical consequence of Σ (written Σ ⊨ σ) if σ is true in every model of Σ.
Example. It is well-known that in any ring R we have x · 0 = 0 for all x ∈ R. This can now be expressed as Ri ⊨ ∀x(x · 0 = 0).
We defined what it means for a sentence σ to hold in a given structure 𝒜. We now extend this to arbitrary formulas.
First define an 𝒜-instance of a formula ϕ = ϕ(x_1, . . . , x_m) to be an L_A-sentence of the form ϕ(a_1, . . . , a_m) with a_1, . . . , a_m ∈ A. Of course ϕ can also be written as ϕ(y_1, . . . , y_n) for another sequence of variables y_1, . . . , y_n; for example, y_1, . . . , y_n could be obtained by permuting x_1, . . . , x_m, or it could be x_1, . . . , x_m, x_{m+1}, obtained by adding a variable x_{m+1}. Thus for the above to count as a definition of "𝒜-instance," the reader should check that these different ways of specifying variables (including at least the variables occurring free in ϕ) give the same 𝒜-instances.
Definition. A formula ϕ is said to be valid in 𝒜 (notation: 𝒜 ⊨ ϕ) if all its 𝒜-instances are true in 𝒜.
The reader should check that if ϕ = ϕ(x_1, . . . , x_m), then

  𝒜 ⊨ ϕ ⇐⇒ 𝒜 ⊨ ∀x_1 . . . ∀x_m ϕ.
We also extend the notion of “logical consequence of Σ” to formulas.
Definition. We say that ϕ is a logical consequence of Σ (notation: Σ ⊨ ϕ) if 𝒜 ⊨ ϕ for all models 𝒜 of Σ.
One should not confuse the notion of “logical consequence of Σ” with that of
“provable from Σ.” We shall give a definition of provable from Σ in the next
section. The two notions will turn out to be equivalent, but that is hardly
obvious from their definitions: we shall need much of the next chapter to prove
this equivalence, which is called the Completeness Theorem for Predicate Logic.
We finish this section with two basic facts:
Lemma 2.6.1. Let α(x_1, . . . , x_m) be an L_A-term, and recall that α defines a map α^𝒜 : A^m → A. Let t_1, . . . , t_m be variable-free L_A-terms, with t_i^𝒜 = a_i ∈ A for i = 1, . . . , m. Then α(t_1, . . . , t_m) is a variable-free L_A-term, and

  α(t_1, . . . , t_m)^𝒜 = α(a_1, . . . , a_m)^𝒜 = α^𝒜(t_1^𝒜, . . . , t_m^𝒜).
This follows by a straightforward induction on α.
Lemma 2.6.2. Let t_1, . . . , t_m be variable-free L_A-terms with t_i^𝒜 = a_i ∈ A for i = 1, . . . , m. Let ϕ(x_1, . . . , x_m) be an L_A-formula. Then the L_A-formula ϕ(t_1, . . . , t_m) is a sentence and

  𝒜 ⊨ ϕ(t_1, . . . , t_m) ⇐⇒ 𝒜 ⊨ ϕ(a_1, . . . , a_m).
Proof. To keep notations simple we give the proof only for m = 1, with t = t_1 and x = x_1. We proceed by induction on the number of logical symbols in ϕ(x).
Suppose that ϕ is atomic. The case where ϕ is ⊤ or ⊥ is obvious. Assume ϕ is Rα_1 . . . α_m where R ∈ L^r is m-ary and α_1(x), . . . , α_m(x) are L_A-terms. Then ϕ(t) = Rα_1(t) . . . α_m(t) and ϕ(a) = Rα_1(a) . . . α_m(a). We have 𝒜 ⊨ ϕ(t) iff (α_1(t)^𝒜, . . . , α_m(t)^𝒜) ∈ R^𝒜, and also 𝒜 ⊨ ϕ(a) iff (α_1(a)^𝒜, . . . , α_m(a)^𝒜) ∈ R^𝒜. As α_i(t)^𝒜 = α_i(a)^𝒜 for all i by the previous lemma, we have 𝒜 ⊨ ϕ(t) iff 𝒜 ⊨ ϕ(a). The case that ϕ(x) is α(x) = β(x) is handled the same way.
It is also clear that the desired property is inherited by disjunctions, conjunctions and negations of formulas ϕ(x) that have the property. Suppose now that ϕ(x) = ∃y ψ.
Case y ≠ x: Then ψ = ψ(x, y), ϕ(t) = ∃yψ(t, y) and ϕ(a) = ∃yψ(a, y). As ϕ(t) = ∃yψ(t, y), we have 𝒜 ⊨ ϕ(t) iff 𝒜 ⊨ ψ(t, b) for some b ∈ A. By the inductive hypothesis the latter is equivalent to 𝒜 ⊨ ψ(a, b) for some b ∈ A, hence equivalent to 𝒜 ⊨ ∃yψ(a, y). As ϕ(a) = ∃yψ(a, y), we conclude that 𝒜 ⊨ ϕ(t) iff 𝒜 ⊨ ϕ(a).
Case y = x: Then x does not occur free in ϕ(x) = ∃xψ. So ϕ(t) = ϕ(a) = ϕ is an L_A-sentence, and 𝒜 ⊨ ϕ(t) ⇔ 𝒜 ⊨ ϕ(a) is obvious.
When ϕ(x) = ∀y ψ one can proceed exactly as above by distinguishing two cases.
2.7 Logical Axioms and Rules; Formal Proofs
In this section we introduce a proof system for predicate logic and state its
completeness. We then derive as a consequence the compactness theorem and
some of its corollaries. The completeness is proved in the next chapter.
A propositional axiom of L is by definition a formula that for some ϕ, ψ, θ occurs in the list below:
1. ⊤
2. ϕ → (ϕ ∨ ψ); ϕ → (ψ ∨ ϕ)
3. ¬ϕ → (¬ψ → ¬(ϕ ∨ ψ))
4. (ϕ ∧ ψ) → ϕ; (ϕ ∧ ψ) → ψ
5. ϕ → (ψ → (ϕ ∧ ψ))
6. (ϕ → (ψ → θ)) → ((ϕ → ψ) → (ϕ → θ))
7. ϕ → (¬ϕ → ⊥)
8. (¬ϕ → ⊥) → ϕ
Each of items 2–8 is a scheme describing infinitely many axioms. Note that this list is the same as the list in Section 2.2 except that instead of propositions p, q, r we have formulas ϕ, ψ, θ.
The logical axioms of L are the propositional axioms of L and the equality
and quantifier axioms of L as defined below.
Definition. The equality axioms of L are the following formulas:
(i) x = x,
(ii) x = y → y = x,
(iii) (x = y ∧ y = z) → x = z,
(iv) (x_1 = y_1 ∧ . . . ∧ x_m = y_m ∧ Rx_1 . . . x_m) → Ry_1 . . . y_m,
(v) (x_1 = y_1 ∧ . . . ∧ x_n = y_n) → Fx_1 . . . x_n = Fy_1 . . . y_n,
with the following restrictions on the variables and symbols of L: x, y, z are distinct in (ii) and (iii); in (iv), x_1, . . . , x_m, y_1, . . . , y_m are distinct and R ∈ L^r is m-ary; in (v), x_1, . . . , x_n, y_1, . . . , y_n are distinct, and F ∈ L^f is n-ary. Note that (i) represents an axiom scheme rather than a single axiom, since different variables x give different formulas x = x. Likewise with (ii)–(v).
Let x and y be distinct variables, and let ϕ(y) be the formula ∃x(x ≠ y). Then ϕ(y) is valid in all 𝒜 with |A| > 1, but ϕ(x/y) is invalid in all 𝒜. Thus
substituting x for the free occurrences of y does not always preserve validity. To
get rid of this anomaly, we introduce the following restriction on substitutions
of a term t for free occurrences of y.
Definition. We say that t is free for y in ϕ, if no variable in t can become bound
upon replacing the free occurrences of y in ϕ by t, more precisely: whenever x
is a variable in t, then there are no occurrences of subformulas in ϕ of the form
∃xψ or ∀xψ that contain an occurrence of y that is free in ϕ.
Note that if t is variable-free, then t is free for y in ϕ. We remark that “free
for” abbreviates “free to be substituted for.” In exercise 2 the reader is asked to
show that, with this restriction, substitution of a term for the free occurrences
of a variable does preserve validity.
Definition. The quantifier axioms of L are the formulas ϕ(t/y) → ∃yϕ and
∀yϕ → ϕ(t/y) where t is free for y in ϕ.
These axioms have been chosen to have the following property.
Proposition 2.7.1. The logical axioms of L are valid in every L-structure.
We first prove this for the propositional axioms of L. Let α_1, . . . , α_n be distinct propositional atoms not in L. Let p = p(α_1, . . . , α_n) ∈ Prop{α_1, . . . , α_n}. Let ϕ_1, . . . , ϕ_n be formulas and let p(ϕ_1, . . . , ϕ_n) be the word obtained by replacing each occurrence of α_i in p by ϕ_i for i = 1, . . . , n. One checks easily that p(ϕ_1, . . . , ϕ_n) is a formula.
Lemma 2.7.2. Suppose ϕ_i = ϕ_i(x_1, . . . , x_m) for 1 ≤ i ≤ n, and let a_1, . . . , a_m ∈ A. Define a truth assignment t : {α_1, . . . , α_n} → {0, 1} by t(α_i) = 1 iff 𝒜 ⊨ ϕ_i(a_1, . . . , a_m). Then p(ϕ_1, . . . , ϕ_n) is an L-formula and

  p(ϕ_1, . . . , ϕ_n)(a_1/x_1, . . . , a_m/x_m) = p(ϕ_1(a_1, . . . , a_m), . . . , ϕ_n(a_1, . . . , a_m)),
  t(p(α_1, . . . , α_n)) = 1 ⇐⇒ 𝒜 ⊨ p(ϕ_1(a_1, . . . , a_m), . . . , ϕ_n(a_1, . . . , a_m)).

In particular, if p is a tautology, then 𝒜 ⊨ p(ϕ_1, . . . , ϕ_n).
Proof. Easy induction on p. We leave the details to the reader.
Definition. An L-tautology is a formula of the form p(ϕ_1, . . . , ϕ_n) for some tautology p(α_1, . . . , α_n) ∈ Prop{α_1, . . . , α_n} and some formulas ϕ_1, . . . , ϕ_n.
By Lemma 2.7.2 all L-tautologies are valid in all L-structures. The propositional
axioms of L are L-tautologies, so all propositional axioms of L are valid in all
L-structures. It is easy to check that all equality axioms of L are valid in all
L-structures. In exercise 3 below the reader is asked to show that all quantifier
axioms of L are valid in all L-structures. This finishes the proof of Proposition
2.7.1.
Next we introduce rules for deriving new formulas from given formulas.
Definition. The logical rules of L are the following:
(i) Modus Ponens (MP): From ϕ and ϕ → ψ, infer ψ.
(ii) Generalization Rule (G): If the variable x does not occur free in ϕ, then
(a) from ϕ → ψ, infer ϕ → ∀xψ;
(b) from ψ → ϕ, infer ∃xψ → ϕ.
A key property of the logical rules is that their application preserves validity.
Here is a more precise statement of this fact, to be verified by the reader.
(i) If 𝒜 ⊨ ϕ and 𝒜 ⊨ ϕ → ψ, then 𝒜 ⊨ ψ.
(ii) Suppose x does not occur free in ϕ. Then
(a) if 𝒜 ⊨ ϕ → ψ, then 𝒜 ⊨ ϕ → ∀xψ;
(b) if 𝒜 ⊨ ψ → ϕ, then 𝒜 ⊨ ∃xψ → ϕ.
Definition. A formal proof, or just proof, of ϕ from Σ is a sequence ϕ_1, . . . , ϕ_n of formulas with n ≥ 1 and ϕ_n = ϕ, such that for k = 1, . . . , n:
(i) either ϕ_k ∈ Σ,
(ii) or ϕ_k is a logical axiom,
(iii) or there are i, j ∈ {1, . . . , k − 1} such that ϕ_k can be inferred from ϕ_i and ϕ_j by MP, or from ϕ_i by G.
If there exists a proof of ϕ from Σ, then we write Σ ⊢ ϕ and say Σ proves ϕ.
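A proof from Σ is thus a finite sequence subject to a purely local check at each step. The following sketch is our own simplified illustration, not a precise rendering of the definition: formulas are treated as opaque strings, the logical axioms are passed in as a pre-computed set, and only the MP half of clause (iii) is implemented (the rule G is omitted).

```python
def check_proof(steps, sigma, logical_axioms):
    """steps: list of formulas (as strings); sigma, logical_axioms: sets of formulas.
    A step is accepted if it lies in sigma, is a logical axiom, or follows by MP from
    two earlier steps, i.e. some earlier step equals '(phi -> current_step)' for an earlier phi."""
    for k, phi_k in enumerate(steps):
        ok = (phi_k in sigma or phi_k in logical_axioms
              or any(earlier == f'({steps[i]} -> {phi_k})'
                     for earlier in steps[:k] for i in range(k)))
        if not ok:
            return False
    return True

# Sigma = {p, (p -> q)} proves q by one application of MP.
print(check_proof(['p', '(p -> q)', 'q'], {'p', '(p -> q)'}, set()))   # True
```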
Proposition 2.7.3. If Σ ⊢ ϕ, then Σ ⊨ ϕ.
This follows easily from earlier facts that we stated and which the reader was
asked to verify. The converse is more interesting, and due to Gödel (1930):
Theorem 2.7.4 (Completeness Theorem of Predicate Logic).
  Σ ⊢ ϕ ⇐⇒ Σ ⊨ ϕ
Remark. Our choice of proof system, and thus our notion of formal proof
is somewhat arbitrary. However, the equivalence of ⊢ and ⊨ (Completeness
Theorem) justifies our choice of logical axioms and rules and shows in particular
that no further logical axioms and rules are needed. Moreover, this equivalence
has consequences that can be stated in terms of ⊨ alone. An example is the
important Compactness Theorem.
Theorem 2.7.5 (Compactness Theorem). If Σ ⊨ σ, then there is a finite subset Σ_0 of Σ such that Σ_0 ⊨ σ.
The Compactness Theorem has many consequences. Here is one.
Corollary 2.7.6. Suppose σ is an L_Ri-sentence that holds in all fields of characteristic 0. Then there exists a natural number N such that σ is true in all fields of characteristic p > N.
Proof. By assumption,

  Fl(0) = Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, . . .} ⊨ σ.

Then by Compactness there is N ∈ N such that

  Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, . . . , n ≤ N} ⊨ σ.

It follows that σ is true in all fields of characteristic p > N.
The converse of this corollary fails, see exercise 8 below. Note that Fl(0) is
infinite. Could there be an alternative finite set of axioms whose models are
exactly the fields of characteristic 0?
Corollary 2.7.7. There is no finite set of L_Ri-sentences whose models are exactly the fields of characteristic 0.
Proof. Suppose there is such a finite set of sentences {σ_1, . . . , σ_N}. Let σ := σ_1 ∧ · · · ∧ σ_N. Then the models of σ are just the fields of characteristic 0. By the previous result σ holds in some field of characteristic p > 0. Contradiction!
Exercises. All but the last one are to be done without using Theorem 2.7.4 or 2.7.5. Actually, Exercise (4) will be used in proving Theorem 2.7.4.
(1) Let L = {R} where R is a binary relation symbol, and let 𝒜 = (A; R) be a finite L-structure (i.e. the set A is finite). Then there exists an L-sentence σ such that the models of σ are exactly the L-structures isomorphic to 𝒜. (In fact, for an arbitrary language L, two finite L-structures are isomorphic iff they satisfy the same L-sentences.)
(2) If t is free for y in ϕ and ϕ is valid in 𝒜, then ϕ(t/y) is valid in 𝒜.
(3) Suppose t is free for y in ϕ = ϕ(x_1, . . . , x_n, y). Then:
(i) Each 𝒜-instance of the quantifier axiom ϕ(t/y) → ∃yϕ has the form
  ϕ(a_1, . . . , a_n, τ) → ∃yϕ(a_1, . . . , a_n, y)
with a_1, . . . , a_n ∈ A and τ a variable-free L_A-term.
(ii) The quantifier axiom ϕ(t/y) → ∃yϕ is valid in 𝒜. (Hint: use Lemma 2.6.2.)
(iii) The quantifier axiom ∀yϕ → ϕ(t/y) is valid in 𝒜.
(4) If ϕ is an L-tautology, then ⊢ ϕ.
(5) Σ ⊢ ϕ_i for i = 1, . . . , n ⇐⇒ Σ ⊢ ϕ_1 ∧ · · · ∧ ϕ_n.
(6) If Σ ⊢ ϕ → ψ and Σ ⊢ ψ → ϕ, then Σ ⊢ ϕ ↔ ψ.
(7) ⊢ ¬∃xϕ ↔ ∀x¬ϕ and ⊢ ¬∀xϕ ↔ ∃x¬ϕ.
(8) Indicate an L_Ri-sentence that is true in the field of real numbers, but false in all fields of positive characteristic.
(9) Let σ be an L_Ab-sentence which holds in all non-trivial torsion-free abelian groups. Then there exists N ∈ N such that σ is true in all groups Z/pZ where p is a prime number and p > N.
Chapter 3
The Completeness Theorem
The main aim of this chapter is to prove the Completeness Theorem. As a
byproduct we also derive some more elementary facts about predicate logic.
The last section contains some of the basics of universal algebra, which we can
treat here rather efficiently using our construction of a so-called term-model in
the proof of the Completeness Theorem.
Conventions on the use of L, 𝒜, t, ϕ, ψ, θ, σ and Σ are as in the beginning
of Section 2.6.
3.1 Another Form of Completeness
It is convenient to prove first a variant of the Completeness Theorem.
Definition. We say that Σ is consistent if Σ ⊬ ⊥, and otherwise (that is, if Σ ⊢ ⊥), we call Σ inconsistent.
Theorem 3.1.1 (Completeness Theorem - second form).
Σ is consistent if and only if Σ has a model.
We first show that this second form of the Completeness Theorem implies the
first form. This will be done through a series of technical lemmas, which are
also useful later in this Chapter.
Lemma 3.1.2. Suppose Σ ⊢ ϕ. Then Σ ⊢ ∀xϕ.
Proof. From Σ ⊢ ϕ and the L-tautology ϕ → (¬∀xϕ → ϕ) we obtain Σ ⊢ ¬∀xϕ → ϕ by MP. Then by G we have Σ ⊢ ¬∀xϕ → ∀xϕ. Using the L-tautology (¬∀xϕ → ∀xϕ) → ∀xϕ and MP we get Σ ⊢ ∀xϕ.
Lemma 3.1.3 (Deduction Lemma). Suppose Σ ∪ {σ} ⊢ ϕ. Then Σ ⊢ σ → ϕ.
Proof. By induction on the length of a proof of ϕ from Σ ∪ {σ}.
The cases where ϕ is a logical axiom, or ϕ ∈ Σ ∪ {σ}, or ϕ is obtained by MP are treated just as in the proof of the Deduction Lemma of Propositional Logic.
Suppose that ϕ is obtained by part (a) of G, so ϕ is ϕ_1 → ∀xψ where x does not occur free in ϕ_1 and Σ ∪ {σ} ⊢ ϕ_1 → ψ, and where we assume inductively that Σ ⊢ σ → (ϕ_1 → ψ). We have to argue that then Σ ⊢ σ → (ϕ_1 → ∀xψ).
From the L-tautology (σ → (ϕ_1 → ψ)) → ((σ ∧ ϕ_1) → ψ) and MP we get Σ ⊢ (σ ∧ ϕ_1) → ψ. Since x does not occur free in σ ∧ ϕ_1, this gives Σ ⊢ (σ ∧ ϕ_1) → ∀xψ, by G. Using the L-tautology ((σ ∧ ϕ_1) → ∀xψ) → (σ → (ϕ_1 → ∀xψ)) and MP this gives Σ ⊢ σ → (ϕ_1 → ∀xψ).
The case that ϕ is obtained by part (b) of G is left to the reader.
Corollary 3.1.4. Suppose Σ ∪ {σ_1, . . . , σ_n} ⊢ ϕ. Then Σ ⊢ σ_1 ∧ . . . ∧ σ_n → ϕ.
We leave the proof as an exercise.
Corollary 3.1.5. Σ ⊢ σ if and only if Σ ∪ {¬σ} is inconsistent.
The proof is just like that of the corresponding fact of Propositional Logic.
Lemma 3.1.6. Σ ⊢ ∀yϕ if and only if Σ ⊢ ϕ.
Proof. (⇐) This is Lemma 3.1.2. For (⇒), assume Σ ⊢ ∀yϕ. We have the quantifier axiom ∀yϕ → ϕ, so by MP we get Σ ⊢ ϕ.
Corollary 3.1.7. Σ ⊢ ∀y_1 . . . ∀y_n ϕ if and only if Σ ⊢ ϕ.
Corollary 3.1.8. The second form of the Completeness Theorem implies the first form, Theorem 2.7.4.
Proof. Assume the second form of the Completeness Theorem holds, and that Σ ⊨ ϕ. It suffices to show that then Σ ⊢ ϕ. From Σ ⊨ ϕ we obtain Σ ⊨ ∀y_1 . . . ∀y_n ϕ where ϕ = ϕ(y_1, . . . , y_n), and so Σ ∪ {¬σ} has no model, where σ is the sentence ∀y_1 . . . ∀y_n ϕ. But then by the 2nd form of the Completeness Theorem Σ ∪ {¬σ} is inconsistent. Then by Corollary 3.1.5 we have Σ ⊢ σ and thus by Corollary 3.1.7 we get Σ ⊢ ϕ.
We finish this section with another form of the Compactness Theorem:
Theorem 3.1.9 (Compactness Theorem - second form).
If each finite subset of Σ has a model, then Σ has a model.
This follows from the second form of the Completeness Theorem.
3.2 Proof of the Completeness Theorem
We are now going to prove Theorem 3.1.1. Since (⇐) is clear, we focus our
attention on (⇒), that is, given a consistent set of sentences Σ we must show
that Σ has a model. This job will be done in a series of lemmas. Unless we say
so, we do not assume in those lemmas that Σ is consistent.
Lemma 3.2.1. Suppose Σ ⊢ ϕ and t is free for x in ϕ. Then Σ ⊢ ϕ(t/x).
Proof. From Σ ⊢ ϕ we get Σ ⊢ ∀xϕ by Lemma 3.1.2. Then MP together with the quantifier axiom ∀xϕ → ϕ(t/x) gives Σ ⊢ ϕ(t/x) as required.
Lemma 3.2.2. Suppose Σ ⊢ ϕ, and t_1, . . . , t_n are terms whose variables do not occur bound in ϕ. Then Σ ⊢ ϕ(t_1/x_1, . . . , t_n/x_n).
Proof. Take distinct variables y_1, . . . , y_n that do not occur in ϕ or t_1, . . . , t_n and that are distinct from x_1, . . . , x_n. Use Lemma 3.2.1 n times in succession to obtain Σ ⊢ ψ where ψ = ϕ(y_1/x_1, . . . , y_n/x_n). Apply Lemma 3.2.1 again n times to get Σ ⊢ ψ(t_1/y_1, . . . , t_n/y_n). To finish, observe that ψ(t_1/y_1, . . . , t_n/y_n) = ϕ(t_1/x_1, . . . , t_n/x_n).
Lemma 3.2.3.
(1) For each L-term t we have ⊢ t = t.
(2) Let t, t′ be L-terms and Σ ⊢ t = t′. Then Σ ⊢ t′ = t.
(3) Let t_1, t_2, t_3 be L-terms and Σ ⊢ t_1 = t_2 and Σ ⊢ t_2 = t_3. Then Σ ⊢ t_1 = t_3.
(4) Let R ∈ L^r be m-ary and let t_1, t_1′, . . . , t_m, t_m′ be L-terms such that Σ ⊢ t_i = t_i′ for i = 1, . . . , m and Σ ⊢ Rt_1 . . . t_m. Then Σ ⊢ Rt_1′ . . . t_m′.
(5) Let F ∈ L^f be n-ary, and let t_1, t_1′, . . . , t_n, t_n′ be L-terms such that Σ ⊢ t_i = t_i′ for i = 1, . . . , n. Then Σ ⊢ Ft_1 . . . t_n = Ft_1′ . . . t_n′.
Proof. For (1), use the equality axiom x = x and apply Lemma 3.2.2. For (2), take an equality axiom x = y → y = x and apply Lemma 3.2.2 to get ⊢ t = t′ → t′ = t. Then MP gives Σ ⊢ t′ = t.
For (3), take an equality axiom (x = y ∧ y = z) → (x = z) and apply Lemma 3.2.2 to get ⊢ (t_1 = t_2 ∧ t_2 = t_3) → t_1 = t_3. By the assumptions and Exercise 5 in Section 2.7 we have Σ ⊢ t_1 = t_2 ∧ t_2 = t_3. Then MP yields Σ ⊢ t_1 = t_3.
To prove (4) we first use the same Exercise 5 to get

  Σ ⊢ t_1 = t_1′ ∧ . . . ∧ t_m = t_m′ ∧ Rt_1 . . . t_m.

Take an equality axiom (x_1 = y_1 ∧ . . . ∧ x_m = y_m ∧ Rx_1 . . . x_m) → Ry_1 . . . y_m and apply Lemma 3.2.2 to obtain

  Σ ⊢ (t_1 = t_1′ ∧ . . . ∧ t_m = t_m′ ∧ Rt_1 . . . t_m) → Rt_1′ . . . t_m′.

Then MP gives Σ ⊢ Rt_1′ . . . t_m′. Part (5) is obtained in a similar way by using an equality axiom (x_1 = y_1 ∧ . . . ∧ x_n = y_n) → Fx_1 . . . x_n = Fy_1 . . . y_n.
Definition. Let T_L be the set of variable-free L-terms. We define a binary relation ∼_Σ on T_L by

  t_1 ∼_Σ t_2 ⇐⇒ Σ ⊢ t_1 = t_2.
Parts (1), (2) and (3) of the last lemma yield the following.
Lemma 3.2.4. The relation ∼_Σ is an equivalence relation on T_L.
Definition. Suppose L has at least one constant symbol. Then T_L is non-empty. We define the L-structure 𝒜_Σ as follows:
(i) Its underlying set is A_Σ := T_L/∼_Σ. Let [t] denote the equivalence class of t ∈ T_L with respect to ∼_Σ.
(ii) If R ∈ L^r is m-ary, then R^{𝒜_Σ} ⊆ A_Σ^m is given by
  ([t_1], . . . , [t_m]) ∈ R^{𝒜_Σ} :⇐⇒ Σ ⊢ Rt_1 . . . t_m   (t_1, . . . , t_m ∈ T_L).
(iii) If F ∈ L^f is n-ary, then F^{𝒜_Σ} : A_Σ^n → A_Σ is given by
  F^{𝒜_Σ}([t_1], . . . , [t_n]) = [Ft_1 . . . t_n]   (t_1, . . . , t_n ∈ T_L).
Remark. The reader should verify that this counts as a definition, i.e., does
not introduce ambiguity. (Use parts (4) and (5) of Lemma 3.2.3.)
Corollary 3.2.5. Suppose L has a constant symbol, and Σ is consistent. Then
(1) for each t ∈ T_L we have t^{𝒜_Σ} = [t];
(2) for each atomic σ we have: Σ ⊢ σ ⇐⇒ 𝒜_Σ ⊨ σ.
Proof. Part (1) follows by an easy induction. Let σ be Rt_1 . . . t_m where R ∈ L^r is m-ary and t_1, . . . , t_m ∈ T_L. Then

  Σ ⊢ Rt_1 . . . t_m ⇔ ([t_1], . . . , [t_m]) ∈ R^{𝒜_Σ} ⇔ 𝒜_Σ ⊨ Rt_1 . . . t_m,

where the last "⇔" follows from the definition of ⊨ together with part (1). Now suppose that σ is t_1 = t_2 where t_1, t_2 ∈ T_L. Then

  Σ ⊢ t_1 = t_2 ⇔ [t_1] = [t_2] ⇔ t_1^{𝒜_Σ} = t_2^{𝒜_Σ} ⇔ 𝒜_Σ ⊨ t_1 = t_2.

We also have Σ ⊢ ⊤ ⇔ 𝒜_Σ ⊨ ⊤. So far we haven't used the assumption that Σ is consistent, but now we do. The consistency of Σ means that Σ ⊬ ⊥. We also have 𝒜_Σ ⊭ ⊥ by definition of ⊨. Thus Σ ⊢ ⊥ ⇔ 𝒜_Σ ⊨ ⊥.
If the equivalence in part (2) of this corollary holds for all σ (not only for atomic σ), then 𝒜_Σ ⊨ Σ, so we would have found a model of Σ, and be done. But clearly this equivalence can only hold for all σ if Σ has the property that for each σ, either Σ ⊢ σ or Σ ⊢ ¬σ. This property is of interest for other reasons as well, and deserves a name:
Definition. We say that Σ is complete if Σ is consistent, and for each σ either Σ ⊢ σ or Σ ⊢ ¬σ.
Example. Let L = L_Ab, Σ := Ab (the set of axioms for abelian groups), and σ the sentence ∃x(x ≠ 0). Then Σ ⊬ σ since the trivial group doesn't satisfy σ. Also Σ ⊬ ¬σ, since there are non-trivial abelian groups and σ holds in such groups. Thus Σ is not complete.
Completeness is a strong property and it can be hard to show that a given
set of axioms is complete. The set of axioms for algebraically closed fields of
characteristic 0 is complete (see the end of Section 4.3).
A key fact about completeness needed in this chapter is that any consistent
set of sentences extends to a complete set of sentences:
Lemma 3.2.6 (Lindenbaum). Suppose Σ is consistent. Then Σ ⊆ Σ′ for some complete set of L-sentences Σ′.
The proof uses Zorn's Lemma, and is just like that of the corresponding fact of Propositional Logic in Section 2.2.
Completeness of Σ does not guarantee that the equivalence of part (2) of
Corollary 3.2.5 holds for all σ. Completeness is only a necessary condition
for this equivalence to hold for all σ; another necessary condition is “to have
witnesses”:
Definition. A Σ-witness for the sentence ∃xϕ(x) is a term t ∈ T_L such that Σ ⊢ ϕ(t). We say that Σ has witnesses if there is a Σ-witness for every sentence ∃xϕ(x) proved by Σ.
Theorem 3.2.7. Let L have a constant symbol, and suppose Σ is consistent. Then the following two conditions are equivalent:
(i) For each σ we have: Σ ⊢ σ ⇔ 𝒜_Σ ⊨ σ.
(ii) Σ is complete and has witnesses.
In particular, if Σ is complete and has witnesses, then 𝒜_Σ is a model of Σ.
Proof. It should be clear that (i) implies (ii). For the converse, assume (ii). We use induction on the number of logical symbols in σ to obtain (i). We already know that (i) holds for atomic sentences. The cases that σ = ¬σ_1, σ = σ_1 ∨ σ_2, and σ = σ_1 ∧ σ_2 are treated just as in the proof of the corresponding Lemma 2.2.11 for Propositional Logic. It remains to consider two cases:
Case σ = ∃xϕ(x):
(⇒) Suppose that Σ ⊢ σ. Because we are assuming that Σ has witnesses, we have a t ∈ T_L such that Σ ⊢ ϕ(t). Then by the inductive hypothesis 𝒜_Σ ⊨ ϕ(t). So by Lemma 2.6.2 we have an a ∈ A_Σ such that 𝒜_Σ ⊨ ϕ(a). Therefore 𝒜_Σ ⊨ ∃xϕ(x), hence 𝒜_Σ ⊨ σ.
(⇐) Assume 𝒜_Σ ⊨ σ. Then there is an a ∈ A_Σ such that 𝒜_Σ ⊨ ϕ(a). Choose t ∈ T_L such that [t] = a. Then t^{𝒜_Σ} = a, hence 𝒜_Σ ⊨ ϕ(t) by Lemma 2.6.2. Applying the inductive hypothesis we get Σ ⊢ ϕ(t). This yields Σ ⊢ ∃xϕ(x) by MP and the quantifier axiom ϕ(t) → ∃xϕ(x).
Case σ = ∀xϕ(x): This is similar to the previous case, but we also need the result from Exercise 7 in Section 2.7 that ⊢ ∀xϕ ↔ ¬∃x¬ϕ.
We call attention to some new notation in the next lemmas: the symbol ⊢_L is used to emphasize that we are dealing with formal provability within L.
Lemma 3.2.8. Let Σ be a set of L-sentences, c a constant symbol not in L, and L_c := L ∪ {c}. Let ϕ(y) be an L-formula and suppose Σ ⊢_{L_c} ϕ(c). Then Σ ⊢_L ϕ(y).
Proof. (Sketch) Take a proof of ϕ(c) from Σ in the language L_c, and take a variable z different from all variables occurring in that proof, and also such that z ≠ y. Replace in every formula in this proof each occurrence of c by z. Check that one obtains in this way a proof of ϕ(z/y) in the language L from Σ. So Σ ⊢_L ϕ(z/y), and hence by Lemma 3.2.1 we have Σ ⊢_L ϕ(z/y)(y/z), that is, Σ ⊢_L ϕ(y).
Lemma 3.2.9. Assume Σ is consistent and Σ ⊢ ∃yϕ(y). Let c be a constant symbol not in L. Put L_c := L ∪ {c}. Then Σ ∪ {ϕ(c)} is a consistent set of L_c-sentences.
Proof. Suppose not. Then Σ ∪ {ϕ(c)} ⊢_{L_c} ⊥. By the Deduction Lemma (3.1.3) Σ ⊢_{L_c} ϕ(c) → ⊥. Then by Lemma 3.2.8 we have Σ ⊢_L ϕ(y) → ⊥. By G we have Σ ⊢_L ∃yϕ(y) → ⊥. Applying MP yields Σ ⊢ ⊥, contradicting the consistency of Σ.
Lemma 3.2.10. Suppose Σ is consistent. Let σ_1 = ∃x_1 ϕ_1(x_1), . . . , σ_n = ∃x_n ϕ_n(x_n) be such that Σ ⊢ σ_i for every i = 1, . . . , n. Let c_1, . . . , c_n be distinct constant symbols not in L. Put L′ := L ∪ {c_1, . . . , c_n} and Σ′ = Σ ∪ {ϕ_1(c_1), . . . , ϕ_n(c_n)}. Then Σ′ is a consistent set of L′-sentences.
Proof. The previous lemma covers the case n = 1. The general case follows by induction on n.
In the next lemma we use a superscript "w" for "witness."
Lemma 3.2.11. Suppose Σ is consistent. For each L-sentence σ = ∃xϕ(x) such that Σ ⊢ σ, let c_σ be a constant symbol not in L such that if σ′ is a different L-sentence of the form ∃x′ϕ′(x′) provable from Σ, then c_σ ≠ c_{σ′}. Put

  L^w := L ∪ {c_σ : σ = ∃xϕ(x) is an L-sentence such that Σ ⊢ σ},
  Σ^w := Σ ∪ {ϕ(c_σ) : σ = ∃xϕ(x) is an L-sentence such that Σ ⊢ σ}.

Then Σ^w is a consistent set of L^w-sentences.
Proof. Suppose not. Then Σ^w ⊢ ⊥. Take a proof of ⊥ from Σ^w and let c_{σ_1}, . . . , c_{σ_n} be constant symbols in L^w \ L such that this proof is a proof of ⊥ in the language L ∪ {c_{σ_1}, . . . , c_{σ_n}} from Σ ∪ {ϕ_1(c_{σ_1}), . . . , ϕ_n(c_{σ_n})}, where σ_i = ∃x_i ϕ_i(x_i) for 1 ≤ i ≤ n. So Σ ∪ {ϕ_1(c_{σ_1}), . . . , ϕ_n(c_{σ_n})} is an inconsistent set of L ∪ {c_{σ_1}, . . . , c_{σ_n}}-sentences. This contradicts Lemma 3.2.10.
Lemma 3.2.12. Let (L_n) be an increasing sequence of languages: L_0 ⊆ L_1 ⊆ L_2 ⊆ . . ., and denote their union by L_∞. Let Σ_n be a consistent set of L_n-sentences, for each n, such that Σ_0 ⊆ Σ_1 ⊆ Σ_2 ⊆ . . .. Then the union Σ_∞ := ∪_n Σ_n is a consistent set of L_∞-sentences.
Proof. Suppose that Σ_∞ ⊢ ⊥. Take a proof of ⊥ from Σ_∞. Then we can choose n so large that this is actually a proof of ⊥ from Σ_n in L_n. This contradicts the consistency of Σ_n.
Suppose the language L′ extends L, let 𝒜 be an L-structure, and let 𝒜′ be an L′-structure. Then 𝒜 is said to be a reduct of 𝒜′ (and 𝒜′ an expansion of 𝒜) if 𝒜 and 𝒜′ have the same underlying set and the same interpretations of the symbols of L. For example, (N; 0, +) is a reduct of (N; <, 0, 1, +, ·). Note that any L′-structure 𝒜′ has a unique reduct to an L-structure, which we indicate by 𝒜′|_L. A key fact (to be verified by the reader) is that if 𝒜 is a reduct of 𝒜′, then t^𝒜 = t^{𝒜′} for all variable-free L_A-terms t, and

  𝒜 ⊨ σ ⇐⇒ 𝒜′ ⊨ σ

for all L_A-sentences σ.
We can now prove Theorem 3.1.1.
Proof. Let Σ be a consistent set of L-sentences. We construct a sequence (L_n)
of languages and a sequence (Σ_n) where each Σ_n is a consistent set of L_n-
sentences. We begin by setting L_0 = L and Σ_0 = Σ. Given the language L_n
and the consistent set of L_n-sentences Σ_n, put
L_{n+1} := L_n if n is even, L_n^w if n is odd,
choose a complete set of L_n-sentences Σ′_n ⊇ Σ_n, and put
Σ_{n+1} := Σ′_n if n is even, Σ_n^w if n is odd.
Here L_n^w and Σ_n^w are obtained from L_n and Σ_n in the same way that L^w and
Σ^w are obtained from L and Σ in Lemma 3.2.11. Note that L_n ⊆ L_{n+1}, and
Σ_n ⊆ Σ_{n+1}.
By the previous lemma the set Σ_∞ of L_∞-sentences is consistent. It is also
complete. To see this, let σ be an L_∞-sentence. Take n even and so large that σ
is an L_n-sentence. Then Σ_{n+1} ⊢ σ or Σ_{n+1} ⊢ ¬σ, and thus Σ_∞ ⊢ σ or Σ_∞ ⊢ ¬σ.
We claim that Σ_∞ has witnesses. To see this, let σ = ∃xϕ(x) be an L_∞-
sentence such that Σ_∞ ⊢ σ. Now take n to be odd and so large that σ is
an L_n-sentence and Σ_n ⊢ σ. Then by construction of Σ_{n+1} = Σ_n^w we have
Σ_{n+1} ⊢ ϕ(c_σ), so Σ_∞ ⊢ ϕ(c_σ).
It follows from Theorem 3.2.7 that Σ_∞ has a model, namely 𝒜_{Σ_∞}. Put
𝒜 := 𝒜_{Σ_∞}|_L. Then 𝒜 ⊨ Σ. This concludes the proof of the Completeness
Theorem (second form).
Exercises.
(1) Suppose Σ is consistent. Then Σ is complete if and only if every two models of
Σ satisfy the same sentences.
(2) Let L have just a constant symbol c, a unary relation symbol U and a unary
function symbol f, and suppose that Σ ⊢ Ufc, and that f does not occur in the
sentences of Σ. Then Σ ⊢ ∀xUx.
3.3 Some Elementary Results of Predicate Logic
Here we obtain some generalities of predicate logic: Equivalence and Equality
Theorems, Variants, and Prenex Form. In some proofs we shall take advantage
of the fact that the Completeness Theorem is now available.
Lemma 3.3.1 (Distribution Rule).
(i) Suppose Σ ⊢ ϕ → ψ. Then Σ ⊢ ∃xϕ → ∃xψ and Σ ⊢ ∀xϕ → ∀xψ.
(ii) Suppose Σ ⊢ ϕ ↔ ψ. Then Σ ⊢ ∃xϕ ↔ ∃xψ and Σ ⊢ ∀xϕ ↔ ∀xψ.
Proof. We only do (i), since (ii) then follows easily. Let 𝒜 be a model of Σ. By
the Completeness Theorem it suffices to show that then 𝒜 ⊨ ∃xϕ → ∃xψ and
𝒜 ⊨ ∀xϕ → ∀xψ. We shall prove 𝒜 ⊨ ∃xϕ → ∃xψ and leave the other part
to the reader. We have 𝒜 ⊨ ϕ → ψ. Choose variables y_1, …, y_n such that
ϕ = ϕ(x, y_1, …, y_n) and ψ = ψ(x, y_1, …, y_n). We need only show that then
for all a_1, …, a_n ∈ A
𝒜 ⊨ ∃xϕ(x, a_1, …, a_n) → ∃xψ(x, a_1, …, a_n).
Suppose 𝒜 ⊨ ∃xϕ(x, a_1, …, a_n). Then 𝒜 ⊨ ϕ(a_0, a_1, …, a_n) for some a_0 ∈ A.
From 𝒜 ⊨ ϕ → ψ we obtain 𝒜 ⊨ ϕ(a_0, …, a_n) → ψ(a_0, …, a_n), hence 𝒜 ⊨
ψ(a_0, …, a_n), and thus 𝒜 ⊨ ∃xψ(x, a_1, …, a_n).
Theorem 3.3.2 (Equivalence Theorem). Suppose Σ ⊢ ϕ ↔ ϕ′, and let ψ′ be
obtained from ψ by replacing some occurrence of ϕ as a subformula of ψ by ϕ′.
Then ψ′ is again a formula and Σ ⊢ ψ ↔ ψ′.
Proof. By induction on the number of logical symbols in ψ. If ψ is atomic, then
necessarily ψ = ϕ and ψ′ = ϕ′ and the desired result holds trivially.
Suppose that ψ = ¬θ. Then either ψ = ϕ and ψ′ = ϕ′, and the desired
result holds trivially, or the occurrence of ϕ we are replacing is an occurrence
in θ. Then the inductive hypothesis gives Σ ⊢ θ ↔ θ′, where θ′ is obtained by
replacing that occurrence (of ϕ) by ϕ′. Then ψ′ = ¬θ′ and the desired result
follows easily. The cases ψ = ψ_1 ∨ ψ_2 and ψ = ψ_1 ∧ ψ_2 are left as exercises.
Suppose that ψ = ∃xθ. The case ψ = ϕ (and thus ψ′ = ϕ′) is trivial. Suppose
ψ ≠ ϕ. Then the occurrence of ϕ we are replacing is an occurrence inside θ.
So by inductive hypothesis we have Σ ⊢ θ ↔ θ′. Then by the distribution rule
Σ ⊢ ∃xθ ↔ ∃xθ′. The proof is similar if ψ = ∀xθ.
Definition. We say ϕ_1 and ϕ_2 are Σ-equivalent if Σ ⊢ ϕ_1 ↔ ϕ_2. (In case Σ = ∅,
we just say equivalent.) One verifies easily that Σ-equivalence is an equivalence
relation on the set of L-formulas.
Given a family (ϕ_i)_{i∈I} of formulas with finite index set I we choose a bijection
k ↦ i(k) : {1, …, n} → I and set
⋁_{i∈I} ϕ_i := ϕ_{i(1)} ∨ ··· ∨ ϕ_{i(n)},  ⋀_{i∈I} ϕ_i := ϕ_{i(1)} ∧ ··· ∧ ϕ_{i(n)}.
If I is clear from context we just write ⋁_i ϕ_i and ⋀_i ϕ_i instead. Of course, these
notations ⋁_{i∈I} ϕ_i and ⋀_{i∈I} ϕ_i can only be used when the particular choice of
bijection of {1, …, n} with I does not matter; this is usually the case because
the equivalence class of ϕ_{i(1)} ∨ ··· ∨ ϕ_{i(n)} does not depend on this choice, and
the same is true with "∧" instead of "∨".
Definition. A variant of a formula is obtained by successive replacements of
the following type:
(i) replace an occurrence of a subformula ∃xϕ by ∃yϕ(y/x);
(ii) replace an occurrence of a subformula ∀xϕ by ∀yϕ(y/x).
where y is free for x in ϕ and y does not occur free in ϕ.
Lemma 3.3.3. A formula is equivalent to any of its variants.
Proof. By the Equivalence Theorem (3.3.2) it suffices to show ⊢ ∃xϕ ↔ ∃yϕ(y/x)
and ⊢ ∀xϕ ↔ ∀yϕ(y/x) where y is free for x in ϕ and does not occur free in ϕ.
We prove the first equivalence, leaving the second as an exercise. Applying G
to the quantifier axiom ϕ(y/x) → ∃xϕ gives ⊢ ∃yϕ(y/x) → ∃xϕ. Similarly we
get ⊢ ∃xϕ → ∃yϕ(y/x) (use that ϕ = ϕ(y/x)(x/y) by the assumption on y).
An application of Exercise 6 of Section 2.7 finishes the proof.
Definition. A formula in prenex form is a formula Q_1x_1 … Q_nx_n ϕ where
x_1, …, x_n are distinct variables, each Q_i ∈ {∃, ∀} and ϕ is quantifier-free. We
call Q_1x_1 … Q_nx_n the prefix, and ϕ the matrix of the formula. Note that a
quantifier-free formula is in prenex form; this is the case n = 0.
We leave the proof of the next lemma as an exercise. Instead of "occurrence of
... as a subformula" we say "part ...". In this lemma Q denotes a quantifier,
that is, Q ∈ {∃, ∀}, and Q′ denotes the other quantifier: ∃′ = ∀ and ∀′ = ∃.
Lemma 3.3.4. The following prenex transformations always change a formula
into an equivalent formula:
(1) replace the formula by one of its variants;
(2) replace a part ¬Qxψ by Q′x¬ψ;
(3) replace a part (Qxψ) ∨ θ by Qx(ψ ∨ θ) where x is not free in θ;
(4) replace a part ψ ∨ Qxθ by Qx(ψ ∨ θ) where x is not free in ψ;
(5) replace a part (Qxψ) ∧ θ by Qx(ψ ∧ θ) where x is not free in θ;
(6) replace a part ψ ∧ Qxθ by Qx(ψ ∧ θ) where x is not free in ψ.
Remark. Note that the free variables of a formula (those that occur free in the
formula) do not change under prenex transformations.
Theorem 3.3.5 (Prenex Form). Every formula can be changed into one in
prenex form by a finite sequence of prenex transformations. In particular, each
formula is equivalent to one in prenex form.
Proof. By induction on the number of logical symbols. Atomic formulas are
already in prenex form. To simplify notation, write ϕ ⟹_pr ψ to indicate
that ψ can be obtained from ϕ by a finite sequence of prenex transformations.
Assume inductively that
ϕ_1 ⟹_pr Q_1x_1 … Q_mx_m ψ_1,  ϕ_2 ⟹_pr Q_{m+1}y_1 … Q_{m+n}y_n ψ_2,
where Q_1, …, Q_m, …, Q_{m+n} ∈ {∃, ∀}, x_1, …, x_m are distinct, y_1, …, y_n are
distinct, and ψ_1 and ψ_2 are quantifier-free.
Then for ϕ := ¬ϕ_1, we have
ϕ ⟹_pr ¬Q_1x_1 … Q_mx_m ψ_1.
Applying m prenex transformations of type (2) we get
¬Q_1x_1 … Q_mx_m ψ_1 ⟹_pr Q′_1x_1 … Q′_mx_m ¬ψ_1,
hence ϕ ⟹_pr Q′_1x_1 … Q′_mx_m ¬ψ_1.
Next, let ϕ := ϕ_1 ∨ ϕ_2. The assumptions above yield
ϕ ⟹_pr (Q_1x_1 … Q_mx_m ψ_1) ∨ (Q_{m+1}y_1 … Q_{m+n}y_n ψ_2).
Replacing here the RHS (righthand side) by a variant we may assume that
{x_1, …, x_m} ∩ {y_1, …, y_n} = ∅, and also that no x_i occurs free in ψ_2 and no
y_j occurs free in ψ_1. Applying m + n times prenex transformations of types (3)
and (4) we obtain
(Q_1x_1 … Q_mx_m ψ_1) ∨ (Q_{m+1}y_1 … Q_{m+n}y_n ψ_2) ⟹_pr Q_1x_1 … Q_mx_m Q_{m+1}y_1 … Q_{m+n}y_n (ψ_1 ∨ ψ_2).
Hence ϕ ⟹_pr Q_1x_1 … Q_mx_m Q_{m+1}y_1 … Q_{m+n}y_n (ψ_1 ∨ ψ_2). Likewise, to deal
with ϕ_1 ∧ ϕ_2, we apply prenex transformations of types (5) and (6).
Next, let ϕ := ∃xϕ_1. Applying prenex transformations of type (1) we can
assume x_1, …, x_m are different from x. Then ϕ ⟹_pr ∃xQ_1x_1 … Q_mx_m ψ_1, and
the RHS is in prenex form. The case ϕ := ∀xϕ_1 is similar.
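The proof above is effectively an algorithm. The following Python sketch is not part of the notes: formulas are encoded as nested tuples, the helper names (prenex, rename, fresh_var) are ad hoc, and input formulas are assumed not to use variable names starting with "_q". It carries out the same induction, passing to a variant with a fresh bound variable (transformation (1)) and then pulling the prefixes out as in transformations (2) to (6).

import itertools

_fresh = itertools.count()

def fresh_var():
    # reserved prefix "_q"; assumes input formulas never use such variable names
    return f"_q{next(_fresh)}"

def rename(f, old, new):
    """Rename every free occurrence of variable `old` to `new`."""
    kind = f[0]
    if kind == "atom":
        return ("atom", f[1], tuple(new if v == old else v for v in f[2]))
    if kind == "not":
        return ("not", rename(f[1], old, new))
    if kind in ("or", "and"):
        return (kind, rename(f[1], old, new), rename(f[2], old, new))
    if kind in ("exists", "forall"):
        return f if f[1] == old else (kind, f[1], rename(f[2], old, new))
    raise ValueError(kind)

DUAL = {"exists": "forall", "forall": "exists"}

def prenex(f):
    """Return (prefix, matrix): prefix is a list of (quantifier, variable) pairs,
    matrix is quantifier-free, mirroring transformations (1)-(6) of Lemma 3.3.4."""
    kind = f[0]
    if kind == "atom":
        return [], f
    if kind == "not":                 # (2): pull the whole prefix through the negation
        prefix, matrix = prenex(f[1])
        return [(DUAL[q], x) for q, x in prefix], ("not", matrix)
    if kind in ("or", "and"):         # (3)-(6): prefixes use fresh, disjoint variables
        p1, m1 = prenex(f[1])
        p2, m2 = prenex(f[2])
        return p1 + p2, (kind, m1, m2)
    if kind in ("exists", "forall"):  # (1): pass to a variant with a fresh bound variable
        z = fresh_var()
        prefix, matrix = prenex(rename(f[2], f[1], z))
        return [(kind, z)] + prefix, matrix
    raise ValueError(kind)

# Example: ¬∀x∃y R(x, y) becomes ∃x∀y ¬R(x, y) (with renamed bound variables):
phi = ("not", ("forall", "x", ("exists", "y", ("atom", "R", ("x", "y")))))
print(prenex(phi))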
We finish this section with results on equalities. Note that by Corollary 2.1.7,
the result of replacing an occurrence of an L-term τ in t by an L-term τ′ is an
L-term t′.
Proposition 3.3.6. Let τ and τ′ be L-terms such that Σ ⊢ τ = τ′, let t′ be
obtained from t by replacing an occurrence of τ in t by τ′. Then Σ ⊢ t = t′.
Proof. We proceed by induction on terms. First note that if t = τ, then t′ = τ′.
This fact takes care of the case that t is a variable. Suppose t = Ft_1 … t_n
where F ∈ L^f is n-ary and t_1, …, t_n are L-terms, and assume t ≠ τ. Using the
facts on admissible words at the end of Section 2.1, including exercise 5, we see
that t′ = Ft′_1 … t′_n where for some i ∈ {1, …, n} we have t_j = t′_j for all j ≠ i,
j ∈ {1, …, n}, and t′_i is obtained from t_i by replacing an occurrence of τ in t_i
by τ′. Inductively we can assume that Σ ⊢ t_1 = t′_1, …, Σ ⊢ t_n = t′_n, so by part
(5) of Lemma 3.2.3 we have Σ ⊢ t = t′.
An occurrence of an L-term τ in ϕ is said to be inside quantifier scope if the
occurrence is inside an occurrence of a subformula ∃xψ or ∀xψ in ϕ. An occur-
rence of an L-term τ in ϕ is said to be outside quantifier scope if the occurrence
is not inside quantifier scope.
Proposition 3.3.7. Let τ and τ′ be L-terms such that Σ ⊢ τ = τ′, let ϕ′ be
obtained from ϕ by replacing an occurrence of τ in ϕ outside quantifier scope by
τ′. Then Σ ⊢ ϕ ↔ ϕ′.
Proof. First take care of the case that ϕ is atomic by a similar argument as for
terms, and then proceed by the usual kind of induction on formulas.
Exercises.
(1) Let P be a unary relation symbol, Q be a binary relation symbol, and x, y distinct
variables. Use prenex transformations to put
∀x∃y(P(x) ∧ Q(x, y)) → ∃x∀y(Q(x, y) → P(y))
into prenex form.
(2) Let (ϕ_i)_{i∈I} be a family of formulas with finite index set I. Then
⊢ (∃x ⋁_i ϕ_i) ↔ (⋁_i ∃xϕ_i),  ⊢ (∀x ⋀_i ϕ_i) ↔ (⋀_i ∀xϕ_i).
(3) If ϕ_1(x_1, …, x_m) and ϕ_2(x_1, …, x_m) are existential formulas, then
(ϕ_1 ∨ ϕ_2)(x_1, …, x_m), (ϕ_1 ∧ ϕ_2)(x_1, …, x_m)
are equivalent to existential formulas ϕ^∨_{12}(x_1, …, x_m) and ϕ^∧_{12}(x_1, …, x_m). The
same holds with "existential" replaced by "universal".
(4) A formula is said to be unnested if each atomic subformula has the form Rx_1 … x_m
with m-ary R ∈ L^r ∪ {⊤, ⊥, =} and distinct variables x_1, …, x_m, or the form
Fx_1 … x_n = x_{n+1} with n-ary F ∈ L^f and distinct variables x_1, …, x_{n+1}. (This
allows ⊤ and ⊥ as atomic subformulas of unnested formulas.) Then:
(i) for each term t(x_1, …, x_m) and variable y ∉ {x_1, …, x_m} the formula
t(x_1, …, x_m) = y
is equivalent to an unnested existential formula θ_1(x_1, …, x_m, y), and also to an
unnested universal formula θ_2(x_1, …, x_m, y).
(ii) each atomic formula ϕ(y_1, …, y_n) is equivalent to an unnested existential
formula ϕ_1(y_1, …, y_n), and also to an unnested universal formula ϕ_2(y_1, …, y_n).
(iii) each formula ϕ(y_1, …, y_n) is equivalent to an unnested formula ϕ^u(y_1, …, y_n).
3.4 Equational Classes and Universal Algebra
The term-structure 𝒜_Σ introduced in the proof of the Completeness Theorem
also plays a role in what is called universal algebra. This is a general setting
for constructing mathematical objects by generators and relations. Free groups,
tensor products of various kinds, polynomial rings, and so on, are all special
cases of a single construction in universal algebra.
In this section we fix a language L that has only function symbols, including
at least one constant symbol. So L has no relation symbols. Instead of "L-
structure" we say "L-algebra", and 𝒜, ℬ denote L-algebras. A substructure of
𝒜 is also called a subalgebra of 𝒜, and a quotient algebra of 𝒜 is an L-algebra
𝒜/∼ where ∼ is a congruence on 𝒜. We call 𝒜 trivial if |A| = 1. There is up
to isomorphism exactly one trivial L-algebra.
An L-identity is an L-sentence
∀x(s_1(x) = t_1(x) ∧ ··· ∧ s_n(x) = t_n(x)),  x = (x_1, …, x_m),
where x_1, …, x_m are distinct variables and ∀x abbreviates ∀x_1 … ∀x_m, and
where s_1, t_1, …, s_n, t_n are L-terms.
Given a set Σ of L-identities we define a Σ-algebra to be an L-algebra that
satisfies all identities in Σ; in other words, a Σ-algebra is the same as a model
of Σ. To such a Σ we associate the class Mod(Σ) of all Σ-algebras. A class 𝒞 of
L-algebras is said to be equational if there is a set Σ of L-identities such that
𝒞 = Mod(Σ).
Examples. With L = L_Gr, Gr is a set of L-identities, and Mod(Gr), the class
of groups, is the corresponding equational class of L-algebras. With L = L_Ri,
Ri is a set of L-identities, and Mod(Ri), the class of rings, is the corresponding
equational class of L-algebras. If one adds to Ri the identity ∀x∀y(xy = yx)
expressing the commutative law, then the corresponding class is the class of
commutative rings.
Theorem 3.4.1 (G. Birkhoff). Let 𝒞 be a class of L-algebras. Then the class 𝒞
is equational if and only if the following conditions are satisfied:
(1) closure under isomorphism: if 𝒜 ∈ 𝒞 and 𝒜 ≅ ℬ, then ℬ ∈ 𝒞;
(2) the trivial L-algebra belongs to 𝒞;
(3) every subalgebra of any algebra in 𝒞 belongs to 𝒞;
(4) every quotient algebra of any algebra in 𝒞 belongs to 𝒞;
(5) the product of any family (𝒜_i) of algebras in 𝒞 belongs to 𝒞.
It is easy to see that if 𝒞 is equational, then conditions (1)–(5) are satisfied. (For
(3) and (4) one can also appeal to Exercises 7 and 8 of Section 2.5.) Towards
a proof of the converse, we need some universal-algebraic considerations that
are of interest beyond the connection to Birkhoff's theorem.
For the rest of this section we fix a set Σ of L-identities. Associated to Σ is the
term algebra 𝒜_Σ whose elements are the equivalence classes [t] of variable-free
L-terms t, where two such terms s and t are equivalent iff Σ ⊢ s = t.
Lemma 3.4.2. 𝒜_Σ is a Σ-algebra.
Proof. Consider an identity
∀x(s_1(x) = t_1(x) ∧ ··· ∧ s_n(x) = t_n(x)),  x = (x_1, …, x_m),
in Σ, let j ∈ {1, …, n} and put s = s_j and t = t_j. Let a_1, …, a_m ∈ A_Σ and put
𝒜 = 𝒜_Σ. It suffices to show that then s(a_1, …, a_m)^𝒜 = t(a_1, …, a_m)^𝒜. Take
variable-free L-terms α_1, …, α_m such that a_1 = [α_1], …, a_m = [α_m]. Then by
part (1) of Corollary 3.2.5 we have a_1 = α_1^𝒜, …, a_m = α_m^𝒜, so
s(a_1, …, a_m)^𝒜 = s(α_1, …, α_m)^𝒜,  t(a_1, …, a_m)^𝒜 = t(α_1, …, α_m)^𝒜
by Lemma 2.6.1. Also, by part (1) of Corollary 3.2.5,
s(α_1, …, α_m)^𝒜 = [s(α_1, …, α_m)],  t(α_1, …, α_m)^𝒜 = [t(α_1, …, α_m)].
Now Σ ⊢ s(α_1, …, α_m) = t(α_1, …, α_m), so [s(α_1, …, α_m)] = [t(α_1, …, α_m)],
and thus s(a_1, …, a_m)^𝒜 = t(a_1, …, a_m)^𝒜, as desired.
Actually, we are going to show that 𝒜_Σ is a so-called initial Σ-algebra.
An initial Σ-algebra is a Σ-algebra 𝒜 such that for any Σ-algebra ℬ there is a
unique homomorphism 𝒜 → ℬ.
For example, the trivial group is an initial Gr-algebra, and the ring of integers
is an initial Ri-algebra.
Suppose 𝒜 and ℬ are both initial Σ-algebras. Then there is a unique iso-
morphism 𝒜 → ℬ. To see this, let i and j be the unique homomorphisms
𝒜 → ℬ and ℬ → 𝒜, respectively. Then we have homomorphisms j ∘ i : 𝒜 → 𝒜
and i ∘ j : ℬ → ℬ, respectively, so necessarily j ∘ i = id_A and i ∘ j = id_B,
so i and j are isomorphisms. So if there is an initial Σ-algebra, it is unique
up-to-unique-isomorphism.
Lemma 3.4.3. 𝒜_Σ is an initial Σ-algebra.
Proof. Let ℬ be any Σ-algebra. Note that if s, t ∈ T_L and [s] = [t], then
Σ ⊢ s = t, so s^ℬ = t^ℬ. Thus we have a map
A_Σ → B,  [t] ↦ t^ℬ.
It is easy to check that this map is a homomorphism 𝒜_Σ → ℬ. By Exercise 4
in Section 2.4 it is the only such homomorphism.
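Concretely, the homomorphism [t] ↦ t^ℬ is just evaluation of variable-free terms in ℬ. Here is a minimal Python sketch (not from the notes; the tuple encoding of terms, the helper name evaluate, and the example interpretation are assumptions made for illustration):

def evaluate(term, interp):
    """Compute t^B for a variable-free L-term t in an L-algebra B.
    A term is a pair (symbol, args); a constant symbol has args == ().
    `interp` maps each function symbol to a Python function of the right arity.
    The map [t] -> evaluate(t, interp) is then the homomorphism of Lemma 3.4.3
    (well defined because [s] = [t] implies s^B = t^B)."""
    symbol, args = term
    return interp[symbol](*(evaluate(a, interp) for a in args))

# Hypothetical language with constant symbols a, b and a binary function symbol *,
# interpreted in the algebra of strings under concatenation:
interp = {"a": lambda: "a", "b": lambda: "b", "*": lambda u, v: u + v}
t = ("*", (("a", ()), ("*", (("b", ()), ("a", ())))))
print(evaluate(t, interp))   # "aba"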
Free algebras. Let I be an index set in what follows. Let 𝒜 be a Σ-algebra
and (a_i)_{i∈I} an I-indexed family of elements of A. Then 𝒜 is said to be a free
Σ-algebra on (a_i) if for every Σ-algebra ℬ and I-indexed family (b_i) of elements
of B there is exactly one homomorphism h : 𝒜 → ℬ such that h(a_i) = b_i for all
i ∈ I. We also express this by "(𝒜, (a_i)) is a free Σ-algebra". Finally, 𝒜 itself is
sometimes referred to as a free Σ-algebra if there is a family (a_j)_{j∈J} in A such
that (𝒜, (a_j)) is a free Σ-algebra.
As an example, take L = L_Ri and cRi := Ri ∪ {∀x∀y(xy = yx)}, where x, y
are distinct variables. So the cRi-algebras are just the commutative rings. Let
Z[X_1, …, X_n] be the ring of polynomials in distinct indeterminates X_1, …, X_n
over Z. For any commutative ring R and elements b_1, …, b_n ∈ R we have a
unique ring homomorphism Z[X_1, …, X_n] → R that sends X_i to b_i for i =
1, …, n, namely the evaluation map (or substitution map)
Z[X_1, …, X_n] → R,  f(X_1, …, X_n) ↦ f(b_1, …, b_n).
Thus Z[X_1, …, X_n] is a free commutative ring on (X_i)_{1≤i≤n}.
For a simpler example, let L = L_Mo := {1, ·} ⊆ L_Gr be the language of monoids,
and consider
Mo := {∀x(1 · x = x ∧ x · 1 = x), ∀x∀y∀z((xy)z = x(yz))},
where x, y, z are distinct variables. A monoid, or semigroup with identity, is a
model 𝒜 = (A; 1, ·) of Mo, and we call 1 ∈ A the identity of the monoid 𝒜, and
· its product operation.
Let E^* be the set of words on an alphabet E, and consider E^* as a monoid by
taking the empty word as its identity and the concatenation operation (v, w) ↦
vw as its product operation. Then E^* is a free monoid on the family (e)_{e∈E} of
words of length 1, because for any monoid ℬ and elements b_e ∈ B (for e ∈ E)
we have a unique monoid homomorphism E^* → ℬ that sends each e ∈ E to b_e,
namely,
e_1 … e_n ↦ b_{e_1} ··· b_{e_n}.
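In programming terms this unique homomorphism is simply a fold over the word. A tiny Python illustration (the function name monoid_hom is an ad hoc choice, not notation from the notes):

def monoid_hom(word, b, identity, op):
    """The unique monoid homomorphism E* -> B sending each letter e to b[e]:
    e_1 ... e_n  |->  b[e_1] * ... * b[e_n], with the empty word sent to the identity."""
    result = identity
    for e in word:
        result = op(result, b[e])
    return result

# Into the multiplicative monoid (N; 1, ·), with a ↦ 2 and b ↦ 3:
print(monoid_hom("abba", {"a": 2, "b": 3}, 1, lambda x, y: x * y))   # 2*3*3*2 = 36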
Remark. If 𝒜 and ℬ are both free Σ-algebras on (a_i) and (b_i) respectively, with
the same index set I, and g : 𝒜 → ℬ and h : ℬ → 𝒜 are the unique homomorphisms
such that g(a_i) = b_i and h(b_i) = a_i for all i, then (h ∘ g)(a_i) = a_i for all i, so
h ∘ g = id_A, and likewise g ∘ h = id_B, so g is an isomorphism with inverse h.
Thus, given I, there is, up to unique isomorphism preserving I-indexed families,
at most one free Σ-algebra on an I-indexed family of its elements.
We shall now construct free Σ-algebras as initial algebras by working in an
extended language. Let L_I := L ∪ {c_i : i ∈ I} be the language L augmented by
new constant symbols c_i for i ∈ I, where new means that c_i ∉ L for i ∈ I and
c_i ≠ c_j for distinct i, j ∈ I. So an L_I-algebra (ℬ, (b_i)) is just an L-algebra ℬ
equipped with an I-indexed family (b_i) of elements of B. Let Σ_I be Σ viewed
as a set of L_I-identities. Then a free Σ-algebra on an I-indexed family of its
elements is just an initial Σ_I-algebra. In particular, the Σ_I-algebra 𝒜_{Σ_I} is a
free Σ-algebra on ([c_i]). Thus, up to unique isomorphism of Σ_I-algebras, there
is a unique free Σ-algebra on an I-indexed family of its elements.
Let (𝒜, (a_i)_{i∈I}) be a free Σ-algebra. Then 𝒜 is generated by (a_i). To see
why, let ℬ be the subalgebra of 𝒜 generated by (a_i). Then we have a unique
homomorphism h : 𝒜 → ℬ such that h(a_i) = a_i for all i ∈ I, and then the
composition
𝒜 → ℬ → 𝒜
is necessarily id_A, so ℬ = 𝒜.
Let ℬ be any Σ-algebra, and take any family (b_j)_{j∈J} in B that generates ℬ.
Take a free Σ-algebra (𝒜, (a_j)_{j∈J}), and take the unique homomorphism
h : (𝒜, (a_j)) → (ℬ, (b_j)). Then h(t^𝒜(a_{j_1}, …, a_{j_n})) = t^ℬ(b_{j_1}, …, b_{j_n}) for all L-
terms t(x_1, …, x_n) and j_1, …, j_n ∈ J, so h(A) = B, and thus h induces an
isomorphism 𝒜/∼_h ≅ ℬ. We have shown:
Every Σ-algebra is isomorphic to a quotient of a free Σ-algebra.
This fact can sometimes be used to reduce problems on Σ-algebras to the case
of free Σ-algebras; see the next subsection for an example.
Proof of Birkhoff's theorem. Let us say that a class 𝒞 of L-algebras is closed
if it has properties (1)–(5) listed in Theorem 3.4.1. Assume 𝒞 is closed; we have
to show that then 𝒞 is equational. Indeed, let Σ(𝒞) be the set of L-identities
∀x(s(x) = t(x))
that are true in all algebras of 𝒞. It is clear that each algebra in 𝒞 is a Σ(𝒞)-
algebra, and it remains to show that every Σ(𝒞)-algebra belongs to 𝒞. Here is
the key fact from which this will follow:
Claim. If 𝒜 is an initial Σ(𝒞)-algebra, then 𝒜 ∈ 𝒞.
To prove this claim we take 𝒜 := 𝒜_{Σ(𝒞)}. For s, t ∈ T_L such that s = t does not
belong to Σ(𝒞) we pick ℬ_{s,t} ∈ 𝒞 such that ℬ_{s,t} ⊨ s ≠ t, and we let h_{s,t} : 𝒜 → ℬ_{s,t}
be the unique homomorphism, so h_{s,t}([s]) ≠ h_{s,t}([t]). Let ℬ := ∏ ℬ_{s,t} where the
product is over all pairs (s, t) as above, and let h : 𝒜 → ℬ be the homomorphism
given by h(a) = (h_{s,t}(a)). Note that ℬ ∈ 𝒞. Then h is injective. To see why,
let s, t ∈ T_L be such that [s] ≠ [t] in A = A_{Σ(𝒞)}. Then s = t does not belong
to Σ(𝒞), so h_{s,t}([s]) ≠ h_{s,t}([t]), and thus h([s]) ≠ h([t]). This injectivity gives
𝒜 ≅ h(𝒜) ⊆ ℬ, so 𝒜 ∈ 𝒞. This finishes the proof of the claim.
Now, every Σ(𝒞)-algebra is isomorphic to a quotient of a free Σ(𝒞)-algebra, so
it remains to show that free Σ(𝒞)-algebras belong to 𝒞. Let 𝒜 be a free Σ(𝒞)-
algebra on (a_i)_{i∈I}. Let 𝒞_I be the class of all L_I-algebras (ℬ, (b_i)) with ℬ ∈ 𝒞.
It is clear that 𝒞_I is closed as a class of L_I-algebras. Now, (𝒜, (a_i)) is easily
seen to be an initial Σ(𝒞_I)-algebra. By the claim above, applied to 𝒞_I in place
of 𝒞, we obtain (𝒜, (a_i)) ∈ 𝒞_I, and thus 𝒜 ∈ 𝒞.
Generators and relations. Let G be any set. Then we have a Σ-algebra 𝒜
with a map ι : G → A such that for any Σ-algebra ℬ and any map j : G → B
there is a unique homomorphism h : 𝒜 → ℬ such that h ∘ ι = j; in other words, 𝒜
is free as a Σ-algebra on (ιg)_{g∈G}. Note that if (𝒜′, ι′) (with ι′ : G → A′) has the
same universal property as (𝒜, ι), then the unique homomorphism h : 𝒜 → 𝒜′
such that h ∘ ι = ι′ is an isomorphism, so this universal property determines the
pair (𝒜, ι) up-to-unique-isomorphism. So there is no harm in calling (𝒜, ι) the
free Σ-algebra on G. Note that 𝒜 is generated as an L-algebra by ιG.
Here is a particular way of constructing the free Σ-algebra on G. Take the
language L_G := L ∪ G (disjoint union) with the elements of G as constant
symbols. Let Σ(G) be Σ considered as a set of L_G-identities. Then 𝒜 := 𝒜_{Σ(G)}
as a Σ-algebra with the map g ↦ [g] : G → A_{Σ(G)} is the free Σ-algebra on G.
Next, let R be a set of sentences s(g) = t(g) where s(x_1, …, x_n) and
t(x_1, …, x_n) are L-terms and g = (g_1, …, g_n) ∈ G^n (with n depending on
the sentence). We wish to construct the Σ-algebra generated by G with R as set
of relations.¹ This object is described up-to-isomorphism in the next lemma.
Lemma 3.4.4. There is a Σ-algebra 𝒜(G, R) with a map i : G → A(G, R) such
that:
(1) 𝒜(G, R) ⊨ s(ig) = t(ig) for all s(g) = t(g) in R;
(2) for any Σ-algebra ℬ and map j : G → B with ℬ ⊨ s(jg) = t(jg) for all
s(g) = t(g) in R, there is a unique homomorphism h : 𝒜(G, R) → ℬ such
that h ∘ i = j.
Proof. Let Σ(R) := Σ ∪ R, viewed as a set of L_G-sentences, let 𝒜(G, R) :=
𝒜_{Σ(R)}, and define i : G → A(G, R) by i(g) = [g]. As before one sees that the
universal property of the lemma is satisfied.
¹The use of the term "relations" here has nothing to do with n-ary relations on sets.
Chapter 4
Some Model Theory
In this chapter we first derive the Löwenheim-Skolem Theorem. Next we develop
some basic methods related to proving completeness of a given set of axioms:
Vaught’s Test, back-and-forth, quantifier elimination. Each of these methods,
when applicable, achieves a lot more than just establishing completeness.
4.1 Löwenheim-Skolem; Vaught's Test
Below, the cardinality of a structure is defined to be the cardinality of its un-
derlying set. In this section we have the same conventions concerning L, 𝒜, t,
ϕ, ψ, θ, σ and Σ as in the beginning of Section 2.6, unless specified otherwise.
Theorem 4.1.1 (Countable Löwenheim-Skolem Theorem).
Suppose L is countable and Σ has a model. Then Σ has a countable model.
Proof. Since Var is countable, the hypothesis that L is countable yields that the
set of L-sentences is countable. Hence the language
L ∪ {c_σ : Σ ⊢ σ where σ is an L-sentence of the form ∃xϕ(x)}
is countable, that is, adding witnesses keeps the language countable. The union
of countably many countable sets is countable, hence the set L_∞ constructed
in the proof of the Completeness Theorem is countable. It follows that there
are only countably many variable-free L_∞-terms, hence 𝒜_{Σ_∞} is countable, and
thus its reduct 𝒜_{Σ_∞}|_L is a countable model of Σ.
Remark. The above proof is the first time that we used the countability of
Var = {v_0, v_1, v_2, …}. As promised in Section 2.4, we shall now indicate why the
Downward Löwenheim-Skolem Theorem goes through without assuming that
Var is countable.
Suppose that Var is uncountable. Take a countably infinite subset Var′ ⊆
Var. Then each sentence is equivalent to one whose variables are all from Var′.
By replacing each sentence in Σ by an equivalent one all of whose variables are
from Var′, we obtain a countable set Σ′ of sentences such that Σ and Σ′ have
the same models. As in the proof above, we obtain a countable model of Σ′,
working throughout in the setting where only variables from Var′ are used in
terms and formulas. This model is a countable model of Σ.
The following test can be useful in showing that a set of axioms Σ is complete.
Proposition 4.1.2 (Vaught’s Test). Let L be countable, and suppose Σ has a
model, and that all countable models of Σ are isomorphic. Then Σ is complete.
Proof. Suppose Σ is not complete. Then there is σ such that Σ ⊬ σ and Σ ⊬ ¬σ.
Hence by the Löwenheim-Skolem Theorem there is a countable model 𝒜 of Σ
such that 𝒜 ⊭ σ, and there is a countable model ℬ of Σ such that ℬ ⊨ σ. We
have 𝒜 ≅ ℬ, 𝒜 ⊭ σ and ℬ ⊨ σ, contradiction.
Example. Let L = ∅, so the L-structures are just the non-empty sets. Let
Σ = {σ_1, σ_2, …} where
σ_n = ∃x_1 … ∃x_n ⋀_{1≤i<j≤n} x_i ≠ x_j.
The models of Σ are exactly the infinite sets. All countable models of Σ are
countably infinite and hence isomorphic to N. Thus by Vaught's Test Σ is
complete.
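Satisfaction of σ_n in a finite structure can be checked by brute force, which makes the claim that a set satisfies σ_n exactly when it has at least n elements easy to test. A small Python sketch (the function name is ad hoc, not notation from the notes):

from itertools import product

def satisfies_sigma_n(A, n):
    """Check A ⊨ σ_n, i.e. ∃x_1...∃x_n ⋀_{i<j} x_i ≠ x_j, by brute force
    over all assignments of x_1,...,x_n to elements of the finite set A."""
    return any(len(set(assignment)) == n for assignment in product(A, repeat=n))

# A set of size m satisfies σ_3 exactly when m >= 3:
print([satisfies_sigma_n(set(range(m)), 3) for m in range(5)])   # [False, False, False, True, True]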
In this example the hypothesis of Vaught’s Test is trivially satisfied. In other
cases it may require work to check this hypothesis. One general method in
model theory, Back-and-Forth, is often used to verify the hypothesis of Vaught’s
Test. The proof of the next theorem shows Back-and-Forth in action. To
formulate that theorem we define a totally ordered set to be a structure (A; <)
for the language L_O that satisfies the following axioms (where x, y, z are distinct
variables):
∀x(x ≮ x),  ∀x∀y∀z((x < y ∧ y < z) → x < z),  ∀x∀y(x < y ∨ x = y ∨ y < x).
Let To be the set consisting of these three axioms. A totally ordered set is said
to be dense if it satisfies in addition the axiom
∀x∀y(x < y → ∃z(x < z ∧ z < y)),
and it is said to have no endpoints if it satisfies the axiom
∀x∃y∃z(y < x ∧ x < z).
So (Q; <) and (R; <) are dense totally ordered sets without endpoints. Let
To (dense without endpoints) be To augmented by these two extra axioms.
Theorem 4.1.3 (Cantor). Any two countable dense totally ordered sets without
endpoints are isomorphic.
Proof. Let (A; <) and (B; <) be countable dense totally ordered sets without
endpoints. So A = {a_n : n ∈ N} and B = {b_n : n ∈ N}. We define by recursion
a sequence (α_n) in A and a sequence (β_n) in B as follows: put α_0 := a_0 and
β_0 := b_0. Let n > 0, and suppose we have distinct α_0, …, α_{n−1} in A and distinct
β_0, …, β_{n−1} in B such that for all i, j < n we have α_i < α_j ⟺ β_i < β_j. Then
we define α_n ∈ A and β_n ∈ B as follows:
Case 1: n is even. (Here we go forth.) First take k ∈ N minimal such that
a_k ∉ {α_0, …, α_{n−1}}; then take l ∈ N minimal such that b_l is situated with
respect to β_0, …, β_{n−1} as a_k is situated with respect to α_0, …, α_{n−1}, that is, l
is minimal such that for i = 0, …, n − 1 we have: α_i < a_k ⟺ β_i < b_l, and
α_i > a_k ⟺ β_i > b_l. (The reader should check that such an l exists: that is
where density and "no endpoints" come in.) Put α_n := a_k and β_n := b_l.
Case 2: n is odd. (Here we go back.) First take l ∈ N minimal such that
b_l ∉ {β_0, …, β_{n−1}}; next take k ∈ N minimal such that a_k is situated with
respect to α_0, …, α_{n−1} as b_l is situated with respect to β_0, …, β_{n−1}, that is, k
is minimal such that for i = 0, …, n − 1 we have: α_i < a_k ⟺ β_i < b_l, and
α_i > a_k ⟺ β_i > b_l. Put β_n := b_l and α_n := a_k.
One proves easily by induction on n that then a_n ∈ {α_0, …, α_{2n}} and b_n ∈
{β_0, …, β_{2n}}. Thus we have a bijection α_n ↦ β_n : A → B, and this bijection
is an isomorphism (A; <) → (B; <).
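The recursion in Cantor's proof is explicit enough to run. The Python sketch below is not part of the notes: it uses Q ∩ (0,1) and the dyadic rationals in (0,1) as two concrete countable dense orders without endpoints, only computes finitely many steps of the back-and-forth, and uses generously long prefixes of the two enumerations so that the searches succeed; the function names are ad hoc.

from fractions import Fraction
from itertools import islice

def rationals_01():
    """Enumerate Q ∩ (0,1), a countable dense totally ordered set without endpoints."""
    seen, q = set(), 2
    while True:
        for p in range(1, q):
            x = Fraction(p, q)
            if x not in seen:
                seen.add(x)
                yield x
        q += 1

def dyadics_01():
    """Enumerate the dyadic rationals in (0,1), another such set."""
    seen, k = set(), 1
    while True:
        for p in range(1, 2 ** k):
            x = Fraction(p, 2 ** k)
            if x not in seen:
                seen.add(x)
                yield x
        k += 1

def situated_alike(c, placed_c, d, placed_d):
    """Is d situated with respect to placed_d as c is with respect to placed_c?"""
    return all((c < a) == (d < b) and (c > a) == (d > b)
               for a, b in zip(placed_c, placed_d))

def back_and_forth(steps=8):
    a = list(islice(rationals_01(), 2000))   # finite prefixes of the two enumerations,
    b = list(islice(dyadics_01(), 2000))     # long enough for the few steps run here
    alpha, beta = [a[0]], [b[0]]
    for n in range(1, steps):
        if n % 2 == 0:   # forth: place the least-index unused a_k, match its position in B
            c = next(x for x in a if x not in alpha)
            d = next(y for y in b if y not in beta and situated_alike(c, alpha, y, beta))
        else:            # back: place the least-index unused b_l, match its position in A
            d = next(y for y in b if y not in beta)
            c = next(x for x in a if x not in alpha and situated_alike(d, beta, x, alpha))
        alpha.append(c)
        beta.append(d)
    return list(zip(alpha, beta))   # a partial isomorphism; the limit is an isomorphism

for pair in back_and_forth():
    print(pair)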
In combination with Vaught’s Test this gives
Corollary 4.1.4. To (dense without endpoints) is complete.
In the results below we let κ denote an infinite cardinal. Recall that the set of
ordinals λ < κ has cardinality κ. We have the following generalization of the
Löwenheim-Skolem theorem.
Theorem 4.1.5 (Generalized Löwenheim-Skolem Theorem). Suppose L has
size at most κ and Σ has an infinite model. Then Σ has a model of cardinality
κ.
Proof. Let {c_λ}_{λ<κ} be a family of κ new constant symbols that are not in L and
are pairwise distinct (that is, c_λ ≠ c_µ for λ < µ < κ). Let L′ = L ∪ {c_λ : λ < κ}
and let Σ′ = Σ ∪ {c_λ ≠ c_µ : λ < µ < κ}. We claim that Σ′ is consistent. To see
this it suffices to show that, given any finite set Λ ⊆ κ, the set of L′-sentences
Σ_Λ := Σ ∪ {c_λ ≠ c_µ : λ, µ ∈ Λ, λ ≠ µ}
has a model. Take an infinite model 𝒜 of Σ. We make an L′-expansion 𝒜_Λ
of 𝒜 by interpreting distinct c_λ's with λ ∈ Λ by distinct elements of A, and
interpreting the c_λ's with λ ∉ Λ arbitrarily. Then 𝒜_Λ is a model of Σ_Λ, which
establishes the claim.
Note that L′ also has size at most κ. The same arguments we used in proving
the countable version of the Löwenheim-Skolem Theorem show that then Σ′
has a model ℬ′ = (ℬ, (b_λ)_{λ<κ}) of cardinality at most κ. We have b_λ ≠ b_µ for
λ < µ < κ, hence ℬ is a model of Σ of cardinality κ.
The next proposition is Vaught’s Test for arbitrary languages and cardinalities.
Proposition 4.1.6. Suppose L has size at most κ, Σ has a model and all models
of Σ are infinite. Suppose also that any two models of Σ of cardinality κ are
isomorphic. Then Σ is complete.
Proof. Let σ be an L-sentence and suppose that Σ ⊬ σ and Σ ⊬ ¬σ. We
will derive a contradiction. First, Σ ⊬ σ means that Σ ∪ {¬σ} has a model.
Similarly Σ ⊬ ¬σ means that Σ ∪ {σ} has a model. These models must be
infinite since they are models of Σ, so by the Generalized Löwenheim-Skolem
Theorem Σ ∪ {¬σ} has a model 𝒜 of cardinality κ, and Σ ∪ {σ} has a model
ℬ of cardinality κ. By assumption 𝒜 ≅ ℬ, contradicting that 𝒜 ⊨ ¬σ and
ℬ ⊨ σ.
We now discuss in detail one particular application of this generalized Vaught
Test. Fix a field F. A vector space over F is an abelian (additively written)
group V equipped with a scalar multiplication operation
F × V → V : (λ, v) ↦ λv
such that for all λ, µ ∈ F and all v, w ∈ V ,
(i) (λ +µ)v = λv +µv,
(ii) λ(v +w) = λv +λw,
(iii) 1v = v,
(iv) (λµ)v = λ(µv).
Let L_F be the language of vector spaces over F: it extends the language
L_Ab = {0, −, +} of abelian groups with unary function symbols f_λ, one for each
λ ∈ F; a vector space V over F is viewed as an L_F-structure by interpreting
each f_λ as the function v ↦ λv : V → V. One easily specifies a set Σ_F of
sentences whose models are exactly the vector spaces over F. Note that Σ_F is
not complete since the trivial vector space satisfies ∀x(x = 0) but F viewed as
a vector space over F does not. Moreover, if F is finite, then we have also non-
trivial finite vector spaces. From a model-theoretic perspective finite structures
are somewhat exceptional, so we are going to restrict attention to infinite vector
spaces over F. Let x_1, x_2, … be a sequence of distinct variables and put
Σ_F^∞ := Σ_F ∪ {∃x_1 … ∃x_n ⋀_{1≤i<j≤n} x_i ≠ x_j : n = 2, 3, …}.
So the models of Σ_F^∞ are exactly the infinite vector spaces over F. Note that if
F itself is infinite then each non-trivial vector space over F is infinite.
We will need the following facts about vector spaces V and W over F.
(Proofs can be found in various texts.)
Fact.
(a) V has a basis B, that is, B ⊆ V, and for each vector v ∈ V there is a
unique family (λ_b)_{b∈B} of scalars (elements of F) such that {b ∈ B : λ_b ≠ 0}
is finite and v = Σ_{b∈B} λ_b b.
(b) Any two bases B and C of V have the same cardinality.
(c) If V has basis B and W has basis C, then any bijection B → C extends
uniquely to an isomorphism V → W.
(d) Let B be a basis of V. Then |V| = |B| · |F| provided either |F| or |B| is
infinite. If both are finite then |V| = |F|^{|B|}.
Theorem 4.1.7. Σ_F^∞ is complete.
Proof. Take a κ > |F|. In particular L_F has size at most κ. Let V be a vector
space over F of cardinality κ. Then a basis of V must also have size κ by
property (d) above. Thus any two vector spaces over F of cardinality κ have
bases of cardinality κ and thus are isomorphic. It follows by the Generalized
Vaught Test that Σ_F^∞ is complete.
Remark. Theorem 4.1.7 and Exercise 3 imply for instance that if F = R then
all non-trivial vector spaces over F satisfy exactly the same sentences in L_F.
With the generalized Vaught Test we can also prove that ACF(0) (whose models
are the algebraically closed fields of characteristic 0) is complete. The proof is
similar, with “transcendence bases” taking over the role of bases. The relevant
definitions and facts are as follows.
Let K be a field with subfield k. A subset B of K is said to be algebraically
independent over k if for all distinct b_1, …, b_n ∈ B we have p(b_1, …, b_n) ≠ 0
for all nonzero polynomials p(x_1, …, x_n) ∈ k[x_1, …, x_n], where x_1, …, x_n are
distinct variables. A transcendence basis of K over k is a set B ⊆ K such that
B is algebraically independent over k and K is algebraic over its subfield k(B).
Fact.
(a) K has a transcendence basis over k;
(b) any two transcendence bases of K over k have the same size;
(c) if K is algebraically closed with transcendence basis B over k and K′ is
also an algebraically closed field extension of k with transcendence basis B′
over k, then any bijection B → B′ extends to an isomorphism K → K′;
(d) if K is uncountable and |K| > |k|, then |K| = |B| for each transcendence
basis B of K over k.
Applying this with k = Q and k = F_p for prime numbers p, we obtain that
any two algebraically closed fields of the same characteristic and the same un-
countable size are isomorphic. Using Vaught's Test for models of size ℵ_1 this
yields:
Theorem 4.1.8. The set ACF(0) of axioms for algebraically closed fields of
characteristic zero is complete. Likewise, ACF(p) is complete for each prime
number p.
If the hypothesis of Vaught’s Test (or its generalization) is satisfied, then many
things follow of which completeness is only one; it goes beyond the scope of
these notes to develop this large chapter of model theory, which goes under the
name of “categoricity in power”.
Exercises.
(1) Let L = {U} where U is a unary relation symbol. Consider the L-structure
(Z; N). Give an informative description of a complete set of L-sentences true in
(Z; N). (A description like {σ : (Z; N) ⊨ σ} is not informative. An explicit,
possibly infinite, list of axioms is required. Hint: Make an educated guess and
try to verify it using Vaught's Test or one of its variants.)
(2) Let Σ_1 and Σ_2 be sets of L-sentences such that no symbol of L occurs in both Σ_1
and Σ_2. Suppose Σ_1 and Σ_2 have infinite models. Then Σ_1 ∪ Σ_2 has a model.
(3) Let L = {S} where S is a unary function symbol. Consider the L-structure (Z; S)
where S(a) = a + 1 for a ∈ Z. Give an informative description of a complete set
of L-sentences true in (Z; S).
4.2 Elementary Equivalence and Back-and-Forth
In the rest of this chapter we relax notation, and just write ϕ(a_1, …, a_n) for the
L_A-formula ϕ(a_1/x_1, …, a_n/x_n), where 𝒜 = (A; …) is an L-structure, ϕ(x_1, …, x_n)
an L_A-formula, and (a_1, …, a_n) ∈ A^n.
In this section 𝒜 and ℬ denote L-structures. We say that 𝒜 and ℬ are elemen-
tarily equivalent (notation: 𝒜 ≡ ℬ) if they satisfy the same L-sentences. Thus
by the previous section (Q; <) ≡ (R; <), and any two infinite vector spaces
over a given field F are elementarily equivalent.
A partial isomorphism from 𝒜 to ℬ is a bijection γ : X → Y with X ⊆ A and
Y ⊆ B such that
(i) for each m-ary R ∈ L^r and a_1, …, a_m ∈ X,
(a_1, …, a_m) ∈ R^𝒜 ⟺ (γa_1, …, γa_m) ∈ R^ℬ;
(ii) for each n-ary f ∈ L^f and a_1, …, a_n, a_{n+1} ∈ X,
f^𝒜(a_1, …, a_n) = a_{n+1} ⟺ f^ℬ(γa_1, …, γa_n) = γ(a_{n+1}).
Example: suppose 𝒜 = (A; <) and ℬ = (B; <) are totally ordered sets, and
a_1, …, a_N ∈ A and b_1, …, b_N ∈ B are such that a_1 < a_2 < ··· < a_N and
b_1 < b_2 < ··· < b_N; then the map a_i ↦ b_i : {a_1, …, a_N} → {b_1, …, b_N} is a
partial isomorphism from 𝒜 to ℬ.
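For finite data, clauses (i) and (ii) can be checked mechanically. The Python sketch below is not from the notes; the dictionary encoding of structures (relations as sets of tuples with their arities, functions by their graphs) is an assumption made for this illustration.

from itertools import product

def is_partial_isomorphism(gamma, A, B):
    """Check clauses (i) and (ii) for a finite map gamma : X -> Y between two finite
    structures.  A and B are dicts with keys "rels" (symbol -> (arity, set of tuples))
    and "funs" (symbol -> (arity, dict from argument tuples to values))."""
    if len(set(gamma.values())) != len(gamma):      # gamma must be injective
        return False
    X = list(gamma)
    for R, (m, RA) in A["rels"].items():            # clause (i)
        _, RB = B["rels"][R]
        for a in product(X, repeat=m):
            if (a in RA) != (tuple(gamma[x] for x in a) in RB):
                return False
    for f, (n, fA) in A["funs"].items():            # clause (ii)
        _, fB = B["funs"][f]
        for a in product(X, repeat=n):
            for last in X:
                if (fA.get(a) == last) != (fB.get(tuple(gamma[x] for x in a)) == gamma[last]):
                    return False
    return True

# The order example above, with a_i -> b_i between two three-element chains:
A = {"rels": {"<": (2, {(1, 2), (1, 3), (2, 3)})}, "funs": {}}
B = {"rels": {"<": (2, {(10, 20), (10, 30), (20, 30)})}, "funs": {}}
print(is_partial_isomorphism({1: 10, 2: 20, 3: 30}, A, B))   # True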
A back-and-forth system from 𝒜 to ℬ is a collection Γ of partial isomorphisms
from 𝒜 to ℬ such that
(i) ("Forth") for each γ ∈ Γ and a ∈ A there is a γ′ ∈ Γ such that γ′ extends
γ and a ∈ domain(γ′);
(ii) ("Back") for each γ ∈ Γ and b ∈ B there is a γ′ ∈ Γ such that γ′ extends
γ and b ∈ codomain(γ′).
If Γ is a back-and-forth system from 𝒜 to ℬ, then Γ^{−1} := {γ^{−1} : γ ∈ Γ} is a
back-and-forth system from ℬ to 𝒜. We call 𝒜 and ℬ back-and-forth equivalent
(notation: 𝒜 ≡_bf ℬ) if there exists a nonempty back-and-forth system from 𝒜
to ℬ. Hence 𝒜 ≡_bf 𝒜, and if 𝒜 ≡_bf ℬ, then ℬ ≡_bf 𝒜.
Cantor's proof in the previous section generalizes as follows:
Proposition 4.2.1. Suppose 𝒜 and ℬ are countable and 𝒜 ≡_bf ℬ. Then 𝒜 ≅ ℬ.
Proof. Let Γ be a nonempty back-and-forth system from 𝒜 to ℬ. We proceed
as in the proof of Cantor's theorem, and construct a sequence (γ_n) of partial
isomorphisms from 𝒜 to ℬ such that each γ_{n+1} extends γ_n, A = ⋃_n domain(γ_n)
and B = ⋃_n codomain(γ_n). Then the map A → B that extends each γ_n is an
isomorphism 𝒜 → ℬ.
In applying this proposition and the next one in a concrete situation, the
key is to guess a back-and-forth system. That is where insight and imagination
(and experience) come in. In the following result we do not need to assume
countability.
Proposition 4.2.2. If 𝒜 ≡_bf ℬ, then 𝒜 ≡ ℬ.
Proof. Suppose Γ is a nonempty back-and-forth system from 𝒜 to ℬ. By
induction on the number of logical symbols in unnested formulas ϕ(y_1, …, y_n)
one shows that for each γ ∈ Γ and a_1, …, a_n ∈ domain(γ) we have
𝒜 ⊨ ϕ(a_1, …, a_n) ⟺ ℬ ⊨ ϕ(γa_1, …, γa_n).
For n = 0 and using Exercise 4 of Section 3.3 this yields 𝒜 ≡ ℬ.
Exercises.
(1) Define a finite restriction of a bijection γ : X → Y to be a map γ|E : E → γ(E)
with finite E ⊆ X. If Γ is a back-and-forth system from 𝒜 to ℬ, so is the set of
finite restrictions of members of Γ.
(2) If 𝒜 ≡_bf ℬ and ℬ ≡_bf 𝒞, then 𝒜 ≡_bf 𝒞.
4.3 Quantifier Elimination
In this section x = (x_1, …, x_n) is a tuple of distinct variables and y is a single
variable distinct from x_1, …, x_n.
Definition. Σ has quantifier elimination (QE) if every L-formula ϕ(x) is Σ-
equivalent to a quantifier-free (short: q-free) L-formula ϕ^qf(x).
By taking n = 0 in this definition we see that if Σ has QE, then every L-sentence
is Σ-equivalent to a q-free L-sentence.
Remark. Suppose Σ has QE. Let L′ be a language such that L′ ⊇ L and all
symbols of L′ ∖ L are constant symbols. (This includes the case L′ = L.) Then
every set Σ′ of L′-sentences with Σ′ ⊇ Σ has QE. (To see this, check first that
each L′-formula ϕ′(x) has the form ϕ(c, x) for some L-formula ϕ(u, x), where
u = (u_1, …, u_m) is a tuple of distinct variables distinct from x_1, …, x_n, and
c = (c_1, …, c_m) is a tuple of constant symbols in L′ ∖ L.)
A basic conjunction in L is by definition a conjunction of finitely many atomic
and negated atomic L-formulas. Each q-free L-formula ϕ(x) is equivalent to a
disjunction ϕ_1(x) ∨ ··· ∨ ϕ_k(x) of basic conjunctions ϕ_i(x) in L ("disjunctive
normal form").
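Passing to disjunctive normal form is a routine computation: push negations down to the atoms and distribute ∧ over ∨. A minimal Python sketch (the tuple encoding and the function names nnf and dnf are ad hoc, not notation from the notes):

def nnf(f):
    """Push negations down to the atoms (negation normal form)."""
    kind = f[0]
    if kind == "atom":
        return f
    if kind == "not":
        g = f[1]
        if g[0] == "atom":
            return f
        if g[0] == "not":
            return nnf(g[1])
        if g[0] == "or":
            return ("and", nnf(("not", g[1])), nnf(("not", g[2])))
        if g[0] == "and":
            return ("or", nnf(("not", g[1])), nnf(("not", g[2])))
    return (kind, nnf(f[1]), nnf(f[2]))

def dnf(f):
    """Return a list of basic conjunctions (each a list of literals) whose
    disjunction is equivalent to the q-free formula f."""
    f = nnf(f)
    kind = f[0]
    if kind in ("atom", "not"):
        return [[f]]
    if kind == "or":
        return dnf(f[1]) + dnf(f[2])
    if kind == "and":
        return [c1 + c2 for c1 in dnf(f[1]) for c2 in dnf(f[2])]

# (P ∨ Q) ∧ ¬R  becomes  (P ∧ ¬R) ∨ (Q ∧ ¬R):
print(dnf(("and", ("or", ("atom", "P"), ("atom", "Q")), ("not", ("atom", "R")))))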
Lemma 4.3.1. Suppose that for every basic conjunction θ(x, y) in L there is a
q-free L-formula θ^qf(x) such that
Σ ⊢ ∃yθ(x, y) ↔ θ^qf(x).
Then Σ has QE.
Proof. Let us say that an L-formula ϕ(x) has Σ-QE if it is Σ-equivalent to a q-
free L-formula ϕ^qf(x). Note that if the L-formulas ϕ_1(x) and ϕ_2(x) have Σ-QE,
then ¬ϕ_1(x), (ϕ_1 ∨ ϕ_2)(x), and (ϕ_1 ∧ ϕ_2)(x) have Σ-QE.
Next, let ϕ(x) = ∃yψ(x, y), and suppose inductively that the L-formula
ψ(x, y) has Σ-QE. Hence ψ(x, y) is Σ-equivalent to a disjunction ⋁_i ψ_i(x, y) of
basic conjunctions ψ_i(x, y) in L, with i ranging over some finite index set. In
view of the equivalence of ∃y ⋁_i ψ_i(x, y) with ⋁_i ∃yψ_i(x, y) we obtain
Σ ⊢ ϕ(x) ↔ ⋁_i ∃yψ_i(x, y).
Each ∃yψ_i(x, y) has Σ-QE, by hypothesis, so ϕ(x) has Σ-QE.
Finally, let ϕ(x) = ∀yψ(x, y), and suppose inductively that the L-formula
ψ(x, y) has Σ-QE. This case reduces to the previous case since ϕ(x) is equivalent
to ¬∃y¬ψ(x, y).
Remark. In the structure (R; <, 0, 1, +, −, ·) the formula
ϕ(a, b, c) := ∃y(ay^2 + by + c = 0)
is equivalent to the q-free formula
(b^2 − 4ac ≥ 0 ∧ a ≠ 0) ∨ (a = 0 ∧ b ≠ 0) ∨ (a = 0 ∧ b = 0 ∧ c = 0).
Note that this equivalence gives an effective test for the existence of a y with a
certain property, which avoids in particular having to check an infinite number
of values of y (even uncountably many in the case above). This illustrates the
kind of property QE is.
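The q-free criterion can be evaluated directly, which is exactly the effective test just mentioned. A tiny Python illustration (the function name is hypothetical):

def has_real_root(a, b, c):
    """Decide ∃y(ay² + by + c = 0) over R using the q-free equivalent from the remark."""
    return ((b * b - 4 * a * c >= 0 and a != 0)
            or (a == 0 and b != 0)
            or (a == 0 and b == 0 and c == 0))

print(has_real_root(1, 0, -2))   # True:  y = ±√2
print(has_real_root(1, 0, 2))    # False: y² = -2 has no real solution
print(has_real_root(0, 0, 1))    # False: 1 = 0 is unsatisfiable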
Lemma 4.3.2. Suppose Σ has QE and ℬ and 𝒞 are models of Σ with a common
substructure 𝒜 (we do not assume 𝒜 ⊨ Σ). Then ℬ and 𝒞 satisfy the same L_A-
sentences.
Proof. Let σ be an L_A-sentence. We have to show ℬ ⊨ σ ⟺ 𝒞 ⊨ σ. Write σ
as ϕ(a) with ϕ(x) an L-formula and a ∈ A^n. Take a q-free L-formula ϕ^qf(x)
that is Σ-equivalent to ϕ(x). Then ℬ ⊨ σ iff ℬ ⊨ ϕ^qf(a) iff 𝒜 ⊨ ϕ^qf(a) (by
Exercise 7 of Section 2.5) iff 𝒞 ⊨ ϕ^qf(a) (by the same exercise) iff 𝒞 ⊨ σ.
Corollary 4.3.3. Suppose Σ has a model, has QE, and there exists an L-
structure that can be embedded into every model of Σ. Then Σ is complete.
Proof. Take an L-structure 𝒜 that can be embedded into every model of Σ. Let
ℬ and 𝒞 be any two models of Σ. So 𝒜 is isomorphic to a substructure of ℬ and
of 𝒞. Then by a slight rewording of the proof of Lemma 4.3.2 (considering only
L-sentences), we see that ℬ and 𝒞 satisfy the same L-sentences. It follows that
Σ is complete.
Remark. We have seen that Vaught’s test can be used to prove completeness.
The above corollary gives another way of establishing completeness, and is often
applicable when the hypothesis of Vaught’s Test is not satisfied. Completeness
is only one of the nice consequences of QE, and the easiest one to explain at this
stage. The main impact of QE is rather that it gives access to the structural
properties of definable sets. This will be reflected in exercises at the end of
this section. Applications of model theory to other areas of mathematics often
involve QE as a key step.
Theorem 4.3.4. To(dense without endpoints) has QE.
Proof. Let (x, y) = (x_1, …, x_n, y) be a tuple of n + 1 distinct variables, and
consider a basic conjunction ϕ(x, y) in L_O. By Lemma 4.3.1 it suffices to show
that ∃yϕ(x, y) is equivalent in To(dense without endpoints) to a q-free formula
ψ(x). We may assume that each conjunct of ϕ is of one of the following types:
y = x_i,  x_i < y,  y < x_i  (1 ≤ i ≤ n).
To justify this, observe that if we had instead a conjunct y ≠ x_i then we could
replace it by (y < x_i) ∨ (x_i < y) and use the fact that ∃y(ϕ_1(x, y) ∨ ϕ_2(x, y))
is equivalent to ∃yϕ_1(x, y) ∨ ∃yϕ_2(x, y). Similarly, a negation ¬(y < x_i) can
be replaced by the disjunction y = x_i ∨ x_i < y, and likewise with negations
¬(x_i < y). Also conjuncts in which y does not appear can be eliminated because
⊢ ∃y(ψ(x) ∧ θ(x, y)) ↔ ψ(x) ∧ ∃yθ(x, y).
Suppose that we have a conjunct y = x_i in ϕ(x, y); so ϕ(x, y) is equivalent to
y = x_i ∧ ϕ′(x, y), where ϕ′(x, y) is a basic conjunction in L_O. Then ∃yϕ(x, y)
is equivalent to ϕ′(x, x_i), and we are done. So we can assume also that ϕ(x, y)
has no conjuncts of the form y = x_i.
After all these reductions, and after rearranging conjuncts, we can assume
that ϕ(x, y) is a conjunction
⋀_{i∈I} x_i < y ∧ ⋀_{j∈J} y < x_j
where I, J ⊆ {1, …, n} and where we allow I or J to be empty. Up till this
point we did not need the density and "no endpoints" axioms, but these come
in now: ∃yϕ(x, y) is equivalent in To(dense without endpoints) to the formula
⋀_{i∈I, j∈J} x_i < x_j.
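The elimination step just carried out is easy to mechanize for a single basic conjunction. A Python sketch (the encoding of conjuncts is an assumption made for this illustration; the case split follows the proof):

def eliminate_y(conjuncts):
    """∃y-elimination for To(dense without endpoints), as in the proof of Theorem 4.3.4.

    Input conjuncts, each concerning y and one of x_1,...,x_n:
      ("eq", i)     stands for  y = x_i
      ("lower", i)  stands for  x_i < y
      ("upper", j)  stands for  y < x_j
    Output: a list of q-free conjuncts, ("lt", i, j) standing for x_i < x_j and
    ("eqx", i, k) standing for x_i = x_k; the empty list stands for "true".
    """
    eqs = [i for (kind, i) in conjuncts if kind == "eq"]
    lowers = [i for (kind, i) in conjuncts if kind == "lower"]
    uppers = [j for (kind, j) in conjuncts if kind == "upper"]
    if eqs:                      # a conjunct y = x_i: substitute x_i for y everywhere
        i = eqs[0]
        return ([("eqx", i, k) for k in eqs[1:]]
                + [("lt", k, i) for k in lowers]
                + [("lt", i, k) for k in uppers])
    # no equations: by density and "no endpoints", ∃y reduces to ⋀ x_i < x_j
    return [("lt", i, j) for i in lowers for j in uppers]

# ∃y(x_1 < y ∧ x_2 < y ∧ y < x_3)  reduces to  x_1 < x_3 ∧ x_2 < x_3:
print(eliminate_y([("lower", 1), ("lower", 2), ("upper", 3)]))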
We mention without proof two examples of QE, and give a complete proof for a
third example in the next section. The following theorem is due to Tarski and
(independently) to Chevalley. It dates from around 1950.
Theorem 4.3.5. ACF has QE.
Clearly, ACF is not complete, since it says nothing about the characteristic: it
doesn't prove 1 + 1 = 0, nor does it prove 1 + 1 ≠ 0. However, ACF(0), which
contains additional axioms forcing the characteristic to be 0, is complete by 4.3.3
and the fact that the ring of integers embeds in every algebraically closed field
of characteristic 0. Tarski also established the following more difficult theorem,
which is one of the key results in real algebraic geometry. (His original proof is
rather long; there is a shorter one due to A. Seidenberg, and an elegant short
proof by A. Robinson using a combination of basic algebra and elementary
model theory.)
Definition. RCF is a set of axioms true in the ordered field (R; <, 0, 1, −, +, ·)
of real numbers. In addition to the ordered field axioms, it has the axiom
∀x(x > 0 → ∃y(x = y^2)) (x, y distinct variables) and for each odd n > 1 the
axiom
∀x_1 … ∀x_n ∃y (y^n + x_1y^{n−1} + ··· + x_n = 0)
where x_1, …, x_n, y are distinct variables.
Theorem 4.3.6. RCF has QE and is complete.
Exercises. In (5) and (6), an L-theory is a set T of L-sentences such that for all
L-sentences σ, if T ⊢ σ, then σ ∈ T. An axiomatization of an L-theory T is a set Σ of
L-sentences such that T = {σ : σ is an L-sentence and Σ ⊢ σ}.
(1) The subsets of C definable in (C; 0, 1, −, +, ·) are exactly the finite subsets of C
and their complements in C. (Hint: use the fact that ACF has QE.)
(2) The subsets of R definable in the model (R; <, 0, 1, −, +, ·) of RCF are exactly
the finite unions of intervals of all kinds (including degenerate intervals with just
one point). (Hint: use the fact that RCF has QE.)
(3) Let Eq^∞ be a set of axioms in the language {∼} (where ∼ is a binary relation
symbol) that say:
(i) ∼ is an equivalence relation;
(ii) every equivalence class is infinite;
(iii) there are infinitely many equivalence classes.
Then Eq^∞ admits QE and is complete. (It is also possible to use Vaught's test
to prove completeness.)
(4) Suppose that a set Σ of L-sentences has QE. Let the language L′ extend L by
new symbols of arity 0, and let Σ′ ⊇ Σ be a set of L′-sentences. Then Σ′ (as a
set of L′-sentences) also has QE.
(5) Suppose the L-theory T has QE. Then T has an axiomatization consisting of
sentences ∀x∃yϕ(x, y) and ∀xψ(x) where ϕ(x, y) and ψ(x) are q-free. (Hint: let
Σ be the set of L-sentences provable from T that have the indicated form; show
that Σ has QE, and is an axiomatization of T.)
(6) Assume the L-theory T has built-in Skolem functions, that is, for each basic
conjunction ϕ(x, y) there are L-terms t_1(x), …, t_k(x) such that
T ⊢ ∃yϕ(x, y) → ϕ(x, t_1(x)) ∨ ··· ∨ ϕ(x, t_k(x)).
Then T has QE, for every ϕ(x, y) there are L-terms t_1(x), …, t_k(x) such that
T ⊢ ∃yϕ(x, y) → ϕ(x, t_1(x)) ∨ ··· ∨ ϕ(x, t_k(x)), and T has an axiomatization
consisting of sentences ∀xψ(x) where ψ(x) is q-free.
4.4 Presburger Arithmetic
In this section we consider in some detail one example of a set of axioms that
has QE, namely "Presburger Arithmetic." Essentially, this is a complete set
of axioms for ordinary arithmetic of integers without multiplication, that is,
the axioms are true in (Z; 0, 1, +, −, <), and prove every sentence true in this
structure. There is a mild complication in trying to obtain this completeness
via QE: one can show (exercise) that for any q-free formula ϕ(x) in the language
{0, 1, +, −, <} there is an N ∈ N such that either (Z; 0, 1, +, −, <) ⊨ ϕ(n) for
all n > N or (Z; 0, 1, +, −, <) ⊨ ¬ϕ(n) for all n > N. In particular, formulas
such as ∃y(x = y + y) and ∃y(x = y + y + y) are not Σ-equivalent to any q-free
formula in this language, for any set Σ of axioms true in (Z; 0, 1, +, −, <).
To overcome this obstacle to QE we augment the language {0, 1, +, −, <}
by new unary relation symbols P_1, P_2, P_3, P_4, … to obtain the language L_PrA
of Presburger Arithmetic (named after the Polish logician Presburger who was
a student of Tarski). We expand (Z; 0, 1, +, −, <) to the L_PrA-structure
Z̃ = (Z; 0, 1, +, −, <, Z, 2Z, 3Z, 4Z, …),
that is, P_n is interpreted as the set nZ. This structure satisfies the set PrA of
Presburger Axioms, which consists of the following sentences:
(i) the axioms of Ab for abelian groups;
(ii) the axioms of To expressing that < is a total order;
(iii) ∀x∀y∀z(x < y → x + z < y + z) (translation invariance of <);
(iv) 0 < 1 ∧ ¬∃y(0 < y ∧ y < 1) (discreteness axiom);
(v) ∀x∃y ⋁_{0≤r<n} x = ny + r1,  n = 1, 2, 3, … (division with remainder);
(vi) ∀x(P_nx ↔ ∃y(x = ny)),  n = 1, 2, 3, … (defining axioms for P_1, P_2, …).
Here we have fixed distinct variables x, y, z for definiteness. In (v) and in the
rest of this section r ranges over integers. Note that (v) and (vi) are infinite
lists of axioms. Here are some elementary facts about models of PrA:
Proposition 4.4.1. Let 𝒜 = (A; 0, 1, +, −, <, P_1^𝒜, P_2^𝒜, P_3^𝒜, …) ⊨ PrA. Then
(1) There is a unique embedding Z̃ → 𝒜; it sends k ∈ Z to k1 ∈ A.
(2) Given any n > 0 we have P_n^𝒜 = n𝒜, where we regard 𝒜 as an abelian group,
and 𝒜/n𝒜 has exactly n elements, namely 0 + n𝒜, …, (n − 1)1 + n𝒜.
(3) For any n > 0 and a ∈ A, exactly one of the elements a, a + 1, …, a + (n − 1)1
lies in n𝒜.
(4) 𝒜 is torsion-free as an abelian group.
Theorem 4.4.2. PrA has QE.
Proof. Let (x, y) = (x_1, …, x_n, y) be a tuple of n + 1 distinct variables, and
consider a basic conjunction ϕ(x, y) in L_PrA. By Lemma 4.3.1 it suffices to show
that ∃yϕ(x, y) is PrA-equivalent to a q-free formula ψ(x). We may assume that
each conjunct of ϕ is of one of the following types, for some integer m ≥ 1 and
L_PrA-term t(x):
my = t(x),  my < t(x),  t(x) < my,  P_n(my + t(x)).
To justify this assumption, observe that if we had instead a conjunct my ≠ t(x)
then we could replace it by (my < t(x)) ∨ (t(x) < my) and use the fact that
∃y(ϕ_1(x, y) ∨ ϕ_2(x, y)) is equivalent to ∃yϕ_1(x, y) ∨ ∃yϕ_2(x, y). Similarly a
negation ¬P_n(my + t(x)) can be replaced by the disjunction
P_n(my + t(x) + 1) ∨ ··· ∨ P_n(my + t(x) + (n − 1)1).
Also conjuncts in which y does not appear can be eliminated because
⊢ ∃y(ψ(x) ∧ θ(x, y)) ↔ ψ(x) ∧ ∃yθ(x, y).
Since PrA ⊢ P_n(z) ↔ P_{rn}(rz) for r > 0 we can replace P_n(my + t(x)) by
P_{rn}(rmy + rt(x)). Also, for r > 0 we can replace my = t(x) by rmy = rt(x),
and likewise with my < t(x) and t(x) < my. We can therefore assume that
all conjuncts have the same "coefficient" m in front of the variable y. After all
these reductions, and after rearranging conjuncts, ϕ(x, y) has the form
⋀_{h∈H} my = t_h(x) ∧ ⋀_{i∈I} t_i(x) < my ∧ ⋀_{j∈J} my < t_j(x) ∧ ⋀_{k∈K} P_{n(k)}(my + t_k(x))
where m > 0 and H, I, J, K are disjoint finite index sets. We allow some of
these index sets to be empty, in which case the corresponding conjunction can
be left out.
Suppose that H ≠ ∅, say h′ ∈ H. Then the formula ∃yϕ(x, y) is PrA-
equivalent to
P_m(t_{h′}(x)) ∧ ⋀_{h∈H} t_h(x) = t_{h′}(x) ∧ ⋀_{i∈I} t_i(x) < t_{h′}(x) ∧ ⋀_{j∈J} t_{h′}(x) < t_j(x) ∧ ⋀_{k∈K} P_{n(k)}(t_{h′}(x) + t_k(x)).
For the rest of the proof we assume that H = ∅.
4.4. PRESBURGER ARITHMETIC 71
To understand what follows, it may help to focus on the model Z̃, although
the arguments go through for arbitrary models of PrA. Fix any value a ∈ Z^n
of x. Consider the system of linear congruences (with "unknown" y)
P_{n(k)}(my + t_k(a)),  (k ∈ K),
which in more familiar notation would be written as
my + t_k(a) ≡ 0 mod n(k),  (k ∈ K).
The solutions in Z of this system form a union of congruence classes modulo
N := ∏_{k∈K} n(k), where as usual we put N = 1 for K = ∅. This suggests
replacing y successively by Nz, 1 + Nz, …, (N − 1)1 + Nz. Our precise claim is
that ∃yϕ(x, y) is PrA-equivalent to the formula θ(x) given by
⋁_{r=0}^{N−1} ( ⋀_{k∈K} P_{n(k)}((mr)1 + t_k(x)) ∧ ∃z( ⋀_{i∈I} t_i(x) < m(r1 + Nz) ∧ ⋀_{j∈J} m(r1 + Nz) < t_j(x) ) ).
We prove this equivalence with θ(x) as follows. Suppose
𝒜 = (A, …) ⊨ PrA,  a = (a_1, …, a_n) ∈ A^n.
We have to show that 𝒜 ⊨ ∃yϕ(a, y) if and only if 𝒜 ⊨ θ(a). So let b ∈ A be
such that 𝒜 ⊨ ϕ(a, b). Division with remainder yields a c ∈ A and an r such
that b = r1 + Nc and 0 ≤ r ≤ N − 1. Note that then for k ∈ K,
mb + t_k(a) = m(r1 + Nc) + t_k(a) = (mr)1 + (mN)c + t_k(a) ∈ n(k)𝒜
and so 𝒜 ⊨ P_{n(k)}((mr)1 + t_k(a)). Also,
t_i(a) < m(r1 + Nc) for every i ∈ I,  m(r1 + Nc) < t_j(a) for every j ∈ J.
Therefore 𝒜 ⊨ θ(a) with ∃z witnessed by c. For the converse, suppose that the
disjunct of θ(a) indexed by a certain r ∈ {0, …, N − 1} is true in 𝒜, with ∃z
witnessed by c ∈ A. Then put b = r1 + Nc and we get 𝒜 ⊨ ϕ(a, b). This proves
the claimed equivalence.
Now that we have proved the claim we have reduced to the situation (after
changing notation) where H = K = ∅ (i.e. no equations and no congruences).
So ϕ(x, y) now has the form
⋀_{i∈I} t_i(x) < my ∧ ⋀_{j∈J} my < t_j(x).
If J = ∅ or I = ∅ then PrA ⊢ ∃yϕ(x, y) ↔ ⊤. This leaves the case where
both I and J are nonempty. So suppose 𝒜 ⊨ PrA and that A is the underlying
set of 𝒜. For each value a ∈ A^n of x there is i_0 ∈ I such that t_{i_0}(a) is maximal
among the t_i(a) with i ∈ I, and a j_0 ∈ J such that t_{j_0}(a) is minimal among the
t_j(a) with j ∈ J. Moreover each interval of m successive elements of A contains
an element of m𝒜. Therefore ∃yϕ(x, y) is equivalent in 𝒜 to the disjunction
over all pairs (i_0, j_0) ∈ I × J of the q-free formula
⋀_{i∈I} t_i(x) ≤ t_{i_0}(x) ∧ ⋀_{j∈J} t_{j_0}(x) ≤ t_j(x) ∧ ⋁_{r=1}^{m} ( P_m(t_{i_0}(x) + r1) ∧ (t_{i_0}(x) + r1 < t_{j_0}(x)) ).
This completes the proof. Note that L_PrA does not contain the relation symbol
≤; we just write t ≤ t′ to abbreviate (t < t′) ∨ (t = t′).
Remark. It now follows from Corollary 4.3.3 that PrA is complete: it has QE
and Z̃ can be embedded in every model.
Discussion. The careful reader will have noticed that the elimination procedure
in the proof above is constructive: it describes an algorithm that, given any basic
conjunction ϕ(x, y) in L_PrA as input, constructs a q-free formula ψ(x) of L_PrA
such that PrA ⊢ ∃yϕ(x, y) ↔ ψ(x). In view of the equally constructive proof
of Lemma 4.3.1 this yields an algorithm that, given any L_PrA-formula ϕ(x) as
input, constructs a q-free L_PrA-formula ϕ_qf(x) such that PrA ⊢ ϕ(x) ↔ ϕ_qf(x).
(Thus PrA has effective QE.)
In particular, this last algorithm constructs for any L_PrA-sentence σ a q-free
L_PrA-sentence σ_qf such that PrA ⊢ σ ↔ σ_qf. Since we also have an obvious
algorithm that, given any q-free L_PrA-sentence σ_qf, checks whether σ_qf is true
in Z̃, this yields an algorithm that, given any L_PrA-sentence σ, checks whether
σ is true in Z̃. Thus the structure Z̃ is decidable. (A precise definition of
decidability will be given in the next chapter.) The algorithms above can
easily be implemented by computer programs.
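As a small illustration of that last remark, here is a minimal sketch (Python) of the "obvious algorithm" that evaluates a quantifier-free L_PrA-sentence in Z̃. The term/formula encoding and all names below are made up for this sketch; they are not part of the notes.

```python
# Hypothetical encoding: a closed L_PrA-term is the int 0 or 1, or ('+', t1, t2)
# or ('-', t1, t2); a quantifier-free sentence is built from the atoms
# ('<', t1, t2), ('=', t1, t2), ('P', d, t) using 'not', 'and', 'or'.

def eval_term(t):
    if isinstance(t, int):
        return t
    op, *args = t
    if op == '+': return eval_term(args[0]) + eval_term(args[1])
    if op == '-': return eval_term(args[0]) - eval_term(args[1])
    raise ValueError(op)

def true_in_Z(s):
    op, *args = s
    if op == '<':   return eval_term(args[0]) < eval_term(args[1])
    if op == '=':   return eval_term(args[0]) == eval_term(args[1])
    if op == 'P':   return eval_term(args[1]) % args[0] == 0   # P_d(t): d divides t
    if op == 'not': return not true_in_Z(args[0])
    if op == 'and': return all(true_in_Z(x) for x in args)
    if op == 'or':  return any(true_in_Z(x) for x in args)
    raise ValueError(op)

# P_3(1 + (1 + 1)) or 1 < 0 : true in Z, since 3 divides 3.
print(true_in_Z(('or', ('P', 3, ('+', 1, ('+', 1, 1))), ('<', 1, 0))))   # True
```

Combined with an implementation of the elimination procedure, this is exactly the decision method for Z̃ described above.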
Let some L-structure 𝒜 be given, and suppose we have an algorithm for
deciding whether any given L-sentence is true in 𝒜. Even if this algorithm can
be implemented by a computer program, it does not guarantee that the program
is of practical use, or feasible: on some moderately small inputs it might have
to run for 10^100 years before producing an output. This bad behaviour is not
at all unusual: no (classical, sequential) algorithm for deciding the truth of
L_PrA-sentences in Z̃ is feasible in a precise technical sense. Results of this kind
belong to complexity theory; this is an area where mathematics (logic, number
theory, ...) and computer science interact.
There do exist feasible integer linear programming algorithms that decide
the truth in Z̃ of sentences of a special form, and this shows another (very
practical) side of complexity theory.
A positive impact of QE is that it yields structural properties of definable
sets, as we discuss next for Z̃.
Definition. Let d be a positive integer. An arithmetic progression of modulus
d is a set of the form

{k ∈ Z : k ≡ r mod d, α < k < β},

where r ∈ {0, ..., d − 1}, α, β ∈ Z ∪ {−∞, +∞}, α < β.
We leave the proof of the next lemma to the reader.
Lemma 4.4.3. Arithmetic progressions have the following properties.
(1) If P, Q ⊆ Z are arithmetic progressions of moduli d and e respectively, then
P ∩ Q is an arithmetic progression of modulus lcm(d, e).
(2) If P ⊆ Z is an arithmetic progression, then Z \ P is a finite union of
arithmetic progressions.
(3) Let T be the collection of all finite unions of arithmetic progressions. Then
T contains with any two sets X, Y also X ∪ Y, X ∩ Y, X \ Y.
Corollary 4.4.4. Let S ⊆ Z. Then

S is definable in Z̃ ⟺ S is a finite union of arithmetic progressions.

Proof. (⇐) It suffices to show that each arithmetic progression is definable in Z̃;
this is straightforward and left to the reader. (⇒) By QE and Lemma 4.4.3 it
suffices to show that each atomic L_PrA-formula ϕ(x) defines in Z̃ a finite union
of arithmetic progressions. Every atomic formula ϕ(x) different from ⊤ and ⊥
has the form t_1(x) < t_2(x) or the form t_1(x) = t_2(x) or the form P_d(t(x)), where
t_1(x), t_2(x) and t(x) are L_PrA-terms. The first two kinds reduce to t(x) > 0
and t(x) = 0 respectively (by subtraction). It follows that we may assume that
ϕ(x) has the form kx + l1 > 0, or the form kx + l1 = 0, or the form P_d(kx + l1),
where k, l ∈ Z. Considering cases (k = 0, k ≠ 0 and k ≡ 0 mod d, and so on),
we see that such a ϕ(x) defines an arithmetic progression.
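As a concrete check of the last case (a sketch, with a hypothetical helper, in Python): the set defined by P_d(kx + l1) in Z̃ is {x ∈ Z : kx + l ≡ 0 mod d}, which is empty unless gcd(k, d) divides l, and otherwise is a single congruence class modulo d/gcd(k, d), hence an arithmetic progression.

```python
from math import gcd

def progression_defined_by_P(k, l, d):
    """The subset of Z defined by P_d(k*x + l*1), i.e. {x : kx + l ≡ 0 (mod d)}.
    Returns None (empty set) or (r, d') meaning the progression {x : x ≡ r mod d'}."""
    g = gcd(k, d)
    if l % g != 0:
        return None
    dp = d // g
    # the solutions form one class mod dp; brute-force its representative
    r = next(x for x in range(dp) if (k * x + l) % d == 0)
    return (r, dp)

print(progression_defined_by_P(4, 2, 6))   # (1, 3): the set {x : x ≡ 1 mod 3}
```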
Exercises.
(1) Prove that 2Z cannot be defined in the structure (Z; 0, 1, +, −, <) by a q-free
formula of the language {0, 1, +, −, <}.
4.5 Skolemization and Extension by Definition
In this section L is a sublanguage of L′, Σ a set of L-sentences, and Σ′ a set of
L′-sentences with Σ ⊆ Σ′.
Definition. Σ′ is said to be conservative over Σ (or a conservative extension
of Σ) if for every L-sentence σ

Σ′ ⊢_{L′} σ ⟺ Σ ⊢_L σ.
Here (=⇒) is the significant direction, (⇐=) is automatic.
Remarks
(1) Suppose Σ′ is conservative over Σ. Then Σ is consistent if and only if Σ′ is
consistent.
(2) If each model of Σ has an L′-expansion to a model of Σ′, then Σ′ is conser-
vative over Σ. (This follows easily from the Completeness Theorem.)
Proposition 4.5.1. Let ϕ(x_1, ..., x_n, y) be an L-formula. Let f_ϕ be an n-ary
function symbol not in L, and put L′ := L ∪ {f_ϕ} and

Σ′ := Σ ∪ {∀x_1 ... ∀x_n (∃yϕ(x_1, ..., x_n, y) → ϕ(x_1, ..., x_n, f_ϕ(x_1, ..., x_n)))}.

Then Σ′ is conservative over Σ.
Proof. Let 𝒜 be any model of Σ. By remark (2) it suffices to obtain an L′-
expansion 𝒜′ of 𝒜 that makes the new axiom about f_ϕ true. We choose a
function f_ϕ^{𝒜′} : A^n → A as follows. For any (a_1, ..., a_n) ∈ A^n, if there is a
b ∈ A such that 𝒜 ⊨ ϕ(a_1, ..., a_n, b) then we let f_ϕ^{𝒜′}(a_1, ..., a_n) be such an
element b, and if no such b exists, we let f_ϕ^{𝒜′}(a_1, ..., a_n) be an arbitrary element
of A. Thus

𝒜′ ⊨ ∀x_1 ... ∀x_n (∃yϕ(x_1, ..., x_n, y) → ϕ(x_1, ..., x_n, f_ϕ(x_1, ..., x_n)))

as desired.
Remark. A function f_ϕ^{𝒜′} as in the proof is called a Skolem function in 𝒜 for
the formula ϕ(x_1, ..., x_n, y). It yields a "witness" for each relevant n-tuple.
Definition. Given an L-formula ϕ(x_1, ..., x_m), let R_ϕ be an m-ary relation
symbol not in L, and put L_ϕ := L ∪ {R_ϕ} and Σ_ϕ := Σ ∪ {ρ_ϕ} where ρ_ϕ is

∀x_1 ... ∀x_m (ϕ(x_1, ..., x_m) ↔ R_ϕ(x_1, ..., x_m)).

The sentence ρ_ϕ is called the defining axiom for R_ϕ. We call Σ_ϕ an extension
of Σ by a definition for the relation symbol R_ϕ.
Remark. Each model 𝒜 of Σ expands uniquely to a model of Σ_ϕ. We denote
this expansion by 𝒜_ϕ. Every model of Σ_ϕ is of the form 𝒜_ϕ for a unique model
𝒜 of Σ.
Proposition 4.5.2. Let ϕ = ϕ(x_1, ..., x_m) be as above. Then we have:
(1) Σ_ϕ is conservative over Σ.
(2) For each L_ϕ-formula ψ(y) where y = (y_1, ..., y_n) there is an L-formula
ψ′(y), called a translation of ψ(y), such that Σ_ϕ ⊢ ψ(y) ↔ ψ′(y).
(3) Suppose 𝒜 ⊨ Σ and S ⊆ A^m. Then S is 0-definable in 𝒜 if and only if S
is 0-definable in 𝒜_ϕ, and the same with definable instead of 0-definable.
Proof. (1) is clear from the remark preceding the proposition, and (3) is im-
mediate from (2). To prove (2) we observe that by the Equivalence Theorem
(3.3.2) it suffices to prove it for formulas ψ(y) = R_ϕ t_1(y) ... t_m(y) where the t_i
are L-terms. In this case we can take

∃u_1 ... ∃u_m (u_1 = t_1(y) ∧ ... ∧ u_m = t_m(y) ∧ ϕ(u_1/x_1, ..., u_m/x_m))

as ψ′(y), where the variables u_1, ..., u_m do not appear in ϕ and are not among
y_1, ..., y_n.
Definition. Suppose ϕ(x, y) is an L-formula where (x, y) = (x_1, ..., x_m, y) is
a tuple of m + 1 distinct variables, such that Σ ⊢ ∀x_1 ... ∀x_m ∃!y ϕ(x, y), where
∃!y ϕ(x, y) abbreviates ∃y(ϕ(x, y) ∧ ∀z(ϕ(x, z) → y = z)), with z a variable
not occurring in ϕ and not among x_1, ..., x_m, y. Let f_ϕ be an m-ary function
symbol not in L and put L′ := L ∪ {f_ϕ} and Σ′ := Σ ∪ {γ_ϕ} where γ_ϕ is

∀x_1 ... ∀x_m ϕ(x, f_ϕ(x)).

The sentence γ_ϕ is called the defining axiom for f_ϕ. We call Σ′ an extension of
Σ by a definition of the function symbol f_ϕ.
Remark. Each model 𝒜 of Σ expands uniquely to a model of Σ′. We denote
this expansion by 𝒜′. Every model of Σ′ is of the form 𝒜′ for a unique model
𝒜 of Σ. Proposition 4.5.2 goes through when L_ϕ, Σ_ϕ, and 𝒜_ϕ are replaced by
L′, Σ′, and 𝒜′, respectively. We leave the proof of this as an exercise. (Hint:
reduce the analogue of (2) to the case of an unnested formula.)
Before defining the next concept we consider a simple case. Let (A; <) be a
totally ordered set, A ≠ ∅. By a definition of (A; <) in a structure B we mean
an injective map δ : A → B^k, with k ∈ N, such that
(i) δ(A) ⊆ B^k is definable in B.
(ii) The set {(δ(a), δ(b)) : a < b in A} ⊆ (B^k)^2 = B^{2k} is definable in B.
For example, we have a definition δ : Z → N^2 of the ordered set (Z; <) of integers
in the additive monoid (N; 0, +) of natural numbers, given by δ(n) = (n, 0) and
δ(−n) = (0, n). (We leave it to the reader to check this.)
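As a quick sanity check of this example (a sketch, not part of the notes, in Python): δ(a) = (p, q) codes a = p − q, so a < b exactly when p + s < q + r for δ(b) = (r, s); this can be verified on an initial segment of Z.

```python
def delta(z):                      # n >= 0 goes to (n, 0), -n goes to (0, n)
    return (z, 0) if z >= 0 else (0, -z)

Z = range(-50, 51)
image = [delta(z) for z in Z]
assert len(set(image)) == len(image)                    # delta injective on this range
lt_image = {(delta(a), delta(b)) for a in Z for b in Z if a < b}
lt_coded = {(u, v) for u in image for v in image if u[0] + v[1] < u[1] + v[0]}
assert lt_image == lt_coded
print("delta transports < on Z to the relation p + s < q + r on N^2")
```

The remaining exercise is then to express this relation, and the image of δ, by formulas in the language of (N; 0, +).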
It can be shown with tools a little beyond the scope of these notes that no
infinite totally ordered set can be defined in the field of complex numbers.
In order to extend this notion to arbitrary structures 𝒜 = (A; ...) we use the
following notation and terminology. Let X, Y be sets, f : X → Y a map, and
S ⊆ X^n. Then the f-image of S is the subset

f(S) := {(f(x_1), ..., f(x_n)) : (x_1, ..., x_n) ∈ S}

of Y^n. Also, given k ∈ N, we use the bijection

((y_11, ..., y_1k), ..., (y_n1, ..., y_nk)) ↦ (y_11, ..., y_1k, ..., y_n1, ..., y_nk)

from (Y^k)^n to Y^{nk} to identify these two sets.
Definition. A definition of an L-structure 𝒜 in a structure B is an injective
map δ : A → B^k, with k ∈ N, such that
(i) δ(A) ⊆ B^k is definable in B.
(ii) For each m-ary R ∈ L^r the set δ(R^𝒜) ⊆ (B^k)^m = B^{mk} is definable in B.
(iii) For each n-ary f ∈ L^f the set δ(graph of f^𝒜) ⊆ (B^k)^{n+1} = B^{(n+1)k} is
definable in B.
Remark. Here B is a structure for a language L′ that may have nothing to
do with the language L. Replacing everywhere "definable" by "0-definable", we
get the notion of a 0-definition of 𝒜 in B.
A more general way of viewing a structure 𝒜 as in some sense living inside
a structure B is to allow δ to be an injective map from A into B^k/E for some
equivalence relation E on B^k that is definable in B, and imposing suitable
conditions. Our special case corresponds to E = equality on B^k. (We do not
develop this idea here further: the right setting for it would be many-sorted
structures, rather than our one-sorted structures.)
Recall that by Lagrange's "four squares" theorem we have

N = {a² + b² + c² + d² : a, b, c, d ∈ Z}.

It follows that the inclusion map N → Z is a 0-definition of (N; 0, +, ·, <) in
(Z; 0, 1, +, −, ·). The bijection

a + bi ↦ (a, b) : C → R²  (a, b ∈ R)

is a 0-definition of the field (C; 0, 1, +, −, ·) of complex numbers in the field
(R; 0, 1, +, −, ·) of real numbers.
On the other hand, there is no definition of the field of real numbers in
the field of complex numbers: this follows from the fact, stated earlier without
proof, that there is no definition of any infinite totally ordered set in the field of
complex numbers. (A special case says that R, considered as a subset of C, is
not definable in the field of complex numbers; this follows easily from the fact
that ACF admits QE, see exercise 1.) Indeed, one can show that the only fields
definable in the field of complex numbers are finite fields and fields isomorphic
to the field of complex numbers itself.
Proposition 4.5.3. Let δ : A → B^k be a 0-definition of the L-structure 𝒜 in the
L′-structure B. Let x_1, ..., x_n be distinct variables (viewed as ranging over A),
and let x_11, ..., x_1k, ..., x_n1, ..., x_nk be nk distinct variables (viewed as ranging
over B). Then we have a map that assigns to each L-formula ϕ(x_1, ..., x_n) an
L′-formula δϕ(x_11, ..., x_1k, ..., x_n1, ..., x_nk) such that

δ(ϕ^𝒜) = (δϕ)^B ⊆ B^{nk}.

In particular, for n = 0 the map above assigns to each L-sentence σ an
L′-sentence δσ such that 𝒜 ⊨ σ ⟺ B ⊨ δσ.
Chapter 5
Computability, Decidability,
and Incompleteness
In this chapter we prove Gödel's famous Incompleteness Theorem. Consider the
structure N := (N; 0, S, +, ·, <), where S : N → N is the successor function. A
simple form of the incompleteness theorem is as follows.
Let Σ be a computable set of sentences in the language of N and true in N.
Then there exists a sentence σ in that language such that N ⊨ σ, but Σ ⊬ σ.
In other words, no computable set of axioms in the language of N and true
in N can be complete, hence the name Incompleteness Theorem. The only
unexplained terminology here is “computable.” Intuitively, “Σ is computable”
means that there is an algorithm to recognize whether any given sentence in
the language of N belongs to Σ. (It seems reasonable to require this of an
axiom system for N.) Thus we begin this chapter with developing the notion of
computability. The interest of this notion is tied to the Church-Turing Thesis
as explained in Section 5.2, and goes far beyond incompleteness. For example,
computability plays a role in combinatorial group theory (Higman’s Theorem)
and in certain diophantine questions (Hilbert’s 10th problem), not to mention
its role in the ideological underpinnings of computer science.
5.1 Computable Functions
First some notation. We let µx(..x..) denote the least x ∈ N for which ..x.. holds.
Here ..x.. is some condition on natural numbers x. For example µx(x² > 7) = 3.
We will only use this notation when the meaning of ..x.. is clear, and the set
{x ∈ N : ..x..} is non-empty. For a ∈ N we also let µx_{<a}(..x..) be the least
x < a in N such that ..x.. holds if there is such an x, and if there is no such x
we put µx_{<a}(..x..) := a. For example, µx_{<4}(x² > 3) = 2 and µx_{<2}(x > 5) = 2.
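A sketch of the two operators (Python; the unbounded one loops forever when no x satisfies the condition, which is why the text insists the set be non-empty):

```python
from itertools import count

def mu(cond):
    """Least x in N with cond(x); diverges if there is no such x."""
    return next(x for x in count() if cond(x))

def mu_below(a, cond):
    """Least x < a with cond(x), and a itself if there is no such x."""
    return next((x for x in range(a) if cond(x)), a)

print(mu(lambda x: x**2 > 7))            # 3
print(mu_below(4, lambda x: x**2 > 3))   # 2
print(mu_below(2, lambda x: x > 5))      # 2
```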
Definition. For R ⊆ N^n, we define χ_R : N^n → N by

χ_R(a) = 1 if a ∈ R,  and  χ_R(a) = 0 if a ∉ R.

Think of such R as an n-ary relation on N. We call χ_R the characteristic
function of R, and often write R(a_1, ..., a_n) instead of (a_1, ..., a_n) ∈ R.
Example. χ_<(m, n) = 1 iff m < n, and χ_<(m, n) = 0 iff m ≥ n.
Definition. For i = 1, ..., n we define I^n_i : N^n → N by I^n_i(a_1, ..., a_n) = a_i.
These functions are called coordinate functions.
Definition. The computable functions (or recursive functions) are the functions
from N^n to N (for n = 0, 1, 2, ...) obtained by inductively applying the following
rules:
(R1) + : N² → N, · : N² → N, χ_≤ : N² → N, and the coordinate functions I^n_i
(for each n and i = 1, ..., n) are computable.
(R2) If G : N^m → N is computable and H_1, ..., H_m : N^n → N are computable,
then so is the function F = G(H_1, ..., H_m) : N^n → N defined by

F(a) = G(H_1(a), ..., H_m(a)).

(R3) If G : N^{n+1} → N is computable, and for all a ∈ N^n there exists x ∈ N
such that G(a, x) = 0, then the function F : N^n → N given by

F(a) = µx(G(a, x) = 0)

is computable.
A relation R ⊆ N^n is said to be computable (or recursive) if its characteristic
function χ_R : N^n → N is computable.
Example. If F : N³ → N and G : N² → N are computable, then so is the
function H : N⁴ → N defined by H(x_1, x_2, x_3, x_4) = F(G(x_1, x_4), x_2, x_4). This
follows from (R2) by noting that H(x) = F(G(I⁴_1(x), I⁴_4(x)), I⁴_2(x), I⁴_4(x)) where
x = (x_1, x_2, x_3, x_4). We shall use this device from now on in many proofs, but
only tacitly. (The reader should of course notice when we do so.)
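One can mimic the three generating rules directly as higher-order functions. The following sketch (Python; not part of the formal development, names made up) implements (R1)-(R3) and reproduces the example just given:

```python
from itertools import count

# (R1) basic functions
add    = lambda a, b: a + b
mul    = lambda a, b: a * b
chi_le = lambda a, b: 1 if a <= b else 0
def I(n, i):                        # coordinate function I^n_i (1-based i)
    return lambda *a: a[i - 1]

# (R2) composition: F(a) = G(H_1(a), ..., H_m(a))
def compose(G, *Hs):
    return lambda *a: G(*(H(*a) for H in Hs))

# (R3) minimalization: F(a) = mu x (G(a, x) = 0); total only if such x exists
def minimalize(G):
    return lambda *a: next(x for x in count() if G(*a, x) == 0)

# The example above: H(x1, x2, x3, x4) = F(G(x1, x4), x2, x4), built via (R2)
F = lambda u, v, w: u + v * w
G = mul
H = compose(F, compose(G, I(4, 1), I(4, 4)), I(4, 2), I(4, 4))
print(H(2, 3, 99, 5))   # F(G(2, 5), 3, 5) = 10 + 3*5 = 25
```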
From (R1), (R2) and (R3) we derive further rules for obtaining computable
functions. This is mostly an exercise in programming.
Lemma 5.1.1. Let H_1, ..., H_m : N^n → N and R ⊆ N^m be computable. Then
R(H_1, ..., H_m) ⊆ N^n is computable, where for a ∈ N^n we put

R(H_1, ..., H_m)(a) ⟺ R(H_1(a), ..., H_m(a)).
Proof. Observe that χ_{R(H_1,...,H_m)} = χ_R(H_1, ..., H_m). Now apply (R2).
Lemma 5.1.2. The functions χ_≥ and χ_= on N² are computable, as is the
constant function c^n_0 : N^n → N where c^n_0(a) = 0 for all n and a ∈ N^n.
Proof. The function χ_≥ is computable because

χ_≥(m, n) = χ_≤(n, m) = χ_≤(I²_2(m, n), I²_1(m, n)),

which enables us to apply (R1) and (R2). Similarly, χ_= is computable:

χ_=(m, n) = χ_≤(m, n) · χ_≥(m, n).

To handle c^n_0 we observe

c^n_0(a) = µx(I^{n+1}_{n+1}(a, x) = 0).

For k ∈ N we define the constant function c^n_k : N^n → N by c^n_k(a) = k.
Lemma 5.1.3. Every constant function c^n_k is computable.
Proof. This is true for k = 0 (and all n) by Lemma 5.1.2. Next, observe that

c^n_{k+1}(a) = µx(c^n_k(a) < x) = µx(χ_≥(c^{n+1}_k(a, x), I^{n+1}_{n+1}(a, x)) = 0)

for a ∈ N^n, so the general result follows by induction on k.
Let P, Q be n-ary relations on N. Then we can form the n-ary relations

¬P := N^n \ P,  P ∨ Q := P ∪ Q,  P ∧ Q := P ∩ Q,
P → Q := (¬P) ∨ Q,  P ↔ Q := (P → Q) ∧ (Q → P)

on N.
Lemma 5.1.4. Suppose P, Q are computable. Then ¬P, P ∨ Q, P ∧ Q, P → Q
and P ↔ Q are also computable.
Proof. Let a ∈ N^n. Then ¬P(a) iff χ_P(a) = 0 iff χ_P(a) = c^n_0(a), so χ_{¬P}(a) =
χ_=(χ_P(a), c^n_0(a)). Hence ¬P is computable by (R2) and Lemma 5.1.2. Next,
the relation P ∧ Q is computable since χ_{P∧Q} = χ_P · χ_Q. By De Morgan's Law,
P ∨ Q = ¬(¬P ∧ ¬Q). Thus P ∨ Q is computable. The rest is clear.
Lemma 5.1.5. The binary relations <, ≤, =, >, ≥, ≠ on N are computable.
Proof. The relations ≥, ≤ and = have already been taken care of by Lemma
5.1.2 and (R1). The remaining relations are complements of these three, so by
Lemma 5.1.4 they are also computable.
Lemma 5.1.6. (Definition by Cases) Let R_1, ..., R_k ⊆ N^n be computable such
that for each a ∈ N^n exactly one of R_1(a), ..., R_k(a) holds, and suppose that
G_1, ..., G_k : N^n → N are computable. Then G : N^n → N given by

G(a) = G_1(a) if R_1(a),  ...,  G(a) = G_k(a) if R_k(a)

is computable.
Proof. This follows from G = G_1 · χ_{R_1} + ··· + G_k · χ_{R_k}.
Lemma 5.1.7. (Definition by Cases) Let R_1, ..., R_k ⊆ N^n be computable such
that for each a ∈ N^n exactly one of R_1(a), ..., R_k(a) holds. Let P_1, ..., P_k ⊆ N^n
be computable. Then the relation P ⊆ N^n defined by

P(a) ⟺ P_1(a) if R_1(a),  ...,  P(a) ⟺ P_k(a) if R_k(a)

is computable.
Proof. Use that P = (P_1 ∧ R_1) ∨ ··· ∨ (P_k ∧ R_k).
Lemma 5.1.8. Let R ⊆ N^{n+1} be computable such that for all a ∈ N^n there
exists x ∈ N with (a, x) ∈ R. Then the function F : N^n → N given by

F(a) = µxR(a, x)

is computable.
Proof. Note that F(a) = µx(χ_{¬R}(a, x) = 0) and apply (R3).
Here is a nice consequence of 5.1.5 and 5.1.8.
Lemma 5.1.9. Let F : N^n → N. Then F is computable if and only if its graph
(a subset of N^{n+1}) is computable.
Proof. Let R ⊆ N^{n+1} be the graph of F. Then for all a ∈ N^n and b ∈ N,

R(a, b) ⟺ F(a) = b,  F(a) = µxR(a, x),

from which the lemma follows immediately.
Lemma 5.1.10. If R ⊆ N^{n+1} is computable, then the function F_R : N^{n+1} → N
defined by F_R(a, y) = µx_{<y}R(a, x) is computable.
Proof. Use that F_R(a, y) = µx(R(a, x) or x = y).
Some notation: below we use the boldface symbol ∃ as shorthand for "there
exists a natural number"; likewise, we use the boldface symbol ∀ to abbreviate "for
all natural numbers." These abbreviation symbols should not be confused with
the logical symbols ∃ and ∀.
Lemma 5.1.11. Suppose R ⊆ N^{n+1} is computable. Let P, Q ⊆ N^{n+1} be the
relations defined by

P(a, y) ⟺ ∃x_{<y} R(a, x),
Q(a, y) ⟺ ∀x_{<y} R(a, x),

for (a, y) = (a_1, ..., a_n, y) ∈ N^{n+1}. Then P and Q are computable.
Proof. Using the notation and results from Lemma 5.1.10 we note that P(a, y)
iff F_R(a, y) < y. Hence χ_P(a, y) = χ_<(F_R(a, y), y). For Q, note that Q(a, y)
iff ¬∃x_{<y} ¬R(a, x).
Lemma 5.1.12. The function ∸ : N² → N defined by

a ∸ b = a − b if a ≥ b,  and  a ∸ b = 0 if a < b

is computable.
Proof. Use that a ∸ b = µx(b + x = a or a < b).
The results above imply easily that many familiar functions are computable.
But is the exponential function n ↦ 2^n computable? It certainly is in the
intuitive sense: we know how to compute (in principle) its value at any given
argument. It is not that obvious from what we have proved so far that it is
computable in our precise sense. We now develop some coding tricks due to
Gödel that enable us to prove routinely that functions like 2^x are computable
according to our definition of "computable function".
Definition. Define the function Pair : N² → N by

Pair(x, y) := (x + y)(x + y + 1)/2 + x.

We call Pair the pairing function.
Lemma 5.1.13. The function Pair is bijective and computable.
Proof. Exercise.
Definition. Since Pair is a bijection we can define functions Left, Right : N → N
by

Pair(x, y) = a ⟺ Left(a) = x and Right(a) = y.

The reader should check that Left(a), Right(a) ≤ a for a ∈ N, and Left(a) < a
if 0 < a ∈ N.
Lemma 5.1.14. The functions Left and Right are computable.
Proof. Use 5.1.9 in combination with

Left(a) = µx(∃y_{<a+1} Pair(x, y) = a)  and  Right(a) = µy(∃x_{<a+1} Pair(x, y) = a).
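A sketch of the pairing machinery (Python); Left and Right are implemented by the same bounded searches used in the proof:

```python
def pair(x, y):
    """The pairing function (x + y)(x + y + 1)/2 + x: a bijection N^2 -> N."""
    return (x + y) * (x + y + 1) // 2 + x

def left(a):
    return next(x for x in range(a + 1)
                if any(pair(x, y) == a for y in range(a + 1)))

def right(a):
    return next(y for y in range(a + 1)
                if any(pair(x, y) == a for x in range(a + 1)))

a = pair(3, 5)
print(a, left(a), right(a))        # 39 3 5 -- the pair is recovered
assert all(pair(left(n), right(n)) == n for n in range(200))
```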
For a, b, c ∈ Z we have (by definition): a ≡ b mod c ⇐⇒ a −b ∈ cZ.
Lemma 5.1.15. The ternary relation a ≡ b mod c on N is computable.
Proof. Use that for a, b, c ∈ N we have

a ≡ b mod c ⟺ (∃x_{<a+1} a = x·c + b or ∃x_{<b+1} b = x·c + a).
We can now introduce Gödel's function β : N² → N.
Definition. For a, i ∈ N we let β(a, i) be the remainder of Left(a) upon division
by 1 + (i + 1)·Right(a), that is,

β(a, i) := µx(x ≡ Left(a) mod 1 + (i + 1)·Right(a)).
Proposition 5.1.16. The function β is computable, and β(a, i) ≤ a ∸ 1 for all a, i ∈
N. For any sequence (a_0, ..., a_{n−1}) of natural numbers there exists a ∈ N such
that β(a, i) = a_i for i < n.
Proof. The computability of β is clear from earlier results. We have β(a, i) ≤
Left(a) ≤ a ∸ 1.
Let a_0, ..., a_{n−1} be natural numbers. Then we take an N ∈ N such that
a_i ≤ N for all i < n and N is a multiple of every prime number less than
n. We claim that then 1 + N, 1 + 2N, ..., 1 + nN are relatively prime. To
see this, suppose p is a prime number such that p | 1 + iN and p | 1 + jN
(1 ≤ i < j ≤ n); then p divides their difference (j − i)N, but p does not divide
N, hence p | j − i < n, a contradiction.
By the Chinese Remainder Theorem there exists an M such that

M ≡ a_0 mod 1 + N
M ≡ a_1 mod 1 + 2N
...
M ≡ a_{n−1} mod 1 + nN.

Put a := Pair(M, N); then Left(a) = M and Right(a) = N, and thus β(a, i) =
a_i as required.
Remark. Proposition 5.1.16 shows that we can use β to encode a sequence of
numbers a_0, ..., a_{n−1} in terms of a single number a. We use this as follows to
show that the function n ↦ 2^n is computable.
If a_0, ..., a_n are natural numbers such that a_0 = 1, and a_{i+1} = 2a_i for
all i < n, then necessarily a_n = 2^n. Hence by Proposition 5.1.16 we have
β(a, n) = 2^n where

a := µx(β(x, 0) = 1 and ∀i_{<n} β(x, i + 1) = 2β(x, i)),

that is,

2^n = β(a, n) = β(µx(β(x, 0) = 1 and ∀i_{<n} β(x, i + 1) = 2β(x, i)), n).

It follows that n ↦ 2^n is computable.
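The following sketch (Python, requires Python 3.8+ for modular inverses) evaluates β and checks the 2^n recipe on a short sequence. To keep it runnable, the µ-search for the least witness is replaced by the explicit Chinese-Remainder construction from the proof of Proposition 5.1.16, and the pair (M, N) is kept separate instead of being packed into a = Pair(M, N), since that single number would be astronomically large.

```python
from math import prod

def beta_pair(M, N, i):
    # beta(a, i) with a = Pair(M, N): the remainder of M modulo 1 + (i+1)*N
    return M % (1 + (i + 1) * N)

def encode(seq):
    """Following the proof of Prop. 5.1.16: choose N >= every entry and a multiple
    of every prime below len(seq); solve M ≡ seq[i] (mod 1 + (i+1)*N) by CRT."""
    n = len(seq)
    N = max(seq + [1])
    for p in range(2, n):
        if all(p % q for q in range(2, p)):        # p is prime
            N *= p
    moduli = [1 + (i + 1) * N for i in range(n)]   # pairwise coprime by the proof
    P = prod(moduli)
    M = sum(r * (P // m) * pow(P // m, -1, m) for r, m in zip(seq, moduli)) % P
    return M, N

M, N = encode([2**i for i in range(10)])
print([beta_pair(M, N, i) for i in range(10)])   # [1, 2, 4, ..., 512], so beta(a, 9) = 2^9
```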
The above suggests a general method, which we develop next. To each se-
quence (a_1, ..., a_n) of natural numbers we assign a sequence number, denoted
⟨a_1, ..., a_n⟩, and defined to be the least natural number a such that β(a, 0) = n
(the length of the sequence) and β(a, i) = a_i for i = 1, ..., n. For n = 0 this
gives ⟨⟩ = 0, where ⟨⟩ is the sequence number of the empty sequence. We de-
fine the length function lh : N → N by lh(a) = β(a, 0), so lh is computable.
Observe that lh(⟨a_1, ..., a_n⟩) = n.
Put (a)_i := β(a, i + 1). The function (a, i) ↦ (a)_i : N² → N is computable,
and (⟨a_1, ..., a_n⟩)_i = a_{i+1} for i < n. Finally, let Seq ⊆ N denote the set of
sequence numbers. The set Seq is computable since

a ∈ Seq ⟺ ∀x_{<a}(lh(x) ≠ lh(a) or ∃i_{<lh(a)} (x)_i ≠ (a)_i).
Lemma 5.1.17. For any n, the function (a_1, ..., a_n) ↦ ⟨a_1, ..., a_n⟩ : N^n → N
is computable, and a_i < ⟨a_1, ..., a_n⟩ for (a_1, ..., a_n) ∈ N^n and i = 1, ..., n.
Proof. Use ⟨a_1, ..., a_n⟩ = µa(β(a, 0) = n, β(a, 1) = a_1, ..., β(a, n) = a_n), and
apply Lemmas 5.1.8, 5.1.4 and 5.1.16.
Lemma 5.1.18.
(1) The function In : N² → N defined by

In(a, i) = µx(lh(x) = i and ∀j_{<i} (x)_j = (a)_j)

is computable and In(⟨a_1, ..., a_n⟩, i) = ⟨a_1, ..., a_i⟩ for all a_1, ..., a_n ∈ N
and i ≤ n.
(2) The function ∗ : N² → N defined by

a ∗ b = µx(lh(x) = lh(a) + lh(b) and ∀i_{<lh(a)} (x)_i = (a)_i and ∀j_{<lh(b)} (x)_{lh(a)+j} = (b)_j)

is computable and ⟨a_1, ..., a_m⟩ ∗ ⟨b_1, ..., b_n⟩ = ⟨a_1, ..., a_m, b_1, ..., b_n⟩ for
all a_1, ..., a_m, b_1, ..., b_n ∈ N.
Definition. For F : N^{n+1} → N, let F̄ : N^{n+1} → N be given by

F̄(a, b) = ⟨F(a, 0), ..., F(a, b − 1)⟩  (a ∈ N^n).

Note that F̄(a, 0) = ⟨⟩ = 0.
Lemma 5.1.19. Let F : N^{n+1} → N. Then F is computable if and only if F̄ is
computable.
Proof. Suppose F is computable. Then F̄ is computable since

F̄(a, b) = µx(lh(x) = b and ∀i_{<b} (x)_i = F(a, i)).

In the other direction, suppose F̄ is computable. Then F is computable since

F(a, b) = (F̄(a, b + 1))_b.
Given G : N^{n+2} → N there is a unique function F : N^{n+1} → N such that

F(a, b) = G(a, b, F̄(a, b))  (a ∈ N^n).

This will be clear if we express the requirement on F as follows:

F(a, 0) = G(a, 0, 0),  F(a, b + 1) = G(a, b + 1, ⟨F(a, 0), ..., F(a, b)⟩).
The next result is important because it allows us to introduce a computable
function by recursion on its values at smaller arguments.
Proposition 5.1.20. Let G and F be as above and suppose G is computable.
Then F is computable.
Proof. Note that

F̄(a, b) = µx(Seq(x) and lh(x) = b and ∀i_{<b} (x)_i = G(a, i, In(x, i)))

for all a ∈ N^n and b ∈ N. It follows that F̄ is computable, and thus by the
previous lemma F is computable.
Definition. Let A : N^n → N and B : N^{n+2} → N be given. Let a range over
N^n, and define the function F : N^{n+1} → N by

F(a, 0) = A(a),
F(a, b + 1) = B(a, b, F(a, b)).

We say that F is obtained from A and B by primitive recursion.
Proposition 5.1.21. Suppose A, B, and F are as above, and A and B are
computable. Then F is computable.
Proof. Define G : N^{n+2} → N by

G(a, b, c) = A(a) if c = 0,  and  G(a, b, c) = B(a, b ∸ 1, (c)_{b∸1}) if c > 0.

Clearly, G is computable. We claim that

F(a, b) = G(a, b, F̄(a, b)).

This claim yields the computability of F, by Proposition 5.1.20. We have
F(a, 0) = A(a) = G(a, 0, 0) = G(a, 0, F̄(a, 0)) and

F(a, b + 1) = B(a, b, F(a, b)) = B(a, b, (F̄(a, b + 1))_b) = G(a, b + 1, F̄(a, b + 1)).

The claim follows.
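A sketch of the primitive recursion scheme as a higher-order function (Python; the straightforward loop hides the coding via F̄ that the proof uses to stay within rules (R1)-(R3)):

```python
def primitive_recursion(A, B):
    """Return F with F(a, 0) = A(a) and F(a, b + 1) = B(a, b, F(a, b))."""
    def F(*args):
        *a, b = args
        value = A(*a)
        for i in range(b):
            value = B(*a, i, value)
        return value
    return F

# Example: addition and multiplication obtained by primitive recursion.
add = primitive_recursion(lambda a: a, lambda a, b, prev: prev + 1)
mul = primitive_recursion(lambda a: 0, lambda a, b, prev: add(prev, a))
print(add(7, 5), mul(7, 5))   # 12 35
```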
Proposition 5.1.20 will be applied over and over again in the later section on
Gödel numbering, but in combination with definitions by cases. As a simple
example of such an application, let G : N → N and H : N² → N be computable.
There is clearly a unique function F : N² → N such that for all a, b ∈ N

F(a, b) = F(a, G(b)) if G(b) < b,  and  F(a, b) = H(a, b) otherwise.

In particular F(a, 0) = H(a, 0). We claim that F is computable.
According to Proposition 5.1.20 this claim will follow if we can specify a
computable function K : N³ → N such that F(a, b) = K(a, b, F̄(a, b)) for all
a, b ∈ N. Such a function K is given by

K(a, b, c) = (c)_{G(b)} if G(b) < b,  and  K(a, b, c) = H(a, b) otherwise.
Exercises.
(1) The set of prime numbers is computable.
(2) The Fibonacci numbers are the natural numbers F_n defined recursively by F_0 = 0,
F_1 = 1, and F_{n+2} = F_{n+1} + F_n. The function n ↦ F_n : N → N is computable.
(3) If f_1, ..., f_n : N^m → N are computable and X ⊆ N^n is computable, then
f^{−1}(X) ⊆ N^m is computable, where f := (f_1, ..., f_n) : N^m → N^n.
(4) If f : N → N is computable and surjective, then there is a computable function
g : N → N such that f ◦ g = id_N.
(5) If f : N → N is computable and strictly increasing, then f(N) ⊆ N is com-
putable.
(6) All computable functions and relations are definable in N.
(7) Let F : N^n → N, and define

⟨F⟩ : N → N,  ⟨F⟩(a) := F((a)_0, ..., (a)_{n−1}),

so F(a_1, ..., a_n) = ⟨F⟩(⟨a_1, ..., a_n⟩) for all a_1, ..., a_n ∈ N. Then F is com-
putable iff ⟨F⟩ is computable. (Hence n-variable computability reduces to 1-
variable computability.)
Let T be a collection of functions F : N^m → N for various m. We say that T is
closed under composition if for all G : N^m → N in T and all H_1, ..., H_m : N^n → N
in T, the function F = G(H_1, ..., H_m) : N^n → N is in T. We say that T is
closed under minimalization if for every G : N^{n+1} → N in T such that for all
a ∈ N^n there exists x ∈ N with G(a, x) = 0, the function F : N^n → N given by
F(a) = µx(G(a, x) = 0) is in T. We say that a relation R ⊆ N^n is in T if its
characteristic function χ_R is in T.
(8) Suppose T contains the functions mentioned in (R1), and is closed under com-
position and minimalization. All lemmas and propositions of this section go
through with "computable" replaced by "in T".
5.2 The Church-Turing Thesis
The computable functions as defined in the last section are also computable
in the informal sense that for each such function F : N^n → N there is an
algorithm that on any input a ∈ N^n stops after a finite number of steps and
produces an output F(a). An algorithm is given by a finite list of instructions,
a computer program, say. These instructions should be deterministic (leave
nothing to chance or choice). We deliberately neglect physical constraints of
space and time: imagine that the program that implements the algorithm has
unlimited access to time and space to do its work on any given input.
Let us write “calculable” for this intuitive, informal, idealized notion of
computable. The Church-Turing Thesis asserts
each calculable function F : N → N is computable.
The corresponding assertion for functions N^n → N follows, because the result
of Exercise 7 in Section 5.1 is clearly also valid for "calculable" instead
of "computable." Call a set P ⊆ N calculable if its characteristic function is
calculable.
While the Church-Turing Thesis is not a precise mathematical statement, it
is an important guiding principle, and has never failed in practice: any function
that any competent person has ever recognized as being calculable, has turned
out to be computable, and the informal grounds for calculability have always
translated routinely into an actual proof of computability. Here is a heuristic
(informal) argument that might make the Thesis plausible.
Let an algorithm be given for computing F : N → N. We can assume
that on any input a ∈ N this algorithm consists of a finite sequence of steps,
numbered from 0 to n, say, where at each step i it produces a natural number
a_i, with a_0 = a as starting number. It stops after step n with a_n = F(a).
We assume that for each i < n the number a_{i+1} is calculated by some fixed
procedure from the earlier numbers a_0, ..., a_i, that is, we have a calculable
function G : N → N such that a_{i+1} = G(⟨a_0, ..., a_i⟩) for all i < n. The
algorithm should also tell us when to stop, that is, we should have a calculable
P ⊆ N such that ¬P(⟨a_0, ..., a_i⟩) for i < n and P(⟨a_0, ..., a_n⟩). Since G and
P describe only single steps in the algorithm for F it is reasonable to assume
that they at least are computable. Once this is agreed to, one can show easily
that F is computable as well, see the exercise below.
A skeptical reader may find this argument dubious, but Turing gave in 1936
a compelling informal analysis of what functions F : N → N are calculable
in principle, and this has led to general acceptance of the Thesis. In addition,
various alternative formalizations of the informal notion of calculable function
have been proposed, using various kinds of machines, formal systems, and so
on. They all have turned out to be equivalent in the sense of defining the same
class of functions on N, namely the computable functions.
The above is only a rather narrow version of the Church-Turing Thesis, but
it suffices for our purpose. There are various refinements and more ambitious
versions. Also, our Church-Turing Thesis does not characterize mathematically
the intuitive notion of algorithm, only the notion of a function computable by an
algorithm that produces for each input from N an output in N.
Exercises.
(1) Let G : N → N and P ⊆ N be given. Then there is for each a ∈ N at most one
finite sequence a_0, ..., a_n of natural numbers such that a_0 = a, for all i < n we
have a_{i+1} = G(⟨a_0, ..., a_i⟩) and ¬P(⟨a_0, ..., a_i⟩), and P(⟨a_0, ..., a_n⟩). Suppose
that for each a ∈ N there is such a finite sequence a_0, ..., a_n, and put F(a) := a_n,
thus defining a function F : N → N. If G and P are computable, so is F.
5.3 Primitive Recursive Functions
This section is not really needed in the rest of this chapter, but it may throw light
on some issues relating to computability. One such issue is the condition, in Rule
(R3) for generating computable functions, that for all a ∈ N^n there exists y ∈ N
such that G(a, y) = 0. This condition is not constructive: it could be satisfied
for a certain G without us ever knowing it. We shall now argue informally that
it is impossible to generate in a fully constructive way exactly the computable
functions. Such a constructive generation process would presumably enable
us to enumerate effectively a sequence of algorithms α_0, α_1, α_2, ... such that
each α_n computes a (computable) function f_n : N → N, and such that every
computable function f : N → N occurs in the sequence f_0, f_1, f_2, ..., possibly
more than once. Now consider the function f_diag : N → N defined by

f_diag(n) = f_n(n) + 1.

Then f_diag is clearly computable in the intuitive sense, but f_diag ≠ f_n for all n,
in violation of the Church-Turing Thesis.
This way of producing a new function f_diag from a sequence (f_n) is called
diagonalization.¹ The same basic idea applies in other cases, and is used in a
more sophisticated form in the proof of Gödel's incompleteness theorem.
Here is a class of computable functions that can be generated constructively:
The primitive recursive functions are the functions f : N^n → N obtained in-
ductively as follows:
(PR1) The nullary function N^0 → N with value 0, the unary successor function
S, and all coordinate functions I^n_i are primitive recursive.
(PR2) If G : N^m → N is primitive recursive and H_1, ..., H_m : N^n → N are
primitive recursive, then G(H_1, ..., H_m) is primitive recursive.
(PR3) If F : N^{n+1} → N is obtained by primitive recursion from primitive re-
cursive functions G : N^n → N and H : N^{n+2} → N, then F is primitive
recursive.
¹Perhaps it would be better to call it antidiagonalization.
A relation R ⊆ N^n is said to be primitive recursive if its characteristic function
χ_R is primitive recursive. As the next two lemmas show, the computable func-
tions that one ordinarily meets with are primitive recursive. In the rest of this
section x ranges over N^m with m depending on the context, and y over N.
Lemma 5.3.1. The following functions and relations are primitive recursive:
(i) each constant function c^n_m;
(ii) the binary operations +, ·, and (x, y) ↦ x^y on N;
(iii) the predecessor function Pd : N → N given by Pd(x) = x ∸ 1, the unary
relation {x ∈ N : x > 0}, the function ∸ : N² → N;
(iv) the binary relations ≥, ≤ and = on N.
Proof. The function c^0_m is obtained from c^0_0 by applying (PR2) m times with
G = S. Next, c^n_m is obtained by applying (PR2) with G = c^0_m (with k = 0 and
t = n). The functions in (ii) are obtained by the usual primitive recursions. It is
also easy to write down primitive recursions for the functions in (iii), in the order
they are listed. For (iv), note that χ_≥(x, y + 1) = χ_{>0}(x) · χ_≥(Pd(x), y).
Lemma 5.3.2. With the possible exceptions of Lemmas 5.1.8 and 5.1.9, all
Lemmas and Propositions in Section 5.1 go through with "computable" replaced
by "primitive recursive."
Proof. To obtain the primitive recursive version of Lemma 5.1.10, note that

F_R(a, 0) = 0,  F_R(a, y+1) = F_R(a, y)·χ_R(a, F_R(a, y)) + (y+1)·χ_{¬R}(a, F_R(a, y)).
A consequence of the primitive recursive version of Lemma 5.1.10 is the following
"restricted minimalization scheme" for primitive recursive functions:
if R ⊆ N^{n+1} and H : N^n → N are primitive recursive, and for all a ∈ N^n
there exists x < H(a) such that R(a, x), then the function F : N^n → N given
by F(a) = µxR(a, x) is primitive recursive.
The primitive recursive versions of Lemmas 5.1.11-5.1.15 and Proposition 5.1.16
now follow easily. In particular, the function β is primitive recursive. Also, the
proof of Proposition 5.1.16 yields:
There is a primitive recursive function B : N → N such that, whenever
n < N, a_0 < N, ..., a_{n−1} < N  (n, a_0, ..., a_{n−1}, N ∈ N),
then for some a < B(N) we have β(a, i) = a_i for i = 0, ..., n − 1.
Using this fact and restricted minimalization, it follows that the unary relation
Seq, the unary function lh, and the binary functions (a, i) ↦ (a)_i, In and ∗ are
primitive recursive.
Let a function F : N^{n+1} → N be given. Then F̄ : N^{n+1} → N satisfies the
primitive recursion F̄(a, 0) = 0 and F̄(a, b + 1) = F̄(a, b) ∗ ⟨F(a, b)⟩. It follows
that if F is primitive recursive, so is F̄. The converse is obvious. Suppose also
that G : N^{n+2} → N is primitive recursive, and F(a, b) = G(a, b, F̄(a, b)) for all
(a, b) ∈ N^{n+1}; then F̄ satisfies the primitive recursion

F̄(a, 0) = 0,  F̄(a, b + 1) = F̄(a, b) ∗ ⟨G(a, b, F̄(a, b))⟩,

so F̄ (and hence F) is primitive recursive.
The Ackermann Function. By diagonalization we can produce a computable
function that is not primitive recursive, but the so-called Ackermann function
does more, and plays a role in several contexts. First we define inductively a
sequence A_0, A_1, A_2, ... of primitive recursive functions A_n : N → N:

A_0(y) = y + 1,  A_{n+1}(0) = A_n(1),  A_{n+1}(y + 1) = A_n(A_{n+1}(y)).

Thus A_0 = S and A_{n+1} ◦ A_0 = A_n ◦ A_{n+1}. One verifies easily that A_1(y) = y + 2
and A_2(y) = 2y + 3 for all y. We define the Ackermann function A : N² → N
by A(n, y) := A_n(y).
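A direct transcription of this double recursion (Python, a sketch; the naive recursion is only usable for tiny arguments, which is precisely the point -- the values explode):

```python
import sys
from functools import lru_cache

sys.setrecursionlimit(100_000)       # the recursion gets deep quickly

@lru_cache(maxsize=None)
def ackermann(n, y):
    if n == 0:
        return y + 1
    if y == 0:
        return ackermann(n - 1, 1)
    return ackermann(n - 1, ackermann(n, y - 1))

print([ackermann(1, y) for y in range(5)])   # y + 2:  [2, 3, 4, 5, 6]
print([ackermann(2, y) for y in range(5)])   # 2y + 3: [3, 5, 7, 9, 11]
print(ackermann(3, 3))                       # 61; ackermann(4, 2) already has 19729 digits
```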
Lemma 5.3.3. The function A is computable, and strictly increasing in each
variable. Also, for all n and x, y:
(i) A_n(x + y) ≥ A_n(x) + y;
(ii) n ≥ 1 ⟹ A_{n+1}(y) > A_n(y) + y;
(iii) A_{n+1}(y) ≥ A_n(y + 1);
(iv) 2A_n(y) < A_{n+2}(y);
(v) x < y ⟹ A_n(x + y) ≤ A_{n+2}(y).
Proof. We leave it as an exercise at the end of this section to show that A is
computable. Assume inductively that A_0, ..., A_n are strictly increasing and
A_0(y) < A_1(y) < ··· < A_n(y) for all y. Then

A_{n+1}(y + 1) = A_n(A_{n+1}(y)) ≥ A_0(A_{n+1}(y)) > A_{n+1}(y),

so A_{n+1} is strictly increasing. Next we show that A_{n+1}(y) > A_n(y) for all y:
A_{n+1}(0) = A_n(1), so A_{n+1}(0) > A_n(0) and A_{n+1}(0) > 1, so A_{n+1}(y) > y + 1
for all y. Hence A_{n+1}(y + 1) = A_n(A_{n+1}(y)) > A_n(y + 1).
Inequality (i) follows easily by induction on n, and a second induction on y.
For inequality (ii), we proceed again by induction on (n, y): Using A_1(y) =
y + 2 and A_2(y) = 2y + 3, we obtain A_2(y) > A_1(y) + y. Let n > 1, and assume
inductively that A_n(y) > A_{n−1}(y) + y. Then A_{n+1}(0) = A_n(1) > A_n(0) + 0,
and

A_{n+1}(y + 1) = A_n(A_{n+1}(y)) ≥ A_n(y + 1 + A_n(y))
≥ A_n(y + 1) + A_n(y) > A_n(y + 1) + y + 1.
In (iii) we proceed by induction on y. We have equality for y = 0. Assuming
inductively that (iii) holds for a certain y we obtain

A_{n+1}(y + 1) = A_n(A_{n+1}(y)) ≥ A_n(A_n(y + 1)) ≥ A_n(y + 2).

Note that (iv) holds for n = 0. For n > 0 we have by (i), (ii) and (iii):

A_n(y) + A_n(y) ≤ A_n(y + A_n(y)) < A_n(A_{n+1}(y)) = A_{n+1}(y + 1) ≤ A_{n+2}(y).

Note that (v) holds for n = 0. Assume (v) holds for a certain n. Let x < y + 1.
We can assume inductively that if x < y, then A_{n+1}(x + y) ≤ A_{n+3}(y), and we
want to show that

A_{n+1}(x + y + 1) ≤ A_{n+3}(y + 1).

Case 1. x = y. Then

A_{n+1}(x + y + 1) = A_{n+1}(2x + 1) = A_n(A_{n+1}(2x))
≤ A_{n+2}(2x) < A_{n+2}(A_{n+3}(x)) = A_{n+3}(y + 1).

Case 2. x < y. Then

A_{n+1}(x + y + 1) = A_n(A_{n+1}(x + y)) ≤ A_{n+2}(A_{n+3}(y)) = A_{n+3}(y + 1).
Below we put |x| := x_1 + ··· + x_m for x = (x_1, ..., x_m) ∈ N^m.
Proposition 5.3.4. Given any primitive recursive function F : N^m → N there
is an n = n(F) such that F(x) ≤ A_n(|x|) for all x ∈ N^m.
Proof. Call an n = n(F) with the property above a bound for F. The nullary
constant function with value 0, the successor function S, and each coordinate
function I^m_i (1 ≤ i ≤ m), has bound 0. Next, assume F = G(H_1, ..., H_k) where
G : N^k → N and H_1, ..., H_k : N^m → N are primitive recursive, and assume
inductively that n(G) and n(H_1), ..., n(H_k) are bounds for G and H_1, ..., H_k.
By part (iv) of the previous lemma we can take N ∈ N such that n(G) ≤ N,
and Σ_i H_i(x) ≤ A_{N+1}(|x|) for all x. Then

F(x) = G(H_1(x), ..., H_k(x)) ≤ A_N(Σ_i H_i(x)) ≤ A_N(A_{N+1}(|x|)) ≤ A_{N+2}(|x|).

Finally, assume that F : N^{m+1} → N is obtained by primitive recursion from
the primitive recursive functions G : N^m → N and H : N^{m+2} → N, and assume
inductively that n(G) and n(H) are bounds for G and H. Take N ∈ N such
that n(G) ≤ N + 3 and n(H) ≤ N. We claim that N + 3 is a bound for F:
F(x, 0) = G(x) ≤ A_{N+3}(|x|), and by part (v) of the lemma above,

F(x, y + 1) = H(x, y, F(x, y)) ≤ A_N(|x| + y + A_{N+3}(|x| + y))
≤ A_{N+2}(A_{N+3}(|x| + y)) = A_{N+3}(|x| + y + 1).
Consider the function A′ : N → N defined by A′(n) = A(n, n). Then A′
is computable, and for any primitive recursive function F : N → N we have
F(y) < A′(y) for all y > n(F), where n(F) is a bound for F. In particular, A′
is not primitive recursive. Hence A is computable but not primitive recursive.
The recursion in “primitive recursion” involves only one variable; the other
variables just act as parameters. The Ackermann function is defined by a re-
cursion involving both variables:
A(0, y) = y + 1, A(x + 1, 0) = A(x, 1), A(x + 1, y + 1) = A(x, A(x + 1, y)).
This kind of double recursion is therefore more powerful in some ways than what
can be done in terms of primitive recursion and composition.
Exercises.
(1) The graph of the Ackermann function is primitive recursive. (It follows that the
Ackermann function is recursive, and it gives an example showing that Lemma 5.1.9
fails with “primitive recursive” in place of “computable”.)
5.4 Representability
Let L be a numerical language, that is, L contains the constant symbol 0 and
the unary function symbol S. We let S^n 0 denote the term S...S0 in which S
appears exactly n times. So S^0 0 is the term 0, S^1 0 is the term S0, and so on.
Our key example of a numerical language is

L(N) := {0, S, +, ·, <}  (the language of N).
Here N is the following set of nine axioms, where we fix two distinct variables
x and y for the sake of definiteness:
N1 ∀x (Sx ≠ 0)
N2 ∀x∀y (Sx = Sy → x = y)
N3 ∀x (x + 0 = x)
N4 ∀x∀y (x + Sy = S(x + y))
N5 ∀x (x · 0 = 0)
N6 ∀x∀y (x · Sy = x · y + x)
N7 ∀x (x ≮ 0)
N8 ∀x∀y (x < Sy ↔ x < y ∨ x = y)
N9 ∀x∀y (x < y ∨ x = y ∨ y < x)
These axioms are clearly true in N. The fact that N is finite will play a role
later. It is a very weak set of axioms, but strong enough to prove numerical
facts like
SS0 + SSS0 = SSSSS0,  ∀x(x < SS0 → (x = 0 ∨ x = S0)).
Lemma 5.4.1. For each n,

N ⊢ x < S^{n+1}0 ↔ (x = 0 ∨ ··· ∨ x = S^n 0).
Proof. By induction on n. For n = 0, N ⊢ x < S0 ↔ x = 0 by axioms N8 and
N7. Assume n > 0 and N ⊢ x < S^n 0 ↔ (x = 0 ∨ ··· ∨ x = S^{n−1}0). Use axiom
N8 to conclude that N ⊢ x < S^{n+1}0 ↔ (x = 0 ∨ ··· ∨ x = S^n 0).
To give an impression how weak N is we consider some of its models:
Some models of N.
(1) We usually refer to N as the standard model of N.
(2) Another model of N is N[x] := (N[x]; ...), where 0, S, +, · are interpreted
as the zero polynomial, as the unary operation of adding 1 to a polynomial,
and as addition and multiplication of polynomials in N[x], and where < is
interpreted as follows: f(x) < g(x) iff f(n) < g(n) for all large enough n.
(3) A more bizarre model of N: (R^{≥0}; ...) with the usual interpretations of
0, S, +, ·, in particular S(r) := r + 1, and with < interpreted as the binary
relation <_N on R^{≥0}: r <_N s ⟺ (r, s ∈ N and r < s) or s ∉ N.
Example (2) shows that N ⊬ ∀x∃y (x = 2y ∨ x = 2y + S0), since in N[x] the
element x is not in 2N[x] ∪ (2N[x] + 1); in other words, N cannot prove "every
element is even or odd." In example (3) the binary relation <_N on R^{≥0} is not
even a total order. One useful fact about models of N is that they all contain
the so-called standard model N in a unique way:
Lemma 5.4.2. Suppose 𝒜 ⊨ N. Then there is a unique homomorphism
ι : N → 𝒜.
This homomorphism ι is an embedding, and for all a ∈ A and n,
(i) if a <^𝒜 ι(n), then a = ι(m) for some m < n;
(ii) if a ∉ ι(N), then ι(n) <^𝒜 a.
As to the proof, note that for any homomorphism ι : N → 𝒜 and all n we
must have ι(n) = (S^n 0)^𝒜. Hence there is at most one such homomorphism. It
remains to show that the map n ↦ (S^n 0)^𝒜 : N → A is an embedding ι : N → 𝒜
with properties (i) and (ii). We leave this as an exercise to the reader.
Definition. Let L be a numerical language, and Σ a set of L-sentences. A rela-
tion R ⊆ N^m is said to be Σ-representable if there is an L-formula ϕ(x_1, ..., x_m)
such that for all (a_1, ..., a_m) ∈ N^m we have
(i) R(a_1, ..., a_m) ⟹ Σ ⊢ ϕ(S^{a_1}0, ..., S^{a_m}0),
(ii) ¬R(a_1, ..., a_m) ⟹ Σ ⊢ ¬ϕ(S^{a_1}0, ..., S^{a_m}0).
Such a ϕ(x_1, ..., x_m) is said to represent R in Σ or to Σ-represent R. Note that if
ϕ(x_1, ..., x_m) Σ-represents R and Σ is consistent, then for all (a_1, ..., a_m) ∈ N^m

R(a_1, ..., a_m) ⟺ Σ ⊢ ϕ(S^{a_1}0, ..., S^{a_m}0),
¬R(a_1, ..., a_m) ⟺ Σ ⊢ ¬ϕ(S^{a_1}0, ..., S^{a_m}0).

A function F : N^m → N is Σ-representable if there is a formula ϕ(x_1, ..., x_m, y)
of L such that for all (a_1, ..., a_m) ∈ N^m we have

Σ ⊢ ϕ(S^{a_1}0, ..., S^{a_m}0, y) ↔ y = S^{F(a_1,...,a_m)}0.
Such a ϕ(x_1, ..., x_m, y) is said to represent F in Σ or to Σ-represent F.
An L-term t(x_1, ..., x_m) is said to represent the function F : N^m → N in Σ if
Σ ⊢ t(S^{a_1}0, ..., S^{a_m}0) = S^{F(a)}0 for all a = (a_1, ..., a_m) ∈ N^m. Note that then
the function F is Σ-represented by the formula t(x_1, ..., x_m) = y.
Proposition 5.4.3. Let L be a numerical language, Σ a set of L-sentences such
that Σ ⊢ S0 ≠ 0, and R ⊆ N^m a relation. Then

R is Σ-representable ⟺ χ_R is Σ-representable.
Proof. (⇐) Assume χ_R is Σ-representable and let ϕ(x_1, ..., x_m, y) be an L-
formula Σ-representing it. We show that ψ(x_1, ..., x_m) := ϕ(x_1, ..., x_m, S0)
Σ-represents R. Let (a_1, ..., a_m) ∈ R; then χ_R(a_1, ..., a_m) = 1. Hence

Σ ⊢ ϕ(S^{a_1}0, ..., S^{a_m}0, y) ↔ y = S0,

so Σ ⊢ ϕ(S^{a_1}0, ..., S^{a_m}0, S0), that is, Σ ⊢ ψ(S^{a_1}0, ..., S^{a_m}0).
Similarly, (a_1, ..., a_m) ∉ R implies Σ ⊢ ¬ϕ(S^{a_1}0, ..., S^{a_m}0, S0). (Here we need
Σ ⊢ S0 ≠ 0.)
(⇒) Conversely, assume R is Σ-representable and let ϕ(x_1, ..., x_m) be an L-
formula Σ-representing it. We show that

ψ(x_1, ..., x_m, y) := (ϕ(x_1, ..., x_m) ∧ y = S0) ∨ (¬ϕ(x_1, ..., x_m) ∧ y = 0)

Σ-represents χ_R.
Let (a_1, ..., a_m) ∈ R; then Σ ⊢ ϕ(S^{a_1}0, ..., S^{a_m}0). Hence

Σ ⊢ [(ϕ(S^{a_1}0, ..., S^{a_m}0) ∧ y = S0) ∨ (¬ϕ(S^{a_1}0, ..., S^{a_m}0) ∧ y = 0)] ↔ y = S0,

i.e. Σ ⊢ ψ(S^{a_1}0, ..., S^{a_m}0, y) ↔ y = S0.
And similarly for (a_1, ..., a_m) ∉ R, Σ ⊢ ψ(S^{a_1}0, ..., S^{a_m}0, y) ↔ y = 0.
Let Σ be a set of L-sentences. An L-term t(x_1, ..., x_m) is said to represent
the function F : N^m → N in Σ if Σ ⊢ t(S^{a_1}0, ..., S^{a_m}0) = S^{F(a)}0 for all
a = (a_1, ..., a_m) ∈ N^m. Note that then the function F is Σ-represented by the
formula t(x_1, ..., x_m) = y.
Theorem 5.4.4 (Representability). Each computable function F : N^n → N is
N-representable. Each computable relation R ⊆ N^m is N-representable.
Proof. By Proposition 5.4.3 we need only consider the case of functions. We
make the following three claims:
(R1)′ + : N² → N, · : N² → N, χ_≤ : N² → N, and the coordinate functions
I^n_i (for each n and i = 1, ..., n) are N-representable.
(R2)′ If G : N^m → N and H_1, ..., H_m : N^n → N are N-representable, then so
is F = G(H_1, ..., H_m) : N^n → N defined by

F(a) = G(H_1(a), ..., H_m(a)).
(R3)′ If G : N^{n+1} → N is N-representable, and for all a ∈ N^n there exists
x ∈ N such that G(a, x) = 0, then the function F : N^n → N given by

F(a) = µx(G(a, x) = 0)

is N-representable.
(R1)′: The proof of this claim has six parts.
(i) The formula x_1 = x_2 represents {(a, b) ∈ N² : a = b} in N:
Let a, b ∈ N. If a = b then obviously N ⊢ S^a0 = S^b0. Suppose that
a ≠ b. Then for every model 𝒜 of N we have 𝒜 ⊨ S^a0 ≠ S^b0, by
Lemma 5.4.2 and its proof. Hence N ⊢ S^a0 ≠ S^b0.
(ii) The term x_1 + x_2 represents + : N² → N in N:
Let a + b = c where a, b, c ∈ N. By Lemma 5.4.2 and its proof we
have 𝒜 ⊨ S^a0 + S^b0 = S^c0 for each model 𝒜 of N. It follows that
N ⊢ S^a0 + S^b0 = S^c0.
(iii) The term x_1 · x_2 represents · : N² → N in N:
The proof is similar to that of (ii).
(iv) The formula x_1 < x_2 represents {(a, b) ∈ N² : a < b} in N:
The proof is similar to that of (i).
(v) χ_≤ : N² → N is N-representable:
By (i) and (iv), the formula x_1 < x_2 ∨ x_1 = x_2 represents the set
{(a, b) ∈ N² : a ≤ b} in N. So by Proposition 5.4.3, χ_≤ : N² → N is
N-representable.
(vi) For n ≥ 1 and 1 ≤ i ≤ n, the term t^n_i(x_1, ..., x_n) := x_i represents
the function I^n_i : N^n → N in N. This is obvious.
(R2)′: Let H_1, ..., H_m : N^n → N be N-represented by ϕ_i(x_1, ..., x_n, y_i) and
let G : N^m → N be N-represented by ψ(y_1, ..., y_m, z), where z is distinct
from x_1, ..., x_n and y_1, ..., y_m.
Claim: F = G(H_1, ..., H_m) is N-represented by

θ(x_1, ..., x_n, z) := ∃y_1 ... ∃y_m ((⋀_{i=1}^{m} ϕ_i(x_1, ..., x_n, y_i)) ∧ ψ(y_1, ..., y_m, z)).

Put a = (a_1, ..., a_n) and let c = F(a). We have to show that

N ⊢ θ(S^a0, z) ↔ z = S^c0

where S^a0 := (S^{a_1}0, ..., S^{a_n}0). Let b_i = H_i(a) and put b = (b_1, ..., b_m).
Then F(a) = G(b) = c. Therefore, N ⊢ ψ(S^b0, z) ↔ z = S^c0 and

N ⊢ ϕ_i(S^a0, y_i) ↔ y_i = S^{b_i}0,  (i = 1, ..., m).

Argue in models to conclude: N ⊢ θ(S^a0, z) ↔ z = S^c0.
(R3)′: Let G : N^{n+1} → N be such that for all a ∈ N^n there exists b ∈ N with
G(a, b) = 0. Define F : N^n → N by F(a) = µb(G(a, b) = 0). Suppose
that G is N-represented by ϕ(x_1, ..., x_n, y, z). We claim that the formula

ψ(x_1, ..., x_n, y) := ϕ(x_1, ..., x_n, y, 0) ∧ ∀w(w < y → ¬ϕ(x_1, ..., x_n, w, 0))

N-represents F. Let a ∈ N^n and let b = F(a). Then G(a, i) ≠ 0 for i < b
and G(a, b) = 0. Therefore, N ⊢ ϕ(S^a0, S^b0, z) ↔ z = 0 and for i < b,
G(a, i) ≠ 0 and N ⊢ ϕ(S^a0, S^i0, z) ↔ z = S^{G(a,i)}0. By arguing in models
using Lemma 5.4.2 we obtain N ⊢ ψ(S^a0, y) ↔ y = S^b0, as claimed.
Remark. The converse of this theorem is also true, and is plausible from the
Church-Turing Thesis. We shall prove the converse in the next section.
Exercises. In the exercises below, L is a numerical language and Σ is a set of
L-sentences.
(1) Suppose Σ ⊢ S^m0 ≠ S^n0 whenever m ≠ n. If a function F : N^m → N is Σ-
represented by the L-formula ϕ(x_1, ..., x_m, y), then the graph of F, as a relation
of arity m + 1 on N, is Σ-represented by ϕ(x_1, ..., x_m, y). (This result applies to
Σ = N, since N ⊢ S^m0 ≠ S^n0 whenever m ≠ n.)
(2) Suppose Σ ⊇ N. Then the set of all Σ-representable functions F : N^m → N
(m = 0, 1, 2, ...) is closed under composition and minimalization.
5.5 Decidability and Gödel Numbering
Definition. An L-theory T is a set of L-sentences closed under provability, that
is, whenever T ⊢ σ, then σ ∈ T.
Examples.
(1) Given a set Σ of L-sentences, the set Th(Σ) := {σ : Σ ⊢ σ} of theorems
of Σ is an L-theory. If we need to indicate the dependence on L we write
Th_L(Σ) for Th(Σ). We say that Σ axiomatizes an L-theory T (or is an
axiomatization of T) if T = Th(Σ). For Σ = ∅ we also refer to Th_L(Σ) as
predicate logic in L.
(2) Given an L-structure 𝒜, the set Th(𝒜) := {σ : 𝒜 ⊨ σ} is also an L-
theory, called the theory of 𝒜. Note that the theory of 𝒜 is automatically
complete.
(3) Given any class 𝒦 of L-structures, the set

Th(𝒦) := {σ : 𝒜 ⊨ σ for all 𝒜 ∈ 𝒦}

is an L-theory, called the theory of 𝒦. For example, for L = L_Ri, and 𝒦
the class of finite fields, Th(𝒦) is the set of L-sentences that are true in all
finite fields.
The decision problem for a given L-theory T is to find an algorithm to
decide for any L-sentence σ whether or not σ belongs to T. Since we have not
(yet) defined the concept of algorithm, this is just an informal description at
this stage. One of our goals in this section is to define a formal counterpart,
called “decidability.” In the next section we show that the L(N)-theory Th(N) is
undecidable; by the Church-Turing Thesis, this means that the decision problem
for Th(N) has no solution. (This result is a version of Church’s Theorem, and
is closely related to the Incompleteness Theorem.)
Unless we specify otherwise we assume in the rest of this chapter that the
language L is finite. This is done for simplicity, and at the end of the remaining
sections we indicate how to avoid this assumption. We shall number the terms
and formulas of L in such a way that various statements about these formulas
and about formal proofs in this language can be translated “effectively” into
equivalent statements about natural numbers expressible by sentences in L(N).
Recall that v_0, v_1, v_2, ... are our variables. We assign to each symbol

s ∈ L ∪ {logical symbols} ∪ {v_0, v_1, v_2, ...}

a symbol number SN(s) ∈ N as follows: SN(v_i) := 2i, and to each remaining
symbol (in the finite set L ∪ {logical symbols}) we assign an odd natural number
as symbol number, subject to the condition that different symbols have different
symbol numbers.
Definition. The Gödel number ⌜t⌝ of an L-term t is defined recursively:

⌜t⌝ = ⟨SN(v_i)⟩ if t = v_i,
⌜t⌝ = ⟨SN(F), ⌜t_1⌝, ..., ⌜t_n⌝⟩ if t = Ft_1 ... t_n.

The Gödel number ⌜ϕ⌝ of an L-formula ϕ is given recursively by

⌜ϕ⌝ = ⟨SN(⊤)⟩ if ϕ = ⊤,
⌜ϕ⌝ = ⟨SN(⊥)⟩ if ϕ = ⊥,
⌜ϕ⌝ = ⟨SN(=), ⌜t_1⌝, ⌜t_2⌝⟩ if ϕ = (t_1 = t_2),
⌜ϕ⌝ = ⟨SN(R), ⌜t_1⌝, ..., ⌜t_m⌝⟩ if ϕ = Rt_1 ... t_m,
⌜ϕ⌝ = ⟨SN(¬), ⌜ψ⌝⟩ if ϕ = ¬ψ,
⌜ϕ⌝ = ⟨SN(∨), ⌜ϕ_1⌝, ⌜ϕ_2⌝⟩ if ϕ = ϕ_1 ∨ ϕ_2,
⌜ϕ⌝ = ⟨SN(∧), ⌜ϕ_1⌝, ⌜ϕ_2⌝⟩ if ϕ = ϕ_1 ∧ ϕ_2,
⌜ϕ⌝ = ⟨SN(∃), ⌜x⌝, ⌜ψ⌝⟩ if ϕ = ∃xψ,
⌜ϕ⌝ = ⟨SN(∀), ⌜x⌝, ⌜ψ⌝⟩ if ϕ = ∀xψ.
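A sketch of this numbering (Python). The sequence-number operation ⟨...⟩ is replaced here by a hypothetical stand-in `seq` built from Python tuples (any injective coding of tuples would do; the β-based one from Section 5.1 is what the text uses), and the symbol numbers are made up for a tiny language {0, S, +}:

```python
def seq(*a):
    # stand-in for the sequence number <a_1, ..., a_n>; here simply a tuple
    return tuple(a)

SN = {'=': 11, 'exists': 13, 'forall': 15,      # made-up symbol numbers for some
      '0': 17, 'S': 19, '+': 21}                # logical symbols and L = {0, S, +}

def sn_var(i):                                  # SN(v_i) = 2i
    return 2 * i

def godel_term(t):
    """t is ('var', i), ('0',), ('S', t1) or ('+', t1, t2)."""
    if t[0] == 'var':
        return seq(sn_var(t[1]))
    return seq(SN[t[0]], *(godel_term(s) for s in t[1:]))

def godel_formula(phi):
    """Only the atomic case t1 = t2 and the quantifier cases, for illustration."""
    if phi[0] == '=':
        return seq(SN['='], godel_term(phi[1]), godel_term(phi[2]))
    if phi[0] in ('exists', 'forall'):
        return seq(SN[phi[0]], godel_term(phi[1]), godel_formula(phi[2]))
    raise NotImplementedError(phi[0])

# Goedel number of the sentence  forall v0 (v0 + 0 = v0):
x = ('var', 0)
print(godel_formula(('forall', x, ('=', ('+', x, ('0',)), x))))
```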
Lemma 5.5.1. The following subsets of N are computable:
(1) Vble := {⌜x⌝ : x is a variable}
(2) Term := {⌜t⌝ : t is an L-term}
(3) AFor := {⌜ϕ⌝ : ϕ is an atomic L-formula}
(4) For := {⌜ϕ⌝ : ϕ is an L-formula}
Proof. (1) a ∈ Vble iff a = ⟨2b⟩ for some b ≤ a.
(2) a ∈ Term iff a ∈ Vble or a = ⟨SN(F), ⌜t_1⌝, ..., ⌜t_n⌝⟩ for some function
symbol F of L of arity n and L-terms t_1, ..., t_n with Gödel numbers < a.
We leave (3) to the reader.
(4) We have

For(a) ⟺ For((a)_1), if a = ⟨SN(¬), (a)_1⟩;
For(a) ⟺ For((a)_1) and For((a)_2), if a = ⟨SN(∨), (a)_1, (a)_2⟩ or a = ⟨SN(∧), (a)_1, (a)_2⟩;
For(a) ⟺ Vble((a)_1) and For((a)_2), if a = ⟨SN(∃), (a)_1, (a)_2⟩ or a = ⟨SN(∀), (a)_1, (a)_2⟩;
For(a) ⟺ AFor(a), otherwise.

So For is computable.
In the next two lemmas, x ranges over variables, ϕ and ψ over L-formulas,
and t and τ over L-terms.
Lemma 5.5.2. The function Sub : N³ → N defined by

Sub(a, b, c) = c, if Vble(a) and a = b;
Sub(a, b, c) = ⟨(a)_0, Sub((a)_1, b, c), ..., Sub((a)_n, b, c)⟩, if a = ⟨(a)_0, ..., (a)_n⟩ with n > 0 and (a)_0 ≠ SN(∃), (a)_0 ≠ SN(∀);
Sub(a, b, c) = ⟨SN(∃), (a)_1, Sub((a)_2, b, c)⟩, if a = ⟨SN(∃), (a)_1, (a)_2⟩ and (a)_1 ≠ b;
Sub(a, b, c) = ⟨SN(∀), (a)_1, Sub((a)_2, b, c)⟩, if a = ⟨SN(∀), (a)_1, (a)_2⟩ and (a)_1 ≠ b;
Sub(a, b, c) = a, otherwise;

is computable, and satisfies

Sub(⌜t⌝, ⌜x⌝, ⌜τ⌝) = ⌜t(τ/x)⌝ and Sub(⌜ϕ⌝, ⌜x⌝, ⌜τ⌝) = ⌜ϕ(τ/x)⌝.

Proof. Exercise.
Lemma 5.5.3. The following relations on N are computable:
(1) Fr := {(⌜ϕ⌝, ⌜x⌝) : x occurs free in ϕ} ⊆ N²
(2) FrSub := {(⌜ϕ⌝, ⌜x⌝, ⌜τ⌝) : τ is free for x in ϕ} ⊆ N³
(3) PrAx := {⌜ϕ⌝ : ϕ is a propositional axiom} ⊆ N
(4) Eq := {⌜ϕ⌝ : ϕ is an equality axiom} ⊆ N
(5) Quant := {⌜ψ⌝ : ψ is a quantifier axiom} ⊆ N
(6) MP := {(⌜ϕ_1⌝, ⌜ϕ_1 → ϕ_2⌝, ⌜ϕ_2⌝) : ϕ_1, ϕ_2 are L-formulas} ⊆ N³
(7) Gen := {(⌜ϕ⌝, ⌜ψ⌝) : ψ follows from ϕ by the generalization rule} ⊆ N²
(8) Sent := {⌜ϕ⌝ : ϕ is a sentence} ⊆ N
Proof. The usual inductive or explicit description of each of these notions trans-
lates easily into a description of its "Gödel image" that establishes computability
of this image. As an example, note that

Sent(a) ⟺ For(a) and ∀i_{<a} ¬Fr(a, i),

so (8) follows from (1) and earlier results.
In the rest of this section Σ is a set of L-sentences. Put

⌜Σ⌝ := {⌜σ⌝ : σ ∈ Σ},

and call Σ computable if ⌜Σ⌝ is computable.
Definition. We define Prf_Σ to be the "set of all Gödel numbers of proofs from
Σ", i.e.

Prf_Σ := {⟨⌜ϕ_1⌝, ..., ⌜ϕ_n⌝⟩ : ϕ_1, ..., ϕ_n is a proof from Σ}.

So every element of Prf_Σ is of the form ⟨⌜ϕ_1⌝, ..., ⌜ϕ_n⌝⟩ where n ≥ 1 and every ϕ_k is
either in Σ, or a logical axiom, or obtained from some ϕ_i, ϕ_j with 1 ≤ i, j < k
by Modus Ponens, or obtained from some ϕ_i with 1 ≤ i < k by Generalization.
Lemma 5.5.4. If Σ is computable, then Prf_Σ is computable.
Proof. This is because a is in Prf_Σ iff Seq(a) and lh(a) ≠ 0 and for every k <
lh(a) either (a)_k ∈ ⌜Σ⌝ ∪ PrAx ∪ Eq ∪ Quant, or ∃i, j < k : MP((a)_i, (a)_j, (a)_k),
or ∃i < k : Gen((a)_i, (a)_k).
Definition. A relation R ⊆ N^n is said to be computably generated if there is a
computable relation Q ⊆ N^{n+1} such that for all a ∈ N^n we have

R(a) ⟺ ∃x Q(a, x).

"Recursively enumerable" is also used for "computably generated."
Remark. Every computable relation is obviously computably generated. We
leave it as an exercise to check that the union and intersection of two computably
generated n-ary relations on N are computably generated. The complement of
a computably generated subset of N is not always computably generated, as we
shall see later.
Lemma 5.5.5. If Σ is computable, then ⌜Th(Σ)⌝ is computably generated.

Proof. Apply Lemma 5.5.4 and the fact that for all a ∈ N

a ∈ ⌜Th(Σ)⌝ ⇐⇒ ∃b ( Prf_Σ(b) and a = (b)_{lh(b)∸1} and Sent(a) ).
Definition. An L-theory T is said to be computably axiomatizable if T has a
computable axiomatization.² We say that T is decidable if T is computable, and
undecidable otherwise. (Thus “T is decidable” means the same thing as “T is
computable,” but for L-theories “decidable” is more widely used than “computable”.)
Proposition 5.5.6 (Negation Theorem). Let A ⊆ Nⁿ and suppose both A and
its complement ¬A are computably generated. Then A is computable.
² Instead of “computably axiomatizable,” also “recursively axiomatizable” and “effectively
axiomatizable” are used.
Proof. Let P, Q ⊆ Nⁿ⁺¹ be computable such that for all a ∈ Nⁿ we have

A(a) ⇐⇒ ∃x P(a, x),   ¬A(a) ⇐⇒ ∃x Q(a, x).

Then there is for each a ∈ Nⁿ an x ∈ N such that (P ∨ Q)(a, x). The
computability of A follows by noting that for all a ∈ Nⁿ we have

A(a) ⇐⇒ P(a, µx(P ∨ Q)(a, x)).
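The µ-operator here is an unbounded search that the hypothesis guarantees will stop.
A toy Python illustration of the same idea (not the general situation of the proposition,
just the shape of the algorithm):

from itertools import count

def decide(a, P, Q):
    # A(a) iff some x satisfies P(a, x); the complement of A is the projection of Q
    for x in count():                       # µx (P ∨ Q)(a, x): the search terminates
        if P(a, x) or Q(a, x):
            return P(a, x)

is_even = lambda a, x: a == 2 * x           # witnesses for membership in A = even numbers
is_odd  = lambda a, x: a == 2 * x + 1       # witnesses for membership in the complement
print(decide(10, is_even, is_odd), decide(7, is_even, is_odd))    # True False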
Proposition 5.5.7. Every complete and computably axiomatizable L-theory is
decidable.

Proof. Let T be a complete L-theory with computable axiomatization Σ. Then
⌜T⌝ = ⌜Th(Σ)⌝ is computably generated. Now observe:

a ∉ ⌜T⌝ ⇐⇒ a ∉ Sent or ⟨SN(¬), a⟩ ∈ ⌜T⌝
      ⇐⇒ a ∉ Sent or ∃b ( Prf_Σ(b) and (b)_{lh(b)∸1} = ⟨SN(¬), a⟩ ).

Hence the complement of ⌜T⌝ is computably generated. Thus T is decidable by
the Negation Theorem.
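An equivalent way to see this result, closer to the informal notion of algorithm:
enumerate all proofs from Σ and wait until one of them ends in σ or in ¬σ; completeness
guarantees that the wait is finite. A Python rendering of this search, with the proof
enumerator and the negation map on sentences assumed as given:

def decide_sentence(sigma, negate, proofs_from_sigma):
    # proofs_from_sigma() is assumed to yield every proof from Sigma, each proof
    # given as the sequence of its formulas; by completeness some proof ends in
    # sigma or in negate(sigma), so the loop terminates.
    for proof in proofs_from_sigma():
        if proof[-1] == sigma:
            return True
        if proof[-1] == negate(sigma):
            return False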
Relaxing the assumption of a finite language. Much of the above does
not really need the assumption that L is finite. In the discussion below we only
assume that L is countable, so the case of finite L is included. First, we assign
to each symbol

s ∈ {v₀, v₁, v₂, ...} ∪ {logical symbols}

its symbol number SN(s) ∈ N as follows: SN(vᵢ) := 2i, and for

s = ⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀, respectively, put
SN(s) = 1, 3, 5, 7, 9, 11, 13, 15, respectively.

This part of our numbering of symbols is independent of L.
By a numbering of L we mean an injective function L → N that assigns to each
s ∈ L an odd natural number SN(s) > 15 such that if s is a relation symbol,
then SN(s) ≡ 1 mod 4, and if s is a function symbol, then SN(s) ≡ 3 mod 4.
Such a numbering of L is said to be computable if the sets

SN(L) = {SN(s) : s ∈ L} ⊆ N,   {(SN(s), arity(s)) : s ∈ L} ⊆ N²

are computable. (So if L is finite, then every numbering of L is computable.)
Given a numbering of L we use it to assign to each L-term t and each L-formula ϕ
its Gödel number ⌜t⌝ and ⌜ϕ⌝, just as we did earlier in the section.
Suppose a computable numbering of L is given, with the corresponding Gödel
numbering of L-terms and L-formulas. Then Lemmas 5.5.1, 5.5.2, and 5.5.3 go
through, as a diligent reader can easily verify.
Let also a set Σ of L-sentences be given. Define ⌜Σ⌝ := {⌜σ⌝ : σ ∈ Σ}, and
call Σ computable if ⌜Σ⌝ is a computable subset of N. We define Prf_Σ to be the
set of all Gödel numbers of proofs from Σ, and for an L-theory T we define the
notions of T being computably axiomatizable, T being decidable, and T being
undecidable, all just as we did earlier in this section. (Note, however, that the
definitions of these notions are all relative to our given computable numbering
of L; for finite L different choices of numbering of L yield equivalent notions of Σ
being computable, T being computably axiomatizable, and T being decidable.)
It is now routine to check that Lemmas 5.5.4, 5.5.5, and Proposition 5.5.7 go
through.
Exercises.
(1) Let a and b denote positive real numbers. Call a computable if there are
computable functions f, g : N → N such that for all n > 0,

g(n) ≠ 0 and |a − f(n)/g(n)| < 1/n.

Then:
(i) every positive rational number is computable, and e is computable;
(ii) if a and b are computable, so are a + b, ab, and 1/a, and if in addition
a > b, then a − b is also computable;
(iii) a is computable if and only if the binary relation Rₐ on N defined by

Rₐ(m, n) ⇐⇒ n > 0 and m/n < a

is computable. (Hint: use the Negation Theorem.)
(2) A nonempty S ⊆ N is computably generated iff there is a computable function
f : N → N such that S = f(N). Moreover, if S is infinite and computably
generated, then f can be chosen injective.
(3) If f : N → N is computable and f(x) > x for all x ∈ N, then f(N) is computable.
(4) Every infinite computably generated subset of N has an infinite computable
subset.
(5) A function F : Nⁿ → N is computable iff its graph is computably generated.
(6) Suppose Σ is a computable and consistent set of sentences in the numerical
language L. Then every Σ-representable relation R ⊆ Nⁿ is computable.
(7) A function F : Nᵐ → N is computable if and only if it is Σ-representable for some
finite, consistent set Σ ⊇ N of sentences in some numerical language L ⊇ L(N).
The last exercise gives an alternative characterization of “computable function.”
5.6 Theorems of Gödel and Church
In this section we assume that the finite language L extends L(N).
Theorem 5.6.1 (Church). No consistent L-theory extending N is decidable.
Before giving the proof we record the following consequence:
Corollary 5.6.2 (Weak form of Gödel’s Incompleteness Theorem). Each computably
axiomatizable L-theory extending N is incomplete.
Proof. Immediate from 5.5.7 and Church’s Theorem.
We will indicate in the next section how to construct for any consistent computable
set of L-sentences Σ ⊇ N an L-sentence σ such that Σ ⊬ σ and Σ ⊬ ¬σ.
(The corollary above only says that such a sentence exists.)
For the proof of Church’s Theorem we need a few lemmas.
Lemma 5.6.3. The function Num : N → N defined by Num(a) = ⌜S^a 0⌝ is
computable.
Proof. Num(0) = ⌜0⌝ and Num(a + 1) = ⟨SN(S), Num(a)⟩.
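In code the recursion is a short loop; the code ⌜0⌝ of the constant symbol 0 and the
symbol number SN(S) depend on the chosen numbering of L and are taken here as
parameters (tuples again stand in for the sequence coding).

def num(a, code_of_zero, sn_of_S):
    # Num(0) = code of 0, and Num(a+1) = <SN(S), Num(a)>
    code = code_of_zero
    for _ in range(a):
        code = (sn_of_S, code)
    return code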
Let P ⊆ A² be any binary relation on a set A. For a ∈ A, we let P(a) ⊆ A be
given by the equivalence P(a)(b) ⇔ P(a, b).
Lemma 5.6.4 (Cantor). Given any P ⊆ A², its antidiagonal Q ⊆ A defined by

Q(b) ⇐⇒ ¬P(b, b)

is not of the form P(a) for any a ∈ A.
Proof. Suppose Q = P(a), where a ∈ A. Then Q(a) iff P(a, a). But by definition,
Q(a) iff ¬P(a, a), a contradiction.
This is essentially Cantor’s proof that no f : A →P(A) can be surjective. (Use
P(a, b) :⇔ b ∈ f(a); then P(a) = f(a).)
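For a concrete feel for the antidiagonal, here is the construction in Python for A = N,
with a binary relation P given as a predicate; Q disagrees with the row P(a) at the
point a, for every a.

def antidiagonal(P):
    return lambda b: not P(b, b)

P = lambda a, b: (a + b) % 2 == 0                 # an arbitrary sample relation on N
Q = antidiagonal(P)
print(any(Q(a) == P(a, a) for a in range(100)))   # False: Q(a) differs from P(a, a) for each a tested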
Definition. Let Σ be a set of L-sentences. We fix a variable x (e. g. x = v₀)
and define the binary relation P_Σ ⊆ N² by

P_Σ(a, b) ⇐⇒ Sub(a, ⌜x⌝, Num(b)) ∈ ⌜Th(Σ)⌝.

For an L-formula ϕ(x) and a = ⌜ϕ(x)⌝, we have

Sub(⌜ϕ(x)⌝, ⌜x⌝, ⌜S^b 0⌝) = ⌜ϕ(S^b 0)⌝,

so

P_Σ(a, b) ⇐⇒ Σ ⊢ ϕ(S^b 0).
Lemma 5.6.5. Suppose Σ ⊇ N is consistent. Then each computable set X ⊆ N
is of the form X = P_Σ(a) for some a ∈ N.

Proof. Let X ⊆ N be computable. Then X is Σ-representable by Theorem 5.4.4,
say by the formula ϕ(x), i. e. X(b) ⇒ Σ ⊢ ϕ(S^b 0), and ¬X(b) ⇒ Σ ⊢ ¬ϕ(S^b 0).
So X(b) ⇔ Σ ⊢ ϕ(S^b 0) (using consistency to get “⇐”). Take a = ⌜ϕ(x)⌝; then
X(b) iff Σ ⊢ ϕ(S^b 0) iff P_Σ(a, b), that is, X = P_Σ(a).
Proof of Church’s Theorem. Let Σ ⊇ N be consistent. We have to show that
then Th(Σ) is undecidable, that is, ⌜Th(Σ)⌝ is not computable. Suppose that
⌜Th(Σ)⌝ is computable. Then the antidiagonal Q_Σ ⊆ N of P_Σ is computable:

b ∈ Q_Σ ⇔ (b, b) ∉ P_Σ ⇔ Sub(b, ⌜x⌝, Num(b)) ∉ ⌜Th(Σ)⌝.

By Lemma 5.6.4, Q_Σ is not among the P_Σ(a). Therefore by Lemma 5.6.5, Q_Σ
is not computable, a contradiction. This concludes the proof.
By Lemma 5.5.5 the subset ⌜Th(N)⌝ of N is computably generated. But this
set is not computable:
Corollary 5.6.6. Th(N) and Th(∅) (in the language L(N)) are undecidable.

Proof. The undecidability of Th(N) is a special case of Church’s Theorem. Let
∧N be the sentence N1 ∧ · · · ∧ N9. Then, for any L(N)-sentence σ,

N ⊢ σ ⇐⇒ ∅ ⊢ ∧N → σ,

that is, for all a ∈ N,

a ∈ ⌜Th(N)⌝ ⇐⇒ a ∈ Sent and ⟨SN(∨), ⟨SN(¬), ⌜∧N⌝⟩, a⟩ ∈ ⌜Th(∅)⌝.

Therefore, if Th(∅) were decidable, then Th(N) would be decidable; but Th(N)
is undecidable. So Th(∅) is undecidable.
Discussion. We have seen that N is quite weak. A very strong set of axioms
in the language L(N) is PA (1st order Peano Arithmetic). Its axioms are those
of N together with all induction axioms, that is, all sentences of the form

∀x[ (ϕ(x, 0) ∧ ∀y (ϕ(x, y) → ϕ(x, Sy))) → ∀y ϕ(x, y) ]

where ϕ(x, y) is an L(N)-formula, x = (x₁, ..., xₙ), and ∀x stands for ∀x₁ ... ∀xₙ.
Note that PA is consistent, since it has N = (N; <, 0, S, +, ·) as a model.
Also PA is computable (exercise). Thus by the theorems above, Th(PA) is
undecidable and incomplete. To appreciate the significance of this result, one
needs a little background knowledge, including some history.
Over a century of experience has shown that number theoretic assertions can
be expressed by sentences of L(N), admittedly in an often contorted way. (That
is, we know how to construct for any number theoretic statement a sentence σ
of L(N) such that the statement is true if and only if N ⊨ σ. In most cases we
just indicate how to construct such a sentence, since an actual sentence would
be too unwieldy without abbreviations.)
What is more important, we know from experience that any established fact
of classical number theory—including results obtained by sophisticated analytic
and algebraic methods—can be proved from PA, in the sense that PA ⊢ σ
for the sentence σ expressing that fact. Thus before Gödel’s Incompleteness
Theorem it seemed natural to conjecture that PA is complete. (Apparently it
was not widely recognized at the time that completeness of PA would have had
the astonishing consequence that Th(N) is decidable.) Of course, the situation
cannot be remedied by adding new axioms to PA, at least if we insist that the
axioms are true in N and that we have effective means to tell which sentences
are axioms. In this sense, the Incompleteness Theorem is pervasive.
5.7 A more explicit incompleteness theorem
In Section 5.6 we obtained Gödel’s Incompleteness Theorem as an immediate
corollary of Church’s theorem. In this section, we prove the incompleteness
theorem in the more explicit form stated in the introduction to this chapter.
In this section L ⊇ L(N) is a finite language, and Σ is a set of L-sentences.
We also fix two distinct variables x and y.
We shall indicate how to construct, for any computable consistent Σ ⊇ N, a
formula ϕ(x) of L(N) with the following properties:
(i) N ⊢ ϕ(S^m 0) for each m;
(ii) Σ ⊬ ∀xϕ(x).
Note that then the sentence ∀xϕ(x) is true in N but not provable from Σ. Here
is a sketch of how to make such a sentence. Assume for simplicity that L = L(N)
and N ⊨ Σ. The idea is to construct sentences σ and σ′ such that

(1) N ⊨ σ ↔ σ′;   and   (2) N ⊨ σ′ ⇐⇒ Σ ⊬ σ.

From (1) and (2) we get N ⊨ σ ⇐⇒ Σ ⊬ σ. Assuming that N ⊭ σ produces
a contradiction. Hence σ is true in N, and thus(!) not provable from Σ.
How to implement this strange idea? To take care of (2), one might guess
that σ′ = ∀x¬pr(x, S^⌜σ⌝ 0) where pr(x, y) is a formula representing in N the
binary relation Pr ⊆ N² defined by

Pr(m, n) ⇐⇒ m is the Gödel number of a proof from Σ
of a sentence with Gödel number n.

But how do we arrange (1)? Since σ′ := ∀x¬pr(x, S^⌜σ⌝ 0) depends on σ, the
solution is to apply the fixed-point lemma below to ρ(y) := ∀x¬pr(x, y).
This finishes our sketch. What follows is a rigorous implementation.
Lemma 5.7.1 (Fixpoint Lemma). Suppose Σ ⊇ N. Then there is for every
L-formula ρ(y) an L-sentence σ such that Σ ⊢ σ ↔ ρ(S^n 0) where n = ⌜σ⌝.

Proof. The function (a, b) → Sub(a, ⌜x⌝, Num(b)) : N² → N is computable by
Lemma 5.5.2. Hence by the representability theorem it is N-representable. Let
sub(x₁, x₂, y) be an L(N)-formula representing it in N. We can assume that the
variable x does not occur in sub(x₁, x₂, y). Then for all a, b in N,

N ⊢ sub(S^a 0, S^b 0, y) ↔ y = S^c 0, where c = Sub(a, ⌜x⌝, Num(b))   (1)
Now let ρ(y) be an L-formula. Define θ(x) := ∃y(sub(x, x, y) ∧ ρ(y)) and let
m = ⌜θ(x)⌝. Let σ := θ(S^m 0), and put n = ⌜σ⌝. We claim that

Σ ⊢ σ ↔ ρ(S^n 0).

Indeed,

n = ⌜σ⌝ = ⌜θ(S^m 0)⌝ = Sub(⌜θ(x)⌝, ⌜x⌝, ⌜S^m 0⌝) = Sub(m, ⌜x⌝, Num(m)).

So by (1),

N ⊢ sub(S^m 0, S^m 0, y) ↔ y = S^n 0   (2)

We have

σ = θ(S^m 0) = ∃y(sub(S^m 0, S^m 0, y) ∧ ρ(y)),

so by (2) we get Σ ⊢ σ ↔ ∃y(y = S^n 0 ∧ ρ(y)). Hence, Σ ⊢ σ ↔ ρ(S^n 0).
Theorem 5.7.2. Suppose Σ is consistent, computable, and proves all axioms
of N. Then there exists an L(N)-formula ϕ(x) such that N ⊢ ϕ(S^m 0) for each
m, but Σ ⊬ ∀xϕ(x).
Proof. Consider the relation Pr_Σ ⊆ N² defined by

Pr_Σ(m, n) ⇐⇒ m is the Gödel number of a proof from Σ
of an L-sentence with Gödel number n.

Since Σ is computable, Pr_Σ is computable. Hence Pr_Σ is representable in N. Let
pr_Σ(x, y) be an L(N)-formula representing Pr_Σ in N, and hence in Σ. Because
Σ is consistent we have for all m, n:

Σ ⊢ pr_Σ(S^m 0, S^n 0) ⇐⇒ Pr_Σ(m, n)   (1)
Σ ⊢ ¬pr_Σ(S^m 0, S^n 0) ⇐⇒ ¬Pr_Σ(m, n)   (2)

Let ρ(y) be the L(N)-formula ∀x¬pr_Σ(x, y). Lemma 5.7.1 (with L = L(N) and
Σ = N) provides an L(N)-sentence σ such that N ⊢ σ ↔ ρ(S^⌜σ⌝ 0). It follows
that Σ ⊢ σ ↔ ρ(S^⌜σ⌝ 0), that is

Σ ⊢ σ ↔ ∀x¬pr_Σ(x, S^⌜σ⌝ 0)   (3)

Claim: Σ ⊬ σ. Assume towards a contradiction that Σ ⊢ σ; let m be the
Gödel number of a proof of σ from Σ, so Pr_Σ(m, ⌜σ⌝). Because of (3) we also
have Σ ⊢ ∀x¬pr_Σ(x, S^⌜σ⌝ 0), so Σ ⊢ ¬pr_Σ(S^m 0, S^⌜σ⌝ 0), which by (2) yields
¬Pr_Σ(m, ⌜σ⌝), a contradiction. This establishes the claim.
Now put ϕ(x) := ¬pr_Σ(x, S^⌜σ⌝ 0). We now show:
(i) N ⊢ ϕ(S^m 0) for each m. Because Σ ⊬ σ, no m is the Gödel number
of a proof of σ from Σ. Hence ¬Pr_Σ(m, ⌜σ⌝) for each m, which by the
defining property of pr_Σ yields N ⊢ ¬pr_Σ(S^m 0, S^⌜σ⌝ 0) for each m, that is,
N ⊢ ϕ(S^m 0) for each m.
(ii) Σ ⊬ ∀xϕ(x). This is because of the Claim and Σ ⊢ σ ↔ ∀xϕ(x), by (3).
Corollary 5.7.3. Suppose that Σ is computable and true in the L-expansion
N′ of N. Then there exists an L(N)-formula ϕ(x) such that N ⊢ ϕ(S^n 0) for
each n, but Σ ∪ N ⊬ ∀xϕ(x).

(Note that then ∀xϕ(x) is true in N′ but not provable from Σ.) To obtain
this corollary, apply the theorem above to Σ ∪ N in place of Σ.
Exercises.
(1) Let N′ be an L-expansion of N. Then the set of Gödel numbers of L-sentences
true in N′ is not definable in N′.
This result (of Tarski) is known as the undefinability of truth. It strengthens the
special case of Church’s theorem which says that the set of Gödel numbers of
L-sentences true in N′ is not computable. Tarski’s result follows easily from the
fixpoint lemma, as does the more general result in the next exercise.
(2) Suppose Σ ⊇ N is consistent. Then the set ⌜Th(Σ)⌝ is not Σ-representable, and
there is no truth definition for Σ. Here a truth definition for Σ is an L-formula
true(y) such that for all L-sentences σ,

Σ ⊢ σ ←→ true(S^n 0), where n = ⌜σ⌝.
5.8 Undecidable Theories
Church’s theorem says that any consistent theory containing a certain basic
amount of integer arithmetic is undecidable. How about theories like Th(Fl)
(the theory of fields), and Th(Gr) (the theory of groups)? An easy way to prove
the undecidability of such theories is due to Tarski: he noticed that if N is
definable in some model of a theory T, then T is undecidable. The aim of this
section is to establish this result and indicate some applications. In order not to
distract from this theme by boring details, we shall occasionally replace a proof
by an appeal to the Church-Turing Thesis. (A conscientious reader will replace
these appeals by proofs until reaching a level of skill that makes constructing
such proofs a predictable routine.)
In this section, L and L′ are finite languages, Σ is a set of L-sentences, and
Σ′ is a set of L′-sentences.
Lemma 5.8.1. Let L ⊆ L′ and Σ ⊆ Σ′.
(1) Suppose Σ′ is conservative over Σ. Then

Th_L(Σ) is undecidable =⇒ Th_{L′}(Σ′) is undecidable.

(2) Suppose L = L′ and Σ′ ∖ Σ is finite. Then

Th(Σ′) is undecidable =⇒ Th(Σ) is undecidable.
(3) Suppose all symbols of L′ ∖ L are constant symbols. Then

Th_L(Σ) is undecidable ⇐⇒ Th_{L′}(Σ) is undecidable.

(4) Suppose Σ′ extends Σ by a definition. Then

Th_L(Σ) is undecidable ⇐⇒ Th_{L′}(Σ′) is undecidable.
Proof. (1) In this case we have for all a ∈ N,

a ∈ ⌜Th_L(Σ)⌝ ⇐⇒ a ∈ Sent_L and a ∈ ⌜Th_{L′}(Σ′)⌝.

It follows that if Th_{L′}(Σ′) is decidable, so is Th_L(Σ).
(2) Write Σ′ = {σ₁, ..., σ_N} ∪ Σ, and put σ′ := σ₁ ∧ · · · ∧ σ_N. Then for each
L-sentence σ we have Σ′ ⊢ σ ⇐⇒ Σ ⊢ σ′ → σ, so for all a ∈ N,

a ∈ ⌜Th(Σ′)⌝ ⇐⇒ a ∈ Sent and ⟨SN(∨), ⟨SN(¬), ⌜σ′⌝⟩, a⟩ ∈ ⌜Th(Σ)⌝.

It follows that if Th(Σ) is decidable then so is Th(Σ′).
(3) Let c₀, ..., cₙ be the distinct constant symbols of L′ ∖ L. Given any L′-sentence σ
we define the L-sentence σ′ as follows: take k ∈ N minimal such
that σ contains no variable vₘ with m ≥ k, replace each occurrence of cᵢ in σ
by v_{k+i} for i = 0, ..., n, and let ϕ(v_k, ..., v_{k+n}) be the resulting L-formula (so
σ = ϕ(c₀, ..., cₙ)); then σ′ := ∀v_k ... ∀v_{k+n} ϕ(v_k, ..., v_{k+n}). An easy argument
using the completeness theorem shows that

Σ ⊢_{L′} σ ⇐⇒ Σ ⊢_L σ′.

By the Church-Turing Thesis there is a computable function a → a′ : N → N
such that ⌜σ⌝′ = ⌜σ′⌝ for all L′-sentences σ; we leave it to the reader to replace
this appeal to the Church-Turing Thesis by a proof. Then, for all a ∈ N:

a ∈ ⌜Th_{L′}(Σ)⌝ ⇐⇒ a ∈ Sent_{L′} and a′ ∈ ⌜Th_L(Σ)⌝.

This yields the ⇐ direction of (3); the converse holds by (1).
(4) The ⇒ direction holds by (1). For the ⇐ we use an algorithm (see ...) that
computes for each L′-sentence σ an L-sentence σ′ such that Σ′ ⊢ σ ↔ σ′. By
the Church-Turing Thesis there is a computable function a → a′ : N → N such
that ⌜σ⌝′ = ⌜σ′⌝ for all L′-sentences σ. Hence, for all a ∈ N,

a ∈ ⌜Th_{L′}(Σ′)⌝ ⇐⇒ a ∈ Sent_{L′} and a′ ∈ ⌜Th_L(Σ)⌝.

This yields the ⇐ direction of (4).
Remark. We cannot drop the assumption L = L′ in (2): take L = ∅, Σ = ∅,
L′ = L(N) and Σ′ = ∅. Then Th_{L′}(Σ′) is undecidable by Corollary 5.6.6, but
Th_L(Σ) is decidable (exercise).
Definition. An L-structure A is said to be strongly undecidable if for every set
Σ of L-sentences such that A ⊨ Σ, Th(Σ) is undecidable.
So A is strongly undecidable iff every L-theory of which A is a model is undecidable.
Example. N = (N; <, 0, S, +, ·) is strongly undecidable. To see this, let Σ
be a set of L(N)-sentences such that N ⊨ Σ. We have to show that Th(Σ) is
undecidable. Now N ⊨ Σ ∪ N. By Church’s Theorem Th(Σ ∪ N) is undecidable,
hence Th(Σ) is undecidable by part (2) of Lemma 5.8.1.
The following result is an easy application of part (3) of the previous lemma.
Lemma 5.8.2. Let c₀, ..., cₙ be distinct constant symbols not in L, and let
(A, a₀, ..., aₙ) be an L(c₀, ..., cₙ)-expansion of the L-structure A. Then

(A, a₀, ..., aₙ) is strongly undecidable =⇒ A is strongly undecidable.
Theorem 5.8.3 (Tarski). Suppose the L-structure A is definable in the L′-structure B
and A is strongly undecidable. Then B is strongly undecidable.

Proof. (Sketch) By the previous lemma (with L′ and B instead of L and A), we
can reduce to the case that we have a 0-definition δ of A in B. One can show
that in this case we have (1) an algorithm that computes for any L-sentence
σ an L′-sentence δσ, and (2) a finite set ∆ of L′-sentences true in B, with the
following properties:

(i) A ⊨ σ ⇐⇒ B ⊨ δσ;
(ii) for each set Σ′ of L′-sentences with B ⊨ Σ′ and each L-sentence σ,

Σ ⊢ σ ⇐⇒ Σ′ ∪ ∆ ⊢ δσ, where Σ := {s : s is an L-sentence and Σ′ ∪ ∆ ⊨ δs}.

Let Σ′ be a set of L′-sentences such that B ⊨ Σ′; we need to show that Th_{L′}(Σ′)
is undecidable. Suppose towards a contradiction that Th_{L′}(Σ′) is decidable.
Then Th_{L′}(Σ′ ∪ ∆) is decidable, by part (2) of Lemma 5.8.1, so we have an
algorithm for deciding whether any given L′-sentence is provable from Σ′ ∪ ∆.
Take Σ as in (ii) above. Then A ⊨ Σ by (i), and by (ii) we obtain an algorithm
for deciding whether any given L-sentence is provable from Σ. But Th_L(Σ) is
undecidable by assumption, and we have a contradiction.
Corollary 5.8.4. Th(Ri) is undecidable, in other words, the theory of rings is
undecidable.
Proof. It suffices to show that the ring (Z; 0, 1, +, −, ·) of integers is strongly
undecidable. Using Lagrange’s theorem that

N = {a² + b² + c² + d² : a, b, c, d ∈ Z},

we see that the inclusion map N → Z defines N in the ring of integers, so by
Tarski’s Theorem the ring of integers is strongly undecidable.
For the same reason, the theory of commutative rings, of integral domains, and
more generally, the theory of any class of rings that has the ring of integers
among its members is undecidable.
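To spell out the definability step in the proof of Corollary 5.8.4 a little further (this
is only an illustration; the proof above needs nothing beyond the displayed identity):
by Lagrange’s theorem an integer belongs to N exactly if it is a sum of four squares,
so the formula

δ(x) := ∃y₁∃y₂∃y₃∃y₄ (x = y₁·y₁ + y₂·y₂ + y₃·y₃ + y₄·y₄)

defines the subset N of Z in the language of rings, and the ordering and successor
function on N are in turn definable from + on this subset: a < b iff there is a nonzero
c ∈ N with a + c = b, and S(a) = a + 1.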
Fact. The set Z ⊆ Q is 0-definable in the field (Q; 0, 1, +, −, ·) of rational
numbers, and thus the ring of integers is definable in the field of rational numbers.
We shall take this here on faith. The only known proof uses non-trivial results
about quadratic forms, and is due to Julia Robinson.
Corollary 5.8.5. The theory Th(Fl) of fields is undecidable. The theory of any
class of fields that has the field of rationals among its members is undecidable.
Exercises.
(1) Argue informally, using the Church-Turing Thesis, that Th(ACF) is decidable.
You can use the fact that ACF has QE.
Below we let a, b, c denote integers. We say that a divides b (notation: a | b) if
ax = b for some integer x, and we say that c is a least common multiple of a and
b if a | c, b | c, and c | x for every integer x such that a | x and b | x. Recall
that if a and b are not both zero, then they have a unique positive least common
multiple, and that if a and b are coprime (that is, there is no integer x > 1 with
x | a and x | b), then they have ab as a least common multiple.
(2) The structure (Z; 0, 1, +, |) is strongly undecidable, where | is the binary relation
of divisibility on Z. Hint: Show that if b + a is a least common multiple of a and
a + 1, and b − a is a least common multiple of a and a − 1, then b = a². Use this to
define the squaring function in (Z; 0, 1, +, |), and then show that multiplication
is 0-definable in (Z; 0, 1, +, |).
(3) Consider the group G of bijective maps Z → Z, with composition as the group
multiplication. Then G (as a model of Gr) is strongly undecidable. Hint: let s
be the element of G given by s(x) = x + 1. Check that if g ∈ G commutes with
s, then g = sᵃ for some a. Next show that

a | b ⇐⇒ sᵇ commutes with each g ∈ G that commutes with sᵃ.

Use these facts to specify a definition of the structure (Z; 0, 1, +, |) in the group
G.
Thus by Tarski’s theorem, the theory of groups is undecidable. In fact, the
theory of any class of groups that has the group G of the exercise above among
its members is undecidable. On the other hand, Th(Ab), the theory of abelian
groups, is known to be decidable (Szmielew).
(4) Let L = {F} have just a binary function symbol. Then predicate logic in L (that
is, Th_L(∅)) is undecidable.
It can also be shown that predicate logic in the language whose only symbol is a
binary relation symbol is undecidable. On the other hand, predicate logic in the
language whose only symbol is a unary function symbol is decidable.
To do?
• improve titlepage
• improve or delete index
• more exercises (from homework, exams)
• footnotes pointing to alternative terminology, etc.
• brief discussion of classes at end of “Sets and Maps”?
• at the end of results without proof?
• brief discussion on P=NP in connection with propositional logic
• section(s) on boolean algebra, including Stone representation, Lindenbaum-
Tarski algebras, etc.
• section on equational logic? (boolean algebras, groups, as examples)
• solution to a problem by Erdős via compactness theorem, and other simple
applications of compactness
• include “equality theorem”,
• translation of one language in another (needed in connection with Tarski
theorem in last section)
• more details on back-and-forth in connection with unnested formulas
• extra elementary model theory (universal classes, model-theoretic crite-
ria for qe, etc., application to ACF, maybe extra section on RCF, Ax’s
theorem.
• On computability: a few extra results on c.e. sets, and exponential dio-
phantine result.
• basic framework for many-sorted logic.
Contents
1 Preliminaries 1.1 Mathematical Logic: a brief overview . . . . . . . . . . . . . . . . 1.2 Sets and Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Basic Concepts of Logic 2.1 Propositional Logic . . . . . . . . . . . . . 2.2 Completeness for Propositional Logic . . . 2.3 Languages and Structures . . . . . . . . . 2.4 Variables and Terms . . . . . . . . . . . . 2.5 Formulas and Sentences . . . . . . . . . . 2.6 Models . . . . . . . . . . . . . . . . . . . . 2.7 Logical Axioms and Rules; Formal Proofs 3 The 3.1 3.2 3.3 3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 5 13 13 19 24 28 30 35 37 43 43 44 49 53 59 59 64 65 69 73

Completeness Theorem Another Form of Completeness . . . . . . . . Proof of the Completeness Theorem . . . . . Some Elementary Results of Predicate Logic . Equational Classes and Universal Algebra . .

4 Some Model Theory 4.1 L¨wenheim-Skolem; Vaught’s Test . . . . . . o 4.2 Elementary Equivalence and Back-and-Forth 4.3 Quantifier Elimination . . . . . . . . . . . . . 4.4 Presburger Arithmetic . . . . . . . . . . . . . 4.5 Skolemization and Extension by Definition . .

5 Computability, Decidability, and Incompleteness 5.1 Computable Functions . . . . . . . . . . . . . . . . 5.2 The Church-Turing Thesis . . . . . . . . . . . . . . 5.3 Primitive Recursive Functions . . . . . . . . . . . . 5.4 Representability . . . . . . . . . . . . . . . . . . . 5.5 Decidability and G¨del Numbering . . . . . . . . . o 5.6 Theorems of G¨del and Church . . . . . . . . . . . o 5.7 A more explicit incompleteness theorem . . . . . . i

77 . 77 . 86 . 87 . 91 . 95 . 100 . 103

. . . .8 Undecidable Theories . . . . . . . . . .5. . . . . . . . . . . 105 .

For additional material in Model Theory we refer the reader to Chang. and Keisler. C. Cantor.S. B. their activity led to the view that logic + set theory can serve as a basis for 1 . predicate logic.. R. like group theory or probability theory. Reading. H. propositional logic. see Shoenfield. Model Theory. J. Boole. Schr¨der.. as clarifying the foundations of the natural and real number systems. Peirce. to Rogers. C. New York. McGraw-Hill. and Leibniz dreamt of reducing reasoning to calculation. H. For more on the course material. As a mathematical subject. Poizat. Addison-Wesley.Chapter 1 Preliminaries We start with a brief overview of mathematical logic as covered in this course. 1967. From our perspective o we see their work as leading to boolean algebra. In this course we develop mathematical logic using elementary set theory as given. and as introducing suggestive symbolic notation for logical operations. Frege. Springer. 1990.. 2000. Theory of Recursive Functions and Effective Computability.1 Mathematical Logic: a brief overview Aristotle identified some simple patterns in human reasoning. Also. Next we review some basic notions from elementary set theory. C. 1. NorthHolland. which provides a medium for communicating mathematics in a precise and clear way. Peano. just as one would do with other branches of mathematics. A Course in Model Theory. J. set theory. Dedekind. and for additional material on Computability.. however. logic is relatively recent: the 19th century pioneers were Bolzano. Mathematical Logic. and E. 1967.

and Gentzen. o o Tarski. Hilbert. quotient sets and power sets. . Kleene. like cartesian products. Ramsey. 1 but it did bring crucial progress of a conceptual nature. Skolem. Church. In the period 1900-1950 important new ideas came from Russell. logic as pursued by mathematicians gradually branched into four main areas: model theory. Lusin (in Moscow). The topics in this course are part of the common background of mathematicians active in any of these areas. Post. The early part of the 20th century was also marked by the so-called f oundational crisis in mathematics. They discovered the first real theorems in mathematical logic. set theory. Lusin. Tarski (in Warsaw and Berkeley). and proof theory. PRELIMINARIES all of mathematics. who made up a large part of that generation and the next in mathematical logic. and also thrives on many interactions with other areas of mathematics and computer science. In the second half of the last century. o and Church (in Princeton) had many students and collaborators. More generally. o Hilbert (in G¨ttingen). G¨del. Turing. computability theory (or recursion theory). A strong impulse for developing mathematical logic came from the attempts during these times to provide solid foundations for mathematics. and the recognition that logic as used in mathematics obeys mathematical rules that can be made fully explicit. What distinguishes mathematical logic within mathematics is that statements about mathematical objects are taken seriously as mathematical objects in their own right. with those of G¨del having a dramatic impact. Hausdorff. in mathematical logic we formalize (formulate in a precise mathematical way) notions used informally by mathematicians such as: • property • statement (in a given language) • structure • truth (what it means for a given statement to be true in a given structure) • proof (from a given set of axioms) • algorithm 1 In the case of set theory one could dispute this. Even so. This era did not produce theorems in mathematical logic of any real depth. Zermelo.2 CHAPTER 1. the main influence of set theory on the rest of mathematics was to enable simple constructions of great generality. and this involves only very elementary set theory. Most of these names will be encountered again during the course. L¨wenheim. Mathematical logic has now taken on a life of its own. Herbrand.

this assertion can be expressed formally as (GC) ∀x[(1 + 1 < x ∧ even(x)) → ∃p∃q(prime(p) ∧ prime(q) ∧ x = p + q)] where even(x) abbreviates ∃y(x = y + y) and prime(p) abbreviates 1 < p ∧ ∀r∀s(p = r · s → (r = 1 ∨ s = 1)). then GC is true in (N. ·. which has symbols 0.) A century of experience gives us confidence that all classical number-theoretic results—old or new. you will see that if GC is provable from those axioms. 0.1. as is easily verified. <). 0. we can try to prove theorems about these formalized notions. ·. 2. →. . z. 1. With the understanding that the variables range over N = {0. If done with imagination. <). <). proved by elementary methods or by sophisticated algebra and analysis—can be proved from the Peano axioms for arithmetic. ∨. 1. 1. ∀. 2 Here we do not count as part of classical number theory some results like Ramsey’s Theorem that can be stated in the language of arithmetic. in our present state of knowledge. The notorious Goldbach Conjecture asserts that every even integer greater than 2 is a sum of two prime numbers. . 0. +. p. ∧. r. 0. and that 0. +. . +. +. +. ·. < denote the usual arithmetic operations and relations on N. One objective of this course is to figure out the connections (and disconnections) between these notions. formalization tends to caricature the informal concepts it aims to capture. ·. ·. 1. and variables x. That the question makes sense does not mean that it is of any interest. 2 However.) The point of this example is simply to make the reader aware of the notions “true in a given structure” and “provable from a given set of axioms.1. ¬. s. in addition to logical symbols like =. q. but are arguably more in the spirit of logic and combinatorics. 1. but not provable from those axioms. 1. ∃. < to denote arithmetic operations and relations. . (No proof of the Goldbach Conjecture is known. }. but no harm is done if this is kept firmly in mind. GC might be true in (N. The expression GC is an example of a formal statement (also called a sentence) in the language of arithmetic. Example. then its negation ¬GC is provable from those axioms.) It also makes sense to ask whether the sentence GC is true in the structure (R. (On the other hand. and that if GC is false in (N. MATHEMATICAL LOGIC: A BRIEF OVERVIEW 3 Once we have mathematical definitions of these notions. ·. ·. 1. +. Of course. once you know what exactly we mean by provable from the Peano axioms. The Goldbach Conjecture asserts that this particular sentence GC is true in the structure (N. +. 0. <) (It’s not.” and their difference. this process can lead to unexpected rewards. 1. y. <).

+. At this stage we only say by way of explanation that a model of Σ is a mathematical structure in which all sentences of Σ are true. Here is an important early result on a specific arithmetic structure: Theorem of Presburger and Skolem. When we do include multiplication. each of these results has stronger versions. What matters is to get some preliminary idea of what we are aiming for. Then σ is provable from Σ if and only if σ is true in all models of Σ. for each n = 2. Each sentence in the language of the structure (Z. but that should change during the course. . . If Σ is a countable set of sentences o in some language and Σ has a model. then Σ has a countable model. there is an algorithm that. but in this overview we prefer simple statements over strength and applicability. accordingly. 1. Theorem of L¨wenheim and Skolem. Then Σ has a model if and only if each finite subset of Σ has a model. The L¨wenheim-Skolem and Compactness theorems o do not mention the notion of provability. 4. if Σ is the (infinite) set of axioms for fields of characteristic zero in the language of rings. 0. or a = nb + 1 + · · · + 1 (with n disjuncts in total). We begin with two results that are fundamental in model theory. The next result goes a little beyond model theory by relating the notion of “model of Σ” to that of “provability from Σ”: Completeness Theorem (G¨del. <) we have not included multiplication among the primitives. but the theorem of L¨wenheim and Skolem predates the o Completeness Theorem. PRELIMINARIES Some highlights (1900–1950) The results below are among the most frequently used facts of mathematical logic.4 CHAPTER 1. 0. For example. The terminology used in stating these results might be unfamiliar. Moreover. . −. on which applications often depend. given any sentence in this language as input. and thus model theorists often prefer to bypass Completeness in establishing these results. decides whether this sentence is true in (Z. 3. 0.. Let Σ be a set of sentences in some o language L. +. Mal’cev). In the case of the Compactness Theorem this reflects history. by an axiom that says that for every a there is a b such that a = nb or a = nb + 1 or . and let σ be a sentence in L. the situation changes dramatically: .. 1930).. −. −. <) that is true in this structure is provable from the axioms for ordered abelian groups with least positive element 1. nb stands for b + · · · + b (with n summands). <). then a model of Σ is just a field of characteristic zero. As will become clear during the course. Let Σ be a set of sentences in some o language. In our treatment we shall obtain the first two theorems as byproducts of the Completeness Theorem and its proof. +. 1. They concern the notion of model of Σ where Σ is a set of sentences in a language L. see for example Poizat’s book. Compactness Theorem (G¨del. augmented. Note that in (Z. 1.

In contrast to these incompleteness and undecidability results on (sufficiently rich) arithmetic. In a few places we shall need more set theory than we introduce here. <). .every positive element is a square. 1. In such a treatment the notion of set itself is left undefined. ordinals and cardinals. but not provable from the Peano axioms. R. <). 1930’s). . 0. Here “there is no algorithm” is used in the mathematical sense of there cannot exist an algorithm. for example. ·. we have Tarski’s theorem on the field of real numbers (1930-1950). (G¨del-Church. We consider the sets A and B as the same set (notation: A = B) / if and only if they have exactly the same elements.2 Sets and Maps We shall use this section as an opportunity to fix notations and terminologies that are used throughout these notes. but the axioms about sets are suggested by thinking of a set as a collection of mathematical objects.) Halmos. 1. <) is provable from the axioms for ordered fields augmented by the axioms . 0. 1.. Naive set theory. ·. decides whether this sentence is true in (N. To indicate that x is not in A we write x ∈ A. ·. in words: x is in A (or: x belongs to A). +. (It also contains an axiomatic treatment of set theory starting from scratch. +. 1. ·. called its elements or members. listing or indicating inside the brackets its elements. The following little book is a good place to read about these matters.2. New York. 1. SETS AND MAPS 5 Incompleteness and undecidability of arithmetic. given any sentence in this language as input. 1974 In an axiomatic treatment of set theory as in the book by Halmos all assertions about sets below are proved from a few simple axioms. There is also an algorithm that decides for any given sentence in this language as input. and throughout mathematics.” This theorem is intimately connected with the clarification of notions like computability and algorithm in which Turing played a key role.every odd degree polynomial has a zero. Every sentence in the language of arithmetic that is true in the structure (R. +. o One can construct a sentence in the language of arithmetic that is true in the structure (N. There is no algorithm that. We often introduce a set via the bracket notation. +.1. To indicate that an object x is an element of the set A we write x ∈ A. <). 0. P. Springer. whether this sentence is true in (R. not in the weaker colloquial sense of “no algorithm is known. 0.

Don’t confuse an object x with the set {x} that has x as its only element: for example. −1. 1. (3) The set of integers: Z = {. and 7 as its only elements. 3. we shall often write “iff” or “⇔” to abbreviate “if and only if. that is. / A × B := {(a. you may think of R × R as the set of points (a. namely 0 and 1. . The key property of ordered pairs is that we have (a. Here are some important sets that the reader has probably encountered previously. b) in the xy-plane of coordinate geometry. “for all m .” . and each set is a subset of itself. A ∩ B := {x : x ∈ A and x ∈ B} (intersection of A and B). the object x = {0. Remark. (4) The set of rational numbers: Q. 2. 2. We often introduce a set A in our discussions by defining A to be the set of all elements of a given set B that satisfy some property P . . then we say that A is a subset of B (and write A ⊆ B).}. We say that A and B are disjoint if A ∩ B = ∅. d) if and only if a = c and b = d. Examples. PRELIMINARIES For example. .” We committed a similar abuse of language earlier in defining set inclusion by the phrase “If —. b) = (c.}.” We shall continue such abuse. .” will mean “for all m ∈ N. 1}} has only one element. (1) The empty set: ∅ (it has no elements). (6) The set of complex numbers: C. 1}. (2) The set of natural numbers: N = {0. then we say that · · · . Remark.. For example. b) : a ∈ A and b ∈ B} (cartesian product of A and B). Then we can form the following sets: (a) (b) (c) (d) A ∪ B := {x : x ∈ A or x ∈ B} (union of A and B).”... and {3.” the meaning of “if” is actually “if and only if. but the set {x} = {{0. 2. (5) The set of real numbers: R. 0. 3} = {3}: the same set can be described in many different ways. Let A and B be sets. . . Throughout these notes m and n always denote natural numbers. For example. they have no element in common. 2. . namely x. In a definition such as we just gave: “We say that · · · if —. . 7} is the set with 1. Thus the empty set ∅ is a subset of every set. but only in similarly worded definitions. Thus the elements of A × B are the so-called ordered pairs (a.6 CHAPTER 1. 7} = {2. Also. A B := {x : x ∈ A and x ∈ B} (difference of A and B). b) with a ∈ A and b ∈ B. 1} is a set that has exactly two elements. If all elements of the set A are in the set B. Note that {1. −2. 7. 2. in accordance with tradition. 1. {1.. Notation: A := {x ∈ B : x satisfies P } (hence A ⊆ B). .

Let f : A → B be a map. It is called the composition of g and f .1. and “transformation” is used in geometric situations where domain and codomain are equal. “operator” when the elements of domain and codomain are themselves functions.2. but it will always be clear from the context when f (X) is meant to be the the direct image of X under f . B. “function” is used when the codomain is a set of numbers of some kind. an gives rise to a function x → f (x) : R → R. Γ) of sets A. some authors resolve the conflict by denoting this direct image by f [X] or in some other way. . We often use the “maps to” symbol → in this way to indicate the rule by which to each x in the domain we associate its value f (x). For Y ⊆ B we put f −1 (Y ) := {x ∈ A : f (x) ∈ Y } ⊆ A (inverse image of Y under f ).3 We call A the domain of f . respectively. B. transf ormation. It is said to be injective if for all a1 = a2 in A we have f (a1 ) = f (a2 ). Typically. and call it the value of f at a (or the image of a under f ). 3 Sometimes 4 Other we shall write f a instead of f (a) in order to cut down on parentheses.) Examples. f unction. . SETS AND MAPS 7 Maps Definition. and Γ the graph of f . Γ such that Γ ⊆ A × B and for each a ∈ A there is exactly one b ∈ B with (a. operator. . . Among the many synonyms of map are mapping. (We use equal as synonym for the same or identical. Definition. For X ⊆ A we put f (X) := {f (x) : x ∈ X} ⊆ B (direct image of X under f ). (There is a notational conflict here when X is both a subset of A and an element of A. A map is a triple f = (A. (2) Any polynomial f (X) = a0 + a1 X + · · · + an X n with real coefficients a0 . b) ∈ Γ. It is said to be surjective if for each b ∈ B there exists a ∈ A such that f (a) = b. (1) Given any set A we have the identity map 1A : A → A defined by 1A (a) = a for all a ∈ A. Thus surjectivity of our map f is equivalent to f (A) = B.4 We write f : A → B to indicate that f is a map with domain A and codomain B. and B the codomain of f .) We also call f (A) = {f (a) : a ∈ A} the image of f . words for “domain” and “codomain” are “source” and “target”. also coincide is a synonym for being the same. we write f (a) for this unique b. assignment. Definition. It is said to be bijective (or a bijection) if it is both injective and surjective. Given f : A → B and g : B → C we have a map g ◦ f : A → C defined by (g ◦ f )(a) = g(f (a)) for all a ∈ A. and in this situation we also say that f is a map from A to B. .

8 CHAPTER 1. This concludes the proof. a contradiction. Then {a ∈ A : a ∈ S(a)} = S(b) where b ∈ A. we call this unique n the number of elements of A or the cardinality of A. . Conversely. (The attentive reader will notice that we just introduced a potential conflict of notation: for bijective f : A → B and Y ⊆ B. . . The sets N. Thus b ∈ S(b). n} as a suggestive notation for the set {m : 1 ≤ m ≤ n}. A set which is not finite is said to be infinite. since these two subsets of A coincide. B ⊆ D.) It follows from the definition of “map” that f : A → B and g : C → D are equal (f = g) if and only if A = C. . Z and Q are countably infinite. . We say that g : C → D extends f : A → B if A ⊆ C. Such a set of subsets of A is clearly uniquely determined by A. the Power Set Axiom says: For any set A. If A is finite. . A set A is said to be finite if there exists n and a bijection f : {1. no harm is done. . and is called the power set of A. but the infinite set R is not countably infinite. if f : A → B and g : B → A satisfy g ◦ f = 1A and f ◦ g = 1B . but then / / b ∈ S(b). is denoted by P(A). . . Proof. Definition. However. One of the standard axioms of set theory. A set A is said to be countably infinite if there is a bijection N → A. Every infinite set has a countably infinite subset.” (a subset of A) . 5 We also say “g : C → D is an extension of f : A → B” or “f : A → B is a restriction of g : C → D. B = D. . 5 Definition. n} → A. again a contradiction. / Assuming b ∈ S(b) yields b ∈ S(b). there is no surjective map A → P(A): Cantor’s Theorem. and denote it by |A|. both the inverse image of Y under f and the direct image of Y under f −1 are denoted by f −1 (Y ). For n = 0 this is just ∅. so is P(A) and |P(A)| = 2|A| . Here we use {1. Note that a → {a} : A → P(A) is an injective map. and f (x) = g(x) for all x ∈ A. Example. then f is a bijection with f −1 = g. Note that then f −1 ◦ f = 1A and f ◦ f −1 = 1B . and f (x) = g(x) for all x ∈ A. It is said to be countable if it is either finite or countably infinite. there is a set whose elements are exactly the subsets of A. Let S : A → P(A) be a map. . PRELIMINARIES If f : A → B is a bijection then we have an inverse map f −1 : B → A given by f −1 (b) := the unique a ∈ A such that f (a) = b. Suppose otherwise. Then the set {a ∈ A : a ∈ S(a)} / is not an element of S(A). n} → A). If A is finite there is exactly one such n (although if n > 1 there will be more than one bijection f : {1. .

. i∈I the product of the family. For example. and this set is denoted by AI . . Then there is a set whose elements are exactly the maps f : I → A. an ) with each ai ∈ A. {ai }i∈I or (ai )i∈I denotes a family of objects ai indexed by the set I. . the empty word and written . . i∈I the union of the family. Words Definition. Given a word a = a1 . {(a. ai ) : i ∈ I}. . A word of length n on A is an n-tuple (a1 . . then {(i. and we often write ai instead of a(i). . each Ai is a set) we have a set Ai := {x : x ∈ Ai for some i ∈ I}. that is. b) ∈ Z2 : a < b} is a binary relation on Z. . it may happen that ai = aj for distinct indices i. For example. Let A be a set. . .) For I = N we usually say “sequence” instead of “family”. Think of A as an alphabet of letters. For I = {1. . . b) → a + b on Z.1. then so is the union above and | Ai | ≤ |Ai |. . j ∈ I. because we think of it as a word (string of letters) we shall write this tuple instead as a1 . An n-ary relation on A is just a subset of An . . . but we shall use it a few times. is a bit special. It says that for any family (Ai )i∈I of nonempty sets there is a family (ai )i∈I such that ai ∈ Ai for all i ∈ I. the Axiom of Choice. There is a unique word of length 0 on A. . . So An can be thought of as the set of n-tuples (a1 . . SETS AND MAPS 9 Let I and A be sets. an ) of letters ai ∈ A. . Ai := {(ai )i∈I : ai ∈ Ai for all i ∈ I}. and instead of “2-ary” we can say “binary”. an of length n ≥ 1 on A. n} → A. if I = N and an = a for all n. and an n-ary operation on A is a map from An into A. Thus an element of An is a map a : {1. a) : i ∈ N} is countably infinite. One axiom of set theory. .2. For n = 0 the set An has just one element — the empty tuple. (There may be repetitions in the family. we usually think of such an a as the n-tuple (a(1). . i∈I i∈I i∈I If I is countable and each Ai is countable then Given any family (Ai )i∈I of sets we have a set Ai is countable. i∈I Ai = ∅. a(n)). n} we also write An instead of AI . . ai ) : i ∈ I} = {(i. an (without parentheses or commas). and integer addition is the binary operation (a. but {ai : i ∈ I} = {a} has just one element. Given any family (Ai )i∈I of sets (that is. . not to be confused with the set {ai : i ∈ I}. Instead of “1-ary” we usually say “unary”. . . Definition. but such repetition is not reflected in the set {ai : i ∈ I}. If I is finite and each Ai is finite. that is. and is just a suggestive notation for a set {(i.

and if ac = bc. Logical expressions like formulas and terms will be introduced later as words of a special form on suitable alphabets. . . b) ∈ R. b ∈ A we have a∼ = b∼ if and only if a ∼ b. c ∈ A: (i) a ∼ a (reflexivity). then a = b. The quotient set of A by ∼ is by definition the set of equivalence classes: A/ ∼ = {a∼ : a ∈ A}. with as two-sided identity: a = a = a for all a ∈ A∗ . The equivalence class a∼ of an element a ∈ A is defined by a∼ = {b ∈ A : a ∼ b} (a subset of A). The set of all words on A is denoted A∗ : A∗ = n An (disjoint union). am b1 .. bn .) Every partition of A is the quotient set A/ ∼ for a unique equivalence relation ∼ on A. b. Given any n we have the equivalence relation “congruence modulo n” on Z defined as follows: for any a. b ∈ Z we have a≡b mod n ⇐⇒ a − b = nc for some c ∈ Z.”. and the last letter (or last symbol) of a is an . bn on A of length m and n respectively. if ab = ac. Thus ab is a word on A of length m + n. Definition. . we use it here because we don’t like to say “set of . we define their concatenation ab ∈ A∗ : ab = a1 . subsets . and with two-sided cancellation: for all a. c ∈ A∗ .. . Thus . (ii) a ∼ b implies b ∼ a (symmetry). it is a collection of pairwise disjoint nonempty subsets of A whose union is A. Example. c ∈ A∗ . . Let ∼ be an equivalence relation on the set A. . Concatenation is a binary operation on A∗ that is associative: (ab)c = a(bc) for all a. and this will be done whenever convenient. PRELIMINARIES the first letter (or first symbol) of a is by definition a1 . For n = 0 this is just equality on Z. b. Equivalence Relations and Quotient Sets Given a binary relation R on a set A it is often more suggestive to write aRb instead of (a. (Collection is a synonym for set. An equivalence relation on a set A is a binary relation ∼ on A such that for all a. b. Given words a = a1 . .10 CHAPTER 1. Definition. am and b = b1 . This quotient set is a partition of A. . that is.. When A ⊆ B we can identify A∗ with a subset of B ∗ . (iii) (a ∼ b and b ∼ c) implies a ∼ c (transitivity). For a. and a∼ ∩ b∼ = ∅ if and only if a b.. then b = c.

. ≤) is a linearly ordered set and |P | = n. Y of A) defines a poset (P(A). ≤) be a poset. (iii) if p ≤ q and q ≤ r. x < y : ⇐⇒ y > x :⇐⇒ x ≤ y and x = y.1. ≥) is also a poset. This is not a linearly ordered set if A has more than one element. R comes with its familiar linear order on it. Finite linearly ordered sets are determined “up to unique isomorphism” by their size: if (P. ≤) consisting of a set P and a partial ordering ≤ on P . that is. . . . n} such that for all p. or that (P. then p = q (antisymmetry). Note that (P. ≤) is a linearly ordered set. If in addition we have for all p. r ∈ P : (i) p ≤ p (reflexivity). Posets A partially ordered set (short: poset) is a pair (P. Q. This map ι is a bijection. (ii) if p ≤ q and q ≤ p. q ∈ P . P can have at most one least element. Remark. y ∈ P we set x ≥ y : ⇐⇒ y ≤ x. In the previous example (congruence modulo n) the equivalence classes are called congruence classes modulo n (or residue classes modulo n) and the corresponding quotient set is often denoted Z/nZ. SETS AND MAPS 11 equivalence relations on A and partitions of A are just different ways to describe the same situation.6 Each of the sets N. A least element of P is a p ∈ P such that p ≤ x for all x ∈ P . Readers familiar with some abstract algebra will note that the construction in the example above is a special case of a more general construction— that of a quotient of a group with respect to a normal subgroup.2. . q ∈ P we have: p ≤ q ⇐⇒ ι(p) ≤ ι(q). then we say that ≤ is a linear order on P . For x. ≤). As an example. Here is some useful notation. a largest element of P is defined likewise. (iv) p ≤ q or q ≤ p. Of course. with ≥ instead of ≤. also referred to as the power set of A ordered by inclusion. Let (P. ≤ is a binary relation on P such that for all p. Then X ≤ Y :⇐⇒ X ⊆ Y (for subsets X. Z. therefore we can refer to the 6 One also uses the term total order instead of linear order. then p ≤ r (transitivity). take any set A and its collection P(A) of subsets. q. then there is a unique map ι : P → {1.

then P has a maximal element. then the minimal elements of X are the singletons {a} with a ∈ A. ≤X ) where the partial ordering ≤X on X is defined by x ≤X y ⇐⇒ x ≤ y (x. we can refer to the largest element of P . y ∈ X). such that l ≤ x for all x ∈ X (respectively. likewise. q ∈ P. . and maximal elements of a set X ⊆ P . We call X a chain in P if (X. For a further discussion of Zorn’s Lemma and its proof using the Axiom of Choice we refer the reader to Halmos’s book on set theory. if P has a least element. For example. with > instead of <. ≤). then this element is also the unique minimal element of P . ≤X ) is linearly ordered. have more than one minimal element. Thus we can speak of least. PRELIMINARIES least element of P . A lowerbound (respectively. when the ambient poset (P. If P has a least element. A minimal element of P is a p ∈ P such that there is no x ∈ P with x < p. by restricting the given partial ordering of P to X. if P has a largest element. x ≤ u for all x ∈ X). More precisely this means that we consider the poset (X. Then P has a maximal element. and there is a linear order ≤ on P that extends ≤ in the sense that p ≤ q =⇒ p ≤ q.12 CHAPTER 1. minimal. for all p. We often tacitly consider X as a poset in its own right. some posets. (Hint: use induction on |P |. a maximal element of P is defined likewise. Suppose P is nonempty and every nonempty chain in P has an upperbound in P . Zorn’s Lemma. when X is the collection of nonempty subsets of a set A and X is ordered by inclusion. upperbound ) of X in P is an element l ∈ P (respectively. The reader might want to prove the following result to get a feeling for these notions: If P is finite and nonempty. ≤) is clear from the context.) Let X ⊆ P . however. an element u ∈ P ). Occasionally we shall use the following fact about posets (P. largest.

Chapter 2

Basic Concepts of Logic

2.1 Propositional Logic

Propositional logic is the fragment of logic where we construct new statements from given statements using so-called connectives like "not", "or" and "and". The truth value of such a new statement is then completely determined by the truth values of the given statements. Thus, given any statements p and q, assumed to be either true or false, we can form the three statements

¬p (the negation of p, pronounced as "not p"),
p ∨ q (the disjunction of p and q, pronounced as "p or q"),
p ∧ q (the conjunction of p and q, pronounced as "p and q").

We shall regard ¬p as true if and only if p is not true. Instead of "not true" we also say "false". Also, p ∨ q is defined to be true if and only if p is true or q is true (including the possibility that both are true), and p ∧ q is deemed to be true if and only if p is true and q is true. This leads to more complicated combinations like ¬(p ∧ (¬q)). We now introduce a formalism that makes this into mathematics.

We start with the five distinct symbols

⊤  ⊥  ¬  ∨  ∧

to be thought of as true, false, not, or, and and, respectively; they are called propositional connectives. These symbols are fixed throughout the course. In this section we also fix a set A whose elements will be called propositional atoms (or just atoms), such that no propositional connective is an atom. It may help the reader to think of an atom a as a variable for which we can substitute arbitrary statements.

A proposition on A is a word on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} that can be obtained by applying the following rules:

(i) each atom a ∈ A (viewed as a word of length 1) is a proposition on A;
(ii) ⊤ and ⊥ (viewed as words of length 1) are propositions on A;
(iii) if p and q are propositions on A, then the concatenations ¬p, ∨pq and ∧pq are propositions on A.

We let Prop(A) denote the set of propositions on A. For the rest of this section "proposition" means "proposition on A", and p, q, r (sometimes with subscripts) denote propositions.

Example. Suppose a, b, c are atoms. Then ∧∨¬ab¬c is a proposition. This follows from the rules above: a is a proposition, so ¬a is a proposition, hence ∨¬ab as well; also ¬c is a proposition, and thus ∧∨¬ab¬c is a proposition.

Having the connectives ∨ and ∧ in front of the propositions they "connect" rather than in between is called prefix notation or Polish notation. This is theoretically elegant, but for the sake of readability we usually write p ∨ q and p ∧ q to denote ∨pq and ∧pq respectively, and we also use parentheses and brackets if this helps to clarify the structure of a proposition. So the proposition in the example above could be denoted by [(¬a) ∨ b] ∧ (¬c), or even by (¬a ∨ b) ∧ ¬c, since we shall agree that ¬ binds stronger than ∨ and ∧ in this informal way of indicating propositions. Because of the informal nature of these conventions, we don't have to give precise rules for their use; it's enough that each actual use is clear to the reader.

The intended structure of a proposition, how we think of it as built up from atoms via connectives, is best exhibited in the form of a tree, a two-dimensional array, rather than as a one-dimensional string. Such trees, however, occupy valuable space on the printed page and are typographically demanding. Fortunately, our "official" prefix notation does uniquely determine the intended structure of a proposition: that is what the next lemma amounts to.

We defined "proposition" using the suggestive but vague phrase "can be obtained by applying the following rules". The reader should take such an informal description as shorthand for a completely explicit definition, which in the case at hand is as follows: a proposition on A is a word w on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} for which there is a sequence w1, . . . , wn of words on that same alphabet, with n ≥ 1, such that w = wn and for each k ∈ {1, . . . , n}, either wk ∈ A ∪ {⊤, ⊥} (where each element in the last set is viewed as a word of length 1), or there are i, j ∈ {1, . . . , k − 1} such that wk is one of the concatenations ¬wi, ∨wi wj, ∧wi wj.

Lemma 2.1.1 (Unique Readability). If p has length 1, then either p is an atom, or p = ⊤, or p = ⊥. If p has length > 1, then its first symbol is either ¬, or ∨, or ∧. If the first symbol of p is ¬, then p = ¬q for a unique q. If the first symbol of p is ∨, then p = ∨qr for a unique pair (q, r). If the first symbol of p is ∧, then p = ∧qr for a unique pair (q, r). (Note that we used here our convention that p, q, r denote propositions.)

Only the last two claims are worth proving in print; the others should require only a moment's thought. For now we shall assume this lemma without proof. At the end of this section we establish more general results which are needed also later in the course.

Remark. Rather than thinking of a proposition as a statement, it's better viewed as a function whose arguments and values are statements: replacing the atoms in a proposition by specific mathematical statements like "2 × 2 = 4", "π² < 7", and "every even integer > 2 is the sum of two prime numbers", we obtain again a mathematical statement.

We shall use the following notational conventions: p → q denotes ¬p ∨ q, and p ↔ q denotes (p → q) ∧ (q → p).

Definition. A truth assignment is a map t : A → {0, 1}. We extend such a t to t̂ : Prop(A) → {0, 1} by requiring

(i) t̂(⊤) = 1
(ii) t̂(⊥) = 0
(iii) t̂(¬p) = 1 − t̂(p)
(iv) t̂(p ∨ q) = max(t̂(p), t̂(q))
(v) t̂(p ∧ q) = min(t̂(p), t̂(q))

Note that there is exactly one such extension t̂ by unique readability. To simplify notation we often write t instead of t̂. Note that t(p → q) = 1 if and only if t(p) ≤ t(q), and that t(p ↔ q) = 1 if and only if t(p) = t(q).

The array below is called a truth table. It shows on each row below the top row how the two leftmost entries t(p) and t(q) determine t(¬p), t(p ∨ q), t(p ∧ q), t(p → q) and t(p ↔ q).

p  q  |  ¬p  p∨q  p∧q  p→q  p↔q
0  0  |   1    0    0    1    1
0  1  |   1    1    0    1    0
1  0  |   0    1    0    0    0
1  1  |   0    1    1    1    1

Suppose a1, . . . , an are the distinct atoms that occur in p, and we know how p is built up from those atoms. Then we can compute in a finite number of steps t(p) from t(a1), . . . , t(an). In particular, t(p) = t′(p) for any t′ : A → {0, 1} such that t(ai) = t′(ai) for i = 1, . . . , n.

Definition. By recursion on n we define

p1 ∨ · · · ∨ pn :=  ⊥ if n = 0;  p1 if n = 1;  p1 ∨ p2 if n = 2;  (p1 ∨ · · · ∨ pn−1) ∨ pn if n > 2.

Thus p ∨ q ∨ r stands for (p ∨ q) ∨ r. We call p1 ∨ · · · ∨ pn the disjunction of p1, . . . , pn. The reason that for n = 0 we take this disjunction to be ⊥ is that we want a disjunction to be true if and only if (at least) one of the disjuncts is true. Similarly, the conjunction p1 ∧ · · · ∧ pn of p1, . . . , pn is defined by replacing everywhere ∨ by ∧ and ⊥ by ⊤ in the definition of p1 ∨ · · · ∨ pn.

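The extension t̂ of a truth assignment is completely algorithmic, and so is the brute-force test for tautologies mentioned below. The following Python sketch (not part of the original notes) illustrates both; the encoding of propositions as nested tuples instead of Polish-notation words, and the names evaluate, atoms_of, is_tautology, are illustrative choices, not notation from the notes.

```python
# Illustrative sketch: propositions are nested tuples, e.g.
# ("and", ("or", ("not", "a"), "b"), ("not", "c")) for (¬a ∨ b) ∧ ¬c.
# Atoms are strings; "T" / "F" stand for ⊤ / ⊥.
from itertools import product

def evaluate(p, t):
    """Compute t̂(p) in {0, 1} for a truth assignment t (a dict atom -> 0/1),
    following clauses (i)-(v) of the definition of t̂."""
    if p == "T":
        return 1
    if p == "F":
        return 0
    if isinstance(p, str):          # an atom
        return t[p]
    op = p[0]
    if op == "not":
        return 1 - evaluate(p[1], t)
    if op == "or":
        return max(evaluate(p[1], t), evaluate(p[2], t))
    if op == "and":
        return min(evaluate(p[1], t), evaluate(p[2], t))
    raise ValueError(f"not a proposition: {p!r}")

def atoms_of(p):
    if p in ("T", "F"):
        return set()
    if isinstance(p, str):
        return {p}
    return set().union(*(atoms_of(q) for q in p[1:]))

def is_tautology(p):
    """Check |= p by running through all 2^n truth assignments on the atoms of p."""
    atoms = sorted(atoms_of(p))
    return all(evaluate(p, dict(zip(atoms, bits))) == 1
               for bits in product((0, 1), repeat=len(atoms)))

# p ∨ ¬p and p → (p ∨ q), written out as ¬p ∨ (p ∨ q), should both come out as tautologies.
assert is_tautology(("or", "p", ("not", "p")))
assert is_tautology(("or", ("not", "p"), ("or", "p", "q")))
```

As the notes point out, this exhaustive method is exponential in the number of atoms, so it is only practical for small propositions; it does, however, make the definitions above completely concrete.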
Definition. We say that p is a tautology (notation: |= p) if t(p) = 1 for all t : A → {0, 1}. We say that p is satisfiable if t(p) = 1 for some t : A → {0, 1}. Thus ⊤ is a tautology, and p ∨ ¬p, p → (p ∨ q) are tautologies for all p and q.

By the remark preceding the definition one can verify whether any given p with exactly n distinct atoms in it is a tautology by computing 2^n numbers and checking that these numbers all come out 1. (To do this accurately by hand is already cumbersome for n = 5, but computers can handle somewhat larger n. Fortunately, other methods are often efficient for special cases.)

Next we define "model of Σ" and "tautological consequence of Σ".

Definition. Let Σ ⊆ Prop(A). By a model of Σ we mean a truth assignment t : A → {0, 1} such that t(p) = 1 for all p ∈ Σ. We say that a proposition p is a tautological consequence of Σ (written Σ |= p) if t(p) = 1 for every model t of Σ. Note that |= p is the same as ∅ |= p.

Definition. We call p equivalent to q if |= p ↔ q. Note that |= p ↔ q iff t(p) = t(q) for all t : A → {0, 1}.

Note that "equivalent to" defines an equivalence relation on Prop(A). The lemma below gives a useful list of equivalences. We leave it to the reader to verify them.

Lemma 2.1.2. For all p, q, r we have the following equivalences:
(1) |= (p ∨ p) ↔ p, |= (p ∧ p) ↔ p
(2) |= (p ∨ q) ↔ (q ∨ p), |= (p ∧ q) ↔ (q ∧ p)
(3) |= (p ∨ (q ∨ r)) ↔ ((p ∨ q) ∨ r), |= (p ∧ (q ∧ r)) ↔ ((p ∧ q) ∧ r)
(4) |= (p ∨ (q ∧ r)) ↔ (p ∨ q) ∧ (p ∨ r), |= (p ∧ (q ∨ r)) ↔ (p ∧ q) ∨ (p ∧ r)
(5) |= (p ∨ (p ∧ q)) ↔ p, |= (p ∧ (p ∨ q)) ↔ p
(6) |= (¬(p ∨ q)) ↔ (¬p ∧ ¬q), |= (¬(p ∧ q)) ↔ (¬p ∨ ¬q)
(7) |= (p ∨ ¬p) ↔ ⊤, |= (p ∧ ¬p) ↔ ⊥
(8) |= ¬¬p ↔ p

Items (1), (2), (3), (4), (5), and (6) are often referred to as the idempotent law, commutativity, associativity, distributivity, the absorption law, and the De Morgan law, respectively. Note the left-right symmetry in (1)–(7): the so-called duality of propositional logic. We shall return to this issue in the more algebraic setting of boolean algebras.

Some notation: let (pi)i∈I be a family of propositions with finite index set I. Choose a bijection k → i(k) : {1, . . . , n} → I and set

∨i∈I pi := pi(1) ∨ · · · ∨ pi(n),    ∧i∈I pi := pi(1) ∧ · · · ∧ pi(n).

Of course, the notations ∨i∈I pi and ∧i∈I pi can only be used when the particular choice of bijection of {1, . . . , n} with I does not matter. Fortunately, this is usually the case, because the equivalence class of pi(1) ∨ · · · ∨ pi(n) does not depend on this choice, and the same is true for the equivalence class of pi(1) ∧ · · · ∧ pi(n). If I is clear from context we just write ∨i pi and ∧i pi instead.

Lemma 2.1.3. Let Σ ⊆ Prop(A) and p, q ∈ Prop(A). Then
(1) Σ |= p ∧ q ⇐⇒ Σ |= p and Σ |= q;
(2) Σ |= p =⇒ Σ |= p ∨ q;
(3) Σ ∪ {p} |= q ⇐⇒ Σ |= p → q;
(4) if Σ |= p and Σ |= p → q, then Σ |= q. (Modus Ponens.)

Proof. We will prove (3) here and leave the rest as exercise.
(⇒) Assume Σ ∪ {p} |= q. To derive Σ |= p → q we consider any model t : A → {0, 1} of Σ, and need only show that then t(p → q) = 1. If t(p) = 0 then t(p → q) = 1 by definition. If t(p) = 1 then t(Σ ∪ {p}) ⊆ {1}, hence t(q) = 1 and thus t(p → q) = 1.
(⇐) Assume Σ |= p → q. To derive Σ ∪ {p} |= q we consider any model t : A → {0, 1} of Σ ∪ {p}, and need only derive that t(q) = 1. By assumption t(p → q) = 1, and in view of t(p) = 1, this gives t(q) = 1 as required.

We finish this section with the promised general result on unique readability. We also establish facts of similar nature that are needed later.

Definition. Let F be a set of symbols with a function a : F → N (called the arity function). A symbol f ∈ F is said to have arity n if a(f) = n. A word on F is said to be admissible if it can be obtained by applying the following rules:
(i) If f ∈ F has arity 0, then f viewed as a word of length 1 is admissible.
(ii) If f ∈ F has arity m > 0 and t1, . . . , tm are admissible words on F, then the concatenation f t1 . . . tm is admissible.

Below we just write "admissible word" instead of "admissible word on F". Note that the empty word is not admissible, and that the last symbol of an admissible word cannot be of arity > 0.

Example. Take F = A ∪ {⊤, ⊥, ¬, ∨, ∧} and define arity : F → N by arity(x) = 0 for x ∈ A ∪ {⊤, ⊥}, arity(¬) = 1, arity(∨) = arity(∧) = 2. Then the set of admissible words is just Prop(A).

Lemma 2.1.4. Let t1, . . . , tm and u1, . . . , un be admissible words and w any word on F such that t1 . . . tm w = u1 . . . un. Then m ≤ n, ti = ui for i = 1, . . . , m, and w = um+1 · · · un.

Proof. By induction on the length of u1 . . . un. If this length is 0, then m = n = 0 and w is the empty word. Suppose the length is > 0, and assume the lemma holds for smaller lengths. Note that n > 0. If m = 0, then the conclusion of the lemma holds, so suppose m > 0. The first symbol of t1 equals the first symbol of u1. Say this first symbol is h ∈ F with arity k. Then t1 = ha1 . . . ak and u1 = hb1 . . . bk where a1, . . . , ak and b1, . . . , bk are admissible words. (Caution: any of k, m − 1, n − 1 could be 0.) Cancelling the first symbol h gives

a1 . . . ak t2 . . . tm w = b1 . . . bk u2 . . . un.

We have length(b1 . . . bk u2 . . . un) = length(u1 . . . un) − 1, so the induction hypothesis applies. It yields k + m − 1 ≤ k + n − 1 (so m ≤ n), a1 = b1, . . . , ak = bk (so t1 = u1), t2 = u2, . . . , tm = um, and w = um+1 · · · un.

Here are two immediate consequences that we shall use:

1. Let t1, . . . , tm and u1, . . . , un be admissible words such that t1 . . . tm = u1 . . . un. Then m = n and ti = ui for i = 1, . . . , m.
2. Let t and u be admissible words and w a word on F such that tw = u. Then t = u and w is the empty word.

Lemma 2.1.5 (Unique Readability). Each admissible word equals f t1 . . . tm for a unique tuple (f, t1, . . . , tm) where f ∈ F has arity m and t1, . . . , tm are admissible words.

Proof. Suppose f t1 . . . tm = g u1 . . . un where f, g ∈ F have arity m and n respectively, and t1, . . . , tm and u1, . . . , un are admissible words. We have to show that then f = g, m = n and ti = ui for i = 1, . . . , m. Observe first that f = g since f and g are the first symbols of two equal words. After cancelling the first symbol of both words, the first consequence of the previous lemma leads to the desired conclusion.

Definition. Given words v, w ∈ F* and i ∈ {1, . . . , length(w)}, we say that v occurs in w at starting position i if w = w1 v w2 where w1, w2 ∈ F* and w1 has length i − 1. (For example, if f, g ∈ F are distinct, then the word fgf has exactly two occurrences in the word fgfgf, one at starting position 1, and the other at starting position 3; these two occurrences overlap, but such overlapping is impossible with admissible words, see exercise 5 at the end of this section.) Given w = w1 v w2 as above, and given v′ ∈ F*, the result of replacing v in w at starting position i by v′ is by definition the word w1 v′ w2.

Lemma 2.1.6. Let w be an admissible word and 1 ≤ i ≤ length(w). Then there is a unique admissible word that occurs in w at starting position i.

Proof. We prove existence by induction on length(w). Clearly w is an admissible word occurring in w at starting position 1. Suppose i > 1. Then we write w = f t1 . . . tn where f ∈ F has arity n > 0 and t1, . . . , tn are admissible words, and we take j ∈ {1, . . . , n} such that 1 + length(t1) + · · · + length(tj−1) < i ≤ 1 + length(t1) + · · · + length(tj). Now apply the inductive assumption to tj. Uniqueness then follows from the fact stated just before Lemma 2.1.5.

Remark. Let w = f t1 . . . tn where f ∈ F has arity n > 0 and t1, . . . , tn are admissible words. Put lj := 1 + length(t1) + · · · + length(tj) for j = 0, . . . , n (so l0 = 1). Let 1 ≤ i ≤ length(w) with i > 1, and let v be the admissible word that occurs in w at starting position i. Suppose lj−1 < i ≤ lj, 1 ≤ j ≤ n. Then the proof of the last lemma shows that this occurrence is entirely inside tj, that is, i − 1 + length(v) ≤ lj.

Corollary 2.1.7. Let w be an admissible word and 1 ≤ i ≤ length(w), and let v be the admissible word that occurs in w at starting position i. Then the result of replacing v in w at starting position i by an admissible word v′ is again an admissible word. This follows by a routine induction on length(w), using the last remark.

Exercises. In the exercises below, A = {a1, . . . , an}, |A| = n.
(1) (Disjunctive Normal Form) Each p is equivalent to a disjunction p1 ∨ · · · ∨ pk where each disjunct pi is a conjunction a1^e1 ∧ · · · ∧ an^en with all ej ∈ {−1, 1}, where for an atom a we put a^1 := a and a^−1 := ¬a.
(2) (Conjunctive Normal Form) Same as the last problem, except that the signs ∨ and ∧ are interchanged, as well as the words "disjunction" and "conjunction," and also the words "disjunct" and "conjunct."
(3) To each p associate the function fp : {0, 1}^A → {0, 1} defined by fp(t) = t(p). (Think of a truth table for p where the 2^n rows correspond to the 2^n truth assignments t : A → {0, 1}, and the column under p records the values t(p).) Then for every function f : {0, 1}^A → {0, 1} there is a p such that f = fp.
(4) Let ∼ be the equivalence relation on Prop(A) given by p ∼ q :⇐⇒ |= p ↔ q. Then the quotient set Prop(A)/∼ is finite; determine its cardinality as a function of n = |A|.
(5) Let w be an admissible word and 1 ≤ i < i′ ≤ length(w). Let v and v′ be the admissible words that occur at starting positions i and i′ respectively in w. Then these occurrences are either nonoverlapping, that is, i − 1 + length(v) < i′, or the occurrence of v′ is entirely inside that of v, that is, i′ − 1 + length(v′) ≤ i − 1 + length(v).

2.2 Completeness for Propositional Logic

In this section we introduce a proof system for propositional logic, state the completeness of this proof system, and then prove this completeness. As in the previous section we fix a set A of atoms, and the conventions of that section remain in force.

A propositional axiom is by definition a proposition that occurs in the list below, for some choice of p, q, r:

1. ⊤
2. p → (p ∨ q); p → (q ∨ p)
3. ¬p → ¬q → ¬(p ∨ q)
4. (p ∧ q) → p; (p ∧ q) → q
5. p → q → (p ∧ q)
6. p → (q → r) → (p → q) → (p → r)
7. p → (¬p → ⊥)
8. (¬p → ⊥) → p

Each of items 2–8 describes infinitely many propositional axioms. That is why we do not call these items axioms, but axiom schemes. For example, if a, b ∈ A, then a → (a ∨ ⊥) and b → (b ∨ (¬a ∧ ¬b)) are distinct propositional axioms, and both instances of axiom scheme 2.

It is easy to check that all propositional axioms are tautologies. Here is our single rule of inference for propositional logic:

Modus Ponens (MP): from p and p → q, infer q.

In the rest of this section Σ denotes a set of propositions, that is, Σ ⊆ Prop(A).

Definition. A formal proof, or just proof, of p from Σ is a sequence p1, . . . , pn with n ≥ 1 and pn = p, such that for k = 1, . . . , n:
(i) either pk ∈ Σ,
(ii) or pk is a propositional axiom,
(iii) or there are i, j ∈ {1, . . . , k − 1} such that pk can be inferred from pi and pj by MP.

If there exists a proof of p from Σ, then we write Σ ⊢ p, and say Σ proves p. For Σ = ∅ we also write ⊢ p instead of Σ ⊢ p.

Lemma 2.2.1. ⊢ p → p.

Proof. The proposition p → (p → p) → p is a propositional axiom by axiom scheme 2. Also, {p → (p → p) → p} → {(p → (p → p)) → (p → p)} is a propositional axiom, by axiom scheme 6. Applying MP to these two axioms yields (p → (p → p)) → (p → p). Since p → (p → p) is also a propositional axiom by scheme 2, we can apply MP again to obtain ⊢ p → p.

Proposition 2.2.2. If Σ ⊢ p, then Σ |= p.

Proof. This should be clear from earlier facts that we stated and which the reader was asked to verify.

The converse is true but less obvious:

Theorem 2.2.3 (Completeness - first form). Σ ⊢ p ⇐⇒ Σ |= p.

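Unique readability (Lemmas 2.1.4–2.1.6) is exactly what makes prefix notation mechanically parseable: starting at any position of an admissible word there is exactly one way to read off an admissible word. The following Python sketch (not part of the original notes) makes this concrete; the arity table ARITY and the function names read_admissible, parse are illustrative choices.

```python
# Illustrative sketch: reading an admissible word in prefix (Polish) notation.
# The arity function is a dict; atoms and T, F (for ⊤, ⊥) have arity 0.
ARITY = {"T": 0, "F": 0, "a": 0, "b": 0, "c": 0, "~": 1, "|": 2, "&": 2}

def read_admissible(w, i=0):
    """Return (tree, j): the unique admissible word starting at position i of w,
    as a nested tuple, together with the position j just after it.
    Raises ValueError if no admissible word starts there."""
    if i >= len(w) or w[i] not in ARITY:
        raise ValueError(f"no admissible word starts at position {i}")
    f = w[i]
    args, j = [], i + 1
    for _ in range(ARITY[f]):            # read exactly arity(f) admissible arguments
        sub, j = read_admissible(w, j)
        args.append(sub)
    tree = (f, *args) if args else f
    return tree, j

def parse(w):
    """Parse a whole word; it is admissible iff an admissible word starts at
    position 0 and uses up all of w."""
    tree, j = read_admissible(w, 0)
    if j != len(w):
        raise ValueError("trailing symbols: not an admissible word")
    return tree

# The proposition ∧∨¬ab¬c from the example, written with ~, |, & for ¬, ∨, ∧:
print(parse("&|~ab~c"))   # ('&', ('|', ('~', 'a'), 'b'), ('~', 'c'))
```

The returned tree is the unique decomposition f t1 . . . tm guaranteed by Lemma 2.1.5, so no parentheses are ever needed in the official notation.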
Remark. There is some arbitrariness in our choice of axioms and rule, and thus in our notion of formal proof. This is in contrast to the definition of |=, which merely formalizes the basic underlying idea of propositional logic as stated in the introduction to the previous section. However, the equivalence of ⊢ and |= (Completeness Theorem) shows that our choice of axioms and rule yields a complete proof system. Moreover, this equivalence has consequences which can be stated in terms of |= alone. An example is the Compactness Theorem.

Theorem 2.2.4 (Compactness of Propositional Logic). Σ has a model ⇐⇒ every finite subset of Σ has a model.

Definition. We say that Σ is inconsistent if Σ ⊢ ⊥, and otherwise (that is, if Σ ⊬ ⊥) we call Σ consistent.

It is convenient to prove first a variant of the Completeness Theorem.

Theorem 2.2.5 (Completeness - second form). Σ is consistent if and only if Σ has a model.

From this second form of the Completeness Theorem we obtain easily an alternative form of the Compactness of Propositional Logic:

Corollary 2.2.6. If Σ |= p then there is a finite subset Σ0 of Σ such that Σ0 |= p.

For this we need a technical lemma that will also be useful later in the course.

Lemma 2.2.7 (Deduction Lemma). Suppose Σ ∪ {p} ⊢ q. Then Σ ⊢ p → q.

Proof. By induction on proofs. If q is a propositional axiom, then, since q → (p → q) is a propositional axiom, MP yields Σ ⊢ p → q. If q ∈ Σ ∪ {p}, then either q ∈ Σ, in which case the same argument as before gives Σ ⊢ p → q, or q = p, and then Σ ⊢ p → q since ⊢ p → p by the lemma above. Now assume that q is obtained by MP from r and r → q, where Σ ∪ {p} ⊢ r and Σ ∪ {p} ⊢ r → q, and where we assume inductively that Σ ⊢ p → r and Σ ⊢ p → (r → q). Then we obtain Σ ⊢ p → q from the propositional axiom p → (r → q) → (p → r) → (p → q) by applying MP twice.

Corollary 2.2.8. Σ ⊢ p if and only if Σ ∪ {¬p} is inconsistent.

Proof. (⇒) Assume Σ ⊢ p. Since p → (¬p → ⊥) is a propositional axiom, we can apply MP twice to get Σ ∪ {¬p} ⊢ ⊥. Hence Σ ∪ {¬p} is inconsistent.
(⇐) Assume Σ ∪ {¬p} is inconsistent. Then Σ ∪ {¬p} ⊢ ⊥, and so by the Deduction Lemma we have Σ ⊢ ¬p → ⊥. Since (¬p → ⊥) → p is a propositional axiom, MP yields Σ ⊢ p.

Corollary 2.2.9. The second form of Completeness (Theorem 2.2.5) implies the first form (Theorem 2.2.3).

Proof. Assume the second form of Completeness holds, and that Σ |= p. We want to show that then Σ ⊢ p. From Σ |= p it follows that Σ ∪ {¬p} has no model. Hence by the second form of Completeness, the set Σ ∪ {¬p} is inconsistent. Then by Corollary 2.2.8 we have Σ ⊢ p.

Definition. We say that Σ is complete if Σ is consistent, and for each p either Σ ⊢ p or Σ ⊢ ¬p.

Completeness as a property of a set of propositions should not be confused with the completeness of our proof system as expressed by the Completeness Theorem. (It is just a historical accident that we use the same word.) Below we use Zorn's Lemma to show that any consistent set of propositions can be extended to a complete set of propositions.

Lemma 2.2.10 (Lindenbaum). Suppose Σ is consistent. Then Σ ⊆ Σ′ for some complete Σ′ ⊆ Prop(A).

Proof. Let P be the collection of all consistent subsets of Prop(A) that contain Σ. In particular Σ ∈ P. We consider P as partially ordered by inclusion. Any totally ordered subcollection {Σi : i ∈ I} of P with I ≠ ∅ has an upper bound in P, namely the union of the Σi. (To see this it suffices to check that this union is consistent. Suppose otherwise, that is, suppose the union proves ⊥. Since a proof can use only finitely many of the axioms in the union, there exists i ∈ I such that Σi ⊢ ⊥, contradicting the consistency of Σi.) Thus by Zorn's lemma P has a maximal element Σ′. We claim that then Σ′ is complete. For any p, if Σ′ ⊬ p, then by Corollary 2.2.8 the set Σ′ ∪ {¬p} is consistent, hence ¬p ∈ Σ′ by maximality of Σ′, and thus Σ′ ⊢ ¬p.

Remark. Suppose A is countable. Because A is countable, Prop(A) is countable. For this case we can give a proof of Lindenbaum's Lemma without using Zorn's Lemma as follows. Take an enumeration (pn)n∈N of Prop(A). We construct an increasing sequence Σ = Σ0 ⊆ Σ1 ⊆ . . . of consistent subsets of Prop(A) as follows. Given a consistent Σn ⊆ Prop(A) we define

Σn+1 = Σn ∪ {pn} if Σn ⊢ pn,    Σn+1 = Σn ∪ {¬pn} if Σn ⊬ pn,

so Σn+1 remains consistent by earlier results. Thus Σ∞ := the union of the Σn is consistent and also complete: for any n either pn ∈ Σn+1 ⊆ Σ∞ or ¬pn ∈ Σn+1 ⊆ Σ∞.

Lemma 2.2.11. Suppose Σ is complete. Define the truth assignment tΣ : A → {0, 1} by tΣ(a) = 1 if Σ ⊢ a, and tΣ(a) = 0 otherwise. Then for each p we have

Σ ⊢ p ⇐⇒ tΣ(p) = 1.

In particular, tΣ is a model of Σ.

Proof. We proceed by induction on the number of connectives in p. If p is an atom or p = ⊤ or p = ⊥, then the equivalence follows immediately from the definitions. It remains to consider the three cases below.

Case 1: p = ¬q, and (inductive assumption) Σ ⊢ q ⇐⇒ tΣ(q) = 1.
(⇒) Suppose Σ ⊢ p. Then tΣ(p) = 1: otherwise tΣ(p) = 0, so tΣ(q) = 1, hence Σ ⊢ q; since q → (p → ⊥) is a propositional axiom, we can apply MP twice to get Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose tΣ(p) = 1. Then tΣ(q) = 0, so Σ ⊬ q, and thus Σ ⊢ ¬q by completeness of Σ, that is, Σ ⊢ p.

Case 2: p = q ∨ r, Σ ⊢ q ⇐⇒ tΣ(q) = 1, and Σ ⊢ r ⇐⇒ tΣ(r) = 1.
(⇒) Suppose Σ ⊢ p. Then tΣ(p) = 1: otherwise tΣ(q) = 0 and tΣ(r) = 0, so Σ ⊬ q and Σ ⊬ r, and thus Σ ⊢ ¬q and Σ ⊢ ¬r by completeness of Σ; since ¬q → (¬r → ¬p) is a propositional axiom, we can apply MP twice to get Σ ⊢ ¬p, which in view of the propositional axiom p → (¬p → ⊥) and MP yields Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose tΣ(p) = 1. Then tΣ(q) = 1 or tΣ(r) = 1, hence Σ ⊢ q or Σ ⊢ r. Using MP and the propositional axioms q → p and r → p we obtain Σ ⊢ p.

Case 3: p = q ∧ r, Σ ⊢ q ⇐⇒ tΣ(q) = 1, and Σ ⊢ r ⇐⇒ tΣ(r) = 1. We leave this case as an exercise.

We can now finish the proof of Completeness (second form): Suppose Σ is consistent. Then by Lindenbaum's Lemma Σ is a subset of a complete set Σ′ of propositions. By the previous lemma, tΣ′ is a model of Σ′, and such a model is also a model of Σ. The converse, that if Σ has a model then Σ is consistent, is left to the reader.

Application to coloring infinite graphs. What follows is a standard use of compactness of propositional logic, one of many. Let (V, E) be a graph, by which we mean here that V is a set (of vertices) and E (the set of edges) is a binary relation on V that is irreflexive and symmetric, that is, for all v, w ∈ V we have (v, v) ∉ E, and if (v, w) ∈ E, then (w, v) ∈ E. Let some n ≥ 1 be given. Then an n-coloring of (V, E) is a function c : V → {1, . . . , n} such that c(v) ≠ c(w) for all (v, w) ∈ E: neighboring vertices should have different colors.

Suppose for every finite V0 ⊆ V there is an n-coloring of (V0, E0), where E0 := E ∩ (V0 × V0). We claim that there exists an n-coloring of (V, E).

Take A := V × {1, . . . , n} as the set of atoms, and think of an atom (v, i) as representing the statement that v has color i. Thus for (V, E) to have an n-coloring means that the following set Σ ⊆ Prop(A) has a model:

Σ := {(v, 1) ∨ · · · ∨ (v, n) : v ∈ V}
   ∪ {¬((v, i) ∧ (v, j)) : v ∈ V, 1 ≤ i < j ≤ n}
   ∪ {¬((v, i) ∧ (w, i)) : (v, w) ∈ E, 1 ≤ i ≤ n}.

The assumption that all finite subgraphs of (V, E) are n-colorable yields that every finite subset of Σ has a model. Hence by compactness Σ has a model, and such a model yields an n-coloring of (V, E).

Exercises.
(1) Let (P, ≤) be a poset. Then there is a linear order ≤′ on P that extends ≤. (Hint: use the compactness theorem and the fact that this is true when P is finite.)
(2) Suppose Σ ⊆ Prop(A) is such that for each truth assignment t : A → {0, 1} there is p ∈ Σ with t(p) = 1. Then there are p1, . . . , pn ∈ Σ such that p1 ∨ · · · ∨ pn is a tautology. (The interesting case is when A is infinite.)

2.3 Languages and Structures

Propositional Logic captures only one aspect of mathematical reasoning. We also need the capability to deal with predicates (also called relations), variables, and the quantifiers "for all" and "there exists." We now begin setting up a framework for Predicate Logic (or First-Order Logic), which has these additional features and has a claim on being a complete logic for mathematical reasoning.

Definition. A language¹ L is a disjoint union of:
(i) a set Lr of relation symbols; each R ∈ Lr has associated arity a(R) ∈ N;
(ii) a set Lf of function symbols; each F ∈ Lf has associated arity a(F) ∈ N.

¹ What we call here a language is also known as a signature, or a vocabulary.

An m-ary relation or function symbol is one that has arity m. Instead of "0-ary", "1-ary", "2-ary" we say "nullary", "unary", "binary". A constant symbol is a function symbol of arity 0.

Examples.
(1) The language LGr = {1, ⁻¹, ·} of groups has constant symbol 1, unary function symbol ⁻¹, and binary function symbol ·.
(2) The language LAb = {0, −, +} of (additive) abelian groups has constant symbol 0, unary function symbol −, and binary function symbol +.
(3) The language LO = {<} has just one binary relation symbol <.
(4) The language LOAb = {<, 0, −, +} of ordered abelian groups.
(5) The language LRig = {0, 1, +, ·} of rigs (or semirings) has constant symbols 0 and 1, and binary function symbols + and ·.
(6) The language LRi = {0, 1, −, +, ·} of rings. The symbols are those of the previous example, plus the unary function symbol −.

From now on, unless indicated otherwise, let L denote a language.

Definition. A structure A for L (or L-structure) is a triple A = (A; (RA)R∈Lr, (FA)F∈Lf) consisting of:
(i) a nonempty set A, the underlying set of A;²
(ii) for each m-ary R ∈ Lr a set RA ⊆ Am (an m-ary relation on A), the interpretation of R in A;
(iii) for each n-ary F ∈ Lf an operation FA : An → A (an n-ary operation on A), the interpretation of F in A.

² It is also called the universe of A; we prefer less grandiose terminology.

Remark. The interpretation of a constant symbol c of L is a function cA : A⁰ → A. Since A⁰ has just one element, cA is uniquely determined by its value at this element; we shall identify cA with this value, so cA ∈ A.

Given an L-structure A, the relations RA on A (for R ∈ Lr), and operations FA on A (for F ∈ Lf) are called the primitives of A. When A is clear from context we often omit the superscript A in denoting the interpretation of a symbol of L in A. The reader is supposed to keep in mind the distinction between symbols of L and their interpretation in an L-structure, even if we use the same notation for both.

Examples.
(1) Each group is considered as an LGr-structure by interpreting the symbols 1, ⁻¹, and · as the identity element of the group, its group inverse, and its group multiplication, respectively.
(2) Let A = (A; 0, −, +) be an abelian group; here 0 ∈ A is the zero element of the group, and − : A → A and + : A² → A denote the group operations of A. We consider A as an LAb-structure by taking as interpretations of the symbols 0, − and + of LAb the group operations 0, − and + on A. (We took here the liberty of using the same notation for possibly entirely different things: + is an element of the set LAb, but also denotes in this context its interpretation as a binary operation on the set A. Similarly with 0 and −.) In fact, any set A in which we single out an element, a unary operation on A, and a binary operation on A, can be construed as an LAb-structure if we choose to do so.
(3) (N, <) is an LO-structure where we interpret < as the usual ordering relation on N. Similarly for (Z, <), (Q, <) and (R, <). (Here we take even more notational liberties, by letting < denote five different things: a symbol of LO, and the usual orderings of N, Z, Q, and R respectively.) Again, any nonempty set A equipped with a binary relation on it can be viewed as an LO-structure.
(4) (Z, <, 0, −, +) and (Q, <, 0, −, +) are both LOAb-structures.
(5) (N, 0, 1, +, ·) is an LRig-structure.
(6) (Z, 0, 1, −, +, ·) is an LRi-structure.

Let B be an L-structure with underlying set B, and let A be a nonempty subset of B such that FB(An) ⊆ A for every n-ary function symbol F of L. Then A is the underlying set of an L-structure A defined by letting

FA := FB|An : An → A  for n-ary F ∈ Lf,    RA := RB ∩ Am  for m-ary R ∈ Lr.

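To make the coloring application tangible, the following Python sketch (not part of the original notes) writes out the constraint set Σ from the argument above for a concrete finite graph. The tuple encoding of propositions and the names coloring_constraints and OR are illustrative; atoms are wrapped as ("atom", (v, i)).

```python
# Illustrative sketch: the constraint set Σ from the coloring argument, for a finite graph.
# An atom ("atom", (v, i)) represents "vertex v has color i".
from itertools import combinations

def coloring_constraints(V, E, n):
    """Return Σ as a list of propositions (nested tuples): every vertex gets a color,
    no vertex gets two colors, and neighbouring vertices get different colors."""
    def OR(ps):                      # p1 ∨ ... ∨ pk for k >= 1
        out = ps[0]
        for p in ps[1:]:
            out = ("or", out, p)
        return out
    sigma = [OR([("atom", (v, i)) for i in range(1, n + 1)]) for v in V]
    sigma += [("not", ("and", ("atom", (v, i)), ("atom", (v, j))))
              for v in V for i, j in combinations(range(1, n + 1), 2)]
    sigma += [("not", ("and", ("atom", (v, i)), ("atom", (w, i))))
              for (v, w) in E for i in range(1, n + 1)]
    return sigma

# A 4-cycle is 2-colorable, so the corresponding Σ for n = 2 has a model.
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(len(coloring_constraints(V, E, 2)))   # number of constraints produced
```

For an infinite graph the same recipe produces an infinite Σ; the compactness theorem is what lets us pass from models of its finite subsets (colorings of finite subgraphs) to a model of all of Σ.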
Such an L-structure A is said to be a substructure of B, notation: A ⊆ B. We also say in this case that B is an extension of A, or extends A.

Examples.
(1) (Z, 0, 1, −, +, ·) ⊆ (Q, 0, 1, −, +, ·) ⊆ (R, 0, 1, −, +, ·)
(2) (N, 0, 1, +, ·) ⊆ (Z, 0, 1, +, ·)

Definition. Let A = (A; . . .) and B = (B; . . .) be L-structures.
A homomorphism h : A → B is a map h : A → B such that
(i) for each m-ary R ∈ Lr and each (a1, . . . , am) ∈ Am we have (a1, . . . , am) ∈ RA =⇒ (ha1, . . . , ham) ∈ RB;
(ii) for each n-ary F ∈ Lf and each (a1, . . . , an) ∈ An we have h(FA(a1, . . . , an)) = FB(ha1, . . . , han).
Replacing =⇒ in (i) by ⇐⇒ yields the notion of a strong homomorphism. An embedding is an injective strong homomorphism; an isomorphism is a bijective strong homomorphism. An automorphism of A is an isomorphism A → A.

If A ⊆ B, then the inclusion a → a : A → B is an embedding A → B. Conversely, a homomorphism h : A → B yields a substructure h(A) of B with underlying set h(A), and if h is an embedding we have an isomorphism a → h(a) : A → h(A).

If i : A → B and j : B → C are homomorphisms (strong homomorphisms, embeddings, isomorphisms, respectively), then so is j ◦ i : A → C. The identity map 1A on A is an automorphism of A. If i : A → B is an isomorphism then so is the map i⁻¹ : B → A. Thus the automorphisms of A form a group Aut(A) under composition with identity 1A.

If A and B are groups (viewed as structures for the language LGr), then a homomorphism h : A → B is exactly what in algebra is called a homomorphism from the group A to the group B. Likewise with rings, and other kinds of algebraic structures.

Examples.
(1) Let A = (Z, 0, −, +). Then k → −k is an automorphism of A.
(2) Let A = (Z, <). The map k → k + 1 is an automorphism of A with inverse given by k → k − 1.

Definition. A congruence on the L-structure A is an equivalence relation ∼ on its underlying set A such that
(i) if R ∈ Lr is m-ary and a1 ∼ b1, . . . , am ∼ bm, then (a1, . . . , am) ∈ RA ⇐⇒ (b1, . . . , bm) ∈ RA;
(ii) if F ∈ Lf is n-ary and a1 ∼ b1, . . . , an ∼ bn, then FA(a1, . . . , an) ∼ FA(b1, . . . , bn).

Note that a strong homomorphism h : A → B yields a congruence ∼h on A as follows: for a1, a2 ∈ A we put a1 ∼h a2 ⇐⇒ h(a1) = h(a2).

Given a congruence ∼ on the L-structure A we obtain an L-structure A/∼ (the quotient of A by ∼) as follows:
(i) the underlying set of A/∼ is the quotient set A/∼;
(ii) the interpretation of an m-ary R ∈ Lr in A/∼ is the m-ary relation {(a1∼, . . . , am∼) : (a1, . . . , am) ∈ RA} on A/∼;
(iii) the interpretation of an n-ary F ∈ Lf in A/∼ is the n-ary operation (a1∼, . . . , an∼) → FA(a1, . . . , an)∼ on A/∼.
Note that then we have a strong homomorphism a → a∼ : A → A/∼.

Products. Let (Bi)i∈I be a family of L-structures, Bi = (Bi; . . .) for i ∈ I. The product

∏i∈I Bi

is defined to be the L-structure B whose underlying set is the product set ∏i∈I Bi, and where the basic relations and functions are defined coordinatewise: for m-ary R ∈ Lr and elements b1 = (b1i), . . . , bm = (bmi) ∈ ∏i∈I Bi,

(b1, . . . , bm) ∈ RB ⇐⇒ (b1i, . . . , bmi) ∈ RBi for all i ∈ I,

and for n-ary F ∈ Lf and b1 = (b1i), . . . , bn = (bni) ∈ ∏i∈I Bi,

FB(b1, . . . , bn) := (FBi(b1i, . . . , bni))i∈I.

For j ∈ I the projection map to the jth factor is the homomorphism ∏i∈I Bi → Bj, (bi) → bj. This product construction makes it possible to combine several homomorphisms with a common domain into a single one: if for each i ∈ I we have a homomorphism hi : A → Bi, we obtain a homomorphism h = (hi) : A → ∏i∈I Bi, h(a) := (hi(a))i∈I.

Exercises.
(1) Let G be a group viewed as a structure for the language of groups. Each normal subgroup N of G yields a congruence ≡N on G by a ≡N b ⇐⇒ aN = bN, and each congruence on G equals ≡N for a unique normal subgroup N of G.
(2) Consider a strong homomorphism h : A → B of L-structures. Then we have an isomorphism from A/∼h onto h(A) given by a∼h → h(a).

2.4 Variables and Terms

Throughout this course

Var = {v0, v1, v2, . . .}

is a countably infinite set of symbols whose elements will be called variables; we assume that vm ≠ vn for m ≠ n, and that no variable is a function or relation symbol in any language. We let x, y, z (sometimes with subscripts or superscripts) denote variables, unless indicated otherwise.

Remark. The results in Chapter 5 on undecidability presuppose a numbering of the variables; our Var = {v0, v1, v2, . . .} comes equipped with such a numbering. Chapters 2–4 go through if we take as our set Var of variables any infinite (possibly uncountable) set; in model theory this can even be convenient. For this more general Var we still insist that no variable is a function or relation symbol in any language. In the few cases in chapters 2–4 that this more general set-up requires changes in proofs, this will be pointed out.

Definition. An L-term is a word on the alphabet Lf ∪ Var obtained as follows:
(i) each variable (viewed as a word of length 1) is an L-term;
(ii) whenever F ∈ Lf is n-ary and t1, . . . , tn are L-terms, then the concatenation F t1 . . . tn is an L-term.

Note: constant symbols of L are L-terms of length 1, by clause (ii) for n = 0. The L-terms are the admissible words on the alphabet Lf ∪ Var where each variable has arity 0. Thus "unique readability" is available.

Example. The word · + x − yz is an LRi-term. If a term is written as an admissible word, then it may be hard to see how it is built up from subterms. In practice we shall therefore use parentheses and brackets in denoting terms, and avoid prefix notation if tradition dictates otherwise. For easier reading we indicate this term instead by (x + (−y)) · z or even (x − y)z.

We often write t(x1, . . . , xn) to indicate an L-term t in which no variables other than x1, . . . , xn occur. Whenever we use this notation we assume tacitly that x1, . . . , xn are distinct. Note that we do not require that each of x1, . . . , xn actually occurs in t(x1, . . . , xn). (This is like indicating a polynomial in the indeterminates x1, . . . , xn by p(x1, . . . , xn), where one allows that some of these indeterminates do not actually occur in the polynomial p.) A term is said to be variable-free if no variables occur in it.

Definition. Let t be a variable-free L-term and A an L-structure. We associate to t an element tA of A: if t is a constant symbol c, then tA = cA ∈ A, where cA is as in the previous section, and if t = F t1 . . . tn with n-ary F ∈ Lf and variable-free L-terms t1, . . . , tn, then tA = FA(t1A, . . . , tnA). This inductive definition is justified by unique readability.

Definition. Let A be an L-structure and t = t(x) an L-term, where x = (x1, . . . , xm) and x1, . . . , xm are distinct variables. Then we associate to the ordered pair (t, x) a function tA : Am → A as follows:
(i) if t is the variable xi, then tA(a) = ai for a = (a1, . . . , am) ∈ Am;
(ii) if t = F t1 . . . tn where F ∈ Lf is n-ary and t1, . . . , tn are L-terms, then tA(a) = FA(t1A(a), . . . , tnA(a)) for a ∈ Am.
In particular, if t is variable-free, then the above gives a nullary function tA : A⁰ → A, identified as usual with its value at the unique element of A⁰, so tA ∈ A. Note that if B is a second L-structure and A ⊆ B, then tA(a) = tB(a) for t as above and a ∈ Am.

Example. Consider R as a ring in the usual way, and let t(x, y, z) be the LRi-term (x − y)z. Then the function tR : R³ → R is given by tR(a, b, c) = (a − b)c.

Definition. Suppose t is an L-term, let x1, . . . , xn be distinct variables, and let τ1, . . . , τn be L-terms. Then t(τ1/x1, . . . , τn/xn) is the word obtained by replacing all occurrences of xi in t by τi, simultaneously for i = 1, . . . , n. In the definition of t(τ1/x1, . . . , τn/xn) the "replacing" should be simultaneous, because it can happen that for t′ := t(τ1/x1) we have t′(τ2/x2) ≠ t(τ1/x1, τ2/x2), where x1, x2 are distinct variables and τ1, τ2 are L-terms. If t is given in the form t(x1, . . . , xn), then we write t(τ1, . . . , τn) as a shorthand for t(τ1/x1, . . . , τn/xn).

The easy proof of the next lemma is left to the reader.

Lemma 2.4.1. t(τ1/x1, . . . , τn/xn) is an L-term. If τ1, . . . , τn are variable-free and t = t(x1, . . . , xn), then t(τ1, . . . , τn) is variable-free.

We urge the reader to do exercise (1) below and thus acquire the confidence that these formal term substitutions do correspond to actual function substitutions.

Generators. Let B be an L-structure, let G ⊆ B, and assume also that L has a constant symbol or that G ≠ ∅. Then the set

{tB(g1, . . . , gm) : t(x1, . . . , xm) is an L-term and g1, . . . , gm ∈ G} ⊆ B

is the underlying set of some A ⊆ B, and this A is clearly a substructure of any A′ ⊆ B with G ⊆ A′. We call this A the substructure of B generated by G. If A = B, then we say that B is generated by G. If (ai)i∈I is a family of elements of B, then "generated by (ai)" means "generated by G" where G = {ai : i ∈ I}.

Exercises.
(1) Let t(x1, . . . , xm) and τ1(y1, . . . , yn), . . . , τm(y1, . . . , yn) be L-terms. Then the L-term t*(y1, . . . , yn) := t(τ1(y1, . . . , yn), . . . , τm(y1, . . . , yn)) has the property that if A is an L-structure and a = (a1, . . . , an) ∈ An, then (t*)A(a) = tA(τ1A(a), . . . , τmA(a)).
(2) For every LAb-term t(x1, . . . , xn) there are integers k1, . . . , kn such that in every abelian group A = (A; 0, −, +) we have tA(a1, . . . , an) = k1a1 + · · · + knan for all (a1, . . . , an) ∈ An. Conversely, for any integers k1, . . . , kn there is an LAb-term t(x1, . . . , xn) such that in every abelian group A the above displayed identity holds.
(3) For every LRi-term t(x1, . . . , xn) there is a polynomial P(x1, . . . , xn) ∈ Z[x1, . . . , xn] such that for every commutative ring R = (R; 0, 1, −, +, ·) we have tR(r1, . . . , rn) = P(r1, . . . , rn) for all (r1, . . . , rn) ∈ Rn. Conversely, for any polynomial P(x1, . . . , xn) ∈ Z[x1, . . . , xn] there is an LRi-term t(x1, . . . , xn) such that in every commutative ring R the above displayed identity holds.
(4) Let A and B be L-structures, h : A → B a homomorphism, and t = t(x1, . . . , xn) an L-term. Then h(tA(a1, . . . , an)) = tB(ha1, . . . , han) for all (a1, . . . , an) ∈ An. (If A ⊆ B and h : A → B is the inclusion, this gives tA(a1, . . . , an) = tB(a1, . . . , an) for all (a1, . . . , an) ∈ An.)
(5) Consider the L-structure A = (N, 0, 1, +, ·) where L = LRig.
(a) Is there an L-term t(x) such that tA(0) = 1 and tA(1) = 0?
(b) Is there an L-term t(x) such that tA(n) = 2n for all n ∈ N?
(c) Find all the substructures of A.

2.5 Formulas and Sentences

Besides variables we also introduce the eight distinct logical symbols

⊤  ⊥  ¬  ∨  ∧  =  ∃  ∀

The first five of these we already met when discussing propositional logic. None of these eight symbols is a variable, or a function or relation symbol of any language. Below L denotes a language. To distinguish the logical symbols from those in L, the latter are often referred to as the non-logical symbols.

Definition. The atomic L-formulas are the following words on the alphabet L ∪ Var ∪ {⊤, ⊥, =}:
(i) ⊤ and ⊥;
(ii) Rt1 . . . tm, where R ∈ Lr is m-ary and t1, . . . , tm are L-terms;
(iii) = t1 t2, where t1 and t2 are L-terms.

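The closure construction in the "Generators" paragraph above is effective for finite structures: one simply keeps applying the interpreted operations until nothing new appears. The following Python sketch (not part of the original notes) illustrates this; the dictionary representation of a structure and the name generated_substructure are illustrative choices.

```python
# Illustrative sketch: the underlying set of the substructure of a finite structure B
# generated by a set G, obtained by closing G under the interpreted operations.
from itertools import product

def generated_substructure(B, G):
    """B is {"set": ..., "fun": {symbol: (arity, operation)}}.  Returns the smallest
    subset of B["set"] containing G and closed under all operations; in particular it
    contains the values of all constant symbols (arity-0 operations)."""
    A = set(G)
    changed = True
    while changed:
        changed = False
        for _, (arity, op) in B["fun"].items():
            for args in product(A, repeat=arity):
                b = op(*args)
                if b not in A:
                    A.add(b)
                    changed = True
    return A

# In (Z/12Z; 0, +) the substructure generated by {4} has underlying set {0, 4, 8}.
Z12 = {"set": range(12),
       "fun": {"0": (0, lambda: 0), "+": (2, lambda a, b: (a + b) % 12)}}
print(sorted(generated_substructure(Z12, {4})))   # [0, 4, 8]
```

This is just the set {tB(g1, . . . , gm) : t an L-term, gi ∈ G} computed bottom-up: each new element produced is the value of some term in the generators.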
Definition. The L-formulas are the words on the larger alphabet L ∪ Var ∪ {⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀} obtained as follows:
(i) every atomic L-formula is an L-formula;
(ii) if ϕ, ψ are L-formulas, then so are ¬ϕ, ∨ϕψ and ∧ϕψ;
(iii) if ϕ is an L-formula and x is a variable, then ∃xϕ and ∀xϕ are L-formulas.

Note that all L-formulas are admissible words on the alphabet L ∪ Var ∪ {⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀}, where =, ∃ and ∀ are given arity 2 and the other symbols have the arities assigned to them earlier. (However, not all admissible words on this alphabet are L-formulas: the word ∃xx is admissible but not an L-formula.) This fact makes the results on unique readability applicable to L-formulas. At this point the reader is invited to do the first exercise at the end of this section, which gives another useful characterization of subformulas.

The notational conventions introduced in the section on propositional logic go through, with the role of propositions there taken over by formulas here. (For example, given L-formulas ϕ and ψ we shall write ϕ ∨ ψ to indicate ∨ϕψ, and ϕ → ψ to indicate ¬ϕ ∨ ψ.) To increase readability we usually write an atomic formula = t1 t2 as t1 = t2 and its negation ¬ = t1 t2 as t1 ≠ t2. The logical symbol = is treated just as a binary relation symbol, but its interpretation in a structure will always be the equality relation on its underlying set. The reader should distinguish between different ways of using the symbol =: sometimes it denotes one of the eight formal logical symbols, but we also use it to indicate equality of mathematical objects in the way we have done already many times. The context should always make it clear what our intention is in this respect without having to spell it out.

Definition. Let ϕ be a formula of L. Written as a word on the alphabet above we have ϕ = s1 . . . sm. A subformula of ϕ is a subword of the form si . . . sk, where 1 ≤ i ≤ k ≤ m, which also happens to be a formula of L. An occurrence of a variable x in ϕ at the j-th place (that is, sj = x) is said to be a bound occurrence if ϕ has a subformula si si+1 . . . sk with i ≤ j ≤ k that is of the form ∃xψ or ∀xψ. If an occurrence is not bound then it is said to be a free occurrence.

Example. In the formula ∃x(x = y) ∧ x = 0, where x and y are distinct, the first two occurrences of x are bound, the third is free, and the only occurrence of y is free. (Note: the formula is actually the string ∧∃x = xy = x0, and the occurrences of x and y are really the occurrences in this string.)

Definition. Suppose ϕ is an L-formula, let x1, . . . , xn be distinct variables, and let t1, . . . , tn be L-terms. Then ϕ(t1/x1, . . . , tn/xn) is the word obtained by replacing all the free occurrences of xi in ϕ by ti, simultaneously for i = 1, . . . , n. In the definition of ϕ(t1/x1, . . . , tn/xn) the "replacing" should be simultaneous, because it can happen that ϕ(t1/x1)(t2/x2) ≠ ϕ(t1/x1, t2/x2).

We have the following lemma whose routine proof is left to the reader.

Lemma 2.5.1. ϕ(t1/x1, . . . , tn/xn) is an L-formula.

We write ϕ(x1, . . . , xn) to indicate a formula ϕ such that all variables that occur free in ϕ are among x1, . . . , xn. In using this notation it is understood that x1, . . . , xn are distinct variables, but it is not required that each of x1, . . . , xn occurs free in ϕ. (This is like indicating a polynomial in the indeterminates x1, . . . , xn by p(x1, . . . , xn), where one allows that some of these indeterminates do not actually occur in the polynomial p.) If ϕ is given in the form ϕ(x1, . . . , xn), and t1, . . . , tn are L-terms, then we write ϕ(t1, . . . , tn) as a shorthand for ϕ(t1/x1, . . . , tn/xn). If t1, . . . , tn are variable-free and ϕ = ϕ(x1, . . . , xn), then ϕ(t1, . . . , tn) is an L-sentence.

Definition. A sentence is a formula in which all occurrences of variables are bound occurrences.

Let A be an L-structure with underlying set A, and let C ⊆ A. We extend L to a language LC by adding a constant symbol c for each c ∈ C, called the name of c. These names are symbols not in L. We make A into an LC-structure by keeping the same underlying set and interpretations of symbols of L, and by interpreting each name c as the element c ∈ C. The LC-structure thus obtained is indicated by AC. All this applies in particular to the case C = A. Hence for each variable-free LA-term t we have a corresponding element tAA of A, which for simplicity of notation we denote instead by tA.

We can now define what it means for an LA-sentence σ to be true in the L-structure A (notation: A |= σ; also read as A satisfies σ, or σ holds in A, or σ is valid in A). First we consider atomic LA-sentences:
(i) A |= ⊤, and A ̸|= ⊥;
(ii) A |= Rt1 . . . tm if and only if (t1A, . . . , tmA) ∈ RA, for m-ary R ∈ Lr and variable-free LA-terms t1, . . . , tm;
(iii) A |= t1 = t2 if and only if t1A = t2A, for variable-free LA-terms t1, t2.

We extend the definition inductively to arbitrary LA-sentences as follows:
(i) Suppose σ = ¬σ1. Then A |= σ if and only if A ̸|= σ1.
(ii) Suppose σ = σ1 ∨ σ2. Then A |= σ if and only if A |= σ1 or A |= σ2.
(iii) Suppose σ = σ1 ∧ σ2. Then A |= σ if and only if A |= σ1 and A |= σ2.
(iv) Suppose σ = ∃xϕ(x). Then A |= σ if and only if A |= ϕ(a) for some a ∈ A.

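Over a finite structure the inductive satisfaction clauses can be executed directly, since the quantifier cases only require running through the finitely many elements of the underlying set. The following Python sketch (not part of the original notes) does this; the tuple encoding of terms and formulas, the structure Zmod5, and the names ev and holds are illustrative choices. Instead of names for elements it carries an assignment of the free variables, which amounts to the same bookkeeping.

```python
# Illustrative sketch: deciding A |= σ for a *finite* structure A by following
# the inductive clauses above.  Terms and formulas are nested tuples.
Zmod5 = {"set": range(5),
         "fun": {"0": lambda: 0, "1": lambda: 1,
                 "+": lambda a, b: (a + b) % 5, "*": lambda a, b: (a * b) % 5},
         "rel": {}}

def ev(t, A, a):                       # value of a term under the assignment a
    if isinstance(t, str):             # a variable
        return a[t]
    f, *args = t
    return A["fun"][f](*(ev(s, A, a) for s in args))

def holds(A, phi, a=None):
    a = dict(a or {})
    op = phi[0]
    if op == "T": return True
    if op == "F": return False
    if op == "=": return ev(phi[1], A, a) == ev(phi[2], A, a)
    if op == "R": return tuple(ev(t, A, a) for t in phi[2:]) in A["rel"][phi[1]]
    if op == "not": return not holds(A, phi[1], a)
    if op == "or": return holds(A, phi[1], a) or holds(A, phi[2], a)
    if op == "and": return holds(A, phi[1], a) and holds(A, phi[2], a)
    if op == "exists": return any(holds(A, phi[2], {**a, phi[1]: b}) for b in A["set"])
    if op == "forall": return all(holds(A, phi[2], {**a, phi[1]: b}) for b in A["set"])
    raise ValueError(f"not a formula: {phi!r}")

# ∀x(x = 0 ∨ ∃y(x·y = 1)): every nonzero element of Z/5Z has a multiplicative inverse.
sigma = ("forall", "x", ("or", ("=", "x", ("0",)),
                         ("exists", "y", ("=", ("*", "x", "y"), ("1",)))))
print(holds(Zmod5, sigma))   # True
```

For infinite structures no such general procedure exists, which is one reason the satisfaction relation is defined, rather than computed, by the clauses above.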
(v) Suppose σ = ∀xϕ(x). Then A |= σ if and only if A |= ϕ(a) for all a ∈ A.

We didn't say so explicitly, but "inductive" refers here to induction with respect to the number of logical symbols in σ. For example, one can see that if σ has the form ∃xϕ(x) or ∀xϕ(x), then the fact that ϕ(a) has fewer logical symbols than ∃xϕ(x) is crucial for the above to count as a definition. Even if we just want to define A |= σ for L-sentences σ, the inductive definition above forces us to consider LA-sentences ϕ(a). This is why we introduced names.

It is easy to check that for an LA-sentence σ = ∃x1 . . . ∃xn ϕ(x1, . . . , xn),

A |= σ ⇐⇒ A |= ϕ(a1, . . . , an) for some (a1, . . . , an) ∈ An,

and that for an LA-sentence σ = ∀x1 . . . ∀xn ϕ(x1, . . . , xn),

A |= σ ⇐⇒ A |= ϕ(a1, . . . , an) for all (a1, . . . , an) ∈ An.

Definition. Given an LA-formula ϕ(x1, . . . , xn) we let ϕA be the following subset of An:

ϕA = {(a1, . . . , an) : A |= ϕ(a1, . . . , an)}.

The formula ϕ(x1, . . . , xn) is said to define the set ϕA in A. A set S ⊆ An is said to be definable in A if S = ϕA for some LA-formula ϕ(x1, . . . , xn). If moreover ϕ can be chosen to be an L-formula, then S is said to be 0-definable in A.

Examples.
(1) The set {r ∈ R : r < √2} is 0-definable in (R, <, 0, 1, −, +, ·): it is defined by the formula (x² < 1 + 1) ∨ (x < 0). (Here x² abbreviates the term x · x.)
(2) The set {r ∈ R : r < π} is definable in (R, <, 0, 1, −, +, ·): it is defined by the formula x < π, where π is the name of the real number π.

We now single out formulas by certain syntactical conditions. These conditions have semantic counterparts in terms of the behaviour of these formulas under various kinds of homomorphisms, as shown in some exercises below.

An L-formula is said to be quantifier-free if it has no occurrences of ∃ and no occurrences of ∀. An L-formula is said to be existential if it has the form ∃x1 . . . ∃xm ϕ with distinct x1, . . . , xm and a quantifier-free L-formula ϕ. An L-formula is said to be universal if it has the form ∀x1 . . . ∀xm ϕ with distinct x1, . . . , xm and a quantifier-free L-formula ϕ. An L-formula is said to be positive if it has no occurrences of ¬ (but it can have occurrences of ⊥).

Exercises.
(1) Let ϕ and ψ be L-formulas; put sf(ϕ) := set of subformulas of ϕ.
(a) If ϕ is atomic, then sf(ϕ) = {ϕ}.
(b) sf(¬ϕ) = {¬ϕ} ∪ sf(ϕ).
(c) sf(ϕ ∨ ψ) = {ϕ ∨ ψ} ∪ sf(ϕ) ∪ sf(ψ), and sf(ϕ ∧ ψ) = {ϕ ∧ ψ} ∪ sf(ϕ) ∪ sf(ψ).
(d) sf(∃xϕ) = {∃xϕ} ∪ sf(ϕ), and sf(∀xϕ) = {∀xϕ} ∪ sf(ϕ).
(2) If t(x1, . . . , xn) is an LA-term and a1, . . . , an ∈ A, then t(a1, . . . , an)A = tA(a1, . . . , an).
(3) Suppose that S1 ⊆ An and S2 ⊆ An are defined in A by the LA-formulas ϕ1(x1, . . . , xn) and ϕ2(x1, . . . , xn) respectively. Then:
(a) S1 ∪ S2 is defined in A by (ϕ1 ∨ ϕ2)(x1, . . . , xn);
(b) S1 ∩ S2 is defined in A by (ϕ1 ∧ ϕ2)(x1, . . . , xn);
(c) An ∖ S1 is defined in A by ¬ϕ1(x1, . . . , xn);
(d) S1 ⊆ S2 ⇐⇒ A |= ∀x1 . . . ∀xn(ϕ1 → ϕ2).
(4) Let π : Am+n → Am be the projection map given by π(a1, . . . , am+n) = (a1, . . . , am), and for S ⊆ Am+n and a ∈ Am, put S(a) := {b ∈ An : (a, b) ∈ S} (a section of S). Suppose that S ⊆ Am+n is defined in A by the LA-formula ϕ(x, y), where x = (x1, . . . , xm) and y = (y1, . . . , yn). Then ∃y1 . . . ∃yn ϕ(x, y) defines in A the subset π(S) of Am, and ∀y1 . . . ∀yn ϕ(x, y) defines in A the set {a ∈ Am : S(a) = An}.
(5) The following sets are 0-definable in the corresponding structures:
(a) The ordering relation {(m, n) ∈ N² : m < n} in (N, 0, +).
(b) The set {2, 3, 5, 7, 11, . . .} of prime numbers in the semiring N = (N, 0, 1, +, ·).
(c) The set {2ⁿ : n ∈ N} in the semiring N.
(d) The set {a ∈ R : f is continuous at a} in (R, <, f), where f : R → R is any function.
(6) Let the symbols of L be a binary relation symbol < and a unary relation symbol U. Then there is an L-sentence σ such that for all X ⊆ R we have (R, <, X) |= σ ⇐⇒ X is finite.
(7) Let A ⊆ B. Then we consider LA to be a sublanguage of LB in such a way that each a ∈ A has the same name in LA as in LB. Then:
(a) For each variable-free LA-term t we have tA = tB.
(b) If the LA-sentence σ is quantifier-free, then A |= σ ⇔ B |= σ.
(c) If σ is an existential LA-sentence, then A |= σ ⇒ B |= σ.
(d) If σ is a universal LA-sentence, then B |= σ ⇒ A |= σ.
(8) Suppose h : A → B is a homomorphism of L-structures. For each LA-term t, let th be the LB-term obtained from t by replacing each occurrence of a name a of an element a ∈ A by the name ha of the corresponding element ha ∈ B. Similarly, for each LA-formula ϕ, let ϕh be the LB-formula obtained from ϕ by replacing each occurrence of a name a of an element a ∈ A by the name ha of the corresponding element ha ∈ B. Note that if ϕ is a sentence, so is ϕh. Then:
(a) if t is a variable-free LA-term, then h(tA) = (th)B;
(b) if σ is a positive LA-sentence without ∀-symbol, then A |= σ ⇒ B |= σh;
(c) if σ is a positive LA-sentence and h is surjective, then A |= σ ⇒ B |= σh;
(d) if σ is an LA-sentence and h is an isomorphism, then A |= σ ⇔ B |= σh.
In particular, isomorphic L-structures satisfy exactly the same L-sentences.

2.6 Models

In the rest of this chapter L is a language, A is an L-structure (with underlying set A), t is an L-term, ϕ, ψ, and θ are L-formulas, σ is an L-sentence, and Σ is a set of L-sentences, unless indicated otherwise. We drop the prefix L in "L-term" and "L-formula" and so on, unless this would cause confusion.

To discuss examples it is convenient to introduce some notation. Suppose L contains (at least) the constant symbol 0 and the binary function symbol +. Given any terms t1, . . . , tn we define the term t1 + · · · + tn inductively as follows: it is the term 0 if n = 0, the term t1 if n = 1, and the term (t1 + · · · + tn−1) + tn for n > 1. We write nt for the term t + · · · + t with n summands; in particular, 0t and 1t denote the terms 0 and t respectively. Suppose L contains the constant symbol 1 and the binary function symbol · (the multiplication sign). Then we have similar notational conventions for t1 · . . . · tn and tⁿ; in particular, for n = 0 both stand for the term 1, and t¹ is just t.

Definition. We say that A is a model of Σ, or Σ holds in A (denoted A |= Σ), if A |= σ for each σ ∈ Σ.

Fix three distinct variables x, y, z.

Examples.
(1) Groups are the LGr-structures that are models of
Gr := {∀x(x · 1 = x ∧ 1 · x = x), ∀x(x · x⁻¹ = 1 ∧ x⁻¹ · x = 1), ∀x∀y∀z((x · y) · z = x · (y · z))}.
(2) Abelian groups are the LAb-structures that are models of
Ab := {∀x(x + 0 = x), ∀x(x + (−x) = 0), ∀x∀y(x + y = y + x), ∀x∀y∀z((x + y) + z = x + (y + z))}.
(3) Torsion-free abelian groups are the LAb-structures that are models of
Ab ∪ {∀x(nx = 0 → x = 0) : n = 1, 2, 3, . . .}.
(4) Rings are the LRi-structures that are models of
Ri := Ab ∪ {∀x∀y∀z((x · y) · z = x · (y · z)), ∀x(x · 1 = x ∧ 1 · x = x), ∀x∀y∀z(x · (y + z) = x · y + x · z ∧ (x + y) · z = x · z + y · z)}.
(5) Fields are the LRi-structures that are models of
Fl := Ri ∪ {∀x∀y(x · y = y · x), 1 ≠ 0, ∀x(x ≠ 0 → ∃y(x · y = 1))}.
(6) Fields of characteristic 0 are the LRi-structures that are models of
Fl(0) := Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, 7, 11, . . .}.
(7) Given a prime number p, fields of characteristic p are the LRi-structures that are models of Fl(p) := Fl ∪ {p1 = 0}.
(8) Algebraically closed fields are the LRi-structures that are models of
ACF := Fl ∪ {∀u1 . . . ∀un∃x(xⁿ + u1xⁿ⁻¹ + · · · + un = 0) : n = 2, 3, 4, . . .}.
Here u1, u2, u3, . . . is some fixed infinite sequence of distinct variables, distinct also from x, and uixⁿ⁻ⁱ abbreviates ui · xⁿ⁻ⁱ.
(9) Algebraically closed fields of characteristic 0 are the LRi-structures that are models of ACF(0) := ACF ∪ {n1 ≠ 0 : n = 2, 3, 5, 7, 11, . . .}.
(10) Given a prime number p, algebraically closed fields of characteristic p are the LRi-structures that are models of ACF(p) := ACF ∪ {p1 = 0}.

Definition. We say that σ is a logical consequence of Σ (written Σ |= σ) if σ is true in every model of Σ.

We defined what it means for a sentence σ to hold in a given structure A. We now extend this to arbitrary formulas. First define an A-instance of a formula ϕ = ϕ(x1, . . . , xm) to be an LA-sentence of the form ϕ(a1, . . . , am) with a1, . . . , am ∈ A. Of course ϕ can also be written as ϕ(y1, . . . , yn) for another sequence of variables y1, . . . , yn; for example, y1, . . . , yn could be obtained by permuting x1, . . . , xm, or it could be x1, . . . , xm, xm+1, obtained by adding a variable xm+1. Thus for the above to count as a definition of "A-instance," the reader should check that these different ways of specifying variables (including at least the variables occurring free in ϕ) give the same A-instances.

Definition. A formula ϕ is said to be valid in A (notation: A |= ϕ) if all its A-instances are true in A. The reader should check that if ϕ = ϕ(x1, . . . , xm), then

A |= ϕ ⇐⇒ A |= ∀x1 . . . ∀xm ϕ.

We also extend the notion of "logical consequence of Σ" to formulas.

Definition. We say that ϕ is a logical consequence of Σ (notation: Σ |= ϕ) if A |= ϕ for all models A of Σ.

Example. It is well-known that in any ring R we have x · 0 = 0 for all x ∈ R. This can now be expressed as Ri |= ∀x(x · 0 = 0).

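For finite structures, "A is a model of Σ" is a finite check when Σ is finite. The following Python sketch (not part of the original notes) spells this out for the group axioms Gr above, evaluating the three universal sentences by brute force; the function name is_model_of_Gr and the structure used are illustrative choices.

```python
# Illustrative sketch: checking that a finite L_Gr-structure is a model of Gr.
def is_model_of_Gr(G, e, inv, mul):
    """G: finite underlying set; e, inv, mul: interpretations of 1, the inverse symbol,
    and the multiplication symbol.  Evaluates the three axioms of Gr directly."""
    identity = all(mul(x, e) == x and mul(e, x) == x for x in G)
    inverses = all(mul(x, inv(x)) == e and mul(inv(x), x) == e for x in G)
    assoc = all(mul(mul(x, y), z) == mul(x, mul(y, z)) for x in G for y in G for z in G)
    return identity and inverses and assoc

# (Z/6Z, +) viewed as an L_Gr-structure: 1 -> 0, inverse -> additive inverse, · -> + mod 6.
G = range(6)
print(is_model_of_Gr(G, 0, lambda x: (-x) % 6, lambda x, y: (x + y) % 6))   # True
```

Of course, for infinite structures or infinite axiom sets such as ACF no such direct check is available; this is where the notions of logical consequence and, in the next section, formal proof take over.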


One should not confuse the notion of "logical consequence of Σ" with that of "provable from Σ." We shall give a definition of provable from Σ in the next section. The two notions will turn out to be equivalent, but that is hardly obvious from their definitions: we shall need much of the next chapter to prove this equivalence, which is called the Completeness Theorem for Predicate Logic.

We finish this section with two basic facts:

Lemma 2.6.1. Let α(x1, . . . , xm) be an LA-term, and recall that α defines a map αA : Am → A. Let t1, . . . , tm be variable-free LA-terms, with tiA = ai ∈ A for i = 1, . . . , m. Then α(t1, . . . , tm) is a variable-free LA-term, and

α(t1, . . . , tm)A = α(a1, . . . , am)A = αA(t1A, . . . , tmA).

This follows by a straightforward induction on α.

Lemma 2.6.2. Let t1, . . . , tm be variable-free LA-terms with tiA = ai ∈ A for i = 1, . . . , m. Let ϕ(x1, . . . , xm) be an LA-formula. Then the LA-formula ϕ(t1, . . . , tm) is a sentence and

A |= ϕ(t1, . . . , tm) ⇐⇒ A |= ϕ(a1, . . . , am).

Proof. To keep notations simple we give the proof only for m = 1, with t = t1 and x = x1. We proceed by induction on the number of logical symbols in ϕ(x).

Suppose that ϕ is atomic. The case where ϕ is ⊤ or ⊥ is obvious. Assume ϕ is Rα1 . . . αm, where R ∈ Lr is m-ary and α1(x), . . . , αm(x) are LA-terms. Then ϕ(t) = Rα1(t) . . . αm(t) and ϕ(a) = Rα1(a) . . . αm(a). We have A |= ϕ(t) iff (α1(t)A, . . . , αm(t)A) ∈ RA, and also A |= ϕ(a) iff (α1(a)A, . . . , αm(a)A) ∈ RA. As αi(t)A = αi(a)A for all i by the previous lemma, we have A |= ϕ(t) iff A |= ϕ(a). The case that ϕ(x) is α(x) = β(x) is handled the same way.

It is also clear that the desired property is inherited by disjunctions, conjunctions and negations of formulas ϕ(x) that have the property.

Suppose now that ϕ(x) = ∃y ψ.
Case y ≠ x: Then ψ = ψ(x, y), ϕ(t) = ∃yψ(t, y) and ϕ(a) = ∃yψ(a, y). As ϕ(t) = ∃yψ(t, y), we have A |= ϕ(t) iff A |= ψ(t, b) for some b ∈ A. By the inductive hypothesis the latter is equivalent to A |= ψ(a, b) for some b ∈ A, hence equivalent to A |= ∃yψ(a, y). As ϕ(a) = ∃yψ(a, y), we conclude that A |= ϕ(t) iff A |= ϕ(a).
Case y = x: Then x does not occur free in ϕ(x) = ∃xψ. So ϕ(t) = ϕ(a) = ϕ is an LA-sentence, and A |= ϕ(t) ⇔ A |= ϕ(a) is obvious.

When ϕ(x) = ∀y ψ, one can proceed exactly as above by distinguishing two cases.

2.7 Logical Axioms and Rules; Formal Proofs

In this section we introduce a proof system for predicate logic and state its completeness. We then derive as a consequence the compactness theorem and some of its corollaries. The completeness is proved in the next chapter.



A propositional axiom of L is by definition a formula that for some ϕ, ψ, θ occurs in the list below:

1. ⊤
2. ϕ → (ϕ ∨ ψ); ϕ → (ψ ∨ ϕ)
3. ¬ϕ → ¬ψ → ¬(ϕ ∨ ψ)
4. (ϕ ∧ ψ) → ϕ; (ϕ ∧ ψ) → ψ
5. ϕ → ψ → (ϕ ∧ ψ)
6. ϕ → (ψ → θ) → (ϕ → ψ) → (ϕ → θ)
7. ϕ → (¬ϕ → ⊥)
8. (¬ϕ → ⊥) → ϕ

Each of items 2–8 is a scheme describing infinitely many axioms. Note that this list is the same as the list in Section 2.2 except that instead of propositions p, q, r we have formulas ϕ, ψ, θ. The logical axioms of L are the propositional axioms of L and the equality and quantifier axioms of L as defined below.

Definition. The equality axioms of L are the following formulas:
(i) x = x,
(ii) x = y → y = x,
(iii) (x = y ∧ y = z) → x = z,
(iv) (x1 = y1 ∧ . . . ∧ xm = ym ∧ Rx1 . . . xm) → Ry1 . . . ym,
(v) (x1 = y1 ∧ . . . ∧ xn = yn) → Fx1 . . . xn = Fy1 . . . yn,
with the following restrictions on the variables and symbols of L: x, y, z are distinct in (ii) and (iii); in (iv), x1, . . . , xm, y1, . . . , ym are distinct and R ∈ Lr is m-ary; in (v), x1, . . . , xn, y1, . . . , yn are distinct, and F ∈ Lf is n-ary.

Note that (i) represents an axiom scheme rather than a single axiom, since different variables x give different formulas x = x. Likewise with (ii)–(v).

Let x and y be distinct variables, and let ϕ(y) be the formula ∃x(x = y). Then ϕ(y) is valid in all A with |A| > 1, but ϕ(x/y) is invalid in all A. Thus substituting x for the free occurrences of y does not always preserve validity. To get rid of this anomaly, we introduce the following restriction on substitutions of a term t for free occurrences of y.

Definition. We say that t is free for y in ϕ if no variable in t can become bound upon replacing the free occurrences of y in ϕ by t; more precisely: whenever x is a variable in t, then there are no occurrences of subformulas in ϕ of the form ∃xψ or ∀xψ that contain an occurrence of y that is free in ϕ.

Note that if t is variable-free, then t is free for y in ϕ. We remark that "free for" abbreviates "free to be substituted for." In exercise 2 the reader is asked to show that, with this restriction, substitution of a term for the free occurrences of a variable does preserve validity.


Definition. The quantifier axioms of L are the formulas

ϕ(t/y) → ∃yϕ  and  ∀yϕ → ϕ(t/y)

where t is free for y in ϕ. These axioms have been chosen to have the following property.

Proposition 2.7.1. The logical axioms of L are valid in every L-structure.

We first prove this for the propositional axioms of L. Let α1, . . . , αn be distinct propositional atoms not in L. Let p = p(α1, . . . , αn) ∈ Prop{α1, . . . , αn}. Let ϕ1, . . . , ϕn be formulas and let p(ϕ1, . . . , ϕn) be the word obtained by replacing each occurrence of αi in p by ϕi for i = 1, . . . , n. One checks easily that p(ϕ1, . . . , ϕn) is a formula.

Lemma 2.7.2. Suppose ϕi = ϕi(x1, . . . , xm) for 1 ≤ i ≤ n and let a1, . . . , am ∈ A. Define a truth assignment t : {α1, . . . , αn} −→ {0, 1} by t(αi) = 1 iff A |= ϕi(a1, . . . , am). Then p(ϕ1, . . . , ϕn) is an L-formula and

p(ϕ1, . . . , ϕn)(a1/x1, . . . , am/xm) = p(ϕ1(a1, . . . , am), . . . , ϕn(a1, . . . , am)),
t(p(α1, . . . , αn)) = 1 ⇐⇒ A |= p(ϕ1(a1, . . . , am), . . . , ϕn(a1, . . . , am)).

In particular, if p is a tautology, then A |= p(ϕ1, . . . , ϕn).

Proof. Easy induction on p. We leave the details to the reader.

Definition. An L-tautology is a formula of the form p(ϕ1, . . . , ϕn) for some tautology p(α1, . . . , αn) ∈ Prop{α1, . . . , αn} and some formulas ϕ1, . . . , ϕn.

By Lemma 2.7.2 all L-tautologies are valid in all L-structures. The propositional axioms of L are L-tautologies, so all propositional axioms of L are valid in all L-structures. It is easy to check that all equality axioms of L are valid in all L-structures. In exercise 3 below the reader is asked to show that all quantifier axioms of L are valid in all L-structures. This finishes the proof of Proposition 2.7.1.

Next we introduce rules for deriving new formulas from given formulas.

Definition. The logical rules of L are the following:
(i) Modus Ponens (MP): From ϕ and ϕ → ψ, infer ψ.
(ii) Generalization Rule (G): If the variable x does not occur free in ϕ, then
(a) from ϕ → ψ, infer ϕ → ∀xψ;
(b) from ψ → ϕ, infer ∃xψ → ϕ.

A key property of the logical rules is that their application preserves validity. Here is a more precise statement of this fact, to be verified by the reader.
(i) If A |= ϕ and A |= ϕ → ψ, then A |= ψ.
(ii) Suppose x does not occur free in ϕ. Then
(a) if A |= ϕ → ψ, then A |= ϕ → ∀xψ;
(b) if A |= ψ → ϕ, then A |= ∃xψ → ϕ.

Definition. A formal proof, or just proof, of ϕ from Σ is a sequence ϕ1, . . . , ϕn of formulas with n ≥ 1 and ϕn = ϕ, such that for k = 1, . . . , n:

(i) either ϕk ∈ Σ,
(ii) or ϕk is a logical axiom,
(iii) or there are i, j ∈ {1, . . . , k − 1} such that ϕk can be inferred from ϕi and ϕj by MP, or from ϕi by G.

If there exists a proof of ϕ from Σ, then we write Σ ⊢ ϕ and say Σ proves ϕ.

Proposition 2.7.3. If Σ ⊢ ϕ, then Σ |= ϕ.

Proof. This follows easily from earlier facts that we stated and which the reader was asked to verify.

The converse is more interesting, and due to Gödel (1930):

Theorem 2.7.4 (Completeness Theorem of Predicate Logic).

Σ ⊢ ϕ ⇐⇒ Σ |= ϕ

Remark. Our choice of proof system, and thus our notion of formal proof, is somewhat arbitrary. However, the equivalence of ⊢ and |= (Completeness Theorem) justifies our choice of logical axioms and rules and shows in particular that no further logical axioms and rules are needed. Moreover, this equivalence has consequences that can be stated in terms of |= alone. An example is the important Compactness Theorem.

Theorem 2.7.5 (Compactness Theorem). If Σ |= σ, then there is a finite subset Σ0 of Σ such that Σ0 |= σ.

The Compactness Theorem has many consequences. Here is one.

Corollary 2.7.6. Suppose σ is an LRi-sentence that holds in all fields of characteristic 0. Then there exists a natural number N such that σ is true in all fields of characteristic p > N.

Proof. By assumption, Fl(0) |= σ, where Fl(0) = Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, . . .}. Note that Fl(0) is infinite. Then by Compactness there is N ∈ N such that Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, . . . , n ≤ N} |= σ. It follows that σ is true in all fields of characteristic p > N.

The converse of this corollary fails, see exercise 8 below. Could there be an alternative finite set of axioms whose models are exactly the fields of characteristic 0?

Corollary 2.7.7. There is no finite set of LRi-sentences whose models are exactly the fields of characteristic 0.

Proof. Suppose there is such a finite set of sentences {σ1, . . . , σN}. Let σ := σ1 ∧ · · · ∧ σN. Then the models of σ are just the fields of characteristic 0. By the previous result σ holds in some field of characteristic p > 0. Contradiction!
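To illustrate the definition of a formal proof: if ϕ ∈ Σ, then the sequence ϕ, ϕ → (ϕ ∨ ψ), ϕ ∨ ψ is a proof of ϕ ∨ ψ from Σ, since its first formula belongs to Σ, its second is a propositional axiom (item 2 of the list above), and its third is inferred from the first two by MP. Hence Σ ⊢ ϕ ∨ ψ.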

Exercises. All but the last one to be done without using Theorem 2.7.4 or 2.7.5. Exercise (4) will be used in proving Theorem 2.7.4.

(1) Let L = {R} where R is a binary relation symbol, and let A = (A; R) be a finite L-structure (i.e., the set A is finite). Then there exists an L-sentence σ such that the models of σ are exactly the L-structures isomorphic to A. (In fact, for an arbitrary language L, two finite L-structures are isomorphic iff they satisfy the same L-sentences.)
(2) If t is free for y in ϕ and ϕ is valid in A, then ϕ(t/y) is valid in A.
(3) Suppose t is free for y in ϕ = ϕ(x1, . . . , xn, y). Then:
(i) Each A-instance of the quantifier axiom ϕ(t/y) → ∃yϕ has the form ϕ(a1, . . . , an, τ) → ∃yϕ(a1, . . . , an, y) with a1, . . . , an ∈ A and τ a variable-free LA-term.
(ii) The quantifier axiom ϕ(t/y) → ∃yϕ is valid in A.
(iii) The quantifier axiom ∀yϕ → ϕ(t/y) is valid in A.
(Hint: use Lemma 2.6.2.)
(4) If ϕ is an L-tautology, then Σ ⊢ ϕ.
(5) Σ ⊢ ϕi for i = 1, . . . , n ⇐⇒ Σ ⊢ ϕ1 ∧ · · · ∧ ϕn.
(6) If Σ ⊢ ϕ → ψ and Σ ⊢ ψ → ϕ, then Σ ⊢ ϕ ↔ ψ.
(7) Σ ⊢ ¬∃xϕ ↔ ∀x¬ϕ and Σ ⊢ ¬∀xϕ ↔ ∃x¬ϕ.
(8) Indicate an LRi-sentence that is true in the field of real numbers, but false in all fields of positive characteristic.
(9) Let σ be an LAb-sentence which holds in all non-trivial torsion free abelian groups. Then there exists N ∈ N such that σ is true in all groups Z/pZ where p is a prime number and p > N.


Chapter 3

The Completeness Theorem

The main aim of this chapter is to prove the Completeness Theorem. As a byproduct we also derive some more elementary facts about predicate logic. The last section contains some of the basics of universal algebra, which we can treat here rather efficiently using our construction of a so-called term-model in the proof of the Completeness Theorem.

Conventions on the use of L, A, t, ϕ, ψ, θ, σ and Σ are as in the beginning of Section 2.6.

3.1 Another Form of Completeness

Definition. We say that Σ is consistent if Σ ⊬ ⊥, and otherwise (that is, if Σ ⊢ ⊥) we call Σ inconsistent.

It is convenient to prove first a variant of the Completeness Theorem.

Theorem 3.1.1 (Completeness Theorem, second form). Σ is consistent if and only if Σ has a model.

We first show that this second form of the Completeness Theorem implies the first form. This will be done through a series of technical lemmas, which are also useful later in this chapter.

Lemma 3.1.2. Suppose Σ ⊢ ϕ. Then Σ ⊢ ∀xϕ.

Proof. From Σ ⊢ ϕ and the L-tautology ϕ → (¬∀xϕ → ϕ) we obtain Σ ⊢ ¬∀xϕ → ϕ by MP. Then by G we have Σ ⊢ ¬∀xϕ → ∀xϕ. Using the L-tautology (¬∀xϕ → ∀xϕ) → ∀xϕ and MP we get Σ ⊢ ∀xϕ.

Lemma 3.1.3 (Deduction Lemma). Suppose Σ ∪ {σ} ⊢ ϕ. Then Σ ⊢ σ → ϕ.

Proof. By induction on the length of a proof of ϕ from Σ ∪ {σ}. The cases where ϕ is a logical axiom, or ϕ ∈ Σ ∪ {σ}, or ϕ is obtained by MP, are treated just as in the proof of the Deduction Lemma of Propositional Logic.

Suppose that ϕ is obtained by part (a) of G, so ϕ is ϕ1 → ∀xψ where x does not occur free in ϕ1 and Σ ∪ {σ} ⊢ ϕ1 → ψ, and where we assume inductively that Σ ⊢ σ → (ϕ1 → ψ). We have to argue that then Σ ⊢ σ → (ϕ1 → ∀xψ). From the L-tautology (σ → (ϕ1 → ψ)) → ((σ ∧ ϕ1) → ψ) and MP we get Σ ⊢ (σ ∧ ϕ1) → ψ. Since x does not occur free in σ ∧ ϕ1, this gives Σ ⊢ (σ ∧ ϕ1) → ∀xψ, by G. Using the L-tautology ((σ ∧ ϕ1) → ∀xψ) → (σ → (ϕ1 → ∀xψ)) and MP this gives Σ ⊢ σ → (ϕ1 → ∀xψ). The case that ϕ is obtained by part (b) of G is left to the reader.

Corollary 3.1.4. Suppose Σ ∪ {σ1, . . . , σn} ⊢ ϕ. Then Σ ⊢ σ1 ∧ · · · ∧ σn → ϕ.

We leave the proof as an exercise.

Corollary 3.1.5. Σ ⊢ σ if and only if Σ ∪ {¬σ} is inconsistent.

Proof. The proof is just like that of the corresponding fact of Propositional Logic.

Corollary 3.1.6. Σ ⊢ ∀yϕ if and only if Σ ⊢ ϕ.

Proof. (⇐) This is Lemma 3.1.2. For (⇒), assume Σ ⊢ ∀yϕ. We have the quantifier axiom ∀yϕ → ϕ, so by MP we get Σ ⊢ ϕ.

Corollary 3.1.7. Σ ⊢ ∀y1 . . . ∀yn ϕ if and only if Σ ⊢ ϕ.

Corollary 3.1.8. The second form of the Completeness Theorem implies the first form.

Proof. Assume the second form of the Completeness Theorem holds, and that Σ |= ϕ. It suffices to show that then Σ ⊢ ϕ. From Σ |= ϕ we obtain Σ |= ∀y1 . . . ∀yn ϕ where ϕ = ϕ(y1, . . . , yn), and so Σ ∪ {¬σ} has no model, where σ is the sentence ∀y1 . . . ∀yn ϕ. But then by the 2nd form of the Completeness Theorem Σ ∪ {¬σ} is inconsistent. Then by Corollary 3.1.5 we have Σ ⊢ σ, and thus by Corollary 3.1.7 we get Σ ⊢ ϕ.

We finish this section with another form of the Compactness Theorem:

Theorem 3.1.9 (Compactness Theorem, second form). If each finite subset of Σ has a model, then Σ has a model.

Proof. This follows from the second form of the Completeness Theorem.

3.2 Proof of the Completeness Theorem

We are now going to prove Theorem 3.1.1. Since (⇐) is clear, we focus our attention on (⇒): given a consistent set of sentences Σ we must show that Σ has a model. This job will be done in a series of lemmas. Unless we say so, we do not assume in those lemmas that Σ is consistent.

Lemma 3.2.1. Suppose Σ ⊢ ϕ and t is free for x in ϕ. Then Σ ⊢ ϕ(t/x).

Proof. From Σ ⊢ ϕ we get Σ ⊢ ∀xϕ by Lemma 3.1.2. Then MP together with the quantifier axiom ∀xϕ → ϕ(t/x) gives Σ ⊢ ϕ(t/x) as required.

Lemma 3.2.2. Suppose Σ ⊢ ϕ, and t1, . . . , tn are terms whose variables do not occur bound in ϕ. Then Σ ⊢ ϕ(t1/x1, . . . , tn/xn).

Proof. Take distinct variables y1, . . . , yn that do not occur in ϕ or t1, . . . , tn and that are distinct from x1, . . . , xn. Apply Lemma 3.2.1 n times in succession to obtain Σ ⊢ ψ where ψ = ϕ(y1/x1, . . . , yn/xn). Use Lemma 3.2.1 again n times to get Σ ⊢ ψ(t1/y1, . . . , tn/yn). To finish, observe that ψ(t1/y1, . . . , tn/yn) = ϕ(t1/x1, . . . , tn/xn).

Lemma 3.2.3.
(1) For each L-term t we have Σ ⊢ t = t.
(2) Let t, t′ be L-terms and Σ ⊢ t = t′. Then Σ ⊢ t′ = t.
(3) Let t1, t2, t3 be L-terms and Σ ⊢ t1 = t2 and Σ ⊢ t2 = t3. Then Σ ⊢ t1 = t3.
(4) Let R ∈ Lr be m-ary and let t1, . . . , tm, t′1, . . . , t′m be L-terms such that Σ ⊢ ti = t′i for i = 1, . . . , m and Σ ⊢ Rt1 . . . tm. Then Σ ⊢ Rt′1 . . . t′m.
(5) Let F ∈ Lf be n-ary, and let t1, . . . , tn, t′1, . . . , t′n be L-terms such that Σ ⊢ ti = t′i for i = 1, . . . , n. Then Σ ⊢ F t1 . . . tn = F t′1 . . . t′n.

Proof. For (1), use the equality axiom x = x and apply Lemma 3.2.2. For (2), take an equality axiom x = y → y = x and apply Lemma 3.2.2 to get Σ ⊢ t = t′ → t′ = t. Then MP gives Σ ⊢ t′ = t. For (3), take an equality axiom (x = y ∧ y = z) → x = z and apply Lemma 3.2.2 to get Σ ⊢ (t1 = t2 ∧ t2 = t3) → t1 = t3. By the assumptions and Exercise 5 in Section 2.7 we have Σ ⊢ t1 = t2 ∧ t2 = t3. Then MP yields Σ ⊢ t1 = t3. To prove (4) we first use the same Exercise 5 to get Σ ⊢ t1 = t′1 ∧ . . . ∧ tm = t′m ∧ Rt1 . . . tm. Take an equality axiom (x1 = y1 ∧ . . . ∧ xm = ym ∧ Rx1 . . . xm) → Ry1 . . . ym and apply Lemma 3.2.2 to obtain Σ ⊢ (t1 = t′1 ∧ . . . ∧ tm = t′m ∧ Rt1 . . . tm) → Rt′1 . . . t′m. Then MP gives Σ ⊢ Rt′1 . . . t′m. Part (5) is obtained in a similar way by using an equality axiom (x1 = y1 ∧ . . . ∧ xn = yn) → F x1 . . . xn = F y1 . . . yn.

Definition. Let TL be the set of variable-free L-terms. We define a binary relation ∼Σ on TL by

t1 ∼Σ t2 ⇐⇒ Σ ⊢ t1 = t2.

Parts (1), (2) and (3) of the last lemma yield the following.

Lemma 3.2.4. The relation ∼Σ is an equivalence relation on TL.
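For example, take L = LRi and Σ = Ri (the set of axioms for rings, cf. Section 3.4). Then (1 + 1) · (1 + 1) and 1 + 1 + 1 + 1 are variable-free L-terms with (1 + 1) · (1 + 1) ∼Σ 1 + 1 + 1 + 1, since this equality is provable from Ri using the distributive law, the axioms for 1, and associativity of +.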

Suppose L has at least one constant symbol. Then TL is nonempty. Let [t] denote the equivalence class of t ∈ TL with respect to ∼Σ.

Definition. We define the L-structure AΣ as follows:
(i) Its underlying set is AΣ := TL/∼Σ.
(ii) If R ∈ Lr is m-ary, then RAΣ ⊆ (AΣ)m is given by ([t1], . . . , [tm]) ∈ RAΣ :⇐⇒ Σ ⊢ Rt1 . . . tm (t1, . . . , tm ∈ TL).
(iii) If F ∈ Lf is n-ary, then F AΣ : (AΣ)n → AΣ is given by F AΣ([t1], . . . , [tn]) = [F t1 . . . tn] (t1, . . . , tn ∈ TL).

The reader should verify that this counts as a definition, i.e., does not introduce ambiguity. (Use parts (4) and (5) of Lemma 3.2.3.)

Corollary 3.2.5. Suppose L has a constant symbol and Σ is consistent. Then
(1) for each t ∈ TL we have tAΣ = [t];
(2) for each atomic σ we have: Σ ⊢ σ ⇐⇒ AΣ |= σ.

Proof. Part (1) follows by an easy induction. Let σ be Rt1 . . . tm where R ∈ Lr is m-ary and t1, . . . , tm ∈ TL. Then Σ ⊢ Rt1 . . . tm ⇔ ([t1], . . . , [tm]) ∈ RAΣ ⇔ AΣ |= Rt1 . . . tm, where the last "⇔" follows from the definition of |= together with part (1). Now suppose that σ is t1 = t2 where t1, t2 ∈ TL. Then Σ ⊢ t1 = t2 ⇔ [t1] = [t2] ⇔ t1AΣ = t2AΣ ⇔ AΣ |= t1 = t2. We also have Σ ⊢ ⊤ ⇔ AΣ |= ⊤. So far we haven't used the assumption that Σ is consistent, but now we do. The consistency of Σ means that Σ ⊬ ⊥. We also have AΣ ⊭ ⊥ by definition of |=. Thus Σ ⊢ ⊥ ⇔ AΣ |= ⊥, both sides being false.

Remark. If the equivalence in part (2) of this corollary holds for all σ (not only for atomic σ), then AΣ |= Σ, so we would have found a model of Σ, and be done. But clearly this equivalence can only hold for all σ if Σ has the property that for each σ, either Σ ⊢ σ or Σ ⊢ ¬σ. This property is of interest for other reasons as well, and deserves a name:

Definition. We say that Σ is complete if Σ is consistent, and for each σ either Σ ⊢ σ or Σ ⊢ ¬σ.

Remark. Completeness is a strong property and it can be hard to show that a given set of axioms is complete. The set of axioms for algebraically closed fields of characteristic 0 is complete (see the end of Section 4.3).

Example. Let L = LAb, Σ := Ab (the set of axioms for abelian groups), and σ the sentence ∃x(x ≠ 0). Then Σ ⊬ σ, since the trivial group doesn't satisfy σ. Also Σ ⊬ ¬σ, since there are non-trivial abelian groups and σ holds in such groups. Thus Σ is not complete.

A key fact about completeness needed in this chapter is that any consistent set of sentences extends to a complete set of sentences:

2. and Lc := L ∪ {c}. Let L have a constant symbol. Therefore AΣ |= ∃xϕ(x). (Sketch) Take a proof of ϕ(c) from Σ in the language Lc . Then Σ ⊆ Σ for some complete set of L-sentences Σ . The cases that σ = ¬σ1 . and take a variable z different from all variables occurring in that proof. and suppose Σ is consistent. Then Σ L ϕ(y). Suppose Σ is consistent.7. Case σ = ∀xϕ(x): This is similar to the previous case but we also need the result from Exercise 7 in Section 2.2. Proof. and also such that z = y. that is. So by Lemma 2. Applying the inductive hypothesis we get Σ ϕ(t). and σ = σ1 ∧ σ2 are treated just as in the proof of the corresponding Lemma 2. c a constant symbol not in L. It remains to consider two cases: Case σ = ∃xϕ(x): (⇒) Suppose that Σ σ. Choose t ∈ TL such that [t] = a.2. Proof.3. We call attention to some new notation in the next lemmas: the symbol used to emphasize that we are dealing with formal provability within L. We already know that (i) holds for atomic sentences. L is Lemma 3. For the converse. Check that one obtains in this way a proof of ϕ(z/y) in the language L from Σ. σ = σ1 ∨ σ2 . Because we are assuming that Σ has witnesses we have a t ∈ TL such that Σ ϕ(t).2.2.11 for Propositional Logic. hence AΣ |= ϕ(t) by Lemma 2. (ii) Σ is complete and has witnesses. This yields Σ ∃xϕ(x) by MP and the quantifier axiom ϕ(t) → ∃xϕ(x). Let Σ be a set of L-sentences. So Σ L ϕ(z/y) and hence by Lemma 3.2 we have an a ∈ AΣ such that AΣ |= ϕ(a).2.7 that ¬∀xϕ ↔ ∃x¬ϕ. In particular. We use induction on the number of logical symbols in σ to obtain (i). Then there is an a ∈ AΣ such that AΣ |= ϕ(a).2.5 holds for all σ. Σ L ϕ(y). and is just like that of the corresponding fact of Propositional Logic in Section 1. hence AΣ |= σ. Then by the inductive hypothesis AΣ |= ϕ(t). (⇐) Assume AΣ |= σ. Completeness is only a necessary condition for this equivalence to hold for all σ. then AΣ is a model of Σ.2. Theorem 3. another necessary condition is “to have witnesses”: Definition. A Σ-witness for the sentence ∃xϕ(x) is a term t ∈ TL such that Σ ϕ(t).8. Then tAΣ = a.6. It should be clear that (i) implies (ii). . PROOF OF THE COMPLETENESS THEOREM 47 Lemma 3. Replace in every formula in this proof each occurrence of c by z. assume (ii). if Σ is complete and has witnesses. The proof uses Zorn’s Lemma.6.2. Then the following two conditions are equivalent: (i) For each σ we have: Σ σ ⇔ AΣ |= σ.1 we have Σ L ϕ(z/y)(y/z). Let ϕ(y) be an L-formula and suppose Σ Lc ϕ(c). We say that Σ has witnesses if there is a Σ-witness for every sentence ∃xϕ(x) proved by Σ.6 (Lindenbaum). Completeness of Σ does not guarantee that the equivalence of part (2) of Corollary 3.

8 we have Σ L ϕ(y) → ⊥. and let A∗ be an L∗ -structure. let cσ be a constant symbol not in L such that if σ is a different L-sentence of the form ∃x ϕ (x ) provable from Σ. . Then Σ ∪ {ϕ(c)} Lc ⊥.2. Let σ1 = ∃x1 ϕ1 (x1 ). . The previous lemma covers the case n = 1. . Let c be a constant symbol not in L. Let Σn be a consistent set of Ln sentences. Lemma 3. +) is a reduct of (N. . Proof. 1. Suppose the language L∗ extends L. . . . .2. Then Σw ⊥. By the Deduction Lemma (3. . . . σn = ∃xn ϕn (xn ) be such that Σ σi for every i = 1. . Then we can choose n so large that this is actually a proof of ⊥ from Σn in Ln . cσn }-sentences. . For each L-sentence σ = ∃xϕ(x) such that Σ σ. . ϕn (cn )}. . In the next lemma we use a superscript “w” for “witness. Take a proof of ⊥ from Σ∞ . n. . cσn } from Σ ∪ {ϕ1 (cσ1 ). . Note that any L∗ -structure A∗ has a unique reduct to an L-structure. Lemma 3. . ·). This contradicts the consistency of Σn . . Suppose Σ is consistent. let A be an L-structure. . For example. . and denote their union by L∞ . ..3) Σ Lc ϕ(c) → ⊥. This contradicts Lemma 3. . +.48 CHAPTER 3. cn be distinct constant symbols not in L.1.9. Take a proof of ⊥ from Σw and let cσ1 . Then by Lemma 3. . . Proof. Proof. . . . Suppose that Σ∞ ⊥. <. . then cσ = cσ . Put L := L ∪ {c1 .. Then Σ is a consistent set of L -sentences. Let (Ln ) be an increasing sequence of languages: L0 ⊆ L1 ⊆ L2 ⊆ . . So Σ ∪ {ϕ1 (cσ1 ).2. (N.. . 0.10. .2.10. Put Lc := L ∪ {c}.11. for each n. The general case follows by induction on n. cσn be constant symbols in Lw L such that this proof is a proof of ⊥ in the language L ∪ {cσ1 . . . ϕn (cσn )} is an inconsistent set of L ∪ {cσ1 . 0. Suppose not. . .12.2. ϕn (cσn )}.2. Let c1 . . Then A is said to be a reduct of A∗ (and A∗ an expansion of A) if A and A∗ have the same underlying set and the same interpretations of the symbols of L. . . . Then the union Σ∞ := n Σn is a consistent set of L∞ -sentences. Then Σ ∪ {ϕ(c)} is a consistent set of Lc -sentences. such that Σ0 ⊆ Σ1 ⊆ Σ2 . Proof. Put Lw Σw := L ∪ {cσ : σ = ∃xϕ(x) is an L-sentence such that Σ σ} := Σ ∪ {ϕ(cσ ) : σ = ∃xϕ(x) is an L-sentence such that Σ σ} Then Σw is a consistent set of Lw -sentences. . which we indicate . THE COMPLETENESS THEOREM Lemma 3. cn } and Σ = Σ ∪ {ϕ1 (c1 ).” Lemma 3. where σi = ∃xi ϕi (xi ) for 1 ≤ i ≤ n. Applying MP yields Σ ⊥. By G we have Σ L ∃yϕ(y) → ⊥. Assume Σ is consistent and Σ ∃yϕ(y). Suppose not. contradicting the consistency of Σ. Suppose Σ is consistent. . .

and Prenex Form.2. We can now prove Theorem 3. To see this. let σ be an L∞ -sentence. Variants. It is also complete. if n is odd. and put Σn+1 := Σn Σw n if n is even. then tA = tA for all variable-free LA -terms t. and A |= σ ⇐⇒ A∗ |= σ for all LA -sentences σ. Take n even and so large that σ is an Ln -sentence.7 that Σ∞ has a model.11.3. and that f does not occur in the sentences of Σ. Put A := AΣ∞ |L . Then Σ ∀xU x.1. In some proofs we shall take advantage of the fact that the Completeness Theorem is now available. Now take n to be odd and so large that σ is an Ln -sentence and Σn σ.1. Then Σ is complete if and only if every two models of Σ satisfy the same sentences. . We construct a sequence (Ln ) of languages and a sequence (Σn ) where each Σn is a consistent set of Ln sentences. By the previous lemma the set Σ∞ of L∞ -sentences is consistent. let σ = ∃xϕ(x) be an L∞ sentence such that Σ∞ σ. Here Lw and Σw are obtained from Ln and Σn in the same way that Lw and n n Σw are obtained from L and Σ in Lemma 3. Then A |= Σ. put Ln+1 := Ln Lw n if n is even. Given the language Ln and the consistent set of Ln -sentences Σn . We begin by setting L0 = L and Σ0 = Σ. Then Σn+1 σ or Σn+1 ¬σ and thus Σ∞ σ or Σ∞ ¬σ. choose a complete set of Ln -sentences Σn ⊇ Σn . and Σn ⊆ Σn+1 . if n is odd. Then by construction of Σn+1 = Σw we have n Σn+1 ϕ(cσ ). Proof.2. a unary relation symbol U and a unary function symbol f . A key fact (to be verified by the reader) is that if A is a reduct of ∗ A∗ .3 Some Elementary Results of Predicate Logic Here we obtain some generalities of predicate logic: Equivalence and Equality Theorems. and suppose that Σ U f c. This concludes the proof of the Completeness Theorem (second form). We claim that Σ∞ has witnesses. (1) Suppose Σ is consistent. Exercises. (2) Let L have just a constant symbol c. SOME ELEMENTARY RESULTS OF PREDICATE LOGIC 49 by A∗ |L . namely AΣ∞ . To see this. It follows from Theorem 3. Let Σ be a consistent set of L-sentences. Note that Ln ⊆ Ln+1 . so Σ∞ ϕ(cσ ).3. 3.

since (ii) then follows easily. Suppose ψ = ϕ. Suppose that ψ = ∃xθ. y1 . . . Of course. Then either ψ = ϕ and ψ = ϕ . If ψ is atomic. . Then Σ ϕ ↔ ψ. a1 . . Choose variables y1 . We say ϕ1 and ϕ2 are Σ-equivalent if Σ ϕ1 ↔ ϕ2 . yn ). . y1 . . Let A be a model of Σ. .1 (Distribution Rule). .2 (Equivalence Theorem). Then ψ = ¬θ and the desired result follows easily. . We only do (i). ∀xϕ ↔ ∀xψ.3. an ) Suppose A |= ∃xϕ(x. . we just say equivalent.) One verifies easily that Σ-equivalence is an equivalence relation on the set of L-formulas. . a1 . . Then by the distribution rule Σ ∃xθ ↔ ∃xθ . . Proof. . . . THE COMPLETENESS THEOREM Lemma 3. a1 . We have A |= ϕ → ψ. The case ψ = ϕ (and thus ψ = ϕ ) is trivial. . . Then Σ ∃xϕ → ∃xψ and Σ ∃xϕ ↔ ∃xψ and Σ ∀xϕ → ∀xψ. hence A |= ψ(a0 . . and the desired result holds trivially. . . . . . (i) Suppose Σ (ii) Suppose Σ ϕ → ψ. an ). Suppose Σ ϕ ↔ ϕ . By induction on the number of logical symbols in ψ. yn such that ϕ = ϕ(x. . Then ψ is again a formula and Σ ψ ↔ ψ . Then the occurrence of ϕ we are replacing is an occurrence inside θ. and let ψ be obtained from ψ by replacing some occurrence of ϕ as a subformula of ψ by ϕ . a1 . . The cases ψ = ψ1 ∨ ψ2 and ψ = ψ1 ∧ ψ2 are left as exercises. . n} with I does not matter. an ∈ A A |= ∃xϕ(x. . an ). So by inductive hypothesis we have Σ θ ↔ θ . this is usually the case because the equivalence class of ϕi(1) ∨ · · · ∨ ϕi(n) does not depend on this choice. Then A |= ϕ(a0 . . . . . Definition. an ). . then necessarily ψ = ϕ and ψ = ϕ and the desired result holds trivially. . . . . where θ is obtained by replacing that occurrence (of ϕ) by ϕ . We need only show that then for all a1 . . . a1 . . . . . and thus A |= ∃xψ(x. . (In case Σ = ∅. . . an ) → ψ(a0 . an ) for some a0 ∈ A. . Theorem 3.3. . n} → I and set ϕi := ϕi(1) ∨ · · · ∨ ϕi(n) . . or the occurrence of ϕ we are replacing is an occurrence in θ. . By the Completeness Theorem it suffices to show that then A |= ∃xϕ → ∃xψ and A |= ∀xϕ → ∀xψ. Proof. . i∈I i∈I ϕi := ϕi(1) ∧ · · · ∧ ϕi(n) .50 CHAPTER 3. an ). From A |= ϕ → ψ we obtain A |= ϕ(a0 . . We shall prove A |= ∃xϕ → ∃xψ and leave the other part to the reader. . Suppose that ψ = ¬θ. . If I is clear from context we just write i ϕi and i ϕi instead. Given a family (ϕi )i∈I of formulas with finite index set I we choose a bijection k → i(k) : {1. an ) → ∃xψ(x. and the same is true with “∧” instead of “∨”. . . these notations i∈I ϕi and i∈I ϕi can only be used when the particular choice of bijection of {1. Then the inductive hypothesis gives Σ θ ↔ θ . . The proof is similar if ψ = ∀xθ. yn ) and ψ = ψ(x.

(4) replace a part ψ ∨ Qxθ by Qx(ψ ∨ θ) where x is not free in ψ. We prove the first equivalence. ∀}. Qm+n ∈ {∃. . where y is free for x in ϕ and y does not occur free in ϕ. . We call Q1 x1 . each formula is equivalent to one in prenex form. Qm .5 (Prenex Form). that is. Qm+n yn ψ2 . . . . Instead of “occurrence of . . leaving the second as an exercise. ∀}.3. Theorem 3. Similarly we get ∃xϕ → ∃yϕ(y/x) (use that ϕ = ϕ(y/x)(x/y) by the assumption on y). Applying G to the quantifier axiom ϕ(y/x) → ∃xϕ gives ∃yϕ(y/x) → ∃xϕ. By the Equivalence Theorem (3. We leave the proof of the next lemma as an exercise. A formula is equivalent to any of its variants. In this lemma Q denotes a quantifier. (6) replace a part ψ ∧ Qxθ by Qx(ψ ∧ θ) where x is not free in ψ. (3) replace a part (Qxψ) ∨ θ by Qx(ψ ∨ θ) where x is not free in θ.. Qn xn the prefix . ∀} and ϕ is quantifier-free. An application of Exercise 6 of Section 2. . . as a subformula” we say “part . Note that a quantifier-free formula is in prenex form.4. . . Qm xm ψ1 =⇒pr Qm+1 y1 . Lemma 3. y1 . each Qi ∈ {∃. . Assume inductively that ϕ1 ϕ2 =⇒pr Q1 x1 . . A formula in prenex form is a formula Q1 x1 .3. Remark. and ψ1 and ψ2 are quantifier-free.. Atomic formulas are already in prenex form.3. . x1 . Lemma 3. To simplify notation. . Proof.3.”. . (ii) replace an occurrence of a subformula ∀xϕ by ∀yϕ(y/x).7 finishes the proof.. . write ϕ =⇒pr ψ to indicate that ψ can be obtained from ϕ by a finite sequence of prenex transformations. and ϕ the matrix of the formula. and Q denotes the other quantifier: ∃ = ∀ and ∀ = ∃. xm are distinct. Qn xn ϕ where x1 . In particular. xn are distinct variables.3. . . Every formula can be changed into one in prenex form by a finite sequence of prenex transformations. . yn are distinct. . . . SOME ELEMENTARY RESULTS OF PREDICATE LOGIC 51 Definition. . Definition. Q ∈ {∃. . Proof. By induction on the number of logical symbols. .3.. (5) replace a part (Qxψ) ∧ θ by Qx(ψ ∧ θ) where x is not free in θ.3.2) it suffices to show ∃xϕ ↔ ∃yϕ(y/x) and ∀xϕ ↔ ∀yϕ(y/x) where y is free for x in ϕ and does not occur free in ϕ. Note that the free variables of a formula (those that occur free in the formula) do not change under prenex transformations. . . this is the case n = 0. The following prenex transformations always change a formula into an equivalent formula: (1) replace the formula by one of its variants. where Q1 . . . (2) replace a part ¬Qxψ by Q x¬ψ. A variant of a formula is obtained by successive replacements of the following type: (i) replace an occurrence of a subformula ∃xϕ by ∃yϕ(y/x).

. Qm+n yn (ψ1 ∨ ψ2 ). .52 CHAPTER 3. The assumptions above yield ϕ =⇒pr (Q1 x1 . . . hence ϕ =⇒pr Q1 x1 . Σ tn = tn . Qm xm ψ1 ) ∨ (Qm+1 y1 . . . . . . . . tn where F ∈ Lf is n-ary and t1 . . let t be obtained from t by replacing an occurrence of τ in t by τ . n}. . Inductively we can assume that Σ t1 = t1 . . . . . Likewise. Using the facts on admissible words at the end of Section 2.3 we have Σ t = t . Qm xm ¬ψ1 .1. xm are different from x. . Let τ and τ be L-terms such that Σ τ = τ . the result of replacing an occurrence of an L-term τ in t by an L-term τ is an L-term t . Qm+n yn ψ2 ) =⇒pr Q1 x1 . . THE COMPLETENESS THEOREM Then for ϕ := ¬ϕ1 . Qm xm ¬ψ1 . . to deal with ϕ1 ∧ ϕ2 . . Qm xm ψ1 ) ∨ (Qm+1 y1 . . . Note that by Corollary 2. . Qm+n yn (ψ1 ∨ ψ2 ). An occurrence of an L-term τ in ϕ is said to be outside quantifier scope if the occurrence is not inside quantifier scope.2. we see that t = F t1 . Suppose t = F t1 . We finish this section with results on equalities. Qm xm ψ1 . . Qm+n yn ψ2 ). tn where for some i ∈ {1. . tn are L-terms. n} we have tj = tj for all j = i. . .6. . . Qm xm ψ1 =⇒pr Q1 x1 . and ti is obtained from ti by replacing an occurrence of τ in ti by τ . let ϕ := ∃xϕ1 . and the RHS is in prenex form. yn } = ∅.7. . j ∈ {1. Then ϕ =⇒pr ∃xQ1 x1 .3.1. . . Qm xm Qm+1 y1 . . then t = τ . xm } ∩ {y1 . Proof. Next. . . . . so by part (5) of Lemma 3. . . . Next. Qm xm Qm+1 y1 . . . . including exercise 5. we have ϕ =⇒pr ¬Q1 x1 . The case ϕ := ∀xϕ1 is similar. . Then Σ t = t . . Proposition 3. Replacing here the RHS (righthand side) by a variant we may assume that {x1 . We proceed by induction on terms. An occurrence of an L-term τ in ϕ is said to be inside quantifier scope if the occurrence is inside an occurrence of a subformula ∃xψ or ∀xψ in ϕ. This fact takes care of the case that t is a variable. . . Applying m + n times prenex transformation of types (3) and (4) we obtain (Q1 x1 . we apply prenex transformations of types (5) and (6). Qm xm ψ1 . . . . . . and assume t = τ . let ϕ := ϕ1 ∨ ϕ2 . . . . . Hence ϕ =⇒pr Q1 x1 . . Applying m prenex transformations of type (2) we get ¬Q1 x1 . Applying prenex transformations of type (1) we can assume x1 . First note that if t = τ . . and also that no xi occurs free in ψ2 and no yj occurs free in ψ1 .

. y)) → ∃x∀y(Q(x. . B denote L-algebras. 3. xm ). .3. . yn ) is equivalent to an unnested formula ϕu (y1 . . . . . . Q be a binary relation symbol. So L has no relation symbols. . and also to an unnested universal formula θ2 (x1 . (This allows and ⊥ as atomic subformulas of unnested formulas. . . yn ). . ⊥. (ϕ1 ∧ ϕ2 )(x1 . . .4. xm ).4 Equational Classes and Universal Algebra The term-structure AΣ introduced in the proof of the Completeness Theorem also plays a role in what is called universal algebra. . . xn+1 . Then Σ ϕ ↔ ϕ . Free groups. . . and x. . . . . A substructure of . . are all special cases of a single construction in universal algebra. . and so on. . . and also to an unnested universal formula ϕ2 (y1 . or the form F x1 . . . Let τ and τ be L-terms such that Σ τ = τ . . (4) A formula is said to be unnested if each atomic subformula has the form Rx1 . . . Exercises. . ∀x ϕi ←→ ∀xϕi . (iii) each formula ϕ(y1 . . . . . . xm with m-ary R ∈ Lr ∪ { . . y). This is a general setting for constructing mathematical objects by generators and relations. . y distinct variables.) Then: (i) for each term t(x1 . . . tensor products of various kinds. . and A. xm ) are existential formulas. xm . xm ) and ϕ∧ (x1 . xm . and then proceed by the usual kind of induction on formulas. . let ϕ be obtained from ϕ by replacing an occurrence of τ in ϕ outside quantifier scope by τ . xm ) and ϕ2 (x1 . . First take care of the case that ϕ is atomic by a similar argument as for terms. . . . xn = xn+1 with n-ary F ∈ Lf and distinct variables x1 . yn ) is equivalent to an unnested existential formula ϕ1 (y1 . . . . xm } the formula / t(x1 . . . In this section we fix a language L that has only function symbols. then (ϕ1 ∨ ϕ2 )(x1 . (ii) each atomic formula ϕ(y1 . . . yn ). . Proof.3. . . polynomial rings. Then ` _ ´ `_ ´ ` ^ ´ `^ ´ ∃x ϕi ←→ ∃xϕi . (2) Let (ϕi )i∈I be a family of formulas with finite index set I. . . . .7. . . i i i i (3) If ϕ1 (x1 . . The 12 12 same holds with “existential” replaced by “universal”. . . . EQUATIONAL CLASSES AND UNIVERSAL ALGEBRA 53 Proposition 3. (1) Let P be a unary relation symbol. . Instead of “Lstructure” we say “L-algebra”. Use prenex transformations to put ∀x∃y(P (x) ∧ Q(x. y) → P (y)) into prenex form. . xm ) = y is equivalent to an unnested existential formula θ1 (x1 . xm . yn ). . . xm ) are equivalent to existential formulas ϕ∨ (x1 . xm ) and variable y ∈ {x1 . including at least one constant symbol. y). . =} and distinct variables x1 . . . .

. (G. (3) every subalgebra of any algebra in C belongs to C. xm ) where x1 . We call A trivial if |A| = 1. t1 . Proof.5) Towards a proof of the converse. = (2) the trivial L-algebra belongs to C. let j ∈ {1. . Examples. where two such terms s and t are equivalent iff Σ s = t. Then the class C is equational if and only if the following conditions are satisfied: (1) closure under isomorphism: if A ∈ C and A ∼ B. Theorem 3. xm are distinct variables and ∀x abbreviates ∀x1 . a Σ-algebra is the same as a model of Σ. ∀xm . in other words.Birkhoff) Let C be a class of L-algebras. we need some universal-algebraic considerations that are of interest beyond the connection to Birkhoff’s theorem. Let a1 . . To such a Σ we associate the class Mod(Σ) of all Σ-algebras. . tn are L-terms. then conditions (1)–(5) are satisfied. With L = LRi . . It is easy to see that if C is equational. Consider an identity ∀x s1 (x) = t1 (x) ∧ · · · ∧ sn (x) = tn (x) . Associated to Σ is the term algebra AΣ whose elements are the equivalence classes [t] of variable-free L-terms t. . and Mod(Ri). am ∈ AΣ and put A = AΣ . . . For the rest of this section we fix a set Σ of L-identities. then the corresponding class is the class of commutative rings. . . . THE COMPLETENESS THEOREM A is also called a subalgebra of A. . . . Given a set Σ of L-identities we define a Σ-algebra to be an L-algebra that satisfies all identities in Σ. am )A = t(a1 . the class of groups. AΣ is a Σ-algebra. With L = LGr . and where s1 . Ri is a set of L-identities.2. . . Gr is a set of L-identities.4. . . . . x = (x1 . and Mod(Gr). the class of rings. Lemma 3. . . x = (x1 . (5) the product of any family (Ai ) of algebras in C belongs to C. xm ) in Σ. (For (3) and (4) one can also appeal to the Exercises 7 and 8 of section 2. is the corresponding equational class of L-algebras. Take . . . . then B ∈ C. am )A . If one adds to Ri the identity ∀x∀y(xy = yx) expressing the commutative law. n} and put s = sj and t = tj . . is the corresponding equational class of L-algebras. .4. . There is up to isomorphism exactly one trivial L-algebra.54 CHAPTER 3. sn . (4) every quotient algebra of any algebra in C belongs to C.1. . It suffices to show that then s(a1 . . . . An L-identity is an L-sentence ∀x s1 (x) = t1 (x) ∧ · · · ∧ sn (x) = tn (x) . A class C of L-algebras is said to be equational if there is a set Σ of L-identities such that C = Mod(Σ). . . and a quotient algebra of A is an L-algebra A/ ∼ where ∼ is a congruence on A.

. For any commutative ring R and elements b1 . . am )A . It is easy to check that this map is a homomorphism AΣ → B. am = [αm ]. . . then Σ s = t.4 it is the only such homomorphism. . Xn ] be the ring of polynomials in distinct indeterminates X1 . . and the ring of integers is an initial Ri-algebra.4. . y are distinct variables. . . as desired. am )A = s(α1 . . EQUATIONAL CLASSES AND UNIVERSAL ALGEBRA 55 variable-free L-terms α1 . so [s(α1 . it is unique up-to-unique-isomorphism. so s(a1 . Let I be an index set in what follows. αm )A . For example. . αm )] = [t(α1 . am = αm . αm ) = t(α1 . . . . . . . An initial Σ-algebra is a Σ-algebra A such that for any Σ-algebra B there is a unique homomorphism A → B. . . . Then A is said to be a free Σ-algebra on (ai ) if for every Σ-algebra B and I-indexed family (bi ) of elements of B there is exactly one homomorphism h : A → B such that h(ai ) = bi for all i ∈ I. αm )]. . .6. So if there is an initial Σ-algebra. . . respectively. . . αm such that a1 = [α1 ]. . Actually. s(α1 . . . . . αm )A = [s(α1 . . . Xn over Z. . . .5. so i and j are isomorphisms.5 we have a1 = α1 . So the cRi-algebras are just the commutative rings. take L = LRi and cRi := Ri ∪ {∀x∀y xy = yz}. . αm )A = [t(α1 . . respectively. By Exercise 4 in Section 2. . . Now Σ s(α1 . We also express this by “ A. . Also. .3. . . am )A = t(a1 . . we are going to show that AΣ is a so-called initial Σ-algebra. Let B be any Σ-algebra. . the trivial group is an initial Gr-algebra. t(a1 . let i and j be the unique homomorphisms A → B and B → A. bn ∈ R we have a . am )A = t(α1 . where x. A itself is sometimes referred to as a free Σ-algebra if there is a family (aj )j∈J in A such that A. . . αm )]. . (aj ) is a free Σ-algebra. . [t] → tB . by part (1) of Corollary 3. .2. . Thus we have a map AΣ → B. so sB = tB . Then there is a unique isomorphism A → B. . . . . αm )A by Lemma 2. . . . . . . t ∈ TL and [s] = [t]. Proof. . and thus s(a1 . . . Lemma 3. . . Free algebras. .2.3. Finally. Suppose A and B are both initial Σ-algebras. Let A be a Σ-algebra and (ai )i∈I an I-indexed family of elements of A. . . AΣ is an initial Σ-algebra. Let Z[X1 . . Then we have homomorphisms j ◦ i : A → A and i ◦ j : B → B. . . .1. To see this. . . so necessarily j ◦ i = idA and i ◦ j = idB . Then by A A part (1) of Corollary 3. t(α1 . . Note that if s. As an example. .4. . (ai ) is a free Σ-algebra”. αm )]. . αm ).

·) of Mo. at most one free Σ-algebra on an I-indexed family of its elements. y. A monoid . If A and B are both free Σ-algebras on (ai ) and (bi ) respectively. Then we have a unique homomorphism h : A → B such that h(ai ) = ai for all i ∈ I. namely. and consider E ∗ as a monoid by taking the empty word as its identity and the concatenation operation (v. there is a unique free Σ-algebra on an I-indexed family of its elements. Xn ) → f (b1 . w) → vw as its product operation. . and consider Mo := {∀x(1 · x = x ∧ x · 1 = x). given I. n. . and likewise g ◦ h = idB . because for any monoid B and elements be ∈ B (for e ∈ E) we have a unique monoid homomorphism E ∗ → B that sends each e ∈ E to be . then (h ◦ g)(ai ) = ai for all i. In particular. To see why. . . up to unique isomorphism preserving I-indexed families. Let ΣI be Σ viewed as a set of LI -identities. So an LI -algebra B. . the ΣI -algebra AΣI is a free Σ-algebra on ([ci ]). so B = A. let B be the subalgebra of A generated by (ai ). Then A is generated by (ai ). 1. Then E ∗ is a free monoid on the family (e)e∈E of words of length 1. . and g : A → B and h : B → A are the unique homomorphisms such that g(ai ) = bi and h(bi ) = ai for all i. . f (X1 . so h ◦ g = idA . . en → be1 · · · ben . and we call 1 ∈ A the identity of the monoid A. namely the evaluation map (or substitution map) Z[X1 . is a model A = (A. . up to unique isomorphism of ΣI -algebras.56 CHAPTER 3. For a simpler example. Thus. Thus. where new means that ci ∈ L for i ∈ I and / ci = cj for distinct i. . where x. with same index set I. . THE COMPLETENESS THEOREM unique ring homomorphism Z[X1 . . Xn ] → R. . j ∈ I. (bi ) is just an L-algebra B equipped with an I-indexed family (bi ) of elements of B. . . Remark. Xn ] → R that sends Xi to bi for i = 1. or semigroup with identity. . . . Let (A. . and · its product operation. . ∀x∀y∀z (xy)z = x(yz) }. . so g is an isomorphism with inverse h. ·} ⊆ LGr be the language of monoids. . . . We shall now construct free Σ-algebras as initial algebras by working in an extended language. there is. . . Thus Z[X1 . Xn ] is a free commutative ring on (Xi )1≤i≤n . bn ). and then the composition A→B→A is necessarily idA . . Then a free Σ-algebra on an I-indexed family of its elements is just an initial ΣI -algebra. Let E ∗ be the set of words on an alphabet E. (ai )i∈I ) be a free Σ-algebra. e1 . Let LI := L ∪ {ci : i ∈ I} be the language L augmented by new constant symbols ci for i ∈ I. let L = LMo := {1. z are distinct variables.

A is a free as a Σ-algebra on (ιg)g∈G . (aj )j∈J . . t) as above. applied to CI in place of C. ι). so hs. By the claim above. Let B := Bs. . This fact can sometimes be used to reduce problems on Σ-algebras to the case of free Σ-algebras. (aj ) → B. and thus h([s]) = h([t]). Let CI be the class of all LI -algebras B. (bj ) . and thus A ∈ C. Here is the key fact from which this will follow: Claim.t ([s]) = hs. so hs.t ∈ C such that Bs. . t ∈ TL such that s = t does not belong to Σ(C) we pick Bs. Then h(tA (aj1 . Then h is injective. . let Σ(C) be the set of L-identities ∀x s(x) = t(x) that are true in all algebras of C. ι) up-to-unique-isomorphism. we have to show that then C is equational. . and it remains to show that every Σ(C)-algebra belongs to C. This injectivity gives A ∼ h(A) ⊆ B. . so it remains to show that free Σ(C)-algebras belong to C.3. ajn ) = tB (bj1 . jn ∈ J. If A is an initial Σ(C)-algebra. = Now. EQUATIONAL CLASSES AND UNIVERSAL ALGEBRA 57 Let B be any Σ-algebra.t : A → Bs. . Then s = t does not belong to Σ(C). Proof of Birkhoff ’s theorem. . Note that A is generated as an L-algebra by ιG. Note that if (A .t ([t]). . and thus h induces an isomorphism A/∼h ∼ B. so A ∈ C. . Then we have a Σ-algebra A with a map ι : G → A such that for any Σ-algebra B and any map j : G → B there is a unique homomorphism h : A → B such that h◦ι = j. Generators and relations. It is clear that each algebra in C is a Σ(C)algebra.t |= s = t. . Note that B ∈ C. Assume C is closed. then the unique homomorphism h : A → A such that h ◦ ι = ι is an isomorphism. and let h : A → B be the homomorphism given by h(a) = hs. and take the unique homomorphism h : A. . For s. This finishes the proof of the claim. . Here is a particular way of constructing the free Σ-algebra on G. in other words.t (a) . ι) the free Σ-algebra on G. (bi ) with B ∈ C. Take a free Σ-algebra A. Let A be a free Σ(C)algebra on (ai )i∈I . ι ) (with ι : G → A ) has the same universal property as (A. To see why. (ai ) ∈ CI .t where the product is over all pairs (s. let s.4. so this universal property determines the pair (A. Let G be any set. every Σ(C)-algebra is isomorphic to a quotient of a free Σ(C)-algebra.t ([s]) = hs. So there is no harm in calling (A. To prove this claim we take A := AΣ(C) . t ∈ TL be such that [s] = [t] in A = AΣ(C) . Take the language LG := L ∪ G (disjoint union) with the elements of G as constant . and we let hs. We have shown: = Every Σ-algebra is isomorphic to a quotient of a free Σ-algebra.4. then A ∈ C. It is clear that CI is closed as a class of LI -algebras. Let us say that a class C of L-algebras is closed if it has properties (1)–(5) listed in Theorem 3. Indeed. bjn ) for all Lterm t(x1 .t ([t]). xn ) and j1 . and take any family (bj )j∈J in B that generates B. Now. .t be the unique homomorphism. (ai ) is easily seen to be an initial Σ(CI )-algebra. . we obtain A. see the next subsection for an example. .1. so h(A) = B. A.

R) := AΣ(R) . . xn ) are L-terms and g = (g1 .4. R) such that: (1) A(G. viewed as a set of LG -sentences. xn ) and t(x1 . . . . gn ) ∈ Gn (with n depending on the sentence). Next. there is a unique homomorphism h : A(G. . . . Let Σ(G) be Σ considered as a set of LG -identities. R) |= s(ig) = t(ig) for all s(g) = t(g) in R. As before one sees that the universal property of the lemma is satisfied. THE COMPLETENESS THEOREM symbols. .58 CHAPTER 3. R) → B such that h ◦ i = j. . We wish to construct the Σ-algebra generated by G with R as set of relations. .1 This object is described up-to-isomorphism in the next lemma. R) by i(g) = [g]. 1 The use of the term “relations” here has nothing to do with n-ary relations on sets. Lemma 3. Then A := AΣ(G) as a Σ-algebra with the map g → [g] : G → AΣ(G) is the free Σ-algebra on G. . R) with a map i : G → A(G. There is a Σ-algebra A(G. . let A(G. Proof.4. let R be a set of sentences s(g) = t(g) where s(x1 . (2) for any Σ-algebra B and map j : G → B with B |= s(jg) = t(jg) for all s(g) = t(g) in R. and define i : G → A(G. Let Σ(R) := Σ ∪ R. .

A. the cardinality of a structure is defined to be the cardinality of its underlying set. The union of countably many countable sets is countable. Hence the language L ∪ {cσ : Σ σ where σ is an L-sentence of the form ∃xϕ(x)} is countable. the hypothesis that L is countable yields that the set of L-sentences is countable. hence AΣ∞ is countable. achieves a lot more than just establishing completeness. Remark. o Suppose L is countable and Σ has a model. when applicable. and thus its reduct AΣ∞ |L is a countable model of Σ. adding witnesses keeps the language countable. Then Σ has a countable model. that is. Take a countably infinite subset Var ⊆ Var. . It follows that there are only countably many variable-free L∞ -terms. Vaught’s Test o Below. back-and-forth.1 (Countable L¨wenheim-Skolem Theorem). hence the set L∞ constructed in the proof of the Completeness Theorem is countable. Then each sentence is equivalent to one whose variables are all from Var . The above proof is the first time that we used the countability of Var = {v0 . }.6.1. v2 . . In this section we have the same conventions concerning L. Theorem 4. As promised in Section 2. 4. Next we develop o some basic methods related to proving completeness of a given set of axioms: Vaught’s Test. Each of these methods. t. quantifier elimination.4.Chapter 4 Some Model Theory In this chapter we first derive the L¨wenheim-Skolem Theorem. Since Var is countable. .1 L¨wenheim-Skolem. we shall now indicate why the Downward L¨wenheim-Skolem Theorem goes through without assuming that o Var is countable. v1 . ψ. Proof. σ and Σ as in the beginning of Section 2. ϕ. unless specified otherwise. By replacing each sentence in Σ by an equivalent one all whose variables are 59 . Suppose that Var is uncountable. θ.

∀x∀y(x < y ∨ x = y ∨ y < x). ∃xn 1≤i<j≤n xi = xj . SOME MODEL THEORY from Var . <) for the language LO that satisfies the following axioms (where x. The models of Σ are exactly the infinite sets. . Theorem 4. As in the proof above. Any two countable dense totally ordered sets without endpoints are isomorphic. A |= ¬σ and B |= σ. . Proposition 4.3 (Cantor). Proof. Suppose Σ is not complete. . To formulate that theorem we define a totally ordered set to be a structure (A. Let L = ∅. So (Q. and suppose Σ has a model. <) and (R. Hence by the L¨wenheim-Skolem Theorem there is a countable model A of Σ o such that A σ.2 (Vaught’s Test). σ2 . = Example. . The proof of the next theorem shows Back-and-Forth in action. Thus by Vaught’s Test Σ is complete. In other cases it may require work to check this hypothesis. We have A ∼ B. All countable models of Σ are countably infinite and hence isomorphic to N. Let L be countable.60 CHAPTER 4. This model is a countable model of Σ. and it is said to have no endpoints if it satisfies the axiom ∀x∃y∃z(y < x ∧ x < z). . z are distinct variables): ∀x(x < x). we obtain a countable model of Σ working throughout in the setting where only variables from Var are used in terms and formulas. and there is a countable model B of Σ such that B ¬σ. is often used to verify the hypothesis of Vaught’s Test. <) are dense totally ordered sets without endpoints. Then Σ is complete. so the L-structures are just the non-empty sets.1. Let To be the set consisting of these three axioms. ∀x∀y∀z (x < y ∧ y < z) → x < z . contradiction. In this example the hypothesis of Vaught’s Test is trivially satisfied. The following test can be useful in showing that a set of axioms Σ is complete.} where σn = ∃x1 . Back-and-Forth. A totally ordered set is said to be dense if it satisfies in addition the axiom ∀x∀y(x < y → ∃z(x < z ∧ z < y)). Let To (dense without endpoints) be To augmented by these two extra axioms. .1. y. and that all countable models of Σ are isomorphic. One general method in model theory. we obtain a countable set Σ of sentences such that Σ and Σ have the same models. Let Σ = {σ1 . Then there is σ such that Σ σ and Σ ¬σ.

βn−1 as ak is situated with respect to α0 . . which / establishes the claim. βn−1 }. . the set of L -sentences ΣΛ := Σ ∪ {cλ = cµ : λ. . . To see this it suffices to show that. next take k ∈ N minimal such that ak is situated with / respect to α0 . . <). Proof. . . βn−1 in B such that for all i. . l is minimal such that for i = 0. . Case 2: n is odd. . . Suppose L has o size at most κ and Σ has an infinite model. . Take an infinite model A of Σ. (The reader should check that such an l exists: that is where density and “no endpoints” come in). . Recall that the set of ordinals λ < κ has cardinality κ. that is. . In the results below we let κ denote an infinite cardinal. The same arguments we used in proving the countable version of the L¨wenheim-Skolem Theorem show that then Σ o has a model B = (B. . LOWENHEIM-SKOLEM. VAUGHT’S TEST 61 Proof. α2n } and bn ∈ {β0 . . So A = {an : n ∈ N} and B = {bn : n ∈ N}. . (Here we go back. αn−1 in A and distinct β0 .) First take k ∈ N minimal such that ak ∈ {α0 . . One proves easily by induction on n that then an ∈ {α0 . In combination with Vaught’s Test this gives Corollary 4. . Then AΛ is a model of ΣΛ . We have the following generalization of the L¨wenheim-Skolem theorem. that is. Let L = L ∪ {cλ : λ < κ} and let Σ = Σ ∪ {cλ = cµ : λ < µ < κ}. . cλ = cµ for λ < µ < κ). β2n }. βn−1 . . k is minimal such that for i = 0.5 (Generalized L¨wenheim-Skolem Theorem). put αn := ak and βn := bl . Let {cλ }λ<κ be a family of κ new constant symbols that are not in L and are pairwise distinct (that is.4. . and αi > ak ⇐⇒ βi > bl . µ ∈ Λ. . o Theorem 4. . Then we define αn ∈ A and βn ∈ B as follows: Case 1: n is even. n − 1 we have: αi < ak ⇐⇒ βi < bl . αn−1 as bl is situated with respect to β0 . . . <) → (B. To (dense without endpoints) is complete. then take l ∈ N minimal such that bl is situated with / respect to β0 . and suppose we have distinct α0 . . given any finite set Λ ⊆ κ. hence B is a model of Σ of cardinality κ. . . . . . <) and (B. and αi > ak ⇐⇒ βi > bl . Let (A.1. . . . Let n > 0. . . . <) be countable dense totally ordered sets without endpoints. . . Put βn := bl and αn := ak . . We claim that Σ is consistent. and this bijection is an isomorphism (A. .1. .) First take l ∈ N minimal such that bl ∈ {β0 . n − 1 we have: αi < ak ⇐⇒ βi < bl .1. Note that L also has size at most κ. λ = µ} has a model. We make an L -expansion AΛ of A by interpreting distinct cλ ’s with λ ∈ Λ by distinct elements of A. αn−1 }. Thus we have a bijection αn → βn : A → B. We define by recursion a sequence (αn ) in A and a sequence (βn ) in B as follows: put α0 := a0 and β0 := b0 . . . . (bλ )λ<κ ) of cardinality at most κ. (Here we go forth. αn−1 . We have bλ = bµ for λ < µ < κ. Then Σ has a model of cardinality κ. . . .¨ 4. and interpreting the cλ ’s with λ ∈ Λ arbitrarily. j < n we have αi < αj ⇐⇒ βi < βj .

. First Σ σ means that Σ ∪ {¬σ} has a model. and for each vector v ∈ V there is a unique family (λb )b∈B of scalars (elements of F ) such that {b ∈ B : λb = 0} is finite and v = Σb∈B λb b. −. Similarly Σ ¬σ means that Σ ∪ {σ} has a model. a vector space V over F is viewed as an LF -structure by interpreting each fλ as the function v −→ λv : V → V . One easily specifies a set ΣF of sentences whose models are exactly the vector spaces over F . ∃xn F 1≤i<j≤n xi = xj : n = 2. Let LF be the language of vector spaces over F : it extends the language LAb = {0. SOME MODEL THEORY The next proposition is Vaught’s Test for arbitrary languages and cardinalities. Moreover. (Proofs can be found in various texts. Suppose also that any two models of Σ of cardinality κ are isomorphic. (iii) 1v = v. (a) V has a basis B. . . Proof. one for each λ ∈ F . w ∈ V . We will derive a contradiction. µ ∈ F and all v. be a sequence of distinct variables and put Σ∞ := ΣF ∪ {∃x1 . We will need the following facts about vector spaces V and W over F . Note that if F F itself is infinite then each non-trivial vector space over F is infinite. So the models of Σ∞ are exactly the infinite vector spaces over F . We now discuss in detail one particular application of this generalized Vaught Test. Σ has a model and all models of Σ are infinite. .6. Suppose L has size at most κ. (ii) λ(v + w) = λv + λw. 3.62 CHAPTER 4. x2 . By assumption A ∼ B. . Let σ be an L-sentence and suppose that Σ σ and Σ ¬σ.1. v) −→ λv such that for all λ. . (iv) (λµ)v = λ(µv). . then we have also nontrivial finite vector spaces. From a model-theoretic perspective finite structures are somewhat exceptional.) Fact. Then Σ is complete.}. so by the Generalized L¨wenheim-Skolem o Theorem Σ ∪ {¬σ} has a model A of cardinality κ. that is. (i) (λ + µ)v = λv + µv. +} of abelian groups with unary function symbols fλ . . . Note that ΣF is not complete since the trivial vector space satisfies ∀x(x = 0) but F viewed as vector space over F does not. contradicting that A |= ¬σ and = B |= σ. Let x1 . Fix a field F . A vector space over F is an abelian (additively written) group V equipped with a scalar multiplication operation F × V −→ V : (λ. if F is finite. These models must be infinite since they are models of Σ. and Σ ∪ {σ} has a model B of cardinality κ. B ⊆ V . Proposition 4. so we are going to restrict attention to infinite vector spaces over F .

. . (b) any two transcendence bases of K over k have the same size. Σ∞ is complete. .1. Theorem 4. (c) If K is algebraically closed with transcendence basis B over k and K is also an algebraically closed field extension of k with transcendence basis B over k. A transcendence basis of K over k is a set B ⊆ K such that B is algebraically independent over k and K is algebraic over its subfield k(B). Using Vaught’s Test for models of size ℵ1 this yields: Theorem 4.7. With the generalized Vaught Test we can also prove that ACF(0) (whose models are the algebraically closed fields of characteristic 0) is complete. F Remark. . (d) if K is uncountable and |K| > |k|. .8. . Applying this with k = Q and k = Fp for prime numbers p. . . (c) If V has basis B and W has basis C. . . . Likewise. Fact. xn ]. xn ) ∈ k[x1 . The relevant definitions and facts are as follows. Let V be a vector space over F of cardinality κ. (d) Let B be a basis of V . Thus any two vector spaces over F of cardinality κ have bases of cardinality κ and thus are isomorphic. which goes under the name of “categoricity in power”. (a) K has a transcendence basis over k. then any bijection B → C extends uniquely to an isomorphism V → W . . The proof is similar. where x1 . F Proof. . we obtain that any two algebraically closed fields of the same characteristic and the same uncountable size are isomorphic. . The set ACF(0) of axioms for algebraically closed fields of characteristic zero is complete. . Theorem 4. A subset B of K is said to be algebraically independent over k if for all distinct b1 .¨ 4. . it goes beyond the scope of these notes to develop this large chapter of model theory. In particular LF has size at most κ. Then |V | = |B| · |F | provided either |F | or |B| is |B| infinite. If both are finite then |V | = |F | . then |K| = |B| for each transcendence basis B of K over k.7 and Exercise 3 imply for instance that if F = R then all non-trivial vector spaces over F satisfy exactly the same sentences in LF .1. . . . then any bijection B → B extends to an isomorphism K → K . . LOWENHEIM-SKOLEM. bn ) = 0 for all nonzero polynomials p(x1 . with “transcendence bases” taking over the role of bases. Let K be a field with subfield k.1.1. xn are distinct variables. It follows by the Generalized Vaught Test that Σ∞ is complete. VAUGHT’S TEST 63 (b) Any two bases B and C of V have the same cardinality. bn ∈ B we have p(b1 . . If the hypothesis of Vaught’s Test (or its generalization) is satisfied. then many things follow of which completeness is only one. ACF(p) is complete for each prime number p. Take a κ > |F |. Then a basis of V must also have size κ by property (d) above.

. and any two infinite vector spaces over a given field F are elementarily equivalent. (3) Let L = {S} where S is a unary function symbol. . . ) is an L-structure. . (ii) for each n-ary f ∈ Lf and a1 . . then the map ai → bi : {a1 . . An explicit. CHAPTER 4. <) ≡ (R. (A description like {σ : (Z. 4. If Γ is a back-and-forth system from A to B. then Γ−1 := {γ −1 : γ ∈ Γ} is a back-and-forth system from B to A. . . .2 Elementary Equivalence and Back-and-Forth In the rest of this chapter we relax notation. . . . aN } → {b1 . . . Give an informative description of a complete set of L-sentences true in (Z. an ) ∈ An . xn ) an LA -formula. Suppose Σ1 and Σ2 have infinite models. . Example: suppose A = (A. γan ) = γ(an+1 ). A partial isomorphism from A to B is a bijection γ : X → Y with X ⊆ A and Y ⊆ B such that (i) for each m-ary R ∈ Lr and a1 . . . . aN ∈ A and b1 . . . an . . . bN ∈ B are such that a1 < a2 < · · · < aN and b1 < b2 < · · · < bN . . . an+1 ∈ X f A (a1 . . an ) for an LA -formula ϕ(a1 . am ) ∈ RA ⇐⇒ (γa1 . We call A and B back-and-forth equivalent . an ) = an+1 ⇐⇒ f B (γa1 . <) are totally ordered sets. Give an informative description of a complete set of L-sentences true in (Z.64 Exercises. and a1 . . . Consider the L-structure (Z. . . . . Consider the L-structure (Z. am ∈ X (a1 . . list of axioms is required. . . N). . . . . bN } is a partial isomorphism from A to B. . N) |= σ} is not informative. . possibly infinite. an ). ϕ(x1 . S). (ii) (“Back”) for each γ ∈ Γ and b ∈ B there is a γ ∈ Γ such that γ extends γ and b ∈ codomain(γ ). . <). . where A = (A. and (a1 . . .) (2) Let Σ1 and Σ2 be sets of L-sentences such that no symbol of L occurs in both Σ1 and Σ2 . . A back-and-forth system from A to B is a collection Γ of partial isomorphisms from A to B such that (i) (“Forth”) for each γ ∈ Γ and a ∈ A there is a γ ∈ Γ such that γ extends γ and a ∈ domain(γ ). and just write ϕ(a1 . Then Σ1 ∪ Σ2 has a model. . SOME MODEL THEORY (1) Let L = {U } where U is a unary relation symbol. Hint: Make an educated guess and try to verify it using Vaught’s Test or one of its variants. γam ) ∈ RB . . N). . . . . S) where S(a) = a + 1 for a ∈ Z. . . Thus by the previous section (Q. . <) and B = (B. . In this section A and B denote L-structures. . . . We say that A and B are elementarily equivalent (notation: A ≡ B) if they satisfy the same L-sentences. .

(This includes the case L = L. . yn ) one shows that for each γ ∈ Γ and a1 . . and construct a sequence (γn ) of partial isomorphisms from A to B such that each γn+1 extends γn .2. then A ≡ B. In the following result we do not need to assume countability. . A = n domain(γn ) and B = n codomain(γn ).) Then every set Σ of L -sentences with Σ ⊇ Σ has QE. and c = (c1 . . (1) Define a finite restriction of a bijection γ : X → Y to be a map γ|E : E → γ(E) with finite E ⊆ X. . . . . By induction on the number of logical symbols in unnested formulas ϕ(y1 . the key is to guess a back-and-forth system. Suppose Γ is a nonemepty back-and-forth system from A to B. x) for some L-formula ϕ(u.2. . Remark. .3. . . We proceed as in the proof of Cantor’s theorem. . Then A ∼ B. xn . γan ). xn ) is a tuple of distinct variables and y is a single variable distinct from x1 . cm ) is a tuple of constant symbols in L L. check first that each L -formula ϕ (x) has the form ϕ(c. Proof.4. If Γ is a back-and-forth system from A to B. um ) is a tuple of distinct variables distinct from x1 . . . By taking n = 0 in this definition we see that if Σ has QE. then A ≡bf C. .2. . Let L be a language such that L ⊇ L and all symbols of L L are constant symbols. . Suppose Σ has QE. . . (2) If A ≡bf B and B ≡bf C. . an ) ⇐⇒ B |= ϕ(γa1 . . . Cantor’s proof in the previous section generalizes as follows: Proposition 4. . . Definition. If A ≡bf B. For n = 0 and using Exercise 4 of Section 3. an ∈ domain(γ) we have A |= ϕ(a1 .3 this yields A ≡ B. . . QUANTIFIER ELIMINATION 65 (notation: A ≡bf B) if there exists a nonempty back-and-forth system from A to B. . Hence A ≡bf A. then every L-sentence is Σ-equivalent to a q-free L-sentence. .) . That is where insight and imagination (and experience) come in.3 Quantifier Elimination In this section x = (x1 . . Suppose A and B are countable and A ≡bf B. x). . = Proof. Let Γ be a nonempty back-and-forth system from A to B. so is the set of finite restrictions of members of Γ. Then the map A → B that extends each γn is an isomorphism A → B. where u = (u1 . . . xn . . then B ≡bf A. (To see this. Exercises. . Σ has quantifier elimination (QE) if every L-formula ϕ(x) is Σequivalent to a quantifier free (short: q-free) L-formula ϕqf (x). 4.1. In applying this proposition and the next one in a concrete situation. and if A ≡bf B. . Proposition 4.

y) is Σ-equivalent to a disjunction i ψi (x. (ϕ1 ∨ ϕ2 )(x). with i ranging over some finite index set. Each q-free L-formula ϕ(x) is equivalent to a disjunction ϕ1 (x) ∨ · · · ∨ ϕk (x) of basic conjunctions ϕi (x) in L (“disjunctive normal form”). Next. 0.1. y) of basic conjunctions ψi (x.5) iff C |= ϕqf (a) (by the same exercise) iff C |= σ. y).3. y) ↔ θqf (x). Proof.2. y). Lemma 4. ∃yψi (x. Let σ be an LA -sentence. c) := ∃y(ay 2 + by + c = 0) is equivalent to the q-free formula (b2 − 4ac ≥ 0 ∧ a = 0) ∨ (a = 0 ∧ b = 0) ∨ (a = 0 ∧ b = 0 ∧ c = 0). b. Note that this equivalence gives an effective test for the existence of a y with a certain property. let ϕ(x) = ∀yψ(x. Proof. Suppose Σ has QE and B and C are models of Σ with a common substructure A (we do not assume A |= Σ). y) has Σ-QE. In view of the equivalence of ∃y i ψi (x. −. by hypothesis. Remark. let ϕ(x) = ∃yψ(x. This case reduces to the previous case since ϕ(x) is equivalent to ¬∃y¬ψ(x. ·) the formula ϕ(a. y) we obtain Σ ϕ(x) ←→ i ∃yθ(x. This illustrates the kind of property QE is. and suppose inductively that the L-formula ψ(x. which avoids in particular having to check an infinite number of values of y (even uncountably many in the case above). Lemma 4. We have to show B |= σ ⇔ C |= σ. <. y). Let us say that an L-formula ϕ(x) has Σ-QE if it is Σ-equivalent to a qfree L-formula ϕqf (x). then ¬ϕ1 (x). Then B |= σ iff B |= ϕqf (a) iff A |= ϕqf (a) (by Exercise 7 of Section 2. Take a q-free L-formula ϕqf (x) that is Σ-equivalent to ϕ(x). y) in L. so ϕ(x) has Σ-QE.3. y) with i ∃yψi (x. +.66 CHAPTER 4. and suppose inductively that the L-formula ψ(x. Hence ψ(x. . and (ϕ1 ∧ ϕ2 )(x) have Σ-QE. 1. Finally. y) has Σ-QE. In the structure (R. Write σ as ϕ(a) with ϕ(x) an L-formula and a ∈ An . y) in L there is a q-free L-formula θqf (x) such that Σ Then Σ has QE. Then B and C satisfy the same LA sentences. Note that if the L-formulas ϕ1 (x) and ϕ2 (x) have Σ-QE. Each ∃yψi (x. SOME MODEL THEORY A basic conjunction in L is by definition a conjunction of finitely many atomic and negated atomic L-formulas. Suppose that for every basic conjunction θ(x. y) has Σ-QE. y).
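To make the remark about an effective test concrete, here is a small illustrative Python sketch (the helper names has_real_root and witness are ad hoc, not from the notes): it evaluates the quantifier-free criterion for ∃y (ay² + by + c = 0) over R and, when the criterion holds, exhibits a witness.

```python
import math

def has_real_root(a, b, c):
    """The q-free criterion for  ∃y (a·y² + b·y + c = 0)  over R."""
    return (b * b - 4 * a * c >= 0 and a != 0) or (a == 0 and b != 0) \
        or (a == 0 and b == 0 and c == 0)

def witness(a, b, c):
    """Produce some real root when has_real_root(a, b, c) holds."""
    if a != 0:
        return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    if b != 0:
        return -c / b
    return 0.0                    # a = b = 0 and c = 0: every y works

for a, b, c in [(1, 0, -2), (1, 0, 2), (0, 3, 5), (2, 4, 2), (0, 0, 0)]:
    if has_real_root(a, b, c):
        y = witness(a, b, c)
        assert abs(a * y * y + b * y + c) < 1e-9
        print(f"a={a}, b={b}, c={c}: solvable, e.g. y = {y}")
    else:
        print(f"a={a}, b={b}, c={c}: no real solution")
```

The point is exactly the one made above: the quantifier over y has been traded for a finite boolean test on the coefficients.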

y) is equivalent in To(dense without endpoints) to a q-free formula ψ(x). Proof. Completeness is only one of the nice consequences of QE.3. y). Similarly. y) is a basic conjunction in LO . The above corollary gives another way of establishing completeness. y) is equivalent to y = xi ∧ ϕ (x. Up till this point we did not need the density and “no endpoints” axioms. Remark. Let B and C be any two models of Σ. y) is equivalent to ϕ (x. To justify this. y).1 it suffices to show that ∃yϕ(x.4. Proof. i∈I. n} and where we allow I or J to be empty. xn . Suppose Σ has a model. Then ∃yϕ(x. We may assume that each conjunct of ϕ is of one of the following types: y = xi .4. y) is equivalent in To(dense without endpoints) to the formula xi < xj . y)) is equivalent to ∃yϕ1 (x. The main impact of QE is rather that it gives access to the structural properties of definable sets. y). and is often applicable when the hypothesis of Vaught’s Test is not satisfied. and likewise with negations ¬(xi < y).3. and we are done. y) has no conjuncts of the form y = xi . . a negation ¬(y < xi ) can be replaced by the disjunction y = xi ∨ xi < y. y < xi (1 ≤ i ≤ n). observe that if we had instead a conjunct y = xi then we could replace it by (y < xi ) ∨ (xi < y) and use the fact that ∃y(ϕ1 (x.3. has QE. QUANTIFIER ELIMINATION 67 Corollary 4. Take an L-structure A that can be embedded into every model of Σ. . y)) ←→ ψ(x) ∧ ∃yθ(x. . and the easiest one to explain at this stage. To(dense without endpoints) has QE. J ⊆ {1. Theorem 4. y) = (x1 . Then by a slight rewording of the proof of Lemma 4. Applications of model theory to other areas of mathematics often involve QE as a key step. .j∈J . y) ∨ ϕ2 (x. y) be a tuple of n + 1 distinct variables. y). xi ). This will be reflected in exercises at the end of this section. So we can assume also that ϕ(x. . y) in LO . y) ∨ ∃yϕ2 (x. we see that B and C satisfy the same L-sentences. and there exists an Lstructure that can be embedded into every model of Σ. . y) is a conjunction xi < y ∧ i∈I j∈J y < xj where I. So A is isomorphic to a substructure of B and of C. By Lemma 4. and after rearranging conjuncts we can assume that ϕ(x. Let (x. ϕ(x. where ϕ (x.3.2 (considering only L-sentences). Suppose that we have a conjunct y = xi in ϕ(x. We have seen that Vaught’s test can be used to prove completeness. .3. but these come in now: ∃yϕ(x. and consider a basic conjunction ϕ(x. xi < y.3. It follows that Σ is complete. so. Then Σ is complete. . Also conjuncts in which y does not appear can be eliminated because ∃y(ψ(x) ∧ θ(x. After all these reductions.
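As a sanity check on the elimination step in this proof, the following Python sketch (an illustration of mine, with Q serving as a model of To(dense without endpoints)) compares a witness search for ∃y (⋀ aᵢ < y ∧ ⋀ y < bⱼ) with the quantifier-free formula ⋀ aᵢ < bⱼ that replaces it.

```python
from fractions import Fraction
import random

def exists_y(lower, upper):
    """Witness search for  ∃y (⋀_{a∈lower} a < y ∧ ⋀_{b∈upper} y < b)  in Q."""
    if lower and upper:
        candidates = [(max(lower) + min(upper)) / 2]   # density
    elif lower:
        candidates = [max(lower) + 1]                  # no right endpoint
    elif upper:
        candidates = [min(upper) - 1]                  # no left endpoint
    else:
        candidates = [Fraction(0)]
    return any(all(a < y for a in lower) and all(y < b for b in upper)
               for y in candidates)

def eliminated(lower, upper):
    """The q-free replacement:  ⋀_{i,j} x_i < x_j."""
    return all(a < b for a in lower for b in upper)

random.seed(0)
for _ in range(1000):
    xs = [Fraction(random.randint(-5, 5), random.randint(1, 4)) for _ in range(4)]
    assert exists_y(xs[:2], xs[2:]) == eliminated(xs[:2], xs[2:])
print("the elimination step checks out on all samples")
```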

which is one of the key results in real algebraic geometry. Then Σ (as a set of L -sentences) also has QE. ·) are exactly the finite subsets of C and their complements in C. +. Tarski also established the following more difficult theorem. . nor does it prove 1 + 1 = 0. . Robinson using a combination of basic algebra and elementary model theory. then σ ∈ T . . −. Theorem 4.6. <. RCF is a set of axioms true in the ordered field (R. RCF has QE and is complete. 1. xn . +. (ii) every equivalence class is infinite. It dates from around 1950. if T σ. <. and an elegant short proof by A. y are distinct variables. is complete by 4.3 and the fact that the ring of integers embeds in every algebraically closed field of characteristic 0. and give a complete proof for a third example in the next section. 0. there is a shorter one due to A. (His original proof is rather long.) (4) Suppose that a set Σ of L-sentences has QE. However. In addition to the ordered field axioms. Then Eq∞ admits QE and is complete. y distinct variables) and for each odd n > 1 the axiom ∀x1 . −.68 CHAPTER 4. and let Σ ⊇ Σ be a set of L -sentences. .5. it has the axiom ∀x (x > 0 → ∃y (x = y 2 )) (x. an L-theory is a set T of L-sentences such that for all L-sentences σ. An axiomatization of an L-theory T is a set Σ of L-sentences such that T = {σ : σ is an L-sentence and Σ σ}. Seidenberg. 0. ACF(0). The following theorem is due to Tarski and (independently) to Chevalley. .3. which contains additional axioms forcing the characteristic to be 0. since it says nothing about the characteristic: it doesn’t prove 1 + 1 = 0. (iii) there are infinitely many equivalence classes. ACF is not complete.3. 0. +.) (3) Let Eq∞ be a set of axioms in the language {∼} (where ∼ is a binary relation symbol) that say: (i) ∼ is an equivalence relation. ACF has QE.) Definition. In (5) and (6). SOME MODEL THEORY We mention without proof two examples of QE. ·) of RCF are exactly the finite unions of intervals of all kinds (including degenerate intervals with just one point) (Hint: use the fact that RCF has QE. (1) The subsets of C definable in (C. Theorem 4. ∀xn ∃y (y n + x1 y n−1 + · · · + xn = 0) where x1 . 1. . Exercises. 1. ·) of real numbers. . (It is also possible to use Vaught’s test to prove completeness. Let the language L extend L by new symbols of arity 0. −. (Hint: use the fact that ACF has QE. Clearly.) (2) The subsets of R definable in the model (R.3.

Then ˜ (1) There is a unique embedding Z −→ A. that is. P2 . −. +. 1. −. <). z for definiteness. 1. 3. . n = 1. +. 1. 0. 0. t1 (x)) ∨ · · · ∨ ϕ(x. . n = 1. y) → ϕ(x. that is. Let A = (A. We expand (Z. ∀x Pn x ↔ ∃y(x = ny) . P2 . Z. In (v) and in the rest of this section r ranges over integers. . P4 .) (6) Assume the L-theory T has built-in Skolem functions. to obtain the language LPrA of Presburger Arithmetic (named after the Polish logician Presburger who was a student of Tarski). . (division with remainder). <. y) and ψ(x) are q-free. . +. . −. . 1. +. Then T has QE. tk (x) such that T ∃yϕ(x. ∀x∀y∀z(x < y → x + z < y + z) (translation invariance of <). (Hint: let Σ be the set of L-sentences provable from T that have the indicated form. for every ϕ(x. it sends k ∈ Z to k1 ∈ A. <) |= ¬ϕ(n) for all n > N . formulas such as ∃y(x = y + y) and ∃y(x = y + y + y) are not Σ-equivalent to any q-free formula in this language. ). <. for each basic conjunction ϕ(x. 3. y) there are L-terms t1 (x). 2Z. . y) → ϕ(x. Then T has an axiomatization consisting of sentences ∀x∃yϕ(x. tk (x) such that T ∃yϕ(x. . <) |= ϕ(n) for all n > N or (Z. this is a complete set of axioms for ordinary arithmetic of integers without multiplication. . ) |= PrA. There is a mild complication in trying to obtain this completeness via QE: one can show (exercise) that for any q-free formula ϕ(x) in the language {0. −. y) and ∀xψ(x) where ϕ(x. 1. <} there is an N ∈ N such that either (Z. ∀x∃y 0≤r<n x = ny + r1. y. . In particular. . . . Here we have fixed distinct variables x.4 Presburger Arithmetic In this section we consider in some detail one example of a set of axioms that has QE. 4. . 1. Pn is interpreted as the set nZ. 0. +.4. y) there are L-terms t1 (x). the axioms of To expressing that < is a total order.” Essentially. P3 . t1 (x)) ∨ · · · ∨ ϕ(x. PRESBURGER ARITHMETIC 69 (5) Suppose the L-theory T has QE. and prove every sentence true in this structure. . 4Z. . This structure satisfies the set PrA of Presburger Axioms which consists of the following sentences: (i) (ii) (iii) (iv) (v) (vi) the axioms of Ab for abelian groups. 0. . <). P1 . 1.4. . . namely “Presburger Arithmetic. 0. the axioms are true in (Z. +. . <} by new unary relation symbols P1 . −. +. . P3 . 0.1. 0. <) to the LPrA -structure ˜ Z = (Z. −. +. P2 . To overcome this obstacle to QE we augment the language {0. 0 < 1 ∧ ¬∃y(0 < y < 1) (discreteness axiom). and T has an axiomatization consisting of sentences ∀xψ(x) where ψ(x) is q-free. 1. Note that (v) and (vi) are infinite lists of axioms. 1. show that Σ has QE. tk (x)). +. for any set Σ of axioms true in (Z. . −. −. and is an axiomatization of T . . −. 3Z. (defining axioms for P1 . . .4. Here are some elementary facts about models of PrA: A A A Proposition 4. . tk (x)). ) that is. 2. 2.

y). . . . . t(x) < my. ϕ(x. J. . and after rearranging conjuncts. (n − 1)1 + nA. namely 0 + nA. y) = (x1 . PrA has QE. y) ∨ ϕ2 (x.3. I. We allow some of these index sets to be empty in which case the corresponding conjunction can be left out. and likewise with my < t(x) and t(x) < my. say h ∈ H. a + (n − 1)1 lies in nA.70 CHAPTER 4. Let (x.1 it suffices to show that ∃yϕ(x. ∨ Pn (my + t(x) + (n − 1)1) Also conjuncts in which y does not appear can be eliminated because ∃y(ψ(x) ∧ θ(x. Proof. and consider a basic conjunction ϕ(x. SOME MODEL THEORY A (2) Given any n > 0 we have Pn = nA. Also. Similarly a negation ¬Pn (my + t(x)) can be replaced by the disjunction Pn (my + t(x) + 1) ∨ . By Lemma 4. . y)) is equivalent to ∃yϕ1 (x. y) in LPrA . and A/nA has exactly n elements. y) has the form my = th (x) ∧ h∈H i∈I ti (x) < my ∧ j∈J my < tj (x) ∧ k∈K Pn(k) (my + tk (x)) where m > 0 and H. We may assume that each conjunct of ϕ is of one of the following types. a + 1. y) is PrA-equivalent to a q-free formula ψ(x). y) ∨ ∃yϕ2 (x. . . .4. y). Pn (my + t(x)).2. Since PrA Pn (z) ↔ Prn (rz) for r > 0 we can replace Pn (my + t(x)) by Prn (rmy + rt(x)). where we regard A as an abelian group. my < t(x). . . After all these reductions. y)) ←→ ψ(x) ∧ ∃yθ(x. K are disjoint finite index sets. y) is PrAequivalent to Pm (th (x)) ∧ h∈H th (x) = th (x) ∧ i∈I ti (x) < th (x) ∧ j∈J th (x) < tj (x) Pn(k) (th (x) + tk (x)) ∧ k∈K For the rest of the proof we assume that H = ∅. . . (3) For any n > 0 and a ∈ A. Then the formula ∃yϕ(x. To justify this assumption observe that if we had instead a conjunct my = t(x) then we could replace it by (my < t(x)) ∨ (t(x) < my) and use the fact that ∃y(ϕ1 (x. for r > 0 we can replace my = t(x) by rmy = rt(x). exactly one of the a. (4) A is torsion-free as an abelian group. xn . for some integer m ≥ 1 and LPrA -term t(x): my = t(x). Theorem 4. Suppose that H = ∅. y) be a tuple of n + 1 distinct variables. We can therefore assume that all conjuncts have the same “coefficient” m in front of the variable y. . .

y) now has the form ti (x) < my ∧ i∈I j∈J my < tj (x) . Consider the system of linear congruences (with “unknown” y) Pn(k) (my + tk (a)). b). .) |= PrA. . Fix any value a ∈ Zn of x. e.(N − 1)1 + N z. . where as usual we put N = 1 for K = ∅. . . . . PRESBURGER ARITHMETIC 71 ˜ To understand what follows. b). We have to show that A |= ∃yϕ(a. mb + tk (a) = m(r1 + N c) + tk (a) = (mr)1 + (mN )c + tk (a) ∈ n(k)A and so A |= Pn(k) ((mr)1 + tk (a)). Note that then for k ∈ K. Therefore A |= θ(a) with ∃z witnessed by c. This suggests replacing y successively by N z. So let b ∈ A be such that A |= ϕ(a. y) if and only if A |= θ(a). it may help to focus on the model Z. (k ∈ K). y) is PrA-equivalent to the formula θ(x) given by N −1 Pn(k) ((mr)1 + tk (x)) ∧ ∃z r=0 k∈K i∈I ti (x) < m(r1 + N z) ∧ j∈J m(r1 + N z) < tj (x) We prove this equivalence with θ(x) as follows. . .. N − 1} is true in A. . Also. with ∃z witnessed by c ∈ A. (k ∈ K). a = (a1 . For the converse. suppose that the disjunct of θ(a) indexed by a certain r ∈ {0. The solutions in Z of this system form a union of congruence classes modulo N := k∈K n(k).4. ti (a) < m(r1 + N c) m(r1 + N c) < tj (a) for every i ∈ I. Suppose A = (A. for every j ∈ J. Then put b = r1 + N c and we get A |= ϕ(a. . . which in more familiar notation would be written as my + tk (a) ≡ 0 mod n(k). . Division with remainder yields a c ∈ A and an r such that b = r1 + N c and 0 ≤ r ≤ N − 1. Now that we have proved the claim we have reduced to the situation (after changing notation) where H = K = ∅ (i. an ) ∈ An .4. no equations and no congruences). 1 + N z. So ϕ(x. This proves the claimed equivalence. although the arguments go through for arbitrary models of PrA. . Our precise claim is that ∃yϕ(x.
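The step where the solutions of the congruences form a union of congruence classes modulo N can be illustrated by a short Python sketch (mine; the function name residues_mod_N is ad hoc). It determines the admissible classes simply by testing the residues 0, …, N−1.

```python
from math import prod

def residues_mod_N(m, constraints):
    """constraints = [(t_k, n_k)]: return N and the residues r in {0,...,N-1}
    such that the solutions of  m·y + t_k ≡ 0 (mod n_k)  for all k are exactly
    the y with y ≡ r (mod N) for some listed r."""
    N = prod(n for _, n in constraints) if constraints else 1
    return N, [r for r in range(N)
               if all((m * r + t) % n == 0 for t, n in constraints)]

# Example: 2y + 1 ≡ 0 (mod 3) and 2y ≡ 0 (mod 4).
N, rs = residues_mod_N(2, [(1, 3), (0, 4)])
print(N, rs)     # 12 [4, 10]
# Sanity check against a direct search over a window of integers.
window = range(-3 * N, 3 * N)
assert [y for y in window if (2 * y + 1) % 3 == 0 and (2 * y) % 4 == 0] \
       == [y for y in window if y % N in rs]
```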

we just write t ≤ t to abbreviate (t < t ) ∨ (t = t ).) The algorithms above can easily be implemented by computer programs. This bad behaviour is not at all unusual: no (classical. In view of the equally constructive proof of Lemma 4. given any basic conjunction ϕ(x. ) and computer science interact. constructs a q-free formula ψ(x) of LPrA such that PrA ∃yϕ(x. or feasible: on some moderately small inputs it might have to run for 10100 years before producing an output. SOME MODEL THEORY If J = ∅ or I = ∅ then PrA ∃yϕ(x. There do exist feasible integer linear programming algorithms that decide ˜ the truth in Z of sentences of a special form. Discussion. this is an area where mathematics (logic. as we discuss next for Z. this yields an algorithm that. Thus the structure Z is decidable. A positive impact of QE is that it yields structural properties of definable ˜ sets. given any LPrA -sentence σ. .3 that PrA is complete: it has QE ˜ and Z can be embedded in every model. Even if this algorithm can be implemented by a computer program.3. it does not guarantee that the program is of practical use. . this last algorithm constructs for any LPrA -sentence σ a q-free LPrA -sentence σ qf such that PrA σ ↔ σ qf . checks whether ˜ ˜ σ is true in Z. given any q-free LPrA -sentence σ qf . Moreover each interval of m successive elements of A contains an element of mA. y) ↔ ψ(x). The careful reader will have noticed that the elimination procedure in the proof above is constructive: it describes an algorithm that. Therefore ∃yϕ(x. For each value a ∈ An of x there is i0 ∈ I such that ti0 (a) is maximal among the ti (a) with i ∈ I.1 this yields an algorithm that. checks whether σ qf is true ˜ in Z. sequential) algorithm for deciding the truth of ˜ LPrA -sentences in Z is feasible in a precise technical sense. and this shows another (very practical) side of complexity theory. Since we also have an obvious algorithm that. constructs a q-free LPrA -formula ϕqf (x) such that PrA ϕ(x) ↔ ϕqf (x).) In particular. and suppose we have an algorithm for deciding whether any given L-sentence is true in A. So suppose A |= PrA and that A is the underlying set of A. Results of this kind belong to complexity theory. and a j0 ∈ J such that tj0 (a) is minimal among the tj (a) with j ∈ J. . This leaves the case where both I and J are nonempty. j0 ) ∈ I × J of the q-free formula ti (x) ≤ ti0 (x) ∧ i∈I j∈J m tj0 (x) ≤ tj (x) ∧ r=1 Pm (ti0 (x) + r1) ∧ (ti0 (x) + r1 < tj0 (x)) This completes the proof.3. y) in LPrA as input. Note that LPrA does not contain the relation symbol ≤. y) is equivalent in A to the disjunction over all pairs (i0 . It now follows from Corollary 4. Remark.72 CHAPTER 4. given any LPrA -formula ϕ(x) as input. y) ↔ .. (A precise definition of decidability will be given in the next Chapter. (Thus PrA has effective QE. number theory. Let some L-structure A be given.

k = 0 and k ≡ 0 mod d. +∞}. and so on). −. It follows that we may assume that ϕ(x) has the form kx + l1 > 0. −. Then P contains with any two sets X. Then ˜ S is definable in Z ⇐⇒ S is a finite union of arithmetic progressions. +. Exercises.5. An arithmetic progression of modulus d is a set of the form {k ∈ Z : k ≡ r mod d. α < β. . 1.4. <) by a q-free formula of the language {0. Σ is said to be conservative over Σ (or a conservative extension of Σ) if for every L-sentence σ Σ L σ ⇐⇒ Σ L σ. Every atomic formula ϕ(x) different from and ⊥ has the form t1 (x) < t2 (x) or the form t1 (x) = t2 (x) or the form Pd (t(x)).3. l ∈ Z. d − 1}. (3) Let P be the collection of all finite unions of arithmetic progressions. (1) If P. We leave the proof of the next lemma to the reader. or the form kx + l1 = 0 . Definition. SKOLEMIZATION AND EXTENSION BY DEFINITION 73 Definition. 0. then P ∩ Q is an arithmetic progression of modulus lcm(d. Lemma 4. this is straightforward and left to the reader. Σ a set of L-sentences. then Z P is a finite union of arithmetic progressions. Q ⊆ Z are arithmetic progressions of moduli d and e respectively. Considering cases (k = 0. and Σ a set of L -sentences with Σ ⊆ Σ . Y also X ∪ Y . Let S ⊆ Z. .5 Skolemization and Extension by Definition In this section L is a sublanguage of L . 1. . β ∈ Z ∪ {−∞. α < k < β}. <}. . Arithmetic progressions have the following properties. e). where k. where r ∈ {0. The first two kinds reduce to t(x) > 0 and t(x) = 0 respectively (by subtraction). (⇒) By QE and Lemma 4. (⇐) It suffices to show that each arithmetic progression is definable in Z. α. (2) If P ⊆ Z is an arithmetic progression.4. X Y .4. 4. where t1 (x). we see that such a ϕ(x) defines an arithmetic progression. ˜ Proof. Corollary 4.3 it ˜ suffices to show that each atomic LPrA -formula ϕ(x) defines in Z a finite union of arithmetic progressions.4. Let d be a positive integer. or the form Pd (kx + l1). +. (1) Prove that 2Z cannot be defined in the structure (Z. . X ∩ Y . t2 (x) and t(x) are LPrA -terms.4.
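A quick Python check (an illustration of mine) of the first closure property: intersecting two congruence classes yields either the empty set or a single congruence class modulo lcm(d, e).

```python
from math import lcm

def intersect(r1, d, r2, e, window=200):
    """Intersect {k ≡ r1 mod d} with {k ≡ r2 mod e} on a window of Z and read
    off the single congruence class modulo lcm(d, e), if the intersection is
    non-empty."""
    m = lcm(d, e)
    members = [k for k in range(-window, window)
               if k % d == r1 % d and k % e == r2 % e]
    if not members:
        return None, m
    r = members[0] % m
    assert all(k % m == r for k in members)
    return r, m

print(intersect(1, 3, 0, 4))    # (4, 12): k ≡ 1 (mod 3) and k ≡ 0 (mod 4) iff k ≡ 4 (mod 12)
print(intersect(0, 2, 1, 4))    # (None, 4): no integer is even and ≡ 1 (mod 4)
```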

∀xn (∃yϕ(x1 .5. Let ϕ(x1 . xn . The sentence ρϕ is called the defining axiom for Rϕ . (3) Suppose A |= Σ and S ⊆ Am . . We call Σϕ an extension of Σ by a definition for the relation symbol Rϕ . By remark (2) it suffices to obtain an L expansion A of A that makes the new axiom about fϕ true. Remarks (1) Suppose Σ conservative over Σ. . Definition. . then Σ is conservative over Σ. . . . xn ))) as desired. . . an . Given an L-formula ϕ(x1 . let Rϕ be an m-ary relation symbol not in L. . y) → ϕ(x1 . . an ) be an arbitrary element of A. . . . . . Every model of Σϕ is of the form Aϕ for a unique model A of Σ. ∀xm (ϕ(x1 . For any (a1 . xn . . . . . we let fϕ (a1 . . . Thus A |= ∀x1 . Let A be any model of Σ. . Remark. A function fϕ as in the proof is called a Skolem function in A for the formula ϕ(x1 . yn ) there is an L-formula ψ ∗ (y). . xn )))} Then Σ is conservative over Σ. . Then Σ is consistent if and only if Σ is consistent. . We choose a A function fϕ : An −→ A as follows.2. . . . . . such that Σϕ ψ(y) ↔ ψ ∗ (y).5. . and if no such b exists. an ) ∈ An . A Remark. (2) If each model of Σ has an L -expansion to a model of Σ .74 CHAPTER 4. . . . . and put Lϕ := L ∪ {Rϕ } and Σϕ := Σ ∪ {ρϕ } where ρϕ is ∀x1 . . . xm ). y) → ϕ(x1 . . Then S is 0-definable in A if and only if S is 0-definable in Aϕ . . . (2) For each Lϕ -formula ψ(y) where y = (y1 . . . .1. xn . . if there is a A b ∈ A such that A |= ϕ(a1 . . . fϕ (x1 . xn . . . ∀xn (∃yϕ(x1 . Let fϕ be an n-ary function symbol not in L. . . (⇐=) is automatic. and the same with definable instead of 0-definable. called a translation of ψ(y). and put L := L ∪ {fϕ } and Σ := Σ ∪ {∀x1 . xm ) be as above. fϕ (x1 . . Then we have: (1) Σϕ is conservative over Σ. . xm )). Proof. . . . Proposition 4. . SOME MODEL THEORY Here (=⇒) is the significant direction. Let ϕ = ϕ(x1 . . . . . . .) Proposition 4. an ) be such an A element b. We denote this expansion by Aϕ . It yields a “witness” for each relevant n-tuple. . y) be an L-formula. . y). . . Each model A of Σ expands uniquely to a model of Σϕ . xn . xm ) ↔ Rϕ (x1 . . (This follows easily from the Completeness Theorem. . . . . b) then we let fϕ (a1 . . . . . xn .

.3. Σϕ . and (3) is immediate from (2). fϕ (x)) The sentence γϕ is called the defining axiom for fϕ . A = ∅. y1k ). .) It can be shown with tools a little beyond the scope of these notes that no infinite totally ordered set can be defined in the field of complex numbers. . f (xn )) : (x1 .4. . +) of natural numbers. . . y) = (x1 . .5. ∧ um = tm (y) ∧ ϕ(u1 /x1 . ∃um (u1 = t1 (y) ∧ . xn ) ∈ S} of Y n . Let X. y) is an L-formula where (x. . . . y) abbreviates ∃y ϕ(x. (ii) The set {(δ(a). SKOLEMIZATION AND EXTENSION BY DEFINITION 75 Proof. . For example. given by δ(n) = (n. n). To prove (2) we observe that by the Equivalence Theorem (3. . and A . (yn1 . . . ∀xm ϕ(x. yn . y) is a tuple of m + 1 distinct variables. In order to extend this notion to arbitrary structures A = (A. 0. Also. . respectively. ) we use the following notation and terminology. By a definition of (A. such that Σ ∀x1 . <) be a totally ordered set. such that (i) δ(A) ⊆ B k is definable in B. . . δ(b)) : a < b in A} ⊆ (B k )2 = B 2k is definable in B. . . . ∀xm ∃! yϕ(x. . . we use the bijection ((y11 . . .) Before defining the next concept we consider a simple case. Each model A of Σ expands uniquely to a model of Σ . . . In this case we can take ∃u1 . . with z a variable not occurring in ϕ and not among x1 . . . given k ∈ N. .2) it suffices to prove it for formulas ψ(y) = Rϕ t1 (y) . . . <) in a structure B we mean an injective map δ : A → B k . . . . We denote this expansion by A . and S ⊆ X n . (Hint: reduce the analogue of (2) to the case of an unnested formula. Definition. . (1) is clear from the remark preceding the proposition. Let (A. y) ∧ ∀z(ϕ(x. . . . ynk ) from (Y k )n to Y nk to identify these two sets. with k ∈ N. tm (y) where the ti are L-terms. . Let fϕ be an m-ary function symbol not in L and put L := L ∪ {fϕ } and Σ := Σ ∪ {γϕ } where γϕ is ∀x1 . . um /xm )) as ψ ∗ (y) where the variables u1 .5. We leave the proof of this as an exercise. . . yn1 . . xm . . xm . we have a definition δ : Z → N2 of the ordered set (Z. . . <) of integers in the additive monoid (N. y. .2 goes through when Lϕ . where ∃! yϕ(x. Every model of Σ is of the form A for a unique model A of Σ. . . . um do not appear in ϕ and are not among y1 . We call Σ an extension of Σ by a definition of the function symbol fϕ . . y1k . ynk )) → (y11 . Remark. . f : X → Y a map. z) → y = z) . 0) and δ(−n) = (0. . . . . y). Y be sets. . Σ . Proposition 4. . . . . Then the f -image of S is the subset f (S) := {(f (x1 ). . . (We leave it to the reader to check this. Suppose ϕ(x. and A are replaced by L . . .

. xn1 . . . 0. . Let x1 . considered as a subset of C. see exercise 1. . ·) of complex numbers in the field (R. . . . d ∈ Z}. A definition of an L-structure A in a structure B is an injective map δ : A → B k . one can show that the only fields definable in the field of complex numbers are finite fields and fields isomorphic to the field of complex numbers itself. Remark. . ·). −. this follows easily from the fact that ACF admits QE. b) : C → R2 (a. Here B is a structure for a language L∗ that may have nothing to do with the language L. and imposing suitable conditions. . xn be distinct variables (viewed as ranging over A ). 1. A more general way of viewing a structure A as in some sense living inside a structure B is to allow δ to be an injective map from A into B k /E for some equivalence relation E on B k that is definable in B. Then we have a map that assigns to each L-formula ϕ(x1 . Proposition 4. . . (iii) For each n-ary f ∈ Lf the set δ graph of f A ⊆ (B k )n+1 = B (n+1)k is definable in B. ·) of real numbers. SOME MODEL THEORY Definition. for n = 0 the map above assigns to each L-sentence σ an L∗ -sentence δσ such that A |= σ ⇐⇒ B |= δσ. . there is no definition of the field of real numbers in the field of complex numbers: this follows from the fact. and let x11 . −. xn ) an L∗ -formula δϕ(x11 . . . with k ∈ N. ·. . In particular. b ∈ R) is a 0-definition of the field (C. . we get the notion of a 0-definition of A in B.) Indeed. such that (i) δ(A) ⊆ B k is definable in B. . <) in (Z. . . It follows that the inclusion map N → Z is a 0-definition of (N. . . On the other hand. . b. −. 1. 0. +.3. (A special case says that R. . xnk be nk distinct variables (viewed as ranging over B).) Recall that by Lagrange’s “four squares” theorem we have N = {a2 + b2 + c2 + d2 : a. . . . . x1k . that there is no definition of any infinite totally ordered set in the field of complex numbers. c. . 1. . Our special case corresponds to E = equality on B k . 0. The bijection a + bi → (a. (ii) For each m-ary R ∈ Lr the set δ(RA ) ⊆ (B k )m = B mk is definable in B. 0.5. is not definable in the field of complex numbers. +. . (We do not develop this idea here further: the right setting for it would be many-sorted structures. rather than our one-sorted structures. Replacing everywhere “definable” by “0-definable”. xn1 . +. . stated earlier without proof. x1k . +. . Let δ : A → B k be a 0-definition of the L-structure A in the L∗ -structure B.76 CHAPTER 4. xnk ) such that δ(ϕA ) = (δϕ)B ⊆ B nk .

x.x. For example µx(x2 > 7) = 3..} is non-empty. 77 . µx<4 (x2 > 3) = 2 and µx<2 (x > 5) = 2. no computable set of axioms in the language of N and true in N can be complete.x.. is clear.. We let µx(. Decidability.x. Here .x.. We will only use this notation when the meaning of . Then there exists a sentence σ in that language such that N |= σ. Consider the o structure N := (N.) := a. +. For example. 0. and goes far beyond incompleteness..1 Computable Functions First some notation.2.x. 5. The only unexplained terminology here is “computable.. computability plays a role in combinatorial group theory (Higman’s Theorem) and in certain diophantine questions (Hilbert’s 10th problem). where S : N → N is the successor function.. is some condition on natural numbers x... Let Σ be a computable set of sentences in the language of N and true in N. (It seems reasonable to require this of an axiom system for N.. <).) Thus we begin this chapter with developing the notion of computability. A simple form of the incompleteness theorem is as follows.. holds if there is such an x. but Σ σ. For example. For a ∈ N we also let µx<a (. and if there is no such x we put µx<a (. not to mention its role in the ideological underpinnings of computer science. S.. hence the name Incompleteness Theorem. ·.. “Σ is computable” means that there is an algorithm to recognize whether any given sentence in the language of N belongs to Σ.. holds.. In other words..” Intuitively.x.) denote the least x ∈ N for which . and the set {x ∈ N : .) be the least x < a in N such that . and Incompleteness In this chapter we prove G¨del’s famous Incompleteness Theorem.Chapter 5 Computability.x. The interest of this notion is tied to the Church-Turing Thesis as explained in Section 5.
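The two search operators translate directly into code; the following small Python sketch (mine, with ad hoc names mu and mu_below) reproduces the three examples just given.

```python
def mu(cond):
    """µx(...x...): the least x in N with cond(x); runs forever if there is none."""
    x = 0
    while not cond(x):
        x += 1
    return x

def mu_below(a, cond):
    """µx<a(...x...): the least x < a with cond(x), and a itself if there is none."""
    for x in range(a):
        if cond(x):
            return x
    return a

print(mu(lambda x: x * x > 7))             # 3
print(mu_below(4, lambda x: x * x > 3))    # 2
print(mu_below(2, lambda x: x > 5))        # 2
```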

and for all a ∈ Nn there exists x ∈ N such that G(a. Hm (a)). This 4 4 4 4 follows from (R2) by noting that H(x) = F (G(I1 (x). . I2 (x). Example. Hm ). and the coordinate functions Ii (for each n and i = 1. if a ∈ R. / Definition. . 0 0 . . . . The functions χ≥ and χ= on N2 are computable..) obtained by inductively applying the following rules: n (R1) + : N2 → N. Hk ) ⊆ Nn is computable. . x2 . . an ) ∈ R. . where for a ∈ Nn we put R(H1 . . Hm )(a) ⇐⇒ R(H1 (a). We call χR the characteristic function of R. χ≤ : N2 → N. . For R ⊆ Nn . x3 . n) = 1 iff m < n. 2. then the function F : Nn → N given by F (a) = µx(G(a. . Hm (a)). . Now apply (R2).1.78 CHAPTER 5. . .2. . . . Then R(H1 . . . (The reader should of course notice when we do so. . COMPUTABILITY 1 0 if a ∈ R. n we define Ii : Nn → N by Ii (a1 . . If F : N3 → N and G : N2 → N are computable. Observe that χR(H1 . an ) instead of (a1 . x2 . as is the constant function cn : Nn → N where cn (a) = 0 for all n and a ∈ Nn . · : N2 → N. x) = 0. . and often write R(a1 . These functions are called coordinate functions. I4 (x)) where x = (x1 . . n) = 0 iff m ≥ n. but only tacitly. . This is mostly an exercise in programming. x4 ). n n Definition. x) = 0) is computable. . . I4 (x)). . Definition. . n) are computable. 1.Hm ) = χR (H1 . . . . . . . x4 ). The computable functions (or recursive functions) are the functions from Nn to N (for n = 0. . . For i = 1. x4 ) = F (G(x1 .. Proof. (R3) If G : Nn+1 → N is computable. (R2) and (R3) we derive further rules for obtaining computable functions. . . Lemma 5. χ< (m. .. . . A relation R ⊆ Nn is said to be computable (or recursive) if its characteristic function χR : Nn −→ N is computable. Hm : Nn → N are computable. . then so is the function H : N4 → N defined by H(x1 . .) From (R1).1. . . . Let H1 . .. We shall use this device from now on in many proofs. Example. we define χR : Nn → N by χR (a) = Think of such R as an n-ary relation on N.1. then so is the function F = G(H1 . Lemma 5. . x2 . . x4 ). and χ< (m. . . . . Hm : Nn → N and R ⊆ Nm be computable. x3 . . . Hm ) : Nn → N defined by F (a) = G(H1 (a). (R2) If G : Nm → N is computable and H1 . an ) = ai .

≥. Rk ⊆ Nn be computable such that for each a ∈ Nn exactly one of R1 (a). This is true for k = 0 (and all n) by Lemma 5. P ∧ Q := P ∩ Q. P ∧ Q. P → Q := (¬P ) ∨ Q. . χ= is computable: χ= (m. The function χ≥ is computable because 2 2 χ≥ (m. 0 the relation P ∧ Q is computable since χP ∧Q = χP · χQ . k Proof.2 and (R1).5. x). G(a) = .1. . Then we can form the n-ary relations ¬P := Nn P.1. Hence ¬P is computable by (R2) and Lemma 5. P ∨ Q. P ∨ Q = ¬(¬P ∧ ¬Q). Lemma 5.1. n). Gk : Nn → N are computable. .1. m) = χ≤ (I2 (m. COMPUTABLE FUNCTIONS Proof. Then G : Nn → N given by   G1 (a) if R1 (a)  . . P ↔ Q := (P → Q) ∧ (Q → P ) on N.5. n)) 79 which enables us to apply (R1) and (R2).4 they are also computable. n). By De Morgan’s Law. The remaining relations are complements of these three. Rk (a) holds. x) = 0).3. Suppose P. The rest is clear. Let a ∈ Nn . Q are computable.2.1. . Similarly.6. I1 (m.4. and suppose that G1 . Every constant function cn is computable. Lemma 5. =. . . . Next. To handle cn we observe 0 n+1 cn (a) = µx(In+1 (a. (Definition by Cases) Let R1 . Next. Proof. P ∨ Q := P ∪ Q. Lemma 5. ≤ and = have already been taken care of by Lemma 5. Thus P ∨ Q is computable. The binary relations <. . The relations ≥. . n) = χ≤ (n. so the general result follows by induction on k.1. . 0 For k ∈ N we define the constant function cn : Nn → N by cn (a) = k. k k Lemma 5. n) · χ≥ (m. observe that n+1 cn (a) = µx(cn (a) < x) = µx χ≥ (cn+1 (a. cn (a)). . Let P.1. . Proof. . . In+1 (a. so χ¬P (a) = 0 χ= (χP (a).1.1.   Gk (a) if Rk (a) is computable. n) = χ≤ (m. x)) = 0 k+1 k k for a ∈ Nn .2. Then ¬P . . so by Lemma 5. = on N are computable. P → Q and P ↔ Q are also computable. >. Q be n-ary relations on N. . Then ¬P (a) iff χP (a) = 0 iff χP (a) = cn (a). ≤.
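The manipulations of characteristic functions in these proofs can be mirrored by a small Python sketch (mine), with the values 1 and 0 playing the roles of "in the relation" and "not in the relation".

```python
def chi_eq(m, n):                 # the characteristic function of equality
    return 1 if m == n else 0

def chi_not(chi_P):               # χ_{¬P}(a) = χ_=(χ_P(a), 0)
    return lambda *a: chi_eq(chi_P(*a), 0)

def chi_and(chi_P, chi_Q):        # χ_{P∧Q} = χ_P · χ_Q
    return lambda *a: chi_P(*a) * chi_Q(*a)

def chi_or(chi_P, chi_Q):         # P ∨ Q = ¬(¬P ∧ ¬Q), by De Morgan
    return chi_not(chi_and(chi_not(chi_P), chi_not(chi_Q)))

chi_even  = lambda n: 1 if n % 2 == 0 else 0
chi_small = lambda n: 1 if n < 10 else 0
# n is odd or n < 10:
print([n for n in range(20) if chi_or(chi_not(chi_even), chi_small)(n) == 1])
```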

y) = (a1 . Then for all a ∈ Nn and b ∈ N. Then the function F : Nn → N given by F (a) = µxR(a.8. . x). Q ⊆ Nn+1 be the relations defined by P (a. Let R ⊆ Nn+1 be computable such that for all a ∈ Nn there exists x ∈ N with (a. likewise. Rk ⊆ Nn be computable such that for each a ∈ Nn exactly one of R1 (a). .1. COMPUTABILITY Proof.9. . y) ⇐⇒ ∃ x<y R(a.1. x) or x = y). Some notation: below we use the bold symbol ∃ as shorthand for “there exists a natural number”. P (a) ⇐⇒ . Then P and Q are computable. b) ⇐⇒ F (a) = b. This follows from G = G1 · χR1 + · · · + Gk · χRk .8. Lemma 5. then the function FR : Nn+1 → N defined by FR (a. .1. . Then F is computable if and only if its graph (a subset of Nn+1 ) is computable. Let P. we use the bold symbol ∀ to abbreviate “for all natural numbers. . Pk ⊆ Nn be computable. x) = 0) and apply (R3). Let F : Nn → N. . If R ⊆ Nn+1 is computable. . y) ∈ Nn+1 .11. from which the lemma follows immediately.80 CHAPTER 5. Then the relation P ⊆ Nn defined by   P1 (a) if R1 (a)  . . Proof.7. Here is a nice consequence of 5.10. for (a.1. Use that P = (P1 ∧ R1 ) ∨ · · · ∨ (Pk ∧ Rk ). . y) ⇐⇒ ∀x<y R(a. y) = µx<y R(a. . Proof.” These abbreviation symbols should not be confused with the logical symbols ∃ and ∀.5 and 5.1. x) Q(a. . an . Suppose R ⊆ Nn+1 is computable.1.   Pk (a) if Rk (a) is computable. Proof. (Definition by Cases) Let R1 . x). Lemma 5. y) = µx(R(a. F (a) = µxR(a. Rk (a) holds. Lemma 5. . x) is computable. Lemma 5. Proof. . . . Let R ⊆ Nn+1 be the graph of F . Note that F (a) = µx(χ¬R (a. x) ∈ R. . .1. Lemma 5. . . R(a. x) is computable. Use that FR (a. Let P1 . .

We now develop some coding tricks due to G¨del that enable us to prove routinely that functions like 2x are computable o according to our definition of “computable function”. Exercise. Definition. y) := We call Pair the pairing function. and Left(a) < a if 0 < a ∈ N. Lemma 5. y). y) = χ< (FR (a.10 we note that P (a. y) < y.1. b. .1. x). c ∈ Z we have (by definition): a ≡ b mod c ⇐⇒ a − b ∈ cZ. But is the exponential function n → 2n computable? It certainly is in the intuitive sense: we know how to compute (in principle) its value at any given argument.5. y) = a ⇐⇒ Left(a) = x and Right(a) = y. Lemma 5. (x + y)(x + y + 1) + x 2 a−b 0 if a ≥ b. The functions Left and Right are computable.1. COMPUTABLE FUNCTIONS 81 Proof.1. Using the notation and results from Lemma 5. y) = a . Use that a−b = µx(b + x = a or a < b). Since Pair is a bijection we can define functions Left. Use 5. The ternary relation a ≡ b mod c on N is computable. Right : N → N by Pair(x. It is not that obvious from what we have proved so far that it is computable in our precise sense. The function Pair is bijective and computable. Definition. y). Proof. ˙ ˙ Lemma 5.1. Hence χP (a. For Q.15. The function − : N2 → N defined by a−b = is computable. Define the function Pair : N2 → N by Pair(x. if a < b For a. ˙ Proof. y) iff FR (a. Lemma 5. Proof.12. The reader should check that Left(a). y) = a and Right(a) = µy ∃ x<a+1 Pair(x. Right(a) ≤ a for a ∈ N.13. note that ¬Q(a.9 in combination with Left(a) = µx ∃ y<a+1 Pair(x. y) iff ∃ x<y ¬R(a.14.1. The results above imply easily that many familiar functions are computable.1.
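A Python sketch (mine) of Pair and of the search definitions of Left and Right; the searches are kept deliberately naive, exactly as in the µ-definitions above.

```python
def pair(x, y):
    """Pair(x, y) = (x + y)(x + y + 1)/2 + x."""
    return (x + y) * (x + y + 1) // 2 + x

def left(a):
    return next(x for x in range(a + 1) for y in range(a + 1) if pair(x, y) == a)

def right(a):
    return next(y for y in range(a + 1) for x in range(a + 1) if pair(x, y) == a)

assert all(left(pair(x, y)) == x and right(pair(x, y)) == y
           for x in range(12) for y in range(12))
print(pair(3, 5), left(39), right(39))     # 39 3 5
```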

i) = ai for i < n. n) It follows that n → 2n is computable. Remark. β(a. b. . n) = β(µx(β(x. i) be the remainder of Left(a) upon division by 1 + (i + 1) Right(a). Hence by Proposition 5. i ∈ N we let β(a. ˙ Proposition 5. . . . an are natural numbers such that a0 = 1. i) ≤ ˙ Left(a) ≤ a−1. a contradiction. that is. i ∈ N.16 we have β(a. .1. . Let a0 . .1. We claim that then 1 + N . an−1 ) of natural numbers there exists a ∈ N such that β(a. that is. . The computability of β is clear from earlier results. β(a. i)). i) = ai as required. . For a. Put a := Pair(M. . . Then we take an N ∈ N such that ai ≤ N for all i < n and N is a multiple of every prime number less than n. . 2n = β(a. . i) := µx x ≡ Left(a) mod 1 + (i + 1) Right(a) . Proposition 5.16 shows that we can use β to encode a sequence of numbers a0 . o Definition. We use this as follows to show that the function n → 2n is computable. Proof. then necessarily an = 2n . an−1 in terms of a single number a. then Left(a) = M and Right(a) = N . . The function β is computable. 1 + 2N. . c ∈ N we have a ≡ b mod c ⇐⇒ ∃ x<a+1 a = x · c + b or ∃ x<b+1 b = x · c + a . and thus β(a.16. and ai+1 = 2ai for all i < n. i + 1) = 2β(x. Use that for a. We have β(a. . ≡ a0 a1 mod 1 + N mod 1 + 2N M an−1 mod 1 + nN. COMPUTABILITY Proof. but p does not divide N . To see this. We can now introduce G¨del’s function β : N2 → N. 1 + nN are relatively prime. an−1 be natural numbers. For any sequence (a0 . N ). . If a0 . 0) = 1 and ∀i<n β(x. 0) = 1 and ∀i<n β(x. . then p divides their difference (j − i)N .1. i)). . . hence p | j − i < n. suppose p is a prime number such that p | 1 + iN and p | 1 + jN (1 ≤ i < j ≤ n). By the Chinese Remainder Theorem there exists an M such that M M ≡ ≡ .82 CHAPTER 5. . i) ≤ a−1 for all a. i + 1) = 2β(x. n) = 2n where a := µx(β(x. . .
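The following Python sketch (mine; encode is an ad hoc name) follows the proof: writing a = Pair(M, N), the value β(a, i) is the remainder of M modulo 1 + (i + 1)N, so we work with the two components M, N directly and produce them by the Chinese Remainder Theorem.

```python
from math import prod

def beta(M, N, i):
    """β(a, i) for a = Pair(M, N): the remainder of M modulo 1 + (i+1)·N."""
    return M % (1 + (i + 1) * N)

def encode(seq):
    """Return (M, N) with beta(M, N, i) == seq[i] for all i < len(seq)."""
    n = len(seq)
    # N must be >= every seq[i] and divisible by every prime < n;
    # a multiple of (n-1)! exceeding max(seq) certainly qualifies.
    N = (prod(range(1, n)) if n > 1 else 1) * (max(seq, default=0) + 1)
    moduli = [1 + (i + 1) * N for i in range(n)]        # pairwise coprime
    P = prod(moduli)
    # Chinese Remainder Theorem: M ≡ seq[i] (mod moduli[i]) for each i.
    M = sum(r * (P // m) * pow(P // m, -1, m) for r, m in zip(seq, moduli)) % P
    return M, N

M, N = encode([1, 2, 4, 8, 16])
print([beta(M, N, i) for i in range(5)])    # [1, 2, 4, 8, 16]
```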

. . (1) The function In : N2 → N defined by In(a.1. ¯ Lemma 5. . b) = (F (a. b + 1))b . . . .16. .4 and 5. . am ∗ b1 . . . an ) = n. . denoted a1 . Put (a)i := β(a.8. so lh is computable. which we develop next. . . 5. . and apply Lemmas 5. . . . an ∈ N and i ≤ n.1. . bn ∈ N. Proof. 1) = a1 .1. and ( a1 . n. . For any n. . . . . bn for all a1 . i) = a1 . Then F is computable since ¯ F (a. ¯ Proof. . . an )i = ai+1 for i < n. . . . . i + 1). . b1 . β(a. Lemma 5. an ) of natural numbers we assign a sequence number . an ) → a1 . We define the length function lh : N −→ N by lh(a) = β(a. . . suppose F is computable. . 0) = n (the length of the sequence) and β(a. . . am . . am . an : Nn → N is computable.1. . . . Let F : Nn+1 → N.5. . . i) = ai for i = 1. F (a. . and ai < a1 . 0). (2) The function ∗ : N2 → N defined by a ∗ b = µx(lh(x) = lh(a) + lh(b) and ∀i<lh(a) (x)i = (a)i and ∀j<lh(b) (x)lh(a)+j = (b)j ) is computable and a1 . COMPUTABLE FUNCTIONS 83 The above suggests a general method. The function (a. the function (a1 . ¯ Definition. . . i) → (a)i : N2 −→ N is computable. . . .1. . . ai for all a1 . 0). b1 . . Use a1 . For n = 0 this gives = 0. . . bn = a1 .18. . i)). . 0) = n. Then F is computable if and only if F is computable. . . . . The set Seq is computable since a ∈ Seq ⇐⇒ ∀x<a (lh(x) = lh(a) or ∃ i<lh(a) (x)i = (a)i ) Lemma 5. .1. Suppose F is computable. To each sequence (a1 . n) = an ). β(a. . ¯ In the other direction. (a ∈ Nn ). . where is the sequence number of the empty sequence. b) = µx(lh(x) = b and ∀i<b (x)i = F (a. . . . . . . . . n. . . an for (a1 .19. Observe that lh( a1 . an ) ∈ Nn and i = 1. . 0) = = 0. . an . an = µa(β(a. .1. . For F : Nn+1 → N . .17. . and defined to be the least natural number a such that β(a. . Finally. . Then F is computable since ¯ F (a. an . b − 1) ¯ Note that F (a. . . . . . let Seq ⊆ N denote the set of sequence numbers. i) = µx(lh(x) = i and ∀j<i (x)j = (a)j ) is computable and In( a1 . let F : Nn+1 → N be given by ¯ F (a. . . b) = F (a. . . .
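A Python sketch (mine) of sequence numbers by brute-force minimalization: unpair inverts the pairing function in closed form so that β(a, i) can be evaluated quickly, and the search bound is an ad hoc cutoff, not part of the definition.

```python
from math import isqrt

def unpair(a):
    """Closed-form inverse of Pair: returns (Left(a), Right(a))."""
    w = (isqrt(8 * a + 1) - 1) // 2      # index of the diagonal containing a
    x = a - w * (w + 1) // 2
    return x, w - x

def beta(a, i):
    M, N = unpair(a)
    return M % (1 + (i + 1) * N)

def seqnum(seq, bound=10**6):
    """<a1,...,an>: least a with beta(a,0) = n and beta(a,i) = a_i for i = 1,...,n."""
    n = len(seq)
    for a in range(bound):
        if beta(a, 0) == n and all(beta(a, i + 1) == v for i, v in enumerate(seq)):
            return a
    raise ValueError("search bound too small")

print(seqnum([]))                          # the empty sequence gets code 0
s = seqnum([2, 1])
print(s, [beta(s, i) for i in range(3)])   # lh(s) = 2 and the items 2, 1
```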

b. 0) = A(a) = G(a. The claim follows. Definition.1. Proposition 5. We have ¯ F (a. Then F is computable. Note that ¯ F (a. 0)) and ¯ ¯ F (a. 0) = A(a). b. We claim that ¯ F (a. In(x. F (a. 0). F (a. b + 1) = G(a. Let G and F be as above and suppose G is computable. b + 1) = B(a. F (a. We say that F is obtained from A and B by primitive recursion. by Proposition 5. B. and F are as above. F (a. 0. b. This will be clear if we express the requirement on F as follows: F (a. 0) = G(a. G is computable. The next result is important because it allows us to introduce computable functions by recursion on its values at smaller arguments.84 CHAPTER 5. . b. b + 1) = B(a. F (a. F (a. ˙ B(a. b−1. Let a range over Nn . i. Proof. Let A : Nn → N and B : Nn+2 → N be given. and A and B are computable. (F (a. Proof.1. . c) = A(a) if c = 0. F (a. Suppose A. b + 1)). F (a. b) = µx(Seq(x) and lh(x) = b and ∀i<b (x)i = G(a. Proposition 5. b + 1. and define the function F : Nn+1 → N by F (a. F (a. b)). b)) = B(a.1. b) = G(a. b + 1. 0) = G(a. . (c)b−1 ) if c > 0. b)) (a ∈ Nn ). . .21. 0. 0.20. b) = G(a. Define G : Nn+2 → N by G(a. and thus by the previous lemma F is computable. It follows that F is computable. b. 0). i))) ¯ for all a ∈ Nn and b ∈ N. b) ). b)). Then F is computable. This claim yields the computability of F . b. COMPUTABILITY Given G : Nn+2 → N there is a unique function F : Nn+1 → N such that ¯ F (a. b + 1))b ) = G(a. ˙ Clearly. F (a.20.
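Primitive recursion itself is easy to mirror in code; the following Python sketch (mine, for the one-parameter case) builds F from A and B and uses it to define addition, multiplication and the factorial function.

```python
def primrec(A, B):
    """Return the F with F(a, 0) = A(a) and F(a, b+1) = B(a, b, F(a, b))."""
    def F(a, b):
        value = A(a)
        for i in range(b):
            value = B(a, i, value)
        return value
    return F

add  = primrec(lambda a: a, lambda a, i, v: v + 1)             # a + b
mult = primrec(lambda a: 0, lambda a, i, v: add(v, a))         # a · b
fact = primrec(lambda a: 1, lambda a, i, v: mult(v, i + 1))    # b!  (a unused)
print(add(3, 4), mult(3, 4), fact(0, 5))                       # 7 12 120
```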

1.1. b. (1) The set of prime numbers is computable. the function F = G(H1 . an ∈ N. b) otherwise. b) = K(a. (3) If f1 . 0) = H(a. Hm : Nn → N in F . . According to Proposition 5. F (a. and define F : N → N. There is clearly a unique function F : N2 → N such that for all a. We say that F is closed under composition if for all G : Nm → N in F and all H1 . We claim that F is computable. c) = (c)G(b) if G(b) < b. . an ) for all a1 . . H(a. We say that F is closed under minimalization if for every G : Nn+1 → N in F such that for all a ∈ Nn there exists x ∈ N with G(a. . and Fn+2 = Fn+1 + Fn .1. an ) = F ( a1 . . G(b)) if G(b) < b. . . . . the function F : Nn → N given by F (a) = µx(G(a. b) = F (a. All lemmas and propositions of this Section go through with computable replaced by in F. . then f −1 (X) ⊆ Nm is computable. . let G : N → N and H : N2 → N be computable. . fn : Nm → N are computable and X ⊆ Nn is computable. In particular F (a. (5) If f : N → N is computable and strictly increasing. (6) All computable functions and relations are definable in N. F1 = 1. ` ´ F (a) := F (a)0 . . so F (a1 .) Let F be a collection of functions F : Nm → N for various m. fn ) : Nm → Nn . . . 0). and is closed under composition and minimalization. . b ∈ N. b)) for all a. . . Such a function K is given by K(a. . COMPUTABLE FUNCTIONS 85 Proposition 5. Then F is computable iff F is computable. As a simple o example of such an application. . . Exercises. x) = 0. b) otherwise. .20 will be applied over and over again in the later section on G¨del numbering.5. x) = 0) is in F . then f (N) ⊆ N is computable. . (8) Suppose F contains the functions mentioned in (R1). (Hence n-variable computability reduces to 1variable computability. (7) Let F : Nn → N. . . (2) The Fibonacci numbers are the natural numbers Fn defined recursively by F0 = 0. . but in combination with definitions by cases. (4) If f : N → N is computable and surjective. Hm ) : Nn → N is in F . We say that a relation R ⊆ Nn is in F if its characteristic function χR is in F .20 this claim will follow if we can specify a ¯ computable function K : N3 → N such that F (a. . . . . . (a)n−1 . then there is a computable function g : N → N such that f ◦ g = idN . . . The function n → Fn : N → N is computable. where f := (f1 . b. b ∈ N F (a. H(a.
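For the Fibonacci exercise, the natural scheme is recursion on all earlier values; in the Python sketch below (mine) a tuple of previous values stands in for the sequence number coding F(0), . . . , F(b−1).

```python
def course_of_values(G):
    """Return the F with F(b) = G((F(0), ..., F(b-1)))."""
    def F(b):
        earlier = []
        for _ in range(b + 1):
            earlier.append(G(tuple(earlier)))
        return earlier[b]
    return F

fib = course_of_values(lambda prev: len(prev) if len(prev) < 2
                       else prev[-1] + prev[-2])
print([fib(n) for n in range(10)])     # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```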

We assume that for each i < n the number ai+1 is calculated by some fixed procedure from the earlier numbers a0 . We deliberately neglect physical constraints of space and time: imagine that the program that implements the algorithm has unlimited access to time and space to do its work on any given input. . It stops after step n with an = F (a). . COMPUTABILITY 5. we have a calculable function G : N → N such that ai+1 = G( a0 . it is an important guiding principle. has turned out to be computable. but Turing gave in 1936 a compelling informal analysis of what functions F : N → N are calculable in principle. Here is a heuristic (informal) argument that might make the Thesis plausible. an ). idealized notion of computable. with a0 = a as starting number. and the informal grounds for calculability have always translated routinely into an actual proof of computability. Once this is agreed to. We can assume that on any input a ∈ N this algorithm consists of a finite sequence of steps.” Call a set P ⊆ N calculable if its characteristic function is calculable. . . ai . The Church-Turing Thesis asserts each calculable function F : N → N is computable. numbered from 0 to n. Also. Let an algorithm be given for computing F : N → N. and has never failed in practice: any function that any competent person has ever recognized as being calculable. various alternative formalizations of the informal notion of calculable function have been proposed. . formal systems. . In addition. The above is only a rather narrow version of the Church-Turing Thesis. namely the computable functions. . because the result of Exercise 7 in Section 5. The corresponding assertion for functions Nn → N follows. we should have a calculable P ⊆ N such that ¬P ( a0 . a computer program. . Since G and P describe only single steps in the algorithm for F it is reasonable to assume that they at least are computable.86 CHAPTER 5. Let us write “calculable” for this intuitive. . . . say. A skeptical reader may find this argument dubious. . using various kinds of machines. where at each step i it produces a natural number ai . ai ) for all i < n. one can show easily that F is computable as well.2 The Church-Turing Thesis The computable functions as defined in the last section are also computable in the informal sense that for each such function F : Nn → N there is an algorithm that on any input a ∈ Nn stops after a finite number of steps and produces an output F (a). While the Church-Turing Thesis is not a precise mathematical statement. ai ) for i < n and P ( a0 . see the exercise below. informal. and this has led to general acceptance of the Thesis. . They all have turned out to be equivalent in the sense of defining the same class of functions on N. our Church-Turing Thesis does not characterize mathematically . An algorithm is given by a finite list of instructions.1 is clearly also valid for “calculable” instead of “computable. say. but it suffices for our purpose. . . that is. These instructions should be deterministic (leave nothing to chance or choice). The algorithm should also tell us when to stop. . that is. There are various refinements and more ambitious versions. and so on.

. . the unary successor function n S. α1 . ai ). . . Hm ) is primitive recursive. only the notion function computable by an algorithm that produces for each input from N an output in N. such that each αn computes a (computable) function fn : N → N. . for all i < n we have ai+1 = G( a0 . and is used in a more sophisticated form in the proof of G¨del’s incompleteness theorem.3. . . and such that every computable function f : N → N occurs in the sequence f0 . Hm : Nn → N are primitive recursive. . . . an . Suppose that for each a ∈ N there is such a finite sequence a0 . that for all a ∈ Nn there exists y ∈ N such that G(a. . We shall now argue informally that it is impossible to generate in a fully constructive way exactly the computable functions. in violation of the Church-Turing Thesis. .5. y) = 0. (PR3) If F : Nn+1 → N is obtained by primitive recursion from primitive recursive functions G : Nn → N and H : Nn+2 → N. 5. .1 The same basic idea applies in other cases.3 Primitive Recursive Functions This section is not really needed in the rest of this chapter. . thus defining a function F : N → N. (1) Let G : N → N and P ⊆ N be given. If G and P are computable. but it may throw light on some issues relating to computability. . an ). then G(H1 . f1 . . . an of natural numbers such that a0 = a. . This way of producing a new function fdiag from a sequence (fn ) is called diagonalization. . 1 Perhaps it would be better to call it antidiagonalization. Exercises. . ai ) and ¬P ( a0 . and P ( a0 . Then there is for each a ∈ N at most one finite sequence a0 . . . but fdiag = fn for all n. . . possibly more than once. . o Here is a class of computable functions that can be generated constructively: The primitive recursive functions are the functions f : Nn → N obtained inductively as follows: (PR1) The nullary function N0 → N with value 0. and all coordinate functions Ii are primitive recursive. . PRIMITIVE RECURSIVE FUNCTIONS 87 the intuitive notion of algorithm. Then fdiag is clearly computable in the intuitive sense. . . f2 . . then F is primitive recursive. . (PR2) If G : Nm → N is primitive recursive and H1 . and put F (a) := an . . Now consider the function fdiag : N → N defined by fdiag (n) = fn (n) + 1. . One such issue is the condition. in Rule (R3) for generating computable functions. . α2 . . . so is F . This condition is not constructive: it could be satisfied for a certain G without us ever knowing it. Such a constructive generation process would presumably enable us to enumerate effectively a sequence of algorithms α0 . .

. . an−1 . A consequence of the primitive recursive version of Lemma 5. . ¯ Let a function F : Nn+1 → N be given. Lemma 5. a0 .11–5. y + 1) = χ>0 (x) · χ≥ (P d(x). y)). then the function F : Nn → N given by F (a) = µxR(a. i) = ai for i = 0. Then F : Nn+1 → N satisfies the ¯ ¯ ¯ primitive recursion F (a.10. an−1 < N. 0) = 0.” Proof. It follows . b) ∗ F (a. in the order they are listed. The functions in (ii) are obtained by the usual primitive recursions. For (iv). Proof. The following functions and relations are primitive recursive: (i) each constant function cn . and y over N.1 go through with “computable” replaced by “primitive recursive.15 and Proposition 5. 0) = 0 and F (a. . . To obtain the primitive recursive version of Lemma 5. FR (a. cn is obtained by applying (PR2) with G = c0 (with k = 0 and m m t = n). the unary ˙ relation {x ∈ N : x > 0}. x). Next. The primitive recursive versions of Lemmas 5. . x) is primitive recursive. b) . the function − : N2 → N. (n. the computable functions that one ordinarily meets with are primitive recursive. FR (a. Lemma 5. In the rest of this section x ranges over Nm with m depending on the context.1. y).1. ≤ and = on N. Using this fact and restricted minimalization. the unary function lh.88 CHAPTER 5.16 yields: There is a primitive recursive function B : N → N such that. note that χ≥ (x.16 now follow easily.1.9. . Also. a0 < N. and the binary functions (a. and for all a ∈ Nn there exists x < H(a) such that R(a. y+1) = FR (a.1.3. ·. n − 1.1. and (x.3. . b + 1) = F (a.10 is the following “restricted minimalization scheme” for primitive recursive functions: if R ⊆ Nn+1 and H : Nn → N are primitive recursive. the proof of Proposition 5. (iv) the binary relations ≥. it follows that the unary relation Seq. . .1.1. With the possible exceptions of Lemmas 5. the function β is primitive recursive. N ∈ N) then for some a < B(N ) we have β(a. i) → (a)i . whenever n < N. ˙ (iii) the predecessor function P d : N → N given by P d(x) = x−1. In particular. m (ii) the binary operations +. FR (a. y) → xy on N. note that FR (a. all Lemmas and Propositions in Section 5.1. y))+(y+1)·χ¬R (a.8 and 5. . The function c0 is obtained from c0 by applying (PR2) m times with m 0 G = S. COMPUTABILITY A relation R ⊆ Nn is said to be primitive recursive if its characteristic function χR is primitive recursive. . It is also easy to write down primitive recursions for the functions in (iii).2. As the next two lemmas show. y)·χR (a.1. In and ∗ are primitive recursive.

and a second induction on y. . . and F (a. b) ∈ N . We define the Ackermann function A : N2 → N by A(n. for all n and x. (ii) n ≥ 1 =⇒ An+1 (y) > An (y) + y. . . and plays a role in several contexts. y) := An (y). Assume inductively that A0 . A2 . b)) for all n+1 ¯ satisfies the primitive recursion (a. Proof. (iii) An+1 (y) ≥ An (y + 1). Next we show that An+1 (y) > An (y) for all y: An+1 (0) = An (1). b) ∗ G(a. so An+1 (0) > An (0) and An+1 (0) > 1. The Ackermann Function. Thus A0 = S and An+1 ◦ A0 = An ◦ An+1 . An+1 (y + 1) = An (An+1 (y)). y): Using A1 (y) = y + 2 and A2 (y) = 2y + 3. A1 . 0). 0. F (a. 0) = G(a. For inequality (ii). One verifies easily that A1 (y) = y + 2 and A2 (y) = 2y + 3 for all y. so is F . First we define inductively a sequence A0 .5. F (a. An+1 (0) = An (1). and assume inductively that An (y) > An−1 (y) + y. Inequality (i) follows easily by induction on n. b) = G(a.3. (v) x < y =⇒ An (x + y) ≤ An+2 (y). y: (i) An (x + y) ≥ An (x) + y. of primitive recursive functions An : N → N: A0 (y) = y + 1. The converse is obvious. (iv) 2An (y) < An+2 (y). and strictly increasing in each variable. We leave it as an exercise at the end of this section to show that A is computable. Hence An+1 (y + 1) = An (An+1 (y)) > An (y + 1).3. By diagonalization we can produce a computable function that is not primitive recursive. but the so-called Ackermann function does more. Also. ¯ ¯ ¯ F (a. PRIMITIVE RECURSIVE FUNCTIONS 89 ¯ that if F is primitive recursive. then F ¯ F (A. . b. . we proceed again by induction on (n. so An+1 is strictly increasing.3. and An+1 (y + 1) = An (An+1 (y)) ≥ An (y + 1 + An (y)) ≥ An (y + 1) + An (y) > An (y + 1) + y + 1. An are strictly increasing and A0 (y) < A1 (y) < · · · < An (y) for all y. ¯ so F (and hence F ) is primitive recursive. Lemma 5. we obtain A2 (y) > A1 (y) + y. . Suppose also n+2 ¯ that G : N → N is primitive recursive. b)) . Then An+1 (0) = An (1) > An (0) + 0. so An+1 (y) > y + 1 for all y. b + 1) = F (a. . The function A is computable. Then An+1 (y + 1) = An (An+1 (y)) ≥ A0 (An+1 (y)) > An+1 (y). b. Let n > 1.
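A direct Python transcription (mine) of the double recursion defining A(n, y) = A_n(y); the printed values agree with A_1(y) = y + 2 and A_2(y) = 2y + 3.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def A(n, y):
    """A(n, y) = A_n(y): A_0(y) = y + 1, A_{n+1}(0) = A_n(1),
    A_{n+1}(y+1) = A_n(A_{n+1}(y))."""
    if n == 0:
        return y + 1
    if y == 0:
        return A(n - 1, 1)
    return A(n - 1, A(n, y - 1))

print([A(1, y) for y in range(5)])   # y + 2:  [2, 3, 4, 5, 6]
print([A(2, y) for y in range(5)])   # 2y + 3: [3, 5, 7, 9, 11]
print(A(3, 3))                       # already 61; higher rows grow extremely fast
```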

(1 ≤ i ≤ m). Then An+1 (x + y + 1) = An (An+1 (x + y)) ≤ An+2 (An+3 (y)) = An+3 (y + 1). . assume F = G(H1 . Case 1. . Call an n = n(F ) with the property above a bound for F . x = y. . . Hk : Nm → N are primitive recursive. . The nullary constant function with value 0. Hk . Then An+1 (x + y + 1) = An+1 (2x + 1) = An (An+1 (2x)) ≤ An+2 (2x) < An+2 (An+3 (x)) = An+3 (y + 1). . . F (x. . Note that (v) holds for n = 0. . . Take N ∈ N such that n(G) ≤ N + 3 and n(H) ≤ N . We can assume inductively that if x < y. Given any primitive recursive function F : Nm → N there is an n = n(F ) such that F (x) ≤ An (|x|) for all x ∈ Nm . the successor function S. Proposition 5. . For n > 0 we have by (i). n(Hk ) are bounds for G and H1 . COMPUTABILITY In (iii) we proceed by induction on y. and each coordinate m function Ii . Note that (iv) holds for n = 0. . and assume inductively that n(G) and n(H1 ). We have equality for y = 0. . 0) = G(x) ≤ AN +3 (|x|). and by part (v) of the lemma above. Assuming inductively that (iii) holds for a certain y we obtain An+1 (y + 1) = An (An+1 (y)) ≥ An (An (y + 1)) ≥ An (y + 2). . xm ) ∈ Nm . .3. . F (x. Hk ) where k G : N → N and H1 . Assume (v) holds for a certain n. . Hk (x)) ≤ AN ( i Hi (x)) ≤ AN (AN +1 (|x|)) ≤ AN +2 (|x|). . Then F (x) = G(H1 (x). .4. . y + 1) = H(x. assume that F : Nm+1 → N is obtained by primitive recursion from the primitive recursive functions G : Nm → N and H : Nm+2 → N. (ii) and (iii): An (y) + An (y) ≤ An (y + An (y)) < An (An+1 (y)) = An+1 (y + 1) ≤ An+2 (y). and we want to show that An+1 (x + y + 1) ≤ An+3 (y + 1). and i Hi (x) ≤ AN +1 (|x|) for all x. By part (iv) of the previous lemma we can take N ∈ N such that n(G) ≤ N . Finally. We claim that N + 3 is a bound for F : F (x. x < y. . Case 2. y. then An+1 (x + y) ≤ An+3 (y). Below we put |x| := x1 + · · · + xm for x = (x1 . . . and assume inductively that n(G) and n(H) are bounds for G and H.90 CHAPTER 5. . Next. y)) ≤ AN {|x| + y + AN +3 (|x| + y)} ≤ AN +2 {AN +3 (|x| + y)} = AN +3 (|x| + y + 1). Proof. . has bound 0. Let x < y+1.

REPRESENTABILITY 91 Consider the function A∗ : N → N defined by A∗ (n) = A(n. A(x + 1. that is. N x < S n+1 0 ↔ (x = 0 ∨ · · · ∨ x = S n 0). . A(x + 1. and it gives an example showing that Lemma 5. S0 in which S appears exactly n times. A∗ is not primitive recursive. 0) = A(x. ·. The Ackermann function is defined by a recursion involving both variables: A(0. y) = y + 1. S 1 0 is the term S0. The fact that N is finite will play a role later. 1). but strong enough to prove numerical facts like SS0 + SSS0 = SSSSS0. . the other variables just act as parameters. For each n. Lemma 5. L contains the constant symbol 0 and the unary function symbol S. Hence A is computable but not primitive recursive.4.4. where n(F ) is a bound for F . +. So S 0 0 is the term 0. (It follows that the Ackermann function is recursive.1. Here N is the following set of nine axioms. y)). where we fix two distinct variables x and y for the sake of definiteness: N1 ∀x (Sx = 0) ∀x∀y (Sx = Sy → x = y) N2 N3 ∀x (x + 0 = x) N4 ∀x∀y x + Sy = S(x + y) N5 ∀x (x · 0 = 0) N6 ∀x∀y (x · Sy = x · y + x) N7 ∀x (x < 0) N8 ∀x∀y (x < Sy ↔ x < y ∨ x = y) N9 ∀x∀y (x < y ∨ x = y ∨ y < x) These axioms are clearly true in N.1. Then A∗ is computable. In particular. Exercises. This kind of double recursion is therefore more powerful in some ways than what can be done in terms of primitive recursion and composition. and so on. S. It is a very weak set of axioms.4 Representability Let L be a numerical language. Our key example of a numerical language is L(N) := {0.) 5. We let S n 0 denote the term S . The recursion in “primitive recursion” involves only one variable.5. and for any primitive recursive function F : N → N we have F (y) < A∗ (y) for all y > n(F ).9 fails with “primitive recursive” in place of “computable”. ∀x x < SS0 → (x = 0 ∨ x = S0) . . y + 1) = A(x. A(x + 1. n). (1) The graph of the Ackermann function is primitive recursive. <} (the language of N).

92

CHAPTER 5.

COMPUTABILITY

Proof. By induction on n. For n = 0, N x < S0 ↔ x = 0 by axioms N8 and N7. Assume n > 0 and N x < S n 0 ↔ (x = 0 ∨ · · · ∨ x = S n−1 0). Use axiom N8 to conclude that N x < S n+1 0 ↔ (x = 0 ∨ · · · ∨ x = S n 0). To give an impression how weak N is we consider some of its models: Some models of N. (1) We usually refer to N as the standard model of N. (2) Another model of N is N[x] := (N[x]; . . . ), where 0, S, +, · are interpreted as the zero polynomial, as the unary operation of adding 1 to a polynomial, and as addition and multiplication of polynomials in N[x], and where < is interpreted as follows: f (x) < g(x) iff f (n) < g(n) for all large enough n. (3) A more bizarre model of N: (R≥0 ; . . . ) with the usual interpretations of 0, S, +, ·, in particular S(r) := r + 1, and with < interpreted as the binary relation <N on R≥0 : r <N s ⇔ (r, s ∈ N and r < s) or s ∈ N. / Example (2) shows that N ∀x∃y (x = 2y ∨ x = 2y + S0), since in N[x] the element x is not in 2N[x] ∪ 2N[x] + 1; in other words, N cannot prove “every element is even or odd.” In example (3) the binary relation <N on R≥0 is not even a total order. One useful fact about models of N is that they all contain the so-called standard model N in a unique way: Lemma 5.4.2. Suppose A |= N. Then there is a unique homomorphism ι : N → A. This homomorphism ι is an embedding, and for all a ∈ A and n, (i) if a <A ι(n), then a = ι(m) for some m < n; (ii) if a ∈ ι(N), then ι(n) <A a. / As to the proof, note that for any homomorphism ι : N → A and all n we must have ι(n) = (S n 0)A . Hence there is at most one such homomorphism. It remains to show that the map n → (S n 0)A : N → A is an embedding ι : N → A with properties (i) and (ii). We leave this as an exercise to the reader. Definition. Let L be a numerical language, and Σ a set of L-sentences. A relation R ⊆ Nm is said to be Σ-representable, if there is an L-formula ϕ(x1 , . . . , xm ) such that for all (a1 , . . . , am ) ∈ Nm we have (i) R(a1 , . . . , am ) =⇒ Σ ϕ(S a1 0, . . . , S am 0) (ii) ¬R(a1 , . . . , am ) =⇒ Σ ¬ϕ(S a1 0, . . . , S am 0) Such a ϕ(x1 , . . . , xm ) is said to represent R in Σ or to Σ-represent R. Note that if ϕ(x1 , . . . , xm ) Σ-represents R and Σ is consistent, then for all (a1 , . . . , am ) ∈ Nm R(a1 , . . . , am ) ⇐⇒ Σ ¬R(a1 , . . . , am ) ⇐⇒ Σ ϕ(S a1 0, . . . , S am 0), ¬ϕ(S a1 0, . . . , S am 0).

A function F : Nm → N is Σ-representable if there is a formula ϕ(x1 , . . . , xm , y) of L such that for all (a1 , . . . , am ) ∈ Nm we have Σ ϕ(S a1 0, . . . , S am 0, y) ↔ y = S F (a1 ,...,am ) 0.

5.4. REPRESENTABILITY Such a ϕ(x1 , . . . , xm , y) is said to represent F in Σ or to Σ-represent F .

93

An L-term t(x1 , . . . , xm ) is said to represent the function F : Nm → N in Σ if Σ t(S a1 0, . . . , S am 0) = S F (a) 0 for all a = (a1 , . . . , am ) ∈ Nm . Note that then the function F is Σ-represented by the formula t(x1 , . . . , xm ) = y. Proposition 5.4.3. Let L be a numerical language, Σ a set of L-sentences such that Σ S0 = 0, and R ⊆ Nm a relation. Then R is Σ-representable ⇐⇒ χR is Σ-representable. Proof. (⇐) Assume χR is Σ-representable and let ϕ(x1 , . . . , xm , y) be an Lformula Σ-representing it. We show that ψ(x1 , . . . , xm ) := ϕ(x1 , . . . , xm , S0) Σ-represents R. Let (a1 , . . . , am ) ∈ R; then χR (a1 , . . . , am ) = 1. Hence Σ ϕ(S a1 0, . . . , S am 0, y) ↔ y = S0,

so Σ ϕ(S a1 0, . . . , S am 0, S0), that is, Σ ψ(S a1 0, . . . , S am 0). Similarly, (a1 , . . . , am ) ∈ R implies Σ ¬ϕ(S a1 0, . . . , S am 0, S0). (Here we need / Σ S0 = 0.) (⇒) Conversely, assume R is Σ-representable and let ϕ(x1 , . . . , xm ) be an Lformula Σ-representing it. We show that ψ(x1 , . . . , xm , y) := (ϕ(x1 , . . . , xm ) ∧ y = S0) ∨ (¬ϕ(x1 , . . . , xm ) ∧ y = 0) Σ-represents χR . Let (a1 , . . . , am ) ∈ R; then Σ Σ ϕ(S a1 0, . . . , S am 0). Hence,

[(ϕ(S a1 0, . . . , S am 0) ∧ y = S0) ∨ (¬ϕ(S a1 0, . . . , S am 0) ∧ y = 0)] ↔ y = S0

i.e. Σ ψ(S a1 0, . . . , S am 0, y) ↔ y = S0. And similarly for (a1 , . . . , am ) ∈ R, Σ ψ(S a1 0, . . . , S am 0, y) ↔ y = 0. / Let Σ be a set of L-sentences. An L-term t(x1 , . . . , xm ) is said to represent the function F : Nm → N in Σ if Σ t(S a1 0, . . . , S am 0) = S F (a) 0 for all m a = (a1 , . . . , am ) ∈ N . Note that then the function F is Σ-represented by the formula t(x1 , . . . , xm ) = y. Theorem 5.4.4 (Representability). Each computable function F : Nn → N is N-representable. Each computable relation R ⊆ Nm is N-representable. Proof. By Proposition 5.4.3 we need only consider the case of functions. We make the following three claims: (R1) + : N2 → N, · : N2 → N, χ≤ : N2 → N, and the coordinate function n Ii (for each n and i = 1, . . . , n) are N-representable. (R2) If G : Nm → N and H1 , . . . , Hm : Nn → N are N-representable, then so is F = G(H1 , . . . , Hm ) : Nn → N defined by F (a) = G(H1 (a), . . . , Hk (a)).

94 (R3)

CHAPTER 5.

COMPUTABILITY

If G : Nn+1 → N is N-representable, and for all a ∈ Nn there exists x ∈ N such that G(a, x) = 0, then the function F : Nn → N given by F (a) = µx(G(a, x) = 0)

is N-representable. (R1) : The proof of this claim has six parts. (i) The formula x1 = x2 represents {(a, b) ∈ N2 : a = b} in N: Let a, b ∈ N. If a = b then obviously N S a 0 = S b 0. Suppose that a = b. Then for every model A of N we have A |= S a 0 = S b 0, by Lemma 5.4.2 and its proof. Hence N S a 0 = S b 0. (ii) The term x1 + x2 represents + : N2 → N in N: Let a + b = c where a, b, c ∈ N. By Lemma 5.4.2 and its proof we have A |= S a 0 + S b 0 = S c 0 for each model A of N. It follows that N S a 0 + S b 0 = S c 0. (iii) The term x1 · x2 represents · : N2 → N in N: The proof is similar to that of (ii). (iv) The formula x1 < x2 represents {(a, b) ∈ N2 : a < b} in N: The proof is similar to that of (i). (v) χ≤ : N2 → N is N-representable: By (i) and (iv), the formula x1 < x2 ∨ x1 = x2 represents the set {(a, b) ∈ N2 : a ≤ b} in N. So by Proposition 5.4.3, χ≤ : N2 → N is N-representable. (vi) For n ≥ 1 and 1 ≤ i ≤ n , the term tn (x1 , . . . , xn ) := xi , represents i n the function Ii : Nn → N in N. This is obvious. (R2) : Let H1 , . . . , Hm : Nn → N be N-represented by ϕi (x1 , . . . , xn , yi ) and let G : Nm → N be N-represented by ψ(y1 , . . . , ym , z) where z is distinct from x1 , . . . , xn and y1 , . . . , ym . Claim : F = G(H1 , . . . , Hm ) is N-represented by
m

θ(x1 , . . . , xn , z) := ∃y1 . . . ∃ym ((
i=1

ϕi (x1 , . . . , xn , yi )) ∧ ψ(y1 , . . . , ym , z)).

Put a = (a1 , . . . , an ) and let c = F (a). We have to show that N θ(S a 0, z) ↔ z = S c 0

where S a 0 := (S a1 0, . . . , S an 0). Let bi = Hi (a) and put b = (b1 , . . . , bm ). Then F (a) = G(b) = c. Therefore, N ψ(S b 0, z) ↔ z = S c 0 and N ϕi (S a 0, yi ) ↔ yi = S bi 0, (i = 1, . . . , m)

Argue in models to conclude : N

θ(S a 0, z) ↔ z = S c 0.

xm . 0) ∧ ∀w(w < y → ¬ϕ(x1 .i) 0. . y). Examples. is Σ-represented by ϕ(x1 . . We shall prove the converse in the next section.4. xn . . . z) ↔ z = 0 and for i < b. . By arguing in models using Lemma 5. the set Th(K) := {σ : A |= σ for all A ∈ K} is an L-theory. Note that the theory of A is automatically complete. Remark. b) = 0. Then the set of all Σ-representable functions F : Nm → N. since N S m 0 = S n 0 whenever m = n. y. Define F : Nn → N by F (a) = µb(G(a. S i 0. . 1. Therefore. (2) Given an L-structure A. . . . . N ϕ(S a 0. Suppose that G is N-represented by ϕ(x1 . L is a numerical language and Σ is a set of L-sentences. . . . . as claimed. i) = 0 and N ϕ(S a 0. . . whenever T σ. Let a ∈ Nn and let b = F (a). . w. 0)) N-represents F . In the exercises below. (1) Given a set Σ of L-sentences. DECIDABILITY AND GODEL NUMBERING 95 (R3) : Let G : Nn+1 → N be such that for all a ∈ Nn there exists b ∈ N with G(a. The converse of this theorem is also true.5 Decidability and G¨del Numbering o Definition. Then G(a. For example. . (3) Given any class K of L-structures. G(a. z) ↔ z = S G(a. i) = 0 for i < b and G(a. (m = 0. xm . .) (2) Suppose Σ ⊇ N. y) ↔ y = S b 0. If a function F : Nm → N is Σrepresented by the L-formula ϕ(x1 . then the graph of F . . is an L-theory. (1) Suppose Σ S m 0 = S n 0 whenever m = n. . the set Th(Σ) := {σ : Σ σ} of theorems of Σ. then σ ∈ T . xn . b) = 0). ) is closed under composition and minimalization.¨ 5. 5. (This result applies to Σ = N. . y. Th(K) is the set of L-sentences that are true in all finite fields. . y).5. . . Exercises. z). called the theory of A. We claim that the formula ψ(x1 . as a relation of arity m + 1 on N. and K the class of finite fields. . for L = LRi . b) = 0. . xn .2 we obtain N ψ(S a 0. xn . called the theory of K. 2. y) := ϕ(x1 . . the set Th(A) := {σ : A |= σ} is also an Ltheory. If we need to indicate the dependence on L we write ThL (Σ) for Th(Σ). An L-theory T is a set of L-sentences closed under provability. and is plausible from the Church-Turing Thesis. that is. S b 0. For Σ = ∅ we also refer to ThL (Σ) as predicate logic in L. We say that Σ axiomatizes an L-theory T (or is an axiomatization of T ) if T = Th(Σ).

this is just an informal description at this stage. ϕ = (t1 = t2 ). Since we have not (yet) defined the concept of algorithm. tm . .) Unless we specify otherwise we assume in the rest of this chapter that the language L is finite. . One of our goals in this section is to define a formal counterpart. SN(R). tn if t = vi . SN(∧). . . if t = F t1 . SN(¬). called “decidability. ϕ = ⊥. . ϕ2 x . SN(∨). This is done for simplicity. ϕ = ϕ1 ∨ ϕ2 . SN(∃). v2 . t1 . of an L-formula ϕ is given recursively by SN( ) SN(⊥) SN(=). . ϕ = Rt1 . ϕ = ¬ψ. . by the Church-Turing Thesis. tm ψ ϕ1 . ϕ2 ϕ1 . ψ x .” In the next section we show that the L(N)-theory Th(N) is undecidable.5. SN(∀). . and at the end of the remaining sections we indicate how to avoid this assumption. COMPUTABILITY The decision problem for a given L-theory T is to find an algorithm to decide for any L-sentence σ whether or not σ belongs to T . . v1 . . and is closely related to the Incompleteness Theorem. . .96 CHAPTER 5. tn . . . . ϕ = ϕ1 ∧ ϕ2 . Recall that v0 . t2 t1 . . . v2 . The following subsets of N are computable: (1) Vble := { x : x is a variable} (2) Term := { t : t is an L-term} (3) AFor := { ϕ : ϕ is an atomic L-formula} (4) For := { ϕ : ϕ is an L-formula} . } a symbol number SN(s) ∈ N as follows: SN(vi ) := 2i and to each remaining symbol (in the finite set L {logical symbols}) we assign an odd natural number as symbol number. if if if if if if if if if ϕ= . The G¨del number t of an L-term t is defined recursively: o t = The G¨del number ϕ o                  ϕ =                 SN(vi ) SN(F ). ϕ = ∀x ψ. ψ Lemma 5. . . (This result is a version of Church’s Theorem. this means that the decision problem for Th(N) has no solution. subject to the condition that different symbols have different symbol numbers.1. t1 . are our variables. . ϕ = ∃x ψ. We assign to each symbol s∈L {logical symbols} {v0 . Definition. v1 . We shall number the terms and formulas of L in such a way that various statements about these formulas and about formal proofs in this language can be translated “effectively” into equivalent statements about natural numbers expressible by sentences in L(N).

(a)1 . 0 0  SN(∃). i). (a)2 . (a)1 . (a)1 . τ ) : τ is free for x in ϕ} ⊆ N3 (3) PrAx := { ϕ : ϕ is a propositional axiom} ⊆ N (4) Eq := { ϕ : ϕ is an equality axiom} ⊆ N (5) Quant := { ψ : ψ is a quantifier axiom} ⊆ N (6) MP := {( ϕ1 . The function Sub : N3 → N defined by Sub(a. So For is computable. note that Sent(a) ⇐⇒ For(a) and ∀i<a ¬ Fr(a. . Sub((a)2 . .3. b. (a)2     or a = SN(∧). . x . . (a)1 . and t and τ over L-terms. . (a)1 . c) if a = (a)0 . (a)1 . c)     a is computable. ϕ2 ) : ϕ1 . τ ) = t(τ /x) and Sub( ϕ . ϕ and ψ over L-formulas. (a)n with n > 0 and     (a) = SN(∃). Sub((a)2 . b. ϕ2 are L-formulas} ⊆ N3 (7) Gen := {( ϕ . . (a)2 and (a)1 = b. (a)1 . ϕ1 → ϕ2 .  if a = SN(¬). Proof. . (a)2      or a = SN(∀). t1 . For((a)1 )    For((a)1 ) and For((a)2 ) if a = SN(∨). DECIDABILITY AND GODEL NUMBERING 97 Proof.5. Lemma 5. otherwise Sub( t . . In the next two lemmas. . . . (1) a ∈ Vble iff a = 2b for some b ≤ a. . . (4) We have For(a) ⇔ Vble((a)1 ) and For((a)2 ) if a = SN(∃). x ) : x occurs free in ϕ} ⊆ N2 (2) FrSub := {( ϕ .    AFor(a) otherwise. Lemma 5. The usual inductive or explicit description of each of these notions translates easily into a description of its “G¨del image” that establishes computability o of this image. . . c).¨ 5. (2) a ∈ Term iff a ∈ Vble or a = SN(F ). so (8) follows from (1) and earlier results.5. As an example. . x . b. x . tn with G¨del numbers < a. Exercise. c)     SN(∀). (a)2 . (a)2 and (a)1 = b. and satisfies if a = SN(∃). Sub((a)1 . The following relations on N are computable: (1) Fr := {( ϕ . b. τ ) = ϕ(τ /x) .5. c     (a)0 . Sub((a)n . (a) = SN(∀). (a)1 . o We leave (3) to the reader. tn for some function symbol F of L of arity n and L-terms t1 . if a = SN(∀). b. . (a)1 . x ranges over variables. c) =  if Vble(a) and a = b. ψ ) : ψ follows from ϕ by the generalization rule} ⊆ N2 (8) Sent := { ϕ : ϕ is a sentence} ⊆ N Proof.2.

” also “recursively axiomatizable” and “effectively axiomatizable” are used. Proof. COMPUTABILITY In the rest of this Section Σ is a set of L-sentences.5. Proof. . . An L-theory T is said to be computably axiomatizable if T has a computable axiomatization. ϕj with 1 ≤ i. x) “Recursively enumerable” is also used for “computably generated. The complement of a computably generated subset of N is not always computably generated. or obtained from some ϕi with 1 ≤ i < k by Generalization. (a)k ) or ∃ i < k : Gen((a)i . or a logical axiom.6 (Negation Theorem).) Proposition 5. We leave it as an exercise to check that the union and intersection of two computably generated n-ary relations on N are computably generated.” but for L-theories “decidable” is more widely used than “computable”. Definition.5. (Thus “T is decidable” means the same thing as “T is computable. Put Σ := { σ : σ ∈ Σ}.98 CHAPTER 5. and call Σ computable if Σ is computable. then Prf Σ is computable. So every element of Prf Σ is of the form ϕ1 . e. j < k by Modus Ponens. j < k : MP((a)i . (a)j . Then A is computable. .5.2 We say that T is decidable if T is computable. as we shall see later.5. . A relation R ⊆ Nn is said to be computably generated if there is a computable relation Q ⊆ Nn+1 such that for all a ∈ Nn we have R(a) ⇔ ∃ xQ(a. Lemma 5. and undecidable otherwise.” Remark. If Σ is computable. or obtained from some ϕi . ϕn is a proof from Σ}. . i. Prf Σ := { ϕ1 . . This is because a is in Prf Σ iff Seq(a) and lh(a) = 0 and for every k < lh(a) either (a)k ∈ Σ ∪ PrAx ∪ Eq ∪ Quant or ∃ i. . Every computable relation is obviously computably generated. . 2 Instead of “computably axiomatizable. . If Σ is computable. .4. Let A ⊆ Nn and suppose A and ¬A are computably generated. .5. ϕn where n ≥ 1 and every ϕk is either in Σ. We define Prf Σ to be the “set of all G¨del numbers of proofs from o Σ”. Definition. ϕn : ϕ1 . . (a)k ). Lemma 5. ˙ Definition. then Th(Σ) is computably generated. Apply Lemma 5.4 and the fact that for all a ∈ N a ∈ Th(Σ) ⇐⇒ ∃ b Prf Σ (b) and a = (b)lh(b)−1 and Sent(a) . .

5. ⊥. Proof. ¬. µx(P ∨ Q)(a. v1 . Proposition 5.5. a . as a diligent reader can easily verify. 11. 5. Much of the above does not really need the assumption that L is finite.1.) Given a numbering of L we use it to assign to each L-term t and each Lformula ϕ its G¨del number t and ϕ . respectively. Then T = Th(Σ) is computably generated.3 go through. 99 Then there is for each a ∈ Nn an x ∈ N such that (P ∨ Q)(a. In the discussion below we only assume that L is countable. Let T be a complete L-theory with computable axiomatization Σ. . with the corresponding G¨del o numbering of L-terms and L-formulas. v2 . ∃. =. / ˙ Hence the complement of T is computably generated. ∀. x). 13.7. (So if L is finite. and for s = . ∨. First. Then Lemmas 5. By a numbering of L we mean an injective function L → N that assigns to each s ∈ L an odd natural number SN(s) > 15 such that if s is a relation symbol. put SN(s) = 1. . Relaxing the assumption of a finite language. } {logical symbols} its symbol number SN(s) ∈ N as follows: SN(vi ) := 2i. x). and 5. {(SN(s). arity(s)) : s ∈ L} ⊆ N2 are computable. Q ⊆ Nn+1 be computable such that for all a ∈ Nn we have A(a) ⇐⇒ ∃ xP (a. x). This part of our numbering of symbols is independent of L. Every complete and computably axiomatizable L-theory is decidable. ¬A(a) ⇐⇒ ∃ xQ(a. just as we did earlier in the section. DECIDABILITY AND GODEL NUMBERING Proof. o Suppose a computable numbering of L is given. Let P. a ∈ T / / ⇐⇒ a ∈ Sent or ∃ b Prf Σ (b) and (b)lh(b)−1 = SN(¬). 7. Such a numbering of L is said to be computable if the sets SN(L) = {SN(s) : s ∈ L}) ⊆ N. so the case of finite L is included. respectively. x)). 15.5. . we assign to each symbol s ∈ {v0 .5.2. The computability of A follows by noting that for all a ∈ Nn we have A(a) ⇐⇒ P (a. and if s is a function symbol. then every numbering of L is computable.5. then SN(s) ≡ 1 mod 4. 9. Now observe: a ∈ T ⇐⇒ a ∈ Sent or SN(¬). . Thus T is decidable by the Negation Theorem. 3.5. ∧.¨ 5. then SN(s) ≡ 3 mod 4.

100 CHAPTER 5. T being computably axiomatizable. (3) If f : N → N is computable and f (x) > x for all x ∈ N. and Proposition 5. and e is computable. then f can be chosen injective. T being decidable.5.5. (6) Suppose Σ is a computable and consistent set of sentences in the numerical language L. if S is infinite and computably generated.7 go through. that the definitions of these notions are all relative to our given computable numbering of L. and T being undecidable.6. for finite L different choices of numbering of L yield equivalent notions of Σ being computable. (5) A function F : Nn → N is computable iff its graph is computably generated. and for an L-theory T we define the o notions of T being computably axiomatizable. ab. (4) Every infinite computably generated subset of N has an infinite computable subset. and call Σ computable if Σ is a computable subset of N. (1) Let a and b denote positive real numbers. and 1/a. so are a + b. Theorem 5. (7) A function F : Nm → N is computable if and only if it is Σ-representable for some finite.) (2) A nonempty S ⊆ N is computably generated iff there is a computable function f : N → N such that S = f (N). No consistent L-theory extending N is decidable. and T being decidable. Call a computable if there are computable functions f.” 5.) It is now routine to check that Lemmas 5.5. a is computable if and only if the binary relation Ra on N defined by Ra (m. and if in addition a > b. g(n) = 0 and |a − f (n)/g(n)| < 1/n. n) ⇐⇒ n > 0 and m/n < a is computable.5.4. if a and b are computable. all just as we did earlier in this section. Exercises.6 Theorems of G¨del and Church o In this section we assume that the finite language L extends L(N). . COMPUTABILITY Let also a set Σ of L-sentences be given. (Note. 5.1 (Church). consistent set Σ ⊇ N of sentences in some numerical language L ⊇ L(N). Then: (i) (ii) (iii) every positive rational number is computable. Then every Σ-representable relation R ⊆ Nn is computable. We define Prf Σ to be the set of all G¨del numbers of proofs from Σ. The last exercise gives an alternative characterization of “computable function. Moreover. then a − b is also computable. (Hint: use the Negation Theorem. g : N → N such that for all n > 0. Define Σ := { σ : σ ∈ Σ}. then f (N) is computable. however.

Proof.6. x = v0 ) and define the binary relation P Σ ⊆ N2 by P Σ (a. b) is not of the form P (a) for any a ∈ A. we let P (a) ⊆ A be given by the equivalence P (a)(b) ⇔ P (a. that is. where a ∈ A. a contradiction. Then Q(a) iff P (a. i.4 (Cantor). S b 0 ) = ϕ(S b 0) . x .5. Take a = ϕ(x) . x .5. then P (a) = f (a). b).6. The function Num : N → N defined by Num(a) = S a 0 is computable. (The corollary above only says that such a sentence exists. its antidiagonal Q ⊆ A defined by Q(b) ⇐⇒ ¬P (b. Num(b)) ∈ Th(Σ) For an L-formula ϕ(x) and a = ϕ(x) . X = P Σ (a).3.4. THEOREMS OF GODEL AND CHURCH Before giving the proof we record the following consequence: 101 Corollary 5. Suppose Σ ⊇ N is consistent. We fix a variable x (e. Num(0) = 0 and Num(a + 1) = SN(S). Lemma 5.6. But by definition. say by the formula ϕ(x). b) ⇐⇒ Σ ϕ(S b 0). This is essentially Cantor’s proof that no f : A → P(A) can be surjective. then X(b) iff Σ ϕ(S b 0) iff P Σ (a. So X(b) ⇔ Σ ϕ(S b 0) (using consistency to get “⇐”).4. Immediate from 5. Q(a) iff ¬P (a.¨ 5.) Definition. so P Σ (a.6. X(b) ⇒ Σ ϕ(S b 0).7 and Church’s Theorem.6. (Use P (a. . Suppose Q = P (a). Each como putably axiomatizable L-theory extending N is incomplete. b). e. b) ⇐⇒ Sub(a. Let X ⊆ N be computable.) For the proof of Church’s Theorem we need a few lemmas. b) :⇔ b ∈ f (a). Lemma 5. For a ∈ A. a). Given any P ⊆ A2 . Proof. Proof. Let P ⊆ A2 be any binary relation on a set A. Let Σ be a set of L-sentences. g.2 (Weak form of G¨del’s Incompleteness Theorem). Then each computable set X ⊆ N is of the form X = P Σ (a) for some a ∈ N. We will indicate in the next Section how to construct for any consistent computable set of L-sentences Σ ⊇ N an L-sentence σ such that Σ σ and Σ ¬σ. Then X is Σ-representable by Theorem 5. and ¬X(b) ⇒ Σ ¬ϕ(S b 0). a). Num(a) . we have Sub( ϕ(x) . Proof. Lemma 5.

4. . Note that PA is consistent. Therefore. 0. if Th(∅) were decidable. Let ∧N be the sentence N1 ∧ · · · ∧ N9. Th(PA) is undecidable and incomplete.102 CHAPTER 5. σ ⇐⇒ ∅ ∧N → σ. y) → ϕ(x. a ∈ Th(∅) . and ∀x stands for ∀x1 . in the sense that PA σ for the sentence σ expressing that fact. Then the antidiagonal QΣ ⊆ N of P Σ is computable: b ∈ QΣ ⇔ (b. S. So Th(∅) is undecidable. . ∧N . (That is. all sentences of the form ∀x [ ϕ(x. ·) as a model. but Th(N) is undecidable. for all a ∈ N. QΣ is not among the P Σ (a). Let Σ ⊇ N be consistent. Suppose that Th(Σ) is computable. Discussion. <.6. ∀xn .5.5 the subset Th(N) of N is computably generated. But this set is not computable: Corollary 5. Th(N) and Th(∅) (in the language L(N)) are undecidable. y) is an L(N)-formula. 0) ∧ ∀y (ϕ(x. then Th(N) would be decidable. x = (x1 . Therefore by Lemma 5. for any L(N)-sentence σ. xn ). (Apparently it was not widely recognized at the time that completeness of PA would have had . x . Proof. we know how to construct for any number theoretic statement a sentence σ of L(N) such that the statement is true if and only if N |= σ. / / By Lemma 5.6. since an actual sentence would be too unwieldy without abbreviations. To appreciate the significance of this result. y)] where ϕ(x. Thus by the theorems above. Over a century of experience has shown that number theoretic assertions can be expressed by sentences of L(N). The undecidability of Th(N) is a special case of Church’s Theorem. QΣ is not computable.6.6. In most cases we just indicate how to construct such a sentence. . one needs a little background knowledge. Then. Th(Σ) is not computable. Sy)) → ∀y ϕ(x. a ∈ Th(N) ⇐⇒ a ∈ Sent and SN(∨). By Lemma 5. that is. N that is. Thus before G¨del’s Incompleteness o Theorem it seemed natural to conjecture that PA is complete. +. since it has N = (N. b) ∈ P Σ ⇔ Sub(b. a contradiction. We have seen that N is quite weak. We have to show that then Th(Σ) is undecidable.5. including some history.) What is more important. . Num(b)) ∈ Th(Σ) . that is. . SN(¬). . Also PA is computable (exercise). admittedly in an often contorted way. A very strong set of axioms in the language L(N) is PA (1st order Peano Arithmetic). COMPUTABILITY Proof of Church’s Theorem. we know from experience that any established fact of classical number theory—including results obtained by sophisticated analytic and algebraic methods—can be proved from PA. This concludes the proof. Its axioms are those of N together with all induction axioms.

7. we prove the incompleteness theorem in the more explicit form stated in the introduction to this chapter. Here is a sketch of how to make such a sentence. S b 0. and (2) N |= σ ⇐⇒ Σ σ. y) ↔ y = S c 0. We can assume that the variable x does not occur in sub(x1 . y). Then there is for every L-formula ρ(y) an L-sentence σ such that Σ σ ↔ ρ(S n 0) where n = σ . In this sense. x . y). x2 . b in N. one might guess that σ = ∀x¬pr(x. Assuming that N |= ¬σ produces a contradiction. From (1) and (2) we get N |= σ ⇐⇒ Σ σ. Let sub(x1 . Lemma 5. In this section. the situation cannot be remedied by adding new axioms to PA. o But how do we arrange (1)? Since σ := ∀x¬pr(x. ∀xϕ(x). What follows is a rigorous implementation. The function (a. for any computable consistent Σ ⊇ N. Note that then the sentence ∀xϕ(x) is true in N but not provable from Σ. Suppose Σ ⊇ N. We shall indicate how to construct. We also fix two distinct variables x and y.5. Assume for simplicity that L = L(N) and N |= Σ. x .) Of course. Hence σ is true in N. the Incompleteness Theorem is pervasive. S σ 0) where pr(x. where c = Sub(a. a formula ϕ(x) of L(N) with the following properties: (i) N (ii) Σ ϕ(S m 0) for each m. N sub(S a 0. and Σ is a set of L-sentences. S σ 0) depends on σ. This finishes our sketch. y) be an L(N)-formula representing it in N.2.6 we obtained G¨del’s Incompleteness Theorem as an immediate o corollary of Church’s theorem. How to implement this strange idea? To take care of (2). In this section L ⊇ L(N) is a finite language. the solution is to apply the fixed-point lemma below to ρ(y) := ∀x¬pr(x. x2 . Hence by the representability theorem it is N-representable. Then for all a. n) ⇐⇒ m is the G¨del number of a proof from Σ o of a sentence with G¨del number n. Num(b)) (1) . b) → Sub(a. at least if we insist that the axioms are true in N and that we have effective means to tell which sentences are axioms. 5. Proof. The idea is to construct sentences σ and σ such that (1) N |= σ ↔ σ . Num(b)) : N2 → N is computable by Lemma 5.7 A more explicit incompleteness theorem In Section 5. y) is a formula representing in N the binary relation Pr ⊆ N2 defined by Pr(m.5. and thus(!) not provable from Σ. A MORE EXPLICIT INCOMPLETENESS THEOREM 103 the astonishing consequence that Th(N) is decidable.1 (Fixpoint Lemma).7.

S σ 0). Σ σ ↔ ρ(S n 0).1 (with L = L(N) and Σ = N) provides an L(N)-sentence σ such that N σ ↔ ρ(S σ 0). x. σ ↔ ∃y(y = S n 0 ∧ ρ(y)). S n 0) ⇐⇒ ¬PrΣ (m. y) be an L(N)-formula representing PrΣ in N. Define θ(x):= ∃y(sub(x. n) ⇐⇒ m is the G¨del number of a proof from Σ o of an L-sentence with G¨del number n. a contradiction. S m 0. N ϕ(S m 0) for each m. y) ↔ y = S n 0 (2) σ ↔ ρ(S n 0). S n 0) ⇐⇒ PrΣ (m. so by (2) we get Σ Theorem 5. so PrΣ (m.2. that is Σ σ ↔ ∀x¬prΣ (x. x . Lemma 5. S m 0 ) = Sub(m.7. that is. S σ 0) for each m. Because of (3) we also o have Σ ∀x¬prΣ (x. which by (2) yields ¬PrΣ (m. Let σ := θ(S m 0). o Since Σ is computable. N We have sub(S m 0. σ ). Hence PrΣ is representable in N. S σ 0) (3) Claim: Σ σ. Hence ¬PrΣ (m. and hence in Σ. . n = σ = θ(S m 0) = Sub( θ(x) . computable. σ = θ(S m 0) = ∃y(sub(S m 0. n) Σ ¬prΣ (S m 0. n: Σ prΣ (S m 0.7. PrΣ is computable. Suppose Σ is consistent. x . Let prΣ (x. and put n = σ . COMPUTABILITY Now let ρ(y) be an L-formula. So by (1). S m 0. We now show : (i) N ϕ(S m 0) for each m. and proves all axioms of N. y) ∧ ρ(y)) and let m = θ(x) .104 CHAPTER 5. Because Σ is consistent we have for all m. This establishes the claim. Because Σ σ. y). S σ 0). y) ∧ ρ(y)). so Σ ¬prΣ (S m 0. It follows that Σ σ ↔ ρ(S σ 0). but Σ ∀xϕ(x). We claim that Σ Indeed. Assume towards a contradiction that Σ σ. no m is the G¨del number o of a proof of σ from Σ. n) (1) (2) Let ρ(y) be the L(N)-formula ∀x¬prΣ (x. let m be the G¨del number of a proof of σ from Σ. Num(m)). Now put ϕ(x) := ¬prΣ (x. σ ) for each m. Consider the relation PrΣ ⊆ N2 defined by PrΣ (m. which by the defining property of prΣ yields N ¬prΣ (S m 0. Then there exists an L(N)-formula ϕ(x) such that N ϕ(S m 0) for each m. S σ 0). σ ). Hence. Proof.

apply the theorem above to Σ ∪ N in place of Σ. (1) Let N∗ be an L-expansion of N. Th(Σ ) is undecidable . and Σ is a set of L -sentences. Σ σ ←→ true(S n 0).1. (A conscientious reader will replace these appeals by proofs until reaching a level of skill that makes constructing such proofs a predictable routine.8.5. (Note that then ∀xϕ(x) is true in N∗ but not provable from Σ. then T is undecidable. Then there exists an L(N)-formula ϕ(x) such that N ϕ(S n 0) for each n. where n = σ .) To obtain this corollary. Σ is a set of L-sentences. Let L ⊆ L and Σ ⊆ Σ . (2) Suppose Σ ⊇ N is consistent. The aim of this section is to establish this result and indicate some applications.7.) In this section. It strengthens the special case of Church’s theorem which says that the set of G¨del numbers of o L-sentences true in N∗ is not computable. Suppose that Σ is computable and true in the L-expansion N∗ of N. In order not to distract from this theme by boring details. we shall occasionally replace a proof by an appeal to the Church-Turing Thesis. Here a truth definition for Σ is an L-formula true(y) such that for all L-sentences σ. as does the more general result in the next exercise.8 Undecidable Theories Church’s theorem says that any consistent theory containing a certain basic amount of integer arithmetic is undecidable. but Σ ∪ N ∀xϕ(x). Lemma 5. L and L are finite languages. 5. This result (of Tarski) is known as the undefinability of truth. by (3). This is because of the Claim and Σ 105 σ ↔ ∀xϕ(x). UNDECIDABLE THEORIES (ii) Σ ∀xϕ(x). Corollary 5. Tarski’s result follows easily from the fixpoint lemma. and there is no truth definition for Σ. (1) Suppose Σ is conservative over Σ.8. and Th(Gr) (the theory of groups)? An easy way to prove the undecidability of such theories is due to Tarski: he noticed that if N is definable in some model of a theory T . Σ is finite. How about theories like Th(Fl) (the theory of fields). Then =⇒ Th(Σ) is undecidable. Then the set of G¨del numbers of L-sentences o true in N∗ is not definable in N∗ . Then ThL (Σ) is undecidable (2) Suppose L = L and Σ =⇒ ThL (Σ ) is undecidable. Exercises. Then the set Th(Σ) is not Σ-representable.3.

It follows that if Th(Σ) is decidable then so is Th(Σ ). . Then ⇐⇒ ThL (Σ) is undecidable. (1) In this case we have for all a ∈ N.) that computes for each L -sentence σ an L-sentence σ ∗ such that Σ σ ↔ σ ∗ .106 (3) Suppose all symbols of L CHAPTER 5. . . so is ThL (Σ). Remark. Then. (4) The ⇒ direction holds by (1). Definition. . cn )). It follows that if ThL (Σ ) is decidable. Then for each L-sentence σ we have Σ σ ⇐⇒ Σ σ → σ. a ∈ ThL (Σ) ⇐⇒ a ∈ SentL and a ∈ ThL (Σ ) . .6. a ∈ Th(Σ ) ⇐⇒ a ∈ Sent and SN(∨). (2) Write Σ = {σ1 . for all a ∈ N: a ∈ ThL (Σ) ⇐⇒ a ∈ SentL and a ∈ ThL (Σ) . . Then ThL (Σ ) is undecidable by Corollary 5. .. We cannot drop the assumption L = L in (2): take L = ∅. then σ := ∀vk . . . a ∈ ThL (Σ ) ⇐⇒ a ∈ SentL and a∗ ∈ ThL (Σ) . . This yields the ⇐ direction of (4). . n. σ . we leave it to the reader to replace this appeal to the Church-Turing Thesis by a proof. ThL (Σ) is undecidable (4) Suppose Σ extends Σ by a definition. . vk+n ). By the Church-Turing Thesis there is a computable function a → a : N → N such that σ = σ for all L -sentences σ. . for all a ∈ N. . . . Proof. Given any L sentence σ we define the L-sentence σ as follows: take k ∈ N minimal such that σ contains no variable vm with m ≥ k. . but ThL (Σ) is decidable (exercise). ∀vk+n ϕ(vk . . Σ = ∅. a ∈ Th(Σ) . (3) Let c0 . σN } ∪ Σ. replace each occurrence of ci in σ by vk+i for i = 0. Hence. vk+n ) be the resulting L-formula (so σ = ϕ(c0 . COMPUTABILITY L are constant symbols. Then ThL (Σ) is undecidable ⇐⇒ ThL (Σ ) is undecidable. For the ⇐ we use an algorithm (see . By ∗ the Church-Turing Thesis there is a computable function a → a : N → N such that σ ∗ = σ ∗ for all L -sentences σ. An easy argument using the completeness theorem shows that Σ L σ ⇐⇒ Σ L σ. . . . the converse holds by (1). L = L(N) and Σ = ∅. . . and let ϕ(vk . Th(Σ) is undecidable.6. . cn be the distinct constant symbols of L L. An L-structure A is said to be strongly undecidable if for every set Σ of L-sentences such that A |= Σ. so for all a ∈ N.. SN(¬). . . and put σ := σ1 ∧ · · · ∧ σN . This yields the ⇐ direction of (3). .

. Example. c.1. and more generally. Theorem 5. . 0. . But ThL (Σ) is undecidable by assumption. Σ σ ⇐⇒ Σ ∪ ∆ δσ. Let Σ be a set of L -sentences such that B |= Σ . ·) is strongly undecidable. Using Lagrange’s theorem that N = {a2 + b2 + c2 + d2 : a. . .8.1. b. Corollary 5. Lemma 5. . . . an ) be an L(c0 .8. and let (A. and (2) a finite set ∆ of L -sentences true in B. (Sketch) By the previous lemma (with L and B instead of L and A). Th(Ri) is undecidable. and by (ii) we obtain an algorithm for deciding whether any given L-sentence is provable from Σ. cn )-expansion of the L-structure A. cn be distinct constant symbols not in L. an ) is strongly undecidable =⇒ A is strongly undecidable. . 0. by part (2) of Lemma 5. S. d ∈ Z}. N = (N.3 (Tarski). . hence Th(Σ) is undecidable by part (2) of Lemma 5.8.8. Proof. a0 . We have to show that Th(Σ) is undecidable. One can show that in this case we have (1) an algorithm that computes for any L-sentence σ an L -sentence δσ. 1. UNDECIDABLE THEORIES 107 So A is strongly undecidable iff every L-theory of which A is a model is undecidable. Now N |= Σ ∪ N. Proof. in other words. Then (A. we can reduce to the case that we have a 0-definition δ of A in B. (ii) for each set Σ of L -sentences with B |= Σ and each L-sentence σ. ·) of integers is strongly undecidable. we need to show that ThL (Σ ) is undecidable. where Σ := {s : s is an L-sentence and Σ ∪ ∆ |= δs}.4. Suppose towards a contradiction that ThL (Σ ) is decidable. By Church’s Theorem Th(Σ ∪ N) is undecidable. −. Take Σ as in (ii) above. It suffices to show that the ring (Z.2. +. . +. the theory of any class of rings that has the ring of integers among its members is undecidable. .5. . let Σ be a set of L(N)-sentences such that N |= Σ. the theory of rings is undecidable. . For the same reason. Then A |= Σ by (i). Suppose the L-structure A is definable in the L structure B and A is strongly undecidable.8. . Then B is strongly undecidable. <. we see that the inclusion map N → Z defines N in the ring of integers. To see this. Let c0 . of integral domains. so we have an algorithm for deciding whether any given L -sentence is provable from Σ ∪ ∆. .8. the theory of commutative rings. with the following properties: (i) A |= σ ⇐⇒ B |= δσ. so by Tarski’s Theorem the ring of integers is strongly undecidable. The following result is an easy application of part (3) of the previous lemma. and we have a contradiction. a0 . . Then ThL (Σ ∪ ∆) is decidable.

ThL (∅)) is undecidable. It can also be shown that predicate logic in the language whose only symbol is a binary relation symbol is undecidable. 1. and thus the ring of integers is definable in the field of rational numbers. 0. then g = sa for some a. . Then G (as a model of Gr) is strongly undecidable. |). then b = a2 . |) in the group G. +. The set Z ⊆ Q is 0-definable in the field (Q. ·) of rational numbers. We shall take this here on faith. is known to be decidable (Szmielew). The theory Th(Fl) of fields is undecidable. +. and we say that c is a least common multiple of a and b if a | c. Use these facts to specify a definition of the structure (Z. then they have a unique positive least common multiple. |) is strongly undecidable. there is no integer x > 1 with x | a and x | b). Thus by Tarski’s theorem. and is due to Julia Robinson. COMPUTABILITY Fact. The only known proof uses non-trivial results about quadratic forms. and b − a is a least common multiple of a and a − 1. (2) The structure (Z. Hint: Show that if b + a is a least common multiple of a and a + 1. Exercises. 0. On the other hand. the theory of abelian groups. using the Church-Turing Thesis. with composition as the group multiplication. 0. the theory of any class of groups that has the group G of the exercise above among its members is undecidable. c denote integers. then they have ab as a least common multiple. (1) Argue informally. In fact. Th(Ab). On the other hand. Then predicate logic in L (that is. 0.5. (3) Consider the group G of bijective maps Z → Z. Recall that if a and b are not both zero. (4) Let L = {F } have just a binary function symbol. 1.108 CHAPTER 5. +. |). 1. Next show that a | b ⇐⇒ sb commutes with each g ∈ G that commutes with sa . Corollary 5. Hint: let s be the element of G given by s(x) = x + 1. Use this to define the squaring function in (Z. Below we let a. +. and then show that multiplication is 0-definable in (Z. +. You can use the fact that ACF has QE. We say that a divides b (notation: a | b) if ax = b for some integer x. The theory of any class of fields that has the field of rationals among its members is undecidable. Check that if g ∈ G commutes with s. −. and c | x for every integer x such that a | x and b | x. that Th(ACF) is decidable. 1. 0. and that if a and b are coprime (that is. the theory of groups is undecidable. 1.8. where | is the binary relation of divisibility on Z. b. b | c. predicate logic in the language whose only symbol is a unary function symbol is decidable.

• brief discussion of classes at end of “Sets and Maps”? • at the end of results without proof? • brief discussion on P=NP in connection with propositional logic • section(s) on boolean algebra. maybe extra section on RCF. including Stone representation. etc. exams) • footnotes pointing to alternative terminology. and other simple o applications of compactness • include “equality theorem”. etc. Ax’s theorem. • translation of one language in another (needed in connection with Tarski theorem in last section) • more details on back-and-forth in connection with unnested formulas • extra elementary model theory (universal classes. model-theoretic criteria for qe. LindenbaumTarski algebras. etc. • On computability: a few extra results on c. application to ACF. .e..To do? • improve titlepage • improve or delete index • more exercises (from homework. sets. groups. • basic framework for many-sorted logic. as examples) • solution to a problem by Erd¨s via compactness theorem. and exponential diophantine result. • section on equational logic? (boolean algebras.

Sign up to vote on this title
UsefulNot useful