You are on page 1of 454

The Universal Algebra of First Order Logic

Noel Vaillant
http://www.probability.net/personal.html
(use Alt+Left-Arrow to navigate back within document)
August 29, 2013
Contents
1 Universal Algebra 5
1.1 Existence of Free Universal Algebra . . . . . . . . . . . . . . . . 5
1.1.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.2 Type of Universal Algebra . . . . . . . . . . . . . . . . . . 6
1.1.3 Denition of Universal Algebra . . . . . . . . . . . . . . . 7
1.1.4 Homomorphism of Universal Algebra . . . . . . . . . . . . 7
1.1.5 Isomorphism of Universal Algebra . . . . . . . . . . . . . 9
1.1.6 Free Generator of Universal Algebra . . . . . . . . . . . . 10
1.1.7 Free Universal Algebra . . . . . . . . . . . . . . . . . . . . 12
1.1.8 Construction of Free Universal Algebra . . . . . . . . . . 14
1.1.9 Cantors Theorem . . . . . . . . . . . . . . . . . . . . . . 16
1.1.10 Disjoint Copy of a Set . . . . . . . . . . . . . . . . . . . . 17
1.1.11 Main Existence Theorem . . . . . . . . . . . . . . . . . . 20
1.2 Structural Induction, Structural Recursion . . . . . . . . . . . . . 21
1.2.1 Unique Representation in Free Universal Algebra . . . . . 21
1.2.2 Order in Free Universal Algebra . . . . . . . . . . . . . . 23
1.2.3 Universal Sub-Algebra of Universal Algebra . . . . . . . . 25
1.2.4 Generator of Universal Algebra . . . . . . . . . . . . . . . 29
1.2.5 Proof by Structural Induction . . . . . . . . . . . . . . . . 30
1.2.6 Denition by Recursion over N . . . . . . . . . . . . . . . 34
1.2.7 Denition by Structural Recursion . . . . . . . . . . . . . 43
1.3 Relation on Universal Algebra . . . . . . . . . . . . . . . . . . . . 50
1.3.1 Relation and Congruence on Universal Algebra . . . . . . 50
1.3.2 Congruence Generated by a Set . . . . . . . . . . . . . . . 51
1.3.3 Quotient of Universal Algebra . . . . . . . . . . . . . . . . 55
1.3.4 First Isomorphism Theorem . . . . . . . . . . . . . . . . . 56
1.4 Sub-Formula in Free Universal Algebra . . . . . . . . . . . . . . . 58
1.4.1 Sub-Formula of Formula in Free Universal Algebra . . . . 58
1.4.2 The Sub-Formula Partial Order . . . . . . . . . . . . . . . 60
1.4.3 Increasing Map on Free Universal Algebra . . . . . . . . . 63
1.4.4 Structural Substitution between Free Algebras . . . . . . 64
1
2 The Free Universal Algebra of First Order Logic 66
2.1 Formula of First Order Logic . . . . . . . . . . . . . . . . . . . . 66
2.1.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 66
2.1.2 The Free Universal Algebra of First Order Logic . . . . . 68
2.1.3 Variable Substitution in Formula . . . . . . . . . . . . . . 70
2.1.4 Variable Substitution and Congruence . . . . . . . . . . . 75
2.1.5 Variable of a Formula . . . . . . . . . . . . . . . . . . . . 76
2.1.6 Substitution of Single Variable . . . . . . . . . . . . . . . 80
2.1.7 Free Variable of a Formula . . . . . . . . . . . . . . . . . 83
2.1.8 Bound Variable of a Formula . . . . . . . . . . . . . . . . 88
2.1.9 Valid Substitution of Variable . . . . . . . . . . . . . . . . 92
2.1.10 Dual Substitution of Variable . . . . . . . . . . . . . . . . 99
2.1.11 Local Inversion of Variable Substitution . . . . . . . . . . 102
2.2 The Substitution Congruence . . . . . . . . . . . . . . . . . . . . 109
2.2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 109
2.2.2 The Strong Substitution Congruence . . . . . . . . . . . . 109
2.2.3 Free Variable and Strong Substitution Congruence . . . . 110
2.2.4 Substitution and Strong Substitution Congruence . . . . . 111
2.2.5 Characterization of the Strong Congruence . . . . . . . . 115
2.2.6 Counterexamples of the Strong Congruence . . . . . . . . 123
2.2.7 The Substitution Congruence . . . . . . . . . . . . . . . . 125
2.2.8 Free Variable and Substitution Congruence . . . . . . . . 128
2.2.9 Substitution and Substitution Congruence . . . . . . . . . 129
2.2.10 Characterization of the Substitution Congruence . . . . . 130
2.2.11 Substitution Congruence vs Strong Congruence . . . . . . 136
2.3 Essential Substitution of Variable . . . . . . . . . . . . . . . . . . 140
2.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 140
2.3.2 Minimal Transform of Formula . . . . . . . . . . . . . . . 141
2.3.3 Minimal Transform and Valid Substitution . . . . . . . . 147
2.3.4 Minimal Transform and Substitution Congruence . . . . . 152
2.3.5 Isomorphic Representation Modulo Substitution . . . . . 156
2.3.6 Substitution Rank of Formula . . . . . . . . . . . . . . . . 158
2.3.7 Existence of Essential Substitution . . . . . . . . . . . . . 171
2.3.8 Properties of Essential Substitution . . . . . . . . . . . . 179
2.4 The Permutation Congruence . . . . . . . . . . . . . . . . . . . . 188
2.4.1 The Permutation Congruence . . . . . . . . . . . . . . . . 188
2.4.2 Integer Permutation . . . . . . . . . . . . . . . . . . . . . 190
2.4.3 Iterated Quantication . . . . . . . . . . . . . . . . . . . . 192
2.4.4 Irreducible Formula . . . . . . . . . . . . . . . . . . . . . 195
2.4.5 Characterization of the Permutation Congruence . . . . . 196
2.5 The Absorption Congruence . . . . . . . . . . . . . . . . . . . . . 201
2.5.1 The Absorption Congruence . . . . . . . . . . . . . . . . . 201
2.5.2 Characterization of the Absorption Congruence . . . . . . 207
2.5.3 Absorption Mapping and Absorption Congruence . . . . . 210
2.6 The Propositional Congruence . . . . . . . . . . . . . . . . . . . 214
2.6.1 The Propositional Congruence . . . . . . . . . . . . . . . 214
2
3 The Free Universal Algebra of Proofs 215
3.1 The Hilbert Deductive Congruence . . . . . . . . . . . . . . . . . 215
3.1.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 215
3.1.2 Axioms of First Order Logic . . . . . . . . . . . . . . . . . 216
3.1.3 Proofs as Free Universal Algebra . . . . . . . . . . . . . . 220
3.1.4 Provability and Sequent Calculus . . . . . . . . . . . . . . 226
3.1.5 The Deduction Theorem . . . . . . . . . . . . . . . . . . . 231
3.1.6 Transitivity of Consequence Relation . . . . . . . . . . . . 234
3.1.7 The Hilbert Deductive Congruence . . . . . . . . . . . . . 235
3.2 The Free Universal Algebra of Proofs . . . . . . . . . . . . . . . . 240
3.2.1 Totally Clean Proof . . . . . . . . . . . . . . . . . . . . . 240
3.2.2 Variable Substitution in Proof . . . . . . . . . . . . . . . 247
3.2.3 Hypothesis of a Proof . . . . . . . . . . . . . . . . . . . . 252
3.2.4 Axiom of a Proof . . . . . . . . . . . . . . . . . . . . . . . 254
3.2.5 Variable of a Proof . . . . . . . . . . . . . . . . . . . . . . 257
3.2.6 Specic Variable of a Proof . . . . . . . . . . . . . . . . . 263
3.2.7 Free Variable of a Proof . . . . . . . . . . . . . . . . . . . 265
3.2.8 Bound Variable of a Proof . . . . . . . . . . . . . . . . . . 269
3.2.9 Valid Substitution of Variable in Proof . . . . . . . . . . . 272
3.2.10 Valid Substitution of Axiom . . . . . . . . . . . . . . . . . 284
3.2.11 Valid Substitution of Totally Clean Proof . . . . . . . . . 287
3.3 Proof Modulo and Minimal Transform . . . . . . . . . . . . . . . 291
3.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 291
3.3.2 Valuation of Proof Modulo . . . . . . . . . . . . . . . . . 293
3.3.3 Clean Proof . . . . . . . . . . . . . . . . . . . . . . . . . . 304
3.3.4 Valid Substitution of Axiom Modulo . . . . . . . . . . . . 310
3.3.5 Valid Substitution of Clean Proof . . . . . . . . . . . . . . 319
3.3.6 Minimal Transform of Proof . . . . . . . . . . . . . . . . . 321
3.3.7 Minimal Transform of Clean Proof . . . . . . . . . . . . . 328
3.3.8 Minimal Transform and Valid Substitution . . . . . . . . 332
3.4 The Substitution Congruence for Proofs . . . . . . . . . . . . . . 336
3.4.1 The Substitution Congruence . . . . . . . . . . . . . . . . 336
3.4.2 Characterization of the Substitution Congruence . . . . . 339
3.4.3 Local Inversion of Substitution for Proofs . . . . . . . . . 347
3.4.4 Minimal Transform and Substitution Congruence . . . . . 354
3.4.5 Proof with Clean Minimal Transform . . . . . . . . . . . 360
3.4.6 Valuation Modulo and Substitution Congruence . . . . . 362
3.5 Essential Substitution for Proofs . . . . . . . . . . . . . . . . . . 363
3.5.1 Substitution Rank of Proof . . . . . . . . . . . . . . . . . 363
3.5.2 Existence of Essential Substitution . . . . . . . . . . . . . 377
3.5.3 Properties of Essential Substitution . . . . . . . . . . . . 384
3.5.4 Essential Substitution of Clean Proof . . . . . . . . . . . . 393
3.5.5 The Substitution Theorem . . . . . . . . . . . . . . . . . 395
3
4 The Gentzen LK-System 398
4.1 The Universal Algebra of Sequents . . . . . . . . . . . . . . . . . 398
4.1.1 The Set of Sequents . . . . . . . . . . . . . . . . . . . . . 398
4.1.2 Operators on the Set of Sequents . . . . . . . . . . . . . . 398
4.1.3 Operator Symbols on the Set of Sequents . . . . . . . . . 400
4.1.4 The Universal Algebra of Sequents . . . . . . . . . . . . . 401
4.2 The Free Universal Algebra of LK-Proofs . . . . . . . . . . . . . 401
4.2.1 LK-Proofs and Provable Sequents . . . . . . . . . . . . . 401
5 Semantics and Dual Space 403
5.1 Valuation and Semantics . . . . . . . . . . . . . . . . . . . . . . . 403
5.1.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 403
5.1.2 Valuation and Dual Space . . . . . . . . . . . . . . . . . . 404
5.1.3 Semantic Entailment and Valid Formula . . . . . . . . . . 409
5.1.4 Maximal Consistent Subset . . . . . . . . . . . . . . . . . 411
5.1.5 Lindenbaums Lemma . . . . . . . . . . . . . . . . . . . . 416
5.1.6 The Compactness Theorem . . . . . . . . . . . . . . . . . 418
5.1.7 Equivalence of Syntax and Semantics . . . . . . . . . . . . 419
5.1.8 Dual Characterization of the Deductive Congruence . . . 420
5.2 Elements of Classical Model Theory . . . . . . . . . . . . . . . . 422
5.2.1 Model and Variables Assignment . . . . . . . . . . . . . . 422
5.2.2 Model Valuation Function . . . . . . . . . . . . . . . . . . 423
5.2.3 The Relevance Lemma . . . . . . . . . . . . . . . . . . . . 427
5.2.4 The Substitution Lemma . . . . . . . . . . . . . . . . . . 429
5.2.5 The Soundness Theorem . . . . . . . . . . . . . . . . . . . 437
5.2.6 Regular Valuation . . . . . . . . . . . . . . . . . . . . . . 441
4
Chapter 1
Universal Algebra
1.1 Existence of Free Universal Algebra
1.1.1 Preliminaries
In the following, N = 0, 1, 2, . . . denotes the set of non-negative integers, while
N

= 1, 2, . . . is the set of positive integers. The integer 0 is also the empty


set , and for all n N

:
n = 0, 1, . . . , n 1
Recall that a map or function is a (possibly empty) set of ordered pairs f such
that y = y

whenever (x, y) f and (x, y

) f. The domain of f is the set:


dom(f) = x : y, (x, y) f
while the range of f is the set:
rng(f) = y : x, (x, y) f
The notation f : A B indicates that f is a map with domain A and whose
range is a subset of B. In particular, f is a subset of the cartesian product:
A B = (x, y) : x A, y B
The set of all maps f : A B is denoted B
A
. Note that B

= as the empty
set is the only map f : B. The formula B

= can equally be written


B
0
= 1 which makes it easy to remember, and holds even if B = 0. If A ,=
then
A
= as there is no map f : A . The formula
A
= can equally
be written 0
A
= 0 which is also easy to remember, but does not hold if A = 0.
There is nothing deep or interesting about the empty map : B, B

or
A
.
But they are often confusing and we should confront them at an early stage.
Regarding the cartesian product AB, it is very common to refer to it as A
2
whenever A = B. This may seem confusing at rst since A
2
is the set of maps
5
f : 0, 1 A, which is not the same thing as the set of ordered pairs (x, y) with
x, y A. However, it is natural to represent a map f : 0, 1 A by the ordered
pair (f(0), f(1)). If x = f(0) and y = f(1), then we are eectively representing
f = (0, x), (1, y) by the ordered pair (x, y). It makes communication easier.
In summary, when we refer to (x, y) as the element of AA, things are exactly
as they are. But if we are referring to (x, y) as an element of A
2
, what we really
mean is the map f = (0, x), (1, y), i.e the maps f : 0, 1 A such that
f(0) = x and f(1) = y. Note that the sets A and A
1
are also dierent.
1.1.2 Type of Universal Algebra
Denition 1 A Type of Universal Algebra is a map with rng() N.
Since the empty set is a map with empty range, it is also is a type of universal
algebra. When is an element of N
n
, it is a map : n N which can be
represented as the n-uple ((0), . . . , (n1)). If is a type of universal algebra,
its cardinal number represents the number of operators dened on any universal
algebra of type (which can be nite, innite, countable or uncountable) while
for all i dom(), (i) N represents the arity of the i-th operator so to speak,
i.e. the number of arguments it has. If X is a universal algebra of type , then
an operator of arity (i) is a map f : X
(i)
X. Hence an operator of arity 1
is a map f : X
1
X while an operator of arity 2 is a map f : X
2
X. What
may be more unusual is an operator or arity 0, namely a map f : X
0
X. Since
X
0
= 0, an operator of arity 0 is therefore a map f : 0 X, which is a
set f = (0, f(0)) containing the single ordered pair (0, f(0)). Fundamentally,
there is not much dierence between specifying an operator of arity 0 on X, and
the constant element f(0) of X. It is however convenient to allow the arity of an
operator to be 0, so we do not have to treat constants and operators separately.
Universal algebras may have constants, like an identity element in a group.
If G is a group with identity element e and product , it is possible to regard
G as a universal algebra with 3 operators. The identity operator f : 0 G is
dened by f(0) = e, the product operator f : G
2
G by f(x) = x(0)x(1) and
the inverse operator f : G
1
G by f(x) = x(0)
1
. The type of this universal
algebra could be : 3 N dened by (0) = 0, (1) = 2 and (2) = 1,
that is = (0, 0), (1, 2), (2, 1). Of course a group G has more properties
than a universal algebra of type because the operators need to satisfy certain
conditions. Hence a group is arguably a universal algebra of type , but the
converse is not true in general.
There is something slightly unsatisfactory about denition (1), namely the
fact that the arity of the operators are ordered. After all, a group is also a univer-
sal algebra of type

= (0, 0), (1, 1), (2, 2) or even

= (, 0), (+, 1), (@, 2)


provided , + and @ denote distinct sets.
Let be a type of universal algebra and f . Then f is an ordered pair
(i, (i)) for some i dom(). From now on, we shall write (f) rather than
(i). This notational convention is very convenient as we no longer need to
6
refer to dom(), or i dom(). If X is a universal algebra of type , we simply
require the existence of an operator T(f) : X
(f)
X for all f .
1.1.3 Denition of Universal Algebra
Denition 2 Let be a type of universal algebra. A Universal Algebra of type
is an ordered pair (X, T) where X is a set and T is a map with domain ,
such that for f we have T(f) : X
(f)
X.
In fact there is no need to keep writing T(f) or (X, T) when everything is clear
from the context. If X is a universal algebra of type , then for all f we
have the operator f : X
(f)
X, also denoted f. Of course, if X and Y are
two universal algebras of type , then f : X
(f)
X and f : Y
(f)
Y are
dierent sets in general. One is T(f) where (X, T) is the rst universal algebra
referred to as X, while the other is S(f) where (Y, S) is the second universal
algebra referred to as Y .
If is the empty set, then T is a map with empty domain, i.e. the empty
set itself. So (X, ) is a universal algebra of empty type for any set X, which
is not very dierent from the set X itself. A universal algebra of empty type is
therefore a set with no structure. If is not an empty type of universal algebra,
then T has a non-empty domain and therefore cannot be empty. So there is at
least one operator T(f) : X
(f)
X. If X is the empty set then we must have
(f) 1, as there is no map T(f) : 0 . In other words, there cannot be
any constant in an empty universal algebra, as we should expect. Furthermore,
any T(f) : X
(f)
X is in fact the empty map : . In summary, if
is a type of universal algebra with constants (i.e. (f) = 0 for some f ),
then no universal algebra of type can be empty, while if has no constant
(i.e. (f) 1 for all f ), then the only empty universal algebra of type
is (, T) where T(f) = for all f . So much for confronting the fear of the
empty set.
1.1.4 Homomorphism of Universal Algebra
Recall that if f and g are both maps then g f is the set of ordered pairs:
g f = (x, z) : y, (x, y) f and (y, z) g
It is easy to check that g f is also a map, namely that z = z

whenever both
(x, z) and (x, z

) are elements of g f.
Let g : A B be a map. Then for all x A, it is a well established
convention to denote g(x) the unique element of B such that (x, g(x)) g. Now
if n N, we can dene a map g
n
: A
n
B
n
by setting:
g
n
(x
0
, . . . , x
n1
) = (g(x
0
), . . . , g(x
n1
))
or more rigorously:
x A
n
, i n , g
n
(x)(i) = g(x(i))
7
Note that if n = 0, then g
0
: A
0
B
0
is the map g
0
: 0 0. There
is only one such map, dened by g
0
(0) = 0 or g
0
= (0, 0). If n = 1, then
g
1
: A
1
B
1
is the map dened by g
1
((0, x)) = (0, g(x)) for all x A.
Anyone being confused by this last statement should remember that (0, x) is
simply the map f : 0 A dened by f(0) = x which is an element of A
1
not
fundamentally dierent from the constant x A. Likewise (0, g(x)) is simply
the map f : 0 B dened by f(0) = g(x) which is an element of B
1
not
fundamentally dierent from g(x) B. From now, whenever it is clear from the
context that x A
n
rather than x A, we shall write g(x) rather than g
n
(x).
Denition 3 Let X and Y be universal algebras of type . We say that a map
g : X Y is a morphism or homomorphism, if and only if:
f , x X
(f)
, g f(x) = f g(x)
With stricter notations, given two universal algebras (X, T) and (Y, S) of type
, a map g : X Y is a morphism, if and only if:
f , x X
(f)
, g T(f)(x) = S(f) g
(f)
(x)
Suppose = (0, 0), (1, 2), (1, 1) and let X, Y be universal algebras of type .
For example, let us assume that X and Y are groups. We have three operators
1
T(0, 0) : 0 X, T(1, 2) : X
2
X and T(1, 1) : X
1
X. To simplify
notations, for all x, y X, let us dene e X, x y and x
1
as:
2
e = T(0, 0)(0) , x y = T(1, 2)(x, y) , x
1
= T(2, 1)(x)
Adopting similar conventions for the universal algebra Y , the rst condition
required for a map g : X Y to be a morphism can be expressed as:
g(e) = g T(0, 0)(0) = T(0, 0) g
0
(0) = T(0, 0)(0) = e
While the second is, for all x, y X:
g(xy) = gT(1, 2)(x, y) = T(1, 2)g
2
(x, y) = T(1, 2)(g(x), g(y)) = g(x)g(y)
Finally the last condition is, for all x X:
g(x
1
) = g T(2, 1)(x) = T(2, 1) g
1
(x) = T(2, 1) g(x) = g(x)
1
It follows that g : X Y is a morphism, if and only if for all x, y X:
g(e) = e , g(x y) = g(x) g(y) , g(x
1
) = g(x)
1
In particular, we see that a morphism (of universal algebra of type ) is no
dierent from group morphism whenever X and Y are groups.
1
T is a map with domain . The notation T(0, 0) is a shortcut for T((0, 0)).
2
Here is an example when (x, y) does not really mean (x, y) in x y = T(1, 2)(x, y).
Instead, (x, y) is a notational shortcut for the element f X
2
dened by f = {(0, x), (1, y)}.
Equivalently, f is the map f : 2 X dened by f(0) = x and f(1) = y. Similarly x does
not really mean x in T(2, 1)(x), but is rather a notational shortcut for the element f X
1
dened by f = {(0, x)}. Equivalently, f is the map f : 1 X dened by f(0) = x.
8
Proposition 4 Let X, Y and Z be universal algebras of type . If g : X Y
and g

: Y Z are morphisms, then g

g : X Z is also a morphism.
Proof
For all f and x X
(f)
we have g

g f(x) = g

f g(x) = f g

g(x).

1.1.5 Isomorphism of Universal Algebra


Recall that a map f is said to be injective or one-to-one if and only if x = x

whenever (x, y) f and (x

, y) f. When this is the case, the set:


f
1
= (y, x) : (x, y) f
is also a map. If f : A B then saying that f is injective can be expressed as:
x, x

A , f(x) = f(x

) x = x

However the domain of f


1
is the range of f or rng(f), which may not be
the whole set B. We say that f : A B is surjective or onto if and only if
rng(f) = B, or equivalently:
y B , x A , y = f(x)
Note that a map f : A B is simply a set of ordered pairs with no knowledge
of the set B. In other words, being given the set f will not give you the set B,
which can be any set with rng(f) B. Saying that f is surjective is therefore
not meaningful on its own, unless the set B is clear from the context.
We say that f : A B is bijective, or that it is a bijection or one-to-one
correspondence if and only if it is both injective and surjective. When this is
the case we have f
1
: B A, and it is also a bijective map.
Denition 5 Let X and Y be universal algebras of type . We say that a map
g : X Y is an isomorphism if and only if g is a bijective morphism. If there
exists an isomorphism g : X Y , we say that X and Y are isomorphic.
Proposition 6 Let X and Y be universal algebras of type . If g : X Y is
an isomorphism, then g
1
: Y X is also an isomorphism.
Proof
Suppose g : X Y is a morphism which is bijective. Then g
1
: Y X is
also bijective and we only need to check that it is a morphism. Let f and
y Y
(f)
. We need to check that g
1
f(y) = f g
1
(y), and since g is an
injective map, this is equivalent to:
g g
1
f(y) = g f g
1
(y) (1.1)
9
The l.h.s of (1.1) is clearly f(y) and since g : X Y is a morphism, the r.h.s
is f g g
1
(y) . So it remains to check that g g
1
(y) = y for all y Y
(f)
.
This follows immediately from:
g g
1
(y)(i) = g(g
1
(y)(i)) = g(g
1
(y(i))) = y(i)
which is true for all i (f).
1.1.6 Free Generator of Universal Algebra
Recall that if f is a map and A is a set, the restriction of f to A is the set:
f
|A
= (x, y) : x A and (x, y) f
This is also a set of ordered pairs such that y = y

whenever (x, y) f
|A
and
(x, y

) f
|A
. Hence the restriction f
|A
is also a map. Furthermore if f : A

B
and A A

, then f
|A
: A B.
Denition 7 Let X be a universal algebra of type . We call free generator
of X any subset X
0
X with the following property: for any universal algebra
Y of type , and for any map g
0
: X
0
Y , there exists a unique morphism
g : X Y such that g
|X0
= g
0
.
Consider the set X = N together with the successor operator s : N N
dened by s(n) = n + 1. Then X can be viewed as a universal algebra of
type = (0, 1) where T(0, 1) : X
1
X is dened as T(0, 1)(n) = n(0) + 1.
Furthermore the set X
0
= 0 is a free generator of X. To check this, let
Y be another universal algebra of type and assume that g
0
: X
0
Y is
an arbitrary map. We have to prove that g
0
can be uniquely extended into
a morphism g : X Y . This is another way of saying that there exists a
morphism g : X Y such that g
|X0
= g
0
, and that such morphism is unique.
Since X
0
= 0, we have g
0
: 0 Y . We dene e = g
0
(0) Y . Consider the
map g : X Y dened by the recursion g(0) = e and g(n+1) = sg(n), where
s : Y
1
Y is the successor operator on Y . To check that g is a morphism, we
need to check the single condition for all n X:
g s(n) = s g(n)
which is true from the very denition of g. Hence we have proved the existence
of a morphism g : X Y , and since g(0) = e = g
0
(0) we have g
|X0
= g
0
. It
remains to check that g is unique. Suppose that g

: X Y is another morphism
such that g

|X0
= g
0
, i.e. g

(0) = e. In particular g

(0) = g(0). Furthermore,


since g

is a morphism, if g

(n) = g(n) for some n X, then:


g

(n + 1) = s g

(n) = s g(n) = g(n + 1)


This proves by induction that g

(n) = g(n) for all n X, i.e. that g

= g.
10
We have proved that 0 is a free generator of N viewed as a universal
algebra with a single successor operator. Note that the empty set alone would
be too small to be a free generator of N. If g
0
: Y is a map then g
0
has
to be the empty set, and it is not dicult to extend g
0
into a full morphism
g : X Y whenever Y is not empty. Indeed pick an arbitrary e Y and dene
g by the same recursion as before, g(0) = e and g(n+1) = sg(n). We obtain a
morphism such that g
|X0
= = g
0
. However, such morphism will not be unique
as there are in general many ways to choose the element e Y . This shows that
the empty set is not a free generator of N viewed as a universal algebra with a
single successor operator. In fact, we could have made the argument simpler:
consider Y = which is a universal algebra of type by setting T(f) = for
all f
3
. Consider the map g
0
: Y which is just the empty map. Then it
is impossible to extend g
0
to a full morphism g : X Y , for the simple reason
that there cannot exist any map g : X since X is not empty.
We now claim that X
0
= 0, 1 is too large to be a free generator of N. In
the case of X
0
= , we failed to obtain uniqueness when Y had more than one
element, and we failed to obtain existence for Y = . In the case of X
0
= 0, 1,
we will fail to obtain existence. Indeed consider Y = N and g
0
: X
0
Y dened
by g
0
(0) = 2 and g
0
(1) = 4. Then no morphism g : X Y can extend g
0
, as
otherwise:
4 = g
0
(1) = g(1) = g(0 + 1) = g(0) + 1 = g
0
(0) + 1 = 3
So we have seen that X
0
= is too small, while X
0
= 0, 1 is too large
to be a free generator of N. This last statement is in fact misleading as size is
not everything. Consider X
0
= 1 with Y = N and g
0
: X
0
Y dened by
g
0
(1) = 0. Then no morphism g : X Y can extend g
0
as otherwise:
0 = g
0
(1) = g(1) = g(0 + 1) = g(0) + 1
contradicting the fact that g(0) N. So X
0
= 1 is not a free generator of N
viewed as a universal algebra with successor operator.
Suppose we now regard N as a universal algebra of type = (0, 1), (1, 0)
by adding to the successor operator the operator T(1, 0) : 0 N dened by
T(1, 0)(0) = 0. In eect, we are giving N additional structure by singling out the
element 0. We now claim that the empty set X
0
= is a free generator of N. So
let Y be a universal algebra of type and consider a map g
0
: X
0
Y . Since
X
0
= , g
0
is necessarily the empty map g
0
= . We need to show that g
0
can be
uniquely extended into a morphism g : X Y . Since Y is a universal algebra of
type , we have an operator T(1, 0) : 0 Y . Dene e = T(1, 0)(0) Y and
consider g : X Y dened by the recursion g(0) = e and g(n + 1) = s g(n).
Then g is a morphism as it satises the two conditions:
g T(1, 0)(0) = g(0) = e = T(1, 0)(0) = T(1, 0) g
0
(0)
3
We have T(f) : Y
(f)
Y since (f) 1 for all f . The empty set Y = is a
universal algebra of type only when has no constant, i.e. no operator with arity 0.
11
and:
g T(0, 1)(n) = g s(n) = g(n + 1) = s g(n) = T(0, 1) g
1
(n)
Furthermore, g
|X0
= = g
0
, and since any other morphism g

: X Y would
need to satisfy g

(0) = e and g

(n +1) = s g

(n), a simple induction argument


shows that g is in fact unique. So we have proved that X
0
= is a free generator
of N viewed as a universal algebra of type .
Before we end this section, let us confront our fear of the empty set once
more. We have just seen that it is possible for X
0
= to be a free generator of a
universal algebra which is not empty. But such universal algebra had a constant
0. Suppose now that is a type of universal algebra without constant, i.e. such
that (f) 1 for all f . Then the empty set X = is a universal algebra of
type , provided the map T of domain is dened as T(f) = for all f .
Then T(f) is indeed a map T(f) : X
(f)
X, and it is the only possible choice.
We claim that X is the only universal algebra of type which can have X
0
=
as a free generator. So rst we show that X
0
= is indeed a free generator of
X. Suppose Y is a universal algebra of type and let g
0
: X
0
Y be a map.
Then g
0
is necessarily the empty map. Dene g = g
0
= . Then g : X Y and
in fact g is a morphism since the conditions required f , x X
(f)
. . .
are vacuously satised, as X
(f)
= whenever X = and (f) 1. We clearly
have g
|X0
= = g
0
. Furthermore, the morphism g = is unique, since it is the
only possible map g : X Y . So we have proved that X
0
= is a free generator
of X = viewed as a universal algebra of type . We now show that X = is
the only universal algebra of type which can have X
0
= as a free generator.
So suppose X
0
= is a free generator of some universal algebra X of type .
Let Y = be viewed as a universal algebra of type with the obvious structure
T. Consider the empty map g
0
: X
0
Y . Since X
0
is a free generator, g
0
can
be extended into a morphism g : X Y . But since Y is empty, there exists no
map g : X Y , unless X is also empty. So we have proved that X = .
1.1.7 Free Universal Algebra
Denition 8 A Universal Algebra is said to be free if it has a free generator.
We have already seen a few examples of free universal algebras. The set N
with the successor operator s : N
1
N , or the set N with both the successor
operator and the constant 0. We have seen that is also a free universal algebra
of type , whenever has no constant. Along the same lines, any set X is a
free universal algebra of empty type. Indeed take X
0
= X and suppose Y is
any set while g
0
: X
0
Y is a map. There exists a unique map g : X Y
such that g
|X0
= g
0
, namely g
0
itself. Furthermore, the conditions required for
g : X Y to be a morphism are vacuously satised when = . Hence we see
that X
0
= X is a free generator of X, as a universal algebra of empty type.
It would be wrong to think that most universal algebras are free. Before
we proceed, let us nd a simple example of universal algebra which is not free.
The empty set is not a universal algebra of type unless has no constant.
12
When this is the case the empty set is free. So let us try a slightly bigger set,
namely X = 0. If is the empty type, then X is free, with itself as a free
generator. Suppose = (0, 0). Then the only possible structure on X is
given by T(0, 0) : 0 X dened by T(0, 0)(0) = 0. In other words, there
is only one possible choice of constant within X. Unfortunately X is still free,
with the empty set as a free generator. For if Y is another universal algebra
of type = (0, 0) and g
0
: Y is a map, there exists a unique morphism
g : X Y which extends g
0
, namely the map g dened by g(0) = 0, where
the r.h.s 0 refer to the constant of Y . So we need to choose a slightly more
complex structure on X. Consider = (0, 1). Since X = 0, there is only
one possible choice of operator T(0, 1) : X
1
X, namely T(0, 1)(0) = 0. With
this particular structure, X becomes a universal algebra of the same type as
N with a single successor operator s : N
1
N. We claim that X is not free.
If it was free, there would exist X
0
X which is a free generator of X. The
only possible values for X
0
are X
0
= and X
0
= 0. Now X
0
= cannot be
a free generator of X. As we have already seen, since has no constant, the
empty set is the only universal algebra of type which can have an empty free
generator. Suppose now that X
0
= 0 is a free generator of X. We shall arrive
at a contradiction. Consider the map g
0
: X
0
N dened by g
0
(0) = 0. Since
X
0
is a free generator of X, there exists a unique morphism g : X N such
that g
|X0
= g
0
. In fact, we must have g = g
0
since X
0
= X. But g = g
0
is not
a morphism. Indeed, If g was a morphism, we would have:
0 = g(0) = g T(0, 1)(0) = s g(0) = 0 + 1 = 1
So we have found a simple example of universal algebra which is not free, namely
X = 0 of type = (0, 1) with the only possible operator T(0, 1) : X
1
X.
Of course, having one single example of universal algebra which fails to be
free is little evidence so far. The following proposition shows that free universal
algebras are in fact pretty rare. There is essentially at most one free universal
algebra of type , for a given cardinality of free generator.
Proposition 9 Suppose X and Y are two free universal algebras of type with
free generators X
0
and Y
0
respectively. If there exists a bijection j : X
0
Y
0
,
then X and Y are isomorphic.
Proof
Since Y
0
Y we also have j : X
0
Y . Since X
0
is a free generator of X,
there exists a unique morphism g : X Y such that g
|X0
= j. We claim
that g is in fact an isomorphism. To show this, we only need to prove that
g : X Y is bijective. We shall do so by nding a map g

: Y X such that
g

g = id
X
and g g

= id
Y
where id
X
: X X and id
Y
: Y Y are the
identity mappings. Since j : X
0
Y
0
is a bijection, we have j
1
: Y
0
X
0
.
In particular, j
1
: Y
0
X and since Y
0
is a free generator of Y there exists
a unique morphism g

: Y X such that g

|Y0
= j
1
. We will complete the
proof by showing that g

g = id
X
and g g

= id
Y
. Since both g : X Y and
13
g

: Y X are morphisms, g

g : X X is also a morphism. Furthermore:


(g

g)
|X0
= g

(g
|X0
) = g

j = g

|Y0
j = j
1
j = id
X0
It follows that g

g : X X is a morphism which extends id


X0
: X
0
X.
Since X
0
is a free generator of X, such morphism is unique. As id
X
: X X is
also a morphism such that (id
X
)
|X0
= id
X0
we conclude that g

g = id
X
. We
prove similarly that g g

= id
Y
.
1.1.8 Construction of Free Universal Algebra
In this section, we explicitly construct a free universal algebra of type whose
free generator is a copy of a given set X
0
. In a later section, we shall prove the
existence of a free universal algebra whose generator is the set X
0
itself. Recall
that if A and B are sets, the dierence A B is dened as:
A B = x : x A and x , B
This notion will be used while proving the following proposition:
Proposition 10 Let be a type of universal algebra and X
0
be a set. Dene:
Y
0
= (0, x) : x X
0
, Y
n+1
= Y
n


Y
n
, n N
with:

Y
n
=
_
(1, (f, x)) : f , x Y
(f)
n
_
Let:
Y =
+
_
n=0
Y
n
and T be the map with domain dened by setting T(f) : Y
(f)
Y as:
T(f)(x) = (1, (f, x))
Then (Y, T) is a free universal algebra of type with free generator Y
0
.
Proof
First we need to check that we indeed have T(f) : Y
(f)
Y for all f .
It is clear that T(f) is a well dened map with domain Y
(f)
. However given
x Y
(f)
, we need to check that T(f)(x) Y . In order to do so, it is sucient
to show the existence of n N such that x Y
(f)
n
as this will imply that:
T(f)(x) = (1, (f, x))

Y
n
Y
n+1
Y
If (f) = 0 then Y
(f)
n
= Y
(f)
= 1 for all n N and in particular x Y
(f)
n
.
So we assume that (f) 1. Since x : (f) Y , given i (f) we have
x(i) Y . So there exists n
i
N such that x(i) Y
ni
. If we dene:
n = max(n
0
, . . . , n
(f)1
)
14
since Y
ni
Y
n
for all i (f), we see that x(i) Y
n
for all i (f). It follows
that x : (f) Y
n
, i.e. x Y
(f)
n
and we have proved that T(f) : Y
(f)
Y
for all f as required. It follows that Y is a set and T is a map with domain
such that T(f) : Y
(f)
Y for all f . So (Y, T) is a universal algebra
of type . It remains to show that (Y, T) is free with Y
0
as a free generator.
So let (Z, S) be a universal algebra of type and g
0
: Y
0
Z be a map. We
need to show that g
0
can be uniquely extended into a morphism g : Y Z.
We dene g by setting g
|Y0
= g
0
and g
|(Yn+1)
= g
n+1
where each g
n
is a map
g
n
: Y
n
Z and the sequence (g
n
)
nN
is dened by recursion with the property
that (g
n+1
)
|Yn
= g
n
for all n N. Specically, we dene g
n+1
(y) = g
n
(y) for
all y Y
n
, and given y Y
n+1
Y
n


Y
n
, we consider f and x Y
(f)
n
such that y = (1, (f, x)). Note that such representation of y exists and is clearly
unique. We then dene g
n+1
(y) by setting:
g
n+1
(y) = S(f)(g
(f)
n
(x)) (1.2)
Recall that g
(f)
n
: Y
(f)
n
Z
(f)
is the map dened by g
(f)
n
(x)(i) = g
n
(x(i))
for all i (f). Since S(f) : Z
(f)
Z it follows that g
n+1
(y) as given by
equation (1.2) is a well-dened element of Z. This completes our recursion and
we have a sequence of maps g
n
: Y
n
Z such that (g
n+1
)
|Yn
= g
n
for all n N.
From this last property, it follows that g : Y Z is itself well-dened by setting
g
|Yn
= g
n
for all n N. So we have map g : Y Z such that g
|Y0
= g
0
. We
shall now check that g is a morphism. So let f and x Y
(f)
. We need to
check that gT(f)(x) = S(f)g
(f)
(x). Since x Y
(f)
we have already shown
the existence of n N such that x Y
(f)
n
. In fact, let us pick n to be the
smallest of such integers. By denition we have T(f)(x) = (1, (f, x)). From this
equality and x Y
(f)
n
, it follows that T(f)(x) is an element of

Y
n
Y
n+1
. We
claim that T(f)(x) is in fact an element of Y
n+1
Y
n
. So we need to show that
T(f)(x) , Y
n
. Suppose to the contrary that T(f)(x) Y
n
. Let k N denote
the smallest integer such that T(f)(x) Y
k
. Note that k n. If k = 0 we
obtain (1, (f, x)) = (0, x

) for some x

X
0
which is a contradiction. So k 1
and Y
k
= Y
k1


Y
k1
. From the minimality of k we have T(f)(x) , Y
k1
. It
follows that T(f)(x)

Y
k1
. So there exist f

and x

Y
(f)
k1
such that:
(1, (f, x)) = T(f)(x) = (1, (f

, x

))
From this we see that x = x

Y
(f)
k1
. Since k n, we have k 1 < n and
x Y
(f)
k1
contradicts the minimality of n. So we have shown that T(f)(x) Y
n
leads to a contradiction, and it follows that T(f)(x) Y
n+1
Y
n
. Thus:
g T(f)(x) = g
n+1
(T(f)(x)) = S(f)(g
(f)
n
(x)) = S(f) g
(f)
(x)
where the last equality follows from g
n
= g
|Yn
and x Y
(f)
n
. This completes
our proof of the fact that g : Y Z is a morphism. It remains to check that
g is unique. So let g

: Y Z be another morphism such that g

|Y0
= g
0
. We
15
shall prove by induction that g
|Yn
= g

|Yn
for all n N. Since g
|Y0
= g
0
= g

|Y0
,
this is clearly true for n = 0. So we assume that n N and y Y
n+1
. We need
to show that g(y) = g

(y). From the induction hypothesis, the equality is true


for y Y
n
. So we may assume that y Y
n+1
Y
n


Y
n
. In particular, there
exist f and x Y
(f)
n
such that y = (1, (f, x)). It follows that y = T(f)(x)
and nally:
g(y) = g T(f)(x) = S(f) g
(f)
(x) = S(f) g
(f)
(x) = g

T(f)(x) = g

(y)
where the third equality follows from the induction hypothesis and x Y
(f)
n
.
1.1.9 Cantors Theorem
Given an arbitrary set X
0
, we were able to construct a free universal algebra of
type whose free generator Y
0
is a copy of X
0
, i.e. a set for which there exists
a bijection j : X
0
Y
0
. Our aim is to go slightly beyond that and nd a free
universal algebra of type with free generator the set X
0
itself. For this, we
shall need a couple of set theoretic results which we include in this section and
the following for the sake of completeness. In particular we shall need to prove
that any set A has a disjoint copy of itself, i.e. that there exist a set B with
A B = and a bijection j : A B. To prove this result, we shall make use
of Cantors Theorem which is the focus of the present section.
Given a set A, recall that T(A) denotes the power set of A, i.e. the set of
all subsets of A, that is:
T(A) = B : B A
Note that there is an obvious bijection j : 2
A
T(A), which is dened by
j(x) = a A : x(a) = 1. Cantors Theorem asserts that T(A) has higher
cardinality than the set A, i.e. that there exists no injective map j : T(A) A.
Most of us will be familiar with this result, and indeed other similar results such
as the fact that R has higher cardinality than N. Some of us may remember a
version of Cantors diagonal argument. We restrict our attention to the interval
]0, 1[ and consider every x ]0, 1[ as a sequence x = 0.11010111001 . . . This
sequence is not unique but after some ironing out, we may be convinced that
]0, 1[ has the same cardinality as 2
N
which in turn has the same cardinality
as T(N). Hence Cantors diagonal argument essentially proves that T(N) is a
bigger set than N, and R having a higher cardinality than N is essentially a
particular case of Cantors Theorem.
There is an easy way to remember the proof of Cantors Theorem, which
is to think of Russells paradox. Bertrand Russell discovered that x : x , x
cannot be a set, or more precisely that the statement:
yx[x y x , x]
leads to a contradiction. If a = x : x , x is a set, then a a implies that a , a,
while a , a implies that a a. Now suppose we have a bijection j : T(A) A
16
(we assume this is a bijection rather than an injection to make this discussion
simpler). Then every element of A can be viewed as a subset of A. It is tempting
to consider the set of those elements of A which do not belong to themselves, but
rather than considering the meaning of literally as in a = x A : x , x,
we need to adjust slightly with a = j(x A : x , j
1
(x)) A. Following
Russells argument, we then ask whether a A belongs to itself, or specically
if a j
1
(a). But assuming a j
1
(a) leads to a , j
1
(a), while assuming
a , j
1
(a) leads to a j
1
(a). So we have proved that there cannot exist a
bijection j : T(A) A. The following lemma deals with an injection rather
than a bijection. We need to be slightly more careful, but the idea underlying
the proof is the same.
Lemma 11 (Cantors Theorem) Let A be an arbitrary set. There exists no
injective map j : T(A) A.
Proof
Suppose j : T(A) A is an injective map and dene b

= j(B

) where:
B

= b A : B T(A) , b = j(B) and b , B


We shall complete the proof by showing that both b

and b

, B

lead
to a contradiction. So suppose rst that b

. There exists B T(A) such


that b

= j(B) and b

, B. Hence in particular we have j(B

) = b

= j(B) and
since j is an injective map, it follows that B

= B. So we see that b

, B

which
contradicts the initial assumption of b

. We now assume that b

, B

.
Since b

= j(B

), taking B = B

we see that there exists B T(A) such that


b

= j(B) and b

, B. Hence it follows that b

which contradicts the


initial assumption of b

, B

.
1.1.10 Disjoint Copy of a Set
In this section we prove that any set A has a disjoint copy of itself, namely
that there exists a set B with A B = and a bijection j : A B. We shall
prove this result using Cantors Theorem (lemma (11)) as well as the Axiom of
Choice. There are probably many other ways to prove it, some of which not
involving the Axiom of Choice. It is in fact unknown to me whether this can be
proved within ZF rather than ZFC.
4
Given a set A, recall that the Axiom of Choice states the existence of a
choice function on A, i.e. the existence of a map k : T(A) A such that
k(x) x for all x T(A) . A choice function on A is simply a map which
chooses an arbitrary element from every non-empty subset of A.
Lemma 12 Let A be an arbitrary set. There exists a set B with AB = and
a bijective map j : A B.
4
For those interested, there is a thread on sci.math entitled [Set Theory] A set has a
disjoint copy of itself (which may have the answer to the ZF question). I am grateful to
those who contributed to this thread. The proof of lemma (12) is due to Robin Chapman.
17
Proof
For all x T(A) dene A
x
= (x, y) : y A. There is a clear bijection between
A and A
x
, namely j
x
: A A
x
dened by j
x
(y) = (x, y). We shall complete
the proof by taking B = A
x
for some x T(A) for which AA
x
= . Of course
we need to check that it is always possible to do so. Suppose to the contrary
that A A
x
,= for all x T(A). Using the Axiom of Choice, there exists
a choice function k : T(A) A. We dene j : T(A) A by setting
j(x) = k(AA
x
). Note that j(x) is well dened since AA
x
,= . We obtain a
map j : T(A) A such that j(x) AA
x
for all x T(A). It is now sucient
to show that j is injective, as this will contradict Cantors Theorem. So suppose
x, x

T(A) are such that j(x) = j(x

). Since j(x) A
x
there exists y A
such that j(x) = (x, y). Similarly, there exists y

A such that j(x

) = (x

, y

).
From j(x) = j(x

) we obtain (x, y) = (x

, y

) and it follows in particular that


x = x

. So we have proved that j : T(A) A is indeed an injective map.


Given a set X
0
, our goal is to obtain a free universal algebra of type with
free generator the set X
0
itself. Dening Y
0
= (0, x) : x X
0
, we were
able to construct a free universal algebra Y with free generator Y
0
, as described
in proposition (10). So we have an injection j : X
0
Y , where Y is a free
universal algebra of type . In fact, this injection is a bijection between X
0
and Y
0
, the free generator of Y . In order for us to construct a free universal
algebra with free generator X
0
, all we need is a copy of Y containing X
0
. In
other words we need a set X with X
0
X and a bijection g : X Y . Once
we have this set X and the bijection g : X Y , we can easily carry over the
structure of universal algebra from Y onto X, by choosing the structure on X
which turns the bijection g : X Y into an isomorphism. However, we want
X
0
to be a free generator of X. One way to ensure this is to request that the
bijection g : X Y carries X
0
onto Y
0
. In other words, we want the restriction
g
|X0
to coincide with the bijection j : X
0
Y
0
. The following lemma shows
the existence of the set X with the bijection g : X Y satisfying the required
property g
|X0
= j.
Note that the idea of starting from an injection X
0
Y and obtaining a
copy of Y containing X
0
while preserving this injection can be used in many
other cases. Most of us are familiar with the way the ring of integers Z is cre-
ated as a quotient set (N N)/ with the appropriate equivalence relation.
We obtain an embedding j
1
: N Z. We then construct the eld Q of rational
numbers and we obtain another embedding j
2
: Z Q. We go on by construct-
ing the eld R of real numbers and the eld C of complex numbers with their
associated embedding j
3
: Q R and j
4
: R C. Strictly speaking, the set
theoretic inclusions N Z Q R C are false and most living mathemati-
cians do not care. The inclusion N Z has to be true. If it is not, it needs to be
re-interpreted in a way which makes it true. In computing terms, mathemati-
cians are eectively overloading the operators and . This is all very nice
and ne in practice. But assuming we wanted to code or compile a high level
mathematical statement into a low level expression of rst order predicate logic,
it may be that dening Z, Q, R and C in a way which avoids the overloading of
18
primitive symbols, will make our life a lot easier. We believe the aim of creating
a formal high level mathematical language whose statements could be compiled
as formulas of rst order logic is a worthy objective. Such high level language
would most likely need to allow the overloading of symbols. So it may be that
whether N Z is literally true or not will not matter. There is a fascinating
website [45] on http://www.metamath.org/ created by Norman Megill where
these questions are likely to have been answered. It is certainly our intention
to use this reference more and more as this document progresses.
Recall that if f : A B is a map and A

A, then f(A

) is the set:
f(A

) = y : x , x A

and (x, y) f
which coincides with rng(f
|A
). In particular, f(A) = rng(f). The notation
f(A

) is a case of overloading. The meaning of f(x) must be derived from the


context, and depends on whether x is viewed as an element of A or a subset
of A. It can easily be both, as 0 is both an element and a subset of 1. Some
authors will use the notation f[A

] rather than f(A

).
Lemma 13 Let X
0
and Y be sets and j : X
0
Y be an injective map. There
exist a set X and a bijection g : X Y such that X
0
X and g
|X0
= j.
Proof
Consider the set A = X
0
Y . Using lemma (12) there exists a set B with
A B = and a bijective map i : A B. Dene:
X = X
0
i(Y j(X
0
))
and g : X Y by setting g(x) = j(x) for all x X
0
and g(x) = i
1
(x) for
all x i(Y j(X
0
)). Note rst that g(x) is well dened since x X
0
and
x i(Y j(X
0
)) cannot occur at the same time, as X
0
A, i(Y j(X
0
)) B
and A B = . Note also that g(x) is dened for all x X. To show that
we have g : X Y we need to check that g(x) Y for all x X. If x X
0
then this is clear since j : X
0
Y and g(x) = j(x). If x i(Y j(X
0
)),
then x can be written as x = i(y) for some y Y j(X
0
). It follows that
g(x) = i
1
(x) = y Y . So we have proved that g : X Y . We obviously
have X
0
X and g
|X0
= j. So we shall complete the proof by showing that
g : X Y is a bijection. First we show that it is surjective. So let y Y . We
need to show that y = g(x) for some x X. If y j(X
0
) then y = j(x) for
some x X
0
and in particular y = g(x). Otherwise y Y j(X
0
) and taking
x = i(y) i(Y j(X
0
)) we see that y = i
1
(x) and in particular y = g(x). So
we have proved that g is surjective. To show that it is injective, let x, x

X
and assume that g(x) = g(x

). We need to show that x = x

. Note rst that


we cannot have x X
0
while x

i(Y j(X
0
)). If this was the case, we would
have g(x) = j(x) j(X
0
) while g(x

) = i
1
(x

) Y j(X
0
) contradicting the
fact that g(x) = g(x

) as one lies in j(X


0
) while the other does not. Similarly,
we cannot have x i(Y j(X
0
)) while x

X
0
. It follows that either both x
and x

lie in X
0
or they both lie in i(Y j(X
0
)). In the rst case we obtain
19
j(x) = g(x) = g(x

) = j(x

) and since j : X
0
Y is injective we see that
x = x

. In the second case, we obtain i


1
(x) = g(x) = g(x

) = i
1
(x

) and
since i
1
: B A is injective we see that x = x

. So we have proved that g is


injective, and nally g : X Y is a bijection.
1.1.11 Main Existence Theorem
Given a set X
0
, we are now in a position to prove the existence of a free uni-
versal algebra of type with X
0
as free generator. As already hinted in the
previous section, the proof essentially runs in three stages. We start using
proposition (10) to construct a free universal algebra Y with free generator
Y
0
= (0, x) : x X
0
. We then use lemma (13) to obtain a copy X of Y
and a bijection g : X Y which preserves the bijection j : X
0
Y
0
. We
nally dene on X the unique structure of universal algebra of type turning
g : X Y into an isomorphism.
Theorem 1 Let be a type of universal algebra and X
0
be an arbitrary set.
There exists a free universal algebra of type with free generator X
0
. Such
universal algebra is unique up to isomorphism.
Proof
Uniqueness up to isomorphism is a direct consequence of proposition (9). So
we only need to prove the existence. If we dene Y
0
= (0, x) : x X
0
,
from proposition (10) we have a free universal algebra Y with free generator Y
0
.
Furthermore, the map j : X
0
Y dened by j(x) = (0, x) for all x X
0
is an
injective map. Using lemma (13) there exist a set X and a bijection g : X Y
such that X
0
X and g
|X0
= j. At this stage, the set X is only a set, not a
universal algebra of type . However, using the bijection g : X Y we can
easily carry over the structure from Y onto X. Specically, we dene a map T
with domain by setting T(f) : X
(f)
X to be dened by:
x X
(f)
, T(f)(x) = g
1
f g(x) (1.3)
where it is understood that f on the r.h.s. of this equation is a shortcut for
the operator f : Y
(f)
Y and g refers to g
(f)
: X
(f)
Y
(f)
, while
g
1
is simply the inverse g
1
: Y X. So T(f) as dened by equation (1.3)
is indeed a map T(f) : X
(f)
X. It follows that (X, T) is now a universal
algebra of type and X
0
X. We shall complete the proof of this theorem by
showing that X
0
is a free generator of X. Note that this is hardly surprising
since j : X
0
Y
0
is a bijection which coincide with the bijection g : X Y
on X
0
, and we know that Y
0
is a free generator of Y . In order to prove this
formally, let Z be a universal algebra of type and h
0
: X
0
Z be a map. We
need to show that h
0
can be uniquely extended into a morphism h : X Z.
Before we do so, note that from equation (1.3) we have for all x X
(f)
:
g T(f)(x) = f g(x)
20
which shows that g : X Y is not only a bijection, but also a morphism, i.e. g
is an isomorphism. Now from h
0
: X
0
Z we obtain a map h
0
j
1
: Y
0
Z.
Since Y
0
is a free generator of Y , there exists a morphism h

: Y Z such that
h

|Y0
= h
0
j
1
. It follows that h : X Z dened by h = h

g is a morphism,
and furthermore, we have:
h
|X0
= h

g
|X0
= h

j = h

|Y0
j = h
0
j
1
j = h
0
So we have proved the existence of a morphism h : X Z such that h
|X0
= h
0
.
It remains to check that h : X Z is in fact unique. So suppose h

now refers
to a morphism h

: X Z such that h

|X0
= h
0
. Then h g
1
: Y Z and
h

g
1
: Y Z are two morphisms which coincide on Y
0
, since:
(h g
1
)
|Y0
= h
|X0
j
1
= h
0
j
1
= h

|X0
j
1
= (h

g
1
)
|Y0
It follows that h g
1
= h

g
1
, from the uniqueness property of Y
0
being a
free generator of Y . We conclude that h = h

.
1.2 Structural Induction, Structural Recursion
1.2.1 Unique Representation in Free Universal Algebra
In an earlier section, we claimed that most universal algebras were not free. The
following theorem arguably provides the most convincing evidence of this fact.
Let X be a group viewed as a universal algebra of type = (0, 0), (1, 1), (2, 2),
where the product T(2, 2) : X
2
X is denoted . Then the following theorem
shows that X cannot be a free universal algebra of type unless:
x
1
y
1
= x
2
y
2
x
1
= x
2
and y
1
= y
2
for all x
1
, x
2
, y
1
and y
2
X. This is a pretty strong condition. In fact, we can
easily show that X can never be a free universal algebra of type . Indeed, let
e = T(0, 0)(0) be the identity element of X. Then e = e e, or equivalently:
T(0, 0)(0) = T(2, 2)(e, e)
The following theorem shows this cannot be true in a free universal algebra.
Theorem 2 Let X be a free universal algebra of type with free generator
X
0
X . Then for all y X, one and only one of the following is the case:
(i) y X
0
(ii) y = f(x) , f , x X
(f)
Furthermore, the representation (ii) is unique, that is:
f(x) = f

(x

) f = f

and x = x

for all f, f

, x X
(f)
and x

X
(f

)
.
21
Proof
We shall rst prove the theorem for a particular case of free universal algebra of
type . Instead of looking at X itself, we shall rst consider the free universal
algebra Y with free generator Y
0
= (0, x) : x X
0
as described in proposi-
tion (10). So let y Y and suppose that y , Y
0
. There exists n N such that
y Y
n+1
. Let n denote the smallest of such integers. Then y Y
n+1
Y
n


Y
n
.
Hence, there exist f and x Y
(f)
n
such that y = (1, (f, x)). However by
denition, we have f(x) = (1, (f, x)). Since Y
(f)
n
Y
(f)
, we have found f
and x Y
(f)
such that y = f(x). This shows that (ii) is satised whenever
(i) is not. We shall now prove that (i) and (ii) cannot occur simultaneously.
Indeed if y = f(x) for some f and x Y
(f)
, then y = (1, (f, x)). If y is
also an element of Y
0
then y = (0, x

) for some x

X
0
, which is a contradic-
tion as 0 ,= 1. So we have proved that one and one only of (i) and (ii) must
occur. It remains to show that the representation (ii) is unique. So suppose
f(x) = f

(x

) for some f, f

, x Y
(f)
and x

Y
(f

)
. Then we have
(1, (f, x)) = (1, (f

, x

)) and it follows immediately that f = f

and x = x

. So
we have proved the theorem in the case of the free universal algebra Y . We shall
now prove the theorem for X. Consider the bijection j : X
0
Y
0
dened by
j(x) = (0, x). Then we also have j : X
0
Y and since X
0
is a free generator of
X, the map j can be uniquely extended into a morphism g : X Y . Similarly,
the map j
1
: Y
0
X can be uniquely extended into a morphism g

: Y X.
Furthermore, since g

g : X X is a morphism such that:


(g

g)
|X0
= g

j = g

|Y0
j = j
1
j = (id
X
)
|X0
we conclude by uniqueness that g

g = id
X
and similarly g g

= id
Y
. This
shows that g : X Y and g

: Y X are in fact isomorphisms which are


inverse of each other. Of course we already knew that X and Y were isomorphic
from the existence of the bijection j : X
0
Y
0
. But we now have a particular
isomorphism g : X Y which is such that g
|X0
= j. This should allow us
to complete the proof of the theorem. So suppose y X and y , X
0
. Let
y

= g(y) Y . Since the theorem is true for Y , either (i) or (ii) is true for y

.
We claim that y

Y
0
is impossible, as otherwise:
y = g

g(y) = g

(y

) = g

|Y0
(y

) = j
1
(y

) X
0
which contradicts the assumption y , X
0
. It follows that (ii) is true for y

, i.e.
y

= f(x

) for some f and x

Y
(f)
. Taking x = g

(x

) X
(f)
, since
g

: Y X is a morphism we obtain:
y = g

(y

) = g

f(x

) = f g

(x

) = f(x)
which shows that (ii) is true for y. We now prove that (i) and (ii) cannot
occur simultaneously. Suppose that y X
0
and y = f(x) for some f and
x X
(f)
. Taking y

= g(y) Y and x

= g(x) Y
(f)
, we obtain:
y

= g(y) = g
|X0
(y) = j(y) Y
0
22
and furthermore:
y

= g(y) = g f(x) = f g(x) = f(x

)
which shows that (i) and (ii) are simultaneously true for y

Y , which is a
contradiction. To complete the proof of the theorem, it remains to show that
the representation (ii) is unique. So suppose f(x) = f

(x

) for some f, f

,
x X
(f)
and x

X
(f

)
. We obtain:
f(g(x)) = f g(x) = g f(x) = g f

(x

) = f

(g(x

))
From the uniqueness of the representation (ii) in Y , it follows that f = f

and
g(x) = g(x

) Y
(f)
. Hence, for all i (f):
g(x(i)) = g(x)(i) = g(x

)(i) = g(x

(i))
and since g : X Y is injective, we conclude that x(i) = x

(i). This being true


for all i (f), we have proved that x = x

.
Proposition 14 A free universal algebra of type has a unique free generator.
Proof
Let X be a free universal algebra of type . Suppose X
0
X and Y
0
X
are both free generators of X. We need to show that X
0
= Y
0
. First we show
that X
0
Y
0
. So let y X
0
. Applying theorem (2) of page 21 to X with free
generator Y
0
we see that if y , Y
0
there exists f and x X
(f)
such that
y = f(x). But this contradicts theorem (2) applied to X with free generator
X
0
since y X
0
, and (i) and (ii) cannot occur simultaneously. it follows that
y Y
0
and we have proved that X
0
Y
0
. The reverse inclusion is proved in a
similar fashion.
1.2.2 Order in Free Universal Algebra
In this section, we dene the notion of order on a free universal algebra. This
notion will prove useful on several occasions, and specically when proving in
proposition (28) that a free generator is also a generator. Loosely speaking, given
a universal algebra X of type , the order on X is a map : X N representing
the degree of complexity of the elements of X, viewed as expressions in a formal
language. For instance suppose X is of type = (0, 2) with one single binary
operator : X
2
X and free generator X
0
= 0. We have (0) = 0 while
(00) = 1 and (0(00)) = 2. The complexity of a formula is sometimes a
useful tool to carry out induction arguments over N. For those already familiar
with the subject, we do not dene the order : X N by structural recursion,
e.g. (0) = 0 and (x y) = 1 + max((x), (y)). Instead, we shall embed
N with the appropriate structure of universal algebra of type , and consider
: X N as the unique morphism such that
|X0
= 0. In eect, we are limiting
23
ourselves to denitions by recursions over N, which is a standard mathematical
argument already used in proposition (10). We shall not make use of denitions
by structural recursion on free universal algebras, until the principle has been
justied in a later section.
Denition 15 Let be a type of universal algebra. We call default structure
of type on N the map T with domain such that T(f) : N
(f)
N is:
n N
(f)
, T(f)(n) = 1 + maxn(i) : i (f)
for all f , where it is understood that max() = 0.
Denition 16 Let X be a free universal algebra of type with free generator
X
0
X. We call order on X the unique morphism : X N such that

|X0
= 0, where N is embedded with its default structure of type . Given
x X, we call order of x the integer (x) N where is the order on X.
Proposition 17 Let X be a free universal algebra of type with free generator
X
0
X and : X N be the order on X. Then for all x X we have:
(x) = 0 x X
0
Furthermore, for all f and x X
(f)
we have:
(f(x)) = 1 + max(x(i)) : i (f)
Proof
First we show the equality. Suppose f and x X
(f)
. Let T denote the
default structure of type on N as per denition (15). Since : X N is a
morphism, we have:
(f(x)) = T(f)((x))
= 1 + max(x)(i) : i (f)
= 1 + max(x(i)) : i (f)
We now show the equivalence. Since
|X0
= 0 from denition (16), it is clear
that y X
0
implies (y) = 0. Suppose conversely that y X and (y) = 0.
We need to show that y X
0
. Suppose to the contrary that y , X
0
. From
theorem (2) of page 21 there exist f and x X
(f)
such that y = f(x). So:
(y) = (f(x)) = 1 + max(x(i)) : i (f) 1
which contradicts the initial assumption of (y) = 0.
Suppose X = N is the free universal algebra of type = (0, 1) with the
single successor operator s : X
1
X and free generator X
0
= 0. Then
(0) = 0 while (1) = (s(0)) = 1 +(0) = 1. If we now regard X = N as the
24
free universal algebra of type = (0, 1), (1, 0) by adding the constant 0, then
X now has X
0
= as free generator. Furthermore:
(0) = (T(1, 0)(0)) = 1 +max() = 1
More generally, the order of constants in a free universal algebra is 1 and not 0.
Only the elements of the free generator have order 0.
We shall now complete this section with a small proposition on the structure
of a free universal algebra in relation to its order mapping. Note that the
denition of X
0
below is consistent with the existing notation X
0
referring to
the free generator of X, since we have (x) = 0 x X
0
.
Proposition 18 Let X be a free universal algebra of type with free generator
X
0
X. Let : X N be the order on X and dene:
X
n
= x X : (x) n , n N
Then for all n N we have:
X
n+1
= X
n
f(x) : f , x (X
n
)
(f)

Proof
First we show the inclusion . It is clear that X
n
X
n+1
. So let f and
x (X
n
)
(f)
. We need to show that f(x) X
n+1
that is (f(x)) n+1 which
follows from proposition (17) and:
(f(x)) = 1 + max(x(i)) : i (f) 1 +n
where the inequality stems from (x(i)) n for all i (f). We now prove
the inclusion . So suppose y X
n+1
X
n
. Then we must have (y) = n + 1
and in particular (y) ,= 0. From proposition (17) it follows that y , X
0
and consequently using theorem (2) of page 21 we see that y can be uniquely
represented as y = f(x) for some f and x X
(f)
. It remains to show that
x is actually an element of (X
n
)
(f)
, i.e. that x(i) X
n
for all i (f) which
is vacuously true in the case when (f) = 0. So let i (f). We have:
1 +(x(i)) 1 + max(x(i)) : i (f) = (f(x)) = (y) = n + 1
from which we conclude that (x(i)) n as requested.
1.2.3 Universal Sub-Algebra of Universal Algebra
In this section we introduce the notion of universal sub-algebras. The main
motivation for this will appear later when we investigate equivalence relations
and congruences on universal algebras. An equivalence relation on a universal
algebra X is a subset of XX with certain closure properties, and can therefore
be regarded as a universal sub-algebra of X X, provided the latter has been
embedded with the appropriate structure of universal algebra. Rather, than
focussing on the specics of X X, we will study universal sub-algebras in a
general setting.
25
Denition 19 Let X be a universal algebra of type . We say that Y X is a
Universal Sub-Algebra of X if and only if it is closed under every operator i.e.
f , x X
(f)
, [ x Y
(f)
f(x) Y ]
If Y is a universal sub-algebra of X, we call induced structure on Y the map T
with domain such that for all f , the operator T(f) : Y
(f)
Y is:
x Y
(f)
, T(f)(x) = f(x) Y
Let X be a universal algebra of type and Y be a universal sub-algebra of X.
Given f , when using loose notations we have two operators f : X
(f)
X
and f : Y
(f)
Y . However, the latter is simply the restriction of the former
to the smaller domain Y
(f)
X
(f)
, and the notation f(x) is therefore
unambiguous whenever x is an element of Y
(f)
. The best example of universal
sub-algebras are possibly homomorphic images of universal algebras. We have:
Proposition 20 Let h : X Y be a homomorphism between two universal
algebras X and Y of type . Then the image h(X) is a sub-algebra of Y .
Proof
Let f and y h(X)
(f)
. We need to show that f(y) h(X). How-
ever, for all i (f) we have y(i) h(X). Hence there exists x
i
X such
that y(i) = h(x
i
). Let x X
(f)
be dened by x(i) = x
i
for all i (f).
Then we have y(i) = h(x
i
) = h(x(i)) = h(x)(i) for all i (f) and con-
sequently y = h(x). Since h : X Y is a homomorphism, it follows that
f(y) = f h(x) = h f(x) = h(x

) with x

= f(x). So we have found x

X
such that f(y) = h(x

) and we conclude that f(y) h(X).


Restriction of homomorphisms to universal sub-algebras are homomorphisms.
We had to say this once to avoid future logical aws in forthcoming arguments.
For example, if h : X Y is a morphism and X
0
is a sub-algebra of X, we
would like to use proposition (20) to argue that h(X
0
) is a sub-algebra of Y .
This is indeed the case, but we need to argue that h
|X0
: X
0
Y is a morphism.
Proposition 21 Let h : X Y be a morphism between two universal algebras
of type . Given a sub-algebra X
0
X, the restriction h
|X0
is a morphism.
Proof
Given f and x X
(f)
0
we need to check the equality h f(x) = f h(x).
This follows from the fact that every operator f : X
(f)
0
X
0
is the restriction
of f : X
(f)
X and h : X
0
Y is the restriction of h : X Y .
Let / be a non-empty set. Recall that / is the set dened by:
/ = x : Z / , x Z
When / = , the set / is normally not dened, as there is no such things as
the set of all sets. Suppose now that / is a non-empty set of subsets of a given
26
set X. Since / is non-empty, it has an element Z / which by assumption is
a subset of X. So every element x Z is also an element of X and it follows
that / could equally have been dened as:
/ = x X : Z / , x Z
This last expression is now meaningful when / = , and yields = X. Hence
we shall adopt the convention that / = X whenever / = is viewed as a set
of subsets of X, and X is clearly given from the context.
Proposition 22 Let X be a universal algebra of type and / be a set of
universal sub-algebras of X. Then / is a universal sub-algebra of X.
Proof
Let Y = /. We need to show that Y is a universal sub-algebra of X. So let
f and x Y
(f)
. We need to show that f(x) Y . So let Z /. We need
to show that f(x) Z. But Y Z and x Y
(f)
. So x Z
(f)
. Since Z is a
universal sub-algebra of X, we conclude that f(x) Z.
Denition 23 Let X be a universal algebra of type and X
0
X. Let / be
the set of universal sub-algebras of X dened by:
/ = Y : X
0
Y and Y universal sub-algebra of X
We call Universal Sub-Algebra of X generated by X
0
the universal sub-algebra
of X denoted X
0
and dened by X
0
= /.
Proposition 24 Let X be a universal algebra of type and X
0
X. Then, the
universal sub-algebra X
0
generated by X
0
is the smallest universal sub-algebra
of X containing X
0
, i.e. such that X
0
X
0
.
Proof
Dene / = Y : X
0
Y and Y universal sub-algebra of X. From deni-
tion (23), we have X
0
= /. We need to show that X
0
is a universal sub-
algebra of X which contains X
0
, and that it is the smallest universal sub-algebra
of X with such property. Since every element of / is a universal sub-algebra of
X, from proposition (22), X
0
= / is also a universal sub-algebra of X. To
show that X
0
X
0
assume that x X
0
. We need to show that x /. So
let Y /. We need to show that x Y . But this is clear since X
0
Y . So we
have proved that X
0
is a universal sub-algebra of X containing X
0
. Suppose
that Y is another universal sub-algebra of X such that X
0
Y . We need to
show that X
0
is smaller than Y , that is X
0
Y . But Y is clearly an element
of /. Hence if x X
0
= /, it follows immediately that x Y . This shows
that X
0
Y .
Suppose X and Y are universal algebras of type and g : X Y is a
morphism. Given X
0
X, the following proposition asserts that the direct
27
image g(X
0
) of the universal sub-algebra of X generated by X
0
is also the
universal sub-algebra g(X
0
) of Y generated by the direct image g(X
0
). This
proposition will be used for example when proving proposition (452), stating
that the provable sequents of the Gentzen LK-system are in fact the sequents
belonging to the sub-algebra generated by the elementary sequents .
Proposition 25 Let X, Y be universal algebras of type and g : X Y be a
morphism. For every subset X
0
X we have the following equality:
g(X
0
) = g(X
0
)
Proof
First we show : by virtue of proposition (24), g(X
0
) is the smallest sub-
algebra of Y containing the direct image g(X
0
). In order to prove , it is there-
fore sucient to show that g(X
0
) is a sub-algebra of Y with g(X
0
) g(X
0
).
This last inclusion follows immediately from X
0
X
0
. Being a homomorphic
image of the sub-algebra X
0
, the fact that g(X
0
) is a sub-algebra of Y follows
from proposition (20). So we now prove . Consider the subset X
1
dened by:
X
1
= x X
0
: g(x) g(X
0
)
In order to prove , it is sucient to prove that X
0
X
1
. Thus we need
to show that X
1
is a sub-algebra of X with X
0
X
1
. The fact that X
0
X
1
follows from the inclusions X
0
X
0
and g(X
0
) g(X
0
). So it remains to
show that X
1
is a sub-algebra of X. Consider f and x X
(f)
1
. We need
to show that f(x) X
1
. However, from X
1
X
0
we obtain x X
0

(f)
and it follows that f(x) X
0
since X
0
is a sub-algebra of X. In order
to show that f(x) X
1
, it remains to prove that g f(x) g(X
0
). Hav-
ing assumed that g : X Y is a morphism, this amounts to showing that
f g(x) g(X
0
). However, since g(X
0
) is a sub-algebra of Y , in order to
show that f g(x) g(X
0
) we simply need to prove that g(x) g(X
0
)
(f)
.
Thus given i (f), we need to show that g(x)(i) = g(x(i)) g(X
0
). It is
therefore sucient to prove that x(i) X
1
which follows from x X
(f)
1
.
Let X be a universal algebra of type and X
0
X. The universal sub-
algebra X
0
generated by X
0
was dened in terms of an intersection /. This
may be viewed as a denition from above: we start from universal sub-algebras
which are larger than X
0
, including X itself, and we reduce these universal
sub-algebras by taking the intersection between them, until we arrive at X
0
.
The following proposition may be viewed as a denition from below. We start
from the smallest possible set, which is the generator X
0
, and we gradually add
all the elements which should belong to X
0
in view of its closure properties.
Note that the denition from below makes use of a recursion principle, whereas
the denition from above does not.
Proposition 26 Let X be a universal algebra of type and X
0
X. Dene:
X
n+1
= X
n


X
n
, n N
28
with:

X
n
=
_
f(x) : f , x X
(f)
n
_
Then:
X
0
=
+
_
n=0
X
n
Proof
Let Y =
nN
X
n
. We need to show that Y = X
0
. First we show that
Y X
0
. It is sucient to prove by induction that X
n
X
0
for all n N.
Since X
0
X
0
, this is obviously true for n = 0. Suppose we have X
n
X
0
.
We claim that X
n+1
X
0
. It is sucient to prove that

X
n
X
0
. So
let f and x X
(f)
n
. We have to show that f(x) X
0
. But since
X
n
X
0
, we have X
(f)
n
X
0

(f)
. It follows that x X
0

(f)
. But
X
0
is a universal sub-algebra of X, and we conclude that f(x) X
0
. This
completes our induction argument, and we have proved that Y X
0
. We
now show that X
0
Y . Since X
0
Y , from proposition (24), X
0
being the
smallest universal sub-algebra containing X
0
, it is sucient to show that Y is
also a universal sub-algebra of X. So let f and x Y
(f)
. We need to
show that f(x) Y . It is sucient to show that x X
(f)
n
for some n N, as
this will imply f(x)

X
n
X
n+1
Y . If (f) = 0 then X
(f)
n
= 1 = Y
(f)
for all n N and x X
(f)
n
for all n N. Suppose (f) 1. Given i (f),
since x Y
(f)
we have x(i) Y . So there exists n
i
N such that x(i) X
ni
.
Dene n N by:
n = max(n
0
, . . . , n
(f)1
)
Since X
ni
X
n
, we have x(i) X
n
for all i (f). So x X
(f)
n
.
1.2.4 Generator of Universal Algebra
In this section, we prove that a free generator is also a generator. This is one
case where considering the order : X N on a free universal algebra X can
be seen to be very useful.
Denition 27 Let X be a universal algebra of type and X
0
X. We say
that X
0
is a generator of X, if and only if X
0
= X.
Proposition 28 Let X be a universal algebra of type and X
0
X. If X
0
is
a free generator of X, then it is also a generator of X.
Proof
Suppose X
0
is a free generator of X. We want to show that X
0
is a generator
of X, i.e. that X
0
= X. Suppose to the contrary that there exists y X such
that y , X
0
. Let : X N be the order on the free universal algebra X and
consider the set:
A = (y) : y X , y , X
0

29
Then A is a non-empty subset of N which therefore has a smallest element.
So assume that y , X
0
corresponds to the smallest element of A. Since
X
0
X
0
, it is clear that y , X
0
. Since X is a free universal algebra, from
theorem (2) of page 21 there exist f and x X
(f)
such that y = f(x). It
follows from proposition (17) that:
(y) = (f(x)) = 1 + max(x(i)) : i (f)
In particular, we see that (x(i)) < (y) for all i (f). From the minimality
of (y) it follows that x(i) X
0
for all i (f). So x X
0

(f)
. Note that
this conclusion holds even if (f) = 0, and the argument presented is perfectly
valid in that case. Now since X
0
is a universal sub-algebra of X, we conclude
from x X
0

(f)
that f(x) X
0
. So we have proved that y X
0
which
contradicts the initial assumption.
1.2.5 Proof by Structural Induction
We are all familiar with the notion of proof by induction over N. Let be a
formula of rst order logic and u be a variable. In order to prove that [n/u]
is true for all n N, a proof by induction over N consists in proving rst that
[0/u] is true, and then proving the implication [n/u] [n + 1/u] for all
n N. A proof by induction over N is legitimate as the following is a theorem:
[ [0/u] (n N , [n/u] [n + 1/u]) ] (n N , [n/u]) (1.4)
Hence for every formula of rst order logic and every variable u, we have
a corresponding theorem (1.4). This correspondence may be called a theorem
schema, in the same way that ZFC has a few axiom schema. If we denote
Th[, u] the theorem obtained in (1.4) from the formula and the variable u,
then the statement uTh[, u] cannot be represented as a formula of rst order
logic, and is therefore not a theorem. It may be called a meta-theorem, which
is a part of meta-mathematics. It is said that a meta-theorem can also become
a theorem, but only as part of a wider formal system. There is something
fundamentally disturbing about meta-theorems: most of us do not understand
them. There is always a high level of discomfort when referring to formulas of
rst order logic and variables without having dened any of those terms. It is
also problematic to use notations such as [n/u] ( with n in place of u) which
is most likely not always meaningful: some form of restriction should probably
be imposed on the variables u and n, unless maybe is viewed as an equivalence
class rather than a string. The truth is we do not really know.
At the same time, there is something fascinating and magical about meta-
mathematics: one of our favorite meta-theorems states that given a property
[u], if there exists an ordinal such that [], then there exists a smallest
ordinal with such property. This is powerful. This is not something we want to
give up. And yet we hardly understand it, and it is not something we can prove.
Suppose we had a compiler which would transform a high level mathematical
30
statement into a formula of rst order logic; suppose we had dened the notion of
mathematical proof as a nite sequence of low level formulas satisfying certain
properties. Then a meta-theorem would not be compiled and would not be
proved. But we could use its meta-proof to teach our compiler to automatically
generate the appropriate fragment of code needed to construct a proof of Th[, u]
from the formula . There is light at the end of the tunnel.
In this section, we shall make little mention of the ordinals so as to keep as
wide an audience as possible. As it turns out, every n N is an ordinal and the
set N itself is an ordinal (also denoted ), but what really matters is the fact
already used before that every non-empty subset of N has a smallest element.
Now suppose we want to prove theorem (1.4) given a formula and a variable
u: we assume the left-hand-side of (1.4) is true. We then consider the set:
A = n N : [n/u]
and we need to show that A = . Note that the existence of the set A is a direct
consequence of the Axiom Schema of Comprehension which makes the formula:
NA[n(n A (n N) ([n/u]))]
an axiom of ZFC. Now suppose A ,= . Then A is a non-empty subset of N
which therefore has a smallest element, say n N. Since [0/u] is true, we must
have n ,= 0. So n 1 and n 1 N. From the minimality of n, it follows that
[n 1/u] is true. From the implication [n 1/u] [n/u] we conclude that
[n/u] is also true, contradicting the fact that n A. So we have shown that
A = , which completes the proof of theorem (1.4).
Because our proof was somehow parameterized with the formula , we could
call it a meta-proof. So we have just produced a piece of meta-mathematics.
However, this could have been avoided: when dealing with mathematical induc-
tion over N, there is no need to consider a formula of rst order predicate logic.
Attempting to prove that a property is true for all n N is not fundamentally
dierent from proving that a particular subset of N, namely the subset on which
the property is true coincide with the whole of N. So let Y N be a subset
of N. In order to prove that Y = N, it is sucient to show that 0 Y and
furthermore that (n Y ) (n + 1 Y ) for all n N. Indeed, the following is
a theorem:
Y N , [ (0 Y ) n((n Y ) (n + 1 Y )) ] (Y = N) (1.5)
No more meta-mathematics, no more guilt. We are back to the solid grounds
of standard mathematical arguments. We may be wrong and deluded in our
beliefs, but we certainly lose the awareness of it. The proof of theorem (1.5)
is essentially the same but without the disturbing reference to the formula :
suppose the complement of Y in N is a non empty-set. It has a smallest element
n which cannot be 0. From n1 Y we obtain n Y which is a contradiction.
We can now resume our study of universal algebras and deal with the topic
of proof by structural induction. Let X be a universal algebra of type and
31
suppose X
0
X is a generator of X. In order to prove that a property holds
for all y X, it is sucient to prove that the property is true for all y X
0
and
furthermore given f and x X
(f)
, that the property is also true for f(x)
whenever it is true for all x(i)s with i (f). Just like in the case of induction
over N, it would be possible to consider a formula of rst order predicate logic,
and indulge in meta-mathematics. Instead, we shall safely quote:
Theorem 3 Let X be a universal algebra of type . Let X
0
X be a generator
of X. Suppose Y X is a subset of X with the following properties:
(i) x X
0
x Y
(ii) (i (f) , x(i) Y ) f(x) Y
where (ii) holds for all f and x X
(f)
. Then Y = X.
Proof
We need to show that Y = X. From (i) we see immediately that X
0
Y . Since
X
0
is a generator of X, we have X
0
= X. Consequently, it is sucient to prove
that Y is a universal sub-algebra of X, as this would imply that X
0
Y , and
nally X Y . In order to show that Y is a universal sub-algebra of X, consider
f and x Y
(f)
. We need to prove that f(x) Y . From (ii), it is sucient
to show that x(i) Y for all i (f). This follows from x Y
(f)
.
Note that theorem (3) does not require X to be a free universal algebra, but
simply that it should have a generator X
0
X. Furthermore, if f is such
that (f) = 0, then (ii) reduces to f(0) Y . In other words, in order to prove
that a property holds for every element of a universal algebra with constants,
the least we can do is check that every constant has the required property.
Theorem (3) will be used over and over again in these notes. A proof by
structural induction is pretty much all we have in many cases and it is no wonder
that it should be used at all times. However, there will be cases when a simple
structural induction is not enough for our purpose. This is usually the case when
a property depends on more than one single variable and we need to design an
argument which we may regard as a case of double induction. Most of us are
familiar with this situation when attempting to prove a property P(n, m) which
is a binary predicate of n, m N. We shall now deal with a case which will
be very important when attempting to prove the cut elimination theorem of
Gentzen sequent calculus. This theorem essentially asserts that the cut rule:

1

1
,
2
,
2

1
,
2

1
,
2
does not need to be used in order to establish the provability of a sequent .
As we shall see, the provable sequents (without cuts) are exactly those belonging
to the sub-algebra S
0
(V ) of the algebra of sequents S(V ) generated by the
set of elementary sequents S
0
(V ) = : P(V ). The cut elimination
theorem is simply a closure property. Given P(V ), consider the operator:
s
1
s
2
= (s
1
) (s
2
) (s
1
) (s
2
)
32
The cut elimination theorem asserts that the sub-algebra S
0
(V ) is closed under
the binary cut operator associated with the formula P(V ). In other
words, if s
1
, s
2
S(V ) are provable sequents (without cuts) then so is the
sequent s
1
s
2
. Adding the cut rule does not increase the set of provable
sequents. For those who are not yet familiar with Gentzen sequent calculus, let
us rephrase the issue on a more abstract level. Let X be a universal algebra of
type , which represents our algebra of sequents S(V ) of denition (448). The
operator symbols f corresponds to the rules of the Gentzen LK-system,
excluding the cut rule. Consider an arbitrary subset X
0
X and its generated
sub-algebra X
0
. Then X
0
corresponds to the set of elementary sequents S
0
(V )
and X
0
corresponds to the sub-algebra of provable sequents. Given a binary
operator : X
2
X, we shall need to prove that X
0
is closed under :
x
1
X
0
, x
2
X
0
x
1
x
2
X
0

This is the essence of the cut elimination theorem and motivates the following:
Proposition 29 Let X be a universal algebra of type and X
0
X generating
the sub-algebra X
0
. Let be a binary operator on X with the properties:
(i) x
1
X
0
, x
2
X
0
x
1
x
2
X
0

(ii) x
1
X
0
, x
2
X
0

(g)
x
1
g(x
2
) X
0

(iii) x
1
X
0

(f)
, x
2
X
0
f(x
1
) x
2
X
0

(iv) x
1
X
0

(f)
, x
2
X
0

(g)
f(x
1
) g(x
2
) X
0

where (ii), (iii) and (iv) hold for all f, g . Then X


0
is closed under :
x
1
X
0
, x
2
X
0
x
1
x
2
X
0

Proof
We shall prove this proposition with several induction arguments. First we need:
x
1
X
0
, y
2
X
0
x
1
y
2
X
0
(1.6)
We should easily be able to prove this with a structural induction argument
using properties (i) and (ii). In these notes, most induction arguments will be
carried out via a simple reference to theorem (3) of page 32 above. However,
our universal algebra X is not generated by X
0
so we cannot apply theorem (3)
directly to X. We would need to apply the theorem to the sub-algebra X
0

with the induced structure. To avoid any possible logical aws, or any sense of
unease or suspicion arising from this, we shall refrain from using theorem (3)
altogether. After all, the theorem is not deep and we do not need it. Consider:
Y = y
2
X
0
: x
1
X
0
, x
1
y
2
X
0

In order to show that (1.6) is true, it is sucient to prove that X
0
Y .
From proposition (24), the sub-algebra X
0
being the smallest sub-algebra of
33
X containing X
0
, it is sucient to show that Y is a sub-algebra of X with
X
0
Y . The fact that X
0
Y follows immediately from X
0
X
0
and
property (i). So it remains to show that Y is a sub-algebra of X. So let g
and suppose that x
2
Y
(g)
. By virtue of denition (19) we simply need to
show that g(x
2
) Y . However, since Y X
0
we have x
2
X
0

(g)
, and it
follows that g(x
2
) X
0
since X
0
is a sub-algebra of X. In order to show that
g(x
2
) Y , it remains to prove that x
1
g(x
2
) X
0
for all x
1
X
0
. However,
this follows immediately from property (ii) and the fact that x
2
X
0

(g)
.
This completes our proof of property (1.6). We shall use another induction for:
x
1
X
0

(f)
, y
2
X
0
f(x
1
) y
2
X
0
(1.7)
for all f . Here again, this should not be too dicult using properties (iii)
and (iv). So assume f is given and dene the following subset of X:
Y = y
2
X
0
: x
1
X
0

(f)
, f(x
1
) y
2
X
0

In order to show that (1.7) is true, it is sucient to prove that X
0
Y . So
we need to show that Y is a sub-algebra of X and X
0
Y . The latter follows
immediately from X
0
X
0
and property (iii). So it remains to show that
Y is a sub-algebra of X. So let g and suppose that x
2
Y
(g)
. We need
to show that g(x
2
) Y . However, since Y X
0
we have x
2
X
0

(g)
,
and it follows that g(x
2
) X
0
since X
0
is a sub-algebra of X. In order
to show that g(x
2
) Y , it remains to prove that f(x
1
) g(x
2
) X
0
for all
x
1
X
0

(f)
. However, this follows immediately from property (iv) and the
fact that x
2
X
0

(g)
. This completes our proof of property (1.7). We are
now ready to perform our last induction argument so as to prove:
y
1
X
0
, y
2
X
0
y
1
y
2
X
0
(1.8)
Once again, we consider a subset Y of X dened as follows:
Y = y
1
X
0
: y
2
X
0
, y
1
y
2
X
0

In order to prove property (1.8) and complete the proof of this proposition, it is
sucient to prove that X
0
Y . So we need to show that Y is a sub-algebra
of X with X
0
Y . The fact that X
0
Y follows from X
0
X
0
and prop-
erty (1.6). So it remains to show that Y is a sub-algebra of X. So let f
and suppose that x
1
Y
(f)
. We need to show that f(x
1
) Y . However, since
Y X
0
we have x
1
X
0

(f)
and it follows that f(x
1
) X
0
since X
0
is
a sub-algebra of X. In order to show that f(x
1
) Y , it remains to prove that
f(x
1
) y
2
X
0
for all y
2
X
0
. However, this follows immediately from
property (1.7) and the fact that x
1
X
0

(f)
. This completes our proof.
1.2.6 Denition by Recursion over N
Our aim is to introduce the idea of denition by structural recursion on a free
universal algebra of type . We shall deal with the subject in a later section.
34
In this section, we shall concentrate on a simpler notion, that of denition by
recursion over N, also known as denition by simple recursion. Consider the
function g : N N dened by g(0) = 0 and:
g(n + 1) = (n + 1). g(n) , n N
The existence of the function g is often taken for granted. Most of us needed
many years as mathematical students before it nally occurred that the truth
of the following statement was far from obvious:
g[ (g : N N) (g(0) = 1) (n N , g(n + 1) = (n + 1). g(n)) ]
Loosely speaking, dening a map g by recursion consists in rst dening g(0),
and having dened g(0), g(1),. . . g(n 1), to then specify what g(n) is as a
function of all the preceding values g(0), g(1),. . . , g(n 1). More formally,
suppose A is a set and we wish to dene a map g : N A by recursion over
N. We rst dene g(0) A, and then dene g(n) as a function of g
|n
. In other
words, we set g(n) = h(g
|n
) for some function h. Incidentally, the function h is
called an oracle function, as it enables us to predict what the next value of g
is, given all the preceding values. Of course, we want everything to make sense:
so h(g
|n
) should always be meaningful, and since our objective is to obtain
g : N A in which case g
|n
A
n
for all n N, we should probably request
that the oracle function be a map h :
nN
A
n
A. We could imagine an
oracle function with a wider domain and a wider range (as long as h(g
|n
) is a
well dened element of A), but this is not likely to add much to the generality
of the present discussion. So we shall stick to h :
nN
A
n
A. One thing to
note is that the expression g(n) = h(g
|n
) is also meaningful in the case of n = 0.
It is therefore unnecessary to dene g(0) separately: just set h() appropriately
and request that g(n) = h(g
|n
) for all n N rather than just n N

. In the
above example of g(n) = n! and A = N, the oracle function h :
nN
N
n
N
should be dened by setting h() = 1 and for all n N

and g : n N:
h(g) = n.g(n 1)
Note that this denition is legitimate since N
n
N
n

= for n ,= n

.
We are now faced with a crucial question, namely that of the truth of the
following statement, given a set A and an oracle function h :
nN
A
n
A.
g[ (g : N A) (n N , g(n) = h(g
|n
) ]
This will be proved in the next lemma, together with the uniqueness of the func-
tion g : N A. However before we proceed, a few comments are in order. The
proof of lemma (30) below may appear daunting and tedious, as we systemati-
cally check a variety of small technical details. However, there is not much to it.
The key to remember is that a recursion over N is nothing but an induction over
N. Specically, for all n N, there must exist a map g : n N with the right
property, for if there exists an n N for which this is not the case, choosing n
35
to be the smallest of such integer, we can extend the map g : (n1) A using
g(n) = h(g
|n
) and obtain a map g

: n A with the right property, thereby


reaching a contradiction. Once the induction argument is complete, proving the
existence of a map g : N A is easily done by gathering together all the ordered
pairs contained in the various maps g

: n A for n N, which incidentally


are extensions of one another.
Lemma 30 Let A be a set and h :
nN
A
n
A be a map. There exists a
unique map g : N A such that g(n) = h(g
|n
) for all n N.
Proof
First we prove the uniqueness property. Suppose g, g

: N A are maps such


that g(n) = h(g
|n
) and g

(n) = h(g

|n
) for all n N. We need to show that
g = g

, i.e. that g(n) = g

(n) for all n N. One way to achieve this is to prove


that g
|n
= g

|n
for all n N, for which we shall use an induction argument.
Since g
|0
= = g

|0
the property is true for n = 0. Suppose it is true for n N.
Then g
|n
= g

|n
and consequently g(n) = h(g
|n
) = h(g

|n
) = g

(n). Hence:
g
|(n+1)
= g
|n
(n, g(n)) = g

|n
(n, g

(n)) = g

|(n+1)
and we see that the property is true for n + 1. This completes the proof of
the uniqueness property. In order to prove the existence of g : N A with
g(n) = h(g
|n
) for all n N, we shall rst restrict our attention to nite domains
and prove by induction the existence for all n N of a map g : n A such
that g(k) = h(g
|k
) for all k n. This is clearly true for n = 0, since the empty
set is a map g : 0 A for which the condition g(k) = h(g
|k
) for all k 0 is
vacuously satised. So we assume n N, and the existence of a map g : n A
such that g(k) = h(g
|k
) for all k n. We need to show the existence of a
map g

: (n + 1) A such that g

(k) = h(g

|k
) for all k n + 1. Consider
the map g

: (n + 1) A dened by g

|n
= g and g

(n) = h(g), and suppose


k n + 1. We need to check that g

(k) = h(g

|k
). If k = n, we need to check
that g

(n) = h(g

|n
) = h(g) which is true by denition of g

. If k n, then:
g

(k) = g

|n
(k) = g(k) = h(g
|k
) = h(g

|k
)
This completes our induction argument and for all n N we have proved the
existence of a map g : n A such that g(k) = h(g
|k
) for all k n. In fact we
claim that such a map is unique. For if g

: n A is another such map, an


identical induction argument to the one already used shows that g(k) = g

(k)
for all k n. Hence, for all n N, we have proved the existence of a unique
map g
n
: n A such that g
n
(k) = h((g
n
)
|k
) for all k n. Note that from the
uniqueness property we must have (g
n+1
)
|n
= g
n
for all n N. So every g
n+1
is an extension of g
n
. We are now in a position to prove the existence of a map
g : N A such that g(n) = h(g
|n
) for all n N, by considering g =
nN
g
n
.
First we show that g is indeed a map. It is clearly a set of ordered pairs. Suppose
(x, y) g and (x, y

) g. We need to show that y = y

. However, there exist


36
n, n

N such that (x, y) g


n
and (x, y

) g
n
. Without loss of generality, we
may assume that n n

. From the above uniqueness property, we must have


(g
n
)
|n
= g
n
. From (x, y) g
n
we obtain x n and consequently:
y

= g
n
(x) = (g
n
)
|n
(x) = g
n
(x) = y
So we have proved that g is indeed a map. It is clear that dom(g) = N and
rng(g) A. So g is a map g : N A. It remains to check that g(n) = h(g
|n
)
for all n N. So suppose n N. Let k n + 1. Since (k, g
n+1
(k)) g
n+1
g,
we obtain g
n+1
(k) = g(k). It follows in particular that (g
n+1
)
|n
= g
|n
and
g
n+1
(n) = g(n). Since g
n+1
(k) = h((g
n+1
)
|k
) for all k n+1 we conclude that:
g(n) = g
n+1
(n) = h((g
n+1
)
|n
) = h(g
|n
)

There is however one thing to note: the proof of lemma (30) tacitly makes
use of the Axiom Schema of Replacement without any explicit mention of the
fact. This is one other thing we usually take for granted. We have avoided any
mention of the Axiom Schema of Replacement to keep the proof leaner, thereby
following standard mathematical practice. However, considering the low level
nature of the result being proved, we feel it is probably benecial to say a few
words about it, or else the whole proof could be construed as a cheat. At some
point, we considered the set g =
nN
g
n
which is really a notational shortcut
for g = B where B = g

: n N , g

= g
n
. Recall that given a set z, the
set z is dened as:
z = x : y , (x y) (y z)
The existence of this set is guaranteed by the Axiom of Union. So it would seem
that dening g =
nN
g
n
is a simple application of the Axiom of Union and
of course it is. But there is more to the story: we need to justify the fact that
B is indeed a set and we have not done so. If we knew that G : N
nN
A
n
dened by G(n) = g
n
was a map, then B would simply be the range of G, i.e.
B = rng(G). Showing that B is a set would still require the Axiom Schema of
Replacement (as far as we can tell), but it would not be controversial to remain
silent about it. But why is G not a map? Why cant we dene G : N
nN
A
n
by setting G(n) = g
n
for all n N? This is something we do all the time.
Consider n N. We proved the existence and uniqueness of a map g : n A
such that g(k) = h(g
|k
) for all k n. We then labeled this unique map as g
n
.
In fact, we proved that the following formula of rst order predicate logic:
G[n, g] = (n N) [ (g : n A) (k n , g(k) = h(g
|k
)) ]
is a functional class with domain N. We have not formally dened what a
functional class is and we are not able to do so at this stage. Informally, G
is a functional class as a formula of rst order predicate logic where two free
variables n and g (in that order) have been singled out, and which satises:
ngg

[ G[n, g] G[n, g

] (g = g

) ]
37
where G[n, g

] has the obvious meaning. Note that G has other variables such
as A and h, while N is a constant. The functional class G has a domain:
dom(G)[n] = g G[n, g]
which is itself a formula of rst order predicate logic with the free variable n
singled out, something which we should call a class. As it turns out, we have:
n[ dom(G)[n] (n N) ]
which allows us to say that the functional class G has domain N. In general,
the domain of a functional class need not be a set. The functional class G[n, g]
also has a range, which is dened as the class:
rng(G)[g] = nG[n, g]
The Axiom Schema of Replacement asserts that if the domain of a functional
class is a set, then its range is also a set. More formally:
Bg[ (g B) rng(G)[g] ]
which shows as requested, that B = g

: n N , g

= g
n
is indeed a set,
since g

= g
n
is simply a notational shortcut for G[n, g

]. So much for proving


that B is a set. We did not need to dene a map G : N
nN
A
n
by setting
G(n) = g
n
for all n N. We considered a functional class instead. But the
question still stands: the practice of casually dening a map G : N
nN
A
n
by setting G(n) = g
n
is very common. Surely, it is a legitimate practice. To
avoid a confusing conict of notation let us keep G as referring to our functional
class, and consider the map G

: N
nN
A
n
dened by G

(n) = g
n
for all
n N. What we are really saying is that G

is the set of ordered pairs dened by


G

= (n, g
n
) : n N or equivalently G

= (n, g) : G[n, g]. This denition


is legitimate provided G

is really a set. More formally, we need to prove:


G

z[ (z G

) ng[(z = (n, g)) G[n, g]] ] (1.9)


However, G is a functional class whose domain is the set N, while its range is
the set B. Hence, we have the implication G[n, g] ((n, g) N B), and it
follows that (1.9) can equivalently be expressed as:
G

z[ (z G

) (z N B) [z] ] (1.10)
where [z] stands for ng[(z = (n, g)) G[n, g]]. Since (1.10) is a direct conse-
quence of the Axiom Schema of Comprehension, this completes our proof that
G

is indeed a set. So much for lemma (30).


Unfortunately, we are not in a position to close our discussion on the subject
of recursion over N. Yes we have lemma (30), of which we gave a rather de-
tailed proof, including nal meta-mathematical remarks which hopefully were
as convincing as one could have made them. The truth is, lemma (30) is rather
38
useless, as it does not do what we need: suppose we want to dene the map g
with domain N, by setting g(0) = 0 and g(n +1) = T(g(n)) for all n N. In
order to apply lemma (30) we need to produce a set A. Yet we have no idea what
the set A should be. So we are back to the very beginning of this section, asking
the very same question as initially: does there exists a map g with domain N
such that g(0) = 0 and g(n + 1) = T(g(n)) for all n N? Of course once we
know that g exists, it is possible for us to produce a set A, namely A = rng(g) or
anything larger. We can dene an oracle function h :
nN
A
n
A by setting
h() = 0 and h(g) = T(g(n 1)) for all n N

and g : n A. At this stage,


lemma (30) works like a dream. But it is clearly too late, as we needed to know
the existence of g in the rst place.
One may argue that dening g(0) = 0 and g(n + 1) = T(g(n)) is not
terribly useful, and it may be that lemma (30) is pretty much all we need in
practice. But remember proposition (10) in which we claim to have constructed
a free universal algebra of type . We crucially dened the sequence (Y
n
)
nN
with a recursion over N, by setting Y
0
= (0, x) : x X
0
, Y
n+1
= Y
n


Y
n
and:

Y
n
=
_
(1, (f, x)) : f , x Y
(f)
n
_
, n N
Just as in the case of g(n + 1) = T(g(n)), it is not clear what set A should be
used in order to apply lemma (30). We need something more powerful.
Let us go back to the initial problem: we would like to dene a map g with
domain N by rst specifying g(0), and having dened g(0), g(1), . . . , g(n1), by
specifying what g(n) is as a function of g(0), g(1),. . . , g(n 1). So we need an
oracle function h :
nN
A
n
A so as to be able to set g(n) = h(g
|n
). Finding
such an oracle function will not be an easy task without a set A. But why should
A be a set? The only reason we insisted on A being a set was to keep clear
of the akiness of meta-mathematics. We wanted a theorem proper, namely
lemma (30), with a solid proof based on a standard mathematical argument. In
the case of induction, it was possible for us to stay within the realm of standard
mathematics. When it comes to recursion, it would seem that we have no choice
but to step outside of it.
So let us assume that A is a class, namely a formula of rst order predicate
logic with a variable x singled out. We could denote this class A[x] to emphasize
the fact that A is viewed as a predicate over the variable x. Of course, the formula
A may have many other variables, and in fact x itself need not be a free variable
of A. We could have A[x] = , also denoted A[x] = or indeed:
A[x] = (x(x x)) (x(x x))
The class A is simply such that xA[x] is a true statement. This class is the
largest possible, since A[x] is true for all x. We could call A[x] the category of
all sets also denoted Set, but we have no need to do so. Note that it is possible
for this class to have x as a free variable, as in:
A[x] = (y x) (y x)
39
The good thing about having A as a class is that it no longer requires us to know
what it is: if we have no idea as to which class to pick, we can simply decide
that A is the class of all sets. Now we need an oracle function h :
nN
A
n
A.
Since A is not a set but a class, we should be looking for a functional class
H :
nN
A
n
A, rather than a function. But we hardly know what this
means, and a few clarications are in order: rst of all, given a functional
class H[x, y], given two classes A[x] and B[x], we need to give meaning to the
statement H : B A. Informally, this statement should indicate that H is a
functional class with domain B and range inside A. The fact that H[x, y] is a
functional class is formally translated as:
xyy

[ H[x, y] H[x, y

] (y = y

) ]
The fact that H has domain B can be written as:
x[ B[x] y(H[x, y]) ]
and nally expressing that the range of H is inside A can be written as:
y[ x(H[x, y]) A[y] ]
So we now know what the statement H : B A formally represents. If we
intend to give meaning to H :
nN
A
n
A, we still need to spell out what the
class
nN
A
n
is. First we dene the class A
n
[x] for n N. Informally, A
n
[x]
should indicate that x is a map with domain n and range inside A, i.e.:
A
n
[x] = (x is a map) (dom(x) = n) y[ (y rng(x) A[y]) ]
Finally, the class
nN
A
n
should be dened as:
_

nN
A
n
_
[x] = n[ (n N) A
n
[x] ]
So the statement H :
nN
A
n
A is now meaningful. So suppose it is true. We
are looking for a map g : N A such that g(n) = H(g
|n
) for all n N. Things
are starting to be clearer. The statement g : N A is clearly represented by:
(g is a map) (dom(g) = N) y[ (y rng(g) A[g]) ]
while the statement g(n) = H(g
|n
) is an intuitive way of saying H[g
|n
, g(n)].
Furthermore, if we have H :
nN
A
n
A and g : N A, then for all n A
it is easy to check that g
|n
: n A, by which we mean that A
n
[g
|n
] is true. It
follows that dom(H)[g
|n
] is true, and consequently since H is functional, there
exists a unique y
n
such that H[g
|n
, y
n
], which furthermore is such that A[y
n
] is
true. We would like g : N A to be such that g(n) is precisely that unique y
n
,
for all n N. So it all makes sense.
As an example, let us go back to the case when g(n + 1) = T(g(n)) for all
n N with g(0) = 0. Because we have no idea what the class A should be, let
us pick A to be the class of all sets. We need to nd an oracle functional class
40
H :
nN
A
n
A which correctly sets up our recursion objective. Informally
speaking, we want H() = 0 and H(g) = T(g(n 1)) for all n N

and
g : n A. If we want to formally code up the functional class H[g, y] as a
formula of rst order predicate logic, since A
0
[g] = (g = ), this can be done as:
H[g, y] = [A
0
[g] (y = 0)] n[(n N

) A
n
[g] (y = T(g(n 1)))]
The fundamental question is this: does there exist a map g with domain N such
that g(n) = H(g
|n
) for all n N? This is the object of meta-theorem (1) below.
Before, we proceed with the proof of meta-theorem (1), a few remarks may
be in order: we were compelled by the nature of the problem to indulge in a fair
amount of meta-mathematical discussions, i.e. discussions involving formulas of
rst order logic, variables, classes and functional classes, terms which have not
been dened anywhere. This of course falls very short of the usually accepted
standards of mathematical proof. It is however better than nothing at all. We
do not have the conceptual tools to do much more at this stage. It is hoped
that the study of universal algebras and the introduction of formal languages
as algebraic structures will eventually allow us to handle meta-mathematics
without the hand waving. For now, we did everything we could and were as
careful as possible in attempting to give legitimacy to the principle of recursion
over N. So we shall accept the existence of a sequence (Y
n
)
nN
such that
Y
n+1
= Y
n


Y
n
for all n N, where:

Y
n
=
_
(1, (f, x)) : f , x Y
(f)
n
_
, n N
with the belief the existence of (Y
n
)
nN
could be formally proven within ZFC, if
we precisely knew what that means. If it so happens that one of our conclusions
is wrong, it is very likely that the error will be caught at some point, once we
have claried the ideas of formal languages and formal proofs.
MetaTheorem 1 Let A be a class and H :
nN
A
n
A be a functional class.
There exists a unique map g : N A such that g(n) = H(g
|n
) for all n N.
Proof
First we prove the uniqueness property. Suppose g, g

: N A are maps such


that g(n) = H(g
|n
) and g

(n) = H(g

|n
) for all n N. We need to show that
g = g

, i.e. that g(n) = g

(n) for all n N. One way to achieve this is to prove


that g
|n
= g

|n
for all n N, for which we shall use an induction argument.
Since g
|0
= = g

|0
the property is true for n = 0. Suppose it is true for n N.
Then g
|n
= g

|n
and consequently g(n) = H(g
|n
) = H(g

|n
) = g

(n). Hence:
g
|(n+1)
= g
|n
(n, g(n)) = g

|n
(n, g

(n)) = g

|(n+1)
and we see that the property is true for n + 1. This completes the proof of
the uniqueness property. In order to prove the existence of g : N A with
g(n) = H(g
|n
) for all n N, we shall rst restrict our attention to nite domains
and prove by induction the existence for all n N of a map g : n A such
41
that g(k) = H(g
|k
) for all k n. This is clearly true for n = 0, since the empty
set is a map g : 0 A for which the condition g(k) = H(g
|k
) for all k 0 is
vacuously satised. So we assume n N, and the existence of a map g : n A
such that g(k) = H(g
|k
) for all k n. We need to show the existence of a
map g

: (n + 1) A such that g

(k) = H(g

|k
) for all k n + 1. Consider
the map g

: (n + 1) A dened by g

|n
= g and g

(n) = H(g), and suppose


k n + 1. We need to check that g

(k) = H(g

|k
). If k = n, we need to check
that g

(n) = H(g

|n
) = H(g) which is true by denition of g

. If k n, then:
g

(k) = g

|n
(k) = g(k) = H(g
|k
) = H(g

|k
)
This completes our induction argument and for all n N we have proved the
existence of a map g : n A such that g(k) = H(g
|k
) for all k n. In fact
we claim that such a map is unique. For if g

: n A is another such map, an


identical induction argument to the one already used shows that g(k) = g

(k)
for all k n. Hence, for all n N, we have proved the existence of a unique
map g : n A such that g(k) = H(g
|k
) for all k n. It follows that:
G[n, g] = (n N) [ (g : n A) (k n , g(k) = H(g
|k
)) ]
is a functional class with domain N. From the Axiom Schema of Replacement:
rng(G) = g : n N , G[n, g]
is therefore a set. We dene g = rng(G) and we shall complete our proof by
showing that g : N A is a map such that g(n) = H(g
|n
) for all n N. First
we show that g is a set of ordered pairs. Let z g. There exists g

rng(G)
such that z g

. Since g

rng(G), we have G[n, g

] for some n N. It
follows that g

: n A and in particular g

is a set of ordered pairs. From


z g

we see that z is an ordered pair. We then show that g is functional. So


we assume that (x, y) g and (x, y

) g. We need to show that y = y

. Since
g = rng(G), there exist g
1
, g
2
rng(G) such that (x, y) g
1
and (x, y

) g
2
.
Furthermore, there exist n
1
, n
2
N such that G[n
1
, g
1
] and G[n
2
, g
2
]. Without
loss of generality we may assume that n
1
n
2
. From G[n
2
, g
2
] it is not dicult
to show that G[n
1
, (g
2
)
|n1
] is also true. Since G is functional, it follows that
g
1
= (g
2
)
|n1
. From (x, y) g
1
and g
1
: n
1
A we obtain x n
1
. Hence:
y

= g
2
(x) = (g
2
)
|n1
(x) = g
1
(x) = y
We now show that dom(g) = N. First we show that N dom(g). So let
n N. We need to show that n dom(g). However, the functional class G
has domain N. In particular, there exists g

such that G[n +1, g

] is true. From
g

: (n+1) A we see that (n, g

(n)) g

. Furthermore, from G[n+1, g

] we see
that g

rng(G). Since g = rng(G), we conclude that (n, g

(n)) g, and in
particular n dom(g). We now prove that dom(g) N. So let x dom(g). We
need to show that x N. However, there exists y such that (x, y) g. Hence,
there exists g

rng(G) such that (x, y) g

. Furthermore, there exists n N


42
such that G[n, g

] is true. In particular, we have g

: n A and we conclude
from (x, y) g

that x n N. So x N. In order to show that g : N A, it


remains to check that A[g(n)] is true for all n N. So let n N. We need to
show that A[g(n)] is true. However n n +1 and since G has domain N, there
exists g

such that G[n +1, g

]. From g

: (n +1) A we obtain (n, g

(n)) g

.
Furthermore, A[g

(n)] is true. From G[n + 1, g

] we obtain g

rng(G), and it
follows that (n, g

(n)) g. Hence we see that g

(n) = g(n) and since A[g

(n)] is
true we conclude that A[g(n)] is also true. So we have proved that g : N A
is a map. It remains to check that g(n) = H(g
|n
) for all n N. So let n N.
We need to show that g(n) = H(g
|n
). Consider once more the map g

rng(G)
such that G[n + 1, g

] is true. Since g

: (n + 1) A, for all k n + 1 we have


(k, g

(k)) g

and consequently (k, g

(k)) g. It follows that g

(k) = g(k) for


all k n+1, and we see that g

|n
= g
|n
as well as g

(n) = g(n). From G[n+1, g

]
we also have g

(k) = H(g

|k
) for all k n + 1. In particular, we have:
g(n) = g

(n) = H(g

|n
) = H(g
|n
)

1.2.7 Denition by Structural Recursion


Let = (0, 1), (1, 2) and X
0
be a set. We know from theorem (1) of page 20
that there exists a free universal algebra X of type whose free generator is
X
0
. Let us denote : X
1
X and : X
2
X the two operators on X. This
particular choice of notations should invite us to view X as a free universal
algebra of propositional logic. We know from theorem (2) of page 21 that any
proposition of X is either an atomic proposition = p for some p X
0
, or
the negation of a proposition =
1
or the conjunction of two propositions
=
1

2
. These three cases are exclusive, and these representations are
unique. In eect, the free universal algebra X is partitioned into three parts
X = X
0
X

. If A is a set, then any map g : X A could be dened by


specifying the three restrictions g
|X0
, g
|X
and g
|X
separately. We saw from
proposition (17) that the order : X N had this sort of three-fold structure:
X , () =

0 if = p X
0
1 + (
1
) if =
1
1 + max((
1
), (
2
)) if =
1

2
(1.11)
However, we did not invoke equation (1.11) as a denition of the order : X
N as this would not guarantee the existence of . This is very similar to the
simple recursion problem: dening a map g : N N by setting g(0) = 1 and
g(n + 1) = (n + 1).g(n) does not in itself guarantee the existence of g. We
now know that g exists, but we needed to justify the principle of denition by
recursion over N in the form of lemma (30) and meta-theorem (1) of page 41.
Equation (1.11) is clearly recursive as it links () on the left-hand-side to other
values (
1
) and (
2
) assigned by the map to other points. It is not obvious
43
that exists so we had to nd some other way to dene it. And yet we may
feel there is nothing wrong with equation (1.11). From a computing point of
view, it seems pretty obvious that any computation of () would terminate.
For instance taking = (p
1
((p
2
) p
3
)), equation (1.11) would lead to:
() = ((p
1
((p
2
) p
3
)))
= 1 +(p
1
((p
2
) p
3
))
= 2 + max((p
1
), ((p
2
) p
3
))
= 2 +((p
2
) p
3
)
= 3 + max((p
2
), (p
3
))
= 3 +(p
2
)
= 4 +(p
2
)
= 4
Equation (1.11) is a case of denition by structural recursion. There are many
natural mappings on X which could be dened in a similar way. For instance,
suppose we needed a map g : X T(X
0
) returning the set of all atomic
propositions involved in a formula X. In the case of = (p
1
((p
2
)p
3
)),
we would have g() = p
1
, p
2
, p
3
. A natural way to dene the map g is:
X , g() =

p if = p X
0
g(
1
) if =
1
g(
1
) g(
2
) if =
1

2
(1.12)
As another example, suppose we had a truth table v : X
0
0, 1 indicating
whether an atomic proposition p is true (v(p) = 1) or false (v(p) = 0) for all
p X
0
. We would like to dene a map g : X 0, 1 indicating whether a
proposition is true (g() = 1) or false (g() = 0). Once again, it is natural to
dene the map g using structural recursion:
X , g() =

v(p) if = p X
0
1 g(
1
) if =
1
g(
1
). g(
2
) if =
1

2
(1.13)
We know that equations (1.12) and (1.13) are ne: we just need to prove it.
So let us go back to a more general setting and consider a free universal
algebra X of type with free generator X
0
X. Let A be a set. We would like
to dene a map g : X A by structural recursion. We know from theorem (2)
of page 21 that any element of X is either an element of X
0
, or an element of
the form f(x) for some unique f and x X
(f)
. Dening g : X A by
structural recursion consists rst in specifying g(x) for all x X
0
. So we need
a map g
0
: X
0
A with the understanding that g(x) = g
0
(x) for all x X
0
.
For all f and x X
(f)
, we then want to specify g(f(x)) as a function of
all the individual g(x(i))s for all i (f). So we need a map h(f) : A
(f)
A
with the understanding that g(f(x)) = h(f)(g(x)). Note that if (f) = 0, this
amounts to having a map h(f) : 0 A and requesting that g(f(0)) = h(f)(0).
44
In other words, it amounts to specifying the value h(f)(0) assigned by the map
g to the constant f(0). We are now in a position to ask the key question: given
a map g
0
: X
0
A and given a map h(f) : A
(f)
A for all f , does there
exist a map g : X A such that g
|X0
= g
0
and g(f(x)) = h(f)(g(x)) for all
f and x X
(f)
? This question is dealt with in the following theorem.
To prove theorem (4), we are eectively resorting to the same trick as the
one used to dene the order : X N in denition (16): once we give ourselves
a map h(f) : A
(f)
A for all f , we are eectively specifying a structure
of universal algebra of type on A, and requesting that g(f(x)) = h(f)(g(x))
is simply asking that g : X A be a morphism. The existence of g such that
g
|X0
= g
0
is guaranteed by the fact that X
0
is a free generator of X.
Theorem 4 Let X be a free universal algebra of type with free generator
X
0
X. Let A be a set and g
0
: X
0
A be a map. Let h be a map with
domain such that h(f) is a map h(f) : A
(f)
A for all f . Then, there
exists a unique map g : X A such that:
(i) x X
0
g(x) = g
0
(x)
(ii) x X
(f)
g(f(x)) = h(f)(g(x))
where (ii) holds for all f .
Proof
Since h is a map with domain and we have h(f) : A
(f)
A for all f , the
ordered pair (A, h) is in fact a universal algebra of type . Since g
0
: X A
is a map and X is a free universal algebra of type with free generator X
0
,
there exists a unique morphism g : X A such that g
|X0
= g
0
. From this last
equality, we see that (i) is satised. Since g is a morphism, for all f and
x X
(f)
we have:
g(f(x)) = g f(x) = h(f) g(x) = h(f)(g(x))
where it is understood that g(x) on the right-hand-side of this equation refers
to the element of A
(f)
dened by g(x)(i) = g(x(i)) for all i (f). Hence
we see that (ii) is satised by g : X A for all f . So we have proved
the existence of g : X A such that (i) and (ii) holds for all f . Suppose
g

: X A is another map with such property. Then g

is clearly a morphism
with g

|X0
= g
0
and it follows from the uniqueness of g that g

= g.
This was disappointingly simple. One would have expected the principle
of denition by structural recursion to be a lot more complicated than that of
recursion over N. And it can be. We eectively restricted ourselves to the
simple case of g : X A where A is a set rather than a class, and g(f(x))
is simply a function of the g(x(i))s. This would be similar to considering
g : N A given a set A, and requesting that g(n) = h g(n 1) rather than
g(n) = h(g
|n
). Anticipating on future events, suppose X is a free algebra of
rst order logic, generated by atomic propositions (x y) (with x, y belonging
45
to a set of variables V ), with the constant , the binary operator and a
unary quantication operator x for every variable x V . Given a set M with
a binary relation r M M and a map a : V M (a variables assignment),
a natural thing to ask is which of the formulas X are true with respect to
the model (M, r) under the assignment a : V M. In other words, it is very
tempting to dene a truth function g
a
: X 2 = 0, 1 with the formula:
g
a
() =

1
r
( a(x), a(y) ) if = (x y)
0 if =
g
a
(
1
) g
a
(
2
) if =
1

2
min g
b
(
1
) : b = a on V x if = x
1
being understood that g
a
(
1
) g
a
(
2
) is the usual g
a
(
1
) g
a
(
2
) in 0, 1.
This denition of the concept of truth will be seen to be awed: it is clear enough
at this stage that theorem (4) will not be directly applicable to this case, since
we are attempting to dene g
a
: X 2 in terms of other functions g
b
: X 2.
We will need to nd some other way to prove the existence of g
a
. Going back
to propositional logic, another possible example is the following:
X , g() =

0 if = p X
0
T(g(
1
)) if =
1
g(
1
) g(
2
) if =
1

2
(1.14)
This would constitute a case when theorem (4) fails to be applicable, as there is
no obvious set A for g : X A. We shall not provide a meta-theorem for this.
We shall however now provide a stronger version of theorem (4) which will
be required at a later stage of this document. For example, when studying the
Hilbert deductive system on the free algebra of rst order predicate logic P(V ),
we shall dene the notion of proofs as elements of another free algebra (V ),
which in turn will lead to the following recursion:
(V ) , Val() =

if = P(V )
if = , A(V )
if = , , A(V )
M(Val (
1
), Val (
2
)) if =
1

2
xVal (
1
) if = x
1
, x , Sp (
1
)
if = x
1
, x Sp (
1
)
when attempting to dene Val () as representing the conclusion being proved
by the proof . The details of this denition are not important at this stage,
but we should simply notice that given a variable x and the associated unary
operator x : (V ) (V ), we were unable to dene Val (x
1
) simply in
terms of Val (
1
) as an application of theorem (4) would require. The denition
of Val(x
1
) is also based on whether x Sp (
1
), i.e. on whether x is a free
variable of the premises involved in the proof
1
. Going back to the general
setting of an arbitrary free algebra X of type , this is therefore a case when
given f and x X
(f)
, g(f(x)) is not simply dened in terms of g(x), but
46
is also based on x itself. So we would like to set g(f(x)) = h(f)(g(x), x) rather
than the more restrictive g(f(x)) = h(f)(g(x)) of theorem (4).
The proof of the following theorem is identical in spirit to that of Lemma (30).
The only new idea is the introduction of the subsets X
n
X of proposition (18):
X
n
= x X : (x) n , n N
where : X N is the order mapping of denition (16) which allows to reduce
our problem to that of induction over N. Once we have shown the existence
of maps g
n
: X
n
A with the right property, and which are extensions of
one another, we prove the existence of the larger map g : X A simply by
collecting the appropriate ordered pairs in one single set g =
nN
g
n
. Note
that it is probably possible to prove theorem (5) using lemma (30) rather than
duplicating what is essentially the same argument.
Theorem 5 Let X be a free universal algebra of type with free generator
X
0
X. Let A be a set and g
0
: X
0
A be a map. Let h be a map with
domain such that h(f) is a map h(f) : A
(f)
X
(f)
A for all f .
Then, there exists a unique map g : X A such that:
(i) x X
0
g(x) = g
0
(x)
(ii) x X
(f)
g(f(x)) = h(f)(g(x), x)
where (ii) holds for all f .
Proof
We shall rst prove the uniqueness property. So suppose g, g

: X A are two
maps satisfying (i) and (ii) above. We need to show that g = g

or equivalently
that g(x) = g

(x) for all x X. We shall do so using a structural induction


argument based on theorem (3) of page 32. Since X
0
is a generator of X, we shall
rst check this is the case when x X
0
. This follows immediately from (i) and
g(x) = g
0
(x) = g

(x). We proceed with our induction argument by considering


an arbitrary f and assuming x X
(f)
is such that g(x(i)) = g

(x(i)) for
all i (f). We need to show that g(f(x)) = g

(f(x)). Recall that the notation


g(x) in (ii) above refers to the element of A
(f)
dened by g(x)(i) = g(x(i))
for all i (f). In the case when (f) = 0 we have g(x) = 0. So with this
in mind and a similar notational convention for g

(x), our hypothesis leads to


g(x) = g

(x), and using (ii) above we obtain:


g(f(x)) = h(f)(g(x), x) = h(f)(g

(x), x) = g

(f(x))
This completes our induction argument and the proof of uniqueness. We shall
now prove the existence of the map g : X A. We dene:
X
n
= x X : (x) n , n N
where : X N is the order on X as per denition (16). Note that from
proposition (17) we have x X
0
(x) = 0 so this new denition does not
47
conict with the notation X
0
used for the free generator on X. We now consider
the subset Y N of all n N which satisfy the property that there exists a
unique map g
n
: X
n
A such that:
(iii) x X
0
g
n
(x) = g
0
(x)
(iv) f(x) X
n
g
n
(f(x)) = h(f)(g
n
(x), x)
where (iv) holds for all f and x X
(f)
. In order to prove the existence of
the map g : X A, it is sucient to prove that Y = N. We shall rst prove
this is the case. So suppose Y = N. We need to show the existence of the map
g : X A. Dene g =
nN
g
n
. Each g
n
being a map is a set of ordered pairs.
So g is also a set of ordered pairs. In fact, it is a functional set of ordered pairs:
x, y, y

, ((x, y) g) ((x, y

) g) (y = y

)
Suppose this is true for the time being. Then g is a map. Furthermore, since
g
n
: X
n
A and X
n
X for all n N we have:
(x, y) g n N[(x, y) g
n
] (x X) (y A)
So it is clear that dom(g) X and rng(g) A. In fact, since every x X is
an element of X
n
for any n (x), we have dom(g) = X and we conlude that
g : X A. In order to prove the existence of g : X A it remains to show
that (i) and (ii) above are satised. So let x X
0
. From (x, g
0
(x)) g
0
g it
follows immediately that g(x) = g
0
(x) which shows that (i) is indeed true. Let
f and x X
(f)
. Picking n N large enough so that f(x) X
n
we obtain
(f(x), g
n
(f(x))) g
n
g from which we conclude that g(f(x)) = g
n
(f(x)).
Furthermore, for all i (f) from proposition (17) we have (f(x)) (x(i))
which shows that x(i) X
n
and consequently (x
i
, g
n
(x(i))) g
n
g i.e.
g(x(i)) = g
n
(x(i)). This being true for all i (f), we obtain g(x) = g
n
(x), an
equality which is still true when (f) = 0. Hence, using (iv) above we have:
g(f(x)) = g
n
(f(x)) = h(f)(g
n
(x), x) = h(f)(g(x), x)
which shows that (ii) is indeed true. This completes the proof of the existence
of g : X A having admitted that g is a functional relation. We shall now go
back to this particular point and show that g is indeed functional. So let x, y, y

be sets and assume that (x, y) g and (x, y

) g. We need to show that y = y

.
Let n N be the smallest integer such that (x, y) g
n
and likewise let n

N
be the smallest integer such that (x, y

) g
n
. Without loss of generality, we
may assume that n n

and consequently X
n
X
n
. Consider the restriction
(g
n
)
|Xn
: X
n
A. Since g
n
: X
n
A satises (iii) and (iv) above for n

, for
all z X
0
we have (g
n
)
|Xn
(z) = g
n
(z) = g
0
(z) and furthermore given f
and z X
(f)
such that f(z) X
n
we have:
(g
n
)
|Xn
(f(z)) = g
n
(f(z)) = h(f)(g
n
(z), z) = h(f)((g
n
)
|Xn
(z), z)
where the last equality follows from the fact that z(i) X
n
for all i (f).
Hence we see that the map (g
n
)
|Xn
: X
n
A also satises (iii) and (iv) above
48
and from the uniqueness property of g
n
we conclude that (g
n
)
|Xn
= g
n
. We
are now in a position to prove that y = y

. From (x, y) g
n
we obtain x X
n
and y = g
n
(x) while from (x, y

) g
n
we obtain y

= g
n
(x) = (g
n
)
|Xn
(x).
From (g
n
)
|Xn
= g
n
we therefore conclude that y = y

. This completes our


proof of the existence of the map g : X A having assumed the equality
Y = N. We shall complete the proof of this theorem by showing the equality
Y = N is indeed true. So given n N we need to show the existence and
uniqueness of a map g
n
: X
n
A which satises (iii) and (iv) above. First
we prove the uniqueness. So we assume that g, g

: X
n
A are two maps
satisfying (iii) and (iv) above and we need to show that g(x) = g

(x) for all


x X
n
. We shall do so by proving the implication x X
n
g(x) = g

(x) by
structural induction on X, using theorem (3) of page 32. First we check that
this is true for x X
0
. In this case from (iii) above we obtain immediately
g(x) = g
0
(x) = g

(x). Next we consider f and x X


(f)
such that the
implication x(i) X
n
g(x(i)) = g

(x(i)) is true for all i (f), i.e.


g(x) = g

(x), an equality which is still true in the case when (f) = 0. We


need to check the implication f(x) X
n
g(f(x)) = g

(f(x)), which in fact


follows immediately from (iv) and:
g(f(x)) = h(f)(g(x), x) = h(f)(g

(x), x) = g

(f(x))
This completes our structural induction argument and g
n
: X
n
A is indeed
unique. We shall now prove the existence g
n
: X
n
A using an induction
argument on N. For n = 0, we already have a map g
0
: X
0
A which clearly
satises (iii) above. It also vacuously saties (iv) as theorem (2) of page 21
shows no element of X
0
can be of the form f(x) for f and x X
(f)
. So
we assume that g
n
: X
n
A exists for a given n N and we need to show the
existence of g
n+1
: X
n+1
A. Recall the equality from proposition (18):
X
n+1
= X
n
f(x) : f , x (X
n
)
(f)

So we shall dene g
n+1
: X
n+1
A by setting (g
n+1
)
|Xn
= g
n
and:
g
n+1
(y) = h(f)(g
n
(x), x)
for all y X
n+1
X
n
, where y = f(x) is the unique representation of y with
f and x (X
n
)
(f)
. Such representation is indeed unique, as follows from
theorem (2) of page 21. We have thus dened a map g
n+1
: X
n+1
A and it
remains to check that g
n+1
satises (iii) and (iv) above. So let x X
0
. Since
g
n
satises (iii) we have g
n+1
(x) = (g
n+1
)
|Xn
(x) = g
n
(x) = g
0
(x) and g
n+1
also satises (iii). So let f and x X
(f)
be such that y = f(x) X
n+1
.
We need to show that g
n+1
(f(x)) = h(f)(g
n+1
(x), x). We shall distinguish two
cases: rst we assume that f(x) X
n
. Then, since (iv) is true for g
n
:
g
n+1
(f(x)) = (g
n+1
)
|Xn
(f(x))
= g
n
(f(x))
(iv) true g
n
= h(f)(g
n
(x), x)
49
= h(f)((g
n+1
)
|Xn
(x), x)
= h(f)(g
n+1
(x), x)
We now assume that f(x) X
n+1
X
n
. From x (X
n
)
(f)
we obtain:
g
n+1
(f(x)) = h(f)(g
n
(x), x)
= h(f)((g
n+1
)
|Xn
(x), x)
= h(f)(g
n+1
(x), x)
So we have proved that g
n+1
satises (iv) above as requested.
1.3 Relation on Universal Algebra
1.3.1 Relation and Congruence on Universal Algebra
A relation is a set of ordered pairs. If X is a set then a relation on X is a set of
ordered pairs which is also a subset of X X. A map g : A B between two
sets A and B is a particular type of relation which is a subset of AB. It is a
functional relation as it satises the property:
xyy

[ ((x, y) g) ((x, y

) g) (y = y

) ]
A map g : X X is a functional relation on X. In this section we shall
introduce other types of relations on X, which are of particular interest. Most
of us will be familiar with equivalent relations on X. An equivalence relation on
X is a relation on X which is reexive, symmetric and transitive. A relation
is said to be reexive if and only if it satises:
x[ (x X) (x x) ]
Note that it is very common to write x x and x y rather than (x, x) or
(x, y) . A relation on X is said to be symmetric if and only if it satises:
xy[ (x y) (y x) ]
Finally, a relation on X is said to be transitive if and only if:
xyz[ (x y) (y z) (x z) ]
We shall now introduce a new type of relation on X, in the case when X is
a universal algebra of type . Note that given f and x, y X
(f)
, we
shall write x y as a notational shortcut for the statement x(i) y(i) for all
i (f). Note that if (f) = 0 then x y is vacuously true. Beware that the
symbol thus becomes overloaded. In eect, we are dening new relations
on X
n
for n N. So 0 0 may be vacuously true when referring to the relation
on X
0
, but may not be true (if 0 X) with respect to the relation on X.
50
Denition 31 Let X be a universal algebra of type . Let be a relation on
X. We say that is a congruent relation on X, if and only if:
f , x, y X
(f)
, x y f(x) f(y)
It follows that if X is a universal algebra of type with constants, and is a
congruent relation on X, then f(0) f(0) for all f such that (f) = 0.
Denition 32 Let X be a universal algebra of type . We call congruence on
X any equivalence relation on X which is a congruent relation on X.
The notion of congruence on a universal algebra is very important. This
is particularly the case when dealing with free universal algebras. As already
seen following theorem (2) of page 21, most interesting algebraic structures
are not free (a notable exception being formal languages encountered in logic
textbooks). A congruence on a universal algebra X of type will allow us
to dene a new universal algebra [X] of type , called the quotient universal
algebra of X. We shall see that quotients of free universal algebras are in fact
everywhere. This is probably one of the reasons free universal algebras are key.
A good example of congruence is the kernel of a morphism:
Denition 33 Let h : X Y be a homomorphism between two universal
algebras X and Y of type . We call kernel of h the relation ker(h) on X:
ker(h) = (x, y) X X : h(x) = h(y)
Proposition 34 Let h : X Y be a homomorphism between two universal
algebras X and Y of type . Then ker(h) is a congruence on X.
Proof
The relation ker(h) is clearly reexive, symmetric and transitive. So it is an
equivalence relation on X. It remains to show it is also a congruent relation.
For an easier formalism, denote ker(h) =. Let f and x, y X
(f)
such
that x y. We need to show that f(x) f(y) which is h f(x) = h f(y) :
h f(x) = f h(x) = f h(y) = h f(y)
The rst and third equality are just expressing the fact that h is a morphism,
while the second equality follows from h(x) = h(y), itself a consequence of the
fact that for all i (f) we have h(x)(i) = h(x(i)) = h(y(i)) = h(y)(i).
1.3.2 Congruence Generated by a Set
Let X be a universal algebra of type and R
0
X X. Then R
0
is a relation
on X. However, R
0
may not have the right properties and in particular may not
be a congruence on X. In this section, we prove the existence of the congruence
generated by R
0
, namely the smallest congruence on X containing R
0
in the
51
sense of inclusion. The notion of congruence generated by a subset of X X
is a very handy way of creating interesting congruences on X. As we shall see,
it is possible to view a congruence on X as a universal sub-algebra of X X,
provided the latter has been embedded with the right structure of universal
algebra. The congruence generated by R
0
will be seen to be the universal sub-
algebra of X X generated by R
0
, as per denition (23).
Before we proceed, we shall review a few simple lemmas which show that
common properties of a relation such as reexivity, symmetry, transitivity and
being a congruent relation, are in fact akin to closure properties.
Lemma 35 Let X be a set and T
x
: (X X)
0
X X be dened by:
T
x
(0) = (x, x) , (x X)
A relation is reexive on X if and only if it is closed under T
x
for all x X.
Proof
Suppose is closed under T
x
for all x X. We need to show that is a
reexive relation on X. So let x X. We need to show that x x. Since
0 ()
0
, we obtain T
x
(0) . Hence we see that (x, x) or equivalently
x x. Conversely, suppose is a reexive relation on X. We need to show
that is closed under T
x
for all x X. So let x X. We need to show that
is closed under T
x
. So let u ()
0
. We need to show that T
x
(u) . But
u = 0 and T
x
(0) = (x, x). So we need to show that (x, x) or equivalently
x x which follows immediately from the reexivity of .
If X is a set and z (XX)
1
then z is a map z : 0 XX. So z(0) is an
ordered pair (x, y). Recall that the notation (x, y) may be used as a notational
shortcut to refer to the map z : 0 X X dened by z(0) = (x, y).
Lemma 36 Let X be a set and : (X X)
1
X X be dened by:
(x, y) (X X)
1
, (x, y) = (y, x)
A relation is symmetric on X if and only if it is closed under .
Proof
Suppose is closed under . We need to show that is a symmetric relation
on X. So let x, y X such that x y. We need to show that y x. Since
(x, y) ()
1
, we obtain (x, y) . Hence we see that (y, x) or equiva-
lently y x. Conversely, suppose is a symmetric relation on X. We need
to show that is closed under . So let (x, y) ()
1
. We need to show that
(x, y) . So we need to show that (y, x) or equivalently y x which
follows immediately from the symmetry of and x y
If X is a set and z (X X)
2
then z is a map z : 0, 1 X X. So
z(0) is an ordered pair (x, y) and z(1) is an ordered pair (y

, z). Recall that the


notation [(x, y), (y

, z)] may be used as a notational shortcut to refer to the


map z : 0, 1 X X dened by z(0) = (x, y) and z(1) = (y

, z).
52
Lemma 37 Let X be a set and : (X X)
2
X X be dened by:
[(x, y), (y

, z)] =
_
(x, y) if y ,= y

(x, z) if y = y

A relation is transitive on X if and only if it is closed under .


Proof
Suppose is closed under . We need to show that is a transitive relation on
X. So let x, y, z X such that x y and y z. We need to show that x z.
Since [(x, y), (y, z)] ()
2
we obtain [(x, y), (y, z)] . Hence we see that
(x, z) or equivalently x z. Conversely, suppose is a transitive relation
on X. We need to show that is closed under . So let [(x, y), (y

, z)] ()
2
.
Then x y and y

z and we need to show that [(x, y), (y

, z)] . If y ,= y

then we need to show that (x, y) which is true by assumption. If y = y

then y z and we need to show that (x, z) or equivalently x z which


follows immediately from the transitivity of and x y and y z.
Let X be a set and
0
,
1
: XX X be the projection mappings, namely
the maps dened by
0
(x, y) = x and
1
(x, y) = y for all x, y X. Following
our well established convention, for all n N recall that
0
may be used as a
notational shortcut for
n
0
: (X X)
n
X
n
dened by
n
0
(z)(i) =
0
(z(i)) for
all i n. A similar comment obviously applies to
1
.
Lemma 38 Let X be a universal algebra of type . Consider the structure of
universal algebra of type on X X dened by:
T(f)(z) = (f
0
(z), f
1
(z)) , z (X X)
(f)
, f
where
0
,
1
: X X X are the projection mappings. Then, a relation is
a congruent relation on X, if and only if it is closed under T(f) for all f .
Proof
Note from denition (19) that being closed under T(f) for all f could
equally have been phrased as being a universal sub-algebra of X X viewed
as a universal algebra of type . Note also that given z (X X)
(f)
for
f ,
0
(z) and
1
(z) are well-dened elements of X
(f)
(specically we have

0
(z)(i) =
0
(z(i)) for all i (f) with similar equalities for
1
(z)). It follows
that f
0
(z) and f
1
(z) are well-dened elements of X and we see that T(f)
is indeed a well-dened operator T(f) : (XX)
(f)
XX. We now proceed
with the proof: suppose is closed under T(f) for all f . We need to show
that is a congruent relation on X. So let f and x, y X
(f)
and suppose
that x y. We need to show that f(x) f(y). Dene z (X X)
(f)
by
z(i) = (x(i), y(i)) for all i (f), being understood that z = 0 is (f) = 0.
From x y, we have x(i) y(i) for all i (f). It follows that z(i) for all
i (f) and consequently z ()
(f)
. Since is closed under T(f) we obtain
T(f)(z) . However, for all i (f) we have
0
(z)(i) =
0
(z(i)) = x(i) and
therefore
0
(z) = x. Similarly we have
1
(z) = y. It follows that:
T(f)(z) = (f
0
(z), f
1
(z)) = (f(x), f(y))
53
From T(f)(z) we conclude that f(x) f(y). Conversely, suppose is a
congruent relation on X. We need to show that is closed under T(f) for all
f . So let f and z ()
(f)
. We need to show that T(f)(z) . Dene
x, y X
(f)
by setting x =
0
(z) and y =
1
(z). Then T(f)(z) = (f(x), f(y))
and it remains to show that f(x) f(y). Since is a congruent relation on
X, this will be achieved by showing that x y, or equivalently that x(i) y(i)
for all i (f). However, for all i (f) we have x(i) =
0
(z)(i) =
0
(z(i))
and similarly y(i) =
1
(z(i)). It follows that z(i) = (x(i), y(i)). Consequently,
we only need to show that z(i) for all i (f). But this is an immediate
consequence of z ()
(f)
.
Theorem 6 Let X be a universal algebra of type and R
0
X X. Then,
there exists a unique smallest congruence on X such that R
0
.
Proof
First we show the uniqueness. Suppose and are two congruence on X which
are both the smallest congruence on X containing R
0
. Since is a congruence
on X containing R
0
, from the minimality of we obtain . Likewise, since
is a congruence on X containing R
0
, from the minimality of we obtain
. We now show the existence. For all x X, let T
x
: (X X)
0
X X
be dened as in lemma (35). Let : (X X)
1
X X be dened as in
lemma (36) and : (X X)
2
X X be dened as in lemma (37). For all
f, let T(f) : (X X)
(f)
X X be dened as in lemma (38). Consider
the sets
0
= ((0, x), 0) : x X,
1
= ((1, 0), 1),
2
= ((2, 0), 2) and

3
= ((3, f), (f)) : f . Note that these sets are all maps with range
inside N. Furthermore, their domains are pairwise disjoint. It follows that

=
0

3
is a map with range inside N, i.e. rng(

) N. In other
words,

is a type of universal algebra. Let T

be the map with domain

,
such that T

(f) : (X X)

(f)
X X for all f

and T

(f) is dened
as:
T

(f) =

T
x
if f
0
, f = ((0, x), 0) , x X
if f
1
, f = ((1, 0), 1)
if f
2
, f = ((2, 0), 2)
T(f

) if f
3
, f = ((3, f

), (f

)) , f


(1.15)
Note that equation (1.15) is indeed such that T

(f) : (XX)

(f)
XX for
all f

. It follows that the ordered pair (X X, T

) is a universal algebra
of type

. Using denition (23), let = R


0
be the universal sub-algebra of
X X generated by R
0
X X. We shall complete the proof of the theorem
by showing that the relation has all the requested properties. First we show
that is a congruence on X. Since is a universal sub-algebra of X X, it
is closed under every operator T

(f) for all f

. In particular, it is closed
under T
x
for all x X. It follows from lemma (35) that is a reexive relation
on X. Similarly is closed under and , and we see from lemma (36) and
lemma (37) that is a symmetric and transitive relation on X. Furthermore,
54
is closed under every operator T(f

) for all f

and it follows from lemma (38)


that is a congruent relation on X. So we have proved that is an equivalence
relation on X which is a congruent relation, i.e. that is a congruence on X.
From = R
0
we obtain immediately R
0
. It remains to show that is the
smallest congruence on X with R
0
. So we assume that R is a congruence
on X such that R
0
R. We need to show that R. Since = R
0
and
R
0
R it is sucient to show that R is a universal sub-algebra of X X, as
this will imply R
0
R by virtue of proposition (24), and nally R. So
suppose f

. We need to show that R is closed under T

(f). If f
0
then
f = ((0, x), 0) for some x X and we need to show that R is closed under T
x
which follows immediately from lemma (35) and the reexivity of R. If f
1
then we need to show that R is closed under which follows immediately from
lemma (36) and the symmetry of R. If f
2
then we need to show that R is
closed under which follows immediately from lemma (37) and the transitivity
of R. Finally, if f
3
then f = ((3, f

), (f

)) for some f

and we need to
show that R is closed under T(f

), which follows immediately from lemma (38)


and the fact that R is a congruent relation on X. In all four possible cases, we
have proved that R is closed under T

(f).
Denition 39 Let X be a universal algebra of type and R
0
XX. We call
congruence on X generated by R
0
the smallest congruence on X containing R
0
.
1.3.3 Quotient of Universal Algebra
In this section we dene the quotient universal algebra [X] of type derived
from a universal algebra X of type and a congruence on X. Given x X,
we shall denote [x] the equivalent class of x modulo , namely:
[x] = y X : x y
Note that [x] = [y] is equivalent to x y for all x, y X. Given f and
x X
(f)
, [x] to denote the element of [X]
(f)
dened by [x](i) = [x(i)].
Theorem 7 Let X be a universal algebra of type and be a congruence
on X. Given x X, let [x] denote the equivalence class of x modulo and
[X] be the quotient set [X] = [x] : x X. Let : X [X] be dened by
(x) = [x] for all x X. Then, there exists a unique structure of universal
algebra of type on [X] such that is a surjective morphism. Furthermore:
x X
(f)
, f([x]) = [f(x)] (1.16)
for all f . This structure is called the quotient structure on [X].
Proof
Note that given f and x X
(f)
, [x] refers to the element of [X]
(f)
dened
by [x](i) = [x(i)] for all i (f), being understood that [x] = 0 is (f) = 0.
It follows that equation (1.16) is meaningful. Note also that : X [X]
55
is a surjective map, regardless of any structure on [X]. We shall rst prove
the existence of a structure of universal algebra of type on [X] which makes
: X [X] into a morphism. We need to show the existence of a map T with
domain such that T(f) : [X]
(f)
[X] for all f and:
x X
(f)
, f(x) = T(f) (x) (1.17)
So let f . We need to dene some T(f) : [X]
(f)
[X]. Let x

[X]
(f)
.
We need to dene T(f)(x). First we shall show that there exists x X
(f)
such that x

= [x]. Indeed, for all i (f), x

(i) is an element of [X]. So there


exists x
i
X such that x

(i) = [x
i
]. Dene x X
(f)
by setting x(i) = x
i
for
all i (f), being understood that x = 0 if (f) = 0. Then, for all i (f) we
have [x](i) = [x(i)] = [x
i
] = x

(i) and it follows that [x] = x

. We now dene
T(f)(x

) = [f(x)]. We need to check that this denition is valid, namely that


[f(x)] is independent of the particular x X
(f)
such that x

= [x]. So suppose
that x, y X
(f)
such that [x] = [y]. We need to show that [f(x)] = [f(y)] or
equivalently that f(x) f(y). Since is a congruence, this will be achieved by
showing that x y or equivalently that x(i) y(i) for all i (f). However,
[x] = [y] and it follows that [x(i)] = [x](i) = [y](i) = [y(i)] for all i (f)
from which we obtain x(i) y(i). So we have proved that T(f)(x

) is well
dened. Furthermore since T(f)(x

) = [f(x)], we have T(f)(x

) [X]. Hence
we have successfully dened T(f) : [X]
(f)
[X] for all f . It follows
that the ordered pair ([X], T) is a universal algebra of type . it remains to
check that : X [X] is a morphism under this particular structure, i.e. that
equation (1.17) is satised for all f . So let x X
(f)
and x

= [x]. We
have:
f(x) = [f(x)] = T(f)(x

) = T(f)([x]) = T(f) (x)


This completes our proof of the existence of a structure of universal algebra of
type on [X] such that : X [X] is a surjective morphism. We shall now
prove the uniqueness. So suppose S is another map with domain such that
S(f) : [X]
(f)
[X] for all f with respect to which is a morphism. We
need to show that S = T. So let f . We need to show that S(f) = T(f). So
let x

[X]
(f)
. We need to show that S(f)(x

) = T(f)(x

). Let x X
(f)
be such that x

= [x]. Since is a morphism under S we obtain:


S(f)(x

) = S(f)([x]) = S(f) (x) = f(x) = [f(x)] = T(f)(x

)
We shall complete the proof of the theorem by showing that equation (1.16)
holds for all f . So let x X
(f)
. We need to show that f([x]) = [f(x)].
But this follows immediately from T(f)([x]) = [f(x)] and the fact that f([x])
is a notational shortcut for T(f)([x]).
1.3.4 First Isomorphism Theorem
Let h : X Y be a homomorphism between two universal algebras X and Y
of type . We know from proposition (34) that ker(h) is a congruence on X.
56
From theorem (7) we obtain the quotient universal algebra [X] of type derived
from X and ker(h). Furthermore from proposition (20) the homomorphic image
h(X) is a sub-algebra of Y and in particular, it is also a universal algebra of
type . As it turns out, the algebras [X] and h(X) are isomorphic. This is of
course hardly surprising for anyone who has done a little bit of algebra before.
Theorem 8 Let h : X Y be a homomorphism between two universal algebras
X and Y of type . Let [X] be the quotient universal algebra of type derived
from X and the congruence ker(h). Let h

: [X] h(X) be dened as:


x X , h

([x]) = h(x)
Then h

is an isomorphism between [X] and h(X).


Proof
Before we start, we should point out that the map h

: [X] h(X) is well


dened by the formula h

([x]) = h(x) since h(x) is independent of the particular


choice of x [x]. Indeed, if x

[x], then x

x which is (x

, x) ker(h)
and consequently h(x

) = h(x). Now we need to show that h

is a bijective
morphism. First we show that it is bijective. It is clearly surjective, so we
shall prove that h

is injective. Let x, x

X such that h

([x]) = h

([x

]). We
need to show that [x] = [x

]. However, our assumption can equally be written


as h(x) = h(x

) which is the same as (x, x

) ker(h) or x x

. Hence we
obtain [x] = [x

] as requested. It remains to show that h

is a morphism. So
let f and x

[X]
(f)
. We need to show that h

f(x

) = f h

(x

).
However, for all i (f) we have x

(i) [X]. So there exists x


i
X such that
x

(i) = [x
i
]. Let x X
(f)
be dened by x(i) = x
i
for all i (f). Then we
have x

(i) = [x
i
] = [x(i)] = [x](i) and consequently x

= [x]. Hence:
h

f(x

) = h

f([x])
theorem (7) = h

([f(x)])
= h(f(x))
h is a morphism = f h(x)
A: to be proved = f h

([x])
= f h

(x

)
So it remains to show that h(x) = h

([x]). For all i (f), we have:


h(x)(i) = h(x(i)) = h

([x(i)]) = h

([x](i)) = h

([x])(i)
Note that this proof works very well when (f) = 0.
As an application of the rst isomorphism theorem (8), we now present a
result which we are unlikely to use again but which may be viewed as a vin-
dication of the eort we have put into the analysis of free universal algebras.
In theorem (1) of page 20, given an arbitrary set X
0
we showed the existence
of a free universal algebra X of type with free generator X
0
. In theorem (7)
57
of page 55, given a congruence on X we showed how to construct the quo-
tient universal algebra [X] of type . The following theorem shows that the
construction mechanism X
0
X [X] is as general as it gets: every universal
algebra of type is in fact some quotient universal algebra [X] derived from a
free universal algebra X and a congruence on X. Consequently, if we ever
wish to construct a universal algebra of a certain type and with certain proper-
ties, there is absolutely no loss in generality in rst considering a free universal
algebra, and subsequently nding the appropriate congruence on it so as to t
the required properties.
For instance, suppose we have a free universal algebra X with a binary
operator . We would like this operator to be commutative. We should only
make sure our congruence on X contains the set:
A = (x y, y x) : x, y X
If this is the case, the corresponding operator on the quotient universal algebra
[X] will indeed be commutative. For if x

, y

[X] and x, y X are such that


x

= [x] and y

= [y], then from theorem (7) of page 55 we have:


x

= [x] [y] = [x y] = [y x] = [y] [x] = y

Theorem 9 Any universal algebra of type is isomorphic to the quotient [X]


of some free universal algebra X of type , relative to some congruence on X.
Proof
Let Y be a universal algebra of type and X
0
= Y . From theorem (1) of
page 20 there exists a free universal algebra X of type with free generator X
0
.
In particular X
0
X. Consider the identity mapping j : X
0
Y . Since X
0
is a free generator of X, there exists a unique morphism g : X Y such that
g
|X0
= j. Let = ker(g) be the kernel of g. From proposition (34) we know
that is a congruence on X. We shall complete the proof of this theorem by
showing the corresponding quotient universal algebra [X] is isomorphic to Y .
From the rst isomorphism theorem (8) we know that the map g

: [X] g(X)
dened by g

([x]) = g(x) is an isomorphism. So it is sucient to show that


g(X) = Y , i.e. that g is a surjective morphism. So let y Y . We need to show
the existence of x X such that g(x) = y. But Y = X
0
X and:
g(y) = g
|X0
(y) = j(y) = y

1.4 Sub-Formula in Free Universal Algebra


1.4.1 Sub-Formula of Formula in Free Universal Algebra
In this section we formalize the notion of sub-formula in a free universal algebra.
Specically, given a free universal algebra X of type , we dene a relation _
58
on X such that _ expresses the fact that the formula is a sub-formula
of the formula . For example, when studying our free universal algebra of
rst order logic in a later part of this document, we shall want to say that the
formula = x(x y) is a sub-formula of the formula = yx(x y). This
notion will prove important when dening substitutions of variables which are
valid for a given formula . Loosely speaking, a substitution is valid for if
no free variable gets caught under the scope of a quantier. Making this idea
precise requires the notion of sub-formula. In order to add further motivation to
this topic, we shall point out that valid substitutions are themselves a key notion
when dening specialization axioms, which are axioms of the form x [y/x]
where the substitution [y/x] of the variable y in place of x is valid for . It is
very important for us to properly dene the idea of valid substitutions: if we
allow too much freedom in which a variable y can be substituted for x, our
axioms may prove too much including statements which are not true. On the
other hand, if we are too restrictive our axioms may prove too little and Godels
completeness theorem may fail to be true.
The notion of sub-formula can easily be dened on any free universal algebra
and not just in the context of rst order logic. So we shall write x _ y rather
than _ and carry out the analysis in its general setting. Given a free
universal algebra X of type with free generator X
0
X, for all y X we
shall dene the set Sub (y) of all sub-formulas of y, and simply dene x _ y as
equivalent to x Sub (y). The key idea to remember is that given f and
x X
(f)
, the sub-formulas of f(x) are f(x) itself, as well as the sub-formulas
of every x(i), for all i (f). As shall see, the relation _ is a partial order on
X, i.e. a relation which is reexive, anti-symmetric and transitive on X. Recall
that _ is said to be anti-symmetric if and only if for all x, y X we have:
(x _ y) (y _ x) (x = y)
Denition 40 Let X be a free universal algebra of type with free generator
X
0
X. We call sub-formula mapping the map Sub : X T(X) dened by:
(i) x X
0
Sub (x) = x
(ii) x X
(f)
Sub(f(x)) = f(x)
_
i(f)
Sub (x(i))
where (ii) holds for all f . Given x, y X, we say that x is a sub-formula
of y, denoted x _ y, if and only if x Sub (y).
Proposition 41 The structural recursion of denition (40) is legitimate.
Proof
We need to show the existence of a unique map Sub : X T(X) such that (i)
and (ii) of denition (40) hold. We do so by applying theorem (5) of page 47
59
with A = T(X) and g
0
: X A dened by g
0
(x) = x for all x X
0
. Given
f we take h(f) : A
(f)
X
(f)
A dened by:
h(f)(a, x) = f(x)
_
i(f)
a(i)
From theorem (5) of page 47 we obtain the existence and uniqueness of a map
g : X A which satises g(x) = x for all x X
0
, and for f , x X
(f)
:
g(f(x)) = h(f)(g(x), x) = f(x)
_
i(f)
g(x)(i) = f(x)
_
i(f)
g(x(i))
which is exactly (ii) of denition (40) with g = Sub .
1.4.2 The Sub-Formula Partial Order
The objective of this section is to show that _ is a partial order.
Proposition 42 Let X be a free universal algebra of type with free generator
X
0
X. The relation _ on X is reexive.
Proof
We need to show that x Sub(x) for all x X. We shall prove this result by
structural induction using theorem (3) of page 32. If x X
0
then Sub (x) = x
and the property x Sub(x) is clear. Suppose f and x X
(f)
are such
that x(i) Sub (x(i)) for all i (f). We need to show the property is also
true for f(x), i.e that f(x) Sub (f(x)). This follows immediately from:
Sub(f(x)) = f(x)
_
i(f)
Sub (x(i))

Proposition 43 Let X be a free universal algebra of type with free generator


X
0
X. For all x, y X we have the equivalence:
x _ y Sub (x) Sub (y)
In particular, the relation _ on X is transitive.
Proof
First we show the implication . Let x, y X such that Sub(x) Sub (y).
From proposition (42) we have x Sub(x) and consequently x Sub (y) which
shows that x _ y. We now show the reverse implication . We shall do so by
considering the property P(y) dened for y X as:
x X , [ x Sub(y) Sub (x) Sub(y) ]
60
We shall prove P(y) for all y X by structural induction using theorem (3) of
page 32. First we assume that y X
0
. Then Sub(y) = y and the condition
x Sub(y) implies that x = y. In particular x X
0
and Sub(x) = x which
shows that the inclusion Sub (x) Sub (y) is indeed true. So the property P(y)
holds for y X
0
. Let f and y X
(f)
be such that P(y(i)) holds for
all i (f). We need to show that the property P(f(y)) is also true. So let
x X be such that x Sub (f(y)). We need to show that Sub(x) Sub (f(y)).
However, from denition (40) we have:
Sub (f(y)) = f(y)
_
i(f)
Sub (y(i)) (1.18)
Hence the condition x Sub(f(y)) implies that x = f(y) or x Sub (y(i))
for some i (f). Suppose rst that x = f(y). Then Sub (x) Sub (f(y))
follows immediately. Suppose now that x Sub (y(i)) for some i (f). Hav-
ing assumed the property P(y(i)) is true, it follows that Sub(x) Sub (y(i))
and nally from equation (1.18) we conclude that Sub (x) Sub (f(y)). This
completes our induction argument and the proof that P(y) holds for all y X.

Proving the relation _ is reexive and transitive worked pretty well with
structural induction arguments. The proof that _ is anti-symmetric will also
rely on induction but will be seen to be more dicult. Given f and
x X
(f)
a key step in the argument will be to claim that f(x) cannot be a
sub-formula of any x(i) for all i (f). This seems pretty obvious but requires
some care. We shall achieve this by using the order mapping : X N as per
denition (16) of page 24. Recall that given x X the order (x) oers some
measure of complexity for the formula x. The next proposition shows that if x
is a sub-formula of y, then it has no greater complexity than y. Since we already
know from proposition (17) that f(x) has greater complexity than every x(i),
we shall easily conclude that f(x) cannot be a sub-formula of x(i).
Proposition 44 Let X be a universal algebra of type with free generator
X
0
X. Then for all x, y X we have:
x _ y (x) (y)
where : X N is the order mapping of denition (16).
Proof
Given y X consider the property P(y) dened by:
x X , [ x Sub(y) (x) (y) ]
We shall complete the proof of this proposition by showing P(y) is true for all
y X, which we shall do by a structural induction argument using theorem (3)
of page 32. First we assume that y X
0
. Then Sub(y) = y and the condition
61
x Sub (y) implies that x = y and in particular (x) (y). So the property
P(y) is true whenever y X
0
. Next we assume that f and y X
(f)
are
such that P(y(i)) is true for all i (f). We need to show the property P(f(y))
is also true. So let x X be such that x Sub(f(y)). We need to show that
(x) (f(y)). From denition (40) we have:
Sub (f(y)) = f(y)
_
i(f)
Sub (y(i))
Hence the condition x Sub (f(y)) implies that x = f(y) or x Sub (y(i)) for
some i (f). Suppose rst that x = f(y). Then (x) (f(y)) follows
immediately. Suppose now that x Sub (y(i)) for some i (f). Having
assumed the property P(y(i)) is true, it follows that (x) (y(i)). Using
proposition (16) we obtain:
(x) (y(i)) < 1 + max(y(i)) : i (f) = (f(y))
So we have proved that (x) < (f(y)) and in particular (x) (f(y)). This
completes our induction argument and the proof that P(y) holds for all y X.

Proposition 45 Let X be a free universal algebra of type with free generator


X
0
X. The relation _ on X is anti-symmetric.
Proof
Given x, y X we have to show that if x _ y and y _ x then x = y. Given
y X consider the property P(y) dened by:
x X , [(x _ y) (y _ x) (x = y) ]
We shall complete the proof of this proposition by showing P(y) is true for all
y X, which we shall do by a structural induction argument using theorem (3)
of page 32. First we assume that y X
0
. We need to show that P(y) is true.
So let x X be such that x _ y and y _ x. In particular we have x _ y, i.e.
x Sub(y). From y X
0
we obtain Sub (y) = y and we conclude that x = y
as requested. So P(y) is indeed true for all y X
0
. Next we assume that f
and y X
(f)
are such that P(y(i)) is true for all i (f). We need to show
that P(f(y)) is also true. So let x X be such that x _ f(y) and f(y) _ x.
We need to show that x = f(y). Suppose to the contrary that x ,= f(y). We
shall arrive at a contradiction. From x _ f(y) we obtain x Sub(f(y)) and
from denition (40) we have:
Sub (f(y)) = f(y)
_
i(f)
Sub (y(i)) (1.19)
Having assumed that x ,= f(y) it follows that x Sub (y(i)) for some i (f).
Hence we have proved that x _ y(i) for some i (f). However, it is clear from
62
equation (1.19) that Sub (y(i)) Sub(f(y)) and it follows from proposition (43)
that y(i) _ f(y). From the assumption f(y) _ x, we obtain y(i) _ x by
transitivity. Hence we see that x X is an element such that x _ y(i) and
y(i) _ x. From our induction hypothesis, we know that P(y(i)) is true. From
x _ y(i) and y(i) _ x we therefore obtain x = y(i). In particular, using
proposition (17) of page 24 we have:
(x) = (y(i)) < 1 + max(y(i)) : i (f) = (f(y))
So we have proved that (x) < (f(y)). However, from f(y) _ x and proposi-
tion (44) we obtain (f(y)) (x). This is our desired contradiction.
Proposition 46 Let X be a free universal algebra of type with free generator
X
0
X. The relation _ on X is a partial order.
Proof
The relation _ is reexive, anti-symmetric and transitive as can be seen from
propositions (42), (45) and (43) respectively. It is therefore a partial order.
1.4.3 Increasing Map on Free Universal Algebra
Recall that a pre-ordered set is an ordered pair (A, ) where A is a set and is
a preorder on A, namely a binary relation on A which is reexive and transitive.
In the following chapters, we shall encounter many cases of maps v : X A
where X is a free universal algebra and A is pre-ordered set. For example, if
X = P(V ) is the free universal algebra of rst order logic of denition (52)
and A = T(V ) is the power set of V (the set of variables), we shall encounter
the maps Var : X A and Bnd : X A which respectively return the set
of variables and bound variables of a formula X. Another example is
X = (V ) of denition (234), i.e. the free universal algebra of proofs on P(V )
for which we shall have a map Hyp : X A where A = T(P(V )) returning the
set of all hypothesis of a proof X. We shall also have a map Var : (V ) A
where A = T(V ) returning the set of variables being used in a proof X. All
these cases are examples of maps v : X A which are increasing in the sense
that v(x) v(y) whenever x is a sub-formula of y i.e. x _ y. For example if
is a sub-formula of then Var() Var(), or if is a sub-proof of then
Hyp() Hyp (). The following proposition allows us to establish this type of
result once and for all with the right level of abstraction.
Proposition 47 Let X be a free universal algebra of type with free generator
X
0
X. Let (A, ) be a pre-ordered set and v : X A be a map such that:
i (f) , v(x(i)) v(f(x))
for all f and x X
(f)
. Then for all x, y X we have the implication:
x _ y v(x) v(y)
63
Proof
Given y X we need to show the property: x X [ x _ y v(x) v(y) ].
We shall do so by structural induction using theorem (3) of page 32. First we
assume that y X
0
. We need to show the property is true for y. So let x _ y be
a sub-formula of y. We need to show that v(x) v(y). However, since y X
0
we have Sub(y) = y. So from x _ y we obtain x Sub(y) and consequently
x = y. So v(x) v(y) follows from the reexivity of the preorder . Next we
assume that f and y X
(f)
is such that y(i) satises our property for
all i (f). We need to show that same is true of f(y). So let x _ f(y) be a
sub-formula of f(y). We need to show that v(x) v(f(y)). However we have:
x Sub (f(y)) = f(y)
_
i(f)
Sub(y(i))
So we shall distinguish two cases: rst we assume that x = f(y). Then the
inequality v(x) v(f(y)) follows from the reexivity of . Next we assume
that x Sub(y(i)) for some i (f). Then we have x _ y(i) and having as-
sumed y(i) satises our induction property, we obtain v(x) v(y(i)). However,
by assumption we have v(y(i)) v(f(y)) and v(x) v(f(y)) follows from the
transitivity of the preorder on A. This completes our induction argument.
1.4.4 Structural Substitution between Free Algebras
If V and W are sets and : V W is a map, we shall see from denition (53)
and denition (272) that it is possible to dene corresponding : P(V ) P(W)
or : (V ) (W) which are substitutions of variable inside formulas or
inside proofs. These maps are not morphisms of free universal algebras, if only
because P(V ) and P(W) do not have the same type. However, these maps are
pretty close to being morphism, and we shall now study some of their common
properties under the possible abstraction of structural substitution. Recall that
given a type of universal algebra and f , the arity of f is denoted (f).
Denition 48 Let and be two types of universal algebra. We say that a
map q : is arity preserving, if and only if for all f we have:
q(f) = (f)
In other words the arity of the operator q(f) is the same as that of f .
Recall that the which appears on the right-hand-side of (ii) in deni-
tion (49) below refers to the map : X
(f)
Y
(f)
dened by (x)(i) = (x(i))
for all i (f). It follows that the expression q(f)((x)) is always meaningful
since q : is arity preserving and (f) is precisely the arity of q(f).
Denition 49 Let X, Y be free universal algebras of type , and with free
generators X
0
, Y
0
respectively. A map : X Y is a structural substitution if
64
and only if there exists an arity preserving map q : such that:
(i) x X
0
(x) Y
0
(ii) x X
(f)
(f(x)) = q(f)((x))
where (ii) holds for all f .
One of the motivations to consider structural substitutions is to prove the
following proposition in a general setting. If : X Y is a structural substi-
tution between two free universal algebras, then given x, y X we have:
y _ x (y) _ (x)
In other words, if y is a sub-formula of x then (y) is also a sub-formula of (x).
This property can be summarized with the inclusion (Sub (x)) Sub ((x))
for all x X. As it turns out, the reverse inclusion is also true:
Proposition 50 Let X, Y be free universal algebras and : X Y be a struc-
tural substitution. Then for all x X we have the equality:
Sub ((x)) = (Sub (x))
i.e. the sub-formulas of (x) are the images of the sub-formulas of x by .
Proof
Given x X we need to show that Sub ((x)) = (Sub (x)). We shall do so
with a structural induction argument, using theorem (3) of page 32. First we
assume that x X
0
. Since : X Y is structural we have (x) Y
0
and so:
Sub((x)) = (x) = (x) = (Sub (x))
So let f and x X
(f)
be such that the equality is true for all x(i) with
i (f). We need to show the equality is also true for f(x). Since : X Y
is a structural substitution, let q : be an arity preserving map for which
(ii) of denition (49) holds. Then we have the equalities:
Sub((f(x))) = Sub ( q(f)((x)) )
def. (40) = q(f)((x))
_
i(q(f))
Sub ( (x)(i) )
q arity preserving = (f(x))
_
i(f)
Sub ( (x(i)) )
induction = (f(x))
_
i(f)
( Sub (x(i)) )
= ( f(x)
_
i(f)
Sub (x(i)) )
def. (40) = ( Sub (f(x)) )

65
Chapter 2
The Free Universal Algebra
of First Order Logic
2.1 Formula of First Order Logic
2.1.1 Preliminaries
This work is an attempt to provide an algebraization of rst order logic with
terms and without equality. Due to personal ignorance, we are not in a position
to say much of the history of this problem, or the existing literature. We shall
nonetheless endeavor to give here an honest account of our little understanding.
Any specialist who believes this account can be improved in any way is welcome
to contact the author. The idea of turning rst order logic into algebra is not
new. Following wikipedia, it appears Alfred Tarski initially worked on the alge-
braization of rst order logic without terms and with equality, which led to the
emergence of cylindric algebras. At a later stage, Paul Halmos developed a ver-
sion without equality, which led to polyadic algebras. There is also a categorical
approach initially due to William Lawvere, leading to functorial semantics. The
most recent treatment of algebraization of rst order logic without terms seems
to be the work of Ildiko Sain in [52]. There is also the categorical work of George
Voutsadakis, for example in [64] following a recent trend of abstract algebraic
logic which seems to have been initiated by W.J. Blok and D. Pigozzi [6]. A new
book [1] from Hajnal Andreka, Istv an Nemeti and Ildik o Sain on algebraic logic
is about to be released. There are a few older monographs on cylindric algebras
such as [2] and Paul Halmos [28] on polyadic algebras. It is our understanding
that most of the work done so far has focussed on rst order logic without terms,
an exception being the paper from Janis Cirulis [11], of which we are hoping
to get a copy soon. The algebraization of rst order logic with terms seems to
have received far less attention than its counterpart without terms, as it fails to
fall under the scope of currently known techniques of abstract algebraic logic.
Before we complete our history tour, we would like to mention the books of
66
Donald W. Barnes and John M. Mack [4] and P.T. Johnstone [32] which are
the initial motivation for the present work. For a mathematician not trained
in formal logic, these books have something magical. It is hard for us to say if
they succeeded in presenting their subject. Many of the proofs are very short
and require a fair amount of maturity. This document is an attempt to follow
their trail while making the material accessible to less sophisticated readers. We
shall now explain our purpose in more details.
Our aim is to study mathematical statements as mathematical objects of
their own. We believe universal algebras are likely to be the right tool for
this study. We would like to represent the set of all possible mathematical
statements as a universal algebra of some type. Without knowing much of the
details at this stage, we shall call this universal algebra the universal algebra of
rst order logic. The elements of this universal algebra will be called formulas
of rst order logic. It may appear as somewhat naive to refer to the universal
algebra of rst order logic as if we had some form of uniqueness property. There
are after all many logical systems which may be classied as rst order, and
many more algebras arising from these systems. However, we are looking for
an algebra which is minimal in some sense, and encompasses the language of
ZF on which most of modern mathematics can be built. When looking at a
given mathematical statement, we are often casually told that it could be (at
least in principle) coded into the language of ZF. Oh really? We would love
to see that. There are so many issues surrounding this question, it does not
seem obvious at all. One cannot simply dene predicates of ever increasing
complexity and hope to blindly substitutes these denitions within formulas.
Most mathematical objects are not uniquely dened as sets, but are complex
structures which are known up to isomorphism. The list of diculties is endless.
And there is of course category theory. Granted category theory will never t
into the language of ZF, so we shall need another algebra for that. But who
knows? it may be that the universal algebra of rst order logic will be applicable
to category theory: once we can formally dene a predicate of a single variable,
we can formally dene a class as a proper mathematical object and dwell into
the wonders of meta-mathematics without the intuition and the hand waving.
So much for nding the names. The real challenge is to determine the
appropriate denitions. We saw in theorem (9) of page 58, that there was
no loss of generality in viewing our universal algebra as a quotient of a free
universal algebra, modulo the right congruence. One possible way to dene the
universal algebra of rst order logic is therefore to provide specic answers to
the following questions: First, a free generator X
0
needs to be chosen. We then
need to agree on a specic type of universal algebra in order to derive a free
universal algebra X of type with free generator X
0
. Finally, a decision needs
to be made on what the appropriate congruence on X should be.
These we believe are the right questions. For those already familiar with
elements of formal logic, we may outline now a possible answer. Having chosen
a formal language with a deductive system on it, we can consider the relation
dened by if and only if the formulas and are both provable
with respect to this deductive system. This choice of particular congruence gives
67
rise to a Lindenbaum-Tarski algebra. So it is possible our universal algebra of
rst order logic is just a particular case of Lindenbaum-Tarski (and it would
probably have been sensible to call it that way). However at this point of the
document, we do not know whether the provability of is decidable.
We may have heard a few things about the undecidability of rst order logic
and suspect that ( ) is in fact undecidable. If this was the case, a
computer would not be able to tell in general whether the equivalence
holds. This would be highly unsatisfactory. Whatever congruence we eventually
choose, surely we should want it to be decidable. The universal algebra of rst
order logic is about mathematical statements, not theorems. Those statements
which happen to have the same meaning should be regarded as identical, and a
computer program should be able to establish this identity. Those mathematical
statements which happen to be mathematically equivalent should be dealt with
by proof theory, model theory and set theory.
2.1.2 The Free Universal Algebra of First Order Logic
In this section, we shall limit our investigation to the free generator X
0
and
type of universal algebra , postponing the issue of congruence to later parts
of this document. So we are looking to dene the free universal algebra of rst
order logic. First we shall deal with the free generator X
0
. The elements of
X
0
should represent the most elementary mathematical statements. Things like
(x y) or (x = y) are possible candidates for membership to X
0
. We could also
have constant symbols and regard (0 y) or (x = ) as elementary statements.
More generally, statements such as R[x, y] or even S[s, 0, u, , w] could be viewed
as elementary statements, where R and S are so-called predicate symbols. We
may even consider R[f(1, y), z] where f is a so-called function symbol. As far
as we are concerned, the universal algebra of rst order logic should be the
mathematical language of the lowest possible level. In computing terms, it
should be akin to assembly language. There should be no constant, no function
symbol and no predicate symbol beyond the customary . Even the equality
= should be banned. The universal algebra of rst order logic should contain
the bare minimum required to express every standard mathematical statement.
This bare minimum we believe can be based simply on elementary propositions
of the form (x y) where x and y belong to a set of variable V . So our free
generator X
0
should therefore be reduced to:
X
0
= (x y) : x, y V
Turning now to the type of universal algebra , we believe the minimal set
of operations (also known as connectives) required to express every standard
mathematical statement consists in the contradiction constant , which is an
operator of arity 0, the implication operator which is a binary operator, and
for all x V , the quantication operator x which is a unary operator. So the
free universal algebra of rst order logic should consist in elements of the form:
, (x y) , xy (x y) , z[(z x) (z y)] , etc.
68
At this point in time, we have very little to justify this choice, beyond its
simplicity. Clearly, there would be no point in dening and studying a free
universal algebra of rst order logic which does not fulll its role. At some
stage, we will need to consider higher level formal languages which are actually
usable by human beings to express interesting mathematics, and show that high
level statements can eectively be compiled into the low level. This we believe
is the only way to vindicate our choice. For now, we hope simply to reassure
the reader that our conception of bare minimum is sensible: for example, the
negation operator could be compiled as = ( ), the disjunction and
conjunction operators as = and = () respectively,
the equivalence operator as ( ) = [( ) ( )] and nally the
equality predicate as:
(x = y) = z[(z x) (z y)] z[(x z) (y z)]
This does not tell us how to introduce constant symbols and function symbols,
and we are still a very long way to ever compile a statement such as:
1

2
_
+

e
x
2
/2
dx = 1
We hope to address some of these questions in later parts of this document. For
now, we shall rest on the belief that our free universal algebra of rst order logic
is the right tool to consider and study.
But assuming our faith is justied, assuming that interesting mathematical
statements can eectively be compiled as formulas of rst order logic, one im-
portant question still remains: why should we care about a low level algebra
whose mathematical statements are so remote from day to day mathematics?
Why not consider a high level language directly? The answer is twofold: From a
purely mathematical point of view it is a lot easier to deal with a simple algebra.
For instance, if we are looking to prove anything by structural induction, we only
need to consider (x y), , and x for all x V which would not be the case
with a more complex algebra. Furthermore from a computing perspective, we
are unlikely to write sensible software unless we start from the very simple and
move on to the high level. A computer chip typically understands a very limited
set of instructions. High level languages are typically compiled into a sequence
of these elementary tasks. Without thinking very hard on this, it is a natural
thing to believe that a similar approach should be used for meta-mathematics.
At the end of the day, if it is indeed the case that deep mathematical statements
can be compiled as formulas of rst order logic, focussing on these formulas will
surely prove very useful.
Denition 51 Let V be a set. We call First Order Logic Type associated with
V , the type of universal algebra dened by:
= , x : x V
where = ((0, 0), 0), = ((1, 0), 2) and x = ((2, x), 1) given x V .
69
There is no particular conditions imposed on the set V , which could be
empty, nite, innite, countable or uncountable. The set of denition (51) is
a set of ordered pairs which is functional. It is therefore a map with domain
dom() = (0, 0), (1, 0) (2, x) : x V and range rng() = 0, 1, 2. Since
rng() N, by virtue of denition (1) is indeed a type of universal algebra.
With our customary abuse of notation described in page 6, we have () = 0,
() = 2 and (x) = 1 for all x V . It follows that is understood to be
an operator of arity 0, while is binary and x is unary.
Denition 52 Let V be a set with rst order logic type . We call Free Uni-
versal Algebra of First Order Logic associated with V , the free universal algebra
P(V ) of type with free generator P
0
(V ) = V V .
The free universal algebra P(V ) exists by virtue of theorem (1) of page 20. It
is also unique up to isomorphism and we have P
0
(V ) P(V ). For all x, y V
the ordered pair (x, y) will be denoted (x y). It follows that the free generator
P
0
(V ) of P(V ) is exactly what we had promised:
P
0
(V ) = (x y) : x, y V
Furthermore, we have the operators : P(V )
0
P(V ), : P(V )
2
P(V )
and x : P(V )
1
P(V ). We shall refer to the constant (0) simply as .
Given , P(V ), we shall write instead of (, ). Given x V and
P(V ), we shall write x instead of x().
2.1.3 Variable Substitution in Formula
Let V be a set and x, y, z V . Then = x(x z) and = y(y z) are
elements of P(V ). In the case when x ,= z and y ,= z most of us would regard
and as being the same mathematical statement. Yet from theorem (2) of
page 21, and are distinct elements of P(V ) when x ,= y. We are still a
long way to have a clear opinion on what an appropriate congruence on P(V )
should be. Yet we feel that whatever our choice of congruence , we should
have . So we need to formalize the idea that replacing the variable x in
= x(x z) by the variable y does not change the meaning of the formula .
The rst step is for us to formally dene the notion of variable substitution.
Variable substitutions is a key idea in every textbook of mathematical logic.
However, the usual treatment attempts to dene substitutions where variables
are replaced by terms. Since quantication can only occur with respect to a
variable, only free occurrences of variables are substituted with terms, leaving
bound occurrences unchanged. This creates a major drawback: if : V W is
a map and P(V ), we certainly want the substituted formula () to be an
element of P(W), which would be impossible to achieve if we were to leave bound
occurrences of variables unchanged. So contrary to standard practice, we shall
dene a notion of variable substitution where every variable is carried over by
: V W. Dening : P(V ) P(W) from the map : V W we believe is
very fruitful. This type of substitution will not be capture-avoiding in general,
70
but this is no dierent from the general treatment where we often require a
term t to be substitutable for a variable x in , or to be free for x in . Note
that our formal language P(V ) has no term beyond the variables themselves. So
focussing on variables only for substitutions is no restriction of generality. In our
view, terms or constants should not exist in a low level mathematical statement.
The constant = 3.14159 . . . should be a low level predicate (x), and given a
predicate Q(x) the statement Q() obtained by substituting the variable x by
the ground term , should in fact be the statement x[(x) Q(x)] which is
syntactically a very dierent operation from a substitution of variable.
Denition 53 Let V and W be two sets and : V W be a map. We call
substitution mapping between P(V ) and P(W) associated with : V W the
map

: P(V ) P(W) dened by the structural recursion:


P(V ) ,

() =

((x) (y)) if = (x y)
if =

(
1
)

(
2
) if =
1

2
(x)

(
1
) if = x
1
(2.1)
This is one of our rst denition by structural recursion. The principle of such
denition on a free universal algebra is justied by virtue of theorem (4) of
page 45. As this is one of our rst use of this theorem, we shall make sure its
application is done appropriately by checking all the relevant details.
Proposition 54 The structural recursion of denition (53) is legitimate.
Proof
We need to show the existence and uniqueness of the map

: P(V ) P(W)
satisfying equation (2.1). We shall do so by applying theorem (4) of page 45
to the free universal algebra P(V ) with free generator X
0
= P
0
(V ) and the set
A = P(W). First we dene g
0
: X
0
A by g
0
(x y) = ((x) (y)) for all
x, y V . Next, given f we need to dene an operator h(f) : A
(f)
A.
For all
1
,
2
A and for all x V , we set:
(i) h()(0) =
(ii) h()(
1
,
2
) =
1

2
(iii) h(x)(
1
) = (x)
1
Applying theorem (4) of page 45, there exists a unique map

: P(V ) A
such that

|X0
= g
0
and for all f and y P(V )
(f)
:

(f(y)) = h(f)(

(y)) (2.2)
In other words, there exists a unique map

: P(V ) P(W) such that for all


x, y V we have

(x y) = ((x) (y)) and which satises equation (2.2).


Taking f = and y = 0 in equation (2.2) we obtain:

() =

((0)) = h()(0) = (2.3)


71
Taking f = and y = (
1
,
2
) P(V )
2
we obtain:

(
1

2
) = h()(

(
1
),

(
2
)) =

(
1
)

(
2
) (2.4)
Finally, taking f = x and y =
1
P(V )
1
given x V :

(x
1
) = h(x)(

(
1
)) = (x)

(
1
) (2.5)
Since equations (2.3), (2.4) and (2.5) are exactly as equation (2.1), we conclude
there is a unique map

: P(V ) P(W) satisfying equation (2.1).


Let U, V, W be sets while : U V and : V W are maps. From
denition (53) we obtain the substitution mappings

: P(U) P(V ) and

: P(V ) P(W). However, : U W is also a map and we therefore


have a substitution mapping ( )

: P(U) P(W). One natural question


to ask is whether ( )

coincide with the composition

. The following
proposition shows that it is indeed the case. As a consequence, we shall be able
to simplify our notations by referring to the substitution mappings

and
( )

simply as , and , i.e. we shall drop the when referring to


a substitution mapping. Whether the notation is understood to mean

or ( )

no longer matters since the two mappings coincide anyway.


Proposition 55 Let U, V and W be sets. Let : U V and : V W
be maps. Let

: P(U) P(V ) and

: P(V ) P(W) be the substitution


mappings associated with and respectively. Then:
( )

where ()

: P(U) P(W) is the substitution mapping associated with .


Proof
We need to show that ( )

() =

() for all P(U). We shall


do so by structural induction, using theorem (3) of page 32. Since P
0
(U) is a
generator of P(U), we show rst that the property is true on P
0
(U). So let
= (x y) P
0
(U), where x, y U. We have:
( )

() = ( ((x)) ((y)) ) =

( (x) (y) ) =

()
Next we check that the property is true for P(U):
( )

() = =

() =

()
Next we check that the property is true for =
1

2
, if it is true for
1
,
2
:
( )

() = ( )

(
1
) ( )

(
2
)
=

(
1
))

(
2
))
=

(
1
)

(
2
) )
=

(
1

2
))
=

()
72
Finally we check that the property is true for = x
1
, if it is true for
1
:
( )

() = ((x)) ( )

(
1
)
= ((x))

(
1
))
=

((x)

(
1
))
=

(x
1
))
=

()

In the case when W = V and : V W is the identity mapping dened


by (x) = x for all x V , equation (2.1) of denition (53) becomes:
P(V ) , () =

(x y) if = (x y)
if =
(
1
) (
2
) if =
1

2
x(
1
) if = x
1
It seems pretty obvious that : P(V ) P(V ) is also the identity mapping, i.e.
that () = for all P(V ). Yet, formally speaking, there seems to be no
other way but to prove this once and for all. So we shall do so now:
Proposition 56 Let V be a set and i : V V be the identity mapping. Then,
the associated substitution mapping i : P(V ) P(V ) is also the identity.
Proof
We need to show that i() = for all P(V ). We shall do so by structural
induction, using theorem (3) of page 32. Since P
0
(V ) is a generator of P(V ),
we show rst that the property is true on P
0
(V ). So let = (x y) P
0
(V ):
i() = ( i(x) i(y) ) = (x y) =
The property is clearly true for P(V ) since i() = . Next we check that
the property is true for =
1

2
, if it is true for
1
,
2
P(V ):
i() = i(
1
) i(
2
) =
1

2
=
Finally we check that the property is true for = x
1
, if it is true for
1
:
i() = i(x)i(
1
) = x
1
=

Since P(V ) is a free universal algebra, every formula P(V ) has a well
dened set of sub-formulas Sub() as per denition (40) of page 59. If V and
W are sets and : V W is a map with associated substitution mapping
: P(V ) P(W), then given P(V ) and a sub-formula _ it seems
pretty obvious that the image () is also a sub-formula of (). In fact,
the converse is also true and the sub-formulas of () are simply the images
() by of the sub-formulas _ . This property stems from the fact that
: P(V ) P(W) is a structural substitution, as per denition (49).
73
Proposition 57 Let V, W be sets and : V W be a map. The associated
substitution : P(V ) P(W) is a structural substitution and for all P(V ) :
Sub(()) = (Sub ())
Proof
By virtue of proposition (50) it is sucient to prove that : P(V ) P(W)
is a structural substitution as per denition (49). Let (V ) and (W) denote
the rst order logic types associated with V and W respectively as per deni-
tion (51). Let q : (V ) (W) be the map dened by:
f (V ) , q(f) =

if f =
if f =
(x) if f = x
Then q is clearly arity preserving. In order to show that is a structural sub-
stitution, we simply need to check that properties (i) and (ii) of denition (49)
are met. First we start with property (i): so let P
0
(V ). Then = (x y)
for some x, y V and we need to show that () P
0
(W) which follows im-
mediately from () = (x) (y). So we now show property (ii). Given
f (V ), given P(V )
(f)
we need to show that (f()) = q(f)(()).
First we assume that f = . Then (f) = 0, = 0 and consequently:
(f()) = ((0))
(0) denoted = ()
def. (53) =
(0) denoted = (0)
: 0 0 = q()((0))
= q(f)(())
Next we assume that f =. Then (f) = 2 and given = (
0
,
1
) :
(f()) = (
0

1
)
def. (53) = (
0
) (
1
)
: P(V )
2
P(W)
2
= q()((
0
,
1
))
= q(f)(())
Finally we assume that f = x, x V . Then (f) = 1 and given = (
0
) :
(f()) = (x
0
)
def. (53) = (x)(
0
)
: P(V )
1
P(W)
1
= q(x)((
0
))
= q(f)(())

74
2.1.4 Variable Substitution and Congruence
Let V, W be sets and : V W be a map. From denition (53) we have a
substitution mapping : P(V ) P(W) which we still denote by virtue of
proposition (55). Suppose we have a congruence on P(V ) and a congruence on
P(W), both denoted for the purpose of this discussion. Given , P(V )
such that , it will often be useful for us to know whether the substitution
mapping preserves the equivalence of and , i.e. whether () ().
When this is the case, a very common strategy to go about the proof is to
consider a relation on P(V ) dened by if and only if () (),
and to argue that is in fact a congruence on P(V ). Once we know that is
a congruence on P(V ), a proof of () () is simply achieved by showing
an inclusion of the form R
0
where R
0
is a generator of the congruence
on P(V ), in the sense of denition (39). Indeed, such an inclusion implies that
from which we see that for all , P(V ) we have:
() ()
The following proposition conrms that is a congruence on P(V ).
Proposition 58 Let V and W be sets and : V W be a map. Let be an
arbitrary congruence on P(W) and let be the relation on P(V ) dened by:
() ()
for all , P(V ). Then is a congruence on P(V ).
Proof
Since the congruence on P(W) is an equivalence relation, is clearly reexive,
symmetric and transitive on P(V ). So we simply need to show that is a
congruent relation on P(V ). Since is reexive, we have () () and so
. Suppose
1
,
2
,
1
and
2
P(V ) are such that
1

1
and
2

2
.
Dene =
1

2
and =
1

2
. We need to show that , or
equivalently that () (). This follows from the fact that (
1
) (
1
),
(
2
) (
2
) and furthermore:
() = (
1

2
)
= (
1
) (
2
)
(
1
) (
2
)
= (
1

2
)
= ()
where the intermediate crucially depends on being a congruent relation
on P(W). We now suppose that
1
,
1
P(V ) are such that
1

1
. Let
x V and dene = x
1
and = x
1
. We need to show that , or
equivalently that () (). This follows from (
1
) (
1
) and:
() = (x
1
)
75
= (x) (
1
)
(x)(
1
)
= (x
1
)
= ()
where the intermediate crucially depends on being a congruent relation.
2.1.5 Variable of a Formula
Let V be a set and x, y V . Then = x(x y) is an element of P(V ). In
this section, we aim to formalize the idea that the set of variables of is x, y.
Hence we want to dene a map Var : P(V ) T(V ) such that Var() = x, y.
Denition 59 Let V be a set. The map Var : P(V ) T(V ) dened by the
following structural recursion is called variable mapping on P(V ):
P(V ) , Var() =

x, y if = (x y)
if =
Var(
1
) Var(
2
) if =
1

2
x Var(
1
) if = x
1
(2.6)
We say that x V is a variable of P(V ) if and only if x Var().
Proposition 60 The structural recursion of denition (59) is legitimate.
Proof
We need to show the existence and uniqueness of Var : P(V ) T(V ) satisfying
equation (2.6). This follows from an immediate application of theorem (4)
of page 45 to the free universal algebra P(V ) and the set A = T(V ), using
g
0
: P
0
(V ) A dened by g
0
(x y) = x, y for all x, y V , and the
operators h(f) : A
(f)
A (f ) dened for all V
1
, V
2
A and x V as:
(i) h()(0) =
(ii) h()(V
1
, V
2
) = V
1
V
2
(iii) h(x)(V
1
) = x V
1

Recall that a set A is said to be nite if and only if there exists a bijection
j : n A for some n N. The set A is said to be innite if it is not nite.
Note that if A and B are nite sets, then A B is also nite. We shall accept
this fact without proof in this document. The following proposition shows that
a formula of rst order predicate logic always has a nite number of variables.
Proposition 61 Let V be a set and P(V ). Then Var() is nite.
76
Proof
Given P(V ), we need to show that Var() is a nite set. We shall do so by
structural induction using theorem (3) of page 32. Since P
0
(V ) is a generator
of P(V ), we rst check that the property is true for P
0
(V ). So suppose
= (x y) for some x, y V . Since Var() = x, y, it is indeed a nite
set. Next we check that the property is true for P(V ), which is clear since
Var() = . Next we check that the property is true for =
1

2
, if it is true
for
1
,
2
P(V ). This follows immediately from Var() = Var(
1
) Var(
2
)
which is clearly a nite set if both Var(
1
) and Var(
2
) are nite. Finally we
check that the property is true for = x
1
if it is true for
1
. This is also
immediate from Var() = x Var(
1
) and the fact that Var(
1
) is nite.
In the following proposition we show that if _ is a sub-formula of ,
then the variables of are also variables of as we should expect:
Proposition 62 Let V be a set and , P(V ). Then we have:
_ Var() Var()
Proof
This is a simple application of proposition (47) to Var : X A where X = P(V )
and A = T(V ) where the preorder on A is the usual inclusion . We sim-
ply need to check that given
1
,
2
P(V ) and x V we have the inclusions
Var(
1
) Var(
1

2
), Var(
2
) Var(
1

2
) and Var(
1
) Var(x
1
)
which follow immediately from the recursive denition (59).
Let V, W be sets and : V W be a map with associated substitution
mapping : P(V ) P(W). Given P(V ), the variables of are the
elements of the set Var(). The following proposition allows us to determine
which are the variables of (). Specically:
Var(()) = (x) : x Var()
In other words the variables of () coincide with the range of the restriction

|Var()
. As discussed in page 19, this range is denoted (Var()).
Proposition 63 Let V and W be sets and : V W be a map. Then:
P(V ) , Var(()) = (Var())
where : P(V ) P(W) also denotes the associated substitution mapping.
Proof
Given P(V ), we need to show that Var(()) = (x) : x Var(). We
shall do so by structural induction using theorem (3) of page 32. Since P
0
(V )
is a generator of P(V ), we rst check that the property is true for P
0
(V ).
So suppose = (x y) P
0
(V ) for some x, y V . Then we have:
Var(()) = Var((x y))
77
= Var( (x) (y) )
= (x), (y)
= (u) : u x, y
= (u) : u Var(x y)
= (x) : x Var()
Next we check that the property is true for P(V ):
Var(()) = Var() = = (x) : x Var()
Next we check that the property is true for =
1

2
if it is true for
1
,
2
:
Var(()) = Var((
1

2
))
= Var((
1
) (
2
))
= Var((
1
)) Var((
2
))
= (x) : x Var(
1
) (x) : x Var(
2
)
= (x) : x Var(
1
) Var(
2
)
= (x) : x Var(
1

2
)
= (x) : x Var()
Finally we check that the property is true for = x
1
if it is true for
1
:
Var(()) = Var((x
1
))
= Var((x) (
1
))
= (x) Var((
1
))
= (x) (u) : u Var(
1
)
= (u) : u x Var(
1
)
= (u) : u Var(x
1
)
= (x) : x Var()

Let V, W be sets and : V W be a map with associated substitution


mapping : P(V ) P(W). It will soon be apparent that whether or not
: V W is injective plays a crucial role. For example if = xy(x y)
with x ,= y and (x) = u while (y) = v, we obtain () = uv(u v).
When is injective we have u ,= v and it is possible to argue in some sense
that the meaning of () is the same as that of . If is not an injective
map, then we no longer have the guarantee that u ,= v. In that case ()
may be equal to uu(u u), which is a dierent mathematical statement
from that of . Substitution mappings which arise from an injective map are
therefore important. However, there will be cases when we shall need to consider
a map : V W which is not injective. One common example is the map
78
[y/x] : V V dened in denition (65) which replaces a variable x by a variable
y, but leaves the variable y as it is, if already present in the formula. Since
[y/x](x) = y = [y/x](y), this map is not injective when x ,= y. This map is not
the same as the map [y : x] : V V dened in denition (66) which permutes the
two variables x and y. In contrast to [y/x] the permutation [y : x] is an injective
map. However if a formula P(V ) does not contain the variable y, it is easy
to believe that the associated substitution mappings [y/x] : P(V ) P(V ) and
[y : x] : P(V ) P(V ) have an identical eect on the formula , or in other
words that [y/x]() = [y : x](). If this is the case, many things can be said
about the formula [y/x]() despite the fact that [y/x] is not an injective map,
simply because of the equality [y/x]() = [y : x](). More generally, many things
can be said about a formula () whenever it coincides with a formula (),
and is an injective map. To achieve the equality () = () we only need
and to coincide on Var() as we shall now check in the following proposition.
Proposition 64 Let V and W be sets and , : V W be two maps. Then:

|Var()
=
|Var()
() = ()
for all P(V ), where , : P(V ) P(W) are also the substitution mappings.
Proof
Given P(V ), we need to show that if and coincide on Var() V ,
then () = (), and conversely that if () = (), then and coincide on
Var(). We shall do so by structural induction using theorem (3) of page 32.
Since P
0
(V ) is a generator of P(V ), we rst check that the equivalence is true for
P
0
(V ). So suppose = (x y) for some x, y V , and suppose furthermore
that
|Var()
=
|Var()
. Since Var() = x, y it follows that (x) = (x) and
(y) = (y). Hence, we have the following equalities:
() = (x y)
= ( (x) (y) )
= ( (x) (y) )
= (x y)
= ()
Conversely, if () = () then ( (x) (y) ) = ( (x) (y) ) and it follows
that (x) = (x) and (y) = (y). So and coincide on Var(). Next we
check that the equivalence is true for P(V ). Since Var() = , we always
have the equality
|Var()
= =
|Var()
. Furthermore () = = () is
also always true. So the equivalence does indeed hold. Next we check that the
equivalence is true for =
1

2
, if it is true for
1
,
2
P(V ). So we
assume that and coincide on Var(). We need to prove that () = ().
However, since Var() = Var(
1
) Var(
2
), and coincide both on Var(
1
)
and Var(
2
). Having assumed the property is true for
1
and
2
it follows that
79
(
1
) = (
1
) and (
2
) = (
2
). Hence:
() = (
1

2
)
= (
1
) (
2
)
= (
1
) (
2
)
= (
1

2
)
= ()
Conversely if () = () then we obtain (
1
) (
2
) = (
1
) (
2
).
Using theorem (2) of page 21 it follows that (
1
) = (
1
) and (
2
) = (
2
).
Having assumed the equivalence is true for
1
and
2
we conclude that and
coincide on Var(
1
) and Var(
2
) and consequently they coincide on Var().
Finally we check that the equivalence is true for = x
1
if it is true for
1
.
So we assume that and coincide on Var(). We need to show () = ().
However, since Var() = x Var(
1
), and coincide on Var(
1
) and we
have (x) = (x). Having assumed the property is true for
1
it follows that
(
1
) = (
1
), and consequently:
() = (x
1
)
= (x) (
1
)
= (x) (
1
)
= (x
1
)
= ()
Conversely if () = () then we obtain (x) (
1
) = (x) (
1
). Using
theorem (2) of page 21 it follows that (
1
) = (
1
) and (x) = (x). Having
assumed the equivalence is true for
1
we conclude that and coincide on
Var(
1
) and consequently they coincide on Var() = x Var(
1
).
2.1.6 Substitution of Single Variable
In denition (53) we introduced the notion of substitution mapping associated
with a given map. Our motivation was to formalize the idea that x(x z)
and y(y z) were identical mathematical statements when z , x, y, or
equivalently that replacing the variable x by the variable y in x(x z) did not
change the meaning of the formula. We are now in a position to formally dene
the substitution of a variable by another.
Denition 65 Let V be a set and x, y V . We call substitution of y in place
of x, the map [y/x] : V V dened by:
u V , [y/x](u) =
_
y if u = x
u if u ,= x
If we denote [y/x] : P(V ) P(V ) the associated substitution mapping, given
P(V ) the image [y/x]() is called with y in place of x and denoted [y/x].
80
The substitution of y in place of x has a fundamental aw: it is not an
injective map. In most introductory texts on mathematical logic, this is hardly
a problem as the set of variables V is always assumed to be innite. So if we wish
to argue that xy(x y) is the same mathematical statement as yx(y x),
we can always choose a variable z , x, y and prove the equivalence as:
xy(x y) zy(z y)
zx(z x)
yx(y x)
Unfortunately, this will not be possible in the case when V = x, y and x ,= y,
i.e. when V has only two elements. As we shall see in later parts of this
document, a congruence dened on P(V ) in terms of the substitution mapping
[y/x] will lead to the paradoxical situation of xy(x y) , yx(y x) when
V = x, y. To remedy this fact, we shall introduce the permutation mapping
[y : x] and use this as the basis of a new congruence which will turn out to be
simpler, equivalent to the traditional notion based on [y/x] for V innite, and
working equally well for V nite or innite.
Denition 66 Let V be a set and x, y V . We call permutation of x and y,
the map [y : x] : V V dened by:
u V , [y : x](u) =

y if u = x
x if u = y
u if u , x, y
If we denote [y : x] : P(V ) P(V ) the associated substitution mapping, given
P(V ) the image [y : x]() is denoted [y : x].
An immediate consequence of this denition is:
Proposition 67 Let V , W be sets and : V W be an injective map. Then
for all x, y V we have the following equality:
[y : x] = [(y): (x)]
Proof
Let u V . We shall distinguish three cases. First we assume u , x, y. Then:
[y : x](u) = (u) = [(y): (x)] (u)
where the last equality crucially depends on (u) , (x), (y) which itself
follows from the injectivity of and u , x, y. We now assume that u = x:
[y : x](u) = (y)
= [(y): (x)] (x)
= [(y): (x)] (u)
81
Finally, we assume that u = y and obtain:
[y : x](u) = (x)
= [(y): (x)] (y)
= [(y): (x)] (u)
In all cases we see that [y : x](u) = [(y): (x)] (u) as requested.
As already noted, the permutation [y : x] is injective while the substitution
[y/x] is not. However, there will be cases when the formulas [y/x] and [y : x]
coincide, given a formula . Knowing when the equality holds is important and
one simple sucient condition is y , Var(), as will be seen from the following:
Proposition 68 Let V be a set and U V . Then for all x, y V we have:
y , U [y/x]
|U
= [y : x]
|U
Proof
We assume that y , U. Let u U. We need to show that [y/x](u) = [y : x](u).
We shall distinguish three cases: rst we assume that u , x, y. Then the
equality is clear. Next we assume that u = x. Then the equality is also clear.
Finally we assume that u = y. In fact, this cannot occur since y , U and u U.

Proposition 69 Let V be a set and x, y V . Let P(V ). Then, we have:


y , Var() [y/x] = [y : x]
Proof
We assume that y , Var(). We need to show that [y/x] = [y : x]. From
proposition (64) it is sucient to show that [y/x] and [y : x] coincide on Var(),
which follows immediately from y , Var() and proposition (68).
When replacing a variable x by a variable y in a formula , and subsequently
replacing the variable y by a variable z, we would expect the outcome to be the
same as replacing the variable x by the variable z directly. This is in fact the
case, provided y is not already a variable of . Assuming x, y and z are distinct
variables, a simple counterexample is = (x y).
Proposition 70 Let V be a set and x, y, z V . Then for all P(V ):
y , Var() [y/x][z/y] = [z/x]
Proof
Suppose y , Var(). We need to show that [z/y] [y/x]() = [z/x](). From
proposition (64) it is sucient to prove that [z/y] [y/x] and [z/x] coincide on
82
Var(). So let u Var(). We need to show that [z/y] [y/x](u) = [z/x](u).
Since y , Var() we have u ,= y. Suppose rst that u = x. Then we have:
[z/y] [y/x](u) = [z/y](y) = z = [z/x](u)
Suppose now that u ,= x. Then u , x, y and furthermore:
[z/y] [y/x](u) = [z/y](u) = u = [z/x](u)
In any case, we have proved that [z/y] [y/x](u) = [z/x](u).
When replacing a variable x by a variable y in a formula , we would expect
all occurrences of the variable x to have disappeared in [y/x]. In other words,
we would expect the variable x not to be a variable of the formula [y/x]. This
is indeed the case when x ,= y, a condition which may easily be forgotten.
Proposition 71 Let V be a set, x, y V with x ,= y. Then for all P(V ):
x , Var([y/x])
Proof
Suppose to the contrary that x Var([y/x]). From proposition (63), we have
Var([y/x]) = [y/x](Var()). So there exists u Var() such that x = [y/x](u).
If x ,= u we obtain [y/x](u) = u and consequently x = u. If x = u we obtain
[y/x](u) = y and consequently x = y. In both cases we obtain a contradiction.
2.1.7 Free Variable of a Formula
When = x(x y) the variables of are x and y i.e. Var() = x, y.
However, it is clear that a critical distinction exists between the role played
by the variables x and y. The situation is very similar to that of an integral
_
f(x, y) dx where x is nothing but a dummy variable. A dummy variable can
be replaced. In contrast, the variable y cannot be replaced without aecting
much of what is meant by the integral, or the formula . The variable y is called
a free variable of the formula . In this section, we formally dene the notion
of free variable and prove a few elementary properties which are related to it.
Denition 72 Let V be a set. The map Fr : P(V ) T(V ) dened by the
following structural recursion is called free variable mapping on P(V ):
P(V ) , Fr () =

x, y if = (x y)
if =
Fr (
1
) Fr (
2
) if =
1

2
Fr (
1
) x if = x
1
(2.7)
We say that x V is a free variable of P(V ) if and only if x Fr ().
Proposition 73 The structural recursion of denition (72) is legitimate.
83
Proof
We need to show the existence and uniqueness of Fr : P(V ) T(V ) satisfying
equation (2.7). This follows from an immediate application of theorem (4)
of page 45 to the free universal algebra P(V ) and the set A = T(V ), using
g
0
: P
0
(V ) A dened by g
0
(x y) = x, y for all x, y V , and the
operators h(f) : A
(f)
A (f ) dened for all V
1
, V
2
A and x V as:
(i) h()(0) =
(ii) h()(V
1
, V
2
) = V
1
V
2
(iii) h(x)(V
1
) = V
1
x

In proposition (63), given a map : V W with associated substitution


mapping : P(V ) P(W) and given a formula , we checked that the variables
of () were simply the elements of the direct image of Var() by , that is
Var(()) = (Var()). We now show two similar results for the free variables of
(), namely Fr (()) (Fr ()) in the general case and Fr (()) = (Fr ())
in the case when some restriction is imposed on the map , which we require
to be injective on the domain Var(). If = x(x y) and is not injective
on x, y, then it is possible to have () = x(x x) which would clearly
contradict Fr (()) = (Fr ()). However, the requirement that
|Var()
be
injective will be shown to be sucient but is not necessary. If = (x y) and
= [y/x], we have () = (y y) and the equality Fr (()) = (Fr ()) still
holds. We start with the general case leading to the weaker inclusion:
Proposition 74 Let V , W be sets and : V W be a map. Let P(V ) :
Fr (()) (Fr ())
where : P(V ) P(W) also denotes the associated substitution mapping.
Proof
Given P(V ), we need to show the inclusion Fr (()) (x) : x Fr ().
We shall do so by structural induction using theorem (3) of page 32. Since P
0
(V )
is a generator of P(V ), we rst check that the property is true for P
0
(V ).
So suppose = (x y) P
0
(V ) for some x, y V . Then, we have:
Fr (()) = Fr ((x y))
= Fr ( (x) (y) )
= (x), (y)
= (u) : u x, y
= (u) : u Fr (x y)
= (x) : x Fr ()
Having proved the equality, it follows in particular that the inclusion is true.
Next we check that the property is true for P(V ):
Fr (()) = Fr () = (x) : x Fr ()
84
Next we check that the property is true for =
1

2
if it is true for
1
,
2
:
Fr (()) = Fr ((
1

2
))
= Fr ((
1
) (
2
))
= Fr ((
1
)) Fr ((
2
))
(x) : x Fr (
1
) (x) : x Fr (
2
)
= (x) : x Fr (
1
) Fr (
2
)
= (x) : x Fr (
1

2
)
= (x) : x Fr ()
Finally we check that the property is true for = x
1
if it is true for
1
:
Fr (()) = Fr ((x
1
))
= Fr ((x) (
1
))
= Fr ((
1
)) (x)
(u) : u Fr (
1
) (x)
see below (u) : u Fr (
1
) x
= (u) : u Fr (x
1
)
= (x) : x Fr ()
We shall complete the proof by justifying the used inclusion:
(u) : u Fr (
1
) (x) (u) : u Fr (
1
) x
So suppose y = (u) for some u Fr (
1
) and y ,= (x). Then in particular
u ,= x, and if follows that y = (u) with u Fr (
1
) x.
We now prove a second version of proposition (74) which requires stronger
assumptions but leads to an equality rather than mere inclusion:
Proposition 75 Let V and W be sets and : V W be a map. Let P(V )
be such that
|Var()
is an injective map. Then, we have:
Fr (()) = (Fr ())
where : P(V ) P(W) also denotes the associated substitution mapping.
Proof
Given P(V ) such that
|Var()
is injective, we need to show the equality
Fr (()) = (x) : x Fr (). We shall do so by structural induction using
theorem (3) of page 32. Since P
0
(V ) is a generator of P(V ), we rst check that
the property is true for P
0
(V ). So suppose = (x y) P
0
(V ) for some
x, y V . Then, regardless of whether (x) = (y) or not, we have:
Fr (()) = Fr ((x y))
85
= Fr ( (x) (y) )
= (x), (y)
= (u) : u x, y
= (u) : u Fr (x y)
= (x) : x Fr ()
Next we check that the property is true for P(V ):
Fr (()) = Fr () = = (x) : x Fr ()
Next we check that the property is true for =
1

2
if it is true for
1
,
2
.
Note that if
|Var()
is injective, it follows from Var() = Var(
1
) Var(
2
)
that both
|Var(1)
and
|Var(2)
are injective, and consequently:
Fr (()) = Fr ((
1

2
))
= Fr ((
1
) (
2
))
= Fr ((
1
)) Fr ((
2
))
= (x) : x Fr (
1
) (x) : x Fr (
2
)
= (x) : x Fr (
1
) Fr (
2
)
= (x) : x Fr (
1

2
)
= (x) : x Fr ()
Finally we check that the property is true for = x
1
if it is true for
1
.
Note that if
|Var()
is injective, it follows from Var() = x Var(
1
) that

|Var(1)
is also injective, and consequently:
Fr (()) = Fr ((x
1
))
= Fr ((x) (
1
))
= Fr ((
1
)) (x)
= (u) : u Fr (
1
) (x)
see below = (u) : u Fr (
1
) x
= (u) : u Fr (x
1
)
= (x) : x Fr ()
We shall complete the proof by justifying the used equality:
(u) : u Fr (
1
) (x) = (u) : u Fr (
1
) x
First we show the inclusion . Suppose y = (u) for some u Fr (
1
) and
y ,= (x). Then in particular u ,= x, and if follows that y = (u) with
u Fr (
1
) x. We now show the reverse inclusion . Suppose y = (u) for
some u Fr (
1
) x. Then in particular y = (u) for some u Fr (
1
) and
we need to show that y ,= (x). Suppose to the contrary that y = (x). Then
86
(u) = (x) with u Fr (
1
) Var(
1
) Var() and x Var(). Having
assumed that
|Var()
is injective, we obtain u = x, contradicting the fact that
u Fr (
1
) x.
For an easy future reference, we now apply proposition (75) to the single vari-
able substitution mapping = [y/x]. We require that y , Var() to guarantee
the injectivity of the map
|Var()
.
Proposition 76 Let V be a set, P(V ), x, y V with y , Var(). Then:
Fr ([y/x]) =
_
Fr () x y if x Fr ()
Fr () if x , Fr ()
Proof
Let P(V ) such that y , Var(). Let : V V be the single variable
substitution map = [y/x] dened by (x) = y and (u) = u for u ,= x. From
the assumption y , Var() it follows that
|Var()
is an injective map. Indeed,
suppose (u) = (v) for some u, v Var(). We need to show that u = v,
which is clearly the case when u ,= x and v ,= x. So suppose u = x and v ,= x.
From (u) = (v) we obtain y = v contradicting the assumption y , Var().
So u = x and v ,= x is impossible and likewise u ,= x and v = x is an impossible
case. Of course the case u = x and v = x leads to u = v. So we have proved that
u = v in all possible cases and
|Var()
is indeed an injective map. From propo-
sition (75) it follows that Fr (()) = (Fr ()), i.e. Fr ([y/x]) = (Fr ()).
We shall complete the proof by showing that (Fr ()) = Fr () x y
when x Fr (), while (Fr ()) = Fr () when x , Fr (). First we assume
that x , Fr (). We need to show that (Fr ()) = Fr (). First we show the
inclusion . So suppose z = (u) for some u Fr (). We need to show that
z Fr (). Since u Fr (), it is sucient to prove that (u) = u. Hence it
is sucient to show that u ,= x which follows immediately from the assumption
x , Fr (). This completes our proof of . We now show the reverse inclusion
. So suppose z Fr (). We need to show that z = (u) for some u Fr ().
It is sucient to show that z = (z) or z ,= x which follows from z Fr ()
and x , Fr (). This completes our proof in the case when x , Fr (). We now
assume that x Fr (). We need to show that (Fr ()) = Fr () x y.
First we prove the inclusion . So suppose z = (u) for some u Fr (). We
assume that z ,= y and we need to show that z Fr () and z ,= x. First we
show that z ,= x. So suppose to the contrary that z = x. Then (u) = x. If
u ,= x we obtain (u) = u ,= x. Hence it follows that u = x and consequently
(u) = (x). So z = y contradicting the initial assumption. So we need to
show that z Fr (). Once again, since z = (u) and u Fr (), it is sucient
to prove that (u) = u or u ,= x. So suppose to the contrary that u = x.
Then z = (u) = (x) = y which contradicts the initial assumption z ,= y.
This completes our proof of . We now show the reverse inclusion . Having
assumed that x Fr (), we have y = (x) (Fr ()). So it remains to show
that (Fr ()) Fr () x. So suppose z Fr () and z ,= x. We need to
show that z = (u) for some u Fr () for which it is clearly sucient to prove
87
that z = (z) or z ,= x, which is true by assumption.
In proposition (58) we showed that the relation dened by if and
only if () () was a congruence on P(V ), given an arbitrary substitution
: V W and congruence on P(W). The objective was to obtain an easy
way to prove an implication of the form () () where is
another congruence on P(V ) for which there is a known generator R
0
. We are
now interested in an implication of the form Fr () = Fr (). One
natural step towards proving such an implication consists in showing that the
relation dened by if and only if Fr () = Fr (), is itself a congruence
on P(V ). This is the purpose of the next proposition.
Proposition 77 Let V be a set and be the relation on P(V ) dened by:
Fr () = Fr ()
for all , P(V ). Then is a congruence on P(V ).
Proof
The relation is clearly reexive, symmetric and transitive on P(V ). So we
simply need to show that is a congruent relation on P(V ). By reexivity, we
already have . Suppose
1
,
2
,
1
and
2
P(V ) are such that
1

1
and
2

2
. Dene =
1

2
and =
1

2
. We need to show that
, or equivalently that Fr () = Fr (). This follows from the fact that
Fr (
1
) = Fr (
1
), Fr (
2
) = Fr (
2
) and:
Fr () = Fr (
1

2
)
= Fr (
1
) Fr (
2
)
= Fr (
1
) Fr (
2
)
= Fr (
1

2
)
= Fr ()
We now suppose that
1
,
1
P(V ) are such that
1

1
. Let x V and
dene = x
1
and = x
1
. We need to show that , or equivalently
that Fr () = Fr (). This follows from the fact that Fr (
1
) = Fr (
1
) and:
Fr () = Fr (x
1
) = Fr (
1
) x = Fr (
1
) x = Fr (x
1
) = Fr ()

2.1.8 Bound Variable of a Formula


As already noted, the variable x in x(x y) or the integral
_
f(x, y) dx is
a dummy variable. In mathematical logic, dummy variables are called bound
variables. For the sake of completeness, we now formally dene the set of bound
variables of a formula , i.e. the set of all variables which appear in some
88
quantier x within the formula . It should be noted that a variable x can be
both a bound variable and a free variable, as is the case for the formula:
= x(x y) (x y)
Denition 78 Let V be a set. The map Bnd : P(V ) T(V ) dened by the
following structural recursion is called bound variable mapping on P(V ):
P(V ) , Bnd () =

if = (x y)
if =
Bnd(
1
) Bnd(
2
) if =
1

2
x Bnd (
1
) if = x
1
(2.8)
We say that x V is a bound variable of P(V ) if and only if x Bnd().
Proposition 79 The structural recursion of denition (78) is legitimate.
Proof
We need to show the existence and uniqueness of Bnd : P(V ) T(V ) satisfying
equation (2.8). This follows from an immediate application of theorem (4)
of page 45 to the free universal algebra P(V ) and the set A = T(V ), using
g
0
: P
0
(V ) A dened by g
0
(x y) = for all x, y V , and the operators
h(f) : A
(f)
A (f ) dened for all V
1
, V
2
A and x V as:
(i) h()(0) =
(ii) h()(V
1
, V
2
) = V
1
V
2
(iii) h(x)(V
1
) = x V
1

Given a set V and P(V ), the intersection Fr () Bnd () is not empty


in general. The formula = (x y) x(x y) is an example where such
intersection is not empty. However:
Proposition 80 Let V be a set and P(V ). Then we have:
Var() = Fr () Bnd () (2.9)
Proof
We shall prove the inclusion (2.9) by structural induction using theorem (3) of
page 32. Since P
0
(V ) is a generator of P(V ), we rst check that the inclusion is
true for P
0
(V ). So suppose = (x y) P
0
(V ) for some x, y V . Then:
Var() = Var(x y)
= x, y
= x, y
= Fr (x y) Bnd(x y)
= Fr () Bnd()
89
Next we check that the property is true for P(V ):
Var() = = Fr () Bnd()
Next we check that the property is true for =
1

2
if it is true for
1
,
2
:
Var() = Var(
1

2
)
= Var(
1
) Var(
2
)
= Fr (
1
) Bnd(
1
) Fr (
2
) Bnd (
2
)
= Fr (
1

2
) Bnd (
1

2
)
= Fr () Bnd ()
Finally we check that the property is true for = x
1
if it is true for
1
:
Var() = Var(x
1
)
= x Var(
1
)
= x Fr (
1
) Bnd(
1
)
= x ( Fr (
1
) x ) Bnd(
1
)
= Fr (x
1
) Bnd(x
1
)
= Fr () Bnd()

In the following proposition we show that if _ is a sub-formula of ,


then the bound variables of are also bound variables of . Note that this
property is not true of free variables since Fr (
1
) , Fr (x
1
)) in general.
Proposition 81 Let V be a set and , P(V ). Then we have:
_ Bnd () Bnd()
Proof
This is a simple application of proposition (47) to the map Bnd : X A
where X = P(V ) and A = T(V ) where the preorder on A is the usual in-
clusion . We simply need to check that given
1
,
2
P(V ) and x V we
have the inclusions Bnd(
1
) Bnd(
1

2
), Bnd (
2
) Bnd (
1

2
) and
Bnd(
1
) Bnd (x
1
) which follow immediately from denition (78).
Let V, W be sets and : V W be a map with associated substitution
mapping : P(V ) P(W). Given P(V ), the bound variables of are the
elements of the set Bnd(). The following proposition allows us to determine
which are the bound variables of (). Specically:
Bnd (()) = (x) : x Bnd()
In other words the bound variables of () coincide with the range of the re-
striction
|Bnd ()
. As discussed in page 19, this range is denoted (Bnd ()).
90
Proposition 82 Let V and W be sets and : V W be a map. Then:
P(V ) , Bnd (()) = (Bnd ())
where : P(V ) P(W) also denotes the associated substitution mapping.
Proof
Given P(V ), we need to show that Bnd(()) = (x) : x Bnd (). We
shall do so by structural induction using theorem (3) of page 32. Since P
0
(V )
is a generator of P(V ), we rst check that the property is true for P
0
(V ).
So suppose = (x y) P
0
(V ) for some x, y V . Then we have:
Bnd (()) = Bnd ((x y))
= Bnd ( (x) (y) )
=
= (u) : u
= (u) : u Bnd (x y)
= (x) : x Bnd ()
Next we check that the property is true for P(V ):
Bnd (()) = Bnd() = = (x) : x Bnd ()
Next we check that the property is true for =
1

2
if it is true for
1
,
2
:
Bnd(()) = Bnd((
1

2
))
= Bnd((
1
) (
2
))
= Bnd((
1
)) Bnd((
2
))
= (x) : x Bnd(
1
) (x) : x Bnd (
2
)
= (x) : x Bnd(
1
) Bnd(
2
)
= (x) : x Bnd(
1

2
)
= (x) : x Bnd()
Finally we check that the property is true for = x
1
if it is true for
1
:
Bnd (()) = Bnd((x
1
))
= Bnd((x) (
1
))
= (x) Bnd ((
1
))
= (x) (u) : u Bnd (
1
)
= (u) : u x Bnd (
1
)
= (u) : u Bnd(x
1
)
= (x) : x Bnd()

91
2.1.9 Valid Substitution of Variable
Suppose we had a parameterized integral I(x) =
_
f(x, y)dy dened for x A.
Given some a A, we would naturally write I(a) =
_
f(a, y)dy. However, if y
was also an element of A, very few of us would dare to say I(y) =
_
f(y, y)dy.
Substituting the variable y in place of x is not valid, or in other words the
substitution [y/x] is not a valid substitution for the formula =
_
f(x, y)dy.
Similarly, if we had the formula = y(x y) P(V ) with x ,= y which
expresses some unary predicate I(x), it would be a natural thing to write the
formula I(a) = y(a y) whenever a ,= y, while referring to y(y y) as rep-
resenting I(y) would make no sense. Here again it is clear that the substitution
[y/x] is not a valid substitution for the formula = y(x y) when x ,= y. It
is not clear at this stage how I(y) should be dened but we certainly know it
cannot be dened in terms of a substitution of variable which is not valid. At
some point, it will be important for us to dene I(y) in order to claim that:
xy(x y) I(a)
is a legitimate axiom, including when a = y. We shall worry about this later. In
this section, we want to formally dene the notion of substitution : V W
which is valid for a formula P(V ). We are keeping W as an arbitrary set
rather than W = V to make this discussion as general as possible.
So suppose = y(x y) with x ,= y and : V W is an arbitrary
map. Then () = v(u v) where u = (x) and v = (y). The question we
are facing is to determine which conditions should be imposed on to make
it a valid substitution for . Since x is a free variable of y(x y), a natural
condition seems to require that u be a free variable of v(u v). This can
only happen when u ,= v which leads to the condition (x) ,= (y). So it seems
that an injective would be a valid substitution for . In fact, we know from
proposition (64) that () only depends on the restriction
|Var()
so we may
require that
|Var()
be injective rather than itself. However, does injectivity
really matter? Suppose we had = y[(x y) (y z)] with x, y, z distinct.
There would be nothing wrong with a substitution leading to the formula
v[(u v) (v u)] with u ,= v. There is nothing absurd in considering the
binary predicate I(x, z) = y[(x y) (y z)] with (x, z) = (u, u) (having
changed the bound variable y by v). So injectivity does not matter. What
matters is the fact that the free variables x and z of have been substituted by
the free variable u in v[(u v) (v u)]. In other words, when considering a
formula = y[ . . . x . . . z . . . ] with x and z free, what matters is not to end
up with something like () = (y)[ . . . (x) . . . (z) . . . ] with (x) = (y)
or (z) = (y). We do not want free variables of to become articially bound
by the substitution . So it seems that a possible condition for the substitution
to be valid for is the following:
x Fr () (x) Fr (())
So let us consider = xy(x y) with x ,= y as a new example. In this case,
the formula has no free variables and there is therefore no risk of having a free
92
variable getting articially bound by a substitution . So any substitution
would seem to be valid for , which of course isnt the case. As a rule of thumb,
a valid substitution is one which prevents a mathematician writing nonsense.
If = x with = y(x y), then we cannot hope () = (x)() to be
a legitimate expression, unless () itself is a legitimate expression. So yes, we
do not want free variables to become articially bound by the substitution ,
but these free variables are not so much the free variables of itself. There are
also the free variables of or more generally of any sub-formula of . So in the
light of this discussion we shall attempt the following:
Denition 83 Let V and W be sets and : V W be a map. Let P(V ).
We say that is valid for if and only if for every sub-formula _ we have:
x Fr () (x) Fr (())
where : P(V ) P(W) also denotes the associated substitution mapping.
Recall that the notion of sub-formula and the relation _ are dened in
denition (40) of page 59. An immediate consequence of denition (83) is:
Proposition 84 Let V and W be sets and : V W be a map. Let P(V ).
Then is valid for if and only if it is valid for any sub-formula _ .
Proof
Since _ , i.e. is a sub-formula of itself, the if part of this proposition
is clear. So we now prove the only if part. So suppose is valid for and
let _ . We need to show that is also valid for . So let _ and let
x Fr (). We need to show that (x) Fr (()), which follows immediately
from the validity of for and the fact (by transitivity) that _ , i.e. that
is also a sub-formula of .
Another immediate consequence of denition (83) is:
Proposition 85 Let V and W be sets and : V W be a map. Let P(V ).
Then is valid for if and only if for every sub-formula _ we have:
Fr (()) = (Fr ())
Proof
The substitution : V W if valid for if and only if for all _ :
x Fr () (x) Fr (())
This condition is clearly equivalent to (Fr ()) Fr (()), where (Fr ())
refers to the direct image of the set Fr () by the map : V W. From propo-
sition (74) we know the reverse inclusion Fr (()) (Fr ()) is always true.
As anticipated, if is injective on Var() then it is valid for :
93
Proposition 86 Let V and W be sets and : V W be a map. Let P(V )
such that
|Var()
is an injective map. Then is valid for .
Proof
Let : V W be map and P(V ) such that the restriction
|Var()
is an
injective map. We need to show that is valid for . So let _ . Using
proposition (85), we need to show that Fr (()) = (Fr ()). From propo-
sition (75), it is sucient to prove that
|Var()
is injective. Since
|Var()
is
injective, it is sucient to show that Var() Var() which follows from _
and proposition (62).
If it is meaningful to speak of (
1
) and (
2
), it should also be meaningful
to consider (
1
) (
2
) which is the same as (
1

2
). So in order to
prove that a substitution is valid for =
1

2
, it should be sucient to
establish its validity both for
1
and
2
. The following proposition establishes
that fact, which may be useful for structural induction arguments on P(V ).
Proposition 87 Let V and W be sets and : V W be a map. Let P(V )
of the form =
1

2
with
1
,
2
P(V ). Then is valid for if and only
if it is valid for both
1
and
2
.
Proof
Suppose is valid for =
1

2
. From the equality:
Sub () =
1

2
Sub (
1
) Sub(
2
) (2.10)
we obtain
1
Sub (
1
) Sub () and
2
Sub(
2
) Sub (). Hence we see
that
1
_ and
2
_ i.e. both
1
and
2
are sub-formulas of . It follows
from proposition (84) that is valid for both
1
and
2
. Conversely, suppose
is valid for
1
and
2
. We need to show that is valid for =
1

2
. So
let _ and u Fr (). We need to show that (u) Fr (()). From _
we have Sub (). So from the above equation (2.10), we must have =
or Sub (
1
) or Sub (
2
). So we shall distinguish three cases: rst we
assume that Sub(
1
). Then _
1
and from u Fr () and the validity
of for
1
we see that (u) Fr (()). Next we assume that Sub (
2
).
Then a similar argument shows that (u) Fr (()). Finally we assume that
= . Then Fr () = Fr (
1

2
) = Fr (
1
) Fr (
2
) and from u Fr () we
must have u Fr (
1
) or u Fr (
2
). So we shall distinguish two further cases:
rst we assume that u Fr (
1
). From
1
_
1
and the validity of for
1
we
obtain (u) Fr ((
1
)). However, we have:
Fr ((
1
)) Fr ((
1
)) Fr ((
2
))
= Fr ((
1
) (
2
))
= Fr ((
1

2
))
= Fr (())
Hence we see that (u) Fr (()). Next we assume that u Fr (
2
). Then
a similar argument shows that (u) Fr (()). In all possible cases we have
94
shown that (u) Fr (()).
The next proposition may also be useful for induction arguments on P(V ).
It establishes conditions for a substitution to be valid for a formula of the
form = x
1
. As expected, the validity of for
1
is a prerequisite. However,
we also require free variables of not to become articially bound by the sub-
stitution . This naturally leads to the condition u Fr () (u) ,= (x).
Proposition 88 Let V and W be sets and : V W be a map. Let P(V )
of the form = x
1
with
1
P(V ) and x V . Then is valid for if and
only if it is valid for
1
and for all u V we have:
u Fr (x
1
) (u) ,= (x)
Proof
First we show the only if part. So suppose is valid for = x
1
. From:
Sub () = x
1
Sub (
1
) (2.11)
we obtain
1
Sub(
1
) Sub () and consequently
1
_ i.e.
1
is a sub-
formula of . It follows from proposition (84) that is valid for
1
. So let
u Fr (x
1
) = Fr (). It remains to show that (u) ,= (x). Since _ and
u Fr () it follows from the validity of for that (u) Fr (()). Hence
(u) is a free variable of () = (x)(
1
), so (u) ,= (x) as requested.
We now prove the if part. So suppose is valid for
1
and furthermore
that (u) ,= (x) whenever u Fr (x
1
) = Fr (). We need to show that
is valid for = x
1
. So let _ and u Fr (). We need to show
that (u) Fr (()). From _ we have Sub(). So from the above
equation (2.11), we must have = or Sub (
1
). So we shall distinguish
two cases: rst we assume that Sub(
1
). Then _
1
and from u Fr ()
and the validity of for
1
we see that (u) Fr (()). Next we assume that
= . Then Fr () = Fr (x
1
) = Fr (
1
) x and from u Fr () we must
have in particular u Fr (
1
). From
1
_
1
and the validity of for
1
we
obtain (u) Fr ((
1
)). Furthermore, from u Fr () and = we obtain
u Fr () and this implies by assumption that (u) ,= (x). Thus:
(u) Fr ((
1
)) (x) = Fr ((x)(
1
))
= Fr ((x
1
))
= Fr (())
Hence we see that (u) Fr (()) as requested.
Looking back at denition (83), a substitution is valid for if and only if
for all P(V ) we have the following property:
( _ ) [ u Fr () (u) Fr (()) ]
95
In the next proposition, we oer another criterion for the validity of for :
(x
1
_ ) [ u Fr (x
1
) (u) ,= (x) ]
This criterion is arguably simpler as we no longer need to check every sub-
formula _ , but instead can limit our attention to those sub-formula of the
form = x
1
with
1
P(V ) and x V . Furthermore, we no longer need
to establish that (u) is a free variable of (), but instead have to show that
(u) ,= (x) which is also a lot simpler.
Proposition 89 Let V and W be sets and : V W be a map. Let P(V ).
Then is valid for if and only if for all
1
P(V ) and x V we have:
(x
1
_ ) [ u Fr (x
1
) (u) ,= (x) ]
Proof
First we show the only if part: So suppose is valid for . Let
1
P(V )
and x V such that = x
1
_ and let u Fr (). We need to show that
(u) ,= (x). However, from the validity of for we have (u) Fr (()).
So (u) is a free variable of () = (x)(
1
) and (u) ,= (x) as requested.
We now prove the if part: Consider the property P

() dened by:

1
x [ (x
1
_ ) [ u Fr (x
1
) (u) ,= (x) ] ]
Then we need to prove that P

() ( valid for ) for all P(V ). We shall


do so by structural induction, using theorem (3) of page 32. First we assume
that = (x y) for some x, y V . We need to show that the implication is
true for . So suppose P

() is true. We need to show that is valid for ,


which is always the case when = (x y). Next we assume that = . Then
is once again always valid for and the implication is true. Next we assume
that =
1

2
where the implication is true for
1
,
2
P(V ). We need to
show that the implication is also true for . So we assume that P

() is true.
We need to show that is valid for =
1

2
. Using proposition (87) it
is sucient to prove that is valid for both
1
and
2
. Having assumed the
implication is true for
1
and
2
, it is therefore sucient to prove that P

(
1
)
and P

(
2
) are true. First we show that P

(
1
) is true. So let P(V )
and z V such that z _
1
. Suppose u Fr (z). We need to show
that (u) ,= (z). Having assumed P

() is true, it is sucient to prove that


z _ , which follows immediately from z _
1
and
1
_ . So we have
proved that P

(
1
) is indeed true and a similar argument shows that P

(
2
)
is true. This concludes the case when =
1

2
. Next we assume that
= x
1
where x V and the implication is true for
1
P(V ). We need
to show that the implication is also true for . So we assume that P

() is
true. We need to show that is valid for = x
1
. Using proposition (88) it
is sucient to prove that is valid for
1
and furthermore that (u) ,= (x)
whenever u Fr (x
1
). First we show that is valid for
1
. Having assumed
the implication is true for
1
, it is sucient to prove that P

(
1
) is true. So let
P(V ) and z V such that z _
1
. Suppose u Fr (z). We need to
96
show that (u) ,= (z). Having assumed P

() is true, it is sucient to prove


that z _ , which follows immediately from z _
1
and
1
_ . So we
have proved that P

(
1
) is indeed true and is therefore valid for
1
. Next
we show that (u) ,= (x) whenever u Fr (x
1
). So let u Fr (x
1
). We
need to show that (u) ,= (x). Having assumed P

() is true, it is sucient
to prove that x
1
_ which is immediate since x
1
= .
The following proposition is a simple application of proposition (86):
Proposition 90 Let V be a set and x, y V . Let P(V ). Then we have:
y , Var() ([y/x] valid for )
Proof
We assume that y , Var(). We need to show that the substitution [y/x] is
valid for . Using proposition (86), it is sucient to prove that [y/x]
|Var()
is an
injective map. This follows immediately from y , Var() and proposition (68).

If U and V are sets and : U V is a valid substitution for P(U), then


our rule of thumb says it does make logical sense to speak of (). So if W is
another set and : V W is a substitution which is valid for (), it should
make logical sense to speak of (()) = ( )(). So we should expect the
substitution to be valid for . Luckily, this happens to be true. In fact,
the converse is also true: if is valid for then not only is is valid for
(), but surprisingly itself is necessarily valid for .
Proposition 91 Let U, V , W be sets and : U V and : V W be maps.
Then for all P(U) we have the equivalence:
( valid for ) ( valid for ()) ( valid for )
where : P(U) P(V ) also denotes the associated substitution mapping.
Proof
First we prove : so we assume that is valid for P(U) and is valid
for (). We need to show that is valid for . So let _ . Using
proposition (85) we need to show that Fr ( ()) = ( )(Fr ()):
Fr ( ()) = Fr ((()))
(() _ ()) ( valid for ()) = (Fr (()))
( _ ) ( valid for ) = ((Fr ()))
= ( )(Fr ())
Note that () _ () follows from _ and proposition (57). We now
prove : So suppose is valid for . We need to show that is valid for
and is valid for (). First we show that is valid for (), having assumed
97
is indeed valid for . So let _ (). Using proposition (85) we need to show
that Fr (()) = (Fr ()). However from proposition (57), there exists _
such that = (). Hence we have the following equalities:
Fr (()) = Fr ((())
= Fr ( ())
( _ ) ( valid for ) = (Fr ())
= ((Fr ()))
( _ ) ( valid for ) = (Fr (()))
= (Fr ())
It remains to prove that is valid for . So consider the property:
( valid for ) ( valid for )
We need to show this property holds for all P(U). We shall do so by a
structural induction argument, using theorem (3) of page 32. First we assume
that = (x y) for some x, y U. Then is always valid for and the prop-
erty is true. Likewise, is always valid for = and the property is true for .
So we assume that =
1

2
where
1
,
2
P(U) satisfy the property. We
need to show that same is true of . So we assume that is valid for .
We need to show that is valid for =
1

2
. Using proposition (87) it is
sucient to prove that is valid for both
1
and
2
. First we show that is
valid for
1
. Having assumed the property is true for
1
it is sucient to show
that is valid for
1
, which follows from the validity of for =
1

2
and proposition (87). We prove similarly that is valid for
2
which completes
the case when =
1

2
. So we now assume that = x
1
where x U
and
1
P(U) satises our property. We need to show that same is true of .
So we assume that is valid for . We need to show that is valid for
= x
1
. Using proposition (88) it is sucient to show that is valid for
1
and furthermore that u Fr (x
1
) (u) ,= (x). First we show that is
valid for
1
. Having assumed the property is true for
1
it is sucient to prove
that is valid for
1
which follows from the validity of for = x
1
and proposition (88). So we assume that u Fr (x
1
) and it remains to show
that (u) ,= (x). So suppose to the contrary that (u) = (x). Then in par-
ticular (u) = (x) which contradicts the validity of for = x
1
.
Proposition 92 Let V, W be sets and , : V W be maps. Let P(V )
such that the equality () = () holds. Then we have the equivalence:
( valid for ) ( valid for )
Proof
So we assume that () = (). It is sucient to show the implication .
So we assume that is valid for . We need to show that is also valid for
98
. So let _ be a sub-formula of and x Fr (). We need to show that
(x) Fr (()). However from proposition (64) and the equality () = ()
we see that and coincide on Var(). Furthermore from proposition (62) we
have Var() Var(). It follows that and coincide on Var() and in partic-
ular (x) = (x). Using proposition (64) we also have () = (). Hence we
need to show that (x) Fr (()) which follows from the validity of for .
The following proposition will appear later as a useful criterion:
Proposition 93 Let V , W be sets and : V W be a map. Let P(V ).
We assume that there exists a subset V
0
V with the following properties:
(i) Bnd() V
0
(ii)
|V0
is injective
(iii) (V
0
) (Var() V
0
) =
Then the map : V W is valid for the formula P(V ).
Proof
Let x V and
1
P(V ) such that x
1
_ . Using proposition (89), given
u Fr (x
1
) we need to show that (u) ,= (x). However, from proposi-
tion (81) we have Bnd (x
1
) Bnd () and consequently x Bnd(). From the
assumption (i) it follows that x V
0
. We shall now distinguish two cases: rst
we assume that u V
0
. Then from assumption (ii), in order to show(u) ,= (x)
it is sucient to prove that u ,= x which follows from u Fr (x
1
). We now as-
sume that u , V
0
. However, from proposition (62) we have Var(x
1
) Var()
and consequently u Var(). It follows that u Var() V
0
. Hence, we see
that (u) ,= (x) is a consequence of assumption (iii) and x V
0
.
2.1.10 Dual Substitution of Variable
Until now we have studied one type of mapping : P(V ) P(W) which are
the variable substitutions arising from a single mapping : V W, as per
denition (53) of page 71. In this section, we wish to study another type of
substitution : P(V ) P(W) arising from two mappings
0
,
1
: V W. The
idea behind this new type is to allow us to move variables to dierent values,
according to whether a given occurrence of the variable is free or bound. For
example, suppose we had = x(x y) (x y). We want to construct a
dual substitution mapping : P(V ) P(W) such that:
() = u

(u

v) (u v)
More generally given a mapping
0
: V W such that
0
(x) = u and
0
(y) = v,
and another mapping
1
: V W such that
1
(x) = u

, we want to dene
: P(V ) P(W) which will assign free occurrences of a variable according to

0
and bound occurrences according to
1
.
99
One of the main motivations for introducing dual variable substitutions is
to dene a local inverse for the substitution : P(V ) P(W) associated with
a single map : V W. Suppose we had = x(x y) (z y) with x, y, z
distinct and = [x/z]. Then () = x(x y) (x y) and there is no way
to nd a mapping : W V such that () = . We need to go further.
So we are given
0
,
1
: V W and looking to dene : P(V ) P(W)
which assigns free and bound occurrences of variables as specied by
0
and
1
respectively. The issue we have is that on the one hand we need to have some
form of recursive denition, but on the other hand a sub-formula x y has no
knowledge of whether x and y will remain free occurrences of a given formula.
One possible solution is to always assume that x and y will remain free and
dene (x y) =
0
(x)
0
(y) accordingly, while substituting the variable

0
(x) by
1
(x) whenever a quantication x arises by setting:
(x
1
) =
1
(x)(
1
)[
1
(x)/
0
(x)] (2.12)
Unfortunately, this doesnt work. Suppose we have = x(x y) with x ,= y.
Take
0
(x) =
0
(y) = u and
1
(x) = u

. The formula we wish to obtain is


() = u

(u

u). However, if we apply equation (2.12) we have:


() = u

(u u)[u

/u] = u

(u

)
Another idea is to dene in two steps: rst we dene a map:

: P(V ) [T(V ) P(W)]


So given P(V ), we have a map

() : T(V ) P(W) which assigns to


every subset U V a formula

()(U) P(W) which represents the formula


obtained from by assuming all free variables are in U. Specically, if we dene

U
(x) =
0
(x) if x U and
U
(x) =
1
(x) if x , U, we put:

(x y)(U) =
U
(x)
U
(y)
In the case when = x
1
, we cannot easily dene

()(U) in terms of

(
1
)(U) because we want the variable x to be tagged as bound in

(
1
).
So we need to consider the formula

(
1
)(U x) instead, having excluded
the variable x from the set of possible free variables. Specically we dene:

(x
1
)(U) =
1
(x)

(
1
)(U x) (2.13)
Once we have dened

() : T(V ) P(W) we can just set () =

()(V )
since all free variables of are possibly in V . So let us go back to our example
of = x(x y) with x ,= y,
0
(x) =
0
(y) = u and
1
(x) = u

. We obtain:
() =

()(V ) = u

(x y)(V x) = u

(u

u)
This is exactly what we want. So lets hope for the best and dene:
100
Denition 94 Let V , W be sets and
0
,
1
: V W be maps. We call dual
variable substitution associated with (
0
,
1
) the map : P(V ) P(W) dened
by () =

()(V ), where the map

: P(V ) [T(V ) P(W)] is dened by


the following structural recursion, given P(V ) and U T(V ):

()(U) =

U
(x)
U
(y) if = (x y)
if =

(
1
)(U)

(
2
)(U) if =
1

2

1
(x)

(
1
)(U x) if = x
1
(2.14)
where
U
(x) =
0
(x) if x U and
U
(x) =
1
(x) if x , U.
Proposition 95 The structural recursion of denition (94) is legitimate.
Proof
We need to prove the existence and uniqueness of

: P(V ) [T(V ) P(W)]


satisfying equation (2.14). We shall do so using theorem (4) of page 45. So
we take X = P(V ), X
0
= P
0
(V ) and A = [T(V ) P(W)]. We consider
the map g
0
: X
0
A dened by g
0
(x y)(U) =
U
(x)
U
(y). We dene
h() : A
0
A by setting h()(0)(U) = and h() : A
2
A by setting
h()(v)(U) = v(0)(U) v(1)(U). Finally, we dene h(x) : A
1
A with:
h(x)(v)(U) =
1
(x)v(0)(U x)
From theorem (4) there exists a unique map

: X A such that

() = g
0
()
whenever = (x y) and

(f()) = h(f)(

()) for all f = , , x and


X
(f)
. So let us check that this works: from

() = g
0
() for = (x y)
we obtain

()(U) = g
0
(x y)(U) =
U
(x)
U
(y) which is the rst line of
equation (2.14). Take f = and = . We obtain the equalities:

()(U) =

()(U)
proper notation =

((0))(U)
=

(f(0))(U)

: X
0
A
0
= h(f)(

(0))(U)
= h(f)(0)(U)
= h()(0)(U)
=
which is the second line of equation (2.14). Recall that the

which appears in
the fourth equality refers to the unique mapping

: X
0
A
0
. We now take
f = and =
1

2
for some
1
,
2
P(V ). Then we have:

()(U) =

(
1

2
)(U)

(0) =
1
,

(1) =
2
=

(f(

))(U)

: X
2
A
2
= h(f)(

))(U)
= h()(

))(U)
101
=

)(0)(U)

)(1)(U)
=

(0))(U)

(1))(U)
=

(
1
)(U)

(
2
)(U)
This is the third line of equation (2.14). Finally consider f = x and = x
1
:

()(U) =

(x
1
)(U)

(0) =
1
=

(f(

))(U)

: X
1
A
1
= h(f)(

))(U)
= h(x)(

))(U)
=
1
(x)

)(0)(U x)
=
1
(x)

(0))(U x)
=
1
(x)

(
1
)(U x)
and this is the last line of equation (2.14).
2.1.11 Local Inversion of Variable Substitution
Any map : V W gives rise to a substitution mapping : P(V ) P(W).
In general, this substitution mapping has little chance to be injective unless
: V W is itself injective. However even when is not injective, there may
be cases where we can still argue the following implication is true:
() = () = (2.15)
For example, if Var() = Var() it is sucient that
|Var()
be an injective
map for (2.15) to be true. But this itself is not necessary. Consider the case
when = y(x y) (x z) with x, y, z distinct and = [y/z]. Then we
have () = y(x y) (x y). Suppose we knew for some reason that
Bnd() = Bnd () = y. Then this would exclude = z(x z) (x z)
and the implication (2.15) would hold, if not everywhere at least on the set:
() = P(V ) : Bnd() = Bnd()
However, if = [y/x] then () = y(y y) (y z) and the implica-
tion (2.15) fails to be true even if (). As it turns out, the substitution
is not valid for , and () is of little use if we mix up free and bound variables.
In this section, we wish to establish sucient conditions for the implica-
tion (2.15) to hold, if not everywhere at least locally on a domain (). One
way to achieve this is to prove that the substitution : P(V ) P(W) has
a left-inverse locally around , i.e. that there exists a subset P(V ) and
a map : P(W) P(V ) such that () = for all . So if we
102
happen to know that both and are elements of , we are guaranteed the
implication (2.15) is true. So we are given a map : V W and we want to
design a map : P(W) P(V ) which will reverse the eect of the substitution
: P(V ) P(W), at least locally on some interesting domain . We cannot
hope to achieve this for every , but we would like our result to be general
enough. Suppose we knew our substitution behaved on free variables in a
reversible way. Specically, we would have V
0
V such that
|V0
is an injective
map and Fr () V
0
. Suppose likewise that the impact on bound variables was
also reversible. So we assume there is V
1
V such that
|V1
is an injective map
and Bnd () V
1
. For example, if we go back to = y(x y) (x z)
with x, y, z distinct and = [y/z], then is of course not injective but
|V0
and
|V1
are both injective with V
0
= V y and V
1
= V x, z. As it turns
out, we also have Fr () V
0
and Bnd() V
1
. So behaves in a reversible
way both on free and bound variables. Consider now
0
,
1
: W V which are
left-inverse of on V
0
and V
1
respectively, i.e. such that:
(i) u V
0

0
(u) = u
(ii) u V
1

1
(u) = u
Then we should hope to reverse the impact of : P(V ) P(W) by acting
separately on the free and bound variables of () according to
0
and
1
re-
spectively. Specically, if we consider : P(W) P(V ) the dual variable
substitution associated with the ordered pair (
0
,
1
) as per denition (94) of
page 101, we should hope to obtain () = at least for all such that
Fr () V
0
and Bnd () V
1
. So let us see if this works in our case:
() = (y(x y) (x z))
= [y/z] = (y(x y) (x y))
W = V and denition (94) =

( y(x y) (x y) )(V )
=

( y(x y) )(V )

(x y)(V )
=
1
(y)

(x y)(V y) (
0
(x)
0
(y))
=
1
(y)(
0
(x)
1
(y)) (
0
(x)
0
(y))
A: to be proved = y(x y) (x z)
=
So it remains to show that
1
(y) = y,
0
(x) = x and
0
(y) = z. Since we have
V
0
= V y we obtain x, z V
0
and consequently from (i) above it follows
that
0
(x) =
0
[y/z](x) = x and
0
(y) =
0
[y/z](z) = z. Similarly, since
V
1
= V x, z we have y V
1
and consequently from (ii) above it follows that

1
(y) =
1
[y/z](y) = y as requested. Hence we see that our inversion scheme
is working, at least in this case. As it turns out, the substitution = [y/z] is
valid for the formula = y(x y) (x z). But what if wasnt valid for
? Well obviously, when the substitution is not valid for the free and bound
variables of () have been muddled up. We cannot expect our scheme to work
any longer. So let us consider = [y/x] instead which is clearly not valid for
103
= y(x y) (x z). Then
|V0
and
|V1
are still injective and we still have
Fr () V
0
and Bnd () V
1
. Of course the maps
0
,
1
: V V would need
to be dierent but even assuming (i) and (ii) above, our scheme would fail:
() = (y(x y) (x z))
= [y/x] = (y(y y) (y z))
W = V and denition (94) =

( y(y y) (y z) )(V )
=

( y(y y) )(V )

(y z)(V )
=
1
(y)

(y y)(V y) (
0
(y)
0
(z))
=
1
(y)(
1
(y)
1
(y)) (
0
(y)
0
(z))
A: to be proved = y(y y) (x z)
,=
So it remains to show that
1
(y) = y,
0
(y) = x and
0
(z) = z. Since we have
V
0
= V y we obtain x, z V
0
and consequently from (i) above it follows
that
0
(y) =
0
[y/x](x) = x and
0
(z) =
0
[y/x](z) = z. Similarly, since
V
1
= V x, z we have y V
1
and consequently from (ii) above it follows that

1
(y) =
1
[y/x](y) = y as requested. Ok so we are now ready to proceed. We
are given a map : V W and V
0
, V
1
V such that
|V0
and
|V1
are injective
maps. We then consider P(V ) dened by:
= P(V ) : (Fr () V
0
) (Bnd () V
1
) ( valid for )
We are hoping the inversion scheme will work on , namely that we can build a
map : P(W) P(V ) such that () = for all . This result is the
object of theorem (10) of page 107 below. The proof of this theorem relies on
a technical lemma for which we shall now provide a few words of explanations:
proving the formula () = is in fact showing

(())(W) = as follows
from denition (94). A structural induction with = x
1
leads to:

(())(W) =

((x
1
))(W)
=

( (x)(
1
) )(W)
=
1
(x)

((
1
))(W (x))
x Bnd () V
1
= x

((
1
))(W (x))
So we are immediately confronted with two separate issues. One the one hand
we cannot hope to carry out a successful induction argument relating to W
only, since W (x) appears on the right-hand-side of this equation. So we
need to prove something which relates to some

(())(W (U)) rather than

(())(W) where U is a possibly non-empty subset of V . On the other hand,


the condition is not very useful when = x
1
. From the equality
Fr () = Fr (
1
) x, it is clear that the condition Fr () V
0
does not imply
Fr (
1
) V
0
. So it is possible to have and yet
1
, . It is important
for us to design an induction argument in such a way that if an assumption is
made about = x
1
, this assumption is also satised by
1
so we can use our
induction hypothesis. The following lemma addresses these two issues:
104
Lemma 96 Let V , W be sets and : V W be a map. Let V
0
, V
1
be subsets
of V and
0
,
1
: W V be maps such that for all x V :
(i) x V
0

0
(x) = x
(ii) x V
1

1
(x) = x
Let

: P(W) [T(W) P(V )] be the map associated with (


0
,
1
) as per
denition (94). Then for all U T(V ) and P(V ) we have:

(())(W (U)) = (2.16)


provided U and satisfy the following properties:
(iii) ( Fr () U V
0
) ( Bnd () U V
1
)
(iv) ( valid for ) ( (U) (Fr () U) = )
Proof
We assume : V W is given together with the subsets V
0
, V
1
and the maps

0
,
1
: W V satisfying (i) and (ii). Let

: P(W) [T(W) P(V )] be the


map associated with the ordered pair (
0
,
1
) as per denition (94) of page 101.
Given U V and P(V ) consider the property q(U, ) dened by (iii) and
(iv). Then we need to show that for all P(V ) we have the property:
U V , [ q(U, )

(())(W (U)) = ]
We shall do so by a structural induction argument, using theorem (3) of page 32.
First we assume that = (x y) for some x, y V . Let U V and suppose
q(U, ) is true. We need to show that equation (2.16) holds. We have:

(())(W (U)) =

((x y))(W (U))


dene U

= W (U) =

((x y))(U

)
=

( (x) (y) )(U

)
denition (94) =
U
(x)
U
(y)
A: to be proved = x y
=
So it remains to show that
U
(x) = x and
U
(y) = y. First we show
that
U
(x) = x. We shall distinguish two cases: rst we assume that x U.
Then (x) (U) and it follows that (x) , U

. Thus, using denition (94) we


have
U
(x) =
1
(x). From the property (ii) above we have
1
(x) = x
whenever x V
1
. So we only need to show that x V
1
. However, having
assumed that q(U, ) is true, in particular we have Bnd() U V
1
. So x V
1
follows immediately from our assumption x U. Next we assume that x , U.
Since = (x y) we have x Fr (). So it follows that x Fr () U and
consequently (x) (Fr () U). Having assumed that q(U, ) is true, in
105
particular we have (U)(Fr ()U) = . Thus, from (x) (Fr ()U) we
obtain (x) , (U) and we see that (x) U

. Thus, using denition (94) we


have
U
(x) =
0
(x). From the property (i) above we have
0
(x) = x
whenever x V
0
. So it remains to show that x V
0
. Having assumed that
q(U, ) is true, in particular we have Fr () U V
0
. So x V
0
follows
immediately from x Fr () U which we have already proved. This completes
our proof of
U
(x) = x. The proof of
U
(y) = y is identical so we are
now done with the case = (x y). Next we assume that = . Let U V
and suppose q(U, ) is true. We need to show that equation (2.16) holds:

(())(W (U)) =

()(W (U)) =
Next we assume that =
1

2
where
1
,
2
P(V ) satisfy our desired
property. We need to show that the same is true of . So let U V and
suppose q(U, ) is true. We need to show that equation (2.16) holds:

(())(W (U)) =

((
1

2
))(W (U))
dene U

= W (U) =

((
1

2
))(U

)
=

( (
1
) (
2
) )(U

)
denition (94) =

((
1
))(U

((
2
))(U

)
A: to be proved =
1

2
=
So it remains to show that equation (2.16) holds for
1
and
2
. However,
from our induction hypothesis, our property is true for
1
and
2
. Hence, it
is sucient to prove that q(U,
1
) and q(U,
2
) are true. First we show that
q(U,
1
) is true. We need to prove that (iii) and (iv) above are true for
1
:
Fr (
1
) U Fr () U V
0
where the second inclusion follows from our assumption of q(U, ). Furthermore:
Bnd (
1
) U Bnd() U V
1
So (iii) is now established for
1
. Also from q(U, ) we obtain:
(U) (Fr (
1
) U) (U) (Fr () U) =
So it remains to show that is valid for
1
, which follows immediately from
proposition (87) and the validity of for =
1

2
. This completes our proof
of q(U,
1
). The proof of q(U,
2
) being identical, we are now done with the case
=
1

2
. So we now assume that = x
1
where x V and
1
P(V )
satisfy our induction hypothesis. We need to show the same is true for . So
let U V and suppose q(U, ) is true. We need to show that equation (2.16)
holds for U, which goes as follows:

(())(W (U)) =

((x
1
))(W (U))
106
dene U

= W (U) =

((x
1
))(U

)
=

( (x)(
1
) )(U

)
denition (94) =
1
(x)

((
1
))( U

(x) )
(ii) and x Bnd() V
1
= x

((
1
))( U

(x) )
dene U

1
= U

(x) = x

((
1
))(U

1
)
A: to be proved = x
1
=
So it remains to show that

((
1
))(U

1
) =
1
. From U

= W (U) and
U

1
= U

(x) we obtain U

1
= W(U
1
) where U
1
= U x. So it remains
to show that

((
1
))(W (U
1
)) =
1
, or equivalently that equation (2.16)
holds for U
1
and
1
. Having assumed
1
satises our induction hypothesis, it
is therefore sucient to prove that q(U
1
,
1
) is true. So we need to show that
(iii) and (iv) above are true for U
1
and
1
:
Fr (
1
) U
1
= Fr (
1
) x U
= Fr () U
q(U, ) V
0
Furthermore:
Bnd (
1
) U
1
= Bnd (
1
) x U
= Bnd () U
q(U, ) V
1
and:
(U
1
) (Fr (
1
) U
1
) = (U
1
) ( Fr (
1
) x U )
= (U
1
) (Fr () U)
= ((U) (x)) (Fr () U)
q(U, ) = (x) (Fr () U)
(x) (Fr ())
A: to be proved =
So we need to show that (u) ,= (x) whenever u Fr (), which follows
immediately from proposition (88) and the validity of for = x
1
, itself a
consequence of our assumption q(U, ). So we are almost done proving q(U
1
,
1
)
and it remains to show that is valid for
1
, which is another immediate
consequence of proposition (88). This completes our induction argument.
Theorem 10 Let V , W be sets and : V W be a map. Let V
0
, V
1
be subsets
of V such that
|V0
and
|V1
are injective maps. Let be the subset of P(V ):
= P(V ) : (Fr () V
0
) (Bnd () V
1
) ( valid for )
107
Then, there exits : P(W) P(V ) such that:
, () =
where : P(V ) P(W) also denotes the associated substitution mapping.
Proof
Let : V W be a map and V
0
, V
1
V be such that
|V0
and
|V1
are
injective maps. We shall distinguish two cases: rst we assume that V = .
Then for all P(V ) the inclusions Fr () V
0
and Bnd () V
1
are always
true. Furthermore, from denition (83) the substitution is always valid for .
Hence we have = P(V ) and we need to nd : P(W) P(V ) such that
() = for all P(V ). We dene with the following recursion:
() =

if = (u v)
if =
(
1
) (
2
) if =
1

2
if = u
1
(2.17)
We shall now prove that () = using a structural induction argument.
Since V = there is nothing to check in the case when = (x y). So we
assume that = . Then () = and () = as requested. Next we
assume that =
1

2
with (
1
) =
1
and (
2
) =
2
. Then:
() = (
1

2
)
= ( (
1
) (
2
) )
= (
1
) (
2
)
=
1

2
=
Since V = there is nothing to check in the case when = x
1
. So this
completes our proof when V = , and we now assume that V ,= . Let x
0
V
and consider
0
: (V
0
) V and
1
: (V
1
) V to be the inverse mappings
of
|V0
and
|V1
respectively. Extend
0
and
1
to the whole of W by setting

0
(u) = x
0
if u , (V
0
) and
1
(u) = x
0
if u , (V
1
). Then
0
,
1
: W V are
left-inverse of on V
0
and V
1
respectively, i.e. we have:
(i) x V
0

0
(x) = x
(ii) x V
1

1
(x) = x
Let : P(W) P(V ) be the dual variable substitution associated with the
ordered pair (
0
,
1
) as per denition (94). We shall complete the proof of the
theorem by showing that () = for all where:
= P(V ) : (Fr () V
0
) (Bnd () V
1
) ( valid for )
So let . Then in particular satises property (iii) and (iv) of lemma (96)
in the particular case when U = . So applying lemma (96) for U = , we see
that

(())(W ()) = where

: P(W) [T(W) P(V )] is the map


associated with . Hence we conclude that () =

(())(W) = .
108
2.2 The Substitution Congruence
2.2.1 Preliminaries
Before we start, we would like to say a few words to the more expert read-
ers. The notion of substitution congruence seems to be commonly known as
-equivalence in the computer science literature, especially within the context
of -calculus. We decided to use the phrase substitution congruence partly out
of ignorance and partly because rst order logic does not have the notion of
-equivalence and -equivalence as far as we can tell. Instead, we shall consider
other congruences on P(V ) which are the permutation, the absorption and the
propositional congruence which have no obvious counterpart in -calculus. So we
were tempted to change our terminology out of due respect for established tra-
ditions, but decided otherwise in the end. The notion of -equivalence and the
more general issue of variable binding in formal languages, has already received
some academic attention. For example, the interested reader may wish to refer
to Murdoch J. Gabbay [22] and Murdoch J. Gabbay and Aad Mathijssen [23].
These references also focus on capture-avoiding substitutions which we feel is
an important topic, intimately linked to -equivalence. Our own treatment of
capture-avoiding substitutions will give rise to essential substitutions.
2.2.2 The Strong Substitution Congruence
We all know that x(x z) and y(y z) should be identical mathematical
statements when z , x, y. In this section, we shall attempt to formally dene
a congruence on P(V ) such that x(x z) y(y z) is satised for all
z , x, y. More generally, given a set V and a formula
1
P(V ), we would
like our congruence to be such that x
1
y
1
[y/x] for all x, y V . In
other words, if we were to replace the variable x by the variable y in the formula
= x
1
, the formula = y
1
[y/x] resulting from the substitution should
have the same meaning as that of . One of the diculties in pinning down
a formal denition for the congruence is to know exactly what we want. If

1
= (x y) with x ,= y, then = x(x y) while = y(y y) and
it is clear that and do not represent the same mathematical statement.
It would therefore be inappropriate to require that x
1
y
1
[y/x] for all
x, y V , without some form of restriction on the variables x and y. Looking
back at the example of
1
= (x y), the problem arises from the fact that the
substitution [y/x] is not a valid substitution for x
1
. It is therefore tempting to
require that x
1
y
1
[y/x] for all x, y V such that [y/x] is valid for x
1
.
From proposition (90), we can easily check that [y/x] is automatically valid for
x
1
whenever the condition y , Var(
1
) is satised. So we could also require
that x
1
y
1
[y/x] solely when the condition y , Var(
1
) is met. This is
obviously simpler as it does not require the concept of valid substitution, and it
is also in line with the existing literature (e.g. see Donald W. Barnes, John M.
Mack [4] page 28). As it turns out, whether we decide to go for the condition
y , Var(
1
) or [y/x] is valid for x
1
will lead to the same congruence, as can
109
be seen from proposition (134). So the question is settled: we shall require that
x
1
y
1
[y/x] whenever y , Var(
1
). Since a congruence is always reexive,
there is no harm in imposing that x ,= y. We shall dene our congruence in
terms of a generator, in the sense of denition (39). Such a congruence exists
and is unique by virtue of theorem (6) of page 54. However despite our best
eorts, it will appear later that the congruence is not the right one. It will in
fact be too strong when the set V is a nite set. For this reason, the congruence
presented in this section will be called the strong substitution congruence. In
later parts of this document, we shall alter our denition slightly so as to reach
what we believe is a denitive notion of substitution congruence working equally
well for V nite and V innite. Despite its minor aws, the strong substitution
congruence deserves to be studied in its own right, as it seems to be the one
studied in every known textbook (which usually assumes an innite set V ).
Denition 97 Let V be a set. We call strong substitution congruence on P(V )
the congruence on P(V ) generated by the following set R
0
P(V ) P(V ):
R
0
= ( x
1
, y
1
[y/x] ) :
1
P(V ) , x, y V , x ,= y , y , Var(
1
)
where [y/x] denotes the substitution of y in place of x as per denition (65).
2.2.3 Free Variable and Strong Substitution Congruence
The strong substitution congruence is designed in such a way that if you take a
formula and replace its bound variables by other variables, you obtain a for-
mula which is equivalent to , provided the variable substitution is valid. The
strong substitution congruence is also designed to be the smallest congruence
on P(V ) with such property. So one would expect two equivalent formulas to
be identical in all respects, except possibly in relation to their bound variables.
In particular, if a formula is equivalent to a formula , we should expect
both formulas to have the same free variables. This is indeed the case, as the
following proposition shows. Since we already know from proposition (77) that
Fr () = Fr () denes a congruence on P(V ), the proof reduces to making
sure that every ordered pair (, ) belonging to the generator R
0
of the strong
substitution congruence, is such that Fr () = Fr ().
Proposition 98 Let be the strong substitution congruence on P(V ) where V
is a set. Then for all , P(V ) we have the implication:
Fr () = Fr ()
Proof
Let be the relation on P(V ) dened by Fr () = Fr (). We
need to show that or equivalently that the inclusion
holds. Since is the strong substitution congruence on P(V ), it is the smallest
congruence on P(V ) which contains the set R
0
of denition (97). In order to
show the inclusion it is therefore sucient to show that is a congruence
on P(V ) such that R
0
. However, we already know from proposition (77)
110
that is a congruence on P(V ). So it remains to show that R
0
. So let

1
P(V ) and x, y V be such that x ,= y and y , Var(
1
). Dene the
two formulas = x
1
and = y
1
[y/x]. We need to show that
or equivalently that Fr () = Fr (). We shall distinguish two cases. First we
assume that x , Fr (
1
). From the assumption y , Var(
1
) and proposition (76)
we obtain Fr (
1
[y/x]) = Fr (
1
). Furthermore, since Fr (
1
) Var(
1
), we also
have y , Fr (
1
). It follows that:
Fr () = Fr (y
1
[y/x])
= Fr (
1
[y/x]) y
= Fr (
1
) y
= Fr (
1
)
= Fr (
1
) x
= Fr (x
1
)
= Fr ()
We now consider the case when x Fr (
1
). From the assumption y , Var(
1
)
and proposition (76) we obtain Fr (
1
[y/x]) = Fr (
1
) x y, and so:
Fr () = Fr (y
1
[y/x])
= Fr (
1
[y/x]) y
= ( Fr (
1
) x y ) y
= ( Fr (
1
) x ) y
= Fr (
1
) x
= Fr (x
1
)
= Fr ()

2.2.4 Substitution and Strong Substitution Congruence


Given a substitution mapping : P(V ) P(W) arising from an injective map
: V W, and given an arbitrary congruence on P(W), we showed in
proposition (58) that the relation on P(V ) dened by if and only if
() () is a congruence on P(V ). We now apply this fact to the strong
substitution congruence on P(W) to show that () () whenever we have
, where also denotes the strong substitution congruence on P(V ). In
other words, substitution mappings arising from injective substitution preserve
the strong substitution congruence. Since we know that is a congruence
on P(V ), the proof simply consists in showing that R
0
, where R
0
is the
generator of the strong substitution congruence on P(V ) as per denition (97).
111
Proposition 99 Let V and W be sets and : V W be an injective map.
Let be the strong substitution congruence both on P(V ) and P(W). Then:
() ()
for all , P(V ), where : P(V ) P(W) is also the substitution mapping.
Proof
Let be the relation on P(V ) dened by () (). We
need to show that or equivalently that the inclusion
holds. Since is the strong substitution congruence on P(V ), it is the smallest
congruence on P(V ) which contains the set R
0
of denition (97). In order to
show the inclusion it is therefore sucient to show that is a congruence
on P(V ) such that R
0
. However, we already know from proposition (58)
that is a congruence on P(V ). So it remains to show that R
0
. So
let
1
P(V ) and x, y V be such that x ,= y and y , Var(
1
). Dene
= x
1
and = y
1
[y/x]. We need to show that or equivalently that
() (). In order to do so, it is sucient to show that the ordered pair
((), ()) belongs to the generator R

0
of the strong substitution congruence
on P(W) as per denition (97). In other words, it is sucient to show the
existence of

1
P(W) and x

, y

W with x

,= y

and y

, Var(

1
), such
that () = x

1
and () = y

1
[y

/x

]. Take

1
= (
1
) P(W) together
with x

= (x) W and y

= (y) W. Then, we have:


() = (x
1
) = (x) (
1
) = x

1
Furthermore, if we accept for now that [y/x] = [(y)/(x)] , we have:
() = (y
1
[y/x])
= (y) (
1
[y/x])
= y

([y/x](
1
))
= y

[y/x] (
1
)
= y

[(y)/(x)] (
1
)
= y

[y

/x

](

1
)
= y

1
[y

/x

]
So it remains to show that [y/x] = [(y)/(x)] is indeed true, and
furthermore that x

,= y

and y

, Var(

1
). Since : V W is an injective
map, x

,= y

follows immediately from x ,= y. We now show that y

, Var(

1
).
So suppose to the contrary that y

Var(

1
). We shall arrive at a contradiction.
Since

1
= (
1
), from proposition (63) we have:
Var(

1
) = (u) : u Var(
1
)
It follows that there exists u Var(
1
) such that y

= (u). However, y

= (y)
and : V W is an injective map. Hence we see that u = y and consequently
112
y Var(
1
) which contradicts our initial assumption of y , Var(
1
). We shall
complete the proof of this proposition by showing that [y/x] = [(y)/(x)].
So let u V , and suppose rst that u ,= x. Then:
[y/x](u) = ([y/x](u))
= (u)
= [(y)/(x)]((u))
= [(y)/(x)] (u)
where the third equality crucially depends on (u) ,= (x) which itself follows
from the injectivity of and u ,= x. We now assume that u = x. Then:
[y/x](u) = ([y/x](u))
= (y)
= [(y)/(x)]((u))
= [(y)/(x)] (u)
In both cases u ,= x and u = x we see that [y/x](u) = [(y)/(x)] (u).
We now wish to apply proposition (99) to the substitution mapping [y/x]
which is not injective when x ,= y. The trick is to consider the permutation
mapping [y : x] and argue that [y/x] coincides with [y : x] while [y/x] co-
incides with [y : x], provided the condition y , Var() Var() is satised.
Proposition 100 Let be the strong substitution congruence on P(V ) where
V is a set. Let , P(V ) and x, y V such that y , Var() Var(). Then:
[y/x] [y/x]
Proof
We assume that . We need to show that [y/x] [y/x]. Unfortunately,
the map [y/x] : V V is not injective when x ,= y since [y/x](x) = y = [y/x](y).
So we cannot argue directly from proposition (99) that [y/x] [y/x]. In-
stead, we shall consider the permutation [y : x] : V V as per denition (66).
Since [y : x] is injective, from proposition (99) we have [y : x] [y : x]. We
shall complete the proof of the proposition by proving [y : x] = [y/x] and
[y : x] = [y/x]. But this follows immediately from proposition (69) and the
assumption y , Var() Var().
As can be seen from denition (97), the strong substitution congruence was
dened so as to ensure that x
1
is always equivalent to y
1
[y/x] when x ,= y,
provided we have y , Var(
1
). The fundamental idea is that if we replace a
variable x which is not free, by a variable y which is not already present in a
formula, then we do not change the meaning of the formula. We now check that
this property is indeed true. The proof is done by structural induction.
113
Proposition 101 Let be the strong substitution congruence on P(V ) where
V is a set. Let P(V ), x, y V such that y , Var() and x , Fr (). Then:
[y/x]
Proof
We assume x, y V given. For all P(V ), we need to show the implication
(y , Var()) (x , Fr ()) [y/x] . We shall do so by structural
induction, using theorem (3) of page 32. Since P
0
(V ) is a generator of P(V ),
we show rst that the property is true on P
0
(V ). So let = (u v) P
0
(V ),
where u, v V . We assume that y , Var() and x , Fr (), and we need to
show that [y/x] . Since Var() = Fr () = u, v, we have x, y , u, v. In
particular, x , u, v and consequently we have:
[y/x] = ( [y/x](u) [y/x](v) ) = (u v) =
In particular [y/x] . Next we check that the property is true for P(V ):
[y/x] = [y/x]() =
and in particular [y/x] . Next we check that the property is true for
=
1

2
, if it is true for
1
,
2
P(V ). So we assume that y , Var()
and x , Fr (), and we need to show that [y/x] . From the equality
Var() = Var(
1
) Var(
2
) we see that y , Var(
1
) and y , Var(
2
). Likewise,
from Fr () = Fr (
1
) Fr (
2
) we obtain x , Fr (
1
) and x , Fr (
2
). Having
assumed the property is true for
1
and
2
, it follows that
1
[y/x]
1
and

2
[y/x]
2
. The strong substitution congruence being a congruent relation:
[y/x] =
1
[y/x]
2
[y/x]
1

2
=
Finally we check that the property is true for = u
1
, if it is true for
1
. So we
assume that y , Var() and x , Fr (), and we need to show that [y/x] .
We shall distinguish two cases. First we assume that x ,= u. Then, we have:
[y/x] = [y/x](u
1
) = [y/x](u)
1
[y/x] = u
1
[y/x]
So we need to show that u
1
[y/x] u
1
. The strong substitution congruence
being a congruent relation, it is sucient to prove that
1
[y/x]
1
. Having
assumed the property is true for
1
, it is therefore sucient to show y , Var(
1
)
and x , Fr (
1
). Since y , Var() and Var() = u Var(
1
), it is clear that
y , Var(
1
). So we need to show that x , Fr (
1
). So suppose to the contrary
that x Fr (
1
). Since Fr () = Fr (
1
) u and x ,= u, we obtain x Fr ()
which contradicts our initial assumption of x , Fr (). This completes our proof
in the case when x ,= u. We now assume x = u. Then:
[y/x] = [y/x](u
1
) = [y/x](u)
1
[y/x] = y
1
[y/x]
So we need to show that y
1
[y/x] x
1
. We shall consider two cases. First
we assume that x = y. So we need to show that x
1
[x/x] x
1
, for which
114
it is sucient to prove that
1
[x/x] =
1
which follows from proposition (56)
and the fact that [x/x] : V V is the identity mapping. We now assume that
x ,= y. It is sucient to prove that the ordered pair (x
1
, y
1
[y/x]) belongs
to the generator R
0
of the strong substitution congruence on P(V ) as per def-
inition (97). Since x ,= y, it remains to show that y , Var(
1
) which follows
immediately from y , Var() and Var() = x Var(
1
).
2.2.5 Characterization of the Strong Congruence
In denition (97) the strong substitution congruence was dened in terms of a
generator. This makes it very convenient to show that two given formulas
and are equivalent. For instance, if = xy(x y) and = yx(y x), if
there is u V such that u , x, y, the proof that would go as follows:
xy(x y) uy(u y)
ux(u x)
yx(y x)
The rst and third equivalence stem directly from denition (97), as does the
equivalence y(u y) x(u x). Knowing that is a congruent relation
allows us to establish the second equivalence, while knowing that is a transitive
relation allows us to conclude that . More generally, the knowledge of
the generator R
0
of denition (97) gives us the equivalence immediately,
for many pairs (, ). Using the congruent property, reexivity, symmetry and
transitivity of the relation , we readily obtain this equivalence for many more
pairs. So showing that is usually the easy part.
What is more dicult is proving that an equivalence does not hold.
For instance, if = (x y) and = , we would like to think that is not
equivalent to . However, denition (97) does not give us an immediate tool
to prove that , . Fortunately, we showed in proposition (98) that if
then we must have Fr () = Fr (). Since Fr () = x, y while Fr () = we
are able in this case to conclude that is not equivalent to . But if = (x y)
while = (y x) with x ,= y, then the implication Fr () = Fr ()
does not help us to conclude that , . Somehow we need something sharper.
So we need to prove an implication , having chosen a
relation which tells us a bit more about the formulas and , than the
simple Fr () = Fr (). In fact, we shall choose the relation so as to have the
equivalence , as a way of ensuring that the statement
tells us as much as possible about the formulas and . So before we prove
anything, our rst task is to choose a sensible relation .
So let us assume that . We know from theorem (2) of page 21 that
any formula of rst order predicate logic is either of the form (x y), or is the
contradiction constant , or is an implication
1

2
or is a quantication
x
1
. Let us review all these four cases in relation to : so suppose rst that
= (x y) with . Clearly we would expect the formula to be equal to
115
the formula . Suppose now that = . Then we would expect the formula
to be equal to . Suppose now that is an implication =
1

2
. If
we would also expect the formula to be an implication =
1

2
.
Furthermore, we would expect to have
1

1
and
2

2
.
Suppose now that is a quantication = x
1
. Then we would also expect
the formula to be a quantication = y
1
. However, we would not expect
the variables x and y to be the same in general. If x = y then we would expect
to have
1

1
, but if x ,= y then the relationship between
1
and
1
should
be more complex. Informally, we would expect the formula
1
to be the same
mathematical statement as the formula
1
, but with the variable x replaced by
the variable y. Of course we cannot hope to have
1
=
1
[y/x] in general, as
there may have been other changes of variable within the formulas
1
and
1
.
However, we would expect to have the equivalence
1

1
[y/x].
Unfortunately, this does not quite work. We already know from deni-
tion (97) that considering the formula
1
[y/x] is not very safe unless we have
y , Var(
1
). We have made no mention of this fact so far, so it is likely that
the simple condition
1

1
[y/x] is doomed, without further qualication. In
fact, consider the case when = xy(x y) and = yx(y x) with x ,= y.
Then
1
= y(x y) and
1
= x(y x). So we obtain
1
[y/x] = y(y y)
and we certainly expect the equivalence
1

1
[y/x] to be false. So we need
to look for something more complicated. Suppose there exists u V such that
u , x, y. Dening the formula = u(x u) the condition y , Var() is now
satised and we can safely consider the formula [y/x] = u(y u). From de-
nition (97) we obtain immediately
1
and
1
[y/x]. So this may be the
relationship we are looking for between the formula
1
and the formula
1
: the
existence of a third formula such that
1
and
1
[y/x], with the addi-
tional condition y , Var(). Note that we cannot hope to make this relationship
simpler in general. We have already established that
1
= and
1
[y/x],
(or indeed
1
= and
1
= [y/x]) fails to work. The condition
1
and

1
= [y/x] does not work in general either: from proposition (71) we know that
x can never be a variable of [y/x] when x ,= y, and consequently we cannot
hope to have the equality
1
= [y/x] in the case when
1
= x(y x). In the
light of these comments, we can now venture a sensible guess for our relation .
Denition 102 Let be the strong substitution congruence on P(V ) where V
is a set. Let , P(V ). We say that is almost strongly equivalent to and
we write , if and only if one of the following is the case:
(i) P
0
(V ) , P
0
(V ) , and =
(ii) = and =
(iii) =
1

2
, =
1

2
,
1

1
and
2

2
(iv) = x
1
, = x
1
and
1

1
(v) = x
1
, = y
1
, x ,= y ,
1
,
1
[y/x] , y , Var()
It should be noted that (v) of denition (102) is a notational shortcut for
the more detailed statement = x
1
, = y
1
, x ,= y and there exists a
116
formula P(V ) such that
1
,
1
[y/x] , y , Var().
Proposition 103 (i), (ii), (iii), (iv), (v) of denition (102) are mutually exclu-
sive.
Proof
This is an immediate consequence of theorem (2) of page 21 applied to the free
universal algebra P(V ) with free generator P
0
(V ), where a formula P(V )
is either an element of P
0
(V ), or the contradiction constant = , or an im-
plication =
1

2
, or a quantication = x
1
, but cannot be equal to
any two of those things simultaneously. Since (v) can only occur with x ,= y,
it also follows from theorem (2) that (v) cannot occur at the same time as (iv).
So we have dened our relation and it remains to prove the equivalence:

As we will soon discover, the dicult part is showing the implication . In
order to do so, we shall need to show that is a congruence on P(V ) which
contains the generator R
0
of denition (97). So we start by proving R
0
.
Proposition 104 Let be the almost strong equivalence relation on P(V )
where V is a set. Then contains the generator R
0
of denition (97).
Proof
Suppose x, y V and
1
P(V ) are such that x ,= y and y , Var(
1
).
Note that this cannot happen unless V has at least two elements. We dene
= x
1
and = y
1
[y/x]. We need to show that . We shall do
so by proving that (v) of proposition (102) is the case. Dene =
1
and

1
=
1
[y/x] = [y/x]. Since the strong substitution congruence is reexive
on P(V ), we have
1
and
1
[y/x]. It follows that = x
1
, = y
1
,
x ,= y,
1
,
1
[y/x] and y , Var().
We now need to show that is a congruence on P(V ). In particular, it is
an equivalence relation, i.e. a relation on P(V ) which is reexive, symmetric
and transitive. We rst deal with the reexivity.
Proposition 105 Let be the almost strong equivalence relation on P(V )
where V is a set. Then is a reexive relation on P(V ).
Proof
Let P(V ). We need to show that . From theorem (2) of page 21
we know that is either an element of P
0
(V ), or = or =
1

2
or
= x
1
for some
1
,
2
P(V ) and x V . We shall consider these four
mutually exclusive cases separately. Suppose rst that P
0
(V ): then from
= we obtain . Suppose next that = : then it is clear that .
Suppose now that =
1

2
. Since the strong substitution congruence is
117
reexive, we have
1

1
and
2

2
. It follows from (iii) of denition (102)
that . Suppose nally that = x
1
. From
1

1
and (iv) of def-
inition (102) we conclude that . In all cases, we have proved that .
Proposition 106 Let be the almost strong equivalence relation on P(V )
where V is a set. Then is a symmetric relation on P(V ).
Proof
Let , P(V ) be such that . We need to show that . We shall
consider the ve possible cases of denition (102): suppose rst that P
0
(V ),
P
0
(V ) and = . Then it is clear that . Suppose next that =
and = . Then we also have . We now assume that =
1

2
and =
1

2
with
1

1
and
2

2
. Since the strong substitution
congruence on P(V ) is symmetric, we have
1

1
and
2

2
. Hence we
have . We now assume that = x
1
and = x
1
with
1

1
.
Then once again by symmetry of the strong substitution congruence we have

1

1
and consequently . We nally consider the last possible case
of = x
1
, = y
1
with x ,= y and we assume the existence of P(V )
such that
1
,
1
[y/x] and y , Var(). Dene

= [y/x]. In order
to show that it is sucient to prove that
1

[x/y] and x , Var(

).
First we show that x , Var(

). So we need to show that x , Var([y/x])


which in fact follows immediately from proposition (71) and x ,= y. We now
show that
1

[x/y] = [y/x][x/y]. Since


1
and the strong substitution
congruence on P(V ) is a transitive relation, it is sucient to prove that:
[y/x][x/y]
In fact, the strong substitution congruence being reexive, it is sucient to
prove that = [y/x][x/y]. From proposition (70), since y , Var() we obtain
[x/x] = [y/x][x/y]. Since [x/x] : V V is the identity mapping, we conclude
from proposition (56) that [x/x] = and nally = [y/x][x/y].
Having shown that is a reexive and symmetric relation on P(V ), we now
prove that it is also a transitive relation. This is by far the most dicult result
of this section as many technical details need to be checked.
Proposition 107 Let be the almost strong equivalence relation on P(V )
where V is a set. Then is a transitive relation on P(V ).
Proof
Let , and P(V ) be such that and . We need to show
that . We shall consider the ve possible cases of denition (102) in
relation to . Suppose rst that , P
0
(V ) and = . Then from
we obtain , P
0
(V ) and = . It follows that , P
0
(V )
and = . Hence we see that . We now assume that = = .
Then from we obtain = = . It follows that = = and
118
consequently . We now assume that =
1

2
and =
1

2
with

1

1
and
2

2
. From we obtain =
1

2
with
1

1
and
2

2
. The strong substitution congruence being transitive, it follows
that
1

1
and
2

2
. Hence we see that . We now assume that
= x
1
and = x
1
with
1

1
, for some x V . From only the
cases (iv) and (v) of denition (102) are possible. First we assume that (iv) is
the case. Then = x
1
with
1

1
. The strong substitution congruence
being transitive, we obtain
1

1
and consequently . We now assume
that (v) is the case. Then = y
1
for some y V with x ,= y, and there
exists P(V ) such that
1
,
1
[y/x] and y , Var(). The strong
substitution congruence being transitive, we obtain
1
and consequently
. It remains to consider the last possible case of denition (102). So we
assume that = x
1
and = y
1
with x ,= y, and we assume the existence
of P(V ) such that
1
,
1
[y/x] and y , Var(). From only
the cases (iv) and (v) of denition (102) are possible. First we assume that (iv)
is the case. Then = y
1
with
1

1
. The strong substitution congruence
being transitive, we obtain
1
[y/x] and consequently . We now
assume that (v) is the case. Then = z
1
for some z V with y ,= z, and
there exists

P(V ) such that


1

,
1

[z/y] and z , Var(

). We
shall now distinguish two cases. First we assume that x = z. Then = x
1
and = x
1
and in order to show that it is sucient to prove that

1

1
. Since
1
and
1

[z/y], using the symmetry and transitivity of


the strong substitution congruence, it remains to show that

[z/y]. Having
assumed that x = z, we have to prove that

[x/y]. Since y , Var(), from


proposition (70) we have [y/x][x/y] = [x/x] and from proposition (56) we
obtain [x/x] = . It follows that [y/x][x/y] = and it remains to prove that
[y/x][x/y]

[x/y]. Using proposition (100), it is sucient to prove that


[y/x]

and x , Var([y/x]) Var(

). The fact that [y/x]

follows
from
1
[y/x] and
1

, using the symmetry and transitivity of the


strong substitution congruence on P(V ). The fact that x , Var([y/x]) follows
from proposition (71) and the assumption x ,= y. The fact that x , Var(

)
follows from the assumptions x = z and z , Var(

). This completes our proof


in the case when x = z. We now assume that x ,= z. So we have x ,= y, y ,= z
and x ,= z, with = x
1
, = y
1
and = z
1
. Furthermore, there exist
and

P(V ) such that


1
,
1
[y/x],
1

and
1

[z/y].
Finally, we have y , Var() and z , Var(

), and we need to show that .


Dene = [y/z]. It is sucient to show that
1
and
1
[z/x] with
z , Var(). The fact that z , Var() is a direct consequence of proposition (71)
and y ,= z. So it remains to show that
1
and
1
[z/x]. First we show
that
1
. Since
1
and = [y/z], from the symmetry and transitivity
of the strong substitution congruence on P(V ) it is sucient to prove that
[y/z] , which itself follows from proposition (101) provided we show that
y , Var() and z , Fr (). We already know that y , Var(), so it remains to
prove that z , Fr (). In order to do so, we shall rst prove the implication:
z Fr () z Fr ([y/x]) (2.18)
119
using proposition (76) and the assumption y , Var(). In the case x , Fr (),
we obtain Fr ([y/x]) = Fr () and the implication (2.18) is clear. In the case
when x Fr (), we obtain Fr ([y/x]) = Fr () x y, and the implica-
tion (2.18) follows immediately from the fact that z ,= x. Having proved the
implication (2.18), we can now show that z , Fr () by instead proving that
z , Fr ([y/x]). Since
1
[y/x] and
1

it follows from the symmetry


and transitivity of the strong substitution congruence on P(V ) that [y/x]

.
Thus, from proposition (98) we obtain Fr ([y/x]) = Fr (

), and it is therefore
sucient to prove that z , Fr (

), which follows immediately from z , Var(

)
and Fr (

) Var(

), the latter being a consequence of proposition (80). This


completes our proof of
1
. It remains to show that
1
[z/x]. Since

[z/y] and = [y/z] we have to show:


[y/z][z/x]

[z/y] (2.19)
We shall prove the equivalence (2.19) by proving the following two results:
[y/z][z/x] [y/z][z/x][x/y] (2.20)

[z/y] [y/x][x/z][z/y] (2.21)


Suppose for now that (2.20) and (2.21) have been proved. Comparing with the
equivalence (2.19), it is sucient to show that:
[y/z][z/x][x/y] = [y/x][x/z][z/y] (2.22)
So we shall now prove equation (2.22). From proposition (64) it is sucient
to show that the maps [x/y] [z/x] [y/z] and [z/y] [x/z] [y/x] coincide on
Var(). So let u Var(). In particular u ,= y and we have to show that:
[x/y] [z/x] [y/z](u) = [z/y] [x/z] [y/x](u) (2.23)
We shall distinguish three cases. First we assume that u , x, y, z. Then it is
clear that the equality (2.23) holds. Next we assume that u = x:
[x/y] [z/x] [y/z](u) = [x/y] [z/x](u) = [x/y](z) = z
and:
[z/y] [x/z] [y/x](u) = [z/y] [x/z](y) = [z/y](y) = z
So the equality (2.23) holds. Finally we assume that u = z:
[x/y] [z/x] [y/z](u) = [x/y] [z/x](y) = [x/y](y) = x
and:
[z/y] [x/z] [y/x](u) = [z/y] [x/z](u) = [z/y](x) = x
So the equality (2.23) holds again. This completes our proof of (2.23) and (2.22).
It remains to show that the equivalence (2.20) and (2.21) are true. First we
show the equivalence (2.20). Using proposition (101), it is sucient to show
120
that we have both x , Var([y/z][z/x]) and y , Fr ([y/z][z/x]). The fact that
x , Var([y/z][z/x]) is a consequence of proposition (71) and x ,= z. So we need
to show that y , Fr ([y/z][z/x]). We shall do so by applying proposition (76)
from which we obtain, provided we show z , Var([y/z]):
Fr ([y/z][z/x]) =
_
Fr ([y/z]) x z if x Fr ([y/z])
Fr ([y/z]) if x , Fr ([y/z])
(2.24)
However, before we can use equation (2.24) we need to check z , Var([y/z]),
which in fact immediately follows from proposition (71) and y ,= z. Having jus-
tied equation (2.24), it is now clear that in order to prove y , Fr ([y/z][z/x]),
it is sucient to prove that y , Fr ([y/z]). Since y , Var(), we can ap-
ply proposition (76) once more, and having already shown z , Fr () while
proving the equivalence
1
, we obtain Fr ([y/z]) = Fr (). Hence, it is
sucient to prove that y , Fr () which follows immediately from y , Var()
and Fr () Var(). This completes our proof of the equivalence (2.20). It re-
mains to show that the equivalence (2.21) is true. Using proposition (100), it is
sucient to prove that

[y/x][x/z] and z , Var(

) Var([y/x][x/z]). We
already know that z , Var(

), and z , Var([y/x][x/z]) is an immediate conse-


quence of proposition (71) and x ,= z. So we need to show that

[y/x][x/z].
From
1
[y/x] and
1

we obtain [y/x]

and it is therefore su-


cient to prove that [y/x] [y/x][x/z]. Using proposition (101), it is sucient
to show that x , Var([y/x]) and z , Fr ([y/x]). The fact that x , Var([y/x])
is an immediate consequence of proposition (71) and x ,= y. So it remains to
show that z , Fr ([y/x]), which in fact was already proved in the course of
proving the equivalence
1
.
Having shown that is a reexive, symmetric and transitive relation on
P(V ), it remains to prove that it is also a congruent relation on P(V ). However,
this cannot be done before we show the implication .
Proposition 108 Let be the almost strong equivalence and be the strong
substitution congruence on P(V ), where V is a set. For all , P(V ):

Proof
Let , P(V ) such that . We need to show that . We shall con-
sider the ve possible cases of denition (102) in relation to . Suppose rst
that = P
0
(V ). From the reexivity of the strong substitution congru-
ence, it is clear that . Suppose next that = = . Then we also have
. We now assume that =
1

2
and =
1

2
where
1

1
and
2

2
. The strong substitution congruence being a congruent relation on
P(V ), we obtain . Next we assume that = x
1
and = x
1
where

1

1
and x V . Again, the strong substitution congruence being a congru-
ent relation we obtain . Finally we assume that = x
1
and = y
1
where x ,= y, and there exists P(V ) such that
1
,
1
[y/x] and
121
y , Var(). Using once again the fact that the strong substitution congruence
is a congruent relation, we see that x and y [y/x]. By symmetry
and transitivity of the strong substitution congruence, it is therefore sucient
to show that x y [y/x] which follows immediately from x ,= y, y , Var()
and denition (97).
We are now in a position to show that is a congruent relation, which is
the last part missing before we can conclude that is a congruence on P(V ).
Proposition 109 Let be the almost strong equivalence relation on P(V )
where V is a set. Then is a congruent relation on P(V ).
Proof
From proposition (105), the almost strong equivalence is reexive and so
. We now assume that =
1

2
and =
1

2
where
1

1
and
2

2
. We need to show that . However from proposition (108) we
have
1

1
and
2

2
and it follows from denition (102) that . We
now assume that = x
1
and = x
1
where
1

1
and x V . We need
to show that . Once again from proposition (108) we have
1

1
and
consequently from denition (102) we obtain .
Proposition 110 Let be the almost strong equivalence relation on P(V )
where V is a set. Then is a congruence on P(V ).
Proof
We need to show that is reexive, symmetric, transitive and that it is a con-
gruent relation on P(V ). From proposition (105), the relation is reexive.
From proposition (106) it is symmetric while from proposition (107) it is tran-
sitive. Finally from proposition (109) the relation is a congruent relation.
So is a congruence on P(V ) such that R
0
. We conclude this section
with the equivalence and summarize with theorem (11) below.
Proposition 111 Let be the almost strong equivalence and be the strong
substitution congruence on P(V ), where V is a set. For all , P(V ):

Proof
From proposition (108) it is sucient to show the implication or equivalently
the inclusion . Since is the strong substitution congruence on P(V ), it
is the smallest congruence on P(V ) which contains the set R
0
of denition (97).
In order to show the inclusion it is therefore sucient to show that is
a congruence on P(V ) such that R
0
. The fact that it is a congruence stems
from proposition (110). The fact that R
0
follows from proposition (104).
122
We are now ready to provide a characterization of the strong substitution
congruence. It should be noted that (v) of theorem (11) below is a notational
shortcut for the more detailed statement = x
1
, = y
1
, x ,= y and
there exists a formula P(V ) such that
1
,
1
[y/x] , y , Var().
Theorem 11 Let be the strong substitution congruence on P(V ) where V is
a set. For all , P(V ), if and only if one of the following is the case:
(i) P
0
(V ) , P
0
(V ) , and =
(ii) = and =
(iii) =
1

2
, =
1

2
,
1

1
and
2

2
(iv) = x
1
, = x
1
and
1

1
(v) = x
1
, = y
1
, x ,= y ,
1
,
1
[y/x] , y , Var()
Proof
Immediately follows from proposition (111) and denition (102).
2.2.6 Counterexamples of the Strong Congruence
A fair amount of work has been devoted to the study of the strong substitution
congruence, culminating with its characterization in the form of theorem (11).
Unfortunately, most of this work was done vain. As we shall now discover, the
strong substitution congruence is not the appropriate notion to study. It may
give us insight and will certainly help us in the forthcoming developments, it
may be commonly referred to in the literature, but it is marred by a major
aw. The problem is as follows: when V = x, y with x ,= y, i.e. when V has
only two elements, the formulas = xy(x y) and = yx(y x) are not
equivalent. This is highly disappointing. If there is one thing we would expect of
a substitution congruence, it is certainly to be such that and be equivalent.
This leaves us with a painful alternative: we either give up on the possibility
of nite sets of variables V , or we abandon the notion of strong substitution
congruence and all the work that has gone along with it. As already hinted,
we are not prepared to accept that nothing interesting can be said with nite
sets V . There must be an appropriate notion of substitution congruence to be
dened on P(V ), and we shall therefore continue our search for it. But before
we resume our quest, we shall rst prove that the problem truly exists:
Proposition 112 Let be the strong substitution congruence on P(V ) where
V = x, y and x ,= y. Then, we have:
xy (x y) , yx(y x)
Proof
Dene
1
= y(x y) and
1
= x(y x). We need to show that x
1
, y
1
.
Suppose to the contrary that x
1
y
1
. We shall derive a contradiction.
123
Since x ,= y, from theorem (11) of page 123 there exists P(V ) such that

1
,
1
[y/x] and y , Var(). From
1
we see that y(x y) ,
and applying theorem (11) once more, since V = x, y we see that must
be of the form = x
1
or = y
1
. However the case = y
1
is impossi-
ble since y , Var(). Suppose now that = x
1
for some
1
P(V ). Then
x , Fr () = Fr (
1
)x. However we have y(x y) from proposition (98)
we have x = Fr ( y(x y) ) = Fr (). This is our desired contradiction.
As we have just seen, the strong substitution congruence is inadequate when
V has two elements. In fact, it is also inadequate when V has three elements as
the following proposition shows. We do not intend to spend too much time prov-
ing negative results, but we certainly believe the strong substitution congruence
on P(V ) will fail whenever V is a nite set.
Proposition 113 Let be the strong substitution congruence on P(V ) where
V = x, y, z and x ,= y, y ,= z and x ,= z. Then, we have:
xyz [(x y) (y z)] , yzx[(y z) (z x)]
Proof
Dene
1
= yz [(x y) (y z)] and
1
= zx[(y z) (z x)]. We
need to show that x
1
, y
1
. Suppose to the contrary that x
1
y
1
. We
shall derive a contradiction. Since x ,= y, from theorem (11) of page 123 there
exists P(V ) such that
1
,
1
[y/x] and y , Var(). From
1
we
see that yz [(x y) (y z)] , and applying theorem (11) once more,
since V = x, y, z we see that must be of the form = x
1
, = y
1
or = z
1
. However the case = y
1
is impossible since y , Var().
Furthermore, the case = x
1
is also impossible since this would imply that
x , Fr (), while from yz [(x y) (y z)] and proposition (98):
x = Fr ( yz [(x y) (y z)] ) = Fr ()
It follows that = z
1
for some
1
P(V ). Hence we see that:
yz [(x y) (y z)] z
1
Applying theorem (11) once more, there exists P(V ) such that we have
z [(x y) (y z)] ,
1
[z/y] and z , Var(). From the strong equiv-
alence z [(x y) (y z)] and yet another application of theorem (11),
since V = x, y, z we see that must be of the form = x
1
, = y
1
or
= z
1
. However, the case = z
1
is impossible since z , Var(). Suppose
now that = x
1
. Then x , Fr (), contradicting the equality:
x, y = Fr ( z [(x y) (y z)] ) = Fr ()
obtained from z [(x y) (y z)] and proposition (98). Since y Fr ()
we conclude similarly that = y
1
is equally impossible.
124
2.2.7 The Substitution Congruence
Prior to dening the strong substitution congruence in page 110, we discussed
what we believed were the requirements a substitution congruence ought to meet.
Unfortunately, despite our best eorts, we failed to dene the right notion. So
we are back to our starting point, enquiring about the appropriate denition of
a substitution congruence. In proposition (101) we showed that given a formula
, we have the equivalence [y/x] provided y , Var() and x , Fr ().
The key idea underlying this proposition is that some substitution should not
aect the meaning of a formula. This may give us a new angle of attack. In
this section, we dene those substitutions which we believe should have this
invariance property of not altering the equivalence class of a given formula ,
i.e. such that () . We shall then dene a new substitution congruence.
In proposition (101) we required that x , Fr (). This condition ensures
the substitution [y/x] does not aect free variables of the formula . There is
clearly a good reason for this. A substitution congruence is all about potentially
diering quantication variables. We cannot expect to have () unless the
free variables of are invariant under the substitution . Furthermore, proposi-
tion (101) also required y , Var(). This condition guarantees the substitution
[y/x] is valid for . This motivates the following denition:
Denition 114 Let V be a set and : V V be a map. Let P(V ). We
say that is an admissible substitution for if and only if it satises:
(i) valid for
(ii) u Fr () , (u) = u
Having dened admissible substitutions with respect to a formula , our
belief is that an appropriate substitution congruence should be such that
() whenever is admissible for . If we consider = xy(x y)
and the permutation = [y : x] for x ,= y, then is clearly admissible for .
Yet we know from proposition (112) that the equivalence () fails when
V = x, y and is the strong substitution congruence on P(V ). To remedy
this failure, we may conjecture that an appropriate denition of a substitution
congruence should be made in reference to ordered pairs (, ()) where is
admissible for . In other words, if we dene:
R
1
= ( , () ) : P(V ) , : V V admissible for
the right substitution congruence should simply be dened as the congruence on
P(V ) generated by R
1
. As it turns out, we shall adopt a dierent but equivalent
denition of the substitution congruence, in terms of a generator R
0
which is
a lot smaller than R
1
. In eect, rather than considering all possible ordered
pair (, ()) where is admissible for , we restrict our attention to the case
when = x
1
and = [y : x] for x ,= y and y , Fr (
1
), thereby obtaining
a denition of the substitution congruence which is formally very similar to
denition (97) of the strong substitution congruence.
125
Denition 115 Let V be a set. We call substitution congruence on P(V ) the
congruence on P(V ) generated by the following set R
0
P(V ) P(V ):
R
0
= ( x
1
, y
1
[y : x] ) :
1
P(V ) , x, y V , x ,= y , y , Fr (
1
)
where [y : x] denotes the permutation of x and y as per denition (66).
We have now dened what we hope will be a denitive notion of substitution
congruence. However, the denition (115) we chose does not make reference
to ordered pairs (, ()) where is an admissible substitution for . One of
our rst task is to make sure denition (115) is equivalent to having dened
the substitution congruence as generated by these ordered pairs. We start by
showing that () whenever is admissible for .
Proposition 116 Let be the substitution congruence on P(V ) where V is a
set. Let P(V ) and : V V be an admissible substitution for . Then:
()
Proof
We need to show the property [ ( admissible for ) () ] for all
P(V ). We shall do so by structural induction, using theorem (3) of page 32.
Since P
0
(V ) is a generator of P(V ), we shall rst show that the property is true
on P
0
(V ). So let = (x y) P
0
(V ) where x, y V . We assume that
: V V is an admissible substitution for . We need to show that ().
However since Fr () = x, y and is an admissible substitution for , we have
(x) = x and (y) = y and consequently:
() = (x y) = ( (x) (y) ) = (x y) =
and in particular we see that (). We now check that the property is
true for = . Note that any map : V V is an admissible substitution
for . Since we always have = (), it follows that (). We now
check that the property is true for =
1

2
if it is true for
1
,
2
P(V ).
So we assume that : V V is an admissible substitution for . We need
to show that (). Since =
1

2
and () = (
1
) (
2
), the
substitution congruence being a congruent relation on P(V ), it is sucient to
show that
1
(
1
) and
2
(
2
). First we show that
1
(
1
). Having
assumed the property is true for
1
, it is sucient to show that is an ad-
missible substitution for
1
. Since admissible for , in particular it is valid
for and it follows from proposition (87) that it is also valid for
1
. So it
remains to show that (u) = u for all u Fr (
1
) which follows immediately
from Fr () = Fr (
1
) Fr (
2
) and the fact that (u) = u for all u Fr ().
So we have proved that
1
(
1
) and we show similarly that
2
(
2
).
We now need to check that the property is true for = x
1
if it is true for

1
P(V ). So we assume that : V V is an admissible substitution for
. We need to show that (). We shall distinguish two cases: rst we
assume that (x) = x. Then () = x(
1
) and in order to show (),
126
the substitution congruence being a congruent relation on P(V ), it is sucient
to show that
1
(
1
). Having assumed the property is true for
1
, it is
therefore sucient to prove that is an admissible substitution for
1
. Since
admissible for , in particular it is valid for and it follows from proposi-
tion (88) that it is also valid for
1
. So it remains to show that (u) = u for all
u Fr (
1
). We shall distinguish two further cases: rst we assume that u = x.
Then (u) = u is true from our assumption (x) = x. So we assume that u ,= x.
It follows that u Fr (
1
) x = Fr (), and since is admissible for , we
conclude that (u) = u. This completes our proof of () in the case when
(x) = x. We now assume that (x) ,= x. Let y = (x). Then () = y (
1
)
and we need to show that x
1
y (
1
). However, since [y : x] [y : x] is the
identity mapping we have = [y : x]

where the map

: V V is dened
as

= [y : x] . It follows that (
1
) =

(
1
)[y : x] and we need to show
that x
1
y

(
1
)[y : x]. Let us accept for now that y , Fr (

(
1
)). Then
from denition (115) we obtain x

(
1
) y

(
1
)[y : x], and it is there-
fore sucient to prove that x
1
x

(
1
). So we see that it is sucient
to prove
1

(
1
) provided we can justify the fact that y , Fr (

(
1
)).
First we show that
1

(
1
). Having assumed our property is true for
1
it is sucient to prove that

is admissible for
1
. However, we have already
seen that is valid for
1
. Furthermore, since [y : x] is an injective map, from
proposition (86) it is a valid substitution for (
1
). It follows from proposi-
tion (91) that

= [y : x] is valid for
1
. So in order to prove that

is
admissible for
1
, it remains to show that

(u) = u for all u Fr (


1
). So let
u Fr (
1
). We shall distinguish two cases: rst we assume that u = x. Then

(u) = [y : x]((x)) = [y : x](y) = x = u. Next we assume that u ,= x. Then u


is in fact an element of Fr (). Having assumed is admissible for we obtain
(u) = u. We also obtain the fact that is valid for = x
1
and consequently
(u) ,= (x), i.e. u ,= y. Thus

(u) = [y : x]((u)) = [y : x](u) = u. This


completes our proof that

is admissible for
1
and
1

(
1
). It remains to
show that y , Fr (

(
1
)). So suppose to the contrary that y Fr (

(
1
)). We
shall obtain a contradiction. Using proposition (74) there exists u Fr (
1
) such
that y =

(u). Having proven that

is admissible for
1
we have

(u) = u
and consequently y = u Fr (
1
). From the assumption y = (x) ,= x we in
fact have y Fr (). So from the admissibility of for we obtain (y) = y
and furthermore from the validity of for = x
1
we obtain (y) ,= (x). So
we conclude that y ,= (x) which contradicts our very denition of y.
We are now in a position to check that the substitution congruence is also
generated by the set of ordered pairs (, ()) where is admissible for .
Proposition 117 Let V be a set. Then the substitution congruence on P(V )
is also generated by the following set R
1
P(V ) P(V ):
R
1
= ( , () ) : P(V ) , : V V admissible for
Proof
Let denote the substitution congruence on P(V ) and be the congruence
127
on P(V ) generated by R
1
. We need to show that =. First we show that
. Since is the smallest congruence on P(V ) which contains the set R
0
of denition (115), in order to prove it is sucient to prove that R
0
.
So let
1
P(V ) and x, y V be such that x ,= y and y , Fr (
1
). Dene
= x
1
and = y
1
[y : x]. We need to show that . The congruence
being generated by R
1
it is sucient to prove that (, ) R
1
. However, if
we dene : V V by setting = [y : x] we have:
= y
1
[y : x] = (x) (
1
) = (x
1
) = ()
Hence, in order to show (, ) R
1
, it is sucient to prove that is an ad-
missible substitution for . Being injective, it is clear from proposition (86)
that is valid for . So we need to prove that (u) = u for all u Fr ().
So let u Fr (). It is sucient to prove that u , x, y. The fact that
u ,= x is clear from u Fr () = Fr (
1
) x. The fact that u ,= y follows
from u Fr () Fr (
1
) and the assumption y , Fr (
1
). We now show that
. Since is the smallest congruence on P(V ) which contains the set R
1
,
it is sucient to show that R
1
. So let P(V ) and : V V be an
admissible substitution for . We need to show that (). But this follows
immediately from proposition (116).
2.2.8 Free Variable and Substitution Congruence
A lot of the work which was done for the strong substitution congruence needs
to be done all over again for the substitution congruence. We start by showing
that equivalent formulas have the same free variables. The following proposition
is the counterpart of proposition (98) of the strong substitution congruence.
Proposition 118 Let denote the substitution congruence on P(V ) where V
is a set. Then for all , P(V ) we have the implication:
Fr () = Fr ()
Proof
Let be the relation on P(V ) dened by Fr () = Fr (). We need
to show that or equivalently that the inclusion holds.
Since is the substitution congruence on P(V ), it is the smallest congruence
on P(V ) which contains the set R
1
of proposition (117). In order to show the
inclusion it is therefore sucient to show that is a congruence on P(V )
such that R
1
. However, we already know from proposition (77) that is
a congruence on P(V ). So it remains to show that R
1
. So let P(V )
and : V V be an admissible substitution for . We need to show that
() or equivalently that Fr () = Fr (()). However, since is valid for
, from proposition (85) we have Fr (()) = (Fr ()) so we need to show that
Fr () = (Fr ()) which follows from the fact that (u) = u for all u Fr ().
128
Proposition 119 Let denote the substitution congruence on P(V ) where V
is a set. Then for all
1
P(V ) and x, y V such that x, y , Fr (
1
) we have:
x
1
y
1
Proof
From y , Fr (
1
) and denition (115) we see that x
1
y
1
[y : x]. So
we need to show that y
1
[y : x] y
1
. Hence it is sucient to prove that

1
[y : x]
1
. Using proposition (116), we simply need to argue that [y : x]
is an admissible substitution for
1
. Being injective, from proposition (86) it
is a valid substitution for
1
. So it remains to show that [y : x](u) = u for all
u Fr (
1
) which follows immediately from x, y , Fr (
1
).
2.2.9 Substitution and Substitution Congruence
We now show that equivalence between formulas is invariant under injective
substitution. The following proposition is the counterpart of proposition (99)
of the strong substitution congruence.
Proposition 120 Let V and W be sets and : V W be an injective map.
Let be the substitution congruence both on P(V ) and P(W). Then:
() ()
for all , P(V ), where : P(V ) P(W) is also the substitution mapping.
Proof
Let be the relation on P(V ) dened by () (). We need to
show that or equivalently that the inclusion holds. Since
is the substitution congruence on P(V ), it is the smallest congruence on P(V )
which contains the set R
0
of denition (115). In order to show the inclusion
it is therefore sucient to show that is a congruence on P(V ) such that
R
0
. However, we already know from proposition (58) that is a congruence
on P(V ). So it remains to show that R
0
. So let
1
P(V ) and x, y V
be such that x ,= y and y , Fr (
1
). Dene = x
1
and = y
1
[y : x]. We
need to show that or equivalently that () (). In order to do so, it
is sucient to show that the ordered pair ((), ()) belongs to the generator
R

0
of the substitution congruence on P(W) as per denition (115). In other
words, it is sucient to show the existence of

1
P(W) and x

, y

W with
x

,= y

and y

, Fr (

1
), such that () = x

1
and () = y

1
[y

: x

]. Take

1
= (
1
) P(W) together with x

= (x) W and y

= (y) W. Then:
() = (x
1
) = (x) (
1
) = x

1
Furthermore, from proposition (67) we have [y : x] = [(y): (x)] and so:
() = (y
1
[y : x])
129
= (y) (
1
[y : x])
= y

([y : x](
1
))
= y

[y : x] (
1
)
= y

[(y): (x)] (
1
)
= y

[y

: x

](

1
)
= y

1
[y

: x

]
So it remains to show that x

,= y

and y

, Fr (

1
). Since : V W is an
injective map, x

,= y

follows immediately from x ,= y. We now show that


y

, Fr (

1
). So suppose to the contrary that y

Fr (

1
). We shall arrive at a
contradiction. Since

1
= (
1
), from proposition (75) we have:
Fr (

1
) = (u) : u Fr (
1
)
It follows that there exists u Fr (
1
) such that y

= (u). However, y

= (y)
and : V W is an injective map. Hence we see that u = y and consequently
y Fr (
1
) which contradicts our initial assumption of y , Fr (
1
).
2.2.10 Characterization of the Substitution Congruence
Just like the strong substitution congruence, the substitution congruence was
dened in terms of a generator. This makes it convenient to prove an equivalence
, but inconvenient when it comes to proving this equivalence does not hold.
We shall therefore follow a similar strategy to that leading up to theorem (11)
of page 123. We refer the reader to the discussion preceding denition (102).
We shall dene a new relation on P(V ) and prove that .
The point of the relation is to give us more insight about the formulas and
whenever we have . Most of the hard work consists in proving that is
a transitive relation. However, this will be a lot simpler than in the case of the
strong substitution congruence. In fact comparing denition (121) below to its
counterpart denition (102), we see that these denitions are formally identical
with a dierence occurring solely in (v). For the strong substitution congruence:
(v) = x
1
, = y
1
, x ,= y ,
1
,
1
[y/x] , y , Var()
while for the substitution congruence we have:
(v) = x
1
, = y
1
, x ,= y ,
1

1
[y : x] , y , Fr (
1
)
This is the beauty of the substitution congruence. Not only does it work equally
well for V nite and V innite or so we believe, but it is also a lot simpler in
many respects. It involves the injective permutation [y : x] rather than the
non-injective substitution [y/x]. It makes no reference to the existence of a
third formula in the case of (v) but directly gives us
1

1
[y : x] instead.
The proofs in the case of the substitution congruence are simpler than their
counterparts of the strong substitution congruence.
130
Denition 121 Let be the substitution congruence on P(V ) where V is a
set. Let , P(V ). We say that is almost equivalent to and we write
, if and only if one of the following is the case:
(i) P
0
(V ) , P
0
(V ) , and =
(ii) = and =
(iii) =
1

2
, =
1

2
,
1

1
and
2

2
(iv) = x
1
, = x
1
and
1

1
(v) = x
1
, = y
1
, x ,= y ,
1

1
[y : x] , y , Fr (
1
)
Proposition 122 (i), (ii), (iii), (iv), (v) of def. (121) are mutually exclusive.
Proof
This is an immediate consequence of theorem (2) of page 21 applied to the free
universal algebra P(V ) with free generator P
0
(V ), where a formula P(V )
is either an element of P
0
(V ), or the contradiction constant = , or an im-
plication =
1

2
, or a quantication = x
1
, but cannot be equal to
any two of those things simultaneously. Since (v) can only occur with x ,= y,
it also follows from theorem (2) that (v) cannot occur at the same time as (iv).
Having dened the relation , we now proceed to prove the equivalence
. The strategy is the same as the one adopted for the strong
substitution congruence. The dicult part is to show the inclusion , which
we achieve by showing R
0
together with the fact that is a congruence.
Proposition 123 Let be the almost equivalence relation on P(V ) where V
is a set. Then contains the generator R
0
of denition (115).
Proof
Suppose x, y V and
1
P(V ) are such that x ,= y and y , Fr (
1
). Note that
this cannot happen unless V has at least two elements. We dene = x
1
and
= y
1
[y : x]. We need to show that . We shall do so by proving that
(v) of denition (121) is the case. Dene
1
=
1
[y : x]. Then we have = x
1
,
= y
1
, x ,= y and y , Fr (
1
). So it remains to show that
1

1
[y : x]
which is immediate from the reexivity of .
Proposition 124 Let be the almost equivalence relation on P(V ) where V
is a set. Then is a reexive relation on P(V ).
Proof
Let P(V ). We need to show that . From theorem (2) of page 21
we know that is either an element of P
0
(V ), or = or =
1

2
or
= x
1
for some
1
,
2
P(V ) and x V . We shall consider these four
mutually exclusive cases separately. Suppose rst that P
0
(V ): then from
= we obtain . Suppose next that = : then it is clear that .
131
Suppose now that =
1

2
. Since the substitution congruence is reexive,
we have
1

1
and
2

2
. It follows from (iii) of denition (121) that .
Suppose nally that = x
1
. From
1

1
and (iv) of denition (121) we
conclude that . In all cases, we have proved that .
Proposition 125 Let be the almost equivalence relation on P(V ) where V
is a set. Then is a symmetric relation on P(V ).
Proof
Let , P(V ) be such that . We need to show that . We shall
consider the ve possible cases of denition (121): suppose rst that P
0
(V ),
P
0
(V ) and = . Then it is clear that . Suppose next that =
and = . Then we also have . We now assume that =
1

2
and
=
1

2
with
1

1
and
2

2
. Since the substitution congruence
on P(V ) is symmetric, we have
1

1
and
2

2
. Hence we have .
We now assume that = x
1
and = x
1
with
1

1
. Then once again
by symmetry of the substitution congruence we have
1

1
and consequently
. We nally consider the last possible case of = x
1
, = y
1
with
x ,= y,
1

1
[y : x] and y , Fr (
1
). We need to show that
1

1
[x: y] and
x , Fr (
1
). First we show that
1

1
[x: y]. Note that [x : y] and [y : x] are
in fact the same substitutions. So we need to show that
1

1
[y : x]. Since

1

1
[y : x] and [y : x] : V V is injective, from proposition (120) we obtain:

1
[y : x]
1
[y : x][y : x]
It is therefore sucient to show that
1

1
[y : x][y : x] which follows from
proposition (56) and the fact that [y : x] [y : x] is the identity mapping. We now
show that x , Fr (
1
). From
1

1
[y : x] and proposition (118) we obtain
Fr (
1
) = Fr (
1
[y : x]). So we need to show that x , Fr (
1
[y : x]). So suppose
to the contrary that x Fr (
1
[y : x]). We shall derive a contradiction. Since
[y : x] is injective, from proposition (75) we have Fr (
1
[y : x]) = [y : x](Fr (
1
))
and consequently there exists u Fr (
1
) such that x = [y : x](u). It follows
that u = y which contradicts the assumption y , Fr (
1
).
Showing that is a transitive relation is the dicult part of this section.
The following proposition is the counterpart of proposition (107) of the strong
substitution congruence. Its proof is however a lot simpler.
Proposition 126 Let be the almost equivalence relation on P(V ) where V
is a set. Then is a transitive relation on P(V ).
Proof
Let , and P(V ) be such that and . We need to show that
. We shall consider the ve possible cases of denition (121) in relation to
. Suppose rst that , P
0
(V ) and = . Then from we obtain
, P
0
(V ) and = . It follows that , P
0
(V ) and = . Hence we
132
see that . We now assume that = = . Then from we obtain
= = . It follows that = = and consequently . We now assume
that =
1

2
and =
1

2
with
1

1
and
2

2
. From we
obtain =
1

2
with
1

1
and
2

2
. The substitution congruence
being transitive, it follows that
1

1
and
2

2
. Hence we see that .
We now assume that = x
1
and = x
1
with
1

1
, for some x V .
From only the cases (iv) and (v) of denition (121) are possible. First we
assume that (iv) is the case. Then = x
1
with
1

1
. The substitution
congruence being transitive, we obtain
1

1
and consequently . We
now assume that (v) is the case. Then = y
1
for some y V with x ,= y,

1

1
[y : x] and y , Fr (
1
). In order to prove it is sucient to show
that
1

1
[y : x] and y , Fr (
1
). First we show that
1

1
[y : x]. It is
sucient to prove that
1
[y : x]
1
[y : x] which follows from
1

1
, using
proposition (120) and the fact that [y : x] : V V is injective. We now show that
y , Fr (
1
). From
1

1
and proposition (118) we obtain Fr (
1
) = Fr (
1
).
Hence it is sucient to show that y , Fr (
1
) which is true by assumption. It
remains to consider the last possible case of denition (121). So we assume that
= x
1
and = y
1
with x ,= y,
1

1
[y : x] and y , Fr (
1
). From
only the cases (iv) and (v) of denition (121) are possible. First we
assume that (iv) is the case. Then = y
1
with
1

1
. The substitution
congruence being transitive, we obtain
1

1
[y : x] and consequently .
We now assume that (v) is the case. Then = z
1
for some z V with y ,= z,

1

1
[z : y] and z , Fr (
1
). We shall now distinguish two cases. First we
assume that x = z. Then = x
1
and = x
1
and in order to show that
it is sucient to prove that
1

1
. From
1

1
[z : y] and z = x we
obtain
1

1
[y : x] and it is therefore sucient to prove that
1

1
[y : x].
However, we know that
1

1
[y : x] and since [y : x] : V V is injective,
from proposition (120) we obtain
1
[y : x]
1
[y : x][y : x]. It is therefore
sucient to show that
1

1
[y : x][y : x] which follows from proposition (56)
and the fact that [y : x] [y : x] is the identity mapping. This completes our
proof in the case when x = z. We now assume that x ,= z. So we have x ,= y,
y ,= z and x ,= z, with = x
1
, = y
1
and = z
1
. Furthermore,

1

1
[y : x] and
1

1
[z : y] while we have y , Fr (
1
) and z , Fr (
1
),
and we need to show that . So we need to prove that
1

1
[z : x]
and z , Fr (
1
). First we show that z , Fr (
1
). So suppose to the contrary
that z Fr (
1
). We shall derive a contradiction. Since [y : x] is injective,
from proposition (75) we have Fr (
1
[y : x]) = [y : x](Fr (
1
)). It follows that
z = [y : x](z) is also an element of Fr (
1
[y : x]). However we have
1

1
[y : x]
and consequently from proposition (118) we obtain Fr (
1
) = Fr (
1
[y : x]).
Hence we see that z Fr (
1
) which is our desired contradiction. We shall
now prove that
1

1
[z : x]. Since
1

1
[z : y], it is sucient to show that

1
[z : y]
1
[z : x]. However we know that
1

1
[y : x] and since [z : y] : V V
is injective, from proposition (120) we obtain
1
[z : y]
1
[y : x][z : y]. It is
therefore sucient to show that
1
[y : x][z : y]
1
[z : x]. Let us accept for now:
[z : x] = [y : x] [z : y] [y : x] (2.25)
133
Then we simply need to show that
1
[y : x][z : y]
1
[y : x][z : y][y : x]. Using
proposition (116), it is therefore sucient to prove that [y : x] is an admissible
substitution for
1
[y : x][z : y]. Since [y : x] : V V is injective, from proposi-
tion (86) it is valid for
1
[y : x][z : y]. So it remains to show that [y : x](u) = u for
all u Fr (
1
[y : x][z : y]). It is therefore sucient to prove that neither x nor y
are elements of Fr (
1
[y : x][z : y]). However, since [z : y] [y : x] : V V is injec-
tive, from proposition (75) we have Fr (
1
[y : x][z : y]) = [z : y] [y : x](Fr (
1
)). So
we need to show that x and y do not belong to [z : y] [y : x](Fr(
1
)). First we
do this for x. Suppose x = [z : y] [y : x](u) for some u Fr (
1
). By injectivity
we must have u = y, contradicting the assumption y , Fr (
1
). We now deal
with y. So suppose y = [z : y] [y : x](u) for some u Fr (
1
). By injectivity we
must have u = z, contradicting the fact that z , Fr (
1
) which we have already
proven. It remains to prove that equation (2.25) holds. So let u V . We need:
[z : x](u) = [y : x] [z : y] [y : x](u)
This is clearly the case when u , x, y, z. The cases u = x, u = y and u = z
are easily checked.
The implication is the simple one. We need to prove it
now in order to show that is a congruent relation.
Proposition 127 Let be the almost equivalence and be the substitution
congruence on P(V ), where V is a set. For all , P(V ):

Proof
Let , P(V ) such that . We need to show that . We shall
consider the ve possible cases of denition (121) in relation to . Suppose
rst that = P
0
(V ). From the reexivity of the substitution congruence,
it is clear that . Suppose next that = = . Then we also have
. We now assume that =
1

2
and =
1

2
where
1

1
and
2

2
. The substitution congruence being a congruent relation on P(V ),
we obtain . Next we assume that = x
1
and = x
1
where
1

1
and x V . Again, the substitution congruence being a congruent relation we
obtain . Finally we assume that = x
1
and = y
1
where x ,= y,

1

1
[y : x] and y , Fr (
1
). For the last time, the substitution congruence
being a congruent relation we obtain y
1
[y : x]. Hence in order to show
it is sucient to show that x
1
y
1
[y : x] which follows immediately
from x ,= y, y , Fr (
1
) and denition (115).
Proposition 128 Let be the almost equivalence relation on P(V ) where V
is a set. Then is a congruent relation on P(V ).
Proof
From proposition (124), the almost equivalence is reexive and so . We
134
now assume that =
1

2
and =
1

2
where
1

1
and
2

2
.
We need to show that . However from proposition (127) we have
1

1
and
2

2
and it follows from denition (121) that . We now assume
that = x
1
and = x
1
where
1

1
and x V . We need to show that
. Once again from proposition (127) we have
1

1
and consequently
from denition (121) we obtain .
Proposition 129 Let be the almost equivalence relation on P(V ) where V
is a set. Then is a congruence on P(V ).
Proof
We need to show that is reexive, symmetric, transitive and that it is a con-
gruent relation on P(V ). From proposition (124), the relation is reexive.
From proposition (125) it is symmetric while from proposition (126) it is tran-
sitive. Finally from proposition (128) the relation is a congruent relation.
So we have shown that is a congruence on P(V ) which contains the genera-
tor R
0
of the substitution congruence. The equality = follows immediately.
Proposition 130 Let be the almost equivalence and be the substitution
congruence on P(V ), where V is a set. For all , P(V ):

Proof
From proposition (127) it is sucient to show the implication or equivalently
the inclusion . Since is the substitution congruence on P(V ), it is the
smallest congruence on P(V ) which contains the set R
0
of denition (115). In
order to show the inclusion it is therefore sucient to show that is a
congruence on P(V ) such that R
0
. The fact that it is a congruence stems
from proposition (129). The fact that R
0
follows from proposition (123).
We have no need to remember the relation . The equality = is wrapped
up in the following theorem for future reference. This is the counterpart of
theorem (11) of page 123 of the strong substitution congruence.
Theorem 12 Let be the substitution congruence on P(V ) where V is a set.
For all , P(V ), if and only if one of the following is the case:
(i) P
0
(V ) , P
0
(V ) , and =
(ii) = and =
(iii) =
1

2
, =
1

2
,
1

1
and
2

2
(iv) = x
1
, = x
1
and
1

1
(v) = x
1
, = y
1
, x ,= y ,
1

1
[y : x] , y , Fr (
1
)
Proof
Immediately follows from proposition (130) and denition (121).
135
2.2.11 Substitution Congruence vs Strong Congruence
In denition (97) we initially dened the strong substitution congruence think-
ing it was the appropriate notion to study. We then realized in proposition (112)
and proposition (113) that the strong substitution congruence is in fact unsatis-
factory in the case when the set of variables V has two or three elements. For ex-
ample, when V = x, y with x ,= y, we saw that the formulas = xy(x y)
and = yx(y x) failed to be equivalent. It is easy to believe that similar
failures can be uncovered whenever V is a nite set. So we decided to search for
a new notion of substitution congruence which led to denition (115). In this
section, we shall attempt to show that this new substitution congruence has all
the advantages of the strong substitution congruence, but without its aws.
First we look at the paradox of proposition (112). So let V = x, y with
x ,= y and = xy(x y) with = yx(y x). Then setting
1
= y(x y)
we have = x
1
and = y
1
[y : x]. Since x ,= y and y , Fr (
1
) we conclude
from denition (115) that where denotes the substitution congruence.
So the paradox of proposition (112) is lifted. Suppose now that V = x, y, z
where x, y and z are distinct, and dene = xyz [(x y) (y z)]
and = yzx[(y z) (z x)]. We claim that . If this is the
case, then the paradox of proposition (113) is also lifted. Let be the formula
dened by = yxz [(y x) (x z)]. Then is simply the formula
obtained from by permuting the variables x and y. In order to show
it is sucient to prove that and . First we show that .
Dening
1
= yz [(x y) (y z)] we have = x
1
and = y
1
[y : x].
Since x ,= y and y , Fr (
1
) we conclude from denition (115) that . We
now show that . Setting = xz [(y x) (x z)] together with

= zx[(y z) (z x)] we have = y and = y

. It is therefore
sucient to show that

. Dening
1
= z [(y x) (x z)] we obtain
= x
1
and

= z
1
[z : x]. Since x ,= z and z , Fr (
1
) we conclude from
denition (115) that

. So we have proved that the substitution congruence


is not subject to the paradoxes of proposition (112) and proposition (113).
Having reviewed the known shortcomings of the strong substitution congru-
ence, we shall now establish that the new substitution congruence is in fact
pretty much the same notion as the old. Specically, we shall see that both
congruences coincide when V is an innite set. Furthermore, even when V is
a nite set we shall see that two formulas and which are simply equiva-
lent with respect to the substitution congruence, are in fact strongly equivalent
provided and do not use up all the variables in V , i.e. Var() ,= V or
Var() ,= V . First we check that the strong substitution congruence is indeed
a relation which is stronger than the substitution congruence:
Proposition 131 Let be the substitution congruence and be the strong
substitution congruence on P(V ) where V is a set. Then for all , P(V ):

Proof
We need to show the inclusion . However, using denition (97) the strong
136
substitution congruence is generated by the following set:
R
0
= ( x
1
, y
1
[y/x] ) :
1
P(V ) , x, y V , x ,= y , y , Var(
1
)
while from denition (115) the congruence is generated by the set:
R
1
= ( x
1
, y
1
[y : x] ) :
1
P(V ) , x, y V , x ,= y , y , Fr (
1
)
So it is sucient to show that R
0
R
1
which follows from the fact that

1
[y/x] =
1
[y : x] whenever y , Var(
1
) as can be seen from proposition (69).
Proposition 132 Let be the strong substitution congruence on P(V ) where
V is a set. Let P(V ) such that y , Fr () and Var() ,= V . Then we have:
x y [y : x]
Proof
Suppose for now that y , Var(). From proposition (69) we have the equality
[y : x] = [y/x] and the strong equivalence x y [y : x] is therefore a of
consequence y , Var() and denition (97). So the conclusion of the proposition
follows easily with the simple assumption y , Var(). Thus we may assume that
y Var() from now on. Suppose now that P(V ) is such that y , Var()
and . Then for the reasons just indicated we obtain x y [y : x].
Furthermore, the strong substitution congruence being a congruent relation on
P(V ), from we have x x and y [y : x] y [y : x], where we
have used the fact that [y : x] [y : x] which itself follows from the injec-
tivity of [y : x] : V V and proposition (99). By transitivity it follows that
x y [y : x] and the proposition is proved. Thus, it is sucient to show
the existence of P(V ) with y , Var() and . Having assumed the
existence of z V such that z , Var(), let P(V ) be dened as = [z/y].
Note in particular that z ,= y since we have assumed y Var(). The fact that
y , Var() follows immediately from z ,= y and proposition (71). So it remains
to show that or equivalently that [z/y] which is a consequence of
proposition (101) and the facts that z , Var() and y , Fr ().
Proposition 133 Let be the substitution congruence and be the strong
substitution congruence on P(V ) where V is a set. Let , P(V ) be such
that Var() ,= V . Then we have the equivalence:

Proof
The implication follows from proposition (131). So we need to prove .
Specically, given P(V ), we need to show that satises the property:
Var() ,= V [ ]
137
We shall do so by a structural induction argument, using theorem (3) of page 32.
First we assume that = (x y) for some x, y V . As we shall see the con-
dition Var() ,= V will not be used here. So let P(V ) such that .
It is sucient to prove that . In fact it is sucient to show that =
which follows immediately from = (x y) and , using theorem (12) of
page 135. So we now assume that = . Then again from theorem (12) the
condition implies that = and we are done. So we now assume that
=
1

2
where
1
,
2
P(V ) satisfy our property. We need to show the
same is true of . So we assume that Var() ,= V and we consider P(V )
such that . We need to show that . However, using theorem (12)
once more the condition implies that must be of the form =
1

2
where
1

1
and
2

2
. Hence the strong substitution being a congru-
ent relation, in order to show it is sucient to prove that
1

1
and

2

2
. Having assumed our induction property is true for
1
, it follows from
Var(
1
) Var() that Var(
1
) ,= V and
1

1
is therefore an immediate
consequence of
1

1
. We prove similarly that
2

2
which completes the
case when =
1

2
. We now assume that = x
1
where x V and

1
P(V ) satises our property. We need to show the same is true for . So
we assume that Var() ,= V and consider P(V ) such that . We need
to show that . From theorem (12), the condition leads to two
possible cases: must be of the form = x
1
with
1

1
, or it must be of
the form = y
1
with x ,= y,
1

1
[y : x] and y , Fr (
1
). First we assume
that = x
1
which
1

1
. Then in order to show that it is sucient
to prove that
1

1
. Having assumed our induction property is true for
1
,
it follows from Var(
1
) Var() that Var(
1
) ,= V and
1

1
is therefore an
immediate consequence of
1

1
. So we now consider the second case when
= y
1
with x ,= y,
1

1
[y : x] and y , Fr (
1
). We need to show that
. However, using proposition (120) and the fact that [y : x] : V V is
an injective map such that [y : x] [y : x] is the identity mapping, we obtain
immediately
1
[y : x]
1
. Having assumed our property is true for
1
, from
Var(
1
) ,= V we obtain
1
[y : x]
1
. Composing once again on both side by
[y : x] we now argue from proposition (99) that
1

1
[y : x]. Thus in order to
show that it is sucient to prove that x
1
y
1
[y : x] which follows
immediately from proposition (132), y , Fr (
1
) and Var(
1
) ,= V .
So it is now clear that the substitution congruence and strong substitution
congruence are pretty much the same thing. We just need to bear in mind
that an equivalence may fail to imply a strong equivalence in
the case when all variables of V have been used up by both and . We
shall conclude this section by providing counterparts to proposition (116) and
proposition (117) for the strong substitution congruence.
Proposition 134 Let be the strong substitution congruence on P(V ) where
V is a set. Let P(V ) and : V V be an admissible substitution for
such that Var(()) ,= V . Then, we have:
()
138
Proof
Let be the substitution congruence on P(V ). Having assumed is admissible
for , using proposition (116) we obtain (). Since Var(()) ,= V we
conclude from proposition (133) that ().
Proposition (134) allows us to state another characterization of the strong
substitution congruence as the following proposition shows. In denition (97),
the strong substitution congruence was dened in terms of a generator:
R
0
= (, ()) : = x
1
, = [y/x] , x ,= y , y , Var(
1
)
As we shall soon discover, this generator is in fact a subset of the set R
1
of
ordered pairs (, ()) where is admissible for and such that Var(()) ,= V .
Since we now know from proposition (134) that R
1
is itself a subset of the strong
substitution congruence, it follows that R
1
is also a generator of the strong
substitution congruence, which we shall now prove formally:
Proposition 135 Let V be a set. Then the strong substitution congruence on
P(V ) is also generated by the following set R
1
P(V ) P(V ):
R
1
= ( , () ) : P(V ) , : V V admissible for , Var(()) ,= V
Proof
Let denote the strong substitution congruence on P(V ) and be the congru-
ence on P(V ) generated by R
1
. We need to show that =. First we show
that . Since is the smallest congruence on P(V ) which contains the
set R
0
of denition (97), in order to prove it is sucient to prove that
R
0
. So let
1
P(V ) and x, y V be such that x ,= y and y , Var(
1
).
Dene = x
1
and = y
1
[y/x]. We need to show that . The
congruence being generated by R
1
it is sucient to prove that (, ) R
1
.
However, if we dene : V V by setting = [y/x] we have:
= y
1
[y/x] = (x) (
1
) = (x
1
) = ()
Hence, in order to show (, ) R
1
, it is sucient to prove that is an ad-
missible substitution for and furthermore that Var(()) ,= V . The fact that
Var(()) ,= V follows immediately from x , Var([y/x]), which is itself a
consequence of x ,= y and proposition (71). So we need to show that is an
admissible substitution for . First we show that is valid for = [y/x]. This
follows immediately from proposition (90) and the fact that y is not an element
of Var() = Var(
1
) x. Next we show that (u) = u for all u Fr ().
So let u Fr () = Fr (
1
) x. Then in particular u ,= x and consequently
(u) = [y/x](u) = u. This completes our proof of . We now show that
. Since is the smallest congruence on P(V ) which contains the set R
1
,
it is sucient to show that R
1
. So let P(V ) and : V V be an
admissible substitution for such that Var(()) ,= V . We need to show that
(). But this follows immediately from proposition (134).
139
From proposition (135) the strong substitution congruence is generated by
the set of ordered pairs (, ()) where is admissible for and such that
Var(()) ,= V . This is also a aw of the strong substitution congruence as we
feel the condition Var(()) ,= V should not be there. In contrast, as can be seen
from proposition (117), the substitution congruence does not suer from this
condition, as it is simply generated by the set of ordered pairs (, ()) where
is admissible for . Whichever way we look at it, it is clear that the substitution
congruence is a better notion to look at than the strong substitution congruence.
In many respects, it is also a lot simpler. Before we close this section, we may go
back to a question we raised before dening the strong substitution congruence
in page 110. We decided to dene the strong substitution in terms of ordered
pairs (x
1
, y
1
[y/x]) where x ,= y and y , Var(
1
). What if we had opted for
the condition [y/x] valid for x
1
rather than y , Var(
1
) ? It is now clear
from proposition (134) that this would have yielded a congruence identical to
the strong substitution congruence. So we did well to keep it simple.
2.3 Essential Substitution of Variable
2.3.1 Preliminaries
The notion of essential substitution arises from the more general issue of capture-
avoiding substitutions in formal languages with variable binding. This question
is already dealt with in the computer science literature of which by and large,
we are sadly unaware. It is therefore very dicult for us to give due credit to the
work that has already been done. However, we would like to mention the papers
of Murdoch J. Gabbay [22] and Murdoch J. Gabbay and Aad Mathijssen [23],
whose existence came to our attention, and which deal with capture-avoiding
substitutions. Our own approach we think is original, but of course we cannot be
sure. The whole question of capture-avoiding substitutions arises from the fact
that most substitutions : P(V ) P(W) associated with a map : V W
are not capture-avoiding. We introduced the notion of valid substitutions in
denition (83) so as to distinguish those cases when () is capture-avoiding
and those when it is not. This was clearly a step forward but we were still
very short of dening a total map : P(V ) P(W) which would somehow
be associated to : V W, while avoiding capture for every P(V ). Our
solution to the problem relies on the introduction of the minimal transform
mapping / : P(V ) P(

V ) where

V = V N is the direct sum of V and N.
In considering the minimal extension

V of the set V , we are eectively adding
a new set of variables specically for the purpose of variable binding, an idea
which bears some resemblance with the introduction of atoms in the nominal set
approach of Murdoch J. Gabbay. The minimal transform mapping replaces all
bound variables from the set V to the set N. The benet of doing so is immediate
as given a map : V W, the obvious extension :

V

W applied to the
minimal transform /() is now capture-avoiding for all . In other words, the
formula /() is always logically meaningful. However, this formula belongs
140
to the space P(

W) and we are still short of dening an essential substitution
: P(V ) P(W). The breakthrough consists in noting that, provided the set
W is innite or larger than the set V , it is always possible to express the formula
/() as the minimal transform /() for some formula P(W). Using
the axiom of choice, an essential substitution : P(V ) P(W) associated with
the map : V W is simply dened in terms of / = /, and is
uniquely characterized modulo -equivalence, i.e. the substitution congruence.
Of course, anyone interested in computer science and computability in general
will not look favorably upon the use of the axiom of choice. We are dealing with
arbitrary sets V and W of arbitrary cardinality. In practice, a computer scientist
will have sets of variables which are well-ordered, and it is not dicult to nd
some P(W) such that /() = /() in a way which is deterministic.
2.3.2 Minimal Transform of Formula
There will come a point when we will want to invoke x
1

1
[a] as an
instance of the specialization axioms to argue that
1
[a] must be true if we
know that x
1
is itself true. In doing so, the formula
1
[a] should be some
form of evaluation of the formula
1
, where the free variable x of
1
has been
replaced by a. For example, when
1
= y(x y) with x ,= y, it would be
natural to dene
1
[a] = y(a y), provided a ,= y. In the case when a = y
such denition would not work. So we have a problem. It is important for us to
argue that x
1

1
[a] is a legitimate axiom even in the case when a = y. If
we fail to do so, our deductive system will be too weak and we will most likely
have a very hard time attempting to prove G odels completeness theorem.
There are no doubt several ways to solve our problem. One of them is to
abandon the aim of dening
1
[a] as a formula and dene it instead as an equiv-
alence class. So
1
[y] would not be dened as u(y u) for an arbitrary u ,= y
but rather as the full equivalence class of u(y u) (modulo the substitution
congruence for example). It is always an ugly thing to do to dene a mapping
a
1
[a] in terms of a randomly chosen u ,= y. By viewing
1
[a] as an equiv-
alence class we eliminate this dependency in u, which is aesthetically far more
pleasing. On the negative side, if we are later to attempt dening a deductive
congruence by setting whenever both and are provable
from our deductive system, having
1
[a] dened as an equivalence class would
mean our deductive congruence fails to be decoupled from the substitution con-
gruence. It would be very nice to dene a deductive congruence which has no
dependency to a substitution congruence.
There is possibly another approach to resolve our problem which is the one
pursued in this section. The formula
1
= y(x y) is a problem to us simply
because it has y as a variable. The way we have dened the algebra P(V ) has a
major drawback: we are using the same set of variables V to represent variables
which are free and those which are not. Things would be so much simpler if we
had (x ) or 0 (x 0) instead of y(x y). Life would be a lot easier if
we had an algebra where the free variables and the bound variables were taken
from dierent sets, more specically sets with empty intersection.
141
Denition 136 Let V be a set. We call minimal extension of V the set

V
dened as the disjoint union of V and N, that is:

V = 0 V 1 N
So now we have a set of variables

V with fundamentally two types of el-
ements. We can use the elements of the form (0, x) with x V to represent
free variables of a formula, and the elements of the form (1, n) with n N to
represent variables which are not free. Furthermore, since we have two obvi-
ous embeddings i : V

V and j : N

V there is no point writing (0, x) or
(1, n) and we can simply represent elements of

V as x and n, having in mind
the inclusions V

V and N

V with V N = . So we can now represent
elements of P(

V ) as

= 0 (x 0) without ambiguity. Of course, the formula

= x(0 x) is also an element of P(

V ) which is contrary to the spirit of


why we have dened

V in the rst place: we want free variables of any formula
to be chosen from V and bound variables to be chosen from N. So not every
element of P(

V ) will be interesting to us. We will need to restrict our atten-


tion to a meaningful subset of P(

V ). We shall do so by dening a mapping


/ : P(V ) P(

V ) which will make sure the formula /() P(

V ) abides to
the right variable conventions, for all P(V ). The range of this mapping
/(P(V )) P(

V ) will hopefully be an interesting algebra to look at.


So how should our mapping / : P(V ) P(

V ) look like? Given x, y V


we would probably want to have /(x y) = x y P(

V ). We would also
request that /() = and /(
1

2
) = /(
1
) /(
2
). The real
question is to determine how /(x
1
) should be dened, given
1
P(V ) and
x V . The whole idea of the mapping / : P(V ) P(

V ) is to replace bound
variables in V of a formula P(V ) with nice looking variables 0, 1, 2, 3 . . . in
the formula /(). So /(x
1
) should be dened as:
/(x
1
) = n/(
1
)[n/x]
where n N

V . With such denition, we start from the formula /(
1
) which
has been adequately transformed until now, and then replace the free variable
x (if applicable) with an integer variable n so as to obtain /(
1
)[n/x], and we
nally substitute the quantication x with the corresponding n. This is all
looking good. The only thing left to determine is how to choose the integer n.
We know from our study of valid substitutions dened in page 93 that we cannot
just pick any n. It would not make sense to consider /(
1
)[n/x] unless the
substitution [n/x] is valid for /(
1
). For example, suppose = xy(x y),
i.e.
1
= y(x y). It is acceptable to dene /(
1
) = 0 (x 0) and choose
n = 0 or indeed /(
1
) = 7 (x 7) is also ne, since [n/y] is always valid for
(x y) regardless of the particular choice of n. However, having decided upon
/(
1
) = 0 (x 0), we cannot dene /() = 0 0 (0 0) as this would
make no sense. The substitution [0/x] is not valid for 0 (x 0). We have to
choose a dierent integer and set /() = 1 0 (1 0). The substitution [n/x]
should be valid for the formula /(
1
). In fact, the obvious choice of integer
is to pick n as the smallest integer for which [n/x] is valid for /(
1
). Note
142
that such an integer always exists: we know that /(
1
) has a nite number
of variables, so there exists n N which is not a variable of /(
1
). Having
chosen such an n, we see from proposition (90) that [n/x] is valid for /(
1
).
Denition 137 Let V be a set with minimal extension

V . We call minimal
transform mapping on P(V ) the map /: P(V ) P(

V ) dened by:
P(V ) , /() =

(x y) if = (x y)
if =
/(
1
) /(
2
) if =
1

2
n/(
1
)[n/x] if = x
1
(2.26)
where n = mink N : [k/x] valid for /(
1
).
Given P(V ) we call /() the minimal transform of .
Proposition 138 The structural recursion of denition (137) is legitimate.
Proof
We need to prove that there exists a unique map / : P(V ) P(

V ) which
satises equation (2.26). We shall do so using theorem (4) or page 45. So take
X = P(V ), X
0
= P
0
(V ) and A = P(

V ). Dene g
0
: X
0
A by setting
g
0
(x y) = (x y). Dene h() : A
0
A by setting h()(0) = and
h() : A
2
A by setting h()(
1
,
2
) =
1

2
. Finally, given x V ,
dene h(x) : A
1
A by setting h(x)(
1
) = n
1
[n/x] where:
n = n(x)(
1
) = mink N : [k/x] valid for
1

Then applying theorem (4), there exists a unique map / : X A satisfying


the following conditions: rst we have /(x y) = g
0
(x y) = (x y) which
is the rst line of equation (2.26). Next we have /() = h()(0) = which
is the second line. Next we have:
/(
1

2
) = h()(/(
1
), /(
2
)) = /(
1
) /(
2
)
which is the third line. Finally, given x V we have:
/(x
1
) = h(x)(/(
1
)) = n/(
1
)[n/x]
where n = mink N : [k/x] valid for /(
1
) and this is the fourth line.
The variables of the minimal transform /() are what we expect:
Proposition 139 Let V be a set and P(V ). Then we have:
Fr (/()) = Var(/()) V = Fr () (2.27)
where /() P(

V ) is the minimal transform of P(V ).


143
Proof
We shall prove equation (2.27) with a structural induction argument, using
theorem (3) of page 32. First we assume that = (x y) with x, y V . We
need to show the equation is true for . However, we have /() = (x y)
and consequently Fr (/()) = x, y = Var(/()). So the equation is clearly
true. Next we assume that = . Then /() = and Fr (/()) = =
Var(/()). So the equation is also clearly true. Next we assume that =

1

2
where the equation is true for
1
,
2
P(V ). We need to show the
equation is also true for . On the one hand we have:
Var(/()) V = Var(/(
1

2
)) V
= Var(/(
1
) /(
2
)) V
= [ Var(/(
1
)) Var(/(
2
)) ] V
= [Var(/(
1
)) V ] [Var(/(
2
)) V ]
= Fr (
1
) Fr (
2
)
= Fr (
1

2
)
= Fr ()
and on the other hand:
Fr (/()) = Fr (/(
1

2
))
= Fr (/(
1
) /(
2
))
= Fr (/(
1
)) Fr (/(
2
))
= Fr (
1
) Fr (
2
)
= Fr (
1

2
)
= Fr ()
So the equation is indeed true for =
1

2
. Next we assume that = x
1
where x V and the equation is true for
1
P(V ). We need to show the
equation is also true for . Since /() = n/(
1
)[n/x] for some n N:
Var(/()) V = Var( n/(
1
)[n/x] ) V
= ( n Var( /(
1
)[n/x] ) ) V
V N = in

V = Var( /(
1
)[n/x] ) V
= Var( [n/x]( /(
1
) ) ) V
prop. (63) = [n/x]( Var( /(
1
) ) ) V
[n/x](x) = n , V = [n/x]( Var( /(
1
) ) x ) V
[n/x](u) = u if u ,= x = ( Var( /(
1
) ) x ) V
= ( Var(/(
1
)) V ) x
= Fr (
1
) x
= Fr (x
1
)
= Fr ()
144
Furthermore, since [n/x] is valid for /(
1
), we have:
Fr (/()) = Fr ( n/(
1
)[n/x] )
= Fr ( /(
1
)[n/x] ) n
= Fr ( [n/x]( /(
1
) ) ) n
prop. (85), [n/x] valid for /(
1
) = [n/x]( Fr ( /(
1
) ) ) n
[n/x](x) = n = [n/x]( Fr ( /(
1
) ) x ) n
[n/x](u) = u if u ,= x = Fr ( /(
1
) ) x n
= Fr (
1
) x n
Fr (
1
) V , n N = Fr (
1
) x
= Fr (x
1
)
= Fr ()
So we see that equation (2.27) is also true for = x
1
.
Proposition 140 Let V be a set and P(V ). Then we have:
Bnd (/()) = Var(/()) N (2.28)
where /() P(

V ) is the minimal transform of P(V ).


Proof
We shall prove equation (2.28) with a structural induction argument, using
theorem (3) of page 32. First we assume that = (x y) with x, y V . We
need to show the equation is true for . However, we have /() = (x y) and
consequently Bnd(/()) = = Var(/()) N. So the equation is true. Next
we assume that = . Then /() = and Bnd(/()) = = Var(/()).
So the equation is also true. Next we assume that =
1

2
where the
equation is true for
1
,
2
P(V ). We need to show it is also true for :
Bnd(/()) = Bnd (/(
1

2
))
= Bnd (/(
1
) /(
2
))
= Bnd (/(
1
)) Bnd(/(
2
))
= (Var(/(
1
)) N) (Var(/(
2
)) N)
= ( Var(/(
1
)) Var(/(
2
)) ) N
= Var(/(
1
) /(
2
)) N
= Var(/(
1

2
)) N
= Var(/()) N
So the equation is indeed true for =
1

2
. Next we assume that = x
1
where x V and the equation is true for
1
P(V ). We need to show the
145
equation is also true for . Since /() = n/(
1
)[n/x] for some n N:
Bnd (/()) = Bnd ( n/(
1
)[n/x] )
= n Bnd(/(
1
)[n/x])
= n Bnd( [n/x](/(
1
)) )
prop. (82) = n [n/x]( Bnd(/(
1
)) )
[n/x](x) = n = n [n/x]( Bnd(/(
1
)) x )
[n/x](u) = u if u ,= x = n Bnd(/(
1
)) x
= n ( Var(/(
1
)) N) x
= ( n Var(/(
1
)) x ) N
[n/x](u) = u if u ,= x = ( n [n/x]( Var(/(
1
)) x ) ) N
[n/x](x) = n = ( n [n/x]( Var(/(
1
)) ) ) N
prop. (63) = ( n Var( [n/x](/(
1
)) ) ) N
= ( n Var( /(
1
)[n/x] ) ) N
= Var(n/(
1
)[n/x]) N
= Var(/()) N
So we see that equation (2.28) is also true for = x
1
.
Given P(V ) the minimal transform /() is an element of P(

V ). So it
is dicult to argue that both formulas and /() are substitution equivalent
as they do not sit on the same space. However, if we consider the inclusion map
i : V

V , it is possible to regard as an element of P(

V ) and ask whether


both formulas are equivalent. Indeed, we have:
Proposition 141 Let V be a set and i : V

V be the inclusion map. We
denote the substitution congruence on P(

V ). Then for all P(V ) we have:


/() i()
Proof
We shall prove the equivalence /() i() by a structural induction argument,
using theorem (3) of page 32. First we assume that = (x y) for some
x, y V . Then we have the equality /() = (x y) = i() and the equivalence
is clear. Next we assume that = . Then we have /() = = i() and the
equivalence is also clear. So we assume that =
1

2
where
1
,
2
P(V )
satisfy the equivalence. Then we have:
/() = /(
1

2
)
= /(
1
) /(
2
)
i(
1
) i(
2
)
= i(
1

2
)
= i()
146
Finally we assume that = x
1
where x V and
1
P(V ) satises the
equivalence. In this case we have:
/() = /(x
1
)
n = mink : [k/x] valid for /(
1
) = n/(
1
)[n/x]
= [n/x](x/(
1
))
A: to be proved x/(
1
)
xi(
1
)
= i(x) i(
1
)
= i(x
1
)
= i()
So it remains to show that [n/x](x/(
1
)) x/(
1
). Hence, from propo-
sition (116) it is sucient to prove that [n/x] is an admissible substitution for
x/(
1
). We already know that [n/x] is valid for /(
1
). Using proposi-
tion (88), given u Fr (x/(
1
)), in order to prove that [n/x] is valid for
x/(
1
) we need to show that [n/x](u) ,= [n/x](x). So it is sucient to show
that [n/x](u) ,= n. Since u Fr (x/(
1
)), in particular we have u ,= x.
Hence, we have to show that u ,= n. Using proposition (139) we have:
u Fr (x/(
1
)) Fr (/(
1
)) V
and consequently from V N = we conclude that u ,= n. In order to show
that [n/x] is an admissible substitution for x/(
1
) it remains to prove that
[n/x](u) = u for all u Fr (x/(
1
)), which follows from u ,= x.
2.3.3 Minimal Transform and Valid Substitution
In the previous section, we hinted at the diculty of dening the evaluation
[a] when = y(x y) and a = y. The issue is that the substitution [y/x] is
not valid for the formula (when x ,= y), making it impossible to write:
a V , [a] = [a/x]
We dened the minimal transform mapping / : P(V ) P(

V ) with this
problem in mind. Now from denition (137), the minimal transform of is
/() = 0 (x 0) and it is clear that [a/x] is now valid for /() for all
a V . So it would be sensible to dene:
a V , [a] = /()[a/x]
Of course such a denition would lead to [a] = 0 (a 0) which is an element
of P(

V ) and not an element of P(V ). So it may not be what we want, but we


are certainly moving forward. More generally suppose P(V ) is an arbitrary
formula and : V W is an arbitrary map. We would like to dene the formula
147
() in P(W) but in a way which is sensible, rather than following blindly the
details of denition (53). The problem is that the substitution may not be
valid for the formula . This doesnt mean we shouldnt be interested in (). In
many cases we may only care about what happens to the free variables. Bound
variables are not interesting. We do not mind having them moved around,
provided the basic structure of the formula remains. So when is valid for ,
this is what happens when using () as per denition (53). However when
is not valid for , we need to nd another way to move the free variables of
without making a mess of it. So consider the minimal transform /(). There is
no longer any conicts between the free and he bound variables. If : V W
is an arbitrary map, we can move the free variables of /() according to the
substitution without creating nonsense. To make this idea precise, we dene:
Denition 142 Let V and W be sets. Let : V W be a map. We call
minimal extension of the map :

V

W dened by:
u

V , (u) =
_
(u) if u V
u if u N
where

V and

W are the minimal extensions of V and W respectively.
So the map :

V

W is the map moving the free variables of /() according
to , without touching the bound variables. As expected we have:
Proposition 143 Let : V W be a map. Then for all P(V ) the
minimal extension :

V

W is valid for the minimal transform /().
Proof
We need to check the three properties of proposition (93) are met in rela-
tion to :

V

W and /() with V
0
= N. First we show property (i) :
we need to show that Bnd(/()) N which follows from proposition (140).
Next we show property (ii) : we need to show that
|N
is injective which
is clear from denition (142). We nally show property (iii) : we need to
show that (N) (Var(/()) N) = . This follows from (N) N and
(Var(/()) N) (V ) = (V ) W, while W N = .
So here we are. Given P(V ) and an arbitrary map : V W we cannot
safely consider () unless is valid for . However, there is no issues in looking
at /() instead, since is always valid for /(). Of course /() is an
element of P(

W) and not P(W). It is however a lot better to have a meaningful
element of P(

W), rather than a nonsensical formula () of P(W). But what if
was a valid substitution for ? Then () would be meaningful. There would
not be much point in considering /() as a general scheme for substituting
free variables, unless it coincided with () in the particular case when is valid
for . Since () P(W) while /() P(

W), we cannot expect these
formulas to match. However, after we take the minimal transform / () we
should expect it to be equal to /(). This is indeed the case, and is the
object theorem (13) below. For now, we shall establish a couple of lemmas:
148
Lemma 144 Let V, W be sets and : V W be a map. Let = x
1
where

1
P(V ) and x V such that (x) , (Fr ()). Then for all n N we have:
[n/x] /(
1
) = [n/(x)] /(
1
)
where :

V

W is the minimal extension of : V W.
Proof
Using proposition (64), we only need to show that the mappings [n/x] and
[n/(x)] coincide on Var(/(
1
)). So let u Var(/(
1
))

V . We want:
[n/x](u) = [n/(x)] (u) (2.29)
Since

V is the disjoint union of V and N, we shall distinguish two cases: rst
we assume that u N. From V N = we obtain u ,= x and conse-
quently [n/x](u) = (u) = u. From W N = we obtain u ,= (x)
and consequently [n/(x)] (u) = [n/(x)](u) = u. So equation (2.29) is
indeed satised. Next we assume that u V . We shall distinguish two fur-
ther cases: rst we assume that u = x. Then [n/x](u) = (n) = n and
[n/(x)] (u) = [n/(x)]((x)) = n and we see that equation (2.29) is again
satised. Next we assume that u ,= x. Then [n/x](u) = (u) = (u),
and furthermore [n/(x)] (u) = [n/(x)]((u)). In order to establish equa-
tion (2.29), it is therefore sucient to prove that (u) ,= (x). However, since
u Var(/(
1
)) and u V , it follows from proposition (139) that u Fr (
1
).
Having assumed that u ,= x we in fact have u Fr (x
1
) = Fr (). Having
assumed that (x) , (Fr ()) it follows that (u) ,= (x) as requested.
Lemma 145 Let V, W be sets and : V W be a map. Let = x
1
where

1
P(V ) and x V such that (x) , (Fr ()). Then for all k N we have:
[k/x] valid for /(
1
) [k/(x)] valid for /(
1
)
where :

V

W is the minimal extension of : V W.
Proof
First we show : So we assume that [k/(x)] is valid for /(
1
). We need
to show that [k/x] is valid for /(
1
). However, we know from proposition (143)
that is valid for /(
1
). Hence, from the validity of [k/(x)] for /(
1
)
and proposition (91) we see that [k/(x)] is valid for /(
1
). Furthermore,
having assumed (x) , (Fr ()), we can use lemma (144) to obtain:
[k/x] /(
1
) = [k/(x)] /(
1
) (2.30)
It follows from proposition (92) that [k/x] is valid for /(
1
), and in par-
ticular, using proposition (91) once more, we conclude that [k/x] is valid for
/(
1
) as requested. We now show : so we assume that [k/x] is valid for
/(
1
). We need to show that [k/(x)] is valid for /(
1
). However, from
149
proposition (91) it is sucient to prove that [k/(x)] is valid for /(
1
).
Using equation (2.30) and proposition (92) once more, we simply need to show
that [k/x] is valid for /(
1
). Having assumed that [k/x] is valid for /(
1
),
from proposition (91) it is sucient to show that is valid for [k/x] /(
1
).
We shall do so with an application of proposition (93) to V
0
= N. First we need
to check that Bnd ( [k/x] /(
1
) ) N. However, from proposition (82) we have
Bnd( [k/x] /(
1
) ) = [k/x]( Bnd (/(
1
)) ) and from proposition (140) we have
Bnd(/(
1
)) N. So the desired inclusion follows. Next we need to show that

|N
is injective which is clear from denition (142). Finally, we need to check
that (N) ( Var([k/x] /(
1
)) N) = . This follows from (N) N and
( Var([k/x] /(
1
)) N) (V ) = (V ) W, while W N = .
Theorem 13 Let V and W be sets. Let : V W be a map. Let P(V ).
If is valid for , then it commutes with minimal transforms, specically:
/() = / () (2.31)
where :

V

W is the minimal extension of : V W.
Proof
Before we start, it should be noted that in equation (2.31) refers to the
substitution mapping : P(V ) P(W) while the / on the right-hand-side
refers to the minimal transform mapping / : P(W) P(

W). The other /
which appears on the left-hand-side refers to the minimal transform mapping
/ : P(V ) P(

V ), and is the substitution mapping : P(

V ) P(

W). So
everything makes sense. For all P(V ), we need to show the property:
( valid for ) /() = / ()
We shall do so by structural induction using theorem (3) of page 32. First we
assume that = (x y) where x, y V . Then any is valid for and we need
to show equation (2.31). This goes as follows:
/() = /(x y)
= (x y)
= (x) (y)
= (x) (y)
= /((x) (y))
= /((x y))
= / (x y)
= / ()
Next we assume that = . Then we have /() = = / () and the
property is also true. Next we assume that =
1

2
where the property
is true for
1
,
2
P(V ). We need to show the property is also true for .
150
So we assume that is valid for . We need to show equation (2.31) holds
for . However, from proposition (87), the substitution is valid for both
1
and
2
. Having assumed the property is true for
1
and
2
, it follows that
equation (2.31) holds for
1
and
2
. Hence, we have:
/() = /(
1

2
)
= (/(
1
) /(
2
))
= (/(
1
)) (/(
2
))
= /((
1
)) /((
2
))
= /((
1
) (
2
))
= /((
1

2
))
= / ()
Finally we assume that = x
1
where x V and the property is true for

1
P(V ). We need to show the property is also true for . So we assume
that is valid for . We need to show equation (2.31) holds for . However,
from proposition (88), the substitution is also valid for
1
. Having assumed
the property is true for
1
, it follows that equation (2.31) holds for
1
. Hence:
/() = /(x
1
)
n = mink : [k/x] valid for /(
1
) = ( n/(
1
)[n/x] )
= (n) ( /(
1
)[n/x] )
= n [n/x] /(
1
)
A: to be proved = n[n/(x)] /(
1
)
= n[n/(x)] / (
1
)
= n/[(
1
)][n/(x)]
B: to be proved = m/[(
1
)][m/(x)]
m = mink : [k/(x)] valid for /[(
1
)] = /((x)(
1
))
= /((x
1
))
= / ()
So we have two more points A and B to justify. First we deal with point A. It
is sucient for us to prove the equality:
[n/x] /(
1
) = [n/(x)] /(
1
)
Using lemma (144) it is sucient to show that (x) , (Fr ()). So suppose
to the contrary that (x) (Fr ()). Then there exists u Fr () such that
(u) = (x). This contradicts proposition (88) and the fact that is valid for
, which completes the proof of point A. So we now turn to point B. We need
to prove that n = m, for which it is sucient to show the equivalence:
[k/x] valid for /(
1
) [k/(x)] valid for /[(
1
)]
151
This follows from lemma (145) and the fact that (x) , (Fr ()), together with
the induction hypothesis /(
1
) = / (
1
).
2.3.4 Minimal Transform and Substitution Congruence
Let = xy(x y) and = yx(y x), where x ,= y. We know that
where denotes the substitution congruence on P(V ). The quickest way to see
this is to dene
1
= y(x y) and argue using denition (115) of page 126
that = x
1
while = y
1
[y : x] with y , Fr (
1
). If we now look at minimal
transforms, since /(
1
) = 0 (x 0) and /(
1
[y : x]) = 0 (y 0) we obtain:
/() = 1 0 (1 0) = /()
So here is a case of two equivalent formulas with identical minimal transforms.
Of course, this is hardly surprising. We designed the minimal transform so as
to replace the bound variables with a unique and sensible choice. In fact, we
should expect the equality /() = /() to hold whenever . Conversely,
suppose P(V ) is a formula with minimal transform:
/() = 1 0 (1 0)
It is hard to imagine a formula which is not equivalent to and . It seems
pretty obvious that would need to be of the form = uv(u v) and it
would not be possible to have u = v. Hence:
= uv(u v) uy(u y) xy(x y) =
The main theorem of this section will show that is indeed equivalent to
/() = /(). As a preamble, we shall prove:
Proposition 146 Let V be a set and be the relation on P(V ) dened by:
/() = /()
for all , P(V ). Then is a congruence on P(V ).
Proof
The relation is clearly reexive, symmetric and transitive. So it is an equiv-
alence relation on P(V ) and we simply need to prove that it is a congruent
relation, as per denition (31) of page 51. We already know that . So let
=
1

2
and =
1

2
where
1

1
and
2

2
. We need to show
that . This goes as follows:
/() = /(
1

2
)
= /(
1
) /(
2
)
(
1

1
) (
2

2
) = /(
1
) /(
2
)
= /(
1

2
)
= /()
152
Next we assume that = x
1
and = x
1
where x V and
1

1
. We
need to show that . This goes as follows:
/() = /(x
1
)
n = mink : [k/x] valid for /(
1
) = n/(
1
)[n/x]

1

1
= n/(
1
)[n/x]

1

1
= m/(
1
)[m/x]
m = mink : [k/x] valid for /(
1
) = /(x
1
)
= /()

Theorem 14 Let be the substitution congruence on P(V ) where V is a set.


Then for all , P(V ) we have the equivalence:
/() = /()
where /() and /() are the minimal transforms as per denition (137).
Proof
First we show : consider the relation on P(V ) dened by if and only
if /() = /(). We need to show the inclusion . However, we know
from proposition (146) that is a congruence on P(V ) and furthermore from
proposition (117) that is generated by the set:
R
1
= ( , () ) : P(V ) , : V V admissible for
It is therefore sucient to prove the inclusion R
1
. So let : V V be
an admissible substitution for as per denition (114) of page 125. We need
to show that (), i.e. that /() = / (). Let :

V

V be the
minimal extension of as per denition (142) of page 148. Since is admissible
for , in particular is valid for . So we can apply theorem (13) of page 150
from which we obtain / () = /(). It is therefore sucient to prove
that /() = /(). Let i :

V

V be the identity mapping. We need
to show that i /() = /() and from proposition (64) it is sucient to
prove that i and coincide on Var(/()). So let u Var(/()). We need
to show that (u) = u. Since

V is the disjoint union of V and N, we shall
distinguish two cases: rst we assume that u N. Then (u) = u is immediate
from denition (142). Next we assume that u V . Then from u Var(/())
and proposition (139) it follows that u Fr (). Having assumed that is
admissible for we conclude that (u) = (u) = u as requested. We now
prove : we need to show that every P(V ) satises the property:
P(V ) , [ /() = /() ]
We shall do so by a structural induction argument using theorem (3) of page 32.
First we assume that = (x y) for some x, v V . Let P(V ) such that
153
/() = (x y) = /(). We need to show that . So it is sucient
to prove that = . From theorem (2) of page 21 the formula can be of
one and only one of four types: rst it can be of the form = (u v) for
some u, v V . Next, it can be of the form = , and possibly of the form
=
1

2
with
1
,
2
P(V ). Finally, it can be of the form = u
1
for
some u V and
1
P(V ). Looking at denition (137) we see that the minimal
transform /() has the same basic structure as . Thus, from the equality
/() = (x y) it follows that can only be of the form = (u v). So we
obtain /() = (u v) = (x y) and consequently u = x and v = y which
implies that = (x y) = as requested. We now assume that = . Let
P(V ) such that /() = = /(). We need to show that . Once
again it is sucient to prove that = . From/() = we see that can only
be of the form = . So = as requested. We now assume that =
1

2
where
1
,
2
P(V ) satisfy our property. We need to show the same if true
of . So let P(V ) such that /() = /(
1
) /(
2
) = /(). We
need to show that . From /() = /(
1
) /(
2
) we see that can
only be of the form =
1

2
. It follows that /() = /(
1
) /(
2
)
and consequently we obtain /(
1
) /(
2
) = /(
1
) /(
2
). Thus, using
theorem (2) of page 21 we have /(
1
) = /(
1
) and /(
2
) = /(
2
). Having
assumed
1
and
2
satisfy our induction property, it follows that
1

1
and

2

2
and consequently
1

2

1

2
. So we have proved that
as requested. We now assume that = x
1
where x V and
1
P(V )
satisfy our property. We need to show the same is true of . So let P(V )
such that /() = /(). We need to show that . We have:
/() = n/(
1
)[n/x] (2.32)
where n = mink N : [k/x] valid for /(
1
). Hence from the equality
/() = /() we see that can only be of the form = y
1
for some
y V and
1
P(V ). We shall distinguish two cases: rst we assume that
y = x. Then we need to show that x
1
x
1
and it is therefore sucient to
prove that
1

1
. Having assumed
1
satisfy our property, we simply need
to show that /(
1
) = /(
1
). However we have the equality:
/() = m/(
1
)[m/x] (2.33)
where m = mink N : [k/x] valid for /(
1
). Comparing (2.32) and (2.33),
from /() = /() and theorem (2) of page 21 we obtain n = m and thus:
/(
1
)[n/x] = /(
1
)[n/x] (2.34)
Consider the substitution :

V

V dened by = [n/x]. We shall conclude
that /(
1
) = /(
1
) by inverting equation (2.34) using the local inversion
theorem (10) of page 107 on the substitution . So consider the sets V
0
= V
and V
1
= N. It is clear that both
|V0
and
|V1
are injective maps. Dene:
= P(

V ) : (Fr () V
0
) (Bnd () V
1
) ( valid for )
154
Then using theorem (10) there exits : P(

V ) P(

V ) such that () = for


all . Hence from equation (2.34) it is sucient to prove that /(
1
)
and /(
1
) . First we show that /(
1
) . We already know that
= [n/x] is a valid substitution for /(
1
). The fact that Fr (/(
1
)) V
0
follows from proposition (139). The fact that Bnd(/(
1
)) V
1
follows from
proposition (140). So we have proved that /(
1
) as requested. The proof
of /(
1
) is identical, which completes our proof of in the case
when = y
1
and y = x. We now assume that y ,= x. Consider the formula

= x
1
[x: y] where [x: y] is the permutation mapping as per denition (66)
of page 81. Suppose we have proved the equivalence

. Then in order
to prove it is sucient by transitivity to show that

. However,
having already proved the implication of this theorem, we know that

implies /() = /(

). Hence, if we have

, it is sucient to prove

knowing that /() = /(

) and

= x

1
where

1
=
1
[x: y]. So
we are back to the case when y = x, a case we have already dealt with. It fol-
lows that we can complete our induction argument simply by showing

.
Since = y
1
and

= x
1
[x : y] with x ,= y, from denition (115) of
page 126 we simply need to check that x , Fr (
1
). So suppose to the contrary
that x Fr (
1
). Since x ,= y we obtain x Fr (). From proposition (139)
it follows that x Fr (/()). Having assumed that /() = /() we ob-
tain x Fr (/()) and nally using proposition (139) once more, we obtain
x Fr (). This is our desired contradiction since = x
1
.
We conclude this section with an immediate consequence of theorem (13) of
page 150 and theorem (14). From proposition (120) we know that () ()
follows from whenever : V W is injective. We can now loosen the
hypothesis by requiring that be simply valid for and . Note that this
result is not true if denotes the strong substitution congruence rather than
the substitution congruence. Take V = x, y, z and W = x, y with x, y, z
distinct and = xy(x y) while = yx(y x) with (x) = x and
(y) = y. Then is valid for and but () () fails to be true.
Theorem 15 Let V and W be sets and : V W be a map. Let be the
substitution congruence on P(V ) and P(W). Then if is valid for and :
() ()
for all , P(V ), where : P(V ) P(W) is also the substitution mapping.
Proof
We assume that and : V W is valid for and . We need
to show that () (). Using theorem (14) it is sucient to prove that
/ () = / (). Since is valid for and , using theorem (13) of
page 150 it is therefore sucient to prove that /() = /(), which
follows immediately from /() = /(), itself a consequence of .
155
2.3.5 Isomorphic Representation Modulo Substitution
One of our goals in dening the minimal transform mapping /: P(V ) P(

V )
in page 143 was to design an algebra Q(V ) where the free and bound variables
would be chosen from dierent sets. We now pursue this idea further. Given a
set V , we dened the minimal extension

V by adding to V a disjoint copy of
N, giving us plenty of bound variables to choose from. Unfortunately in spite
of this, the free algebra P(

V ) is not very interesting to us, as it contains too


many formulas, including those such as = x(0 x) which contravene our
desired convention of keeping free variables in V and bound variables in N. So
we need to restrict our attention to a smaller subset of P(

V ) and the obvious


choice is to pick Q(V ) = /(P(V )) i.e. to choose Q(V ) as the range of the
minimal transform mapping. From proposition (139) and proposition (140) this
guarantees all formulas have free variables in V and bound variables in N. So
we have the right set of formulas. However Q(V ) is not yet an algebra, as it is
not a universal sub-algebra of P(

V ). Indeed, given x V and Q(V ), the


formula x is not an element of Q(V ). Somehow we would like Q(V ) to be
an algebra. But of which type? The free universal algebras P(V ) and P(

V )
do not have the same type. The algebra P(

V ) has many more quantication


operators n for n N which P(V ) does not have. So what do we want for
Q(V )? All elements of Q(V ) have free variables in V . It does not add any
value to quantify a formula with respect to a variable which is not free. So it is
clear the unary operators n for n N should not be part of Q(V ), and Q(V )
should be of the same type of universal algebra as P(V ). So lets endow Q(V )
with the appropriate structure of universal algebra: First we need an operator
: Q(V )
0
Q(V ). Since = /() Q(V ) we should keep the structure
induced by P(

V ) and dene (0) = Q(V ). Next we need a binary operator


: Q(V )
2
Q(V ). Since for all
1
,
2
P(V ), we have:
/(
1
) /(
2
) = /(
1

2
)
we see that Q(V ) is closed under the operator of P(

V ). So we should keep
that too. Finally given x V , we need a unary operator x : Q(V )
1
Q(V ).
In this case the operator x of P(

V ) is not suitable for our purpose, as Q(V ) is


not closed under x. So we need to dene a new operator which we denote x
(with the letter x in bold) to avoid any possible confusion. Let us pick:
Q(V ) , x = n[n/x] (2.35)
where n = mink N : [k/x] valid for and the n on the right-hand-side
of equation (2.35) is the usual quantication operator of P(

V ). We need to
check that this denition does indeed yield an operator x : Q(V )
1
Q(V ),
specically that x Q(V ) for all Q(V ). So let Q(V ). There exists

1
P(V ) such that = /(
1
) and consequently we have:
x = n[n/x] = n/(
1
)[n/x] = /(x
1
) Q(V )
where the last equality follows from denition (137) and:
n = mink N : [k/x] valid for /(
1
)
156
So Q(V ) is now a universal algebra of the same type as P(V ) which is what we
set out to achieve. Note that this algebra is not free in general since:
xy(x y) = 1 0 (1 0) = yx(y x)
whenever x and y are two distinct elements of V . This contradicts the unique-
ness principle of theorem (2) of page 21 which prevails in free universal algebras.
Denition 147 Let V be a set with rst order logic type . We call minimal
transform algebra associated with V , the universal algebra Q(V ) of type :
Q(V ) = /(P(V )) P(

V )
where / : P(V ) P(

V ) is the minimal transform mapping, and the operators


, are induced from P(

V ) while for all x V and Q(V ) we have:


x = n[n/x]
where n = mink N : [k/x] valid for .
Proposition 148 Let V be a set with minimal transform algebra Q(V ). Then,
the minimal transform mapping /: P(V ) Q(V ) is a surjective morphism.
Proof
The minimal transform mapping / : P(V ) Q(V ) is clearly surjective by
denition (147) of Q(V ). So it remains to show it is a morphism. From deni-
tion (137) of page 143 we have /() = and for all
1
,
2
P(V ):
/(
1

2
) = /(
1
) /(
2
)
So it remains to show that /(x
1
) = x/(
1
) for all x V and
1
P(V ).
However, dening n = mink N : [k/x] valid for /(
1
) we have:
/(x
1
) = n/(
1
)[n/x] = x/(
1
)
where the two equalities follow from denitions (137) and (147) respectively.
So from proposition (148) we have a morphism / : P(V ) Q(V ) which is
surjective, while from theorem (14) we have the equivalence:
/() = /()
So the kernel of this morphism is exactly the substitution congruence on P(V ).
Theorem 16 Let V be a set with minimal transform algebra Q(V ), and let
the quotient universal algebra of P(V ) modulo the substitution congruence be
denoted [P(V )] as per theorem (7). Let /

: [P(V )] Q(V ) be dened as:


P(V ) , /

([]) = /()
where / is the minimal transform mapping. Then /

is an isomorphism.
157
Proof
From proposition (148) / : P(V ) Q(V ) is a surjective morphism and from
theorem (14), the kernel ker(/) coincides with the substitution congruence on
P(V ). Hence the fact that /

: [P(V )] Q(V ) is an isomorphism follows


immediately from the rst isomorphism theorem (8) of page 57.
2.3.6 Substitution Rank of Formula
When V and W are sets and : V W is a map, given P(V ) our
motivating force in the last few sections has been to dene a map () even in
the case when is not a valid substitution for . Our strategy so far has been to
step outside of P(W) into the bigger space P(

W), and dene () as /()
where /() is the minimal transform of and :

V

W is the minimal
extension of , as per denitions (137) and (142) respectively. In this section and
the next, we wish to go further than this and dene () as an actual element
of P(W). This shouldnt be too hard. Indeed, when computing the minimal
transform /(), all the bound variables have been ordered and rationalized
in a given way. The eect of on /() is simply to move the free variables
from V to W which should have no impact of the conguration of the bound
variables. It is therefore not unreasonable to think that the formula /()
is still properly congured so to speak, in other words that /() = /()
for some P(W). If this is the case, we can simply dene () = .
This denition would not uniquely characterize the formula (), but it would
certainly dene a unique class modulo the substitution congruence, by virtue of
theorem (14) of page 153. So lets have a look at an example with = y(x y)
and x ,= y. Assuming (x) = u we obtain /() = 0(u 0), and sure
enough this is indeed the minimal transform of v(u v) where v W and
u ,= v. So our strategy is clear: all we need to do is prove that /() is indeed
equal to the minimal transform /() for some P(W). There is however
a glitch: the set W may be too small. Our example was successful because we
assumed the existence of v W such that u ,= v. This need not be the case.
We cannot have W = as otherwise there would be no map : V W when
x, y V ,= , but it is possible that W be limited to the singleton W = u.
When this is the case, there exists no P(W) such that /() = 0(u 0).
So it is clear we cannot hope to achieve /() = /() in general unless
W has suciently many variables to accommodate the formula /(). It
appears the formula 0(u 0) requires at least two variables. This is also the
case for 1 0(0 1) or (u v) with u ,= v, while 0(0 0) requires one
variable only. As for the formula ( ), it requires no variable at all.
So we need to make this idea precise: given the formula /() we have to
dene the minimum required number of variables in W so the formula can be
squeezed into P(W). This motivates the following denition:
Denition 149 Let V be a set and be the substitution congruence on P(V ).
158
For all P(V ) we call substitution rank of the integer rnk () dened by:
rnk () = min [Var()[ : P(V ) ,
where [Var()[ denotes the cardinal of the set Var(), for all P(V ).
Given a set A, it is customary to use the notation [A[ to refer to the cardinal
number of A. We have not formally dened the notion of ordinal and cardinal
numbers at this stage, but for all P(V ) the set Var() is nite and [Var()[
is simply the number of elements it has. So this should be enough for now. Given
a formula P(V ), we dened rnk () as the smallest number of variables used
by a formula , which is equivalent to modulo the substitution congruence.
Our hope is that provided rnk ( /() ) [W[, there should be enough
variables in W for us to nd P(W) such that /() = /(). So let us
see if this works: in the case /() = , the condition rnk ( /() ) [W[
becomes 0 [W[ and it is clear we do not require W to have any element. In the
case when /() = 0(0 0) the condition becomes 1 [W[ and it is clear W
should have at least one variable. If /() = 0(u 0) then the condition
becomes 2 [W[ which is another success. A more interesting example is
/() = 1 0(1 0) (u v) with u ,= v. This formula is equivalent to
= uv(u v) (u v) and consequently we have rnk ( /() ) = 2. So
the condition becomes 2 [W[ which is indeed the correct condition to impose
on W since we do not need more than two variables, as /() = /()
with = uv(u v) (u v). Hence we believe denition (149) refers to
the correct notion and we shall endeavor to prove the existence of P(W)
such that /() = /() whenever the condition rnk ( /() ) [W[ is
satised. For now, we shall establish some of the ranks properties:
Proposition 150 Let V be a set and P(V ). Then, we have:
[Fr ()[ rnk () [Var()[
Proof
The inequality rnk () [Var()[ follows immediately from denition (149). So
it remains to show that [Fr ()[ rnk (). So let P(V ) such that
where is the substitution congruence. We need to show [Fr ()[ [Var()[.
From proposition (118) we have Fr () = Fr (). Hence we need to show that
[Fr ()[ [Var()[ which follows from Fr () Var().
Given P(V ), we know from denition (149) that rnk () = [Var()[ for
some . Given a suitably large subset V
0
V , the following proposition
allows us to choose such a with the additional property Var() V
0
provided
Fr () V
0
. Obviously we cannot hope to have Var() V
0
unless V
0
is
large enough and Fr () V
0
since two substitution equivalent formulas have
the same free variables. Being able to nd a with Var() V
0
is crucially
important for us. If we go back to the case when /() = 0(u 0), our
rst step in proving the existence of such that /() = /() will be to
159
show the existence of
1
= v(u v), which is a
1
with Var(
1
) V
0
where
V
0
= W

W. When V
0
is not large enough, we cannot hope to nd a with
Var() V
0
. However, we can ensure that V
0
Var() which will also be a
useful property when calculating the substitution rank of
1

2
.
Proposition 151 Let V be a set and P(V ) such that Fr () V
0
for some
V
0
V . Let denote the substitution congruence on P(V ). Then there exists
P(V ) such that and [Var()[ = rnk () such that:
(i) rnk () [V
0
[ Var() V
0
(ii) [V
0
[ rnk () V
0
Var()
Proof
Without loss of generality, we may assume that [Var()[ = rnk (). Indeed,
suppose the proposition has been established with the additional assumption
[Var()[ = rnk (). We need to show it is then true in the general case. So given
V
0
V , consider P(V ) such that Fr () V
0
. From denition (149) there
exists
1
P(V ) such that
1
with the equality [Var(
1
)[ = rnk (). From
proposition (118) we obtain Fr () = Fr (
1
) and so Fr (
1
) V
0
. Hence we see
that
1
satises the assumption of the proposition with the additional property
[Var(
1
)[ = rnk (
1
). Having assumed the proposition is true in this case, we
obtain the existence of P(V ) such that
1
, [Var()[ = rnk (
1
) and
which satises (i) and (ii) where is replaced by
1
. However rnk (
1
) = rnk ()
and replacing by
1
in (i) and (ii) has no impact. So satises (i) and (ii).
Hence we have and [Var()[ = rnk () together with (i) and (ii) which
establishes the proposition in the general case. So we now assume without
loss of generality that [Var()[ = rnk () and Fr () V
0
. We need to show the
existence of such that [Var()[ = [Var()[ and which satises (i) and (ii).
Note that if V = then V
0
= and we can take = . So we assume V ,= . We
shall rst consider the case when rnk () [V
0
[ and show the existence of such
that Var() V
0
. Since Fr () V
0
the set V
0
is the disjoint union of Fr ()
and V
0
Fr (), giving us the equality [V
0
[ = [Fr ()[ + [V
0
Fr ()[. Since we
also have [Var()[ = [Fr ()[ +[Var() Fr ()[, we obtain from [Var()[ [V
0
[:
[Fr ()[ +[Var() Fr ()[ [Fr ()[ +[V
0
Fr ()[
Since [Fr ()[ is a nite cardinal it follows that [Var() Fr ()[ [V
0
Fr ()[.
Hence, there is an injection mapping i : Var() Fr () V
0
Fr (). Having
assumed V ,= consider x

V and dene the map : V V as follows:


u V , (u) =

u if u Fr ()
i(u) if u Var() Fr ()
x

if u , Var()
Dene = (). It remains to show that has the desired properties, namely
that , [Var()[ = [Var()[ and Var() V
0
. First we show . Using
proposition (116) it is sucient to prove that is an admissible substitution
160
for . It is clear that (u) = u for all u Fr (). So it remains to show that
is valid for . From proposition (86) it is sucient to prove that
|Var()
is
an injective map. So let u, v Var() such that (u) = (v). We need to show
that u = v. We shall distinguish four cases: rst we assume that u Fr () and
v Fr (). Then the equality (u) = (v) leads to u = v. Next we assume that
u , Fr () and v , Fr (). Then we obtain i(u) = i(v) which also leads to u = v
since i : Var() Fr () V
0
Fr () is an injective map. So we now assume
that u Fr () and v , Fr (). Then from (u) = (v) we obtain u = i(v)
which is in fact impossible since u Fr () and i(v) V
0
Fr (). The last
case u , Fr () and v Fr () is equally impossible which completes our proof
of . So we now prove that [Var()[ = [Var()[. From proposition (63)
we have Var() = Var(()) = (Var()). So we need [(Var())[ = [Var()[
which is clear since
|Var()
: Var() (Var()) is a bijection. So it remains to
show that Var() V
0
, or equivalently that (Var()) V
0
. So let u Var().
We need to show that (u) V
0
. We shall distinguish two cases: rst we
assume that u Fr (). Then (u) = u and the property (u) V
0
follows
from the inclusion Fr () V
0
. Next we assume that u , Fr (). Then we
have (u) = i(u) V
0
Fr () V
0
. So in the case when rnk () [V
0
[
we have been able to prove the existence of satisfying (i). In fact we claim
that also satises (ii). So let us assume that [V
0
[ rnk (). Then we must
have rnk () = [V
0
[ and we need to show that V
0
Var(). However, we have
[Var()[ = rnk () and consequently [Var()[ = [V
0
[ together with Var() V
0
.
Two nite subsets ordered by inclusion and with the same cardinal must be
equal. So Var() = V
0
. We now consider the case when [V
0
[ < rnk (). We
need to show the existence of such that [Var()[ = [Var()[ satisfying
(i) and (ii). In the case when [V
0
[ < rnk (), (i) is vacuously true, so we simply
need to ensure that V
0
Var(). Since [V
0
[ < [Var()[ we obtain:
[V
0
Var()[ = [V
0
[ [V
0
Var()[
< [Var()[ [V
0
Var()[
= [Var() V
0
[
So there is an injective map i : V
0
Var() Var() V
0
. Given x

V , dene:
u V , (u) =

u if u Var() i( V
0
Var() )
i
1
(u) if u i( V
0
Var() )
x

if u , Var()
Let = (). It remains to show that , [Var()[ = [Var()[ and
V
0
Var(). First we show that . Using proposition (116) it is suf-
cient to prove that is an admissible substitution for . So let u Fr (). We
need to show that (u) = u. So it is sucient to prove that u , i( V
0
Var() )
which follows from the fact that u V
0
, itself a consequence of Fr () V
0
.
In order to show that is also valid for , from proposition (86) it is su-
cient to prove that is injective on Var(). So let u, v Var() such that
(u) = (v). We need to prove that u = v. The only case when this may
161
not be clear is when (u) = u and (v) = i
1
(v) or vice versa. So we
assume that u Var() i( V
0
Var() ) and v i( V
0
Var() ). Then
we see that (u) = u Var() while (v) = i
1
(v) V
0
Var(). So
the equality (u) = (v) is in fact impossible, which completes our proof
of . As before, the fact that [Var()[ = [Var()[ follows from the
injectivity of
|Var()
and it remains to prove that V
0
Var(). So let
u V
0
we need to show that u Var() = (Var()) and we shall distin-
guish two cases: rst we assume that u Var(). Since u V
0
, it cannot
be an element of i( V
0
Var() ). It follows that u Var() i( V
0
Var() )
and consequently u = (u) (Var()) = Var(). Next we assume that
u V
0
Var(). Then i(u) is an element of i( V
0
Var() ) and therefore
(i(u)) = i
1
(i(u)) = u. Since i(u) is an element of Var() V
0
we conclude
that u = (i(u)) (Var()) = Var(), which completes our proof.
The substitution rank of a formula is essentially the minimum number of
variables needed to describe a representative of its class modulo substitution.
We should expect injective variable substitutions to have no eect on the rank.
Proposition 152 Let V, W be sets and : V W be a map. Then for all
P(V ), if
|Var()
is an injective map, we have the equality:
rnk (()) = rnk ()
where : P(V ) P(W) denotes the associated substitution mapping.
Proof
Let : V W and P(V ) such that
|Var()
is an injective map. We need
to show that rnk (()) = rnk (). First we shall show rnk (()) rnk ().
Using proposition (151) with V
0
= Var(), since we have Fr () Var() and
rnk () [Var()[, there exists P(V ) such that , [Var()[ = rnk ()
and Var() Var(). Having assumed is injective on Var(), it is therefore
injective on both Var() and Var(). From proposition (86) it follows that is
a valid substitution for both and . Hence from theorem (15) of page 155 we
obtain () () and consequently using proposition (63) :
rnk (()) [Var(())[ = [(Var())[ = [Var()[ = rnk ()
So it remains to show that rnk () rnk (()). If V = then rnk () = 0 and
we are done. So we may assume that V ,= . So let x

V and dene:
u W , (u) =
_

1
(u) if u (Var())
x

if u , (Var())
Then : W V is injective on Var(()) = (Var()) and hence:
rnk ( (()) ) rnk (())
So it suces for us to show that () = . From proposition (64) it is thus
sucient to show that (x) = x for all x Var(), which is clear.
162
If P(V ) we know from proposition (141) that /() is substitution
equivalent to i() where i : V

V is the inclusion map. Hence we have:
Proposition 153 Let V be a set and P(V ). Then the formula and its
minimal transform have equal substitution rank, i.e.:
rnk (/()) = rnk ()
Proof
Let denote the substitution congruence on P(

V ) and i : V

V be the in-
clusion map. From proposition (141) we have /() i() and consequently
rnk (/()) = rnk (i()). So we need to show that rnk (i()) = rnk () which
follows from proposition (152) and the fact that i : V

V is injective.
We have already established that an injective variable substitution does not
change the rank of a formula. If P(V ) and : V V is an admissible
substitution for , we also know from proposition (116) that (). Hence if
: V V is valid for while leaving the free variables unchanged, it will not
change the rank of the formula either. It is not unreasonable to think that
this property will remain true if we only impose that be injective on Fr ().
In fact, this should also apply to any : V W, without assuming W = V .
Proposition 154 Let V, W be sets and : V W be a map. Let P(V ).
We assume that is valid for and
|Fr ()
is an injective map. Then:
rnk (()) = rnk ()
where : P(V ) P(W) denotes the associated substitution mapping.
Proof
We shall proceed in two steps: rst we shall prove the proposition is true in
the case when W = V . We shall then extend the result to arbitrary W. So
let us assume W = V . Let : V V valid for P(V ) such that
|Fr ()
is an injective map. We need to show that rnk (()) = rnk (). If V =
then : V V is the map with empty domain, namely the empty set which is
injective on Var() = and rnk (()) = rnk () follows from proposition (152).
So we assume that V ,= . The idea of the proof is to write () =
1

0
()
where each substitution
0
,
1
is rank preserving. Having assumed injective
on Fr (), we have the equality [(Fr ()[ = [Fr ()[ and consequently:
[Var() Fr ()[ = [Var()[ [Fr ()[
[V [ [Fr ()[
= [V [ [ (Fr ()) [
= [ V (Fr ()) [
So let i : Var() Fr () V (Fr ()) be an injective map. Let x

V and
dene the substitution
0
: V V as follows:
x V ,
0
(x) =

(x) if x Fr ()
i(x) if x Var() Fr ()
x

if x , Var()
163
Let us accept for now that
0
is injective on Var(). Then using proposi-
tion (152) we obtain rnk (
0
()) = rnk (). So consider
1
: V V :
u V ,
1
(u) =

u if u (Fr ())
i
1
(u) if u i( Var() Fr () )
x

otherwise
Note that
1
is well dened since (Fr ()) i( Var() Fr () ) = . So let us
accept for now that
1
is admissible for
0
(). Then using proposition (116)
we obtain
1

0
()
0
() where is the substitution congruence on P(V ).
Hence in particular we have rnk (
1

0
()) = rnk (
0
()) = rnk (). So in order
to show that the proposition is true in the case when W = V , it remains to
prove that
0
is injective on Var(),
1
is admissible for
0
() and furthermore
that
1

0
() = (). First we show that
0
is injective on Var(). So let
x, y Var() such that
0
(x) =
0
(y). We need to show that x = y. We shall
distinguish four cases: rst we assume that x Fr () and y Fr (). Then
the equality
0
(x) =
0
(y) leads to (x) = (y). Having assumed
|Fr ()
is an
injective map, we obtain x = y. Next we assume that x Var() Fr () and
y Var() Fr (). Then the equality
0
(x) =
0
(y) leads to i(x) = i(y) and
consequently x = y. So we now assume that x Fr () and y Var() Fr ().
Then
0
(x) = (x) (Fr ()) and
0
(y) = i(y) V (Fr ()). So the
equality
0
(x) =
0
(y) is in fact impossible. We show similarly that the nal
case x Var() Fr () and y Fr () is also impossible which completes the
proof that
0
is injective on Var(). We shall now show that
1

0
() = ().
From proposition (64) it is sucient to prove that
1

0
(x) = (x) for all
x Var(). We shall distinguish two cases: rst we assume that x Fr ().
Then
0
(x) = (x) (Fr ()) and consequently
1

0
(x) = (x) as requested.
Next we assume that x Var()Fr (). Then
0
(x) = i(x) i( Var()Fr () )
and consequently
1

0
(x) = i
1
(i(x)) = (x). So it remains to show that

1
is admissible for
0
(), i.e. that it is valid for
0
() and
1
(u) = u for all
u Fr (
0
()). So let u Fr (
0
()). We need to show that
1
(u) = u. So
it is sucient to prove that u (Fr ()). However from proposition (74) we
have Fr (
0
())
0
(Fr ()) and consequently there exists x Fr () such that
u =
0
(x) = (x). It follows that u (Fr ()) as requested and it remains to
show that
1
is valid for
0
(). From proposition (91) it is sucient to show
that
1

0
is valid for . However, having proved that
1

0
() = () from
proposition (92) it is sucient to prove that is valid for which is in fact true
by assumption. This completes our proof of the proposition in the case when
W = V . We shall now prove the proposition in the general case. So we assume
that : V W is a map and P(V ) is such that is valid for and
|Fr ()
is an injective map. We need to prove that rnk (()) = rnk (). Let U be the
disjoint union of the sets V and W, specically:
U = 0 V 1 W
164
Let i : V U and j : W U be the corresponding inclusion maps. Consider
the formula

= i() P(U) and let

: U U be dened as:
u U ,

(u) =
_
j (u) if u V
u if u W
Let us accept for now that

is valid for

and that

|Fr (

)
is an injective
map. Having proved the proposition in the case when W = V , it can be applied
to

: U U and

P(U). Hence, since i and j are injective maps, using


proposition (152) we obtain the following equalities:
rnk (()) = rnk (j ())
A: to be proved = rnk (

))
case W = V = rnk (

)
= rnk (i())
prop. (152) = rnk ()
So it remains to show that j () =

) and furthermore that

is valid for

while

|Fr (

)
is an injective map. First we show that j() =

). Since

= i() from proposition (64) it is sucient to prove that j (u) =

i(u)
for all u Var(). So let u Var(). In particular u V and consequently
i(u) i(V ) U. From the above denition of

we obtain immediately

(i(u)) = j (u) as requested. So we now prove that

is valid for

= i().
Using proposition (91) it is sucient to prove that

i is valid for . However,


having proved that

i() = j (), from proposition (92) it is sucient


to prove that j is valid for . Having assumed that is valid for , using
proposition (91) once more it remains to show that j is valid for () which
follows from the injectivity of j and proposition (86). So it remains to prove
that

|Fr (

)
is an injective map. So let u, v Fr (

) such that

(u) =

(v).
We need to show that u = v. However since

= i(), from proposition (74)


we have Fr (

) i(Fr ()). Hence, there exists x, y Fr () such that u = i(x)


and v = i(y). Having proved that j =

i on Var(), from the equality

(u) =

(v) we obtain j (x) = j (y). It follows from the injectivity of


j that (x) = (y). Having assumed that is injective on Fr () we conclude
that x = y and nally that u = v as requested.
As with anything involving the algebra P(V ), it is dicult to establish deeper
properties without resorting to some form of structural induction argument.
Hence if we want to say anything more substantial about the substitution rank
of a formula P(V ) we need to relate the rank of
1

2
and x
1
to the
ranks of
1
and
2
. These relationships are not as simple as one would wish.
Proposition 155 Let V be a set and P(V ) of the form =
1

2
where

1
,
2
P(V ). Then the substitution ranks of ,
1
and
2
satisfy the equality:
rnk () = max( [Fr ()[ , rnk (
1
) , rnk (
2
) )
165
Proof
First we show that max( [Fr ()[ , rnk (
1
) , rnk (
2
) ) rnk (). From propo-
sition (150) we already know that [Fr ()[ rnk (). So it remains to show
that rnk (
1
) rnk () and rnk (
2
) rnk (). So let be the substitution
congruence on P(V ) and . We need to show that rnk (
1
) [Var()[
and rnk (
2
) [Var()[. However, from =
1

2
and theorem (12) of
page 135 we see that must be of the form =
1

2
where
1

1
and
2

2
. Hence we have rnk (
1
) [Var(
1
)[ [Var()[ and simi-
larly rnk (
2
) [Var(
2
)[ [Var()[. So it remains to show the inequality
rnk () max( [Fr ()[ , rnk (
1
) , rnk (
2
) ). We shall distinguish two cases:
rst we assume that max(rnk (
1
), rnk (
2
)) [Fr ()[. Since Fr (
1
) Fr ()
and Fr (
2
) Fr () using proposition (151) we obtain the existence of
1

1
and
2

2
such that [Var(
1
)[ = rnk (
1
) and [Var(
2
)[ = rnk (
2
) with the
inclusions Var(
1
) Fr () and Var(
2
) Fr (). Since
1

2
:
rnk () [Var(
1

2
)[
= [Var(
1
) Var(
2
)[
Var(
i
) Fr () [Fr ()[
= max( [Fr ()[ , rnk (
1
) , rnk (
2
) )
Next we assume that [Fr ()[ max(rnk (
1
), rnk (
2
)). We shall distinguish
two further cases: rst we assume that rnk (
1
) rnk (
2
). Since we have both
Fr (
2
) Fr () and [Fr ()[ rnk (
2
), from proposition (151) we can nd

2

2
such that [Var(
2
)[ = rnk (
2
) and Fr () Var(
2
). In particular
we obtain the inclusion Fr (
1
) Var(
2
) and applying proposition (151) once
more, from rnk (
1
) rnk (
2
) = [Var(
2
)[ we obtain the existence of
1

1
such that [Var(
1
)[ = rnk (
1
) and Var(
1
) Var(
2
). It follows that:
rnk () [Var(
1

2
)[
= [Var(
1
) Var(
2
)[
Var(
1
) Var(
2
) = [Var(
2
)[
= rnk (
2
)
= max( [Fr ()[ , rnk (
1
) , rnk (
2
) )
The case rnk (
2
) rnk (
1
) is dealt with similarly.
Proposition 156 Let V be a set and P(V ) of the form = x
1
where

1
P(V ) and x V . Then the substitution ranks of and
1
satisfy:
rnk () = rnk (
1
) +
where 2 = 0, 1 is given by the equivalence = 1 if and only if:
( x , Fr (
1
) ) ( [Fr (
1
)[ = rnk (
1
) )
166
Proof
Let = x
1
where
1
P(V ) and x V . Dene = rnk () rnk (
1
). Then
is an integer, possibly negative. In order to prove that 2 it is therefore
sucient to prove that the following inequalities hold:
rnk (
1
) rnk () rnk (
1
) + 1 (2.36)
This will be the rst part of our proof. Next we shall show the equivalence:
( = 1) ( x , Fr (
1
) ) ( [Fr (
1
)[ = rnk (
1
) ) (2.37)
So rst we show that rnk (
1
) rnk (). Let where denotes the
substitution congruence on P(V ). We need to show that rnk (
1
) [Var()[.
Using theorem (12) of page 135, from the equivalence we see that is
either of the form = x
1
where
1

1
, or is of the form = y
1
where

1

1
[y : x], x ,= y and y , Fr (
1
). First we assume that = x
1
where

1

1
. Then we have the following inequalities:
rnk (
1
) [Var(
1
)[
[x Var(
1
)[
= [Var(x
1
)[
= [Var()[
Next we assume that = y
1
where
1

1
[y : x], x ,= y and y , Fr (
1
). The
permutation [y : x] being injective, from proposition (120) we obtain

1

1
where

1
=
1
[y : x]. Furthermore, dening

= x

1
we have =

[y : x].
Using the injectivity of [y : x] once more and proposition (63) we obtain:
[Var()[ = [Var(

[y : x])[ = [ [y : x](Var(

)) [ = [Var(

)[
So we need to prove that rnk (
1
) [Var(

)[ where

= x

1
and

1

1
.
Hence we are back to our initial case and we have proved that rnk (
1
) rnk ().
Next we show that rnk () rnk (
1
) + 1. So let
1

1
. We need to show
that rnk () 1 [Var(
1
)[ or equivalently that rnk () [Var(
1
)[ + 1:
rnk () = rnk (x
1
)
x
1
x
1
[Var(x
1
)[
= [x Var(
1
)[
[Var(
1
)[ + 1
So we are done proving the inequalities (2.36). We shall complete the proof
of this proposition by showing the equivalence (2.37). First we show : so
we assume that = 1. We need to show that x , Fr (
1
) and furthermore
that [Fr (
1
)[ = rnk (
1
). First we show that x , Fr (
1
). So suppose to the
contrary that x Fr (
1
). We shall obtain a contradiction by showing = 0,
that is rnk (
1
) = rnk (). We already know that rnk (
1
) rnk (). So we
167
need to show that rnk () rnk (
1
). So let
1

1
. We need to show that
rnk () [Var(
1
)[. However, from
1

1
and proposition (118) we obtain
Fr (
1
) = Fr (
1
) and in particular x Fr (
1
). It follows that:
rnk () = rnk (x
1
)
x
1
x
1
[Var(x
1
)[
= [x Var(
1
)[
x Fr (
1
) Var(
1
) = [Var(
1
)[
This is our desired contradiction and we conclude that x , Fr (
1
). It remains
to show that [Fr (
1
)[ = rnk (
1
). So suppose this equality does not hold. We
shall obtain a contradiction by showing = 0, that is rnk () rnk (
1
). So let

1

1
. We need to show once again that rnk () [Var(
1
)[. However, having
assumed the equality [Fr (
1
)[ = rnk (
1
) does not hold, from proposition (150)
we obtain [Fr (
1
)[ < rnk (
1
) and consequently from
1

1
we have:
[Fr (
1
)[ = [Fr (
1
)[ < rnk (
1
) [Var(
1
)[
It follows that the set Var(
1
) Fr (
1
) cannot be empty, and there exists
y Var(
1
) Fr (
1
). From x , Fr (
1
) and y , Fr (
1
) = Fr (
1
) using
proposition (119) we obtain the equivalence x
1
y
1
. Hence we also have
the equivalence x
1
y
1
and consequently:
rnk () = rnk (x
1
)
x
1
y
1
[Var(y
1
)[
= [y Var(
1
)[
y Var(
1
) = [Var(
1
)[
which is our desired contradiction and we conclude that [Fr (
1
)[ = rnk (
1
).
This completes our proof of in the equivalence (2.37). We now prove :
so we assume that x , Fr (
1
) and [Fr (
1
)[ = rnk (
1
). We need to show
that = 1, that is rnk () = rnk (
1
) + 1. We already have the inequality
rnk () rnk (
1
) + 1. So it remains to show that rnk (
1
) + 1 rnk ()
or equivalently rnk (
1
) < rnk (). So let . We need to show that
rnk (
1
) < [Var()[. Once again, using theorem (12) of page 135, from the
equivalence we see that is either of the form = x
1
where
1

1
,
or is of the form = y
1
where
1

1
[y : x], x ,= y and y , Fr (
1
). First
we assume that = x
1
where
1

1
. Hence rnk (
1
) [Var(
1
)[ and we
shall distinguish two further cases: rst we assume that rnk (
1
) = [Var(
1
)[.
Having assumed that [Fr (
1
)[ = rnk (
1
) we obtain:
[Fr (
1
)[ = [Fr (
1
)[ = rnk (
1
) = [Var(
1
)[
from which we see that Var(
1
) = Fr (
1
) and consequently it follows that
x , Fr (
1
) = Fr (
1
) = Var(
1
). Hence we see that:
rnk (
1
) [Var(
1
)[
168
x , Var(
1
) < [x Var(
1
)[
= [Var(x
1
)[
= [Var()[
So we now assume that rnk (
1
) < [Var(
1
)[, in which case we obtain:
rnk (
1
) < [Var(
1
)[
[x Var(
1
)[
= [Var(x
1
)[
= [Var()[
We now consider the case when = y
1
where
1

1
[y : x], x ,= y and
y , Fr (
1
). Once again, dening

1
=
1
[y : x] and

= x

1
we obtain
the equivalence

1

1
and [Var(

)[ = [Var()[. So we need to prove that


rnk (
1
) < [Var(

)[ knowing that

1

1
which follows from our initial case.

As already discussed, our objective is to show the existence of a P(W)


such that /() = /() provided we have rnk ( /()) [W[. Assuming
we prove this result, it will be important for us to have a way of telling whether
the condition rnk ( /()) [W[ is satised. The following proposition allows
us to do this. Given any map : V W, the associated variable substitution
: P(V ) P(W) cannot increase the substitution rank of a formula. Hence:
rnk ( /()) rnk (/()) = rnk () [Var()[ [V [
So we see for example that the condition [V [ [W[ is a sucient condition.
Note the following proposition makes no assumption on the validity of for .
Proposition 157 Let V, W be sets and : V W be a map. Let P(V ):
rnk (()) rnk ()
where : P(V ) P(W) denotes the associated substitution mapping.
Proof
We shall prove rnk (()) rnk () by structural induction using theorem (3)
of page 32. First we assume that = (x y) for some x, y V . Then we have:
rnk (()) = [ (x), (y) [ [ x, y [ = rnk ()
Next we assume that = . Then rnk (()) = rnk () = 0 = rnk (). So we
now assume that =
1

2
where
1
,
2
P(V ) satisfy our property:
rnk (()) = rnk ((
1

2
))
= rnk ((
1
) (
2
))
prop. (155) = max( [ Fr (()) [ , rnk ((
1
)) , rnk ((
2
)) )
169
max( [ Fr (()) [ , rnk (
1
) , rnk (
2
) )
prop. (74) max( [ (Fr ()) [ , rnk (
1
) , rnk (
2
) )
max( [Fr ()[ , rnk (
1
) , rnk (
2
) )
prop. (155) = rnk ()
Finally we assume that = x
1
where x V and
1
P(V ) satisfy our
induction property. Then () = (x)(
1
). Let = rnk () rnk (
1
) and
let = rnk (()) rnk ((
1
)). From proposition (156) we know that , 2.
We shall distinguish two cases: rst we assume that = 0. Then we have:
rnk (()) = rnk ((
1
)) rnk (
1
) rnk ()
Next we assume that = 1. We shall distinguish two further cases: if = 1 :
rnk (()) = 1 + rnk ((
1
)) 1 + rnk (
1
) = rnk ()
So it remains to deal with the last possibility when = 1 and = 0. Using
proposition (156), from = 0 we see that x Fr (
1
) or [Fr (
1
)[ < rnk (
1
).
So we shall distinguish two further cases. These cases may not be exclusive of
one another but we dont need them to be. So rst we assume x Fr (
1
).
Using proposition (156) again, from = 1 we obtain [Fr ((
1
))[ = rnk ((
1
))
and furthermore (x) , Fr ((
1
)). Having assumed that x Fr (
1
) it follows
that (x) is an element of (Fr (
1
)) but not an element of Fr ((
1
)). Hence,
we see that the inclusion Fr ((
1
)) (Fr (
1
)) which we know is true from
proposition (74) is in fact a strict inclusion. So we have a strict inequality
between the nite cardinals [Fr ((
1
))[ < [(Fr (
1
))[. Thus:
rnk (()) = 1 + rnk ((
1
))
[Fr ((
1
))[ = rnk ((
1
)) = 1 +[Fr ((
1
))[
[Fr ((
1
))[ < [(Fr (
1
))[ [(Fr (
1
))[
[Fr (
1
)[
prop. (150) rnk (
1
)
= 0 = rnk ()
So we now assume that [Fr (
1
)[ < rnk (
1
). In this case we have:
rnk (()) = 1 + rnk ((
1
))
[Fr ((
1
))[ = rnk ((
1
)) = 1 +[Fr ((
1
))[
prop. (74) 1 +[(Fr (
1
))[
1 +[Fr (
1
)[
[Fr (
1
)[ < rnk (
1
) rnk (
1
)
= 0 = rnk ()

170
Proposition 158 Let V be a set and n N such that n [V [. Then there
exists a formula P(V ) such that Fr () = and rnk () = n.
Proof
From the inequality n [V [ there exists an injective map u : n V . Dene

0
= and
k+1
= (u(k) u(k))
k
for k n. Let =
n
. For example,
in the case when n = 2 the formula is given by:
= ( u(1) u(1) ) [ ( u(0) u(0) ) ]
Dene
0
= and
k+1
= u(k)
k
for k n. Let =
n
. We shall complete
the proof of this proposition by showing that Fr () = and rnk () = n. First
we prove that Fr () = . From the recursion
k+1
= (u(k) u(k))
k
:
Fr (
k+1
) = u(k) Fr (
k
) , k n
Since Fr (
0
) = Fr () = an easy induction argument shows that we have
Fr (
k
) = u[k] for all k n + 1. Note that u[k] denotes the image of the
set k by u, i.e. the range of the restricted map u
|k
. We are using the square
brackets [ . ] to avoid any confusion between u[k] and u(k). In particular for
k = n we see that Fr () = Fr (
n
) = u[n] = rng(u). Furthermore, from the
recursion formula
k+1
= u(k)
k
we obtain the equality:
Fr (
k+1
) = Fr (
k
) u(k) , k n
Since Fr (
0
) = Fr () = rng(u), an easy induction argument shows that
Fr (
k
) = rng(u) u[k] for all k n + 1. In particular for k = n we have
the equality Fr () = Fr (
n
) = rng(u) u[n] = as requested. So it remains
to show that rnk() = n. Using proposition (156), from the recursion formula

k+1
= u(k)
k
and the fact that u(k) rng(u) u[k] = Fr (
k
) we obtain
rnk (
k
) = rnk (
k+1
). Note that u(k) rng(u) u[k] is true since u is injective.
Hence we see that rnk () = rnk (
0
) = rnk () and it is sucient to prove
that rnk () = n. However having proved Fr (
k
) = u[k], it follows from the
injectivity of u that [Fr (
k
)[ = k for all k n +1. From the recursion formula

k+1
= (u(k) u(k))
k
using proposition (155) we obtain:
rnk (
k+1
) = max( [Fr (
k+1
)[ , rnk (u(k) u(k)) , rnk (
k
) )
that is rnk (
k+1
) = max(k + 1, 1, rnk(
k
)) for all k n. Since rnk (
0
) = 0,
a simple induction argument shows that rnk (
k
) = k for all k n + 1. In
particular for k = n we obtain rnk () = rnk (
n
) = n as requested.
2.3.7 Existence of Essential Substitution
Given a map : V W we would like to prove the existence of an essential
substitution mapping, namely a map as dened below. Recall that the minimal
transform and minimal extension can be found in denitions (137) and (142):
171
Denition 159 Let V, W be sets and : V W be a map. We call essential
substitution mapping associated with , any map

: P(V ) P(W) such that:


/

= / (2.38)
where :

V

W is the minimal extension and / is the minimal transform.
We cannot expect an essential substitution mapping to be uniquely dened.
If denotes the substitution congruence on P(W) then from theorem (14) of
page 153, any other map

: P(V ) P(W) satisfying

()

() for all
P(V ) will also satisfy equation (2.38). However, if

: P(V ) P(W) does


indeed satisfy equation (2.38), then for all P(V ) the equivalence class of

() modulo the substitution congruence is uniquely determined. This is the


best we can do. The notion of an essential substitution mapping

associated
with a map we feel plays an important role. If P(V ) is such that is
valid for , then it follows from theorem (13) of page 150 that

() ()
where : P(V ) P(W) denotes the usual variable substitution mapping of
denition (53). So

can be viewed as a natural extension of : P(V ) P(W)


which thanks to equation (2.38), assigns a meaningful formula

() even in the
case when is not a valid substitution for . This fundamentally allows us to
write x [y/x]

as a valid instance of the specialization axioms for all


y V , without having to worry whether the substitution [y/x] is valid for .
For example, if = y(x y) with x ,= y, we shall now be able to claim
that x x(y x) is a legitimate axiom. So our aim is to prove the
existence of an essential substitution mapping

: P(V ) P(W) associated


with , at least when W is a large enough set. Given P(V ) the rst
step in our proof is to show the existence of a formula P(W) such that
/() = /(). As already discussed in the previous section, we cannot
hope to obtain the existence of without some condition on the set W. We
conjectured the substitution rank of /() had to be smaller than [W[, i.e.
rnk ( /() ) [W[. So assuming this condition holds, we are setting out to
prove the existence of which is the object of theorem (17) below. Until now,
most of our results have been obtained by structural induction arguments. In
this case, structural induction does not seem to work. It appears the notion of
substitution rank does not lend itself easily to induction, and we shall need to
nd some other route. So let us go back to the simple case when = y(x y)
with x ,= y. Assuming (x) = u we obtain /() = 0(u 0) P(

W).
Our rst objective will be to show the existence of some
1
P(

W) such that
Var(
1
) W and /()
1
where denotes the substitution congruence
on P(

W). Hopefully the condition rnk ( /() ) [W[ will allow us to do
that using proposition (151). Following our example, the formula
1
would look
something like
1
= v(u v) P(

W) for some v W and u ,= v. Once we
have the formula
1
we can easily project it onto P(W) and dene = q(
1
)
where q :

W W is an appropriate substitution. Having dened a formula
P(W) it will remain to show that:
/() = /() (2.39)
172
We shall proceed as follows: we rst consider the operator A : P(

W) P(

W)
dened by A = p

/ where p :

W

W is an appropriate substitution and

/ : P(

W) P(

W) is the minimal transform mapping. From the equivalence


/()
1
and theorem (14) of page 153 we obtain immediately:
A /() = A(
1
) (2.40)
We then prove that A /() = /() so as to obtain:
/() = A(
1
) (2.41)
Using the injection i : W

W we nally prove that the minimal transform
/: P(W) P(

W) is in fact given by /= A i, and we conclude by arguing
that
1
= i q(
1
) = i() which gives us equation (2.39) from (2.41).
Before we formally dene the operator A, we shall say a few words on the
minimal extension

V of the set

V which is itself the minimal extension of V .
From denition (136) the minimal extension

V is dened as:

V = 0 V 1 N
Hence we have

V = 0

V 1 N. So the set

V contains three types of


elements: those of the form (0, (0, x)) which we simply identify with x V ,
those of the form (0, (1, n)) which we identify with n N, and those of the form
(1, n) which we denote n for all n N. Thus, in the interest of lighter notations,
we regard

V as the disjoint union


V = V N

N where

N is the image of N
through the embedding n n. So for example, if = y(x y) P(V ), we
shall continue to denote the minimal transform of as /() = 0(x 0), while
the minimal transform of /() is denoted

/ /() =

0(x

0) P(

V ).
With these notational conventions in mind, we can now state:
Denition 160 Let V be a set. We call weak transform on P(

V ) the map
A : P(

V ) P(

V ) dened by A = p

/ where

/ : P(

V ) P(

V ) is the
minimal transform mapping and p :

V

V is dened by:
u

V , p (u) =
_
u if u

V
n if u = n

N
So for example, if = 0(x 0) P(

V ) we obtain

/() =

0(x

0) and
consequently A() = 0(x 0) = . However, if = 1(0 1) then

/() =

0(0

0) and A() = 0(0 0). This last equality shows the limitations
of the operator A. It is not a very interesting operator, as the substitution
equivalence A() is not necessarily preserved. On the positive side, the
operator A has a simple denition which will make our forthcoming proofs a
lot easier. It also has the useful properties allowing us to prove theorem (17) :
Lemma 161 Let V be a set and A : P(

V ) P(

V ) be the weak transform on


P(

V ). Let i : V

V be the inclusion map. Then we have:
/ = A i
where / : P(V ) P(

V ) is the minimal transform mapping.


173
Proof
Let

/ : P(

V ) P(

V ) be the minimal transform mapping and p :


V

V be
the map of denition (160). Given P(V ), since i : V

V is an injective
map, in particular it is valid for . Hence we have:
A i() = p

/ i()
theorem (13) = p

i /()
A: to be proved = /()
So it remains to show that p

i(u) = u for all u



V , where

i :

V

V is the
minimal extension of i. So let u

V . We shall distinguish two cases: rst we
assume that u V . Then

i(u) = i(u) = u and consequently p

i(u) = p (u) = u
as requested. Next we assume that u N. Then

i(u) = u and it follows that
p

i(u) = p ( u) = u. So this appears to complete our proof. However, we


feel this proof is not completely convincing, particularly with regards to the
step

i(u) = u. A blind reading of denition (142) for the minimal extension

i :

V

V seems to indicate that



i(u) = u whenever u N. But of course,
the wording of denition (142) is the result of identications, which are now
possibly confusing us. So let i
2
: N

V and j
2
: N

V be the inclusion
maps. What is really meant by denition (142) is that

i i
2
(u) = j
2
(u) for all
u N. In our discussion prior to stating denition (160) we agreed to denote
j
2
(u) as u. Hence if we identify u and i
2
(u) we obtain

i(u) = u as claimed. So
let us pursue this further and make all our identications explicit. The wording
of denition (160) is itself the result of identications. Given u N the formula
p( u) = u should be rigorously written as p j
2
(u) = i
2
(u). So given u N, if
we identify u and i
2
(u)

V we have the following equalities:
p

i(u) = p

i i
2
(u)
def. (142) = p j
2
(u)
def. (160) = i
2
(u)
identication = u
Likewise, if we denote i
1
: V

V and j
1
:

V

V the inclusion mappings, and


identify i
1
(u) and u for all u V , we obtain the following equalities:
p

i(u) = p

i i
1
(u)
def. (142) = p j
1
i(u)
def. (160) = i(u)
i = i
1
= i
1
(u)
identication = u

174
Lemma 162 Let V, W be sets and : V W be a map. Then:
A / = /
where A : P(

W) P(

W) is the weak transform on P(

W).
Proof
For once we shall be able to prove the formula without resorting to a structural
induction argument. In the interest of lighter notations, we shall keep the same
notations /,

/, A and p in relation to the sets V and W. So let P(V ):
A /() = p

/ /()
theorem (13) of p. 150 = p



/ /()
prop. (141) = p



/ i()
A: to be proved = p

/ i()
= A i()
lemma (161) = /()
So it remains to show that p

(u) = p (u) for all u

V . So let u

V .
Since

V is the disjoint union of



V and

N, we shall distinguish two cases: rst we
assume that u

V . Then

(u) = (u)

W and consequently p

(u) = (u).
So the equality p

(u) = p (u) is satised since p (u) = u for u

V . Next
we assume that u

N i.e. that u = n for some n N. Then

(u) = u = n and
consequently p

(u) = n = p (u).
Before we move on to theorem (17) which is our key result, we should pause
a moment on an interesting consequence of lemma (162). In general given and
P(V ) such that where is the substitution congruence, we cannot
conclude that = . Likewise if

,

P(

V ) are substitution equivalent,


we cannot conclude that that

=

. However, if both

and

are minimal
transforms then things are dierent: it is in fact possible to infer the equality
/() = /() simply from the substitution equivalence /() /():
Proposition 163 Let V be a set and , P(V ). Then we have:
/() /() /() = /()
where the relation denotes the substitution congruence on P(

V ).
Proof
We assume that /() /(). We need to show that /() = /(). How-
ever, from theorem (14) of page 153 we obtain

/ /() =

/ /(), where

/ : P(

V ) P(

V ) is the minimal transform mapping. Hence, from deni-


tion (160), we see that A /() = A /(). Applying lemma (162) to
W = V and the identity mapping : V V we conclude that /() = /().
175

For future reference, we also quote the following lemma:


Lemma 164 Let V be a set and p :

V

V be the map of denition (160).
Then for all P(V ), the substitution p is valid for the formula

/ /().
Proof
Using proposition (86) it is sucient to show that p is injective on the set
Var(

/ /()). First we shall show that p is injective on V

N: so suppose
u, v V

N are such that p (u) = p (v). We need to show that u = v. We
shall distinguish four cases: rst we assume that u, v V

V . Then u = v
follows immediately from denition (160). Next we assume that u, v

N. Then
u = n and v = m for some n, m N. From p (u) = p (v) we obtain n = m and
consequently u = v. Next we assume that u V and v

N. Then v = m for
some m N and from p (u) = p (v) we obtain u = n which contradicts the fact
that V N = . So this case is in fact impossible. The case u

N and v V
is likewise impossible and we have proved that p is injective on V

N. So it
remains to show that Var(

/ /()) V

N. Let u Var(

/ /()). We
need to show that u V

N. Since

V =

V

N, we shall distinguish two cases:
rst we assume that u

V . Then using proposition (139) we obtain:
u Var(

/ /())

V = Fr (/()) = Fr () V V

N
Next we assume that u

N. Then it is clear that u V

N.
Theorem 17 Let V, W be sets and : V W be a map. Let P(V ) with:
rnk ( /() ) [W[
where :

V

W is the minimal extension of and /() is the minimal
transform of . Then, there exists P(W) such that /() = /().
Proof
Our rst step is to nd
1
P(

W) such that /()
1
and Var(
1
) W,
where denotes the substitution congruence on P(

W). We shall do so using
proposition (151) applied to the set

W and the formula /() P(

W).
Suppose for now that we have proved the inclusion Fr ( /()) W. Then
from the assumption rnk ( /() ) [W[ and proposition (151) we obtain the
existence of
1
P(

W) such that /()
1
and Var(
1
) W. In fact,
proposition (151) allows to assume that [Var(
1
)[ = rnk ( /() ) but we
shall not be using this property. So we need to prove that Fr ( /()) W:
Fr ( /()) = Fr ( (/()) )
prop. (74) ( Fr (/()) )
prop. (139) = (Fr ())
P(V ) (V )
def. (142) = (V )
W
176
So the existence of
1
P(

W) such that /()
1
and Var(
1
) W is
now established. Next we want to project
1
onto P(W) by dening = q(
1
)
where q :

W W is a substitution such that q(u) = u for all u W. To be
rigorous, we need to dene q(n) for n N which we cannot do when W = . So
in the case when W ,= , let u

W and dene q :

W W as follows:
u

W , q(u) =
_
u if u W
u

if u N
and consider the associated substitution mapping q : P(

W) P(W). In the
case when W = , there exists no substitution q :

W W but we can dene
the operator q : P(

W) P(W) with the following structural recursion:
q() =

if = (u v)
if =
q(
1
) q(
2
) if =
1

2
if = u
1
(2.42)
In both cases we set = q(
1
) P(W). We shall complete the proof of the
theorem by proving the equality /() = /(). Using lemma (162) :
/() = A /()
def. (160) = p

/ /()
theorem (14) and /()
1
= p

/(
1
)
= A(
1
)
A: to be proved = A i q(
1
)
= q(
1
) = A i()
lemma (161) = /()
So it remains to show that i q(
1
) =
1
where i : W

W is the inclusion
map. As before, we shall distinguish two cases: rst we assume that W ,= .
Then q arises from the substitution q :

W W and from proposition (64) it is
sucient to prove that i q(u) = u for all u Var(
1
). Since Var(
1
) W the
equality is clear. Next we assume that W = . From the inclusion Var(
1
) W
it follows that Var(
1
) = and it is therefore sucient to prove the property:
Var() = i q() =
for all P(

W). We shall do so by structural induction. Note that when
W = , the inclusion map i : W

W is simply the map with empty domain,
namely the empty set. First we assume that = (u v) for some u, v

W.
Then the above implication is vacuously true. Next we assume that = .
Then i q() = is clear. So we now assume that =
1

2
where
1
,
2
satisfy the above implication. We need to show the same is true of . So we
assume that Var() = . We need to show that i q() = . However, from
177
Var() = we obtain Var(
1
) = and Var(
2
) = . Having assumed
1
and
2
satisfy the implication, we obtain the equalities i q(
1
) =
1
and i q(
2
) =
2
from which i q() = follows immediately. So it remains to check the case
when = u
1
for which the above implication is also vacuously true.
Theorem 18 Let V, W be sets and : V W be a map. Then, there exists
an essential substitution mapping

: P(V ) P(W) associated with , if and


only if [W[ is an innite cardinal or the inequality [V [ [W[ holds.
Proof
First we prove the if part: so we assume that [W[ is an innite cardinal,
or that it is nite with [V [ [W[. We need to prove the existence of an
essential substitution mapping

: P(V ) P(W) associated with . Let


c : T(P(W)) P(W) be a choice function whose existence follows from the
axiom of choice. Let us accept for now that for all P(V ), there exists some
P(W) such that /() = /(). Given P(W), let [] denote the
congruence class of modulo the substitution congruence, which is a non-empty
subset of P(W). Dene

: P(V ) P(W) by setting

() = c([]), where
is an arbitrary formula of P(W) such that /() = /(). We need to
check that

is well dened, namely that

() is independent of the particular


choice of . But if

is such that /() = /(

) then /() = /(

)
and it follows from theorem (14) of page 153 that [] = [

]. So it remains
to show the existence of such that /() = /() for all P(V ). So
let P(V ). Using theorem (17) of page 176 it is sucient to prove that
rnk ( /() ) [W[. The substitution rank of a formula being always nite,
this is clearly true if [W[ is innite. Otherwise, using proposition (157) :
rnk ( /() ) rnk (/())
prop. (153) = rnk ()
[Var()[
[V [
[V [ [W[ [W[
We now prove the only if part: So we assume there exists

: P(V ) P(W)
essential substitution associated with . We need to show that [W[ is an innite
cardinal, or that it is nite with [V [ [W[. So suppose [W[ is a nite cardinal.
There exists n N such that [W[ = n. We need to show that [V [ n holds. So
we assume to the contrary that n+1 [V [ and we shall derive a contradiction.
Using proposition (158) there exists a formula P(V ) such that Fr () =
and rnk () = n + 1. Hence we have the following:
n + 1 = rnk ()
prop. (153) = rnk (/())
A: to be proved = rnk ( /())
178

associated with = rnk (/

())
prop. (153) = rnk (

())
[Var(

())[

() P(W) [W[
= n
which is our desired contradiction. So it remains to show /() = /().
From proposition (64) it is enough to prove that (u) = u for all u Var(/()).
Using denition (142) of the minimal extension it is sucient to show that
Var(/()) N. Since

V = V N, it is therefore sucient to prove that
Var(/()) V = . However from proposition (139) we have the equality
Var(/()) V = Fr (). So the result follows from Fr () = .
2.3.8 Properties of Essential Substitution
Until now we have always started from a map : V W and investigated the
existence of an associated essential substitution

: P(V ) P(W). Thanks to


theorem (18) of page 178 we have a simple rule based on the cardinals [V [ and
[W[ allowing us to determine when such existence occurs. In this section, we
wish to investigate essential substitutions in their own right. We shall say that a
map

: P(V ) P(W) is an essential substitution whenever there exists a map


: V W such that

is an essential substitution associated with , as per


denition (159). We believe essential substitutions should play an important
role in mathematical logic. Our study of substitutions started with the (not
essential) substitutions mappings

: P(V ) P(W) associated with a map


: V W as per denition (53). These substitution mappings allowed us to
characterize the substitution congruence but are somewhat limited by the fact
that is generally not a valid substitution for P(V ). The introduction of
valid substitutions as per denition (83) allowed us to move forward somehow,
but the issue with valid substitutions fundamentally remains: if a substitution
: V W is valid for P(V ), then the associated

: P(V ) P(W)
behaves properly on . In other words, it is meaningful to speak of

().
However, this validity of for is a local property so to speak, and we are still
short of a map

: P(V ) P(W) with the right global property. This is what


essential substitutions allow us to achieve: if

: P(V ) P(W) is an essential


substitution then

() is meaningful for all P(V ). It is as if the associated


map : V W had the global property of being valid for all P(V ).
Denition 165 Let V, W be sets. We say that a map

: P(V ) P(W) is
an essential substitution if and only if there exists : V W such that:
/

= /
where :

V

W is the minimal extension and / is the minimal transform.
179
The following proposition shows that any map : V W associated with
an essential substitution

: P(V ) P(W) is unique. This means an essential


substitution

can safely be regarded as a map

: V W without any risk


of confusion. In fact, there is no need to have separate notations

and . If
: P(V ) P(W) is an essential substitution, it is meaningful to speak of (x).
Proposition 166 Let V , W be sets and

: P(V ) P(W) be an essential


substitution. Then the map : V W associated with

is unique.
Proof
Let
0
,
1
: V W such that /

=
i
/ for all i 2. We need to show
that
0
(x) =
1
(x) for all x V . So let x V . Dene = (x x). Then :
(
0
(x)
0
(x)) = (
0
(x)
0
(x))
=
0
(x x)
=
0
/()
= /

()
=
1
/()
= (
1
(x)
1
(x))
From (
0
(x)
0
(x)) = (
1
(x)
1
(x)) we conclude that
0
(x) =
1
(x).
The notion of essential substitution is tailor made for the substitution con-
gruence which is arguably the most important congruence in P(V ). If an essen-
tial substitution : P(V ) P(W) is redened without changing the class of
() modulo substitution, we still obtain an essential substitution which is in
fact identical to when viewed as a map : V W.
Proposition 167 Let V, W be sets and : P(V ) P(W) be an essential
substitution. Let : P(V ) P(W) be a map such that () () for all
P(V ) where is the substitution congruence on P(W). Then is itself an
essential substitution with associated map : V W identical to .
Proof
Since is essential we have / () = /() for all P(V ). Having
assumed that () () for all P(V ), from theorem (14) of page 153
we have / () = / (). It follows that / () = /() and we
conclude that is itself essential with associated map : V W.
Suppose : P(V ) P(W) is an essential substitution. We have an associ-
ated map : V W which we have decided to call to keep our notations
light and natural. But to every map : V W there is an associated vari-
able substitution mapping

: P(V ) P(W) as per denition (53). It has


been our practice so far to denote the variable substitution

simply by .
Obviously we cannot keep calling everything without creating great con-
fusion. Whenever we are dealing with a map : V W and another map
180
: P(V ) P(W), the fact that these two maps have the same name is ne,
as we can tell from the context which map is being referred to. We are now
confronted with a situation where we have two maps : P(V ) P(W) which
cannot be distinguished from the data type of their argument, so to speak.
Hence our notation

: P(V ) P(W) for the usual variable substitution as


per denition (53). Now given P(V ), we may be wondering what relation-
ship there is between () and

(). We cannot hope to have an equality in


general since from proposition (167) we know that can be redened modulo
the substitution congruence without aecting its associated map : V W.
However, if is the substitution congruence on P(W) we may be able to claim
that ()

(). Indeed, this is of course the case when is valid for :


Proposition 168 Let V, W be sets and : P(V ) P(W) be an essential sub-
stitution. Then if is valid for P(V ), we have the substitution equivalence:
()

()
where

: P(V ) P(W) is the associated substitution as per denition (53).


Proof
We assume that is valid for . We need to show that ()

(), that
is / () = /

(). However, having assumed is valid for , from


theorem (13) of page 150 we have /

() = /(). Since is an essential


substitution we also have / () = /(). So the result follows.
We have spent a lot of time proving the existence of essential substitutions
without providing natural examples. The most obvious are the variable substitu-
tion mappings associated with injective maps : V W as per denition (53).
A more interesting and somewhat deeper result, is the fact that the minimal
transform mapping / : P(V ) P(

V ) is itself an essential substitution.


Proposition 169 Let V, W be sets and : V W be an injective map. Then
the associated map : P(V ) P(W) is essential with associated map itself.
Proof
Let : V W be an injective map. Let : P(V ) P(W) be the associ-
ated substitution mapping as per denition (53). The fact that both mappings
are called is standard practice at this stage for us. We need to prove that
: P(V ) P(W) is an essential substitution mapping, with associated map
: V W. In other words, we need to prove that / () = /() for all
P(V ). Using theorem (13) of page 150 it is sucient to show that is valid
for which follows from proposition (86) and the injectivity of : V W.
The minimal transform / is an essential substitution with associated map
the inclusion i : V

V . In this case, we shall break with our own tradition and
refrain from using the same notation for / and its associated map i : V

V .
Proposition 170 Let V be a set. The minimal transform / : P(V ) P(

V )
is an essential substitution with associated map the inclusion i : V

V .
181
Proof
Let

/ : P(

V ) P(

V ) denote the minimal transform on P(

V ). We need to
show that

/ /() =

i /() for all P(V ). However, since i is injective,


from proposition (86) it is valid for and it follows from theorem (13) of page 150
that

/ i() =

i /(). Hence we need to show that



/ /() =

/ i().
Using theorem (14) of page 153 we need to show /() i() where is the
substitution congruence on P(

V ), which we know is true from proposition (141).

Given P(V ) and a map : V V which is admissible for , we


know from proposition (116) that () where denotes the substitution
congruence on P(V ). In other words, if is valid for and (u) = u for all
u Fr () then we have where = (). The following proposition is an
extension of this result for essential substitutions. In fact, the converse is now
also true. If then we must have = () for some essential substitution
such that (u) = u for all u Fr (). This is less deep than it looks:
Proposition 171 Let V be a set and be the substitution congruence on P(V ).
Then for all , P(V ) we have if and only if = () for some
essential substitution : P(V ) P(V ) such that (u) = u for all u Fr ().
Proof
First we show the if part: so suppose = () for some essential substi-
tution : P(V ) P(V ) such that (u) = u for all u Fr (). We need
to show that . From theorem (14) of page 153 it is sucient to prove
that /() = /(). Having assumed that = () we need to show that
/() = / (). However, since is essential we have / () = /().
Hence we need to show that /() = /(). Using proposition (64) it is
sucient to prove that (u) = u for all u Var(/()). So let u Var(/()).
Since

V = V N we shall distinguish two cases: rst we assume that u N. Then
(u) = u is clear from denition (142). Next we assume that u V . Then from
proposition (139) we obtain u Fr () and it follows that (u) = (u) = u. We
now prove the only if part: so suppose . We need to show that = ()
for some essential substitution : P(V ) P(V ) such that (u) = u for all
u Fr (). Let i : P(V ) P(V ) be the identity mapping. From proposi-
tion (169), i is an essential substitution associated with the identity i : V V .
Let : P(V ) P(V ) be dened by () = i() whenever ,= and () = .
Having assumed that we have () i() for all P(V ). It follows
from proposition (167) that is an essential substitution whose associated map
: V V is the identity. In particular, we have (u) = u for all u Fr ().
The composition of two essential substitutions is itself essential, while the
associated map is the obvious composition of the respective associated maps.
Proposition 172 Let U, V and W be sets while the maps : P(U) P(V )
and : P(V ) P(W) are essential substitutions. Then : P(U) P(W)
is itself an essential substitution with associated map : U W.
182
Proof
We need to show that /() = ( )/. However from denition (142), it
is clear that the minimal extension ( ) :

U

W is equal to the composition
of the minimal extensions . Hence we have:
/ ( ) = /
= /
= ( ) /

Suppose : V W is a map which is valid for P(V ). From propo-


sition (85) we know that Fr (()) = (Fr ()). In fact this equality holds for
every sub-formula _ . The following proposition shows that the equality
Fr (()) = (Fr ()) holds whenever : P(V ) P(W) is an essential sub-
stitution. As already pointed out, this is as if was valid for all P(V ).
But of course, nothing about sub-formulas in this case. Consider the formula
= y(x y) where V = x, y and x ,= y. Let : P(V ) P(V ) be an
essential substitution associated with = [y/x]. Then : V V is clearly
not valid for and the equality Fr (()) = (Fr ()) may look very surprising.
Indeed, we have Fr () = x and consequently (Fr ()) = y. It is very
tempting to argue that () = y(y y) has no free variable and that the
equality therefore cannot be true. But remember that : P(V ) P(V ) does
not refer to the substitution mapping associated with : V V . The context
of our discussion makes it very clear that we are starting with : P(V ) P(V )
as a given essential substitution. Hence () is not equal to y(y y). In fact,
we must have / () = 0(y 0) and since V = x, y it is not dicult to
prove that the only possible value is () = x(y x), and Fr (()) = y.
Proposition 173 Let V, W be sets and : P(V ) P(W) be an essential
substitution. Then for all P(V ) we have the equality:
Fr (()) = (Fr ())
Proof
Using proposition (139) we obtain the following:
Fr (()) = Fr ( / () )
= Fr ( /() )
valid for /() = ( Fr (/()) )
prop. (139) = (Fr ())
Fr () V = (Fr ())

183
If , : V W are two maps and P(V ), we know from proposition (64)
that () = () if and only if and coincide on Var(). The following
proposition provides the counterpart result for essential substitutions. So we
start from , : P(V ) P(W) two essential substitutions and we consider
the equivalence () () where is the substitution congruence. Note
that there is not much point considering the equality () = () in the case
of essential substitutions, since we know from proposition (167) that both
and can be arbitrarily redened modulo substitution, without aecting their
associated maps , : V W. So we can only focus on () () which
holds if and only if the associated maps , : V W coincide on Fr () :
Proposition 174 Let V, W be sets and , : P(V ) P(W) be two essential
substitutions. Then for all P(V ) we have the equivalence:

|Fr ()
=
|Fr ()
() ()
where denotes the substitution congruence on the algebra P(W).
Proof
From theorem (14) of page 153, () () is equivalent to the equality
/ () = / (). Having assumed and are essential, this is in turn
equivalent to /() = /(). Using proposition (64), this last equal-
ity is equivalent to (u) = (u) for all u Var(/()). Hence we need to
show that this last statement is equivalent to (u) = (u) for all u Fr ().
First we show : so suppose (u) = (u) for all u Var(/()) and let
u Fr (). We need to show that (u) = (u). From proposition (139) we have
Var(/()) V = Fr (). It follows that u Var(/()) V and consequently
we have (u) = (u) = (u) = (u) as requested. So we now prove : So
we assume that (u) = (u) for all u Fr () and consider u Var(/()).
We need to show that (u) = (u). Since

V = V N we shall distinguish two
cases: rst we assume that u N. Then (u) = u = (u). Next we assume that
u V . Then u Var(/()) V = Fr () and (u) = (u) = (u) = (u).
In theorem (15) of page 155 we proved that () () whenever and
: V W is a substitution which is valid for both and . We now provide
an extension of this result for an essential substitution : P(V ) P(W). We
should not confuse the following proposition with proposition (171) relative to
essential substitutions : P(V ) P(V ). If we want to claim that ()
then proposition (171) requires that (u) = u for all u Fr (). This is not the
same thing as claiming () () for which no special conditions on the set
Fr () = Fr () is attached (recall that Fr () = Fr () whenever ). If you
have , then you can move the free variables around as much as you want
with an essential substitution. It will not change the fact that () ().
Proposition 175 Let V, W be sets and : P(V ) P(W) be an essential
substitution. Then for all formulas , P(V ) we have the implication:
() ()
184
where denotes the substitution congruence on P(V ) and P(W).
Proof
So we assume that . We need to show that () (). Using theo-
rem (14) of page 153 it is sucient to show that / () = / (). Having
assumed that is essential, it is sucient to show that /() = /()
which follows immediately from /() = /(), itself a consequence of .

In proposition (154) we showed that a substitution : V W which is valid


for a formula P(V ) will preserve its substitution rank, provided
|Fr ()
is an
injective map. The same is true of an essential substitution : P(V ) P(W).
Proposition 176 Let V, W be sets and : P(V ) P(W) be an essential
substitution. Let P(V ) such that
|Fr ()
is an injective map. Then:
rnk (()) = rnk ()
where rnk ( ) refers to the substitution rank as per denition (149).
Proof
Using proposition (153) we obtain the following:
rnk (()) = rnk (/ ())
essential = rnk ( /())
A: to be proved = rnk (/())
prop. (153) = rnk ()
So it remains to show that rnk ( /()) = rnk (/()). Using proposi-
tion (154) it is sucient to show that is valid for /() and furthermore that
it is injective on Fr (/()). We know that is valid for /() from proposi-
tion (143). We also know that Fr (/()) = Fr () from proposition (139). So
it remains to show that is injective on Fr () V which is clearly the case
since coincide with on V and is by assumption injective on Fr ().
Without the injectivity of on Fr (), the substitution rank is generally not
preserved. However as can be seen below, it can never increase. The following
proposition extends proposition (157) to essential substitutions.
Proposition 177 Let V, W be sets and : P(V ) P(W) be an essential
substitution. Then for all formula P(V ) we have the inequality:
rnk (()) rnk ()
where rnk ( ) refers to the substitution rank as per denition (149).
185
Proof
Using proposition (153) we obtain the following:
rnk (()) = rnk (/ ())
essential = rnk ( /())
prop. (157) rnk (/())
prop. (153) = rnk ()

Given an essential substitution : P(V ) P(W) and P(V ), there is


only so much we can say about (), even if we know about the structure of
. For example, if =
1

2
we cannot claim that () = (
1
) (
2
).
However, as the following proposition will show, we have () (
1
) (
2
)
where is the substitution congruence on P(W). A more interesting example
is the case when = x
1
. We would like to claim that () (x)(
1
).
However, there is little chance of this equivalence being true when (x) happens
to be a free variable of (). So the condition (x) , Fr (()) should be
a minimal requirement. Remember that the map : P(V ) P(W) is an
essential substitution. It is not simply the substitution mapping associated with
a map : V W as per denition (53). If this was the case then the equality
() = (x)(
1
) would hold, and the condition (x) , Fr (()) would be
automatically satised. When : P(V ) P(W) is an essential substitution,
it is very well possible that (x) Fr (()). Let = x(x y) with x ,= y
and : P(V ) P(V ) an essential substitution associated with [y/x]. Then
(x) = y while () = z(z y) for some z ,= y. So (x) Fr (()). The
following proposition will show that the condition (x) , Fr (()) is in fact
sucient to obtain the equivalence () (x)(
1
).
Proposition 178 Let V, W be sets and : P(V ) P(W) be an essential
substitution. Let be the substitution congruence on P(W). Then we have:
P(V ) , () :

= ((x) (y)) if = (x y)
= if =
(
1
) (
2
) if =
1

2
(x)(
1
) if = x
1
, (x) , Fr (())
Proof
First we assume that = (x y) for some x, y V . We need to show that
() = (x) (y). However, any map : V W is valid for . It follows from
proposition (168) that ()

(), where

: P(V ) P(W) is the associated


substitution mapping as per denition (53). From

() = (x) (y) and the


equivalence ()

(), using theorem (12) of page 135 we conclude that


() = (x) (y) as requested. Next we assume that = . Then once
again is valid for and we obtain ()

() = and consequently from


theorem (12) we have () = . So we now assume that =
1

2
for some
186

1
,
2
P(V ). We need to show () (
1
) (
2
). Using theorem (14) of
page 153, we simply compute the minimal transforms:
/ () = /()
= /(
1

2
)
= ( /(
1
) /(
2
) )
= /(
1
) /(
2
)
= / (
1
) / (
2
)
= /( (
1
) (
2
) )
So we now assume that = x
1
and (x) , Fr (()). We need to show that
() (x)(
1
). Likewise, we shall compute minimal transforms:
/ () = /()
= /(x
1
)
n = mink : [k/x] valid for /(
1
) = ( n/(
1
)[n/x] )
= (n) ( /(
1
)[n/x] )
= n [n/x] /(
1
)
A: to be proved = n[n/(x)] /(
1
)
= n[n/(x)] / (
1
)
= n/[(
1
)][n/(x)]
B: to be proved = m/[(
1
)][m/(x)]
m = mink : [k/(x)] valid for /[(
1
)] = /((x)(
1
))
So it remains to justify point A and B. First we deal with point A: it is sucient
to prove the equality [n/x] /(
1
) = [n/(x)] /(
1
), which follows
from lemma (144) and (x) , (Fr ()), itself a consequence of (x) , Fr (())
and proposition (173). We now deal with point B: it is sucient to prove the
equivalence [k/x] valid for /(
1
) [k/(x)] valid for /(
1
) which fol-
lows from lemma (145) and the fact that (x) , (Fr ()).
This last proposition seems to indicate there is nothing we can say about
() if it happens to have (x) as a free variable. Luckily this is not the case:
Proposition 179 Let V, W be sets and : P(V ) P(W) be an essential
substitution. Let = x
1
where x V ,
1
P(V ). There exists an essential
substitution : P(V ) P(W) such that = on V x and (x) , Fr (()).
Furthermore, for any such we have the substitution equivalence:
() (x)(
1
)
187
Proof
We shall rst prove the substitution equivalence. So let : P(V ) P(W)
be an essential substitution which coincide with on V x and such that
(x) , Fr (()). We need to show that () (x)(
1
) where denotes
the substitution congruence on P(W). However, since Fr () V x we see
that and are essential substitutions which coincide on Fr (). It follows
from proposition (174) that () (). Hence it is sucient to prove that
() (x)(
1
). Applying proposition (178) it is sucient to show that
(x) , Fr (()). However, by assumption we have (x) , Fr (()) and so:
(x) , Fr (()) = (Fr ()) = (Fr ()) = Fr (())
where we have used proposition (173). It remains to show that such an essential
substitution : P(V ) P(W) exists. Suppose we have proved that Fr (())
is a proper subset of W. Then there exists y

W such that y

, Fr (()).
Consider the map : V W dened by:
u V , (u) =
_
(u) if u V x
y

if u = x
Then it is clear that coincides with on V x and (x) , Fr (()). In order
to show the existence of : P(V ) P(W) it is sucient to show the existence
of an essential substitution associated with : V W. From theorem (18) of
page 178 it is sucient to show that [W[ is an innite cardinal, or that it is
nite with [V [ [W[. However, this follows immediately from the existence of
the essential substitution : P(V ) P(W) and theorem (18). So it remains
to show that Fr (()) is a proper subset of W. This is clearly true if [W[ is an
innite cardinal. So we may assume that [W[ if nite, in which case we have
[V [ [W[. In particular, [V [ is also a nite cardinal and since x , Fr () we
have [Fr ()[ < [V [. Hence we have the following inequalities:
[Fr (())[ = [(Fr ())[ [Fr ()[ < [V [ [W[
So Fr (()) is indeed a proper subset of W, as requested.
2.4 The Permutation Congruence
2.4.1 The Permutation Congruence
We are still looking for the appropriate congruence which should be dened on
P(V ) so as to identify mathematical statements which are deemed identical.
Until now, we have highlighted the substitution congruence dened in page 126
as a possible source of identity. Unfortunately whatever nal congruence we go
for, it will need to be larger (i.e. weaker) than the substitution congruence. As
mathematicians, we are familiar with the fact that the order of quantication
in a given mathematical statement is immaterial. For example, the formulas
188
= xy(x y) and = yx(x y) should mean the same thing. Yet the
equivalence of and cannot be deduced from the substitution congruence
alone when x ,= y. Indeed, let denote the substitution congruence on P(V )
and suppose that . Dene
1
= y(x y) and
1
= x(x y). Then we
have x
1
y
1
with x ,= y. Using theorem (12) of page 135 it follows that

1

1
[y : x] which is x(x y) x(y x). Using theorem (12) once more
we see that (x y) (y x) which contradicts theorem (12). Permuting the
order of quantication cannot be achieved by substituting variables.
Denition 180 Let V be a set. We call permutation congruence on P(V ) the
congruence on P(V ) generated by the following set R
0
P(V ) P(V ):
R
0
= ( xy
1
, yx
1
) :
1
P(V ) , x, y V
We easily see that the permutation congruence preserves free variables :
Proposition 181 Let denote the permutation congruence on P(V ) where V
is a set. Then for all , P(V ) we have the implication:
Fr () = Fr ()
Proof
Let denote the relation on P(V ) dened by if and only if we have
Fr () = Fr (). Then we need to show that . However, we know from
proposition (77) that is a congruence on P(V ). Since is dened as the
smallest congruence on P(V ) which contains the set R
0
of denition (180), we
simply need to show that R
0
. So let
1
P(V ) and x, y V . We need
to show that xy
1
yx
1
which is Fr (xy
1
) = Fr (yx
1
). This is
clearly the case since both sets are equal to Fr (
1
) x, y.
Suppose
1
P(V ) and x, y, z V . If denotes the permutation congru-
ence on P(V ), then the following equivalence can easily be proved:
xyz
1
zyx
1
(2.43)
The most direct proof probably involves permuting pairs of adjacent variables
until we reach the conguration zyx having started from xyz. More
generally, given x V
n
where n N, given an arbitrary permutation : n n,
we would expect the following equivalence to hold equally:
x(n 1) . . . x(0)
1
x((n 1)) . . . x((0))
1
(2.44)
There are several issues to be addressed with equation (2.44). One of our con-
cerns will be of course to establish its truth. There is also the use of the loose
notation . . . which is arguably questionable in a document dealing with formal
logic. More fundamentally, regardless of whether formula (2.44) is true or false,
it is an ugly formula. Granted we could improve things slightly by writing:
x
n1
. . . x
0

1
x
(n1)
. . . x
(0)

1
(2.45)
189
but this is hardly satisfactory. One of our objectives is therefore to design the
appropriate formalism to remove the ugliness of (2.44). Now when it comes to
establishing its proof, the key idea is the same as with xyz and involves
permuting adjacent variables until we reach the appropriate conguration. So
we shall need the fact that an arbitrary permutation can always be expressed as
a composition of adjacent moves. We provide a proof in the following section.
2.4.2 Integer Permutation
We start with a couple of denitions including that of elementary permutation
which formalizes the idea of adjacent moves.
Denition 182 Let n N we call permutation of order n a bijection : n n.
Denition 183 Let n N, n 2. We call elementary permutation of order n
any permutation : n n of the form = [i : i + 1] for some i n 1.
For the sake of brevity, we allowed ourselves to use the notational shortcut
=
k
. . .
1
in the following proof. This is fair enough. In a way, we are still
doing meta-mathematics with the established standards of usual mathematics.
If we are one day to provide a formal proof of the following lemma, we shall
either need to discard the . . . , or create a high level language where its use is
permitted, and corresponding formulas correctly compiled as low level formulas
of rst order logic with some form of recursive wording.
Lemma 184 Let n N, n 2. Any permutation : n n of order n is a
composition of elementary permutations, i.e. there exist k N

and elementary
permutations
1
, . . . ,
k
such that:
=
k
. . .
1
Proof
We shall prove lemma (184), using an induction argument on n 2. First we
show that the lemma is true for n = 2. So suppose : 2 2 is a permutation
of order 2. We shall distinguish two cases. First we assume that = [0 : 1].
Then is an elementary permutation and there is nothing else to prove. We
now assume that is the identity. Then can be expressed as = [0: 1] [0: 1].
Having considered the only two possible cases, we conclude that lemma (184)
is true when n = 2. We now assume that lemma (184) is true for n 2. We
need to show that it is true for n + 1. So we assume that : (n + 1) (n +1)
is a permutation of order n + 1. We need to show that can be expressed
as a composition of elementary permutations. We shall distinguish two cases.
First we assume that (n) = n. Then the restriction
|n
is easily seen to be
a permutation of order n. Indeed, it is clearly an injective map
|n
: n n
which is also surjective. Using the induction hypothesis, there exist k N

and
elementary permutations
1
, . . . ,
k
of order n such that:

|n
=
k
. . .
1
190
For all j 1, . . . , k consider the extension

j
of
j
on n + 1 dened by
(

j
)
|n
=
j
and

j
(n) = n. Having assumed that (n) = n, we obtain:
=

k
. . .

1
It is therefore sucient to show that each

j
is an elementary permutation of
order n + 1. So let j 1, . . . , k. Since
j
is an elementary permutation of
order n, there exists i n 1 such that
j
= [i : i + 1] (of order n). It follows
immediately that

j
= [i : i +1] (or order n+1). So

j
is indeed an elementary
permutation of order n +1. This completes our proof of lemma (184) for n +1
in the case when (n) = n. We now assume that (n) ,= n. Then (n) n.
In other words, there exists j n such that (n) = n 1 j. We shall prove
by induction on j n that can be expressed as a composition of elementary
permutations of order n+1. Note that in order to do so, it is sucient to prove
the existence of m N

and elementary permutations

1
, . . . ,

m
such that:

m
. . .

1
(n) = n
Indeed if that is the case, having proved lemma (184) for n+1 in the case when
(n) = n, we deduce the existence of k N

and elementary permutations

1
, . . . ,
k
of order n + 1 such that:

m
. . .

1
=
k
. . .
1
and since
1
= for every elementary permutation , we conclude that:
=

1
. . .

m

k
. . .
1
So we need to show the existence of m N

and elementary permutations

1
, . . . ,

m
such that

m
. . .

1
(n) = n, and we shall do so by induction on
j n such that (n) = n1j. First we assume that j = 0. Then (n) = n1
and it is clear that [ n1 : n] (n) = n. So the property is true for j = 0. We
now assume that the property is true for j (n 1) and we need to show that
it is true for j + 1. So we assume that (n) = n 1 (j + 1). Then we have:
[ n 1 (j + 1) : n 1 j ] (n) = n 1 j
Having assumed the property is true for j, there exist m N

and elementary
permutations

1
, . . . ,

m
such that:

m
. . .

1
[ n 1 (j + 1) : n 1 j ] (n) = n
Hence we see that the property is also true for j + 1.
Having established that every permutation : n n is a composition of
elementary permutations for n 2, our main objective is to design a proof of :
x
n1
. . . x
0

1
x
(n1)
. . . x
(0)

1
(2.46)
191
However, we would also like to do so while at the same time creating a more
condensed formalism. In the next section, we shall dene the notion of iterated
quantication, allowing us to replace the expression x
n1
. . . x
0

1
by a simple
x
1
with x V
n
. Before we do so we shall introduce a simple equivalence
relation on V
n
so as to identify x V
n
with y = x . This will allow us to
replace the statement of the equivalence (2.46) with the following implication:
x y x
1
y
1
Denition 185 Let V be a set, n N and x, y V
n
. We say that x is
permutation equivalent to y and we write x y if and only if there exists a
permutation : n n of order n such that y = x .
Proposition 186 Let V be a set and n N. Let denote the permutation
equivalence on V
n
. Then is an equivalence relation on V
n
.
Proof
We need to show that is a reexive, symmetric and transitive relation on V
n
.
First we show that is reexive. So suppose x V
n
. We need to show that
x x. Let : n n be the identity permutation. Then we have x = x .
Note that if n = 0, then x = 0 and = 0 and the equality x = x is still
true. This shows that x x. We now show that is symmetric. So suppose
x, y V
n
and x y. We need to show that y x. By assumption, there
exists a permutation : n n such that y = x . Composing to the right
by the inverse permutation
1
: n n we obtain y
1
= x and it follows
that y x. We now show that is transitive. So suppose x, y, z V
n
are
such that x y and y z. We need to show that x z. However, there exist
permutations : n n and : n n such that y = x and z = y . It
follows that z = x ( ) and nally x z.
2.4.3 Iterated Quantication
The following denition allows us to replace the formalism x(n1) . . . x(0)
with a more condensed expression x. In doing so, we are overloading the
symbol by creating a new unary operator x : P(V ) P(V ) for all x V
n
and n N. This operator coincides with the standard operator x when x V .
Note that when n = 0, the only possible element x V
n
is x = 0 and the
operator 0 : P(V ) P(V ) is simply the identity. There is obviously a small
risk of notational confusion when seeing 0 especially in the context of minimal
transforms. Overall, we do not think this risk is high enough to warrant the
introduction of a dierent symbol for iterated quantication. Given n N
and x V
n
we dene x in terms of the expression x(n 1) . . . x(0) . In
case this is not clear, this is simply a notational shortcut for the more formal
denition by recursion 0 = and if x V
n+1
, x = x(n)x
|n
.
192
Denition 187 Let V be a set, n N and x V
n
. For all P(V ) we call
iterated quantication of by x the element x of P(V ) dened by:
x = x(n 1) . . . x(0)
where it is understood that if n = 0, we have 0 = .
Given any congruence relation on P(V ), we have x x whenever
and x V . Obviously this property should extend to iterated quantication:
Proposition 188 Let V be a set, n N and x V
n
. Let be an arbitrary
congruent relation on P(V ). Then for all , P(V ) we have:
x x
Proof
We shall prove this result by induction on n N. First we assume n = 0. Then
V
n
= 0 and x = 0. By convention the corresponding iterated quantications
are dened as 0 = and 0 = . So the property is clearly true. Next we
assume that the property is true for n N. We need to show that it is true
for n + 1. So we assume that and x V
n+1
. We need to show that
x x. However, we have x
|n
V
n
and from the induction hypothesis we
obtain x
|n
x
|n
. Since is a congruent relation on P(V ) we have:
x = x(n)x
|n
x(n)x
|n
= x
So we conclude that the property is true for n + 1.
We are now able to quote and prove what we set out from the beginning. If
two iterated quantication operators are equivalent modulo some permutation,
then the corresponding formulas are permutation equivalent. This all seems
pretty obvious, but it has to be proved one way or another:
Proposition 189 Let V be a set, n N and x, y V
n
. Let denote the
permutation congruence on P(V ). Then for all P(V ) we have:
x y x y
where x y refers to the permutation equivalence on V
n
.
Proof
We shall distinguish three cases. First we assume that n = 0. Then x = y = 0
and the result is clear. Next we assume that n = 1. Then x y implies that
x = y and the result is also clear. We now assume that n 2 and x y. We
need to show that x y. From x y there exists a permutation : n n
such that y = x . We shall distinguish two cases. First we assume that
is an elementary permutation, namely that there exists i n 1 such that
= [i : i + 1] (of order n). We shall prove that x y using an induction
193
argument on n 2. First we assume that n = 2. Then we must have i = 0 and
= [0: 1] (of order 2). Hence from y = x we obtain:
y = y(1)y(0) = x(0)x(1)
Comparing with x = x(1)x(0), it is clear that the ordered pair (x, y)
belongs to the generator R
0
of the permutation congruence as per denition (180).
In particular we have x y which completes our proof in the case when
is elementary and n = 2. We now assume that the property is true for
elementary and n 2. We need to show that it is also true for elementary
and n + 1. So we assume the existence of i n such that = [i : i + 1] (of
order n + 1). We need to show that x y. We shall distinguish two
cases. First we assume that i = n 1 which is the highest possible value when
i n. Then for all j n 1 we have j ,= i and j ,= i + 1 and consequently
y(j) = x (j) = x [i : i +1](j) = x(j). Hence we see that x
|n1
= y
|n1
, and:
y = y(n) y(n 1) y
|n1
= x(n 1) x(n) x
|n1

while we have x = x(n) x(n1) x


|n1
. Setting u = x(n) and v = x(n1)
with
1
= x
|n1
, it follows that x = uv
1
and y = vu
1
. Hence
we see that the ordered pair (x, y) belongs to the generator R
0
of the
permutation congruence as per denition (180). In particular x y. We
now assume that i n 1. In particular we have i + 1 ,= n and i ,= n and:
y(n) = x (n) = x [i : i + 1](n) = x(n)
It follows that y = y(n)y
|n
= x(n)y
|n
while x = x(n)x
|n
.
Hence, we see that in order to showthat x y, the permutation congruence
being a congruent relation on P(V ), it is sucient to prove that x
|n
y
|n
.
This follows immediately from our induction hypothesis and the fact that:
y
|n
= x
|n
= x [i : i + 1]
|n
= x
|n
[i : i + 1]
where it is understood that the second occurrence of [i : i +1] in this equation
refers to the elementary permutation of order n (rather than n +1) dened for
i n 1. This completes our induction argument and we have proved that
x y in the case when y = x with elementary. We now assume
that : n n is an arbitrary permutation of order n and we need to show
that x y. However since n 2 it follows from lemma (184) that can
be expressed as a composition of elementary permutations of order n. In other
words, there exist k N

and elementary permutations


1
, . . . ,
k
such that:
=
k
. . .
1
We shall prove that x y by induction on the integer k N

. If k = 1
then is an elementary permutation and we have already proved that the result
is true. We now assume that the result is true for k 1 and we need to show
it is also true for k + 1. So we assume that:
=
k+1

k
. . .
1
194
Dening

=
k+1
. . .
2
we obtain =


1
. It follows that:
y = x = x


1
= y


1
where we have put y

= x

. Now since
1
is an elementary permutation
and y = y


1
, we have already proved that y y

. Furthermore, since
y

= x

and

is a composition of k elementary permutations, it follows


from our induction hypothesis that y

x. By transitivity of the permu-


tation congruence, we conclude that x y.
2.4.4 Irreducible Formula
Having proved proposition (189) our next objective is to establish some char-
acterization theorem for the permutation congruence, similar to theorem (12)
of page 135 which was established for the substitution congruence. Suppose
= uvw((u v) (v w)). We know from theorem (2) of page 21 that
the representation = x
1
is unique: clearly
1
= vw((u v) (v w))
and x = u. However, if we introduce iterated quantication as we have done,
the representation = x
1
with x V
n
and n N is not unique. However,
such representation must be unique if we impose that
1
does not start as a
quantication. For lack of a better word, we shall say that
1
is irreducible
to indicate that no further quantication can be removed at the top level. Al-
though we are not very happy with the chosen terminology (irreducible is a big
word in algebra), this is simply a temporary notion soon to be forgotten. It will
allow us to prove the uniqueness we require for our characterization theorem.
Denition 190 Let V be a set and P(V ). We say that is irreducible if
and only if one of the following is the case:
(i) P
0
(V )
(ii) =
(iii) =
1

2
,
1
P(V ) ,
2
P(V )
Proposition 191 Let V be a set and P(V ). Then can be uniquely
represented as = x
1
where x V
n
, n N,
1
P(V ) and
1
is irreducible.
Proof
Given P(V ) we need to show the existence of n N, x V
n
and
1
irreducible such that = x
1
. Furthermore, we need to show that this rep-
resentation is unique, namely that if m N, y V
m
,
1
is irreducible and
= y
1
, then we have x = y and
1
=
1
. Note that x = y will automatically
imply that n = m, as identical maps have identical domains. We shall carry out
the proof by structural induction using theorem (3) of page 32. Since P
0
(V ) is
a generator of P(V ), we rst check that the property is true for P
0
(V ). So
suppose P
0
(V ). We take n = 0, x = 0 V
n
and
1
= . Then
1
is irre-
ducible and we have = x
1
which shows the existence of the representation.
195
Suppose now that = y
1
for some y V
m
, m N and
1
irreducible. Since
P
0
(V ), from theorem (2) of page 21, cannot be a quantication. It follows
that m = 0, y = 0 and
1
= = 0
1
=
1
. This proves the uniqueness of the
representation, and the property is proved for P
0
(V ). We now assume that
= or indeed that =
1

2
. Then is irreducible and taking n = 0,
x = 0 and
1
= we obtain = x
1
. Once again from theorem (2), cannot
be a quantication and it follows that this representation is unique. It remains
to show that the property is true in the case when = z for some z V , and
the property is true for . First we show the existence of the representation.
Having assumed that the property is true for , there exists n N, x V
n
and

1
irreducible such that = x
1
. Consider the map x

: (n + 1) V dened
by x

|n
= x and x

(n) = z. Taking
1
=
1
we obtain:
x

1
= x

(n)x

|n

1
= zx
1
= z =
which shows the existence of the representation. We now prove the uniqueness.
So suppose m N, y V
m
,
1
is irreducible and = y
1
. We need to show
that y = x

and
1
=
1
. From = y
1
we obtain z = y
1
. It follows
that y
1
is a quantication and from theorem (2) it cannot be irreducible.
This shows that m 1, and consequently m = p + 1 for some p N. Hence:
z = = y
1
= y(p)y
|p

1
Applying theorem (2) once more, we obtain z = y(p) and = y
|p

1
. Having
assumed the property is true for , its representation = x
1
is unique. It
follows that y
|p
= x and
1
=
1
. So we have proved that
1
=
1
. Furthermore
from y
|p
= x it follows that p = n and m = n+1. So y is a map y : (n+1) V
such that y
|n
= x and y(n) = z. We conclude that y = x

.
2.4.5 Characterization of the Permutation Congruence
We are now able to confront our next objective, that of establishing a charac-
terization theorem for the permutation congruence, similar to theorem (12) of
page 135. This type of theorem has proved very useful in the case of the substi-
tution congruence. Whenever it is important to be able to say something
about the common structure of and . For example, if =
1

2
, we
want the ability to claim that is also of the form =
1

2
. An example
of this is proposition (155) establishing the formula between substitution ranks
rnk () = max( [Fr ()[ , rnk (
1
) , rnk (
2
) ) when =
1

2
. The strategy
to prove our characterization theorem (19) below will follow very closely that of
theorem (12). We shall not try to be clever: we dene a new binary relation on
P(V ) which constitutes our best guess of what an appropriate characterization
should look like. We then painstakingly prove that this new relation has all
the desired properties: it contains the generator of the permutation congruence,
it is reexive, symmetric and transitive. It is a congruent relation, hence a
congruence on P(V ) which is in fact equivalent to the permutation congruence.
196
Denition 192 Let be the permutation congruence on P(V ) where V is a
set. Let , P(V ). We say that is almost permutation equivalent to and
we write , if and only if one of the following is the case:
(i) P
0
(V ) , P
0
(V ) , and =
(ii) = and =
(iii) =
1

2
, =
1

2
,
1

1
and
2

2
(iv) = x
1
, = y
1
,
1

1
, x y V
n
, n 1
where it is understood that
1
and
1
are irreducible in (iv), and x y refers
to the permutation equivalence on V
n
as per denition (185).
Proposition 193 (i), (ii), (iii), (iv) of denition (192) are mutually exclusive.
Proof
This is an immediate consequence of theorem (2) of page 21 applied to the free
universal algebra P(V ) with free generator P
0
(V ), where a formula P(V )
is either an element of P
0
(V ), or the contradiction constant = , or an impli-
cation =
1

2
, or a quantication = x
1
, but cannot be equal to any
two of those things simultaneously. Note that we are requesting that n 1 in
(iv) of denition (192), so = x
1
and = y
1
are indeed quantications
when (iv) is the case.
Proposition 194 Let be the almost permutation equivalence on P(V ) where
V is a set. Then contains the generator R
0
of denition (180).
Proof
Let x, y V and
1
P(V ). Let = xy
1
and = yx
1
. We need to
show that . From proposition (191), there exist n N, z V
n
and
1
irreducible such that
1
= z
1
. Consider the map u : (n + 2) V dened by
u
|n
= z, u(n) = y and u(n + 1) = x. Then we have:
= xy
1
= u(n + 1) u(n) u
|n

1
= u(n + 1) u
|n+1

1
= u
1
Consider the map v : (n + 2) V dened by v = u [n : n + 1]. Then v
|n
= z,
v(n) = x and v(n + 1) = y and it follows similarly that = yx
1
= v
1
.
Hence, we have found
1
irreducible, u, v V
n+2
with u v such that = u
1
and = v
1
. From (iv) of denition (192) we conclude that .
Proposition 195 Let be the almost permutation equivalence on P(V ) where
V is a set. Then is a reexive relation on P(V ).
Proof
Let P(V ). We need to show that . From theorem (2) of page 21
we know that is either an element of P
0
(V ), or = or =
1

2
197
or = x
1
for some
1
,
2
P(V ) and x V . We shall consider these
four mutually exclusive cases separately. Suppose rst that P
0
(V ): then
from = we obtain . Suppose next that = : then it is clear that
. Suppose now that =
1

2
. Since the permutation congruence is
reexive, we have
1

1
and
2

2
. It follows from (iii) of denition (192)
that . Suppose nally that = x
1
. From proposition (191), there
exist n N, z V
n
and
1
irreducible such that
1
= z
1
. Consider the map
u : (n + 1) V dened by u
|n
= z and u(n) = x. Then we have:
= x
1
= xz
1
= u(n) u
|n

1
= u
1
Since
1
is irreducible,
1

1
and u u, we conclude from (iv) of deni-
tion (192) that .
Proposition 196 Let be the almost permutation equivalence on P(V ) where
V is a set. Then is a symmetric relation on P(V ).
Proof
Let , P(V ) be such that . We need to show that . We shall
consider the four possible cases of denition (192): suppose rst that P
0
(V ),
P
0
(V ) and = . Then it is clear that . Suppose next that =
and = . Then we also have . We now assume that =
1

2
and
=
1

2
with
1

1
and
2

2
. Since the permutation congruence on
P(V ) is symmetric, we have
1

1
and
2

2
. Hence we have . We
now assume that = x
1
and = y
1
with
1

1
and
1
,
1
irreducible,
x y V
n
and n 1. Then once again by symmetry of the permutation
congruence we have
1

1
and furthermore y x. It follows that .
Proposition 197 Let be the almost permutation equivalence on P(V ) where
V is a set. Then is a transitive relation on P(V ).
Proof
Let , and P(V ) be such that and . We need to show that
. We shall consider the four possible cases of denition (192) in relation
to . Suppose rst that , P
0
(V ) and = . Then from we
obtain , P
0
(V ) and = . It follows that , P
0
(V ) and = . Hence
we see that . We now assume that = = . Then from we
obtain = = . It follows that = = and consequently . We now
assume that =
1

2
and =
1

2
with
1

1
and
2

2
. From
we obtain =
1

2
with
1

1
and
2

2
. The permutation
congruence being transitive, it follows that
1

1
and
2

2
. Hence we see
that . We now assume that = x
1
and = y
1
with
1

1
, and

1
,
1
irreducible, x y V
n
and n 1. Since n 1, from only the
case (iv) of denition (192) is possible. Furthermore from proposition (191), the
representation = y
1
with
1
irreducible is unique. It follows that = z
1
198
with
1

1
irreducible and y z V
n
. The permutation congruence being
transitive, we obtain
1

1
. Furthermore, also by transitivity we obtain x z
and we conclude that .
Proposition 198 Let be the almost permutation equivalence and be the
permutation congruence on P(V ), where V is a set. For all , P(V ):

Proof
Let , P(V ) such that . We need to show that . We shall con-
sider the four possible cases of denition (192) in relation to . Suppose
rst that = P
0
(V ). From the reexivity of the permutation congruence,
it is clear that . Suppose next that = = . Then we also have
. We now assume that =
1

2
and =
1

2
where
1

1
and
2

2
. The permutation congruence being a congruent relation on P(V ),
we obtain . Next we assume that = x
1
and = y
1
where
1

1
and
1
,
1
are irreducible, x y V
n
and n 1. The permutation congruence
being a congruent relation on P(V ), from proposition (188) and
1

1
we
obtain y
1
y
1
. Hence by transitivity of the permutation congruence, it
remains to show that x
1
y
1
which follows immediately from x y and
proposition (189).
Proposition 199 Let be the almost permutation equivalence on P(V ) where
V is a set. Then is a congruent relation on P(V ).
Proof
From proposition (195), the almost permutation equivalence is reexive and
so . We now assume that =
1

2
and =
1

2
where
1

1
and
2

2
. We need to show that . However from proposition (198) we
have
1

1
and
2

2
and it follows from denition (192) that . We
now assume that = x
1
and = x
1
where
1

1
and x V . We need
to show that . Once again from proposition (198) we have
1

1
. We
shall now distinguish two cases in relation to
1

1
. First we assume that (i)
(ii) or (iii) of denition (192) is the case. Then both
1
and
1
are irreducible.
Dening x

: 1 V by setting x

(0) = x we obtain:
= x
1
= x

(0)
1
= x

1
and similarly = x

1
. From
1
,
1
irreducible and
1

1
we conclude that
. We now assume that (iv) of denition (192) is the case. Then
1
= u

1
and
1
= v

1
where

1
and

1
,

1
are irreducible, u v V
n
and
n 1. Dening u

: (n + 1) V by setting u

|n
= u and u

(n) = x we obtain:
= x
1
= xu

1
= u

(n)u

|n

1
= u

1
199
and similarly = v

1
where v : (n + 1) V is dened by v

|n
= v and
v

(n) = x. Since

1
,

1
are irreducible and

1
, in order to show that
it is sucient to prove that u

. However since u v, there


exists a permutation : n n of order n such that v = u . Dening

: (n + 1) (n + 1) by

|n
= and

(n) = n we obtain a permutation of


order n + 1 such that v

= u

.
Proposition 200 Let be the almost permutation equivalence on P(V ) where
V is a set. Then is a congruence on P(V ).
Proof
We need to show that is reexive, symmetric, transitive and that it is a con-
gruent relation on P(V ). From proposition (195), the relation is reexive.
From proposition (196) it is symmetric while from proposition (197) it is tran-
sitive. Finally from proposition (199) the relation is a congruent relation.
Proposition 201 Let be the almost permutation equivalence and be the
permutation congruence on P(V ), where V is a set. For all , P(V ):

Proof
From proposition (198) it is sucient to show the implication or equivalently
the inclusion . Since is the permutation congruence on P(V ), it is the
smallest congruence on P(V ) which contains the set R
0
of denition (180). In
order to show the inclusion it is therefore sucient to show that is a
congruence on P(V ) such that R
0
. The fact that it is a congruence stems
from proposition (200). The fact that R
0
follows from proposition (194).
Recall that irreducible formulas are dened in page 195. We can now forget
about the almost permutation equivalence and simply remember the following
characterization of the permutation congruence on P(V ) :
Theorem 19 Let be the permutation congruence on P(V ) where V is a set.
For all , P(V ), if and only if one of the following is the case:
(i) P
0
(V ) , P
0
(V ) , and =
(ii) = and =
(iii) =
1

2
, =
1

2
,
1

1
and
2

2
(iv) = x
1
, = y
1
,
1

1
, x y V
n
, n 1
where x y refers to the permutation equivalence on V
n
as per denition (185).
Furthermore, we may assume that
1
and
1
are irreducible in (iv).
Proof
First we show the implication . So we assume that . We need to show
200
that (i), (ii), (iii) or (iv) is the case. However from proposition (201) we have
. It follows from denition (192) that (i), (ii), (iii) or (iv) is indeed the
case with
1
and
1
irreducible in the case of (iv). We now show the reverse
implication . So we assume that (i), (ii), (iii) or (iv) is the case. We need
to show that . We shall distinguish two cases. First we assume that
(i), (ii) or (iii) is the case. Then from denition (192) we obtain and
consequently from proposition (201) we have . We now assume that (iv)
is the case. Then from proposition (188) and
1

1
we obtain x
1
x
1
.
Furthermore, from proposition (189) and x y we obtain x
1
y
1
. By
transitivity of the permutation congruence, we conclude that:
= x
1
y
1
=

2.5 The Absorption Congruence


2.5.1 The Absorption Congruence
After the substitution and permutation congruence, following our pursuit of the
right congruence on P(V ) which should be dened to identify mathematical
statements which have the same meaning, a third source of identity will be
studied in this section. As mathematicians, it is not uncommon for us to regard
a mathematical statement
1
as being exactly the same as x
1
whenever x is
not a free variable of
1
. We shall dene the absorption congruence on P(V ) as
being generated by the ordered pairs (
1
, x
1
) where x , Fr (
1
). Note that
whichever nal congruence we wish to adopt to dene the universal algebra of
rst order logic, it is not completely obvious that such congruence should contain
the absorption congruence: on the one hand it is appealing to say that and
x are the same mathematical statements. On the other hand, a consequence
of doing so will be that and x will also be identical statements, after we
include some propositional equivalence, and dene and x in the obvious
way. This state of aairs is somewhat unsatisfactory: The statement does
not say anything, contrary to the statement x which expresses the existence
of something. If we are to use the universal algebra of rst order logic to study
axiomatic set theory at a later stage, we would rather have the existence of at
least one set guaranteed by the axioms of ZF, rather than being embedded in
the language being used. The universe being void is a logical possibility.
Denition 202 Let V be a set. We call absorption congruence on P(V ), the
congruence on P(V ) generated by the following set R
0
P(V ) P(V ):
R
0
= (
1
, x
1
) : x , Fr (
1
) ,
1
P(V ) , x V
Just like the substitution and the permutation congruence, the absorption
congruence preserves free variables. The following proposition is similar to
201
propositions (118) and (181) and simply involves checking the property on R
0
:
Proposition 203 Let denote the absorption congruence on P(V ) where V
is a set. Then for all , P(V ) we have the implication:
Fr () = Fr ()
Proof
Let denote the relation on P(V ) dened by if and only if we have
Fr () = Fr (). Then we need to show that . However, we know from
proposition (77) that is a congruence on P(V ). Since is dened as the
smallest congruence on P(V ) which contains the set R
0
of denition (202), we
simply need to show that R
0
. So let
1
P(V ) and x V such that
x , Fr (
1
). We need to show that
1
x
1
which is Fr (
1
) = Fr (x
1
).
Since Fr (x
1
) = Fr (
1
) x, the result follows from x , Fr (
1
).
When is the substitution or permutation congruence and , P(V )
are such that , from the structural form of we are able to infer the
structural form of thanks to the characterizations of theorem (12) of page 135
and theorem (19) of page 200. For example, when =
1

2
, we are able
to tell that must be of the form =
1

2
which can be very useful
when dealing with structural induction arguments. So we would like to obtain a
similar result for the absorption congruence, and we shall do so in theorem (20)
of page 210. However, the case of the absorption congruence is more complicated
than any other in this respect, as it does not preserve the structure of formulas.
So for example if
1

1
and
2

2
then the formula =
1

2
and
=
1

2
must be equivalent simply because is a congruent relation.
But from the equivalence and the knowledge that =
1

2
, it is
impossible for us to conclude that =
1

2
with
1

1
and
2

2
. This
is because we can always add quantiers x, yx and zyx in front of the
formula with x, y, z , Fr (), while preserving the equivalence class of . So
the situation appears hopeless, and one natural instinct is to give up attempting
to provide a characterization theorem for the absorption congruence, in line with
theorem (12) and theorem (19). However, it may be that the equivalence
allows us to say something about and nonetheless, something which may
be a weaker statement than initially hoped for, but still better than nothing at
all. Consider the formula P(V ): It may have a few pointless quantiers
coming at the front. In other words, it may be of the form = zyx

with
x, y, z , Fr (

). So let us assume that these pointless quantications at the


front have been removed and we are left with the formula

. In particular, we
assume that any quantier coming at the front of

is meaningful, in the sense


that if

= x
1
then x Fr (
1
). Now consider the formula : we can also
remove the rst layer of meaningless quantication and retain the core

of
the formula . It is clear that we have

and

. So whenever the
equivalence arises, we accept the inescapable fact that nothing can be
said about the relative structures of and as these are being obfuscated by the
202
presence of pointless quantications at the front. But it may be that something
can be said about the relative structures of

and

. In other words, after


we remove the rst layer of meaningless quantications, the remaining core
formulas may have a common structure. This is indeed the case, as will be
seen from theorem (20) of page 210. For now, we shall concentrate on giving
a precise denition of the core formula

obtained from after removing the


rst layer of pointless quantications. We shall prove in proposition (206) below
that any formula can be uniquely expressed as = u

with u V
n
, n N
and additional properties formalizing the idea that the iterated quantication
u of denition (187) is pointless while any remaining quantier at the front of

is meaningful. This allows us to put forward the denition:


Denition 204 Let V be a set and P(V ). We call = u

of proposi-
tion (206) the core decomposition of , and we call

the core of .
Before we prove proposition (206) we shall establish the following lemma:
Lemma 205 Let n N, u V
n
and

P(V ). The following are equivalent:


(i) u(k) , Fr (

) , for all k n
(ii) u(k) , Fr ( u
|k

) , for all k n
Proof
In order to show the equivalence between (i) and (ii), it is sucient to prove that
if (i) or (ii) is satised, then the property (k n) Fr (

) = Fr ( u
|k

) is
true. Recall that given u V
n
, u is a map u : n V and u
|k
is the restriction
of u to k n. When k = 0, from denition (187) of the iterated quantication
we have Fr ( u
|k

) = Fr ( 0

) = Fr (

) so the property is true, regardless


of whether (i) or (ii) holds. So suppose the property is true for k N. We need
to show it is also true for k + 1. So we assume that (k + 1) n. We need to
show the equality Fr (

) = Fr ( u
|(k+1)

). Assuming (i) is true:


Fr ( u
|(k+1)

) = Fr ( u(k)u
|k

)
= Fr ( u
|k

) u(k)
Induction hypothesis = Fr (

) u(k)
(i) = Fr (

)
If we assume that (ii) is true, then the argument goes as follows:
Fr ( u
|(k+1)

) = Fr ( u(k)u
|k

)
= Fr ( u
|k

) u(k)
(ii) = Fr ( u
|k

)
Induction hypothesis = Fr (

)
203

In proposition (206) below, we chose to write u(k) , Fr ( u


|k

) in (i)
rather than the simpler u(k) , Fr (

). This will make some of the formal


proofs smoother. By virtue of lemma (205) the two are equivalent.
Proposition 206 Let V be a set and P(V ). Then can be uniquely
represented as = u

where u V
n
, n N,

P(V ) with the property:


(i) u(k) , Fr ( u
|k

) , for all k n
(ii)

= z z Fr ()
where it is understood that (ii) holds for all z V and P(V ).
Proof
We shall prove this property with a structural induction using theorem (3) of
page 32. First we assume that = (x y) for some x, y V . Take

= ,
n = 0 and u = 0. We claim that u

is a core decomposition of . From


denition (187) of the iterated quantication, it is clear that = u

. Since
n = 0, property (i) of proposition (206) is vacuously satised. Furthermore since

= (x y), from theorem (2) of page 21 we see that

cannot be expressed
as

= z where z V and P(V ). It follows that (ii) is also vacuously


satised. So we have proved the existence of the core decomposition = u

.
We now show the uniqueness: so suppose = u

for some u V
n
, n N
and

P(V ). We need to show that n = 0, u = 0 and

= . So suppose to
the contrary that n > 0. Then we obtain the following equality:
(x y) = = u

= u(n 1)u
|(n1)

which contradict the uniqueness property of theorem (2). It follows that n = 0


and from u V
n
= 0 we obtain u = 0 and nally = u

. So we
now assume that = . Once again, taking

= , n = 0 and u = 0 we
obtain a core decomposition u

of . Furthermore if we assume = u

for some u V
n
, n N and

P(V ), then n = 0 follows immediately from


theorem (2) and consequently u = 0 and

= . So we now assume that


=
1

2
for some
1
,
2
P(V ) which satisfy our property. We need to
show the same is true of . Take

= , n = 0 and u = 0. We obtain a core


decomposition u

of which is unique by virtue of theorem (2). The fact that

1
and
2
are themselves uniquely decomposable is irrelevant here. So we now
assume that = x
1
for some x V and
1
P(V ) satisfying our property.
We need to show the same is true of . We shall distinguish two cases: rst we
assume that x Fr (
1
). Take

= , n = 0 and u = 0. We claim that u

is a core decomposition of . Since = u

and (i) is vacuously satised, we


simply need to check that (ii) is true. So suppose

= z for some z V and


P(V ). We need to show that z Fr (). However the equality

= z
implies that x
1
= z and consequently from theorem (2) we have x = z and

1
= . It follows that z Fr () from the assumption x Fr (
1
). We now
204
prove the uniqueness of the core decomposition in the case when x Fr (
1
).
So suppose = u

for some u V
n
, n N and

P(V ) satisfying (i) and


(ii). We need to show that n = 0, u = 0 and

= . Suppose to the contrary


that n > 0. Then we obtain the following equality:
x
1
= = u

= u(n 1)u
|(n1)

Using theorem (2) we obtain x = u(n 1) and


1
= u
|(n1)

. Applying
property (i) to k = (n 1) n we obtain x , Fr (
1
) which contradicts
our assumption. We now assume that x , Fr (
1
). First we show that a
core decomposition exists: from our induction hypothesis there exists a core
decomposition
1
= u
1

1
of
1
, where u
1
V
n1
, n
1
N and

1
P(V )
satisfying (i) and (ii). Take

1
, n = n
1
+ 1 and u V
n
dened by
u
|(n1)
= u
1
and u(n 1) = x. We claim u

is a core decomposition of :
= x
1
= xu
1

1
= u(n 1)u
|(n1)

= u

So it remains to check that (i) and (ii) are satised. First we show (i). So
let k n. We need to show that u(k) , Fr (u
|k

). We shall distinguish
two cases. First we assume that k = n 1. Then we need to show that
x , Fr (u
1

1
) = Fr (
1
) which is true by assumption. Next we assume that
k (n 1). Then we need to show that u
1
(k) , Fr ((u
1
)
|k

1
) which is true
by virtue of property (i) applied to the core decomposition
1
= u
1

1
. We
now show property (ii). So we assume that

= z for some z V and


P(V ). We need to show that z Fr (). Since

1
, this follows from
property (ii) applied to the core decomposition
1
= u
1

1
. So we have proved
that u

is indeed a core decomposition of in the case when x , Fr (


1
). It
remains to show the uniqueness. So we assume that = u

for some u V
n
,
n N and

P(V ) satisfying (i) and (ii). We need to show that n = n


1
+1,
u(n 1) = x, u
|(n1)
= u
1
and

1
. First we show that n > 0. Indeed
if n = 0 then u = 0 and consequently we obtain

= u

= = x
1
. It
follows from property (ii) that x Fr (
1
) which contradicts our assumption.
Having established that n > 0 we obtain the equality:
u(n 1)u
|(n1)

= u

= = x
1
= xu
1

1
Using theorem (2) we see that x = u(n 1) as requested and furthermore:
u
|(n1)

=
1
= u
1

1
From our induction hypothesis and the uniqueness of the core decomposition

1
= u
1

1
, in order to show that u
|(n1)
= u
1
and

1
it is sucient
to prove that u
|(n1)

is a core decomposition of
1
. So we simply need to
check that (i) and (ii) are satised, which is clearly the case since (i) and (ii)
205
are assumed to be true for the decomposition u

. Having established the


equality between the maps u
|(n1)
and u
1
, these must have identical domain
and consequently n = n
1
+ 1 as requested, which completes our induction.
We shall now formally check that the core of is equivalent to modulo
the absorption congruence, which is what we expect. Removing the rst layer
of pointless quantication does not aect the equivalence class of :
Proposition 207 Let be the absorption congruence on P(V ) where V is a
set. Let P(V ) with core

P(V ). Then we have the equivalence:


Proof
We shall prove the equivalence with an induction argument over n N, where
n is the integer underlying the core decomposition = u

where u V
n
and

P(V ) satisfying (i) and (ii) of proposition (206). First we assume


that n = 0. Then u = 0 and =

so the equivalence is clear. Next we


assume that n > 0 and the equivalence is true for n 1. We need to show that
u

. However, since u

satisfy (i) and (ii) of proposition (206), the


same can be said of u
|(n1)

which is therefore a core decomposition itself.


From our induction hypothesis we obtain u
|(n1)

. It is therefore su-
cient to prove that u(n1)u
|(n1)

u
|(n1)

which follows immediately


from u(n 1) , Fr (u
|(n1)

) and denition (202).


Lemma 208 Let V be a set and P(V ) with core decomposition = u

where u V
n
. Let x V with x , Fr (). Then x has core decomposition:
x = v

where v V
n+1
is dened by the equalities v(n) = x and v
|n
= u.
Proof
We assume that u

is the core decomposition of where u V


n
. Let x V
with x , Fr (). Let v V
n+1
be dened by v(n) = x and v
|n
= u. We need to
show that v

is the core decomposition of x. We have:


v

= v(n)v
|n

= xu

= x
So it remains to show that (i) and (ii) of proposition (206) are satised. First
we show (i). So let k (n + 1). We need to show that v(k) , Fr (v
|k

). We
shall distinguish two cases: rst we assume that k = n. Then we need to show
that x , Fr (u

) = Fr () which is true by assumption. Next we assume that


206
k n. Then we need to show that u(k) , Fr (u
|k

) which follows from (i)


applied to the core decomposition u

. We now prove (ii). So we assume that

= z for some z V and P(V ). We need to show that z Fr ().


This follows immediately from (ii) applied to the core decomposition u

.
2.5.2 Characterization of the Absorption Congruence
In this section, we shall provide a characterization theorem for the absorption
congruence, in similar fashion to what was done for theorem (12) of page 135
and theorem (19) of page 200 of the substitution and permutation congruence
respectively. As discussed in the previous section, if denotes the absorp-
tion congruence on P(V ) then the equivalence does not allow us to say
anything on the relative structures of and . However, we are able to infer
something about the core formulas

and

of denition (204). Our strategy


to prove theorem (20) below will mirror exactly that of theorem (12) and theo-
rem (19). We start by dening a binary relation of almost equivalence on P(V )
which is our best guess of what a proper characterization of the absorption con-
gruence should look like. We then proceed to show that the almost equivalence
is indeed an equivalence relation on P(V ) which is in fact a congruent relation,
and we conclude by showing that it coincides with the absorption congruence.
Denition 209 Let be the absorption congruence on P(V ) where V is a set.
Let , P(V ). We say that is almost equivalent to and we write ,
if and only if one of the following is the case:
(i)

P
0
(V ) ,

P
0
(V ) , and

(ii)

= and

=
(iii)

=
1

2
,

=
1

2
,
1

1
and
2

2
(iv)

= x
1
,

= x
1
, x V and
1

1
where

and

are the core of and respectively as per denition (204).


Proposition 210 (i), (ii), (iii), (iv) of def. (209) are mutually exclusive.
Proof
This follows immediately from theorem (2) of page 21 applied to P(V ).
Proposition 211 Let be the almost equivalence relation on P(V ) where V
is a set. Then contains the generator R
0
of denition (202).
Proof
Let x V and
1
P(V ) such that x , Fr (
1
). We need to show that

1
x
1
. Let
1
= u

where u V
n
be the core decomposition of
1
.
207
Since x , Fr (
1
), from lemma (208) we see that the core decomposition of
x
1
is v

for some v V
n+1
. In particular,
1
and x
1
have identical
core

. Using theorem (2) of page 21, we must have

P
0
(V ) or

=
or

=
1

2
for some
1
,
2
P(V ) or

= z for some z V and


P(V ). So (i), (ii), (iii) or (iv) of denition (209) must be the case.
Proposition 212 The almost equivalence relation on P(V ) is reexive.
Proof
Let P(V ). We need to show that . Let

be the core of . From the-


orem (2) of page 21, one of the four following cases must occur: if

= (x y)
for some x, y V , then follows from

. If

= then
follows again from

. If

is of the form

=
1

2
then
follows from the reexivity of . If

is of the form

= x
1
, then
follows again from the reexivity of , which completes our proof.
Proposition 213 The almost equivalence relation on P(V ) is symmetric.
Proof
Follows immediately from the symmetry of the absorption congruence .
Proposition 214 The almost equivalence relation on P(V ) is transitive.
Proof
Let , , P(V ) such that and . We need to show that
. We shall consider the four possible cases (i), (ii), (iii) and (iv) of de-
nition (209) in relation to . Let

and

denote the core of , and


respectively. First we assume that

P
0
(V ),

P
0
(V ) with

.
Then from we must have

P
0
(V ) and

. It follows that

P
0
(V ),

P
0
(V ) and

and we see that . Next we assume


that

= =

. Then from we must have

= =

and we
conclude once again that . Next we assume that

and

are of the
form

=
1

2
and

=
1

2
with
1

1
and
2

2
. Then from
we see that

must be of the form

=
1

2
with
1

1
and

2

2
. From the transitivity of the absorption congruence, it follows that

1

1
and
2

2
and consequently . Finally, we assume that

and

are of the form

= x
1
and

= x
1
with
1

1
. Then from
we see that

must be of the form

= x
1
with
1

1
. Once again by
transitivity we obtain
1

1
and nally as requested.
Proposition 215 Let be the almost equivalence and be the absorption
congruence on P(V ), where V is a set. For all , P(V ):

208
Proof
Let , P(V ) such that . We need to show that . We shall
consider the four possible cases (i), (ii), (iii) and (iv) of denition (209) in
relation to . Let

and

denote the core of and respectively. From


proposition (207) we have

and

. It is therefore sucient to prove


that

. This equivalence is clear in cases (i) and (ii) of denition (209).


So we assume that

and

are of the form

=
1

2
and

=
1

2
where
1

1
and
2

2
. The absorption congruence being a congruent
relation we obtain

as requested. Finally, we assume that

and

are of the form

= x
1
and

= x
1
with
1

1
. Once again, using the
fact that the absorption congruence is a congruent relation we obtain

Proposition 216 The almost equivalence relation on P(V ) is congruent.


Proof
By reexivity, we already know that . So we assume that =
1

2
and =
1

2
where
1

1
and
2

2
. We need to show that .
However, using proposition (206), it is clear the core decomposition of and
are = u

and = v

with u = v = 0,

= and

= . Furthermore,
from proposition (215) we have
1

1
and
2

2
. It follows that

and

are of the form

=
1

2
and

=
1

2
with
1

1
and
2

2
.
Hence we see that as requested. We now assume that = x
1
and
= x
1
where x V and
1

1
. We need to show that . We shall
distinguish two cases: rst we assume that x Fr (
1
). From
1

1
and
proposition (215) we obtain
1

1
. It follows from proposition (203) that
Fr (
1
) = Fr (
1
) and consequently x Fr (
1
). Using proposition (206) it is
clear that the core decomposition of and are = u

and = v

where
u = v = 0,

= and

= . Note that the conditions x Fr (


1
) and
x Fr (
1
) are crucially required to ensure (ii) of proposition (206) is satised.
So we have proved that

and

are of the form

= x
1
and

= x
1
with
1

1
and we conclude that as requested. We now assume that
x , Fr (
1
). Then we also have x , Fr (
1
). Let
1
= u
1

and
1
= v
1

be the core decompositions of


1
and
1
respectively. Then from lemma (208)
the core decompositions of and are of the form = u

and = v

. In
particular, and
1
have identical core

while and
1
have identical core

. From
1

1
and denition (209) we conclude that as requested.
Proposition 217 The almost equivalence relation on P(V ) is a congruence.
Proof
From proposition (212), the relation is reexive. From proposition (213) it is
symmetric while from proposition (214) it is transitive. It is therefore an equiv-
alence relation on P(V ). Furthermore, from proposition (216), the relation
is congruent. We conclude that it is a congruence relation on P(V ).
209
Proposition 218 Let be the almost equivalence and be the absorption
congruence on P(V ), where V is a set. For all , P(V ):

Proof
We need to show the equality =. The inclusion follows from proposi-
tion (215). So it remains to show. However, since is the smallest congruence
on P(V ) which contains the set R
0
of denition (202), it is sucient to show
that is a congruence which contains R
0
. The fact that is a congruence
follows from proposition (217). The fact that it contains R
0
follows from (211).

We can now forget about the almost equivalence and conclude with:
Theorem 20 Let be the absorption congruence on P(V ) where V is a set.
For all , P(V ), if and only if one of the following is the case:
(i)

P
0
(V ) ,

P
0
(V ) , and

(ii)

= and

=
(iii)

=
1

2
,

=
1

2
,
1

1
and
2

2
(iv)

= x
1
,

= x
1
, x V and
1

1
where

and

are the core of and respectively as per denition (204).


Proof
This follows immediately from denition (209) and proposition (218).
2.5.3 Absorption Mapping and Absorption Congruence
In denition (137) we introduced the minimal transform / : P(V ) P(

V )
which allowed us to characterize an -equivalence with a simple equality
/() = /(), as follows from theorem (14) of page 153. We would like to
achieve a similar result for the absorption congruence. As we shall see, things
are a lot simpler in this case. We shall dene a mapping / : P(V ) P(V )
which eectively removes pointless quantications from a formula. We shall then
prove that the absorption equivalence can be reduced to the equality
/() = /(). This will be done in proposition (223) below. In general given
a congruence on P(V ), we believe there is enormous benet in reducing the
equivalence to an equality f() = f() for some map f : P(V ) A
where A is an arbitrary set. This can always be done of course, as is
equivalent to an equality [] = [] between equivalence classes. However, if
we can nd a map f which is computable in some sense with a codomain A in
which the equality is decidable, this will ensure that the congruence is itself
decidable. Granted we have no idea what computable and decidable mean at this
stage, but at some point we shall want to know. This is for later.
210
Denition 219 Let V be a set. We call absorption mapping on P(V ) the map
/ : P(V ) P(V ) dened by the following recursive equation:
P(V ) , /() =

(x y) if = (x y)
if =
/(
1
) /(
2
) if =
1

2
x/(
1
) if = x
1
, x Fr (
1
)
/(
1
) if = x
1
, x , Fr (
1
)
Proposition 220 The structural recursion of denition (219) is legitimate.
Proof
We have to be slightly careful in this case. As usual, we have a choice between
theorem (4) of page 45 and theorem (5) of page 47. Given = x
1
, looking
at denition (219) it seems that /() is not simply a function of /(
1
) but
also involves a dependence in
1
. So it would seem that theorem (5) has to
be used in this case. Of course, it is not dicult to see that an equivalent
denition could be obtained by replacing Fr (
1
) by Fr (/(
1
)), allowing us to
use theorem (4) instead. However, we have no need to worry about that. So
let us use theorem (5) with X = P(V ), X
0
= P
0
(V ) and A = P(V ). We
dene g
0
: X
0
A by setting g
0
(x y) = (x y) which ensures the rst
condition of denition (219) is met. Next, we dene h() : A
0
X
0
A by
setting h()(0, 0) = which gives us the second condition. Next, we dene
h() : A
2
X
2
A be setting h(a, ) = a(0) a(1). Finally, we dene
h(x) : A
1
X
1
A be setting h(x)(a, ) = xa(0) if x Fr ((0)) and
h(x)(a, ) = a(0) otherwise. So let us check that conditions 3, 4 and 5 are
met. If =
1

2
we have the following equalities:
/() = /(
1

2
)
f =, = (
1
,
2
) = /(f())
/ : X
2
A
2
= h(f)(/(), )
= h()(/(), )
= /()(0) /()(1)
/ : X A = /((0)) /((1))
= /(
1
) /(
2
)
If = x
1
with x Fr (
1
), then we have the following equalities:
/() = /(x
1
)
f = x, (0) =
1
= /(f())
/ : X
1
A
1
= h(f)(/(), )
= h(x)(/(), )
x Fr ((0)) = x/()(0)
211
/ : X A = x/((0))
= x/(
1
)
The case x , Fr (
1
) is dealt with in a similar way.
In proposition (141) we showed that /() i() where is the substitution
congruence and i : V

V is the inclusion map. In eect, this shows that the
minimal transform preserves the equivalence class of . Of course, this is not
quite true as the minimal transform takes values in P(

V ) and not P(V ). So the


equivalence class of is preserved only after we embed in P(

V ) as i(). In
the case of the absorption congruence things are simpler: since the absorption
mapping takes values in P(V ) itself, we are able to write equivalence /() .
Proposition 221 Let V be a set and / : P(V ) P(V ) be the absorption
mapping. Let be the absorption congruence on P(V ). Then given P(V ):
/()
Proof
We shall prove this equivalence with a structural induction, using theorem (3)
of page 32. The equivalence is clear in the case when = (x y) and = .
So we assume that =
1

2
where
1
,
2
P(V ) satisfy our equivalence:
/() = /(
1

2
)
= /(
1
) /(
2
)

1

2
=
Next we assume that = x
1
for some x V and
1
P(V ) satisfying our
equivalence. We shall distinguish two cases. First we assume that x Fr (
1
):
/() = /(x
1
)
x Fr (
1
) = x/(
1
)
x
1
=
Next we assume that x , Fr (
1
). Then we have:
/() = /(x
1
)
x , Fr (
1
) = /(
1
)

1
x , Fr (
1
) x
1
=
212

Before we prove that the equivalence can be reduced to the quality


/() = /() we shall need to show that this equality denes a congruence:
Proposition 222 Let V be a set and be the relation on P(V ) dened by:
/() = /()
where / : P(V ) P(V ) is the absorption mapping. Then is a congruence.
Proof
The relation is clearly an equivalence relation on P(V ). So it remains to show
that it is also a congruent relation. By reexivity, we already know that .
So let =
1

2
and =
1

2
be such that
1

1
and
2

2
. We
need to show that or equivalently /() = /() which goes as follows:
/() = /(
1

2
)
= /(
1
) /(
2
)

i

i
= /(
1
) /(
2
)
= /(
1

2
)
= /()
We now assume that = x
1
and = x
1
where
1

1
. We need to
show that or equivalently /() = /(). We shall distinguish two cases:
rst we assume that x Fr (
1
). Having assumed that
1

1
, we obtain
/(
1
) = /(
1
) and consequently from proposition (221) we have
1

1
,
where is the absorption congruence on P(V ). It follows from proposition (203)
that Fr (
1
) = Fr (
1
) and thus x Fr (
1
). Hence we have:
/() = /(x
1
)
x Fr (
1
) = x/(
1
)

1

1
= x/(
1
)
x Fr (
1
) = /(x
1
)
= /()
We now assume that x , Fr (
1
). Then we also have x , Fr (
1
) and hence:
/() = /(x
1
)
x , Fr (
1
) = /(
1
)

1

1
= /(
1
)
x , Fr (
1
) = /(x
1
)
= /()

213
Proposition 223 Let be the absorption congruence on P(V ) where V is a
set. Then for all , P(V ) we have the following equivalence:
/() = /()
where / : P(V ) P(V ) is the absorption mapping as per denition (219).
Proof
First we show : Let be the relation on P(V ) dened by if and only
if /() = /(). We need to show the inclusion . However, since is
the smallest congruence on P(V ) which contains the set R
0
of denition (202),
it is sucient to show that is a congruence on P(V ) which contains R
0
. We
already know from proposition (222) that is a congruence on P(V ). So it re-
mains to show that R
0
. So let
1
P(V ) and x V such that x , Fr (
1
).
We need to show that /(
1
) = /(x
1
) which follows immediately from de-
nition (219). We now prove : so we assume that /() = /(). We need to
show that , which follows immediately from proposition (221).
2.6 The Propositional Congruence
2.6.1 The Propositional Congruence
Denition 224 Let V be a set. We call propositional valuation on P(V ) any
map v : P(V ) 2 satisfying the following properties. Given
1
,
2
P(V ):
(i) v() = 0
(ii) v(
1

2
) = v(
1
) v(
2
)
Denition 225 Let V be a set. We call propositional congruence on P(V ),
the congruence on P(V ) generated by the following set R
0
P(V ) P(V ):
R
0
= (, ) : v() = v() for all v : P(V ) 2 propositional valuation
To be continued. . .
214
Chapter 3
The Free Universal Algebra
of Proofs
3.1 The Hilbert Deductive Congruence
3.1.1 Preliminaries
In this section we introduce yet another congruence on P(V ) which we have cho-
sen to call the Hilbert deductive congruence, hoping this will make it obvious to
the familiar reader. Another possible name could have been the Lindenbaum-
Tarski congruence since it is the congruence giving rise to the Lindenbaum-
Tarski algebra, a seemingly well established term in mathematical logic. The
Hilbert deductive congruence is probably not what we are looking for to de-
ne our Universal Algebra of First Order Logic. We are looking to identify
mathematical statements which have the same meaning, and such identication
should be decidable. We are not looking for theorems. However, the study of the
Hilbert deductive congruence gives us a great opportunity to introduce funda-
mental notions of mathematical logic which we shall no doubt require at a later
stage: these are the concept of proof and provability, the deduction theorem,
the notion of semantics and model, and of course G odels completeness theorem.
These are the bread and butter of any textbook on mathematical logic and it
is about time we touch upon them. So our rst objective will be to dene the
notion of proof and provability, which is the question of axiomatization of rst
order logic. Having reviewed some of the existing textbooks and references, it is
astonishing to see how little consensus there is on the subject. It is impossible to
nd two references which will universally agree on which axioms should be used,
or which rules of inference. So we had to make our own mind and decide for
ourselves what constitutes a good axiomatization of rst order logic, and here it
is: rstly, a good axiomatization of rst order logic should be sound and ensure
Godels completeness theorem holds. This is rather uncontroversial so we shall
not say more about it. Secondly, The deduction theorem should hold without
215
any form of restriction. Most references will depart from this principle and
quote the deduction theorem imposing some form of restriction, typically that
a formula should be closed (e.g. [18], [30], [32], [39], [43], [44], [45], [62]). The
only exception known to us is [4] where the deduction theorem is restored to its
full glory. This we believe should be the case. There is also a web page [46] on
http://planetmath.org indicating the awareness by some living mathemati-
cians that the deduction theorem need not fail in rst order logic, provided the
concept of proof is carefully dened. This brings us to our third requirement
which a good axiomatization of rst order logic ought to meet: the concept of
proof should not be presented in terms of nite sequences of formulas. Instead,
proofs should be dened as formal expressions in a free universal algebra gen-
erated by the formulas of the language itself (the set of possible hypothesis)
and whose operators are the chosen rules of inference: to every formula cor-
responds a proof = arising from a constant operator, indicating that is
invoked as an axiom; to every proof corresponds another proof x arising
from a unary operator indicating a generalization with respect to the variable x;
to every pair of proof (
1
,
2
) corresponds another proof
1

2
arising from a
binary operator, that of modus ponens for example. . . On this algebra of proofs
should be dened a key semantics associating every proof to its conclusion. By
imposing this free algebraic structure on proofs, we are preserving their natural
tree structure, giving us the full power of structural induction and structural
recursion which we cannot enjoy with the linear structure of nite sequences.
These should be abandoned. Some authors still want to dene formulas as nite
sequences of characters and will go as far as introducing a concatenation op-
erator on their strings, explaining why the parenthesis ( and ) should be used.
At best, they are just creating a free universal algebra of formulas which could
have been dened in a generic way, up to isomorphism. At worse, by insisting
on the details of their string implementation, they are making everything very
awkward to prove. Finite sequences are not suited to mathematical logic.
3.1.2 Axioms of First Order Logic
We are now introducing the axioms which we intend to use as part of our deduc-
tive system. These axioms come in ve dierent groups which we shall describe
individually. The rst three groups of axioms are often referred to as proposi-
tional axioms while the other two are specic to rst order logic, dealing with
quantication. We have no axiom relating to equality of course, having dis-
missed the equality predicate from our language. As already explained prior to
denition (51) of page 69, we are hoping to dene the notion of equality in terms
of the primitive symbol , and make equality axioms simple consequences of
our specic coding. So no equality for now. In choosing these ve groups of
axioms, we have tried to keep it lean and simple, while reaching the greatest
possible consensus between the references available to us. The axioms below are
pretty much those of Donald W. Barnes and John M. Mack [4], Miklos Ferenczi
and Miklos Sz ots [18], P.T. Johnstone [32], Elliot Mendelson [43] and G. Takeuti
and W.M. Zaring [61]. They are referred to as traditional textbook axioms of
216
predicate calculus by Norman Megills Metamath web site [45] which we nd
reassuring. However, these references disagree on some minor details so we had
to choose between them. For example, contrary to [18] we do not wish to speak
of formula scheme, which we feel do not bring much mathematical value. We
have also decided against [4] and avoided as much congruence as possible when
dening axioms, to keep the material as down to earth as possible. Finally,
we have ignored a few oddities in [32] which were seemingly motivated by P.T.
Johnstones desire to allow the empty model.
Our rst group of axioms corresponds to Axiom ax-1 of Metamath [45].
We have decided to follow this reference in calling an axiom within this group
a simplication axiom. It appears the terminology originates from the link
between these axioms and the formula
1

2

1
. Some historical background
can be found on the web page http://us.metamath.org/mpegif/ax-1.html.
Denition 226 Let V be a set. We call simplication axiom on P(V ) any
formula P(V ) for which there exist
1
,
2
P(V ) such that:
=
1
(
2

1
)
The set of simplication axioms on P(V ) is denoted A
1
(V ).
Our second group of axioms corresponds to Axiom ax-2 of Metamath [45].
We have again decided to follow this reference in calling an axiom within this
group a Frege axiom. Further details and historical background can be found
on the web page http://us.metamath.org/mpegif/ax-2.html.
Denition 227 Let V be a set. We call Frege axiom on P(V ) any formula
P(V ) for which there exist
1
,
2
,
3
P(V ) such that:
= [
1
(
2

3
)] [(
1

2
) (
1

3
)]
The set of Frege axioms on P(V ) is denoted A
2
(V ).
Our third group of axioms corresponds to Axiom ax-3 of Metamath [45]
which can be found on http://us.metamath.org/mpegif/ax-3.html. How-
ever, we chose a dierent form for these axioms, namely [(
1
) ]
1
following [4] and [32] rather than (
1

2
) (
2

1
) which is also the
form encountered in [18]. It is easy to believe that both choices lead to the same
notion of provability. Our terminology is transposition axiom following [45].
Denition 228 Let V be a set. We call transposition axiom on P(V ) any
formula P(V ) for which there exists
1
P(V ) such that:
= [(
1
) ]
1
The set of transposition axioms on P(V ) is denoted A
3
(V ).
217
Our fourth group of axioms is that of Theorem stdpc5 in Metamath [45]
which can be found on http://us.metamath.org/mpegif/stdpc5.html. Nor-
man Megill has this as a theorem rather than an axiom but it is still men-
tioned on http://us.metamath.org/mpegif/mmset.html#traditional as a
traditional textbook axiom. It is indeed a chosen axiom of [4], [18] and [32].
As we were not able to nd an obvious name, we chose quantication axiom.
Denition 229 Let V be a set. We call quantication axiom on P(V ) any
formula P(V ) for which there exist
1
,
2
P(V ) and x V such that:
x , Fr (
1
) , = x(
1

2
) (
1
x
2
)
The set of quantication axioms on P(V ) is denoted A
4
(V ).
Our fth and last group of axioms is Theorem stdpc4 of Metamath [45] which
can be found on http://us.metamath.org/mpegif/stdpc4.html. Following
this reference once again, we have decided to call axioms within this group
a specialization axiom. It is a chosen group of axioms for [4], [18] and [32]
with varying wordings in relation to the highly delicate problem of making sure
variable substitutions are valid. Indeed, a specialization axiom is of the form:
x
1

1
[y/x] (3.1)
Strictly speaking, this formula will have a dierent meaning, and require dif-
ferent verbal qualications, depending on the specics of how the substitution
[y/x] : P(V ) P(V ) is dened. For example, most references in the literature
will dene [y/x] with a recursive formula in which bound occurrences of x are
not replaced by y. This is contrary to our own denition (53) of page 71 where
all variables are systematically substituted regardless of whether they are free
or bound occurrences. As it turns out, we did not rely on denition (53) to
dene specialization axioms, but dened them instead in terms of an essen-
tial substitution [y/x] : P(V ) P(V ) as per denition (165). In other words,
if [y/x] : P(V ) P(V ) denotes an essential substitution associated with the
map [y/x] : V V of denition (65), then the formula (3.1) is a legitimate
specialization axiom. In a more condensed form, x
1

1
[y/x] is an axiom
whenever [y/x] is an essential substitution of y in place of x. This is beautiful,
this is short and does not require any further qualication such as provided y is
free for x in
1
, i.e. provided [y/x] is valid for
1
as per denition (83). There
is however a huge disadvantage in using essential substitutions when dening
specialization axioms: the level of mathematical sophistication required to fol-
low the analysis is a lot higher and most people will stop reading. Surely we do
not want that. Another drawback is that essential substitutions are intertwined
with the notion of substitution congruence. If : P(V ) P(V ) is an essential
substitution and P(V ) then () is of course a well dened formula of P(V )
but the specics of () are usually unknown. We typically only know about
the class of () modulo the substitution congruence. Luckily this is often all
we care about, but it does mean we are not able to provide an axiomatization
of rst order logic without dependencies to the substitution congruence. In the
218
light of these considerations, our preferred option would have been to dene the
specialization axioms in line with the existing literature:
x
1

1
[y/x] , where [y/x] is valid for
1
(3.2)
where the substitution [y/x] : P(V ) P(V ) is now simply the substitution
associated with [y/x] : V V as per denition (53). This would have the
advantage of being simple, consistent with the usual practice and devoid of any
reference to the substitution congruence. The fact that our substitutions of
denition (53) have an impact on bound occurrences of variables is contrary to
standard practice, but of little signicance. The notion of valid substitution of
denition (83) is not commonly found in textbooks, but is a nice and simple
generalization of the idea that y is free for x in
1
. So why not simply use (3.2)
to dene specialization axioms? Why introduce the elaborate notion of essential
substitutions and seemingly make everything more complicated with no visible
benet? The answer is this: we do not have a choice. This document is devoted
to the study of the algebra P(V ) where V is a set of arbitrary cardinality. In
particular V can be a nite set. By contrast, the existing literature focusses
on rst order languages where the number of variables is typically countably
innite. This makes a big dierence. We believe our insistence on allowing
V to be nite will be rewarded. We cannot be sure at this stage, but we are
quite certain the theory of vector spaces was not solely developed in the innite
dimensional case. There must be value in the study of rst order languages
which are nite dimensional. This is of course no more than a hunch, but we
are not quite ready to give it up. So let us understand why allowing V to be
nite changes everything: consider V = x, y and = x
1
= xy(x y)
where x ,= y. We know from theorem (18) of page 178 that there exists an
essential substitution [y/x] : P(V ) P(V ) associated with [y/x] : V V . It
is not dicult to prove that the only possible value for [y/x](
1
) is x(y x).
So because we have chosen to dene specialization axioms in terms of essential
substitutions, from x
1

1
[x/x] and x
1

1
[y/x] we obtain:
(i) xy(x y) y(x y)
(ii) xy(x y) x(y x)
as the two possible specialization axioms associated with = x
1
. However,
suppose we had opted for (3.2) to dene specialization axioms. Then the formula
x
1

1
[y/x] would need to be excluded as an axiom because [y/x] is clearly
not a valid substitution for
1
= y(x y). So the formula (ii) above would
seemingly not be an axiom for us. Does it matter? Well in general no: for in
general we have a spare variable z to work with. So suppose x, y, z V with
x, y, z distinct. Then from xy(x y) we can prove (z x) by successive
specialization. By successive generalization we obtain zx(z x) and nally
by specializing once more we conclude that x(y x). It follows that the
formula (ii) above would fail to be an axiom, but it would still be a theorem
of our deductive system, keeping the resulting notion of provability unchanged.
However, in the case when V = x, y we have a problem. There doesnt seem to
219
be a way to turn the formula (ii) into a theorem. Does it matter? Yes it does. It
is clear that xy(x y) x(y x) is going to be true in any model, under
any variables assignment. Unless it is also a theorem, Godels completeness
theorem will fail, which we do not want in a successful axiomatization of rst
order logic. We are now in a position to state our chosen denition:
Denition 230 Let V be a set. We call specialization axiom on P(V ) any
formula P(V ) for which there exist
1
P(V ) and x, y V such that:
= x
1

1
[y/x]
where [y/x] : P(V ) P(V ) is an essential substitution of y in place of x. The
set of specialization axioms on P(V ) is denoted A
5
(V ).
We know from proposition (167) that an essential substitution : P(V ) P(V )
can be redened arbitrarily without changing its associated map : V V , as
long as such re-denition preserves classes modulo the substitution congruence.
It follows that if [y/x] : P(V ) P(V ) is an essential substitution of y in place
of x, it remains an essential substitution of y in place of x after re-denition
modulo substitution. It follows that any formula x
1

1
is a specialization
axiom, as long as

1

1
[y/x], where denotes the substitution congruence:
Proposition 231 Let V be a set. Let
1
P(V ) and x, y V . Let

1
P(V )
such that

1

1
[y/x] where [y/x] : P(V ) P(V ) is an essential substitution
of y in place of x. Then the formula = x
1

1
is a specialization axiom.
Proof
Suppose

1

1
[y/x] where [y/x] : P(V ) P(V ) is an essential substitution
associated with the map [y/x] : V V . Dene : P(V ) P(V ) by setting
() = [y/x]() if ,=
1
and (
1
) =

1
. Having assumed that

1

1
[y/x]
we see that () [y/x]() for all P(V ). It follows from proposition (167)
that is also an essential substitution associated with [y/x] : V V . In other
words, is also an essential substitution of y in place of x. Using denition (230)
it follows that = x
1
(
1
) is a specialization axiom. From (
1
) =

1
we conclude that = x
1

1
is a specialization axiom as requested.
Denition 232 Let V be a set. We call axiom of rst order logic on P(V ) any
formula P(V ) which belongs to the set A(V ) dened by:
A(V ) = A
1
(V ) A
2
(V ) A
3
(V ) A
4
(V ) A
5
(V )
3.1.3 Proofs as Free Universal Algebra
Most textbook references we know about dene proofs as nite sequences of for-
mulas, and inference rules as ordered pairs (, ) where is a nite sequence of
formulas and is a formula. A good illustration of this is Denition 1.6 of page
12 in [18]. As far as we can tell, this approach does not work. Having proofs as
220
nite sequences of formulas means that in order to establish any result, the only
tool available to us is induction on the length of the proof. This makes it excru-
ciatingly painful and messy to prove anything and is far inferior to the seamless
ow of a proof by structural induction. Furthermore, denying the structure of
a free algebra to the set of proofs means we cannot use structural recursion to
dene anything. There are many interesting functionals which could be dened
on the set of proofs. It is a shame to give that up. Granted, nite sequences
of formulas are easily understood. Everyone likes them for this reason. Not ev-
eryone is familiar with the machinery of universal algebras which require some
eort to acquire. But it is well worth it. Now when it comes to rules of inference,
the structure (, ) is also inadequate. Indeed, it does not allow generalization
to be expressed in a sensible way. In Denition 2.10 of page 22 in [18], the
generalization is introduced as ((
1
), x
1
). In other words, if the formula
1
has been proved, so has the formula x
1
. Surely this cannot be right. And
yet this is the accepted approach by many authors such as Marc Hoyois [30], J.
Donald Monk [44] and George Tourlakis [62]. Our favorite web site [45] also has
it, as can be seen on http://us.metamath.org/mpegif/ax-gen.html. Even
if we are willing to disregard the uncomfortable paradox of claiming x
1
is
proved whenever
1
is, we know this approach is failing because the deduction
theorem is no longer true for non-closed formulas. Granted there are other logi-
cal systems we are told, where the deduction theorem does not work. Logicians
are free to do what they want. In this document we care about classical rst
order logic, meta-mathematics, the language of ZF and we are looking for math-
ematical objects which oer a natural formalization of standard mathematical
arguments. It is clear the deduction theorem has to hold. Suppose we need to
prove a statement of the form x(p(x) q(x)). A standard mathematician will
proceed as follows: let x be arbitrary. Suppose p(x) holds. . . Then a long proof
follows. . . So we see that q(x) is true. Hence we conclude that p(x) q(x) and
since x was arbitrary we obtain x(p(x) q(x)). This is a standard mathemat-
ical argument, the essence of which is to construct a proof of q(x) from the single
hypothesis p(x). So a standard mathematician will devote most of his energy
establishing the sequent p(x) q(x). He will then implicity use the deduction
theorem to claim that (p(x) q(x)) and conclude x(p(x) q(x)) from
generalization, having rightly argued that x is arbitrary. So strictly speaking,
a standard mathematician does not write a complete proof x(p(x) q(x)).
Instead, he resorts to some form of sequent calculus to argue that his conclusion
must be true if the sequent p(x) q(x) is itself true. This whole reasoning
is what he regards as a valid proof. Now we are free to formalize the notion of
proof in anyway we want. We are personally convinced that every theorem of
Metamath [45] is correct. We are sure Don Monk is right. However, as far as
we are concerned, it is unthinkable to design a concept of proof for which the
arguments of the standard mathematician fail to be legitimate. We must be
able to infer the sequent (p(x) q(x)) from p(x) q(x). The deduction
theorem should be true, regardless of whether a formula is closed or not.
In the light of these discussions, it should be clear by now that we wish
to construct our set of proofs as a free universal algebra X of some type
221
generated by a set of formula X
0
P(V ). On this algebra should be dened
two key semantics Val : X P(V ) and Hyp : X T(P(V )) so that given a
proof X, the valuation Val () would represent the conclusion being proved
by , while Hyp () would be the set of hypothesis. With this in mind, we could
then dene a consequence relation T(P(V )) P(V ) as follows:
(Val () = ) (Hyp () ) for some X
This is the outline of the plan. We now need to provide the specics. Our rst
task is to determine the exact type of the universal algebra X. In the case
of classical rst order logic it would seem natural to have a binary operator
for the rule of inference known as modus ponens, and a unary operator x
for the generalization with respect to the variable x V . This would work as
follows: if
1
was a proof of
1
and
2
was a proof of
1

2
, then
1

2
would be a proof of
2
. This can be enforced by making sure the valuation
Val : X P(V ) satises Val(
1

2
) =
2
. Of course we need operators to be
dened everywhere, so the proof
1

2
has to be meaningful even in the case
when the conclusion Val(
2
) cannot be expressed in the form of Val (
1
)
2
for some formula
2
. But what is the conclusion of a proof
1

2
which stems
from a awed application of the modus ponens rule of inference? We clearly
havent proved anything. So the conclusion of
1

2
should be the weakest
mathematical result of all. In other words, we should simply make sure the
valuation Val : X P(V ) satises Val(
1

2
) = . In the case of
generalization, if
1
was a proof of
1
, then x
1
would be a proof of x
1
,
provided the variable x is truly arbitrary. This last condition can be formalized
by saying that x is not a free variable of any formula belonging to Hyp(
1
). We
shall write this as x , Sp (
1
). So whenever this condition holds, we simply need
to make sure the valuation Val : X P(V ) satises Val(x
1
) = xVal (
1
).
In the case when x Sp (
1
), then x
1
constitutes a awed application of the
generalization rule of inference, and we should simply set Val(x
1
) = .
So we have decided to include a binary operator and a unary operator x
for all x V , in dening the type of our free universal algebra of proofs. But
what about axioms? If A(V ) is an axiom of rst order logic and X
is a proof which relies on the axiom , how should this axiom be accounted
for? One solution is not to treat axioms in any special way and regard simply
as yet another hypothesis of the proof . In that case we have Hyp ().
However, this is not a good solution: for suppose we wish to design a proof
of the formula x((x x) (x x)). Conceivably, the only solution is to
design a proof = x
1
as the generalization in x of another proof
1
whose
conclusion is Val(
1
) = (x x) (x x). We shall soon see that nding
a proof
1
is not dicult. However such a proof relies on axioms which have
x as a free variable. Consequently, if we were to include axioms as part of
the set of hypothesis Hyp(
1
) the condition x , Sp (
1
) would not hold and
= x
1
would constitute a case of awed generalization, whose conclusion is
Val() = , and not what we set out to prove. Of course it would be
possible to have a set of axioms which have no free variables, by considering
universal closures. However, this is not the solution we adopted. We chose a set
222
of axioms which may potentially have free variables, and it is important that
we exclude those axioms from the set of hypothesis of a given proof. Hence, for
every axiom A(V ) we shall dene a constant operator : X
0
X, so
that = (0) is simply a proof with conclusion Val() = and the crucial
property Hyp() = . In fact, we shall introduce a unary operator : X
0
X
for every formula P(V ) and not simply for A(V ). If = (0) is a
proof where is not a legitimate axiom, we shall simply regard the proof as
a case of awed invocation of axiom and set Val () = .
The choice of creating an operator : X
0
X even in the case when
is not a legitimate axiom may seem odd at rst glance. What is the point
of doing this? This choice will eectively increase our free universal algebra of
proofs by allowing awed and seemingly pointless proofs of the form where
is not an axiom. Why? Well, rstly we should realize that this isnt the end
of the world. We have already accepted the idea of having many proofs of the
form
1

2
or x
1
which are awed in some sense, as they are illegitimate use
of inference rules and whose conclusion is . Allowing the possibility of
wrongly invoking an axiom makes little dierence to the existing status. More
importantly, this exibility brings considerable advantages as we shall discover
in later part of this document. One advantage is to seamlessly dene the notion
of variable substitution in proofs. Given a map : V W we shall dene
a corresponding substitution for proofs as per denition (272). Allowing
to be meaningful regardless of whether is a legitimate axiom allows us to
set () = () without having to worry whether or not the formula () is
indeed an axiom. This makes a lot of the formalism smoother and more elegant.
Another advantage is the ability to study other deductive systems based on an
alternative set of axioms without having to consider a dierent algebra of proofs.
We can keep the same free universal algebra of proofs and simply change the
semantics Val : X P(V ) giving rise to an alternative notion of provability.
Denition 233 Let V be a set. We call Hilbert Deductive Proof Type associ-
ated with V , the type of universal algebra dened by:
= : P(V ) x : x V
where = ((0, ), 0), P(V ), = ((1, 0), 2) and x = ((2, x), 1), x V .
The specics of how , and x are dened are obviously not important.
The coding is done in such a way that the Hilbert deductive proof type is
indeed a set of ordered pairs which is functional and with range in N. So the
set is a map with range in N, and is therefore a type of universal algebra as
per denition (1). We have also set up the coding to ensure each operator has
the expected arity, namely () = 0, () = 2 and (x) = 1. We are now
ready to dene our free universal algebra of proofs X of type . We shall adopt
the generator X
0
= P(V ) and denote this algebra (V ):
Denition 234 Let V be a set with associated Hilbert deductive proof type .
We call Free Universal Algebra of Proofs associated with V , the free universal
algebra (V ) of type with free generator P(V ).
223
Recall that the existence of (V ) is guaranteed by theorem (1) of page 20. It
follows that we have P(V ) (V ): any formula P(V ) is also a proof. It is
simply the proof which has itself as a premise, and itself as a conclusion. In other
words, we shall have Hyp () = and Val() = . Note that if P(V ) then
both and are proofs, where we casually use the notation as a shortcut
for (0). These proofs have the same conclusion , provided the formula
is an axiom. The dierence between them is that an axiom has no premise,
namely Hyp() = . One other important point needs to be made: both P(V )
and (V ) are free universal algebras for which is dened a natural notion of
sub-formula and sub-proof as per denition (40). So given , P(V ) and
, (V ) it is meaningful to write _ and _ expressing the idea that
is a sub-formula of and is a sub-proof of . However, since P(V ) (V )
the statement _ is ambiguous. It is not clear whether we are referring to
the sub-formula partial order on P(V ) or (V ). The distinction is crucial: the
only sub-proof of is itself, while may have many sub-formulas. So we have
a problem for which a solution is to use two dierent symbols, one referring to
sub-formulas and one referring to sub-proofs. In these notes, we have decided to
keep the same symbol _ for both notions, hoping the context will always make
clear which of the two notions is being considered. In general, we shall want
to avoid this sort of situation. Whenever extending one notion from formulas
to proofs, we shall always make sure the meaning remains the same regardless
of whether is viewed as an element of P(V ) or (V ). For example, the
variables Var(), free variables Fr () and bound variables Bnd () of a proof
will be dened with this in mind. The same will apply to the notions of valid
substitution, the image () of a proof and its minimal transform /(). So
the sub-formula partial order _ is an exception, where a confusion is possible.
Recall that another example of possibly confusing notation exists in this note
namely 0 which may refer to the standard quantication x with x = 0, or to
the iterated quantication as per denition (187). Before we formally dene the
set of hypothesis Hyp () for a given proof (V ), we would like to stress:
Proposition 235 Let V be a set and , P(V ). Then we have:
= =
Proof
We are now familiar with the free universal algebra of rst order logic P(V )
and a common abuse of notation regarding the symbol . This symbol is
inherently ambiguous as it may refer to three dierent things: rstly, it refers
to an operator symbol namely an element (V ) of the rst order logic type
of denition (51). Secondly, it refers to the actual operator : 0 P(V )
of P(V ) with arity () = 0. Thirdly, it refers to the actual formula (0)
which we commonly denote in order to keep our notations lighter. Given
a formula P(V ), the same thing can be said of the notation which is
equally ambiguous: it refers to the operator symbol of denition (233) or the
actual operator : 0 (V ) or the proof (0) which we casually denote
. So when it comes to proposition (235) which we are now attempting to
224
prove, there is potentially a problem. Luckily the statement of this proposition
is true regardless of how we wish to interpret the notations and . If
the two symbols are equal, then the two operators are equal and consequently
the two proofs are equal. So it is sucient to prove the following implication:
(0) = (0) =
which is our initial statement where the abuse of notation has been removed.
So suppose (0) = (0). Using theorem (3) of page 32 we obtain the equality
between symbols = . However, from denition (233) we have =
((0, ), 0) and = ((0, ), 0). So we conclude that = as requested.
Denition 236 Let V be a set. The map Hyp : (V ) T(P(V )) dened by
the following structural recursion is called the hypothesis mapping on (V ):
(V ) , Hyp () =

if = P(V )
if =
Hyp (
1
) Hyp (
2
) if =
1

2
Hyp (
1
) if = x
1
(3.3)
We say that is a hypothesis of the proof (V ) if and only if Hyp ().
Note that given a proof (V ), if we regard as some formal expression
involving variables of P(V ), then the set of hypothesis Hyp () is simply the set of
variables occurring in . We should also note that the recursive denition (3.3)
is easily seen to be legitimate and furthermore that Hyp () is a nite set, as a
straightforward structural induction will show. Our next step is to dene the
valuation mapping Val : (V ) P(V ) which returns the conclusion Val()
of a proof (V ). Before we can do so we need to dene the set Sp ()
representing the free variables occurring in the premises of the proof . This set
is important to formally decide whether a proof of the form x
1
constitutes a
legitimate application of the generalization rule of inference. Obviously, having
proved the formula Val (
1
) using a proof
1
, we cannot claim to have proved
xVal (
1
) unless the variable x was truly arbitrary, that is x , Sp (
1
).
Denition 237 Let V be a set. Given P(V ), we say that x V is a free
variable of , if and only if it belongs to the set Fr () dened by:
Fr () = Fr() :
Given (V ), we say that x V is a specic variable of the proof , if and
only if it belongs to the set Sp () dened by the equality:
Sp () = Fr (Hyp ()) = Fr () : Hyp ()
We also need to formally decide whether a proof of the form
1

2
consti-
tutes a legitimate application of the rule of inference known as modus ponens.
225
This is the case when the conclusion Val (
2
) of the proof
2
can be expressed
as Val (
2
) = Val(
1
) for some formula P(V ). When this is the case,
the proof
1

2
is a proof of . Otherwise, it simply becomes a proof of
which is a very weak result. This motivates the following:
Denition 238 Let V be a set. We call modus ponens mapping on P(V ) the
map M : P(V )
2
P(V ) dened by:

1
,
2
P(V ) , M(
1
,
2
) =
_
if
2
=
1

otherwise
Note that the modus ponens mapping M : P(V )
2
P(V ) is well-dened
by virtue of theorem (2) of page 21 which guarantees the uniqueness of any
P(V ) such that
2
=
1
, assuming such does exist.
Denition 239 Let V be a set. We call valuation mapping on (V ) the map
Val : (V ) P(V ) dened by the following structural recursion:
(V ) , Val () =

if = P(V )
if = , A(V )
if = , , A(V )
M(Val(
1
), Val (
2
)) if =
1

2
xVal (
1
) if = x
1
, x , Sp (
1
)
if = x
1
, x Sp (
1
)
where M : P(V )
2
P(V ) refers to the modus ponens mapping on P(V ).
Proposition 240 The structural recursion of denition (239) is legitimate.
Proof
We need to prove the existence and uniqueness of the map Val : (V ) P(V ).
We have to be slightly more careful than usual, as we cannot apply theorem (4)
of page 45 in this case. The reason for this is that we do not wish Val(x
1
)
to be simply a function of Val(
1
). Indeed, the conclusion of the proof x
1
depends on whether x Sp (
1
) or not. So we want Val(x
1
) to be a func-
tion of both Val(
1
) and
1
. So the main point is to dene the mapping
h(x) : P(V ) (V ) P(V ) by h(x)(
1
,
1
) = x
1
if x , Sp (
1
) and
h(x)(
1
,
1
) = otherwise. We can then apply theorem (5) of page 47.
Note that the mapping h() : P(V )
0
(V )
0
P(V ) should be dened
dierently, depending on whether is an axiom of rst order logic or not. If
A(V ) we set h()(0, 0) = and otherwise h()(0, 0) = .
3.1.4 Provability and Sequent Calculus
Having constructed the free universal algebra of proofs (V ), we are now ready
to dene the notion of provability and syntactic entailment. We shall introduce
a binary consequence relation T(P(V )) P(V ) as follows:
226
Denition 241 Let V be a set. Let P(V ) and P(V ). We say that
(V ) is a proof of the formula from if and only if we have:
Val() = and Hyp ()
Furthermore, we say that is provable from or that entails , and we write:

if and only if there exists a proof (V ) of the formula from . We say
that is provable and we write if and only if it is provable from = .
The relation satises a few properties which are highlighted in [50]: the
identity property is satised as we clearly have for all P(V ). Indeed,
the proof = is a proof of from the set . The monotonicity property
is also satised as we have the implication ( ) ( ) .
Indeed, if (V ) is a proof with Val () = and Hyp () then in
particular we have Hyp() and consequently is also a proof of from
. Another property satised by is the nitary property: if then there
exists a nite subset
0
such that
0
. Indeed, if is a proof of from
, then it is also a proof of from
0
= Hyp() . Other properties such
as transitivity and structurality are considered in Umberto Rivieccio [50]. We
shall prove transitivity after we have established the deduction theorem (21) of
page 231. The property of structurality is slightly more delicate in the case of
rst order logic with terms. We would like to be able to claim that () ()
whenever we have and : P(V ) P(V ) is a map of a certain type. The
existing literature on abstract algebraic logic e.g. W.J. Blok and D. Pigozzi [6]
and the already mentioned [50] seem to consider logical systems without terms,
where the algebra of formulas is similar to that of propositional logic. In our
case, things are slightly more complicated as it is not immediately obvious which
type of : P(V ) P(V ) should be used to dene the structurality property.
However, with a little bit of thought, it seems that a pretty good candidate
is to consider the class of essential substitutions : P(V ) P(V ) as per
denition (165). In fact, we hope to be able to prove a substitution theorem:
() () in the more general case when : P(V ) P(W) is an
essential substitution and W is a set of variables, which may not be equal to V .
Such theorem is already stated in [4] as Theorem 4.3 page 33 using the notion of
semi-homomorphism. For now, we shall complete this section by proving other
properties of the consequence relation . The idea behind this is to develop
a naive and heuristic form of sequent calculus allowing us to prove a sequent
as easily as possible. The claim that (i.e. entails ) is that
there exists a proof of from . It is usually a painful exercise to exhibit an
element (V ) such that Val () = and Hyp () . It is a lot easier to
prove the existence of such a without actually saying what it is. The following
propositions will allow us to do that. Of course, this will not be very helpful
if we are interested in proof searching algorithms at a later stage. However,
it would be wrong to think that proof searching requires that we nd a proof
227
(V ). Another approach consists in attempting to formally prove or falsify
the sequent , possibly along the lines of the Gentzen system described
in Jean H. Gallier [24] or Gilles Dowek [16], using another algebra of proofs
based on another language of sequents. The next proposition establishes that
the sequent is always true when is an axiom of rst order logic :
Proposition 242 Let V be a set and P(V ). Let A(V ) be an axiom of
rst order logic. Then is provable from , i.e. we have .
Proof
Given an axiom of rst order logic A(V ), consider the proof = .
Then Val () = and Hyp() = . In particular, we have Val() = and
Hyp() . It follows that (V ) is a proof of from , so .
One way to look at proposition (242) is to say that to every axiom of rst
order logic is associated a group of axioms in the language of sequents.
An axiom of the form can be viewed as an inference rule of arity 0,
that is a constant operator on a new algebra of proofs by sequents. We are not
claiming to be doing anything here, but simply indicating an avenue for future
development. In a similar fashion, the modus ponens rule of the algebra (V )
gives rise to a binary rule of inference in the language of sequents:
Proposition 243 Let V be a set and P(V ). For all , P(V ) we have:
( ) ( ( ))
Proof
Let V be a set and P(V ). Let , P(V ) be formulas such that
and ( ). From we obtain the existence of a proof
1
(V )
such that Val (
1
) = and Hyp (
1
) . From ( ) we obtain the
existence of a proof
2
(V ) such that Val(
2
) = and Hyp (
2
) .
Dene the proof =
1

2
. Then, from denition (239) we have:
Val() = Val(
1

2
)
= M(Val(
1
), Val (
2
))
= M(, )
=
where M : P(V )
2
P(V ) is the modus ponens mapping as per denition (238).
Furthermore, using denition (236) we obtain:
Hyp () = Hyp (
1

2
)
= Hyp (
1
) Hyp (
2
)

It follows that (V ) is a proof of from and we have proved that .
Similarly, the generalization rule of the algebra (V ) gives rise to a unary
rule of inference in the language of sequents.
228
Proposition 244 Let V be a set and P(V ). For all P(V ) and x V :
( ) (x , Fr ()) x
Proof
Let V be a set and P(V ). Let P(V ) and x V be such that and
x , Fr (). From we obtain the existence of a proof
1
(V ) such that
Val(
1
) = and Hyp(
1
) . From x , Fr () and Hyp (
1
) it follows in
particular that x , Sp (
1
). Dene the proof = x
1
(V ). Then:
Val () = Val (x
1
)
x , Sp (
1
) = xVal (
1
)
= x
Furthermore, we have Hyp () = Hyp(x
1
) = Hyp (
1
) . It follows that
(V ) is a proof of x from and we conclude that x.
Every axiom of rst order logic is of the form =
1

2
. Hence, given
P(V ) from proposition (242) we have
1

2
. It follows that if the
sequent
1
has been proved, we obtain
2
immediately by application of
the modus ponens property of proposition (243). So every axiom of rst order
logic also gives rise to a unary rule of inference in the language of sequents. We
shall now make these explicit by considering each of the ve possible groups
of axioms separately. Of course, these unary rules of inference can be deduced
from other existing rules. So we may wish to regard them simply as theorems
of the form
1

2
in the language of sequents.
Proposition 245 Let V be a set and P(V ). For all
1
,
2
P(V ):

1
(
2

1
)
Proof
Let V be a set and P(V ). Let
1
,
2
P(V ) such that
1
. From de-
nition (226) the formula
1
(
2

1
) is a simplication axiom and it follows
from proposition (242) that
1
(
2

1
). Hence, using
1
and the
modus ponens property of proposition (243) we conclude that (
2

1
).
Proposition 246 Let V be a set and P(V ). For all
1
,
2
,
3
P(V ):

1
(
2

3
) (
1

2
) (
1

3
)
Proof
Let V be a set and P(V ). Let
1
,
2
,
3
P(V ) with
1
(
2

3
).
From denition (227), [
1
(
2

3
)] [(
1

2
) (
1

3
)] is a Frege
axiom and it follows from proposition (242) that:
[
1
(
2

3
)] [(
1

2
) (
1

3
)]
229
Hence, using
1
(
2

3
) and the modus ponens property of proposi-
tion (243) we conclude that (
1

2
) (
1

3
).
Proposition 247 Let V be a set and P(V ). For all
1
P(V ) we have:
(
1
)
1
Proof
Let V be a set and P(V ). Let
1
P(V ) such that (
1
) .
From denition (228) the formula [(
1
) ]
1
is a transposition
axiom and it follows from proposition (242) that [(
1
) ]
1
.
Hence, using (
1
) and the modus ponens property of proposi-
tion (243) we conclude that
1
.
Proposition 248 Let V be a set and P(V ). For all
1
,
2
P(V ), x V :
(x , Fr (
1
)) ( x(
1

2
))
1
x
2
Proof
Let V be a set and P(V ). Let
1
,
2
P(V ) and x V such that
x , Fr (
1
) and x(
1

2
). From denition (229) and x , Fr (
1
) the
formula x(
1

2
) (
1
x
2
) is a quantication axiom and it follows
from proposition (242) that x(
1

2
) (
1
x
2
). Hence, using
x(
1

2
) and the modus ponens property of proposition (243) we con-
clude that
1
x
2
.
Proposition 249 Let V be a set and P(V ). For all
1
P(V ), x, y V :
x
1

1
[y/x]
where [y/x] : P(V ) P(V ) is an essential substitution of y in place of x.
Proof
Let V be a set and P(V ). Let
1
P(V ) and x, y V such that
x
1
. Let [y/x] : P(V ) P(V ) be an essential substitution associated
with [y/x] : V V as per denition (159). From denition (230) the formula
x
1

1
[y/x] is a specialization axiom and it follows from proposition (242)
that x
1

1
[y/x]. Hence, using x
1
and the modus ponens prop-
erty of proposition (243) we conclude that
1
[y/x].
The following proposition can also be viewed as an axiom in the language of
sequents. We shall need it when proving the deduction theorem.
Proposition 250 Let V be a set and P(V ). For all
1
P(V ) we have:
(
1

1
)
230
Proof
Let V be a set and P(V ). Let
1
P(V ). Since
1
(
1

1
) is a
simplication axiom, from proposition (242) we have
1
(
1

1
). So
in order to show (
1

1
), by virtue of the modus ponens property of
proposition (243) it is sucient to prove that:
[
1
(
1

1
)] (
1

1
)
From proposition (246) it is therefore sucient to prove that:

1
[(
1

1
)
1
]
which follows immediately from the fact that
1
[(
1

1
)
1
] is also a
simplication axiom.
3.1.5 The Deduction Theorem
It is now time to deliver on our promise. Our chosen axiomatization of rst
order logic allows us to state the deduction theorem in its full generality. As
already mentioned, a good number of references (e.g. [18], [30], [32], [39], [43],
[44], [45], [62]) have chosen axiomatic systems in which the deduction theorem
fails in general, and can only be stated for closed formulas. As far as we can
tell, the most common source of failure is the acceptance of (
1
, x
1
) as a
rule of inference. This is in contrast with the formalization presented in these
notes, where the conclusion x
1
is only reached when
1
is the conclusion
of a proof
1
such that x , Sp (
1
), i.e. x is not a specic variable of
1
.
This is in line with the common mathematical practice of not claiming x
1
from
1
unless the variable x is truly arbitrary. The deduction theorem also
appears in full generality as Theorem 4.9 page 34 in Donald W. Barnes, John
M. Mack [4]. Their proof relies on an induction argument based on the length
of the proof underlying the sequent . Having chosen a free algebraic
structure for our set (V ), our own proof of the deduction theorem will benet
from the clarity and power of structural induction. Interestingly, the online
course of Prof. Arindama Singh of IIT Madras [55] also oers a proof of the
deduction theorem in full generality which implicitly makes use of structural
induction, despite having proofs dened as nite sequences of formulas. This
can be seen at the end of the lecture 39 available on YouTube via the link
http://nptel.iitm.ac.in/courses/111106052/39.
Theorem 21 Let V be a set and P(V ). For all , P(V ) we have:
( )
Proof
We shall rst prove the implication : so suppose V is a set and P(V ) is
such that ( ) where , P(V ). In particular, we have:
( )
231
and it is clear that . From the modus ponens property of proposi-
tion (243) it follows that as requested. We now prove the reverse
implication : let

be the set of all proofs (V ) with the property that


for all P(V ) and , P(V ) the following implication holds:
(Val () = ) (Hyp () ) ( )
First we shall prove that in order to complete the proof of this theorem it is
sucient to show that

= (V ). So we assume that

= (V ). Let
P(V ) and , P(V ) be such that . We need to show that
( ). From the assumption we obtain the existence of
a proof (V ) such that Val () = and Hyp () . However,
having assumed

= (V ) it follows that is also an element of

. From
Val() = and Hyp() we therefore conclude that ( )
as requested. We shall now complete the proof of this theorem by showing the
equality

= (V ) using a structural induction argument as per theorem (3)


of page 32. Since P(V ) is a generator of the algebra (V ) we rst check that
P(V )

. So suppose is a proof of the form =


1
for some
1
P(V ).
We need to show that
1

. So let P(V ) and , P(V ) be such


that
1
= Val () = and
1
= Hyp() . We need to show that
( ), that is to say (
1
). We shall distinguish two cases:
rst we assume that
1
. In that case, Hyp () and is in fact a proof
of
1
from . So
1
and our desired conclusion (
1
) follows
immediately from the simplication property of proposition (245). We now as-
sume that
1
, . From the inclusion
1
it follows that
1
=
and we therefore need to show that (
1

1
) which follows immediately
from proposition (250). This completes our proof of P(V )

. We shall now
proceed with our induction argument by showing that every proof (V )
of the form =
1
for some
1
P(V ) is an element of

. So suppose
=
1
and let P(V ) and , P(V ) be such that Val () = and
= Hyp() (This inclusion of course does not tell us anything). We
need to show that ( ). We shall distinguish two cases: rst we assume
that
1
A(V ), i.e. that
1
is an axiom. Then we have
1
= Val() = and
we need to show that (
1
). However, having assumed
1
A(V ) is
an axiom of rst order logic, we have
1
and (
1
) follows imme-
diately from the simplication property of proposition (245). We now assume
that
1
, A(V ). Then we have ( ) = Val() = and we need to show
that ( ( )). However, we know that ( ) is true by virtue
of proposition (250) and ( ( )) therefore follows from the simpli-
cation property of proposition (245). We shall now proceed with our induction
argument by assuming (V ) is of the form =
1

2
where both
1
and

2
are elements of

. We need to show that is also an element of

. So let
P(V ) and , P(V ) be such that M(Val (
1
), Val(
2
)) = Val () =
and Hyp(
1
) Hyp (
2
) = Hyp() , where M : P(V )
2
P(V ) is the
modus ponens mapping of denition (238). We need to show that ( ).
We shall distinguish two cases: rst we assume that Val (
2
) is not of the form
232
Val(
2
) = Val(
1
)
1
for any
1
P(V ). From denition (238) it fol-
lows that M(Val(
1
), Val (
2
)) = and consequently = .
So we need to prove that ( ) which follows from the sim-
plication property of proposition (245) and the fact that ( ), it-
self an outcome of proposition (250). We now assume that Val(
2
) is of the
form Val (
2
) = Val(
1
)
1
for some
1
P(V ). In this case we obtain
M(Val(
1
), Val (
2
)) =
1
= so we need to prove that (
1
). From
the modus ponens property of proposition (243) it is therefore sucient to show
that ( Val(
1
)) and ( Val (
1
)) (
1
). In fact, from
the Frege property of proposition (246), ( Val(
1
)) (
1
) is
itself a consequence of (Val (
1
)
1
) so we can concentrate on
proving this last entailment together with ( Val(
1
)). First we show
that ( Val(
1
)). Recall our assumption that the proof
1
is an el-
ement of

, and from Hyp(


1
) Hyp (
2
) we obtain in particular
Hyp(
1
) . Hence from the equality Val (
1
) = Val (
1
) we con-
clude immediately that ( Val (
1
)) as requested. We now show that
(Val (
1
)
1
). Since Val (
1
)
1
= Val (
2
) this is in fact the
same as showing ( Val (
2
)) which is also a simple consequence of the
assumption
2

and Hyp (
2
) . This completes our proof in the
case when (V ) is of the form =
1

2
. We now assume that is of
the form = x
1
where x V and
1
is an element of

. We need to show
that is also an element of

. So let P(V ) and , P(V ) be such


that Val (x
1
) = Val () = and Hyp(
1
) = Hyp() . We need
to show that ( ). We shall distinguish two cases: rst we assume
that x Sp (
1
). In this case we have Val(x
1
) = and therefore
= . So we need to prove that ( ) which we already
know is true as we have shown in this proof. We now assume that x , Sp (
1
).
Once again, we need to show that ( ) and we shall distinguish two
further cases: rst we assume that x Fr (). From x , Sp (
1
) = Fr (Hyp (
1
))
it follows that , Hyp (
1
). Hence, from Hyp() = Hyp(
1
) we
obtain Hyp () , which together with Val() = imply that . So
( ) follows immediately from the simplication property of propo-
sition (245). We now assume that x , Fr (). From x , Sp (
1
) it follows
that Val (x
1
) = xVal (
1
) and consequently = xVal (
1
). Hence we
need to show that ( xVal (
1
)). However, since x , Fr (), from
the quantication property of proposition (248), it is sucient to prove that
x( Val(
1
)). From Hyp (
1
) , dening

= Hyp (
1
)
we obtain

. It is therefore sucient to prove that

x( Val (
1
)).
From x , Sp (
1
) = Fr (Hyp (
1
)) it follows that x , Fr (

). Using the gen-


eralization property of proposition (244), it is therefore sucient to prove that

( Val(
1
)). Now recall our assumption that the proof
1
is an element
of

. From Hyp(
1
)

and Val(
1
) = Val (
1
) we obtain immediately

( Val(
1
)) as requested. This completes our induction argument.
233
3.1.6 Transitivity of Consequence Relation
We are now in a position to provide a proof of the transitivity property of the
consequence relation : if for all and , then we must have
. In essence, we shall use the deduction theorem (21) to argue that:
(
n
(
n1
. . . (
1
)
where
1
, . . . ,
n
is the set of hypothesis of a proof underlying the
sequent . From
k
and a repeated use of modus ponens, we conclude
that . Note that if is a proof of from , it would be tempting to design
a new proof of from by replacing every assumption
k
of the proof by a
proof
k
of
k
from . For those who care, this idea does not work: the proof
may contain cases of generalization with respect to a variable x which become
invalidated by the presence of x as a free variable in some Hyp(
k
). Despite our
best eorts, we have not been able to design a proof along those lines.
Proposition 251 Let V be a set. Let , P(V ) and P(V ). Then:
( for all ) ( ) (3.4)
Proof
Without loss of generality we may assume that is a nite set. Indeed, sup-
pose the implication (3.4) has been proved in this case. We shall show that it
is then true in general. So suppose for all and . We need
to show that . However, there exists
0
nite such that
0
and

0
. Hence we have for all
0
and
0
. Having assumed the
implication (3.4) is true for nite, it follows that as requested. So we
assume without loss of generality that is a nite set. We shall show that (3.4)
is true for all P(V ) and P(V ), using an induction argument on the
cardinal [[ of the set . First we assume that [[ = 0. Let P(V ) and
P(V ). From = and we obtain and in particular .
Hence the implication (3.4) is necessarily true. We now assume that (3.4) is true
for all P(V ) and P(V ) whenever [[ = n for some n N. We need
to show the same is true when [[ = n + 1. So let P(V ) and P(V ).
We assume that for all and . We need to show that
. Having assumed [[ = n + 1, in particular ,= . Let

and
dene

. Then [

[ = n. Furthermore we have =

and consequently

. Using the deduction theorem (21) we obtain

). However, since

we also have for all

.
Having assumed the implication (3.4) is true for all P(V ), P(V ) and
[[ = n, in particular it is true for , (

) and

. Hence we see that


(

) is true. Furthermore, since

we have

. Using the
modus ponens property of proposition (243) we conclude that .
There is an interesting corollary to proposition (251) which looks like the
cut rule for sequent calculus. In case this turns out to be useful we quote:
234
Proposition 252 Let V be a set. Let P(V ) and , P(V ). Then:
( ) ( )
Proof
Using the deduction theorem (21) of page 231, by assumption we have the se-
quents ( ) and (( ) ). Dening the set of formulas
= , ( ) it follows that for all . In order to
prove that , from the transitivity of proposition (251) it is therefore su-
cient to prove that . Using the transposition property of proposition (247),
we simply need to show that ( ) . From the deduction theo-
rem (21), this amounts to showing that

where

= . Let
us accept for now that

( ) and

( ) . Then from the


modus ponens property of proposition (243) we obtain

as requested.
So it remains to show that

( ) and

( ) . First we
show the sequent

( ) : from the deduction theorem (21), we need to


show that

which follows from proposition (243) and the fact that

contains the three formulas , and . So we now show the


sequent

( ) : from the deduction theorem (21), we need to show


that

which follows from proposition (243) and the fact that

contains the three formulas , ( ) and .


3.1.7 The Hilbert Deductive Congruence
Having created a sensible consequence relation T(P(V ))P(V ) we can now
formally dene the Hilbert deductive congruence on P(V ). Given , P(V ),
we shall say that and are equivalent if and only if both formulas
and are provable. So we naturally start with a study of the relation
dened by ( ) which shall be seen to be a preorder:
Denition 253 Let V be a set. We call Hilbert deductive preorder on P(V )
the relation dened by if and only if ( ), for all , P(V ).
Recall that a preorder is a binary relation which is reexive and transitive. The
proof that is indeed a preorder follows. Note that by virtue the deduction
theorem (21) of page 231, the statement ( ) is equivalent to .
Proposition 254 Let V be a set. Then, the Hilbert deductive preorder on
P(V ) is a reexive and transitive relation on P(V ).
Proof
First we show that is reexive, namely that for all P(V ). We have
to show that ( ), which follows from proposition (250). Next we show
that is transitive. So let , , P(V ) such that and . We need
to show that , that is ( ). Using the deduction theorem (21), it is
sucient to prove that . However, from the assumption we have
( ) and consequently . From the assumption we obtain
235
( ) and in particular ( ). Using the modus ponens property
of proposition (243) we conclude that as requested.
The Hilbert deductive preorder is not a congruent relation. If
1

1
and

2

2
, we cannot argue that
1

2

1

2
. Although it would be easy
to provide a counterexample, we shall refrain from doing so at this stage, as we
are not able to prove anything: we do have some tools allowing us to establish a
given sequent ( ), but we are not yet equipped to refute such a sequent.
Proving that a formula is not provable is dicult. We shall need some model
theory to do this. For now, we shall focus on the positive result:
Proposition 255 Let V be a set and be the Hilbert deductive preorder on
P(V ). Let
1
,
2
and
1
,
2
P(V ) such that
1

1
and
2

2
. Then:

1

2

1

2
Proof
We assume that
1

1
and
2

2
, i.e. (
1

1
) and (
2

2
).
We need to show that
1

2

1

2
i.e. (
1

2
) (
1

2
).
Using the deduction theorem (21) of page 231 it is sucient to prove that

1

2
(
1

2
). In fact, using theorem (21) once more, it is su-
cient to prove that
2
where =
1

2
,
1
. From the assumption
(
1

1
) we obtain
1

1
and consequently
1
. Furthermore, it is
clear that (
1

2
). Using the modus ponens property of proposition (243)
we obtain
2
. However, from the assumption (
2

2
) we have in par-
ticular (
2

2
). Using the modus ponens property of proposition (243)
once more we obtain
2
as requested.
As we have just seen, the implication operator formally behaves like a
dierence
2

1
when it comes to the Hilbert deductive preorder. By contrast,
the quantication operator x respects the order:
Proposition 256 Let V be a set and be the Hilbert deductive preorder on
P(V ). Let
1
,
1
P(V ) such that
1

1
. Then for all x V we have:
x
1
x
1
Proof
Suppose
1

1
and let x V . we need to show that x
1
x
1
which is
(x
1
x
1
). Using the deduction theorem (21) of page 231 it is sucient
to prove that x
1
with = x
1
. It is clear that x
1
. From
proposition (169) the identity mapping i : P(V ) P(V ) is an essential substi-
tution associated with the identity i : V V . In particular, it is an essential
substitution of y in place of x in the case when y = x. Using the specialization
property of proposition (249) it follows that
1
. However, the assumption

1

1
leads to (
1

1
) and in particular (
1

1
). Using the
modus ponens property of proposition (243) we obtain
1
. Since x , Fr ()
236
we conclude x
1
from the generalization property of proposition (244).
We shall now dene the Hilbert deductive congruence in terms of the Hilbert
deductive preorder, and prove it is indeed a congruence on P(V ).
Denition 257 Let V be a set and be the Hilbert deductive preorder on P(V ).
We call Hilbert deductive congruence on P(V ) the relation dened by:
( ) ( )
Proposition 258 The Hilbert deductive congruence on P(V ) is a congruence.
Proof
Let denote the Hilbert deductive congruence on P(V ). We need to prove that
is an equivalence relation which is also a congruent relation. First we show
that it is indeed an equivalence relation. Let denote the Hilbert deductive
preorder on P(V ). From proposition (254), is reexive. So it is clear that
is also reexive. It is also obvious that is symmetric. So it remains to show
that is transitive. So suppose , , P(V ) are such that and .
In particular we have and and it follows from the transitivity of
that . We show similarly that and we conclude that . So
it remains to show that is a congruent relation on P(V ). We have already
proved that from reexivity. Suppose
1

1
and
2

2
. We need to
show that
1

2

1

2
. First we show that
1

2

1

2
. This
follows from proposition (255) and the fact that
1

1
and
2

2
. Using

1

1
and
2

2
we prove similarly that
1

2

1

2
. Suppose
now that
1

1
and x V . We need to show that x
1
x
1
. However,
from
1

1
and proposition (256) we obtain x
1
x
1
. Likewise from

1

1
we have x
1
x
1
. The equivalence x
1
x
1
follows.
The Hilbert deductive congruence expresses the idea of logical equivalence
between formulas. We are already familiar with other congruences which formal-
ize the idea of identical meaning. Obviously we should expect formulas which
have identical meaning to be logically equivalent. We shall now check this is
indeed the case, starting with the substitution congruence on P(V ) :
Proposition 259 Let V be a set. Let and denote the substitution and
Hilbert deductive congruence on P(V ) respectively. Then for all , P(V ):

The substitution congruence is stronger than the Hilbert deductive congruence.
Proof
We need to show the inclusion , for which it is sucient to show R
0

where R
0
is a generator of the substitution congruence. Using denition (115)
it is therefore sucient to prove that where = x
1
for some x V
and
1
P(V ), and = y
1
[y : x] with x ,= y and y , Fr (
1
). First we
237
show that where is the Hilbert preorder on P(V ). So we need to show
that ( ). Using the deduction theorem (21) of page 231 it is sucient to
prove that , which is y
1
[y : x] where = x
1
. However, having
assumed that y , Fr (
1
), in particular we have y , Fr (). Using the gener-
alization property of proposition (244) it is therefore sucient to prove that

1
[y : x]. Let us accept for now that the formula = x
1

1
[y : x] is a
specialization axiom. Then from proposition (242) we have and using the de-
duction theorem once more we obtain
1
[y : x] as requested. So it remains to
show that is indeed a specialization axiom. From proposition (231) it is su-
cient to show that
1
[y : x]
1
[y/x] where [y/x] : P(V ) P(V ) is an essential
substitution of y in place of x. Since the permutation [y : x] : V V is injective,
from proposition (169) its associated map [y : x] : P(V ) P(V ) is also an es-
sential substitution. Hence, in order to show the equivalence
1
[y : x]
1
[y/x],
from proposition (174) it is sucient to show that [y/x] and [y : x] coincide on
Fr (
1
). This is clearly the case since y , Fr (
1
). So we have proved that
and it remains to show that . However, dene

1
=
1
[y : x]. Then it is
clear that
1
=

1
[x: y] and we need to show that y

1
x

1
[x: y], for which
we can use an identical proof as previously, provided we show that x , Fr (

1
).
So suppose to the contrary that x Fr (

1
). Using proposition (75), since
[y : x] is injective we have Fr (

1
) = [x: y](Fr (
1
)). It follows that there exists
u Fr (
1
) such that x = [x: y](u). Hence u = y which contradicts y , Fr (
1
).

Two formulas which are permutation equivalent are also logically equivalent:
Proposition 260 Let V be a set. Let and denote the permutation and
Hilbert deductive congruence on P(V ) respectively. Then for all , P(V ):

The permutation congruence is stronger than the Hilbert deductive congruence.
Proof
We need to show the inclusion , for which it is sucient to show R
0

where R
0
is a generator of the permutation congruence. Using denition (180)
it is therefore sucient to prove that where = xy
1
and = yx
1
where
1
P(V ) and x, y V . By symmetry, it is sucient to show that
i.e. ( ). Using the deduction theorem (21) of page 231 it is therefore
sucient to prove that which is yx
1
where = xy
1
. It
is clear that xy
1
. Using the specialization property of proposition (249)
we obtain y
1
. Hence, using specialization once more we obtain
1
.
However, it is clear that x , Fr (). So we can use the generalization property
of proposition (244) to obtain x
1
. With one additional use of generaliza-
tion, since y , Fr (), we conclude that yx
1
as requested.
Two formulas which are absorption equivalent are also logically equivalent:
238
Proposition 261 Let V be a set. Let and denote the absorption and
Hilbert deductive congruence on P(V ) respectively. Then for all , P(V ):

The absorption congruence is stronger than the Hilbert deductive congruence.
Proof
We need to show the inclusion , for which it is sucient to show R
0

where R
0
is a generator of the absorption congruence. Using denition (202) it
is therefore sucient to prove that
1
x
1
where
1
P(V ) and x , Fr (
1
).
First we show that
1
x
1
. We need to show that (
1
x
1
) which
is
1
x
1
after application of the deduction theorem (21) of page 231.
However, since x , Fr (
1
), from
1

1
and the generalization property
of proposition (244) we obtain
1
x
1
as requested. So it remains to
show that x
1

1
, which is (x
1

1
), which follows immediately from
proposition (242) and the fact that x
1

1
is a specialization axiom.
Propositional equivalence is also stronger than logical equivalence:
Proposition 262 Let V be a set. Let and denote the propositional and
Hilbert deductive congruence on P(V ) respectively. Then for all , P(V ):

The propositional congruence is stronger than the Hilbert deductive congruence.
Proof
To be determined.
So the substitution, permutation, absorption and propositional congruences
are all stronger than the Hilbert deductive congruence. The following propo-
sition is an easy consequence of this, and can be added to our set of rules for
sequent calculus: if a formula is provable while having the same meaning as
a formula , then is also provable. In fact, if a formula is provable while
being logically equivalent to a formula , then is also provable.
Proposition 263 Let V be a set and be a congruence which is stronger than
the Hilbert deductive congruence. Then for all , P(V ) and P(V ):
( ) ( )
Proof
We assume that and . We need to show that . However the
congruence on P(V ) is stronger than the Hilbert deductive congruence .
Hence from we obtain . It follows in particular that , i.e.
( ) and consequently we have ( ). From and the modus
ponens property of proposition (243) we conclude that as requested.
239
3.2 The Free Universal Algebra of Proofs
3.2.1 Totally Clean Proof
The universal algebra of proofs (V ) of denition (234) is not as natural as
it looks: a universal algebra has operators which must be dened everywhere.
However, the modus ponens operator should not be dened in most cases. It
makes little sense to infer anything from two conclusions
1
and
2
using modus
ponens, unless the conclusion
2
is of the form
2
=
1
in which case we
can derive the new conclusion . In other words, the proof
1

2
is not a valid
application of the modus ponens rule, unless Val(
2
) = Val(
1
) for some
P(V ). In a similar way, the proof x
1
is not a valid application of the
generalization rule, unless the condition x , Sp (
1
) is met, i.e. the variable x is
truly general and not specic. So in some sense, it may be argued that the struc-
ture of universal algebra is not suited to represent a set of proofs. These proofs
have operators which are naturally partial functions, while universal algebras
require operators which are total functions. There is a prominent example for
which this issue arises: the algebraic structure of eld has an inverse operator
which is not dened on 0. So there is a problem: our solution to it is to regard
proofs simply as formal expressions where the operators and x are dened
everywhere, and to introduce Val : (V ) P(V ) as a semantics representing
the conclusion being proved by a given formal expression. So defying common
sense, we assign meaning to all proofs =
1

2
, = x
1
and even =
when is not an axiom of rst order logic, while attempting to redeem ourselves
by setting Val () = whenever a proof arises from a awed application
of a rule of inference or awed invocation of axiom.
In this section, we introduce the notion of totally clean proof to emphasize
the case when no such awed inference occurs. We shall use the phrase totally
clean rather than simply clean so as to reserve the latter for a weaker notion
of awlessness which we shall introduce in denition (337) when dealing with
proofs modulo. In this section, we shall also establish proposition (271) below
showing that any true sequent can always be proved in a totally clean
way. This gives us some degree of reassurance: although our universal algebra of
proofs (V ) is arguably too large and contains proofs which may be deemed to
be awed, every conclusion reached by such a awed proof is in fact provable in
the orthodox way. So it would be possible to reject (V ) and only consider its
subset of totally clean proofs, but this would not change the notion of provability
as introduced in denition (241). However, beyond our peace of mind there is
another good reason to introduce the notion of totally clean proof: given a map
: V W and a proof (V ), we shall soon want to substitute the variables
of as specied by the map , thereby dening a map : (V ) (W) which
we shall do in denition (272). This map may give us a tool to create new proofs
from old proofs, allowing us to show some provability statements which would
otherwise be out of reach. For example, suppose we want to prove ()
knowing that is true. One possible line of attack is to consider a proof
(V ) underlying the sequent and hope that () is in fact a proof with
240
conclusion (). So we need the following equation to hold:
Val () = Val () (3.5)
This equation is bound to play an important role. More generally, a transfor-
mation : (V ) (W) is interesting if we control the conclusion of () in
some way, given the conclusion of . Now consider the following proof:
= (x x)((y y) (z z))
Whenever x ,= y, this proof is not totally clean as it contains a awed application
of modus ponens. However, if (x) = (y) = u while (z) = v we obtain:
() = (u u)((u u) (v v))
This is a totally clean proof with conclusion v v. It should be clear from this
example that we cannot hope to prove properties such as (3.5) in full generality,
and must restrict our attention to the case of totally clean proofs. Too many
bizarre things may happen otherwise. It is very hard to carry out a sensible
analysis of proofs which are not totally clean. Hence the following:
Denition 264 Let V be a set. The map s : (V ) 2 = 0, 1 dened by the
following structural recursion is called the strength mapping on (V ):
(V ) , s() =

1 if = P(V )
1 if = , A(V )
0 if = , , A(V )
s(
1
) s(
2
) if =
1

2
s(
1
) if = x
1
(3.6)
where 2 = 0, 1 is dened by = 1 if and only if we have the equality:
Val (
2
) = Val (
1
) Val()
and 2 = 0, 1 is dened by = 1 if and only if x , Sp (
1
). We say that a
proof (V ) is totally clean if and only if it has full strength, i.e. s() = 1.
Proposition 265 The structural recursion of denition (264) is legitimate.
Proof
We need to show the existence and uniqueness of s : (V ) 2 satisfying the ve
conditions of equation (3.6). We shall do so using theorem (5) of page 47 with
X = (V ), X
0
= P(V ) and A = 2. Choosing the map g
0
: X
0
A dened by
g
0
() = 1 we ensure the rst condition is met. Next for every formula P(V )
we dene h() : A
0
X
0
A by setting h()(0, 0) = 1 if A(V ) and
h()(0, 0) = 0 otherwise. This ensures the second and third conditions are
met. Next we dene h() : A
2
X
2
A with the following formula:
h()(
1
,
2
,
1
,
2
) =
_

1

2
if Val(
2
) = Val (
1
) Val (
1

2
)
0 if Val(
2
) ,= Val (
1
) Val (
1

2
)
241
This takes care of the fourth condition. Finally we dene h(x) : A
1
X
1
A:
h(x)(
1
,
1
) =
_

1
if x , Sp (
1
)
0 if x Sp (
1
)
which ensures the fth condition is met and completes our proof.
A proof which relies on a awed sub-proof is itself awed. Hence a proof
cannot be totally clean unless every one of its sub-proofs is totally clean.
Proposition 266 Let V be a set and s : (V ) 2 be the strength mapping.
Then the strength of a sub-proof is higher than the strength of a proof, i.e.
_ s() s()
A proof (V ) is totally clean if and only if every _ is totally clean.
Proof
The implication is a simple application of proposition (47) to s : X A where
X = (V ) and A = 2 where the chosen preorder on A is the usual order
(reversed) on 0, 1. We simply need to check that given
1
,
2
(V )
and x V we have the inequalities s(
1
) s(
1

2
), s(
2
) s(
1

2
) and
s(
1
) s(x
1
) which follow immediately from the recursive denition (264).
It remains to check that (V ) is totally clean if and only if every one of its
sub-proofs is totally clean. Since is a sub-proof of itself, the if part is clear.
Conversely, let _ be a sub-proof of where is totally clean. Then s() = 1
and since s() s() we see that s() = 1. So the proof is also totally clean.
A proof arising from modus ponens is of the form =
1

2
. Such a proof
cannot be totally clean unless the use of modus ponens is legitimate, i.e. the
conclusion of
2
is of the form Val(
2
) = Val (
1
) Val(). Of course we also
need both
1
and
2
to be totally clean. These conditions are in turn sucient.
The following proposition is useful to carry out structural induction arguments.
Proposition 267 Let V be a set and be a proof of the form =
1

2
.
Then is totally clean if and only if both
1
,
2
are totally clean and:
Val (
2
) = Val (
1
) Val()
Proof
First we show the only if part: so we assume that =
1

2
is totally clean.
Then s() = 1 and from denition (264) we obtain s(
1
) s(
2
) = 1. It
follows that s(
1
) = s(
2
) = = 1. So
1
and
2
are totally clean and from
= 1 and denition (264) we conclude that Val (
2
) = Val (
1
) Val (). We
now prove the if part: so we assume that =
1

2
where
1
,
2
are totally
clean and Val (
2
) = Val(
1
) Val(). Then from denition (264):
s() = s(
1
) s(
2
) = 1 1 1 = 1
242
It follows that is itself a totally clean proof, as requested.
A proof arising from generalization is of the form = x
1
. Such a proof
cannot be totally clean unless the use of generalization is legitimate, that is
x , Sp (
1
). We also need
1
to be totally clean, and these conditions are in
turn sucient. The following proposition is also useful for structural induction.
Proposition 268 Let V be a set and be a proof of the form = x
1
. Then
is totally clean if and only if
1
is totally clean and x , Sp (
1
).
Proof
First we show the only if part: so we assume that = x
1
is totally clean.
Then s() = 1 and from denition (264) we obtain s(
1
) = 1. It follows
that s(
1
) = = 1. So
1
is totally clean and furthermore from = 1 and
denition (264) we conclude that x , Sp (
1
). We now prove the if part: so we
assume that = x
1
, where
1
is totally clean and x , Sp (
1
). Then from
denition (264) s() = s(
1
) = 1 1 = 1. So is itself totally clean.
We decided to dene a totally clean proof in terms of strength s : (V ) 2
as per denition (264). This was not the only way to go about it. A totally clean
proof is one which is not awed. This means that every axiom invocation or
use of modus ponens or generalization is legitimate. The following proposition
conrms that a totally clean proof can be dened as one without awed steps:
Proposition 269 Let V be a set. Then (V ) is totally clean if and only
if for all
1
,
2
(V ), P(V ) and x V , we have the following:
(i) _ A(V )
(ii)
1

2
_ Val(
2
) = Val (
1
) Val (
1

2
)
(iii) x
1
_ x , Sp (
1
)
Proof
First we show the only if part: so we assume that (V ) is totally clean.
We need to show that (i), (ii) and (iii) hold. First we show (i) : so we as-
sume that _ is a sub-proof of of the form = for some P(V ).
We need to show that A(V ), i.e. that is an axiom. Having assumed
that is totally clean, from proposition (266) the proof = is also totally
clean. So A(V ) follows immediately from denition (264). Next we show
(ii) : so we assume that _ is a sub-proof of of the form =
1

2
. We
need to show that Val (
2
) = Val (
1
) Val(). Having assumed that is
totally clean, from proposition (266) the proof is also totally clean. However,
is of the form =
1

2
and it follows from proposition (267) that we have
Val(
2
) = Val (
1
) Val() as requested. We now show (iii) : so we assume
that _ is a sub-proof of of the form = x
1
. We need to show that
x , Sp (
1
). Having assumed that is totally clean, from proposition (266)
the proof is also totally clean. However, is of the form = x
1
and it
243
follows from proposition (268) that x , Sp (
1
) as requested. We now show
this if part: we need to prove that every (V ) satises the property
(i) + (ii) + (iii) ( is totally clean). We shall do so with a structural in-
duction argument, using theorem (3) of page 32. So rst we assume that =
for some P(V ). From denition (264) we obtain s() = 1. So is always
totally clean and the property is satised. Next we assume that = where
P(V ). We need to show satises our property. So we assume that (i), (ii)
and (iii) hold. We need to show that is totally clean. However, since is a
sub-proof of itself we have _ and using (i) we obtain A(V ). It follows
from denition (264) that s() = s() = 1 and we conclude that is totally
clean as requested. Next we assume that =
1

2
where
1
,
2
(V )
satisfy our induction property. We need to show the same is true of . So we
assume that (i), (ii) and (iii) hold for . We need to show that is totally
clean. Using proposition (267), it is sucient to prove that both
1
and
2
are
totally clean and furthermore that Val (
2
) = Val (
1
) Val(). This last
equality is an immediate consequence of condition (ii) and the fact that is a
sub-proof of itself i.e. =
1

2
_ . So it remains to show that
1
and
2
are
totally clean. Having assumed that
1
and
2
satisfy our induction property,
it is therefore sucient to prove that (i), (ii) and (iii) are satised by
1
and

2
. First we show that (i) is satised by
1
: so we assume that _
1
is a
sub-proof of
1
of the form = . We need to show that A(V ). However,
having assumed that (i) holds for =
1

2
, it is sucient to prove that _
which follows immediately from _
1
and
1
_ . We prove similarly that
(i) holds for
2
. Next we show that (ii) is satised by
1
: so we assume that
_
1
is a sub-proof of
1
of the form =
1

2
. We need to show that
Val(
2
) = Val(
1
) Val (). However, having assumed that (ii) holds for
=
1

2
, it is sucient to prove that _ which follows immediately from
_
1
and
1
_ . We prove similarly that (ii) holds for
2
and we now show
that (iii) holds for
1
: so we assume that _
1
is a sub-proof of
1
of the
form = x
1
. We need to show that x , Sp (
1
). However, having assumed
that (iii) holds for =
1

2
, it is sucient to prove that _ which again
follows from _
1
and
1
_ . We prove similarly that (iii) holds for
2
which
completes our induction argument in the case when =
1

2
. So we nally
assume that = x
1
where x V and
1
(V ) satises our induction
property. We need to show the same is true of . So we assume that (i), (ii)
and (iii) hold for . We need to show that is totally clean. Using proposi-
tion (268), it is sucient to prove that
1
is totally clean and furthermore that
x , Sp (
1
). This last property is an immediate consequence of condition (iii)
and the fact that is a sub-proof of itself i.e. = x
1
_ . So it remains to
show that
1
is totally clean. Having assumed that
1
satises our induction
property, it is therefore sucient to prove that (i), (ii) and (iii) hold for
1
.
First we show that (i) is true: so we assume that _
1
is a sub-proof of
1
of
the form = . We need to show that A(V ). However, having assumed
that (i) holds for = x
1
, it is sucient to prove that _ which follows
immediately from _
1
and
1
_ . Next we show that (ii) is true: so we
assume that _
1
is a sub-proof of
1
of the form =
1

2
. We need to
244
show that Val (
2
) = Val (
1
) Val(). However, having assumed that (ii)
holds for = x
1
, it is sucient to prove that _ which follows imme-
diately from _
1
and
1
_ . We now show that (iii) is satised: so we
assume that _
1
is a sub-proof of
1
of the form = u
1
. We need to
show that u , Sp (
1
). However, having assumed that (iii) holds for = x
1
,
it is sucient to prove that _ which again follows from _
1
and
1
_ .
As we shall see in proposition (271) below, any true sequent can be
established with a totally clean proof. The key to the argument is that any
awed step _ of a proof with conclusion Val() = can be discarded
and replaced by a totally clean fragment as the following lemma shows:
Lemma 270 Let V be a set. There exists a totally clean proof (V ) with:
( Hyp () = ) ( Val() = )
Proof
Consider the formulas
1
= (( ) ) and
2
=
1
( ).
Let us accept for now that both
1
and
2
are axioms and dene =
1

2
.
It is clear that Hyp() = and furthermore we have the equality:
Val() = Val(
1

2
)
def. (239) = M( Val(
1
), Val (
2
) )

1
,
2
A(V ) = M(
1
,
2
)
= M(
1
,
1
( ) )
=
So it remains to show that is totally clean. However, from denition (264)
both
1
and
2
are totally clean and it follows from proposition (267) and:
Val(
2
) = Val (
1
) Val()
that is also totally clean. It remains to check that both
1
and
2
are indeed
axioms of rst order logic. However, we have
1
= ((
1
) )
1
where
1
= . So
1
is a transposition axiom as per denition (228). Likewise,
we have
2
= ((
2
) )
2
where
2
= and
2
is an axiom.
We can now prove the main result of this section: there is no loss of generality
in assuming that a true sequent has an underlying proof which is totally
clean. This shows that dening our syntactic entailment solely in terms of totally
clean proofs would have made no dierence to the nal notion of provability.
However, not working on the free universal algebra of proofs (V ) would have
made formal developments considerably more cumbersome.
Proposition 271 Let V be a set and P(V ). Let P(V ) be such that
the sequent is true. Then there exists a totally clean proof of from .
245
Proof
In order to establish this proposition, we shall rst prove it is sucient to show
that every proof (V ) has a totally clean counterpart, namely that:

(V ) , (

totally clean) ( Val(

) = Val () ) ( Hyp (

) Hyp () )
In other words, for every proof (V ), there exists a totally clean proof

with the same conclusion as , and fewer hypothesis. So let us assume this is
the case for now. We need to show the proposition is true. So let P(V )
and P(V ) such that . Then there exists a proof of from ,
i.e. (V ) such that Val () = and Hyp() . Taking a totally
clean counterpart of , we obtain a totally clean proof

(V ) such that
Val(

) = Val () and Hyp (

) Hyp(). It follows that

is a totally clean
proof of from as requested. So we need to show that every proof (V )
has a totally clean counterpart. We shall do so by structural induction, using
theorem (3) of page 32. First we assume that = for some P(V ). Then
is totally clean and there is nothing else to prove. So we now assume that
= for some P(V ). We shall distinguish two cases: rst we assume
that A(V ). Then is totally clean and we are done. We now assume that
, A(V ). Then Val () = . Using lemma (270) there exists a totally
clean proof

(V ) such that Hyp (

) = and Val(

) = . It follows
that Hyp (

) Hyp () and Val (

) = Val () and we have found a totally


clean counterpart of . Next we assume that =
1

2
where
1
,
2
(V )
are proofs with totally clean counterparts. We need to show that also has a
totally clean counterpart. So let

1
and

2
be totally clean counterparts of
1
and
2
respectively. We shall distinguish two cases. First we assume that:
Val (

2
) = Val (

1
) Val (

2
) (3.7)
Then dening

2
, since

1
and

2
are totally clean it follows from
proposition (267) that

is totally clean. Furthermore, we have:


Hyp (

) = Hyp (

2
)
= Hyp (

1
) Hyp (

2
)
Hyp (
1
) Hyp (
2
)
= Hyp (
1

2
)
= Hyp ()
and:
Val (

) = Val (

2
)
def. (239) = M( Val (

1
), Val (

2
) )
Val (

i
) = Val(
i
) = M( Val (
1
), Val (
2
) )
= Val (
1

2
)
= Val ()
246
So we have found a totally clean counterpart

of as requested. Next we
assume that equation (3.7) does not hold. Then Val(

2
) = Val (

1
)
is false for all P(V ). It follows that Val (
2
) = Val (
1
) is also
false for all and consequently from denition (239) Val () = . Using
lemma (270) there exists a totally clean proof

(V ) such that Hyp (

) =
and Val(

) = . It follows that Hyp (

) Hyp () and furthermore


Val(

) = Val (). So we have found a totally clean counterpart of . Finally,


we assume that = x
1
where x V and
1
(V ) is a proof which has
a totally clean counterpart. We need to show that also has a totally clean
counterpart. So let

1
be a totally clean counterpart of
1
. We shall distinguish
two cases: rst we assume that x , Sp (
1
). Since Hyp (

1
) Hyp (
1
) we have
Sp (

1
) Sp (
1
) and consequently x , Sp (

1
). Hence, dening

= x

1
,
since

1
is totally clean it follows from proposition (268) that

is itself totally
clean. Furthermore we have the following inclusion:
Hyp(

) = Hyp (

1
) Hyp (
1
) = Hyp()
and the equalities:
Val (

) = Val (x

1
)
x , Sp (

1
) = xVal (

1
)
= xVal (
1
)
x , Sp (
1
) = Val (x
1
)
= Val ()
So we have found a totally clean counterpart

of as requested. Next we
assume that x Sp (
1
). Then from denition (239) we have Val () = ,
and using lemma (270) we see once again that has a totally clean counterpart.

3.2.2 Variable Substitution in Proof


In denition (53) we introduced the substitution : P(V ) P(W) associated
with a map : V W. This substitution is crude in the sense that it is
generally not capture-avoiding. In other words, given a formula P(V ), the
substitution is generally not valid for , as per denition (83). However, the
introduction of the minimal transform of denition (137) allowed us to build on
this elementary substitution and derive a new type of variable substitution map-
ping : P(V ) P(W), the so-called essential substitutions of denition (165).
Essential substitutions are capture-avoiding and their existence is guaranteed
by theorem (18) with a mere cardinality assumption on V and W. A notable
application of essential substitutions is our ability to put forward an axiomati-
zation of rst order logic where the specialization axioms of denition (230), i.e.
x
1

1
[y/x] are stated without the usual caveat of y being free for x in
1
.
247
What was done for formulas can be done for proofs. Given a map : V W
and a proof (V ), it is meaningful to ask which proof () (W) can be
derived by systematically substituting variables in according to the map .
As in the case of formulas, we cannot expect miracles from this as the transfor-
mation will most likely be too crude in general. However, this may be a good
starting point which is certainly worth investigating. There are many reasons
for us to attempt building substitutions on proofs which are associated in some
way with a map : V W. Any : (V ) (W) can be viewed as a tool
to create new proofs. This tool may prove invaluable when attempting to show
that a proof exists. Godels completeness theorem is a case when the existence of
a proof has to be established. Many common arguments in mathematical logic
involve language extensions which can be viewed as embeddings. It is usually
implicit in the argument that what is provable (or inconsistent) in one space,
remains provable in the other. As we do not wish to take anything for granted,
this is one reason for us to look into substitution of variables in proofs. Another
reason is to establish some form of structurality condition for our consequence
relation . This condition is stated in [6] and [50] as () ().
However, at this point of the document it is not clear to us what type of substi-
tution would allow the structurality condition to hold in the context of rst
order logic with terms. Our hope is that essential substitutions will work, and
indeed they will as theorem (30) of page 396 will show. The work leading up
to this important theorem requires a new tool, namely the ability to consider
variable substitutions in proofs. With this in mind, we dene:
Denition 272 Let V , W be sets and : V W be a map. We call proof
substitution associated with , the map

: (V ) (W) dened by:


(V ) ,

() =

() if = P(V )
() if =

(
1
)

(
2
) if =
1

2
(x)

(
1
) if = x
1
(3.8)
where : P(V ) P(W) also denotes the associated substitution mapping.
Given a map : V W and P(V ) the notation () is potentially
ambiguous. Since P(V ) (V ), it may refer to the proof of denition (272) or
to the formula of denition (53). Luckily, the two notions coincide.
Proposition 273 The structural recursion of denition (272) is legitimate.
Proof
We need to prove the existence and uniqueness of the map

: (V ) (W)
satisfying the four conditions of equation (3.8). We shall do so using theorem (4)
of page 45 applied to the free universal algebra X = (V ) with X
0
= P(V ) and
A = (W). First we dene g
0
: X
0
A by g
0
() = () which ensures the rst
condition is met. Next, given a formula P(V ) we dene h() : A
0
A by
setting h()(0) = () which ensures the second condition is met. Next we
dene h() : A
2
A by setting h()(
1
,
2
) =
1

2
which ensures the third
248
condition is met. Finally, given x V we dene h(x) : A
1
A by setting
h(x)(
1
) = (x)
1
. This guarantees our fourth condition is met.
Given : V W, the associated proof substitution

: (V ) (W) is
denoted

and not . The following proposition will allow us to change that


and revert to the simple notation : (V ) (W) just as we have done with
: P(V ) P(W). This proposition should be compared with proposition (55).
Once we know that ( )

we can safely get rid of the star .


Proposition 274 Let U, V , W be sets. Let : U V and : V W be
maps. Let

: (U) (V ) and

: (V ) (W) be the proof substitu-


tions associated with and respectively. Then we have the equality:
( )

where ( )

: (U) (W) is the proof substitution associated with .


Proof
We need to show that ( )

() =

() for all (U). We shall do so


with a structural induction argument, using theorem (3) of page 32. First we
assume that = for some P(U). Then we have the equalities:
( )

() = ( )

()
= ()
=

(())
=

()
Next we assume that = for some P(U). Then we have:
( )

() = ( )

()
= ( ())
=

(())
=

()
=

()
Next we assume that =
1

2
where
1
,
2
(U) satisfy our property:
( )

() = ( )

(
1

2
)
= ( )

(
1
)( )

(
2
)
=

(
1
)

(
2
)
=

(
1
)

(
2
) )
=

(
1

2
)
=

()
249
Finally, we assume that = x
1
where x U and
1
satises our property:
( )

() = ( )

(x
1
)
= (x) ( )

(
1
)
= (x)

(
1
)
=

( (x)

(
1
) )
=

(x
1
)
=

()

As in the case of proposition (56), the following spells out the obvious:
Proposition 275 Let V be a set and i : V V be the identity mapping. Then,
the associated proof substitution mapping i : (V ) (V ) is also the identity.
Proof
We need to show that i() = for all (V ). We shall do so with a struc-
tural induction argument using theorem (3) of page 32. First we assume that
= for some P(V ). Then i() = i() and using proposition (56) it
follows that i() = = . Next we assume that = for some P(V ).
Then we have i() = i() = = . Next we assume that =
1

2
in
which case we obtain i() = i(
1
)i(
2
) =
1

2
= . Finally we assume that
= x
1
which yields i() = i(x)i(
1
) = x
1
= .
The map : (V ) (W) is a structural substitution as per deni-
tion (49). This allows us to claim that Sub (()) = (Sub ()). Hence:
_ () _ ()
In other words, if is a sub-proof of then () is a sub-proof of (). We
also have the converse: any sub-proof of () is of the form () where is a
sub-proof of . The following proposition is the analogue of proposition (57).
Proposition 276 Let V, W be sets and : V W be a map. The associated
proof substitution : (V ) (W) is structural and for all (V ) :
Sub (()) = (Sub ())
Proof
By virtue of proposition (50) it is sucient to prove that : (V ) (W)
is a structural substitution as per denition (49). Let (V ) and (W) denote
the Hilbert deductive proof types associated with V and W respectively as per
denition (233). Let q : (V ) (W) be the map dened by:
f (V ) , q(f) =

() if f =
if f =
(x) if f = x
250
Then q is clearly arity preserving. In order to show that is a structural sub-
stitution, we simply need to check that properties (i) and (ii) of denition (49)
are met. First we start with property (i): so let = for some P(V ). We
need to show that () P(W) which follows immediately from () = ().
So we now show property (ii): given f (V ), given (V )
(f)
we need
to show that (f()) = q(f)(()). First we assume that f = for some
P(V ). Then (f) = 0, = 0 and consequently we have:
(f()) = ((0))
(0) denoted = ()
def. (272) = ()
()(0) denoted () = ()(0)
: 0 0 = q()((0))
= q(f)(())
Next we assume that f = . Then (f) = 2 and given = (
0
,
1
) :
(f()) = (
0

1
)
def. (272) = (
0
)(
1
)
: (V )
2
(W)
2
= q()((
0
,
1
))
= q(f)(())
Finally we assume that f = x, x V . Then (f) = 1 and given = (
0
) :
(f()) = (x
0
)
def. (272) = (x)(
0
)
: (V )
1
(W)
1
= q(x)((
0
))
= q(f)(())

When dealing with congruences on (V ) we shall need the following:


Proposition 277 Let V and W be sets and : V W be a map. Let be an
arbitrary congruence on (W) and let be the relation on (V ) dened by:
() ()
for all , (V ). Then is a congruence on (V ).
Proof
Since the congruence on (W) is an equivalence relation, is clearly reexive,
symmetric and transitive on (V ). So we simply need to show that is a
congruent relation on (V ). Since is reexive, we have () () for
all P(V ), and so . Suppose
1
,
2
,
1
and
2
(V ) are such that
251

1

1
and
2

2
. Dene =
1

2
and =
1

2
. We need to show
that , or equivalently that () (). This follows from the fact that
(
1
) (
1
), (
2
) (
2
) and furthermore:
() = (
1

2
)
= (
1
)(
2
)
(
1
)(
2
)
= (
1

2
)
= ()
where the intermediate crucially depends on being a congruent relation
on (W). We now suppose that
1
,
1
(V ) are such that
1

1
. Let
x V and dene = x
1
and = x
1
. We need to show that , or
equivalently that () (). This follows from (
1
) (
1
) and:
() = (x
1
)
= (x) (
1
)
(x)(
1
)
= (x
1
)
= ()
where the intermediate crucially depends on being a congruent relation.
3.2.3 Hypothesis of a Proof
Given a proof (V ), the set of hypothesis Hyp () was dened in an earlier
section as per denition (236). At the time, a simple denition was enough
so we could dene the set of specic variables Sp () which in turn allowed us
to speak of the valuation mapping Val : (V ) P(V ) and correspondingly
establish the notion of provability and sequent , as per denition (241). It
is now time to be a little bit more thorough and state some of the elementary
properties satised by the set of hypothesis Hyp(). We start with the fact that
Hyp() is simply the set of formulas P(V ) which are sub-proofs of :
Proposition 278 Let V be a set and (V ). Then for all P(V ):
Hyp () _
In other words, is a hypothesis of if and only if is a sub-proof of .
Proof
We need to show the equality Hyp() = Sub() P(V ). We shall do so with a
structural induction argument using theorem (3) of page 32. First we assume
that = for some P(V ). Then we have Hyp() = and from deni-
tion (40) Sub () = . So the equality is clear. Next we assume that =
252
for some P(V ). Then Hyp () = while Sub() = . Using theorem (2)
of page 21 we obtain Sub () P(V ) = . So the equality is true. Next we
assume that =
1

2
where
1
,
2
(V ) satisfy the equality. Then:
Hyp() = Hyp (
1

2
)
= Hyp (
1
) Hyp(
2
)
= (Sub (
1
) P(V )) (Sub (
2
) P(V ))
= ( Sub (
1
) Sub (
2
) ) P(V )
theorem (2) of p. 21 = (
1

2
Sub (
1
) Sub (
2
) ) P(V )
= Sub (
1

2
) P(V )
= Sub () P(V )
Finally we assume that = x
1
where x V and
1
satises the equality:
Hyp() = Hyp (x
1
)
= Hyp (
1
)
= Sub (
1
) P(V )
theorem (2) of p. 21 = ( x
1
Sub(
1
) ) P(V )
= Sub (x
1
) P(V )
= Sub () P(V )

The map Hyp : (V ) T(P(V )) dened on the free universal algebra


(V ) is increasing with respect to the inclusion partial order on T(P(V )).
Proposition 279 Let V be a set and , (V ). Then we have:
_ Hyp () Hyp()
Proof
This follows from an application of proposition (47) to Hyp : X A where
X = (V ) and A = T(P(V )) where the preorder on A is the usual in-
clusion . We simply need to check that given
1
,
2
(V ) and x V
we have the inclusions Hyp (
1
) Hyp (
1

2
), Hyp (
2
) Hyp (
1

2
) and
Hyp(
1
) Hyp (x
1
) which follow from the recursive denition (236).
Given a map : V W and a proof (V ), the hypothesis of the
proof () are the images by : P(V ) P(W) of the hypothesis of . Note
that the symbol is overloaded and refers to three possible maps: apart from
: V W, there is : (V ) (W) when referring to (). There is also
: P(V ) P(W) whose restriction to Hyp () has a range denoted (Hyp ()).
253
Proposition 280 Let V, W be sets and : V W be a map. Let (V ) :
Hyp (()) = (Hyp ())
where : (V ) (W) also denotes the proof substitution mapping.
Proof
We shall proof this equality with an induction argument, using theorem (3) of
page 32. First we assume that = for some P(V ). Then we have:
Hyp(()) = Hyp (())
def. (236) = ()
= ()
= (Hyp ())
So we now assume that = for some P(V ). Then Hyp() = . However
from denition (272), () = () and we have Hyp(()) = . So the equality
Hyp(()) = (Hyp ()) holds. So we now assume that =
1

2
where

1
,
2
(V ) are proofs for which the equality is true. Then:
Hyp (()) = Hyp((
1

2
))
= Hyp( (
1
)(
2
) )
= Hyp((
1
)) Hyp((
2
))
= (Hyp (
1
)) (Hyp (
2
))
= ( Hyp (
1
) Hyp (
2
) )
= (Hyp (
1

2
))
= (Hyp ())
Finally, we assume that = x
1
where x V and
1
(V ) is a proof for
which the equality is true. We need to show the same is true of :
Hyp (()) = Hyp ((x
1
))
= Hyp ( (x)(
1
) )
= Hyp ((
1
))
= (Hyp (
1
))
= (Hyp (x
1
))
= (Hyp ())

3.2.4 Axiom of a Proof


In this section, we dene and establish elementary properties of the set Ax ()
representing the set of all axioms being invoked in a proof (V ). The set
254
Ax () is casually called the set of axioms of just like Hyp() is the set of
hypothesis of . An element of Ax () is casually called an axiom of . This
terminology is slightly unfortunate since an axiom of may not be an axiom at
all. For convenience, we allowed to be meaningful for all P(V ). So the
inclusion Ax () A(V ) does not hold in general, unless is totally clean.
Denition 281 Let V be a set. The map Ax : (V ) T(P(V )) dened by
the following structural recursion is called the axiom set mapping on (V ):
(V ) , Ax () =

if = P(V )
if =
Ax (
1
) Ax (
2
) if =
1

2
Ax (
1
) if = x
1
(3.9)
We say that P(V ) is an axiom of (V ) if and only if Ax ().
The recursive denition (281) is easily seen to be legitimate and furthermore
Ax () is a nite set, as a straightforward structural induction will show.
Proposition 282 Let V be a set and (V ). Then for all P(V ) :
Ax () _
In other words, is an axiom if and only if is a sub-proof of .
Proof
Consider the map : P(V ) (V ) dened by () = . We need to show:
Ax () =
1
(Sub ()) = P(V ) : () Sub()
We shall do so with an induction argument using theorem (3) of page 32. First
we assume that = for some P(V ). Then Ax () = while Sub () = .
Using theorem (2) of page 21 we obtain
1
(Sub ()) = . So the equality is true.
Next we assume that = for some P(V ). Then we have Ax () =
and from denition (40) Sub() = . Since from proposition (235) is an
injective map, we have
1
(Sub ()) = and the equality is true. Next we
assume that =
1

2
where
1
,
2
(V ) satisfy the equality:
Ax () = Ax (
1

2
)
= Ax (
1
) Ax (
2
)
=
1
(Sub (
1
))
1
(Sub (
2
))
=
1
( Sub (
1
) Sub (
2
) )
theorem (2) of p. 21 =
1
(
1

2
Sub (
1
) Sub (
2
) )
=
1
(Sub (
1

2
))
=
1
(Sub ())
255
Finally we assume that = x
1
where x V and
1
satises the equality:
Ax () = Ax (x
1
)
= Ax (
1
)
=
1
(Sub (
1
))
theorem (2) of p. 21 =
1
( x
1
Sub (
1
) )
=
1
(Sub (x
1
))
=
1
(Sub ())

The axioms of a totally clean proof are indeed axioms of rst order logic.
Proposition 283 Let V be a set and (V ). Then we have the implication:
( totally clean) Ax () A(V )
Proof
Let (V ) be totally clean. We need to show that Ax () A(V ). So let
Ax (). We need to show that A(V ). However from proposition (282)
we obtain _ . Since is totally clean, it follows from proposition (266)
that is also totally clean. From denition (264) we see that A(V ).
The map Ax : (V ) T(P(V )) dened on the free universal algebra (V )
is increasing with respect to the inclusion partial order on T(P(V )).
Proposition 284 Let V be a set and , (V ). Then we have:
_ Ax () Ax ()
Proof
This follows from an application of proposition (47) to Ax : X A where
X = (V ) and A = T(P(V )) where the preorder on A is the usual inclusion
. We simply need to check that given
1
,
2
(V ) and x V we have the
inclusions Ax (
1
) Ax (
1

2
), Ax (
2
) Ax (
1

2
) and Ax (
1
) Ax (x
1
)
which follow from the recursive denition (281).
Given a map : V W and a proof (V ), the axioms of the proof ()
are the images by : P(V ) P(W) of the axioms of . Note that the symbol
is overloaded and refers to three possible maps: apart from : V W, there
is : (V ) (W) when referring to (). There is also : P(V ) P(W)
whose restriction to Ax () has a range denoted (Ax ()). Hence we have:
Proposition 285 Let V, W be sets and : V W be a map. Let (V ) :
Ax (()) = (Ax ())
where : (V ) (W) also denotes the proof substitution mapping.
256
Proof
We shall prove this equality with an induction argument, using theorem (3) of
page 32. First we assume that = for some P(V ). Then () = ()
and consequently Ax (()) = . So the equality is clear. Next we assume that
= for some P(V ). Then we have the equalities:
Ax (()) = Ax (())
= Ax (())
= ()
= ()
= (Ax ())
= (Ax ())
Next we assume that =
1

2
where
1
,
2
(V ) are proofs for which the
equality is true. We need to show the same is true of :
Ax (()) = Ax ((
1

2
))
= Ax ( (
1
)(
2
) )
= Ax ((
1
)) Ax ((
2
))
= (Ax (
1
)) (Ax (
2
))
= ( Ax (
1
) Ax (
2
) )
= (Ax (
1

2
))
= (Ax ())
Finally, we assume that = x
1
where x V and
1
(V ) is a proof for
which the equality is true. We need to show the same is true of :
Ax (()) = Ax ((x
1
))
= Ax ( (x)(
1
) )
= Ax ((
1
))
= (Ax (
1
))
= (Ax (x
1
))
= (Ax ())

3.2.5 Variable of a Proof


It feels somewhat articial to dene the set of variables Var() of a proof . All
variables involved are very dierent in nature and it seems very little benet can
be derived from aggregating them in one big set. Among these, there are the free
257
or bound variables coming from the set of hypothesis Hyp (), and those coming
from the set of axioms Ax (). There are also the variables involved in the use
of generalization. These ve groups of variables are not mutually exclusive,
unless when a proof is totally clean and no specic variable of Sp () can be
used for generalization. So why dene the set Var()? Experience has shown
us that given a formula P(V ), the set Var() was very useful in allowing
us to strengthen a few results on substitutions. If : V W is an injective
map, then many things can be said about : P(V ) P(W) or the formula
(). However, we usually do not need to know that is injective to derive our
conclusion. It is usually sucient to assume the injectivity of the restriction

|Var()
. This distinction is in fact crucial: given an injective map : V W,
we often need to consider a left-inverse of , namely a map : W V such
that (x) = x for all x V . Such a left-inverse is rarely injective, but
the restriction
|(V )
is an injective map. The set Var() together with Fr ()
and Bnd () were also crucial notions when dealing with valid substitutions and
minimal transforms. As we shall soon discover, the same sort of analysis can be
carried out for proofs, leading up to essential substitutions and their existence
in theorem (29) of page 383. So Var() is a useful notion and so is Var().
Denition 286 Let V be a set. The map Var : (V ) T(V ) dened by the
following structural recursion is called variable mapping on (V ):
(V ) , Var() =

Var() if = P(V )
Var() if =
Var(
1
) Var(
2
) if =
1

2
x Var(
1
) if = x
1
(3.10)
We say that x V is a variable of (V ) if and only if x Var().
Given a formula P(V ) the notation Var() is potentially ambiguous. Since
P(V ) (V ), it may refer to the usual Var() of denition (59), or to the set
Var() where = of denition (286). Luckily, the two notions coincide.
Proposition 287 The structural recursion of denition (286) is legitimate.
Proof
We need to show the existence and uniqueness of the map Var : (V ) T(V )
satisfying the four conditions of equation (3.10). This follows from an applica-
tion of theorem (4) of page 45 with X = (V ), X
0
= P(V ) and A = T(V ) where
g
0
: X
0
A is dened as g
0
() = Var(). Furthermore, given P(V ) we take
h() : A
0
A dened h()(0) = Var(). We take h() : A
2
A dened by
h()(A
0
, A
1
) = A
0
A
1
and h(x) : A
1
A dened by h(x)(A
0
) = xA
0
.

Many elementary properties of Var() resemble those of Var(). We have


not attempted to design the right level of abstraction to avoid what may appear
as tedious repetition. The following is the counterpart of proposition (61) :
258
Proposition 288 Let V be a set and (V ). Then Var() is nite.
Proof
This follows from a structural induction argument using theorem (3) of page 32.
If = P(V ) then Var() = Var() is nite by virtue of proposition (61).
If = then Var() = Var() is also nite. If =
1

2
then we have
Var() = Var(
1
) Var(
2
) which is nite. Finally if = x
1
then we have
Var() = x Var(
1
) which is also nite if Var(
1
) is itself nite.
The map Var : (V ) T(V ) is increasing with respect to the standard
inclusion on T(V ). In other words, the variables of a sub-proof are also variables
of the proof itself. The following is the counterpart of proposition (62) :
Proposition 289 Let V be a set and , (V ). Then we have:
_ Var() Var()
Proof
This is a simple application of proposition (47) to Var : X A where X = (V )
and A = T(V ) where the preorder on A is the usual inclusion . We sim-
ply need to check that given
1
,
2
(V ) and x V we have the inclu-
sions Var(
1
) Var(
1

2
), Var(
2
) Var(
1

2
) and Var(
1
) Var(x
1
)
which follow immediately from the recursive denition (286).
Given a map : V W and (V ), the variables of the proof () are
the images by of the variables of . This is the counterpart of proposition (63) :
Proposition 290 Let V and W be sets and : V W be a map. Then:
(V ) , Var(()) = (Var())
where : (V ) (W) also denotes the associated proof substitution mapping.
Proof
Given (V ), we need to show that Var(()) = (Var()). We shall do
so by structural induction using theorem (3) of page 32. First we assume that
= for some P(V ). Then we have the equalities:
Var(()) = Var(())
prop. (63) = (Var())
= (Var())
Next we assume that = for some P(V ). Then we have:
Var(()) = Var(())
= Var(())
259
= Var(())
prop. (63) = (Var())
= (Var())
= (Var())
Next we check that the property is true for =
1

2
if it is true for
1
,
2
:
Var(()) = Var((
1

2
))
= Var((
1
)(
2
))
= Var((
1
)) Var((
2
))
= (Var(
1
)) (Var(
2
))
= ( Var(
1
) Var(
2
) )
= (Var(
1

2
))
= (Var())
Finally we check that the property is true for = x
1
if it is true for
1
:
Var(()) = Var((x
1
))
= Var( (x)(
1
) )
= (x) Var((
1
))
= (x) (Var(
1
))
= ( x Var(
1
) )
= (Var(x
1
))
= (Var())

In order to have the equality () = () is it sucient that , : V W


coincide on Var(), which is pretty natural. What is less obvious is the fact that
the converse is also true. The following is the counterpart of proposition (64) :
Proposition 291 Let V and W be sets and , : V W be two maps. Then:

|Var()
=
|Var()
() = ()
for all (V ), where , : (V ) (W) are also the proof substitutions.
Proof
First we prove : We shall do so with an induction argument, using theo-
rem (3) of page 32. First we assume that = for some P(V ). Then we
260
have Var() = Var() so we assume that and coincide on Var(). Using
proposition (64) we obtain () = () which is () = (). Next we assume
that = for some P(V ). Then again we have Var() = Var() so we
assume that and coincide on Var(). Using proposition (64) once more we
obtain () = (). It follows that () = () = () = (). So we now
assume that =
1

2
where
1
,
2
(V ) satisfy our property. We need
to show the same is true of . So we assume that and coincide on Var().
We need to show that () = (). However since Var() = Var(
1
) Var(
2
)
we see that and coincide on Var(
1
) and Var(
2
), and it follows from our
induction hypothesis that (
1
) = (
1
) and (
2
) = (
2
). Hence:
() = (
1

2
)
= (
1
)(
2
)
= (
1
)(
2
)
= (
1

2
)
= ()
Finally, we assume that = x
1
where x V and
1
(V ) satises our
property. We need to show the same is true of . So we assume that and
coincide on Var(). We need to show that () = (). However since
Var() = x Var(
1
) we see that and coincide on Var(
1
), and it follows
from our induction hypothesis that (
1
) = (
1
). Hence:
() = (x
1
)
= (x)(
1
)
= (x)(
1
)
x Var() = (x)(
1
)
= (x
1
)
= ()
We now show : we shall do so with a structural induction argument, using
theorem (3) of page 32. First we assume that = for some P(V ). Then
from () = () we obtain () = () and from proposition (64) it follows
that and coincide on Var() = Var(). Next we assume that = for some
P(V ). Suppose () = (). Then () = () and consequently from
proposition (235) we have () = (). Using proposition (64) once more it fol-
lows that and coincide on Var() = Var(). We now assume that =
1

2
where
1
,
2
are proofs which satisfy our implication. We need to show the same
is true of . So we assume that () = (). We need to show that and
coincide on Var() = Var(
1
) Var(
2
). However, from () = () we obtain
(
1
)(
2
) = (
1
)(
2
) and consequently using theorem (2) of page 21
we have (
1
) = (
1
) and (
2
) = (
2
). Having assumed that
1
and
2
satisfy our implication, it follows that and coincide on Var(
1
) and also
on Var(
2
). So they coincide on Var() as requested. So we now assume that
= x
1
where x V and
1
(V ) is a proof which satises our implication.
261
We need to show the same is true of . So we assume that () = (). We
need to show that and coincide on Var() = x Var(
1
). However, from
() = () we obtain (x)(
1
) = (x)(
1
) and consequently using the-
orem (2) of page 21 we have (
1
) = (
1
) and (x) = (x). Having assumed
that
1
satises our implication, it follows that and coincide on Var(
1
).
Since (x) = (x), they also coincide on Var() = xVar(
1
) as requested.
Given a proof (V ), any variable which belongs to the conclusion
Val() of a sub-proof _ is of course a variable of . Note that the converse
is not true in general, unless is totally clean: if = and , A(V ) then
we have Val () = and any sub-proof of is itself. Hence we have
Var(Val ()) : _ = , while Var() = Var() may not be empty. We
shall deal with the case when is totally clean later in proposition (343).
Proposition 292 Let V be a set and (V ). Then we have:
Var(Val()) : _ Var() (3.11)
i.e. the variables of conclusions of sub-proofs of are variables of .
Proof
Using proposition (289) we have Var() Var() for all _ . It is therefore
sucient prove the inclusion Var(Val ()) Var() for all (V ) which we
shall do with a structural induction argument, using theorem (3) of page 32.
First we assume that = for some P(V ). Then the inclusion follows
immediately from Val () = . Next we assume that = for some P(V ).
The inclusion is clear in the case when Val () = . So we may assume
that Val () ,= in which case must be an axiom of rst order logic,
i.e. A(V ) and Val() = . The inclusion follows from Var() = Var().
So we now assume that =
1

2
where
1
,
2
(V ) are proofs satisfying
our inclusion. We need to show the same is true of . This is clearly the case if
Val() = . So we may assume Val () ,= in which case we must
have Val (
2
) = Val(
1
) Val (). Hence we see that:
Var(Val ()) Var(Val (
1
)) Var(Val())
= Var( Val (
1
) Val () )
= Var(Val (
2
))
Var(
2
)
Var(
1
) Var(
2
)
= Var(
1

2
)
= Var()
So we now assume that = x
1
where x V and
1
(V ) satises our
inclusion. We need to show the same is true of . This is clearly the case if
262
Val() = . So we may assume that Val () ,= in which case we
must have x , Sp (
1
) and Val () = xVal (
1
). Hence we see that:
Var(Val ()) = Var( xVal (
1
) )
= x Var(Val (
1
))
x Var(
1
)
= Var(x
1
)
= Var()

3.2.6 Specic Variable of a Proof


The notion of specic variable of a proof should now be natural to us . To every
proof is associated a set of hypothesis Hyp() and a set of axioms Ax ().
Every formula Hyp () has a set of free variables Fr (), and so has every
formula Ax (). In denition (237) we decided to call a specic variable
of only those variables which are free variables of some Hyp(). The
specic variables are those which are not general. This allowed us to conveniently
express the idea of legitimate use of generalization in x
1
, with the simple
formalism x , Sp (
1
). In this section, we provide a few elementary properties
of the set Sp (). We start with a structural denition of Sp () :
Proposition 293 Let V be a set. The specic variable map Sp : (V ) T(V )
of denition (237) satises the following structural recursion equation:
(V ) , Sp () =

Fr () if = P(V )
if =
Sp (
1
) Sp (
2
) if =
1

2
Sp (
1
) if = x
1
Proof
Using denition (237) we have Sp () = Fr () : Hyp(). In the
case when = for P(V ), since Hyp () = we obtain immediately
Sp () = Fr (). In the case when = for P(V ), since Hyp() = we
obtain Sp () = . When =
1

2
for some
1
,
2
(V ) we have:
Sp () = Sp (
1

2
)
= Fr () : Hyp(
1

2
)
= Fr () : Hyp(
1
) Hyp(
2
)
= ( Fr () : Hyp (
1
) Fr() : Hyp(
2
) )
= (Fr () : Hyp(
1
)) (Fr () : Hyp (
2
))
= Sp (
1
) Sp (
2
)
263
Finally if = x
1
for some x V and
1
(V ) we obtain:
Sp () = Sp (x
1
)
= Fr() : Hyp(x
1
)
= Fr() : Hyp(
1
)
= Sp (
1
)

The map Sp : (V ) T(V ) is increasing with respect to the inclusion on


T(V ). So the specic variables of a sub-proof are specic variables of the proof.
Proposition 294 Let V be a set and , (V ). Then we have:
_ Sp () Sp ()
Proof
Suppose _ is a sub-proof of (V ). Then we have:
Sp () = Fr (Hyp ())
= Fr () : Hyp ()
prop. (279) Fr () : Hyp ()
= Fr (Hyp ())
= Sp ()

Let : V W be a map and (V ) be a proof. Then () is a proof


whose specic variables are always images of specic variables of by .
Proposition 295 Let V, W be sets and : V W be a map. Let (V ) :
Sp (()) (Sp ())
where : (V ) (W) also denotes the proof substitution mapping.
Proof
There is no induction required in this case. Using denition (237) we have:
Sp (()) = Fr ( Hyp(()) )
prop. (280) = Fr ( (Hyp ()) )
= Fr() : (Hyp ())
= Fr(()) : Hyp ()
prop. (74) (Fr ()) : Hyp ()
= ( Fr () : Hyp() )
= ( Fr (Hyp ()) )
= (Sp ())
264

A specic variable of is a variable of . It had to be said once:


Proposition 296 Let V be a set and (V ). Then we have:
Sp () Var()
Proof
There is no induction required in this case. The proof goes as follows:
Sp () = Fr () : Hyp ()
prop. (80) Var() : Hyp ()
Val() = = Var(Val ()) : Hyp ()
prop. (278) Var(Val ()) : _
prop. (292) Var()

3.2.7 Free Variable of a Proof


In this section we dene and study the set of free variables Fr () of a proof
(V ). The notion of free variables is crucial when attempting to formalize
the idea of variable substitutions which avoid capture. Free variables also play
a key role when considering minimal transforms which are in turn a key step
in constructing essential substitutions. Essential substitutions for proofs will be
shown to exist in theorem (29) of page 383. Given a proof (V ), the free
variables of are simply the free variables of the hypothesis and axioms of ,
with the exception of generalization variables which are removed from Fr ():
Denition 297 Let V be a set. The map Fr : (V ) T(V ) dened by the
following structural recursion is called free variable mapping on (V ):
(V ) , Fr () =

Fr () if = P(V )
Fr () if =
Fr (
1
) Fr (
2
) if =
1

2
Fr (
1
) x if = x
1
(3.12)
We say that x V is a free variable of (V ) if and only if x Fr ().
Given a formula P(V ) the notation Fr () is potentially ambiguous. Since
P(V ) (V ), it may refer to the usual Fr () of denition (72), or to the set
Fr () where = of denition (297). Luckily, the two notions coincide.
Proposition 298 The structural recursion of denition (297) is legitimate.
265
Proof
We need to show the existence and uniqueness of the map Fr : (V ) T(V )
satisfying the four conditions of equation (3.12). This follows from an applica-
tion of theorem (4) of page 45 with X = (V ), X
0
= P(V ) and A = T(V ) where
g
0
: X
0
A is dened as g
0
() = Fr (). Furthermore, given P(V ) we take
h() : A
0
A dened h()(0) = Fr (). We take h() : A
2
A dened by
h()(A
0
, A
1
) = A
0
A
1
and h(x) : A
1
A dened by h(x)(A
0
) = A
0
x.

Given a map : V W with associated : (V ) (W), given a proof


(V ) the free variables of the proof () must be images of free variables
of by . The following proposition is the counterpart of proposition (74) :
Proposition 299 Let V , W be sets and : V W be a map. Let (V ) :
Fr (()) (Fr ())
where : (V ) (W) also denotes the associated proof substitution mapping.
Proof
We shall prove the inclusion with a structural induction, using theorem (3) of
page 32. First we assume that = for some P(V ). Then we need to show
that Fr (()) (Fr ()) which follows immediately from proposition (74).
We now assume that = for some P(V ). Then we have:
Fr (()) = Fr (())
= Fr (())
= Fr (())
prop. (74) (Fr ())
= (Fr ())
= (Fr ())
We now assume that =
1

2
where
1
,
2
(V ) are proofs which satisfy
our inclusion. We need to show the same is true of which goes as follows:
Fr (()) = Fr ((
1

2
))
= Fr ((
1
)(
2
))
= Fr ((
1
)) Fr ((
2
))
(Fr (
1
)) (Fr (
2
))
= ( Fr (
1
) Fr (
2
) )
= (Fr (
1

2
))
= (Fr ())
266
Finally, we assume that = x
1
where x V and
1
(V ) is a proof
satisfying our inclusion. We need to show the same is true of :
Fr (()) = Fr ((x
1
))
= Fr ( (x)(
1
) )
= Fr ((
1
)) (x)
(Fr (
1
)) (x)
= ( Fr (
1
) x ) (x)
( Fr (
1
) x )
= (Fr (x
1
))
= (Fr ())

The specic variables of a proof (V ) are those with respect to which


generalization is not permitted. So if a proof is totally clean, it contains no
awed attempt at generalization and the specic variables remain free variables:
Proposition 300 Let V be a set and (V ) be totally clean. Then we have:
Sp () Fr ()
In other words, the specic variables of are also free variables of .
Proof
For every proof (V ) we need to show the following implication:
( totally clean) Sp () Fr ()
We shall do so with a structural induction, using theorem (3) of page 32. First
we assume that = for some P(V ). Then is always totally clean
in this case and we have Sp () = Fr (Hyp ()) = Fr () = Fr (). We now
assume that = for some P(V ). Then Sp () = and the implication
is clearly true. So we now assume that =
1

2
where
1
,
2
(V ) are
proofs satisfying our implication. We need to show the same is true of . So we
assume that is totally clean. We need to show the inclusion Sp () Fr ().
However, from proposition (267) both
1
and
2
are totally clean. Hence:
Sp () = Sp (
1

2
)
prop. (293) = Sp (
1
) Sp (
2
)

1
,
2
totally clean Fr (
1
) Fr (
2
)
= Fr (
1

2
)
= Fr ()
267
Note that the equality Val (
2
) = Val (
1
) Val () which follows from
being totally clean, has not been used in the argument. We now assume that
= x
1
where x V and
1
(V ) is a proof satisfying our implication.
We need to show the same is true for . So we assume that is totally clean.
We need to show the inclusion Sp () Fr (). However, from proposition (268)
we see that
1
is totally clean and x , Sp (
1
). Hence, we have:
Sp () = Sp (x
1
)
prop. (293) = Sp (
1
)
x , Sp (
1
) = Sp (
1
) x

1
totally clean Fr (
1
) x
= Fr (x
1
)
= Fr ()

Just as we did for formulas, we shall need to consider congruences on (V ).


In particular, we shall introduce the substitution congruence on (V ) as per
denition (362). Whenever is a congruence on (V ) we may wish to prove
the implication Fr () = Fr (). One of the key steps in the proof is
to argue that equality between sets of free variables is itself a congruence:
Proposition 301 Let V be a set and be the relation on (V ) dened by:
Fr () = Fr ()
for all , (V ). Then is a congruence on (V ).
Proof
The relation is clearly reexive, symmetric and transitive on (V ). So we
simply need to show that is a congruent relation on (V ). By reexivity, we
already have for all P(V ). So suppose
1
,
2
,
1
and
2
(V )
are such that
1

1
and
2

2
. Dene =
1

2
and =
1

2
. We need
to show that , or equivalently that Fr () = Fr (). This follows from the
fact that Fr (
1
) = Fr (
1
), Fr (
2
) = Fr (
2
) and furthermore:
Fr () = Fr (
1

2
)
= Fr (
1
) Fr (
2
)
= Fr (
1
) Fr (
2
)
= Fr (
1

2
)
= Fr ()
We now assume that
1
,
1
(V ) are such that
1

1
. Let x V and
dene = x
1
and = x
1
. We need to show that , or equivalently
268
that Fr () = Fr (). This follows from the fact that Fr (
1
) = Fr (
1
) and:
Fr () = Fr (x
1
) = Fr (
1
) x = Fr (
1
) x = Fr (x
1
) = Fr ()

3.2.8 Bound Variable of a Proof


In this section we dene and study the set Bnd () of bound variables of a
proof (V ). The notion of bound variables has proved particularly useful
when dealing with the local inversion theorem (10) of page 107 for formulas.
A counterpart of this theorem for proofs will be established as theorem (25) of
page 353. The set of bound variables is also important in giving us a useful test
for valid substitutions in the form of proposition (93) for which a counterpart
shall be established for proofs as proposition (320). The bound variables of a
proof are simply the bound variables of the hypothesis and axioms of , to
which is added every variable occurring in the use of generalization.
Denition 302 Let V be a set. The map Bnd : (V ) T(V ) dened by the
following structural recursion is called bound variable mapping on (V ):
(V ) , Bnd () =

Bnd() if = P(V )
Bnd() if =
Bnd(
1
) Bnd(
2
) if =
1

2
x Bnd (
1
) if = x
1
(3.13)
We say that x V is a bound variable of (V ) if and only if x Bnd ().
Given a formula P(V ) the notation Bnd () is potentially ambiguous. Since
P(V ) (V ), it may refer to the usual Bnd () of denition (78), or to the set
Bnd() where = of denition (302). Luckily, the two notions coincide.
Proposition 303 The structural recursion of denition (302) is legitimate.
Proof
We need to show the existence and uniqueness of the map Bnd : (V ) T(V )
satisfying the four conditions of equation (3.13). This follows from an applica-
tion of theorem (4) of page 45 with X = (V ), X
0
= P(V ) and A = T(V ) where
g
0
: X
0
A is dened as g
0
() = Bnd(). Furthermore, given P(V ) we take
h() : A
0
A dened h()(0) = Bnd(). We take h() : A
2
A dened by
h()(A
0
, A
1
) = A
0
A
1
and h(x) : A
1
A dened by h(x)(A
0
) = xA
0
.

The free and bound variables of a proof (V ) are variables of , as we


would expect. In fact, every variable of is either free or bound, or both free
and bound. The following proposition is the counterpart of proposition (80) :
269
Proposition 304 Let V be a set and (V ). Then we have:
Var() = Fr () Bnd () (3.14)
Proof
We shall prove the equality with a structural induction, using theorem (3) of
page 32. First we assume that = for some P(V ). Then the equality
follows immediately from proposition (80). We now assume that = for
some P(V ). Then using proposition (80) once more, we have:
Var() = Var()
= Var()
prop. (80) = Fr () Bnd ()
= Fr () Bnd()
= Fr () Bnd()
So we now assume that =
1

2
where
1
,
2
(V ) are proofs satisfying
the equality. We need to show that same is true of which goes as follows:
Var() = Var(
1

2
)
= Var(
1
) Var(
2
)
= Fr (
1
) Bnd (
1
) Fr (
2
) Bnd(
2
)
= Fr (
1

2
) Bnd (
1

2
)
= Fr () Bnd()
We now assume that = x
1
where x V and
1
(V ) is a proof satisfying
our equality. We need to show the same is true of which goes as follows:
Var() = Var(x
1
)
= x Var(
1
)
= x Fr (
1
) Bnd (
1
)
= (Fr (
1
) x) x Bnd (
1
)
= Fr (x
1
) Bnd(x
1
)
= Fr () Bnd ()

The map Bnd : (V ) T(V ) is increasing with respect to the standard


inclusion on T(V ). In other words, the bound variables of a sub-proof are also
bound variables of the proof itself, a property which does not hold for free
variables. The following is the counterpart of proposition (81) :
270
Proposition 305 Let V be a set and , (V ). Then we have:
_ Bnd () Bnd()
Proof
This follows from an application of proposition (47) to Bnd : X A where
X = (V ) and A = T(V ) where the preorder on A is the usual inclu-
sion . We simply need to check that given
1
,
2
(V ) and x V
we have the inclusions Bnd (
1
) Bnd (
1

2
), Bnd (
2
) Bnd (
1

2
) and
Bnd(
1
) Bnd (x
1
) which follow from the recursive denition (302).
Given a map : V W and (V ), the bound variables of the proof
() are the images by of the bound variables of . A similar result for free
variables cannot be stated, unless the substitution is valid for , as will be
seen from proposition (315). In general, only the inclusion Fr (()) (Fr ())
is true. The following is the counterpart of proposition (82) :
Proposition 306 Let V , W be sets and : V W be a map. Let (V ) :
Bnd (()) = (Bnd ())
where : (V ) (W) also denotes the associated proof substitution mapping.
Proof
We shall prove this equality with a structural induction, using theorem (3) of
page 32. First we assume that = for some P(V ). Then we need to
show that Bnd (()) = (Bnd ()) which follows from proposition (82). Next
we assume that = for some P(V ). Then we have:
Bnd (()) = Bnd (())
= Bnd (())
= Bnd (())
prop. (82) = (Bnd ())
= (Bnd ())
= (Bnd ())
We now assume that =
1

2
where
1
,
2
(V ) are proofs satisfying our
equality. We need to show the same is true of which goes as follows:
Bnd (()) = Bnd((
1

2
))
= Bnd((
1
)(
2
))
= Bnd((
1
)) Bnd((
2
))
= (Bnd (
1
)) (Bnd (
2
))
= ( Bnd (
1
) Bnd (
2
) )
= (Bnd (
1

2
))
= (Bnd ())
271
Finally, we assume that = x
1
where x V and
1
(V ) is a proof
satisfying our equality. We need to show the same is true of :
Bnd (()) = Bnd ((x
1
))
= Bnd ( (x)(
1
) )
= (x) Bnd((
1
))
= (x) (Bnd (
1
))
= ( x Bnd (
1
) )
= (Bnd (x
1
))
= (Bnd ())

3.2.9 Valid Substitution of Variable in Proof


In denition (83) we dened what it meant for a substitution : V W to be
valid for a formula P(V ). We needed to introduce the notion that would
not randomly distort the logical structure of , a property which is commonly
known as avoiding capture. The purpose was obvious: we wanted to carry
over properties of P(V ) into corresponding properties of () P(W).
For example, theorem (15) of page 155 tells us that a substitution equivalence
is preserved by provided it is valid for both and . Another example
will be seen as the substitution lemma in the context of model theory, where
proposition (480) will show that if is valid for , then the truth of () in a
model M under an assignment a : W M is equivalent to the truth of under
the assignment a . The notion of being valid for is crucially important.
One of the most interesting properties of formulas which we may wish to
carry over is that of provability. If a sequent is true, we all want to know
under what conditions the corresponding sequent () () is also true. It is
very tempting to conjecture that provided is valid for and is also valid for
every hypothesis , then the result will hold. Of course, an obvious issue is
the fact that a sequent does not tell us anything about the axioms being
used in a proof underlying the sequent. So we cannot hope to simply carry
over every step of the proof with , as our assumption does not guarantee is