
J. F. A. K. VAN BENTHEM


FOUR PARADOXES

I. INTRODUCTION

In this paper we want to discuss four paradoxes, viz. Cantor’s paradox about
the totality of all sets, Russell’s paradox about the class of all sets which are
not members of themselves, Curry’s paradox concerning formal systems in
which self-reference is possible and Löb’s version of the Liar paradox in
which negation and falsity do not occur. It will be shown that, although
these paradoxes are closely related from a technical point of view, there is a
curious lack of historical continuity in their development. With a little
exaggeration, it could be said that Russell’s paradox should have occurred
to Cantor and Curry’s paradox to Russell, whereas Löb’s paradox is essentially
just Curry’s paradox. These and similar historical points will be found
in Section IV which summarizes the results of the expository Sections II
(about Cantor and Russell) and III (about Curry and Löb). The remainder of
the paper is devoted to a more general discussion of paradoxes. In Section V
we classify proposed remedies, showing that two main strategies are available
and, indeed, unavoidable. Section VI is about the “crises” caused by para-
doxes: the view will be defended that paradoxes are not as disastrous (logi-
cally or otherwise) as is often thought. The consternation of Frege, Russell
and Hilbert, no matter how fruitful in the end, was ill-considered from our
point of view. In a final section (Section VII) the relation between formal
theories and natural language is touched upon, in order to show how the
above formal paradoxes affect the study of language.
Before going into detail, we will give a brief sketch of our four paradoxes.
Cantor’s paradox shows that contradictions arise in set theory when the
class of all sets is considered to be a set. Russell’s paradox is about the class
of all sets not containing themselves as a member, which cannot be a set
either. When formulated in terms of concepts and application, the latter
paradox shows that the concept “not applicable to itself” leads to a contra-
diction: it is applicable to itself if and only if it is not applicable to itself.
Curry’s paradox shows that negation is not essential in this connection,

Journal of Philosophical Logic 7 (1978) 49-72. All Rights Reserved.
Copyright © 1978 by D. Reidel Publishing Company, Dordrecht, Holland.

since implication and application suffice for proving the undesirable con-
clusion that any statement is true. This last conclusion is also the contention
of Löb’s paradox, whose ingenious argument may not be too well known
and is therefore reproduced here.
First, we knew already that any statement is true, for the Liar paradox
produces a sentence (A) both false and true, and “ex falso sequitur quodlibet”.
A “says of itself” that it is false, or, in more prosaic (extensional)
terms:

(1)  $A \leftrightarrow \neg A$,
which is a logical contradiction. (For this “intensional/extensional” point,
cf. Smullyan [27].) Löb’s argument shows that the use of negation is not
needed for the proof that every statement is true. For, let B be any sentence
of the language. Create a sentence A such that A is true if and only if it
implies B, i.e.,

(2)  $A \leftrightarrow (A \to B)$.
Then argue as follows. Suppose

(3)  $A$, then

(4)  $A \to B$ and

(5)  $B$. In other words, withdrawing the assumption (3),

(6)  $A \to B$, i.e.,

(7)  $A$, so
(8)  $B$!
This proof may seem like a piece of magic (Löb’s formal proof (46), ...,
(54) from which it is derived was called “magical” in Smorynski [26]), but
the sober-minded reader must have realized already that

(9)  $(A \leftrightarrow (A \to B)) \to (A \wedge B)$
is a propositional tautology.
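The claim about (9) can be checked mechanically by running through the four truth-value assignments. The following is a minimal sketch, assuming classical two-valued semantics; the helper names `implies`, `iff` and `is_tautology` are our own illustrative choices and do not occur in the literature cited.

```python
from itertools import product

def implies(p, q):
    """Material implication."""
    return (not p) or q

def iff(p, q):
    """Material equivalence."""
    return p == q

def is_tautology(formula, arity):
    """True if `formula` holds under every assignment of truth values."""
    return all(formula(*vals) for vals in product([False, True], repeat=arity))

# Formula (9): (A <-> (A -> B)) -> (A and B)
formula_9 = lambda a, b: implies(iff(a, implies(a, b)), a and b)

# The absorption step: (A -> (A -> B)) -> (A -> B)
absorption = lambda a, b: implies(implies(a, implies(a, b)), implies(a, b))

print(is_tautology(formula_9, 2))   # True
print(is_tautology(absorption, 2))  # True
```

The second check concerns the absorption principle, which is the only propositional step in the argument that is not an immediate use of modus ponens or of the equivalence (2).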
Another way to phrase the argument, which stays even closer to Löb’s
paper, is

(10)  $A \to (A \to B)$   (from (2))

(11)  $A \to B$   (from (10) by propositional logic)

(12)  $A$   (from (11) and (2))

(13)  $B$.

II. RUSSELL’S PARADOX AND CANTOR’S PARADOX

Russell’s paradox is best known in its class form

(14)  $\{y \mid y \notin y\} \in \{y \mid y \notin y\}$ iff $\{y \mid y \notin y\} \notin \{y \mid y \notin y\}$.


Another form is the “paradox of exemplifiability”, obtained by letting
variables range over concepts. Elementary statements are of the form X(Y)
(“X applies to Y”), where the case X(X) is not excluded, and concepts may
be extracted from statements by λ-abstraction, whence

(15)  $\lambda X.\neg X(X)\,(\lambda X.\neg X(X))$ iff $\neg\lambda X.\neg X(X)\,(\lambda X.\neg X(X))$.


This form will be used in Section III.
It is often thought that Russell’s paradox teaches a specifically set-theoretic
lesson. Of course, the barber “pseudo-paradox” is given as an ordinary
language example (“the village barber shaves all and only those villagers
who do not shave themselves”), but it is quickly dismissed by saying “Well,
this just shows there are no such barbers”. Now Russell’s paradox shows just
this in set theory: there are no such sets as $\{y \mid y \notin y\}$, and in both cases the
same logical principle may be invoked, viz. that

(16)  $\neg(\exists x)(\forall y)(Ryx \leftrightarrow \neg Ryy)$


is a logical truth (the “anti-diagonal principle”).
(16) is about the simplest truth of predicate logic not of a directly propositional
nature. For, if $(\forall x_1)\ldots(\forall x_k)\varphi$ is such a truth, where $\varphi$ is a propositional
matrix, then $\varphi$ is a propositional tautology, and if $(\exists x_1)\ldots(\exists x_k)\varphi$
is such a truth, then some disjunction of instances of $\varphi$ is a propositional
tautology, by Herbrand’s theorem. (Does this mean that we have only discovered
the simple paradoxes up to now?)
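The anti-diagonal principle (16) can be illustrated by exhaustive search over small finite domains: no binary relation R admits an element x whose R-predecessors are exactly the y with not Ryy. The sketch below is only an illustration for domains of up to three elements, not a proof; the function names are hypothetical.

```python
from itertools import product

def has_antidiagonal_element(rel, n):
    """Is there an x such that, for all y, R(y, x) iff not R(y, y)?"""
    # rel[y][x] encodes R(y, x); "!=" on booleans expresses "iff not".
    return any(all(rel[y][x] != rel[y][y] for y in range(n)) for x in range(n))

def count_counterexamples(n):
    """Count the binary relations on {0, ..., n-1} that would violate (16)."""
    count = 0
    for bits in product([False, True], repeat=n * n):
        rel = [list(bits[i * n:(i + 1) * n]) for i in range(n)]
        if has_antidiagonal_element(rel, n):
            count += 1
    return count

for n in (1, 2, 3):
    print(n, count_counterexamples(n))  # the count is 0 for each n
```

The reason is visible in the code: for y = x the test compares `rel[x][x]` with itself, which is the barber asking whether he shaves himself.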
Blocking the paradox at once by forbidding the $\{y \mid \varphi(y)\}$ notation or
similar notations allowing self-reference phenomena sometimes makes
people forget the above basic feature of the paradox. Another example:
“$x = \{y \mid x \notin y\}$” is wrong not so much because of the self-reference, but
because of the fact that

(17)  $(\exists x)(\forall y)(Ryx \leftrightarrow \neg Rxy)$


is a logical contradiction. More generally, taboos on the formulation of self-
referential statements often make it impossible to attack certain opinions by
showing that they are self-refuting. These opinions have, then, to be dismissed
beforehand as “unwellformed”, which is surely less convincing to
their subscribers.
Still, there is a difference between the barber and the class version. In the
barber case the normal solution would be as follows. Ordinary language is
not very cautious about phrases like “for all” and “for all except”. We say,
e.g., “He is richer than anyone”, meaning “richer than anyone else”; and
“He was loved by no one” is not ordinarily thought to imply that he did not
love himself. (Compare the familiar “Any rule has an exception (except this
one)”.) Therefore, we amend the story to “The village barber shaves, of the
village population minus himself, exactly those who do not shave themselves”.
This is perfectly compatible with the remainder of the truth, which
is that he shaves himself, if only for advertising purposes.
It is to be doubted, however, if this way-out provides an appropriate
answer to the Russellian criticism of Descartes’ Cogito found in Orenstein
[20]: “Someone thinks about (exactly) those who do not think about themselves.
Does it follow that he thinks? No, for there is no such person”. If we
take the sentence “Someone thinks about (exactly) those who do not think
about themselves” extensionally, as $(\exists x)(\forall y)$ ($x$ thinks about $y$ iff $y$ does
not think about $y$), then we have a contradiction, and the way-out is available.
But this amounts to following Orenstein in his confusion between the
extensional reading and the intensional one (presumably meant by
Descartes): $(\exists x)$ $x$ thinks about $\{y \mid y$ does not think about $y\}$. Then, it is
not up to us to change the description of what such an x has in mind. More-
over, even if it be admitted that our x argues as follows “I am thinking about
(exactly) those who do not think about themselves” and, being a trained
philosopher, arrives at a contradiction through self-reflection, this just means
that x was thinking a contradiction: something entirely possible according
to most authorities.
In the case of set theory, at least, the paradox cannot be avoided in this
way, although

(18)  $(\exists x)(\forall y)(y \neq x \to (Ryx \leftrightarrow \neg Ryy))$
is consistent in logic. To see this, consider the assumption

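(19)  $(\exists x)(\forall y)(y \neq x \to (y \in x \leftrightarrow y \notin y))$.
We distinguish two cases:

(i)  $x \in x$. Then $(\forall y)(y \in x - \{x\} \leftrightarrow y \notin y)$.

(ii)  $x \notin x$. Then $(\forall y)(y \in x \cup \{x\} \leftrightarrow y \notin y)$.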
Both ways a Russell set has been constructed and a contradiction follows.
So, very weak set-theoretic principles (formation of singletons, unions and
relative complements) destroy this possibility.
Russell’s argument is strongly reminiscent of the argument used by
Cantor to show that the cardinality of the power set of any given set y
exceeds that of y. Recall the situation: it is supposed that some function f
exists from y onto the power set $\mathcal{P}(y)$ of y. Then z is defined as
$\{u \in y \mid u \notin f(u)\}$. Since f is onto, z = f(v) for some $v \in y$, and a contradiction
follows: $v \in f(v)$ iff $v \notin f(v)$.
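For a finite set the diagonal argument can be run exhaustively: whatever map f from y to its power set is chosen, the diagonal set z is missed by f. The following sketch checks this for a three-element set; the function names are illustrative assumptions, and a finite check is of course only an illustration of the general argument.

```python
from itertools import combinations, product

def power_set(elements):
    """All subsets of `elements`, as frozensets."""
    return [frozenset(c) for r in range(len(elements) + 1)
            for c in combinations(elements, r)]

def diagonal_set(y, f):
    """Cantor's z = {u in y | u not in f(u)} for a map f given as a dict."""
    return frozenset(u for u in y if u not in f[u])

def diagonal_always_missed(y):
    """Check that, for every f: y -> P(y), z lies outside the range of f."""
    subsets = power_set(y)
    for images in product(subsets, repeat=len(y)):
        f = dict(zip(y, images))
        if diagonal_set(y, f) in f.values():
            return False
    return True

print(diagonal_always_missed([0, 1, 2]))  # True: all 512 maps miss their z
```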
Let us pursue this point. Cantor’s paradox arises when the class of all sets
is taken to be a set itself. Prima facie, there seems to be quite some difference
with the Russell case, for

(20)  $(\exists x)(\forall y)\, y \in x$
is surely not a logical contradiction! So one might call Cantor’s paradox a
purely set-theoretic one. (And yet the difficulties arising whenever a
“totality of all things” is assumed, e.g., in ontology, seem to point at a
logical insight.) Then, how is a contradiction arrived at? We mention three
possible ways:
(a) Use the separation axiom to prove the existence of the Russell class.
This is fast, but hardly instructive. (We return to this.)
(b) Use the axiom of regularity and the existence of singletons to prove
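that $(\forall x)\, x \notin x$, contradicting (20).
[Proof: The axiom of regularity reads

(21)  $(\forall x)((\exists y)\, y \in x \to (\exists y)(y \in x \wedge (\forall z)(z \in y \to z \notin x)))$.
Now if $(\forall y)\, y \in x$, then $x \in x$, but $\{x\}$ contradicts (21): no $\in$-minimal
element occurs in it.]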

This method is rather anachronistic, however: the axiom of regularity is


a relatively late addition to set theory, and may even be still debatable to
some.
It seems we are left with the usual explanation, given, e.g., by Kleene in
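[12]:
“Let T be the set of all sets. Now $2^T$ is a set of sets, and hence $2^T \subseteq T$.
By the definition of $\leq$ for cardinals (Section 34), if $M \subseteq N$ then $\bar{\bar{M}} \leq \bar{\bar{N}}$.
(Why?) So $\overline{\overline{2^T}} \leq \bar{\bar{T}}$. But by Cantor’s theorem (C), $\overline{\overline{2^T}} > \bar{\bar{T}}$. Thus we have a
contradiction”.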
Apart from being far too complicated (the notion of cardinality had
better be left out), this proof contains a superfluous appeal to the power set
axiom. For, consider the situation. It was supposed that

(22)  $(\forall y)\, y \in x$. Therefore,

(23)  $(\forall y)(y \in x \leftrightarrow y \subseteq x)$,
so the identity mapping constitutes an isomorphism f between x and “$\mathcal{P}(x)$”.
Just following Cantor’s argument given above yields $z = \{x \mid x \notin x\}$ and, since
z = f(z), we get . . . Russell’s paradox.

III. CURRY’S PARADOX AND LÖB’S PARADOX

Both the Liar-paradox and Russell’s paradox aim at creating a statement


which is true if and only if it is false. Is the concept of falsity essential in
this connection? To fix our thoughts, consider the paradox of exemplifiability
((15)): can negation be removed from its formulation? The most
obvious idea is to apply Russell’s own trick of replacing $\neg X$ by $X \to B$,
where B is an arbitrary “unwanted” statement. Instead of $\lambda X.\neg X(X)$ we
then get (F) $\lambda X.\,X(X) \to B$. Applying F to itself, as before, yields

(24)  $F(F) = \lambda X.\,X(X) \to B\,(F) = F(F) \to B$.


In [4] Curry considered (24) (without our comment about the Russell
trick, however; he mentions Carnap as his inspiration), and then argued

(25)  $F(F) \to F(F)$ and so, by (24)

(26)  $F(F) \to (F(F) \to B)$, from which, by the propositional
“absorption rule”,

(27)  $F(F) \to B$, or, by (24) again,

(28)  $F(F)$ and

(29)  $B$.
Curry interpreted this as a paradoxical feature of even negation-less sys-
tems in which self-reference is allowed: any statement B becomes provable.
The corresponding natural language paradox lies at hand, although he does
not formulate it explicitly.
It is interesting to find that some people reacting to this paradox did not
find fault with self-reference at all: they interpreted it as meaning that some
of the steps (25), ..., (29) must be suspect. Now (25) is blameless, (26) is
just one half of a definition, (27) a trivial adjustment, (28) follows by defi-
nition again, so . . . the transition to (29) is wrong! This is Fitch’s train of
thought (we think): he is reported by Geach in [9] to have blamed the
particular application of modus ponens involved here and constructed sys-
tems of logic satisfying certain restrictions (the “simple restriction” or the
“special restriction”) blocking the above inference. (Cf. also Anderson’s
account of Fitch’s response to the Curry paradox in [1].) But, clearly, this
conclusion is absurd, modus ponens is the logician’s best friend, so the real
culprit must have slipped through before (29). In [9] Geach tracks it down:
it is the absorption rule. (This was partly inspired by Moh Shaw-Kwei, who
had centered his discussion [17] of Curry’s paradox around this rule.) “If
we want to retain the naive view of truth” he says, “then we must modify
the elementary rules of inference relating to ‘if’.” And, “after all, the form
‘if p, then p only if r’ never occurs in ordinary discourse, and we might
have a wrong idea of its logical force”. This suggestion is not motivated any
further, let alone worked out. The procedure reminds one of operations
during which “useless” organs like the appendix are taken away: it would
not work with, say, most glands.
A remarkable feature of Geach’s paper is its very sophisticated construction
of an A such that $A \leftrightarrow (A \to B)$ is true, where B is the arbitrary statement
as above:
Let “W” be a name of the variable “w” ranging over statements. Consider

(30)  The result of replacing the variable W in the statement w by
a name of w implies B, and

(31)  The result of replacing the variable W in the statement (30)
by a name of (30) implies B.

Clearly, (31) is the result of replacing the variable W in the statement (30)
by the name “(30)” of that statement, so (31) may serve as the required A.
Readers familiar with the proof of the fixed-point lemma for number
theory will notice the similarity. We repeat the salient points of that proof
for convenience. Let F(x) be a number-theoretic formula with the one free
variable x. A number-theoretic sentence D is to be found such that

(32)  $D \leftrightarrow F(\ulcorner D \urcorner)$
holds, where $\ulcorner D \urcorner$ is the numeral corresponding to the Gödel number
of D. First, form

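(33)  $F(\mathrm{sub}(x, \ulcorner x \urcorner, x))$,

where “sub” represents the arithmetical substitution function SUB satisfying

(34)  $\mathrm{SUB}(\ulcorner G \urcorner, \ulcorner y \urcorner, n) = \ulcorner [n/y]G \urcorner$

for all number-theoretic formulas G, all variables y and all $n \in \mathbb{N}$. For the
Gödel number m of (33) we get

(35)  $F(\mathrm{sub}(m, \ulcorner x \urcorner, m))$

as the required D. This is seen by noting that

(36)  $\mathrm{SUB}(\ulcorner (33) \urcorner, \ulcorner x \urcorner, m) = \ulcorner F(\mathrm{sub}(m, \ulcorner x \urcorner, m)) \urcorner$.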
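The mechanism of (33)-(36) can be imitated with strings in place of Gödel numbers and Python's `repr` in place of numerals. This is only an analogy under those assumptions: `F` is an uninterpreted, hypothetical predicate symbol, and `diag` is our own stand-in for the substitution term sub(x, ⌜x⌝, x).

```python
def diag(template):
    """Substitute the quotation of `template` for its variable x."""
    return template.replace("x", repr(template))

# Analogue of (33): F applied to the diagonalisation of x.
G = "F(diag(x))"

# Analogue of (35): substitute a name of G into G itself.
D = diag(G)
print(D)  # F(diag('F(diag(x))'))

# The argument of F inside D is a term that evaluates to D itself,
# so D "says" F of (a name of) D, as in (32).
inner_term = D[len("F("):-1]
print(eval(inner_term, {"diag": diag}) == D)  # True
```

The final comparison plays the role of (36): evaluating the substitution term that occurs inside D returns D.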


Geach’s construction is an obvious improvement upon Curry’s procedure
of just postulating self-reference. Admittedly, the “$\mathrm{sub}(x, \ulcorner x \urcorner, x)$”-idea is
more or less implicit in Curry’s “X(X)”, but in a very rudimentary form. We
are not sure how much originality can be claimed for Geach in this matter,
however, because the fixed-point construction, both in informal and in
formal contexts, had a long prehistory in 1954 already (cf. Section IV).
Thus, Geach had the idea of Curry’s paradox as well as an easily arithmetizable
construction for obtaining the basic equivalence with which the
argument starts. He must have been aware that Gödel had managed to put
the related Liar paradox to good use in number theory, by achieving self-
reference through arithmetical manipulations. Yet this material remained
unused, except for a misguided attack upon the absorption rule.

A side-remark: using an idea of Quine’s (cf. Smullyan [27]) one may
formulate Curry’s paradox without using substitution:
Define the norm of an expression E as E followed by its own quotation.
Then consider

(37)  B is implied by the norm of “B is implied by the norm of”.


Clearly, (37) is the norm of “B is implied by the norm of”, so (37) is true if
and only if B is implied by (37). This construction is arithmetizable as well,
as Smullyan shows in the above-mentioned paper.
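The norm construction is easily imitated on strings. The sketch below assumes that quotation is formed by enclosing an expression in double quotation marks and that a space separates an expression from its quotation; both conventions, like the function names, are ours.

```python
def quotation(expression):
    """Name an expression by enclosing it in quotation marks."""
    return '"' + expression + '"'

def norm(expression):
    """The norm of an expression: the expression followed by its own quotation."""
    return expression + " " + quotation(expression)

E = "B is implied by the norm of"
sentence_37 = norm(E)
print(sentence_37)

# Recover the expression that (37) quotes and take its norm:
# the result is (37) itself, so (37) speaks about (37).
quoted = sentence_37[sentence_37.index('"') + 1:-1]
print(norm(quoted) == sentence_37)  # True
```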
Geach’s result appeared in 1954, and a little later, in 1955, Löb’s remarkable
paper [15] appeared in which he proved what was to become known as
“Löb’s theorem”:
For the provability predicate $\mathrm{Prov}_{PA}$ of Peano Arithmetic,

(38)  $\vdash_{PA} \mathrm{Prov}_{PA}(\ulcorner B \urcorner) \to B$ implies $\vdash_{PA} B$

for any number-theoretic sentence B.


The crucial point in the proof of this is the construction of a sentence A
such that

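(39)  $\vdash_{PA} A \leftrightarrow (\mathrm{Prov}_{PA}(\ulcorner A \urcorner) \to B)$

from which $\vdash_{PA} B$ is deduced. Before we give this deduction, let us state
the theorem in more detail and more generally.
Let T be any arithmetical theory strong enough to represent substitution
(so the fixed-point property holds for T). Let $\mathrm{Prov}_T$ satisfy

(40)  If $\vdash_T C$, then $\vdash_T \mathrm{Prov}_T(\ulcorner C \urcorner)$

(41)  $\vdash_T \mathrm{Prov}_T(\ulcorner C \to D \urcorner) \to (\mathrm{Prov}_T(\ulcorner C \urcorner) \to \mathrm{Prov}_T(\ulcorner D \urcorner))$

(42)  $\vdash_T \mathrm{Prov}_T(\ulcorner C \urcorner) \to \mathrm{Prov}_T(\ulcorner \mathrm{Prov}_T(\ulcorner C \urcorner) \urcorner)$

for all arithmetical sentences C, D. (Using the modal notation $\Box C =_{\mathrm{def}} \mathrm{Prov}_T(\ulcorner C \urcorner)$
this may be simplified to

(43)  If $\vdash_T C$, then $\vdash_T \Box C$

(44)  $\vdash_T \Box(C \to D) \to (\Box C \to \Box D)$

(45)  $\vdash_T \Box C \to \Box\Box C$.)
For such a theory T and all arithmetical sentences B,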

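if $\vdash_T \mathrm{Prov}_T(\ulcorner B \urcorner) \to B$, then $\vdash_T B$,

or, in our modal notation that we will use from now on,

if $\vdash_T \Box B \to B$, then $\vdash_T B$.

Proof: Assume, for any B, that $\vdash_T \Box B \to B$. It is to be shown that $\vdash_T B$.
The fixed-point property applied to $\mathrm{Prov}_T(x) \to B$ yields A such that
$\vdash_T A \leftrightarrow (\mathrm{Prov}_T(\ulcorner A \urcorner) \to B)$, i.e., modally,

(46)  $\vdash_T A \leftrightarrow (\Box A \to B)$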
(Cf. Gödel’s “Liar sentence” ψ(εύδω) for which $\psi \leftrightarrow \neg\Box\psi$ holds.)
Then the reasoning goes as follows,

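(47)  $\vdash_T A \to (\Box A \to B)$   ((46))

(48)  $\vdash_T \Box(A \to (\Box A \to B))$   ((47) + (43))
(49)  $\vdash_T \Box A \to (\Box\Box A \to \Box B)$   ((48) + (44) twice)

(50)  $\vdash_T \Box A \to \Box B$   ((49) + (45) + “absorption”)

(51)  $\vdash_T \Box A \to B$   ((50) + the assumption on B)

(52)  $\vdash_T A$   ((51) + (46))

(53)  $\vdash_T \Box A$   ((52) + (43))
(54)  $\vdash_T B$.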
Again a few side remarks. Löb answered in this way a question of Henkin,
who had asked if the formula “asserting” its own provability in Peano arithmetic
(i.e., $\vdash_{PA} B \leftrightarrow \Box B$) is provable. It is.
Löb’s theorem may be sharpened to

(55)  $\vdash_T \Box(\Box B \to B) \to \Box B$.
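The modal principle in (55) can be explored semantically. The sketch below assumes the relational (Kripke) reading of the box, which is not used in this paper: $\Box\varphi$ holds at a world if $\varphi$ holds at all its successors. Under that assumption the formula holds at every world of every transitive irreflexive frame with three worlds, whatever the valuation of B, and fails on a single reflexive world. All function names are illustrative.

```python
from itertools import product

def box(rel, truth, n):
    """Worlds at which Box(phi) holds: all successors satisfy phi."""
    return [all(truth[v] for v in range(n) if rel[w][v]) for w in range(n)]

def is_transitive(rel, n):
    return all(rel[a][c] or not (rel[a][b] and rel[b][c])
               for a in range(n) for b in range(n) for c in range(n))

def is_irreflexive(rel, n):
    return not any(rel[w][w] for w in range(n))

def loeb_valid_on(rel, n):
    """Does Box(Box B -> B) -> Box B hold everywhere, for every valuation of B?"""
    for b in product([False, True], repeat=n):
        box_b = box(rel, b, n)
        inner = [(not box_b[w]) or b[w] for w in range(n)]  # Box B -> B
        box_inner = box(rel, inner, n)
        if not all((not box_inner[w]) or box_b[w] for w in range(n)):
            return False
    return True

def check_all_frames(n):
    """Check the formula on every transitive irreflexive frame with n worlds."""
    for bits in product([False, True], repeat=n * n):
        rel = [bits[i * n:(i + 1) * n] for i in range(n)]
        if is_transitive(rel, n) and is_irreflexive(rel, n):
            if not loeb_valid_on(rel, n):
                return False
    return True

print(check_all_frames(3))        # True
print(loeb_valid_on([[True]], 1))  # False: a reflexive world refutes it
```

The failure on a reflexive world matches the role of (81) in Section V: adding $\Box A \to A$ to the principles (43), (44), (45) is what makes every statement provable.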

Recent work of D. H. J. de Jongh, R. Solovay and others (cf. Solovay [28])
shows that a striking converse holds as well: any property of $\mathrm{Prov}_{PA}$ expressible
in this modal notation follows from (55). More precisely, a formula F
constructed from proposition letters using $\neg$, $\to$ and $\Box$, is derivable in the
modal logic consisting of propositional logic together with the modal additions
(43), (44) and (55) if and only if every formula F′ obtained from F
by substituting arithmetical sentences for the proposition letters and
replacing parts of the form $\Box G$ by $\mathrm{Prov}_{PA}(\ulcorner G \urcorner)$, is provable in Peano
arithmetic.
Finally, it should be noted that the analogy between metamathematical
results about provability predicates and results in modal logic had struck
R. Montague already, witness his [19].
Now, back to the main topic. Löb remarks at the end of [15] (“at the
referee’s suggestion”) that a paradox could be formulated by stripping his
proof of $\Box$-signs, i.e., dropping provability in favour of truth. (47), ..., (54)
then collapses to (10), ..., (13) and we have that any statement B is true,
via the intermediary of A which says that B is true if A is. This is Curry’s
paradox, of course, and when Löb remarks that the paradox may be of
interest, in that it shows self-reference to cause trouble even in the absence
of negation, then again this was already Curry’s motivation.
So, we have the strange anti-parallel between Gödel’s proof, where a well-known
semantic paradox inspired a formal result in terms of provability
rather than truth, and Löb’s proof, where a semantic paradox (known as
well, but never put to good use) was extracted from a formal result about
provability. To be sure, all the building blocks of Löb’s paper existed in the
literature (cf. our references to Gödel, Curry and Geach), but Löb seems to
be the first to have seen an important insight, where others saw mere
curiosities.

IV. HISTORICAL REMARKS

In the previous sections we have seen the development of the few ideas
responsible for our four paradoxes. Cantor supplied the concept of a “set”
and the diagonal method. This method applied to the totality of all sets
yields the Russell class. Russell reformulated the set-theoretic paradox in
terms of concepts, using application, λ-abstraction and negation. His own
definition of negation by means of implication forms the basis of Curry’s
version of the latter paradox, in which negation has been eliminated. Finally,
a fixed-point construction allowed Geach (and Löb) to dispense with
λ-abstraction: some number theory (necessary for coding) suffices.
On the other hand, the historical course of events is far from clear, and
seems to consist of several disconnected strands. We will add some comments
to put Sections II and III in better perspective.
(a) In 1892 Cantor published a paper in the “Jahresbericht der deutschen

Mathematiker-Vereinigung” in which he proved that the power set of any


given set has a greater cardinality than that set. This paper led Russell to
discover his paradox in 1901, as will be seen below. The Burali-Forti paradox
concerning the class of all ordinals appeared in 1897. This result was known
to Cantor as early as 1896 (cf. his letter to D. Hilbert mentioned in
Meschkowski [16]). In a letter to Dedekind, written in 1899, Cantor states
the Burali-Forti paradox as well as a similar paradox concerning the class of
all cardinals. He did not find these results disturbing, because of his distinc-
tion between “consistent” multiplicities that may be gathered into a set
(he mentions the “apt” French word ensemble) and “inconsistent” ones
that are given by some condition without being a set (in other words: the
modern set/class distinction). The “paradoxes” establish nothing more than
the insight that the ordinals (or cardinals) do not form a consistent multi-
plicity. From this point of view, there is no particular interest in deriving
new paradoxes, which may explain Cantor’s failure to arrive at the Russell
paradox. Cantor’s reaction will be discussed in Sections V and VI.
Russell, however, starting from the implicit assumption that any con-
dition defines a set, was disturbed by Cantor’s result of 1892 because it
seemed to contradict his intuition that there is a universal set. (Cf. Russell
[25], especially Sections 100 and 349.) He decided to “test” Cantor’s
reasoning by applying it to the set of all sets, and arrived at the Russell class.
His subsequent attempts to overcome the difficulty are well-known, and it is
tempting to conclude that we owe, amongst others, the theory of types to
the fact that Russell’s ontological views were less sophisticated than Cantor’s.
It remains to correct an error in Beth [2], where it is narrated how the
rumour of Cantor’s paradox reached Russell in June 1901, “who then
established a paradox of his own”. This is not substantiated by Russell’s
own remarks, either in [25] or in his autobiography. Russell’s doubts were
his own.
(b) We all know Frege’s drastic reaction to Russell’s communication of
his paradox (cf. Van Heyenoort [10]): he dropped logic altogether. After
writing a first draft of this paper, however, we discovered, in Quine [21],
that Frege did propose a way-out of the difficulty: by means of modifying
the axiom schema of set-formation

(56)  $(\exists x)(\forall y)(y \in x \leftrightarrow \varphi(y))$ to

(57)  $(\exists x)(\forall y)(y \neq x \to (y \in x \leftrightarrow \varphi(y)))$

which amounts to the move discussed in Section II (cf. (18)). In fact there is
quite a literature about attempts of this kind to circumvent the paradoxes
(cf. Geach [8], Hintikka [11] and Quine [21]). For example, Quine shows
that the above modification (“Frege’s way-out”) is unsuccessful given some
very weak logical principles concerning λ-abstraction.
(c) It was remarked in Section III that Curry does not refer to Russell’s
paradox of exemplifiability. This isolation is a recurrent phenomenon in the
history of what may be called the paradox of Curry-Geach-Löb. Later
pupils of Curry, like M. Bunder in [3], do not refer to either Geach or Löb.
There is also a difference in emphasis. Bunder, like Curry himself, uses the
term “paradox” in a strictly technical sense to mean “inconsistency in some
formal system”. So maybe the natural language version of the paradox
should be credited to Geach and Löb only. (By the way, this use of the term
“paradox” is quite objectionable. It should be used only when there is an
element of surprise involved - in the spirit of Proclos, who calls some (valid)
result in Euclidean geometry ncr&o~orc~rov.)
(d) The fixed-point construction in Geach [9] was not the first of its
kind, although it may be doubted if the idea was widely known in 1954.
There is an ad hoc construction to this effect in Tarski, Robinson and
Mostowski [30] (1953) and when Löb used a construction similar to
Geach’s in [15] he called it “originated by Gödel”. Still, the general fixed-point
idea had been given by Barkley Rosser as early as 1939 (cf. [24]), so
logicians who knew their Journal of Symbolic Logic (something still feasible
in those days) should have had it as a tool already and be aware of its
formalizability in arithmetic. Moreover, an informal natural language version
of the fixed-point construction had been stated for a philosophical audience
by J. Findlay in [5], which appeared in 1942. For example, Findlay paraphrases
Gödel’s key formula as

(58)  The statement obtained by replacing the only free variable
in the statement form “The statement obtained by replacing
the only free variable in the statement form X by the quotation
of that statement form is unprovable” by the quotation
of that statement form is unprovable.
(M. Polanyi records in his “Personal knowledge” that Russell, being con-
fronted with (58) “took its meaning in at a glance”.)
(e) The isolation of the various traditions is shown by the recent (1976)
paper Stone [29], in which the author gives a formal version of Geach’s
paradox, apparently unaware of Löb’s work in this direction.
Löb himself does not refer to any of Russell, Curry and Geach.
Yet the similarity with Geach’s paper will strike any reader of both. It did
not escape G. Kreisel, as a look at the Mathematical Reviews of 1956 showed
(cf. [13]). It did not escape E. W. Beth either, whose [2] contains a reference
(but no more than that) to the “paradox of Geach-Löb”.

V. REACTIONS TO THE PARADOXES

The paradoxes of Section II have led to the axiomatization of set theory, as


a way of stating the principles of that discipline explicitly. The two current
versions of axiomatic set theory, Zermelo-Fraenkel (ZF) and Von
Neumann-Bernays-Gödel (NBG), reflect two different attitudes to the
paradoxes, however. ZF embodies the “classical” reaction to a contradiction:
the theory in which it has been derived is too strong, so it should be
weakened. It could be said that ZF tries to have as much of Cantor’s intuitive
set theory as possible short of the unrestricted principle (56). Frege’s way-out
is another instance of this kind of move. NBG, on the other hand,
escapes the contradiction by means of making a distinction. The contradiction
arose through identification of objects (viz. classes and sets) which
should have been distinguished. For example, the Russell paradox
merely shows that the Russell class is not a set. We have seen an anticipation
of this attitude in Cantor’s letter to Dedekind. If it be objected that making
the class/set distinction means being wise after the event, this fact could be
admitted at once: the paradox has enabled us to see more clearly.
Russell’s theory of types is the best-known reaction to the paradox of
exemplifiability. It could be classified as an extreme case of making a dis-
tinction: the whole language is divided into types. This reaction differs from
the previous ones in that the disputed principle cannot even be formulated
any more: “X(X)” becomes unwellformed, unlike “$x \in x$” in set theory,
which merely expresses a false statement (cf. (21)). In the case of the related
Curry paradox we have encountered the other strategy, that of weakening
the theory, as well: Fitch and Geach considered dropping some rules of
logic. There may seem to be too little logic in the derivation of the Russell
paradox for this strategy to be applied, but let us look more closely. First
λ-conversion is used in the derivation of (15), which could be objected to,
and, even if (15) is accepted, one could try to block the derivation from
$(A \leftrightarrow \neg A)$ to $A$ and $\neg A$. It would (presumably) read as follows

(59)  $A \leftrightarrow \neg A$

(60)  $A \to \neg A$   (from (59))

(61)  $(A \to \neg A) \to \neg A$   (by propositional logic)

(62)  $\neg A$   (modus ponens on (60) and (61))

(63)  $\neg A \to A$   (from (59))

(64)  $A$   (modus ponens on (62) and (63)).


If one does not follow Fitch in suspecting modus ponens, then an attack on
(61) seems the only possibility left. But this principle holds even in minimal
logic, so there does not seem to be much hope for moves in this direction.
It is also instructive to review some proposed solutions to the Liar para-
dox. This paradox arises in pure logic, once self-reference is admitted. The
weakening strategy consists in restricting either logic or self-reference
mechanisms. The predominant reaction makes distinctions, however. Again,
Russell’s ramified theory of types does this most thoroughly, by typing the
theory of types. Tarski’s distinction between object language and metalanguage
is another case in point: sentences may have different functions.
Both these reactions are concerned with self-reference. Another influential
school makes the distinction inside logic by questioning the equivalence

(65)  $A$ if and only if $\ulcorner A \urcorner$ is true,


i.e., Tarski’s convention T. Once it is admitted that the expressions on the
left and right hand side of (65) are not necessarily equivalent, it becomes
necessary to determine the status of the phrase “is true”: does it denote a
predicate (as is implied by cornering A) or is it a sentential operator really?
This last-mentioned approach turns the “truth operator” T into a modal
operator for which laws might be formulated like

(66)  $TA \to A$

(67)  $T(A \to B) \to (TA \to TB)$
(68)  $TA$ (for logical axioms A)
etc. Adding the law of bivalence

(69)  $TA \vee T\neg A$
will yield an undesirable collapsing of T:

(70)  $A \to TA$ becomes derivable (using $T\neg A \to \neg A$).
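The collapse is a matter of propositional logic only, as the following sketch shows. It treats A, TA and T¬A as three independent atoms (an assumption made for the sake of the check) and verifies that every assignment satisfying the two instances of (66) and the instance of (69) also satisfies (70).

```python
from itertools import product

def implies(p, q):
    """Material implication."""
    return (not p) or q

def collapse_follows():
    """Check that (66) for A and for not-A, plus (69), entail (70)."""
    # a stands for A, ta for "TA", tna for "T(not A)".
    for a, ta, tna in product([False, True], repeat=3):
        law_66_a = implies(ta, a)          # TA -> A
        law_66_na = implies(tna, not a)    # T(not A) -> not A
        law_69 = ta or tna                 # TA or T(not A)
        if law_66_a and law_66_na and law_69 and not implies(a, ta):
            return False
    return True

print(collapse_follows())  # True
```

Together with (66) this makes TA and A equivalent, so that T no longer distinguishes anything.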


We are thus led to question a traditional logical principle. Most authors take
truth to be a property of sentences, however (cf. Van Fraassen [6] or
Kripke [14]). The Liar paradox and related paradoxes may then be interpreted
as stating restrictions on the use of the truth predicate. (It may be of
interest to note that Kripke discusses Henkin’s problem about the formula
$B \leftrightarrow \Box B$ asserting its own provability, in this connection. He is concerned
with the sentence asserting its own truth, which is as interesting as the
liar sentence itself, from his point of view.) Two of these restrictions,
inspired by results of Tarski and Montague, will be given below, for a better
appreciation of Löb’s result of Section III. But the main point will be clear
already: once the distinction has been made, the paradoxes cease to be
harmful and turn out to be interesting theorems.
It follows from Tarski’s proof of the undefinability (within arithmetic)
of arithmetical truth that no truth predicate T (in a language allowing fixed-point
constructions) can satisfy, for all A

(71)  $A \leftrightarrow T(\ulcorner A \urcorner)$.
A more careful inspection of his argument shows that T cannot even satisfy,
for all A,

(72)  $T(\ulcorner A \urcorner) \to A$

(73)  if $\vdash A$, then $\vdash T(\ulcorner A \urcorner)$.


To see this, start with any B such that

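(74)  $B \leftrightarrow \neg T(\ulcorner B \urcorner)$. Then

(75)  $T(\ulcorner B \urcorner) \to \neg B$   (from (74))

(76)  $T(\ulcorner B \urcorner) \to B$   (from (72))

(77)  $\neg T(\ulcorner B \urcorner)$   (from (75) and (76))

(78)  $B$   (from (77) and (74))

(79)  $T(\ulcorner B \urcorner)$   (from (78) and (73))
and (77) $\wedge$ (79) is a contradiction.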

Montague’s [19], mentioned in Section III, is relevant here as well. The
paradox of the Hangman (described in his [18]) led him to discover a
theorem about number theory with the following implication for $T$ (let $\Box A$
denote $T(\ulcorner A \urcorner)$ as before): $T$ cannot satisfy, for all $A$ and $B$,

(80) $\Box(A \to B) \to (\Box A \to \Box B)$

(81) $\Box A \to A$

(82) $\Box(\Box A \to A)$

and, for all $A$ which are laws of logic or axioms of Robinson’s arithmetic
(or any theory strong enough to represent diagonalization),

(83) $\Box A$.
Montague’s proof of this starts with a sentence $B$ such that

(84) $B \leftrightarrow \Box\neg B$,
but we will give a proof not involving negation. Let $C$ be any statement.
Create a $B$ such that the following holds:

(85) $B \leftrightarrow \Box(B \to C)$.
The proof of (85) uses only a finite conjunction $D$ of Robinson’s axioms, so

(86) $D \to (B \to \Box(B \to C))$ is a law of logic, whence

(87) $(\Box(B \to C) \to (B \to C)) \to (D \to (B \to (B \to C)))$
and, by a form of the absorption rule,

(88) $(\Box(B \to C) \to (B \to C)) \to (D \to (B \to C))$ are laws of logic.

Applying (83) and (80) yields the provability of

(89) $\Box(\Box(B \to C) \to (B \to C)) \to (\Box D \to \Box(B \to C))$ so, by (82)
and (83) again,

(90) $\Box(B \to C)$. It follows, by (85), that

(91) $B$ and, by (81), that

(92) $B \to C$ whence
(93) $C$. (QED)
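
The passage from (88) to (90) compresses several applications of (83) and (80). A sketch of the intermediate steps may help; it uses the abbreviations $E$ for $\Box(B \to C) \to (B \to C)$ and $F$ for $B \to C$, which are introduced here only for readability, and it assumes, as the argument above does, that (83) extends to the finite conjunction $D$.

```latex
\begin{align*}
& E \to (D \to F)                        && \text{(88), a law of logic} \\
& \Box(E \to (D \to F))                  && \text{by (83)} \\
& \Box E \to \Box(D \to F)               && \text{by (80)} \\
& \Box(D \to F) \to (\Box D \to \Box F)  && \text{instance of (80)} \\
& \Box E \to (\Box D \to \Box F)         && \text{chaining the last two lines; this is (89)} \\
& \Box E                                 && \text{instance of (82)} \\
& \Box D                                 && \text{by (83), applied to the conjunction } D \\
& \Box F                                 && \text{modus ponens, twice; this is (90)}
\end{align*}
```

Note that negation occurs nowhere in these steps, which is the point of replacing (84) by (85).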

Finally, the reactions to the paradox of Curry-Geach-Löb conform to
our expectations. The weakening strategy was found in the work of Fitch
(who restricted the use of modus ponens) and Geach (who restricted the use
of the rule of absorption). Other strategies are as for the case of the Liar
paradox. For example, Löb’s theorem may be interpreted as saying that the
truth predicate cannot conform to all laws of S4, even in the absence of
negation: (43), (44), (45) together with (81) would enable us to prove any
statement whatsoever. Montague’s result was shown to be applicable in this
case as well: it is even stronger than Löb’s in that it does not require either
(43) or (45) (although it does require the truth of Robinson’s axioms).
When a theory, or, more informally, a set of beliefs, turns out to contain
a contradiction, the classical strategy is to drop some of the beliefs involved.
In fact one could maintain that this is one of the most important ways of
correcting our beliefs (cf. Quine and Ullian [22]). There may be various
ways of trimming the theory down to consistency, but this just means there
is room for creativity. A second strategy consists in rephrasing the entire
theory in such a way that the contradiction disappears: either it cannot be
formulated any more, or it becomes a harmless theorem. This rephrasing is
the result of making distinctions not present in the old theory. There is no
real opposition between the two strategies: e.g., one could say that in the
second strategy we are forced to take back some metabeliefs about the old
theory.
It is clear that any inconsistent theory can be weakened until the remain-
der becomes consistent again (or at least not blatantly inconsistent). A
similar result holds with respect to the second strategy: the inconsistency
can always be removed by making distinctions. Again, there are many
possibilities, e.g., (i) splitting up the objects described by the theory into new
classes, or (ii) viewing the relations in the theory under new aspects. (i)
amounts, in predicate logic, to adding new unary predicate constants and
suitably relativizing the appropriate quantifiers. (ii) amounts to raising
degrees of predicate constants (cf. the familiar search for “enough para-
meters to describe the situation consistently” in physical equations). Any
inconsistency can be removed by means of tactics like (i) and (ii). (It would
be quite tedious to state and prove this in formal detail.) The paradoxes
treated in this section have shown how many examples of both kinds of
strategy exist.
Let us note in passing that the above remarks do not depend on the fact
that a contradiction is a statement of the form $A \wedge \neg A$, properly speaking.
This would restrict the scope of our discussion to the case where negation
is present, but, as we have seen, a good substitute exists in cases where
negation is absent: $(\forall A)A$.
The two strategies discussed here seem to be the only possibilities of
neutralizing a paradox (apart from ignoring the problem). Yet they are not
often used consciously, as legitimate means of escape from an unpleasant
situation. For example, the method of raising degrees has not even been
tried in the case of set theory, as far as we know. (One could introduce
parameters $z$ in statements of the form $x \in y$, thus $x \in_z y$. Several unparadoxical
versions of the principle of abstraction (56) then become available. This is
nothing but a formal trick, of course, but interesting interpretations might be
found: maybe $z$ could be taken as a time variable, and “$x \in_z y$” as “at time
$z$, $x$ belongs to $y$”, which allows for “growing” sets.) Another example: the
method of weakening logic has not been tried in the case of set theory
either (but Quine’s result referred to in Section IV shows that even very
weak logics cannot support the weight of (56)). Looking at the matter from
our methodological point of view may yield new insights. For example, is
there an interesting set theory which retains (56) but sacrifices some (as
little as possible) logic?
To certain persons, the second strategy seems a little dishonest. A contra-
diction means defeat which should be accepted: by changing the theory.
What retreat is more ignominious than a refusal to lose by changing the rules
of the game after first having agreed to play? And yet this is what a change
in logic or in language means. This point was stressed by Feyerabend in
connection with quantum logic. It may be seen quite forcefully by imagin-
ing what reception a move like Russell’s (change the language in order to
prevent future formulation of the paradox) would get in ordinary debate.
This strategy brings out, a little too clearly, the unfalsifiability of science:
there is always a way out of any theoretical quandary (a trait it shares with
related closed systems of thought like the world of magic). Moreover, it
seems to endanger the possibility of real disagreement, by justifying the
tendency, always present in contemporary continental philosophy, to
interpret differences of opinion as talking at cross-purposes. (This is good for
human relations, but bad for philosophy.) And yet the second strategy
seems to be the most fruitful one from a theoretical point of view: it leads
to ever subtler intellectual tools with which to analyze the world.

Nevertheless, someone like Cantor, who took this point of view (cf.
Section IV) is often accused of dishonesty. He merely pretended to be un-
affected by the paradoxes, out of sheer self-preservation, it is said. Any un-
prejudiced observer would have shared in the general experience of catas-
trophe around 1900. But would he? This question is taken up in the next
section.

VI. PARADOXES AND CRISES

Any student of logic is taught that a paradox means disaster. The fate of
Philites of Cos (died of excessive thinking about the Liar paradox) or
Gottlob Frege (saw his life’s work destroyed at a single blow) is vividly
before his mind. The Russell-Frege correspondence is on its way to becoming
the logician’s counterpart to the philosopher’s Socratic “Apology”. And
contemporary logic and foundational research are depicted as still gasping
for breath after having survived several terrible foundational crises. A sum-
mary of current accounts yields the following picture. Around 1890 Frege,
Dedekind and Cantor tried to find a logically secure base for mathematics.
They thought they had succeeded until several paradoxes were discovered
around 1900, the Russell paradox being one of the most vicious. In such an
infant theory these might be considered children’s diseases, but this was not
the impression they made on the researchers involved (except maybe
Cantor). Frege and Dedekind withdrew from the area, Russell and Hilbert
started an extensive process of reconstruction, and the conviction of the
“never again” was formalized in Hilbert’s Program: consistency should be
proved for existing mathematical theories. But again a shattering blow: in
1931 Gödel’s second incompleteness theorem showed this program to be
untenable. A weakened version survived, but on a doubtful philosophical
basis. At present we live in a period of relative calm, but for how long?
This account is fortified by the many crises being discovered in more
ancient times. Zeno’s paradoxes, the discovery of the irrationality of the
square root of 2 (with a very convenient legend about Hippasos who was
killed because of his divulging the secret): even Antiquity knew its foun-
dational problems!
Sceptics may point at the silent majority of researchers unaffected by
the so-called crises. If they know etymology (not many sceptics do), they
may recall that “paradox” means something like “surprise”, which is
neutral between disaster and godsend.

But the main question is, how much of the above historical picture is
correct? And, even if it is largely correct, how many of the opinions held at
the time are tenable? We have seen in the previous section that there are
many possible ways of escape from a paradox: the question is to find inter-
esting ones. But this may be viewed as a quite respectable (and even desir-
able?) situation: an existing theory has to be refined and one tries several
methods. It mainly depends on one’s personal preferences, whether one
wants to live in more quiet times, when intra-theoretical activities predomi-
nate, or in the more disturbing phases of reconstruction. Even during those
phases, the majority of mathematicians continue as before, ignorant, or
justly confident that their activities will not lose importance. (At most, they
will be reinterpreted in the light of the new theory.) In fact, the success of a
proposed reconstruction will depend to a great extent on its ability to pre-
serve as much as possible (better: everything) of what has been achieved
already.
Then, why this feeling of despair when the paradoxes were discovered?
One important cause seems to lie in a philosophical attitude which might be
labelled the “foundational syndrome”. In Beth [2] there is a description of
the “Aristotelian theory of science” which requires a scientific theory to be
based on primitive notions and principles that hit rock-bottom: they are to
be self-evident and, a fortiori, free from contradictions. When can we be sure
to have found such notions and principles, even in mathematics? Venerable
concepts, like that of “number”, are accepted unthinkingly by most
mathematicians, whose science is commonly regarded as the paradigm of
exactness, but

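“Es ist wohl zu beachten, dass die Strenge der Beweisführung ein Schein bleibt, mag
auch die Schlusskette lückenlos sein, wenn die Definitionen nur nachträglich dadurch
gerechtfertigt werden, dass man auf keinen Widerspruch gestossen ist. So hat man im
Grunde nur eine erfahrungsmässige Sicherheit erlangt und muss eigentlich darauf
gefasst sein, zuletzt doch noch einen Widerspruch anzutreffen, der das ganze Gebäude
zum Einsturze bringt”
[“It should be noted that the rigour of a proof remains an illusion, however
gapless the chain of inferences may be, if the definitions are justified only
afterwards by the fact that no contradiction has been encountered. In this way
one has, at bottom, attained only an empirical certainty, and must really be
prepared to meet a contradiction in the end after all, one that brings the whole
edifice to collapse”]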

as Frege wrote in the preface of his [7]. Mathematics would not be safe
until the very last of its notions lay at anchor in the sheltered bay of logic.
Strangely enough, Beth says that the Aristotelian theory of science, so
conspicuous in Frege, has succumbed to a mentality inspired by modern
logic: that of valuing methods above foundations. It is to be doubted if this
mentality lived in any of the great logicians around the turn of the century.
They wanted security, and they believed in final solutions. The consider-
ations of Section V are opposed to this tendency. The important thing is not
to avoid contradictions once and for all, but to have a way of dealing with
them, whenever they occur. And, this may sound like dialectics, but it can-
not be helped, it is quite conceivable that contradictions are the fuel of
much scientific progress, enabling us to get a finer picture of reality through
the strategy of making new distinctions. From this perspective, what is
scientific in science is not to be located in specific truths (“2 + 2 = 4”, “the
Earth revolves around the Sun”), but in its procedures. Beth could be right
in stating that this is the prevailing modern attitude, but the Aristotelian
theory of science is not as moribund, even within contemporary logic, as he
claims.
The foregoing may not seem particularly heretical, but consider what
becomes of Hilbert’s Program. Is there any justification, from our point of
view, for a quest for consistency proofs? Well, it is not wrong, of course, to
look for consistency proofs, but it is not absolutely vital either. And Gödel’s
result, to us, is not the bitter disappointment it was to some, but the taking
away of what we need not have hoped for anyway. It is only fair to say that
Hilbert himself may have taken a much more detached stand than the above
historical picture suggests. His axiomatization of mathematical theories,
starting with “Grundlagen der Geometrie” in 1899, was inspired by consider-
ations of rigour and clarity, and a proof of consistency seemed to be a mini-
mal methodological requirement, which had the virtue, in his opinion, of
establishing the existence of the relevant objects at the same time. In his
Paris lecture of 1900 there is no trace of paradoxes: foundational problems
are ranked first because of their intrinsic interest. Only in 1904 do we find him
saying that the Russell paradox is having a “downright catastrophic effect”
in mathematics. (Cf. Reid [23].) But in “Grundlagen der Mathematik” this
consternation has ebbed away, and we find the original motivation for his
program. This may explain why Gödel’s result does not seem to have
shocked him as much as it should have (according to a recent author). In
later proof theory consistency statements are studied a lot, but more, I am
afraid, because of the nice mathematical theories that can be built around
them, than because of any live contact with their original meaning. Still,
many people have a lively fear of contradictions, and someone like Wette,
who seems to have found a contradiction within ZF, is dismissed with a
sigh of relief (“too cranky to be taken seriously”). But why? What
mathematical result would be more exciting than the discovery of a
contradiction in, say, Peano arithmetic? Who believes that mathematics would
come to an end because of such an event? I say that, within a century, it
would count as the greatest advance ever in mathematics, having led to an
incomparably better understanding of the concept of “number”.

VII. PARADOXES AND NATURAL LANGUAGE

Most paradoxes have a formal and an informal version, formulated with
respect to formal languages and natural language, respectively. Are the latter
versions more than amusing curiosities about the incurably paradoxical
language that we use? Tarski’s dismissal of natural language is well-known:
“too rich to be true”. But we study language through various subsystems
whose behaviour is more or less like that of formal theories, and, therefore,
the conclusions of the formal versions are often applicable. Examples of this
are to be found in Section V, especially in connection with the Liar paradox.
The most important problem is probably which notion of “truth” is suitable
for natural language. If we admit that natural language is rich enough in
syntactic devices to produce self-reference, there are still several options.
First, one may take the modal approach to the truth predicate, which seems
rather severely limited by Löb’s and Montague’s results. Then, there is
Kripke’s approach in [14], which may become a serious rival to even
Tarski’s semantic theories. But the situation could also be like that in set
theory. No general truth predicate might be definable, but only a relation of
truth, which holds between a structure and a sentence. Maybe “true with
respect to” is the notion we are looking for, in the spirit of current model-
theoretic semantics.

REFERENCES

[1] Anderson, A. R., ‘Fitch on Consistency’, in The Logical Enterprise (ed. by
A. R. Anderson, R. Barcan Marcus and R. M. Martin), Yale University Press, New
Haven, 1975, pp. 123-141.
[2] Beth, E. W., The Foundations of Mathematics, North-Holland, Amsterdam, 1959.
[3] Bunder, M. W. V., ‘Set Theory Based on Combinatory Logic’, dissertation,
University of Amsterdam, 1969.
[4] Curry, H. B., ‘The Inconsistency of Certain Formal Logics’, Journal of Symbolic
Logic 7 (1942), 115-117.
[5] Findlay, J., ‘Goedelian Sentences, a Non-numerical Approach’, Mind LI (1942),
259-265.
[6] van Fraassen, B. C., ‘Presupposition, Implication and Self-reference’, The Journal
of Philosophy LXV (1968), 136-152.
[7] Frege, G., Die Grundlagen der Arithmetik, W. Koebner Verlag, Breslau, 1884.
[8] Geach, P. T., ‘On Frege’s Way Out’, in Logic Matters, Blackwell, Oxford, 1972,
pp. 235-237.
[9] Geach, P. T., ‘On Insolubilia’, in Logic Matters, pp. 209-211.
[10] van Heyenoort, J. (ed.), From Frege to Gödel, Harvard University Press,
Cambridge, Mass., 1967.
[11] Hintikka, J., ‘Vicious Circle Principle and the Paradoxes’, Journal of Symbolic
Logic 22 (1957), 245-249.
[12] Kleene, S. C., Mathematical Logic, Wiley, New York, 1968.
[13] Kreisel, G., ‘Review of “Solution of a Problem of Leon Henkin”’, Mathematical
Reviews 17-5 (1956).
[14] Kripke, S., ‘Outline of a Theory of Truth’, Journal of Philosophy 72:19 (1975),
690-716.
[15] Löb, M. H., ‘Solution of a Problem of Leon Henkin’, Journal of Symbolic Logic
20 (1955), 115-118.
[16] Meschkowski, H. (ed.), Probleme des Unendlichen, Braunschweig, 1967.
[17] Moh Shaw-Kwei, ‘Logical Paradoxes for Many-valued Systems’, Journal of
Symbolic Logic 19 (1954), 37-40.
[18] Montague, R. (with Kaplan, D.), ‘A Paradox Regained’, Notre Dame Journal
of Formal Logic 1 (1960), 79-90.
[19] Montague, R., ‘Syntactical Treatments of Modality, with Corollaries on
Reflexion Principles and Finite Axiomatizability’, Acta Philosophica Fennica 16
(1963), 153-167.
[20] Orenstein, A., ‘I Think, Therefore I Am Not’, Rassegna Internazionale di Logica
12 (dicembre 1975), 166.
[21] Quine, W. V. O., ‘On Frege’s Way Out’, in Selected Logic Papers, Random House,
New York, 1966, 146-158.
[22] Quine, W. V. O. and Ullian, J. S., The Web of Belief, Random House, New York,
1970.
[23] Reid, C., Hilbert, Springer, Berlin, 1970.
[24] Barkley Rosser, J., ‘An Informal Exposition of Gödel’s Theorems and Church’s
Theorem’, Journal of Symbolic Logic 4 (1939), 53-60.
[25] Russell, B., The Principles of Mathematics, Allen & Unwin, London, 1903.
[26] Smorynski, C., ‘Consistency and Related Metamathematical Properties’, Report
75-02, Mathematisch Instituut, University of Amsterdam, Amsterdam, 1975.
[27] Smullyan, R., ‘Languages in which Self-reference is Possible’, Journal of
Symbolic Logic 22 (1957), 55-67.
[28] Solovay, R., Provability Interpretations of Modal Logic, IBM Thomas J. Watson
Research Center, New York, 1976.
[29] Stone, I. David, ‘A formalization of Geach’s antinomy’, Analysis 36:4 (1976),
203-207.
[30] Tarski, A., Robinson, R. M. and Mostowski, A., Undecidable Theories, North
Holland, Amsterdam, 1953.
