THE REVIEW OF SYMBOLIC LOGIC
Copyright © 2010 by the Association for Symbolic Logic. All rights reserved.
Reproduction by photostat, photo-print, microfilm, or like process by permission only.
Abstract. In ‘The Foundations of Mathematics’, Frank Ramsey separates paradoxes into two
groups, now taken to be the logical and the semantical. But he also revises the logical system
developed in Whitehead and Russell’s Principia Mathematica, and in particular attempts to provide
an alternate resolution of the semantical paradoxes. I reconstruct the logic that he develops for this
purpose, and argue that it falls well short of his goals. I then argue that the two groups of paradoxes
that Ramsey identifies are not properly thought of as the logical and semantical, and that in particular,
the group normally taken to be the semantical paradoxes includes other paradoxes—the intensional
paradoxes—which are not resolved by the standard metalinguistic approaches to the semantical
paradoxes. It thus seems that if we are to take Ramsey’s interest in these problems seriously, then the
intensional paradoxes deserve more widespread attention than they have historically received.
© Association for Symbolic Logic, 2010
doi:10.1017/S1755020309990359
DUSTIN TUCKER
This axiom says that for every higher order function, there is an extensionally equivalent
first-order function.
The axiom of reducibility was never popular; even the Introduction to the second edition
of Principia Mathematica attempts to do without it. Since it is only necessary in a ramified
theory of types, Ramsey attempts to develop a nonramified theory of types that can serve
as the foundation of mathematics. The cited motivation for the ramified theory of types is a
collection of seven paradoxes (Whitehead & Russell, 1910, pp. 60–61). Ramsey observes
that these paradoxes can be divided into two classes, which have since come to be called
the semantical and the logical,3 and that ramification is only required for the former. Thus,
with the aim of eliminating the axiom of reducibility, he attempts to develop a nonramified
theory of types—one that only restricts the ranges of bound variables by type—in which
the semantical paradoxes do not arise.
My original aim in studying this section of Ramsey’s paper was twofold. (i) I found
it surprising that a proposed resolution of the semantical paradoxes by a figure such as
Ramsey could have been completely ignored for over 80 years. Even contemporary
reviews, such as Church (1932) and Russell (1932), say nothing about this resolution, and I have
not seen any discussion of it in later work. I had hoped that Ramsey’s resolution would
provide fresh ideas about the semantical paradoxes—ideas not colored by Tarski’s subse-
quent analysis. At least, I hoped, it might illuminate a more familiar idea in a new way.
(ii) I wanted to determine whether his resolution could also handle another class of para-
doxes, which are specific to intensional logics (and which I thus call the intensional para-
doxes). Again, the hope was that even if it could not resolve them, it would illuminate
something or suggest something different from what has been done in the intervening
decades.
Unfortunately, my conclusions seem to divest Ramsey’s resolution of almost all its
interest qua logical system. With respect to (i), I think not only that he fails to resolve the
semantical paradoxes, but also that the failure involves such confusion that there is little
insight into those paradoxes to be gained from examining his failure. I am loath to attribute
such confusion to anybody, of course, and especially to someone like Ramsey, but I think
that it is the most charitable reading of his attempted resolution. (ii) fares no better, as
I think that there is nothing to be gleaned from his system’s inability to avoid these other
paradoxes.
In spite of this divestment, I think that it is at least interesting when someone like
Ramsey falls into such confusion. And, of course, perhaps my interpretation is incorrect,
and Ramsey actually has developed a novel and satisfactory resolution of the semantical
paradoxes. I might then hope that this exposition could lead to a better understanding of
such a resolution. Thus, one main aim of this paper is to reconstruct the logic Ramsey
employs in ‘The Foundations of Mathematics’ and to understand how it is intended to
resolve the semantical paradoxes. Unfortunately, his resolution of the paradoxes is far
from transparent. It is somewhat reminiscent of Tarski’s hierarchy of languages in that
Ramsey thinks that the resolution of the paradoxes relies on distinguishing different senses
of ‘mean’, but the details are significantly murkier than those of Tarski’s now-classical
resolution. Also, Ramsey does not develop a hierarchy of meaning relations, and he does
not relativize meaning to a language the way Tarski does. Part I is thus devoted to a
reconstruction and analysis of Ramsey’s logic, the technical consequences of his account
of meaning, and the application of those consequences to that logic in an attempt to resolve
the semantical paradoxes.
3 Actually, I do not think that his groups are the same as the semantical and the logical paradoxes;
this is the subject of Part II.
INTENSIONALITY AND PARADOXES IN RAMSEY
Part II is concerned with a slightly different topic. I have said that ‘The Foundations of
Mathematics’ is largely remembered for its division of paradoxes into two groups, Groups
A and B, which are now known as the logical and the semantical respectively. Certainly
this is how people have historically taken the distinction Ramsey draws.4 However, I think
that this misrepresents his division. In particular, I think that there are paradoxes in his
Group B that are not resolved by the common resolutions forwarded for the semantical
paradoxes—that are arguably not semantical paradoxes.5 Even more particularly, I think
that the extra paradoxes in Group B are the intensional paradoxes that I mentioned in (ii)
above. My general interest in the intensional paradoxes is that they pose problems for
logics attempting to capture propositional attitudes, but I will not argue for that point here.
My sole concern in Part II is to argue that Ramsey includes paradoxes in his Group B that
are not resolved by metalinguistic resolutions of the semantical paradoxes.
PART I
RAMSEY’S RESOLUTION OF
THE SEMANTICAL PARADOXES
4 See, for example, Fraenkel & Bar-Hillel (1958, pp. 5–14), Beth (1959, §171), Kneale & Kneale
(1962, pp. 664–665), Quine (1963, pp. 254–255), Feferman (1984, p. 75), and Priest (1994,
pp. 25–26).
5 Sort of not semantical paradoxes, anyway. I argue that, while all of Ramsey’s formalizations of
the Group B paradoxes will be prohibited by any resolution of the semantical paradoxes, those
formalizations are not the only ones we should be concerned with. I then attempt to show that
other, better formalizations of those paradoxes do not admit of similar resolutions, and argue that
one therefore ought to not be satisfied as soon as one has a resolution of the semantical paradoxes,
at least if one takes Ramsey’s concern with these paradoxes seriously. These details are the subject
of Section 8.
6 Most of the time. But, as in Section 3, it is sometimes easier to use Ramsey’s notation when
quoting him.
7 I also Curry the logic—I treat, for instance, binary relations as functions to other functions—but
this has no substantive impact on the resulting system, and could be eliminated at the cost of only
a little simplicity.
8 In my formulation of the logic, there are also well-formed formulas of type i and various
functional types (see below). But these well-formed formulas play no role beyond simplifying
the formation rules.
an infinite alphabet of variables with superscript τ . These superscripts are often omitted
when no undesired ambiguity thereby arises. There are two primitive constants, ∼ of type ⟨p, p⟩
and ; of type ⟨p, p, p⟩, and infinitely many constants of type ⟨⟨τ, p⟩, p⟩, one for each type τ. The other
primitives are λ, [, ], and any primitive constants of determinate type that one wishes to
include.
If f is a variable or constant with superscript τ , it is a well-formed formula of type τ .
If P is a well-formed formula of type τ and x is a variable with superscript σ , λx[P] is a
well-formed formula of type ⟨σ, τ⟩. If P is a well-formed formula of type ⟨τ, σ⟩ and Q is a
well-formed formula of type τ , then PQ (often written P(Q)) is a well-formed formula of
type σ. I abbreviate P(Q)(R) (i.e., PQR) with P(Q, R), ; (P, Q) with P ; Q, and
(λx[P]) with x[P]. I use square brackets to disambiguate scope when these abbrevia-
tions lead to ambiguity.
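The three formation rules just stated can be sketched as a small type checker. This is only an illustration under assumptions of my own: the class names Fn, Sym, Lam, and App, the function type_of, and the sample constant S are hypothetical and appear nowhere in Ramsey, and the basic types are represented by the strings "i" and "p".

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Fn:
    """A functional type <arg, res>."""
    arg: "Type"
    res: "Type"


Type = Union[str, Fn]  # "i" and "p" are the basic types


@dataclass(frozen=True)
class Sym:
    """A variable or constant together with its type superscript."""
    name: str
    type: Type


@dataclass(frozen=True)
class Lam:
    """The abstraction written lambda x [P]."""
    var: Sym
    body: "Term"


@dataclass(frozen=True)
class App:
    """The application PQ, often written P(Q)."""
    fn: "Term"
    arg: "Term"


Term = Union[Sym, Lam, App]


def type_of(term: Term) -> Type:
    """Return the type of a well-formed formula, or raise TypeError."""
    if isinstance(term, Sym):
        return term.type
    if isinstance(term, Lam):
        # x of type sigma and P of type tau give <sigma, tau>
        return Fn(term.var.type, type_of(term.body))
    if isinstance(term, App):
        fn_type = type_of(term.fn)
        if not isinstance(fn_type, Fn) or fn_type.arg != type_of(term.arg):
            raise TypeError("ill-formed application")
        return fn_type.res
    raise TypeError("not a term")


# Hypothetical example: S is a Curried binary relation between individuals.
a = Sym("a", "i")
x = Sym("x", "i")
S = Sym("S", Fn("i", Fn("i", "p")))
phi = Lam(x, App(App(S, a), x))  # lambda x [S(a, x)]
```

On this sketch, type_of(phi) is the functional type from "i" to "p", and applying phi to a yields a formula of type "p". Types are compared for exact equality, which is the only restriction the formation rules themselves impose.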
Ramsey assigns numbers to some functional types: “A function of individuals we will
call a function of type 1; a function of functions of individuals, a function of type 2; and
so on” [p. 46]. Using the above notation, we can define these recursively. Anything of
type ⟨i, p⟩, ⟨i, i, p⟩, ⟨i, i, i, p⟩, and so forth is of Type 1; given types τi of Type n,
anything of type ⟨τ1, p⟩, ⟨τ1, τ2, p⟩, ⟨τ1, τ2, τ3, p⟩, and so forth or of type ⟨τ1, τ2⟩,
⟨τ1, τ2, τ3⟩, ⟨τ1, τ2, τ3, τ4⟩, and so forth is of Type n + 1. To simplify later definitions
slightly, I will also take anything of type i to be of Type 0.
Ramsey actually takes these numbered collections of types to be types themselves,
and does not distinguish between the different types within each collection. Thus, when
I indicate types with superscripts below, I often follow Ramsey and ambiguously use
numbers as though they denote types rather than collections of types. This undoubtedly
raises difficult technical problems, but I think that I can safely ignore them for the purposes
of this paper.
The numbered types are the only functional types that Ramsey considers. These are
all types of propositional functions:9 the last type symbol appearing in their symbols is
always p. This is what I mean when I say that Ramsey’s logic is intensional—repeated
functional application always results in well-formed formulas of type p rather than of the
type t of truth-values. In this respect, Ramsey’s logic is similar to the Russellian simple
type theory presented in Church (1974). I follow Thomason (1980) in using different
symbols for the primitive constants of this logic, replacing the standard truth-functional
constants ¬, →, and ∀ with my intensional ∼, ;, and respectively. The remaining
symbols ∨, ∧, ↔, and ∃ are replaced with ∪, ∩, , and respectively, which are defined
in the usual way.10
§3. Terminology. I have not tried to provide a model for Ramsey’s logic because
model theory had not been invented when Ramsey was writing. One probably could con-
struct models for the logic I have just described, but they are not necessary for explaining
Ramsey’s resolution of the paradoxes, so I will not be concerned to do so. Of course, if
his resolution seemed promising, we would need to confirm that he really has provided a
solution by constructing a model, but I argue that his resolution is not successful, so this
should not be an issue.
This raises some terminological worries, because Ramsey uses ‘proposition’ and ‘func-
tion’, words that are now associated with models. Ramsey frequently uses these terms to
refer to strings of symbols. Thus, ‘φa’ often is a proposition—it isn’t just a formula of
type p—and ‘λx[φ(x)]’ often is a function.
Even for Ramsey, this is not the correct use of ‘proposition’—‘φa’ is a propositional
symbol, and propositional symbols are instances of propositions. Then “two proposi-
tional symbols are to be regarded as instances of the same proposition . . . when they
express agreement and disagreement with the same sets of truth-possibilities of atomic
propositions” [p. 9]. That is, they are instances of the same proposition when their
truth tables agree. He reiterates this criterion of sameness at pp. 33, 34 and employs it
again at p. 35.11
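The truth-table criterion can be made concrete with a short sketch. It rests on an assumption that is mine rather than Ramsey's: a propositional symbol is modeled as a Python function from an assignment of truth-values to atomic propositions. The names same_proposition, if_p_then_q, and not_p_and_not_q are hypothetical.

```python
from itertools import product


def same_proposition(sym_a, sym_b, atoms):
    """True when both symbols agree on every truth-possibility of the atoms."""
    for values in product([True, False], repeat=len(atoms)):
        row = dict(zip(atoms, values))
        if sym_a(row) != sym_b(row):
            return False
    return True


# Two symbols that are "quite different to look at".
def if_p_then_q(v):
    return (not v["p"]) or v["q"]


def not_p_and_not_q(v):
    return not (v["p"] and not v["q"])
```

Here same_proposition(if_p_then_q, not_p_and_not_q, ["p", "q"]) holds, so on the criterion just quoted the two symbols count as instances of one proposition.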
In contrast to propositions, Ramsey, at least most of the time, insists that functions
are just symbols. He initially defines ‘propositional function’ as “an expression of the
form ‘ f x̂’, which is such that it expresses a proposition when any symbol (of a certain
appropriate logical type depending on f ) is substituted for ‘x̂’ ” [p. 8]. Similar statements
litter ‘The Foundations of Mathematics’: “a propositional function of individuals [is] a
symbol of the form . . .” [p. 35], “functions are symbols” [p. 36], and “our definition of a
propositional function as itself a symbol” [p. 43] are some examples. Only twice does he
suggest that there is a distinction to be drawn between functions and functional symbols,
as there was for propositions. As with propositions, it arises when he is explaining identity
for functions; thus, for example, he writes,
Two such symbols [i.e., propositional functional symbols] are regarded
as the same function when the substitution of the same set of names
in the one and in the other always gives the same proposition. Thus if
‘ f (a, b, c)’, ‘g(a, b, c)’ are the same proposition for any set of a, b, c,
‘ f (x̂, ŷ, ẑ)’ and ‘g(x̂, ŷ, ẑ)’ are the same function, even if they are quite
different to look at. [p. 35]
Here it seems as though he has a notion of function that is distinct from the symbols he
has been dealing with.
This, however, is an isolated case, and rarely arises. In fact, Ramsey seldom even invokes
the distinction between propositions and propositional symbols. He is almost exclusively
concerned with propositional and functional symbols12 —in particular, his discussion and
proposed resolution of the semantical paradoxes only make use of these symbolic senses
of ‘proposition’ and ‘function’. Still, in what follows, I will use ‘propositional symbol’ and
‘functional symbol’ whenever I am talking about the actual formulas, and I will edit quotes
from Ramsey accordingly when doing so does not obscure anything relevant. In light of
this exclusive concern with propositional and functional symbols, I almost always follow
the practice of allowing symbols to name themselves and omit mention (and corner) quotes
around them. Section 2 is a good illustration of this practice; the beginning of this section,
an illustration of an exception.
11 This should sound similar to things like Carnap’s state descriptions and more modern possible
worlds. I revisit this in Section 7.3.
12 Ramsey never uses the latter term, but given that propositional symbols are well-formed formulas
of type p, I will take functional symbols to be well-formed formulas of any functional type.
13 I do not mean to say that he has no other views about meaning; he does. It is just not clear that
those views were well considered.
14 He is not this explicit about the way the orders of definienda are determined, but this is clearly
what he has in mind.
15 As an aside, it is interesting to note that Ramsey has not actually introduced a hierarchy of orders:
orders are not defined in terms of earlier orders, but of types.
16 These are the only functions that he considers [p. 35, n. 1].
17 Technically, this would require that there be mention quotes around the definiendum, as f is a
variable over symbols. In practice, and in keeping with my general laxity about mention quotes,
I take =df to imply mention rather than use and omit these quotes without loss of precision.
18 One might be concerned here that the distinction between propositions and propositional symbols
is playing more of a role in Ramsey’s resolution of the paradoxes than I claim, since it is being
appealed to here. But, as will be apparent below when the resolution is presented, the distinction
is only important in that propositions proper play no role whatsoever in the resolution.
‘ φ[ f (φ)]’, the range of φ is “the set of [all] functions [of a specific type], not . . . the set
of [functions of order 1]” [p. 44, n. 1]. Similarly, Ramsey never allows functional symbols
that can only take arguments of certain orders. This is understandable—if he allowed such
functional symbols, he would have a hard time keeping orders out of bound variables.
Note that Ramsey characterizes the range of the variable φ as a set of functions. But
for Ramsey, there are no functions proper, just functional symbols, and in fact, we can
interpret this remark as taking the ranges of bound variables of type > 0 to actually be
sets of functional symbols (or, equivalently for my purposes, symbols that are short for
functional symbols—symbols such as the ‘φ’, ‘ψ’, and ‘χ ’ introduced above). However
implausible this sounds to modern ears, it follows from Ramsey’s insistence that there are
no functions but functional symbols. And even if it didn’t, I think that it plays a crucial role
in his analysis of the semantical paradoxes; see Section 6.2.
We can now see why I take the characteristic property of Ramsey’s orders to be that they
do not restrict the ranges of bound variables beyond their type. One might have thought
that there was no such restriction simply in virtue of his definition of orders. He, at least,
seems to have thought so. After all, how can properties of symbols restrict the ranges of
bound variables, which range over objects? But if I am right that the correct interpretation
of his resolution requires that at least some variables range over symbols, then his evasion
of such a restriction—and thus his evasion of ramification—is not so immediate. In fact,
I argue below that he does restrict the ranges of variables in ways that go beyond mere type
restrictions (although not, it turns out, by order, which is a separate problem).
5.2. The relations Ri . We still do not have enough to resolve the semantical para-
doxes, because we have yet to make any changes to the formalism of simple type theory.
Orders have been defined, but they do not place any restrictions on anything. We have not
yet captured the motivation for his orders, namely, that ‘φ’, ‘ψ’, and ‘χ’ are related to their
definientia in importantly different ways.
Ramsey’s solution is that there is not just one meaning relation R, but instead one such
relation for each order. Then a formula of the form R(f, P) is true when both f actually does
mean P (in the elliptical sense mentioned above) and R is the relation corresponding to the
order of f. Given the above motivation, this idea is not entirely unreasonable. Although he
does not use this notation, I will use Rn to indicate the relation corresponding to order n,
so that we can say that Rn (f, P) is only true when f is of order n and f elliptically means P.
As an example of how this works, we can represent the above definitions with the
following formulas.
R0 ‘φ’, λx[S 1 (a, x)] ,
R1 ‘ψ’, λx y 0 [S 1 (y, x)] ,
R2 ‘χ’, λx φ 1 f 2 λz[φ(z)], x .
These are true (or, more accurately, denote true propositions), but would be false if the Ri s
were changed.
Ramsey also considers meaning relations that are appropriate to multiple orders. For
instance, we could have a relation R that is only capable of being true when its first
argument is either of order n or of order m, which we might represent with ‘Rn,m ’. For
simplicity’s sake I will restrict myself to relations only appropriate to a single order. This
restriction has no substantive consequences.
It is very important to note that Rn (f, P) is not ill formed when f is not of order n, but
simply false. Orders cannot render formulas ill formed—to allow that would be to embrace
ramification.
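The behavior of the relations Ri can be sketched as follows. The sketch assumes a hypothetical table, DEFINITIONS, that pairs each defined symbol with an opaque label for its definiens and with the order assigned to it above; neither the table nor the function R is anything Ramsey himself provides.

```python
# Hypothetical table: symbol -> (label for its definiens, its order).
DEFINITIONS = {
    "phi": ("definiens_of_phi", 0),
    "psi": ("definiens_of_psi", 1),
    "chi": ("definiens_of_chi", 2),
}


def R(n, symbol, meaning):
    """R_n(symbol, meaning): true only if symbol has order n and means `meaning`.

    A mismatch of order makes the result False; it never raises an error,
    mirroring the point that such formulas are false rather than ill formed.
    """
    if symbol not in DEFINITIONS:
        return False
    definiens, order = DEFINITIONS[symbol]
    return order == n and definiens == meaning
```

Thus R(0, "phi", "definiens_of_phi") is true, while R(1, "phi", "definiens_of_phi") is simply false.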
(8) is an axiom schema. It can be glossed as the principle that all meanings of a given
symbol f are coextensive.
Now we can prove the left-to-right direction of (4). Assume
F(‘F’), (9)
whence by (1)
φ 1 R ‘F’, λz[φ(z)] ∩ ∼φ(‘F’) . (10)
It is easy to see how Ramsey’s strategy works when (1) is replaced with
F(x) =df φ 1 R0 x, λz[φ(z)] ∩ ∼φ(x) ; (1′)
this results in (6) and (11) becoming
φ 1 R0 ‘F’, λz[φ(z)] ; φ(‘F’) (6′)
and
R0 ‘F’, λz[ψ(z)] ∩ ∼ψ(‘F’) (11′)
respectively. But since the definiens in (1′) contains a bound variable of Type 1, ‘F’ has
Order 2. Thus, as explained above, (2) will only be true if it is replaced with
R2 ‘F’, λx[F(x)] . (2′)
Clearly, we can derive nothing from (6′), (11′), and (2′), so the proof of (4) no longer goes
through.22
The obvious response is to change (1′) to
F(x) =df φ 1 R2 x, λz[φ(z)] ∩ ∼φ(x) , (1″)
21 This is the obvious revision of (8), but it is not the only possible one. Perhaps, for instance, we
would want to allow the subscripts on the two Rs to be different. This would not defeat Ramsey’s
resolution, though—see note 22—and I can think of no plausible replacement for (8) that would
do so.
22 If, as I considered in note 21, (8′) allowed different Ri s, then the left-to-right direction would still
be provable. As it is, we cannot go in either direction.
With that notational point made, we can return to examining Ramsey’s reasoning. The
idea here seems to be that since φ can be instantiated by something “of some such form
as φ 1 f 2 λz[φ(z)], x ” (or, rather, λx φ 1 f 2 λz[φ(z)], x ), (1″) involves a hidden
variable that must be able to range over functions of the same type as f —functions of
Type 2.26
It is not clear how this is supposed to play out in the formalism, because Ramsey
never explains exactly what a hidden variable is or what it means to involve one. But
to criticize Ramsey’s resolution of the paradoxes, as I wish to do, we need only understand
the conditions under which a formula involves a hidden variable, not why it does so.
To get a handle on this, it is helpful to return to the observation made above, that variables
range over symbols. The principle he is using, as best I can tell, is this: a hidden
variable of type n is involved in a formula whenever an explicit variable in that formula is
capable of ranging over symbols that contain symbols of type n.27 In (1″), φ could be the
symbol λx φ 1 f 2 λz[φ(z)], x , so according to this principle, (1″) contains (or, again,
involves) a hidden variable of Type 2, making ‘F’ of Order 3.
One ought to be dubious of this principle, and thus of any interpretation of Ramsey that
claims that he relies on it. But consider his discussion of the other semantical paradoxes.
He does not spend much time on these paradoxes, but his discussion of what he calls the
Liar, which I will call the Liar (to distinguish it from the modern metalinguistic Liar)
seems to rely on the same principle. He formulates the liar sentence as
‘ p’ p[Say(‘ p’) ∩ Rn (‘ p’, p) ∩ ∼ p]. (14)
According to Ramsey, since ‘ p’ is of order n, “ ‘ p’ may be φ n−1 [ψ n (φ)]. Hence ‘ p’
involves ψ n , and ‘I am lying’ in the sense of ‘I am asserting a false proposition of order
n’ is at least of order n + 1 and does not contradict itself” [p. 48].28
26 It is not clear whether Ramsey is here saying that all functions of Order 2 have to be of this form.
They do not, of course; ‘ φ 1 [φ(a)]’ is of Order 2 and does not contain a function of Type 2.
Ramsey was aware of formulas of this form, too; one appears at the top of p. 42. But whether he
is making a mistake here is irrelevant, as he does not need to say that the second argument to R
in (1″) must contain a function of Type 2. If he did, he would be in trouble, because as mentioned
above, the R he actually uses is not just R2 , but R0,1,2 , “the sum of [R0 , R1 , and R2 ]” [p. 45].
See also note 28.
27 Or, if we want variables to range over symbols that are short for functional symbols, whenever an
explicit variable in that formula is capable of ranging over symbols that are short for expressions
containing symbols of type n. For simplicity’s sake, I will assume that variables range over
functional symbols directly, but this is clearly not importantly different from assuming that they
range over symbols that are defined with functional symbols.
28 Again, Ramsey is working with a ‘ p’ of order n or less, but this is irrelevant. He uses φ for both
types, but I have changed one to ψ for clarity’s sake. Ramsey says that φ and ψ are of types
n and n + 1 respectively—and thus that ‘ p’ involves ψ n+1 —but this, I think, can only be a
simple mistake. Functions of order n contain bound variables of type n − 1, not type n. If we used
Ramsey’s n for φ, then the Rn would have to be Rn+1 .
Against the concerns raised in note 26, Ramsey’s use of ‘may’ here is further evidence not only
that he need not say that a variable must be instantiated by the right sort of expression, but also
that he himself relies on nothing more than that it may be so.
One might be concerned about Ramsey’s gloss of ‘I am lying’ as ‘I am asserting a false
proposition of order n’, since propositions do not have orders. I will set this issue aside for now;
I revisit it in note 41.
The first thing that one notices about this account of the Liar paradox is ‘ p’ in (14).
(Though I have often omitted mention quotes throughout the paper, I especially do so
here to avoid confusion about whether they originate in Ramsey’s text.) This is the first
time that a quoted symbol has appeared immediately following a quantifier in Ramsey’s
paper, and he does not explain what it means. The natural reading is that the variable
in ‘ p’ is a variable that ranges over propositional symbols (or symbols introduced via
definitions whose definientia are propositional symbols; see note 27). This might strike
one as somewhat odd, since I argued above that bound variables of any type (other than i)
range over symbols. If that is right, then there should be no need to have an explicit
symbol following a quantifier. But it is actually not so clear that Ramsey is ignoring the
difference between propositions and propositional symbols here: he writes, “ ‘ p’ may be
φ n−1 [ψ n (φ)],” whence, it seems, we are supposed to think that p—which ‘ p’ means—is
the proposition denoted by φ n−1 [ψ n (φ)]. If this is right, then the variable in p could
plausibly range over propositions themselves, rather than propositional symbols, and
the Rn in (14) would be different from the Ri above: its second argument would be an
object—a proposition—rather than another symbol.29
For now, this discussion can be set aside, as Ramsey is clear that the order of (14), which
is all that is relevant to his resolution of the Liar paradox, is determined by the variable in
‘ p’, not the variable in p. (The discussion will return with force in Section 8.) The only
point I wish to make here is that Ramsey’s reasoning clearly relies on the principle that
I stated above: (14) involves a hidden variable function of type n because ‘ p’30 may be
short for an expression containing a symbol of type n—because “ ‘ p’ involves ψ n ”—and
it is this hidden variable that forces (14) to be of order n + 1.
To recap: I am suggesting that a formula P contains a hidden variable of type n just in
case there is an explicit variable in P that can be instantiated by a formula containing a
constant of type n. The order of P is then m + 1, where m is the highest type of variable (explicit
or hidden) occurring in P.
6.3. Problems for Ramsey’s resolution. If this principle is really what Ramsey is
relying on, then his resolution of the paradoxes faces serious problems. I hinted at one of
these when I pointed out that (i) he seems to be talking as though giving an Ri arguments of
the wrong order yields an ill-formed formula. This actually points to a much more serious
issue, which I have also hinted at before: (ii) Ramsey seems to actually be reintroducing
a restriction on the ranges of bound variables beyond types. This is already problematic,
because this is precisely what ramified type theories do, but it will turn out that
(iii) Ramsey’s restriction is not even as good a restriction as Russellian orders are.
Ramsey’s restriction, unlike that of ramification, has no basis in the formalism—it is
entirely ad hoc. Finally, I argue that (iv) if we are allowed to restrict the ranges of variables
in this way, we can reintroduce the paradox.
Of course, Ramsey does not see it this way, so let us start from the beginning and return
to (i). I have said repeatedly that he seems to be assuming that formulas like
R0 ‘φ’, λx x 0 [ f (x)] are not just false but ill formed. He does not actually have to
assume this, but if he doesn’t, then the reasoning behind his resolution becomes extremely
29 Indeed, this might not be too surprising. Recall that though Ramsey is almost always insistent
that there are no functions but functional symbols, he does think that there are propositions
independent of propositional symbols. Perhaps that distinction is (probably unconsciously) in
play here. This becomes more significant in Section 8.2.
30 As I said above, I am adding no mention quotes here; this is the first variable occurring in (14).
31 Of course, one could add more structure to the logic in order to avoid this worry. One could
probably even argue for such additions through an appeal to Ramsey’s understanding of meaning.
But the other three worries would still stand, and spelling out his account of meaning would be a
lengthy digression, so I will not pursue this response.
This leads to one final point, (iv), which is that we can also choose the range of φ so that
‘F’ is only of Order 2, whence we can once again derive (4). To do this, we simply restrict
φ so that the only second-order functional symbols it can range over are symbols like
λx φ 1 [φ(x)] . Prohibiting this restriction would, of course, require still more arbitrary
restrictions, this time on what restrictions are permissible. Thus, after all this work, it
seems as though his resolution of the paradoxes not only requires an unmotivated and un-
wanted restriction on the ranges of bound variables, but does not even successfully resolve
the paradoxes without even more ad hoc restrictions on how these very restrictions can
look.
PART II
§7. The intensional paradoxes. So far, the intensionality of the logic I have been
using has played no role: neither the fact that well-formed formulas are of type p nor the
use of nonstandard logical symbols has been relevant. But paralleling the intensional part
of the logic, I now introduce a type t, intuitively of truth-values; the primitive constants
¬⟨t,t⟩ and →⟨t,t,t⟩; and infinitely many constants ∀⟨⟨τ,t⟩,t⟩, one for each type τ. The
constants ∨, ∧, ↔, and ∃ are defined in the usual way. I also introduce infinitely many
constants =⟨τ,τ,t⟩ and ≈⟨τ,τ,p⟩, which are the obvious identity relations for each type
τ,33 and the constant ∨⟨p,t⟩, which takes propositions to their truth-values. This suggests
32 Again, see Fraenkel & Bar-Hillel (1958, pp. 5–14), Beth (1959, §171), Kneale & Kneale (1962,
pp. 664–665), Quine (1963, pp. 254–255), Feferman (1984, p. 75), and Priest (1994, pp. 25–26).
33 If there is a reason to think that Ramsey’s logic absolutely prohibits the inclusion of identity, then
we can add Church’s strict equivalence from Church (1974). The following paradox requires a
little more work in that case, but it can still be constructed. However, it seems to me that the
reasons Church presents in Church (1974) in favor of strict equivalence work in favor of identity
(17) denotes the proposition that the only things that Aristotle has asserted (if he has
asserted anything at all) are either the proposition denoted by (15) or false propositions.
This seems to at least be possibly true, and it does not seem to contradict the denotation
of (16), so hypothesizing that the two of them are true should not be problematic. But, of
course, it is. Formally, we are supposing
∨
A a, x p [A(a, x) ; ∼x] (18)
and
∨
∀x p A(a, x) → x= y p [A(a, y) ; ∼y] ∨ ¬∨x ; (19)
these are ∨(16) and ∨(17) respectively.
From (18) we have
¬∀x p [∨A(a, x) → ¬∨x], (20)
which is simply ¬∨(15), almost immediately. The derivation is elementary; one assumes
∨(15) and, after applying the translation principles, instantiates the variable with (15) itself.
here, and the concerns raised there about using identity instead of strict equivalence do not apply
to the present logical system.
34 The example of Aristotle and assertion comes from Church (1974), although Church does not
develop any paradoxes.
(24) and (20) are contradictories, so the supposition of (18) and (19) has gone wrong
somewhere. It is hard to see where, though. It is certainly possible for Aristotle to say
(the Greek equivalent of) “Everything Aristotle asserts is false” and nothing else, which
would, at least prima facie, satisfy both assumptions; such a situation is, at least prima
facie, one in which Aristotle has said (and said only) that everything Aristotle says
is false.35
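The informal situation can be checked mechanically in a toy setting. The sketch below is not the derivation given in the text; it assumes a simplified model in which Aristotle asserts finitely many propositions, represents each by its truth-value, and asks which truth-values for the proposition P (that everything Aristotle asserts is false) agree with what P says. The function name `consistent_values` and the encoding are hypothetical.

```python
def consistent_values(other_asserted_values, aristotle_asserts_p=True):
    """Truth-values v for P = 'everything Aristotle asserts is false'
    that agree with what P says, given the truth-values of the other
    propositions Aristotle asserts (a toy encoding, not the paper's logic)."""
    solutions = []
    for v in (True, False):
        asserted = list(other_asserted_values)
        if aristotle_asserts_p:
            asserted.append(v)
        everything_asserted_is_false = all(not value for value in asserted)
        if v == everything_asserted_is_false:
            solutions.append(v)
    return solutions

# Aristotle asserts P and otherwise only falsehoods: no value is consistent.
paradoxical = consistent_values([False, False])
# Aristotle also asserts a truth: P is simply false.
harmless = consistent_values([False, True])
# Aristotle asserts only falsehoods and does not assert P: P is simply true.
unasserted = consistent_values([False, False], aristotle_asserts_p=False)
```

On this encoding the two suppositions together leave no admissible truth-value for P, while dropping either one restores a unique value.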
7.2. The difference between the intensional and the semantical paradoxes. It is
tempting to think that the above paradox—call it the Aristotle paradox—will be resolved by
whatever one adopts to resolve the Liar (and the Grelling, and maybe the Strengthened Liar,
etc.). It certainly feels similar to the semantical paradoxes. But there is a crucial difference.
Consider, for example, Tarski’s hierarchy of languages. This resolves the semantical
paradoxes by prohibiting any language from talking about the semantics of that very
language. In particular, no sentence of a given language Ln can contain a satisfaction
predicate for Ln, so one cannot construct a sentence that says of itself that it is false.
In the Aristotle paradox, though, no sentence says of itself that it is false. Indeed, there is
no appearance of a metalinguistic notion of truth at all. The only notion of truth that appears
in the Aristotle paradox is one that applies to propositions. The problematic assumption is
not that Aristotle asserts, “Every sentence Aristotle asserts is false”; the problem is that
Aristotle asserts the proposition that every proposition Aristotle asserts is false. Thus, any
restriction on metalinguistic satisfaction predicates that we wish to include in our logic
(such as a Tarskian prohibition on languages containing their own satisfaction predicates)
can be taken on board without qualification—it will do nothing to the above derivation of
a contradiction, because satisfaction of formulas is completely irrelevant there.
The preceding paragraph is a little imprecise. We can informally state the situation that
leads to the Aristotle paradox as one in which (i) Aristotle asserts the proposition that every
proposition Aristotle asserts is false and (ii) every proposition Aristotle asserts is either that
proposition or false. The ‘false’s in this informal characterization apply to propositions,
and in that sense, as I wrote above, “[t]he only notion of truth that appears in the Aristotle
35 I do not mean to say that, at the end of the day, this actually is a situation in which (18) and (19)
are both true; one way to resolve the paradoxes is to insist that it is not actually such a situation.
But if they were not even prima facie true, there would be nothing for a logician to do here; there is
work to be done precisely because (i) this (clearly possible) situation seems to be one in which
both (18) and (19) are true and (ii) (18) and (19) seem to imply a contradiction. If things did not
even seem this way, then there would be no paradox to resolve in the first place.
paradox is one that applies to propositions.” But I did not use any truth or falsity predicates
when formalizing (i) and (ii), and generally do not do so to translate English sentences
containing ‘true’ or ‘false’ (in their propositional senses, anyway). I would, for example,
represent the proposition that Aristotle asserts a true proposition with ‘ p[A(a, p) ∩ p]’,
in which nothing like a truth predicate appears; if we were saying that the proposition is
false, I would simply insert a ∼ in front of the second conjunct.36
While good to clarify—I really was speaking imprecisely—these details are irrelevant
to the main point, which is that no metalinguistic predicates of any sort appear in either the
informal statements (i) and (ii) or the formalizations thereof. As long as that is true, Tarski’s
hierarchy of semantical predicates—of satisfaction predicates—will be of no immediate
assistance in resolving the Aristotle paradox.37
Similarly, a truth-value gap approach to the semantical paradoxes will not help here
without adaptation. That approach allows sentences to lack truth-values, but again, truth-
values of sentences are irrelevant to the Aristotle paradox; truth-values of propositions are
what matter.
I do not mean to say that these are the only two approaches to the semantical paradoxes,
or that they cannot be adapted to resolve the intensional paradoxes as well.38 But such
issues are far beyond the scope of this paper, as I am not concerned here with resolutions of
the intensional paradoxes. My only concern is to show that as far as Ramsey is concerned,
the intensional paradoxes are distinct from the semantical paradoxes, and it is enough for
this purpose that resolutions of the latter do not always resolve the former.
7.3. Propositions. “As far as Ramsey is concerned” in the preceding sentence is not
innocuous. This distinction between the intensional and the semantical paradoxes might
not be all that interesting if one takes propositions to be, or at least be very much like,
sentences. Thus, one might worry at this point that I am imposing too specific an account
of propositions on Ramsey. Russell, for instance, is notorious for not being clear about the
distinction between propositions and the sentences that express them. While I think that
the issues raised by the intensional paradoxes are interesting in their own right, they would
36 This way of formalizing the relevant propositions (without the ∨ constant) has been taken by
others working on these paradoxes; Prior (1961) is a notable example. The general approach of
translating English sentences containing ‘true’ and ‘false’ without truth or falsity predicates is
discussed at length in Grover et al. (1974).
37 One might think that ∨ is the relevant truth predicate, contra my claim that no metalinguistic
truth predicate appears in my formalizations of (i) and (ii). If it were, then the Aristotle paradox
wouldn’t be very surprising—it would be arising in a logic that Tarski already proved couldn’t
have models. But the ∨ s in (18) and (19) only serve to say that the propositions expressed
by (16) and (17)—which are formalizations of (i) and (ii) respectively and which contain no
truth predicates of any sort—are true. That is, if one wants to think of ∨ as a truth predicate, one
must think of it as a propositional, not metalinguistic, truth predicate, and this is precisely the
distinction that I am trying to draw between the intensional and semantical paradoxes: the former
are concerned with propositional truth; the latter, metalinguistic.
38 Indeed, the ramified theory of types is not entirely unlike Tarski’s hierarchy—especially in light
of Church (1976)—and it has been employed to resolve these paradoxes in, for example, Church
(1993, p. 152). And there is no reason to think that an approach that posits some sort of proposition
gaps would be hopeless, although none has been developed in perfect detail. Parsons argues for
such an approach in Parsons (1974), and Bealer (1994, p. 162) at least thinks that positing gaps is
the most promising extant route. But what—and, more importantly, where—such gaps are exactly
is never very clear, and such an approach has never been worked out in print in any detail (as far
as I know).
have little place in this paper if it turned out that Ramsey was equally unclear, or if he
clearly took propositions and sentences to be importantly alike.
Luckily, Ramsey is explicit both about the distinction between propositions and formulas
and about the nature of propositions themselves. As I said in Section 3, his terminology is
somewhat confusing at times, but he seems to be clear on the distinction in practice, even if
his notation does not observe it very scrupulously. In fact, I think that his understanding of
propositions is very close to the popular modern account of propositions as sets of possible
worlds.39 One of the quotes in Section 3 already suggests this. As I wrote there,
“two propositional symbols are to be regarded as instances of the same
proposition . . . when they express agreement and disagreement with the
same sets of truth-possibilities of atomic propositions” [p. 9]. That is,
they are instances of the same proposition when their truth tables agree.
Here, I have glossed expressing agreement and disagreement with sets of truth-
possibilities of atomic propositions as having truth tables that agree. But it is easy to
think of each row of a truth table picking out a set of possible worlds, namely, the worlds
at which the atomic propositions in the truth table have the truth-values they are assigned
on that row. One can then think of two propositional symbols as instantiating the same
proposition, to use Ramsey’s terminology, when they are true at the same possible worlds.
This, at least, should behave the same as his notion of agreeing and disagreeing with the
same truth-possibilities.
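This identification can be illustrated with a small sketch. The Python below assumes just two atomic propositions and treats each row of the truth table as a possible world; the names `WORLDS` and `proposition` and the sample propositional symbols are hypothetical and are not Ramsey's.

```python
from itertools import product

ATOMS = ("p", "q")  # assumed atomic propositions, for illustration only

# Each row of the truth table is treated as a world: an assignment of
# truth-values to the atomic propositions.
WORLDS = [dict(zip(ATOMS, values))
          for values in product([True, False], repeat=len(ATOMS))]

def proposition(symbol):
    """The set of worlds (row indices) at which the propositional symbol is true."""
    return frozenset(i for i, world in enumerate(WORLDS) if symbol(world))

# Two syntactically different propositional symbols with the same truth table.
not_p_or_q = lambda w: (not w["p"]) or w["q"]
not_both_p_and_not_q = lambda w: not (w["p"] and not w["q"])
# A third symbol whose truth table differs.
p_and_q = lambda w: w["p"] and w["q"]

same = proposition(not_p_or_q) == proposition(not_both_p_and_not_q)
different = proposition(not_p_or_q) != proposition(p_and_q)
```

Here `same` holds because the first two symbols are true on exactly the same rows, so on this reading they are instances of one proposition.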
§8. Intensional paradoxes in Group B. I have argued that, given Ramsey’s account
of propositions, the intensional paradoxes are distinct from the semantical paradoxes. But
this is not enough; I also have to argue that Ramsey was worried (at least de re, if not de
dicto) about the intensional paradoxes. The argument here is a bit more intricate than one
might expect, because all of Ramsey’s formalizations of the Group B paradoxes actually
involve semantical notions: Tarski’s hierarchy, for example, will resolve them all. But this
does not mean that the formalizations he provides are the only ones available, or the only
ones that he would have endorsed. Indeed, given the informal statements of the paradoxes
that both he and Russell—from whom Ramsey inherited all but one of the paradoxes he
lists—provide, I think that it is plain that one can construct more faithful formalizations in
some cases. In particular, I think that he has shoehorned meaning into his formalization of
the Liar, and that a more natural formalization of even the sentence that he starts with,
and definitely the sentence that Russell starts with, involves no semantical predicates.
If (i) this is right, and if (ii) we are interested in resolving the (informal statements of the)
paradoxes that Ramsey began with, then we need to look beyond the standard resolutions
of the semantical paradoxes even to resolve all of the paradoxes that Ramsey himself was
worrying about (though we do not need to do so to deal with all of his formalizations
of those paradoxes). Given the attention people have paid to Ramsey’s account of the
paradoxes, (ii) seems to be obviously true. I hope to show that (i) is as well.
39 Or the precursor to that account, Carnap’s notion of state descriptions. In fact, Carnap (1956, p. 9)
writes, “Some ideas of Wittgenstein were the starting-point for the development of [state-
descriptions],” citing the Tractatus in a footnote. Meanwhile, §I of ‘The Foundations of
Mathematics’, which contains most of Ramsey’s discussion of propositions, makes frequent
reference to Wittgenstein and his Tractatus. It is thus not surprising that Ramsey’s account of
propositions is similar to Carnap’s state descriptions.
paradox (Paradox 8), I will represent it with Rn , and retranslate (25) as
‘ p’ p[Say(‘ p’) ∩ Rn (‘ p’, p) ∩ ∼ p]. (14 )
The idea seems to be that one cannot say the sentence (14 ) and nothing else, though
there is nothing intuitively contradictory about such a supposition. If we were to take this
formalization seriously, we would probably want to rephrase it using more modern tech-
niques for representing self-reference, which had not been developed when Ramsey was
writing. As with the Aristotle paradox, one could then spell the supposition out carefully
with ∨ .
I am not going to take this formalization seriously, though, because it involves a met-
alinguistic relation, Rn , and a metalinguistic speech predicate, Say. Of course, with such
relations present, resolutions of the semantical paradoxes can be easily adapted to prohibit
any contradiction from forming. Consider, for example, Tarski’s argument that no language
can contain its own satisfaction predicate. One can easily adapt that argument to show
that no language with a constant like ∨ can contain an expression relation that holds
between formulas and the propositions they express. That is, one can show that, on pain of
contradiction, no language can include both ∨ and Rn . If we were to follow Tarski all the
way, this would lead us to develop a hierarchy of Rn relations, in much the same way that
Tarski proposes a hierarchy of truth predicates.
This, however, is a very unsatisfactory resolution of Paradox 4, especially in light of
the way Ramsey himself glosses ‘I am lying’ immediately after presenting this resolution.
As I quoted in Section 6.2, he thinks that he has shown that “ ‘I am lying’ in the sense
of ‘I am asserting a false proposition of order n’ . . . does not contradict itself” [p. 48].41
But surely ‘I am asserting a false proposition of order n’ does not involve any semantical
notions—it just involves the notion of assertion as applied to a proposition (a proposition
which happens to be about itself). Thus, although Ramsey’s own formalization of Para-
dox 4 will be resolved by, say, an adaptation of Tarski’s argument, it does not seem like
the best formalization of Paradox 4 in the first place. If we are going to take Paradox 4
seriously, then, we ought to try to construct a more faithful formalization of it, and see if
it, too, will turn out to be no different than the semantical paradoxes.
8.3. Better formalizations of Paradox 4 and a related paradox. There are two places
we could look for the informal statements that we are attempting to formalize. Ramsey’s
statement of the paradoxes is, of course, one place. But since he took himself to be sim-
ply repeating the list in Whitehead & Russell (1910) (with the exception of Grelling’s
paradox), one might also look there. If it turns out that there is an important difference be-
tween a paradox listed in ‘The Foundations of Mathematics’ and its precursor in Principia
Mathematica, then it seems reasonable to think that even Ramsey would be interested in
resolving both the paradox he lists and the distinct version from Russell. I think that such
a difference does turn up with respect to Paradox 4 and what I will call Contradiction 1,
the first of seven contradictions listed at Whitehead & Russell (1910, pp. 60–61); I discuss
possible formalizations of both paradoxes (or, more precisely, of the informal statements
of both paradoxes).
41 As I said in note 28, this is not actually a good gloss of his formalization, because propositions
do not have orders: it is not the order of the proposition, but the order of the propositional symbol
in (25)—and (14 )—that leads to the hidden variable. A better gloss would be ‘I am saying a
sentence of order n that denotes a false proposition’; in addition to attributing the order to the
right sort of thing, this captures the essentially semantical nature of (25).
42 Barwise & Etchemendy (1987) addresses self-referential (they prefer ‘circular’) propositions, but
it is highly unusual in this. While I think that the suggestions made in that book deserve closer
attention, this is not the place for it.
43 This is just the way it is put in Whitehead & Russell (1910), as quoted above.
44 It does involve an assertion relation that relates an individual to a proposition directly, and there is
no evidence that Ramsey considered such relations, but this does not strike me as a huge problem.
Certainly we now are happy to countenance propositional attitudes, and I see no reason to think
that Ramsey would be averse to them.
purposes. Luckily, (27) is very close to a formalization of one of the paradoxes listed in
Whitehead & Russell (1910), about which the point can be made without such concerns.
As my aim is only to show that there is at least one paradox that Ramsey would have been
interested in that is not resolved along with the semantical paradoxes, this should suffice.
8.3.2. Russell’s Contradiction 1. There are seven paradoxes, or contradictions, dis-
cussed in Whitehead & Russell (1910); they are basically Paradoxes 1–7 in a different
order. The first of the seven, which I will call Contradiction 1, is the analogue of Paradox 4.
It is introduced:
Epimenides the Cretan said that all Cretans were liars, and all other
statements made by Cretans were certainly lies. Was this a lie? The
simplest form of this contradiction is afforded by the man who says “I am
lying”; if he is lying, he is speaking the truth, and vice versa. (Whitehead
& Russell, 1910, p. 60)
Later, when discussing the resolution of the paradoxes by the ramified theory of types
[p. 62], the sentence I quoted above, ‘There is a proposition which I am affirming and which
is false’, is presented. But as all of these ostensible simplifications involve indexicality and
possibly self-reference, neither of which is present in the original statement, I want to focus
on the Epimenides paradox itself.
The problematic state of affairs here is one in which three things obtain: (i) Epimenides
says the proposition that every proposition a Cretan says is false; (ii) Epimenides is a
Cretan; and (iii) every other proposition any Cretan has said is false. As I said above,
Russell is not known for his care in distinguishing propositions from sentences, so one
might worry about my use of ‘proposition’ in (i) and (iii). But he uses ‘proposition’ in the
simplified version from p. 62, and Ramsey clearly takes those propositions to be distinct
from the sentences that denote them in a very modern way, so I do not think that my use of
the term is illicit.
It should now be clear how these formalizations go, but for completeness, I include them
here. Letting ei be Epimenides, S i, p, p (x, y) mean that x says (the proposition) y, and
C i, p (x) mean that x is a Cretan,
∨
S e, x p y i C(y) ; [S(y, x) ; ∼x] , (28)
∨
C(e), (29)
and
∀x p ∀y i ∨C(y) → ∨S(y, x) →
x= x p y i C(y) ; [S(y, x) ; ∼x] ∨ ¬∨x (30)
§10. Acknowledgments. Thanks to Rich Thomason for many helpful comments and
suggestions throughout the development of this paper. Thanks also to Gabriel Sandu for
comments that helped frame my thoughts, especially regarding Section 6.2, and an anony-
mous referee, whose comments helped me clarify and tighten up several parts of Part II.
45 The intensional paradoxes have not been completely ignored. See, for example, Prior (1961),
Church (1993, p. 152), Bealer (1982, pp. 98–100), and Klement (2002). But attention to them
beyond these and with the realization that they are importantly different from the semantical
paradoxes has been minimal.
BIBLIOGRAPHY
Barwise, J., & Etchemendy, J. (1987). The Liar. Oxford, UK: Oxford University Press.
Bealer, G. (1982). Quality and Concept. Oxford, UK: Oxford University Press.
Bealer, G. (1994). Property theory: The type-free approach v. the Church approach. Journal
of Philosophical Logic, 23, 139–171.
Beth, E. W. (1959). The Foundations of Mathematics. Amsterdam, The Netherlands:
North-Holland Publishing Company.
Braithwaite, R. B., editor. (1931). The Foundations of Mathematics and Other Logical
Essays. London: Routledge and Kegan Paul. Collected papers of Frank P. Ramsey.
Carnap, R. (1956). Meaning and Necessity (second edition). Chicago, IL: Chicago
University Press. First edition published in 1947.
Church, A. (1932). Review of The Foundations of Mathematics and Other Logical Essays.
The American Mathematical Monthly, 39, 355–357.
Church, A. (1974). Russellian simple type theory. Proceedings of the American
Philosophical Association, 47, 21–33.
Church, A. (1976). Comparison of Russell’s resolution of the semantical antinomies with
that of Tarski. Journal of Symbolic Logic, 41, 747–760.
Church, A. (1993). A revised formulation of the logic of sense and denotation. Alternative
(1). Noûs, 27, 141–157.
Fraenkel, A. A., & Bar-Hillel, Y. (1958). Foundations of Set Theory. Amsterdam, The
Netherlands: North-Holland Publishing Co.
Feferman, S. (1984). Toward useful type-free theories, I. Journal of Symbolic Logic, 49,
75–111.
Grover, D. L., Camp, J. L. Jr., & Belnap, N. D. Jr. (1974). A prosentential theory of truth.
Philosophical Studies, 27, 73–125.
Kneale, W., & Kneale, M. (1962). The Development of Logic. Oxford, UK: Oxford
University Press.
Klement, K. C. (2002). Frege and the Logic of Sense and Reference. New York, NY:
Routledge.
Mellor, D. Hugh, editor. (1990). Philosophical Papers. Cambridge, England: Cambridge
University Press. Collected papers of Frank P. Ramsey.
Parsons, C. (1974). The liar paradox. Journal of Philosophical Logic, 3, 381–412.
Prior, A. N. (1961). On a family of paradoxes. Notre Dame Journal of Formal Logic, 2,
16–32.
Priest, G. (1994). The structure of the paradoxes of self-reference. Mind, New Series,
103(409), 25–34.
Quine, W. V. O. (1963). Set Theory and Its Logic. Cambridge, MA: Harvard University
Press.
Ramsey, F. P. (1925). The foundations of mathematics. Proceedings of the London
Mathematical Society, 25, 338–384. Reprinted in Braithwaite (1931); Mellor (1990).
Russell, B. (1932). Review of the Foundations of Mathematics and Other Logical Essays.
Philosophy, 7, 84–86.
Thomason, R. H. (1980). A model theory for propositional attitudes. Linguistics and
Philosophy, 4, 47–70.
Whitehead, A. N., & Russell, B. (1910). Principia Mathematica (first edition). Cambridge,
England: Cambridge University Press.
DEPARTMENT OF PHILOSOPHY
UNIVERSITY OF MICHIGAN
435 SOUTH STATE STREET
ANN ARBOR, MI 48109–1003
E-mail: dtuck@umich.edu
THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010
Abstract. We show that if we interpret modal diamond as the derived set operator of a topo-
logical space, then the modal logic of Stone spaces is K4 and the modal logic of weakly scattered
Stone spaces is K4G. As a corollary, we obtain that K4 is also the modal logic of compact Hausdorff
spaces and K4G is the modal logic of weakly scattered compact Hausdorff spaces.
© Association for Symbolic Logic, 2010
doi:10.1017/S1755020309990335
MODAL LOGIC OF STONE SPACES 27
of compact Hausdorff spaces is K4 and the modal logic of weakly scattered compact
Hausdorff spaces is K4G.
§2. Preliminaries. In this paper we will be interested in the following modal logics:
1. K4 = K + ◇◇p → ◇p;
2. K4D = K4 + ◇⊤;
3. K4G = K4 + ¬□⊥ → ¬□¬□⊥; and
4. GL = K + □(□p → p) → □p.
It is well known that K4 is the modal logic of transitive frames, that K4D is the
modal logic of transitive serial frames, and that GL is the modal logic of dually well-
founded frames. These three logical systems are well known in the literature (see, e.g.,
Chagrov & Zakharyaschev, 1997). On the other hand, K4G is a relatively new system
introduced in Esakia (2002). Its main importance lies in its capability to express modally
Gödel’s second incompleteness theorem (a consistent logical system cannot prove its own
consistency).
Each of the four modal logics is complete with respect to its relational semantics. We
briefly recall some basic facts about relational semantics which will be used subsequently.
Let F = (W, R) be a K4-frame; that is, F is transitive (wRv and vRu imply wRu). Then
F is a K4D-frame if in addition it is serial (i.e., for each w ∈ W there exists v ∈ W such
that wRv). We call w ∈ W a reflexive point if wRw; otherwise we call w an irreflexive
point. Let
C(w) = {w} ∪ {v ∈ W : wRv and vRw}.
We call C(w) the cluster generated by w; we also call a subset C of W a cluster if C =
C(w) for some w ∈ W. Let C be a cluster of W. We call C proper if it consists of more than
one element, simple if it consists of a single reflexive point, and degenerate if it consists
of a single irreflexive point. We call w ∈ W a maximal point if wRv implies w = v, and
a quasimaximal point if wRv implies vRw. Clearly each maximal point is quasimaximal,
but not vice versa.
Now, F is a GL-frame iff F is dually well founded (i.e., for each nonempty subset V
of W there exists v ∈ V such that vRu for no u ∈ V); and F is a K4G-frame iff F is a
K4-frame and for each w ∈ W, either w is an irreflexive maximal point or there exists an
irreflexive maximal point v ∈ W such that wRv.
We say that w ∈ W is a root of F if wRv for each v ∈ W − {w}, and that F is rooted if
there exists a root in F. Note that a root may not be unique. In fact, if w is a root, then each
element of C(w) is also a root.
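For finite frames these definitions can be checked directly. The following Python is a minimal sketch, assuming a frame is given as a set W of points and a set R of ordered pairs; all function names and the two sample frames are hypothetical. The GL check relies on the assumption, stated in the comment, that a finite transitive frame is dually well founded exactly when it has no reflexive points.

```python
def is_transitive(W, R):
    """wRv and vRu imply wRu."""
    return all((w, u) in R for (w, v) in R for (v2, u) in R if v == v2)

def cluster(W, R, w):
    """C(w) = {w} together with every v such that wRv and vRw."""
    return {w} | {v for v in W if (w, v) in R and (v, w) in R}

def roots(W, R):
    """Points w with wRv for every v other than w."""
    return {w for w in W if all((w, v) in R for v in W if v != w)}

def is_irreflexive_maximal(W, R, w):
    # Irreflexive (not wRw) and maximal (wRv implies w = v) together
    # mean that w has no R-successors at all.
    return all((w, v) not in R for v in W)

def is_k4g_frame(W, R):
    """Transitive, and every point is or sees an irreflexive maximal point."""
    return is_transitive(W, R) and all(
        is_irreflexive_maximal(W, R, w)
        or any((w, v) in R and is_irreflexive_maximal(W, R, v) for v in W)
        for w in W
    )

def is_finite_gl_frame(W, R):
    # Assumption: for a finite transitive frame, being dually well founded
    # amounts to having no reflexive points.
    return is_transitive(W, R) and all((w, w) not in R for w in W)

# Sample frame: 0 sees 1 and 2; 1 is reflexive and sees 2; 2 sees nothing.
W = {0, 1, 2}
R = {(0, 1), (0, 2), (1, 1), (1, 2)}
# Sample frame consisting of a single proper cluster.
TWO_CLUSTER = ({0, 1}, {(0, 0), (0, 1), (1, 0), (1, 1)})
```

In the first sample frame the point 2 is an irreflexive maximal point seen by both other points, so the frame is a K4G-frame although the reflexive point 1 keeps it from being a GL-frame. In the second, both points are roots, which illustrates that a root need not be unique.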
The next proposition states that all four modal logics of our interest have the finite model
property.
28 GURAM BEZHANISHVILI ET AL.
PROPOSITION 2.1.
1. K4 is the modal logic of finite rooted transitive frames.
2. K4D is the modal logic of finite rooted transitive serial frames.
3. GL is the modal logic of finite rooted transitive irreflexive frames.
4. K4G is the modal logic of finite rooted K4G-frames.
Proof. For (1) and (2) see, for example, Chagrov & Zakharyaschev (1997, corollary
5.3.2); and for (3) see, for example, Chagrov & Zakharyaschev (1997, theorem 5.46).
We sketch a proof that K4G is the modal logic of finite rooted K4G-frames, using the
standard filtration argument through a well-chosen set of formulas. If K4G ⊬ ϕ, then
ϕ is refuted on the canonical model MK4G of K4G. Since K4 is a canonical logic and
the formula ¬□⊥ → ¬□¬□⊥ contains no propositional letters, the underlying frame of
MK4G is a K4G-frame. Consider the standard transitive filtration (see, e.g., Chagrov &
Zakharyaschev, 1997, pp. 141–145) of MK4G through the set
Σ = {ψ : ψ is a subformula of ϕ ∧ (¬□⊥ → ¬□¬□⊥)}.
Since the underlying frame of MK4G is a K4G-frame, it is not difficult to see that the finite
refutation frame obtained by such a filtration has all quasimaximal clusters degenerate.
Indeed, let x be an arbitrary element in the filtrated model. Then x can be identified with
a maximal consistent subset of Σ. Suppose x is not an irreflexive maximal point. Then
x must contain ¬□⊥. We also have that ¬□⊥ → ¬□¬□⊥ ∈ x. Therefore, by Modus
Ponens, ¬□¬□⊥ ∈ x. But then x is related to some y in the filtrated model with □⊥ ∈
y. This implies that y is an irreflexive maximal point of the filtrated model. Thus, the
underlying frame of the filtrated model is a finite K4G-frame. That ϕ can be refuted on a
finite rooted K4G-frame is now straightforward.
Let X be a topological space and A ⊆ X. We recall that x ∈ X is a limit point of A if
for each open neighborhood U of x we have A ∩ (U − {x}) ≠ ∅. Let d(A) denote the set
of limit points of A; d(A) is called the derived set of A. It is obvious that the closure of A
is A union d(A); that is, cl(A) = A ∪ d(A).
We also recall that a valuation of the basic modal language in a topological space X is a
map ν from the set of propositional letters into the powerset of X . Given a valuation ν and
x ∈ X , we define the satisfaction relation by induction:
1. x ⊨ν p iff x ∈ ν(p);
2. x ⊨ν ϕ ∧ ψ iff x ⊨ν ϕ and x ⊨ν ψ;
3. x ⊨ν ¬ϕ iff not x ⊨ν ϕ; and
4. x ⊨ν ◇ϕ iff for each open neighborhood U of x there exists y ∈ U − {x} such that
y ⊨ν ϕ.
It follows that
2a. x ⊨ν ϕ ∨ ψ iff x ⊨ν ϕ or x ⊨ν ψ
and that
4a. x ⊨ν □ϕ iff there exists an open neighborhood U of x such that y ⊨ν ϕ for each
y ∈ U − {x}.
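On a finite space the satisfaction clauses can be evaluated directly. The sketch below is only an illustration under stated assumptions: a space is given by its set of points and an explicit list of all its open sets, formulas are encoded as nested tuples, and the diamond clause is computed as the derived set of the truth set of its argument. The names `truth_set` and `box`, the tuple encoding, and the sample space (the two-point Sierpiński space) are hypothetical choices, not part of the paper.

```python
def truth_set(points, opens, valuation, formula):
    """Set of points satisfying `formula`; formulas are nested tuples."""
    op = formula[0]
    if op == "var":
        return frozenset(valuation[formula[1]])
    if op == "not":
        return frozenset(points) - truth_set(points, opens, valuation, formula[1])
    if op == "and":
        return (truth_set(points, opens, valuation, formula[1])
                & truth_set(points, opens, valuation, formula[2]))
    if op == "dia":
        # Diamond as derivative: x satisfies the formula iff every open
        # neighborhood of x contains a point other than x satisfying the argument.
        A = truth_set(points, opens, valuation, formula[1])
        return frozenset(
            x for x in points
            if all(A & (U - {x}) for U in opens if x in U)
        )
    raise ValueError(f"unknown connective: {op!r}")

def box(formula):
    """Box defined as the dual of diamond."""
    return ("not", ("dia", ("not", formula)))

# Sierpinski space on {0, 1}: the open sets are the empty set, {1}, and {0, 1}.
POINTS = {0, 1}
OPENS = [frozenset(), frozenset({1}), frozenset({0, 1})]
NU = {"p": {1}}
P = ("var", "p")
BOTTOM = ("and", P, ("not", P))

dia_p = truth_set(POINTS, OPENS, NU, ("dia", P))
dia_dia_p = truth_set(POINTS, OPENS, NU, ("dia", ("dia", P)))
box_bottom = truth_set(POINTS, OPENS, NU, box(BOTTOM))
```

In this sample space the box of falsum holds exactly at the isolated point 1, in line with clause 4a: the neighborhood {1} contains no point other than 1.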
Given a topological space X, a valuation ν, and a formula ϕ, we say that ϕ is true in X
if x ⊨ν ϕ for each x ∈ X and that ϕ is valid if ϕ is true under any valuation. If ϕ is valid
in X, then we write X ⊨ ϕ.
Let L(X) = {ϕ : X ⊨ ϕ}. Then it is well known (and easy to verify) that L(X) is
a modal logic, called the modal logic of X. Given a class K of topological spaces, let
L(K) = ⋂{L(X) : X ∈ K}. Obviously L(K) is a modal logic, called the modal logic
of K.
Let X be a topological space. We recall that X is a TD -space if each point of X is the
intersection of an open subset and a closed subset of X . Alternatively, X is a TD -space iff
dd(A) ⊆ d(A) for each A ⊆ X . We also recall that x ∈ X is an isolated point if {x} is
an open subset of X . Let iso(X ) denote the set of isolated points of X . Then X is called
dense-in-itself if iso(X ) = ∅. Alternatively, X is dense-in-itself iff d(X ) = X .
We say that a subset A of X is dense if cl(A) = X , that X is weakly scattered if iso(X )
is dense in X , and that X is scattered if each subspace of X is weakly scattered.
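The conditions just recalled are easy to test on finite spaces. The following Python is a minimal sketch, assuming a space is given by its points and an explicit list of all open sets (as frozensets); the function names and the two sample spaces are hypothetical. Density is checked through the standard equivalent condition that a set is dense iff it meets every nonempty open set.

```python
from itertools import combinations

def derived_set(points, opens, A):
    """Limit points of A: every open neighborhood of x meets A outside x."""
    A = frozenset(A)
    return frozenset(
        x for x in points
        if all(A & (U - {x}) for U in opens if x in U)
    )

def subsets(points):
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def is_td(points, opens):
    """T_D condition in derived-set form: dd(A) is contained in d(A) for all A."""
    return all(
        derived_set(points, opens, derived_set(points, opens, A))
        <= derived_set(points, opens, A)
        for A in subsets(points)
    )

def isolated_points(points, opens):
    return frozenset(x for x in points if frozenset({x}) in opens)

def is_dense(points, opens, A):
    # A is dense iff it meets every nonempty open set.
    return all(U & frozenset(A) for U in opens if U)

def is_weakly_scattered(points, opens):
    return is_dense(points, opens, isolated_points(points, opens))

POINTS = {0, 1}
# Sierpinski space: the point 1 is isolated and {1} is dense.
SIERPINSKI = [frozenset(), frozenset({1}), frozenset({0, 1})]
# Indiscrete two-point space: no isolated points, and not T_D.
INDISCRETE = [frozenset(), frozenset({0, 1})]
```

The indiscrete space fails the T_D condition because d({0}) = {1} while dd({0}) = {0}; it is dense-in-itself, since it has no isolated points, and so is not weakly scattered.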
The next proposition is well known. It shows that three of the four logics we are inter-
ested in are all modal logics of natural classes of topological spaces.
PROPOSITION 2.2.
On the other hand, it will follow from our results that K4G is the modal logic of weakly
scattered TD -spaces.
A particularly important class of topological spaces is that of compact Hausdorff spaces.
Since each Hausdorff space is TD , it follows that the modal logic of compact Hausdorff
spaces contains K4.
We recall that a subset A of a topological space is clopen if it is both closed and open, and
that X is zero-dimensional if clopen subsets of X form a basis for the topology. Compact
Hausdorff zero-dimensional spaces are often called Stone spaces. They play an important
role in the theory of Boolean algebras as it follows from Stone duality that the category of
Boolean algebras and Boolean algebra homomorphisms is dually equivalent to the category
of Stone spaces and continuous maps. Under Stone duality, atomless Boolean algebras
correspond to dense-in-itself Stone spaces, atomic Boolean algebras correspond to weakly
scattered Stone spaces, and superatomic Boolean algebras correspond to scattered Stone
spaces.
It follows from Shehtman (1990) that K4D is the modal logic of any dense-in-itself
zero-dimensional metrizable space. In particular, K4D is the modal logic of the Cantor space C.
Since C is a dense-in-itself Stone space, it follows that the modal logic of dense-in-itself
Stone spaces is K4D. In addition, it follows from Abashidze (1988) that GL is the modal
logic of any ordinal $\alpha \geq \omega^\omega$ (viewed as a topological space in the interval topology). In
particular, GL is the modal logic of $\omega^\omega + 1$. Since $\omega^\omega + 1$ is a scattered Stone space, it
follows that GL is the modal logic of scattered Stone spaces.
In this paper we show that K4 is the modal logic of all Stone spaces and K4G is the
modal logic of weakly scattered Stone spaces. As a consequence, we obtain that K4 is also
the modal logic of all compact Hausdorff spaces and K4G is the modal logic of weakly
scattered compact Hausdorff spaces. Consequently, K4G is also the modal logic of weakly
scattered TD -spaces. Thus, we obtain the following picture:
1. K4 = the modal logic of TD -spaces = the modal logic of compact Hausdorff spaces
= the modal logic of Stone spaces;
2. K4D = the modal logic of dense-in-itself TD -spaces = the modal logic of dense-in-
itself compact Hausdorff spaces = the modal logic of dense-in-itself Stone spaces;
3. K4G = the modal logic of weakly scattered TD -spaces = the modal logic of weakly
scattered compact Hausdorff spaces = the modal logic of weakly scattered Stone
spaces; and
4. GL = the modal logic of scattered spaces = the modal logic of scattered compact
Hausdorff spaces = the modal logic of scattered Stone spaces.
§3. Modal logic of dense-in-itself Stone spaces: a new proof. As we pointed out in
the previous section, K4D is the modal logic of the Cantor space. In this section we give
a new and simplified proof of this result by adopting the technique developed in Aiello
et al. (2003) for proving completeness of S4 with respect to the Cantor space (when ◇ is
interpreted as the closure operator).
We proceed as follows. By Proposition 2.1(2), K4D is complete with respect to finite
rooted K4D-frames. Therefore, if K4D ⊬ ϕ, then there exists a finite rooted K4D-frame
F = (W, R) such that F ⊭ ϕ. Since F is a K4D-frame, each quasimaximal point of F is
reflexive. Figuratively speaking, F is top-reflexive.
We recall that U ⊆ W is an upset of W if w ∈ U and wRv imply v ∈ U, and that the
collection of upsets of W forms a topology $\tau_R$ on W, called an Alexandroff topology (in
which the intersection of any family of open subsets is again open). We also recall from
Bezhanishvili et al. (2005) that a map f from a topological space (X, τ) into (W, R) is a
d-morphism if:
(i) f is continuous ($V \in \tau_R$ implies $f^{-1}(V) \in \tau$),
(ii) f is open ($U \in \tau$ implies $f(U) \in \tau_R$),
(iii) f is i-discrete (w an irreflexive point of W implies $f^{-1}(w)$ is a discrete subspace
of X), and
(iv) f is r-dense (w a reflexive point of W implies $f^{-1}(w)$ is a dense-in-itself subspace
of X),
and that onto d-morphisms preserve validity of formulas; or put differently, they reflect
refutation. Therefore, in order to refute ϕ on the Cantor space C, it is sufficient to construct
a d-morphism from C onto W.
LEMMA 3.1. For each finite rooted K4D-frame F, there exists a d-morphism f : C ↠ F
from the Cantor space C onto F.
Proof. We view C as the collection of infinite paths of the infinite binary tree $T_2$.
It is left to be shown that f is an onto d-morphism. That f is onto is obvious from the
definition of f. To see that f is open, let X be a finite path of $T_2$ and the end of X be
labeled by w. Then, by the definition of f, we have $f(B_X) \subseteq R^+(w)$. Conversely, if
$v \in R^+(w)$, then there exists a finite path Y extending X whose end is labeled by v. Let
σ = (Y, 0, 0, . . .). Then $\sigma \in B_Y \subseteq B_X$ and f(σ) = v. Thus, $f(B_X) = R^+(w)$, and so f
is open.
To see that f is continuous, let w ∈ W. We let
$U = \bigcup \{B_X : X \text{ is a finite path of } T_2 \text{ whose end is labeled by } v \in R^+(w)\}$,
§5. Main results. In this section we prove our main results, that the modal logic of
Stone spaces is K4, and that the modal logic of weakly scattered Stone spaces is K4G. As
a corollary, we obtain that the modal logic of compact Hausdorff spaces is also K4 and that
the modal logic of weakly scattered compact Hausdorff spaces is K4G.
The key observation in establishing our main results is that each finite quasitree F =
(W, R) is a d-morphic image of an appropriately chosen Stone space. Our strategy will be
as follows:
1. Represent F as the disjoint union of two finite frames D and T in such a way that:
– D is a top-reflexive quasitree, hence a K4D-frame;
– T is the disjoint union of irreflexive trees $T_1, \dots, T_n$, hence a GL-frame.
2. Use Lemma 3.1 to build a d-morphism f from the Cantor space C onto D.
3. Use Lemma 4.2 to build d-morphisms $g_i$ from limit ordinals $\omega^{k_i+1}$ onto the
trees $T_i$.
4. Combine C and $\omega^{k_1+1}, \dots, \omega^{k_n+1}$ to obtain a Stone space X.
5. Combine f and $g_1, \dots, g_n$ to obtain a d-morphism from X onto F.
For Step (1) we employ a method reminiscent of the Cantor–Bendixson theorem which
represents each space X as the disjoint union of an open subspace U and a closed subspace
F so that U is scattered and F is dense-in-itself.
LEMMA 5.1. Let F = (W, R) be a finite quasitree. Then there exist finite (possibly
empty) frames D = (D, $R_D$) and T = (T, $R_T$) such that:
(i) W = D ∪ T, D ∩ T = ∅, $R_D$ is the restriction of R to D, and $R_T$ is the restriction
of R to T;
(ii) D is a top-reflexive quasitree; and
(iii) T is the disjoint union of irreflexive trees $T_1, \dots, T_n$.
Proof. We first build D by applying repeatedly the operator $R^{-1}$ to W until we reach the
(largest) fixpoint. More precisely, let $D_0 = W$ and $D_{i+1} = R^{-1}(D_i)$. Clearly $D_{i+1} \subseteq D_i$.
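The fixpoint iteration at the start of this proof can be sketched as follows. This is a minimal illustration, assuming that $R^{-1}(A)$ denotes the set of points that see some point of A; the function names and the example frame are hypothetical and not taken from the paper.

```python
def r_inverse(A, R):
    """R^{-1}(A): points that R-see at least one point of A."""
    return {w for (w, v) in R if v in A}

def split_frame(W, R):
    """Iterate D_0 = W, D_{i+1} = R^{-1}(D_i) until it stabilises.

    Returns (D, T) where D is the fixpoint reached and T = W - D.
    The loop terminates on a finite frame because the D_i decrease.
    """
    D = set(W)
    while True:
        nxt = r_inverse(D, R)
        if nxt == D:
            return D, set(W) - D
        D = nxt

# Hypothetical finite quasitree: root r, a reflexive point a above r,
# and an irreflexive chain b < c above r (R is transitive).
W = {"r", "a", "b", "c"}
R = {("r", "a"), ("r", "b"), ("r", "c"), ("a", "a"), ("b", "c")}
D, T = split_frame(W, R)
```

On this example the iteration yields D = {r, a}, whose only maximal point a is reflexive, and T = {b, c}, which contains no reflexive point.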
Proof. This follows easily from the well-known fact that the category Stone of Stone
spaces and continuous maps is closed under pushouts. In fact, Z is the pushout of the
diagram $X \xleftarrow{\,i\,} Z \xrightarrow{\,j\,} Y$ in the category Stone. More precisely, Z is the factor space of the
topological sum X ⊕ Y by the equivalence relation {(i(z), j(z)) : z ∈ Z}.
We denote the pushout of the diagram $X \xleftarrow{\,i\,} Z \xrightarrow{\,j\,} Y$ by $X \oplus_Z Y$ and point out
that since we are working with compact Hausdorff spaces, continuous injections are in
fact topological (homeomorphic) embeddings. We consider an example which will be the
starting point in the construction of the space X to follow.
Suppose we are given an ordinal $\omega^{k_1+1}$ and its compactification $Y_1$ such that $Y_1^*$ is
homeomorphic to a closed subspace $C_1$ of C. Then using Lemma 5.2 we can identify
the copies of $C_1$ present in both C and $Y_1$ to obtain the space $X_2 = C \oplus_{C_1} Y_1$ such
that:
(a) $X_2$ is a Stone space based on the disjoint union of $\omega^{k_1+1}$ and C,
(b) $\omega^{k_1+1}$ is homeomorphic to an open subspace of $X_2$,
(c) C is homeomorphic to a closed subspace of $X_2$, and
(d) $Y_1^* \subseteq \mathrm{cl}(\{\omega^{k_1} \cdot k : k < \omega\})$.
This situation is depicted in Figure 3 below.
For our final Step (5), we need to construct an onto d-morphism h : X → F. For this we
observe that X is the disjoint union of C and $\omega^{k_1+1}, \dots, \omega^{k_n+1}$. Now let x ∈ X. We set
$$h(x) = \begin{cases} f(x), & x \in C \\ g_i(x), & x \in \omega^{k_i+1}. \end{cases}$$
That h is a well defined onto map is obvious. It is left to be shown that h is a d-morphism.
We first show that the restriction of h to each $Y_i$, which we denote by $h_i$, is a d-morphism.
Let $T_i^+$ denote the range of $h_i$, which is a subframe of F based on the set $T_i \cup D_i$. Let also
$f_i$ denote the restriction of f to $C_i$.
LEMMA 5.3. The map $h_i : Y_i \to T_i^+$ is a d-morphism.
Proof. To see that $h_i$ is continuous, let U be an upset of $T_i^+$. If $U \subseteq T_i$, then $h_i^{-1}(U) =
g_i^{-1}(U)$, which is open in $Y_i$ since $g_i$ is continuous and $\omega^{k_i+1}$ is an open subset of $Y_i$. If
$U \cap D_i \neq \emptyset$, then $U = (U \cap D_i) \cup T_i$ and $h_i^{-1}(U) = f_i^{-1}(U) \cup \omega^{k_i+1}$, which is open in
$Y_i$ because $f_i^{-1}(U)$ is open in $Y_i^*$ and $\omega^{k_i+1}$ is open and dense in $Y_i$.
To see that $h_i$ is open, let U be an open subset of $Y_i$. If $U \subseteq \omega^{k_i+1}$, then $h_i(U) = g_i(U)$,
which is an upset of $T_i$ since $g_i$ is open. Therefore, $h_i(U)$ is also an upset of $T_i^+$. Suppose
now that $U \cap Y_i^* \neq \emptyset$. Then $h_i(U) = g_i(U \cap \omega^{k_i+1}) \cup f_i(U \cap Y_i^*)$. By Lemma 4.4,
$\omega^{k_i} \cdot k \in U \cap \omega^{k_i+1}$ for some k < ω; and by Lemma 4.2, $g_i(\omega^{k_i} \cdot k) = r_i$. Since $g_i$ is open,
$g_i(U \cap \omega^{k_i+1}) = T_i$. Thus, as $f_i$ is open, $T_i \cup f_i(U \cap Y_i^*)$ is an upset of $T_i^+$.
That $h_i$ is r-dense is obvious because there are no reflexive points in $T_i$ and $f_i$ is r-dense.
Similarly, as both $f_i$ and $g_i$ are i-discrete, it is easy to see that $h_i$ is i-discrete. Consequently,
$h_i : Y_i \to T_i^+$ is a d-morphism.
Now we show that h is a d-morphism. Since F is finite, by Bezhanishvili et al. (2005,
corollary 2.8), it is sufficient to show that $d(h^{-1}(w)) = h^{-1}(R^{-1}(w))$ for each w ∈ W,
where d denotes the derived set operator of X.
LEMMA 5.4. For each w ∈ W we have $d(h^{-1}(w)) = h^{-1}(R^{-1}(w))$.
Proof. First we recall that if Y is a closed subspace of X and A ⊆ Y, then $d_X(A) =
d_Y(A)$. Now let w ∈ W. If w ∈ D, then $R^{-1}(w) \subseteq D$. Therefore, $h^{-1}(w) = f^{-1}(w)$ and
$h^{-1}(R^{-1}(w)) = f^{-1}(R^{-1}(w))$. Since f is a d-morphism, we have:
$$d_X(h^{-1}(w)) = d_C(f^{-1}(w)) = f^{-1}(R^{-1}(w)) = h^{-1}(R^{-1}(w)).$$
Next suppose that $w \in T_i$ for some i ≤ n. Then $h^{-1}(w) = h_i^{-1}(w)$ and $h^{-1}(R^{-1}(w)) =
h_i^{-1}(R^{-1}_{T_i^+}(w))$. By Lemma 5.3, $h_i$ is a d-morphism. Now as $Y_i$ is a closed subspace of X,
we have:
$$d_X(h^{-1}(w)) = d_{Y_i}(h_i^{-1}(w)) = h_i^{-1}(R^{-1}_{T_i^+}(w)) = h^{-1}(R^{-1}(w)).$$
§6. Acknowledgment. The second and third authors were partially supported by the
Georgian National Science Foundation Grant GNSF/ST06/3-017.
BIBLIOGRAPHY
Bezhanishvili, G., & Morandi, P. J. (2010). Scattered and hereditarily irresolvable spaces
in modal logic. Archive for Mathematical Logic.
Chagrov, A., & Zakharyaschev, M. (1997). Modal Logic, volume 35 of Oxford Logic
Guides. New York, NY: The Clarendon Press, Oxford University Press.
Engelking, R. (1977). General Topology. Warsaw, Poland: PWN—Polish Scientific
Publishers.
Esakia, L. (1981). Diagonal constructions, Löb’s formula and Cantor’s scattered spaces.
In Studies in logic and semantics, ed. Z. Mikeladze (In Russian). Tbilisi, Georgia:
Metsniereba, pp. 128–143.
Esakia, L. (2002). A modal version of Gödel’s second incompleteness theorem, and the
McKinsey system. In Logical Investigations, No. 9, ed. A. Karpenko (In Russian).
Moscow, Russia: Nauka, pp. 292–300.
Esakia, L. (2004). Intuitionistic logic and modality via topology. Annals of Pure and
Applied Logic, 127(1–3), 155–170.
Gabelaia, D. (2004). Topological semantics and two-dimensional combinations of modal
logics. PhD Thesis, King’s College, London.
McKinsey, J. C. C., & Tarski, A. (1944). The algebra of topology. Annals of Mathematics,
45, 141–191.
Shehtman, V. (1990). Derived sets in Euclidean spaces and modal logic. Preprint X-90-05,
University of Amsterdam.
Terasawa, J. (1997). Metrizable compactification of ω is unique. Topology and its
Applications, 76, 189–191.
Abstract. Sound and complete semantics for classical propositional logic can be obtained by
interpreting sentences as sets. Replacing sets with commuting dense binary relations produces an
interpretation that turns out to be sound but not complete for R. Adding transitivity yields sound and
complete semantics for RM, because all normal Sugihara matrices are representable as algebras of
binary relations.
§1. Introduction. One way to get sound and complete semantics for classical propo-
sitional logic is to evaluate each variable as one of two truth values, and extend this
valuation to more complicated sentences by the classical truth tables. Another way to
get sound and complete semantics for classical propositional logic is to evaluate each
variable as a subset of a fixed universe of discourse. For complex sentences, interpret
conjunction as intersection, disjunction as union, negation as complementation, and so
on. These two methods are essentially the same, but the second one provides an obvious
generalization: replace “set” with “binary relation.” This approach was taken by Tarski,
who produced an undecidable fragment of classical propositional logic by early 1942;
see Tarski & Givant (1987, sec. 5.4, sec. 5.5, fn. 3*). Tarski’s operations include Boolean
intersection ∩, union ∪, and complementation ¯, relative (or Peircean) multiplication | and
addition †, conversion ⁻¹, and an identity relation.
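The set-based semantics described in this paragraph can be sketched in a few lines of Python. The sketch is illustrative and makes two assumptions not spelled out in the text: sentences are encoded as nested tuples, and implication is interpreted as (U − a) ∪ b, one natural reading of the paper's "and so on". All names are hypothetical.

```python
from itertools import combinations, product

def subsets(U):
    """All subsets of the universe U, as frozensets."""
    U = list(U)
    return [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

def interpret(s, val, U):
    """Evaluate a sentence (variable name or nested tuple) as a subset of U."""
    if isinstance(s, str):
        return val[s]
    op, *args = s
    if op == "not":
        return U - interpret(args[0], val, U)
    a, b = (interpret(x, val, U) for x in args)
    if op == "and":
        return a & b
    if op == "or":
        return a | b
    if op == "implies":          # assumed reading: complement of a, union b
        return (U - a) | b
    raise ValueError(op)

def variables(s):
    if isinstance(s, str):
        return {s}
    return set().union(*(variables(x) for x in s[1:]))

def valid_in_sets(s, U):
    """True if s denotes the whole universe under every assignment of subsets."""
    U = frozenset(U)
    vs = sorted(variables(s))
    return all(interpret(s, dict(zip(vs, choice)), U) == U
               for choice in product(subsets(U), repeat=len(vs)))

peirce = ("implies", ("implies", ("implies", "p", "q"), "p"), "p")
```

With a three-element universe, `valid_in_sets(peirce, {0, 1, 2})` holds while `valid_in_sets(("implies", "p", "q"), {0, 1, 2})` does not, matching the two-valued truth tables.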
Relevance logic arose in the 1950s and 1960s from attempts to axiomatize the notion that
an implication A → B should be regarded as true only if the hypothesis A is “relevant”
to the conclusion B. The earliest systems were proposed by Orlov in 1928 (Došen, 1992),
and by Moh (1950), Church (1951), and Ackermann (1956) in the 1950s. Semantics
were introduced and developed only much later, in the 1970s; see Routley & Routley
(1972), Routley & Meyer (1972a, 1972b), Urquhart (1972), Routley & Meyer (1973),
Fine (1974), Meyer & Routley (1973), Anderson & Belnap (1975), Routley et al. (1982),
Anderson et al. (1992), and Brady (2003).
The calculus of relations was created by De Morgan (1856, 1864a, 1864b, 1966)
and Peirce (1870, 1880, 1883, 1885, 1897, 1960, 1984), and was extensively developed
by Schröder (1966). Relation algebras arose from Tarski’s axiomatization of the calculus
of relations; see Tarski (1941), Tarski & Givant (1987), and Maddux (1991). Tarski’s un-
decidable propositional calculus is equivalent to the equational theory of relation algebras.
The Routley–Meyer semantics for relevance logic and the theory of relation algebras
have a significant class of structures in common. A structure is in this class if it is simul-
taneously the atom structure of a relation algebra and a normal relevant model structure.
© Association for Symbolic Logic, 2010
doi:10.1017/S1755020309990293
42 ROGER D . MADDUX
Prominent examples of these are the ones constructed by Lyndon (1961) from projec-
tive planes. This connection is the key to deep undecidability results in both subjects;
see Andréka et al. (1997) and Urquhart (1984).
This confluence makes it possible to think of propositional variables, sentences, and
worlds in a relevant model structure as binary relations. The connectives of relevance logic
are then certain operations on binary relations determined by the Routley–Meyer seman-
tics. For example, negation ∼ turns out to be converse complementation while fusion ◦
is simply composition. The constants of relevance logic will not be considered here
because they are the source of some difficulties; see Routley et al. (1982, p. 348) and Bimbó
et al. (2009).
In Section 2 we present axioms and rules of deduction for relevance logic, and focus
attention on two prominent systems, R and RM. Sections 3 and 4 introduce relational
relevance algebras and give two examples, due to Belnap and Meyer. Soundness for the
interpretation of sentences as binary relations is shown in Section 5. In Section 6 we
prove that RM is a complete axiomatization of the logic of transitive commutative dense
relational relevance algebras, while in Sections 7 and 8 we show that R is an incomplete
axiomatization of the logic of commutative dense relational relevance algebras. Some
closing remarks are made in Section 9.
For discussions and communications about these topics, thanks to K. Bimbó, J. M. Dunn,
N. Galatos, R. Hirsch, I. Hodkinson, P. Jipsen, T. Kowalski, R. L. Kramer, D. McCarty,
R. K. Meyer, S. Mikulás, L. Moss, A. Urquhart, and the referee.
§2. Systems of relevance logic. Let Pv be a countable set whose elements are called
propositional variables. There are five connectives, ∨, ∧, ◦, →, and ∼. For any C ⊆
{∨, ∧, ◦, →, ∼}, the set $\mathrm{Sent}_C$ of C-sentences is the closure of the variables under
application of the connectives in C. Let $\mathrm{Sent} := \mathrm{Sent}_{\{\vee,\wedge,\circ,\to,\sim\}}$. The connectives are
operations on Sent which act in the way required of a language, that is, ⟨Sent, ∨, ∧, ◦, →, ∼⟩ is
an algebra of type ⟨2, 2, 2, 2, 1⟩ (four binary operations and one unary operation) which is
absolutely freely generated by Pv. This means that ⟨Sent, ∨, ∧, ◦, →, ∼⟩ is generated by
Pv and any function from Pv to an algebra R of type ⟨2, 2, 2, 2, 1⟩ has a unique extension
to a homomorphism from ⟨Sent, ∨, ∧, ◦, →, ∼⟩ into R.
A sentence S ∈ Sent is an axiom of R if there are sentences A, B, C ∈ Sent such
that S is one of the sentences (A1)–(A31) listed below, and S is an axiom of RM if S is one
of (A1)–(A33). The axioms of Routley & Meyer (1973, pp. 204, 224) are A1–A15; where
a sentence below coincides with one of theirs, their label is shown in the middle column.
Sentence                            R&M    Label
A → A                               A1     (A1)
A ∧ B → A                           A5     (A2)
A ∧ B → B                           A6     (A3)
A → A ∨ B                           A8     (A5)
B → A ∨ B                           A9     (A6)
((A → A) → B) → B                          (A13)
A → ((∼B → ∼A) → B)                        (A14)
A → (∼B → ∼(A → B))                        (A15)
A → ((A → B) → B)                   A2     (A23)
(A → (A → B)) → (A → B)             A4     (A25)
(A → ∼A) → ∼A                              (A26)
(A → B) → (∼A ∨ B)                         (A28)
(A ∧ (A → B)) → B                          (A29)
(A → B) → (A → (A → B))                    (A33)
Among the following rules of deduction, only modus ponens and Adjunction are used in
R and RM. The rules used in the Basic Logic of Routley et al. (1982, p. 287) are modus
ponens, Adjunction, Suffixing, Prefixing, and Contraposition.
A, A → B ⊢ B                          modus ponens
A, B ⊢ A ∧ B                          Adjunction
A → ∼B ⊢ B → ∼A                       Contraposition
A → B ⊢ (B → C) → (A → C)             Suffixing
A → (B → C) ⊢ B → (∼C → ∼A)           Cycling
A ⊢ (A → B) → B                       E-rule (Brady, 2003, p. 8)
For any A ∈ Sent, we write $\vdash_{R} A$ (or $\vdash_{RM} A$) if A belongs to every subset of Sent that
contains the axioms of R (or RM) and is closed under modus ponens and Adjunction. This
axiomatization of R is highly redundant but provides more input for semantic analysis
in Theorem 5.1. Routley & Meyer (1973) use only A1–A15. Furthermore, R is
well-axiomatized in the following sense.
THEOREM 2.1 (Routley & Meyer, 1973, theorem 7). Let C be one of the following sets
of connectives:
{→}, {→, ∼}, {→, ◦}, {→, ∼, ◦}, {→, ∧}, {→, ∨, ∧},
{→, ◦, ∧}, {→, ◦, ∧, ∨}, {→, ∼, ∧, ∨}, {→, ∼, ◦, ∧, ∨}.
If $A \in \mathrm{Sent}_C$, then $\vdash_{R} A$ iff A is derivable, using only modus ponens and Adjunction, from
those axioms among A1–A15 that explicitly contain connectives in C.
§3. Relational relevance algebras. Binary relations are, by definition, sets of ordered
pairs. For arbitrary binary relations A and B, their union, intersection, difference, converse,
and relative product are defined as follows.
A ∪ B := {⟨x, y⟩ : ⟨x, y⟩ ∈ A or ⟨x, y⟩ ∈ B} (1)
A ∩ B := {⟨x, y⟩ : ⟨x, y⟩ ∈ A and ⟨x, y⟩ ∈ B} (2)
A − B := {⟨x, y⟩ : ⟨x, y⟩ ∈ A and ⟨x, y⟩ ∉ B} (3)
A⁻¹ := {⟨x, y⟩ : ⟨y, x⟩ ∈ A} (4)
A|B := {⟨x, y⟩ : ∃z (⟨x, z⟩ ∈ A and ⟨z, y⟩ ∈ B)}. (5)
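Definitions (1)–(5) translate directly into operations on Python sets of pairs. The following is a minimal sketch; the function names and the sample relations are illustrative assumptions.

```python
def union(A, B):
    return A | B                       # (1)

def intersection(A, B):
    return A & B                       # (2)

def difference(A, B):
    return A - B                       # (3)

def converse(A):
    return {(y, x) for (x, y) in A}    # (4)

def relative_product(A, B):
    # (5): pairs (x, y) linked through some intermediate z
    return {(x, y) for (x, z) in A for (w, y) in B if z == w}

# Hypothetical sample relations.
A = {(1, 2), (2, 3)}
B = {(2, 4), (3, 5)}
AB = relative_product(A, B)
```

Here `AB` is `{(1, 4), (2, 5)}`, and the familiar law that the converse of A|B equals B⁻¹|A⁻¹ can be confirmed on this example.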
Let U be a non-empty set. U² = {⟨x, y⟩ : x, y ∈ U} is the set of ordered pairs of elements
of U. Sb(U²) is the set of subsets of U², and is called the set of binary relations on U.
∼A = {⟨x, y⟩ : x, y ∈ U and ⟨y, x⟩ ∉ A}
A † B = {⟨x, y⟩ : x, y ∈ U and ∀z∈U (⟨x, z⟩ ∈ A or ⟨z, y⟩ ∈ B)}
Relational relevance algebras lack the constants of relevant algebras (Urquhart, 1996)
or De Morgan monoids (Anderson & Belnap, 1975), but they do satisfy many equations
not involving constants that have been used in the definitions of these and other algebras
designed for relevance logic. For example, if R = ⟨R, ∪, ∩, ◦, →, ∼⟩ is a relational
relevance algebra, then ⟨R, ∪, ∩⟩ is a distributive lattice, ⟨R, ◦⟩ is a semigroup, and many
other equations and inclusions hold for all A, B, C ∈ R, such as
A ◦ (B ∪ C) = (A ◦ B) ∪ (A ◦ C)
(B ∪ C) ◦ A = (B ◦ A) ∪ (C ◦ A)
A → (B ∩ C) = (A → B) ∩ (A → C)
(A ∪ B) → C = (A → C) ∩ (B → C)
∼(∼A) = A
∼(A ∪ B) = ∼ A ∩ ∼B
∼(A ∩ B) = ∼ A ∪ ∼B
A ◦ B = ∼(A → ∼B)
A → B = ∼(A ◦ ∼B)
A → (B → C) = A ◦ B → C
(A → B) ◦ A ⊆ B
A → B ⊆ (C → A) → (C → B).
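The equations and inclusions listed above can be spot-checked on randomly generated relations. The sketch below is not the paper's definition of the operations, which does not appear in this excerpt: it assumes that A ◦ B is B|A (as in (18) of Theorem 5.1) and computes → from the identity A → B = ∼(A ◦ ∼B) in the list. Function names are hypothetical.

```python
import random

def rel_product(A, B):
    """A|B as in (5)."""
    return frozenset((x, y) for (x, z) in A for (w, y) in B if z == w)

def neg(A, U):
    """Converse complement: pairs (x, y) over U with (y, x) not in A."""
    return frozenset((x, y) for x in U for y in U if (y, x) not in A)

def fusion(A, B):
    """A o B, taken here to be B|A (assumption based on (18))."""
    return rel_product(B, A)

def arrow(A, B, U):
    """A -> B, computed from the listed identity A -> B = ~(A o ~B)."""
    return neg(fusion(A, neg(B, U)), U)

def random_relation(U, rng, density=0.4):
    return frozenset((x, y) for x in U for y in U if rng.random() < density)

def check_identities(trials=200, size=3, seed=0):
    """Assert a selection of the listed laws on random relations over a small set."""
    rng = random.Random(seed)
    U = list(range(size))
    for _ in range(trials):
        A, B, C = (random_relation(U, rng) for _ in range(3))
        assert fusion(A, B | C) == fusion(A, B) | fusion(A, C)
        assert fusion(B | C, A) == fusion(B, A) | fusion(C, A)
        assert arrow(A, B & C, U) == arrow(A, B, U) & arrow(A, C, U)
        assert arrow(A | B, C, U) == arrow(A, C, U) & arrow(B, C, U)
        assert neg(neg(A, U), U) == A
        assert neg(A | B, U) == neg(A, U) & neg(B, U)
        assert fusion(A, B) == neg(arrow(A, neg(B, U), U), U)
        assert arrow(A, arrow(B, C, U), U) == arrow(fusion(A, B), C, U)
        assert fusion(arrow(A, B, U), A) <= B
        assert arrow(A, B, U) <= arrow(arrow(C, A, U), arrow(C, B, U), U)
    return True
```

A random test of this kind cannot prove the laws, but a failing assertion would immediately expose a misread definition.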
§4. Relational relevance algebras of Belnap and Meyer. In this section we give two
useful examples of relational relevance algebras, one on an infinite set, and one on a finite
set. For these examples we first define two closely related finite algebras, Belnap’s M0 and
Meyer’s RM84. They can be defined together as follows.
(i) Both M0 and RM84 are algebras of the form ⟨S3, ∨, ∧, ◦, →, ∼⟩ where
S3 := {−3, −2, −1, −0, +0, +1, +2, +3},
the set of designated values is {+0, +1, +2, +3}, and a sentence A is valid in
the algebra if every homomorphism from the algebra of sentences carries A to a
designated value.
(ii) For both algebras the reduct ⟨S3, ∨, ∧⟩ is the lattice of a Boolean algebra whose
atoms are −1, +0, and −2, whose top element is +3 and bottom element is −3,
satisfying these equations: −1∨+0 = +1, −1∨−2 = −0, and +0∨−2 = +2. For
tables see Belnap (1960, p. 145), Anderson & Belnap (1975, p. 252), Routley et al.
RELEVANCE LOGIC AND THE CALCULUS OF RELATIONS 47
(1982, pp. 178, 253), and Brady (2003, p. 101); for Hasse diagrams see Anderson
& Belnap (1975, pp. 198, 252), Routley et al. (1982, p. 178), and Brady (2003,
p. 102).
(iii) In both algebras the operation ∼ takes −i to +i and +i to −i for every i ∈
{0, 1, 2, 3}.
(iv) The operation → in Belnap’s M0 is defined in Table 1, (Belnap, 1960, p. 145,
Anderson & Belnap, 1975, p. 253, and Brady, 2003, p. 101).
(v) The operation → in Meyer’s RM84 is defined in Table 2, (Anderson & Belnap,
1975, p. 334, and Routley et al., 1982, p. 253).
(vi) In both algebras the operation ◦ is defined by x ◦ y = ∼(x → ∼y); see Tables 3
and 4.
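Item (vi) can be checked against the tables for M0 printed in this section. The sketch below transcribes Table 1 (→) and Table 3 (◦), using ASCII signs in place of the typeset minus, and confirms that x ◦ y = ∼(x → ∼y) reproduces Table 3. The encoding of values as strings and all names are illustrative choices.

```python
VALUES = ["-3", "-2", "-1", "-0", "+0", "+1", "+2", "+3"]

ARROW_ROWS = {          # Table 1: row x, columns in the order of VALUES
    "-3": "+3 +3 +3 +3 +3 +3 +3 +3",
    "-2": "-3 +2 -3 +2 -3 -3 +2 +3",
    "-1": "-3 -3 +1 +1 -3 +1 -3 +3",
    "-0": "-3 -3 -3 +0 -3 -3 -3 +3",
    "+0": "-3 -2 -1 -0 +0 +1 +2 +3",
    "+1": "-3 -3 -1 -1 -3 +1 -3 +3",
    "+2": "-3 -2 -3 -2 -3 -3 +2 +3",
    "+3": "-3 -3 -3 -3 -3 -3 -3 +3",
}

FUSION_ROWS = {         # Table 3
    "-3": "-3 -3 -3 -3 -3 -3 -3 -3",
    "-2": "-3 -2 +3 +3 -2 +3 -2 +3",
    "-1": "-3 +3 -1 +3 -1 -1 +3 +3",
    "-0": "-3 +3 +3 +3 -0 +3 +3 +3",
    "+0": "-3 -2 -1 -0 +0 +1 +2 +3",
    "+1": "-3 +3 -1 +3 +1 +1 +3 +3",
    "+2": "-3 -2 +3 +3 +2 +3 +2 +3",
    "+3": "-3 +3 +3 +3 +3 +3 +3 +3",
}

def as_table(rows):
    return {(x, y): v for x in VALUES for y, v in zip(VALUES, rows[x].split())}

def neg(v):
    """The operation ~ of item (iii): flip the sign, keep the index."""
    return ("+" if v[0] == "-" else "-") + v[1:]

ARROW = as_table(ARROW_ROWS)
FUSION = as_table(FUSION_ROWS)

def fusion(x, y):
    """x o y = ~(x -> ~y), as in item (vi)."""
    return neg(ARROW[(x, neg(y))])

derived = {(x, y): fusion(x, y) for x in VALUES for y in VALUES}
```

The derived table agrees with Table 3 entry by entry, and Table 3 is symmetric, so ◦ is commutative in M0.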
In the next two theorems we show that M0 and RM84 are isomorphic to algebras in Rcd ,
and hence belong to IRcd. Theorem 4.1 was announced in the abstract (Maddux, 2007) and
noted again in Bimbó et al. (2009), while Theorem 4.2 is new.
THEOREM 4.1. Belnap’s M0 is isomorphic to a commutative dense relational relevance
algebra on a countable set, so
M0 ∈ IRcd. (14)
Proof. Let Q be the set of rational numbers. Define a map ρ from the universe S3 of M0
into the set of binary relations on Q, as follows.
Table 1. Operation → in M0
→ −3 −2 −1 −0 +0 +1 +2 +3
−3 +3 +3 +3 +3 +3 +3 +3 +3
−2 −3 +2 −3 +2 −3 −3 +2 +3
−1 −3 −3 +1 +1 −3 +1 −3 +3
−0 −3 −3 −3 +0 −3 −3 −3 +3
+0 −3 −2 −1 −0 +0 +1 +2 +3
+1 −3 −3 −1 −1 −3 +1 −3 +3
+2 −3 −2 −3 −2 −3 −3 +2 +3
+3 −3 −3 −3 −3 −3 −3 −3 +3
Table 3. Operation ◦ in M0
◦ −3 −2 −1 −0 +0 +1 +2 +3
−3 −3 −3 −3 −3 −3 −3 −3 −3
−2 −3 −2 +3 +3 −2 +3 −2 +3
−1 −3 +3 −1 +3 −1 −1 +3 +3
−0 −3 +3 +3 +3 −0 +3 +3 +3
+0 −3 −2 −1 −0 +0 +1 +2 +3
+1 −3 +3 −1 +3 +1 +1 +3 +3
+2 −3 −2 +3 +3 +2 +3 +2 +3
+3 −3 +3 +3 +3 +3 +3 +3 +3
ρ(−3) := ∅,
ρ(+0) := Id := {⟨x, x⟩ : x ∈ Q},
ρ(+3) := Q².
M0 ≅ ⟨ρ(S3), ∪, ∩, ◦, →, ∼⟩ ∈ Rcd.
ρ(+3) := U².
Then ρ(S3) is closed under ∪, ∩, ◦, →, and ∼, ⟨ρ(S3), ∪, ∩, ◦, →, ∼⟩ is a commutative
dense relational relevance algebra, ρ is an isomorphism, and
RM84 ≅ ⟨ρ(S3), ∪, ∩, ◦, →, ∼⟩ ∈ Rcd.
multiplication and composition are definable in this fragment since A|B = ∼(B → ∼ A)
and A ◦ B = ∼(A → ∼B), so one may understand relevance logic as the restriction of the
calculus of relations to the operations ∪, ∩, |, ◦,→, and ∼.
Schröder (1966, sec. 11, pp. 153ff) showed that if a term is understood as the assertion
that the relation it denotes contains the universal relation, then every Boolean combination
of equations between terms denoting relations is equivalent to a single term. This conven-
tion allows the formulation of the calculus of relations as a sentential calculus; for details
see Tarski & Givant (1987, chap. 5). The corresponding convention for relevance logic is
that an individual term asserts that the relation it denotes contains the identity relation.
In the next theorem, parts (16)–(22) are handy computational rules, parts (23)–(31) show
that validity is preserved in all relational relevance algebras by the rules of deduction,
parts (32)–(47) show that several sentences are valid in R, and the remaining parts give
sentences valid in Rc , Rd , Rcd , and Rt .
THEOREM 5.1. Suppose U is a set and A, B, C ⊆ U². Then
Id → A = A, (16)
A → (B → C) = B|A → C = A ◦ B → C, (18)
A|(A → B) ⊆ B, ( A → B) ◦ A ⊆ B, (19)
(A → B)|∼B ⊆ ∼ A, ∼B ◦ (A → B) ⊆ ∼ A, (20)
A ⊆ B implies B → C ⊆ A → C, (21)
A ⊆ B implies C → A ⊆ C → B. (22)
if Id ⊆ A → ∼B then Id ⊆ B → ∼ A, (25)
Id ⊆ A → A, (32)
Id ⊆ A ∩ B → A, (33)
Id ⊆ A ∩ B → B, (34)
Id ⊆ A ∩ (B ∪ C) → (A ∩ B) ∪ (A ∩ C), (39)
Id ⊆ ∼∼ A → A, (40)
Id ⊆ ((A → A) → B) → B, (44)
Id ⊆ A → (B → (A ◦ B)), (50)
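The convention that a term is valid when the relation it denotes contains Id can be tested exhaustively on a very small universe. The sketch below enumerates all 16 relations on a two-element set and checks several of the listed parts. It assumes, as above, that A ◦ B is B|A and that A → B = ∼(A ◦ ∼B); the helper names are hypothetical, and a check on one small universe illustrates the statements rather than proving them.

```python
from itertools import combinations, product

U = (0, 1)
PAIRS = [(x, y) for x in U for y in U]
RELATIONS = [frozenset(c) for r in range(len(PAIRS) + 1)
             for c in combinations(PAIRS, r)]
ID = frozenset((x, x) for x in U)

def neg(A):
    """Converse complement relative to U x U."""
    return frozenset((x, y) for (x, y) in PAIRS if (y, x) not in A)

def fusion(A, B):
    """A o B, taken to be B|A (assumption)."""
    return frozenset((x, y) for (x, z) in B for (w, y) in A if z == w)

def arrow(A, B):
    """A -> B via the identity A -> B = ~(A o ~B)."""
    return neg(fusion(A, neg(B)))

def valid(term, arity):
    """True if Id is contained in term(...) for every choice of relations on U."""
    return all(ID <= term(*args) for args in product(RELATIONS, repeat=arity))

checks = {
    "(32)": valid(lambda A: arrow(A, A), 1),
    "(33)": valid(lambda A, B: arrow(A & B, A), 2),
    "(39)": valid(lambda A, B, C: arrow(A & (B | C), (A & B) | (A & C)), 3),
    "(40)": valid(lambda A: arrow(neg(neg(A)), A), 1),
    "(44)": valid(lambda A, B: arrow(arrow(arrow(A, A), B), B), 2),
    "(50)": valid(lambda A, B: arrow(A, arrow(B, fusion(A, B))), 2),
}
# A classically valid sentence that fails under this interpretation.
weakening = valid(lambda A, B: arrow(A, arrow(B, A)), 2)
```

All entries of `checks` come out true, while `weakening` is false: taking A = Id and B = U² already refutes A → (B → A).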
$$i \to j := \begin{cases} -i \vee j & \text{if } i \leq j \\ -i \wedge j & \text{if } i > j. \end{cases} \tag{65}$$
The binary operation ◦, obtained by the definition i ◦ j := ∼(i → ∼ j), can be character-
ized as follows (Anderson & Belnap, 1975, p. 400). If 1 ≤ i, j ≤ n then
−i ◦ − j = − max(i, j), (66)
$$-i \circ j = \begin{cases} -i & \text{if } j \leq i \\ j & \text{if } i < j \end{cases}, \tag{67}$$
Table 5. ◦ in S8 and S9 .
◦ −4 −3 −2 −1 1 2 3 4
−4 −4 −4 −4 −4 −4 −4 −4 −4
−3 −4 −3 −3 −3 −3 −3 −3 4
−2 −4 −3 −2 −2 −2 −2 3 4
−1 −4 −3 −2 −1 −1 2 3 4
1 −4 −3 −2 −1 1 2 3 4
2 −4 −3 −2 2 2 2 3 4
3 −4 −3 3 3 3 3 3 4
4 −4 4 4 4 4 4 4 4
◦ −4 −3 −2 −1 0 1 2 3 4
−4 −4 −4 −4 −4 −4 −4 −4 −4 −4
−3 −4 −3 −3 −3 −3 −3 −3 −3 4
−2 −4 −3 −2 −2 −2 −2 −2 3 4
−1 −4 −3 −2 −1 −1 −1 2 3 4
0 −4 −3 −2 −1 0 1 2 3 4
1 −4 −3 −2 −1 1 1 2 3 4
2 −4 −3 −2 2 2 2 2 3 4
3 −4 −3 3 3 3 3 3 3 4
4 −4 4 4 4 4 4 4 4 4
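Definition (65) and the derived operation i ◦ j := ∼(i → ∼j) can be recomputed and compared with Table 5. The sketch assumes that the elements are integers under their usual order, so that ∨ is max, ∧ is min and ∼i is −i; the function names are hypothetical and the tables are transcribed from the text with ASCII minus signs.

```python
def sugihara_values(m):
    """Elements of S_m: the integers -k..k, with 0 present only when m is odd."""
    k = m // 2
    return [v for v in range(-k, k + 1) if v != 0 or m % 2 == 1]

def arrow(i, j):
    """(65): -i v j if i <= j, else -i ^ j (join = max, meet = min)."""
    return max(-i, j) if i <= j else min(-i, j)

def fusion(i, j):
    """i o j := ~(i -> ~j)."""
    return -arrow(i, -j)

def fusion_table(m):
    vals = sugihara_values(m)
    return [[fusion(i, j) for j in vals] for i in vals]

TABLE_S8 = [
    [-4, -4, -4, -4, -4, -4, -4, -4],
    [-4, -3, -3, -3, -3, -3, -3,  4],
    [-4, -3, -2, -2, -2, -2,  3,  4],
    [-4, -3, -2, -1, -1,  2,  3,  4],
    [-4, -3, -2, -1,  1,  2,  3,  4],
    [-4, -3, -2,  2,  2,  2,  3,  4],
    [-4, -3,  3,  3,  3,  3,  3,  4],
    [-4,  4,  4,  4,  4,  4,  4,  4],
]

TABLE_S9 = [
    [-4, -4, -4, -4, -4, -4, -4, -4, -4],
    [-4, -3, -3, -3, -3, -3, -3, -3,  4],
    [-4, -3, -2, -2, -2, -2, -2,  3,  4],
    [-4, -3, -2, -1, -1, -1,  2,  3,  4],
    [-4, -3, -2, -1,  0,  1,  2,  3,  4],
    [-4, -3, -2, -1,  1,  1,  2,  3,  4],
    [-4, -3, -2,  2,  2,  2,  2,  3,  4],
    [-4, -3,  3,  3,  3,  3,  3,  3,  4],
    [-4,  4,  4,  4,  4,  4,  4,  4,  4],
]
```

Under these assumptions ◦ reduces to min(i, j) when i + j ≤ 0 and to max(i, j) otherwise, which is consistent with (66) and (67).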
then for each $i = 1, \dots, n$ it is not the case that $q_i < q'_i$ (hence $\langle q, q'\rangle \notin L_i$), nor is it
the case that $q_i > q'_i$ (hence $\langle q, q'\rangle \notin L_i^{-1}$). Thus $\langle q, q'\rangle$ is not in any of the relations in
$\{L_1, L_1^{-1}, \dots, L_n, L_n^{-1}\}$.
Assume $q \neq q'$. Let $i = 1$ if $q_1 \neq q'_1$, and otherwise let $i$ be the smallest element
of $\{2, \dots, n\}$ such that $\langle q_1, \dots, q_{i-1}\rangle = \langle q'_1, \dots, q'_{i-1}\rangle$ and $q_i \neq q'_i$. Since Q is linearly
ordered, either $q_i < q'_i$ or $q_i > q'_i$, hence $\langle q, q'\rangle \in L_i$ iff $q_i < q'_i$ and $\langle q, q'\rangle \in L_i^{-1}$ iff
$q_i > q'_i$. It follows from $\langle q_1, \dots, q_{i-1}\rangle = \langle q'_1, \dots, q'_{i-1}\rangle$ that $\langle q, q'\rangle$ is not in any of the
relations $L_1, L_1^{-1}, \dots, L_{i-1}, L_{i-1}^{-1}$. The assumption that $q_i \neq q'_i$ prevents the pair from
belonging to any of the remaining relations $L_{i+1}, L_{i+1}^{-1}, \dots, L_n, L_n^{-1}$.
Let
$$\mathcal{A}_n := \left\{ \bigcup S : S \subseteq \mathcal{L}_n \right\}. \tag{73}$$
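The argument just completed shows that every pair of n-tuples lies in exactly one of the relations Id, $L_i$, $L_i^{-1}$. The sketch below checks this on a small finite grid of integer tuples instead of all of $Q^n$. The definition of $L_i$ is reconstructed from the proof (agreement before coordinate i and a strict increase at coordinate i) and should be read as an assumption, as should all names.

```python
from itertools import product

def in_L(i, q, r):
    """(q, r) in L_i: q and r agree before coordinate i and q_i < r_i (i is 1-based)."""
    return q[:i - 1] == r[:i - 1] and q[i - 1] < r[i - 1]

def atoms_containing(q, r):
    """List the relations among Id, L_i, L_i^{-1} that contain the pair (q, r)."""
    found = ["Id"] if q == r else []
    for i in range(1, len(q) + 1):
        if in_L(i, q, r):
            found.append(("L", i))
        if in_L(i, r, q):
            found.append(("Linv", i))
    return found

def is_partition(n, grid=(0, 1, 2)):
    """Every pair of points of grid^n lies in exactly one of the relations."""
    pts = list(product(grid, repeat=n))
    return all(len(atoms_containing(q, r)) == 1 for q in pts for r in pts)
```

For instance, the pair ((0, 1, 2), (0, 2, 0)) lies in $L_2$ and in no other relation. The product laws such as (77) depend on Q being dense and unbounded, so they are not expected to hold on a finite grid.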
q , q ∈ L i , hence
q1 , . . . , qi−1 = q1 , . . . , qi−1
= q1 , . . . , qi−1 (74)
so
q1 , . . . , qi−1 = q1 , . . . , qi−1
and qi < qi = qi .
If we let
$$q' = \langle q_1, \dots, q_{i-1}, q''_i, \dots, q''_{j-1}, q''_j - 1, q''_{j+1}, \dots \rangle$$
then $\langle q, q'\rangle \in L_i$ and $\langle q', q''\rangle \in L_j$, hence $\langle q, q''\rangle \in L_i|L_j$, but if we let
$$q' = \langle q_1, \dots, q_{i-1}, q''_i, \dots, q''_{j-1}, q''_j + 1, q''_{j+1}, \dots \rangle$$
then $\langle q, q'\rangle \in L_i$, $\langle q', q''\rangle \in L_j^{-1}$, and $\langle q, q''\rangle \in L_i|L_j^{-1}$. Except for the case 1 = i, which
is notationally simpler, we have completed the proof that
$$L_i = L_i|L_j = L_i|L_j^{-1} \quad \text{whenever } 1 \leq i < j \leq n. \tag{77}$$
$$L_i^{-1} = L_j^{-1}|L_i^{-1} = L_j|L_i^{-1} \quad \text{whenever } 1 \leq i < j \leq n \tag{80}$$
$$L_i^{-1} = L_i^{-1}|L_j^{-1} = L_i^{-1}|L_j \quad \text{whenever } 1 \leq i < j \leq n. \tag{81}$$
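Next we consider the products $L_i|L_i^{-1}$ and $L_i^{-1}|L_i$. If $\langle q, q''\rangle \in L_i|L_i^{-1}$, then there is some
$q' \in Q^n$ such that $\langle q, q'\rangle \in L_i$ and $\langle q', q''\rangle \in L_i^{-1}$, hence
$$\langle q_1, \dots, q_{i-1}\rangle = \langle q'_1, \dots, q'_{i-1}\rangle = \langle q''_1, \dots, q''_{i-1}\rangle, \quad q_i < q'_i > q''_i.$$
There are three cases. First, if $q_i < q''_i$ then $\langle q, q''\rangle \in L_i$. Second, if $q_i > q''_i$ then $\langle q, q''\rangle \in
L_i^{-1}$. For the third case we suppose $q_i = q''_i$, which implies
$$\langle q_1, \dots, q_i\rangle = \langle q''_1, \dots, q''_i\rangle. \tag{82}$$
If $q = q''$ then $\langle q, q''\rangle \in \mathrm{Id}$. Suppose $q \neq q''$. From (82) we know that $q$ and $q''$ must
differ at some index greater than i. Let j be the smallest index such that $i < j \leq n$ and
$q_j \neq q''_j$. If $q_j < q''_j$ then $\langle q, q''\rangle \in L_j$. If $q_j > q''_j$ then $\langle q, q''\rangle \in L_j^{-1}$. This exhausts all
the possibilities, and shows that
$$L_i|L_i^{-1} \subseteq \mathrm{Id} \cup L_i \cup L_i^{-1} \cup \bigcup_{i<j\leq n} (L_j \cup L_j^{-1}) = \mathrm{Id} \cup \bigcup_{i\leq j\leq n} (L_j \cup L_j^{-1}).$$
which is equivalent to
$$\langle q_1, \dots, q_{i-1}\rangle = \langle q''_1, \dots, q''_{i-1}\rangle. \tag{83}$$
If we choose $q' \in Q^n$ so that
$$\langle q_1, \dots, q_{i-1}\rangle = \langle q'_1, \dots, q'_{i-1}\rangle = \langle q''_1, \dots, q''_{i-1}\rangle \tag{84}$$
and $q'_i > \max(q_i, q''_i)$, we get $\langle q, q'\rangle \in L_i$ and $\langle q', q''\rangle \in L_i^{-1}$, hence $\langle q, q''\rangle \in L_i|L_i^{-1}$.
This completes the proof that
$$L_i|L_i^{-1} = \mathrm{Id} \cup \bigcup_{i\leq j\leq n} (L_j \cup L_j^{-1}). \tag{85}$$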
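$$L_i^{-1}|L_j^{-1} = L_{\min(i,j)}^{-1}, \tag{88}$$
$$L_j^{-1}|L_i = L_i|L_j^{-1} = \begin{cases} L_i & \text{if } i < j \\ L_j^{-1} & \text{if } j < i \\ \mathrm{Id} \cup \bigcup_{i=j\leq k\leq n} (L_k \cup L_k^{-1}) & \text{if } i = j. \end{cases} \tag{89}$$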
The remaining relative products of relations in Ln , which all involve Id, are
Relative multiplication distributes over union, so it follows that $\mathcal{A}_n$ is closed under relative
multiplication as well as union, intersection, complementation with respect to $Q^n \times Q^n$,
and conversion. Note that $\mathcal{A}_n$ contains the identity relation on $Q^n$.
For every $J \subseteq \{1, 2, \dots, n\}$, let $L_J := \emptyset$ and $L_J^{-1} := \emptyset$ if $J = \emptyset$, and otherwise
let $L_J := \bigcup_{i\in J} L_i$ and $L_J^{-1} := \bigcup_{i\in J} L_i^{-1}$. For every $i \in \{1, 2, \dots, n\}$ let $[1, i] =
\{1, 2, \dots, i - 1, i\}$ and $[i, n] = \{i, i + 1, \dots, n - 1, n\}$. Using this notation we can rewrite
(85) and (86) as
$$L_{[1,i]}^{-1}|L_{[1,j]}^{-1} = L_{[1,\min(i,j)]}^{-1}, \tag{94}$$
$$L_{[i,n]}^{-1}|L_{[j,n]}^{-1} = L_{[\min(i,j),n]}^{-1}. \tag{95}$$
$$= L_{[1,n]} \cup \mathrm{Id} \cup L_{[j,n]}^{-1},$$
which implies
$$L_{[1,i]}|L_{[j,n]}^{-1} = L_{[1,n]} \cup \mathrm{Id} \cup L_{[j,n]}^{-1} \quad \text{whenever } 1 \leq j \leq i. \tag{97}$$
We will use the relations in $\mathcal{L}_n$ to create a copy of the Sugihara matrix $S_{2n+2}$. The
example which inspired this construction is Belnap’s M0, which has two copies of $S_4$ as
subalgebras, namely {−3, −2, +2, +3} and {−3, −1, +1, +3}.
First, define a function $T : S_{2n+2} \to \mathrm{Sb}\,((Q^n)^2)$ by
$$T_{-n-1} := \emptyset \tag{98}$$
$$T_1 := L_{[1,n]} \cup \mathrm{Id} \tag{100}$$
$$T_i := L_{[1,n]} \cup \mathrm{Id} \cup L_{[n+2-i,n]}^{-1} \quad \text{whenever } 2 \leq i \leq n + 1 \tag{101}$$
and let
$$\mathcal{T}_n := \{T_{-n-1}, T_{-n}, \dots, T_{-1}, T_1, \dots, T_n, T_{n+1}\}.$$
Note that $T_{n+1} = Q^n \times Q^n$. Also, the images of the designated values of $S_{2n+2}$ are
$T_1, \dots, T_n, T_{n+1}$, exactly the elements of $\mathcal{T}_n$ that contain the identity relation Id. It follows
immediately from the definitions that the relations in $\mathcal{T}_n$ form a chain,
$$T_{-n-1} \subseteq T_{-n} \subseteq \dots \subseteq T_{-1} \subseteq T_1 \subseteq \dots \subseteq T_n \subseteq T_{n+1}. \tag{102}$$
Therefore $\mathcal{T}_n$ is closed under union and intersection. A straightforward calculation shows
that $\mathcal{T}_n$ is also closed under converse complementation. In fact, for every $i \in S_{2n+2} =
\{-n - 1, \dots, -1, 1, \dots, n + 1\}$ we have
$$\sim(T_i) = T_{-i} = T_{\sim(i)}. \tag{103}$$
To show that $\mathcal{T}_n$ is closed under relative multiplication, we need to examine all the products
of relations in $\mathcal{T}_n$.
First note that all products involving $T_{-n-1} = \emptyset$ are pretty trivial, for if $X \in \mathcal{T}_n$ then
$$T_{-n-1}|X = \emptyset|X = \emptyset = T_{-n-1}. \tag{104}$$
If $1 \leq i, j \leq n$ then we have
$$T_{-i}|T_{-j} = T_{-\max(i,j)}, \tag{105}$$
since
$$\begin{aligned} T_{-i}|T_{-j} &= L_{[1,n+1-i]}|L_{[1,n+1-j]} \\ &= L_{[1,\min(n+1-i,\,n+1-j)]} && \text{by (92)} \\ &= L_{[1,n+1-\max(i,j)]} \\ &= T_{-\max(i,j)}. \end{aligned}$$
We use this to show
$$T_{-i}|T_1 = T_{-i} \tag{106}$$
as follows.
$$= T_{-i} \cup T_{-i}$$
$$= T_{-i}.$$
If $n \geq i \geq j \geq 1$, then $n + 1 - i < n + 2 - j$, so by (96),
$$T_{-i}|L_{[n+2-j,n]}^{-1} = L_{[1,n+1-i]}|L_{[n+2-j,n]}^{-1} = L_{[1,n+1-i]} = T_{-i}.$$
hence
$$T_{-i}|T_j = T_{-i}|(T_1 \cup L_{[n+2-j,n]}^{-1})$$
$$= T_{-i} \cup T_j = T_j.$$
We have proved that
$$T_{-i}|T_j = \begin{cases} T_{-i} & \text{if } n \geq i \geq j \geq 1 \\ T_j & \text{if } 1 \leq i < j \leq n \end{cases}. \tag{107}$$
$$= T_1.$$
Suppose $2 \leq j \leq n$. First observe that
$$\begin{aligned} T_1|L_{[n+2-j,n]}^{-1} &= (L_{[1,n]} \cup \mathrm{Id})|L_{[n+2-j,n]}^{-1} \\ &= L_{[1,n]}|L_{[n+2-j,n]}^{-1} \cup \mathrm{Id}|L_{[n+2-j,n]}^{-1} \\ &= L_{[1,n]} \cup \mathrm{Id} \cup L_{[n+2-j,n]}^{-1} \cup L_{[n+2-j,n]}^{-1} && \text{by (97)} \\ &= T_j, \end{aligned} \tag{109}$$
and then use this observation together with (108) to obtain
$$\begin{aligned} T_1|T_j &= T_1|(T_1 \cup L_{[n+2-j,n]}^{-1}) \\ &= T_1|T_1 \cup T_1|L_{[n+2-j,n]}^{-1} \\ &= T_1 \cup T_j = T_j && \text{by (108), (109).} \end{aligned} \tag{110}$$
Finally, if $2 \leq i, j \leq n + 1$ then we first note
$$\begin{aligned} T_i|L_{[n+2-j,n]}^{-1} &= (T_1 \cup L_{[n+2-i,n]}^{-1})|L_{[n+2-j,n]}^{-1} \\ &= T_1|L_{[n+2-j,n]}^{-1} \cup L_{[n+2-i,n]}^{-1}|L_{[n+2-j,n]}^{-1} \\ &= T_j \cup L_{[\min(n+2-i,\,n+2-j),n]}^{-1} && \text{by (109), (95)} \\ &= T_j \cup L_{[n+2-\max(i,j),n]}^{-1}, \end{aligned} \tag{111}$$
and then
$$\begin{aligned} T_i|T_j &= T_i|(T_1 \cup L_{[n+2-j,n]}^{-1}) \\ &= T_i|T_1 \cup T_i|L_{[n+2-j,n]}^{-1} \\ &= T_i \cup T_j \cup L_{[n+2-\max(i,j),n]}^{-1} && \text{by (110), (111)} \\ &= T_{\max(i,j)} \cup L_{[n+2-\max(i,j),n]}^{-1} \\ &= T_{\max(i,j)}. \end{aligned} \tag{112}$$
This completes the proof that $\mathcal{T}_n$ is closed under relative multiplication and composition.
Since $\mathcal{T}_n$ is closed under ∪, ∩, ◦, →, ∼, we may use it as the universe of an algebra with
these operations. Let
$$\mathfrak{T}_n := \langle \mathcal{T}_n, \cup, \cap, \circ, \to, \sim \rangle. \tag{113}$$
Observe that (105), (107), (108), (110), and (112) are enough to confirm that relative
multiplication in $\mathfrak{T}_n$ behaves the same as multiplication in the Sugihara matrix $S_{2n+2}$
according to (66)–(68). It is easy to see that the other operations are preserved by T, so
T is an isomorphism from the Sugihara matrix $S_{2n+2}$ to $\mathfrak{T}_n$. Combining these observations
with Theorem 6.1 completes the proof of part (i).
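The agreement between relative multiplication on the relations $T_i$ and multiplication in the Sugihara matrix can be checked symbolically for small n. In the sketch below a relation is represented as a set of atoms Id, $L_k$, $L_k^{-1}$, and atoms are multiplied by rules reconstructed from (77), (88) and (89). The rule $L_i|L_j = L_{\min(i,j)}$ and the clause $T_{-i} = L_{[1,n+1-i]}$ do not appear explicitly in this excerpt and are assumptions inferred from the computation following (105); all names are hypothetical.

```python
def atom_product(a, b, n):
    """Relative product of two atoms, returned as a set of atoms."""
    if a == "Id":
        return {b}
    if b == "Id":
        return {a}
    (kind_a, i), (kind_b, j) = a, b
    if kind_a == kind_b:                       # L_i|L_j or L_i^-1|L_j^-1, cf. (88)
        return {(kind_a, min(i, j))}
    plain, inv = (i, j) if kind_a == "L" else (j, i)
    if plain < inv:                            # mixed products, cf. (89)
        return {("L", plain)}
    if inv < plain:
        return {("Linv", inv)}
    return {"Id"} | {(k, m) for k in ("L", "Linv") for m in range(plain, n + 1)}

def rel_product(X, Y, n):
    """Relative product of unions of atoms (multiplication distributes over union)."""
    out = set()
    for a in X:
        for b in Y:
            out |= atom_product(a, b, n)
    return frozenset(out)

def T(i, n):
    """T_i as a set of atoms, following (98)-(101); the negative case is assumed."""
    if i < 0:
        return frozenset(("L", k) for k in range(1, n + 2 + i))
    rel = {"Id"} | {("L", k) for k in range(1, n + 1)}
    rel |= {("Linv", k) for k in range(n + 2 - i, n + 1)}
    return frozenset(rel)

def sugihara_fusion(i, j):
    """Multiplication in the Sugihara matrix, in the closed form matching (66)-(67)."""
    return min(i, j) if i + j <= 0 else max(i, j)

def products_agree(n):
    S = [v for v in range(-(n + 1), n + 2) if v != 0]
    return all(rel_product(T(i, n), T(j, n), n) == T(sugihara_fusion(i, j), n)
               for i in S for j in S)
```

Calling `products_agree(n)` for n = 1, 2, 3 confirms that $T_i|T_j = T_{i \circ j}$ for all pairs in those cases, which is the content of (105), (107), (108), (110) and (112) restricted to small n.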
For part (ii), consider a sentence A and choose n so that A has fewer than 2n + 2
propositional variables. By Theorem 6.1 we have ⊢RM A iff A is valid in S2n+2. The
isomorphism from S2n+2 to Tn carries designated values of S2n+2 onto the relations in
Tn that contain Id, so A is valid in S2n+2 iff Tn ⊨ A. Part (iii) follows from parts (i)
and (ii).
§7. Relevant model structures. Relevant model structures, introduced in Routley &
Meyer (1972a, 1972b, 1973), provide sound and complete semantics for R. A relevant
model structure K = ⟨K, R, ∗, 0⟩ consists of a nonempty set K, a ternary relation R ⊆ K³,
a unary operation ∗ : K → K, and a distinguished element 0 ∈ K, such that postulates
(p1)–(p6) hold for all a, b, c ∈ K. To state these postulates, we first adopt some definitions.
R²abcd iff ∃x (Rabx, Rxcd, x ∈ K) (d1)
THEOREM 7.1. Properties (p1)–(p6) are equivalent to (p1), (p2), (p3′), (p4), (p5′), (p6),
and (comm).
Proof. Assume postulates (p1)–(p6). We must show (comm), (p3′), and (p5′). For this
we only need (p3), (p4), and (p5).
Assume Rabc. We have R0aa by (p1), so R²0abc by (d2), hence R²0bac by (p3),
and finally Rbac by (p4). Thus (comm) holds. (p5′) follows from (p5) by (comm). For
(p3′), assume R²abcd. Then R²bacd by (d2) and (comm), so R²bcad by (p3), and finally
R²a(bc)d by (d1), (comm), and (d2).
For the converse, assume (p1), (p2), (p3′), (p4), (p5′), (p6), and (comm). We get (p5)
from (p5′) and (comm). For (p3), assume R²abcd. Then R²bacd by (d1) and (comm),
hence R²b(ac)d by (p3′), and finally R²acbd by (d2), (comm), and (d1).
Because of this theorem we think of a relevant model structure as one that satisfies
0-reflexivity, 0-cancellation, density, involution, associativity, commutativity, and both
rotations.
Suppose K = ⟨K, R, ∗, 0⟩ is a structure with distinguished element 0 ∈ K, ternary
relation R ⊆ K³, and unary operation ∗ : K → K. (K need not be a relevant model
structure.) For any a ∈ K and X ⊆ K, X is a-closed if y ∈ X whenever x ∈ X and
x ≤a y. Let (K) be the set of 0-closed subsets of K. A valuation on K is a function
ν : Sent → Sb(K) such that, for all A, B ∈ Sent,
ν(A ∧ B) = ν(A) ∩ ν(B),
ν(A ∨ B) = ν(A) ∪ ν(B),
∼X := {a : a∗ ∉ X}. (117)
In this definition we avoid distinguished elements, but they are sometimes included;
see Routley & Meyer (1973, p. 228) and Brady (2003, p. 81) for other choices of similarity
type for the algebra of K.
The algebra Pr(K) of a relevant model structure K is a subalgebra of a larger algebra
obtained by using the set of all subsets of K instead of (K). This is the complex algebra
of K, defined by
Cm(K) := ⟨Sb(K), ∪, ∩, ◦, →, ∼⟩. (119)
Note that if the 0-identity property (p1′) holds in K, then Cm(K) coincides with the algebra
of K.
Furthermore, the complex algebra Cm(K) has no ∼-fixed points. To see this, suppose
X = ∼X = {a : a∗ ∉ X} for some X ⊆ K. Then a ∈ X iff a∗ ∉ X, for all a ∈ K.
In particular, for a = 0 we would have 0 ∈ X iff 0∗ ∉ X, but 0 = 0∗ in every relevant
model structure satisfying (p1′), a contradiction.
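The fixed-point argument can be checked by brute force on a small carrier. The following Python sketch implements ∼X = {a : a∗ ∉ X} from (117) and enumerates all subsets of a four-element set. The two star maps are hypothetical test data: `identity_star` fixes 0 (and every other element), while `swap_star` fixes no element at all and is included only as a contrast; it is not claimed to come from a relevant model structure.

```python
from itertools import combinations


def neg(X, K, star):
    """Compute ~X = {a in K : star[a] not in X}, as in (117)."""
    return frozenset(a for a in K if star[a] not in X)


def neg_fixed_points(K, star):
    """Return every subset X of K with ~X == X."""
    elems = sorted(K)
    subsets = (
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )
    return [X for X in subsets if neg(X, K, star) == X]


K = {0, 1, 2, 3}
identity_star = {a: a for a in K}        # 0* = 0, as the argument requires
swap_star = {0: 1, 1: 0, 2: 3, 3: 2}     # hypothetical star with 0* != 0

no_fixed = neg_fixed_points(K, identity_star)
some_fixed = neg_fixed_points(K, swap_star)
```

With 0∗ = 0 the search finds no fixed point, matching the argument above; once no element is its own star, fixed points appear, which shows that the hypothesis 0 = 0∗ is doing the work.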
Every valuation ν on a relevant model structure K is a homomorphism from the algebra
of sentences ⟨Sent, ∨, ∧, ◦, →, ∼⟩ to the algebra of K, and conversely. Therefore A is valid
in K iff 0 ∈ ν(A) for every homomorphism ν from ⟨Sent, ∨, ∧, ◦, →, ∼⟩ to the algebra
of K.
The following two constructions are from Meyer & Routley (1973, part I) and Routley &
Meyer (1973). For both of them we assume K = ⟨K, R, ∗, 0⟩ where R ⊆ K³, ∗ : K → K,
and 0 ∈ K. Let 0′ ∉ K and let K′ := K ∪ {0′}. Define a unary operation ∗′ : K′ → K′ as
follows: a∗′ = a∗ if a ∈ K and 0′∗′ = 0′. Let R′ be the ternary relation on K′ defined by
R′ := R ∪ {⟨0′, 0′, 0′⟩}
∪ {⟨0′, 0′, a⟩ : ⟨0, 0, a⟩ ∈ R}
∪ {⟨0′, a, 0′⟩ : ⟨0, a, 0∗⟩ ∈ R}
∪ {⟨a, 0′, 0′⟩ : ⟨a, 0, 0∗⟩ ∈ R}
∪ {⟨a, b, 0′⟩ : ⟨a, b, 0∗⟩ ∈ R}
∪ {⟨0′, a, b⟩ : ⟨0, a, b⟩ ∈ R}
∪ {⟨a, 0′, b⟩ : ⟨a, 0, b⟩ ∈ R},
and let K′ := ⟨K′, R′, ∗′, 0′⟩. Then K′ is the normalization of K.
LEMMA 7.3 (Routley & Meyer, 1973). If K is a relevant model structure then the
normalization of K is a normal relevant model structure. If a sentence A ∈ Sent is invalid
in K, then A is also invalid in the normalization of K.
For a similar construction from Meyer & Routley (1973, part I), choose some 1′ ∉ K
and let K′ := K ∪ {1′}. Define a unary operation ∗′ : K′ → K′ as follows: a∗′ = a∗ if
a ∈ K and 1′∗′ = 1′. Define a ternary relation
R′ := R ∪ {⟨a, 1′, a⟩ : a ∈ K} ∪ {⟨1′, a, a⟩ : a ∈ K}
∪ {⟨a, a∗, 1′⟩ : a ∈ K} ∪ {⟨1′, 1′, 1′⟩}
and let K′ := ⟨K′, R′, ∗′, 1′⟩. Meyer & Routley (1973, part I) did not give a name to K′. We
will call it K-with-identity, and denote it briefly by K[1′].
LEMMA 7.4 (Meyer & Routley, 1973, part I). If K is a relevant model structure then K[1′]
is a normal relevant model structure that satisfies (p1′). Furthermore, if K is normal then
exactly the same sentences are valid in both K and K[1′].
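The K-with-identity construction is easy to carry out on finite data. The Python sketch below adjoins a fresh element to a structure given as a set K, a set R of triples, and a dictionary `star`, adding the triples ⟨a, 1′, a⟩, ⟨1′, a, a⟩, ⟨a, a∗, 1′⟩, and ⟨1′, 1′, 1′⟩. The function name `with_identity`, the string used for the new element, and the small input structure are all hypothetical. The final check reads the 0-identity property as "the triple ⟨1′, a, b⟩ is in the relation iff a = b", which is an assumption about (p1′) rather than a quotation of it.

```python
def with_identity(K, R, star, one="1'"):
    """Adjoin a new element `one` to (K, R, star) by the K-with-identity recipe."""
    if one in K:
        raise ValueError("the new element must not already belong to K")
    K2 = set(K) | {one}
    star2 = dict(star)
    star2[one] = one                            # the new element is its own star
    R2 = set(R)
    R2 |= {(a, one, a) for a in K}              # <a, 1', a>
    R2 |= {(one, a, a) for a in K}              # <1', a, a>
    R2 |= {(a, star[a], one) for a in K}        # <a, a*, 1'>
    R2.add((one, one, one))                     # <1', 1', 1'>
    return K2, R2, star2, one


# Hypothetical input: four points, identity star, and only the triples <0, a, a>.
K = {0, 1, 2, 3}
star = {a: a for a in K}
R = {(0, a, a) for a in K}

K2, R2, star2, one = with_identity(K, R, star)
identity_like = all(((one, a, b) in R2) == (a == b) for a in K2 for b in K2)
```

Each new triple mentions the adjoined element in a different position, so the enlarged relation has exactly |R| + 3|K| + 1 triples.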
Next are the Routley–Meyer completeness results.
THEOREM 7.5 (Routley & Meyer, 1973; Meyer & Routley, 1973). The following
statements are equivalent for every sentence A ∈ Sent.
(i) ⊢R A.
(ii) A is valid in every relevant model structure.
(iii) A is valid in every normal relevant model structure.
(iv) A is valid in every relevant model structure that satisfies (p1′).
Proof. The equivalence of (i) and (ii) is theorem 3 of Routley & Meyer (1973).
Obviously (ii) implies (iii), and (iii) implies (iv) since every relevant model structure that
satisfies (p1′) is normal.
To show that (iv) implies (i) it is enough to prove that every nontheorem of R is invalid
in some (normal) relevant model structure that satisfies (p1′). Assume ⊬R A. Since (ii)
implies (i), there exists some relevant model structure K such that A is not valid in K. Let
K′ be the normalization of K and let K″ be K′[1′]. Thus K″ has two more elements than K.
Since A is invalid in K, it is also invalid in the normalization K′ of K by Lemma 7.3. But
the same sentences are valid in both K′ and K″ by Lemma 7.4, so A is also invalid in K″.
Since K″ is a relevant model structure that satisfies property (p1′), we are done.
Part (iv) of Theorem 7.5 inspired the following question, which was asked in Maddux
(2007).
(Q3) Is it true that ⊢R A iff A is valid in every relevant model structure that
satisfies (p1′), (p5 ), and (5 )?
In addressing this question, Kowalski (2007) defines a system B whose language contains
only ∧, ∨, and →. The axioms of B are (A1)–(A8) and the rules are modus ponens,
Adjunction, Prefixing, and Suffixing. He proves that ⊢B A iff A is valid in every structure
that satisfies (p6), (p1′), (p5 ), (5 ), plus the condition that Ra0b iff a = b.
§8. Incompleteness of R for Rcd . We answer question (Q1) here, for which we will
need
THEOREM 8.1. Let U be a nonempty set and assume A, B, C, D, E, F, G ⊆ U². Then
$\mathrm{Id} \subseteq A|B \cap C|D \cap E|F \to A|\bigl(A^{-1}|C \cap B|D^{-1} \cap (A^{-1}|E \cap B|F^{-1})|(E^{-1}|C \cap F|D^{-1})\bigr)|D$, (L)
Parts (L) and (M) are in the calculus of relations, but they are not part of relevance logic
because they involve conversion. Accompanying (L) and (M) are their consequences (L′)
and (M′). These use only the operations allowed in relevance logic but, as is shown below,
their corresponding sentences are not provable in R. Infinitely many more such examples
can be found in Mikulás (2009).
Now (L), (M), and the equations used by Mikulás (2009) all have the same special form.
There is a general procedure applicable to such equations which produces (L′) and (M′)
from (L) and (M), respectively. There are also procedures that work on all equations if a
particular constant is available in the language. However, we will not go further into these
matters.
Proof. We only prove (M) and (M′). The proofs of (L) and (L′) are similar. By (17), (M)
and (M′) are equivalent to inclusions whose left side is the relation A ∩ (B ∩ C|D)|(E ∩
F|G).
For (M), suppose ⟨v, w⟩ ∈ A ∩ (B ∩ C|D)|(E ∩ F|G). Then ⟨v, w⟩ ∈ A and there is
some x ∈ U such that ⟨v, x⟩ ∈ B, ⟨v, x⟩ ∈ C|D, ⟨x, w⟩ ∈ E, and ⟨x, w⟩ ∈ F|G. Hence
there are y, z ∈ U such that ⟨v, y⟩ ∈ C, ⟨y, x⟩ ∈ D, ⟨x, z⟩ ∈ F, and ⟨z, w⟩ ∈ G. It now
follows from only ⟨v, w⟩ ∈ A, ⟨v, x⟩ ∈ B, ⟨x, w⟩ ∈ E, ⟨v, y⟩ ∈ C, ⟨y, x⟩ ∈ D, ⟨x, z⟩ ∈ F,
and ⟨z, w⟩ ∈ G that ⟨v, w⟩ is in the relation in the conclusion of (M), that is,
$\langle v, w\rangle \in C|\bigl((C^{-1}|A \cap D|E)|G^{-1} \cap D|F \cap C^{-1}|(A|G^{-1} \cap B|F)\bigr)|G$.
Now apply (M) with C ∩ C⁻¹ and G ∩ G⁻¹ in place of C and G, respectively, and conclude
that ⟨v, w⟩ belongs to a relation contained in the third relation in the conclusion of (M′), as
follows.
$\langle v, w\rangle \in (C \cap C^{-1})|\bigl(((C \cap C^{-1})^{-1}|A \cap D|E)|(G \cap G^{-1})^{-1} \cap D|F \cap (C \cap C^{-1})^{-1}|(A|(G \cap G^{-1})^{-1} \cap B|F)\bigr)|G$
$= (C \cap C^{-1})|\bigl(((C \cap C^{-1})|A \cap D|E)|(G \cap G^{-1}) \cap D|F \cap (C \cap C^{-1})|(A|(G \cap G^{-1}) \cap B|F)\bigr)|G$
$\subseteq C|\bigl((C|A \cap D|E)|G \cap D|F \cap C|(A|G \cap B|F)\bigr)|G$.
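The inclusion (M), with left side A ∩ (B ∩ C|D)|(E ∩ F|G) and right side as displayed in the proof, can be spot-checked on random finite relations. The Python sketch below is an illustration, not part of the proof: it reads X|Y as relational composition, binding more tightly than ∩, exactly as the element chase in the proof does, and X⁻¹ as converse. The helper names `comp`, `conv`, `lhs_M`, `rhs_M`, and `random_relation` are hypothetical, as are the universe size, the density 0.4, and the seed.

```python
import random
from itertools import product


def comp(A, B):
    """Relational composition A|B = {(a, c) : (a, b) in A and (b, c) in B}."""
    return {(a, d) for (a, b) in A for (c, d) in B if b == c}


def conv(A):
    """Converse of a binary relation."""
    return {(b, a) for (a, b) in A}


def lhs_M(A, B, C, D, E, F, G):
    """Left side of (M): A ∩ (B ∩ C|D)|(E ∩ F|G)."""
    return A & comp(B & comp(C, D), E & comp(F, G))


def rhs_M(A, B, C, D, E, F, G):
    """Right side of (M), as displayed at the end of the element chase."""
    Ci, Gi = conv(C), conv(G)
    middle = (
        comp(comp(Ci, A) & comp(D, E), Gi)
        & comp(D, F)
        & comp(Ci, comp(A, Gi) & comp(B, F))
    )
    return comp(comp(C, middle), G)


def random_relation(U, density, rng):
    """Pick each pair over U independently with the given probability."""
    return {pair for pair in product(U, repeat=2) if rng.random() < density}


rng = random.Random(0)
U = range(4)
for _ in range(200):
    relations = [random_relation(U, 0.4, rng) for _ in range(7)]
    assert lhs_M(*relations) <= rhs_M(*relations)
```

A passing run is only evidence on small universes; the proof above is what establishes (M) in general.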
We use the abbreviation A|B := ∼(B → ∼A) to transcribe (L′) and (M′) into sentences
(L″), (M″) ∈ Sent.
A ∧ (B ∧ C|D)|(E ∧ F|G) → (M″)
A ∧ (B ∧ (C ∧ ∼C)|D)|(E ∧ F|G)
∨ A ∧ (B ∧ C|D)|(E ∧ F|(G ∧ ∼G))
∨ C|((C|A ∧ D|E)|G ∧ D|F ∧ C|(A|G ∧ B|F))|G
The validity of (L″) and (M″) in R was established by Theorem 8.1. However,
THEOREM 8.2. ⊬R (L″) and ⊬R (M″).
Proof. Let K28 = ⟨K, R28, ∗, 0⟩, where K = {0, 1, 2, 3}, x∗ = x for every x ∈ K, and
R28 is the following ternary relation on K with 28 triples.
R28 :=[0, 0, 0] ∪ [1, 1, 1] ∪ [2, 2, 2] ∪ [3, 3, 3] ∪
[0, 1, 1] ∪ [0, 2, 2] ∪ [0, 3, 3] ∪
[1, 2, 2] ∪ [3, 1, 1] ∪ [2, 3, 3] ∪ [1, 2, 3].
K28 is (isomorphic to) the atom structure of the relation algebra 4265 from Maddux (2006).
K28 is a normal relevant model structure that satisfies (p1′) and the reflection properties
(p5 ) and (p5 ). By (p1′), the algebra of K28 is the same as its complex algebra Cm(K28).
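A quick computational check of K28 is possible under one reading of the bracket notation. The Python sketch below assumes that [a, b, c] stands for the set of all rearrangements of the triple ⟨a, b, c⟩, which is what cycle notation reduces to when x∗ = x for every x; the names `cycle`, `CYCLES`, and `r2` are hypothetical. It counts the triples and tests commutativity and R0aa, together with (p3) and (p4) in the form in which they are used in the proof of Theorem 7.1, namely R²abcd ⇒ R²acbd and R²0abc ⇒ Rabc.

```python
from itertools import permutations, product


def cycle(a, b, c):
    """Assumed meaning of [a, b, c]: every rearrangement of the triple."""
    return set(permutations((a, b, c)))


CYCLES = [
    (0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3),
    (0, 1, 1), (0, 2, 2), (0, 3, 3),
    (1, 2, 2), (3, 1, 1), (2, 3, 3), (1, 2, 3),
]
K = {0, 1, 2, 3}
R28 = set().union(*(cycle(*t) for t in CYCLES))


def r2(R, K, a, b, c, d):
    """(d1): R²abcd iff some x in K has Rabx and Rxcd."""
    return any((a, b, x) in R and (x, c, d) in R for x in K)


commutative = all((b, a, c) in R28 for (a, b, c) in R28)
zero_reflexive = all((0, a, a) in R28 for a in K)
p3_holds = all(
    r2(R28, K, a, c, b, d)
    for a, b, c, d in product(K, repeat=4)
    if r2(R28, K, a, b, c, d)
)
p4_holds = all(
    (a, b, c) in R28
    for a, b, c in product(K, repeat=3)
    if r2(R28, K, 0, a, b, c)
)
```

That the count comes out at 28 is some evidence that the assumed reading of the brackets is the intended one.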
Neither (L″) nor (M″) is valid in K28. Both (L″) and (M″) will fail if we choose variables
§9. Conclusions. Algebras for relevance logic can be created in an abstract algebraic
way: add operations for the connectives and distinguished elements for the constants, and
impose on the operations and distinguished elements postulates that mimic the axioms.
Operations in individual algebras may be specified by tables (in the finite case) or rules,
and are designed to validate the axioms of the logic. Although algebraization may be
mathematically illuminating, it is open to the philosophical charge that “. . . algebraic char-
acterizations . . . are merely formal, exhibiting no connection with the intended meanings
of the logical constants” (Copeland, 1979, p. 405).
Somewhat less abstract are the algebras of relevant model structures. Here the elements
of the algebras are actually sets, so two of the operations, namely intersection and union,
need not be specified by rules or postulates. But the other operations arise abstractly from
the ternary relation R and the unary operation ∗ of the structure. Postulates imposed on
R and ∗ are designed to validate the axioms. Indeed, many books and papers have lists of
axioms (which are essentially second-order statements about relevant model structures)
and their corresponding postulates on R and ∗ (which are first-order statements about
relevant model structures). Once again, “If the only constraint on ∗ is that the resulting
theory should validate the right set of sentences, then we are indeed in the presence of
merely formal model theory” (Copeland, 1979, p. 410).
In contrast, the elements of relational relevance algebras are binary relations, none of
the operations are abstractly defined, and there are no postulates for R. The operations
of relational relevance algebras are just standard set-theoretically defined operations on
binary relations. Of course, some axioms of R fail in R. The reasons for their failure
are given in Theorem 5.1, from which we can see that the commutative dense relational
relevance algebras will satisfy all the axioms of R. Focusing attention on the subclass of
commutative dense algebras in R is a response to the axioms of R. For the system of Basic
Logic consisting of axioms (A1)–(A20) and all nine rules, no such response is needed. The
natural class of models is R, and Basic Logic is a finite approximation to R-logic.
One should expect ad hoc semantics to be sound and complete because they are designed
for that purpose. But R-logic, Rcd-logic, Rcdt-logic, and so forth, are part of the
nineteenth-century calculus of relations, while R and RM are mid-twentieth-century inventions
that just happen to be a proper subsystem of Rcd-logic and exactly the same as Rcdt-logic,
respectively. Is this just a pure coincidence, or is there some underlying reason? There is
no sign that the founders of relevance logic were trying to capture properties of binary
relations in their axioms, so perhaps it is a coincidence. At least the binary relational
interpretation escapes the charge that “. . . it is completely obscure what meaning is given
to negation in the Routley–Meyer theory . . . ” (Copeland, 1979, p. 408). The meaning
of negation is quite clear; ∼ is converse complementation. Anderson & Belnap (1975,
p. 345) ask, “How then to interpret ◦? We confess puzzlement.” In the binary relational
interpretation, ◦ is composition.
Philosophical considerations are (or, at least, ought to be) constrained by mathematical
theorems, so we give here a summary of the main results in this paper (Theorems 4.1, 4.2,
5.1, Corollary 5.2, and Theorems 6.2, 8.1, and 8.2).
(L″), (M″) ∈ R-logic ⊂ Rcd-logic ⊂ Rcdt-logic = RM
(L″), (M″) ∉ R ⊂ Rcd-logic ⊂ {M0}-logic = BM
BIBLIOGRAPHY
Ackermann, W. (1956). Begründung einer strengen Implikation. Journal of Symbolic
Logic, 21, 113–128.
Anderson, A. R., & Belnap, N. D. Jr. (1975). Entailment. Princeton, NJ: Princeton
University Press.
Anderson, A. R., Belnap, N. D. Jr. & Dunn, J. M. (1992). Entailment. The Logic of
Relevance and Necessity. Vol. II. Princeton, NJ: Princeton University Press.
Andréka, H., Givant, S. R., & Németi, I. (1997). Decision problems for equational theories
of relation algebras. Memoirs of the American Mathematical Society, 126(604), xiv+126.
Belnap, N. D. Jr. (1960). Entailment and relevance. Journal of Symbolic Logic, 25,
144–146.
Bimbo, K., Dunn, J. M., & Maddux, R. D. (2009). Relevance logics and relation algebras.
The Review of Symbolic Logic, 2(1), 102–131.
Brady, R. T. (editor.) (2003). Relevant Logics and Their Rivals. Volume II. Aldershot,
Hants, England. Ashgate Publishing Ltd.
Church, A. (1951). The weak positive implicational propositional calculus. Journal of
Symbolic Logic, 16(5), 238.
Copeland, B. J. (1979). On when a semantics is not a semantics: Some reasons for disliking
the Routley-Meyer semantics for relevance logic. Journal of Philosophical Logic, 8(4),
399–413.
De Morgan, A. (1856). On the symbols of logic, the theory of the syllogism, and in
particular of the copula, and the application of the theory of probabilities to some
questions in the theory of evidence. Transactions of the Cambridge Philosophical
Society, 9, 79–127. Reprinted in De Morgan (1966).
De Morgan, A. (1864a). On the syllogism: III, and on logic in general. Transactions of the
Cambridge Philosophical Society, 10, 173–230. Reprinted in De Morgan (1966).
De Morgan, A. (1864b). On the syllogism: IV, and on the logic of relations. Transactions
of the Cambridge Philosophical Society, 10, 331–358. Reprinted in De Morgan (1966).
De Morgan, A. (1966). “On the Syllogism” and Other Logical Writings. New Haven, CT:
Yale University Press.
Došen, K. (1992). The first axiomatization of relevant logic. Journal of Philosophical
Logic, 21(4), 339–356.
Fine, K. (1974). Models for entailment. Journal of Philosophical Logic, 3, 347–372.
Kowalski, T. (2007). Weakly associative relation algebras hold the key to the universe.
Bulletin of the Section of Logic, 36(3/4), 145–157.
Lyndon, R. C. (1961). Relation algebras and projective geometries. Michigan Math
Journal, 8, 21–28.
Maddux, R. D. (1991). The origin of relation algebras in the development and
axiomatization of the calculus of relations. Studia Logica, 50(3–4), 421–455.
Maddux, R. D. (2006). Relation Algebras, Volume 150 of Studies in Logic and Foundations
of Mathematics. Amsterdam, The Netherlands: Elsevier B. V.
Maddux, R. D. (2007). Relevance logic and the calculus of relations [abstract].
International Conference on Order, Algebra and Logics, Vanderbilt University, June
13, 2007, pp. 1–3. Available from: www.math.vanderbilt.edu/∼oa12007/submissions/
submission 10.pdf.
Meyer, R. K., & Routley, R. (1973). Classical relevant logics. I, II. Studia Logica, 32,
51–68; ibid. 33 (1974), 183–194.
Mikulás, S. (2009). Algebras of relations and relevance logic. Journal of Logic and
Computation, 19(2), 305–321.
Moh, S.-K. (1950). The deduction theorems and two new logical systems. Methodos, 2,
56–75.
Peirce, C. S. (1870). Description of a notation for the logic of relatives, resulting from an
amplification of the conceptions of Boole’s calculus of logic. Memoirs of the American
Academy of Sciences, 9, 317–378. Reprinted in Peirce (1960) and Peirce (1984).
Peirce, C. S. (1880). On the algebra of logic. American Journal of Mathematics, 3, 15–57.
Reprinted in Peirce (1960).
Peirce, C. S. (1883). Note B: The logic of relatives. In Peirce, C. S., editor. Studies in
Logic by Members of the Johns Hopkins University. Boston, MA: Little, Brown, and Co.,
pp. 187–203. Reprinted by John Benjamins Publishing Co., Amsterdam and
Philadelphia, 1983, pp. lviii, vi+203.
Peirce, C. S. (1885). On the algebra of logic; a contribution to the philosophy of notation.
American Journal of Mathematics, 7, 180–202. Reprinted in Peirce (1960).
Peirce, C. S. (1897). The logic of relatives. The Monist, 7, 161–217. Reprinted in Peirce
(1960).
Peirce, C. S. (1960). Collected Papers. Cambridge, MA: The Belknap Press of Harvard
University Press.
Peirce, C. S. (1984). Writings of Charles S. Peirce. Vol. 2 (chronological edition).
Bloomington, IN: Indiana University Press.
Routley, R., & Meyer, R. K. (1972a). The semantics of entailment. II. Journal of
Philosophical Logic, 1(1), 53–73.
Routley, R., & Meyer, R. K. (1972b). The semantics of entailment. III. Journal of
Philosophical Logic, 1(2), 192–208.
Routley, R., & Meyer, R. K. (1973). The semantics of entailment. I. In Truth, Syntax and
Modality (Proc. Conf. Alternative Semantics, Temple Univ., Philadelphia, Pa., 1970),
pp. 199–243. Studies in Logic and the Foundations of Math., Vol. 68. Amsterdam, The
Netherlands: North-Holland.
Routley, R., Plumwood, V., Meyer, R. K., & Brady, R. T. (1982). Relevant Logics and Their
Rivals. Part I. Atascadero, CA: Ridgeview Publishing Co.
Routley, R., & Routley, V. (1972). The semantics of first degree entailment. Noûs, 6(4),
335–359.
Schröder, F. W. K. E. (1966). Vorlesungen über die Algebra der Logik (exacte Logik),
Volume 3, “Algebra und Logik der Relative,” Part I (second edition). Bronx, NY:
Chelsea. First published by B. G. Teubner, Leipzig, 1895.
Sugihara, T. (1955). Strict implication free from implicational paradoxes. Memoirs of the
Faculty of Liberal Arts, Series 1, no. 4, 55–59.
Tarski, A. (1941). On the calculus of relations. Journal of Symbolic Logic, 6, 73–89.
Tarski, A., & Givant, S. (1987). A Formalization of Set Theory Without Variables.
Providence, RI: American Mathematical Society.
Urquhart, A. (1972). Semantics for relevant logics. Journal of Symbolic Logic, 37,
159–169.
Urquhart, A. (1984). The undecidability of entailment and relevant implication. Journal of
Symbolic Logic, 49(4), 1059–1073.
Urquhart, A. (1996). Duality for algebras of relevant logics. Studia Logica, 56(1–2),
263–276.
DEPARTMENT OF MATHEMATICS
396 CARVER HALL
IOWA STATE UNIVERSITY
AMES, IA 50011
E-mail: maddux@iastate.edu
THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010
Abstract. This paper begins an axiomatic development of naive set theory—the consequences
of a full comprehension principle—in a paraconsistent logic. Results divide into two sorts. There is
classical recapture, where the main theorems of ordinal and Peano arithmetic are proved, showing
that naive set theory can provide a foundation for standard mathematics. Then there are major
extensions, including proofs of the famous paradoxes and the axiom of choice (in the form of the
well-ordering principle). At the end I indicate how later developments of cardinal numbers will
lead to Cantor’s theorem, the existence of large cardinals, and a counterexample to the continuum
hypothesis.
§1. Introduction. The axioms of naive set theory define sets through existence and
uniqueness conditions. The first axiom is the principle of comprehension, that any collec-
tion of objects is a set, and is itself an object. The second is the principle of extensionality,
that the members of a set completely determine the identity of that set. On the basis of
these principles alone, the mathematics of set theory can be developed, as well as some
novel inconsistent mathematics, as this article begins to demonstrate.
The comprehension principle is inconsistent. In the presence of classical logic, it
is outright incoherent. Since Russell told this to Frege, it has been the cause of some
surprise and consternation. The most prevalent response has been to maintain classical
logic, and to adopt Zermelo’s (1908) selection of some instances of the comprehension
principle. This is not the only possible response. Classical logic makes comprehension
absurd; the consternation, often tracked by using the word ‘paradox’, is due to the deep
sense that comprehension is not absurd. Others have tried to maintain comprehension, and
adopt, say, Routley and Meyer’s (1976) selection of logical axioms with which to reason
about it.
The resulting theory has several important features. First, it contains the basic theorems
of standard ZFC set theory, initially proving the ZF axioms and building to the theory
of ordinals and Peano postulates. This has been called the classical recapture, and
has proved startlingly elusive until now. The demands placed on the underlying logic
are simply very stringent. Speaking to this point, the usually optimistic authors of
Meyer et al. (1978) conclude in dramatic fashion that “naive set theory is untenable”
(p. 128). Priest summarizes the situation:
Since the early days of paraconsistent logic, it has been clear that the
rejection of ex contradictione [quodlibet] is not possible without the re-
jection of other things which appear to be much more integral to classical
reasoning. . . . Several logicians (including Brady, Meyer, Mortensen,
© Association for Symbolic Logic, 2010
doi:10.1017/S1755020309990281
72 ZACH WEBER
below), these demands mean that set negation excludes the middle, tertium non datur:
Φ ∨ ¬Φ for any sentence Φ. This is to be expected. An exhaustive universe is explicitly
postulated by Cantor and Frege; it is the cause of inconsistency in comprehension to begin
with. In light of the noted duality between paraconsistent and paracomplete (i.e., partial or
intuitionistic) systems, it is at least reasonable to investigate the mathematics of a gap-free
theory.
Essentially, we want the strongest logic possible that does not explode when given a
comprehension principle. Once all these are tailored and formalized, we obtain one of the
original tools fashioned for the job. Let us have the language of first-order set theory: with
primitives ∧, ¬, →, ∀, =, and ∈; variables x, y, z, . . .; names a, b, c, . . .; and formulae
Φ, Ψ, ϒ, . . ., built up by standard formation rules. The usual shorthand¹ is used: Φ ∨ Ψ
for ¬(¬Φ ∧ ¬Ψ); Φ ↔ Ψ for (Φ → Ψ) ∧ (Ψ → Φ); ∃ is ¬∀¬.
2.1. Axioms. All instances of the following schemata are theorems:
I Φ → Φ
IIa Φ ∧ Ψ → Φ
IIb Φ ∧ Ψ → Ψ
III Φ ∧ (Ψ ∨ ϒ) → (Φ ∧ Ψ) ∨ (Φ ∧ ϒ) (distribution)
IV (Φ → Ψ) ∧ (Ψ → ϒ) → (Φ → ϒ) (conjunctive syllogism)
V (Φ → Ψ) ∧ (Φ → ϒ) → (Φ → Ψ ∧ ϒ)
VI (Φ → ¬Ψ) → (Ψ → ¬Φ) (contraposition)
VII ¬¬Φ → Φ (double negation elimination)
VIII (Φ → Ψ) → ¬(Φ ∧ ¬Ψ) (counterexample)
IXa (Φ → Ψ) → [(Ψ → ϒ) → (Φ → ϒ)]
IXb (Φ → Ψ) → [(ϒ → Φ) → (ϒ → Ψ)] (hypothetical syllogisms)
X ∀xΦ → Φ(y/x)
XI ∀x(Φ → Ψ) → (Φ → ∀xΨ)
XII ∀x(Φ ∨ Ψ) → Φ ∨ ∀xΨ.
In Axiom X, y is free for x in Φ. In Axioms XI and XII, x is not free in Φ.
The hypothetical syllogism pair IXa and IXb are called suffixing and prefixing, respec-
tively. Without these we have the weak relevant logic DLQ, dialethic logic with quantifiers,
introduced in Routley & Meyer (1976) as ‘dialectical’ logic. Adding them amplifies the
logic to TLQ.
2.2. Rules. The following rules are valid:
I Φ, Ψ ⊢ Φ ∧ Ψ (adjunction)
II Φ, Φ → Ψ ⊢ Ψ (modus ponens)
III Φ → Ψ, ϒ → Θ ⊢ (Ψ → ϒ) → (Φ → Θ)
IV Φ ⊢ ∀xΦ
V x = y ⊢ Φ(x) → Φ(y) (substitution).
Modus ponens is also valid in the single premise form: Φ ∧ (Φ → Ψ) ⊢ Ψ, not to
be mistaken for the illegitimate axiom form. Substitution, similarly, is only valid in rule
form. Given Axioms IXa and IXb, the hypothetical syllogism pair, Rule III is actually
redundant because it is derivable. There is no harm in keeping it explicit, though.
1 Taking these as definitions means that, for example, Φ ∨ Ψ → ¬(¬Φ ∧ ¬Ψ) is no more than an
instance of Axiom I below.
Brady proves that naive set theory in the closely related logics DKQ and TKQ is
non-trivial (Brady, 2006, p. 242), in the sense that it has a model in which some
sentences fail. Brady’s work begins in Brady (1971) and he and Routley collaborated
on the proof in Brady & Routley (1989). Routley used DKQ in his ?, calling the set
theory DST, for dialectical set theory, which in present terminology would be dialethic
set theory. The logic DLQ is the same as DKQ, except that Φ ∨ ¬Φ is strengthened
to the counterexample axiom; and TLQ, again, adds the hypothetical syllogisms.
I have found DKQ to be too weak to obtain satisfactory results, namely, the classical
recapture. Non-triviality of naive set theory in DLQ and TLQ is an open problem.
I conjecture that DLQ is the weakest logic of its kind that can support any robust
mathematics.
The following derived facts about the logic will be most helpful.
Double negation introduction follows from Axioms I and VI, and modus ponens:
(¬Φ → ¬Φ) → (Φ → ¬¬Φ). Then the contraposition Axiom VI can be rearranged to
(Φ → Ψ) → (¬Ψ → ¬Φ). Counterexample gets its name from the contraposed form
Φ ∧ ¬Ψ → ¬(Φ → Ψ). From the instance of counterexample (Φ → Φ) → ¬(Φ ∧ ¬Φ),
by Axiom I and modus ponens we have the law of non-contradiction: ¬(Φ ∧ ¬Φ). Then by
the definition of disjunction follows the law of excluded middle, Φ ∨ ¬Φ. Contraposition
on Axiom V gives a schema for ∨-elimination: (Φ → Ψ) ∧ (ϒ → Ψ) → (Φ ∨ ϒ → Ψ).
Then the pair (Φ → ¬Φ) → ¬Φ and (¬Φ → Φ) → Φ, reductio and consequentia
mirabilis, are theorems. We have the scheme for existential instantiation, ∀x(Φ → Ψ) →
(∃xΦ → Ψ), from Axiom XI, contraposition, and the definition of ∃, and subject to the
condition that x is not free in Ψ.
A useful derived rule is
Φ → (Ψ → ϒ), ϒ → Θ ⊢ Φ → (Ψ → Θ).
→ ( → (ϒ → ), → → ( → (ϒ → )),
§3. Axioms. Our first-order formal language is now augmented with a variable binding
term forming operator {· : −}; it remains open how to conservatively add term-forming
symbols in relevance contexts (Brady, 2006, p. 177), and is not a problem addressed here.
The set concept is now characterized by two axioms.
AXIOM 3.1 (Abstraction). x ∈ {z : Φ(z)} ↔ Φ(x).
AXIOM 3.2 (Extensionality). ∀z(z ∈ x ↔ z ∈ y) ↔ x = y.
Existential generalization on abstraction immediately yields the comprehension
principle:
THEOREM 3.3 (Comprehension). ∃y∀x(x ∈ y ↔ Φ(x)).
Under abstraction, the substitution rule is x = y ⊢ ∀z(x ∈ z → y ∈ z).
Abstraction and extensionality can be reconnected, as in Frege’s axiom:
THEOREM 3.4 (Basic law V). {x : Φ} = {x : Ψ} ↔ ∀x(Φ ↔ Ψ).
Peano chose the ∈ sign to denote predication, from the Greek verb ἐστίν, ‘to be’.
Sets are predicates in extension. Since arbitrary predicates determine sets, then, in the
comprehension principle the occurrence of y in the predicate is not ruled out. Following
Routley, this is completely unrestricted comprehension; without this, some sets would
not be obtainable, for example, the limiting case of diagonal sets, Z = {x : x ∈ Z }.
To guarantee that such instances are valid abstractions—to ensure that every pred-
icate, even groundless ones, determines a set—we have abstraction instances of the
form
x ∈ {zu : Φ(z, u)} ↔ Φ(z/x, u/{zu : Φ(z, u)})
where the right-hand side indicates a simultaneous substitution in Φ of z by x, and u
by the term {zu : Φ(z, u)}. (At first, in Brady & Routley (1989, p. 419), a new quantifier,
formation rule, and reflection axiom were added to handle circular predicates; but by Brady
(2006, pp. 177, 200), the idea is streamlined as above.) Axiom 3.1 in this way includes
cases
x ∈ {zu : Φ(z, u)} ↔ Φ(x, {zu : Φ(z, u)}).
We can then say ‘a set y of all x such that . . . x . . . y . . .’, as opposed to the usual ‘the set
of all x such that . . . x . . .’.
Naive set theory is studied here with the full comprehension scheme for three reasons.
The first has already been stated, namely, the philosophical conviction that all predicates,
even baroque predicates, determine sets.6 Note that the liar paradox has a similar kind of
self-reference. Second is a pragmatic motive: Since weak relevant logics are so terribly
weak, the proving power for theorems must come from somewhere. Unrestricted compre-
hension will allow easier construction of functions, diagonal sets, and other useful objects.
The third motive, then, is to model easily some venerable fixed-point phenomena in set
theory; see, for example, our recursive characterization of the ordinals and natural numbers
below. Uniqueness is not essential to these cases, as there are non-standard models in the
classical account anyway.
It is frequently convenient to use names for sets, for example, for a set a there is a set of
all the subsets of a, called the powerset of a. The symbol P(a) is used to denote this set in
the same way that ∅ is used to denote a particular set below; it is a name. So, to use a bit
of notation to be introduced below, x ∈ P(a) is just x ∈ {y : y ⊆ a}. Similar comments
apply to complementation a, intersection, union, and so forth. This notation can be thought
of as governed by instances of comprehension:
⟨x0, . . . , xn⟩ ∈ y ↔ Φ(x0, . . . , xn).
Since in later sections we will officially develop ordered pairs and the natural numbers to
act as indices, their use as primitives here is only heuristic.
§4. Basics.
PROPOSITION 4.1. y = {z : Φ(z)} ↔ ∀x(x ∈ y ↔ Φ(x)).
Proof. By extensionality, y = {z : Φ(z)} ↔ ∀x(x ∈ y ↔ x ∈ {z : Φ(z)}). By
abstraction, ∀x(x ∈ {z : Φ(z)} ↔ Φ(x)). Then by transitivity of the conditional, ∀x(x ∈
y ↔ Φ(x)). For the converse, we again invoke the abstraction scheme, where ∀x(Φ(x) ↔
x ∈ {z : Φ(z)}), so by transitivity ∀x(x ∈ y ↔ x ∈ {z : Φ(z)}). And this with the
extensionality axiom completes the proof.
6 Priest and Routley: “The naive notion of set is that of the extension of an arbitrary predicate....
This is as tight an account as can be expected from any fundamental notion. It was thought to be
problematical only because it was assumed (under the ideology of consistency) that ‘arbitrary’
could not mean arbitrary. However, it does” (Priest et al., 1989, p. 499).
TRANSFINITE NUMBERS IN PARACONSISTENT SET THEORY 77
x = x,
x = y → y = x,
x = y ∧ y = z → x = z.
Proof. These follow directly from the properties of → and ∧ (Axioms I, II, IV, and V),
and extensionality.
As with Proposition 4.8 below, identity also obeys an alternate form of transitivity due
to the hypothetical syllogism, namely
x = y → (y = z → x = z).
PROPOSITION 4.4. Sets that differ with respect to membership are not identical. In
particular, ∃x(x ∈ a ∧ x ∉ a) → a ≠ a.
Proof. We prove the contrapositive.
1 a = a → ∀z(z ∈ a ↔ z ∈ a) Ext.
2 ∀z(z ∈ a ↔ z ∈ a) → (b ∈ a ↔ b ∈ a) 1, Ax. X
3 (b ∈ a ↔ b ∈ a) → b ∉ a ∨ b ∈ a Ax. II, VIII
4 b ∉ a ∨ b ∈ a → ¬(b ∈ a ∧ b ∉ a) 3, Ax. I
5 a = a → ¬(b ∈ a ∧ b ∉ a) Ax. IV
6 (b ∈ a ∧ b ∉ a) → a ≠ a Ax. VI.
Existential generalization completes the result.
When a set a is such that its membership is inconsistent, some b ∈ a and b ∉ a, then a
is inconsistent.
PROPOSITION 4.5. ∃x(x ≠ x).
Proof. By comprehension we have Russell’s set, R = {x : x ∉ x}.
1 ∀x(x ∈ R ↔ x ∉ x)        Comp.
2 R ∈ R ↔ R ∉ R            1, Ax.X
3 R ∈ R → R ∉ R            2, Ax.II
4 R ∉ R ∨ R ∉ R            3, Ax.VIII
5 R ∉ R                    4, Ax.V
6 R ∈ R                    2, 5, Rule II
7 R ∈ R ∧ R ∉ R            5, 6, Rule I.
Since R differs from itself with respect to membership, by Proposition 4.4, R ≠ R.
From this novelty of inconsistent set theory, Restall (1992, p. 427) infers by generaliza-
tion the
COROLLARY 4.6 (Restall). There are at least two objects, ∃x∃y(x ≠ y).
These few facts already show us that the present set theory will have as theorems
some propositions not contained by classical theory, and that classical theorems may be
recaptured by distinctively non-classical means.
DEFINITION 4.7 (Subsets). x ⊆ y := ∀z(z ∈ x → z ∈ y);
a proper subset is x ⊂ y := x ⊆ y ∧ ∃z(z ∈ y ∧ z ∉ x).
This leads to a natural partial order; the converse of antisymmetry actually holds, too,
since this is just the axiom of extensionality rewritten.
PROPOSITION 4.8. Subsets are reflexive and antisymmetric,
x ⊆ x,
x ⊆ y ∧ y ⊆ x → x = y.
x ⊆ y ∧ y ⊆ z → x ⊆ z,
x ⊆ y → (y ⊆ z → x ⊆ z),
and x ⊆ y → (z ⊆ x → z ⊆ y).
Proof. Reflexivity and antisymmetry come from extensionality, Axiom I, and the
commutativity of conjunction. The forms of transitivity are direct results of conjunctive
syllogism and the hypothetical syllogism pair, respectively.
V = {x : ∃y(x ∈ y)},
∅ = {x : ∀y(x ∈ y)},
§5. ZF. Now it is time to produce all the axioms of Zermelo–Fraenkel set theory,
except Fundierung. Since general comprehension induces sets that are not well founded,
for example, V ∈ V , the foundation axiom is not a part of the present theory, as is expected
from Restall’s results in Restall (1992). That the other axioms are forthcoming is not
especially surprising, since Zermelo in 1908 (Zermelo, 1967) saw them as a consistent
fragment of the naive theory. The axiom of infinity will be provable, too, showing that the
naive theory does not fail a reductive program in the same way as Russell and Whitehead’s
system did.
In each proposition, the step of universally generalizing on free a, b is omitted.
PROPOSITION 5.1 (Aussonderung). ∃y∀x(x ∈ y ↔ x ∈ a ∧ Φ(x)).
PROPOSITION 5.2 (Powerset). ∃y∀x(x ∈ y ↔ x ⊆ a).
For any a, we use the name P(a) := {x : x ⊆ a}.
PROPOSITION 5.3 (Pairing). ∃y∀x(x ∈ y ↔ x = a ∨ x = b).
For any a, b, {a, b} := {x : x = a ∨ x = b}. A special case is the singleton {a} := {x :
x = a}. For relevance purposes, sometimes singletons are relativized to a particular set,
which is a key to defining ordinal and numerical successor below, and so to which we call
special attention.
DEFINITION 5.4 (Relevant singleton). {a}_b := {x : x = a ∧ x ∈ b}.
The essential property is that {a}_b ⊆ b when a ∈ b, as is the case classically. Actually
we have the stronger fact that {a}_b ⊆ b even without a ∈ b.
PROPOSITION 5.5 (Union). ∃y∀x(x ∈ y ↔ ∃z(z ∈ a ∧ x ∈ z)).
PROPOSITION 5.6 (Intersection). ∃y∀x(x ∈ y ↔ ∀z(z ∈ a → x ∈ z)).
The standard names are adopted: ⋃a := {x : ∃z(z ∈ a ∧ x ∈ z)} and
⋂a := {x : ∀z(z ∈ a → x ∈ z)} for the union and intersection of a, respectively. Note
that, because the conditional is not material, these are not necessarily duals. On the other
hand, a ∪ b := {x : x ∈ a ∨ x ∈ b} and a ∩ b := {x : x ∈ a ∧ x ∈ b} obey their usual
algebra, save the explosive a ∩ ā ⊆ ∅.
The complement of b in a is now easily construed as a − b := a ∩ b̄.
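For orientation only, the set-forming operations just named can be imitated on finite, classical sets. The following Python sketch is an illustration under strong assumptions: it uses ordinary two-valued logic and finite `frozenset` values, so it says nothing about the paraconsistent behaviour of these sets, and the function names `big_union` and `big_intersection` are hypothetical.

```python
from functools import reduce

def big_union(a):
    """Finite analogue of the union of a: whatever belongs to some member of a."""
    return frozenset(x for z in a for x in z)

def big_intersection(a):
    """Finite analogue of the intersection of a: whatever belongs to every member of a."""
    members = list(a)
    if not members:
        # In the naive theory the intersection of the empty set is the universe;
        # a finite model cannot represent that, so the case is rejected.
        raise ValueError("intersection of the empty family is not representable here")
    return frozenset(reduce(lambda s, t: s & t, members))

a = frozenset({frozenset({1, 2}), frozenset({2, 3})})
union_of_a = big_union(a)
intersection_of_a = big_intersection(a)
# Relative complement a - b, read classically as set difference.
difference = frozenset({1, 2, 3}) - frozenset({2})
```

In this classical setting union and intersection are duals; the text's remark that they need not be duals depends on the non-material conditional, which the sketch does not model.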
The next axiom is the axiom of infinity. Note that V is already a set containing ∅ and
its successors (since every set whatever is in V ), and that we will be properly introducing
the natural numbers later. So V is infinite7 and the following is only for curiosity; compare
Petersen’s characterization of the natural numbers in (Petersen, 2000, p. 386).
PROPOSITION 5.7 (Infinity). There is a non-empty set i such that when x ∈ i, also
{x}_i ∈ i.
Proof. Consider i = {x : x ⊆ i}. Both i ∈ i and ∅ ∈ i, so i is not empty. For x ∈ i, also
{x}_i ⊆ i, and so {x}_i ∈ i.
In the next definition and beyond, we take to writing some sets in a commonplace
way, as, for example, {⟨x, y⟩ : x ∈ a ∧ y ∈ b}. As noted above, this is to be controlled with
appropriate instances of abstraction. While our language does not, at the moment, contain
function symbols, when ⟨x, y⟩ ∈ f and f is a function, then we will write f(x) = y,
having in mind something like a two-place predicate: ∀x∃y[Φ(x, y) ∧ ∀z(Φ(x, z) →
y = z)].
DEFINITION 5.8. An ordered pair is ⟨a, b⟩ = {{a}, {a, b}}. The cartesian product is
a × b = {⟨x, y⟩ : x ∈ a ∧ y ∈ b}. A relation is r ⊆ a × b. A function is a relation
f : a −→ b with domain dom(f) = a and range rng(f) = b, such that
⟨x, u⟩ ∈ f ∧ ⟨x, v⟩ ∈ f → u = v.
The composition of two functions f, g is f ◦ g = {⟨x, z⟩ : ∃y(⟨x, y⟩ ∈ g ∧ ⟨y, z⟩ ∈ f)}. The
image of a under f is f a = {f(x) : x ∈ a}. The restriction of f to x is f|x = f ∩ (x × a).
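The pair construction in Definition 5.8 is the familiar Kuratowski one, and its characteristic property can be checked by brute force on a small classical universe. The sketch below is only an illustration: it assumes ordinary finite sets and two-valued equality, and the names `kpair` and `cartesian` are hypothetical.

```python
from itertools import product as iproduct

def kpair(a, b):
    """Kuratowski pair: {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def cartesian(a, b):
    """Cartesian product as a set of Kuratowski pairs."""
    return frozenset(kpair(x, y) for x in a for y in b)

# Check, over a three-element universe, that two pairs are equal
# exactly when their components agree coordinate by coordinate.
universe = range(3)
pair_property_holds = all(
    (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
    for a, b, c, d in iproduct(universe, repeat=4)
)

square = cartesian(frozenset({0, 1}), frozenset({0, 1}))
```

An exhaustive check over a finite universe is of course not a proof; it only shows what the property asserts in the classical case.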
PROPOSITION 5.9 (Replacement). Let f be a function with domain a. Then f a exists.
PROPOSITION 5.10 (Ordered pairs). ⟨a, b⟩ = ⟨c, d⟩ iff a = c ∧ b = d.
Proof. Right to left is substitution, a = c a ∈ a, b → c ∈ a, b, and similarly for
b = d. We prove left to right. For a = c,
7 A set X is Dedekind infinite if there is an injection from X to some Y ⊂ X. Then it is provable that
some sets are Dedekind infinite, more or less in the same way that Dedekind proved it in 1888.
Consider f = {⟨x, {x}⟩ : x ∈ V}. This is an injection of V into a proper part of itself. Therefore
V is Dedekind infinite.
Most set theories include the axiom of choice; proving choice would recapture the
axioms of ZFC. It has been a vexed question as to whether or not a choice principle is or
should be a fact about set theory (though it is now mostly regarded without suspicion). For
our naive theory, we will see that choice is a consequence of a deeper Cantorian principle:
that every set can be put into a well order. To show this, though, we must first do a lot of
work on the notion of order itself. Useful classical references for the next section are Drake
(1974), Levy (1979), and Kunen (1980).
§6. Ordinals.
DEFINITION 6.1. A set a with respect to ∈ is:
strictly ordered iff x, y, z ∈ a → x ∉ x ∧ (x ∈ y ∧ x ∉ x → y ∉ x) ∧ (y ∈ z → (x ∈ y → x ∈ z));
totally or linearly ordered by ⊆ iff a is strictly ordered and x ∈ a → (y ∈ a → x ⊆ y ∨ y ⊆ x),
that is, ⊆-trichotomy holds;
well founded, Wf(a), iff [y ⊆ a ∧ ∃z z ∈ y → ∃z(z ∈ y ∧ ¬∃x(x ∈ z ∧ x ∈ y))];
well ordered, Wo(a), iff totally ordered and well founded;
transitive, Tr(a), iff x ∈ a → x ⊆ a;
an ordinal iff a is a well-ordered and transitive set of ordinals, ⊆-connected to all other
ordinals.
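To make the clauses of Definition 6.1 concrete, here is a minimal Python sketch that tests them on hereditarily finite sets built from `frozenset`. It rests on classical assumptions: for such finite sets irreflexivity and well-foundedness hold automatically, so only transitivity, ⊆-connectedness, and "every member is an ordinal" are checked. The clause about being connected to all other ordinals cannot be tested locally and is omitted. The function names are hypothetical.

```python
def is_transitive(a):
    """Tr(a): every member of a is also a subset of a."""
    return all(x <= a for x in a)

def is_subset_connected(a):
    """Any two members of a are comparable by inclusion."""
    return all(x <= y or y <= x for x in a for y in a)

def is_ordinal(a):
    """Classical finite reading: a transitive, connected set of ordinals."""
    return (
        is_transitive(a)
        and is_subset_connected(a)
        and all(is_ordinal(x) for x in a)
    )

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
not_transitive = frozenset({one})  # contains {∅} but not ∅
```

The recursion in `is_ordinal` mirrors the impredicative flavour of the definition (ordinals are sets of ordinals), but here it terminates because the sets are finite and well founded.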
In summary, by (full) comprehension
PROPOSITION 6.2. There is a set of all ordinals, On = {x : x is an ordinal}, such that
x ∈ On ↔ Wo(x) ∧ (y ∈ x → y ⊆ x) ∧ x ⊆ On ∧ (y ∈ On → (x ⊆ y ∨ y ⊆ x)).
The definition of ordinal is adapted from von Neumann. We have added impredicative
clauses, to capture the recursive idea that the ordinals are the same inside and out. The
ordinals are an analysis of the concept of induction, generalized to the transfinite and
reified. The hard work in the theory of ordinals is in locating the right definition. Then
the properties of the ordinals, culminating in the Burali–Forti theorem, should all follow,
as it were, by logic alone.
A few more detailed comments. Our extra clause in the antisymmetry condition for
strict order, x ∉ x, is due to the relevance constraint on implication, as we will see in the
proofs below. In the well-founding clause, the rendering of a set having a least member
is material; the intensional alternative would have been that for any non-empty y ⊆ a,
there is a z ∈ y such that ∀x(x ∈ z → x ∉ y). Using this definition, however, makes it
too difficult to prove that anything is well founded. Similarly, to gloss z as a least member
if z ∩ y = ∅ would be almost impossible to confirm, since one would have to show that
x ∈ z ∩ y not only fails, but is absurd. It may be contradictory, but it is not absurd. For
linear order, we are using the ⊆ relation instead of the ∈ relation, and have built in an added
clause into the definition of On based on this choice. Finally, notice that → can be replaced
with
→.
On to the mathematics. Ordinals are written with lowercase Greek letters.
PROPOSITION 6.3. ∅ ∈ On.
Proof. This is because ∅ is explosive. First, ∅ ⊆ On, by Proposition 4.11. Similarly, x ∈
∅ → x ⊆ ∅, transitivity, and x ∈ ∅ → (y ∈ ∅ → x ⊆ y ∨ y ⊆ x), a linear order. To show
that ∅ is ⊆-connected to all ordinals it again suffices that ∅ ⊆ x for any x at all. Finally, to
show well foundedness, we have a ∈ y ∧ (x ∈ y → x ∈ ∅) → a ∈ y ∧ (x ∈ ∅ ∨ x ∉ y),
by counterexample. Again since ∅ is explosive we get
a ∈ y ∧ (x ∈ y → x ∈ ∅) → a ∈ y ∧ (x ∉ a ∨ x ∉ y).
Generalization completes the proof.
α = {x : x ∈ On ∧ x ∈ α}.
Proof. Let α ∈ On. The previous theorem and the axiomatic instance β ∈ α → β ∈ α
show that β ∈ α → β ∈ On ∧ β ∈ α. The other direction is immediate. Therefore
α = α ∩ On.
PROPOSITION 6.7. α ∈ On → α ∉ α.
Proof. The idea is that, were α ∈ α, then still α ∉ α.
1 α ∈ On → ∀x(x ∈ α → x ∉ x).                 def. 6.1
2 ∀x(x ∈ α → x ∉ x) → (α ∈ α → α ∉ α).        Ax.X
3 (α ∈ α → α ∉ α) → α ∉ α ∨ α ∉ α.            Ax.VIII
4 α ∉ α ∨ α ∉ α → α ∉ α.                      Ax.V
Conjunctive syllogism completes the proof.
PROPOSITION 6.8. α ∈ β ∧ α ∉ α → β ∉ α.
Proof. α ∈ β ∧ α ∉ α → β ⊈ α by counterexample, and β ⊈ α → β ∉ α.
PROPOSITION 6.9. β ∈ γ → (α ∈ β → α ∈ γ).
Proof. By the definition of transitivity on γ .
The last few propositions have shown, by virtue of self-similarity, that the ordinals are
strictly ordered. Now we need total and well ordering.
THEOREM 6.10 (Trichotomy of ordinals). Any two ordinals are ⊆-connected,
α ∈ On → (β ∈ On → α ⊆ β ∨ β ⊆ α).
Proof. This is a clause in the definition of ordinal.
This delivers, for a start, some miscellany like the following.
PROPOSITION 6.11. α ∩ β ∈ On, and α ∪ β ∈ On.
Proof. We can show α ∩ β is well ordered and transitive independently of trichotomy.
Let x ∈ α ∩ β. Then x ∈ α ∧ x ∈ β; then x ⊆ α ∧ x ⊆ β; then x ⊆ α ∩ β, showing
transitivity. Meanwhile, α ∩ β ⊆ α; since α is well ordered, α ∩ β is well ordered, too.
Easier still is to notice that if α ⊆ β, then α ∩ β = α; and if β ⊆ α then α ∩ β = β,
ordinals both. The case for α ∪ β is just like this.
More weakly, we see that certain intervals of On must be empty. Since α ∩ β ∈ On, we
have α ∩ β ∉ α ∩ β. Therefore either α ∩ β ∉ α or α ∩ β ∉ β. Therefore, there cannot
be any ordinals intervening between both α and α ∩ β, and β and α ∩ β. Observations like
these are the spur for the classical proof that On is linearly ordered.
PROPOSITION 6.12. The ordinals are well founded.
Proof. We have to show that a non-empty θ ⊆ On has a least member. Let β ∈ θ . The
idea is that β is ∈-least in θ, or else the least member of β is the least member of θ . Note
that β ∩ θ ⊆ β; since β is well ordered, β ∩ θ is well ordered (by Proposition 6.4). Either
β ∩ θ is empty, or not.
PROPOSITION 6.13. Any transitive set of ordinals, connected to all other ordinals by
⊆, is an ordinal.
Proof. Any set of ordinals is well ordered, by the previous theorems. Then the definition
of being an ordinal is satisfied.
THEOREM 7.3. n⁺ ≠ ∅.
Proof. Since n ∈ n⁺ but no n ∈ ∅, zero is not the successor of any number. That is, if
n⁺ = ∅ then n ∈ ∅, which implies everything, including the desired theorem.
THEOREM 7.4. If n ∈ ω, then n⁺ ∈ ω.
Proof. If n ∈ ω, then n⁺ meets all the requirements to be a number: n⁺ ∈ On, and
has a predecessor, and all its members are in n, in which case they are numbers, or else a
number identical to n, a number again.
THEOREM 7.5. If n⁺ = m⁺, then n = m.
Proof. If n⁺ = m⁺, then ∀z(z ∈ n ∨ (z ∈ ω ∧ z = n) ↔ z ∈ m ∨ (z ∈ ω ∧ z = m)).
Then picking n and m for z respectively,
(n ∈ m ∨ (n ∈ ω ∧ n = m)) ∧ (m ∈ n ∨ (m ∈ ω ∧ m = n)),
which, once the extraneous conjuncts are dropped, distributes to
(n ∈ m ∧ m ∈ n) ∨ (n ∈ m ∧ m = n) ∨ (n = m ∧ m ∈ n) ∨ (n = m ∧ m = n).
Since n ∈ m ∧ m ∈ n implies n = m by two applications of transitivity, each disjunct
above implies that n = m. So the successor of every number is unique.
Names of the first few natural numbers are
∅ = 0
∅⁺ = 1
∅⁺⁺ = 2
...
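The numerals just named can be built explicitly in the classical, finite case. The Python sketch below is an illustration under assumptions: it uses the ordinary successor n ∪ {n} rather than the relevant singleton {n}_ω that the proof of Theorem 7.5 suggests, and it works in two-valued logic. The names `succ` and `numeral` are hypothetical.

```python
def succ(n):
    """Classical von Neumann successor: n together with n itself as a member."""
    return n | frozenset({n})

def numeral(k):
    """The k-th numeral, obtained by applying succ to the empty set k times."""
    n = frozenset()
    for _ in range(k):
        n = succ(n)
    return n

three = numeral(3)

# Finite spot checks of two Peano-style facts from this section:
# no successor is empty, and successor is injective.
no_successor_is_zero = all(succ(numeral(k)) != frozenset() for k in range(5))
successor_is_injective = all(
    (succ(numeral(i)) == succ(numeral(j))) == (i == j)
    for i in range(5)
    for j in range(5)
)
```

Each numeral has exactly as many members as the number it names, and earlier numerals are members of later ones, which is the sense in which these sets can serve as indices.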
The fifth postulate is induction, proved in the general, transfinite case over On.
∀β(β ∈ α ∨ β ∈ θ) → α ∈ θ.
But the hypothesis implies ∃β(β ∈ α ∧ β ∈ θ) ∨ α ∈ θ , for every α, which negates this
claim. Therefore there is no least, and so no ordinal not in θ , as required.
Transfinite induction will hold for any well-ordered set, including ω. The base case, 0,
is covered automatically by the induction hypothesis: If ¬Φ(0) → ∃x(x ∈ 0 ∧ ¬Φ(x)),
then Φ(0) by the explosiveness of ∅.
A close mate of induction is definition by recursion. There is something appropriate in
proving the recursion theorem, as we are about to, with a set containing itself in its defining
condition.
THEOREM 7.7 (Transfinite recursion). Let h be a function from V to V. There is a
function f from On to V such that
f(α) = h(f|α).
Proof. Take the set
⟨x, y⟩ ∈ f ↔ y = h(f|x).
Existence is immediate from full comprehension. This is a function because h is; if
⟨x, y⟩, ⟨x, z⟩ ∈ f then y = h(f|x) = z.
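The recursion equation f(α) = h(f|α) has a simple finite shadow: each value is computed from the restriction of f to the predecessors. The following Python sketch illustrates only that finite, classical case. It builds f step by step, whereas the proof above obtains f in one stroke from comprehension. The names `recurse` and `h_example` are hypothetical.

```python
def recurse(h, n):
    """Return f restricted to {0, ..., n} as a dict, where f(k) = h(f|k)."""
    f = {}
    for k in range(n + 1):
        # dict(f) is a copy of f restricted to the predecessors of k.
        f[k] = h(dict(f))
    return f

def h_example(restriction):
    """Sample h: one more than the sum of all earlier values."""
    return 1 + sum(restriction.values())

table = recurse(h_example, 4)
```

With this sample h, the first value is 1 and every later value is one more than the sum of all earlier ones, so the table doubles at each step.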
The recursion scheme is used to define ordinal arithmetic. To deal with limit ordinals,
we look to least upper bounds.9
DEFINITION 7.8. Let X be a set of ordinals. The supremum of X, sup(X), is the least
ordinal δ such that every x ∈ X is either in δ or identical to δ.
Addition, taking h to be h(α) = α⁺, is
α + 0 = α
α + (β + 1) = (α + β) + 1
9 The usual identification of sup(α) with ⋃α does not seem to follow in the logic here, or at
least not with ordinals as they are here defined, because the existential quantifier resists a proof
that ⋃α is an ordinal. Nevertheless, where X is a set of ordinals, then ordinals including every
member of X certainly exist (On, for example), and so, by well foundedness, a least exists, too.
If there is any question about the uniqueness of a supremum, a choice function, developed in the
next section independently of ordinal arithmetic, can be applied to make a functional selection.
α · (β + 1) = α · β + α
α · β = sup{α · γ : γ ∈ β} for limit β;
exponentiation, with h_β(α) = α · β, is
α^0 = 1
α^(β+1) = α^β · α
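On finite ordinals the successor clauses above reduce to the usual recursive definitions of arithmetic, which can be run directly. The Python sketch below covers only that finite case: limit clauses play no role, and the base clause α · 0 = 0 for multiplication is supplied as the standard one, since it does not appear in the clauses quoted here. The function names are hypothetical.

```python
def add(a, b):
    """a + 0 = a;  a + (b + 1) = (a + b) + 1."""
    return a if b == 0 else add(a, b - 1) + 1

def mul(a, b):
    """a * 0 = 0 (assumed standard base clause);  a * (b + 1) = a * b + a."""
    return 0 if b == 0 else add(mul(a, b - 1), a)

def power(a, b):
    """a ** 0 = 1;  a ** (b + 1) = (a ** b) * a."""
    return 1 if b == 0 else mul(power(a, b - 1), a)

sum_2_3 = add(2, 3)
product_3_4 = mul(3, 4)
power_2_5 = power(2, 5)
```

Each function recurses on its second argument only, matching the way the clauses are stated: the left operand is a parameter and the right operand is the ordinal recursed upon.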
§8. Global choice. In 1977, Routley produced an argument for the axiom of global
choice from full comprehension (Routley, 1980), (Priest et al., 1989, p. 374). Since then
the claim that choice is a theorem of naive set theory has become part of paraconsistent
folklore. There are some problems with Routley’s proofs (Weber, forthcoming-b), but his
general idea is correct. Answering a fundamental question about the set concept, we here
surpass classical theory and derive a global choice theorem without need of any further
assumptions. In fact what is proved is a weak version of Cantor’s well-ordering principle,
which he took to be a Denkgesetz, a law of thought (Hallett, 1984). It was in service
of proving this principle, essential to the theory of transfinite cardinals, that Zermelo
formulated his choice principle in 1904. The well ordering here is produced by injecting
V into a particular subset of On, thereby giving a well order for every subset of V , that is,
every set.
A function f : a −→ b is injective iff ∀x∀y(x ≠ y → f(x) ≠ f(y)).
THEOREM 8.1. The universe can be well ordered.
Proof. An injection f : V −→ On is required. Consider
⟨x, y⟩ ∈ f ↔ x ∈ V ∧ y ∈ On ∧ y = On.
That is, for each x ∈ V, f(x) = On. This is clearly a function. Now,
{α ∈ On : α = On} ⊆ On,
showing that the range of f is a segment of the ordinals and therefore well ordered.
Intuitively, the Burali–Forti paradox indicates that the members of the range of f are
discrete (Corollary 6.15), of the form
. . . ∈ On ∈ On ∈ . . . ,
so {On} may be injected into by arbitrarily large sets, inducing a well order on them.
Formally, because On ≠ On, ∀x∀y(x ≠ y → On ≠ On), so ∀x∀y(x ≠ y → f(x) ≠ f(y)).
Therefore f is an injection. Thus
{x_f(x) : f(x) ∈ On}
is a well order on V.
The proof is clearly not constructive; given the ordering {a_On, b_On, c_On, . . .} on V with
each On distinct, it is not said how to determine a first member. This is exactly the case
with Zermelo’s choice principle, which is a pure existence claim. A proof of a Cantorian
“law of thought” will inevitably be by demonstration of a bare existence of an ordering;
and so here, it has been established that for any well-ordered set there is a least member,
and {On} is well ordered. The difference between Zermelo’s proof and our own is that no
extra assumptions are needed to produce the existence claim. It comes directly from the set
concept.
Since subsets of a well-ordered set are well ordered, we obtain the following.
COROLLARY 8.2. (Zermelo 1904) Every set can be well ordered.
The following familiar steps lead to choice (Rubin & Rubin, 1985). The details are not
at all trivial, but are here omitted. A chain is a set connected by ⊆.
PROPOSITION 8.3 (Hausdorff’s maximal principle). Every set has a maximal chain.
LEMMA 8.4 (Zorn). If every chain of some non-empty a has an upper bound, then a
has a maximal element.
A function f is called a choice function on a iff f(x) ∈ x when x ∈ a and ∃z z ∈ x.
THEOREM 8.5 (Choice). There is a choice function on every set.
COROLLARY 8.6 (Global choice). There is a choice function on V. A fortiori, for every
non-empty a there is a choice function f such that f(a) ∈ a.
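The classical route from a well order to a choice function is short enough to sketch: given a well order on the elements, send each non-empty set to its least member. The Python fragment below illustrates that route on a finite family, with an explicitly given order. This is an assumption of the sketch, since the proof in the text is non-constructive and supplies no such listing. The names `choice_function`, `order` and `rank` are hypothetical.

```python
def choice_function(family, rank):
    """Map each non-empty member of family to its rank-least element."""
    return {x: min(x, key=rank) for x in family if x}

# A stand-in well order on four elements, listed from least to greatest.
order = ["c", "a", "d", "b"]
positions = {elem: i for i, elem in enumerate(order)}
rank = positions.__getitem__

family = [frozenset("ab"), frozenset("bd"), frozenset("abcd"), frozenset()]
f = choice_function(family, rank)
```

The empty member of the family is simply skipped, matching the side condition ∃z z ∈ x in the definition of a choice function.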
§9. Conclusions. The results in this paper are a beginning. For future work, a more
definite approach to restricted quantification, once one is known, will be invaluable.
A proof of the open conjecture at the end of the section on ordinals is obviously desirable.
And a more effective transfinite induction principle, with →s instead of extensional
connectives in key places, is desirable, too. Nevertheless, I hope these omissions can be
forgiven for now, in light of the significant increase in detail and insight provided.
To conclude on a forward-looking note, let me indicate some very striking results await-
ing in the theory of cardinal numbers. With the cardinals, we see clearly why all the work
on ordinals has been worthwhile, and also just how powerful the theory is. The construction
follows von Neumann’s assignment of least ordinals in an equinumerous class as cardinals.
Let < denote strict ordering by cardinality.
First, On should itself be a cardinal number, being evidently the least of its size. It
would be the cardinal of V . All cardinals are ordinals, so all cardinals are less than On.
This shows that
∃x∀y(|y| < |x|).
Then a simple quantifier swap proves the essence of Cantor’s theorem, that there are
distinctive orders of infinity:
∀y∃x(|y| < |x|).
For example, if |ω| := ℵ0 by definition, then ∃y(ℵ0 < y), so there are uncountable
cardinals. Note that this transcendence result holds even for V or On themselves—Cantor’s
paradoxes.
In a similar vein, a cardinal λ is said to be inaccessible if for every κ < λ, also 2^κ < λ.
The existence of inaccessible cardinals is unprovable in ZFC. But, almost trivially, On
is such a cardinal: For κ < On, also 2^κ < On, because On is the biggest. Therefore
inaccessible cardinals exist.
Finally, the generalized continuum hypothesis GCH conjectures for all cardinals κ, λ
that ¬(κ < λ < |P(κ)|). The GCH evidently fails at On, as the cardinal On provides a
counterexample. Assuming that On < |P(On)|, still we would have (by the Burali–Forti
contradiction) that On < On < |P(On)|. Then again, this does not rule out the GCH in
general; in fact, an instance of GCH holds at On. Let λ be a cardinal On < λ < |P(On)|.
Being a cardinal, ¬(On < λ), since all cardinals are members of, and less than, On. Thus
¬(On < λ < |P(On)|). In fact, further assuming that On = On + 1 obtains via the
Schröder–Bernstein theorem, and that On = ℵ_On, then by existential generalization,
∃α(2^(ℵ_α) = ℵ_(α+1)).
§10. Acknowledgment. Much thanks to Graham Priest, Greg Restall, Ross Brady,
Conrad Asmus, Sam Butchardt, Lloyd Humberstone, Stewart Shapiro, and anonymous
referees.
BIBLIOGRAPHY
Asmus, C. (2009). Restricted Arrow. Journal of Philosophical Logic, 38, 405–431.
Batens, D., Mortensen, C., Priest, G., & van Bendegem, J.-P., editors. (2000). Frontiers of
Paraconsistent Logic. Baldock, Hertfordshire, England and Philadelphia, PA: Research
Studies Press.
Beall, J. C., Brady, R. T., Hazen, A. P., Priest, G., & Restall, G. (2006). Relevant restricted
quantification. Journal of Philosophical Logic, 35, 587–598.
Brady, R. (1971). The consistency of the axioms of abstraction and
extensionality in a three valued logic. Notre Dame Journal of Formal Logic, 12, 447–453.
Brady, R., editor. (2003). Relevant Logics and Their Rivals, Volume II: A Continuation
of the Work of Richard Sylvan, Robert Meyer, Val Plumwood and Ross Brady. With
contributions by: Martin Bunder, Andre Fuhrmann, Andrea Loparic, Edwin Mares, Chris
Mortensen, and Alasdair Urquhart. Aldershot, Hampshire, UK: Ashgate.
Brady, R. (2006). Universal Logic. Stanford, California: CSLI.
Brady, R. T., & Routley, R. (1989) The non-triviality of extensional dialectical set theory.
In Priest, G., Routley, R., and Norman, J., editors. Paraconsistent Logic: Essays on the
Inconsistent. Munich: Philosophia Verlag, pp. 415–436.
da Costa, N. (2000). Paraconsistent mathematics. In Batens, D., Mortensen, C., Priest, G.,
and van Bendegem J.-P., editors. Frontiers of Paraconsistent Logic. Baldock,
Hertfordshire, England and Philadelphia, PA: Research Studies Press, pp. 165–180.
Drake, F. (1974). Set Theory: An Introduction to Large Cardinals. Amsterdam: North
Holland Publishing Co.
Hallett, M. (1984). Cantorian Set Theory and Limitation of Size. Oxford Logic Guides.
Oxford [Oxfordshire]: Clarendon Press, 1984.
Kunen, K. (1980). Set Theory: An Introduction to Independence Proofs. Amsterdam: North
Holland Publishing Co.
Levy, A. (1979) Basic Set Theory. Berlin, Heidelberg and New York: Springer Verlag.
Reprinted by Dover, 2002.
Libert, T. (2005). Models for paraconsistent set theory. Journal of Applied Logic, 3, 15–41.
Mares, E. (2004). Relevant Logic. Cambridge, UK; New York: Cambridge University
Press.
Meyer, R. K., Routley, R., & Dunn, J. M. (1978). Curry’s paradox. Analysis, 39, 124–128.
Rumored to have been written only by Meyer.
Petersen, U. (2000). Logic without contraction as based on inclusion and unrestricted
abstraction. Studia Logica, 64, 365–403.
Priest, G. (2006). In Contradiction: A Study of the Transconsistent. Oxford, UK: Oxford
University Press. Second expanded edition of Priest (1987).
Priest, G., Routley, R., & Norman, J., editors. (1989). Paraconsistent Logic: Essays on the
Inconsistent. Munich: Philosophia Verlag.
Restall, G. (1992). A note on naïve set theory in LP. Notre Dame Journal of Formal Logic,
33, 422–432.
Routley, R. (1980). Exploring Meinong’s Jungle and Beyond. Canberra: Philosophy
Department, RSSS, Australian National University. Interim Edition, Departmental
Monograph number 3.
Routley, R., & Meyer, R. K. (1976). Dialectical logic, classical logic and the consistency
of the world. Studies in Soviet Thought, 16, 1–25.
Rubin, H., & Rubin, J. E. (1985) [1963]. Equivalents of the Axiom of Choice. Amsterdam,
North Holland Publishing Co.
Weber, Z. (forthcoming-a). Extensionality and restriction in naive set theory. Studia Logica.
Weber, Z. (forthcoming-b). Notes on inconsistent set theory. In Tanaka, K., Berto, F.,
Paoli, F., and Mares, E., editors. World Congress of Paraconsistency 4.
Zermelo, E. (1967). Investigations in the foundations of set theory. In van Heijenoort,
J, editor. From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931.
Cambridge, MA: Harvard University Press, pp. 200–215.
Abstract. Moritz Pasch (1843–1930) gave the first rigorous axiomatization of projective geom-
etry in his Vorlesungen über neuere Geometrie (1882), in which he also clearly formulated the view
that deductions must be independent from the meanings of the nonlogical terms involved. Pasch also
presented in these lectures the main tenets of his philosophy of mathematics, which he continued to
elaborate on throughout the rest of his life. This philosophy is quite unique in combining a deductivist
methodology with a radically empiricist epistemology for mathematics. By taking into consideration
publications from the entire span of Pasch’s career, the latter decades of which he devoted primarily
to careful reflections on the nature of mathematics and of mathematical knowledge, Pasch’s highly
original, but virtually unknown, philosophy of mathematics is presented.
© Association for Symbolic Logic, 2010
93 doi:10.1017/S1755020309990311
94 DIRK SCHLIMM
concerned with the foundations of geometry as well as those of analysis and arithmetic,
and he developed a distinctive and original philosophy of mathematics.
The thesis that underlies the present paper is that Pasch’s reflections on the nature
of mathematics, which he presented throughout his life, in particular in his later more
systematic accounts, are elaborations and refinements of a philosophical position that he
put forward already in his famous lectures of 1882. One clear indication of this is given by the
later editions of these lectures in 1912 and 1926, where he had the opportunity to modify
or retract his earlier claims, but in which he only elaborated and added minor details. Thus,
what we can find in Pasch is a very thoughtful and consistent approach to the foundations
of mathematics.
The two main aspects of Pasch’s philosophy are a formal stance with regard to the valid-
ity of mathematical deductions and a strong commitment to an empiricist understanding of
the basic concepts of mathematics. On the face of it, these views might appear incompatible
in various ways. Firstly, that the meanings of the mathematical terms are given empirically
might not square with Pasch’s particular conception of deductivism, according to which
deductions must be independent of the meanings of the terms. Secondly, the introduction of
ideal elements in mathematics might stand in conflict with an empirical stance, and thirdly,
empiricism might appear to be incompatible with the common view that mathematical
deductions provide certain and necessary knowledge. (These aspects of incompatibility
will be addressed below.) Pasch’s deductivism and empiricism are mentioned in Ernest
Nagel’s informative paper on the development of geometry and logic (Nagel, 1939), with-
out, however, containing an account of Pasch’s attempt to reconcile them. Such an account
is also missing from Walter Contro’s detailed analysis of the axioms presented in Pasch’s
1882 lectures on projective geometry (Contro, 1976), and from the recent, in parts highly
speculative discussion in Tamari (2007).4 By drawing on Pasch’s lectures on geometry,
as well as his publications on the foundations of analysis and arithmetic, and his later
more philosophical works, this paper presents Pasch’s quite unique philosophy of mathe-
matics as a coherent system.5 The historical context and Pasch’s views of the relationship
between mathematical and philosophical investigations, which form the framework for
Pasch’s work, are presented in the next section. The tension between his radical empiricism,
aimed at providing an epistemological basis for mathematics, and his goal to capture
the essence of mathematical reasoning deserves particular attention. While Pasch maintains
that empiricism provides the best philosophical foundations for mathematics, he also
advances a very modern deductivist methodology for purely mathematical investigations.
These views are discussed in Sections 3 and 4, respectively. Finally, Pasch’s efforts to
merge these considerations into a unified whole, which I shall refer to as Pasch’s pro-
gramme, are presented in Section 5.
In presenting the reflections on mathematics and mathematical practice of a deep and
clear thinker such as Moritz Pasch, who stood with one foot firm in the empiricist tradition
of the nineteenth century, while vigorously striding with his other foot into the modern
Kronecker and Weierstrass, and submitted his Habilitation in 1870 in Giessen; after having been
Privatdozent at the University of Giessen he became extraordinary professor in 1873, and was
full professor from 1875 until 1910; he died September 20, 1930, at the age of 86.
4 For a coherent interpretation of Pasch’s mathematical work, see Gandon (2005). (Footnote added
November 2009).
5 Not all aspects of Pasch’s views can be dealt with in a satisfactory manner in the present paper.
Some of these are mentioned in the Concluding Remarks below, and are intended to be covered
in future work.
PASCH ’ S PHILOSOPHY OF MATHEMATICS 95
mathematics of the twentieth century, this paper is also intended as a contribution toward
a better understanding of the radical transition mathematics underwent at the turn of the
twentieth century.
§2. Pasch’s view of mathematics. The received account of the nature of mathematics
in the first half of the nineteenth century was that given by Kant, who considered the
theorems of arithmetic and of Euclidean geometry as synthetic a priori. However, interest in
this topic was revived after non-Euclidean geometries came to be regarded as acceptable
consistent theories through the work of Riemann, Beltrami, and Klein. The presence of vi-
able alternatives to Euclidean geometry cast doubt on Kant’s transcendental reasoning, and
Hermann von Helmholtz famously argued in the late 1860s and early 1870s that the ques-
tion as to which theory of geometry describes best the space we live in should be answered
empirically.6 Around the same time reliance on mathematical intuition was also severely
called into question by another development in mathematics, namely in function theory.
In the 1860s Weierstrass lectured about the possibility of continuous functions that are
nowhere differentiable and soon thereafter many other such ‘monster’ functions, which defied
visualization and which proved commonly accepted intuitions wrong, were studied.
It is against this background that Pasch formed his views on the nature of mathematics.7
One of the earliest insights into Pasch’s own development is offered in letters written
in 1882 to Felix Klein.8 Herein Pasch mentions as influences to his views the lectures of
Kronecker and Weierstrass that he attended in Berlin in 1865–1866,9 and also discussions
in the 1860s with his friend and colleague Jakob Rosanes.10 In these letters Pasch also
expresses his disappointment regarding the views of the few philosophers that he has read
(without mentioning any names, however). Nevertheless, Pasch thought his views to be so
commonsensical that he assumed them to be generally shared and he was surprised to hear
of Klein’s experiences of the contrary. In print, Pasch readily points out that his views are
‘by no means new’ (Pasch, 1887a, p. 129), but again without mentioning any predecessors.
We find only a brief reference in Pasch (1882a, p. 17) to von Helmholtz (1876), but whether
Pasch was in fact influenced by von Helmholtz, or whether he just quoted the famous
scientist in support of a view that he arrived at independently or influenced by other
authors remains an open question.
Thus, it seems that the seed to Pasch’s views on mathematics, which underlie his axioma-
tization of geometry as well as his other foundational and philosophical investigations, was
planted early in his career. As he admits without hesitation (and as will be discussed below),
particular aspects of his philosophical outlook evolved over time, but on the whole Pasch
remained committed throughout his life to the two pillars of his philosophy: deductivism
and empiricism. Pasch published a systematic account of his views, which I turn to next,
only in the last two decades of his career.
In order to accommodate his deductivism and empiricism into a coherent picture, Pasch
distinguishes between different layers of mathematical and philosophical investigations,
which are characterized by different aims and methodologies (see Table 1). According to
this picture, mathematical investigations take place at two distinct layers. The first one,
which Pasch describes as ‘rough’ (‘derb’) mathematics, comprises the usual work that is
done by mathematicians in order to obtain new results (Pasch, 1918a, p. 230).11 The bulk
of mathematical research falls into this category, and it is worth mentioning already at this
point that as a practicing mathematician Pasch was well aware of the distinction between
how mathematics is presented and how it is actually done (more on this later).
The second layer of mathematical work is foundational in character and it involves care-
fully working out the fundamental concepts and propositions of a discipline and showing
how the entire discipline can be built up from them. Pasch refers to this part of mathematics
as ‘heikel’ (Pasch, 1918a, p. 230), which is translated here as ‘delicate’, but could also mean
‘finicky’ and ‘touchy’. His own axiomatization of projective geometry (Pasch, 1882a) and
his introduction to analysis (Pasch, 1882b) are examples of such investigations. Delicate
mathematics is guided by the difficult demand for a ‘scrupulous completeness of the trains
of thought’ (‘unbedingte Vollständigkeit der Gedankengänge’) and is motivated by an ‘urge
for pure knowledge’ (‘entspringt [ . . . ] dem Drange nach Erkenntnis überhaupt’) (Pasch,
1924a, p. 36). Such investigations aim at an axiomatic presentation of a mathematical
discipline, which Pasch calls a stem (‘Stamm’), consisting of stem concepts and propo-
sitions (Pasch, 1882a, pp. 74, 98).12 On their basis a mathematical theory can be built up
deductively, and as long as they are not given any philosophical grounding, Pasch also
refers to them as ‘hypothetical’ (Pasch, 1917, p. 185) or ‘mathematical’ (Pasch, 1924a,
p. 43).
Once a mathematical foundation of a discipline has been given, the philosophical task
arises of determining the meanings of the mathematical terms and of giving an account
of their applicability to the world. In other words, a ‘substructure’ (‘Unterbau’) has to
be provided that supports and grounds the mathematical theory (Pasch, 1917, p. 185).13
For these philosophical foundations different approaches are possible, and Pasch mentions
rationalist, a priori, and empiricist accounts as alternatives (Pasch, 1926c, p. 138). For
reasons that will be discussed later (Section 3), Pasch himself decided to pursue a radical
empiricist approach. The details of Pasch’s efforts to connect the philosophical substruc-
ture to the purely mathematical foundations are discussed under the heading of ‘Pasch’s
programme’ below (Section 5).
Finally, a second layer of philosophical investigations is concerned with uncovering the
conditions, skills, and so forth, that are necessary for employing the basic concepts and car-
rying out the investigations at the higher levels. Pasch refers to this layer as investigations
regarding the ‘prescientific origins’ or simply the ‘origin’ (‘Ursprung’) of mathematics
and thinking in general (Pasch, 1924a, p. 40), and he identifies as its fundamental concepts
those of ‘thing’, ‘proper name’, ‘event’ (in particular that of ‘naming a thing’), ‘collective
name’, ‘earlier’ and ‘later’ events, ‘immediate following’, and ‘chain’ of events (Pasch,
1924b, p. 234).14 Pasch’s investigations at this layer might be characterized, borrowing an
expression of Hilbert, as a deepening of the foundations of human knowledge.15
The investigations at each of the top three layers can be pursued independently of the
considerations pertaining to a lower layer, which allows for the division of ordinary and
foundational research as well as the division of mathematical and philosophical labor. As
a consequence, mathematicians can ignore the questions regarding the origins and the
applicability of mathematics altogether, and most often they do.16 For Pasch, however,
a complete picture of mathematics requires an account of each of these four layers and of
their interconnections. To emphasize and illustrate this organic, hierarchical structure Pasch
employs terminology that evokes the picture of a tree of mathematics: On the one hand,
he refers to the philosophical foundations as a ‘Kern’, which is rendered here as core, but
could also be translated as ‘pip’ or ‘kernel’, that consists of core concepts and propositions
(‘Kernbegriffe’ and ‘Kernsätze’) (Pasch, 1916).17 On the other hand, the mathematical
foundations of a discipline are called a ‘Stamm’, translated here as stem, which could
also be rendered as ‘stalk’ or ‘trunk’, that consists of stem concepts and propositions
(‘Stammbegriffe’ and ‘Stammsätze’); in accordance with this botanical metaphor, the do-
main of philosophical inquiry that is common to all sciences is referred to as an area of
roots (‘Wurzelgebiet’) in Pasch (1924a, p. 34).
Failure to notice Pasch’s distinction between a (mathematical) stem and a (philosophical)
core, and indiscriminate reference to both stem propositions and core propositions as
‘axioms’ has led to misinterpretations and disputes in the literature. For example, Kline
(1972, p. 1008) mentions that some of Pasch’s axioms have empirical origins, while Torretti
(1978, p. 211, and footnote 49) explicitly disagrees with this assessment and claims that
Pasch considers all axioms to be empirically grounded.18 Since Nagel also does not address
Pasch’s crucial distinction between a core and a stem (Nagel, 1939, pp. 193–199), it has
been missed by many later commentators who relied heavily on Nagel’s account.
14 See also Pasch (1927), Pasch (1930a), and (Pasch, 1980, p. 16). The search for such origins can
certainly be traced back to Kant, but also some of Pasch’s colleagues addressed such questions,
for example, Dedekind (1888, p. 336), and Veronese (1894, pp. 1–2).
15 See Pasch’s discussion of Hilbert (1922) and Hilbert (1923) in Pasch (1924b, pp. 236–240).
16 See Pasch (1912, p. 204), Pasch (1924a, p. 43), and Pasch (1927, p. 123).
17 Core propositions are also referred to as ‘primitive stem propositions’ in Pasch (1924a, p. 16
[1915]), and in Pasch (1924b, p. 232) the core is referred to as a ‘ “natural” stem’.
18 A similar claim is made in Boniface (2004, p. 133).
98 DIRK SCHLIMM
In the course of his career, which spanned over 60 years, Pasch devoted his attention
increasingly to the deeper layers of mathematical and philosophical investigations, de-
scribing his aim as ‘getting as far as possible to the beginnings’ (Pasch, 1926b, p. 166).
Pasch’s earliest publications were a few short research articles, after which he brought
out two books in 1882, both of which are concerned with foundational mathematical
work and are interspersed with philosophical reflections. Soon after Pasch’s lectures on
geometry, his Einleitung in die Differential- und Integralrechnung (Pasch, 1882b) appeared.19
However, Pasch was not able to develop the foundations of analysis as deeply as
he had intended, and he tried to remedy this in Grundlagen der Analysis (Pasch, 1909)
and Veränderliche und Funktion (Pasch, 1914). Only after retiring from almost four decades
of teaching at the University of Giessen20 did Pasch find the time to write on more
philosophical topics. This led to numerous articles and the collections Mathematik und
Logik (Pasch, 1919, 1924a), Mathematik am Ursprung (Pasch, 1927), and Der Ursprung
des Zahlbegriffs (Pasch, 1930a).21
According to his strong conviction of the existence of a tight connection between ‘cor-
rectness of linguistic expression and correctness of thinking’ (‘Sprachrichtigkeit und Denk-
richtigkeit’) (Pasch, 1930b, p. 6),22 Pasch always struggled to find the most appropriate
terminology for expressing his ideas. For example, while he distinguished between ‘basic’
and ‘stem’ concepts and propositions in Pasch (1882a, pp. 74 and 98), he began referring to
the former as ‘core’ in Pasch (1916, p. 276), remarking that his original terms ‘Grundsätze’
and ‘Grundbegriffe’ were often understood in a different sense than he intended. Changes
in terminology also reflect changes in Pasch’s way of thinking. For example, the distinction
between ‘rough’ and ‘delicate’ mathematics (Pasch, 1918a, p. 230) was first introduced as
one between ‘consistent’ (‘konsistent’) and ‘disputable’ (‘strittig’) mathematics 2 years
earlier (Pasch, 1916, p. 275). Similarly, the distinction between ‘proper’ and ‘improper’
mathematics that Pasch introduces in Pasch (1914, pp. 153–157) was later reformulated as
one between ‘perfect’ and ‘imperfect’ mathematics (Pasch, 1918a, p. 230).23 In later years
Pasch also urged the use of different names for mathematical notions and their empirical
correlates, on the grounds that their conceptual differences can be easily overlooked if
they are both referred to by similar names, and he suggested the terms ‘location’, ‘path’,
‘segment’, ‘bowl’, and ‘plate’ (‘Stelle’, ‘Weg’, ‘Strecke’, ‘Schale’, ‘Platte’) as names for
the empirical conceptions of point, line, straight line, surface, and flat surface (Pasch, 1917,
p. 187).24
Pasch was very well aware of the tentative character of axiomatic presentations and he
continuously tried to improve on his previous work by publishing lists of corrections to
19 The preface of Vorlesungen (Pasch, 1882a) is dated ‘March 1882’, while that of Einleitung (Pasch,
1882b) is dated ‘May 1882’.
20 See Pickert (1980, pp. 49–57) for a list of the courses taught by Pasch in Giessen.
21 Interestingly, more publications by Pasch appeared in the two decades after his retirement than
before.
22 Pasch elsewhere describes the aim of mathematics as ‘the most complete clarity of thought and
of their linguistic expressions’ (Pasch, 1924a, pp. 39–40); for a practical example, see Pasch
(1887b, p. 132), where he introduces new terminology that allows for ‘more precise and shorter
formulations’.
23 The distinction between perfect and imperfect mathematics will be discussed in connection with
the notion of decidability in Section 4, below.
24 See also Pasch (1930a, p. 19) for Pasch’s use of ‘Rotte’ instead of ‘Reihe’, which he used in Pasch
(1909, p. 7).
PASCH’S PHILOSOPHY OF MATHEMATICS 99
earlier publications and slightly changing formulations even in reprints.25 Pasch’s attitude
toward foundational work is expressed quite tellingly in his review of a book by Dingler
on the notion of logical independence in mathematics, subtitled Also an Introduction to
Axiomatics (Dingler, 1915). Here Pasch criticizes the author for reprinting an obviously un-
finished article without further revisions and for not seriously trying to present a complete
set of core propositions (Pasch, 1916, p. 276). Pasch concludes that the book would need
further ‘patient work’ before being able to yield concrete results from the accumulated raw
material. In addition, Pasch demands higher standards regarding the ‘exactness of one’s
thought and expression’ and a more thorough self-criticism especially from somebody
who writes an introduction to axiomatics. There are good reasons to believe that he
held himself to such standards.26
§3. Pasch’s empiricism. Pasch’s version of empiricism, the main points of which I
will try to outline in this section, differs in important respects from the views held by his
contemporaries, but bears some resemblance to the views of Berkeley, Locke, and Hume.27
In contrast to the question as to which geometry is the ‘right’ description of space, which
was the driving force behind von Helmholtz’s form of empiricism, Pasch’s main concern
was the nature of the fundamental concepts of mathematics. A satisfactory account of this,
according to Pasch, must answer questions regarding the applicability of mathematics as
well as the epistemology and certainty of mathematical knowledge. In accord with my
thesis that the main elements of Pasch’s philosophy can be found already in his lectures
on projective geometry, I shall begin the discussion with Pasch’s remarks on the nature of
geometry.
In the opening sentence of his Vorlesungen über neuere Geometrie (Pasch, 1882a, V),
Pasch laments that the empirical origins of geometry have not been consistently brought
out in the recent treatments of this discipline that tried to meet the increased standards
of rigor, and he announces that his lectures aim at carrying out such a project. Shortly
after this pronouncement Pasch justifies his point of view by claiming that the successful
applications of geometry in daily life and in science are based on the fact that the geometric
concepts originally conformed exactly with empirical objects, and that only later were they
‘covered by a network of artificial concepts’ to foster the advancement of theoretical
developments. By restricting himself from the start to empirical concepts only, Pasch
intends to retain the character of geometry as a natural science.28 A few pages later he
repeats his resolve of steadfastly holding on to the empiricist standpoint, according to
25 This can be seen, for example, in the additions to the 1912 edition of his lectures on projective
geometry and the various (seemingly overly pedantic) corrections to previous publications that
he adds in later works. Just to mention a few, Pasch (1909) contains corrections to Pasch (1882a)
on pp. 117–188 and to Pasch (1882b) on p. 120; corrections to Pasch (1912) are listed in Pasch
(1914, VI) together with further corrections to Pasch (1909). A number of small changes in the
text can be found in the versions of Pasch (1894) reprinted in Pasch (1909), Pasch (1919), and
Pasch (1924a).
26 Pasch’s publications, as well as his autobiographical reflections and the descriptions of his
character and work ethic by people who knew him personally confirm this; see Pasch (1930b,
p. 10), Dehn (1928), Engel & Dehn (1934), and Tamari (2007).
27 See Jesseph (1993, pp. 44–87) and Pressman (1997); see also John Stuart Mill’s A System of Logic
(Mill, 1851), and Harré (2003).
28 The view that geometry is a natural science is frequently echoed by Hilbert; see Hallett & Majer
(2004, pp. 66, 197, 266, 504).
which ‘geometry is seen as nothing else but a part of natural science’ (Pasch, 1882a, p. 3).
Thus, Pasch presents his work from the outset as providing philosophical foundations for
projective geometry, in addition to purely mathematical ones (see Table 1, above). This
combination, which is rather unusual in its extent for a mathematical treatise, may have
come as a surprise and possibly also as an irritation to readers who did not share the aims
and methods of Pasch’s conception of mathematics and his philosophical project.29
Pasch’s original move, which characterizes his version of empiricism and sets it apart
from that of his contemporaries, is to take empirical concepts as the starting point for a
rigorous development of mathematical theories. This involves two main steps: First, the
stem concepts of a discipline have to be developed from the empirical core concepts, and
second, the remainder of the theory has to be based on the stem concepts alone. Note
that in order to guarantee the empirical character of mathematics as a whole, its theorems
must inherit the epistemological status of the axioms; it is at this point that Pasch’s
deductivism becomes fundamental for establishing his empiricism.
As a general and essential criterion for the choice of core concepts Pasch holds that they
should be able to explain how the mathematical concepts originated or at least how they
could have originated (Pasch, 1917, p. 190).30 Moreover, they should be as few as possible
and express the simplest content possible (Pasch, 1894, p. 24). Pasch also insists that the
basic terms of a mathematical theory can neither be defined nor can they be reduced to other
concepts, but that we can only understand them through reference to appropriate physical
objects (‘den Hinweis auf geeignete Naturobjecte’) (Pasch, 1882a, p. 16). In particular, the
principle of duality in projective geometry, that is, the fact that the basic terms in the stem
propositions can be interchanged systematically while yielding again valid propositions,
is taken by Pasch as evidence that these propositions cannot be regarded as definitions
of the basic concepts (Pasch, 1914, p. 143). This stands in direct contrast to the modern
understanding of axiom systems as implicit definitions of their primitive terms.31 Thus, in
geometry Pasch introduces points as those objects that cannot be further divided within
the limits of observation determined by the best tools that are currently available to us.
He also rejects the common view that lines must be ‘ “imagined” as being infinitely ex-
tended’, since such a demand does not correspond to any perceptible objects (Pasch, 1882a,
p. 4); instead, Pasch takes the notion of (finite) line segments as a core concept.
In addition to the demand that the basic objects of geometry should be observable, they
must satisfy some further restrictions in order to be usable. For all practical purposes,
configurations of physical geometric objects (i.e., figures or diagrams), Pasch explains,
must be such that, on the one hand, the observer is relatively close to them, and on the other
hand, that their parts are sufficiently close to allow for an immediate grasp of their rela-
tionships (Pasch, 1882a, pp. 18–19). As a consequence, one can have immediate evidence
that these relationships hold only within a relatively small, bounded region of space.32
29 That readers might be irritated is mentioned, for example, in Tamari (2007, pp. 77, 194–195). Two
examples: Russell speaks of Pasch’s ‘empirical pseudo-philosophical reasons’ (Russell, 1903, p.
393) and in a recent commentary Majer mentions some ‘curiosities’ that characterize Pasch’s
approach (Majer, 2004, p. 104).
30 For similar remarks, regarding the axiomatization of arithmetic, see Pasch (1924a, p. 16 [1915]).
31 That Pasch did understand axioms to implicitly define the primitives is claimed in Tamari (2007,
ii, p. 6, and 96). But compare the footnote in Pasch (1920, p. 145), in which Pasch explicitly
denies such an interpretation; see also Gabriel (1978).
32 For similar views on these fundamental assumptions Pasch refers to Riemann (1854, p. 266),
Klein (1871, pp. 576 and 624), and Klein (1873b, p. 134).
In a similar vein, Pasch also notes that general terms (universals) are introduced only with
reference to a finite number of particular objects (Pasch, 1914, p. 3).
For Pasch, the further development of a mathematical discipline proceeds from obser-
vations to propositions. In geometry, repeated observations of concretely given figures
yield simple relations between the basic concepts, some of which are formulated as basic
propositions, from which all other propositions of geometry follow. For example, two of
Pasch’s basic propositions are: ‘I. Between two points one can always draw one and only
one segment’ and ‘VI. Given any two points A and B, it is possible to choose a point C,
such that B lies within the segment AC ’ (Pasch, 1882a, p. 5). Given the empirical referents
of the primitive terms, Pasch notes that these propositions do not hold in general, but
that they are subject to certain restrictions. In order to draw a segment between any two
points, these points must be sufficiently apart from each other, while the points A and B
of basic proposition VI must be sufficiently close to each other to allow for the actual
construction of the third point. These conditions are met in the usual diagrams or mental
visualizations that accompany mathematical investigations, but they must be made explicit
and kept track of in the deductive development of geometry. Theorems that depend on
the above basic propositions are thus also subject to restrictions. For example, the
construction expressed in theorem 8, which states that ‘Given two points A and B on a line,
it is always possible to choose a point C on that line, such that C lies between A and B’
(Pasch, 1882a, p. 10), and which is proved using the above axioms, likewise cannot be applied
indefinitely often (i.e., the ‘always’ must be taken with a grain of salt).
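The restriction on theorem 8 can be made concrete with a small numerical sketch. It is only an illustration under assumptions that are not Pasch’s own: points are modeled as real numbers on a line, the constructed point C is taken to be the midpoint of A and B, and the parameter `resolution` is a hypothetical stand-in for the limits of observation. The function name is likewise invented for this example.

```python
def count_constructible_midpoints(a: float, b: float, resolution: float) -> int:
    """Count how often a point C can be placed between A and B.

    Assumption (not from Pasch): two points are distinguishable only if
    they lie at least `resolution` apart, and C is chosen as the midpoint.
    """
    steps = 0
    # A new point C is admissible only while both halves of the segment
    # remain at least `resolution` long.
    while (b - a) / 2 >= resolution:
        b = (a + b) / 2  # C takes the place of B for the next round
        steps += 1
    return steps


# With a unit segment and a resolution of 0.1, the construction
# can be repeated three times before the points become indistinguishable.
repetitions = count_constructible_midpoints(0.0, 1.0, 0.1)
```

On this reading the ‘always’ of theorem 8 holds only as long as the loop condition is satisfied, which is one way of picturing the restrictions that Pasch wants to keep track of.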
While Pasch does not give specific arguments for his empiricist standpoint in his early
writings, he does provide an argument in Pasch (1914, pp. 138–139), which is based on
the applicability of mathematics. In order to apply mathematical propositions to the world,
the concepts that occur in them must be related to things that occur in experience, which
is straightforward if they are understood to refer to empirical notions from the outset.
If, however, mathematical concepts are not understood as referring to empirical objects,
in which case Pasch calls them ‘hypothetical concepts’, their applicability rests on two
sets of hypotheses: First, the axioms themselves are purely hypothetical, and second, the
association between mathematical concepts and their empirical correlates is hypothetical,
too. The position that Pasch describes here bears strong similarities to that of a hypothetico-
deductive account of mathematics, which must be augmented by ‘coordinative definitions’
to be applied.33 From Pasch’s empiricist standpoint, however, these two kinds of hypothe-
ses present a detour that does not add any benefits, so that the empiricist approach is simpler
and thus to be preferred. In Pasch (1917, pp. 185–186) Pasch remarks that ‘hypothetical
geometry’ is completely independent from physical objects (‘Naturgegenständen’), which
becomes completely obvious if the terms ‘thing of the first, second, and third kind’ are used
instead of ‘points, lines, planes’, as was suggested by Hilbert (1899). From a mathematical
point of view this way of proceeding is unobjectionable for Pasch, but it leaves the relation
to figures and applications unexplained. More generally, he maintained that, although
the problem of applicability had been widely discussed from a nonempiricist
standpoint, no satisfactory solution had yet been given.
A second argument for empiricism is presented in Pasch (1922, pp. 3–4) and Pasch
(1924a, p. 44). Here Pasch notes that different viewpoints regarding the nature of geom-
etry, for example, that it is ‘a pure creation of human thought’, find their expression in
33 See Reichenbach (1957, p. 14) or Nagel (1961, p. 93); Hilbert’s account of the application of
mathematical theories is also similar.
introductory textbooks. However, a closer look at these expositions also reveals that none
of them remains completely consistent in presenting geometry from a single point of view.
For example, a ‘body’ may have been defined as a part of space that is delimited on all
sides, but it is later said to be moved, despite the fact that a part of space is not something
movable (Pasch, 1924a, p. 44). Without going into further details Pasch argues that it is
impossible to purge all allusions to experience from any introduction to geometry, and
thus, that a coherent presentation should treat geometry as an empirical science. The lack
of a textbook that consistently pursues the empiricist standpoint is explained by the fact
that much more work needs to be done to lay bare the prescientific foundations that such a
presentation would require.34 Nonetheless, Pasch also recommends teaching geometry in
school by starting with empirical notions, since they are what seems to come most naturally
to beginners (Pasch, 1909, pp. 134–135).
34 Pasch mentions Thaer & Lony (1915) as a valuable attempt in this direction.
35 On Eduard Study, see Hartwich (2005); I intend to discuss the interactions between Pasch and
Klein in a subsequent paper.
36 Quotations of this passage can be found, for example, in Nagel (1939, p. 197), Kennedy (1972,
p. 133), Torretti (1978, p. 211), Shapiro (1997, p. 149), Boniface (2004, p. 134), Detlefsen (2005,
p. 251).
This passage has been referred to as the ‘birthplace of modern axiomatics’37 and is
the basis for Tamari’s reference to Pasch as the ‘father of modern axiomatics’ (Tamari,
2007, title), and Freudenthal’s remark that ‘[t]he father of rigor in geometry is Pasch’
(Freudenthal, 1962, p. 619). How quickly Pasch’s conception of mathematical deduction
became widely accepted can be gleaned from the fact that 8 years later Klein mentions
Pasch as espousing an ‘almost generally held view’, according to which in geometric
considerations one has to rely only on axioms without making any use of intuition (Klein,
1890, p. 571). Pasch never tired of emphasizing the importance of
this understanding of deduction, which he also referred to as ‘the genuine mathematical
method’ (Pasch, 1918a, p. 228) and as an ‘imperative’ (‘Gebot’) for mathematical research,
which is completely independent of any position regarding the philosophical foundations
of mathematics one might want to adopt (Pasch, 1917, p. 188). As he repeated over three
decades after his lectures on geometry were published, mathematical proofs must remain
valid if the basic concepts are replaced throughout ‘by any concepts or by meaningless
signs’ (Pasch, 1914, p. 120).38 He refers here to this method as ‘a formalism, that has to
be carried downright to the extremes’ in the development of mathematics (Pasch, 1914,
p. 121; emphasis in original), and concludes emphatically: ‘This formalism is the lifeblood
(‘Lebensnerv’) of mathematics’ (Pasch, 1914, p. 121).39
Replacing meaningful terms by variables, for example, changing ‘There are points’ to
‘There are αs’, is the key to formalization, according to Pasch. He emphasized that geomet-
ric arguments must remain valid even if the geometric terms are replaced by code names
(‘Decknamen’) like ‘P-thing, G-thing, and E-thing’ (Pasch, 1918a, p. 231).40 In general,
for a mathematical proof to be rigorous it must rest only on propositions that allow such
substitutions and whose inferences remain valid under such transformations (Pasch, 1926c,
p. 263). This analysis leads Pasch to distinguish between ‘material words’ (‘Stoffwörter’)
and ‘joins’ (‘Fügemittel’) (Pasch, 1926c, pp. 243 and 263). As Pasch explains, the former
are meaningful terms that denote concepts, which are the material (‘Stoff ’) of the propo-
sition, like ‘two’, ‘points’, ‘segment’, and ‘endpoint’. The joins constitute what is needed
to connect the material words in order to express relations between the denoted concepts,
and they include what are now called the logical parts of expressions.41 They are called
the ‘scaffolding’ (‘Gerüst’) of a stem in Pasch (1924a, p. 11 [1915]), and in Pasch (1894,
p. 21) he explains that in order to carry out deductions one only needs to understand ‘those
parts of language that are common to all domains of thought’ (‘Denkgebieten’). This allows
Pasch to reformulate his understanding of deduction as follows:
The mathematical proof has nothing to do with the meaning of the ma-
terial words; it depends ultimately only on the joins and thus presents a
pure formalism. (Pasch, 1926c, p. 263; emphasis in original)
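The idea that a proof depends only on the joins can be illustrated with a minimal sketch. The code below is an assumption-laden toy and not a reconstruction of Pasch’s own analysis: it handles only statements of the form ‘All s are p’, treats the material words as opaque labels, and checks that an inference remains valid after the labels are replaced by code names. The function names and the example terms are hypothetical.

```python
def entails_all(premises, conclusion):
    """Decide whether 'All s are p' follows from premises of the same form.

    Terms are opaque labels: the procedure only compares them for equality
    and never consults their meaning.
    """
    subject, predicate = conclusion
    reachable, frontier = {subject}, [subject]
    while frontier:
        term = frontier.pop()
        for s, p in premises:
            if s == term and p not in reachable:
                reachable.add(p)
                frontier.append(p)
    return predicate in reachable


def rename(statements, mapping):
    """Replace every material word by its code name."""
    return [(mapping[s], mapping[p]) for s, p in statements]


# Meaningful material words ...
premises = [("square", "rectangle"), ("rectangle", "parallelogram")]
conclusion = ("square", "parallelogram")

# ... and code names in the style of Pasch's 'P-thing, G-thing, E-thing'.
code_names = {"square": "P-thing", "rectangle": "G-thing", "parallelogram": "E-thing"}
coded_premises = rename(premises, code_names)
coded_conclusion = rename([conclusion], code_names)[0]

valid_before = entails_all(premises, conclusion)
valid_after = entails_all(coded_premises, coded_conclusion)
```

Both checks give the same verdict, since the procedure never looks at what the labels mean; this is the sense in which the inference survives the substitution of code names.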
37 See Engel & Dehn (1934, p. 133) and Pickert (1982, p. 271).
38 At this point Pasch does not distinguish terminologically between words and concepts, that is,
between linguistic entities and their meanings. He addresses this distinction, however, in Pasch
(1926c). See also Pasch (1909, p. 1).
39 This remark is echoed in Pasch (1926c, p. 263).
40 For similar considerations, see also Dedekind’s letter to Lipschitz, July 27, 1876 (Dedekind,
1932a, p. 479), and Hilbert’s letter to Frege, December 29, 1899 (Frege, 1980, p. 40).
41 It appears that Pasch’s distinction between material words and joins is intended to distinguish
nonlogical from logical components of expressions in a natural language.
Formalization, as the most reliable touchstone for the validity of proofs, can be dis-
pensed with if one is very careful, but this is very difficult, Pasch warns, as the gap in
Euclid’s first proof illustrates (Pasch, 1926c, p. 140). Nevertheless, Pasch acknowledges
the usefulness of diagrams in mathematical practice, as will be discussed below (p. 107).
Pasch (1914, pp. 121–137) discusses at some length four historical case studies of math-
ematical errors made by Ampère, Cauchy, Dirichlet, and Hasse, which he traces back to
a lack of rigor in the development. Such rigor can be achieved through formalization,
which for Pasch is a powerful technique to ascertain the logical validity of arguments.
As such, Pasch’s notion of formalization does not involve the presentation of mathematical
reasoning in a symbolic language like Peano’s or in a completely formalized language
like Frege’s. Pasch explicitly distanced himself from these approaches and promoted for-
malization only to the extent that it remained compatible with ordinary mathematical
practice.42
Pasch realized that his understanding of deduction requires addressing the question of
what counts as a mathematical proof, and in a letter to Frege from 1894 he expressed his
surprise at finding how rarely this topic had been seriously investigated.43 In a lecture on
the value of mathematical education delivered in the same year, Pasch pointed out that
mathematical proofs serve two main goals. Originally, they were a means for ‘discovering
new properties of figures and numbers’, but later they were also employed for examining
the ‘logical dependencies’ among propositions (Pasch, 1894, pp. 23–24). The second point
is vividly illustrated by Pasch’s own investigations in Pasch (1882a), in particular the
discussions of various equivalent axiomatizations in Section 1. Pasch did not consider the
study of mathematical inferences as a subject matter of mathematical research per se, but
as a matter of independent and general importance properly belonging to the domain of
philosophy (Pasch, 1914, p. 33). As is evident from the correspondence with Frege, Pasch
showed great interest in Frege’s work, but he also remarked that due to his age and the
heavy demands on his time he was not in a position to familiarize himself with Frege’s
notation.44 Nevertheless, Pasch undertook his own investigations of the notion of proof in
order to give an account of the necessary conditions for valid mathematical inferences that
apply to informal arguments as well as to those presented in a formal language. While his
investigations remained only in the fledgling stages, Pasch expressed the hope that they
might lead to a ‘renewal of logic’ and that ‘the indicated path will lead to the main features
of a logic that does justice to the accomplishments of mathematics’ (Pasch, 1918a, p. 232).
The position that Pasch arrived at is that
[i]t is part of the essence of pure deduction that every proof can be
‘atomized’, i.e., resolved into steps of certain kinds, or that it consists
of a single such step. (Pasch, 1917, p. 189)
In his ‘Begriffsbildung und Beweis in der Mathematik’ (1925) Pasch illustrates and dis-
cusses in great detail how the Aristotelian syllogistic form Barbara, that is, the deduction
of ‘All As are Cs’ from the premises ‘All As are Bs’ and ‘All Bs are Cs’, can be atomized
into 16 individual steps. According to Pasch’s analysis, each of these steps in the deduction
42 See Pasch’s letter to Klein, October 19, 1891; held at the Staats- und Universitätsbibliothek
Göttingen, Sig. Klein 11, 184.
43 Pasch’s letter to Frege, February 11, 1894 (Frege, 1980, p. 103).
44 Letter from Pasch to Frege, January 18, 1903 (Frege, 1980, p. 105). The preserved correspondence
with Frege consists of seven letters from Pasch in the period from 1894 to 1906.
is the reformulation of the same content in other words, or the dissection of the original
content while retaining only a part of it, or the combination of the contents of previous
steps, or a definition.45 Pasch concludes that the most basic inferential steps must be such
that one can decide, by a general method in a finite amount of time or of steps, whether
they are valid or not. He also discusses questions that he considers to be undecidable, such
as whether a given proof can be rendered gap free and whether a given formula is derivable
from a set of assumptions. Pasch appeals here to a notion of decidability that he attributes
to Kronecker and which he recognized only during his work on Grundlagen der Analysis
(1909) as being of fundamental importance.46 Pasch (1914, pp. 153–157) distinguishes
between ‘proper mathematics’, which makes use only of decidable notions, and ‘improper
mathematics’, which does not, but which is much more common. To avoid speaking of
‘improper’ mathematics, Pasch later changed the terminology to ‘perfect’ and ‘imperfect’
mathematics (Pasch, 1918a, p. 230).
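The idea of atomizing a deduction can be illustrated with a modern proof assistant. The following Lean 4 sketch is not Pasch’s own 16-step analysis, which is not reproduced here. It merely resolves Barbara into a handful of elementary steps, each of which a checker can verify mechanically in finitely many steps. The names `barbara`, `hAB`, and `hBC`, and the rendering of ‘All As are Bs’ as a universally quantified implication over a type `α`, are assumptions made for illustration.

```lean
-- Barbara: from 'All As are Bs' and 'All Bs are Cs' infer 'All As are Cs'.
-- Each line of the proof is one small, mechanically checkable step.
theorem barbara {α : Type} (A B C : α → Prop)
    (hAB : ∀ x, A x → B x)   -- premise: All As are Bs
    (hBC : ∀ x, B x → C x) : -- premise: All Bs are Cs
    ∀ x, A x → C x := by
  intro x hA                          -- consider an arbitrary x that is an A
  have hAB_x : A x → B x := hAB x     -- retain only the part of premise 1 about x
  have hB : B x := hAB_x hA           -- combine it with the assumption on x
  have hBC_x : B x → C x := hBC x     -- retain only the part of premise 2 about x
  have hC : C x := hBC_x hB           -- combine again
  exact hC                            -- this is the required conclusion
```

The instantiation steps loosely correspond to what Pasch describes as retaining only a part of a given content, and the applications to combining the contents of previous steps; this correspondence is only suggestive.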
In his 1894 lecture mentioned above Pasch remarks that neither Euclidean nor non-
Euclidean geometries contradict any facts of experience, but that nevertheless these
systems could still be inconsistent (‘einen inneren Widerspruch enthalten’), ‘because
experience only refers to approximated usability, which is quite compatible with certain
inconsistencies’ (Pasch, 1894, p. 31). He also maintains that explicit and complete proofs
for the consistency of both geometries are still lacking, and suggests that such proofs
could be based on analytic means, which would settle the question at least for those
who consider the consistency of analysis as necessary (Pasch, 1894, pp. 31–32). More
than 20 years later Pasch took up the issue of consistency again in a lecture ‘Über innere
Folgerichtigkeit’ (1915), which was published in Pasch (1919). Here he introduces a
classification of inconsistencies, two of which are ‘internal’ to a theory, while the other
two concern applications of theories. An inconsistency of the first level occurs within
a single sentence or between two given sentences. Since they involve only a finite set
of sentences, such inconsistencies are ‘decidable’ (granting the investigator a sufficiently
long life and a big enough memory). An inconsistency of the second level, however, is one
between consequences of a given stem, which are in general infinite in number, and Pasch
notes that we do not have any general process by which we could decide whether such an
inconsistency obtains or not, since this would involve the determination of all consequences
of a set of axioms. As a method for establishing consistency Pasch explains how a given
set of meaningful propositions can be ‘formalized’, resulting in an ‘empty stem’, which
in turn can be ‘realized’ by replacing the meaningless symbols by meaningful concepts,
yielding a ‘filled stem’ (Pasch, 1926a, p. 11 [1915]). If a realization of a formalized stem is
consistent, Pasch argues, then the original stem is also consistent. In modern terminology,
Pasch here describes the notion of relative consistency proofs. To ‘arithmetize’ a stem,
which often depends on a ‘felicitous idea’, amounts to showing its consistency relative
to arithmetic.47 Pasch defends the standpoint that arithmetic is indeed consistent, while
the consistency of any other mathematical discipline must be established by a proof.48
45 See also Pasch (1924a, p. 38), where Pasch refers to his earlier discussions of proofs in Pasch
(1909), Pasch (1912), and Pasch (1914).
46 See Pasch’s discussion of the notion of decidability in Pasch (1914, pp. 153–157), Pasch (1918a),
and Pasch (1927, pp. 88–93).
47 Pasch presents this terminology as if it were his own. References to Weierstrass, Kronecker, and
Klein are conspicuously missing; see Klein (1895) and the discussion in Boniface (2007, p. 332).
48 Kronecker’s views on the natural numbers seem to loom in the background of this discussion, but
Pasch never mentions them explicitly (see also Footnote 9).
To show the consistency of arithmetic one would have to show that an axiomatization of
arithmetic itself is consistent, or, in other words, this should follow from arithmetic itself.49
Pasch argues that this is indeed the case by appealing to the intimate connection between
arithmetic, thought, and language. The source of arithmetic, which finds expression in its
core propositions, Pasch contends, is necessary for thought in general and its constituents
are so extremely primitive that we are not consciously aware of them. We have committed
ourselves to their content when we made experiences and fixed them in language. It follows
that these propositions and their consequences are binding for us, which justifies, for
Pasch, our belief in their consistency (Pasch, 1924a, p. 17 [1915]). It is worth pointing
out that Pasch does not mention any empirical considerations in his discussions of the
consistency of mathematical theories, except in the argument for grounding the consistency
of arithmetic.50
The value of the deductive method in mathematics, for Pasch, is that it excludes all
arbitrariness in proofs and thus renders them unassailable, which, together with the empirical
evidence for the core propositions, is the basis for our ascribing the ‘highest level of
reliability’ to mathematics (Pasch, 1882a, p. 100). Notice how, as a thoroughgoing empiricist,
Pasch does not speak of the necessity of mathematics, but only of its reliability.51 For all
practical purposes mathematical knowledge is as good as certain. Although strict adherence
to the deductive method in mathematics might lead to more long-winded expositions, it has
two further advantages for mathematical practice, according to Pasch. Firstly, proofs that
have been carried out without any appeal to the meanings of the nonlogical primitives
occurring in them are reusable, in the sense that replacing the terms in the assumptions in
such a way that they become true statements automatically also yields true statements
for the conclusions if the terms are replaced accordingly (Pasch, 1882a, pp. 98, 100).
In this way new mathematical results can be obtained ‘in a purely mechanical fashion’
without having to repeat the derivations (Pasch, 1914, p. 120). Secondly, a deductive
presentation of a domain can be exploited to determine which concepts and propositions
are necessary or dispensable for the theoretical development of the discipline (Pasch,
1882a, p. 100).52
As mentioned above, Pasch never suggested that ordinary mathematics should be carried
out in a formal system and he seriously doubted the feasibility of such an undertaking.
Instead, formalization is only a technique, albeit a very powerful one, for ascertaining the
rigor of deductions. Already in the first edition of his Vorlesungen (1882) Pasch notes that
it is admissible and useful to think about the meanings of the geometric terms during a
deduction, but that as soon as this becomes necessary the incompleteness of the deduction
49 References to Hilbert (1900) or Hilbert (1905) are again conspicuously missing from Pasch’s
discussion.
50 See Pasch (1894, p. 17) and Pasch (1909, p. 134), which are referred to in Pasch (1917, p. 185).
It is also noteworthy that Pasch discusses neither Dedekind’s nor Peano’s axiomatizations of
arithmetic (Dedekind 1888; Peano 1889a); indeed, in Pasch (1927, p. 90) he remarks that there is
no generally accepted set of core propositions for arithmetic. It is possible that he did not accept
Dedekind’s notions of system and mapping as being empirically grounded, and that he objected
to Peano’s use of a purely symbolic language.
51 Pasch only rarely speaks of the truth of the core propositions; for example, he refers to them as
‘basic truths’ (‘Grundwahrheiten’) in a talk to a general audience (Pasch, 1894, p. 21). I think
Pasch would agree with Einstein’s famous remark that ‘As far as the laws of mathematics refer to
reality, they are not certain; and as far as they are certain, they do not refer to reality’, quoted from
Hempel (1945, p. 17).
52 See also Pasch (1924b, p. 233).
§5. Pasch’s programme. Given the distinction Pasch makes between the mathematical
and philosophical foundations of a discipline, the general problem arises of connecting
these two. In Pasch’s case, this problem is exacerbated by the fact that he opted for the
philosophical foundations to be grounded empirically, instead of, for example, in a Platonic
realm of mathematical objects. Thus, his overall framework is put under stress by the
tension between his empiricism and his deductivism. Pasch’s programme, intended to ease
this tension, consists in finding adequate ways of combining these two standpoints and in
developing deductive theories from empirical cores.
Since the reader of Pasch’s Vorlesungen (1882) might easily miss the general aim and
structure of his approach, the development can appear unmotivated and needlessly
cumbersome. In fact, the stem concepts and propositions of projective geometry,56 which
one would expect to find at the beginning of the book, are only introduced in section
10, after almost 100 pages of long-winded deductions and definitions from the empirical
core concepts and propositions that Pasch starts out with. A reader interested in studying
projective geometry may well wonder what the first 100 pages are all about. It is only in
‘Grundfragen der Geometrie’ (1917) that Pasch explicitly presents his overall conception
of geometry (presented in Section 2, above) and discusses his method of ‘extending the
meanings of concepts’. This is taken up again 5 years later and discussed in detail with
reference to the ‘deep contrast’ (‘tiefen Gegensatz’) between ‘physical geometry’ and
‘mathematical geometry’ (Pasch, 1922, pp. 362–363).
Let us take a look at how Pasch presents these matters in his 1882 lectures on geometry.
Here he describes the relation between mathematical theories and their empirical
foundations by stating that ‘[m]athematics sets up relations between the mathematical concepts,
which should correspond to facts from experience’ (Pasch, 1882a, p. 17). This makes it
sound as if all mathematical propositions have direct empirical correlates. However, while
this characterization might well have been an ideal that Pasch had in mind at the time,
it does not square with his own way of developing the axioms of projective geometry
in his lectures. Pasch’s approach is captured more accurately in his later, more nuanced
reflections, in which he only speaks of correlates that have been developed from empirical
propositions.
Once the empiricist has completed the substructure, he can attach to it
the theory that I referred to as mathematical geometry without changing
the wording. He would then, whenever one speaks of points, understand
it as ‘mathematical points’, a concept which has been developed from
the physical point in the substructure. (Pasch, 1924a, p. 43)
Thus, Pasch’s strategy for bridging the gap between empirical and mathematical
concepts can be characterized as follows: Start with empirical core concepts and
propositions, and develop theorems through definitions and deductions, which can be used as
correlates of the mathematical stem propositions of a particular discipline. This enterprise
may involve two different kinds of moves: (a) the lifting of empirical restrictions, and (b)
the definition of new concepts that extend previous ones. These are illustrated in what
follows.
Consider the statement ‘Between any two points on a line segment there is another
point’. Taken as an empirical statement, it is false. Due to the limits of our perception
and the fact that points must be extended to be observable, two points might just be
so close to each other that there is not enough space to fit another point between them.
Thus, if the statement is to be understood as expressing a core proposition about empirical
points, it must be augmented with the proviso that the points in question be sufficiently
far apart from each other. As a mathematical proposition, however, the above statement can
be accepted without any further restrictions on the relative locations of the points. Thus,
we can obtain mathematical statements from empirical ones by simply dropping certain
additional conditions, which is one way of connecting a mathematical theory with its
empirical substructure.
56 They are 22 in total: 8 for line segments, 4 for the plane, and 10 for congruency; the latter are
common to Euclidean and non-Euclidean geometry.
However, not all mathematical propositions can be obtained by simply removing
restrictions that are necessary for empirical ones. If this is the case, Pasch resorts to a technique
he refers to as ‘extending the meanings of concepts’ (‘Begriffserweiterung’) (Pasch, 1882a,
p. 64) which he employs, for example, for points, lines, and planes in Pasch (1882a)
and for numbers in Pasch (1882b). To motivate this technique, Pasch also mentions two
extensions of concepts from the history of mathematics: The notion of number originally
meant only positive rational numbers, but was extended at some point in history to include
also negative numbers, while the notion of power was originally used only for natural
numbers as exponents, but was also extended to include negative and rational numbers in a
similar way (Pasch, 1882a, pp. 40–41). Analogously, Pasch introduces the concept ‘thing’
for concrete objects, but extends it in Pasch (1909, p. 20) to include sequences and in Pasch
(1909, p. 94) to include infinite sets.57
Since the method of extending the meanings of concepts plays a crucial role for
developing deductive mathematical theories from an empirical core, I will present next how Pasch
proceeds to extend the meanings of ‘numbers’ and then of ‘bundle of lines’ and ‘points’.
Pasch extends the meaning of the concept ‘number’ in his Einleitung in die Differential-
und Integralrechnung (1882) to include also irrational numbers on the basis of Dedekind
cuts (Dedekind, 1872). He begins by introducing the word ‘number’ as referring only
to positive whole numbers and their quotients, that is, to nonnegative rational numbers,
and assumes that for these the relations of equality, greater, and less than, as well as the
operations of addition, subtraction, multiplication, division, and exponentiation are known.
After deriving a few basic theorems from these assumptions, Pasch notices that not every
number can be represented as the power of another number (e.g., that there is no number x
in the domain, such that x^3 = 25). Following Dedekind, he considers all numbers whose
nth power is less than a given number a as forming a ‘group’ (‘Gruppe’), which Pasch
calls ‘number segment’ or just ‘segment’. Then he notes that for some numbers a and n
there is a least number that does not belong to the corresponding number segment (e.g.,
for a = 25 and n = 2 this least number is 5), but that for others there is no such least
number (e.g., for a = 25 and n = 3). Pasch calls those number segments with a least upper
bound rational and the others irrational, and proceeds to define the relations of equality,
greater, and less than, as well as the operations of addition, subtraction, multiplication,
and division for number segments. On the basis of these definitions he argues that all
expressions that involve these notions for numbers also hold of number segments, both
rational and irrational. With regard to powers, Pasch shows that, unlike in the case of
numbers, every number segment can be represented as the nth power of another segment,
and he shows that the powers remain well defined and obey the familiar laws not only
for every rational number segment, but also for every irrational one. This allows Pasch
to notice that the computations with segments completely subsume the computations with
numbers, but also go beyond them, since they allow for the unrestricted application of
the inverse operation of taking the power. Once the theory of number segments (which is
‘more complete’ (Pasch, 1882b, p. 11), since subject to fewer restrictions than the theory
of numbers) is adopted, the term ‘number’ plays no particular role any more, since it can
be replaced throughout by ‘rational number segment’. This observation motivates Pasch to
57 Another example of the extension of concepts concerns the notion of limit, see Pasch (1918b).
I am grateful to an anonymous reviewer for bringing this to my attention.
dispense with the old meaning of ‘number’ and use this term for number segments instead,
so that now we can speak of ‘rational’ and ‘irrational numbers’.58
In sum, Pasch’s extension of the meaning of the term ‘numbers’ proceeds in three steps:
first, it is taken to refer only to nonnegative rational numbers; then, rational and irrational
number segments are introduced, the former of which correspond to numbers; finally, the
term ‘number’ is applied to number segments in general, which also allows one to speak of
‘rational’ and ‘irrational numbers’. Further extensions of the domain of numbers to include
negative numbers, zero, infinity, and imaginary numbers are also mentioned by Pasch, but
he does not present them in detail.
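The distinction between rational and irrational number segments can be made concrete in a small computation. The following Python sketch is only an illustration under stated assumptions, not Pasch’s own procedure: a segment is represented by the pair (a, n), membership is tested by the condition x^n < a on positive rationals, and the function names `in_segment`, `integer_root`, and `least_excluded` are hypothetical. It relies on the standard fact that a positive rational in lowest terms has a rational nth root exactly when its numerator and denominator are both perfect nth powers.

```python
from fractions import Fraction
from typing import Optional


def in_segment(x: Fraction, a: Fraction, n: int) -> bool:
    """True if the positive rational x belongs to the segment of all x with x**n < a."""
    return x > 0 and x ** n < a


def integer_root(m: int, n: int) -> Optional[int]:
    """Return the non-negative integer r with r**n == m, or None if there is none."""
    lo, hi = 0, max(1, m)
    while lo <= hi:  # binary search on exact integers
        mid = (lo + hi) // 2
        power = mid ** n
        if power == m:
            return mid
        if power < m:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


def least_excluded(a: Fraction, n: int) -> Optional[Fraction]:
    """Least positive rational not in the segment for (a, n), or None if none exists.

    Assumes a > 0. Fraction keeps a in lowest terms, so a rational nth root
    exists exactly when numerator and denominator are perfect nth powers.
    """
    num = integer_root(a.numerator, n)
    den = integer_root(a.denominator, n)
    if num is None or den is None:
        return None  # an 'irrational' segment in Pasch's terminology
    return Fraction(num, den)  # a 'rational' segment


print(least_excluded(Fraction(25), 2))  # 5
print(least_excluded(Fraction(25), 3))  # None
```

The two calls reproduce the examples from the text: for a = 25 and n = 2 the least number outside the segment is 5, while for a = 25 and n = 3 there is no such number.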
To show the usefulness of the newly introduced concept of number, Pasch discusses
the measurement of straight lines. Empirical measurements, he notes, can only be made
up to a certain limit of accuracy, but mathematics aims at establishing general rules that
are independent of limitations of what can be observed (‘Beobachtungsverhältnisse’). This
can be achieved by admitting also irrational numbers, since then no knowledge of any
particular threshold of accuracy is required (Pasch, 1882b, p. 13).59
It is informative to notice the striking contrast between Pasch’s and Dedekind’s
presentations of the introduction of irrational numbers. While Dedekind uses abstract set-theoretic
terminology and considers the real numbers to correspond to points on a line, Pasch uses
more concrete terminology in his approach and takes the limitations of our empirical
interactions with lines as the starting point, distinguishing between the calculation of the
length of a line and its measurement. Moreover, while Dedekind clearly distinguishes
between a cut and its corresponding number, Pasch redefines the term ‘number’ to refer
to cuts, but, as was not uncommon at the time, he does not distinguish carefully between
the term ‘number’ and the concept of number. Since Pasch obviously would not want to
assert that a number has infinitely many elements, he must restrict his number talk to only
certain properties of cuts, but he completely avoids addressing this issue. Pasch also does
not comment on the problem that the uncountability of the irrational numbers might pose
for his empiricist approach.
The latter difficulty points to a more general issue with Pasch’s programme, namely the
exact specification of the means that he regards as admissible for the development of new
concepts from given ones. Since Pasch’s attitude is not revisionist, he must be open in
principle to accept the results that are obtained by any method used in mathematics. One
way of showing the compatibility of mathematical practice with his empiricist standpoint
is to find ways of achieving the same results by licensed methods.
In the case of projective geometry, it was accepted practice to introduce ideal points
as ‘points at infinity’ where parallel lines meet.60 Such a definition, however, does not
conform to Pasch’s empiricist standards, because infinity is not an empirical notion, and so
he set out to introduce these objects by other means. In his lectures on projective geometry
Pasch extends the meaning of ‘bundle of lines’ and ‘point’ (Pasch, 1882a, pp. 33–46).
On the basis of the notions of points and lines, the latter of which he defined using the
58 In an unusually opinionated review for the Jahrbuch für Fortschritte in der Mathematik Pasch’s
redefinition of ‘number’ was severely criticized for being circular by Hoppe, who insists that the
concepts of number segment and irrational number should be kept apart (Hoppe, 1882).
59 Pasch always remained sceptical with regard to the applications of irrational numbers outside of
mathematics. After explaining how the square root of 2 arises from considerations regarding the
diagonal of a unit square, he writes in Pasch (1909, p. 99): “It remains open as to whether every
irrational number corresponds to a problem outside of analysis.”
60 See Torretti (1978, p. 111).
core concept of line segment, Pasch initially defines a bundle of lines as the collection of
lines that meet in a common point. He then proves that for lines e, f , and g the relation
‘g belongs to the bundle e f ’ can be defined using the property of being coplanar, but
without making any reference to the point in which e and f meet. Thus, this relation can
also hold between lines e, f , and g, if e and f have no point in common, and Pasch
takes it as the defining characteristic for an extended notion of bundle of lines. Also for
these bundles Pasch shows that they are determined by any two lines that belong to them.
Moreover, if two lines in a bundle meet in a point P, then all lines of the bundle meet
in that same point, so that some bundles can be said to correspond to a unique point,
namely P, and these are the ones that were formerly referred to as bundles and are now
called proper bundles. Other bundles, however, may contain lines that do not intersect,
such that there is no obvious relation between these bundles and particular points, and
they are called improper bundles. Finally, after having defined these notions and proved
some properties about them, Pasch extends the meaning of the term ‘point’ to refer to
bundles of lines instead (just as he extended the meaning of the term ‘number’ to refer
to number segments). Those bundles that correspond to a point in the original sense are
then referred to as proper points, while the others are called improper points. Thus, the
extended meaning of the relation of ‘line l goes through point P’ is that of ‘line l belongs
to bundle of lines P’. The advantage of this change in terminology is that previous axioms
and theorems about points and lines remain valid also under the extended meaning. For
example, ‘For any two points there is a line that goes through both of them’ is also
valid if ‘point’ refers to a bundle of lines. In addition, now also statements that were false
under the original restricted understanding of points become true, if understood as referring
to points in the extended sense, for example, ‘Two lines in a plane always have a point
in common’. Pasch’s improper points had previously been treated as ideal elements in
projective geometry and Torretti describes Pasch’s approach of introducing these elements
only on the basis of ostensive concepts and empirically justifiable axioms as his ‘most
remarkable feat’ (Torretti, 1978, p. 213).
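The key move, namely defining ‘g belongs to the bundle e f’ without mentioning a point of intersection, has a simple analogue in coordinates. The following Python sketch is not Pasch’s construction, which is synthetic, works with coplanarity in space, and precedes any introduction of coordinates. It assumes instead that lines in a plane are given by integer triples (a, b, c) standing for ax + by + c = 0, that e and f are distinct lines, and it uses the hypothetical names `det3`, `in_bundle`, and `is_proper`.

```python
from typing import Tuple

Line = Tuple[int, int, int]  # (a, b, c) stands for the line a*x + b*y + c = 0


def det3(e: Line, f: Line, g: Line) -> int:
    """Determinant of the 3x3 matrix whose rows are the coefficient triples."""
    a1, b1, c1 = e
    a2, b2, c2 = f
    a3, b3, c3 = g
    return (a1 * (b2 * c3 - b3 * c2)
            - b1 * (a2 * c3 - a3 * c2)
            + c1 * (a2 * b3 - a3 * b2))


def in_bundle(e: Line, f: Line, g: Line) -> bool:
    """g belongs to the bundle determined by the distinct lines e and f.

    The test never computes a point of intersection, so it also applies
    when e and f are parallel.
    """
    return det3(e, f, g) == 0


def is_proper(e: Line, f: Line) -> bool:
    """True if e and f meet in a point, i.e. the bundle corresponds to a proper point."""
    return e[0] * f[1] - e[1] * f[0] != 0


x_axis, y_axis, diagonal = (0, 1, 0), (1, 0, 0), (1, -1, 0)
parallel_1, parallel_2 = (0, 1, -1), (0, 1, -2)  # the lines y = 1 and y = 2

print(in_bundle(y_axis, x_axis, diagonal), is_proper(y_axis, x_axis))        # True True
print(in_bundle(x_axis, parallel_1, parallel_2), is_proper(x_axis, parallel_1))  # True False
```

In this model the second bundle plays the role of an improper point: the statement ‘Two lines in a plane always have a point in common’ holds for the parallel lines once ‘point’ is read as ‘bundle’.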
Nowadays we would describe Pasch’s method of extending the meaning of concepts
by saying that the term ‘point’ is given two different interpretations: while it originally
referred to points, it is later taken to refer to bundles. But Pasch does not yet possess
a conceptually clear distinction between syntax and semantics, and it does not seem to
come naturally for him to speak of different interpretations of a given term, in particular,
since ‘point’ is a meaningful term, unlike, say, a mathematical variable. Thus, he speaks of
substituting one concept for another in a proposition, changing the meanings of concepts,
or replacing concepts by meaningless signs.61
In geometry, the notion of continuity also goes beyond what can be developed on an
empirical basis and it indicates the conceptual gap between empirical and mathematical
geometry. Pasch carries through the development of projective geometry to allow for the
introduction of coordinates through the construction of rational nets. Thus, these coordinates
remain limited to rational values. Nevertheless, he notices that ultimately only an analytic
treatment of geometry in terms of real coordinates yields the customary notions of points
and lines, which Pasch qualifies with the adjective ‘mathematical’. He briefly considers
the possibility of adding an axiom of continuity, but dismisses it on the grounds that it
would not be empirically justified and opts for a version of the Archimedean axiom instead
61 See also Pasch’s notions of ‘formalization’ and ‘realization’ of a stem, discussed in Section 4,
above.
(Pasch, 1882a, p. 126).62 However, given that all his empirically based constructions are
subject to limitations, mathematical points allow for more fine-grained distinctions than
empirical points do; in other words, every empirical point corresponds to an entire sequence
of mathematical points. This phenomenon is referred to as the ‘inexactness of geometric
concepts’ and Pasch emphasizes that ‘the transfer of a diagram into numbers and the
return from the results of a calculation to the diagram cannot be carried out with the same
degree of exactness’ (Pasch, 1882a, p. 200).63 Nonetheless, Pasch remarks that also the
mathematical points (if appropriately defined) satisfy the stem propositions of projective
geometry. Given that he considers real numbers themselves to be grounded in empirical
core concepts, this does not seem to pose a serious problem for his philosophy, but it
confirms his assertion that geometry presupposes arithmetic (Pasch, 1922, p. 5).
Pasch commented that one of the goals he pursued in his lectures on geometry was
to show that a reduction of parts of geometry to empirical notions was possible in
principle (Pasch, 1887a, p. 130),64 and it appears that many of his contemporaries accepted
this reduction. For example, in the article on geometry by Weber and Wellstein in the
Encyclopedia of Elementary Mathematics (1905), Pasch’s book is discussed in the first
chapter on the fundamental notions of geometry (Weber & Wellstein, 1905, pp. 25–27).
After formulating a critique of the idealization processes that are intended to lead from
the empirical raw material of geometry to its abstract objects, the authors ask whether it
is possible to build up an intuitive geometry, which they call ‘natural geometry’, without
having recourse to these idealizations, and they note that an affirmative answer to this question is
presented in Pasch’s lectures on projective geometry, ‘this beautiful book that everybody
must have read, who is more interested in intensive rather than extensive knowledge of
geometry’ (Weber & Wellstein, 1905, p. 25).
§6. Concluding remarks. We have seen how Moritz Pasch formulated the cornerstones
of his philosophy of mathematics in his two books of 1882 and continued to develop
and refine his views in numerous, more and more philosophical publications throughout
his life. Pasch’s philosophy is unique in combining a strong empiricism, according to
which the meanings of mathematical terms should be based on observable physical entities,
with a deductivist view, according to which the validity of mathematical inferences does
not depend on the meanings of the terms. These seemingly incompatible views are brought
together in Pasch’s conception of different layers of philosophical and mathematical
investigations, and ‘Pasch’s programme’, which aims at building up correlates of mathematical
axioms from an empirical basis. Since Pasch’s philosophical ideas originally appeared only
as interspersed remarks in his mathematical textbooks and were elaborated in more detail
only in his later articles, it has been difficult to grasp and appreciate his philosophy of
mathematics as a whole. This might be part of the reason for the general lack of awareness of his
ideas, which Pasch himself noticed (Pasch, 1926b, p. 166). Another reason might be that
62 See Ehrlich (2006, p. 6) and Greenberg (1993, p. 125), who writes that ‘[t]he full significance of
Archimedes’ axiom was first grasped in the 1880s by M. Pasch and O. Stolz’.
63 Pasch reminds the reader in Pasch (1882b, pp. 13 and 39) that every number that is used in practice
or that arises from observations or measurements can only have a limited degree of exactness, and
he adds in a remark on p. 188 that the inexactness of geometric concepts had been discussed by
Klein already in 1873 (Klein, 1883). This is repeated on other occasions, for example, Pasch
(1887a, p. 130) and Pasch (1912, p. 203).
64 See also Pasch (1912, p. 203).
§7. Acknowledgments. I would like to thank, first and foremost, Michael Hallett for
helpful comments on previous versions of this paper. In addition, I am also grateful for
remarks and comments by an anonymous reviewer of this journal, Greg Frost-Arnold,
Jeremy Heis, Paul Rusnock, as well as audience members at the Annual meeting of the
Association for Symbolic Logic (Irvine, CA), the PhiMSAMP-3 conference ‘Is mathematics
special?’ (Vienna, Austria), the HOPOS meeting 2008 (Vancouver, BC), the Winter 2008
meeting of the Canadian Mathematical Society (Ottawa, ON), and the Winter meeting of
the Association for Symbolic Logic (Philadelphia, PA). Last, but not least, I would like to
thank Dr. Rudolf Thaer for generously sharing his knowledge about Pasch with me. Work
on this paper was funded by the Social Sciences and Humanities Research Council of Canada
(SSHRC). Translations are by the author, unless noted.
BIBLIOGRAPHY
Boniface, J. (2004). Hilbert et la notion d’existence en mathématiques. Mathesis. Paris,
France: J. Vrin.
Boniface, J. (2007). The concept of number from Gauss to Kronecker. In Goldstein, C.,
Schappacher, N., and Schwermer, J., editors. The Shaping of Arithmetic After C. F.
Gauss’s Disquisitiones Arithmeticae. Berlin: Springer, pp. 315–342.
Contro, W. S. (1976). Von Pasch zu Hilbert. Archive for History of Exact Sciences, 15(3),
283–295.
Dedekind, R. (1872). Stetigkeit und irrationale Zahlen. Braunschweig, Germany: Vieweg.
Reprinted in Dedekind (1932a), pp. 315–334. English translation Continuity and
Irrational Numbers in Ewald (1996), pp. 765–779.
Dedekind, R. (1888). Was sind und was sollen die Zahlen? Braunschweig, Germany:
Vieweg. Reprinted in Dedekind (1932a), pp. 335–391. English translation in Ewald
(1996), pp. 787–833.
Dedekind, R. (1932a). Gesammelte mathematische Werke, Vol. 3. Braunschweig,
Germany: F. Vieweg & Sohn. Edited by Robert Fricke, Emmy Noether, and Öystein
Ore.
Dehn, M. (1928). Moritz Pasch. Zum fünfundachtzigsten Geburtstag am 8. 11. 1928. Die
Naturwissenschaften, 16(44), 813–815.
Detlefsen, M. (2005). Formalism. In Shapiro, S., editor. Oxford Handbook of Philosophy
of Mathematics and Logic. Oxford: Oxford University Press, pp. 236–317.
Dingler, H. (1915). Das Prinzip der logischen Unabhängigkeit in der Mathematik, zugleich
als Einführung in die Axiomatik. München, Germany: Theodor Ackermann.
DiSalle, R. (1993). Helmholtz’s empiricist philosophy of mathematics. In Cahan, D.,
editor. Hermann von Helmholtz and the Foundations of 19th Century Science. Los
Angeles, CA: University of California Press, pp. 498–521.
Du Bois-Reymond, P. (1882). Die allgemeine Funktionentheorie. Erster Teil. Metaphysik
und Theorie der mathematischen Grundbegriffe: Größe, Grenze, Argument, und
Funktion. Tübingen, Germany: H. Laupp. There is only this part.
Ehrlich, P. (2006). The rise of non-Archimedean mathematics and the roots of a
misconception I: The emergence of non-Archimedean systems of magnitudes. Archive
for History of Exact Sciences, 60, 1–121.
Engel, F., & Dehn, M. (1931). Moritz Pasch. Zwei Gedenkreden, gehalten am 24. Januar
1931. Giessen, Germany: Töpelmann.
Engel, F., & Dehn, M. (1934). Moritz Pasch. Jahresbericht der Deutschen Mathematiker-
Vereinigung, 44(5/8), 120–142. Reprint, with changes and additions, of Engel & Dehn
(1931).
Ewald, W. (1996). From Kant to Hilbert: A Source Book in Mathematics. Oxford, UK:
Clarendon Press. Two volumes.
Frege, G. (1980). Philosophical and Mathematical Correspondence. University of Chicago
Press. Edited by Gottfried Gabriel, Hans Hermes, Friedrich Kambartel, Christian Thiel,
and Albert Veraart.
Freudenthal, H. (1962). The main trends in the foundations of geometry in the 19th century.
In Nagel, E., Suppes, P., and Tarski, A., editors. International Congress for Logic,
Methodology and Philosophy of Science (1960 : Stanford, Calif.), Logic, Methodology,
and Philosophy of Science. Stanford, CA: Stanford University Press, pp. 613–621.
Friedman, M. (1985). Kant’s theory of geometry. The Philosophical Review, 94(4),
455–506.
Gabriel, G. (1978). Implizite Definitionen — Eine Verwechslungsgeschichte. Annals of
Science, 35, 419–423.
Gandon, S. (2005). Pasch entre Klein et Peano: empirisme et idéalité en géométrie.
Dialogue, 44(4), 653–692.
Gandon, S. (2006). La réception des Vorlesungen über neuere Geometrie de Pasch par
Peano. Revue d’histoire des mathématiques, 12(2), 249–290.
Gray, J. (2007). Worlds Out of Nothing. A Course in the History of Geometry in the 19th
Century. London: Springer.
Greenberg, M. J. (1993). Euclidean and non-Euclidean Geometries: Development and
History (third edition). New York, NY: W.H. Freeman.
Hallett, M., & Majer, U., editors (2004). David Hilbert’s Lectures on the Foundations of
Geometry 1891–1902. Berlin, Germany: Springer.
Harré, R. (2003). Positivist thought in the nineteenth century. In Baldwin, T., editor. The
Cambridge History of Philosophy 1870–1945. Cambridge: Cambridge University Press,
pp. 11–26.
Hartwich, Y. (2005). Eduard Study (1862–1930)—ein mathematischer Mephistopheles im
geometrischen Gärtchen. PhD Thesis, Department of Mathematics, Johannes Gutenberg
Universität, Mainz.
PASCH ’ S PHILOSOPHY OF MATHEMATICS 115
DEPARTMENT OF PHILOSOPHY
McGILL UNIVERSITY
855 SHERBROOKE ST. W.
MONTREAL, QC H3A 2T7, CANADA
E-mail: dirk.schlimm@mcgill.ca
THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010
Abstract. Finitism is given an interpretation based on two ideas about strings (sequences
of symbols): a replacement principle extracted from Hilbert’s work and a counting principle
inspired by Tait. These principles are used to justify an equational arithmetic based on the
algebra of lower elementary functions. The extension of this algebra to Grzegorczyk’s class
$\mathcal{E}^2$ can be justified by means of an additional finitistic choice principle, thus obtaining a second
equational theory. It is unknown whether the second theory is strictly stronger than the first, since
$\mathcal{E}^2$ may coincide with the class of lower elementary functions.
If the objects of arithmetic are taken to be binary numerals instead of tally numerals, then it
becomes possible to provide a finitistic justification for a third theory that may be incomparable to
the second (neither of the two includes the other). I conclude by suggesting that the equational theory of Kalmar
elementary functions is a strict upper bound for finitistic arithmetic.
§1. Introduction: Hilbert’s Cogito. The direct descendant of David Hilbert’s work in
the foundations of mathematics is reductive proof theory, in which typically a mathematical
theory $T_1$ is shown to be $\Gamma$-conservative over a weaker theory $T_2$ ($\Gamma$ a set of sentences in the
language of $T_2$). If $T_2$ has certain epistemic advantages over $T_1$ (it is finitistic, constructive,
or predicative, whereas $T_1$ is not), then such a conservation theorem may be regarded as
securing a foundation for $T_1$ (it helps if the result is proved in $T_2$).
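The notion of conservativity used here can be spelled out as follows. This is the standard definition, stated as a reminder; the labels $T_1$, $T_2$, and $\Gamma$ for the stronger theory, the weaker theory, and the class of sentences are chosen for this sketch.

```latex
% T_1 is \Gamma-conservative over T_2 (with T_2 contained in T_1) when
% every \Gamma-sentence provable in T_1 is already provable in T_2:
\[
  \text{for every } \varphi \in \Gamma:\qquad
  T_1 \vdash \varphi \;\Longrightarrow\; T_2 \vdash \varphi .
\]
```

On this reading, a proof of a $\Gamma$-sentence that passes through the stronger theory can in principle be replaced by one in the weaker theory, which is why the epistemic advantages of the weaker theory carry over to those sentences.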
The received view is that Hilbert aimed to prove that ideal mathematics (classical anal-
ysis and set theory) is finitistically conservative over what he called real or ‘contentual’
[inhaltlich] arithmetic and this result was to be established by a finitistic consistency proof
for the axiomatic theories formalizing ideal mathematics.1 The required consistency proof
would have to be prefaced by a philosophical analysis of the notion of finitistic or ‘con-
tentual’ arithmetic explaining its special epistemic status and delineating with sufficient
precision the finitistic means of proof. But initially the expectation that an obviously fini-
tistic consistency proof exists made the philosophical task seem less pressing and then
Gödel’s work made it seem pointless. For it is generally accepted that the incompleteness
theorems show that Hilbert’s original goal cannot be achieved even in the absence of a
precise analysis of finitism, provided one accepts the reasonable view that finitistic means
of proof can be reproduced in the axiomatic theories of classical mathematics.2
© Association for Symbolic Logic, 2010
119 doi:10.1017/S1755020309990323
120 MIHAI GANEA
There still are good reasons to seek a precise analysis of finitism, for example, in order
to obtain partial realizations of Hilbert’s program (see Simpson (1988)). Instead of aiming
for a definitive solution to the question of the foundations of ideal mathematics, one could
describe it in terms of the kinds of evidence available for its theorems. Determining the
nature and extent of finitistic arithmetic is a first step in classifying these kinds of evidence.
Hilbert and his school did not offer an exact definition of finitistic arithmetic, and were
fully aware of the lack of precision of their position.3 On the one hand it is clearly required
that the objects of a finitistic theory are finite (given in intuition), such as finite sequences
of symbols. On the other hand it is unclear what principles regarding these objects can be
considered finitistic. Within the Hilbert school there was considerable indecision over this
point. They seemed to accept, without a clear justification, that the principles of transfinite
induction used in the consistency proofs by Ackermann and Gentzen were finitistic.4 The
main criterion operating in the selection of these principles was supposed to be security,
but unpacking its exact meaning is not straightforward. Hilbert suggested that they ought
to be indubitable or indispensable to scientific thought:
[. . . ] as a precondition for the application of logical inferences and for
the activation of logical operations, something must already be given
in representation [in der Vorstellung]: certain extra-logical discrete ob-
jects, which exist intuitively as immediate experience before all thought.
If logical inference is to be certain, then these objects must be capable of
being completely surveyed in all their parts, and their presentation, their
difference, their succession (like the objects themselves) must exist for
us immediately, intuitively, as something that cannot be reduced to some-
thing else. Because I take this standpoint the objects of number theory are
for me [. . . ] the signs themselves [. . . ]. The solid philosophical attitude
that I think is required for the grounding of pure mathematics—as well
as for all scientific thought, understanding, and communication—is this:
In the beginning was the sign. (Hilbert, 1922, p. 202).
This well-known passage, which reoccurs in several of his writings, identifies the objects
of finitistic thought as sequences of symbols (the forms or types of messages in an intersub-
jective medium presupposed by any communication) and suggests a foundational strategy
that derives the basic principles of arithmetic from the preconditions or norms of successful
communication.
We might call this idea Hilbert’s Cogito: I express mathematics (in accordance with the
rules of some language), therefore I know (some) arithmetic.5 The nature and extent of this
3 Here is a significant excerpt from Hilbert & Bernays (1939): “. . . we have not introduced the
expression ‘finit’ as a sharply delimited term, but only as the name of a methodological guideline,
which, to be sure, enables us to recognize certain kinds of concept-formations and ways of
reasoning as definitely finitistic and others as definitely not finitistic. This guideline, however,
does not provide us with a precise demarcation between those [concept-formations and ways of
reasoning] which accord with the requirements of the finitistic method and those that do not.”
(translation in Zach (1998), footnote 16)
4 See the discussion in Zach (2003), especially section 3.3, for evidence on this point.
5 The affinity between Hilbert’s position and the Cartesian Cogito is emphasized in Tait (1981),
p. 525: “. . . the special role of finitism consists in the circumstance that it is a minimal kind
of reasoning presupposed by all nontrivial mathematical reasoning about numbers. And for this
reason it is indubitable in a Cartesian sense that there is no preferred or even equally preferable
ground on which to stand and criticize it.”
TWO ( OR THREE ) NOTIONS OF FINITISM 121
basic (finitistic) arithmetical knowledge are subject to two constraints, which I sketch below and
then expand on in the rest of the paper.
a) The basic notions and principles of finitistic arithmetic are grounded in an analysis
of the fundamental properties of language. Arithmetic is to be reduced to a body of indu-
bitable knowledge, and this knowledge concerns the preconditions for the intersubjective
expression of thought. Numbers are to be identified with (the types of) a certain kind of
expressions and then certain properties of, operations on, and principles about numbers can
be perceived as evident through an analysis of our syntactical competence with respect to
those expressions. This competence is made explicit in the process of describing precisely
(formalizing) the practice of mathematics. Hilbert assigned a fundamental importance to
formalization, which to him was to be the result of a transcendental reflection:
The formula game that Brouwer so deprecates has, besides its mathe-
matical value, an important general philosophical significance. For this
formula game is carried out according to certain definite rules, in which
the technique of our thinking is expressed. These rules form a closed
system that can be discovered and definitively stated. The fundamental
idea of my proof theory is none other than to describe the activity of our
understanding, to make a protocol of the rules according to which our
thinking actually proceeds. (Hilbert, 1927, p. 475)
Like Frege, Hilbert assigned a special epistemic status to logic, the system of ‘laws
of thought’. However, unlike Frege, Hilbert thought that the most basic concepts and
principles of arithmetic cannot be grounded in the content of these laws, but rather in
the knowledge of the form of their expression, that is, in our knowledge of signs:
We start from the assumption that we possess the capacity to name things
by signs, and that we can always recognize them again. We can then
carry out certain operations with these signs, operations which are anal-
ogous to those of arithmetic and which satisfy analogous laws. ((Hilbert,
1910, p. 159), quoted in Hallett (1995, p. 164)).
What operations with signs (expressions) are to be taken as fundamental and how is
one to exploit the similarities between basic syntax and arithmetic in order to provide a
foundation for the latter in the former? Clearly some degree of freedom is to be expected
in answering these questions, since expressions (finitistic objects) and their relations can
be conceived in various ways.
b) Finitistic arithmetic is logically restricted. Since numbers form a (potentially) infinite
collection, standard forms of expression and the usual logical rules may not be used freely in
reasoning about them. Hilbert (1925) stated that
In the domain of finitistic propositions [. . . ] the logical relations that
prevail are very imperspicuous, and this lack of perspicuity mounts un-
bearably if “all” and “there exists” occur combined or appear in nested
propositions. In any case, those logical laws that man has always used
since he began to think, the very ones that Aristotle taught, do not hold.
(p. 379)
In particular, a finitistically meaningful arithmetical language may not use unbounded
quantifiers since the meaning of expressions using such quantifiers needs to be explicated
(in finitistic terms). The same considerations apply to the domain of expressions of a
language taken to be basic, that is, to basic syntax and to the arithmetic that one can
interpret in it.
Therefore our most fundamental arithmetical reasoning is characterized by certain re-
strictions derived from the prohibition against the free use of unbounded quantifiers. This
is a negative constraint that also leaves some degree of freedom in the specific choice of
the language and logic of the fundamental theory. However, we could hope that the various
candidates for the fundamental theory fall within a sufficiently narrow range.
I turn now to a closer examination of the constraints outlined above, with the aim of
interpreting them with sufficient precision to narrow the choice of a fundamental arith-
metical theory (or class of theories). First I will examine the commonly held view that
Primitive Recursive Arithmetic (PRA) is the most natural theoretical description of fini-
tistic arithmetical knowledge in Hilbert’s sense. It will turn out that the relation between
PRA and Hilbert’s Cogito is an uneasy one and that the resolution of this tension can be
accomplished by a weakening of the schema of primitive recursion.
§2. Two versions of PRA. Hilbert was not the first author to articulate reservations
about the use of standard logical rules in arithmetical reasoning. The line of thought that
demands restrictions on the logic of number theory is represented by Kronecker,6 Poincaré,
Brouwer, and Hilbert’s most prominent student, Hermann Weyl. For example, existential
quantifiers in arithmetic are said to produce incomplete or partial judgments in Weyl (1921,
p. 97) as well as in Hilbert & Bernays (1934, pp. 32–33). Motivated by similar concerns7
Skolem took the resolute step of eliminating quantifiers from the foundations of arithmetic
in Skolem (1923):
[. . . ] If we consider the general theorems of arithmetic to be functional
assertions and take the recursive mode of thought as a basis, then that
science can be founded in a rigorous way without the use of Russell
and Whitehead’s notions “always” and “sometimes.” This can also be
expressed as follows: A logical foundation can be provided for arithmetic
without the use of apparent logical variables. (p. 304)
In this pioneering work Skolem showed that a series of elementary number-theoretical
theorems (culminating with the result that any number has a unique decomposition in
6 Hilbert acknowledged Kronecker as the originator of the finitistic viewpoint, for example, in
Hilbert & Bernays (1934, p. 42): “Kronecker, who was the first to insist on the requirements
of the finitistic standpoint, sought to completely eliminate nonfinitistic methods of proof from
mathematics. He reached his aim in the theory of algebraic numbers and number fields.” Hilbert’s
attitude toward Kronecker’s methodological position oscillates from appreciation (see, e.g., Sieg
(1999, p. 4)) to outright hostility, as when he calls him a ‘classical prohibiting dictator’ in Hilbert
(1922, pp. 200–201).
For an analysis of Kronecker’s methodological views emphasizing their connection with his
mathematical work, see Marion (1995).
7 In Skolem (1950) he states the following: “When I wrote my article I hoped that the very natural
feature of my considerations would convince people that this finitistic treatment of mathematics
was not only a possible one but the true or correct one—at least for arithmetic. [. . . ] I am no
fanatic, and it is not my intention to condemn the nonfinitistic ideas and methods. But I should
like to emphasize that the finitistic development of mathematics as far as it may be carried out
has a very great advantage with regard to clearness and security. Further it may be good reason to
conjecture that it can be carried out very far, if one would make serious attempts in that direction.”
(p. 527)
prime factors) can be given a quantifier-free form, that is, they can be written as sentential
combinations of equations between terms constituted from free variables, 0̄, and symbols
for functions defined through primitive recursion.
Skolem did not provide an axiomatic framework for his proofs of the theorems in ques-
tion; this was supplied in Hilbert & Bernays (1934) after primitive recursive functions
had been given their standard definition and used extensively in the arithmetization of
syntax in Gödel (1931). PRA as presented by Hilbert and Bernays (henceforward called
PRAHB) consists of a logical calculus of quantifier-free formulas (essentially sentential
logic plus the rules of substitution and equality), explicit and primitive recursive definitions
of functions (starting from the constant 0̄ and the successor function s(x)), the axiom
s(0̄) ≠ 0̄, and the schema of induction, which can be formulated as an inference rule:
$$\frac{\varphi(\bar{0}), \qquad \varphi(x) \supset \varphi(s(x))}{\varphi(x)}$$
where ϕ(x) is any formula in the language of PRA with x among its variables. Precise
descriptions of PRA along these lines are given in Girard (1987, p. 67) and Troelstra & van
Dalen (1988, pp. 120–126).
Curry (1941) and Goodstein (1954, 1957) sought to purify arithmetic from logic even
further and eliminated sentential connectives from its language. The process of simplifying
the primitives of the theory culminated in Rose (1962, 1984), but at the price of losing
some readability. Below I will rigorously define Goodstein’s system PRAG and assess
its suitability as a theory of finitistic arithmetic. As we shall see, the element that makes
its suitability doubtful is central to all versions of PRA: it is the scheme of definition by
primitive recursion itself.
The primitives of the language are the constant $\bar{0}$, individual variables $x_i$ ($i \geq 1$), the
equality sign =, the symbols for the successor and constant zero functions (s and z,
respectively), symbols for the projection functions $\pi_{m,n}$ (with $1 \leq m \leq n$), and symbols for two
kinds of functional operators: $C_{m,n}$ (with $1 \leq m, n$) for the operation of composition, and
$R_{n+1}$ (with $n \geq 0$) for primitive recursion (in a fully finitistic formulation these operators
would not be allowed, but they permit a more concise presentation and are therefore used
here for reasons of convenience). I write x, y, z, . . . for variables with possibly distinct
indices and $\vec{x}, \vec{y}, \ldots$ for (possibly empty) vectors of such variables. Until Section 4, where
the number 0 is identified with the symbol ‘0’, I will distinguish between the constant $\bar{0}$
and the number 0.
Functionals and their arity are defined as follows: s, z, and $\pi_{m,n}$ are functionals of arities
1, 1, and n, respectively (I will write n-functional for a functional of arity n). If $f_1, \ldots, f_m$
are all n-functionals and g is an m-functional, then $C_{m,n}[g, f_1, \ldots, f_m]$ is an n-functional.
If f is an (n+1)-functional (with $n \geq 0$) and g an (n+2)-functional, then $R_{n+1}[f, g]$ is an
(n+1)-functional. The letters $f, g, h, f_1, g_1, \ldots$ will be used as variables for functionals.
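The arity clauses above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding of functionals as nested Python tuples (the tags "s", "z", "pi", "C", and "R" and the function name `arity` are illustrative choices, not notation from the text); it returns the arity of a functional or rejects it as ill-formed.

```python
def arity(F):
    """Return the arity of a functional encoded as a nested tuple.

    Hypothetical encoding: ("s",), ("z",), ("pi", m, n),
    ("C", g, [f1, ..., fm]) and ("R", f, g).
    """
    tag = F[0]
    if tag in ("s", "z"):
        return 1                      # successor and constant zero are 1-functionals
    if tag == "pi":
        _, m, n = F
        if not 1 <= m <= n:
            raise ValueError("projection requires 1 <= m <= n")
        return n                      # pi_{m,n} is an n-functional
    if tag == "C":
        _, g, fs = F
        if not fs:
            raise ValueError("composition requires at least one inner functional")
        n = arity(fs[0])
        if any(arity(f) != n for f in fs):
            raise ValueError("inner functionals must share one arity")
        if arity(g) != len(fs):
            raise ValueError("outer functional must have arity m")
        return n                      # C_{m,n}[g, f1..fm] is an n-functional
    if tag == "R":
        _, f, g = F
        k = arity(f)                  # k = n + 1
        if arity(g) != k + 1:
            raise ValueError("second argument must be an (n+2)-functional")
        return k                      # R_{n+1}[f, g] is an (n+1)-functional
    raise ValueError("unknown functional")


# One possible definition of addition: R_2[pi_{1,2}, C_{1,3}[s, pi_{3,3}]].
ADD = ("R", ("pi", 1, 2), ("C", ("s",), [("pi", 3, 3)]))
print(arity(ADD))
```

The example functional for addition is an assumption made for illustration; any other well-formed combination of the primitives can be checked in the same way.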
$\bar{0}$ and the individual variables are primitive terms. If f is a functional of arity n and
$t_1, \ldots, t_n$ are terms, then $f(t_1, \ldots, t_n)$ is a term and f its dominant functional. I use
t, u, v, . . . as variables for terms and write $t(\vec{x}), u(\vec{y}), \ldots$ to indicate that the variables
$\vec{x}, \vec{y}, \ldots$ occur in the terms t, u, . . . . Sometimes terms are written as $f(\vec{x}), g(\vec{y}), \ldots$,
indicating the dominant functional. t(x/u) is the term obtained from t(x) by replacing
the occurrences of the variable x in t with the term u.
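The substitution operation t(x/u) can be illustrated on term trees. The sketch below assumes a hypothetical tuple encoding of terms (("zero",) for the constant, ("var", i) for the variable with index i, and ("app", name, args) for a functional applied to argument terms); the names are illustrative and functionals are represented only by a label.

```python
def substitute(t, x, u):
    """Return t(x/u): the term t with every occurrence of the variable x replaced by u."""
    tag = t[0]
    if tag == "zero":
        return t                       # the constant contains no variables
    if tag == "var":
        return u if t == x else t      # replace x itself, keep other variables
    _, name, args = t                  # application: substitute in each argument
    return ("app", name, tuple(substitute(a, x, u) for a in args))


# Example: t is add(x1, s(x2)); substituting s(0) for x1 gives add(s(0), s(x2)).
x1, x2 = ("var", 1), ("var", 2)
t = ("app", "add", (x1, ("app", "s", (x2,))))
u = ("app", "s", (("zero",),))
print(substitute(t, x1, u))
```

Because substitution is defined by recursion on the structure of the term, it replaces all occurrences of x simultaneously, which is what the rules S1 and S2 below require.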
The only formulas of the theory are equalities between terms: t = u. Its axioms
are of definitional nature, specifying the meaning of expressions for primitive recursive
functions:
Z. $z(x_1) = \bar{0}$.
P. $\pi_{m,n}(x_1, \ldots, x_n) = x_m$ for any m, n with $1 \leq m \leq n$.
C. If g is an m-functional and $f_1, \ldots, f_m$ are n-functionals, then
$$C_{m,n}[g, f_1, \ldots, f_m](\vec{x}) = g(f_1(\vec{x}), \ldots, f_m(\vec{x})).$$
R. If g is an (n+1)-functional and h is an (n+2)-functional, then
$$R_{n+1}[g, h](\vec{x}, \bar{0}) = g(\vec{x}, \bar{0}) \quad \text{and} \quad R_{n+1}[g, h](\vec{x}, s(y)) = h(\vec{x}, y, R_{n+1}[g, h](\vec{x}, y)).$$
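Read from left to right, the axioms Z, P, C, and R act as computation rules. The following sketch is an illustration under stated assumptions: functionals use a hypothetical nested-tuple encoding, and numerals are interpreted as Python integers, which is a convenience of the sketch and not the string-based reading developed in the text. The function name `evaluate` and the definitions of `ADD` and `MUL` are likewise illustrative.

```python
def evaluate(F, args):
    """Evaluate a functional on a list of natural numbers, following Z, P, C, R."""
    tag = F[0]
    if tag == "z":                      # Z: z(x1) = 0
        return 0
    if tag == "s":                      # successor
        return args[0] + 1
    if tag == "pi":                     # P: pi_{m,n}(x1..xn) = xm
        return args[F[1] - 1]
    if tag == "C":                      # C: g(f1(x), ..., fm(x))
        _, g, fs = F
        return evaluate(g, [evaluate(f, args) for f in fs])
    if tag == "R":                      # R: unfold the recursion on the last argument
        _, g, h = F
        *xs, y = args
        acc = evaluate(g, xs + [0])     # R[g,h](x, 0) = g(x, 0)
        for i in range(y):              # R[g,h](x, s(i)) = h(x, i, R[g,h](x, i))
            acc = evaluate(h, xs + [i, acc])
        return acc
    raise ValueError("unknown functional")


# Addition: add(x, 0) = x and add(x, s(y)) = s(add(x, y)).
ADD = ("R", ("pi", 1, 2), ("C", ("s",), [("pi", 3, 3)]))
# Multiplication: mul(x, 0) = 0 and mul(x, s(y)) = add(x, mul(x, y)).
MUL = ("R",
       ("C", ("z",), [("pi", 1, 2)]),
       ("C", ADD, [("pi", 1, 3), ("pi", 3, 3)]))

print(evaluate(ADD, [3, 4]), evaluate(MUL, [3, 4]))
```

Note that the base functional takes the recursion argument as well, matching the form g(x, 0̄) used in axiom R, so the base case for multiplication is the 2-functional obtained by composing z with a projection.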
There are four inference rules: the first three specify properties of equality and the fourth is
a uniqueness rule for the procedure of definition by primitive recursion that does the work
of induction (given here in a slightly different form than Goodstein’s).
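$$\mathrm{S}_1.\quad \frac{t(x) = u(x)}{\therefore\ t(x/v) = u(x/v)} \qquad \mathrm{S}_2.\quad \frac{t = u}{\therefore\ v(x/t) = v(x/u)} \qquad \mathrm{T}.\quad \frac{x = y, \quad x = z}{\therefore\ y = z}$$
for any terms t, u, v, where t(x/v) is the term resulting from substituting v for all the
occurrences of x in t.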
$$\mathrm{U}.\quad \frac{f(\vec{x}, \bar{0}) = g(\vec{x}, \bar{0}), \qquad f(\vec{x}, s(y)) = h(\vec{x}, y, f(\vec{x}, y))}{\therefore\ f(\vec{x}, y) = R_{n+1}[g, h](\vec{x}, y)}$$
for any functionals f, g, h.8
Using U and T one can immediately derive another inference schema taken as primitive
by Goodstein:
$$\mathrm{U}_1.\quad \frac{f(\vec{x}, \bar{0}) = g(\vec{x}, \bar{0}), \qquad f(\vec{x}, s(y)) = h(\vec{x}, y, f(\vec{x}, y)), \qquad g(\vec{x}, s(y)) = h(\vec{x}, y, g(\vec{x}, y))}{\therefore\ f(\vec{x}, y) = g(\vec{x}, y)}$$
PRAG and PRAHB are co-interpretable (a proof is given in Goodstein’s book) and hence
the two versions are interchangeable in the considerations that follow.
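A small worked example may help to show how a uniqueness rule can do the work of induction. It is a sketch and is not taken from Goodstein or from the text above: it takes n = 0, assumes that addition is defined as $\mathrm{add} = R_2[\pi_{1,2}, C_{1,3}[s, \pi_{3,3}]]$, and sets $f = C_{2,1}[\mathrm{add}, z, \pi_{1,1}]$ (so that $f(y) = \mathrm{add}(\bar{0}, y)$), $g = \pi_{1,1}$, and $h = C_{1,2}[s, \pi_{2,2}]$ (so that $h(y, w) = s(w)$). The intermediate equalities also rely on the equality rules.

```latex
\begin{align*}
f(\bar{0}) &= \mathrm{add}(\bar{0}, \bar{0}) = \pi_{1,2}(\bar{0}, \bar{0}) = \bar{0} = g(\bar{0})
  && \text{(by Z, C, R, and P)} \\
f(s(y)) &= \mathrm{add}(\bar{0}, s(y)) = s(\mathrm{add}(\bar{0}, y)) = h(y, f(y))
  && \text{(by R, C, and P)} \\
g(s(y)) &= s(y) = s(g(y)) = h(y, g(y))
  && \text{(by P, C, and } \mathrm{S}_2\text{)} \\
\therefore\ f(y) &= g(y), \ \text{that is,}\ \mathrm{add}(\bar{0}, y) = y
  && \text{(by } \mathrm{U}_1\text{)}
\end{align*}
```

The conclusion $\mathrm{add}(\bar{0}, y) = y$ is the kind of statement that would ordinarily be proved by induction on y; here it follows because f and g satisfy the same recursion equations.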
How well does PRAG comply with the requirements arising from Hilbert’s Cogito? With
respect to the logic of the theory it is hard to imagine that any serious challenge could be
mounted against the use of free variables and the identity relation or against the rules S1 ,
S2 , and T . Hilbert never asserted that we lack a general concept of number (i.e., the ability
to determine whether an object is a number or not if it is presented in an appropriate way),
but only that quantification over the totality of numbers and the logical rules associated
with it are problematic. S1 , S2 , and T could be attacked in contexts that involve intensional
or vague notions, but that is not the case in arithmetic. Difficult to challenge are also
8 If T had the form x = y, y = z / ∴ x = z it would be impossible to prove that equality is reflexive
or commutative. For no equation of the form x = t would be derivable (as can be proved through
an inductive argument on the length of the derivation) and so x = x would not be a theorem. The
schema x = y / ∴ y = x would not be derivable either, since otherwise by using it one would
prove x = $\pi_{1,1}(x)$ from P, contrary to the previous observation. Using T as given in the text
one immediately derives x = x using P and S1, commutativity from x = y and x = x, and then
transitivity.
the proper axioms of the theory regarding the zero function, projections, and composition
(projections will become entirely perspicuous once counting on sequences is introduced in
Section 4).
What remains to be examined is the actual engine powering the theoretical machinery of
PRAG —the definition schema of primitive recursion with its two components, R and U .
Taken together they assert that given functions (‘intuitive procedures’ for constructing
numbers) $g(\vec{x}, y)$ and $h(\vec{x}, y, z)$ of n + 1 and n + 2 arguments, respectively, there exists a
unique function $f(\vec{x}, y)$ of n + 1 arguments such that $f(\vec{x}, \bar{0}) = g(\vec{x}, \bar{0})$ and
$f(\vec{x}, s(y)) = h(\vec{x}, y, f(\vec{x}, y))$. Does this principle have the same basic status as the rest of PRA’s
components? Is it implicit in our fundamental knowledge of operations with strings of symbols,
as the first Hilbertian insight demands?
I should emphasize from the outset that it is out of the question to justify finitistically
R and U as generalizations about finitistic functions (operations on numbers). The very
concept of finitistic function or operation is not available from a finitistic viewpoint and
therefore such a generalization cannot be finitistically meaningful. What we can hope for
is a piecemeal justification in which all cases (instances) of R and U are made finitistically
evident. Even the use of the functional operators $C_{m,n}$ and $R_{n+1}$ in the language of PRAG
is in fact questionable from a strictly finitistic viewpoint, since no finitistic meaning can
be directly assigned to them: the idea of higher-order operations on finitistically
acceptable functions does not make direct finitistic sense. Every single definition of a
primitive recursive function must be seen as separated from the others, as an isolated
insight. I am using the operators for reasons of convenience, but it should be kept in mind
that I do not require a direct justification for them in any finitistic foundation for PRAG .
§3. From intuitions to principles. Hilbert and Bernays did supply a justification of
(all cases of) primitive recursion. Crucial in assessing the success of their attempt is the
notion of elementary property (relation) of strings, since they claimed that the principle of
induction was obvious when restricted to elementary (intuitive) properties:
[. . . ] Here we use the inference rule of complete induction [die Be-
weismethode der vollständigen Induction]. Let us explain from the outset
the meaning of this rule from our elementary viewpoint: let there be
given an arbitrary statement [Aussage] about a numeral with an elemen-
tary intuitive content [elementar anschaulichen Inhalt]. Suppose that the
statement holds for 1 and we also know that if it holds for a numeral n,
then it also holds for the numeral n + 1. From this it follows that the
statement holds for any given numeral a.
In fact the numeral a is built up by the application of the procedure of
adding 1 starting from 1. First one notices that the statement applies to 1
and then, on the ground of the inductive hypothesis, that the statement
holds for every new numeral obtained by the addition of 1, and so, by
the time one completes the structure of a, one knows that the statement
holds for a.
What we have here is not an independent principle, but a consequence
we obtain from the concrete structure of numerals. (Hilbert & Bernays,
1934, p. 23)
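The procedure described in this passage can be pictured as a walk along the concrete structure of a given numeral. The sketch below is only an illustration under stated assumptions: tally numerals are encoded as Python strings of the stroke "1", and the functions `verify_for`, `base_case`, and `step` are hypothetical names. The sample statement (the sum of the first n odd numbers equals n times n) is chosen for the example and does not come from Hilbert and Bernays.

```python
def verify_for(a, base_case, step):
    """Establish a statement for the tally numeral a by following its construction.

    The statement is first checked for '1'; each added stroke extends the
    evidence from n to n + 1, until the numeral a has been built up.
    """
    if not a or set(a) != {"1"}:
        raise ValueError("expected a non-empty tally numeral such as '111'")
    n = "1"
    evidence = base_case()                # the statement holds for 1
    while n != a:
        evidence = step(n, evidence)      # from the numeral n to the numeral n + 1
        n = n + "1"                       # add one stroke
    return evidence


def base_case():
    total = 1                             # sum of the first odd number
    assert total == 1 * 1
    return total


def step(n, total):
    k = len(n) + 1                        # the numeral n + 1 has k strokes
    new_total = total + (2 * k - 1)       # add the k-th odd number
    assert new_total == k * k             # the statement holds for n + 1
    return new_total


print(verify_for("11111", base_case, step))
```

The loop terminates exactly when the construction of a is complete, which mirrors the claim that the statement is known for a "by the time one completes the structure of a". Whether this amounts to a derived principle in the intended sense is the question taken up next.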
In what sense is induction a derived and not independent principle? This question
should be answered on the basis of an analysis of the finitistic intuition of strings and
the most detailed such analysis is given in writings by Parsons (1980, 1986, 1994, 1998,
2008).9
Although Parsons follows guidelines set by Hilbert and Bernays, some of the elements
of his analysis are novel. A key idea he emphasizes is that an imprecise intuition of a
generic string is required if any intuitive basis is to be given to general principles about
numbers, such as the existence of the successor function, which is assumed both in PRAHB
and PRAG (Parsons, 2008, pp. 173–178). In contrast, Hilbert alluded to the completeness
of intuitions of finitistic objects (see, e.g., the programmatic quote from Hilbert on p. 3).
Parsons does not reject the view that each individual finitistic object (string) can be intuited
in full detail (see Parsons (1998), pp. 266–269, or Parsons (2008), pp. 260–262), so appar-
ently his views are compatible with Hilbert’s position. However, this appearance may be
deceiving since the sense of possibility involved in Parsons’s considerations is an ‘abstract
mathematical’ one (some argument is needed to show that a justification of finitism can use
such a concept), and the conclusion that intuitive knowledge of arithmetical propositions
involving large strings is possible is weaker than the actual intuitive knowledge of such
propositions.
As far as I know Hilbert never considers in writing the obvious fact that the clarity of our
intuitive grasp of strings weakens as the size and complexity of the strings increase10 and
he does not comment on how a satisfactory conception of finitistic intuition should take it
into account. However, the issue is examined in Bernays (1930), on the basis of the notion
of formal abstraction. After he acknowledges that it is doubtful that a number such as $10^{10^{20}}$
is instantiated in physical reality (let alone intuitively representable), Bernays claims that
it is nevertheless a finitistically acceptable object since even finitistic arithmetic involves
a certain degree of abstraction that disregards all accidental features of numbers, focusing
exclusively on their structural relations (p. 251). In the case at hand, once exponentiation
and successor (implicit in the arguments 10, 20) are perceived as intuitive operations, it
becomes irrelevant whether the term $10^{10^{20}}$ designates a concretely instantiable number.
Even if we accept this viewpoint (see the discussion of strict finitism below), we still
require an account of how a numerical function such as exponentiation can be defined
intuitively.
In order to address this need, Parsons’s theory introduces a distinction between the
knowledge we acquire on the basis of the intuition of specific (types of) strings and “knowl-
edge of general propositions about types, which have in their scope indefinitely many
types” (Parsons, 2008, p. 173). The latter is founded on imagining an arbitrary string
either as a Gestalt, a bounded figure against a surrounding ground, or as an object con-
structed through the iteration of an operation that can be repeated indefinitely. The key
presupposition at play here is that the string in question is imagined in a representational
‘space’ that is not exhausted by the string’s presence and it can always accommodate extra
objects, such as an additional symbol extending the string. In other words, the “form of
intuition” is invariant: the object of intuition (imagination) is always bounded within an
unbounded environment (spatial or temporal). Clearly this presupposition is not absolutely
9 Parsons develops a notion of ‘intuition of ’, that is, an intuition of objects in the spirit enunciated
by Hilbert and Bernays: “What is characteristic of this methodological standpoint is that
considerations are put in the form of thought experiments on objects that are assumed to be
concretely present” (Hilbert & Bernays (1934, p. 20), translated in Parsons (2008, p. 172)).
10 A tally numeral is a homogeneous string and as such it does not have any internal complexity.
But if intuited strings are composed of at least two different signs, there is an obvious sense in
which they have an internal structure of varying complexity.
indubitable and Parsons does not claim it has an a priori status (Parsons, 2008, p. 176). He
follows Husserl in claiming that the mathematical intuition of types of strings is founded
on the perception and imagination of tokens of those strings and as such the standard form
of intuition his theory is based on cannot be considered necessary, since perceptions that
do not have the structure [bounded object within unbounded space] are clearly possible.
Circumstances where the standard form of intuition does not apply are described as
‘extreme situations’ in Gandy (1982). They arise when the resources for representing in-
scriptions are running out, an inscription being understood as “some sufficiently permanent
physical state which is prepared or set up by its author and which can subsequently be
‘read’ by himself or another” (p. 129). It seems safe to say that we do not have a satisfactory
theory of extreme situations and consequently we do not have a full-fledged theory of what
might be called representation systems, that is, of the physical systems whose subparts are
used to instantiate types of strings for the purposes of communication. It is not my goal in
this paper to pursue this topic, so I will limit myself to some brief remarks relating it to
Hilbertian finitism.
A key characteristic of a representation system is its capacity: the size (number of indi-
vidual symbols) of the largest string it can support at a given time. The implicit assumption
made by Hilbert, Bernays, and Parsons is that there is no limit on the capacity of the
representation systems that are available ‘in principle’. Gandy’s ‘fundamental hypothesis’
is precisely its negation: “There is an upper bound B, independent of time, to the size
of inscriptions,” with the corollary that there is also an upper bound on the numbers that
are representable by concrete tokens of numerals. Numbers can be represented not only
directly through tokens of numerals, but also indirectly through tokens of closed terms
written using symbols for acceptable functions. Characterizing the collection of these
‘concretely definable’ numbers on the basis of Gandy’s fundamental hypothesis is the main
task of what might be called ‘strict finitism’ or ‘ultrafinitism’.
A simple proposal for this characterization would be to use a system designed to codify
the more relaxed version of finitism initiated by Hilbert (such as PRAHB ) and adjust it in
two ways: (i) adopt the language and logic of partial terms11 as presented, for example, in
Beeson (1986), which permits a simple formulation of the fundamental hypothesis, and (ii)
use the modified system as its own syntactical meta-theory, so that syntactical operations
are also conceived as partial. The first step involves the introduction of a special monadic
predicate ↓ (t↓ means that the term t is defined or has a referent) and of a modified relation
of equality ∼= (s ∼= t stands for (t ↓ ⊃ t = s) ∧ (s ↓ ⊃ s = t) for every pair of terms
s, t), and the reformulation of the axioms and rules of inference for PRAHB by means of
these two elements. Whereas the propositional fragment of its logic remains unchanged,
substitution rules take the following forms:
\[
\frac{s \cong t, \quad \varphi[x/s]}{\varphi[x/t]}, \qquad \frac{t{\downarrow}, \quad \varphi(x)}{\varphi[x/t]}
\]
for every pair of terms s, t and formula ϕ. Axioms pertaining to the definability predicate
are 0↓, x ↓ (for every variable x), and R(t1 , . . ., tk ) ⊃ t1 ↓ ∧ . . . ∧ tk ↓ for every predicate
symbol of arity k and arbitrary terms t1 , . . ., tk (relevant cases are t = s ⊃ t ↓ ∧ s ↓
and f (t1 , . . ., tk )↓⊃ t1↓ ∧. . . ∧ tk ↓). Furthermore, the axioms introducing the primitive
recursive function symbols are formulated by means of ∼= rather than =.
11 Parsons mentions the logic of partial terms in Parsons (2008, section 41) when discussing the
elementary axioms of arithmetic, but does not use it to describe strict finitism.
128 MIHAI GANEA
In this setup Gandy’s fundamental hypothesis can be written as ¬(t↓) for some term t,
and together with any consistent axiom t′↓ (for some term t′) it produces a theoretical
description of a representation system that can inscribe the numeral described by t′, but
not the one described by t (if t is s(t′) then the system could be called strictly regimented;
otherwise it could be called tolerant). It is easily observed that ↓ becomes superfluous if
one does not postulate the hypothesis, but rather adopts the axiom s(x)↓ stating that the
successor function is everywhere defined (that every term ‘converges’ or is defined follows
by induction on term complexity).
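The behaviour of the definability predicate under a bounded representation system can be illustrated by a minimal sketch. It assumes that tally numerals are modelled as Python strings of `1` (the empty string standing for the numeral 0) and that an undefined term is modelled as `None`; the constant `CAPACITY` and the names `defined`, `succ`, and `weak_eq` are hypothetical and chosen only for this illustration.

```python
from typing import Optional

CAPACITY = 5  # hypothetical bound B on the size of inscriptions (assumed for illustration)

def defined(t: Optional[str]) -> bool:
    """t↓ : the term t has a referent."""
    return t is not None

def succ(t: Optional[str]) -> Optional[str]:
    """Partial successor on tally numerals; '' stands for the numeral 0."""
    if not defined(t):
        return None  # strictness: if s(t) is defined then so is t
    result = t + "1"
    # The successor is undefined once the inscription would exceed the capacity.
    return result if len(result) <= CAPACITY else None

def weak_eq(s: Optional[str], t: Optional[str]) -> bool:
    """s ∼= t, read as (t↓ ⊃ t = s) ∧ (s↓ ⊃ s = t)."""
    return (not defined(t) or t == s) and (not defined(s) or s == t)
```

In this sketch the system is strictly regimented in the sense of the text: the numeral `11111` can be inscribed, while its successor cannot. Two undefined terms count as equal under `weak_eq`, whereas a defined and an undefined term do not.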
What grounds are there for choosing between these two opposing hypotheses? Parsons
argues for the totality of the successor function on the basis of a transcendental insight, but
it could be said that the adoption of this principle is simply a matter of convenience, a way
of purifying the theory from a nonmathematical element, since the meaning of ↓ depends
on the nature of the physical world and the only possible evidence for Gandy’s fundamental
hypothesis is empirical. Parsons’s ‘step toward infinity’ could be interpreted as the decision
to pursue pure, a priori mathematics.12 It could also be argued that Hilbert’s Cogito does
not demand an absolutely secure foundation for arithmetic, but rather a foundation that is as
secure as that of linguistic communication (which includes the intersubjective expression
of mathematical thought). Exhausting representation resources for strings leads not only
to a breakdown in arithmetic (the existence of a number that is indirectly representable,
but not directly representable), but in communication as well (the existence of expressions
that can be described but not inscribed). Conversely, as long as syntactical operations such
as concatenation are unrestricted, so are arithmetical operations such as successor. The
version of finitism outlined by Hilbert and Bernays and analyzed by Parsons does make
the infinitary assumption that the representation system supporting (tokens of) strings is
potentially infinite (it can be extended indefinitely), and one good reason for going along
with it is that arithmetic becomes a simpler theory and we do not need to bother with the
definability predicate anymore.
Therefore from now on I will take for granted that basic operations on strings are
everywhere defined, although strictly speaking they are completely intuitive only on small
arguments (i.e., only relatively small strings can be represented ‘in the mind’s eye’). In
other words, we assume that large strings, which can be described but not directly intuited
in full detail, can be supported by some system of representation. While we do have an
imprecise global intuition of these large strings, our access to the totality of their details is
based on the stability of their concrete tokens—on external memory rather than on internal
memory. Both of them (the internal as well as the external representation of strings) can be
mobilized in an argument for the finitistic character of induction.
Consider an elementary statement ϕ(n) such that ϕ(0̄) and ϕ(n) ⊃ ϕ(n + 1) are finitistic
theorems (in the rest of this section I will use German letters for numerical constants and
variables, following Hilbert and Bernays). Let a be a numeral instantiated in a certain
system of representation in the sense previously indicated. What the Grundlagen passage
on induction suggests is that from a representation of a = 1. . .1 ≥ 11 one can obtain a
representation of the finitistic proof (a) consisting in the sequence of formulas
ϕ(0̄), ϕ(n) ⊃ ϕ(n + 1), ϕ(0̄) ⊃ ϕ(1), ϕ(1), ϕ(1) ⊃ ϕ(11), ϕ(11), . . ., ϕ(a).
This sequence contains 2a + 2 elements, such that the first and second are the inductive
hypotheses, element 2b ∸ 1 is the conditional ϕ(b ∸ 1) ⊃ ϕ(b), which follows from the
second induction hypothesis by instantiation, element 2b is ϕ(b) and follows from the
preceding two by modus ponens for 11 ≤ b ≤ a + 1. Note that it is circular to assert that
the existence of the proof in question is established by induction, a point made in George
& Velleman (1998, p. 323).
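The unrolling of an inductive argument into a concrete proof can be sketched as follows. The sketch assumes that tally numerals are strings of `1`, that the empty string stands for the numeral 0, and that formulas are plain text; the function name `proof_sequence` and the textual rendering of ϕ are hypothetical.

```python
def proof_sequence(a: str) -> list[str]:
    """List the formulas of the unrolled proof of phi(a) for a tally numeral a."""
    assert set(a) <= {"1"}, "a must be a tally numeral"

    def phi(n: str) -> str:
        return f"phi({n or '0'})"

    # The two inductive hypotheses come first.
    seq = [phi(""), "phi(n) -> phi(n+1)"]
    b = ""
    while b != a:
        nxt = b + "1"
        seq.append(f"{phi(b)} -> {phi(nxt)}")  # instance of the second hypothesis
        seq.append(phi(nxt))                   # obtained by modus ponens
        b = nxt
    return seq
```

For a numeral of length a the list has 2a + 2 members, in agreement with the count given above. The loop only mirrors the shape of the proof; it is not offered as a justification of induction, which would be circular for the reason just noted.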
What would advance the finitist account of induction would be to show that the sequence
of symbols constituting the proof can be obtained on the basis of the original representation
of the numeral a. This could be made plausible if representing a were accompanied
by representing the sequence of all numerals smaller than it: 0̄, 1, 11, . . ., a = ℋ(a) (the
‘history’ of a).13 For if ℋ(a) can be represented with complete clarity and its parts can be
considered distinctly, then so can the sequence (a): the latter is obtained from the former
by turning its first element (ℋ(a))0 = 0̄ into the pair ϕ(0̄), ϕ(n) ⊃ ϕ(n + 1), and every
subsequent element (ℋ(a))b = b into the pair ϕ(b ∸ 1) ⊃ ϕ(b), ϕ(b).
It seems in line with Parsons’s analysis to claim that if a sequence of finitary objects
can be (globally) intuited and externally represented then so can any sequence obtained
from it by a transformation μ that acts on its individual members and is in itself
intuitive, in the sense that from a global intuition of b one can obtain a global intuition
of μ(b) and from a concrete representation of b one can obtain a concrete representation
of μ(b). This could be named a finitistic replacement principle (by analogy with the set-
theoretic one). The intuitive appeal of this principle is very strong, since substitution is
presupposed not only by the most basic function of language (symbolization), but also by
any proper expression of generality (involving pronouns or variables, expressions whose
role is that of placeholder for other expressions).
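The history of a numeral and the replacement principle can be pictured with a short sketch, assuming again that tally numerals are strings of `1` with the empty string for the numeral 0. The names `history` and `replace_members` are hypothetical; the transformation μ is passed in as an ordinary function.

```python
def history(a: str) -> list[str]:
    """The history of a: all initial segments of a, from the numeral 0 ('') up to a itself."""
    return [a[:k] for k in range(len(a) + 1)]

def replace_members(seq: list[str], mu) -> list[str]:
    """Replacement: apply the transformation mu to each member of the sequence."""
    return [mu(b) for b in seq]
```

For instance, `replace_members(history("111"), lambda b: f"phi({b or '0'})")` turns the history of `111` into the list of formulas ϕ(0), ϕ(1), ϕ(11), ϕ(111). The sketch treats μ as a black box and says nothing about which transformations are intuitive in the intended sense.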
One way to support the claim that if a is representable then so is ℋ(a) is from a visual-
geometric viewpoint. Imagining ℋ(a) (i.e., producing an internal global representation of
ℋ(a)) is not significantly more difficult than imagining a. Departing a bit from the view
that tally numerals are constituted by occurrences of 1, let us imagine them as sequences
of dots: • • . . . • • (as emphasized in Bernays (1923), the actual shape of the primitive
symbols involved is irrelevant). Then ℋ(a) is simply the triangular ‘figure’ with a as basis.
For a = • • • • • • • the respective figure is the triangle of rows •, • •, . . ., • • • • • • •.
Imagining ℋ(a) on the basis of a vague global representation of a is thus an easy exercise in
two-dimensional geometric intuition. However, it could be objected that visual perception
in at least two dimensions is not presupposed by Hilbert’s Cogito and constitutes a cogni-
tive resource that cannot be assumed available for the finitist, for nontrivial mathematics is
possible even when perception is restricted to one dimension. This is shown for instance by
Turing’s analysis of computation, which can be seen as an exercise in phenomenological
13 Hilbert & Bernays (1934) name this sequence ‘the series of numerals from 1 to a’ [die Reihe
der Ziffern von 1 bis a]. They describe it as follows: “The construction [Bildung] of a numeral a
proceeds through a concrete series of numerals, starting with 1 and ending with a, in which every
numeral is obtained from the preceding one by attaching 1. It is evident that, with the exception
of a, this series contains only numerals < a, and that any numeral < a must be present in this
series” (p. 24). It seems consistent with this passage to assume that the (global) intuition of a is
accompanied by the (global) intuition of ℋ(a).
imagination. The perceptual space of his idealized computing agent is a finite portion of
the read/write tape and as such unidimensional. Yet such an agent could easily imitate the
mathematical behavior of perceptually more sophisticated agents.
Unlike two-dimensional perception, memory seems an indispensable resource for math-
ematical thought, and indeed for any coherent form of thought (how is one to carry out
an intention without remembering what it is?). Memory has a dual aspect—one may dis-
tinguish between internal and external memory (we store tokens of strings on external
representation systems) and their compatibility is a key precondition to adequate thought
and communication. It is memory that can be invoked in justifying the history principle,
for if a string a is constructed through a temporal succession of acts of inscribing basic
signs in a representation system, then ℋ(a) is simply the recollection of the stages of this
process in their temporal order.
It should be remarked though that by producing concrete instances of a one does not
produce concrete instances of ℋ(a) in the same sense. But such a concrete representation
of ℋ(a) could be obtained if every individual sign x of a was replaced with the initial
segment of a that x determines (followed by some kind of marker). Therefore, a concrete
representation of ℋ(a) is guaranteed to exist by the finitistic replacement principle, pro-
vided the operation ax giving the initial segment of a that ends with x is considered
intuitive.
Thus it is a double application of the finitistic replacement principle that allows us to
conclude that (a) is representable if a is. If we accept the view that the conclusions of
finitistic proofs that can be globally represented in intuition are as reliable as their premises
we may infer that for every a, ϕ(a) follows from ϕ(1) and ϕ(n) ⊃ ϕ(n + 1), that is,
that induction is a finitistically acceptable mode of inference (restricted to finitistically
meaningful formulas, as ϕ(n) is assumed to be).14
So what are the elementary relations over numerals that may be used in a finitistic
inductive argument? Let us assume that basic finitistic statements are equations of form
f (a) = g(a) involving numerical variables and symbols for a special set of computable
functions. These functions form an algebra generated from some basic functions and
closed under operations such as composition. The function-generating operations come
along with rules of inference such as S, T , and possibly rules similar to R and U (we could
have a version of bounded recursion on notation, for example). Assuming that this algebra includes
addition, truncated subtraction x ∸ y, and the functions sg (sign) and s̄g (countersign),15
any formula ϕ(a) constructed from basic finitistic statements by means of the standard
sentential operators can be given a translation ϕ(a)∗ in the purely equational finitistic language
(without sentential operators). If the rules of inference associated with this algebra permit the
derivation of the standard logical rules for equality and sentential operators, then the basic
finitistic language may be enriched with the latter. This holds for every decent candidate
for the algebra of finitistic functions. The same cannot be said about the representation of
14 We should note that the argument above can be readily adapted to justify the form of induction
assumed by PRAG , that is, principle U1 . The only difference is that instead of a sequence of
applications of instantiation and modus ponens, the proof to be represented is a sequence of
applications of the rules S1 , S2 , and T .
15 I will take as a given that this algebra contains the zero, successor, and projection functions. ∸ is defined
using the predecessor function p (which satisfies p(0̄) = 0̄ and p(s(a)) = a), by a ∸ 0̄ = a and
a ∸ s(b) = p(a ∸ b); we have sg(0̄) = 0̄, sg(s(a)) = s(0̄), s̄g(0̄) = s(0̄) and s̄g(s(a)) = 0̄. These
equations just register the properties of the functions in question—they are intuitively clear and
need no proof.
bounded quantifiers in the basic finitistic language. If, for example, the algebra is identified with
the set of polynomial-time computable functions (on binary numerals), then the question
whether finitistic relations are closed under bounded quantification comes down to the
extremely difficult problem of the collapse of the polynomial time hierarchy (which most
researchers expect to ultimately receive a negative answer).
It turns out though that if the objects of arithmetic are conceived as tally numerals, the
projection principle ensures the existence of bounded sum, and thus allows the introduction
of bounded quantifiers. Bounded sum is a definition procedure that associates with every
function f(n, m) the function Σf(n, m) of the same arity defined by the equations
\[
\Sigma f(\bar{0}, m) = f(\bar{0}, m), \qquad \Sigma f(s(n), m) = \Sigma f(n, m) + f(s(n), m).
\]
Suppose now that the arguments a, b are representable. By the history principle so is
ℋ(a) and by the finitistic replacement principle the sequence ⟨0̄, b⟩, ⟨1, b⟩, . . ., ⟨a, b⟩ is
also representable (for surely forming the pair ⟨c, b⟩ is a finitistic operation given c ≤ a
and b). A further application of the replacement principle yields the representation of
the sequence f(0̄, b), f(1, b), . . ., f(a, b). Replacement can also assume the guise of a
union principle which guarantees that any concretely represented sequence of numerals
can be converted intuitively into a single numeral (simply by eliminating the breaks
between its members). For suppose that among the basic signs we intuit are a break
marker and the empty symbol (symbolized by ‘ε’). Then if sequences of tally
numerals are strings of ‘1’ interspersed with occurrences of the break marker, the union
of such a sequence is the result of substituting ε for the break marker in it. Thus from a
global intuitive representation of a, b it should be possible to obtain the global intuitive
representation of the bounded sum of f at a, b through the kind of
‘thought experiment’ that characterizes finitistically acceptable functions. It is also easy to
show that by using projections and composition one can apply summation over any of the
variables of a finitistic function.
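The three steps of this argument (history, replacement, union) can be traced in a small sketch on tally strings. It is an illustration under stated assumptions only: the break marker is arbitrarily taken to be `/`, the empty string stands for the numeral 0, and the names `history` and `bounded_sum` are hypothetical.

```python
BREAK = "/"  # hypothetical break marker separating the members of a sequence

def history(a: str) -> list[str]:
    """All initial segments of the tally numeral a, from '' (zero) up to a."""
    return [a[:k] for k in range(len(a) + 1)]

def bounded_sum(f, a: str, b: str) -> str:
    """Sum f(c, b) over all c <= a, computed purely by operations on strings."""
    values = [f(c, b) for c in history(a)]   # replacement applied to the history of a
    sequence = BREAK.join(values)            # one concrete string with break markers
    return sequence.replace(BREAK, "")       # union: substitute the empty symbol for the marker
```

With addition as concatenation, `bounded_sum(lambda c, b: c + b, "11", "1")` yields a tally numeral of length 6, namely (0 + 1) + (1 + 1) + (2 + 1). The function `f` is assumed to return tally strings that do not themselves contain the break marker.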
If the above argument is seen as successful and all the other assumptions made about
the algebra of finitistically acceptable functions (namely that it includes zero, successor,
predecessor, projections, addition, truncated subtraction, sign and countersign, and that it is
closed under composition) are maintained, then the set of finitistic propositional functions
that can be justified as meaningful is closed under bounded quantifiers.
Switching focus now on the basic relations between numerals, we find that in Section 2
of Hilbert & Bernays (1934) the authors consider identity [Übereinstimmung], symbolized
by ‘=’ and difference [Verschiedenheit], symbolized by ‘≠’, as basic. They also assume
certain primitive operations [Handlungen] and constructive processes [Bildungsprozesse]
whereby other numerals [Ziffern] are obtained from given ones. The most fundamental of
these is the ‘progression process’ [Prozess des Fortschreitens], namely the concatenation
of 1 to a given numeral; but concatenation can take any two other figures [Figuren]
as arguments as well, and so we have the operation of addition, denoted by ‘+’. Associated with
it is the relation of order: one can show that a < b only by producing a c such that
a + c = b, since it is only in this way that a is an initial segment [Anfangsstück]
of b. Multiplication is another basic operation on numerals described by means of an
intuitive procedure on strings: “a · b denotes the numeral obtained from the numeral b,
when substituting throughout its structure 1 with the numeral a” (p. 24). Just as with
addition, multiplication is not given a recursive definition—rather, its fundamental property
of distributivity with respect to addition is supposed to follow directly from its intuitive
definition.
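The intuitive definitions of addition, order, and multiplication on tally numerals translate directly into string operations. The following sketch assumes numerals are strings of `1`; the function names `add`, `less`, and `mul` are hypothetical.

```python
def add(a: str, b: str) -> str:
    """Addition as concatenation of two figures."""
    return a + b

def less(a: str, b: str) -> bool:
    """a < b holds when some non-empty c satisfies a + c = b, i.e. a is a proper initial segment of b."""
    return b.startswith(a) and a != b

def mul(a: str, b: str) -> str:
    """a · b: substitute the numeral a for every occurrence of 1 in b."""
    return b.replace("1", a)
```

Distributivity can be checked on examples: `mul(a, add(b, c))` and `add(mul(a, b), mul(a, c))` produce the same string, because substituting into a concatenation is the same as concatenating the substituted parts.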
In Grundlagen der Mathematik no other fundamental operations are defined and the
authors proceed to the explanation of primitive recursion. In a generous interpretation we
could consider that functions shown to be finitistically acceptable constitute the functional
algebra described above (multiplication can be considered as basic, but it can also be
defined by means of bounded sum). These functions, known as the lower elementary func-
tions, have been discussed in Skolem (1962) and their set is denoted by ℒ2 in Rose (1984).
The formulas available for an inductive proof that primitive recursion is a permissible
definitional method are those expressed by the Δ0 formulas of the language {0̄, =} ∪ Fℒ2
(where Fℒ2 is the set of symbols for functions in ℒ2 ).16 But these are insufficient for the
purpose of proving that definition by primitive recursion [das Verfahren der rekursiven
Definition] is an elementary procedure (that it produces elementary functions when applied
on elementary functions). Here is how Hilbert and Bernays describe it:
A new function symbol, say ϕ, is introduced, and the [corresponding]
function is defined by two equations. In the simplest case, these equa-
tions are of the form:
ϕ(1) = a
16 We need not take the symbol ‘≤’ as primitive, since x ≤ y may be taken to be x ∸ y = 0̄.
the computation when given ψ(ψ(. . .ψ(a, 1). . .), m) as input or provide a bound for the
length of this computation). If the ‘elementary’ properties of tally numerals are those
expressed by the Δ0 formulas of the language {0̄, =} ∪ Fℒ2 , then induction over those
properties cannot possibly prove the existence of all primitive recursive functions, even
if we allow ourselves the full use of classical logic. For consider the first-order theory
axiomatized by the equations defining the functions in ℒ2 (including s(x) ≠ 0̄ and s(x) =
s(y) ⊃ x = y) and the Δ0 induction scheme (for the language {0̄, =} ∪ Fℒ2 ). This theory
can be proved consistent in a finite fragment of PRAG , and therefore cannot prove the
existence of all the functions in that fragment.
§4. Two theories that may be one and the same. The organizing idea of Tait (1981)
is to identify the forms of reasoning implicit in the concept of number. According to
Tait, understanding Number consists essentially in the ability to count, that is, to order
collections and to put them in correspondence with numbers (objects generated from 0
by successor). I will combine this idea with Hilbert’s Cogito and give them a precise
interpretation in the typed framework used in Tait’s paper, extended so as to accommodate
syntactical as well as numerical types.
Following Hilbert, I will take the basic type E to be that of expressions (the “extra-
logical discrete objects, which exist intuitively as immediate experience before all
thought”). E is a free monoid with a finite set of generators ℬ (the basic alphabet
that includes the standard signs used in formal theories of arithmetic), basic operation
∧ : E² → E (concatenation), and neutral element ε (the empty expression). Thus we have
if x : S_T^n and x ∗_{s(n)} y : S_T^{s(n)} if x, y : S_T^{s(n)}. It should be noted that the indexed sequence types
are cumulative (i.e., if m ≤ n and x : S_T^m, then x : S_T^n) and that S_T coincides with S_T^{s(0)}.
The notions previously introduced for S_T can be extended for S_T^n (n ≥ 1). Counting has
an indexed version of type S_T^n → N with analogous properties: the count of x is 1 for x : S_T^{p(n)},
and the count of x ∗_n y is the sum of the counts of x and y for x, y : S_T^n. If z : S_T^n and k is
at most the count of z, then the initial segment of z of length k is defined as
the string u : S_T^n such that z = u ∗_n v for some
v : S_T^n and the count of u is k. If this initial segment is x, or is y ∗_n x, for some x : S_T^{p(n)}, then we write (z)_k = x
(x is the k-th element of the sequence z). If x = (z)_k for some k at most the count of z we write x ∈ z.
The transitive closure of this relation will be written as ∈∗ . The union of all types S_T^n is S_T^∗ .
The T -rank r T (x) of x : ST∗ is the smallest n such that x : STn (it is the length of the
longest expression of form nT that occurs as a substring of x). We write RTn for the type of
strings x: ST∗ such that r T (x) = n. RT0 coincides with T and RT is characterized by the
s(n)
a ∧ b∧ a ∧ bb∧ a ∧ bbb∧ a,
that is, the sequence in RA3 whose first element is z and the second is (y)2 .
Therefore the ranked substitution operation sub RT : (ST∗ ∧ ST∗ ) ∧ ST∗ → ST∗ will be
characterized by the properties subRT (x, y, z) = z if x = y; subRT (x, y, z) = y if r T (x) =
r T (y) (i.e., x, y : RTn where n = r T (x)) and x ≠ y; subRT (x, y, z) = y if r T (x) > r T (y)
(since in that case x ∉∗ y); sub RT (x, u ∗k v, z) = sub RT (x, u, z) ∗l sub RT (x, v, z), where
k > r T (x), u, v : STp(k) , x ∈∗ u or x ∈∗ v and l = max(k, r T (z) + 1). It is clear that
Certain equations are taken as axioms (type Ax), describing the basic properties of the
functions s, p, z, πm,n , +, ∸ and of functions defined by means of composition and
summation. The complete list is as follows:
19 If the language of the theory is enriched with the operator Rb , then f would be represented as
Rb [g, h, g′].
(8) is justifiable in terms of the finitistic choice principle, for given a sequence argument
m, n one can obtain the sequence of sequences ℋ(g′(m, 0)), ℋ(g′(m, s(0))), . . .,
ℋ(g′(m, n)). This array of numbers includes all the values of the function Rb [g, h, g′]
for arguments m, c, where c ≤ n. Identifying Rb [g, h, g′](m, n) in the last component
of this sequence can be done through n + 1 choices, corresponding to the successive values
Rb [g, h, g′](m, 0) = g(m) ≤ g′(m, 0),
Rb [g, h, g′](m, s(0)) = h(m, 0, Rb [g, h, g′](m, 0)) ≤ g′(m, s(0)), . . .,
Rb [g, h, g′](m, n) ≤ g′(m, n). Using the notation previously
introduced, we have Rb [g, h, g ](m, n) = ( (n), g (m, 0)), h). The crucial difference
between primitive recursion and bounded recursion is that in the case of the latter the value
of the newly introduced function belongs to a collection of numbers whose existence we
accept every time we represent the corresponding argument.
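The idea that each value of a function defined by bounded recursion is chosen from the history of the corresponding bound can be sketched as follows. The sketch works on tally strings (the empty string standing for the numeral 0); the names `history` and `bounded_recursion`, and the decision to signal a violated bound with `ValueError`, are assumptions made for the illustration.

```python
def history(a: str) -> list[str]:
    """All initial segments of the tally numeral a, from '' (zero) up to a."""
    return [a[:k] for k in range(len(a) + 1)]

def bounded_recursion(g, h, bound):
    """Return f with f(m, 0) = g(m) and f(m, s(c)) = h(m, c, f(m, c)), each value taken from the history of the bound."""
    def choose(value: str, limit: str) -> str:
        # The value must already occur among the numerals given with the bound.
        if value not in history(limit):
            raise ValueError("value lies outside the history of the bound")
        return value

    def f(m: str, n: str) -> str:
        value = choose(g(m), bound(m, ""))
        for c in history(n)[:-1]:  # c ranges over the numerals smaller than n
            value = choose(h(m, c, value), bound(m, c + "1"))
        return value

    return f
```

As a usage example, taking `g = lambda m: m`, `h = lambda m, c, v: v + m`, and the bound `lambda m, n: (n + "1").replace("1", m)` defines the function m · (n + 1), every value of which is found in the history of the bound. With a bound that is too small the sketch raises an error instead of returning a value.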
It is not known whether the addition of schema (8) genuinely strengthens ℒ2 . It may be
that the two equational theories presented above are equivalent since it is an open problem
whether the algebra of lower elementary functions used by ℒ2 is closed under bounded
recursion. Evidence regarding the considerable mathematical strength of ℇ2 is given in
Berarducci & Intrigila (1991) and Cornaros (1995)20 : the combinatorics it can develop is
enough for proofs of classical results in number theory such as Bertrand’s postulate or
quadratic reciprocity. It would be interesting to test the theory with respect to Dirichlet’s
theorem on primes in arithmetical progressions.
What is a number? Originally, sequences of tally marks
were used to count things. Then positional notation
—the most powerful achievement in mathematics —
was invented.
Edward Nelson.
§5. Binary Arithmetic. I do not claim that the theory ℇ2 is the strongest arithmetical
formalism that can be justified in clearly finitistic terms. The foundation offered to the
theories in Section 4 suffers from an obvious limitation: the numerals considered are
homogeneous strings of a single symbol—following Hilbert’s original views, the objects
of arithmetic are taken there to be tally numerals. Mastering communication at even a
basic level usually presupposes producing and recognizing messages (strings of symbols)
from a richer alphabet, and in introducing theories ℒ2 and ℇ2 I have used facts about
our intuitions concerning such objects. Hilbert’s Cogito is thus probably compatible with
stronger theories that conceive of numbers as written in positional notation, but identifying
such theories is not easy.
An obvious candidate for a finitistically acceptable extension of ℇ2 that utilizes po-
sitional notation is Cook’s theory PV (see Cook (1975), Cook & Urquhart (1993) and
Krajı́ček (1995, pp. 76–78)). This is an equational theory that describes polynomial-time
20 The articles in question describe the theory I ℇ2 , but it can be shown that this theory is a
∗
conservative extension of using a cut elimination argument—see theorem 1.4.2 in Buss
ℇ2
(1998, p. 111).
Unfortunately we meet a stumbling block when we try to adapt the schema of bounded
recursion to binary numerals and complete the theoretical structure of PV. The so-called
bounded recursion on notation definitional schema (R B ), essential in the structure of PV,
introduces a function f(x, y) using functions g(x, y), h0(x, y, z), and h1(x, y, z) by the
conditions
\[
f(x, 0) = g(x, 0), \qquad f(x, s_i^B(y)) = h_i(x, y, f(x, y)) \quad (i = 0, 1),
\]
provided that f(x, y) ≤ g′(x, y) for every x, y (one can replace this inequality by a different
sufficient condition that does not mention f directly). Here x ≤ y will be understood
as Less(x, y) = 0.
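For orientation, the schema can be written out as a short program on natural numbers, where the two binary successors are y ↦ 2y and y ↦ 2y + 1. This is only a sketch under stated assumptions: the base function is applied to x alone, the bounding function is passed explicitly as `bound`, and the names `bounded_recursion_on_notation` and `ones` are hypothetical.

```python
def bounded_recursion_on_notation(g, h0, h1, bound):
    """Return f with f(x, 0) = g(x) and f(x, 2y + i) = h_i(x, y, f(x, y)), checking f(x, y) <= bound(x, y)."""
    def f(x: int, y: int) -> int:
        value, prefix = g(x), 0
        if value > bound(x, prefix):
            raise ValueError("bound violated at y = 0")
        digits = bin(y)[2:] if y > 0 else ""
        for d in digits:  # read the binary notation of y from the leading digit
            value = (h1 if d == "1" else h0)(x, prefix, value)
            prefix = 2 * prefix + int(d)
            if value > bound(x, prefix):
                raise ValueError(f"bound violated at y = {prefix}")
        return value
    return f

# Example: the number of digits 1 in the binary notation of y, bounded by y itself.
ones = bounded_recursion_on_notation(
    g=lambda x: 0,
    h0=lambda x, y, v: v,
    h1=lambda x, y, v: v + 1,
    bound=lambda x, y: y,
)
```

The recursion takes one step per binary digit of y, so the number of steps is the binary length of y rather than y itself. The program merely computes the function; it does not address the question, raised in the text, of whether the schema can be justified finitistically.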
The trouble is that the finitistic choice principle cannot be invoked in the same way it
was in the justification of the bounded recursion schema for functions on tally numerals.
The history sequence for a binary numeral does not include all binary numerals smaller
than it and therefore it is hard to imagine how starting from the intuition of a sequence of
arguments m ∧ B∧ n : B k+1 one can derive the intuition of a sequence of binary numerals
that would include f (m, n) even if the limiting function g′ is available. The point is that
ℋ B (g′(m, n)) will not necessarily include the string that would be identified with f (m, n),
since f (m, n) ≤ g′(m, n) does not imply that f (m, n) is an initial segment of g′(m, n).
Assuming in general the existence of the collection of all binary numerals smaller than
an arbitrary one would help restore the argument in favor of the bounded recursion schema
but would also seem to be a considerable step on the abstraction scale; furthermore, if
this step is taken, the resulting theory would not be PV, but rather elementary arithmetic,
since the collection of binary numerals smaller than the binary numeral n is exponential
in size relative to n. More specifically, if we set up ℋ_B^exp : B → S_B such that it satisfies
ℋ_B^exp(0) = 0 and ℋ_B^exp(s_1^B(s_1^B(x))) = ℋ_B^exp(x) ∗ s_1^B(s_0^B(x)) ∗ s_1^B(s_1^B(x)) for every x : B,
then the function B (ℋ_B^exp(n)) : N → N is simply the exponential function 2^n : N → N
(a sequence of 1’s can be interpreted both as a tally numeral and as a binary one).
Smaller steps on the abstraction scale can be taken, in such a way that they do not involve
such a radical departure from the demands of intuition. Let us recall Tait’s minimal criterion
for understanding Number: it is the ability to count finite collections using numbers as
representations for those collections. Applied to binary strings it tells us that having a
minimal understanding of their arithmetical nature presupposes the ability to use them
in counting—say, in counting symbols in tally numerals. We are thus led to postulating
the intuitive character of the function B : N → B satisfying B (0) = B (1) = 1,
B
(s(s(0) · n) = s0B ( B (n)), B ((s(s(0) · n)) + 1) = s1B ( B (n)) for every n : N . Usually
the function B ( (x)) is written |x|—it is the binary numeral for the number of symbols
in the expression x (the binary length of x).
It seems natural to claim that if we can produce (intuit) an expression x: E then we can
also count its symbols in binary, that is, produce |x|. But when we count in binary, we do
go through all the consecutive numerals leading up to the final total and thus we are led to
a more modest binary history principle: whenever we intuit a binary numeral x, we also
log
intuit the collection of all binary numerals smaller than or equal to |x|. ℋ B : B → S B
log log
satisfies B ( B (ℋ B (x))) = |x| and (ℋ B (x))i = B (i) for every 1 ≤ i ≤ B (ℋ B (x)).
The basic finitistic insight regarding the logarithmic history of a binary numeral x is that
z ∈ ℋ B (x) if and only if z ≤ |x|.
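The contrast between the logarithmic history of an expression and the full collection of smaller binary numerals can be made concrete with a small sketch. It assumes that expressions are Python strings and binary numerals are strings of `0` and `1`; the names `binary_length` and `log_history` are hypothetical, and the sketch lists the numerals from 1 upward, as they are met when counting.

```python
def binary_length(x: str) -> str:
    """|x|: the binary numeral for the number of symbols in the expression x."""
    return bin(len(x))[2:]

def log_history(x: str) -> list[str]:
    """The binary numerals passed through when counting the symbols of x, from 1 up to |x|."""
    return [bin(i)[2:] for i in range(1, len(x) + 1)]
```

For the numeral `10110` the binary length is `101` and the logarithmic history is `1, 10, 11, 100, 101`. It has as many members as the numeral has digits, whereas the binary numerals smaller than `10110` number 22.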
This logarithmic history principle (a second version of a binary history principle) sup-
ports the so-called sharply bounded recursion on notation schema (R sB ), identical to R B
except for the stronger condition f (x, y) ≤ |g′(x, y)|. The argument for R sB would proceed
by analogy with the argument for the bounded recursion schema for functions on tally
numerals: from the representation of n derive the representation of ℋ B (n); replace every
element of this sequence with g (m, n); make the |n| choices from the elements of this last
sequence (using the functions g, h 0 , and h 1 ) to ultimately select the value of f (m, n).
It turns out that once R sB is in place we are positioned to identify a clear candidate for
the algebra of finitistically acceptable functions on binary numerals: it is the set ℱLOGSPACE
of functions of polynomial growth whose bitgraphs are computable in logarithmic space
(starting from work by J. Lind, this complexity class is characterized as a functional
algebra in Clote (1999), theorem 3.22). ℱLOGSPACE is proved to be the closure of the set
{z, I , s0B , s1B , #, Bit} (where z is the constant 0 function and I is the set of all projection
functions) under the operations of composition, sharply bounded recursion on notation
and concatenation recursion on notation (CRN). CRN defines a function f(x, y) from
functions g(x, y), h0(x, y), and h1(x, y) with the property that hi(x, y) ≤ 1 (i = 0, 1) by
the conditions
\[
f(x, 0) = g(x, 0), \qquad f(x, s_i^B(y)) = s^B_{h_i(x, y)}(f(x, y)) \quad (i = 0, 1).
\]
This last ingredient of the characterization theorem can also be easily justified with the
apparatus elaborated in Section 4. First, let us remark that the functions h i ( x , y) can be
considered of type S B → N (since their only possible values are 0, 1). Second, the function
Last: B → N defined by Last(x) = (x) (x) gives the last digit of the binary numeral x,
that is, (0) (0) = 0 and (siB (x)) (s B (x)) = i (with i = 0, 1). Third, let u: S B → N be the
i
function defined by u(x, y) = h0(x, y) · s̄g(Last(y)) + h1(x, y) · sg(Last(y)). Suppose that
n : S B ; applying the first binary history principle we obtain the existence of the sequence
m,
ℋ B (n) : S B . By the replacement principle there exists the sequence u S (ℋ B (n)) : S N
obtained from ℋ B (n) by replacing each member k ∈ ℋ B (n) by u(m, k). f (m,
n) is simply
0)
sub( B , ε, u S (ℋ B (n))).
g(m,
An equational theory B can be organized on the basis of this algebra (in analogy with
PV being supported by the algebra of polynomial-time computable functions) that clearly
defines functions not in ℇ2 (e.g., #). However, it is very doubtful that ℇ2 ⊆ ℱLOGSPACE
(the initial functions of ℇ2 are in ℱLOGSPACE , but bounded recursion is a powerful resource
that may be impossible to replicate in ℱLOGSPACE or even in ℱPTIME ) and therefore it is
not clear that B represents a genuine strengthening of ℇ2 . It turns out that integrating
our arithmetical intuitions regarding unary and binary numerals in a unified theoretical
framework is not that simple! We are led to ask if there is a natural functional algebra that
includes both ℇ2 and ℱLOGSPACE and whose operations can be seen as finitistic ‘thought
experiments’ on strings.
Thus we have not reached a clear limit to the possibilities of intuitive arithmetical
theorizing. There may very well be ways to strengthen B and ℇ2 while remaining within
the confines imposed by Hilbert’s Cogito, and gauging the exact strengths of the various
finitistically justifiable theories could depend on difficult problems in complexity theory.
However, it is doubtful that elementary arithmetic could be shown to be compatible with
the demands of our fundamental syntactical intuition. If the existence of the exponential
function is justifiable intuitively (where the intuition in question is the ability to use strings
of symbols for the purposes of communication and inference) then it would turn out that
implicit in the intuition of any string of symbols x : ℒ (where ℒ is the language used to
formalize mathematical practice) is the intuition of the collection E(x) of all strings of
142 MIHAI GANEA
symbols of length equal to x, the vast majority of which are useless for the purposes of
communication within ℒ. The apparent incompatibility between Hilbert’s Cogito and the
exponential function thus adds further support to the view expressed in Nelson (1986)
that exponentiation belongs to the realm of abstract mathematics. Even if the boundary
between intuitive and nonintuitive arithmetic is a fuzzy one, elementary arithmetic (the
equational theory of the algebra of Kalmár elementary functions) seems to lie decidedly
on the nonintuitive side.
It is probably unreasonably optimistic to expect that we may reach a precise formal
characterization of a concept such as that of intuitive arithmetic. The analysis in this
paper seems to indicate that we may uncover a collection of finitistic arithmetics, in which
theories gain in strength as they lose their intuitive character and may not be ordered in a
simple hierarchy. Hilbert’s Cogito can ultimately be interpreted as the demand to describe
the structure of this collection.
BIBLIOGRAPHY
Beeson, M. (1986). Proving programs and programming proofs. In Barcan Marcus, R.,
Dorn, G. J. W., and Weingartner, P., editors. Logic, Methodology and Philosophy of
Science VII, Proceedings of the International Congress, Salzburg, 1983. Amsterdam:
North-Holland, pp. 51–81.
Berarducci, A., & Intrigila, B. (1991). Combinatorial principles in elementary number
theory. Annals of Pure and Applied Logic, 55, 35–50.
Bernays, P. (1923). Erwiderung auf die Note von Herrn Aloys Müller: Über Zahlen als
Zeichen. Mathematische Annalen, 90, 159–163. English translation in Mancosu (1998),
pp. 223–226.
Bernays, P. (1930). Die Philosophie der Mathematik und die Hilbertsche Beweistheorie.
Blätter für deutsche Philosophie, 4, 326–367. English translation in Mancosu (1998),
pp. 234–265.
Buss, S. (1998). First order proof theory of arithmetic. In Buss, S., editor. Handbook of
Proof Theory. Amsterdam: Elsevier Science BV, pp. 79–147.
Clote, P. (1999). Computation models and function algebras. In Griffor, E. R., editor.
Handbook of Computability Theory. Amsterdam: Elsevier, pp. 589–681.
Cook, S. (1975). Feasible constructive proofs and the propositional calculus. In Chandra,
A. K., Meyer, A. R., Rounds, W. C., Stearns, R. E., Tarjan, R. E., Winograd, S., and Young,
P. R., editors. Proceedings of the 7th ACM Symposium on the Theory of Computation.
New York: ACM (Association for Computing Machinery), pp. 83–97.
Cook, S., & Urquhart, A. (1993). Functional interpretations of feasibly constructive
arithmetic. Annals of Pure and Applied Logic, 63, 103–200.
Cornaros, C. (1995). On Grzegorczyk induction. Annals of Pure and Applied Logic, 74,
1–21.
Curry, H. (1941). A formalization of recursive arithmetic. American Journal of
Mathematics, 63, 263–282.
DEPARTMENT OF PHILOSOPHY
BOSTON UNIVERSITY
BOSTON, MA 02215
E-mail: mganea@bu.edu
THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010
Abstract. We study the logical structure of ‘real’ relations, and in particular the notion of
occurrences of objects in a state. We start with formulating a number of principles for occurrences
and defining corresponding mathematical models. These models are analyzed to gain more insight
into the formal properties of occurrences. In particular, we prove uniqueness results that tell us more
about the possible logical structures relations might have.
§1. Introduction. Relations are sometimes simply identified with sets of tuples of
certain objects. For mathematical relations this is obviously appropriate since they are sets
of tuples. But what about other relations, like, for example, the love relation? The idea that
the state of Albert’s loving Karen would be nothing but a tuple seems perverse.
According to Fine (2000), the standard view of relations is that the constituents of a
relation always come in a certain order. But then any relation also has a converse relation.
For example, the relation is older than has the converse relation is younger than. Now
consider the state of Thomas’s being older than Nick. If—in a certain context—you regard
this as exactly the same state as Nick’s being younger than Thomas, and you regard this
state as a relational complex of a single underlying relation, then the question is, of which
relation?
On an alternative view of relations any relation comes with orderless positions or
argument-places to which objects can be assigned. For example, for the love relation
we have the positions lover and beloved. Unfortunately, this alternative view is also
problematic. Fine (2000, pp. 16–17) raises two objections. First, positions seem ontologically
excessive, and second, he regards positions as unsuitable for strictly symmetric
relations, like the adjacency relation, since switching positions of arguments would
give a different state. (See Leo, 2008b, for a more detailed discussion of argument-places.)
A very promising view of relations introduced by Fine is based on the notion of sub-
stitution. According to this antipositionalist view the states of a relation form a network
in which substituting objects of a state by other objects yields another state. For example,
substituting Roos for Albert and Bas for Karen in the state of Albert’s loving Karen, results
in the state of Roos’s loving Bas.
In Leo (2008a) we defined models that represent different views of relations. In partic-
ular we defined substitution models. We argued that substitution models adequately model
a large class of ‘real’ relations. However, we also noted that these substitution models
have certain limitations. For example, they have no typed domains for the objects. As a
consequence, they are not accurate for a relation like ‘drinks’, because “Mo drinks tea”
corresponds in a natural way to a state, but “tea drinks Mo” does not. We do not consider
© Association for Symbolic Logic, 2010. doi:10.1017/S1755020309990347
146 JOOP LEO
this limitation to be a serious one, since typed domains can be incorporated into the models
in a straightforward way.
Substitution models as they were defined might have a more serious shortcoming.
It could be argued that the substitution mechanism of the models is perhaps not refined
enough, since objects are not explicitly substituted for individual occurrences of objects,
but only for objects in a global sense. For example, for the state of Narcissus’s loving
Narcissus, the model did not allow for a substitution resulting in the state of Echo’s loving
Narcissus. Now, if occurrences of objects are a basic notion for relations, then this is a
serious limitation of our substitution models.
But what exactly are occurrences? The notion of occurrences does not have the reputa-
tion of being crystal clear. Mates (1972, p. 49) even called it a “woolly notion.” Occurrences
can be considered in different contexts, for example, in expressions (Wetzel, 1993; Janssen
& Visser, 2004; Kracht, 2007). In this paper we try to get a better grip on the nature of
occurrences of objects in the logical space of ‘real’ relations by developing mathematical
models for relations.1
Our approach is as follows. First, we formulate an initial set of principles for occur-
rences. Then, in light of these principles, we define mathematical models in which occur-
rences have a constitutive function. We perform a technical investigation of these models
to get a better understanding of the formal properties of occurrences. Section 6 is the more
philosophical part of the paper. There we try to say more about the nature of occurrences
and consider the question whether, for ‘real’ relations, models with a refined substitution
mechanism for occurrences of objects are in a relevant sense more complete or adequate
than models with an undifferentiated substitution mechanism. In Section 6 we also briefly
discuss the idea of explicitly distinguishing between states ‘out there’ and relational com-
plexes, and to allow single states to figure as relational complexes in more than one re-
lation. On a first reading of this paper, it might not be a bad idea to peek ahead to this
section.
§2. Basic principles for occurrences. In his paper ‘The Problem of De Re Modality’,
Fine (1989) proposes the development of a general theory of constituent structure, where
the basic structure of the entities involved is given by the operation of substitution. Fine
considers the following syntactic notions to be basic: occurrence of, occurrence in, and
substitution. He gives the following example of a basic principle for these notions:
One basic principle, for example, is that if F′ is the result of substituting
E′ for the occurrence e of E within F, then there is an occurrence e′ of
E′ within F′ such that the result of substituting any expression E″ for e′
within F′ is identical to the result of substituting E″ directly for e in F.
The objective of this paper is not to develop a general theory of constituent structure. We
restrict ourselves to the structure of relations. We postulate the following basic principles
for occurrences of objects in relations:
P-1 Each relation has a nonempty set of states.
P-2 Each state of a relation has exactly one set of occurrences of objects.
1 By the logical space of a relation we mean the totality of the substitutional interconnections in
which the states of a relation stand to each other.
MODELING OCCURRENCES IN RELATIONS 147
I do not claim that the principles given above are in any sense complete. For example,
one might perhaps like to add that different states have no occurrence in common. Also
from these principles it cannot always be deduced uniquely how many occurrences a state
has. They do not tell us whether the state of Narcissus’ loving Narcissus has one or two
occurrences.
Principle P-8 may seem too weak. The mapping μ is surjective by P-7 and P-8, but
could we not also demand that μ is injective? Unfortunately, for certain relations this seems
problematic, since substitution might perhaps result in a coalescence of occurrences. This
is illustrated by the following three exemplary situations.
1. Consider the variably polyadic relation of waiting for the bus. Take the state of
Janneke and Vincent’s waiting. If we substitute Janneke for Vincent, we expect to
get—if anything—a state with only one occurrence.
2. Consider the ternary relation where abc is the state that a loves b and b loves
c. Assume that the order of the conjuncts is irrelevant. Now suppose that a, b, c are
three different objects, and that each has one occurrence in the state abc. Substi-
tuting in this state a for the occurrence of c gives the state aba. Simultaneously
substituting in the state abc, the objects b, a, b for the occurrences of a, b, c gives
the state bab, which appears to be the same state as the state aba. It follows
by Principle P-8 that the state aba has only one occurrence of a and only one
occurrence of b. So, μ cannot be injective (see Figure 1).
3. As a last example of a possible coalescence consider the conjunction R & R
of a relation R with itself. We may envision that the occurrences of each state
s & s in the conjunction relation correspond one-to-one to the occurrences of s
itself in R.
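The second situation can be made concrete with a small sketch. It assumes, purely for illustration, that a state of the ternary relation is represented as the unordered set of its two "loves" conjuncts; the function name `state` is hypothetical.

```python
def state(x, y, z):
    """State 'x loves y and y loves z', with the order of the conjuncts
    irrelevant: modeled as an unordered set of (lover, beloved) pairs."""
    return frozenset({(x, y), (y, z)})


abc = state("a", "b", "c")
aba = state("a", "b", "a")   # substitute a for the occurrence of c in abc
bab = state("b", "a", "b")   # substitute b, a, b for the occurrences of a, b, c
```

Under this representation `aba` and `bab` are the same set, {(a, b), (b, a)}, which is the identification that forces the coalescence of occurrences discussed in the text.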
There may be arguments that can be used to reject a possible coalescence of occurrences.
Against a coalescence for the ternary love relation it may, for example, be objected that
substitution is a more subtle operation than we just seemed to suppose, and that substituting
a for the occurrence of c in the state abc does not give the same result as substituting
b, a, b for a, b, c in the state abc. In Section 6 we will discuss such an alternative view
in more detail.
In the next sections we will study the principles from a technical perspective by defining
and analyzing mathematical models based on them.
§3. Modeling occurrences. We define two types of substitution frames to model the
logical space of relations. In the first type, occurrences play no role and substitution is
defined for objects. In the second type, we have a more refined substitution mechanism
working on occurrences of objects.
3.1. Undifferentiated substitution frames. We call frames with substitution defined
for objects undifferentiated substitution frames. Since this type of frames is extensively
discussed in Leo (2008a), we give the definition without further explanation:
DEFINITION 3.1. An undifferentiated substitution frame is a triple $F = \langle S, O, \Sigma \rangle$,
where $S$ is a nonempty set of states, $O$ is a nonempty set of objects, and $\Sigma$ is a function
from $S \times O^O$ to $S$ such that
1. for all $s \in S$, $\Sigma(s, \mathrm{id}_O) = s$,
2. for all $s \in S$ and $\delta, \delta' \in O^O$, $\Sigma(\Sigma(s, \delta), \delta') = \Sigma(s, \delta' \circ \delta)$.
For convenience, we will often write $s \cdot_F \delta$ or $s \cdot \delta$ for $\Sigma(s, \delta)$. Further, we will also often
write $f \cdot g$ for $g \circ f$. With this notation, $\Sigma$ is such that for all $s \in S$ and for all $\delta, \delta' \in O^O$,
$s \cdot \mathrm{id}_O = s$, and $(s \cdot \delta) \cdot \delta' = s \cdot (\delta \cdot \delta')$.
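As a minimal sketch of Definition 3.1, the love relation over three objects can be checked against both frame conditions by brute force. The representation is an assumption made only for illustration: states are ordered pairs, a substitution is a Python dict from objects to objects, and the names `apply`, `then`, and `is_substitution_frame` are hypothetical.

```python
from itertools import product

O = ("albert", "karen", "roos")
S = list(product(O, repeat=2))          # state (x, y): "x loves y"


def apply(s, delta):
    """s . delta: replace every object of the state s by its image under delta."""
    return tuple(delta[x] for x in s)


def then(delta, delta2):
    """delta . delta2 in the notation of the text: first delta, then delta2."""
    return {x: delta2[delta[x]] for x in O}


identity = {x: x for x in O}
all_maps = [dict(zip(O, image)) for image in product(O, repeat=len(O))]


def is_substitution_frame():
    """Check s . id = s and (s . d) . d2 = s . (d . d2) for all s, d, d2."""
    unit = all(apply(s, identity) == s for s in S)
    assoc = all(apply(apply(s, d), d2) == apply(s, then(d, d2))
                for s in S for d in all_maps for d2 in all_maps)
    return unit and assoc
```

Because substitution acts on objects rather than on occurrences, applying any substitution to a state of the form (x, x) again yields a state of the form (y, y).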
We now give a definition of the objects of a state. Roughly put, they are the objects
for which it makes a difference for the resulting state which objects are substituted for
them.
If Core-obF (s) is an object-domain, then we call this set the objects of s. We denote this
set as ObF (s). If Core-obF (s) is not an object-domain, then we leave ObF (s) undefined.
For each substitution frame we can define its degree as a cardinal number:
DEFINITION 3.3. Let $F = \langle S, O, \Sigma \rangle$ be an undifferentiated substitution frame. For a
state $s$ in $S$, we define the object-degree of $s$ as:
\[ \text{ob-degree}_F(s) = \mathrm{glb}\,\{|A| \mid A \text{ is an object-domain of } s\}. \]
Here |A| denotes as usual the cardinality of A, “glb” denotes the greatest lower bound,
and “lub” denotes the least upper bound. Note that the degree of s and the degree of F
always exist and are indeed cardinal numbers.
3.2. Differentiated substitution frames. In this subsection we define differentiated
substitution frames that allow for a fine-grained account of substitution in states. We also
discuss the adequacy of these frames, and present some basic properties.
3.2.1. Definition We give a definition of differentiated substitution frames that is based
on the principles for occurrences given in Section 2:
DEFINITION 3.4. A differentiated substitution frame is a tuple $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$,
where $S$ is a nonempty set of states, $O$ is a nonempty set of objects, $Oc$ is a set of occurrences,
$\Gamma$ is a mapping from $Oc$ to $O$, and $\Sigma$ is a function from $S \times O^{Oc}$ to $S$ such that
A further point to note is that in our models states may share the same occurrences. In
our analysis we will take care to explicitly mention if any of our results depend on this.
3.2.3. Basic properties We define occurrences of states as follows:
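DEFINITION 3.5. Let $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ be a differentiated substitution frame. Then
we call $X \subseteq Oc$ an occurrence-domain of $s \in S$ if for all $\sigma, \sigma' : Oc \to O$,
\[ \sigma =_X \sigma' \;\Rightarrow\; s \cdot \sigma = s \cdot \sigma'. \]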
If Core-ocG (s) is an occurrence-domain, then we call this set the occurrences of s, and we
denote it as OcG (s). If Core-ocG (s) is not an occurrence-domain, then we leave OcG (s)
undefined.
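For a small finite frame, occurrence-domains can be found by exhaustive search. The sketch below is an illustration under stated assumptions, not a construction from the paper: the love relation over two objects is modeled with occurrences of the form (state, position), the mapping from occurrences to objects (written `Gamma` here) returns the object at that position, and the core occurrences are taken to be the intersection of all occurrence-domains, by analogy with the core objects.

```python
from itertools import combinations, product

O = ("a", "b")
S = list(product(O, repeat=2))                      # state (x, y): "x loves y"
Oc = [(s, p) for s in S for p in (0, 1)]            # one occurrence per state and position
Gamma = {(s, p): s[p] for (s, p) in Oc}             # object of each occurrence
all_sigmas = [dict(zip(Oc, image)) for image in product(O, repeat=len(Oc))]


def apply(s, sigma):
    """s . sigma: fill each position of s with the object assigned to its occurrence."""
    return (sigma[(s, 0)], sigma[(s, 1)])


def is_occurrence_domain(X, s):
    """X is an occurrence-domain of s if s . sigma depends only on sigma restricted to X."""
    seen = {}
    for sigma in all_sigmas:
        key = tuple(sigma[alpha] for alpha in X)
        result = apply(s, sigma)
        if seen.setdefault(key, result) != result:
            return False
    return True


def core_oc(s):
    """Intersection of all occurrence-domains of s (assumed definition of the core)."""
    core = set(Oc)
    for r in range(len(Oc) + 1):
        for X in combinations(Oc, r):
            if is_occurrence_domain(X, s):
                core &= set(X)
    return core
```

In this toy frame the core occurrences of a state are exactly its own two positions, and they form an occurrence-domain, so the occurrences of the state are defined. The state ("a", "a") in particular has two occurrences of the single object a.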
DEFINITION 3.6. We say that a transition $s \to_{\sigma''} s''$ is a composition of $s \to_{\sigma} s'$ and
$s' \to_{\sigma'} s''$ if $\sigma'' =_X \mu \cdot \sigma'$ for some occurrence-domain $X$ of $s$ and some mapping $\mu$
corresponding to $s \to_{\sigma} s'$.
It may happen that Core-oc(s) is not an occurrence-domain:
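EXAMPLE 3.7. Let $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ be a differentiated substitution frame with $O$
an infinite set, $Oc = O$, $\Gamma = \mathrm{id}_O$, $S$ the set of subsets of $O$ modulo a finite difference, that
is,
\[ S = \{\tilde{A} \mid A \subseteq O\} \quad \text{with} \quad \tilde{A} = \{A' \subseteq O \mid A \mathbin{\triangle} A' \text{ is finite}\}, \]
and $\Sigma$ defined by
\[ \tilde{A} \cdot \sigma = \widetilde{\sigma[A]}.{}^{3} \]
$\Sigma$ is well-defined, since for any $A, B \subseteq O$, if $\tilde{A} = \tilde{B}$, then $\widetilde{\sigma[A]} = \widetilde{\sigma[B]}$. Further, $G$ is a
substitution frame, since
1. $\tilde{A} \cdot \Gamma = \widetilde{\mathrm{id}_O[A]} = \tilde{A}$ and
2. $(\tilde{A} \cdot \sigma) \cdot \sigma' = \widetilde{\sigma[A]} \cdot \sigma' = \widetilde{(\sigma \cdot \sigma')[A]} = \tilde{A} \cdot (\sigma \cdot \sigma')$.
It is not difficult to see that for any $A \subseteq O$, $\text{Core-oc}(\tilde{A})$ is the empty set, but if $A$ is
infinite, then the empty set is not an occurrence-domain of $\tilde{A}$.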
For undifferentiated substitution frames, a similar example can be given that shows
Core-ob(s) is not always an object-domain (Leo, 2008a, Example 3.16).
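LEMMA 3.8. Let $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ be a differentiated substitution frame. For
every $s \in S$, the occurrence-domains of $s$ form a (possibly nonproper) filter on $Oc$.
Proof. To prove that the occurrence-domains of $s$ are closed under finite intersection,
let $X$ and $X'$ be occurrence-domains of $s$. Let $\sigma, \sigma' : Oc \to O$ be such that $\sigma =_{X \cap X'} \sigma'$.
Define
\[ \sigma''(\alpha) = \begin{cases} \sigma(\alpha) & \text{if } \alpha \in X - X', \\ \sigma(\alpha) = \sigma'(\alpha) & \text{if } \alpha \in X \cap X', \\ \sigma'(\alpha) & \text{if } \alpha \in X' - X. \end{cases} \]
Then $\sigma'' =_X \sigma$ and $\sigma'' =_{X'} \sigma'$. So, $s \cdot \sigma = s \cdot \sigma'' = s \cdot \sigma'$. Thus $X \cap X'$ is an
occurrence-domain of $s$.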
It is trivial that the occurrence-domains of s are upward closed.
Since an occurrence-domain may be empty, we may have a nonproper filter.
DEFINITION 3.9. For a differentiated substitution frame $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$, we call
$A \subseteq O$ an object-domain of $s \in S$ if for some occurrence-domain $X$ of $s$,
\[ A = \Gamma[X]. \]
We define the core objects of $s$ as:
\[ \text{Core-ob}_G(s) = \bigcap \{A \mid A \text{ is an object-domain of } s\}. \]
If Core-ob(s) is an object-domain, then we call this set the objects of s, and we denote it
as Ob(s). If Core-ob(s) is not an object-domain, then we leave Ob(s) undefined.
LEMMA 3.10. Let $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ be a differentiated substitution frame. For
every $s \in S$, the object-domains of $s$ form a (possibly nonproper) filter on $O$.
Proof. To prove that the object-domains of $s$ are closed under finite intersection, let $A$
and $A'$ be object-domains of $s$. Then $A = \Gamma[X]$ and $A' = \Gamma[X']$ for some occurrence-domains
$X, X'$. So, $A \cap A' = \Gamma[X] \cap \Gamma[X'] \supseteq \Gamma[X \cap X']$. By Lemma 3.8, $X \cap X'$ is
an occurrence-domain. It follows that $A \cap A'$ is an object-domain of $s$.
By the surjectivity of $\Gamma$ and the upward closedness of occurrence-domains, the object-
domains of s are also upward closed.
Since an object-domain may be empty, we may have a nonproper filter.
In the next lemma we show that (i) the objects of the core-occurrences of a state form a
subset of its core-objects and (ii) if the occurrences of a state exist, then the objects of the
state also exist, and are exactly the objects of the occurrences of the state.
LEMMA 3.11. Let $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ be a differentiated substitution frame. For
every state $s \in S$,
(i) $\Gamma[\text{Core-oc}(s)] \subseteq \text{Core-ob}(s)$.
(ii) If $Oc(s)$ exists, then $Ob(s)$ exists and $\Gamma[Oc(s)] = Ob(s)$.
Proof. By Definition 3.5 and Definition 3.9,
\begin{align*}
\Gamma[\text{Core-oc}(s)] &= \Gamma\bigl[\bigcap \{X \mid X \text{ is an occurrence-domain of } s\}\bigr] \\
&\subseteq \bigcap \{\Gamma[X] \mid X \text{ is an occurrence-domain of } s\} \\
&= \text{Core-ob}(s).
\end{align*}
To prove the second claim, assume $Oc(s)$ exists. Then $Oc(s)$ is an occurrence-domain of
$s$. So, $\Gamma[Oc(s)]$ is an object-domain of $s$. Thus $\text{Core-ob}(s) \subseteq \Gamma[Oc(s)]$. By the first claim
of the lemma, we also have the reverse inclusion. So, $\text{Core-ob}(s) = \Gamma[Oc(s)]$. Because
$\Gamma[Oc(s)]$ is an object-domain, the claim follows.
The inclusion of Lemma 3.11 may be a proper inclusion, as the next example shows.
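EXAMPLE 3.12. Let $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ be a differentiated substitution frame with
$O = \{a, b\}$, $S = \{s_1, s_2\}$, $Oc = O \times \omega$, $\Gamma(x, n) = x$, and $\Sigma$ defined by
\[ s \cdot \sigma = \begin{cases} s_1 & \text{if } s = s_1 \text{, and both } \sigma^{-1}(a) \text{ and } \sigma^{-1}(b) \text{ are infinite,} \\ s_2 & \text{otherwise.} \end{cases} \]
We see that $Ob(s_1) = \{a, b\}$, but $\text{Core-oc}(s_1) = \emptyset$ and $Oc(s_1)$ is undefined.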
The next lemma expresses the core-occurrences of a state in terms of single substitutions:
LEMMA 3.13. If in $\text{Core-oc}(s)$ the number of occurrences of each object is finite, then
\[ \text{Core-oc}(s) = \{\alpha \mid \exists b \in O \, [\, s \cdot \Gamma[\alpha \to b] \neq s \,]\}.{}^{4} \]
Proof. If $O$ has just one element, then it is obviously true. So, assume that $O$ has at least
two elements. Consider $\alpha_0 \in \text{Core-oc}(s)$. Let $a = \Gamma(\alpha_0)$ and $b$ be some object in $O$ with
$b \neq a$. Define $\sigma_0 = \Gamma[\alpha_0 \to b]$. By the definition of differentiated substitution frames,
there is a mapping $\mu_0 : Oc \to Oc$ such that
\[ \mu_0 \cdot \Gamma = \sigma_0 \text{ and for all } \sigma : Oc \to O, \; (s \cdot \sigma_0) \cdot \sigma = s \cdot (\mu_0 \cdot \sigma). \]
Assume that $A = \{\alpha \in \text{Core-oc}(s) \mid \Gamma(\alpha) = a\}$ is finite. Then there is an $\alpha_1 \in A$ such that
$\alpha_1 \notin \mu_0[\text{Core-oc}(s)]$. It follows that $s \cdot \sigma_0 \neq s$.
The inclusion in the other direction is obvious.
In the lemma we cannot drop the finiteness condition, as the next example shows.
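EXAMPLE 3.14. Let $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ be a differentiated substitution frame with
$O = \{a, b\}$, $S = \{s_0, s_1, s_2, \ldots, s_\infty\}$, $Oc = O \times \omega$, $\Gamma(x, n) = x$, and $\Sigma$ defined by
\[ s \cdot \sigma = \begin{cases} s_0 & \text{if } s = s_n \text{ with } n \in \omega \text{ and } \sigma(a, 0) = \sigma(b, 0), \\ s_n & \text{if } s = s_n \text{ with } n \in \omega \text{ and } \sigma(a, 0) \neq \sigma(b, 0), \\ s_{|\sigma^{-1}(x)|} & \text{if } s = s_\infty \text{ and } \sigma^{-1}(x) \text{ is finite,} \\ s_\infty & \text{otherwise.}{}^{5} \end{cases} \]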
The object-degrees ob-degreeG (s) and ob-degreeG are defined in a similar way by starting
with the object-domains of s.
In Section 4 we will see that the occurrence-degree of a frame can be much higher than
its object-degree.
3.3. Underlying frames. There is a natural embedding of the undifferentiated sub-
stitution frames in the class of differentiated substitution frames, and, conversely, each
differentiated substitution frame also has an underlying undifferentiated substitution
frame:
DEFINITION 3.16. Let $F = \langle S, O, \Sigma \rangle$ be an undifferentiated substitution frame and let
$G = \langle S', O', Oc, \Gamma, \Sigma' \rangle$ be a differentiated substitution frame. We say that $F$ underlies $G$
if $S = S'$, $O = O'$, and for every $s \in S$ and $\delta : O \to O$,
\[ s \cdot_G (\Gamma \cdot \delta) = s \cdot_F \delta. \]
We call $G$ a basic refinement of $F$ if $F$ underlies $G$, $Oc = O$ and $\Gamma = \mathrm{id}_O$.
Note that in a basic refinement, different states may have occurrences in common.
To prevent this, we could alternatively have defined for a basic refinement the set of
occurrences as S × O.
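The passage from a differentiated frame to its underlying frame can be sketched on a toy example. The frame below is an assumption made for illustration (the love relation over two objects, with occurrences of the form (state, position) and `Gamma` returning the object at that position); the names `apply_G`, `apply_F`, and `then` are hypothetical.

```python
from itertools import product

O = ("a", "b")
S = list(product(O, repeat=2))
Oc = [(s, p) for s in S for p in (0, 1)]
Gamma = {(s, p): s[p] for (s, p) in Oc}


def apply_G(s, sigma):
    """Differentiated substitution: sigma assigns an object to every occurrence."""
    return (sigma[(s, 0)], sigma[(s, 1)])


def then(f, g):
    """f . g in the notation of the text: apply f first, then g."""
    return {k: g[v] for k, v in f.items()}


def apply_F(s, delta):
    """Underlying frame, defined by s ._F delta = s ._G (Gamma . delta)."""
    return apply_G(s, then(Gamma, delta))


identity = {x: x for x in O}
deltas = [dict(zip(O, image)) for image in product(O, repeat=len(O))]


def underlying_is_frame():
    """Check both conditions of an undifferentiated substitution frame for apply_F."""
    unit = all(apply_F(s, identity) == s for s in S)
    assoc = all(apply_F(apply_F(s, d), d2) == apply_F(s, then(d, d2))
                for s in S for d in deltas for d2 in deltas)
    return unit and assoc
```

In this example the underlying substitution simply replaces each object of the state by its image, so the occurrence-level information is forgotten, as the definition intends.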
THEOREM 3.17. (i) Each undifferentiated substitution frame has a unique basic refinement.
(ii) Each differentiated substitution frame has a unique underlying undifferentiated
substitution frame. (iii) If F is an undifferentiated substitution frame, then the underlying
undifferentiated substitution frame of the basic refinement of F is F itself.
Proof.
(i) This follows immediately from the definition of a basic refinement and the definition
of a differentiated substitution frame.
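(ii) Let $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ be a differentiated substitution frame. Let $F = \langle S, O, \Sigma' \rangle$
be defined by $s \cdot_F \delta = s \cdot_G (\Gamma \cdot \delta)$. Then by Condition 1 of the definition of a
differentiated substitution frame $s \cdot_F \mathrm{id}_O = s \cdot_G (\Gamma \cdot \mathrm{id}_O) = s$. By Condition 2 of the
definition of a differentiated substitution frame:
\begin{align*}
(s \cdot_F \delta) \cdot_F \delta' &= (s \cdot_G (\Gamma \cdot \delta)) \cdot_G (\Gamma \cdot \delta') \\
&= s \cdot_G (\mu \cdot \Gamma \cdot \delta') \quad \text{for some } \mu \text{ with } \mu \cdot \Gamma = \Gamma \cdot \delta \\
&= s \cdot_G (\Gamma \cdot \delta \cdot \delta') \\
&= s \cdot_F (\delta \cdot \delta').
\end{align*}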
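\[ \delta_0(d) = \begin{cases} d & \text{if } d \in A, \\ a & \text{otherwise.} \end{cases} \]
Define $\sigma_0 = \Gamma \cdot \delta_0$. Then $s \cdot_G \sigma_0 = s \cdot_F \delta_0 = s$. Because $G$ is a differentiated substitution
frame, there is a function $\mu_0 : Oc \to Oc$ such that $\mu_0 \cdot \Gamma = \sigma_0$, and for any $\sigma : Oc \to O$,
$(s \cdot_G \sigma_0) \cdot_G \sigma = s \cdot_G (\mu_0 \cdot \sigma)$. So,
\[ s \cdot_G \sigma_2 = (s \cdot_G \sigma_0) \cdot_G \sigma_2 = s \cdot_G (\mu_0 \cdot \sigma_2) = s \cdot_G (\mu_0 \cdot \sigma_1) = (s \cdot_G \sigma_0) \cdot_G \sigma_1 = s \cdot_G \sigma_1. \]
Conversely, let $A$ be an object-domain of $s$ in $G$. Then there is an occurrence-domain $X$
of $s$ such that $A = \Gamma[X]$. So, for any $\delta, \delta' : O \to O$ with $\delta =_A \delta'$, we have $\Gamma \cdot \delta =_X \Gamma \cdot \delta'$,
and thus $s \cdot_F \delta = s \cdot_G (\Gamma \cdot \delta) = s \cdot_G (\Gamma \cdot \delta') = s \cdot_F \delta'$. It follows that $A$ is also an
object-domain of $s$ in $F$.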
As a direct consequence of the lemma, if F underlies G, then for any state s,
Core-obF (s) = Core-obG (s), and ob-degreeF (s) = ob-degreeG (s).
1. Oc = Oc × κ,
2. = proj1 |Oc · ,
3. s ·G σ = s ·G σ with σ such that ∀α ∈ Oc, {λ | σ (α, λ) = σ (α)} ∈ U .
(a) μ ·G = σ ,
(b) μ(α, λ) = (μ(α), λ) if σ (α, λ) = σ (α).
(s ·G σ ) ·G σ = (s ·G σ ) ·G σ
= s ·G (μ · σ )
= s ·G (μ · σ ),
where the last equation can be proved as follows. Define for each α ∈ Oc the
following sets:
Aα = {λ | μ(α, λ) = (μ(α), λ)},
Bα = {λ | σ (α, λ) = σ (α)},
Cα = {λ | σ (μ(α, λ)) = σ (μ(α))}.
(α) if (α, λ) ∈ X ,
σ0 (α, λ) =
σ0 (α) otherwise.
Then σ0 = X . So, s0 ·G σ0 = s0 . Now suppose |X | < κ. Then, by the definition
of U , for any α ∈ Oc, {λ | (α, λ) ∈ Oc − X } ∈ U , and thus s0 ·G σ0 = s0 ·G σ0 = s0 .
So, we have a contradiction. It follows that oc-degreeG = κ.
What about frames with an infinite number of objects? Do they also always have proper
refinements? By a rather straightforward modification of the proof of the previous theorem
we can show that any frame with at least one transition from a state to a different one has
a proper refinement, provided that there are arbitrarily large measurable cardinals:
THEOREM 4.4. Let $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ be a differentiated substitution frame with
$\text{oc-degree}_G \neq 0$. Then for any measurable cardinal $\kappa > \max(|O|, \text{oc-degree}_G)$, there is a
refinement $G'$ of $G$ with $\text{oc-degree}_{G'} = \kappa$.
Proof. By definition, an uncountable cardinal κ is measurable iff there is a nonprincipal
κ-complete ultrafilter on κ, where a filter is called κ-complete if it is closed under intersec-
tion of less than κ sets. See for example, Jech (2006, p. 127) or Kanamori (2005, p. 26).
Now let κ be a measurable cardinal greater than max(|O|, oc-degreeG ), and let U be a
nonprincipal κ-complete ultrafilter over κ. Then for any A ∈ U , |A| = κ. Further, since
κ > |O|, for any τ ∈ O κ , there is exactly one a ∈ O such that {λ | τ (λ) = a} ∈ U .
Define $G' = \langle S, O, Oc', \Gamma', \Sigma' \rangle$ exactly as in the proof of the previous theorem. We may
prove that $G'$ is a differentiated substitution frame and that $G'$ is a refinement of $G$ in the
same way as we did in (i) and (ii) of the proof of the previous theorem.
To prove that $\text{oc-degree}_{G'} = \kappa$, we first note that since $\text{oc-degree}_G \le \kappa$, and $\kappa$ is infinite,
we have $\text{oc-degree}_{G'} \le \kappa$. Then we may follow (iii) of the proof of the previous theorem
to complete the proof.
For ‘real’ relations the ultrarefinements constructed in the proof of the last two theorems
do not seem to be adequate, since the states do not have a well-defined set of occurrences.
Probably no metaphysical significance should be given to the existence of these spurious
refinements.
Because for each transition ti →σi ti+1 in G the mapping σi OcG (ti ) is injective,
it follows that if transition s0 →σ i+1 ti+1 is a composition of s0 →σ0 s0 and
s0 →σ i+1 ti+1 , then s0 →σ i ti is a composition of s0 →σ0 s0 and s0 →σ i ti . So,
choosing transition s0 →σ0 s0 as a composition of s0 →σ n tn and an inverse of
s0 →σ n tn gives by induction that s0 →σ 1 t1 is a composition of s0 →σ0 s0 and
s0 →σ 1 t1 . Thus, μ0 · σ1 =OcG (s0 ) τ1 · σ .
(iii) Proof that G ∗ is a refinement of each frame in
:
Let s0 , s1 , σ1 be as in the construction of G ∗ . Let μ1 : Oc0 → Oc0 correspond to
s0 →σ1 s1 in G0 . Let τ : Oc∗ → Oc0 be a function with
⎧
⎨μ1 (α) if a = σ1 (α),
τ (α, a) =
⎩α with (α ) = a otherwise.
0
Then
1. τ · 0 = ∗ , because
⎧
⎨0 (μ1 (α)) = σ1 (α) = ∗ (α, a) if a = σ1 (α),
0 (τ (α, a)) =
⎩a = ∗ (α, a) otherwise.
So, G ∗ is a refinement of G0 .
To see that G ∗ is also a refinement of any other frame G in
, observe that if in
the construction of G ∗ we would have chosen G instead of G0 , then we would have
obtained a frame equally refined as G ∗ . Thus it follows that G ∗ is a refinement of
each frame in
.
(iv) Proof that G ∗ is a unique least common refinement of the frames in
:
Let G = S, O, Oc , , be an arbitrary common refinement of the frames
in
. Let s0 , s1 , σ1 be as in the construction of G ∗ . Then there is a mapping
τ : Oc → Oc∗ with τ · ∗ = and τ (α) ∈ OcG ∗ (s0 ) for any α ∈ OcG (s0 ).
Let μ1 : Oc → Oc be a bijection corresponding to s0 →τ ·σ1 s1 in G , and let
μ2 : Oc → Oc correspond to s0 → 1 s1 in G . Further, let τ : Oc → Oc∗ be a
∗ ∗ σ ∗
function with
⎧
⎨(μ−1 · τ · μ2 )(α) if α ∈ OcG (s1 ),
1
τ (α) =
⎩α with ∗ (α ) = (α) otherwise.
Then
1. τ · ∗ = , because (τ · μ2 · ∗ )(α) = (τ · σ1 )(α) = (μ1 · )(α)
if α ∈ OcG (s0 ).
2. s1 ·G (τ · σ ) = s0 ·G (τ · μ2 · σ ) = s0 ·G ∗ (μ2 · σ ) = s1 ·G ∗ σ .
So, G is a refinement of G ∗ , and thus, because we assumed G to be an arbitrary
common refinement of the frames in
, we see that G ∗ is not only a least common
refinement, but also unique, modulo equal refinedness.
In the theorem we assumed that the object-degree of the frames is finite. I am not sure
whether this condition could be dropped.
REMARK 4.8. We can also prove that for any nonempty collection of normal refinements
of a common substitution frame of finite object-degree, there is a greatest common subre-
finement, which is also normal and unique, modulo equal refinedness. The construction of
this subrefinement is analogous to the construction of G ∗ in Theorem 4.7, only here we use,
instead of →∗ , the transitive closure of the relation →1 defined by:
s →1 s if s ·G σ = s for all G ∈
and some σ .
The proof that G ∗ is indeed the greatest common subrefinement and unique is relatively
simple.
It follows that for a substitution frame of finite object-degree, the normal refinements—
modulo equal refinedness—form a complete lattice.
A direct consequence of Theorem 4.7 is the following uniqueness result:
COROLLARY 4.9. A normal substitution frame of finite object-degree has, modulo
equal refinedness, a unique maximal normal refinement.
4.3. Coalescence-free refinements. Of special interest are frames in which no coales-
cence of occurrences takes place:
DEFINITION 4.10. We call a frame $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ coalescence-free if each
transition $s \to_{\sigma} s'$ has a corresponding $\mu : Oc \to Oc$ that is injective on an
occurrence-domain of $s$.
Note that coalescence-free normal frames are maximally refined.
We will characterize coalescence-free frames in terms of their underlying undifferen-
tiated frames. We will restrict ourselves to cases where the underlying frame is of finite
object-degree and simple:
DEFINITION 4.11. We call a frame $F = \langle S, O, \Sigma \rangle$ simple if there is a state $s_0$ such that
\[ S = \{s_0 \cdot \delta \mid \delta : O \to O\}. \]
We call $s_0$ an initial state.
Similarly, we call a frame $G = \langle S, O, Oc, \Gamma, \Sigma \rangle$ simple if there is a state $s_0$ such that
\[ S = \{s_0 \cdot \sigma \mid \sigma : Oc \to O\}. \]
We define a class of frames whose transitions from an initial state s0 are unique, modulo
loops of s0 :
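DEFINITION 4.12. We call a simple frame $F = \langle S, O, \Sigma \rangle$ of finite object-degree a
loop-initial frame if for any initial state $s_0$, and any $\delta_1, \delta_2 \in O^O$,
\[ s_0 \cdot \delta_1 = s_0 \cdot \delta_2 \;\Rightarrow\; \exists \delta_0 \, [\, \delta_2 =_{Ob(s_0)} \delta_0 \cdot \delta_1 \text{ and } s_0 \cdot \delta_0 = s_0 \,]. \]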
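\begin{align*}
{} &= (s_0 \cdot_G (\Gamma \cdot \delta_1)) \cdot_G (\mu_1^{-1} \cdot \Gamma) \\
&= (s_0 \cdot_G (\Gamma \cdot \delta_2)) \cdot_G (\mu_1^{-1} \cdot \Gamma) \\
&= s_0 \cdot_G (\mu_2 \cdot \mu_1^{-1} \cdot \Gamma) \\
&= s_0 \cdot_G (\Gamma \cdot \delta_0) \\
&= s_0 \cdot_F \delta_0.
\end{align*}
Further,
\begin{align*}
\Gamma \cdot \delta_0 \cdot \delta_1 &=_{Oc(s_0)} \mu_2 \cdot \mu_1^{-1} \cdot \Gamma \cdot \delta_1 \\
&=_{Oc(s_0)} \mu_2 \cdot \mu_1^{-1} \cdot \mu_1 \cdot \Gamma \\
&=_{Oc(s_0)} \mu_2 \cdot \Gamma \\
&=_{Oc(s_0)} \Gamma \cdot \delta_2.
\end{align*}
So, $\delta_0 \cdot \delta_1 =_{Ob(s_0)} \delta_2$. We have proved that $F$ is a loop-initial frame.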
Conversely, assume F = S, O, is a loop-initial frame. Define G = S, O, Oc, ,
as follows:
1. Define Oc = S × O.
2. Choose an initial state s0 of F.
3. For each s ∈ S choose one δs ∈ O^O such that s0 ·F δs = s.
4. Define (s, d) = δs (d).
5. Define s ·G σ = s0 ·F δ with for all d ∈ O, δ(d) = σ (s, d).
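The five steps above can be sketched for a concrete loop-initial frame. The code below is a minimal illustration, assuming the toy love frame with ordered pairs as states; the occurrence-to-object map of Item 4 is called `obj_of` here, and `sub_F`, `sub_G` and `delta_of` are likewise hypothetical names.

```python
from itertools import product

O = ("a", "b", "c")                 # hypothetical object domain
S = list(product(O, repeat=2))      # states "x loves y"
s0 = ("a", "b")                     # Item 2: a chosen initial state of F


def all_maps(domain):
    """Yield every function delta : O -> O as a dict."""
    for images in product(domain, repeat=len(domain)):
        yield dict(zip(domain, images))


def sub_F(state, delta):
    """Undifferentiated substitution of F."""
    return (delta[state[0]], delta[state[1]])


# Item 3: for each state s choose one delta_s with s0 . delta_s = s.
delta_of = {}
for delta in all_maps(O):
    delta_of.setdefault(sub_F(s0, delta), delta)

# Item 1: occurrences are pairs (state, object).
Oc = [(s, d) for s in S for d in O]


def obj_of(occ):
    """Item 4: the object assigned to the occurrence (s, d) is delta_s(d)."""
    s, d = occ
    return delta_of[s][d]


def sub_G(s, sigma):
    """Item 5: s .G sigma = s0 .F delta with delta(d) = sigma(s, d)."""
    delta = {d: sigma[(s, d)] for d in O}
    return sub_F(s0, delta)


# Condition (1): substituting each occurrence's own object changes nothing.
own_objects = {occ: obj_of(occ) for occ in Oc}
cond1 = all(sub_G(s, own_objects) == s for s in S)

# Refinement: object-wise substitution in G agrees with substitution in F.
refines = all(
    sub_G(s, {occ: d[obj_of(occ)] for occ in Oc}) == sub_F(s, d)
    for s in S
    for d in all_maps(O)
)
```

Both checks succeed for this toy frame, which mirrors the first condition and the refinement claim verified in the proof that follows.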
We prove that G is a differentiated substitution frame by showing that G fulfills the two
conditions of Definition 3.4.
MODELING OCCURRENCES IN RELATIONS 163
(1) s ·G = s0 ·F δs = s.
(2) Consider any s ∈ S and σ : Oc → O. Then s ·G σ = s0 ·F δs·G σ . Also we
have s ·G σ = s0 ·F δ with for all d ∈ O, δ(d) = σ (s, d). Because F is a loop-
initial frame, δ =Ob(s0 ) δ0 · δs·G σ and s0 ·F δ0 = s0 for some δ0 . Define a function
μ : Oc → Oc with μ(s, d) = (s ·G σ, δ0 (d)), and for any other state s ∈ S,
(μ(s , d)) = σ (s , d). Then
(μ(s, d)) = (s ·G σ, δ0 (d))
= σ (s, d).
So, μ · = σ .
We have (s ·G σ ) ·G σ = s0 ·F δ with for all d ∈ O, δ (d) = σ (s ·G σ, d). Further,
s ·G (μ · σ ) = s0 ·F δ with for all d ∈ O, δ (d) = (μ · σ )(s, d). So, because
μ(s, d) = (s ·G σ, δ0 (d)), we see that δ = δ0 · δ . Thus, because s0 ·F δ0 = s0 , we
have (s ·G σ ) ·G σ = s ·G (μ · σ ). This completes the proof that G is a differentiated
substitution frame.
To show that G is a refinement of F, we note that s ·G ( · δ) = s0 ·F δ with for all
d ∈ O, δ (d) = ( · δ)(s, d). So, s0 ·F δ = (s0 ·F δs ) ·F δ = s ·F δ.
Further, it follows from Item 5 of the definition of G that oc-degreeG (s) =
ob-degreeF (s0 ), and thus that G is a coalescence-free normal frame.
Note that in the second part of the proof we choose certain functions from O to O. But
since O consists of objects, we may perhaps not assume that O is representable in ZFC. So,
perhaps it would be more accurate to add an additional condition in the theorem. Because
the object-degree of F is finite, it would be enough to assume that O can be totally ordered.
§5. Restricting frames. We can define operations like conjunction, disjunction and
negation for undifferentiated substitution frames in a rather straightforward way. For ex-
ample, the conjunction F & F will have states s & s with s a state of F and s a state
of F , and substitution defined by (s & s ) · δ = s · δ & s · δ. Note that the definition only
requires that the following condition is satisfied:
s & s = t & t ⇒ s · δ & s · δ = t · δ & t · δ.
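The well-definedness condition can be made concrete with a small check. This is a sketch under an explicit modeling assumption: a conjunctive state is identified with the unordered set of its conjuncts, so that s & t and t & s are the same state. The names `conj` and `sub_conj` are hypothetical.

```python
from itertools import product

O = ("a", "b")                          # hypothetical two-object domain
states = list(product(O, repeat=2))     # states "x loves y"
maps = [dict(zip(O, images)) for images in product(O, repeat=len(O))]


def sub_love(state, delta):
    return (delta[state[0]], delta[state[1]])


def conj(s, t):
    """Assumption: the conjunctive state is the unordered set {s, t}."""
    return frozenset({s, t})


def sub_conj(s, t, delta):
    """(s & t) . delta = (s . delta) & (t . delta)."""
    return conj(sub_love(s, delta), sub_love(t, delta))


# s & s' = t & t'  implies  s.delta & s'.delta = t.delta & t'.delta
well_defined = all(
    sub_conj(s, s2, d) == sub_conj(t, t2, d)
    for s, s2, t, t2 in product(states, repeat=4)
    if conj(s, s2) == conj(t, t2)
    for d in maps
)
```

With this identification of conjunctive states the condition holds for every pair of equal conjunctions in the toy frame; other identity criteria for conjunctive states would need their own check.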
For differentiated substitution frames the situation is a bit more complicated. For example,
in defining conjunction it is not immediately clear whether or not we should let the oc-
currences of a state s & s correspond one-to-one to the occurrences of s itself. Also, we
should maybe be content with results that are unique modulo equal refinedness.
A really controversial issue is how to interpret such operations metaphysically. For
example, in The Philosophy of Logical Atomism Russell (1956) said that when he argued
that there were negative facts, it nearly produced a riot (p. 211). Russell says that on the
whole he is inclined to believe that there are negative facts, but no disjunctive facts (Russell,
1956, p. 215). Armstrong (1997) rejects both negative and disjunctive facts, but he does
accept conjunctive facts and totality facts. We will not pursue this issue here.
What I would like to discuss here in more detail is the notion of restriction for relations.
Consider a frame for the love relation with states x loving y. If we restrict the states to those
of x loving Mo, then we get a new frame for this restricted set of states. We get another
type of restriction if we take as states only x loves x. More generally, we define:
s ·F1 δ = (s0 ·F1 δ ) ·F1 δ = s0 ·F1 (δ · δ) = s0 ·F2 (δ · δ) = (s0 ·F2 δ ) ·F2 δ = s ·F2 δ.
Thus, F1 = F2 .
(ii) Consider the construction of G ∗ in the proof of Theorem 4.7 starting with G1 and
G2 . Let s0 , s1 , σ1 , τ1 be as in the construction of G ∗ .
Further, let X0 ∈ OcG (s0 ) and τ0 : X0 → Oc1 be such that τ0 · 1 = X0 , and for
every σ : Oc1 → O, s0 ·G1 σ = s0 ·G σ with σ = X0 τ0 · σ and σ =Oc−X0 .
Define σ1 : Oc → O by
σ1 (α) = σ1 (τ0 (α)) if α ∈ X0 , and σ1 (α) = (α) otherwise.
Other forms of restriction that never introduce a coalescence of occurrences are conceivable.
But for such restrictions more than one state may have the same underlying state in the
original relation. For example, if we start with the ‘double’ love relation of Example 5.3,
then the state x −→♥ y & y −→♥ x may underlie two states of a restriction, namely a
state with two occurrences of x and one of y and another state with one occurrence of x
and two of y. Also one state of a restriction may perhaps have more than one underlying
state. This would, for example, be the case if we can make a further restriction to states
x −→♥ y & y −→♥ x.
§6. Back to reality. A key question here is how to determine, for any state of a
relation, what its occurrences of objects are. For the love relation the state of Echo’s loving
Narcissus obviously has two occurrences, but what about the state of Narcissus’s loving
Narcissus? If this state has one occurrence, then substituting Narcissus for Echo in Echo’s
loving Narcissus gives a coalescence of occurrences.
If in the ‘real’ world no coalescence of occurrences takes place, then for many
relations the logical space is straightforwardly determined. But if we may not exclude
coalescence of occurrences, then the principles in Section 2 often leave us with many
choices.
We are inclined to think that there is just one love relation, one adjacency relation,
one similarity relation, and so forth. But this might perhaps be disputed. Why would we
exclude the possibility that there is a love relation where the state of Narcissus’s loving
Narcissus has only one occurrence of Narcissus, and another relation where Narcissus’s
loving Narcissus has two occurrences of him? Maybe we should not even exclude the
possibility of a love relation with both states, one with two occurrences of him and another
with one. Could it be that there simply is no deeper metaphysical fact that determines the
right choice?
These considerations might engender the uncomfortable feeling that we could easily get
stuck with an abundance of relations for the same states of affairs ‘out there’ in reality.
However, what we want is a clear, uncomplicated canonical view on the logical structure
of relations.
In our mathematical analysis we have arrived at a few results that provide more insight
with regard to the possibilities:
(i) By Corollary 4.9, any normal substitution frame of finite object-degree has a unique
maximal normal refinement.6
(ii) By Theorem 5.4, any normal substitution frame of finite object-degree has for each
subset of states at most one simple maximal restriction.
We might formulate the first result in terms of relations as follows. Suppose we modeled
the logical space of a given relation by a substitution frame for which the maximum number
of occurrences of its states is finite and equal to the maximum number of objects of its
states. Suppose further that the ‘real’ occurrences of the relation are a maximal refinement
of the occurrences of the frame. Then the logical space of the relation is in fact uniquely
determined by the original frame.
It seems reasonable to postulate that the logical space for restrictions of relations is
maximal, if unique in an appropriate sense. By Theorem 5.4, we know that this applies
to relations whose logical space corresponds to a simple restriction of a normal frame of
finite object-degree. What makes this and the previous uniqueness result of metaphysical
interest is that they probably apply to a large class of ‘real’ relations. Encouraged by
these uniqueness results we might go a step further and postulate the following maximality
principle:
P-x The occurrences of every relation are always maximally refined.
If this principle is true, then that would be very nice, since it would make the notion of
occurrences more accessible. Let us test its viability by examining some objections that
could be made:
Objection 1: Example 4.5 does not support the maximality principle. The example
shows a substitution frame whose refinements have no refinement in
common. So, in this case there is no unique maximum. Furthermore,
it is an open question whether all normal substitution frames of infinite
object-degree have a unique maximal normal refinement.
Objection 2: A conjunction of a relation with itself will introduce a coalescence of
occurrences, if the occurrences of a state s & s correspond one-to-one
with the occurrences of s. If so, then this would obviously contradict the
maximality principle.
Objection 3: Relations whose states have a set-like character are straightforward
counterexamples to the maximality principle. Take the relation of
waiting for the bus. If we may substitute Janneke for Vincent in the
state of Janneke and Vincent’s waiting for the bus, the resulting state
will have only one occurrence, and therefore will not be maximally
refined.
These objections seem strong, but there may be escape routes available:
Against Objection 1 it might be argued that the frame of the example is farfetched and
probably does not correspond to a ‘real’ relation. As a safer alternative, we could restrict
the maximality principle to relations whose logical space corresponds to normal frames of
finite degree.
Some people will reject Objection 2, because they will deny that there are conjunctive
relations. But let us suppose that conjunctive relations exist. Then why would the occur-
rences of s & s correspond one-to-one to the occurrences of s? I don’t see a convincing
argument. Even if s & s and s are identical states, then there could still be a way out,
namely by arguing that not states, but relational complexes have occurrences. A relational
complex could be conceived of as a structured perspective on a state ‘out there’, where we
give up the idea that there is a one-to-one correspondence or even identity between states of
affairs and relational complexes7 (see Figure 2). With this approach, there seems to be no
compelling reason why there should be a one-to-one correspondence between occurrences
of different relational complexes corresponding to the same state.
Maintaining both states ‘out there’ and relational complexes, however, has the potential
drawback that it might give rise to an inflation of ontology. A more detailed analysis
is needed to address this issue. Note, by the way, that in our formal analysis we also
allowed states to belong to more than one frame: in the definition of a restriction of a
frame (Definition 5.1) the states of a restriction of a frame G are a subset of the states of G
itself. If this is not acceptable, then an alternative definition of a restriction might be given
that takes hiding and merging of occurrences as primitive operations.
With respect to Objection 3 two different replies are possible. First, one could argue
that the objects of states with a set-like character are sets themselves, not the members
of the sets. For the given example, this would mean that in fact we should substitute the
set consisting of Janneke for the set consisting of Janneke and Vincent. Alternatively, one
could argue that certain substitutions in states with a set-like character are highly dubious,
since it is in some cases unclear what exactly the result should be. In the example, it may
be more natural to leave substituting Janneke for Vincent undefined. The interrelatedness
of the states may more adequately be expressed by the operation of subtraction. It seems
natural to say that subtracting Vincent from the state of Janneke and Vincent’s waiting for
the bus results in the state of Janneke’s waiting for the bus. If we follow this road, then we
need a weaker version of Principle P-6:
P-6′ Any substitution of objects for occurrences in a state results in at most one state of
the same relation.
In addition, we should formulate a principle for subtracting occurrences similar to P-6 and
a composition principle for subtractions.
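The second reply can be illustrated with a small sketch of partial substitution and subtraction for set-like states. It assumes, purely for illustration, that a state of waiting for the bus is represented by the set of people waiting, that a substitution is left undefined when it would merge two members, and that subtracting the last member is undefined; the function names are hypothetical.

```python
from typing import Dict, FrozenSet, Optional

State = FrozenSet[str]


def substitute(state: State, delta: Dict[str, str]) -> Optional[State]:
    """Partial substitution: None when two distinct members would merge."""
    image = frozenset(delta.get(x, x) for x in state)
    return image if len(image) == len(state) else None


def subtract(state: State, person: str) -> Optional[State]:
    """Remove one member; None if the person is absent or nobody would remain."""
    if person not in state or len(state) == 1:
        return None
    return state - {person}


waiting = frozenset({"Janneke", "Vincent"})
merged = substitute(waiting, {"Vincent": "Janneke"})  # left undefined
renamed = substitute(waiting, {"Vincent": "Mo"})      # Janneke and Mo waiting
left = subtract(waiting, "Vincent")                   # Janneke waiting
```

Each operation yields at most one state, in line with the weaker, partial reading of the substitution principle, and the link between the two-person state and the one-person state is carried by subtraction rather than by substitution.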
7 Such a one-to-one correspondence is often taken for granted. For example, Russell (1984, p. 80)
says: “there is certainly a one-one correspondence of complexes and facts.”
As an aside, note that Principle P-6′ also has the advantage that it allows us to keep
meaningless or (conceptually) impossible states out of our ontology. Substituting, for ex-
ample, a for b in a’s being adjacent to b might be problematic. If you do not accept the
existence of conceptually impossible states, then it could be better to leave the result of this
substitution undefined.
To conclude this discussion, I think that a maximality principle like principle P-x might
be the right choice for getting close to capturing the essence of occurrences. But to make
such a principle really plausible, we need to conduct a more profound study of the notion of
states, of substitution and probably of other operations, like subtraction. Also a follow-up
of our formal analysis will be needed where partial substitution and subtraction functions
are taken into account.
A view on relations without coalescence of occurrences still seems an attractive alter-
native, because of its simplicity. We already countered two arguments against it, namely a
coalescence in relations with set-like states, and coalescence introduced by a conjunction
of relations. In Section 2 we gave a third objection against a coalescence-free account. We
considered the ternary relation where abc is the state of a’s loving b & b’s loving
c. This relation has the peculiar property that bab and aba are identical states. As a
consequence, this state can have only one occurrence of a and one occurrence of b, and
thus in a transition from abc to this state we get coalescence of occurrences. How can we
counter this objection?
Again relational complexes may be the solution. We could argue that there are two
relational complexes corresponding to the state of a’s loving b & b’s loving a, namely
one relational complex with two occurrences of a and one with two occurrences of b.
Only in this case we would have to accept that one state ‘out there’ can have more than
one relational complex within the same relation. This approach looks quite natural if we
regard the relation as a restriction of a quaternary relation with states like a’s loving
b & c’s loving d, because then one of the relational complexes corresponding to a’s loving
b & b’s loving a in the ternary relation could be taken as the result of merging two occurrences of a and the
other relational complex as the result of merging two occurrences of b. For any relation
with a comparable symmetry, a similar ‘solution’ could be given. We also regard this issue
as something that requires further investigation.
There is one final issue of metaphysical importance that I would like to discuss, namely
the question: Do we really need occurrences?
Occurrences are definitely a very useful part of our representation of reality, but that in
some cases the best way to express certain properties of states is in terms of occurrences
does not force us to any ontological commitment to them. So, do we have a compelling
reason to assume that for relations occurrences are ontologically basic?
Let us assume, for the moment, that the maximality principle is valid, and in addition
that for every ‘real’ relation the underlying undifferentiated substitution frame has a unique
maximal refinement. Would this not imply that a complete account of relations could be
given in terms of undifferentiated substitutions? If so, then a parsimonious undifferentiated
substitution mechanism might be enough for the ontology of relations.
I only see one strong argument in favor of occurrences being basic for relations, namely
that some relations might just not have enough objects for an occurrence-free account of
relations. Consider a relation whose occurrence-degree is larger than its object-degree.
Then undifferentiated substitution might not provide enough information about the inter-
connection of the states. We might have such a situation, for example, with a repeated
conjunction of a binary relation with itself, where the binary relation has a finite number of objects.
Probably such situations do not only occur in complex relations, but also in what we would
regard as elementary relations. An example of such an elementary relation could maybe be
something like a relation with states of a’s loving b more than c in a mini-world with two
inhabitants.
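The point about occurrence-degree exceeding object-degree can be made concrete. The sketch below assumes a mini-world with two hypothetical inhabitants `p` and `q` and represents the state of x's loving y more than z as the triple (x, y, z), with the three positions playing the part of occurrences; this positional representation is an illustrative assumption.

```python
from itertools import product

O = ("p", "q")                      # the two inhabitants (hypothetical names)
S = set(product(O, repeat=3))       # (x, y, z): x loves y more than z


def sub_undiff(state, delta):
    """Undifferentiated substitution: replace objects wherever they occur."""
    return tuple(delta[x] for x in state)


def sub_occ(state, sigma):
    """sigma assigns an object to each of the three positions (occurrences)."""
    return tuple(sigma[i] for i in range(len(state)))


maps = [dict(zip(O, images)) for images in product(O, repeat=len(O))]

# States reachable from each state by undifferentiated substitution.
reach_undiff = {s: {sub_undiff(s, d) for d in maps} for s in S}
max_undiff = max(len(r) for r in reach_undiff.values())

# States reachable from one state by substitution for occurrences.
s = ("p", "q", "q")
reach_occ = {sub_occ(s, sigma) for sigma in product(O, repeat=3)}
```

No state reaches more than four of the eight states by undifferentiated substitution, so no state is initial, whereas substitution for occurrences connects every state to every other.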
My conclusion is that we have good reason to consider occurrences of relations to be a
primitive notion. Further, I think our analysis has brought us closer to revealing the essence
of the logical structure of relations. Although some major issues are still open, the way we
have articulated them may very well contribute to their solution.
BIBLIOGRAPHY
Armstrong, D. (1997). A World of States of Affairs. New York: Cambridge University Press.
Fine, K. (2000). Neutral relations. The Philosophical Review, 109, 1–33.
Fine, K. (1989). The Problem of De Re Modality. In Almog, J., Perry, J., and Wettstein,
H., editors. Themes From Kaplan. Oxford, UK: Oxford University Press, pp. 197–272.
Jech, T. (2006). Set Theory (third edition). Berlin: Springer-Verlag.
Janssen, M., & Visser, A. (2004). Some words on word. La Nuova Critica, 43–44, 71–95.
Kanamori, A. (2005). The Higher Infinite. Berlin: Springer-Verlag.
Kracht, M. (2007). Compositionality: The very idea. Research on Language and
Computation, 5.3, 287–308.
Leo, J. (2008a). Modeling relations. Journal of Philosophical Logic, 37, 353–385.
Leo, J. (2008b). The identity of argument-places. The Review of Symbolic Logic, 1.3,
335–354.
Mates, B. (1972). Elementary Logic. New York, NY: Oxford University Press.
Mac Lane, S. (1998). Categories for the Working Mathematician (second edition). New
York, NY: Springer-Verlag.
Russell, B. (1956). The philosophy of logical atomism. In Marsh, R. C., editor. Logic
and Knowledge, Essays 1901-1950. London: George Allen & Unwin, pp. 175–281.
(Originally published in 1918).
Russell, B. (1984). Theory of knowledge. In Eames, E. R., editor. Theory of Knowledge,
The 1913 Manuscript. London: Allen and Unwin.
Wetzel, L. (1993). What are occurrences of expressions? Journal of Philosophical Logic,
22.2, 215–219.
DEPARTMENT OF PHILOSOPHY
UTRECHT UNIVERSITY
HEIDELBERGLAAN 8, 3584 CS UTRECHT
THE NETHERLANDS
E-mail: joop.leo@phil.uu.nl
Appendix A: Occurrences and roles. An object can fulfill one or more roles in a
state. For example, in the amatory relation an object can play the role of lover or the role
of beloved. In this appendix we investigate how roles are related to occurrences.
We only define roles for simple substitution frames of finite degree. The definition will
be such that
For arbitrary s ∈ S, a ∈ Ob(s), we say that a in s fulfills role ρ ∈ RolesF if for some
s0 , a0 with ρ = Role(s0 , a0 ) there is a mapping δ : O → O such that
s = s0 · δ and a = δ(a0 ).
It is easy to see that if F is an n-ary simple substitution frame, then F has at most n
roles. Objects sometimes fulfill more than one role in certain states. For example, if F
models the amatory relation, then in the state where Narcissus loves himself, he fulfills
both roles of the model. It is also possible that an n-ary model with n > 1, has only one
role. This is, for example, the case for cyclic models.
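The counting of roles can be explored on small frames. The following sketch assumes that, by analogy with Definition A.2, the role of an object a0 in an initial state s0 is the set of pairs (s, a) for which some δ gives s · δ = s0 and δ(a) = a0; this reading, the toy frames and all names are illustrative assumptions.

```python
from itertools import product


def all_maps(domain):
    """Yield every function delta : O -> O as a dict."""
    for images in product(domain, repeat=len(domain)):
        yield dict(zip(domain, images))


def roles(objects, states, substitute, objects_of):
    """Collect Role(s0, a0) for every initial state s0 and object a0 of s0."""
    maps = list(all_maps(objects))
    found = set()
    for s0 in states:
        if {substitute(s0, d) for d in maps} != set(states):
            continue  # s0 is not an initial state
        for a0 in objects_of(s0):
            role = frozenset(
                (s, a)
                for s in states
                for a in objects_of(s)
                if any(substitute(s, d) == s0 and d[a] == a0 for d in maps)
            )
            found.add(role)
    return found


O = ("a", "b", "c")

# Amatory relation: ordered pairs (lover, beloved).
love_states = set(product(O, repeat=2))


def love_sub(s, d):
    return (d[s[0]], d[s[1]])


# A symmetric binary relation: a state is the set of its relata.
sym_states = {frozenset(t) for t in product(O, repeat=2)}


def sym_sub(s, d):
    return frozenset(d[x] for x in s)


love_roles = roles(O, love_states, love_sub, set)
sym_roles = roles(O, sym_states, sym_sub, set)
```

Under these assumptions the amatory frame has two roles (lover and beloved), while the symmetric binary frame has a single role, in agreement with the bound of at most n roles for an n-ary frame.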
For differentiated substitution frames of finite occurrence-degree we can define roles in
a similar way:
DEFINITION A.2. Let G = S, O, Oc, , be a simple differentiated substitution
frame of finite occurrence-degree. Then for any initial state s0 ∈ S and α0 ∈ Oc(s0), we
define the role of α0 in s0 as
Role(s0, α0) = {(s, α) | ∃σ [ s · σ = s0 & α ∈ Oc(s) & μ(α) = α0 with μ corresponding to s →σ s0 ]}.
For arbitrary s ∈ S, α ∈ Oc(s), we say that α in s fulfills role ρ ∈ RolesG if for some
s0, α0 with ρ = Role(s0, α0) there is a mapping σ : Oc → O such that
s = s0 · σ and μ(α0) = α for a corresponding μ.
Consider again the relation where abc is the state that a loves b & b loves c. Let G
be a differentiated substitution frame for it with oc-degreeG = 3. Then because the state
aba is identical with the state bab, a and b each fulfill three roles, but a and b each
have only one occurrence in bab.
THEOREM A.3. Let G be a simple differentiated substitution frame of finite
occurrence-degree. Then each occurrence of a state of G fulfills exactly one role iff G is
coalescence-free.
2. f S (s ·F δ) = f S (s) ·F′ δ′ with δ′(d′) = f O (δ(d)) if d′ = f O (d), and δ′(d′) = d′ otherwise.
Note that Condition 1 guarantees that the function δ′ in Condition 2 is uniquely defined
by the function δ.
Our category for differentiated substitution frames is somewhat more complicated:
DEFINITION B.2. We define DSF as the category with objects all differentiated substitution
frames and with morphisms from G = S, O, Oc, , to G′ = S′, O′, Oc′, ,
all triples f S , f O , f Oc of functions f S : S → S′, f O : O → O′ and f Oc : S × X → Oc
with X = ( )−1 [im f O ], the inverse image of im f O by , such that
1. f O is injective,
2. for each s ∈ S and α′ ∈ X , f O (( f Oc (s, α′))) = (α′),
3. f S (s ·G σ ) = f S (s) ·G′ σ′ with σ′(α′) = σ ( f Oc (s, α′)) if α′ ∈ X , and σ′(α′) = (α′) otherwise.
The next theorem states that there is an adjunction from USF to DSF.
THEOREM B.3. For the categories USF and DSF the functor that assigns to an
undifferentiated substitution frame its basic refinement is a left-adjoint for the functor that assigns
to a differentiated substitution frame its underlying undifferentiated substitution frame.
Proof. Let R : USF → DSF be the functor that assigns to an undifferentiated substitu-
tion frame its basic refinement, and U : DSF → USF be the forgetful functor that assigns
to a differentiated substitution frame its underlying undifferentiated substitution frame.
For each F ∈ USF and G ∈ DSF, we get a bijection
F ,G : DSF(R(F), G) → USF(F, U (G))
by assigning f S , f O to the morphism f S , f O , f Oc . It is not difficult to see that the
family of bijections
: DSF(R(F), G) ≅ USF(F, U (G))
is natural in F and G. So, R, U, is an adjunction from USF to DSF.
The theorem can also be proved by characterizing an adjunction in terms of the unit of
adjunction (see, e.g., Theorem IV.1.2(i) of Mac Lane, 1998, p. 83). The unit of adjunction
is in this case simply the identity natural transformation η : IUSF → IUSF.
OBSERVATION C.1. Not for every differentiated substitution frame and every mapping
μ : Oc → Oc we have (s · (μ · )) · σ = s · (μ · σ ).
Proof. Let G = S, O, Oc, , be a differentiated substitution frame without any
symmetry and with a state s for which Oc(s) = {α0 , α1 } with α0 ≠ α1 and (α0 ) =
(α1 ) = a. Let μ : Oc → Oc be such that μ(α0 ) = α1 and μ(α1 ) = α0 . Then s ·(μ·) =
s, but for any b ∈ O with b ≠ a, if σ (α0 ) = a and σ (α1 ) = b, then s · σ ≠ s · (μ · σ ). So,
(s · (μ · )) · σ ≠ s · (μ · σ ).
To a transition s →σ s′ more than one mapping μ : Oc → Oc may correspond. The next
observation shows that it is unlikely that we can always select one of them as a canonical
mapping.
OBSERVATION C.2. Not for every differentiated substitution frame, there is a
representative function μ : S × O^Oc → Oc^Oc preserving composition, that is,
1. μ(s, σ ) · = σ,
2. s · (μ(s, σ ) · σ ) = (s · σ ) · σ ,
3. μ(s, μ(s, σ ) · σ ) = μ(s, σ ) · μ(s · σ, σ ).
Let s1 = s0 ·F δ1 with δ1↾Ob(s0) given by a ↦ a, b ↦ b, c ↦ a, d ↦ b.
Further, let Oc(s1 ) = {0, 1, 2, 3} with (0) = (2) = a, and let σ1↾Oc(s1) be given by
0 ↦ b, 1 ↦ a, 2 ↦ b, 3 ↦ a.
Then s1 · σ1 = s1 . It is not difficult to see that by Properties (1) and (2) of μ,
μ(s1 , σ1 )↾Oc(s1) is either
0 ↦ 3, 1 ↦ 0, 2 ↦ 1, 3 ↦ 2 or 0 ↦ 1, 1 ↦ 2, 2 ↦ 3, 3 ↦ 0.
So, μ(s1 , σ1 ) · μ(s1 , σ1 ) ≠Oc(s1 ) idOc .
By Properties (1) and (3) of μ we have for any s ∈ S:
μ(s, ) = μ(s, μ(s, ) · )
= μ(s, ) · μ(s · , )
So, because for any s ∈ S, μ(s, )[Oc(s)] = Oc(s), we have μ(s, ) =Oc(s) idOc , and
so, because μ(s1 , σ1 ) · σ1 =Oc(s1 ) , we have μ(s1 , μ(s1 , σ1 ) · σ1 ) =Oc(s1 ) idOc .
It follows that μ(s1 , μ(s1 , σ1 ) · σ1 ) ≠ μ(s1 , σ1 ) · μ(s1 , σ1 ) = μ(s1 , σ1 ) · μ(s1 · σ1 , σ1 ),
contradicting Property (3) of μ.
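The finite computations in this argument can be checked mechanically. The sketch below takes the four occurrences 0, 1, 2, 3 of s1 and assumes, in line with the effect of δ1, that occurrences 0 and 2 are occurrences of a and occurrences 1 and 3 are occurrences of b; the name `obj_of` for the occurrence-to-object map is hypothetical, and composition is read as "first the left map, then the right map".

```python
# Occurrence-to-object map on Oc(s1); the values for 1 and 3 are an assumption.
obj_of = {0: "a", 1: "b", 2: "a", 3: "b"}
sigma1 = {0: "b", 1: "a", 2: "b", 3: "a"}

# The two candidate values of mu(s1, sigma1) on Oc(s1).
mu_a = {0: 3, 1: 0, 2: 1, 3: 2}
mu_b = {0: 1, 1: 2, 2: 3, 3: 0}


def compose(f, g):
    """(f . g)(x) = g(f(x)): apply f first, then g."""
    return {x: g[f[x]] for x in f}


identity = {i: i for i in range(4)}

# Property (1): mu followed by the occurrence-to-object map gives sigma1.
property1 = all(compose(mu, obj_of) == sigma1 for mu in (mu_a, mu_b))

# mu followed by sigma1 returns every occurrence to its own object.
back_to_objects = all(compose(mu, sigma1) == obj_of for mu in (mu_a, mu_b))

# Neither candidate squares to the identity on Oc(s1).
squares_differ = all(compose(mu, mu) != identity for mu in (mu_a, mu_b))
```

All three checks hold for both candidates, which is what the contradiction with Property (3) relies on; Property (2) depends on the frame itself and is not modeled here.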