
THE REVIEW OF SYMBOLIC LOGIC

Vol. 3, No. 1 • March 2010 • Pages 1–174


Edited by
Jeremy Avigad, Coordinating Editor
Horacio Arlo-Costa
Patrick Blackburn
Paolo Mancosu
Ian Proops
Gregory Restall
Alasdair Urquhart
Richard Zach

VOLUME 3 • NUMBER 1 • MARCH 2010 • ISSN 1755-0203

Copyright © 2010 by the Association for Symbolic Logic. All rights reserved.
Reproduction by photostat, photo-print, microfilm, or like process by permission only.

Cambridge Journals Online


For further information about this journal
please go to the journal web site at:
journals.cambridge.org/rsl

PUBLISHED QUARTERLY BY THE ASSOCIATION FOR SYMBOLIC LOGIC, INC.
WITH SUPPORT FROM INSTITUTIONAL MEMBERS.
Coordinating Editor
Jeremy Avigad, Departments of Philosophy and Mathematical Sciences, Carnegie Mellon University

Editors
Horacio Arlo-Costa, Department of Philosophy, Carnegie Mellon University
Patrick Blackburn, Equipe TALARIS, Batiment B, INRIA Lorraine
Paolo Mancosu, Department of Philosophy, University of California, Berkeley
Ian Proops, Department of Philosophy, University of Michigan
Gregory Restall, Department of Philosophy, University of Melbourne
Alasdair Urquhart, Departments of Philosophy and Computer Science, University of Toronto
Richard Zach, Department of Philosophy, University of Calgary

Advisory Board
Steve Awodey, Department of Philosophy, Carnegie Mellon University
Hartry Field, Department of Philosophy, New York University
Kit Fine, Departments of Philosophy and Mathematics, New York University
Michael Friedman, Department of Philosophy, Stanford University
Marcus Kracht, Department of Linguistic and Literary Studies, University of Bielefeld
John MacFarlane, Department of Philosophy and Group in Logic and Methodology of Science,
University of California, Berkeley
Ruth Barcan Marcus, Department of Philosophy, Yale University
D.A. Martin, Departments of Mathematics and Philosophy, University of California, Los Angeles
Lawrence Moss, Department of Mathematics, Indiana University
Ulrike Sattler, School of Computer Science, University of Manchester
Colin Stirling, School of Informatics, University of Edinburgh
James Tappenden, Department of Philosophy, University of Michigan
Johan van Benthem, Institute for Logic, Language and Computation, University of Amsterdam
and Department of Philosophy, Stanford University
Michiel van Lambalgen, Institute for Logic, Language and Computation and Department of
Philosophy, University of Amsterdam
Dag Westerståhl, Department of Philosophy, Gothenburg University
Mark Wilson, Department of Philosophy, Pittsburgh University
Crispin Wright, Department of Philosophy, University of St. Andrews and New York University

Information for Contributors

Aims and Scope. The Review of Symbolic Logic is a newly established journal from the
Association for Symbolic Logic, published in partnership with Cambridge University Press.
The Review of Symbolic Logic will publish papers in: philosophical and non-classical logics,
algebraic logic, and their applications in such fields as computer science, linguistics, game
theory and decision theory, formal epistemology, and cognitive science; history and philosophy
of logic; philosophy and methodology of mathematics, past and present.

Submission of Manuscripts. Manuscripts should be submitted to the Coordinating Editor
at rsl@uci.edu. Electronic submission is encouraged: send email with the manuscript file
attached in PDF format. The body of the email should include the title of the paper, the
authors, its length in pages, and a clear-text copy of the abstract. Authors are encouraged
to indicate which editor they would prefer to have handle their papers. Any method of
producing the PDF is fine, but LaTeX is recommended as it can be used for typesetting
the final paper.

Electronic Manuscripts. The publisher encourages submission of manuscripts in LaTeX,
which can be used for direct typesetting. Authors using LaTeX should use the RSL LaTeX class
file. This, along with related files, can be obtained using anonymous FTP from
ftp://ftp.cambridge.org/pub/texarchive/journals/latex/rsl-cls. If you have difficulties obtaining
these files please contact dtranah@cambridge.org; there is also a help-line available via
email: please contact texline@cup.cam.ac.uk. While use of the RSL class file is preferred,
plain LaTeX or TeX files can also be accepted.

Layout of Manuscripts. Manuscripts should begin with an abstract of not more than 300
words. Papers should conform to a good standard of English prose; please consult a style
guide such as The Elements of Style by Strunk and White (New York: Macmillan). Do not
begin sentences with a symbol or identifier name. Present programs in one of two styles:
either with identifiers in italics and keywords in bold, or entirely in a fixed-width teletype
font. Please supply Web URLs for the home page of each author of the paper.

References. The Harvard system of references should be used. Citations are by author’s
surname and year of publication, and may stand either as a noun phrase (e.g., “Curry (1933)”)
or as a parenthetical note (e.g., “(Curry 1933)”). List references at the end of the text in
alphabetical order. A typical entry is: Curry, H.B. (1933) Apparent variables from the standpoint of
mathematical logic, Ann. of Math., 34 (2): 381–404.

Artwork. To ensure that your figures are reproduced to the highest possible standards,
Cambridge Journals recommends the following formats and resolutions for supplying
electronic figures. LINE ARTWORK Format: tif or eps; Resolution: 1200 dpi. BLACK AND
WHITE HALFTONE Format: tif; Resolution: 300 dpi. COMBINATION ARTWORK Format:
tif; Resolution: 800 dpi. If you require further guidance on creating suitable electronic figures
please visit http://dx.sheridan.com/guidelines/digital_art.html. Here you will find extensive
guidelines on preparing artwork and gain access to an online preflighting tool where you can
check to see if your figures are suitable for reproduction. A list of captions for figures should
be supplied in a separate file.

Copyediting and Proofreading. The publisher reserves the right to copyedit and proofread
all articles for publication, but the corresponding author will receive page proofs for final
proofreading. These should be checked and returned within three days of receipt. Only
typographical or factual errors may be changed at the proof stage. The publisher reserves the right
to charge authors for excessive correction of non-typographical errors.

Offprints. No paper offprints are provided, but the corresponding author will be sent a link
to the pdf of the published article.

Home Page. Information about Review of Symbolic Logic may be viewed on the Cambridge
University Press home page. The location of this home page is: journals.cambridge.org/rsl

TABLE OF CONTENTS

Intensionality and Paradoxes in Ramsey’s ‘The Foundations of Mathematics’
DUSTIN TUCKER

The Modal Logic of Stone Spaces: Diamond as Derivative
GURAM BEZHANISHVILI, LEO ESAKIA AND DAVID GABELAIA

Relevance Logic and The Calculus of Relations
ROGER D. MADDUX

Transfinite Numbers in Paraconsistent Set Theory
ZACH WEBER

Pasch’s Philosophy of Mathematics
DIRK SCHLIMM

Two (or Three) Notions of Finitism
MIHAI GANEA

Modeling Occurrences of Objects in Relations
JOOP LEO

Subscription Information: Review of Symbolic Logic (ISSN 1755-0203) is published
quarterly by Cambridge University Press for the Association for Symbolic Logic. Annual
subscription rates for Volume 3 (2010): Institutional subscription rates, print and online: US $620.00
in the USA, Canada, and Mexico; UK £365.00 + VAT elsewhere. Institutional subscription rates,
online only: US $520.00 in the USA, Canada, and Mexico; UK £336.00 + VAT elsewhere.
Institutional subscription rates, print only: US $545.00 in the USA, Canada, and Mexico; UK
£350.00 + VAT elsewhere. Individual subscription rates, online only: US $130.00 in the USA,
Canada, and Mexico; UK £80.00 + VAT elsewhere. Individual subscription rates, print only:
US $140.00 in the USA, Canada, and Mexico; UK £85.00 + VAT elsewhere. Institutional
subscription correspondence should be sent to: Cambridge University Press, 100 Brook Hill Drive,
West Nyack, NY 10994, USA, for customers in the USA, Canada, or Mexico. Customers
elsewhere should contact: Cambridge University Press, The Edinburgh Building, Shaftesbury Road,
Cambridge CB2 8RU, UK.

Association for Symbolic Logic. Individual Members of the Association for Symbolic Logic
receive a print subscription to The Review of Symbolic Logic as a benefit of membership.
Requests for information, applications for membership, orders for back volumes, business
correspondence, and notices and announcements for publication in The Bulletin of Symbolic
Logic should be sent to the Secretary-Treasurer of the Association: Charles Steinhorn, ASL,
Box 742, Vassar College, 124 Raymond Avenue, Poughkeepsie, NY 12604, USA. The email
address of the Association’s business office is asl@vassar.edu. The ASL website is located at
http://www.aslonline.org. Links from that site provide further information on the Review and
on submitting papers for publication.

Coordinating Editor: Jeremy Avigad, Department of Philosophy, Carnegie Mellon
University, Pittsburgh, PA 15213, USA. Email: avigad@cmu.edu.

Copyright: All rights reserved. No part of this publication may be reproduced, in any form
or by any means, electronic, photocopy, or otherwise, without permission in writing from
Cambridge University Press. Photocopying information for users in the U.S.A.: Copying
for internal or personal use beyond that permitted by Sec. 107 or 108 of the U.S. Copyright
Law is authorized for users duly registered with the Copyright Clearance Center (CCC)
Transaction Reporting Service, provided that the appropriate remittance per article is
paid directly to CCC, 222 Rosewood Drive, Danvers, MA 01923, USA. Specific written
permission must be obtained for all other copying. General enquiries from the USA, Mexico,
and Canada should be addressed to the New York office of Cambridge University Press
http://www.cambridge.org/us/information/rights/contacts/newyork.htm; general enquiries
from elsewhere should be addressed to the Cambridge office
http://www.cambridge.org/uk/information/rights/contacts/cambridge.htm; permission enquiries
from Australia and New Zealand should be addressed to the Melbourne office
http://www.cambridge.org/aus/information/contacts_melbourne.htm; enquiries regarding
Spanish-language translation rights (only) should be addressed to the Madrid office
http://www.cambridge.org/uk/information/rights/contacts/madrid.htm

Copyright © Association for Symbolic Logic, 2010


THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010

INTENSIONALITY AND PARADOXES IN RAMSEY’S
‘THE FOUNDATIONS OF MATHEMATICS’
DUSTIN TUCKER
Department of Philosophy, University of Michigan

Abstract. In ‘The Foundations of Mathematics’, Frank Ramsey separates paradoxes into two
groups, now taken to be the logical and the semantical. But he also revises the logical system
developed in Whitehead and Russell’s Principia Mathematica, and in particular attempts to provide
an alternate resolution of the semantical paradoxes. I reconstruct the logic that he develops for this
purpose, and argue that it falls well short of his goals. I then argue that the two groups of paradoxes
that Ramsey identifies are not properly thought of as the logical and semantical, and that in particular,
the group normally taken to be the semantical paradoxes includes other paradoxes—the intensional
paradoxes—which are not resolved by the standard metalinguistic approaches to the semantical
paradoxes. It thus seems that if we are to take Ramsey’s interest in these problems seriously, then the
intensional paradoxes deserve more widespread attention than they have historically received.

§1. Introduction. Frank Ramsey’s (1925) ‘The Foundations of Mathematics’ is remembered
almost exclusively for distinguishing two types of paradox:1 the logical and the
semantical. But as influential as this distinction was, its statement takes up less than a page,
while the reprinting of the paper in Braithwaite (1931) (and the reprinting in Mellor, 1990,
for that matter) is 61 pages long. One might wonder what Ramsey was doing in the other
60 pages.
The title of Ramsey’s work makes his purpose clear: he was attempting to construct the
foundations of mathematics. Specifically, he was attempting to revise the logical system
of Principia Mathematica (Whitehead & Russell, 1910)2 in order to avoid what he saw
as its three major defects. The first and third are concerned with classes and identity
respectively, and I will set them aside. The second defect is, ultimately, the axiom of
reducibility. Whitehead and Russell employ a ramified theory of types which, among other
things, requires that the ranges of bound variables be restricted not only by type, but also by
order. Because of this, one cannot talk about, for instance, all functions from individuals to
propositions, but only all functions of order n from individuals to propositions. The axiom
of reducibility was introduced as an attempt to correct the diminished power of the logic.

Received: June 1, 2008


1 A distinction is sometimes drawn between paradoxes and antinomies, according to which most
(but not all) of the problems Ramsey was concerned with are antinomies, not paradoxes. While
this distinction is an important one, it is not significant for my purposes in this paper. Thus, for
simplicity, I use ‘paradox’ throughout.
2 Although I cite the first edition here and throughout the paper (the few page references to
Principia Mathematica can be used with either edition), it is worth noting that Ramsey was
familiar with at least the Introduction to the second edition [p. 25]. All page references, such
as the one in the previous sentence, are to the reprinting of Ramsey (1925) in Braithwaite (1931)
unless otherwise noted.


doi:10.1017/S1755020309990359

This axiom says that for every higher order function, there is an extensionally equivalent
first-order function.
The axiom of reducibility was never popular; even the Introduction to the second edition
of Principia Mathematica attempts to do without it. Since it is only necessary in a ramified
theory of types, Ramsey attempts to develop a nonramified theory of types that can serve
as the foundation of mathematics. The cited motivation for the ramified theory of types is a
collection of seven paradoxes (Whitehead & Russell, 1910, pp. 60–61). Ramsey observes
that these paradoxes can be divided into two classes, which have since come to be called
the semantical and the logical,3 and that ramification is only required for the former. Thus,
with the aim of eliminating the axiom of reducibility, he attempts to develop a nonramified
theory of types—one that only restricts the ranges of bound variables by type—in which
the semantical paradoxes do not arise.
My original aim in studying this section of Ramsey’s paper was twofold. (i) I found
it surprising that a proposed resolution of the semantical paradoxes by a figure such as
Ramsey could have been completely ignored for over 80 years. Even contemporary reviews,
such as Church (1932) and Russell (1932), say nothing about this resolution, and I have
not seen any discussion of it in later work. I had hoped that Ramsey’s resolution would
provide fresh ideas about the semantical paradoxes, ideas not colored by Tarski’s
subsequent analysis. At least, I hoped, it might illuminate a more familiar idea in a new way.
(ii) I wanted to determine whether his resolution could also handle another class of
paradoxes, which are specific to intensional logics (and which I thus call the intensional
paradoxes). Again, the hope was that even if it could not resolve them, it would illuminate
something or suggest something different from what has been done in the intervening
decades.
Unfortunately, my conclusions seem to divest Ramsey’s resolution of almost all its
interest qua logical system. With respect to (i), I think not only that he fails to resolve the
semantical paradoxes, but also that the failure involves such confusion that there is little
insight into those paradoxes to be gained from examining his failure. I am loath to attribute
such confusion to anybody, of course, and especially to someone like Ramsey, but I think
that it is the most charitable reading of his attempted resolution. (ii) fares no better, as
I think that there is nothing to be gleaned from his system’s inability to avoid these other
paradoxes.
In spite of this divestment, I think that it is at least interesting when someone like
Ramsey falls into such confusion. And, of course, perhaps my interpretation is incorrect,
and Ramsey actually has developed a novel and satisfactory resolution of the semantical
paradoxes. I might then hope that this exposition could lead to a better understanding of
such a resolution. Thus, one main aim of this paper is to reconstruct the logic Ramsey
employs in ‘The Foundations of Mathematics’ and to understand how it is intended to
resolve the semantical paradoxes. Unfortunately, his resolution of the paradoxes is far
from transparent. It is somewhat reminiscent of Tarski’s hierarchy of languages in that
Ramsey thinks that the resolution of the paradoxes relies on distinguishing different senses
of ‘mean’, but the details are significantly murkier than those of Tarski’s now-classical
resolution. Also, Ramsey does not develop a hierarchy of meaning relations, and he does
not relativize meaning to a language the way Tarski does. Part I is thus devoted to a
reconstruction and analysis of Ramsey’s logic, the technical consequences of his account

3 Actually, I do not think that his groups are the same as the semantical and the logical paradoxes;
this is the subject of Part II.

of meaning, and the application of those consequences to that logic in an attempt to resolve
the semantical paradoxes.
Part II is concerned with a slightly different topic. I have said that ‘The Foundations of
Mathematics’ is largely remembered for its division of paradoxes into two groups, Groups
A and B, which are now known as the logical and the semantical respectively. Certainly
this is how people have historically taken the distinction Ramsey draws.4 However, I think
that this misrepresents his division. In particular, I think that there are paradoxes in his
Group B that are not resolved by the common resolutions forwarded for the semantical
paradoxes—that are arguably not semantical paradoxes.5 Even more particularly, I think
that the extra paradoxes in Group B are the intensional paradoxes that I mentioned in (ii)
above. My general interest in the intensional paradoxes is that they pose problems for
logics attempting to capture propositional attitudes, but I will not argue for that point here.
My sole concern in Part II is to argue that Ramsey includes paradoxes in his Group B that
are not resolved by metalinguistic resolutions of the semantical paradoxes.
PART I

RAMSEY’S RESOLUTION OF
THE SEMANTICAL PARADOXES

§2. Ramsey’s intensional logic. Ramsey employs a fairly straightforward form of
simple type theory with lambda abstraction. Since the now-standard notation for the
lambda operator had not yet been introduced, he writes, for example, ‘φ x̂’ instead of
‘λx[φ(x)]’, but the meaning is the same. I will use the more modern notation.6 I also
replace all scope-disambiguating dots with brackets, use more modern connectives, and
write all instances of functional application (other than that of connectives) as the function
followed by all of the arguments.7 Finally, Ramsey’s system differs from standard type
theories in that well-formed formulas belong to a type p of propositions, rather than to the
type t of truth-values.8 What follows is an explicit characterization of the language that
I take Ramsey to be working with.
There are primitive types i and p. If τ and σ are types, then ⟨τ, σ⟩ is a type. I sometimes
refer to types constructed in this way as functional types. For every type τ, there is

4 See, for example, Fraenkel & Bar-Hillel (1958, pp. 5–14), Beth (1959, §171), Kneale & Kneale
(1962, pp. 664–665), Quine (1963, pp. 254–255), Feferman (1984, p. 75), and Priest (1994,
pp. 25–26).
5 Sort of not semantical paradoxes, anyway. I argue that, while all of Ramsey’s formalizations of
the Group B paradoxes will be prohibited by any resolution of the semantical paradoxes, those
formalizations are not the only ones we should be concerned with. I then attempt to show that
other, better formalizations of those paradoxes do not admit of similar resolutions, and argue that
one therefore ought to not be satisfied as soon as one has a resolution of the semantical paradoxes,
at least if one takes Ramsey’s concern with these paradoxes seriously. These details are the subject
of Section 8.
6 Most of the time. But, as in Section 3, it is sometimes easier to use Ramsey’s notation when
quoting him.
7 I also Curry the logic—I treat, for instance, binary relations as functions to other functions—but
this has no substantive impact on the resulting system, and could be eliminated at the cost of only
a little simplicity.
8 In my formulation of the logic, there are also well-formed formulas of type i and various
functional types (see below). But these well-formed formulas play no role beyond simplifying
the formation rules.

an infinite alphabet of variables with superscript τ. These superscripts are often omitted
when no undesired ambiguity thereby arises. There are two primitive constants, ∼ with
superscript ⟨p, p⟩ and ; with superscript ⟨p, ⟨p, p⟩⟩, and infinitely many constants with
superscript ⟨⟨τ, p⟩, p⟩, one for each type τ. The other primitives are λ, [, ], and any
primitive constants of determinate type that one wishes to include.
If f is a variable or constant with superscript τ , it is a well-formed formula of type τ .
If P is a well-formed formula of type τ and x is a variable with superscript σ, λx[P] is a
well-formed formula of type ⟨σ, τ⟩. If P is a well-formed formula of type ⟨τ, σ⟩ and Q is a
well-formed formula of type τ , then PQ (often written P(Q)) is a well-formed formula of
type σ. I abbreviate P(Q)(R) (i.e., PQR) with P(Q, R), ; (P, Q) with P ; Q, and
(λx[P]) with x[P]. I use square brackets to disambiguate scope when these abbrevia-
tions lead to ambiguity.
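
The formation rules just stated can be sketched as a small type checker. This is an illustration only, not part of Ramsey's apparatus or of the reconstruction itself: the Python encoding of types as "i", "p", or a pair (argument, result), the class names Atom, Lam, and App, and the function type_of are all assumptions introduced here.

```python
from dataclasses import dataclass

# Assumed encoding: a type is "i", "p", or a pair (arg, res)
# standing for the functional type <arg, res>.

@dataclass(frozen=True)
class Atom:
    """A variable or constant carrying its type as a superscript."""
    name: str
    ty: object

@dataclass(frozen=True)
class Lam:
    """Lambda abstraction: λx[P]."""
    var: Atom
    body: object

@dataclass(frozen=True)
class App:
    """Functional application: P(Q)."""
    fn: object
    arg: object

def type_of(term):
    """Return the type of a well-formed formula, or raise TypeError."""
    if isinstance(term, Atom):
        return term.ty
    if isinstance(term, Lam):
        # x of type σ and P of type τ give λx[P] of type <σ, τ>.
        return (term.var.ty, type_of(term.body))
    if isinstance(term, App):
        fn_ty = type_of(term.fn)
        arg_ty = type_of(term.arg)
        # P of type <τ, σ> applied to Q of type τ gives type σ.
        if not isinstance(fn_ty, tuple) or fn_ty[0] != arg_ty:
            raise TypeError(f"cannot apply {fn_ty} to {arg_ty}")
        return fn_ty[1]
    raise TypeError("not a term")

# Hypothetical example: a Curried binary relation S on individuals.
S = Atom("S", ("i", ("i", "p")))
a = Atom("a", "i")
x = Atom("x", "i")
print(type_of(App(App(S, a), x)))          # S(a, x) has type p
print(type_of(Lam(x, App(App(S, a), x))))  # λx[S(a, x)] has type <i, p>
```

Because application is Curried, S(a, x) is checked as (S(a))(x), which is why a binary relation on individuals receives the type ⟨i, ⟨i, p⟩⟩ in this sketch.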
Ramsey assigns numbers to some functional types: “A function of individuals we will
call a function of type 1; a function of functions of individuals, a function of type 2; and
so on” [p. 46]. Using the above notation, we can define these recursively. Anything of
type ⟨i, p⟩, ⟨i, ⟨i, p⟩⟩, ⟨i, ⟨i, ⟨i, p⟩⟩⟩, and so forth is of Type 1; given types τᵢ of Type n,
anything of type ⟨τ₁, p⟩, ⟨τ₁, ⟨τ₂, p⟩⟩, ⟨τ₁, ⟨τ₂, ⟨τ₃, p⟩⟩⟩, and so forth or of type ⟨τ₁, τ₂⟩,
⟨τ₁, ⟨τ₂, τ₃⟩⟩, ⟨τ₁, ⟨τ₂, ⟨τ₃, τ₄⟩⟩⟩, and so forth is of Type n + 1. To simplify later definitions
slightly, I will also take anything of type i to be of Type 0.
Ramsey actually takes these numbered collections of types to be types themselves,
and does not distinguish between the different types within each collection. Thus, when
I indicate types with superscripts below, I often follow Ramsey and ambiguously use
numbers as though they denote types rather than collections of types. This undoubtedly
raises difficult technical problems, but I think that I can safely ignore them for the purposes
of this paper.
The numbered types are the only functional types that Ramsey considers. These are
all types of propositional functions:9 the last type symbol appearing in their symbols is
always p. This is what I mean when I say that Ramsey’s logic is intensional—repeated
functional application always results in well-formed formulas of type p rather than of the
type t of truth-values. In this respect, Ramsey’s logic is similar to the Russellian simple
type theory presented in Church (1974). I follow Thomason (1980) in using different
symbols for the primitive constants of this logic, replacing the standard truth-functional
constants ¬, →, and ∀ with my intensional ∼, ;, and respectively. The remaining
symbols ∨, ∧, ↔, and ∃ are replaced with ∪, ∩, , and respectively, which are defined
in the usual way.10

§3. Terminology. I have not tried to provide a model for Ramsey’s logic because
model theory had not been invented when Ramsey was writing. One probably could construct
models for the logic I have just described, but they are not necessary for explaining
Ramsey’s resolution of the paradoxes, so I will not be concerned to do so. Of course, if
his resolution seemed promising, we would need to confirm that he really has provided a

9 Except for my added Type 0, of course.


10 One might not want to define ∪, ∩, , and in terms of ∼, ;, and —one might think that, for
instance, the proposition that dogs and cats bark is different from the proposition that neither cats
nor dogs don’t bark. The approach I am taking is a little simpler, but it is hardly more complex
to take all of the connectives as primitive.

solution by constructing a model, but I argue that his resolution is not successful, so this
should not be an issue.
This raises some terminological worries, because Ramsey uses ‘proposition’ and ‘function’,
words that are now associated with models. Ramsey frequently uses these terms to
refer to strings of symbols. Thus, ‘φa’ often is a proposition—it isn’t just a formula of
type p—and ‘λx[φ(x)]’ often is a function.
Even for Ramsey, this is not the correct use of ‘proposition’—‘φa’ is a propositional
symbol, and propositional symbols are instances of propositions. Then “two propositional
symbols are to be regarded as instances of the same proposition . . . when they
express agreement and disagreement with the same sets of truth-possibilities of atomic
propositions” [p. 9]. That is, they are instances of the same proposition when their
truth tables agree. He reiterates this criterion of sameness at pp. 33, 34 and employs it
again at p. 35.11
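
The truth-table criterion of sameness can be illustrated with a short check. The sketch is an assumption-laden illustration rather than anything in Ramsey: propositional symbols are modeled as Python functions from an assignment of truth-values to a truth-value, and the name same_proposition is hypothetical.

```python
from itertools import product

def same_proposition(f, g, atoms):
    """True when f and g agree under every truth-possibility of the given atoms."""
    for values in product([True, False], repeat=len(atoms)):
        row = dict(zip(atoms, values))   # one truth-possibility
        if f(row) != g(row):
            return False
    return True

# Two symbols that are quite different to look at but have the same truth table.
not_p_or_q = lambda v: (not v["p"]) or v["q"]
if_p_then_q = lambda v: v["q"] if v["p"] else True

print(same_proposition(not_p_or_q, if_p_then_q, ["p", "q"]))       # True
print(same_proposition(not_p_or_q, lambda v: v["q"], ["p", "q"]))  # False
```

The check enumerates all 2^n assignments for n atomic propositions, which is fine for illustration but grows quickly with n.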
In contrast to propositions, Ramsey, at least most of the time, insists that functions
are just symbols. He initially defines ‘propositional function’ as “an expression of the
form ‘ f x̂’, which is such that it expresses a proposition when any symbol (of a certain
appropriate logical type depending on f ) is substituted for ‘x̂’ ” [p. 8]. Similar statements
litter ‘The Foundations of Mathematics’: “a propositional function of individuals [is] a
symbol of the form . . .” [p. 35], “functions are symbols” [p. 36], and “our definition of a
propositional function as itself a symbol” [p. 43] are some examples. Only twice does he
suggest that there is a distinction to be drawn between functions and functional symbols,
as there was for propositions. As with propositions, it arises when he is explaining identity
for functions; thus, for example, he writes,
Two such symbols [i.e., propositional functional symbols] are regarded
as the same function when the substitution of the same set of names
in the one and in the other always gives the same proposition. Thus if
‘ f (a, b, c)’, ‘g(a, b, c)’ are the same proposition for any set of a, b, c,
‘ f (x̂, ŷ, ẑ)’ and ‘g(x̂, ŷ, ẑ)’ are the same function, even if they are quite
different to look at. [p. 35]
Here it seems as though he has a notion of function that is distinct from the symbols he
has been dealing with.
This, however, is an isolated case, and rarely arises. In fact, Ramsey seldom even invokes
the distinction between propositions and propositional symbols. He is almost exclusively
concerned with propositional and functional symbols12 —in particular, his discussion and
proposed resolution of the semantical paradoxes only make use of these symbolic senses
of ‘proposition’ and ‘function’. Still, in what follows, I will use ‘propositional symbol’ and
‘functional symbol’ whenever I am talking about the actual formulas, and I will edit quotes
from Ramsey accordingly when doing so does not obscure anything relevant. In light of
this exclusive concern with propositional and functional symbols, I almost always follow
the practice of allowing symbols to name themselves and omit mention (and corner) quotes
around them. Section 2 is a good illustration of this practice; the beginning of this section,
an illustration of an exception.

11 This should sound similar to things like Carnap’s state descriptions and more modern possible
worlds. I revisit this in Section 7.3.
12 Ramsey never uses the latter term, but given that propositional symbols are well-formed formulas
of type p, I will take functional symbols to be well-formed formulas of any functional type.

§4. Meaning. As mentioned above, Ramsey approaches the semantical paradoxes by
way of meaning. He makes a number of claims about meaning at pp. 43–44. At best, what
he says is intricate—he uses ‘mean’ in at least three distinct ways in a passage of less than
two pages’ length—and at worst, it is poorly thought out—immediately after insisting that
“to speak of ‘F’ as meaning λx[F(x)] at all must appear very odd in view of our definition
of a propositional function [such as λx[F(x)] ] as itself a symbol [as opposed to an object],”
he is happy to talk about an “object . . . S,” where ‘S’ is “the name of a relation” [p. 43].
I think that one can construct a theory of meaning that supports (most of) what Ramsey
says in these pages, but the details are complex.
Luckily, most of what he says about meaning is not strictly relevant to his resolution
of the semantical paradoxes. This is because his resolution relies on his notion of orders,
which arise only for well-formed formulas and symbols introduced by definitions. His
motive for introducing orders relies on his account of meaning, but the formal machinery
employed in his resolution of the paradoxes can be presented and understood without
appealing to any meaning relation beyond the relation between definiendum and definiens.
Put another way, for the purposes of Ramsey’s resolution of the paradoxes, his formalization
of “the relation of meaning between ‘φ’ and λx[φ(x)]” [p. 42] can be thought of
as only taking symbols introduced via definition for its first argument. He formalizes this
relation as R. It is not clear whether Ramsey thought seriously about the meanings of any
other symbols, but I will not pursue this issue further.13
Before moving to the formal machinery just mentioned, it is worth considering what
the arguments to R actually are. The first is obviously a symbol. The second seems to
obviously be a function. But recall that for Ramsey, there are no functions—there are only
functional symbols. The second argument, then, must be a functional symbol, and this is
what Ramsey takes it to be. Thus, he writes,
to speak of ‘F’ as meaning λx[F(x)] at all must appear very odd in
view of our definition of a propositional function [such as λx[F(x)] ] as
itself a symbol. But the expression is merely elliptical. . . . [I]t is clearly
an impossible simplification to suppose that there is a single object F,
which [‘F’] means. [p. 43]
 
The expression that Ramsey is talking about can be written as ‘R(‘F’, λx[F(x)])’. He is
saying quite explicitly that both arguments are symbols, and that the expression is elliptical
for something more complicated. He employs the complicated, possibly confused theory
of meaning that I discussed above in explaining what the expression is elliptical for—what
‘F’ actually does mean, if not a single object F. But since R is the relation that appears
in his formalizations of the paradoxes,
 it is enough for the present purposes to understand
the properties of the expression ‘R ‘F’, λx[F(x)] ’. Thus we can again set to the side the
question of what this expression is elliptical for, the answer to which involves Ramsey’s
theory of meaning.

§5. Formal machinery.


5.1. Orders. Consider the following definitions. The first two are definitions that
Ramsey gives, and the third is at least one he would not object to. (Recall that superscripts
indicate a collection of types.)

13 I do not mean to say that he has no other views about meaning; he does. It is just not clear that
those views were well considered.

\[ \varphi(x) =_{df} S^{1}(a, x), \]
\[ \psi(x) =_{df} \bigcap y^{0}\,[S^{1}(y, x)], \]
\[ \chi(x) =_{df} \bigcup\varphi^{1}\, f^{2}\bigl(\lambda z[\varphi(z)], x\bigr). \]
Ramsey thinks that ‘φ’, ‘ψ’, and ‘χ’ not only mean different things, but also do so in
quite different ways. As I said above, the details of his theory of meaning are not relevant
here. What matters is that Ramsey thinks that there is an important difference between ‘φ’,
‘ψ’, and ‘χ’ based on the structures of their definientia: the definiens of ‘φ’ contains no
bound variables, the definiens of ‘ψ’ contains a bound variable of Type 0, and the definiens
of ‘χ’ contains a bound variable of Type 1. Ramsey introduces orders to capture this
difference.
Ramsey’s use of ‘order’ here is unfortunate, because it recalls Russell’s use of the term in
connection with ramified type hierarchies. Unlike Russell’s orders, Ramsey’s are properties
of propositional and functional symbols; derivatively, they are also properties of symbols
whose definientia are propositional or functional symbols, such as ‘φ’, ‘ψ’, and ‘χ’.14
His orders are assigned as follows [pp. 46–47]. The order of a propositional symbol is
n + 1, where n is the highest type of bound variable appearing in the propositional symbol.
If a propositional symbol contains no bound variables, its order is 0.15 Ramsey does not
consider bound variables of any other types. This can be glossed as only allowing bound
variables of type i or of a type of propositional functional symbol.16 Functional symbols
are assigned orders in the obvious way: λxP has the same order as P. For any symbol f
such that either f =df P or f(x) =df P, we will say that the order of f is that of P.17 From
this definition, we can see that ‘φ’ is of Order 0, ‘ψ’ is of Order 1, and ‘χ ’ is of Order 2.
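The assignment of orders just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the classes `Atom` and `Quant` and the functions `bound_types` and `order` are hypothetical names introduced here, not anything in Ramsey's text, and λ-abstraction is ignored because λxP has the same order as P.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Atom:
    """A quantifier-free matrix such as S(a, x); it binds nothing."""
    text: str

@dataclass(frozen=True)
class Quant:
    """A quantifier binding one variable of type `var_type` over `body`."""
    var: str
    var_type: int
    body: "Formula"

Formula = Union[Atom, Quant]

def bound_types(p: Formula) -> set:
    """Collect the types of all quantifier-bound variables in p."""
    if isinstance(p, Quant):
        return {p.var_type} | bound_types(p.body)
    return set()

def order(p: Formula) -> int:
    """Order 0 if nothing is bound, else the highest bound type plus one."""
    types = bound_types(p)
    return max(types) + 1 if types else 0

# Definientia of 'φ', 'ψ', and 'χ' (only the binding structure matters here).
phi = Atom("S(a, x)")
psi = Quant("y", 0, Atom("S(y, x)"))
chi = Quant("φ", 1, Atom("f(λz[φ(z)], x)"))

print(order(phi), order(psi), order(chi))  # 0 1 2
```

The computation looks only at the types of bound variables, never at previously assigned orders, which matches the remark in note 15 that no hierarchy of orders is being built.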
Ramsey often takes the difference between his orders and those of ramified type theory
to be a matter of their definition—his orders, unlike those used for ramification, are de-
termined by the appearance of bound variables in strings of symbols. Now distinguishing
propositions from propositional symbols, he writes,
For me, propositions in themselves have no orders; they are just differ-
ent truth-functions of atomic propositions—a definite totality, depend-
ing only on what atomic propositions there are. Orders and illegitimate
totalities only come in with the symbols we use to symbolize the facts in
variously complicated ways.18 [pp. 48–49]
But this is not the most important difference between his orders and Russell’s. What
keeps Ramsey’s type theory from being ramified is his insistence that the ranges of bound
variables not be restricted by order. This means that in a propositional symbol such as

14 He is not this explicit about the way the orders of definienda are determined, but this is clearly
what he has in mind.
15 As an aside, it is interesting to note that Ramsey has not actually introduced a hierarchy of orders:
orders are not defined in terms of earlier orders, but of types.
16 These are the only functions that he considers [p. 35, n. 1].
17 Technically, this would require that there be mention quotes around the definiendum, as f is a
variable over symbols. In practice, and in keeping with my general laxity about mention quotes,
I take =df to imply mention rather than use and omit these quotes without loss of precision.
18 One might be concerned here that the distinction between propositions and propositional symbols
is playing more of a role in Ramsey’s resolution of the paradoxes than I claim, since it is being
appealed to here. But, as will be apparent below when the resolution is presented, the distinction
is only important in that propositions proper play no role whatsoever in the resolution.

‘⋃φ[f(φ)]’, the range of φ is “the set of [all] functions [of a specific type], not . . . the set
of [functions of order 1]” [p. 44, n. 1]. Similarly, Ramsey never allows functional symbols
that can only take arguments of certain orders. This is understandable—if he allowed such
functional symbols, he would have a hard time keeping orders out of bound variables.
Note that Ramsey characterizes the range of the variable φ as a set of functions. But
for Ramsey, there are no functions proper, just functional symbols, and in fact, we can
interpret this remark as taking the ranges of bound variables of type > 0 to actually be
sets of functional symbols (or, equivalently for my purposes, symbols that are short for
functional symbols—symbols such as the ‘φ’, ‘ψ’, and ‘χ ’ introduced above). However
implausible this sounds to modern ears, it follows from Ramsey’s insistence that there are
no functions but functional symbols. And even if it didn’t, I think that it plays a crucial role
in his analysis of the semantical paradoxes; see Section 6.2.
We can now see why I take the characteristic property of Ramsey’s orders to be that they
do not restrict the ranges of bound variables beyond their type. One might have thought
that there was no such restriction simply in virtue of his definition of orders. He, at least,
seems to have thought so. After all, how can properties of symbols restrict the ranges of
bound variables, which range over objects? But if I am right that the correct interpretation
of his resolution requires that at least some variables range over symbols, then his evasion
of such a restriction—and thus his evasion of ramification—is not so immediate. In fact,
I argue below that he does restrict the ranges of variables in ways that go beyond mere type
restrictions (although not, it turns out, by order, which is a separate problem).
5.2. The relations Rᵢ. We still do not have enough to resolve the semantical paradoxes,
because we have yet to make any changes to the formalism of simple type theory.
Orders have been defined, but they do not place any restrictions on anything. We have not
yet captured the motivation for his orders, namely, that ‘φ’, ‘ψ’, and ‘χ’ are related to their
definientia in importantly different ways.
Ramsey’s solution is that there is not just one meaning relation R, but instead one such
relation for each order. Then a formula of the form R(f, P) is true when both f actually does
mean P (in the elliptical sense mentioned above) and R is the relation corresponding to the
order of f. Given the above motivation, this idea is not entirely unreasonable. Although he
does not use this notation, I will use Rn to indicate the relation corresponding to order n,
so that we can say that Rn (f, P) is only true when f is of order n and f elliptically means P.
As an example of how this works, we can represent the above definitions with the
following formulas.

 
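\[ R_{0}\bigl(\text{‘}\varphi\text{’}, \lambda x[S^{1}(a, x)]\bigr), \]
\[ R_{1}\bigl(\text{‘}\psi\text{’}, \lambda x \bigcap y^{0}\,[S^{1}(y, x)]\bigr), \]
\[ R_{2}\bigl(\text{‘}\chi\text{’}, \lambda x \bigcup\varphi^{1}\, f^{2}\bigl(\lambda z[\varphi(z)], x\bigr)\bigr). \]

These are true (or, more accurately, denote true propositions), but would be false if the Rᵢs
were changed.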
Ramsey also considers meaning relations that are appropriate to multiple orders. For
instance, we could have a relation R that is only capable of being true when its first
argument is either of order n or of order m, which we might represent with ‘Rn,m ’. For
simplicity’s sake I will restrict myself to relations only appropriate to a single order. This
restriction has no substantive consequences.

It is very important to note that Rn (f, P) is not ill formed when f is not of order n, but
simply false. Orders cannot render formulas ill formed—to allow that would be to embrace
ramification.
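The behaviour of the indexed relations can be sketched as a total Boolean function. The table `DEFINITIONS` and the function `R` below are hypothetical stand-ins introduced for illustration; the definientia are represented by opaque labels rather than parsed formulas.

```python
# Hypothetical lookup: defined symbol -> (its definiens, the order of that definiens).
DEFINITIONS = {
    "φ": ("definiens of φ", 0),
    "ψ": ("definiens of ψ", 1),
    "χ": ("definiens of χ", 2),
}

def R(n: int, symbol: str, functional_symbol: str) -> bool:
    """Truth value of R_n(symbol, functional_symbol).

    Always returns a bool: a mismatched order makes the formula false;
    it never makes it ill formed (no exception is raised).
    """
    if symbol not in DEFINITIONS:
        return False
    definiens, symbol_order = DEFINITIONS[symbol]
    return definiens == functional_symbol and symbol_order == n

print(R(1, "ψ", "definiens of ψ"))  # True
print(R(0, "ψ", "definiens of ψ"))  # False, but still a well-formed query
```

The design choice that matters is that `R` is defined for every index and every pair of arguments; restricting its domain by order would correspond to the ramification Ramsey wants to avoid.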

§6. The semantical paradoxes.


6.1. Grelling’s paradox. With all this machinery in place, we can consider
Ramsey’s treatment of the semantical paradoxes. He examines Grelling’s paradox about
‘heterological’ (which he incorrectly attributes to Weyl) in the most detail.19 I will begin,
as he does, by formulating the paradox without the various Ri . He begins by defining ‘F’:
   
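\[ F(x) =_{df} \bigcup\varphi^{1}\,\bigl[R\bigl(x, \lambda z[\varphi(z)]\bigr) \cap {\sim}\varphi(x)\bigr]. \tag{1} \]
Since ‘R(x, y)’ means ‘x means y’, ‘F(x)’ is intended to be ‘x is heterological’. Ramsey
then thinks that we have
\[ R\bigl(\text{‘}F\text{’}, \lambda x[F(x)]\bigr). \tag{2} \]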
It is not entirely clear to me how he gets to this, but I am going to assume it. He seems
to generally have a disquotational theory of meaning, so that, for instance, ‘a’ means a,
where ‘a’ has been introduced as the name of some object, but I will not pursue the issue
further.20
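From (2) we have
\[ \bigcup\varphi^{1}\, R\bigl(\text{‘}F\text{’}, \lambda x[\varphi(x)]\bigr), \tag{3} \]
whence Ramsey concludes
\[ F(\text{‘}F\text{’}) \equiv {\sim}F(\text{‘}F\text{’}). \tag{4} \]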
It is worth looking more carefully at (4), because it is not clear that Ramsey can actually
derive it from (1)–(3) alone. The right-to-left direction is straightforward. We assume
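\[ {\sim}F(\text{‘}F\text{’}), \tag{5} \]
whence by (1)
\[ {\sim}\bigcup\varphi^{1}\,\bigl[R\bigl(\text{‘}F\text{’}, \lambda z[\varphi(z)]\bigr) \cap {\sim}\varphi(\text{‘}F\text{’})\bigr], \tag{6} \]
whence by (2)
\[ F(\text{‘}F\text{’}). \tag{7} \]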
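The left-to-right direction requires an additional assumption, though. There are many
premises that will do, but (8) is a particularly plausible principle.
\[ \bigcap\varphi\,\bigcap\psi\,\Bigl[\bigl[R\bigl(f, \lambda x[\varphi(x)]\bigr) \cap R\bigl(f, \lambda x[\psi(x)]\bigr)\bigr] \supset \bigcap x\,[\varphi(x) \equiv \psi(x)]\Bigr]. \tag{8} \]

(8) is an axiom schema. It can be glossed as the principle that all meanings of a given
symbol f are coextensive.
Now we can prove the left-to-right direction of (4). Assume
\[ F(\text{‘}F\text{’}), \tag{9} \]
whence by (1)
\[ \bigcup\varphi^{1}\,\bigl[R\bigl(\text{‘}F\text{’}, \lambda z[\varphi(z)]\bigr) \cap {\sim}\varphi(\text{‘}F\text{’})\bigr]. \tag{10} \]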

19 Ramsey’s formulation of this paradox is at the bottom of p. 42.


20 This is an example of what I was referring to in note 13.

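Now let ψ be a functional symbol such that
\[ R\bigl(\text{‘}F\text{’}, \lambda z[\psi(z)]\bigr) \cap {\sim}\psi(\text{‘}F\text{’}), \tag{11} \]
whence (via the first conjunct) by (2) and (8), letting f be ‘F’,
\[ \bigcap x\,[F(x) \equiv \psi(x)], \tag{12} \]
whence by the second conjunct of (11)
\[ {\sim}F(\text{‘}F\text{’}). \tag{13} \]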
Ramsey himself does not give any proof, and proceeds directly to (4) from (3). But
although they make no use of (3), I think that (1)–(13) are a reasonable reconstruction of
the reasoning he intends to employ, and that they make his resolution of Grelling’s paradox
more transparent.
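The reconstruction in (1)–(13) can be checked mechanically. The following Lean 4 sketch is an illustration only, and it rests on simplifying assumptions that are not in the text: functional symbols are modelled as ordinary predicates `Sym → Prop`, the intensional connectives are read as Lean's classical ones, and the names `Sym`, `qF` (standing in for the quoted symbol ‘F’), `h1`, `h2`, and `h8` are hypothetical labels for (1), (2), and (8).

```lean
theorem grelling {Sym : Type}
    (R : Sym → (Sym → Prop) → Prop)            -- the meaning relation R
    (F : Sym → Prop) (qF : Sym)                -- F, and qF for the symbol ‘F’
    (h1 : ∀ x, F x ↔ ∃ φ, R x φ ∧ ¬ φ x)       -- (1)
    (h2 : R qF F)                              -- (2)
    (h8 : ∀ (f : Sym) (φ ψ : Sym → Prop),
            R f φ → R f ψ → ∀ x, (φ x ↔ ψ x))  -- (8)
    : F qF ↔ ¬ F qF := by                      -- (4)
  constructor
  · -- left to right: steps (9)–(13)
    intro hF
    have ⟨ψ, hR, hnψ⟩ := (h1 qF).mp hF         -- (10), (11)
    have h12 := h8 qF F ψ h2 hR                -- (12)
    exact fun hF' => hnψ ((h12 qF).mp hF')     -- (13)
  · -- right to left: steps (5)–(7)
    intro hnF
    exact (h1 qF).mpr ⟨F, h2, hnF⟩
```

As in the prose derivation, (8) is used only in the left-to-right direction, and (3) is not used at all.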
6.2. Ramsey’s resolution of Grelling’s paradox. As one might expect, Ramsey’s res-
olution relies on orders [pp. 45–46]. As R is actually ambiguous between the different
relations that are appropriate to symbols of different orders, all of the Rs in (1)–(13) have
to be replaced by appropriate Ri . To resolve the paradox, Ramsey argues that no matter
what Rn is used in (1), (2) will only be true if it employs some Rm , m > n. That is,
Ramsey claims that no matter what Rn is used in (1), ‘F’ will be of order greater than n.
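In eliminating R, we must replace (8) with something like (8′), an axiom schema yielding
an axiom for every pair of f and n.21
\[ \bigcap\varphi\,\bigcap\psi\,\Bigl[\bigl[R_{n}\bigl(f, \lambda x[\varphi(x)]\bigr) \cap R_{n}\bigl(f, \lambda x[\psi(x)]\bigr)\bigr] \supset \bigcap x\,[\varphi(x) \equiv \psi(x)]\Bigr]. \tag{$8'$} \]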

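It is easy to see how Ramsey’s strategy works when (1) is replaced with
\[ F(x) =_{df} \bigcup\varphi^{1}\,\bigl[R_{0}\bigl(x, \lambda z[\varphi(z)]\bigr) \cap {\sim}\varphi(x)\bigr]; \tag{$1'$} \]
this results in (6) and (11) becoming
\[ \bigcap\varphi^{1}\,\bigl[R_{0}\bigl(\text{‘}F\text{’}, \lambda z[\varphi(z)]\bigr) \supset \varphi(\text{‘}F\text{’})\bigr] \tag{$6'$} \]
and
\[ R_{0}\bigl(\text{‘}F\text{’}, \lambda z[\psi(z)]\bigr) \cap {\sim}\psi(\text{‘}F\text{’}) \tag{$11'$} \]
respectively. But since the definiens in (1′) contains a bound variable of Type 1, ‘F’ has
Order 2. Thus, as explained above, (2) will only be true if it is replaced with
\[ R_{2}\bigl(\text{‘}F\text{’}, \lambda x[F(x)]\bigr). \tag{$2'$} \]
Clearly, we can derive nothing from (6′), (11′), and (2′), so the proof of (4) no longer goes
through.22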
The obvious response is to change (1′) to
\[ F(x) =_{df} \bigcup\varphi^{1}\,\bigl[R_{2}\bigl(x, \lambda z[\varphi(z)]\bigr) \cap {\sim}\varphi(x)\bigr], \tag{$1''$} \]

21 This is the obvious revision of (8), but it is not the only possible one. Perhaps, for instance, we
would want to allow the subscripts on the two Rs to be different. This would not defeat Ramsey’s
resolution, though—see note 22—and I can think of no plausible replacement for (8) that would
do so.
22 If, as I considered in note 21, (8′) allowed different Rᵢs, then the left-to-right direction would still
be provable. As it is, we cannot go in either direction.

which results in (6′) and (11′) becoming
\[ \bigcap\varphi^{1}\,\bigl[R_{2}\bigl(\text{‘}F\text{’}, \lambda z[\varphi(z)]\bigr) \supset \varphi(\text{‘}F\text{’})\bigr] \tag{$6''$} \]
and
\[ R_{2}\bigl(\text{‘}F\text{’}, \lambda z[\psi(z)]\bigr) \cap {\sim}\psi(\text{‘}F\text{’}) \tag{$11''$} \]
respectively. From these and (2′), we can derive (7) and (12), whence we can derive (4).
Ramsey considers this case,23 and argues that now ‘F’ must be of Order 3 because the
definiens in (1″) contains a “hidden variable” of Type 2 [pp. 45–46]. If this is right, then
the paradox is once again avoided, because (2′) becomes
\[ R_{3}\bigl(\text{‘}F\text{’}, \lambda x[F(x)]\bigr), \tag{$2''$} \]
whence we can derive neither (7) nor (12). But his argument that ‘F’ is now of Order 3 is
both intricate and dubious.
Ramsey begins by observing that for R₂, R₂(‘φ₂’, λx[φ₂(x)]). I have not omitted any
words there; the quote I am relying on is “this new R, for which ‘φ₂’R(φ₂x̂)” [p. 45]. Here,
the “new R” is R₀,₁,₂, which I am replacing with R₂ for simplicity’s sake. Recalling his
use of x̂, this remark is equivalent to the statement that began this paragraph.
There is one point to quickly address and set aside: in this part of Ramsey’s paper,
subscripts indicate order, so ‘‘φ₂’R(φ₂x̂)’, that is, ‘R₂(‘φ₂’, λx[φ₂(x)])’, seems to say
nothing more than that there is a symbol of Order 2 that has been introduced via a defini-
tion (and thus means something). Ramsey does not actually explain what this formula is
supposed to mean, but I do not think that these details are actually of much consequence,
so I will not worry about them further.
The main issue raised by this passage in Ramsey is that it is not clear what he
is actually saying about R₂. There are at least two plausible interpretations: (i) that
R₂(‘φ₂’, λx[φ₂(x)]) is not ill formed, and (ii) that R₂(‘φ₂’, λx[φ₂(x)]) is true. As I said
earlier, Ramsey has to say that when an Ri is given arguments of the wrong order, it is
false, not ill formed, so (ii) had better be what he intends on pain of triviality. Actually,
I think that he has (i) in mind, as will become clear later. This should be cause for concern,
though, because if he intends (i), then either he is not saying anything of substance—
orders are supposed to be irrelevant to well-formedness—or he is reintroducing something
dangerously like ramification.
In an attempt to settle this issue, let us turn to what he hopes to conclude from this
observation. Immediately after making it, he writes, “since ‘φ₂(x)’ is of some such form
as ⋃φ¹ f²(λz[φ(z)], x), . . . [(1″) involves] at least a variable function f²(λz[φ(z)], x)
of functions of individuals” [p. 45].24 This variable f is the hidden variable because it
“is involved in the notion of a variable ‘φ₂’, which is involved in the variable φ taken in
conjunction with R₂ [in (1″)]” [pp. 45–46].25
Before we go on, a quick comment should be made about Ramsey’s notation here.
Obviously, f²(λz[φ(z)], x) is not actually a functional symbol, and Ramsey should have
written f²(λz[φ(z)], x̂), that is, λx f²(λz[φ(z)], x), when talking about the “variable
function.” In what follows, I will often prepend the λx where necessary.

23 Actually, he considers changing the R to R₀,₁,₂, but as I said above, restricting ourselves to
R₂ here has no substantive consequences.
24 None of the superscripts here or in the later quotes appears in the original, of course.
25 The subscript on the R is not in the original.

With that notational point made, we can return to examining Ramsey’s reasoning. The
idea here seems to be that since φ can be instantiated by something “of some such form
as ⋃φ¹ f²(λz[φ(z)], x)” (or, rather, λx ⋃φ¹ f²(λz[φ(z)], x)), (1″) involves a hidden
variable that must be able to range over functions of the same type as f, that is, functions of
Type 2.26
It is not clear how this is supposed to play out in the formalism, because Ramsey
never explains exactly what a hidden variable is or what it means to involve one. But
to criticize Ramsey’s resolution of the paradoxes, as I wish to do, we need only understand
the conditions under which a formula involves a hidden variable, not why it does so.
To get a handle on this, it is helpful to return to the observation made above, that
variables range over symbols. The principle he is using, as best I can tell, is this: a hidden
variable of type n is involved in a formula whenever an explicit variable in that formula is
capable of ranging over symbols that contain symbols of type n.27 In (1″), φ could be the
symbol λx ⋃φ¹ f²(λz[φ(z)], x), so according to this principle, (1″) contains (or, again,
involves) a hidden variable of Type 2, making ‘F’ of Order 3.
One ought to be dubious of this principle, and thus of any interpretation of Ramsey that
claims that he relies on it. But consider his discussion of the other semantical paradoxes.
He does not spend much time on these paradoxes, but his discussion of what he calls the
Liar, which I will call the Liar (to distinguish it from the modern metalinguistic Liar)
seems to rely on the same principle. He formulates the liar sentence as
\[ \bigcup\text{‘}p\text{’}\,\bigcup p\,\bigl[\mathrm{Say}(\text{‘}p\text{’}) \cap R_{n}(\text{‘}p\text{’}, p) \cap {\sim}p\bigr]. \tag{14} \]
According to Ramsey, since ‘p’ is of order n, “‘p’ may be ⋃φⁿ⁻¹[ψⁿ(φ)]. Hence ⋃‘p’
involves ⋃ψⁿ, and ‘I am lying’ in the sense of ‘I am asserting a false proposition of order
n’ is at least of order n + 1 and does not contradict itself” [p. 48].28

26 It is not clear whether Ramsey is here saying that all functions of Order 2 have to be of this form.
They do not, of course; ‘⋃φ¹[φ(a)]’ is of Order 2 and does not contain a function of Type 2.
Ramsey was aware of formulas of this form, too; one appears at the top of p. 42. But whether he
is making a mistake here is irrelevant, as he does not need to say that the second argument to R
in (1″) must contain a function of Type 2. If he did, he would be in trouble, because as mentioned
above, the R he actually uses is not just R₂, but R₀,₁,₂, “the sum of [R₀, R₁, and R₂]” [p. 45].
See also note 28.
27 Or, if we want variables to range over symbols that are short for functional symbols, whenever an
explicit variable in that formula is capable of ranging over symbols that are short for expressions
containing symbols of type n. For simplicity’s sake, I will assume that variables range over
functional symbols directly, but this is clearly not importantly different from assuming that they
range over symbols that are defined with functional symbols.
28 Again, Ramsey is working with a ‘p’ of order n or less, but this is irrelevant. He uses φ for both
types, but I have changed one to ψ for clarity’s sake. Ramsey says that φ and ψ are of types
n and n + 1 respectively (and thus that ⋃‘p’ involves ⋃ψⁿ⁺¹), but this, I think, can only be a
simple mistake. Functions of order n contain bound variables of type n − 1, not type n. If we used
Ramsey’s n for φ, then the Rₙ would have to be Rₙ₊₁.
Against the concerns raised in note 26, Ramsey’s use of ‘may’ here is further evidence not only
that he need not say that a variable must be instantiated by the right sort of expression, but also
that he himself relies on nothing more than that it may be so.
One might be concerned about Ramsey’s gloss of ‘I am lying’ as ‘I am asserting a false
proposition of order n’, since propositions do not have orders. I will set this issue aside for now;
I revisit it in note 41.

The first thing that one notices about this account of the Liar paradox is ⋃‘p’ in (14).
(Though I have often omitted mention quotes throughout the paper, I especially do so
here to avoid confusion about whether they originate in Ramsey’s text.) This is the first
time that a quoted symbol has appeared immediately following a quantifier in Ramsey’s
paper, and he does not explain what it means. The natural reading is that the variable
in ⋃‘p’ is a variable that ranges over propositional symbols (or symbols introduced via
definitions whose definientia are propositional symbols; see note 27). This might strike
one as somewhat odd, since I argued above that bound variables of any type (other than i)
range over symbols. If that is right, then there should be no need to have an explicit
symbol following a quantifier. But it is actually not so clear that Ramsey is ignoring the
difference between propositions and propositional symbols here: he writes, “‘p’ may be
⋃φⁿ⁻¹[ψⁿ(φ)],” whence, it seems, we are supposed to think that p (which ‘p’ means) is
the proposition denoted by ⋃φⁿ⁻¹[ψⁿ(φ)]. If this is right, then the variable in ⋃p could
plausibly range over propositions themselves, rather than propositional symbols, and
the Rₙ in (14) would be different from the Rᵢ above: its second argument would be an
object (a proposition) rather than another symbol.29
For now, this discussion can be set aside, as Ramsey is clear that the order of (14), which
is all that is relevant to his resolution of the Liar paradox, is determined by the variable in
⋃‘p’, not the variable in ⋃p. (The discussion will return with force in Section 8.) The only
point I wish to make here is that Ramsey’s reasoning clearly relies on the principle that
I stated above: (14) involves a hidden variable function of type n because ‘p’30 may be
short for an expression containing a symbol of type n (because “⋃‘p’ involves ⋃ψⁿ”), and
it is this hidden variable that forces (14) to be of order n + 1.
To recap: I am suggesting that a formula P contains a hidden variable of type n just in
case there is an explicit variable in P that can be instantiated by a formula containing a
constant of type n. The order of P is then m + 1, where m is the highest type of variable,
explicit or hidden, occurring in P.
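The recapped principle can be written out as a small Python function. This is a sketch under assumptions: the function name `order_with_hidden` and its two parameters are hypothetical, and each admissible instance of an explicit variable is represented only by the set of types of the constants it contains.

```python
def order_with_hidden(explicit_types, instance_constant_types):
    """Order of a formula P under the recapped principle.

    explicit_types: types of the variables explicitly bound in P.
    instance_constant_types: one set per admissible instance of an explicit
        variable, holding the types of the constants that instance contains.
    """
    # Each constant type found in some admissible instance yields a hidden variable.
    hidden = set().union(*instance_constant_types)
    all_types = set(explicit_types) | hidden
    return max(all_types) + 1 if all_types else 0

# P binds one variable of type 1, and one admissible instance of that
# variable contains a constant of type 2.
print(order_with_hidden({1}, [{2}]))  # 3
# With no such instance, only the explicit variable counts.
print(order_with_hidden({1}, []))     # 2
```

The first call corresponds to the case in which the definiens binds a variable of Type 1 and that variable can be instantiated by a symbol containing a constant of Type 2, which gives Order 3.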
6.3. Problems for Ramsey’s resolution. If this principle is really what Ramsey is
relying on, then his resolution of the paradoxes faces serious problems. I hinted at one of
these when I pointed out that (i) he seems to be talking as though giving an Ri arguments of
the wrong order yields an ill-formed formula. This actually points to a much more serious
issue, which I have also hinted at before: (ii) Ramsey seems to actually be reintroducing
a restriction on the ranges of bound variables beyond types. This is already problematic,
because this is precisely what ramified type theories do, but it will turn out that
(iii) Ramsey’s restriction is not even as good a restriction as Russellian orders are.
Ramsey’s restriction, unlike that of ramification, has no basis in the formalism—it is
entirely ad hoc. Finally, I argue that (iv) if we are allowed to restrict the ranges of variables
in this way, we can reintroduce the paradox.
Of course, Ramsey does not see it this way, so let us start from the beginning and
return to (i). I have said repeatedly that he seems to be assuming that formulas like
R₀(‘φ’, λx ⋃x⁰[f(x)]) are not just false but ill formed. He does not actually have to
assume this, but if he doesn’t, then the reasoning behind his resolution becomes extremely

29 Indeed, this might not be too surprising. Recall that though Ramsey is almost always insistent
that there are no functions but functional symbols, he does think that there are propositions
independent of propositional symbols. Perhaps that distinction is (probably unconsciously) in
play here. This becomes more significant in Section 8.2.
30 As I said above, I am adding no mention quotes here; this is the first variable occurring in (14).

weak. As explained above, he wants to infer the existence of a hidden variable in (1″) from
his observation that for R₂, R₂(‘φ₂’, λx[φ₂(x)]). I offered two possible interpretations
of “for R₂, R₂(‘φ₂’, λx[φ₂(x)])”: that it is about the truth of R₂(‘φ₂’, λx[φ₂(x)]), and
that it is about its well-formedness. Now we can see why I thought that the latter was the
better interpretation. If Ramsey means the former, then we have to be able to conclude
that φ is capable of ranging over symbols containing symbols of Type 2 just because
R₂(x, λx[φ(x)]) is only possibly true if it is such a symbol. Much more reasonable,
it seems, is to conclude this from the observation that R₂(x, λx[φ(x)]) would actually be ill
formed if φ were not of such a form. Of course, to say this is to embrace ramification,
so Ramsey seems forced to make the weaker claim about truth.
This poses a problem for motivating his resolution, but no matter which interpretation is
correct, point (ii) from above is looming. A ramified theory of types restricts the ranges of
bound variables by their order as well as their type. This, I argued, is crucially absent from
Ramsey’s logic—his variables range over all functions (or, rather, functional symbols) of
a given type, irrespective of order. But this cannot be quite right. Consider once again the
two possible interpretations discussed above. Clearly if we go with the second, Ramsey is
forced to restrict the ranges of variables by order as well as by type; this is why he seems
to be forced to go with the weaker first interpretation. But it turns out that he actually has
to restrict the ranges further, regardless of which interpretation is correct. This is because
functional symbols of Order 2 can contain functional symbols of arbitrarily high type. For
instance, λx ⋃φ¹[f⁴(g³) ∩ h²(λz[φ(z)], x)] is of Order 2, but it contains a functional
symbol of Type 4. If the φ in (1″) were allowed to range over this expression, then there
would be a hidden variable of Type 4, and ‘F’ would be of Order 5. And since one can
make the type of f in this expression arbitrarily high, ‘F’ seems to have no determinate
order at all—it contains hidden variables of every type.
In order to avoid this consequence—in order to retain the ability to assign orders to
symbols after hidden variables have been introduced—Ramsey needs to restrict the ranges
of bound variables by more than just their type. But the foundation of his restriction has
to be more fine grained than even his orders, because he needs to be able to allow
formulas like λx ⋃φ¹ f²(λz[φ(z)], x) while disallowing formulas like
λx ⋃φ¹[f⁴(g³) ∩ h²(λz[φ(z)], x)].
Such a restriction is problematic for at least two reasons. First, it is precisely the sort of
restriction that Ramsey was attempting to avoid. But second and more pressingly, (iii): it
has absolutely no basis in the formalism. That is, there is no way to determine the range
of a bound variable simply by looking at a formula; one must first decide what order one
wants the formula to be, and then decide how to restrict the range of the variable. We can
no longer determine what the order of ‘F’ is just by looking at (1″). First, we have to
decide that we want it to be of Order 2. Then we know that φ must not range over formulas
like λx ⋃φ¹[f⁴(g³) ∩ h²(λz[φ(z)], x)]. In this sense, his resolution relies on an entirely
ad hoc restriction on the ranges of bound variables.31

31 Of course, one could add more structure to the logic in order to avoid this worry. One could
probably even argue for such additions through an appeal to Ramsey’s understanding of meaning.
But the other three worries would still stand, and spelling out his account of meaning would be a
lengthy digression, so I will not pursue this response.

This leads to one final point, (iv), which is that we can also choose the range of φ so that
‘F’ is only of Order 2, whence we can once again derive (4). To do this, we simply restrict
φ so that the only second-order functional symbols it can range over are symbols like
λx ⋃φ¹[φ(x)]. Prohibiting this restriction would, of course, require still more arbitrary
restrictions, this time on what restrictions are permissible. Thus, after all this work, it
seems as though his resolution of the paradoxes not only requires an unmotivated and un-
wanted restriction on the ranges of bound variables, but does not even successfully resolve
the paradoxes without even more ad hoc restrictions on how these very restrictions can
look.
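Points (iii) and (iv) can be made concrete with a short Python sketch. Everything named here is an assumption introduced for illustration: `order_of_F` is a hypothetical function, and the parameter `cap` is a stand-in for whatever restriction is placed on the range of φ. It simply admits those candidate instances whose constants all have type at most `cap`.

```python
def order_of_F(explicit_type, candidate_constant_types, cap):
    """Order of 'F' when φ ranges only over instances whose constants have type <= cap.

    candidate_constant_types: for each candidate instance of φ, the set of
        types of the constants it contains.
    cap: hypothetical knob standing in for the chosen restriction on φ's range.
    """
    admitted = [ts for ts in candidate_constant_types if all(t <= cap for t in ts)]
    hidden = set().union(*admitted)          # types of the hidden variables
    return max({explicit_type} | hidden) + 1

# Three candidate instances of φ: one with no constants, one with a constant
# of type 2, and one with constants of types 2, 3, and 4.
candidates = [set(), {2}, {2, 3, 4}]

for cap in (1, 2, 4):
    print(cap, order_of_F(1, candidates, cap))
# 1 2
# 2 3
# 4 5
```

The order of ‘F’ is fixed only after `cap` has been chosen, and nothing in the formula itself determines that choice. With the narrowest choice, ‘F’ comes out at Order 2, which is the case in which (4) becomes derivable again.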
PART II

RAMSEY’S DIVISION OF THE PARADOXES


At this point, Ramsey’s solution seems to be in trouble. He has appealed to a restriction
on the ranges of bound variables akin to ramification, he has no basis for that restriction,
and anyway we seem to be able to construct paradoxes by employing the restriction. But
I now want to set all of that aside and consider his division of the paradoxes, which is the
contribution of ‘The Foundations of Mathematics’ that has had the most lasting impact.
The division is now put as a division between the logical and the semantical paradoxes,32
and indeed Ramsey seems to encourage this reading with his own analysis of the paradoxes:
he claims that the contradictions of his Group B “all contain some reference to thought,
language, or symbolism” [p. 20] and thinks that the correct formalizations of the Group
B paradoxes all involve semantical notions. I think, however, that there are other possible
formalizations, which more closely capture the informal statements of the paradoxes and
involve no semantical predicates. If this is right, then it is unfortunate that nobody writing
about Ramsey’s division of the paradoxes looked more closely at the paradoxes he was
actually concerned with. In what follows, after introducing some additional connectives, I
give an example of the intensional paradoxes I have in mind (Section 7.1); argue that they
do not involve any concepts that cannot reasonably be attributed to Ramsey (Section 7.3);
and attempt to show that, despite Ramsey’s own formalization of the paradoxes he lists,
there are more natural formalizations of the informal statements of his Group B paradoxes
that behave like the intensional, and not the semantical, paradoxes (Section 8).

§7. The intensional paradoxes. So far, the intensionality of the logic I have been
using has played no role: neither the fact that well-formed formulas are of type p nor the
use of nonstandard logical symbols has been relevant. But paralleling the intensional part
of the logic, I now introduce a type t, intuitively of truth-values; the primitive constants
¬t,t and →t,t,t ; and infinitely many constants ∀τ,t,t , one for each type τ . The
constants ∨, ∧, ↔, and ∃ are defined in the usual way. I also introduce infinitely many
constants =τ,τ,t and ≈τ,τ, p , which are the obvious identity relations for each type
τ ,33 and the constant ∨ p,t , which takes propositions to their truth-values. This suggests

32 Again, see Fraenkel & Bar-Hillel (1958, pp. 5–14), Beth (1959, §171), Kneale & Kneale (1962,
pp. 664–665), Quine (1963, pp. 254–255), Feferman (1984, p. 75), and Priest (1994, pp. 25–26).
33 If there is a reason to think that Ramsey’s logic absolutely prohibits the inclusion of identity, then
we can add Church’s strict equivalence from Church (1974). The following paradox requires a
little more work in that case, but it can still be constructed. However, it seems to me that the
reasons Church presents in Church (1974) in favor of strict equivalence work in favor of identity

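the following translation principles. (Superscripted τs on the metavariables indicate that
there is one such principle for each type τ; the types of the other symbols are fixed by
context.)

\[ [P^{\tau} \approx Q^{\tau}] \leftrightarrow [P^{\tau} = Q^{\tau}], \]
\[ {\sim}P \leftrightarrow \neg\,{}^{\vee}P, \]
\[ [P \supset Q] \leftrightarrow [{}^{\vee}P \rightarrow {}^{\vee}Q], \]
\[ \bigcap x^{\tau} P \leftrightarrow \forall x^{\tau}\,{}^{\vee}P. \]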
The natural principles for the defined constants fall out of these. With this additional
machinery in place, we can construct an intensional paradox. Unlike, say, Grelling’s
paradox, what follows is not an antinomy: I only show that a contradiction follows from
premises that are not obviously contradictory (and, indeed, seem to describe a possible
state of the world), and not that the logic just described is inconsistent.
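To illustrate how the principles for the defined constants fall out, here is a sketch for conjunction. It assumes, purely for illustration, that P ∩ Q abbreviates ∼[P ; ∼Q]; if the definition adopted for ∩ is a different one, the steps change accordingly. Each line applies one translation principle, and the last step is ordinary classical reasoning at type t.

\[
\begin{aligned}
{\vee}[P \cap Q] &\leftrightarrow {\vee}{\sim}[P \mathbin{;} {\sim}Q] && \text{(assumed definition of } \cap\text{)}\\
&\leftrightarrow \neg{\vee}[P \mathbin{;} {\sim}Q] && \text{(principle for } {\sim}\text{)}\\
&\leftrightarrow \neg[{\vee}P \to {\vee}{\sim}Q] && \text{(principle for } ;\text{)}\\
&\leftrightarrow \neg[{\vee}P \to \neg{\vee}Q] && \text{(principle for } {\sim}\text{)}\\
&\leftrightarrow [{\vee}P \wedge {\vee}Q] && \text{(classical logic)}
\end{aligned}
\]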
7.1. An intensional paradox. Let $a^{i}$ be Aristotle and $A^{i,p,p}(x, y)$ mean that x
asserts y.34 (15) then denotes the proposition that everything Aristotle asserts is false.
\[ \bigcap x^{p}\,[A(a, x) \mathbin{;} {\sim}x]. \tag{15} \]
Thus (16) denotes the proposition that Aristotle asserts that everything Aristotle asserts is
false.
\[ A\bigl(a, \bigcap x^{p}\,[A(a, x) \mathbin{;} {\sim}x]\bigr). \tag{16} \]
Once we have this, though, paradox threatens; if the propositions denoted by both (16)
and (17) are true, then we can derive a contradiction.
\[ \bigcap x^{p}\,\Bigl[A(a, x) \mathbin{;} \bigl[x \approx \bigcap y^{p}\,[A(a, y) \mathbin{;} {\sim}y] \cup {\sim}x\bigr]\Bigr]. \tag{17} \]

(17) denotes the proposition that the only things that Aristotle has asserted (if he has
asserted anything at all) are either the proposition denoted by (15) or false propositions.
This seems to at least be possibly true, and it does not seem to contradict the denotation
of (16), so hypothesizing that the two of them are true should not be problematic. But, of
course, it is. Formally, we are supposing

 
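\[ {\vee}A\bigl(a, \bigcap x^{p}\,[A(a, x) \mathbin{;} {\sim}x]\bigr) \tag{18} \]
and
\[ \forall x^{p}\,\Bigl[{\vee}A(a, x) \to \bigl[x = \bigcap y^{p}\,[A(a, y) \mathbin{;} {\sim}y] \vee \neg{\vee}x\bigr]\Bigr]; \tag{19} \]
these are ∨(16) and ∨(17) respectively.
From (18) we have
\[ \neg\forall x^{p}\,[{\vee}A(a, x) \to \neg{\vee}x], \tag{20} \]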
which is simply ¬∨(15), almost immediately. The derivation is elementary; one assumes
∨(15) and, after applying the translation principles, instantiates the variable with (15) itself.

here, and the concerns raised there about using identity instead of strict equivalence do not apply
to the present logical system.
34 The example of Aristotle and assertion comes from Church (1974), although Church does not
develop any paradoxes.

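From (20) we know that there is a constant $b^{p}$ such that both
\[ {\vee}A(a, b) \tag{21} \]
and
\[ {\vee}b. \tag{22} \]
From (19), (21), and (22) we have
\[ b = \bigcap x^{p}\,[A(a, x) \mathbin{;} {\sim}x], \tag{23} \]
whence by (22)
\[ {\vee}\bigcap x^{p}\,[A(a, x) \mathbin{;} {\sim}x], \tag{24} \]
which is just ∨(15).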

(24) and (20) are contradictories, so the supposition of (18) and (19) has gone wrong
somewhere. It is hard to see where, though. It is certainly possible for Aristotle to say
(the Greek equivalent of) “Everything Aristotle asserts is false” and nothing else, which
would, at least prima facie, satisfy both assumptions; such a situation is, at least prima
facie, one in which Aristotle has said (and said only) that everything Aristotle says
is false.35
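The derivation of the Aristotle paradox can be checked mechanically. The following Lean 4 sketch is only an illustration under stated assumptions, not the type theory used in this paper: propositions are modelled as an arbitrary type `P`, truth-values as Lean's `Prop`, `tr` stands in for the constant ∨, `asserts x` for A(a, x), and `q` for the proposition denoted by (15). The hypothesis `hq` records what the translation principles yield for (15) instead of deriving it, and all of these names are hypothetical. The steps follow the text, except that ∨(15) is proved directly from (19) rather than by first extracting the witness b from (20).

```lean
-- Sketch: (18) and (19) are jointly inconsistent.
theorem aristotle_paradox {P : Type} (tr : P → Prop) (asserts : P → P) (q : P)
    -- translation principles applied to (15): ∨(15) ↔ ∀x[∨A(a, x) → ¬∨x]
    (hq : tr q ↔ ∀ x : P, tr (asserts x) → ¬ tr x)
    -- (18): Aristotle asserts the proposition denoted by (15)
    (h18 : tr (asserts q))
    -- (19): whatever Aristotle asserts is that proposition or is false
    (h19 : ∀ x : P, tr (asserts x) → x = q ∨ ¬ tr x) : False := by
  -- (20): assume ∨(15) and instantiate the variable with (15) itself
  have h20 : ¬ tr q := fun h => hq.mp h q h18 h
  -- it now suffices to establish ∨(15), which is (24)
  apply h20
  apply hq.mpr
  -- take an arbitrary b satisfying (21) and (22)
  intro b h21 h22
  -- by (19), either b is (15), as in (23), or b is false
  exact (h19 b h21).elim (fun h23 => h20 (h23 ▸ h22)) (fun hfalse => hfalse h22)
```

Nothing in the statement mentions sentences, satisfaction, or any other metalinguistic notion; the only truth-like ingredient is `tr`, which applies to propositions.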
7.2. The difference between the intensional and the semantical paradoxes. It is
tempting to think that the above paradox—call it the Aristotle paradox—will be resolved by
whatever one adopts to resolve the Liar (and the Grelling, and maybe the Strengthened Liar,
etc.). It certainly feels similar to the semantical paradoxes. But there is a crucial difference.
Consider, for example, Tarski’s hierarchy of languages. This resolves the semantical
paradoxes by prohibiting any language from talking about the semantics of that very
language. In particular, no sentence of a given language $L_n$ can contain a satisfaction
predicate for $L_n$, so one cannot construct a sentence that says of itself that it is
false.
In the Aristotle paradox, though, no sentence says of itself that it is false. Indeed, there is
no appearance of a metalinguistic notion of truth at all. The only notion of truth that appears
in the Aristotle paradox is one that applies to propositions. The problematic assumption is
not that Aristotle asserts, “Every sentence Aristotle asserts is false”; the problem is that
Aristotle asserts the proposition that every proposition Aristotle asserts is false. Thus, any
restriction on metalinguistic satisfaction predicates that we wish to include in our logic
(such as a Tarskian prohibition on languages containing their own satisfaction predicates)
can be taken on board without qualification—it will do nothing to the above derivation of
a contradiction, because satisfaction of formulas is completely irrelevant there.
The preceding paragraph is a little imprecise. We can informally state the situation that
leads to the Aristotle paradox as one in which (i) Aristotle asserts the proposition that every
proposition Aristotle asserts is false and (ii) every proposition Aristotle asserts is either that
proposition or false. The ‘false’s in this informal characterization apply to propositions,
and in that sense, as I wrote above, “[t]he only notion of truth that appears in the Aristotle

35 I do not mean to say that, at the end of the day, this actually is a situation in which (18) and (19)
are both true; one way to resolve the paradoxes is to insist that it is not actually such a situation.
But if they were not even prima facie true, there would be nothing for a logician to do here; there is
work to be done precisely because (i) this (clearly possible) situation seems to be one in which
both (18) and (19) are true and (ii) (18) and (19) seem to imply a contradiction. If things did not
even seem this way, then there would be no paradox to resolve in the first place.

paradox is one that applies to propositions.” But I did not use any truth or falsity predicates
when formalizing (i) and (ii), and generally do not do so to translate English sentences
containing ‘true’ or ‘false’ (in their propositional senses, anyway). I would, for example,
represent the proposition that Aristotle asserts a true proposition with ‘$\bigcup p\,[A(a, p) \cap p]$’,
in which nothing like a truth predicate appears; if we were saying that the proposition is
false, I would simply insert a ∼ in front of the second conjunct.36
While good to clarify—I really was speaking imprecisely—these details are irrelevant
to the main point, which is that no metalinguistic predicates of any sort appear in either the
informal statements (i) and (ii) or the formalizations thereof. As long as that is true, Tarski’s
hierarchy of semantical predicates—of satisfaction predicates—will be of no immediate
assistance in resolving the Aristotle paradox.37
Similarly, a truth-value gap approach to the semantical paradoxes will not help here
without adaptation. That approach allows sentences to lack truth-values, but again, truth-
values of sentences are irrelevant to the Aristotle paradox; truth-values of propositions are
what matter.
I do not mean to say that these are the only two approaches to the semantical paradoxes,
or that they cannot be adapted to resolve the intensional paradoxes as well.38 But such
issues are far beyond the scope of this paper, as I am not concerned here with resolutions of
the intensional paradoxes. My only concern is to show that as far as Ramsey is concerned,
the intensional paradoxes are distinct from the semantical paradoxes, and it is enough for
this purpose that resolutions of the latter do not always resolve the former.
7.3. Propositions. “As far as Ramsey is concerned” in the preceding sentence is not
innocuous. This distinction between the intensional and the semantical paradoxes might
not be all that interesting if one takes propositions to be, or at least be very much like,
sentences. Thus, one might worry at this point that I am imposing too specific an account
of propositions on Ramsey. Russell, for instance, is notorious for not being clear about the
distinction between propositions and the sentences that express them. While I think that
the issues raised by the intensional paradoxes are interesting in their own right, they would

36 This way of formalizing the relevant propositions (without the ∨ constant) has been taken by
others working on these paradoxes; Prior (1961) is a notable example. The general approach of
translating English sentences containing ‘true’ and ‘false’ without truth or falsity predicates is
discussed at length in Grover et al. (1974).
37 One might think that ∨ is the relevant truth predicate, contra my claim that no metalinguistic
truth predicate appears in my formalizations of (i) and (ii). If it were, then the Aristotle paradox
wouldn’t be very surprising—it would be arising in a logic that Tarski already proved couldn’t
have models. But the ∨s in (18) and (19) only serve to say that the propositions expressed
by (16) and (17)—which are formalizations of (i) and (ii) respectively and which contain no
truth predicates of any sort—are true. That is, if one wants to think of ∨ as a truth predicate, one
must think of it as a propositional, not metalinguistic, truth predicate, and this is precisely the
distinction that I am trying to draw between the intensional and semantical paradoxes: the former
are concerned with propositional truth; the latter, metalinguistic.
38 Indeed, the ramified theory of types is not entirely unlike Tarski’s hierarchy—especially in light
of Church (1976)—and it has been employed to resolve these paradoxes in, for example, Church
(1993, p. 152). And there is no reason to think that an approach that posits some sort of proposition
gaps would be hopeless, although none has been developed in perfect detail. Parsons argues for
such an approach in Parsons (1974), and Bealer (1994, p. 162) at least thinks that positing gaps is
the most promising extant route. But what—and, more importantly, where—such gaps are exactly
is never very clear, and such an approach has never been worked out in print in any detail (as far
as I know).

have little place in this paper if it turned out that Ramsey was equally unclear, or if he
clearly took propositions and sentences to be importantly alike.
Luckily, Ramsey is explicit both about the distinction between propositions and formulas
and about the nature of propositions themselves. As I said in Section 3, his terminology is
somewhat confusing at times, but he seems to be clear on the distinction in practice, even if
his notation does not observe it very scrupulously. In fact, I think that his understanding of
propositions is very close to the popular modern account of propositions as sets of possible
worlds.39 One of the quotes in Section 3 already suggests this. As I wrote there,
“two propositional symbols are to be regarded as instances of the same
proposition . . . when they express agreement and disagreement with the
same sets of truth-possibilities of atomic propositions” [p. 9]. That is,
they are instances of the same proposition when their truth tables agree.
Here, I have glossed expressing agreement and disagreement with sets of truth-
possibilities of atomic propositions as having truth tables that agree. But it is easy to
think of each row of a truth table picking out a set of possible worlds, namely, the worlds
at which the atomic propositions in the truth table have the truth-values they are assigned
on that row. One can then think of two propositional symbols as instantiating the same
proposition, to use Ramsey’s terminology, when they are true at the same possible worlds.
This, at least, should behave the same as his notion of agreeing and disagreeing with the
same truth-possibilities.
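The truth-table gloss can be made concrete with a small Python sketch. It is an illustration only: the atoms `p` and `q`, the helper names `truth_possibilities` and `proposition`, and the sample formulas are all hypothetical choices. Each row of the truth table plays the role of a set of possible worlds, and a propositional symbol is identified with the set of rows with which it agrees.

```python
from itertools import product


def truth_possibilities(atoms):
    """Return every assignment of truth-values to the atoms (the truth-table rows)."""
    return [dict(zip(atoms, values))
            for values in product([True, False], repeat=len(atoms))]


def proposition(formula, atoms):
    """Return the set of truth-possibilities with which `formula` agrees."""
    return frozenset(frozenset(row.items())
                     for row in truth_possibilities(atoms)
                     if formula(row))


ATOMS = ("p", "q")


def not_p_or_q(row):
    return (not row["p"]) or row["q"]


def p_implies_q(row):
    return not (row["p"] and not row["q"])


def p_and_q(row):
    return row["p"] and row["q"]


# Two different symbols, one proposition: their truth tables agree.
same = proposition(not_p_or_q, ATOMS) == proposition(p_implies_q, ATOMS)
# A symbol that agrees with a different set of rows is a different proposition.
different = proposition(not_p_or_q, ATOMS) == proposition(p_and_q, ATOMS)
```

On this modelling, identity of propositions is just equality of sets of rows, which is why the sketch compares frozensets and not the formulas themselves.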

§8. Intensional paradoxes in Group B. I have argued that, given Ramsey’s account
of propositions, the intensional paradoxes are distinct from the semantical paradoxes. But
this is not enough; I also have to argue that Ramsey was worried (at least de re, if not de
dicto) about the intensional paradoxes. The argument here is a bit more intricate than one
might expect, because all of Ramsey’s formalizations of the Group B paradoxes actually
involve semantical notions: Tarski’s hierarchy, for example, will resolve them all. But this
does not mean that the formalizations he provides are the only ones available, or the only
ones that he would have endorsed. Indeed, given the informal statements of the paradoxes
that both he and Russell—from whom Ramsey inherited all but one of the paradoxes he
lists—provide, I think that it is plain that one can construct more faithful formalizations in
some cases. In particular, I think that he has shoehorned meaning into his formalization of
the Liar, and that a more natural formalization of even the sentence that he starts with,
and definitely the sentence that Russell starts with, involves no semantical predicates.
If (i) this is right, and if (ii) we are interested in resolving the (informal statements of the)
paradoxes that Ramsey began with, then we need to look beyond the standard resolutions
of the semantical paradoxes even to resolve all of the paradoxes that Ramsey himself was
worrying about (though we do not need to do so to deal with all of his formalizations
of those paradoxes). Given the attention people have paid to Ramsey’s account of the
paradoxes, (ii) seems to be obviously true. I hope to show that (i) is as well.

39 Or the precursor to that account, Carnap’s notion of state descriptions. In fact, Carnap (1956, p. 9)
writes, “Some ideas of Wittgenstein were the starting-point for the development of [state-
descriptions],” citing the Tractatus in a footnote. Meanwhile, §I of ‘The Foundations of
Mathematics’, which contains most of Ramsey’s discussion of propositions, makes frequent
reference to Wittgenstein and his Tractatus. It is thus not surprising that Ramsey’s account of
propositions is similar to Carnap’s state descriptions.

8.1. The division. I quote Ramsey.


The best known [contradictions] are divided as follows:—
A. (1) The class of all classes which are not members of themselves.
(2) The relation between two relations when one does not have
itself to the other.
(3) Burali Forti’s [sic] contradiction of the greatest ordinal.
B. (4) ‘I am lying.’
(5) The least integer not nameable in fewer than nineteen
syllables.
(6) The least indefinable ordinal.
(7) Richard’s Contradiction.
(8) Weyl’s contradiction about ‘heterologisch’.40

The principle according to which I have divided them is of fundamental
importance. . . . [T]he contradictions of Group B . . . all contain some
reference to thought, language, or symbolism. . . . So they may be due
not to faulty logic or mathematics, but to faulty ideas concerning thought
and language. [p. 20]
Beginning with Tarski, we have seen that language and symbolism are very much the
concern of logic and mathematics, but that part of Ramsey’s principle is not relevant.
I contend that the first claim, that “the contradictions of Group B . . . all contain some
reference to thought, language, or symbolism,” is misleading. Well, I don’t actually know
what it means to contain a reference to thought. But as long as they do not essentially con-
tain a reference to language or symbolism, they will not be resolved by just any resolution
of the semantical paradoxes (certainly not by any of the standard ones), and that is enough
for my purposes.
Before arguing for this, I should make a note about my terminology. Though
Ramsey follows Russell in calling these eight paradoxes ‘contradictions’, I refer to them
as Paradoxes 1–8 in what follows. (I refer to the seven paradoxes presented in Whitehead
& Russell (1910) as Contradictions 1–7 in Section 8.3.2 below.) My concern will be with
Paradox 4.
8.2. Ramsey’s formalization of Paradox 4. When addressing Paradox 4, Ramsey
claims that we should analyze ‘I am lying’ as
\[ (\exists\, \text{‘}p\text{’},\, p) : \text{I am saying ‘}p\text{’} \;.\; \text{‘}p\text{’ means } p \;.\; {\sim}p. \qquad \text{[p. 48]} \tag{25} \]
I have already presented a translation of (25) into the logic I have constructed, which
I reproduce here.
\[ \bigcup \text{‘}p\text{’}\, \bigcup p\, [\mathrm{Say}(\text{‘}p\text{’}) \cap R_n(\text{‘}p\text{’}, p) \cap {\sim}p]. \tag{14} \]
I have not introduced Say, but it is intended to be that function that takes a sentence and
returns the proposition that I am saying that sentence.
As I said when discussing (14) in Section 6.2, and especially in note 29, I think that
the Rn here is one that relates symbols to propositions, rather than symbols to symbols. To
distinguish this meaning relation from the one employed in Ramsey’s analysis of Grelling’s

40 This is, of course, Grelling’s paradox, incorrectly attributed to Weyl.



paradox (Paradox 8), I will represent it with Rn, and retranslate (25) as
\[ \bigcup \text{‘}p\text{’}\, \bigcup p\, [\mathrm{Say}(\text{‘}p\text{’}) \cap R_n(\text{‘}p\text{’}, p) \cap {\sim}p]. \tag{14$'$} \]
The idea seems to be that one cannot say the sentence (14′) and nothing else, though
there is nothing intuitively contradictory about such a supposition. If we were to take this
formalization seriously, we would probably want to rephrase it using more modern tech-
niques for representing self-reference, which had not been developed when Ramsey was
writing. As with the Aristotle paradox, one could then spell the supposition out carefully
with ∨.
I am not going to take this formalization seriously, though, because it involves a metalinguistic
relation, Rn, and a metalinguistic speech predicate, Say. Of course, with such
relations present, resolutions of the semantical paradoxes can be easily adapted to prohibit
any contradiction from forming. Consider, for example, Tarski’s argument that no language
can contain its own satisfaction predicate. One can easily adapt that argument to show
that no language with a constant like ∨ can contain an expression relation that holds
between formulas and the propositions they express. That is, one can show that, on pain of
contradiction, no language can include both ∨ and Rn. If we were to follow Tarski all the
way, this would lead us to develop a hierarchy of Rn relations, in much the same way that
Tarski proposes a hierarchy of truth predicates.
This, however, is a very unsatisfactory resolution of Paradox 4, especially in light of
the way Ramsey himself glosses ‘I am lying’ immediately after presenting this resolution.
As I quoted in Section 6.2, he thinks that he has shown that “ ‘I am lying’ in the sense
of ‘I am asserting a false proposition of order n’ . . . does not contradict itself” [p. 48].41
But surely ‘I am asserting a false proposition of order n’ does not involve any semantical
notions—it just involves the notion of assertion as applied to a proposition (a proposition
which happens to be about itself). Thus, although Ramsey’s own formalization of Para-
dox 4 will be resolved by, say, an adaptation of Tarski’s argument, it does not seem like
the best formalization of Paradox 4 in the first place. If we are going to take Paradox 4
seriously, then, we ought to try to construct a more faithful formalization of it, and see if
it, too, will turn out to be no different than the semantical paradoxes.
8.3. Better formalizations of Paradox 4 and a related paradox. There are two places
we could look for the informal statements that we are attempting to formalize. Ramsey’s
statement of the paradoxes is, of course, one place. But since he took himself to be sim-
ply repeating the list in Whitehead & Russell (1910) (with the exception of Grelling’s
paradox), one might also look there. If it turns out that there is an important difference be-
tween a paradox listed in ‘The Foundations of Mathematics’ and its precursor in Principia
Mathematica, then it seems reasonable to think that even Ramsey would be interested in
resolving both the paradox he lists and the distinct version from Russell. I think that such
a difference does turn up with respect to Paradox 4 and what I will call Contradiction 1,
the first of seven contradictions listed at Whitehead & Russell (1910, pp. 60–61); I discuss
possible formalizations of both paradoxes (or, more precisely, of the informal statements
of both paradoxes).

41 As I said in note 28, this is not actually a good gloss of his formalization, because propositions
do not have orders: it is not the order of the proposition, but the order of the propositional symbol
in (25), and in (14′), that leads to the hidden variable. A better gloss would be ‘I am saying a
sentence of order n that denotes a false proposition’; in addition to attributing the order to the
right sort of thing, this captures the essentially semantical nature of (25).

8.3.1. Ramsey’s Paradox 4. As I quoted above, Ramsey reads ‘I am lying’ as ‘I am
asserting a false proposition (of a certain order)’. I observed in notes 28 and 41 that this is
not a satisfactory gloss of Ramsey’s formalization of ‘I am lying’, since it both attributes
an order to a proposition and does not involve any semantical relations like Rn. But it is
at least not unreasonable to think that Ramsey would be interested in the actual problem
posed by ‘I am asserting a false proposition’. After all, this is very similar to the informal
formulation of the analogous paradox in Whitehead & Russell (1910, p. 62): ‘There is a
proposition which I am affirming and which is false’. Thus, in the interest of doing what
Ramsey failed to do, one might attempt to formalize ‘I am asserting a false proposition’.
One way to read this is as involving a self-referential proposition of some sort. Self-
reference is easy and well documented in the case of sentences; though Ramsey did not
know how to construct self-referential sentences, we now do. But almost nothing analogous
has been done for propositions.42 Thus, it would be best to find a non–self-referential
reading that still goes beyond the power of, say, Tarski’s hierarchy.
This is easy to come by: one might also read ‘I am asserting a false proposition’ as not
involving self-reference at all, and take it completely at face value. ‘A dog is walking’ just
means that there is a thing that is a dog and is walking. Similarly, one might think that
‘I am asserting a false proposition’ just means that there is a proposition which I am
asserting and which is false.43 Formally, where I am $c^{i}$,
\[ \exists x^{p}\,[{\vee}A(c, x) \wedge \neg{\vee}x]. \tag{26} \]
Notice that the connectives are all extensional. (26) is not intended to represent the
proposition that I am asserting; it simply states a fact about me and what I am asserting.
But (26) is not contradictory at all. The contradiction arises when I am asserting (and
asserting only) that I am asserting something false. That is, the contradiction arises on an
assumption of

   
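\[ {\vee}A\bigl(c, \bigcup x^{p}\,[A(c, x) \cap {\sim}x]\bigr) \wedge \forall x^{p}\,\bigl[{\vee}A(c, x) \leftrightarrow x = \bigcup x^{p}\,[A(c, x) \cap {\sim}x]\bigr]. \tag{27} \]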
This, of course, is analogous to the assumption of (18) and (19) in the Aristotle paradox,
and the derivation of a contradiction would proceed in a similar way.
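For comparison, here is how the contradiction from (27) might be checked in Lean 4. As before, this is a sketch under assumptions and not the logic of this paper: `P` models the type of propositions, `tr` stands in for ∨, `asserts x` for A(c, x), and `q` for the proposition that I am asserting something false. The hypothesis `hq` states what the translation principles would give for that proposition, and the two conjuncts of (27) are supplied as separate hypotheses; all of these names are hypothetical.

```lean
-- Sketch: the assumption (27) is inconsistent.
theorem false_assertion_paradox {P : Type} (tr : P → Prop) (asserts : P → P) (q : P)
    -- translation principles applied to q: ∨q ↔ ∃x[∨A(c, x) ∧ ¬∨x]
    (hq : tr q ↔ ∃ x : P, tr (asserts x) ∧ ¬ tr x)
    -- first conjunct of (27): I assert q
    (h27a : tr (asserts q))
    -- second conjunct of (27): q is the only proposition I assert
    (h27b : ∀ x : P, tr (asserts x) ↔ x = q) : False := by
  -- q cannot be true: its witness would have to be q itself, and so false
  have hnq : ¬ tr q := by
    intro h
    have ⟨x, hx, hnx⟩ := hq.mp h
    have hxq : x = q := (h27b x).mp hx
    exact hnx (hxq.symm ▸ h)
  -- but then q is an asserted false proposition, which makes q true
  exact hnq (hq.mpr ⟨q, h27a, hnq⟩)
```

The first conjunct is redundant given the second, but it is kept so that the hypotheses mirror (27) as written.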
I think that this is a more faithful formalization of ‘I am asserting a false proposition’
than anything Ramsey provides. It clearly involves no metalinguistic predicates.44 Thus,
if one thinks that the paradoxes that Ramsey was concerned with are real problems to be
solved, then it is misleading to think that all of the paradoxes in Group B can be resolved
by the standard resolutions of the semantical paradoxes.
However, one might be a little worried about my formalization for two reasons: (i) there
still seems to be something self-referential in ‘I am asserting a false proposition’ that I have
not captured; and (ii) there is an indexical, ‘I’, in ‘I am asserting a false proposition’, and
there is certainly nothing in (26) or (27) that behaves at all indexically. (27) is clearly a
problem for anybody interested in intensional logic, but again, that is not enough for my

42 Barwise & Etchemendy (1987) addresses self-referential (they prefer ‘circular’) propositions, but
it is highly unusual in this. While I think that the suggestions made in that book deserve closer
attention, this is not the place for it.
43 This is just the way it is put in Whitehead & Russell (1910), as quoted above.
44 It does involve an assertion relation that relates an individual to a proposition directly, and there is
no evidence that Ramsey considered such relations, but this does not strike me as a huge problem.
Certainly we now are happy to countenance propositional attitudes, and I see no reason to think
that Ramsey would be averse to them.

purposes. Luckily, (27) is very close to a formalization of one of the paradoxes listed in
Whitehead & Russell (1910), about which the point can be made without such concerns.
As my aim is only to show that there is at least one paradox that Ramsey would have been
interested in that is not resolved along with the semantical paradoxes, this should suffice.
8.3.2. Russell’s Contradiction 1. There are seven paradoxes, or contradictions, dis-
cussed in Whitehead & Russell (1910); they are basically Paradoxes 1–7 in a different
order. The first of the seven, which I will call Contradiction 1, is the analogue of Paradox 4.
It is introduced:
Epimenides the Cretan said that all Cretans were liars, and all other
statements made by Cretans were certainly lies. Was this a lie? The
simplest form of this contradiction is afforded by the man who says “I am
lying”; if he is lying, he is speaking the truth, and vice versa. (Whitehead
& Russell, 1910, p. 60)
Later, when discussing the resolution of the paradoxes by the ramified theory of types
[p. 62], the sentence I quoted above, ‘There is a proposition which I am affirming and which
is false’, is presented. But as all of these ostensible simplifications involve indexicality and
possibly self-reference, neither of which is present in the original statement, I want to focus
on the Epimenides paradox itself.
The problematic state of affairs here is one in which three things obtain: (i) Epimenides
says the proposition that every proposition a Cretan says is false; (ii) Epimenides is a
Cretan; and (iii) every other proposition any Cretan has said is false. As I said above,
Russell is not known for his care in distinguishing propositions from sentences, so one
might worry about my use of ‘proposition’ in (i) and (iii). But he uses ‘proposition’ in the
simplified version from p. 62, and Ramsey clearly takes those propositions to be distinct
from the sentences that denote them in a very modern way, so I do not think that my use of
the term is illicit.
It should now be clear how these formalizations go, but for completeness, I include them
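here. Letting $e^{i}$ be Epimenides, $S^{i,p,p}(x, y)$ mean that x says (the proposition) y, and
$C^{i,p}(x)$ mean that x is a Cretan,
\[ {\vee}S\Bigl(e, \bigcap x^{p} \bigcap y^{i}\, \bigl[C(y) \mathbin{;} [S(y, x) \mathbin{;} {\sim}x]\bigr]\Bigr), \tag{28} \]
\[ {\vee}C(e), \tag{29} \]
and
\[ \forall x^{p}\, \forall y^{i}\, \Bigl[{\vee}C(y) \to \Bigl[{\vee}S(y, x) \to \bigl[x = \bigcap x^{p} \bigcap y^{i}\, \bigl[C(y) \mathbin{;} [S(y, x) \mathbin{;} {\sim}x]\bigr] \vee \neg{\vee}x\bigr]\Bigr]\Bigr] \tag{30} \]
are (i), (ii), and (iii) respectively.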


The derivation of a contradiction again closely follows that in the Aristotle paradox. The
supposition of (28), (29), and (30) seems to me to be precisely the assumption at issue in
the first informal statement of Contradiction 1 in Principia Mathematica. There is certainly
no indexicality or self-reference to create worries. But there are also no metalinguistic
predicates. Thus, if we take the problem posed by Contradiction 1 seriously, as it seems
Ramsey did, then we seem to have to conclude, contra Ramsey, and contra everybody
following Ramsey for the last 80 years, that the paradoxes in Group B do not all essentially
“contain some reference to . . . language” [p. 20].
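The claim that the derivation follows the Aristotle paradox can again be checked in Lean 4. The sketch below rests on modelling assumptions that are not part of the paper's logic: `I` and `P` model the types of individuals and propositions, `tr` stands in for ∨, `C` and `S` for the Cretan and saying relations, and `q` for the proposition that Epimenides says. The hypothesis `hq` states what the translation principles would give for that proposition; all names are hypothetical.

```lean
-- Sketch: (28), (29), and (30) are jointly inconsistent.
theorem epimenides_paradox {I P : Type} (tr : P → Prop)
    (C : I → P) (S : I → P → P) (e : I) (q : P)
    -- translation principles applied to q: ∨q ↔ ∀x∀y[∨C(y) → [∨S(y, x) → ¬∨x]]
    (hq : tr q ↔ ∀ (x : P) (y : I), tr (C y) → tr (S y x) → ¬ tr x)
    -- (28): Epimenides says q
    (h28 : tr (S e q))
    -- (29): Epimenides is a Cretan
    (h29 : tr (C e))
    -- (30): anything a Cretan says is q or is false
    (h30 : ∀ (x : P) (y : I), tr (C y) → tr (S y x) → x = q ∨ ¬ tr x) : False := by
  -- q is not true: otherwise it would falsify itself via (28) and (29)
  have hnq : ¬ tr q := fun h => hq.mp h q e h29 h28 h
  -- yet (30) makes everything a Cretan says false, which is what q states
  apply hnq
  apply hq.mpr
  intro x y hC hS hx
  exact (h30 x y hC hS).elim (fun hxq => hnq (hxq ▸ hx)) (fun hf => hf hx)
```

As in the informal statement, no hypothesis involves indexicality, self-reference, or a metalinguistic predicate; the argument needs only the two relations and propositional truth.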

§9. Conclusions. In light of much of ‘The Foundations of Mathematics’, Ramsey
might be taken to be uninterested in intensionality. Indeed, he claims that if he is right that
the paradoxes of Group B involve extralogical notions of thought or language, then “they
[are] not . . . relevant to mathematics or logic, if by ‘logic’ we mean a symbolic system”
[p. 21]. But he immediately follows this with “though of course they would be relevant to
logic in the sense of the analysis of thought.” This sense of ‘logic’, it seems, is his concern
when he analyzes meaning and attempts to formalize and resolve his Group B paradoxes.
Unfortunately, I think that his analysis of meaning would sound naive to modern ears,
although I have not said much about it. And the logic itself seems to be somewhat sketchily
thought out, as evinced by the number of details I have had to fill in with my reconstruction.
Aside from these difficulties, his resolution of the semantical paradoxes seems to rely on
a restriction of the ranges of bound variables that is dangerously close to ramification and
has no basis in the logic. Worse still, I have argued that it does not successfully avoid all of
the paradoxes it is designed to avoid.
But this failure is perhaps not as interesting as the general sketchiness and confusion that
seem to pervade §III of ‘The Foundations of Mathematics’, which contains his treatment
of meaning and the Group B paradoxes. Time and again in that section one encounters
small mistakes, as with his treatment of Paradox 4, or confusing uses of terms, as with his
multiple distinct uses of ‘mean’ (and ‘involve’, though I have not said much about it). As
I said in Section 1, I am reluctant to impute such confusion to Ramsey, but I hope that
the interpretation I have provided in Part I is at least not entirely unsatisfactory, in spite
of its generally pessimistic tone. Indeed, I think that the confusion is easily understood:
these problems are notoriously difficult, and Ramsey was attempting to formalize concepts
that are still to some extent problematic today. It is not surprising that one would not have
a clearer idea of what is going on when one only has Principia Mathematica (and the
Tractatus) to work from.
If I am right about this confusion, though, then it is unfortunate that so much weight
has been given to Ramsey’s characterization of the paradoxes. I tried to show in Part II
that we can construct formalizations of some of the Group B paradoxes that are, even
by his own lights, more faithful to the informal statements of the paradoxes. If I am
right, then it is unfortunate that so many people have taken Ramsey at his word that the
paradoxes in Group B are all due to semantical notions, thereby obscuring an important,
and possibly importantly different, collection of paradoxes.45 Again, Ramsey’s confusion
is understandable, but one might wish that it had not been so influential.

§10. Acknowledgments. Thanks to Rich Thomason for many helpful comments and
suggestions throughout the development of this paper. Thanks also to Gabriel Sandu for
comments that helped frame my thoughts, especially regarding Section 6.2, and an anony-
mous referee, whose comments helped me clarify and tighten up several parts of Part II.

BIBLIOGRAPHY
Barwise, J., & Etchemendy, J. (1987). The Liar. Oxford, UK: Oxford University Press.
Bealer, G. (1982). Quality and Concept. Oxford, UK: Oxford University Press.

45 The intensional paradoxes have not been completely ignored. See, for example, Prior (1961),
Church (1993, p. 152), Bealer (1982, pp. 98–100), and Klement (2002). But attention to them
beyond these and with the realization that they are importantly different from the semantical
paradoxes has been minimal.

Bealer, G. (1994). Property theory: The type-free approach v. the Church approach. Journal
of Philosophical Logic, 23, 139–171.
Beth, E. W. (1959). The Foundations of Mathematics. Amsterdam, The Netherlands:
North-Holland Publishing Company.
Braithwaite, R. B., editor. (1931). The Foundations of Mathematics and Other Logical
Essays. London: Routledge and Kegan Paul. Collected papers of Frank P. Ramsey.
Carnap, R. (1956). Meaning and Necessity (second edition). Chicago, IL: Chicago
University Press. First edition published in 1947.
Church, A. (1932). Review of The Foundations of Mathematics and Other Logical Essays.
The American Mathematical Monthly, 39, 355–357.
Church, A. (1974). Russellian simple type theory. Proceedings of the American
Philosophical Association, 47, 21–33.
Church, A. (1976). Comparison of Russell’s resolution of the semantical antinomies with
that of Tarski. Journal of Symbolic Logic, 41, 747–760.
Church, A. (1993). A revised formulation of the logic of sense and denotation. Alternative
(1). Noûs, 27, 141–157.
Fraenkel, A. A., & Bar-Hillel, Y. (1958). Foundations of Set Theory. Amsterdam, The
Netherlands: North-Holland Publishing Co.
Feferman, S. (1984). Toward useful type-free theories, I. Journal of Symbolic Logic, 49,
75–111.
Grover, D. L., Camp, J. L. Jr., & Belnap, N. D. Jr. (1974). A prosentential theory of truth.
Philosophical Studies, 27, 73–125.
Kneale, W., & Kneale, M. (1962). The Development of Logic. Oxford, UK: Oxford
University Press.
Klement, K. C. (2002). Frege and the Logic of Sense and Reference. New York, NY:
Routledge.
Mellor, D. Hugh, editor. (1990). Philosophical Papers. Cambridge, England: Cambridge
University Press. Collected papers of Frank P. Ramsey.
Parsons, C. (1974). The liar paradox. Journal of Philosophical Logic, 3, 381–412.
Prior, A. N. (1961). On a family of paradoxes. Notre Dame Journal of Formal Logic, 2,
16–32.
Priest, G. (1994). The structure of the paradoxes of self-reference. Mind, New Series,
103(409), 25–34.
Quine, W. V. O. (1963). Set Theory and Its Logic. Cambridge, MA: Harvard University
Press.
Ramsey, F. P. (1925). The foundations of mathematics. Proceedings of the London
Mathematical Society, 25, 338–384. Reprinted in Braithwaite (1931); Mellor (1990).
Russell, B. (1932). Review of The Foundations of Mathematics and Other Logical Essays.
Philosophy, 7, 84–86.
Thomason, R. H. (1980). A model theory for propositional attitudes. Linguistics and
Philosophy, 4, 47–70.
Whitehead, A. N., & Russell, B. (1910). Principia Mathematica (first edition). Cambridge,
England: Cambridge University Press.

DEPARTMENT OF PHILOSOPHY
UNIVERSITY OF MICHIGAN
435 SOUTH STATE STREET
ANN ARBOR, MI 48109–1003
E-mail: dtuck@umich.edu
THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010

THE MODAL LOGIC OF STONE SPACES: DIAMOND AS DERIVATIVE
GURAM BEZHANISHVILI
Department of Mathematical Sciences, New Mexico State University
LEO ESAKIA and DAVID GABELAIA
Department of Mathematical Logic, A. Razmadze Mathematical Institute

Abstract. We show that if we interpret modal diamond as the derived set operator of a topo-
logical space, then the modal logic of Stone spaces is K4 and the modal logic of weakly scattered
Stone spaces is K4G. As a corollary, we obtain that K4 is also the modal logic of compact Hausdorff
spaces and K4G is the modal logic of weakly scattered compact Hausdorff spaces.

§1. Introduction. Topological semantics of modal logic was first developed by
McKinsey & Tarski (1944), who suggested two interpretations of modal diamond ◇:
one as the closure operator and another as the derived set operator of a topological space.
They showed that if we interpret ◇ as the closure operator, then the modal logic of all
topological spaces is Lewis’ well-known modal system S4. The main result of McKinsey &
Tarski (1944) states that S4 is in fact the modal logic of any dense-in-itself metrizable
space.
On the other hand, if we interpret ◇ as the derived set operator, then the modal logic
of all topological spaces is wK4 (weak K4), which is obtained from the basic normal
modal logic K by adding ◇◇p → (p ∨ ◇p) as a new axiom (Esakia, 2004). Moreover,
K4 = K + ◇◇p → ◇p is the modal logic of all T_D-spaces (Esakia, 2004) and GL =
K + □(□p → p) → □p is the modal logic of all scattered spaces (Esakia, 1981). Further
results in this direction can be found in Shehtman (1990), Bezhanishvili et al. (2005), and
Gabelaia (2004).
In this paper we are interested in the modal logic of compact Hausdorff zero-dimensional
spaces, also known as Stone spaces. The interest in Stone spaces stems from the celebrated
Stone duality, which establishes duality (dual equivalence) between the category of
Boolean algebras and Boolean algebra homomorphisms and the category of Stone spaces
and continuous maps. Under Stone duality atomless Boolean algebras correspond to dense-
in-itself Stone spaces, atomic Boolean algebras correspond to weakly scattered Stone
spaces, and superatomic Boolean algebras correspond to scattered Stone spaces. It follows
from Abashidze (1988) that the modal logic of scattered Stone spaces is GL. In Shehtman
(1990), the McKinsey–Tarski technique was adopted to show that K4D = K4 + ◇⊤ is
the modal logic of any dense-in-itself zero-dimensional metrizable space. It follows that
K4D is the modal logic of dense-in-itself Stone spaces. To this we add that the modal
logic of all Stone spaces is K4 and the modal logic of weakly scattered Stone spaces
is K4G = K4 + ¬□⊥ → ¬□¬□⊥. As a corollary, we obtain that the modal logic
of compact Hausdorff spaces is K4 and the modal logic of weakly scattered compact
Hausdorff spaces is K4G.

Received: January 28, 2009
© Association for Symbolic Logic, 2010. doi:10.1017/S1755020309990335

§2. Preliminaries. In this paper we will be interested in the following modal logics:

1. K4 = K + ◇◇p → ◇p;
2. K4D = K4 + ◇⊤;
3. K4G = K4 + ¬□⊥ → ¬□¬□⊥; and
4. GL = K + □(□p → p) → □p.

These logics are related to each other by the following diagram:

It is well known that K4 is the modal logic of transitive frames, that K4D is the
modal logic of transitive serial frames, and that GL is the modal logic of dually well-
founded frames. These three logical systems are well known in the literature (see, e.g.,
Chagrov & Zakharyaschev, 1997). On the other hand, K4G is a relatively new system
introduced in Esakia (2002). Its main importance lies in its capability to express modally
Gödel’s second incompleteness theorem (a consistent logical system cannot prove its own
consistency).
Each of the four modal logics is complete with respect to its relational semantics. We
briefly recall some basic facts about relational semantics which will be used subsequently.
Let F = (W, R) be a K4-frame; that is, F is transitive (wRv and vRu imply wRu). Then
F is a K4D-frame if in addition it is serial (i.e., for each w ∈ W there exists v ∈ W such
that wRv). We call w ∈ W a reflexive point if wRw; otherwise we call w an irreflexive
point. Let
C(w) = {w} ∪ {v ∈ W : wRv and vRw}.
We call C(w) the cluster generated by w; we also call a subset C of W a cluster if C =
C(w) for some w ∈ W. Let C be a cluster of W. We call C proper if it consists of more than
one element, simple if it consists of a single reflexive point, and degenerate if it consists
of a single irreflexive point. We call w ∈ W a maximal point if wRv implies w = v, and
a quasimaximal point if wRv implies vRw. Clearly each maximal point is quasimaximal,
but not vice versa.
Now, F is a GL-frame iff F is dually well founded (i.e., for each nonempty subset V
of W there exists v ∈ V such that vRu for no u ∈ V); and F is a K4G-frame iff F is a
K4-frame and for each w ∈ W, either w is an irreflexive maximal point or there exists an
irreflexive maximal point v ∈ W such that wRv.
We say that w ∈ W is a root of F if wRv for each v ∈ W − {w}, and that F is rooted if
there exists a root in F. Note that a root may not be unique. In fact, if w is a root, then each
element of C(w) is also a root.
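The frame notions just recalled can be checked mechanically on small examples. The following Python sketch is only an illustration: it assumes a finite frame given as a set W of points and a set R of ordered pairs, and the function names and the three-point example frame are ours, not part of the paper.

```python
def cluster(R, w):
    """C(w) = {w} together with all v such that wRv and vRw."""
    return {w} | {v for (a, v) in R if a == w and (v, w) in R}

def is_transitive(R):
    """wRv and vRu imply wRu."""
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def is_maximal(R, w):
    """wRv implies w = v."""
    return all(v == w for (a, v) in R if a == w)

def is_quasimaximal(R, w):
    """wRv implies vRw."""
    return all((v, w) in R for (a, v) in R if a == w)

def is_k4g_frame(W, R):
    """Transitive, and every point is, or sees, an irreflexive maximal point."""
    irr_max = {w for w in W if (w, w) not in R and is_maximal(R, w)}
    return is_transitive(R) and all(
        w in irr_max or any((w, v) in R for v in irr_max) for w in W
    )

def roots(W, R):
    """All w with wRv for each v in W - {w}."""
    return {w for w in W if all((w, v) in R for v in W - {w})}

# Hypothetical example: a proper cluster {0, 1} below an irreflexive maximal point 2.
W = {0, 1, 2}
R = {(0, 0), (0, 1), (1, 0), (1, 1), (0, 2), (1, 2)}

print(cluster(R, 0))          # the proper cluster generated by 0
print(is_k4g_frame(W, R))     # every point sees the irreflexive maximal point 2
print(roots(W, R))            # both points of the root cluster are roots
```

In this example both 0 and 1 are roots, which illustrates the remark that a root need not be unique.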
The next proposition states that all four modal logics of our interest have the finite model
property.
PROPOSITION 2.1.
1. K4 is the modal logic of finite rooted transitive frames.
2. K4D is the modal logic of finite rooted transitive serial frames.
3. GL is the modal logic of finite rooted transitive irreflexive frames.
4. K4G is the modal logic of finite rooted K4G-frames.
Proof. For (1) and (2) see, for example, Chagrov & Zakharyaschev (1997, corollary
5.3.2); and for (3) see, for example, Chagrov & Zakharyaschev (1997, theorem 5.46).
We sketch a proof that K4G is the modal logic of finite rooted K4G-frames, using the
standard filtration argument through a well-chosen set of formulas. If K4G ⊬ ϕ, then
ϕ is refuted on the canonical model M_K4G of K4G. Since K4 is a canonical logic and
the formula ¬□⊥ → ¬□¬□⊥ contains no propositional letters, the underlying frame of
M_K4G is a K4G-frame. Consider the standard transitive filtration (see, e.g., Chagrov &
Zakharyaschev, 1997, pp. 141–145) of M_K4G through the set
Σ = {ψ : ψ is a subformula of ϕ ∧ (¬□⊥ → ¬□¬□⊥)}.
Since the underlying frame of M_K4G is a K4G-frame, it is not difficult to see that the finite
refutation frame obtained by such a filtration has all quasimaximal clusters degenerate.
Indeed, let x be an arbitrary element in the filtrated model. Then x can be identified with
a maximal consistent subset of Σ. Suppose x is not an irreflexive maximal point. Then
x must contain ¬□⊥. We also have that ¬□⊥ → ¬□¬□⊥ ∈ x. Therefore, by Modus
Ponens, ¬□¬□⊥ ∈ x. But then x is related to some y in the filtrated model with □⊥ ∈
y. This implies that y is an irreflexive maximal point of the filtrated model. Thus, the
underlying frame of the filtrated model is a finite K4G-frame. That ϕ can be refuted on a
finite rooted K4G-frame is now straightforward. ∎
Let X be a topological space and A ⊆ X. We recall that x ∈ X is a limit point of A if
for each open neighborhood U of x we have A ∩ (U − {x}) ≠ ∅. Let d(A) denote the set
of limit points of A; d(A) is called the derived set of A. It is obvious that the closure of A
is A union d(A); that is, cl(A) = A ∪ d(A).
We also recall that a valuation of the basic modal language in a topological space X is a
map ν from the set of propositional letters into the powerset of X. Given a valuation ν and
x ∈ X, we define the satisfaction relation by induction:
1. x ⊨_ν p iff x ∈ ν(p);
2. x ⊨_ν ϕ ∧ ψ iff x ⊨_ν ϕ and x ⊨_ν ψ;
3. x ⊨_ν ¬ϕ iff not x ⊨_ν ϕ; and
4. x ⊨_ν ◇ϕ iff for each open neighborhood U of x there exists y ∈ U − {x} such that
y ⊨_ν ϕ.
It follows that
2a. x ⊨_ν ϕ ∨ ψ iff x ⊨_ν ϕ or x ⊨_ν ψ
and that
4a. x ⊨_ν □ϕ iff there exists an open neighborhood U of x such that y ⊨_ν ϕ for each
y ∈ U − {x}.
Given a topological space X, a valuation ν, and a formula ϕ, we say that ϕ is true in X
under ν if x ⊨_ν ϕ for each x ∈ X, and that ϕ is valid in X if ϕ is true in X under every
valuation. If ϕ is valid in X, then we write X ⊨ ϕ.
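For a finite space the satisfaction clauses above can be evaluated directly. The sketch below is an illustration only: the encoding of formulas as nested tuples, the function names, and the choice of the two-point Sierpiński space as an example are our assumptions and do not come from the paper.

```python
def derived(X, opens, A):
    """d(A): points x such that every open U containing x meets A outside x."""
    return {x for x in X if all((U - {x}) & A for U in opens if x in U)}

def truth_set(X, opens, nu, phi):
    """Set of points of X satisfying phi under the valuation nu.

    Formulas are nested tuples: ("var", "p"), ("not", f), ("and", f, g), ("dia", f).
    """
    op = phi[0]
    if op == "var":
        return set(nu[phi[1]])
    if op == "not":
        return set(X) - truth_set(X, opens, nu, phi[1])
    if op == "and":
        return truth_set(X, opens, nu, phi[1]) & truth_set(X, opens, nu, phi[2])
    if op == "dia":  # diamond is interpreted as the derived set operator
        return derived(X, opens, truth_set(X, opens, nu, phi[1]))
    raise ValueError(f"unknown connective: {op}")

# Hypothetical example: the Sierpinski space on {0, 1} with open sets {}, {1}, {0, 1}.
X = {0, 1}
opens = [frozenset(), frozenset({1}), frozenset({0, 1})]
nu = {"p": {1}}

dia_p = ("dia", ("var", "p"))
dia_dia_p = ("dia", dia_p)
print(truth_set(X, opens, nu, dia_p))      # points where diamond p holds
print(truth_set(X, opens, nu, dia_dia_p))  # points where diamond diamond p holds
```

Here ◇p holds exactly at the point 0 and ◇◇p holds nowhere, so the K4 axiom ◇◇p → ◇p is true under this particular valuation.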

Let L(X) = {ϕ : X ⊨ ϕ}. Then it is well known (and easy to verify) that L(X) is
a modal logic, called the modal logic of X. Given a class K of topological spaces, let
L(K) = ⋂{L(X) : X ∈ K}. Obviously L(K) is a modal logic, called the modal logic
of K.
Let X be a topological space. We recall that X is a T_D-space if each point of X is the
intersection of an open subset and a closed subset of X. Alternatively, X is a T_D-space iff
dd(A) ⊆ d(A) for each A ⊆ X. We also recall that x ∈ X is an isolated point if {x} is
an open subset of X. Let iso(X) denote the set of isolated points of X. Then X is called
dense-in-itself if iso(X) = ∅. Alternatively, X is dense-in-itself iff d(X) = X.
We say that a subset A of X is dense if cl(A) = X, that X is weakly scattered if iso(X)
is dense in X, and that X is scattered if each subspace of X is weakly scattered.
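These properties are easy to test on finite spaces by brute force. The following Python sketch is a hedged illustration: it uses the characterization dd(A) ⊆ d(A) of T_D-spaces stated above, and the function names and the two example spaces (Sierpiński and two-point indiscrete) are our own choices.

```python
from itertools import combinations

def derived(X, opens, A):
    """d(A): points x such that every open U containing x meets A outside x."""
    return {x for x in X if all((U - {x}) & A for U in opens if x in U)}

def closure(X, opens, A):
    """cl(A) = A ∪ d(A)."""
    return set(A) | derived(X, opens, A)

def iso(X, opens):
    """Isolated points: those x for which {x} is open."""
    return {x for x in X if frozenset({x}) in opens}

def is_dense_in_itself(X, opens):
    return not iso(X, opens)

def is_weakly_scattered(X, opens):
    return closure(X, opens, iso(X, opens)) == set(X)

def is_TD(X, opens):
    """Check dd(A) ⊆ d(A) for every subset A of X (exponential, fine for tiny X)."""
    xs = list(X)
    for r in range(len(xs) + 1):
        for c in combinations(xs, r):
            dA = derived(X, opens, set(c))
            if not derived(X, opens, dA) <= dA:
                return False
    return True

# Hypothetical examples on the two-point set {0, 1}.
X = {0, 1}
sierpinski = {frozenset(), frozenset({1}), frozenset({0, 1})}
indiscrete = {frozenset(), frozenset({0, 1})}

print(is_TD(X, sierpinski), is_weakly_scattered(X, sierpinski))
print(is_TD(X, indiscrete), is_dense_in_itself(X, indiscrete))
```

The Sierpiński space comes out as a weakly scattered T_D-space, while the indiscrete space is dense-in-itself but not T_D, since d({0}) = {1} and dd({0}) = {0}.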
The next proposition is well known. It shows that three of the four logics we are inter-
ested in are all modal logics of natural classes of topological spaces.
PROPOSITION 2.2.
1. (Esakia, 2004) K4 is the modal logic of T_D-spaces.
2. (Shehtman, 1990) K4D is the modal logic of dense-in-itself T_D-spaces.
3. (Esakia, 1981) GL is the modal logic of scattered spaces.

On the other hand, it will follow from our results that K4G is the modal logic of weakly
scattered T_D-spaces.
A particularly important class of topological spaces is that of compact Hausdorff spaces.
Since each Hausdorff space is T_D, it follows that the modal logic of compact Hausdorff
spaces contains K4.
We recall that a subset A of a topological space is clopen if it is both closed and open, and
that X is zero-dimensional if clopen subsets of X form a basis for the topology. Compact
Hausdorff zero-dimensional spaces are often called Stone spaces. They play an important
role in the theory of Boolean algebras as it follows from Stone duality that the category of
Boolean algebras and Boolean algebra homomorphisms is dually equivalent to the category
of Stone spaces and continuous maps. Under Stone duality, atomless Boolean algebras
correspond to dense-in-itself Stone spaces, atomic Boolean algebras correspond to weakly
scattered Stone spaces, and superatomic Boolean algebras correspond to scattered Stone
spaces.
It follows from Shehtman (1990) that K4D is the modal logic of any dense-in-itself zero-
dimensional metrizable space. In particular, K4D is the modal logic of the Cantor space C.
Since C is a dense-in-itself Stone space, it follows that the modal logic of dense-in-itself
Stone spaces is K4D. In addition, it follows from Abashidze (1988) that GL is the modal
logic of any ordinal α ≥ ω^ω (viewed as a topological space in the interval topology). In
particular, GL is the modal logic of ω^ω + 1. Since ω^ω + 1 is a scattered Stone space, it
follows that GL is the modal logic of scattered Stone spaces.
In this paper we show that K4 is the modal logic of all Stone spaces and K4G is the
modal logic of weakly scattered Stone spaces. As a consequence, we obtain that K4 is also
the modal logic of all compact Hausdorff spaces and K4G is the modal logic of weakly
scattered compact Hausdorff spaces. Consequently, K4G is also the modal logic of weakly
scattered T_D-spaces. Thus, we obtain the following picture:

1. K4 = the modal logic of T_D-spaces = the modal logic of compact Hausdorff spaces
= the modal logic of Stone spaces;
2. K4D = the modal logic of dense-in-itself T_D-spaces = the modal logic of dense-in-
itself compact Hausdorff spaces = the modal logic of dense-in-itself Stone spaces;
3. K4G = the modal logic of weakly scattered T_D-spaces = the modal logic of weakly
scattered compact Hausdorff spaces = the modal logic of weakly scattered Stone
spaces; and
4. GL = the modal logic of scattered spaces = the modal logic of scattered compact
Hausdorff spaces = the modal logic of scattered Stone spaces.

§3. Modal logic of dense-in-itself Stone spaces: a new proof. As we pointed out in
the previous section, K4D is the modal logic of the Cantor space. In this section we give
a new and simplified proof of this result by adopting the technique developed in Aiello
et al. (2003) for proving completeness of S4 with respect to the Cantor space (when ◇ is
interpreted as the closure operator).
We proceed as follows. By Proposition 2.1(2), K4D is complete with respect to finite
rooted K4D-frames. Therefore, if K4D ⊬ ϕ, then there exists a finite rooted K4D-frame
F = (W, R) such that F ⊭ ϕ. Since F is a K4D-frame, each quasimaximal point of F is
reflexive. Figuratively speaking, F is top-reflexive.
We recall that U ⊆ W is an upset of W if w ∈ U and wRv imply v ∈ U, and that the
collection of upsets of W forms a topology τ_R on W, called an Alexandroff topology (in
which the intersection of any family of open subsets is again open). We also recall from
Bezhanishvili et al. (2005) that a map f from a topological space (X, τ) into (W, R) is a
d-morphism if:
(i) f is continuous (V ∈ τ_R implies f^{-1}(V) ∈ τ),
(ii) f is open (U ∈ τ implies f(U) ∈ τ_R),
(iii) f is i-discrete (w an irreflexive point of W implies f^{-1}(w) is a discrete subspace
of X), and
(iv) f is r-dense (w a reflexive point of W implies f^{-1}(w) is a dense-in-itself subspace
of X),
and that onto d-morphisms preserve validity of formulas; or put differently, they reflect
refutation. Therefore, in order to refute ϕ on the Cantor space C, it is sufficient to construct
a d-morphism from C onto W.
LEMMA 3.1. For each finite rooted K4D-frame F, there exists a d-morphism f : C ↠ F
from the Cantor space C onto F.
Proof. We view C as the collection of infinite paths of the infinite binary tree T_2.
The topology on C is defined as follows. For each finite path X of T_2, let
B_X = {σ ∈ C : X is an initial segment of σ}.
Then {B_X : X is a finite path of T_2} is a basis for the topology on C.

Now we label T_2 by nodes of F as follows. First let us fix some enumeration of W =
{w_1, . . . , w_n}. Let w ∈ W, R(w) = {v ∈ W : wRv}, and R^+(w) = {w} ∪ R(w). Since
F is a K4D-frame, R(w) ≠ ∅. We label the root of T_2 by a root of F; if a node t of
T_2 is labeled by w ∈ W, then we label the whole left path of T_2 starting at t by w, and
we label the right-son of t by the first unused node of R(w) in the enumeration of W (if
all nodes were already used, we start over at the least node in R(w) in the enumeration
of W).
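The labeling can be sketched in Python as follows. This is one possible reading, stated as an assumption: we interpret "first unused node of R(w)" as a round-robin over R(w) along each left path, so that the right-son of the node reached after j left steps receives the (j mod |R(w)|)-th element of R(w) in the enumeration of W. Finite paths are encoded as strings over "0" (left) and "1" (right); the function name and the two-point example frame are ours.

```python
from itertools import product

def label(path, enum, R, root):
    """Label of the node of T_2 reached from the root by `path`.

    path: string over "0" (left son) and "1" (right son)
    enum: list fixing the enumeration of W
    R:    set of pairs, assumed transitive and serial
    root: a root of the frame, used to label the root of T_2
    """
    w, run = root, 0          # run = left steps taken since the current left path began
    for step in path:
        if step == "0":
            run += 1          # the whole left path keeps the label w
        else:
            succ = [v for v in enum if (w, v) in R]  # R(w), nonempty by seriality
            w = succ[run % len(succ)]                # round-robin (our assumption)
            run = 0
    return w

# Hypothetical K4D-frame: reflexive root "a" below a reflexive point "b".
enum = ["a", "b"]
R = {("a", "a"), ("a", "b"), ("b", "b")}

for p in ["", "0", "1", "01", "001", "0110"]:
    print(repr(p), label(p, enum, R, "a"))

# Every element of R^+(a) occurs as the label of some finite path.
labels = {label("".join(bits), enum, R, "a")
          for n in range(4) for bits in product("01", repeat=n)}
print(labels)
```

Under this reading, for an infinite path of the form (X, 0, 0, . . .) the label of X is the point that stabilizes the path.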
Let σ be an infinite path of T_2. If σ is going infinitely to the left, then there is a w ∈ W
such that each node of σ is labeled by w starting from some node on. In this case we say
that w stabilizes σ. Else there exists a cluster C of W such that every node of σ is labeled by
an element of C starting from some node on. In this case we say that σ keeps cycling in C.
For each such cluster C we pick any w_C ∈ C and define f : C → W as follows:
f(σ) = w if w stabilizes σ,
f(σ) = w_C if σ keeps cycling in C.

It is left to be shown that f is an onto d-morphism. That f is onto is obvious from the
definition of f. To see that f is open, let X be a finite path of T_2 and the end of X be
labeled by w. Then, by the definition of f, we have f(B_X) ⊆ R^+(w). Conversely, if
v ∈ R^+(w), then there exists a finite path Y extending X whose end is labeled by v. Let
σ = (Y, 0, 0, . . .). Then σ ∈ B_Y ⊆ B_X and f(σ) = v. Thus, f(B_X) = R^+(w), and so f
is open.
To see that f is continuous, let w ∈ W. We let
U = ⋃{B_X : X is a finite path of T_2 whose end is labeled by some v ∈ R^+(w)},
and show that f^{-1}(R^+(w)) = U. We have σ ∈ f^{-1}(R^+(w)) iff f(σ) ∈ R^+(w), and
σ ∈ U iff there exists a finite initial segment X of σ whose end is labeled by some
v ∈ R^+(w). It is easily
seen that if σ ∈ U, then f(σ) ∈ R^+(w), and so U ⊆ f^{-1}(R^+(w)). For the converse
inclusion, note that f(σ) ∈ R^+(w) iff there is a v ∈ R^+(w) such that either v stabilizes
σ or σ keeps cycling in C(v). In either case we can find a finite initial segment X of σ whose end
is labeled by some v ∈ R^+(w). Thus, f^{-1}(R^+(w)) ⊆ U, and so f^{-1}(R^+(w)) = U. It follows
that f is continuous.
To see that f is i-discrete, let w be an irreflexive point of W and σ ∈ f^{-1}(w). Then
f(σ) = w and there exists a finite initial segment X of σ whose end is labeled by w. Note
that all finite paths Y = (X, . . . , 1, . . .) have ends labeled with some v ∈ R(w). Since w is
irreflexive, w ∉ R(w). Therefore, the only infinite path in B_X that contains infinitely many
points labeled with w is (X, 0, 0, . . .). Thus, B_X ∩ f^{-1}(w) = {σ} = {(X, 0, 0, . . .)}, and
so f^{-1}(w) is a discrete subspace of C.
To see that f is r-dense, let w be a reflexive point of W and σ ∈ f^{-1}(w). Suppose
σ ∈ B_X for some finite initial segment X of σ whose end is labeled by v. Then vRw and
so we can find a finite initial segment Y of σ such that Y contains X as an initial segment
and whose end is labeled by w. But w is reflexive, hence w ∈ R(w). Therefore, there
are at least two infinite paths having Y as an initial segment that belong to f^{-1}(w): One
is (Y, 0, 0, . . .) and the other is of the form (Y, 0, 0, . . . , 1, 0, 0, . . .), where the number
of 0s after Y is precisely the number required for w to come up again as a label in the
enumeration of R(w). It follows that B_X ∩ f^{-1}(w) contains at least one infinite path
other than σ. Thus, there are no isolated points in the subspace f^{-1}(w) of C, and so
f^{-1}(w) is a dense-in-itself subspace of C. Consequently, f : C ↠ W is an onto
d-morphism. ∎
COROLLARY 3.2. K4D is the modal logic of the Cantor space, hence K4D is the modal
logic of dense-in-itself Stone spaces.
Proof. Since C is a dense-in-itself T_D-space, K4D is sound with respect to C. To see
completeness, let K4D ⊬ ϕ. By Proposition 2.1(2), there exists a finite rooted K4D-frame
F = (W, R) such that F ⊭ ϕ. By Lemma 3.1, there exists a d-morphism from C onto W.
Therefore, C ⊭ ϕ, and so K4D = L(C). Thus, K4D is the modal logic of dense-in-itself
Stone spaces. ∎

§4. Trees, ordinals, and compactifications. In this section we discuss connections
between trees, ordinals, and compactifications, thus providing the necessary background
for our main results, which will be discussed in the next section.
Let F = (W, R) be a K4-frame. For w ∈ W we recall that
R^{-1}(w) = {v ∈ W : vRw} and R(w) = {v ∈ W : wRv}.
Also, for U ⊆ W let
R^{-1}(U) = ⋃{R^{-1}(w) : w ∈ U} and R(U) = ⋃{R(w) : w ∈ U}.

For w, v ∈ W we write w R⃗ v whenever wRv and not vRw. We say that w is of depth n if there
exists a sequence w_0 R⃗ . . . R⃗ w_n with w = w_0 and for each other sequence v_0 R⃗ . . . R⃗ v_k
with w = v_0 we have k ≤ n. We also say that F is of depth n if there is w ∈ W of depth n
and no other element of F has greater depth.
Let F = (W, R) be a rooted K4-frame. We call F a quasitree if for each w ∈ W and each
u, v ∈ R^{-1}(w) we have that u ≠ v implies uRv or vRu. If in addition F has no proper clusters, then we
call F a tree. We call a tree T reflexive if each element of T is reflexive, and irreflexive if
each element of T is irreflexive. In addition, we call a finite quasitree F top-irreflexive if
each quasimaximal element of F is irreflexive. Then we have the following strengthening
of Proposition 2.1:
THEOREM 4.1.
1. K4 is the modal logic of finite quasitrees.
2. K4D is the modal logic of finite top-reflexive quasitrees.
3. GL is the modal logic of finite irreflexive trees.
4. K4G is the modal logic of finite top-irreflexive quasitrees.
Proof. Parts of Theorem 4.1 are well known. We sketch a uniform construction akin
to the standard finite unraveling argument to treat all four cases. Suppose F = (W, R) is
a finite rooted transitive frame. Let C = {C_1, . . . , C_n} be the set of clusters of F. We set
C_i ≤ C_j if i = j or there exist w ∈ C_i and v ∈ C_j with wRv. Then (C, ≤) is a finite
partially ordered set. By the standard unraveling of (C, ≤) (see, e.g., Chagrov & Zakharyaschev,
1997, theorem 2.19), we obtain a finite tree T of clusters. The points of T are finite paths
(x_1, . . . , x_k) of C, where x_i ∈ C and x_i < x_j whenever i < j, ordered by the relation
“is an initial segment of.” We substitute each point (x_1, . . . , x_k) of T by the cluster x_k to
obtain a finite quasitree G. Then it can be easily checked that:
• G maps p-morphically onto F,
• G is top-reflexive whenever F is top-reflexive,
• G is top-irreflexive whenever F is top-irreflexive,
• G is irreflexive whenever F is irreflexive.
The theorem follows as p-morphisms reflect refutation. ∎
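The unraveling step of this proof can be sketched in Python. The sketch is illustrative and makes assumptions of its own: the frame is a finite transitive relation given as a set of pairs, paths are taken to start at the cluster of a chosen root, and the names and the four-point example are ours.

```python
def clusters(W, R):
    """The set of clusters C(w) of a transitive frame, as frozensets."""
    return {frozenset({w} | {v for v in W if (w, v) in R and (v, w) in R})
            for w in W}

def unravel(W, R, root):
    """Points of the tree of clusters: strictly increasing paths of clusters
    starting at the cluster of `root`; the tree order is "is an initial segment of"."""
    cs = clusters(W, R)

    def below(c, d):
        # c < d in the cluster order: distinct clusters with some w in c, v in d, wRv
        return c != d and any((w, v) in R for w in c for v in d)

    start = next(c for c in cs if root in c)
    paths, frontier = [], [(start,)]
    while frontier:
        p = frontier.pop()
        paths.append(p)
        frontier.extend(p + (d,) for d in cs if below(p[-1], d))
    return paths

# Hypothetical example: an irreflexive "diamond", which is not a quasitree
# because t has two incomparable predecessors a and b.
W = {"r", "a", "b", "t"}
R = {("r", "a"), ("r", "b"), ("r", "t"), ("a", "t"), ("b", "t")}

paths = unravel(W, R, "r")
copies_of_t = [p for p in paths if p[-1] == frozenset({"t"})]
print(len(paths), len(copies_of_t))
```

Replacing each path by its last cluster gives the quasitree G; in the example the point t is split into three copies, one for each path leading to it, and the map sending a path to its last cluster is the p-morphism onto F.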
Now we proceed to link trees with appropriate spaces.
LEMMA 4.2. For each finite irreflexive tree T with the root r there exists a limit ordinal
ω^{n+1} and an onto d-morphism g : ω^{n+1} ↠ T such that g^{-1}(r) = {ω^n · k : 0 < k < ω}.
Proof. Let T be a finite irreflexive tree with the root r, and let n be the depth of T. We
build a tree T* by adjoining a new root to T; that is, if T = (T, R), then T* = (T*, R*),
where T* = T ∪ {∗} and R* = R ∪ {(∗, t) : t ∈ T}. Then T* is of depth n + 1. By
Bezhanishvili & Morandi (2010, lemma 3.4), each finite tree of depth n + 1 is a d-morphic
image of ω^{n+1} + 1. Therefore, there exists an onto d-morphism f : ω^{n+1} + 1 ↠ T*.
It follows from the proof of Bezhanishvili & Morandi (2010, lemma 3.4) that f^{-1}(∗) =
{ω^{n+1}} and f^{-1}(r) = {ω^n · k : 0 < k < ω}. Thus, f^{-1}(T) = ω^{n+1}. Let g be the
restriction of f to ω^{n+1}. Then g is clearly an onto d-morphism from the limit ordinal ω^{n+1}
onto the initial tree T with g^{-1}(r) = {ω^n · k : 0 < k < ω}. ∎
Let X be a completely regular space. We recall that a compactification of X is a compact
Hausdorff space Y such that X is homeomorphic to a dense subspace of Y. Without loss of
generality we identify X with the dense subspace of Y which is homeomorphic to X. Let
Y* = Y − X. As usual, we call Y* the remainder of Y.
Let X be a topological space and Y a subspace of X. We recall that Y is a retract of X
if there is a continuous onto map f : X → Y such that f(y) = y for each y ∈ Y. In
this case we call f a retraction; if in addition the f-inverse image of each compact subset
of Y is compact in X, then we call f a compact retraction.
LEMMA 4.3. Let X be a noncompact locally compact zero-dimensional Hausdorff
space, S a noncompact locally compact subspace of X, f : X → S a compact retraction,
and Y a zero-dimensional compactification of S. Then there is a zero-dimensional
compactification Z of X such that Z* is homeomorphic to Y* and Z* ⊆ cl(S).
Proof. Since X is locally compact and zero-dimensional, there is a basis B_X of compact
clopen subsets of X. As S is a subspace of a zero-dimensional Hausdorff space, S is
also zero-dimensional Hausdorff. Therefore, S is a noncompact locally compact zero-
dimensional Hausdorff space, and so it is an open subset of Y. Let Cp(Y) denote the basis
of all clopen subsets of Y. We set Z to be the disjoint union of X and Y*, and define a
topology on Z by letting B_Z = B_X ∪ B_Y be the basis for the topology, where
B_Y = {(U ∩ Y*) ∪ f^{-1}(U ∩ S) : U ∈ Cp(Y)}.
To see that B_Z is a basis, it is obvious that the union of the elements of B_Z is Z. We
show that B_Z is closed under finite intersections. That B_X and B_Y are closed under finite
intersections is obvious. On the other hand, if U ∈ B_X and V ∈ B_Y, then U ∩
V = U ∩ f^{-1}(W ∩ S) for some W ∈ Cp(Y). We clearly have that W ∩ S is clopen
in S, and so f^{-1}(W ∩ S) is clopen in X. Therefore, U ∩ f^{-1}(W ∩ S) is a clopen subset
of U, which is compact. Thus, U ∩ V is a compact clopen subset of X, and so belongs
to B_X.
Note that if we extend f : X → S to f : Z → Y by setting f(x) = x for each x ∈ Y*,
then the topology on Z given by the basis B_Z is the least topology on Z containing all the
open subsets of X and making f continuous.
We show that Z is Hausdorff. Let x, y ∈ Z with x ≠ y. If x, y ∈ X, then since B_X is a
basis of X and X is Hausdorff, there exist disjoint U, V ∈ B_X separating x and y. If x, y ∈
Y*, then as Y is Hausdorff, there exist disjoint U, V ∈ Cp(Y) separating x and y. But then
(U ∩ Y*) ∪ f^{-1}(U ∩ S) and (V ∩ Y*) ∪ f^{-1}(V ∩ S) are disjoint elements of B_Y separating
x and y. Finally, let x ∈ X and y ∈ Y*. Then there exists a compact clopen U of X
containing x. Therefore, f(U) is a compact subset of S, and hence a compact subset of Y.
Thus, Y − f(U) is an open subset of Y containing y. Since Y is zero-dimensional, there
exists a clopen subset V of Y such that y ∈ V ⊆ Y − f(U). But then (V ∩ Y*) ∪ f^{-1}(V ∩ S)
is an open subset of Z containing y and disjoint from U. Thus, Z is Hausdorff.
Next we show that Z is compact. Let 𝒰 ⊆ B_Z be a cover of Z. We let 𝒰_Y = {U ∈
Cp(Y) : (U ∩ Y*) ∪ f^{-1}(U ∩ S) ∈ 𝒰}. Then 𝒰_Y is an open cover of Y*. Since Y* is compact,
there exist U_1, . . . , U_n ∈ 𝒰_Y such that Y* ⊆ U_1 ∪ · · · ∪ U_n. Let F = Y − (U_1 ∪ · · · ∪ U_n).
Then F ⊆ S is a compact subset of Y and as S is an open subset of Y, F is also compact
in S. Since f is a compact retraction, f^{-1}(F) is a compact subset of X, and so a compact
subset of Z. Therefore, there exist V_1, . . . , V_m ∈ 𝒰 such that f^{-1}(F) ⊆ V_1 ∪ · · · ∪ V_m.
Thus, Z = V_1 ∪ · · · ∪ V_m ∪ [(U_1 ∩ Y*) ∪ f^{-1}(U_1 ∩ S)] ∪ · · · ∪ [(U_n ∩ Y*) ∪ f^{-1}(U_n ∩ S)],
and so Z is compact.
To see that Z is zero-dimensional, let U ∈ B_X. Then U is a compact clopen subset of X.
Since X is an open subset of Z, U is compact open in Z; and as Z is compact Hausdorff,
U is a clopen subset of Z. Now let U ∈ B_Y. Then U = (V ∩ Y*) ∪ f^{-1}(V ∩ S) for
some V ∈ Cp(Y). Therefore, Z − U = Z − [(V ∩ Y*) ∪ f^{-1}(V ∩ S)] = [Z − (V ∩
Y*)] ∩ [Z − f^{-1}(V ∩ S)] = [X ∪ (Y* ∩ (Y − V))] ∩ [Y* ∪ f^{-1}((Y − V) ∩ S)] =
f^{-1}((Y − V) ∩ S) ∪ (Y* ∩ (Y − V)); and as Y − V ∈ Cp(Y), we obtain that Z − U ∈ B_Y.
Thus, each element of B_Z is a clopen subset of Z, and so Z is zero-dimensional.
We show that Z* ⊆ cl(S). Let x ∈ Z* and U ∈ B_Z be a neighborhood of x. Then
there exists V ∈ Cp(Y) such that U = (V ∩ Y*) ∪ f^{-1}(V ∩ S). Since cl_Y(S) = Y,
we have V ∩ S ≠ ∅. Therefore, f^{-1}(V ∩ S) ≠ ∅, and as f is a retraction, we obtain
S ∩ f^{-1}(V ∩ S) ≠ ∅. Thus, S ∩ U ≠ ∅, and so Z* ⊆ cl(S).
It follows that Z is a compact Hausdorff zero-dimensional space such that Z* ⊆ cl(S).
Therefore, cl(X) = Z, and so Z is a zero-dimensional compactification of X. Finally, that
the identity map from Z* to Y* is continuous follows from the definition of B_Y. Thus, there
is a continuous bijection between compact Hausdorff spaces Z* and Y*, which means that
Z* and Y* are homeomorphic. ∎
LEMMA 4.4. For a limit ordinal ω^{n+1} and a closed subset X of the Cantor space C,
there exists a compactification Z of ω^{n+1} such that:
1. Z is a Stone space;
2. The remainder Z* is homeomorphic to X; and
3. Z* ⊆ cl({ω^n · k : 0 < k < ω}).
Proof. Let X be a closed subset of C. Then X is a compact Hausdorff metrizable space.
Therefore, by Terasawa (1997, theorem 1), there is a compactification Y of ω such that
the remainder Y* is homeomorphic to X. In fact, the proof of Terasawa (1997, theorem 1)
implies that Y can be chosen to be zero-dimensional.1 Now consider the following partition
of ω^{n+1} into ω-many pairwise disjoint clopen intervals:
[0, ω^n], (ω^n, ω^n · 2], . . . , (ω^n · (k − 1), ω^n · k], . . . .

1 Indeed, since X is a subspace of C, it is zero-dimensional; therefore, in the proof of Terasawa
(1997, theorem 1) we can choose the basis {U_n : n < ω} to consist of clopen subsets of X, which
makes each element H_{n,m} of the basis of Y clopen.

Using this partition, we define a retraction f : ω^{n+1} ↠ {ω^n · k : 0 < k < ω} by sending all the
points in the k-th interval of the partition to ω^n · k. It is easy to see that f is an onto continuous
map and that {ω^n · k : 0 < k < ω} ⊆ ω^{n+1} with its subspace topology is homeomorphic to
ω. Since compact subsets of ω are finite, it follows that the f-inverse image of a compact
subset of {ω^n · k : 0 < k < ω} is a finite union of intervals of the partition.
Since each of these is compact, so is their finite union. Consequently, f is a compact
retraction.
Now ω^{n+1} is a noncompact locally compact zero-dimensional Hausdorff space, {ω^n · k :
0 < k < ω} is a noncompact locally compact subspace of ω^{n+1}, f : ω^{n+1} ↠ {ω^n · k : 0 < k < ω} is
a compact retraction, and Y is a zero-dimensional compactification of {ω^n · k : 0 < k < ω} such
that Y* is homeomorphic to X. Thus, we are in a position to apply Lemma 4.3, by which
we obtain a zero-dimensional compactification Z of ω^{n+1} such that Z* is homeomorphic
to X and Z* ⊆ cl({ω^n · k : 0 < k < ω}). ∎

§5. Main results. In this section we prove our main results, that the modal logic of
Stone spaces is K4, and that the modal logic of weakly scattered Stone spaces is K4G. As
a corollary, we obtain that the modal logic of compact Hausdorff spaces is also K4 and that
the modal logic of weakly scattered compact Hausdorff spaces is K4G.
The key observation in establishing our main results is that each finite quasitree F =
(W, R) is a d-morphic image of an appropriately chosen Stone space. Our strategy will be
as follows:
1. Represent F as the disjoint union of two finite frames D and T in such a way that:
– D is a top-reflexive quasitree, hence a K4D-frame;
– T is the disjoint union of irreflexive trees T_1, . . . , T_n, hence a GL-frame.
2. Use Lemma 3.1 to build a d-morphism f from the Cantor space C onto D.
3. Use Lemma 4.2 to build d-morphisms g_i from limit ordinals ω^{k_i+1} onto the
trees T_i.
4. Combine C and ω^{k_1+1}, . . . , ω^{k_n+1} to obtain a Stone space X.
5. Combine f and g_1, . . . , g_n to obtain a d-morphism from X onto F.
For Step (1) we employ a method reminiscent of the Cantor–Bendixson theorem which
represents each space X as the disjoint union of an open subspace U and a closed subspace
F so that U is scattered and F is dense-in-itself.
LEMMA 5.1. Let F = (W, R) be a finite quasitree. Then there exist finite (possibly
empty) frames D = (D, R_D) and T = (T, R_T) such that:
(i) W = D ∪ T, D ∩ T = ∅, R_D is the restriction of R to D, and R_T is the restriction
of R to T;
(ii) D is a top-reflexive quasitree; and
(iii) T is the disjoint union of irreflexive trees T_1, . . . , T_n.
Proof. We first build D by applying repeatedly the operator R −1 to W until we reach the
(largest) fixpoint. More precisely, let D0 = W and Di+1 = R −1 (Di ). Clearly Di+1 ⊆ Di .

Fig. 1. Bisecting a quasitree into top-reflexive and irreflexive parts.

Since $W$ is finite, at some stage $k < \omega$ we obtain $D_{k+1} = D_k$. Set $D = D_k$ and
$\mathcal{D} = (D, R_D)$. It follows from the construction that $\mathcal{D}$ is a top-reflexive quasitree. Now set
$T = W - D$ and $\mathcal{T} = (T, R_T)$. Clearly $W = D \cup T$ and $D \cap T = \emptyset$. Moreover, $T$ consists
of irreflexive points, and so there are no nondegenerate clusters in $\mathcal{T}$. Let $r_1, \dots, r_n$ be the
minimal points of $\mathcal{T}$. We set $T_i = R(r_i) \cup \{r_i\}$ and $\mathcal{T}_i = (T_i, R_{T_i})$. Since $\mathcal{F}$ is a quasitree,
it is obvious that each $\mathcal{T}_i$ is an irreflexive tree, that $T = \bigcup_{i=1}^{n} T_i$, and that the trees $\mathcal{T}_i$ are
disjoint. Therefore, $\mathcal{T}$ is the disjoint union of irreflexive trees $\mathcal{T}_1, \dots, \mathcal{T}_n$. □
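The fixpoint construction in the proof of Lemma 5.1 can be sketched in Python for a finite frame given as a set of points and a set of pairs. This is only an illustration: the function names `r_inverse` and `bisect_quasitree` and the four-point example frame are assumptions made here, not part of the paper.

```python
def r_inverse(R, A):
    """R^{-1}(A): the points that have at least one R-successor in A."""
    return {w for (w, v) in R if v in A}


def bisect_quasitree(W, R):
    """Split a finite quasitree (W, R) into the parts D and T of Lemma 5.1."""
    D = set(W)
    while True:
        nxt = r_inverse(R, D)      # D_{i+1} = R^{-1}(D_i), a subset of D_i
        if nxt == D:               # fixpoint reached
            break
        D = nxt
    T = set(W) - D
    # Minimal points of T: no point of T sees them.
    roots = sorted(t for t in T if not any((s, t) in R for s in T))
    # T_i = R(r_i) together with r_i itself.
    trees = {r: {r} | {v for (u, v) in R if u == r} for r in roots}
    return D, T, trees


# Hypothetical example: an irreflexive root r sees a reflexive point a
# and an irreflexive chain b < c (R is transitive).
W = {"r", "a", "b", "c"}
R = {("r", "a"), ("r", "b"), ("r", "c"), ("a", "a"), ("b", "c")}
D, T, trees = bisect_quasitree(W, R)
```

In this example the iteration passes through {r, a, b} before stabilizing, so D is the top-reflexive part {r, a} and T is the single irreflexive tree {b, c}.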
The construction described in Lemma 5.1 is shown in Figure 1, where $r$ denotes a root
of $\mathcal{F}$. For each $i \le n$ let $D_i = R^{-1}(r_i)$. It is clear that $D_i \subseteq D$ and that for each $u, v \in D_i$ such
that $u \ne v$ we have $uRv$ or $vRu$. It is also clear that $R^{-1}(D) = D$ and that $D = \bigcup_{i=1}^{n} D_i$
iff each quasimaximal point of $\mathcal{F}$ is irreflexive (in which case it is a maximal point of $\mathcal{F}$).
For Step (2) we use Lemma 3.1 to obtain a d-morphism $f : C \twoheadrightarrow \mathcal{D}$. For each $i \le n$
let $C_i = f^{-1}(D_i)$. It is readily seen that each $C_i$ is a closed subspace of $C$, hence a Stone
space.
For Step (3) note that by Lemma 4.2, for each $i \le n$, there exists a limit ordinal $\omega^{k_i+1}$
and a d-morphism $g_i : \omega^{k_i+1} \twoheadrightarrow \mathcal{T}_i$.
Next we concentrate on Step (4). Fix $i \le n$ and consider the limit ordinal $\omega^{k_i+1}$ and
the Stone space $C_i \subseteq C$. By Lemma 4.4, there exists a Stone space $Y_i$ such that $Y_i$ is a
compactification of $\omega^{k_i+1}$ and the remainder $Y_i^*$ is homeomorphic to $C_i$. We look at $Y_i$ as
the disjoint union of $\omega^{k_i+1}$ and $Y_i^*$, where $\omega^{k_i+1}$ is an open dense subspace of $Y_i$, $Y_i^*$ is a
closed subspace of $Y_i$ homeomorphic to $C_i$, and $Y_i^* \subseteq \mathrm{cl}(\{\omega^{k_i} \cdot k : k < \omega\})$. The space $Y_i$
is shown in Figure 2.

Fig. 2. The compactification of ωki +1 with the remainder Ci .

To build X we employ the following lemma:


LEMMA 5.2. Let $X, Y, Z$ be Stone spaces and $i : Z \hookrightarrow X$, $j : Z \hookrightarrow Y$ continuous
injections. Then there exists a Stone space $Z'$ and closed subspaces $X'$, $Y'$ of $Z'$ such that
$X'$ is homeomorphic to $X$, $Y'$ is homeomorphic to $Y$, and $Z' = X' \cup Y'$.

Proof. This follows easily from the well-known fact that the category $\mathbf{Stone}$ of Stone
spaces and continuous maps is closed under pushouts. In fact, $Z'$ is the pushout of the
diagram $X \hookleftarrow Z \hookrightarrow Y$ in the category $\mathbf{Stone}$. More precisely, $Z'$ is the factor space of the
topological sum $X \oplus Y$ by the equivalence relation $\{(i(z), j(z)) : z \in Z\}$. □
We denote the pushout of the diagram $X \hookleftarrow Z \hookrightarrow Y$ by $X \oplus_Z Y$ and point out
that since we are working with compact Hausdorff spaces, continuous injections are in
fact topological (homeomorphic) embeddings. We consider an example which will be the
starting point in the construction of the space $X$ to follow.
Suppose we are given an ordinal $\omega^{k_1+1}$ and its compactification $Y_1$ such that $Y_1^*$ is
homeomorphic to a closed subspace $C_1$ of $C$. Then using Lemma 5.2 we can identify
the copies of $C_1$ present in both $C$ and $Y_1$ to obtain the space $X_2 = C \oplus_{C_1} Y_1$ such
that:
(a) $X_2$ is a Stone space based on the disjoint union of $\omega^{k_1+1}$ and $C$,
(b) $\omega^{k_1+1}$ is homeomorphic to an open subspace of $X_2$,
(c) $C$ is homeomorphic to a closed subspace of $X_2$, and
(d) $Y_1^* \subseteq \mathrm{cl}(\{\omega^{k_1} \cdot k : k < \omega\})$.
This situation is depicted in Figure 3 below.

Fig. 3. Glueing of Y1 and C along the shared closed subspace C1 .

Since $C$ is a closed subspace of $X_2$, and $C_2$ is a closed subspace of $C$, it follows that
$C_2$ is (homeomorphic to) a closed subspace of $X_2$. This enables us to iterate the procedure
and now adjoin $\omega^{k_2+1}$ to $X_2$ along $C_2$. A formal definition of the procedure is obtained by
putting:
\[
\begin{aligned}
X_1 &= C;\\
X_2 &= X_1 \oplus_{Y_1^*} Y_1;\\
&\ \ \vdots\\
X_{n+1} &= X_n \oplus_{Y_n^*} Y_n.
\end{aligned}
\]
By identifying each $Y_i^*$ with $C_i$, we can write each $X_{i+1}$ as $X_{i+1} = X_i \oplus_{C_i} Y_i$. Finally,
we set $X = X_{n+1}$. We clearly have that $X$ is a Stone space, which concludes our Step (4).
Pictorially $X$ can be represented as in Figure 4.
We point out that since the constructed $X$ is a metrizable Stone space, it is in fact
homeomorphic to a closed subspace of $C$ (see, e.g., Engelking, 1977, theorem 6.2.16).
Moreover, since the topological sum of $\omega^{k_1+1}, \dots, \omega^{k_n+1}$ is homeomorphic to a limit
ordinal $\omega^{k+1}$, we can think of the scattered part of $X$ as a limit ordinal.

Fig. 4. The Stone space X .

For our final Step (5), we need to construct an onto d-morphism $h : X \to \mathcal{F}$. For this we
observe that $X$ is the disjoint union of $C$ and $\omega^{k_1+1}, \dots, \omega^{k_n+1}$. Now let $x \in X$. We set
\[
h(x) =
\begin{cases}
f(x), & x \in C,\\
g_i(x), & x \in \omega^{k_i+1}.
\end{cases}
\]
That $h$ is a well-defined onto map is obvious. It is left to be shown that $h$ is a d-morphism.
We first show that the restriction of $h$ to each $Y_i$, which we denote by $h_i$, is a d-morphism.
Let $\mathcal{T}_i^+$ denote the range of $h_i$, which is a subframe of $\mathcal{F}$ based on the set $T_i \cup D_i$. Let also
$f_i$ denote the restriction of $f$ to $C_i$.
LEMMA 5.3. The map $h_i : Y_i \to \mathcal{T}_i^+$ is a d-morphism.
Proof. To see that $h_i$ is continuous, let $U$ be an upset of $\mathcal{T}_i^+$. If $U \subseteq T_i$, then $h_i^{-1}(U) =
g_i^{-1}(U)$, which is open in $Y_i$ since $g_i$ is continuous and $\omega^{k_i+1}$ is an open subset of $Y_i$. If
$U \cap D_i \ne \emptyset$, then $U = (U \cap D_i) \cup T_i$ and $h_i^{-1}(U) = f_i^{-1}(U) \cup \omega^{k_i+1}$, which is open in
$Y_i$ because $f_i^{-1}(U)$ is open in $Y_i^*$ and $\omega^{k_i+1}$ is open and dense in $Y_i$.
To see that $h_i$ is open, let $U$ be an open subset of $Y_i$. If $U \subseteq \omega^{k_i+1}$, then $h_i(U) = g_i(U)$,
which is an upset of $\mathcal{T}_i$ since $g_i$ is open. Therefore, $h_i(U)$ is also an upset of $\mathcal{T}_i^+$. Suppose
now that $U \cap Y_i^* \ne \emptyset$. Then $h_i(U) = g_i(U \cap \omega^{k_i+1}) \cup f_i(U \cap Y_i^*)$. By Lemma 4.4,
$\omega^{k_i} \cdot k \in U \cap \omega^{k_i+1}$ for some $k < \omega$; and by Lemma 4.2, $g_i(\omega^{k_i} \cdot k) = r_i$. Since $g_i$ is open,
$g_i(U \cap \omega^{k_i+1}) = T_i$. Thus, as $f_i$ is open, $T_i \cup f_i(U \cap Y_i^*)$ is an upset of $\mathcal{T}_i^+$.
That $h_i$ is r-dense is obvious because there are no reflexive points in $\mathcal{T}_i$ and $f_i$ is r-dense.
Similarly, as both $f_i$ and $g_i$ are i-discrete, it is easy to see that $h_i$ is i-discrete. Consequently,
$h_i : Y_i \to \mathcal{T}_i^+$ is a d-morphism. □
Now we show that $h$ is a d-morphism. Since $\mathcal{F}$ is finite, by Bezhanishvili et al. (2005,
corollary 2.8), it is sufficient to show that $d(h^{-1}(w)) = h^{-1}(R^{-1}(w))$ for each $w \in W$,
where $d$ denotes the derived set operator of $X$.
LEMMA 5.4. For each $w \in W$ we have $d(h^{-1}(w)) = h^{-1}(R^{-1}(w))$.
Proof. First we recall that if $Y$ is a closed subspace of $X$ and $A \subseteq Y$, then $d_X(A) =
d_Y(A)$. Now let $w \in W$. If $w \in D$, then $R^{-1}(w) \subseteq D$. Therefore, $h^{-1}(w) = f^{-1}(w)$ and
$h^{-1}(R^{-1}(w)) = f^{-1}(R^{-1}(w))$. Since $f$ is a d-morphism, we have:
\[ d_X(h^{-1}(w)) = d_C(f^{-1}(w)) = f^{-1}(R^{-1}(w)) = h^{-1}(R^{-1}(w)). \]
Next suppose that $w \in T_i$ for some $i \le n$. Then $h^{-1}(w) = h_i^{-1}(w)$ and $h^{-1}(R^{-1}(w)) =
h_i^{-1}(R^{-1}_{\mathcal{T}_i^+}(w))$. By Lemma 5.3, $h_i$ is a d-morphism. Now as $Y_i$ is a closed subspace of $X$,
we have:
\[ d_X(h^{-1}(w)) = d_{Y_i}(h_i^{-1}(w)) = h_i^{-1}(R^{-1}_{\mathcal{T}_i^+}(w)) = h^{-1}(R^{-1}(w)). \]
Thus, $d(h^{-1}(w)) = h^{-1}(R^{-1}(w))$. □


As a result, we obtain that for each quasitree F, there exists a (metrizable) Stone space X
such that F is a d-morphic image of X . In addition, if F is top-irreflexive, then X is weakly
scattered. Indeed, since h is open, continuous, and i-discrete, the h inverse image of the
set of maximal points of F is precisely the set of isolated points of X , which is dense in
X because the set of maximal points of F is dense in the Alexandroff topology of F. This
immediately leads to our main theorem.
THEOREM 5.5. K4 is the modal logic of Stone spaces and K4G is the modal logic
of weakly scattered Stone spaces.
Proof. That K4 is sound with respect to the class of Stone spaces follows from Proposition
2.2(1) because each Stone space is Hausdorff, hence a $T_D$-space. To prove completeness,
let K4 $\nvdash \varphi$. By Theorem 4.1(1), K4 is complete with respect to finite quasitrees.
Therefore, there exists a finite quasitree $\mathcal{F}$ such that $\mathcal{F} \not\models \varphi$. By our construction, there
exists a Stone space $X$ and an onto d-morphism $f : X \twoheadrightarrow \mathcal{F}$. By Bezhanishvili et al. (2005,
corollary 2.9), onto d-morphisms preserve validity of formulas. Therefore, $X \not\models \varphi$, and so
K4 is sound and complete with respect to the class of Stone spaces.
That K4G is sound with respect to the class of weakly scattered Stone spaces follows
from Proposition 2.2(1) and an easily verifiable fact that if $X$ is weakly scattered, then
$X \models G$. To prove completeness, let K4G $\nvdash \varphi$. By Theorem 4.1(4), K4G is complete with
respect to finite top-irreflexive quasitrees. Therefore, there exists a finite top-irreflexive
quasitree $\mathcal{F}$ such that $\mathcal{F} \not\models \varphi$. By our construction, there exists a weakly scattered Stone
space $X$ and an onto d-morphism $f : X \twoheadrightarrow \mathcal{F}$. Now since onto d-morphisms preserve
validity of formulas, we obtain $X \not\models \varphi$, and so K4G is sound and complete with respect to
the class of weakly scattered Stone spaces. □
Since the class of Stone spaces is contained in the class of compact Hausdorff spaces
and the class of weakly scattered Stone spaces is contained in the class of weakly scattered
compact Hausdorff spaces, the following is an immediate corollary to Theorem 5.5:
COROLLARY 5.6. K4 is the modal logic of compact Hausdorff spaces and K4G is the
modal logic of weakly scattered compact Hausdorff spaces.

§6. Acknowledgment. The second and third authors were partially supported by the
Georgian National Science Foundation Grant GNSF/ST06/3-017.

BIBLIOGRAPHY

Abashidze, M. (1988). Ordinal completeness of the Gödel-Löb modal system. In
Intensional logics and the logical structure of theories, ed. V. Smirnov and M.
Bezhanishvili (In Russian) (Telavi, 1985). Tbilisi, Georgia: Metsniereba, pp. 49–73.
Aiello, M., van Benthem, J., & Bezhanishvili, G. (2003). Reasoning about space: The
modal way. Journal of Logic and Computation, 13(6), 889–920.
Bezhanishvili, G., Esakia, L., & Gabelaia, D. (2005). Some results on modal
axiomatization and definability for topological spaces. Studia Logica, 81(3), 325–355.

Bezhanishvili, G., & Morandi, P. J. (2010). Scattered and hereditarily irresolvable spaces
in modal logic. Archive for Mathematical Logic.
Chagrov, A., & Zakharyaschev, M. (1997). Modal Logic, volume 35 of Oxford Logic
Guides. New York, NY: The Clarendon Press, Oxford University Press.
Engelking, R. (1977). General Topology. Warsaw, Poland: PWN—Polish Scientific
Publishers.
Esakia, L. (1981). Diagonal constructions, Löb’s formula and Cantor’s scattered spaces.
In Studies in logic and semantics, ed. Z. Mikeladze (In Russian). Tbilisi, Georgia:
Metsniereba, pp. 128–143.
Esakia, L. (2002). A modal version of Gödel’s second incompleteness theorem, and the
McKinsey system. In Logical Investigations, No. 9, ed. A. Karpenko (In Russian).
Moscow, Russia: Nauka, pp. 292–300.
Esakia, L. (2004). Intuitionistic logic and modality via topology. Annals of Pure and
Applied Logic, 127(1–3), 155–170.
Gabelaia, D. (2004). Topological semantics and two-dimensional combinations of modal
logics. PhD Thesis, King’s College, London.
McKinsey, J. C. C., & Tarski, A. (1944). The algebra of topology. Annals of Mathematics,
45, 141–191.
Shehtman, V. (1990). Derived sets in Euclidean spaces and modal logic. Preprint X-90-05,
University of Amsterdam.
Terasawa, J. (1997). Metrizable compactification of ω is unique. Topology and its
Applications, 76, 189–191.

DEPARTMENT OF MATHEMATICAL SCIENCES


NEW MEXICO STATE UNIVERSITY
LAS CRUCES, NM 88003
E-mail: gbezhani@nmsu.edu

DEPARTMENT OF MATHEMATICAL LOGIC


A. RAZMADZE MATHEMATICAL INSTITUTE
M. ALEKSIDZE STR. 1
TBILISI 0193, GEORGIA
E-mail: esakia@hotmail.com

DEPARTMENT OF MATHEMATICAL LOGIC


A. RAZMADZE MATHEMATICAL INSTITUTE
M. ALEKSIDZE STR. 1
TBILISI 0193, GEORGIA
E-mail: gabelaia@gmail.com
THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010

RELEVANCE LOGIC AND THE CALCULUS OF RELATIONS


ROGER D. MADDUX
Department of Mathematics, Iowa State University

Abstract. Sound and complete semantics for classical propositional logic can be obtained by
interpreting sentences as sets. Replacing sets with commuting dense binary relations produces an
interpretation that turns out to be sound but not complete for R. Adding transitivity yields sound and
complete semantics for RM, because all normal Sugihara matrices are representable as algebras of
binary relations.

§1. Introduction. One way to get sound and complete semantics for classical propo-
sitional logic is to evaluate each variable as one of two truth values, and extend this
valuation to more complicated sentences by the classical truth tables. Another way to
get sound and complete semantics for classical propositional logic is to evaluate each
variable as a subset of a fixed universe of discourse. For complex sentences, interpret
conjunction as intersection, disjunction as union, negation as complementation, and so
on. These two methods are essentially the same, but the second one provides an obvious
generalization: replace “set” with “binary relation.” This approach was taken by Tarski,
who produced an undecidable propositional calculus by early 1942;
see Tarski & Givant (1987, sec. 5.4, sec. 5.5, fn. 3*). Tarski’s operations include Boolean
intersection ∩, union ∪, and complementation $\overline{\phantom{A}}$, relative (or Peircean) multiplication | and
addition †, conversion $^{-1}$, and an identity relation.
Relevance logic arose in the 1950s and 1960s from attempts to axiomatize the notion that
an implication A → B should be regarded as true only if the hypothesis A is “relevant”
to the conclusion B. The earliest systems were proposed by Orlov in 1928 (Došen, 1992),
and by Moh (1950), Church (1951), and Ackermann (1956) in the 1950s. Semantics
were introduced and developed only much later, in the 1970s; see Routley & Routley
(1972), Routley & Meyer (1972a, 1972b), Urquhart (1972), Routley & Meyer (1973),
Fine (1974), Meyer & Routley (1973), Anderson & Belnap (1975), Routley et al. (1982),
Anderson et al. (1992), and Brady (2003).
The calculus of relations was created by De Morgan (1856, 1864a, 1864b, 1966)
and Peirce (1870, 1880, 1883, 1885, 1897, 1960, 1984), and was extensively developed
by Schröder (1966). Relation algebras arose from Tarski’s axiomatization of the calculus
of relations; see Tarski (1941), Tarski & Givant (1987), and Maddux (1991). Tarski’s un-
decidable propositional calculus is equivalent to the equational theory of relation algebras.
The Routley–Meyer semantics for relevance logic and the theory of relation algebras
have a significant class of structures in common. A structure is in this class if it is simul-
taneously the atom structure of a relation algebra and a normal relevant model structure.

Received: February 8, 2009


© Association for Symbolic Logic, 2010
doi:10.1017/S1755020309990293

Prominent examples of these are the ones constructed by Lyndon (1961) from projec-
tive planes. This connection is the key to deep undecidability results in both subjects;
see Andréka et al. (1997) and Urquhart (1984).
This confluence makes it possible to think of propositional variables, sentences, and
worlds in a relevant model structure as binary relations. The connectives of relevance logic
are then certain operations on binary relations determined by the Routley–Meyer seman-
tics. For example, negation ∼ turns out to be converse complementation while fusion ◦
is simply composition. The constants of relevance logic will not be considered here
because they are the source of some difficulties; see Routley et al. (1982, p. 348), Bimbo
et al. (2009).
In Section 2 we present axioms and rules of deduction for relevance logic, and focus
attention on two prominent systems, R and RM. Sections 3 and 4 introduce relational
relevance algebras and give two examples, due to Belnap and Meyer. Soundness for the
interpretation of sentences as binary relations is shown in Section 5. In Section 6 we
prove that RM is a complete axiomatization of the logic of transitive commutative dense
relational relevance algebras, while in Sections 7 and 8 we show that R is an incomplete
axiomatization of the logic of commutative dense relational relevance algebras. Some
closing remarks are made in Section 9.
For discussions and communications about these topics, thanks to K. Bimbó, J. M. Dunn,
N. Galatos, R. Hirsch, I. Hodkinson, P. Jipsen, T. Kowalski, R. L. Kramer, D. McCarty,
R. K. Meyer, S. Mikulás, L. Moss, A. Urquhart, and the referee.

§2. Systems of relevance logic. Let Pv be a countable set whose elements are called
propositional variables. There are five connectives, ∨, ∧, ◦, →, and ∼. For any $C \subseteq
\{\lor, \land, \circ, \to, \sim\}$, the set $\mathrm{Sent}_C$ of C-sentences is the closure of the variables under
application of the connectives in C. Let $\mathrm{Sent} := \mathrm{Sent}_{\{\lor,\land,\circ,\to,\sim\}}$. The connectives are
operations on Sent which act in the way required of a language, that is, $\langle \mathrm{Sent}, \lor, \land, \circ, \to, \sim \rangle$ is
an algebra of type $\langle 2, 2, 2, 2, 1 \rangle$ (four binary operations and one unary operation) which is
absolutely freely generated by Pv. This means that $\langle \mathrm{Sent}, \lor, \land, \circ, \to, \sim \rangle$ is generated by
Pv and any function from Pv to an algebra $\mathfrak{R}$ of type $\langle 2, 2, 2, 2, 1 \rangle$ has a unique extension
to a homomorphism from $\langle \mathrm{Sent}, \lor, \land, \circ, \to, \sim \rangle$ into $\mathfrak{R}$.
A sentence S ∈ Sent is an axiom of R if there are sentences A, B, C ∈ Sent such
that S is one of the sentences (A1)–(A31) listed below, and S is an axiom of RM if S is one
of (A1)–(A33). The axioms of Routley & Meyer (1973, pp. 204, 224) are A1–A15; where a
sentence below coincides with one of them, its Routley–Meyer label is shown in the middle column.
A → A                                              A1     (A1)
A ∧ B → A                                          A5     (A2)
A ∧ B → B                                          A6     (A3)
((A → B) ∧ (A → C)) → (A → (B ∧ C))                A7     (A4)
A → A ∨ B                                          A8     (A5)
B → A ∨ B                                          A9     (A6)
((A → C) ∧ (B → C)) → ((A ∨ B) → C)                A10    (A7)
A ∧ (B ∨ C) → (A ∧ B) ∨ (A ∧ C)                    A11    (A8)
∼∼A → A                                            A13    (A9)
∼(A ∨ B) → (∼A ∧ ∼B)                                      (A10)
(∼A ∧ ∼B) → ∼(A ∨ B)                                      (A11)
(A → B) → ((C → A) → (C → B))                             (A12)
((A → A) → B) → B                                         (A13)
A → ((∼B → ∼A) → B)                                       (A14)
A → (∼B → ∼(A → B))                                       (A15)
(A → (B → C)) → (∼(A → ∼B) → C)                           (A16)
A ◦ B → ∼(A → ∼B)                                         (A17)
∼(A → ∼B) → A ◦ B                                         (A18)
A → (B → (A ◦ B))                                  A14    (A19)
(A → (B → C)) → ((A ◦ B) → C)                      A15    (A20)
(A → (B → C)) → (B → (A → C))                             (A21)
(A → ∼B) → (B → ∼A)                                A12    (A22)
A → ((A → B) → B)                                  A2     (A23)
(A → B) → ((B → C) → (A → C))                      A3     (A24)
(A → (A → B)) → (A → B)                            A4     (A25)
(A → ∼A) → ∼A                                             (A26)
(A → (B → C)) → ((A ∧ B) → C)                             (A27)
(A → B) → (∼A ∨ B)                                        (A28)
(A ∧ (A → B)) → B                                         (A29)
((A → B) ∧ (B → C)) → (A → C)                             (A30)
(A → (B → C)) → ((A → B) → (A → C))                       (A31)
A → (A → A)                                               (A32)
(A → B) → (A → (A → B))                                   (A33)

Among the following rules of deduction, only modus ponens and Adjunction are used in
R and RM. The rules used in the Basic Logic of Routley et al. (1982, p. 287) are modus
ponens, Adjunction, Suffixing, Prefixing, and Contraposition.

A, A → B ⊢ B                          modus ponens
A, B ⊢ A ∧ B                          Adjunction
A → ∼B ⊢ B → ∼A                       Contraposition
∼A, A ∨ B ⊢ B                         Disjunctive Syllogism
A ∧ B → C, B → C ∨ A ⊢ B → C          Cut
A → B ⊢ (C → A) → (C → B)             Prefixing
A → B ⊢ (B → C) → (A → C)             Suffixing
A → (B → C) ⊢ B → (∼C → ∼A)           Cycling
A ⊢ (A → B) → B                       E-rule (Brady, 2003, p. 8)
For any A ∈ Sent, we write $\vdash_{\mathbf{R}} A$ (or $\vdash_{\mathbf{RM}} A$) if A belongs to every subset of Sent that
contains the axioms of R (or RM) and is closed under modus ponens and Adjunction. This
axiomatization of R is highly redundant but provides more input for semantic analysis
in Theorem 5.1. Routley & Meyer (1973) use only A1–A15. Furthermore, R is well-
axiomatized in the following sense.
THEOREM 2.1 (Routley & Meyer, 1973, theorem 7). Let C be one of the following sets
of connectives:
{→}, {→, ∼}, {→, ◦}, {→, ∼, ◦}, {→, ∧}, {→, ∨, ∧},
{→, ◦, ∧}, {→, ◦, ∧, ∨}, {→, ∼, ∧, ∨}, {→, ∼, ◦, ∧, ∨}.
If $A \in \mathrm{Sent}_C$, then $\vdash_{\mathbf{R}} A$ iff A is derivable, using only modus ponens and Adjunction, from
those axioms among A1–A15 that explicitly contain connectives in C.

§3. Relational relevance algebras. Binary relations are, by definition, sets of ordered
pairs. For arbitrary binary relations A and B, their union, intersection, difference, converse,
and relative product are defined as follows.
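\[ A \cup B := \{\langle x, y\rangle : \langle x, y\rangle \in A \text{ or } \langle x, y\rangle \in B\} \tag{1} \]
\[ A \cap B := \{\langle x, y\rangle : \langle x, y\rangle \in A \text{ and } \langle x, y\rangle \in B\} \tag{2} \]
\[ A - B := \{\langle x, y\rangle : \langle x, y\rangle \in A \text{ and } \langle x, y\rangle \notin B\} \tag{3} \]
\[ A^{-1} := \{\langle x, y\rangle : \langle y, x\rangle \in A\} \tag{4} \]
\[ A|B := \{\langle x, y\rangle : \exists z\,(\langle x, z\rangle \in A \text{ and } \langle z, y\rangle \in B)\}. \tag{5} \]
Let $U$ be a nonempty set. $U^2 = \{\langle x, y\rangle : x, y \in U\}$ is the set of ordered pairs of elements
of $U$. $\mathrm{Sb}(U^2)$ is the set of subsets of $U^2$, and is called the set of binary relations on $U$.
The identity and diversity relations on $U$ are
\[ \mathrm{Id} := \{\langle x, x\rangle : x \in U\} \tag{6} \]
\[ \mathrm{Di} := \{\langle x, y\rangle : x, y \in U \text{ and } x \ne y\}. \tag{7} \]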


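For any binary relations $A, B \subseteq U^2$, set
\[ \overline{A} := U^2 - A \quad \text{(Boolean complement)} \tag{8} \]
\[ {\sim}A := U^2 - A^{-1} \quad \text{(De Morgan complement)} \tag{9} \]
\[ A \circ B := B|A \quad \text{(composition)} \tag{10} \]
\[ A \dagger B := {\sim}({\sim}A \circ {\sim}B) \quad \text{(relative sum)} \tag{11} \]
\[ A \to B := {\sim}(A \circ {\sim}B) \quad \text{(residual).} \tag{12} \]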
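Alternate characterizations, obtained by unwinding definitions, are
\[ \overline{A} = \{\langle x, y\rangle : x, y \in U \text{ and } \langle x, y\rangle \notin A\} \]
\[ {\sim}A = \{\langle x, y\rangle : x, y \in U \text{ and } \langle y, x\rangle \notin A\} \]
\[ A \dagger B = \{\langle x, y\rangle : x, y \in U \text{ and } \forall z \in U\,(\langle x, z\rangle \in A \text{ or } \langle z, y\rangle \in B)\} \]
\[ A \to B = \{\langle x, y\rangle : x, y \in U \text{ and } \forall z \in U\,(\text{if } \langle z, x\rangle \in A \text{ then } \langle z, y\rangle \in B)\}. \]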
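To make the definitions concrete, here is a minimal Python sketch in which relations on a small set are frozensets of pairs. The function names, the three-element set U, and the sample relations A and B are assumptions chosen for illustration. The residual is computed as ∼(A ◦ ∼B), the form that appears among the identities listed later in this section, and is compared with the pointwise characterization given above.

```python
from itertools import product

U = (0, 1, 2)                       # hypothetical small universe
FULL = frozenset(product(U, U))     # U^2


def converse(A):
    return frozenset((y, x) for (x, y) in A)


def rel_product(A, B):
    """A|B: pairs (x, y) with some z such that (x, z) in A and (z, y) in B."""
    return frozenset((x, y) for (x, z) in A for (w, y) in B if z == w)


def compose(A, B):
    return rel_product(B, A)                # (10): A ∘ B := B|A


def neg(A):
    return FULL - converse(A)               # (9): De Morgan complement


def rel_sum(A, B):
    return neg(compose(neg(A), neg(B)))     # (11): relative sum


def arrow(A, B):
    return neg(compose(A, neg(B)))          # residual as ∼(A ∘ ∼B)


def arrow_direct(A, B):
    """Pointwise form: for all z, (z, x) in A implies (z, y) in B."""
    return frozenset((x, y) for (x, y) in FULL
                     if all((z, y) in B for z in U if (z, x) in A))


def rel_sum_direct(A, B):
    """Pointwise form: for all z, (x, z) in A or (z, y) in B."""
    return frozenset((x, y) for (x, y) in FULL
                     if all((x, z) in A or (z, y) in B for z in U))


A = frozenset({(0, 1), (1, 2)})
B = frozenset({(0, 2), (1, 1), (2, 0)})
same_arrow = arrow(A, B) == arrow_direct(A, B)
same_sum = rel_sum(A, B) == rel_sum_direct(A, B)
```

Both comparisons come out true for this pair of relations; the check is a spot test on one example, not a proof.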


A relational relevance algebra on a nonempty set $U$ is an algebra
\[ \mathfrak{R} = \langle R, \cup, \cap, \circ, \to, \sim \rangle \tag{13} \]
of type $\langle 2, 2, 2, 2, 1 \rangle$ such that $R$ is a nonempty set of relations on $U$, and $R$ is closed
under the operations ∪, ∩, ◦, →, and ∼. For example, choosing $R = \mathrm{Sb}(U^2)$ produces
the relational relevance algebra of all binary relations on the set $U$,
\[ \mathfrak{Rel}(U) := \langle \mathrm{Sb}(U^2), \cup, \cap, \circ, \to, \sim \rangle. \]
Relational relevance algebras lack the constants of relevant algebras (Urquhart, 1996)
or De Morgan monoids (Anderson & Belnap, 1975), but they do satisfy many equations
not involving constants that have been used in the definitions of these and other algebras
designed for relevance logic. For example, if $\mathfrak{R} = \langle R, \cup, \cap, \circ, \to, \sim \rangle$ is a relational
relevance algebra, then $\langle R, \cup, \cap \rangle$ is a distributive lattice, $\langle R, \circ \rangle$ is a semigroup, and many
other equations and inclusions hold for all A, B, C ∈ R, such as
A ◦ (B ∪ C) = (A ◦ B) ∪ (A ◦ C)

(B ∪ C) ◦ A = (B ◦ A) ∪ (C ◦ A)
A → (B ∩ C) = (A → B) ∩ (A → C)

(A ∪ B) → C = (A → C) ∩ (B → C)

∼(∼A) = A

∼(A ∪ B) = ∼ A ∩ ∼B
∼(A ∩ B) = ∼ A ∪ ∼B
A ◦ B = ∼(A → ∼B)

A → B = ∼(A ◦ ∼B)

A → (B → C) = A ◦ B → C
(A → B) ◦ A ⊆ B
A → B ⊆ (C → A) → (C → B).
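Identities of this kind can be spot-checked by brute force on a small universe. The sketch below, with names chosen here for illustration, enumerates all 16 binary relations on a two-element set and tests a selection of the listed equations and inclusions for every triple A, B, C. It is a sanity check on one small algebra and does not replace the general proofs.

```python
from itertools import combinations, product

U = (0, 1)
PAIRS = tuple(product(U, U))
FULL = frozenset(PAIRS)
# All 16 binary relations on the two-element set U.
RELS = [frozenset(c) for n in range(len(PAIRS) + 1)
        for c in combinations(PAIRS, n)]


def compose(A, B):
    """A ∘ B := B|A (a B-step followed by an A-step)."""
    return frozenset((x, y) for (x, z) in B for (w, y) in A if z == w)


def neg(A):
    """De Morgan complement: U^2 minus the converse of A."""
    return FULL - frozenset((y, x) for (x, y) in A)


def arrow(A, B):
    """Residual, computed as ∼(A ∘ ∼B)."""
    return neg(compose(A, neg(B)))


def identities_hold(A, B, C):
    return (compose(A, B | C) == compose(A, B) | compose(A, C)
            and compose(B | C, A) == compose(B, A) | compose(C, A)
            and arrow(A, B & C) == arrow(A, B) & arrow(A, C)
            and arrow(A | B, C) == arrow(A, C) & arrow(B, C)
            and neg(neg(A)) == A
            and neg(A | B) == neg(A) & neg(B)
            and compose(A, B) == neg(arrow(A, neg(B)))
            and arrow(A, arrow(B, C)) == arrow(compose(A, B), C)
            and compose(arrow(A, B), A) <= B
            and arrow(A, B) <= arrow(arrow(C, A), arrow(C, B)))


all_hold = all(identities_hold(A, B, C)
               for A in RELS for B in RELS for C in RELS)
```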

If $R$ is a set of relations closed under composition, we say that $R$ is commutative if
$A \circ B = B \circ A$ for every $A, B \in R$. A relational relevance algebra $\mathfrak{R}$ is commutative if its universe
$R$ is commutative. For example, $\mathfrak{Rel}(U)$ is commutative iff $|U| = 1$. In a commutative
relational relevance algebra, A → ∼B = B → ∼A and A → B ⊆ (B → C) →
(A → C).
A binary relation A is dense if A ⊆ A ◦ A, transitive if A ◦ A ⊆ A, and symmetric if
A = A−1 . We say that a relational relevance algebra R is dense, transitive, or symmetric
if every relation in R is dense, transitive, or symmetric, respectively. Let R, Rcd , and Rcdt
be the classes of relational relevance algebras, commutative dense relational relevance al-
gebras, and commutative dense transitive relational relevance algebras, respectively. Define
Rc , Rd , Rdt , Rct , and Rt similarly. For any class S of algebras let IS be the class of algebras
isomorphic to algebras in S.
Symmetry is preserved by ∼, for if A is a symmetric relation then ∼A is also symmetric.
However, transitivity is not preserved by ∼ because, whenever |U| ≥ 2, Id is transitive
but ∼Id is not transitive. It follows that no transitive relational relevance algebra on a
set U with at least two elements contains the identity relation on U. The identity relation Id is always dense,
but ∼Id is not dense whenever |U | = 2. This is one of the reasons for not requiring Id to
belong to a relational relevance algebra.
No relational relevance algebra has any ∼-fixed points, for if $A \subseteq U^2$ and $A = {\sim}A$, then
$\langle x, y\rangle \in A$ iff $\langle y, x\rangle \notin A$ for all $x, y \in U$, hence $\langle x, x\rangle \in A$ iff $\langle x, x\rangle \notin A$, a contradiction.
A relational relevance algebra generated by a commutative set of relations may not be
commutative. For example, if $U = \{0, 1\}$, $B = \emptyset$, and $A = \{\langle 0, 1\rangle\}$, then ${\sim}B = U^2$ and
$A \circ B = B \circ A = \emptyset$, so $\{A, B\}$ is commutative, but
\[ \{0\} \times U = U^2 \circ A = {\sim}B \circ A \ne A \circ {\sim}B = A \circ U^2 = U \times \{1\}, \]
so $\{A, {\sim}B\}$ is not commutative.
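The two-element counterexample can be replayed directly. In this minimal sketch relations are frozensets of pairs, and the helper names `compose` and `neg` are assumptions introduced for illustration.

```python
from itertools import product

U = (0, 1)
FULL = frozenset(product(U, U))


def compose(A, B):
    """A ∘ B := B|A (a B-step followed by an A-step)."""
    return frozenset((x, y) for (x, z) in B for (w, y) in A if z == w)


def neg(A):
    """De Morgan complement: U^2 minus the converse of A."""
    return FULL - frozenset((y, x) for (x, y) in A)


A = frozenset({(0, 1)})
B = frozenset()

left = compose(neg(B), A)     # ∼B ∘ A = U^2 ∘ A
right = compose(A, neg(B))    # A ∘ ∼B = A ∘ U^2
```

Here `left` is {0} × U and `right` is U × {1}, so A and ∼B do not commute even though A and B do.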
Suppose $\mathfrak{R}$ is a relational relevance algebra on $U$ and Id is the identity relation on $U$.
We say that a sentence A ∈ Sent is valid in $\mathfrak{R}$, and write $\mathfrak{R} \models A$, if $\mathrm{Id} \subseteq H(A)$ for every
homomorphism $H$ from $\langle \mathrm{Sent}, \lor, \land, \circ, \to, \sim \rangle$ into $\mathfrak{R}$. For any class $\mathsf{S} \subseteq \mathsf{R}$ of relational
relevance algebras, A is valid in $\mathsf{S}$ if A is valid in every algebra in $\mathsf{S}$, and $\mathsf{S}$-logic is the set
of sentences valid in $\mathsf{S}$. The notion of validity applies to isomorphic copies of relational
relevance algebras in the obvious way, so $\mathsf{S}$-logic is the same as $\mathsf{IS}$-logic.
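As an illustration of this definition, the sketch below checks validity in the single algebra of all binary relations on a two-element set. Sentences are encoded as nested tuples such as `("imp", "A", "B")`; this encoding, the operator tags, and the function names are assumptions made for the example. A valuation of the variables is extended homomorphically by `evaluate`, and a sentence counts as valid when Id is contained in its value under every valuation.

```python
from itertools import combinations, product

U = (0, 1)
PAIRS = tuple(product(U, U))
FULL = frozenset(PAIRS)
ID = frozenset((x, x) for x in U)
RELS = [frozenset(c) for n in range(len(PAIRS) + 1)
        for c in combinations(PAIRS, n)]


def compose(A, B):
    return frozenset((x, y) for (x, z) in B for (w, y) in A if z == w)


def neg(A):
    return FULL - frozenset((y, x) for (x, y) in A)


def arrow(A, B):
    return neg(compose(A, neg(B)))


def evaluate(s, val):
    """Homomorphic extension of the valuation val to the sentence s."""
    if isinstance(s, str):
        return val[s]
    op, *args = s
    vals = [evaluate(a, val) for a in args]
    if op == "or":
        return vals[0] | vals[1]
    if op == "and":
        return vals[0] & vals[1]
    if op == "fuse":
        return compose(vals[0], vals[1])
    if op == "imp":
        return arrow(vals[0], vals[1])
    if op == "not":
        return neg(vals[0])
    raise ValueError(op)


def variables(s):
    if isinstance(s, str):
        return {s}
    return set().union(*(variables(a) for a in s[1:]))


def valid_in_rel(s):
    """True if Id is contained in H(s) for every valuation into Rel(U)."""
    vs = sorted(variables(s))
    return all(ID <= evaluate(s, dict(zip(vs, choice)))
               for choice in product(RELS, repeat=len(vs)))


identity_ok = valid_in_rel(("imp", "A", "A"))
weakening_ok = valid_in_rel(("imp", "A", ("imp", "B", "A")))
```

On this algebra `identity_ok` is true while `weakening_ok` is false: taking A to be Id and B to be U² already falsifies A → (B → A).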

§4. Relational relevance algebras of Belnap and Meyer. In this section we give two
useful examples of relational relevance algebras, one on an infinite set, and one on a finite
set. For these examples we first define two closely related finite algebras, Belnap’s M0 and
Meyer’s RM84. They can be defined together as follows.
(i) Both $M_0$ and RM84 are algebras of the form $\langle S_3, \lor, \land, \circ, \to, \sim \rangle$ where
\[ S_3 := \{-3, -2, -1, -0, +0, +1, +2, +3\}, \]
the set of designated values is {+0, +1, +2, +3}, and a sentence A is valid in
the algebra if every homomorphism from the algebra of sentences carries A to a
designated value.
(ii) For both algebras the reduct $\langle S_3, \lor, \land \rangle$ is the lattice of a Boolean algebra whose
atoms are −1, +0, and −2, whose top element is +3 and bottom element is −3,
satisfying these equations: −1 ∨ +0 = +1, −1 ∨ −2 = −0, and +0 ∨ −2 = +2. For
tables see Belnap (1960, p. 145), Anderson & Belnap (1975, p. 252), Routley et al.
(1982, pp. 178, 253), and Brady (2003, p. 101); for Hasse diagrams see Anderson
& Belnap (1975, pp. 198, 252), Routley et al. (1982, p. 178), and Brady (2003,
p. 102).
(iii) In both algebras the operation ∼ takes −i to +i and +i to −i for every i ∈
{0, 1, 2, 3}.
(iv) The operation → in Belnap’s $M_0$ is defined in Table 1 (Belnap, 1960, p. 145;
Anderson & Belnap, 1975, p. 253; Brady, 2003, p. 101).
(v) The operation → in Meyer’s RM84 is defined in Table 2 (Anderson & Belnap,
1975, p. 334; Routley et al., 1982, p. 253).
(vi) In both algebras the operation ◦ is defined by $x \circ y = {\sim}(x \to {\sim}y)$; see Tables 3
and 4.
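Items (iii), (iv), and (vi) can be combined mechanically. The following sketch transcribes the rows of Table 1 (with ASCII "-" in place of the typographic minus sign), applies x ◦ y = ∼(x → ∼y), and regenerates the rows of the fusion table for M0. The dictionary layout and function names are assumptions made for this illustration.

```python
VALUES = ["-3", "-2", "-1", "-0", "+0", "+1", "+2", "+3"]

# Rows of Table 1 (the operation → in M0); columns follow VALUES.
TABLE1 = {
    "-3": "+3 +3 +3 +3 +3 +3 +3 +3",
    "-2": "-3 +2 -3 +2 -3 -3 +2 +3",
    "-1": "-3 -3 +1 +1 -3 +1 -3 +3",
    "-0": "-3 -3 -3 +0 -3 -3 -3 +3",
    "+0": "-3 -2 -1 -0 +0 +1 +2 +3",
    "+1": "-3 -3 -1 -1 -3 +1 -3 +3",
    "+2": "-3 -2 -3 -2 -3 -3 +2 +3",
    "+3": "-3 -3 -3 -3 -3 -3 -3 +3",
}
ARROW = {(x, y): v
         for x, row in TABLE1.items()
         for y, v in zip(VALUES, row.split())}


def neg(x):
    """Item (iii): ∼ swaps the sign and keeps the digit."""
    return ("+" if x[0] == "-" else "-") + x[1]


def fuse(x, y):
    """Item (vi): x ∘ y = ∼(x → ∼y)."""
    return neg(ARROW[(x, neg(y))])


fusion_rows = {x: " ".join(fuse(x, y) for y in VALUES) for x in VALUES}
commutative = all(fuse(x, y) == fuse(y, x) for x in VALUES for y in VALUES)
```

The regenerated rows agree with Table 3; for instance the row for −2 comes out as −3 −2 +3 +3 −2 +3 −2 +3, and the derived operation is commutative.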

In the next two theorems we show that M0 and RM84 are isomorphic to algebras in Rcd ,
and hence belong to IRcd . Theorem 4.1 was announced in the abstract (Maddux, 2007) and
noted again in Bimbo et al. (2009), while Theorem 4.2 is new.
THEOREM 4.1. Belnap’s $M_0$ is isomorphic to a commutative dense relational relevance
algebra on a countable set, so
\[ M_0 \in \mathsf{IR}_{cd}. \tag{14} \]
Proof. Let $\mathbb{Q}$ be the set of rational numbers. Define a map $\rho$ from the universe $S_3$ of $M_0$
into the set of binary relations on $\mathbb{Q}$, as follows.

Table 1. Operation → in M0
→ −3 −2 −1 −0 +0 +1 +2 +3
−3 +3 +3 +3 +3 +3 +3 +3 +3
−2 −3 +2 −3 +2 −3 −3 +2 +3
−1 −3 −3 +1 +1 −3 +1 −3 +3
−0 −3 −3 −3 +0 −3 −3 −3 +3
+0 −3 −2 −1 −0 +0 +1 +2 +3
+1 −3 −3 −1 −1 −3 +1 −3 +3
+2 −3 −2 −3 −2 −3 −3 +2 +3
+3 −3 −3 −3 −3 −3 −3 −3 +3

Table 2. Operation → in RM84


→ −3 −2 −1 −0 +0 +1 +2 +3
−3 +3 +3 +3 +3 +3 +3 +3 +3
−2 −3 +0 −3 +2 −3 −3 +0 +3
−1 −3 −3 +0 +1 −3 +0 −3 +3
−0 −3 −3 −3 +0 −3 −3 −3 +3
+0 −3 −2 −1 −0 +0 +1 +2 +3
+1 −3 −3 −3 −1 −3 +0 −3 +3
+2 −3 −3 −3 −2 −3 −3 +0 +3
+3 −3 −3 −3 −3 −3 −3 −3 +3

Table 3. Operation ◦ in M0
◦ −3 −2 −1 −0 +0 +1 +2 +3
−3 −3 −3 −3 −3 −3 −3 −3 −3
−2 −3 −2 +3 +3 −2 +3 −2 +3
−1 −3 +3 −1 +3 −1 −1 +3 +3
−0 −3 +3 +3 +3 −0 +3 +3 +3
+0 −3 −2 −1 −0 +0 +1 +2 +3
+1 −3 +3 −1 +3 +1 +1 +3 +3
+2 −3 −2 +3 +3 +2 +3 +2 +3
+3 −3 +3 +3 +3 +3 +3 +3 +3

Table 4. Operation ◦ in RM84


◦ −3 −2 −1 −0 +0 +1 +2 +3
−3 −3 −3 −3 −3 −3 −3 −3 −3
−2 −3 −0 +3 +3 −2 +3 −0 +3
−1 −3 +3 −0 +3 −1 −0 +3 +3
−0 −3 +3 +3 +3 −0 +3 +3 +3
+0 −3 −2 −1 −0 +0 +1 +2 +3
+1 −3 +3 −0 +3 +1 +3 +3 +3
+2 −3 −0 +3 +3 +2 +3 +3 +3
+3 −3 +3 +3 +3 +3 +3 +3 +3

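\[
\begin{aligned}
\rho(-3) &:= \emptyset,\\
\rho(-2) &:= \{\langle x, y\rangle : x \in \mathbb{Q},\ x > y \in \mathbb{Q}\},\\
\rho(-1) &:= \{\langle x, y\rangle : x \in \mathbb{Q},\ x < y \in \mathbb{Q}\},\\
\rho(-0) &:= \rho(-1) \cup \rho(-2),\\
\rho(+0) &:= \mathrm{Id} := \{\langle x, x\rangle : x \in \mathbb{Q}\},\\
\rho(+1) &:= \rho(-1) \cup \rho(+0),\\
\rho(+2) &:= \rho(-2) \cup \rho(+0),\\
\rho(+3) &:= \mathbb{Q}^2.
\end{aligned}
\]
Then $\rho(S_3)$ is closed under ∪, ∩, ◦, →, and ∼, $\langle \rho(S_3), \cup, \cap, \circ, \to, \sim \rangle$ is a commutative
dense relational relevance algebra, $\rho$ is an isomorphism, and
\[ M_0 \cong \langle \rho(S_3), \cup, \cap, \circ, \to, \sim \rangle \in \mathsf{R}_{cd}. \]
□
For every $i \in S_3$, $i$ is a designated value iff $\mathrm{Id} \subseteq \rho(i)$. Therefore a sentence A is valid
in $M_0$ according to its definition as an algebra with designated values iff A is valid in $M_0$
as a relational relevance algebra. The same is true for RM84.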

THEOREM 4.2. Meyer’s RM84 is isomorphic to a commutative dense relational relevance
algebra on a seven-element set, so
\[ \mathrm{RM84} \in \mathsf{IR}_{cd}. \tag{15} \]
Proof. Define a map $\rho$ from the universe $S_3$ of RM84 to the set of binary relations on
$U := \{0, 1, 2, 3, 4, 5, 6\}$, where “$+_7$” denotes addition modulo 7:
\[
\begin{aligned}
\rho(-3) &:= \emptyset,\\
\rho(-2) &:= \{\langle x, x +_7 y\rangle : x \in U,\ y \in \{3, 5, 6\}\},\\
\rho(-1) &:= \{\langle x, x +_7 y\rangle : x \in U,\ y \in \{1, 2, 4\}\},\\
\rho(-0) &:= \rho(-1) \cup \rho(-2),\\
\rho(+0) &:= \{\langle x, x\rangle : x \in \{0, \dots, 6\}\},\\
\rho(+1) &:= \rho(-1) \cup \rho(+0),\\
\rho(+2) &:= \rho(-2) \cup \rho(+0),\\
\rho(+3) &:= U^2.
\end{aligned}
\]
Then $\rho(S_3)$ is closed under ∪, ∩, ◦, →, and ∼, $\langle \rho(S_3), \cup, \cap, \circ, \to, \sim \rangle$ is a commutative
dense relational relevance algebra, $\rho$ is an isomorphism, and
\[ \mathrm{RM84} \cong \langle \rho(S_3), \cup, \cap, \circ, \to, \sim \rangle \in \mathsf{R}_{cd}. \]
□
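Because the universe here is finite, the closure, commutativity, and density claims in this proof can be checked exhaustively. The sketch below builds the eight relations on {0, ..., 6} from their shift sets and tests those properties; the helper names (`circ`, `compose`, `neg`, `arrow`, `label`) are assumptions for illustration, and the residual is computed as ∼(A ◦ ∼B).

```python
from itertools import product

U = range(7)
FULL = frozenset(product(U, U))


def circ(shifts):
    """The relation {(x, x +7 y) : x in U, y in shifts}."""
    return frozenset((x, (x + y) % 7) for x in U for y in shifts)


def compose(A, B):
    """A ∘ B := B|A (a B-step followed by an A-step)."""
    return frozenset((x, y) for (x, z) in B for (w, y) in A if z == w)


def neg(A):
    """De Morgan complement: U^2 minus the converse of A."""
    return FULL - frozenset((y, x) for (x, y) in A)


def arrow(A, B):
    return neg(compose(A, neg(B)))


rho = {"-3": frozenset(), "-2": circ({3, 5, 6}), "-1": circ({1, 2, 4}),
       "+0": circ({0}), "+3": FULL}
rho["-0"] = rho["-1"] | rho["-2"]
rho["+1"] = rho["-1"] | rho["+0"]
rho["+2"] = rho["-2"] | rho["+0"]
label = {rel: name for name, rel in rho.items()}   # inverse of rho

image = list(rho.values())
binary_ops = (frozenset.union, frozenset.intersection, compose, arrow)
closed = (all(neg(A) in label for A in image)
          and all(op(A, B) in label
                  for A in image for B in image for op in binary_ops))
commutative = all(compose(A, B) == compose(B, A)
                  for A in image for B in image)
dense = all(A <= compose(A, A) for A in image)
```

All three flags evaluate to true, and individual residuals can be compared with Table 2 through `label`; for example `label[arrow(rho["-2"], rho["-2"])]` is "+0".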

The logic called BM is defined by Brady (2003, p. 128) as an extension of R. Brady
(2003, p. 138) proves that Belnap’s $M_0$ is characteristic for the logic BM, so BM is
$\{M_0\}$-logic. By Theorem 4.1 we have $\{M_0\} \subseteq \mathsf{IR}_{cd}$, hence BM is a complete decidable
extension of $\mathsf{R}_{cd}$-logic. By Theorem 4.2, {RM84}-logic is a complete decidable extension of
$\mathsf{R}_{cd}$-logic.
M0 and RM84 may be replaced, for all algebraic purposes, with their relational descrip-
tions. Instead of eight elements with operations defined on them by tables, we have eight
relations with set-theoretically defined operations: intersection, composition, and so forth.
For example, if we let A be the relation < on the rationals Q and let B be the relation
> on Q, then a simple calculation shows ∅ = A → (B → A) = (A ∩ ∼ A) → B, so
the sentences A → (B → A) and (A ∧ ∼ A) → B are not provable in R. For another
such proof, of a more general result, first note that {<, ≤} and {>, ≥} are closed under
∪, ∩, ◦, →, ∼, and A → B = ∅ whenever A ∈ {<, ≤} and B ∈ {>, ≥}. Suppose
the sentences A and B share no variable. By evaluating the variables of A as < and the
variables of B as >, we get A ∈ {<, ≤} and B ∈ {>, ≥}, hence A → B = ∅. It follows
that A → B is not a theorem of R.

§5. Soundness. The Peirce–Schröder calculus of relations may be defined as Boolean
combinations of equations between terms denoting relations. The terms are built up from
variables using complementation $\overline{\phantom{A}}$, intersection ∩, union ∪, relative multiplication |,
composition ◦, relative addition †, conversion $^{-1}$, and the identity relation Id. Relevance
logic is the fragment of the calculus of relations in which the terms are built up using
only intersection ∩, union ∪, residuation →, and converse complementation ∼. Relative
multiplication and composition are definable in this fragment since $A|B = {\sim}(B \to {\sim}A)$
and $A \circ B = {\sim}(A \to {\sim}B)$, so one may understand relevance logic as the restriction of the
calculus of relations to the operations ∪, ∩, |, ◦, →, and ∼.
Schröder (1966, sec. 11, pp. 153ff) showed that if a term is understood as the assertion
that the relation it denotes contains the universal relation, then every Boolean combination
of equations between terms denoting relations is equivalent to a single term. This conven-
tion allows the formulation of the calculus of relations as a sentential calculus; for details
see Tarski & Givant (1987, chap. 5). The corresponding convention for relevance logic is
that an individual term asserts that the relation it denotes contains the identity relation.
In the next theorem, parts (16)–(22) are handy computational rules, parts (23)–(31) show
that validity is preserved in all relational relevance algebras by the rules of deduction,
parts (32)–(47) show that several sentences are valid in R, and the remaining parts give
sentences valid in Rc , Rd , Rcd , and Rt .
THEOREM 5.1. Suppose U is a set and A, B, C ⊆ U². Then
Id → A = A, (16)

A⊆B iff Id ⊆ A → B, (17)

A → (B → C) = B|A → C = A ◦ B → C, (18)

A|(A → B) ⊆ B, ( A → B) ◦ A ⊆ B, (19)

(A → B)|∼B ⊆ ∼ A, ∼B ◦ (A → B) ⊆ ∼ A, (20)

A ⊆ B implies B → C ⊆ A → C, (21)

A ⊆ B implies C → A ⊆ C → B. (22)

The rules of deduction preserve validity in R because

if Id ⊆ A and Id ⊆ A → B then Id ⊆ B, (23)

if Id ⊆ A and Id ⊆ B then Id ⊆ A ∩ B, (24)

if Id ⊆ A → ∼B then Id ⊆ B → ∼ A, (25)

if Id ⊆ ∼ A and Id ⊆ A ∪ B then Id ⊆ B, (26)

if Id ⊆ A ∩ B → C and Id ⊆ B → C ∪ A then Id ⊆ B → C, (27)

if Id ⊆ A → B then Id ⊆ (C → A) → (C → B), (28)

if Id ⊆ A → B then Id ⊆ (B → C) → (A → C), (29)

if Id ⊆ A → (B → C) then Id ⊆ B → (∼C → ∼ A), (30)


if Id ⊆ A then Id ⊆ (A → B) → B, (31)

(A1)–(A20) are valid in R because

Id ⊆ A → A, (32)
Id ⊆ A ∩ B → A, (33)
Id ⊆ A ∩ B → B, (34)

Id ⊆ (( A → B) ∩ (A → C)) → (A → (B ∩ C)), (35)


Id ⊆ A → A ∪ B, (36)
Id ⊆ B → A ∪ B, (37)

Id ⊆ ((A → C) ∩ (B → C)) → (( A ∪ B) → C), (38)

Id ⊆ A ∩ (B ∪ C) → (A ∩ B) ∪ (A ∩ C), (39)
Id ⊆ ∼∼ A → A, (40)

Id ⊆ ∼(A ∪ B) → (∼A ∩ ∼B), (41)

Id ⊆ (∼A ∩ ∼B) → ∼(A ∪ B), (42)

Id ⊆ (A → B) → ((C → A) → (C → B)), (43)

Id ⊆ ((A → A) → B) → B, (44)

Id ⊆ A → ((∼B → ∼ A) → B), (45)

Id ⊆ A → (∼B → ∼(A → B)), (46)

Id ⊆ (A → (B → C)) → (∼(A → ∼B) → C), (47)

Id ⊆ A ◦ B → ∼(A → ∼B), (48)

Id ⊆ ∼(A → ∼B) → A ◦ B, (49)

Id ⊆ A → (B → (A ◦ B)), (50)

Id ⊆ (A → (B → C)) → ((A ◦ B) → C), (51)

(A21)–(A24) are valid in Rc because

if {A, B} is commutative then Id ⊆ (A → (B → C)) → (B → (A → C)), (52)

if {A, B} is commutative then Id ⊆ (A → ∼B) → (B → ∼ A), (53)

if {A, A → B} is commutative then Id ⊆ A → ((A → B) → B), (54)

if {B → C, A → B} is commutative then Id ⊆ (A → B) → ((B → C) → (A → C)), (55)

(A25)–(A30) are valid in Rd because

if A is dense then Id ⊆ (A → (A → B)) → (A → B), (56)

if A is dense then Id ⊆ (A → ∼ A) → ∼ A, (57)


if A ∩ B is dense then Id ⊆ (A → (B → C)) → ((A ∩ B) → C), (58)

if A ∩ ∼B is dense then Id ⊆ (A → B) → (∼A ∪ B), (59)


if A ∩ (A → B) is dense then Id ⊆ (A ∩ (A → B)) → B, (60)

if (A → B) ∩ (B → C) is dense then Id ⊆ ((A → B) ∩ (B → C)) → (A → C), (61)

(A31) is valid in Rcd because

if {A, A → B} is commutative and A is dense then Id ⊆ (A → (B → C)) → ((A → B) → (A → C)), (62)

(A32) and (A33) are valid in Rt because

if A is transitive then Id ⊆ A → (A → A), (63)

if A is transitive then Id ⊆ (A → B) → (A → (A → B)). (64)
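
The unconditional parts of Theorem 5.1 can be spot-checked on small finite relations. The following Python sketch tests (16)–(22), (32), and (43) on random subsets of U × U with |U| = 4. It assumes that A → B is the residual {⟨x, y⟩ : ⟨z, x⟩ ∈ A implies ⟨z, y⟩ ∈ B for all z} and that ∼A is the complement of the converse of A; this reading is an assumption made here because it yields (16), (17), and A|B = ∼(B → ∼A). All function names are illustrative. A random search of this kind can expose a counterexample, but it does not prove validity.

```python
import random
from itertools import product

U = range(4)
ALL = frozenset(product(U, U))
ID = frozenset((x, x) for x in U)

def comp(a, b):
    """Relative product a|b: <x, y> with x a z and z b y for some z."""
    return frozenset((x, y) for (x, z) in a for (z2, y) in b if z == z2)

def fuse(a, b):
    """Composition a ◦ b, read here as b|a in agreement with (18)."""
    return comp(b, a)

def neg(a):
    """Converse complement ∼a (assumed: complement of the converse)."""
    return frozenset((x, y) for (x, y) in ALL if (y, x) not in a)

def arrow(a, b):
    """Residual a → b (assumed definition, see the lead-in)."""
    return frozenset((x, y) for (x, y) in ALL
                     if all((z, y) in b for z in U if (z, x) in a))

def random_relation(rng):
    return frozenset(p for p in ALL if rng.random() < 0.4)

def check_theorem_5_1(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (random_relation(rng) for _ in range(3))
        assert arrow(ID, a) == a                                  # (16)
        assert (a <= b) == (ID <= arrow(a, b))                    # (17)
        assert arrow(a, arrow(b, c)) == arrow(comp(b, a), c)      # (18)
        assert arrow(comp(b, a), c) == arrow(fuse(a, b), c)       # (18)
        assert comp(a, arrow(a, b)) <= b                          # (19)
        assert fuse(arrow(a, b), a) <= b                          # (19)
        assert comp(arrow(a, b), neg(b)) <= neg(a)                # (20)
        assert fuse(neg(b), arrow(a, b)) <= neg(a)                # (20)
        ab = a & b                                                # ab ⊆ b
        assert arrow(b, c) <= arrow(ab, c)                        # (21)
        assert arrow(c, ab) <= arrow(c, b)                        # (22)
        assert ID <= arrow(a, a)                                  # (32)
        assert ID <= arrow(arrow(a, b),
                           arrow(arrow(c, a), arrow(c, b)))       # (43)
        assert comp(a, b) == neg(arrow(b, neg(a)))                # A|B = ∼(B → ∼A)
    return True
```

The conditional parts (52)–(64) are not covered, since they need relations that are commutative, dense, or transitive.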


Prefixing is valid in R both as a rule and as axiom (A12).
Suffixing is valid in R as a rule but not as axiom (A24), because there are noncommuta-
tive relational relevance algebras. On the other hand, (A24) holds in a relational relevance
algebra whenever A → B and B → C commute, so (A24) is valid in Rc . This does
not seem to exclude the possibility of a noncommutative relational relevance algebra in
which (A24) is valid.
Contraposition is valid as a rule in R but not as axiom (A22), again because noncom-
mutative relational relevance algebras exist. (A21) and (A22) are valid in Rc ; in fact, they
are valid in a relational relevance algebra R iff R is commutative.
We can use relations to give independence proofs. For example, the existence of non-
commutative relational relevance algebras shows that (A21)–(A24) cannot be proved from
axioms (A1)–(A20) using all nine rules.
COROLLARY 5.2. (Soundness theorem) For every A ∈ Sent,
(i) if ⊢R A then A is valid in Rcd ,
(ii) if ⊢RM A then A is valid in Rcdt .
Two questions were asked in Maddux (2007):
(Q1) if A is not a theorem of R, is there some R ∈ Rcd in which A is not valid?
(Q2) if A is not a theorem of RM, is there some R ∈ Rcdt in which A is not valid?
The (expected) answer to (Q1) is “no.” This was first established by Mikulás (2009), who
proved that there is no finite axiomatization of Rcd -logic. In Section 8 we present two
examples of sentences in Rcd -logic that are not theorems of R. The (unexpected) answer
to (Q2) is “yes,” for reasons given in the next section.

§6. Completeness of RM for Rcdt . Sugihara matrices were introduced by Sugihara
(1955) and simplified by Anderson & Belnap (1975, sec. 26.9). R. K. Meyer used them to
prove completeness results for RM; see Anderson & Belnap (1975, sec. 29.3).
We define only the finite Sugihara matrices Sn , with 2 ≤ n < ω. If n = 2k for some
k > 0 then
Sn := {−k, . . . , −1, 1, . . . , k}
with designated values 1, . . . , k, and if n = 2k + 1 for some k ≥ 0 then


Sn := {−k, . . . , −1, 0, 1, . . . , k},
with designated values 0, 1, . . . , k. For example, S1 := {0}, S2 := {−1, 1}, S3 :=
{−1, 0, 1}, S4 := {−2, −1, 1, 2}, and S5 := {−2, −1, 0, 1, 2}. Note that Sn ⊆ Z so Sn is a
chain under the natural ordering inherited from the ordering of the integers, that is,
−k < · · · < −1 < 0 < 1 < · · · < k.
With respect to the natural ordering of $S_n$ the binary operations $\wedge$ and $\vee$ are defined as
follows. For any $i, j \in S_n$, $i \wedge j$ is the minimum of $i$ and $j$, and $i \vee j$ is the maximum of
$i$ and $j$. The unary operation $\sim$ is multiplication by $-1$, that is, it maps $0$ to $0$ (if $n$ is odd
and $0 \in S_n$), $i$ to $-i$, and $-i$ to $i$ whenever $0 < i \in S_n$. The binary operation $\to$ is defined
for all $i, j \in S_n$ by
$$i \to j := \begin{cases} -i \vee j & \text{if } i \le j, \\ -i \wedge j & \text{if } i > j. \end{cases} \tag{65}$$
The binary operation $\circ$, obtained by the definition $i \circ j := {\sim}(i \to {\sim} j)$, can be characterized
as follows (Anderson & Belnap, 1975, p. 400). If $1 \le i, j \le n$ then
$$-i \circ -j = -\max(i, j), \tag{66}$$
$$-i \circ j = \begin{cases} -i & \text{if } j \le i, \\ j & \text{if } i < j, \end{cases} \tag{67}$$
$$i \circ j = \max(i, j). \tag{68}$$


Said another way, i ◦ j is whichever of i and j is strictly larger in absolute value, or else
is the minimum of i and j in case |i| = | j|. Another way to say this is that i ◦ j is the
maximum of i and j under the linear ordering of Sn that begins in this way: 0 < 1 < −1 <
2 < −2 < 3 < −3 < 4 < −4 < · · · . Examples of ◦ are shown in Table 5.
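
The two descriptions of ◦ can be compared mechanically. The sketch below implements the Sugihara operations from (65) and the definition i ◦ j := ∼(i → ∼j), and sets them against the "larger absolute value, ties go to the minimum" description. The function names are illustrative and not taken from the literature.

```python
def sugihara(n):
    """Elements of S_n: the integers -k..k, omitting 0 when n is even."""
    k = n // 2
    return [v for v in range(-k, k + 1) if v != 0 or n % 2 == 1]

def designated(n):
    """Designated values: the positive elements, plus 0 when n is odd."""
    return [v for v in sugihara(n) if v > 0 or (v == 0 and n % 2 == 1)]

def neg(i):
    """The unary operation ∼, multiplication by -1."""
    return -i

def imp(i, j):
    """Implication, following (65)."""
    return max(-i, j) if i <= j else min(-i, j)

def fuse(i, j):
    """i ◦ j defined as ∼(i → ∼j)."""
    return neg(imp(i, neg(j)))

def fuse_by_abs(i, j):
    """Whichever argument is strictly larger in absolute value, else the minimum."""
    if abs(i) != abs(j):
        return i if abs(i) > abs(j) else j
    return min(i, j)

def fuse_table(n):
    """The table of ◦ on S_n, rows and columns in increasing order."""
    s = sugihara(n)
    return [[fuse(i, j) for j in s] for i in s]
```

For instance, `fuse_table(8)` reproduces the first half of Table 5 row by row.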
The Sugihara matrix $S_n$ is the algebra $\langle S_n, \vee, \wedge, \circ, \to, \sim \rangle$. $S_n$ is normal if $n$ is even.
A sentence $A \in \mathrm{Sent}$ is valid in $S_n$ if every homomorphism $H$ from $\langle \mathrm{Sent}, \vee, \wedge, \circ, \to, \sim \rangle$
into $S_n$ sends $A$ to a designated value. Meyer's completeness theorem follows.
THEOREM 6.1 (Meyer; see Anderson & Belnap, 1975, cor. 3.1, p. 413). If a sentence
$A$ has no more than $n$ propositional variables then $\vdash_{\mathbf{RM}} A$ iff $A$ is valid in $S_n$.
We can now show that RM is complete for $\mathsf{R}_{cdt}$.
THEOREM 6.2. Assume $1 \le n < \omega$. Then there is a finite commutative dense transitive
relational relevance algebra $\mathfrak{T}_n \in \mathsf{R}_{cdt}$ such that
(i) The Sugihara matrix $S_{2n+2}$ is isomorphic to $\mathfrak{T}_n$, so $S_{2n+2} \in \mathbf{I}\mathsf{R}_{cdt}$.
(ii) If $A$ has no more than $2n + 2$ propositional variables, then $\vdash_{\mathbf{RM}} A$ iff $\mathfrak{T}_n \models A$.
(iii) $\vdash_{\mathbf{RM}} A$ iff $A$ is valid in $\mathsf{R}_{cdt}$.
Proof. Let $\mathbb{Q}^n := \{\langle q_1, \dots, q_n \rangle : q_1, \dots, q_n \in \mathbb{Q}\}$, where $\mathbb{Q}$ is the set of rational
numbers. Define binary relations $\mathrm{Id}$ and $L_1$ on $\mathbb{Q}^n$ by
$$\mathrm{Id} = \{\langle q, q \rangle : q \in \mathbb{Q}^n\}, \tag{69}$$
$$\langle q, q' \rangle \in L_1 \text{ iff } q_1 < q'_1, \tag{70}$$
Table 5. ◦ in S8 and S9 .
◦ −4 −3 −2 −1 1 2 3 4
−4 −4 −4 −4 −4 −4 −4 −4 −4
−3 −4 −3 −3 −3 −3 −3 −3 4
−2 −4 −3 −2 −2 −2 −2 3 4
−1 −4 −3 −2 −1 −1 2 3 4
1 −4 −3 −2 −1 1 2 3 4
2 −4 −3 −2 2 2 2 3 4
3 −4 −3 3 3 3 3 3 4
4 −4 4 4 4 4 4 4 4

◦ −4 −3 −2 −1 0 1 2 3 4
−4 −4 −4 −4 −4 −4 −4 −4 −4 −4
−3 −4 −3 −3 −3 −3 −3 −3 −3 4
−2 −4 −3 −2 −2 −2 −2 −2 3 4
−1 −4 −3 −2 −1 −1 −1 2 3 4
0 −4 −3 −2 −1 0 1 2 3 4
1 −4 −3 −2 −1 1 1 2 3 4
2 −4 −3 −2 2 2 2 2 3 4
3 −4 −3 3 3 3 3 3 3 4
4 −4 4 4 4 4 4 4 4 4

and for $1 < i \le n$, define binary relations $L_i$ on $\mathbb{Q}^n$ by
$$\langle q, q' \rangle \in L_i \text{ iff } \langle q_1, \dots, q_{i-1} \rangle = \langle q'_1, \dots, q'_{i-1} \rangle \text{ and } q_i < q'_i. \tag{71}$$
It follows that
$$\langle q, q' \rangle \in L_1 \cup L_1^{-1} \text{ iff } q_1 \ne q'_1,$$
$$\langle q, q' \rangle \in L_i \cup L_i^{-1} \text{ iff } \langle q_1, \dots, q_{i-1} \rangle = \langle q'_1, \dots, q'_{i-1} \rangle \text{ and } q_i \ne q'_i.$$
Let
$$\mathcal{L}_n := \{\mathrm{Id}, L_1, L_1^{-1}, \dots, L_n, L_n^{-1}\}. \tag{72}$$
The relations in $\mathcal{L}_n$ are pairwise disjoint, and their union is $\mathbb{Q}^n \times \mathbb{Q}^n$. To see this, it is
enough to observe that $\langle q, q' \rangle$ belongs to exactly one of the relations in $\mathcal{L}_n$. If $q = q'$,
then for each $i = 1, \dots, n$ it is not the case that $q_i < q'_i$ (hence $\langle q, q' \rangle \notin L_i$), nor is it
the case that $q_i > q'_i$ (hence $\langle q, q' \rangle \notin L_i^{-1}$). Thus $\langle q, q' \rangle$ is not in any of the relations in
$\{L_1, L_1^{-1}, \dots, L_n, L_n^{-1}\}$.
Assume $q \ne q'$. Let $i = 1$ if $q_1 \ne q'_1$, and otherwise let $i$ be the smallest element
of $\{2, \dots, n\}$ such that $\langle q_1, \dots, q_{i-1} \rangle = \langle q'_1, \dots, q'_{i-1} \rangle$ and $q_i \ne q'_i$. Since $\mathbb{Q}$ is linearly
ordered, either $q_i < q'_i$ or $q_i > q'_i$, hence $\langle q, q' \rangle \in L_i$ iff $q_i < q'_i$ and $\langle q, q' \rangle \in L_i^{-1}$ iff
$q_i > q'_i$. It follows from $\langle q_1, \dots, q_{i-1} \rangle = \langle q'_1, \dots, q'_{i-1} \rangle$ that $\langle q, q' \rangle$ is not in any of the
relations $L_1, L_1^{-1}, \dots, L_{i-1}, L_{i-1}^{-1}$. The assumption that $q_i \ne q'_i$ prevents the pair from
belonging to any of the remaining relations $L_{i+1}, L_{i+1}^{-1}, \dots, L_n, L_n^{-1}$.
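
The partition claim lends itself to a quick numerical sanity check. The sketch below tests membership in Id, $L_i$, and $L_i^{-1}$ directly from (69)–(71) and counts how many of the $2n + 1$ relations contain a randomly chosen pair of rational tuples. The sampling scheme and the names are assumptions made for illustration; the check is evidence, not a proof.

```python
import random
from fractions import Fraction

def in_L(i, q, r):
    """<q, r> is in L_i: q and r agree before coordinate i, and q_i < r_i (i is 1-based)."""
    return q[:i - 1] == r[:i - 1] and q[i - 1] < r[i - 1]

def atoms_containing(q, r):
    """Names of all relations among Id, L_i, L_i^{-1} that contain <q, r>."""
    found = ["Id"] if q == r else []
    for i in range(1, len(q) + 1):
        if in_L(i, q, r):
            found.append(("L", i))
        if in_L(i, r, q):  # <q, r> is in the converse of L_i
            found.append(("L^-1", i))
    return found

def check_partition(n=3, trials=2000, seed=1):
    """Every sampled pair should lie in exactly one of the 2n + 1 relations."""
    rng = random.Random(seed)

    def pick():
        return tuple(Fraction(rng.randint(-2, 2), rng.randint(1, 3)) for _ in range(n))

    return all(len(atoms_containing(pick(), pick())) == 1 for _ in range(trials))
```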
Let
$$\mathcal{A}_n := \Big\{\bigcup S : S \subseteq \mathcal{L}_n\Big\}. \tag{73}$$
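Since the relations in $\mathcal{L}_n$ partition $\mathbb{Q}^n \times \mathbb{Q}^n$, $\mathcal{A}_n$ is the universe of a finite Boolean algebra
of subsets of $\mathbb{Q}^n \times \mathbb{Q}^n$, and $\mathcal{L}_n$ is the set of atoms of this Boolean algebra. Clearly $\mathcal{L}_n$ is
closed under conversion ${}^{-1}$, so $\mathcal{A}_n$ is also closed under ${}^{-1}$ because conversion distributes
over union. Next we calculate the relative products of relations in $\mathcal{L}_n$.
Let $q, q' \in \mathbb{Q}^n$. If $\langle q, q' \rangle \in L_1|L_1$ then for some $q'' \in \mathbb{Q}^n$ we have $q_1 < q''_1 < q'_1$,
hence $\langle q, q' \rangle \in L_1$. Conversely, if $\langle q, q' \rangle \in L_1$ then we may choose $q'' \in \mathbb{Q}^n$ so that
$q''_1 = \frac{1}{2}(q_1 + q'_1)$, which yields $q_1 < q''_1 < q'_1$, hence $\langle q, q' \rangle \in L_1|L_1$. Thus $L_1|L_1 = L_1$.
If $1 < i \le n$ and $\langle q, q' \rangle \in L_i|L_i$, then there is some $q'' \in \mathbb{Q}^n$ such that $\langle q, q'' \rangle \in L_i$ and
$\langle q'', q' \rangle \in L_i$, hence
$$\langle q_1, \dots, q_{i-1} \rangle = \langle q''_1, \dots, q''_{i-1} \rangle = \langle q'_1, \dots, q'_{i-1} \rangle, \tag{74}$$
$$q_i < q''_i < q'_i, \tag{75}$$
so $\langle q, q' \rangle \in L_i$. For the other direction, assume $\langle q, q' \rangle \in L_i$. This gives us $\langle q_1, \dots, q_{i-1} \rangle =
\langle q'_1, \dots, q'_{i-1} \rangle$ and $q_i < q'_i$, so we may choose $q'' \in \mathbb{Q}^n$ such that (74) holds and
$q''_i = \frac{1}{2}(q_i + q'_i)$, hence (75) also holds. We get $\langle q, q'' \rangle \in L_i$ and $\langle q'', q' \rangle \in L_i$ from (74)
and (75), hence $\langle q, q' \rangle \in L_i|L_i$. So far we have proved
$$L_i|L_i = L_i \text{ whenever } 1 \le i \le n. \tag{76}$$
Assume $1 < i < j \le n$. If $\langle q, q' \rangle \in L_i|(L_j \cup L_j^{-1})$, then there is some $q'' \in \mathbb{Q}^n$ such that
$\langle q, q'' \rangle \in L_i$ and $\langle q'', q' \rangle \in L_j \cup L_j^{-1}$, hence
$$\langle q_1, \dots, q_{i-1} \rangle = \langle q''_1, \dots, q''_{i-1} \rangle \text{ and } q_i < q''_i,$$
$$\langle q''_1, \dots, q''_{i-1}, q''_i, \dots, q''_{j-1} \rangle = \langle q'_1, \dots, q'_{i-1}, q'_i, \dots, q'_{j-1} \rangle \text{ and } q''_j \ne q'_j,$$
so
$$\langle q_1, \dots, q_{i-1} \rangle = \langle q'_1, \dots, q'_{i-1} \rangle \text{ and } q_i < q''_i = q'_i.$$
This proves that $L_i|(L_j \cup L_j^{-1}) \subseteq L_i$, hence $L_i|L_j \subseteq L_i$ and $L_i|L_j^{-1} \subseteq L_i$. To show the
opposite inclusions, suppose $\langle q, q' \rangle \in L_i$. Then
$$\langle q_1, \dots, q_{i-1} \rangle = \langle q'_1, \dots, q'_{i-1} \rangle \text{ and } q_i < q'_i.$$
If we let
$$q'' = \langle q'_1, \dots, q'_{i-1}, q'_i, \dots, q'_{j-1}, q'_j - 1, q'_{j+1}, \dots \rangle$$
then $\langle q, q'' \rangle \in L_i$ and $\langle q'', q' \rangle \in L_j$, hence $\langle q, q' \rangle \in L_i|L_j$, but if we let
$$q'' = \langle q'_1, \dots, q'_{i-1}, q'_i, \dots, q'_{j-1}, q'_j + 1, q'_{j+1}, \dots \rangle$$
then $\langle q, q'' \rangle \in L_i$, $\langle q'', q' \rangle \in L_j^{-1}$, and $\langle q, q' \rangle \in L_i|L_j^{-1}$. Except for the case $1 = i$, which
is notationally simpler, we have completed the proof that
$$L_i = L_i|L_j = L_i|L_j^{-1} \text{ whenever } 1 \le i < j \le n. \tag{77}$$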
By very slightly rearranging the proof of (77) we also establish
$$L_i = L_j|L_i = L_j^{-1}|L_i \text{ whenever } 1 \le i < j \le n. \tag{78}$$
By applying conversion to both sides of (76), (77), and (78) we also obtain
$$L_i^{-1} = L_i^{-1}|L_i^{-1} \text{ whenever } 1 \le i \le n, \tag{79}$$
$$L_i^{-1} = L_j^{-1}|L_i^{-1} = L_j|L_i^{-1} \text{ whenever } 1 \le i < j \le n, \tag{80}$$
$$L_i^{-1} = L_i^{-1}|L_j^{-1} = L_i^{-1}|L_j \text{ whenever } 1 \le i < j \le n. \tag{81}$$
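Next we consider the products $L_i|L_i^{-1}$ and $L_i^{-1}|L_i$. If $\langle q, q' \rangle \in L_i|L_i^{-1}$, then there is some
$q'' \in \mathbb{Q}^n$ such that $\langle q, q'' \rangle \in L_i$ and $\langle q'', q' \rangle \in L_i^{-1}$, hence
$$\langle q_1, \dots, q_{i-1} \rangle = \langle q''_1, \dots, q''_{i-1} \rangle = \langle q'_1, \dots, q'_{i-1} \rangle, \quad q_i < q''_i > q'_i.$$
There are three cases. First, if $q_i < q'_i$ then $\langle q, q' \rangle \in L_i$. Second, if $q_i > q'_i$ then $\langle q, q' \rangle \in
L_i^{-1}$. For the third case we suppose $q_i = q'_i$, which implies
$$\langle q_1, \dots, q_i \rangle = \langle q'_1, \dots, q'_i \rangle. \tag{82}$$
If $q = q'$ then $\langle q, q' \rangle \in \mathrm{Id}$. Suppose $q \ne q'$. From (82) we know that $q$ and $q'$ must
differ at some index greater than $i$. Let $j$ be the smallest index such that $i < j \le n$ and
$q_j \ne q'_j$. If $q_j < q'_j$ then $\langle q, q' \rangle \in L_j$. If $q_j > q'_j$ then $\langle q, q' \rangle \in L_j^{-1}$. This exhausts all
the possibilities, and shows that
$$L_i|L_i^{-1} \subseteq \mathrm{Id} \cup L_i \cup L_i^{-1} \cup \bigcup_{i < j \le n} \big(L_j \cup L_j^{-1}\big) = \mathrm{Id} \cup \bigcup_{i \le j \le n} \big(L_j \cup L_j^{-1}\big).$$
For the opposite inclusion, assume
$$\langle q, q' \rangle \in \mathrm{Id} \cup \bigcup_{i \le j \le n} \big(L_j \cup L_j^{-1}\big),$$
which is equivalent to
$$\langle q_1, \dots, q_{i-1} \rangle = \langle q'_1, \dots, q'_{i-1} \rangle. \tag{83}$$
If we choose $q'' \in \mathbb{Q}^n$ so that
$$\langle q_1, \dots, q_{i-1} \rangle = \langle q''_1, \dots, q''_{i-1} \rangle = \langle q'_1, \dots, q'_{i-1} \rangle \tag{84}$$
and $q''_i > \max(q_i, q'_i)$, we get $\langle q, q'' \rangle \in L_i$ and $\langle q'', q' \rangle \in L_i^{-1}$, hence $\langle q, q' \rangle \in L_i|L_i^{-1}$.
This completes the proof that
$$L_i|L_i^{-1} = \mathrm{Id} \cup \bigcup_{i \le j \le n} \big(L_j \cup L_j^{-1}\big). \tag{85}$$
By slightly altering the proof of (85) we also get
$$L_i^{-1}|L_i = \mathrm{Id} \cup \bigcup_{i \le j \le n} \big(L_j \cup L_j^{-1}\big). \tag{86}$$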

We can summarize (76)–(86) as follows.
$$L_i|L_j = L_{\min(i, j)}, \tag{87}$$
$$L_i^{-1}|L_j^{-1} = L_{\min(i, j)}^{-1}, \tag{88}$$
$$L_j^{-1}|L_i = L_i|L_j^{-1} = \begin{cases} L_i & \text{if } i < j, \\ L_j^{-1} & \text{if } j < i, \\ \mathrm{Id} \cup \bigcup_{i = j \le k \le n} \big(L_k \cup L_k^{-1}\big) & \text{if } i = j. \end{cases} \tag{89}$$
The remaining relative products of relations in $\mathcal{L}_n$, which all involve $\mathrm{Id}$, are
$$\mathrm{Id} = \mathrm{Id}|\mathrm{Id}, \quad L_i = L_i|\mathrm{Id} = \mathrm{Id}|L_i, \quad L_i^{-1} = L_i^{-1}|\mathrm{Id} = \mathrm{Id}|L_i^{-1}. \tag{90}$$

Relative multiplication distributes over union, so it follows that An is closed under relative
multiplication as well as union, intersection, complementation with respect to Qn × Qn ,
and conversion. Note that An contains the identity relation on Qn .
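For every $J \subseteq \{1, 2, \dots, n\}$, let $L_J := \emptyset$ and $L_J^{-1} := \emptyset$ if $J = \emptyset$, and otherwise
let $L_J := \bigcup_{i \in J} L_i$ and $L_J^{-1} := \bigcup_{i \in J} L_i^{-1}$. For every $i \in \{1, 2, \dots, n\}$ let $[1, i] =
\{1, 2, \dots, i - 1, i\}$ and $[i, n] = \{i, i + 1, \dots, n - 1, n\}$. Using this notation we can rewrite
(85) and (86) as
$$L_i^{-1}|L_i = L_i|L_i^{-1} = \mathrm{Id} \cup L_{[i,n]} \cup L_{[i,n]}^{-1}, \tag{91}$$
and derive a few more computational rules.
$$L_{[1,i]}|L_{[1,j]} = \bigcup_{1 \le k \le i,\ 1 \le l \le j} L_k|L_l = \bigcup_{1 \le k \le i,\ 1 \le l \le j} L_{\min(k,l)} = L_{[1,\min(i,j)]}, \tag{92}$$
$$L_{[i,n]}|L_{[j,n]} = \bigcup_{i \le k \le n,\ j \le l \le n} L_k|L_l = \bigcup_{i \le k \le n,\ j \le l \le n} L_{\min(k,l)} = L_{[\min(i,j),n]}, \tag{93}$$
$$L_{[1,i]}^{-1}|L_{[1,j]}^{-1} = L_{[1,\min(i,j)]}^{-1}, \tag{94}$$
$$L_{[i,n]}^{-1}|L_{[j,n]}^{-1} = L_{[\min(i,j),n]}^{-1}. \tag{95}$$
If $i < j$ then $L_k|L_l^{-1} = L_k$ whenever $1 \le k \le i$ and $j \le l \le n$, so
$$L_{[1,i]}|L_{[j,n]}^{-1} = \bigcup_{1 \le k \le i,\ j \le l \le n} L_k|L_l^{-1} = \bigcup_{1 \le k \le i,\ j \le l \le n} L_k = L_{[1,i]}. \tag{96}$$
On the other hand, if $1 \le j \le i$ then
$$\begin{aligned}
L_{[1,i]}|L_j^{-1} &= \bigcup_{1 \le k < j} L_k|L_j^{-1} \cup L_j|L_j^{-1} \cup \bigcup_{j < k \le i} L_k|L_j^{-1} \\
&= \bigcup_{1 \le k < j} L_k \cup \mathrm{Id} \cup L_{[j,n]} \cup L_{[j,n]}^{-1} \cup \bigcup_{j < k \le i} L_j^{-1} \\
&= L_{[1,j-1]} \cup \mathrm{Id} \cup L_{[j,n]} \cup L_{[j,n]}^{-1} \cup L_j^{-1} \\
&= L_{[1,n]} \cup \mathrm{Id} \cup L_{[j,n]}^{-1},
\end{aligned}$$
which implies
$$L_{[1,i]}|L_{[j,n]}^{-1} = L_{[1,n]} \cup \mathrm{Id} \cup L_{[j,n]}^{-1} \text{ whenever } 1 \le j \le i. \tag{97}$$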
We will use the relations in Ln to create a copy of the Sugihara matrix S2n+2 . The
example which inspired this construction is Belnap’s M0 , which has two copies of S4 as
subalgebras, namely {−3, −2, +2, +3} and {−3, −1, +1, +3}.
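First, define a function $T : S_{2n+2} \to \mathrm{Sb}\big((\mathbb{Q}^n)^2\big)$ by
$$T_{-n-1} := \emptyset, \tag{98}$$
$$T_{-i} := L_{[1,n+1-i]} \text{ whenever } 1 \le i \le n, \tag{99}$$
$$T_1 := L_{[1,n]} \cup \mathrm{Id}, \tag{100}$$
$$T_i := L_{[1,n]} \cup \mathrm{Id} \cup L_{[n+2-i,n]}^{-1} \text{ whenever } 2 \le i \le n + 1, \tag{101}$$
and let
$$\mathcal{T}_n := \{T_{-n-1}, T_{-n}, \dots, T_{-1}, T_1, \dots, T_n, T_{n+1}\}.$$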
Note that $T_{n+1} = \mathbb{Q}^n \times \mathbb{Q}^n$. Also, the images of the designated values of $S_{2n+2}$ are
$T_1, \dots, T_n, T_{n+1}$, exactly the elements of $\mathcal{T}_n$ that contain the identity relation $\mathrm{Id}$. It follows
immediately from the definitions that the relations in $\mathcal{T}_n$ form a chain,
$$T_{-n-1} \subseteq T_{-n} \subseteq \dots \subseteq T_{-1} \subseteq T_1 \subseteq \dots \subseteq T_n \subseteq T_{n+1}. \tag{102}$$
Therefore $\mathcal{T}_n$ is closed under union and intersection. A straightforward calculation shows
that $\mathcal{T}_n$ is also closed under converse complementation. In fact, for every $i \in S_{2n+2} =
\{-n - 1, \dots, -1, 1, \dots, n + 1\}$ we have
$${\sim}(T_i) = T_{-i} = T_{{\sim}(i)}. \tag{103}$$
To show that $\mathcal{T}_n$ is closed under relative multiplication, we need to examine all the products
of relations in $\mathcal{T}_n$.
First note that all products involving $T_{-n-1} = \emptyset$ are trivial, for if $X \in \mathcal{T}_n$ then
$$T_{-n-1}|X = \emptyset|X = \emptyset = T_{-n-1}. \tag{104}$$
If $1 \le i, j \le n$ then we have
$$T_{-i}|T_{-j} = T_{-\max(i,j)}, \tag{105}$$
since
$$\begin{aligned}
T_{-i}|T_{-j} &= L_{[1,n+1-i]}|L_{[1,n+1-j]} \\
&= L_{[1,\min(n+1-i,\,n+1-j)]} && \text{by (92)} \\
&= L_{[1,n+1-\max(i,j)]} \\
&= T_{-\max(i,j)}.
\end{aligned}$$
We use this to show
$$T_{-i}|T_1 = T_{-i} \tag{106}$$
as follows.
$$\begin{aligned}
T_{-i}|T_1 &= T_{-i}|(T_{-1} \cup \mathrm{Id}) && \text{by (100)} \\
&= T_{-i}|T_{-1} \cup T_{-i}|\mathrm{Id} \\
&= T_{-\max(i,1)} \cup T_{-i} && \text{by (105) with } j = 1 \\
&= T_{-i} \cup T_{-i} \\
&= T_{-i}.
\end{aligned}$$
If $n \ge i \ge j \ge 1$, then $n + 1 - i < n + 2 - j$, so by (96),
$$T_{-i}|L_{[n+2-j,n]}^{-1} = L_{[1,n+1-i]}|L_{[n+2-j,n]}^{-1} = L_{[1,n+1-i]} = T_{-i}.$$
Using these last two observations we get
$$\begin{aligned}
T_{-i}|T_j &= T_{-i}|\big(T_1 \cup L_{[n+2-j,n]}^{-1}\big) \\
&= T_{-i}|T_1 \cup T_{-i}|L_{[n+2-j,n]}^{-1} \\
&= T_{-i} \cup T_{-i} = T_{-i}.
\end{aligned}$$
On the other hand, if $1 \le i < j \le n$ then $n + 1 - i \ge n + 2 - j$, so by (97)
$$T_{-i}|L_{[n+2-j,n]}^{-1} = L_{[1,n+1-i]}|L_{[n+2-j,n]}^{-1} = L_{[1,n]} \cup \mathrm{Id} \cup L_{[n+2-j,n]}^{-1} = T_j,$$
hence
$$\begin{aligned}
T_{-i}|T_j &= T_{-i}|\big(T_1 \cup L_{[n+2-j,n]}^{-1}\big) \\
&= T_{-i}|T_1 \cup T_{-i}|L_{[n+2-j,n]}^{-1} \\
&= T_{-i} \cup T_j = T_j.
\end{aligned}$$
We have proved that
$$T_{-i}|T_j = \begin{cases} T_{-i} & \text{if } n \ge i \ge j \ge 1, \\ T_j & \text{if } 1 \le i < j \le n. \end{cases} \tag{107}$$

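Next we deal with one special product.
$$\begin{aligned}
T_1|T_1 &= (L_{[1,n]} \cup \mathrm{Id})|(L_{[1,n]} \cup \mathrm{Id}) \\
&= L_{[1,n]}|L_{[1,n]} \cup \mathrm{Id}|L_{[1,n]} \cup L_{[1,n]}|\mathrm{Id} \cup \mathrm{Id}|\mathrm{Id} \\
&= L_{[1,n]} \cup L_{[1,n]} \cup L_{[1,n]} \cup \mathrm{Id} && \text{by (92)} \\
&= T_1.
\end{aligned} \tag{108}$$
Suppose $2 \le j \le n$. First observe that
$$\begin{aligned}
T_1|L_{[n+2-j,n]}^{-1} &= (L_{[1,n]} \cup \mathrm{Id})|L_{[n+2-j,n]}^{-1} \\
&= L_{[1,n]}|L_{[n+2-j,n]}^{-1} \cup \mathrm{Id}|L_{[n+2-j,n]}^{-1} \\
&= L_{[1,n]} \cup \mathrm{Id} \cup L_{[n+2-j,n]}^{-1} \cup L_{[n+2-j,n]}^{-1} && \text{by (97)} \\
&= T_j,
\end{aligned} \tag{109}$$
and then use this observation together with (108) to obtain
$$\begin{aligned}
T_1|T_j &= T_1|\big(T_1 \cup L_{[n+2-j,n]}^{-1}\big) \\
&= T_1|T_1 \cup T_1|L_{[n+2-j,n]}^{-1} \\
&= T_1 \cup T_j = T_j && \text{by (108), (109)}.
\end{aligned} \tag{110}$$
Finally, if $2 \le i, j \le n + 1$ then we first note
$$\begin{aligned}
T_i|L_{[n+2-j,n]}^{-1} &= \big(T_1 \cup L_{[n+2-i,n]}^{-1}\big)\big|L_{[n+2-j,n]}^{-1} \\
&= T_1|L_{[n+2-j,n]}^{-1} \cup L_{[n+2-i,n]}^{-1}|L_{[n+2-j,n]}^{-1} \\
&= T_j \cup L_{[\min(n+2-i,\,n+2-j),n]}^{-1} && \text{by (109), (95)} \\
&= T_j \cup L_{[n+2-\max(i,j),n]}^{-1},
\end{aligned} \tag{111}$$
and then
$$\begin{aligned}
T_i|T_j &= T_i|\big(T_1 \cup L_{[n+2-j,n]}^{-1}\big) \\
&= T_i|T_1 \cup T_i|L_{[n+2-j,n]}^{-1} \\
&= T_i \cup T_j \cup L_{[n+2-\max(i,j),n]}^{-1} && \text{by (110), (111)} \\
&= T_{\max(i,j)} \cup L_{[n+2-\max(i,j),n]}^{-1} \\
&= T_{\max(i,j)}.
\end{aligned} \tag{112}$$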
This completes the proof that $\mathcal{T}_n$ is closed under relative multiplication and composition.
Since $\mathcal{T}_n$ is closed under $\cup$, $\cap$, $\circ$, $\to$, $\sim$, we may use it as the universe of an algebra with
these operations. Let
$$\mathfrak{T}_n := \langle \mathcal{T}_n, \cup, \cap, \circ, \to, \sim \rangle. \tag{113}$$
Observe that (105), (107), (108), (110), and (112) are enough to confirm that relative
multiplication in $\mathfrak{T}_n$ behaves the same as multiplication in the Sugihara matrix $S_{2n+2}$
according to (66)–(68). It is easy to see that the other operations are preserved by $T$, so
$T$ is an isomorphism from the Sugihara matrix $S_{2n+2}$ to $\mathfrak{T}_n$. Combining these observations
with Theorem 6.1 completes the proof of part (i).
For part (ii), consider a sentence $A$ and choose $n$ so that $A$ has fewer than $2n + 2$
propositional variables. By Theorem 6.1 we have $\vdash_{\mathbf{RM}} A$ iff $A$ is valid in $S_{2n+2}$. The
isomorphism from $S_{2n+2}$ to $\mathfrak{T}_n$ carries designated values of $S_{2n+2}$ onto the relations in
$\mathcal{T}_n$ that contain $\mathrm{Id}$, so $A$ is valid in $S_{2n+2}$ iff $\mathfrak{T}_n \models A$. Part (iii) follows from parts (i)
and (ii). □
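
Because every relation in the construction is a union of atoms, the isomorphism claim reduces to a finite computation on atom names. The sketch below encodes the atom products (87)–(90), builds the relations of (98)–(101) as sets of atoms, and compares relative multiplication, converse complementation, union, intersection, and a derived residual against the Sugihara operations on $S_{2n+2}$. The string encoding of atoms, the function names, and the reading of X → Y as ∼(∼Y|X) are assumptions made for illustration.

```python
from itertools import product

def atoms(n):
    """Atom names: "Id", ("L", k) for L_k, ("Li", k) for the converse of L_k."""
    return (["Id"] + [("L", k) for k in range(1, n + 1)]
            + [("Li", k) for k in range(1, n + 1)])

def atom_product(x, y, n):
    """Relative product of two atoms, following (87)-(90)."""
    if x == "Id":
        return {y}
    if y == "Id":
        return {x}
    (kx, i), (ky, j) = x, y
    if kx == ky:                        # (87) and (88)
        return {(kx, min(i, j))}
    a, b = (i, j) if kx == "L" else (j, i)   # a indexes L, b indexes the converse
    if a < b:                           # (89), first case
        return {("L", a)}
    if b < a:                           # (89), second case
        return {("Li", b)}
    out = {"Id"}                        # (89), third case
    for k in range(a, n + 1):
        out |= {("L", k), ("Li", k)}
    return out

def rel_product(X, Y, n):
    """Relative product of unions of atoms (it distributes over union)."""
    out = set()
    for x, y in product(X, Y):
        out |= atom_product(x, y, n)
    return frozenset(out)

def conv_compl(X, n):
    """Converse complement: complement of the converse within the atom set."""
    flip = {"L": "Li", "Li": "L"}
    conv = {a if a == "Id" else (flip[a[0]], a[1]) for a in X}
    return frozenset(atoms(n)) - conv

def T(i, n):
    """The relation T_i of (98)-(101), as a set of atom names."""
    if i == -(n + 1):
        return frozenset()
    if i < 0:
        return frozenset(("L", k) for k in range(1, n + 2 + i))
    base = frozenset(("L", k) for k in range(1, n + 1)) | {"Id"}
    return base | frozenset(("Li", k) for k in range(n + 2 - i, n + 1))

def s_imp(i, j):
    """Sugihara implication (65)."""
    return max(-i, j) if i <= j else min(-i, j)

def s_fuse(i, j):
    """Sugihara ◦, defined as ∼(i → ∼j)."""
    return -s_imp(i, -j)

def check_isomorphism(n):
    S = [v for v in range(-(n + 1), n + 2) if v != 0]
    assert len({T(i, n) for i in S}) == len(S)            # T is one-to-one
    for i in S:
        assert conv_compl(T(i, n), n) == T(-i, n)         # (103)
    for i, j in product(S, S):
        X, Y = T(i, n), T(j, n)
        assert rel_product(X, Y, n) == T(s_fuse(i, j), n)
        assert X | Y == T(max(i, j), n)
        assert X & Y == T(min(i, j), n)
        residual = conv_compl(rel_product(conv_compl(Y, n), X, n), n)
        assert residual == T(s_imp(i, j), n)
    return True
```

Running `check_isomorphism(n)` for small n exercises the same case distinctions as (104)–(112), but it only confirms the bookkeeping on atoms; the atom products themselves rest on the density argument given in the proof.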
§7. Relevant model structures. Relevant model structures, introduced in Routley &
Meyer (1972a, 1972b, 1973), provide sound and complete semantics for R. A relevant
model structure K = ⟨K, R, ∗, 0⟩ consists of a nonempty set K, a ternary relation R ⊆
K³, a unary operation ∗ : K → K, and a distinguished element 0 ∈ K, such that postulates
(p1)–(p6) hold for all a, b, c, d ∈ K. To state these postulates, we first adopt some definitions.
R²abcd iff ∃x (Rabx, Rxcd, x ∈ K) (d1)

R²a(bc)d iff ∃x (Rbcx, Raxd, x ∈ K) (d2)

b ≤a c iff Rabc. (d3)
The defining properties of relevant model structures are
R0aa (0-reflexivity) (p1)

Raaa (density) (p2)

R²abcd ⇒ R²acbd (p3)

R²0abc ⇒ Rabc (0-cancellation) (p4)

Rabc ⇒ Rac∗b∗ (p5)

a∗∗ = a (involution). (p6)
Next are four more properties of relevant model structures, as shown in Theorem 7.1 below.
Rabc ⇒ Rbac (commutativity) (comm)

R²abcd ⇒ R²a(bc)d (associativity) (p3′)

Rabc ⇒ Rc∗ab∗ (right rotation) (p5′)

Rabc ⇒ Rbc∗a∗ (left rotation). (p5″)

The next three properties do not hold in all relevant model structures.
R0ab iff a = b (0-identity) (p1′)
Rabc ⇒ Rcb∗a (right reflection) (p5‴)
Rabc ⇒ Ra∗cb (left reflection). (p5⁗)
The rotation properties (p5′) and (p5″) are equivalent in the presence of (p6).
The reflections of a triple ⟨a, b, c⟩ are ⟨c, b∗, a⟩, ⟨a∗, c, b⟩, and ⟨b∗, a∗, c∗⟩. The ro-
tations of a triple ⟨a, b, c⟩ are ⟨a, b, c⟩, ⟨c∗, a, b∗⟩, and ⟨b, c∗, a∗⟩. The ternary relation
[a, b, c] defined by
[a, b, c] := {⟨a, b, c⟩, ⟨c∗, a, b∗⟩, ⟨b, c∗, a∗⟩, ⟨a∗, c, b⟩, ⟨c, b∗, a⟩, ⟨b∗, a∗, c∗⟩} (114)
is called a cycle. It is the closure of {⟨a, b, c⟩} under (both left and right) rotations and
reflections. Any union of cycles will satisfy both rotation and reflection properties. The
size of a cycle is 1, 2, 3, or 6, depending on the behavior of ∗ on a, b, and c.
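
The possible cycle sizes can be seen concretely by computing (114) for a small involution. In the sketch below the involution is passed as a dictionary; the three-element example, in which ∗ swaps 1 and 2 and fixes 0, is an illustrative choice and not taken from the text.

```python
from itertools import product

def cycle(a, b, c, star):
    """The cycle [a, b, c] of (114); star maps each element to its image under *."""
    s = star
    return frozenset({
        (a, b, c), (s[c], a, s[b]), (b, s[c], s[a]),        # rotations
        (s[a], c, b), (c, s[b], a), (s[b], s[a], s[c]),     # reflections
    })

def cycle_sizes(star):
    """The set of sizes of all cycles over the domain of star."""
    return {len(cycle(a, b, c, star)) for a, b, c in product(star, repeat=3)}

# Illustrative involution on {0, 1, 2}: 0 is fixed, 1 and 2 are swapped.
SWAP = {0: 0, 1: 2, 2: 1}
# The identity involution on {0, 1, 2, 3}.
IDENT = {x: x for x in range(4)}
```

With `SWAP`, the cycles [0, 0, 0], [1, 1, 2], [1, 2, 0], and [0, 1, 2] have sizes 1, 2, 3, and 6 respectively. With the identity involution a cycle is the set of all permutations of its triple, so size 2 does not occur.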
A relevant model structure K = ⟨K, R, ∗, 0⟩ is normal if 0∗ = 0. If a relevant model
structure K satisfies (p1′) then K is normal, because R0∗0∗0∗ by (p2), R0∗00 by (p5) and
involution (p6), R00∗0 by (comm), so 0∗ = 0 by (p1′).
THEOREM 7.1. Properties (p1)–(p6) are equivalent to (p1), (p2), (p3′), (p4), (p5′), (p6),
and (comm).
Proof. Assume postulates (p1)–(p6). We must show (comm), (p3′), and (p5′). For this
we only need (p3), (p4), and (p5).
Assume Rabc. We have R0aa by (p1), so R²0abc by (d1), hence R²0bac by (p3),
and finally Rbac by (p4). Thus (comm) holds. (p5′) follows from (p5) by (comm). For
(p3′), assume R²abcd. Then R²bacd by (d1) and (comm), so R²bcad by (p3), and finally
R²a(bc)d by (d1), (comm), and (d2).
For the converse, assume (p1), (p2), (p3′), (p4), (p5′), (p6), and (comm). We get (p5)
from (p5′) and (comm). For (p3), assume R²abcd. Then R²bacd by (d1) and (comm),
hence R²b(ac)d by (p3′), and finally R²acbd by (d2), (comm), and (d1). □
Because of this theorem we think of a relevant model structure as one that satisfies
0-reflexivity, 0-cancellation, density, involution, associativity, commutativity, and both
rotations.
Suppose K = ⟨K, R, ∗, 0⟩ is a structure with distinguished element 0 ∈ K, ternary
relation R ⊆ K³, and unary operation ∗ : K → K. (K need not be a relevant model
structure.) For any a ∈ K and X ⊆ K, X is a-closed if y ∈ X whenever x ∈ X and
x ≤a y. Let (K) be the set of 0-closed subsets of K. A valuation on K is a function
ν : Sent → Sb(K) such that, for all A, B ∈ Sent,

ν(A) ∈ (K) if A ∈ Pv,

ν( A ∧ B) = ν(A) ∩ ν(B),

ν( A ∨ B) = ν(A) ∪ ν(B),

ν(A ◦ B) = {c : (∃a, b ∈ K )(Rabc and a ∈ ν( A) and b ∈ ν(B))},

ν(A → B) = {c : (∀a, b ∈ K )(if Rcab and a ∈ ν(A) then b ∈ ν(B))},


ν(∼A) = {a : a∗ ∉ ν(A)}.

We say that A ∈ Sent is valid in K if 0 ∈ ν( A) for every valuation ν on K.


Define operations ◦, →, and ∼ on subsets X, Y ⊆ K by

X ◦ Y = {c : (∃a, b ∈ K )(Rabc and a ∈ X and b ∈ Y )}, (115)

X → Y := {c : (∀a, b ∈ K )(if Rcab and a ∈ X then b ∈ Y )}, (116)

∼X := {a : a∗ ∉ X}. (117)

If K satisfies (p6) and (p5′) then X ◦ Y = ∼(X → ∼Y) and X → Y = ∼(X ◦ ∼Y). The
next lemma shows that the set of 0-closed subsets of a relevant model structure is closed
under union, intersection, and the operations ◦, →, and ∼.
LEMMA 7.2 (Routley & Meyer, 1973, lemma 1). If K is a relevant model structure, ν is a
valuation on K, and A ∈ Sent, then ν(A) ∈ (K).
Since (K) is closed under the operations ∪, ∩, ◦, →, and ∼, we define the algebra of
K to be
Pr(K) := ⟨(K), ∪, ∩, ◦, →, ∼⟩. (118)
In this definition we avoid distinguished elements, but they are sometimes included;
see Routley & Meyer (1973, p. 228) and Brady (2003, p. 81) for other choices of similarity
type for the algebra of K.
The algebra Pr(K) of a relevant model structure K is a subalgebra of a larger algebra
obtained by using the set of all subsets of K instead of (K). This is the complex algebra
of K, defined by

Cm(K) := ⟨Sb(K), ∪, ∩, ◦, →, ∼⟩. (119)

Note that if 0-identity property (p1′) holds in K, then Cm(K) coincides with the algebra
of K.
Furthermore, the complex algebra Cm(K) has no ∼-fixed points. To see this, suppose
X = ∼X = {a : a∗ ∉ X} for some X ⊆ K. Then a ∈ X iff a∗ ∉ X, for all a ∈ K.
In particular, for a = 0 we would have 0 ∈ X iff 0∗ ∉ X, but 0 = 0∗ in every relevant
model structure satisfying (p1′), a contradiction.
Every valuation ν on a relevant model structure K is a homomorphism from the algebra
of sentences ⟨Sent, ∨, ∧, ◦, →, ∼⟩ to the algebra of K, and conversely. Therefore A is valid
in K iff 0 ∈ ν(A) for every homomorphism ν from ⟨Sent, ∨, ∧, ◦, →, ∼⟩ to the algebra
of K.
The following two constructions are from Meyer & Routley (1973, part I) and Routley &
Meyer (1973). For both of them we assume K = ⟨K, R, ∗, 0⟩ where R ⊆ K³, ∗ : K → K,
and 0 ∈ K. Let 0′ ∉ K and let K′ := K ∪ {0′}. Define a unary operation ∗′ : K′ → K′ as
follows: a∗′ = a∗ if a ∈ K and 0′∗′ = 0′. Let R′ be the ternary relation on K′ defined by
R′ := R ∪ {⟨0′, 0′, 0′⟩}
∪ {⟨0′, 0′, a⟩ : ⟨0, 0, a⟩ ∈ R}
∪ {⟨0′, a, 0′⟩ : ⟨0, a, 0∗⟩ ∈ R}
∪ {⟨a, 0′, 0′⟩ : ⟨a, 0, 0∗⟩ ∈ R}
∪ {⟨a, b, 0′⟩ : ⟨a, b, 0∗⟩ ∈ R}
∪ {⟨0′, a, b⟩ : ⟨0, a, b⟩ ∈ R}
∪ {⟨a, 0′, b⟩ : ⟨a, 0, b⟩ ∈ R},
and let K′ := ⟨K′, R′, ∗′, 0′⟩. Then K′ is the normalization of K.
LEMMA 7.3 (Routley & Meyer, 1973). If K is a relevant model structure then the nor-
malization of K is a normal relevant model structure. If a sentence A ∈ Sent is invalid
in K, then A is also invalid in the normalization of K.
For a similar construction from Meyer & Routley (1973, part I), choose some 1′ ∉ K
and let K″ := K ∪ {1′}. Define a unary operation ∗″ : K″ → K″ as follows: a∗″ = a∗ if
a ∈ K and 1′∗″ = 1′. Define a ternary relation
R″ := R ∪ {⟨a, 1′, a⟩ : a ∈ K} ∪ {⟨1′, a, a⟩ : a ∈ K}
∪ {⟨a, a∗, 1′⟩ : a ∈ K} ∪ {⟨1′, 1′, 1′⟩}
and let K″ := ⟨K″, R″, ∗″, 1′⟩. Meyer & Routley (1973, part I) did not give a name to K″. We
will call it K-with-identity, and denote it briefly by K[1′].
LEMMA 7.4 (Meyer & Routley, 1973, part I). If K is a relevant model structure then K[1′]
is a normal relevant model structure that satisfies (p1′). Furthermore, if K is normal then
exactly the same sentences are valid in both K and K[1′].
Next are the Routley–Meyer completeness results.
THEOREM 7.5 (Routley & Meyer, 1973; Meyer & Routley, 1973). The following state-
ments are equivalent for every sentence A ∈ Sent.
(i) ⊢R A.
(ii) A is valid in every relevant model structure.
(iii) A is valid in every normal relevant model structure.
(iv) A is valid in every relevant model structure that satisfies (p1′).
Proof. The equivalence of (i) and (ii) is theorem 3 of Routley & Meyer (1973).
Obviously (ii) implies (iii), and (iii) implies (iv) since every relevant model structure that
satisfies (p1′) is normal.
To show that (iv) implies (i) it is enough to prove that every nontheorem of R is invalid
in some (normal) relevant model structure that satisfies (p1′). Assume ⊬R A. Since (ii)
implies (i), there exists some relevant model structure K such that A is not valid in K. Let
K′ be the normalization of K and let K″ be K′[1′]. Thus K″ has two more elements than K.
Since A is invalid in K, it is also invalid in the normalization K′ of K by Lemma 7.3. But
the same sentences are valid in both K′ and K″ by Lemma 7.4, so A is also invalid in K″.
Since K″ is a relevant model structure that satisfies property (p1′), we are done. □
Part (iv) of Theorem 7.5 inspired the following question, which was asked in Maddux
(2007).
(Q3) Is it true that ⊢R A iff A is valid in every relevant model structure that satis-
fies (p1′), (p5‴), and (p5⁗)?
In addressing this question, Kowalski (2007) defines a system B whose language contains
only ∧, ∨, and →. The axioms of B are (A1)–(A8) and the rules are modus ponens,
Adjunction, Prefixing, and Suffixing. He proves that ⊢B A iff A is valid in every structure
that satisfies (p6), (p1′), (p5‴), (p5⁗), plus the condition that Ra0b iff a = b.

§8. Incompleteness of R for Rcd . We answer question (Q1) here, for which we will
need
THEOREM 8.1. Let $U$ be a nonempty set and assume $A, B, C, D, E, F, G \subseteq U^2$. Then
$$\mathrm{Id} \subseteq A|B \cap C|D \cap E|F \to A|\big[A^{-1}|C \cap B|D^{-1} \cap (A^{-1}|E \cap B|F^{-1})|(E^{-1}|C \cap F|D^{-1})\big]|D, \tag{L}$$
$$\begin{aligned}
\mathrm{Id} \subseteq A|B \cap C|D \cap E|F \to{} & \big[(A \cap {\sim}A)|B \cap C|D \cap E|F\big] \cup \big[A|B \cap C|(D \cap {\sim}D) \cap E|F\big] \\
& \cup \big[A|B \cap C|D \cap (E \cap {\sim}E)|F\big] \cup \big[A|B \cap C|D \cap E|(F \cap {\sim}F)\big] \\
& \cup A|\big[A|C \cap B|D \cap (A|E \cap B|F)|(E|C \cap F|D)\big]|D,
\end{aligned} \tag{L′}$$
$$\mathrm{Id} \subseteq A \cap (B \cap C|D)|(E \cap F|G) \to C|\big[(C^{-1}|A \cap D|E)|G^{-1} \cap D|F \cap C^{-1}|(A|G^{-1} \cap B|F)\big]|G, \tag{M}$$
$$\begin{aligned}
\mathrm{Id} \subseteq A \cap (B \cap C|D)|(E \cap F|G) \to{} & \big[A \cap (B \cap (C \cap {\sim}C)|D)|(E \cap F|G)\big] \\
& \cup \big[A \cap (B \cap C|D)|(E \cap F|(G \cap {\sim}G))\big] \\
& \cup C|\big[(C|A \cap D|E)|G \cap D|F \cap C|(A|G \cap B|F)\big]|G.
\end{aligned} \tag{M′}$$
Parts (L) and (M) are in the calculus of relations, but they are not part of relevance logic
because they involve conversion. Accompanying (L) and (M) are their consequences (L′)
and (M′). These use only the operations allowed in relevance logic but, as is shown below,
their corresponding sentences are not provable in R. Infinitely many more such examples
can be found in Mikulás (2009).
Now (L), (M), and the equations used by Mikulás (2009) all have the same special form.
There is a general procedure applicable to such equations which produces (L′) and (M′)
from (L) and (M), respectively. There are also procedures that work on all equations if a
particular constant is available in the language. However, we will not go further into these
matters.
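Proof. We only prove (M) and (M′). The proofs of (L) and (L′) are similar. By (17), (M)
and (M′) are equivalent to inclusions whose left side is the relation $A \cap (B \cap C|D)|(E \cap F|G)$.
For (M), suppose $\langle v, w \rangle \in A \cap (B \cap C|D)|(E \cap F|G)$. Then $\langle v, w \rangle \in A$ and there is
some $x \in U$ such that $\langle v, x \rangle \in B$, $\langle v, x \rangle \in C|D$, $\langle x, w \rangle \in E$, and $\langle x, w \rangle \in F|G$. Hence
there are $y, z \in U$ such that $\langle v, y \rangle \in C$, $\langle y, x \rangle \in D$, $\langle x, z \rangle \in F$, and $\langle z, w \rangle \in G$. It now
follows from only $\langle v, w \rangle \in A$, $\langle v, x \rangle \in B$, $\langle x, w \rangle \in E$, $\langle v, y \rangle \in C$, $\langle y, x \rangle \in D$, $\langle x, z \rangle \in F$,
and $\langle z, w \rangle \in G$ that $\langle v, w \rangle$ is in the relation in the conclusion of (M), that is,
$$\langle v, w \rangle \in C|\big[(C^{-1}|A \cap D|E)|G^{-1} \cap D|F \cap C^{-1}|(A|G^{-1} \cap B|F)\big]|G.$$
Indeed, $y$ and $z$ witness the two outer products, and $\langle y, z \rangle$ belongs to the bracketed relation
because $w$ witnesses $(C^{-1}|A \cap D|E)|G^{-1}$, $x$ witnesses $D|F$, and $v$ witnesses $C^{-1}|(A|G^{-1} \cap B|F)$.
For (M′), suppose $\langle v, w \rangle \in A \cap (B \cap C|D)|(E \cap F|G)$. Then, as before, there are $x, y, z \in U$
such that $\langle v, w \rangle \in A$, $\langle v, x \rangle \in B$, $\langle x, w \rangle \in E$, $\langle v, y \rangle \in C$, $\langle y, x \rangle \in D$, $\langle x, z \rangle \in F$, and
$\langle z, w \rangle \in G$. If $\langle y, v \rangle \notin C$ or $\langle w, z \rangle \notin G$, then $\langle v, y \rangle \in C \cap {\sim}C$ or $\langle z, w \rangle \in G \cap
{\sim}G$, respectively, and in either case $\langle v, w \rangle$ belongs to one of the first two relations in the
conclusion of (M′). Hence
$$\langle v, w \rangle \in \big[A \cap (B \cap (C \cap {\sim}C)|D)|(E \cap F|G)\big] \cup \big[A \cap (B \cap C|D)|(E \cap F|(G \cap {\sim}G))\big].$$
On the other hand, if $\langle y, v \rangle \in C$ and $\langle w, z \rangle \in G$, then $\langle v, y \rangle \in C \cap C^{-1}$ and $\langle z, w \rangle \in
G \cap G^{-1}$, so
$$\langle v, w \rangle \in A \cap (B \cap (C \cap C^{-1})|D)|(E \cap F|(G \cap G^{-1})).$$
Now apply (M) with $C \cap C^{-1}$ and $G \cap G^{-1}$ in place of $C$ and $G$, respectively, and conclude
that $\langle v, w \rangle$ belongs to a relation contained in the third relation in the conclusion of (M′), as
follows.
$$\begin{aligned}
\langle v, w \rangle &\in (C \cap C^{-1})|\big[((C \cap C^{-1})^{-1}|A \cap D|E)|(G \cap G^{-1})^{-1} \\
&\qquad \cap D|F \cap (C \cap C^{-1})^{-1}|(A|(G \cap G^{-1})^{-1} \cap B|F)\big]|G \\
&= (C \cap C^{-1})|\big[((C \cap C^{-1})|A \cap D|E)|(G \cap G^{-1}) \\
&\qquad \cap D|F \cap (C \cap C^{-1})|(A|(G \cap G^{-1}) \cap B|F)\big]|G \\
&\subseteq C|\big[(C|A \cap D|E)|G \cap D|F \cap C|(A|G \cap B|F)\big]|G. \qquad \Box
\end{aligned}$$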
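
As a sanity check on the shape of (M), the inclusion to which it reduces by (17) can be tested on random finite relations. The sketch below builds both sides from relative product, converse, and intersection on a four-element set; the names and the sampling density are illustrative assumptions, and a passing run is only evidence, not a substitute for the proof.

```python
import random
from itertools import product

U = range(4)
ALL = [p for p in product(U, U)]

def comp(r, s):
    """Relative product r|s."""
    return frozenset((x, y) for (x, z) in r for (z2, y) in s if z == z2)

def conv(r):
    """Converse of r."""
    return frozenset((y, x) for (x, y) in r)

def m_sides(A, B, C, D, E, F, G):
    """Left and right sides of the inclusion equivalent to (M)."""
    lhs = A & comp(B & comp(C, D), E & comp(F, G))
    inner = (comp(comp(conv(C), A) & comp(D, E), conv(G))
             & comp(D, F)
             & comp(conv(C), comp(A, conv(G)) & comp(B, F)))
    rhs = comp(comp(C, inner), G)
    return lhs, rhs

def check_M(trials=300, seed=2):
    rng = random.Random(seed)
    nonempty = 0
    for _ in range(trials):
        rels = [frozenset(p for p in ALL if rng.random() < 0.5) for _ in range(7)]
        lhs, rhs = m_sides(*rels)
        assert lhs <= rhs
        nonempty += bool(lhs)
    return nonempty  # number of trials in which the left side was nonempty
```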

We use the abbreviation A|B := ∼(B → ∼A) to transcribe (L′) and (M′) into sentences
(L″), (M″) ∈ Sent.

(M″)   A ∧ (B ∧ C|D)|(E ∧ F|G) →
          (A ∧ (B ∧ (C ∧ ∼C)|D)|(E ∧ F|G))
          ∨ (A ∧ (B ∧ C|D)|(E ∧ F|(G ∧ ∼G)))
          ∨ C|((C|A ∧ D|E)|G ∧ D|F ∧ C|(A|G ∧ B|F))|G

(L″)   A|B ∧ C|D ∧ E|F →
          ((A ∧ ∼A)|B ∧ C|D ∧ E|F) ∨ (A|B ∧ C|(D ∧ ∼D) ∧ E|F)
          ∨ (A|B ∧ C|D ∧ (E ∧ ∼E)|F) ∨ (A|B ∧ C|D ∧ E|(F ∧ ∼F))
          ∨ A|(A|C ∧ B|D ∧ (A|E ∧ B|F)|(E|C ∧ F|D))|D.

The validity of (L″) and (M″) in R was established by Theorem 8.1. However,
THEOREM 8.2. ⊬R (L″) and ⊬R (M″).
Proof. Let K28 = ⟨K, R28, ∗, 0⟩, where K = {0, 1, 2, 3}, x∗ = x for every x ∈ K, and
R28 is the following ternary relation on K with 28 triples.
R28 := [0, 0, 0] ∪ [1, 1, 1] ∪ [2, 2, 2] ∪ [3, 3, 3] ∪
       [0, 1, 1] ∪ [0, 2, 2] ∪ [0, 3, 3] ∪
       [1, 2, 2] ∪ [3, 1, 1] ∪ [2, 3, 3] ∪ [1, 2, 3].
K28 is (isomorphic to) the atom structure of the relation algebra 4265 from Maddux (2006).
K28 is a normal relevant model structure that satisfies (p1 ) and the reflection properties
(p5 ) and (p5 ). By (p1 ), the algebra of K28 is the same as its complex algebra Cm (K28 ).
Neither (L″) nor (M″) is valid in K28. Both (L″) and (M″) will fail if we choose variables
A, B, C, D, E, F, G ∈ Pv and a valuation ν such that ν(A) = {1}, ν(B) = {1}, ν(C) =
{3}, ν(D) = {2}, ν(E) = {1}, ν(F) = {3}, and ν(G) = {1}. To check this it is convenient,
in evaluating the terms in (L″) and (M″), to have the products of singletons in Table 6.
By Theorem 7.5, we conclude that (L″) and (M″) are not provable in R. □

Table 6. Products of singletons in the complex algebra of K28

 ◦     {0}    {1}        {2}        {3}
{0}    {0}    {1}        {2}        {3}
{1}    {1}    {0,1,3}    {2,3}      {1,2}
{2}    {2}    {2,3}      {0,1,2}    {1,3}
{3}    {3}    {1,2}      {1,3}      {0,2,3}
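The singleton products can be recomputed directly from the definition of R28. The sketch below rests on two assumptions that are not spelled out in this section: since x∗ = x for every x, a cycle [x, y, z] is taken to be the set of all permutations of (x, y, z) (which does give 28 triples in total), and the product of two sets X and Y is taken to be {z : (x, y, z) ∈ R28 for some x ∈ X, y ∈ Y}. The names `cycle`, `R28`, and `product_of` are hypothetical.

```python
from itertools import permutations

# The eleven cycles listed in the definition of R28.
CYCLES = [
    (0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3),
    (0, 1, 1), (0, 2, 2), (0, 3, 3),
    (1, 2, 2), (3, 1, 1), (2, 3, 3), (1, 2, 3),
]


def cycle(x, y, z):
    """Assumed reading of [x, y, z] when * is the identity: all permutations."""
    return set(permutations((x, y, z)))


R28 = set().union(*(cycle(*c) for c in CYCLES))


def product_of(xs, ys):
    """Assumed complex-algebra product: every z with (x, y, z) in R28."""
    return {z for (x, y, z) in R28 if x in xs and y in ys}


print("number of triples:", len(R28))
for a in range(4):
    row = [sorted(product_of({a}, {b})) for b in range(4)]
    print({a}, row)
```

Because R28 is closed under permutations on this reading, the resulting table is symmetric and 0 acts as an identity for singletons.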

§9. Conclusions. Algebras for relevance logic can be created in an abstract algebraic
way: add operations for the connectives and distinguished elements for the constants, and
impose on the operations and distinguished elements postulates that mimic the axioms.
Operations in individual algebras may be specified by tables (in the finite case) or rules,
and are designed to validate the axioms of the logic. Although algebraization may be
mathematically illuminating, it is open to the philosophical charge that “. . . algebraic char-
acterizations . . . are merely formal, exhibiting no connection with the intended meanings
of the logical constants” (Copeland, 1979, p. 405).
Somewhat less abstract are the algebras of relevant model structures. Here the elements
of the algebras are actually sets, so two of the operations, namely intersection and union,
need not be specified by rules or postulates. But the other operations arise abstractly from
the ternary relation R and the unary operation ∗ of the structure. Postulates imposed on
R and ∗ are designed to validate the axioms. Indeed, many books and papers have lists of
axioms (which are essentially second-order statements about relevant model structures)
and their corresponding postulates on R and ∗ (which are first-order statements about
relevant model structures). Once again, “If the only constraint on ∗ is that the resulting
theory should validate the right set of sentences, then we are indeed in the presence of
merely formal model theory” (Copeland, 1979, p. 410).
In contrast, the elements of relational relevance algebras are binary relations, none of
the operations are abstractly defined, and there are no postulates for R. The operations
of relational relevance algebras are just standard set-theoretically defined operations on
binary relations. Of course, some axioms of R fail in R. The reasons for their failure
are given in Theorem 5.1, from which we can see that the commutative dense relational
relevance algebras will satisfy all the axioms of R. Focusing attention on the subclass of
commutative dense algebras in R is a response to the axioms of R. For the system of Basic
Logic consisting of axioms (A1)–(A20) and all nine rules, no such response is needed. The
natural class of models is R, and Basic Logic is a finite approximation to R-logic.
One should expect ad hoc semantics to be sound and complete because they are designed
for that purpose. But R-logic, Rcd -logic, Rcdt -logic, and so forth, are part of the nineteenth
century calculus of relations, while R and RM are mid-twentieth century inventions that
just happen to be a proper subsystem of Rcd -logic and exactly the same as Rcdt -logic,
respectively. Is this just a pure coincidence, or is there some underlying reason? There is
no sign that the founders of relevance logic were trying to capture properties of binary
relations in their axioms, so perhaps it is a coincidence. At least the binary relational
interpretation escapes the charge that “. . . it is completely obscure what meaning is given
to negation in the Routley–Meyer theory . . . ” (Copeland, 1979, p. 408). The meaning
of negation is quite clear; ∼ is converse complementation. Anderson & Belnap (1975,
p. 345) ask, “How then to interpret ◦? We confess puzzlement.” In the binary relational
interpretation, ◦ is composition.
Philosophical considerations are (or, at least, ought to be) constrained by mathematical
theorems, so we give here a summary of the main results in this paper (Theorems 4.1, 4.2,
5.1, Corollary 5.2, and Theorems 6.2, 8.1, and 8.2).
(L″), (M″) ∈ R-logic ⊂ Rcd-logic ⊂ Rcdt-logic = RM

(L″), (M″) ∉ R ⊂ Rcd-logic ⊂ {M0}-logic = BM

Rcd-logic ⊂ {RM84}-logic.

BIBLIOGRAPHY
Ackermann, W. (1956). Begründung einer strengen Implikation. Journal of Symbolic
Logic, 21, 113–128.
Anderson, A. R., & Belnap, N. D. Jr. (1975). Entailment. Princeton, NJ: Princeton
University Press.
Anderson, A. R., Belnap, N. D. Jr. & Dunn, J. M. (1992). Entailment. The Logic of
Relevance and Necessity. Vol. II. Princeton, NJ: Princeton University Press.
Andréka, H., Givant, S. R., & Németi, I. (1997). Decision problems for equational theories
of relation algebras. Memoirs of the American Mathematical Society, 126(604), xiv+126.
Belnap, N. D. Jr. (1960). Entailment and relevance. Journal of Symbolic Logic, 25,
144–146.
Bimbo, K., Dunn, J. M., & Maddux, R. D. (2009). Relevance logics and relation algebras.
The Review of Symbolic Logic, 2(1), 102–131.
Brady, R. T. (editor.) (2003). Relevant Logics and Their Rivals. Volume II. Aldershot,
Hants, England. Ashgate Publishing Ltd.
Church, A. (1951). The weak positive implicational propositional calculus. Journal of
Symbolic Logic, 16(5), 238.
Copeland, B. J. (1979). On when a semantics is not a semantics: Some reasons for disliking
the Routley-Meyer semantics for relevance logic. Journal of Philosophical Logic, 8(4),
399–413.
De Morgan, A. (1856). On the symbols of logic, the theory of the syllogism, and in
particular of the copula, and the application of the theory of probabilities to some
questions in the theory of evidence. Transactions of the Cambridge Philosophical
Society, 9, 79–127. Reprinted in De Morgan (1966).
De Morgan, A. (1864a). On the syllogism: III, and on logic in general. Transactions of the
Cambridge Philosophical Society, 10, 173–230. Reprinted in De Morgan (1966).
De Morgan, A. (1864b). On the syllogism: IV, and on the logic of relations. Transactions
of the Cambridge Philosophical Society, 10, 331–358. Reprinted in De Morgan (1966).
De Morgan, A. (1966). “On the Syllogism” and Other Logical Writings. New Haven, CT:
Yale University Press.
Došen, K. (1992). The first axiomatization of relevant logic. Journal of Philosophical
Logic, 21(4), 339–356.
Fine, K. (1974). Models for entailment. Journal of Philosophical Logic, 3, 347–372.
Kowalski, T. (2007). Weakly associative relation algebras hold the key to the universe.
Bulletin of the Section of Logic, 36(3/4), 145–157.
Lyndon, R. C. (1961). Relation algebras and projective geometries. Michigan Math
Journal, 8, 21–28.
Maddux, R. D. (1991). The origin of relation algebras in the development and
axiomatization of the calculus of relations. Studia Logica, 50(3–4), 421–455.
Maddux, R. D. (2006). Relation Algebras, Volume 150 of Studies in Logic and Foundations
of Mathematics. Amsterdam, The Netherlands: Elsevier B. V.
Maddux, R. D. (2007). Relevance logic and the calculus of relations [abstract].
International Conference on Order, Algebra and Logics, Vanderbilt University, June
13, 2007, pp. 1–3. Available from: www.math.vanderbilt.edu/∼oa12007/submissions/
submission 10.pdf.
Meyer, R. K., & Routley, R. (1973). Classical relevant logics. I, II. Studia Logica, 32,
51–68; ibid. 33 (1974), 183–194.
Mikulás, S. (2009). Algebras of relations and relevance logic. Journal of Logic and
Computation, 19(2), 305–321.
Moh, S.-K. (1950). The deduction theorems and two new logical systems. Methodos, 2,
56–75.
Peirce, C. S. (1870). Description of a notation for the logic of relatives, resulting from an
amplification of the conceptions of Boole’s calculus of logic. Memoirs of the American
Academy of Sciences, 9, 317–378. Reprinted in Peirce (1960) and Peirce (1984).
Peirce, C. S. (1880). On the algebra of logic. American Journal of Mathematics, 3, 15–57.
Reprinted in Peirce (1960).
Peirce, C. S. (1883). Note B: The logic of relatives. In Peirce, C. S., editor. Studies in
Logic by Members of the Johns Hopkins University. Boston, MA: Little, Brown, and Co.,
pp. 187–203. Reprinted by John Benjamins Publishing Co., Amsterdam and
Philadelphia, 1983, pp. lviii, vi+203.
Peirce, C. S. (1885). On the algebra of logic; a contribution to the philosophy of notation.
American Journal of Mathematics, 7, 180–202. Reprinted in Peirce (1960).
Peirce, C. S. (1897). The logic of relatives. The Monist, 7, 161–217. Reprinted in Peirce
(1960).
Peirce, C. S. (1960). Collected Papers. Cambridge, MA: The Belknap Press of Harvard
University Press.
Peirce, C. S. (1984). Writings of Charles S. Peirce. Vol. 2 (chronological edition).
Bloomington, IN: Indiana University Press.
Routley, R., & Meyer, R. K. (1972a). The semantics of entailment. II. Journal of
Philosophical Logic, 1(1), 53–73.
Routley, R., & Meyer, R. K. (1972b). The semantics of entailment. III. Journal of
Philosophical Logic, 1(2), 192–208.
Routley, R., & Meyer, R. K. (1973). The semantics of entailment. I. In Truth, Syntax and
Modality (Proc. Conf. Alternative Semantics, Temple Univ., Philadelphia, Pa., 1970),
pp. 199–243. Studies in Logic and the Foundations of Math., Vol. 68. Amsterdam, The
Netherlands: North-Holland.
Routley, R., Plumwood, V., Meyer, R. K., & Brady, R. T. (1982). Relevant Logics and Their
Rivals. Part I. Atascadero, CA: Ridgeview Publishing Co.
Routley, R., & Routley, V. (1972). The semantics of first degree entailment. Noûs, 6(4),
335–359.
Schröder, F. W. K. E. (1966). Vorlesungen über die Algebra der Logik (exacte Logik),
Volume 3, “Algebra und Logik der Relative,” Part I (second edition). Bronx, NY:
Chelsea. First published by B. G. Teubner, Leipzig, 1895.
Sugihara, T. (1955). Strict implication free from implicational paradoxes. Memoirs of the
Faculty of Liberal Arts, Series 1, no. 4, 55–59.
Tarski, A. (1941). On the calculus of relations. Journal of Symbolic Logic, 6, 73–89.
Tarski, A., & Givant, S. (1987). A Formalization of Set Theory Without Variables.
Providence, RI: American Mathematical Society.
Urquhart, A. (1972). Semantics for relevant logics. Journal of Symbolic Logic, 37,
159–169.
Urquhart, A. (1984). The undecidability of entailment and relevant implication. Journal of
Symbolic Logic, 49(4), 1059–1073.
Urquhart, A. (1996). Duality for algebras of relevant logics. Studia Logica, 56(1–2),
263–276.

DEPARTMENT OF MATHEMATICS
396 CARVER HALL
IOWA STATE UNIVERSITY
AMES, IA 50011
E-mail: maddux@iastate.edu
THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010

TRANSFINITE NUMBERS IN PARACONSISTENT SET THEORY


ZACH WEBER
School of Philosophy and Historical Inquiry, University of Sydney

Abstract. This paper begins an axiomatic development of naive set theory—the consequences
of a full comprehension principle—in a paraconsistent logic. Results divide into two sorts. There is
classical recapture, where the main theorems of ordinal and Peano arithmetic are proved, showing
that naive set theory can provide a foundation for standard mathematics. Then there are major
extensions, including proofs of the famous paradoxes and the axiom of choice (in the form of the
well-ordering principle). At the end I indicate how later developments of cardinal numbers will
lead to Cantor’s theorem, the existence of large cardinals, and a counterexample to the continuum
hypothesis.

§1. Introduction. The axioms of naive set theory define sets through existence and
uniqueness conditions. The first axiom is the principle of comprehension, that any collec-
tion of objects is a set, and is itself an object. The second is the principle of extensionality,
that the members of a set completely determine the identity of that set. On the basis of
these principles alone, the mathematics of set theory can be developed, as well as some
novel inconsistent mathematics, as this article begins to demonstrate.
The comprehension principle is inconsistent. In the presence of classical logic, it
is outright incoherent. Since Russell told this to Frege, it has been the cause of some
surprise and consternation. The most prevalent response has been to maintain classical
logic, and to adopt Zermelo’s (1908) selection of some instances of the comprehension
principle. This is not the only possible response. Classical logic makes comprehension
absurd; the consternation, often tracked by using the word ‘paradox’, is due to the deep
sense that comprehension is not absurd. Others have tried to maintain comprehension, and
adopt, say, Routley and Meyer’s (1976) selection of logical axioms with which to reason
about it.
The resulting theory has several important features. First, it contains the basic theorems
of standard ZFC set theory, initially proving the ZF axioms and building to the theory
of ordinals and Peano postulates. This has been called the classical recapture, and
has proved startlingly elusive until now. The demands placed on the underlying logic
are simply very stringent. Speaking to this point, the usually optimistic authors of
Meyer et al. (1978) conclude in dramatic fashion that “naive set theory is untenable”
(p. 128). Priest summarizes the situation:
Since the early days of paraconsistent logic, it has been clear that the
rejection of ex contradictione [quodlibet] is not possible without the
rejection of other things which appear to be much more integral to classical
reasoning. . . . Several logicians (including Brady, Meyer, Mortensen,
Priest and Routley) have attempted to reconstruct various fragments
of classical reasoning. . . . While the results are not definitive, they
are not terribly encouraging. Though . . . a highly important—indeed,
essential—enterprise, it would now appear that the aim of reconstructing
sensible classical reasoning in this way may not be realizable
(Priest, 2006, p. 222).

Received: June 2, 2009
© Association for Symbolic Logic, 2010. doi:10.1017/S1755020309990281
This paper begins to show that the goal is realizable. Recapture of higher theories goes
as far as showing the Peano axioms hold. Not all classical consequences of, for example,
Peano’s axioms are paraconsistent consequences—at least not automatically—so the issue
of what can be proved from here is still open. Classical recapture even to this modest point,
though, shows that paraconsistent set theory is recognizably set theory; it shows, a fortiori,
that the assumption of consistency in basic set theory is unnecessary.
Second, paraconsistent set theory can settle some disputes of its classical counterpart.
For a start, the infamous paradoxes are just theorems. The existence of various intuitive
collections, such as the set of all sets and the set of all ordinals, is immediate; so the
theory allows us to work with objects that have elsewhere been called ‘proper classes’
and have remained rather philosophically and mathematically mysterious. With these, a
proof of Cantor’s well-ordering principle, and so Zermelo’s axiom of choice, can be given.
While cardinal numbers are not treated here, I do indicate at the end some straightforward
arguments that would establish, again from the set concept alone, the existence of large
cardinals and a decision on the generalized continuum hypothesis.
Finally, the theory begins to address new mathematical questions that arise in coher-
ent inconsistent settings, showing that objects beyond the consistent are mathematically
rich. I make no attempt to interpret or philosophically defend at any length the results.
I make only an uncontroversial claim to rigor: If one draws out consequences of the
naive set concept in a first-order paraconsistent logic, then this is some of what one will
find.

§2. Logic. A paraconsistent logic is one in which the inference Φ, ¬Φ ⊢ Ψ, called
explosion, does not hold for every sentence Φ, Ψ. There are many paraconsistent logics;
but most of them are not appropriate for treating naive set theory. Comprehension is an
extremely powerful principle and can only be channeled by weak logics. Let us briefly
consider some of the constraints, leading to the present choice of logic.
Prominently, any logic with contraction (Φ → (Φ → Ψ)) → (Φ → Ψ), or its
mate, axiomatic or pseudo-modus ponens Φ ∧ (Φ → Ψ) → Ψ, will explode in the
presence of naive comprehension (Meyer et al., 1978). Further, almost no paraconsis-
tent logic can include the disjunctive syllogism; then any logic without a non-extensional
implication connective is all but useless, since we are actually interested in proving and
detaching theorems. These considerations eliminate da Costa’s C systems and Priest’s LP
as suitable background logics for a robust set theory (although there are some interest-
ing results from both, e.g., da Costa (2000) in Batens et al. (2000) and Restall (1992),
respectively).
For the purposes of set theory, we also want a logic that respects some natural behaviors
of sets. For example, given that two sets with all and only the same members are identical,
it should also be the case that sets with different members are not identical. Or again,
when y = {z : Φ(z)} and ¬Φ(x), then it should be that x ∉ y. These are fairly strong
requirements on negation. When formalized (e.g., with a counterexample Axiom VIII
below), these demands mean that set negation excludes the middle, tertium non datur:
Φ ∨ ¬Φ for any sentence Φ. This is to be expected. An exhaustive universe is explicitly
postulated by Cantor and Frege; it is the cause of inconsistency in comprehension to begin
with. In light of the noted duality between paraconsistent and paracomplete (i.e., partial or
intuitionistic) systems, it is at least reasonable to investigate the mathematics of a gap-free
theory.
Essentially, we want the strongest logic possible that does not explode when given a
comprehension principle. Once all these are tailored and formalized, we obtain one of the
original tools fashioned for the job. Let us have the language of first-order set theory: with
primitives ∧, ¬, →, ∀, =, and ∈; variables x, y, z, . . .; names a, b, c, . . .; and formulae
Φ, Ψ, ϒ, . . ., built up by standard formation rules. The usual shorthand¹ is used: Φ ∨ Ψ
for ¬(¬Φ ∧ ¬Ψ); Φ ↔ Ψ for (Φ → Ψ) ∧ (Ψ → Φ); ∃ is ¬∀¬.
2.1. Axioms. All instances of the following schemata are theorems:
I      Φ → Φ
IIa    Φ ∧ Ψ → Φ
IIb    Φ ∧ Ψ → Ψ
III    Φ ∧ (Ψ ∨ ϒ) → (Φ ∧ Ψ) ∨ (Φ ∧ ϒ)        (distribution)
IV     (Φ → Ψ) ∧ (Ψ → ϒ) → (Φ → ϒ)            (conjunctive syllogism)
V      (Φ → Ψ) ∧ (Φ → ϒ) → (Φ → Ψ ∧ ϒ)
VI     (Φ → ¬Ψ) → (Ψ → ¬Φ)                    (contraposition)
VII    ¬¬Φ → Φ                                (double negation elimination)
VIII   (Φ → Ψ) → ¬(Φ ∧ ¬Ψ)                    (counterexample)
IXa    (Φ → Ψ) → [(Ψ → ϒ) → (Φ → ϒ)]
IXb    (Φ → Ψ) → [(ϒ → Φ) → (ϒ → Ψ)]          (hypothetical syllogisms)
X      ∀xΦ → Φ(y/x)
XI     ∀x(Φ → Ψ) → (Φ → ∀xΨ)
XII    ∀x(Φ ∨ Ψ) → Φ ∨ ∀xΨ.
In Axiom X, y is free for x in Φ. In Axioms XI and XII, x is not free in Φ.
The hypothetical syllogism pair IXa and IXb are called suffixing and prefixing, respec-
tively. Without these we have the weak relevant logic DLQ, dialethic logic with quantifiers,
introduced in Routley & Meyer (1976) as ‘dialectical’ logic. Adding them amplifies the
logic to TLQ.
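As a propositional sanity check, the quantifier-free schemata I to IXb can be evaluated in a small three-valued matrix. The sketch below uses the standard RM3 truth tables, which is an assumption made purely for illustration and is not the semantics used in this paper: RM3 also validates contraction, so it is strictly stronger than DLQ or TLQ and cannot stand in for them. What the check does suggest is that the axioms are jointly satisfiable in a model where explosion and disjunctive syllogism both fail. All names in the code are hypothetical.

```python
from itertools import product

# RM3 truth values (assumed illustrative model): false, both, true.
F, B, T = -1, 0, 1
VALUES = (F, B, T)
DESIGNATED = {B, T}


def neg(a):
    return -a


def conj(a, b):
    return min(a, b)


def disj(a, b):
    return max(a, b)


def imp(a, b):
    """RM3 implication: max(-a, b) when a <= b, otherwise min(-a, b)."""
    return max(-a, b) if a <= b else min(-a, b)


# Propositional axiom schemata, with p, q, r standing in for the three formulae.
AXIOMS = {
    "I": lambda p, q, r: imp(p, p),
    "IIa": lambda p, q, r: imp(conj(p, q), p),
    "IIb": lambda p, q, r: imp(conj(p, q), q),
    "III": lambda p, q, r: imp(conj(p, disj(q, r)), disj(conj(p, q), conj(p, r))),
    "IV": lambda p, q, r: imp(conj(imp(p, q), imp(q, r)), imp(p, r)),
    "V": lambda p, q, r: imp(conj(imp(p, q), imp(p, r)), imp(p, conj(q, r))),
    "VI": lambda p, q, r: imp(imp(p, neg(q)), imp(q, neg(p))),
    "VII": lambda p, q, r: imp(neg(neg(p)), p),
    "VIII": lambda p, q, r: imp(imp(p, q), neg(conj(p, neg(q)))),
    "IXa": lambda p, q, r: imp(imp(p, q), imp(imp(q, r), imp(p, r))),
    "IXb": lambda p, q, r: imp(imp(p, q), imp(imp(r, p), imp(r, q))),
}


def valid(formula):
    """A schema is valid if it is designated under every assignment."""
    return all(formula(p, q, r) in DESIGNATED
               for p, q, r in product(VALUES, repeat=3))


def rule_counterexamples(premises, conclusion):
    """Assignments making every premise designated but not the conclusion."""
    return [(p, q) for p, q in product(VALUES, repeat=2)
            if all(prem(p, q) in DESIGNATED for prem in premises)
            and conclusion(p, q) not in DESIGNATED]


explosion = rule_counterexamples(
    [lambda p, q: p, lambda p, q: neg(p)], lambda p, q: q)
disjunctive_syllogism = rule_counterexamples(
    [lambda p, q: disj(p, q), lambda p, q: neg(p)], lambda p, q: q)

for name, formula in AXIOMS.items():
    print(name, valid(formula))
print("explosion counterexamples:", explosion)
print("disjunctive syllogism counterexamples:", disjunctive_syllogism)
```

The quantifier axioms X to XII are outside the scope of this propositional check.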
2.2. Rules. The following rules are valid:
I      Φ, Ψ ⊢ Φ ∧ Ψ                           (adjunction)
II     Φ, Φ → Ψ ⊢ Ψ                           (modus ponens)
III    Φ → Ψ, ϒ → Θ ⊢ (Ψ → ϒ) → (Φ → Θ)
IV     Φ ⊢ ∀xΦ
V      x = y ⊢ Φ(x) → Φ(y)                    (substitution).
Modus ponens is also valid in the single premise form: Φ ∧ (Φ → Ψ) ⊢ Ψ, not to
be mistaken for the illegitimate axiom form. Substitution, similarly, is only valid in rule
form. Given Axioms IXa and IXb, the hypothetical syllogism pair, Rule III is actually
redundant because it is derivable. There is no harm in keeping it explicit, though.

1 Taking these as definitions means that, for example, Φ ∨ Ψ → ¬(¬Φ ∧ ¬Ψ) is no more than an
instance of Axiom I below.

Brady proves that naive set theory in the closely related logics DKQ and TKQ is
non-trivial (Brady, 2006, p. 242), in the sense that it has a model in which some
sentences fail. Brady’s work begins in Brady (1971) and he and Routley collaborated
on the proof in Brady & Routley (1989). Routley used DKQ in his ?, calling the set
theory DST, for dialectical set theory, which in present terminology would be dialethic
set theory. The logic DLQ is the same as DKQ, except that Φ ∨ ¬Φ is strengthened
to the counterexample axiom; and TLQ, again, adds the hypothetical syllogisms.
I have found DKQ to be too weak to obtain satisfactory results, namely, the classical
recapture. Non-triviality of naive set theory in DLQ and TLQ is an open problem.
I conjecture that DLQ is the weakest logic of its kind that can support any robust
mathematics.
The following derived facts about the logic will be most helpful.
Double negation introduction follows from Axioms I and VI, and modus ponens:
(¬Φ → ¬Φ) → (Φ → ¬¬Φ). Then the contraposition Axiom VI can be rearranged to
(Φ → Ψ) → (¬Ψ → ¬Φ). Counterexample gets its name from the contraposed form
Φ ∧ ¬Ψ → ¬(Φ → Ψ). From the instance of counterexample (Φ → Φ) → ¬(Φ ∧ ¬Φ),
by Axiom I and modus ponens we have the law of non-contradiction: ¬(Φ ∧ ¬Φ). Then by
the definition of disjunction follows the law of excluded middle, Φ ∨ ¬Φ. Contraposition
on Axiom V gives a schema for ∨-elimination: (Φ → Ψ) ∧ (ϒ → Ψ) → (Φ ∨ ϒ → Ψ).
Then the pair (Φ → ¬Φ) → ¬Φ and (¬Φ → Φ) → Φ, reductio and consequentia
mirabilis, are theorems. We have the scheme for existential instantiation, ∀x(Φ → Ψ) →
(∃xΦ → Ψ), from Axiom XI, contraposition, and the definition of ∃, and subject to the
condition that x is not free in Ψ.
A useful derived rule is

Φ → (Ψ → ϒ), ϒ → Θ ⊢ Φ → (Ψ → Θ).

The rule holds because ϒ → Θ implies [Φ → (Ψ → ϒ)] → [Φ → (Ψ → Θ)] by two
applications of prefixing; and then the conclusion follows by modus ponens. Similarly, an
extra application of prefixing gives the extended

Φ → (Ψ → (ϒ → Θ)), Θ → Ξ ⊢ Φ → (Ψ → (ϒ → Ξ)),

and so forth. These are called derived prefixing rules.
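The first derived prefixing rule can be laid out line by line. The following is a sketch of that derivation, writing Φ, Ψ, ϒ, Θ for arbitrary formulae (the choice of letters is an assumption made here for readability) and using only prefixing (Axiom IXb) and modus ponens (Rule II):

```latex
\begin{align*}
1.\;& \Upsilon \to \Theta && \text{premise}\\
2.\;& (\Psi \to \Upsilon) \to (\Psi \to \Theta) && \text{1, Axiom IXb, Rule II}\\
3.\;& [\Phi \to (\Psi \to \Upsilon)] \to [\Phi \to (\Psi \to \Theta)] && \text{2, Axiom IXb, Rule II}\\
4.\;& \Phi \to (\Psi \to \Upsilon) && \text{premise}\\
5.\;& \Phi \to (\Psi \to \Theta) && \text{3, 4, Rule II}
\end{align*}
```

Each extra layer of nesting in the extended rules costs one further application of prefixing between lines 2 and 3.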


As in other relevant logics in its area, the deduction theorem fails, exactly because a
demonstration that Ψ is provable from Φ is not sufficient to show that Φ relevantly implies
Ψ; probably a good deal more information than Φ alone is needed to get Ψ. So the
difference between deductions marked by ⊢ and implications marked by → is stark.
While the move from Φ → Ψ to Φ ⊢ Ψ is modus ponens, the other direction does not
follow. In general, → carries the burden of relevance and theorems containing it are harder
to obtain. To this end we have one.
2.3. Meta-rule. If Φ ⊢ Ψ, then Φ ∨ ϒ ⊢ Ψ ∨ ϒ.
The idea of meta-rules is Meyer’s (Brady, 2006, p. 6). Importantly, the meta-rule
softens the loss of the deduction theorem; in particular, it validates the following
argument:
If Φ ⊢ Ψ, and ϒ ⊢ Ψ, then Φ ∨ ϒ ⊢ Ψ.
This is useful for disjunction eliminations when reasoning under hypothesis. We also obtain
the useful substitution form, a = b, Φ(a) ⊢ Φ(b). Another simple derived rule is:
Φ ∧ (Φ → Ψ) ⊢ Ψ, and its quantificational counterpart, Φ(a) ∧ ∀x(Φ(x) → Ψ(x)) ⊢ Ψ(a).
So although pseudo-modus ponens is not included, using the meta-rule we can still reason
fairly naturally even under hypothesis.
Perhaps the most difficult challenge to set theory in relevant logic is formalizing
restricted quantification.² There seem to be many cases in which we want to say that all Φs
are Ψs, without meaning or being able to prove the entailment that ∀x(Φ(x) → Ψ(x)).
A weaker statement is the inference Φ(x) ⊢ Ψ(x), but this is not part of the object
language, so it cannot be used to define predicates. Formalizing restricted quantification in
a useful way is a major challenge and is, ultimately, not addressed in the present work. For
now, following again Routley’s original presentation of dialethic set theory, we recognize
the problem by use of the t constant, to be thought of as the conjunction of all theorems,
and added conservatively through a two-way rule³:
2.4. t-rule. Φ ⊣⊢ t → Φ.
Then the following enthymematic conditional is defined:
Φ ↦ Ψ := Φ ∧ t → Ψ.
The main technical behavior of ↦ is to validate Ψ ⊢ Φ ↦ Ψ, and its quantificational
analogue. Thus ↦ is obtainable sometimes when → is not (and always obtainable when
→ is), but does not contrapose.⁴

§3. Axioms. Our first-order formal language is now augmented with a variable bind-
ing term forming operator {· : −}; it remains open how to conservatively add term-forming
symbols in relevance contexts (Brady, 2006, p. 177), and is not a problem addressed here.
The set concept is now characterized by two axioms.
AXIOM 3.1 (Abstraction). x ∈ {z : Φ(z)} ↔ Φ(x).
AXIOM 3.2 (Extensionality). ∀z(z ∈ x ↔ z ∈ y) ↔ x = y.
Existential generalization on abstraction immediately yields the comprehension
principle⁵:
THEOREM 3.3 (Comprehension). ∃y∀x(x ∈ y ↔ Φ(x)).
Under abstraction, the substitution rule is x = y ⊢ ∀z(x ∈ z → y ∈ z).
Abstraction and extensionality can be reconnected, as in Frege’s axiom:
THEOREM 3.4 (Basic law V). {x : Φ} = {x : Ψ} ↔ ∀x(Φ ↔ Ψ).
Peano chose the ∈ sign to denote predication, from the Greek verb ἐστίν, ‘to be’.
Sets are predicates in extension. Since arbitrary predicates determine sets, then, in the
comprehension principle the occurrence of y in the predicate Φ is not ruled out. Following
Routley, this is completely unrestricted comprehension; without this, some sets would
not be obtainable, for example, the limiting case of diagonal sets, Z = {x : x ∈ Z }.

2 One natural approach is outlined in Brady (2003, p. 327).


3 The constant is added conservatively to the logic. Whether it is also conservative over the set
theory is unknown.
4 See Mares (2004, pp. 198–200), Priest (2006, p. 255), Beall et al. (2006), Asmus (2009), Weber
(forthcoming-a).
5 The terminological distinction between abstraction and comprehension I take from Libert (2005).
To guarantee that such instances are valid abstractions—to ensure that every pred-
icate, even groundless ones, determines a set—we have abstraction instances of the
form
x ∈ {zu : Φ(z, u)} ↔ Φ(z/x, u/{zu : Φ(z, u)})
where the right-hand side indicates a simultaneous substitution in Φ of z by x, and u
by the term {zu : Φ(z, u)}. (At first, in Brady & Routley (1989, p. 419), a new quantifier,
formation rule, and reflection axiom were added to handle circular predicates; but by Brady
(2006, pp. 177, 200), the idea is streamlined as above.) Axiom 3.1 in this way includes
cases
x ∈ {zu : Φ(z, u)} ↔ Φ(x, {zu : Φ(z, u)}).
We can then say ‘a set y of all x such that . . . x . . . y . . .’, as opposed to the usual ‘the set
of all x such that . . . x . . .’.
Naive set theory is studied here with the full comprehension scheme for three reasons.
The first has already been stated, namely, the philosophical conviction that all predicates,
even baroque predicates, determine sets.6 Note that the liar paradox has a similar kind of
self-reference. Second is a pragmatic motive: Since weak relevant logics are so terribly
weak, the proving power for theorems must come from somewhere. Unrestricted compre-
hension will allow easier construction of functions, diagonal sets, and other useful objects.
The third motive, then, is to model easily some venerable fixed-point phenomena in set
theory; see, for example, our recursive characterization of the ordinals and natural numbers
below. Uniqueness is not essential to these cases, as there are non-standard models in the
classical account anyway.
It is frequently convenient to use names for sets, for example, for a set a there is a set of
all the subsets of a, called the powerset of a. The symbol P(a) is used to denote this set in
the same way that ∅ is used to denote a particular set below; it is a name. So, to use a bit
of notation to be introduced below, x ∈ P(a) is just x ∈ {y : y ⊆ a}. Similar comments
apply to complementation a, intersection, union, and so forth. This notation can be thought
of as governed by instances of comprehension:
x0 , . . . , xn  ∈ y ↔ (x0 , . . . , xn ).
Since in later sections we will officially develop ordered pairs and the natural numbers to
act as indices, their use as primitives here is only heuristic.

§4. Basics.
PROPOSITION 4.1. y = {z : Φ(z)} ↔ ∀x(x ∈ y ↔ Φ(x)).
Proof. By extensionality, y = {z : Φ(z)} ↔ ∀x(x ∈ y ↔ x ∈ {z : Φ(z)}). By
abstraction, ∀x(x ∈ {z : Φ(z)} ↔ Φ(x)). Then by transitivity of the conditional, ∀x(x ∈
y ↔ Φ(x)). For the converse, we again invoke the abstraction scheme, where ∀x(Φ(x) ↔
x ∈ {z : Φ(z)}), so by transitivity ∀x(x ∈ y ↔ x ∈ {z : Φ(z)}). And this with the
extensionality axiom completes the proof. □

6 Priest and Routley: “The naive notion of set is that of the extension of an arbitrary predicate....
This is as tight an account as can be expected from any fundamental notion. It was thought to be
problematical only because it was assumed (under the ideology of consistency) that ‘arbitrary’
could not mean arbitrary. However, it does” (Priest et al., 1989, p. 499).
PROPOSITION 4.2. Identity is an equivalence relation:

x = x,

x = y → y = x,
x = y ∧ y = z → x = z.

Proof. These follow directly from the properties of → and ∧ (Axioms I, II, IV, and V),
and extensionality. 
As with Proposition 4.8 below, identity also obeys an alternate form of transitivity due
to the hypothetical syllogism, namely

x = y → (y = z → x = z).

PROPOSITION 4.3. ∀x∀y(x ∈ y ∨ x ∉ y).

Proof. With Axiom I, a ∈ y → a ∈ y. Then by Axiom VIII, a ∈ y ∨ a ∉ y. Rule IV
generalizes to the result. □

PROPOSITION 4.4. Sets that differ with respect to membership are not identical. In
particular, ∃x(x ∈ a ∧ x ∈ a) → a = a.
Proof. We prove the contrapositive.
1 a = a → ∀z(z ∈ a ↔ z ∈ a) E xt.
2 ∀z(z ∈ a ↔ z ∈ a) → (b ∈ a ↔ b ∈ a) 1, Ax.X
3 (b ∈ a ↔ b ∈ a) → b ∈ a ∨ b ∈ a Ax.I I, V I I I
4 b ∈ a ∨ b ∈ a → ¬(b ∈ a ∧ b ∈ a) 3, Ax.I
5 a = a → ¬(b ∈ a ∧ b ∈ a) Ax.I V
6 (b ∈ a ∧ b ∈ a) → a = a Ax.V I.
Existential generalization completes the result. 
When a set a is such that its membership is inconsistent, some b ∈ a and b ∈ a, then a
is inconsistent.
PROPOSITION 4.5. ∃x(x ≠ x).
Proof. By comprehension we have Russell’s set, R = {x : x ∉ x}.
1 ∀x(x ∈ R ↔ x ∉ x)        Comp.
2 R ∈ R ↔ R ∉ R             1, Ax. X
3 R ∈ R → R ∉ R             2, Ax. II
4 R ∉ R ∨ R ∉ R             3, Ax. VIII
5 R ∉ R                     4, Ax. V
6 R ∈ R                     2, 5, Rule II
7 R ∈ R ∧ R ∉ R             5, 6, Rule I.
Since R differs from itself with respect to membership, by Proposition 4.4, R ≠ R. □
From this novelty of inconsistent set theory, Restall (1992, p. 427) infers by generaliza-
tion the
COROLLARY 4.6 (Restall). There are at least two objects, ∃x∃y(x ≠ y).
These few facts already show us that the present set theory will have as theorems
some propositions not contained by classical theory, and that classical theorems may be
recaptured by distinctively non-classical means.
DEFINITION 4.7 (Subsets). x ⊆ y := ∀z(z ∈ x → z ∈ y);
a proper subset is x ⊂ y := x ⊆ y ∧ ∃z(z ∈ y ∧ z ∉ x).
This leads to a natural partial order; the converse of antisymmetry actually holds, too,
since this is just the axiom of extensionality rewritten.
PROPOSITION 4.8. Subsets are reflexive and antisymmetric,

x ⊆ x,

x ⊆ y ∧ y ⊆ x → x = y.

They are also transitive:

x ⊆ y ∧ y ⊆ z → x ⊆ z,

x ⊆ y → (y ⊆ z → x ⊆ z),

and x ⊆ y → (z ⊆ x → z ⊆ y).

Proof. Reflexivity and antisymmetry come from extensionality, Axiom I, and the
commutativity of conjunction. The forms of transitivity are direct results of conjunctive
syllogism and the hypothetical syllogism pair, respectively. □

PROPOSITION 4.9. Every set has a complement, x̄ = {y : y ∉ x}, and x̿ = x.

Proof. Existence is by comprehension; its behavior is checked with double negation
introduction and elimination: If y ∈ x̄ ↔ y ∉ x, then y ∉ x̄ ↔ y ∈ x. □
Moving from these facts, we fix a top and bottom of the universe of sets, an empty set and
a universal set, noting that von Neumann’s definition of set is an object that is a member of
something. V and ∅ will exhibit most of the expected structure of a boolean lattice, except
that x ∩ x̄ = ∅ and x ∪ x̄ = V do not generally hold. Note that the characterizations we
are about to use are not those given by Russell and Whitehead, which have since become
standard: {x : x = x} for the universe and its complement, {x : x ≠ x} for the empty set.
We have already seen in R ≠ R that there are objects not in, and in, these sets respectively.
Nor, in our logic, would it be the case that y ⊆ {x : x = x} or {x : x ≠ x} ⊆ y for every
y, since Φ → x = x is irrelevant. Our universal and empty sets, by contrast, are better
behaved.
PROPOSITION 4.10. The universe,

V = {x : ∃y(x ∈ y)},

exists, and ∀x(x ∈ V). The empty set,

∅ = {x : ∀y(x ∈ y)},

exists, too, and is empty: ∀x(x ∉ ∅).


Proof. Existence of both sets is by comprehension. To prove that ∀x(x ∈ V),
1  ∀x(x ∈ V ∨ x ∉ V)          prop. 4.3
2  ∀x(x ∉ V → x ∈ V̄)          abs.
3  ∀x(x ∈ V̄ → ∃y x ∈ y)       Ax.X
4  ∀x(∃y x ∈ y → x ∈ V)       abs.
5  ∀x(x ∉ V → x ∈ V)          2, 3, 4
The conclusion follows by ∨-elimination on Line 1. To show that the empty set is empty,
similarly, if x ∈ ∅ then ∀y x ∈ y; then x ∈ ∅̄ and therefore x ∉ ∅. □

PROPOSITION 4.11. ∀x(x ⊆ V) and ∀x(∅ ⊆ x).

Proof. By existential generalization, z ∈ x → ∃y z ∈ y. Then z ∈ x → z ∈ V by
abstraction, and therefore x ⊆ V. Similarly, z ∈ ∅ → ∀y z ∈ y, and then z ∈ x by
∀-elimination. □
What these theorems do not show—and indeed what probably cannot be shown, sug-
gesting that this set theory is not trivial—is the uniqueness of a universal or empty set.
The extensionality of naive set theory in light of ‘doppelgängers’ is discussed in Weber
(forthcoming-a).
PROPOSITION 4.12. V̄ = ∅ and ∅̄ = V.
Proof. One direction of each is already proved. For V ⊆ ∅̄ we argue the contrapositive:
x ∈ ∅ → ∀y x ∈ y, and ∀y x ∈ y → x ∈ V̄. Then x ∉ V. For V̄ ⊆ ∅, x ∈ V̄ → ¬∃y x ∈ y,
so ∀y x ∉ y; in particular, x ∉ ∅̄. Double negation elimination completes the proof. □
As this proof makes clear, the empty set is explosive: x ∈ ∅ → x ∈ {z : Φ} for any Φ.
One way to read this is that ∅ is really empty, and V is really the universe. That they must,
on pain of triviality, behave consistently is often a useful fallback.

§5. ZF. Now it is time to produce all the axioms of Zermelo–Fraenkel set theory,
except Fundierung. Since general comprehension induces sets that are not well founded,
for example, V ∈ V , the foundation axiom is not a part of the present theory, as is expected
from Restall’s results in Restall (1992). That the other axioms are forthcoming is not
especially surprising, since Zermelo in 1908 (Zermelo, 1967) saw them as a consistent
fragment of the naive theory. The axiom of infinity will be provable, too, showing that the
naive theory does not fail a reductive program in the same way as Russell and Whitehead’s
system did.
In each proposition, the step of universally generalizing on free a, b is omitted.
PROPOSITION 5.1 (Aussonderung). ∃y∀x(x ∈ y ↔ x ∈ a ∧ Φ(x)).
PROPOSITION 5.2 (Powerset). ∃y∀x(x ∈ y ↔ x ⊆ a).
For any a, we use the name P(a) := {x : x ⊆ a}.
PROPOSITION 5.3 (Pairing). ∃y∀x(x ∈ y ↔ x = a ∨ x = b).
For any a, b, {a, b} := {x : x = a ∨ x = b}. A special case is the singleton {a} := {x :
x = a}. For relevance purposes, sometimes singletons are relativized to a particular set,
which is a key to defining ordinal and numerical successor below, and so to which we call
special attention.
DEFINITION 5.4 (Relevant singleton). {a}_b := {x : x = a ∧ x ∈ b}.
The essential property is that {a}_b ⊆ b when a ∈ b, as is the case classically. Actually
we have the stronger fact that {a}_b ⊆ b even without a ∈ b.
PROPOSITION 5.5 (Union). ∃y∀x(x ∈ y ↔ ∃z(z ∈ a ∧ x ∈ z)).
PROPOSITION 5.6 (Intersection). ∃y∀x(x ∈ y ↔ ∀z(z ∈ a → x ∈ z)).
The standard names are adopted: ⋃a := {x : ∃z(z ∈ a ∧ x ∈ z)} and
⋂a := {x : ∀z(z ∈ a → x ∈ z)} for the union and intersection of a, respectively. Note
that, because the conditional is not material, these are not necessarily duals. On the other
hand,
a ∪ b := {x : x ∈ a ∨ x ∈ b} and
a ∩ b := {x : x ∈ a ∧ x ∈ b} obey their usual algebra, save the explosive a ∩ ā ⊆ ∅.
The complement of b in a is now easily construed as a − b := a ∩ b̄.
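The definitions of ⋃a, ⋂a and a − b can be illustrated with a minimal sketch, assuming classical finite sets modeled as Python frozensets. This does not capture the relevant conditional of the present logic; the function names and the finite stand-in `universe` are illustrative assumptions, not part of the theory.

```python
def big_union(a):
    """Classical reading of the union of a: {x : some z in a has x in z}."""
    return frozenset(x for z in a for x in z)


def big_intersection(a, universe):
    """Classical reading of the intersection of a: {x : every z in a has x in z}.

    `universe` is a hypothetical finite stand-in for V, needed because the
    defining condition quantifies over all x.
    """
    return frozenset(x for x in universe if all(x in z for z in a))


def relative_complement(a, b):
    """a - b, read classically as the members of a that are not in b."""
    return frozenset(x for x in a if x not in b)


family = frozenset({frozenset({1, 2}), frozenset({2, 3})})
universe = frozenset({1, 2, 3, 4})

union_of_family = big_union(family)
intersection_of_family = big_intersection(family, universe)
intersection_of_empty = big_intersection(frozenset(), universe)
difference = relative_complement(frozenset({1, 2, 3}), frozenset({2}))
```

Note that the intersection of the empty family comes out as the whole stand-in universe, since the universally quantified condition is vacuously satisfied.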
The next axiom is the axiom of infinity. Note that V is already a set containing ∅ and
its successors (since every set whatever is in V), and that we will be properly introducing
the natural numbers later. So V is infinite⁷ and the following is only for curiosity; compare
Petersen’s characterization of the natural numbers in (Petersen, 2000, p. 386).
PROPOSITION 5.7 (Infinity). There is a non-empty set i such that when x ∈ i, also
{x}_i ∈ i.
Proof. Consider i = {x : x ⊆ i}. Both i ∈ i and ∅ ∈ i, so i is not empty. For x ∈ i, also
{x}_i ⊆ i, and so {x}_i ∈ i. □
In the next definition and beyond, we take to writing some sets in a commonplace
way, as, for example, {⟨x, y⟩ : x ∈ a ∧ y ∈ b}. As noted above, this is to be controlled with
appropriate instances of abstraction. While our language does not, at the moment, contain
function symbols, when ⟨x, y⟩ ∈ f and f is a function, then we will write f(x) = y,
having in mind something like a two-place predicate: ∀x∃y[Φ(x, y) ∧ ∀z(Φ(x, z) →
y = z)].

DEFINITION 5.8. An ordered pair is ⟨a, b⟩ = {{a}, {a, b}}. The cartesian product is
a × b = {⟨x, y⟩ : x ∈ a ∧ y ∈ b}. A relation is r ⊆ a × b. A function is a relation
f : a ⟶ b with domain dom(f) = a and range rng(f) = b, such that
⟨x, u⟩ ∈ f ∧ ⟨x, v⟩ ∈ f → u = v.
The composition of two functions f, g is f ∘ g = {⟨x, z⟩ : ∃y(⟨x, y⟩ ∈ g ∧ ⟨y, z⟩ ∈ f)}. The
image of a under f is f″a = {f(x) : x ∈ a}. The restriction of f to x is f|x = f ∩ (x × a).
PROPOSITION 5.9 (Replacement). Let f be a function with domain a. Then f″a exists.
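As a hedged illustration of Definition 5.8, the following sketch treats relations classically as finite frozensets of Python tuples (tuples stand in for ordered pairs here; that encoding and all function names are assumptions made for the example). The restriction is computed by filtering on the first coordinate, which classically agrees with intersecting f with a product whose second factor covers the range.

```python
def cartesian(a, b):
    """a × b as a set of (x, y) tuples."""
    return frozenset((x, y) for x in a for y in b)


def is_function(f):
    """Single-valuedness: (x, u) in f and (x, v) in f imply u == v."""
    seen = {}
    for x, y in f:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


def compose(f, g):
    """f ∘ g = {(x, z) : some y has (x, y) in g and (y, z) in f}."""
    return frozenset((x, z) for (x, y) in g for (y2, z) in f if y == y2)


def image(f, a):
    """Image of a under f: {f(x) : x in a}."""
    return frozenset(y for (x, y) in f if x in a)


def restrict(f, xs):
    """Restriction of f to xs: the pairs of f whose first coordinate lies in xs."""
    return frozenset((x, y) for (x, y) in f if x in xs)


g = frozenset({(1, "a"), (2, "b")})
f = frozenset({("a", 10), ("b", 20)})
composed = compose(f, g)
```

Under these assumptions `composed` maps 1 to 10 and 2 to 20, and the image of a finite set under a finite function is again a finite set, which is the classical content of replacement in this toy setting.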
PROPOSITION 5.10 (Ordered pairs). ⟨a, b⟩ = ⟨c, d⟩ ⊣⊢ a = c ∧ b = d.
Proof. Right to left is substitution, a = c ⊢ a ∈ ⟨a, b⟩ → c ∈ ⟨a, b⟩, and similarly for
b = d. We prove left to right. For a = c,

7 A set X is Dedekind infinite if there is an injection from X to some Y ⊂ X. Then it is provable that
some sets are Dedekind infinite, more or less in the same way that Dedekind proved it in 1888.
Consider f = {⟨x, {x}⟩ : x ∈ V}. This is an injection of V into a proper part of itself. Therefore
V is Dedekind infinite.
1  ⟨a, b⟩ = ⟨c, d⟩                                                premise
2  ∀x(x ∈ {{a}, {a, b}} ↔ x ∈ {{c}, {c, d}})                      def. 5.8
3  ∀x(x = {a} ∨ x = {a, b} ↔ x = {c} ∨ x = {c, d})                2, Abs.
4  {a} = {a} → {a} = {c} ∨ {a} = {c, d}                           3, Ax.X
5  {a} = {c} ∨ {a} = {c, d} →
     ∀x(x ∈ {a} ↔ x ∈ {c}) ∨ ∀x(x ∈ {a} ↔ x ∈ {c, d})             4, Ext.
6  ∀x(x = a ↔ x = c) ∨ ∀x(x = a ↔ x = c ∨ x = d)                  5, Abs.
7  ∀x(x = a ↔ x = c) → (a = a → a = c)
     ∧ ∀x(x = a ↔ x = c ∨ x = d) → (c = c ∨ c = d → c = a)        6, Ax.IX
8  a = a ∨ c = c ∨ c = d → a = c                                  7, Ax.V
9  a = c                                                          8, prop. 4.2, mp.
To prove that b = d, first we prove that b = d ∨ b = a. Then we prove that b = d ∨ d = a.
And these together will give the result.
1  {a, b} = {a, b} → {a, b} = {c} ∨ {a, b} = {c, d}               def. 5.10
2  ∀x(x = a ∨ x = b ↔ x = c)
     ∨ ∀x(x = a ∨ x = b ↔ x = c ∨ x = d)                          1, Abs.
3  (b = a ∨ b = b ↔ b = c)
     ∨ (b = a ∨ b = b ↔ b = c ∨ b = d)                            2, Ax.X
4  b = c → b = c ∨ b = d                                          Ax.II
5  b = a ∨ b = b → b = c ∨ b = d                                  3, 4, Ax.IV, V
6  b = b                                                          prop. 4.2
7  b = c ∨ b = d                                                  5, mp.
Then using a = c, proven at Line 9 above,
8  a = c
9  a = c ∧ (b = c ∨ b = d)                                        7, 8, Rule I
10 (a = c ∧ b = c) ∨ (a = c ∧ b = d)                              9, Ax.III
11 a = c ∧ b = c → a = b                                          prop. 4.2
12 b = a ∨ b = d                                                  10, Ax.V
completing the first step of the argument. Now, again with a = c,
1  a = c
2  ⟨a, b⟩ = ⟨c, d⟩                                                premise
3  ⟨a, b⟩ = ⟨a, d⟩                                                1, 2, Rule V
4  ∀x(x = {a} ∨ x = {a, b} ↔ x = {a} ∨ x = {a, d})                3, def. 5.8
5  {a} = {a} → {a} = {a, d} ∨ {a, d} = {a, b}                     4, Ax.X
6  {a} = {a, d} ∨ {a, d} = {a, b} →
     ∀x(x ∈ {a} ↔ x ∈ {a, d}) ∨ ∀x(x ∈ {a, d} ↔ x ∈ {a, b})       5, Ext.
7  d = d → d = a ∨ d = b                                          6, Ax.X, V
8  d = a ∨ d = b                                                  prop. 4.2, 7, mp
completing the second step of the argument. To conclude, we adjoin the two facts (b =
a ∨ b = d) ∧ (d = a ∨ d = b); distributing possibilities,
(b = a ∧ a = d) ∨ (b = a ∧ d = b) ∨ (b = d ∧ d = a) ∨ (b = d ∧ d = b),
each of which implies that b = d, as required. □
Writing proofs out in this way is tedious; as we go along the more usual informal
presentation in mathematical English is used. But it is important to be clear that the present
logic, weak as it is, is sufficient.
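The ordered-pair property just proved can be spot-checked in a classical finite setting. The sketch below encodes the Kuratowski pair {{a}, {a, b}} with Python frozensets and tests the biconditional exhaustively over a small range of atoms. It is only a classical sanity check under assumed names (`kpair`, `pair_property_holds`), not a rendering of the relevant-logic derivation above.

```python
from itertools import product


def kpair(a, b):
    """Kuratowski pair: {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def pair_property_holds(atoms):
    """Check that kpair(a, b) == kpair(c, d) exactly when a == c and b == d."""
    return all(
        (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
        for a, b, c, d in product(atoms, repeat=4)
    )


checked = pair_property_holds(range(4))
degenerate = kpair(1, 1)  # collapses to {{1}} because {1, 1} = {1}
```

The degenerate case a = b is the reason the second half of the proof has to consider the disjunctions b = a ∨ b = d and d = a ∨ d = b rather than reading b off directly.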
Most set theories include the axiom of choice; proving choice would recapture the
axioms of ZFC. It has been a vexed question as to whether or not a choice principle is or
should be a fact about set theory (though it is now mostly regarded without suspicion). For
our naive theory, we will see that choice is a consequence of a deeper Cantorian principle:
that every set can be put into a well order. To show this, though, we must first do a lot of
work on the notion of order itself. Useful classical references for the next section are Drake
(1974), Levy (1979), and Kunen (1980).

§6. Ordinals.
DEFINITION 6.1. A set a with respect to ∈ is:
strictly ordered iff
    x, y, z ∈ a → x ∉ x
        ∧ (x ∈ y ∧ x ∉ x → y ∉ x)
        ∧ (y ∈ z → (x ∈ y → x ∈ z));
totally or linearly ordered by ⊆ iff a is strictly ordered and
    x ∈ a → (y ∈ a → x ⊆ y ∨ y ⊆ x),
that is, ⊆-trichotomy holds;
well founded, Wf(a), iff
    y ⊆ a ∧ ∃z z ∈ y → ∃z(z ∈ y ∧ ¬∃x(x ∈ z ∧ x ∈ y));
well ordered, Wo(a), iff totally ordered and well founded;
transitive, Tr(a), iff x ∈ a → x ⊆ a;
an ordinal iff a is a well-ordered and transitive set of ordinals, ⊆-connected to all other
ordinals.
In summary, by (full) comprehension
PROPOSITION 6.2. There is a set of all ordinals, On = {x : x is an ordinal}, such that
    x ∈ On ↔ Wo(x)
        ∧ y ∈ x → y ⊆ x
        ∧ x ⊆ On
        ∧ y ∈ On → (x ⊆ y ∨ y ⊆ x).
The definition of ordinal is adapted from von Neumann. We have added impredicative
clauses, to capture the recursive idea that the ordinals are the same inside and out. The
ordinals are an analysis of the concept of induction, generalized to the transfinite and
reified. The hard work in the theory of ordinals is in locating the right definition. Then
the properties of the ordinals, culminating in the Burali–Forti theorem, should all follow,
as it were, by logic alone.
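For orientation, the clauses of Definition 6.1 can be checked mechanically on small classical examples. The sketch below works with hereditarily finite sets as nested frozensets and reads every conditional materially, so it says nothing about the impredicative clause connecting a set to all other ordinals or about inconsistent sets such as On. All names are illustrative assumptions.

```python
from itertools import combinations


def successor(a):
    """Classical successor a ∪ {a}."""
    return a | frozenset({a})


def is_transitive(a):
    """Tr(a): every member of a is a subset of a."""
    return all(x <= a for x in a)


def is_subset_connected(a):
    """Linearity by ⊆: any two members are comparable under inclusion."""
    return all(x <= y or y <= x for x in a for y in a)


def is_well_founded(a):
    """Every non-empty subset y of a has a member z sharing no member with y."""
    members = list(a)
    for size in range(1, len(members) + 1):
        for combo in combinations(members, size):
            y = frozenset(combo)
            if not any(z.isdisjoint(y) for z in y):
                return False
    return True


def is_ordinal(a):
    """Finite, classical approximation of the definition of an ordinal."""
    return (
        is_transitive(a)
        and all(x not in x for x in a)
        and is_subset_connected(a)
        and is_well_founded(a)
        and all(is_ordinal(x) for x in a)
    )


zero = frozenset()
one = successor(zero)
two = successor(one)
three = successor(two)
not_transitive = frozenset({one})  # {{∅}}: its member {∅} is not a subset of it
```

The subset enumeration is exponential, so this is only practical for very small sets; it is meant as a reading aid for the definition, not as a decision procedure.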
A few more detailed comments. Our extra clause in the antisymmetry condition for
strict order, x ∈ x, is due to the relevance constraint on implication, as we will see in the
proofs below. In the well-founding clause, the rendering of a set having a least member
is material; the intensional alternative would have been that for any non-empty y ⊆ a,
there is a z ∈ y such that ∀x(x ∈ z → z ∈ y). Using this definition, however, makes it
too difficult to prove that anything is well founded. Similarly, to gloss z as a least member
if z ∩ y = ∅ would be almost impossible to confirm, since one would have to show that
x ∈ z ∩ y not only fails, but is absurd. It may be contradictory, but it is not absurd. For
linear order, we are using the ⊆ relation instead of the ∈ relation, and have built in an added
clause into the definition of On based on this choice. Finally, notice that → can be replaced
with →.
On to the mathematics. Ordinals are written with lowercase Greek letters.
PROPOSITION 6.3. ∅ ∈ On.
Proof. This is because ∅ is explosive. First, ∅ ⊆ On, by Proposition 4.11. Similarly, x ∈
∅ → x ⊆ ∅, transitivity, and x ∈ ∅ → (y ∈ ∅ → x ⊆ y ∨ y ⊆ x), a linear order. To show
that ∅ is ⊆-connected to all ordinals it again suffices that ∅ ⊆ x for any x at all. Finally, to
show well foundedness, we have a ∈ y ∧ (x ∈ y → x ∈ ∅) → a ∈ y ∧ (x ∈ ∅ ∨ x ∉ y),
by counterexample. Again since ∅ is explosive we get
a ∈ y ∧ (x ∈ y → x ∈ ∅) → a ∈ y ∧ (x ∉ a ∨ x ∉ y).
Generalization completes the proof. □

PROPOSITION 6.4. Subsets of a well-ordered set are well ordered:

Wo(α), β ⊆ α ⊢ Wo(β).
Proof. Let α be well ordered and β ⊆ α. By Proposition 4.8, β ⊆ α → (y ⊆ β →
y ⊆ α). And (y ⊆ β → y ⊆ α) means that members of β behave just as members of α,
giving well order. For example, linearity is proved using hypothetical syllogism and (the
rarely seen) Rule III, with → replacing → as appropriate:
1  x ∈ β → x ∈ α
2  y ∈ β → y ∈ α
3  (y ∈ α → x ⊆ y ∨ y ⊆ x) → (y ∈ β → x ⊆ y ∨ y ⊆ x)       2, hyp. syl.
4  (x ∈ α → (y ∈ α → x ⊆ y ∨ y ⊆ x))
     → (x ∈ β → (y ∈ β → x ⊆ y ∨ y ⊆ x))                    1, 3, Rule III
5  x ∈ β → (y ∈ β → x ⊆ y ∨ y ⊆ x)                          4, def. 6.1, m.p.
For well foundedness, notice that
(*) Φ → Ψ ⊢ Φ ∧ ϒ → Ψ ∧ ϒ.
This allows the following derivation.
1  β ⊆ α.
2  β ⊆ α → (y ⊆ β → y ⊆ α).                                 prop. 4.8
3  y ⊆ β → y ⊆ α.                                           1, 2, m.p.
4  y ⊆ β ∧ ∃z z ∈ y → y ⊆ α ∧ ∃z z ∈ y.                     3, (*)
5  y ⊆ α ∧ ∃z z ∈ y → ∃z(z ∈ y ∧ ∀x(x ∉ z ∨ x ∉ y)).        def. 6.1
6  y ⊆ β ∧ ∃z z ∈ y → ∃z(z ∈ y ∧ ∀x(x ∉ z ∨ x ∉ y)).        4, 5, Ax.IV
as required. □

Proposition 6.4 extends immediately to ordinals, as these are well-ordered sets.


PROPOSITION 6.5. On is transitive, α ∈ On → α ⊆ On.
Proof. This is a clause in the definition of being an ordinal. □
PROPOSITION 6.6. An ordinal α is the set of all preceding ordinals,

α = {x : x ∈ On ∧ x ∈ α}.
Proof. Let α ∈ On. The previous theorem and the axiomatic instance β ∈ α → β ∈ α
show that β ∈ α → β ∈ On ∧ β ∈ α. The other direction is immediate. Therefore
α = α ∩ On. □

PROPOSITION 6.7. α ∈ On → α ∉ α.
Proof. The idea is that, were α ∈ α, then still α ∉ α.
1  α ∈ On → ∀x(x ∈ α → x ∉ x).                   def. 6.1
2  ∀x(x ∈ α → x ∉ x) → (α ∈ α → α ∉ α).          Ax.X
3  (α ∈ α → α ∉ α) → α ∉ α ∨ α ∉ α.              Ax.VIII
4  α ∉ α ∨ α ∉ α → α ∉ α.                        Ax.V
Conjunctive syllogism completes the proof. □
PROPOSITION 6.8. α ∈ β ∧ α ∉ α → β ∉ α.
Proof. α ∈ β ∧ α ∉ α → β ⊈ α by counterexample, and β ⊈ α → β ∉ α. □

PROPOSITION 6.9. β ∈ γ → (α ∈ β → α ∈ γ).
Proof. By the definition of transitivity on γ. □
The last few propositions have shown, by virtue of self-similarity, that the ordinals are
strictly ordered. Now we need total and well ordering.
THEOREM 6.10 (Trichotomy of ordinals). Any two ordinals are ⊆-connected,

α ∈ On → (β ∈ On → α ⊆ β ∨ β ⊆ α).
Proof. This is a clause in the definition of ordinal. □
This delivers, for a start, some miscellany like the following.
PROPOSITION 6.11. α ∩ β ∈ On, and α ∪ β ∈ On.
Proof. We can show α ∩ β is well ordered and transitive independently of trichotomy.
Let x ∈ α ∩ β. Then x ∈ α ∧ x ∈ β; then x ⊆ α ∧ x ⊆ β; then x ⊆ α ∩ β, showing
transitivity. Meanwhile, α ∩ β ⊆ α; since α is well ordered, α ∩ β is well ordered, too.
Easier still is to notice that if α ⊆ β, then α ∩ β = α; and if β ⊆ α then α ∩ β = β,
ordinals both. The case for α ∪ β is just like this. □
More weakly, we see that certain intervals of On must be empty. Since α ∩ β ∈ On, we
have α ∩ β ∉ α ∩ β. Therefore either α ∩ β ∉ α or α ∩ β ∉ β. Therefore, there cannot
be any ordinals intervening between both α and α ∩ β, and β and α ∩ β. Observations like
these are the spur for the classical proof that On is linearly ordered.
PROPOSITION 6.12. The ordinals are well founded.
Proof. We have to show that a non-empty θ ⊆ On has a least member. Let β ∈ θ . The
idea is that β is ∈-least in θ, or else the least member of β is the least member of θ . Note
that β ∩ θ ⊆ β; since β is well ordered, β ∩ θ is well ordered (by Proposition 6.4). Either
β ∩ θ is empty, or not.
Suppose ∀y(y ∉ β ∩ θ). We infer by t-introduction that

θ ⊆ On ∧ β ∈ θ → β ∈ θ ∧ ∀y(y ∉ β ∨ y ∉ θ).
This says that β is a least member of θ, and generalizes:
θ ⊆ On ∧ ∃z z ∈ θ → ∃z(z ∈ θ ∧ ∀y(y ∉ z ∨ y ∉ θ)).
So far this shows ∀y(y ∉ β ∩ θ) ⊢ Wf(On). On the other hand, suppose ∃y(y ∈ β ∩ θ).
Since β ∩ θ is well founded, trading ∃ for γ we have
γ ∈ β ∧ γ ∈ θ ∧ ∀y(y ∉ γ ∨ y ∉ β ∨ y ∉ θ).
As γ ⊆ β (since γ ∈ β), by contraposition y ∉ β → y ∉ γ. Therefore the above formula
reduces to
γ ∈ θ ∧ ∀y(y ∉ γ ∨ y ∉ θ),
which says that γ is ∈-minimal in θ. Again with t-introduction this generalizes to
θ ⊆ On ∧ ∃z z ∈ θ → ∃z(z ∈ θ ∧ ∀y(y ∉ z ∨ y ∉ θ)).
This shows ¬∀y(y ∉ β ∩ θ) ⊢ Wf(On).
Both directions of an excluded middle lead to Wf(On). The derived meta-rule,
if Φ ⊢ ϒ and ¬Φ ⊢ ϒ, then Φ ∨ ¬Φ ⊢ ϒ,
is used to incorporate the law of excluded middle. We have a theorem. Any arbitrary non-empty
subset of ordinals has a least member. □

PROPOSITION 6.13. Any transitive set of ordinals, connected to all other ordinals by
⊆, is an ordinal.
Proof. Any set of ordinals is well ordered, by the previous theorems. Then the definition
of being an ordinal is satisfied. □

THEOREM 6.14 (Burali–Forti 1897). On ∈ On.

Proof. By the transitivity of On, we have y ∈ On → y ⊆ On. By ∨-introduction,
y ⊆ On → y ⊆ On ∨ On ⊆ y. With conjunctive syllogism, y ∈ On → y ⊆ On ∨
On ⊆ y as required. On is a well-ordered, transitive set of ordinals connected to all other
ordinals. □

COROLLARY 6.15. On ∉ On, and then On ≠ On.

Proof. From Proposition 6.7, and then Proposition 4.4. □
The Burali–Forti contradiction is a key reason to think naive set theory is inconsistent
on an interesting mathematical level (as opposed to Russell’s contradiction, which can
look artificial). It would have been perverse if a logic for naive set theory could not
reproduce the paradoxes of naive set theory. With Theorem 6.14, we have shown that,
with a paraconsistent logic, we can still prove the very theorems that moved us in the first
place.
At this point, nevertheless, the limitations of our logic leave us with an unsolved problem
of basic relevant set theory. To see that On is closed under some expected operations, and
that certain objects are ordinals, we need the following lemma:
CONJECTURE 6.16. Let α, β ∈ On, let θ be a well-ordered, transitive set of ordinals,
and let α ⊆ θ ⊆ β. Then θ ∈ On.
To prove this, we need to show under the assumptions of the conjecture that ∀y(y ∈
On → y ⊆ θ ∨ θ ⊆ y); then all the conditions for θ being an ordinal are satisfied.
There are no counterexamples to the conjecture, due to the facts about intervals cited after
Proposition 6.11 above—which show that any stretches of ordinals failing the conjecture
would have to be empty. Nevertheless, I do not have a proof that is adequate from a
paraconsistent perspective. (Solatium miseris, socios habuisse malorum.) Meanwhile, any
set of ordinals θ is such that ∅ ⊆ θ ⊆ On; so if this conjecture were proved, then for any
θ that is transitive and well ordered, θ is an ordinal. We will assume the conjecture for the
duration.
DEFINITION 6.17. α⁺ = α ∪ {α}_On is the successor of α. Ordinals α with no predecessor,
¬∃β(β ∈ On ∧ β⁺ = α), are called limits.
In the definition, {α} is relativized to On, that is, as {x ∈ On : x = α}. This is harmless,
because the relevance-inducing conjunct x ∈ On can always be eliminated going right.
And the definition makes it easy to prove, with →s, that every member of α⁺ is an ordinal.
PROPOSITION 6.18. α⁺ ⊆ On.
Proof. Well, x ∈ α⁺ → (x ∈ α ∨ x = α) ∧ x ∈ On by the definitions of relevant
singleton and ordinal successor. And (x ∈ α ∨ x = α) ∧ x ∈ On → x ∈ On by ∧
elimination, so the result holds by conjunctive syllogism. □

PROPOSITION 6.19. α ∈ On ⊢ α⁺ ∈ On.

Proof. All the members of α⁺ are ordinals, therefore α⁺ is well ordered. For transitivity,
1  x ∈ α⁺ → x ∈ α ∨ (x ⊆ α ∧ x ∈ On).
2  x ∈ α → x ⊆ α.
3  x ⊆ α → x ⊆ α ∪ {α}_On.
4  x ∈ α⁺ → x ⊆ α⁺.
To show that α is connected to all other ordinals, it suffices to notice that α ⊆ α⁺ ⊆ On.


PROPOSITION 6.20. On is a successor, and On is a limit.

Proof. To show successor: Since On is an ordinal, On⁺ is an ordinal. So On⁺ ∈ On,
and by transitivity, On⁺ ⊆ On. And On ∈ On⁺ like all other ordinals, so On ⊆ On⁺ by
transitivity again. Therefore On = On⁺. To show limit, we need β⁺ = On ⊢ β ∉ On
for all β. We can then apply this rule disjunctively to excluded middle, to obtain ∀β(β ∉
On ∨ β⁺ ≠ On), which is equivalent to On being a limit. We prove β⁺ = On ⊢ β = On
from which β ∉ On follows, since On ∉ On by Corollary 6.15. So let β⁺ = On; then by
Definition 6.17, β ⊆ On. Conversely, since On ∈ On by Theorem 6.14, also On ∈ β⁺.
Then by Definition 6.17 On ∈ β ∨ On = β and in either case, On ⊆ β by the transitivity
of β. So β = On as required.⁸ □

8 Thanks here to Brady.


§7. Arithmetic. Full comprehension induces the set


ω = {x ∈ On : [x = ∅ ∨ ∃y(y⁺ = x)] ∧ ∀y(y ∈ x → y ∈ ω)}.
Members n, m, . . . of ω are called natural numbers, or just numbers. By definition, ω ⊆
On; so it is a well-ordered set.
PROPOSITION 7.1. ω ∈ On.
Proof. Let n ∈ ω. Then by definition, n ⊆ ω, showing transitivity. As a subset of On,
ω is well ordered. And n ⊆ ω ⊆ On, showing ⊆-connection to all other ordinals. So ω is
an ordinal. □
To prove the Peano postulates, we will again consider a restriction of the successor
function to ω, using a relevant singleton. That is, for every n ∈ ω, n⁺ means n ∪ {x ∈ ω :
x = n}.
THEOREM 7.2. ∅ ∈ ω.
Proof. Since ∅ ∈ On ∧ ∅ = ∅ ∧ ∀y(y ∈ ∅ → y ∈ ω), zero is a number. □

THEOREM 7.3. n⁺ ≠ ∅.
Proof. Since n ∈ n⁺ but no n ∈ ∅, zero is not the successor of any number. That is, if
n⁺ = ∅ then n ∈ ∅, which implies everything, including the desired theorem. □

THEOREM 7.4. n ∈ ω ⊢ n⁺ ∈ ω.
Proof. If n ∈ ω, then n⁺ meets all the requirements to be a number: n⁺ ∈ On, and
has a predecessor, and all its members are in n, in which case they are numbers, or else a
number identical to n, a number again. □

THEOREM 7.5. n⁺ = m⁺ ⊢ n = m.
Proof. If n⁺ = m⁺, then ∀z(z ∈ n ∨ (z ∈ ω ∧ z = n) ↔ z ∈ m ∨ (z ∈ ω ∧ z = m)).
Then picking n and m for z respectively,
(n ∈ m ∨ (n ∈ ω ∧ n = m)) ∧ (m ∈ n ∨ (m ∈ ω ∧ m = n)),
which, once the extraneous conjuncts are dropped, distributes to
(n ∈ m ∧ m ∈ n) ∨ (n ∈ m ∧ m = n) ∨ (n = m ∧ m ∈ n) ∨ (n = m ∧ m = n).
Since n ∈ m ∧ m ∈ n implies n = m by two applications of transitivity, each disjunct
above implies that n = m. So the successor of every number is unique. □
Names of the first few natural numbers are
∅ = 0
∅⁺ = 1
∅⁺⁺ = 2
⋮
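The first few numerals can be built explicitly in a classical finite model. The sketch below uses the unrelativized successor n ∪ {n} on Python frozensets (the relevant singleton restricted to ω plays no role in this toy setting); `successor` and `numeral` are illustrative names, not the paper's.

```python
def successor(n):
    """Classical successor n ∪ {n}."""
    return n | frozenset({n})


def numeral(k):
    """The set obtained by applying successor k times to the empty set."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n


zero, one, two, three = (numeral(k) for k in range(4))
```

In this model each numeral is the set of its predecessors, so membership between numerals coincides with the usual strict order, and no successor is empty, in line with Theorems 7.3 and 7.5 read classically.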
The fifth postulate is induction, proved in the general, transfinite case over On.
THEOREM 7.6 (Transfinite induction). Let θ ⊆ On. Suppose

∀β(β ∉ α ∨ β ∈ θ) → α ∈ θ.

Then ¬∃α(α ∈ On ∧ α ∉ θ).

Proof. Suppose ∃α(α ∈ On ∧ α ∉ θ). Then there is a least such,

∃α[α ∉ θ ∧ ∀β(β ∉ α ∨ β ∈ θ)].

But the hypothesis implies ∃β(β ∈ α ∧ β ∉ θ) ∨ α ∈ θ, for every α, which negates this
claim. Therefore there is no least, and so no ordinal not in θ, as required. □

Transfinite induction will hold for any well-ordered set, including ω. The base case, 0,
is covered automatically by the induction hypothesis: If ¬Φ(0) → ∃x(x ∈ 0 ∧ ¬Φ(x)),
then Φ(0) by the explosiveness of ∅.
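The least-counterexample shape of the induction argument can be seen in a small classical model. The sketch below takes the finite ordinals below a hypothetical `bound` as Python integers, with β ∈ α read as β < α; the function names are assumptions for illustration only.

```python
from itertools import combinations


def least_counterexample(theta, bound):
    """Least n < bound that is not in theta, or None if there is none."""
    for n in range(bound):
        if n not in theta:
            return n
    return None


def is_progressive(theta, bound):
    """Induction hypothesis: if every beta < alpha is in theta, then alpha is in theta."""
    return all(
        alpha in theta
        for alpha in range(bound)
        if all(beta in theta for beta in range(alpha))
    )


def induction_holds(bound):
    """Check over every subset theta that progressiveness rules out counterexamples."""
    for size in range(bound + 1):
        for combo in combinations(range(bound), size):
            theta = set(combo)
            if is_progressive(theta, bound) and least_counterexample(theta, bound) is not None:
                return False
    return True


gap = least_counterexample({0, 1, 3}, 5)
verified = induction_holds(5)
```

The base case needs no separate treatment here either: for alpha = 0 the inner condition is vacuously true, so a progressive theta must contain 0.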
A close mate of induction is definition by recursion. There is something appropriate in
proving the recursion theorem, as we are about to, with a set containing itself in its defining
condition.
THEOREM 7.7 (Transfinite recursion). Let h be a function from V to V. There is a
function f from On to V such that

f(α) = h(f|α).
Proof. Take the set
⟨x, y⟩ ∈ f ↔ y = h(f|x).
Existence is immediate from full comprehension. This is a function because h is; if
⟨x, y⟩, ⟨x, z⟩ ∈ f then y = h(f|x) = z. □
The recursion scheme is used to define ordinal arithmetic. To deal with limit ordinals,
we look to least upper bounds.⁹
DEFINITION 7.8. Let X be a set of ordinals. The supremum of X, sup(X), is the least
ordinal δ such that every x ∈ X is either in δ or identical to δ.
Addition, taking h to be h(α) = α⁺, is

α + 0 = α
α + (β + 1) = (α + β) + 1
α + β = sup{α + γ : γ ∈ β} for limit β;

9 The usual identification of sup(α) with ⋃α does not seem to follow in the logic here, or at
least not with ordinals as they are here defined, because the existential quantifier resists a proof
that ⋃α is an ordinal. Nevertheless, where X is a set of ordinals, ordinals including every
member of X certainly exist (On, for example), and so, by well foundedness, a least exists, too.
If there is any question about the uniqueness of a supremum, a choice function (developed in the
next section independently of ordinal arithmetic) can be applied to make a functional selection.
multiplication, taking h to be h_β(α) = α + β, is

α · 0 = 0
α · (β + 1) = α · β + α
α · β = sup{α · γ : γ ∈ β} for limit β;
exponentiation, with h_β(α) = α · β, is
α^0 = 1
α^(β+1) = α^β · α
α^β = sup{α^γ : γ ∈ β} for limit β.
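The zero and successor clauses of these recursions can be exercised on finite ordinals. The sketch below is a minimal illustration assuming finite ordinals are represented by non-negative Python integers, with `+ 1` standing in for the successor; the limit clauses never fire in this range, so sup is not modeled. The function names are illustrative.

```python
def add(alpha, beta):
    """alpha + 0 = alpha; alpha + (beta + 1) = (alpha + beta) + 1."""
    if beta == 0:
        return alpha
    return add(alpha, beta - 1) + 1


def mul(alpha, beta):
    """alpha · 0 = 0; alpha · (beta + 1) = alpha · beta + alpha."""
    if beta == 0:
        return 0
    return add(mul(alpha, beta - 1), alpha)


def power(alpha, beta):
    """alpha^0 = 1; alpha^(beta + 1) = alpha^beta · alpha."""
    if beta == 0:
        return 1
    return mul(power(alpha, beta - 1), alpha)


five = add(2, 3)
six = mul(2, 3)
eight = power(2, 3)
```

Each operation is defined by recursion on its second argument using the previous operation, mirroring how the h of each clause is built from the operation before it.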

§8. Global choice. In 1977, Routley produced an argument for the axiom of global
choice from full comprehension (Routley, 1980), (Priest et al., 1989, p. 374). Since then
the claim that choice is a theorem of naive set theory has become part of paraconsistent
folklore. There are some problems with Routley’s proofs (Weber, forthcoming-b), but his
general idea is correct. Answering a fundamental question about the set concept, we here
surpass classical theory and derive a global choice theorem without need of any further
assumptions. In fact what is proved is a weak version of Cantor’s well-ordering principle,
which he took to be a Denkgesetz, a law of thought (Hallett, 1984). It was in service
of proving this principle, essential to the theory of transfinite cardinals, that Zermelo
formulated his choice principle in 1904. The well ordering here is produced by injecting
V into a particular subset of On, thereby giving a well order for every subset of V , that is,
every set.
A function f : a ⟶ b is injective iff ∀x∀y(x ≠ y → f(x) ≠ f(y)).
THEOREM 8.1. The universe can be well ordered.
Proof. An injection f : V ⟶ On is required. Consider
⟨x, y⟩ ∈ f ↔ x ∈ V ∧ y ∈ On ∧ y = On.
That is, for each x ∈ V, f(x) = On. This is clearly a function. Now,
{α ∈ On : α = On} ⊆ On,
showing that the range of f is a segment of the ordinals and therefore well ordered.
Intuitively, the Burali–Forti paradox indicates that the members of the range of f are
discrete (Corollary 6.15), of the form
. . . ∈ On ∈ On ∈ . . . ,
so {On} may be injected into by arbitrarily large sets, inducing a well order on them.
Formally, because On ≠ On, ∀x∀y(x ≠ y → On ≠ On), so ∀x∀y(x ≠ y → f(x) ≠
f(y)). Therefore f is an injection. Thus
{x_f(x) : f(x) ∈ On}
is a well order on V. □
The proof is clearly not constructive; given the ordering {a_On, b_On, c_On, . . .} on V with
each On distinct, it is not said how to determine a first member. This is exactly the case
with Zermelo’s choice principle, which is a pure existence claim. A proof of a Cantorian
“law of thought” will inevitably be by demonstration of a bare existence of an ordering;
and so here, it has been established that for any well-ordered set there is a least member,
and {On} is well ordered. The difference between Zermelo’s proof and our own is that no
extra assumptions are needed to produce the existence claim. It comes directly from the set
concept.
Since subsets of a well-ordered set are well ordered, we also have the following.
COROLLARY 8.2 (Zermelo 1904). Every set can be well ordered.
The following familiar steps lead to choice (Rubin & Rubin, 1985). The details are not
at all trivial, but are here omitted. A chain is a set connected by ⊆.
PROPOSITION 8.3 (Hausdorff’s maximal principle). Every set has a maximal chain.
LEMMA 8.4 (Zorn). If every chain of some non-empty a has an upper bound, then a
has a maximal element.
A function f is called a choice function on a iff f(x) ∈ x when x ∈ a and ∃z z ∈ x.
THEOREM 8.5 (Choice). There is a choice function on every set.
COROLLARY 8.6 (Global choice). There is a choice function on V. A fortiori, for every
non-empty a there is a choice function f such that f(a) ∈ a.
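The classical step from a well order to a choice function is to select the least member of each non-empty set. The sketch below shows that step for a finite family, assuming the well order is supplied as an injective ranking of elements by natural numbers (the hypothetical `rank` argument). It illustrates the familiar classical route only, not the non-constructive ordering of Theorem 8.1.

```python
def choice_function(family, rank):
    """Map each non-empty member of family to its least element under rank.

    `rank` is assumed to be injective on the elements involved, so the
    minimum is unique and the result is a function.
    """
    return {x: min(x, key=rank) for x in family if x}


order = ["a", "b", "c", "d"]   # an explicit well order on a small universe
rank = order.index             # position in the list serves as the ranking
family = [frozenset("cd"), frozenset("ba"), frozenset("d"), frozenset()]

f = choice_function(family, rank)
```

Empty members of the family are skipped, matching the proviso ∃z z ∈ x in the definition of a choice function.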

§9. Conclusions. The results in this paper are a beginning. For future work, a more
definite approach to restricted quantification, once one is known, will be invaluable.
A proof of the open conjecture at the end of the section on ordinals is obviously desirable.
And a more effective transfinite induction principle, with →s instead of extensional
connectives in key places, is desirable, too. Nevertheless, I hope these omissions can be
forgiven for now, in light of the significant increase in detail and insight provided.
To conclude on a forward-looking note, let me indicate some very striking results await-
ing in the theory of cardinal numbers. With the cardinals, we see clearly why all the work
on ordinals has been worthwhile, and also just how powerful the theory is. The construction
follows von Neumann’s assignment of least ordinals in an equinumerous class as cardinals.
Let < denote strict ordering by cardinality.
First, On should itself be a cardinal number, being evidently the least of its size. It
would be the cardinal of V . All cardinals are ordinals, so all cardinals are less than On.
This shows that
∃x∀y(|y| < |x|).
Then a simple quantifier swap proves the essence of Cantor’s theorem, that there are
distinctive orders of infinity:
∀y∃x(|y| < |x|).
For example, if |ω| := ℵ₀ by definition, then ∃y(ℵ₀ < y), so there are uncountable
cardinals. Note that this transcendence result holds even for V or On themselves—Cantor’s
paradoxes.
In a similar vein, a cardinal λ is said to be inaccessible if for every κ < λ, also 2^κ < λ.
The existence of inaccessible cardinals is unprovable in ZFC. But, almost trivially, On
is such a cardinal: For κ < On, also 2^κ < On, because On is the biggest. Therefore
inaccessible cardinals exist.
Finally, the generalized continuum hypothesis GCH conjectures for all cardinals κ, λ
that ¬(κ < λ < |P(κ)|). The GCH evidently fails at On, as the cardinal On provides a
counterexample. Assuming that On < |P(On)|, still we would have (by the Burali–Forti
contradiction) that On < On < |P(On)|. Then again, this does not rule out the GCH in
general; in fact, an instance of GCH holds at On. Let λ be a cardinal On < λ < |P(On)|.
Being a cardinal, ¬(On < λ), since all cardinals are members of, and less than, On. Thus
¬(On < λ < |P(On)|). In fact, further assuming that On = On + 1 obtains via the
Schröder–Bernstein theorem, and that On = ℵ_On, then by existential generalization,
∃α(2^(ℵ_α) = ℵ_(α+1)).

§10. Acknowledgment. Much thanks to Graham Priest, Greg Restall, Ross Brady,
Conrad Asmus, Sam Butchardt, Lloyd Humberstone, Stewart Shapiro, and anonymous
referees.

BIBLIOGRAPHY
Asmus, C. (2009). Restricted Arrow. Journal of Philosophical Logic, 38, 405–431.
Batens, D., Mortensen, C., Priest, G., & van Bendegem, J.-P., editors. (2000). Frontiers of
Paraconsistent Logic. Baldock, Hertfordshire, England and Philadelphia, PA: Research
Studies Press.
Beall, J. C., Brady, R. T., Hazen, A. P., Priest, G., & Restall, G. (2006). Relevant restricted
quantification. Journal of Philosophical Logic, 35, 587–598.
Brady, R. (1971). The consistency of the axioms of abstraction and
extensionality in a three valued logic. Notre Dame Journal of Formal Logic, 12, 447–453.
Brady, R., editor. (2003). Relevant Logics and Their Rivals, Volume II: A Continuation
of the Work of Richard Sylvan, Robert Meyer, Val Plumwood and Ross Brady. With
contributions by: Martin Bunder, Andre Fuhrmann, Andrea Loparic, Edwin Mares, Chris
Mortensen, and Alasdair Urquhart. Aldershot, Hampshire, UK: Ashgate.
Brady, R. (2006). Universal Logic. Stanford, California: CSLI.
Brady, R. T., & Routley, R. (1989) The non-triviality of extensional dialectical set theory.
In Priest, G., Routley, R., and Norman, J., editors. Paraconsistent Logic: Essays on the
Inconsistent. Munich: Philosophia Verlag, pp. 415–436.
da Costa, N. (2000). Paraconsistent mathematics. In Batens, D., Mortensen, C., Priest, G.,
and van Bendegem, J.-P., editors. Frontiers of Paraconsistent Logic. Baldock,
Hertfordshire, England and Philadelphia, PA: Research Studies Press, pp. 165–180.
Drake, F. (1974). Set Theory: An Introduction to Large Cardinals. Amsterdam: North
Holland Publishing Co.
Hallett, M. (1984). Cantorian Set Theory and Limitation of Size. Oxford Logic Guides.
Oxford: Clarendon Press.
Kunen, K. (1980). Set Theory: An Introduction to Independence Proofs. Amsterdam: North
Holland Publishing Co.
Levy, A. (1979). Basic Set Theory. Berlin, Heidelberg and New York: Springer Verlag.
Reprinted by Dover, 2002.
Libert, T. (2005). Models for paraconsistent set theory. Journal of Applied Logic, 3, 15–41.
Mares, E. (2004). Relevant Logic. Cambridge, UK; New York: Cambridge University
Press.
Meyer, R. K., Routley, R., & Dunn, J. M. (1978). Curry’s paradox. Analysis, 39, 124–128.
Rumored to have been written only by Meyer.
Petersen, U. (2000). Logic without contraction as based on inclusion and unrestricted
abstraction. Studia Logica, 64, 365–403.
Priest, G. (2006). In Contradiction: A Study of the Transconsistent. Oxford, UK: Oxford
University Press. Second expanded edition of Priest (1987).
Priest, G., Routley, R., & Norman, J., editors. (1989). Paraconsistent Logic: Essays on the
Inconsistent. Munich: Philosophia Verlag.
Restall, G. (1992). A note on naïve set theory in LP. Notre Dame Journal of Formal Logic,
33, 422–432.
Routley, R. (1980). Exploring Meinong’s Jungle and Beyond. Canberra: Philosophy
Department, RSSS, Australian National University. Interim Edition, Departmental
Monograph number 3.
Routley, R., & Meyer, R. K. (1976). Dialectical logic, classical logic and the consistency
of the world. Studies in Soviet Thought, 16, 1–25.
Rubin, H., & Rubin, J. E. (1985) [1963]. Equivalents of the Axiom of Choice. Amsterdam:
North Holland Publishing Co.
Weber, Z. (forthcoming-a). Extensionality and restriction in naive set theory. Studia Logica.
Weber, Z. (forthcoming-b). Notes on inconsistent set theory. In Tanaka, K., Berto, F.,
Paoli, F., and Mares, E., editors. World Congress of Paraconsistency 4.
Zermelo, E. (1967). Investigations in the foundations of set theory. In van Heijenoort,
J., editor. From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931.
Cambridge, MA: Harvard University Press, pp. 200–215.

SCHOOL OF PHILOSOPHY AND HISTORICAL INQUIRY


UNIVERSITY OF SYDNEY
NSW 2006 AUSTRALIA
E-mail: zach.weber@usyd.edu.au
THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010

PASCH’S PHILOSOPHY OF MATHEMATICS


DIRK SCHLIMM
Department of Philosophy, McGill University

Abstract. Moritz Pasch (1843–1930) gave the first rigorous axiomatization of projective geom-
etry in his Vorlesungen über neuere Geometrie (1882), in which he also clearly formulated the view
that deductions must be independent from the meanings of the nonlogical terms involved. Pasch also
presented in these lectures the main tenets of his philosophy of mathematics, which he continued to
elaborate on throughout the rest of his life. This philosophy is quite unique in combining a deductivist
methodology with a radically empiricist epistemology for mathematics. By taking into consideration
publications from the entire span of Pasch’s career, the latter decades of which he devoted primarily
to careful reflections on the nature of mathematics and of mathematical knowledge, Pasch’s highly
original, but virtually unknown, philosophy of mathematics is presented.

§1. Introduction. Moritz Pasch’s influence on the development of modern mathematics
cannot be overestimated. In 1882 he presented in his Vorlesungen über neuere
Geometrie the first rigorous axiomatization of projective geometry, which has been called
the ‘birthplace of modern axiomatics’ (Engel & Dehn, 1934, p. 133) and which earned
him the honor of being referred to as the ‘father of rigor in geometry’ (Freudenthal, 1962,
p. 619). Indeed, Pasch’s lectures exerted a considerable direct influence on Hilbert’s think-
ing about geometry and axiomatics in general, as can be seen from the development of
Hilbert’s lecture notes on geometry in the 1890s, which contain lengthy paraphrases of
Pasch’s discussions, and also from remarks Hilbert made in his correspondence.1 This deep
influence is not acknowledged properly in Hilbert’s seminal Grundlagen der Geometrie
(1899), where Pasch is only credited in a footnote for the first ‘detailed investigations’ of
the axioms of betweenness, in particular the axiom that later became known as Pasch’s
axiom. Nonetheless, Hilbert’s brief published remarks have been reason enough for Pasch
to be mentioned in almost every account of the history of modern geometry. In addition
to Pasch’s impact on Hilbert, his book also exerted considerable influence on the work
of Peano and of the Italian school of geometry,2 and it is discussed in detail in Russell’s
Principles of Mathematics (Russell, 1903, pp. 393–403). But Pasch not only sparked
the development of modern geometry; he also lived long enough to witness its progress
in the first three decades of the twentieth century.3 During all this time he was deeply

Received June 3, 2009


1 In a letter to Friedrich Engel from January 14, 1894, Hilbert writes with reference to Pasch’s
lectures: ‘I have learned non-Euclidean geometry solely from this book’ (quoted from Tamari
(2007, p. 113)). For the development of Hilbert’s lectures on geometry, see Hallett & Majer
(2004).
2 See Peano (1889b), Contro (1976), Gandon (2006), and Marchisotto & Smith (2007). I am
grateful to an anonymous reviewer for bringing to my attention the article by Gandon.
3 Moritz Pasch was born November 8, 1843, in Breslau, where he also studied mathematics with
Schröter; he wrote his dissertation in 1865 in Breslau, then spent two semesters in Berlin with


© Association for Symbolic Logic, 2010
doi:10.1017/S1755020309990311

concerned with the foundations of geometry as well as those of analysis and arithmetic,
and he developed a distinctive and original philosophy of mathematics.
The thesis that underlies the present paper is that Pasch’s reflections on the nature
of mathematics, which he presented throughout his life, in particular in his later more
systematic accounts, are elaborations and refinements of a philosophical position that he
put forward already in his famous lectures of 1882. One clear indication of this is the
later editions of these lectures in 1912 and 1926, where he had the opportunity to modify
or retract his earlier claims, but in which he only elaborated and added minor details. Thus,
what we can find in Pasch is a very thoughtful and consistent approach to the foundations
of mathematics.
The two main aspects of Pasch’s philosophy are a formal stance with regard to the valid-
ity of mathematical deductions and a strong commitment to an empiricist understanding of
the basic concepts of mathematics. On the face of it, these views might appear incompatible
in various ways. Firstly, that the meanings of the mathematical terms are given empirically
might not square with Pasch’s particular conception of deductivism, according to which
deductions must be independent of the meanings of the terms. Secondly, the introduction of
ideal elements in mathematics might stand in conflict with an empirical stance, and thirdly,
empiricism might appear to be incompatible with the common view that mathematical
deductions provide certain and necessary knowledge. (These aspects of incompatibility
will be addressed below.) Pasch’s deductivism and empiricism are mentioned in Ernest
Nagel’s informative paper on the development of geometry and logic (Nagel, 1939), with-
out, however, containing an account of Pasch’s attempt to reconcile them. Such an account
is also missing from Walter Contro’s detailed analysis of the axioms presented in Pasch’s
1882 lectures on projective geometry (Contro, 1976), and from the recent, in parts highly
speculative discussion in Tamari (2007).4 By drawing on Pasch’s lectures on geometry,
as well as his publications on the foundations of analysis and arithmetic, and his later
more philosophical works, this paper presents Pasch’s quite unique philosophy of mathe-
matics as a coherent system.5 The historical context and Pasch’s views of the relationship
between mathematical and philosophical investigations, which form the framework for
Pasch’s work, are presented in the next section. The tension between his radical empiricism,
aimed at providing an epistemological basis for mathematics, and his goal to capture
the essence of mathematical reasoning deserves particular attention. While Pasch maintains
that empiricism provides the best philosophical foundations for mathematics, he also
advances a very modern deductivist methodology for purely mathematical investigations.
These views are discussed in Sections 3 and 4, respectively. Finally, Pasch’s efforts to
merge these considerations into a unified whole, which I shall refer to as Pasch’s pro-
gramme, are presented in Section 5.
In presenting the reflections on mathematics and mathematical practice of a deep and
clear thinker such as Moritz Pasch, who stood with one foot firm in the empiricist tradition
of the nineteenth century, while vigorously striding with his other foot into the modern

Kronecker and Weierstrass, and submitted his Habilitation in 1870 in Giessen; after having been
Privatdozent at the University of Giessen he became extraordinary professor in 1873, and was
full professor from 1875 until 1910; he died September 20, 1930, at the age of 86.
4 For a coherent interpretation of Pasch’s mathematical work, see Gandon (2005). (Footnote added
November 2009).
5 Not all aspects of Pasch’s views can be dealt with in a satisfactory manner in the present paper.
Some of these are mentioned in the Concluding Remarks below, and are intended to be covered
in future work.

mathematics of the twentieth century, this paper is also intended as a contribution toward
a better understanding of the radical transition mathematics underwent at the turn of the
twentieth century.

§2. Pasch’s view of mathematics. The received account of the nature of mathematics
in the first half of the nineteenth century was that given by Kant, who considered the
theorems of arithmetic and of Euclidean geometry as synthetic a priori. However, interest in
this topic was revived after non-Euclidean geometries came to be regarded as acceptable
consistent theories through the work of Riemann, Beltrami, and Klein. The presence of vi-
able alternatives to Euclidean geometry cast doubt on Kant’s transcendental reasoning, and
Hermann von Helmholtz famously argued in the late 1860s and early 1870s that the ques-
tion as to which theory of geometry describes best the space we live in should be answered
empirically.6 Around the same time reliance on mathematical intuition was also severely
called into question by another development in mathematics, namely in function theory.
In the 1860s Weierstrass lectured about the possibility of continuous functions that are
nowhere differentiable and soon thereafter many other such ‘monster’ functions, which defied
visualization and which proved commonly accepted intuitions wrong, were studied.
It is against this background that Pasch formed his views on the nature of mathematics.7
One of the earliest insights into Pasch’s own development is offered in letters written
in 1882 to Felix Klein.8 Herein Pasch mentions as influences on his views the lectures of
Kronecker and Weierstrass that he attended in Berlin in 1865–1866,9 and also discussions
in the 1860s with his friend and colleague Jakob Rosanes.10 In these letters Pasch also
expresses his disappointment regarding the views of the few philosophers that he has read
(without mentioning any names, however). Nevertheless, Pasch thought his views to be so
commonsensical that he assumed them to be generally shared and he was surprised to hear
of Klein’s experiences to the contrary. In print, Pasch readily points out that his views are
‘by no means new’ (Pasch, 1887a, p. 129), but again without mentioning any predecessors.
We find only a brief reference in Pasch (1882a, p. 17) to von Helmholtz (1876), but whether
Pasch was in fact influenced by von Helmholtz, or whether he just quoted the famous
scientist in support of a view that he arrived at independently or influenced by other
authors remains an open question.
Thus, it seems that the seed of Pasch’s views on mathematics, which underlie his axiomatization
of geometry as well as his other foundational and philosophical investigations, was
planted early in his career. As he admits without hesitation (and as will be discussed below),
particular aspects of his philosophical outlook evolved over time, but on the whole Pasch
remained committed throughout his life to the two pillars of his philosophy: deductivism

6 von Helmholtz (1866, 1876); for a discussion, see DiSalle (1993).


7 For historical overviews of the developments just sketched, see Volkert (1986) and Gray (2007).
8 Letters from June 16 and 22, 1882, held at the Staats- und Universitätsbibliothek Göttingen,
Sig. Klein 11, 176, and 177. Unfortunately, not much material from the time period before 1882
has been preserved in Pasch’s Nachlass at the University of Giessen.
9 This claim is repeated in Pasch’s short autobiography (Pasch, 1930b, p. 7), where he praises
both Kronecker and Weierstrass for having taught him the necessary tools for his foundational
investigations.
10 In his Habilitation lecture of 1870 Rosanes states that ‘in more recent times, one has
predominantly switched over to von Helmholtz’s so-called empiricist theory, according to which
space is nothing more than a concept that has been abstracted from experience’ and also mentions
Locke as a proponent of an empiricist theory of space (Rosanes, 1871, p. 8; emphasis in original).

Table 1. Layers of mathematical and philosophical investigations

Investigations                                 Goals

Mathematical    Research (‘rough’)             New mathematical results
Mathematical    Foundational (‘delicate’)      Stem concepts and propositions
Philosophical   Foundational                   Core concepts and propositions
Philosophical   Pre-scientific                 Necessary conditions, skills, etc.

and empiricism. Pasch published a systematic account of his views, which I turn to next,
only in the last two decades of his career.
In order to accommodate his deductivism and empiricism into a coherent picture, Pasch
distinguishes between different layers of mathematical and philosophical investigations,
which are characterized by different aims and methodologies (see Table 1). According to
this picture, mathematical investigations take place at two distinct layers. The first one,
which Pasch describes as ‘rough’ (‘derb’) mathematics, comprises the usual work that is
done by mathematicians in order to obtain new results (Pasch, 1918a, p. 230).11 The bulk
of mathematical research falls into this category, and it is worth mentioning already at this
point that as a practicing mathematician Pasch was well aware of the distinction between
how mathematics is presented and how it is actually done (more on this later).
The second layer of mathematical work is foundational in character and it involves care-
fully working out the fundamental concepts and propositions of a discipline and showing
how the entire discipline can be built up from them. Pasch refers to this part of mathematics
as ‘heikel’ (Pasch, 1918a, p. 230), which is translated here as ‘delicate’, but could also mean
‘finicky’ and ‘touchy’. His own axiomatization of projective geometry (Pasch, 1882a) and
his introduction to analysis (Pasch, 1882b) are examples of such investigations. Delicate
mathematics is guided by the difficult demand for a ‘scrupulous completeness of the trains
of thought’ (‘unbedingte Vollständigkeit der Gedankengänge’) and is motivated by an ‘urge
for pure knowledge’ (‘entspringt [ . . . ] dem Drange nach Erkenntnis überhaupt’) (Pasch,
1924a, p. 36). Such investigations aim at an axiomatic presentation of a mathematical
discipline, which Pasch calls a stem (‘Stamm’), consisting of stem concepts and propo-
sitions (Pasch, 1882a, pp. 74, 98).12 On their basis a mathematical theory can be built up
deductively, and as long as they are not given any philosophical grounding, Pasch also
refers to them as ‘hypothetical’ (Pasch, 1917, p. 185) or ‘mathematical’ (Pasch, 1924a,
p. 43).
Once a mathematical foundation of a discipline has been given, the philosophical task
arises of determining the meanings of the mathematical terms and of giving an account
of their applicability to the world. In other words, a ‘substructure’ (‘Unterbau’) has to
be provided that supports and grounds the mathematical theory (Pasch, 1917, p. 185).13
For these philosophical foundations different approaches are possible, and Pasch mentions

11 See also Pasch (1924a, p. 35).


12 In Pasch (1924a, p. 16), Pasch refers to stem propositions also as ‘basic propositions, axioms,
postulates’. For more on Pasch’s choice of terminology, see below.
13 See also Pasch (1924a, p. 42).

rationalist, a priori, and empiricist accounts as alternatives (Pasch, 1926c, p. 138). For
reasons that will be discussed later (Section 3), Pasch himself decided to pursue a radical
empiricist approach. The details of Pasch’s efforts to connect the philosophical substruc-
ture to the purely mathematical foundations are discussed under the heading of ‘Pasch’s
programme’ below (Section 5).
Finally, a second layer of philosophical investigations is concerned with uncovering the
conditions, skills, and so forth, that are necessary for employing the basic concepts and car-
rying out the investigations at the higher levels. Pasch refers to this layer as investigations
regarding the ‘prescientific origins’ or simply the ‘origin’ (‘Ursprung’) of mathematics
and thinking in general (Pasch, 1924a, p. 40), and he identifies as its fundamental concepts
those of ‘thing’, ‘proper name’, ‘event’ (in particular that of ‘naming a thing’), ‘collective
name’, ‘earlier’ and ‘later’ events, ‘immediate following’, and ‘chain’ of events (Pasch,
1924b, p. 234).14 Pasch’s investigations at this layer might be characterized, borrowing an
expression of Hilbert, as a deepening of the foundations of human knowledge.15
The investigations at each of the top three layers can be pursued independently of the
considerations pertaining to a lower layer, which allows for the division of ordinary and
foundational research as well as the division of mathematical and philosophical labor. As
a consequence, mathematicians can ignore the questions regarding the origins and the
applicability of mathematics altogether, and most often they do.16 For Pasch, however,
a complete picture of mathematics requires an account of each of these four layers and of
their interconnections. To emphasize and illustrate this organic, hierarchical structure Pasch
employs terminology that evokes the picture of a tree of mathematics: On the one hand,
he refers to the philosophical foundations as a ‘Kern’, which is rendered here as core, but
could also be translated as ‘pip’ or ‘kernel’, that consists of core concepts and propositions
(‘Kernbegriffe’ and ‘Kernsätze’) (Pasch, 1916).17 On the other hand, the mathematical
foundations of a discipline are called a ‘Stamm’, translated here as stem, which could
also be rendered as ‘stalk’ or ‘trunk’, that consists of stem concepts and propositions
(‘Stammbegriffe’ and ‘Stammsätze’); in accordance with this botanical metaphor, the do-
main of philosophical inquiry that is common to all sciences is referred to as an area of
roots (‘Wurzelgebiet’) in Pasch (1924a, p. 34).
Failure to notice Pasch’s distinction between a (mathematical) stem and a (philosophical)
core, and indiscriminate reference to both stem propositions and core propositions as
‘axioms’ has led to misinterpretations and disputes in the literature. For example, Kline
(1972, p. 1008) mentions that some of Pasch’s axioms have empirical origins, while Torretti
(1978, p. 211, and footnote 49) explicitly disagrees with this assessment and claims that
Pasch considers all axioms to be empirically grounded.18 Since Nagel also does not address
Pasch’s crucial distinction between a core and a stem (Nagel, 1939, pp. 193–199), it has
also been missed by many later commentators who relied heavily on Nagel’s account.

14 See also Pasch (1927), Pasch (1930a), and Pasch (1980, p. 16). The search for such origins can
certainly be traced back to Kant, but also some of Pasch’s colleagues addressed such questions,
for example, Dedekind (1888, p. 336), and Veronese (1894, pp. 1–2).
15 See Pasch’s discussion of Hilbert (1922) and Hilbert (1923) in Pasch (1924b, pp. 236–240).
16 See Pasch (1912, p. 204), Pasch (1924a, p. 43), and Pasch (1927, p. 123).
17 Core propositions are also referred to as ‘primitive stem propositions’ in Pasch (1924a, p. 16
[1915]), and in Pasch (1924b, p. 232) the core is referred to as a ‘ “natural” stem’.
18 A similar claim is made in Boniface (2004, p. 133).

In the course of his career, which spanned over 60 years, Pasch devoted his attention
increasingly to the deeper layers of mathematical and philosophical investigations, de-
scribing his aim as ‘getting as far as possible to the beginnings’ (Pasch, 1926b, p. 166).
Pasch’s earliest publications were a few short research articles, after which he brought
out two books in 1882, both of which are concerned with foundational mathematical
work and are interspersed with philosophical reflections. Soon after Pasch’s lectures on
geometry, his Einleitung in die Differential- und Integralrechnung (Pasch, 1882b) appeared.19
But Pasch was not able to develop the foundations of analysis as deeply as
he had intended, and he tried to remedy this in Grundlagen der Analysis (Pasch, 1909)
and Veränderliche und Funktion (Pasch, 1914). Only after his retirement, having taught for
almost four decades at the University of Giessen,20 did Pasch find the time to write on more
philosophical topics. This led to numerous articles and the collections Mathematik und
Logik (Pasch, 1919, 1924a), Mathematik am Ursprung (Pasch, 1927), and Der Ursprung
des Zahlbegriffs (Pasch, 1930a).21
According to his strong conviction of the existence of a tight connection between ‘cor-
rectness of linguistic expression and correctness of thinking’ (‘Sprachrichtigkeit und Denk-
richtigkeit’) (Pasch, 1930b, p. 6),22 Pasch always struggled to find the most appropriate
terminology for expressing his ideas. For example, while he distinguished between ‘basic’
and ‘stem’ concepts and propositions in Pasch (1882a, pp. 74 and 98), he began referring to
the former as ‘core’ in Pasch (1916, p. 276), remarking that his original terms ‘Grundsätze’
and ‘Grundbegriffe’ were often understood in a different sense than he intended. Changes
in terminology also reflect changes in Pasch’s way of thinking. For example, the distinction
between ‘rough’ and ‘delicate’ mathematics (Pasch, 1918a, p. 230) was first introduced as
one between ‘consistent’ (‘konsistent’) and ‘disputable’ (‘strittig’) mathematics 2 years
earlier (Pasch, 1916, p. 275). Similarly, the distinction between ‘proper’ and ‘improper’
mathematics that Pasch introduces in Pasch (1914, pp. 153–157) was later reformulated as
one between ‘perfect’ and ‘imperfect’ mathematics (Pasch, 1918a, p. 230).23 In later years
Pasch also urged the use of different names for mathematical notions and their empirical
correlates, on the grounds that their conceptual differences can be easily overlooked if
they are both referred to by similar names, and he suggested the terms ‘location’, ‘path’,
‘segment’, ‘bowl’, and ‘plate’ (‘Stelle’, ‘Weg’, ‘Strecke’, ‘Schale’, ‘Platte’) as names for
the empirical conceptions of point, line, straight line, surface, and flat surface (Pasch, 1917,
p. 187).24
Pasch was very well aware of the tentative character of axiomatic presentations and he
continuously tried to improve on his previous work by publishing lists of corrections to

19 The preface of Vorlesungen (Pasch, 1882a) is dated ‘March 1882’, while that of Einleitung (Pasch,
1882b) is dated ‘May 1882’.
20 See Pickert (1980, pp. 49–57) for a list of the courses taught by Pasch in Giessen.
21 Interestingly, more publications by Pasch appeared in the two decades after his retirement than
before.
22 Pasch elsewhere describes the aim of mathematics as ‘the most complete clarity of thought and
of their linguistic expressions’ (Pasch, 1924a, pp. 39–40); for a practical example, see Pasch
(1887b, p. 132), where he introduces new terminology that allows for ‘more precise and shorter
formulations’.
23 The distinction between perfect and imperfect mathematics will be discussed in connection with
the notion of decidability in Section 4, below.
24 See also Pasch (1930a, p. 19) for Pasch’s use of ‘Rotte’ instead of ‘Reihe’, which he used in Pasch
(1909, p. 7).

earlier publications and slightly changing formulations even in reprints.25 Pasch’s attitude
toward foundational work is expressed quite tellingly in his review of a book by Dingler
on the notion of logical independence in mathematics, subtitled Also an Introduction to
Axiomatics (Dingler, 1915). Here Pasch criticizes the author for reprinting an obviously un-
finished article without further revisions and for not seriously trying to present a complete
set of core propositions (Pasch, 1916, p. 276). Pasch concludes that the book would need
further ‘patient work’ before being able to yield concrete results from the accumulated raw
material. In addition, Pasch demands higher standards regarding the ‘exactness of one’s
thought and expression’ and a more thorough self-criticism especially from somebody
who writes an introduction to axiomatics. There are good reasons to believe that he did
hold himself to such standards.26

§3. Pasch’s empiricism. Pasch’s version of empiricism, the main points of which I
will try to outline in this section, differs in important respects from the views held by his
contemporaries, but bears some resemblance to the views of Berkeley, Locke, and Hume.27
In contrast to the question as to which geometry is the ‘right’ description of space, which
was the driving force behind von Helmholtz’s form of empiricism, Pasch’s main concern
was the nature of the fundamental concepts of mathematics. A satisfactory account of this,
according to Pasch, must answer questions regarding the applicability of mathematics as
well as the epistemology and certainty of mathematical knowledge. In accord with my
thesis that the main elements of Pasch’s philosophy can be found already in his lectures
on projective geometry, I shall begin the discussion with Pasch’s remarks on the nature of
geometry.
In the opening sentence of his Vorlesungen über neuere Geometrie (Pasch, 1882a, V),
Pasch laments that the empirical origins of geometry have not been consistently brought
out in the recent treatments of this discipline that tried to meet the increased standards
of rigor, and he announces that his lectures aim at carrying out such a project. Shortly
after this pronouncement Pasch justifies his point of view by claiming that the successful
applications of geometry in daily life and in science are based on the fact that the geometric
concepts originally conformed exactly with empirical objects, and that only later they
were ‘covered by a network of artificial concepts’ to foster the advancement of theoretical
developments. By restricting himself from the start to empirical concepts only, Pasch
intends to retain the character of geometry as a natural science.28 A few pages later he
repeats his resolve of steadfastly holding on to the empiricist standpoint, according to

25 This can be seen, for example, in the additions to the 1912 edition of his lectures on projective
geometry and the various (seemingly overly pedantic) corrections to previous publications that
he adds in later works. Just to mention a few, Pasch (1909) contains corrections to Pasch (1882a)
on pp. 117–188 and to Pasch (1882b) on p. 120; corrections to Pasch (1912) are listed in Pasch
(1914, VI) together with further corrections to Pasch (1909). A number of small changes in the
text can be found in the versions of Pasch (1894) reprinted in Pasch (1909), Pasch (1919), and
Pasch (1924a).
26 Pasch’s publications, as well as his autobiographical reflections and the descriptions of his
character and work ethic by people who knew him personally confirm this; see Pasch (1930b,
p. 10), Dehn (1928), Engel & Dehn (1934), and Tamari (2007).
27 See Jesseph (1993, pp. 44–87) and Pressman (1997); see also John Stuart Mill’s A System of Logic
(Mill, 1851), and Harré (2003).
28 The view that geometry is a natural science is frequently echoed by Hilbert, see Hallett & Majer
(2004, pp. 66, 197, 266, 504).

which ‘geometry is seen as nothing else but a part of natural science’ (Pasch, 1882a, p. 3).
Thus, Pasch presents his work from the outset as providing philosophical foundations for
projective geometry, in addition to purely mathematical ones (see Table 1, above). This
combination, which is rather unusual in its extent for a mathematical treatise, may have
come as a surprise and possibly also as an irritation to readers who did not share the aims
and methods of Pasch’s conception of mathematics and his philosophical project.29
Pasch’s original move, which characterizes his version of empiricism and sets it apart
from that of his contemporaries, is to take empirical concepts as the starting point for a
rigorous development of mathematical theories. This involves two main steps: First, the
stem concepts of a discipline have to be developed from the empirical core concepts, and
second, the remainder of the theory has to be based on the stem concepts alone. Note
that in order to guarantee the empirical character of mathematics as a whole, its theorems
must inherit the epistemological status of the axioms; it is at this point that Pasch’s
deductivism becomes fundamental for establishing his empiricism.
As a general and essential criterion for the choice of core concepts Pasch holds that they
should be able to explain how the mathematical concepts originated or at least how they
could have originated (Pasch, 1917, p. 190).30 Moreover, they should be as few as possible
and express the simplest content possible (Pasch, 1894, p. 24). Pasch also insists that the
basic terms of a mathematical theory can neither be defined nor can they be reduced to other
concepts, but that we can only understand them through reference to appropriate physical
objects (‘den Hinweis auf geeignete Naturobjecte’) (Pasch, 1882a, p. 16). In particular, the
principle of duality in projective geometry, that is, the fact that the basic terms in the stem
propositions can be interchanged systematically while yielding again valid propositions,
is taken by Pasch as evidence that these propositions cannot be regarded as definitions
of the basic concepts (Pasch, 1914, p. 143). This stands in direct contrast to the modern
understanding of axiom systems as implicit definitions of their primitive terms.31 Thus, in
geometry Pasch introduces points as those objects that cannot be further divided within
the limits of observation determined by the best tools that are currently available to us.
He also rejects the common view that lines must be ‘ “imagined” as being infinitely ex-
tended’, since such a demand does not correspond to any perceptible objects (Pasch, 1882a,
p. 4); instead, Pasch takes the notion of (finite) line segments as a core concept.
In addition to the demand that the basic objects of geometry should be observable, they
must satisfy some further restrictions in order to be usable. For all practical purposes,
configurations of physical geometric objects (i.e., figures or diagrams), Pasch explains,
must be such that, on the one hand, the observer is relatively close to them, and on the other
hand, that their parts are sufficiently close to allow for an immediate grasp of their rela-
tionships (Pasch, 1882a, pp. 18–19). As a consequence, one can have immediate evidence
that these relationships hold only within a relatively small, bounded region of space.32

29 That readers might be irritated is mentioned, for example, in Tamari (2007, pp. 77, 194–195). Two
examples: Russell speaks of Pasch’s ‘empirical pseudo-philosophical reasons’ (Russell, 1903, p.
393) and in a recent commentary Majer mentions some ‘curiosities’ that characterize Pasch’s
approach (Majer, 2004, p. 104).
30 For similar remarks, regarding the axiomatization of arithmetic, see Pasch (1924a, p. 16 [1915]).
31 That Pasch did understand axioms to implicitly define the primitives is claimed in Tamari (2007,
ii, p. 6, and 96). But compare the footnote in Pasch (1920, p. 145), in which Pasch explicitly
denies such an interpretation; see also Gabriel (1978).
32 For similar views on these fundamental assumptions Pasch refers to Riemann (1854, p. 266),
Klein (1871, pp. 576 and 624), and Klein (1873b, p. 134).

In a similar vein, Pasch also notes that general terms (universals) are introduced only with
reference to a finite number of particular objects (Pasch, 1914, p. 3).
For Pasch, the further development of a mathematical discipline proceeds from obser-
vations to propositions. In geometry, repeated observations of concretely given figures
yield simple relations between the basic concepts, some of which are formulated as basic
propositions, from which all other propositions of geometry follow. For example, two of
Pasch’s basic propositions are: ‘I. Between two points one can always draw one and only
one segment’ and ‘VI. Given any two points A and B, it is possible to choose a point C,
such that B lies within the segment AC ’ (Pasch, 1882a, p. 5). Given the empirical referents
of the primitive terms, Pasch notes that these propositions do not hold in general, but
that they are subject to certain restrictions. In order to draw a segment between any two
points, these points must be sufficiently apart from each other, while the points A and B
of basic proposition VI must be sufficiently close to each other to allow for the actual
construction of the third point. These conditions are met in the usual diagrams or mental
visualizations that accompany mathematical investigations, but they must be made explicit
and kept track of in the deductive development of geometry. Theorems that depend on
the above basic propositions are thus also subject to restrictions. For example, the
construction expressed in theorem 8, which states that ‘Given two points A and B on a line,
it is always possible to choose a point C on that line, such that C lies between A and B’
(Pasch, 1882a, p. 10), and which is proved using the above axioms, cannot be applied
indefinitely often (i.e., the ‘always’ must be taken with a grain of salt).
While Pasch does not give specific arguments for his empiricist standpoint in his early
writings, he does provide an argument in Pasch (1914, pp. 138–139), which is based on
the applicability of mathematics. In order to apply mathematical propositions to the world,
the concepts that occur in them must be related to things that occur in experience, which
is straightforward if they are understood to refer to empirical notions from the outset.
If, however, mathematical concepts are not understood as referring to empirical objects,
in which case Pasch calls them ‘hypothetical concepts’, their applicability rests on two
sets of hypotheses: First, the axioms themselves are purely hypothetical, and second, the
association between mathematical concepts and their empirical correlates is hypothetical,
too. The position that Pasch describes here bears strong similarities to that of a hypothetico-
deductive account of mathematics, which must be augmented by ‘coordinative definitions’
to be applied.33 From Pasch’s empiricist standpoint, however, these two kinds of hypothe-
ses present a detour that does not add any benefits, so that the empiricist approach is simpler
and thus to be preferred. In Pasch (1917, pp. 185–186) Pasch remarks that ‘hypothetical
geometry’ is completely independent from physical objects (‘Naturgegenständen’), which
becomes completely obvious if the terms ‘thing of the first, second, and third kind’ are used
instead of ‘points, lines, planes’, as was suggested by Hilbert (1899). From a mathematical
point of view this way of proceeding is unobjectionable for Pasch, but it leaves the relation
to figures and applications unexplained. More generally, he maintained that despite the
fact that the problem of applicability had been widely discussed from a nonempiricist
standpoint, no satisfactory solution had yet been given.
A second argument for empiricism is presented in Pasch (1922, pp. 3–4) and Pasch
(1924a, p. 44). Here Pasch notes that different viewpoints regarding the nature of geom-
etry, for example, that it is ‘a pure creation of human thought’, find their expression in

33 See Reichenbach (1957, p. 14) or Nagel (1961, p. 93); Hilbert’s account of the application of
mathematical theories is also similar.

introductory textbooks. However, a closer look at these expositions also reveals that none
of them remains completely consistent in presenting geometry from a single point of view.
For example, a ‘body’ may have been defined as a part of space that is delimited on all
sides, but it is later said to be moved, despite the fact that a part of space is not something
movable (Pasch, 1924a, p. 44). Without going into further details Pasch argues that it is
impossible to purge all allusions to experience from any introduction to geometry, and
thus, that a coherent presentation should treat geometry as an empirical science. The lack
of a textbook that consistently pursues the empiricist standpoint is explained by the fact
that much more work needs to be done to lay bare the prescientific foundations that such a
presentation would require.34 Nonetheless, Pasch also recommends teaching geometry in
school by starting with empirical notions, since they are what seems to come most naturally
to beginners (Pasch, 1909, pp. 134–135).

§4. Pasch’s deductivism. In addition to exploring the philosophical foundations of
mathematical concepts, Pasch was also interested in capturing an ideal of mathematical
reasoning and in providing a general criterion for mathematical rigor. He shared this goal
with his contemporaries Frege and Hilbert. A related concern of Pasch’s was the clarification
of the role of intuition in mathematical reasoning, and he engaged in brief discussions
with Study and Klein on this issue.35 Pasch’s later investigations led to a careful analysis
of the nature of proofs and to discussions of the notions of decidability, consistency, and
mathematical discovery. As in the previous section, I shall begin with Pasch’s earliest
reflections on these matters.
Pasch informs us in Pasch (1918a, p. 231) that he arrived at his views on deduction
only while writing the Vorlesungen über neuere Geometrie (1882). Just as the opening
sentence of these lectures introduces the reader to the empiricist background of the book,
the next sentence expresses the second cornerstone of Pasch’s philosophy of mathematics:
the view of geometry as a science that obtains its results ‘by purely deductive means’
(Pasch, 1882a, V). What Pasch means by this is that regardless of the content that is
intended to be conveyed by the basic propositions, once these have been put forward no
recourse to perceptual experience should be necessary for the further development of the
theory (Pasch, 1882a, p. 17). He expresses this very modern deductivist stance even more
explicitly in the most often quoted passage from his lectures as follows:36
In fact, if geometry is genuinely deductive, the process of deducing must
be in all respects independent of the sense of the geometrical concepts,
just as it must be independent of figures; only the relations set out be-
tween the geometrical concepts used in the propositions (respectively
definitions) concerned ought to be taken into account. (Pasch, 1882a,
p. 98; emphasis in original)

34 Pasch mentions Thaer & Lony (1915) as a valuable attempt in this direction.
35 On Eduard Study, see Hartwich (2005); I intend to discuss the interactions between Pasch and
Klein in a subsequent paper.
36 Quotations of this passage can be found, for example, in Nagel (1939, p. 197), Kennedy (1972,
p. 133), Torretti (1978, p. 211), Shapiro (1997, p. 149), Boniface (2004, p. 134), Detlefsen (2005,
p. 251).

This passage has been referred to as the ‘birthplace of modern axiomatics’37 and is
the basis for Tamari’s reference to Pasch as the ‘father of modern axiomatics’ (Tamari,
2007, title), and Freudenthal’s remark that ‘[t]he father of rigor in geometry is Pasch’
(Freudenthal, 1962, p. 619). How quickly Pasch’s conception of mathematical deduction
became widely accepted can be gleaned from the fact that 8 years later Klein mentions
Pasch as espousing an ‘almost generally held view’, according to which in geometric
considerations one has to rely only on axioms without making any use of intuition (Klein,
1890, p. 571). Pasch never tired of emphasizing the importance of
this understanding of deduction, which he also referred to as ‘the genuine mathematical
method’ (Pasch, 1918a, p. 228) and as an ‘imperative’ (‘Gebot’) for mathematical research,
which is completely independent of any position regarding the philosophical foundations
of mathematics one might want to adopt (Pasch, 1917, p. 188). As he repeated over three
decades after his lectures on geometry were published, mathematical proofs must remain
valid if the basic concepts are replaced throughout ‘by any concepts or by meaningless
signs’ (Pasch, 1914, p. 120).38 He refers here to this method as ‘a formalism, that has to
be carried downright to the extremes’ in the development of mathematics (Pasch, 1914,
p. 121; emphasis in original), and concludes emphatically: ‘This formalism is the lifeblood
(‘Lebensnerv’) of mathematics’ (Pasch, 1914, p. 121).39
Replacing meaningful terms by variables, for example, changing ‘There are points’ to
‘There are αs’, is the key to formalization, according to Pasch. He emphasized that geomet-
ric arguments must remain valid even if the geometric terms are replaced by code names
(‘Decknamen’) like ‘P-thing, G-thing, and E-thing’ (Pasch, 1918a, p. 231).40 In general,
for a mathematical proof to be rigorous it must rest only on propositions that allow such
substitutions and whose inferences remain valid under such transformations (Pasch, 1926c,
p. 263). This analysis leads Pasch to distinguish between ‘material words’ (‘Stoffwörter’)
and ‘joins’ (‘Fügemittel’) (Pasch, 1926c, pp. 243 and 263). As Pasch explains, the former
are meaningful terms that denote concepts, which are the material (‘Stoff ’) of the propo-
sition, like ‘two’, ‘points’, ‘segment’, and ‘endpoint’. The joins constitute what is needed
to connect the material words in order to express relations between the denoted concepts,
and they include what are now called the logical parts of expressions.41 They are called
the ‘scaffolding’ (‘Gerüst’) of a stem in Pasch (1924a, p. 11 [1915]), and in Pasch (1894,
p. 21) he explains that in order to carry out deductions one only needs to understand ‘those
parts of language that are common to all domains of thought’ (‘Denkgebieten’). This allows
Pasch to reformulate his understanding of deduction as follows:
The mathematical proof has nothing to do with the meaning of the ma-
terial words; it depends ultimately only on the joins and thus presents a
pure formalism. (Pasch, 1926c, p. 263; emphasis in original)
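
To make the substitution test concrete, here is a minimal sketch in Lean 4, offered as a modern illustration rather than anything found in Pasch. The names `PThing`, `GThing`, and `liesOn` are hypothetical stand-ins for his code names, and the hypothesis `join` is a toy axiom chosen for brevity, not one of his basic propositions. Because the two types and the relation are left entirely uninterpreted, the proof checker can rely only on the stated relation between them.

```lean
-- Hypothetical illustration: PThing, GThing and liesOn are uninterpreted
-- stand-ins for Pasch's code names; nothing about their meaning is available.
section Decknamen

variable {PThing GThing : Type}
variable (liesOn : PThing → GThing → Prop)

-- Assumed toy axiom `join`: any two P-things lie on a common G-thing.
-- Derived claim: every P-thing lies on some G-thing.
theorem every_p_on_some_g
    (join : ∀ a b : PThing, ∃ g : GThing, liesOn a g ∧ liesOn b g)
    (a : PThing) : ∃ g : GThing, liesOn a g :=
  -- Apply the axiom to the pair (a, a) and keep only the first conjunct.
  (join a a).elim fun g h => ⟨g, h.1⟩

end Decknamen
```

Reading `PThing` as points and `GThing` as lines recovers a geometric statement, but the derivation goes through unchanged under any other reading.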

37 See Engel & Dehn (1934, p. 133) and Pickert (1982, p. 271).
38 At this point Pasch does not distinguish terminologically between words and concepts, that is,
between linguistic entities and their meanings. He addresses this distinction, however, in Pasch
(1926c). See also Pasch (1909, p. 1).
39 This remark is echoed in Pasch (1926c, p. 263).
40 For similar considerations, see also Dedekind’s letter to Lipschitz, July 27, 1876 (Dedekind,
1932a, p. 479), and Hilbert’s letter to Frege, December 29, 1899 (Frege, 1980, p. 40).
41 It appears that Pasch’s distinction between material words and joins is intended to distinguish
nonlogical from logical components of expressions in a natural language.

Formalization, as the most reliable touchstone for the validity of proofs, can be dis-
pensed with if one is very careful, but this is very difficult, Pasch warns, as the gap in
Euclid’s first proof illustrates (Pasch, 1926c, p. 140). Nevertheless, Pasch acknowledges
the usefulness of diagrams in mathematical practice, as will be discussed below (p. 107).
Pasch (1914, pp. 121–137) discusses at some length four historical case studies of math-
ematical errors made by Ampère, Cauchy, Dirichlet, and Hasse, which he traces back to
a lack of rigor in the development. Such rigor can be achieved through formalization,
which for Pasch is a powerful technique to ascertain the logical validity of arguments.
As such, Pasch’s notion of formalization does not involve the presentation of mathematical
reasoning in a symbolic language like Peano’s or in a completely formalized language
like Frege’s. Pasch explicitly distanced himself from these approaches and promoted for-
malization only to the extent that it remained compatible with ordinary mathematical
practice.42
Pasch realized that his understanding of deduction requires addressing the question of
what counts as a mathematical proof, and in a letter to Frege from 1894 he expressed his
surprise at finding how rarely this topic had been seriously investigated.43 In a lecture on
the value of mathematical education delivered in the same year, Pasch pointed out that
mathematical proofs serve two main goals. Originally, they were a means for ‘discovering
new properties of figures and numbers’, but later they were also employed for examining
the ‘logical dependencies’ among propositions (Pasch, 1894, pp. 23–24). The second point
is vividly illustrated by Pasch’s own investigations in Pasch (1882a), in particular the
discussions of various equivalent axiomatizations in Section 1. Pasch did not consider the
study of mathematical inferences as a subject matter of mathematical research per se, but
as a matter of independent and general importance properly belonging to the domain of
philosophy (Pasch, 1914, p. 33). As is evident from the correspondence with Frege, Pasch
showed great interest in Frege’s work, but he also remarked that due to his age and the
heavy demands on his time he was not in a position to familiarize himself with Frege’s
notation.44 Nevertheless, Pasch undertook his own investigations of the notion of proof in
order to give an account of the necessary conditions for valid mathematical inferences that
apply to informal arguments as well as to those presented in a formal language. While his
investigations remained only in the fledgling stages, Pasch expressed the hope that they
might lead to a ‘renewal of logic’ and that ‘the indicated path will lead to the main features
of a logic that does justice to the accomplishments of mathematics’ (Pasch, 1918a, p. 232).
The position that Pasch arrived at is that
[i]t is part of the essence of pure deduction that every proof can be
‘atomized’, i.e., resolved into steps of certain kinds, or that it consists
of a single such step. (Pasch, 1917, p. 189)
In his ‘Begriffsbildung und Beweis in der Mathematik’ (1925) Pasch illustrates and dis-
cusses in great detail how the Aristotelian syllogistic form Barbara, that is, the deduction
of ‘All As are Cs’ from the premises ‘All As are Bs’ and ‘All Bs are Cs’, can be atomized
into 16 individual steps. According to Pasch’s analysis, each of these steps in the deduction

42 See Pasch’s letter to Klein, October 19, 1891; held at the Staats- und Universitätsbibliothek
Göttingen, Sig. Klein 11, 184.
43 Pasch’s letter to Frege, February 11, 1894 (Frege, 1980, p. 103).
44 Letter from Pasch to Frege, January 18, 1903 (Frege, 1980, p. 105). The preserved correspondence
with Frege consists of seven letters from Pasch in the period from 1894 to 1906.

is the reformulation of the same content in other words, or the dissection of the original
content while retaining only a part of it, or the combination of the contents of previous
steps, or a definition.45 Pasch concludes that the most basic inferential steps must be such
that one can decide, by a general method in a finite amount of time or of steps, whether
they are valid or not. He also discusses questions that he considers to be undecidable, such
as whether a given proof can be rendered gap free and whether a given formula is derivable
from a set of assumptions. Pasch appeals here to a notion of decidability that he attributes
to Kronecker and which he recognized only during his work on Grundlagen der Analysis
(1909) as being of fundamental importance.46 Pasch (1914, pp. 153–157) distinguishes
between ‘proper mathematics’, which makes use only of decidable notions, and ‘improper
mathematics’, which does not, but which is much more common. To avoid speaking of
‘improper’ mathematics, Pasch later changed the terminology to ‘perfect’ and ‘imperfect’
mathematics (Pasch, 1918a, p. 230).
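
For comparison, the syllogistic form Barbara discussed above can be stated and checked in a modern proof assistant. The following Lean 4 sketch is a modern rendering in predicate logic and makes no attempt to reconstruct Pasch’s sixteen steps; the domain `α` and the predicates `A`, `B`, `C` are arbitrary, and the theorem name is an assumption of this illustration.

```lean
-- Barbara: from "All As are Bs" and "All Bs are Cs" infer "All As are Cs".
-- A, B, C are arbitrary predicates over an arbitrary domain α.
theorem barbara {α : Type} (A B C : α → Prop)
    (hAB : ∀ x, A x → B x) (hBC : ∀ x, B x → C x) :
    ∀ x, A x → C x :=
  -- Take an arbitrary x that is an A; it is a B by hAB, hence a C by hBC.
  fun x hA => hBC x (hAB x hA)
```

Whether this proof term is correct is settled by the checker in finitely many mechanical steps, which is close in spirit to the requirement that the validity of the most basic inferential steps be decidable.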
In his 1894 lecture mentioned above Pasch remarks that neither Euclidean nor non-
Euclidean geometries contradict any facts of experience, but that nevertheless these
systems could still be inconsistent (‘einen inneren Widerspruch enthalten’), ‘because ex-
perience only refers to approximated usability, which is quite compatible with certain
inconsistencies’ (Pasch, 1894, p. 31). He also maintains that explicit and complete proofs
for the consistency of both geometries are still lacking, and suggests that such proofs
could be based on analytic means, which would settle the question at least for those
who consider the consistency of analysis as necessary (Pasch, 1894, pp. 31–32). More
than 20 years later Pasch took up the issue of consistency again in a lecture ‘Über in-
nere Folgerichtigkeit’ (1915), which was published in Pasch (1919). Here he introduces a
classification of inconsistencies, two of which are ‘internal’ to a theory, while the other
two concern applications of theories. An inconsistency of the first level occurs within
a single sentence or between two given sentences. Since they involve only a finite set
of sentences, such inconsistencies are ‘decidable’ (granting the investigator a sufficiently
long life and a big enough memory). An inconsistency of the second level, however, is one
between consequences of a given stem, which are in general infinite in number, and Pasch
notes that we do not have any general process by which we could decide whether such an
inconsistency obtains or not, since this would involve the determination of all consequences
of a set of axioms. As a method for establishing consistency Pasch explains how a given
set of meaningful propositions can be ‘formalized’, resulting in an ‘empty stem’, which
in turn can be ‘realized’ by replacing the meaningless symbols by meaningful concepts,
yielding a ‘filled stem’ (Pasch, 1926a, p. 11 [1915]). If a realization of a formalized stem is
consistent, Pasch argues, then the original stem is also consistent. In modern terminology,
Pasch here describes the notion of relative consistency proofs. To ‘arithmetize’ a stem,
which often depends on a ‘felicitous idea’, amounts to showing its consistency relative
to arithmetic.47 Pasch defends the standpoint that arithmetic is indeed consistent, while
the consistency of any other mathematical discipline must be established by a proof.48

45 See also Pasch (1924a, p. 38), where Pasch refers to his earlier discussions of proofs in Pasch
(1909), Pasch (1912), and Pasch (1914).
46 See Pasch’s discussion of the notion of decidability in Pasch (1914, pp. 153–157), Pasch (1918a),
and Pasch (1927, pp. 88–93).
47 Pasch presents this terminology as if it were his own. References to Weierstrass, Kronecker, and
Klein are conspicuously missing; see Klein (1895) and the discussion in Boniface (2007, p. 332).
48 Kronecker’s views on the natural numbers seem to loom in the background of this discussion, but
Pasch never mentions them explicitly (see also Footnote 9).

To show the consistency of arithmetic one would have to show that an axiomatization of
arithmetic itself is consistent, or, in other words, this should follow from arithmetic itself.49
Pasch argues that this is indeed the case by appealing to the intimate connection between
arithmetic, thought, and language. The source of arithmetic, which finds expression in its
core propositions, Pasch contends, is necessary for thought in general and its constituents
are so extremely primitive that we are not consciously aware of them. We have committed
ourselves to their content when we had experiences and fixed them in language. It follows
that these propositions and their consequences are binding for us, which justifies, for
Pasch, our belief in their consistency (Pasch, 1924a, p. 17 [1915]). It is worth pointing
out that Pasch does not mention any empirical considerations in his discussions of the
consistency of mathematical theories, except in the argument for grounding the consistency
of arithmetic.50
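
In modern terms, the step from a realization to consistency described above can be sketched as follows in Lean 4. This is an illustration under stated assumptions, not Pasch’s own example: the structure `Stem` and its two propositions (irreflexivity and transitivity of an uninterpreted relation `R`) are hypothetical, chosen only because the natural numbers with `<` realize them.

```lean
-- A hypothetical "empty stem": an uninterpreted relation R on an
-- uninterpreted domain D, together with two stem propositions.
structure Stem (D : Type) (R : D → D → Prop) : Prop where
  irrefl : ∀ a, ¬ R a a
  trans  : ∀ a b c, R a b → R b c → R a c

-- A "filled stem": realize D as the natural numbers and R as "less than".
theorem nat_realizes : Stem Nat (fun a b => a < b) :=
  ⟨Nat.lt_irrefl, fun _ _ _ => Nat.lt_trans⟩

-- Relative consistency: the stem cannot be refuted under every
-- interpretation, because the arithmetical realization satisfies it.
theorem stem_consistent :
    ¬ ∀ (D : Type) (R : D → D → Prop), Stem D R → False :=
  fun h => h Nat (fun a b => a < b) nat_realizes
```

The result is only as trustworthy as the arithmetic that Lean itself takes for granted, which mirrors the point that arithmetizing a stem shows its consistency relative to arithmetic.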
The value of the deductive method in mathematics, for Pasch, is that it excludes all arbi-
trariness in proofs and thus renders them unassailable, which, together with the empirical
evidence for the core propositions, is the basis for our ascribing the ‘highest level of reli-
ability’ to mathematics (Pasch, 1882a, p. 100). Notice how, as a thoroughgoing empiricist,
Pasch does not speak of the necessity of mathematics, but only of its reliability.51 For all
practical purposes mathematical knowledge is as good as certain. Although strict adherence
to the deductive method in mathematics might lead to more long-winded expositions, it has
two further advantages for mathematical practice, according to Pasch. Firstly, proofs that
have been carried out without any appeal to the meanings of the nonlogical primitives
occurring in them are reusable, in the sense that replacing the terms in the assumptions in
such a way that they become true statements automatically also yields true statements
for the conclusions if the terms are replaced accordingly (Pasch, 1882a, pp. 98, 100).
In this way new mathematical results can be obtained ‘in a purely mechanical fashion’
without having to repeat the derivations (Pasch, 1914, p. 120). Secondly, a deductive
presentation of a domain can be exploited to determine which concepts and propositions
are necessary or dispensable for the theoretical development of the discipline (Pasch,
1882a, p. 100).52
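
The reusability of meaning-independent proofs can be illustrated with a small Lean 4 sketch. The theorem `asymm_of_stem` and its two assumptions are hypothetical examples, not taken from Pasch; the point is only that a derivation carried out once for an uninterpreted relation `R` yields a true statement for every interpretation that makes the assumptions true.

```lean
-- Proved once, with the domain D and the relation R left uninterpreted.
theorem asymm_of_stem {D : Type} (R : D → D → Prop)
    (irrefl : ∀ a, ¬ R a a)
    (trans : ∀ a b c, R a b → R b c → R a c) :
    ∀ a b, R a b → ¬ R b a :=
  fun a b hab hba => irrefl a (trans a b a hab hba)

-- Reused with R read as "less than" on the natural numbers ...
example : ∀ a b : Nat, a < b → ¬ b < a :=
  asymm_of_stem (D := Nat) (fun a b => a < b)
    Nat.lt_irrefl (fun _ _ _ => Nat.lt_trans)

-- ... and again with R read as "greater than", without repeating the derivation.
example : ∀ a b : Nat, a > b → ¬ b > a :=
  asymm_of_stem (D := Nat) (fun a b => a > b)
    Nat.lt_irrefl (fun _ _ _ h₁ h₂ => Nat.lt_trans h₂ h₁)
```

In each instance only the assumptions have to be checked for the new interpretation; the conclusion then follows in what Pasch calls a purely mechanical fashion.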
As mentioned above, Pasch never suggested that ordinary mathematics should be carried
out in a formal system and he seriously doubted the feasibility of such an undertaking.
Instead, formalization is only a technique, albeit a very powerful one, for ascertaining the
rigor of deductions. Already in the first edition of his Vorlesungen (1882) Pasch notes that
it is admissible and useful to think about the meanings of the geometric terms during a
deduction, but that as soon as this becomes necessary the incompleteness of the deduction

49 References to Hilbert (1900) or Hilbert (1905) are again conspicuously missing from Pasch’s
discussion.
50 See Pasch (1894, p. 17) and Pasch (1909, p. 134), which are referred to in Pasch (1917, p. 185).
It is also noteworthy that Pasch discusses neither Dedekind’s nor Peano’s axiomatizations of
arithmetic (Dedekind 1888; Peano 1889a); indeed, in Pasch (1927, p. 90) he remarks that there is
no generally accepted set of core propositions for arithmetic. It is possible that he did not accept
Dedekind’s notions of system and mapping as being empirically grounded, and that he objected
to Peano’s use of a purely symbolic language.
51 Pasch only rarely speaks of the truth of the core propositions; for example, he refers to them as
‘basic truths’ (‘Grundwahrheiten’) in a talk to a general audience (Pasch, 1894, p. 21). I think
Pasch would agree with Einstein’s famous remark that ‘As far as the laws of mathematics refer to
reality, they are not certain; and as far as they are certain, they do not refer to reality’, quoted from
(Hempel, 1945, p. 17).
52 See also Pasch (1924b, p. 233).

is revealed. He mentions that it is actually a ‘common conception’ that theorems should
follow logically from axioms (Pasch, 1882a, p. 99), but that his remarks on mathematical
rigor are nevertheless not uncalled for, since this demand remains unfulfilled more often
than not, even in publications that deal explicitly with the foundations of a mathematical
discipline.53 As possible reasons for this discrepancy Pasch mentions the frequent use of
diagrams or other perceived or imagined pictures that accompany a deduction. But Pasch
does not at all reject the use of diagrams in mathematical practice. In fact, he acknowledges
their efficiency for representing the relations that are expressed in the assumptions or
constructions of a geometric proof ‘in an intuitive way’ (‘in anschaulicher Form’) (Pasch,
1882a, p. 43). On the one hand, such a representation makes it easy to survey and bring
back to memory the relevant relations, and, on the other hand, it stimulates the ‘inventive
talent’ (‘Erfindungskraft’) and it is thus a fruitful tool for discovering new relations and
constructions (Pasch, 1882a, p. 43). Indeed, for Pasch the ‘creative trains of thought’ by
which mathematicians advance to new knowledge ‘must not and cannot’ be rendered com-
pletely in formal terms (Pasch, 1926c, p. 142). Diagrams are necessary for understanding
the core propositions of geometry, Pasch maintains, since the latter express observations
made on simple figures. Moreover, every inference can be confirmed by a figure, although
the figures themselves do not justify the inferences. How diagrams can be misleading is
illustrated by Pasch’s discussion of the first proof in Euclid’s Elements.54 Pasch briefly
considers the possibility of admitting inferences that are based on figures, but he quickly
dismisses it with the comment that one would hardly succeed in delineating clearly which
inferences would be acceptable and which would need to be justified by recourse to earlier
statements made in the proof (Pasch, 1882a, p. 45). In a similar vein Pasch also notes
that the use of familiar terms in mathematical discourse can be misleading, since they
evoke, often unconsciously, many associations that are not logical consequences of the
axioms (Pasch, 1882a, p. 99). Again, these considerations are regarded as unobjectionable
by Pasch when it comes to the discovery of new geometric truths, where all means can
be applied that lead to the end, but not for the verification and presentation of the results
(Pasch, 1882a, p. 99).55

§5. Pasch’s programme. Given the distinction Pasch makes between the mathemati-
cal and philosophical foundations of a discipline, the general problem arises of connecting
these two. In Pasch’s case, this problem is exacerbated by the fact that he opted for the
philosophical foundations to be grounded empirically, instead of, for example, in a Platonic
realm of mathematical objects. Thus, his overall framework is put under stress by the
tension between his empiricism and his deductivism. Pasch’s programme, intended to ease
this tension, consists in finding adequate ways of combining these two standpoints and in
developing deductive theories from empirical cores.
Since the reader of Pasch’s Vorlesungen (1882) might easily miss the general aim and
structure of his approach, the development can appear unmotivated and needlessly

53 Pasch (1917, p. 188) mentions Du Bois-Reymond (1882) as an example.
54 Pasch’s discussion seems to have contributed much to the popularization of the critique of Euclid’s
proof, so much so that it has even been referred to as the first instance of such a critique, for example,
in Friedman (1985, p. 462).
55 Here Pasch distinguishes clearly between what has later been called the ‘context of discovery’ and
the ‘context of justification’ (Reichenbach, 1938, pp. 6–7); see also his ‘Forschen und Darstellen’
(Pasch, 1919), in particular p. 35. For a contemporary discussion of this matter, see Hersh (1991).

cumbersome. In fact, the stem concepts and propositions of projective geometry,56 which
one would expect to find at the beginning of the book, are only introduced in section
10, after almost 100 pages of long-winded deductions and definitions from the empirical
core concepts and propositions that Pasch starts out with. A reader interested in studying
projective geometry may well wonder what the first 100 pages are all about. It is only in
‘Grundfragen der Geometrie’ (1917) that Pasch explicitly presents his overall conception
of geometry (presented in Section 2, above) and discusses his method of ‘extending the
meanings of concepts’. This is taken up again 5 years later and discussed in detail with
reference to the ‘deep contrast’ (‘tiefen Gegensatz’) between ‘physical geometry’ and
‘mathematical geometry’ (Pasch, 1922, pp. 362–363).
Let us take a look at how Pasch presents these matters in his 1882 lectures on geometry.
Here he describes the relation between mathematical theories and their empirical founda-
tions by stating that ‘[m]athematics sets up relations between the mathematical concepts,
which should correspond to facts from experience’ (Pasch, 1882a, p. 17). This makes it
sound as if all mathematical propositions have direct empirical correlates. However, while
this characterization might well have been an ideal that Pasch had in mind at the time,
it does not square with his own way of developing the axioms of projective geometry
in his lectures. Pasch’s approach is captured more accurately in his later, more nuanced
reflections, in which he only speaks of correlates that have been developed from empirical
propositions.
Once the empiricist has completed the substructure, he can attach to it
the theory that I referred to as mathematical geometry without changing
the wording. He would then, whenever one speaks of points understand
it as ‘mathematical points’, a concept which has been developed from
the physical point in the substructure. (Pasch, 1924a, p. 43).
Thus, Pasch’s strategy for bridging the gap between empirical and mathematical con-
cepts can be characterized as follows: Start with empirical core concepts and proposi-
tions, and develop theorems through definitions and deductions, which can be used as
correlates of the mathematical stem propositions of a particular discipline. This enterprise
may involve two different kinds of moves: (a) the lifting of empirical restrictions, and (b)
the definition of new concepts that extend previous ones. These are illustrated in what
follows.
Consider the statement ‘Between any two points on a line segment there is another
point’. Taken as an empirical statement, it is false. Due to the limits of our perception
and the fact that points must be extended to be observable, two points might just be
so close to each other that there is not enough space to fit another point between them.
Thus, if the statement is to be understood as expressing a core proposition about empirical
points, it must be augmented with the proviso that the points in question be sufficiently
far apart from each other. As a mathematical proposition, however, the above statement can
be accepted without any further restrictions on the relative locations of the points. Thus,
we can obtain mathematical statements from empirical ones by simply dropping certain
additional conditions, which is one way of connecting a mathematical theory with its
empirical substructure.
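The contrast between the two readings can be put schematically as follows. This is only a reading aid: the perceptual threshold $\varepsilon$, the distance function $d$, and the betweenness predicate $\mathrm{Btw}$ are notation assumed here for illustration and are not Pasch's own.

```latex
% Empirical core proposition: the proviso of sufficient separation is kept.
\forall P\,\forall Q\;\bigl(d(P,Q) > \varepsilon \;\rightarrow\; \exists R\;\mathrm{Btw}(P,R,Q)\bigr)

% Mathematical proposition: the same statement with the proviso dropped.
\forall P\,\forall Q\;\bigl(P \neq Q \;\rightarrow\; \exists R\;\mathrm{Btw}(P,R,Q)\bigr)
```

On this rendering the mathematical statement differs from the empirical one only in its antecedent, which is the sense in which a restriction is 'lifted'.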

56 They are 22 in total: 8 for line segments, 4 for the plane, and 10 for congruency; the latter are
common to Euclidean and non-Euclidean geometry.
PASCH ’ S PHILOSOPHY OF MATHEMATICS 109

However, not all mathematical propositions can be obtained by simply removing restrictions
that are necessary for empirical ones. In such cases, Pasch resorts to a technique
he refers to as ‘extending the meanings of concepts’ (‘Begriffserweiterung’) (Pasch, 1882a,
p. 64) which he employs, for example, for points, lines, and planes in Pasch (1882a)
and for numbers in Pasch (1882b). To motivate this technique, Pasch also mentions two
extensions of concepts from the history of mathematics: The notion of number originally
meant only positive rational numbers, but was extended at some point in history to include
also negative numbers, while the notion of power was originally used only for natural
numbers as exponents, but was also extended to include negative and rational numbers in a
similar way (Pasch, 1882a, pp. 40–41). Analogously, Pasch introduces the concept ‘thing’
for concrete objects, but extends it in Pasch (1909, p. 20) to include sequences and in Pasch
(1909, p. 94) to include infinite sets.57
Since the method of extending the meanings of concepts plays a crucial role in developing
deductive mathematical theories from an empirical core, I will present next how Pasch
proceeds to extend the meanings of ‘numbers’ and then of ‘bundle of lines’ and ‘points’.
Pasch extends the meaning of the concept ‘number’ in his Einleitung in die Differential-
und Integralrechnung (1882) to include also irrational numbers on the basis of Dedekind
cuts (Dedekind, 1872). He begins by introducing the word ‘number’ as referring only
to positive whole numbers and their quotients, that is, to positive rational numbers,
and assumes that for these the relations of equality, greater, and less than, as well as the
operations of addition, subtraction, multiplication, division, and exponentiation are known.
After deriving a few basic theorems from these assumptions, Pasch notices that not every
number can be represented as the power of another number (e.g., that there is no number x
in the domain such that x^3 = 25). Following Dedekind, he considers all numbers whose
nth power is less than a given number a as forming a ‘group’ (‘Gruppe’), which Pasch
calls ‘number segment’ or just ‘segment’. Then he notes that for some numbers a and n
there is a least number that does not belong to the corresponding number segment (e.g.,
for a = 25 and n = 2 this least number is 5), but that for others there is no such least
number (e.g., for a = 25 and n = 3). Pasch calls those number segments with a least upper
bound rational and the others irrational, and proceeds to define the relations of equality,
greater, and less than, as well as the operations of addition, subtraction, multiplication,
and division for number segments. On the basis of these definitions he argues that all
expressions that involve these notions for numbers also hold of number segments, both
rational and irrational. With regard to powers, Pasch shows that, unlike in the case of
numbers, every number segment can be represented as the nth power of another segment,
and he shows that the powers remain well defined and obey the familiar laws not only
for every rational number segment, but also for every irrational one. This allows Pasch
to notice that the computations with segments completely subsume the computations with
numbers, but also go beyond them, since they allow for the unrestricted application of
the inverse operation of taking the power. Once the theory of number segments (which is
‘more complete’ (Pasch, 1882b, p. 11), since subject to fewer restrictions than the theory
of numbers) is adopted, the term ‘number’ plays no particular role any more, since it can
be replaced throughout by ‘rational number segment’. This observation motivates Pasch to

57 Another example of the extension of concepts concerns the notion of limit, see Pasch (1918b).
I am grateful to an anonymous reviewer for bringing this to my attention.
110 DIRK SCHLIMM

dispense with the old meaning of ‘number’ and use this term for number segments instead,
so that now we can speak of ‘rational’ and ‘irrational numbers’.58
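The distinction between rational and irrational number segments can be illustrated with a small sketch in exact rational arithmetic. It is a modern illustration, not Pasch's own procedure: the function names `in_segment`, `least_nonmember`, and `integer_root` are hypothetical, and the test for a rational root relies on the standard fact that a fraction in lowest terms has a rational nth root only if its numerator and denominator are both perfect nth powers.

```python
from fractions import Fraction


def integer_root(m: int, n: int):
    """Return the nonnegative integer r with r**n == m, or None if there is none."""
    lo, hi = 0, max(1, m)
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** n
        if power == m:
            return mid
        if power < m:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


def in_segment(x: Fraction, a: Fraction, n: int) -> bool:
    """x belongs to the segment determined by a and n when its nth power is less than a."""
    return x ** n < a


def least_nonmember(a, n: int):
    """Return the least rational outside the segment, or None for an irrational segment.

    Fraction normalizes a to lowest terms, so a rational nth root exists
    exactly when both numerator and denominator are perfect nth powers.
    """
    a = Fraction(a)
    num = integer_root(a.numerator, n)
    den = integer_root(a.denominator, n)
    if num is None or den is None:
        return None
    return Fraction(num, den)
```

With the values used in the text, `least_nonmember(25, 2)` yields 5, so that segment is rational, whereas `least_nonmember(25, 3)` yields `None`: every rational either lies in the segment or has a smaller rational that also lies outside it.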
In sum, Pasch’s extension of the meaning of the term ‘numbers’ proceeds in three steps:
first, it is taken to refer only to positive rational numbers; then, rational and irrational
number segments are introduced, the former of which correspond to numbers; finally, the
term ‘number’ is applied to number segments in general, which also allows one to speak of
‘rational’ and ‘irrational numbers’. Further extensions of the domain of numbers to include
negative numbers, zero, infinity, and imaginary numbers are also mentioned by Pasch, but
he does not present them in detail.
To show the usefulness of the newly introduced concept of number, Pasch discusses
the measurement of straight lines. Empirical measurements, he notes, can only be made
up to a certain limit of accuracy, but mathematics aims at establishing general rules that
are independent of limitations of what can be observed (‘Beobachtungsverhältnisse’). This
can be achieved by admitting also irrational numbers, since then no knowledge of any
particular threshold of accuracy is required (Pasch, 1882b, p. 13).59
It is informative to notice the striking contrast between Pasch’s and Dedekind’s presenta-
tions of the introduction of irrational numbers. While Dedekind uses abstract set-theoretic
terminology and considers the real numbers to correspond to points on a line, Pasch uses
more concrete terminology in his approach and takes the limitations of our empirical
interactions with lines as the starting point, distinguishing between the calculation of the
length of a line and its measurement. Moreover, while Dedekind clearly distinguishes
between a cut and its corresponding number, Pasch redefines the term ‘number’ to refer
to cuts, but, as was not uncommon at the time, he does not distinguish carefully between
the term ‘number’ and the concept of number. Since Pasch obviously would not want to
assert that a number has infinitely many elements, he must restrict his number talk to only
certain properties of cuts, but he completely avoids addressing this issue. Pasch also does
not comment on the problem that the uncountability of the irrational numbers might pose
for his empiricist approach.
The latter difficulty points to a more general issue with Pasch’s programme, namely the
exact specification of the means that he regards as admissible for the development of new
concepts from given ones. Since Pasch’s attitude is not revisionist, he must be open in
principle to accepting the results that are obtained by any method used in mathematics. One
way of showing the compatibility of mathematical practice with his empiricist standpoint
is to find ways of achieving the same results by licensed methods.
In the case of projective geometry, it was accepted practice to introduce ideal points
as ‘points at infinity’ where parallel lines meet.60 Such a definition, however, does not
conform to Pasch’s empiricist standards, because infinity is not an empirical notion, and so
he set out to introduce these objects by other means. In his lectures on projective geometry
Pasch extends the meaning of ‘bundle of lines’ and ‘point’ (Pasch, 1882a, pp. 33–46).
On the basis of the notions of points and lines, the latter of which he defined using the

58 In an unusually opinionated review for the Jahrbuch über die Fortschritte der Mathematik, Pasch’s
redefinition of ‘number’ was severely criticized as circular by Hoppe, who insists that the
concepts of number segment and irrational number should be kept apart (Hoppe, 1882).
59 Pasch always remained sceptical with regard to the applications of irrational numbers outside of
mathematics. After explaining how the square root of 2 arises from considerations regarding the
diagonal of a unit square, he writes in Pasch (1909, p. 99): “It remains open as to whether every
irrational number corresponds to a problem outside of analysis.”
60 See Torretti (1978, p. 111).

core concept of line segment, Pasch initially defines a bundle of lines as the collection of
lines that meet in a common point. He then proves that for lines e, f , and g the relation
‘g belongs to the bundle e f ’ can be defined using the property of being coplanar, but
without making any reference to the point in which e and f meet. Thus, this relation can
also hold between lines e, f , and g, if e and f have no point in common, and Pasch
takes it as the defining characteristic for an extended notion of bundle of lines. Also for
these bundles Pasch shows that they are determined by any two lines that belong to them.
Moreover, if two lines in a bundle meet in a point P, then all lines of the bundle meet
in that same point, so that some bundles can be said to correspond to a unique point,
namely P, and these are the ones that were formerly referred to as bundles and are now
called proper bundles. Other bundles, however, may contain lines that do not intersect,
such that there is no obvious relation between these bundles and particular points, and
they are called improper bundles. Finally, after having defined these notions and proved
some properties about them, Pasch extends the meaning of the term ‘point’ to refer to
bundles of lines instead (just as he extended the meaning of the term ‘number’ to refer
to number segments). Those bundles that correspond to a point in the original sense are
then referred to as proper points, while the others are called improper points. Thus, the
extended meaning of the relation of ‘line l goes through point P’ is that of ‘line l belongs
to bundle of lines P’. The advantage of this change in terminology is that previous axioms
and theorems about points and lines remain valid also under the extended meaning. For
example, ‘For any two points there is a line that goes through both of them’ is also
valid if ‘point’ refers to a bundle of lines. In addition, now also statements that were false
under the original restricted understanding of points become true, if understood as referring
to points in the extended sense, for example, ‘Two lines in a plane always have a point
in common’. Pasch’s improper points had previously been treated as ideal elements in
projective geometry and Torretti describes Pasch’s approach of introducing these elements
only on the basis of ostensive concepts and empirically justifiable axioms as his ‘most
remarkable feat’ (Torretti, 1978, p. 213).
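The idea of a membership relation that makes no reference to a meeting point can be illustrated with a modern coordinate analogue. This is not Pasch's construction, which is synthetic and relies on coplanarity in space; the sketch below is a swapped-in technique restricted to a single plane, where a line ax + by + c = 0 is represented by the integer triple (a, b, c). The names `belongs_to_bundle` and `common_point` are hypothetical, and the two lines e and f are assumed to be distinct.

```python
from fractions import Fraction


def det3(e, f, g) -> int:
    """Determinant of the 3x3 matrix whose rows are the line triples e, f, g."""
    return (e[0] * (f[1] * g[2] - f[2] * g[1])
            - e[1] * (f[0] * g[2] - f[2] * g[0])
            + e[2] * (f[0] * g[1] - f[1] * g[0]))


def belongs_to_bundle(e, f, g) -> bool:
    """g belongs to the bundle of the distinct lines e and f.

    The criterion never mentions an intersection point, so it also
    applies when e and f are parallel.
    """
    return det3(e, f, g) == 0


def common_point(e, f):
    """Return the point shared by e and f, or None if they do not meet."""
    x = e[1] * f[2] - e[2] * f[1]
    y = e[2] * f[0] - e[0] * f[2]
    w = e[0] * f[1] - e[1] * f[0]
    if w == 0:
        return None  # no common point: the bundle of e and f is improper
    return (Fraction(x, w), Fraction(y, w))
```

For the lines x = 2 and y = 3, `common_point((1, 0, -2), (0, 1, -3))` returns the point (2, 3), so their bundle is proper. For the parallel lines x = 0 and x = 1 it returns `None`, yet `belongs_to_bundle` still holds for the line x = 5, which places it in the same improper bundle.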
Nowadays we would describe Pasch’s method of extending the meaning of concepts
by saying that the term ‘point’ is given two different interpretations: while it originally
referred to points, it is later taken to refer to bundles. But Pasch does not yet possess
a conceptually clear distinction between syntax and semantics, and it does not seem to
come naturally for him to speak of different interpretations of a given term, in particular,
since ‘point’ is a meaningful term, unlike, say, a mathematical variable. Thus, he speaks of
substituting one concept for another in a proposition, changing the meanings of concepts,
or replacing concepts by meaningless signs.61
In geometry, the notion of continuity also goes beyond what can be developed on an
empirical basis and it indicates the conceptual gap between empirical and mathematical
geometry. Pasch carries through the development of projective geometry to allow for the in-
troduction of coordinates through the construction of rational nets. Thus, these coordinates
remain limited to rational values. Nevertheless, he notices that ultimately only an analytic
treatment of geometry in terms of real coordinates yields the customary notions of points
and lines, which Pasch qualifies with the adjective ‘mathematical’. He briefly considers
the possibility of adding an axiom of continuity, but dismisses it on the grounds that it
would not be empirically justified and opts for a version of the Archimedean axiom instead

61 See also Pasch’s notions of ‘formalization’ and ‘realization’ of a stem, discussed in Section 4,
above.

(Pasch, 1882a, p. 126).62 However, given that all his empirically based constructions are
subject to limitations, mathematical points allow for more fine-grained distinctions than
empirical points do; in other words, every empirical point corresponds to an entire sequence
of mathematical points. This phenomenon is referred to as the ‘inexactness of geometric
concepts’ and Pasch emphasizes that ‘the transfer of a diagram into numbers and the
return from the results of a calculation to the diagram cannot be carried out with the same
degree of exactness’ (Pasch, 1882a, p. 200).63 Nonetheless, Pasch remarks that the
mathematical points (if appropriately defined) also satisfy the stem propositions of projective
geometry. Given that he considers real numbers themselves to be grounded in empirical
core concepts, this does not seem to pose a serious problem for his philosophy, but it
confirms his assertion that geometry presupposes arithmetic (Pasch, 1922, p. 5).
Pasch commented that one of the goals he pursued in his lectures on geometry was
to show that a reduction of parts of geometry to empirical notions was possible in prin-
ciple (Pasch, 1887a, p. 130),64 and it appears that many of his contemporaries accepted
this reduction. For example, in the article on geometry by Weber and Wellstein in the
Encyclopedia of Elementary Mathematics (1905), Pasch’s book is discussed in the first
chapter on the fundamental notions of geometry (Weber & Wellstein, 1905, pp. 25–27).
After formulating a critique of the idealization processes that are intended to lead from
the empirical raw material of geometry to its abstract objects, the authors ask whether it
is possible to build up an intuitive geometry, which they call ‘natural geometry’, without
resorting to these idealizations, and they note that an affirmative answer to this question is
presented in Pasch’s lectures on projective geometry, ‘this beautiful book that everybody
must have read, who is more interested in intensive rather than extensive knowledge of
geometry’ (Weber & Wellstein, 1905, p. 25).

§6. Concluding remarks. We have seen how Moritz Pasch formulated the corner-
stones of his philosophy of mathematics in his two books of 1882 and continued to develop
and refine his views in numerous, more and more philosophical publications throughout
his life. Pasch’s philosophy is unique in combining a strong empiricism, according to
which the meanings of mathematical terms should be based on observable physical entities,
with a deductivist view, according to which the validity of mathematical inferences does
not depend on the meanings of the terms. These seemingly incompatible views are brought
together in Pasch’s conception of different layers of philosophical and mathematical inves-
tigations, and ‘Pasch’s programme’, which aims at building up correlates of mathematical
axioms from an empirical basis. Since Pasch’s philosophical ideas originally appeared only
as interspersed remarks in his mathematical textbooks and were elaborated in more detail
only in his later articles, it has been difficult to grasp and appreciate his philosophy of mathematics
as a whole. This might be part of the reason for the general lack of awareness of his
ideas, which Pasch himself noticed (Pasch, 1926b, p. 166). Another reason might be that

62 See Ehrlich (2006, p. 6) and Greenberg (1993, p. 125), who writes that ‘[t]he full significance of
Archimedes’ axiom was first grasped in the 1880s by M. Pasch and O. Stolz’.
63 Pasch reminds the reader in Pasch (1882b, pp. 13 and 39) that every number that is used in practice
or that arises from observations or measurements can only have a limited degree of exactness, and
he adds in a remark on p. 188 that the inexactness of geometric concepts had been discussed by
Klein already in 1873 (Klein, 1883). This is repeated on other occasions, for example, Pasch
(1887a, p. 130) and Pasch (1912, p. 203).
64 See also Pasch (1912, p. 203).

his methodological considerations regarding the nature of mathematical deduction were
quickly absorbed by his contemporaries like Peano and Hilbert, whose own contributions
soon overshadowed those of Pasch.
In the present overview I was able to deal with a number of issues in Pasch’s philosophy
only cursorily, but they nevertheless deserve further and more detailed exploration. Among
these are: the origins, principles, and limits of Pasch’s empiricism together with a com-
parison with the British empiricists and Pasch’s contemporaries; Pasch’s ideas about the
most basic constituents of mathematical reasoning and thinking in general; Pasch’s notion
of intuition, in particular in comparison with Klein and Study; the role of definitions, in
particular of implicit definitions, in Pasch’s works; Pasch’s analysis of logical inference,
in particular in comparison with Frege and Hilbert; Kronecker’s notion of ‘decidability’
that became more and more important for Pasch and his distinction between ‘proper’
and ‘improper’ mathematics; finally, the reception and influence of Pasch’s work in the
nineteenth and twentieth centuries. I hope to have shown in the present paper that Pasch’s
highly original philosophy of mathematics is worthy of further study and of becoming
generally known, both in its own right and as a part of the mathematical and philosophical
context in which modern mathematics emerged.

§7. Acknowledgments. I would like to thank, first and foremost, Michael Hallett for
helpful comments on previous versions of this paper. In addition, I am also grateful for
remarks and comments by an anonymous reviewer of this journal, Greg Frost-Arnold,
Jeremy Heis, Paul Rusnock, as well as audience members at the Annual meeting of the Association
for Symbolic Logic (Irvine, CA), the PhiMSAMP-3 conference ‘Is mathematics
special?’ (Vienna, Austria), the HOPOS meeting 2008 (Vancouver, BC), the Winter 2008
meeting of the Canadian Mathematical Society (Ottawa, ON), and the Winter meeting of
the Association for Symbolic Logic (Philadelphia, PA). Last, but not least, I would like to
thank Dr. Rudolf Thaer for generously sharing his knowledge about Pasch with me. Work
on this paper was funded by the Social Sciences and Humanities Research Council of Canada
(SSHRC). Translations are by the author, unless noted.

BIBLIOGRAPHY
Boniface, J. (2004). Hilbert et la notion d’existence en mathématiques. Mathesis. Paris,
France: J. Vrin.
Boniface, J. (2007). The concept of number from Gauss to Kronecker. In Goldstein, C.,
Schappacher, N., and Schwermer, J., editors. The Shaping of Arithmetic After C. F.
Gauss’s Disquisitiones Arithmeticae. Berlin: Springer, pp. 315–342.
Contro, W. S. (1976). Von Pasch zu Hilbert. Archive for History of Exact Sciences, 15(3),
283–295.
Dedekind, R. (1872). Stetigkeit und irrationale Zahlen. Braunschweig, Germany: Vieweg.
Reprinted in Dedekind (1932a), pp. 315–334. English translation Continuity and
Irrational Numbers in Ewald (1996), pp. 765–779.
Dedekind, R. (1888). Was sind und was sollen die Zahlen? Braunschweig, Germany:
Vieweg. Reprinted in Dedekind (1932a), pp. 335–391. English translation in Ewald
(1996), pp. 787–833.
Dedekind, R. (1932a). Gesammelte mathematische Werke, Vol. 3. Braunschweig,
Germany: F. Vieweg & Sohn. Edited by Robert Fricke, Emmy Noether, and Öystein
Ore.

Dehn, M. (1928). Moritz Pasch. Zum fünfundachtzigsten Geburtstag am 8. 11. 1928. Die
Naturwissenschaften, 16(44), 813–815.
Detlefsen, M. (2005). Formalism. In Shapiro, S., editor. Oxford Handbook of Philosophy
of Mathematics and Logic. Oxford: Oxford University Press, pp. 236–317.
Dingler, H. (1915). Das Prinzip der logischen Unabhängigkeit in der Mathematik, zugleich
als Einführung in die Axiomatik. München, Germany: Theodor Ackermann.
DiSalle, R. (1993). Helmholtz’s empiricist philosophy of mathematics. In Cahan, D.,
editor. Hermann von Helmholtz and the Foundations of 19th Century Science. Los
Angeles, CA: University of California Press, pp. 498–521.
Du Bois-Reymond, P. (1882). Die allgemeine Funktionentheorie. Erster Teil. Metaphysik
und Theorie der mathematischen Grundbegriffe: Größe, Grenze, Argument, und
Funktion. Tübingen, Germany: H. Laupp. There is only this part.
Ehrlich, P. (2006). The rise of non-Archimedean mathematics and the roots of a
misconception I: The emergence of non-Archimedean systems of magnitudes. Archive
for History of Exact Sciences, 60, 1–121.
Engel, F., & Dehn, M. (1931). Moritz Pasch. Zwei Gedenkreden, gehalten am 24. Januar
1931. Giessen, Germany: Töpelmann.
Engel, F., & Dehn, M. (1934). Moritz Pasch. Jahresbericht der Deutschen Mathematiker
Vereinigung, 44(5/8), 120–142. Reprint, with changes and additions, of Engel & Dehn
(1931).
Ewald, W. (1996). From Kant to Hilbert: A Source Book in Mathematics. Oxford, UK:
Clarendon Press. Two volumes.
Frege, G. (1980). Philosophical and Mathematical Correspondence. University of Chicago
Press. Edited by Gottfried Gabriel, Hans Hermes, Friedrich Kambartel, Christian Thiel,
and Albert Veraart.
Freudenthal, H. (1962). The main trends in the foundations of geometry in the 19th century.
In Nagel, E., Suppes, P., and Tarski, A., editors. International Congress for Logic,
Methodology and Philosophy of Science (1960 : Stanford, Calif.), Logic, Methodology,
and Philosophy of Science. Stanford, CA: Stanford University Press, pp. 613–621.
Friedman, M. (1985). Kant’s theory of geometry. The Philosophical Review, 94(4),
455–506.
Gabriel, G. (1978). Implizite Definitionen — Eine Verwechslungsgeschichte. Annals of
Science, 35, 419–423.
Gandon, S. (2005). Pasch entre Klein et Peano: empirisme et idéalité en géométrie.
Dialogue, 44(4), 653–692.
Gandon, S. (2006). La réception des Vorlesungen über neuere Geometrie de Pasch par
Peano. Revue d’histoire des mathématiques, 12(2), 249–290.
Gray, J. (2007). Worlds Out of Nothing. A Course in the History of Geometry in the 19th
Century. London: Springer.
Greenberg, M. J. (1993). Euclidean and non-Euclidean Geometries: Development and
History (third edition). New York, NY: W.H. Freeman.
Hallett, M., & Majer, U., editors (2004). David Hilbert’s Lectures on the Foundations of
Geometry 1891–1902. Berlin, Germany: Springer.
Harré, R. (2003). Positivist thought in the nineteenth century. In Baldwin, T., editor. The
Cambridge History of Philosophy 1870–1945. Cambridge: Cambridge University Press,
pp. 11–26.
Hartwich, Y. (2005). Eduard Study (1862–1930)—ein mathematischer Mephistopheles im
geometrischen Gärtchen. PhD Thesis, Department of Mathematics, Johannes Gutenberg
Universität, Mainz.

Hempel, C. G. (1945). Geometry and empirical science. American Mathematical
Monthly, 52(1), 7–17.
Hersh, R. (1991). Mathematics has a front and a back. Synthese, 88(2), 127–133.
Hilbert, D. (1899). Grundlagen der Geometrie. Leipzig, Germany: Teubner. Reprinted in
Hallett & Majer (2004), pp. 436–525. English translation The Foundations of Geometry,
Open Court, Chicago, 1902.
Hilbert, D. (1900). Mathematische Probleme. Vortrag gehalten auf dem Internationalen
Mathematiker-Kongreß zu Paris 1900. Nachrichten der Königlichen Gesellschaft der
Wissenschaften zu Göttingen, 253–297. English translation ‘Mathematical Problems’, in
Ewald (1996), pp. 1096–1105.
Hilbert, D. (1905). Über die Grundlagen der Logik und der Arithmetik. In Verhandlungen
des Dritten Internationalen Mathematiker-Kongresses. Leipzig, Germany: Teubner, pp.
174–185. English translation ‘On the Foundations of Logic and Arithmetic’, in van
Heijenoort (1967), pp. 129–138.
Hilbert, D. (1922). Neubegründung der Mathematik. Abhandlungen aus dem Seminar der
Hamburgischen Universität, 1, 157–177. English translation ‘The New Grounding of
Mathematics. First Report’, in Ewald (1996), pp. 1115–1134.
Hilbert, D. (1923). Die logischen Grundlagen der Mathematik. Mathematische
Annalen, 88, 151–165. English translation ‘The Logical Foundations of Mathematics’,
in Ewald (1996), pp. 1134–1148.
Hoppe, R. (1882). Review of Einleitung in die Differential- und Integralrechnung.
Jahrbuch über die Fortschritte der Mathematik, 14, 197.2.
Jesseph, D. M. (1993). Berkeley’s Philosophy of Mathematics. Chicago, IL: University of
Chicago Press.
Kennedy, H. C. (1972). The origins of modern axiomatics: Pasch to Peano. American
Mathematical Monthly, 79, 133–136.
Kennedy, H. C., editor. (1973). Selected Works of Giuseppe Peano. Toronto, Canada:
University of Toronto Press.
Klein, F. (1871). Ueber die sogenannte Nicht-Euklidische Geometrie. Mathematische
Annalen, 4(4), 573–625. Reprinted in Klein (1921), ch. 16, pp. 254–305.
Klein, F. (1873a). Ueber den allgemeinen Functionsbegriff und dessen Darstellung durch
eine willkürliche Curve. Sitzungsberichte der physikalisch-medicinischen Societät zu
Erlangen. December 8. Reprinted in Klein (1883) and in Klein (1922), ch. 45, pp.
214–224.
Klein, F. (1873b). Ueber die sogenannte Nicht-Euklidische Geometrie (Zweiter Aufsatz).
Mathematische Annalen, 6, 112–145. Reprinted in Klein (1921), ch. 18, pp. 311–343.
Klein, F. (1883). Ueber den allgemeinen Functionsbegriff und dessen Darstellung durch
eine willkürliche Curve. Mathematische Annalen, 22(2), 249–259. Reprint of Klein
(1873a). Reprinted in Klein (1922), ch. 45, pp. 214–224.
Klein, F. (1890). Zur Nicht-Euklidischen Geometrie. Mathematische Annalen, 37(4), 544–
572. Reprinted in Klein (1921), ch. 21, pp. 353–383. Discussion of the lectures held in
the Winter Semester 1889/90, published as Klein (1892).
Klein, F. (1892). Nicht-Euklidische Geometrie. Göttingen, Germany. Lithographed notes
(autograph) worked out by Friedrich Schilling of lectures held in Winter Semester
1889/90 in Göttingen.
Klein, F. (1895). Über Arithmetisierung der Mathematik. Nachrichten der Königlichen
Gesellschaft der Wissenschaften zu Göttingen (2). Reprinted in Klein (1922), ch. 42,
pp. 232–240.

Klein, F. (1921). Gesammelte mathematische Abhandlungen, Vol. 1. Berlin: Springer.
Edited by R. Fricke and A. Ostrowski, with additions by F. Klein.
Klein, F. (1922). Gesammelte mathematische Abhandlungen, Vol. 2. Berlin: Springer.
Edited by R. Fricke and H. Vermeil, with additions by F. Klein.
Kline, M. (1972). Mathematical Thought from Ancient to Modern Times. New York, NY:
Oxford University Press.
Majer, U. (2004). Husserl and Hilbert on geometry. In Feist, R., editor. Husserl and the
Sciences: Selected Perspectives. Ottawa, Canada: University of Ottawa Press, pp. 101–
120.
Marchisotto, E., & Smith, J. T. (2007). The Legacy of Mario Pieri in Geometry and
Arithmetic. Boston, MA: Birkhäuser.
Mill, J. S. (1851). A System of Logic, Ratiocinative and Inductive: Being a Connected View
of the Principles of Evidence, and the Methods of Scientific Investigation (third edition).
London: J.W. Parker.
Nagel, E. (1939). The formation of modern conceptions of formal logic in the development
of geometry. Osiris, 7, 142–223.
Nagel, E. (1961). The Structure of Science. Problems in the Logic of Scientific Explanation.
New York, NY: Harcourt, Brace and World.
Pasch, M. (1882a). Vorlesungen über Neuere Geometrie. Leipzig, Germany: B.G. Teubner.
Pasch, M. (1882b). Einleitung in die Differential- und Integralrechnung. Leipzig,
Germany: B.G. Teubner.
Pasch, M. (1887a). Ueber die projective Geometrie und die analytische Darstellung der
geometrischen Gebilde. Mathematische Annalen, 30, 127–131.
Pasch, M. (1887b). Ueber einige Punkte der Functionentheorie. Mathematische
Annalen, 30, 132–154.
Pasch, M. (1894). Ueber den Bildungswert der Mathematik. Giessen, Germany: Grossh.
Hof- und Universitäts-Druckerei von Münchow.
Pasch, M. (1909). Grundlagen der Analysis. Leipzig, Germany: B.G. Teubner.
Pasch, M. (1912). Vorlesungen über Neuere Geometrie (second edition, with additions).
Leipzig, Germany: B.G. Teubner.
Pasch, M. (1914). Veränderliche und Funktion. Leipzig, Germany: B.G. Teubner.
Pasch, M. (1916). Hugo Dingler. Das Prinzip der logischen Unabhängigkeit in der
Mathematik (1915). Archiv der Mathematik und Physik, 24, 275–277.
Pasch, M. (1917). Grundfragen der Geometrie. Journal für die reine und angewandte
Mathematik, 147, 184–190.
Pasch, M. (1918a). Die Forderung der Entscheidbarkeit. Jahresbericht der Deutschen
Mathematiker Vereinigung, 27, 228–232.
Pasch, M. (1918b). Über die Erweiterung des Grenzbegriffs. Jahresbericht der Deutschen
Mathematiker Vereinigung, 27, 232–234.
Pasch, M. (1919). Mathematik und Logik. Vier Abhandlungen. Leipzig, Germany:
W. Engelmann. Contains: I. Über innere Folgerichtigkeit (1915). II. Über den
Bildungswert der Mathematik (1894). III. Forschen und Darstellen (Mai 1916). IV. Der
Aufbau der Geometrie (November 1917).
Pasch, M. (1920). Die Begründung der Mathematik und die implizite Definition. Ein
Zusammenhang mit der Lehre vom Als-Ob. Annalen der Philosophie, 2(2), 144–162.
Pasch, M. (1922). Physikalische und mathematische Geometrie. Annalen der Philoso-
phie, 3(3), 362–374.
Pasch, M. (1924a). Mathematik und Logik. Vier Abhandlungen (second edition). Leipzig,
Germany: Wilh. Engelmann.

Pasch, M. (1924b). Betrachtungen zur Begründung der Mathematik. Mathematische
Zeitschrift, 20, 231–240.
Pasch, M. (1925). Begriffsbildung und Beweis in der Mathematik. Annalen der
Philosophie, 4, 348–367, 417–426. Reprinted in Pasch (1927).
Pasch, M. (1926a). Vorlesungen über Neuere Geometrie (second, revised edition). Berlin:
Springer. With an appendix Die Grundlegung der Geometrie in historischer Entwicklung
by Max Dehn.
Pasch, M. (1926b). Betrachtungen zur Begründung der Mathematik (Zweite Abhandlung).
Mathematische Zeitschrift, 25, 166–171.
Pasch, M. (1926c). Die axiomatische Methode in der neueren Mathematik. Annalen der
Philosophie, 5, 241–274. Reprinted in Pasch (1927).
Pasch, M. (1927). Mathematik am Ursprung. Gesammelte Abhandlungen über
Grundfragen der Mathematik. Leipzig, Germany: Felix Meiner.
Pasch, M. (1930a). Der Ursprung des Zahlbegriffs. Berlin: Springer.
Pasch, M. (1930b). Eine Selbstschilderung. Gießen, Germany: von Münchowsche
Universitäts-Druckerei. Reprinted in Pasch (1980).
Pasch, M. (1980). Eine Selbstschilderung. Mitteilungen des mathematischen Seminars
Giessen, 146, 1–19.
Peano, G. (1889a). Arithmetices Principia, nova methodo exposita. Torino, Italy: Bocca.
English translation The Principles of Arithmetic, Presented by a New Method in Kennedy
(1973), pp. 101–134. Partial translation in van Heijenoort (1967), pp. 81–97.
Peano, G. (1889b). I principi di geometria logicamente esposti. Torino, Italy: Fratelli
Bocca. Reprinted in Opere scelte, Vol. 2., Roma, Italy: Edizioni Cremonese, pp. 56–91.
Pickert, G. (1980). Habilitation und Vorlesungstätigkeit von Moritz Pasch. Mitteilungen
des mathematischen Seminars Giessen, 146, 46–57.
Pickert, G. (1982). Moritz Pasch, 1843–1930, Mathematiker. In Gundel, H. G., Moraw,
P., and Press, V., editors. Gießener Gelehrte in der ersten Hälfte des 20. Jahrhunderts,
Volume 35 of Lebensbilder aus Hessen, pp. 704–713. Veröffentlichungen der
Historischen Kommission für Hessen. Reprinted in Tamari (2007, pp. 262–272).
Pressman, H. M. (1997). Hume on geometry and infinite divisibility in the Treatise. Hume
Studies, 23(2), 227–244.
Reichenbach, H. (1938). Experience and Prediction. An Analysis of the Foundations and
the Structure of Knowledge. Chicago, IL: University of Chicago Press.
Reichenbach, H. (1957). The Philosophy of Space and Time. New York, NY: Dover
Publications Inc. English translation of Die Philosophie der Raum-Zeit Lehre, De
Gruyter, Berlin, 1928.
Riemann, B. (1854). Über die Hypothesen, welche der Geometrie zu Grunde liegen.
Abhandlungen der Königlichen Gesellschaft der Wissenschaften zu Göttingen, 13
(1868). Reprinted in Riemann (1876), pp. 254–269. English translation in Ewald (1996),
Vol. 2.
Riemann, B. (1876). Gesammelte mathematische Werke und wissenschaftlicher Nachlass.
Leipzig, Germany: Teubner. Edited by Heinrich Weber, with the assistance of Richard
Dedekind. Second edition 1892, with different pagination, reprinted by Dover, New
York, 1953.
Rosanes, J. (1871). Ueber die neuesten Untersuchungen in Betreff unserer Anschauung
vom Raume. Breslau: Maruschke & Berendt. Habilitation lecture held on April 30, 1870,
in Breslau.
Russell, B. (1903). Principles of Mathematics. Cambridge: Cambridge University Press.
Second edition 1938.
Shapiro, S. (1997). Philosophy of Mathematics. Structure and Ontology. Oxford, UK:
Oxford University Press.
Tamari, D. (2007). Moritz Pasch (1843–1930). Vater der modernen Axiomatik. Seine Zeit
mit Klein und Hilbert und seine Nachwelt. Eine Richtigstellung. Aachen, Germany:
Shaker Verlag.
Thaer, A., & Lony, G. (1915). Lehrbuch der Mathematik. Breslau: Ferdinand Hirt. Ausgabe
B, Für Oberrealschulen, Realgymnasien und verwandte Anstalten. Bd. 1, Unterstufe.
Torretti, R. (1978). Philosophy of Geometry from Riemann to Poincaré. D. Reidel,
Dordrecht, Holland.
van Heijenoort, J. (1967). From Frege to Gödel: A Sourcebook of Mathematical Logic.
Cambridge, MA: Harvard University Press.
Veronese, G. (1894). Grundzüge der Geometrie von mehreren Dimensionen und mehreren
Arten gradliniger Einheiten in elementarer Form entwickelt. Leipzig, Germany:
B. G. Teubner. Translated by Adolf Schepp.
Volkert, K. T. (1986). Die Krise der Anschauung. Göttingen, Germany: Vandenhoeck &
Ruprecht.
von Helmholtz, H. (1866). Über die thatsächlichen Grundlagen der Geometrie. Verhandlungen
des historisch-medicinischen Vereins zu Heidelberg, 4, 197–202. Appendix
printed 5, 31–32, 1869. Reprinted in Wissenschaftliche Abhandlungen, Vol. 2, pp. 618–639,
J. A. Barth, Leipzig, 1883.
von Helmholtz, H. (1876). Über den Ursprung und die Bedeutung der geometrischen
Axiome. In Populäre wissenschaftliche Vorträge, Vol. 3. Braunschweig, Germany:
Vieweg. First published 1870. Reprinted in Vorträge und Reden (fifth edition), Vol. 2,
pp. 1–31. English translation ‘On the origin and significance of the axioms of geometry’,
in von Helmholtz et al. (1977), pp. 1–38.
von Helmholtz, H., Hertz, P., Schlick, M., Lowe, M. F., Cohen, R. S., and Elkana, Y.
(1977). Epistemological Writings: The Paul Hertz/Moritz Schlick Centenary Edition of
1921 With Notes and Commentary by the Editors. Dordrecht, Holland: D. Reidel Pub.
Co.
Weber, H., & Wellstein, J. (1905). Encyklopädie der Elementar-Mathematik. Ein
Handbuch für Lehrer und Studierende, Vol. 2: Elemente der Geometrie. Leipzig,
Germany: B.G. Teubner.

DEPARTMENT OF PHILOSOPHY
McGILL UNIVERSITY
855 SHERBROOKE ST. W.
MONTREAL, QC H3A 2T7, CANADA
E-mail: dirk.schlimm@mcgill.ca
THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010

TWO (OR THREE) NOTIONS OF FINITISM


MIHAI GANEA
Department of Philosophy, Boston University

Abstract. Finitism is given an interpretation based on two ideas about strings (sequences
of symbols): a replacement principle extracted from Hilbert’s work and a counting principle
inspired by Tait. These principles are used to justify an equational arithmetic 2 based on the

algebra of lower elementary functions. The extension of this algebra to Grzegorczyk’s class
ℇ² can be justified by means of an additional finitistic choice principle, thus obtaining a second
equational theory 2 . It is unknown whether 2 is strictly stronger than 2 since ℇ2 may coincide
ℇ ℇ ℒ
with the class of lower elementary functions.
If the objects of arithmetic are taken to be binary numerals instead of tally numerals, then it
becomes possible to provide a finitistic justification for a theory B that may be incomparable to 2

(neither of the two includes the other). I conclude by suggesting that the equational theory of Kalmar
elementary functions is a strict upper bound for finitistic arithmetic.

§1. Introduction: Hilbert’s Cogito. The direct descendant of David Hilbert’s work in
the foundations of mathematics is reductive proof theory, in which typically a mathematical
theory $T_1$ is shown to be $\Gamma$-conservative over a weaker theory $T_2$ ($\Gamma$ a set of sentences in the
language of $T_2$). If $T_2$ has certain epistemic advantages over $T_1$ (it is finitistic, constructive,
or predicative, whereas $T_1$ is not), then such a conservation theorem may be regarded as
securing a foundation for $T_1$ (it helps if the result is proved in $T_2$).
The received view is that Hilbert aimed to prove that ideal mathematics (classical anal-
ysis and set theory) is finitistically conservative over what he called real or ‘contentual’
[inhaltlich] arithmetic and this result was to be established by a finitistic consistency proof
for the axiomatic theories formalizing ideal mathematics.1 The required consistency proof
would have to be prefaced by a philosophical analysis of the notion of finitistic or ‘con-
tentual’ arithmetic explaining its special epistemic status and delineating with sufficient
precision the finitistic means of proof. But initially the expectation that an obviously fini-
tistic consistency proof exists made the philosophical task seem less pressing and then
Gödel’s work made it seem pointless. For it is generally accepted that the incompleteness
theorems show that Hilbert’s original goal cannot be achieved even in the absence of a
precise analysis of finitism, provided one accepts the reasonable view that finitistic means
of proof can be reproduced in the axiomatic theories of classical mathematics.2

Received: June 27, 2009


1 See Hilbert (1925, p. 383), Smorynski (1977, p. 824), and Girard (1987, pp. 35–36). This standard
representation of Hilbert’s strategic goal is criticized as anachronistic in the last section of Zach
(2004).
2 There exists at least one extensive attempt to defend (a version of) Hilbert’s program against the
impact of the incompleteness theorems, but its success is doubtful (see Detlefsen (1986) and also
Ignjatovic (1994) for a strong criticism).


© Association for Symbolic Logic, 2010
doi:10.1017/S1755020309990323

There still are good reasons to seek a precise analysis of finitism, for example, in order
to obtain partial realizations of Hilbert’s program (see Simpson (1988)). Instead of aiming
for a definitive solution to the question of the foundations of ideal mathematics, one could
describe it in terms of the kinds of evidence available for its theorems. Determining the
nature and extent of finitistic arithmetic is a first step in classifying these kinds of evidence.
Hilbert and his school did not offer an exact definition of finitistic arithmetic, and were
fully aware of the lack of precision of their position.3 On the one hand it is clearly required
that the objects of a finitistic theory are finite (given in intuition), such as finite sequences
of symbols. On the other hand it is unclear what principles regarding these objects can be
considered finitistic. Within the Hilbert school there was considerable indecision over this
point. They seemed to accept, without a clear justification, that the principles of transfinite
induction used in the consistency proofs by Ackermann and Gentzen were finitistic.4 The
main criterion operating in the selection of these principles was supposed to be security,
but unpacking its exact meaning is not straightforward. Hilbert suggested that they ought
to be indubitable or indispensable to scientific thought:
[. . . ] as a precondition for the application of logical inferences and for
the activation of logical operations, something must already be given
in representation [in der Vorstellung]: certain extra-logical discrete ob-
jects, which exist intuitively as immediate experience before all thought.
If logical inference is to be certain, then these objects must be capable of
being completely surveyed in all their parts, and their presentation, their
difference, their succession (like the objects themselves) must exist for
us immediately, intuitively, as something that cannot be reduced to some-
thing else. Because I take this standpoint the objects of number theory are
for me [. . . ] the signs themselves [. . . ]. The solid philosophical attitude
that I think is required for the grounding of pure mathematics—as well
as for all scientific thought, understanding, and communication—is this:
In the beginning was the sign. (Hilbert, 1922, p. 202).
This well-known passage, which recurs in several of his writings, identifies the objects
of finitistic thought as sequences of symbols (the forms or types of messages in an intersub-
jective medium presupposed by any communication) and suggests a foundational strategy
that derives the basic principles of arithmetic from the preconditions or norms of successful
communication.
We might call this idea Hilbert’s Cogito: I express mathematics (in accordance with the
rules of some language), therefore I know (some) arithmetic.5 The nature and extent of this

3 Here is a significant excerpt from Hilbert & Bernays (1939): “. . . we have not introduced the
expression ‘finit’ as a sharply delimited term, but only as the name of a methodological guideline,
which, to be sure, enables us to recognize certain kinds of concept-formations and ways of
reasoning as definitely finitistic and others as definitely not finitistic. This guideline, however,
does not provide us with a precise demarcation between those [concept-formations and ways of
reasoning] which accord with the requirements of the finitistic method and those that do not.”
(translation in Zach (1998), footnote 16)
4 See the discussion in Zach (2003), especially section 3.3, for evidence on this point.
5 The affinity between Hilbert’s position and the Cartesian Cogito is emphasized in Tait (1981),
p. 525: “. . . the special role of finitism consists in the circumstance that it is a minimal kind
of reasoning presupposed by all nontrivial mathematical reasoning about numbers. And for this
reason it is indubitable in a Cartesian sense that there is no preferred or even equally preferable
ground on which to stand and criticize it.”

basic (finitistic) arithmetical knowledge are subjected to two constraints I sketch below and
then expand on in the rest of the paper.
a) The basic notions and principles of finitistic arithmetic are grounded in an analysis
of the fundamental properties of language. Arithmetic is to be reduced to a body of indu-
bitable knowledge, and this knowledge concerns the preconditions for the intersubjective
expression of thought. Numbers are to be identified with (the types of) a certain kind of
expressions and then certain properties of, operations on, and principles about numbers can
be perceived as evident through an analysis of our syntactical competence with respect to
those expressions. This competence is made explicit in the process of describing precisely
(formalizing) the practice of mathematics. Hilbert assigned a fundamental importance to
formalization, which to him was to be the result of a transcendental reflection:
The formula game that Brouwer so deprecates has, besides its mathe-
matical value, an important general philosophical significance. For this
formula game is carried out according to certain definite rules, in which
the technique of our thinking is expressed. These rules form a closed
system that can be discovered and definitively stated. The fundamental
idea of my proof theory is none other than to describe the activity of our
understanding, to make a protocol of the rules according to which our
thinking actually proceeds. (Hilbert, 1927, p. 475)
Like Frege, Hilbert assigned a special epistemic status to logic, the system of ‘laws
of thought’. However, unlike Frege, Hilbert thought that the most basic concepts and
principles of arithmetic cannot be grounded in the content of these laws, but rather in
the knowledge of the form of their expression, that is, in our knowledge of signs:
We start from the assumption that we possess the capacity to name things
by signs, and that we can always recognize them again. We can then
carry out certain operations with these signs, operations which are anal-
ogous to those of arithmetic and which satisfy analogous laws. ((Hilbert,
1910, p. 159), quoted in Hallett (1995, p. 164)).
What operations with signs (expressions) are to be taken as fundamental and how is
one to exploit the similarities between basic syntax and arithmetic in order to provide a
foundation for the latter in the former? Clearly some degree of freedom is to be expected
in answering these questions, since expressions (finitistic objects) and their relations can
be conceived in various ways.
b) Finitistic arithmetic is logically restricted. Since numbers form a (potentially) infinite
collection, standard forms of expression, and usual logical rules may not be used freely in
reasoning about them. Hilbert (1925) stated that
In the domain of finitistic propositions [. . . ] the logical relations that
prevail are very imperspicuous, and this lack of perspicuity mounts un-
bearably if “all” and “there exists” occur combined or appear in nested
propositions. In any case, those logical laws that man has always used
since he began to think, the very ones that Aristotle taught, do not hold.
(p. 379)
In particular, a finitistically meaningful arithmetical language may not use unbounded
quantifiers since the meaning of expressions using such quantifiers needs to be explicated
(in finitistic terms). The same considerations apply to the domain of expressions of a
language taken to be basic, that is, to basic syntax and to the arithmetic that one can
interpret in it.
Therefore our most fundamental arithmetical reasoning is characterized by certain re-
strictions derived from the prohibition against the free use of unbounded quantifiers. This
is a negative constraint that also leaves some degree of freedom in the specific choice of
the language and logic of the fundamental theory. However, we could hope that the various
candidates for the fundamental theory fall within a sufficiently narrow range.
I turn now to a closer examination of the constraints outlined above, with the aim of
interpreting them with sufficient precision to narrow the choice of a fundamental arith-
metical theory (or class of theories). First I will examine the commonly held view that
Primitive Recursive Arithmetic (PRA) is the most natural theoretical description of fini-
tistic arithmetical knowledge in Hilbert’s sense. It will turn out that the relation between
PRA and Hilbert’s Cogito is an uneasy one and that the resolution of this tension can be
accomplished by a weakening of the schema of primitive recursion.

§2. Two versions of PRA. Hilbert was not the first author to articulate reservations
about the use of standard logical rules in arithmetical reasoning. The line of thought that
demands restrictions on the logic of number theory is represented by Kronecker,6 Poincaré,
Brouwer, and Hilbert’s most prominent student, Hermann Weyl. For example, existential
quantifiers in arithmetic are said to produce incomplete or partial judgments in Weyl (1921,
p. 97) as well as in Hilbert & Bernays (1934, pp. 32–33). Motivated by similar concerns7
Skolem took the resolute step of eliminating quantifiers from the foundations of arithmetic
in Skolem (1923):
[. . . ] If we consider the general theorems of arithmetic to be functional
assertions and take the recursive mode of thought as a basis, then that
science can be founded in a rigorous way without the use of Russell
and Whitehead’s notions “always” and “sometimes.” This can also be
expressed as follows: A logical foundation can be provided for arithmetic
without the use of apparent logical variables. (p. 304)
In this pioneering work Skolem showed that a series of elementary number-theoretical
theorems (culminating with the result that any number has a unique decomposition in

6 Hilbert acknowledged Kronecker as the originator of the finitistic viewpoint, for example, in
Hilbert & Bernays (1934, p. 42): “Kronecker, who was the first to insist on the requirements
of the finitistic standpoint, sought to completely eliminate nonfinitistic methods of proof from
mathematics. He reached his aim in the theory of algebraic numbers and number fields.” Hilbert’s
attitude toward Kronecker’s methodological position oscillates from appreciation (see, e.g., Sieg
(1999, p. 4)) to outright hostility, as when he calls him a ‘classical prohibiting dictator’ in Hilbert
(1922, pp. 200–201).
For an analysis of Kronecker’s methodological views emphasizing their connection with his
mathematical work, see Marion (1995).
7 In Skolem (1950) he states the following: “When I wrote my article I hoped that the very natural
feature of my considerations would convince people that this finitistic treatment of mathematics
was not only a possible one but the true or correct one—at least for arithmetic. [. . . ] I am no
fanatic, and it is not my intention to condemn the nonfinitistic ideas and methods. But I should
like to emphasize that the finitistic development of mathematics as far as it may be carried out
has a very great advantage with regard to clearness and security. Further it may be good reason to
conjecture that it can be carried out very far, if one would make serious attempts in that direction.”
(p. 527)

prime factors) can be given a quantifier-free form, that is, they can be written as sentential
combinations of equations between terms constituted from free variables, 0̄, and symbols
for functions defined through primitive recursion.
Skolem did not provide an axiomatic framework for his proofs of the theorems in ques-
tion; this was supplied in Hilbert & Bernays (1934) after primitive recursive functions
had been given their standard definition and used extensively in the arithmetization of
syntax in Gödel (1931). PRA as presented by Hilbert and Bernays (henceforward called
PRAHB ) consists in a logical calculus of quantifier-free formulas (essentially sentential
logic plus the rules of substitution and equality), explicit and primitive recursive definitions
of functions (starting from the constant 0̄ and the successor function s(x)), the axiom
s(0̄) ≠ 0̄ and the schema of induction, which can be formulated as an inference rule:

$$\frac{\phi(\bar{0}) \qquad \phi(x) \supset \phi(s(x))}{\therefore\ \phi(x)}$$


where ϕ(x) is any formula in the language of PRA with x among its variables. Precise
descriptions of PRA along these lines are given in Girard (1987, p. 67) and Troelstra & van
Dalen (1988, pp. 120–126).
Curry (1941) and Goodstein (1954, 1957) sought to purify arithmetic from logic even
further and eliminated sentential connectives from its language. The process of simplifying
the primitives of the theory culminated in Rose (1962, 1984), but at the price of losing
some readability. Below I will rigorously define Goodstein’s system PRAG and assess
its suitability as a theory of finitistic arithmetic. As we shall see, the element that makes
its suitability doubtful is central to all versions of PRA: it is the scheme of definition by
primitive recursion itself.
The primitives of the language are the constant $\bar{0}$, individual variables $x_i$ ($i \geq 1$), the
equality sign $=$, the symbols for the successor and constant zero functions ($s$ and $z$, respectively),
symbols for the projection functions $\pi_{m,n}$ (with $1 \leq m \leq n$), and symbols for two
kinds of functional operators: $C_{m,n}$ (with $1 \leq m, n$) for the operation of composition, and
$R_{n+1}$ (with $n \geq 0$) for primitive recursion (in a fully finitistic formulation these operators
would not be allowed, but they permit a more concise presentation and therefore are used
here for reasons of convenience). I write $x, y, z, \ldots$ for variables with possibly distinct
indices and $\vec{x}, \vec{y}, \ldots$ for (possibly empty) vectors of such variables. Until Section 4, where
the number 0 is identified with the symbol ‘0’, I will distinguish between the constant $\bar{0}$
and the number 0.
Functionals and their arity are defined as follows: $s$, $z$, and $\pi_{m,n}$ are functionals of arities
1, 1, and $n$, respectively (I will write $n$-functional for a functional of arity $n$). If $f_1, \ldots, f_m$
are all $n$-functionals and $g$ is an $m$-functional, then $C_{m,n}[g, f_1, \ldots, f_m]$ is an $n$-functional.
If $f$ is an $(n+1)$-functional (with $n \geq 0$) and $g$ an $(n+2)$-functional, then $R_{n+1}[f, g]$ is an
$(n+1)$-functional. The letters $f, g, h, f_1, g_1, \ldots$ will be used as variables for functionals.
$\bar{0}$ and the individual variables are primitive terms. If $f$ is a functional of arity $n$ and
$t_1, \ldots, t_n$ are terms, then $f(t_1, \ldots, t_n)$ is a term and $f$ its dominant functional. I use
$t, u, v, \ldots$ as variables for terms and write $t(\vec{x}), u(\vec{y}), \ldots$ to indicate that the variables
$\vec{x}, \vec{y}, \ldots$ occur in the terms $t, u, \ldots$. Sometimes terms are written as $f(\vec{x}), g(\vec{y}), \ldots$,
indicating the dominant functional. $t(x/u)$ is the term obtained from $t(x)$ by replacing
the occurrences of the variable $x$ in $t$ with the term $u$.
The only formulas of the theory are equalities between terms: $t = u$. Its axioms
are of definitional nature, specifying the meaning of expressions for primitive recursive
functions:
Z. $z(x_1) = \bar{0}$.
P. $\pi_{m,n}(x_1, \ldots, x_n) = x_m$ for any $m, n$ with $1 \leq m \leq n$.
C. If $g$ is an $m$-functional and $f_1, \ldots, f_m$ are $n$-functionals, then
$$C_{m,n}[g, f_1, \ldots, f_m](\vec{x}) = g(f_1(\vec{x}), \ldots, f_m(\vec{x})).$$
R. If $g$ is an $(n+1)$-functional and $h$ is an $(n+2)$-functional, then
$$R_{n+1}[g, h](\vec{x}, \bar{0}) = g(\vec{x}, \bar{0}) \quad\text{and}\quad R_{n+1}[g, h](\vec{x}, s(y)) = h(\vec{x}, y, R_{n+1}[g, h](\vec{x}, y)).$$
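The defining equations Z, P, C, and R can be read as evaluation clauses. The following Python sketch makes that reading concrete; it is an illustration only and not part of Goodstein's system. It assumes that numbers are modelled by Python's non-negative integers rather than by numerals, and the names `Functional`, `pi`, `C`, `R`, `add`, `mul`, and `pred` are hypothetical choices made here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Functional:
    """A functional together with its arity (hypothetical helper type)."""
    arity: int
    apply: Callable[..., int]

    def __call__(self, *args: int) -> int:
        if len(args) != self.arity:
            raise ValueError(f"expected {self.arity} arguments, got {len(args)}")
        return self.apply(*args)


# Successor and constant-zero functionals (axiom Z: z(x1) = 0).
s = Functional(1, lambda x: x + 1)
z = Functional(1, lambda x: 0)


def pi(m: int, n: int) -> Functional:
    """Projection (axiom P): pi(m, n)(x1, ..., xn) = xm, for 1 <= m <= n."""
    if not 1 <= m <= n:
        raise ValueError("projection requires 1 <= m <= n")
    return Functional(n, lambda *xs: xs[m - 1])


def C(g: Functional, *fs: Functional) -> Functional:
    """Composition (axiom C): C[g, f1, ..., fm](x) = g(f1(x), ..., fm(x))."""
    if not fs or g.arity != len(fs):
        raise ValueError("g must take one argument per inner functional")
    n = fs[0].arity
    if any(f.arity != n for f in fs):
        raise ValueError("all inner functionals must share one arity")
    return Functional(n, lambda *xs: g(*(f(*xs) for f in fs)))


def R(g: Functional, h: Functional) -> Functional:
    """Primitive recursion (axiom R), with g of arity n+1 and h of arity n+2.

    R[g, h](x, 0)    = g(x, 0)
    R[g, h](x, s(y)) = h(x, y, R[g, h](x, y))
    The recursion is unwound as a loop, one pass per successor step.
    """
    n = g.arity - 1
    if n < 0 or h.arity != n + 2:
        raise ValueError("arities must be n+1 for g and n+2 for h")

    def run(*args: int) -> int:
        *xs, y = args
        acc = g(*xs, 0)
        for i in range(y):
            acc = h(*xs, i, acc)
        return acc

    return Functional(n + 1, run)


# x + 0 = x;  x + s(y) = s(x + y)
add = R(pi(1, 2), C(s, pi(3, 3)))
# x * 0 = 0;  x * s(y) = (x * y) + x
mul = R(C(z, pi(1, 2)), C(add, pi(3, 3), pi(1, 3)))
# pred(0) = 0;  pred(s(y)) = y   (the case n = 0)
pred = R(z, pi(1, 2))
```

With these definitions `add(3, 4)` evaluates to 7, `mul(3, 4)` to 12, and `pred(5)` to 4. Each functional is built once, as a single expression over `s`, `z`, `pi`, `C`, and `R`, which mirrors the way the operators serve only as a concise notation for individual definitions.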
There are four inference rules: the first three specify properties of equality and the fourth is
a uniqueness rule for the procedure of definition by primitive recursion that does the work
of induction (given here in a form slightly different from Goodstein’s).
$$S_1.\quad \frac{t(x) = u(x)}{\therefore\ t(x/v) = u(x/v)}$$
$$S_2.\quad \frac{t = u}{\therefore\ v(x/t) = v(x/u)}$$
$$T.\quad \frac{x = y, \quad x = z}{\therefore\ y = z}$$
for any terms $t, u, v$, where $t(x/v)$ is the term resulting from substituting $v$ for all the
occurrences of $x$ in $t$.
$$U.\quad \frac{f(\vec{x}, \bar{0}) = g(\vec{x}, \bar{0}), \quad f(\vec{x}, s(y)) = h(\vec{x}, y, f(\vec{x}, y))}{\therefore\ f(\vec{x}, y) = R_{n+1}[g, h](\vec{x}, y)}$$
for any functionals $f, g, h$.8
Using U and T one can immediately derive another inference schema taken as primitive
by Goodstein:
$$U_1.\quad \frac{f(\vec{x}, \bar{0}) = g(\vec{x}, \bar{0}), \quad f(\vec{x}, s(y)) = h(\vec{x}, y, f(\vec{x}, y)), \quad g(\vec{x}, s(y)) = h(\vec{x}, y, g(\vec{x}, y))}{\therefore\ f(\vec{x}, y) = g(\vec{x}, y)}.$$
PRAG and PRAHB are co-interpretable (a proof is given in Goodstein’s book) and hence
the two versions are interchangeable in the considerations that follow.
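One way to spell out the derivation of U1 from U and T is sketched below. It is offered as a reconstruction, not as Goodstein's or the author's own proof. It assumes that T may be applied with arbitrary terms in place of the variables x, y, z, and that reflexivity of equality is available (footnote 8 indicates how it is obtained from P, S1, and T).

```latex
\begin{align*}
1.\quad & f(\vec{x}, y) = R_{n+1}[g,h](\vec{x}, y)
  && \text{by } U \text{ from the first two premises of } U_1\\
2.\quad & g(\vec{x}, \bar{0}) = g(\vec{x}, \bar{0})
  && \text{reflexivity of equality}\\
3.\quad & g(\vec{x}, y) = R_{n+1}[g,h](\vec{x}, y)
  && \text{by } U \text{ with } g \text{ in the role of } f,\ \text{from 2 and the third premise}\\
4.\quad & R_{n+1}[g,h](\vec{x}, y) = f(\vec{x}, y)
  && \text{by } T \text{ from 1 and } f(\vec{x}, y) = f(\vec{x}, y)\\
5.\quad & R_{n+1}[g,h](\vec{x}, y) = g(\vec{x}, y)
  && \text{by } T \text{ from 3 and } g(\vec{x}, y) = g(\vec{x}, y)\\
6.\quad & f(\vec{x}, y) = g(\vec{x}, y)
  && \text{by } T \text{ from 4 and 5}
\end{align*}
```

Steps 4 and 5 use T to reverse an equation, which is why the rule is stated with a shared left-hand side in its premises.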
How well does PRAG comply with the requirements arising from Hilbert’s Cogito? With
respect to the logic of the theory it is hard to imagine that any serious challenge could be
mounted against the use of free variables and the identity relation or against the rules S1 ,
S2 , and T . Hilbert never asserted that we lack a general concept of number (i.e., the ability
to determine whether an object is a number or not if it is presented in an appropriate way),
but only that quantification over the totality of numbers and the logical rules associated
with it are problematic. S1 , S2 , and T could be attacked in contexts that involve intensional
or vague notions, but that is not the case in arithmetic. Difficult to challenge are also

8 If T had the form x = y, y = z / ∴ x = z it would be impossible to prove that equality is reflexive
or commutative. For no equation of form x = t would be derivable (as it can be proved through
an inductive argument on the length of the derivation) and so x = x would not be a theorem. The
schema x = y/ ∴ y = x would not be derivable either, since otherwise by using it one would
prove x = π1,1 (x) from P, contrary to the previous observation. Using T as given in the text
one immediately derives x = x using P and S1 , commutativity from x = y and x = x and then
transitivity.

the proper axioms of the theory regarding the zero function, projections, and composition
(projections will become entirely perspicuous once counting on sequences is introduced in
Section 4).
What remains to be examined is the actual engine powering the theoretical machinery of
PRAG —the definition schema of primitive recursion with its two components, R and U .
Taken together they assert that given functions (‘intuitive procedures’ for constructing
numbers) $g(\vec{x}, y)$ and $h(\vec{x}, y, z)$ of $n+1$ and $n+2$ arguments, respectively, there exists a
unique function $f(\vec{x}, y)$ of $n+1$ arguments such that $f(\vec{x}, \bar{0}) = g(\vec{x}, \bar{0})$ and
$f(\vec{x}, s(y)) = h(\vec{x}, y, f(\vec{x}, y))$. Does this principle have the same basic status as the rest of PRA’s
components? Is it implicit in our fundamental knowledge of operations with strings of symbols,
as the first Hilbertian insight demands?
I should emphasize from the outset that it is out of the question to justify finitistically
R and U as generalizations about finitistic functions (operations on numbers). The very
concept of finitistic function or operation is not available from a finitistic viewpoint and
therefore such a generalization cannot be finitistically meaningful. What we can hope for
is a piecemeal justification in which all cases (instances) of R and U are made finitistically
evident. Even the use of the functional operators Cm,n and Rn+1 in the language of PRAG
is in fact questionable from a strictly finitistic viewpoint, since no finitistic meaning can
be directly assigned to them: the idea of higher-order operations on finitistically
acceptable functions does not make direct finitistic sense. Every single definition of a
primitive recursive function must be seen as separated from the others, as an isolated
insight. I am using the operators for reasons of convenience, but it should be kept in mind
that I do not require a direct justification for them in any finitistic foundation for PRAG .

§3. From intuitions to principles. Hilbert and Bernays did supply a justification of
(all cases of) primitive recursion and crucial in assessing the success of their attempt is the
notion of elementary property (relation) of strings, since they claimed that the principle of
induction was obvious when restricted to elementary (intuitive) properties:
[. . . ] Here we use the inference rule of complete induction [die Be-
weismethode der vollständigen Induction]. Let us explain from the outset
the meaning of this rule from our elementary viewpoint: let there be
given an arbitrary statement [Aussage] about a numeral with an elemen-
tary intuitive content [elementar anschaulichen Inhalt]. Suppose that the
statement holds for 1 and we also know that if it holds for a numeral n,
then it also holds for the numeral n + 1. From this it follows that the
statement holds for any given numeral a.
In fact the numeral a is built up by the application of the procedure of
adding 1 starting from 1. First one notices that the statement applies to 1
and then, on the ground of the inductive hypothesis, that the statement
holds for every new numeral obtained by the addition of 1, and so, by
the time one completes the structure of a, one knows that the statement
holds for a.
What we have here is not an independent principle, but a consequence
we obtain from the concrete structure of numerals. (Hilbert & Bernays,
1934, p. 23)
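The argument in this passage can be pictured as a loop that follows the construction of the numeral a stroke by stroke. The Python sketch below is an illustration under stated assumptions, not a rendering of Hilbert and Bernays's own formulation: tally numerals are modelled as strings of the character `|`, "evidence" is whatever value the caller chooses to carry along, and the names `unwind_induction` and `odd_sum_step` are hypothetical. The sample statement is that the sum of the first n odd numbers equals n squared.

```python
from typing import Callable, TypeVar

E = TypeVar("E")  # whatever counts as evidence for one instance of the statement

STROKE = "|"


def unwind_induction(numeral: str, base: E, step: Callable[[str, E], E]) -> E:
    """Carry evidence from the numeral '|' up to `numeral`, one stroke at a time.

    `base` is evidence that the statement holds for '|'. `step` turns evidence
    for a numeral n into evidence for n followed by one more stroke.
    """
    if not numeral or set(numeral) != {STROKE}:
        raise ValueError("expected a nonempty tally numeral")
    current = STROKE
    evidence = base
    while current != numeral:
        # Apply the inductive hypothesis to the numeral built so far.
        evidence = step(current, evidence)
        current += STROKE
    return evidence


def odd_sum_step(n_numeral: str, total: int) -> int:
    """From 1 + 3 + ... + (2n - 1) = n^2, obtain the statement for n + 1."""
    n = len(n_numeral)
    new_total = total + (2 * n + 1)
    if new_total != (n + 1) ** 2:
        raise AssertionError("inductive step failed")
    return new_total


# Base case: the sum of the first odd number is 1, which is 1 squared.
result = unwind_induction(STROKE * 5, 1, odd_sum_step)
```

The loop terminates because the target numeral is itself a finished, concretely given string; nothing in the sketch quantifies over all numerals at once.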
In what sense is induction a derived and not independent principle? This question
should be answered on the basis of an analysis of the finitistic intuition of strings and
the most detailed such analysis is given in writings by Parsons (1980, 1986, 1994, 1998,
2008).9
Although Parsons follows guidelines set by Hilbert and Bernays, some of the elements
of his analysis are novel. A key idea he emphasizes is that an imprecise intuition of a
generic string is required if any intuitive basis is to be given to general principles about
numbers, such as the existence of the successor function, which is assumed both in PRAHB
and PRAG (Parsons, 2008, pp. 173–178). In contrast, Hilbert alluded to the completeness
of intuitions of finitistic objects (see, e.g., the programmatic quote from Hilbert on p. 3).
Parsons does not reject the view that each individual finitistic object (string) can be intuited
in full detail (see Parsons (1998), pp. 266–269, or Parsons (2008), pp. 260–262), so appar-
ently his views are compatible with Hilbert’s position. However, this appearance may be
deceiving since the sense of possibility involved in Parsons’s considerations is an ‘abstract
mathematical’ one (some argument is needed to show that a justification of finitism can use
such a concept), and the conclusion that intuitive knowledge of arithmetical propositions
involving large strings is possible is weaker than the actual intuitive knowledge of such
propositions.
As far as I know Hilbert never considers in writing the obvious fact that the clarity of our
intuitive grasp of strings weakens as the size and complexity of the strings increase10 and
he does not comment on how a satisfactory conception of finitistic intuition should take it
into account. However, the issue is examined in Bernays (1930), on the basis of the notion
of formal abstraction. After he acknowledges that it is doubtful that a number such as $10^{10^{20}}$
is instantiated in physical reality (let alone intuitively representable), Bernays claims that
it is nevertheless a finitistically acceptable object since even finitistic arithmetic involves
a certain degree of abstraction that disregards all accidental features of numbers, focusing
exclusively on their structural relations (p. 251). In the case at hand, once exponentiation
and successor (implicit in the arguments 10, 20) are perceived as intuitive operations, it
becomes irrelevant whether the term $10^{10^{20}}$ designates a concretely instantiable number.
Even if we accept this viewpoint (see the discussion of strict finitism below), we still
require an account of how a numerical function such as exponentiation can be defined
intuitively.
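For reference, the usual primitive recursive route to exponentiation passes through addition and multiplication, each obtained from the previous function by one application of schema R. The equations below are the standard textbook definitions, supplied here as an illustration; they are not quoted from Bernays or Parsons, and whether each step is intuitively evident is exactly what is at issue.

```latex
\begin{align*}
x + \bar{0} &= x, & x + s(y) &= s(x + y),\\
x \cdot \bar{0} &= \bar{0}, & x \cdot s(y) &= (x \cdot y) + x,\\
x^{\bar{0}} &= s(\bar{0}), & x^{s(y)} &= x^{y} \cdot x.
\end{align*}
```

Each line has the shape of axiom R: a value at $\bar{0}$ and a value at $s(y)$ computed from the value at $y$.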
In order to address this need, Parsons’s theory introduces a distinction between the
knowledge we acquire on the basis of the intuition of specific (types of) strings and “knowl-
edge of general propositions about types, which have in their scope indefinitely many
types” (Parsons, 2008, p. 173). The latter is founded on imagining an arbitrary string
either as a Gestalt, a bounded figure against a surrounding ground, or as an object con-
structed through the iteration of an operation that can be repeated indefinitely. The key
presupposition at play here is that the string in question is imagined in a representational
‘space’ that is not exhausted by the string’s presence and it can always accommodate extra
objects, such as an additional symbol extending the string. In other words, the “form of
intuition” is invariant: the object of intuition (imagination) is always bounded within an
unbounded environment (spatial or temporal). Clearly this presupposition is not absolutely

9 Parsons develops a notion of ‘intuition of’, that is, an intuition of objects in the spirit enunciated
by Hilbert and Bernays: “What is characteristic of this methodological standpoint is that
considerations are put in the form of thought experiments on objects that are assumed to be
concretely present” (Hilbert & Bernays (1934, p. 20), translated in Parsons (2008, p. 172)).
10 A tally numeral is a homogeneous string and as such it does not have any internal complexity.
But if intuited strings are composed of at least two different signs, there is an obvious sense in
which they have an internal structure of varying complexity.

indubitable and Parsons does not claim it has an a priori status (Parsons, 2008, p. 176). He
follows Husserl in claiming that the mathematical intuition of types of strings is founded
on the perception and imagination of tokens of those strings and as such the standard form
of intuition his theory is based on cannot be considered necessary, since perceptions that
do not have the structure [bounded object within unbounded space] are clearly possible.
Circumstances where the standard form of intuition does not apply are described as
‘extreme situations’ in Gandy (1982). They arise when the resources for representing in-
scriptions are running out, an inscription being understood as “some sufficiently permanent
physical state which is prepared or set up by its author and which can subsequently be
‘read’ by himself or another” (p. 129). It seems safe to say that we do not have a satisfactory
theory of extreme situations and consequently we do not have a full-fledged theory of what
might be called representation systems, that is, of the physical systems whose subparts are
used to instantiate types of strings for the purposes of communication. It is not my goal in
this paper to pursue this topic, so I will limit myself to some brief remarks relating it to
Hilbertian finitism.
A key characteristic of a representation system is its capacity: the size (number of indi-
vidual symbols) of the largest string it can support at a given time. The implicit assumption
made by Hilbert, Bernays, and Parsons is that there is no limit on the capacity of the
representation systems that are available ‘in principle’. Gandy’s ‘fundamental hypothesis’
is precisely its negation: “There is an upper bound B, independent of time, to the size
of inscriptions,” with the corollary that there is also an upper bound on the numbers that
are representable by concrete tokens of numerals. Numbers can be represented not only
directly through tokens of numerals, but also indirectly through tokens of closed terms
written using symbols for acceptable functions. Characterizing the collection of these
‘concretely definable’ numbers on the basis of Gandy’s fundamental hypothesis is the main
task of what might be called ‘strict finitism’ or ‘ultrafinitism’.
A simple proposal for this characterization would be to use a system designed to codify
the more relaxed version of finitism initiated by Hilbert (such as PRA_HB) and adjust it in
two ways: (i) adopt the language and logic of partial terms11 as presented, for example, in
Beeson (1986), which permits a simple formulation of the fundamental hypothesis, and (ii)
use the modified system as its own syntactical meta-theory, so that syntactical operations
are also conceived as partial. The first step involves the introduction of a special monadic
predicate ↓ (t↓ means that the term t is defined or has a referent) and of a modified relation
of equality ≅ (s ≅ t stands for (t↓ ⊃ t = s) ∧ (s↓ ⊃ s = t) for every pair of terms
s, t), and the reformulation of the axioms and rules of inference for PRA_HB by means of
these two elements. Whereas the propositional fragment of its logic remains unchanged,
substitution rules take the following forms:
$$\frac{s \cong t, \quad \varphi[x/s]}{\varphi[x/t]}, \qquad \frac{t{\downarrow}, \quad \varphi(x)}{\varphi[x/t]}$$
for every pair of terms s, t and formula ϕ. Axioms pertaining to the definability predicate
are 0↓, x↓ (for every variable x), and R(t1, . . ., tk) ⊃ t1↓ ∧ . . . ∧ tk↓ for every predicate
symbol R of arity k and arbitrary terms t1, . . ., tk (relevant cases are t = s ⊃ t↓ ∧ s↓
and f(t1, . . ., tk)↓ ⊃ t1↓ ∧ . . . ∧ tk↓). Furthermore, the axioms introducing the primitive
recursive function symbols are formulated by means of ≅ rather than =.

11 Parsons mentions the logic of partial terms in Parsons (2008, section 41) when discussing the
elementary axioms of arithmetic, but does not use it to describe strict finitism.

In this setup Gandy’s fundamental hypothesis can be written as ¬(t↓) for some term t,
and together with any consistent axiom t′↓ (for some term t′) it produces a theoretical
description of a representation system that can inscribe the numeral described by t′, but
not the one described by t (if t is s(t′) then the system could be called strictly regimented;
otherwise it could be called tolerant). It is easily observed that ↓ becomes superfluous if
one does not postulate the hypothesis, but rather adopts the axiom s(x)↓ stating that the
successor function is everywhere defined (that every term ‘converges’ or is defined follows
by induction on term complexity).
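The contrast between the two hypotheses can be illustrated with a small sketch. It is only an illustration under stated assumptions: the class name `RepresentationSystem`, the parameter `capacity` (standing in for the bound B), and the use of `None` for an undefined term are hypothetical choices and are not part of the formal system described above.

```python
from typing import Optional


class RepresentationSystem:
    """Toy model of a representation system for tally numerals.

    A numeral is a string of '1' characters; the empty string stands for zero.
    `capacity` is a hypothetical stand-in for the bound B; `None` models the
    opposite assumption that the successor function is everywhere defined.
    """

    def __init__(self, capacity: Optional[int] = None) -> None:
        self.capacity = capacity

    def is_defined(self, numeral: str) -> bool:
        """Analogue of t↓: the numeral can be inscribed in this system."""
        return self.capacity is None or len(numeral) <= self.capacity

    def successor(self, numeral: str) -> Optional[str]:
        """Partial successor: None means the value is undefined."""
        if not self.is_defined(numeral):
            return None  # an undefined argument yields an undefined value
        result = numeral + "1"
        return result if self.is_defined(result) else None


bounded = RepresentationSystem(capacity=3)
total = RepresentationSystem()
print(bounded.successor("11"))   # defined: the result fits within the capacity
print(bounded.successor("111"))  # undefined: the result exceeds the capacity
print(total.successor("111"))    # defined: no bound is assumed
```

With `capacity=3` the system inscribes `111` but not its successor, which corresponds to the strictly regimented case; with no capacity the definedness check is always true and could be dropped.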
What grounds are there for choosing between these two opposing hypotheses? Parsons
argues for the totality of the successor function on the basis of a transcendental insight, but
it could be said that the adoption of this principle is simply a matter of convenience, a way
of purifying the theory from a nonmathematical element, since the meaning of ↓ depends
on the nature of the physical world and the only possible evidence for Gandy’s fundamental
hypothesis is empirical. Parsons’s ‘step toward infinity’ could be interpreted as the decision
to pursue pure, a priori mathematics.12 It could also be argued that Hilbert’s Cogito does
not demand an absolutely secure foundation for arithmetic, but rather a foundation that is as
secure as that of linguistic communication (which includes the intersubjective expression
of mathematical thought). Exhausting representation resources for strings leads not only
to a breakdown in arithmetic (the existence of a number that is indirectly representable,
but not directly representable), but in communication as well (the existence of expressions
that can be described but not inscribed). Conversely, as long as syntactical operations such
as concatenation are unrestricted, so are arithmetical operations such as successor. The
version of finitism outlined by Hilbert and Bernays and analyzed by Parsons does make
the infinitary assumption that the representation system supporting (tokens of) strings is
potentially infinite (it can be extended indefinitely), and one good reason for going along
with it is that arithmetic becomes a simpler theory and we do not need to bother with the
definability predicate anymore.
Therefore from now on I will take for granted that basic operations on strings are
everywhere defined, although strictly speaking they are completely intuitive only on small
arguments (i.e., only relatively small strings can be represented ‘in the mind’s eye’). In
other words, we assume that large strings, which can be described but not directly intuited
in full detail, can be supported by some system of representation. While we do have an
imprecise global intuition of these large strings, our access to the totality of their details is
based on the stability of their concrete tokens—on external memory rather than on internal
memory. Both of them (the internal as well as the external representation of strings) can be
mobilized in an argument for the finitistic character of induction.
Consider an elementary statement ϕ(n) such that ϕ(0̄) and ϕ(n) ⊃ ϕ(n + 1) are finitistic
theorems (in the rest of this section I will use German letters for numerical constants and
variables, following Hilbert and Bernays). Let a be a numeral instantiated in a certain
system of representation in the sense previously indicated. What the Grundlagen passage
on induction suggests is that from a representation of a = 1. . .1 ≥ 11 one can obtain a
representation of the finitistic proof (a) consisting in the sequence of formulas
ϕ(0̄), ϕ(n) ⊃ ϕ(n + 1), ϕ(0̄) ⊃ ϕ(1), ϕ(1), ϕ(1) ⊃ ϕ(11), ϕ(11), . . ., ϕ(a).
This sequence contains 2a + 2 elements, such that the first and second are the inductive
hypotheses, element 2b ∸ 1 is the conditional ϕ(b ∸ 1) ⊃ ϕ(b), which follows from the

12 This is what is called ‘formal abstraction’ in Bernays (1930).



second induction hypothesis by instantiation, element 2b is ϕ(b) and follows from the
preceding two by modus ponens for 11 ≤ b ≤ a + 1. Note that it is circular to assert that
the existence of the proof in question is established by induction, a point made in George
& Velleman (1998, p. 323).
What would advance the finitist account of induction would be to show that the sequence
of symbols constituting the proof can be obtained on the basis of the original representation
of the numeral a. This could be made plausible if representing a were accompanied
by representing the sequence of all numerals smaller than it: 0̄, 1, 11, . . ., a = ℋ(a) (the
‘history’ of a).13 For if ℋ(a) can be represented with complete clarity and its parts can be
considered distinctly, then so can the sequence (a): the latter is obtained from the former
by turning its first element (ℋ(a))_0 = 0̄ into the pair ϕ(0̄), ϕ(n) ⊃ ϕ(n + 1), and every
subsequent element (ℋ(a))_b = b into the pair ϕ(b ∸ 1) ⊃ ϕ(b), ϕ(b).
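The passage from the history of a numeral to the proof sequence can be sketched as an element-by-element replacement. This is a minimal sketch, assuming tally numerals are Python strings of `'1'` with the empty string for zero; the function names `history`, `show`, and `proof_sequence` and the textual rendering of formulas are hypothetical.

```python
def history(a: str) -> list[str]:
    """Numerals from zero up to a, in order: '', '1', '11', ..., a."""
    return [a[:k] for k in range(len(a) + 1)]


def show(numeral: str) -> str:
    """Display form: '0' for the empty numeral, otherwise the tally itself."""
    return numeral if numeral else "0"


def proof_sequence(a: str) -> list[str]:
    """Replace each element of the history of a by a pair of formulas."""
    steps: list[str] = []
    for b in history(a):
        if b == "":
            # First element: the two inductive hypotheses.
            steps += ["phi(0)", "phi(n) -> phi(n+1)"]
        else:
            pred = b[:-1]
            # An instance of the step hypothesis, then the result of modus ponens.
            steps += [f"phi({show(pred)}) -> phi({show(b)})", f"phi({show(b)})"]
    return steps


for formula in proof_sequence("11"):
    print(formula)
```

Each element of the history contributes exactly two formulas, so the resulting list has 2a + 2 members when a is read as a number.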
It seems in line with Parsons’s analysis to claim that if a sequence of finitary objects
can be (globally) intuited and externally represented then so can a sequence obtained
from it by a transformation μ that acts on its individual members and is in itself
intuitive, in the sense that from a global intuition of b one can obtain a global intuition
of μ(b) and from a concrete representation of b one can obtain a concrete representation
of μ(b). This could be named a finitistic replacement principle (by analogy with the set-
theoretic one). The intuitive appeal of this principle is very strong, since substitution is
presupposed not only by the most basic function of language (symbolization), but also by
any proper expression of generality (involving pronouns or variables, expressions whose
role is that of placeholder for other expressions).
One way to support the claim that if a is representable then so is ℋ(a) is from a visual-
geometric viewpoint. Imagining ℋ(a) (i.e., producing an internal global representation of
ℋ(a)) is not significantly more difficult than imagining a. Departing a bit from the view
that tally numerals are constituted by occurrences of 1, let us imagine them as sequences
of dots: • • . . . • • (as emphasized in Bernays (1923), the actual shape of the primitive
symbols involved is irrelevant). Then ℋ(a) is simply the triangular ‘figure’ with a as basis.
For a = • • • • • • • the respective figure is
[Figure: a triangular array of dots with a as its base.]
Imagining ℋ(a) on the basis of a vague global representation of a is thus an easy exercise in
two-dimensional geometric intuition. However, it could be objected that visual perception
in at least two dimensions is not presupposed by Hilbert’s Cogito and constitutes a cogni-
tive resource that cannot be assumed available for the finitist, for nontrivial mathematics is
possible even when perception is restricted to one dimension. This is shown for instance by
Turing’s analysis of computation, which can be seen as an exercise in phenomenological

13 Hilbert & Bernays (1934) name this sequence ‘the series of numerals from 1 to a’ [die Reihe
der Ziffern von 1 bis a]. They describe it as follows: “The construction [Bildung] of a numeral a
proceeds through a concrete series of numerals, starting with 1 and ending with a, in which every
numeral is obtained from the preceding one by attaching 1. It is evident that, with the exception
of a, this series contains only numerals < a, and that any numeral < a must be present in this
series” (p. 24). It seems consistent with this passage to assume that the (global) intuition of a is
accompanied by the (global) intuition of ℋ(a).

imagination. The perceptual space of his idealized computing agent is a finite portion of
the read/write tape and as such unidimensional. Yet such an agent could easily imitate the
mathematical behavior of perceptually more sophisticated agents.
Unlike two-dimensional perception, memory seems an indispensable resource for math-
ematical thought, and indeed for any coherent form of thought (how is one to carry out
an intention without remembering what it is?). Memory has a dual aspect—one may dis-
tinguish between internal and external memory (we store tokens of strings on external
representation systems) and their compatibility is a key precondition to adequate thought
and communication. It is memory that can be invoked in justifying the history principle,
for if a string a is constructed through a temporal succession of acts of inscribing basic
signs in a representation system, then ℋ(a) is simply the recollection of the stages of this
process in their temporal order.
It should be remarked though that by producing concrete instances of a one does not
produce concrete instances of ℋ(a) in the same sense. But such a concrete representation
of ℋ(a) could be obtained if every individual sign x of a was replaced with the initial
segment of a that x determines (followed by some kind of marker). Therefore, a concrete
representation of ℋ(a) is guaranteed to exist by the finitistic replacement principle, pro-
vided the operation ax giving the initial segment of a that ends with x is considered
intuitive.
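The replacement just described can be sketched on strings. This is only an illustration: the function name `concrete_history` and the marker symbol `|` are hypothetical, and numerals are assumed to be Python strings of `'1'`.

```python
def concrete_history(a: str, marker: str = "|") -> str:
    """Replace each sign of a by the initial segment of a that ends at that sign,
    followed by a marker (the marker symbol is a hypothetical choice)."""
    return "".join(a[: k + 1] + marker for k in range(len(a)))


print(concrete_history("111"))        # 1|11|111|
print(concrete_history("111", "\n"))  # one numeral per line, giving a triangular figure
```

Using a line break as the marker lays the initial segments out one under another, which is one way to picture the triangular figure mentioned earlier.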
Thus it is a double application of the finitistic replacement principle that allows us to
conclude that (a) is representable if a is. If we accept the view that the conclusions of
finitistic proofs that can be globally represented in intuition are as reliable as their premises
we may infer that for every a, ϕ(a) follows from ϕ(1) and ϕ(n) ⊃ ϕ(n + 1), that is,
that induction is a finitistically acceptable mode of inference (restricted to finitistically
meaningful formulas, as ϕ(n) is assumed to be).14
So what are the elementary relations over numerals that may be used in a finitistic
inductive argument? Let us assume that basic finitistic statements are equations of form
f (a) = g(a) involving numerical variables and symbols for a special set of computable
functions. These functions form an algebra generated from some basic functions and
closed under operations such as composition. The function-generating operations come
along with rules of inference such as S, T , and possibly rules similar to R and U (we could
have a version of bounded recursion on notation, for example). Assuming that this algebra includes
addition, truncated subtraction x ∸ y, and the functions sg (sign) and s̄g (countersign),15
any formula ϕ(a) constructed from basic finitistic statements by means of the standard
sentential operators can be given a translation ϕ(a)∗ in the purely equational finitistic
language (without sentential operators). If the rules of inference associated with the algebra permit the
derivation of the standard logical rules for equality and sentential operators, then the basic
finitistic language may be enriched with the latter. This holds for every decent candidate
for the algebra of finitistic functions. The same cannot be said about the representation of

14 We should note that the argument above can be readily adapted to justify the form of induction
assumed by PRA_G, that is, principle U_1. The only difference is that instead of a sequence of
applications of instantiation and modus ponens, the proof to be represented is a sequence of
applications of the rules S_1, S_2, and T.
15 I will take as a given that the algebra contains the zero, successor, and projection functions. ∸ is defined
using the predecessor function p (which satisfies p(0̄) = 0̄ and p(s(a)) = a), by a ∸ 0̄ = a and
a ∸ s(b) = p(a ∸ b); we have sg(0̄) = 0̄, sg(s(a)) = s(0̄), s̄g(0̄) = s(0̄) and s̄g(s(a)) = 0̄. These
equations just register the properties of the functions in question—they are intuitively clear and
need no proof.

bounded quantifiers in the basic finitistic language. If, for example, the algebra is identified with
the set of polynomial-time computable functions (on binary numerals), then the question
whether finitistic relations are closed under bounded quantification comes down to the
extremely difficult problem of the collapse of the polynomial time hierarchy (which most
researchers expect to ultimately receive a negative answer).
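The defining equations for predecessor, truncated subtraction, sign, and countersign, and one common way of using them to eliminate sentential operators, can be sketched as follows. The encoding is an assumption made for illustration (a condition counts as holding when its term evaluates to 0), and the names `monus`, `cosg`, `eq`, `neg`, and `conj` are hypothetical; numbers are modelled by Python integers rather than numerals.

```python
def p(a: int) -> int:
    """Predecessor: p(0) = 0 and p(s(a)) = a."""
    return 0 if a == 0 else a - 1


def monus(a: int, b: int) -> int:
    """Truncated subtraction: a ∸ 0 = a and a ∸ s(b) = p(a ∸ b)."""
    result = a
    for _ in range(b):
        result = p(result)
    return result


def sg(a: int) -> int:
    """Sign: sg(0) = 0 and sg(s(a)) = s(0)."""
    return 0 if a == 0 else 1


def cosg(a: int) -> int:
    """Countersign: cosg(0) = s(0) and cosg(s(a)) = 0."""
    return 1 if a == 0 else 0


# Assumed encoding: a condition holds exactly when its term evaluates to 0.
def eq(x: int, y: int) -> int:
    """Term that is 0 exactly when x = y."""
    return sg(monus(x, y) + monus(y, x))


def neg(t: int) -> int:
    """Term that is 0 exactly when t is not 0."""
    return cosg(t)


def conj(t: int, u: int) -> int:
    """Term that is 0 exactly when both t and u are 0."""
    return sg(t + u)


print(eq(3, 3), eq(2, 5), neg(eq(2, 5)), conj(eq(1, 1), eq(2, 3)))
```

Under this encoding every sentential combination of equations becomes a single equation of the form t = 0, using only addition, truncated subtraction, sign, and countersign.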
It turns out though that if the objects of arithmetic are conceived as tally numerals, the
projection principle ensures the existence of bounded sum, and thus allows the introduction
of bounded quantifiers. Bounded sum is a definition procedure that associates with every
function f(n, m⃗) the function of the same arity Σf(n, m⃗) defined by the equations
$$\Sigma f(\bar{0}, \vec{m}) = f(\bar{0}, \vec{m}), \qquad \Sigma f(s(n), \vec{m}) = \Sigma f(n, \vec{m}) + f(s(n), \vec{m}).$$


Suppose now that the arguments a, b⃗ are representable. By the history principle so is
ℋ(a) and by the finitistic replacement principle the sequence ⟨0̄, b⃗⟩, ⟨1, b⃗⟩, . . ., ⟨a, b⃗⟩ is
also representable (for surely forming the pair ⟨c, b⃗⟩ is a finitistic operation given c ≤ a
and b⃗). A further application of the replacement principle yields the representation of
the sequence f(0̄, b⃗), f(1, b⃗), . . ., f(a, b⃗). Replacement can also assume the guise of a
union principle which guarantees that any concretely represented sequence of numerals
can be converted intuitively into a single numeral (simply by eliminating the breaks
between the members of the sequence). For suppose that among the basic signs we intuit are a break
marker and the empty symbol (symbolized by ‘ε’). Then if sequences of tally
numerals are strings of ‘1’ interspersed with occurrences of the break marker, the single numeral is the result
of substituting ε for the break marker in the sequence. Thus from a global intuitive representation of a, b⃗ it should
be possible to obtain the global intuitive representation of the bounded sum Σf(a, b⃗) through the kind of
‘thought experiment’ that characterizes finitistically acceptable functions. It is also easy to
show that by using projections and composition one can apply summation over any of the
variables of a finitistic function.
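The chain of steps in this argument (history, two replacements, union) can be sketched on strings. This is a minimal sketch under stated assumptions: tally numerals are Python strings of `'1'` with the empty string for zero, a single parameter `b` stands in for the parameter list, and the names `BREAK`, `replace_each`, `union`, and `bounded_sum_by_replacement` are hypothetical.

```python
from typing import Callable

BREAK = "|"  # hypothetical break marker, assumed not to occur in numerals


def history(a: str) -> list[str]:
    """Numerals from zero up to a: '', '1', '11', ..., a."""
    return [a[:k] for k in range(len(a) + 1)]


def replace_each(seq: list, mu: Callable) -> list:
    """Replacement: apply the transformation mu to every member of the sequence."""
    return [mu(x) for x in seq]


def union(seq: list[str]) -> str:
    """Write the sequence with break markers, then substitute the empty string for them."""
    inscription = BREAK.join(seq)
    return inscription.replace(BREAK, "")


def bounded_sum_by_replacement(f: Callable[[str, str], str], a: str, b: str) -> str:
    pairs = replace_each(history(a), lambda c: (c, b))         # <0,b>, <1,b>, ..., <a,b>
    values = replace_each(pairs, lambda pair: f(*pair))        # f(0,b), f(1,b), ..., f(a,b)
    return union(values)                                       # one numeral, breaks removed


tally_add = lambda c, b: c + b  # addition of tally numerals is concatenation
print(bounded_sum_by_replacement(tally_add, "11", "1"))  # 1 + 2 + 3 written in tallies
```

In the usage line the three values `1`, `11`, `111` are merged into a single numeral of six strokes.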
If the above argument is seen as successful and all the other assumptions made about
the algebra of finitistically acceptable functions (namely that it includes zero, successor,
predecessor, projections, addition, truncated subtraction, sign and countersign, and that it is
closed under composition) are maintained, then the set of finitistic propositional functions
that can be justified as meaningful is closed under bounded quantifiers.
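How bounded sum yields bounded quantification can be seen in a short sketch. It assumes, for illustration only, that numbers are Python integers, that f takes a single parameter m in place of a parameter list, and that f has no negative values; the names `bounded_sum` and `forall_bounded` are hypothetical.

```python
from typing import Callable

BinaryFn = Callable[[int, int], int]


def bounded_sum(f: BinaryFn) -> BinaryFn:
    """Return the function defined from f by the two bounded-sum equations."""
    def sigma_f(n: int, m: int) -> int:
        total = f(0, m)              # base equation: value at 0
        for k in range(1, n + 1):
            total = total + f(k, m)  # step equation: add the next value of f
        return total
    return sigma_f


def forall_bounded(f: BinaryFn, n: int, m: int) -> bool:
    """'For all k <= n, f(k, m) = 0' holds exactly when the bounded sum is 0."""
    return bounded_sum(f)(n, m) == 0


excess = lambda k, m: max(k - m, 0)  # truncated subtraction k ∸ m
print(bounded_sum(lambda k, m: k + m)(3, 2))
print(forall_bounded(excess, 2, 5), forall_bounded(excess, 7, 5))
```

Since a sum of non-negative values is zero only if every summand is zero, a bounded universal statement reduces to a single equation; a bounded existential statement can then be obtained by negation.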
Turning now to the basic relations between numerals, we find that in Section 2
of Hilbert & Bernays (1934) the authors consider identity [Übereinstimmung], symbolized
by ‘=’, and difference [Verschiedenheit], symbolized by ‘≠’, as basic. They also assume
certain primitive operations [Handlungen] and constructive processes [Bildungsprozesse]
whereby other numerals [Ziffern] are obtained from given ones. The most fundamental of
these is the ‘progression process’ [Prozess des Fortschreitens], namely the concatenation
of 1 to a given numeral; but concatenation can take two other figures [Figuren] as arguments
as well, and so we have the operation of addition, denoted by ‘+’. Associated with
it is the relation of order: one can show that a < b only by producing a c such that
a + c = b, since it is only in this way that a is an initial segment [Anfangsstück]
of b. Multiplication is another basic operation on numerals described by means of an
intuitive procedure on strings: “a · b denotes the numeral obtained from the numeral b,
when substituting throughout its structure 1 with the numeral a” (p. 24). As with
addition, multiplication is not given a recursive definition—rather, its fundamental property
of distributivity with respect to addition is supposed to follow directly from its intuitive
definition.
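These intuitive procedures on numerals translate directly into string operations. The following is a sketch, assuming numerals are non-empty Python strings of `'1'` (so counting starts at 1, as in the passage described); the function names are hypothetical.

```python
from typing import Optional


def add(a: str, b: str) -> str:
    """Addition is concatenation of the two figures."""
    return a + b


def difference_witness(a: str, b: str) -> Optional[str]:
    """Return a numeral c with a + c = b if there is one, otherwise None."""
    c = b[len(a):]
    return c if c and add(a, c) == b else None


def less_than(a: str, b: str) -> bool:
    """a < b is shown by producing a c such that a + c = b."""
    return difference_witness(a, b) is not None


def mul(a: str, b: str) -> str:
    """a · b: substitute the numeral a for every 1 in the numeral b."""
    return b.replace("1", a)


a, b, c = "11", "111", "1"
print(mul(a, b))
print(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)))  # distributivity on this instance
```

Distributivity holds here because substituting a for each stroke of a concatenation gives the concatenation of the two substituted parts, which is the sense in which the property follows from the intuitive definition.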

In Grundlagen der Mathematik no other fundamental operations are defined and the
authors proceed to the explanation of primitive recursion. In a generous interpretation we
could consider that functions shown to be finitistically acceptable constitute the functional
algebra described above (multiplication can be considered as basic, but it can also be
defined by means of bounded sum). These functions, known as the lower elementary func-
tions, have been discussed in Skolem (1962) and their set is denoted by ℒ2 in Rose (1984).
The formulas available for an inductive proof that primitive recursion is a permissible
definitional method are those expressed by the Δ0 formulas of the language {0̄, =} ∪ Fℒ2
(where Fℒ2 is the set of symbols for functions in ℒ2).16 But these are insufficient for the
purpose of proving that definition by primitive recursion [das Verfahren der rekursiven
Definition] is an elementary procedure (that it produces elementary functions when applied
on elementary functions). Here is how Hilbert and Bernays describe it:
A new function symbol, say ϕ, is introduced, and the [corresponding]
function is defined by two equations. In the simplest case, these equa-
tions are of the form:
ϕ(1) = a

ϕ(n + 1) = ψ(ϕ(n), n).


Here, a is a numeral and ψ a function which is formed from previously
known functions by composition, so that ψ(b, c) can be computed for
given numerals b, c and gives another numeral as value. [. . . ]
It is not immediately clear, which sense can be assigned to this method
of definition. For its elucidation we must first make the notion of function
precise. A function, for us, is an intuitive instruction on the basis of
which to each given numeral another numeral is assigned. A pair of
equations of the above kind, called a “recursion”, is to be understood
as an abbreviated communication of the following instruction:
Let m be any numeral. If m = 1, [so] let the numeral a be assigned to
m. Otherwise, m has the form b + 1. One then writes down schemati-
cally:
ψ(ϕ(b), b).
Now if b = 1, [so] one replaces therein ϕ(b) by a; otherwise b again
has the form c + 1, and one then replaces ϕ(b) by
ψ(ϕ(c), c).
Again, either c = 1 or c is of the form d + 1. In the former case one
replaces ϕ(c) by a, in the latter case by
ψ(ϕ(d), d).
Repeating this process in any case terminates. For the numerals
b, c, d, . . . ,
which we obtain one after the other, develop through the disassembling
of the numeral m, and this must terminate just like the assembling of m

16 We need not take the symbol ‘≤’ as primitive, since x ≤ y may be taken to be x ∸ y = 0̄.

does. When we arrive at 1 in this process of disassembling, then ϕ(1)
is replaced by a; the sign ϕ does then no longer occur in the resulting
figure. Rather, the only function symbol occurring, possibly in multiple
superposition, is ψ and the innermost arguments are numerals. Thus we
have arrived at a computable expression; for ψ was supposed to be a
function already known. This computation must be executed from inside
out, and the numeral thus obtained shall be assigned to the numeral m.
(Hilbert & Bernays (1934, pp. 25–26), translation in Zach (1998)).
As pointed out in Parsons (1998) and Zach (1998), this explanation does not show
conclusively that primitive recursion has an elementary or finitistic character. More exactly,
it cannot be seen as justifying the schema R used in PRA_G, even in a case-by-case sense.
Hilbert and Bernays succeed in showing that the recursion equations permit one to rewrite
the term ϕ(m) as an expression in which ϕ does not occur (in m+1 steps). The problem that
is left untouched is how to recognize that the expression ψ(ψ(. . .ψ(a, 1). . .), m) resulting
from ϕ(m) by the stepwise elimination of the function symbol ϕ is a computable one [eine
berechenbaren Ausdruck].
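The two stages distinguished here, eliminating the symbol ϕ and then computing the resulting expression, can be separated in a small sketch. It is an illustration only: expressions are plain strings, numerals are written as Python integers, and the names `unfold` and `evaluate` are hypothetical.

```python
from typing import Callable


def unfold(m: int) -> str:
    """Rewrite phi(m) into an expression without phi, using the recursion
    equations phi(1) = a and phi(n + 1) = psi(phi(n), n)."""
    expr = f"phi({m})"
    k = m
    while k > 1:
        # Replace the single remaining occurrence of phi(k).
        expr = expr.replace(f"phi({k})", f"psi(phi({k - 1}), {k - 1})")
        k -= 1
    return expr.replace("phi(1)", "a")


def evaluate(m: int, a: int, psi: Callable[[int, int], int]) -> int:
    """Compute the unfolded expression from the inside out."""
    value = a
    for n in range(1, m):
        value = psi(value, n)
    return value


print(unfold(3))
print(evaluate(4, 1, lambda x, n: x * (n + 1)))  # with this psi, phi is the factorial
```

`unfold` performs only symbolic rewriting and its number of steps is fixed by m. `evaluate` additionally depends on each application of `psi` returning a value, which is the part of the argument that the rewriting alone does not settle.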
Since we have induction at our disposal, the problem can receive an elementary resolu-
tion if the property ‘computable’ [berechenbar] is elementary. But it seems obvious that it
is not: an expression ξ is computable if and only if there exists a sequence of expressions
ξ_1 = ξ, ξ_2, . . ., ξ_k = n
starting with the expression in question and ending with a numeral n and proceeding at
each step according to a computation rule already accepted.17 It is this sequence of steps
that Hilbert and Bernays refer to when they say that what remains to be done is to “perform this
computation from inside out” [Diese Berechnung hat man nun von innen her auszuführen].
Existential statements regarding (sequences of) expressions have the same nature as exis-
tential statements about numerals and therefore are just as unacceptable from a finitistic
point of view if no bound is specified. ‘Computable’ can only be seen (in the context of
the theory developed by Hilbert and Bernays up to the point where primitive recursion
is introduced) as a property implicitly containing a noneliminable unbounded existential
quantifier and hence the inductive argument that primitive recursion is an ‘intuitively’
acceptable definition procedure for functions is not an elementary induction.18
Essentially the same inductive argument for primitive recursion is given in Tait (1981,
pp. 531–532) and in this respect Tait’s defense of PRA as the proper theory of finitistic
reasoning suffers from the same basic defect as that of Hilbert and Bernays. I return to
Tait’s views in the next section, since he offers an attractive format for the conceptualization
of finitistic reasoning and another basic principle that nicely complements replacement.
To conclude, even if we allow the Hilbertian finitist the use of function existence
principles involved in the definition of primitive recursive functions other than primitive
recursion itself (composition, projections), and even the procedure of bounded sum-
mation, he cannot justify on their basis the existence of the computation that resolves
ψ(ψ(. . .ψ(a, 1). . .), m) into a numeral (there is no already accepted function that outputs

17 It doesn’t help to define ‘ξ is computable’ by ‘ξ denotes a numeral’. An unbounded quantifier is
still implicit in the latter expression.
18 This point is made in Parsons (1998), where Parsons explicitly says that he doubts that intuitive
arithmetic includes PRA, and following Edward Nelson he expresses the view that exponentiation
is not an intuitively justifiable function (p. 265).

the computation when given ψ(ψ(. . .ψ(a, 1). . .), m) as input or provide a bound for the
length of this computation). If the ‘elementary’ properties of tally numerals are those
expressed by the Δ0 formulas of the language {0̄, =} ∪ Fℒ2, then induction over those
properties cannot possibly prove the existence of all primitive recursive functions, even
if we allow ourselves the full use of classical logic. For consider the first-order theory
axiomatized by the equations defining the functions in ℒ2 (including s(x) ≠ 0̄ and s(x) =
s(y) ⊃ x = y) and the Δ0 induction scheme (for the language {0̄, =} ∪ Fℒ2). This theory
can be proved consistent in a finite fragment of PRA_G, and therefore cannot prove the
existence of all the functions in that fragment.

§4. Two theories that may be one and the same. The organizing idea of Tait (1981)
is to identify the forms of reasoning implicit in the concept of number. According to
Tait, understanding Number consists essentially in the ability to count, that is, to order
collections and to put them in correspondence with numbers (objects generated from 0
by successor). I will combine this idea with Hilbert’s Cogito and give them a precise
interpretation in the typed framework used in Tait’s paper, extended so as to accommodate
syntactical as well as numerical types.
Following Hilbert, I will take the basic type E to be that of expressions (the “extra-
logical discrete objects, which exist intuitively as immediate experience before all
thought”). E is a free monoid with a finite set of generators ℬ (the basic alphabet
that includes the standard signs used in formal theories of arithmetic), basic operation
∧ : E² → E (concatenation), and neutral element ε (the empty expression). Thus we have
(x∧y)∧z = x∧(y∧z) and x∧ε = ε∧x = x for every x, y, z : E; the elements of ℬ are atoms,
in the sense that they cannot be decomposed into simpler expressions (if x : ℬ and x = u∧v,
then either u = x and v = ε or u = ε and v = x). E can also be considered a cancellative
monoid: if x ∧ y = x ∧ z or y ∧ x = z ∧ x, then y = z. Otherwise the same expression can be
given two distinct interpretations and hence be considered ambiguous. If the elements of
E are vehicles of communication, this is a situation to be avoided.
Still following Hilbert, I take N⁺ as a subtype of E defined by the conditions 1 : N⁺ and
m∧n : N⁺ if m, n : N⁺, and N as N⁺ extended with 0. s : N → N is introduced by the rules
s(0) = 1 and s(n) = n∧1 for n : N⁺; addition is introduced by the rules 0 + m = m + 0 = m
and n + p = n∧p for m : N and n, p : N⁺.
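These rules can be sketched with Python strings playing the role of expressions of type E, so that concatenation is `+` on strings and ε is the empty string. The names `is_positive_numeral`, `s`, and `add` are hypothetical, and the sign `0` is treated as a separate expression for zero, as in the rules above.

```python
ZERO = "0"
ONE = "1"


def is_positive_numeral(x: str) -> bool:
    """x : N+ when x is a non-empty string consisting only of the sign 1."""
    return x != "" and set(x) == {ONE}


def s(n: str) -> str:
    """Successor: s(0) = 1 and s(n) = n followed by 1 for positive n."""
    return ONE if n == ZERO else n + ONE


def add(m: str, n: str) -> str:
    """Addition: 0 + m = m + 0 = m, and concatenation for positive numerals."""
    if m == ZERO:
        return n
    if n == ZERO:
        return m
    return m + n


x, y, z = "s(", "0", ")"
print((x + y) + z == x + (y + z), x + "" == "" + x == x)  # monoid laws on an instance
print(s(ZERO), s("11"), add(ZERO, "11"), add("11", "1"))
```

The separate clauses for `0` are needed because zero is not itself a string of strokes, so it cannot simply be concatenated.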
Tait’s demand that the finitist should be able to count can be interpreted in more ways
than one. The simplest counting operation, of type E → N, assigns 0 to ε and 1 to every
x : ℬ, and assigns to x∧y the sum of the values it assigns to x and to y, for x, y : E. It thus uses tally numerals to count the
lengths of expressions; other forms of counting will be described below.
The substring relation is clearly intuitive and closely involved with concatenation: x is a
substring of y (written x ⊆ y) if and only if there exist u, v such that y = u∧x∧v (note that
the quantifier involved in this definition is bounded). If u = ε, then x is an initial segment
of y; if v = ε, then x is a final segment of y. Using basic
counting one can define the n-th symbol x(n) for any x : E: x(n) = u if and only if u is
in ℬ and there exists v : E such that v∧u is an initial segment of x and the count of v∧u is n.
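Counting, the segment relations, and the n-th symbol can be sketched on Python strings. This is an illustration under stated assumptions: expressions are strings over an arbitrary alphabet, numerals are `'0'` or strings of `'1'`, and the function names are hypothetical.

```python
from typing import Optional


def count(x: str) -> str:
    """Length of x as a numeral: '0' for the empty expression, otherwise tallies."""
    return "0" if x == "" else "1" * len(x)


def is_substring(x: str, y: str) -> bool:
    """x is a substring of y: y = u + x + v for some u, v."""
    return x in y


def is_initial_segment(x: str, y: str) -> bool:
    """The case u = ε."""
    return y.startswith(x)


def is_final_segment(x: str, y: str) -> bool:
    """The case v = ε."""
    return y.endswith(x)


def nth_symbol(x: str, n: str) -> Optional[str]:
    """The sign u such that some v + u is an initial segment of x with count n."""
    for end in range(1, len(x) + 1):
        if count(x[:end]) == n:
            return x[end - 1]
    return None  # no such sign: n is 0 or exceeds the length of x


print(count("s(0)"), nth_symbol("s(0)", "11"))
```

The search in `nth_symbol` ranges only over initial segments of `x`, which reflects the remark that the quantifiers involved are bounded.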
Sequences of numbers can be introduced as another subtype S_N of E, using a special
symbol to mark breaks between the elements of a sequence. In general,
assuming that T is a subtype of E that does not include ε and that the break symbol for T is an element of ℬ
not occurring in expressions of type T, the type S_T of sequences of objects of type T can
be introduced by the rules x : S_T if x : T, and x ∗ y : S_T if x : S_T and y : S_T, where
x ∗ y (the sequence concatenation of x and y) is x, followed by the break symbol for T, followed by y.
Counting extends to sequences by an operation of type S_T → N that assigns 1 to every x : T and
assigns to x ∗ y the sum of the values it assigns to x and to y, for x, y : S_T. If x : S_T and k is at most
the count of x, then the initial segment of x of length k is the string u : S_T with count k such that
x = u ∗ v for some v : S_T. If this initial segment is z, or is v ∗ z, for some z : T and v : S_T, then we write (x)_k = z (z
is the k-th element of the sequence x). If z = (x)_k for some k that is at most the count of x we write z ∈ x (this
is how counting on sequences allows the introduction of the projection functions π_{m,n}).
I will sometimes write members of sequence types using the vector notation: x⃗, y⃗, and so
forth.
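Sequence types with a break symbol can be sketched on strings as follows. The break symbol `;`, the function names, and the use of Python integers (rather than numerals) for counts and positions are all hypothetical simplifications.

```python
BREAK = ";"  # hypothetical break symbol, assumed not to occur in the elements


def seq_concat(x: str, y: str) -> str:
    """Sequence concatenation: x, then the break symbol, then y."""
    return x + BREAK + y


def seq_count(x: str) -> int:
    """Number of elements: 1 for a single element, additive over concatenation."""
    return len(x.split(BREAK))


def initial_segment(x: str, k: int) -> str:
    """The initial segment of the sequence x that has k elements."""
    return BREAK.join(x.split(BREAK)[:k])


def element(x: str, k: int) -> str:
    """The k-th element: the last element of the initial segment of length k."""
    return initial_segment(x, k).split(BREAK)[-1]


def is_member(z: str, x: str) -> bool:
    """z is a member of x if z is the k-th element for some k up to the count of x."""
    return any(element(x, k) == z for k in range(1, seq_count(x) + 1))


x = seq_concat(seq_concat("1", "111"), "11")
print(x, seq_count(x), initial_segment(x, 2), element(x, 2), is_member("11", x))
```

`element` is defined through `initial_segment`, following the order of the definitions in the text, rather than by indexing the split list directly.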
In this typed setup the replacement principle can be given various formulations since it
relies on an informal notion of sequence that can be made precise in more than one way.
The simplest version of replacement introduces the operation of substituting an expression
z for the occurrences of a primitive sign x in an expression y. sub : (ℬ ∧ E) ∧ E → E
has the properties sub(x, z, ε) = ε, sub(x, z, x) = z, sub(x, z, u) = u, sub(x, z, v∧y) =
sub(x, z, v)∧sub(x, z, y) for every z, v, y in E, x and u in ℬ (x different from u).
Restricted to N it permits the introduction of the multiplication function: we have m · n = 0
if m = 0 or n = 0 and m · n = sub(1, m, n) otherwise. Let T be a type consisting of a
single expression u; I will write u^n for the result of substituting ε for the break symbol for T in w,
where w : S_T is a sequence with count n. u^n is thus n copies of u concatenated.
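The substitution operation and the two definitions built on it can be sketched as follows. This is a minimal sketch, assuming expressions are Python strings, numerals are `'0'` or strings of `'1'`, and `;` is a hypothetical break symbol; the name `power` and the use of a Python integer for the exponent are also assumptions.

```python
def sub(x: str, z: str, y: str) -> str:
    """Substitute the expression z for every occurrence of the primitive sign x in y."""
    if len(x) != 1:
        raise ValueError("x must be a single primitive sign")
    # Sign by sign: x becomes z, any other sign is left unchanged.
    return "".join(z if u == x else u for u in y)


def mul(m: str, n: str) -> str:
    """m · n = 0 if m = 0 or n = 0, and sub(1, m, n) otherwise."""
    return "0" if m == "0" or n == "0" else sub("1", m, n)


def power(u: str, n: int, brk: str = ";") -> str:
    """n copies of u concatenated (n >= 1): form the sequence u;u;...;u,
    then substitute the empty expression for the break symbol."""
    w = brk.join([u] * n)
    return sub(brk, "", w)


print(sub("1", "ab", "1x1"), mul("11", "111"), mul("0", "111"), power("ab", 3))
```

The sign-by-sign loop in `sub` mirrors the four defining equations: the empty expression, the sign x itself, any other sign, and concatenation.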

Summation of number sequences is the operation $\Sigma : S_N \to N$ defined by $\Sigma(x) = 0$ if
$(x)_k = 0$ for all $k$ with $1 \le k \le \ell_N(x)$ and otherwise $\Sigma(x) = \mathrm{sub}(0, \varepsilon, \mathrm{sub}(\square_N, \varepsilon, x))$,
with the properties $\Sigma(x) = x$ if $x : N$ and $\Sigma(x * y) = \Sigma(x) \wedge \Sigma(y) = \Sigma(x) + \Sigma(y)$ for $x, y : S_N$.
Representing the string $\Sigma(x)$ involves two acts: first all break symbols $\square_N$ in $x$ (if any) are
eliminated (substituted with ε); if the resulting string consists exclusively in occurrences
of 0, all such occurrences except one are eliminated, whereas if in the resulting string there
are occurrences of 1, then all occurrences of 0 are eliminated.
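The two acts described above can be traced in a small sketch. It assumes that zero is written `0`, that positive tally numerals are strings of `1`s, and that `|` stands in for the break symbol; these encodings and the name `summation` are assumptions made for the illustration.

```python
BLANK = "|"  # hypothetical stand-in for the break symbol between numerals


def sub(x: str, z: str, y: str) -> str:
    """Replace every occurrence of the primitive sign x in y by z."""
    return "".join(z if sign == x else sign for sign in y)


def summation(x: str) -> str:
    """Sum of a sequence of tally numerals, computed by erasing signs."""
    members = x.split(BLANK)
    if all(m == "0" for m in members):
        return "0"  # only zeros: all occurrences of 0 but one are eliminated
    without_breaks = sub(BLANK, "", x)  # first act: erase the break symbols
    return sub("0", "", without_breaks)  # second act: erase the zeros
```

Applied to the sequence `11|0|1`, the first act gives `1101` and the second gives `111`.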
Given that ℬ is finite and cannot supply indefinitely many distinct blank symbols, there
arises the issue of how to iterate the operation of forming sequences (this is necessary
in introducing syntactical types such as terms). The problem can be overcome by ranking
sequences and introducing a special expression acting as blank for sequences of a particular
rank. We set $x : S^0_T$ as $x : T$ and write $x *_n y$ for $x \wedge \square^n_T \wedge y$. $S^{s(n)}_T$ is defined by the rules
$x : S^{s(n)}_T$ if $x : S^n_T$ and $x *_{s(n)} y : S^{s(n)}_T$ if $x, y : S^{s(n)}_T$. It should be noted that the indexed
sequence types are cumulative (i.e., if $m \le n$ and $x : S^m_T$, then $x : S^n_T$) and that $S^{s(0)}_T$ coincides
with $S_T$. The notions previously introduced for $S_T$ can be extended for $S^n_T$ ($n \ge 1$). Counting has
an indexed version $\ell^n_T : S^n_T \to N$ with the analogue properties $\ell^n_T(x) = 1$ for $x : S^{p(n)}_T$,
and $\ell^n_T(x *_n y) = \ell^n_T(x) + \ell^n_T(y)$ for $x, y : S^n_T$. If $z : S^n_T$ and $k \le \ell^n_T(z)$ then $z{\restriction}k$ is defined
as the initial segment of $z$ of length $k$, that is, the string $u : S^n_T$ such that $z = u *_n v$ for some
$v : S^n_T$ and $\ell^n_T(u) = k$. If $z{\restriction}k = x$ or $z{\restriction}k = y *_n x$ for some $x : S^{p(n)}_T$, then we write $(z)_k = x$
($x$ is the $k$-th element of the sequence $z$). If $x = (z)_k$ for some $k \le \ell^n_T(z)$ we write $x \in z$.
The transitive closure of this relation will be written as $\in^*$. The union of all types $S^n_T$ is $S^*_T$.
The $T$-rank $r_T(x)$ of $x : S^*_T$ is defined as the smallest $n$ such that $x : S^n_T$ (it is the length of the
longest expression of form $\square^n_T$ that occurs as a substring of $x$). We write $R^n_T$ for the type of
strings $x : S^*_T$ such that $r_T(x) = n$. $R^0_T$ coincides with $T$ and $R^{s(n)}_T$ is characterized by the
membership rule $x *_{s(n)} y : R^{s(n)}_T$ if $x, y : S^{s(n)}_T$.

Suppose now that $x \in y$, where $x : R^m_T$ and $y : R^n_T$, with $m \le n$ and $z : R^k_T$. We wish
to introduce a function that substitutes occurrences of $z$ for the occurrences of $x$ in $y$,
thus extending replacement to iterated sequences. The definition must take into account
the fact that the rank of $z$ may very well be higher than the ranks of both $x$ and $y$ and yet
structurally the resulting sequence should be isomorphic to $y$. An example will perhaps
clarify this issue. Let $A$ be the type of strings $a^n$, with $n > 0$ and let us write $b$ for $\square_A$.
Then $S^n_A$ is the type of strings over the alphabet $\{a, b\}$ in which the longest $b$-tally
(substring containing only $b$) is at most $b^n$; $R^n_A$ is the subtype of $S^n_A$ of those expressions
whose longest $b$-tally is exactly $b^n$.
Let $x = aa$, $y = aa \wedge b \wedge a$, and $z = a \wedge b \wedge a \wedge bb \wedge a$. We should not identify the result of
substituting $z$ for $x$ in $y$ as $a \wedge b \wedge a \wedge bb \wedge a \wedge b \wedge a$ (since the first member of this sequence is
not $z$, but $a \wedge b \wedge a$ and the second member of this sequence is not $a$, but $a \wedge b \wedge a$); a correct
definition should identify the result of the substitution to be

$a \wedge b \wedge a \wedge bb \wedge a \wedge bbb \wedge a$,

that is, the sequence in $R^3_A$ whose first element is $z$ and the second is $(y)_2$.
Therefore the ranked substitution operation $\mathrm{sub}_{R_T} : (S^*_T \wedge S^*_T) \wedge S^*_T \to S^*_T$ will be
characterized by the properties $\mathrm{sub}_{R_T}(x, y, z) = z$ if $x = y$; $\mathrm{sub}_{R_T}(x, y, z) = y$ if $r_T(x) = r_T(y)$
(i.e., $x, y : R^n_T$ where $n = r_T(x)$) and $x \ne y$; $\mathrm{sub}_{R_T}(x, y, z) = y$ if $r_T(x) > r_T(y)$
(since in that case $x \notin^* y$); $\mathrm{sub}_{R_T}(x, u *_k v, z) = \mathrm{sub}_{R_T}(x, u, z) *_l \mathrm{sub}_{R_T}(x, v, z)$, where
$k > r_T(x)$, $u, v : S^{p(k)}_T$, $x \in^* u$ or $x \in^* v$ and $l = \max(k, r_T(z) + 1)$. It is clear that
$\mathrm{sub}_{R_A}(aa, aa \wedge b \wedge a, a \wedge b \wedge a \wedge bb \wedge a) = a \wedge b \wedge a \wedge bb \wedge a \wedge bbb \wedge a$ as desired.
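The worked example can be checked mechanically. The following is a deliberately simplified sketch, not the full ranked substitution: it only replaces top-level members of $y$ that are equal to $x$ and does not recurse into members of lower rank. Strings are written over `a` and `b` as in the example, and the names `rank` and `ranked_sub_top` are illustrative.

```python
import re


def rank(x: str, blank: str = "b") -> int:
    """Length of the longest run of the blank symbol occurring in x."""
    runs = re.findall(re.escape(blank) + "+", x)
    return max((len(run) for run in runs), default=0)


def ranked_sub_top(x: str, y: str, z: str, blank: str = "b") -> str:
    """Substitute z for x among the top-level members of y (simplified)."""
    if x == y:
        return z
    k = rank(y, blank)
    if rank(x, blank) >= k:
        return y  # x is distinct from y and cannot be one of its members
    members = y.split(blank * k)  # top-level members have rank below k
    if x not in members:
        return y
    new_rank = max(k, rank(z, blank) + 1)  # separator must outrank z
    return (blank * new_rank).join(z if m == x else m for m in members)
```

With `x = "aa"`, `y = "aaba"` and `z = "ababba"`, the separator is raised from `b` to `bbb`, so that $z$ remains a single member of the result.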


The familiar operation of substituting a term t for the occurrences of a variable x in
another term u (and, by extension, in an equation ϕ) will be a variant of ranked substitution
in this typed setup since terms are constituted partially through sequence iteration.
In this context the history principle is expressed by the introduction of the operation
$\mathcal{H} : N \to S_N$ with the following properties: $\mathcal{H}(0) = 0$; $\mathcal{H}(s(n)) = \mathcal{H}(n) \wedge \square_N \wedge s(n)$. It is a
basic finitistic insight that $m \in \mathcal{H}(n)$ if and only if $m \le n$. Furthermore, $\ell_N(\mathcal{H}(n)) = n + 1$
and $(\mathcal{H}(n))_{k+1} = k$ for $k \le n$.
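A minimal sketch of the history operation on tally numerals follows. It assumes that zero is written `0`, that the successor of `0` is `1` and otherwise appends a stroke, and that `|` stands in for the break symbol; these encodings are assumptions made for the illustration.

```python
BLANK = "|"  # hypothetical stand-in for the break symbol between numerals


def succ(n: str) -> str:
    """Successor on tally numerals: s(0) = 1, otherwise append a stroke."""
    return "1" if n == "0" else n + "1"


def pred(n: str) -> str:
    """Predecessor on tally numerals, with p(0) = 0."""
    return "0" if n in ("0", "1") else n[:-1]


def history(n: str) -> str:
    """H(0) = 0; H(s(n)) is H(n) followed by the break symbol and s(n)."""
    if n == "0":
        return "0"
    return history(pred(n)) + BLANK + n
```

The members of `history("111")` are exactly the numerals `0`, `1`, `11`, `111`, in agreement with the two properties stated above.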
Using types a finitistic choice principle can be defined as follows: if $x : S_{S_T}$ and
$f : T \to T$ are such that if $y \in (x)_k$ for $1 \le k < \ell_{S_T}(x)$, then $f(y) \in (x)_{k+1}$ (the $f$-image
of $(x)_k$ is included in $(x)_{k+1}$), then for every $u \in (x)_1$ there exists $z \in S_T$ with $(z)_1 = u$,
$\ell_T(z) = \ell^2_T(x)$, $(z)_{k+1} = f((z)_k)$ for $1 \le k < \ell_T(z)$. This sequence is obtained by
making $\ell_{S_T}(x)$ successive appropriate choices from the members of $x$ (the first choice
determining an element $u \in (x)_1$ and the subsequent ones using the function $f$). We write
(u, x, f ) for the final element of z.
We can now proceed to justify versions of the two equational theories ℒ2 and ℇ2 intro-
duced in Rose (1984, pp. 140–148), denoted by ℒ2 and ℇ2 respectively. The language of
ℒ2 is based on a set of symbols that includes $x$, $f$, $0$, $1$, $a$, $C$, $\Sigma$, $=$ (plus a number of blank
symbols $\square_T$ for the various syntactical types introduced). These syntactical types are
introduced in analogy with sequential types above: Var (variables), Fe (function expressions),
F (functionals), Tr (terms) and finally Eq (equations). Words of form $a^n$ play the role of
indices distinguishing variables (expressions of form $xa^n$) and function expressions (of form
$a^m f a^n$, where the first index determines arity). We write $s$ for $afa$, $p$ for $afaa$, $z$ for $afaaa$,
$\pi_{m,k}$ for $a^m f a^{k+4}$ ($1 \le k \le m$), $+$ for $aafaaa$ and $\dot{-}$ for $aafaaaa$. A complication arises
in connection with functionals and terms, since they are constituted through the iterated
forming of sequences and we operate with a finite supply of blank symbols, but it can be
overcome by ranking such expressions and using the word $\square^n$ to separate expressions of
rank n in a new sequence. A functional obtained through the application of the composition
operator on a sequence of functionals of rank n will have rank n + 1, and so on (the details
of the syntactic types are straightforward but tedious; for example, the arity Ar( f ) of a
functional f is easily handled by counting).
Certain equations are taken as axioms (type Ax), describing the basic properties of the
functions $s$, $p$, $z$, $\pi_{m,n}$, $+$, $\dot{-}$ and of functions defined by means of composition and
summation. The complete list is as follows:

1a) $p(0) = 0$; 1b) $p(s(x_1)) = x_1$;
2) $z(x_1) = 0$;
3m,n) $\pi_{m,n}(x_1, \ldots, x_n) = x_m$ ($1 \le m \le n$);
4a) $x_1 + 0 = x_1$; 4b) $x_1 + s(x_2) = s(x_1 + x_2)$;
5a) $x_1 \dot{-} 0 = x_1$; 5b) $x_1 \dot{-} s(x_2) = p(x_1 \dot{-} x_2)$;
6) $C[g, f_1, \ldots, f_m](x_1, \ldots, x_n) = g(f_1(x_1, \ldots, x_n), \ldots, f_m(x_1, \ldots, x_n))$ (where
$\mathrm{Ar}(g) = m$ and $\mathrm{Ar}(f_j) = n$ for $1 \le j \le m$);
7a) $\Sigma f(x_1, \ldots, x_n, 0) = f(x_1, \ldots, x_n, 0)$;
7b) $\Sigma f(x_1, \ldots, x_n, s(x_{n+1})) = \Sigma f(x_1, \ldots, x_n, x_{n+1}) + f(x_1, \ldots, x_n, s(x_{n+1}))$.

Axiom schema (6) expresses a principle of composition so obvious to finitists that I
will not try to analyze it. Axiom schema (7), characterizing summation, can be justified by
means of the principles about finitistic types described above.
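To see how the axioms determine values, they can be read as recursive definitions. The sketch below interprets them over Python integers rather than tally strings, which is a simplification made for readability; the names `proj`, `monus`, `compose`, and `summation` are illustrative.

```python
def s(x: int) -> int:
    """Successor."""
    return x + 1


def p(x: int) -> int:
    """Predecessor, axioms 1a and 1b."""
    return 0 if x == 0 else x - 1


def z(x: int) -> int:
    """Constant zero function, axiom 2."""
    return 0


def proj(m: int, n: int):
    """Projection of the m-th of n arguments, axiom schema 3."""
    def pi(*xs: int) -> int:
        if len(xs) != n:
            raise ValueError("wrong number of arguments")
        return xs[m - 1]
    return pi


def add(x: int, y: int) -> int:
    """Addition by recursion on the second argument, axioms 4a and 4b."""
    return x if y == 0 else s(add(x, p(y)))


def monus(x: int, y: int) -> int:
    """Modified subtraction, axioms 5a and 5b."""
    return x if y == 0 else p(monus(x, p(y)))


def compose(g, *fs):
    """Composition operator C[g, f_1, ..., f_m], axiom schema 6."""
    return lambda *xs: g(*(f(*xs) for f in fs))


def summation(f):
    """Summation operator on the last argument, axioms 7a and 7b."""
    def sigma(*args: int) -> int:
        *xs, y = args
        if y == 0:
            return f(*xs, 0)
        return add(sigma(*xs, p(y)), f(*xs, y))
    return sigma
```

For example, `summation(proj(2, 2))(7, 3)` adds the values 0, 1, 2, 3 of the second projection.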
The rules of inference are the same as those for $\mathrm{PRA}_G$, that is, $S_1$, $S_2$, $T$, and $U_1$. Proofs
(type Pf) are sequences of equations, that is, strings of type $S_{Eq}$, such that $x : \mathrm{Pf}$ if and
only if for every $k \le \ell_{Eq}(x)$, either $(x)_k : \mathrm{Ax}$, or there exist $i_1, i_2, i_3 < k$ such that $(x)_k$
is inferred from $(x)_{i_1}$, $(x)_{i_2}$, $(x)_{i_3}$ by one of the rules indicated. The key inference rule $U_1$
can be justified by means of the combination of the history and replacement principles
described in Section 3.
We thus have a complete justification for the language, axioms, and principles of infer-
ence of ℒ2 , a variant of the ℒ2 -arithmetic developed in Rose (1984, pp. 139–148). In ℒ2
one can define the sentential operators (in the way already described in connection with
PRAG ) and derive the inference rules of classical logic with respect to them. We can thus
assume that ℒ2 has an extended language including the logical constants and is equipped
with classical predicate logic with equality (minus the rules associated with quantifiers).
In trying to extend the theory with an operator for bounded recursion $R_b$ and its
corresponding axiom schema, we face the following difficulty, emphasized by Cook in a
discussion of a bounded recursion schema on notation (see Cook (1975, p. 85)). Supposing
that we admit the intuitive operations $g(\vec{x})$, $g'(\vec{x}, y)$, and $h(\vec{x}, y, z)$, we would like to be
able to introduce a new operation $f(\vec{x}, y)$^19 satisfying the conditions $f(\vec{x}, 0) = g(\vec{x})$
and $f(\vec{x}, s(y)) = h(\vec{x}, y, f(\vec{x}, y))$ provided that $f(\vec{x}, y) \le g'(\vec{x}, y)$. But how is one to
finitistically prove the theorem $f(\vec{x}, y) \le g'(\vec{x}, y)$ if it constitutes a necessary condition
for the introduction of the functional symbol $f$?
Following Cook’s solution to this difficulty, we can introduce orders for function symbols
and proofs and demand that prior to introducing a function symbol $f$ of order $m + 1$
satisfying the recursive conditions above, one must give proofs of order $m$ (containing
function symbols of order at most $m$) of the theorems $g(\vec{x}) \le g'(\vec{x}, 0)$ and
$z \le g'(\vec{x}, y) \supset h(\vec{x}, y, z) \le g'(\vec{x}, s(y))$. The theory ℇ2 then results by adding to ℒ2 the function operator
$R_b$ for bounded recursion and the schema

8) If $g$, $g'$, $h$ are functionals of order at most $m$ and of arities $n$, $n + 1$,
and $n + 2$, respectively, and $p_1$, $p_2$ are proofs of order $k$ ($k \ge m$) ending
in $g(\vec{x}) \le g'(\vec{x}, 0)$ and $z \le g'(\vec{x}, y) \supset h(\vec{x}, y, z) \le g'(\vec{x}, s(y))$ then
$R_b[g, h, g']$ is a functional of arity $n + 1$ and order $k + 1$ satisfying the
equations
$R_b[g, h, g'](\vec{x}, 0) = g(\vec{x})$ and $R_b[g, h, g'](\vec{x}, s(y)) = h(\vec{x}, y, R_b[g, h, g'](\vec{x}, y))$.

19 If the language of the theory is enriched with the operator $R_b$, then $f$ would be represented as
$R_b[g, h, g']$.

(8) is justifiable in terms of the finitistic choice principle, for given a sequence argument
$\vec{m}, n$ one can obtain the sequence of sequences $\Theta(n) = \mathcal{H}(g'(\vec{m}, 0)), \mathcal{H}(g'(\vec{m}, s(0))), \ldots,
\mathcal{H}(g'(\vec{m}, n))$. This array of numbers includes all the values of the function $R_b[g, h, g']$
for arguments $\vec{m}, c$, where $c \le n$. Identifying $R_b[g, h, g'](\vec{m}, n)$ in the last component
of $\Theta(n)$ can be done through $n + 1$ choices, corresponding to the successive values
$R_b[g, h, g'](\vec{m}, 0) = g(\vec{m}) \le g'(\vec{m}, 0)$, $R_b[g, h, g'](\vec{m}, s(0)) = h(\vec{m}, 0, R_b[g, h, g'](\vec{m}, 0)) \le g'(\vec{m}, s(0))$,
$\ldots$, $R_b[g, h, g'](\vec{m}, n) \le g'(\vec{m}, n)$. Using the notation previously
introduced, $R_b[g, h, g'](\vec{m}, n)$ is the final element of the sequence selected from $\Theta(n)$ by these
choices, starting from $g(\vec{m})$ in $\mathcal{H}(g'(\vec{m}, 0))$ and continuing by means of $h$. The crucial difference
between primitive recursion and bounded recursion is that in the case of the latter the value
of the newly introduced function belongs to a collection of numbers whose existence we
accept every time we represent the corresponding argument.
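The justification can be paraphrased as a procedure: each successive value is looked up in the history of the corresponding bound. The sketch below uses Python integers in place of tally numerals and a single parameter `m` in place of the argument sequence; both are simplifications, and the names `history` and `bounded_recursion` are illustrative.

```python
from typing import Callable, List


def history(n: int) -> List[int]:
    """The numbers 0, 1, ..., n implicit in representing n."""
    return list(range(n + 1))


def bounded_recursion(
    g: Callable[[int], int],
    h: Callable[[int, int, int], int],
    bound: Callable[[int, int], int],
) -> Callable[[int, int], int]:
    """f(m, 0) = g(m); f(m, y + 1) = h(m, y, f(m, y)), with f(m, y) <= bound(m, y)."""
    def f(m: int, n: int) -> int:
        value = g(m)
        if value not in history(bound(m, 0)):
            raise ValueError("g(m) is not in the history of bound(m, 0)")
        for y in range(n):
            value = h(m, y, value)
            if value not in history(bound(m, y + 1)):
                raise ValueError("value escapes the history of the bound")
        return value
    return f


# Multiplication as an instance: f(m, y) = m * y, bounded by itself.
times = bounded_recursion(lambda m: 0, lambda m, y, v: v + m, lambda m, y: m * y)
```

A recursion whose values outgrow the stated bound is rejected rather than silently computed, which reflects the role of the bounding condition in the schema.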
It is not known whether the addition of schema (8) genuinely strengthens ℒ2 . It may be
that the two equational theories presented above are equivalent since it is an open problem
whether the algebra of lower elementary functions used by ℒ2 is closed under bounded
recursion. Evidence regarding the considerable mathematical strength of ℇ2 is given in
Berarducci & Intrigila (1991) and Cornaros (1995)20 : the combinatorics it can develop is
enough for proofs of classical results in number theory such as Bertrand’s postulate or
quadratic reciprocity. It would be interesting to test the theory with respect to Dirichlet’s
theorem on primes in arithmetical progressions.
What is a number? Originally, sequences of tally marks
were used to count things. Then positional notation
—the most powerful achievement in mathematics —
was invented.
Edward Nelson.

§5. Binary Arithmetic. I do not claim that the theory ℇ2 is the strongest arithmetical
formalism that can be justified in clearly finitistic terms. The foundation offered to the
theories in Section 4 suffers from an obvious limitation: the numerals considered are
homogeneous strings of a single symbol—following Hilbert’s original views, the objects
of arithmetic are taken there to be tally numerals. Mastering communication at even a
basic level usually presupposes producing and recognizing messages (strings of symbols)
from a richer alphabet, and in introducing theories ℒ2 and ℇ2 I have used facts about
our intuitions concerning such objects. Hilbert’s Cogito is thus probably compatible with
stronger theories that conceive of numbers as written in positional notation, but identifying
such theories is not easy.
An obvious candidate for a finitistically acceptable extension of ℇ2 that utilizes po-
sitional notation is Cook’s theory PV (see Cook (1975), Cook & Urquhart (1993) and
Krajíček (1995, pp. 76–78)). This is an equational theory that describes polynomial-time
computable arithmetical functions, understood as operations over binary numerals (an
inessential variant uses dyadic numerals). It is undeniable that the basic objects and functions
assumed in this theory do have intuitive content and can be integrated within the typed
setup introduced in Section 4. One minor problem with this integration is that while we
write expressions from left to right, binary numerals are usually interpreted as being written
from right to left, in the sense that digits increase in significance from right to left. In 100,
for example, the first digit is considered 0, the second 0, and the third 1. Furthermore, we
have to take into account that 0 is not allowed as the leading (leftmost) digit of a binary
numeral.

20 The articles in question describe the theory Iℇ2, but it can be shown that this theory is a
conservative extension of ℇ2 using a cut elimination argument; see theorem 1.4.2 in Buss
(1998, p. 111).
One way to deal with this problem is to define the successor functions such that $s_0^B(0) = 0$,
$s_1^B(0) = 1$ and $s_0^B(x) = x \wedge 0$, $s_1^B(x) = x \wedge 1$ for $x : E$, $x \ne 0$. Then binary numerals
are defined by the conditions $0 : B$ and $s_i^B(x) : B$ for $x : B$ ($i = 0, 1$). The function
$x(i)$ (giving the $i$-th symbol of the string $x$) also receives a special version for $B$: we set
$x_B(i) = x((\ell(x) + 1) \dot{-} i)$. Usually, $x_B(i)$ is written $\mathrm{Bit}(i, x)$; it is the $i$-th digit (bit) of
the numeral $x$ in the increasing order of significance.
The original version of the theory PV presented in Cook (1975) and Krajíček’s
variant use as basic operations (beside the two successor functions $s_0^B$ and $s_1^B$) the
truncation function Tr, concatenation $\frown$ (a slightly different function from $\wedge$), the ‘smash’
function #, and the function Less satisfying $\mathrm{Tr}(0) = 0$, $\mathrm{Tr}(s_i^B(x)) = x$, $x \frown 0 = x$,
$x \frown s_i^B(y) = s_i^B(x \frown y)$, $x \# 0 = 0$, $x \# s_i^B(y) = (x \# y) \frown x$, $\mathrm{Less}(x, 0) = x$ and
$\mathrm{Less}(x, s_i^B(y)) = \mathrm{Tr}(\mathrm{Less}(x, y))$ (with $i = 0, 1$). All of these operations correspond to the usual
Hilbertian ‘thought experiments’ on strings and are therefore acceptable from a finitistic
point of view as single acts of intuition.
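The defining equations of these basic operations can be transcribed almost literally. The sketch below represents binary numerals as Python strings without leading zeros, with `"0"` for zero; the function names are illustrative, and the recursion follows the equations rather than aiming at efficiency.

```python
def s(i: int, x: str) -> str:
    """Binary successors: s_0(0) = 0, s_1(0) = 1, otherwise append digit i."""
    return str(i) if x == "0" else x + str(i)


def tr(x: str) -> str:
    """Truncation: Tr(0) = 0 and Tr(s_i(x)) = x."""
    return "0" if len(x) == 1 else x[:-1]


def concat(x: str, y: str) -> str:
    """PV concatenation: x . 0 = x and x . s_i(y) = s_i(x . y)."""
    if y == "0":
        return x
    return s(int(y[-1]), concat(x, tr(y)))


def smash(x: str, y: str) -> str:
    """Smash: x # 0 = 0 and x # s_i(y) = (x # y) . x."""
    if y == "0":
        return "0"
    return concat(smash(x, tr(y)), x)


def less(x: str, y: str) -> str:
    """Less(x, 0) = x and Less(x, s_i(y)) = Tr(Less(x, y))."""
    if y == "0":
        return x
    return tr(less(x, tr(y)))


def leq(x: str, y: str) -> bool:
    """The order used in the text: x <= y is read as Less(x, y) = 0."""
    return less(x, y) == "0"
```

Under these equations `smash(x, y)` concatenates one copy of `x` for every digit of `y`, and `less(x, y)` removes as many trailing digits from `x` as `y` has digits.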
However, the binary versions of functions such as addition, multiplication, modified
subtraction, and even (simple) successor (i.e., addition with 1) are not as intuitive as their
tally counterparts, and the same is true about bounded summation. Binary addition and
binary multiplication can no longer be identified with concatenation or substitution on
binary strings. We could try to introduce them and prove their essential properties by new
definition procedures and inference rules. What could these be? Which elements of the
typed setup from Section 4 can be used in the new context?
The guiding idea in stating the history principle was to find the collection of tally numer-
als whose acceptance is implicit in the production of an arbitrary tally numeral. Adopting
the same point of view it is easy to find a binary counterpart for the history principle: if
$x : B$ then $\mathcal{H}_B(x) : S_B$ will be the sequence of all the initial segments of $x$ ordered by
the $\sqsubseteq$ relation plus 0 as the initial element. If $x = 0$, then $\mathcal{H}_B(0) = 0$; $\mathcal{H}_B(s_i^B(x)) =
\mathcal{H}_B(x) * s_i^B(x)$. If $x \ne 0$, then we will have $\ell(x) + 1 = \ell_B(\mathcal{H}_B(x))$ and $(\mathcal{H}_B(x))_i \sqsubseteq
(\mathcal{H}_B(x))_j$ if $1 < i \le j \le \ell_B(\mathcal{H}_B(x))$. When constructing a binary numeral $x$ one does
not have to construct all the binary numerals smaller than it (in the sense that they denote
smaller numbers than x, or precede it in the lexicographical ordering of binary numerals)
but simply those that are initial segments of x. The string 10101 is not implicit in the
construction of 111101111 in the same way that 11 is implicit in the construction of 1111.
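The contrast between the two kinds of history is easy to check on examples. This sketch represents binary numerals as Python strings and returns the binary history as a list; the name `binary_history` is illustrative.

```python
from typing import List


def binary_history(x: str) -> List[str]:
    """0 followed by all initial segments of the binary numeral x."""
    if x == "0":
        return ["0"]
    return ["0"] + [x[:k] for k in range(1, len(x) + 1)]
```

The history of a numeral with n digits has n + 1 members, so it grows linearly with the length of the numeral and not with the number it denotes.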
The replacement principle can be used without modification and in conjunction with the
new version of the history principle it can provide a justification of the so-called induction
on notation, that is, of (all the cases of) the inference schema $U_B$:

$f(\vec{x}, 0) = g(\vec{x}, 0)$, $f(\vec{x}, s_i^B(y)) = h_i(\vec{x}, y, f(\vec{x}, y))$, $g(\vec{x}, s_i^B(y)) = h_i(\vec{x}, y, g(\vec{x}, y))$
$\therefore f(\vec{x}, y) = g(\vec{x}, y)$

where $f$, $g$, $h_i$ (with $i = 0, 1$) are functionals of appropriate arities.
Unfortunately we meet a stumbling block when we try to adapt the schema of bounded
recursion to binary numerals and complete the theoretical structure of PV. The so-called
bounded recursion on notation definitional schema ($R_B$), essential in the structure of PV,
introduces a function $f(\vec{x}, y)$ using functions $g(\vec{x}, y)$, $h_0(\vec{x}, y, z)$, and $h_1(\vec{x}, y, z)$ by the
conditions

$f(\vec{x}, 0) = g(\vec{x}, 0)$, $f(\vec{x}, s_i^B(y)) = h_i(\vec{x}, y, f(\vec{x}, y))$ (with $i = 0, 1$),

provided that $f(\vec{x}, y) \le g'(\vec{x}, y)$ for every $\vec{x}$, $y$ (one can replace this inequality by a different
sufficient condition that does not mention $f$ directly). Here $x \le y$ will be understood
as $\mathrm{Less}(x, y) = 0$.
The trouble is that the finitistic choice principle cannot be invoked in the same way it
was in the justification of the bounded recursion schema for functions on tally numerals.
The history sequence for a binary numeral does not include all binary numerals smaller
than it and therefore it is hard to imagine how starting from the intuition of a sequence of
arguments $\vec{m} \wedge \square_B \wedge n : B^{k+1}$ one can derive the intuition of a sequence of binary numerals
that would include $f(\vec{m}, n)$ even if the limiting function $g'$ is available. The point is that
$\mathcal{H}_B(g'(\vec{m}, n))$ will not necessarily include the string that would be identified with $f(\vec{m}, n)$,
since $f(\vec{m}, n) \le g'(\vec{m}, n)$ does not imply $f(\vec{m}, n) \sqsubseteq g'(\vec{m}, n)$.
Assuming in general the existence of the collection of all binary numerals smaller than
an arbitrary one would help restore the argument in favor of the bounded recursion schema
but would also seem to be a considerable step on the abstraction scale; furthermore, if
this step is taken, the resulting theory would not be PV, but rather elementary arithmetic,
since the collection of binary numerals smaller than the binary numeral $n$ is exponential
in size relative to $n$. More specifically, if we set up $\mathcal{H}_B^{\exp} : B \to S_B$ such that it satisfies
$\mathcal{H}_B^{\exp}(0) = 0$ and $\mathcal{H}_B^{\exp}(s_1^B(s_1^B(x))) = \mathcal{H}_B^{\exp}(x) * s_1^B(s_0^B(x)) * s_1^B(s_1^B(x))$ for every $x : B$,
then the function $\ell_B(\mathcal{H}_B^{\exp}(n)) : N \to N$ is simply the exponential function $2^n : N \to N$
(a sequence of 1’s can be interpreted both as a tally numeral and as a binary one).
Smaller steps on the abstraction scale can be taken, in such a way they do not involve
such a radical departure from the demands of intuition. Let us recall Tait’s minimal criterion
for understanding Number: it is the ability to count finite collections using numbers as
representations for those collections. Applied to binary strings it tells us that having a
minimal understanding of their arithmetical nature presupposes the ability to use them
in counting—say, in counting symbols in tally numerals. We are thus led to postulating
the intuitive character of the function $\nu_B : N \to B$ satisfying $\nu_B(0) = \nu_B(1) = 1$,
$\nu_B(s(s(0)) \cdot n) = s_0^B(\nu_B(n))$, $\nu_B((s(s(0)) \cdot n) + 1) = s_1^B(\nu_B(n))$ for every $n : N$. Usually
the function $\nu_B(\ell(x))$ is written $|x|$; it is the binary numeral for the number of symbols
in the expression $x$ (the binary length of $x$).
It seems natural to claim that if we can produce (intuit) an expression x: E then we can
also count its symbols in binary, that is, produce |x|. But when we count in binary, we do
go through all the consecutive numerals leading up to the final total and thus we are led to
a more modest binary history principle: whenever we intuit a binary numeral $x$, we also
intuit the collection of all binary numerals smaller than or equal to $|x|$. $\mathcal{H}_B^{\log} : B \to S_B$
satisfies $\nu_B(\ell_B(\mathcal{H}_B^{\log}(x))) = |x|$ and $(\mathcal{H}_B^{\log}(x))_i = \nu_B(i)$ for every $1 \le i \le \ell_B(\mathcal{H}_B^{\log}(x))$.
The basic finitistic insight regarding the logarithmic history of a binary numeral $x$ is that
$z \in \mathcal{H}_B^{\log}(x)$ if and only if $z \le |x|$.
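A small sketch makes the size of the logarithmic history concrete. Binary numerals are Python strings, the built-in `bin` plays the role of the conversion from counts to binary numerals, and the names `binary_length` and `log_history` are illustrative assumptions.

```python
from typing import List


def binary_length(x: str) -> str:
    """|x|: the binary numeral for the number of symbols in x."""
    return bin(len(x))[2:]


def log_history(x: str) -> List[str]:
    """The binary numerals 1, 10, 11, ... passed through when counting x."""
    return [bin(i)[2:] for i in range(1, len(x) + 1)]
```

The last member of `log_history(x)` is `binary_length(x)`, and the number of members equals the number of symbols of `x`.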
This logarithmic history principle (a second version of a binary history principle) supports
the so-called sharply bounded recursion on notation schema ($R^s_B$), identical to $R_B$
except for the stronger condition $f(\vec{x}, y) \le |g'(\vec{x}, y)|$. The argument for $R^s_B$ would proceed
by analogy with the argument for the bounded recursion schema for functions on tally
numerals: from the representation of $n$ derive the representation of $\mathcal{H}_B(n)$; replace every
element of this sequence with $g'(\vec{m}, n)$; make the $|n|$ choices from the elements of this last
sequence (using the functions $g$, $h_0$, and $h_1$) to ultimately select the value of $f(\vec{m}, n)$.
It turns out that once $R^s_B$ is in place we are positioned to identify a clear candidate for
the algebra of finitistically acceptable functions on binary numerals: it is the set ℱLOGSPACE
of functions of polynomial growth whose bitgraphs are computable in logarithmic space
(starting from work by J. Lind, this complexity class is characterized as a functional
algebra in Clote (1999), theorem 3.22). ℱLOGSPACE is proved to be the closure of the set
$\{z, I, s_0^B, s_1^B, \#, \mathrm{Bit}\}$ (where $z$ is the constant 0 function and $I$ is the set of all projection
functions) under the operations of composition, sharply bounded recursion on notation
and concatenation recursion on notation (CRN). CRN defines a function $f(\vec{x}, y)$ from
functions $g(\vec{x}, y)$, $h_0(\vec{x}, y)$, and $h_1(\vec{x}, y)$ with the property that $h_i(\vec{x}, y) \le 1$ ($i = 0, 1$) by
the conditions

$f(\vec{x}, 0) = g(\vec{x}, 0)$,

$f(\vec{x}, s_i^B(y)) = s^B_{h_i(\vec{x}, y)}(f(\vec{x}, y))$ ($i = 0, 1$).
This last ingredient of the characterization theorem can also be easily justified with the
apparatus elaborated in Section 4. First, let us remark that the functions $h_i(\vec{x}, y)$ can be
considered of type $S_B \to N$ (since their only possible values are 0, 1). Second, the function
$\mathrm{Last} : B \to N$ defined by $\mathrm{Last}(x) = (x)_{\ell(x)}$ gives the last digit of the binary numeral $x$,
that is, $(0)_{\ell(0)} = 0$ and $(s_i^B(x))_{\ell(s_i^B(x))} = i$ (with $i = 0, 1$). Third, let $u : S_B \to N$ be the
function defined by $u(\vec{x}, y) = h_0(\vec{x}, y) \cdot \overline{\mathrm{sg}}(\mathrm{Last}(y)) + h_1(\vec{x}, y) \cdot \mathrm{sg}(\mathrm{Last}(y))$. Suppose that
$\vec{m}, n : S_B$; applying the first binary history principle we obtain the existence of the sequence
$\mathcal{H}_B(n) : S_B$. By the replacement principle there exists the sequence $u_S(\mathcal{H}_B(n)) : S_N$
obtained from $\mathcal{H}_B(n)$ by replacing each member $k \in \mathcal{H}_B(n)$ by $u(\vec{m}, k)$. $f(\vec{m}, n)$ is simply
$g(\vec{m}, 0) \wedge \mathrm{sub}(\square_B, \varepsilon, u_S(\mathcal{H}_B(n)))$.
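Concatenation recursion on notation can be sketched as a higher-order function. Binary numerals are Python strings without leading zeros, the parameter `x` stands for the whole argument sequence, and the names `crn` and `complement` are illustrative; the example function is an assumption chosen for the demonstration, not one discussed in the text.

```python
from typing import Any, Callable


def s(i: int, x: str) -> str:
    """Binary successors: s_0(0) = 0, s_1(0) = 1, otherwise append digit i."""
    return str(i) if x == "0" else x + str(i)


def tr(x: str) -> str:
    """Truncation: drop the last digit, with Tr(0) = 0."""
    return "0" if len(x) == 1 else x[:-1]


def crn(
    g: Callable[[Any, str], str],
    h0: Callable[[Any, str], int],
    h1: Callable[[Any, str], int],
) -> Callable[[Any, str], str]:
    """f(x, 0) = g(x, 0); f(x, s_i(y)) = s_{h_i(x, y)}(f(x, y))."""
    def f(x: Any, y: str) -> str:
        if y == "0":
            return g(x, "0")
        prefix = tr(y)
        h = h1 if y[-1] == "1" else h0
        bit = h(x, prefix)
        if bit not in (0, 1):
            raise ValueError("h_0 and h_1 must take values 0 or 1")
        return s(bit, f(x, prefix))
    return f


# Digitwise complement: append 1 for a digit 0 and 0 for a digit 1.
complement = crn(lambda x, y: "0", lambda x, y: 1, lambda x, y: 0)
```

Each step appends a single digit determined by $h_0$ or $h_1$, so the length of the value never exceeds the length of the recursion argument by more than the length of the base value.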
An equational theory B can be organized on the basis of this algebra (in analogy with
PV being supported by the algebra of polynomial-time computable functions) that clearly
defines functions not in ℇ2 (e.g., #). However, it is very doubtful that ℇ2 ⊆ ℱLOGSPACE
(the initial functions of ℇ2 are in ℱLOGSPACE , but bounded recursion is a powerful resource
that may be impossible to replicate in ℱLOGSPACE or even in ℱPTIME ) and therefore it is
not clear that B represents a genuine strengthening of ℇ2 . It turns out that integrating
our arithmetical intuitions regarding unary and binary numerals in a unified theoretical
framework is not that simple! We are led to ask if there is a natural functional algebra that
includes both ℇ2 and ℱLOGSPACE and whose operations can be seen as finitistic ‘thought
experiments’ on strings.
Thus we have not reached a clear limit to the possibilities of intuitive arithmetical
theorizing. There may very well be ways to strengthen B and ℇ2 while remaining within
the confines imposed by Hilbert’s Cogito, and gauging the exact strengths of the various
finitistically justifiable theories could depend on difficult problems in complexity theory.
However, it is doubtful that elementary arithmetic could be shown to be compatible with
the demands of our fundamental syntactical intuition. If the existence of the exponential
function is justifiable intuitively (where the intuition in question is the ability to use strings
of symbols for the purposes of communication and inference) then it would turn out that
implicit in the intuition of any string of symbols x : ℒ (where ℒ is the language used to
formalize mathematical practice) is the intuition of the collection E(x) of all strings of
symbols of length equal to x, the vast majority of which are useless for the purposes of
communication within ℒ. The apparent incompatibility between Hilbert’s Cogito and the
exponential function thus adds further support to the view expressed in Nelson (1986)
that exponentiation belongs to the realm of abstract mathematics. Even if the boundary
between intuitive and nonintuitive arithmetic is a fuzzy one, elementary arithmetic (the
equational theory of the algebra of Kalmár elementary functions) seems to lie decidedly
on the nonintuitive side.
It is probably unreasonably optimistic to expect that we may reach a precise formal
characterization of a concept such as that of intuitive arithmetic. The analysis in this
paper seems to indicate that we may uncover a collection of finitistic arithmetics, in which
theories gain in strength as they lose their intuitive character and may not be ordered in a
simple hierarchy. Hilbert’s Cogito can ultimately be interpreted as the demand to describe
the structure of this collection.

§6. Acknowledgments. This work originates in the author’s doctoral dissertation
(Ganea, 2005). I am grateful to my thesis advisor Bill Hart for his guidance (among other
things, I owe him the phrase ‘Hilbert’s Cogito’) and to Charles Parsons for illuminating
conversations on the topic of finitism. I also thank Richard Zach and an anonymous referee
for comments and suggestions that have significantly improved this paper.

BIBLIOGRAPHY
Beeson, M. (1986). Proving programs and programming proofs. In Barcan Marcus, R.,
Dorn, G.J.W., and Weingartner, P., (eds.), Logic, Methodology and Philosophy of
Science VII, proceedings of the International Congress, Salzburg, 1983, Amsterdam:
North-Holland, pp. 51–81.
Berarducci, A., & Intrigila, B. (1991). Combinatorial principles in elementary number
theory. Annals of Pure and Applied Logic, 55, 35–50.
Bernays, P. (1923). Erwiderung auf die Note von Herrn Aloys Müller: Über Zahlen als
Zeichen. Mathematische Annalen, 90, 159–63. English translation in Mancosu (1998),
pp. 223–226.
Bernays, P. (1930). Die Philosophie der Mathematik und die Hilbertsche Beweistheorie.
Blätter für deutsche Philosophie, 4, 326–367. English translation in Mancosu (1998),
pp. 234–265.
Buss, S. (1998). First order proof theory of arithmetic. In Buss, S., editor. Handbook of
Proof Theory. Amsterdam: Elsevier Science BV, pp. 79–147.
Clote, P. (1999). Computation models and function algebras. In Griffor, E. R., editor.
Handbook of Computability Theory. Amsterdam: Elsevier, pp. 589–681.
Cook, S. (1975). Feasibly constructive proofs and the propositional calculus. In Chandra,
A.K., Meyer, A.R., Rounds, W.C., Stearns, R.E., Tarjan, R.E., Winograd, S., Young,
P.R., (eds.) Proceedings of the 7th ACM Symposium on the Theory of Computation.
New York: ACM (Association for Computing Machinery), pp. 83–97.
Cook, S., & Urquhart, A. (1993). Functional interpretations of feasibly constructive
arithmetic. Annals of Pure and Applied Logic, 63, 103–200.
Cornaros, C. (1995). On Grzegorczyk induction. Annals of Pure and Applied Logic, 74,
1–21.
Curry, H. (1941). A formalization of recursive arithmetic. American Journal of
Mathematics, 63, 263–282.
Detlefsen, M. (1986). Hilbert’s Program: An Essay on Mathematical Instrumentalism.
Synthese Library 182. Boston, MA: Reidel/Kluwer Academic.
Gandy, R. (1982). Limitations to mathematical knowledge. In van Dalen, D., Lascar, D.,
and Smiley, J., editors. Logic Colloquium ’80. Amsterdam: Elsevier, pp. 129–146.
Ganea, M. (2005). Arithmetic Without Numbers. Doctoral dissertation, Department of
Philosophy, University of Illinois at Chicago.
George, A., & Velleman, D. J. (1998). Two conceptions of natural number. In Dales, H. G.
and Oliveri, G., editors. Truth in Mathematics. New York: Oxford University Press,
pp. 311–327.
Girard, J. -Y. (1987). Proof Theory and Logical Complexity. Naples: Bibliopolis.
Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia Mathematica und
verwandter Systeme I. Monatshefte für Mathematik und Physik, 38, 173–198. English
translation in Gödel (1986), pp. 144–195.
Gödel, K. (1986). Collected Works I. Feferman, S., Dawson, J.W., Jr., Kleene, S.C.,
Moore, G.H., Solovay, R.M., van Heijenoort, J. Oxford: Oxford University Press.
Goodstein, R. (1954). Logic-free characterizations of recursive arithmetic. Mathematica
Scandinavica, 2, 247–261.
Goodstein, R. (1957). Recursive Number Theory—A Development of Recursive Arithmetic
in a Logic-Free Equation Calculus. Studies in Logic and the Foundations of
Mathematics. Amsterdam: North-Holland.
Hallett, M. (1995). Hilbert and logic. In Marion, M., and Cohen, R.S., editors. Quebec
Studies in the Philosophy of Science I. Dordrecht: Kluwer Academic Publishers.
pp. 135–187.
Hilbert, D. (1910). Elemente und Prinzipienfragen der Mathematik. Sommersemester
1910: Ausgearbeitet von Richard Courant. Göttingen, Germany: Mathematisches Institut,
Georg-August Universität. 163 pages, handwritten.
Hilbert, D. (1922). Neubegründung der Mathematik. Erste Mitteilung. Abhandlungen aus
dem Mathematischen Seminar der Hamburgischen Universität, 1, 157–177. English
translation in Mancosu (1998), pp. 198–214.
Hilbert, D. (1925). Über das Unendliche. Mathematische Annalen, 95, 161–190. Lecture
given in Münster, 4 June 1925. English translation in van Heijenoort (1967), pp. 367–
392.
Hilbert, D. (1927). Die Grundlagen der Mathematik. Abhandlungen aus dem
mathematischen Seminar der Hamburgischen Universität, 6 (1928), 65–85. English
translation in van Heijenoort (1967), pp. 464–479.
Hilbert, D., & Bernays, P. (1934). Grundlagen der Mathematik I. Berlin: Springer.
Hilbert, D., & Bernays, P. (1939). Grundlagen der Mathematik II. Berlin: Springer.
Ignjatovic, A. (1994). Hilbert’s program and the omega-rule. Journal of Symbolic Logic,
59(1), 322–343.
Krajı́ček, J. (1995). Bounded Arithmetic, Propositional Logic, and Complexity Theory.
Encyclopedia of Mathematics and Its Applications 60. New York: Cambridge University
Press.
Mancosu, P., editor. (1998). From Brouwer to Hilbert—The Debate on the Foundations of
Mathematics in the 1920’s. New York: Oxford University Press.
Marion, M. (1995). Kronecker’s safe haven of real mathematics. In Marion, M., and
Cohen, R.S., editors. pp. 189–215.
Nelson, E. (1986). Predicative Arithmetic. Mathematical Notes 32, Princeton: Princeton
University Press.
Parsons, C. (1980). Mathematical intuition. Proceedings of the Aristotelian Society, 80,
142–168.
Parsons, C. (1986). Intuition in constructive mathematics. In Butterfield, J., editor.
Language, Mind and Logic. New York: Cambridge University Press, pp. 211–229.
Parsons, C. (1994). Intuition and number. In George, A., editor. Mathematics and Mind.
New York: Oxford University Press, pp. 141–157.
Parsons, C. (1998). Finitism and intuitive knowledge. In Schirn, M., editor. Philosophy of
Mathematics Today. New York: Clarendon Press, pp. 249–270.
Parsons, C. (2008). Mathematical Thought and Its Objects. New York: Cambridge
University Press.
Rose, H. (1962). Ternary recursive arithmetic. Mathematica Scandinavica, 10, 201–216.
Rose, H. (1984). Subrecursion—Functions and Hierarchies. Oxford, UK: Clarendon Press.
Sieg, W. (1999). Hilbert’s programs: 1917–1922. Bulletin of Symbolic Logic, 5(1), 1–44.
Simpson, S. (1988). Partial realizations of Hilbert’s program. Journal of Symbolic Logic,
53(2), 349–363.
Skolem, T. (1923). Begründung der elementaren Arithmetik durch die rekurrierende
Denkweise ohne Anwendung scheinbarer Veränderlichen mit unendlichem Ausdehnungsbereich.
Videnskapsselskapets skrifter, I. Matematisk-naturvidenskabelig klasse
6. English translation in van Heijenoort (1967), pp. 302–333.
Skolem, T. (1950). Some remarks on the foundation of set theory. In Proceedings
of the International Congress of Mathematicians, Cambridge, Massachusetts, U.S.A.,
August 30 to September 6, 1950, American Mathematical Society, Providence 1952,
vol. I, pp. 695–704; reprinted in Skolem, 1970, pp. 519–528.
Skolem, T. (1962). Proof of some theorems on recursively enumerable sets. Notre Dame
Journal of Formal Logic, 3, 65–74.
Skolem, T. (1970). Selected Works in Logic. Fenstad, J. E., editor. Oslo: Universitetsforlaget.
Smoryński, C. (1977). The incompleteness theorems. In Barwise, J., editor. The Handbook
of Mathematical Logic. New York: Elsevier. pp. 821–865.
Tait, W. (1981). Finitism. Journal of Philosophy, 78, 524–546.
Troelstra, A. S., & van Dalen, D. (1988). Constructivism in Mathematics I. Studies in Logic
and the Foundations of Mathematics 121. New York: Elsevier.
van Heijenoort, J., editor. (1967). From Frege to Gödel. Cambridge, MA: Harvard
University Press.
Weyl, H. (1921). Über die neue Grundlagenkrise der Mathematik. Mathematische
Zeitschrift, 10, 39–79. English translation in Mancosu (1998), pp. 86–118.
Zach, R. (1998). Numbers and functions in Hilbert’s finitism. Taiwanese Journal for
Philosophy and History of Science, 10, 33–60.
Zach, R. (2003). The practice of finitism: Epsilon calculus and consistency proofs in
Hilbert’s program. Synthese, 137, 211–259.
Zach, R. (2004). Hilbert’s “Verunglückter Beweis,” the first epsilon theorem and
consistency proofs. History and Philosophy of Logic, 25, 79–94.

DEPARTMENT OF PHILOSOPHY
BOSTON UNIVERSITY
BOSTON, MA 02215
E-mail: mganea@bu.edu
THE REVIEW OF SYMBOLIC LOGIC
Volume 3, Number 1, March 2010

MODELING OCCURRENCES OF OBJECTS IN RELATIONS


JOOP LEO
Department of Philosophy, Utrecht University

Abstract. We study the logical structure of ‘real’ relations, and in particular the notion of
occurrences of objects in a state. We start with formulating a number of principles for occurrences
and defining corresponding mathematical models. These models are analyzed to get more insight
in the formal properties of occurrences. In particular, we prove uniqueness results that tell us more
about the possible logical structures relations might have.

§1. Introduction. Relations are sometimes simply identified with sets of tuples of
certain objects. For mathematical relations this is obviously appropriate since they are sets
of tuples. But what about other relations, like, for example, the love relation? The idea that
the state of Albert’s loving Karen would be nothing but a tuple seems perverse.
According to Fine (2000), the standard view of relations is that the constituents of a
relation always come in a certain order. But then any relation also has a converse relation.
For example, the relation is older than has the converse relation is younger than. Now
consider the state of Thomas’s being older than Nick. If—in a certain context—you regard
this as exactly the same state as Nick’s being younger than Thomas, and you regard this
state as a relational complex of a single underlying relation, then the question is, of which
relation?
On an alternative view of relations any relation comes with orderless positions or
argument-places to which objects can be assigned. For example, for the love relation
we have the positions lover and beloved. Unfortunately, this alternative view is also
problematic. Fine (2000, pp. 16–17) raises two objections. First, positions seem onto-
logically excessive, and second, he regards positions unsuitable for strictly symmetric
relations, like the adjacency relation, since switching positions of arguments would
give a different state. (See Leo, 2008b, for a more detailed discussion of argument-
places).
A very promising view of relations introduced by Fine is based on the notion of sub-
stitution. According to this antipositionalist view the states of a relation form a network
in which substituting objects of a state by other objects yields another state. For example,
substituting Roos for Albert and Bas for Karen in the state of Albert’s loving Karen, results
in the state of Roos’s loving Bas.
In Leo (2008a) we defined models that represent different views of relations. In partic-
ular we defined substitution models. We argued that substitution models adequately model
a large class of ‘real’ relations. However, we also noted that these substitution models
have certain limitations. For example, they have no typed domains for the objects. As a
consequence, they are not accurate for a relation like ‘drinks’, because “Mo drinks tea”
corresponds in a natural way to a state, but “tea drinks Mo” does not. We do not consider
this limitation to be a serious one, since typed domains can be incorporated into the models
in a straightforward way.

Received: October 9, 2009
© Association for Symbolic Logic, 2010
doi:10.1017/S1755020309990347

Substitution models as they were defined might have a more serious shortcoming.
It could be argued that the substitution mechanism of the models is perhaps not refined
enough, since objects are not explicitly substituted for individual occurrences of objects,
but only for objects in a global sense. For example, for the state of Narcissus’s loving
Narcissus, the model did not allow for a substitution resulting in the state of Echo’s loving
Narcissus. Now, if occurrences of objects are a basic notion for relations, then this is a
serious limitation of our substitution models.
But what exactly are occurrences? The notion of occurrences does not have the reputa-
tion of being crystal clear. Mates (1972, p. 49) even called it a “woolly notion.” Occurrences
can be considered in different contexts, for example, in expressions (Wetzel, 1993; Janssen
& Visser, 2004; Kracht, 2007). In this paper we try to get a better grip on the nature of
occurrences of objects in the logical space of ‘real’ relations by developing mathematical
models for relations.1
Our approach is as follows. First, we formulate an initial set of principles for occur-
rences. Then, in light of these principles, we define mathematical models in which occur-
rences have a constitutive function. We perform a technical investigation of these models
to get a better understanding of the formal properties of occurrences. Section 6 is the more
philosophical part of the paper. There we try to say more about the nature of occurrences
and consider the question whether, for ‘real’ relations, models with a refined substitution
mechanism for occurrences of objects are in a relevant sense more complete or adequate
than models with an undifferentiated substitution mechanism. In Section 6 we also briefly
discuss the idea of explicitly distinguishing between states ‘out there’ and relational com-
plexes, and to allow single states to figure as relational complexes in more than one re-
lation. On a first reading of this paper, it might not be a bad idea to peek ahead to this
section.

§2. Basic principles for occurrences. In his paper ‘The Problem of De Re Modality’,
Fine (1989) proposes the development of a general theory of constituent structure, where
the basic structure of the entities involved is given by the operation of substitution. Fine
considers the following syntactic notions to be basic: occurrence of, occurrence in, and
substitution. He gives the following example of a basic principle for these notions:
One basic principle, for example, is that if F′ is the result of substituting
E′ for the occurrence e of E within F, then there is an occurrence e′ of
E′ within F′ such that the result of substituting any expression E″ for e′
within F′ is identical to the result of substituting E″ directly for e in F.
The objective of this paper is not to develop a general theory of constituent structure. We
restrict ourselves to the structure of relations. We postulate the following basic principles
for occurrences of objects in relations:
P-1 Each relation has a nonempty set of states.
P-2 Each state of a relation has exactly one set of occurrences of objects.
P-3 Each occurrence is an occurrence of exactly one object.
P-4 Each occurrence has a type that corresponds to a domain of objects.
P-5 Objects of the same type can simultaneously be substituted for occurrences of
objects in a state.
P-6 Any substitution of objects for occurrences in a state results in exactly one state of
the same relation.
P-7 For any occurrence of an object in a state it makes for some substitution a difference
for the resulting state which object is substituted for it.
P-8 Composition principle: if a substitution in state s results in s′, then there is a
mapping μ from the occurrences in s to the occurrences in s′ such that

(a) μ maps each occurrence α in s to an occurrence of the object substituted for
α and
(b) any substitution in s′ gives the same result as when we substitute in s for each
occurrence α the object that is substituted in s′ for μ(α).

1 By the logical space of a relation we mean the totality of the substitutional interconnections in
which the states of a relation stand to each other.

It should be noted that what is at stake here is substitution of occurrences of objects in
states of relations and not of occurrences of terms in expressions denoting states.
We may depict the composition principle P-8 as follows:

I do not claim that the principles given above are in any sense complete. For example,
one might perhaps like to add that different states have no occurrence in common. Also
from these principles it cannot always be deduced uniquely how many occurrences a state
has. They do not tell us whether the state of Narcissus’ loving Narcissus has one or two
occurrences.
Principle P-8 may seem too weak. The mapping μ is surjective by P-7 and P-8, but
could we not also demand that μ is injective? Unfortunately, for certain relations this seems
problematic, since substitution might perhaps result in a coalescence of occurrences. This
is illustrated by the following three exemplary situations.

1. Consider the variably polyadic relation of waiting for the bus. Take the state of
Janneke and Vincent’s waiting. If we substitute Janneke for Vincent, we expect to
get—if anything—a state with only one occurrence.
2. Consider the ternary relation R where Rabc is the state that a loves b and b loves
c. Assume that the order of the conjuncts is irrelevant. Now suppose that a, b, c are
three different objects, and that each has one occurrence in the state Rabc. Substituting
in this state a for the occurrence of c gives the state Raba. Simultaneously
substituting in the state Rabc, the objects b, a, b for the occurrences of a, b, c gives
the state Rbab, which appears to be the same state as the state Raba. It follows
by Principle P-8 that the state Raba has only one occurrence of a and only one
occurrence of b. So, μ cannot be injective (see Figure 1).
3. As a last example of a possible coalescence consider the conjunction R & R
of a relation R with itself. We may envision that the occurrences of each state
s & s in the conjunction relation correspond one-to-one to the occurrences of s
itself in R.
There may be arguments that can be used to reject a possible coalescence of occurrences.
Against a coalescence for the ternary love relation it may, for example, be objected that
substitution is a more subtle operation than we just seemed to suppose, and that substituting
a for the occurrence of c in the state Rabc does not give the same result as substituting
b, a, b for a, b, c in the state Rabc. In Section 6 we will discuss such an alternative view
in more detail.
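
The coalescence in the second example can be checked mechanically. The following is a minimal sketch, assuming (as the example does) that a state of the ternary love relation is fully determined by its unordered set of conjuncts. The helper name `state` and the encoding of a conjunct as an ordered pair are illustrative assumptions and not part of the formal models defined below.

```python
def state(x, y, z):
    """State 'x loves y and y loves z', with the order of the conjuncts ignored."""
    return frozenset({(x, y), (y, z)})

abc = state("a", "b", "c")
aba = state("a", "b", "a")   # substitute a for the occurrence of c in abc
bab = state("b", "a", "b")   # substitute b, a, b for a, b, c in abc

print(aba == bab)            # the two substitutions yield the same state
print(abc == aba)            # and that state differs from abc
```

Under this encoding the two substitution results are literally the same state, which is the situation in which Principle P-8 rules out an injective μ.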
In the next sections we will study the principles from a technical perspective by defining
and analyzing mathematical models based on them.

§3. Modeling occurrences. We define two types of substitution frames to model the
logical space of relations. In the first type, occurrences play no role and substitution is
defined for objects. In the second type, we have a more refined substitution mechanism
working on occurrences of objects.
3.1. Undifferentiated substitution frames. We call frames with substitution defined
for objects undifferentiated substitution frames. Since this type of frames is extensively
discussed in Leo (2008a), we give the definition without further explanation:
DEFINITION 3.1. An undifferentiated substitution frame is a triple F = ⟨S, O, Σ⟩,
where S is a nonempty set of states, O is a nonempty set of objects, and Σ is a function
from S × O^O to S such that
1. for all s ∈ S, Σ(s, id_O) = s,
2. for all s ∈ S and δ, δ′ ∈ O^O, Σ(Σ(s, δ), δ′) = Σ(s, δ′ ◦ δ).
For convenience, we will often write s ·_F δ or s · δ for Σ(s, δ). Further, we will also often
write f · g for g ◦ f. With this notation, Σ is such that for all s ∈ S and for all δ, δ′ ∈ O^O,
s · id_O = s, and (s · δ) · δ′ = s · (δ · δ′).
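
As an illustration, the two conditions of Definition 3.1 can be verified by brute force on a small finite example. The sketch below assumes a toy version of the love relation in which a state is an ordered pair of objects and substitution acts on objects globally. The object names and the helper names `all_maps` and `subst` are hypothetical choices made for this illustration only.

```python
from itertools import product

O = ("narcissus", "echo", "karen")     # hypothetical finite set of objects
S = list(product(O, repeat=2))         # state (x, y): "x loves y"

def all_maps(dom, cod):
    """All total functions dom -> cod, represented as dicts."""
    return [dict(zip(dom, vals)) for vals in product(cod, repeat=len(dom))]

def subst(s, delta):
    """The substitution function: replace every object of s by its image under delta."""
    return tuple(delta[x] for x in s)

MAPS = all_maps(O, O)
identity = {x: x for x in O}

# Condition 1: substituting the identity leaves every state unchanged.
cond1 = all(subst(s, identity) == s for s in S)

# Condition 2: substituting d1 and then d2 equals substituting d2 o d1.
cond2 = all(
    subst(subst(s, d1), d2) == subst(s, {x: d2[d1[x]] for x in O})
    for s in S for d1 in MAPS for d2 in MAPS
)

# States reachable from Narcissus's loving Narcissus by a single substitution.
reachable = {subst(("narcissus", "narcissus"), d) for d in MAPS}

print(cond1, cond2)
print(("echo", "narcissus") in reachable)
```

In this toy frame every state reachable from Narcissus's loving Narcissus has the same object in both places, which is the limitation of undifferentiated substitution noted in the introduction.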
We now give a definition of the objects of a state. Roughly put, they are the objects
for which it makes a difference for the resulting state which objects are substituted for
them.

Fig. 1. Coalescence of occurrences.


DEFINITION 3.2. Let F = ⟨S, O, Σ⟩ be an undifferentiated substitution frame. We call
A ⊆ O an object-domain of s ∈ S if for every δ, δ′ : O → O,

    δ =_A δ′ ⇒ s · δ = s · δ′.²

We define the core objects of s as:

    Core-ob_F(s) = ⋂{A | A is an object-domain of s}.

If Core-ob_F(s) is an object-domain, then we call this set the objects of s. We denote this
set as Ob_F(s). If Core-ob_F(s) is not an object-domain, then we leave Ob_F(s) undefined.
For each substitution frame we can define its degree as a cardinal number:
DEFINITION 3.3. Let F = ⟨S, O, Σ⟩ be an undifferentiated substitution frame. For a
state s in S, we define the object-degree of s as:

    ob-degree_F(s) = glb {|A| | A is an object-domain of s}.

The object-degree of F we define as:

    ob-degree_F = lub {ob-degree_F(s) | s ∈ S}.

Here |A| denotes as usual the cardinality of A, “glb” denotes the greatest lower bound,
and “lub” denotes the least upper bound. Note that the degree of s and the degree of F
always exist and are indeed cardinal numbers.
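
For a finite frame, object-domains, core objects and object-degrees can be computed by exhaustive search. The sketch below assumes the same toy love relation as before (states are ordered pairs, substitution acts on objects globally). All names in it are hypothetical, and for finite sets glb and lub reduce to `min` and `max`.

```python
from itertools import combinations, product

O = ("a", "b", "c")                    # hypothetical finite set of objects
S = list(product(O, repeat=2))         # state (x, y): "x loves y"
MAPS = [dict(zip(O, vals)) for vals in product(O, repeat=len(O))]

def subst(s, delta):
    """Undifferentiated substitution on the toy love relation."""
    return tuple(delta[x] for x in s)

def is_object_domain(A, s):
    """A is an object-domain of s if maps agreeing on A give the same result on s."""
    return all(
        subst(s, d1) == subst(s, d2)
        for d1 in MAPS for d2 in MAPS
        if all(d1[x] == d2[x] for x in A)
    )

def object_domains(s):
    subsets = [frozenset(c) for r in range(len(O) + 1) for c in combinations(O, r)]
    return [A for A in subsets if is_object_domain(A, s)]

def core_objects(s):
    """Intersection of all object-domains of s (O itself is always one of them)."""
    return frozenset.intersection(*object_domains(s))

def ob_degree(s):
    return min(len(A) for A in object_domains(s))

frame_degree = max(ob_degree(s) for s in S)

print(sorted(core_objects(("a", "a"))), ob_degree(("a", "a")))
print(sorted(core_objects(("a", "b"))), ob_degree(("a", "b")))
print(frame_degree)
```

In this example the core of every state is itself an object-domain, so the objects of each state are defined. Example 3.7 below shows that this need not hold when O is infinite.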
3.2. Differentiated substitution frames. In this subsection we define differentiated
substitution frames that allow for a fine-grained account of substitution in states. We also
discuss the adequacy of these frames, and present some basic properties.
3.2.1. Definition We give a definition of differentiated substitution frames that is based
on the principles for occurrences given in Section 2:
DEFINITION 3.4. A differentiated substitution frame is a tuple G = ⟨S, O, Oc, Γ, Σ⟩,
where S is a nonempty set of states, O is a nonempty set of objects, Oc is a set of
occurrences, Γ is a mapping from Oc to O, and Σ is a function from S × O^Oc to S such that

1. for all s ∈ S, Σ(s, Γ) = s,
2. for all s ∈ S and σ : Oc → O there is a mapping μ : Oc → Oc such that
(a) Γ ◦ μ = σ,
(b) for all σ′ : Oc → O, Σ(Σ(s, σ), σ′) = Σ(s, σ′ ◦ μ).
We say that μ corresponds to (s, σ) → Σ(s, σ). For α ∈ Oc, we say that α is an
occurrence of Γ(α).
Note that 2(a) implies that Γ is surjective.
We will often write s ·_G σ or s · σ for Σ(s, σ). With this notation, Σ is such that (1)
s · Γ = s, and (2) μ · Γ = σ and (s · σ) · σ′ = s · (μ · σ′).
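
To make the definition concrete, here is a minimal sketch of a differentiated substitution frame for a toy love relation. It assumes that an occurrence is a pair (object, position), with position 1 for the lover and 2 for the beloved, and that the mapping from occurrences to objects is the first projection. These encodings and the helper names `subst` and `mu_for` are assumptions made for illustration. The code checks Condition 1 and, for one explicit choice of μ, Conditions 2(a) and 2(b).

```python
from itertools import product

O = ("narcissus", "echo")                      # hypothetical objects
Oc = [(x, i) for x in O for i in (1, 2)]       # occurrence = (object, position)
S = list(product(O, repeat=2))                 # state (x, y): "x loves y"

GAMMA = {a: a[0] for a in Oc}                  # each occurrence is an occurrence of its object
SIGMAS = [dict(zip(Oc, vals)) for vals in product(O, repeat=len(Oc))]

def subst(s, sigma):
    """Substitute for the lover occurrence of x and the beloved occurrence of y."""
    x, y = s
    return (sigma[(x, 1)], sigma[(y, 2)])

def mu_for(sigma):
    """A candidate mu: send occurrence (x, i) to the occurrence (sigma(x, i), i)."""
    return {(x, i): (sigma[(x, i)], i) for (x, i) in Oc}

def check_condition_2(s, sigma):
    mu = mu_for(sigma)
    lifts = all(GAMMA[mu[a]] == sigma[a] for a in Oc)                  # 2(a)
    composes = all(
        subst(subst(s, sigma), s2) == subst(s, {a: s2[mu[a]] for a in Oc})
        for s2 in SIGMAS
    )                                                                  # 2(b)
    return lifts and composes

cond1 = all(subst(s, GAMMA) == s for s in S)
cond2 = all(check_condition_2(s, sigma) for s in S for sigma in SIGMAS)

# Substitute Echo for the lover occurrence of Narcissus only.
sigma = dict(GAMMA)
sigma[("narcissus", 1)] = "echo"
result = subst(("narcissus", "narcissus"), sigma)

print(cond1, cond2, result)
```

Unlike the undifferentiated case, this frame allows a substitution that turns Narcissus's loving Narcissus into Echo's loving Narcissus.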

If s · σ = s′, then we denote the corresponding transition as s →σ s′.

2 For functions f, g we say that f =_X g if f↾X = g↾X, that is, f restricted to X is equal to g
restricted to X.

Since the only type of frames we will discuss in this paper are differentiated and undif-
ferentiated substitution frames, we will often just call them frames when it is clear from
the context what type of frame we are talking about.
With any differentiated substitution frame we can associate a category with objects
all states in S and with morphisms all triples ⟨s, μ, s′⟩ with μ : Oc → Oc satisfying
Conditions 2(a) and 2(b) of the definition of a differentiated substitution frame for some
σ : Oc → O with s · σ = s′.
Note that differentiated and undifferentiated substitution frames are both defined in
the language of second-order logic. In Appendix C we will consider some alternative
definitions of frames that also fulfill the principles in the previous section.
3.2.2. Adequacy of the definition For reasons of simplicity we deliberately chose to let
our definition of a differentiated substitution frame deviate slightly from the principles in
Section 2. The main differences are:

1. Substitution frames do not have typed domains for the objects.
2. The definition does not imply that every state has exactly one set of occurrences.
3. The definition is not completely consistent with the composition principle P-8.

Ad 1. As we remarked in the introduction, typed domains can easily be incorporated
into our models. For our purposes, however, they do not really play a role.
Ad 2. In the next subsection we will define occurrences of states in accordance with
Principle P-7. However, as Example 3.7 will show, our definition of differentiated
substitution frames is too liberal, since occurrences are not always defined for
every state. So, Principle P-2 is not always fulfilled. But I think this should not be
considered to be a serious weakness of our models. Allowing certain borderline
cases might be illuminating. It might even be that Principle P-2 turns out to be
just too strict for certain ‘real’ relations.
Ad 3. The function μ corresponding to a transition s →σ s′ might map the occurrences
of s to a proper superset of the occurrences of s′. Although I do not have an
example of a ‘real’ relation where this is the case, I see no compelling reasons to
exclude the possibility.

A further point to note is that in our models states may share the same occurrences. In
our analysis we will take care to explicitly mention if any of our results depend on this.
3.2.3. Basic properties We define occurrences of states as follows:
DEFINITION 3.5. Let G = ⟨S, O, Oc, Γ, Σ⟩ be a differentiated substitution frame. Then
we call X ⊆ Oc an occurrence-domain of s ∈ S if for all σ, σ′ : Oc → O,

    σ =_X σ′ ⇒ s · σ = s · σ′.

We define the core occurrences of s as:

    Core-oc_G(s) = ⋂{X | X is an occurrence-domain of s}.

If Core-oc_G(s) is an occurrence-domain, then we call this set the occurrences of s, and we
denote it as Oc_G(s). If Core-oc_G(s) is not an occurrence-domain, then we leave Oc_G(s)
undefined.
DEFINITION 3.6. We say that a transition s →σ″ s″ is a composition of s →σ s′ and
s′ →σ′ s″ if σ″ =_X μ · σ′ for some occurrence-domain X of s and some mapping μ
corresponding to s →σ s′.
It may happen that Core-oc(s) is not an occurrence-domain:
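EXAMPLE 3.7. Let G = ⟨S, O, Oc, Γ, Σ⟩ be a differentiated substitution frame with O
an infinite set, Oc = O, Γ = id_O, S the set of subsets of O modulo a finite difference, that
is,
\[ S = \{\widetilde{A} \mid A \subseteq O\} \]
with $\widetilde{A} = \{A' \subseteq O \mid A \mathbin{\triangle} A' \text{ is finite}\}$, and Σ defined by
\[ \widetilde{A} \cdot \sigma = \widetilde{\sigma[A]}.^{3} \]
Σ is well-defined, since for any A, B ⊆ O, if $\widetilde{A} = \widetilde{B}$, then $\widetilde{\sigma[A]} = \widetilde{\sigma[B]}$. Further, G is a
substitution frame, since
1. $\widetilde{A} \cdot \Gamma = \widetilde{\mathrm{id}_O[A]} = \widetilde{A}$ and
2. $(\widetilde{A} \cdot \sigma) \cdot \sigma' = \widetilde{\sigma[A]} \cdot \sigma' = \widetilde{\sigma'[\sigma[A]]} = \widetilde{A} \cdot (\sigma \cdot \sigma')$.
It is not difficult to see that for any A ⊆ O, Core-oc($\widetilde{A}$) is the empty set, but if A is
infinite, then the empty set is not an occurrence-domain of $\widetilde{A}$.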
For undifferentiated substitution frames, a similar example can be given that shows
Core-ob(s) is not always an object-domain (Leo, 2008a, Example 3.16).
LEMMA 3.8. Let G = ⟨S, O, Oc, Γ, Σ⟩ be a differentiated substitution frame. For
every s ∈ S, the occurrence-domains of s form a (possibly nonproper) filter on Oc.
Proof. To prove that the occurrence-domains of s are closed under finite intersection,
let X and X′ be occurrence-domains of s. Let σ, σ′ : Oc → O be such that σ =_{X∩X′} σ′.
Define σ″ : Oc → O by
\[
\sigma''(\alpha) =
\begin{cases}
\sigma(\alpha) & \text{if } \alpha \in X - X',\\
\sigma(\alpha) = \sigma'(\alpha) & \text{if } \alpha \in X \cap X',\\
\sigma'(\alpha) & \text{if } \alpha \in X' - X,
\end{cases}
\]
with σ″(α) chosen arbitrarily for α ∉ X ∪ X′.
Then σ″ =_X σ and σ″ =_{X′} σ′. So, s · σ = s · σ″ = s · σ′. Thus X ∩ X′ is an
occurrence-domain of s.
It is trivial that the occurrence-domains of s are upward closed.
Since an occurrence-domain may be empty, we may have a nonproper filter. □

DEFINITION 3.9. For a differentiated substitution frame G = ⟨S, O, Oc, Γ, Σ⟩, we call
A ⊆ O an object-domain of s ∈ S if for some occurrence-domain X of s,

    A = Γ[X].

We define the core objects of s as:

    Core-ob_G(s) = ⋂{A | A is an object-domain of s}.

If Core-ob(s) is an object-domain, then we call this set the objects of s, and we denote it
as Ob(s). If Core-ob(s) is not an object-domain, then we leave Ob(s) undefined.

3 f[X] = {f(x) | x ∈ X}, the image of X under f.

LEMMA 3.10. Let G = ⟨S, O, Oc, Γ, Σ⟩ be a differentiated substitution frame. For
every s ∈ S, the object-domains of s form a (possibly nonproper) filter on O.
Proof. To prove that the object-domains of s are closed under finite intersection, let A
and A′ be object-domains of s. Then A = Γ[X] and A′ = Γ[X′] for some occurrence-domains
X, X′. So, A ∩ A′ = Γ[X] ∩ Γ[X′] ⊇ Γ[X ∩ X′]. By Lemma 3.8, X ∩ X′ is
an occurrence-domain. It follows that A ∩ A′ is an object-domain of s.
By the surjectivity of Γ and the upward closedness of occurrence-domains, the
object-domains of s are also upward closed.
Since an object-domain may be empty, we may have a nonproper filter. □
In the next lemma we show that (i) the objects of the core-occurrences of a state form a
subset of its core-objects and (ii) if the occurrences of a state exist, then the objects of the
state also exist, and are exactly the objects of the occurrences of the state.
LEMMA 3.11. Let G = ⟨S, O, Oc, Γ, Σ⟩ be a differentiated substitution frame. For
every state s ∈ S,
(i) Γ[Core-oc(s)] ⊆ Core-ob(s).
(ii) If Oc(s) exists, then Ob(s) exists and Γ[Oc(s)] = Ob(s).
Proof. By Definition 3.5 and Definition 3.9,

    Γ[Core-oc(s)] = Γ[⋂{X | X is an occurrence-domain of s}]
                  ⊆ ⋂{Γ[X] | X is an occurrence-domain of s}
                  = Core-ob(s).

To prove the second claim, assume Oc(s) exists. Then Oc(s) is an occurrence-domain of
s. So, Γ[Oc(s)] is an object-domain of s. Thus Core-ob(s) ⊆ Γ[Oc(s)]. By the first claim
of the lemma, we also have the reverse inclusion. So, Core-ob(s) = Γ[Oc(s)]. Because
Γ[Oc(s)] is an object-domain, the claim follows. □
The inclusion of Lemma 3.11 may be a proper inclusion, as the next example shows.
EXAMPLE 3.12. Let G = ⟨S, O, Oc, Γ, Σ⟩ be a differentiated substitution frame with
O = {a, b}, S = {s1, s2}, Oc = O × ω, Γ(x, n) = x, and Σ defined by
\[
s \cdot \sigma =
\begin{cases}
s_1 & \text{if } s = s_1 \text{, and both } \sigma^{-1}(a) \text{ and } \sigma^{-1}(b) \text{ are infinite},\\
s_2 & \text{otherwise.}
\end{cases}
\]
We see that Ob(s1) = {a, b}, but Core-oc(s1) = ∅ and Oc(s1) is undefined.
The next lemma expresses the core-occurrences of a state in terms of single substitutions:
LEMMA 3.13. If in Core-oc(s) the number of occurrences of each object is finite, then

    Core-oc(s) = {α | ∃b ∈ O [ s · Γ[α → b] ≠ s ]}.⁴

Proof. If O has just one element, then it is obviously true. So, assume that O has at least
two elements. Consider α0 ∈ Core-oc(s). Let a = Γ(α0) and b be some object in O with
b ≠ a. Define σ0 = Γ[α0 → b]. By the definition of differentiated substitution frames,
there is a mapping μ0 : Oc → Oc such that

    μ0 · Γ = σ0 and for all σ : Oc → O, (s · σ0) · σ = s · (μ0 · σ).

Assume that A = {α ∈ Core-oc(s) | Γ(α) = a} is finite. Then there is an α1 ∈ A such that
α1 ∉ μ0[Core-oc(s)]. It follows that s · σ0 ≠ s.
The inclusion in the other direction is obvious. □

4 f[a → b] denotes the function defined by f[a → b](x) = b if x = a; f(x) otherwise.

In the lemma we cannot drop the finiteness condition, as the next example shows.
EXAMPLE 3.14. Let G = ⟨S, O, Oc, Γ, Σ⟩ be a differentiated substitution frame with
O = {a, b}, S = {s0, s1, s2, . . . , s∞}, Oc = O × ω, Γ(x, n) = x, and Σ defined by
\[
s \cdot \sigma =
\begin{cases}
s_0 & \text{if } s = s_n \text{ with } n \in \omega \text{ and } \sigma(a, 0) = \sigma(b, 0),\\
s_n & \text{if } s = s_n \text{ with } n \in \omega \text{ and } \sigma(a, 0) \neq \sigma(b, 0),\\
s_{|\sigma^{-1}(x)|} & \text{if } s = s_\infty \text{ and } \sigma^{-1}(x) \text{ is finite},\\
s_\infty & \text{otherwise.}^{5}
\end{cases}
\]
Then Oc(s∞) = O × ω, but for any b ∈ O and α ∈ Oc, s∞ · Γ[α → b] = s∞.

The notions of occurrence-degree and object-degree are defined as follows:
DEFINITION 3.15. Let G = ⟨S, O, Oc, Γ, Σ⟩ be a differentiated substitution frame. For
a state s in S, we define the occurrence-degree of s as:

    oc-degree_G(s) = glb {|X| | X is an occurrence-domain of s}.

The occurrence-degree of G we define as:

    oc-degree_G = lub {oc-degree_G(s) | s ∈ S}.

The object-degrees ob-degreeG (s) and ob-degreeG are defined in a similar way by starting
with the object-domains of s.
In Section 4 we will see that the occurrence-degree of a frame can be much higher than
its object-degree.
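
Even in a very small frame the two degrees can come apart. The sketch below assumes a toy differentiated frame for the love relation in which an occurrence is a pair (object, position) and the occurrence-to-object mapping is the first projection. These encodings and all names are hypothetical. It computes the occurrence-domains, the core occurrences and both degrees of the state of Narcissus's loving Narcissus by exhaustive search.

```python
from itertools import combinations, product

O = ("narcissus", "echo")                      # hypothetical objects
Oc = [(x, i) for x in O for i in (1, 2)]       # occurrence = (object, position)
SIGMAS = [dict(zip(Oc, vals)) for vals in product(O, repeat=len(Oc))]

def subst(s, sigma):
    """Substitute for the lover occurrence of x and the beloved occurrence of y."""
    x, y = s
    return (sigma[(x, 1)], sigma[(y, 2)])

def is_occurrence_domain(X, s):
    """X is an occurrence-domain of s if maps agreeing on X give the same result on s."""
    return all(
        subst(s, s1) == subst(s, s2)
        for s1 in SIGMAS for s2 in SIGMAS
        if all(s1[a] == s2[a] for a in X)
    )

def occurrence_domains(s):
    subsets = [frozenset(c) for r in range(len(Oc) + 1) for c in combinations(Oc, r)]
    return [X for X in subsets if is_occurrence_domain(X, s)]

s = ("narcissus", "narcissus")
doms = occurrence_domains(s)
core_oc = frozenset.intersection(*doms)
oc_degree = min(len(X) for X in doms)

# Object-domains are the images of occurrence-domains under the first projection.
object_doms = {frozenset(a[0] for a in X) for X in doms}
ob_degree = min(len(A) for A in object_doms)

print(sorted(core_oc), oc_degree, ob_degree)
```

Here the state has two core occurrences but only one core object, so its occurrence-degree exceeds its object-degree.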
3.3. Underlying frames. There is a natural embedding of the undifferentiated sub-
stitution frames in the class of differentiated substitution frames, and, conversely, each
differentiated substitution frame also has an underlying undifferentiated substitution
frame:
DEFINITION 3.16. Let F = ⟨S, O, Σ⟩ be an undifferentiated substitution frame and let
G = ⟨S′, O′, Oc, Γ, Σ′⟩ be a differentiated substitution frame. We say that F underlies G
if S = S′, O = O′, and for every s ∈ S and δ : O → O,

    s ·_G (Γ · δ) = s ·_F δ.

We call G a basic refinement of F if F underlies G, Oc = O and Γ = id_O.

5 |X| denotes as usual the cardinality of X.

Note that in a basic refinement, different states may have occurrences in common.
To prevent this, we could alternatively have defined for a basic refinement the set of
occurrences as S × O.
THEOREM 3.17. (i) Each undifferentiated substitution frame has a unique basic refinement.
(ii) Each differentiated substitution frame has a unique underlying undifferentiated
substitution frame. (iii) If F is an undifferentiated substitution frame, then the underlying
undifferentiated substitution frame of the basic refinement of F is F itself.
Proof.

(i) This follows immediately from the definition of a basic refinement and the definition
of a differentiated substitution frame.
(ii) Let G = ⟨S, O, Oc, Γ, Σ⟩ be a differentiated substitution frame. Let F = ⟨S, O, Σ′⟩
be defined by s ·_F δ = s ·_G (Γ · δ). Then by Condition 1 of the definition of a
differentiated substitution frame s ·_F id_O = s ·_G (Γ · id_O) = s. By Condition 2 of the
definition of a differentiated substitution frame:

    (s ·_F δ) ·_F δ′ = (s ·_G (Γ · δ)) ·_G (Γ · δ′)
                     = s ·_G (μ · Γ · δ′) for some μ with μ · Γ = Γ · δ
                     = s ·_G (Γ · δ · δ′)
                     = s ·_F (δ · δ′).

So, F is an undifferentiated substitution frame. Its uniqueness follows immediately
from the definition of an underlying undifferentiated substitution frame.
(iii) This also follows immediately from the definitions. □
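
The construction in part (ii) can be tried out on a finite example. The sketch below assumes a toy differentiated frame for the love relation, with occurrences encoded as pairs (object, position) and the first projection as the occurrence-to-object mapping. These are hypothetical choices. It builds the underlying substitution exactly as in the proof and checks that it is the familiar global substitution and that it satisfies the composition condition.

```python
from itertools import product

O = ("a", "b", "c")                            # hypothetical objects
Oc = [(x, i) for x in O for i in (1, 2)]       # occurrence = (object, position)
S = list(product(O, repeat=2))                 # state (x, y): "x loves y"
GAMMA = {a: a[0] for a in Oc}
DELTAS = [dict(zip(O, vals)) for vals in product(O, repeat=len(O))]

def subst_G(s, sigma):
    """Differentiated substitution on occurrences."""
    x, y = s
    return (sigma[(x, 1)], sigma[(y, 2)])

def subst_F(s, delta):
    """Underlying substitution: apply subst_G to the map 'first Gamma, then delta'."""
    return subst_G(s, {a: delta[GAMMA[a]] for a in Oc})

# The underlying frame substitutes objects globally.
matches = all(subst_F(s, d) == (d[s[0]], d[s[1]]) for s in S for d in DELTAS)

# Conditions of an undifferentiated substitution frame.
identity = {x: x for x in O}
cond1 = all(subst_F(s, identity) == s for s in S)
cond2 = all(
    subst_F(subst_F(s, d1), d2) == subst_F(s, {x: d2[d1[x]] for x in O})
    for s in S for d1 in DELTAS for d2 in DELTAS
)

print(matches, cond1, cond2)
```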
In Appendix B we will define categories for undifferentiated and differentiated substi-
tution frames and show that there is an adjunction between them.
LEMMA 3.18. Let G be a basic refinement of F = ⟨S, O, Σ⟩. Then for every s ∈ S, the
occurrence-domains of s in G are the same as the object-domains of s in G and the same
as the object-domains of s in F.
Proof. The lemma follows immediately from the definitions. □
We may use this lemma to deduce certain results about undifferentiated substitution
frames from results about differentiated substitution frames. For example, from Lemma
3.10 it follows that also for undifferentiated substitution frames the object-domains of s
form a filter on O, and from Lemma 3.13 we get for undifferentiated substitution frames a
characterization of the core of s. But of course, these results could also be obtained more
directly.
The next lemma shows that refining a frame does not change the object-domains of its
states.
LEMMA 3.19. If F underlies G, then any state has in F the same object-domains
as in G.
Proof. Let F = ⟨S, O, Σ⟩ underlie G = ⟨S, O, Oc, Γ, Σ′⟩. Let A be an object-domain
of s in F. If O consists of one object, then ∅ is an object-domain of s in G, and so is A.
Therefore we may assume that O contains (at least) two different objects a and b. Let
σ1, σ1′ : Oc → O be such that σ1 =_{Γ⁻¹[A]} σ1′. Define
\[
\sigma_2(\alpha) =
\begin{cases}
\sigma_1(\alpha) = \sigma_1'(\alpha) & \text{if } \Gamma(\alpha) \in A,\\
\sigma_1(\alpha) & \text{if } \Gamma(\alpha) = a,\\
\sigma_1'(\alpha) & \text{if } \Gamma(\alpha) = b,\\
\Gamma(\alpha) & \text{otherwise.}
\end{cases}
\]
We will show that s ·_G σ2 = s ·_G σ1. Then, by symmetry of the definition of σ2, we also
have s ·_G σ2 = s ·_G σ1′, and thus A is also an object-domain of s in G.
Define δ0 : O → O by
\[
\delta_0(d) =
\begin{cases}
d & \text{if } d \in A,\\
a & \text{otherwise.}
\end{cases}
\]
Define σ0 = Γ · δ0. Then s ·_G σ0 = s ·_F δ0 = s. Because G is a differentiated substitution
frame, there is a function μ0 : Oc → Oc such that μ0 · Γ = σ0, and for any σ : Oc → O,
(s ·_G σ0) ·_G σ = s ·_G (μ0 · σ). So,

    s ·_G σ2 = (s ·_G σ0) ·_G σ2 = s ·_G (μ0 · σ2) = s ·_G (μ0 · σ1) = (s ·_G σ0) ·_G σ1 = s ·_G σ1.

Conversely, let A be an object-domain of s in G. Then there is an occurrence-domain X
of s such that A = Γ[X]. So, for any δ, δ′ : O → O with δ =_A δ′, we have Γ · δ =_X Γ · δ′,
and thus s ·_F δ = s ·_G (Γ · δ) = s ·_G (Γ · δ′) = s ·_F δ′. It follows that A is also an
object-domain of s in F. □
As a direct consequence of the lemma, if F underlies G, then for any state s,
Core-obF (s) = Core-obG (s), and ob-degreeF (s) = ob-degreeG (s).

§4. Refining occurrences. An undifferentiated substitution frame may underlie various
differentiated substitution frames. In establishing for a given relation the most adequate
one, our principles in Section 2 seem to leave us with a lot of choices. In this section
we investigate a natural (pre)ordering of frames. A consequence of the main result to
be presented here, is that for a large class of relations unique maximally differentiated
substitution frames exist.
We define refinements of differentiated substitution frames as follows:
DEFINITION 4.1. Let G = ⟨S, O, Oc, Γ, Σ⟩ and G′ = ⟨S′, O′, Oc′, Γ′, Σ′⟩ be
differentiated substitution frames. We say that G is a refinement of G′ if S = S′, O = O′, and
for every s ∈ S there is a function τ : Oc → Oc′ such that
1. τ · Γ′ = Γ,
2. for every σ : Oc′ → O, s ·_G (τ · σ) = s ·_G′ σ.
We say that G and G′ are equally refined if for every s ∈ S the function τ is injective on
some occurrence-domain X of s in G.
We call a refinement a proper refinement, if the frames are not equally refined.
We also call G a refinement of an undifferentiated frame F if G is a refinement of the
basic refinement of F.
Note that G is a refinement of an undifferentiated substitution frame F iff F underlies G.
Also note that the relation is equally refined is an equivalence relation.
For an undifferentiated substitution frame F we will call G a proper refinement, if G is
a proper refinement of the basic refinement of F.
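
A small finite check may help to see what a proper refinement looks like. The sketch below assumes a toy differentiated frame G for the love relation (occurrences are pairs (object, position)) and the basic refinement G′ of its underlying frame, in which the occurrences are the objects themselves. All names are hypothetical. Since the occurrence-to-object mapping of G′ is the identity, Condition 1 of Definition 4.1 forces τ to be the occurrence-to-object mapping of G.

```python
from itertools import combinations, product

O = ("narcissus", "echo")                      # hypothetical objects
Oc = [(x, i) for x in O for i in (1, 2)]       # occurrences of G
S = list(product(O, repeat=2))
GAMMA = {a: a[0] for a in Oc}                  # occurrence-to-object map of G
GAMMA_BASIC = {x: x for x in O}                # identity map of the basic refinement G'
SIGMAS = [dict(zip(Oc, v)) for v in product(O, repeat=len(Oc))]
DELTAS = [dict(zip(O, v)) for v in product(O, repeat=len(O))]

def subst_G(s, sigma):
    x, y = s
    return (sigma[(x, 1)], sigma[(y, 2)])

def subst_basic(s, delta):
    """Substitution in the basic refinement G', whose occurrences are the objects."""
    return (delta[s[0]], delta[s[1]])

tau = GAMMA                                    # the only candidate for tau

cond1 = all(GAMMA_BASIC[tau[a]] == GAMMA[a] for a in Oc)
cond2 = all(
    subst_G(s, {a: d[tau[a]] for a in Oc}) == subst_basic(s, d)
    for s in S for d in DELTAS
)

def occurrence_domains(s):
    subsets = [frozenset(c) for r in range(len(Oc) + 1) for c in combinations(Oc, r)]
    return [
        X for X in subsets
        if all(subst_G(s, s1) == subst_G(s, s2)
               for s1 in SIGMAS for s2 in SIGMAS
               if all(s1[a] == s2[a] for a in X))
    ]

def injective_on_some_domain(s):
    return any(len({tau[a] for a in X}) == len(X) for X in occurrence_domains(s))

equally_refined = all(injective_on_some_domain(s) for s in S)
print(cond1, cond2, equally_refined)
```

In this toy case G is a refinement of G′ but the two are not equally refined, because τ identifies the two occurrences of Narcissus in the state of Narcissus's loving Narcissus.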
LEMMA 4.2. Differentiated substitution frames with all states of finite
occurrence-degree are refinements of each other iff they are equally refined.
Proof. The lemma follows immediately from the definitions. □
I am not sure whether the claim of the lemma is also true for all frames of infinite
occurrence-degree. But refinements—modulo equal refinedness—obviously form a pre-
order. In the next subsections we investigate the structure of this preorder in more detail.
4.1. Spurious refinements. Perhaps contrary to what one would expect, an
undifferentiated substitution frame for a relation like the love relation has (many) proper
refinements:
THEOREM 4.3. Let G = S, O, Oc, ,  be a differentiated substitution frame with O
finite, and oc-degreeG also finite, but not equal to zero. Then for any infinite cardinal κ,
there is a refinement G′ of G with oc-degreeG′ = κ.
Proof. Let κ be an infinite cardinal. Define the filter F on κ by A ∈ F iff |κ − A| < κ.
Let U be an ultrafilter on κ extending F. (Such an ultrafilter exists by the Axiom of Choice.)
Then for any A ∈ U , |A| = κ. Further, since O is finite, for any τ ∈ O κ , there is exactly
one a ∈ O such that {λ | τ (λ) = a} ∈ U .
It follows that we can make a proper refinement of G by replacing each occurrence by κ
occurrences of the same object. More precisely, define G  = S, O, Oc ,  ,   with

1. Oc = Oc × κ,
2.  = proj1 |Oc · ,
3. s ·G  σ = s ·G σ with σ such that ∀α ∈ Oc, {λ | σ (α, λ) = σ (α)} ∈ U .

We will show that (i) G′ is a differentiated substitution frame, (ii) G′ is a refinement of
G, and (iii) oc-degreeG′ = κ.

(i) We show that G′ satisfies Conditions 1 and 2 of the definition of a differentiated
substitution frame.

1. s ·G   = s because {λ |  (α, λ) = (α)} = κ ∈ U .


2. Consider any s ∈ S, and σ : Oc → O. Let σ be as in the definition of G  ,
let μ : Oc → Oc correspond to s →σ s · σ , and let μ : Oc → Oc be such
that

(a) μ ·G   = σ ,
(b) μ(α, λ) = (μ(α), λ) if σ (α, λ) = σ (α).

For any σ  : Oc → O, let σ  be such that ∀α ∈ Oc, {λ | σ  (α, λ) = σ  (α)} ∈ U .


Then

(s ·G  σ ) ·G  σ  = (s ·G σ ) ·G σ 
= s ·G (μ · σ  )
= s ·G  (μ · σ  ),

where the last equation can be proved as follows. Define for each α ∈ Oc the
following sets:
Aα = {λ | μ(α, λ) = (μ(α), λ)},
Bα = {λ | σ  (α, λ) = σ  (α)},
Cα = {λ | σ  (μ(α, λ)) = σ  (μ(α))}.

Then Aα ∩ Bμ(α) ⊆ Cα . Because Aα , Bμ(α) ∈ U , and U is a filter, we have Cα ∈ U ,


from which it follows that s ·G  (μ · σ  ) = s ·G (μ · σ  ). Thus G  is a differentiated
substitution frame.
(ii) To prove that G  is a refinement of G, define τ : Oc → Oc by τ (α, λ) = α. Then
clearly τ is surjective and  = τ · . Further, s ·G σ = s ·G  (τ · σ ), because for
each α ∈ Oc, {λ | σ (τ (α, λ)) = σ (α)} = κ ∈ U .
(iii) Since oc-degreeG is finite, and κ is infinite, we have oc-degreeG′ ≤ κ. Since
oc-degreeG ≠ 0, there is an s0 ∈ S and σ0 : Oc → O such that s0 ·G σ0 ≠ s0 .
Let X be an occurrence-domain of s0 in G  . Define σ0 : Oc → O by


(α) if (α, λ) ∈ X ,
σ0 (α, λ) =
σ0 (α) otherwise.

Then σ0 = X  . So, s0 ·G  σ0 = s0 . Now suppose |X | < κ. Then, by the definition
of U , for any α ∈ Oc, {λ | (α, λ) ∈ Oc − X } ∈ U , and thus s0 ·G  σ0 = s0 ·G σ0 ≠ s0 .
So, we have a contradiction. It follows that oc-degreeG  = κ.
What about frames with an infinite number of objects? Do they also always have proper
refinements? By a rather straightforward modification of the proof of the previous theorem
we can show that any frame with at least one transition from a state to a different one has
a proper refinement, provided that there are arbitrarily large measurable cardinals:
THEOREM 4.4. Let G = S, O, Oc, ,  be a differentiated substitution frame with
oc-degreeG ≠ 0. Then for any measurable cardinal κ > max(|O|, oc-degreeG ), there is a
refinement G′ of G with oc-degreeG′ = κ.
Proof. By definition, an uncountable cardinal κ is measurable iff there is a nonprincipal
κ-complete ultrafilter on κ, where a filter is called κ-complete if it is closed under
intersections of fewer than κ sets. See, for example, Jech (2006, p. 127) or Kanamori (2005, p. 26).
Now let κ be a measurable cardinal greater than max(|O|, oc-degreeG ), and let U be a
nonprincipal κ-complete ultrafilter over κ. Then for any A ∈ U , |A| = κ. Further, since
κ > |O|, for any τ ∈ O κ , there is exactly one a ∈ O such that {λ | τ (λ) = a} ∈ U .
Define G  = S, O, Oc ,  ,   exactly as in the proof of the previous theorem. We may
prove that G  is a differentiated substitution frame and that G  is a refinement of G in the
same way as we did in (i) and (ii) of the proof of the previous theorem.
To prove that oc-degreeG  = κ, we first note that since oc-degreeG ≤ κ, and κ is infinite,
we have oc-degreeG  ≤ κ. Then we may follow (iii) of the proof of the previous theorem
to complete the proof. 
For ‘real’ relations the ultrarefinements constructed in the proofs of the last two theorems
do not seem to be adequate, since the states do not have a well-defined set of occurrences.
Probably no metaphysical significance should be given to the existence of these spurious
refinements.
4.2. Normal refinements. In this section we focus on frames of finite
occurrence-degree, which, as a consequence of Lemma 4.2, form (modulo equal refinedness)
a partial order. In particular we want to investigate whether two given frames have a
supremum. We begin with a negative result:
EXAMPLE 4.5. Let F = S, O,  be an undifferentiated substitution frame with O =
{a, b}, S = {{a}, {b}, {a, b}}, and  defined by
s · δ = δ[s].
Then F underlies a frame G with for each s ∈ S, oc-degreeG (s) = 2. But for any natural
number k > 2, F also underlies a frame G  in which the state {a, b} has k occurrences
for a and k occurrences for b. Since in G  the other states necessarily have only one
occurrence, it follows immediately that G and G  have no refinement of finite occurrence-
degree in common. Some extra thought shows that they also have no refinement of infinite
occurrence-degree in common.
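
The undifferentiated frame F of Example 4.5 is small enough to be checked by brute force. The following Python sketch is only an illustration and not part of the formal development. It assumes that states can be modelled as frozensets, that a substitution δ is a dict from objects to objects, and that δ1 · δ2 means 'first δ1, then δ2'. The names all_substitutions, substitute and compose are hypothetical, and the identity law is an assumption about the definition of an undifferentiated frame.

```python
from itertools import product

O = ("a", "b")
S = {frozenset("a"), frozenset("b"), frozenset("ab")}

def all_substitutions(objects):
    """Enumerate every function objects -> objects as a dict."""
    return [dict(zip(objects, image))
            for image in product(objects, repeat=len(objects))]

def substitute(s, delta):
    """s · δ = δ[s]: the image of the state s under δ."""
    return frozenset(delta[x] for x in s)

def compose(d1, d2):
    """δ1 · δ2: apply δ1 first, then δ2 (assumed reading of '·')."""
    return {x: d2[d1[x]] for x in d1}

SUBS = all_substitutions(O)
IDENTITY = {x: x for x in O}

# S is closed under substitution.
closed = all(substitute(s, d) in S for s in S for d in SUBS)
# The identity substitution leaves every state unchanged (assumed law).
identity_ok = all(substitute(s, IDENTITY) == s for s in S)
# (s · δ1) · δ2 = s · (δ1 · δ2) for all states and substitutions.
composition_ok = all(
    substitute(substitute(s, d1), d2) == substitute(s, compose(d1, d2))
    for s in S for d1 in SUBS for d2 in SUBS
)
```

The sketch says nothing about the differentiated frames G and G′ of the example; it only confirms that the underlying set-image substitution behaves as a substitution function on the three states.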
In the example, the frame G  has the property of having an occurrence-degree that is
greater than its object-degree. Although we can imagine that this might be the case for
certain ‘real’ relations, we expect that in ‘normal’ cases the occurrence-degree and object-
degree will be the same:
DEFINITION 4.6. We call a frame or refinement normal if its occurrence-degree is the
same as its object-degree. We call a normal frame or refinement maximal if it has no proper,
normal refinement.
The next theorem will be very useful in evaluating possible structures of relations:
THEOREM 4.7. Let be a nonempty collection of normal refinements of a common
substitution frame of finite object-degree. Then there is a least common refinement of the
frames in , which is also normal and unique, modulo equal refinedness.
Proof. We will construct a normal frame G ∗ = S, O, Oc∗ , ∗ , ∗  of which we will
show that it is a unique least common refinement of the frames in .
(i) Construction of G ∗ :
Choose a frame G0 = S, O, Oc0 , 0 , 0  in .
Define Oc∗ = Oc0 × O, and ∗ (α, a) = a. For the definition of ∗ we need some
preparations:
Let →∗ be the transitive closure of the relation → defined by:
s → s  if s ·G σ = s  for some G ∈ and some σ .
For each s ∈ S define
s = {s  ∈ S | s →∗ s  & s  →∗ s}.
In each s choose s0 with ob-degreeG0 (s0 ) maximal. Because the frames in
are normal refinements of a common frame, we have for each G ∈ ,
oc-degreeG (s0 ) = ob-degreeG0 (s0 ). It follows that for any G ∈ and any
s ∈ S there is a transition from s0 to s in G iff there is a transition from s0 to s
in G0 . Thus, for every s1 ∈ s , we may choose σ1 such that s0 ·G0 σ1 = s1 . Now
define for an arbitrary σ : Oc∗ → O,
s1 ·G ∗ σ = s0 ·G0 (τ1 · σ )
with τ1:Oc0 →Oc∗ defined by τ1 (α)=(α, σ1 (α)). This completes the definition of G ∗.
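
The classes of mutually reachable states used in step (i) can be computed mechanically for a finite frame. The sketch below is a simplified illustration under stated assumptions: instead of the transitions of all frames in the collection, it uses the transitions of a single undifferentiated frame, namely the set-image frame of Example 4.5, and the names step, reachable and mutual_class are hypothetical.

```python
from itertools import product

O = ("a", "b")
S = [frozenset("a"), frozenset("b"), frozenset("ab")]
SUBS = [dict(zip(O, image)) for image in product(O, repeat=len(O))]

def substitute(s, delta):
    """s · δ = δ[s], as in Example 4.5."""
    return frozenset(delta[x] for x in s)

# One-step relation: s → s' if s · δ = s' for some δ.
step = {s: {substitute(s, d) for d in SUBS} for s in S}

def reachable(s):
    """All states t with s →* t (transitive closure of →)."""
    seen, frontier = set(), [s]
    while frontier:
        current = frontier.pop()
        for nxt in step[current]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

REACH = {s: reachable(s) for s in S}

def mutual_class(s):
    """The class {s' | s →* s' and s' →* s}."""
    return frozenset(t for t in REACH[s] if s in REACH[t])

classes = {mutual_class(s) for s in S}
```

For this frame the states {a} and {b} fall into one class and {a, b} forms a class of its own, since no substitution leads from a one-element state back to {a, b}.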

(ii) Proof that G ∗ is a differentiated substitution frame:


We will show that G ∗ fulfills the two conditions of Definition 3.4. Let G0 , s0 , s1 ,
σ1 , τ1 be as in the definition of G ∗ . Then:
1. s1 ·G ∗ ∗ = s0 ·G0 (τ1 · ∗ ) = s0 ·G0 σ1 = s1 .
2. Consider any σ : Oc∗ → O. We have to show that there is a μ such that
(a) μ · ∗ = σ ,
(b) for all σ  : Oc∗ → O, (s1 ·G ∗ σ ) ·G ∗ σ  = s1 ·G ∗ (μ · σ  ).
Let s1 = s1 ·G ∗ σ , and let s0 , σ1 , τ1 be related entities of s1 in the construction of G ∗ .
As we noted before, for any G ∈ and any s ∈ S there is a transition from s0 to s
in G iff there is a transition from s0 to s in G0 . So, because there is a transition from
s0 to s1 in G0 , and because there is a path from s1 to s0 of transitions in frames of ,
there is a transition σ0 from s0 to s0 in G0 . Then for any corresponding mapping
μ0 : Oc → Oc we have:
(s1 ·G ∗ σ ) ·G ∗ σ  = s0 ·G0 (τ1 · σ  )
= (s0 ·G0 σ0 ) ·G0 (τ1 · σ  )
= s0 ·G0 (μ0 · τ1 · σ  )
= s0 ·G0 (τ1 · τ1∗ · μ0 · τ1 · σ  ) with τ1 · τ1∗ = idOc0
= s1 ·G ∗ (τ1∗ · μ0 · τ1 · σ  ).
This proves Condition 2b.
Condition 2a requires some extra work. Let X = {(α, σ1 (α)) | α ∈ OcG0 (s0 )}.
Showing that τ1∗ · μ0 · τ1 · ∗ = X σ for some appropriate μ0 , will prove 2a.
Because τ1 (α) = (α, σ1 (α)), (τ1 · τ1∗ )(α) = α and τ1 · ∗ = σ1 , this is equivalent
to showing that μ0 · σ1 =OcG0 (s0 ) τ1 · σ for some appropriate μ0 .
Choose a path s1 = t1 →σ1 t2 →σ2 · · · →σn−1 tn = s0 of transitions in frames
of such that for each transition ti →σi ti+1 in G the mapping σi OcG (ti ) is
injective.
For any G ∈ , there is a natural one-to-one correspondence π between the
occurrences of s0 in G and in G0 , namely π(α) = α  iff (α) = 0 (α  ). For
convenience, we will simply identify the occurrences of s0 in G and G0 . For s0 we
will do similarly. Now define transitions s0 →σ i ti as follows:

1. s0 →σ 1 t1 is the transition s0 →τ1 ·σ t1 ,


2. s0 →σ i+1 ti+1 is a composition of s0 →σ i ti and ti →σi ti+1 .

Also define transitions s0 →σ i ti :
 
1. s0 →σ 1 t1 is the transition s0 →σ1 t1 ,
  
2. s0 →σ i+1 ti+1 is a composition of s0 →σ i ti and ti →σi ti+1 .

Because for each transition ti →σi ti+1 in G the mapping σi OcG (ti ) is injective,
it follows that if transition s0 →σ i+1 ti+1 is a composition of s0 →σ0 s0 and
 
s0 →σ i+1 ti+1 , then s0 →σ i ti is a composition of s0 →σ0 s0 and s0 →σ i ti . So,
choosing transition s0 →σ0 s0 as a composition of s0 →σ n tn and an inverse of

s0 →σ n tn gives by induction that s0 →σ 1 t1 is a composition of s0 →σ0 s0 and

s0 →σ 1 t1 . Thus, μ0 · σ1 =OcG (s0 ) τ1 · σ .
(iii) Proof that G ∗ is a refinement of each frame in :
Let s0 , s1 , σ1 be as in the construction of G ∗ . Let μ1 : Oc0 → Oc0 correspond to
s0 →σ1 s1 in G0 . Let τ : Oc∗ → Oc0 be a function with

⎨μ1 (α) if a = σ1 (α),
τ (α, a) =
⎩α  with  (α  ) = a otherwise.
0

Then

1. τ · 0 = ∗ , because

⎨0 (μ1 (α)) = σ1 (α) = ∗ (α, a) if a = σ1 (α),
0 (τ (α, a)) =
⎩a = ∗ (α, a) otherwise.

2. for every σ : Oc0 → O,


s1 ·G ∗ (τ · σ ) = s0 ·G0 (τ1 · τ · σ ) = s0 ·G0 (μ1 · σ ) = s1 ·G0 σ .

So, G ∗ is a refinement of G0 .
To see that G ∗ is also a refinement of any other frame G in , observe that if in
the construction of G ∗ we would have chosen G instead of G0 , then we would have
obtained a frame equally refined as G ∗ . Thus it follows that G ∗ is a refinement of
each frame in .
(iv) Proof that G ∗ is a unique least common refinement of the frames in :
Let G  = S, O, Oc ,  ,   be an arbitrary common refinement of the frames
in . Let s0 , s1 , σ1 be as in the construction of G ∗ . Then there is a mapping
τ  : Oc → Oc∗ with τ  · ∗ =  and τ  (α) ∈ OcG ∗ (s0 ) for any α ∈ OcG  (s0 ).

Let μ1 : Oc → Oc be a bijection corresponding to s0 →τ ·σ1 s1 in G  , and let
μ2 : Oc → Oc correspond to s0 → 1 s1 in G . Further, let τ : Oc → Oc∗ be a
∗ ∗ σ ∗

function with

⎨(μ−1 · τ  · μ2 )(α) if α ∈ OcG  (s1 ),
1
τ (α) =
⎩α  with ∗ (α  ) =  (α) otherwise.
Then
1. τ · ∗ =  , because (τ  · μ2 · ∗ )(α) = (τ  · σ1 )(α) = (μ1 ·  )(α)
if α ∈ OcG  (s0 ).
2. s1 ·G  (τ · σ ) = s0 ·G  (τ  · μ2 · σ ) = s0 ·G ∗ (μ2 · σ ) = s1 ·G ∗ σ .
So, G  is a refinement of G ∗ , and thus, because we assumed G  to be an arbitrary
common refinement of the frames in , we see that G ∗ is not only a least common
refinement, but also unique, modulo equal refinedness. 
In the theorem we assumed that the object-degree of the frames is finite. I am not sure
whether this condition could be dropped.
REMARK 4.8. We can also prove that for any nonempty collection of normal refinements
of a common substitution frame of finite object-degree, there is a greatest common
subrefinement, which is also normal and unique, modulo equal refinedness. The construction of
this subrefinement is analogous to the construction of G ∗ in Theorem 4.7, only here we use,
instead of →∗ , the transitive closure of the relation →1 defined by:
s →1 s  if s ·G σ = s  for all G ∈ and some σ .
The proof that G ∗ is indeed the greatest common subrefinement and unique is relatively
simple.
It follows that for a substitution frame of finite object-degree, the normal refinements—
modulo equal refinedness—form a complete lattice.
A direct consequence of Theorem 4.7 is the following uniqueness result:
COROLLARY 4.9. A normal substitution frame of finite object-degree has, modulo
equal refinedness, a unique maximal normal refinement.
4.3. Coalescence-free refinements. Of special interest are frames in which no
coalescence of occurrences takes place:
DEFINITION 4.10. We call a frame G = S, O, Oc, ,  coalescence-free if each
transition s →σ s  has a corresponding μ : Oc → Oc that is injective on an occurrence-
domain of s.
Note that coalescence-free normal frames are maximally refined.
We will characterize coalescence-free frames in terms of their underlying
undifferentiated frames. We will restrict ourselves to cases where the underlying frame is of
finite object-degree and simple:
DEFINITION 4.11. We call a frame F = S, O,  simple if there is a state s0 such that
S = {s0 · δ | δ : O → O}.
We call s0 an initial state.
Similarly, we call a frame G = S, O, Oc, ,  simple if there is a state s0 such that
S = {s0 · σ | σ : Oc → O}.
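
For a finite undifferentiated frame, simplicity can be tested by enumerating all substitutions. The following sketch illustrates this for the frame of Example 4.5. It assumes states are frozensets with s · δ = δ[s], and the function names initial_states and is_simple are hypothetical.

```python
from itertools import product

def all_substitutions(objects):
    """Enumerate every function objects -> objects as a dict."""
    return [dict(zip(objects, image))
            for image in product(objects, repeat=len(objects))]

def substitute(s, delta):
    """s · δ = δ[s], the substitution of Example 4.5."""
    return frozenset(delta[x] for x in s)

def initial_states(states, objects):
    """All s0 with S = {s0 · δ | δ : O → O}."""
    subs = all_substitutions(objects)
    return {s0 for s0 in states
            if {substitute(s0, d) for d in subs} == set(states)}

def is_simple(states, objects):
    """A frame is simple if it has at least one initial state."""
    return bool(initial_states(states, objects))

O = ("a", "b")
S = {frozenset("a"), frozenset("b"), frozenset("ab")}
INITIAL = initial_states(S, O)
```

In this frame {a, b} is the only initial state: every state is an image of it, whereas the images of {a} and of {b} are one-element states only.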

We define a class of frames whose transitions from an initial state s0 are unique, modulo
loops of s0 :
DEFINITION 4.12. We call a simple frame F = S, O,  of finite object-degree a
loop-initial frame if for any initial state s0 , and any δ1 , δ2 ∈ O^O ,
s0 · δ1 = s0 · δ2 ⇒ ∃δ0 [ δ2 =Ob(s0 ) δ0 · δ1 and s0 · δ0 = s0 ].
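
The loop-initial condition can be checked exhaustively for small frames. The sketch below is an illustration under stated assumptions: Ob(s0) is taken to be the set of objects appearing in s0, composition δ0 · δ1 means 'first δ0, then δ1', and the check is carried out for one given initial state rather than for all of them. The 'love chain' frame, with the state of a's loving b & b's loving c modelled as the set {(a, b), (b, c)}, is a hypothetical encoding of the ternary relation discussed in the text, and all function names are hypothetical.

```python
from itertools import product

def all_substitutions(objects):
    """Enumerate every function objects -> objects as a dict."""
    return [dict(zip(objects, image))
            for image in product(objects, repeat=len(objects))]

def compose(d1, d2):
    """δ1 · δ2: apply δ1 first, then δ2 (assumed reading of '·')."""
    return {x: d2[d1[x]] for x in d1}

def is_loop_initial_at(s0, objects, substitute, objects_of):
    """Check s0·δ1 = s0·δ2 ⇒ ∃δ0 [δ2 =Ob(s0) δ0·δ1 and s0·δ0 = s0]."""
    subs = all_substitutions(objects)
    loops = [d for d in subs if substitute(s0, d) == s0]
    ob = objects_of(s0)
    for d1, d2 in product(subs, repeat=2):
        if substitute(s0, d1) != substitute(s0, d2):
            continue
        if not any(all(compose(d0, d1)[x] == d2[x] for x in ob)
                   for d0 in loops):
            return False
    return True

def substitute_set(s, delta):
    """Frame of Example 4.5: s · δ = δ[s]."""
    return frozenset(delta[x] for x in s)

def substitute_pairs(s, delta):
    """Love chain: substitute in every pair (x, y), read as 'x loves y'."""
    return frozenset((delta[x], delta[y]) for x, y in s)

def objects_in_pairs(s):
    """Objects appearing in a set of pairs (assumed reading of Ob)."""
    return {x for pair in s for x in pair}

set_frame_ok = is_loop_initial_at(
    frozenset("ab"), ("a", "b"), substitute_set, set)
chain_ok = is_loop_initial_at(
    frozenset({("a", "b"), ("b", "c")}), ("a", "b", "c"),
    substitute_pairs, objects_in_pairs)
```

Under these modelling assumptions the first check succeeds and the second fails. In the love chain the substitutions a → b, b → a, c → b and a → a, b → b, c → a lead to the same state, while the identity is the only substitution that maps the initial state to itself.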

THEOREM 4.13. Let F be a simple undifferentiated substitution frame of finite
object-degree. Then F has a coalescence-free normal refinement iff F is a loop-initial frame.
Proof. Assume F has a coalescence-free normal refinement G = S, O, Oc, , . Let
s0 be an initial state of F, and let δ1 , δ2 be such that s0 ·F δ1 = s0 ·F δ2 . Then, because G
is a refinement of F, s0 ·G ( · δ1 ) = s0 ·G ( · δ2 ).
For i = 1, 2 let μi : Oc → Oc correspond to the transition  · δi from s0 . Since all
states have the same, finite occurrence-degree, we may assume that the mappings μi are
bijective. Clearly, there is a mapping δ0 : O → O such that  · δ0 =Oc(s0 ) μ2 · μ1^−1 · .
So,
s0 = s0 ·G (μ1 · μ1^−1 · )
= (s0 ·G ( · δ1 )) ·G (μ1^−1 · )
= (s0 ·G ( · δ2 )) ·G (μ1^−1 · )
= s0 ·G (μ2 · μ1^−1 · )
= s0 ·G ( · δ0 )
= s0 ·F δ0 .
Further,
 · δ0 · δ1 =Oc(s0 ) μ2 · μ1^−1 ·  · δ1
=Oc(s0 ) μ2 · μ1^−1 · μ1 · 
=Oc(s0 ) μ2 · 
=Oc(s0 )  · δ2 .
So, δ0 · δ1 =Ob(s0 ) δ2 . We have proved that F is a loop-initial frame.
Conversely, assume F = S, O,  is a loop-initial frame. Define G = S, O, Oc, , 
as follows:
1. Define Oc = S × O.
2. Choose an initial state s0 of F.
3. For each s ∈ S choose one δs ∈ O^O such that s0 ·F δs = s.
4. Define (s, d) = δs (d).
5. Define s ·G σ = s0 ·F δ with for all d ∈ O, δ(d) = σ (s, d).
We prove that G is a differentiated substitution frame by showing that G fulfills the two
conditions of Definition 3.4.

(1) s ·G  = s0 ·F δs = s.
(2) Consider any s ∈ S and σ : Oc → O. Then s ·G σ = s0 ·F δs·G σ . Also we
have s ·G σ = s0 ·F δ with for all d ∈ O, δ(d) = σ (s, d). Because F is a loop-
initial frame, δ =Ob(s0 ) δ0 · δs·G σ and s0 ·F δ0 = s0 for some δ0 . Define a function
μ : Oc → Oc with μ(s, d) = (s ·G σ, δ0 (d)), and for any other state s  ∈ S,
(μ(s  , d)) = σ (s  , d). Then
(μ(s, d)) = (s ·G σ, δ0 (d))

= δs·G σ (δ0 (d))

= σ (s, d).
So, μ ·  = σ .
We have (s ·G σ ) ·G σ  = s0 ·F δ  with for all d ∈ O, δ  (d) = σ  (s ·G σ, d). Further,
s ·G (μ · σ  ) = s0 ·F δ  with for all d ∈ O, δ  (d) = (μ · σ  )(s, d). So, because
μ(s, d) = (s ·G σ, δ0 (d)), we see that δ  = δ0 · δ  . Thus, because s0 ·F δ0 = s0 , we
have (s ·G σ ) ·G σ  = s ·G (μ · σ  ). This completes the proof that G is a differentiated
substitution frame.
To show that G is a refinement of F, we note that s ·G ( · δ) = s0 ·F δ  with for all
d ∈ O, δ  (d) = ( · δ)(s, d). So, s0 ·F δ  = (s0 ·F δs ) ·F δ = s ·F δ.
Further, it follows from Item 5 of the definition of G that oc-degreeG (s) =
ob-degreeF (s0 ), and thus that G is a coalescence-free normal frame. 
Note that in the second part of the proof we choose certain functions from O to O. But
since O consists of objects, we may perhaps not assume that O is representable in ZFC. So,
perhaps it would be more accurate to add an additional condition in the theorem. Because
the object-degree of F is finite, it would be enough to assume that O can be totally ordered.
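
The five-step construction in the second part of the proof can be carried out mechanically for a small frame. The sketch below does so for the frame of Example 4.5. It is an illustration under stated assumptions: an occurrence is a pair (s, d), a map σ : Oc → O is a dict keyed by occurrences, and obj_of is a hypothetical name for the function that assigns to each occurrence its object. Only Condition 1 and the refinement equation are checked, not Condition 2.

```python
from itertools import product

O = ("a", "b")
S = [frozenset("a"), frozenset("b"), frozenset("ab")]
SUBS = [dict(zip(O, image)) for image in product(O, repeat=len(O))]

def sub_F(s, delta):
    """Substitution in F: s · δ = δ[s]."""
    return frozenset(delta[x] for x in s)

# Step 1: occurrences are pairs (state, object).
Oc = [(s, d) for s in S for d in O]
# Step 2: choose an initial state of F.
s0 = frozenset("ab")
# Step 3: for each state s choose one δs with s0 · δs = s.
delta_s = {s: next(d for d in SUBS if sub_F(s0, d) == s) for s in S}

def obj_of(occ):
    """Step 4: the object of the occurrence (s, d) is δs(d)."""
    s, d = occ
    return delta_s[s][d]

def sub_G(s, sigma):
    """Step 5: s ·G σ = s0 ·F δ with δ(d) = σ(s, d)."""
    delta = {d: sigma[(s, d)] for d in O}
    return sub_F(s0, delta)

# Condition 1: substituting each occurrence's own object changes nothing.
own_objects = {occ: obj_of(occ) for occ in Oc}
condition_1 = all(sub_G(s, own_objects) == s for s in S)

# Refinement: substituting δ(object of occurrence) agrees with s ·F δ.
refines_F = all(
    sub_G(s, {occ: d[obj_of(occ)] for occ in Oc}) == sub_F(s, d)
    for s in S for d in SUBS
)
```

The choice of δs in Step 3 is arbitrary here (the first suitable substitution in enumeration order), which mirrors the remark that the proof relies on being able to choose such functions.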

§5. Restricting frames. We can define operations like conjunction, disjunction and
negation for undifferentiated substitution frames in a rather straightforward way. For
example, the conjunction F & F′ will have states s & s′ with s a state of F and s′ a state
of F′, and substitution defined by (s & s′) · δ = s · δ & s′ · δ. Note that the definition only
requires that the following condition is satisfied:
s & s′ = t & t′ ⇒ s · δ & s′ · δ = t · δ & t′ · δ.
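
The well-definedness condition can be tested by brute force once a concrete model of conjunctive states is fixed. The sketch below is one such model and not the only possible one: a state of the love frame is a one-element set {(x, y)}, read as x's loving y, and the conjunction s & s′ is modelled, by assumption, as the union of the two sets. The names conj and conj_substitute are hypothetical.

```python
from itertools import product

O = ("a", "b")
SUBS = [dict(zip(O, image)) for image in product(O, repeat=len(O))]
# States of the love frame: single facts 'x loves y'.
LOVE = [frozenset([(x, y)]) for x in O for y in O]

def substitute(s, delta):
    """Apply δ to both positions of every fact in s."""
    return frozenset((delta[x], delta[y]) for x, y in s)

def conj(s, t):
    """s & s': modelled (by assumption) as the union of the facts."""
    return s | t

def conj_substitute(s, t, delta):
    """(s & s') · δ = s · δ & s' · δ."""
    return conj(substitute(s, delta), substitute(t, delta))

# s & s' = t & t'  implies  s·δ & s'·δ = t·δ & t'·δ.
well_defined = all(
    conj_substitute(s, s2, d) == conj_substitute(t, t2, d)
    for s, s2, t, t2 in product(LOVE, repeat=4)
    if conj(s, s2) == conj(t, t2)
    for d in SUBS
)
```

In this model the condition holds because taking images commutes with taking unions. With a different identity criterion for conjunctive states the condition would have to be checked again.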
For differentiated substitution frames the situation is a bit more complicated. For example,
in defining conjunction it is not immediately clear whether or not we should let the
occurrences of a state s & s correspond one-to-one to the occurrences of s itself. Also, we
should perhaps be content with results that are unique modulo equal refinedness.
A really controversial issue is how to interpret such operations metaphysically. For
example, in The Philosophy of Logical Atomism Russell (1956) said that when he argued
that there were negative facts, it nearly produced a riot (p. 211). Russell says that on the
whole he is inclined to believe that there are negative facts, but no disjunctive facts (Russell,
1956, p. 215). Armstrong (1997) rejects both negative and disjunctive facts, but he does
accept conjunctive facts and totality facts. We will not pursue this issue here.
What I would like to discuss here in more detail is the notion of restriction for relations.
Consider a frame for the love relation with states x loving y. If we restrict the states to those
of x loving Mo, then we get a new frame for this restricted set of states. We get another
type of restriction if we take as states only x loves x. More generally, we define:
DEFINITION 5.1. Let G = S, O, Oc, ,  and G  = S  , O  , Oc ,  ,   be
substitution frames. We say that G is a restriction of G′ if S ⊆ S′, O = O′, and for every s ∈ S
there is an X ⊆ Oc and a function τ : X → Oc such that
1. τ ·  = X  ,
2. for every σ : Oc → O, s ·G σ = s ·G  σ  with σ  = X τ · σ and σ  =Oc −X  .
We call a restriction G of G  simple if G is a simple substitution frame.
We call a restriction G of G  maximal for S if for any G1 with state set S, and such
that G is a restriction of G1 and G1 is a restriction of G  , the frames G and G1 are equally
refined.
Note that if G  is a refinement of G, then G is a restriction of G  .
The relation ‘restriction of’ is transitive:
LEMMA 5.2. If G1 is a restriction of G2 and G2 is a restriction of G3 , then G1 is a
restriction of G3 .
Proof. The proof follows straightforwardly from the definition of restriction. 
The relation ‘maximal restriction of’ is not transitive:
EXAMPLE 5.3. Let x yuv be the state x −→ ♥ y & u −→ ♥ v. Let G1 be a frame
with states x yuv and such that each state has four occurrences. Let G2 be a maximal
restriction of G1 with states x yyv, and such that the states x yyx have one occurrence
of x and one occurrence of y. Now for a given b let G3 be a maximal simple restriction
of G2 with states xbbv. Then, xbbx obviously has one occurrence of x in G2 , but this
state has two occurrences of x in some refinement of G3 . Since this refinement is also a
restriction of G1 , it follows that G3 is not a maximal restriction of G1 .
Note that the example also shows that coalescence-free frames are not closed under the
restriction relation.
Also note that for any frame G = S, O, Oc, ,  and any nonempty S  ⊆ S, there is
a restriction of G with state set S  , namely frame G  = S  , O, Oc, ,   with s ·G  σ = s
for every σ : Oc → O. Because of such (degenerate) restrictions, maximal restrictions
will obviously in general not be unique, modulo equal refinedness. However, if we restrict
ourselves to simple restrictions, we have the following uniqueness result:
THEOREM 5.4. Let G = S, O, Oc, ,  be a normal frame of finite object-degree.
Then for any S  ⊆ S, there is—modulo equal refinedness—at most one simple maximal
restriction of G for S .
Proof. Let G1 = S  , O, Oc1 , 1 , 1  and G2 = S  , O, Oc2 , 2 , 2  be simple
restrictions of G. First we will show that G1 and G2 have the same underlying
undifferentiated substitution frame. Then by Theorem 4.7, G1 and G2 have a least common
refinement, which is normal and unique, modulo equal refinedness. We will show that this
refinement is also a restriction of G. Then the theorem follows since there are—modulo
equal refinedness—only a finite number of normal refinements of G1 .
(i) Let F1 and F2 be the underlying undifferentiated substitution frames of G1 and G2 .
Choose a state s0 ∈ S  with oc-degreeG (s0 ) maximal. Then s0 is an initial state of
G1 and G2 , because otherwise there would be a state t0 ∈ S  with oc-degreeG (t0 ) >
oc-degreeG (s0 ). It is not difficult to see that ObG1 (s0 ) = ObG2 (s0 ), and that for
any δ : O → O, s0 ·F1 δ = s0 ·F2 δ. So, for any s ∈ S  , there is a δ  such that

s ·F1 δ = (s0 ·F1 δ  ) ·F1 δ = s0 ·F1 (δ  · δ) = s0 ·F2 (δ  · δ) = (s0 ·F2 δ  ) ·F2 δ = s ·F2 δ.
Thus, F1 = F2 .
(ii) Consider the construction of G ∗ in the proof of Theorem 4.7 starting with G1 and
G2 . Let s0 , s1 , σ1 , τ1 be as in the construction of G ∗ .
Further, let X 0 ∈ OcG (s0 ) and τ 0 : X 0 → Oc1 be such that τ 0 · 1 = X 0 , and for
every σ : Oc1 → O, s0 ·G1 σ = s0 ·G σ  with σ  = X 0 τ 0 · σ and σ  =Oc −X .
0
Define σ 1 : Oc → O by

⎨σ1 ( τ0 (α)) if α ∈ X 0
σ 1 (α) =
⎩(α) otherwise.

Let μ 1 correspond to s0 →σ1 s1 . Then μ 1 is injective on X 0 , because there is in G


also a transition from s1 to s0 . Now define X 1 = μ μ1 X 0 )−1 ·
1 [X 0 ] and τ 1 = ( τ0 ·τ1 .
It is straightforward to check that τ 1 · ∗ = X 1 , and for every σ : Oc∗ → O,
s ·G ∗ σ = s ·G σ  with σ  = X 1 τ 1 · σ and σ  =Oc−X 1 . So, it follows that G ∗ is a
restriction of G. 

Other forms of restriction that never introduce a coalescence of occurrences are
conceivable. But for such restrictions more than one state may have the same underlying state
in the original relation. For example, if we start with the ‘double’ love relation of Example 5.3
then the state x −→ ♥ y & y −→ ♥ x may underlie two states of a restriction, namely a
state with two occurrences of x and one of y and another state with one occurrence of x
and two of y. Also one state of a restriction may perhaps have more than one underlying
state. This would, for example, be the case if we can make a further restriction to states
x −→ ♥ y & y −→ ♥ x.

§6. Back to reality. A key question here is how to determine, for any state of a
relation, what its occurrences of objects are. For the love relation the state of Echo’s loving
Narcissus obviously has two occurrences, but what about the state of Narcissus’s loving
Narcissus? If this state has one occurrence, then substituting Narcissus for Echo in Echo’s
loving Narcissus gives a coalescence of occurrences.
If in the ‘real’ world no coalescence of occurrences takes place, then for many
relations the logical space is straightforwardly determined. But if we may not exclude
coalescence of occurrences, then the principles in Section 2 often leave us with many
choices.
We are inclined to think that there is just one love relation, one adjacency relation,
one similarity relation, and so forth. But this might perhaps be disputed. Why would we
exclude the possibility that there is a love relation where the state of Narcissus’s loving
Narcissus has only one occurrence of Narcissus, and another relation where Narcissus’s
loving Narcissus has two occurrences of him? Maybe we should not even exclude the
possibility of a love relation with both states, one with two occurrences of him and another
with one. Could it be that there simply is no deeper metaphysical fact that determines the
right choice?
These considerations might engender the uncomfortable feeling that we could easily get
stuck with an abundance of relations for the same states of affairs ‘out there’ in reality.
However, what we want is a clear, uncomplicated canonical view on the logical structure
of relations.

In our mathematical analysis we have arrived at a few results that provide more insight
with regard to the possibilities:
(i) By Corollary 4.9, any normal substitution frame of finite object-degree has a unique
maximal normal refinement.6
(ii) By Theorem 5.4, any normal substitution frame of finite object-degree has for each
subset of states at most one simple maximal restriction.
We might formulate the first result in terms of relations as follows. Suppose we modeled
the logical space of a given relation by a substitution frame for which the maximum number
of occurrences of its states is finite and equal to the maximum number of objects of its
states. Suppose further that the ‘real’ occurrences of the relation are a maximal refinement
of the occurrences of the frame. Then the logical space of the relation is in fact uniquely
determined by the original frame.
It seems reasonable to postulate that the logical space for restrictions of relations is
maximal, if unique in an appropriate sense. By Theorem 5.4, we know that this applies
to relations whose logical space corresponds to a simple restriction of a normal frame of
finite object-degree. What makes this and the previous uniqueness result of metaphysical
interest is that they probably apply to a large class of ‘real’ relations. Encouraged by
these uniqueness results we might go a step further and postulate the following maximality
principle:
P-x The occurrences of every relation are always maximally refined.
If this principle is true, then that would be very nice, since it would make the notion of
occurrences more accessible. Let us test its viability by examining some objections that
could be made:
Objection 1: Example 4.5 does not support the maximality principle. The example
shows a substitution frame whose refinements have no refinement in
common. So, in this case there is no unique maximum. Furthermore,
it is an open question whether all normal substitution frames of infinite
object-degree have a unique maximal normal refinement.
Objection 2: A conjunction of a relation with itself will introduce a coalescence of
occurrences, if the occurrences of a state s & s correspond one-to-one
with the occurrences of s. If so, then this would obviously contradict the
maximality principle.
Objection 3: Relations whose states have a set-like character are straightforward
counterexamples to the maximality principle. Take the relation of
waiting for the bus. If we may substitute Janneke for Vincent in the
state of Janneke and Vincent’s waiting for the bus, the resulting state
will have only one occurrence, and therefore will not be maximally
refined.
These objections seem strong, but there may be escape routes available:
Against Objection 1 it might be argued that the frame of the example is far-fetched and
probably does not correspond to a ‘real’ relation. As a safer alternative, we could restrict
the maximality principle to relations whose logical space corresponds to normal frames of
finite degree.

6 More precisely we should say: unique up to equal refinedness.



Fig. 2. Should we distinguish relational complexes from states of affairs?

Some people will reject Objection 2, because they will deny that there are conjunctive
relations. But let us suppose that conjunctive relations exist. Then why would the occur-
rences of s & s correspond one-to-one to the occurrences of s? I don’t see a convincing
argument. Even if s & s and s are identical states, then there could still be a way out,
namely by arguing that not states, but relational complexes have occurrences. A relational
complex could be conceived of as a structured perspective on a state ‘out there’, where we
give up the idea that there is a one-to-one correspondence or even identity between states of
affairs and relational complexes7 (see Figure 2). With this approach, there seems to be no
compelling reason why there should be a one-to-one correspondence between occurrences
of different relational complexes corresponding to the same state.
Maintaining both states ‘out there’ and relational complexes, however, has the potential
drawback that it might give rise to an inflation of ontology. A more detailed analysis
is needed to address this issue. Note, by the way, that in our formal analysis we also
allowed states to belong to more than one frame: in the definition of a restriction of a
frame (Definition 5.1) the states of a restriction of a frame G are a subset of the states of G
itself. If this is not acceptable, then an alternative definition of a restriction might be given
that takes hiding and merging of occurrences as primitive operations.
With respect to Objection 3 two different replies are possible. First, one could argue
that the objects of states with a set-like character are sets themselves, not the members
of the sets. For the given example, this would mean that in fact we should substitute the
set consisting of Janneke for the set consisting of Janneke and Vincent. Alternatively, one
could argue that certain substitutions in states with a set-like character are highly dubious,
since it is in some cases unclear what exactly the result should be. In the example, it may
be more natural to leave substituting Janneke for Vincent undefined. The interrelatedness
of the states may more adequately be expressed by the operation of subtraction. It seems
natural to say that subtracting Vincent from the state of Janneke and Vincent’s waiting for
the bus results in the state of Janneke’s waiting for the bus. If we follow this road, then we
need a weaker version of Principle P-6:
P-6 Any substitution of objects for occurrences in a state results in at most one state of
the same relation.
In addition, we should formulate a principle for subtracting occurrences similar to P-6 and
a composition principle for subtractions.

7 Such a one-to-one correspondence is often taken for granted. For example, Russell (1984, p. 80)
says: “there is certainly a one-one correspondence of complexes and facts.”

As an aside, note that Principle P-6 also has the advantage that it allows us to keep
meaningless or (conceptually) impossible states out of our ontology. Substituting, for
example, a for b in a’s being adjacent to b might be problematic. If you do not accept the
existence of conceptually impossible states, then it could be better to leave the result of this
substitution undefined.
To conclude this discussion, I think that a maximality principle like principle P-x might
be the right choice for getting close to capturing the essence of occurrences. But to make
such a principle really plausible, we need to conduct a more profound study of the notion of
states, of substitution and probably of other operations, like subtraction. Also a follow-up
of our formal analysis will be needed where partial substitution and subtraction functions
are taken into account.
A view on relations without coalescence of occurrences still seems an attractive
alternative, because of its simplicity. We already countered two arguments against it, namely a
coalescence in relations with set-like states, and coalescence introduced by a conjunction
of relations. In Section 2 we gave a third objection against a coalescence-free account. We
considered the ternary relation  where abc is the state of a’s loving b & b’s loving
c. This relation has the peculiar property that bab and aba are identical states. As a
consequence, this state can have only one occurrence of a and one occurrence of b, and
thus in a transition from abc to this state we get coalescence of occurrences. How can we
counter this objection?
Again relational complexes may be the solution. We could argue that there are two
relational complexes corresponding to the state of a’s loving b & b’s loving a, namely
one relational complex with two occurrences of a and one with two occurrences of b.
Only in this case we would have to accept that one state ‘out there’ can have more than
one relational complex within the same relation. This approach looks quite natural if we
regard the relation  as a restriction of a quaternary relation with states like a’s loving
b & c’s loving d, because then one of the relational complexes corresponding to a’s loving
b & b’s loving a in  could be taken as the result of merging two occurrences of a and the
other relational complex as the result of merging two occurrences of b. For any relation
with a comparable symmetry, a similar ‘solution’ could be given. We also regard this issue
as something that requires further investigation.
There is one final issue of metaphysical importance that I would like to discuss, namely
the question: Do we really need occurrences?
Occurrences are definitely a very useful part of our representation of reality, but that in
some cases the best way to express certain properties of states is in terms of occurrences
does not force us to any ontological commitment to them. So, do we have a compelling
reason to assume that for relations occurrences are ontologically basic?
Let us assume, for the moment, that the maximality principle is valid, and in addition
that for every ‘real’ relation the underlying undifferentiated substitution frame has a unique
maximal refinement. Would this not imply that a complete account of relations could be
given in terms of undifferentiated substitutions? If so, then a parsimonious undifferentiated
substitution mechanism might be enough for the ontology of relations.
I only see one strong argument in favor of occurrences being basic for relations, namely
that some relations might just not have enough objects for an occurrence-free account of
relations. Consider a relation whose occurrence-degree is larger than its object-degree.
Then undifferentiated substitution might not provide enough information about the inter-
connection of the states. We might have such a situation, for example, with a conjunc-
tion  &  & · · · & , where  is a binary relation with a finite number of objects.

Probably such situations do not only occur in complex relations, but also in what we would
regard as elementary relations. An example of such an elementary relation could maybe be
something like a relation with states of a’s loving b more than c in a mini-world with two
inhabitants.
My conclusion is that we have good reason to consider occurrences of relations to be a
primitive notion. Further, I think our analysis has brought us closer to revealing the essence
of the logical structure of relations. Although some major issues are still open, the way we
have articulated them may very well contribute to their solution.

§7. Acknowledgments. I am grateful to Kit Fine for illuminating discussions about
coalescence of occurrences. Further, I thank Albert Visser and the referee for valuable
comments on the manuscript.

BIBLIOGRAPHY
Armstrong, D. (1997). A World of States of Affairs. New York: Cambridge University Press.
Fine, K. (2000). Neutral relations. The Philosophical Review, 109, 1–33.
Fine, K. (1989). The Problem of De Re Modality. In Almog, J., Perry, J., and Wettstein,
H., editors. Themes From Kaplan. Oxford, UK: Oxford University Press, pp. 197–272.
Jech, T. (2006). Set Theory (third edition). Berlin: Springer-Verlag.
Janssen, M., & Visser, A. (2004). Some words on word. La Nuova Critica, 43–44, 71–95.
Kanamori, A. (2005). The Higher Infinite. Berlin: Springer-Verlag.
Kracht, M. (2007). Compositionality: The very idea. Research on Language and
Computation, 5.3, 287–308.
Leo, J. (2008a). Modeling relations. Journal of Philosophical Logic, 37, 353–385.
Leo, J. (2008b). The identity of argument-places. The Review of Symbolic Logic, 1.3,
335–354.
Mates, B. (1972). Elementary Logic. New York, NY: Oxford University Press.
Mac Lane, S. (1998). Categories for the Working Mathematician (second edition). New
York, NY: Springer-Verlag.
Russell, B. (1956). The philosophy of logical atomism. In Marsh, R. C., editor. Logic
and Knowledge, Essays 1901-1950. London: George Allen & Unwin, pp. 175–281.
(Originally published in 1918).
Russell, B. (1984). Theory of knowledge. In Eames, E. R., editor. Theory of Knowledge,
The 1913 Manuscript. London: Allen and Unwin.
Wetzel, L. (1993). What are occurrences of expressions? Journal of Philosophical Logic,
22.2, 215–219.

DEPARTMENT OF PHILOSOPHY
UTRECHT UNIVERSITY
HEIDELBERGLAAN 8, 3584 CS UTRECHT
THE NETHERLANDS
E-mail: joop.leo@phil.uu.nl

Appendix A: Occurrences and roles. An object can fulfill one or more roles in a
state. For example, in the amatory relation an object can play the role of lover or the role
of beloved. In this appendix we investigate how roles are related to occurrences.
We only define roles for simple substitution frames of finite degree. The definition will
be such that

1. any object of any initial state fulfills exactly one role,


2. if s0 is an initial state, a0 ∈ Ob(s0 ), and a ∈ Ob(s), then a fulfills in s the role of a0
in s0 iff there is a δ such that s = s0 · δ and δ(a0 ) = a.
It is not clear how to generalize the definition for arbitrary undifferentiated substitution
frames.
DEFINITION A.1. Let F = ⟨S, O, Σ⟩ be a simple undifferentiated substitution frame of
finite object-degree. Then for any initial state s0 ∈ S and a0 ∈ Ob(s0 ), we define the role
of a0 in s0 as
Role(s0 , a0 ) = {(s, a) | ∃δ [ s · δ = s0 & a ∈ Ob(s) & δ(a) = a0 ]}.

Further, we define the roles of F as


RolesF = {Role(s0 , a0 ) | s0 ∈ S is an initial state & a0 ∈ Ob(s0 )}.

For arbitrary s ∈ S, a ∈ Ob(s), we say that a in s fulfills role ρ ∈ RolesF if for some
s0 , a0 with ρ = Role(s0 , a0 ) there is a mapping δ : O → O such that
s = s0 · δ and a = δ(a0 ).
It is easy to see that if F is an n-ary simple substitution frame, then F has at most n
roles. Objects sometimes fulfill more than one role in certain states. For example, if F
models the amatory relation, then in the state where Narcissus loves himself, he fulfills
both roles of the model. It is also possible that an n-ary model with n > 1, has only one
role. This is, for example, the case for cyclic models.
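The role construction above can be tried out on a toy model. The following is only a minimal sketch under stated assumptions: the three-object domain, the encoding of states as tuples or frozensets, the function names, and the reading of "initial state" as a state from which every state is reachable by substitution are all illustrative choices, not part of the formal development. The symmetric relation `adjacent` stands in for a model with a single role.

```python
from itertools import product

OBJECTS = ("a", "b", "c")  # hypothetical three-object domain


def substitutions(objects=OBJECTS):
    """Yield every mapping delta: O -> O as a dict."""
    for images in product(objects, repeat=len(objects)):
        yield dict(zip(objects, images))


def loves(state, delta):
    """Amatory relation: a state is an ordered pair (lover, beloved)."""
    return tuple(delta[x] for x in state)


def adjacent(state, delta):
    """Symmetric relation: a state is an unordered collection of objects."""
    return frozenset(delta[x] for x in state)


def reachable(s0, apply):
    """All states s0 . delta obtainable from s0 by substitution."""
    return {apply(s0, d) for d in substitutions()}


def initial_states(seed, apply):
    """States from which every state is reachable (assumed reading of 'initial').

    The seed itself must be such a state."""
    states = reachable(seed, apply)
    return [s for s in states if reachable(s, apply) == states]


def role(s0, a0, apply):
    """Role(s0, a0) = {(s, a) | some delta has s . delta = s0, a in Ob(s), delta(a) = a0}."""
    return frozenset(
        (s, a)
        for s in reachable(s0, apply)
        for a in set(s)  # Ob(s): the objects occurring in s
        for d in substitutions()
        if apply(s, d) == s0 and d[a] == a0
    )


def roles(seed, apply):
    """The set of all roles of the frame generated from the initial state `seed`."""
    return {
        role(s0, a0, apply)
        for s0 in initial_states(seed, apply)
        for a0 in set(s0)
    }


def roles_fulfilled(s, a, seed, apply):
    """Roles Role(s0, a0) for which some delta gives s = s0 . delta and a = delta(a0)."""
    return {
        role(s0, a0, apply)
        for s0 in initial_states(seed, apply)
        for a0 in set(s0)
        if any(apply(s0, d) == s and d[a0] == a for d in substitutions())
    }


print(len(roles(("a", "b"), loves)))                         # 2: lover and beloved
print(len(roles(frozenset({"a", "b"}), adjacent)))           # 1: a single shared role
print(len(roles_fulfilled(("a", "a"), "a", ("a", "b"), loves)))  # 2: the Narcissus case
```

In this toy model the binary amatory frame has two roles, the self-loving state has its single object fulfilling both, and the symmetric frame collapses to one role, in line with the remarks above.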
For differentiated substitution frames of finite occurrence-degree we can define roles in
a similar way:
DEFINITION A.2. Let G = ⟨S, O, Oc, Γ, Σ⟩ be a simple differentiated substitution
frame of finite occurrence-degree. Then for any initial state s0 ∈ S and α0 ∈ Oc(s0 ), we
define the role of α0 in s0 as

Role(s0 , α0 ) = {(s, α) | ∃σ [ s · σ = s0 & α ∈ Oc(s) & μ(α) = α0 with μ corresponding to s →σ s0 ]}.

Further, we define the roles of G as


RolesG = {Role(s0 , α0 ) | s0 ∈ S is an initial state & α0 ∈ Oc(s0 )}.

For arbitrary s ∈ S, α ∈ Oc(s), we say that α in s fulfills role ρ ∈ RolesG if for some
s0 , α0 with ρ = Role(s0 , α0 ) there is a mapping σ : Oc → O such that
s = s0 · σ and μ(α0 ) = α for a corresponding μ.
Consider again the relation  where abc is the state that a loves b & b loves c. Let G
be a differentiated substitution frame for it with oc-degreeG = 3. Then because the state
aba is identical with the state bab, a and b each fulfill three roles, but a and b each
have only one occurrence in bab.
THEOREM A.3. Let G be a simple differentiated substitution frame of finite
occurrence-degree. Then each occurrence of a state of G fulfills exactly one role iff G is
coalescence-free.

Proof. The theorem follows immediately from the definitions. 

Appendix B: Categories of substitution frames. For a better understanding of the
relation between undifferentiated and differentiated substitution frames, it is useful to
consider them in terms of category theory.
There are various ways in which categories of substitution frames can be defined. We give
for both types of substitution frames a definition that seems to be a rather natural choice.
DEFINITION B.1. We define USF as the category with objects all undifferentiated
substitution frames, and with morphisms from F = ⟨S, O, Σ⟩ to F′ = ⟨S′, O′, Σ′⟩ all
pairs ⟨f_S, f_O⟩ of functions f_S : S → S′ and f_O : O → O′ such that
1. f_O is injective,
2. f_S(s ·_F δ) = f_S(s) ·_F′ δ′ with
\[ \delta'(d') = \begin{cases} f_O(\delta(d)) & \text{if } d' = f_O(d), \\ d' & \text{otherwise.} \end{cases} \]
Note that Condition 1 guarantees that the function δ′ in Condition 2 is uniquely defined
by the function δ.
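The way a substitution on O is transported along f_O in Condition 2 can be sketched in a few lines. This is an illustration only: the function name `lift`, the dictionary encoding of mappings, and the sample data are assumptions made for the example, not notation from the definition.

```python
def lift(delta, f_O, objects_prime):
    """Return the substitution on O' induced by delta on O along f_O: O -> O'.

    Objects in the image of f_O are sent to the image of their substituted
    preimage; all other objects of O' are left fixed.
    """
    if len(set(f_O.values())) != len(f_O):
        # Without injectivity an object of O' could have two preimages,
        # and the induced substitution would not be well defined.
        raise ValueError("f_O must be injective (Condition 1)")
    preimage = {image: d for d, image in f_O.items()}
    return {
        d_prime: f_O[delta[preimage[d_prime]]] if d_prime in preimage else d_prime
        for d_prime in objects_prime
    }


f_O = {"a": 1, "b": 2}          # hypothetical injective object map
delta = {"a": "b", "b": "b"}    # substitute b for a
print(lift(delta, f_O, {1, 2, 3}))  # {1: 2, 2: 2, 3: 3}
```

The guard makes the role of Condition 1 explicit: once f_O is injective, the dictionary `preimage` is a genuine partial inverse, so the lifted substitution is determined by `delta` alone.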
Our category for differentiated substitution frames is somewhat more complicated:
DEFINITION B.2. We define DSF as the category with objects all differentiated substitution
frames and with morphisms from G = ⟨S, O, Oc, Γ, Σ⟩ to G′ = ⟨S′, O′, Oc′, Γ′, Σ′⟩
all triples ⟨f_S, f_O, f_Oc⟩ of functions f_S : S → S′, f_O : O → O′ and f_Oc : S × X → Oc
with X = (Γ′)⁻¹[im f_O], the inverse image of im f_O by Γ′, such that
1. f_O is injective,
2. for each s ∈ S and α′ ∈ X, f_O(Γ(f_Oc(s, α′))) = Γ′(α′),
3. f_S(s ·_G σ) = f_S(s) ·_G′ σ′ with
\[ \sigma'(\alpha') = \begin{cases} \sigma(f_{Oc}(s, \alpha')) & \text{if } \alpha' \in X, \\ \Gamma'(\alpha') & \text{otherwise.} \end{cases} \]
The next theorem states that there is an adjunction from USF to DSF.
THEOREM B.3. For the categories USF and DSF the functor that assigns to an undifferentiated
substitution frame its basic refinement is a left-adjoint for the functor that assigns
to a differentiated substitution frame its underlying undifferentiated substitution frame.
Proof. Let R : USF → DSF be the functor that assigns to an undifferentiated substitution
frame its basic refinement, and U : DSF → USF be the forgetful functor that assigns
to a differentiated substitution frame its underlying undifferentiated substitution frame.
For each F ∈ USF and G ∈ DSF, we get a bijection
φ_F,G : DSF(R(F), G) → USF(F, U(G))
by assigning ⟨f_S, f_O⟩ to the morphism ⟨f_S, f_O, f_Oc⟩. It is not difficult to see that the
family of bijections
φ : DSF(R(F), G) ≅ USF(F, U(G))
is natural in F and G. So, ⟨R, U, φ⟩ is an adjunction from USF to DSF. □
The theorem can also be proved by characterizing an adjunction in terms of the unit of
adjunction (see, e.g., Theorem IV.1.2(i) of Mac Lane, 1998, p. 83). The unit of adjunction
is in this case simply the identity natural transformation η : I_USF → I_USF.

We should be careful when drawing conclusions from this theorem. In particular,
I would not directly regard it as strong evidence of a natural relationship between
undifferentiated and differentiated substitution frames. For some slightly different, but equally
natural, definitions of a category of differentiated substitution frames, we do not get an
adjunction.

Appendix C: Alternative substitution frames. In this appendix we present two
alternative types of frames. In the first type the states have disjoint occurrence-domains,
and in the second type occurrences are explicitly mapped to occurrences. Further, we
discuss two reasons why our attempts to simplify our definition of differentiated substitution
frames failed.
C.1 Frames with disjoint occurrence-domains. In the definition of differentiated
substitution frames we did not demand that different states have disjoint occurrence-domains.
One may perhaps regard this as a shortcoming, but it can rather easily be fixed by defining
a differentiated substitution frame as a sextuple F = ⟨S, O, Oc, Γ, Γ_S, Σ⟩ with Γ_S a
mapping from Oc to S that provides the states in S with disjoint occurrence-domains
Oc_s = Γ_S⁻¹(s). Further, Σ is a mapping from {(s, σ) | s ∈ S & σ ∈ O^(Oc_s)} to S such
that:
1. Σ(s, Γ↾Oc_s) = s,
2. for all s ∈ S and σ : Oc_s → O, there is a mapping μ : Oc_s → Oc_Σ(s,σ) such
that
(a) Γ ◦ μ = σ,
(b) for all σ′ : Oc_Σ(s,σ) → O, Σ(Σ(s, σ), σ′) = Σ(s, σ′ ◦ μ).
Because of the extra complexity of this definition we prefer to work with the initial
definition of differentiated substitution frames, while staying alert for results that
apply only to them and not to frames with disjoint occurrence-domains.
C.2 Frames explicitly mapping occurrences to occurrences. In our refined frames
the correspondence between the occurrences of different states is implicit in the defi-
nition of . We could make this correspondence explicit by defining frames as tuples
S, O, Oc, , , where for any s, s  ∈ S, (s, s  ) is a set of functions from Oc to Oc
such that
1. idOc ∈ (s, s),
2. if μ ∈ (s, s  ) and μ ∈ (s  , s  ), then μ · μ ∈ (s, s  ),
3. if μ ∈ (s, s  ) and μ is a bijection, then μ−1 ∈ (s  , s),
4. for all s ∈ S and σ : Oc → O there is exactly one s  ∈ S and a mapping
μ ∈ (s, s  ) such that μ ·  = σ.
There are several variants possible. For example, instead of assuming that the corre-
spondence between the occurrences of one state and the occurrences of another one is
functional, we could (perhaps more adequately) assume only a relational correspondence
between the occurrences.
C.3 Failures to simplify our definition. The definition of a differentiated substitution
frame perhaps looks more complex than needed. Our attempts to simplify the definition
failed, because differentiated substitution frames lack certain nice properties, as we show
in the next two observations.

OBSERVATION C.1. Not for every differentiated substitution frame and every mapping
μ : Oc → Oc we have (s · (μ · Γ)) · σ = s · (μ · σ).
Proof. Let G = ⟨S, O, Oc, Γ, Σ⟩ be a differentiated substitution frame without any
symmetry and with a state s for which Oc(s) = {α0 , α1 } with α0 ≠ α1 and Γ(α0 ) =
Γ(α1 ) = a. Let μ : Oc → Oc be such that μ(α0 ) = α1 and μ(α1 ) = α0 . Then s · (μ · Γ) =
s, but for any b ∈ O with b ≠ a, if σ (α0 ) = a and σ (α1 ) = b, then s · σ ≠ s · (μ · σ ). So,
(s · (μ · Γ)) · σ ≠ s · (μ · σ ). □
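The counterexample in this proof can be replayed on a concrete toy frame. The sketch below is an assumption-laden illustration: states are modeled as ordered pairs, every state is taken to share the two occurrences 0 and 1 (standing for α0 and α1), and the names `GAMMA`, `MU`, `then`, and `substitute` are hypothetical helpers, with `GAMMA` playing the part of the mapping that sends each occurrence to its object.

```python
OCCURRENCES = (0, 1)          # stand-ins for alpha_0 and alpha_1
GAMMA = {0: "a", 1: "a"}      # both occurrences of s are occurrences of the object a
MU = {0: 1, 1: 0}             # mu swaps the two occurrences


def then(mu, sigma):
    """Diagrammatic composite mu . sigma: apply mu first, then sigma."""
    return {alpha: sigma[mu[alpha]] for alpha in OCCURRENCES}


def substitute(state, sigma):
    """Toy frame without symmetry: the new state is the ordered pair of images.

    The source state is not inspected because, by assumption, every state
    of this toy frame carries the same two occurrences.
    """
    return tuple(sigma[alpha] for alpha in OCCURRENCES)


s = ("a", "a")
sigma = {0: "a", 1: "b"}

unchanged = substitute(s, then(MU, GAMMA))      # s . (mu . Gamma)
left = substitute(unchanged, sigma)             # (s . (mu . Gamma)) . sigma
right = substitute(s, then(MU, sigma))          # s . (mu . sigma)

print(unchanged == s)   # True
print(left, right)      # ('a', 'b') ('b', 'a')
```

Swapping two occurrences of the same object leaves s fixed, yet the swap becomes visible as soon as the occurrences are filled with different objects, which is exactly the failure the observation records.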
To a transition s →σ s′ more than one mapping μ : Oc → Oc may correspond. The next
observation shows that it is unlikely that we can always select one of them as a canonical
mapping.
OBSERVATION C.2. Not for every differentiated substitution frame is there a representative
function μ : S × O^Oc → Oc^Oc preserving composition, that is, a function such that

(representative) μ(s, σ ) corresponds to a transition s →σ s′,
(preservation) μ(s, μ(s, σ ) · σ′) = μ(s, σ ) · μ(s · σ, σ′).

Proof. Let F = ⟨S, O, Σ⟩ be an undifferentiated substitution frame with an initial state
s0 for which Ob(s0 ) = {a, b, c, d}, and such that s0 · δ = s0 · δ′ iff δ′ =_Ob(s0) (δ0 )^i · δ with
i ∈ {0, 1, 2, 3} and
\[ \delta_0 \restriction Ob(s_0) = \begin{pmatrix} a & b & c & d \\ d & a & b & c \end{pmatrix}. \]
Let G = ⟨S, O, Oc, Γ, Σ⟩ be a refinement of F with for each state four occurrences.
Now suppose there is a function μ : S × O^Oc → Oc^Oc such that:

1. μ(s, σ ) · Γ = σ,
2. s · (μ(s, σ ) · σ′) = (s · σ ) · σ′,
3. μ(s, μ(s, σ ) · σ′) = μ(s, σ ) · μ(s · σ, σ′).

Let s1 = s0 ·_F δ1 with
\[ \delta_1 \restriction Ob(s_0) = \begin{pmatrix} a & b & c & d \\ a & b & a & b \end{pmatrix}. \]
Further, let Oc(s1 ) = {0, 1, 2, 3} with Γ(0) = Γ(2) = a, and let
\[ \sigma_1 \restriction Oc(s_1) = \begin{pmatrix} 0 & 1 & 2 & 3 \\ b & a & b & a \end{pmatrix}. \]
Then s1 · σ1 = s1 . It is not difficult to see that by Properties (1) and (2) of μ:
\[ \mu(s_1, \sigma_1) \restriction Oc(s_1) = \begin{pmatrix} 0 & 1 & 2 & 3 \\ 3 & 0 & 1 & 2 \end{pmatrix} \quad\text{or}\quad \mu(s_1, \sigma_1) \restriction Oc(s_1) = \begin{pmatrix} 0 & 1 & 2 & 3 \\ 1 & 2 & 3 & 0 \end{pmatrix}. \]
So, μ(s1 , σ1 ) · μ(s1 , σ1 ) ≠_Oc(s1) id_Oc .
By Properties (1) and (3) of μ we have for any s ∈ S:

μ(s, Γ) = μ(s, μ(s, Γ) · Γ)
= μ(s, Γ) · μ(s · Γ, Γ)
= μ(s, Γ) · μ(s, Γ).

So, because for any s ∈ S, μ(s, Γ)[Oc(s)] = Oc(s), we have μ(s, Γ) =_Oc(s) id_Oc , and
so, because μ(s1 , σ1 ) · σ1 =_Oc(s1) Γ, we have μ(s1 , μ(s1 , σ1 ) · σ1 ) =_Oc(s1) id_Oc .
It follows that μ(s1 , μ(s1 , σ1 ) · σ1 ) ≠ μ(s1 , σ1 ) · μ(s1 , σ1 ) = μ(s1 , σ1 ) · μ(s1 · σ1 , σ1 ),
contradicting Property (3) of μ. □
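The permutation bookkeeping in this proof can be checked by brute force. The sketch below rests on several modeling assumptions that are not stated in the proof itself: states are encoded as 4-tuples of objects taken up to cyclic rotation, occurrence i of s1 is placed at position i, the objects of occurrences 1 and 3 are taken to be b, and the helper names are hypothetical. Under those assumptions it enumerates every map on the four occurrences of s1 and keeps those satisfying Properties (1) and (2) for the transition from s1 to itself along σ1.

```python
from itertools import product

OBJECTS = "abcd"
POSITIONS = range(4)

GAMMA = ("a", "b", "a", "b")    # object of each occurrence of s1 (1 and 3 assumed to be b)
SIGMA1 = ("b", "a", "b", "a")   # the substitution sigma_1 on the occurrences of s1
IDENTITY = (0, 1, 2, 3)


def state(values):
    """Canonical representative of a 4-tuple up to cyclic rotation."""
    values = tuple(values)
    return min(values[i:] + values[:i] for i in POSITIONS)


def then(mu, sigma):
    """Diagrammatic composite mu . sigma: apply mu first, then sigma."""
    return tuple(sigma[mu[i]] for i in POSITIONS)


def candidates():
    """Maps mu on {0,1,2,3} satisfying Properties (1) and (2) for s1 and sigma_1."""
    found = []
    for mu in product(POSITIONS, repeat=4):
        # Property (1): mu followed by the object map equals sigma_1.
        if then(mu, GAMMA) != SIGMA1:
            continue
        # Property (2): since s1 . sigma_1 = s1, substituting mu . sigma2 in s1
        # must give the same state as substituting sigma2 in s1.
        if all(state(then(mu, sigma2)) == state(sigma2)
               for sigma2 in product(OBJECTS, repeat=4)):
            found.append(mu)
    return found


assert state(SIGMA1) == state(GAMMA)   # s1 . sigma_1 = s1 in this model

for mu in candidates():
    print(mu, then(mu, mu) != IDENTITY, then(mu, SIGMA1) == GAMMA)
```

In this model the only survivors are the two cyclic shifts displayed in the proof, each composed with itself differs from the identity, and each followed by σ1 gives back the object map, which are the three facts the contradiction relies on.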