Paraconsistency in Mathematics
Zach Weber

About the Series
This Cambridge Elements series provides an extensive overview of the philosophy
of mathematics in its many and varied forms. Distinguished authors will provide
an up-to-date summary of the results of current research in their fields and give
their own take on what they believe are the most significant debates influencing
research, drawing original conclusions.

Series Editors
Penelope Rush, University of Tasmania
Stewart Shapiro, The Ohio State University
PARACONSISTENCY IN MATHEMATICS
Zach Weber
University of Otago
www.cambridge.org
Information on this title: www.cambridge.org/9781108995412
DOI: 10.1017/9781108993968
© Zach Weber 2022
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2022
A catalogue record for this publication is available from the British Library.
ISBN 978-1-108-99541-2 Paperback
ISSN 2399-2883 (online)
ISSN 2514-3808 (print)
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
First published online: July 2022
Author for correspondence: Zach Weber, zach.weber@otago.ac.nz
1 Introduction
2 Set Theory
3 Arithmetic
References
Introduction
This Element is an introduction to some uses of paraconsistent logic for math-
ematics. It is for beginners – or at least, readers who know enough about what
the words in the title mean to have picked this up, but not much more. In writing
it, I have mostly tried to take a neutral “tour guide” approach, both in the selec-
tion of material and in avoiding trying to “sell” the reader anything, though
my biases have inevitably shown through, especially by the end. The views
expressed in this Element are the author’s and do not necessarily reflect the
position of Paraconsistency Inc or its affiliates.
Each of the first four little sections exposits some key ideas (including, inev-
itably, using some formal symbolism); the last section of each discusses a more
general philosophical issue that arises. The final section is a brief philosophical
and critical appraisal, looking to the future of this little field.
course, it still is.) Developments in formal logic since the 1950s, though, have
established that such theories can indeed exist. “Paraconsistent logic” now has
an official mathematics subject classification code.1
One aim of this emerging, diverse field of work is to widen the horizons
of mathematics by discovering and studying new objects – much in the way
that historically mathematics has advanced by admitting the existence of “bad”
entities like zero, negative numbers, irrational numbers, imaginary numbers,
transfinite sets, geometries where parallel lines can meet, and so forth. Newton
da Costa suggests that “it would be as interesting to study the inconsistent sys-
tems as, for instance, the non-euclidean geometries (da Costa, 1974, p. 498).”
For paraconsistency in mathematics, as Robert Meyer puts it, “what is to be
hoped for most of all are not new routes to old truths, but an expansion of the
pragmatic imagination (Meyer, 2021a, p. 158).”
The aim of this Element is to give the interested reader a critical sense of
some of the work in this area to date, its strengths and weaknesses, and to
indicate what might be next.
1.1 Motivations
Let us begin with an example of possible inconsistency in mathematics that
motivates using paraconsistent logic. The discussion proceeds informally. Then
we will get into a few details of how the logic itself works.
Mathematics is, by standard accounts developed in the twentieth century,
based on set theory – itself a mathematical theory of collections that provides a
foundation for all other areas of mathematics. Even very cautious philosophers
like Quine have grudgingly accepted that sets are indispensable for mathemat-
ical (and so scientific) practice. Mathematics takes place in the universe of
sets.2
Consider, then, the universe of sets – the collection of all sets, U. This is,
one imagines, a very big collection, the most inclusive collection of sets there
could be, one containing every set. This collection is, intuitively, the domain of
discourse for statements that set theorists are interested in, such as “every set
can be well-ordered” or “there are no self-membered sets”; and whether or not
those statements are true, they do clearly seem to be – have been taken to be –
meaningful. The universal collection is the prima facie basis of the meaning
1 Under the Mathematics Subject Classification 2010, database of the American Mathe-
matical Society, 03B53: “Logics admitting inconsistency (paraconsistent logics, discussive
logics, etc).”
2 For a good account, see Potter (2004).
of these statements: when you study set theory, U is “where” you work. It has
been traditional at least since Whitehead and Russell’s Principia Mathematica
in 1910 to define the universal class with a “property possessed by everything,”
for example, the collection of all sets x such that x = x, or
U = {x : x = x},
which is universal because everything is, after all, self-identical.3 Every set x
is a member of this collection, x ∈ U.
Within U are subcollections of all sorts – the continuous functions, the com-
mutative groups, the set of all collections with exactly three members, and so
forth. One such collection is all the singletons. Every x has a singleton, {x}, and
the collection of all singletons comprises the sets with just one member. Now,
that means every x in the universe, every x ∈ U, can be paired up exactly, into
a one-to-one correspondence, with its singleton, in pairs
⟨x, {x}⟩.
This shows a natural sense in which the universe and its subcollection of all
singletons have exactly the same “size”: they can be paired off perfectly. But
of course, there are more objects in the universe than just the singletons, since
most things are not singletons. So there is also a natural sense in which U is
not the same size as the subcollection of all singletons.
This is a little puzzling, but we are talking about the entire universe after
all, so we should be prepared for some surprises – and indeed, the outstanding
mathematician Richard Dedekind used exactly this fact in 1888 to define what
it means for a set to be infinite; namely, having a proper part of the same size.
He then used a variant on the aforementioned argument to prove that infinite
sets exist.4
Now consider all the subcollections of the universe collected together, P(U)
(called the powerset of U). Since U is maximally inclusive, both P(U) and
all its members are inside of U. Writing subsethood as “⊆,” then
P(U) ⊆ U. (1)
3 See Whitehead and Russell (1910, p. 216). Here, we are focusing on the universe of sets, so
“everything is self-identical” is short for “every set is self-identical.”
4 In the infamous Theorem 66 of his Was sind und was sollen die Zahlen? of 1888 (reprinted
in (Dedekind, 1901, p. 64)), he argues that the set of his thoughts is infinite because for each
thought x there is also the thought of that thought {x}, the thought of the thought of that thought
{ {x} }, and so forth. See Priest (2006, p. 33, sec. 10.1). For alternative ways to think about the
sizes of infinite sets, see Mancosu (2009).
But similarly, each member of U is a set, which in turn has only sets as
members; so for any x ∈ U, if z ∈ x then z ∈ U, which is to say that x
is also a subset of U and hence a member of the powerset of the universe,
showing
U ⊆ P(U) (2)
But, if all the members of U are members of P(U) and vice versa, then these
collections are exactly the same collections; by (1) and (2),
U = P(U) (3)
because sets with all and only the same members are the same set, by the
principle of extensionality.
A powerset contains all possible recombinations of elements of a set, so this
equality is a little odd – but also natural enough, maybe, since the set in question
is the universe. Note that, since U is a subset of itself, that is, U ∈ P(U),
then this means U ∈ U. The universe is contained in itself; the universe is
everything, after all.
But now we have a real problem. By (3), U and P(U) must be the same size
(since they are the same set). If they are the same size, there is a way to pair
off their members in a one-to-one correspondence. Call such a pairing f, that
matches members x ∈ U with members y ∈ P(U), as in f(x) = y. For members
x of U paired up with subsets y of U, sometimes x will be in that subset, and
sometimes not.5 So consider
r = {x : x ∉ f(x)}
This is a subset of the universe comprising all the things that are not in the set
they are paired with. But then, since f pairs off everything, and r ∈ P(U), there
must be some x ∈ U such that f(x) = r. Now we just have to ask: is x ∈ r,
or not? If it is, then x ∉ f(x) by definition, so x ∉ r after all; yet if x ∉ r, then
x ∈ f(x) again by definition, so x ∈ r after all. Since x is either in r or it is not,
it is both: contradiction.
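The diagonal construction can be tried out on a small finite set, where everything stays consistent. The following Python sketch enumerates every function f from a three-element set into its powerset and checks that r = {x : x ∉ f(x)} is never a value of f. The helper names `powerset`, `diagonal`, and `diagonal_always_missed` are illustrative assumptions, not notation from the text, and the check shows only the classical, finite face of the argument; by itself it says nothing about the universe U.

```python
from itertools import product

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(x for x, keep in zip(items, mask) if keep)
            for mask in product([False, True], repeat=len(items))]

def diagonal(f, domain):
    """The diagonal set r = {x : x not in f(x)} for a pairing f."""
    return frozenset(x for x in domain if x not in f[x])

def diagonal_always_missed(domain):
    """True if, for every f from domain to its powerset, r is not a value of f."""
    items = sorted(domain)
    for images in product(powerset(domain), repeat=len(items)):
        f = dict(zip(items, images))
        if diagonal(f, domain) in f.values():
            return False
    return True

print(diagonal_always_missed({0, 1, 2}))  # True
```

On a finite set the search always comes back empty-handed, which is Cantor's theorem in miniature. The puzzle in the text arises because the identity map on U looks like a pairing that the diagonal set cannot escape.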
Classically, this general argument has been taken as a reductio – requiring
the rejection of some assumption, usually the existence of a pairing off between
a set and its powerset, or a universal set, or both (see §2.3). But the existence of
a set theoretic universe is extremely hard to shake; and since we independently
established that U = P(U), it looks like there must be at least one such pairing
5 For instance, if a = {1, 2, 3} and b = {2, 4, 6} then if f(1) = a, we would have 1 ∈ f(1), and if
f(1) = b then 1 ∉ f(1).
off, namely, identity f(x) = x. Under this identity mapping, where f(r) = r, then
writing “iff” for “if and only if,”
r ∈ r iff r ∈ f(r)
iff r ∉ f(r)
iff r ∉ r.
6 The fact that there cannot be a one-to-one correspondence between a set and its powerset was
proven by Cantor and is called Cantor’s theorem. The fact that Cantor’s theorem becomes incon-
sistent at the universe was known to Cantor by 1895 or so but was made especially public
by Russell in 1902 and is called Cantor’s Paradox. See van Heijenoort (1967, p. 124) and
Section 2.
7 If it does not seem good enough, though, I think you might be right and suggest you see Priest
(2006, ch. 2) and Weber (2021, ch. 1).
1.2 Methods
1.2.1 Paraconsistent Logic Tutorial
A paraconsistent logic allows inconsistency without absurdity. In a non-
paraconsistent logic, any inconsistent premises p, ¬p will have any arbitrary
q as a valid conclusion (ex contradictione quodlibet, or explosion); a logic is
paraconsistent if and only if explosion is invalid. Denying explosion is all that
is required for a logic to be paraconsistent.
More precisely, let us say that a logic is determined by a consequence relation
⊢ that relates some sentences (premises) to another (conclusion).9 When the
consequence relation
p0, ..., pn ⊢ q
holds then the argument from p0, ..., pn to q is valid; and if not, not. And then
let us say that a theory is a set of sentences closed under logical consequence:
the “starting” sentences, and all the ones that validly follow under ⊢. An incon-
sistent theory contains both some sentence p and its negation ¬p. And so an
inconsistent theory under a non-paraconsistent logic will include every sen-
tence, which makes the theory trivial. Thus, if an inconsistent theory is to be
8 “...that all that can be spoken of or described (non-trivially) is consistent” (Priest, Routley, &
Norman, 1989, p. 4; cf. Routley and Meyer, 1976).
9 There is a lot of good (paraconsistent) work on multiple conclusion consequence relations; see
Beall and Ripley (2018, p. 744). The focus on single conclusion here is to keep it simple.
p, ¬p ⊬ q.
Paraconsistent logics are the basis for the study of paraconsistent theories.
There are many strategies for making a logic paraconsistent, and within
these strategies there are many – infinitely many – paraconsistent logics. For
concreteness, let us look at one simple approach due to Asenjo (1966) (cf.
Asenjo and Tamburino, 1975) and then to Priest (1979). This is to general-
ize the standard truth conditions on logical evaluations and the definition of
semantic validity, opening some extra space that leaves classical conditions as
a special case.
Just as with classical logic, we are given a formal language with connec-
tives ¬ (negation, “not”), ∧ (conjunction, “and”), and ∨ (disjunction, “or”),
with propositional atoms p, q, ... connected by these connectives into complex
expressions A, B, .... The material conditional A ⊃ B is defined as ¬A ∨ B and a
biconditional A ≡ B is defined as (¬A ∨ B) ∧ (¬B ∨ A). The new twist is in
an assignment ν taking sentences of the language to (two) truth values, t and f.
Sentences may be true, or false, or – now diverging from classical logic – both.
On this arrangement, no sentence must be “both,” but some can be. This is possible
because ν is a relation, rather than a function.10 Relations can be one-to-many:
“y is a place x has lived,” for example, can relate one x to several values of y. While an
evaluation function would treat a “true” contradiction as having a value equal
to both true and false, t = ν(p) = f, and hence11 t = f (which not even para-
consistentists will approve of), a relation will let t and f be among the values
of ν(p), one not always ruling the other out.
Relational truth conditions for negation, conjunction, and disjunction may
be spelled out in a standard-looking homophonic way:
10 Following an idea from J. Michael Dunn in the 1960s (see Omori and Wansing, 2019b), pub-
lished as Dunn (1976); cf. Priest (2008, p. 161). Both Asenjo and Graham Priest present this as
a three-valued logic, where the “both” value is a third distinct status along with truth and falsity.
They treat the relation ν as a function, assigning each sentence exactly one of the three possi-
ble statuses. There are strong reasons why it is philosophically preferable to take the relational
approach, as presented here; see Weber (2021, ch. 3).
11 At least, assuming the transitivity of identity: a = b, b = c ⊢ a = c. This is disputed in Priest
(2014).
then this is the idea behind the logic now known as LP (for “logic of paradox”).
It is paraconsistent because if we just consider the sentences p and q, then it is
possible for p to be assigned both true and false, while q is not assigned true.
Then p and ¬p are both assigned at least true, but q is not, and so ex
contradictione quodlibet is invalid – it has a counterexample.
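The counterexample can be found mechanically. The sketch below is a minimal Python model of the relational semantics, under some loudly stated assumptions: formulas are encoded as nested tuples, a valuation relates each atom to a nonempty subset of {t, f}, the clauses are the usual relational ones (a negation is at least true iff its negand is at least false, and so on), and validity is preservation of "at least true." The names `ev` and `counterexamples` are hypothetical helpers, not anything from the literature.

```python
from itertools import product

T, F = "t", "f"
LP_VALUES = [frozenset({T}), frozenset({F}), frozenset({T, F})]

def ev(formula, v):
    """Set of truth values related to `formula` under valuation `v`."""
    if isinstance(formula, str):              # propositional atom
        return v[formula]
    op, *args = formula
    vals = [ev(a, v) for a in args]
    if op == "not":
        true, false = F in vals[0], T in vals[0]
    elif op == "and":
        true = T in vals[0] and T in vals[1]
        false = F in vals[0] or F in vals[1]
    elif op == "or":
        true = T in vals[0] or T in vals[1]
        false = F in vals[0] and F in vals[1]
    else:
        raise ValueError(f"unknown connective: {op}")
    return frozenset(([T] if true else []) + ([F] if false else []))

def counterexamples(premises, conclusion, atoms, values=LP_VALUES):
    """Valuations making every premise at least true but not the conclusion."""
    found = []
    for combo in product(values, repeat=len(atoms)):
        v = dict(zip(atoms, combo))
        if all(T in ev(p, v) for p in premises) and T not in ev(conclusion, v):
            found.append(v)
    return found

explosion = (["p", ("not", "p")], "q", ["p", "q"])
lp_cex = counterexamples(*explosion)                              # one valuation
classical_cex = counterexamples(*explosion, values=LP_VALUES[:2])  # none
print(len(lp_cex), len(classical_cex))  # 1 0
```

The single counterexample relates p to both values and q to f only. Restricting valuations to the two classical values removes it, which is one way to see classical logic as the special case.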
This is a way of generalizing classical logic. If the relation ν were tightened
up to be a function, so that, for example, ν(¬A) = t iff ν(A) = f, then these con-
ditions simply are those of classical logic. Without assuming functionality, the
logic allows for gluts. We do assume that the relation ν assigns every sentence
at least value t or f, making negation exhaustive:
A ∨ ¬A
is always assigned at least value t. If one wanted to, this condition could be
dropped, and then the logic would allow gaps too; that would deliver the
logic FDE, which can then serve as a base to extend to several other sorts of
paraconsistent logics; see Belnap (1977). In LP there are gluts but no gaps.12
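The difference between allowing gluts only and allowing gaps as well can be checked by brute force. In this sketch, which assumes the same tuple encoding and relational clauses as any standard presentation, a "gap" is modelled as the empty set of truth values; `FDE_VALUES` and `always_at_least_true` are illustrative names of my own.

```python
from itertools import product

T, F = "t", "f"
LP_VALUES = [frozenset({T}), frozenset({F}), frozenset({T, F})]
FDE_VALUES = LP_VALUES + [frozenset()]    # additionally allow a gap (no value)

def ev(formula, v):
    """Set of truth values related to `formula` under valuation `v`."""
    if isinstance(formula, str):
        return v[formula]
    op, *args = formula
    vals = [ev(a, v) for a in args]
    if op == "not":
        true, false = F in vals[0], T in vals[0]
    elif op == "and":
        true = T in vals[0] and T in vals[1]
        false = F in vals[0] or F in vals[1]
    elif op == "or":
        true = T in vals[0] or T in vals[1]
        false = F in vals[0] and F in vals[1]
    else:
        raise ValueError(f"unknown connective: {op}")
    return frozenset(([T] if true else []) + ([F] if false else []))

def always_at_least_true(formula, atoms, values):
    """True if `formula` receives at least t on every valuation over `values`."""
    return all(T in ev(formula, dict(zip(atoms, combo)))
               for combo in product(values, repeat=len(atoms)))

lem = ("or", "A", ("not", "A"))
print(always_at_least_true(lem, ["A"], LP_VALUES))   # True
print(always_at_least_true(lem, ["A"], FDE_VALUES))  # False
```

The second check fails exactly at the valuation that relates A to nothing: a gappy A leaves A ∨ ¬A gappy too.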
Nothing about the general idea of paraconsistency would seem to commit
to actual gluts (or true contradictions), only the hypothetical that even if there
were some true contradiction, still not everything would be true.13 This opens
12 In this Element, we will not be considering “gappy” approaches, because intuitionistic and con-
structive mathematics have already told us a lot about the “incomplete” (but consistent) side of
things, and it seems worthwhile to try to understand the dual. Also, without the law of excluded
middle (LEM), many of the standard paradoxical proofs of contradictions – those that motivate
inconsistent mathematics to begin with – fall apart; one can derive theorems of the form “A
iff ¬A” but no further. The question of whether one can still derive contradictions without the
LEM has been a point of contention between Priest and Brady; see Brady and Rush (2008);
Priest (2019).
13 See Barrio and Da Re (2018).
the radical aim is to study paradoxical objects directly, on their own terms –
expanding mathematical horizons, ideally, in the way that irrational numbers
or non-Euclidean geometries opened new directions in the past.
This moderate/radical distinction is not cut and dried, and we will be revisiting
it throughout. Nevertheless, we can see at least two competing goals. First, there
is classical recapture. This is to show that most or all accepted classical
mathematical theorems remain true in a paraconsistent framework.
The metaphor of “great chunks” is a little imprecise, but the intent seems
clear enough. The idea is to reassure a would-be paraconsistentist that noth-
ing important is lost. Second, there is expansion: adding new theorems and
insights to the stock of mathematical knowledge, particularly about objects and
structures that are beyond the classical pale.18 This direction is more indefinite
and open-ended; as such, it has also received less attention. But I will be sug-
gesting that it is the direction that, ultimately, may decide the future of this
enterprise.
Some early proponents of inconsistent mathematics suggested that recapture
and expansion would come in a natural package, because using paraconsis-
tent logic for mathematics would involve no change at all to ordinary practice.
This is because, they argued, ordinary practice is not classical. That is, classi-
cal logic is not a good formalization of what actually goes on in real proofs.
Following Anderson and Belnap (1975), people like Routley/Sylvan, Meyer,
and Priest argued that if one formalized the canons and norms of reasoning
found in everyday mathematical use, where people do not in fact routinely
infer arbitrary conclusions from arbitrary contradictions, the result would be
a “relevant” (or “relevance”) paraconsistent logic.19 Priest says “we will have
to relinquish ‘classical’ logic” (Priest, 1979, p. 226); but Routley says this will
not be a problem if
18 An analogy might be how the study of transfinite numbers may require different, more strin-
gent rules, but does not change any facts about the good old natural numbers; for example,
addition in N is commutative (n + m = m + n) but not necessarily so for infinite ordinals
(ω + 1 ≠ 1 + ω). To encourage this analogy, Priest dubs his work “a study of the transconsistent.”
Carnielli and Coniglio use the same analogy to explain the place of their (different)
approach to paraconsistency (Carnielli & Coniglio, 2016, p. x).
19 “Imagine someone offering [an explosive argument] as a proof . . . in a class on number theory.
It is clear that it would not be acceptable” (Priest, 2008, p. 74, §4.74).
Viewed this way, some fairly radical ideas can be presented in an apparently
moderate light. Priest states that “the programme of paraconsistent logic has
never been revisionist” and that “by and large, it has accepted the reasoning of
classical mathematics is correct.” He makes this claim on the basis that logic
as used by actual mathematicians would turn out to be paraconsistent:
What [paraconsistency] has wished to do is to reject the excrescence of ex
contradictione quodlibet which does not appear to be an integral part of clas-
sical reasoning but merely leads to trouble when reasoning ventures into the
transconsistent. (Priest, 2006, p. 221)20
20 Though Priest admits this conception results in “some tension” [p. 248], and his views have
developed since 2006 (see Priest [2013] and his Element on pluralism).
21 This is the approach taken by Brady in later chapters of Brady (2006) on the development of
mathematical theories. See Beall (2009, p. 111); cf. Beall (2013b), and Omori and De (2022).
22 Ideas suggested in various ways by Jc Beall, Mortensen, Stewart Shapiro, and more recently
perhaps Priest; cf. Mortensen (2009); Shapiro (2014).
logic. One can work with models to show that, for example, models of para-
consistent set theory already include models for classical set theory (see §2).
One advantage of working model theoretically is that a lot of work has already
been done with models.
For this approach to be coherent, it appears to require assuming that classical
mathematics, and model theory in particular, is correct. This assumption makes
sense from a moderate, conservative sort of standpoint that does not dispute
accepted mathematics or its formalization in classical logic. For example, on
some approaches to paraconsistency,
The tantalizing idea here is that mathematical truth is invariant under changes
of logic, or alternatively that invariance under logic change is the mark of
mathematical truth.
These views reflect that most work in logic and mathematics is conducted
in a cooperative spirit, and there is a lot to show for it. However, recalling the
idea that looking for structure in the inconsistent is in some sense a repudi-
ation of the “universal consistency hypothesis,” there is a clear way in which
a more “fundamentalist” paraconsistent approach may be committed to saying
that some classical mathematics is not correct. For example, standard set the-
ory cannot make sense of the “paradoxical” arguments mentioned earlier about
the universal set, and this could be taken as a critical failing of the standard
approach – a failing at the very foundation of mathematics.23
23 There are non-paraconsistent set theories with a universal set. These raise other issues; see
Forster (1995).
¬¬A ⊣⊢ A
¬(A ∧ B) ⊣⊢ ¬A ∨ ¬B
¬(A ∨ B) ⊣⊢ ¬A ∧ ¬B.
That seems good from the standpoint of trying to generalize or extend classical
logic without changing it too much.25
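These three equivalences are easy to confirm by exhaustive search over LP valuations. The sketch assumes the relational clauses for ¬, ∧, ∨ and reads ⊣⊢ as two-way preservation of "at least true"; the function name `interderivable` and the dictionary `laws` are illustrative choices, not standard terminology.

```python
from itertools import product

T, F = "t", "f"
LP_VALUES = [frozenset({T}), frozenset({F}), frozenset({T, F})]

def ev(formula, v):
    """Set of truth values related to `formula` under valuation `v`."""
    if isinstance(formula, str):
        return v[formula]
    op, *args = formula
    vals = [ev(a, v) for a in args]
    if op == "not":
        true, false = F in vals[0], T in vals[0]
    elif op == "and":
        true = T in vals[0] and T in vals[1]
        false = F in vals[0] or F in vals[1]
    elif op == "or":
        true = T in vals[0] or T in vals[1]
        false = F in vals[0] and F in vals[1]
    else:
        raise ValueError(f"unknown connective: {op}")
    return frozenset(([T] if true else []) + ([F] if false else []))

def interderivable(left, right, atoms, values=LP_VALUES):
    """left and right are at least true on exactly the same valuations."""
    for combo in product(values, repeat=len(atoms)):
        v = dict(zip(atoms, combo))
        if (T in ev(left, v)) != (T in ev(right, v)):
            return False
    return True

laws = {
    "double negation": (("not", ("not", "A")), "A"),
    "de Morgan (and)": (("not", ("and", "A", "B")),
                        ("or", ("not", "A"), ("not", "B"))),
    "de Morgan (or)": (("not", ("or", "A", "B")),
                       ("and", ("not", "A"), ("not", "B"))),
}
results = {name: interderivable(l, r, ["A", "B"]) for name, (l, r) in laws.items()}
print(all(results.values()))  # True
```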
The law of excluded middle (LEM) A ∨ ¬A is always evaluated as (at least)
true. That means, using de Morgan’s laws, that any conjunction of contradictory
24 Rosenblatt worries that “the project of proving [standard] results for one’s preferred deviant
logic is going to be either impossible to carry out or, in the best case scenario, a massive under-
taking” Rosenblatt (2022b). He has a point, though I do not see this as an objection to trying. For
small steps into a serious nonclassical metatheory, see Rosenblatt (2021b); cf. Weber, Badia,
and Girard (2016), Badia, Weber, and Girard (2022). Meyer already attempted some work like
this in Meyer (1985).
25 Compare this with intuitionistic negation, which does not obey all these laws. See Priest (2008,
ch. 6) and Posey’s Element on intuitionism.
sentences A ∧ ¬A, even glutty ones, will be at least false. A distinctive feature
of negation in LP, then, is that the law of noncontradiction
¬(A ∧ ¬A)
is a theorem: all contradictions are at least false; some may also be true.
It is worth saying that again. A logic designed to handle truth gluts has as a
theorem that there are no truth gluts. So the so-called glutty approach is itself
glutty.26 This is a feature, not a bug. The logic LP does not itself have any
contradictions in it, but it is a logic for seriously working with contradictions,
in an appropriately contradictory way. The language of LP, and in particular its
negation connective, is so open-minded, in fact, that there is nothing to prevent
an evaluation assigning every sentence of the language both truth values, a
“trivial” model (see §2.1).
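The "glutty about gluts" point can be made concrete. The sketch below, again assuming the usual relational clauses and a tuple encoding of formulas, checks three things: the law of noncontradiction is at least true on every LP valuation, every contradiction is at least false, and a glutty A makes both the contradiction and the law receive both values. It then tries the trivial valuation on a few sample formulas. All identifiers here are hypothetical.

```python
T, F = "t", "f"
BOTH = frozenset({T, F})
LP_VALUES = [frozenset({T}), frozenset({F}), BOTH]

def ev(formula, v):
    """Set of truth values related to `formula` under valuation `v`."""
    if isinstance(formula, str):
        return v[formula]
    op, *args = formula
    vals = [ev(a, v) for a in args]
    if op == "not":
        true, false = F in vals[0], T in vals[0]
    elif op == "and":
        true = T in vals[0] and T in vals[1]
        false = F in vals[0] or F in vals[1]
    elif op == "or":
        true = T in vals[0] or T in vals[1]
        false = F in vals[0] and F in vals[1]
    else:
        raise ValueError(f"unknown connective: {op}")
    return frozenset(([T] if true else []) + ([F] if false else []))

contradiction = ("and", "A", ("not", "A"))
lnc = ("not", contradiction)

lnc_always_true = all(T in ev(lnc, {"A": a}) for a in LP_VALUES)
contradiction_always_false = all(F in ev(contradiction, {"A": a}) for a in LP_VALUES)
glut = {"A": BOTH}
glutty_lnc = ev(contradiction, glut) == BOTH and ev(lnc, glut) == BOTH

# The trivial valuation relates every atom to both values.
trivial = {"A": BOTH, "B": BOTH}
samples = ["A", ("not", "B"), ("or", "A", ("and", "B", ("not", "A"))), lnc]
all_both = all(ev(s, trivial) == BOTH for s in samples)

print(lnc_always_true, contradiction_always_false, glutty_lnc, all_both)
# True True True True
```

The four sample formulas are only a spot check; that every formula receives both values on the trivial valuation follows by induction on the clauses.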
And so LP is, by design, much weaker than classical logic. Most notably, the
common argument form of disjunctive syllogism is invalid:
A, ¬A ∨ B ⊬ B
since if A were assigned both t and f, but B not assigned t, then there would be
a way for the premises to be true but not the conclusion. And that means, as the
reader may have noticed, that LP has no implication connective; the “material
conditional” p ⊃ q, defined as ¬p ∨ q, will fail to obey modus ponens,
A, A ⊃ B ⊬ B
or indeed most other behaviors expected from a conditional; see Beall, Forster,
and Seligman (2013).
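Since A ⊃ B just is ¬A ∨ B, the failure of disjunctive syllogism and the failure of material modus ponens are one and the same fact, and a search finds the same countermodel for both. This is a sketch under the same assumptions as before (tuple-encoded formulas, relational clauses, validity as preservation of "at least true"); `imp` and `first_counterexample` are illustrative names.

```python
from itertools import product

T, F = "t", "f"
LP_VALUES = [frozenset({T}), frozenset({F}), frozenset({T, F})]

def ev(formula, v):
    """Set of truth values related to `formula` under valuation `v`."""
    if isinstance(formula, str):
        return v[formula]
    op, *args = formula
    vals = [ev(a, v) for a in args]
    if op == "not":
        true, false = F in vals[0], T in vals[0]
    elif op == "and":
        true = T in vals[0] and T in vals[1]
        false = F in vals[0] or F in vals[1]
    elif op == "or":
        true = T in vals[0] or T in vals[1]
        false = F in vals[0] and F in vals[1]
    else:
        raise ValueError(f"unknown connective: {op}")
    return frozenset(([T] if true else []) + ([F] if false else []))

def imp(a, b):
    """Material conditional A ⊃ B, defined as ¬A ∨ B."""
    return ("or", ("not", a), b)

def first_counterexample(premises, conclusion, atoms, values=LP_VALUES):
    """First valuation with all premises at least true and conclusion not, or None."""
    for combo in product(values, repeat=len(atoms)):
        v = dict(zip(atoms, combo))
        if all(T in ev(p, v) for p in premises) and T not in ev(conclusion, v):
            return v
    return None

ds = first_counterexample(["A", ("or", ("not", "A"), "B")], "B", ["A", "B"])
mp = first_counterexample(["A", imp("A", "B")], "B", ["A", "B"])
classical = first_counterexample(["A", imp("A", "B")], "B", ["A", "B"],
                                 values=LP_VALUES[:2])
print(ds == mp, classical is None)  # True True
```

In the countermodel A is related to both values and B to f only, matching the informal argument in the text.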
Given the absence of a conditional in LP, one can try to add one.
One way to do so – the one favored by, for example, Routley, Brady, Meyer,
and Mortensen in their investigations of inconsistent mathematics – is to adopt
some version of implication from relevant paraconsistent logic, →, where if
A→B
holds then there must be some “meaningful” connection between the anteced-
ent and the consequent.27 This conditional obeys modus ponens,
A, A → B ⊢ B
among other properties; but it is still much weaker (or stronger, depending on
your viewpoint) than classical “material” implication, and is not entirely happy,
as we will see. There are other conditionals one can try adding to LP,
but none yet known is the “universal key, which opens, if rightly
operated, all locks.”28
Other paraconsistent approaches favor a different account of negation.
Asenjo (1966) states that the law of noncontradiction should not be a theorem
of any logic suitable for formalizing inconsistent theories. The pioneering work
of da Costa takes a similar attitude, where he states as his first adequacy condi-
tion on any paraconsistent logic that “¬(A ∧ ¬A) must not be a valid schema”
(da Costa, 1974, p. 498). Da Costa’s main logics are known as the C-systems,
and have since been subsumed, with added sophistication, under a richer
family of logics known as logics of formal inconsistency, or LFIs.
In LFIs the idea is to allow explosion “in a local or controlled way” (Carnielli
& Coniglio, 2016, p. 31).29 A characteristic feature of this approach is to
include in the language an operator ◦ that may be read as “is consistent,” and
so is designed to separate out the “normal” or well-behaved sentences from
the paradoxical ones, demarcating sentences that may be reasoned about clas-
sically. (Note that if ◦(p) holds then ◦(¬p) holds, too.) So while in an LFI the
principle of explosion is invalid,
p, ¬p ⊬LFI q,
28 As Routley once envisioned (Routley, 1977, p. 893). Alternatively, the idea of sticking with
LP alone as a logic (or even the weaker FDE), without a new conditional, has been favored in
various moods by Beall (2013a, 2018), Omori (2015), and others, following a suggestion of
Laura Goodship in Goodship (1996). See Priest (2017b).
29 See Carnielli, Coniglio, and Marcos (2007) and Marcos (2005).
30 For recent attempts at adding a consistency operator, see Barrio, Pailos, and Szmuc (2017); for
problems with this and other strategies, see Rosenblatt (2021a) and cf. Omori and Weber (2019).
31 The logic CLuNs is often taken as the lower limit logic for such an approach; this logic, or
variants thereof, have been independently proposed across the paraconsistency literature. It is
equivalent to LFI1 as well as da Costa and D’Ottaviano’s J3, and has the same underlying
algebra as Łukasiewicz’s Ł3 (Carnielli & Coniglio, 2021, p. 192). The underlying algebras of
Ł3 and “relevant” RM3 are also term equivalent, assuming the presence of a ⊥ constant. (Thanks
to Omori here.)
32 A similar proposal is Priest’s “minimally inconsistent LP,” which works by selecting the least
inconsistent models and recovering more classical reasoning for cases that appear to be safe, but
where such reasoning may become unacceptable if contradictions were to arise (Priest, 2006,
ch.16).
mental, rather than ontological, focus (Carnielli & Coniglio, 2016, pp. 4, 14).
Or again, paraconsistent or inconsistent theories can capture the content of
our cognition or experience of apparently contradictory phenomena; this is
the position taken by Mortensen (1995, 2010). Paraconsistency in mathemat-
ics might then be an account of mental constructions or imaginings, leaving
the logic of mind-independent truth values to classical logic or something
else.
Similar comments will apply to other positions. If you are a structuralist
about mathematics, you can be a structuralist about inconsistent mathemat-
ics; mutatis mutandis for fictionalism; and so forth. Inconsistent mathematics
may add some pressure to these positions by adding some unexpected commit-
ments, but where to draw the line has always been a question for such positions.
For example, perhaps you are happy to say that natural numbers really truly
exist, but become uncomfortable with a similar platonism about large transfinite
cardinals. So it is here for realism, or not, about objects with inconsistent
properties. We will return to this issue in §4.4 on the topic of indispensability
arguments.
The relationship between language and reality (whatever that is) does let
us answer a question about the meaning of expressions like “inconsistent
mathematical structure.” Inconsistency is a linguistic property, concerning a
sentence and its negation; whereas a structure, or its parts, cannot be “negated”
in a meaningful way. So it is worth clarifying that any talk about inconsist-
ent objects is a kind of shorthand for talking about an inconsistent theory, or
description, of an object. Whether one thinks that the inconsistency comes from
our mind, language, or the object itself, is a further question. As Priest puts it,
concerning the question of the “consistency of the world,”
[i]f something is true, there must be something that makes it so. Call this
the world. If some contradictions are true, then the world must be such as to
make this the case. In this sense, the world is contradictory. What it is in the
world that makes something true is another matter. (Priest, 2006, p. 299)
2 Set Theory
One of the first and most enduring motivating projects in paraconsistent math-
ematics has been to address the infamous paradoxes of “naive” set theory,
discussed in §1.1. For, ever since Frege’s Grundgesetze der Arithmetik at the
turn of the twentieth century, it has been clear that a basic assumption about
sets is extremely plausible and yet inconsistent. This is the assumption that a
set is an object exactly composed of all and only the things sharing a common
property; Frege expressed this as
Basic Law V The set of all As is identical to the set of all Bs if and only if all
As are Bs and all Bs are As
This sounds very close to being some kind of trivially true or analytic state-
ment, and indeed Frege intended it to be a purely logical truth. The law can be
expressed as two principles:
Comprehension There is a set {x : A(x)} of all and only the xs that are A
Extensionality Sets that have all and only the same members are identical
R ∈ R and R ∉ R.
This is a contradiction that appears to follow from very simple reasoning from
true premises.
That is not all. By extensionality, sets with the same elements are identical,
and conversely, sets with different elements are not identical. Since we have
shown that the Russell set has different members from itself, that means
R ≠ R.
U ≠ P(U)
For Frege, the extension of a set (the “list” of all its members) automatically
determines its anti-extension (the “list” of all its nonmembers). Classically, the
claim that the set of As comprises all and only the As is equivalent to the
claim that the set of non-As comprises all and only the ¬As. On a paraconsistent
account of negation, though, this might no longer be automatic. The
extension of a predicate may be independent of the anti-extension of that pred-
icate, if contradictory membership is allowed. Should membership in the As
immediately entail nonmembership in the non-As?
Questions like this start to show that any details of a paraconsistent set theory
will depend very much on how the axioms are formulated. In the question just
posed, it is a matter of whether the biconditional in the axiom of comprehen-
sion contraposes34 or not; and here, the approaches of various paraconsistent
mathematicians have differed. So, just as there is no one paraconsistent logic,
there is no one paraconsistent set theory to describe to you in general terms; we
will need here, as in subsequent sections, to look at specific systems.
Let us follow a quasi-historical narrative, where a key takeaway is that,
whether moderate or radical, anyone who thinks paraconsistency is the “easy
way out” is wrong. Anyone looking for a miracle whereby we get a set the-
ory with all of the virtues and none of the vices will need to keep looking. As
Arruda and Batens already put it in the early days, “There can be no doubt that
the expectations of paraconsistent logicians with respect to set theory did not
come true” (Arruda & Batens, 1982, p. 132).
Some of the first work in paraconsistent set theory was done by da Costa
and Arruda going back to the early 1960s.35 Recall that the idea in da Costa’s
logics is to have an operator A◦ (now writing the ball as a superscript, following
the original notation) meaning that A is consistent; iterating, A◦◦ means that it
is consistent that A is consistent, and A^n abbreviates A◦...◦ where ◦ appears n
times. Then A^{(n)} abbreviates A^1 ∧ . . . ∧ A^n. This can even be iterated infinitely
many times. The propositional language of da Costa’s Cn , for 1 ⩽ n, takes
∧, ∨, ¬, and ⊃ as primitive; postulates are as follows.
1. A ⊃ (B ⊃ A)
2. (A ⊃ B) ⊃ ((A ⊃ (B ⊃ C)) ⊃ (A ⊃ C))
3. A ⊃ (B ⊃ A ∧ B)
4. A ∧ B ⊃ A, A ∧ B ⊃ B
5. (A ⊃ C) ⊃ ((B ⊃ C) ⊃ (A ∨ B ⊃ C))
6. A ⊃ A ∨ B, B ⊃ A ∨ B
7. ¬¬A ⊃ A
8. A ∨ ¬A
9. B^{(n)} ⊃ ((A ⊃ B) ⊃ ((A ⊃ ¬B) ⊃ ¬A))
10. A^{(n)} ∧ B^{(n)} ⊃ ((A ⊃ B)^{(n)} ∧ (A ∨ B)^{(n)} ∧ (A ∧ B)^{(n)})
along with the rule of modus ponens, that from A and A ⊃ B we can derive B.
Postulates for quantifiers include standard axioms and rules, plus consistency
shifts like
∀x(A(x))◦ → (∀x(A(x)))◦
34 That is, satisfies: if A then B, then if ¬B then ¬A. For a good account of different approaches to
models of paraconsistent set theory, see Libert (2005).
35 For recent scholarship on Arruda and her contributions to logic, see Lisboa and Secco (2022).
Arruda suggests that paraconsistency will “only subsist as far as professional mathematicians
occupy themselves with its problems” (Lisboa and Secco, 2022, §3, note 22).
C = {x : if x ∈ x then p},
where p can be any sentence whatsoever. Suppose C ∈ C. Then the set’s defin-
ing condition gives us that, if C ∈ C then p. On assumption, then, p. That is
unacceptable, since p could be anything, so C ∈ C cannot be – but it looks like
we just proved that, if C ∈ C, then p, which is exactly what it would take for
C ∈ C to be true. So we are in trouble, and it looks like negation had nothing
to do with it.
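A brute-force check can make the shape of this argument concrete. The sketch below is only an illustration under classical two-valued semantics, with hypothetical function names: it treats “C ∈ C” as a truth value c that the defining condition forces to agree with “if C ∈ C then p,” and it lists every classical assignment meeting that constraint.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Classical material conditional."""
    return (not a) or b


def curry_solutions():
    """All classical assignments (c, p) in which c, standing for 'C ∈ C',
    has the same truth value as 'if C ∈ C then p'."""
    return [(c, p) for c, p in product([True, False], repeat=2)
            if c == implies(c, p)]


solutions = curry_solutions()
print(solutions)
```

The only assignment that survives has p true, so classically the mere existence of C settles an arbitrary sentence p, and no negation sign appears anywhere in the check.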
Because of Curry’s paradox, then, even many paraconsistent systems cannot
accommodate the full comprehension principle.37 Arruda showed
that set theories based on da Costa’s systems under various formulations lead
to triviality, or at least something very close – the “unpleasant result” that there
is only one object, ∀x∀y(x = y). She and da Costa considered set theories based
on different logics, including logics without modus ponens, but similarly found
that, while these are nontrivial, in them “every two sets are identical,” which
is bad.
36 Here, I am mainly following Arruda (1989) (written in 1978; cf. Arruda (1979)). Da Costa ini-
tially worked with versions of Quine’s set theory NF, the topic of Arruda’s 1964 PhD thesis; she
eventually showed NF1 to be trivial in an unpublished talk from 1981 (D’Ottaviano & Carvalho,
2005, §2; Lisboa & Secco, 2022, §5).
37 One diagnosis is that postulate (2) for the conditional connective(s) in Cn is a form of
contraction; see §2.2.
Nevertheless, these investigations might be seen as some of the first bread-
and-butter mathematical work in this area. Since the logics in question are
paraconsistent and hence could presumably handle the Russell contradiction,
but seem to run into “unpleasantness” in the presence of a universal set, Arruda
came to wonder about the relationship between the Russell set and a universal
set, and in particular, whether it is possible to have one without the other. Under
more classical approaches, no sets are self-membered, so the Russell class is
the universe of all sets; but perhaps in a nonclassical setup matters could be
different?
Arruda shows that under relatively minimal conditions they cannot be kept
apart. Here is her original result, that the union of the union of the Russell set
is universal; see Arruda and Batens (1982).
Theorem ⋃⋃R = U.
• if x, y ∈ {z} then x = y.
by our fact about singletons. But from this identity and the fact that the Russell
set is a member of itself, we have R ∈ R ∪ {x}. So by 4 again, R ∈ {R ∪ {x}}
and hence by singleton facts,
R = R ∪ {x}. (5)
By that identity and the fact that the Russell set is still self-membered, we get
R ∪ {x} ∈ R, and so
{R ∪ {x}} ∈ R
There are several questions this exercise raises. We have gone through it
because this is an early and (relatively) isolated example of seriously think-
ing through some basic properties of an inconsistent object, and trying to draw
sensible conclusions about its structure. This is paraconsistency in mathemat-
ics.
Arruda and Batens then improved the result to show that U = ⋃R. Thus,
it seemed, a set theory with a Russell set would also have a universal set, and
then, depending on the other set existence axioms and the underlying logic,
be subject to Curry’s paradox. Still, the paraconsistent opportunity to study,
for example, the universal set, was there, and so it made more sense for da
Costa and those who followed to study paraconsistent versions of axiomatic set
theories that assume principles weaker than comprehension, but that can control
the existence of the Russell set or the universal set as additional axioms.38
At this fork in the road, then, one path points away from “naive” set theory
and considers instead paraconsistent versions of standard axiomatic set theo-
ries, like Zermelo–Fraenkel set theory, the theory proposed starting in 1908
to replace naive set theory, usually denoted ZFC (to include the axiom of
choice).39 This is a much more moderate project, in which we are dealing with
(presumably) consistent assumptions and a consistent theory, but nevertheless
embed it in a contradiction-tolerant logic – perhaps to demonstrate that the full
force of classical logic is excessive for the purposes of even standard mathe-
matics. As Carnielli and Coniglio put it, “Paraconsistent logic tries to find the
38 Da Costa proved nontriviality for a paraconsistent set theory in da Costa (1986). For more
see da Costa, Krause, and Bueno (2007) in Jacquette (2007) and Carnielli and Coniglio
(2016, ch. 8).
39 For reference, informally, the axioms of ZFC are, along with extensionality:
Pairing For any sets x, y there exists their pair z = {x, y}.
Union For any set there exists the set of all members of its members.
Powerset For any set there exists the set of all its subsets.
Separation For any set, there is a subset of it consisting of all and only the As.
Infinity There is an infinite set.
Replacement Every function with a domain also has a range.
Foundation No sets have infinitely descending membership chains.
Choice For every collection of non-empty sets, there is a function that chooses a member from
each.
Note that separation is a restricted form of comprehension, and would lead to Russell’s paradox
if the universe is a set. The standard reference is Jech (2003); cf. Bell (2005, p. 17).
C(x), x ∈ x, x ∉ x ⊢ B
cons = {x : C(x)}
existed (the set of all consistent sets), then C(cons) would be true and hence
cons ∈ cons, which is trivializing, given other ZF assumptions.
On a similarly ZFC-directed tack, a detailed study of models of paracon-
sistent ZFC is being undertaken using an algebra-valued approach, by Sourav
Tarafder, Giorgio Venturi, and Santiago Jockwich-Martinez, following Löwe
and Tarafder (2015); see Jockwich-Martinez and Venturi (2021). The idea here
is to take the Boolean-valued models of set theory developed for forcing and
independence proofs (see Bell, 2005) but replace the Boolean algebra with
one(s) more amenable to paraconsistency. These efforts, based in classical alge-
braic methods, are some of the most active in the field at time of writing, and are
especially promising ways of building connections between paraconsistency
and more mainstream work in set theory.
Using adaptive logic, Verdée has developed paraconsistent set theories in
Verdée (2013a, 2013b) and argued that in the spirit of the adaptive approach
they can provide a “pragmatic” foundation for mathematics. Returning to the
motive of restoring naive set theory, Batens (2020) outlines some desiderata
for a naive set theory to be “Fregean”: it matches, at least in terms of symbols,
the axioms of Frege’s system; it does justice to the notions of membership,
nonmembership, identity, and nonidentity, and most characteristically for the
adaptive approach, “if a set (or sentence) can be selected as consistent” for a
“systematic or formal reason, then the set (or sentence) is consistent.” So to
balance the desire to maintain Basic Law V and the inevitable contradictions
that result, “the natural reaction is to allow for contradictions but to minimize
them” (Batens, 2020, p. 912).
The hope is to recover an inconsistent but coherent naive set theory that
“serves the purposes a decent set theory should serve,” which on this approach
means including a consistent set theory. This would be, as Batens calls it, the
“adaptive gain.”
As with other attempts at paraconsistent set theory, Verdée found that the
adaptive approach runs into trouble from Curry’s paradox. This leads to a
choice between developing adaptive set theories that do not entirely meet
Batens’ Fregean desideratum, or else giving up any “adaptive gain.” Batens
suggests as a solution defining a cumulative sequence of abnormalities leading
up to a “sequential superposition” called AFS1 (adaptive Fregean set theory 1),
which would have many of the desired properties. As Batens admits, though,
a proof that “there is an adaptive gain in AFS1 may have the same complexity
as a proof that ZFC is consistent” and so this theory is presented circa 2020 at
the level of a hypothesis or program for further investigation.
To avoid Curry’s paradox, among other problems, one option touched on by
da Costa and Arruda as mentioned is working with systems that do not obey
modus ponens (Arruda, 1989, p. 112). Without a “detachable” conditional, as
it is sometimes said, there is no worry from Curry’s paradox or the like.
There is indeed a very natural paraconsistent logic with this feature, namely,
LP, which we met in Section 1.2.1. Asenjo considers using this “antinomic logic”
for recovering naive set theory, but states that “Cantor’s axiom remains for us a
paradise lost, inaccessible to our most earnest efforts” (Asenjo, 1989, p. 399).
Nevertheless, by 1991, Greg Restall had made some strides in this direction,
which we now consider.
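To see concretely why LP has no detachable conditional, here is a minimal sketch. It assumes the usual three-valued presentation of LP (values t, b, f, with t and b designated) and the material conditional A ⊃ B defined as ¬A ∨ B; the function names are hypothetical. It searches for failures of modus ponens and then checks that the Curry condition can hold while p is simply false.

```python
# Truth values of LP, ordered f < b < t, where "b" means "both true and false".
ORDER = {"f": 0, "b": 1, "t": 2}
DESIGNATED = {"t", "b"}  # values that count as at least true


def neg(a):
    """Negation swaps t and f and leaves b fixed."""
    return {"t": "f", "b": "b", "f": "t"}[a]


def conj(a, b):
    """Conjunction takes the lesser value."""
    return min(a, b, key=ORDER.get)


def disj(a, b):
    """Disjunction takes the greater value."""
    return max(a, b, key=ORDER.get)


def material(a, b):
    """The conditional available in LP: A ⊃ B defined as ¬A ∨ B."""
    return disj(neg(a), b)


def bicond(a, b):
    """Material biconditional, as a conjunction of two conditionals."""
    return conj(material(a, b), material(b, a))


def detachment_failures():
    """Pairs (A, B) where A and A ⊃ B are designated but B is not."""
    return [(a, b) for a in ORDER for b in ORDER
            if a in DESIGNATED and material(a, b) in DESIGNATED
            and b not in DESIGNATED]


print(detachment_failures())

# Curry's set: the value c of "C ∈ C" must match that of "C ∈ C ⊃ p".
c, p = "b", "f"
print(bicond(c, material(c, p)) in DESIGNATED, p in DESIGNATED)
```

The single failure occurs when A is both true and false and B is false. The same assignment satisfies the Curry condition without making p true, which is the sense in which a non-detachable conditional removes the worry.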
With no workable notion of modus ponens, there is not much hope of
using LP set theory to prove theorems, at least not in the “natural” deductive
way that Arruda’s abovementioned theorem is derived. But models of LP are
easy to devise and work with, and there have been ongoing efforts to understand
paraconsistent set theory in this way. Here is how models of (naive)
set theory in LP work, following Restall (1992, p. 422; cf. Priest, 2006, p.
223). It is worth noting that this is an entirely classical setup for modeling a
theory.
Given a first-order language with ∧, ∨, ¬, ∀, ∃, ∈ and variables x, y, z, ..., an LP
model structure consists of a non-empty domain of objects and an interpretation
such that, for each a, b in the domain, the value of a ∈ b is either true or
false (or both). Variables x, y of the language are assigned objects a_x, b_y in the
domain, and the truth value of x ∈ y is exactly that of a_x ∈ b_y. Truth values are
then assigned recursively as in §1.2.1 earlier for LP, now with clauses for the
quantifiers:
The existential quantifier will have dual properties, so that ∃xA(x) is ¬∀x¬A(x).
Naive set theory in LP is obtained by taking every instance of comprehen-
sion40
∃y∀x(x ∈ y ≡ A(x))
∀z(z ∈ x ≡ z ∈ y) ⊃ x = y
M1
a ∈ a is both true and false      a = a is both true and false
a ∈ b is false                    a = b is false
b ∈ a is both true and false      b = a is false
b ∈ b is true                     b = b is both true and false
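The table for M1 can be checked mechanically. The sketch below is an illustration resting on two assumptions that are not spelled out in the surrounding text: that ∀ takes the least value over the domain (and ∃ the greatest), as is usual for LP, and that x = y abbreviates ∀z(x ∈ z ≡ y ∈ z). Under these assumptions the membership column alone reproduces the identity column, and element a turns out to witness the Russell instance of comprehension. All names are hypothetical.

```python
# LP values: T (true only), B (both true and false), F (false only).
ORDER = {"F": 0, "B": 1, "T": 2}
DESIGNATED = {"T", "B"}
DOMAIN = ("a", "b")

# MEMBER[(x, y)] is the value of x ∈ y, copied from the left column of M1.
MEMBER = {("a", "a"): "B", ("a", "b"): "F",
          ("b", "a"): "B", ("b", "b"): "T"}


def neg(v):
    return {"T": "F", "B": "B", "F": "T"}[v]


def conj(*vs):
    return min(vs, key=ORDER.get)


def disj(*vs):
    return max(vs, key=ORDER.get)


def bicond(v, w):
    """Material biconditional: (¬v ∨ w) ∧ (¬w ∨ v)."""
    return conj(disj(neg(v), w), disj(neg(w), v))


def forall(f):
    """ASSUMPTION: ∀ takes the least value over the domain."""
    return conj(*(f(d) for d in DOMAIN))


def exists(f):
    """ASSUMPTION: ∃ takes the greatest value over the domain."""
    return disj(*(f(d) for d in DOMAIN))


def identity(x, y):
    """ASSUMPTION: x = y abbreviates ∀z(x ∈ z ≡ y ∈ z)."""
    return forall(lambda z: bicond(MEMBER[(x, z)], MEMBER[(y, z)]))


def russell_instance():
    """Value of ∃y∀x(x ∈ y ≡ ¬ x ∈ x) in M1."""
    return exists(lambda y: forall(
        lambda x: bicond(MEMBER[(x, y)], neg(MEMBER[(x, x)]))))


for x in DOMAIN:
    for y in DOMAIN:
        print(f"{x} = {y}:", identity(x, y))
print("Russell instance:", russell_instance())
```

With a different definition of identity the right-hand column could come out differently, so the agreement here should be read as a consistency check on the table rather than as a derivation of it.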
∀x(x ∈ R ≡ x ∉ x)
Russell set is itself a witness for this claim) and so by the definition of identity,
R ≠ R, which proves as a corollary that
∃x∃y(x ≠ y)
are satisfied in every LP model: that is, there is a universal set and an empty
set. The basic idea of the proof is that, if there were no empty set, then the set
of all empty sets would be empty; and if there are no universal sets, then the
set of all nonuniversal sets is universal.41
The question arises of what, if anything, is not a theorem of this set theory.
It turns out that the (classical) axiom of foundation,
which implies that no sets are self-membered, is not a theorem. Indeed, the little
model M1 is enough to provide a counterexample, since element b is not empty
but has no “least” member.
So the theory is inconsistent but not trivial – a paraconsistent theory. One
can go on to show that the axioms of Zermelo–Fraenkel set theory (without
foundation) are all theorems. For example, the axiom of infinity says that there
is a set containing the empty set and closed under a notion of successors (y+
the successor of y),
This holds simply because the universal set is a witness: the universal set is
closed under . . . everything. So the first conjunct and the second disjunct are
automatic. We already saw in §1.1 how, for perhaps more intuitive reasons, the
existence of a universal set is already the existence of an infinite set.
On this setup, while the classical axioms of ZF (minus foundation) are the-
orems, it is not the case that all of their classical consequences are too. Indeed
it is unclear if much at all of real mathematical use can be extracted, since LP
is so weak; see Thomas (2014b). Undaunted, Priest has pursued increasingly
sophisticated methods for manipulating these models and found that there is at
least one nontrivial model of LP set theory that satisfies every theorem of ZF,
too (Priest, 2006, p. 257; 2017b, sec.11). Thus, it is possible that set theory in LP
41 To work in LP, these ideas need to be expressed as statements about models; see Restall (1992,
p. 426).
M2
a ∈ a is both true and false a = a is both true and false,
(A ∧ ¬A) → B
will fail because there are points in the model where the antecedent holds
without the consequent. As in §1.2.4, logics in the family that Routley was
considering are now called relevant logics. The application proposed by
Routley, and then pursued especially by Ross Brady, is relevant set theory.
It “meets the paradoxes head-on, taking the paradoxical arguments as proofs
and the contradictory conclusions as holding in the theory” (Routley, 1977, p.
912).
So far, the set theories we have seen cannot accommodate Curry’s paradox,
or can only do so at the cost of having no conditional. A well-investigated
diagnosis here is to say that any conditional operator in comprehension cannot
obey contraction, the general principle that two uses of a premise are the same
as one. Curry’s paradox needs contraction to go through. In some weak relevant
logics, though, the principle of contraction is invalid, in the form
p → (p → q) ⊬ p → q.
This makes it possible to face Curry’s paradox more directly. We can have full
comprehension, but – as has been the paraconsistent plan all along – it must be
reasoned about more carefully.42
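It may help to see exactly where contraction enters. The following is a sketch of the usual Curry derivation, writing C for the set {x : x ∈ x → p}; the step numbering is introduced here only for illustration.

```latex
\begin{align*}
  1.\;& C \in C \leftrightarrow (C \in C \to p)
      && \text{comprehension} \\
  2.\;& C \in C \to (C \in C \to p)
      && \text{from 1, left to right} \\
  3.\;& C \in C \to p
      && \text{from 2, by contraction} \\
  4.\;& C \in C
      && \text{from 1 (right to left) and 3} \\
  5.\;& p
      && \text{from 3 and 4, by modus ponens}
\end{align*}
```

In a logic where p → (p → q) ⊬ p → q, the move from line 2 to line 3 is unavailable, and the derivation stalls there even though comprehension and modus ponens are both retained.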
One such relevant logic without contraction is, for not very good reasons,
called DKQ and is given by closure of the following axioms and rules.43
42 Matters do not end there. While the classical approach drops comprehension and keeps exten-
sionality, keeping comprehension puts pressure on extensionality, as further Curry-esque
paradoxes show. Other nonclassical set theories keep comprehension but drop extensionality. It
appears that relevant set theory is highly unusual in that it can keep both axioms, but see Øgaard
(2016) and Weber (2021, ch. 4).
43 It is a very close relative of the logic DLQ, or dialectical logic with quantifiers, introduced in
Routley and Meyer (1976). It is also a very close relative of the logic DJQ, studied by Brady
(2006). As it extends the “extensional” connectives of LP, we define A ∨ B := ¬(¬A ∧ ¬B) and
again ∃xA(x) := ¬∀x¬A(x).
A1 A → A
A2 (A → B) ∧ (B → C) → (A → C)
A3 A ∧ B → A
A4 A ∧ B → B
A5 (A → B) ∧ (A → C) → (A → (B ∧ C))
A6 A ∧ (B ∨ C) → (A ∧ B) ∨ C
A7 ¬¬A → A
A8 (A → ¬B) → (B → ¬A)
A9 A ∨ ¬A
A10 ∀xA(x) → A(a)
A11 ∀x(A(x) → B) → (∃xA(x) → B) (x not free in B)
Rules:
∃y∀x(x ∈ y ↔ A(x))
all now suitably phrased with a robust relevant conditional. The extensionality
axiom states that if sets have the same members, they may be intersubstituted.
The comprehension axiom has no restrictions: every property whatsoever
determines a set. One very striking example is worth observing.
There is a set comprising all and only the things that are not members
of that very set,
∀x(x ∈ Z ↔ x ∉ Z)
That is, there is a set identical to its own complement. The extension of this
set completely coincides with its anti-extension. This is the Routley set, since
it was observed in Routley (1977) (although it may have been spotted earlier).
It is, as Routley said, a “completely bizarre” set, but there it is – a highly novel
object for mathematical investigation.
Less flamboyantly, on this approach several of the results we saw earlier
from LP set theory can be re-obtained via axiomatic deductions rather than
model manipulations, and improved. There is an empty set, a universal set, and
a Russell set. All the axioms of ZF (but still not foundation) are derivable, as
are some of their most elementary consequences.44 How many consequences
is not known precisely, although it seems clear that the answer will not be “all
of classical ZF”; but it is also clear that this can prove more than ZF (later).
Routley went as far as to attempt to derive the axiom of choice as a theorem,
which was rather audacious considering choice is known to be independent in
classical set theory.45
Beyond the axioms, one would like to know what standard results can be
derived. In particular, the centerpiece of set theory is the theory of ordinal num-
bers, including the finite ones that ground arithmetic, and also the transfinite
ones that keep track of processes beyond:
0, 1, ..., ω, ω + 1, ..., ω^ω, ..., ω^(ω^ω), ..., ω^(ω^(ω^...)), ...
To show that the ordinals do their job, one proves that the set of all ordinals, Ω,
is structured in the right sort of way, where the ordinals form a single straight
line and any non-empty subset of ordinals has a least member. This can be done
in relevant set theory, given some suitable definitions – where each ordinal α
is the first one after all those before it, α = {β : β precedes α}. Strikingly,
though, once these properties are established it also establishes that Ω itself is
an ordinal,
Ω ∈ Ω (6)
since it is the first one after all those before it, Ω = {α : α precedes Ω}. But,
by the structure of the ordinals, no ordinal is self-membered,
Ω ∉ Ω (8)
various properties that are thought to “reflect” properties of the classically inef-
fable universal set. Proving they exist in relevant set theory is straightforward,
since Ω is the largest number; so it has all the properties one could dream of
for a large number.47
So we have theorems, some of which are also false. Which sentences are the
non-theorems? The language of relevant set theory is now substantially more
expressive, so that while in LP there is no sentence that cannot be satisfied by
some model (that is the point of Weir’s one-element counterexample), in DKQ
set theory an “absurd” formula may be given:
⊥ := ∀x∀y(x ∈ y).
This formula is absurd because ⊥ → A is true for any A. (Just let y = {z : A}.)
So there are some useful theorems here, as well as some absolutely unsatisfiable
sentences.
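The parenthetical hint about ⊥ can be unpacked as a short chain of conditionals. This is a sketch only: the term a is an arbitrary name introduced for illustration, and the last step assumes that the conditional can be chained, as axiom A2 suggests.

```latex
\begin{align*}
  \bot &\to \forall x\,\forall y\,(x \in y)
      && \text{definition of } \bot \\
  \forall x\,\forall y\,(x \in y) &\to a \in \{z : A\}
      && \text{instantiating } x := a,\ y := \{z : A\} \\
  a \in \{z : A\} &\to A
      && \text{comprehension, with } z \text{ not free in } A \\
  \bot &\to A
      && \text{chaining the conditionals}
\end{align*}
```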
Perhaps the most important result to date in paraconsistent set theory is Ross
Brady’s proof that naive set theory in a weak relevant logic has nontrivial mod-
els; the axioms of comprehension and extensionality, and their consequences
in relevant logic, can be true without sliding into absurdity. This proof was
devised in the late 1970s and has been refined continuously since.48 He con-
structs a classical model, M, which satisfies all the theorems of naive set
theory, but in which at least one sentence (⊥) is not satisfied. Because of the
importance of this result I will go through its main ideas, with details in an
Appendix.
The idea of the proof is to give a persistence argument, considerably extend-
ing Kripke’s fixed point method. Kripke’s construction is one-dimensional, a
straight line from a “ground” model to a fixed point. Brady takes the fixed
point at the end of the construction and uses it as the ground model for a
two-dimensional construction. A model is determined, made up of a transfi-
nite sequence of models, leading to the construction of a transfinite sequence
of transfinite sequences. Truth values for propositions are then shown to be
preserved along the sequence. An inductive argument is carried out on this
structure to show that it is a model of naive set theory, but that some sentences
fail to be true in the model, in particular, ∀x∀y(x ∈ y) and a few others; see
Appendix.
Philosophically, what does this proof show? It is a relative consistency proof.
It is not, I think, an attempt to show what the universe of relevant set theory
is “really” like. The proof is instrumental, showing that, if one believes that
classical set theory is consistent, then relevant set theory can be represented
within it, and given a nontrivial classical model. For the vast majority of people,
who believe in classical set theory, this shows the relative reliability of Routley-
Brady set theory. But what about someone who casts (extreme and vitriolic)
doubt on aspects of classical set theory – someone like, say, Routley? The point
is not lost on Brady:
It may seem ironic that hierarchies are being used to prove a result which is
designed to eliminate certain hierarchies. However, once our result is estab-
lished, hopefully one can chose a weaker and non-hierarchical logic to work
in. (Brady, 1989, p. 433)
49 For very reasonable reasons to doubt recapture, see Thomas (2014a). There is also the ever-
present worry about Curry’s paradox. Contraction for → is avoided, but the theory still features
structural contraction (if A, A ⊢ B then A ⊢ B) and then runs into trouble if the theory can
internalize its own notion of validity. For more on this, and for my best, and still incomplete,
attempt with this, see Weber (2021, chs. 4, 5).
which is derived from the Routley set Z in §2.2. Every member of X is in Z(X)
iff it is not; but it must be one or the other, so it is both, as is now familiar. By
extensionality, for any non-empty X, then
Theorem (Cantor) For any set X and any mapping f from X to P(X), there is
a y ∈ P(X) such that f(x) ≠ y for every x ∈ X.
Since this is true for any mapping whatsoever, a fortiori no bijection covers
P(X) after all, even allowing for inconsistency. Cantor’s theorem is still true,
even in the presence of inconsistency: recaptured.
Of course, this argument does not rule out that there also is a bijection. It can’t
rule that out because there is a bijection at the level of the universe U. This argu-
ment is not really a reconstruction of Cantor’s proof, so much as a new proof
using paraconsistency-specific methods. So in “recapturing” Cantor’s theorem
in the previous paragraph, what has been shown?
If you are familiar with other episodes in philosophical logic, consider, by
analogy, how Tarski responded to the liar paradox. He said that there must be
an infinite hierarchy of metalanguages, each to look down on the previous,
because if there ever were some flat or final level there would be a contradic-
tion. Cantor’s transfinite is in many ways the same response, but to Russell’s
paradox instead of the liar. Running with the analogy, perhaps the entire appear-
ance of transfinite set theory – the mathematics beyond infinity – is a way of
making classical sense of an inconsistency. There is “too much” information
in one place, reasons the classicist, so there must be “many places” for the
information to be spread across.
The paraconsistent mathematician can agree with this classical appraisal,
with the caveat that the many places are also just one inconsistent place. One
inconsistent place is many places, too. Far from saying that the inconsistent has
no structure, it turns out that the inconsistent has transfinite structure. Along
these lines, perhaps all sets are no bigger than the first place the contradiction
occurs, perhaps at the first (and last) transfinite ordinal, with the thought that
ω = Ω.
Maybe there is only one inconsistent infinity after all, which is what almost
everyone since Aristotle thought.50
Matters do not end there. What if the inconsistency (if there is one) occurs
even sooner, so that the last ordinal is some finite natural number? This will be
the topic of the next section.
50 On this “axiom of countability” see also Priest (2017a). As Scott memorably puts it when sur-
veying the vast array of incompatible models of ZF delivered by forcing methods, “Perhaps we
could be pushed in the end to say that all sets are countable . . . when at last all cardinals are
absolutely destroyed” (from the introduction to (Bell, 2005, p. xv)).
These are the extensional connectives. Their values can be calculated at a point,
that is, if A is true at a point then ¬A is false at that same point.
To model the DKQ conditional →, which is not truth functional, we use a truth
functional conditional _. The truth table for it looks like this:
_ t b f
t t f f
b t b f
f t t t
This is the conditional from a logic called RM3. The DK conditional → is a modal
operator (at least in Routley–Meyer semantics) taking its meaning from varia-
tions across possible and impossible worlds; but it may be interpreted soundly
by the RM3 truth tables, making RM3 “the laboratory of relevant logic” accord-
ing to Meyer (according to Dunn, 1979, p. 81; cf. Priest, 2008, p. 125). Even
though many of the sentences that are true according to the RM3 operator are not
true for naive set theory (for example, _ contracts, triggering Curry’s paradox,
and so cannot simply be the conditional for our logic), still all DK theorems
are RM3 theorems.
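Both claims about the RM3 table can be spot-checked by exhaustive search. The sketch below assumes that rows of the table are antecedents and columns are consequents, that t and b are designated, and that ∧, ∨, ¬ behave as in LP; the names are hypothetical. It confirms that the propositional axioms A1 to A9 listed in §2.2 always take a designated value, and that the table validates contraction.

```python
from itertools import product

VALUES = ("t", "b", "f")
ORDER = {"f": 0, "b": 1, "t": 2}
DESIGNATED = {"t", "b"}

# ARROW[antecedent][consequent], copied from the table (rows read as antecedents).
ARROW = {
    "t": {"t": "t", "b": "f", "f": "f"},
    "b": {"t": "t", "b": "b", "f": "f"},
    "f": {"t": "t", "b": "t", "f": "t"},
}


def imp(a, b):
    return ARROW[a][b]


def neg(a):
    return {"t": "f", "b": "b", "f": "t"}[a]


def conj(a, b):
    return min(a, b, key=ORDER.get)


def disj(a, b):
    return max(a, b, key=ORDER.get)


def valid(formula, arity):
    """True iff the formula is designated under every assignment."""
    return all(formula(*vs) in DESIGNATED
               for vs in product(VALUES, repeat=arity))


AXIOMS = {
    "A1": (lambda a: imp(a, a), 1),
    "A2": (lambda a, b, c: imp(conj(imp(a, b), imp(b, c)), imp(a, c)), 3),
    "A3": (lambda a, b: imp(conj(a, b), a), 2),
    "A4": (lambda a, b: imp(conj(a, b), b), 2),
    "A5": (lambda a, b, c: imp(conj(imp(a, b), imp(a, c)),
                               imp(a, conj(b, c))), 3),
    "A6": (lambda a, b, c: imp(conj(a, disj(b, c)),
                               disj(conj(a, b), c)), 3),
    "A7": (lambda a: imp(neg(neg(a)), a), 1),
    "A8": (lambda a, b: imp(imp(a, neg(b)), imp(b, neg(a))), 2),
    "A9": (lambda a: disj(a, neg(a)), 1),
}


def contraction_holds():
    """Rule form: whenever p → (p → q) is designated, so is p → q."""
    return all(imp(p, q) in DESIGNATED
               for p, q in product(VALUES, repeat=2)
               if imp(p, imp(p, q)) in DESIGNATED)


for name, (formula, arity) in AXIOMS.items():
    print(name, valid(formula, arity))
print("contraction:", contraction_holds())
```

Checking the axioms is of course weaker than checking every theorem, since the rules and quantifier axioms are left out; the point is only to illustrate why the table is sound for the axioms yet too strong, because of contraction, to serve as the conditional of naive set theory.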
In the model, values for a conditional at a point are determined by what
happens previously. A conditional is true at a point iff it is true (according to
the truth table) at all earlier points; it is false if there is a counterexample at
some earlier point. What truths persist to the limit, so to speak, are the truths.
So the models M_{τ,λ_τ} are fixed points, leading to the construction of a transfinite
sequence of transfinite sequences,
Truth values for propositions are shown to be preserved along the sequence,
thus establishing properties of the model M by induction. The proof proceeds
by a series of lemmas, starting at Brady and Routley (1989, p. 422) and Brady
(2006, p. 205). The first shows persistence.
Then the next lemma states that, if ⟨α, β⟩ ⩽ ⟨γ, δ⟩, then M_{α,β} ⩽ M_{γ,δ}.
Lemma 3 (Fixed Points) For all sequences M_{τ,0}, M_{τ,1}, ..., M_{τ,λ_µ}, ... there is
a countable ordinal λ_τ that is a fixed point; that is,
M_{τ,λ_τ} = M_{τ,β}
whenever λ_τ < β. Further, some λ_τ is the first such fixed point. For sequences,
there is a first ordinal κ such that
M_{κ,λ_κ} = M_{κ+1,λ_{κ+1}}.
A double induction, which is divided into five cases (each case itself having
multiple subcases) to cover successor and limit ordinals (Brady, 1989, p. 445),
then determines the model structure
with M_{κ,λ_κ} called the limit structure. Finally, we can define validity in M: a
formula A is valid in M iff ⟦A⟧ = t or b at the limit structure, and invalid in M
otherwise.
Theorem 4 The axioms of naive set theory are valid in M and all the rules
preserve truth.
3 Arithmetic
It is not so hard to imagine that, somewhere in the vast transfinite towers of
Cantorian set theory, in some infinitely large sets that exceed even very immod-
est conceptions of infinity, there may be some inconsistency. But arithmetic is
the theory of the finite counting numbers 0, 1, 2, 3, . . .. It is harder to imagine
that there is any inconsistency in arithmetic. And yet it is here that some of
the deepest philosophical and technical investigations using paraconsistency
in mathematics have been made.
Hilbert’s program was a call in the first decades of the twentieth century to
show that mathematics is complete, consistent, and decidable, and to prove
it using finitary methods. Gödel proved in 1931 in his first incompleteness
theorem that for any consistent axiomatization of arithmetic, there is some sen-
tence (roughly, “this sentence is not provable”) that can be neither proved nor
its negation proved. In his second incompleteness theorem, he further showed
that no consistent axiomatization of arithmetic can prove itself to be consistent.
Gödel’s theorems showed that Hilbert’s program – as standardly understood –
is impossible.51 It turns out that both of Gödel’s theorems are fertile grounds
for paraconsistency.
While the standing wisdom on the first theorem is to accept that, as the name
suggests, arithmetic is consistent but incomplete, an alternative interpretation
j<j
52 Terminology in this area is, yes, inconsistent. Elsewhere, da Costa uses the term “antinomy” to
denote “bad” contradictions: “An antinomy implies triviality. A paradox is not in general an
antinomy” (da Costa, 1974, p. 498). This is not Asenjo’s (prior) usage.
Asenjo’s work is bold, creative, and ahead of its time. The development of
arithmetic in LP has been more recently and prominently studied by Priest.
Asenjo is forceful about the need for antinomic mathematics, but less clear on
specific justifications for it or how it would work. Priest tells the following
story.53
53 In Priest (1994a, 1994b, 1997, 2000) and Priest (2006, ch. 17). It appeals to finite models,
something we already saw for LP set theory in §2.1.
Consider a natural number n so large that, as Priest puts it, it has “no phys-
ical meaning or psychological reality.” Suppose it is a number that is simply
larger than anything that could be specified before the end of the universe. If we
suppose n is the least such number, this comes perilously close to something
like Berry’s paradox – giving a precise description of the first number too big
to give a precise description – and so we might entertain the idea that it is the
least inconsistent number. In particular, this number has the property that
n = n + 1.
If there is such a number, then we can pursue the intuition further. With N the
(infinite) set of natural numbers, consider the set
Nn = {m ∈ N : m ⩽ n}.
These are the (presumably) consistent numbers up to n. But since n is its own
successor, for any number k > n we might think of k as also being identical to
n. That is, in being the first inconsistent number, this is also (in some sense)
the last number.54 Anything true (or false) of any k > n is also true (or false)
of n. The picture might be illustrated like this:
In addition to looping at n, one might wonder whether the line also continues,
since all k > n are after all distinct, k ≠ n. They are also not, k = n (and so n ≠
n), but what precisely to make of this structure past the point of inconsistency
is up for further discussion.
Getting more precise: let N be the set of sentences in the language of first-
order arithmetic that are true about N, and Nn be the set of true sentences about
Nn . Some rather astonishing results follow (Priest, 1994a, p. 337).
The theory Nn , as governed by the logic LP, is inconsistent but not trivial.
It is inconsistent because it contains sentences that say some number both is
and is not its own successor. It is nontrivial because all the numbers m < n
are consistent, in the sense that if, for example, m is an even number, a sen-
tence saying m is not even is not in Nn . That is, N and Nn agree up to the first
inconsistency.
The theory Nn is complete and decidable. It is complete because every sen-
tence of N is a sentence of Nn : if A is a sentence about numbers before n then
both theories say exactly the same thing; and if A is a sentence about numbers
after n (which are all, really, just n itself) then Nn will at least agree with N
(although it might say more). The theory is decidable because the structure it
describes is finite. It is therefore also axiomatizable.
And the theory Nn can represent itself. It contains a truth predicate and its
own proof predicate. It proves its own soundness, its own nontriviality (by finite
means), and its Gödel sentence (“this sentence is not provable”) as well as its
negation.
These properties can be rigorously proved to hold in an LP-model of arithme-
tic, using a technique called collapsing.55 The rough idea is to take the standard
natural numbers up to n, and “collapse” the model after this, so that all subse-
quent numbers are interpreted as identical and not identical to n. To make this
precise and a little more general, consider some equivalence relation ∼ on the
numbers, and break the numbers into equivalence classes [n] = {m : m ∼ n},
giving a quotient algebra. Then a collapsed interpretation is obtained by, effec-
tively, treating any numbers in the same equivalence class as identical, an object
that has all the properties of all its members. Notably, nonidentities of members
of the equivalence classes are inherited by the classes themselves: if x ≠ y and
x, y ∈ [z] then [z] ≠ [z].
The key result from this construction is as follows.
Collapsing Lemma If M is a model and M ∼ is its collapse, everything true
(or false) in M is still true (or false) in M ∼ .
The proof is an induction on formulas, which turns on the monotonicity of
the connectives: once truth values go in, they never go out. It is important to see
that this would not work with classical negation, since adding more negation
facts to a theory would rule out previous ones – or destroy the theory.
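The collapse can be made concrete with a small sketch. It assumes the particular equivalence relation that identifies every k ⩾ n with n, and it evaluates only atomic equations; the function names (collapse, eq_value, preserves_standard_facts) are illustrative and not drawn from Priest's presentation.

```python
def collapse(k, n):
    """Send a standard number k to its representative in the collapsed model."""
    return min(k, n)


def eq_value(a, b, n):
    """LP value of 'a = b': 't' (true only), 'f' (false only), 'b' (both)."""
    ca, cb = collapse(a, n), collapse(b, n)
    # True iff the two equivalence classes coincide.
    is_true = ca == cb
    # False iff the classes contain distinct members: always so when the
    # classes differ, and also when both are [n], since n and n + 1 are
    # distinct numbers that both belong to [n].
    is_false = ca != cb or ca == n
    if is_true and is_false:
        return "b"
    return "t" if is_true else "f"


def preserves_standard_facts(n, bound):
    """Check the Collapsing Lemma for equations between numbers below bound."""
    for k in range(bound):
        for m in range(bound):
            value = eq_value(k, m, n)
            if k == m and value not in ("t", "b"):
                return False  # a standard truth was lost
            if k != m and value not in ("f", "b"):
                return False  # a standard falsehood was lost
    return True
```

With n = 10, for instance, eq_value(3, 4, 10) stays false only, while eq_value(10, 11, 10) comes out both true and false, which is the sense in which n is its own successor and also is not. Nothing that was true (or false) of the standard numbers is lost; the collapse only adds values.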
A collapsed model can be an interpretation of a nontrivial but inconsistent
theory. This gives a neat way of generating models for paraconsistent theories,
and has antecedents in classical model theory. The (downward) Löwenheim-
Skolem theorem states that any theory with an uncountably infinite model has
a countably infinite model. The collapsing lemma is a kind of elaboration on
this – the “ultimate downward Löwenheim-Skolem theorem,” as Priest puts
it – in that it takes a consistent theory with an infinite model and reinterprets
it as an inconsistent theory with a finite model. The general structure of these
models is now well understood and involves sequences of successor-cycles off
to infinity.56
and yet it may not be the case that n = m. This is what allows there to be
inconsistencies, but apparently limits the use of the theory.57 An advocate of
the theory might reply that failures of modus ponens will only occur “above”
n and that for lower-down consistent numbers “subtraction” works just fine.
A critic might reply that arithmetic subtraction is not supposed to be the sort
of thing that only works sometimes. In any case, as with set theory, it seems
reasonable to consider working with a logic that has a conditional.
A candidate conditional that can easily be added to LP (indeed, conserv-
atively extending it) for the purposes of arithmetic has been investigated in
Tedder (2015). Suppose A ⇒ B is true when A is false, and otherwise takes the
value of B. In three-valued terms this is
⇒ | t  b  f
--+---------
t | t  b  f
b | t  b  f
f | t  t  t
This gives an appealing conditional that obeys modus ponens, which has
been investigated by Arnon Avron, Batens, and others. Tedder applies collaps-
ing techniques. Then a sound and complete axiomatization of paraconsistent
arithmetic can be given in this logic; cf. Tedder (2021).
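The difference between this conditional and the material conditional of LP can be checked by brute force over the three values. The sketch below assumes the usual LP conventions (t and b designated, negation fixing b, disjunction as maximum in the order f < b < t); the function names are illustrative only.

```python
VALUES = ("t", "b", "f")
DESIGNATED = {"t", "b"}          # values that count as "at least true"
ORDER = {"f": 0, "b": 1, "t": 2}


def neg(a):
    """LP negation: swaps t and f, leaves b fixed."""
    return {"t": "f", "b": "b", "f": "t"}[a]


def disj(a, b):
    """LP disjunction: the greater of the two values."""
    return max(a, b, key=ORDER.get)


def material(a, b):
    """The material conditional of LP, defined as not-A or B."""
    return disj(neg(a), b)


def arrow(a, b):
    """The conditional from the table: t when A is f, otherwise the value of B."""
    return "t" if a == "f" else b


def modus_ponens_counterexamples(cond):
    """Pairs (A, B) where A and the conditional are designated but B is not."""
    return [
        (a, b)
        for a in VALUES
        for b in VALUES
        if a in DESIGNATED and cond(a, b) in DESIGNATED and b not in DESIGNATED
    ]
```

Here modus_ponens_counterexamples(material) returns the single pair ("b", "f"), the familiar failure of modus ponens in LP, while modus_ponens_counterexamples(arrow) returns the empty list.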
Another, more radical, direction is to move away from the idea of “generalizing”
classical logic and to consider instead contra-classical logics – in
particular, connexive logics. These are systems that include the principles
in some ways an even more radical departure than anything considered here,
since it would seem to rule out any reductio proofs, or at least render key
premises in a reductio at least false.
As with set theory, then, we find ourselves looking for a developed theory
of paraconsistent arithmetic with a robust conditional; as with set theory (but
in fact chronologically prior), we turn to relevant logic.
A → ((A → B) → B).
Add to the language the symbols 0, ×, +, and ′, and add the following version
of Peano’s postulates:
∀x∀y(x = y ↔ x′ = y′)
∀x∀y(x = y → (x = z → y = z))
∀x(x′ ≠ 0)
∀x(x + 0 = x)
∀x∀y(x + y′ = (x + y)′)
∀x(x × 0 = 0)
∀x∀y(x × y′ = (x × y) + x)
This looks extremely conservative. The third axiom, for example, says that 0 is
not the successor of any number. But recall that the “not” here is paraconsistent
(LP) negation.
Atop these one adds some form of induction axiom. If we add
(which says that every number other than zero is a successor) then we get rel-
evant Robinson arithmetic, which has been investigated by Dunn. (Notice that
this phrasing of the axiom has a built-in disjunctive syllogism.)59 Instead of
that we can add the rule
59 One would also need to add a few further postulates that are derivable from stronger forms of
induction. See Dunn (1980, p. 408).
which is mathematical induction. The quantified logic R plus the Peano postu-
lates and mathematical induction rule is called R# , or “arr sharp.”60 If instead
of the induction rule we add the infinitary omega rule
What we wish to be sure of, as Hilbert might have put it, is that excur-
sions through general logical laws . . . do not render dubious what we rightly
regard as indubitable . . .. From the present viewpoint, the task of furnishing
a non-mythological and demonstrably secure reconstruction of all mathe-
matics was interrupted over trivia, and it is time that these trivia were placed
once more in proper perspective. Again, I do not propose to change the log-
ical superstructure – only to understand it more clearly, by making explicit
in a formal way features that have belonged to our intuitive logic all along
(Meyer, 2021b, pp. 298–299).
By Meyer’s own lights, this hope was not realized, as we now see.
Buoyed by the prospects of an improved formalization of arithmetic that
“removes the sting” of Gödel’s theorem (something that LP arithmetic does) but
still apparently has some proving power (something that LP arithmetic does not)
Meyer and collaborators wondered whether R# could recapture all of standard
Peano Arithmetic. This turned out to be an instance of a more general question
in relevant logic, called (for no very good reason62) the gamma problem.
The gamma problem asks about classical recapture. We know that, in order
to be paraconsistent, these logics must not validate disjunctive syllogism. But
in situations that are consistent, perhaps it would turn out that all the effects
of disjunctive syllogism still obtain “for free” as it were? That is, anything
provable with disjunctive syllogism would turn out already to be a theorem
62 The name comes from work by Ackermann in 1956, via Anderson and Belnap (1975), where
disjunctive syllogism is labeled γ (gamma).
are true inconsistent sentence pairs in arithmetic, the most elemental part
of mathematics. If so, that would make paraconsistency in mathematics not
merely a curiosity or side project, but urgently needed in order to retain (or
explain) the coherence of arithmetic, if not mathematics overall. Here is the
argument.63
Mathematics is a practice, conducted by humans using natural language.
This language (let us suppose English) is augmented with technical jargon
and many special symbols, all of which are assigned unambiguous meanings.
The language is used to communicate ideas and results, especially through
proofs.
A proof begins from some accepted statements and proceeds in a replicable
step-by-step way to a conclusion. All of this needs to be comprehensible and
checkable by other mathematicians. Depending on the level of doubt about a
given purported proof, the language may be subjected to more and more scru-
tiny, each statement and steps between the statements analyzed further and
further down into clear and unambiguous language, until ultimately it is decided
that the proof is valid or invalid (perhaps invalid because insufficiently clear).
In principle, any piece of mathematical reasoning should be analyzable in this
way. The practical details of all this might be very daunting (e.g., Perelman’s
proof of the Poincaré conjecture was posted in 2002 but not accepted as verified
until 2006 – and even that was not done by reducing everything to predicate
logic or the like) but such concerns are merely medical; if a proof is not for-
mally checkable in principle, it is not a proof. We can effectively assess the
validity of any proof. And not just that, but because proofs begin from state-
ments accepted as true and proceed by valid steps, we can see that mathematical
proofs are sound, that theorems are true. Informal proofs effectively prove true
theorems.64
But then – argues Priest – our “informal” mathematical discourse meets the
conditions for being a formal axiomatic system, where the provability relation is
effective. By the Church-Turing thesis, all effective procedures are recursive,
and in our formalized “naive” mathematics, all recursive relations are repre-
sentable. And if all that is correct, then informal or “naive” mathematics meets
all the conditions for Gödel’s incompleteness theorem. This has some dramatic
consequences.
63 Compare with Priest (2006, chs. 3, 17), Berto (2007, chs. 4, 12), and Routley (1979).
64 In classical arithmetic Löb’s theorem states that the soundness of arithmetic cannot be estab-
lished within arithmetic itself: that if “A is provable → A” were a theorem, then A would be a
theorem – for any A, which is absurd. But the proof of Löb’s theorem uses contraction. In logics
without contraction this version of incompleteness could be evaded. See van Benthem (1978,
p. 54). For more on the idea of informal provability, see Leitgeb (2009).
Gödel’s theorem states that any formal system that is able to represent all its
own recursive relations, and which has an effective proof relation, will contain
what he called “undecidable” propositions. One such undecidable proposi-
tion may be informally glossed as a statement in the language of the system
that says “the system cannot prove this very statement.” This is a Gödel sentence
g for the system. If the system did prove either g or its negation then
the system would be inconsistent. Crucially, on the standard story, we estab-
lish all this from outside the system, showing that it is incomplete – and in
doing so, we can see that g is true. There is an unprovable truth, on pain of
contradiction.
Gödel gets to incompleteness by stepping outside the formal system. The
trick when it comes to the system of naive mathematical proof is that there is no
“outside” the system to retreat to. Rather than showing there is some sentence
our consistent system cannot decide, confronted with the Gödel sentence g we
end up proving both g and its negation. The system itself, our system, is incon-
sistent. The system in question is our colloquial, informal, sound mathematical
discourse; so mathematics itself is inconsistent.
There is a lot to say about this argument and the issues it raises.65 We do not
have time to say any of it. I will only point in one of its most audacious, and
untapped, directions.
As Shapiro points out,66 and Priest agrees,67 if the Gödel sentence is both
true and false (has a false negation), then this not only shows an inconsistency
in arithmetic, but that proof itself is inconsistent: for we have shown that there
is a proof of g, but then because g is true there is also no proof of g (since
that’s what g says). So the proof of g is also not a proof of g. Since a proof
is a precise object, a sequence of sentences beginning with axioms and ending
with a theorem, it is fair to ask where exactly in the (valid) proof things go
wrong.
If our informal proof relation is both effective and inconsistent, then
this question approaches a very intriguing, and mostly uninvestigated, topic:
paraconsistent computability theory. What would an inconsistent computa-
tion be? Gödel (and Turing and Church) showed that there are hard limits on
what any consistent computer can do; but now we are beyond consistency.
What, in the end, about the prospect of reviving Hilbert’s failed dream of a
universal truth machine that effectively decides all mathematical questions?
This would have to be an inconsistent machine, of course, but so be it. The
familiar limits on computability from Turing et al. are really just variations
on the sorts of paradoxes that seem amenable to paraconsistent treatment –
the existence of uncomputable procedures is a diagonalization argument like
Cantor’s theorem – and so perhaps these apparently unbreakable limits can be
broken?
But to answer any of these questions is again to face Shapiro’s question,
of what, for example, a computation that is also not a computation, might look
like. Priest for the most part deflects these questions, and instead answers ques-
tions about what it would mean for the description of a computation to be
inconsistent; the answer to that is relatively easy: there would be some
inconsistent natural number n ≠ n, which he has already made peace with. This does
not tell us, though, what the description is describing. And here Priest deems
this “metaphysical speculations” (Priest, 2006, p. 243).
Saying something more meaningful about the hardware of paraconsistent
proofs either requires some advances in physics, as Copeland and Sylvan seem
to suggest (Copeland & Sylvan, 1999; Sylvan & Copeland, 2000; cf. Weber,
2016), or some other insight, perhaps something mundane along the lines of
Weber, Badia, and Girard (2016). Or, this is evidence that paraconsistent math-
ematics reaches its own limits at the horizon of mechanical realizability, which
is more or less what Gödel said about classical mathematics anyway.
Priest’s argument, I would say, is based on a fair gloss on the received
mythology about mathematical proof. It would also be fair to say that,
especially since Gödel’s theorems, that mythology has been called into question
as too simplistic, and maybe the lesson to take is that our naive proof
practices are more open ended. As is often the case, there is agreement about a
conditional statement, “if naive proof is recursive then it is inconsistent” and
then disagreement as to whether modus ponens or modus tollens is called
for.
4.1 Infinitesimals
“Questions about the immeasurably large are idle questions for the explanation
of Nature,” says Riemann. “But the situation is quite different with questions
about the immeasurably small.”68 Let us see.
It is easy to measure a straight line. It is not so easy to measure a curved
line. An idea for tackling the latter problem goes back to Archimedes, and is
emblematic of a lot of mathematical ingenuity: to break one big difficult prob-
lem up into many small but easy problems, or in this case, to imagine a curved
line as composed of infinitely many infinitely small straight lines. Archime-
des thought of this as a way to approximate the correct answer, but eventually
methods that do not merely approach but arrive at the precise answer were
accepted – the Calculus.69
As conceived by Newton and Leibniz (independently), the differential cal-
culus is concerned with “infinitesimal” changes. In its original versions, there
were no scare quotes: calculations involved terms (Newton called them “flux-
ions”) that were infinitely small, in the sense of being big enough to divide by
(since division by zero is undefined), but small enough to disregard completely.
These “ghosts of departed quantities,” as they were unflatteringly called, were
remarkably effective for getting results, but also apparently inconsistent. While
methods developed in the 1800s by Weierstrass et al. make it possible now
to avoid use of infinitesimals, and moreover innovations by Robinson in non-
standard analysis make it possible to simulate the appearance of infinitesimals
using some loopholes in first-order logic,70 a return to the original intuitive
calculus has remained an attractive prospect for paraconsistent research.
For a real number x, let δ(x) be, intuitively, an infinitesimal part of x. For a
continuous function f, the original definition of its first derivative was
\[
f'(x) = \frac{f(x + \delta(x)) - f(x)}{\delta(x)}.
\]
This can only be allowed if δ(x) ≠ 0, since we are dividing by it.71 For example,
for the function g(x) = x², we have
\[
\begin{aligned}
g'(x) &= \frac{(x + \delta(x))^2 - x^2}{\delta(x)} \\
      &= \frac{2x\,\delta(x) + (\delta(x))^2}{\delta(x)} \\
      &= 2x + \delta(x) \\
      &= 2x.
\end{aligned}
\]
68 From “On the Hypotheses That Lie at the Foundations of Geometry,” §3, translated in Spivak
(1999, vol. 2, pp. 151–162).
69 See Bell (2008, introduction). Cf. McKubre-Jordens and Weber (2016).
70 See Goldblatt (1998).
71 The standard modern definition is \(f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}\), which allows division by h as it
approaches zero.
But then at the last step, the δ(x) term is thought of as being so small as
to not matter at all, whence we have the standard rule for powers, that the
derivative of x² is 2x. It looks like this reasoning involves two assumptions,
namely,
There is an instantaneous-but-not jump at zero. This turns out to be extremely useful, for exam-
ple, modeling the dynamics of force of impact. This also turns out to be classically impossible –
there is no function like this, as Dirac well knew. Proponents of inconsistent mathematics have
suggested that “bad” objects like the delta function could be prime candidates for paraconsistent
rehabilitation. See Priest, Routley, and Norman (1989, p. 376) or Mortensen (1995, ch. 7).
4.2 Topology
Calculus works by thinking about space with a metric, so that distances and
areas are assigned a quantifiable number. Topology drops the metric. The intu-
ition that there is something interesting but inconsistent about infinitesimals
has in some ways received better paraconsistent treatment in more qualitative
theories of space, where the metric is dropped, and the formalism is closer to
an underlying set theory than the equations and rules of calculus. We can focus
simply on regions as sets of points, where we can look to model impossibly
small-but-not-zero “instants.”
Think again of the idea of a curve as composed of infinitely many infinitely
small lines. Suppose this curve is describing the trajectory of an object moving
through space, for example, you walking from your bed to the door, stepping
around a pile of socks. You are moving, but if we think of the path as composed
solely of unextended points, none of these are moving; there is no motion at an
instant. No matter how many instants we include that depict the transition from
one to another, nothing moves. As Russell puts it (approvingly), “Change does
not involve a state of change” (Russell, 1903, p. xxxiii).
Put this way, the problem seems to be about capturing the “dynamics” in a
static formalism, but we can remove the movement and still see the problem.73
Suppose you step around your socks and reach the door, and step through it.
At some location along your path you were in your room (a ∈ A), and then at
a location further along the path, you were not in your room any more (a ∉ A).
What happened? In particular, what happened at the plane of the doorway?
There are four options:
(i) a ∈ A only
(ii) a ∉ A only
(iii) neither a ∈ A nor a ∉ A
(iv) both a ∈ A and a ∉ A
73 Here, following Priest (2006, ch. 11). Priest considers what he calls the (classically false) Leibniz
continuity condition, which is taken up by Mortensen (1995, ch. 6).
it is hard to see why there is any indeterminacy. If we identify you with your
center of gravity and the doorway with the plane of the door, then there is an
instant when the point is on the plane, and either that is in the room, or not.
Any theory of your journey that treats this as indeterminate is, in a basic sense,
an incomplete theory of what happened. Option (iii) is just not answering the
question.
The broadly accepted classical theory, then, is to say that when you are
exactly in the doorway exactly one of (i) or (ii) is the case, and it is more or
less arbitrary which one. Either you are at the last in-the-room point, or the first
not-in-the-room point, and only one of these is true, and one of these must be
true, but there is no way to settle which beyond just flipping a coin. This has
an advantage over the gappy response insofar as it answers the question, but
it leaves a nagging sense of asymmetry. The very fact that we are free, more
or less, to stipulate whether it is (i) or (ii) is itself an indication that there is no
meaningful difference between these options. The classical theory is not merely
arbitrary, but also missing something about the scenario, namely that options
(i) and (ii) are symmetrical.
Maybe, then, inconsistency (option (iv)) can capture instantaneousness. To
model this, we can go from the metric idea of being “infinitesimally close” to
the topological idea of being “very nearby.” For a set of points A, the set of
points that are very nearby A is called the closure of A, usually written A− .
Every point in A is in A− , and A is called closed iff A = A− . Now, if we think
of a space X divided into two sets A, B such that every point in X is either in A
or B but not both, then B is the complement of A in X and we can define the
boundary (or sometimes ‘frontier’) of A as being all the points both very close
to A and to its complement:
bdy(A) := A− ∩ B−
So now we can rephrase the question from before: Is the boundary of A a part
of A, or a part of the complement of A? It must be one of the two, since A and B
exhaust the space by construction. But again it would seem asymmetrical and
arbitrary just to pick one and not the other. It seems like the boundary belongs
to both, that the spaces A and B touch each other at their shared boundary. But
a closed set has its boundary as a part: A is closed iff A includes its boundary.
If A and B are both closed, then points on the boundary will be in both A and
B; and since points in B are not in A, that is a contradiction.
Classically, that means that, if A and B are both closed, their boundary must
be empty, by reductio. But you are standing on the boundary; it is palpably
not empty. If your room and the abutting hallway are touching, and the overall
space has no rips or tears in it, then the reductio is not a reductio but a proof:
when you are on the boundary of A (as you are) then you just are both in A and
not. Or so goes this line of thought.
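The classical side of this argument can be checked on a toy example. The sketch below assumes a hypothetical three-point space (room, door, hall) with a hand-picked family of closed sets; the space and the names closure, boundary, and closed_partitions are illustrative and not taken from the cited literature.

```python
from itertools import combinations

# A hypothetical three-point space: the doorway sits between room and hall.
X = frozenset({"room", "door", "hall"})

# Assumed closed sets of the space (closed under union and intersection).
CLOSED = [
    frozenset(),
    frozenset({"door"}),
    frozenset({"room", "door"}),
    frozenset({"door", "hall"}),
    X,
]


def closure(a):
    """Smallest closed set containing a: intersect all closed supersets."""
    result = X
    for c in CLOSED:
        if a <= c:
            result = result & c
    return result


def boundary(a):
    """Points very near both a and its complement."""
    return closure(a) & closure(X - a)


def closed_partitions():
    """All ways to split X into a set and its complement with both closed."""
    items = sorted(X)
    subsets = [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]
    return [(a, X - a) for a in subsets if a in CLOSED and (X - a) in CLOSED]
```

Whether the doorway is counted with the room or with the hall, the boundary is the same set {"door"}, which is the arbitrariness complained of above. And closed_partitions() finds only the trivial splits into the empty set and the whole space: classically, the room and the hall cannot both be closed while sharing a nonempty boundary.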
A formal theory of paraconsistent topology, or mereotopology (the theory
of connected parts), can be developed to make all this precise. Accounting for
inconsistency, the following can be shown:74
An object and its complement share a boundary, up to inconsistency: if B is
the complement of A then
A nontrivial model of this scenario using two elements and the logic RM3 is
provided in Weber and Cotnoir (2015), and further point-set topology is in
Weber (2021, ch. 9).
Some of these topological notions can be fed back into logic. The closure
operator has some inherently attractive properties:
These properties are attractive not least because, if instead of thinking of sets
of points we think of sets of sentences, and we read the closure operator as the
set of logical consequences of those sentences, then this provides a nice model
of logical consequence, following an observation by Tarski.
If we think of sentences as indexed by a set of points (such as the “worlds” at
which the sentence is true), then we can think about the topological structure of
those sets. If we think of sentences as holding only on closed sets, this gives rise
to a paraconsistent logic, which as a result Mortensen declares to be the “most
natural.” Conjunction and disjunction are the intersection and union of closed
sets, respectively, and negation is the smallest closed set containing the com-
plement. (As a three-valued logic, it is the same as RM3 except that in the truth
table for negation, “not both” is identified with “true.”) The distinctive feature
is that at the boundary of a proposition, both it and its negation hold. This is
exactly the dual of open set semantics for intuitionistic logic. See Mortensen
(2010, ch. 2).
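A minimal sketch of closed set semantics, assuming a hypothetical three-point space in which a single point (door) lies on the boundary between two regions. Propositions are closed sets, and the names neg, conj, and disj are illustrative rather than Mortensen's own notation.

```python
# Hypothetical space and its assumed closed sets.
X = frozenset({"room", "door", "hall"})
CLOSED = [
    frozenset(),
    frozenset({"door"}),
    frozenset({"room", "door"}),
    frozenset({"door", "hall"}),
    X,
]


def closure(a):
    """Smallest closed set containing a."""
    result = X
    for c in CLOSED:
        if a <= c:
            result = result & c
    return result


def neg(p):
    """Negation: the smallest closed set containing the complement of p."""
    return closure(X - p)


def conj(p, q):
    """Conjunction: intersection of closed sets (again closed)."""
    return p & q


def disj(p, q):
    """Disjunction: union of closed sets (again closed)."""
    return p | q


# The proposition "you are in the room", holding on a closed set of points.
in_room = frozenset({"room", "door"})
```

On these assumptions conj(in_room, neg(in_room)) is {"door"}: the proposition and its negation both hold at the boundary point and nowhere else, while disj(in_room, neg(in_room)) is the whole space, so excluded middle still holds everywhere.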
Generalizing, in the area of category theory it is known that a certain
class of structures, “toposes,” correspond to intuitionistic logic; they are
74 Cf. Weber and Cotnoir (2015, p. 1284); cf. Weber (2021, p. 280).
They are inconsistent because some of the faces are both in front of, and not in
front of, the other.
These are examples of what Mortensen and his collaborators call impossible
pictures – images that offer some visually immediate material for inconsistent
mathematics. These are the sorts of pictures that are found canonically in the
artworks of M. C. Escher. Impossibilia are pictures that upon analysis have
some propositions A and ¬A as part of their description. Mortensen is con-
cerned with capturing how these pictures “seem” to us, how we cognize them
in our minds, and argues that any consistent interpretation of them will inevi-
tably leave something out: namely, that they present as contradictory in some
essential way (Mortensen, 2010, p. 74).
For example, at the end of the movie Labyrinth (1986)75 there is a scene that
takes place in an impossible room filled with staircases going in all directions;
the content of the scene clearly indicates that at some points the characters are
both “above” and “not above” in relation to each other. This scene would make
no sense if it were resolved into noncontradictory terms; the logical vertigo is
a key component of the story. This is a fictional fantasy story, of course, but
the images on screen present clear geometric properties and describing these
faithfully would seem to call for inconsistent statements.
The edges of a Necker cube in the above diagrams are either dotted or solid;
to be more colorful, they may be thought of as painted either red or blue. So
when two lines cross, the image may indicate that either (1) the red line occludes
(blocks) the blue line, (2) the blue line occludes the red line, (3) neither occludes
the other, or (4) they overlap and occlude each other. To get at the underlying
impossible content, we specify that a local contradiction about crossings would
violate a “Local Consistency Axiom,”
A Necker is locally inconsistent iff it has both red and blue at some crossing.
Then a face of a Necker is a figure with four edges intersecting at the corners
in four vertices. One face is in front of another face if some edge of one occludes
some edge of the other, violating a “Global Consistency Axiom”
75 Directed by Jim Henson. Written by Terry Jones. There is also a part where the protagonist
faces a classic liar/truth-teller riddle, of the sort made famous by Smullyan in his “knights and
knaves” scenarios.
front of” relation. It would be interesting to have conditions that connect local
and global inconsistency.
To carry on more precise study requires some formal representation of the
targets. For this, Mortensen employs linear algebra, which works with systems
of equations; cf. Mortensen and Leishman (2009). In any drawing of a Necker
Cube there are only two places where lines cross. If we denote these C1 and C2
then each cube can be described by a pair of simultaneous linear equations
aR + bB = C1
cR + dB = C2
Call this the primary equation of the cube and the two-by-two matrix in it
its primary matrix. A cube is thus locally inconsistent iff it has all ones in one of
its rows. With cubes represented as matrices, we can then appeal to the standard
notion of the determinant of a two-by-two matrix
\(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\) as (a × d) − (b × c),
and provide a sufficient condition for contradiction:
If the determinant of a primary matrix of a Necker is not zero, then the Necker
is locally inconsistent.
jC1 + kC2 = RF
nC1 + mC2 = BF
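with j, k, n, m ∈ {0, 1}, so as to say, for example, “the red face is composed of one
red at C1 and no red at C2” as 1C1 + 0C2 = RF. Then this information is written
as
\[
\begin{bmatrix} j & k \\ n & m \end{bmatrix}
\times
\begin{bmatrix} C_1 \\ C_2 \end{bmatrix}
=
\begin{bmatrix} RF \\ BF \end{bmatrix}
\]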
with the two-by-two array as the secondary matrix of the cube. Then there is
an account of how both local and global inconsistency is generated:
A Necker is inconsistent iff it has a column of all ones in its secondary matrix.
76 For further history and analysis, and an extensive gallery of impossible pictures, see Mortensen
(2022).
For those sympathetic to Colyvan’s idea (and if you are this far into an Ele-
ment on paraconsistency, maybe you are), there is now a way to make better
sense of what “realism” about inconsistent mathematics might mean. For we
already clarified that consistency (or not) is a feature of language or theories,
Element does not arrive at the endpoint of a great discovery or the culmination
of a unified philosophical program, but is rather a start, for those not afraid of
risk, into an unknown future.78
For what it is worth, I think Asenjo, Arruda, and others are correct that
the future will hinge on the emergence of more mathematical results from
this diverse field. I think that, just as mountains are eventually climbed sim-
ply because they are there – even mountains that once appeared impossible
to climb – there is more to come from inconsistent mathematics; and if what
comes is useful or at least mathematically interesting, this will be more impor-
tant than the outcome of further philosophical debate. How might such future
efforts go?
Throughout this selective survey we have been taking note of when various
approaches are moderate, and when they are more revisionary. This can be
along two dimensions: in the content of the claims and results (i.e. whether
a paraconsistent mathematician is saying something that sounds like it is at
odds, or not, with standard mathematics); and in the methods used to support
those claims (e.g. whether a paraconsistent mathematician is using standard
mathematics as a framework).
In terms of content, we have seen some very conservative projects, includ-
ing embedding classical ZFC in a paraconsistent logic and Meyer’s dismay
that classical PA is not recovered in relevant arithmetic. We have seen also
some more risqué efforts with inconsistent sets, inconsistent natural numbers,
and inconsistent pictures – and a lot of finite inconsistency approximating the
infinite; but in terms of methods even these are more conservative in the sense
that they appeal to classical devices (Brady’s nontriviality proof, the collapsing
lemma).
This might all seem a bit off kilter, if Routley is right that
Various people, both friends and critics, have wondered aloud about this issue.
It turns, once again, on whether paraconsistency is ultimately at odds with
standard mathematics.
Many working in this area have suggested this worry is somehow misguided,
that we do not need to think of paraconsistency as a dispute with classicality,
78 More than in previous sections, I will be articulating a position that is, I suspect, not represen-
tative of the paraconsistent community (such that there is one).
so the use of classical methods is fine. If there is a dispute here, what sort of
disagreement would it be? Warren distinguishes between descriptive disputes
about which logic is used in some particular natural language; normative dis-
putes about which logic should be used in some particular natural language; and
metaphysical disputes about which logic is objectively correct in a
language-independent sense (Warren, 2018, p. 437). Let us consider these.
The original Meyer/Routley/Priest approach, at least, was a serious descrip-
tive dispute about the correct formalization of mathematical practice, as we saw
in §1.2.3. Routley bragged, “we can do everything you can do, only better, and
we can do more” (Routley, 1977, p. 929).
Circa the first quarter of the twenty-first century, this has not come to pass.
The idea that the classical and paraconsistent truths would somehow coincide,
or that the paraconsistent truths might completely subsume the classical, has
been painstakingly tested and appears to be out of reach. There is currently
no consensus on what logic people really are using (if there even is an answer
to this question); Gillian Russell’s logical nihilism suggests there simply is no
descriptive logic of natural language (Russell, 2018). But I think it is reasonable
to say that the early Meyer et al. hypothesis has not been vindicated; various
attempts to reconstruct standard proofs in paraconsistent logics have repeatedly
failed.79 Ordinary mathematical practice may not be classical, but hard-won
evidence would suggest it is not straightforwardly paraconsistent, either. There
was a descriptive dispute here about the logic of mathematical proofs, and I do
not think the classical side unequivocally won, but the relevant paraconsistent
side did not either.
What about sidestepping this with the idea of mathematical truth as invari-
ant to logic, by making logic secondary to mathematics? My sense, after long
exposure to usage at odds with classicality, is that the idea of mathematical
truth as invariant or prior to logic (suggested by Mortensen, but also Brouwer)
looks harder and harder to maintain. If one is looking at classical manipulations
of classical models, or translations into “classicalese” (as Meyer once called it)
then yes, things may all seem to exhibit some similarities. But taken more on
their own terms, without classical representations, I see very little by way of
invariance; even small decisions about the phrasing of axioms in set theory
or real analysis, as we have observed, will lead to dramatic differences
downstream. Similarly, for the idea of some kind of “commonsense” cutting across
different logics (as Shapiro suggests in (Shapiro, 2014, ch. 7)), friendly though
the sentiment may be, the subtle differences between weak logics tend to render
what was once classically familiar as multitudinous, strange, and new, a
challenge to our old ways. Saying that maybe some natural number n = n + 1 is no
small thing, more than some kind of idiosyncrasy or regional dialect.
79 Much of this evidence is unpublished, since it involves unsuccessful work. See Priest (2006, p.
221), Istre and McKubre-Jordens (2019). Note that the proof of Cantor’s theorem in §2.3 is not
the classical proof but a new proof.
If there is still a dispute here, then, what is left open are the normative and
metaphysical lines, the idea that paraconsistency in mathematics can be a guide
to new truths – truths that lie on the far side of what is currently
acceptable. Without being polemical or trying to pick a fight, I want to suggest
that something closer to this radical, rivalrous understanding of the situation
makes more sense of some of the efforts that have gone into this area. Meyer
says Hilbert “should” have used relevant arithmetic, because Meyer thinks rel-
evant arithmetic is true arithmetic. Meyer might have been wrong, but it is
more charitable to say so than to tell a story about why he really meant some-
thing more agreeable or pluralistic.80 Seeing things this way may be ultimately
healthier for the growth of all fields involved. Classicists and nonclassicists
alike stand to learn more if we think that there is, indeed, some truth-seeking
activity going on, and by comparing notes, especially notes that do not always
agree, we might get closer to truth.
Whither paraconsistency in mathematics? A very standard story goes that
a mathematical theorem, once proven, cannot be undone. Mathematics only
accumulates, generations building on previous generations, like a tower. And
it is a very good tower, perhaps one of the finest things humans have to show
for themselves. If you like that story, there is a clear place for paraconsis-
tency in mathematics. The idea that paraconsistent logics are a generalization
of classical logic is right there in the semantics. For observe:
p ⊃ q, q ⊃ r therefore p ⊃ r
80 This hews very close to detailed debates in logical pluralism about the role of normativity, which
I cannot do justice here. See Steinberger (2019). Priest separates mathematical pluralism from
logical pluralism, leaving a place for truth, in Priest (2021). Cf. Meadows and Weber (2016).
p ⊃ q, q ⊃ r therefore (p ⊃ r) ∨ (q ∧ ¬q)
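The contrast between the two inference patterns displayed above can be checked mechanically. What follows is a minimal sketch, not the author's own presentation, and it rests on assumptions the text does not state at this point: that the logic is the three-valued logic LP, with values ordered f < b < t, that both t and b are designated, and that ⊃ is the material conditional (¬p ∨ q). The helper name `counterexamples` is hypothetical, chosen only for illustration.

```python
from itertools import product

# Assumed LP semantics: three values ordered F < B < T,
# where B ("both") is the glut value and B, T are designated.
F, B, T = 0, 1, 2
DESIGNATED = {B, T}
CLASSICAL = (F, T)
LP = (F, B, T)

def neg(x):
    return 2 - x              # swaps T and F, fixes B

def conj(x, y):
    return min(x, y)

def disj(x, y):
    return max(x, y)

def imp(x, y):
    return disj(neg(x), y)    # material conditional, assumed reading of "⊃"

def counterexamples(premises, conclusion, values):
    """Valuations (p, q, r) that designate every premise but not the conclusion."""
    found = []
    for v in product(values, repeat=3):
        premises_hold = all(prem(*v) in DESIGNATED for prem in premises)
        if premises_hold and conclusion(*v) not in DESIGNATED:
            found.append(v)
    return found

premises = [lambda p, q, r: imp(p, q), lambda p, q, r: imp(q, r)]
plain = lambda p, q, r: imp(p, r)
guarded = lambda p, q, r: disj(imp(p, r), conj(q, neg(q)))

classical_plain = counterexamples(premises, plain, CLASSICAL)
lp_plain = counterexamples(premises, plain, LP)
lp_guarded = counterexamples(premises, guarded, LP)
```

Under these assumptions the first pattern has no classical counterexample but fails in LP exactly when the middle term q is a glut (p true, q both, r false), while the second pattern, which adds the disjunct q ∧ ¬q, has no LP counterexample at all. In that sense the classical inference is recovered as the special case in which no contradiction is in play.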
81 Even if there is no counterexample, maybe there is still a counterexample. See Omori and Weber
(2019) and Weber (2021, ch. 10).
82 An Ivy League professor took me aside and asked me this very question when I was still a
graduate student. In this footnote, I am taking you aside and asking you, even though I am not
an Ivy League professor, and I did not listen to him anyway.
Thanks to Stewart Shapiro and Penelope Rush for the opportunity to write this.
Thanks to Guillermo Badia, Thomas Ferguson, Patrick Girard, Franci Man-
graviti, Ed Mares, Hitoshi Omori, and Graham Priest for helpful comments,
corrections, and discussions.
Penelope Rush
University of Tasmania
From the time Penny Rush completed her thesis in the philosophy of mathematics
(2005), she has worked continuously on themes around the realism/anti-realism divide
and the nature of mathematics. Her edited collection The Metaphysics of Logic
(Cambridge University Press, 2014), and forthcoming essay “Metaphysical Optimism”
(Philosophy Supplement), highlight a particular interest in the idea of reality itself and
curiosity and respect as important philosophical methodologies.
Stewart Shapiro
The Ohio State University
Stewart Shapiro is the O’Donnell Professor of Philosophy at The Ohio State University, a
Distinguished Visiting Professor at the University of Connecticut, and a Professorial
Fellow at the University of Oslo. His major works include Foundations without
Foundationalism (1991), Philosophy of Mathematics: Structure and Ontology (1997),
Vagueness in Context (2006), and Varieties of Logic (2014). He has taught courses in logic,
philosophy of mathematics, metaphysics, epistemology, philosophy of religion, Jewish
philosophy, social and political philosophy, and medical ethics.