
On the existence of infinitesimals

Richard Kaye
School of Mathematics
University of Birmingham
26th May 2010

Abstract
So-called nonstandard mathematics uses infinite and infinitesimal numbers
to develop mathematics, especially calculus, without the use of the
notion of limit. These methods are rigorous and all results are provable
in usual mathematics or ZFC, and these nonstandard numbers are typical
of modern mathematics in that they abstract familiar operations and
concepts (in this case, the notion of limit in ε–δ analysis) as mathematical
objects in their own right. This makes them a good test case for the study
of what a mathematical object is, and for issues relating to mathematical
truth and knowledge. This paper studies, from a mathematical and quasi-
philosophical point of view, the potential existence of such nonstandard
numbers, with a view to problems of existence, realism and platonism of
mathematical objects, and indeed to developing ideas of what mathematical
objects really are. Structuralist and utilitarian views are emphasised and we
conclude with suggestions for a theory of ‘pure structuralism’ that might unite
mathematical and general thought.

Contents
1 Introduction

2 Mathematical existence

3 Existence of infinitesimals

4 The case for ‘pure’ structuralism

5 Questions for further research

This is the second version of this paper and was a complete rewrite of an
earlier one. I continue to be interested in the material here and I am still devel-
oping it. This paper is issued as a preview of work in progress. All comments
are most welcome! Part of the material here was used as the basis of a collo-
quium to the Philosophy Department of Warwick University, in the summer of
2010, and there was a long and useful discussion afterwards.

1 Introduction
The nature of mathematical objects, whether they are abstract or real, how
one might reason about them, and how one might understand the meaning of
statements concerning them, are substantial and interesting philosophical ques-
tions.1 It is quite common in such discussion to direct attention towards the case
of natural numbers, 0, 1, 2, . . . , as examples of mathematical objects, because of
their comparative simplicity and their familiarity in the non-mathematical world,
as well as their status as objects of on-going research in pure mathematics.
This familiarity, and the fact that they are so fundamental, makes natural
numbers rather atypical examples of the kind of mathematical objects usually
found in mathematical practice, and therefore not necessarily the most useful
examples for our philosophical questions. This paper will comment on the im-
portant questions of the nature of mathematical objects by focusing on another
example, that of infinitesimal numbers. It is hoped that by looking at num-
bers that have a seemingly contradictory or impossible nature but nevertheless—
according to mainstream mathematics—incontrovertibly exist, we may learn
something about the nature of mathematical objects. In any case, these are
particularly interesting numbers in their own right, with many potential appli-
cations.
Infinitesimals are numbers that are ‘so small that there is no way to see
them or to measure them’. More formally, in an ordered field, a positive num-
ber x is infinitesimal if 0 < x < 1/n holds for each ordinary positive natural
number n, i.e. each n of the form 1 + 1 + · · · + 1 (see the display at the end of
this paragraph). Newton and Leibniz both used in-
finitesimals in the development of their calculus, but were famously criticised by
Berkeley. In the nineteenth century Cauchy, and also Riemann and Weierstrass
and many others, replaced the notion of infinitesimal with that of limit. But
in 1966, Abraham Robinson’s book Non-standard Analysis showed that, with the use
of techniques from first-order logic, in particular the Completeness, Soundness
and Compactness Theorems, the notion of infinitesimal could be put on a firm
foundation and be useful enough to develop the calculus in the way Newton and
Leibniz intended [9]. The name of Robinson’s theory is often abbreviated to
NSA.2
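For reference, the definition just given can be displayed formally (a restatement only, with n ranging over the ordinary, standard, positive natural numbers):

\[
x \text{ is infinitesimal} \iff 0 < x < \frac{1}{\underbrace{1 + 1 + \cdots + 1}_{n\ \text{ones}}} \quad \text{for every } n = 1, 2, 3, \ldots
\]

Equivalently, x > 0 and n · x < 1 for every such n; the reciprocal 1/x of a nonzero infinitesimal is then an infinite number, greater than every standard n.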
Thus, according to Robinson at least, infinitesimals exist and can be used
profitably in analysis. However this does not completely deal with questions to
do with such numbers, questions I will associate with their ‘existence’ for reasons
that will hopefully become clear. For example, the fact that the existence of
infinitesimals follows from other axioms of mathematics (or set theory) can
be used as a way to focus on those axioms and provide a testing ground for
questions about those axioms: whether we believe them, or in what sense we
believe they model the (or a) valid mathematical universe.
This paper attempts to be a mathematician’s view on questions on the nature
of mathematics, mathematical objects and their existence, and mathematical
reasoning. There is a range of mathematical ideas here, which I attempt to tell
‘straight’ without over-simplification, and, where there is a choice, concentrating
on the mathematical view. I make comments on the underlying philosophy
where I am able, without being particularly thorough. A more thorough and
detailed examination of these ideas might constitute a new research project in
its own right, or possibly more than one.

1 The volume of essays edited by W.D. Hart [4] is an excellent introduction to these issues.
2 Robinson called the new numbers in his system ‘non-standard’ to distinguish them from
the ‘standard’ numbers or usual numbers of other kinds of mathematics. Thus ‘standard’
and ‘non-standard’ are technical terms with precise meanings. Unfortunately, many people
reading ‘Non-standard Analysis’ see it incorrectly as meaning the activity of analysis done in
a non-standard way, and this easily becomes a pejorative term for the subject, which is most
unfortunate. Most recent authors write ‘nonstandard’ without the hyphen to emphasise the
technical meaning of the word, and I will follow this convention here.
The paper is organised as follows. After this introduction, the first main
section, Section 2, contains descriptions of four main points of view of math-
ematical research, as a working mathematician would see them. These four
viewpoints are not mutually exclusive, nor do I claim that the list is complete.
I suggest ways in which the four views merge into each other, but these are not
the only ways. (Indeed this property of mathematics that it can be looked at
in different ways and at different levels simultaneously is one of its strengths.)
In the following section, Section 3, I shall present the case for and against in-
finitesimal numbers in each of the four views. My presentation is again mostly
mathematical, though I try to make speculative suggestions wherever possible.
Section 4 contains my personal conclusions from these thoughts, which are that
the structuralist view is essential, not just for mathematics, but for everyday
thought and arguments too. However, the structuralist view as sometimes pre-
sented (e.g. Parsons’s essay [7] reprinted in Hart [4, Chapter XIII]) has problems
which must be addressed. My suggestion is essentially that the structuralist
account does not go far enough, and this section concludes with what might
be described as a brief manifesto, or research proposal, for what I call ‘pure
structuralism’. I conclude with a list of research questions that arise out of the
discussion that I think are worthy of further study. An appendix presents some
technical information on nonstandard mathematics for the interested reader.
The four viewpoints are as follows. Firstly, what I call the view from a
unifying theory is the idea that all mathematics can be done or perhaps even
is best done as if from a single unifying theory. A set theory such as ZFC is
typically used. I mention this view first because it seems to be the most com-
mon ground for the majority of mathematicians. It is also a useful simplifying
view for mathematical practice: it may not be optimum for all work or repre-
sent the full views of any particular working mathematician but it is ideal for a
first presentation of new work. Secondly, there is what I call the pluralist view,
where the main work of mathematics is in looking at a large variety of differ-
ent number systems, with ‘number’ being taken in the widest possible sense,
and I would include geometrical arguments within these terms. The underly-
ing theory is weakened but the construction of these systems is still possible
in ZFC. These number systems are regarded as the most important aspect of
mathematics and a number of them are more fundamental than others because
of their ease of construction and applications. Chief amongst these systems are
those for natural numbers, integers, rationals, reals and complexes. The third
of my viewpoints is the structuralist one, that a system of numbers or other
mathematical objects takes abstract meaning through what it does rather than
what it is—the axioms it satisfies rather than the way it is constructed. Here,
the axioms take priority, but constructions are still required to show that such
objects still exist. The most important feature that this brings to bear for us is
canonicity. Remarkably often it is possible to prove that two systems satisfying
the same axioms are isomorphic, so that the number system is ‘naturally forced
upon us’, or canonical; and sometimes even more is true: the canonicity itself is
canonical. This, it will be seen, has very deep consequences for the structuralist
view of mathematical objects. Finally, we consider what I call the utilitarian
view, that mathematical objects that are useful for other kinds of mathematics,
and other applications such as science, are the ones that deserve most attention
and the ones that may be said to have existence in their own right. Apart
from the issue of practicality, an argument related to Hilbert’s programme sup-
ports this view. It will be seen that the issues of canonicity are essential here
and also support the utilitarian existence of certain objects, and in particular
canonicity is particularly interesting with respect to applications for physical
theories, even possibly quantum mechanics.
This lists my four viewpoints and summarises the content of Section 2. I
should emphasise that this section is not complete, even from the mathematical
perspective, in the sense that there are many other sensible viewpoints, includ-
ing different ways of combining or prioritising the viewpoints I have given. One
major omission is the more modern view of mathematics as done on a com-
puter, with computer algebra systems, or similar. This view is particularly poorly
represented here and presents interesting philosophical, mathematical, and computa-
tional problems, related to constructivism, but perhaps distinct from it too. But
perhaps the biggest omission is the lack of discussion of how the various ways
the views fit together and what extra they offer when they are taken together.
In particular I do not mention anything like a constructivist or intuitionist
framework for combining these views, this being where I suspect intuitionistic
mathematics might still have its greatest impact.

2 Mathematical existence
The main issue, according to Benacerraf [1] (reprinted as Hart [4, Chapter I]), is
that the theory of truth of mathematical statements should be consistent with that
of everyday truth, as should the idea of knowledge. There are a number of
suggestions, not all incompatible with the others, for answers to the questions,
‘What is a mathematical object?’ and ‘What does it mean to say that an object
exists?’ In this section we highlight some of the main options and choices from
the point of view of a typical working mathematician and actual mathematical
practice. In each case I will hint at how mathematics is done, and what the
idea of its underlying foundation is. (These different views are not exclusive of
each other. Indeed one of the interesting things about mathematics is the way
the various different foundational views can work together as different aspects
of one’s work or at different levels.)

The unifying view of mathematics. By ‘the unifying view’ I mean the
view of mathematics that it can all be done from one global theory, such as that
of set theory, typically ZFC, and whether or not one chooses to write proofs
and other arguments formally (and most choose not to) it is clear from their
presentation that they all can be written in this way. Therefore one’s work
contributes at the very least to the body of knowledge of consequences of the
global theory.
I think it is fair to say, however, that most mathematicians do not give
their underlying theory much attention, preferring to ‘get on with the job of
doing mathematics’. But if they were asked, they could give a list of principles
they find admissible for deduction which would probably amount to first-order
logic and axioms that would be available in a (possibly multi-sorted) version of
ZFC set theory. Most mathematicians in this sense are rather conservative, and
understand that this conservatism places them well within the realm of ZFC.
This places their work on a reasonably sound footing, at least according to one
of the standard paradigms, but their belief in the soundness of mathematics
is typically much stronger. They have little difficulty mentally picturing the
set theoretical universe described by the ZFC axioms, or (more realistically) the
part they are working on, as existing in some sort of platonic way. They ‘get on’
with their mathematics, which is to say, they posit the idea and consequences
of there being particular objects with particular properties, both informally
(using images, diagrams, analogies, and so on) and semi-formally. Research
mathematicians have generally trained themselves to be ‘pessimistic’ about their
picture of the universe; that is to say the mental picture they have is generally
inclusive of all possibilities and therefore necessarily somewhat incomplete. This
‘pessimism’ arises because the informal ‘brain-storming’ stage is important for
successful work, but from experience they know it can be unreliable. Potentially
unreliable arguments generated by informal means are always carefully checked,
verified and communicated in a rather different semi-formal style when it is
believed that some important conjecture that can be proved has been identified.
When working semi-formally, proofs are written down in a mixture of nat-
ural language and mathematical symbolism in such a way that they could in
principle be rewritten or developed into formal proofs in first-order logic. Few
mathematicians work in anything other than first-order logic. These logical
principles used in such proofs are, however, always phrased in a way that is
compatible (via an informal version of the soundness theorem) with the notion
of truth (defined using something like Tarski semantics) relative to the universe
they conceive of as (at least for the moment) existing platonically. These proofs
can be rewritten as formal proofs in ZFC and are frequently interpreted as such,
but at the time of conception the syntactical proof rules are rather considered
as semantic rules concerning truth and possible situations which relate to the
conception of the universe being ‘explored’.
If an example or algorithm or other object is explicitly exhibited rather than
shown to exist by non-constructive means a mathematician will usually say so,
rather than leave the realm of classical logic. Proofs are deliberately written in
a semi-formal way because mathematicians know that there may be a number
of subtly different interpretations of what they write, and they emphasise the
(semi-formal) arguments rather than the pure statements of their results to
aid these different interpretations. A proof can be read as the reason why
some statement is true, but also often as a method or process by which to
carry out a calculation, and although mathematicians are generally unfamiliar
with intuitionistic logic and other constructive logics, they do present proofs as
methods when appropriate.
Of course, some mathematicians are more familiar with foundational matters
and may explicitly state they are ‘working in ZFC’ or similar. A few may be
working in areas where additional axioms (such as CH, GCH, AD or large
cardinal assumptions) are useful and typically pick and choose from this list of
additional axioms as suits them. In any of these cases, mathematics is usually
done in the first instance within some standard logical framework, such as ZFC,
and the actual work taking place is both informal and semi-formal, but both
parts are conceived by the mathematician as taking place in some semantic
manner relative to the conceived or imagined set theoretic universe.

The pluralist view of mathematics. Mathematics is primarily (but not
exclusively) about numbers, where ‘number’ is often taken in the most general
sense possible. Some number systems are of particular importance, and new
number systems are usually built from more fundamental ones. Arguably the
most fundamental number system of all is the system N of the natural num-
bers, 0, 1, 2, 3, . . .. These frequently are described as being the numbers corre-
sponding to a finite sequence of strokes on the page, the number 5 corresponding
to ||||| for example. The natural number system is extended to the integers, Z,
the rationals, Q, the reals, R, and the complexes, C. Other number systems
may be devised by related means, such as polynomial rings, groups, finite and
other fields, and extensions and quotients of these. Other structures that are not
strictly speaking numbers but are treated as numbers by mathematicians,
such as the collection Vω of hereditarily finite sets, the collection of (finite or
infinite) graphs, groups, etc., may also be defined and used.
In this view, mathematics becomes an industry of combining these ‘num-
bers’ from these different systems to find interesting properties or facts about
them, to devise new systems, and to use these systems to model phenomena
in the natural world. Some logical theory is required for this endeavour of
course, but mathematicians working like this typically regard each individual
system of numbers as having some genuine existence. It may be that a work-
ing mathematician will have what amounts to different logical conventions for
the different areas of work. The theory for the reals might be based on first-
order logic, but that for making new systems out of old may be based on some
category-theoretic framework. From the foundational point of view the math-
ematician feels on safer ground as the theory of each system has some sort of
independent life, and if one falls, being found to be inconsistent or uninteresting
(that is, if some interesting mathematical result is proved stating that there is
no system with the particular properties in question) then the others still stand.
In this view, each system is constructed, and exists because it is constructed.
It has a particular construction, and therefore a particular definition. A rational
number p/q is the equivalence class of a pair of integers (p, q) with q ≠ 0. One
of the early tasks of set theory (the theory ZFC, for example) was to verify
that all the constructions of all these number systems could be carried out in
that theory, so that ZFC could (but not necessarily should) be regarded as a
unifying theory of all these number systems. In this sense set theory was more-
or-less successful (though it is inconvenient that so-called ‘large categories’ are
not sets) and the pluralist view can be and is partially subsumed by the global
unifying-theory view.
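To make the kind of construction involved explicit: the rationals mentioned above arise by identifying pairs of integers under the relation (written here as ∼)

\[
(p, q) \sim (p', q') \iff p\,q' = p'\,q \qquad (q, q' \neq 0),
\]

so that p/q is, by definition, the equivalence class of the pair (p, q) under this relation.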

The structuralist view of mathematics. The main problem with what
I have called the pluralist view of mathematics is that two workers may have
different constructions of number systems and these systems need to be com-
pared. It is certainly the case that many number systems that look like the real
numbers can be constructed, and the important thing about a particular real
number is not how it is actually constructed (as a Dedekind cut, the equivalence
class of a Cauchy sequence, a continued fraction, or whatever) but what it does.
Thus the important property of π is that it is the ratio of the circumference
of a circle to its diameter, and not that it happens to be a Dedekind cut of
smaller rational numbers (in one construction of R). To get around
this problem one writes down axioms for each structure of numbers one devises
and then proves these axioms to be true for the number system in question.
The name ‘axiom’ is used not because it is to be assumed without proof (on the
contrary, these axioms must be proved) but because the other features of the
structure will be quietly forgotten and future work regarding the structure will
be from the axioms we have listed alone and nothing else.
The word ‘structure’ has crept in here because a set such as N or Q without
additional structure is just amorphous and depends only on its cardinality. So
the additional algebraic structure (the order relation and addition and multipli-
cation operations in N and Q) are important to distinguish these systems. The
axioms describe the properties of the elements of the set (as primitive objects,
i.e. without structure in themselves) and the properties they have relate to
order, addition and multiplication. The number 2/3 in Q is described by the
properties it has with respect to order, addition and multiplication compared
to other rationals. In other words mathematical objects like N and Q, and also
the numbers themselves, are abstract objects characterised by ‘what they do’
rather than ‘what they are’. For an elementary introduction to mathematics
considered this way, read Gowers [3]. A more critical philosophical
account is to be found in Parsons [7] (Hart [4, Chapter XIII]).
There are two main aspects of this new way of thinking about mathematics.
The first is that we have the basis of understanding the idea of an abstract math-
ematical object: an object that has no structure in itself but is characterised by
what it does in certain situations. Whatever the philosophical implications of
this view are, it is at least in accordance with modern mathematical practice.
The second is that an abstract mathematical object is in some sense independent
of how it is constructed—the actual ‘internal structure’ of a real number as a
Dedekind cut or whatever—but is in fact the same object as one constructed in
an entirely different way but with the same properties. Two key examples illus-
trate this perfectly. The axioms for the natural numbers N given by Dedekind
characterise the structural properties of the natural numbers precisely. Simi-
larly the axioms for the real numbers as being a complete Archimedean ordered
field (also essentially due to Dedekind) characterise the structural properties of
the reals exactly. We have the following.
First Canonicity Theorem for the Natural Numbers. Let N and M
satisfy the axioms for the natural numbers. Then N and M are isomorphic.
First Canonicity Theorem for the Real Numbers. Let R and S satisfy
the axioms for the real numbers. Then R and S are isomorphic.
In undergraduate lectures I like to describe a conversation between humans
and some intelligent extra-terrestrial species soon after first-contact. Humans
and ET might struggle to agree on what constitutes something fashionable, or
elegant, or even beautiful, but the mathematicians of the two races would get
together and discuss axioms for real numbers, and would presumably agree on
the set of axioms each species takes (or if not, prove from one set of axioms the
axioms of the other, and vice versa, showing that the two sets are equivalent)
and therefore be able to conclude that both humans and ET share the same
concept of real number, irrespective of any ideas each race might have of their
implementation using Dedekind cuts, Cauchy sequences, or whatever.
Although these theorems are well-known and appear to support the view
that the structuralist approach to objects works, at least in these two cases, the
story is not complete. Although we now know that all systems of reals are in
fact structurally similar, these results do not tell us how they are similar. But
the theorems that there is essentially only one natural number system and one
real number system are even stronger than this, in another subtle but important
way. Given two systems N and M of natural numbers, or two systems R and
S of real numbers, not only are they isomorphic, but it is possible to show that
in each case there is only one possible mapping f : N → M , g : R → S that
demonstrates this isomorphism.
Second Canonicity Theorem for the Natural Numbers. Let N and M
satisfy the axioms for the natural numbers. Then there is a unique isomorphism
between N and M .
Second Canonicity Theorem for the Real Numbers. Let R and S satisfy
the axioms for the real numbers. Then there is a unique isomorphism f : R → S.
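A brief sketch of why the uniqueness holds, assuming the (second-order) Dedekind axioms referred to above and using the names f and g from the previous paragraph: any isomorphism f between natural number systems must satisfy

\[
f(0_N) = 0_M, \qquad f(S_N(n)) = S_M(f(n)),
\]

so induction forces its value at every element (and the recursion theorem provides its existence); and any isomorphism g : R → S between real number systems fixes 0 and 1, hence sends each rational of R to the corresponding rational of S, and since it preserves order and every real number is determined by the rationals below it,

\[
g(x) = \sup\,\{\, q \in \mathbb{Q} : q < x \,\},
\]

the supremum being taken in S, which determines g completely.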
Not only are the ideas of ‘natural number’ and ‘real number’ canonical,
or forced upon us in a natural mathematical way, but the isomorphism that
shows this canonicity is canonical too. This means that however one defines
real numbers, not only is the structure of real numbers essentially unique, but
the individual real numbers are characterised by their properties and are also
essentially unique.
The Second Canonicity Theorem has important mathematical consequences.
But it has important consequences for physical applications of the reals and
measurement too, which brings us on to our fourth view of mathematics.

The utilitarian view of mathematics. This is the idea that mathematical
ideas, objects, and theories exist because they are necessary or useful to explain
or model scientific phenomena, including other areas of mathematics. In math-
ematics, one can temporarily posit the existence of all sorts of mathematical
structures and objects and it is remarkable from a psychological point of view
how these objects can take some sort of real existence in the imagination when
one starts to work with them. In this sense one can choose to believe in almost
anything, including the leprechaun with a pot of gold at the end of the rain-
bow. A reasonable restriction is that one’s beliefs should not force one into an
inconsistent point of view, but it is not necessary to be reasonable.
I have heard mathematicians being compared to children at a sweet shop,
being offered many glittering packages of sweets from which they may pick and
choose the ones they want. The choice of such sweets, be they axioms or number
systems or something else, is usually made for practical reasons—to solve the
current problem at hand—or for reasons of elegance, which may or may not in
the long term amount to the same thing. We have already seen an example,
where a mathematician needing axioms for set theory that go further than the
usual ZFC axioms tends to pick and choose the ones they need without too
much concern about how these are justified. But we were all brought up in a
very proper way and know that an excess of sweets can give one a tummy-ache.
So one tries to get by with as little as reasonably possible, though starting with
a large tub of sweets and being able to pick and choose a small number from
such a large variety certainly adds to the excitement and excites the mind as to
the possibilities of some hitherto undreamt-of exotic combination.
There are two arguments supporting this view.
One is the application of Peirce’s principle of abduction used by Quine [8]
(Hart [4, Chapter II]), that if a piece of mathematics X is required to understand
an observed phenomenon Y then the observation of Y tends to support the
argument that X is correct, or true, or sound. This argument is also employed
even if X cannot be shown to be necessary for an understanding of Y but is perhaps
the most elegant or the most powerful or suggestive of other applications. This
argument might be considered to have more force if Y is some aspect of ‘the real
world’ and X is being used as part of a mathematical theory to model phenomena
in the real world, but it seems reasonable to take this further and argue that
some new kind of mathematics X has mathematical existence (whatever that
may be) if it is the most elegant or powerful way of explaining some other piece
of mathematics Y.
The second argument relates to Hilbert’s programme and says that pro-
vided that X can at least be argued to be consistent (or consistent with other
mathematical ideas one is using) then it can be regarded as ‘ideal’ mathemat-
ics that has some validity of its own. Gödel’s Second Incompleteness Theorem
shows that Hilbert’s programme as originally posed cannot succeed, but the
main thrust of the programme still holds weight. This is that new axioms or
new ideal elements may be accepted if shown consistent, and such ideal mathe-
matics makes a useful contribution if it can be shown to have many reasonable
consequences for ordinary ‘real’ mathematics—and there are many levels of ‘re-
alness’ from verifiable statements about the natural numbers (Hilbert’s original
notion of ‘real’) to comprehensible statements about one of the other standard
number systems discussed above.
In some sense the abduction argument and Hilbert’s programme argument
are similar, in that they both try to measure the success of the theory in terms of
‘real’ consequences, be they in some familiar mathematical structure, or in their
use as models for natural phenomena, and these useful consequences ‘trickle
down’ in the sense that given a theory Y which has consequences for X, a the-
ory Z that has consequences for Y is likely to also have consequences for X. Put
a different way, if at the lower levels of this ‘hierarchy’ we can readily detect
problems (such as inconsistency, and this is the point of Hilbert’s programme,
that inconsistency is ‘real’) by ‘trickle down’ any problems in higher mathemat-
ics will eventually show up. This is even more true of powerful and elegant
higher mathematics, which being one of the more glittering sweets available is
likely to be taken up more often by other mathematicians, who will surely in
due course find out what the problems of it are, if there are any.
In addition to all this there are mathematical reasons for taking the utili-
tarian point of view. A consistent first-order system is, by the Completeness
Theorem, satisfied by some mathematical structure. The Completeness Theo-
rem is provable in a minimal theory of mathematics (ZF set theory with the
Axiom of Choice is certainly sufficient, but rather less is actually required) and,
as we shall see, arguments supporting consistency are not necessarily as difficult
as they might seem in all such cases.
However the main problem with consistency as a criterion for belief is that it
is rather weak: given its consistency and some unifying ZFC-like framework, the
existence of our number system follows, but we are looking to see if there is more
than this. Thus belief (at least for the context of this paper) needs to have some
reason or rationality associated with it. From the point of view of mathematics
we expect belief in an object to have some usefulness: adding an axiom for the
existence of a leprechaun does not in itself improve mathematical knowledge and
if we tried it we would tend to reject the axiom and disbelieve in the leprechaun.
But if the ‘leprechaun’ was simply a fanciful name for an abstract mathematical
‘point at infinity’ (mathematicians are indeed given to using fanciful names for
abstract ideas such as this) and the axioms state this property of the ‘leprechaun’
correctly then its addition could quite likely simplify the description of the
geometry of the system being considered and we would have rational reason to
believe the new axioms and the existence of the ‘leprechaun’ so characterised.
In other words, it’s not what one believes and what one calls it that’s important,
but rather how it affects the way one thinks about everything else.
Consider for example the addition of the number i for √−1 to the real
numbers, making the complex numbers. From the point of view of the physical
universe, especially when one is thinking of the reals as measuring distance or
time, the square root of minus one is a mysterious object, and historically it
was rejected for a long time because this number does not seem to exist in this
physical sense. However, the addition of this number to the reals turns out to
be straightforward mathematically and not nearly as complicated as applying
the Completeness Theorem of logic: essentially all that is required is to know
that the polynomial X² + 1 is irreducible over the reals, something that is quite
easy to establish. More precisely, the symbol i is taken simply as a formal
symbol and a complex number is a formal expression x + iy where x, y are real
numbers, and this expression can be considered as being simply a notation for a
pair (x, y) of real numbers, with special rules for addition, multiplication and so
on. That these rules make sense depends simply on the irreducibility of X² + 1.
The number i itself is 0 + i1, and once one has added i to the reals one can see
that all numbers of the form x + iy need to be added, so this construction has
a pleasant kind of ‘inevitability’ about it.
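Concretely, the ‘special rules’ for pairs mentioned above are the familiar ones, with the multiplication rule obtained by expanding the product and replacing i² by −1 (that is, by reducing modulo X² + 1):

\[
(x_1 + i y_1) + (x_2 + i y_2) = (x_1 + x_2) + i\,(y_1 + y_2),
\]
\[
(x_1 + i y_1)(x_2 + i y_2) = (x_1 x_2 - y_1 y_2) + i\,(x_1 y_2 + x_2 y_1).
\]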
Thus the complex number system is easy to construct from the reals, and
this is already in its favour. Is it a useful system of numbers? Well yes, most
definitely, as the addition of i simplifies a great many theorems and formulas.
For example the ‘Fundamental Theorem of Algebra’ that ‘every nonconstant
polynomial has a root’ becomes true in general without having to qualify the hypothesis as
‘every polynomial of odd degree’ as one would have to for the reals. Complex
numbers simplify the equations for the solutions of polynomial equations of third
and fourth degree even when the solutions are purely real numbers (this was the
first use they were put to and their original motivation) and the introduction of
i unifies equations for real-valued trigonometric and hyperbolic functions into
one single set of formulas.
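The unification referred to here is, presumably, the one provided by Euler’s formula (θ and t denoting real variables):

\[
e^{i\theta} = \cos\theta + i\sin\theta, \qquad
\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}, \qquad
\cosh t = \frac{e^{t} + e^{-t}}{2},
\]

so that, for example, cos(it) = cosh t, and the separate families of trigonometric and hyperbolic identities become instances of a single family of identities for the exponential function.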
It could be said of the complex numbers that these are merely a technical
device for handling a pair of real numbers simultaneously. And of course this is
how they are constructed or defined. However it is important to be clear that the
applications of complex numbers show that they are rather more than this. In
particular the important notion of differentiability of a complex-valued function
is not at all the same as that of functions of two real variables, and may not have
been discovered but for the view of complex numbers as single numbers rather
than pairs of reals. In other words, the fact that complex numbers suggest new
mathematics that would not have been otherwise obvious is a very strong factor
for their usefulness.
The other aspect of the utilitarian view is the usefulness of the mathematics
to scientific theories, especially theories of physics. Here it is important to stress
that the question is whether any particular kind of number can be used to
develop a useful theoretical model of some aspect of the universe, not whether
numbers really exist in the physical world. For example, it is traditional to
measure the traditional dimensions of length and time using real numbers. (The
switch from the use of rational numbers to real numbers for this was made by the
ancient Greeks who were genuinely concerned about measurements that seemed
to have to be made with irrational numbers such as √2. After several centuries
we do not seem to have any serious rival for this use of the reals, something I
find surprising.)
If we ask how complex numbers help us with measurements and physical
theories we see that although distance and time do not obviously have complex
values, some quantities, notably current and voltage in AC circuits, are naturally
modelled as complex numbers, with the magnitude of the number being the peak
value and the argument of the number being the phase.3 So from the point of
view of modelling physical phenomena, complex numbers play a part and should
be accepted. Whether one goes so far as speculating whether other equations
that occur in physical models also apply to complex numbers, for example that
a particle with imaginary rest mass might exist and if so would travel faster
than light speed, is perhaps more the realm of science fiction. However the fact
that numbers such as i promote such speculations and that at least one or two
of these speculations may turn out to be reasonable science rather than fiction
is in itself also a reason to accept the utility of i.
For the application of numbers to natural phenomena, the canonicity theo-
rems are particularly important. The First Theorem is obviously essential, for
if we were to measure a quantity by a number—a real number perhaps—it is
important that the resulting numerical measurement comes from an identified
structure so that two such measurements can be combined or compared. But
the Second Canonicity Theorem is important too: it is this that guarantees
that the result of a measurement is unique and reproducible. If there are two
different isomorphisms between structures R and S then each of R, S has at
least one nontrivial automorphism sending numbers to different numbers with
the same properties. And if two numbers x, y ∈ R have the same properties
they are both candidates for the same measurement of a physical quantity.
The Second Canonicity Theorem might fail for a structure S because the
system S may not have enough structure to distinguish between its elements. An
example of a number system satisfying the First but not the Second Canonicity
Theorem is the system of complex numbers C as a field with +, ·, 0, 1 and (so
that the real number line can be identified as a special subsystem) the absolute
value operation |x|. The First Canonicity Theorem follows from that for R. But
there is no way to distinguish i and −i, nor (more generally) x + iy and x − iy.
In other words conjugation x + iy ↦ x − iy is an automorphism of the structure
and Second Canonicity fails. We can resurrect Second Canonicity by adding
to our structure an additional function, for example the argument map arg z
(returning a value in the interval [0, 2π)), but adding such a function requires
an arbitrary choice of which is the upper half-plane and which the lower, or
whether angles are measured clockwise or anticlockwise.

3 The magnitude of x + iy is the real quantity √(x² + y²) and its argument is tan⁻¹(y/x),
taken in the appropriate quadrant.

Failure of canonicity for C has some consequence for measurements using C.
For example in an experiment or electronic design using C to model alternating
current (AC) there are two choices for the measurement of the very first current
or voltage, but once the convention for this first measurement is chosen the
rest of the measurements must follow suit. This is of little consequence to the
physical theory using C to model the actual physical system, but suggests that
it is not in fact exactly true that we ‘see’ complex numbers as complex voltages
or currents in an AC circuit. Put mathematically, the first complex number
value or measurement is one of two values, x + iy or x − iy, the set of which
is called the orbit of (either) value under the automorphism group in question.
There is nothing to choose between these two values, although they turn out to
be the same value if y = 0. But (provided y 6= 0) the second value u + iv will be
uniquely determined. The orbit of a single point x+iy under the automorphism
group has 2 elements (or 1 if y = 0) but the orbit of a pair of points (x+iy, u+iv)
also has at most 2 points. No physical theory of measurement I can think of
can distinguish between i and −i, so it is not quite true to say that the complex
numbers exactly model AC circuits.
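In the notation above, and writing σ for the conjugation automorphism (so that the automorphism group in question is {id, σ}), the two orbits being discussed are simply

\[
\{\, x + iy,\ x - iy \,\} \qquad \text{and} \qquad \{\, (x + iy,\, u + iv),\ (x - iy,\, u - iv) \,\},
\]

so fixing a convention for the first measured value (a choice between x + iy and x − iy) determines the representation of every subsequent one.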
For the complex numbers mathematicians usually choose to live with the
failure of the Second Canonicity Theorem; this failure, and the additional
properties that C has as an extension of R, are signalled by and coded into the
conjugation operation.
The canonicity of the reals is not necessary for believing their existence, but
it is a very desirable property of the reals and strong evidence for such belief.
In the framework already set up, it is an elegant property of the reals that is
potentially highly useful. Although canonicity itself does not imply that real
numbers can be used to measure physical quantities it does at least show that
the number system is available for such measurements. And, as we know, it is
common in physics to measure distance, time, mass, energy and so on as real
numbers with appropriate units. This is not to say that real numbers must be
used in this way or that there is no other more appropriate system to use, but
rather that the real numbers form a particularly useful model of such quantities
that is applied extensively in physical theories.
In contrast, consider the case of a family of number systems described by a
set of axioms A which fails to have the basic canonicity property, i.e. we
cannot prove that any two systems satisfying A are isomorphic. We might be
able to convince ourselves of the existence of systems satisfying A by elementary
or straightforward manipulations of systems whose existence we already are
convinced about. For example if A is the set of axioms for abelian groups,
we can present the reals with the addition operation, or the reals with zero
removed and the multiplication operation, or the integers modulo 5 as concrete
examples of systems satisfying A. But if our evidence for the existence of systems
satisfying A only comes from complicated arguments in ZFC this option is not
available to us. If we can prove in ZFC that there is, up to isomorphism,
only one system satisfying A then we can posit the existence of such
a system ‘in the real world’ and describe it accurately in terms of the theorems
about it that are provable in ZFC. We have a lot of concrete information about
this system that we can at least consider, and maybe later choose to believe
in. (There is another example of the semi-semantical reasoning going on here.)
Thus provable canonicity in a set of axioms like ZFC is at least a useful precursor
to belief of existence. Equally, with the obvious necessary extra care being
taken, provable canonicity in some other conceivable set of axioms other than
ZFC is helpful evidence for the belief of prior existence of the number system,
irrespective of what we take for our usual axioms for set theory or mathematics.
If systems satisfying A are not canonical then perhaps they are there (in
models of ZFC) because of some dubious axiom or artifact from the way ZFC is
conceived. This is particularly pertinent because one of the axioms of ZFC that
has been the subject of much debate as to its correctness over the last hundred
years—the Axiom of Choice (AC)—is often recognisable in its consequences by
their non-canonicity. For example, AC implies that there is a ‘well-order’ of
the set of real numbers. It doesn’t really matter what a well-order is for this
discussion, except that no such well-order can be defined by elementary means
as discussed earlier and well-orders are (provably in ZFC) highly non-canonical.
What’s more, from knowing in more detail the structure of any well-order of
the reals, it would be possible to read off the solution to one of the biggest open
problems in set theory: whether the continuum hypothesis (CH) should be
regarded as true or not. (CH is known to be independent of the other axioms
of ZFC, but no satisfactory evidence as to whether it is CH or its negation that
describes the true mathematical universe is known.) Non-canonicity in itself is
sufficient to make the issue warrant further investigation and my view is that
the other evidence is quite compelling in the direction of not accepting at this
time the prior belief in the existence of a well-order on the reals.
There are other more fundamental issues connected with canonicity that
are not related to AC, but which are rather more difficult to isolate. Interest-
ingly some of these other issues may have an impact on physical principles and
quantum mechanics.
Consider the air in front of me. Most theories say it consists of particles—air
molecules. Certainly there is plenty of good scientific evidence to say that there
is matter in the air about us and it is in the form of very small particles, so
this seems entirely reasonable to believe. But if I was asked to focus on one
particular air molecule and describe it—in particular whether it exists—this
becomes more problematic. The immediate question is which one? There is an
amorphous mass of air molecules in front of me and I can’t pick a single one
out. Does that matter? Is it a reasonable position to believe in the existence
of the air in front of me and have some belief about the form of structure that
air takes without any specific belief in any particular air molecule? If I am to
believe in the existence of any single air molecule, shouldn’t I be able to say
something specific about it other than it is simply an air molecule and it exists
somewhere?
From the point of view of quantum physics, my refusal to believe in a single
molecule might be quite a sophisticated position. The uncertainty principle says
I shouldn’t be so sure of any single molecule because I cannot specify its position
and velocity. Furthermore, the Pauli exclusion principle says that all individual
particles must have distinct states, i.e. there should be ways to distinguish them.
Now I didn’t refuse to believe in the existence of individuals rather than the
amorphous mass because I chose to bow to the Heisenberg–Pauli god, but
rather because of some more general principle that needs to be pinned down
and understood better. I admit to finding it difficult to articulate the exact
principle here, but it is something along the lines of the following. Were I to
believe in a single air molecule without being able to say anything at all specific
about it, this would hardly be a useful belief but instead would be rather like
a belief in an object that has no impact whatsoever on the rest of my thinking,
like the arbitrary belief in the leprechaun. I can however reasonably believe in
the amorphous mass of air, and also reasonably believe in the theory that says
it is described best as a collection of individual molecules. Were I to be able to
say something specific about some air molecules, the ones that are molecules of
oxygen perhaps, then I would have a stronger belief in a particular part of the
mass of air, the part that is oxygen, but I still would not be able to have any
useful belief in any particular oxygen molecule.
One wonders whether this issue of canonicity or definability and existence
of individuals may have some bearing on the underlying principles of quantum
mechanics. Unfortunately I have to leave these speculations open here as the
questions seem difficult at this stage, but it would seem worthwhile returning
to them at another time.

3 Existence of infinitesimals
The previous section set out four main viewpoints a working mathematician
might typically take in his or her work. None is thought through in detail ac-
cording to the underlying philosophy—we will make further comments on these
views later. In this section I would like to describe the case for existence of
infinitesimals and nonstandard number systems from these different viewpoints.
For background information on infinitesimals see Robinson [9], Kossak’s arti-
cle [6], or the technical appendix to this paper. Additional material on first-order
logic, as well as a brief introduction to nonstandard analysis, also appears in The
Mathematics of Logic [5].

Infinitesimals in the unifying view. From the unifying point of view, the
existence of a number system with specified properties follows, if at all, from
the axioms of the unifying theory one has chosen to adopt. In the case of
the theory ZFC, axioms are available to construct or define the set of natural
numbers, N (or ω, as it is usually called in this context), and from this the
usual constructions allow us to define Z, Q and R. These systems are regarded
(within ZFC) as structures for first-order languages and ZFC can state and
prove the main results of first-order logic including the Soundness Theorem, the
Completeness Theorem, and Loś’s ultraproduct theorem. Then by usual model-
theoretic means, we can either analyse the structure R and using Soundness
deduce that a first-order theory of hyper-reals with infinitesimals is consistent
and hence by Completeness there is such a structure in the universe, or go
directly from R to a hyper-real structure ∗R by means of Loś’s theorem and a
suitable ultrafilter, usually a non-principal ultrafilter on ω or N.
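As a sketch of the second route (the ultrapower construction), assuming a non-principal ultrafilter U on ω is given and writing ∼U for the induced identification: elements of ∗R are sequences of reals identified when they agree ‘U-almost everywhere’,

\[
{}^{*}\mathbb{R} = \mathbb{R}^{\omega} / \sim_{U}, \qquad (a_n) \sim_{U} (b_n) \iff \{\, n : a_n = b_n \,\} \in U,
\]

with operations and order defined coordinatewise modulo U. Loś’s theorem then says that a first-order statement holds in ∗R exactly when it holds at U-almost all coordinates, and the class of the sequence (1, 1/2, 1/3, . . .) is a positive infinitesimal, since for each standard n it lies below 1/n at all but finitely many coordinates.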
In this sense, number systems with infinitesimals clearly exist, and this is why
Robinson’s approach is considered correct and rigorous. One concern that we
may have is that the Axiom of Choice (AC) is used in an essential way as one of
the ZFC axioms required for the Completeness Theorem, or for the construction
of the ultrafilter to use when applying Loś’s theorem. Looking at it a different
way, these nonstandard number systems could be regarded as a test case for
theories such as ZFC: ZFC clearly ‘predicts’ the existence of them but direct
constructions do not yield such systems. Is there some more direct way that
arguments for such number systems can be given? Does this prediction support
or refute the traditional belief that ZFC is a good unifying theory?
We have concentrated on those unifying axiomatic systems in the ZF family.
It is worth remarking that a number of alternative systems exist in which non-
standard numbers appear more naturally. Some of these have associated philo-
sophical motivation (such as Vopěnka’s Alternative Set Theory [10]). There are
a number of systems proposed by Kanovei and others. In any case, to adopt
such a system requires one to understand the consequences, especially as it forces
one’s mathematics outside the mainstream.

Infinitesimals in the pluralist view. In the pluralist view, we should take
as little as possible from our metatheory and construct nonstandard number
systems directly as an extension of R perhaps, analogously to the construction
of C. The theory ZFC ‘predicts’ that this should be possible, and the Loś con-
struction appears to be the most straightforward approach. It is direct and
explicit, once one is given a suitable ultrafilter. For this approach to work one
seems to need an axiom for the metatheory saying that such ultrafilters exist,
and this axiom needs to be justified. The two such possible justifications that
come to my mind are: (A) an argument that ZFC or some fragment of it is
justified, as is AC (or the Boolean Prime Ideal Theorem) and the argument for
ultrafilters from these; or (B) an argument of the utilitarian sort that says that
ultrafilters are necessary to explain and work with a great number of mathe-
matical phenomena.
The existence of these ultrafilters is, it seems to me, not unreasonable, so
that it seems that we can reasonably imagine our pluralist universe populated
with nonstandard number systems amongst others. Without such ultrafilters,
it is possible to make poor versions of nonstandard number systems. One can
take for an infinitesimal h a transcendental number over R and order the field
extension R[h] so that 0 < h < 1/n for all n ∈ N. This is an ordered field with an
infinitesimal, but not as rich as the hyper-reals constructed from an ultrafilter,
and not yet as useful for mathematical analysis either. I suspect that if this
approach were to be continued, we might have a workable, but clumsy, theory
of analysis very like the ε-approach with limits.
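One way to realise such an ordering (a sketch only; strictly speaking one orders the fraction field R(h) of rational functions in h rather than the polynomial ring itself) is to take the sign of an element from its leading behaviour at h = 0:

\[
f(h) = h^{k}\,\frac{p(h)}{q(h)} \ \text{ with } \ p(0), q(0) \neq 0 : \qquad f > 0 \iff \frac{p(0)}{q(0)} > 0 .
\]

Then h itself is positive, and 1/n − h > 0 for every positive integer n (its value at h = 0 is 1/n), so 0 < h < 1/n as required.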

Infinitesimals in the structuralist view. The most useful and commonly
discussed nonstandard number systems in practice are sufficiently saturated
models (it is usual to take them ℵ1-saturated) of an appropriate first-order
theory—the theory of the reals with additional functions and relations, for ex-
ample. The systems constructed from a non-principal ultrafilter over ω are of
this type, for instance. From the structuralist point of view one would like to
work with these properties as axioms, rather than concern oneself about the
properties of the ultrafilter one used to construct the system, if it was obtained
that way. Indeed, a construction via the completeness theorem seems preferable
in this sense since it is ‘purer’ and from it one cannot easily see the details of
how one obtained the system, only the properties of the system so obtained.
The canonicity theorems for nonstandard number systems are, on the other
hand, more problematic. The immediate reason for concern arises from the
fact that all the usual constructions of nonstandard number systems with in-
finitesimals use the Axiom of Choice (or a slightly weaker form of it such as the
Boolean Prime Ideal Theorem) in some essential way. These axioms have been
around for some time, but are not to everyone’s taste, so are worth looking at
in this context. Looking ahead, issues to do with canonicity also impact on the
application of such numbers in physical theories, particularly in measurement,
and our question on physical existence.
We concern ourselves here with nonstandard number systems that are ele-
mentary extensions (in the sense of first-order logic) of structures of the form

R = (R, 0, 1, +, ·, <, Z, . . . , f, . . .)_{f ∈ F}

for some suitable set of functions F . Following standard terminology from math-
ematical logic we shall also call such systems models, the theory of the model
being understood to be the elementary diagram of the structure above, i.e. the
set of all first order statements that can be written down using additional con-
stants naming real numbers and true in the above structure. Other nonstandard
structures considered in NSA usually contain all this as a substructure or as an
interpreted structure, so the non-canonicity phenomena we will be talking about
for this structure apply to these others too. I have included the integers Z as
a unary predicate in order to code infinite sets, in the way that is common in
NSA. The set of natural numbers N can be defined from that of the integers if
that is one’s main interest, and using this one can use results from the theory
of models of arithmetic to help classify models.
The issue for canonicity is whether there is some identifiable model of this
form that can be described simply by means of mathematical axioms, other than
the so-called ‘standard’ one (i.e. the one above) which contains no infinitesimals.
The answer in general is no, and there are a number of obstructions.
The first is well-known, but is not a particularly serious obstruction. By
a pair of theorems of first-order logic known as the Upward and Downward
Löwenheim–Skolem Theorems, models of the appropriate theory can be found
of every suitably large cardinality. (Where ‘suitably large’ means in this case at
least as big as R itself and as the set of functions F used.) That is not so much
of a problem because we can specify the cardinality we are interested in in a
natural way, as the first cardinal bigger than this minimum, perhaps.
The other obstructions to canonicity are specific to the particular theory we
are looking at, the fact that it codes sequences, computations in Z, and other
rather complex mathematics. It is necessary for NSA to look at structures that
code complex mathematics to enable us to solve difference equations in the
nonstandard world as indicated above, or more generally to use NSA to reduce
continuous problems concerning sets of reals or functions of real variables to
discrete problems with solutions by combinatorial means. In the terminology
of the classification of first order theories given by model theory, the theory we
are looking at is highly unstable with too many models at each cardinality to
expect a classification of these models.
One possible candidate for a ‘canonical choice of model’ is a ‘minimal’ or
‘smallest’ one, but it turns out from this and some model theory that there is
no minimal nonstandard model. (By a slight irony, minimal models do exist for
a third method of construction outlined in the appendix—the one using Gödel’s
Incompleteness Theorem—but these necessarily give structures satisfying false
sentences, such as ¬ Con(PA). In any case there is an issue as to which false
sentences we are to choose.)
Perhaps instead one should look for large models, models which contain every
possible feature that one might want, models that contain elements satisfying
every possible property. This is a common idea in model theory, and models of
this type are said to be saturated. Saturated models are very powerful, not only
for model theory, but for nonstandard analysis, where saturation principles are
often exactly what one needs to transfer a problem or definition from the real
world to the nonstandard world. There are many notions of ‘saturation’ in model
theory, but for highly unstable theories such as ours, all the notions of saturation
have some difficulty too. Some weaker notions of saturation (such as recursive
saturation, arithmetical saturation, resplendency) are available, and these allow
all theories to have such models at all cardinalities, but unfortunately these
notions of saturation do not characterise the models up to isomorphism, i.e. there
are no canonical weakly saturated models. There is a notion of full saturation4
which does characterise models up to isomorphism, but unless the underlying
set theoretical framework of ZFC that we are using is changed, saturated models
of our theory need not exist at all.
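A minimal illustration of what saturation buys us (a standard example in the NSA literature, phrased here in my own words): the collection of conditions

    p(x) = { 0 < x } ∪ { x + x + · · · + x (n terms) < 1 : n = 1, 2, 3, . . . }

is finitely satisfiable in R itself, so it is a type, and an element realising it is precisely a positive infinitesimal. Saturation principles, even quite weak ones, guarantee that types like this, and the far more complicated ones arising in applications, are realised.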
The best general results showing existence of saturated models are of the
following type [2].
Theorem. Assume ZFC together with either the generalised continuum hy-
pothesis (GCH) or the existence of a strongly inaccessible cardinal. Then there
is a saturated elementary extension of R.
One might say that the set theoretic assumptions required to build saturated
models are irrelevant in the structuralist view, but if one takes this standpoint
one still has to argue for the existence of saturated structures. In any case one
of the strengths of mathematical work is that the four views I have outlined
are in some senses compatible, and we should not throw away the unifying view
lightly.
If we are adding axioms to set theory, I would argue that adopting GCH is
not something that one would want to do unless strong evidence is forthcoming
on the continuum problem (CH), but an axiom for the existence of arbitrarily
large strongly inaccessible cardinals is a much more reasonable addition to our
set theoretic axioms for mathematics. Indeed much of modern set theory is con-
cerned with adding ‘large cardinal axioms’ that cannot be proved from the usual
axioms and seeing what the consequences of them are, especially for ordinary
sets such as the set of reals. From the point of view of more advanced NSA,
these large cardinal axioms are useful in another way too, since they allow us to
have access to a number of models of set theory (something that is not available
without large cardinal axioms) and for some applications it is helpful to start
NSA by taking an elementary extension of a suitable model of set theory, rather
than an elementary extension of our structure R.
Any two saturated models of our theory of the same cardinality will be
isomorphic, and it is difficult to see how non-saturated models might be suffi-
ciently canonical to be of interest, so the conclusion is that for canonicity we
4 For experts, the technical definition I refer to is: a model M is λ-saturated if it realises all types over sets of parameters of cardinality strictly less than λ, and it is saturated if it is λ-saturated where λ is the cardinality of M.

do require additional axioms in our mathematics to allow us the chance to work
with saturated models. If we feel that there are good reasons for including nonstandard
systems in the zoo of useful mathematical structures that we wish to accept
and use, then under the unifying view we require additional principles for the
existence of mathematical objects; suitable principles are available as additional
axioms in the ZFC style.
This deals with the First Canonicity Theorem. The Second Canonicity The-
orem adds a further complication. Given any two saturated models of the same
infinite cardinality, by standard methods in model theory there will always be
a huge number of isomorphisms between the two. A consequence of the Second
Canonicity Theorem is that a structure satisfying it has precisely one automorphism
(the identity), so the Second Canonicity Theorem fails for saturated systems of
nonstandard numbers. This does not cause immediate problems for the Struc-
turalist view, but it does have important consequences for the utilitarian view,
to be discussed later.

Infinitesimals in the utilitarian view. In the utilitarian view, we would
have evidence to support the existence of nonstandard number systems if we
can show that such systems are useful and important enough. This might mean
in relation to scientific theory, or to mathematics itself. We start here by
looking at applications of infinitesimals to mathematics, and look at possible
applications to other areas later.
It is quite easy to say that infinitesimals as used by Robinson are simply a
technical device to simplify and code up the idea of ‘limit’ and this is in some
sense correct. For example, there is no doubt that the nonstandard definition
of derivative is simpler than the definition using limits, but arguably it does
no more than use the same idea underlying that of ‘limit’ in a different way.
Against this criticism we might offer the argument that Newton and Leibniz
may not have come up with the differential calculus but for their thinking in
terms of infinitesimal quantities. Of course this is difficult to judge so many years
later. It is certainly true that many mathematicians today find it easier to think
in terms of infinitesimal quantities, even if they later re-work their arguments
in terms of limits. But also, many others prefer to think in terms of limits
instead. Perhaps it has more to do with how one is (mathematically) nurtured,
and at present current teaching methods at universities certainly emphasise
limits rather than any alternative, and indeed infinitesimals rarely enter the
undergraduate curriculum at all.
Another key thing to look at is whether infinitesimals unify different areas
of existing mathematics and simplify the presentation of them or the state-
ment of their results. In fact there is one area in which infinitesimals do this
beautifully: that of the parallel topics of differential equations and difference
equations. A differential equation is an equation for an unknown function y(x)
of a real variable involving the derivative y′(x) of this function, or higher-order
derivatives of this, y″(x), y‴(x), etc. A difference equation is an equation for
an unknown discrete function y(n) of a natural number variable involving the
difference function ∆y(n) = y(n+1) − y(n) and possibly higher order differences
∆²y(n) = ∆y(n+1) − ∆y(n), ∆³y(n), etc. That these two types of equation can
be classified and solved by similar techniques is rather well-known, and a typical
method for solving a differential equation numerically (i.e. approximately) on
a computer involves choosing a small step size h, approximating a continuous
function y(x) by the discrete function ŷ(n) = y(nh) and each derivative y′(x)
by (ŷ(n + 1) − ŷ(n))/h = ∆ŷ(n)/h and so on. It is usually the case that the
resulting equation can be rearranged to take the form ŷ(n) = F (ŷ(n − 1)) or
possibly ŷ(n) = F (ŷ(n − 1), ŷ(n − 2), . . . , ŷ(n − k)) so that on choosing appro-
priate starting values ŷ(0) (or ŷ(0), ŷ(1), . . . , ŷ(k − 1)) one can generate all other
values ŷ(n) on the computer. A particularly simple example is known as the
Euler method in which the differential equation

y′(x) = F(x, y(x))

is replaced by the difference equation

ŷ(n) = hF(hn, ŷ(n − 1)) + ŷ(n − 1).

Difference equations like this have the advantage that it is obvious that
some solution exists, though finding a closed expression for a solution is often
more difficult, whereas the existence of solutions of differential equations is often
more delicate. Many differential equations can be solved exactly by nonstandard
methods, by applying the same numerical scheme with an infinitesimal step size
h and solving the difference equation in the nonstandard world. Once again,
the existence of the solution is usually obvious, and this gives a rapid existence
proof for solutions of some kinds of differential equations.
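To make the parallel concrete, here is a minimal sketch of the Euler scheme just described, written in Python purely for illustration: the function names and the example equation are my own choices, and the evaluation point follows the usual forward Euler convention, which may differ trivially from the display above. The classical reading takes h to be a small real; the nonstandard reading takes h infinitesimal and recovers the exact solution by taking standard parts.

    def euler(F, y0, h, steps):
        """Approximate y'(x) = F(x, y) by the difference equation
        yhat(n) = yhat(n-1) + h * F(h*(n-1), yhat(n-1))."""
        ys = [y0]
        for n in range(1, steps + 1):
            x_prev = h * (n - 1)              # the grid point x_{n-1} = h(n-1)
            ys.append(ys[-1] + h * F(x_prev, ys[-1]))
        return ys

    # Example: y' = y with y(0) = 1, whose exact solution is exp(x).
    approx = euler(lambda x, y: y, 1.0, 0.001, 1000)
    print(approx[-1])   # roughly e = 2.718...; the error shrinks as h does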
So infinitesimals and nonstandard methods unify numerical methods with
the classical analysis of real valued functions. There are other examples too.
Nonstandard methods allow one to give adequate nonstandard approximations
of useful but classically-speaking fictitious functions such as the delta function.
Nonstandard methods allow problems in the calculus of variations to be solved by tra-
ditional means, such as using Lagrange multipliers to maximise or minimise a
function of a nonstandard (infinite) number of variables subject to constraints. Thus non-
standard methods at least highlight the connection between discrete problems
such as difference equations and analytic problems such as differential equations
via versions of numerical methods normally used to find approximate solutions.
This is not quite the same thing as unifying these problems—seeing them as all
the same kind of problem. (That would appear to be a useful project for some
other time.)
Do infinitesimals and nonstandard methods permit new kinds of mathemat-
ics to be done that could not easily have been achieved without them? Here
I think the jury is still out. Certainly it was hoped (by Robinson and peo-
ple following him) that some significant problems in analysis could be solved
by nonstandard means, and in one case, Robinson himself solved an important
outstanding problem in analysis by nonstandard means before Halmos identi-
fied the key ideas and presented an alternative classical argument. In fact it
seems that most work in nonstandard analysis with impact on problems that
can be stated purely in the classical language of analysis has been confined to
finding elegant nonstandard solutions to existing problems for which classical meth-
ods were already known. It seems that for such problems, nonstandard methods and
classical methods using limits are too close: it is a little too easy for experts to
translate between one method and another. Where nonstandard methods are
most useful, in my opinion, is in allowing the construction of interesting
new analytical structures based on traditional discrete structures.

One possible stumbling block for the introduction of nonstandard analysis is
that the procedure of arguing for classical results, about the reals R for example,
using nonstandard means involves switching between two worlds: the ordinary
real world and the hyper-real world. This is the moment where ‘infinitesimal
quantities are neglected’ which was most problematic for Berkeley and others,
but is given a precise justification by Robinson. In some accounts, this is done
almost algorithmically—adding or removing stars from symbols, taking ‘stan-
dard parts’ and ignoring infinitesimal quantities according to tightly defined
rules. Alternatively one does it using the standard tool-kit of first order logic,
which is elegant and comprehensive, but perhaps too much for beginners, espe-
cially students, to learn. This too is an issue with the subject, and unfortunately
a misunderstanding of some of these rules can lead to errors. Obviously there
is still work to do in this direction too.
The tentative conclusion to this part of the discussion is that infinitesimals
and nonstandard numbers in general do seem to form a useful system or systems
by which to do mathematics, and (with some careful warnings about potential
error that might occur if the methods are incorrectly applied) we might encour-
age more mathematicians to believe in their existence and usefulness.
Now we address similar questions about physical existence of nonstandard
numbers.
The most obvious remark is that infinitesimals seem to be useless to measure
traditional quantities such as time and space, as infinitesimal amounts of time
and space would be too small to be measured in any conventional sense. Nor is
there (to my knowledge) any physical theory in which infinitesimals are potential
measurements for physical quantities. This remark is a bit glib, as it presupposes
that the traditional real-valued measurements of space and time are ‘correct’.
So let us speculate for a moment what such a theory with nonstandard values
for measurements might look like.
Suppose some physical quantity—we will call it the mass of a particle, but
nothing we say will be specific to this—is measured in a nonstandard system
and the measurement may take infinitesimal values. Then the physical theory
will only ‘see’ the orbit of this value obtained by the measurement, i.e. the
set of automorphic images of it under the automorphism group of the number
system. Rather than seeing a hyperfine continuum of possible particles with
infinitesimal masses we would see a classification of ‘types’ of masses based on
the orbits of the values. However, the measurement of this one value will affect
the measurement of the mass of a second particle, since the orbit of a pair of
points (u, v) is not necessarily the Cartesian product of the orbits of u and v
individually.
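A small illustration of the point (my own example): if h is a positive infinitesimal and σ is any automorphism of the number system, then σ fixes 2 = 1 + 1 and respects multiplication, so

    σ(h, 2h) = (σ(h), 2σ(h)).

The orbit of the pair (h, 2h) therefore lies entirely on the ‘line’ v = 2u, whereas the product of the two individual orbits also contains pairs of the form (σ(h), 2τ(h)) for distinct automorphisms σ and τ. Knowing the orbit of the first measurement thus constrains, but does not determine, the second.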
All of this looks suggestive of what actually happens in physics, i.e. that
certain symmetry groups underlie physical structure and the result of one mea-
surement may affect another, but I am not enough of a physicist to see this
idea through. It is not completely without precedent. Complex numbers per-
vade mathematical physics, but as we have discussed, they cannot be seen in
isolation, at least not to the detail required to distinguish i from −i.
One reason for proposing the quark model of hadrons was that hadrons show
structure and a non-zero size. Some particles, notably quarks, electrons and
electron-like objects, do not appear to have a size, i.e. they are point masses
according to the best measurements possible today. But if we speculatively
imagine physics at the scale of infinitesimals, they might then have structure,
such as being made of smaller elementary particles at the infinitesimal scale.
These particles and any (infinitesimal) distance between them would not show
up as distances, but they would contribute to properties or quantum numbers
describing the particle; they might be the key to understanding how a particle
such as an electron might have a ‘hidden variable’ or to understanding the seem-
ingly random processes in quantum mechanics. They might interact (possibly
in infinitesimal time) with other particles, and the large number of different
kinds of interactions might usefully be unified. From the ‘outside’, i.e. at non-
infinitesimal scales, we would not actually see these infinitesimal distances, but
instead we would see the properties of infinitesimals that are preserved by the
automorphisms of the infinitesimal number system.
Of course most of the ideas in this section are complete speculation. The only
message that I want to draw out from this discussion is not the detailed specu-
lations as such but rather the fact that infinitesimal quantities would not look
infinitesimal on the macroscopic scale: they would manifest themselves as
‘quantum numbers’ or properties of the objects concerned, identified through
classifications and patterns that relate closely to the symmetries and orbits of
the situation at the infinitesimal level, and in particular to how the automorphism
group of the nonstandard universe acts on such numbers and on the physical
set-up at that level.

4 The case for ‘pure’ structuralism


The main issues in the philosophy of mathematics that I wish to address con-
cerning the existence and nature of mathematical objects, theories of truth and
deduction about them, and theories of knowledge of mathematical truths are
discussed in Hart’s volume The Philosophy of Mathematics [4]. I will present
some speculations and personal views that will need defending further elsewhere,
but I will not do these points justice here, and certainly do not answer them
fully. My aim is simply to show how discussion of other kinds of objects can
stimulate this discussion.
Firstly, there is a major difficulty in combining theories of truth and knowledge
of mathematics and mathematical objects with theories of ordinary everyday
objects. (See Benacerraf [1], reprinted as Hart [4, Chapter I].) Mathematical
objects, if they are to have Tarskian semantics, must have something like platonic
existence, but it is difficult to see how this allows us to interact with them to
obtain knowledge of them.
Secondly, if mathematical objects are to be viewed in a structuralist way (as
I think they must, for it is impossible to conceive any other nature for them, and
this is in any case closest to the working point of view of most pure mathemati-
cians) we have the issue that ‘structuralism’ is subject to circularity: structures
explain objects, but structures, being objects themselves, must be explained first. (See Parsons [7],
reprinted as Hart [4, Chapter XIII], for a more detailed account.)
It seems to me that what I have called the utilitarian view is the only rea-
sonable way to justify new axioms or new mathematics. (It corresponds to a
modern and somewhat more pluralistic version of Hilbert’s programme with
‘levels’ of realness and different modes of application, and it is these applica-
tions and their success that give the axioms credibility.) Structuralism is the
only reasonable way to manipulate these different kinds of mathematics. One
can go further and postulate a unifying theory that brings various strands to-
gether. This may be a matter of taste, or a matter for some other overall view
of mathematics (an overarching constructive or intuitionistic one perhaps) but
I find the evidence given above that the bulk of mathematics is utilitarian and
structuralist compelling.
Canonicity results are important for both structuralism and the utilitarian
approaches, and, as I have argued, the issues of existence associated with canonicity
are very much ‘true to life’ too, possibly even having the ability to explain some
phenomena from quantum mechanics that from a naïve point of view seem
unnatural.
The other major notion that arises from my discussion above of how mathe-
matics is typically done is that of quasi-semantic reasoning, and reasoning about
‘imagined’ structures. This doesn’t quite fit the usual semantics versus syntax
canon that we have come to expect from foundational results in mathematics
about mathematics, but is natural, commonplace, and appears reliable enough,
especially with the additional checks that mathematicians employ.5
To speculate first on the first issue, the similarity or otherwise of truth and
knowledge in the mathematical and everyday realms, it seems to me that the
theoretical dichotomy between pure Tarskian semantics and formal first-order
theories and their syntax is stretching things rather, and that in everyday life, as well
as in mathematics, some sort of quasi-semantic argument takes place rather
more often than Hart’s introduction (op. cit.) would have us think. To take his
‘trite’ example, ‘All bachelors are unmarried’: we see that
this is true in a quasi-semantic way, not by some argument in a formal system,
by observing the semantic meaning of the definition of ‘bachelor’ and ‘married’
and making a connection in an (imagined) structure of people, some of whom
are bachelors, and some of whom are married. Hart’s more worldly example,
‘All bachelors are sexually frustrated,’ is determined false not by engaging in
an opinion poll of bachelors on the street to see if we can find one that is
not sexually frustrated or else exhaust the supply, but rather by recalling from
past experience some bachelor, that we might have been envious of perhaps,
that was not sexually frustrated. We don’t actually know if this individual
has since got married (if so he would be no use as an example for us) but we
do have a semantic conception of the world and its people as being large, with
all reasonable possibilities represented, as not changing particularly fast, and
of our own experience as being rather more limited. Since it was not difficult to find
a counterexample 20 years ago, things are unlikely to be different now. To
answer a critic who says this argument is not proof enough, we reserve the
right to carry out an opinion poll, or possibly scan the men’s magazines to see
if such a poll has already been carried out. Other examples work similarly.
Benacerraf’s example, ‘There are at least three perfect numbers greater than
17’ can be determined by an opinion poll—or rather a computer search—but
other quasi-semantic arguments are more satisfactory and more revealing, and
the reason why mathematicians prefer these other arguments is that they give
more information, not because a computer search is out of the question.
There seems to be a spectrum of modes of informal argument for propositions
5 Woven in with all of this are psychological effects, and we must ask to what extent mathematicians work in the way they do because it is convenient and productive for them, and to what extent they actually need to because of the nature of mathematics itself. This
question clearly requires further study.

of all types, and I suggest that these arguments are essentially semantic in
nature. In mathematics there are arguments about propositions concerning
numbers that could possibly be found by calculation or computer search, and
these are the ‘real’ propositions of Hilbert, but in mathematics there are also
indirect means for argument, and many levels of indirectness, corresponding to
Hilbert’s ‘ideal mathematics’. But so too are there indirect means for arguing in
real life. The fact that mathematical argument can be formalised as a syntactic
system is an interesting and useful fact (useful for reliability, communication and
verifying arguments) but it is no accident that the so-called ‘natural deduction’
rules are based on quasi-semantic steps.
Nevertheless, if we are going to argue that mathematical arguments and
knowledge, like other kinds of knowledge, are essentially informal but semantic,
we will have the problem of providing some details of these semantics and show-
ing that these arguments determine truths—in particular that mathematical
truths are truths like any others.
Here the usual approach is to try to identify the objects to which Tarskian
semantics apply, and this is fraught with difficulties. The same difficulties ap-
ply in everyday arguments: for example, which is the object corresponding to
the ideal present-day sexually satisfied bachelor that we know exists by rather
convincing means based on personal experience of the world? This too I will
have to leave aside for further detailed discussion another time. But in some
sense we all use these arguments and they do work, even if they correspond
to neither the usual Tarski semantics nor any precisely delineated syntactical
formal system.
Objects, I am arguing, are of necessity abstract objects, described by what
they do, in both mathematics (where this requirement is quite clear) and ordi-
nary experience (where the issues are muddied by the apparent
availability of a pollster’s approach to truth). In other words they are objects
presented to us by some kind of structuralist view of abstract objects.
One take on structuralism is that objects are presented as being part of a
structure (perhaps a structure for a first order language) and are identified in
some way by what they do, i.e. what properties they have in that structure.
But, according to Parsons, this has two major difficulties. The first is that
many objects have the same properties and therefore must be identified in some
way. But it is difficult to see how to do this: are we somehow taking a typical or
particular or canonical example of each? If so it seems difficult to see how one
could be chosen over the others. Or are we taking the equivalence class of all
such objects? This has issues with the underlying theory of sets, of course, but I
feel uncomfortable with this as the equivalence class of an object is not the same
sort of thing as the object itself—the equivalence class construct has added extra
unwanted structure onto the object. The second difficulty is that the first-order
structure which gives the abstract objects their definition is apparently itself
an abstract object so needs to be defined in the structuralistic way in terms of
some other structure, and this creates circularity.
It seems to me that these problems may be resolvable, and indeed I have
argued that they must be resolvable. The mistake (and I believe it is a mistake)
is focusing on ‘objects’ too strongly. Structuralism is a way of looking at things,
but it isn’t itself defined in terms of objects. Structuralism is a pair of spectacles,
or a lens, or a filter, through which we look at things. The spectacles or filter
removes properties that we are not interested in and leaves us looking at things
with certain limited properties relating to some specific operations. The abstract
objects that this gives us are the ‘things’ identified by their properties. Thus
an object exists if there are ‘things’ that correspond to it but the object is not
one thing nor the set of all of them, but an abstraction of all of them by their
property. Mathematical examples are easy to find, but the everyday example
I used earlier is a good one to think about. In the collection of things in front
of me there are (or so the physical theory says) air molecules, but the abstract
object ‘air molecule’ is not a particular air molecule or the set of them all but
an abstraction of those properties perceived through the filter.
Having chosen the structuralist view, in discussions about circularity where
there is a choice one must put the structuralist view first. This means that the
filter through which we see things is not an abstract object at all. It is a sort of
description of how we are for the moment looking at things. Such descriptions
are typically very simple. It might just be that we are looking at the things
in front of us as a collection of molecules, and not as tables, chairs, etc. But
these filters or descriptions will be difficult to express in words. The mistake
made by many with respect to structuralism is, I believe, that they think of the
filters or spectacles as arising from structures which are objects. But they are
not objects, nor do they arise from objects: they are something else less easy to
pin down, akin to ‘pure descriptions’. When we choose to be more precise about
them what we are actually doing is modelling the filter with a theory of objects,
just as we choose to model space with a mathematical theory in which distances
are given by real numbers, or we choose to model the flight of a projectile as
the movement of a point-mass particle in a uniform gravitational field.
Structuralism taken this way (where the structuralist view is taken as pri-
mary and is modelled by abstract objects rather than defined by abstract ob-
jects) I call ‘pure structuralism’. Clearly the scope of this paper has not allowed
for any detailed look at it, and there is much work still to do. It seems to me
that the filters or spectacles are well modelled by the idea of a forgetful functor
in category theory, and that category theory should also provide a kind of semantics
that is closer in spirit to pure structuralism than the Tarskian one.

5 Questions for further research


Examine mathematics from the point of view of mathematics done on a com-
puter, with a computer algebra system for example.
To what extent does the computer science concept of abstract object cor-
respond to the one suggested above? Do the considerations above suggest any
improvements to the object-orientated paradigm for computer programming?
Does the fact that mathematics has several different compatible viewpoints
(and different ways of making them compatible) add to the weight of its results
or make cloudy water even more murky?
What precisely is informal ‘quasi-semantic’ argument and how does it con-
trast with more formal modes of argument? What is it good for and what is it
not so good at?
There is also the interesting issue of the role of first-order logic and what I
have called ‘quasi-semantic’ arguments. As I have said, it is easy and natural to
argue in ZFC ‘quasi-semantically’ with reference to an imagined universe or part
of that universe. Then the syntactic rules for first-order logic are justified quasi-
semantically, by a reflection on this imagined universe and an argument similar
to the soundness theorem. In principle, quasi-semantic arguments of this form
are more powerful, as other rules could in principle be imagined and used that go
beyond the usual first-order logical rules. In practice this rarely occurs, and one
wonders why. Is this some psychological phenomenon restricting the mathemati-
cian’s imagination, perhaps related to the ‘pessimism’ mentioned earlier? Or is
there some deeper philosophical reason that puts these imagined universes and
quasi-semantic reasoning about them on a firmer foundation, something to do
with an informal Completeness Theorem for such quasi-semantical deductions?
In any case, it always seems remarkable to me that the true Completeness Theo-
rem makes excellent predictions about provability in first-order theories despite
its non-constructive nature and the fact that AC is required to prove it.

References
[1] Paul Benacerraf. Mathematical truth. J. Philos., 70(19):661–679, 1973.
[2] C. C. Chang and H. J. Keisler. Model theory, volume 73 of Studies in
Logic and the Foundations of Mathematics. North-Holland Publishing Co.,
Amsterdam, third edition, 1990.
[3] Timothy Gowers. Mathematics, a very short introduction. Oxford Univer-
sity Press, Oxford, 2002.
[4] W. D. Hart, editor. The Philosophy of Mathematics. Oxford University
Press, 1996.
[5] Richard Kaye. The mathematics of logic. Cambridge University Press,
Cambridge, 2007. A guide to completeness theorems and their applications.
[6] Roman Kossak. What are infinitesimals and why they cannot be seen.
Amer. Math. Monthly, 103(10):846–853, 1996.
[7] Charles Parsons. The structuralist view of mathematical objects. Synthese,
84(3):303–346, 1990.
[8] W. V. Quine. Two dogmas of empiricism. In The philosophy of language,
pages 39–52. Oxford Univ. Press, New York, 1996.
[9] Abraham Robinson. Non-standard analysis. North-Holland Publishing Co.,
Amsterdam, 1966.
[10] Petr Vopěnka. Mathematics in the alternative set theory. BSB B. G. Teub-
ner Verlagsgesellschaft, Leipzig, 1979. Teubner-Texte zur Mathematik.
[Teubner Texts in Mathematics], With German, French and Russian sum-
maries.

Appendix: commentary on the different views


[consider deleting some of this or merging with the text above]
The discussion throughout this paper is about existence of abstract objects
such as numbers and number systems in mathematics, and of course the main
area of interest is in the foundations of mathematics. Because of the nature of
the main questions addressed in this paper I have tended to take the working
mathematician’s point of view as understood, and have rather skated over the
fundamentals behind this. It is time to make amends and address the foundation
for this view, and put a bit more flesh on what a mathematician might mean
by ‘existence’. It may seem strange to discuss this at the very end of the paper,
but ideas here require some understanding of a few key examples, in particular
that of real numbers, as discussed earlier.
The notion of real number is a good place to start, since it is straightforward
enough for most working mathematicians to appreciate, and yet complicated
enough for a number of differing views to have been put forward, especially
in the first half of the twentieth century. But to most people, there is a clear
concept of real number based initially perhaps on the idea of a decimal expan-
sion with several examples, including terminating decimal expansions, such as
that for 1/4, repeating ones, such as for 1/7, non-repeating ones, such as for
√2, and more complicated ones, such as for π. After some point, with these
examples in mind one can abstract the idea of a general decimal expansion, and
develop the concept of real number from that. (Of course some other intuition,
such as Dedekind cuts can be used in place of decimal expansions.) One feels
justified in this endeavour at the point when one writes down a set of very
reasonable-looking axioms for the resulting system of numbers, proves the
existence theorem for such a system based on decimal expansions, and proves
both canonicity theorems.
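For concreteness (a small illustration of the kinds of expansion just mentioned):

    1/4 = 0.25,   1/7 = 0.142857 142857 . . . ,   √2 = 1.41421356 . . . ,   π = 3.14159265 . . . ,

with the usual identification of expansions ending in recurring 9s, so that, for instance, 0.24999 . . . and 0.25000 . . . name the same real number.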
It seems to me that two very positive mental acts are being described in the
last paragraph, and both give strong evidence towards the existence of a system
of real numbers populated by familiar numbers such as 1/4, 1/7, √2 and π. The
first is the moment of abstraction when after several calculations with particular
examples one realises that every sequence of decimal digits corresponds to a real
number, and (ignoring the technical problem with recurring sequences of 9s) that
each distinct decimal corresponds to a distinct real number and that conversely
the mental view of a number on a number line shows that each can be measured
by a sequence of decimal digits. The second is the moment of axiomatisation
and the theorems of existence and canonicity that show that there is essentially
only one system of real numbers. The first is a private moment of insight: after
playing with calculations and symbols on a page I suddenly have an inkling
of this particular kind of number in all its generality. The second is a way
of sharing this insight: now that I can describe my real numbers and prove
that my real numbers are the same as yours I can discuss them with you to
examine their structure in much more detail. It seems that for most people
this is ample evidence that the system of real numbers exists in as concrete a
way as is required. It is just as robust, or perhaps more robust, than some of
the notions of the physical objects around us and their qualities, and we have
excellent reasons for believing that these mathematical objects are viewed in
the same way by all other mathematicians—arguably much better reasons than
we might have for believing that some particular patch of grass is actually seen
as the same colour by all individuals, irrespective of the label ‘green’ that they
choose to put on that colour.
Against this evidence, some counter-arguments have been put forward. Most
centre round difficulties in the idea of an arbitrary sequence of digits (or an arbi-
trary bounded set of rational numbers, or whatever is relevant on one’s favourite
conception of the reals). When one examines it, it seems that this idea of ‘ar-
bitrary set’ or ‘sequence’ is harder to pin down than one might expect, and it
is essential for full understanding of the reals, especially for canonicity. One
alternative that has been proposed is the formalists’, which says that general
sequences of digits exist in an ideal sense and we can accept them in this limited
way because this belief in them and their properties does not impact in any
negative way on our conception of concrete real numbers such as √2 and π; indeed
more, that this world of ideal sequences might provide new information about
our familiar numbers. Another alternative is the intuitionists’, which says that
only numbers that can actually be constructed (such as via a computer program
that prints out their digits) can be accepted—there are no others; and all de-
cisions (such as whether one number is bigger than another) have to be made
in a similar constructive manner. Cases can be made for both these points of
view, especially in certain areas of research in which they are relevant, such as
foundations of mathematics, theoretical computer science, philosophy of scien-
tific method, etc. But most mathematicians reject them for practical reasons.
It is as if mathematics has moved on a few steps beyond the formalists or in-
tuitionists, in abstracting a number of important concepts as objects satisfying
axioms; perhaps this process of abstraction is genuinely necessary for human or
social or scientific reasons: if a concept is to be useful and to be used as one of the
building blocks of the next piece of theory then we have to in some sense be
able to mentally picture these objects and believe in their existence.
For another example that clearly separates the formalist, intuitionist and
classical mathematician, consider the natural numbers, the ordinary counting
numbers. For an intuitionist, the natural numbers are given. They are the start-
ing point for the rest of mathematics. For a formalist they are strokes on a page,
together with rules that combine sequences of strokes, or compare two different
such sequences. To a classical mathematician, the intuitionistic approach ex-
plains little, and in particular there is no place for Dedekind’s elegant axiomatic
description of the natural numbers and the canonicity of this system, because
it rests on more complex notions such as arbitrary sets of numbers, something
the intuitionist rejects. Similarly, the formalist approach is cumbersome, and
in it it seems that numbers always have to be manipulated using these formal
systems. There is no place for the intuition of number that suddenly arises in
a child when he or she sees some sheep and exclaims for the first time ‘Three
sheep!’, having identified and abstracted the concept of ‘three’, so that he or she
no longer has to perform any tedious matching against a collection of apples to
determine that there is exactly one apple for each sheep
to eat.
Abstract objects in general presumably arise in the same sort of way, even
outside mathematics. Perhaps by playing a mental game, maybe with symbols
following rules, or using a formal system, or by argumentation using a known
style of argument, or whatever, one sees through intuition a pattern. The de-
scription of the pattern and analysis of how it arises is then the first part of
abstracting the pattern into an object or system of objects that have personal
meaning to us. The next stage is to describe that pattern more fully and show
that in some sense it is canonical, enough to communicate it to other people
who have a concept of the same sort of pattern; a shared concept then arises for
discussion and research, and in this discussion and research it is most convenient
to talk about and believe that these objects have some sort of prior existence
as abstract objects.
It’s important to remember that failure of canonicity does not necessarily
indicate the failure of this programme. An object—be it a number system or
whatever—may not be canonical, but that may be more of the fault of the
purported definition. Even if the object is canonical, the failure of the Second
Canonicity Theorem may be an issue. But on the other hand, this may be
something one can live with, or may even be necessary for theories based on the
concept.

Appendix: Infinitesimals
This section provides a slightly technical background on nonstandard number
systems and infinitesimals.
Just as with the complex numbers, where the addition of i to the reals
necessarily requires us to add other numbers of the form x + iy to preserve as
many as possible of the usual properties of arithmetic that we expect, adding infinitesimals
to the reals requires us to add other numbers too. Thus if h is a positive
infinitesimal, −h should be a negative infinitesimal and 1/h a positive infinite
number; also π +h is a new number that is infinitesimally close to π, but slightly
larger than it, and so on. We get a picture of a system of hyper-real numbers
which is like that of an extended real number line where each real number is
‘fattened’ to a set of numbers all infinitesimally close to that real number. This
set of numbers infinitesimally close to a real number x is usually called the monad
of x and written µ(x) or st−1 (x). As well as containing infinitesimals, our hyper-
real number line also contains infinite numbers larger than previously existing
reals, and also the negatives of these infinite numbers. There are finite hyper-
reals (i.e. ones that are in magnitude bounded by ordinary real numbers) and
infinite hyper-reals (other ones which are greater than all normal real numbers
in magnitude). One non-obvious fact that follows from the completeness of the
real number system is that every finite hyper-real y lies in the monad µ(x) of
some standard real number x, which is uniquely determined by y. This standard
real is called the standard part of y and written as st(y).
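As a small worked illustration (routine in NSA, with the particular numbers my own choice): if h is a positive infinitesimal then

    st(π + h) = π,   st((2h + h²)/h) = st(2 + h) = 2,

while 1/h is infinite and lies in no monad µ(x), so it has no standard part. Computations of this kind are what replace the ‘neglecting of infinitesimals’ of the older calculus.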
The previous paragraph gives an account of the intuitive structure of the
hyper-reals, but rather fails to explain why such number systems exist. In fact,
prototype systems with infinitesimals can be constructed by algebraic means
similar to the construction of the complex numbers, in a way in which the usual
arithmetic laws of addition and multiplication hold, but such systems are not
particularly rich or useful. For analysis and much other mathematics we need
plenty of other functions to be defined, and for example it is not clear how
by algebraic means we might define the value of the function sin(1/x) at an
infinitesimal h. Because h is infinitesimal and sin(1/x) varies wildly between −1
and 1 near x = 0, the choice of sin(1/h) seems to be arbitrary. And there are
many other similar arbitrary choices to make. This is where tools from logic
help, and one method of constructing hyper-reals is to apply the Completeness
Theorem in logic as described above. Essentially what is happening is that one of
the axioms of ZFC, the Axiom of Choice or AC,6 usually via its equivalent form,
Zorn’s Lemma, is applied to decide all these arbitrary choices simultaneously,
6 In this paper, AC for the Axiom of Choice is not to be confused with AC for Alternating Current. Both acronyms are too firmly set in place to be changed, unfortunately.

and in a consistent way relative to all the other properties of the real numbers.
Thus the first true models of nonstandard analysis contain infinitesimals h, have
many or perhaps all real-valued functions defined, and satisfy all first order
statements that were already true in the reals.
An alternative and popular but related method of construction is to use an
ultrafilter. Essentially an ultrafilter is a set theoretic object which encodes a
collection of infinitely many choices, and which is shown to exist in ZFC by an
application of Zorn’s Lemma. The hyper-reals are now constructed by taking
a large Cartesian product R^N of the reals and using the ultrafilter to make
decisions as to which elements of this product should be regarded as equal and
how functions such as sin(1/x) should be defined on it. This type of construction
goes back to the Polish mathematician Łoś and his work in the 1950s, though
similar ideas were already being used by Skolem (in a slightly less powerful way)
to construct nonstandard models of counting numbers in the 1920s and 1930s.7
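In compressed form (a sketch using standard notation not otherwise needed in this paper): let U be a nonprincipal ultrafilter on N and declare two sequences of reals equivalent when they agree on a set of indices belonging to U,

    (a₁, a₂, a₃, . . .) ∼ (b₁, b₂, b₃, . . .)   if and only if   { n : aₙ = bₙ } ∈ U.

The hyper-reals are then the equivalence classes, with all operations and functions applied coordinatewise. For example the class of (1, 1/2, 1/3, . . .) is a positive infinitesimal h, and ∗sin(1/h) is the class of (sin 1, sin 2, sin 3, . . .); the ultrafilter settles, once and for all, questions such as whether this value is greater than 1/2.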
A third method of construction uses Gödel’s Incompleteness Theorem. Tak-
ing T to be one’s favourite consistent recursively axiomatised theory of arith-
metic, the theory of first-order Peano Arithmetic (PA) being a common choice,
by Gödel’s Second Incompleteness Theorem the theory with the axioms of T
together with a single extra statement ¬ Con(T ) saying that T is inconsistent is
itself consistent, so by the Completeness Theorem of logic has a model or system
of numbers satisfying it. In this model, the statement that there is a proof of
an inconsistency 0 = 1 from the axioms of T is true, but we knew in advance
that there is no such proof in the real world. Therefore that proof of 0 = 1 is
a nonstandard object and it turns out rather quickly that it must have infinite
length. So our model has infinite numbers. Now we can replicate the standard
construction of the integers, rational numbers and real numbers starting with
this model rather than starting from the usual set of counting numbers N. The
result is a nonstandard model resembling the reals and containing all standard
reals.
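Schematically (a compressed restatement of the construction just described):

    T consistent  ⇒  T ⊬ Con(T)  ⇒  T + ¬Con(T) consistent  ⇒  some model M ⊨ T + ¬Con(T),

and the element of M coding the alleged proof of 0 = 1 must exceed every standard numeral, since no standard number codes such a proof.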
For the purposes of this paper, constructions by using tools from first order
logic and constructions by ultrafilters are perfectly reasonable constructions of
nonstandard number systems. Nonstandard analysts tend to dismiss the third
method based on the incompleteness theorems, because it is inconvenient to
check that statements they know to be true in the reals (such as the continuity of
the sin function perhaps) are indeed expressible and provable in the theory T.
We shall also dismiss this third construction for a different but related reason.
That is, starting with a theory T that we know is consistent we build a model in
which the theory T is not consistent. In some sense the model is wrong about
the consistency of T and therefore we must reject it as not being a ‘true model
of infinitesimals’.
Newton and Leibniz’s accounts already contain some ideas of when an in-
finitesimal number can be ignored and when it must not. For example, the
ratio of two infinitesimals x/y should be calculated as it could turn out to be
zero, infinite or some finite number, but in the sum of a real number and an
infinitesimal a + h the infinitesimal can often be neglected. Berkeley reasonably
criticised this because the rules for when an infinitesimal may be neglected were
not explained. Robinson’s nonstandard analysis provides these rules, and sim-
plifies many of the definitions of analysis that according to Cauchy and others
7 References required.

need the concept of limit.
For example, the sum of a real number and an infinitesimal a+h is a number
in the monad of a. A continuous function is one that maps monads
to monads. More precisely, a function f defined on the reals is continuous at
a if ∗f(a + h) ∈ µ(f(a)) for all infinitesimals h, where ∗f is the nonstandard
version of the function f. (The function f is defined on the real numbers only,
so for technical reasons the corresponding function defined on the hyper-reals
is a different function ∗f, though it extends f in as natural a way as possible.)
In post-Cauchy mathematics, this is a theorem: one can prove quite easily the
equivalence of Cauchy’s definition of continuity and the one just given.
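For instance (a routine example, with the function chosen by me): take f(x) = x². For any real a and any infinitesimal h,

    ∗f(a + h) − f(a) = 2ah + h²,

which is infinitesimal since a is a fixed real, so ∗f(a + h) ∈ µ(f(a)) and f is continuous at every real a by this criterion.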
Differentiability and all other notions from analysis can be given a similar
treatment. Newton and Leibniz calculated the derivative of a function f at a
by computing (f (a + h) − f (a))/h for an infinitesimal h and then neglecting
infinitesimals afterwards. (For a continuous function f we have just seen that
f (a + h) − f (a) is infinitesimal, so this is the ratio of two infinitesimals which
initially must be calculated in some different way.) In nonstandard analysis, f
is differentiable at a (with derivative b) if the quantity (∗f (a + h) − ∗f (a))/h
always lies in the same monad (the monad µ(b)) irrespective of which nonzero
infinitesimal h is taken. This appears to be exactly what Newton and Leibniz
intended, and agrees with Cauchy’s definition exactly.
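Continuing the same example f(x) = x²: for any nonzero infinitesimal h,

    (∗f(a + h) − ∗f(a))/h = (2ah + h²)/h = 2a + h ∈ µ(2a),

so f is differentiable at a with derivative 2a. ‘Neglecting the infinitesimal’ in the style of Newton and Leibniz is here exactly the legitimate operation of taking the standard part, st(2a + h) = 2a.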
