Thinking mathematically notes∗
Attila Máté
Brooklyn College of the City University of New York
March 12, 2020
∗ Written for the course Mathematics 1311 (Elementary Probability) at Brooklyn College of CUNY.

Contents
1 Paradoxes in mathematics
1.1 Reading
1.2 Cardinality of the power set
1.3 Cantor’s paradox
1.4 Russell’s paradox
1.5 Epimenides paradox
1.6 Berry’s paradox
1.7 Axiomatic set theory
1.8 The axiom of replacement
1.9 Other axiom systems of set theory
1.10 Hilbert’s program and incompleteness
1.11 The continuum hypothesis
1.12 Independence of the continuum hypothesis
1.13 Computers
1.14 Reading
2 The unreasonable effectiveness of mathematics
2.1 I was wrong, too
3 Geometric cardinalities
3.1 Reading
3.2 Mapping a line segment to one of different length in a one-to-one way
3.3 Mapping a finite line segment onto the whole line in a one-to-one way
3.4 More on the stereographic projection
3.5 Mapping a square onto a line segment in a one-to-one way
3.5.1 An idea that fails
3.5.2 An idea that succeeds
4 The Pythagorean Theorem
4.1 Reading
5 How many cameras need to watch an art gallery?
5.1 Reading
5.2 How to find a spanning arc
5.3 The number of cameras needed
6 Platonic solids
6.1 Reading
6.2 The five Platonic Solids
6.3 The duals of Platonic solids
7 Planar graphs and Euler’s formula
7.1 Reading
7.2 Graphs and planar graphs
7.3 Euler’s formula for planar graphs
7.4 The proof of Euler’s formula
8 There cannot be more than five Platonic solids
8.1 Reading
8.2 Platonic solids satisfy Euler’s formula
8.3 Determining the number of edges of a Platonic solid
8.4 No more than five: a proof
9 Rigid tilings of the plane
9.1 Reading
9.2 The pinwheel triangle
9.3 A supertile
9.4 An interior triangle cannot be any other tile
10 The pigeonhole principle
10.1 Reading
10.2 Approximation of numbers by fractions
References
1 Paradoxes in mathematics
1.1 Reading
The material in this section is meant to provide background to the section Travels toward the
Stratosphere of Infinities: The Power Set and the Question of an Infinite Galaxy of Infinities
[1, §3.4, p. 174]. If you have a different edition of the book, you will still find the same
material, perhaps with minor changes. What is most likely to change is the material at the end of
the section, called Mindscapes. The material there is of real interest, but I largely skipped the
Mindscapes segment of the book, since there is really no time in the course for deep study of
them, and I would like to present a variety of material. The Mindscapes are interesting, but they
are not important for exams. They give more insight for those whose interest goes further than
what is presented in the main text.
1.2 Cardinality of the power set
For a set S, the power set P(S) of S is defined as the set of all subsets of S:
P(S) := {X : X ⊂ S}.
We will show that for any set S, P(S) has larger cardinality (in other words, it has more elements)
than S. More precisely, we have
Theorem 1.1. Given an arbitrary set S, there is no mapping f : S → P(S) that is onto P(S).
The theorem does not require that f be one-to-one.
Proof. Assume f is a mapping of S into P(S). Then
X = {x ∈ S : x ∉ f(x)}
is a subset of S, and yet there is no y ∈ S for which f(y) = X. Indeed, assume on the contrary that
y ∈ S is such that f(y) = X. Recall that two sets are equal if they have the same elements. On
the other hand, the definition of X implies that y is an element of one, but not the other, of X and
f(y), so we have X ≠ f(y). Indeed, if y ∈ f(y), then y ∉ X, and if y ∉ f(y), then y ∈ X. This
contradicts the assumption that f(y) = X.
The result and the proof apply to any set, even to finite sets, and even to the empty set itself.
One would, however, barely be interested in using this proof for finite sets (except for the interesting
point that the proof works even then), since other arguments show the much stronger result that a
finite set of n elements has 2^n subsets. Writing |S| for the cardinality of a set S, one extends
this statement by defining the cardinality 2^|S| as the cardinality |P(S)|.
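The diagonal argument in the proof of Theorem 1.1 can be checked by brute force on a small finite
set. The following is only a sketch in Python; the names power_set, diagonal_set, and
no_map_is_onto are my own choices for illustration and do not come from the text. For a set S of
3 elements it lists all 2^3 subsets, goes through every mapping f : S → P(S), and confirms that
the set X built in the proof is never a value of f.

```python
from itertools import combinations, product

def power_set(s):
    """Return the list of all subsets of s, each as a frozenset."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(s, f):
    """The set X = {x in S : x not in f(x)} used in the proof."""
    return frozenset(x for x in s if x not in f[x])

def no_map_is_onto(s):
    """Go through every mapping f : S -> P(S); confirm X is never a value of f."""
    items = sorted(s)
    subsets = power_set(s)
    for images in product(subsets, repeat=len(items)):
        f = dict(zip(items, images))          # one particular mapping f
        if diagonal_set(s, f) in f.values():  # would mean f(y) = X for some y
            return False
    return True

S = {0, 1, 2}
print(len(power_set(S)))       # 8, that is, 2**3
print(no_map_is_onto(S))       # True: none of the 8**3 = 512 mappings reaches X
print(no_map_is_onto(set()))   # True: the argument works even for the empty set
```

Of course, such a check proves nothing about infinite sets; its only point is to show the
construction of X at work in a case where everything can be listed.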
1.3 Cantor’s paradox

When Georg Cantor found this result, there was no restriction on what he or his followers considered
a set (though there were people, like Leopold Kronecker, who did not find Cantor’s arguments
acceptable). The world soon soured on Cantor’s discovery. In fact, it was Cantor himself who
discovered a contradiction around 1899. Writing V for the set of all sets, we cannot have
|P(V)| > |V|,
since every element of P(V) is already an element of V. This contradiction, showing that
Theorem 1.1, even though apparently proved rigorously, fails for V, is called Cantor’s paradox;
a paradox is a self-contradictory statement.
The set theory invented by Cantor is often called naive set theory (though I prefer the name
intuitive set theory, i.e., a set theory based on intuition, rather than on a rigorous foundation).
The discovery of paradoxes in set theory was very disconcerting to mathematicians, since set theory
had become an important part of mathematics. One of the greatest mathematicians of the time, David
Hilbert, wrote that “Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben
können” (From the paradise that Cantor created for us, no one shall be able to chase us out).
The resolution of this paradox was given by axiomatic set theory, in which large collections are
called (real) classes, and small collections are sets; we will say more about axiomatic set theory later.
Here large and small are relative terms, so we will give a more precise explanation: all collections
are considered classes, but only some collections are considered sets. Classes (and sets) can only
have sets as elements. Real classes (classes that are not sets) cannot be elements of classes. This
avoids Cantor’s paradox, but it is not at all clear whether paradoxes will reappear in different forms.
1.4 Russell’s paradox
One of the most famous paradoxes is due to Bertrand Russell. It starts out by saying that for a
set x, we may or may not have x ∈ x.[1.1] Now, consider the set R defined as
R = {x : x ∉ x},
where x runs over all sets. That is, R is the set of all sets x for which x ∉ x. Then one asks the
question whether or not R ∈ R. This question has no answer, since if the answer is yes, that is, if
R ∈ R, then the definition of R with x = R says that R ∉ R; and if the answer is no, that is, if
R ∉ R, then the definition of R with x = R says that R ∈ R. Russell was so upset with his own
paradox that he decided to write a monumental three-volume work, Principia Mathematica, with
Alfred North Whitehead,[1.2] in which, as detractors of the work like to say, it took 1500 pages to
prove that 1 + 1 = 2. Their aim was to put mathematics on a solid foundation; but, as Kurt Gödel
showed in 1931, the goal they tried to accomplish is not attainable.
1.5 Epimenides paradox

Logical paradoxes have been around for a long time. Epimenides was a Cretan (that is, from the
island of Crete in the Mediterranean Sea) in the 7th or 6th century BC (perhaps he was a mythical
rather than a real figure) who said that “all Cretans are liars”; therefore he himself is a liar. But
that means the statement is not true, so he is not a liar after all. But then the statement is true,
so he is in fact a liar. So which is it? Even though this is not formulated in terms of set theory, the
analogy with Russell’s paradox is unmistakable. A somewhat similar paradox motivated by Russell’s
paradox is the barber’s paradox, described as saying “the barber is the man who shaves those that
do not shave themselves.”[1.3] The question is, does the barber shave himself?
1.6 Berry’s paradox

The paradox of G. G. Berry, a junior librarian at Oxford, describes an integer as “The smallest
positive integer not definable in under sixty letters” (this phrase contains 57 letters). So, which
integer does it define? The problem is that if it in fact defines an integer, then the phrase says
that that integer cannot be defined in fewer than sixty letters, and yet the phrase itself defines it
in 57 letters. Obviously, the sentence itself is contradictory.
The resolution of this paradox is that mathematical statements must be described in a formal
mathematical language; natural languages such as English are not sufficient to make clear statements.
Such a formal language has in fact been developed. For example, the statement that an integer p is
prime if it is greater than 1 and it cannot be represented as a product of two integers u and v with
1 < u < p and 1 < v < p is described by the formal statement

prime(p) ↔ (1 < p & ∀u∀v((1 < u < p & 1 < v < p) → p ≠ uv)),

where the variables p, u, and v run over integers. The literal translation of this formal statement
is the following: p is a prime (expressed formally as prime(p)) if and only if (expressed as ↔)
1 < p and (expressed as &) for all u (expressed formally as ∀u) and for all v (expressed as ∀v),[1.4]
if 1 < u < p and (again expressed as &) 1 < v < p then (the “if . . . then” is expressed as →)
p ≠ uv. Note that with the placement of parentheses in the formula one can express more clearly
what is meant than is possible in a natural language without sometimes complicated circumlocutions.

1.1 An axiom called the Axiom of Foundation, invented by John von Neumann, explicitly disallows the
possibility that x ∈ x for any set x.
1.2 A building is named after him at Brooklyn College.
1.3 Here shaving means shaving facial hair, so the statement is traditionally only about men, though
nowadays one seems to be treading on dangerous ground in making such statements.
1.4 Note that here no separate symbol is used to express the word “and.” This is because the word
“and” here is not used in the logical sense, as in “the sun shines and it is warm,” but in the sense
of enumeration, as in “ham and eggs.”
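A formal statement of this kind can be turned into a program almost word for word. The following
Python sketch does this for the definition of a prime; the function name prime mirrors the formula,
and the sketch includes the requirement 1 < p from the verbal definition. The only liberty taken is
that the quantifiers ∀u and ∀v are made to run only over the integers strictly between 1 and p,
since for all other u and v the condition before the arrow → is false and there is nothing to check.

```python
def prime(p: int) -> bool:
    """prime(p) <-> 1 < p & for all u, v: (1 < u < p & 1 < v < p) -> p != u*v."""
    if not 1 < p:
        return False
    candidates = range(2, p)  # all integers strictly between 1 and p
    return all(p != u * v for u in candidates for v in candidates)

print([p for p in range(20) if prime(p)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```

This is far from an efficient way to test primality; the point is only that the formal statement
leaves no room for doubt about what needs to be checked.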
1.7 Axiomatic set theory

It was strongly felt by some mathematicians that in spite of the paradoxes described above, set
theory should not be discarded, especially since the paradoxical statements were different from the
mathematically useful statements, but nobody really knew where the boundary between the two
lay. We mentioned the attempt by Russell and Whitehead to resolve the problem; a humorous
description of this history can be found in the essay The Greatest Math mistake of the Century.[1.5]
The theory created by Russell and Whitehead was awkward to use in practical mathematics. An
axiom system for set theory of practical use was created by Ernst Zermelo. In axiomatic set theory,
the only kinds of things are sets; that is, the elements of a set are sets themselves (in class, we
discussed how to define the integers as sets of earlier integers).[1.6]
We will briefly discuss some of Zermelo’s axioms. The axiom of extensionality says that two sets
are equal if and only if they have the same elements (this was mentioned in class). The
axiom of power set states the existence of the set of all subsets (power set) of a set. The axiom
of infinity asserts the existence of a specific infinite set (and not just infinite sets in general). A
remarkable axiom is the axiom of choice, basically saying that we can pick one element from each
of infinitely many sets. More precisely, it asserts the following: given a set x with nonempty sets
as its elements, there is a function f on x such that for each y ∈ x we have f(y) ∈ y.[1.7]
Interestingly, this axiom caused a lot of controversy, even though on first hearing it seems
completely natural. However, its consequences are so striking that many mathematicians started to
doubt its validity (or its safety, in that they questioned whether it can cause contradictions, as
earlier considerations in naive set theory caused contradictions). It turns out that it is completely
safe, as Gödel showed in 1938 (to be discussed in more detail below).
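The definition of the integers as sets of earlier integers, that is, 0 = ∅ and n + 1 = n ∪ {n}, can
be tried out on a computer. The following Python sketch builds the first few integers in this way;
the names von_neumann and show are my own choices for illustration, and Python’s frozenset is used
here only as a stand-in for the finite sets of set theory.

```python
def von_neumann(n: int) -> frozenset:
    """Return the integer n as the set {0, 1, ..., n-1}, with 0 the empty set."""
    ordinal = frozenset()              # 0 is the empty set
    for _ in range(n):
        ordinal = ordinal | {ordinal}  # successor: k + 1 = k ∪ {k}
    return ordinal

def show(s: frozenset) -> str:
    """Write out a finite set of this kind using only ∅ and braces."""
    if not s:
        return "∅"
    return "{" + ", ".join(show(e) for e in sorted(s, key=len)) + "}"

three = von_neumann(3)
print(show(three))                 # {∅, {∅}, {∅, {∅}}}
print(len(three))                  # 3: the set for n has exactly n elements
print(von_neumann(2) in three)     # True: 2 < 3 becomes "2 is an element of 3"
```

Note that in this representation the relation m < n between integers is the same as m ∈ n, so no
separate notion of order is needed.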
1.8 The axiom of replacement

Zermelo’s axiom system had one serious deficiency in that it could not prove that the collection
containing the elements N = {0, 1, 2, 3, . . .}, P(N), P(P(N)), P(P(P(N))), P(P(P(P(N)))), . . ., ad
infinitum,[1.8] is a set; Cantor certainly considered this collection a set.
This deficiency was cured by Abraham (Adolf) Fraenkel in 1922 by adding what is called the
axiom of replacement, which, formulated somewhat loosely, asserts that if φ(x, y) is a formula with
two distinguished variables x and y[1.9] such that, given a set z, for each x ∈ z there is exactly
one y for which φ(x, y) holds, then the collection
{y : φ(x, y) for some x ∈ z}
is a set.
In the formal framework, the axiom of replacement is not a single axiom; it is a group of infinitely
many axioms, called an axiom scheme, one for each formula φ(x, y). The axiom scheme of replacement
greatly increases the power of Zermelo’s set theory. Zermelo’s set theory with the axiom of
choice and with the axiom scheme of replacement is usually called ZFC (Z for Zermelo, F for
Fraenkel, and C for Choice); it is the most frequently used axiom system for set theory.

1.5 The html file that contains this essay has the following copyright notice (commented out in the
html, so it is not visible to the reader without looking at the source file): this website is
copyrighted by Paul Cox, all rights reserved. The use of this material for commercial purposes
without permission, including the posting of advertisements on this site, is a violation of that
copyright and may result in civil prosecution. Unfortunately, the last time I found this file on the
internet was in 2012, and it had disappeared before; so I felt I needed to download it, so that it
would not be lost. I am posting the website here for non-commercial use, since I feel it has great
educational value, and without my posting it, it would disappear from public view. See the site
for more.
1.6 In set theory, by integers one means nonnegative integers. As we mentioned, 0 = ∅, 1 = {0} = {∅},
2 = {0, 1} = {∅, {∅}}, 3 = {0, 1, 2} = {∅, {∅}, {∅, {∅}}}, . . .; these are called the von Neumann
ordinals; ordinals, and not integers, because the construction continues after all integers have
been described this way.
1.7 In set theory, functions themselves are considered sets, namely sets of pairs. For example, the
function f(x) = x^2 on the set R of all real numbers is considered the set f = {(x, x^2) : x ∈ R}.
1.8 Latin for “to infinity,” meaning that the list is infinitely long.
1.9 Other axiom systems of set theory

Initially, the fact that ZFC had infinitely many axioms was considered a disadvantage. In 1925, John
von Neumann introduced classes for large collections. In this version of set theory, all collections
are classes, and those that are allowed to be elements of other classes are called sets. With the aid
of this framework, he was able to reformulate set theory with finitely many axioms. Soon afterward,
Paul Bernays and Kurt Gödel reformulated von Neumann’s system into what is now called the von
Neumann–Bernays–Gödel set theory (also called Gödel–Bernays set theory).
1.10 Hilbert’s program and incompleteness

As a reaction to the contradictions in naive set theory, David Hilbert proposed a solution to this
crisis by grounding mathematics on a finite set of axioms which could be proved consistent, i.e., free
of contradictions, and also complete in the sense that every true mathematical statement can be
proved in this system.
In 1931, Kurt Gödel surprised the mathematical world by proving that Hilbert’s program cannot
be accomplished: he showed that any axiom system that is strong enough to contain the traditional
system of axioms for the integers, due to Giuseppe Peano, is incomplete in the sense that there are
true statements about integers that cannot be proved in the system; and, worse yet, Peano’s system
cannot be proved to be free of contradictions inside Peano’s system.
The proof of consistency, i.e., of being free from contradiction, needs to be accomplished inside
Peano’s system, or in a weaker system that is known to be consistent. Gödel proved his result under
the assumption of consistency of Peano’s system. This is important, since if Peano’s system is
inconsistent (contains a contradiction), i.e., if a false statement such as 0 = 1 can be proved in the
system, then the system cannot be relied on (in fact, if a false statement can be proved in the system,
then everything, true or false, can be proved in the system).[1.10]
1.11 The continuum hypothesis

Cantor, at the time of creating his set theory, asked the question whether there are infinite sets
that have cardinality greater than that of N, yet smaller than that of P(N). The continuum
hypothesis is the assertion that there are no such sets.
The question whether cardinalities are comparable is decided affirmatively under the axiom of choice.
That is, given any two sets A and B, there is a one-to-one mapping f : A → B (in which case, |A| ≤ |B|) or
there is a one-to-one mapping g : B → A (in which case, |B| ≤ |A|). Without the axiom of choice, one can
prove that if there is a one-to-one mapping f : A → B and also there is a one-to-one mapping g : B → A,
then there is a one-to-one function h : A → B that is onto B. (That is, if |A| ≤ |B| and |B| ≤ |A| then
|A| = |B|; this is the Cantor–Schröder–Bernstein theorem.)
In 1938, Gödel proved that if there are no contradictions in ZF (Zermelo–Fraenkel set theory
without the axiom of choice) then there are no contradictions in ZFC (Zermelo–Fraenkel set theory
with the axiom of choice) even with the continuum hypothesis added. This is what we meant by saying
that the axiom of choice is a harmless assumption.

1.9 It may have other variables that assume fixed values.
1.10 In actual fact, Gödel used a somewhat stronger assumption than the consistency of Peano
arithmetic; a few years later, John Barkley Rosser showed that this stronger assumption can be
replaced with the assumption of consistency only.
1.12 Independence of the continuum hypothesis

In 1964, Paul J. Cohen proved that the continuum hypothesis is independent of ZFC; that is
(assuming ZF is consistent), one cannot prove the continuum hypothesis in ZFC. Together with
Gödel’s result mentioned above, this means that ZFC set theory cannot decide the continuum
hypothesis: it cannot prove that the continuum hypothesis is true, and it cannot prove that it is
false. Given that the continuum hypothesis has very interesting consequences, this is a
disconcerting situation.
The von Neumann–Bernays–Gödel (NBG) set theory cannot decide either whether the continuum
hypothesis is true or false; this is no surprise, since in a technical sense NBG and ZFC are
equivalent.[1.11]
1.13 Computers
In 1928, David Hilbert and Wilhelm Ackermann formulated the Entscheidungsproblem[1.12] in that
they asked whether one can design a systematic method of calculation (algorithm) that can decide
whether a mathematical statement is true.[1.13] In 1936, Alonzo Church and Alan Turing solved the
problem negatively, by showing that there is no such method. They gave radically different
solutions: Church created λ-calculus, on which the LISP programming language is based. LISP played
a very important role in the development of artificial intelligence.
Turing created a theoretical machine called the Turing machine that was very important in the
development of computers. Turing showed that it is possible to create a universal machine that can
do any calculation doable on any other Turing machine. This led to the idea that it is in fact
possible to build an actual universal machine that can do all calculations that can be done at all
(except for resource constraints, such as time and memory; Turing’s theoretical machine did not
have such constraints).
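The idea of a single machine that can do the work of any other machine can be illustrated by a short
program that takes the description of a Turing machine as data and runs it. The following Python
sketch is not Turing’s construction; the way a machine is written down here (a table mapping a
state and a symbol to a new symbol, a move, and a new state), the names run, increment, and flip,
and the limit on the number of steps are all my own assumptions for illustration.

```python
from collections import defaultdict

BLANK = "_"

def run(program, tape, state="start", max_steps=10_000):
    """Run a one-tape Turing machine given as a table.

    program maps (state, symbol) to (new_symbol, move, new_state), where
    move is -1 (left) or +1 (right); the machine stops in state "halt".
    """
    cells = defaultdict(lambda: BLANK, enumerate(tape))  # tape, blank elsewhere
    head, steps = 0, 0
    while state != "halt":
        if steps == max_steps:  # a real Turing machine has no such limit
            raise RuntimeError("step limit reached")
        new_symbol, move, state = program[(state, cells[head])]
        cells[head] = new_symbol
        head += move
        steps += 1
    used = [i for i, s in cells.items() if s != BLANK]
    return "".join(cells[i] for i in range(min(used), max(used) + 1)) if used else ""

# Machine 1: add 1 to a number written in binary.
increment = {
    ("start", "0"): ("0", +1, "start"),    # move to the right end of the number
    ("start", "1"): ("1", +1, "start"),
    ("start", BLANK): (BLANK, -1, "carry"),
    ("carry", "1"): ("0", -1, "carry"),    # 1 + 1 = 0, carry 1
    ("carry", "0"): ("1", -1, "halt"),
    ("carry", BLANK): ("1", -1, "halt"),
}

# Machine 2: exchange the digits 0 and 1.
flip = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", BLANK): (BLANK, +1, "halt"),
}

print(run(increment, "1011"))  # 1100
print(run(increment, "111"))   # 1000
print(run(flip, "1011"))       # 0100
```

The point is that run itself never changes; only the table handed to it does. This is the sense in
which one machine can imitate all others.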
Building on Turing’s ideas, John von Neumann worked out the design principles of a stored
program computer, and such a computer was built under his direction from 1945 to 1951 at the
Institute for Advanced Study in Princeton, New Jersey. The first electronic computer, the ENIAC
(Electronic Numerical Integrator and Computer), was built earlier at the University of
Pennsylvania; it was completed in late 1945. The ENIAC was not a stored program computer; it was
programmed with plugboards, and it took about eight hours to set up a plugboard for a new
computation. In 1948, under John von Neumann’s guidance, modifications were made to the ENIAC
to make it function as a stored program computer. The design of nearly all computers today is
based on von Neumann’s ideas, and von Neumann architecture refers to the conceptual model of
this design.

1.11 NBG is a conservative extension of ZFC. That is, any statement that can be formulated in the
language of ZFC and is provable in NBG is also provable in ZFC. Note that the language of NBG is
richer in that it can formulate statements that cannot be formulated in the language of ZFC.
1.12 German for decision problem; the German word is used even in the English language literature.
1.13 The statements they considered are formulated in a formal mathematical language similar to the
one we used to define prime numbers above. Instead of true, they asked if the statement is
universally valid, that is, whether it is true under any interpretation of the symbols in it; we
want to avoid a rigorous explanation for the sake of simplicity.
1.14 Reading
[1, §3.4, pp. 174–185]
2 The unreasonable effectiveness of mathematics

The subjects discussed in Section 1 had quite an important effect on technology. It is possible,
in fact quite likely, that without these ideas computers would have been organized quite
differently, and would have been made in a quite costly way, even though less cleverly organized.
The computer that is generally considered the first electronic computer, the ENIAC, was not
particularly well organized. Even though quite a bit of inventiveness went into it, some things
were not done particularly well. For example, it calculated with decimal numbers, i.e., base 10
numbers, instead of binary, that is, base 2, numbers. This made designing the computing engine much
more complicated, and storing the result somewhat more costly, for the relatively small gain of
making the result immediately human-readable.[2.1] John von Neumann realized that it is quite easy
to make computers convert between binary and decimal numbers. In fact, he was the first one that
set down the principles of computer design in a document generally referred to as First Draft. It
is important to know that John von Neumann was thoroughly familiar with the mathematical
developments described in Section 1. He was present at the lecture at Königsberg (now Kaliningrad),
Germany, in 1930, where Gödel presented his First Incompleteness Theorem.[2.2] The lecture was not
understood by most of the attendees, but von Neumann quickly realized its importance. There is a
handwritten letter in the Gödel Nachlass[2.3] (written in German) from von Neumann in which he says
that Gödel’s results imply the conclusion that inside any axiomatic theory described in the First
Incompleteness Theorem one cannot prove the specific statement that such an axiomatic system is
free of contradictions. This result is Gödel’s Second Incompleteness Theorem. Von Neumann was not
aware that Gödel already had this result, but he never published his proof, because he recognized
Gödel’s priority. Neither did Gödel, except for a very short description of the result, certainly
not containing the proof.
John von Neumann moved to the Institute for Advanced Study in Princeton, NJ, where, together with
Albert Einstein, he was one of its professors. Through John von Neumann’s efforts, Gödel was also hired at
the Institute for Advanced Study in about 1940. One of the more recent professors was George F. Kennan,
author of the famous “Long Telegram” (see the Wikipedia article just mentioned); that’s history worth
knowing, not mathematics.
2.1 Calculating with binary numbers is quite simple, as Leibniz already realized about 350 years
ago, though people cannot really make sense of results in binary without laborious calculations.
2.2 The theorem says, informally, that there are results in any axiomatic theory of the nonnegative
integers that are true, yet unprovable.
2.3 Nachlass is a German word, meaning unpublished writings left behind after the author’s death;
see Gödel’s Nachlass. Perhaps there is a good English word to replace it, but in Gödel’s case it is
common to use the word Nachlass. Gödel published very little in his lifetime, but the material in
the Nachlass is very rich. A significant portion of it was written in Gabelsberger shorthand, and
at the time Gödel’s writings were prepared for publication by John W. Dawson Jr., it was not easy
to find an older Austrian or German person who was able to read this now obsolete shorthand.
Gödel was an Austrian, but in 1938 Austria was absorbed by Germany in the Anschluss. It was impossible
to travel to America at the time via the Atlantic because of the war. Gödel traveled through the Soviet
Union via the Trans-Siberian Railway, a trip made possible by the Molotov–Ribbentrop Pact.
Why mathematics is so useful in the sciences is not quite understood by the scientists themselves.
The Nobel laureate physicist Eugene Wigner wrote about his puzzlement in the article The
Unreasonable Effectiveness of Mathematics in the Natural Sciences. This is quite an interesting
read in the original, as is von Neumann’s First Draft. Wigner made one of his first contributions
to physics by using group theory to explain certain multiple spectral lines of the hydrogen atom
(spectral lines that split into several lines in electromagnetic fields). Wigner learned about the
subject of group representation (a field of mathematics required for the explanation) from John
von Neumann. Wigner was a professor at Princeton University. Princeton University was also trying
to hire the famous physicist Erwin Schrödinger, but they did not succeed in view of Schrödinger’s
unusual lifestyle. He lived with his wife and his mistress in the same household.
Around Christmas, after hearing a lecture at ETH[2.4] about the theory of Louis de Broglie, who
reversed Einstein’s idea (introduced to explain the photoelectric effect) that light is not only
waves but also particles by proposing that particles are also waves, Peter Debye commented, “this
is not the way to do physics; if you want to talk about waves, you need to write a wave equation.”
So Schrödinger took de Broglie’s dissertation together with his girlfriend (he was married at the
time) to a Swiss chalet around Christmas time in 1925 and worked out his famous equation that
changed physics.
The equations of quantum mechanics were worked out in the summer of 1925 by Werner Heisenberg,
Max Born,[2.5] and Pascual Jordan. Heisenberg re-invented matrix multiplication, a mathematical
tool that had been around for about 50 years at the time, but Heisenberg did not know about it
(Max Born did). The theory was based on matrix calculations that physicists found difficult to work
with. Wolfgang Pauli worked out a description of the spectrum of the hydrogen atom using matrix
mechanics, as the theory was called at the time. “Heisenberg was not smart enough to do it,” in the
description of Steven Weinberg.[2.6] With Schrödinger’s equation, the calculation of the spectrum
became quite easy, relatively speaking (so, it is not really easy, but physicists were familiar with how
to work with partial differential equations).
The wave equation is a mathematical tool to discuss waves; Isaac Newton used the wave equation
to mathematically calculate the speed of sound, relying on how air can be compressed. Later, James
Clerk Maxwell used the wave equation to show that there must be electromagnetic waves, and
calculated their speed. As the speed agreed with the experimentally measured speed of light, it
was accepted that light itself was an electromagnetic wave. Electromagnetic waves are what make
your cell phones work, and without quantum mechanics, the small electronic circuits inside your
cellphone could not be made.
The equations of the electromagnetic field were found experimentally before Maxwell, except that
they were inconsistent with the conservation of electric charge. Maxwell added the missing term,
and then the equations satisfied the requirements for a wave equation. The missing term was
difficult to find experimentally. The equations, with the term added, came to be called Maxwell’s
equations. Maxwell also invented color photography in his life of 48 years.
Maxwell’s equations were inconsitent with the equations of Newtonian physics. Einstein decided
in favor of Maxwell, hence the theory of special relativity. However, special relativity was incon-
sistent with Newton’s theory of gravitation, since in Newton’s theory the gravitational force spread
2.4 Eidgenössische Technische Hochschule in Zürich, Switzerland
2.5 The grandfather of the Australian singer Olivia Newton-John.
2.6 In a television lecture on Book TV. Weinberg must not have had much liking for Heisenberg, since Heisenberg
worked on the German atomic bomb project; Heisenberg probably did not put in an all-out effort, since it is doubtful
that he believed that the bomb could be made with the resources Germany had at her disposal.
with infinite speed, something that is not allowed in special relativity; this was quite a serious incon-
sistency, and a geometry of curved spaces was needed to resolve it. Such a geometry was invented by
Bernhard Riemann.2.7 Einstein learned about differential geometry from Marcel Grossmann, who
gave Einstein quite a bit of help. Finally, David Hilbert invited Einstein to Göttingen to give a
series of lectures, after which they both worked out the equations of general relativity, apparently
independently. Hilbert certainly did not receive any help from Einstein after Einstein’s lectures;
there are some doubts about whether Einstein used Hilbert’s ideas without acknowledgement, and
Hilbert did not assert his priority later.2.8
To this day, nobody knows how to make quantum mechanics and general relativity compatible.
This may not be a problem with an acute need for a solution, since quantum mechanics
is mostly concerned with small things (not quite, since it helps in explaining what happens inside
stars), and general relativity is concerned with very large distances. But one would like to
really understand what is going on. Meanwhile, nobody really understands what is going on. The
mathematics is quite well understood, and one can draw practically useful conclusions, but what is
actually going on is a mystery. Quantum theory splits into two parts; the mathematics, no issue
with that, but the interpretation? There are several of them, but none of them make sense (that is
why you can make money by writing popular expositions of quantum mechanics), but the nonsense
explanations are good enough to make decisions in practical situations (such as designing integrated
circuits or designing drugs by computational chemistry).
For nearly fifty years it was thought that Einstein’s general relativity was of no practical use. To be
sure, it explains the anomaly in the motion of the planet Mercury (for perhaps more than 150 years,
measurements had shown that Mercury did not quite move according to the requirements
of Newton’s equations), but so what. However, the equations also showed that gravitation slowed
down time. The Global Positioning System that shows your location on your cellphone (so that
drivers no longer have to ask for directions, their cellphone directly shows their position on the map)
relies on accurate measurements of the time it takes for the radio signal to travel between a number
of satellites and your cell phone. In order to keep the clocks on the satellites in sync with the clocks
on the ground, one needs to take Einstein’s equations into account, because the clocks tick faster
on the satellites (the gravitation of the earth becomes weaker higher up); without this, the system
would become quite useless in about a day. I assume the public is not quite aware of this.2.9
Schrödinger initially wanted to include special relativity in his equation, but he could not do this
in a way that agreed with experimental results, so he gave up on this, and his equation, while it gives quite
good agreement with experiments, is not compatible with special relativity. Paul Dirac, also known
as PAM after his full name Paul Adrien Maurice Dirac, succeeded in this a few years later, mathematically
explaining the electron spin, first suggested by Wolfgang Pauli’s exclusion principle. Wigner referred
to Dirac as “my famous brother-in-law” (Dirac was married to Wigner’s sister). Wigner later used
the requirement of compatibility between special relativity and quantum mechanics to describe what
kind of elementary particles were possible. Many of these particles were discovered later. See
Wigner’s classification, though the description at the target of this link is quite technical, and I am
not suggesting more than a cursory reading. Some other links above are also technical.
The following story was told to me by Atle Selberg, the discoverer, together with Paul Erdős, of the first
2.7 When asked what his geometry was good for, Riemann answered: physics. This was about 50 years before general
relativity, though, to be fair, Riemannian geometry also has applications in physics other than general relativity.
2.8 Hilbert was quite famous, and did not need any boost to his fame. Among his students was Emanuel Lasker,
who was chess world champion for 27 years (the longest reign in history) and quite a respectable mathematician in his own
right.
2.9 As the joke goes, who needs weather satellites when you have the weather channel on television.
elementary proof of the Prime Number Theorem.2.10 Dirac was famous for hardly ever engaging in small
talk. When Dirac visited the Institute for Advanced Study, they took a walk in the woods at the Institute,
the two wives in front, and Dirac and Selberg following behind. The two wives were talking animatedly
all the way, Dirac, totally silent. When they finished their walk, Dirac said, “I wonder how they do it.” I
walked that walkway many times myself. On one of my walks, I was attacked by a rabid fox. You are quite
helpless in such a situation. I noticed the fox, and I realized the danger, so I picked up a big stick. It was
quite useless, I was bitten on both arms. Finally, I picked up the fox by its hind legs, and flung it as far away
as I could, and ran back to the building. With both arms bleeding in many places, I immediately drove to
the emergency room, an employee of the Institute following in another car, just in case. If this ever happens
to you, you need to read further: The mistake I made, as I was told at the emergency room, was that I should
have washed out the wounds with warm water before driving to the emergency room. Doing so improves
your chances of survival, which may not be quite 100%. I think I am the only person in history to whom
this happened on that walkway.
Meanwhile, another famous mathematician, G. H. Hardy, looking back on his life, wondered in the book [2]
whether his life was spent on something useful, and expressed the hope that his mathematics would
never be used for war (it might be). Hardy tried to separate mathematics into mathematics of
practical use and pure mathematics. While the book is still an interesting read, Hardy was quite
wrong in his conclusions. With computers, all bets are off.
2.1 I was wrong, too

Then there is an item on which I was totally wrong some 35 years ago. At that time I was attending
lectures by Rohit Parikh about proving computer programs correct. I felt this to be totally pointless,
since one cannot prove computer programs correct. Turing proved that it is impossible to have a
Turing machine determine whether a calculation done on a Turing machine will ever finish
or go on forever.
What is going on is easier to represent in terms of the difference between primitive recursive and recursive
functions. The way primitive recursive functions are defined guarantees that every primitive recursive
function is defined everywhere. If f_n is the nth such function, then consider the function defined by F(n) =
f_n(n) + 1. This function differs from every primitive recursive function (it differs from f_n at n), so it cannot be a primitive
recursive function. Yet the description shows that it is clearly calculable. The method used here is called
“diagonalization.” It is similar to the method Cantor used to prove that the set of real numbers is not
countable.
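As a minimal sketch of the diagonal construction, the following Python code builds F from a finite list of everywhere-defined functions and checks that F differs from the nth function at the argument n. The four sample functions, the name diagonal, and the indexing starting at 0 are illustrative assumptions, not part of the notes.

```python
def diagonal(functions):
    """Return F with F(n) = f_n(n) + 1 for an indexable family of total functions."""
    def F(n):
        # Evaluate the nth function at n and add 1, so F(n) != f_n(n).
        return functions[n](n) + 1
    return F


# A small, hypothetical family f_0, f_1, f_2, f_3 (all defined everywhere).
family = [
    lambda n: 0,        # f_0: constant zero
    lambda n: n,        # f_1: identity
    lambda n: n * n,    # f_2: square
    lambda n: 2 ** n,   # f_3: power of two
]

F = diagonal(family)
diagonal_values = [F(n) for n in range(len(family))]

# F disagrees with each f_n at the argument n, so F is not in the family.
differs_everywhere = all(F(n) != family[n](n) for n in range(len(family)))
print(diagonal_values, differs_everywhere)
```

The same argument works for any enumeration of everywhere-defined functions; the finite list only keeps the example runnable.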
So, how can one come up with the definition of a calculable function? It must be done in a way that
avoids diagonalization. The way to do this is to set up the rules of a calculation in such a way that it is
impossible to tell whether a calculation ever finishes, and consider all the complete functions (i.e., the ones
that finish for all arguments, i.e., for 1, 2, 3, . . . ). Since one does not know which calculations will finish,
it is impossible to get a function by diagonalization that is complete.
The real-life objectives of the program to prove computer programs correct were more limited, and this
has been accomplished successfully, in a way that is worth a lot of money. Banks do not like their
computer systems to be hacked, so they need high reliability in their computer programs. In fact, the
seL4 microkernel, a member of the L4 family (a microkernel is the central part of an operating system), has been proven
correct. What has been done was the following. The microkernel has been described in the
computer language Haskell, named after the Pennsylvania State University mathematician Haskell
Curry. This is a human-friendly language, quite unsuitable for writing operating systems. The
2.10 See [1, p. 77]. The proof is elementary in the sense that it did not rely on a delicate analysis of the Riemann Zeta
Function. Until the proof by Erdős and Selberg, nobody thought that such a proof was possible.
kernel has been written in C, and what has been proved is that the C implementation does exactly
the same thing as the one written in Haskell.
Computers of course do not understand the language C. So, before they can do anything they have to have
something to translate the C program into their own machine language. There is a small program, perhaps
a thousand lines long, that translates a small part of the C compiler into the computer’s own language. This
small C compiler can translate the rest of the C compiler from C to machine language, and when the compiler has
been translated, it can translate the rest of the operating system into machine language. Thus the expression “pull oneself
up by one’s bootstraps” was applied to computers, as in “one bootstraps a computer.” Today, one uses the
expression “to boot a computer.”
Thus, to port (implement on another computer) the operating system one only has to rewrite the
thousand or so lines on a different computer, rather than rewriting the whole operating system.
3 Geometric cardinalities
3.1 Reading
The material in this section is to highlight the material in the section Straightening up the Circle:
Exploring the Infinite Within Geometrical Objects [1, §3.5, p. 190]. If you have a different edition of
the book, you will still find the same material, perhaps with minor changes. What is most likely to
change is the material at the end of the section, called Mindscapes. The material there is of real interest,
but I largely skipped the Mindscapes segment of the book, since there is really no time in the course
for deep study of them, and I would like to present a variety of material. Of course, you can spend
time on reading them, some of you might even enjoy them, but I definitely do not want to emphasize
the material there for exams. So you have the best of both worlds. You can learn a lot, you may
enjoy doing so, but the exams will not be terribly demanding. You still have to work though, since
nothing is free in life. The textbook is excellently readable, at least the main part; the Mindscapes
segments may touch on more complicated matters.
3.2 Mapping a line segment to one of different length in a one-to-one way

This can be done simply by projecting one line segment onto the other. In the picture you can see
how to project a line segment of length 1 onto a line segment of length 3. The picture shows how a number w
in the line segment (0, 1) (the set of all numbers greater than 0 but less than 1) is mapped onto the
number h in the line segment (0, 3) by drawing a straight line through the tip of the triangle at P.
It is also easy to describe this map algebraically (h = 3w in the present example). See Figure 3.1.
Figure 3.1: Mapping a line segment onto another
3.3 Mapping a finite line segment onto the whole line in a one-to-one way
Take the finite line segment (with the endpoints not included), and roll it up into a circle; the circle
will be missing its “North Pole” (its highest point). Then use the stereographic projection to map
this circle onto the line. The stereographic projection maps a point of the circle to the line on which
the circle is sitting by connecting the point with the North Pole of the circle; this connecting line will hit
the horizontal line at the image of the point on the circle. In the picture, the point u on
the circle is mapped to the point t on the horizontal line. The stereographic projection maps every
point of the circle (except the North Pole) onto the horizontal line. See Figure 3.2.
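The construction can be turned into a formula under one possible choice of coordinates (an assumption made here for illustration). Take the segment to be (0, 1), rolled into a circle of circumference 1, hence radius 1/(2π), touching the line at the point 0, with the position s on the segment measured along the circle from the North Pole. A short calculation with similar triangles then gives the image t = cot(πs)/π. The function names below are hypothetical.

```python
import math


def segment_to_line(s):
    """Map a point s of the open segment (0, 1) to the real line.

    Assumes the segment is rolled into a circle of radius 1/(2*pi)
    sitting on the line at 0; the result is cot(pi*s)/pi.
    """
    if not 0.0 < s < 1.0:
        raise ValueError("s must lie strictly between 0 and 1")
    return 1.0 / (math.pi * math.tan(math.pi * s))


def line_to_segment(t):
    """Inverse map: recover the point of (0, 1) that is sent to t."""
    return 0.5 - math.atan(math.pi * t) / math.pi


# The midpoint of the segment is the South Pole, which stays at 0 (up to rounding).
midpoint_image = segment_to_line(0.5)
# Going to the line and back returns the starting point (up to rounding).
round_trip = line_to_segment(segment_to_line(0.3))
print(midpoint_image, round_trip)
```

Points near the two ends of the segment lie near the North Pole and are sent far out on the line in opposite directions, which is why the whole line is covered.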
3.4 More on the stereographic projection

Instead of projecting the circle onto a line, one can use the same idea to project a sphere (with
the exception of the North Pole) onto a plane. The stereographic projection gives a solution to a
fundamental problem of map-making: how to represent (a part of) the surface of a sphere (such
as the earth) on a plane sheet of paper. This problem already presented itself in an acute way
to the ancient Greeks. Perhaps not in making maps of a large part of the earth, but definitely
in making maps of the stars, where a whole hemisphere3.1 is visible at one time. An important
property of the stereographic projection is that it preserves angles. According to the Wikipedia
article Stereographic projection in cartography, this was first proved mathematically in 1695,
using the recently established tools of calculus, invented by Isaac Newton. However, this statement
is a fairly direct consequence of a result Apollonius obtained about 2300 years ago when studying
sections of the oblique circular cone.
One obtains an oblique circular cone by taking a circle in a plane, and connecting the points of this circle
with a given point outside the plane that is off center. The intersection of this cone with the given plane, or one
parallel to it, is obviously a circle. Apollonius showed that there is also a nonparallel plane that intersects
this cone in a circle. The section with such a plane is called a subcontrary section. This result implies that
the stereographic projection maps a tiny circle into nearly a circle, a statement equivalent to saying that the
stereographic projection preserves angles. So, as far as anybody knows, Apollonius may already have known
that the stereographic projection preserves angles.
Navigation on the seas in the sixteenth century demanded a different kind of projection for map
making. The Mercator projection produces maps that show as a straight line the path that a
navigator follows when sailing in a constant direction given by the compass. This was a great help
in planning shipping routes by drawing straight lines on a map.
Figure 3.2: The stereographic projection: mapping a line segment onto a circle
3.1 Half of a sphere; hemi is Greek, as is sphere. One does not use the Latin semi, which also means half, so as not
to mix Latin and Greek.
3.5 Mapping a square onto a line segment in a one-to-one way
3.5.1 An idea that fails
Consider the unit square, represented by pairs of numbers between 0 and 1, and the unit line segment,
represented by single numbers between 0 and 1. The natural idea is to map the point (x, y) of the square
to the number t whose digits are taken alternately, one at a time, from the digits of x and y.
One needs to ensure that each number has only one
representation; for example, never represent a number by digits that are all 0 from some point on.3.2
Then there is no pair of numbers (x, y) that corresponds to the number
t = 0.420909090909090909 . . .
(the number is continued by repeating the group 09 of digits). Indeed, in order to get t, we would
have to have
x = 0.400000 . . . and y = 0.299999 . . . ,
but x = 0.400000 . . . is not allowed, since a number is not allowed to end in all zeros.3.3
3.5.2 An idea that succeeds

Take the line segment as the part of the number line with numbers between 0 and 1. The square
will be represented by pairs of numbers between 0 and 1. Each real number should be represented
in a way that a number is not allowed to end in all 0’s.3.4 A point (x, y) in the square (described by
two numbers x and y between 0 and 1) is mapped to a number t between 0 and 1 by intermixing
the digits of x and y as follows: take groups of digits alternately from x and y to build up the digits
of t. Each time, take the smallest group of digits not ending in zero.3.5 For example, if
x = 0.1400302740000902 . . . and y = 0.0214000392007 . . . ,
then first we divide x and y into groups as described:
x = 0. 1 4 003 02 7 4 00009 02 . . . and y = 0. 02 1 4 0003 9 2 007 . . . .
Then, taking groups of numbers alternately from x and y, build up a single number t:
t = 0. 1 02 4 1 003 4 02 0003 7 9 4 2 00009 007 02 . . . .
Observe that if t is given, then x and y can be determined by splitting t into groups, each time,
as before, taking the smallest group of digits that does not end in a zero, and putting the groups
alternately into x and y.
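The grouping rule can be tried out on finite digit strings. The sketch below is an illustration under stated assumptions: numbers are given as strings of the digits after the decimal point, truncated to finitely many digits, and any trailing run of zeros (an incomplete group) is dropped. The function names are hypothetical.

```python
import re


def split_groups(digits):
    """Split a digit string into the smallest groups not ending in zero.

    Each group is a (possibly empty) run of zeros followed by one nonzero
    digit; a trailing run of zeros is an incomplete group and is dropped.
    """
    return re.findall(r"0*[1-9]", digits)


def interleave(x_digits, y_digits):
    """Build the digits of t by taking groups alternately from x and y."""
    x_groups, y_groups = split_groups(x_digits), split_groups(y_digits)
    out = []
    for i, group in enumerate(x_groups):
        out.append(group)
        if i >= len(y_groups):
            break  # y has run out of (truncated) digits
        out.append(y_groups[i])
    return "".join(out)


def deinterleave(t_digits):
    """Recover the digits of x and y from the digits of t."""
    groups = split_groups(t_digits)
    return "".join(groups[0::2]), "".join(groups[1::2])


# The example from the text: x = 0.1400302740000902..., y = 0.0214000392007...
x_digits, y_digits = "1400302740000902", "0214000392007"
t_digits = interleave(x_digits, y_digits)
print("t = 0." + t_digits)
print(deinterleave(t_digits))
```

Because every group ends at its first nonzero digit, splitting t into groups reproduces exactly the groups that were put in, which is what makes the map reversible.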
4 The Pythagorean Theorem

4.1 Reading
We expand on what is discussed in the section Pythagoras and his Hypotenuse: How a Puzzle Leads
to the Proof of One of the Gems of Mathematics in [1, §4.1, p. 208].
3.2 That is, the number 0.482 = 0.48200000 . . . (the number continuing in all zeros) needs to be represented as
0.481999999 . . . (the number continuing in all 9s).

3.3 If instead of excluding numbers ending in all zeros, we decide to exclude numbers ending in all 9s, then y will
become a number that is not allowed.

3.4 Thus, for example, the number 0.73, which would normally be written as 0.73000 . . . (with infinitely many zeros),
should be written with its alternate representation as 0.72999 . . . (with infinitely many 9s).
3.5 Since we represent a number in such a way that no number ends in all 0s, it is guaranteed that each group of
digits is finite.
In [1, p. 211], a way of rearranging the pieces of a puzzle is shown that amounts to a
proof of the Pythagorean theorem. However, in addition to rearranging the pieces, some rigorous
consideration is needed to show that the pieces, when rearranged, properly fit together. This is what
is shown in the calculation next.
Consider the two figures. Figure 4.1 features the triangle ABC with sides a = BC, b = AC, and
c = AB. The two squares, ADHE and FHIJ, cover an area of size a^2 + b^2. This area is made up
of four triangles congruent to the triangle ABC and the square ECGF of side EC = a − b. The
four triangles and this latter square are rearranged to form the square KLMN in Figure 4.2 of side
KL = c. (Note that the small square in the middle, PRQO, has side PO − PL = a − b, that is, it is
congruent to the square ECGF in Figure 4.1.) The area of the square KLMN is the same as the
area in Figure 4.1 covered by the two squares ADHE and FHIJ. That is, c^2 = a^2 + b^2.
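The area count can also be checked algebraically; this short verification is an addition to the argument above, not part of the puzzle itself. Each of the four triangles is a right triangle with legs a and b, hence has area ab/2, and the small square has side a − b:

```latex
\begin{align*}
4 \cdot \tfrac{1}{2}ab + (a-b)^2
  &= 2ab + a^2 - 2ab + b^2 \\
  &= a^2 + b^2 .
\end{align*}
```

Since the same five pieces fill the square KLMN of side c, this total area is also equal to c^2.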
Figure 4.1: Pythagorean Theorem: first arrangement
Figure 4.2: Pythagorean Theorem: second arrangement
5 How many cameras need to watch an art gallery?
5.1 Reading
In [1, §4.2, p. 218], in the section entitled A View of an Art Gallery: Using Computational Geometry
to Place Security Cameras in Museums, a fairly recent mathematical problem is discussed.
5.2 How to find a spanning arc

We will be able to find a spanning arc unless the floor plan of the art gallery is a triangle. Assume
the floor plan is not a triangle, and imagine you are standing at a vertex in a dark art gallery with
a flashlight; call this vertex the flashlight vertex L (the letters refer to Figures 5.1 and 5.2). You are
looking along an edge of the gallery. Call the vertex at the far end of this edge the starting vertex S (you are
standing at the close end L of the edge, with the flashlight). Initially, the flashlight illuminates
the starting vertex S. Move the flashlight until it illuminates the first vertex (other than the
starting vertex). Call this vertex the found vertex F.
There are two cases. If the flashlight vertex and the found vertex are connected by an edge of
the art gallery, then the line connecting the starting vertex and the found vertex is guaranteed to
form a spanning arc (SF in Figure 5.1).
If the flashlight vertex and the found vertex are not connected by an edge, then the line connecting
the flashlight vertex and the found vertex forms a spanning arc (LF in Figure 5.2).
Figure 5.1: Spanning arc: no vertex found
Figure 5.2: Spanning arc: a vertex found
5.3 The number of cameras needed

First triangulate the art gallery. Then color the vertices of the art gallery by three colors (red, blue,
and green) in such a way that in each triangle there will be one vertex of each color. Start
by coloring the vertices of an arbitrary triangle, and then continue by coloring the vertices of
the adjacent triangles. If in a triangle, two vertices have already gotten colors (because these two
vertices are in common with the triangle next to it, and that triangle has already been colored), we
can choose the third color to paint the third vertex.
Once the vertices of the art gallery have been colored, if we place a camera at vertices of one
given color (say red), then these cameras will survey the whole art gallery. This is because there
will be a camera in each triangle, and from any vertex of a triangle, the whole triangle can be seen.
The only question that remains is, at which color should the cameras be placed? The answer
is: at the color which paints the smallest number of vertices. This color will paint at most n/3
vertices; otherwise, each color would paint more than n/3 vertices, but then the number of vertices
would be more than
n/3 + n/3 + n/3 = n.
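The coloring procedure can be sketched in Python as follows. This is an illustration under stated assumptions: the triangulation is supplied as a list of vertex triples (finding a triangulation is not shown), the polygon is simple so that each new triangle reached across a shared edge has exactly one uncolored vertex, and the colors are called 0, 1, 2 instead of red, blue, and green. The function name place_cameras is hypothetical.

```python
from collections import Counter, defaultdict, deque


def place_cameras(triangles):
    """Three-color the vertices of a triangulated polygon and return
    the vertices of the least-used color (the camera positions)."""
    # Record which triangles contain each edge.
    edge_to_triangles = defaultdict(list)
    for index, tri in enumerate(triangles):
        for i in range(3):
            edge = frozenset((tri[i], tri[(i + 1) % 3]))
            edge_to_triangles[edge].append(index)

    # Color the first triangle arbitrarily with the colors 0, 1, 2.
    color = {vertex: c for c, vertex in enumerate(triangles[0])}
    seen, queue = {0}, deque([0])
    while queue:
        tri = triangles[queue.popleft()]
        for i in range(3):
            edge = frozenset((tri[i], tri[(i + 1) % 3]))
            for neighbor in edge_to_triangles[edge]:
                if neighbor in seen:
                    continue
                # Two vertices are already colored; the third vertex gets
                # the remaining color (the colors 0, 1, 2 add up to 3).
                third = next(v for v in triangles[neighbor] if v not in edge)
                color[third] = 3 - sum(color[v] for v in edge)
                seen.add(neighbor)
                queue.append(neighbor)

    counts = Counter(color.values())
    best = min(range(3), key=lambda c: counts[c])
    return sorted(v for v, c in color.items() if c == best)


# A hexagon with vertices 0..5, triangulated by diagonals from vertex 0.
hexagon = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 5)]
cameras = place_cameras(hexagon)
print(cameras)
```

In this example the least-used color paints only the vertex 0, so one camera suffices, well within the bound n/3 = 2.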
6 Platonic solids
6.1 Reading
The subject discussed here expands on [1, §4.5, p. 270] The Platonic Solids Turn Amorous: Discovering
the Symmetry and Interconnections Among the Platonic Solids.
6.2 The five Platonic Solids

They are the tetrahedron, the cube, the octahedron, the dodecahedron, and the icosahedron.
The regular tetrahedron has four equilateral triangles as faces. At each vertex, three edges come
together.
The cube has six squares as faces. At each vertex, three edges come together.
The regular octahedron has eight equilateral triangles as faces. At each vertex, four edges come
together.
The regular dodecahedron has twelve regular pentagons as faces. At each vertex, three edges
come together.
The regular icosahedron has twenty equilateral triangles as faces. At each vertex, five edges come
together.
6.3 The duals of Platonic solids

Given a Platonic solid, the centers of the faces can be taken as the vertices of a new solid.
This new solid is called the dual of the given solid.
The dual of the regular tetrahedron is another regular tetrahedron. The dual of the cube is the
octahedron, and the dual of the dodecahedron is the icosahedron.
This is all that needs to be remembered, since the dual of the dual of a Platonic solid is (a smaller
copy of) the original Platonic solid. That is, the dual of the octahedron is the cube (since the dual of
the cube is the octahedron), and the dual of the icosahedron is the dodecahedron (since the dual of
the dodecahedron is the icosahedron).
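For the cube, the construction of the dual can be carried out with coordinates. The sketch below assumes a cube with vertices (±1, ±1, ±1), a convenient choice not taken from the notes, computes the six face centers, and joins the centers that are nearest to each other. The result has the vertex, edge, and degree counts of the octahedron. The function name cube_dual is hypothetical.

```python
from itertools import combinations, product


def cube_dual():
    """Return the face centers of the cube (+-1, +-1, +-1) and the edges
    joining nearest face centers."""
    cube = list(product((-1, 1), repeat=3))
    centers = []
    for axis in range(3):
        for sign in (-1, 1):
            # A face consists of the four vertices with one coordinate fixed.
            face = [v for v in cube if v[axis] == sign]
            centers.append(tuple(sum(c) / len(face) for c in zip(*face)))

    def dist2(p, q):
        # Squared distance between two points.
        return sum((a - b) ** 2 for a, b in zip(p, q))

    shortest = min(dist2(p, q) for p, q in combinations(centers, 2))
    edges = [(p, q) for p, q in combinations(centers, 2) if dist2(p, q) == shortest]
    return centers, edges


centers, edges = cube_dual()
# Number of edges meeting at each vertex of the dual.
degrees = [sum(1 for e in edges if c in e) for c in centers]
print(len(centers), len(edges), degrees)
```

The six centers are the points (±1, 0, 0), (0, ±1, 0), (0, 0, ±1), and four edges come together at each of them, in agreement with the description of the octahedron above.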
7 Planar graphs and Euler’s formula

7.1 Reading
This section is about using graph theory to prove that there are only five Platonic solids. As an
introduction, it is desirable to read the introduction to [1, Chapter 6, pp. 384–385] Modeling Our
World through Graphs, and [1, §6.1, p. 386] Circuit Training: From Königsberg Bridge Puzzle to
Graphs, but the main material is in [1, §6.2, p. 401] Feeling Edgy: Exploring Relationships Among
Vertices, Edges, and Faces.
7.2 Graphs and planar graphs

First, by a graph one means a number of points, called vertices, some of which are connected by arcs
(or lines), called edges (the edges do not have to be straight lines). A graph is called connected if
you can walk from any vertex to any other vertex by traversing a number of edges. Finally, a graph
is called planar if you can draw it in the plane without any two edges crossing (that is, the edges
are allowed to meet only if they run into the same vertex).
One often draws a nonplanar graph in the plane, but then the edges must cross at some points.
Such crossing points, however, must be distinguished from vertices. For example, when verifying
that a graph is connected by walking from vertex to vertex, one is not allowed to step from one edge
onto another at a crossing point; each edge that is traversed, must be fully traversed from the one
vertex at its one end to the other vertex at its other end.
7.3 Euler’s formula for planar graphs

Writing V for the number of vertices, F for the number of regions (faces), and E for the number of
edges, we have
V + F − E = 2.
Keep in mind that when counting the regions, the region at “infinity” also needs to be counted. That
is, a region is either a part of the plane completely enclosed by edges and not cut into parts by other
edges, or the region that is outside the graph. Using a geographic metaphor, the inside regions
can be called countries, and the outside region can be called the ocean.
7.4 The proof of Euler’s formula

The formula says that
V + F − E = 2,
where V denotes the number of vertices, F , the number of regions (faces), and E, the number of
edges in a connected graph.
First, the formula is certainly valid for small graphs containing one or two vertices. If a graph
contains only one vertex, then V = 1, there is one region (the ocean), i.e., F = 1, and there can be
no edges, i.e., E = 0 (see footnote 7.1). As 1 + 1 − 0 = 2, the formula V + F − E = 2 is true in this case.
The graph with one vertex is the only one that, strictly speaking, needs to be considered. To
feel somewhat more comfortable, one might observe that the formula is also true for a graph with
two vertices. In this case V = 2. There is still only one region (the ocean), i.e., F = 1. There
must be one edge, that is, E = 1; in fact, if the two vertices were not connected by an edge, then
the graph would not be connected, and we are considering only connected graphs, according to the
assumptions. Thus V + F − E = 2 + 1 − 1 = 2.
If one is given a larger graph, one can gradually build it down, or deconstruct it,7.2 by
removing one edge or one vertex at a time, and noticing that one step of such deconstruction will
7.1 Sometimes one allows edges connecting a vertex to itself (such edges are called loops). Then E = 1; but the loop
encloses a region, so F = 2. Since 1 + 2 − 1 = 2, the formula V + F − E = 2 is true in this case.

7.2 To use a word fashionable in literary criticism.
Figure 7.1: An edge that cannot be deleted
not change the quantity

V + F − E.
Note that in such a deconstruction step, one cannot remove an edge whose removal would make the graph disconnected.
For example, in the graph in Figure 7.1, one is not allowed to remove the edge AB, since
this would disconnect the graph.
Figure 7.2: Removing an edge and a vertex
If in the graph in Figure 7.2, one removes the vertex D, the number of vertices V decreases by
one: V_new = V − 1; with the vertex D, one also has to remove the edge CD, since, with the vertex
D removed, one end of the edge CD would not be attached to a vertex; so the number of edges
also decreases by one: E_new = E − 1. On the other hand, the number of faces does not change, so
F_new = F. Hence
V_new + F_new − E_new = (V − 1) + F − (E − 1) = V + F − E.
That is, the quantity V + F − E does not change by the removal of a vertex as described.
Figure 7.3: Deleting an edge and a face
If in the graph in Figure 7.3, one removes the edge GH, the number of vertices does not change,
that is, V_new = V. The number of faces decreases by one (the two faces on the two sides of GH merge into one), that is, F_new = F − 1. The number of edges
decreases by one, that is, E_new = E − 1. Hence,
V_new + F_new − E_new = V + (F − 1) − (E − 1) = V + F − E.
That is, the quantity V + F − E does not change by the removal of an edge as described.
Thus, starting with any connected planar graph, one can build down this graph step by step, by
removing edges and vertices (one at a time), without changing the quantity V + F − E. In a finite
number of steps, one arrives at the graph containing one vertex and no edges. Since, as we saw
above, in this graph we have V + F − E = 2, we must have had V + F − E = 2 in the original graph.
Thus, Euler’s formula is established (see footnote 7.3).
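The build-down argument can be imitated in a few lines of Python. This is only a sketch, under the assumption that the input is a connected planar graph without loops or repeated edges, given as a list of vertices and a list of vertex pairs; the function names are made up for this illustration. Each removal of an edge that keeps the graph connected merges two faces, so the number of faces is one more than the number of such removals.

```python
from collections import deque


def is_connected(vertices, edges):
    """Return True if every vertex can be reached from an arbitrary start vertex."""
    vertices = set(vertices)
    if not vertices:
        return True
    neighbors = {v: set() for v in vertices}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in neighbors[v] - seen:
            seen.add(w)
            queue.append(w)
    return seen == vertices


def count_faces_by_deconstruction(vertices, edges):
    """Build a connected planar graph down to one vertex; return its face count."""
    vertices = set(vertices)
    edges = [tuple(e) for e in edges]
    if not vertices or not is_connected(vertices, edges):
        raise ValueError("the graph must be nonempty and connected")
    merged_faces = 0
    while edges:
        degree = {v: 0 for v in vertices}
        for a, b in edges:
            degree[a] += 1
            degree[b] += 1
        leaf = next((v for v in vertices if degree[v] == 1), None)
        if leaf is not None:
            # As in Figure 7.2: remove a vertex together with its only edge.
            edges = [e for e in edges if leaf not in e]
            vertices.remove(leaf)
            continue
        # As in Figure 7.3: remove an edge whose removal keeps the graph connected.
        for i in range(len(edges)):
            remaining = edges[:i] + edges[i + 1:]
            if is_connected(vertices, remaining):
                edges = remaining
                merged_faces += 1
                break
        else:
            raise ValueError("no removable edge found")
    # The final graph (one vertex, no edges) has exactly one face.
    return merged_faces + 1


# The edge graph of the cube: vertices 0..7, joined when they differ in one bit.
cube_vertices = list(range(8))
cube_edges = [(a, a ^ (1 << k)) for a in range(8) for k in range(3) if a < (a ^ (1 << k))]
cube_faces = count_faces_by_deconstruction(cube_vertices, cube_edges)
print(cube_faces, len(cube_vertices) + cube_faces - len(cube_edges))  # prints: 6 2
```

The sketch does not check planarity; for a graph that is not planar, the number it returns has no meaning as a face count.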
8 There cannot be more than five Platonic solids

8.1 Reading
The material here is a continuation of what was started in Section 7 on planar graphs and Euler’s
formula, the primary concern of [1, §6.1, pp. 405–408].
8.2 Platonic solids satisfy Euler’s formula

Place the center of the solid at the center of a sphere, and, out of the center of the sphere, project
the vertices, edges, and faces of the solid onto the sphere. Then take a point on the sphere inside one
of the projected faces (so that the point is not at a vertex or on an edge), and use this as the North
Pole to project the graph drawn by the projected edges and projected vertices onto a plane touching
the sphere at the South Pole. We obtain a planar graph; for this, Euler’s formula V + F − E = 2
must hold (see footnote 8.1).
8.3 Determining the number of edges of a Platonic solid

There are three different ways.
One is Euler’s equation saying that
V + F − E = 2,
where V is the number of vertices, F is the number of faces, and E is the number of edges.
For example, for the icosahedron F = 20 and V = 12 (see footnote 8.2). Thus,
E = V + F − 2 = 30.
Another way is to observe that if each face has s sides, then the number of edges is
E = F · s/2.
This is because each face is next to s edges, and so F faces give rise to F · s edges. However, each
edge is shared between two faces, so this number needs to be divided by 2. Thus, for the octahedron,
F = 8 and s = 3; so
E = 8 · 3/2 = 12.
7.3 If one removes the last vertex, one ends up with the empty graph (a graph with no edges and no vertices). One can
reasonably say that V = 0, E = 0, but F = 1 in this case, since the whole plane can be considered a single region.
In this case, Euler’s formula is not valid. If one wants to be overly precise, one should say that Euler’s formula holds
for nonempty connected planar graphs.
8.1 V , F , and E for this planar graph are clearly the same as V , F , and E for the solid we started with.
8.2 It may be too much to ask to remember how many vertices the icosahedron has. However, since the icosahedron
is the dual of the dodecahedron, the number of vertices of the icosahedron is the same as the number of faces of the
dodecahedron; that is, V = 12 for the icosahedron.
The third way is to note that if at each vertex c edges come together then
E = V · c/2.
This is because each vertex sits on c edges. Hence the V vertices give rise to V · c edges. However,
each edge is shared between two vertices (its two endpoints), so this number needs to be divided by
2. Thus, for the dodecahedron, V = 20 and c = 3 (see footnote 8.3), and so
E = 20 · 3/2 = 30.
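The three counts can be compared side by side for all five Platonic solids. The following is a small sketch; the table of values of V, F, s, and c is the standard one, and the name `edge_counts` is made up for this illustration.

```python
# (name, V, F, s, c): vertices, faces, sides per face, edges meeting at each vertex
PLATONIC_SOLIDS = [
    ("tetrahedron", 4, 4, 3, 3),
    ("cube", 8, 6, 4, 3),
    ("octahedron", 6, 8, 3, 4),
    ("dodecahedron", 20, 12, 5, 3),
    ("icosahedron", 12, 20, 3, 5),
]


def edge_counts(V, F, s, c):
    """Return the number of edges computed in the three ways described above."""
    by_euler = V + F - 2        # from V + F - E = 2
    by_faces = F * s // 2       # each edge is shared between two faces
    by_vertices = V * c // 2    # each edge is shared between two vertices
    return by_euler, by_faces, by_vertices


for name, V, F, s, c in PLATONIC_SOLIDS:
    print(name, edge_counts(V, F, s, c))
```

For each solid the three numbers agree: 6 for the tetrahedron, 12 for the cube and the octahedron, and 30 for the dodecahedron and the icosahedron.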
8.4 No more than five: a proof

Given a regular solid, write E for the number of its edges, F for the number of its faces, and V for
the number of its vertices. Let s denote the number of sides of each face, and c the number of edges
coming into each vertex. Then we have E = F s/2 and E = V c/2 (see footnote 8.4). Hence we have
F = 2E/s and V = 2E/c. Substituting this into Euler’s formula
V + F − E = 2,
we obtain
2E/c + 2E/s − E = 2,
that is,
E · (2/c + 2/s − 1) = 2.
The first factor on the left (that is, E) is positive. For the product to be a positive number (namely,
the number 2 on the right-hand side), the second factor must also be positive. That is, we must
have
2/c + 2/s > 1.
This inequality can hold only under very narrow circumstances. Namely, we must have c ≥ 3
(we cannot have only two edges coming into a vertex; see footnote 8.5) and s ≥ 3 (there is no
polygon with only two sides).
As c ≥ 3, we have 2/3 ≥ 2/c, so the above inequality gives
2/3 + 2/s ≥ 2/c + 2/s > 1,
8.3 It may be too much to ask to remember how many vertices the dodecahedron has. However, since the dodecahedron
is the dual of the icosahedron, the number of its vertices is the same as the number of faces of the icosahedron;
that is, 20. Similarly, the number of edges coming together at a vertex of the dodecahedron is the same as the number
of sides of each face of the icosahedron; that is, 3.
8.4 To count the number of edges, note that each face gives rise to s edges adjacent to it, so F faces give rise to F s
edges. Since each edge is shared between the two faces adjacent to it, this number needs to be divided by 2; that is,
E = F s/2. Another way to count the number of edges is to note that if each vertex gives rise to the c edges coming
into it, then V vertices give rise to V c edges. Since each edge is shared between two vertices (its endpoints), this
number needs to be divided by 2; that is, E = V c/2.
8.5 It should be clear that the number of edges coming into a vertex is the same as the number of faces adjacent to
a vertex. A vertex sits at the tip of a pyramid (you can obtain this pyramid by slicing off a small part of the solid
containing the vertex), and a pyramid cannot have only two sides.
i.e.,
2/3 + 2/s > 1.
This gives 2/s > 1 − 2/3 = 1/3, i.e., 6 > s. Hence the only possible values for s are 3, 4, and 5.
Similarly, as s ≥ 3, we have 2/3 ≥ 2/s, and a calculation along the same lines shows that the
only possible values for c are 3, 4, and 5.
Hence the only possibilities are s = 3 and c = 3, which gives the tetrahedron (see footnote 8.6);
s = 3 and c = 4, which gives the octahedron; s = 3 and c = 5, which gives the icosahedron; s = 4
and c = 3, which gives the cube; and s = 5 and c = 3, which gives the dodecahedron. The case s = 4
and c = 4 is in fact not possible, since 2/c + 2/s = 2/4 + 2/4 = 1 in this case, whereas the quantity
on the left must be greater than one according to our key inequality above. The case s = 4 and
c = 5 is impossible a fortiori (see footnote 8.7). The remaining cases, s = 5 and c = 4, and s = 5
and c = 5, are impossible, since 2/c + 2/s < 1 in these cases.
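The case analysis above can be double-checked by a short search. The sketch below tries all pairs s, c between 3 and a search limit (the limit 10 is an arbitrary choice for this illustration; the proof shows that nothing beyond 5 can occur), keeps those with 2/c + 2/s > 1, and recovers E, F, and V from the equations E · (2/c + 2/s − 1) = 2, F = 2E/s, and V = 2E/c. Exact fractions are used so that the borderline case 2/4 + 2/4 = 1 is not affected by rounding.

```python
from fractions import Fraction


def platonic_candidates(limit=10):
    """List (s, c, E, F, V) for 3 <= s, c <= limit satisfying 2/c + 2/s > 1."""
    found = []
    for s in range(3, limit + 1):
        for c in range(3, limit + 1):
            excess = Fraction(2, c) + Fraction(2, s) - 1
            if excess > 0:
                E = 2 / excess      # from E * (2/c + 2/s - 1) = 2
                F = 2 * E / s       # from E = F s / 2
                V = 2 * E / c       # from E = V c / 2
                found.append((s, c, int(E), int(F), int(V)))
    return found


for s, c, E, F, V in platonic_candidates():
    print(f"s={s} c={c}: E={E} F={F} V={V}")
```

The search finds exactly five pairs, matching the tetrahedron, octahedron, icosahedron, cube, and dodecahedron.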
9 Rigid tilings of the plane

9.1 Reading
Tilings of the plane are discussed in [1, §4.4, p. 249] Soothing Symmetry and Spinning Pinwheels:
Can a Floor Be Tiled Without Any Repeating Pattern?
9.2 The pinwheel triangle

A pinwheel triangle is a right triangle in which the longer leg is twice as long as the shorter leg
(see footnote 9.1). Using the Pythagorean theorem, a simple calculation then shows that the
hypotenuse is √5 times as long as the shorter leg.
Figure 9.1: A supertile
9.3 A supertile
Consider Figure 9.1: the picture shows five pinwheel triangles (the tiles) forming a larger
pinwheel triangle (the supertile). The tiles are labeled 1, 2, 3, 4, and int; the last label refers
to the interior triangle, and it plays a special role in the considerations involving the pinwheel
tiling. It is useful to remember these labels when analyzing the pinwheel tiling.
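The sizes involved can be checked numerically. The sketch below takes a tile with shorter leg 1 (an arbitrary choice of unit) and uses only two facts: the Pythagorean theorem, and the observation that a supertile made of five tiles has five times the area of a tile, so its sides are √5 times as long. The function names are made up for this illustration.

```python
import math


def pinwheel_sides(short_leg):
    """Return (shorter leg, longer leg, hypotenuse) of a pinwheel triangle."""
    long_leg = 2 * short_leg
    return short_leg, long_leg, math.hypot(short_leg, long_leg)


def supertile_sides(short_leg):
    """Sides of the supertile built from five tiles with the given shorter leg."""
    # Five times the area means the lengths are scaled by sqrt(5).
    scale = math.sqrt(5)
    return tuple(scale * side for side in pinwheel_sides(short_leg))


tile = pinwheel_sides(1.0)
supertile = supertile_sides(1.0)
tile_area = tile[0] * tile[1] / 2
supertile_area = supertile[0] * supertile[1] / 2
print(tile, supertile, supertile_area / tile_area)
```

In particular, the shorter leg of the supertile has the same length as the hypotenuse of a tile, and the hypotenuse of the supertile is 5 times the shorter leg of a tile.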
8.6 Once s and c are given, the last displayed equation above determines E, and then the other equations given earlier
determine F and V . So, given s and c, all the other quantities are determined.
8.7 Even more so (because, in this case 2/c + 2/s is even smaller, that is 2/c + 2/s < 1).
9.1 A leg is a side adjacent to the right angle.
9.4 An interior triangle cannot be any other tile
Consider Figures 9.1–9.5. Figure 9.1 shows the tiling of a supertile, with the interior triangle labeled
as int. In Figure 9.2, superimposed on this, drawn with thin lines, is a supertile that uses this interior
triangle as triangle 1. As can be seen, some tiles drawn with thin lines cut some tiles drawn with
thick lines, showing that the addition of the tiling drawn with thin lines is not allowed. Figure 9.3
shows what happens if the interior triangle is used as triangle 2 to build a new supertile. Again,
tiles are cut. Figure 9.4 shows what happens if the interior triangle is used as triangle 3 to build a
new supertile. Again, tiles are cut. Figure 9.5 shows what happens if the interior triangle is used as
triangle 4 to build a new supertile. Again, tiles are cut.
Figure 9.2: An interior tile cannot be tile 1
Figure 9.3: An interior tile cannot be tile 2
Figure 9.4: An interior tile cannot be tile 3
A part of the tiling is shown in Figure 9.6. A larger part, with smaller tiles, is shown in Figure 9.7.
Figure 9.5: An interior tile cannot be tile 4
10 The pigeonhole principle

10.1 Reading
One may want to read the introduction to [1, Chapter 2, pp. 42–43] Number Contemplation before
reading [1, §2.1, p. 44] Counting: How the Pigeonhole Principle Leads to Precision Through
Estimation.
The pigeonhole principle says that if we try to put more than n items into n slots (pigeonholes),
then there will be at least one slot that receives more than one of the items.
This is all very simple, and sounds totally harmless. Nevertheless, the pigeonhole principle can
be a powerful tool in mathematical arguments where only the existence of an object is proved, while
the object itself cannot be identified.
An illustration, silly to the extreme but quite instructive, is the way of showing that there are
two persons living in the world who have the same number of hairs on their bodies. A fairly routine
estimation, not repeated here, shows that no person in the world has more than 400 million hairs.
Since there are more than 5 billion people in the world, there must be more than one person with
the same number of hairs.
What is interesting about this example is that we know for certain that there are two persons
in the world with the same exact number of hairs. Yet, there is virtually no chance that we will be
able to identify two persons with the same number of hairs.
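A scaled-down version of the hair-counting example can be tried on a computer. In the sketch below all numbers are hypothetical: at most 400 hairs per person instead of 400 million, and randomly generated hair counts for 402 imaginary people. Since only the 401 values 0, 1, . . ., 400 are possible, the pigeonhole principle guarantees that two people share a value. Unlike in the real world, here the whole list is available, so a matching pair can actually be found.

```python
import random


def find_shared_slot(slot_of_item):
    """Return the indices of two items lying in the same slot, or None if there are none."""
    first_seen = {}
    for item, slot in enumerate(slot_of_item):
        if slot in first_seen:
            return first_seen[slot], item
        first_seen[slot] = item
    return None


MAX_HAIRS = 400              # hypothetical scaled-down bound
NUMBER_OF_PEOPLE = 402       # more people than the 401 possible hair counts

rng = random.Random(0)       # fixed seed so that the run is repeatable
hair_counts = [rng.randint(0, MAX_HAIRS) for _ in range(NUMBER_OF_PEOPLE)]
pair = find_shared_slot(hair_counts)
print(pair, hair_counts[pair[0]], hair_counts[pair[1]])
```

The function returns None only when all slots are distinct, which cannot happen when there are more items than slots.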
10.2 Approximation of numbers by fractions

Along more serious lines, next we will discuss an important application of the pigeonhole principle.
The number π expressing the ratio of the circumference to the diameter of the circle is approximately
3.141,592,653,589,793 . . . . The number π can be approximated by common fractions as
22/7, 333/106, 355/113, 103993/33102.
As a measure of how good these approximations are, consider the following approximate equations:
π − 22/7 ≈ −0.001,264,489 ≈ −0.0619 · (1/7²),
π − 333/106 ≈ 8.321,963 · 10⁻⁵ ≈ 0.935 · (1/106²),
π − 355/113 ≈ −2.667,642 · 10⁻⁷ ≈ −0.0034 · (1/113²),
π − 103993/33102 ≈ 5.778,906 · 10⁻¹⁰ ≈ 0.633 · (1/33102²);
the goodness of each of these approximations is measured in terms of 1 divided by the square of
the denominator of the approximating fraction. The following theorem will explain why this way
of measuring the goodness of the approximations makes sense. The theorem is proved by a direct
application of the pigeonhole principle:
Figure 9.6: A part of the tiling
Figure 9.7: A larger part of the tiling
Theorem 10.1. Let x be a real number and n a positive integer. Then there are integers k and l
such that 1 ≤ l ≤ n and

(10.1) |lx − k| ≤ 1/(n + 1).
Noting that l ≠ 0, the inequality here can also be written as
|x − k/l| ≤ 1/(l(n + 1)).
In other words, given any positive integer n, a real number x can always be approximated by a
common fraction whose denominator l is ≤ n such that the error of the approximation is less than
or equal to
1/(l(n + 1)).
Since 1 ≤ l ≤ n, this error is less than 1/l².
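The four approximations of π given before Theorem 10.1 can be checked with a few lines of Python. The sketch below computes the error x − k/l and the same error multiplied by l², the square of the denominator; by the remark above, for a fraction k/l supplied by Theorem 10.1 the second number is less than 1 in absolute value. The function name is made up for this illustration, and the computation relies on the double-precision value `math.pi`, which is accurate enough for these four fractions.

```python
import math

APPROXIMATIONS = [(22, 7), (333, 106), (355, 113), (103993, 33102)]


def scaled_error(x, numerator, denominator):
    """Return (x - numerator/denominator, the same error times denominator squared)."""
    error = x - numerator / denominator
    return error, error * denominator ** 2


for k, l in APPROXIMATIONS:
    error, scaled = scaled_error(math.pi, k, l)
    print(f"{k}/{l}: error = {error:.6e}, error * {l}^2 = {scaled:.4f}")
```

All four scaled errors lie strictly between −1 and 1, in agreement with the bound 1/l².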
Proof. For a real number y, denote by [y] the integer part of y. That is, [y] is the largest integer
m ≤ y. Further, denote by {y} the fractional part of y; that is, {y} ≝ y − [y] (see footnote 10.1).
Clearly, 0 ≤ {y} < 1.
Assume that no k and l satisfying the assertion of Theorem 10.1 exist. Then for any integer l
with 1 ≤ l ≤ n we must have
(10.2) 1/(n + 1) < {lx} < n/(n + 1).
Indeed, writing s = [lx], we have {lx} = lx − s. Thus, if the first inequality fails then, noting that
lx ≥ s, we have
0 ≤ lx − s ≤ 1/(n + 1),
so the inequality (10.1) claimed in Theorem 10.1 is satisfied with k = s. If the second inequality fails
then, noting that lx − s = {lx} < 1, we have
n/(n + 1) ≤ lx − s < 1.
Subtracting 1 from all the members of the inequality (see footnote 10.2), we obtain
−1/(n + 1) ≤ lx − (s + 1) < 0.
In this case, inequality (10.1) claimed in Theorem 10.1 is satisfied with k = s + 1.
Given that (10.2) is satisfied, each of the n numbers {1x}, {2x}, {3x}, . . ., {nx} belongs to at
least one (see footnote 10.3) of the n − 1 intervals
[1/(n + 1), 2/(n + 1)], [2/(n + 1), 3/(n + 1)], [3/(n + 1), 4/(n + 1)], . . . ,
[(n − 2)/(n + 1), (n − 1)/(n + 1)], [(n − 1)/(n + 1), n/(n + 1)].
10.1 The symbol ≝ describes an equation where the left-hand side is defined by means of the expression on the
right-hand side.
10.2 The members of the inequality are the expressions on the left-hand side, the middle, and on the right-hand side,
separated by the inequality signs.

10.3 The interval [a, b] is the set of points {x : a ≤ x ≤ b}. In case {px} is one of the numbers
2/(n + 1), 3/(n + 1), . . . , (n − 1)/(n + 1),
then {px} belongs to each of the two intervals whose one endpoint is {px}.
Since there are n numbers and n − 1 intervals here, there must be (at least) one among these
intervals to which (at least) two of these numbers belong. That is, there are integers p and q with
1 ≤ p < q ≤ n such that {px} and {qx} belong to the same one among these intervals (see
footnote 10.4). Since the length of each of these intervals is 1/(n + 1), we then must have
|{qx} − {px}| ≤ 1/(n + 1).
Writing r = [px] and s = [qx], we have {px} = px − r and {qx} = qx − s; hence the above inequality
becomes
|(q − p)x − (s − r)| ≤ 1/(n + 1);
here p, q, r, s are integers. Writing l = q − p and k = s − r, this inequality becomes identical
to (10.1), the inequality we wanted to show. As 1 ≤ p < q ≤ n, the inequality 1 ≤ l ≤ n also
follows. This contradicts the assumption that no such k and l exist, which completes the proof.
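Theorem 10.1 only asserts that suitable integers k and l exist, but for a given x and n they are easy to find by trying every l from 1 to n. The sketch below does this; it is a plain search, not the pigeonhole argument of the proof, and the function name is made up for this illustration. The number x is converted to an exact fraction so that the comparison with 1/(n + 1) involves no rounding.

```python
import math
from fractions import Fraction


def approximate(x, n):
    """Return integers (k, l) with 1 <= l <= n and |l*x - k| <= 1/(n + 1)."""
    x = Fraction(x)                  # exact value of the given int, float, or Fraction
    bound = Fraction(1, n + 1)
    for l in range(1, n + 1):
        k = round(l * x)             # the integer nearest to l*x
        if abs(l * x - k) <= bound:
            return k, l
    # By Theorem 10.1 this point is never reached.
    raise AssertionError("no suitable k and l found")


k, l = approximate(math.pi, 10)
print(k, l)  # prints: 22 7
```

With x = π and n = 10, the first denominator that works is l = 7, which gives the familiar fraction 22/7.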
References
[1] Edward B. Burger and Michael Starbird. The Heart of Mathematics: An Invitation to Effective
Thinking. Wiley, Hoboken, NJ, fourth edition, 2012.
[2] G. H. Hardy. A Mathematician’s Apology. Cambridge University Press, Cambridge, 1969.
10.4 Saying that 1 ≤ p < q ≤ n amounts to saying that there are two distinct integers p and q with 1 ≤ p ≤ n and
1 ≤ q ≤ n such that {px} and {qx} belong to the same interval, and the notation is so chosen that the smaller integer
is denoted by p and the larger one by q.