Mathematics, Poetry
and Beauty
Ron Aharoni
Technion, Haifa, Israel
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
Hebrew edition: Matematika, shira veyofi (Hakibbutz Hameuchad Publishing, Tel Aviv, 2008)
Translators for the English edition: Merav Aharoni and Edward Levin
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center,
Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from
the publisher.
Printed in Singapore
Contents
Introduction: Magic
Displacement
Part I: Order
Hidden Order
To Discover or to Invent
Mathematical Harmonies
Why √2 is Not a Rational Number
Independent Events
Compression
Mathematical Ping-Pong
Poetical Ping-Pong
Topology
Matchmaking
Imagination
Symmetry
Impossibility
Twists
Change
Estrangement
This poses a riddle. How can the austere and abstract world of mathe-
matics resemble art? What does geometry have in common with music, or
arithmetic with poetry? One answer is that both mathematics and poetry
search for hidden patterns.
So, this book is about the magic of poetry and of mathematics, and how
close the two are. It is divided into three parts. Part I is about order. We
shall see in it how both fields uncover deep hidden patterns. In Part II we
shall study common techniques of the two fields. Finally, in Part III I will
try to draw conclusions on the concept of beauty.
But first, I want to give a glimpse of what’s to come. I would like to
describe one sleight of hand.
Displacement
[. . .]
My images are
Transparent like windows in a church:
Through them
One can see
How the light of the sky shifts
And how my loves
Fall
Like dying birds.
A six-sided polygon (a hexagon), with a straight line that intersects all its sides. Can you
also draw a seven-sided polygon (a heptagon), with a straight line intersecting all its
sides?
on the side on which it began. That is, it ends on the side opposite Q. But
since the heptagon is closed, it should end at Q. This contradiction means
that the assumption that every side of the heptagon intersects the straight
line is impossible.
After seven crossings, we are on the other side of the river. The polygon is not closed.
pair, followed by a game between the winner of the first round and the player
who was waiting on the sidelines. Let us now skip to a tournament with 10
players. The following diagram depicts a possible course of events. In the
first round 5 games were played; in the second, 2; in the third, 1; and in the
fourth, again 1.
Every so often, an ant reaches one of the pole ends, and then it falls off
and disappears forever.
Question: In the end, will all the ants fall off the pole? If so, how long will
this take?
At first glance, the answer seems to depend on the initial state, that is,
on the number of ants on the pole and their position. If there are many
ants, it seems that it might take a long time for all of them to fall off.
How can we test this? I have already told you the first secret of thinking
mathematically: studying examples. Mathematical thought is a play between
examples and abstractions. The difference between the two is that strokes in
the direction of the concrete can be done consciously, that is, examples can
be evoked deliberately. For this reason one ought to begin with examples.
An additional reason, of course, is that examples are the raw material of
abstraction. In the case of the ants, the simplest example is that of a single
ant. If the ant is at one end of the pole and goes toward the other end, it
will fall off in one minute. In any other case, it will fall in less than a minute.
But we still have not touched upon the core of the problem: the collisions. So
let us look at two ants, located at the opposite ends of the pole, advancing
toward each other.
After half a minute, they will meet in the middle of the pole, reverse
their directions, and fall off in another half a minute. So, both will fall after
exactly one minute.
The next example is a bit less obvious. Imagine one ant starting at the
right end, the other exactly in the middle, and they are advancing toward
each other.
After the collision, ants A and B will go towards each other, meet in the
middle after another quarter of a minute, reverse their directions and each
will fall after another half a minute. Ant C will have fallen from the right end
even before that. Again, it will take a total of one minute for all ants to fall.
Now this is really strange. In all our examples, all the ants fell within a
minute. Does this always hold true? The answer is “yes,” and the proof
is easy. That is, if you have the right insight. Strange as it may seem,
this insight does not add information but ignores information: it ignores
the identity of the ants. If we don’t care who the ants are, then what
happens at the moment two ants meet? Actually, nothing. Before their
meeting, one ant goes to the left, and the other to the right; after their
encounter, the exact same thing happens: then too, one ant proceeds to
the left, and the other to the right, at the same speed. But for our pur-
pose, which ant goes to the left and which goes to the right doesn’t
matter.
So in effect, there are no collisions. They were only there to confuse us.
The problem is completely identical to the problem: “ants are proceeding
along a one meter long pole, each at the speed of one meter per minute,
without colliding and without changing direction. How long will it take for
them to fall?” There is no mystery here. All will fall off in one minute
or less.
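The claim can be checked by simulation. The following Python sketch is an illustration, not part of the original argument. It assumes a pole of length 1 and unit speed, and represents each ant as a hypothetical (position, direction) pair. It compares an event-by-event simulation with collisions against the collision-free shortcut described above, using exact fractions so that meeting points are computed without rounding.

```python
from fractions import Fraction

def time_ignoring_collisions(ants, length=Fraction(1)):
    """Time until the pole is empty if ants simply walk through each other.

    ants: list of (position, direction), direction +1 (right) or -1 (left).
    """
    return max((length - x) if d > 0 else x for x, d in ants)

def time_with_collisions(ants, length=Fraction(1)):
    """Event-driven simulation: two ants that meet head-on both turn around.

    Assumes distinct starting positions, so every meeting involves two ants.
    """
    ants = sorted(ants)          # the left-to-right order never changes
    elapsed = Fraction(0)
    while ants:
        # Time until some ant reaches the end it is walking toward.
        dt = min((length - x) if d > 0 else x for x, d in ants)
        # Time until the earliest head-on meeting of two neighbours.
        for (x1, d1), (x2, d2) in zip(ants, ants[1:]):
            if d1 > 0 and d2 < 0:
                dt = min(dt, (x2 - x1) / 2)
        elapsed += dt
        ants = [(x + d * dt, d) for x, d in ants]
        # Reverse every neighbouring pair that has just met.
        for i in range(len(ants) - 1):
            (x1, d1), (x2, d2) = ants[i], ants[i + 1]
            if x1 == x2 and d1 > 0 and d2 < 0:
                ants[i], ants[i + 1] = (x1, -1), (x2, 1)
        # Remove ants that have walked off either end.
        ants = [(x, d) for x, d in ants
                if not (d < 0 and x <= 0) and not (d > 0 and x >= length)]
    return elapsed

# Two ants starting at opposite ends and walking toward each other.
two_ants = [(Fraction(0), 1), (Fraction(1), -1)]
print(time_with_collisions(two_ants), time_ignoring_collisions(two_ants))
```

Both functions return the same time for any starting configuration with distinct positions, and that time never exceeds the pole length divided by the speed.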
Mathematicians are a lucky breed. They get paid to play. When we take
into account the billions that are invested in mathematical research and edu-
cation, we would expect them to be busy with applied projects. In reality,
most mathematicians allow themselves to indulge in problems like this one.
Why? Because the impractical appearance of this riddle is misleading. In
fact, it is a good example of the discipline’s primary strength: abstraction.
The ants in the problem are mathematical: real ants do not move at a uni-
form speed, and do not obey such simple rules. Mathematics is the study
of systems that follow well defined rules. And the abstraction is even more
evident in the solution, which strips the situation of its details and exposes
its essence.
Ignoring the irrelevant, as in the ants problem, is a primary characteristic
of mathematical thought. Mathematics takes the abstraction process to its
extreme. It takes a complex looking tree, strips it of its leaves, and reveals
the trunk. Think, for example, of the concept of number. The person who
invented the number “4” understood that, as far as the rules of arithmetic
are concerned, it is immaterial if he had 4 stones or 4 pencils, what color
they are, and how they are arranged. 4 stones and 3 stones are 7 stones,
just as 4 pencils and 3 pencils are 7 pencils, and hence we can say abstractly
“4 + 3 = 7.” Abstraction is generalization, and generalization saves effort.
The rule we found for stones will be valid for any kind of objects, and at any
point in time. “Mathematics is being lazy,” said the mathematician George
Polya (1887–1985), “it is letting the principles do the work for you.” In
this respect, the ants question is very practical. Directly, it is not useful for
anything, because there are no ants like these in reality, but it educates the
person who solves it to think abstractly.
It may even be the case that the problem was invented to model a real-world
phenomenon. Bundles of light waves (“solitons”) behave in collisions just
like the ants in the solution: they pass through one another.
Hidden Order
Nature does nothing in vain, and more is in vain when less will serve;
for Nature is pleased with simplicity.
Isaac Newton
A good concept is like a path that suddenly opens before you in a dark
forest. A minute ago the thicket seemed impenetrable; but from the moment
the path was revealed, the way stands open. The English mathematician
Andrew Wiles, who solved the famous Fermat’s Conjecture, used another
metaphor. A good concept is like a light switch, that you find as you feel
your way in a dark castle. When you turn on the light, you know what is
in the room you are in. In the next room you will have to look for another
switch.
Here is a classic example, a problem composed in 1946 by Max Black, a
British philosopher and mathematician. Take an 8 × 8 board of 64 squares, and
cut out the lower left-hand corner and the top right-hand corner, like this:
You also have 31 dominoes, each of which can cover two adjacent squares
of the board. All together, they can cover 62 squares, which is the number of
squares on the incised board. Can the board be covered with these dominoes?
You may have guessed the first step: look at small cases, even very small.
The smallest possible example is a 2 × 2 board. After the removal of two
opposing squares, we get:
Of course, this shape cannot be covered with a single domino. Now try
a 4 × 4 board (We are skipping the case of a 3 × 3 board, since it contains
9 squares, and removing 2 leaves 7 squares, which is an odd number. An
odd number of squares cannot be covered without overlapping, since each
domino covers 2 squares). A bit of experimentation will convince you that
this is impossible.
In a 2 × 2 or 4 × 4 board it is easy to check all possibilities. In an 8 × 8
board this would be impractical: there are too many possibilities. We need
an idea. And the concept that hits the mark is coloring the squares black
and white.
Things now fall into place. Each of the 31 dominoes will cover one black
square and one white square. Since the squares we removed from the board
are both white, there are 32 black squares left, and only 30 white squares.
These cannot be covered by 31 dominoes that are supposed to cover 31
squares of each color.
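The counting argument, and the small cases mentioned earlier, can be checked mechanically. The Python sketch below is illustrative only. Its function names are hypothetical, and calling the squares with an even row-plus-column sum "white" is an assumption made to match the text. It counts colours on the cut board and, for the small boards, tries every domino placement.

```python
def mutilated_board(n):
    """Squares (row, column) of an n x n board with two opposite corners removed."""
    full = {(r, c) for r in range(n) for c in range(n)}
    return full - {(0, 0), (n - 1, n - 1)}

def colour_counts(squares):
    """Return (white, black) counts; a square is 'white' when row + column is even."""
    white = sum((r + c) % 2 == 0 for r, c in squares)
    return white, len(squares) - white

def can_tile(squares):
    """Exhaustive search: can dominoes cover exactly this set of squares?"""
    if not squares:
        return True
    r, c = min(squares)  # first uncovered square in row-major order
    # Its upper and left neighbours are already covered, so its domino
    # must extend to the right or downward.
    for other in ((r, c + 1), (r + 1, c)):
        if other in squares and can_tile(squares - {(r, c), other}):
            return True
    return False

print(colour_counts(mutilated_board(8)))   # (30, 32)
print(can_tile(mutilated_board(4)))        # False
```

The exhaustive search is practical only for small boards, which is exactly why the colouring idea is needed for the 8 × 8 case.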
The chessboard coloring revealed a concealed pattern. Emerging by
magic, as if from nowhere, it made things simple and clear.
Here is another example of the power of concepts. A chocolate bar measures
5 squares long and 4 wide. We want to divide the 20 squares among
20 children, so we must break the bar into its squares. The rule is that at
each step we may take one of the pieces we have at hand, and break it
along a single straight line. What strategy should we follow to apply as few
breakings as possible?
As usual, the first step is to consider simpler examples, which in this case
means having smaller dimensions. For example, a 3 × 1 chocolate bar divided
among 3 children. Here we have no choice: 2 breakings are necessary. Let us
move on to a slightly bigger example: a bar of size 3 × 2, to be divided among
6 children. One way would be to first separate the two rows of 3 blocks apiece
by a single breaking. After this, we need another 2 breakings for each of the
2 rows, for a total of 5 breakings.
One lengthwise breaking, and 2 more in each row, for a total of 5 breakings.
The different ways lead to the same result: a chocolate bar of n blocks
needs n−1 breakings. Namely, the strategy has no effect on the number
of breakings. We cannot separate the individual blocks with fewer than n − 1
breakings, nor with more than n − 1. Why is this so? Here, again, a correct
concept makes things simple. This is the number of pieces obtained
after each breaking. At the start the number of pieces is 1 — there is a
single bar. Each breaking turns one piece into two, thereby increasing the
number of pieces by 1. At the end of this process, there are n pieces. In order
to get from 1 piece to n pieces, exactly n − 1 breakings are therefore needed.
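That the strategy makes no difference can also be seen experimentally. The Python sketch below is an illustration under stated assumptions: the function name and the random choice of which piece to break, and where, are hypothetical and not from the text. It breaks a bar completely at random and counts the breakings.

```python
import random

def count_breaks(rows, cols, rng):
    """Break a rows x cols bar into single squares, choosing every break at random."""
    pieces = [(rows, cols)]
    breaks = 0
    while any(r * c > 1 for r, c in pieces):
        # Pick any piece that can still be broken.
        i = rng.choice([k for k, (r, c) in enumerate(pieces) if r * c > 1])
        r, c = pieces.pop(i)
        # Every straight line between rows or between columns is a legal break.
        cuts = [("row", k) for k in range(1, r)] + [("col", k) for k in range(1, c)]
        kind, k = rng.choice(cuts)
        if kind == "row":
            pieces += [(k, c), (r - k, c)]
        else:
            pieces += [(r, k), (r, c - k)]
        breaks += 1
    return breaks

# Ten different random strategies on the 5 x 4 bar all give the same count.
print({count_breaks(5, 4, random.Random(seed)) for seed in range(10)})   # {19}
```

Each pass through the loop replaces one piece by two, so the count is forced to be one less than the number of squares, whatever choices are made.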
Platonism
Don’t go around saying the world owes you a living; the world
owes you nothing; it was here first.
extreme. Plato argued that the concept of the table is more real than the
table itself). A bitter row is supposed to exist between Platonists and anti-
Platonists. In practice this is not the case. The twentieth-century American
mathematician Ralph Boas claimed that he had never met a mathematician
who was not a Platonist. Almost all mathematicians believe in the reality
of their objects. Numbers, geometric shapes, functions, evenness of num-
bers — these are all part of the actual world. Mathematics is discovery, not
invention. Mathematics reveals order that is out there in the world. A concept
is nothing more than a mirror image in our brain of a pattern in reality.
A mathematician is more of a photographer than a sculptor.
the tragically early (age 31) death of Franz Schubert was a greater loss to
humanity than the even earlier death of Evariste Galois, the French
mathematical genius, at around the same time — early 19th century. The
discoveries that would have been made by Galois, had he lived to an old
age, have long since been made, while with the death of Schubert we lost
unimaginable treasures of beauty.
Carl Friedrich Gauss (Germany, 1777–1855), the greatest mathematician of the 19th
century. He contributed to the theory of complex numbers, number theory and modern
algebra. Together with the physicist Wilhelm Eduard Weber he built the first telegraph.
He spent his later years in seclusion in the observatory in Göttingen, and published very
little. The biographer Eric Bell estimated that if all of his discoveries had been published
in his lifetime, mathematics would have progressed by fifty years.
Saving energy
When everything falls into its proper place, we say “Everything worked out
beautifully.” Why? Recognizing order is useful. It saves effort in coping with
the world. But why should it cause aesthetic pleasure?
To answer this, we must first realize that it is not mere order. Order
by itself is not necessarily beautiful. Nothing is more orderly than a blank
sheet of paper, and no combination of sounds is more orderly than absolute
silence. Nonetheless, a blank sheet of paper is not a work of art, and silence
does not possess the beauty of a Mozart symphony. A monotone series of
beats is orderly and predictable, but it does not constitute music. In order
to create a sensation of beauty, we need something beyond order.
The secret lies in a concept proposed in the second half of the 19th cen-
tury: saving mental energy. The industrial revolution in England led to the
idea that machines can replace not only muscle work, but also mental work. This
led to the invention of the first computer, by Charles Babbage, and also
to a mechanical perception of the human mind. One proponent of this was
Herbert Spencer, who claimed that the mind, like other systems in the world,
seeks a state of minimal energy. Young Freud adopted this approach whole-
heartedly, and in the 1890s, when he was still taking first tentative steps
in psychoanalysis, he wrote a draft of a thick book entitled Physiology for
Psychologists, in which he tried to explain mental phenomena in terms bor-
rowed from the physical world of his time. Freud championed the Spencerian
idea that the psyche tries to reduce effort as much as possible, that is, to
save energy.
Like many before and after him, Freud quickly learned that psychological
terms that are effective as metaphors soon become useless when used con-
cretely. The concept of “saving energy” is too general to predict the behav-
ior of human beings. As a result, Physiology for Psychologists was shelved
around the year 1895, but echoes of it would reverberate throughout Freud’s
writings. The idea of saving energy was expressed most clearly in a book he
wrote in 1905 on humor, Jokes and Their Relation to the Unconscious. The
book’s thesis was that the pleasure we derive from a joke results from sav-
ing the energy of repression. The joke enables us to enjoy forbidden things
without having to repress them. Consequently, energy that was prepared to
repress the forbidden idea is unnecessary, and is transformed into pleasure.
Not much revelation came out of this book, as far as humor is concerned.
Freud himself was not happy with it, referring to it in later years as a needless
deviation from his main course. But the idea of saving energy caught on,
especially with respect to art. The idea is that a work of art disguises itself
as chaotic, demanding preparation of energy to tackle it, and then hidden
order is revealed, which means that the energy prepared can be saved. And
saved energy entails pleasure. In the same way, when we discover that we have
won a battle we feel pleasure, because we no longer need the energy we prepared
for the struggle. The sensation of beauty, by this approach, arises when order
is suddenly revealed in disorder.
Music is one area in which this explanation works beautifully. In order
for music to be enjoyable, it has to be complex. It must seem to be disor-
ganized noise, and then to be realized as ordered. We constantly attempt
to decipher the stimuli that arrive from the outside world, and so we pre-
pare energy to organize noise. If we then discover order in the noise, this
energy is saved. Links between the sounds are revealed, which enable us
to predict what is coming. This happens in two dimensions: rhythm and
harmony. Rhythm is the organization in time, and harmony the connection
between the frequencies of the notes. In the next chapter, I will explain a bit
about both.
If the music is complex enough, these links are not straightforward, and
cannot be perceived consciously. This means that on the conscious level
we do not fully understand the order in the musical work. There is a gap
between the perceived lack of order and the hidden order that is uncon-
sciously revealed. And this gap, between what we consciously observe and
the unconscious perception, is the source of beauty.
Mathematical Harmonies
Pythagoras
And what about the second element of music, harmony? This is more of a
puzzle. We all know that some combinations of notes are pleasing to the
ear, while others are less so. For example, a C note sounds well with the
C one octave higher. Actually, when hearing them together we can hardly
distinguish between them. The C-G and C-E combinations, as well, sound
well together. The notes C, E, G form the basic chord of the C major scale,
the scale whose notes are played on the white piano keys. A composition
in C major frequently begins with the notes C, E, G in some order, strays
and wanders about, before finally returning to them. Music is built on the
tension between the digressions and the original harmony.
But why is one combination of notes pleasing, while another grates on
our ears? Surprisingly enough, the answer to this question is mathematical,
and it was discovered by one of the most fascinating figures in the history
of mathematics, Pythagoras. He was the founder and leader of a most rare
entity: a mathematical cult. The cult numbered about 600 men and women,
who lived in the Greek colony of Crotona in the south of the Apennine
peninsula, in the “heel” of the Italian boot. They donated all their posses-
sions to the community, and swore to keep their discoveries secret. Legend
has it that one day Pythagoras was passing by a blacksmith’s workshop, and
realized that when the blacksmith struck rods with a simple ratio between
their lengths (for example, one twice as long as the other, or one
and a half times as long), the combination of the two sounds was pleasing
to the ear.
In modern terminology, two sounds sound well together if the ratio
between their frequencies is simple, that is, it is expressed by small num-
bers (for example 3:2 is simpler than 11:5). The frequency of a sound is the
number of times per second the air vibrates when the sound is produced,
or in more precise language: the number of peaks per second of the sound
waves. If the note is produced by a string, this is the number of vibrations per
second of the string. A difference of a single octave between notes (like that
between a C and the C above) means a ratio of 2 between their frequencies:
the frequency of a high C is twice that of the C below. The frequency of
the note G, the fifth in the octave (when beginning with C) is 3/2 times
that of the low C of that octave. In other words, for every 2 vibrations of
the C, there are 3 vibrations of the G. The ratio between the frequencies
of E and C is 5:4, again quite simple. This is why C, E, and G sound well
together.
Helmholtz
would have to pass before this question could be answered. The enigma was
solved by the German Hermann von Helmholtz (1821–1894), a true Renais-
sance man: a mathematician, physicist and physiologist, who also studied
aesthetics. His explanation was based on the phenomenon of “overtones.”
When a string vibrates at a certain frequency, it also vibrates, at the same
time, at frequencies 2, 3, 4, . . . times higher. The overtones are weaker the
further they are from the original tone, namely the higher the ratio is to the
original frequency, but they are audible. In other words, when the note C is
played, most times we will also hear the C of an octave higher, with a frequency
exactly double, and also the G in the higher octave, whose frequency
is three times that of the original C. A simple ratio between two frequencies
means that they share overtones. For example: the C and the G in the same
octave share the G of one octave higher. Hearing these notes together, we
reveal hidden order. The notes are different, but unconsciously we find a
factor common to both. Instead of chaos, order emerges.
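The link between simple ratios and shared overtones can be made concrete. The Python sketch below is illustrative. It takes the lower note to have frequency 1 in arbitrary units and looks only at the first 12 overtones of each note; both choices are assumptions made here, as are the function names.

```python
from fractions import Fraction

def overtones(base, count):
    """The first `count` frequencies at which a string tuned to `base` vibrates."""
    return {k * base for k in range(1, count + 1)}

def shared_overtones(ratio, count=12):
    """Overtones common to a low note (frequency 1) and a note `ratio` times higher."""
    low = overtones(Fraction(1), count)
    high = overtones(Fraction(ratio), count)
    return sorted(low & high)

# C with G (ratio 3:2), C with E (ratio 5:4), and a less simple ratio, 11:5.
for ratio in (Fraction(3, 2), Fraction(5, 4), Fraction(11, 5)):
    print(ratio, [int(f) for f in shared_overtones(ratio)])
```

For the ratio 3:2 the first shared frequency is 3, which is three times the C and twice the G: the G of one octave higher. For 11:5 the first coincidence appears only at the eleventh overtone, where the sound is much weaker.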
Does this explain in full the pleasure people derive from music? Of course
not. It does not explain why music can be so moving. It does not touch
upon the emotions aroused by music. It only relates to a pleasure that can
be classified as intellectual. But it is a good start.
Mystical numbers
All this was beyond the knowledge of the ancient Greeks, who knew nothing
of frequencies. When people don’t know, they fantasize. In order to explain
harmony, Pythagoras and his school invented fanciful theories of the magical
powers of numbers and the ratios between them. “All is number” was their
strange slogan. That is, the world is ruled by simple numerical ratios. The
Pythagoreans believed that every important natural phenomenon has to
obey numerical laws. They maintained that there are simple ratios between
the diameters of the planetary orbits, and that the planets consequently
emanate “celestial music.” And they went far beyond that. They claimed
that every size in the world that is of any significance can be expressed as a
ratio between whole numbers.
A number that is the quotient of two whole numbers is called a “rational
number” (from the word “ratio”). Every whole number is rational: 4, for
example, is rational because it is the ratio between itself and 1, that is,
4:1 = 4. Every fraction with a whole-number numerator and denominator is
rational, because the fraction bar is actually a division sign: 17/3 is the ratio
between 17 and 3. So, the Pythagoreans believed that important quantities
in nature are rational.
Sobering up
Babylonians studied numbers before, but they did so for practical ends. The
Greeks were the first to see numbers as a world worthy to be explored for
its beauty and inner harmony.
But even within the Greeks’ achievements, geometry enjoys a special
pride of place. It was in this field that the Greeks developed the concepts
of “axiom” and “proof,” and it was here that they reached the highest level
of abstraction. Pythagoras was one of the founders of Greek geometry. The
theorem that, to this day, is regarded (and rightly so) as the most important
and useful geometric theorem, is named after him, though in fact he was
not its discoverer. The theorem states that the sum of the areas of the
two squares based on the legs of a right triangle equals the area of the
square based on the hypotenuse. This is important because it enables us
to calculate distances. Given the lengths of the legs of a right triangle, we
can calculate the length of the hypotenuse. This means that knowing how to
calculate east-west and north-south distances, you can calculate the distance
between any two points.
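As a small illustration of this use of the theorem, the Python sketch below computes the distance between two points from their east-west and north-south separations. Representing a point as an (east, north) pair is an assumption made for the example.

```python
import math

def distance(p, q):
    """Straight-line distance between two points given as (east, north) pairs."""
    east = q[0] - p[0]     # east-west leg of the right triangle
    north = q[1] - p[1]    # north-south leg of the right triangle
    return math.sqrt(east ** 2 + north ** 2)

print(distance((0, 0), (3, 4)))   # 5.0
```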
The Pythagorean theorem: the sum of the areas of the two squares on the legs of a right
triangle equals the area of the square on the hypotenuse.
The area of the square based on the diagonal (with vertical lines) is twice the area of the
small square (with horizontal lines), because it contains 4 triangles, while the small
square contains only 2 triangles. The side of the larger square is therefore √2 times the length
of the side of the small square.
Let’s assume that the length of the side of the small square (with
horizontal lines) is 1. The area of this square is therefore 1 × 1, that is, 1.
The large square, that is on a diagonal (and marked with vertical lines), is
composed of 4 triangles while the small square contains only 2 (all of the
triangles are congruent, that is, they are capable of perfectly fitting one over
the other). Therefore, the area of the large square is twice that of the small
square, which means that it is 2. The length of any square’s side is the square
root of its area, and so the length of the large square’s side is √2. But look:
the side of the large square is the diagonal of the small square! Therefore,
this diagonal is √2 long.
Geometry had a special place in the minds and hearts of the
Pythagoreans, and for them the diagonal of a square was an everyday object.
Therefore, they believed that the length of this diagonal should be rational.
For many years they tried to find what ratio it is. It is close to 7/5, but it is not
quite that, because the square of 7/5 is 49/25, which is almost 2, but not quite.
Eventually they had to realize the bitter truth: that √2 is not rational. This
was such a severe blow that they vowed to keep it a secret. Due to the sect’s
secretiveness, not much is known about it for certain and the continuation
of the story might very well be spurious. But legend has it that Hippasus,
a sect member who revealed the secret to the world, was put to death for
doing so. This is almost certainly apocryphal. Hippasus drowned, and his
death might very well have been an accident. But the sect attributed it to
punishment by the gods.
Why √2 is Not a Rational Number
Actually, why isn’t √2 a rational number? Why can’t it be expressed as m/n for
some whole numbers m and n? To see this, assume that m/n = √2. We shall
show that this assumption leads to a contradiction. First, we can assume
that m/n is a reduced fraction, namely the numerator and the denominator
are not divisible by the same number greater than 1. If not, we just reduce
it, namely divide by the common divisor. By the definition of √2, the fact
that m/n = √2 means that m²/n² = 2. If we multiply both sides by n², we get:
(∗) 2n² = m²
Now, we will divide our discussion into two cases: in one case, m is an
odd number, and in the other case, it is even. Each of these two cases will
lead to the desired contradiction. If m is an odd number, the right side of (∗)
is the square of an odd number, so it is an odd number (the product of two
odd numbers is odd), while the left side is a multiple of 2, and is therefore
an even number. Since an odd number cannot be equal to an even one, the
right side cannot be equal to the left. (As an example of this case, if √2 = 7/5,
then 2 = 7²/5², meaning that 2 × 5² = 7². Then the left side, 50, is even, while
the right side, 49, is odd.)
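The two cases can be laid out side by side. The derivation below is a sketch in LaTeX notation. The symbol k (half of m in the even case) is introduced here for illustration, and the even case follows the standard textbook argument rather than wording taken from this text.

```latex
\begin{align*}
\frac{m}{n} = \sqrt{2}
  &\;\Longrightarrow\; \frac{m^2}{n^2} = 2
   \;\Longrightarrow\; 2n^2 = m^2 . \tag{$*$}\\[4pt]
\text{Case 1: } m \text{ odd}
  &\;\Longrightarrow\; m^2 \text{ is odd, while } 2n^2 \text{ is even: a contradiction.}\\[4pt]
\text{Case 2: } m = 2k \text{ even}
  &\;\Longrightarrow\; 2n^2 = 4k^2
   \;\Longrightarrow\; n^2 = 2k^2
   \;\Longrightarrow\; n^2 \text{ is even}
   \;\Longrightarrow\; n \text{ is even.}
\end{align*}
```

In the second case both m and n are divisible by 2, which contradicts the assumption that m/n is a reduced fraction.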
The Real Numbers
It may take a long time to realize the full importance of a discovery. With
hindsight, the discovery of the existence of irrational numbers was a turning
point in the history of mathematics. If √2 is not the quotient of two integers,
then just how can we describe it? As we saw, the usual way is as an infinite
decimal fraction, which is the limit of an infinite sequence of numbers, whose
squares get closer and closer to 2. This was the gateway to the fundamental
concept of the limit, the cornerstone of the infinitesimal calculus.
The square root of 2 was not alone for long. It was soon joined by additional
irrational numbers. The Greeks realized that if the root of a whole
number is not itself an integer (another word for “whole number”), then it
is irrational. The roots √4, which is 2, or √9, which is 3, are integers, and
therefore rational. But √3, √5, √6 and so on, are not. Nor is √2 + 1 rational:
if it were rational, then, since (√2 + 1) − 1 = √2, the number √2 would
be the difference between two rational numbers, and hence rational itself,
which we know it isn’t.
The conclusion is that there are infinitely many irrational numbers. In
fact, they are so numerous that they are “dense,” in the sense that there
is an irrational number between any two distinct numbers. The rational
numbers look like a sieve, whose holes are the irrational numbers. An even
more startling discovery was made at the end of the nineteenth century
by Georg Cantor: that the holes are the majority. There are more irra-
tional numbers than rational numbers. Among numbers, as among humans,
rationality is rare.
The rational and the irrational numbers together are called “real
numbers.” Of course, this is not a proper definition of the term. It is like
defining “living creatures” as “humans or nonhumans,” which does not tell
us what is a “nonhuman living being.” The real numbers were defined pre-
cisely only at the end of the nineteenth century, which was an era of tran-
sition from fuzziness to rigor, mathematical intuitions being supplemented
by precise definitions and proofs. In those years the principles of differential
and integral calculus were given explicit and accurate definitions, clear-cut
axioms were written for the natural numbers, and David Hilbert completed
the work that Euclid had left unfinished for 2,000 years: the writing of precise
axioms for plane geometry.
Two mathematicians, Richard Dedekind and Georg Cantor, were
responsible for the rigorous definition of the real numbers. Their definitions
provided the justification for the way in which these numbers had been
presented beginning in the sixteenth century, that is, as infinite decimals.
The number π, for example, which is the ratio of the circumference of a circle
to its diameter, is written as 3.141592 . . . , going on ad infinitum. What this
means is that the numbers 3, 3.1, 3.14, 3.141, . . . get nearer and nearer to π.
There is something special in the decimal expansion of rational numbers.
Everybody knows, for example, that 1/3 = 0.333 . . . , the 3s repeating forever.
This is true of every rational number. The decimal expansion of a rational
number keeps repeating from some point on, as in 2.4131313 . . . , in which
13 repeats indefinitely.
Indeed, what is this number, written as a fraction? There is a simple trick
that does the job. It uses a fact that surprises many people and exasperates
others, that 1 = 0.999 . . . . In order to understand why this equality is true,
we must first understand what 0.999 . . . is: it is the limit of the sequence 0.9,
0.99, 0.999, . . . . These numbers approach 1 because their distances from 1
are 1/10, 1/100, 1/1000, . . . , numbers that tend to zero.
Knowing that 0.999 . . . = 1, we can calculate 0.333 . . . . Since dividing a
number by 1 does not change it, we can write:
0.333 . . . = (0.333 . . .)/(0.999 . . .). In the last
quotient, every 3 in the numerator is matched by a 9 in the denominator.
Accordingly, the denominator is 3 times as large as the numerator (when I
explain this to children, I tell them about two brothers: for every amount of
money that one receives, the other receives three times as much. At the end
of the day, the second brother will have three times as much money as the
first). This means that the fraction is equal to 1/3.
Let us now look at 2.4131313 . . . . The number 0.0131313 . . . is
one-tenth of 0.131313 . . . . We can write
0.131313 . . . = (0.1313 . . .)/(0.9999 . . .) = 13/99. (When
the numerator “receives” 13 one-hundredths, the denominator “receives” 99
one-hundredths; when the numerator “receives” 13 ten-thousandths, then
the denominator receives 99 ten-thousandths, and so on. Therefore, the
numerator is 13/99 as big as the denominator.) Summarizing,
2.4131313 . . . = 2.4 + 1/10 × 13/99 = 2 + 409/990 = 2389/990, a fraction.
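The trick can be checked mechanically. Here is a minimal sketch in Python, assuming a hypothetical helper `repeating_to_fraction` (a name chosen for this illustration only). It builds the fraction the same way as above: the repeating block goes over a run of 9s and is then shifted right past the digits that do not repeat.

```python
from fractions import Fraction

def repeating_to_fraction(integer_part, prefix, block):
    """Exact value of integer_part.prefix(block)(block)(block)...

    Hypothetical helper for illustration. All arguments are digit strings:
    `prefix` holds the digits after the decimal point that do not repeat,
    `block` holds the digits that repeat forever.
    """
    # The non-repeating part: "2" and "4" give 2.4 = 24/10.
    head = Fraction(int(integer_part + prefix), 10 ** len(prefix))
    # 0.(block)(block)... equals block / 99...9, as in 0.1313... = 13/99.
    tail = Fraction(int(block), 10 ** len(block) - 1)
    # The repetition starts only after the prefix, so shift it to the right.
    return head + tail / 10 ** len(prefix)

print(repeating_to_fraction("2", "4", "13"))  # 2389/990
print(repeating_to_fraction("0", "", "3"))    # 1/3
print(repeating_to_fraction("0", "", "9"))    # 1
```

The last line is the equality 0.999 . . . = 1 again: a block of 9s over a run of 9s of the same length.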
The other direction is also true: every rational number can be written as
a recurring decimal fraction. This is proved, simply, by dividing the
numerator by the denominator. 7/3, for example, is the result of dividing 7 by 3,
and performing the division we get 2.333 . . . . It isn’t difficult to show that,
when dividing one integer by another, the numerals repeat themselves beginning
at a certain point.
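The reason the numerals must repeat is that long division only ever produces remainders smaller than the denominator, so some remainder has to come back, and from then on the digits cycle. The following sketch (the function name `decimal_expansion` is an assumption made for this example) carries out the division for a positive fraction and records where each remainder first appeared.

```python
def decimal_expansion(numerator, denominator):
    """Long division of a positive fraction.

    Returns three digit strings: the integer part, the digits after the
    point that do not repeat, and the block of digits that repeats forever.
    """
    integer_part, remainder = divmod(numerator, denominator)
    digits = []
    seen = {}  # remainder -> position of the digit it produced
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))
    if remainder == 0:
        # The division terminated; from here on the digit 0 repeats.
        return str(integer_part), "".join(digits), "0"
    start = seen[remainder]  # the cycle begins where this remainder first occurred
    return str(integer_part), "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(7, 3))       # ('2', '', '3')
print(decimal_expansion(2389, 990))  # ('2', '4', '13')
print(decimal_expansion(1, 4))       # ('0', '25', '0')
```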
This implies, for example, that the number 0.101001000100001. . . is not
rational, since it is not recurring. This is also a (somewhat vague) indication
that there are more irrational numbers than rational ones: recurrence is a
rare phenomenon, and “most” of the decimal fractions are nonrecurring.
The Miracle of Order
Isaac Newton
Einstein told us that even if we know something about the order that rules
the universe, we will never understand why it is there at all. It seems as
if nature built for us a castle, whose treasures we reveal bit by bit. And
we shall always be like children playing with shells on the shore of the sea,
the depths of which we shall never fathom. But even more surprising
than the existence of order is the fact that it is expressed by mathematical
formulas. More than that — by the most advanced mathematical theories of
the day. The great number theorist Godfrey Harold Hardy (1877–1947) was
an avowed pacifist. His main research partner, John Edensor Littlewood,
spent World War I developing artillery. This may be why when, towards
the end of his life, Hardy summed up his experience as a mathematician in
his book A Mathematician’s Apology, he took comfort in the fact that none of
his discoveries had ever had any use, certainly not for military applications.
Not much later Hardy’s theories played a role in encryption theory, and now
they are indirectly applied in parts of computer science.
The annals of mathematics are replete with such examples. Today’s
esoteric fields are tomorrow’s basic scientific tools. A famous example: in
The four sections of a cone cut by planes (from top to bottom): circle, ellipse, parabola,
and hyperbola.
There are more things in heaven and earth, Horatio, than are
dreamt of in your philosophy.
Hamlet, in Hamlet by William Shakespeare, Act I
Yes, but there are also things in philosophy that have never
been dreamt of in heaven and earth.
Georg Christoph Lichtenberg, mathematician and satirist, 1742–1799
Mathematics describes the world, but there are also many things in math-
ematics that have never been dreamt of in the world. From the moment of
the invention of a mathematical concept, it has a life of its own. In fact,
most mathematical problems do not emerge from real life problems, but
from other mathematical problems. Questions gain their right to exist by
relating to earlier concepts. But then, the offshoots often return and join the
main river. Scientists suddenly realize that they need them, in spite of their
purely theoretical appearance.
My work always tried to unite the truth with beauty, but when
I had to choose one or the other, I usually chose beauty.
Hermann Weyl, German mathematician, 1885–1955
One of the characteristics most peculiar to mathematics is its conjectures.
These strange creatures are the driving force and the holy grails (there are
many of them) of mathematics. Conjectures may survive for hundreds of
years before being settled. And strangely enough, most of them are
eventually proved, rather than refuted. How do mathematicians have the hunch
that a fact should be true? The surprising answer is that the best crite-
rion is aesthetics. Mathematicians believe a conjecture when they feel it is
beautiful.
Godfrey Hardy, the English mathematician already mentioned in this
book, received a letter in 1913 from a poor Indian clerk named Srinivasa
Ramanujan. The letter contained a collection of identities in number the-
ory. Hardy could prove some of them, but many he could not. He believed
they were true, because they looked so elegant. Ramanujan himself could
not explicitly prove some of them, but merely “dreamt” them. Hardy, who
realized that the young Indian was one of the great mathematical geniuses
of all time, invited him to England, where the two worked together for a few
years. Sadly, Ramanujan could not withstand the English climate and being
away from home. His health, which had not been good to start with, rapidly
deteriorated. He died in 1920, after having returned to India.
In the process of proving a mathematical conjecture, it frequently looks
as if the blanket is too short: if you pull to one side, the other will not be
covered. But if the hypothesis is beautiful, the mathematician believes that
the deep order behind it will act in his favor, and that he will uncover its
underlying logic. And if not he himself, then those who come after him. It
seems that the goddess of mathematics is on the side of beauty — the more
beautiful the conjecture is, the better its chances of being correct. Beauty is
the guide to truth because it expresses an unconscious perception of order.
When everything falls into place, there must be an intrinsic reason.
Simple Conjectures, Complex Proofs
number will fit in the carton? This problem has two natural solutions. One is
to line the bottom of the carton with oranges, arranged in straight horizontal
and vertical rows, like this:
A seemingly economical packing. The next layer will sit in the spaces between the
oranges in the first layer.
Over the first layer we will now place the second layer, fitting the oranges
into the “holes” between every adjacent quadruple of oranges. The third
layer will fit into the holes between quadruples of oranges in the second
layer, and so on.
In the second natural solution, the oranges on the bottom of the carton
are arranged as a honeycomb, so that each orange is at the center of the six
oranges that surround it. Then, the holes between the oranges are filled, as
in the previous solution. The second layer, and succeeding layers, all have
this same honeycomb structure.
Which of the two packing methods is more efficient? We are in store for
a surprise. The two methods, seemingly so different, are actually identical.
In the straight rows pattern, there are inclined planes arranged as a hon-
eycomb, and in the honeycomb pattern there are inclined planes arranged
in straight rows and columns. This is manifest when we build a pyramid
with a square base, with straight rows and columns packing. The illus-
tration below shows that there is a honeycomb packing at the side of the
pyramid.
The base of the pyramid is a square, in which the balls are arranged in straight rows in
both directions. If we look at the face of the pyramid, we see balls arranged as a hexagon
around a central ball — the honeycomb packing.
The fact that the two most natural packing methods coincide suggests
that this is indeed the most efficient packing. Johannes Kepler, whose name
was already mentioned in connection with the conic sections, surmised that
this is indeed the case. This very natural proposition, stated in 1611, waited
almost 400 years to be proved. As with many other famous conjectures, many
incorrect proofs were offered for it over the course of time. A proof that was
accepted as correct
was found only in 1998 by the American Thomas Hales. Seven more years
would pass until the mathematical community agreed on the correctness of
the proof. The reason was the proof’s extensive use of computers, checking
details that are too complex to be done with pen and paper. The length of
the written part of the proof is also formidable: some 250 pages!
The basic requirement of a political map is that any two adjoining coun-
tries are colored differently, in order to be distinguishable from one another.
The more colors a mapmaker has at his disposal, the easier it is for him
to meet this requirement. For example, if the number of colors equals the
number of countries, no special effort is needed — each country has its
own color.
In 1852 the English mathematician Francis Guthrie noted that four colors
suffice for a proper coloring of the map of England’s counties. As a mathe-
matician (or as a poet) this prompted him to generalize. Is it not the case for
every map? Can’t every map be colored by just four colors? This problem
gained immediate publicity, and also an endless number of incorrect solu-
tions. The most famous of these was by Alfred Kempe in 1879. Unlike other
solutions, much time would pass before Kempe’s error was discovered; in the
meantime, partly thanks to his false solution, Kempe was elected a fellow of
the British Royal Society. After 11 years, it transpired that he had proved
less than he had claimed: that it was possible to color any map in five colors.
Election to the Royal Society is for life, and his membership remained in
force.
From then until 1976, when the theorem was finally proved, it was the
fate of every mathematician in the relevant field, combinatorics — which
happens to be my own field — to receive false proofs from amateurs who
tried their luck. When a proof was finally discovered, by the Americans
Kenneth Appel and Wolfgang Haken, it became apparent to all that there
was a good reason for the elusiveness. Not only was the proof long and
complex; like Hales’ proof for the Kepler theorem, it made extensive use of
the computer to check more than a thousand special cases. The proof has
been somewhat simplified since then, but to this very day there is still no
proof that does not rely on computers.
I cannot tell you much about the solution, but, as compensation, let
me tell you a more modest proposition, whose proof is easy. Assume that
the map is drawn in a special way, by adding one circle at a time, such as the
left-hand drawing on this page. The circles cut the world into “countries.”
In this case, you do not need four colors. Two suffice, as in the right-hand
drawing:
The map on the left is special: The borders are generated by circles. Such a map can be
colored with only two colors, as in the example on the right.
The simplest proof of this employs the concept of evenness and oddness
(once again, we see how useful this concept is!). Color each country lying
within an odd number of circles red, and color each country contained in
an even number of circles blue. In particular, the surrounding area (the
“sea”), that is contained in zero circles, is colored blue: zero is an even
number.
Let us show that this coloration fits the bill. That is, every two adjacent
countries are colored differently. Look at one country (call it A). Assume, as
an example, that A is contained in 5 circles. Since 5 is an odd number, by our
coloration rule, A is colored red. We have to prove that any country (call it
B) adjoining A is colored blue. On our map, crossing a border between two
countries means entering or leaving a circle. If to go from A to B we leave a
circle, then B is located within 4 circles (one fewer than A), and therefore is
colored blue (since 4 is even). If to go from A to B we enter a circle, B lies
within 6 circles. Since 6 is even, B has to be colored blue, which is just what
we had to show. You can easily convince yourselves that there is nothing
special about the number 5. The argument is valid for every number.
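The coloring rule is easy to try out. The sketch below assumes a hypothetical map made of three overlapping circles, each given as (center x, center y, radius); the circles and the sample points are invented for illustration. A point is colored by counting the circles that contain it, and stepping across a single border changes that count by one, which flips the color.

```python
import math

def color(point, circles):
    """Parity rule: 'red' inside an odd number of circles, 'blue' inside an
    even number (including zero circles, which is the surrounding "sea")."""
    x, y = point
    inside = sum(1 for (cx, cy, r) in circles if math.hypot(x - cx, y - cy) < r)
    return "red" if inside % 2 == 1 else "blue"

# Hypothetical map: three overlapping circles of radius 2.
circles = [(0.0, 0.0, 2.0), (1.5, 0.0, 2.0), (0.75, 1.5, 2.0)]

print(color((10.0, 10.0), circles))  # blue: the sea lies in zero circles
print(color((0.75, 0.5), circles))   # red: this point is inside all three

# Cross the border of the first circle at its leftmost point (-2, 0).
just_inside = (-1.99, 0.0)
just_outside = (-2.01, 0.0)
print(color(just_inside, circles), color(just_outside, circles))  # red blue
```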
100, 50, 25, 76, 38, 19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13,
40, 20, 10, 5, 16, 8, 4, 2, 1
but seems plausible, is that after every descent there still is a 50 percent
chance to descend further, and after that, there is an additional 50 per-
cent chance to descend, and so on. If this is so, it easily follows that for
every ascent (that rises by a factor of about 3), there are, on average,
2 descents. Since these two descents mean going down by a factor of 4,
every ascent times 3 is typically matched by a 4 times descent. There-
fore, on average, we drop further than we rise, so that, eventually, there
is a good probability of reaching 1. Of course this does not constitute
a proof, because even if the probability of an event is low, it can still
occur.
Another difficulty is that the numbers can go in a circle: there is no
a priori reason why, say, if we begin with the number 537, the sequence
will not eventually come around again to 537, just as when we begin with
1 we return to 1 (the sequence beginning with 1 is: 1, 4, 2, 1). To date no
such circle has been found, other than the one beginning with 1, and with
assumptions similar to those we mentioned, there is a good chance that no
other circle exists.
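For readers who want to experiment, here is a small sketch. The rule is not restated in this passage, so the code assumes the usual one, which agrees with the sequences printed above: an even number is halved, and an odd number n is replaced by 3n + 1. The names `step` and `collatz_sequence` are chosen for this example.

```python
def step(n):
    """One move: halve an even number, send an odd number n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def collatz_sequence(start, max_steps=10_000):
    """Apply step repeatedly from start until 1 is reached
    (or until max_steps moves have been made)."""
    sequence = [start]
    while sequence[-1] != 1 and len(sequence) <= max_steps:
        sequence.append(step(sequence[-1]))
    return sequence

seq = collatz_sequence(100)
print(seq)  # 100, 50, 25, 76, ..., 4, 2, 1

ascents = sum(1 for n in seq[:-1] if n % 2 == 1)   # moves that multiply by about 3
descents = sum(1 for n in seq[:-1] if n % 2 == 0)  # moves that halve
print(ascents, descents)  # 7 18

print([1, step(1), step(step(1)), step(step(step(1)))])  # [1, 4, 2, 1]
print(collatz_sequence(537)[-1])  # 1
```

A single run proves nothing, of course, but in this one the descents outnumber the ascents by more than the two-to-one ratio of the heuristic argument.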
This is a famous conjecture. Is it also important? At first glance, the
answer is no. It isn’t connected to any other mathematical topic, nor does
it have any direct consequences. But, of course, this depends on the nature of
its solution, if one ever appears. If the solution shows that this series is
“random,” in the sense described above, that a snake has the same chance
to be followed by another snake as by a ladder, then we will understand
something of value about the structure of numbers.
1200 = 2 × 2 × 2 × 2 × 3 × 5 × 5 = 2^4 × 3 × 5^2
Ancient Greeks already knew that there are infinitely many primes. The
kingdom of natural numbers owes its complexity to the infinite number of
its fundamental building blocks.
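Breaking a number into its prime building blocks can be done by trial division. This is a minimal sketch, not an efficient method, and the function name `prime_factors` is chosen for this example.

```python
def prime_factors(n):
    """Prime factors of n (for n >= 2), smallest first, by trial division."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)  # whatever is left over is itself prime
    return factors

print(prime_factors(1200))  # [2, 2, 2, 2, 3, 5, 5]
print(prime_factors(97))    # [97]
```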
(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43).
The Twin Primes Conjecture states that the set of prime numbers is suffi-
ciently rich to contain many prime numbers that are close together. Accord-
ing to Goldbach’s Conjecture, the set of prime numbers is rich enough to
express every even number as their sum. Both conjectures are very famous,
and are usually bound together. The Goldbach Conjecture is the better-
known of the two, possibly because it was named after a person. Christian
Goldbach, a minor German mathematician, got into the pantheon of math-
ematics due to a letter he sent to Leonhard Euler that contained the conjec-
ture. For a reason that will be explained in the next chapter, “Independent
Events,” no one doubts its veracity. Actually, typically the larger the even
number, the greater the number of ways it can be expressed as the sum of
two primes. The number 10 can be written in two ways as the sum of two
prime numbers: 3 + 7 and 5 + 5. The number 100 can already be written
as the sum of two primes in six ways: 3 + 97, 11 + 89, 17 + 83, 29 + 71,
41 + 59, and 47 + 53.
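Counts like these are easy to get wrong by hand, so a short search is worthwhile. The sketch below (with hypothetical helper names `is_prime` and `goldbach_pairs`) lists every way of writing an even number as a sum of two primes, smaller prime first.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small numbers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def goldbach_pairs(even_number):
    """All pairs (p, q) of primes with p <= q and p + q == even_number."""
    return [(p, even_number - p)
            for p in range(2, even_number // 2 + 1)
            if is_prime(p) and is_prime(even_number - p)]

print(goldbach_pairs(10))   # [(3, 7), (5, 5)]
print(goldbach_pairs(100))  # [(3, 97), (11, 89), (17, 83), (29, 71), (41, 59), (47, 53)]
```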
“Be wise, generalize,” goes a famous mathematical dictum. Here is a
generalization of the twin primes conjecture.
Conjecture: for every even number k there are infinitely many prime pairs
whose difference is k.
In other words, there are infinitely many pairs of prime numbers with a
difference of 4 between the pair; there are infinitely many pairs of prime
numbers with a difference of 6; and so on. The number k must be even,
because if the difference is odd, then one of the two numbers is even, and
therefore it is not a prime number (unless it is 2). This conjecture is also
similar in form to the Goldbach Conjecture, in which every even number is
the sum of two primes (in the new conjecture, it is the difference between
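two primes). A major breakthrough came in 2013, when Yitang Zhang proved
that the conjecture holds for at least one k smaller than 70 million. A joint
effort of many mathematicians then lowered this bound, showing that it holds
for at least one even k no larger than 246.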
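Small cases of the generalized conjecture can be explored directly. The following sketch counts prime pairs with a given difference k below a bound; the function names and the bound of 100 are assumptions made for the example, and a finite count says nothing, of course, about infinitely many pairs.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (for limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n in range(limit + 1) if is_prime[n]]

def prime_pairs_with_difference(k, limit):
    """Pairs (p, p + k) with both members prime and p + k <= limit."""
    primes = primes_up_to(limit)
    prime_set = set(primes)
    return [(p, p + k) for p in primes if p + k in prime_set]

print(prime_pairs_with_difference(2, 50))  # [(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43)]
for k in (2, 4, 6):
    print(k, len(prime_pairs_with_difference(k, 100)))
```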
So, how can simple theorems demand complex proofs? This is a mystery,
to which I can only attempt an explanation. I think that this is an opti-
cal illusion: the simple statements are those that draw our attention. Like
adventurers who seek gold, and incidentally discover an entire continent,
mathematicians, too, try to solve simply stated problems, and in the midst
of doing so, discover complex theories. If they were to start from the end,
from the theory, things would look different. The simple statements in a
mathematical theory are only a small part of the body of knowledge, but they
are the most conspicuous because of their concise formulations. Our gaze is
riveted to them, and it seems to us that they are the main focus. Actually,
they only make up a small part of a big world, little pegs protruding out of
the big rock.
But I must admit: every time I encounter a short theorem with a long
proof, I am surprised anew.
Independent Events
Independence
Assume, for the sake of argument, that the proportion of people in the
population (whether men or women) whose first name begins with A is 1/20.
That is, one out of every twenty people has a first name beginning with A.
Problem: among all married couples, what is the proportion of couples in
which the first names of both the husband and wife begin with A?
We can imagine two extreme cases. If all men whose first names begin
with A were to take an oath not to marry a woman whose first name begins
with A, there would be no couples like this. On the other extreme, if all men
whose names begin with A were to marry only women whose names begin
with A, then all the couples in which the husband’s name begins with A
would belong to this category, and they would constitute 1/20 of all couples.
A more realistic assumption, however, is that people do not choose their
spouses based on the first letter of their name. So there is no connection
between the two events: the first letter of the husband’s name and the first
letter of the wife’s name are independent events. What proportion of the
couples then have both names that begin with A? The husband’s name
begins with A in 1/20 of the couples, and assuming independence of events,
in 1/20 of these couples the wife’s name, too, begins with A. Therefore, the
couples in which the names of both spouses begin with A constitute 1/20 of
1/20, which is 1/400 of all couples.
This exemplifies a principle: when two events are independent, the proba-
bility of their joint occurrence is the product of their individual probabilities.
Let us see this in the gender and eye color example. Assume that 1/3 of the
population has blue eyes, and that 1/2 of the population are women. Assuming
independence, the blue-eyed women are 1/3 of the women, namely 1/3 of 1/2 of
the population, which is 1/3 × 1/2 = 1/6. Likewise, the probability of rolling two
6’s with two dice is the product of the probabilities of rolling a 6 with each one,
that is: 1/6 × 1/6 = 1/36.
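The product rule for the two dice can be confirmed by listing all 36 outcomes. This is a small sketch; the helper name `probability` is chosen for this example.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
outcomes = list(product(range(1, 7), repeat=2))

def probability(event):
    """Fraction of the outcomes for which event(first, second) holds."""
    favorable = sum(1 for first, second in outcomes if event(first, second))
    return Fraction(favorable, len(outcomes))

p_first_six = probability(lambda a, b: a == 6)
p_second_six = probability(lambda a, b: b == 6)
p_both_six = probability(lambda a, b: a == 6 and b == 6)

print(p_first_six, p_second_six, p_both_six)     # 1/6 1/6 1/36
print(p_both_six == p_first_six * p_second_six)  # True
```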
If two events are causally linked, then the occurrence of one changes the
probability of the other. We then say that the events are “dependent.” If the
occurrence of the first increases the chances of the other occurring, we say
that there is a positive correlation between them. For example, there is a
positive correlation between being blond and being blue-eyed. If the occur-
rence of one event decreases the probability of the other, we say that they
are negatively correlated. There is negative correlation between being blond
and being brown-eyed.
If the correlation between two events is positive, the probability that both
will occur is greater than the product of their individual probabilities. For
example, the positive correlation between being blue-eyed and being blond
means that if 1/3 of the population are blonds and 1/2 are blue-eyed, then more
than 1/6 of the population will be both blond and blue-eyed. As an extreme
example, assume that all blonds have blue eyes. In such a case, the set of
blue-eyed blonds is identical to the set of blonds, that is, it constitutes 1/3 of
the population, which is more than 1/6.
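Dice also give a simple example of positively correlated events. In the sketch below, the two events (a high first die, and a total of at least 10) are invented for illustration: a high first die makes a high total more likely, so the joint probability exceeds the product.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

def probability(event):
    """Fraction of the 36 outcomes for which event(first, second) holds."""
    favorable = sum(1 for a, b in outcomes if event(a, b))
    return Fraction(favorable, len(outcomes))

def high_first(a, b):
    return a >= 5        # the first die shows 5 or 6

def high_total(a, b):
    return a + b >= 10   # the two dice add up to at least 10

p_a = probability(high_first)
p_b = probability(high_total)
p_both = probability(lambda a, b: high_first(a, b) and high_total(a, b))

print(p_a, p_b, p_both, p_a * p_b)  # 1/3 1/6 5/36 1/18
print(p_both > p_a * p_b)           # True: positive correlation
```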
The Twin Primes Conjecture states that there are infinitely many numbers
n, such that both n and n + 2 are prime. Why do mathematicians believe in
it? To see this, let us explain why there are many twin primes between, say,
1 and 1,000,000.
The secret is that there is no obvious connection between the primality
of a number n and the primality of n + 2. These two events should be
independent. The fact that 101 is a prime number gives no reason to think
that 103, too, is a prime number, or the opposite. As already mentioned,
there are about 78,000 prime numbers between 1 and 1,000,000, which means
that about 1 out of every 13 numbers in this range is prime. In other words,
1/13 of the numbers up to a million are prime. If there is no dependence
between a number n being prime and n + 2 being prime, then for about 1/13
of these prime ns, n + 2 is also prime. Therefore, the portion of the numbers
n up to a million for which both n and n + 2 are primes is 1/13 of 1/13, which is
1/13 × 1/13 = 1/169. In other words, for about 1/169 of the numbers between 1 and
1,000,000 (about 6,000 numbers), both the number n and the number n + 2
are prime. So, assuming independence, there should be approximately 6,000
pairs of twin prime numbers between 1 and 1,000,000.
Actually, there are more. There are 8,169 such pairs. This means that our
assumption of independence was not accurate. The dice are in fact stacked
in favor of twins. The reason is that these two events are actually dependent,
and positively correlated. A number n between 1 and 1,000,000 being prime
means that there is a greater chance that n + 2 is prime. The reason is that
the prime numbers are not evenly distributed between 1 and 1,000,000. They
are more concentrated among the small numbers. About 1/6 of the numbers
between 1 and 1,000 are prime numbers, while between 1 and 1,000,000 only
about 1/13 are, remember? Accordingly, if we know that n is a prime number,
then there is a higher probability that it is small. Then n + 2, also, is small
Between 1 and 1,000,000 there are 8,169 twin prime pairs. A similar
calculation shows that the range between 1 and 10,000,000 is likely to con-
tain more than 50,000 twin prime pairs (the actual number is 58,980). This
means that when going from 1,000,000 to 10,000,000 new pairs are added,
namely, there are many twin primes located in the range between these
two numbers. Similarly, there are twin-prime pairs between 10,000,000 and
100,000,000. New pairs appear at each step, which means there are infinitely
many pairs.
Let me repeat: the assumption on which this argument is based, that there
is no negative correlation between the primality of n and that of n + 2, is
not rock solid. So, the argument does not constitute a proof. It only provides
good reason to believe in the conjecture.
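The numbers in this argument can be reproduced with a sieve. The following sketch (the name `prime_flags` is an assumption for this example) counts the primes and the twin prime pairs up to a million, and compares the true count of twins with the naive estimate obtained by assuming independence.

```python
def prime_flags(limit):
    """Sieve of Eratosthenes: flags[n] is 1 exactly when n is prime."""
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for n in range(2, int(limit ** 0.5) + 1):
        if flags[n]:
            # Cross out n*n, n*n + n, n*n + 2n, ... in one slice assignment.
            flags[n * n :: n] = bytearray(len(range(n * n, limit + 1, n)))
    return flags

limit = 1_000_000
flags = prime_flags(limit)

prime_count = sum(flags)
twin_count = sum(1 for n in range(2, limit - 1) if flags[n] and flags[n + 2])

# Independence heuristic: if a fraction p of the numbers is prime, then a
# fraction p * p of the numbers n should have both n and n + 2 prime.
p = prime_count / limit
print(prime_count)             # 78498
print(round(limit * p * p))    # the naive estimate, a little over 6,000
print(twin_count)              # 8169
```

The estimate here is slightly above 6,000 because it uses the exact proportion of primes rather than the rounded figure 1/13.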
Poetic Image, Mathematical Image
Words
David Fogel (1891–1944) was a master of images. Fogel was born in the
Ukraine and, except for a single year in Palestine, spent his entire life in
Europe. His poem “The Cities of My Youth,” one of his last two, was writ-
ten in 1941 in occupied France, in a state of sheer isolation and desperation,
with the Nazi manhunt nearing. The mood, however, is not particular to
this poem of his. All his poetry expresses a feeling of estrangement.
In a rain puddle
Barefoot, you will yet dance for me
But you must have already died.
My fledgling steps
I shall never see,
Nor you, I shall not see,
Nor I of then.
The caravan of days,
From afar,
Moves on
From where to nowhere
Without me.
This is a poem of loss and surrender, of the abandoning of desires and lusts
that old age entails. But the detachment is not expressed abstractly. A prose
author would write “I forgot my youth.” Fogel writes tangibly, “I’ve forgotten
the cities of my youth.” Old age and death are both portrayed in a single
picture, the white palace. And most moving is the last picture, of the caravan
of days, that treads from one empty horizon to another, the poet being
alienated even from this bleak emptiness: “Without me.”
Mathematicians too, think in pictures. Words and formulas are used only
afterwards, for communication and stabilization. Here is a small example.
My daughter, who was seven at the time, was playing in the bathtub, and I
came in and told her she had to finish her bath in three minutes. She asked
for another five minutes. Okay, I negotiated, let’s make it another four. She
understood we were speaking about compromising in the middle and said,
“In that case, I’ll come out in 100 minutes.” If you can find the middle
between 3 and 100, I told her, you can stay for that amount of time. “51
and a half minutes,” she said, without batting an eyelash (needless to say,
she didn’t stay that long). When I asked how she had calculated this, she
answered, “Half of 100 is 50, and half of 3 is one and a half, so the middle
between them is 50 and one and a half, which is 51 and a half.” But she
couldn’t explain why she calculated the middle this way — that’s how we
do it, and that’s that. Now, years later, I still don’t know how she did it.
But here is a picture that she may have had in her subconscious:
The middle between 3 and 100 is the middle between 0 and 103.
Our eye tells us that the middle between 3 and 100 is the middle between 0
(which is 3 units to the left of 3) and 103 (that is 3 units to the right of 100);
and the middle between 0 and 103 is half of 103, which is half of 100 plus
half of 3, just as my daughter had calculated.
The picture we used here is one of the most effective tools in mathemat-
ics, the “number line.” This is a line along which numbers are marked at
equal intervals. If we were to attribute it to a single person, it would be
Nicholas Oresme (1323–1382), a French prelate and mathematician. Actu-
ally, the number line is a very natural idea, because it is the reverse of
measurement. Measuring length quantifies geometry, while the number line
does the opposite: it brings geometry to the aid of numbers. It gives tangible,
geometric form to the concept of the number. Besides size, it also illustrates
direction, the negative numbers being to the left of zero.
A coordinate system
The single picture that changed mathematical thought more than any other
is the Cartesian coordinate system. “Cartesius” was the Latin name of René
At first glance, there is nothing new here. Sailors used this system long
before Descartes. To mark points on the globe they use two numbers, latitude
and longitude. So what is so brilliant about this discovery? Descartes’s
innovation lay not in discovering the coordinate system, but in realizing
its usefulness. Number pairs describe connections between numbers. As in
life, numerical relations often link pairs (of numbers, in this case) rather
than triplets or quadruplets. The coordinate system enables us to graphically
depict these relations.
Take for example, the relation between a number and its square. The set
of pairs that satisfies this relation is a collection of pairs of the form (x, x^2).
For example, (0, 0) is such a pair, as is (3, 9), and also (−3, 9), because
(−3)^2 = 9. This also includes pairs of numbers that are not integers, such as
(0.5, 0.25). If we draw all of these points on paper, the result is a “graph,”
which is a picture that depicts the relation. In this instance, the graph we
obtain is a parabola. Each point on the parabola corresponds to one of the
pairs.
The circle describes a relation between the value x of a point and the value y of the
point. The points on the circle are exactly the (x, y) points at a distance of 1 from the
origin of the axes; by the Pythagorean Theorem, this is true exactly when x^2 + y^2 = 1.
The graph describing the relationship xy = 1 between the variables x and y. It is called a
“hyperbola.”
Both algebra and geometry benefit from this system. Algebra gains the
tangibility of geometric pictures, and geometry gains the use of algebraic
tools.
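As a small illustration of algebra serving geometry, the three curves of this chapter can each be written as a test on a pair (x, y): the curve is the set of pairs that pass. The sample points in this sketch are chosen only for illustration.

```python
import math

# Each relation is a test on a pair (x, y); its graph is the set of pairs that pass.
relations = {
    "parabola": lambda x, y: math.isclose(y, x * x),             # y = x^2
    "circle": lambda x, y: math.isclose(x * x + y * y, 1.0),     # x^2 + y^2 = 1
    "hyperbola": lambda x, y: math.isclose(x * y, 1.0),          # xy = 1
}

points = [(3, 9), (-3, 9), (0.5, 0.25), (0.6, 0.8), (0, 1), (2, 0.5), (1, 1)]

for name, holds in relations.items():
    print(name, [p for p in points if holds(*p)])
```

The point (1, 1) passes two of the tests: it is where the parabola and the hyperbola cross.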
The poet Rachel Bluwstein (known simply as “Rachel”) was born in Russia in 1890 and
immigrated to Palestine in 1909. She died in Tel Aviv in 1931, of tuberculosis she had
contracted when she returned to Russia to treat the children of First World War refugees.
world, and accepts that the power of the heart is stronger than her. Due to
the suddenness of the insight, and its minimalist statement (“no more. . . no
more”), we perceive it like a feather’s brush, and can pretend that we hadn’t
really heard it.
Indirect proofs
(x^2)^3 = x^2 × x^2 × x^2 = (x × x) × (x × x) × (x × x) = x^6 = x^(2×3)
Independent sets
A set of numbers is “independent” (a term coined only for our purposes here)
if no number in it is divisible by another. For example, the set {3,5,6} is not
independent, because 3 divides 6. The set {3,4,5} is independent, because 5
is not divisible by 3 or 4, and 4 is not divisible by 3.
Question: How large can an independent set of numbers between
1 and 100 be?
contains a single element. What about the numbers between 1 and 2? The
set {1, 2} is not independent (2 is divisible by 1), and so we can take only {1}
or {2} — in this case, as well, the maximal size of the independent set is 1.
Between 1 and 3? The set {2, 3} is independent, and contains two elements.
Between 1 and 4: the set {3, 4} is independent, as is {2, 3}, and there is no
independent set with 3 of these four numbers; so in this case, the answer is
2. Now let us skip to 10: the set {6, 7, 8, 9, 10} is independent, as are {5,
6, 7, 8, 9} and {4, 5, 6, 7, 9}, each of which has 5 elements. A simple check
shows that there is no independent set of size larger than 5.
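The “simple check” for small ranges can be left to a computer. The following is a brute-force sketch (function names chosen for this example) that tries subsets from the largest size downward; it is practical only for small n, since the number of subsets grows very fast.

```python
from itertools import combinations

def is_independent(numbers):
    """True when no number in the collection divides another one."""
    return all(b % a != 0 for a, b in combinations(sorted(numbers), 2))

def max_independent_size(n):
    """Largest size of an independent subset of {1, ..., n}, by exhaustive search."""
    for size in range(n, 0, -1):
        if any(is_independent(subset)
               for subset in combinations(range(1, n + 1), size)):
            return size
    return 0

print(is_independent({3, 5, 6}), is_independent({3, 4, 5}))  # False True
for n in (1, 2, 3, 4, 10):
    print(n, max_independent_size(n))  # sizes 1, 1, 2, 2, 5
```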
By these examples, if n is even, then the maximal size of an independent
set of numbers between 1 and n is half of n. If n is odd, then the maximal
size is half of n + 1. For example, it is easy to find an independent set of 50
numbers between 1 and 100: {51, 52, 53, . . . , 99, 100} or {50, 51, 52, . . . , 98,
99}. And indeed, there is no independent set of size larger than 50. Here is
an elegant argument showing this. We shall prove that a set with 51 elements
is necessarily not independent. Formally:
The proof makes use of the pigeonhole principle. As always, the trick is
in defining the cells. Between 1 and 100 there are 50 odd numbers, and for
each of them we will define a “cell,” namely a set of numbers.
The first cell consists of 1, 2, 4, 8, 16, 32, 64 — all powers of 2 below 100.
The second cell is 3, 6, 12, 24, 48, 96 — all multiples of 3 by powers of 2
(again, below 100).
The third cell is 5, 10, 20, 40, 80 — the multiples of 5 by powers of 2.
The fourth cell is 7, 14, 28, 56 — the multiples of 7 by powers of 2.
The cell corresponding to a given odd number will consist of the odd
number times all powers of 2 (namely, times 1, 2, 4, 8, 16, . . .). For example,
the cell for the number 3 will include the numbers 3×1 = 3, 3×2 = 6, 3×4 =
12, 3 × 8 = 24, 3 × 16 = 48, and 3 × 32 = 96. (We aren’t going beyond 100.)
The cell for 25 contains the numbers 25, 50, and 100; and the cell for 49
contains only two numbers: 49 itself, and 98 (multiplying by 4 already takes
us beyond 100.) So our cells look like this:
1, 2, 4, 8, 16, 32, 64 (the multiples of 1 by powers of 2)
3, 6, 12, 24, 48, 96 (the multiples of 3 by powers of 2)
5, 10, 20, 40, 80 (the multiples of 5 by powers of 2)
7, 14, 28, 56 (the multiples of 7 by powers of 2). . .
There are 50 odd numbers up to 100, so there are 50 cells. And every
number between 1 and 100 appears in one of them. As an example, which
cell contains 92? Divide 92 by 2, and we have 46, which is even, so we can
divide it again by 2 and get 23, which is odd. So, $92 = 23 \times 2^2$, and therefore
92 appears in the cell of 23.
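The repeated halving that locates the cell of a number can be written out as a short sketch. The names `odd_part` and `build_cells` are illustrative choices, not taken from the text.

```python
def odd_part(m):
    """Divide m by 2 until it is odd; return (odd part, number of halvings)."""
    exponent = 0
    while m % 2 == 0:
        m //= 2
        exponent += 1
    return m, exponent

def build_cells(limit=100):
    """Group 1..limit into cells, one cell per odd number."""
    cells = {}
    for m in range(1, limit + 1):
        odd, _ = odd_part(m)
        cells.setdefault(odd, []).append(m)
    return cells

cells = build_cells(100)
print(odd_part(92))   # (23, 2): 92 is 23 times 2 squared
print(cells[3])       # [3, 6, 12, 24, 48, 96]
print(cells[49])      # [49, 98]
print(len(cells))     # 50
```

Every number from 1 to 100 lands in exactly one of the 50 cells, which is what the pigeonhole argument needs.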
Recall what is our aim: we want to show that among any 51 numbers no
larger than 100 there is one that divides another. The 51 numbers go into
50 cells, and by the pigeonhole principle, two of them belong to the same
cell. But if two numbers belong to the same cell, the larger number will be
divisible by the smaller. For example, the numbers 12 and 96 belong to the
same cell, because $12 = 3 \times 2^2 = 3 \times 4$ and $96 = 3 \times 2^5 = 3 \times 32$. Since
4 divides 32, also 12 divides 96.
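The argument is constructive, so it can be turned into a small search. This is a sketch under the same cell definition; `find_dividing_pair` is a hypothetical helper name, and the random sample merely stands in for "any 51 numbers."

```python
import random

def odd_part(m):
    """Strip all factors of 2 from m."""
    while m % 2 == 0:
        m //= 2
    return m

def find_dividing_pair(numbers):
    """Return (a, b), a < b, from the same cell (so a divides b), or None."""
    seen = {}  # cell (odd part) -> smallest member met so far
    for m in sorted(set(numbers)):
        cell = odd_part(m)
        if cell in seen:
            return seen[cell], m
        seen[cell] = m
    return None

random.seed(0)
sample = random.sample(range(1, 101), 51)
a, b = find_dividing_pair(sample)
print(a, b, b % a == 0)
```

With only 50 numbers the search may fail: `find_dividing_pair(range(51, 101))` returns `None`, since doubling any number above 50 already exceeds 100.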
Co-prime numbers
The great Hungarian mathematician Paul Erdős was told about a child
prodigy, Lajos Pósa, who already knew higher mathematics at the age of
twelve. Erdős invited the young Pósa to a restaurant and asked him: “Prove
that, in any set of 51 numbers between 1 and 100, there are two co-prime
numbers.” (Two numbers are called “co-prime” if they do not have a common
divisor, apart from 1. For example, 5 and 9 have no common divisor
larger than 1, so they are co-prime, while 9 and 12 have 3 as a common
divisor, and therefore are not co-prime.) Pósa raised his head from his soup
bowl and said: “In a set of 51 numbers between 1 and 100 there are two
consecutive numbers.” Naturally, two consecutive numbers are co-prime. If,
for example, the smaller number is divisible by 3, the next one (the number
+1) is not.
Here, too, the pigeonhole principle is at work. The simplest way of proving
that there are two consecutive numbers among 51 numbers between 1 and
100 is to divide the numbers into 50 cells of consecutive pairs: {1, 2}, {3, 4},
{5, 6}, . . . , {99, 100}. Of the 51 numbers in this set, 2 will have to belong
to the same cell, meaning that they are consecutive.
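The same pattern works with these cells. In the sketch below the cell of m is numbered (m + 1) // 2, so that 1 and 2 share cell 1, 3 and 4 share cell 2, and so on. The helper name `find_pair_in_same_cell` is an assumption made for illustration.

```python
import random
from math import gcd

def find_pair_in_same_cell(numbers):
    """Return two chosen numbers lying in the same cell {2k - 1, 2k}, or None."""
    cells = {}
    for m in sorted(set(numbers)):
        cell = (m + 1) // 2
        if cell in cells:
            return cells[cell], m
        cells[cell] = m
    return None

random.seed(1)
sample = random.sample(range(1, 101), 51)
a, b = find_pair_in_same_cell(sample)
print(a, b, gcd(a, b))
```

The bound 51 cannot be lowered: the 50 odd numbers occupy 50 different cells, and `find_pair_in_same_cell(range(1, 100, 2))` returns `None`.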
Pósa left mathematical research at a young age. Erdős used to refer to
mathematicians that abandoned research as “dead.” But Pósa is very much
alive: he devoted his life to the cultivation of gifted children, and produced
generations of bright mathematicians.
Compression
The Russian novelist Vladimir Nabokov said that smells are cloth hangers
for memories. The two smells evoke worlds of emotions. A lot is compressed
into the contrasts between the delicate, stealthy smell of the lilac and its
dim blue color, and the heavy smell of the oranges and the blinding light
of the southern country. The oranges give, like a mother’s breasts, but also
choke. Another expression of the poet’s ambivalence towards his homeland
is in the paradoxical wording “this homeland,” as if there may be more than
one homeland.
Compression is the secret of all art. It is compression that enables us
to return again and again to the same work of art, finding something new
at each visit. We are never tired of it, because so much is happening that
we never understand it in full. The haiku poet Matsuo Basho (1644–1694)
claimed that “a good haiku poem reveals only part of itself. We will never
grow tired of a poem that reveals only half of itself.” It should come as
no surprise that lengthy books have been written on poems of only a few
lines, or that thousands of words have been written on each note written by
Beethoven.
But isn’t this a case of “the eye of the beholder”? Does a poem or sonata
really contain so much, or are their interpreters simply being inventive? A
poem or sonata might be written very quickly. Even Beethoven, who was
known for his countless drafts, wrote his sonatas in much less time than
has been devoted to their analysis. Did so much really occur in his mind?
The answer is a resounding yes — not only because he was Beethoven, but
because our minds work much faster than we imagine. A short dream is
capable of containing an entire world; every thought is the result of a complex
process.
We still have not touched upon the best-known, and most deterring, quality
shared by mathematics and poetry: their difficulty. Both poetry and mathe-
matics are hard to understand. The reason for students’ difficulties is almost
always the same: the teacher doesn’t say all that he knows. He skips things.
Even if he is aware of everything that came before, he doesn’t have the time
to spell them all out.
Conveying a lot of information in a single statement is what compression
is all about. And it is this type of compression that is responsible for the dif-
ficulty in understanding poetry and mathematics. But there is a significant
difference between the two: the compression in mathematics is vertical, while
poetical compression is horizontal. In other words, in mathematics many
stages, built like floors one upon the other, are hidden within a single state-
ment. In poetry, many distinct ideas, not necessarily hierarchically ordered,
are compressed into one expression. This is why the vague understanding
of poetry causes no harm, while a hazy comprehension of mathematics gets
back at us in a later stage, when the next floor is built.
Mathematical Ping-Pong
Anonymous
How do mathematicians think? Unfortunately, or perhaps fortunately, there
is no recipe for this. A well-known book by George Polya, How to Solve It,
describes thought strategies for solving mathematical problems. Although
the book is replete with telling insights, reading it does not guarantee success
in problem solving. The way to learn problem solving is not to read how
others solved problems, but to solve them yourself.
Even though there is no magic formula, a basic trait of mathematical
thought can nevertheless be put in words: it is conducted like a ping-pong
game between examples and generalizations, between the tangible and the
abstract. From examples we build generalizations that, in the next phase,
are confronted with other examples, which in turn lead to more accurate
generalizations. This ping-pong game is not symmetrical, since shots in one
direction — the abstract — involve magic, while the shots in the opposite
direction — the examples — are more down-to-earth. There is no recipe
for the generalization step. This is where we need illumination, the sudden
discovery of hidden order in the world. In other words, this is where the
beauty of mathematics is revealed.
Perhaps not surprisingly, the exact same can be said for poetry. There,
too, a continuous ping-pong game is conducted between the concrete and
the abstract. But there is a basic difference: in poetry the game is held
within the poem itself. The abstract and the concrete coexist in the same
lines. In mathematics, in contrast, we see only the results: the game is
already over. It was there only at the stage of the struggle with the solu-
tion. The last shot in the solution is toward the abstract, and this is
all that we, the spectators, see. The reader is like someone who is late
for a play, arriving only for the last scene, after most of the charac-
ters have already made their exit. This is Abel’s complaint against Gauss
in the first quotation, and this is the anonymous student’s complaint in
the other.
So, in order to watch the ping-pong game of mathematical thought, we
have to seize the moment. As an example, let me tell you how a sixteen-year-old
discovered the formula for the sum of a geometric series. In the
next chapter, we will give it a short and elegant solution. The high school
student’s method was a bit less elegant, but it is a fine example of the phases
of mathematical inquiry, and provides a glimpse into the way mathemati-
cians think.
He understood right away that this would not work here. In contrast with an
arithmetic sequence, the middle between the first and last terms (between 2
and 1024) is not the arithmetic average of the terms of the sequence, since,
for example, it is not equal to the middle between the second and the next
to last terms (between 4 and 512). Another idea is needed.
A conceptual leap
I expected the student to try small examples. I thought that he would calcu-
late the sums 2 + 4, 2 + 4 + 8, and so on, and find the regularity. But he sur-
prised me with a true insight, made of the material from which mathematical
discoveries are forged. “Let us look at the sum in the reverse order,” he sug-
gested. Like this: 1024+512+256+128+64+32+16+8+4+2. “This is about
2048,” he said (as usually happens at the beginning of a solution, things
were still hazy in his mind). That is, about twice the first term (1024). Why?
Because the addition of every term in the series halves the distance of the sum
from 2048. We start with a sum of 0 (that is, our shopping cart was empty),
and its distance from 2048 is, simply, 2048; the first term (1024) is half of
this, and so after its addition, the distance to 2048 will be 1024, which is
half the previous distance; 512 is half of the distance between 1024 and 2048,
and after its addition, the distance between it and 2048 is 512 — again, half
the previous distance. I think that the student had the following picture in
his mind:
The length of the entire “rod” is 2048. Starting at the left, each step takes us half of the
distance to the right end of the rod. In each phase, the distance from 2048 is the same as
the size of the last step that we took.
At each step, the distance from 2048 is the same as the term that we just
added. Now the student was already capable of calculating the exact sum.
The distance of 1024 + 512 + 256 + 128 + 64 + 32 + 16 + 8 + 4 + 2 from 2048
is the last term, that is, 2. Consequently, the sum equals 2048 – 2, namely
twice the first term in the series (1024), minus the last term (2).
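The student's picture can be replayed step by step. A minimal sketch, with variable names chosen here for illustration:

```python
terms = [1024, 512, 256, 128, 64, 32, 16, 8, 4, 2]
target = 2 * terms[0]          # 2048, the full length of the "rod"

running = 0
for term in terms:
    running += term
    distance = target - running
    # After each step, the remaining distance equals the term just added.
    assert distance == term
    print(term, running, distance)

print(running)                 # 2046, that is, 2048 - 2
```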
The student found here the formula for the sum of a geometric series
with a quotient of 2: the sum is twice the last term, minus the first term.
In a formula, the sum is $2a_n - a_1$, with the terms of the series labeled as
$a_1, a_2, a_3, \ldots, a_n$. This is the accepted notation for series: the first term is
labeled as $a_1$, the second as $a_2$; and in general, the n-th term is denoted $a_n$.
These sufficed for the student to discover the law: the formula that he
had guessed gives twice as much as the truth. He must divide his formula
by 2. And indeed, the formula for the sum is $\frac{3a_n - a_1}{2}$. A quick check with
several examples showed the student that this indeed works.
Generalization
Now it is a short way to the formula for the sum of a geometric series with
general quotient q. If, when the quotient is 3, the formula is $\frac{3a_n - a_1}{2}$, then
when the quotient is q, the formula has to be $\frac{q a_n - a_1}{q - 1}$ (presumably the 2 in
the denominator of the formula $\frac{3a_n - a_1}{2}$ is q − 1, putting q = 3). For q = 2 we
get $\frac{2a_n - a_1}{2 - 1}$. But 2 − 1 = 1, and division by 1 doesn’t alter the number, so the
formula gives $2a_n - a_1$, fitting the formula we found before. This explains
the student’s wrong guess: in the case q = 2 the denominator was hiding.
It was hard to guess that there is a 1 in the denominator, which is actually
2 − 1.
I then gave the student an additional example in which he could easily
check the formula: 1 + 10 + 100 + 1000 + 10,000. Here q = 10, and according to
the formula that we discovered, the sum is
$\frac{10 \times 10{,}000 - 1}{9} = \frac{99{,}999}{9} = 11{,}111$.
Look at the sum and see why this is obvious.
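Before any proof, the guessed formula can be compared with direct addition on a few series. This is only a sanity check, not a proof. The function names are illustrative, and the formula assumes that q is not 1.

```python
from fractions import Fraction

def geometric_sum_by_formula(first, quotient, n_terms):
    """Guessed formula: (q * last term - first term) / (q - 1), for q != 1."""
    last = first * quotient ** (n_terms - 1)
    return Fraction(quotient * last - first, quotient - 1)

def geometric_sum_direct(first, quotient, n_terms):
    """Add the terms one by one."""
    return sum(first * quotient ** k for k in range(n_terms))

for first, quotient, n_terms in [(2, 2, 10), (1, 3, 4), (1, 10, 5)]:
    print(geometric_sum_by_formula(first, quotient, n_terms),
          geometric_sum_direct(first, quotient, n_terms))
```

The three rows correspond to 2 + 4 + ... + 1024 (sum 2046), 1 + 3 + 9 + 27 (sum 40), and the decimal example (sum 11,111).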
Formal proof
A guess is not enough. We should prove it formally. At this point the student
exhibited surprising mathematical maturity. For the proof, he told me, we
need a formula for the terms of the sequence. This is not hard. If the first
term is $a_1$, and the second is q times bigger, then the second term is $a_1 q$.
Similarly, the third term is the second term times q, namely, $a_1 q^2$; and the
fourth is $a_1 q^3$. In general, the kth term is $a_1 q^{k-1}$. If there are n terms in
the sequence, then the last term is $a_1 q^{n-1}$, and the sum of the terms in the
sequence is $a_1 + a_1 q + a_1 q^2 + \cdots + a_1 q^{n-1}$. Our guess is that this sum is
$\frac{q a_n - a_1}{q - 1}$. Since $a_n = a_1 q^{n-1}$, this is in fact
$\frac{q a_1 q^{n-1} - a_1}{q - 1}$, which is $\frac{a_1 q^n - a_1}{q - 1} = a_1 \frac{q^n - 1}{q - 1}$. We
therefore have to prove:
$$a_1 + a_1 q + a_1 q^2 + \cdots + a_1 q^{n-1} = a_1 \frac{q^n - 1}{q - 1}$$
Division of both sides of the equation by $a_1$ produces the equivalent formula:
$$(*) \qquad 1 + q + q^2 + \cdots + q^{n-1} = \frac{q^n - 1}{q - 1}$$
Equality (*) can be proved by multiplying both sides by q − 1. The left side
will then be $(q - 1) \times (1 + q + q^2 + \cdots + q^{n-1})$, which equals
$$q \times (1 + q + q^2 + \cdots + q^{n-1}) - (1 + q + q^2 + \cdots + q^{n-1})$$
which, in turn, is $q + q^2 + \cdots + q^{n-1} + q^n - (1 + q + q^2 + \cdots + q^{n-1})$. In
this sum, almost everything cancels out, leaving only $q^n - 1$. Note now that
multiplying the right side of equation (*) by q − 1 produces $q^n - 1$, so indeed
equality occurs in (*).
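The cancellation can be watched on coefficients. In the sketch below a polynomial in q is stored as a list of coefficients, with the index giving the power of q. This representation and the name `multiply_polynomials` are choices made here for illustration.

```python
def multiply_polynomials(p, r):
    """Multiply two polynomials given as coefficient lists (index = power of q)."""
    product = [0] * (len(p) + len(r) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(r):
            product[i + j] += a * b
    return product

n = 6
q_minus_1 = [-1, 1]      # -1 + q
all_ones = [1] * n       # 1 + q + ... + q^(n-1)

print(multiply_polynomials(q_minus_1, all_ones))
# [-1, 0, 0, 0, 0, 0, 1], that is, q^6 - 1
```

All the middle coefficients are 0: each power of q between 1 and n − 1 appears once with a plus sign and once with a minus sign.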
The Book in Heaven
Erdős was trying to claim the credit for himself. Echoes of the quarrel that
erupted between the two reverberate to this very day.
Erdős had his own private language, with a special vocabulary. He called
women “bosses,” and men “slaves”; children were “epsilons” (after the Greek
letter used in differential calculus to mark small numbers). Whenever he met
a small child, he would ask for his age, show him a coin trick, and then move
on to higher spheres. As someone who was born into the First World War and
witnessed the horrors of the Second World War, he called God the “Supreme
Fascist.”
Erdős used to talk about “the book in heaven,” which includes all the math-
ematical theorems, each with its most elegant proof. Being “from the Book”
is the greatest compliment a mathematical proof can receive. A friend of
mine, who discovered such a proof, made a wise remark following his dis-
covery: proofs from the Book aren’t always born as such. They rarely spring
from the forehead of their inventors in their full beauty. They often require
elaboration to become pearls.
In the last chapter I described the process experienced by a student who
discovered the formula for the sum of a geometric sequence. This was a
prolonged process, and the proof was not particularly elegant. Here is the
proof, after polishing.
Call this sum S. The secret is that multiplying the sum by q gives almost
the same sum. Multiplied by q, every term becomes the next term — except,
of course, for the last term, that does not have a next term. So, qS is
almost S. What is the difference? qS has an extra term $a_1 q^n$, the last term
after multiplication by q; and it lacks the first term, $a_1$.
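Written out, the remaining steps might look as follows. This is a sketch that takes S to be the sum $a_1 + a_1 q + \cdots + a_1 q^{n-1}$ from the previous chapter and assumes q is not 1.

```latex
\begin{align*}
S      &= a_1 + a_1 q + a_1 q^2 + \cdots + a_1 q^{n-1} \\
qS     &= a_1 q + a_1 q^2 + \cdots + a_1 q^{n-1} + a_1 q^n \\
qS - S &= a_1 q^n - a_1 \\
S      &= \frac{a_1 q^n - a_1}{q - 1} = a_1\,\frac{q^n - 1}{q - 1}
          \qquad (q \neq 1)
\end{align*}
```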
Poetry is the other domain in which the play between the abstract and
the concrete is essential. Like mathematics, poetry is an ongoing dialogue
between individual instances and generalizations, between the tangible and
the abstract, the low and the high. Metaphor, for example, is such a game:
from the individual to the general, and back. The poet thinks of something
specific, say, his lover’s eyes. In his excitement, he wishes to give this a more
general dimension, and he thinks about the general characteristics of eyes:
softness, or their shape. In the next step, he returns to something else that is
worldly, which has similar qualities: “Your eyes are doves.” Note that the last
shot in this game is in the direction of the tangible. This is a general feature
of the poetical ping-pong: the heart of the poem is given to the concrete,
and it is in this direction that the poem goes. This is the diametric opposite
of the ping-pong of mathematics, in which the last shot is always toward the
abstract.
As in mathematics, in poetry, too, the shots and their returns are so
fast that in order to follow them we must freeze the moment and look at
the game in slow motion. As an example, let us take Amichai’s poem from
which the quotation at the beginning of this section was taken. The poem
is based on the legend in which King Solomon and the Queen of Sheba pose
mental challenges to each other. For Amichai, these are the hide-and-seek
games of lovers, that substitute for the real thing. Towards the end of the
poem, however, the metaphor dissolves. The disintegration of the symbolism
mirrors the breakdown of the lovers when the time comes to depart.
Sawdust of questions,
shells of cracked parables,
wooly packing materials from
crates of fragile riddles.
Long problems
were rolled up on spools,
magician’s tricks were locked in their cages.
Chess horses were led back to the stable.
Yehuda Amichai, “The Visit of the Queen of Sheba,”
trans. by Chana Bloch and Stephen Mitchell
The depiction of the games as shells transmits a sense of missed opportu-
nity — the two protagonists themselves wanted something beyond the shell,
but they did not reach it.
Concealed ping-pong
One of the most powerful poems on the Holocaust was written by Dan Pagis,
a survivor:
Here in this carload
I am Eve
with my son Abel
if you see my older boy
Cain son of Adam
Tell him that I . . .
Dan Pagis, “Written in Pencil in the Sealed
Railway-Car,” Transformation, trans. Stephen Mitchell
One source of the poem’s force is obvious — the nonstatement at its
end, that leaves the reader hanging in air, and compels him to return
to the poem’s beginning. But the real strength of the poem lies in
something else, less evident: the play between the abstract and the con-
crete. In these six short lines there are at least three transitions in each
direction.
The poem begins with the tangible. It tells of a particular woman, in a
railroad car that is not all cars, but a special one. Even the poem’s title
attempts to convey this, by insisting on the detail of the writing imple-
ment. We are so drawn to the woman in the car that we tend to forget the
metaphoric nature of her name. And yet, the poem is not only about the suf-
fering of this specific woman, and the poet gives her a name that represents
all women, with her son who represents all children.
From this generalization the poem returns to the concrete: “if you see
my older boy” — a simple, down-to-earth expression, the way a real mother
would talk about her son. These words allude to the unbelievable — Eve
still relates to Cain as a son, and even a beloved son. Poetry, as we know,
can bear unresolved contradictions.
At this juncture the concrete use of the word “son” is replaced by an
abstract meaning, as part of the wording “son of Adam” (the Hebrew, ben
adam, also means a human being, and especially, a decent human being).
This is obviously ironic — the last thing anyone could say about Cain is
that he was a decent human being. But alongside the abstract sense, the
words “son of Adam” also have a concrete meaning. Pagis reminds us that
the metaphoric “son of Adam” had a concrete source — there actually was
a person who was Adam’s son. In poetry research this maneuver is called
“metaphor reification” or “concretization.”
This, however, is not the end of the ping-pong game, since the words
“son of Adam” have an additional meaning: “You are your father’s son, not
mine.” Couples sometimes joke — “see what your son has been up to.” But
Eve says this in all seriousness, as an actual statement of fact — the last
move in the ping-pong game.
Laws of Conservation
same. But now each third has become 5 pieces, each of which is $\frac{1}{15}$ of the
cake. In other words, the two thirds together consist of 10 fifteenths, and so
we have $\frac{2}{3} = \frac{10}{15}$.
The American Sam Lloyd (1841–1911) was one of the most ingenious puzzle
creators of all time. He composed chess puzzles and mathematics puzzles,
was an amateur magician and a professional ventriloquist, and included ven-
triloquism in his magic acts. His son would “read” his thoughts, when it was
Lloyd himself speaking through his son’s mouth. In 1875 he composed (some
say, borrowed from another source) his most famous puzzle, the “15 puzzle,”
that is still popular today. It consists of a square with 16 smaller squares, on
15 of which are pieces bearing the numbers 1 through 15, with one square
remaining empty. A piece can be moved to the empty square if it adjoins the
empty square, that is, if it is alongside, above, or below the empty square.
The puzzle is usually played by sliding the squares around, and trying to
arrange them in order by a sequence of legal moves.
In order to boost sales, Lloyd offered a prize of $1000 (which was a con-
siderable sum in those days, but not enough to arouse suspicion) to anyone
who could exchange the places of the 14 and the 15.
cent, because he knew that the task was impossible. In the nineteenth century
news didn’t spread as fast as it does in the internet age. For some reason, no
reporter thought to interview mathematicians, and much time passed before
people learned of the impossibility of meeting Lloyd’s challenge.
Before I prove this, I’ll announce a similar competition for the readers of
this book, a scaled-down version of Lloyd’s challenge. Start with the num-
ber 1, and at each move add or subtract the product of two consecutive
numbers. Anyone who manages to arrive at the number 10 will receive a
prize of $100.
Example:
Could a better selection of moves reach 10? The answer is no, and my
$100 prize money is safe. Winning this game is impossible because of a law of
conservation: what is conserved here is the parity of the number. It always
remains odd. This is because the product of two consecutive numbers is
always even, since one of the two numbers is even. We started with 1, an
odd number, and we add or subtract even numbers. When an even number
is added to or subtracted from an odd number, the result is still odd. This
is why we will not reach 10, which is even.
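A small experiment illustrates the conservation law. It explores only a bounded number of moves with a bounded choice of products, so it is an illustration and not a proof. The function name and the bounds `max_factor` and `max_moves` are arbitrary assumptions.

```python
def reachable_numbers(start=1, max_factor=6, max_moves=4):
    """Numbers reachable from `start` in at most `max_moves` moves."""
    # Products of two consecutive numbers: 1*2, 2*3, ..., 6*7
    steps = [k * (k + 1) for k in range(1, max_factor + 1)]
    seen = {start}
    frontier = {start}
    for _ in range(max_moves):
        next_frontier = set()
        for x in frontier:
            for step in steps:
                next_frontier.update((x + step, x - step))
        frontier = next_frontier - seen
        seen |= frontier
    return seen

reached = reachable_numbers()
print(all(x % 2 == 1 for x in reached))   # True: only odd numbers appear
print(10 in reached)                      # False
print(9 in reached, 11 in reached)        # True True
```

The neighbours 9 and 11 are reached quickly (for instance 1 + 2 + 6 and 1 + 12 − 2), but 10 never shows up.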
Even after the pieces have been moved, we will always go over them in
the order dictated by the arrow. As can be seen from the drawing, when we
go along the arrow we pass through the numbers in the following order: 1, 2,
3, 4, 8, 7, 6, 5, 9, 10, 11, 12, 15, 14, 13. An “order change” means a pair of
numbers that is not in the right order. 3 and 10, for example, are in the right
order in this sequence: 3 appears before 10, as in their regular order. But
between 5 and 8, for example, there is an order change. In this sequence, 8
appears before 5, while the normal order calls for 8 to follow 5. In the wiggly
sequence 15 appears before 14 — not the normal order between them, so
here is another order change. How many order changes are there in this
sequence in all? I’ll list the number pairs in this sequence that are not in the
right order — see if I’ve listed them all: (7, 8),(6, 8),(5, 8),(6, 7),(5, 7),(5, 6),
(14, 15),(13, 15),(13, 14). There are 9 pairs, and so, in the original situation,
there are 9 order changes of number pairs (as always, when following the
arrow). In the desired situation, in which the 14 and the 15 change places,
there is one less order change, since the (14, 15) pair is in the correct order
(look at the arrow, and you’ll understand why this is so). This leaves 8 order
changes in the desired situation. The secret here is that the number of order
changes remains odd, all the time, and therefore this situation — in which
there are 8 order changes (an even number) — can never be attained.
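The list of order changes can be double-checked mechanically. In this sketch a pair is reported as (smaller, larger), as in the list above. The function name `order_changes` is chosen for illustration, and the two sequences are the ones read along the arrow.

```python
from itertools import combinations

def order_changes(sequence):
    """Pairs (smaller, larger) in which the larger number comes first."""
    return [(b, a) for a, b in combinations(sequence, 2) if a > b]

start = [1, 2, 3, 4, 8, 7, 6, 5, 9, 10, 11, 12, 15, 14, 13]
desired = [1, 2, 3, 4, 8, 7, 6, 5, 9, 10, 11, 12, 14, 15, 13]

print(order_changes(start))
print(len(order_changes(start)), len(order_changes(desired)))   # 9 8
```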
Why does the number of order changes always remain odd? For the exact
same reason that the number in the competition I suggested above remains
odd: each move adds or subtracts an even number of order changes. Look
for example at the following move:
This move does not add any order change, and does not remove any order
change. For example, 12 appears before 13 in the wiggly order before this
move, as it does after this move. And this is true for all other pairs. Simply,
the order of the squares along the wiggly line hasn’t changed.
Here is another example:
The move in the drawing alters the order changes of 5 only with the
numbers 6, 7, 8, and 1. It adds 3 order changes (5 becomes out of order
with 6, 7, and 8) and subtracts 1 order change (after the move, 5 is in order
with 1, while before this move it was out of order with 1). So this move
adds 3 − 1 = 2 order changes. This is a typical example: in each move,
the number of squares with which the moving square changes order along
the wiggly line is 0, 2, 4 or 6. So, an even number of order changes will
be added or subtracted. This means that if, in the starting position, the
number of order changes was odd (to be precise, 9), then it will remain so
throughout the game. Consequently, not only is it impossible to attain the
state that Lloyd sought — any state with an even number of order changes
is an impossibility.
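The conservation law can also be watched in a simulation. The sketch below assumes that the arrow runs left to right along the first and third rows and right to left along the second and fourth, which matches the sequence 1, 2, 3, 4, 8, 7, 6, 5, ... quoted earlier. The board encoding (0 for the empty square) and all function names are illustrative assumptions.

```python
import random

# Squares in the order visited by the wiggly arrow.
SNAKE = [(r, c) for r in range(4)
         for c in (range(4) if r % 2 == 0 else range(3, -1, -1))]

def snake_sequence(board):
    """Numbers read along the arrow, skipping the empty square (0)."""
    return [board[r][c] for r, c in SNAKE if board[r][c] != 0]

def count_order_changes(seq):
    return sum(1 for i in range(len(seq))
               for j in range(i + 1, len(seq)) if seq[i] > seq[j])

def random_legal_move(board, rng):
    """Slide a randomly chosen neighbouring piece into the empty square."""
    (r, c), = [(r, c) for r in range(4) for c in range(4) if board[r][c] == 0]
    neighbours = [(r + dr, c + dc)
                  for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                  if 0 <= r + dr < 4 and 0 <= c + dc < 4]
    nr, nc = rng.choice(neighbours)
    board[r][c], board[nr][nc] = board[nr][nc], 0

board = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 0]]
rng = random.Random(0)
parities = set()
for _ in range(10_000):
    parities.add(count_order_changes(snake_sequence(board)) % 2)
    random_legal_move(board, rng)

print(parities)   # {1}: the number of order changes stays odd
```

A random walk is of course no proof, but it shows the invariant at work: after thousands of legal moves, no position with an even number of order changes is ever seen.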
In many Greek tragedies the hero tries to escape his fate, only to realize that
it pursues him. The best-known tragedy in which this happens is Oedipus
Rex by Sophocles. Oedipus, the young prince of the city of Corinth, hears
a prophecy that he will kill his father and marry his mother. Horrified, he
decides to flee the city. On his journeys, he meets a man at a crossroads
and kills him in a fight. After this, he comes to the city of Thebes, and at
the gate of the city he learns of a monster, the Sphinx, with the head of
a human and the body of a beast, who takes her toll on the city’s inhabitants.
The curse of the monster will not be lifted until the riddle she poses
is solved. Oedipus solves the riddle, and the grateful people of the
city marry him to Jocasta, the widowed queen. Years later, when a plague
rages unchecked in Thebes, and the oracle claims that Oedipus himself is
responsible for this calamity, the truth is revealed to the king: he was a
foundling, and his mother Jocasta had given him to a shepherd to raise,
after she had heard the oracles utter the exact same prophecy that Oedipus
had heard; and the person he had killed at the crossroads was his biological
father.
Modern thought would view fate as symbolizing inner forces and wishes.
Constantine Cavafy, the avowed Hellenist, conveys this message in many of
his poems. The best known poem with this idea is probably “The City.” It
is a poem of “conservation of inner truth.” External circumstances, it says,
are not as important as who you really are.
It can’t be denied
Without symbols and simile there is no poem.
The metaphor
is compressed into a single word: the feeling that, just as water saturates the
soil, metaphors are everywhere; that just as the water is inseparable from
the earth, so metaphors are so much absorbed in ordinary speech that we do
not even notice them. Just as resemblance between people is so effective for
description, a metaphor is capable of transmitting ideas that regular words
cannot. Thomas Ernest Hulme, a British poetry researcher and poet (and a
mathematician by education) who was killed in the First World War, claimed
that “plain speech is essentially inaccurate. It is only by new metaphors, that
is, by fancy, that it can be made precise.” The features of the mold that is
transmitted are sometimes too subtle to be transmitted in any other way.
This explains the prevalence of metaphor in poetry. Here, for example, is
a metaphor from “In the Twilight” by Hayyim Nahman Bialik, the Israeli
national poet:
But the metaphor would not have won its special standing if this was its
only power. The true secret of its force lies in its being on both sides of the
fence at the same time: it is an effective tool for transmitting information,
and also for its concealment. It enables us to grasp matters without having
to look them in the eye. It communicates the information innocently, as if
this was about something entirely different. Some metaphors of everyday life
have a similar role. We hear, for example, of “dropping out” of school, and
we are not attentive to the poetical quality of this expression, that replaces
blunt words like “leaving” or “expulsion.”
Indirect expression has great force, no less for the writer than for the
reader. Metaphor enables the writer to penetrate within himself and wrestle
with questions which he could not confront head on. For example, poetry
was Dan Pagis’s way of contending with his childhood Holocaust memories
that he did not otherwise dare to touch.
Once I read a story
about a grasshopper one day old,
a green adventurer who at dusk
was swallowed up by a bat.
that have different meanings, in this case even of different types. Another
play of being within and without the story is the green grasshopper (patently,
green in two senses) who is a child in the story, while the most touching
element of the poem is that the one who reads the story was a child at the
time, but is no longer one, because his childhood was stolen.
An effective metaphor is always condensed, that is, there are many points
of similarity between the tenor and the vehicle (the symbolized and the
symbol). This density has an effect on both of the metaphor’s roles: the trans-
mission of information, and its concealment. On the one hand, the simulta-
neous transmission of many ideas means efficiency in communication; while
on the other hand, when much information is delivered in a single effort, we
are incapable of consciously absorbing it all, and most of its assimilation is
subliminal.
In mathematics, just as in poetry, it often happens that ideas are brought
from one realm to another. And just as an apt metaphor is finer, the more
distant its tenor and vehicle are from each other, so too in mathematics:
the solution is more elegant if the idea is brought from a more distant field.
Number theory is known for the ideas that it draws from unexpected fields:
geometry, complex numbers, differential calculus, and actually, from almost
every other discipline of mathematics. In order, however, to present examples
of this type, we must first learn of the division of mathematics into its various
subdisciplines.
Three Types of Mathematics
Continuous mathematics
that the mouse cannot run forever in the direction opposite to the cat, since
it will run into the fence. The reason why, despite the mouse’s inability to
escape, the cat cannot catch the mouse is that when both are close to each
other it is possible for the mouse to run almost directly away from the cat.
The cat and the mouse are on a round field. They can choose the direction in which to
move, but not their speed — both move at the same speed. Will the cat be able to catch
the mouse?
Algebra
Ask high school students what this algebra is that they spend
so much time on, and they will mumble something about “xs and ys.” What
are these, really? The answer is simple: they are names for numbers, and
high school algebra is nothing more than calling numbers by names (or,
more precisely, by letters). The need for letters arises in two contexts. One
is when we speak about general numbers, that is, about any number. In
order to relate to a general number, it must be given a name. Take, for
example, the following rule: “the product of a number plus 1 multiplied by
that same number minus 1 is the square of the number minus 1.” Rather
unwieldy, isn’t it? Written in a formula, it is shorter and more understandable:
$(x + 1) \times (x - 1) = x^2 - 1$. When a letter denotes a general number,
it is called a “variable.” The second instance in which we need to assign
names to numbers is when the number is not known, and we attempt to
find it by some given information. For example, “the number plus 1 is 3.”
only during the competition, but once he did so, he solved all thirty of Fior’s
problems within two hours, and handily won the competition.
This is where Cardano entered the scene. Girolamo Cardano was a man
of many talents. Besides being a leading mathematician, he was also a physi-
cian, the author of encyclopedias, an inventor (the transmission system he
invented is in use to this day), and one of the first professional chess players
in history. On top of all this, he was also a compulsive gambler, which led
him to write the first book on mathematical probability — that also included
a chapter on how to cheat. According to his own testimony, he was cantan-
kerous. In 1570 he spent several months in prison on the charge of heresy:
he was accused of casting Jesus’ horoscope and attributing the events of his
life to the influence of the stars.
Cardano attempted to find the solution to third order equations by him-
self, and when he failed he implored Tartaglia to tell him the secret. Tartaglia
was himself a difficult man. At the age of twelve he almost died from a
sword blow from a French soldier that slashed his face and mouth. This
caused him to stutter (“tartaglia” in Italian means “stuttering”). He was
suspicious, and refused for quite some time to divulge his secret, but even-
tually gave in, after a promise by Cardano to find him a patron. To ensure
secrecy, Tartaglia sent the solution encoded in a poem. Cardano’s assurance
to attain support for Tartaglia was probably not sincere; at any rate, it never
materialized.
Tartaglia quickly came to regret having given in to temptation, and began a
lengthy struggle, conducted through exchanges of letters. He was right: in
the end, Cardano did not honor his promise of secrecy. About a decade later,
Cardano heard that Tartaglia was not actually the first to discover the solu-
tion, having been preceded by del Ferro. Cardano no longer felt bound by his
promise, and he publicized the discovery. Tartaglia was furious. He invited
Cardano to a public debate, but the latter sent his talented pupil Lodovico
Ferrari (1522–1565). Tartaglia lost the contest, and was consequently dis-
missed from the academic position he had recently won, after many years of
poverty. In the end, he returned to his meager teaching position in Venice.
At about the same time Ferrari also discovered the solution to fourth degree
equations (such as x^4 − x^3 + 4x^2 − 5x + 1 = 0).
Polynomial equations continued to provide inspiration to algebra also in
later years. But in the eighteenth and nineteenth centuries algebra made a
sharp turn, and the meaning of the term changed. Modern algebra examines
operations. These are similar to arithmetic operations, but they can also
be much more general, and relate to more abstract mathematical objects.
An example is the movements in the plane. Among these movements there
is an operation called “composition.” Let us illustrate this. Draw a pic-
ture of something in the plane, say of a turtle, and select a fixed point of
reference.
Now the turtle can be moved in different ways. These movements are
called “transformations.” It can be rotated around the point of reference,
or moved in the plane — up, down, right, or left. Such a movement,
that does not involve change of angle, is called “translation.” Addition-
ally, movements can be combined: performing one movement and then
another. The composition of two transformations is dependent on their
order. If the turtle is first translated and then rotated, it will arrive at a
place different from the one it would reach if it were first rotated and then
translated.
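The dependence on order can be checked with a few lines of Python. This is only a minimal sketch: the turtle is reduced to a single point, the point of reference is assumed to be the origin (0, 0), and the helper names `translate`, `rotate` and `compose` are invented for the illustration.

```python
import math

def translate(dx, dy):
    """Return a transformation that shifts every point by (dx, dy)."""
    return lambda p: (p[0] + dx, p[1] + dy)

def rotate(degrees):
    """Return a counterclockwise rotation about the point of reference (0, 0)."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return lambda p: (c * p[0] - s * p[1], s * p[0] + c * p[1])

def compose(first, second):
    """Composition: perform `first`, then `second`."""
    return lambda p: second(first(p))

turtle = (1.0, 0.0)              # assumed starting position of the turtle
shift_right = translate(2, 0)    # two units to the right
quarter_turn = rotate(90)        # 90 degrees counterclockwise

a = compose(shift_right, quarter_turn)(turtle)   # translate, then rotate
b = compose(quarter_turn, shift_right)(turtle)   # rotate, then translate
print([round(v, 6) for v in a])
print([round(v, 6) for v in b])
```

With these choices the first order carries the turtle to the point (0, 3) and the second to (2, 1), so the two compositions are indeed different transformations.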
Motions in the plane are reversible. For example, if you move one meter
north, you can go back by moving one meter south. Likewise, a 90° clockwise
rotation can be reversed by a 90° counterclockwise rotation. The requirement
of reversibility is the generalization of a well known property of operations
with numbers: the addition of 7 can be reversed by subtracting 7;
multiplication by 3 can be reversed by division by 3. A set (a collection of elements),
together with reversible operations, is called a “group.” The group is the
most basic algebraic object. This does not mean that groups are simple
objects. They are surprisingly diverse and rich in structure.
The composition of two transformations, in the opposite order: first translation to the
right, followed by a 90° counterclockwise rotation. The result is different.
Similarity of names usually indicates similarity in nature. What, then, is
the connection between modern algebra, that examines abstract operations,
and the study of equations? Why are they both called “algebra”? Not only
because equations involve arithmetic operations, but also for a deeper reason.
The French mathematician Joseph Louis Lagrange (1736–1813) discovered
a surprising connection between the possibility or impossibility of solving a
polynomial equation and operations on the set of solutions.
After the discovery of the solution to fourth order equations mathemati-
cians tried desperately to solve fifth order equations. In the beginning of
the nineteenth century a very surprising fact came to light: there is no gen-
eral formula for solving fifth order equations. This was discovered by the
Norwegian Niels Abel (1802–1829), who used Lagrange’s ideas. Abel died
of tuberculosis at an early age after a life of poverty and hardship, and
left a mathematical inheritance that would “provide mathematicians with
material for thought for two hundred years,” as one mathematician of the
period put it. Following him, the Frenchman Evariste Galois (1811–1832)
showed just which equations can be solved and which not (not every fifth or
higher degree equation is unsolvable; there are solutions for some equations
of higher than fourth order). Galois’s life, too, was tragic. He died in a duel
122 Mathematics, Poetry and Beauty
at the age of 20. Recognition of his discoveries came more than a decade
after his death. Galois used groups in his proof, and thereby further linked
group theory to algebra.
Discrete mathematics
A topologist is a geometer with his hands tied behind his back. He forbids
himself to speak about distances. For him, the boundaries of a triangle and
that of a circle are the same, since a triangle can be distorted to become
a circle, or the other way around. But if distances aren’t measured, then
what remains to be said of shapes? One quality that topology examines is
whether or not a shape has holes, and if so, how many? It develops tools
to prove that two bodies are topologically equal, meaning that one can be
transformed into the other by stretching, rotation, or reflection.
One of the best known topological theorems is the Fixed Point Theorem of
the Dutchman Luitzen Brouwer (1881–1966). The theorem, dating to 1912,
speaks of a ball in some dimension. A “ball” of radius R is the set of all
points whose distance from a certain point (the center of the ball) is at
most R. In one dimension this is a segment of length 2R; in two dimensions,
this is a disk (a “disk” is the inside of a circle, together with the boundary.
By “circle” we mean just the boundary). In three dimensions, this is a regular
ball, like the ones used in sports, and in four dimensions, it is a body that
cannot be visualized. Brouwer’s Theorem states that if we take such a body,
distort it, move it, and stretch it — without tearing! — and if we leave it
entirely within the same space that it previously occupied (that is, after the
transformation no point is outside the place in space occupied by the body
before the change), then there is a point that did not move. This is called a
“fixed point,” because it remains fixed in its place.
The gray shape on the right was obtained by the distortion and moving of the disk to the
left (note that all points remained within the disk). Point x did not move — the
distortion left it in place. Brouwer’s Fixed Point Theorem states that in every distortion
of a disk that leaves it within its original bounds, there is a point that stays put.
This theorem is no longer true if the body has a hole, for example if it is
a ring. If a disk is rotated, its center remains fixed, but if a ring is rotated,
there is no point that remains stationary — everything moves, since the
center of rotation is not in the ring.
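In one dimension the theorem can even be turned into a search procedure. The sketch below is an illustration, not the general proof: it assumes a continuous function f that maps the segment from −1 to 1 into itself (the cosine is used as a stand-in), and it looks for the fixed point by repeatedly halving the segment. The function name `fixed_point` is invented for the example.

```python
import math

def fixed_point(f, lo, hi, tol=1e-12):
    """Find x in [lo, hi] with f(x) = x, assuming f is continuous
    and maps the segment [lo, hi] into itself."""
    # Because f stays inside the segment, f(lo) - lo >= 0 and f(hi) - hi <= 0.
    g = lambda x: f(x) - x
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) >= 0:
            lo = mid        # the fixed point is to the right of mid (or at mid)
        else:
            hi = mid        # the fixed point is to the left of mid
    return (lo + hi) / 2

x = fixed_point(math.cos, -1.0, 1.0)
print(x, math.cos(x))
```

The halving trick works only on a segment. In two or more dimensions the theorem guarantees that a fixed point exists, but it has to be proved by other means.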
The two-dimensional case of Brouwer’s Theorem can be demonstrated
with sheets of paper (although paper can only be crumpled and moved,
but not stretched, and therefore will not fully illustrate the theorem). For
On the eve of the Second World War mathematics enjoyed a short-lived but
dramatic blossoming in Poland. Mathematicians whose names are known to
every present-day mathematician were active in the coffee houses of Lwow
and Warsaw, the two major centers of mathematics research: Stefan Banach,
Stefan Mazurkiewicz, Kazimierz Kuratowski, Alfred Tarski, Karol Borsuk,
and many others. Topology in particular benefited from this spurt of activity.
Stanislaw Ulam, later one of the fathers of the hydrogen bomb, was one of
the younger of the group. He was not a topologist by profession, but he
formulated a basic conjecture, which was quickly proved by Borsuk, and was
named after them the “Borsuk-Ulam Theorem.” First, an example of the
theorem:
At any given moment there are two antipodal points on the Equator
with exactly the same temperature.
The Equator is just an example: we could have used any circle instead; and
in place of temperature, we could have taken any other quantity, provided
that it is continuous, that is, without leaps. Temperature does not “jump” —
if at a certain point the temperature is, say, 10 degrees, then at neighboring
points the temperature will be close to 10 degrees. That was an example of
At any given moment, there are two antipodal points on the face
of the earth with exactly the same temperature and the same
humidity.
The Borsuk-Ulam Theorem states that if we rotate the pointer over the Equator, we will
come to a situation in which the same temperature is measured at the pointer’s head and
tail. This is proved by measuring the difference between the temperature at the head and
at the tail.
Place the pointer in any position, and calculate the difference between
the temperature at its head and the temperature at its tail. If, at the posi-
tion we chose, the difference is 0, that is, the temperature is the same at
the head and the tail — then these are the two antipodal points with the
same temperature, the existence of which we are after. We can therefore
assume that this difference is not 0. Let us say, for example, that the temper-
ature is 10 degrees at the head, and 3 at the tail, which gives us a difference
of 10 − 3 = 7 degrees. Now spin the pointer 180 degrees, to the position
where the head and tail switch positions, all the while measuring the dif-
ference in temperature between the head and the tail. When the pointer
arrives at its final position (that is, when the rotation of 180 degrees is com-
pleted), the temperature at the head (the former position of the tail) will be
3 degrees, and at the tail (where the head used to be), 10 degrees. The differ-
ence between the head and the tail is now 3 − 10 = −7 degrees. And so, the
difference changed from a positive value to a negative one. Since we assume
that temperature is a continuous parameter, that is, without leaps, then
according to the mean value theorem that we learned in the last chapter
(more commonly known as the intermediate value theorem),
at some point the difference has to be 0. But 0 difference between the head
and the tail means exactly what we set out to prove — the existence of two
antipodal points with the same temperature!
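The argument with the rotating pointer can be followed step by step in code. The sketch below invents a continuous temperature along the Equator (the formula in `temperature` is made up purely for the illustration), measures the head-minus-tail difference, and closes in on the position where the difference changes sign.

```python
import math

def temperature(angle):
    """A made-up continuous temperature at the point of the Equator
    whose position is given by an angle in radians."""
    return 20 + 7 * math.cos(angle) + 3 * math.sin(2 * angle) + 2 * math.sin(3 * angle + 1)

def antipodal_pair(temp, tol=1e-12):
    """Rotate the pointer from angle 0 to angle pi, halving the range
    in which the head-minus-tail difference changes sign."""
    diff = lambda a: temp(a) - temp(a + math.pi)
    lo, hi = 0.0, math.pi          # at hi the head and tail have swapped places
    if diff(lo) == 0:
        return lo
    start_positive = diff(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (diff(mid) > 0) == start_positive:
            lo = mid               # the sign change still lies ahead
        else:
            hi = mid               # the sign change is already behind us
    return (lo + hi) / 2

head = antipodal_pair(temperature)
tail = head + math.pi
print(temperature(head), temperature(tail))
```

Any other continuous function could be substituted for `temperature`; continuity is the only property the search relies on.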
Starting from 3 types, however, the only known proof uses the Borsuk-Ulam
Theorem.
In this particular example, when the necklace is cut in two places (marked by the scissors
in the drawing), the first thief receives the sections marked A, and the second thief
receives the section marked B.
Matchmaking
1912, the year in which Brouwer proved his fixed point theorem, also marked
a turning point in combinatorics. A cornerstone theorem of the field was
proved then. Its discoverer, Ferdinand Frobenius (1849–1917), was both the
blessing and the curse of the department of Mathematics at the Univer-
sity of Berlin. A blessing — because he was an outstanding mathematician;
a curse — because he was contentious. The reluctance of other mathemati-
cians to work with him was one of the reasons for the flourishing of Berlin’s
great rival, the University of Göttingen. Frobenius was an algebraist, not
a combinatorialist (combinatorics hardly existed as a separate field at that
time), and he worded his theorem in algebraic terms. When the Hungarian
Dénes König proved a stronger version of this theorem a few years later
and cast it in a combinatorial formulation, Frobenius ridiculed him for his
“inferior terminology.”
For many years König was the only one to recognize the importance of
Frobenius’ theorem. The theorem would become widely known only much
later, in 1935, when it was discovered independently by the Englishman
Philip Hall (1904–1982). The success of the new version (which was even-
tually named after Hall), might have been due to the intriguing name Hall
gave it: the “Marriage Theorem.” Sex sells, even in mathematics. Imagine,
Hall said, a set of acquainted men and women: each man is acquainted with
some of the women (possibly even with none). The men want to get married.
The rule is that a man may marry only a woman with whom he is acquainted
(these are mathematical men, and their demands are modest: the sole
requirement is being acquainted). Naturally, the marriage must be monog-
amous, that is, a person (of either sex) may have only a single spouse. The
question that Hall asked is: under what conditions can all the men be mar-
ried? (Take note that we don’t insist that all the women will get married —
the problem was formulated before women’s lib.) In order to understand the
meaning of “under what conditions,” look at the following example:
When two men are acquainted with only a single woman, they cannot be matched.
So, for a wedding of all the men to take place, it is necessary for every man
to be acquainted with at least one woman, for every 2 men to be acquainted
(between them) with at least 2 women, for every 3 men to be acquainted with
at least 3 women, and so on. This is clearly a necessary condition. It is less
clear that this is also a sufficient condition, namely, that if every k men are
acquainted with at least k women, then all the men can get married. This is
the content of Hall’s Theorem:
Hall’s Theorem: There will be a wedding [i.e., all the men can be
married] if and only if every set A of men is acquainted with a
number of women at least the size of A.
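For small examples both sides of the theorem can be tested directly. The following sketch is one possible way to do it, not the proof: `hall_condition` checks the condition by brute force over all sets of men, and `marry_all` tries to build the wedding with the standard augmenting-path method. The acquaintance lists, and the third man Carl, are hypothetical.

```python
from itertools import combinations

def hall_condition(knows):
    """Check that every set A of men is acquainted, between them,
    with at least as many women as the size of A (brute force)."""
    men = list(knows)
    for k in range(1, len(men) + 1):
        for group in combinations(men, k):
            women = set().union(*(knows[m] for m in group))
            if len(women) < k:
                return False
    return True

def marry_all(knows):
    """Try to marry every man to an acquaintance, using augmenting paths.
    Returns a dictionary {woman: man}, or None if some man stays single."""
    husband = {}

    def find_wife(man, seen):
        for woman in knows[man]:
            if woman in seen:
                continue
            seen.add(woman)
            # Take a free woman, or re-match her current husband elsewhere.
            if woman not in husband or find_wife(husband[woman], seen):
                husband[woman] = man
                return True
        return False

    for man in knows:
        if not find_wife(man, set()):
            return None
    return husband

stuck = {"Alan": {"Alice"}, "Bob": {"Alice"}}       # two men, one woman
fine = {"Alan": {"Alice", "Betty"}, "Bob": {"Alice"}, "Carl": {"Betty", "Claire"}}
print(hall_condition(stuck), marry_all(stuck))
print(hall_condition(fine), marry_all(fine))
```

In both examples the two functions agree, as the theorem says they must: the condition fails exactly when no wedding of all the men exists.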
Ménages à Trois
Hall’s theorem belongs to a rare species: theorems that are not hard to
prove, and yet they possess depth and many applications. Experience shows
that such theorems tend to have many different proofs. Pythagoras’ theorem,
for example, has hundreds of proofs, each with its own beauty. This is the
case also with Hall’s theorem. And surprisingly, the strongest proof, having
the furthest reaching conclusions, is topological. The topological proof has
applications beyond those of the combinatorial proofs.
One of these applications is to triple weddings. This is not what you
think. I am not talking about two men and a woman, but about a man, a
woman, and a third object, say their dog, or their lodging. Suppose, as an
example, that we add to the men and women a third element — hous-
ing type. Instead of being acquainted with women, now every man will
be acquainted with woman + housing (or woman + dog) pairs. A man’s
acquaintanceship with a woman + housing pair means that the man is
willing to marry this woman, on condition that they will live in this
housing.
Example:
A man named Alan is acquainted with the pairs Alice + tent, and
Betty + house. This means that Alan is willing to marry Alice on
condition that they live in the tent, or with Betty on condition that
they live in the house. He is not willing to live with Alice in the
house, or with Betty in the tent. A second man, Bob, is acquainted
with Betty + tent.
Alan is acquainted with the pairs Alice + tent and Betty + house, while Bob is
acquainted with the pair Betty + tent. In this situation, there cannot be a marriage for
both men.
disjoint pairs. In the case of Hall’s Theorem, in which the men married only
women, this condition suffices. Does it also suffice in this case? That is:
When 2 men are acquainted together with 3 disjoint pairs and each man is acquainted
with at least one pair, the marriage is assured.
For example, if we add to this example one more pair (Claire + apartment),
with which Alan is acquainted, then this requirement is met. For
k = 1, the requirement means that every man must be acquainted with at least
2k − 1 = (2 × 1) − 1 = 1 pair, and this condition is met in the example.
For k = 2, the condition means that 2 men together must be acquainted with at
least 2k − 1 = (2 × 2) − 1 = 3 disjoint pairs, and this condition is also met:
the two men are acquainted with the 3 disjoint pairs Alice + tent,
Betty + house, and Claire + apartment. And indeed, both men can be married:
we match Alan with the pair Claire + apartment, and Bob with the pair
Betty + tent.
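The example of Alan and Bob is small enough to be checked by trying all possibilities. In the sketch below the function names are invented for the illustration, and two pairs are taken to be “disjoint” when they share neither the woman nor the housing, which is the reading the example suggests.

```python
from itertools import combinations, product

def disjoint(p, q):
    """Two woman + housing pairs are disjoint if they share neither element."""
    return p[0] != q[0] and p[1] != q[1]

def triple_wedding(knows):
    """Brute force: pick one pair for each man so that all picks are disjoint."""
    men = list(knows)
    for choice in product(*(knows[m] for m in men)):
        if all(disjoint(p, q) for p, q in combinations(choice, 2)):
            return dict(zip(men, choice))
    return None

def most_disjoint(pairs):
    """Size of the largest collection of pairwise disjoint pairs (brute force)."""
    pairs = list(set(pairs))
    for size in range(len(pairs), 0, -1):
        for group in combinations(pairs, size):
            if all(disjoint(p, q) for p, q in combinations(group, 2)):
                return size
    return 0

before = {"Alan": [("Alice", "tent"), ("Betty", "house")],
          "Bob": [("Betty", "tent")]}
after = {"Alan": before["Alan"] + [("Claire", "apartment")],
         "Bob": before["Bob"]}

for knows in (before, after):
    all_pairs = [p for pairs in knows.values() for p in pairs]
    print(most_disjoint(all_pairs), triple_wedding(knows))
```

Before Claire + apartment is added, the two men know only 2 disjoint pairs and no wedding is possible. Afterwards they know 3, which is 2k − 1 for k = 2, and the search finds the matching described in the text.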
And here we are in for a surprise: the proof of this theorem is topolog-
ical. Topology is not mentioned in the theorem, and is totally unexpected.
Nonetheless, the only proof known uses topological tools — actually, a cer-
tain version of Brouwer’s fixed point theorem.
Imagination
Albert Einstein
The English poet and essayist Samuel Taylor Coleridge claimed that poetry
finds “similarity in difference.” The poet William Wordsworth (1770–1850)
used an almost identical definition in his philosophical poem “The Prelude”:
poetry discerns the similarities between things that look different to the
passive observer. Imagination is the ability to find features shared by two
seemingly distant objects. This ability is common to poets, mathematicians,
and scientists.
In the early seventeenth century a poetic movement that its opponents
called “Metaphysical poetry” emerged in England. It was characterized
by sophisticated metaphor and the discovery of unexpected similarities
between disparate objects. The leading figure in the movement was John
Donne (1572–1631). Here is a famous passage from one of his poems,
that bears the strange name “A Valediction Forbidding Mourning.” The
bond between the souls of the poet and his lover is compared to the rela-
tionship between the two arms of a compass. The analogy goes on and on,
and every time that we think it has been exhausted, yet another point of
similarity emerges.
Mathematical similarity
be proved? One way is by first realizing that it is enough to prove this for
a right angle triangle. Every triangle can be divided into two right angle
triangles, like this:
If we prove the formula for each of the two right angle triangles that
were formed, then the area of the left-hand triangle is (1/2)ah, the area of the
right-hand triangle is (1/2)bh, and the area of the entire triangle is therefore
(1/2)ah + (1/2)bh = (1/2)(a + b)h. Since the length of the bottom side is a + b, this
means that the area is half of the base multiplied by the height, as we wanted
to demonstrate. In an obtuse triangle, the height line is outside the triangle:
In this case, the area of the triangle is the difference of the areas of the
right angle triangles.
So, all we have left to prove is the formula for a right angle triangle, like
this:
Here is one way of proving this. Note that the distance of a point on the
upper side from the base increases at a constant rate as the point moves to
the right. The point furthest to the left is at a height (distance) 0 from the
base. If the entire upper side were at this height, it would fuse with the lower
one, and the area of the line formed would be 0. The point furthest to the
right is at a height of h above the base, and if the entire upper side were at
this height, the figure would be a rectangle, the area of which is
the base times the height, that is, ah. The average height of the upper side
is the middle between 0 and h, namely, (1/2)h. The area of the triangle is the
product of the base times the average height, namely, a × (1/2)h, which is (1/2)ah,
which is what we wanted to prove.
In the chapter “To Discover or to Invent” we learned how Gauss calcu-
lated the sum 1 + 2 + 3 + · · · + n. In order to show its likeness to the calcula-
tion of the area of a triangle, I will present the solution somewhat differently.
Let us add 0 to the sequence, and write it as 0 + 1 + 2 + 3 + · · · + n. Adding
0 doesn’t change the sum. The first number is 0, and the last one is n. Since
the numbers grow at a fixed pace, their average is the middle between 0
and n, that is, (1/2)n. After we added 0, the number of terms in the sequence
is n + 1 (before we added 0, there were n terms). The sum of the terms of
the sequence is the number of terms multiplied by the average of the terms
(this average is (1/2)n). That is, the sum is equal to (1/2)(n + 1)n, just like the
calculation of the area of a triangle: “half of the length of the base times the
height.”
Gauss’s method was a bit different. He matched 1 to n, 2 to n − 1, 3 to
n − 2, and so on. The next drawing shows the calculation of the sum of the
numbers from 1 to 6. 1 is joined to 6, 2 to 5, and so on.
The sum of the numbers from 1 to 6 is the number of circles in the lower
triangle. As we see from the drawing, this number is half the number of
circles in the rectangle, that is, half of 6 × 7, namely half of 42, which is 21.
This parallels the classic proof of the formula for the area of a triangle, as
illustrated in the next drawing:
The two triangles, the upper and the lower, complement each other to
form a shape of fixed height. This yields a rectangle, of height h with area
ah. The area of the triangle is half this, that is, (1/2)ah, which is exactly
Gauss’s proof!
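Both ways of computing the sum can be compared with plain addition. The short sketch below does this for the numbers from 1 to 6; the function name `gauss_sum` is just a label for the formula (1/2)(n + 1)n.

```python
def gauss_sum(n):
    """Half the number of circles in a rectangle of n rows and n + 1 columns."""
    return n * (n + 1) // 2

n = 6
# Gauss's matching: 1 with 6, 2 with 5, and so on; every matched couple sums to n + 1.
pairs = [(k, n + 1 - k) for k in range(1, n + 1)]
print(pairs)
print(all(a + b == n + 1 for a, b in pairs))

# The same sum obtained by plain addition.
print(sum(range(1, n + 1)), gauss_sum(n))
```

The list of couples covers every number twice, which is why the total n × (n + 1) has to be halved, just as the rectangle is halved to obtain the triangle.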
Isomorphism
Pushing the “Power” button on a radio changes its condition, from “off”
to “on,” or the opposite. It doesn’t take any child long to learn that the
same holds true for a computer, a television, or a game console. This is
an example of “isomorphism.” That is, structural likeness: two phenomena
sharing the same hidden structure. The radio, the computer, and the tele-
vision are “isomorphic” in terms of their on/off mechanism. We can add a
mathematical example. If we take the numbers 1 and (−1), the operation of
multiplication by (−1) reverses the on/off status, just like pressing an elec-
trical switch: when 1 is multiplied by (−1), it reverses and becomes (−1),
while (−1) multiplied by (−1) becomes 1.
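The likeness between the switch and multiplication by (−1) can be spelled out as a dictionary that translates one system into the other. This is only a sketch; the names `press_power`, `times_minus_one` and `to_number` are invented for the illustration.

```python
def press_power(state):
    """Pressing the button: "on" becomes "off" and "off" becomes "on"."""
    return "off" if state == "on" else "on"

def times_minus_one(x):
    """Multiplication by (-1)."""
    return -x

# The translation between the two systems: "on" is 1, "off" is -1.
to_number = {"on": 1, "off": -1}

for state in ("on", "off"):
    # Pressing and then translating gives the same result as
    # translating and then multiplying by (-1).
    print(state, to_number[press_power(state)], times_minus_one(to_number[state]))
```

That the two routes always agree is what it means for the translation to preserve the structure.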
At a party my daughter asked me what the difference was between the
buffets at the two sides of the hall. Being a mathematician, I answered
“they are isomorphic.” She asked me what this meant, and by way of
an answer I suggested that we play the following game. Two players take
turns picking a number between 1 and 9 that has not yet been selected
by either player. Each player collects the numbers he chose,
and the winner is the first player to have three numbers that total 15. Here is an
example of such a game. Call the players A and B. We record their moves,
as well as the numbers they have collected so far:
The first three steps taken by players A and B, from left to right. In the first round A
chose 5, which is represented by the X in the left-hand drawing. B responded with 7,
which is represented by the O in this board. After the third round, the “X” player has
two possibilities of winning, only one of which can be blocked by the “O” player.
On the face of it, the number-choice game and tic-tac-toe seem differ-
ent, but a proper abstraction shows the identity between the two: they are
isomorphic. I asked my son how he discovered this, and he said that he iden-
tified two similar elements in both games: the threesomes, and the double
threat. In tic-tac-toe, too, the way to victory lies either in the opponent’s
oversight or in a double threat, that is, two threats only one of which can be
blocked.
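One standard way to make the isomorphism explicit is to write the numbers 1 to 9 in a 3 × 3 magic square. The particular square below is an assumption chosen for the illustration, and may not be the arrangement used in the drawings. The sketch checks that the threesomes totalling 15 are exactly the rows, columns and diagonals of the square.

```python
from itertools import combinations

# An assumed arrangement: a magic square, in which every row,
# column and diagonal adds up to 15.
SQUARE = [[2, 7, 6],
          [9, 5, 1],
          [4, 3, 8]]

rows = [set(r) for r in SQUARE]
cols = [set(c) for c in zip(*SQUARE)]
diags = [{SQUARE[i][i] for i in range(3)}, {SQUARE[i][2 - i] for i in range(3)}]
lines = {frozenset(s) for s in rows + cols + diags}

# All threesomes of different numbers between 1 and 9 that total 15.
triples = {frozenset(t) for t in combinations(range(1, 10), 3) if sum(t) == 15}

print(len(lines), len(triples), lines == triples)
```

Since the two collections coincide, collecting three numbers that total 15 is the same as occupying a full line of the square, which is precisely winning at tic-tac-toe.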
“That reminds me” is one of the more common paths leading to mathe-
matical discoveries.
A Magic Number
Strangely, of all the real numbers the Goddess of Mathematics picked two
and assigned them a special role. One of them, π, which is about 3.141, was
already well known in antiquity. This is the ratio between the circumference
of a circle and its diameter, a ratio that already the ancients knew is the
same for all circles. The symbol π was first used in 1706 in a book by William
Jones, as the first letter of the word “perimeter.” The number π naturally
appears in geometric formulas, but, surprisingly enough, also in number
theory. The significance of the second special number, marked as e, with
a value close to 2.718, was discovered only in the seventeenth century. Its
meaning could be comprehended only with tools from differential calculus,
which developed then. Its importance quickly became apparent. The first
to name this number (as b, rather than e) was the German mathematician
and philosopher Gottfried Leibniz (1646–1716). It was named e by the Swiss
Leonhard Euler (1707–1783), the leading mathematician of the eighteenth
century. Contrary to the natural guess, Euler did not pick the first letter of
his own name. He wanted to use a vowel, and since he had already used the
letter a for some other value, he chose the second vowel in the alphabet.
It soon became clear that e is no less important than π. It appears in
varied contexts, and in many fields. In this chapter we shall meet it in four
roles.
Compound interest
a decimal — by 1.1). After one year, A’s account will contain 1.1 times $1000,
that is, $1100. At the end of the second year he will have 1.1 times $1100,
which is $1210. At the end of the third year he will have 1.1 times
$1210, and so forth. After k years, A will have $1000 × 1.1^k. To answer
our question, after 10 years A will have $1000 × 1.1^10, and since 1.1^10 is
approximately 2.59, he will have about $2590.
Customer B is more aggressive than A. He demands (and receives) interest
of 1/20 every half a year. How much will he have after 10 years? Before
answering, think: will B earn more or less than A? The answer is: more.
If it were not for this being compound interest, two half-years of 1/20 interest
would equal one year of 1/10 interest. Since, however, this interest is
compound, the interest for the two half-years comes to more than 1/10, so each
year B will earn more than A. Using a calculation similar to that for A’s
profits, 20 half-years will end with B having $1000 × (1 + 1/20)^20, which is about
2650 dollars.
Customer C is even more assertive. He demands that the 10 years be
divided into 50 parts (50 fifth-years), and that he receive 1/50 interest every
fifth-year. The same calculation reveals that after 10 years the initial $1000
investment will be multiplied by (1 + 1/50)^50, which is about 2.69. If a customer
demands that the 10-year period be divided into 100 parts, and that he
receive interest of 1/100 each tenth-year, his money will be multiplied by
(1 + 1/100)^100, which is about 2.704, so that after 10 years he will have a sum of
approximately $2704 in his account.
The sequence we are looking at is (1 + 1/n)^n. As we saw from the examples,
the terms of the sequence increase as n increases. Yet, the rate of increase
decreases, and therefore the sequence does not tend to infinity, but rather
to a finite number, which is only a bit larger than 2.7. The limit of the
sequence is marked by e, which is approximately 2.718. This value is only
approximate, because e is not a rational number. Bernoulli’s definition is the
most commonly used of all the definitions of e.
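The behavior of the sequence is easy to watch numerically. The sketch below recomputes the accounts of the customers above (10, 20, 50 and 100 interest periods) and then, as an added illustration, two much finer divisions that do not appear in the text. The function name `growth` is invented for the example.

```python
import math

def growth(n):
    """Factor by which the money grows when interest of 1/n is compounded n times."""
    return (1 + 1 / n) ** n

for n in (10, 20, 50, 100, 10_000, 1_000_000):
    print(n, round(1000 * growth(n), 2))

# The limiting value, for comparison.
print("limit", round(1000 * math.e, 2))
```

However finely the 10 years are divided, the initial $1000 never grows beyond $1000 × e, a little over $2718.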
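It is not hard to see that if the numbers (1 + 1/n)^n converge to e, then the
numbers (1 − 1/n)^n converge to 1/e. To see this, write 1 − 1/n = 1/(1 + 1/(n − 1)), so

\[
\left(1 - \frac{1}{n}\right)^{n}
= \frac{1}{\left(1 + \frac{1}{n-1}\right)^{n}}
= \frac{1}{\left(1 + \frac{1}{n-1}\right)^{n-1}} \times \frac{1}{1 + \frac{1}{n-1}}
\]
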
The last term tends to 1, and the one before it to 1/e. There is a gruesome
story related to this calculation, told by the British writer
Graham Greene. As a teenager he suffered from acute depression, and
he played Russian Roulette 6 times. In each attempt he had 1/6 chance
of being killed. What was the probability of his survival? Since at each
attempt he had a 5/6 chance to stay alive, the answer is (5/6)^6, which is 5/6
of 5/6 ... of 5/6 (6 times), namely (1 − 1/6)^6. As we saw, this number is
close to 1/e, namely a bit more than a third. Not much. Greene’s fans should
be relieved, but of course they would never have heard of him had he
succeeded in killing himself.
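The survival probability, and the way the numbers (1 − 1/n)^n approach 1/e, can be checked with a few lines. The larger values of n are added here only to show the trend.

```python
import math

# Six attempts, each survived with probability 5/6.
survival = (5 / 6) ** 6
print(round(survival, 4), round(1 / math.e, 4))

# The same expression, (1 - 1/n)^n, for growing n.
for n in (6, 60, 600, 6000):
    print(n, round((1 - 1 / n) ** n, 4))
```

For n = 6 the value is still somewhat below 1/e; the agreement improves steadily as n grows.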
Secret friend
Geometric mean
Example:
This is also called “arithmetic mean.” There are also other kinds of averages,
the best known of which is the geometric mean. The geometric mean of a
set of numbers is the number that will leave the product unchanged if it is
substituted for each of the numbers in the set.
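The definition translates directly into a computation: if there are k numbers, the geometric mean is the k-th root of their product. The numbers 2, 4 and 8 below are hypothetical data chosen for the sketch, and `geometric_mean` is an invented name.

```python
import math

def geometric_mean(numbers):
    """The value g for which replacing every number by g leaves the product unchanged."""
    return math.prod(numbers) ** (1 / len(numbers))

values = [2, 4, 8]                     # hypothetical data
g = geometric_mean(values)

# Replacing each of the three numbers by g gives (up to rounding) the same product.
print(g, g ** len(values), math.prod(values))

# For comparison, the arithmetic mean of the same numbers.
print(sum(values) / len(values))
```

Here the product is 64 and the geometric mean is 4 (up to rounding in the computer's arithmetic), since 4 × 4 × 4 = 64.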
Example:
A differential equation
A ladybug is standing in the Cartesian plane (see the “Poetic Image, Mathe-
matical Image” chapter), at the point (0, 1), that is, on the y axis, at a height
of 1. It begins to move to the right, along a line with a slope of 1, that is, at
an angle of 45 degrees with the axes, which means that for each unit that
it moves to the right it will go up one unit. As the ladybug advances, it
changes the angle of its movement — it always moves at an incline exactly
equal to its height above the x axis. For example, when it reaches the height
of 2 above the x axis, the incline of its movement will be 2, meaning that it
goes up twice as fast as it moves to the right.
The ladybug is moving to the right and up. The ratio between its progress upwards and
its movement to the right is increasing: it is equal to the ladybug’s height above the x
axis. It can be shown that when the ladybug has advanced x units to the right, its height
is e^x.
Obviously, the ladybug will climb very quickly: the further it rises, the
greater the rate of its ascent. The question is: what is the formula for the
curve that it follows? That is, after it has moved x units to the right, how
high will it be? The formula for this curve is y = e^x. In other words, after
the ladybug has moved x units to the right, it will be at height e^x. For
example, after moving a single unit to the right, it will be at height e; after
two units, it will be at height e^2. In terms taken from differential calculus, the
function e^x possesses a unique trait, that it equals its derivative (the derivative of a
function is its rate of change). This is the special property of the number
e, from which all its other characteristics are derived. The function e^x does
indeed rise very rapidly: for example, after 10 units of movement to the
right, the ladybug’s height, e^10, will be approximately 22,000 units.
What I have just described is called a “differential equation.” Such an
equation prescribes the behavior of a curve. It tells what the slope of the
curve is, that is, the rate of increase of the value y as we advance along
the curve, at every point on it. The differential equation above says: If this
curve is described by the equation y = f(x), then the function f(x) is equal
to its derivative. As a formula, this becomes f'(x) = f(x). (The derivative
of a function f(x) is denoted by f'(x).) The simplest differential equation is
f'(x) = 0, which means: the derivative is 0, or, in other words, the rate of
change of the function is 0 (more simply put, the function does not change).
The solution of this equation is a constant function f(x) = c, where c is any
fixed number.
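The ladybug's rule can be tried out numerically. The following is only a rough sketch, assuming the ladybug starts at height 1 (where its slope is 1, that is, 45 degrees) and advances in many tiny straight steps; the function name and the number of steps are illustrative choices, not part of the original text.

```python
import math

def ladybug_height(distance, steps=100_000):
    """Approximate height after moving `distance` units to the right."""
    height = 1.0                 # assumed start: slope 1, i.e. 45 degrees
    dx = distance / steps        # length of one tiny horizontal step
    for _ in range(steps):
        height += height * dx    # the slope always equals the current height
    return height

# Compare the step-by-step walk with the formula y = e^x.
for x in (1, 2, 10):
    print(x, ladybug_height(x), math.exp(x))
```

The two columns agree more and more closely as the number of steps grows, which is one way to see why the curve is y = e^x.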
Leonhard Euler (1707–1783), the greatest mathematician of the eighteenth century, and
one of the most prolific mathematicians of all time. The blindness from which he suffered
for the last seventeen years of his life did not curb his creativity.
Reality or Imagination
Some numbers look real, and others less so. Some appear as if they are part
of the real world, while others seem to be an arbitrary invention. The most
natural are the numbers that are indeed called — “natural”: 1, 2, 3, . . . .
They are visible and tangible. Four apples are concrete reality. Fractions, too,
never suffered from a lack of faith in their existence, since a half or a third of
something too can be seen and felt. This is not to say that they reached their
present form smoothly. The Egyptians knew only fractions with a numerator
of 1, such as 1/2, 1/3 or 1/4. The Romans had no special symbols for fractions,
only verbal descriptions. The fraction sign that we know was invented by
the Indians, and was brought to Europe only in the twelfth century by the
Arabs. But despite their difficult birth, the existence of fractions was never
doubted.
In contrast, there are some numbers that were, and still are, suspected of
being fictitious. It seems that they are no more real than the unicorn, and
that the mathematicians who study them are playing make-believe. This was
the fate of the number 0, which, at least on the face of it, has no object,
that is, it doesn’t count anything. In Europe it gained respectability only in
the twelfth century.
Something similar happened to the negative numbers. When present day
elementary school children recite a descending numerical sequence, say, 9, 7,
5, 3, 1, there are usually two or three children in the class who would continue
-1, -3, -5. This just goes to show how natural the concept of negative numbers
is today. Children pick it up as if from the air. It is hard to believe that this
idea was still ferociously attacked in the nineteenth century. The famous
French mathematician Lazare Carnot (1753–1823) argued that “in order to
think about a negative quantity, something must be subtracted from 0, that
is, from something that does not exist — which is impossible.” The author
of a mathematical handbook, Busset, argued that the root of the problem in
mathematical education in France was teaching negative numbers. He wrote
that “thinking about quantities smaller than zero is the height of insanity.”
In 1831 Augustus De Morgan, an important English mathematician, wrote
that the appearance of negative numbers in real-life problems is just an
indication of incorrect formulation. If we ask how much a store owner earned,
and the answer comes out to be −10, it means only that we should have
asked: “How much did he lose?”, and then the answer would be the positive
number 10. If we were to ask, “In how many years will Robert be twice as
old as Sherman?”, and we get −3, we should have asked: “How many years
ago was Robert twice as old as Sherman?”, and then the answer would have
been a positive number: “3 years ago.” Today, the idea that “a profit of −10
dollars” means a loss of 10 dollars, or that “in −3 years” is simply 3 years
ago, is almost self-evident.
Other numbers were thought to be even more imaginary. They were even
called this: “imaginary numbers.” Today, these numbers have similar stand-
ing to the real numbers, but are still cloaked in an aura of mystery. Indeed,
there is something magical about them.
The need for new types of numbers usually arises in the solution of
equations. Negative numbers are necessary to solve equations such as 5 + x = 2
(whose solution is x = −3); rational numbers were introduced to solve
equations such as 3x = 2 (whose solution is x = 2/3). Irrational numbers first
appeared in order to solve equations like x^2 = 2, that is, in order to be
able to refer to √2. The next class of numbers, the imaginary numbers, also
can be understood in this manner: they are needed to solve equations of the
form x^2 = −9. The square of a real number cannot be negative, so there
is no real number satisfying this equation. For, 3^2 = 3 × 3 = 9, and also
(−3)^2 = (−3) × (−3) = 9. The discoverer of imaginary numbers (for it was
a discovery: a notion was waiting there to be discovered) was the Italian
Rafael Bombelli (1526–1572). He spoke of a number whose square is (−1),
that is, a number that is √(−1), but he did not assign it a name. Nor did
he call it an “imaginary number,” a name derisively given it by Descartes.
The accepted letter for √(−1), which is i (the first letter of “imaginary” or
the Latin “imaginarius”), was given it by Euler in 1777. Once we have this
number it is possible to solve equations such as x^2 = −9: the two solutions
are 3i and −3i.
In order to solve equations that are a bit more involved, such as
x^2 + x + 9 = 0, we need combinations of i and real numbers like 5 + 3i or
√2 + (2/3)i, and in general, a + bi, with a and b being real numbers. Such a
number is called “complex,” because it is composed of two parts, one real and
the other imaginary.
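The two equations mentioned here, x^2 = −9 and x^2 + x + 9 = 0, can be solved with the usual quadratic formula once square roots of negative numbers are allowed. The sketch below assumes Python's standard cmath module for the complex square root; the function name is an illustrative choice.

```python
import cmath

def solve_quadratic(a, b, c):
    """Return both complex solutions of a*x^2 + b*x + c = 0 (a must not be 0)."""
    root = cmath.sqrt(b * b - 4 * a * c)   # defined even when negative
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

print(solve_quadratic(1, 0, 9))   # x^2 = -9, rewritten as x^2 + 0x + 9 = 0
print(solve_quadratic(1, 1, 9))   # x^2 + x + 9 = 0
```

Each solution comes out in the form a + bi, with a real part and an imaginary part.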
New types of numbers were defined one after the other, and in each phase an
additional number type was needed for the solution of equations. Is there an
end to this process, or will it continue indefinitely? Fortunately, the complex
numbers are, indeed, the Promised Land. There is no need for further number
types to solve new equations. This fact, which is known as “the Fundamental
Theorem of Algebra,” was proved by the 22-year-old Gauss in 1799. The
theorem states that the complex numbers are sufficient to solve any equation
of the type known as “polynomial,” such as
x^5 + 4ix^4 − 3x^3 + 10x + 1 + i = 0.
Gauss showed that every such equation has a complex solution, and that if
the equation is of the nth order, namely the highest exponent of the unknown
is n, then it has n solutions, provided that a repeated solution is counted
as many times as it repeats (the equation in the example is fifth-order,
since 5 is the highest power of x in it).
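One can look for the five solutions of the example equation numerically. The sketch below uses the Durand–Kerner iteration, which is a technique chosen here for illustration and has nothing to do with Gauss's proof; the starting guesses and the number of iterations are conventional assumptions.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial by Horner's rule (coeffs: highest power first)."""
    result = 0j
    for c in coeffs:
        result = result * x + c
    return result

def durand_kerner(coeffs, iterations=500):
    """Approximate all complex roots of the polynomial at once."""
    n = len(coeffs) - 1
    monic = [c / coeffs[0] for c in coeffs]        # make leading coefficient 1
    roots = [(0.4 + 0.9j) ** k for k in range(n)]  # customary starting guesses
    for _ in range(iterations):
        for i in range(n):
            denom = 1 + 0j
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= poly_eval(monic, roots[i]) / denom
    return roots

# x^5 + 4i x^4 - 3x^3 + 0x^2 + 10x + (1 + i)
COEFFS = [1, 4j, -3, 0, 10, 1 + 1j]
roots = durand_kerner(COEFFS)
for r in roots:
    print(r, abs(poly_eval(COEFFS, r)))   # second column should be near 0
```

A fifth-order equation yields five approximate solutions, in line with the theorem.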
Actually, this is true for more than polynomial equations. Almost every
equation has a solution. This is Picard’s Theorem, which states that even if
one were to concoct some wild equation that uses regular operations (such as
multiplication, division, raising to a power, trigonometric functions), chances
are that it has a complex-number solution. “Chances are” means that the
equation has a solution for any value in its right-hand side, apart possibly
from one single value. For example, there is a solution for the equation
2^x = 3 + i, as there will be for any other number placed in the right-hand
side in place of 3 + i, except for a single number: 0. There is no solution
for the equation 2^x = 0. That “almost every equation has a solution” is quite
surprising. As far as the solution of equations is concerned, complex numbers
are truly the last word. It is not that mathematicians stopped inventing
new classes of numbers. But the new classes are not there for solving
equations.
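The example equation 2^x = 3 + i can be solved explicitly with a complex logarithm. This is a small sketch assuming Python's standard cmath module; the function name is illustrative, and it returns only one of the infinitely many solutions.

```python
import cmath
import math

def solve_power_of_two(target):
    """Return one complex x with 2**x == target (target must not be 0)."""
    return cmath.log(target) / math.log(2)

x = solve_power_of_two(3 + 1j)
print(x, 2 ** x)   # the second value should be very close to 3 + 1j
```

The same recipe works for every right-hand side except 0, where the logarithm is undefined, matching the single exception named in the text.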
Already the ancient Greeks spoke of the need to invent numbers with neg-
ative squares, but they did not devise any special letter for such numbers,
nor did they give them the official status of “numbers.” The story of the
and sin(π) = 0. The trigonometric functions cos(x) and sin(x) are the basic
tools used to describe waves. This explains why the formula is so useful in
physics and in electrical engineering, where waves play a central role. Euler’s
formula states that when applied to the complex numbers, the trigonometric
functions and raising to a power are almost the same.
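Euler's formula, in its standard form e^(ix) = cos(x) + i·sin(x), can be checked numerically. The sketch below assumes Python's standard math and cmath modules; the helper name is an illustrative choice.

```python
import cmath
import math

def euler_gap(x):
    """Return |e^(ix) - (cos x + i sin x)|, which should be about 0."""
    return abs(cmath.exp(1j * x) - complex(math.cos(x), math.sin(x)))

for x in (0.0, 1.0, math.pi / 2, math.pi):
    print(x, euler_gap(x))

# With x = pi, cos(pi) = -1 and sin(pi) = 0, so e^(i*pi) is close to -1.
print(cmath.exp(1j * math.pi))
```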
Unexpected Combinations
Eureka
the English physician Edward Jenner about a century earlier. Jenner heard
from villagers that humans who had been infected by cowpox were immune
to human smallpox, and recommended that humans be intentionally infected
with cowpox. He knew nothing about the mechanism behind this process,
and didn’t dream about such creatures as germs. Pasteur understood this
mechanism in a flash, linking Jenner’s procedure with what had happened
to his chickens.
Many scientists attest that they made their discoveries in
this same way. Michael Atiyah, a great twentieth-century mathematician,
relates that he would talk with people about their work, hear ideas from all
different directions, and often things would connect with the problems that
occupied him at that particular time. Richard Feynman explained how easy
it is to become a genius. “It is easy,” he testified. “I just roll in my brain the
problem with which I am concerned at the time, until something comes and
connects to it.”
In a way, this is how the most famous conjecture in mathematics, named
after Fermat, was solved. The conjecture (now theorem) stated that for any
integer n larger than 2, there are no integers x, y, z, all different from 0,
for which x^n + y^n = z^n. This was also known as “Fermat’s Last Theorem,”
since Fermat claimed that he had proved it. He used to write down his
discoveries in the margins of his copy of Diophantus’s Arithmetica. He wrote
this conjecture in the margin, and next to it: “I have a wonderful proof,
but there isn’t enough room to write it.” The conjecture remained open
for some 350 years, and tortured generations of mathematicians with the
mocking simplicity of its formulation. A decisive step in the direction of
its solution was the discovery of an unexpected connection with a conjecture
that was formulated in the 1950s, and that seemed completely unrelated, the
Taniyama-Shimura Conjecture. When the relevant case of the latter was
proved in 1995 by the Englishman Andrew Wiles (with a proof that was too
long for the margins of any book), Fermat’s Conjecture was proved together
with it.
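The statement is simple enough to test by brute force on small numbers. The sketch below proves nothing, of course; it only searches a finite range, and the function name and search limits are illustrative choices.

```python
def fermat_solutions(n, limit):
    """List all (x, y, z) with 1 <= x <= y <= limit and x**n + y**n == z**n."""
    # z can never reach 2 * limit, so this table covers every candidate.
    powers = {z ** n: z for z in range(1, 2 * limit + 1)}
    found = []
    for x in range(1, limit + 1):
        for y in range(x, limit + 1):
            z = powers.get(x ** n + y ** n)
            if z is not None:
                found.append((x, y, z))
    return found

print(fermat_solutions(2, 20))    # n = 2: plenty of solutions, such as (3, 4, 5)
print(fermat_solutions(3, 100))   # n = 3: the search comes back empty
```

For n = 2 the search finds the familiar Pythagorean triples; for larger n it finds nothing, as the theorem says it must.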
Koestler’s claim that every discovery is the result of joining together
previous ideas is probably an exaggeration. A completely new idea, unrelated
to known conceptions, has to appear every once in a while. But we cannot
deny that this is one of the richest sources of beauty.
The storm joins the playing of music and the frothy river that nothing can
withstand, and we realize that even a beloved one can be played. The poem
then moves on to combinations such as “warring street, dripping raspberry
syrup,” “cities of trade, painful and deaf.”
Unexpected combinations means bringing together distant patterns, and
finding likeness between them. We saw that, in the eyes of many poets, this
is the main characteristic of poetry. The poetical combination can be odd,
but it is never arbitrary. It uncovers a true similarity between two patterns.
Indeed, there is a likeness between opening one’s eyes and the unfolding of
the way as one walks. “Warring street,” or “street dripping raspberry” are
apt combinations, not because a street can fight or drip blood, but because
the lover fights a lost battle, and he drips pain. In terms of the external world,
there isn’t much logic there; in terms of inner meanings there certainly is.
Harnessed together
Deuteronomy 22:10
There is a term in Greek rhetoric known as “syllepsis,” meaning joining far
apart elements. Nathan Alterman was a master of such combinations.
Ask a mathematician to define his profession, and chances are that he will
stutter. A physicist can say what it is that he studies, but a mathematician,
even after long years of research, will find it hard to define his occupation.
One of the common definitions of mathematics relates to its subject matter:
“the science of the number and shape.” In other words, numbers and geom-
etry. There is much truth to this. Almost every modern mathematical field
developed from one of these two fields, and almost every mathematical topic
has a geometric or numerical aspect. And strangely, the more mathematics
advances, the harder it becomes to separate the two: geometry contributes
to numbers, and numbers appear in geometry. But the reservation “almost”
is unavoidable: numbers or geometry appear in only almost every mathe-
matical field, not in all. For example, mathematical logic (which will be the
subject of a later chapter) touches upon neither numbers nor geometry. Or
take the ants puzzle in the first part: the concepts it used, “collision” and
“change of direction,” were not numerical, and were hardly shape-related.
Abstraction
If so, then what is special about mathematics? To answer, let me tell a story
from elementary school. When I go there I sometimes ask first-graders: how
many are 2 pencils and 3 pencils? The children learned that addition means
joining, so they join 2 pencils to 3 pencils, and find that they have 5 pencils.
Now I ask: how many are 2 erasers and 3 erasers? They immediately answer,
“5 erasers.”
The children laugh. But my question is serious. Behind it lies the main
strength of mathematics: generality. Mathematics strips a situation of its
secondary details, and leaves the gist. In this case, the gist is that three
objects and two objects are five, regardless of their nature or arrangement
in space. Of course, we make abstractions all the time, and in every field of
thought. What is special about mathematics is that it takes abstraction to
an extreme, applying it to the most basic thought processes.
The classic example is the concept of number. Numbers were born by
the abstraction of the most fundamental thought process: the division of the
world into objects — separating the world into units and assigning them
names: “apple,” “family,” “state.” Counting means repeating the same unit
many times: “2 apples,” “3 apples,” “4 apples”. . . .
Frege
The vast majority of our imports come from outside the country.
George W. Bush
Yogi Berra
Anonymous mathematician
In 1820 Gauss suggested that in order to communicate with intelligent inhab-
itants of the moon, if there are such, forests should be cleared in Siberia to
create a picture of the Pythagorean theorem (the same drawing that appears
in the chapter “Mathematical Harmonies”). This theorem is valid everywhere
in the universe, it was always there, and Pythagoras only had to discover it.
In other words, its information is hidden in its assumptions. Seemingly, it
does not really contain any new information, namely, it is a tautology. This
is what Wittgenstein claimed in the above quotation.
“Tautologia” in Greek means “an identical word.” It is an empty state-
ment that contains no new information. “Water is wet,” for example. A
“tautological argument” in logic means an argument that is always true,
such as: “If today is Tuesday, then today is Tuesday.” By what we saw
You are guilty and you, too, are guilty, the judge
ruled for the accused. Man is only a man,
the doctor explained to the stunned relatives.
Amusing? Perhaps. But it is also hard to miss the despair. The heroes of
the poem face the empty statements dumbfounded, and the message is that
man’s very existence is filled with despair; that nothing specific needs to be
known in order to realize the finality of life.
Haiku poems
Haiku is a strict Japanese poetic style, with exactly 17 syllables (in the
original). Haiku poems frequently depict a nature scene that relates to
a specific season. The Japanese poetry scholar R. H. Blyth said that
haiku poems are always tautological. They contain no new informa-
tion. Haiku poems are more about what does not happen than about
what does:
Darkening sea
voices of the wild ducks
faint whitely away.
Matsuo Basho, 1644–1694
Externally, the ducks have gone. Internally, something has happened: their
voices left their mark upon the listener. The minimalism of the outside action
leaves a place for inner action.
Arriving alone
to visit someone alone
in the autumn dusk.
Yosa Buson, 1715–1783
The wording “Arriving alone to visit someone alone” is almost a tautology.
But who visits whom is not important — what is significant is what happens
within oneself, and the dusky autumn sensation that arises upon reading the
poem.
Dawn rises
the storm is buried
in shrouds of snow.
Edo Watsujin, 1758–1836
Economy by symmetry
In a schoolbook that I had as a child there was a story about a king who
assigned to two artists the task of painting the royal chamber, one side of
the chamber to each. One artist labored for months, while the other just sat
idly. On the last day, after the first artist had completed his work, the other
artist placed mirrors on his side. The king came, and was greatly impressed
by the paintings on the first side. When he came to the other side, the second
artist showed him that his side had just as fine paintings. The king paid the
first artist his wages, and said to the other: “Here, you see the bills in the
mirror? You can take them on your side.”
Mathematicians are very fond of the second artist’s stratagem. Mathe-
matics never spares effort in order to spare effort, and it often, and very
successfully, uses symmetry for this purpose. There is no need to work twice
when a single time will do. Here is a famous example:
Two people are situated on the same side of stream L, one at point A and the
other at point B. This is a mathematical stream, namely, it is a straight line.
The person at B is dying of thirst, and the person at A wants to bring him
water from the stream, as quickly as possible. He has to go to L, and from there
to B. What route will he choose, so that he will have the shortest distance to
go? That is, what is the shortest (broken) line that touches L and connects A
with B?
The shortest broken line from A to B is obtained from the straight line from A to B′,
the reflection of B in the stream, by reflecting back the part of that line that lies
beyond the stream.
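The reflection argument translates directly into a computation. The sketch below assumes the stream L is the x-axis and that A and B are given as (x, y) points with positive y; the function names and coordinates are illustrative choices.

```python
import math

def shortest_route(a, b):
    """Return (touch_x, length): where to touch the stream, and the route length."""
    ax, ay = a
    bx, by = b
    # Reflect B across the stream to B' = (bx, -by). The straight segment
    # from A to B' crosses the x-axis at the best touching point.
    touch_x = ax + (bx - ax) * ay / (ay + by)
    length = math.hypot(bx - ax, ay + by)      # distance from A to B'
    return touch_x, length

def route_length(a, b, touch_x):
    """Length of the broken line A -> (touch_x, 0) -> B."""
    return math.hypot(a[0] - touch_x, a[1]) + math.hypot(b[0] - touch_x, b[1])

A, B = (0.0, 1.0), (4.0, 2.0)
touch_x, length = shortest_route(A, B)
print(touch_x, length, route_length(A, B, touch_x))
```

Trying other touching points with route_length never gives a shorter path than the one found by reflection.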
Here is a game for two players. The players sit at a round table, and each
has an unlimited quantity of same size coins. Each in turn places a single
coin in an empty spot. The winner is the player who places the last coin,
that is, after his move there is no room left for the other player to place a
coin. Which player can assure a victory?
The answer is that the opening-move player has a strategy for winning.
He places a coin exactly in the center of the table. After this, he adopts
a mirror (or monkey) strategy: for every coin that the other player places
anywhere, he places a coin symmetrically, on the opposite side. This strategy
ensures that after each move the coin situation will be symmetrical, that is,
every coin has a matching coin opposite it. This guarantees the first player
a win, because he always can make a return move: since the coin placement
is symmetrical, if there is a spot open for the second player, there will also
be an (exactly opposite) opening for the first.
The first player places a coin in the center. Now, for every move by his opponent, he will
place a coin exactly opposite.
The second player cannot use the mirror strategy, because of the unique
opening move of the first player, that cannot be mirrored: there is no sym-
metrical response to placing a coin in the center. By the way, the table
doesn’t have to be circular. This same strategy works for a rectangular or
elliptical table, because they, too, have a center.
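The mirror strategy can be watched in action in a small simulation. This is only a sketch under stated assumptions: the table is a disc, coins are discs that may not overlap or hang over the edge, and the second player places coins at random spots, giving up after a fixed number of failed attempts (a crude stand-in for "no room left"). All names and sizes are illustrative.

```python
import math
import random

def mirror_strategy_holds(table_radius=6.0, coin_radius=1.0, seed=0, tries=500):
    """Play one game; return True if the first player's mirrored reply
    was legal after every move of the second player."""
    rng = random.Random(seed)
    coins = [(0.0, 0.0)]                      # opening move: the center

    def legal(p):
        inside = math.hypot(p[0], p[1]) <= table_radius - coin_radius
        clear = all(math.hypot(p[0] - q[0], p[1] - q[1]) >= 2 * coin_radius
                    for q in coins)
        return inside and clear

    while True:
        # Second player: look for a random legal spot.
        for _ in range(tries):
            p = (rng.uniform(-table_radius, table_radius),
                 rng.uniform(-table_radius, table_radius))
            if legal(p):
                break
        else:
            return True                       # second player found no spot
        coins.append(p)
        reply = (-p[0], -p[1])                # first player: exactly opposite
        if not legal(reply):
            return False
        coins.append(reply)

print(mirror_strategy_holds())
```

In every simulated game the opposite spot is free whenever the second player has just found a spot, which is the symmetry argument in miniature.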
Dido’s problem
A shepherd is given a rope, and is told that he can use it to fence off an area
as he wishes. Which shape should he choose, in order to fence off as large an
area as possible?
of two adjoining sides is half the rectangle’s perimeter, which is 2L. Assume
that the length of one of the two adjoining sides is L + X. Since the sum of
the lengths of the adjoining sides is 2L, the length of the other side is L − X
(since L + X + L − X = 2L). The area of the rectangle is the product of the
lengths of the adjoining sides, which in this instance is (L + X) × (L − X).
When we open the parentheses, we obtain (L + X) × (L − X) = L^2 − X^2.
But X^2 cannot be negative. Accordingly, L^2 − X^2 does not exceed L^2, which
is the area of the square with the same perimeter (each of whose sides is L
long). The smaller X is (that is, the closer the sides’ lengths are to each
other), the larger the area of the rectangle.
The square and the rectangle in the picture have the same perimeter (4L). The area of
the square is greater.
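The identity (L + X) × (L − X) = L^2 − X^2 is easy to see in numbers. The short sketch below fixes the perimeter at 4L with L = 5, an arbitrary illustrative value, and lets X vary.

```python
def rectangle_area(L, X):
    """Area of the rectangle with adjoining sides L + X and L - X."""
    return (L + X) * (L - X)

L = 5.0
for X in (0.0, 1.0, 2.5, 4.0):
    # The two computed columns agree; the area shrinks as X grows.
    print(X, rectangle_area(L, X), L ** 2 - X ** 2)
```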
Assume now that the shepherd still has to choose a rectangle, but he
can use the bank of a stream as one of the boundaries of his pen (just as
in Dido’s Problem). As usual, we will assume that this is a mathematical
stream, that is, a straight line. What shape should the shepherd choose for
his rectangle? Here, too, formulas could be used, but it is simplest to rely
on symmetry. The shepherd’s area is mirrored across the stream bank line.
The combination of the original rectangle with the reflected one is the
rectangular area that does not make use of the stream. Its entire perimeter
is a rope (albeit one that is partially imaginary) twice the length of the
shepherd’s rope. Consequently, the new (doubled) rectangle’s perimeter is
fixed; and as we already know, its area is maximal if it is a square. Since
the area of the shepherd’s original rectangle is half of the overall rectangular
area, the shepherd would do well to choose half a square.
The shepherd wants to use his rope to build a rectangle to the left of the stream, with
the stream itself being one side. The reflection of this rectangle (the dotted line) yields a
rectangle with a perimeter twice as long as the rope, and an area twice that of the
original rectangle. Of all the rectangles with a perimeter twice the length of the rope, a
square has the largest area. Therefore, the shepherd should choose a rectangle that is
half a square.
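The conclusion can also be checked by trying many rectangles. The sketch below assumes a rope of length 100 (an arbitrary illustrative value) and calls "depth" the length of each of the two sides perpendicular to the stream; the names are hypothetical.

```python
def pen_area(rope, depth):
    """Area of a pen whose two sides perpendicular to the stream have length
    `depth`; the rest of the rope runs parallel to the stream."""
    return depth * (rope - 2 * depth)

def best_depth(rope, samples=10_000):
    """Try evenly spaced depths between 0 and rope / 2; return the best one."""
    candidates = [rope / 2 * k / samples for k in range(samples + 1)]
    return max(candidates, key=lambda d: pen_area(rope, d))

rope = 100.0
d = best_depth(rope)
print(d, rope - 2 * d, pen_area(rope, d))
```

The best pen comes out about 25 deep and 50 long, that is, half of a 50-by-50 square.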
Of all the shapes with a given perimeter, the circle has the greatest area.
It is best for the shepherd (like Dido) to choose a semicircle. To show this,
assume that the shepherd used his rope and the line L to mark off a pen of
maximal area. A and B will be the points of contact between the rope and
L (see the drawing).
If the rope bounds a maximal area with the line L, then every point X on the rope “sees”
the segment AB at a right angle.
Let us assume that there is a point X on the curve for which the angle
AXB is not a right angle (that is, not 90°). Now modify the curve by changing
this angle to 90 degrees, without changing the two arcs over AX and over
BX (see the next drawing). A triangle with two sides of given lengths has the
largest area when the angle between those sides is a right angle, so the area
is now larger, in contrast with the assumption that the curve bounds with L
a maximal area.
The right-hand picture is derived from the left-hand one by closing the span between the
sides XA and XB, so as to make them perpendicular. The domes C and D are preserved.
The length of the curve (the rope) is not thereby changed. The total area increased,
because the area of the triangle increased, while the area of the two domes remained as it
was.
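The key step, that a triangle with two fixed sides is largest when the angle between them is a right angle, follows from the area formula (1/2)·a·b·sin(angle). The sketch below checks it for sides 3 and 4, which are arbitrary illustrative values.

```python
import math

def triangle_area(side_a, side_b, angle_degrees):
    """Area of a triangle from two sides and the angle between them."""
    return 0.5 * side_a * side_b * math.sin(math.radians(angle_degrees))

# Try every whole number of degrees and keep the angle with the largest area.
best = max(range(1, 180), key=lambda deg: triangle_area(3.0, 4.0, deg))
print(best, triangle_area(3.0, 4.0, best))
```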
Having worked hard to solve the riverbank case, we can use this to solve
our original problem, of the shepherd who does not have a stream that can
bound part of his area. In order to show that it is best for him to mark off
a circular area with his rope, we will assume that he chose an area of some
other shape. Take two points on the rope, A and B, that divide the rope
(that forms a closed shape) into two parts of equal length (see the left-hand
drawing):
The two shapes have the same perimeter. The circle on the right was obtained by
replacing the upper and lower parts (both of which are between A and B) in the
left-hand shape by semicircles. According to what we already proved, this increases the
area, and therefore the area of the circle is larger than that of the left-hand shape. This
is the isoperimetric inequality.
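The isoperimetric inequality can be illustrated by comparing shapes that share one perimeter. The sketch below uses the standard area formulas for a regular polygon and a circle; the perimeter of 100 and the function names are illustrative choices.

```python
import math

def regular_polygon_area(sides, perimeter):
    """Area of a regular polygon with the given number of sides and perimeter."""
    return perimeter ** 2 / (4 * sides * math.tan(math.pi / sides))

def circle_area(perimeter):
    """Area of the circle whose circumference equals `perimeter`."""
    return perimeter ** 2 / (4 * math.pi)

P = 100.0
for n in (3, 4, 6, 17, 100):
    print(n, regular_polygon_area(n, P))
print("circle", circle_area(P))
```

The areas grow with the number of sides but always stay below that of the circle.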
Replacing both the bottom and top parts of the shape with semicircles
will increase its area. To see this, draw a line between A and B, and note
that by the riverbank case, each part of the shape cut by this line increases
Symmetry in poetry
Once again, the outside is reflected within: the lukewarm water reminds the
old man of his lukewarm blood and his lukewarm life. In the last line there
is mirror reflection, but of opposites: the summer without is reflected as
autumn within.
In Nathan Alterman’s powerful poem “The Foundling,” inner reality is
the mirror reflection of the external one. Externally, the mother abandons
her baby; her inner truth is that he has abandoned her. The incongruity
between the picture and its reflection in the mirror only grows throughout
the poem. The symmetry of roles is joined by a temporal symmetry: the end,
the mother’s death, is a mirror image of its beginning, the son’s birth; and
her shrouds are a mirror image of his infant clothes. Nathan Zach, the other
great “Nathan” of Hebrew poetry, who knew Alterman well, testified that
the poem painfully reflects Alterman’s relationship with his parents. I have
selected three stanzas from this long poem:
[. . . ]
One of the greatest contributions of the Greeks to geometry, along with the
ideas of “theorem,” “proof,” and “axiom,” was the concept of construction
of geometric objects using only a ruler and compass. The ruler used for
such constructions has no markers of length, and therefore cannot be used
to measure distance, but only to draw straight lines between given points.
A compass is used to draw circles, and to draw equal segments (a segment
is a finite part of a line). While lengths cannot be measured using only ruler
and compass, lengths of segments can be compared, and a segment equal in
length to a given segment can be marked off on a line.
These do not seem to form an impressive arsenal of tools, but this is
misleading. In actuality, much can be accomplished using these two simple
devices. A parallel to a given line can be drawn, through a given point; a
perpendicular to a line can be drawn from a given point on or outside it; an
angle can be bisected; and a segment can be divided into any finite number
of equal parts. It is possible to construct a regular hexagon (with equal sides
and equal angles), a regular octagon, and, as Gauss proved at the age of 19,
even a regular 17-sided polygon. Gauss was so proud of his discovery, which
was almost certainly the first significant geometric construction since the
time of the Greeks, that it led him to prefer mathematics to philology, his
other academic interest. He asked that a 17-sided polygon be inscribed on his
tombstone. The tombstone mason refused, claiming that it was impossible
to tell the difference between this shape and a circle. This injustice was
corrected fifty years later, on a monument erected in memory of the great
man.
When we speak of geometric constructions, we mainly think of finding
points, or of forming shapes. But there is another type of construction, that
of lengths. We are given a certain segment, that serves as the “measuring
rod,” namely, it is arbitrarily defined as 1 unit long. We then want to build
a segment of length 2 (that is, double the length of the given segment), or of
1/2 (half of the given segment), and so on. It is easy to multiply the length
of the segment by a whole number (by simply adding more and more copies
of the segment, one next to the other), and it is not difficult to divide a
segment into a whole number of equal parts. So segments of any rational
length can be constructed (in order, for example, to construct a segment of
length 3/5, multiply the segment by 3, and then divide the result into 5 equal
parts). Euclid, in the fourth century BC, already knew how to geometrically
extract a root. That is, given a segment of length a, he could also construct a
segment of length √a. This is done, as in the drawing below, by constructing
a right-angled triangle with a hypotenuse of length a, and such that the
projection of one of its sides on the hypotenuse is of length 1. In the drawing,
the length of the projected side is labeled x, and the length of this side
is √a, that is, x = √a. Why? The triangles ACB and ADC have equal
corresponding angles. This implies that they are similar, meaning that the
ratios between their sides are equal. So, AD/AC = AC/AB or, in other words,
1/x = x/a, which means that x^2 = a.
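The construction can be reproduced with coordinates. The sketch below assumes A = (0, 0), B = (a, 0) and D = (1, 0), places C directly above D on the circle whose diameter is AB (so that the angle at C is a right angle), and requires a to be at least 1; the function name is illustrative.

```python
import math

def constructed_root(a):
    """Length of AC in the right triangle ABC with hypotenuse AB = a,
    where the projection AD of AC on the hypotenuse has length 1."""
    center = a / 2                     # midpoint of AB; also the radius
    height = math.sqrt(center ** 2 - (1 - center) ** 2)   # length of CD
    return math.hypot(1.0, height)     # AC, by the Pythagorean theorem

for a in (2.0, 9.0, 10.5):
    print(a, constructed_root(a), math.sqrt(a))
```

The constructed length and the square root computed directly agree, as the similar-triangles argument predicts.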
The power of the ruler and compass is indeed surprising. But three construc-
tion problems remained unsolved, and frustrated the efforts of both pro-
fessional and amateur mathematicians for more than two thousand years.
They withstood vigorous assaults first by the Greeks, and then by the
(1) The most famous of the three was squaring the circle: given a circle
of a certain radius (say, one unit), construct, with the aid of a ruler and
compass, a square with the same area as that of the circle. In a second
formulation: find a segment whose length equals the circumference of the
circle. The first to tackle the problem was Anaxagoras (499–428 BCE).
Aristophanes derided the circle squarers in his play The Birds, and ever
since “squaring the circle” has been synonymous with attempting to
achieve the impossible.
(2) Doubling the volume of a cube: in the chapter “Mathematical Har-
monies” we saw how, for a given square, it is possible to construct a
square with double the area. This is plainly a square, the side of which
is the same length as the diagonal of the original square. Can something
similar be done for volume? Namely, for a given cube, can the side of
a cube of double the volume be constructed with the aid of a ruler and
compass?
(3) Dividing an angle into three equal parts: bisecting an angle using
ruler and compass is simple. Can an angle also be divided into three
equal parts?
Examples:
How does this theorem lead to the conclusion that squaring a circle is
impossible? Let us take the version of the problem that speaks of circum-
ference: “Construct a segment the length of the circumference of a circle
with radius 1.” The circumference of a circle with radius 1 is 2π. If we knew
how to construct a segment of length 2π, by halving it we would obtain a
segment of length π. I already mentioned (in the chapter “The Power of
the Oblique”) that in 1768 Lambert proved the irrationality of π. In 1882
Lindemann proved even more: that π is “very irrational,” or in technical
terms “transcendental” or “non-algebraic.” This means that it is not a
solution of any polynomial equation with integer coefficients not all of which
are 0, let alone the solution of a polynomial satisfying the conditions above.
So, by the theorem, π cannot be constructed.
What about the other version of squaring the circle, that of constructing
a square whose area is that of a circle of a given radius? Taking the radius to
be 1, the area of such a circle is π, and therefore the length of a side of the
square must be √π. But with a compass and ruler we can construct from a
pair of segments a segment that is the length of their product, and so, from a
segment of length a we can also construct a segment of length a^2. If we were
capable of constructing a segment of length √π, we would therefore be able
to construct a segment of length (√π)^2, which is π. But, as we already know,
this is impossible, and so a segment of length √π cannot be constructed.
As for doubling the volume of a cube, assume that we can double the
volume of a cube with a side of length 1, whose volume is 1 × 1 × 1 = 1.
Twice 1 is 2, so presumably we would succeed in constructing a cube of
volume 2. Let x be the length of this cube’s side. The volume of a cube with
The stars are not wanted now; put out every one:
Pack up the moon and dismantle the sun;
Pour away the ocean and sweep up the woods:
For nothing now can ever come to any good.
This amazingly modern poem derives its force from three poetic devices.
The first is the twist at its end. The poem’s true meaning is revealed only
in the last line. One of the later chapters of the book will be devoted to this
device. Here let me just explain that the beauty of a twist lies in the fact
that everything that came before suddenly receives new meaning, and the
reader has to absorb and comprehend a great deal all at once. It is only in
the last line that the reader realizes that the portrayal of the sunset and
the earth’s departure by the sun is only a metaphor for the poet’s sense of
abandonment upon the death of his patron. We must then decipher anew
all the preceding lines. Since conscious thinking cannot absorb so much so
quickly, the understanding remains partially subconscious. The second device
of the poem is displacement, the offhanded statement of the crux of the
matter. Yequtiel’s death is mentioned as part of a metaphor: “as if it is
covered by sackcloth.” This is very similar to the displacement in Lea
Goldberg’s “About Myself,” in the chapter on displacement at the beginning
of this book. Yet, the strongest device is probably the third one: hyperbole.
The pain at Yequtiel’s death is attributed to the entire world: to the earth,
the skies, the sun. By projecting his mourning onto the skies, the poet finds
his pain easier to bear.
The beauty of hyperbole resembles that of a majestic landscape. A soaring
cliff, or a tremendous mountain, are not fully comprehended by the viewer.
We are accustomed to perceiving the world around us in practical terms, of
action, and the cliff or the mountain are too great to even imagine climbing.
Similarly, poetic hyperbole transports matters beyond ordinary perception,
resulting in absorption without conscious understanding.
Cantor’s Story
David Hilbert
A mathematical feud
Mathematics, too, has its hyperboles, things that are bigger than the dimen-
sions of the world to which we are accustomed, and so the rules that prevail
in them appear strange and wondrous. The foremost of these is the concept of
infinity. The Greeks were already charmed by this notion and by the paradoxes
that it creates. But the stormiest turning point occurred at the end of the nineteenth
century — stormy in the literal sense, and not only mathematically. The
concept of infinity became a battleground.
Except for quarrels regarding claims of priority of discovery, there are
hardly any disputes in mathematics. A prominent exception is the story
of set theory, a field that was developed in the late nineteenth century by
Georg Cantor (1845–1918). Cantor’s idea was so novel and surprising that
the mathematics community needed about twenty years to digest it. In those
two decades wars were waged, polemical articles written, and blood was shed,
almost literally. In these wars, several of the period’s leading mathematicians
took the wrong side.
Cantor was not a quarrelsome person, nor did he intend to overturn
the established order. He was engaged in a classic field known as “Fourier
analysis,” that was originally developed to analyze wave patterns. Cantor
made important contributions to this field, but not revolutionary ones. One
day, however, for one of his proofs he needed the following fact: that there is
more than one kind of infinity. There are large infinite sets, and there are
even larger. Before Cantor, mathematicians regarded all the infinite sets as
equal. All were thought to be very big, and that is all. No one tried to classify
infinite sets by size. Cantor showed that, not only was such a classification
possible, it was also productive and important. Just as one number can be
larger than another, one infinite set can be larger than another. This was
the starting point for a new theory — set theory.
The fundamental concepts of set theory are surprisingly simple. Every-
thing evolves from a single concept: an element’s belonging to a set. This
simplicity might have been responsible for the mathematical community’s
reluctance to accept the new theory. About fifty years later, John von
Neumann would demonstrate that all of mathematics can be formulated
within the framework of set theory, but when Cantor introduced his ideas it
was hard to believe that such a simple concept could be used to say anything
of importance.
Mainly, however, it was another idea that had Cantor’s contemporaries
up in arms: regarding infinity as a tangible entity. The mathematicians of the
nineteenth century were proud of their recent success in providing a rigor-
ous foundation for differential and integral calculus. Since its discovery in the
seventeenth century, calculus had proved to be extremely useful, second only
to the concept of the number. But for almost two centuries this field rested
on shaky foundations, its terms being only vaguely defined. The central
concept in calculus, “tending to a limit,” was left fuzzy. Only in the nineteenth
century did several mathematicians, notably Augustin-Louis Cauchy, Bernhard
Riemann, and Karl Weierstrass, provide precise definitions for these terms.
In these definitions infinity is something seen from afar but never reached.
Tending to infinity means that numbers become as large as we wish, but they
never reach infinity. Gauss reproached a friend with whom he corresponded:
“I must vigorously protest against the use you make of the term infinity, as
something that can be reached. Infinity is only a manner of speaking, mean-
ing numbers that are as large as we wish.” To this way of thinking, infinity is
“potential,” not “actual.” Having just exorcised the devil of actual infinity,
namely infinite quantities that exist in their own right, the mathematical
community was enraged to find Cantor bringing it back through the back
door.
Prominent mathematicians, led by the famous Leopold Kronecker, dispar-
aged Cantor’s theory, claiming it was worthless. Henri Poincaré, one of the
major mathematicians of the time, said that set theory was “a childhood
disease, from which mathematics would eventually recover,” and accused
Cantor of “corrupting the youth” — the same charge for which Socrates had
paid with his life some 2,300 years earlier. Unlike Socrates, Cantor was not
executed, but he did not receive a coveted university position. The attacks
against him further fueled the prolonged depression from which he suffered,
and he ended his days in a mental institution. Too late for him to enjoy, but
still in his lifetime, his theory was finally victorious, and its importance was
universally recognized. Nowadays, set theory is taught in first-year university
mathematics courses.
The first hurdle that Cantor had to overcome was definition. In order to
speak of larger and smaller infinite sets, a precise definition is needed for
these terms. But first, an even more basic question must be answered: when
are two sets of equal size? For finite sets, the answer is clear: two sets are
of equal size if they contain the same number of elements. A set of 5 books
is equal in size to one of 5 pencils. But in the infinite case, we don’t have
numbers at our disposal. Here, Cantor had an extraordinary insight: numbers
aren’t really needed. Equality of set size can be defined without them, even
in the finite case. For example, in order to prove that each of your two hands
has the same number of fingers, you don’t have to count the fingers on each
hand. Just match them, by putting your hands against each other. This is
called “correspondence”:
Two sets A and B are of equal size if there exists a correspondence, assign-
ing to each element of A an element of B, in such a way that each element
from B corresponds to exactly one element from A.
There is a one to one correspondence between the set of flowerpots and the set of flowers,
that is, exactly one flower corresponds to each flowerpot, and vice versa. This shows that
the two sets are of the same size. This definition skips the concept of number, and can be
used for infinite sets, as well.
to define equality of size between sets. Two infinite sets are defined as being
of equal size if there is a one to one correspondence that matches all elements
of the first to all elements of the second.
0 ↔ 0, 1 ↔ 2, 2 ↔ 4, 3 ↔ 6, 4 ↔ 8, 5 ↔ 10, 6 ↔ 12, …
There is a one to one correspondence between the set of natural numbers and its subset,
the set of even numbers.
Hilbert put this in the form of a story about a hotel in heaven having
infinitely many rooms, numbered as 1, 2, 3, . . . . One day all the rooms were
filled. And then, in the evening, another guest arrived. If the hotel were finite,
the guest would have been stranded. But having infinitely many rooms in
his hotel, the manager could easily solve the problem. Using a public address
system, he asked each guest to move to the next room. The guest in room
no. 1 moved to room no. 2, the guest in room no. 2 to room no. 3, and so
on. Each guest now has his own room, and room no. 1 is vacant and ready
to receive the new arrival.
The hotel is full, but if each guest moves to the next room, a vacancy appears.
The next day, too, the hotel was full. That evening, something even more
distressing happened: an infinite number of new guests arrived! But again,
the hotel manager did not lose his cool. He asked the guest in room no. 1
to move to room no. 2, the guest in room no. 2 to move to room no. 4,
the guest in no. 3, to room no. 6, and on and on. Note that an infinite
number of rooms are thereby vacated — all those with odd numbers (1,
3, 5, . . . ). These rooms can house the infinite number of new guests. For
anyone who encounters Hilbert’s hotel for the first time, this probably seems
mystifying, even entertaining, and possibly even beautiful. I must admit that,
even for me, a professional mathematician who uses this idea on an everyday
basis, Hilbert’s hotel has still not lost its charm.
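The manager's two announcements can be tried out on the first few rooms. The following is only a minimal sketch in Python: the function names and the cut-off of 10 rooms are assumptions made for illustration, and a finite check can of course only hint at what happens in the infinite hotel.

```python
def shift_by_one(room: int) -> int:
    """First evening: the guest in room n moves to room n + 1."""
    return room + 1


def double(room: int) -> int:
    """Second evening: the guest in room n moves to room 2n."""
    return 2 * room


def vacated_rooms(move, checked_rooms: int) -> list[int]:
    """Rooms among 1..checked_rooms that no guest moves into.

    Only guests from rooms 1..checked_rooms are considered. For both
    moves above this is enough, because a guest never moves to a room
    with a smaller number than his own.
    """
    occupied = {move(n) for n in range(1, checked_rooms + 1)}
    return [r for r in range(1, checked_rooms + 1) if r not in occupied]


first_evening = vacated_rooms(shift_by_one, 10)   # only room 1 is freed
second_evening = vacated_rooms(double, 10)        # the odd rooms are freed
```

Among the first ten rooms, the first move frees room no. 1 alone, while the second frees rooms 1, 3, 5, 7 and 9, in agreement with the story.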
And here is another surprise: the set of points on the short segment in the
following drawing is equal in size to the set of points on the long segment. One
correspondence between them is shown in the drawing. Point Q sends “light
beams,” and each point on the upper segment corresponds to its shadow on
the lower segment.
The upper segment has the same number of points as the lower one, despite their being
of different lengths. The beams that emanate from a single point establish a one to one
correspondence.
A finite segment with its two endpoints removed is bent into the shape of a
semicircle. The beams issuing from the light source then show that the segment is equal
in size to the entire infinite straight line.
These examples may create the impression that all infinite sets are of the
same size. Cantor’s great discovery was that this is not so: there are large
sets, and larger ones.
For each flowerpot there is a corresponding flower. This means that the set of flowers is
at least as large as the set of flowerpots.
To show this, we have to define inequality between set sizes. Let us begin
with the concept “at least as large as . . . ”: when is set A at least as large as
set B? Once again, let us begin with the finite case. In the figure above there
are 5 flowers and 3 flowerpots. Since 5 is bigger than 3, there are more flowers
than flowerpots. This definition, however, is not applicable to the infinite
case, because there we cannot count. Accordingly, we will define it, once
again, by means of correspondence. The drawing illustrates a correspondence
of the set of flowers to that of the flowerpots, that satisfies the following
condition: for each flowerpot there is a corresponding flower.
This definition is applicable also for infinite sets: a set A is at least as
large as a set B if there is a correspondence assigning to every element of A
an element of B, such that all of B is covered. That is, for each element in
B there is at least one element in A corresponding to it.
The next drawing, for example, illustrates why the set of points of the
segment [0,1) (that is, all the points between 0 and 1, including 0 but not 1)
is larger than or equal to the set of natural numbers. For each natural num-
ber there is a corresponding point from the segment (actually, many points
correspond to each number).
The conclusion is that every infinite set is greater than or equal (in size)
to the set of natural numbers. In other words, the set of natural numbers is
the smallest infinite set. An infinite set of exactly the same size as that of
the natural numbers is called “countable.” Cantor’s first great discovery was
that there are infinite sets that are not countable. That is, they are actually
greater than the set of natural numbers. This discovery is so basic that it
deserves a chapter of its own.
The Most Beautiful Proof?
David Hilbert
Emily Dickinson
Cantor arrived at his theorem in two stages. In the first stage, he proved only
that not all infinite sets are of equal size. As already mentioned, the smallest
infinite set is the set of natural numbers, and Cantor proved that there is
a larger set. In other words, there are sets that cannot be counted.
“Uncountable,” they are called.
Cantor’s proof is explicit. Not only does it show that there is an uncount-
able set, it also expressly presents such a set. This is the set of sequences
composed of 0’s and 1’s. There are many such sequences. For example: 0, 0,
0, 0, . . . ; 1, 1, 1, 1, . . . ; 0, 1, 0, 1, 0, 1, . . . ; or 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0,
0, 1, . . . . Cantor showed that there are not just many sequences like these,
there are very many, more than the natural numbers.
The proof is by contradiction. That is, upon assuming that the theorem is false,
a contradiction is reached. Let us assume, Cantor said, that it is possible to
count all the sequences whose terms are 0 and 1. The enumeration below is
just an attempt, whose purpose is to make things concrete:
S1 = 0, 0, 0, 0, 0, 0, 0, . . .
S2 = 1, 1, 1, 1, 1, 1, 1, . . .
S3 = 0, 1, 0, 1, 0, 1, 0, . . .
S4 = 0, 0, 1, 1, 0, 0, 1, . . .
S5 = 1, 0, 1, 1, 0, 1, 1, . . .
S6 = 1, 0, 1, 0, 1, 0, 1, . . .
S7 = 0, 0, 0, 1, 1, 1, 0, . . .
..........................................................................
The assumption is that all 0, 1 sequences should appear in this list.
Namely — that we succeeded in counting all of them. But now, Cantor
said, I will show you that in actuality you must have failed to count them
all. I will show you a sequence that definitely does not appear in this list.
To this end, look at the diagonal of the table:
S1 = 0, 0, 0, 0, 0, 0, 0, . . .
S2 = 1, 1, 1, 1, 1, 1, 1, . . .
S3 = 0, 1, 0, 1, 0, 1, 0, . . .
S4 = 0, 0, 1, 1, 0, 0, 1, . . .
S5 = 1, 0, 1, 1, 0, 1, 1, . . .
S6 = 1, 0, 1, 0, 1, 0, 1, . . .
S7 = 0, 0, 0, 1, 1, 1, 0, . . .
..........................................................................
Now, write the sequence that appears on the diagonal: S = 0, 1, 0, 1, 0,
0, 0, . . . and change every 0 in it to 1, and every 1 to 0. This gives us the
sequence T = 1, 0, 1, 0, 1, 1, 1, . . . . What is special about T? It differs from
S in each of its terms. Since the first term in S is the first term in S1, we
know that T differs from S1 in its first term: S1 has 0 in the first place, while
One of the conclusions of this theorem is that there are more real num-
bers than natural numbers. Every sequence of 0’s and 1’s can be matched
with a real number, by adding a 0 and a decimal point to the left of the
sequence. For example, the sequence 0, 1, 0, 1, . . . corresponds to the num-
ber 0.0101. . . ; the sequence 0, 0, 1, 1, 0, 0, 1, 1, . . . corresponds to the real
number 0.00110011. . . . Since the sequences whose terms are 0 and 1 cannot
be counted, it is also impossible to count all the real numbers of this type,
that is, the numbers in whose decimal representation a 0 appears before the
decimal point, and only 0 and 1 after it. These, of course, are only a small
portion of all the real numbers, and if even these cannot be counted, then
all the real numbers most certainly are uncountable.
For obvious reasons, the idea used by Cantor in his proof is called the
“diagonal method.” Since its discovery, it has repeatedly proved its effec-
tiveness, and has become a standard mathematical tool. We will return to
its later exploits, but first, let us see how it developed into a proof that for
every set there is a larger one.
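The diagonal construction can be carried out mechanically on the seven sequences listed in the enumeration above. The sketch below, in Python, works only with the first seven terms of each sequence; the names flipped_diagonal and table are assumptions introduced for the illustration.

```python
def flipped_diagonal(rows):
    """Return T: the diagonal of the table with every 0 and 1 swapped."""
    return [1 - rows[i][i] for i in range(len(rows))]


# The first seven terms of S1, ..., S7 from the enumeration in the text.
table = [
    [0, 0, 0, 0, 0, 0, 0],  # S1
    [1, 1, 1, 1, 1, 1, 1],  # S2
    [0, 1, 0, 1, 0, 1, 0],  # S3
    [0, 0, 1, 1, 0, 0, 1],  # S4
    [1, 0, 1, 1, 0, 1, 1],  # S5
    [1, 0, 1, 0, 1, 0, 1],  # S6
    [0, 0, 0, 1, 1, 1, 0],  # S7
]

t = flipped_diagonal(table)

# T differs from the i-th sequence in its i-th term, for every i.
differs_from_every_row = all(
    t[i] != table[i][i] for i in range(len(table))
)
```

Here t comes out as 1, 0, 1, 0, 1, 1, 1, the sequence T of the text, and it disagrees with each listed sequence in at least one place, so it cannot be any of them.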
Two years later, Cantor proved the more general result already mentioned:
there is no largest set. Every set has a set larger than itself.
Like the proof for the existence of an uncountable set, the proof is explicit.
For every set A, Cantor exhibited a specific set larger than A. This is the
set of subsets of A. Set S is called a “subset” of A if it contains part of the
elements of A. “Part” means any part — S might contain no element at all
(that is, it will be empty), or it might contain all the elements of A (that is,
it will be equal to A). For a reason that will immediately become clear, the
set of the subsets of A is called the “power set” of A. For example, if A is
the set of the two terms {1, 2}, its power set contains 4 sets: {1}, {2}, {1, 2},
and the empty set, that contains no element, and that is labeled Ø.
Cantor proved the following: for every set A, the power set of A is larger than A.
In the example above, the size of A is 2, while the size of its power set
is 4, and so the power set is indeed larger. In a simpler example, if A is the
empty set Ø, then it has a single subset, namely the empty set itself. Its power
set is therefore of size 1, while the set itself has 0 elements. The set A = {1, 2, 3}
has 3 elements, as compared with its 8 subsets:
Ø, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}.
The source of the name “power set” is evident from these examples. A set
of n elements has 2ⁿ subsets, that is, the size of its power set is 2 to the power
n. (In these examples: a set of size 2 has 2² = 4 subsets; a set of size 3 has
2³ = 8 subsets; and a set of size 0 has 2⁰ = 1 subset.)
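The count of subsets is easy to verify for small sets by listing them. The following Python sketch does this for sets of size 0 to 3; the function name power_set and the dictionary sizes are assumptions made for the illustration.

```python
from itertools import combinations


def power_set(elements):
    """List all subsets of a finite collection, from the empty set up."""
    items = list(elements)
    return [
        frozenset(chosen)
        for size in range(len(items) + 1)
        for chosen in combinations(items, size)
    ]


# Number of subsets of {1, ..., n} for n = 0, 1, 2, 3.
sizes = {n: len(power_set(range(1, n + 1))) for n in range(4)}
```

For n = 0, 1, 2, 3 the sizes are 1, 2, 4 and 8, that is, 2 to the power n in each case.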
How did Cantor prove his theorem? Let me demonstrate, using the familiar
toy Mr. Potato Head. This is a toy model to which you can add ears, a
moustache, eyebrows. . . . In our version, each feature has precisely two
possibilities: being there or not. So for example, our Mr. Potato Head either
has a moustache or not, either has eyebrows or not, and so on. Each choice
of features creates a different character: one has a moustache and eyebrows
but no ears, and another has ears and eyebrows, but no moustache. So a
character is determined by a subset, any subset, of the set of features. Can-
tor’s theorem (“there are more subsets than elements”) reads in this case:
There are more characters than features.
How do we prove this? To show that a set A is larger than a set B we
should show that it is impossible to cover A by B. In our case, in which
we want to show that there are more characters than features, we should
Cantor proved that the number of possible characters of Mr. Potato Head you can form
with given features is larger than the number of features, even if the number of features
is infinite.
Even after many years of acquaintance with this proof, I am still charmed.
The definition of Out of Spite is a rabbit popping out of a hat. The proof is so
short and surprising, that it takes some time to digest. It has few competitors
for the ratio between importance and length.
Returning to more abstract terms, what Cantor’s argument says is this.
Let A be any set. If f is an assignment of subsets of A to elements, so that
to every element a of A we assign the subset f(a), there is a subset C (after
“Cantor”; this is the “Out of Spite” set) that is not assigned to any element.
What is C? An element a belongs to C if it is not included in f (a). This is
the “spite.” So, C is the set of elements that do not belong to the set assigned
to them by f . This is a rather confusing definition. It smells of circularity,
which is another name for “self-reference.” A bit like defining a man as “the
father of the person hereby defined,” or a number as “the number hereby
defined, plus 1.” Such definitions tend to generate paradoxes, like a man who
is his own father or a number that is greater than itself by 1. And indeed, this is
what happened next in set theory. Paradoxes appeared as if from nowhere,
threatening to topple the entire beautiful edifice erected by Cantor.
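For a finite set, the “Out of Spite” construction can be checked exhaustively. The Python sketch below is an illustration only: the three features and the sample assignment are invented for the example, and the exhaustive check covers just this small set, whereas Cantor's argument works for every set, finite or infinite.

```python
from itertools import combinations, product


def all_subsets(items):
    """All subsets of a finite list of elements."""
    return [
        frozenset(chosen)
        for size in range(len(items) + 1)
        for chosen in combinations(items, size)
    ]


def spite_set(elements, f):
    """C: the elements a that do not belong to the subset f[a] assigned to them."""
    return frozenset(a for a in elements if a not in f[a])


def every_assignment_misses_its_spite_set(elements):
    """Try every assignment of subsets to elements; C must never be assigned."""
    items = sorted(elements)
    for choice in product(all_subsets(items), repeat=len(items)):
        f = dict(zip(items, choice))
        if spite_set(items, f) in choice:
            return False
    return True


# A hypothetical assignment of characters (subsets) to three features.
features = ["ears", "eyebrows", "moustache"]
assignment = {
    "ears": frozenset({"ears", "moustache"}),
    "eyebrows": frozenset({"ears"}),
    "moustache": frozenset(),
}
c = spite_set(features, assignment)
checked = every_assignment_misses_its_spite_set(features)
```

In the sample assignment, C consists of the eyebrows and the moustache, a character assigned to no feature. The exhaustive check confirms the same for all 512 possible assignments on these three features.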
Paradoxes and Oxymorons
A parody of a proof
Call the set of all the sets that do not belong to themselves R (after
Russell). Does R belong to itself? By the definition of R, a set belongs
to it if and only if it does not belong to itself. If we apply this rule to R
itself, we obtain: R belongs to itself if and only if it does not belong to itself. This
is a contradiction: a proposition and its negation cannot both be true at
the same time.
The story of the barber is not a paradox. His vow simply cannot be
realized. Russell’s set, however, seems to lead to a real contradiction. Is
this so? Certainly not. If the mathematical axioms are chosen well, they
will not result in a contradiction, because they will describe reality, and in
reality there are no contradictions. Like every paradox, this one was born
out of a shaky definition. Behind Russell’s paradox, as behind the original
paradox of Cantor, is an assumption called the “Axiom of comprehension.”
This states that every property defines a set: the property of “being a chair”
defines the set of all chairs; the property “being an even natural number”
defines the set of even natural numbers. This assumption, however, enables
circular definitions. As Russell’s paradox shows, with the help of the axiom
of comprehension, we can define a set for which the relationship “belonging
to itself” is self-defined.
For a brief moment, the foundations of mathematics shook. It seemed
as if the paradox would expel us from Cantor’s paradise. But things were
soon set right. In 1908 Ernst Zermelo wrote a system of axioms, which was
improved in 1922 by Abraham Fraenkel. The Zermelo-Fraenkel system of
axioms does not contain the axiom of comprehension. Sets are constructed
with greater care, from axioms that (apparently) do not lead to circular
definitions. And so, Cantor’s paradise remained unscathed, enabling Hilbert
to say that the danger of expulsion had passed (his declaration cited above:
“No one shall expel us from the paradise that Cantor has created for us,” was
made in 1926). This is still one of the most beautiful mathematical fields,
and the paradoxes no longer trouble it. In fact, the upheaval had a positive
side, as well: the paradoxes of set theory led to some of the most fascinating
developments in modern mathematics.
Paradoxes in probability
Mark Twain
Suppose that your rich uncle presents you with two sealed envelopes, and
tells you that he put in one of them (you do not know which) twice as
much money as he put in the other. You are free to choose an
envelope, which you do. You open it and find a $100 bill. Now your uncle
evinces double generosity: “If you want, you can change your choice,” he
offers. Of course, without first opening the second envelope. The question
now is whether it is worthwhile to switch or not. You randomly chose
one of the envelopes, and so there is a 1/2 probability that you chose the
envelope with the smaller sum. In that case, the second envelope contains
$200, and switching will give you a profit of $100. There is a probability
of 1/2 that the envelope you chose contains the larger sum, in which
case the other envelope will contain a $50 bill, and you will lose $50. So
switching the envelopes is a gamble in which there is a 50 percent chance
to win a larger sum, and a 50 percent chance to lose a smaller sum — a
gamble that is definitely worth taking. And if so, then it is worthwhile to
switch.
But this is patently absurd. The argument is not dependent on you having
found $100 in the envelope. Had you found $1000, the conclusion should have
been the same. But if so, then it would be worthwhile to switch in any case,
meaning that you should switch even before you open the envelope. But this,
of course, is silly, if only for the reason that it would be worthwhile then to
switch again.
The deceit is quite subtle. It is based on an unstated assumption: that
each sum has the same probability of being in the envelopes. In our case, in
which you opened the envelope and found $100 in it, it is assumed that the
likelihood that the uncle put $50 in one envelope and $100 in the other is
the same as the second possibility, of $100 and $200. If we were to assume
that the probability of the first possibility is much greater than that of the
second, you could reasonably expect the envelope you didn’t open to contain
$50, and so it would not be worthwhile to switch.
The assumption that each sum has the same chance is necessarily erro-
neous. This is because there are an infinite number of possible sums. When
there are (say) 10 possibilities, each with the same probability, the
probability of each is 1/10. When there are infinitely many possibilities, if each had
the same degree of probability, then the probability of each of them would
have to be 0. But this means that none of them could happen! This means
that the probability is not uniform. There are combinations of sums that are
more likely than others.
Here is an example clarifying this point. Assume that your uncle does
not have more than $2,000,000. In this case, if you find $1,000,000 in the
envelope that you opened, you know for a certainty that the second envelope
contains half a million dollars, and not two million. In this case, obviously,
it is not worthwhile to switch.
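The effect of a limited purse can be computed exactly. The Python sketch below rests on an assumption that is not in the story: that the uncle chooses the smaller sum with equal probability from a short list ($125,000, $250,000 or $500,000), so that he never needs more than $2,000,000. The function name switching_gains is likewise invented for the illustration.

```python
from collections import defaultdict
from fractions import Fraction


def switching_gains(smaller_amounts):
    """Expected gain from switching, for each amount you might find.

    Assumption: the smaller sum is drawn uniformly from smaller_amounts,
    the other envelope holds twice as much, and you open either envelope
    with probability 1/2.
    """
    weight = Fraction(1, 2 * len(smaller_amounts))
    prob = defaultdict(Fraction)   # probability of finding each amount
    gain = defaultdict(Fraction)   # probability-weighted gain from switching
    for x in smaller_amounts:
        for seen, other in ((x, 2 * x), (2 * x, x)):
            prob[seen] += weight
            gain[seen] += weight * (other - seen)
    conditional = {seen: gain[seen] / prob[seen] for seen in prob}
    overall = sum(gain.values())
    return conditional, overall


conditional, overall = switching_gains([125_000, 250_000, 500_000])
```

Under this assumption, finding $1,000,000 means that switching loses $500,000 for certain, while finding $500,000 makes switching worth $125,000 on average. Over all cases the expected gain from always switching is exactly 0, so the blanket advice to switch collapses.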
Poetical paradoxes
I open the collection of poems True Love by the Israeli poet Dalia
Ravikovitch (1936–2005), and this is what I find:
We learn from this that poetry, too, uses paradox. But there is a basic
difference: while a mathematical paradox conceals error, there is always truth
behind a poetical oxymoron. Beneath the apparent contradiction there is
internal logic. By the death bed of the poetess Rachel there were found the
lines “Only what I lost/is my possession eternal” (My Dead). This contains
a truth, because external possessions can be lost. Only what is internalized
remains.
Paradox is one of the means poetry uses to maintain tension between sur-
face and interior — the tension that, as I have strived to show, is responsible
for the sensation of beauty. Pulling the carpet of logic from under our feet
forces us to seek the truth within, and understand that beneath abstract
thought there are things no less significant.
Self-Reference and Gödel’s Theorem
Hilbert’s program
Look again at item 3 above — the search for an algorithm that tests prov-
ability. There is a natural candidate for such an algorithm: simply, trying all
possibilities. Given a formula you want to check for provability, go system-
atically, from shorter to longer, over all possible series of symbols. For each
such sequence check to see whether it proves the formula, or perhaps proves
its negation. Most of the sequences are not proofs at all, but just jumbles
of symbols. But perhaps, by chance, like the monkey hitting a typewriter
at random, you will hit a proof of the formula or its negation. If there is
such a proof, you will get to it at some point. And since checking whether a
sequence of symbols is a proof of a given formula is doable, the algorithm is
well defined.
Obviously, this is not very efficient. It is like a police detective trying to
solve a murder case by examining all the people in the world, one by one.
Just as police investigations aren’t conducted in this manner, so, too, no
one would try to find proofs for mathematical theorems by writing random
marks on paper. However, at this stage we are not looking for an efficient
algorithm, but for any algorithm at all.
But there is a worse and deeper problem. It is that we don’t know when to
stop. The algorithm of the murder detective is not efficient, but it is feasible,
since there are only a finite number of people in the world, and at some point
the algorithm will end. The situation is different for mathematical proofs.
If a proof is found at some point during our search, well and good. But
if we have examined a million series of signs, and none of them yields the
desired proof? Obviously, we could continue on to the one-million-and-first
series, but we would never be able to stop and declare: “We exhausted all
possibilities, and have not found a proof, so there is none.” There is always
the possibility that in our next step we would stumble upon the proof. Oil
prospectors face this dilemma, but in their case, there is at least a theoretical
limit: if they drilled and reached the other side of the earth, this is a clear
sign of failure. As far as proofs are concerned, there is no phase in which we
should give up.
But note: if we know for sure that one of the two possibilities indeed occurs,
that is, either the formula is provable or its negation is, then we are in good shape. We
can check at each step whether the sequence of symbols at hand is a proof
for the formula, or of its negation. Knowing that there exists a proof of one
of these, it is guaranteed that at some point our sequence of symbols will
be such a proof. So, the algorithm will terminate. So, if item 1 of Hilbert’s
program is true, namely every formula is provable or its negation is, then we
also have an algorithm for deciding which of the two cases it is.
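The brute-force search can be written down in a few lines. The Python sketch below uses a toy stand-in for a proof system, invented purely for the illustration: the “formula” n means “n is even,” a proof of it is a string of n/2 letters a, and a proof of its negation is a string of (n − 1)/2 letters b. None of this is part of Peano's system; the point is only the shape of the search, which goes over all strings from shorter to longer and halts because one of the two proofs always exists.

```python
from itertools import count, product


def candidates(alphabet):
    """Yield every finite string over the alphabet, from shorter to longer."""
    for length in count(0):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)


def decide(formula, alphabet, proves, refutes):
    """Search until some string proves the formula or its negation.

    The loop terminates only if one of the two proofs exists; otherwise
    it runs forever, which is exactly the difficulty described above.
    """
    for candidate in candidates(alphabet):
        if proves(candidate, formula):
            return True, candidate
        if refutes(candidate, formula):
            return False, candidate


# Toy proof checkers (hypothetical): the formula n stands for "n is even".
def proves_even(candidate, n):
    return set(candidate) <= {"a"} and 2 * len(candidate) == n


def refutes_even(candidate, n):
    return set(candidate) <= {"b"} and 2 * len(candidate) + 1 == n


verdict_6 = decide(6, "ab", proves_even, refutes_even)
verdict_7 = decide(7, "ab", proves_even, refutes_even)
```

For 6 the search stops at the proof aaa, and for 7 at the refutation bbb. If neither checker could ever succeed, decide would never return, and no number of failed attempts would justify stopping.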
finding proofs in the Peano system; and that there is no syntactical proof of
the consistency of the Peano axioms.
The first of these negative results got most of the fame. It is called
“Gödel’s incompleteness theorem.” It states that there are true statements about
numbers, which Peano’s axioms cannot prove. Of course, since they are true,
their negations also cannot be proved. So, there are statements such that
neither they nor their negations can be proved.
OK, you may say, so Peano was stupid, and didn’t devise a good axiom
system. Let somebody cleverer come, and propose a better axiom sys-
tem, that will be complete. Namely, it will decide everything — for each
formula, it will prove the formula or its negation. But Gödel’s argument
has a much wider scope: it is valid not only for Peano’s axioms, but also
for every reasonable axiom system, where “reasonable” means that it is
possible to decide, for every sequence of symbols, if it is an axiom in the
system or not.
Gödel’s theory drew the attention of a young Englishman named Alan
Turing (1912–1954). In addition to his being an exceptionally strong math-
ematician, Turing also had mechanical skills, and he wanted to give more
tangible form to Gödel’s arguments. To this end, he invented the first the-
oretical model of a computer. The actual construction of a computer did
not lag far behind. During World War II, Turing participated in building
a primitive computer, as part of the effort to break the German code for
communications with submarines. Gödel’s theory was therefore a significant
step toward the creation of the computer.
Kurt Gödel (1906–1978) proved what some consider to be the most important theorem of
the twentieth century — that any reasonable set of axioms for number theory is
incapable of proving all true facts about the natural numbers.
Circularity
Gödel was inspired in his proof by a paradox named after the French math-
ematician Jules Richard (1862–1956). This paradox, like Russell’s (in the
preceding chapter), is a parody of Cantor’s diagonal method. Like Russell’s
paradox, its deception lies in self-reference. In fact, this is the case in all
long-lasting paradoxes. It seems that the human mind is not built to eas-
ily detect circularity. The best known paradox based on self-reference is the
so-called “Liar’s Paradox,” invented by Greek philosophers back in the 5th
century BC.
Think about the truth value of this sentence: if the sentence is true, then,
according to its content, it is false. But if it is false, then, again, according
to its content, it is true. As in Russell’s paradox, we found a statement that
is correct only if it is incorrect, which is plainly impossible. Rivers of ink
have been poured over this paradox, and philosophers spent many sleepless
nights wrestling with it. Actually, the deception behind it is quite simple,
and is not very different from the definition of a number as “itself plus
1.” The circularly-defined concept in the paradox is the truth value of the
sentence. A sentence does not come into the world with a truth value pinned
to its collar. In order to calculate the truth value of a sentence, we must
do something: compare it with reality. This sentence, however, speaks of
its own truth value, and therefore the part of reality to which it must be
compared is none other than the truth value itself, that is, the result of the
current examination. Thus, the truth value of the liar’s sentence is defined
by reference to itself. Actually, it is defined simply as its own negation. This
is a circular definition, and is therefore invalid. The liar’s sentence just has
no truth value — it is neither true nor false.
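The circularity can be made concrete with a small sketch in Python. The function names and the brute-force search are illustrative assumptions, not anything taken from the text: the liar's sentence is treated as the demand that a truth value equal its own negation, and the number defined as "itself plus 1" is treated the same way.

```python
def consistent_truth_values():
    """Return every truth value v that satisfies the liar's definition v == not v."""
    return [v for v in (True, False) if v == (not v)]


def consistent_numbers(candidates):
    """Return every candidate n that satisfies the circular definition n == n + 1."""
    return [n for n in candidates if n == n + 1]


liar_solutions = consistent_truth_values()
number_solutions = consistent_numbers(range(-1000, 1001))

print(liar_solutions)    # no truth value fits the definition
print(number_solutions)  # no integer in the tested range fits either
```

Both searches come back empty, which is one way of seeing that a circular definition of this kind defines nothing.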
This paradox is subtler than the Liar’s Paradox, and the circularity it
conceals is not as simple (a hint: the problem lies in the assumption that
“what can be proved is correct,” which, if used in proofs, becomes circu-
larly defined). We arrived at a contradiction here, because we looked at the
statement formulated in words. Gödel, in contrast, did not arrive at a con-
tradiction. He did not write his statement in words, but as a formula that
speaks of numbers — an outstanding achievement by itself. Dressed as a for-
mula, Gödel’s sentence does not result in a contradiction, but in a formula
that is true in the natural numbers, yet it cannot be proved.
[. . . ]
Simply:
There was snow in one land
And desert in another
And a star in an airplane window
At night
Above many lands.
They came to me
And commanded me: Sing.
They said: We are words
And I surrendered, and sang them.
Lea Goldberg, “About Myself”
Ars poetica poems occupy a surprisingly large place in poetry as a whole.
Should this be attributed to the excessive narcissism of poets? Probably not.
I believe that the answer is not to be found in the personality of poets, but in
the beauty inherent in circularity, in having something hang in the air because
it is hanging on itself.
To end this chapter, here is an amusing example of circularity in poetry.
In the poem “A Tale of Two ‘Garoos” by the Israeli poet Abraham Shlonsky,
the negativist no-‘garoo responds to everything with a “No.” After learning
his lesson, he is asked whether he will remain in his obstinacy.
Genesis 22:17
One of the heroes of the English author David Lodge tries to explain to his
friend the meaning of eternity. “Think,” he says, “of a ball of steel as large
as Earth, and a fly alighting on it once every million years. When the steel
ball is rubbed away by the friction, eternity will not even have begun.”
Mathematicians wouldn’t be impressed by this image. For them, the num-
ber of years that will pass until the ball is rubbed away is not especially big.
Assume, as an extreme example, that the number of atoms in the ball equals
the number of atoms in the universe, which is estimated at $10^{80}$. Assume
also that with every alighting, only a single atom adheres to the foot of the
fly. After a million times $10^{80}$, that is, $10^{86}$ years, the ball will be rubbed
away. Much larger numbers appear in some mathematicians’ work every day,
especially in my own field, combinatorics. For example, the number of ways
in which 100 people can be ordered in a line is much greater than this.
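The arithmetic of the steel ball can be checked directly, since Python works with whole numbers of any size. This is only a sketch; the constant names are made up for the illustration, and the atom count is the rough estimate quoted in the text.

```python
import math

ATOMS_IN_UNIVERSE = 10**80   # rough estimate quoted in the text
YEARS_PER_VISIT = 10**6      # the fly alights once every million years

# One atom is removed per visit, so the ball lasts this many years.
years_to_wear_away = ATOMS_IN_UNIVERSE * YEARS_PER_VISIT

# Number of ways to order 100 people in a line.
orderings_of_100 = math.factorial(100)

print(len(str(years_to_wear_away)) - 1)        # exponent of 10: 86
print(orderings_of_100 > years_to_wear_away)   # True
```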
We can deal with such numbers, but they are difficult to comprehend.
Even “a million” is a number that people cannot grasp. In the O. J. Simpson
murder trial, in which the defendant was charged with murdering his ex-wife
and her friend, an expert witness testified for the prosecution that there was
one chance in a billion that the blood samples found at the murder scene
did not belong to the accused. An expert for the defense claimed that the
chance is one in several million, and this statement was enough to acquit
Simpson — the jurors didn’t have a clue as to the meaning of “one chance
in a million.”
Mathematicians, too, don’t really understand the meaning of large num-
bers, but they live with them quite well, and they know how to write them
concisely. The trick consists of operations that repeat other operations. Mul-
tiplication, for example, is a repetition of addition, and raising to a power is
a repetition of multiplication. $10^{10}$ means “10 to the tenth power,” that is,
be more interesting for you?” The lecturer was forced to admit, as would any
mathematician, that the second result would be more important. (Fermat’s
Conjecture was explained above, in the “Unexpected Combinations” chapter.
At the time of this story it was not yet solved.)
At this point someone else got up and said: “$10^{10^{10}} + 1$ is not a prime
number, because $10^{10^{10}}$ can be written as a fifth power, that is $a^5$.” (For this
Once I asked my daughter, “Is there something in the world of which there
are a googol?” Without stopping to think, she answered, “Yes. In a second
there are a googol googolths of a second” (just as there are ten tenths of a
second in a second). Of course, this is cheating — “googolths of a second”
exist only in our imagination, we cannot clock them.
There is an additional way in which gigantic numbers appear, presum-
ably in the real world: combinations. For example, a number we already
mentioned: the number of ways in which 100 people can be ordered in a line.
How is this number calculated? Each of the 100 people can be put in the
first place in the line, and there are then 99 possibilities of putting someone
in the second position (only 99, because the first person was already cho-
sen). Accordingly, there are 100 × 99 possibilities for placing people in the
first two positions. For each of these possibilities, there are 98 ways to select
the third person, and so, we have 100 × 99 × 98 ways to fill the first three
places. Continuing, we find that the number of ways to arrange 100 people
is $100 \times 99 \times 98 \times 97 \times \cdots \times 3 \times 2 \times 1$, a number denoted by 100! and
called “100 factorial.” Using the formula of James Stirling (in the chapter “A
Magic Number”), 100! can be estimated to be about $10^{158}$, which is much
larger than the number of atoms in the universe. The number of ways in
which Earth’s inhabitants can be arranged in a line is greater than $10^{10^{10}}$,
so here is this number, in a real-life situation. But these, too, are not really
from real life. No one intends to arrange people in all possible ways, nor even
to order their names on paper.
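The counting argument and Stirling's estimate can both be tried out in a few lines of Python. What follows is a sketch under stated assumptions: the function names are invented for the illustration, the world population is taken as a round 7 billion, and the logarithm of a huge factorial is obtained from the standard log-gamma function rather than by multiplying everything out.

```python
import math


def orderings(n):
    """Count the ways to line up n people: n choices, then n - 1, and so on."""
    count = 1
    for choices_left in range(n, 0, -1):
        count *= choices_left
    return count


def stirling(n):
    """Stirling's approximation to n!: sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


def log10_factorial(n):
    """log10(n!) via log-gamma, usable when n! is far too large to write out."""
    return math.lgamma(n + 1) / math.log(10)


exact = orderings(100)
digits = len(str(exact))            # number of decimal digits of 100!
ratio = stirling(100) / exact       # how close Stirling's estimate comes

world_population = 7_000_000_000    # assumed round figure
log10_world = log10_factorial(world_population)

print(digits, ratio, log10_world)
```

The last printed value is the exponent of 10 in the number of ways to line up the assumed population; comparing it with $10^{10}$ shows on which side of $10^{10^{10}}$ that number falls.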
Infinitely Small
Everything changes
Everything flows.
If the sectors are very narrow, to a good approximation they look like
triangles. Their base looks like a straight line, just as the earth appears flat
to someone observing it from a low height. And we know how to calculate
the area of a triangle: this is the base, times the height, divided by 2. If we
think of every sector as being almost a triangle, then its height is the radius
of the circle. The area of each sector, therefore, is more or less its base times
the radius of the circle, divided by 2; and the narrower the sectors, the better
this approximation. In consequence, the area of the circle, which is the sum
of the sector areas, is — to a good approximation — the radius, times the
sum of the base lengths, divided by 2. But the sum of the base lengths is the
circumference of the circle. So, the area of the circle is its radius, multiplied
by the circumference, and divided by 2. Since the circumference of a circle
of radius R is $2\pi R$, its area is $2\pi R \times \frac{R}{2} = \pi R^2$, which is the formula you
probably remember from high school. Incidentally, why is the circumference
of a circle 2πR? This is a question of definition. The number π is defined as
the ratio between the circumference of the circle and its diameter, that is,
between the circumference and 2R.
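The sector argument can be tried numerically. In this sketch (the function name and the choice of sector counts are assumptions made for the illustration), every sector is replaced by the triangle whose base is the straight chord under its arc, and the triangle areas are added up.

```python
import math


def area_from_sectors(radius, n_sectors):
    """Approximate a circle's area by replacing each of n equal sectors with a triangle.

    Each triangle has its apex at the center; its base is the chord joining the
    two ends of the sector's arc, and its height runs from the center to that chord.
    """
    half_angle = math.pi / n_sectors
    base = 2 * radius * math.sin(half_angle)
    height = radius * math.cos(half_angle)
    return n_sectors * base * height / 2


radius = 1.0
for n in (6, 60, 600, 6000):
    print(n, area_from_sectors(radius, n))
print(math.pi * radius**2)   # the formula pi * R**2, for comparison
```

The narrower the sectors, the closer the sum of the triangle areas comes to the last line.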
In antiquity, the use of arguments of this type to calculate areas and
volumes was known as the “method of exhaustion.” It was developed by
Eudoxus, and already appeared in Euclid’s Elements, which was written
in the fourth century BC. The master of this method was Archimedes.
He could calculate the area of a circle, the volume of a sphere, and the
inscribed area between a parabola and a straight line. Among all his accom-
plishments, Archimedes most appreciated these calculations. He asked that
a cylinder and a ball inscribed within it be engraved on his tombstone,
as a testament to the result of which he was most proud: that the volume
of a cylinder circumscribing a sphere is $\frac{3}{2}$ times as large as that of
the ball.
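Archimedes' ratio is easy to verify once the modern volume formulas are granted. The sketch below assumes the standard formula $\frac{4}{3}\pi r^3$ for the ball, which is not derived here, and writes both volumes as multiples of $\pi r^3$ so that the arithmetic stays exact; the variable names are chosen for the illustration.

```python
from fractions import Fraction

# Volumes expressed as multiples of pi * r**3, so the arithmetic stays exact.
cylinder_coeff = Fraction(2)     # base pi * r**2 times height 2r
sphere_coeff = Fraction(4, 3)    # standard formula (4/3) * pi * r**3

ratio_cylinder_to_ball = cylinder_coeff / sphere_coeff
ratio_ball_to_cylinder = sphere_coeff / cylinder_coeff

print(ratio_cylinder_to_ball, ratio_ball_to_cylinder)   # 3/2 2/3
```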
The idea of “looking through a microscope,” that is, using sizes that tend
to zero, enjoyed renewed currency in the seventeenth century. This began,
actually, by looking through a telescope. Tycho Brahe, the great Danish
Archimedes asked that this drawing be engraved on his tombstone. He proved that the
volume of the ball inscribed in the cylinder is $\frac{2}{3}$ of the volume of the cylinder. He thereby
discovered the formula for the volume of a ball.
Archimedes, a Greek who lived in Sicily (287–212 BCE). The greatest mathematician of
the ancient world, he was a man of many talents — mathematician, engineer and
inventor. He developed a type of pump named after him, invented the planetarium, and
built sophisticated weaponry for the defense of his city, Syracuse, against the Romans.
When the city was finally conquered, a Roman soldier found him drawing geometric
shapes in the sand. As the legend goes, when the soldier asked him what he was doing,
he replied, “Don’t bother my circles” — an answer that cost him his life.
Small things, the poem teaches us, leave space for imagination. Since they
don’t take much space, the rest will be filled by our thoughts. The next poem,
by the American poet William Carlos Williams, speaks of the importance of
details:
so much depends
upon
a red wheel
barrow
Masaoka Shiki
Infinitely Many Numbers
Having a Finite Sum
The Greek philosopher Zeno of Elea (490–425 BC) was preoccupied with the
relation between the freeze frame and the whole picture. Two millennia later,
mathematicians would be occupied with the same issues, which would lead
them to invent differential calculus. Zeno, who lacked the necessary tools,
thought that he had reached contradictions, the most famous of which is
“the paradox of Achilles and the Tortoise.” The original paradox speaks of a
race between Achilles and a tortoise, but I prefer to present a slightly different
version, about a competition between the two hands of a clock. The question is
this:
The hands of the clock meet at 12 o’clock. At what hour will they meet
again?
Zeno’s paradox states that this will never happen. This is absurd, but
Zeno had a “proof.” At exactly 1 o’clock the hour hand will be ahead
of the minute one, because it will point to 1, while the minute hand will
point to 12. By the time the minute hand reaches 1, the hour hand will
advance a bit. Actually, we know exactly how much: $\frac{1}{12}$ of an hour, since
its speed is $\frac{1}{12}$ that of the minute hand, and the minute hand advanced
from 12 to 1. Now the hour hand points to the hour $1\frac{1}{12}$ (1:05), and the
minute hand has to reach this place. But until it does so, the hour hand
will advance a bit more. And, once again, we know exactly how much: $\frac{1}{12}$
of the advance of the minute hand, that is, $\frac{1}{12} \times \frac{1}{12} = \frac{1}{144}$ of an hour. The minute hand
has to reach this place, and in the meantime the hour hand will advance
a bit more, $\frac{1}{12} \times \frac{1}{144} = \frac{1}{1728}$ of an hour. This will continue in the same
way: every time that the minute hand reaches the previous position of the
hour hand, the hour hand will advance a bit more in the meantime. It
therefore seems that the minute hand will never be able to catch up to
the hour hand! There will always be some gap between them, even if it
becomes increasingly small. In the original paradox, in which Achilles ran
against a tortoise, Achilles gives the tortoise a head start. According to
the exact same argument, Achilles will never be able to catch up to the
tortoise, because every time that he reaches the tortoise’s former position,
the tortoise will have advanced in the meantime, even if only a little bit.
But we know, of course, that the minute hand will pass the hour hand,
and that Achilles will quickly overtake the tortoise. So where does Zeno
cheat?
Before we answer that, let us ask ourselves: at what hour exactly do the
two hands meet? We could write an equation, and then solve it. But there
is a more elegant, and much simpler, way to do this. Over the course of 12
hours the two hands meet 11 times: every whole hour and something; except
for the hour 11 and something, since then the meeting will be at 12, and not
at “11 and something.” Now, note that the time that passes between each
two meetings of the hands is the same. A simple way to see this is to remove
the numerals from the clock, and to rotate it so that at the time of a meeting
it will look as if both hands are pointing to 12 o’clock. Now it is clear that
the exact same time will pass until the next meeting as passed between 12
o’clock and the first succeeding meeting. Therefore, the 12 hours of the day
divide into 11 equal parts. And so, $\frac{12}{11}$ of an hour passes between each two
meetings. The first meeting after 12 o’clock will be at the hour $\frac{12}{11}$, which is
a little before 1:06.
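The meeting times can be listed exactly with a short Python sketch. The function names are invented for the illustration, and positions on the dial are measured in "dial hours" (the hour hand covers one per hour, the minute hand twelve), which is an assumption of this sketch rather than notation from the text.

```python
from fractions import Fraction


def meeting_times():
    """Times (in hours after 12:00) at which the two hands coincide within 12 hours."""
    interval = Fraction(12, 11)   # 12 hours split into 11 equal parts
    return [k * interval for k in range(1, 12)]


def as_clock(hours):
    """Format a time in hours as h:mm:ss, rounding down to the whole second."""
    total_seconds = int(hours * 3600)
    h, rest = divmod(total_seconds, 3600)
    m, s = divmod(rest, 60)
    return f"{h}:{m:02d}:{s:02d}"


times = meeting_times()
for t in times:
    # Hour hand sits at t dial hours, minute hand at 12 * t; they coincide
    # when the difference is a whole number of turns of 12 dial hours.
    assert (12 * t - t) % 12 == 0
    print(as_clock(t))
```

The first line printed is 1:05:27, a little before 1:06, and the eleventh meeting falls exactly at 12:00:00.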
Where did Zeno go wrong? His argument was correct up to a certain point,
but he erred in his conclusion. He divided the time to the next meeting of
the hands into infinitely many parts. The sum of these periods of time (in
hours) is $1 + \frac{1}{12} + \frac{1}{144} + \frac{1}{1728} + \cdots$. This is an infinite series, that is, the
sum of infinitely many numbers. Zeno argued that since there are infinitely
many terms, the sum is infinite, namely, the time that will pass until the
next meeting is infinite. This, however, is wrong. The sum of an infinite
number of numbers can be finite, on condition that the numbers decrease
at a sufficiently rapid pace. And this is what happens here: each number
is smaller than its predecessor by a factor of 12. This means that it is a
geometric series, with a quotient of $\frac{1}{12}$ (we first encountered this notion
in the chapter “Mathematical Ping-Pong”). And every geometric series
with a positive quotient smaller than 1 has a finite sum. The classic example is the
geometric series with the quotient of $\frac{1}{2}$, a series in which each element is
smaller than its predecessor by a factor of 2. The sum is $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$,
which is 2. In order to see this, observe that the distance of 1 from 2 is
1; the distance of $1 + \frac{1}{2}$ from 2 is $\frac{1}{2}$; the distance of $1 + \frac{1}{2} + \frac{1}{4}$ from 2 is
$\frac{1}{4}$. The addition of each element in the series halves the distance from 2.
Accordingly, the partial sums of the series tend to 2. The “partial sums” are
the sums of the first elements — in this case, the first partial sum is 1, the
second partial sum is $1 + \frac{1}{2}$, the third partial sum is $1 + \frac{1}{2} + \frac{1}{4}$, and so on.
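The halving of the distance can be watched directly. In this sketch, the helper name `partial_sums` is an assumption made for the illustration; exact fractions are used so that no rounding blurs the pattern.

```python
from fractions import Fraction


def partial_sums(ratio, count):
    """Return the first `count` partial sums of 1 + ratio + ratio**2 + ..."""
    sums, total, term = [], Fraction(0), Fraction(1)
    for _ in range(count):
        total += term
        sums.append(total)
        term *= ratio
    return sums


halves = partial_sums(Fraction(1, 2), 8)
gaps = [2 - s for s in halves]   # distance of each partial sum from 2

print([str(s) for s in halves])
print([str(g) for g in gaps])
```

Each entry of the second list is half the one before it, so the partial sums close in on 2 without ever passing it.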
It is also not difficult to prove that if q is a positive number smaller
than 1, the infinite geometric series $1 + q + q^2 + q^3 + \cdots$ converges to a
finite number. What number? We can calculate this with the method we
used in the chapter “The Book in Heaven.” Use the letter S for the sum
$1 + q + q^2 + q^3 + \cdots$. Multiply each element in the series by q. This gives
us $qS = q + q^2 + q^3 + q^4 + \cdots$. Note how close the expression for qS is
to S itself: it just misses the first term, 1. So, $qS = S - 1$. This can be
viewed as an equation in the unknown S, which can be easily solved. Moving
S to one side we get $S(1 - q) = 1$, and then by dividing both sides by
$(1 - q)$, we obtain $S = \frac{1}{1-q}$. For example, if $q = \frac{1}{2}$, for which the series is
$1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$, we obtain $S = \frac{1}{1 - \frac{1}{2}} = 2$, which is what we discovered
earlier. In the case of the clock, $q = \frac{1}{12}$, so by this formula the hands will meet
after $1 + \frac{1}{12} + \frac{1}{144} + \frac{1}{1728} + \cdots = \frac{1}{1 - \frac{1}{12}}$, which is $\frac{12}{11}$ hours, as we already know.
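The formula can be checked against the clock with exact fractions. This is a sketch; `geometric_sum` and `partial_sum` are names invented for the illustration.

```python
from fractions import Fraction


def geometric_sum(q):
    """Closed form S = 1 / (1 - q) for 1 + q + q**2 + ..., valid for 0 < q < 1."""
    return 1 / (1 - q)


def partial_sum(q, n_terms):
    """Sum of the first n_terms terms, added one by one."""
    return sum(q**k for k in range(n_terms))


q = Fraction(1, 12)
S = geometric_sum(q)

print(S)                 # 12/11
print(q * S == S - 1)    # the equation used in the derivation
for n in (1, 2, 3, 10):
    # What is still missing after n of Zeno's steps.
    print(n, float(S - partial_sum(q, n)))
```

The remaining gap after n steps shrinks by a factor of 12 each time, which is why infinitely many steps fit into a finite stretch of time.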
Is it always the case that when the elements of the series tend to 0, the
series sum is finite? The answer is no. The simplest example of this is the
following:
$1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{3} + \frac{1}{3} + \frac{1}{3} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \cdots$ (next comes 5 times $\frac{1}{5}$). The elements
tend to 0, but two halves are 1, three thirds are 1, and four fourths are 1 —
we have a sum of 1 an infinite number of times. This means that the partial
sums of the series tend to infinity, that is, the sum is infinite.
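A few lines of Python make the growth visible. This is a sketch with made-up function names; the terms are produced block by block, and the sum over the first few blocks is computed exactly.

```python
from fractions import Fraction
from itertools import islice


def block_terms():
    """Yield 1, 1/2, 1/2, 1/3, 1/3, 1/3, ...: the fraction 1/n repeated n times."""
    n = 1
    while True:
        for _ in range(n):
            yield Fraction(1, n)
        n += 1


def sum_of_first_blocks(blocks):
    """Add up all terms in the first `blocks` blocks."""
    term_count = blocks * (blocks + 1) // 2   # 1 + 2 + ... + blocks terms
    return sum(islice(block_terms(), term_count))


for b in (1, 2, 5, 50):
    print(b, sum_of_first_blocks(b))
```

After b blocks the sum is exactly b, so the partial sums pass any bound we care to name.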
The next example is more sophisticated, and also more important,
because it appears in numerous contexts. This is the series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$,
which is called the “harmonic series.” Its elements tend to 0, but its sum is
infinite, which means that its partial sums tend to infinity. In order to see
this, partition the sum as follows:
$1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \left(\frac{1}{9} + \frac{1}{10} + \cdots + \frac{1}{16}\right) + \cdots$. Each pair of
parentheses contains a sum that is at least $\frac{1}{2}$. Why? The first pair has two
numbers, each of which is at least $\frac{1}{4}$, so the sum is at least $\frac{1}{2}$; the second
parenthesis contains 4 numbers, each of which is at least $\frac{1}{8}$, so the sum is at
least $\frac{1}{2}$, and so on. So, we have a sum of infinitely many numbers each being
at least $\frac{1}{2}$, which gives infinity.
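The grouping argument can be tested for the first few groups. In the sketch below, the function names and the numbering of the groups (group k runs from $\frac{1}{2^k + 1}$ to $\frac{1}{2^{k+1}}$) are assumptions made for the illustration.

```python
from fractions import Fraction


def group_sum(k):
    """Sum of the k-th parenthesized group: 1/(2**k + 1) + ... + 1/(2**(k + 1))."""
    return sum(Fraction(1, n) for n in range(2**k + 1, 2**(k + 1) + 1))


def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


for k in range(1, 8):
    # Each group contributes at least 1/2 ...
    assert group_sum(k) >= Fraction(1, 2)
    # ... so the partial sum up to 1/2**(k + 1) is at least 1 + (k + 1)/2.
    assert harmonic(2**(k + 1)) >= 1 + Fraction(k + 1, 2)
    print(k, float(group_sum(k)), float(harmonic(2**(k + 1))))
```

The growth is slow: the number of terms has to double to gain another $\frac{1}{2}$. But it never stops.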
Twists
On its face, this is an uninspired poem. The metaphors are corny, and the
story slow and heavy. But then everything changes in the last line. Suddenly
we realize that all details of the poem are metaphors for the course of human
life. Each line has to be read anew. Life is like a book too hard to understand;
man is like a child desperately trying to decipher the code of a book, an
attempt that is doomed to fail; fate treats a man towards the end of his life
like a forgiving father who puts his child to sleep at night; the child’s sleep
turns out to be, in fact, death; the opening words of the poem, “It may
happen,” acquire an ironical meaning: it is not that “it may happen,” it is
always the case.
So, the twist is a trick of condensation. A lot of information is compressed
into one line. But it is a special type of condensation: it does not require
conciseness. On the contrary, the longer the poem is, the more information
is compressed into the last line. The details are not understood correctly
at first reading, so they may as well be said at length. The moment of
illumination will be too brief for consciously scanning all these details and
reinterpreting them.
Let me just point out one other stratagem that Steinberg’s poem uses that
makes it a gem: the reversal of roles between tenor and vehicle. Throughout
the poem, and in particular in the last line, it seems that life is a metaphor
for reading the book. “The play is over, like the life of a man.” The meaning
is of course the opposite, “life is for us like a book to an ignorant child.”
“Knowing without knowing,” all the way.
Knowing without Knowing
What makes a person beautiful? One factor is essential: that we should not
know why he or she is beautiful. Beauty is said to be “blinding,” “stunning,”
“breathtaking” — all expressions attesting to its being beyond our conscious
understanding. We may feel stunned by scenery that is too majestic for us to
grasp with our ordinary tools of perception. Beautiful musical compositions
are too complex for us to know just what happens in them. Beauty is hidden
in what we don’t completely understand, at least not consciously.
This explains many of the familiar characteristics of poetry. They are
all related to the aim of the poem, which is to sneak in messages without our noticing.
The poetic devices are meant to distract our attention, so that the message
slips under the radar of consciousness. Brevity, for example, is nothing but
the magician’s dexterity, meant to deceive our critical faculties. The use
of external devices, like rhyme and meter, is meant to draw our attention
away from the content. This is the handkerchief under which the magician
performs his tricks. Outside appearance hides internal messages: an apparent
paradox hides deep truth, while verbally similar phrases may conceal deep
underlying contrast.
Poetical repetition
In nonfiction there are few sins more serious than repetition. A piece of
information that appears twice in the same text is like a stitch in a garment
that is wrongly exposed. “I already know that,” the reader thinks, and the
magic disappears. In poetry, by contrast, repetition is a powerful device.
A famous example is the poem by the Spanish poet Federico Garcia Lorca
(1898–1936), “Lament for Ignacio Sanchez Mejias.” Mejias, a close friend of
the poet, was a bullfighter who was killed in the arena. In every second line
the poem repeats the time of the corrida, and the effect is strong — the
poem’s fame is well-deserved. The following is one stanza from the poem:
At five in the afternoon,
It was exactly five in the afternoon.
A boy brought the white sheet
at five in the afternoon.
A frail of lime ready prepared
at five in the afternoon.
The rest was death, and death alone.
Why poetry?
Poetical detachment
Bialik, still alive, prophesies that he will never sing his true song, as if this
were not at all dependent on his will. His true song, he explains in the
continuation of the poem, was destroyed within him, without intent and
without his being able to control this.
Poetic justice
The danger facing someone who is searching for the lowest point: when he reaches a
valley, he is liable to get stuck there instead of climbing out of the present valley to look for
a deeper one. In life, too, the conduct that solves a local problem might not be effective
in an overall perspective.
ineffective, and a person must retreat within himself to mobilize new forces.
Upon returning to the external world, he is likely to use new, often better,
behavioral strategies.
But crises exact a stiff price, one that cannot be paid too often. This
is why man invented ways to bring about minor shakeups, in which only
one attachment to the world becomes undone, gently and not explicitly. The
effect is not sweeping, but even a small change is of value. Art is one of
the ways to induce such a shaking. The mind sends many minute tentacles
to the world, to hold onto ideas, objects, or people. Art removes, if even
for a fleeting moment, the grasp of one tentacle. For a moment the energy
that was invested in the external world is detached, and retreats inward.
When reattached, it can be rebuilt in a new form. A painting that represents
its object from a new angle, the reorganization of sounds in music, or the
unconventional arrangement of ideas in poetry — all these undo existing
attitudes, making us aware of other possible ways of relating to the world.
Know yourself
For childhood
does not grow, no, never.
It is covered with layers, like a thickening shell.
Not mine
purges. The term “estrangement” that he coined comes from “strange,” and
means placing things in a strange and new light, for the purpose of restoring
pristine freshness to their perception. Shklovsky argued that as the years
pass our senses become dulled, and they need to be shaken to revitalize
them. The role of art, he said, is to cause someone who lives next to
the sea to hear anew the murmur of the waves to which he has long been
accustomed.
Estrangement is a type of shaking. What is special about it, among other
artistic shakeups, is our awareness of it. In most artistic experiences a person
forgets himself. Someone watching a movie is usually absorbed in the plot and
a person listening to a symphony forgets the world outside. Estrangement
does the opposite — it places a mirror before the viewer, and causes him
“to look at himself in surprise.” The result is the shock of alienation, and
the awakening of consciousness to automatic responses. Like the surprise
of Monsieur Jourdain, the hero of Molière’s The Bourgeois Gentleman, who
suddenly realizes that he has been speaking prose his entire life.
The experience of estrangement is often accompanied by pleasure. For a
moment we cease to be slaves to habit, and become its masters. A famous
example is the pleasure we derive from discovering the origin of words. Someone
who discovers that the three words “radio,” “radiator,” and “radius” are all
derived from the Latin word radius, meaning “ray” or “beam,” suddenly becomes
master of the words instead of their slave, and entertains the pleasure of
freedom. Indeed, one of the terms Shklovsky uses for estrangement is “de-
automatization.”
Estrangement is particularly characteristic of modern art. Picasso, with his
deformed faces, causes us to re-think the way we perceive our surroundings.
Stravinsky shocked the ears of audiences at the beginning of the twentieth
century, and caused people to stop for a moment and think about what
music is and what role sounds play in their lives. Brecht declared it his
aim to cause people to step back from the show and realize they are
watching theater, and not real life. These are manifest cases, but even when
the estrangement is not so obvious it is always there, in all art, if only for
the reason that it is taken out of the context of everyday life, and happens
in a museum or a concert hall.
4. A patient has to take daily one pill of type A, and one pill of type B.
One day, each of the two bottles had exactly two pills left. And then
disaster struck — both bottles fell and broke, and the pills were mixed
up. Unfortunately, the two types of pills look the same, and the patient
cannot distinguish between them. What will he do?
Habit is the obstacle to solving each of these problems. The first problem
is difficult, because we are accustomed to cutting pizzas, and a pizza is
divided into 6 sections with 3 cuts, and not 8. But there is a difference: a
pizza is thin, while a cake has another dimension in which it can be cut —
its height. If we only think in terms of this third dimension, the solution is
easy: divide it into 4 pieces by two vertical cuts, as with a pizza; and then
cut the cake across its width, perpendicular to the vertical axis.
For the second problem, as well, we have to overcome habit: we are used
to two-dimensional matchstick puzzles, in which the matches lie in the plane.
As soon as we forgo this assumption and allow the matches to be in three
dimensions, the solution is simple — try it for yourselves.
The difficulty with problem no. 3 is that we assume that the broken line
has to be drawn within the square. If we allow it to go beyond the square,
then the solution is quite easy. Here it is:
invest effort in examining special cases. His staring, so it seems to me, is less
intensive, and he works as if in a dream.
All this stems from one basic difference: while the mathematician tries
to discover something in the world, the poet’s aim is to dive within him (or
her)self. The discovery of order in the world can be done in a conversation
with someone else, and be aided by ideas of others, but only the individual
himself can delve into his soul.
Depth
unconscious, from where new ideas, both strange and beautiful, are meant
to be drawn. In poetry, too, split perception is not just a means, but the
result of thought modes of this type. Mathematical magic, like poetic magic,
is produced by unexpected leaps of thought, and from amorphous thought in
which “a blue kerchief becomes a deep well.” This is the matter from which
beautiful mathematical discoveries and beautiful poems are cut out. And
whoever is capable of such thought could become a perfect mathematician
or a perfect poet.
Appendix A: Mathematical Fields
Algebra: Algebra was invented by the Indians, who later handed it down
to the Arabs. It reached the West through Al-Khwarizmi’s book Al-Jabar,
literally meaning “restoration.” This refers to a common way of solving equations,
by performing identical operations on the two sides of the equation. The main idea
in algebra is denoting numbers by letters. This is useful in two cases:
when we want to speak about general numbers (in this case the letter is
called a “variable”) and when we want to find an unknown number from some
indirect information about it (in this role the letter is called an “unknown”). The
meaning of the word “algebra” changed in the nineteenth century, and came
to denote the study of operations, like the four operations on numbers, but
more general and abstract. The most basic example of a structure with an
operation is the “group,” a set with one operation that has properties similar
to those of addition in the integers.
Combinatorics, or Discrete Mathematics: A field dealing with (usually)
finite sets, and relations defined on them. Classical combinatorial problems
are about counting: in how many ways can you choose a committee of 4 out of
100 people? “Discrete” is the opposite of continuous, so discrete mathematics
usually deals with integer quantities, as opposed to quantities that can be
as small as we please. Computers act discretely: a cell can be active or
inactive, with no intermediate possibilities, and a digit is either 0 or 1,
with no third possibility. Hence the study of computer actions needs discrete
mathematics. For this reason discrete mathematics has flourished in the last
half century.
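The committee question has a short answer that can be computed in the spirit of the counting arguments used for lines of people. In this sketch the function name `committees` is an assumption; `math.perm` and `math.factorial` are standard library functions (Python 3.8 or later).

```python
import math


def committees(people, seats):
    """Number of ways to choose an unordered committee of `seats` from `people`."""
    ordered = math.perm(people, seats)         # pick the members one after another
    return ordered // math.factorial(seats)    # forget the order in which they were picked


print(committees(100, 4))   # 3921225
```

Picking 4 people one after another can be done in 100 × 99 × 98 × 97 ways, and each committee is counted 4! = 24 times, once for every order in which its members could have been picked.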
Differential and Integral calculus: A branch of mathematics that deals
with limits, in particular with tending to zero or to infinity. So, in particular
it deals with quantities that are “infinitely small” (meaning really — as
small as we please), and hence it is also named “infinitesimal calculus.” The
seeds of this field were already sown by the ancient Greeks, who used it
to calculate areas and volumes. In the seventeenth century the field had a
revival, because it is so useful in physics, in particular in the study of motion.
The most prominent among its developers were Fermat, Barrow, Newton and
Leibniz. Differential calculus studies continuous change. It goes from the
global behavior to the local, for example calculating the velocity of a body
at a given moment from knowledge of its overall motion. Integral calculus
goes the other way — from local to global. For example, from knowing the
velocity of a body at every given moment to the overall distance that it
covers in its motion.
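The two directions can be illustrated numerically. The sketch below assumes a hypothetical motion in which the distance covered after t seconds is t squared; the function names and step sizes are likewise choices made for the illustration, and the small steps stand in for quantities "as small as we please."

```python
def position(t):
    """Hypothetical motion chosen for illustration: distance t**2 after t seconds."""
    return t * t


def velocity_at(t, dt=1e-6):
    """Differential calculus, global to local: velocity from the overall motion."""
    return (position(t + dt) - position(t - dt)) / (2 * dt)


def distance_covered(velocity, t_end, steps=100_000):
    """Integral calculus, local to global: add up velocity * (small time step)."""
    dt = t_end / steps
    return sum(velocity((i + 0.5) * dt) * dt for i in range(steps))


v3 = velocity_at(3.0)                          # velocity at the moment t = 3
d3 = distance_covered(lambda t: 2 * t, 3.0)    # distance covered up to t = 3

print(v3, d3)
```

For this motion the velocity at time t is 2t, so the first number should come out close to 6, and adding up the velocities brings back the distance of 9 covered in the first three seconds.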
For about two hundred years mathematicians studied the notions of tend-
ing to zero and to infinity in an intuitive manner. In the nineteenth century
it transpired that precise definitions were needed. French and German mathematicians,
Cauchy, Riemann, Cantor and Weierstrass among them, took on this job.
They got rid of notions like “infinitesimally small quantities”, and replaced
them by “quantities as small as we please”.
Mathematical logic: This is the “mathematics of mathematics”, namely
mathematical study of what mathematicians do. The field began with
Aristotle, who pointed out some basic rules, and defined what logical
implication is. It revived towards the end of the nineteenth century, with the
realization of Frege that a mathematical proof is a game that could be played
mechanically (in modern-day terminology, by a computer). A far-reaching
revolution in the field was made by Gödel, who proved in 1931 some impossibility
results: for example, that no “reasonable” set of axioms for number
theory can prove all true statements about numbers, and that though a
computer can recognize a proof, no single computer program will be able to
prove all provable statements in Number Theory.
Number Theory: The oldest and one of the deepest mathematical fields.
Its objects, the natural numbers (1, 2, 3, . . . ), are seemingly simple, but
their study gave rise to the development of entire mathematical fields — for
example, algebra.
Set theory: The basic notion of this theory is a very simple one — elements
belonging to sets. While the study of finite sets is given to combinatorialists
(see above), set theory concerns itself with infinite sets. One of its main
topics is the study of sizes of sets. Cantor, the founder of the field, proved
that even in the infinite realm there are different possible sizes — there are
big infinite sets, and there are bigger.
Topology: Topology is geometry without measuring distances. If you take a
rubber sheet, and stretch or compress parts of it in some directions, without
making any cuts in it, for the topologist the sheet will remain the same. What
matters is only the number of holes in the sheet, which does not change with
stretching and compressing.
Appendix B: Sets of Numbers
Algebraic numbers: These are real numbers that are a solution to a
polynomial equation with integer coefficients. For example, √2 is algebraic,
because it is a solution of the equation x² = 2.
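The claim about the square root of 2 can be checked with a few lines of code. This is a rough sketch only: the helper name `evaluate` is a hypothetical choice, and since a computer stores the square root of 2 only approximately, the result is a number extremely close to zero rather than exactly zero.

```python
import math


def evaluate(coefficients, x):
    """Evaluate a polynomial at x by Horner's rule.
    The integer coefficients are listed from the highest power down."""
    result = 0.0
    for c in coefficients:
        result = result * x + c
    return result


# The equation x*x = 2 is the same as x*x + 0*x - 2 = 0,
# a polynomial equation with integer coefficients 1, 0, -2.
root_two = math.sqrt(2)
residual = evaluate([1, 0, -2], root_two)

if __name__ == "__main__":
    print(residual)  # a tiny rounding error, not a genuine nonzero value
```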
Complex numbers: The square of a real number is always non-negative.
This means that in the real numbers you will not find a number x satisfying
the equation x² + 1 = 0 (in other words x² = −1). So, such a number must
be invented, just like negative numbers were invented to solve equations
like x + 1 = 0. Such a number, later denoted by i (for “imaginary”) was
introduced in the sixteenth century. Once i is around, it can be combined
with real numbers, namely multiplied by them and added to them. The
result of such operations is called a “complex number”, for example 3 + 2i
(complex, because it is composed of two parts: the real part, which is the 3
in the example above, and the imaginary part, the 2i in the example. Actually,
we say that the imaginary part is 2, the i being indicated by the word
“imaginary”). Gauss showed that using complex numbers, any polynomial
equation can be solved. So, there is no need for further extension of the
kingdom of numbers, at least not from the direction of seeking solutions for
equations.
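Many programming languages have complex numbers built in, which makes the description above easy to try out. The sketch below uses Python, where the imaginary unit is written `1j` rather than i; the variable names are illustrative only.

```python
import cmath

# The invented number i: its square is -1, so it solves x*x + 1 = 0.
i = 1j
i_squared_plus_one = i * i + 1

# The complex number 3 + 2i, with real part 3 and imaginary part 2.
z = 3 + 2j
real_part = z.real
imaginary_part = z.imag

# With complex numbers available, the square root of -1 exists.
root_of_minus_one = cmath.sqrt(-1)

if __name__ == "__main__":
    print(i_squared_plus_one, real_part, imaginary_part, root_of_minus_one)
```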
Appendix C: Poetical Mechanisms
Mentioned in the Book
The following definitions of the poetic devices mentioned in the book are
tilted in the direction of the topic of this book, namely, their effect on the
reader and the way they generate beauty.
Anaphora: the repetition of the same combination of words at the begin-
ning of sentences; a special case of the poetical repetition, in which an
expression recurs throughout the poem. As in many poetical mechanisms,
this creates a gap between the external expression and the underlying con-
tent: the repetition conceals change. Every time that the poem returns to
this expression it has a slightly different meaning, if only for the cumulative
effect. Repetition at the end of sentences is called epiphora.
Chiasmus: crossing. The word has its source in the Greek letter χ, called
“chi” (the parallel of X). Chiasmus exchanges places or roles, such as “He
wakes in the morning, but morning doesn’t wake in him.”
Compression: expressing many ideas with a single symbol. Compression
is not a mechanism by itself, and may be effected in many ways, including
metaphor, multiple meanings of words, or a picture that contains many mes-
sages. This is such an outstanding poetical trait that it is sometimes used
as part of the definition of poetry. Poems, like jokes, transmit their mes-
sages concisely. The German word for poetry is “Dichtung,” which means
“compression.”
Conceit: a sophisticated metaphor, in which the distance between the tenor
and the vehicle (see “Metaphor,” below) is large. The word has its source in
the originally Latin word “concept.” This term was invented to describe the
type of metaphors used by the members of a seventeenth-century poetical
movement in England, who were called the “Metaphysical poets” by their
opponents.
Displacement: turning the spotlight on one element of the poem, while the
true message is in an idea that appears at its fringes. That is, displacement
emphasizes a less important element in order to incidentally transmit the
(see above) without conciseness: at the moment of the turnaround, the reader
must absorb, all at once, the meanings of many things that appeared before-
hand, and that must suddenly be reinterpreted. Since it is impossible to
absorb so much all at once, a large part of the absorption is necessarily
subconscious.