
10 RICHARD ELWES & ROB STURMAN

2. Looping
2.1. Storage in Python. The real value of a computer is that, as well as performing arithmetic,
it can store information in its memory. The challenge for the programmer is to be adept at
storing information, and retrieving it, in the correct way.
2.2. Self-referential equations. To a mathematician, one of the strangest ideas found
in computer programming is a statement such as n=n+1. Clearly there is no value of n for
which this is true. The explanation is that in programming, the symbol = is used in quite a
different way from ‘equals’. If we interpret = as the assignment operator, the statement
makes more sense. In this context, the statement reads ‘in the variable n store the value n+1’.
Then, provided n has a value at the time the statement is executed, the value stored in n
will be incremented by 1. A common shorthand for this operation is n += 1.
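As a small illustration of assignment (the variable values here are our own choice):

```python
n = 5
n = n + 1   # assignment: compute n + 1, then store the result back in n
print(n)    # 6
n += 1      # shorthand for the same operation
print(n)    # 7
```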
2.3. Loops. One of the most fundamental uses of a computer is to perform an operation
repeatedly, many times over. A typical way to accomplish this is to use control flow statements,
which can be thought of as looping.
2.3.1. while. The while statement is used for repeated execution of a procedure as long as a
particular expression is true. For example, consider the factorial function n! we used in the
previous workshop. Instead of using factorial from the module math, we might encode this
function as follows:
n = 10
n_factorial = 1
while n > 0:
    n_factorial = n_factorial * n
    n = n - 1
Note the indentation of this code. Successful programming always depends on accurate
syntax (for example, note the : at the end of the while statement), and in Python this
includes the indentation. Each line within a block must be indented by the same amount. The
editor Spyder will indent your code automatically. The statement n = n - 1 in the code
decrements n by 1. The while loop stops when the value of n reaches zero.
2.3.2. for. An alternative way to encode the same procedure is to use the for statement. This
command instructs Python to perform each command in the indented block for each value of
i in the range given.
n = 10
n_factorial = 1
for i in range(n):
    n_factorial = n_factorial * (i + 1)
Using a for loop we do not need to increment or decrement explicitly. Python’s for statement
iterates over the items of any list (or string).
2.3.3. if. The if statement, of course, only executes if a particular expression is true. It is
often used in conjunction with the commands elif (short for ‘else if’) and else.
# Tests for primality using Wilson's theorem
n = 97; n_minus_one_factorial = 1
for i in range(n - 1):
    n_minus_one_factorial = n_minus_one_factorial * (i + 1)
if n_minus_one_factorial % n == n - 1:   # Use == to test for equality
    print('Prime')
else:
    print('Composite')
MATH2920 COMPUTATIONAL MATHEMATICS, 2018/19 11

This code uses Wilson’s theorem to test for primality. First it uses the previous procedure
to compute (n − 1)!. Then the if statement asks whether the remainder on dividing (n − 1)!
by n is equal to n − 1. If this is true, the programme prints ‘Prime’ to the screen, else it prints
‘Composite’. The print function is a useful one for displaying answers, or checking that a
programme is running as expected. You can print any variable using this command; in this
case we print strings.
2.3.4. break. Both while and for loops can be made to exit before the procedure is finished.
This is done using the break keyword, which will stop the code from executing any further,
and break out of the smallest enclosing loop. The following code gives another method, albeit
not a good one, of computing a factorial, using the break command.
n = 10; n_factorial = 1; i = 1
while True:   # means: repeat forever
    i = i + 1
    n_factorial = n_factorial * i
    if i == n:
        break

2.4. Euclid’s algorithm. This is an efficient method to find the greatest common divisor
(gcd) of two integers a and b. The algorithm can be described in a number of ways. Briefly,
the idea is to repeatedly subtract the smaller integer from the larger until both are equal. The
following gives this method in pseudocode5 to find the gcd:
if a = 0, gcd = b
while b > 0:
    if a > b:
        a = a - b
    else:
        b = b - a
gcd = a
For example, if (a, b) = (35, 10) initially, the algorithm produces the following sequence for a and b:
(a, b) = (35, 10) → (25, 10) → (15, 10) → (5, 10) → (5, 5) → (5, 0)
and so gcd(35,10)=5. If a and b are very different, this algorithm may be forced to subtract
a from b (or vice versa) many times in a row. An equivalent version of this algorithm, which
uses integer division instead of repeated subtraction, again in pseudocode, is the following:
while b > 0:
    t = b
    b = a mod b
    a = t
gcd = a
Repeating the previous example with (a, b) = (35, 10) we have
(t, a, b) = (t, 35, 10) → (10, 10, 5) → (5, 5, 0)
which again gives gcd(35, 10) = 5.
In fact, Python has the useful ability to assign values to more than one variable in a single
line, so we can simplify the code above further (dropping the unnecessary extra variable t):
while b > 0:
    a, b = b, a mod b
gcd = a
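The pseudocode above translates almost directly into Python. Here is a sketch as a function (the name gcd is our choice; Python’s % operator plays the role of mod):

```python
def gcd(a, b):
    """Greatest common divisor of two non-negative integers by Euclid's algorithm."""
    while b > 0:
        a, b = b, a % b   # replace (a, b) by (b, a mod b)
    return a

print(gcd(35, 10))   # 5
```

The standard library also provides math.gcd, which does the same job.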
5Pseudocode is a brief and informal description of programming code, without the correct syntax, but containing
the main ideas to be turned into correct code.

Mathematics Background 2
2.5. Euclid’s algorithm. Euclid’s algorithm has been described as the ‘oldest non-trivial
algorithm still in use’. Certainly it is old — it is described in Euclid’s Elements, written
around 300BC, but probably dates from before this.
The Python implementation given in lectures:
while b > 0:
a, b = b, a % b
gcd = a
is very succinct. It works by successively dividing one integer into another and computing
the integer quotient and the remainder at each stage. The sequence of remainders is strictly
decreasing, which guarantees that the algorithm will terminate and not run forever. The final
non-zero remainder is the gcd(a,b). The reason the algorithm works is that if a = bq + r, then
gcd(a, b) = gcd(b, r). Hence we are repeatedly replacing the gcd of a pair of integers with the
same gcd of a smaller pair of integers.
2.6. Extended Euclidean algorithm. The Euclidean algorithm can be adapted to find the
integers x and y which solve Bézout’s identity, given by:
ax + by = gcd(a, b).
By working backwards through the steps of the Euclidean algorithm it is easy to find integers
x and y such that gcd(a, b) = ax + by. However, a good algorithm would not need to work
backwards through steps already computed, nor should it need to store information unneces-
sarily. Here we demonstrate that Euclid’s algorithm can be extended to solve Bézout’s identity
avoiding both of these problems. Euclid’s algorithm produces a sequence of remainders ri , but
also it computes a pair of integers xi and yi such that ri = axi + byi .
For example, consider a = 375, b = 279. We will (to make indexing neat) denote r0 = a = 375
and r1 = b = 279. The first step of Euclid’s algorithm is to note that 375 = 1 × 279 + 96, so we
have r2 = 96, and rearranging gives 96 = 1 × 375 − 1 × 279, so x2 = 1 and y2 = −1. Continuing
through the algorithm we see that each x_i and y_i is related to the previous ones in the following
way.
We assume that
(2.6.1)    r_i = a x_i + b y_i.
At the ith step, the following remainder r_{i+1} is computed from
(2.6.2)    r_{i+1} = r_{i−1} − q r_i,
where q = ⌊r_{i−1}/r_i⌋ is the integer quotient of the Euclidean division at the ith step. Then substituting (2.6.1) into (2.6.2) we have
    r_{i+1} = a x_{i−1} + b y_{i−1} − q a x_i − q b y_i
            = a (x_{i−1} − q x_i) + b (y_{i−1} − q y_i).
Hence the sequences xi and yi satisfy the relations
xi+1 = xi−1 − qxi
yi+1 = yi−1 − qyi
To initialise the recurrence relations for xi and yi we note that a = (1 × a) + (0 × b) and
b = (0 × a) + (1 × b), and so we take x0 = 1, x1 = 0, y0 = 0, y1 = 1. This gives the extended
Euclidean algorithm given in pseudocode on the workshop sheet.
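The recurrences above can be carried along with the division steps. The following sketch (the function name and variable names are ours) returns gcd(a, b) together with x and y solving Bézout’s identity:

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b)."""
    # The pairs (r0, x0, y0) and (r1, x1, y1) satisfy r = a*x + b*y throughout.
    r0, x0, y0 = a, 1, 0
    r1, x1, y1 = b, 0, 1
    while r1 > 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1   # the usual Euclid step
        x0, x1 = x1, x0 - q * x1   # x_{i+1} = x_{i-1} - q x_i
        y0, y1 = y1, y0 - q * y1   # y_{i+1} = y_{i-1} - q y_i
    return r0, x0, y0

g, x, y = extended_gcd(375, 279)
print(g, 375 * x + 279 * y)   # both equal gcd(375, 279) = 3
```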

3. Integers
3.1. Positional-based integers. In our familiar Hindu-Arabic base-10 positional-based num-
ber system, we express large integers using powers of 10. For example, 256 is an efficient way of
writing 2 × 10^2 + 5 × 10^1 + 6 × 10^0. However, base 10 is mathematically arbitrary (it originates
in humans’ number of fingers). Integers can be expressed in any number base. In particular,
Theorem 3. Given any integer x ∈ ℕ and any choice of base B ≥ 2, x can be written as
    x = Σ_{n=0}^{N−1} k_n B^n,
where N = max{n ∈ ℕ | B^{n−1} ≤ x} and k_n ∈ {0, 1, . . . , B − 1} for 0 ≤ n < N.
Computers use binary, meaning base 2, whose ‘digits’ are called bits. In Python 3, the
computer’s memory is the only limit on the number of bits that can be used to store an integer.
But how do we deal with negative numbers? Python 3 internally stores numbers using a sign
& magnitude protocol, meaning that the first bit on the left represents the sign, with 0 for +
and 1 for −, and the following bits represent the integer’s absolute value in binary.
There is a slight inefficiency here, in that the codes 10 . . . 0 and 00 . . . 0 represent −0 and +0
(which are of course the same number). Other languages have different systems. In Python 2, integers (as
distinct from long integers) can occupy only up to 32 bits. So if we were interested only in
non-negative integers, we could encode all integers from 0 to 2^32 − 1. To include negative
numbers, instead of using the initial bit to indicate the sign, Python 2 uses a technique called
“two’s-complement” to re-order the mapping between decimal and binary, so that all integers
in the range [−2^31 : 2^31 − 1] are uniquely representable using 32 bits. See the mathematical
background for more details. In Python 3, ints can have arbitrary size.
3.2. Converting between decimal and binary.
3.2.1. Decimal to binary. One method is to repeatedly divide the decimal by 2, and to record
the remainder each time from right to left in the binary code. For example, consider the decimal
6. This algorithm then gives
6/2 = 3 remainder 0 → 3/2 = 1 remainder 1 → 1/2 = 0 remainder 1
We record the remainders from right to left, giving 110 as the binary representation of 6 (that
is, 1 × 22 + 1 × 21 + 0 × 20 ). A general algorithm to perform this routine is given by:
# Decimal to binary conversion
decimal, binary = 6, []
while decimal > 0:
    binary.append(decimal % 2)
    decimal = decimal // 2
binary.reverse()
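Wrapped as a reusable function (the name to_binary is ours; divmod returns quotient and remainder in one step), the same idea reads:

```python
def to_binary(decimal):
    """Binary digits of a positive integer, most significant digit first."""
    binary = []
    while decimal > 0:
        decimal, remainder = divmod(decimal, 2)   # quotient and remainder at once
        binary.append(remainder)
    binary.reverse()
    return binary

print(to_binary(6))    # [1, 1, 0]
print(to_binary(13))   # [1, 1, 0, 1]
```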

3.2.2. Binary to decimal. Given a binary number a_k a_{k−1} . . . a_1 a_0, with a_i ∈ {0, 1}, to find the decimal
representation we simply unpack it and add up:
    a_k · 2^k + a_{k−1} · 2^{k−1} + · · · + a_1 · 2^1 + a_0 · 2^0.
For example, the binary number 1101 has decimal representation 1 × 2^3 + 1 × 2^2 + 0 × 2^1 + 1 × 2^0 =
8 + 4 + 0 + 1 = 13. Some code to perform this is
# Binary to decimal conversion
binary = [1, 1, 0, 1]
decimal = 0
binary.reverse()
for i in range(len(binary)):
    decimal = decimal + (2 ** i) * binary[i]

3.3. Functions. In programming, a function is a building block of larger programs. Just as in
mathematics, where a function takes an argument and produces a result, so it does in Python.
The general form of a Python function is
def function_name(arguments):
    {lines telling the function what to do
     to produce the result}
    return result
Here again the indentation is useful for arranging the code, and essential for Python to interpret it. As a
simple example, consider this code which accepts any number as input and returns its square:
def squared(x):
    xx = x * x
    return xx
When we run this code in Python, apparently nothing happens, but it defines a
function we can then use: squared(3) will produce the answer 9. It’s important to realise that
objects defined within the function are forgotten when the function has finished.
For example, consider the function:
def f(x, y):
    a = x ** 2
    b = y ** 2
    return (a + b) ** 0.5
After running this code, the function f takes two inputs x and y, and returns the square root of x^2 + y^2.
However, the values of a and b are not stored or returned. If we want to record them, we must
add them to the return statement: return (a+b)**0.5, a, b.
A function need not only deal with numbers: any data type can be passed to, and returned
from, a function. The following function takes a list and returns the sum of its elements.
def sum_of_list(a):
    total = 0
    for i in a:
        total = total + i
    return total

3.4. The Sieve of Eratosthenes. We can use a combination of the control flow statements
from lecture 2 to encode an ancient algorithm for listing prime numbers. The sieve of
Eratosthenes is conceptually simple: begin with a list of integers, and cross off every non-trivial
multiple of 2 (but leaving 2 itself). Then cross off every non-trivial multiple of 3, 5, 7, 11 and so
on. After this procedure is finished, the only numbers remaining in the list are not non-trivial
multiples of any other integer, and so are prime numbers. The following function implements
this algorithm.
# Function to generate primes by Eratosthenes' sieve
def eratosthenes(max_prime):
    primes = list(range(2, max_prime + 1))
    for i in primes:
        j = 2
        while i * j <= primes[-1]:
            if i * j in primes:
                primes.remove(i * j)
            j = j + 1
    return primes
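The code above is faithful to the description, but testing membership with `in` and deleting with remove both scan the whole list, which is slow for large inputs. A common alternative, sketched here under our own naming, keeps a list of True/False flags instead:

```python
def eratosthenes_fast(max_prime):
    """Sieve of Eratosthenes using a list of flags rather than list removal."""
    is_prime = [True] * (max_prime + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, max_prime + 1):
        if is_prime[i]:
            # cross off the non-trivial multiples of i
            for j in range(2 * i, max_prime + 1, i):
                is_prime[j] = False
    return [n for n, flag in enumerate(is_prime) if flag]

print(eratosthenes_fast(30))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```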

Mathematics Background 3
3.5. Representing numbers in different bases. Proof of Theorem 3: By definition of N,
    B^{N−1} ≤ x < B^N,
and so
    1 ≤ B^{1−N} x < B.
Now set
    x_0 = ⌊B^{1−N} x⌋ and r = x − B^{N−1} x_0.
Notice that 1 ≤ x_0 < B. If r = 0, we have x = x_0 · B^{N−1} and the result is proved. Otherwise,
r is a positive integer such that max{n ∈ ℕ : B^{n−1} ≤ r} < N. (Think about why this must
be true.) In this case, replace x by r and repeat the above procedure to find its leading digit.
This algorithm must terminate since N decreases at each step.
As a recurrence relation, we can re-state the algorithm above as
    N = max{n ∈ ℕ : B^{n−1} ≤ x}, and r_N = x.
Then, for n = N − 1, . . . , 0,
    x_n = ⌊B^{−n} r_{n+1}⌋,  r_n = r_{n+1} − x_n B^n.
3.6. Binary representations.
3.6.1. One’s complement. In 32-bit one’s complement, 31 bits are used to represent the positive
integers in the usual way. Then, to represent a negative integer −n, we flip the binary code
for n, replacing 0s with 1s and vice versa. In this scheme the codes 000 . . . 000 and 111 . . . 111
both represent the integer 0.
3.6.2. Two’s complement. In two’s-complement, the binary code for −n is instead the flip of
the binary code for n − 1.

Binary code   Sign & Magnitude   One's complement   Two's complement
    011              3                  3                  3
    010              2                  2                  2
    001              1                  1                  1
    000              0                  0                  0
    100           −0 = 0              −3                 −4
    101             −1                −2                 −3
    110             −2                −1                 −2
    111             −3              −0 = 0               −1
Table 3. Three different protocols for binary representation of decimal numbers,
for a p-bit scheme with p = 3. Using sign & magnitude and one's complement
we can represent the numbers [−(2^{p−1} − 1) : 2^{p−1} − 1] = [−3 : 3], whilst two's
complement can represent the numbers [−2^{p−1} : 2^{p−1} − 1] = [−4 : 3].

3.7. The Sieve of Sundaram. The sieve of Sundaram starts with the list 1, 2, . . . , n and
crosses out all numbers of the form i + j + 2ij (for integers 1 ≤ i ≤ j), then doubles and adds
one to the remaining numbers. Thus numbers are excluded from the final list if and only if
they are of the form 2(i + j + 2ij) + 1. But since
    2(i + j + 2ij) + 1 = 4ij + 2i + 2j + 1 = (2i + 1)(2j + 1),
an odd integer is excluded from the final list if and only if it can be factorised into (2i+1)(2j+1)
with i, j ≥ 1. That is, all odd composite numbers are deleted from the list, and so the final
list is exactly the set of odd primes less than or equal to 2n + 1.
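The description above can be sketched in Python as follows (the function name, and the convention that we start from the list 1, . . . , n, are our own choices):

```python
def sundaram(n):
    """Sieve of Sundaram: the odd primes among 3, 5, ..., 2n + 1."""
    crossed = set()
    for i in range(1, n + 1):
        j = i
        while i + j + 2 * i * j <= n:   # cross out k = i + j + 2ij for j >= i
            crossed.add(i + j + 2 * i * j)
            j += 1
    # double and add one to every survivor
    return [2 * k + 1 for k in range(1, n + 1) if k not in crossed]

print(sundaram(14))   # [3, 5, 7, 11, 13, 17, 19, 23, 29]
```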

4. Fractions
4.1. Exact representation of fractions. The most natural way to represent fractions is to
use a pair of integers (the numerator and denominator) and express the fraction as the ratio of
this pair. This representation is exact (as opposed to approximate), but in computational use
requires the storage of two entities (the integers) to express just one quantity (the fraction).
4.2. Representing fractions in binary. Alternatively, just as we can represent any integer
in terms of (positive) powers of any base, using positional numbering, we can express a fraction
in terms of negative powers of any base. In particular, in binary,
Theorem 4. Every x ∈ (0, 1) can be written in the form
(4.2.1)    x = Σ_{n=0}^{∞} x_n 2^{−n−1},
where x_n ∈ {0, 1}.


This binary expansion of x is finite if and only if x is a rational number p/q (in lowest
terms) where q is a power of 2. For example, the following expression is exact:
    11/64 = 0/2 + 0/4 + 1/8 + 0/16 + 1/32 + 1/64.
Binary expansions of irrational numbers do not terminate, but neither do those of rationals
whose denominator has a prime factor other than 2. For example, the binary expression for 1/3
can be written 1/3 = 0.01010101 . . ., which is interpreted as
    1/3 = 0 × 2^{−1} + 1 × 2^{−2} + 0 × 2^{−3} + 1 × 2^{−4} + · · · = 1/4 + 1/16 + 1/64 + · · ·
Similarly, the binary fraction for 1/10 does not terminate. This may seem like a severe limita-
tion, but remember that in decimal notation too, expansions of some fractions do not terminate.
For example, 1/3 = 0.333 . . . in decimal. In general, no system which uses powers of some base
integer can express every fraction as a finite sum of terms.
4.3. Continued fractions. Another way to express fractions is the beautiful representation
known as continued fractions. A continued fraction is a number written in the
form
    a0 + 1/(a1 + 1/(a2 + 1/(a3 + 1/(a4 + · · ·))))
where the ai are positive integers. A commonly used shorthand is to write this expression
as [a0 , a1 , a2 , a3 , a4 , . . .]. It can be shown that rational numbers have finite continued fraction
expansions, whereas irrational numbers have infinite continued fractions representations.
To compute the continued fraction expansion of a real number x, we have a simple algorithm.
(1) Set x0 = x. Set a0 = bx0 c.
(2) Set x1 = 1/(x0 − a0 ). Set a1 = bx1 c.
(3) Continue, setting xn = 1/(xn−1 − an−1 ). Set an = bxn c.
(4) If x is rational, then for some n we find x_n = a_n, and the algorithm terminates.
For example, consider the fraction x = 43/30. The algorithm gives x0 = x, and so a0 =
b43/30c = 1. Then x1 = 1/(43/30 − 1) = 1/(13/30) = 30/13, giving a1 = b30/13c = 2.
Continuing we have x2 = 1/(30/13 − 2) = 1/(4/13) = 13/4, and so a2 = b13/4c = 3. Finally,
x3 = 1/(13/4 − 3) = 1/(1/4) = 4, so a3 = 4. The algorithm terminates at this point as x3 = a3 .
Therefore,
    43/30 = [1, 2, 3, 4] = 1 + 1/(2 + 1/(3 + 1/4)).
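The four steps above can be sketched in Python. Using the standard-library Fraction type keeps the arithmetic exact, so the termination test x_n = a_n is reliable (the function name is ours):

```python
from fractions import Fraction

def continued_fraction(x):
    """Continued-fraction expansion [a0, a1, ...] of a positive rational x."""
    x = Fraction(x)
    terms = []
    while True:
        a = x.numerator // x.denominator   # a_n = floor(x_n)
        terms.append(a)
        if x == a:                         # terminate when x_n = a_n
            return terms
        x = 1 / (x - a)                    # x_{n+1} = 1 / (x_n - a_n)

print(continued_fraction(Fraction(43, 30)))   # [1, 2, 3, 4]
```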

The algorithm also works for irrational x, although it will never terminate.
Given a (possibly infinite) continued fraction expansion [a0, a1, a2, . . .] for x ∈ ℝ, the nth
partial convergent is given by truncating the continued fraction at the nth term, giving the
rational approximation [a0, a1, a2, . . . , an], which we may write as p_n/q_n. Such a convergent
gives the best rational approximation to an irrational number, in the sense described in
Mathematics Background 4.
A particularly nice example is the golden ratio ϕ = (√5 + 1)/2 ≈ 1.618 . . ., which has continued
fraction given by [1, 1, 1, 1, 1, . . .]. Computing successive partial convergents gives the following
approximations to ϕ:
1st:  1 + 1/1 = 2
2nd:  1 + 1/(1 + 1/1) = 1 + 1/2 = 3/2
3rd:  1 + 1/(1 + 1/(1 + 1/1)) = 1 + 1/(3/2) = 1 + 2/3 = 5/3
4th:  1 + 1/(1 + 1/(1 + 1/(1 + 1/1))) = 1 + 1/(5/3) = 1 + 3/5 = 8/5
5th:  13/8
 . . .
nth:  F_{n+1}/F_n
where F_n is the nth Fibonacci number, defined by the relation F_{n+2} = F_{n+1} + F_n with F0 =
F1 = 1. Note that to evaluate a partial convergent6, we begin with 1/a_n and work outwards to
a0.

6 Fairly straightforward methods using linear algebra exist to compute a partial convergent in the other direction,
but we will not discuss these.
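Evaluating a partial convergent from 1/a_n outwards, as just described, can be sketched as follows (the function name is ours; Fraction keeps the arithmetic exact):

```python
from fractions import Fraction

def convergent(terms):
    """Evaluate the finite continued fraction [a0, ..., an], from the inside out."""
    value = Fraction(terms[-1])          # start with a_n
    for a in reversed(terms[:-1]):
        value = a + 1 / value            # work outwards towards a0
    return value

# Partial convergents of the golden ratio [1, 1, 1, ...]:
for n in range(1, 6):
    print(convergent([1] * (n + 1)))     # 2, 3/2, 5/3, 8/5, 13/8
```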

Mathematics Background 4
4.4. Representing fractions in different bases. Proof of Theorem 4: We introduce the
map B : [0, 1) → [0, 1) given by
    B(y) = 2y       if 0 ≤ y < 1/2,
    B(y) = 2y − 1   if 1/2 ≤ y < 1.
That is, B(y) is the fractional part of 2y, i.e. B(y) = 2y − ⌊2y⌋.
Now let x ∈ (0, 1). We will find x_n so that x = Σ_{n=0}^{∞} x_n 2^{−n−1} as in (4.2.1).
First, set r_0 = x. Trivially,
    x = r_0 = 2^{−1} · 2r_0 = 2^{−1}(⌊2r_0⌋ + B(r_0)).
So, setting
    x_0 = ⌊2r_0⌋ ∈ {0, 1} and r_1 = B(r_0) ∈ [0, 1),
we have
    x = 2^{−1} x_0 + 2^{−1} r_1.
If r_1 = 0, then only the first coefficient in the series on the right-hand side of equation (4.2.1)
does not vanish, and the algorithm terminates successfully. Otherwise, we repeat the calculation
with r_0 replaced by r_1. This gives
    x = 2^{−1} x_0 + 2^{−2}(x_1 + r_2),
where x_1 = ⌊2r_1⌋ and r_2 = B(r_1). After N iterations, we obtain
    x = Σ_{n=0}^{N−1} x_n 2^{−n−1} + 2^{−N} r_N,
where x_n = ⌊2r_n⌋ and r_{n+1} = B(r_n). Call the partial sum X_N := Σ_{n=0}^{N−1} x_n 2^{−n−1}. Then the
final thing to note is that X_N → x as N → ∞, because
    0 ≤ x − X_N = 2^{−N} r_N ≤ 2^{−N} → 0.
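The digit extraction in this proof is easy to run directly. The following sketch (our naming) applies the doubling map with exact arithmetic, so no rounding intrudes:

```python
from fractions import Fraction

def binary_digits(x, n_digits):
    """First n_digits binary digits of x in (0, 1), via the doubling map B."""
    digits = []
    r = Fraction(x)
    for _ in range(n_digits):
        r *= 2
        digits.append(int(r))   # x_n = floor(2 r_n)
        r -= int(r)             # r_{n+1} = B(r_n), the fractional part
    return digits

print(binary_digits(Fraction(11, 64), 6))   # [0, 0, 1, 0, 1, 1]
print(binary_digits(Fraction(1, 3), 8))     # [0, 1, 0, 1, 0, 1, 0, 1]
```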
4.5. Continued fractions.
Theorem 5. Let x ∈ ℝ\ℚ and n ≥ 1, and let p_n/q_n be (in simplest terms) the nth partial convergent
for x. Then p_{n+1} > p_n and q_{n+1} > q_n, and
    |x − p_n/q_n| < 1/(q_n q_{n+1}) < 1/q_n^2.
This theorem implies that for irrational x, the partial convergents do indeed converge to x.
Theorem 6. Let x ∈ ℝ\ℚ and n ≥ 1. Take p, q ∈ ℤ such that 0 < q ≤ q_n and p/q ≠ p_n/q_n. Then
    |x − p_n/q_n| < |x − p/q|.
This theorem states that of all fractions with denominator at most qn , the nth convergent is
the one which best approximates x.

5. Floating-point numbers
5.1. Floating-point numbers. A floating-point number x is a number which can be expressed
in the form
x = (−1)s × m × 2e−σ
where s = 0 or 1 determines the sign of x, m is the mantissa (effectively the significant digits),
e is the shifted exponent, which allows us to represent both very large and very small numbers,
and σ is the shift, which is the same for every x, and fixes the range of representable numbers.
Every floating-point number occupies the same amount of computer storage (the number of
bits taken up by s, m and e), regardless of its actual value. Generally Python 3 installations
use double-precision binary floating-point formats, which deploy exactly 64 bits of memory to
express a number x, made up of 1 bit for s, 52 bits for m and 11 bits for e.
The 1 bit representing s simply specifies ±1. The mantissa is given in binary as
    m = 1.m1 m2 m3 . . . m52 (base 2) ∈ [1, 2),
where each mi is either 0 or 1. The shifted exponent is a non-negative whole number given in
binary as
    0 ≤ e = e10 e9 e8 . . . e0 (base 2) ≤ 2^11 − 1 = 2047.
The shift σ is fixed as σ = 2^10 = 1024, which allows for a roughly equal range of positive and
negative exponents, that is,
    −1024 ≤ e − σ ≤ 1023,
so floating-point numbers x can be expressed with maximum accuracy (i.e. with 52-bit
mantissas) within the range
    2^−1024 ≤ |x| ≤ 1.11 . . . 1 (52 ones, base 2) × 2^1023,
which in decimal is roughly
    5.563 × 10^−309 ≤ |x| ≤ 1.798 × 10^308.
However, Python 3 can squeeze in more numbers below this range (so-called ‘subnormal
numbers’) by exchanging accuracy in the mantissa for extra bits in the exponent. See how
small a non-zero number you can produce!
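For instance, assuming the IEEE 754 double-precision format that CPython uses on almost all platforms, the smallest positive subnormal is 2^−1074:

```python
tiny = 2.0 ** -1074   # smallest positive subnormal double (assumes IEEE 754)
print(tiny)           # 5e-324
print(tiny / 2)       # 0.0, one more halving underflows to zero
```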
Since the leading 1 in the mantissa is built in, we cannot express zero as a standard float.
So this is handled separately, by interpreting m = e = 0 as zero. However, in floating point
arithmetic zero is signed: try playing around with “-0.0”. Is it a different number from 0.0?
What happens when you try to work with floats which are too big?
Clearly the set of numbers that can be represented in this way is finite, and in some sense
quite small. Computationally, any number x within the range above which cannot be exactly
represented in this way is simply represented by the nearest floating-point number to x. For
example, the number 1/10 has a binary expansion 1/10 = 0.000110011001100 . . . and so cannot
be represented exactly by a floating-point number. In this scheme, it is represented by the
decimal fraction 0.10000000000000001. This is the real explanation behind some of the oddities
in computational arithmetic that we met earlier.
Python does have a dedicated decimal module geared to correcting these errors: try the
following
print(0.1+0.7)
from decimal import *
print(Decimal(’0.1’)+Decimal(’0.7’))
However, we shall not use this much.

5.2. Machine ε. The machine ε can be defined in several different ways; there is no industry
standard. We will use the following informal definition, which encapsulates the main idea.
Definition 2. The machine ε is the smallest positive number which, when added to 1, yields a
result other than 1.
We can find the machine  directly using this definition, by repeatedly testing whether a
candidate increases 1 when added to it; if so, we halve the candidate and try again, until 1 is
unchanged by the addition:
macheps = 1.0
while 1.0 + macheps > 1.0:
    macheps = macheps / 2.0
print(macheps)
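One subtlety, worth checking against the standard library: the loop only exits after the halving that makes 1.0 + macheps indistinguishable from 1.0, so (assuming IEEE 754 doubles) the printed value is 2^−53, half of the gap-based value that Python reports in sys.float_info.epsilon:

```python
import sys

macheps = 1.0
while 1.0 + macheps > 1.0:
    macheps = macheps / 2.0

# The loop overshoots by one halving: 1.0 + macheps now equals 1.0 exactly.
print(macheps)                  # 2**-53
print(sys.float_info.epsilon)   # 2**-52, the spacing between 1.0 and the next float
```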

5.3. Evaluating polynomials. Computing the numerical value of a polynomial f(x) at a
particular value of x appears a straightforward task. For example, the following function takes
the coefficients a, b, c of a quadratic function ax^2 + bx + c, and a value of x, and returns the
evaluation of the quadratic.
def quad(a, b, c, x):
    return a*x*x + b*x + c
We can extend this idea to evaluate a general polynomial given by
(5.3.1)    P(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + . . . + a_n x^n = Σ_{i=0}^{n} a_i x^i.
When writing larger pieces of code, it’s a good habit to split procedures up into functions
within the same file. For example, consider the following:
def x_to_n(x, n):
    return x ** n

def eval_poly(x, coeffs):
    a = 0
    for i in range(len(coeffs)):
        a = a + coeffs[i] * x_to_n(x, i)
    return a
Note how the function eval_poly makes several calls to the separate function x_to_n. You
should get used to using functions in a similar way. The function just described for evaluating
polynomials is not a good method, for the following reason.
Consider how many operations are required to make the evaluation. If we count the operation
an xn as n multiplications, then we have n+(n−1)+(n−2)+. . .+1 = n(n+1)/2 multiplications,
and then n additions to sum the terms. In total this is n(n + 3)/2, which grows quadratically
with n.
5.4. Horner’s method. Instead we could write equation (5.3.1) as
(5.4.1)    P(x) = a_0 + x(a_1 + x(a_2 + . . . + x(a_{n−2} + x(a_{n−1} + x a_n)) . . .))
Counting arithmetic operations to evaluate P (x) in this way gives just n multiplications and n
additions, making 2n operations in total, which clearly increases linearly with n.
Evaluating polynomials at x = x0 using (5.4.1) is known as Horner’s method, and can be
simply programmed as follows. The following Python function takes a list of coefficients and a
value x0 and returns the evaluation P (x0 ):

def horner(x, coeffs):
    b = 0
    for a in reversed(coeffs):   # start from the leading coefficient, without mutating coeffs
        b = a + x * b
    return b
We can express the terms in successive brackets using the descending recurrence relation
bk−1 = ak−1 + x0 bk
for k = n, . . . , 1 where bn = an . Then b0 = P (x0 ). The Python function above essentially
computes each bi in turn starting with bn until it reaches b0 , which it returns.

We can also evaluate the derivative P′(x0) of P(x) at some point x = x0 in a similar way.
First, factoring out the term (x − x0) from P(x) we get
(5.4.2)    P(x) = (x − x0)Q(x) + P(x0),
where Q(x) = Σ_{i=0}^{n−1} c_i x^i is some polynomial of degree one lower than that of P(x).
Differentiating (5.4.2) we have P 0 (x) = Q0 (x)(x − x0 ) + Q(x) and so P 0 (x0 ) = Q(x0 ). Thus
to find P 0 (x0 ), we need the coefficients ci of Q.
Example 1. Let P(x) = 1 + 2x + 7x^2 + 4x^3. Then P′(x) = 2 + 14x + 12x^2, and evaluating
each at x = 1 we have P(1) = 14 and P′(1) = 28.
Writing P (x) in nested form we have P (x) = 1 + x(2 + x(7 + 4x)). Then successive values
of bi from Horner’s method are b3 = a3 = 4, b2 = (7 + 1 × 4) = 11, b1 = 2 + 1 × 11 = 13,
b0 = P (x0 ) = 1 + 1 × 13 = 14.
Moreover, we can factorise P(x) as P(x) = (x − 1)(4x^2 + 11x + 13) + 14 = (x − 1)Q(x) + 14,
where the coefficients of Q(x) are simply the values of b3 , b2 , b1 .
This example suggests that ci = bi+1 . The maths background sheet shows this in general.
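Since the c_i obey the same recurrence as the b_i, we can accumulate P(x0) and P′(x0) in a single pass. A sketch (the function name is ours):

```python
def horner_with_derivative(x, coeffs):
    """Evaluate P(x) and P'(x) together; coeffs = [a0, a1, ..., an]."""
    b = 0   # Horner accumulator, ends as P(x)
    d = 0   # runs the same recurrence one index behind, ends as P'(x)
    for a in reversed(coeffs):
        d = d * x + b   # uses the previous b, i.e. the c_i = b_{i+1} relation
        b = b * x + a
    return b, d

print(horner_with_derivative(1, [1, 2, 7, 4]))   # (14, 28), as in Example 1
```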

Mathematics Background 5
5.5. Polynomial remainder theorem.
Theorem 7. If a polynomial P (x) is divided by (x − r) then the remainder is given by the
constant P (r).
Proof of Theorem 7: Polynomial long division states that for any polynomial P (x) and any
other of lower degree D(x) (the divisor ), there exist polynomials Q(x) and R(x) (the quotient
and remainder respectively) where P (x) = D(x)Q(x) + R(x), and the degree of R(x) is strictly
less than that of D(x).
If we set D(x) = (x − r), then the degree of R(x) must be less than one, meaning that R(x)
is a constant, C. Thus P (x) = (x − r)Q(x) + C, and inputting x = r we have C = P (r).
5.6. Horner’s method. Having factorised P(x) as P(x) = (x − x0)Q(x) + P(x0), we see that
P′(x0) = Q(x0). If Q(x) = Σ_{i=0}^{n−1} c_i x^i, we can compute the coefficients c_i by equating powers
of x in P(x) = (x − x0)Q(x) + P(x0) (5.4.2), giving
    O(x^n):      a_n = c_{n−1}
    O(x^{n−1}):  a_{n−1} = c_{n−2} − x0 c_{n−1}
     . . .
    O(x^k):      a_k = c_{k−1} − x0 c_k   for k = n − 1, . . . , 1
     . . .
    O(x^0):      a_0 = −x0 c_0 + P(x0),
which can be rearranged to give a recurrence relation
    c_{n−1} = a_n
    c_{k−1} = a_k + x0 c_k   for k = n − 1, . . . , 1
    P(x0) = a_0 + x0 c_0,
which is just the same recurrence relation as for the b_i in Horner’s method, with the indices
shifted by 1. Thus c_i = b_{i+1} as required.

6. Data and function plotting


6.A. Introduction. We often want to use a computer to display information graphically.
There are many different packages to do this, and you may be already familiar with several
(for example, Excel, gnuplot, Maple, or Matlab). Python has free built-in libraries that allow
a powerful and flexible way to plot graphs. As with many aspects of computer programming,
initially it may seem like a complicated procedure, but with practice will soon bring rewards.

First, to make plots appear in a new window, we change the default setting in Spyder:
Tools → Preferences → IPython console → Graphics → Graphics backend → Automatic
We will use the library matplotlib, which itself contains a plotting sublibrary pyplot. We will
also use the package numpy (=‘numerical python’). Some online tutorials import both these
libraries using pylab (a Matlab style environment). But we won’t use Pylab, so to get started:

import matplotlib.pyplot as plt
import numpy as np
6.B. Plotting functions. Pyplot is fundamentally a data plotting package rather than a
function plotter, but it handles both easily and well. To plot a function we first define a list of
x-values at which to evaluate it. We have used the command range(a, b, c) many times to
produce the integers starting at a, in steps of c, up to but not including b. A similar command
is np.arange, which performs the same task but allows the start, end and step values to
be fractional (unlike range). For example, np.arange(0.5, 2.4, 0.3) produces an array
[0.5, 0.8, 1.1, 1.4, 1.7, 2.0, 2.3]. Strictly speaking, this is a numpy array, not a standard Python
list. These two data-types are similar, but arrays have some advantages.
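The difference can be sketched as follows (assuming numpy is installed):

```python
import numpy as np

x = np.arange(0.5, 2.4, 0.3)   # fractional start, stop and step, unlike range
print(len(x))                  # 7
print(x * 2)                   # arithmetic acts elementwise on arrays,
                               # which is what makes y = f(x) so convenient
```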
We can interpret the above array as x-values, and create a corresponding array of y-values,
and plot a graph of a function y = f (x). Numpy makes this quite easy. For example, the lines
x = np.arange(0.0, 1.1, 0.1)
y = x ** 2
plt.plot(x, y)
creates a graph of y = x^2 in the range x ∈ [0, 1]. (Notice that the line y = x ** 2 would not
work if x were a standard list.) We can then save our graph using plt.savefig('filename.png').
This will save the figure in the directory in which you are working (make sure you know where
that is!) as a .png file; short for portable network graphics, this is a very flexible format.
In fact the plot command is drawing a line graph, joining the points defined by x and y.
We can see this by using an optional argument in the plot command to change the line style
or marker: for example plot(x, y, ’o’) will plot circles at the data points rather than draw
a joined up line. Other markers and linestyles to try include ’-’ (solid line), ’--’ (dashed
line), ’-.’ (dash-dot line), ’:’ (dotted line), ’.’ (point marker), ’s’ (square marker), ’p’
(pentagon marker). These can be combined, for example plot(x, y, ’o-’) gives a plot with
both a solid line and circular markers.
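As a short sketch, the marker and line styles above might be combined like this (the data and the filename are our own illustrative choices):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0.0, 1.1, 0.1)
y = x ** 2

plt.plot(x, y, 'o')          # circular markers only, no connecting line
plt.plot(x, y + 0.1, '--')   # dashed line, offset slightly for visibility
plt.plot(x, y + 0.2, 's-.')  # square markers joined by a dash-dot line
plt.savefig('markers.png')
```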
To plot more than one line on the same graph, simply use the plot command more than
once. To plot more than one figure, we can name or number them (here y and z are two sets of y-values):
plt.figure('Figure A')
plt.plot(x, y)
plt.figure('Figure B')
plt.plot(x, z)
If plot is given only a single input y, it assumes x=[0,...,N-1], where N is the length of y.
6.C. Labelling and other cosmetics. Every mathematician knows that it is good practice
to label our graphs. We do this with commands such as the following (mostly self-explanatory):
plt.xlabel('x'), plt.ylabel('y = f(x)'), plt.title('My graph'), plt.grid(True), and
plt.xlim([-1.0,1.0]). We can also specify the colours of lines and markers by adding an-
other optional argument to the plot command. For example, plt.plot(x, y, 'gs') plots
the data with green squares. Other colours include b (blue), g (green), r (red), c (cyan), m
(magenta), y (yellow), k (black), w (white).
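Putting these cosmetic commands together gives a sketch like the following (the data and the filename are our own illustrative choices):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-1.0, 1.1, 0.1)
y = x ** 2

plt.plot(x, y, 'gs')     # green square markers
plt.xlabel('x')
plt.ylabel('y = f(x)')
plt.title('My graph')
plt.grid(True)
plt.xlim([-1.0, 1.0])
plt.savefig('labelled.png')
```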
Subplots (several graphs on separate axes within the same figure) can be created using
plt.subplot(ijk) which gives a grid of i rows, j columns, and plots in the k place in the grid.
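For example, a 2 × 2 grid of subplots might be produced as follows (the four functions plotted are our own illustrative choices):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0.0, 1.1, 0.1)

plt.subplot(221)       # 2 rows, 2 columns, plot in position 1 (top left)
plt.plot(x, x)
plt.subplot(222)       # position 2 (top right)
plt.plot(x, x ** 2)
plt.subplot(223)       # position 3 (bottom left)
plt.plot(x, x ** 3)
plt.subplot(224)       # position 4 (bottom right)
plt.plot(x, np.sqrt(x))
plt.savefig('subplots.png')
```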
6.D. Scatter plots. Another useful plotting technique is scatter plotting, used when plotting
data which does not necessarily represent a function. Here we will provide the plt.scatter
command with two lists, one representing x-coordinates, the other representing y-coordinates
of the data to be plotted. As such, the two lists must be the same length.
x = [1, 4, 3, 5, 7, 6, 3]
y = [3, 6, 5, 7, 4, 3, 2]
plt.scatter(x, y)
To change markers in the scatter function, we must explicitly change the marker vari-
able. We can also change the size of the markers by including the s variable. For ex-
ample, plt.scatter(x, y, s=50, marker='*') plots very big stars, while the command
plt.scatter(x, y, s=1, marker='.') plots very small points. The default size is 20.
6.E. Logarithmic scales. One useful aspect of graph plotting is the ability to use logarith-
mic scales. This can help to discern the underlying form of unknown data. Pyplot has the
logarithmic plotting commands loglog, semilogy and semilogx.
For example, consider plotting the function
y = f (x) = x3 .
Taking logarithms of this equation we have
log y = 3 log x.
Thus plotting a graph of log y against log x we get a straight line graph of gradient 3:
x = np.arange(0.01, 1, 0.01)   # start just above 0, since log(0) is undefined
y = x ** 3
plt.loglog(x, y)
In general, any data of the form y = Axp , that is, log y = log A + p log x, will produce a straight
line graph with gradient p and intercept log A when plotted as a log-log plot.
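This observation can be turned into a numerical check: fitting a straight line to log y against log x recovers the exponent p and the intercept log A. A sketch, with A = 2 and p = 3 as our illustrative choices:

```python
import numpy as np

# Data of the assumed form y = A * x**p, here with A = 2 and p = 3.
x = np.arange(0.1, 1.0, 0.01)
y = 2 * x ** 3

# A degree-1 fit in log-log coordinates returns (gradient, intercept),
# i.e. (p, log A).
p, logA = np.polyfit(np.log(x), np.log(y), 1)
```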
Similarly, consider a function of the form y = g(x) = Cemx . Taking logarithms gives
log y = log C + mx.
Thus a straight line graph with gradient m and intercept log C is produced when plotting log y
against x.
x = np.arange(1, 3, 0.01)
y = 4 * np.exp(2 * x)
plt.semilogy(x, y)
The command semilogy gives a plot with a logarithmic scale on the y-axis (similarly, semilogx
gives the corresponding graph with a logarithmic scale on x). However, note that to make sense
of the gradient m from the graph, we need to consider the base of the logarithm we are taking.
We originally took base-e logarithms, and so we should plot the graph according to the same
base. To do so, we specify the argument basey, e.g., plt.semilogy(x, y, basey=np.e).
6.F. Advanced plotting. Matplotlib/Pylab is a highly advanced plotting library which is
widely used for the production of professional graphics. We have barely scratched the surface!
For a fuller account of what it can do, see: www.matplotlib.org/ .
MATH2920 COMPUTATIONAL MATHEMATICS, 2018/19 39

Mathematics Background 6
6.G. The Hénon map. Michel Hénon was a French mathematician and astronomer (1931–
2013). Much of his work was on the 3-body problem, a physical situation in which chaotic
dynamics have long been known to arise. The Hénon map, dating from 1976, is perhaps one
of the simplest 2-dimensional maps to produce chaotic dynamics, and certainly one of the best
known. Given fixed parameters a, b ∈ R (Hénon took (a, b) = (1.4, 0.3)) and a starting point
(x0, y0), it is defined by
x_{n+1} = y_n + 1 − a x_n^2
y_{n+1} = b x_n.
If |b| < 1, the map (x_n, y_n) → (x_{n+1}, y_{n+1}) is dissipative, which means that areas in R^2 are contracted. To
show this, consider the determinant of the Jacobian of the Hénon map:
\[
\det\begin{pmatrix} \partial x_{n+1}/\partial x_n & \partial x_{n+1}/\partial y_n \\ \partial y_{n+1}/\partial x_n & \partial y_{n+1}/\partial y_n \end{pmatrix}
= \det\begin{pmatrix} -2a x_n & 1 \\ b & 0 \end{pmatrix} = -b.
\]
Thus any region will be compressed by a factor |b|. A consequence is that all starting points
tend towards the attracting set known as the Hénon attractor.
In 1976, Hénon used an IBM mainframe computer and presumably several hours of computer
time to plot 5 million iterates of the map, to get a reasonable image of the attractor. It takes
my laptop less than 10 seconds to perform the equivalent computations.
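The equivalent computation is a short loop; a sketch (the function name henon_orbit is ours, with Hénon's parameters as defaults):

```python
import numpy as np

def henon_orbit(n, a=1.4, b=0.3, x0=0.0, y0=0.0):
    """Iterate the Henon map n times from the starting point (x0, y0)."""
    xs = np.empty(n + 1)
    ys = np.empty(n + 1)
    xs[0], ys[0] = x0, y0
    for i in range(n):
        xs[i + 1] = ys[i] + 1 - a * xs[i] ** 2
        ys[i + 1] = b * xs[i]
    return xs, ys

xs, ys = henon_orbit(100000)
# Discard the first few transient iterates, then plot the attractor, e.g.:
# plt.scatter(xs[100:], ys[100:], s=0.1)
```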
6.H. The Collatz conjecture. The Collatz conjecture is named after the German mathe-
matician Lothar Collatz, who died in 1990. It is also known by other names, including the
3n + 1 problem, Kakutani’s problem, the Syracuse problem, Thwaites conjecture and Ulam’s
problem. The conjecture remains unproven, and the great Hungarian mathematician Paul
Erdős apparently stated of the Collatz conjecture that “Mathematics may not be ready for
such problems”. This seems at odds with the apparent simplicity of the problem: take any
positive integer n. Form a sequence (often called a hailstone sequence, or occasionally a se-
quence of wondrous numbers) with the following rule: if n is even, halve it; if n is odd, triple
it and add 1. Repeating this process, the Collatz conjecture claims that the sequence formed
from any initial n will eventually reach 1, i.e. every positive integer is a hailstone. (Note
that the sequence is terminated when it reaches 1, otherwise it continues in an endless cycle
1 → 4 → 2 → 1 → 4 . . ..)
Another way to state the Collatz conjecture is to claim that other than the trivial cycle
1 → 4 → 2 → 1, there are no other cyclic sequences. It has been proven that there are no other
cycles of length < 400 (by John Horton Conway) and indeed no other cycles of length < 275, 000
by Jeffrey Lagarias. The conjecture has been tested computationally for all initial integers n
up to 5.4 × 1018 . Although prizes have been offered for a proof of the Collatz conjecture, it
looks as if these are a long way from being claimed.
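The rule defining a hailstone sequence is easy to encode; a minimal sketch (the function name hailstone is ours):

```python
def hailstone(n):
    """Return the Collatz (hailstone) sequence starting from n, stopping at 1."""
    seq = [n]
    while n != 1:
        # halve if even; triple and add 1 if odd
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        seq.append(n)
    return seq

# For example, hailstone(6) gives [6, 3, 10, 5, 16, 8, 4, 2, 1].
```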
7. Random numbers
7.A. Introduction. Truly random numbers are impossible to create on a computer. By def-
inition, an algorithm intended to produce random numbers is inherently predictable. Instead,
computers can do a good job of creating pseudorandom numbers, which satisfy many statis-
tical tests of randomness. Random numbers play a crucial role in many computing tasks, in
particular being central to the notion of a Monte Carlo method.
7.B. Middle square. A very early method for generating a list of pseudorandom numbers
was proposed by John von Neumann in 1946. One version is as follows:
• Choose an initial 4 digit number a.
• Square this number to get a number b.
• Extract the middle 4 digits of b to give c. (Usually b will have 8 digits, so the middle 4 will
be those in positions 3-6. If b has 7 digits, we have to decide what to do: one option is to take
the digits in positions 2-5. Alternatively, we could take those in positions 3-6.)
• Repeat with c.

This method is now only of historical interest, as it has many shortcomings (e.g. its success
is heavily dependent on the initial number). Nevertheless, it is easy and quick to implement.
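A sketch of the method (the function name is ours; we zero-pad each square to eight digits, which amounts to always taking the digits in positions 3-6, one of the options described above):

```python
def middle_square(seed, n):
    """von Neumann's middle-square method: n pseudorandom 4-digit numbers."""
    values = []
    x = seed
    for _ in range(n):
        sq = str(x * x).zfill(8)   # pad the square with leading zeros to 8 digits
        x = int(sq[2:6])           # extract the middle four digits
        values.append(x)
    return values

# middle_square(1234, 3) gives [5227, 3215, 3362].
```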
7.C. Linear Congruential Generators. Another classic method of generating pseudoran-
dom numbers comes from iterating linear functions using modular arithmetic. Choose constants m, a, c (the modulus, multiplier and increment respectively) and set an initial value
(the seed) x0 < m. Then a sequence of integers is generated by the recurrence relation:
x_{n+1} = (a x_n + c) (mod m).
Sensible choices of m, a, c can then produce a sequence of integers in the range [0, m) with
reasonable pseudorandom properties. Since the procedure is deterministic, with x_{n+1} completely
determined by x_n, the method is inevitably periodic with maximum period m. (Some – bad –
choices of parameters produce far shorter periods.)
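The recurrence is a few lines of Python; a sketch (the function name and the small parameter values are our own illustrative choices):

```python
def lcg(m, a, c, seed, n):
    """Generate n terms of x_{k+1} = (a*x_k + c) mod m from the given seed."""
    xs = []
    x = seed
    for _ in range(n):
        x = (a * x + c) % m
        xs.append(x)
    return xs

# With m = 8, a = 5, c = 3 the generator attains the full period 8:
# lcg(8, 5, 3, 0, 8) visits every residue 0..7 exactly once.
```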
7.D. Testing pseudorandom numbers. There are many different ways to test a sequence of
numbers for randomness. For example, in MATH1725, χ2 -tests are used (see Workshop). Here,
to avoid going deeply into statistical theory, we mention two simple ones. First, we can check
the arithmetic mean of the pseudorandom numbers. If a sequence of n numbers x_i ∈ (a, b) is
close to random (meaning uniformly distributed), we should find that
\[ \frac{1}{n}\sum_i x_i \approx \frac{a+b}{2}. \]
If the mean departs from this value, the values are consistently too high or too low.
We can also compare the standard deviation of the uniform distribution, \frac{1}{2\sqrt{3}}(b-a), with
that of the sequence,
\[ \sqrt{\frac{1}{n}\sum_i (x_i - \bar{x})^2}, \]
where \bar{x} is the mean of the data as above.
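These two checks can be carried out directly; a sketch (the sample size and the fixed seed are our own choices):

```python
import random

random.seed(1)   # fixed seed so the check is reproducible
xs = [random.random() for _ in range(100000)]

mean = sum(xs) / len(xs)
sd = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5

# For uniform samples on (0, 1) we expect mean close to 0.5 and
# standard deviation close to (b - a)/(2*sqrt(3)), i.e. about 0.2887.
```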
Your eyes can provide another non-rigorous but useful test of the randomness of a sequence
of numbers (x_i). If we create a scatterplot of x_i against x_{i−1} for each i, we would like to see
no discernible patterns. For example, using a linear congruential generator with c small relative to m
will produce sequences which lie on straight lines. Given a list x to use as x-coordinates in a
scatter plot, we can generate the y-coordinates by the lines:
y = x[:]
y.append(y[0])
del y[0]
These lines make a copy of x in y (but why y = x[:] instead of y = x?), append the initial
element to the end, and then remove the initial element.
7.E. The time module. A random generator is only as good as its seed. Moreover, running a
deterministic generator with the same seed will by definition produce identical results each time.
This is often highly undesirable. In particular, one technique of computational mathematics is
to run a numerical experiment many times with repeated random sampling, perhaps as initial
conditions. In such a situation it may be required to run a routine which produces a different
sequence of pseudorandom numbers each time.
A useful trick for seeding a program so that it produces different results each time it is
run is to use the current clock time as a seed. The Python module time enables
the programmer to access the information in the computer's clock.
from time import time, clock
a = time()
b = clock()
print(a)
print(b)
The function time gives the time in seconds since the start of time, as far as the computer is
concerned. Typically7 this is midnight on January 1, 1970. We can use this to time how long
a piece of code takes:
from time import time
a = time()
i = 0
while i < 10000:
    i += 1
b = time()
print(b - a)
The function clock is similar, but the clock starts running the first time the function is
called, and counts in seconds to more decimal places.
Time can also be useful for seeding a random number generator with a floating point number
x ∈ [0, 1). One way would be to use the lines:
from math import floor
from time import time
x = time()
x = x - floor(x)
This code will give a different (and largely unrelated) seed each time it is run. Similarly, we
could use the same trick to produce an integer seed. Here we might use a trick similar to that
used in the middle-square method to extract an integer from the floating point time.
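For instance (a sketch; the choice of a 4-digit seed is our own):

```python
from math import floor
from time import time

t = time()
frac = t - floor(t)          # floating point seed in [0, 1)
seed = int(frac * 10 ** 4)   # a 4-digit integer seed, in the spirit of middle-square
```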
7.F. Lagged Fibonacci. The well-known Fibonacci sequence, given by F_n = F_{n−1} + F_{n−2}, is a
second-order recurrence relation requiring two initial values F_0 and F_1. This can be generalised
to a lagged Fibonacci sequence, given by
S_n ≡ S_{n−j} + S_{n−k} (mod m),
where 0 < j < k. Here the new term is the sum of two previous terms in the sequence, with the
lag given by the integers j and k. The modulus m is usually taken in practice to be a power
of 2, say m = 2^M = 2^32. To generate such a sequence we need to supply initial values from S_0
up to S_k, inclusive. A lagged Fibonacci generator has a maximum period of (2^k − 1) × 2^{M−1}.
With M = 32 this can be very large indeed.
It is much more difficult to determine which parameters j and k will produce the maximum
period possible than it is for linear congruential generators. Some pairs which do this include:
(j, k) = (7, 10), (5, 17), (24, 55), (65, 71), (128, 159).
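A sketch of such a generator (the function name is ours; the small illustrative parameters below do not come from the list above):

```python
def lagged_fib(initial, j, k, m, n):
    """Append n terms of S_i = S_{i-j} + S_{i-k} (mod m) to the initial values."""
    if len(initial) < k:
        raise ValueError("need at least k initial values")
    s = list(initial)
    for _ in range(n):
        s.append((s[-j] + s[-k]) % m)
    return s

# e.g. lagged_fib([1, 1, 1], 1, 3, 10, 4) extends the sequence to [1, 1, 1, 2, 3, 4, 6]
```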
7although this may vary from machine to machine. To check, run the command gmtime(0), having imported
it from time.
7.G. Mersenne Twister. Python uses a technique known as a Mersenne Twister to generate
pseudorandom numbers, which has a period of 2^19937 − 1. This is certainly sufficient for many
purposes, although it is entirely deterministic. Random numbers using the Mersenne Twister are
provided in Python by the module random. Functions from this module include random(),
which returns a pseudorandom floating point number x ∈ [0.0, 1.0), and randint(a,b), which
returns a pseudorandom integer in the range [a, b], endpoints included.
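For example (the seed value is our arbitrary choice; seeding makes the sequence reproducible):

```python
from random import random, randint, seed

seed(2019)        # fix the generator's starting state
u = random()      # a float in [0.0, 1.0)
k = randint(1, 6) # an integer in [1, 6], both endpoints included
```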

Mathematics Background 7
7.H. Buffon’s needle. The problem of Buffon’s needle was first posed by the French naturalist
Georges-Louis Leclerc, Comte de Buffon. It concerns dropping a needle of length l onto a floor
marked with parallel lines a distance d apart, and asks for the probability that the needle will cross
a line. In the case that l < d, the problem can be solved using relatively straightforward
integration and probability theory, giving the answer 2l/(πd). Thus an estimate of π can be
found by performing the experiment: simply drop a needle at random many times, and count
the proportion P of times the needle lies across a line. Then π ≈ 2l/(P d).
7.I. Monte Carlo methods. When mathematicians cannot solve a problem analytically, we
often use numerical techniques to get approximate solutions. Monte Carlo methods are nu-
merical techniques of this kind, which employ randomness in a critical way. They work by
generating random numbers to use as possible initial conditions (this corresponds to the exact
position and angle at which Buffon's needle is dropped), and computing the proportion of
trials which satisfy a particular property (such as crossing a line).
Here is a simple example, closely related to Buffon’s needle. Consider a unit square with
an inscribed circle. The area of the circle is π/4, since a circle inscribed in a square of side 1
has radius 1/2. A Monte Carlo method for estimating π is then to choose points distributed
uniformly in the square. One way to do this is to use Python’s random() procedure to generate
independent random samples from [0, 1) to use as x and y coordinates. Count the proportion
of these points lying inside the inscribed circle, that is, points (x, y) with the property that
\[ \sqrt{(x - 0.5)^2 + (y - 0.5)^2} < 0.5. \]
This proportion is an estimate for π/4. On average, the
approximation improves as more sample points are taken.
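A sketch of this Monte Carlo estimate (the function name is ours):

```python
from random import random

def mc_pi(n):
    """Estimate pi from the proportion of n random points in the unit square
    that fall inside the inscribed circle of radius 0.5."""
    inside = 0
    for _ in range(n):
        x, y = random(), random()
        # compare squared distances to avoid a square root
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 < 0.25:
            inside += 1
    return 4 * inside / n
```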
7.J. Linear Congruential Generators. It is a non-trivial fact that a linear congruential
generator x_{n+1} = (a x_n + c) (mod m) produces a sequence of pseudorandom numbers with
period m (the maximum possible period) for all initial seed values, if and only if
(1) c and m have no common factors
(2) a − 1 is divisible by all prime factors of m
(3) a − 1 is a multiple of 4 if m is a multiple of 4
These conditions can be satisfied relatively simply. For example, if m is a power of 2 (so that
2 is the only prime factor of m), all three conditions are satisfied provided c is odd and a is 1
greater than a multiple of 4.
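The three conditions can be checked mechanically; a sketch (the function name and the trial-division factorisation of m are our own choices):

```python
from math import gcd

def full_period_conditions(m, a, c):
    """Check conditions (1)-(3) above for a full-period LCG."""
    if gcd(c, m) != 1:                  # (1) c and m have no common factors
        return False
    rem, p = m, 2
    while p * p <= rem:                 # (2) every prime factor of m divides a - 1
        if rem % p == 0:
            if (a - 1) % p != 0:
                return False
            while rem % p == 0:
                rem //= p
        p += 1
    if rem > 1 and (a - 1) % rem != 0:
        return False
    if m % 4 == 0 and (a - 1) % 4 != 0: # (3) if 4 divides m then 4 divides a - 1
        return False
    return True
```

For instance, with m a power of 2, c odd and a one greater than a multiple of 4 (as in the text), all three checks pass.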
7.K. Lagged Fibonacci Generators. Lagged Fibonacci generators achieve their maximum
period if, of the k prescribed initial values, at least one is odd, and if the polynomial
y = x^j + x^k + 1
is primitive over Z_2 (i.e. the integers mod 2). This is a rather technical algebraic notion. A
polynomial of degree k is primitive over Z_2 if it has a root that generates the multiplicative
group of an extension field of Z_2 of degree k.
9. Fractals
9.A. Introduction. One field of mathematical research which flourished due, at least in part,
to the advent of computers, was fractal geometry. Fractals are geometric objects with dimen-
sions which are not integers. One method of creating fractals, designed by the British mathematician
Michael Barnsley in the early 1990s, is now termed the chaos game.
9.B. The chaos game. The simplest implementation of Barnsley’s method is known as the
chaos game. Fix a triangle and label the vertices A, B, C. Plot a point p0 somewhere inside
the triangle. Then choose a vertex at random and plot a point p1 halfway between p0 and
that vertex. Continue in this way, choosing vertices at random, and plotting point pi halfway
between the chosen vertex and p_{i−1}. The resulting set of points is a fractal shape8, and has an
immediately noticeable self-similarity. The following code plots a Barnsley fractal.
from random import random, randint
import matplotlib.pyplot as plt
iterations = 50000
# x and y coordinates of the 3 triangle vertices, plus an initial point
x = [0.0, 1.0, 0.5, 0.2]
y = [0.0, 0.0, 0.8, 0.3]
r = 0.5
for i in range(iterations):
    p = randint(0, 2)
    x.append(x[-1] * (1 - r) + x[p] * r)
    y.append(y[-1] * (1 - r) + y[p] * r)
plt.scatter(x, y, s=0.1)
The particular fractal produced is known as the Sierpiński gasket, or Sierpiński triangle, which
can also be defined in the following way: start with an equilateral triangle, and iteratively
remove the central (inverted) equilateral triangle from each remaining triangle.
9.C. Fractal dimensions. The Sierpiński triangle created from the chaos game has Hausdorff
dimension equal to log(3)/ log(2) ≈ 1.585. This dimension has a technical definition (see
Mathematics Background), but is related to the intuitive and simpler box-counting dimension.
This, as its name suggests, is computed by counting the number of boxes required to cover the
set in question, in the limit of the size of the boxes going to zero.
Definition 4 (Box-counting dimension). Given a set S in a Euclidean space R^n, let N(ε) be the
number of n-dimensional boxes of side ε required to cover S. Then the box-counting dimension
d_box(S) is given by the following limit (where it exists):
\[ d_{\mathrm{box}}(S) = \lim_{\epsilon \to 0} \frac{\log N(\epsilon)}{\log(1/\epsilon)}. \]

For example, consider a straight line of length l. To cover the line with boxes of width ε
requires N = l/ε boxes. Thus the box-counting dimension of a line is
\[ d_{\mathrm{box}} = \lim_{\epsilon \to 0} \frac{\log(l/\epsilon)}{\log(1/\epsilon)} = \lim_{\epsilon \to 0} \frac{\log l - \log \epsilon}{-\log \epsilon} = \lim_{\epsilon \to 0} \left( -\frac{\log l}{\log \epsilon} + 1 \right) = 1, \]
as expected. Similarly, a rectangle of width a and height b needs N = (a/ε) × (b/ε) boxes
of width ε to cover it, and so has box-counting dimension given by
\[ d_{\mathrm{box}} = \lim_{\epsilon \to 0} \frac{\log(ab/\epsilon^2)}{\log(1/\epsilon)} = \lim_{\epsilon \to 0} \frac{\log(ab) - 2\log\epsilon}{-\log\epsilon} = 2. \]
8Usually the first few iterates are deleted, to ignore any transient behaviour.
Table 4. Covering the Sierpiński triangle of unit size.

Iteration               0     1     2     3     4    ...     n
Size of box, ε          1    1/2   1/4   1/8   1/16  ...   1/2^n
Number of boxes, N(ε)   1     3     9    27    81    ...    3^n

The box-counting dimension of the Sierpiński triangle can also be computed exactly. Here
we consider the construction of the Sierpiński triangle by the central triangle removal method
rather than the chaos game. As shown in Table 4, a triangle of side 1 can be covered by exactly
1 box of side 1. Removing the central triangle then requires 3 boxes of side 1/2. Removing the
next set of central triangles we require 9 boxes of side 1/4. This process continues, and at the
nth iteration we require 3^n boxes of side 1/2^n. Observing that as the number of iterations n
grows, the size of box ε shrinks, we can compute the box-counting dimension of the Sierpiński
triangle to be exactly:
\[ d_{\mathrm{box}} = \lim_{\epsilon \to 0} \frac{\log N(\epsilon)}{\log(1/\epsilon)} = \lim_{n \to \infty} \frac{\log 3^n}{\log 2^n} = \frac{\log 3}{\log 2} = 1.58496\ldots. \]
The argument above works because we understand the fractal well enough to compute its
dimension analytically. In situations with a more complex fractal, the box-counting dimension
is not an easy thing to compute. To start with, we can only ever compute an approximation
to the fractal. Suppose we create a set approximating some fractal (perhaps the Sierpiński
triangle by the chaos game method). It is relatively simple to write an algorithmic procedure
that will approximate the box-counting dimension:
• Define a grid of boxes of size ε and fill the grid with zeros
• For each point in the fractal set, change the zero of the containing gridbox to a 1
• After all points have been located, count the number N of 1s in the grid
• Repeat with increasingly small box size ε
One major computational issue is that we must have sufficient iterates in the fractal set com-
pared to the number of boxes in the grid. If there are insufficient points in the fractal, then
eventually the grid gets so fine that boxes which you would expect to be full remain empty,
and the computation becomes inaccurate. In particular, it is impossible to take a finite number
of points in the fractal set, and then computationally take  to zero. Instead, we use the fact
that log N () ≈ dbox · log(1/) for small , and plot a graph of log N () against log() for 
reasonably small. The gradient of the graph (which should be a straight line) then gives a
reasonable approximation for the box-counting dimension. A good rule of thumb is that there
should be around ten times more points in the fractal set than boxes in the grid.
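The procedure above can be sketched as follows, assuming the point set lies in the unit square (the function names are our own):

```python
import numpy as np

def box_count(xs, ys, eps):
    """Count the boxes of side eps (covering the unit square) containing a point."""
    n = int(np.ceil(1.0 / eps))
    grid = np.zeros((n, n), dtype=bool)
    # index of the box containing each point, clamped so that 1.0 lands in the last box
    ix = np.minimum((np.asarray(xs) / eps).astype(int), n - 1)
    iy = np.minimum((np.asarray(ys) / eps).astype(int), n - 1)
    grid[ix, iy] = True
    return int(grid.sum())

def dim_estimate(xs, ys, sizes):
    """Gradient of log N(eps) against log(1/eps) over the given box sizes."""
    log_inv_eps = [np.log(1.0 / e) for e in sizes]
    log_N = [np.log(box_count(xs, ys, e)) for e in sizes]
    return np.polyfit(log_inv_eps, log_N, 1)[0]
```

For a set of points filling the unit square the estimate should be close to 2, in line with the rectangle example above.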
9.D. Barnsley ferns. A Barnsley fern is effectively a generalisation of the chaos game pro-
ducing a fractal set typically in the shape of a fern. In particular, Michael Barnsley’s original
definition produces an image of the black spleenwort fern. The system consists of four affine
transformations g_i, i = 1, 2, 3, 4, each of the form
\[ g_i(x, y) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e \\ f \end{pmatrix}, \]
chosen at random with probabilities pi . The entries in the constant matrix and vector for the
four transformations are given in Table 5.
Table 5. Original parameters for Barnsley’s fern.

a b c d e f pi
g1 0 0 0 0.16 0 0 0.01
g2 0.85 0.04 -0.04 0.85 0 1.6 0.85
g3 0.2 -0.26 0.23 0.22 0 1.6 0.07
g4 -0.15 0.28 0.26 0.24 0 0.44 0.07
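The chaos-game code above extends directly to this system; a sketch using the Table 5 parameters (the function name is ours; random.choices performs the weighted selection):

```python
import random

# (a, b, c, d, e, f) rows and probabilities p_i from Table 5
MAPS = [(0.0,  0.0,   0.0,  0.16, 0.0, 0.0),
        (0.85, 0.04, -0.04, 0.85, 0.0, 1.6),
        (0.2, -0.26,  0.23, 0.22, 0.0, 1.6),
        (-0.15, 0.28, 0.26, 0.24, 0.0, 0.44)]
PROBS = [0.01, 0.85, 0.07, 0.07]

def fern_points(n):
    """Iterate the randomly chosen affine maps, starting from (0, 0)."""
    x, y = 0.0, 0.0
    pts = []
    for _ in range(n):
        a, b, c, d, e, f = random.choices(MAPS, weights=PROBS)[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        pts.append((x, y))
    return pts

# pts = fern_points(50000); then scatter-plot the points as before for the fern image.
```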

Mathematics Background 9
9.E. Fractal dimensions. The notion of a dimension which can take fractional values has been
used in mathematics for far longer than fractal images have been viewed on a computer. For
example, in the late 1940s, the mathematician, physicist and meteorologist Lewis Fry Richardson,
while investigating the causes of international conflict, wished to accurately measure the length
of borders between different countries. He made the fundamental observation that the measured
length of coastlines depended on the length of the measuring tools. Richardson demonstrated
that the length of coastline as measured by rulers of decreasing length appears to increase
without limit. This Richardson effect is part of the genesis of fractal dimensions, as revisited in
Benoit Mandelbrot’s 1967 paper How Long Is the Coast of Britain? Statistical Self-Similarity
and Fractional Dimension. There are now many different definitions of a fractal dimension,
each addressing a slightly different facet of the same type of behaviour.
9.F. Box-counting dimension. The box-counting dimension discussed in lectures, also known
as the Minkowski dimension, is arguably the simplest fractal dimension, both conceptually and in
terms of computation. The box-counting dimension does have limitations, however. For example, we would like any definition of dimension to give zero dimension to single points, and to
countable unions of points. Consider the set X = {1/n|n ≥ 1} ∪ {0}. That is, X contains the
reciprocals of all the integers, plus zero. It is clearly countable. However, choose 0 < ε < 1/2,
and let k be the (unique) integer satisfying
\[ \frac{1}{k(k+1)} \leq \epsilon < \frac{1}{k(k-1)}. \]
Then an interval of length ε can cover at most one of the points of the set {1, 1/2, 1/3, . . . , 1/k}.
Therefore at least k intervals are required to cover X, so
\[ \frac{\log N(\epsilon)}{\log(1/\epsilon)} \geq \frac{\log k}{\log(k(k+1))}. \]
Taking ε → 0 gives d_box(X) ≥ 1/2. However, we can cover the interval [0, 1/k] with (k + 1) intervals
of length ε, and the remaining (k − 1) points of X with (k − 1) intervals, so we require at most
2k intervals to cover X. Hence
\[ \frac{\log N(\epsilon)}{\log(1/\epsilon)} \leq \frac{\log(2k)}{\log(k(k-1))}, \]
which gives d_box(X) ≤ 1/2. Therefore the set X has a box-counting dimension with fractional
value 1/2, and yet is a countable union of isolated points.
9.G. Scaling Dimension. Another approach to fractal dimension comes from the following
observation: start with a square, say of width 1. If we shrink it to 1/2 its width, exactly 4 = 2^2
of the smaller squares fit inside the original. If instead we shrank our square to 1/3 its width, we
would need 9 = 3^2 to cover the original. In both calculations the exponent is 2, which is the
dimension of a square.
If we start with a cube, the same phenomenon occurs: if we shrink it to 1/2 its width, exactly
8 = 2^3 of the smaller cubes fit inside the original. Again the dimension of the cube (3) appears
as the exponent.
We can deduce a dimension of the Sierpiński gasket in the same way. If we shrink it to 1/2
its width, then the shrunken gasket fits exactly 3 times into the original. So the dimension d
should satisfy 2^d = 3, which is to say d = log_2(3) = 1.58496 . . . (the same as the box-counting
dimension above).
The scaling dimension is a mathematically natural one. However it is not easy to handle
computationally since it requires access to the final fractal structure, not just a finite approx-
imation. Furthermore it only works with fractals which exhibit exact self-similarity. In more
sophisticated fractals, small parts of the shape may resemble distorted copies of the whole,
meaning the idea requires adaptation.
9.H. Hausdorff dimension. A more formal definition avoids these problems. The Hausdorff
dimension, introduced in 1918, is technical and sophisticated, very general, but hard to compute.
Given a set X, we consider covers {U_i}_{i=1}^∞ of X by open sets. The diameter |U_i| of an
open set U_i is the maximum distance between two points of the set. For ε > 0, an ε-cover
of X is a cover for which each |U_i| < ε.
For s > 0 (not necessarily an integer), we define
\[ H^s_\epsilon(X) = \inf\left\{ \sum_{i=1}^{\infty} |U_i|^s \ :\ \{U_i\}_{i=1}^{\infty} \text{ is an } \epsilon\text{-cover of } X \right\}. \]
That is, we consider all ε-covers of X and try to minimise the sum of the sth powers of their
diameters. Now we let ε decrease, meaning the infimum H^s_ε(X) increases, and we write
\[ H^s(X) = \lim_{\epsilon \to 0} H^s_\epsilon(X). \]
Finally we let s decrease, and define the Hausdorff dimension d_H to be
\[ d_H = \inf\{ s : H^s(X) = 0 \}. \]
The Hausdorff, box-counting, and scaling dimensions are related, and frequently they coin-
cide. One relationship which is always true is that for any set X, dH (X) ≤ dbox (X).
10. Computer-assisted proof


10.A. Introduction. Computers and mathematical proof are uneasy bedfellows. Mathemati-
cians disagree on the validity of mathematical results proven with the use of a computer. At
the root of such a debate lie philosophical questions about the very nature of truth and proof. All the
same, it is certainly true that computation can further mathematical understanding, and in some
cases computers can supply brute-force proofs by exhaustion which a human alone could never
manage.
10.B. Euler’s conjecture. Fermat’s Last Theorem, that there are no integer solutions to the
equation
a^n + b^n = c^n
when n > 2, was proven by Andrew Wiles in 1995. The fact that it was shown to be true was
no great surprise — by 1993 the conjecture had been tested (using a rigorous method due to
Ernst Kummer) for n up to four million. Euler’s conjecture (related to Fermat’s Last Theorem)
states that there are no integer solutions to the equation
a^4 + b^4 + c^4 = d^4.
Years of numerical searching by computers failed to find a solution, supporting the conjecture,
and its truth was probably believed by many as readily as Fermat's Last Theorem. In 1988
Noam Elkies of Harvard University combined mathematical insight with computational search
to find a counterexample:
2,682,440^4 + 15,365,639^4 + 18,796,760^4 = 20,615,673^4.
In fact Elkies' method allows the construction of infinitely many solutions, of which this was the
first found; the smallest counterexample, 95,800^4 + 217,519^4 + 414,560^4 = 422,481^4, was found
soon afterwards by Roger Frye in a computer search. The conjecture also fails when 4th powers
are replaced by 5th powers, but it is not known whether or not it holds for higher powers.
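Python's exact integer arithmetic lets us verify Elkies' counterexample directly:

```python
# Check the counterexample exactly, using Python's arbitrary-precision integers.
a, b, c, d = 2682440, 15365639, 18796760, 20615673
elkies_ok = (a ** 4 + b ** 4 + c ** 4 == d ** 4)
print(elkies_ok)
```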
10.C. The Four-colour Theorem. The Four Colour Theorem is arguably the most famous
example of a mathematical problem which was solved using computational techniques. The
conjecture is easy to state: any division of a planar region into contiguous regions can be
coloured with at most four different colours so that no two adjacent regions have the same
colour. This was first proposed as a conjecture by Francis Guthrie in 1852. Several well-known
mathematicians, including Alfred Kempe in 1879 and Peter Guthrie Tait in 1880, published
‘proofs’ to the conjecture, some of which were accepted for years, but all were eventually shown
to be incorrect. The theorem was finally proved by Kenneth Appel and Wolfgang Haken in
1976, who used a computer to check each of 1,936 cases.
10.D. The Kepler Conjecture. Another old, apparently straightforward, conjecture that
has only been proven computationally is Kepler’s sphere-packing conjecture. In 1611 Johannes
Kepler stated, in a manuscript on the structure of snowflakes, that there is no way to arrange
equal sized spheres to achieve a greater density than stacking them in hexagonal lattices. In
other words, the greengrocer’s method of filling a box with oranges really is the most space-
efficient method. Gauss proved in 1831 that any more efficient arrangement must be an
aperiodic one. Beyond that, the conjecture remained unproven, and was even included in David
Hilbert's 23 unsolved problems of mathematics in 1900. In 1998 Thomas Hales announced
a computational proof, involving the minimization of a function of 150 variables for a set of
over 5000 different packing configurations. Unfortunately (and very unusually), the paper's
referees were unable to fully certify the proof, but said they were "99% certain" it was correct.
Hales and collaborators then instigated a collaborative effort to produce a fully
computationally verified proof, using the proof assistants HOL Light and Isabelle. This
'Flyspeck project' ('Formal Proof of Kepler') was completed in 2014.
10.E. Goldbach’s conjecture. In 1742 Christian Goldbach sent a letter to Leonhard Euler
observing that, apparently, every even integer greater than 4 is the sum of two odd primes. It
is simple to check for the first few even numbers (6 = 3 + 3, 8 = 5 + 3, 10 = 5 + 5 = 7 + 3,
12 = 7 + 5, 14 = 7 + 7 = 11 + 3, etc). Note that some even numbers are expressible as the
sum of two odd primes in more than one way. A weaker version of the Goldbach conjecture
states that all odd numbers greater than 7 can be written as the sum of three odd primes. The
Goldbach conjecture has been verified computationally for all even numbers up to 35 × 10^17,
but a proof of the result for all even numbers still evades mathematicians.
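Such finite checks are easy to script; a sketch using trial division (the function names are our own):

```python
def goldbach_pairs(n):
    """All ways to write even n as an unordered sum of two odd primes."""
    def is_prime(k):
        if k < 2:
            return False
        f = 2
        while f * f <= k:
            if k % f == 0:
                return False
            f += 1
        return True
    return [(p, n - p) for p in range(3, n // 2 + 1)
            if is_prime(p) and is_prime(n - p)]

# goldbach_pairs(10) finds both decompositions, (3, 7) and (5, 5).
```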
Computations can only ever verify Goldbach for a finite set of numbers, however large. It
is possible that computation could find a counter-example, or even that it may provide a hint
towards a method of proving the conjecture rigorously. Computational investigation can assist,
but not supplant, mathematical reasoning. Moreover, finiteness is not the only barrier to the
use of computation in proving Goldbach.
In 1937, Vinogradov proved that the weak Goldbach conjecture was true for all odd numbers
greater than 10^7,000,000. It might appear that since the ‘infinite’ case has been dealt with, all we need do is verify the conjecture for the finite set of numbers up to 10^7,000,000, and
the conjecture is proved. But this is an insanely large number. The number of particles in
the known universe is around 10^80, and so it is utterly infeasible for a computer to check every number up to Vinogradov’s bound. In 2013, Peruvian mathematician
Harald Helfgott published a paper which proves the weak Goldbach conjecture. First, he used
mathematical reasoning to reduce the threshold to the more manageable figure of 10^30. Then,
with David Platt, he checked odd numbers up to that limit by computer.
Goldbach’s original conjecture, however, remains unproven.
10.F. Euler Bricks & Perfect Cuboids. A Pythagorean triple is a triple (a, b, d) of positive
integers where a2 + b2 = d2 . This can be thought of as describing an a × b rectangle with the
property that the diagonal d is also of integer length.
An Euler Brick is an a × b × c cuboid where a, b, c are positive integers, as are the diagonals
of each face. The first example (a, b, c) = (44, 117, 240) was discovered in 1719 by Paul Halcke.
A Perfect Cuboid is an Euler Brick with the additional property that the body diagonal is
an integer. That is: a2 + b2 + c2 = d2 for some integer d. It is an open question whether any
perfect cuboids exist. But computational techniques have established that if one does, each of
a, b, c must exceed 5 × 10^11.
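A naive brute-force search in Python illustrates how such a computation starts (the serious searches behind the bound above use far more sophisticated number theory; the function names here are our own). We test every cuboid with a < b < c up to a limit, keeping those whose three face diagonals are all integers:

```python
from math import isqrt

def is_square(n):
    """True if n is a perfect square."""
    r = isqrt(n)
    return r * r == n

def euler_bricks(limit):
    """All Euler bricks (a, b, c) with a < b < c <= limit."""
    bricks = []
    for a in range(1, limit + 1):
        for b in range(a + 1, limit + 1):
            if not is_square(a * a + b * b):
                continue  # diagonal of the a-b face is not an integer
            for c in range(b + 1, limit + 1):
                if is_square(a * a + c * c) and is_square(b * b + c * c):
                    bricks.append((a, b, c))
    return bricks

print(euler_bricks(250))   # finds Halcke's brick (44, 117, 240)

# A perfect cuboid would additionally satisfy is_square(a*a + b*b + c*c);
# Halcke's brick does not (44**2 + 117**2 + 240**2 = 73225 is not a square).
```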
10.G. Julia sets. One area of mathematics that has been transformed by numerical computa-
tion is complex dynamics. The mathematical foundations of the topic were established in the
early 20th century by the French mathematicians Gaston Julia and Pierre Fatou. They defined
a pair of complementary sets (the Julia set and the Fatou set) for complex-valued functions.
Julia began by considering the problem of which root of a particular complex function a vari-
ation of Newton’s method found. The answer turns out to be one of infinite complexity, and
indeed results in a fractal division of the complex plane. Of greater complexity still were the
systems Julia studied next. A special case is given by the iterative procedure
z_{n+1} = f(z_n) = z_n^2 + c,
where the z_n form a sequence of complex numbers, and c is a complex constant. Iterating the
system for a given initial condition z_0, we find that the sequence either stays bounded in the
complex plane, or else diverges to infinity. The initial conditions producing bounded sequences
form the filled Julia set. (The Julia set is then the boundary of such a set.) Julia sets take
many different forms depending on the constant c. Taking c = 0 produces a trivial Julia set
(the unit circle), while c = −1 gives a recognizable fractal. Julia and Fatou proved many results
about the topology and geometry of Julia sets, but neither lived to actually see what a fractal
Julia set looked like.
MATH2920 COMPUTATIONAL MATHEMATICS, 2018/19 73

In 1980 Benoit Mandelbrot revisited Julia and Fatou’s work, using a computer to simulate the
behaviour of the map f . He saw that Julia sets were either connected (that is, in one piece), or
disconnected, depending on the value of c. The Mandelbrot set is defined to be the set of points
c in the complex plane such that the corresponding Julia set is connected. Computationally,
the Mandelbrot set can also be found by iterating the map f . Always starting with the initial
value z_0 = 0, the constant c is in the Mandelbrot set if the sequence {z_n} remains bounded.
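Both membership tests can be sketched in a few lines of Python using the standard escape-time criterion: for this map, once |z_n| > 2 the orbit is guaranteed to diverge, so an orbit surviving many iterations inside that radius is taken, heuristically, to be bounded. The function name below is our own:

```python
def stays_bounded(c, z0=0j, max_iter=1000):
    """Iterate z -> z**2 + c from z0; report whether the orbit stays
    within |z| <= 2 for max_iter steps.  Exceeding 2 guarantees
    divergence, so False is certain; True is only heuristic evidence."""
    z = z0
    for _ in range(max_iter):
        if abs(z) > 2:
            return False
        z = z * z + c
    return True

# Filled Julia set for c = -1: the point z0 = 0 cycles 0, -1, 0, -1, ...
print(stays_bounded(-1, z0=0j))    # True

# Mandelbrot membership always starts from z0 = 0:
print(stays_bounded(-1))           # True  (orbit is the 2-cycle above)
print(stays_bounded(1))            # False (orbit 0, 1, 2, 5, ... escapes)
```

Colouring each point of a grid in the complex plane by the result of this test (or by how quickly the orbit escapes) is exactly how pictures of Julia sets and the Mandelbrot set are produced.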
10.H. The Prime Number Conjecture. Let the number of prime numbers less than N be
given by Π(N ). Predicting the appearance of prime numbers is notoriously difficult, but Gauss,
when he was just 14, gave the estimate Π(N) ∼ N/ln N. He later refined this to Π(N) ∼ Li(N) = ∫_2^N dx/ln x. It
appeared that Gauss’s logarithmic integral always overestimated Π(N ) (that is, appeared to be
an upper bound). Testing computationally for N up to a million, or a billion, or even a trillion,
showed Li(N) > Π(N). However, in 1914, Littlewood showed that for some large enough N, Li(N) < Π(N), and in 1955 Skewes showed that the first such N would occur sometime before N = 10^(10^(10^1000)). This upper bound has subsequently been reduced to ‘only’ 10^317.
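The small-N evidence is easy to reproduce in Python: count primes with a sieve of Eratosthenes and estimate the logarithmic integral with a crude trapezoidal rule (both helpers below are our own sketch, not a serious numerical method):

```python
from math import log

def prime_count(n):
    """Pi(n): the number of primes up to n, via a sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return sum(sieve)

def Li(n, steps=100_000):
    """Trapezoidal estimate of the integral of 1/ln(x) from 2 to n."""
    h = (n - 2) / steps
    ys = [1 / log(2 + i * h) for i in range(steps + 1)]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

for N in (10**3, 10**4, 10**5):
    print(N, prime_count(N), round(Li(N)))
```

For each of these N the estimate Li(N) comes out above Π(N), in line with the pre-Littlewood numerical evidence; no feasible computation reaches the region where the inequality first reverses.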
10.I. Computational Proof Verification. Some recent mathematical proofs (such as the
Four Colour Theorem 10.C and Kepler Conjecture 10.D) involve so much intensive computation,
combined with lengthy and intricate mathematical arguments, that when fully unpacked they
are far too large and difficult for every line to be checked by a human mathematician. So
how can we ever trust them? Formal Proof Assistants provide an answer. Such an assistant
is a special programming language which can examine the proof, and check that each line is a
logical consequence of the lines above. Of course, the proof has to be presented in a suitably
formal way; this typically requires a great deal of work.
Once a proof is formally verified, you only have to trust the proof assistant’s kernel; so long
as that program is robust, there can be no other error.
There are now various proof-verifying languages in existence. Georges Gonthier used Coq in
2005 to validate a proof of the Four Colour Theorem (Gonthier also used Coq to verify the Feit-Thompson Theorem, a lengthy proof in group theory, though the theorem itself is quick to state: every
finite group of odd order is solvable). Hales and his collaborators used HOL Light and Isabelle
to verify Kepler’s conjecture. Mizar is another proof assistant, used for the expanding Mizar
Mathematical Library: a collection of 52000 mathematical facts and theorems (and growing)
which have been formally verified. These include many standard mathematical facts, as well
as more advanced theorems such as Gödel’s Incompleteness Theorems. There are now many
ambitious formalised mathematics projects ongoing.
10.J. Automated Theorem Proving. Formal Proof Assistants check proofs which are con-
structed by humans. An altogether more ambitious idea is to create software which can come
up with proofs itself. Robbins’ Conjecture was a conjecture in abstract algebra made in 1933; it was finally proved in 1996 by the Equational Prover in collaboration with the human mathematician William McCune.