O'Connor - Two Simple Statistical Calculations and ClimateGate

Two Simple Statistical Calculations
and
ClimateGate
Derek O’Connor
www.derekroconnor.net
Started : February 14, 2008,

Latest : February 6, 2010
Yet the errors do not come from the art

but from those who practice the art.1
— Isaac Newton (1642 – 1727)
1 Introduction
This note is prompted by reports of errors in the calculation of two simple statistics: the
mean of a vector x, or x̄ = 1n i xi , and its variance Var(x) = 1n i (xi − x̄)2 .
P P
Errors in such simple calculations usually indicate that a problem is ill-conditioned, or an

unstable algorithm is being used, or a combination of both.
We illustrate these ideas by analyzing two problems where these errors arise: the Sea Surface
Heights mean problem which is ill-conditioned, and the Microsoft Excel variance problem
where an unstable algorithm is used.
A new section has been added called Climategate. This identifies and explains some of the
the programming errors found in the journal of Ian (Harry) Harris who worked as a scientist-
programmer in the Climate Research Unit at the University of East Anglia.
We will see that Newton’s admonition needs to be taken to heart, especially in scientific
computing.
1
Attamen errores non sunt Artis sed Artificum, from the ‘Author’s Preface to the Reader’, Philosophiae
Naturalis Principia Mathematica, First Edition, July 5. 1686.
1
Two Statistical Calculations Derek O’Connor
1.1 The Sea Surface Heights Problem
The following information was taken from a paper [4], by Yun He & Chris Ding of the
NERSC-Lawrence Berkeley Labs who were doing a large-scale simulation of ocean circula-
tion. At each step of the simulation the following was done:
1. Sea Surface Heights are calculated at each point on a 64 ×120 latitude-longitude grid.
2. The average of these 64 × 120 = 7680 numbers is then calculated.

3. This average is compared with satellite data.
The Fortran code below does the summation part of these calculations.
sum = 0.0
do i = 1, 64 latitude index
do j = 1, 120 longitude index
sum = sum + ssh(i,j)
end do
end do
The order of summation can be changed by interchanging the i and j indices and by revers-
ing their order.
Table 1: He and Ding’s Results using 16-digit Precision
Summation Order Value of sum Rel. Error Corr. Digs.

Longitude First 34.4147682189941410 95.1 0
Latitude First 0.67326545715332031 0.88 0
Reverse Longitude First 32.302734375 89.2 0
Reverse Latitude First 0.734375 1.05 0
Exact Value 0.35798583924770355224609375 0 26
Table 1 shows the results that He & Ding got with the Fortran code shown above, on a single
processor, using IEEE double precision (∼16 decimal digits). He & Ding point out that these
results are completely wrong — not one digit is correct. We will analyse this problem in
Section 3 and explain why He & Ding got such inaccurate results.
1.2 The Microsoft Excel Problem
Microsoft’s Excel spreadsheet has been in use for many years and has gone though many
versions. Many millions in business, government, and universities, use some version of
Excel. Most users do not have the time or ability to test the quality of Excel’s calculations,
and when errors do occur most users do not see the result as erroneous. Here is an example
where Excel gets a wrong answer to a simple problem.
c Derek O’Connor, February 6, 2010

2
We wish to calculate the mean and standard deviation of the set of numbers
xi = ai + M, ai = 1 i = 1, 3, 5, 7, 9 and ai = 2, i = 2, 4, 6, 8, 10.
where M is a large constant. This contrived example is designed to reveal flaws in the
standard deviation calculation.
The exact values of the mean and variance are
n n
1X 1 X 1
x̄ = xi = (ai + M) = (15 + 10M) = M + 1.5
n i=1 10 i=1 10
n 10 10
1 X 1 X 1X
Var(x) = (xi − x̄)2 = (ai + M − M − 1.5)2 = (ai − 1.5)2
n − 1 i=1 10 − 1 i=1 9 i=1
10
1X 1
= (±0.5)2 . = 2.5 = 0.277̇ . . .
9 i=1 9
√
SDev(x) = 0.277̇ . . . = 0.5270462766947299 rounded to 16 digits and does not involve M.
Table 2 shows the results of Excel 2000’s calculations with M = 108 , 1010 , 1014 , 1015 . The
last line of the Table 2 contains Excel 2000’s values for the standard deviation. None of these
values is correct. This result is not new: for many years and versions the variance function
in Excel has been calculated by a bad algorithm which gives the bad results shown here.2
We will analyse this problem in Section 4 and explain why Excel 2000 and later versions get
such inaccurate results.
Table 2: Excel 2000 Results

i xi xi + 108 xi + 1010 xi + 1014 xi + 1015
1 1 10000001 1000000001 100000000000001 1000000000000001
2 2 10000002 1000000002 100000000000002 1000000000000002
3 1 10000001 1000000001 100000000000001 1000000000000001
4 2 10000002 1000000002 100000000000002 1000000000000002
5 1 10000001 1000000001 100000000000001 1000000000000001
6 2 10000002 1000000002 100000000000002 1000000000000002
7 1 10000001 1000000001 100000000000001 1000000000000001
8 2 10000002 1000000002 100000000000002 1000000000000002
9 1 10000001 1000000001 100000000000001 1000000000000001
10 2 10000002 1000000002 100000000000002 1000000000000002
Sum 15 100000015.0 10000000015.0 1000000000000015.0 10000000000000016.0
Mean 3/2 10000001.5 1000000001.5 100000000000001.5 1000000000000001.6
√
Sdev 0.27̇ 0.54. . .20E+00 0.00. . .00E+00 1.39. . .30E+06 0.00. . .00E+00
2
Note that the sum in the last column is wrong also.

3
2 Preliminaries
2.1 Floating Point Arithmetic
We assume we are working in a floating point number system with base b and precision p.
The derived parameter, ǫ M = b1−p , is called machine epsilon, the distance between 1.0 and
the next higher floating point number. In such a system we can show that fl(1 + δ) = 1 if
δ < 21 ǫ M . That is, δ is insignificant compared to 1.
— TO BE COMPLETED —

4
A Summary of Floating Point Number Systemsa

s
A real number x has a floating point representation x̂ = fl(x) = ± p × be , which
b
is a rational number because s, b, p, and e are integers.
F(b, p, emin , emax ) is the floating point number system with base b, precision p,
and exponent range [emin − emax ]. F is a finite subset of the rationals Q.
1. Range : bemin −1 ≤ | x̂| ≤ (1 − b−p ) bemax .
2. Machine Epsilon : ǫ M = b1−p .
3. Unit Roundoff (to Nearest) : u = 12 b1−p = 12 ǫ M
4. Roundoff Error : x̂ = fl(x) = x(1 + δx ), where |δx | ≤ u
5. Fundamental Axiom of Floating Point Arithmetic
fl(x ◦ y) = (x ◦ y)(1+δ), where |δ| ≤ u and ◦ is one of {+, −, ∗, /}.

Note: the operation x ◦ y is exact but may need to be rounded to a floating
point number. Hence fl(x ◦ y) = (x ◦ y)(1 + δ), where |δ| ≤ u .
6. Cancellation Error
Theorem 1. In any floating point system F(b, t, −, −), without guard digits,
the relative error in fl(x − y) can be as large as b − 1.
Note: this can be as large as 100% for b = 2 and 900% for b = 10.
IEEE Double Precision : F(b, p, emin , emax ) = F(2, 53, −1021, 1024)
1. Range : 2.225 × 10−308 ≈ 2−1022 ≤ | x̂ | < 21024 ≈ 1.798 × 10308
2. Machine Epsilon : ǫ M = 21−53 ≈ 2.2 × 10−16 .
3. Unit Roundoff : u ≈ 1.1 × 10−16 .

a
The expression x̂ = fl(x) means that fl(x) is the operation of rounding x, and x̂ is the rounded
value of x.
2.2 The Condition of a Problem
Both the mean and variance calculations are essentially summation problems. Calculating
the sum of n numbers is probably the simplest and most widely-used calculation in comput-
ing, from the home spreadsheet user to scientists who use earth and cosmos simulators.
We wish to explain why the simple problem of summation can give rise to such wildly-

5
inaccurate results. Such results usually indicate that the problem, that is, the data (x1 , x2 , . . . , xn ),
is ill-conditioned with respect to summation. However, a bad result may indicate a bad al-
gorithm.
We view a solution of a problem as a mapping or transformation from
Definition 1 (Condition of a Problem).
Problems, Data, Algorithms, and Programs
We need to be very careful when using these four words because of the confusion that sloppy
use can cause.
The following is a synopsis of pages 89–91, Trefethen and Bau, Lecture 12 [2].

6
Trefethen and Bau’s Definition of Condition
The process of solving a problem may be viewed as a mapping f : X → Y of a

problem instance x ∈ X (the problem(data) space), to a solution y = f (x) ∈ Y
(the solution(data) space). In general we assume both X and Y are normed
vector spaces and in what follows we assume that X = Rn and Y = Rm , which
have the usual norms.
We are interested in the effect of small perturbations δx on the solution f (x).
Let δ f = f (x + δx) − f (x). Then the Relative Condition Number for f at the
point x is the scalar κ( f (x)), defined as
!
kδ f k / k f (x)k
κ( f (x)) = lim sup (1)
δ→0 kδxk ≤δ kδxk / kxk
If f is differentiable then we have δf /δx = J f (x) = [∂ f /∂xi ]ni=1 , the Jacobean

of f at x. Thus we have
kJ f (x)k
κ( f (x)) = kxk . (2)
k f (x)k
The relative condition (number) κ( f (x)) measures the relative change in the
solution f (x) due to a perturbation δx. If κ( f (x)) is small then the problem f is
well-conditioned. If κ( f (x)) is large then the problem f (x) is ill-conditioned.
For a one-dimensional function the expression in (2) becomes
| f ′(x)|
κ( f (x)) = |x| . (3)
| f (x)|
Example 1 (Subtraction). Let f (x1 , x2 ) = x1 − x2 , i.e., f : R2 → R. Then we

have J f (x) = [∂ f /∂x1 , ∂ f /∂x2 ] = [1, −1]. Using the k·k ∞ norm we get kxk∞ =
max{|x1 |, |x2 |}, k f (x)k∞ = |x1 − x2 |, kJ f (x)k∞ = 1, and
kJ f (x)k∞ max{|x1 |, |x2 |}

κ∞ ( f (x)) = kxk∞ = . (4)
k f (x)k∞ |x1 − x2 |
This shows that subtraction is ill-conditioned when |x1 − x2 | ≈ 0. In extreme cases
this can lead to catastrophic cancellation, where no digits of the result are correct.
(See Theorem 1)
Notes :
1. We use the notation κ( f (x)) rather than κ( f ) to remind us that the con-
dition number depends on f and x.
2. The condition of a problem has nothing to do with the algorithm used

to solve the problem — κ( f (x)) is a characteristic of the problem alone.

7
2.3 Error, Stability, and Condition
Let x be an exact numerical vector and x̂ be an approximation to x. Then we define
E x = kx − x̂k is the absolute error in x, and

kx − x̂k
ex = is the relative error in x. (5)
kxk
It is often useful to write x̂ = x(1 + ex ).
Errors in Algorithms
Recall that the process of solving a problem may be viewed as a mapping f : X → Y of a

problem instance x ∈ X, the problem space, to a solution y = f (x) ∈ Y, the solution space.
An algorithm to solve a problem will not, in general, be exact and so the algorithm is an
approximate mapping fˆ : X → Y. We define the relative error of the algorithm as
k f (x) − fˆ(x)k
ef = is the relative error in f . (6)
k f (x)k
Assume we want to do a calculation f (x). We want to determine the effect of using x̂. Let
x̂ = x(1 + ex ) and e f = ( f (x) − f ( x̂))/ f (x).
Example 2 (Condition vs Stability). Neumaier[6] has this example that nicely illustrates the differ-
ence between condition and stability. Calculate
p p
f (x) = x−1 − 1 − x−1 + 1, 0 < x < 1. (7)
√ √
Stability. For x ≈ 0, x−1 − √ 1 ≈ √x−1 + 1,√and the calculation of f (x) suffers from massive cancel-
lation. When x ≈ 1 f (x) ≈ 0 − 2 = − 2, and no cancellation occurs. Thus the calculation of
f (x) is unstable when x ≈ 0 and stable when x ≈ 1.
Condition. We have, using (3)
1 1
f ′ (x) = √ − √ ,
2 x2 x−1 + 1 2 x2 x−1 − 1
and, after some simplification, we get
f ′ (x) 1
κ( f (x)) = x = √ . (8)
f (x) 2 1 − x2
Hence f (x) is ill-conditioned near x = 1 because lim x→1 κ( f (x)) = ∞ and well-conditioned near x = 0
because lim x→0 κ( f (x)) = 12 .
Thus, the calculation of f (x) is stable but ill-conditioned near x = 1, and is unstable but well-
conditioned near x = 0.
This example shows clearly that condition and stability are two independent aspects of the
same problem.

8
3 The Summation Problem

n
X
There are many ways of calculating s = xi . Here is the standard algorithm for this prob-
i=1
lem, along with the general algorithm for summing the elements of the set X = {x1 , x2 , . . . , xn } :
algorithm sum(x, n) →s algorithm gsum(X) →X
s := 0 while |X| > 1 do

for i := 1 to n do (xi , x j ) := Delete2(X )
s := s + xi s := xi + x j
endfor Insert( s, X )
return s endwhile
return X
The standard algorithm is based on the partial-sum recurrence si = si−1 + xi , with s0 = 0 and
i = 1, 2, . . . , n. The general sum algorithm is not well-known but is interesting because it
allows us to sum the elements of any set X where ‘+’ is defined, and to do this in any order.
The function Delete2(X ) deletes and returns two elements xi and x j . The elements chosen
depend on the Delete2 function. The two chosen elements are added and the result s added
back into X. Thus the size of X decreases by 1 for each iteration of the while – loop, which
halts when the size of the set X reaches 1. On exit, the set X contains one element, the sum
of the elements in the initial set.
3.1 The Condition of the Summation Problem
In keeping with Trefethen & Bau’s definition of a problem, we view the summation of n
numbers, s(x), as a mapping from Rn to R. That is,
n
X
s : Rn → R, where s(x1 , x2 , . . . , xn ) = xi . (9)
i=1
√
We have Js (x) = [1, 1, . . . , 1] and so kJs (x)k1 = n, kJs (x)k2 = n, and kJs (x)k∞ = 1. Also
ks(x)k1 = ks(x)k2 = ks(x)k∞ = | xi |. Using these values in (2) we get3
P
P
|xi |
κ1 (s(x)) = n P
| xi |
P 12
√ |xi |2 (10)
κ2 (s(x)) = n P
| xi |
maxi {|xi |}
κ∞ (s(x)) = P
| xi |
3 P Pn
We use the abbreviated form xi for i=1 xi , in what follows.

9
P
We can see that these three condition numbers have the same denominator, | xi |, and if this
is small relative to the numerator the problem will be ill-conditioned. This can happen if x
has many positive and negative elements that cancel each other.
X X X
Rule of Thumb: xi is Ill-Conditioned if xi << |xi |.
P P P
The inequality | xi | << |xi |, is called Massive Cancellation in xi .
Example 3 (Massive Cancellation). Consider the problem x = [1, M, −M, M, . . . , M, −M] ∈ Rn+1 .
P P
We have xi = 1 and |xi | = 1 + nM. Hence
κ1 (s(x)) = n(1 + nM),

p
κ2 (s(x)) = n + n2 M 2 , (11)
κ∞ (s(x)) = M.
Each of these condition numbers can be made as large as we please by choosing M large enough.
Note, however, that it is not the value of M that causes the problem. The real culprit is the fact that
P
the denominator xi = 1 is small due to massive cancellation. This occurs no matter what the values
we have for M or n.
3.2 The Condition of the SSH Problem
We now demonstrate that the SSH problem suffers from massive cancellation and is thus
ill-conditioned.
The exact sum of the SSH problem may be calculated using such systems as Maple, Math-
ematica, Maxima, etc., and we get sexact = 0.3579858392477036 (rounded to 16 digits). The
1-norm condition of the SSH problem, using the exact sum is
5.3025040611697 × 1016
P
|xi |
κ1 (sexact ) = n P = 7680 ≈ 1021 (12)
| xi | 0.3579858392477036
Even if we use the inexact sum of Matlab we get κ1 (sM ) ≈ 1019 . This shows that the
SSH summation problem is highly ill-conditioned. Hence we may expect trouble with this
summation.
An upper bound on the relative forward error is:
|scalc − sexact |
er (s) , ≈ κǫM , (13)
|sexact |
where ǫ M ≈ 2.2 × 10−16 is machine epsilon for IEEE double precision. Thus, for the SSH
problem we have the upper bound
er (s) ≈ κ1 ǫ M ≈ 1021 × 10−16 = 105 or 10, 000, 000% (14)
Hence, we can expect to get no digits accurate in this sum, and we should not be surprised
that in IEEE double precision, Matlab’s sum(x) gives sM = 34.41476821899414, which is
100 times larger than the correct answer.

10
Although we cannot hope to get a correct answer for this problem in Matlab, the simple
and fast Matlab calculation cond1 = n*sum(abs(x))/abs(sum(x)) can warn us of trouble
ahead.
Calculating the exact value for this problem requires at least 26 digits of precision. This
precision (and higher) can be attained using 16-digit arithmetic by compensated summation
and other methods. However, even if we have the exact answer the fundamental difficulty
remains: the problem is ill-conditioned, i.e., a small change in the data will cause a huge
change in the result. No computational ‘trick’ can avoid this. Instead, we must ask the
question ‘why is this problem ill-conditioned?’ Is it an artifact the program that generates
the data, or the mathematical model of the ocean, or is it feature of the ocean itself? We
cannot answer these questions here. He & Ding [4] do not mention ill-conditioning and give
the impression that an accurate (or exact) calculation of the sum solves their problem. It does
not solve the problem, as this ‘maxim’ of Nick Trefethen implies:
If the answer is highly sensitive to perturbations,

you have probably asked the wrong question.4
A further maxim from Trefethen
No physical constants are known to more than around eleven digits,

and no truly scientific problem requires computation
with much more precision than this.
raises the question: why do He & Ding use 16-digit numbers when they are comparing them
to satellite data which has much lower precision? Scientific modellers and programmers
would do well to study Trefethen’s Maxims, and remember Newton’s admonition.
4 The Sum of Squared Deviations Problem

The sum of squared deviations of a vector x = [x1 , x2 , . . . , xn ] is the second simple statistical
calculation we wish to analyse. The sum of squared deviations S (x) is defined as:
n !2 n
n
X s(x) X
S : R → R, where S (x1 , x2 , . . . , xn ) = xi − , and s(x) = xi . (15)
i=1
n i=1
Chan, et al.,[3] give the 2-norm condition number for this problem:
s
kxk 2 1 s2 (x)
κ2 (S (x)) = √ = 2 1+ . (16)
S (x) n S (x)
Higham[5], pages 32 and 528, gives a component-wise condition number:
P
|xi − x̄| |xi |
κc (S (x)) = 2 , x̄ = s(x)/n. (17)
S (x)
4
“Maxims about numerical mathematics, computers, science and life”, L. N. Trefethen, SIAM News,
v. 31, no. 1 (1998), p 4. Download here: http://www.comlab.ox.ac.uk/people/nick.trefethen/publication/PDF/1998_76.pdf

11
The calculation of S (x) using the definition in (15) is straight-forward but requires two
passes over the data: one to calculate s(x) and one to calculate S (x).
We can get a one-pass algorithm by rearranging the expression for S (x) in (15) to give:
n !2 n
X s(x) X 1
S (x) = xi − = x2i − s2 (x). (18)
i=1
n i=1
n
The algorithms for calculating the variance by these two formulas are shown below. The one-
pass algorithms is more elegant and ‘efficient’ than the two-pass algorithm but is numerically
unstable. Unfortunately, it is often ‘trotted out’ as a clever trick in elementary statistics
books. Worse still, Microsoft’s programmers thought it was a clever trick and used it in
Excel until quite recently. First, let us see why the one-pass algorithm is bad and then we
will see how Excel fares.
algorithm TwoPassSSQ( x, n) algorithm OnePassSSQ( x, n)
sumx = 0 sumx = 0
for i := 1 to n do sumsqx = 0
sumx := sumx + x[i] for i := 1 to n do
endfor sumx := sumx + x[i]
xbar := sumx/n sumsqx := sumsqx + x[i]ˆ2
sumsqd = 0 endfor
for i := 1 to n do sumsqd := (sumsqx - sumxˆ2/n)
sumsqd := sumsqd + (x[i] - xbar)ˆ2 return sumsqd
endfor endalg OnePassSSQ
return sumsqd
endalg TwoPassSSQ
The differences between these algorithms is obvious. What is not so obvious is this important
distinction: when properly implemented in floating point arithmetic, the Two-Pass algorithm
can never give a negative result, but the One-Pass algorithm can give a negative, hence mean-
ingless result. This has plagued amateur programmers for many years, and it seems to have
plagued the unfortunate programmer named Harry in the Climategate Affair (see Section 5
below).
4.1 Analysis of the One and Two Pass Algorithms
We will use the vector x = [M, M + 1, M + 2] to examine the behaviour of these algorithms.
This simple contrived vector is chosen because it will allow us to see precisely where the
rounding errors occur and their effect on subsequent calculations.
Example 4 (Exact Arithmetic). We have x = [M, M + 1, M + 2]. Let S 1 (x) be the sum-of-squares cal-
culated by the One-Pass algorithm, and S 2 (x) the sum-of-squares calculated by the 2-pass algorithm.

12
These give:
n
X
s(x) = xi = 3M + 3 = 3(M + 1). (19)
i=1
n
X s2
S 1 (x) = x2i − = [M 2 + (M + 1)2 + (M + 2)2 ] − [3(M + 1)]2 /3
i=1
n
= (3M 2 + 6M + 5) − 3(M 2 + 2M + 1) (20)
= (5 − 3) = 2
Xn
S 2 (x) = (xi − s/n)2
i=1
= (M − M − 1)2 + (M + 1 − M − 1)2 + (M + 2 − M − 1)2 (21)
2 2 2
= −1 + 0 + 1 = 2
Thus both S 1 (x) and S 2 (x) give the same answer, in exact arithmetic. Notice, however, that the
intermediate calculations, (20) for S 1 (x) and (21) for S 2 (x), are different.
The condition numbers for s(x) and S (x) with x = [M, M + 1, M + 2] are:
1 P
|x |2 2 r
√ i 6 M 2 + 12 M + 10
κ2 (s(x)) = n P = < 1, for all M > 1. (22)
| xi | 9 M 2 + 18 M + 9
s s r
kxk 22
P 2
xi (3M 2 + 6M + 5)
κ2 (S (x)) = = = ≈ M, for large M. (23)
S (x) 2 2
This problem illustrates an important point about the condition of a problem : the condition of the
summation problem, κ2 (s(x)) < 1, is perfect as shown in (22), whereas the sum-of-squares condition,
κ2 (S (x)) ≈ M, in (23), can be made as ill-conditioned as we please by choosing M large enough.
Thus it is not the data that is ill-conditioned, but the calculation being performed on the data.
Now, let us perform the calculations of the previous example using floating point arithmetic.
This is tedious but worth the effort because the result can be generalized. Before we begin
the analysis we must remember the following points when using floating point arithmetic:
1. Associativity may not hold and so we assume that all expressions are evaluated from
left to right, i.e., fl(a ◦ b ◦ c ◦ d) = fl(fl(fl(a ◦ b) ◦ c) ◦ d) .
1 1
2. Unit Roundoff Error : u = ǫ M = b1−p = 2−53 ≈ 10−16 .
2 2
3. Relative Insignificance of δ : fl(x + δ) = fl(x) if δ < u x = x × 10−16 .
Example 5 (Floating Point Arithmetic). We have x = [M, M + 1, M + 2] and we assume that these
numbers are representable in the floating point number system
F(b, p, emin , emax )

13
, from equations (24), (25), (26),

n
X
ŝ(x) = xi = fl(M + fl(M + 1) + fl(M + 2) ) = 3M + 1. (24)
i=1
n
X s2
Ŝ 1 (x) = x2i − = [M 2 + (M + 1)2 + (M + 2)2 ] − [3(M + 1)]2 /3
i=1
n
= (3M 2 + 6M + 5) − 3(M 2 + 2M + 1) (25)
= (5 − 3) = 2
Xn
S 2 (x) = (xi − s/n)2
i=1
= (M − M − 1)2 + (M + 1 − M − 1)2 + (M + 2 − M − 1)2 (26)
2 2 2
= −1 + 0 + 1 = 2
Range Errors

14
4.2 Testing the One and Two Pass Algorithms
The Matlab function S1vS2(pows) shown below implements both algorithms for x = [M, M, M+
2] and the results plotted for various values of M. Matlab uses IEEE double-precision arith-
metic. Note that this problem is so simple that each summation is performed in one line of
code without any loops over the data.
function S = S1vS2(pows);
n = length(pows);
S = zeros(n,2);
for p = pows
M = 2ˆp;
x = [M M+1 M+2];
s = x(1) + x(2) + x(3);
S(p,1) = x(1)ˆ2 + x(2)ˆ2 + x(3)ˆ2 - sˆ2/3;
S(p,2) = (x(1)-s/3)ˆ2 + (x(2)-s/3)ˆ2 + (x(3)-s/3)ˆ2;
end
Plotting code omitted
Figure 1 shows how bad the one-pass algorithm is compared to the two-pass algorithm.
Recall that in exact arithmetic S (x) = 2 for all values of M. The one-pass algorithm gives
the correct result for M < 226 , while the two-pass algorithm gives the correct result for
M < 253 . In fact the one-pass algorithm gives the same results as the two-pass algorithm
using single precision, thus losing half the attainable precision.
4 4
3.5 3.5
3 3
2.5 2.5
S (x)
S (x)
2 2
1
1.5 1.5
1 1
0.5 0.5
0 0
0 5 10 15 20 25 30 35 40 45 50 0 10 20 30 40 50 60
p p p p p p
p, where x=[2 2 +1 2 +2]; p, where x=[2 2 +1 2 +2];
Figure 1: One-Pass vs Two-Pass Algorithm

15
Testing Microsoft Excel 2000
We saw in Table 2 that Microsoft Excel 2000 gave completely wrong results for the variance
calculation. We explain here how Excel goes wrong by analysing a simpler problem which
is given in Table 3. The exact mean is M + 1 and the variance is 1, for each set. As we can
Table 3: Excel 2000 Calculations – Simple Problem
Data M M M+1 M+2 Mean Var

Set1 108 10,000,000 10,000,001 10,000,002 10,000,001 1
Set2 109 100,000,000 100,000,001 100,000,002 100,000,001 0
see, Excel 2000 gets the wrong result for the variance of the second set.
Assuming that Microsoft Excel 2000 uses IEEE double precision (∼ 16 digits), then using
the second variance formula S 22 = 12 [(3M 2 +6M +5)−(3M 2 +6M +3)] we get Ŝ22 = fl(S22 ) = 1
for data set 1 because the constants 5 and 3 are not zero relative to M 2 = 1014 . We get Ŝ 22 = 0
for data set 2 because the constants 5 and 3 are zero relative to M 2 = 1016 .
This shows that Microsoft Excel 2000 uses the bad, unstable, but faster one-pass method.
Microsoft was repeatedly told about this error but refused to fix it. They finally fixed it in
Excel 2003 and then charged people for an upgrade!
The spreadsheet Gnumeric is a free clone of Excel. Indeed the initial version was a per-
fect clone, repeating the errors of Microsoft Excel 2000. The Gnumeric-ers, to their credit,
quickly fixed it once they were told about the error.
Here is the latest news on Excel 2007 from The Inquirer5
A thread on Google Group microsoft.public.excel reveals that Excel 2007 loses its grip with
arithmetic that involves the number 65,535.
Several examples are shown, perhaps the simplest of which is the calculation ( 850 × 77.1 ),
which should produce 65,535 but instead returns 100,000.
There’s all sorts of speculation as to how this bug occurred, postulating floating-point and round-
ing errors and the like, but it seems much more likely that some Excel developer simply punted
at some point and the Vole’s stringent quality control (cough) never caught it.
Some might recall that mathematical errors have been discovered in Excel periodically in various
releases going back at least as far as Excel 5.
Microsoft people appear to have been involved in the discussion and confirmed the bug.
5
http://www.theinquirer.net/gb/inquirer/news/2007/09/25/math-bug-found-excel and follow Google Groups link

16
5 ClimateGate
This section examines some of the programming errors found in one Fortran program that
was in the Climategate files. Here is a concise description of how the Climategate files
became public:
“On November 17, 2009, someone posted to the Internet a vast archive of mate-
rials that had been hacked or leaked from the CRU. When packed, the materials
took up about 62 MB, and consist of more than 1,000 emails from prominent
members of the CRU and more than 3,000 documents that included everything
from raw data to annotated computer code to lengthy reports documenting the
frightfully disorganized state of the CRU’s vitally important data files.”
— Kenneth P. Green, Thursday, December 3, 2009
This quotation is referring to the Climate Research Unit (CRU), University of East Anglia
(UEA), which supplies much of the scientific knowledge and data to the UN’s Intergov-
ernmental Panel on Climate Change (IPCC). The compressed 62 MB file expands to about
150 MB and contains various emails, documents, and computer code in various languages
(Fortran and IDL mainly).
5.1 Programming Errors
Many errors occur in scientific programs because the programmers (amateurs usually) have
an imperfect understanding of computer arithmetics.
5.1.1 Arithmetic in Fortran
Fortran has many arithmetics but we concentrate on just two: 32-bit signed integer and 32-bit
floating point arithmetic.
Integer Arithmetic: Most computers use 2s-complement integer arithmetic for all types
of integers. The range of the 32-bit signed integers is
[−231 , 231 − 1] = [−2147483648, 2147483647].
An integer overflow occurs when a program calculates an integer value that is outside the
integer range. The reaction to this overflow depends on how the program was compiled.
The usual reaction is silent overflow and we get intmax+1 = intmin or intmin-1 = intmax,
as shown in Figure 2.
Floating Point Arithmetic: The range of the 32-bit reals is about ±10±38 , with machine
precision ǫ M = 2−23 ≈ 10−7 (≈ 7 decimal digits precision). Machine precision ǫ M is the
distance between 1 and the next higher floating point number. Hence if |x| is the magnitude
of a floating point number, then the magnitude of the next higher number is |x|(1 + ǫ M ), and
the distance between these is ǫ M |x|. There are two important consequences of these facts:

17
-2^{31} -1 0 +1 2^{31}-1
+1
-1
Figure 2: Integer Overflow
• No floating point number can exist between |x| and |x|(1 + ǫ M ). A number which falls
between these two must be rounded to one or the other.
• The distance between these two floating point numbers varies with the magnitude of
x. This means that large consecutive floating point numbers have large gaps between
them, and small numbers have small gaps.
5.1.2 Harris’s Program
Ian (Harry) Harris is or was a scientist-programmer at the CRU. The UEA website says that
Harris specialises in dendroclimatology, climate scenario development, data manipulation
and visualisation, programming.
Harris appears to have been given a program (anomdtb.f90) written by another person and
told to get it working.6 This is a particularly nasty job if the code has been badly written and
not properly commented. It is to Harris’s credit that he kept a fairly detailed journal or log
of the work he did on the program.
This is an extract from the file HARRY_READ_ME.txt (Harris’s journal) where he tries to figure
out how a squared variable becomes negative:
17. Inserted debug statements into anomdtb.f90, discovered that

a sum-of-squared variable is becoming very, very negative! Key
output from the debug statements:
OpEn= 16.00, OpTotSq= 4142182.00, OpTot= 7126.00

DataA val = 93, OpTotSq= 8649.00
DataA val = 49920, OpTotSq=-1799984256.00
6
The comments at the top of anomdtb.f90 say the program was written by Tim Mitchell on 11.02.02.

18

OpEn= 16.00, OpTotSq=-1798522368.00, OpTot=56946.00
forrtl: error (75): floating point exception
IOT trap (core dumped)
..so the data value is unbfeasibly large, but why does the
sum-of-squares parameter OpTotSq go negative?!!
Probable answer: the high value is pushing beyond the single-
precision default for Fortran reals?
Value located in pre.0312031600.dtb:
-400002 3513 3672 309 HAMA SYRIA 1985 2002 -999 -999
6190 842 479 3485 339 170 135 106 0 9 243 387 737
1985 887 582 93 16 17 0 0 0 0 352 221 627
1986 899 252 172 527 173 30 0 0 0 84 496 570
1987 578 349 950 191 4 0 0 0 0 343 462 929
1988 1044 769 797 399 11 903 218 0 0 163 517 1181
1989 269 62 293 3 13 0 0 0 0 101 292 342
1990 328 276 83 135 224 0 0 0 0 87 343 230
1991 1297 292 860 320 70 0 0 0 0 206 298 835
1992 712 1130 222 39 339 301 0 0 0 0 909 351
1993 726 609 452 82 672 3 0 0 0 34 183 351
1994 625 661 561 41 155 0 0 0 22 345 953 1072
1995 488-9999-9999 182-9999 0-9999 0 0 0 754-9999
1996-9999 40949920-9999 82 0-9999 0 36 414 112 312
1997-9999 339 547-9999 561-9999 0 0 54 155 265 962
1998 1148 289 672 496-9999 0 0-9999 9 21-9999 1206
1999 343 379 710 111 0 0 0-9999-9999-9999 132 285
2000 1518 399 211 354 27 0-9999 0 27 269 316 1057
2001 370-9999-9999 273 452 0-9999-9999-9999 290 356-9999
2002 871 329 403 111 233-9999 0 0-9999-9999 377 1287
(value is for March 1996)

Action: value replaced with -9999 and file renamed:
pre.0312031600H.dtb (to indicate I’ve fixed it)
.dts file also renamed for consistency.
anomdtb then runs fine!! Producing the usual txt files.
Note: The program anomdtb.f90 was found in ...\FOIA\documents\cru-code\linux\cruts. The

READ_ME.txt says “Procedure for updating the databases underlying the CRU high-resolution
grids. Tim Mitchell, 25.06.03, revised 30.3.04”
The errors occur in the subroutine Anomalise which extends from line 286 to line 576 in
anomdtb.f90. The main computational task in this subroutine seems to be the calculation
of the sum and sum-of-squares of data arrays which are then used to calculate standard
deviations for the data.
The program Harris is sweating over has at least two serious errors and if you are not ad-
equately trained in computer arithmetic, programming (and Fortran in this case), you will
never find the cause of these errors.7
7
Worse still, you may not find the errors if the output is wrong but plausible. This is why rigorous testing
of software is essential.

19
Error No. 1: Integer Overflow.
The program defines all variable as global. It uses the default Fortran types integer and
real. Any value of either type occupies 32 bits. Here are the statements that cause the first
error:
Line 026: implicit none

Line 041: integer, pointer, dimension (:,:,:) :: Data,DataA,DataB,DataC
Line 087: real :: OpVal,OpTot,OpEn,OpTotSq,OpStdev,, etc.
Line 348: do XAYear = 1, NAYearat
Line 353: OpTotSq=OpTotSq+(DataA(XAYear,XMonth,XAStn)**2)
Line 355: end do
We can see that the variable being squared is an element of the integer array DataA, and that
these squared values are accumulated in the real variable OpTotSq.
Here is a skeleton version of the program that uses the DataA values given in the debug output
in Harris’s journal above. It shows how the negative OpTotSq values occur:
PROGRAM Climgate1
implicit none
integer,parameter :: dim = 16
integer,dimension(dim) :: DataA
data DataA /93,172,950,797,293,83,860,222,452,561,49920,547,672,710,211,403/
real :: OpTotSq, ROpTotSq
integer :: k
print*, " k DataA(k) DataA(k)**2 OpTotSq ROpTotSq"
print*, "---------------------------------------------------------------"
OpTotSq = 0.0
ROpTotSq = 0.0
do k = 1, dim
OpTotSq = OpTotSq + DataA(k)**2
ROpTotSq = ROpTotSq + real(DataA(k))**2
print "(i5,i8,5x,i12,5x,2f15.2)",k, DataA(k), DataA(k)**2, OpTotSq, ROpTotSq
end do
END
This program was compiled and run in the Release .NET mode with Silverfrost FTN95
Fortran compiler Version 5.4 for Windows.8 This mode allows silent integer overflow.
The output, which is identical to that of anomdtb.f90, shows that OpTotSq becomes nega-
tive because the result of DataA(k)**2 is negative due to integer overflow. Also shown is a
quick fix: real(DataA(k))**2 converts the integer DataA(k) to 32-bit floating point which
has the approximate range ±1038 and the subsequent squaring does not cause a floating point
overflow.
8
Harris seems to be using the Portland Group Fortran 90 compiler, pgf90, which is is one of the best
commercial compilers available.

20
k DataA(k) DataA(k)**2 OpTotSq ROpTotSq

-----------------------------------------------------------------
1 93 8649 8649.00 8649.00
2 172 29584 38233.00 38233.00
3 950 902500 940733.00 940733.00
4 797 635209 1575942.00 1575942.00
5 293 85849 1661791.00 1661791.00
6 83 6889 1668680.00 1668680.00
7 860 739600 2408280.00 2408280.00
8 222 49284 2457564.00 2457564.00
9 452 204304 2661868.00 2661868.00
10 561 314721 2976589.00 2976589.00
11 49920 -1802960896 -1799984256.00 2494982912.00
12 547 299209 -1799684992.00 2495282176.00
13 672 451584 -1799233408.00 2495733760.00
14 710 504100 -1798729344.00 2496237824.00
15 211 44521 -1798684800.00 2496282368.00
16 403 162409 -1798522368.00 2496444672.00
The problem occurs because the array DataA is declared to be integer in Line 041. The
semantics of Fortran specify that (int op int) --> int. When the loop reaches k = 11
we get DataA(11) = 49920 and DataA(11)**2 should be 2492006400, but this is greater than
2147483647 by 344522753. This overflow causes the result to wrap around to -2147483648,
to which 344522753-1 is added, and we get -1802960896. Thus the square of a number has
become negative. This may seem to be crazy arithmetic, but it is standard in Fortran and
other languages.9 If programmers do not understand the arithmetics that the programming
language uses, then they will make and be baffled by such ‘errors’.
Error No. 2.
Further examination of the code reveals the unstable One-Pass standard deviation algorithm.
Here is one example (of three), starting at line 382:
do XAYear = AStart, AEnd

if (DataA(XAYear,XMonth,XAStn).NE.-9999) then
OpEn = OpEn + 1.0
OpTot = OpTot + (real(DataA(XAYear,XMonth,XAStn))/Factor)
OpTotSq = OpTotSq + ((real(DataA(XAYear,XMonth,XAStn))/Factor) ** 2)
end if
end do
... .
NormStdev(XMonth,XAStn) = Factor*sqrt((OpEn/(OpEn-1))*((OpTotSq/OpEn)-((OpTot/OpEn)**2)))
P
The variable OpTot accumulates the sum of DataA/Factor, xi /α, while OpTotSq accumu-
lates the sum-of-squares of DataA/Factor, (xi /α)2 . The variable OpEn counts the number
P
(n) of data elements accumulated and the symbol α stands for the variable Factor. The last
statement is the Fortran version of the mathematical statement:
s P !2 
n  (xi /α)2
P
(xi /α) 
Sdev(x) = α  − , (27)
n−1 n n
9
It may be possible to set a compiler switch that generates code to ‘trap’ an integer overflow at run-time.
See 5.2 below.

21
which is the formula for the One-Pass algorithm (with the factor α). We have shown in Sec-
tion 4 why the One-Pass algorithm is bad, but unfortunately, many scientist-programmers do
not know this because few have taken rigorous courses in programming and numerical meth-
ods. Ironically, in this piece of code the programmer has fixed the integer overflow problem
by converting DataA to real. However I suspect that this conversion was done because the
programmer is unsure about what happens in Fortran when an integer is divided by a real
(Factor).
Although the One-Pass algorithm may not cause errors for the data given here, it would have
been better to use the Two-Pass algorithm. Indeed, if the Two-Pass algorithm had been used,
Error No. 1 (integer overflow) would not have occurred. The One-Pass is thoroughly bad: it
is unstable and it is prone to overflow and underflow.
Perhaps the most insidious aspect of the One-Pass algorithm is this: it works most of the
time.
5.2 Comments on Coding Style, etc.
Most scientists and engineers write or use programs to calculate various things and to organ-
ise their data into neat tables, lists, etc. Here is a warning that I have had on my Numerical
Algorithms website for many years10
Writing high-quality mathematical software is a very demanding and difficult task which
is best left to experts.
A corollary to this is that there is a lot of junk software in use today because of the
inability of users to distinguish between good and bad software.
Scientists and engineers would be better off using the highly-regarded set of Fortran subrou-
tines called Lapack11, or using a numerical system such as Matlab, which incorporates most
of the standard numerical algorithms, but are written and tested by experts. Indeed, Matlab
does not rely on its own experts to write the low-level mathematical algorithms (called math
kernels), but uses those written by experts at Intel and AMD who know how to get the best
out of their own processors. These math kernels are supplied by the CPU manufacturers and
are tuned for each class of processor.
The program anomdtb.f90 was, obviously, written by amateurs, as we see below:
1. Virtually no comments: This is a messy program doing a messy job (processing badly-
organised data). Comments would have helped clean up some of this mess.
2. Global Variables, Subroutines called without arguments: All variables are declared in
the main program. This is a cardinal sin because it breaks the rule that use of global
variables should be minimized if not eliminated.
3. Pointers and dynamic array allocation. Why? : The arrays used in the program are not
very large and do not need to be reclaimed. Besides, do the amateurs who wrote this
program understand dangling pointers, memory leaks, garbage collection, etc.?
10
http://www.derekroconnor.net/NA/na2col.html
11
Free, and also in C.

22
4. Too many complicated, highly-parenthesised expressions. These are error-prone.
5. Poor structure. Three inline standard deviation calculations are performed rather than
using a single function.
6. Ignorance of machine arithmetics and their ranges: Harris’s comment

Probable answer: the high value is pushing beyond the single-precision default for
Fortran reals?
shows that he does not know the parameters of the floating point arithmetic he is using.
He shows also that he does not understand Fortran’s integer arithmetic, which is the
source the problem.
7. Ignorance of standard numerical algorithms and their limitations: Whoever wrote the
program did not know that the One-Pass algorithm for calculating the standard devia-
tion is unstable.

23
Appendix
Fortran Machine Parameters and Arithmetic
PROGRAM MachParms
implicit none
integer*4 :: k,u,v
real*4 :: S
real*8 :: D
print*
print*,"--- Silverfrost FTN95: Machine Arithmetic and Parameters ---"
print*
print*, "Largest Integer*4 = ", huge(u)
print*, "Machine Epsilon S = ", epsilon(S)
print*, "Precision S = ", precision(S)
print*, "Min Exponent S = ", minexponent(S)
print*, "Max Exponent S = ", maxexponent(S)
print*, "Largest S = ", huge(S)
print*, "Smallest S = ", tiny(S)
print*,"-------------------------------"
print*, "Machine Epsilon D = ", epsilon(D)
print*, "Precision D = ", precision(D)
print*, "Min Exponent D = ", minexponent(D)
print*, "Max Exponent D = ", maxexponent(D)
print*, "Largest D = ", huge(D)
print*, "Smallest D = ", tiny(D)
print*,"-------------------------------"
print*
u = -2147483643
v = 2147483643
do k = 1,10
u = u - 1
v = v + 1
print "(i3,i15,3x,b32.32,5x,i15,3x,b32.32)", k,u,u,v,v
end do
END
---- Silverfrost FTN95: Machine Arithmetic and Parameters ----
Largest Integer*4 = 2147483647
Machine Epsilon S = 1.192093E-07

Precision S = 6
Min Exponent S = -125
Max Exponent S = 128
Largest S = 3.402823E+38
Smallest S = 1.175494E-38
----------------------------------------------
Machine Epsilon D = 2.220446049250E-16
Precision D = 15
Min Exponent D = -1021
Max Exponent D = 1024
Largest D = 1.797693134862E+0308
Smallest D = 2.225073858507E-0308
----------------------------------------------

24
Demonstration of Integer Underflow and Overflow
1 -2147483645 10000000000000000000000000000011 2147483644 01111111111111111111111111111100

2 -2147483646 10000000000000000000000000000010 2147483645 01111111111111111111111111111101
3 -2147483647 10000000000000000000000000000001 2147483646 01111111111111111111111111111110
4 -2147483648 10000000000000000000000000000000 2147483647 01111111111111111111111111111111
5 2147483647 01111111111111111111111111111111 -2147483648 10000000000000000000000000000000
6 2147483646 01111111111111111111111111111110 -2147483647 10000000000000000000000000000001
7 2147483645 01111111111111111111111111111101 -2147483646 10000000000000000000000000000010
8 2147483644 01111111111111111111111111111100 -2147483645 10000000000000000000000000000011
-2^{31} -1 0 +1 2^{31}-1
+1
-1
Figure 3: Integer Overflow
These are two comments on integer overflow in Fortran, found on the WWW:
• Peter wrote:
I understand from previous postings that the Fortran standard does not require any check-
ing for integer overflow.
I notice the Intel Fortran compiler 9.1 has removed the compiler switch to check for inte-
ger overflow. How have other compiler manufacturers dealt with this?
On the PC, the Pentium has an overflow flag which is set by integer arithmetic, so overflow
checking is a single conditional jump instruction. So I am wondering if there is some
reason why this capability should be removed.
• Hewlett-Packard: Handling Integer Overflow
Trapping on integer overflow is disabled by default for Fortran and C; an integer overflow
does not generate a SIGFPE error. Detecting integer overflows requires not only that the
trap be enabled but also that the compiler insert special code in the executable file to check
for overflows.
To enable integer overflow checking for Fortran, use a !$HP$ CHECK_OVERFLOW INTEGER ON
directive (in HP Fortran/9000, use $CHECK_OVERFLOW INTEGER_4 or INTEGER_2) to obtain
the overflow checking code, and use an ON INTEGER OVERFLOW statement to handle the trap.
(The !\$H$ CHECK_OVERFLOW directive does not enable checking for operations in libraries.
Using the exponentiation operator involves a library call in HP Fortran, so it is not possi-
ble to enable integer overflow checking for exponentiation operations.) There is no way
to enable integer overflow checking in C. HP C provides no mechanism to insert over-
flow checking code into your executable, because the C language does not define integer
overflow as an error.

25
References
[1] Donald E. Knuth, The Art of Computer Programming: Seminumerical Algorithms, 2nd Edition, Vol. 2,
Addison-Wesley, 1981.
[2] Lloyd N. Trefethen and David Bau III, Numerical Linear Algebra, SIAM, 1997.
[3] T. F. Chan and G. H. Golub and R. J. LeVeque, Updating formulae and a pairwise algorithm for
computing sample variances, Technical Report STAN-CS-79-773, Stanford University, Dept. of
Computer Science, 1979.
[4] Yun He and Chris H.Q. Ding, “Using Accurate Arithmetics to Improve Numerical Reproducibility and
Stability in Parallel Applications”, Journal of Supercomputing 18 (2001), no. 3.
[5] Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd Edition, SIAM, Philadelphia,
2002.
[6] Arnold Neumaier, Introduction to Numerical Analysis, Cambridge University Press, 2001.

26

O'Connor - Two Simple Statistical Calculations and ClimateGate

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

O'Connor - Two Simple Statistical Calculations and ClimateGate

Uploaded by

Copyright:

Available Formats

Two Simple Statistical Calculations

Started : February 14, 2008,

Yet the errors do not come from the art

Errors in such simple calculations usually indicate that a problem is ill-conditioned, or an

1.1 The Sea Surface Heights Problem

2. The average of these 64 × 120 = 7680 numbers is then calculated.

Table 1: He and Ding’s Results using 16-digit Precision

Summation Order Value of sum Rel. Error Corr. Digs.

1.2 The Microsoft Excel Problem

c Derek O’Connor, February 6, 2010

Table 2: Excel 2000 Results

c Derek O’Connor, February 6, 2010

2.1 Floating Point Arithmetic

c Derek O’Connor, February 6, 2010

A Summary of Floating Point Number Systemsa

1. Range : bemin −1 ≤ | x̂| ≤ (1 − b−p ) bemax .

2. Machine Epsilon : ǫ M = b1−p .

3. Unit Roundoff (to Nearest) : u = 12 b1−p = 12 ǫ M

4. Roundoff Error : x̂ = fl(x) = x(1 + δx ), where |δx | ≤ u

5. Fundamental Axiom of Floating Point Arithmetic

fl(x ◦ y) = (x ◦ y)(1+δ), where |δ| ≤ u and ◦ is one of {+, −, ∗, /}.

1. Range : 2.225 × 10−308 ≈ 2−1022 ≤ | x̂ | < 21024 ≈ 1.798 × 10308

2. Machine Epsilon : ǫ M = 21−53 ≈ 2.2 × 10−16 .

3. Unit Roundoff : u ≈ 1.1 × 10−16 .

2.2 The Condition of a Problem

c Derek O’Connor, February 6, 2010

Definition 1 (Condition of a Problem).

Problems, Data, Algorithms, and Programs

c Derek O’Connor, February 6, 2010

Trefethen and Bau’s Definition of Condition

The process of solving a problem may be viewed as a mapping f : X → Y of a

If f is differentiable then we have δf /δx = J f (x) = [∂ f /∂xi ]ni=1 , the Jacobean

Example 1 (Subtraction). Let f (x1 , x2 ) = x1 − x2 , i.e., f : R2 → R. Then we

kJ f (x)k∞ max{|x1 |, |x2 |}

2. The condition of a problem has nothing to do with the algorithm used

c Derek O’Connor, February 6, 2010

2.3 Error, Stability, and Condition

Let x be an exact numerical vector and x̂ be an approximation to x. Then we define

E x = kx − x̂k is the absolute error in x, and

Recall that the process of solving a problem may be viewed as a mapping f : X → Y of a

c Derek O’Connor, February 6, 2010

3 The Summation Problem

algorithm sum(x, n) →s algorithm gsum(X) →X

s := 0 while |X| > 1 do

3.1 The Condition of the Summation Problem

c Derek O’Connor, February 6, 2010

κ1 (s(x)) = n(1 + nM),

3.2 The Condition of the SSH Problem

er (s) ≈ κ1 ǫ M ≈ 1021 × 10−16 = 105 or 10, 000, 000% (14)

c Derek O’Connor, February 6, 2010

If the answer is highly sensitive to perturbations,

A further maxim from Trefethen

No physical constants are known to more than around eleven digits,

4 The Sum of Squared Deviations Problem

c Derek O’Connor, February 6, 2010

algorithm TwoPassSSQ( x, n) algorithm OnePassSSQ( x, n)

4.1 Analysis of the One and Two Pass Algorithms

c Derek O’Connor, February 6, 2010

F(b, p, emin , emax )

c Derek O’Connor, February 6, 2010

, from equations (24), (25), (26),

c Derek O’Connor, February 6, 2010

4.2 Testing the One and Two Pass Algorithms