INTERNATIONAL JOURNAL OF CLIMATOLOGY, VOL. 17, 25–34 (1997)

HOMOGENIZATION OF SWEDISH TEMPERATURE DATA. PART I: HOMOGENEITY TEST FOR LINEAR TRENDS

HANS ALEXANDERSSON¹ AND ANDERS MOBERG²

¹ Swedish Meteorological and Hydrological Institute, S-601 76 Norrköping, Sweden
email: halexandersson@smhi.se
² Department of Physical Geography, Stockholm University, S-106 91 Stockholm, Sweden
email: moberg@natgeo.su.se

Received 18 July 1995
Revised 15 February 1996
Accepted 17 May 1996

ABSTRACT
A new test for the detection of linear trends of arbitrary length in normally distributed time series is developed. With this test it
is possible to detect and estimate gradual changes of the mean value in a candidate series compared with a homogeneous
reference series. The test is intended for studies of artificial relative trends in climatological time series, e.g. an increasing
urban heat island effect. The basic structure of the new test is similar to that of a widely used test for abrupt changes, the
standard normal homogeneity test. The test for abrupt changes is found to remain unaltered after an important generalization.
Int. J. Climatol., Vol. 17, 25–34 (1997) (No. of figures: 1 No. of tables: 2 No. of refs: 19)

KEY WORDS: homogeneity; trends; breaks in trend; normal distribution; climate; time series; urban heat island.

1. INTRODUCTION
The first stage in climate change studies based on long climate records is almost inevitably a homogeneity testing
of climate data. One type of non-homogeneity in long meteorological time series is sudden shifts of the mean
level compared with surrounding sites. Such unrepresentative shifts are often related to relocations of the station
but also may be caused by changes in observing schedules and practices, changes in instrument exposure or
abrupt changes in the immediate environment (Heino, 1994). Changes in the surroundings also may be more
gradual in the case of an urban influence, which affects mainly temperature data (Landsberg, 1981). Gradual
changes also may be caused by trees growing in height, which reduces wind speeds and causes changes in the
catchment efficiency of precipitation gauges (especially when the precipitation falls as snow). Thus there are
situations when homogeneity testing benefits from a linear trend model, as well as other situations when an abrupt
change is a better model.
Tests for detection of non-homogeneities in geophysical data have a fairly long history. Some older and well-
known techniques are the subjective double mass curve technique (Bruce and Clark, 1966) and different non-
parametric run tests (Lindgren, 1968). Generally we require objective methods and use parametric tests because
they are more powerful and give more quantitative information than non-parametric ones. Two parametric tests
with a strong capacity to detect and quantify abrupt non-homogeneities have been discussed in the meteorological
literature. Both are used for the detection of a single shift of the mean level. The bivariate test was developed by
Maronna and Yohai (1978) and was first applied to precipitation data by Potter (1981). The standard normal
homogeneity test (SNHT) was developed and applied to precipitation data by Alexandersson (1984, 1986). It has
further been adapted to climate data problems by, for example, Hanssen-Bauer et al. (1991), Tuomenvirta and
Heino (1993) and Hanssen-Bauer and Førland (1994). In an extensive intercomparison (Easterling and Peterson,
1992) of different tests, these two fairly closely related tests were by far the best tests for revealing and dating
single and sudden shifts in artificial data. The SNHT also has been discussed in a slightly different version in the
statistical literature (Hawkins, 1977). Techniques for handling and correcting temperature data with urban trends
have been discussed by, for example, Karl et al. (1988) and Portman (1993).

CCC 0899-8418/97/010025-10 © 1997 by the Royal Meteorological Society

Here we will develop the ideas of the SNHT for single shifts into a test for the existence of a linear trend of
arbitrary length. The main text essentially contains a description of the mathematical structure of both the shift
and the trend tests. Four Appendices are included. Mathematical symbols are listed in Appendix 1. Critical levels
are derived and given in Appendix 2. Some idealized examples are illustrated in Appendix 3. Finally, we briefly
present two variants of the SNHT for single shifts in Appendix 4; in particular we demonstrate the fact that the
original single shift test remains unaffected by an important generalization of the alternative hypothesis.
This is the first part in a series of three papers. In Parts II and III we will use the tests for single shifts and trends
in studies of long Swedish temperature data series, as well as discuss their practical aspects.

2. THE REFERENCE VALUE


We will use $Y$ to denote our candidate series and $Y_i$ to denote a specific value (e.g. annual accumulated
precipitation or annual mean temperature) at year (or other time unit) $i$. Furthermore, $X_j$ will denote one of the
surrounding reference sites (the $j$th of a total of $k$) and $X_{ji}$ a specific value from that site. To detect relative
non-homogeneities, we form ratios (by tradition used in precipitation studies) or differences (here primarily intended
to be used on temperature data) according to

$$Q_i = Y_i \Bigg/ \left\{ \frac{\sum_{j=1}^{k} \rho_j^2 X_{ji} \bar{Y} / \bar{X}_j}{\sum_{j=1}^{k} \rho_j^2} \right\} \tag{1}$$

and

$$Q_i = Y_i - \left\{ \frac{\sum_{j=1}^{k} \rho_j^2 \left[ X_{ji} - \bar{X}_j + \bar{Y} \right]}{\sum_{j=1}^{k} \rho_j^2} \right\} \tag{2}$$

We call the denominator in equation (1) and the second term on the right-hand side of equation (2) (both expressed
within brackets) reference values, as they are intended to be reasonable and stable estimates for the candidate site
using a set of neighbouring reference stations. In these equations $\rho_j$ denotes the correlation coefficient between
the candidate site and a surrounding station. This coefficient must be positive. Bars denote mean values, which
have been incorporated for normalizing reasons. The normalizing is important because it allows us to use
different sets of neighbouring stations at different years, including shorter and non-complete records, when we
calculate reference values. The normalizing also causes the Q-values to fluctuate around 1 for equation (1) and
around 0 for equation (2). It is necessary that the mean values of $Y$ and $X_j$ are calculated for one common time
period for all $j = 1, \ldots, k$. Otherwise the size of non-homogeneities may be underestimated or missed by the test.
The correlation coefficients, $\rho_j$, need not for algebraic reasons be estimated from the same common time period,
but it seems reasonable to use one common period for all stations. In Parts II and III we use at least 20 years for
the common period (Moberg and Alexandersson, 1997; Moberg and Bergström, 1997).
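
As an illustration of equations (1) and (2), the following is a minimal sketch, not the authors' code. The function name `q_series` and its arguments are hypothetical, and the sketch assumes complete records of equal length, so that all means are taken over the same common period.

```python
from statistics import mean

def q_series(y, refs, rho, ratio=False):
    """Return the Q-series of equation (1) (ratio=True) or equation (2) (ratio=False).

    y    : candidate values Y_i
    refs : list of k reference series X_j, each as long as y (assumption: no gaps)
    rho  : list of k positive correlation coefficients rho_j
    """
    if any(r <= 0 for r in rho):
        raise ValueError("correlation coefficients must be positive")
    y_bar = mean(y)
    x_bars = [mean(x) for x in refs]
    w = [r * r for r in rho]                      # weights rho_j^2
    w_sum = sum(w)
    q = []
    for i, y_i in enumerate(y):
        if ratio:
            # reference value of equation (1): weighted, rescaled neighbours
            ref = sum(wj * x[i] * y_bar / xb
                      for wj, x, xb in zip(w, refs, x_bars)) / w_sum
            q.append(y_i / ref)
        else:
            # reference value of equation (2): weighted, shifted neighbours
            ref = sum(wj * (x[i] - xb + y_bar)
                      for wj, x, xb in zip(w, refs, x_bars)) / w_sum
            q.append(y_i - ref)
    return q
```

With a candidate that differs from each neighbour only by a constant offset, the difference form returns values at 0; with a candidate that is a constant multiple of its neighbour, the ratio form returns values at 1.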
The reference value is an important part of the tests although reformulations of the reference value have no
influence on the theory of the tests. We will just make a few more short comments here.
Peterson and Easterling (1994) suggested using successive differences instead of the values themselves to
calculate the correlation coefficients used in equations (1) and (2). This will reduce the risk of making poor
estimates of correlations between the candidate site and a reference site if one or both of them have non-
homogeneities within the common time period used for the calculation of correlation coefficients.
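
The effect of using successive differences can be sketched as follows. This is an illustration under our own assumptions, not the procedure of Peterson and Easterling (1994) in detail; `pearson` and `diff_correlation` are hypothetical helper names.

```python
from math import sqrt

def pearson(u, v):
    """Ordinary product-moment correlation of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

def diff_correlation(y, x):
    """Correlation of the successive differences y[i+1]-y[i] and x[i+1]-x[i]."""
    dy = [b - a for a, b in zip(y, y[1:])]
    dx = [b - a for a, b in zip(x, x[1:])]
    return pearson(dy, dx)

# Invented example: y equals x plus an artificial shift of 3 in the second half.
x = [0, 1, 0, 2, 1, 3, 1, 2, 0, 1]
y = [0, 1, 0, 2, 1, 6, 4, 5, 3, 4]
r_raw = pearson(y, x)
r_diff = diff_correlation(y, x)
```

In this invented example the shift disturbs only one of the nine differences, so `r_diff` stays higher than `r_raw`.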
It is tempting to use the optimum interpolation technique (Gandin, 1963) to create a reference series. Although
this technique is well suited for the interpolation of missing data, it is not satisfactory in this application because
it is oversensitive to the correlation coefficient matrix (it also uses correlations between the reference sites). This
leads to an effective masking of non-homogeneities when the reference sites are not perfect (Alexandersson,
1994).

The standard normal homogeneity tests are applied to the standardized series

$$Z_i = (Q_i - \bar{Q}) / \sigma_Q \tag{3}$$

We use $(n - 1)$-weighted standard deviations. This is important to mention because it influences the test statistic
and the critical levels.
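
A minimal sketch of equation (3), assuming the Q-series is held in a plain Python list. The name `standardize` is ours. `statistics.stdev` divides by n − 1, which matches the weighting stated above.

```python
from statistics import mean, stdev

def standardize(q):
    """Return z_i = (q_i - mean) / s, with s the (n - 1)-weighted standard deviation."""
    q_bar = mean(q)
    s_q = stdev(q)          # sample standard deviation, divisor n - 1
    return [(v - q_bar) / s_q for v in q]

z = standardize([1.0, 2.0, 4.0, 7.0])
```

With this weighting the standardized values sum to 0 and their squares sum to n − 1.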

3. THE STANDARD NORMAL HOMOGENEITY TEST FOR SINGLE SHIFTS


A single shift of the mean level at the candidate site $Y$ can be expressed formally with a null hypothesis ($H_0$) and
an alternative hypothesis ($H_1$) as

$$H_0: Z_i \in N(0, 1), \quad i \in \{1, \ldots, n\}$$

$$H_1: \begin{cases} Z_i \in N(\mu_1, 1), & i \in \{1, \ldots, a\} \\ Z_i \in N(\mu_2, 1), & i \in \{a+1, \ldots, n\} \end{cases}$$

where N denotes the normal distribution with its parameters (mean value and standard deviation). The null
hypothesis, which is the ideal case with a homogeneous record from the candidate site, follows directly from the
standardization in equation (3), except that we have added the assumption that we can use the normal distribution.
The alternative hypothesis says that at some unknown time the mean value changes abruptly. The standard
deviation is assumed not to change at this point. This is a simplification and in fact it should as a rule be slightly
less than one for the series before and after the year with a possible break. However, the test statistic (equation 4)
will not be affected if we introduce a common, unknown, standard deviation in the alternative hypothesis, as
shown in Appendix 4. We will also discuss the case with two, possibly different, standard deviations before and
after a break in Appendix 4.
Based upon the two hypotheses we can derive a test quantity, i.e. a quantity that is the most effective one to
separate H0 from H1. This is usually done by forming a likelihood ratio, i.e. the ratio of the probability that H1 is
correct, given the observed series $\{z_i\}$, to the probability that H0 is correct. After some calculations
(Alexandersson, 1986) we obtain the test statistic as

$$T^s_{\max} = \max_{1 \le a \le n-1} \{T^s_a\} = \max_{1 \le a \le n-1} \{a\bar{z}_1^2 + (n-a)\bar{z}_2^2\} \tag{4}$$

where $\bar{z}_1$ and $\bar{z}_2$ are the arithmetic averages of the $\{z_i\}$ sequence before and after the shift. The value of $a$
corresponding to this maximum is then the year most probable for the break, or more precisely the last year at the
old level $\bar{z}_1$ (or $\mu_1$, the theoretical analogue). (Note that $\{T^s_a\}$ is an ordered sequence; hence it can be regarded as
a separate time series. In Appendix 3 we show how the shape of plots of this time series is determined by the
character of some idealized Q-series.) If $T^s_{\max}$ is above a certain critical level we say that the null hypothesis of
homogeneity can be rejected at the corresponding significance level. If it is above the 95 per cent significance
level there is a risk, at most 5 per cent, that we are wrong when we reject the null hypothesis. The two levels of the
ratios or differences before and after the possible break are then

$$\bar{q}_1 = \sigma_Q \bar{z}_1 + \bar{Q} \tag{5a}$$

$$\bar{q}_2 = \sigma_Q \bar{z}_2 + \bar{Q} \tag{5b}$$

which are reverse uses of equation (3). If one intends to correct data for the period $\{1, \ldots, a\}$ then the values
within this period should be corrected by $\bar{q}_2 / \bar{q}_1$ in the ratio case (equation 1) and by $\bar{q}_2 - \bar{q}_1$ in the difference case
(equation 2). If the data contain only one shift, then we obtain a homogenized series where all data refer to the
present measuring situation.
We would also like to add that the highest possible value of $T^s_{\max}$ is obtained when the series of Q consists of
two parts (of any length) at constant levels $Q_1$ and $Q_2$. This maximum is $n - 1$. This fact, which is fairly easy to
show after using equation (3) and inserting into equation (4), can be used to check for programming errors.
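
The single shift test of equations (3) to (5) can be sketched as below. This is an illustrative implementation under our own naming (`snht_single_shift` is hypothetical), assuming a complete Q-series that is not constant. The example at the end uses the programming check mentioned above: two constant levels should give a maximum of n − 1.

```python
from statistics import mean, stdev

def snht_single_shift(q):
    """Single shift SNHT: return (t_max, a, q1, q2, t_series)."""
    n = len(q)
    q_bar, s_q = mean(q), stdev(q)            # stdev divides by n - 1, as in equation (3)
    z = [(v - q_bar) / s_q for v in q]
    total = sum(z)
    t_series, left = [], 0.0
    for a in range(1, n):                     # a = 1, ..., n - 1
        left += z[a - 1]                      # running sum of z_1..z_a
        z1 = left / a
        z2 = (total - left) / (n - a)
        t_series.append(a * z1 ** 2 + (n - a) * z2 ** 2)   # equation (4)
    t_max = max(t_series)
    a_hat = t_series.index(t_max) + 1         # last time step at the old level
    z1, z2 = mean(z[:a_hat]), mean(z[a_hat:])
    q1 = s_q * z1 + q_bar                     # equation (5a)
    q2 = s_q * z2 + q_bar                     # equation (5b)
    return t_max, a_hat, q1, q2, t_series

# Programming check: two constant levels, n = 10, shift after the sixth value.
t_max, a_hat, q1, q2, _ = snht_single_shift([0.0] * 6 + [1.0] * 4)
```

For this idealized input the statistic reaches n − 1 = 9 at a = 6, and the two estimated levels are the two constant values.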

The same results will be obtained if the problem is formulated in terms of a curve-fitting using the principle of
least squares. Then the sum to minimize is

$$S = \sum_{i=1}^{a} (z_i - \mu_1)^2 + \sum_{i=a+1}^{n} (z_i - \mu_2)^2 \tag{6}$$

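The ordinary operations $\partial S / \partial\mu_1 = 0$ and $\partial S / \partial\mu_2 = 0$ give $\mu_1 = \bar{z}_1$ and $\mu_2 = \bar{z}_2$. Because the standardization in
equation (3) makes the sum of all $z_i^2$ equal to $n - 1$, the minimum of $S$ is attained where the test quantity of
equation (4) is at its maximum:

$$\min(S) = (n - 1) - \max_{1 \le a \le n-1} \{a\bar{z}_1^2 + (n-a)\bar{z}_2^2\} \tag{7}$$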

This coincidence is a consequence of using the normal distribution (where squared deviations are involved) with
a common standard deviation. It is necessary, nevertheless, to have a complete statistical formulation of the
problem to be able to derive or simulate critical levels.
Both of these remarks, concerning the maximum value $(n - 1)$ and the least squares approach, are also valid
for the trend model.
We can also mention that it is more appropriate and rigorous to use a simple t-test if we know that a series
being studied has one, and only one, possible risk for a break (at year $A$). In this case we do not need to
standardize but can use the Q-series directly and calculate

$$t = \frac{\bar{q}_1 - \bar{q}_2}{\sqrt{\dfrac{\sigma_1^2}{A} + \dfrac{\sigma_2^2}{n - A}}} \tag{8}$$
Most commonly, however, we have an incomplete knowledge of the possible causes for observed relative non-
homogeneities.
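
Equation (8) can be sketched as follows. The function name `known_break_t` is ours, and the sketch assumes that the two variances are the (n − 1)-weighted sample variances of the Q-values before and after year A, which the text does not state explicitly.

```python
from math import sqrt
from statistics import mean, variance

def known_break_t(q, A):
    """t statistic of equation (8) for a break known to lie after the A-th value."""
    before, after = q[:A], q[A:]
    q1, q2 = mean(before), mean(after)
    # Assumption: sample variances (divisor n - 1) within each part.
    return (q1 - q2) / sqrt(variance(before) / A + variance(after) / (len(q) - A))

t = known_break_t([1.0, 2.0, 3.0, 5.0, 6.0, 7.0], A=3)
```

Each part needs at least two values for the variance to be defined.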
The test for a single shift cannot properly handle series with many breaks. It is fairly easy to generalize the test
to two or more breaks (Alexandersson, 1995) but another alternative is to use the single shift test on two or more
consecutive parts of a complicated series.

4. THE STANDARD NORMAL HOMOGENEITY TEST FOR TRENDS


Now we will introduce a model where the mean level of the Q-series changes linearly from time a to b. We can
then say that this is a test for a trend of arbitrary length.
The null and the alternative hypotheses are expressed as:

$$H_0: Z_i \in N(0, 1), \quad i \in \{1, \ldots, n\}$$

$$H_1: \begin{cases} Z_i \in N(\mu_1, 1), & i \in \{1, \ldots, a\} \\ Z_i \in N\left(\mu_1 + (i - a)(\mu_2 - \mu_1)/(b - a),\ 1\right), & i \in \{a+1, \ldots, b\} \\ Z_i \in N(\mu_2, 1), & i \in \{b+1, \ldots, n\} \end{cases}$$

The sequence described by the mean values ($\mu_1$ etc.) in the alternative hypothesis is then assumed to be
continuous. The trend may also extend throughout the whole length of the series.
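Forming the likelihood ratio (see e.g. Lindgren, 1968) will then give

$$L(\mu_1, \mu_2, a, b) = \frac{(2\pi)^{-n/2} \exp\left\{ -\dfrac{1}{2} \left[ \sum\limits_{i=1}^{a} (z_i - \mu_1)^2 + \sum\limits_{i=a+1}^{b} \left( z_i - \left( \mu_1 + \dfrac{(i - a)(\mu_2 - \mu_1)}{b - a} \right) \right)^2 + \sum\limits_{i=b+1}^{n} (z_i - \mu_2)^2 \right] \right\}}{(2\pi)^{-n/2} \exp\left\{ -\dfrac{1}{2} \sum\limits_{i=1}^{n} z_i^2 \right\}} \tag{9}$$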

Maximizing $L$ with respect to the four parameters gives the test statistic for the trend test. One starts with
differentiation with respect to $\mu_1$ and $\mu_2$. A scheme to obtain the test statistic $T^t_{\max}$ can be written as

$$\begin{aligned} T^t_{\max} = \max_{1 \le a < b \le n} \{ & -a\mu_1^2 + 2a\mu_1\bar{z}_1 - \mu_1^2 SB - \mu_2^2 SA + 2\mu_1 SZB + 2\mu_2 SZA \\ & - 2\mu_1\mu_2 SAB - (n - b)\mu_2^2 + 2(n - b)\mu_2\bar{z}_2 \} \end{aligned} \tag{10}$$

where

$$SA = \sum_{i=a+1}^{b} (i - a)^2 / (b - a)^2 \tag{11a}$$

$$SB = \sum_{i=a+1}^{b} (b - i)^2 / (b - a)^2 \tag{11b}$$

$$SZA = \sum_{i=a+1}^{b} z_i (i - a) / (b - a) \tag{11c}$$

$$SZB = \sum_{i=a+1}^{b} z_i (b - i) / (b - a) \tag{11d}$$

$$SAB = \sum_{i=a+1}^{b} (b - i)(i - a) / (b - a)^2 \tag{11e}$$

Furthermore, $\mu_1$ and $\mu_2$ are obtained from

$$\mu_1 = \frac{a\bar{z}_1 + SZB - SL \times SAB}{a + SB + SK \times SAB} \tag{12a}$$

$$\mu_2 = \mu_1 SK + SL = \mu_1 \frac{(-SAB)}{SA + n - b} + \frac{(n - b)\bar{z}_2 + SZA}{SA + n - b} \tag{12b}$$
…

where $\bar{z}_1$ and $\bar{z}_2$ denote the arithmetic averages of the $\{z_i\}$ sequence before and after the trend section. Note that
$\mu_1$ and $\mu_2$ must be used in equations (5a) and (5b) to obtain the two fixed levels $\bar{q}_1$ and $\bar{q}_2$ before and after the
trend period. If $b = a + 1$, then $T^t_{\max}$ reduces to the single shift test statistic $T^s_{\max}$. In this case, $\mu_1$ and $\mu_2$ are equal
to $\bar{z}_1$ and $\bar{z}_2$, as in the single shift test.
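
The scheme of equations (10) to (12) can be sketched as a direct search over all pairs (a, b). This is an unoptimized illustration, not the authors' program. The name `snht_trend` and the parameter `min_len` (a lower bound on b − a, at least 1 and at most n − 1) are our own assumptions. The product (n − b) z̄2 is computed as the sum of the values after b, so that b = n needs no special case.

```python
from statistics import mean, stdev

def snht_trend(q, min_len=1):
    """Trend SNHT: return (t_max, a, b, q1, q2) following equations (10)-(12)."""
    n = len(q)
    q_bar, s_q = mean(q), stdev(q)               # (n - 1)-weighted, equation (3)
    z = [(v - q_bar) / s_q for v in q]
    best = None
    for a in range(1, n):                        # last time step at the old level
        z1 = sum(z[:a]) / a
        for b in range(a + min_len, n + 1):      # last time step of the trend
            d = b - a
            idx = range(a + 1, b + 1)            # 1-based i = a+1, ..., b
            SA = sum((i - a) ** 2 for i in idx) / d ** 2            # (11a)
            SB = sum((b - i) ** 2 for i in idx) / d ** 2            # (11b)
            SZA = sum(z[i - 1] * (i - a) for i in idx) / d          # (11c)
            SZB = sum(z[i - 1] * (b - i) for i in idx) / d          # (11d)
            SAB = sum((b - i) * (i - a) for i in idx) / d ** 2      # (11e)
            tail = sum(z[b:])                    # equals (n - b) * z2_bar; 0 if b == n
            SK = -SAB / (SA + n - b)
            SL = (tail + SZA) / (SA + n - b)
            mu1 = (a * z1 + SZB - SL * SAB) / (a + SB + SK * SAB)   # (12a)
            mu2 = mu1 * SK + SL                                     # (12b)
            t = (-a * mu1 ** 2 + 2 * a * mu1 * z1 - mu1 ** 2 * SB - mu2 ** 2 * SA
                 + 2 * mu1 * SZB + 2 * mu2 * SZA - 2 * mu1 * mu2 * SAB
                 - (n - b) * mu2 ** 2 + 2 * mu2 * tail)             # (10)
            if best is None or t > best[0]:
                best = (t, a, b, mu1, mu2)
    t_max, a, b, mu1, mu2 = best
    return t_max, a, b, s_q * mu1 + q_bar, s_q * mu2 + q_bar        # (5a), (5b)

ramp = snht_trend([0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0])
step = snht_trend([0.0] * 6 + [1.0] * 4)
```

For the idealized ramp the search recovers a = 3 and b = 7 with the statistic at its upper bound n − 1 = 8. For the idealized step it returns b = a + 1, the single shift case. The triple loop costs on the order of n³ operations, which is acceptable for series of a few hundred values; running sums would be the obvious refinement.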
We suggest that a minimum number of years belong to the trend section, i.e. a minimum value of $b - a$. One
can argue that critical levels must be simulated separately for this specific situation. On the other hand the SNHT
for single shifts has often been used with a constraint: if a significant break occurs within the five first or last
years, no corrections should be made (e.g. Hanssen-Bauer et al., 1991) because there are too few years to be able
to obtain a stable correction factor ($\bar{q}_2 / \bar{q}_1$) or difference ($\bar{q}_2 - \bar{q}_1$). This seems wise from a meteorological point
of view, but it can be argued that to obtain correct critical levels for this situation one should simulate new values
with this same constraint in the simulation procedure. Alternatively, one can say that the test is even stronger (i.e.
it will consider less than 5 per cent of the homogeneous series as non-homogeneous when we test at the 95 per
cent level) when we omit breaks near the ends. We prefer not to complicate things too much, so we simply say
that it is wise to require a trend period of more than a few years, let us say 5 years, to accept it as a real trend. In
Part II more practical aspects of the testing will be discussed, including advice about choosing between an abrupt
break and a trend when both are significant.

5. CONCLUDING REMARKS
A new test for the detection of artificial trends of arbitrary length has been developed along lines similar to the
Standard Normal Homogeneity Test for abrupt shifts. The new test is intended primarily for studies of artificial
warming at temperature monitoring stations located in urban environments, but it can also be used for other
purposes. Here we have basically given a strict mathematical description of the tests. In two companion papers
(Parts II and III of this trilogy) we will provide a more instructive discussion of the tests, as well as demonstrate
their utilities and limitations, which are determined by the true nature of the non-homogeneities. Furthermore, the
tests will be used for obtaining a homogenized set of long monthly temperature series from Sweden. The
homogenized data will be used for a study of the temperature changes in all Sweden since 1861 (Part II) and in
Stockholm and Uppsala since the 1700s (Part III).

ACKNOWLEDGEMENTS
The authors wish to thank our Nordic colleagues who put forward the idea of developing a trend test along lines
similar to the single shift test. Especially Eirik Førland at the Norwegian Meteorological Institute argued that
such a test would be valuable. We also wish to thank the Nordic Environmental research programme and the
European Commission Environment Programme for financial support to the NACD (North Atlantic
Climatological Dataset) project (project coordinators: Bengt Dahlström, Sweden and Povl Frich, Denmark),
within which the lead author has been working.

APPENDIX 1
Mathematical symbols
$a$  Last year (or other time unit) before a possible shift or trend
$b$  Last year (or other time unit) of a possible trend
$e$  Base of natural logarithm
$i$  Time unit index
$j$  Reference station index
$k$  Total number of reference stations
$l$  Logarithm of likelihood ratio
$n$  Number of values in a time series
$\bar{q}_1$  Estimated mean level of a series of differences (or ratios) before a possible shift or trend
$\bar{q}_2$  Estimated mean level of a series of differences (or ratios) after a possible shift or trend
$t$  Test statistic for the ordinary t-test
$\bar{z}_1$  Estimated mean level of standardized differences (or ratios) before a possible shift or trend
$\bar{z}_2$  Estimated mean level of standardized differences (or ratios) after a possible shift or trend
$A$  Last year (or other time unit) before a definitely known shift
$C$  Auxiliary symbol used in an inequality (in Appendix 4)
$L$  Likelihood ratio
$N(\mu, \sigma)$  Normal (Gaussian) distribution with mean value $\mu$ and standard deviation $\sigma$
$Q$  A difference (or ratio) between a value at a candidate station and a weighted average of values from a set of reference stations
$\bar{Q}$  Mean value of $Q$
$S$  A sum of squares
$SA$, $SB$, $SK$, $SL$, $SAB$, $SZA$, $SZB$  Auxiliary symbols used in the calculation of the test statistic for the trend test
$T$  A test value for the single shift test
$T^s_a$  A test value for the single shift test at year $a$ ($= T$)
$T^s_{\max}$  Test statistic for the single shift test. Maximum of $T^s_a$
$T^{sl}_{\max}$  Test statistic for the single shift test with a common unknown standard deviation (in Appendix 4). Equivalent to $T^s_{\max}$
$T^{s2}_{\max}$  Test statistic for the single shift test with two unknown standard deviations (in Appendix 4)
$T^t_{\max}$  Test statistic for the trend test
$T_{90}$, $T_{95}$, $T_{97.5}$  Critical levels at 90, 95 and 97.5 per cent significance
$X$  Value at a reference station
$\bar{X}$  Mean value of $X$
$Y$  Value at the candidate station
$\bar{Y}$  Mean value of $Y$
$\mu_1$  Theoretical mean level of standardized differences (or ratios) before a possible shift or trend
$\mu_2$  Theoretical mean level of standardized differences (or ratios) after a possible shift or trend
$\rho$  Correlation coefficient
$\sigma$  Standard deviation of standardized differences (or ratios) (in Appendix 4)
$\sigma_1$  Standard deviation of standardized differences (or ratios) before a possible shift (in Appendix 4)
$\sigma_2$  Standard deviation of standardized differences (or ratios) after a possible shift (in Appendix 4)
$\sigma_Q$  Standard deviation of a series of Q-values
$\partial$  Differential operator

APPENDIX 2
Critical levels
The exact distribution of the test statistic under H0 is not known for the two test formulations discussed. It is
therefore necessary to simulate critical levels of $T^s_{\max}$ and $T^t_{\max}$ using large sets of random normal numbers. The
critical levels depend on the number of values in the series. This number is denoted by $n$.
Typically $2 \times 10^6$ standard normal random numbers were used to simulate critical levels for $n = 100$, giving
20 000 series. Each of these series was then standardized to obtain a mean value of exactly zero and a standard
deviation of exactly one. Then the lowest value of the 10 per cent largest test statistic values derived from these
20 000 series is an estimate of the $T_{90}$ critical value.
It turns out that the critical levels for the single shift and the trend test (with no constraints on $b - a$ when
critical levels are derived) are practically equal (Table AI). This seems to be so because in the simulations under
H0, i.e. using homogeneous random numbers, the largest breaks practically always are of the sudden shift type.
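
The simulation procedure described above can be sketched for the single shift test as follows. The function names, the default of 2000 series and the fixed seed are our own choices for a quick illustration; the original work used far larger samples (20 000 series for n = 100), so estimates from this sketch are correspondingly noisier.

```python
import random
from statistics import mean, stdev

def t_max_single_shift(z):
    """Maximum of a*z1_bar^2 + (n-a)*z2_bar^2 over a = 1..n-1 for standardized z."""
    n, total, left, best = len(z), sum(z), 0.0, 0.0
    for a in range(1, n):
        left += z[a - 1]
        # a * z1_bar^2 = left^2 / a, and likewise for the second part
        t = left ** 2 / a + (total - left) ** 2 / (n - a)
        best = max(best, t)
    return best

def simulate_critical_level(n, level=0.95, n_series=2000, seed=1):
    """Estimate the critical level as the lowest of the (1 - level) largest statistics."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_series):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m, s = mean(x), stdev(x)                  # restandardize each series exactly
        stats.append(t_max_single_shift([(v - m) / s for v in x]))
    stats.sort()
    return stats[round(level * n_series)]         # first value inside the upper tail

t95_n20 = simulate_critical_level(20)
```

The estimate for n = 20 should fall close to the corresponding entry of Table AI, within the sampling error of the small simulation.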

Table AI. Critical levels for the trend and single shift tests

n        10     20     30     40     50     60     70     80     90     100    150    250

T90      5.05   6.10   6.65   7.00   7.25   7.40   7.55   7.70   7.80   7.85   8.05   8.35
T95      5.70   6.95   7.65   8.10   8.45   8.65   8.80   8.95   9.05   9.15   9.35   9.70
T97.5    6.25   7.80   8.65   9.25   9.65   9.85   10.1   10.2   10.3   10.4   10.8   11.2

APPENDIX 3
Four idealized examples
Here we demonstrate briefly how the shape of the $\{T^s_a\}$ sequence (mentioned in section 3; hereafter denoted
T-series for simplicity) for the single shift test is affected by some idealized Q-series (note: constant values within
intervals). The analogy for the T-series in the trend test depends on two time points ($a$ and $b$) and is more difficult
to illustrate.

Figure A1. Idealized examples of Q-series and the corresponding T-series for the single shift test. The 95 per cent critical level, T95 , is
indicated with small dots. (a) A single shift. (b) A perfect trend. (c) Three distinct shifts. (d) A perfect trend interrupted by a single shift

We can summarize the results from these idealized tests as follows.

(i) A single shift (Figure A1(a)) is estimated correctly by both the shift and the trend test and causes the T-series
for the shift test to have two concave curved sections converging to a peak at the place where the shift
occurs.
(ii) A trend (Figure A1(b)) can be estimated correctly only with the trend test, but the shape of the T-series for
the shift test is typically dome-shaped and convex within the trend section.
(iii) Complicated series with multiple shifts or mixed shifts and trends are difficult to handle. Such series have to
be tested in subsections. Note that in Figure A1(c), the mean value of the first three Q-levels equals the
fourth level, completely masking the third break when the entire series is tested. One way to proceed is to
divide the series into subsections after a visual inspection of the Q-series and T-series. An interrupted trend,
as in Figure A1(d), is another difficult type of Q-series that can be expected in observed data. Note that the
T-series has a local minimum at the time of the discontinuity.

Strategies for testing series with multiple non-homogeneities are discussed and demonstrated with realistic
examples in Parts II and III. We will then use the shape of the T-series as an aid for distinguishing between a shift
and a trend when both are significant, as well as for defining subsections of a Q-series, to which the tests can be
applied.
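
Points (i) and (ii) above can be reproduced numerically with a short sketch. The two input series are invented idealizations of our own (a step after the 12th of 20 values and a perfect ramp over 20 values), not the series plotted in Figure A1, and `t_series` is a hypothetical helper name.

```python
from statistics import mean, stdev

def t_series(q):
    """Return the T-series {T_a}, a = 1..n-1, of the single shift test."""
    n = len(q)
    q_bar, s_q = mean(q), stdev(q)
    z = [(v - q_bar) / s_q for v in q]
    total, left, out = sum(z), 0.0, []
    for a in range(1, n):
        left += z[a - 1]
        out.append(left ** 2 / a + (total - left) ** 2 / (n - a))
    return out

step = [0.0] * 12 + [1.0] * 8              # idealized single shift
ramp = [float(i) for i in range(20)]       # idealized trend through the whole series

t_step = t_series(step)
t_ramp = t_series(ramp)
peak_step = t_step.index(max(t_step)) + 1  # position a of the maximum
peak_ramp = t_ramp.index(max(t_ramp)) + 1
```

For the step, the T-series peaks sharply at the shift and reaches n − 1 = 19. For the ramp, it rises and falls smoothly around the middle of the series and stays below n − 1, which is the dome shape described in (ii).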

APPENDIX 4
The single shift test, two variants
The alternative hypothesis for the single shift case can be reformulated as

$$H_1: \begin{cases} Z_i \in N(\mu_1, \sigma), & i \in \{1, \ldots, a\} \\ Z_i \in N(\mu_2, \sigma), & i \in \{a+1, \ldots, n\} \end{cases}$$

This means that we allow the standard deviations within the two parts of the series, before and after a possible
break, to be lower than unity, the exact value for the whole series. It is natural that the standard deviation is lower
in the two parts because of the change in mean level.
Forming the likelihood ratio gives

$$L = (\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \left[ \sum_{i=1}^{a} (z_i - \mu_1)^2 + \sum_{i=a+1}^{n} (z_i - \mu_2)^2 \right] + \frac{1}{2} \sum_{i=1}^{n} z_i^2 \right\} \tag{A1}$$

Taking the logarithm of $L$ ($l = \ln(L)$) and maximizing using the standard technique gives (using $\partial l / \partial\sigma = 0$ and
also $\mu_1 = \bar{z}_1$ and $\mu_2 = \bar{z}_2$ obtained from $\partial l / \partial\mu_1 = 0$ and $\partial l / \partial\mu_2 = 0$ respectively)

$$\sigma = \sqrt{ \frac{ \sum_{i=1}^{a} (z_i - \bar{z}_1)^2 + \sum_{i=a+1}^{n} (z_i - \bar{z}_2)^2 }{ n } } \tag{A2}$$

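from which the test statistic $T^{sl}_{\max}$ can be obtained as

$$T^{sl}_{\max} = \max_{1 \le a \le n-1} \left\{ -n \ln\sigma - \frac{1}{2\sigma^2} \left[ \sum_{i=1}^{a} (z_i - \bar{z}_1)^2 + \sum_{i=a+1}^{n} (z_i - \bar{z}_2)^2 \right] + \frac{1}{2} \sum_{i=1}^{n} z_i^2 \right\} \tag{A3}$$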

The first two sums within the bracket add simply to $n\sigma^2$ whereas the third sum equals $n - 1$. Multiplying by 2,
as in the original SNHT (Alexandersson, 1984 and 1986), then gives

$$T^{sl}_{\max} = \max_{1 \le a \le n-1} \{ -2n \ln\sigma - 1 \} \tag{A4}$$

If (A2) is rewritten as

$$\sigma = \frac{1}{\sqrt{n}} \sqrt{ n - 1 - (a\bar{z}_1^2 + (n - a)\bar{z}_2^2) } \tag{A5}$$

we recognize, within the parenthesis, the test quantity of the simple, original test. When the value within the
parenthesis is at its maximum, then $\sigma$ is at its minimum and $-\ln\sigma$ is at its maximum, so the tests are really
equivalent. This can be shown more strictly starting from the inequality

$$-2n \ln\left\{ \frac{1}{\sqrt{n}} \sqrt{ n - 1 - (a\bar{z}_1^2 + (n - a)\bar{z}_2^2) } \right\} - 1 \ge C \tag{A6}$$

Using the fact that the square root and the logarithm are monotonic functions, this inequality, which defines the
test statistic, can be rewritten so that it exactly equals the original SNHT formulation. This fact is really welcome
because it shows that the original formulation happened to be more general than expected!
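
The rewriting can be made explicit. The steps below are our own working from equations (A4) and (A5), with $T^s_a = a\bar{z}_1^2 + (n - a)\bar{z}_2^2$ as in equation (4); each step uses only the monotonicity of the logarithm and the exponential.

```latex
\begin{aligned}
-2n \ln\sigma - 1 \ge C
  &\iff \ln\sigma^2 \le -\frac{C + 1}{n} \\
  &\iff \sigma^2 = \frac{n - 1 - T^s_a}{n} \le e^{-(C + 1)/n} \\
  &\iff T^s_a \ge n - 1 - n\, e^{-(C + 1)/n}
\end{aligned}
```

Under these steps a critical level $C$ for the generalized statistic corresponds to the level $n - 1 - n e^{-(C+1)/n}$ for the original one, so the same series are flagged by both formulations.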
There may also be situations when it is more realistic to use two different standard deviations, $\sigma_1$ and $\sigma_2$, in the
alternative hypothesis:

$$H_1: \begin{cases} Z_i \in N(\mu_1, \sigma_1), & i \in \{1, \ldots, a\} \\ Z_i \in N(\mu_2, \sigma_2), & i \in \{a+1, \ldots, n\} \end{cases}$$
Without showing the mathematical details we obtain

$$T^{s2}_{\max} = \max_{2 \le a \le n-2} \{ -2a \ln\sigma_1 - 2(n - a) \ln\sigma_2 - 1 \} \tag{A7}$$

where

$$\sigma_1 = \sqrt{ \frac{ \sum_{i=1}^{a} \left( z_i - \sum_{i=1}^{a} z_i / a \right)^2 }{ a } } \tag{A8}$$

and

$$\sigma_2 = \sqrt{ \frac{ \sum_{i=a+1}^{n} \left( z_i - \sum_{i=a+1}^{n} z_i / (n - a) \right)^2 }{ n - a } } \tag{A9}$$

Table AII. Critical levels for the single shift test with two independent standard deviations

n        10     20     30     40     50     60     70     80     90     100    150    250

T90      13.35  13.70  13.95  14.15  14.25  14.30  14.35  14.35  14.40  14.40  14.45  14.45
T95      16.00  16.30  16.45  16.55  16.65  16.70  16.75  16.80  16.85  16.85  16.90  16.90
T97.5    18.60  18.85  19.05  19.20  19.30  19.35  19.40  19.45  19.50  19.50  19.55  19.55

This test has a drawback, because it too often gives breaks near the ends of series. This was observed when
simulations under H0 were made and also in some cases with real data. If a few values of $Q_i$ close to the ends
happen to have low variance, a very low value of $\sigma_1$ or $\sigma_2$ makes the first or second term in equation (A7) very
large. One way of reducing this tendency is, for example, to omit the first and last 10 years in such a test.
To make it possible to use this variant of SNHT, Table AII gives some critical levels.
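
A minimal sketch of equations (A7) to (A9) follows. The function name `snht_two_sigma` and the parameter `trim` (how many time steps to exclude at each end, never fewer than 2) are our own assumptions; the example data are invented. Candidate positions where one part has zero variance are skipped, because the logarithm is undefined there.

```python
from math import log, sqrt
from statistics import mean, stdev

def snht_two_sigma(q, trim=2):
    """Single shift test with two standard deviations: return (t_max, a)."""
    n = len(q)
    q_bar, s_q = mean(q), stdev(q)
    z = [(v - q_bar) / s_q for v in q]
    lo = max(2, trim)
    best_t, best_a = None, None
    for a in range(lo, n - lo + 1):               # 2 <= a <= n - 2 when trim <= 2
        first, second = z[:a], z[a:]
        m1, m2 = mean(first), mean(second)
        s1 = sqrt(sum((v - m1) ** 2 for v in first) / a)            # (A8)
        s2 = sqrt(sum((v - m2) ** 2 for v in second) / (n - a))     # (A9)
        if s1 == 0.0 or s2 == 0.0:
            continue                              # logarithm undefined; skip this a
        t = -2 * a * log(s1) - 2 * (n - a) * log(s2) - 1            # (A7)
        if best_t is None or t > best_t:
            best_t, best_a = t, a
    return best_t, best_a

# Invented example: small scatter around 0, then around 2, shift after the 8th value.
q = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.0,
     2.1, 1.8, 2.0, 2.3, 1.9, 2.2, 1.7, 2.0]
t_max, a_hat = snht_two_sigma(q)
```

Raising `trim` (for example to 10 for annual data) is one way to apply the end-omission suggested above; critical levels simulated without that constraint then apply only approximately.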

REFERENCES
Alexandersson, H. 1984. A Homogeneity Test Based on Ratios and Applied to Precipitation Series. Report 79, Department of Meteorology,
Uppsala, 55 pp.
Alexandersson, H. 1986. ‘A homogeneity test applied to precipitation data’, J. Climatol., 6, 661–675.
Alexandersson, H. 1994. ‘Climate series—a question of homogeneity’, Proceedings of the 19th Nordic Meteorological Meeting, Kristiansand
(DNMI), pp. 25–31.
Alexandersson, H. 1995. ‘Homogeneity testing, multiple breaks and trends’, Proceedings of the 6th International Meeting on Statistical
Climatology, Galway, pp. 439–441.
Bruce, J. P. and Clark, R. H. 1966. Introduction to Hydrometeorology, Pergamon Press, Oxford, 319 pp.
Easterling, D. R. and Peterson, T. C. 1992. ‘Techniques for detecting and adjusting for artificial discontinuities in climatological time series: a
review’, Proceedings of the 5th International Meeting on Statistical Climatology, Toronto, pp. J28–J32.
Gandin, L. S. 1963. Objective Analysis of Meteorological Fields, Hydrometeorology Press, Leningrad.
Hanssen-Bauer, I. and Førland, E. 1994. ‘Homogenizing long Norwegian precipitation series’, J. Climate, 7, 1001–1013.
Hanssen-Bauer, I., Førland, E. and Nordli, P. Ø. 1991. Homogeneity Test of Precipitation Data, Descriptions of the Methods used at DNMI,
DNMI Report 13/91 KLIMA, Norwegian Meteorological Institute, 28 pp.
Hawkins, P. M. 1977. ‘Testing a sequence of observations for a shift in random location’, J. Am. Statist. Assoc., 73, 180–185.
Heino, R. 1994. Climate in Finland during the Period of Meteorological Observations, Finnish Meteorological Institute Contributions 12,
Academic dissertation, Helsinki, 209 pp.
Karl, T. R., Diaz, H. and Kukla, G. 1988. ‘Urbanization: its detection and effect in the United States climate record’, J. Climate, 1, 1099–1123.
Landsberg, H. E. 1981. The Urban Climate, Academic Press, New York, 275 pp.
Lindgren, B. W. 1968. Statistical Theory, 2nd edn, Macmillan, London, 521 pp.
Maronna, R. and Yohai, V. J. 1978. ‘A bivariate test for the detection of systematic change in the mean’, J. Am. Statist. Assoc., 73, 640–645.
Moberg, A. and Alexandersson, H. 1997. ‘Homogenization of Swedish temperature data. Part II: Homogenized gridded air temperature
compared with a subset of global gridded air temperature since 1861’, Int. J. Climatol., 17, 35–54.
Moberg, A. and Bergström, H. 1997. ‘Homogenization of Swedish temperature data. Part III: The long term records from Uppsala and
Stockholm’, Int. J. Climatol., in press.
Peterson, T. C. and Easterling, D. R. 1994. ‘Creation of homogeneous composite climatological reference series’, Int. J. Climatol., 14, 671–
679.
Portman, D. A. 1993. ‘Identifying and correcting urban bias in regional time series: surface temperature in China’s Northern Plains’, J.
Climate, 6, 2298–2308.
Potter, K. W. 1981. ‘Illustration of a new test for detecting a shift in mean in precipitation series’, Mon. Wea. Rev., 109, 2040–2045.
Tuomenvirta, H. and Heino, R. 1993. ‘Homogeneity testing of meteorological time series in Finland’, 73rd American Meteorological Society
Annual Meeting, Anaheim, CA, 2 pp.
