
J. Ital. Statist. Soc. (1996) 2, pp. 179-202

A COMPARISON OF INDICATORS FOR EVALUATING X-11-ARIMA SEASONAL ADJUSTMENT

P. Battipaglia*
D. Focarelli
Research Department, Bank of Italy

Summary
The evaluation of the performance of seasonal adjustment procedures is an issue of prac-
tical importance in view of the unobservable nature of the components. Looking at just
one indicator when judging the overall quality of a procedure may be misleading, even
though this is common practice when many series are involved.
The main purpose of this paper is to compare the information content of different
synthetic indicators with reference to the X-11-ARIMA procedure.
Sixty-six different types of monthly seasonal series are generated and the seasonal
component then extracted by carrying out X-11-ARIMA with standard options. The
correlation between the pseudo-true error for each series and various synthetic indicators
allows us to compare the latter's reliability, under both the hypotheses of minimum and
maximum variance of the pseudo-true seasonal component.
We show that the overall quality index Q - the indicator most commonly adopted by
users of the X-11-ARIMA - is always outperformed by the simpler diagnostics based on
the stability of the estimates.
In particular, the «sliding-spans» indicator, proposed by Findley et al. (1990) and
included in the diagnostics of the new X-12 procedure, shows a much stronger correlation
with the pseudo-true error in the seasonal adjustment.
We also show that the total forecasting errors in the one-year-ahead extrapolation of
the seasonal component have good informative power and perform almost as well as the
«sliding-spans» indicator.

1. Introduction

The institutions responsible for the production of economic data often adjust time
series for seasonal factors in order to improve the interpretation of variations.
The procedures for seasonal adjustment are essentially based on the application
of moving-average filters. These are usually linear and two approaches to their

* Address for correspondence: Banca d'Italia, Via Nazionale 91, 00184 Roma, Italy.


construction can be distinguished: in the first, the model that regulates the filter is
fixed in advance and, apart from a few adjustable options, independent of the
series under study (automatic or semiautomatic procedures); in the second, the model
is identified on the basis of the observed series (model-based procedures).
In Italy, a working group composed of researchers and statistical agencies,
known as DESEC (Piccolo, 1985), proposed procedures of the first type for the
majority of series, in particular the X-11 procedure (U.S. Bureau of the
Census) and subsequent modifications such as X-11-ARIMA (developed in 1974
by E. B. Dagum and subsequently revised by Statistics Canada in the 1988
version). This paper does not intend to re-examine that choice, although it merits
some attention, especially in the light of subsequent theoretical developments in
the field; rather, we will explore the specific problem of evaluating the results
produced by X-11-ARIMA.
As an objective measurement of the effects of seasonal factors is not possible,
the evaluation of the results of any seasonal adjustment procedure is entrusted to
a fairly broad range of criteria based on theoretical and practical requirements.
Once the most important criteria have been identified, it is possible to construct a
synthetic indicator of the quality of the procedure. Such an indicator is of great
value since it is often the only control instrument available to those who work
with a very large number of time series.
The indicator most commonly adopted by users of the X-11-ARIMA is called
Q, which is a weighted average of 11 qualitative measures computed by the pro-
cedure. It is often required to give a definitive judgement on the results produced
by the procedure (see Lothian and Morry, 1978). The importance attributed to
this indicator has prompted efforts to assess its reliability. Findley et al. (1990)
proposed a test based on the property of stability: the reliability of the results is
inversely proportional to the variability introduced in estimating the seasonal
component by the use of different observation samples (sliding-spans); this indi-
cator has been included in the diagnostics of the new X-12 procedure, which is
about to be released by the U.S. Bureau of the Census.
This paper assesses the reliability, as a function of the estimation errors in the
seasonal component, of these two indicators and of two others based on revision
errors in the extrapolation of seasonal factors. To accomplish this, we first
simulated a set of 198 series with the «Airline» Data Generating Process, covering 66
different combinations of the non-seasonal and seasonal parameters. The seasonal
component was generated on the hypotheses of both minimum and maximum
variance, in order to broaden the range of possible patterns. We then applied the
X-11-ARIMA procedure to each series with the additive model and the standard
options; this enabled us to evaluate the estimation error as the difference between
the pseudo-true non-seasonal series and the seasonally adjusted series. The
analysis of the correlation between the estimation error and the various indicators
allowed us to throw some light on the latter's information content.


The paper is organized as follows. Section 2 sets out the most frequently used
criteria for verifying the quality of the seasonal adjustment, section 3 describes
the simulation experiment and the last two sections present the empirical find-
ings and the main conclusions.

2. The assessment of a seasonal adjustment procedure

A traditional approach to evaluating the performance of seasonal adjustment
procedures is to look at specific properties of the adjusted series (Lovell, 1963;
Burman, 1967; Bacchilega and Gambetta, 1984), some of which, such as
smoothness, are practical in nature, while others refer to properties of the theoretical
components. However, referring to theoretical properties can be misleading (Bell
and Hillmer, 1984): the adjusted series is obtained by applying an estimator to
the data that, although optimal, may have properties that differ from those of the
theoretical component. This is confirmed by Grether and Nerlove (1970), Sims
(1978) and Tukey (1978), who demonstrated empirically and theoretically that
the linear estimator with the smallest mean square error («optimal») did not
guarantee any of the following properties: a) high consistency, at non-seasonal
frequencies, between the original series and the seasonally adjusted series; b)
minimum phase shifts between the spectrum of the original series and the adjusted
series; c) elimination of seasonal peaks without substantial alteration of the spec-
tral density. In order to overcome these shortcomings, Bell and Hillmer (1984)
suggested comparing the estimates with the theoretical estimators rather than the
theoretical components.
The choice of assessment criteria is nonetheless influenced by the type of pro-
cedure used: with model-based procedures, the assumptions regarding the com-
ponents are explicitly formalised and the properties of the estimators can be de-
rived formally; with automatic procedures it is difficult to explain both, so that
the assessment has to be limited to the properties of the results. The remainder of
this section sets out a review of the most frequently used diagnostic criteria, with
reference to this distinction between procedures.

2.1. Diagnostics for model-based procedures

It is possible to derive the theoretical estimator for each component of
model-based seasonal adjustment procedures if the generating model is known. The
comparison between the characteristics of the resulting estimates and those of
the theoretical estimators is a valid technique for the verification of the results.
Maravall and Gomez (1992) suggest comparing the respective autocorrelation
functions at lags 1 and 12 and the variances.
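
As a minimal sketch of this comparison, the sample variance and the sample autocorrelations at lags 1 and 12 of an estimated component can be computed as below and then set against the corresponding theoretical values of the estimator. The function names and the toy series are illustrative assumptions; this is not the code of Maravall and Gomez (1992).

```python
def sample_variance(x):
    """Sample variance (divisor n) of a sequence of floats."""
    n = len(x)
    mean = sum(x) / n
    return sum((v - mean) ** 2 for v in x) / n


def sample_autocorrelation(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    n = len(x)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(x) - 1")
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return ck / c0


# Toy series (hypothetical): a pure 12-month pattern repeated for 10 years.
toy = [float(t % 12) for t in range(120)]
diagnostics = {
    "variance": sample_variance(toy),
    "acf_lag_1": sample_autocorrelation(toy, 1),
    "acf_lag_12": sample_autocorrelation(toy, 12),
}
```

For a strongly seasonal component the lag-12 autocorrelation is close to one; a large gap between these sample values and those implied by the theoretical estimator would signal a problem with the decomposition.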


A second assessment technique for these procedures is the calculation of the
standard error associated with the estimators of the components. The analysis of
the errors is not necessarily a diagnostic technique for the procedure, since the
estimation error may be large owing to the variability of the series itself. Never-
theless, the confidence intervals of the estimated seasonal coefficients have a
high information content, as they permit the identification of series for which the
estimate of seasonality is highly volatile and thus unreliable.

2.2. The criteria for automatic procedures

There is a wide variety of criteria normally used to assess these procedures; the
most commonly used are the following:

(a) orthogonality: the estimated components must be independent and thus un-
correlated;
(b) idempotency: a procedure is idempotent if, when applied to the adjusted
series, it reproduces the series without «filtering» it;
(c) no residual seasonality: the seasonally adjusted series must not present signif-
icant correlation at seasonal lags;
(d) stability: a measure of the invariance of the estimates with the variation of the
reference observations;
(e) smoothness: refers to seasonally adjusted series and is considered to be desir-
able because it facilitates short-term analysis;
(f) randomness of the residuals: the estimated irregular component must not show
systematic characteristics, especially in relation to the seasonal lags.
It is important to note that among these criteria, only (a) and (b) refer to
theoretically based properties. The others reflect qualities that are often thought
desirable but which have been refuted in the literature (for example, as regards (f), see
Dagum et al. (1991) and Maravall (1987)). The new diagnostics introduced with
X-12 are based on the stability of the estimates. While stability is a function of the
characteristics of the raw series, it is a property that also reflects the reliability of
the results of the procedure, on the basis of considerations analogous to those
proposed for the standard error of the estimators in model-based procedures.
The following criteria are less frequently used but possess informative power:
(g) robustness in terms of temporal aggregations: for example, the quarterly ag-
gregation of monthly adjusted series should not produce results that differ
significantly from the direct seasonal adjustment of the quarterly series;
(h) robustness in terms of aggregations of variables: when an unadjusted series is
defined as the sum of elementary series, the direct seasonal adjustment of the
compound series should not differ significantly from the aggregation of the
seasonally adjusted components.


2.3. Diagnostics for the X-11-ARIMA procedure

2.3.1. THE INDEXES OF THE 1988 VERSION

The quality indicators proposed in the 1988 version of the X-11-ARIMA proce-
dure are partly based on the general principles described in the previous section
and partly constructed according to the characteristics of the filters used by the
procedure.
Very briefly 1:

• the variability of the irregular component must not be too high in relation to
that of the total (M1, M2), the trend cycle (M3) and the seasonal component
(M6);
• the irregular component must not present residual autocorrelation (M4);
• the trend component must have substantially more marked variations than the
irregular component (M5);
• the variations in the seasonal component across years must be limited and
linear (M7, M8, M9), particularly for the final years of the series (M10, M11).
This condition is essential for the good quality of the estimates of X-11-ARIMA
since the procedure uses linear filters for most of the estimates.

The synthetic indicator, designated Q, is a weighted average of the 11 indexes.
The highest weight is assigned to M7, which is constructed as a combination of
two F tests for the stable and variable seasonality, respectively: if the weight of
the latter is too high compared with that of the former, the X-11-ARIMA
procedure is considered unable to extract the seasonality from the raw series. For values of Q greater
than 1, the quality of the seasonal adjustment is considered unacceptable (Lothian
and Morry, 1978) 2.
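
The way Q is formed and read can be sketched as follows. This is only an illustration under stated assumptions: the weights are supplied by the caller (the equal weights used in the example are hypothetical placeholders, not the Lothian and Morry weights), the M values are invented, and the three-way reading uses the 0.8 and 1.2 thresholds attributed to Findley and Monsell (1986) in footnote 2.

```python
def q_index(m_stats, weights):
    """Weighted average of the 11 M statistics."""
    if len(m_stats) != 11 or len(weights) != 11:
        raise ValueError("expected 11 M statistics and 11 weights")
    return sum(w * m for w, m in zip(weights, m_stats)) / sum(weights)


def classify_q(q, low=0.8, high=1.2):
    """Three-way reading of Q: reliable, unreliable, or undecided."""
    if q < low:
        return "reliable"
    if q > high:
        return "unreliable"
    return "check additional indicators"


# Hypothetical inputs: equal weights and made-up M values (M11 is poor).
equal_weights = [1.0] * 11
m_values = [0.5] * 10 + [1.6]
q = q_index(m_values, equal_weights)
verdict = classify_q(q)
```

The example shows why a single summary value can mislead: one clearly poor component statistic is averaged away and the series is still classified as reliable.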

2.3.2. THE INDEXES OF THE X-12 VERSION

For the new X-12 procedure, which is being perfected at the U.S. Bureau of the
Census, Findley and Monsell (1986) and Findley et al. (1990) have proposed a

1. For a complete description of the indicators in the X-11-ARIMA procedure, see Lothian
and Morry (1978).
2. Findley and Monsell (1986) propose somewhat different reference thresholds: the
seasonal adjustment is considered reliable if Q is less than 0.8 and unreliable if Q is
greater than 1.2. If Q is between 0.8 and 1.2, they suggest using additional indicators in
order to decide whether the seasonal adjustment of the series is correct.


set of synthetic indicators based essentially on the property of stability: a
seasonal adjustment procedure is considered not very reliable if it produces widely
different results when a limited number of observations vary. The indicator
measures the variability of the estimates of the seasonal coefficients or of the
seasonally adjusted data for period t, recursively calculated for time spans that overlap
except for the first and last years (sliding-spans). The seasonal adjustment is con-
sidered unreliable when for an appreciable number of periods:

$$V_t^{\max} = \mathrm{Abs}\{[\max_k N(k) - \min_k N(k)] / \min_k N(k)\} > V_{\text{threshold}}$$

where t is the time index, N(k) is the seasonally adjusted value estimated for
period t on time horizon k and $V_{\text{threshold}}$ is appropriately predetermined 3.
In our paper we constructed an entirely analogous indicator (hereafter SLIDSP)
that uses the mean calculated with respect to t to summarize the variability of
the estimates 4.

$$\mathrm{SLIDSP} = \mathrm{Mean}_t\{\mathrm{Abs}\{[\max_k N(k) - \min_k N(k)] / \min_k N(k)\}\}$$

Findley et al. (1990) verified the superiority of such diagnostics with respect to
the traditional indicators for a number of representative series. The comparison
was made with Q, with the F test serving to indicate the variability related to the
number of working days, and with a smoothness indicator.
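
The SLIDSP calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the experiment: the adjusted values are assumed to be stored as one dictionary per span, keyed by period, and the names `sliding_span_ratios`, `slidsp` and `share_flagged` are hypothetical. Only periods with two or more estimates enter the mean (footnote 4), and the ratio is meaningful only when the smallest estimate is non-zero. The 0.03 default mirrors the value quoted from Findley et al. (1990) and is used purely for illustration.

```python
def sliding_span_ratios(estimates_by_span):
    """Return {period: Abs[(max_k N(k) - min_k N(k)) / min_k N(k)]}.

    estimates_by_span is a list with one dict per span, mapping a period
    index t to the seasonally adjusted value estimated on that span.
    Periods covered by fewer than two spans are left out.
    """
    collected = {}
    for span in estimates_by_span:
        for t, value in span.items():
            collected.setdefault(t, []).append(value)
    ratios = {}
    for t, values in collected.items():
        if len(values) >= 2:
            lowest, highest = min(values), max(values)
            ratios[t] = abs((highest - lowest) / lowest)
    return ratios


def slidsp(estimates_by_span):
    """Mean over t of the sliding-spans ratios."""
    ratios = sliding_span_ratios(estimates_by_span)
    return sum(ratios.values()) / len(ratios)


def share_flagged(estimates_by_span, threshold=0.03):
    """Share of periods whose ratio exceeds the chosen threshold."""
    ratios = sliding_span_ratios(estimates_by_span)
    return sum(1 for r in ratios.values() if r > threshold) / len(ratios)


# Hypothetical adjusted values on three overlapping spans.
spans = [
    {1: 100.0, 2: 102.0, 3: 101.0},
    {2: 104.0, 3: 101.0, 4: 99.0},
    {3: 103.0, 4: 103.0},
]
slidsp_value = slidsp(spans)
flagged = share_flagged(spans)
```

Averaging over t gives a single number per series, which is what allows the indicator to be correlated with the estimation error across many series.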

2.3.3. THE INDICATORS BASED ON REVISION ERRORS

Most seasonal adjustment procedures are based on the application of symmetric
filters to the raw data. Where such filters cannot be calculated, at the extreme
values of the time span observed, asymmetric preliminary filters are used. As
new observations are acquired the preliminary filters change, becoming more
similar to the definitive structure («final» filters). In particular, approximating
the filters of the X-11-ARIMA procedure, in the standard version and with monthly
observations, requires 84 successive observations to make the current seasonally
adjusted value definitive. However, Burridge and Wallis (1984) have demon-

3. In Findley et al. (1990), $V_{\text{threshold}}$ is set at 0.03 for the indicators constructed on the basis
of the seasonal coefficients or monthly percentage variations. They suggest, however, that
the reference value be selected case by case, taking account of the nature of the seasonal
component and the degree of «uncertainty» considered acceptable.
4. The t index varies in the set of observations for which two or more estimates of the
seasonally adjusted value can be made.


strated that after three years the value of the weights is practically nil and the
estimates of the seasonal coefficients are therefore no longer subject to revision.
The difference between the preliminary estimates and the successive revisions
is called the revision error and is usually calculated with reference to the «final»
filters 5. In many cases, however, the first estimate of the seasonal component is
produced using coefficients extrapolated from earlier observations (frequently those
referring only to full years: one-year-ahead forecasts). In such cases, the first revi-
sion is calculated at the end of the year and the revision error coincides with the
difference between the seasonally adjusted series estimated with the extrapolated
coefficients and that produced when the data for the full year are available. These
are the revisions of greatest interest to the institutions that publish seasonally ad-
justed series as they measure the adjustment made to the data on which short-term
analysis and, in some cases, policy decisions are based. The subsequent revisions
are generally smaller, since the structure of the filters, with decreasing weights in
relation to the central value, attributes most importance to the differences between
the values forecast by the ARIMA model and the actual results 6.
In the present work we calculated two indicators to measure the forecasting
error produced by the X-11-ARIMA procedure. The seasonal adjustment
procedure was applied by considering various six-year time spans; the experiment was
repeated on five of these spans. The two indicators were constructed as the mean
of the absolute value of the relative differences between the seasonally adjusted series
calculated with the extrapolated coefficients, N(forecast), and, respectively, the
«final» adjusted series, N(final), and the series recalculated with the values for
the full year, N(preliminary). The two indicators, called respectively TREV (total revision error)
and FYREV (first year revision error), are defined in formal terms as follows:

$$\mathrm{TREV} = \mathrm{Mean}_t\{\mathrm{Abs}\{[N(\text{final}) - N(\text{forecast})] / N(\text{final})\}\}$$

$$\mathrm{FYREV} = \mathrm{Mean}_t\{\mathrm{Abs}\{[N(\text{preliminary}) - N(\text{forecast})] / N(\text{preliminary})\}\}$$

The former has a greater information content than the latter. The usefulness of
FYREV depends on there being a strong correlation between the size of the first

5. Findley and Monsell (1986) argue that the «final» seasonal adjustment coefficients
may be artificial in that they are determined exclusively by the fact that the filter is of
finite length in at least three cases: i) when there is no seasonality; ii) when the seasonal
pattern changes too rapidly; iii) when the signal of seasonality is very weak and is drowned
by the irregular component. In all three cases the preliminary coefficients should there-
fore approach the final ones in a much more erratic fashion than in the case of strong and
stable seasonality.
6. For a more formal discussion, see Dagum (1987).


revision and that of the total revision. In this case, using only one additional year
of data, FYREV presents two advantages: it requires less data than are needed to
produce the definitive seasonal adjustment, and it can also be applied to the most
recent years, which are usually those of greatest interest 7.
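
The two revision indicators can be sketched as follows. This is a minimal illustration, assuming that the three versions of the seasonally adjusted series (forecast, preliminary and final) are already available as equal-length lists covering the same periods; the function names and the numbers are hypothetical, and the reference values in the denominators must be non-zero.

```python
def mean_abs_relative_gap(reference, forecast):
    """Mean over t of Abs[(reference - forecast) / reference]."""
    if len(reference) != len(forecast) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    gaps = [abs((r - f) / r) for r, f in zip(reference, forecast)]
    return sum(gaps) / len(gaps)


def trev(n_final, n_forecast):
    """Total revision error: forecast against the final adjusted series."""
    return mean_abs_relative_gap(n_final, n_forecast)


def fyrev(n_preliminary, n_forecast):
    """First year revision error: forecast against the first revision."""
    return mean_abs_relative_gap(n_preliminary, n_forecast)


# Hypothetical adjusted values for three months of one year.
n_forecast = [100.0, 110.0, 90.0]
n_preliminary = [102.0, 108.0, 90.0]
n_final = [104.0, 110.0, 88.0]

trev_value = trev(n_final, n_forecast)
fyrev_value = fyrev(n_preliminary, n_forecast)
```

FYREV needs only the data of one additional year, whereas TREV has to wait until the «final» filters can be applied, which is the trade-off discussed above.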

3. The proposed application

Performing Monte-Carlo simulations with a complex iterative procedure such as
X-11-ARIMA is an extremely demanding task in itself (see Ghysels et al., 1996).
Our experiment, involving indicators based on the stability of the estimates,
was even more demanding, as it required the procedure to be carried out several
times on every series, with a different time span each time. We therefore decided
to keep the sample of the simulated series relatively small, relying on the strong
similarity of the results that we obtained in a set of three replications for each
parameter combination (see section 4).
All the series were obtained using the same Data Generating Process, i.e. from
the «Airline» seasonal ARIMA model (Box and Jenkins, 1970):

$$(1-B)(1-B^{12})Z_t = (1-\theta B)(1-\Theta B^{12})a_t, \qquad \sigma_a^2 = 1.$$

A wide spectrum of series was considered, covering all the combinations of the
parameter values: θ = (-0.95; -0.8; -0.6; -0.4; -0.2; 0; 0.2; 0.4; 0.6; 0.8; 0.95),
Θ = (0; 0.2; 0.4; 0.6; 0.8; 0.95) 8.
The «Airline» model was chosen since it fits a broad range of seasonal series
and for its theoretical properties, the most important of which is the association
between the parameters θ and Θ and, respectively, the trend and seasonality
components. It is easy to demonstrate that the components associated with θ and Θ
become more deterministic as θ and Θ approach 1. It should also be recalled that,
as shown by Cleveland and Tiao (1978), this model is similar to that obtained by
the linear filter implicitly used by the X-11-ARIMA procedure with the standard
options.
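
To make the Data Generating Process concrete, the sketch below simulates the aggregate «Airline» series directly from its difference equation, Z_t = Z_{t-1} + Z_{t-12} - Z_{t-13} + a_t - θ a_{t-1} - Θ a_{t-12} + θΘ a_{t-13}, over the 11 x 6 parameter grid. It is an illustration under stated assumptions rather than the procedure of the experiment, in which the canonical components were generated separately and then summed: the function name, the seeds and the use of Python's standard `random` module are illustrative choices, while the zero starting values, the three discarded years and the 180 retained observations follow the text.

```python
import random


def simulate_airline(theta, big_theta, n_obs=180, burn_in=36, seed=0):
    """Simulate (1-B)(1-B^12) Z_t = (1 - theta B)(1 - Theta B^12) a_t.

    Innovations are N(0, 1); pre-sample values of Z and a are set to zero
    and the first `burn_in` simulated months are discarded.
    """
    rng = random.Random(seed)
    pad = 13  # longest lag in the difference equation
    total = n_obs + burn_in
    a = [0.0] * pad + [rng.gauss(0.0, 1.0) for _ in range(total)]
    z = [0.0] * (pad + total)
    for t in range(pad, pad + total):
        ma_part = (a[t] - theta * a[t - 1] - big_theta * a[t - 12]
                   + theta * big_theta * a[t - 13])
        z[t] = z[t - 1] + z[t - 12] - z[t - 13] + ma_part
    return z[pad + burn_in:]


# Parameter grid of the experiment: 11 values of theta times 6 of Theta.
thetas = [-0.95, -0.8, -0.6, -0.4, -0.2, 0.0, 0.2, 0.4, 0.6, 0.8, 0.95]
big_thetas = [0.0, 0.2, 0.4, 0.6, 0.8, 0.95]
grid = [(th, bt) for th in thetas for bt in big_thetas]

# One simulated series per parameter combination (seeds are arbitrary).
simulated = {
    (th, bt): simulate_airline(th, bt, seed=i)
    for i, (th, bt) in enumerate(grid)
}
```

Repeating the loop with different seeds gives further replications of the 66 series; a direct simulation of this kind yields only the aggregate series, so it does not by itself provide the pseudo-true components needed to measure the estimation error.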
Canonical decomposition, originally proposed by Pierce (1978) and Box, Hillm-
er and Tiao (1978), was applied to each of the 66 models considered in order to

7. The length of the time spans considered was set at six years owing to the need to
utilize four years subsequent to the reference year when calculating the «final» seasonal
adjustments (TREV). However, the use of longer time spans was found to produce similar
results.
8. Hillmer and Tiao (1982) have shown that the «Airline» model admits a canonical
decomposition for Θ > 0.


derive the unobservable theoretical components: trend ($T_{\text{canonical}}$), seasonal ($S_{\text{canonical}}$)
and irregular ($I_{\text{canonical}}$) 9.
A number of observations can be made regarding the variance of noise (inno-
vation variance) for the canonical components, (Fig. 1 to 3).
The link between the values of the model's parameters (θ and Θ) and, respectively,
the trend and seasonal components is evident: as θ increases, and for any

Fig. 1 - Innovation variance of the canonical trend component
(From the bottom upwards: Θ = 0; 0.2 ... 0.95).

Fig. 2 - Innovation variance of the canonical seasonal component
(From the top downwards: Θ = 0; 0.2 ... 0.95).

9. The decompositions were calculated with the SEATS program (Maravall and Gomez,
1992).


Fig. 3 - Innovation variance of the canonical irregular component
(From the bottom upwards: Θ = 0; 0.2 ... 0.95).

value of Θ, the innovation variance of the trend decreases. The innovation
variance of the seasonal component has the highest values for Θ = 0, while it is
virtually nil for Θ = 0.95. The stochasticity of the trend component increases
with the value of the seasonal parameter. The innovation variance of the seasonal
component has a moderately decreasing trend for values of θ between -0.8 and
+0.6 (where it reaches a low point), after which it rises to reach its highest value
at θ = 0.95.
The simulations were run to generate the three canonical components for each
of the 66 parameter combinations of the Airline model 10; the components were
then summed to obtain raw monthly series of 180 observations, corresponding to
15 years 11.
The simulation experiment was repeated three times, as a check on the stabil-
ity of the results.
In aggregating the components, two cases were considered in order to make
the experiment more representative, as in Bell and Hillmer (1984):

10. The simulations were run using the SCA software. The graphs of the first group of 66
series are given in the Appendix.
11. We set the starting values equal to zero and then discarded the first three years of the
generated sample to reduce their effect on the simulated series. Although this procedure
has been criticized (Hilleberg, 1996), the choice of starting values is necessarily subject
to a fair amount of arbitrariness (Ghysels et al., 1996).


• MIN SEAS: the variance of the seasonal component is minimum. For the
seasonal component S = $S_{\text{canonical}}$ and for the non-seasonal component
N = $T_{\text{canonical}} + I_{\text{canonical}}$.
• MAX SEAS: the variance of the seasonal component is maximum. For the
seasonal component S = $S_{\text{canonical}} + I_{\text{canonical}}$ and for the non-seasonal component
N = $T_{\text{canonical}}$.

The innovation variance for the non-seasonal component N under the two
hypotheses considered is given in Figures 4 and 5.
All the raw series were seasonally adjusted with the X-11-ARIMA procedure,
applying the additive decomposition model and using all the standard procedures
of the 1988 version. The only exception is that the order of the ARIMA model
was supplied as an input while the estimation of the parameters was entrusted to
the procedure.

Fig. 4 - Innovation variance of the non-seasonal component (MIN SEAS hypothesis).
(First panel, from the bottom upwards: Θ = 0; 0.2; ... 0.95. Second panel, at the bottom: θ = 0.95; ... at the top: θ = 0.6.)

4. The main results

4.1. Characteristics of the X-11-ARIMA estimation error of the non-seasonal component

The results of an estimation procedure relative to a time series can be evaluated in
terms of the deviation from the actual values or as a function of its ability to


Fig. 5 - Innovation variance of the non-seasonal component (MAX SEAS hypothesis).
(First panel, from the bottom upwards: Θ = 0; 0.2; ... 0.95. Second panel, from the bottom upwards: θ = 0.6; 0.2; -0.2; -0.6; -0.95.)

duplicate variations over time, and in particular at turning points. The indication
of turning points is especially important when the results are used for short-term
analysis and policy guidance. Accordingly, the definition of the measure of error
must be consistent with the use to be made of the estimate. For instance, the
Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are suitable
for evaluating deviations from value levels, while Theil's U measures the
divergence between predicted and actual variations 12.
In the present application, both types of estimation error have been evaluated.
The three indexes (MAE, RMSE, and Theil's U) have been computed by compar-
ing the seasonally adjusted series with the sum of the non-seasonal components
generated by simulation for both assumptions concerning the seasonal compo-
nent, MIN SEAS and MAX SEAS. Since the results produced by MAE and RMSE
are similar, we report only the characteristics of the X-11-ARIMA estimation
errors as measured by MAE and Theil's U (Tables 1 and 2).
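
The three error measures can be sketched as follows, with Theil's U computed on first differences as defined in footnote 12. This is a minimal illustration: `actual` stands for the pseudo-true non-seasonal series and `estimate` for the seasonally adjusted series, and the function names and numbers are hypothetical.

```python
import math


def mae(actual, estimate):
    """Mean Absolute Error between two equal-length series."""
    return sum(abs(a - e) for a, e in zip(actual, estimate)) / len(actual)


def rmse(actual, estimate):
    """Root Mean Square Error between two equal-length series."""
    squares = [(a - e) ** 2 for a, e in zip(actual, estimate)]
    return math.sqrt(sum(squares) / len(squares))


def theil_u(actual, estimate):
    """Theil's U on period-to-period changes.

    U = sqrt(Mean[(dA - dE)^2] / Mean[dA^2]), where dA and dE are the
    first differences of the actual and estimated series.
    """
    d_act = [actual[t] - actual[t - 1] for t in range(1, len(actual))]
    d_est = [estimate[t] - estimate[t - 1] for t in range(1, len(estimate))]
    num = sum((x - y) ** 2 for x, y in zip(d_act, d_est)) / len(d_act)
    den = sum(x ** 2 for x in d_act) / len(d_act)
    return math.sqrt(num / den)


# Hypothetical pseudo-true and adjusted values.
actual = [1.0, 2.0, 4.0, 3.0]
estimate = [1.0, 2.5, 3.5, 3.0]

mae_value = mae(actual, estimate)        # level error
rmse_value = rmse(actual, estimate)      # level error, penalizing outliers
u_value = theil_u(actual, estimate)      # error on the variations
```

A value of U equal to 1 means that the adjusted series tracks the monthly variations no better than a series with no variation at all, which gives a natural benchmark when reading Table 2.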

12. Defining $\Delta N_t(\text{actual})$ as the increment in the values observed over the period [t-1, t]
and $\Delta N_t(\text{predicted})$ as that in the predicted values, Theil's index is:
$U = \{\mathrm{Mean}[\Delta N_t(\text{actual}) - \Delta N_t(\text{predicted})]^2 / \mathrm{Mean}[\Delta N_t(\text{actual})]^2\}^{1/2}$.
The shortcomings of this index are well known;
most significantly, a change in sign between periods is not necessarily linked to a cyclical
turning point. Nevertheless, as monthly variations are commonly used in short-term
analysis, Theil's index can be helpful.


Table 1
X-11-ARIMA ESTIMATION ERROR OF THE NON-SEASONAL COMPONENT (MAE) (1)

                        MOMENTS              QUANTILES         HIGHEST VALUES
Group      Hypothesis   average   std dev    median   Q3-Q1    (associated parameters θ, Θ)

GROUP I    MIN SEAS     0.27      0.76       0.06     0.17     5.87 (0.95, 0.6); 1.65 (0.6, 0.95); 1.21 (0.95, 0.8)
GROUP I    MAX SEAS     0.46      1.06       0.0?     0.30     6.44 (0.95, 0.95); 4.12 (0.8, 0.95); 2.92 (0.95, 0.6)
GROUP II   MIN SEAS     0.29      0.53       0.06     0.30     2.21 (-0.2, 0.6); 2.09 (0.8, 0.6); 2.00 (0.95, 0.6)
GROUP II   MAX SEAS     0.31      0.78       0.08     0.32     5.84 (0.8, 0.95); 2.04 (0.95, 0.6); 1.43 (0.95, 0.95)
GROUP III  MIN SEAS     0.30      0.78       0.05     0.19     5.73 (0.95, 0.95); 1.75 (0.95, 0.2); 1.60 (0.95, 0)
GROUP III  MAX SEAS     0.52      1.80       0.05     0.22     13.17 (0.95, 0.95); 5.75 (0.95, 0.6); 3.79 (0.95, 0.2)

(1) For each group, the errors are computed on 180 observations for 66 series.

Before discussing the results, it is worth noting that the estimation error was
tested for stationarity. Following the procedure proposed by Engle and Granger
(1987), we checked that the pseudo-true non-seasonal series and the X-11
seasonally adjusted series were cointegrated: the augmented Dickey-Fuller test was
significant (at the 1 per cent level) for all the 198 series and under the
hypotheses of both minimum and maximum variance for the seasonal
component.
The similarity of the results for the three groups of simulated series is pro-
nounced. This applies to the different measures of error (MAE and Theil's U), to


Table 2
X-11-ARIMA ESTIMATION ERROR OF THE NON-SEASONAL COMPONENT (THEIL's U) (1)

                        MOMENTS              QUANTILES         HIGHEST VALUES
Group      Hypothesis   average   std dev    median   Q3-Q1    (associated parameters θ, Θ)

GROUP I    MIN SEAS     0.62      0.37       0.48     0.32     2.60 (0.95, 0); 1.51 (-0.4, 0); 1.42 (0.8, 0)
GROUP I    MAX SEAS     3.80      7.34       1.00     2.16     35.20 (0.95, 0.95); 32.98 (0.95, 0.8); 28.85 (0.95, 0.6)
GROUP II   MIN SEAS     0.63      0.28       0.54     0.28     1.89 (0.95, 0); 1.26 (0.8, 0); 1.25 (0.6, 0)
GROUP II   MAX SEAS     3.79      7.19       0.97     2.74     34.60 (0.95, 0.95); 29.27 (0.95, 0.6); 28.16 (0.95, 0.8)
GROUP III  MIN SEAS     0.62      0.29       0.54     0.23     2.11 (0.95, 0); 1.56 (0.8, 0); 1.25 (0.2, 0)
GROUP III  MAX SEAS     3.78      7.09       0.95     2.31     34.17 (0.95, 0.95); 30.19 (0.95, 0.8); 28.74 (0.95, 0.6)

(1) For each group, the errors are computed on 180 observations for 66 series.

the different assumptions concerning the structure of the seasonal component
(MIN SEAS and MAX SEAS) and to the characteristics of error in relation to the
parameters θ and Θ. In what follows we therefore present only the aggregate
results, for the whole set of 198 series.
The characteristics of the X-11-ARIMA estimation error of the non-seasonal
component differ depending on whether they are measured by MAE or Theil's U.
In the first case the mean value for the 66 series that form each group is slightly
higher under MAX SEAS than under MIN SEAS. This small difference is due to


the extreme values; the median value is practically the same under the two
hypotheses (Table 1).
The amplification of the estimation error when the seasonal components are
highly variable (MAX SEAS) is more evident when it is measured by Theil's U.
The mean and the median are both markedly higher than under MIN SEAS, and
the variance also increases substantially (Table 2). Hence, even when the season-
al component is particularly erratic, the filters of the X-11-ARIMA procedure
make it possible to limit the size of the estimation errors. On the other hand, the
indication of turning points is subjected to considerable distortion.
The characteristics of the error connected with variations in the parameters θ
and Θ (Figures 6 and 7) suggest more complex considerations. As measured by
the Mean Absolute Error, the estimation error is similar under the two hypotheses
for the seasonal component: in both cases, the maxima come at the same points, θ
= 0.95 and Θ = 0.95. At these points the difference between the errors produced
under the two hypotheses widens, while for the other values of the parameters it
is relatively small.

Fig. 6 - X-11-ARIMA estimation error (MAE) as a function of θ.

Measured by MAE, the pattern of the error, and in particular the location of the
maxima, can be related to that of the innovation variance of the seasonal and
non-seasonal components. The innovation variance of the canonical seasonal
component, as noted in Section 3, is at its highest value in correspondence with θ =
0.95 (a quasi-deterministic trend). Under MAX SEAS, this characteristic is
accentuated by the addition of the irregular component, whose innovation variance
also increases for high values of θ. At Θ = 0.95, the seasonal component is
quasi-deterministic under both MIN SEAS and MAX SEAS and the variance of the
non-seasonal component is at its highest value (Figures 4 and 5).

Fig. 7 - X-11-ARIMA estimation error (MAE) as a function of Θ.

The fact that the highest values of the MAE and of the innovation variance of the
components coincide suggests that the filters of the X-11-ARIMA procedure
are less effective when there is significant inequality between the non-seasonal and
the seasonal stochastic components. In intermediate situations, which are the most
common for economic series, the filters produce a more satisfactory local fit.
The X-11-ARIMA estimation error measured by Theil's U (Figures 8 and 9) is
much larger under the MAX SEAS hypothesis. Hence, the seasonally adjusted
series estimated by the X-11-ARIMA procedure is better at identifying turning
points in the «true» seasonally adjusted series when the seasonal component is
canonical.

Fig. 8 - X-11-ARIMA estimation error (Theil's U) as a function of θ.

Fig. 9 - X-11-ARIMA estimation error (Theil's U) as a function of Θ.

4.2. Synthetic indicators for evaluating the seasonal adjustments

The value of a synthetic indicator of the reliability of a seasonal adjustment
procedure is all the greater when a large number of series have to be seasonally
adjusted, since it should reveal any problems in the procedure almost automatically.
The assessment of the discriminatory power of such an indicator is therefore
of considerable practical importance.
The relationship between the synthetic indicator and the estimation error of
the non-seasonal component produced by the procedure has been studied for four
of the indicators discussed in Section 2: Q of the X-11-ARIMA procedure, SLIDSP
of the new X-12 procedure, and two indicators based on the revision errors, TREV
and FYREV. Spearman's rank correlation coefficient has been used rather than
the linear correlation coefficient in order to take account of the non-linearities in
the values of the indicators.
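
The choice of a rank correlation over a linear one can be illustrated with a minimal sketch. The data below are hypothetical and are not taken from the experiment; the function names are illustrative assumptions. The indicator is constructed to grow monotonically but non-linearly with the error, so the rank coefficient reaches one while the linear coefficient stays below it.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: an indicator that is a monotone, non-linear
# function of the estimation error.
error = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
indicator = [e ** 4 for e in error]

rho_rank = spearman(indicator, error)    # equals 1 up to rounding
rho_linear = pearson(indicator, error)   # strictly below 1
```

Because only the ordering of the series matters, the rank coefficient is unaffected by the scale on which an indicator such as Q or SLIDSP happens to be expressed.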
All the indicators considered show a statistically significant positive correlation
with the MAE error, under the hypothesis of both MIN SEAS and MAX
SEAS. The diagnostics based on the stability of the series all have very high
coefficients, which outperform those associated with Q. This is particularly true
for the indicator based on the sliding-spans (Findley et al., 1990), whose rank
correlation with the error is 0.93 in the case of MIN SEAS and 0.92 in that of
MAX SEAS (Table 3a), while for Q, the rank correlation coefficient with the
«true» estimation error is just over 0.5.

Table 3a
RANK CORRELATION BETWEEN THE X-11-ARIMA ESTIMATION ERROR AND
THE SYNTHETIC INDICATORS
(MAE)

            Q      SLIDSP   TREV   FYREV
MIN SEAS    0.52   0.93     0.87   0.88
MAX SEAS    0.53   0.92     0.85   0.86
The rank correlation is computed on all the 198 simulated series.

            Q      SLIDSP   TREV   FYREV
MIN SEAS    0.45   0.93     0.86   0.87
MAX SEAS    0.44   0.91     0.85   0.86
The rank correlation is computed on the 147 simulated series for which stable seasonality is identified by the
X-11-ARIMA D8 test.

These results were basically confirmed in a number of further tests. The correla-
tions were also calculated excluding the 51 series not showing any identifiable
seasonality (Table D8 of X-11-ARIMA's output); the results are shown in Table
3a. In addition, the trimmed mean (obtained by dropping 10 per cent of the ex-
treme values) and the median were used to assess the robustness of the results
with respect to extreme values; the rank correlations between the estimation er-
rors produced by the procedure and these indicators are reported in Table 3b and
Table 3c, respectively.
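
One reading of this robustness check is that the mean of the absolute errors (the MAE) is replaced by a trimmed mean or by the median of the same absolute errors. The sketch below follows that reading with hypothetical data; the even split of the 10 per cent between the two tails is an assumption, since the text does not say how the extreme values are selected.

```python
import statistics


def trimmed_mean(values, drop_share=0.10):
    """Mean after dropping drop_share of the observations, split evenly
    between the lowest and the highest values (an assumed convention)."""
    ordered = sorted(values)
    per_tail = int(len(ordered) * drop_share / 2)
    kept = ordered[per_tail:len(ordered) - per_tail] if per_tail else ordered
    return sum(kept) / len(kept)


# Hypothetical absolute estimation errors for one series, with one outlier.
abs_errors = [0.2, 0.3, 0.1, 0.4, 0.2, 0.3, 0.2, 0.1, 0.3, 0.2,
              0.4, 0.1, 0.2, 0.3, 0.2, 0.3, 0.1, 0.2, 0.3, 5.0]

mae = sum(abs_errors) / len(abs_errors)   # pulled upwards by the outlier
trimmed = trimmed_mean(abs_errors)        # drops one value in each tail
median = statistics.median(abs_errors)    # ignores the size of the outlier
```

With a single large error, the trimmed mean and the median stay close to the typical error while the plain mean does not, which is the sensitivity the additional tables are meant to control for.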

Table 3b
RANK CORRELATION BETWEEN THE X-11-ARIMA ESTIMATION ERROR AND
THE SYNTHETIC INDICATORS
(TRIMMED MEAN)

            Q      SLIDSP   TREV   FYREV
MIN SEAS    0.60   0.99     0.95   0.95
MAX SEAS    0.64   0.98     0.94   0.95
The rank correlation is computed on all the 198 simulated series.

Table 3c
RANK CORRELATION BETWEEN THE X-11-ARIMA ESTIMATION ERROR AND
THE SYNTHETIC INDICATORS
(MEDIAN)

            Q      SLIDSP   TREV   FYREV
MIN SEAS    0.64   0.98     0.96   0.96
MAX SEAS    0.68   0.98     0.95   0.96
The rank correlation is computed on all the 198 simulated series.


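In order to compare the estimation error measured on the single-period variations
and the indicators, the formulae for SLIDSP, TREV and FYREV were modified
to make them more similar to Theil's U. In more formal terms, having defined
$\Delta N_t(k)$, the variation in the period [t-1, t] of the seasonally adjusted value over
the time horizon k, the three modified indicators can be defined as follows:

$$\mathrm{SLIDSPM} = \left\{ \frac{\mathrm{Mean}_t \left[ \max_k \Delta N_t(k) - \min_k \Delta N_t(k) \right]^2}{\mathrm{Mean}_t \left[ \min_k \Delta N_t(k) \right]^2} \right\}^{1/2}$$

$$\mathrm{TREVM} = \left\{ \frac{\mathrm{Mean}_t \left[ \Delta N_t(\mathrm{final}) - \Delta N_t(\mathrm{forecast}) \right]^2}{\mathrm{Mean}_t \left[ \Delta N_t(\mathrm{final}) \right]^2} \right\}^{1/2}$$

$$\mathrm{FYREVM} = \left\{ \frac{\mathrm{Mean}_t \left[ \Delta N_t(\mathrm{preliminary}) - \Delta N_t(\mathrm{forecast}) \right]^2}{\mathrm{Mean}_t \left[ \Delta N_t(\mathrm{preliminary}) \right]^2} \right\}^{1/2}$$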
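
The three modified indicators share one form: the square root of the mean squared discrepancy divided by the mean squared reference variation. A minimal sketch of that computation follows. It assumes the period-to-period variations of the seasonally adjusted series have already been obtained from the relevant X-11-ARIMA runs; the function names, argument names and numbers are illustrative assumptions and not part of the procedure's output.

```python
import math


def mean(values):
    return sum(values) / len(values)


def relative_rms(diffs, reference):
    """sqrt(Mean[diff^2] / Mean[reference^2]), the form shared by the
    three modified indicators."""
    return math.sqrt(mean([d * d for d in diffs]) / mean([r * r for r in reference]))


def slidspm(changes_by_span):
    """changes_by_span[t] lists the variation at date t as estimated
    from each span k that contains t."""
    ranges = [max(c) - min(c) for c in changes_by_span]
    minima = [min(c) for c in changes_by_span]
    return relative_rms(ranges, minima)


def trevm(final, forecast):
    """Discrepancy between final and forecast-based variations."""
    return relative_rms([f - g for f, g in zip(final, forecast)], final)


def fyrevm(preliminary, forecast):
    """Discrepancy between preliminary and forecast-based variations."""
    return relative_rms([p - g for p, g in zip(preliminary, forecast)], preliminary)


# Hypothetical variations for three dates, each covered by three spans.
changes_by_span = [[1.0, 1.2, 0.9], [-2.0, -1.8, -2.1], [0.5, 0.5, 0.6]]
slidspm_value = slidspm(changes_by_span)

# Hypothetical final and forecast-based variations for the same dates.
final = [1.0, -2.0, 0.5]
forecast = [0.8, -1.5, 0.7]
trevm_value = trevm(final, forecast)
```

Dividing by the mean squared variation makes the indicators scale-free, which is what brings them closer to Theil's U than the original level-based versions.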

The rank correlations between these three indicators and the X-11-ARIMA estimation
error measured by U are very similar to one another (Table 4). Under both
hypotheses on the nature of the seasonality the values are lower than those obtained
with the error measured by the MAE: in the case of MIN SEAS the coefficient
is around 0.8; in the case of MAX SEAS the difference is more pronounced,
with the coefficient dropping to around 0.4. The Q indicator shows no positive
correlation with the error measured by U under the hypothesis of MIN SEAS
(the coefficient is -0.22), whereas its coefficient is in line with those of the
other indicators in the case of MAX SEAS.

Table 4
RANK CORRELATION BETWEEN THE X-11-ARIMA ESTIMATION ERROR AND
THE MODIFIED SYNTHETIC INDICATORS
(THEIL's U)

            Q       SLIDSPM   TREVM   FYREVM
MIN SEAS    -0.22   0.79      0.79    0.74
MAX SEAS    0.44    0.44      0.33    0.48
The rank correlation is computed on all the 198 simulated series.

Table 5 shows the rank correlation between all the indicators considered.

Table 5
RANK CORRELATION BETWEEN THE SYNTHETIC INDICATORS

           Q        SLIDSP   SLIDSPM  TREV     TREVM    FYREV    FYREVM
Q          1        0.57     -0.15    0.60     -0.32    0.64     -0.09
SLIDSP              1        0.19     0.91     0.07     0.91     0.20
SLIDSPM                      1        0.25     0.80     0.19     0.72
TREV                                  1        0.15     0.99     0.28
TREVM                                          1        0.08     0.79
FYREV                                                   1        0.26
FYREVM                                                           1
The rank correlation is computed on all the 198 simulated series.


The correlation between Q and SLIDSP is equal to 0.57, which is comparable to
the value found by Findley and Monsell (1986). The rank correlation between
SLIDSP and FYREV is high (0.91), indicating that the measure of the variability
of the seasonally adjusted values is not seriously distorted when the reference
values include the extrapolated values. There is an almost perfect rank correla-
tion between TREV and FYREV, which means that the first-year revision error,
calculated as the difference between the extrapolated value and that obtained
with the data for the whole year, is an excellent approximation of the overall
error. It should be noted, however, that the efficacy of the FYREV indicator has
been boosted in this experiment by the hypothesis that the model generating the
series observed is correctly identified.

5. Conclusions

Despite its being based on a set of simulated series that is not sufficiently large
for general properties to be deduced, our experiment does permit a number of
considerations concerning the error produced by the X-11-ARIMA procedure
and confirms some of the results obtained in earlier works with regard to the
information content of synthetic indicators for the control of the results of sea-
sonal adjustment.
The size of the X-11-ARIMA estimation error of the non-seasonal component
does not appear to depend significantly on the characteristics of the model used
to generate the series observed, with the important exception of the cases - rarely
found in economic series - in which one of the non-seasonal or seasonal compo-
nents is almost deterministic. The ability of the seasonally adjusted series to iden-
tify turning points worsens as the innovation variance in the seasonal component
increases.
As regards the synthetic indicators for the control of the procedure, the results
put forward here suggest that the weighted average Q, which is widely used to
assess the reliability of the seasonal adjustments obtained with X-11-ARIMA, is
less reliable than the diagnostics based on the stability of the estimates. In particular,
SLIDSP, proposed by Findley et al. (1990), is the indicator most closely
correlated with the pseudo-true estimation error produced by the procedure. The
forecasting errors in the one-year-ahead extrapolation of the seasonal compo-
nent, TREV and FYREV, have a good informative power with regard to the re-
sults of the seasonal adjustment; use of the revision error for just the first year,
FYREV, does not lessen the quality of the indicator in this experiment.


APPENDIX
Simulated series (1st group of 66 series)

[Figure panels: simulated series for θ = -0.95, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8 and 0.95]

Acknowledgments

The authors wish to thank D. F. Findley, A. Maravall and D. Piccolo for valuable
comments, as well as the participants at both the SIS-Banca Toscana conference
on «Models for short-term economic analysis» (Florence, June 15, 1994) and the
«Seasonal Adjustment Workshop» (Arlington, March 22-23, 1995), where earlier
versions of this paper were presented. They also thank an anonymous referee
for providing insightful comments. The views expressed in the paper are not
necessarily those of the Bank of Italy.

REFERENCES

BACCHILEGA, G. and GAMBETTA, G. (1984). Appunti per un confronto empirico tra diversi
metodi di depurazione stagionale, Centro di Specializzazione e Ricerche, Portici,
RS 16/84.
BELL, W. R. and HILLMER, S. C. (1984). Issues Involved with the Seasonal Adjustment of
Economic Time Series, Journal of Business and Economic Statistics, 2, 91-349.
BOX, G. E. P., HILLMER, S. C. and TIAO, G. C. (1978). Analysis and Modeling of Seasonal
Time Series. In Seasonal Analysis of Economic Time Series, ed. A. Zellner, US
Department of Commerce, 320.
BOX, G. E. P. and JENKINS, G. M. (1970). Time Series Analysis: Forecasting and Control,
San Francisco, Holden-Day.
BURMAN, J. P. (1967). Assessment of a Seasonal Adjustment Procedure by Spectral
Analysis, Statistician, 247-256.
BURMAN, J. P. (1980). Seasonal Adjustment by Signal Extraction, Journal of the Royal
Statistical Society, A, 321-37.
BURRIDGE, P. and WALLIS, K. F. (1984). Unobserved-Components Models for Seasonal
Adjustment Filters, Journal of Business and Economic Statistics, 2, 350-9.
CLEVELAND, W. P. and TIAO, G. C. (1976). Decomposition of Seasonal Time Series: A
Model for the Census X-11 Program, Journal of the American Statistical
Association, 71, 581-7.
DAGUM, E. B. (1987). Current Issues on Seasonal Adjustment, Working Paper-TSRA 87/6,
Statistics Canada, Ottawa.
DAGUM, E. B. (1988). The X-11-ARIMA/88 Seasonal Adjustment Method, Statistics
Canada, Catalogue K1A OT6.
DAGUM, E. B., CHAB, N. and SOLOMON, B. (1991). The Autocorrelation of Residuals from
the X-11-ARIMA Method, Journal of Official Statistics, 7, 181-194.
ENGLE, R. F. and GRANGER, C. W. J. (1987). Co-Integration and Error Correction:
Representation, Estimation and Testing, Econometrica, 55, 251-276.
FINDLEY, D. F. and MONSELL, B. C. (1986). New Techniques for Determining if a Time
Series Can Be Seasonally Adjusted Reliably. In Regional Econometric Modelling,
eds. M. R. Perryman and G. R. Schmidt, Amsterdam, Kluwer-Nijhoff, 195-228.

FINDLEY, D. F., MONSELL, B. C., SHULMAN, H. B. and PUGH, M. G. (1990). Sliding-Spans
Diagnostics for Seasonal and Related Adjustments, Journal of the American
Statistical Association, Vol. 85, N. 410, 345-55.
GHYSELS, E., GRANGER, C. W. J. and SIKLOS, P. L. (1996). Is Seasonal Adjustment a Linear
or Nonlinear Data-Filtering Process?, Journal of Business and Economic
Statistics, Vol. 14, N. 3, 374-86.
GRETHER, D. M. and NERLOVE, M. (1970). Some Properties of «Optimal» Seasonal
Adjustment, Econometrica, 38, 682-703.
HYLLEBERG, S. (1996). Comments on Ghysels E., Granger C. W. J. and Siklos P. L. (1996),
Is Seasonal Adjustment a Linear or Nonlinear Data-Filtering Process?, Journal of
Business and Economic Statistics, Vol. 14, N. 3, 388-89.
HILLMER, S. C. and TIAO, G. C. (1982). An ARIMA-Model-based Approach to Seasonal
Adjustment, Journal of the American Statistical Association, 77, 63-70.
LOTHIAN, J. and MORRY, M. (1978). A Set of Quality Control Statistics for the
X-11-ARIMA Seasonal Adjustment Program, Research Paper, Statistics Canada.
LOVELL, M. C. (1963). Seasonal Adjustment of Economic Time Series and Multiple
Regression Analysis, Journal of the American Statistical Association, 58, 993-1010.
MARAVALL, A. (1987). Minimum Mean Squared Error Estimation of the Noise in
Unobserved Component Models, Journal of Business and Economic Statistics, 5,
115-120.
MARAVALL, A. and GOMEZ, V. (1992). Signal Extraction in ARIMA Time Series. Program
SEATS, European University Working Paper, n. 92/65.
PIERCE, D. A. (1978). Seasonal Adjustment when both Deterministic and Stochastic
Seasonality are Present. In Seasonal Analysis of Economic Time Series, ed. A. Zellner,
US Department of Commerce, 242.
PICCOLO, D. (1985). Progetto DESEC: un'esperienza di ricerca statistica sulle serie
stagionali, Quaderni di Statistica e Econometria, VII, 5-40.
SIMS, C. A. (1978). Comments on «Seasonality: Causation, Interpretation and
Implications» by Clive W. J. Granger. In Seasonal Analysis of Economic Time Series, ed. A.
Zellner, Washington D.C., US Dept of Commerce, Bureau of the Census.
TUKEY, J. W. (1978). Comments on «Seasonality: Causation, Interpretation and
Implications» by Clive W. J. Granger. In Seasonal Analysis of Economic Time Series, ed. A.
Zellner, Washington D.C., US Dept of Commerce, Bureau of the Census.
WALLIS, K. F. (1982). Seasonal Adjustment and Revision of Current Data: Linear Filters
for the X-11 Methods, Journal of the Royal Statistical Society, A, 145, 74-85.
