Journal of Statistical Planning and Inference


journal homepage: www.elsevier.com/locate/jspi
A Kalman filter method for estimation and prediction of space–time data with an autoregressive structure

Bernardo Lagos-Álvarez a,∗, Leonardo Padilla a, Jorge Mateu b, Guillermo Ferreira a
a Department of Statistics, University of Concepción, Chile
b Department of Mathematics, University Jaume I, Castellón, Spain
Article info
Article history: Received 19 December 2017; Received in revised form 23 October 2018; Accepted 14 March 2019; Available online xxxx
Keywords: Kalman filter algorithm; Simple kriging; Space–time geostatistics; Space–time models; State-space system

Abstract
We propose a new Kalman filter algorithm to provide a formal statistical analysis of space–time data with an autoregressive structure. The Kalman filter technique captures the temporal dependence as well as the spatial correlation structure through state-space equations, and it is aimed at statistical inference in terms of both parameter estimation and prediction at unobserved locations. We highlight the role of the nugget effect in the observation equation. We test our procedure and compare it with classical kriging prediction via an intensive simulation study. We show that the Kalman filter is superior in both estimation, without using a plug-in approach, and prediction for spatio-temporal data, providing a suitable formal procedure for the statistical analysis of space–time data. Finally, an application to the prediction of daily air temperature data in some regions of southern Chile is presented.
© 2019 Elsevier B.V. All rights reserved.
1. Introduction
During the last decades, the Kalman filter (KF henceforth) algorithm has proved to be a powerful tool for the statistical treatment of state-space models, providing the estimation of parameters (given in the state vector), and the prediction of unobserved values both in time and space. The KF algorithm is based on the state-space representation of a linear process. This can be described by two equations: the first, called the state equation, is a linear combination of a vector of unobservable state variables and some process noise, whereas the second, called the observation equation, is formed by means of a linear relationship with the state variables and a random noise. Once the state-space system is defined, the KF can be used to estimate the state vector, to estimate the model parameters, and classically to build the one-step-ahead predictor of the process and its mean square error matrix. The idea of developing techniques for the estimation of an unobserved state from the observed process goes back to the sixties. For example, Kalman (1960) published his famous paper describing a recursive solution to the discrete-data linear filtering problem. A friendly introduction to the general idea of the KF algorithm is offered in Maybeck (1979) and in Meinhold and Singpurwalla (1983). More extensive references include Brockwell and Davis (1991), Harvey (1992), Durbin and Koopman (2001), Grewal et al. (2007) and Grewal (2011).
∗ Corresponding author.
E-mail address: bla@udec.cl (B. Lagos-Álvarez).
https://doi.org/10.1016/j.jspi.2019.03.005
As mentioned, the KF is one of the fundamental algorithms for the statistical treatment of classical time series models. For more complex time series models, there are also variants of the KF; see e.g. Naveau et al. (2005), Ferreira et al. (2013), Grassi and de Magistris (2014), Rezaie and Eidsvik (2014) and Rehman et al. (2016). This favorable evidence makes it a natural candidate for performing statistical inference for space–time processes. In this context, the space–time KF has been implemented to perform inference and prediction for processes in the physical, environmental, and biological sciences, among others. Huang and Cressie (1996) were perhaps among the first to use the KF for the prediction of locations that have greater storage of snow water in southwest Colorado. Hughes et al. (1999) used hidden Markov models with unobserved weather states to model space–time atmospheric precipitation. Wikle (2003) developed empirical Bayesian space–time KF models for monthly precipitation. An application of the KF to electroencephalographic source locations is proposed by Barton et al. (2009). The above applications have been performed with a limited number of observations or by using a plug-in method to estimate the parameters. Xu and Wikle (2007) proposed a space–time dynamic model formulation with parameter matrices restricted to prior scientific knowledge and/or to common spatial models. The estimation can be carried out via the expectation–maximization (EM) algorithm or by a general EM algorithm, as done by Amisigo and Van De Giesen (2005), Fassò and Cameletti (2009) and Katzfuss and Cressie (2011). An approach to nonlinear state estimation, known as the ensemble KF, has been discussed for example by Anderson (2001), Gillijns et al. (2006) and Stroud et al. (2010), among others. Mardia et al. (1998) established a connection between the KF algorithm and the kriging methodology (named the kriged Kalman filter), in which the state equation incorporates different forms of temporal dynamics to model space–time interactions.
In this paper we devote particular attention to comparing the performance of two spatial predictors: the predictor generated by the KF applied to our proposed model, which uses all available information up to time t, and the more classical kriging predictor that, unlike the KF, uses only the information associated with the current time t. The particular model in state-space form that we propose here places an error component, which includes both the measurement error and the small-scale error, inside the observation equation. This allows us to model another source of variability, the innovation, which explains the spatial structure of the process. Our method facilitates the computation at the time of estimation (from the convergence point of view), without sacrificing the desired variability in predictions. Moreover, we provide a brief overview of this important field of research in space–time analysis, discussing estimation and prediction techniques as well as applications to real-life data. In the context of parameter estimation, we discuss the use of maximum likelihood estimators (MLE) through the state-space systems.
The remainder of this article is structured as follows. Section 2 presents the class of space–time processes that we work with, comparing it with existing and alternative classes. Section 3 presents the Kalman filter and the corresponding estimation procedure for our state-space representations. Section 4 is devoted to a simulation study that evaluates the performance of the estimation and prediction methods. Section 5 presents an application to the prediction of air temperature data in Chile. The paper ends with some conclusions in Section 6.
2. Space–time AR(p) processes
Let $Z_{D\times T}$ be a space–time process formed by a series of values located at $s \in D \subset \mathbb{R}^2$ and at a time instant $t \in T \subseteq \mathcal{T}$, $\mathcal{T} \subset \mathbb{Z}^{+}$. In this paper our modeling strategies assume the following model for the observational data:
\[ Z_t(s) = \mu_t(s) + \varepsilon_t(s) + \omega_t(s), \tag{1a} \]
where $\mu_t(s)$ is a systematic component that explains most of the variation of $Z_t(s)$, $\omega_{D\times T} := \{\omega_t(s) : s \in D, t \in T\}$ is a Gaussian white noise (GWN, henceforth) with mean zero and variance $\sigma_\omega^2$, known as the nugget effect ($\omega_{D\times T} \sim \mathrm{GWN}(0, \sigma_\omega^2)$), see Cressie (1993), and $\varepsilon_t(s)$ is the unobserved smoothed stochastic component. We note that in this context there is a wide variety of models proposed for the analysis of space–time data that make it possible to perform prediction (extrapolation), smoothing, filtering or estimation (see Hefley et al., 2017; Zammit-Mangion and Cressie, 2017). In this paper, we assume that $\varepsilon_{D\times T} := \{\varepsilon_t(s) : s \in D, t \in T\}$ is a space–time autoregressive stationary process of order p, with mean zero, specified as in Huang and Cressie (1996), i.e.,
\[ \varepsilon_t(s) = \phi_1 \varepsilon_{t-1}(s) + \phi_2 \varepsilon_{t-2}(s) + \cdots + \phi_p \varepsilon_{t-p}(s) + \eta_t(s), \tag{1b} \]
where the coefficients $\phi_1, \ldots, \phi_p$ are chosen such that the absolute values of all roots (possibly complex) of $\lambda^p - \sum_{m=1}^{p} \phi_m \lambda^{p-m} = 0$ are less than 1 (to meet the conditions of stationarity). Finally, $\eta_{D\times T} := \{\eta_t(s) : s \in D, t \in T\}$ is a Gaussian space–time stationary process, with mean zero, independent of $\omega_{D\times T}$, and covariance function given by
\[ \mathrm{Cov}[\eta_t(s), \eta_{t'}(r)] = \begin{cases} C^{\eta}(s, r) & \text{if } t = t', \\ 0 & \text{if } t \neq t', \end{cases} \tag{2} \]
for $s, r \in D$. In addition, the spatial covariance function, $C^{\eta}(s, r)$, is specified by an admissible (or valid) covariance family. A very general choice is to adopt the Matérn covariance function (see Matérn, 2013) given by
\[ C^{\eta}(s, r) = \frac{\sigma_\eta^2}{2^{\nu-1}\,\Gamma(\nu)} \left(\frac{\|s - r\|}{\alpha}\right)^{\nu} K_\nu\!\left(\frac{\|s - r\|}{\alpha}\right), \qquad \sigma_\eta^2 > 0,\ \alpha > 0,\ \nu > 0, \]
where $K_\nu$ is the modified Bessel function of the second kind of order $\nu$. The range parameter $\alpha$ controls the rate of decay with distance, with larger values of $\alpha$ corresponding to more highly correlated observations. A popular special case of the Matérn family is the exponential model $C^{\eta}(s, r) = \sigma_\eta^2 \exp(-\|s - r\|/\alpha)$, which is obtained when $\nu = 1/2$.
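As a small illustration of the exponential special case, the following Python sketch builds the matrix of covariances $C^{\eta}(s_i, s_j) = \sigma_\eta^2 \exp(-\|s_i - s_j\|/\alpha)$ for a set of locations. The function name `exp_cov`, the use of NumPy and the example coordinates are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

def exp_cov(coords_a, coords_b, sigma2_eta, alpha):
    """Exponential covariance (Matern with nu = 1/2) between two sets of 2-D locations."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    # Pairwise Euclidean distances ||s - r||, shape (len(a), len(b)).
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2_eta * np.exp(-dist / alpha)

# Hypothetical example: three locations, with parameter values chosen for illustration.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.5)]
C_eta = exp_cov(sites, sites, sigma2_eta=0.459, alpha=0.8)
```

The diagonal of `C_eta` equals $\sigma_\eta^2$, and increasing `alpha` slows the decay of the off-diagonal entries, in line with the interpretation of the range parameter given above.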
We now consider the issue of the spatial and temporal dependencies of the process given in (1). The space–time covariance function of the spatio-temporal AR(p) process is defined by
\[ \begin{aligned} C^{Z}(t, t'; s, r) &:= \mathrm{Cov}[Z_t(s), Z_{t'}(r)] \\ &= \mathrm{Cov}[\varepsilon_t(s), \varepsilon_{t'}(r)] + \mathrm{Cov}[\omega_t(s), \omega_{t'}(r)] \\ &= C_k^{\varepsilon}(s, r) + \sigma_\omega^2\, \delta_{\{k=0,\, s=r\}}, \end{aligned} \tag{3} \]
where $\delta$ is the indicator function, $k = |t - t'|$, and the covariance function $C_k^{\varepsilon}(s, r)$ is given by
\[ C_k^{\varepsilon}(s, r) = \sum_{m=1}^{p} \phi_m C_{k-m}^{\varepsilon}(s, r) + C^{\eta}(s, r)\, \delta_{\{k=0\}}, \qquad k = 0, 1, \ldots. \tag{4} \]
For $k = 0, 1, \ldots, p$, the covariance function $C_k^{\varepsilon}(s, r)$ can be determined by solving the $p + 1$ Yule–Walker equations given in (4). The remaining values for $C_k^{\varepsilon}(s, r)$, with $k = p + 1, p + 2, \ldots$, are obtained recursively. Notice that by solving the first $(p + 1)$ equations we have
\[ C_k^{\varepsilon}(s, r) = f(\phi_1, \ldots, \phi_p, k)\, C^{\eta}(s, r), \tag{5} \]
for a particular $f(\phi_1, \ldots, \phi_p, k)$ that is easy to calculate. In particular, if we consider an AR(1) space–time model, from (4) we have
\[ \begin{aligned} C_0^{\varepsilon}(s, r) &= \phi\, C_1^{\varepsilon}(s, r) + C^{\eta}(s, r), \\ C_1^{\varepsilon}(s, r) &= \phi\, C_0^{\varepsilon}(s, r). \end{aligned} \]
Therefore, the covariance function for the time-stationary space–time AR(1) process is given by
\[ C_k^{\varepsilon}(s, r) = \frac{\phi^{k}}{1 - \phi^{2}}\, C^{\eta}(s, r), \qquad k = 0, 1, \ldots, \tag{6} \]
for $s, r \in D$. Thus, the space–time covariance function of the process $Z_{D\times T}$ defined in (3) can be written as the product of a purely spatial with a purely temporal covariance function. In such a case, we say that the space–time covariance function $C^{Z}$ defined in (3) is a separable space–time covariance function (Gneiting et al., 2006).
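The AR(1) result in (6) can be checked numerically against the Yule–Walker relations in (4). The sketch below is only an illustration: the function name `ar1_cov` and the 2 × 2 matrix standing in for $C^{\eta}$ are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def ar1_cov(k, phi, C_eta):
    """Lag-k covariance of the space-time AR(1) process, as in Eq. (6)."""
    return phi ** abs(k) / (1.0 - phi ** 2) * np.asarray(C_eta, dtype=float)

phi = 0.7
# Hypothetical 2 x 2 spatial innovation covariance C^eta.
C_eta = np.array([[0.459, 0.2], [0.2, 0.459]])

C0 = ar1_cov(0, phi, C_eta)
C1 = ar1_cov(1, phi, C_eta)
C2 = ar1_cov(2, phi, C_eta)

# Yule-Walker relations from Eq. (4) with p = 1.
yw_lag0_ok = np.allclose(C0, phi * C1 + C_eta)   # k = 0: C0 = phi * C1 + C_eta
yw_lag1_ok = np.allclose(C1, phi * C0)           # k = 1: C1 = phi * C0
yw_lag2_ok = np.allclose(C2, phi * C1)           # k = 2: recursion for k > p
```

Each lag-k matrix is the same spatial matrix scaled by the temporal factor $\phi^{k}/(1-\phi^{2})$, which is the separability property stated above.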
A number of non-separable models have been developed in the literature, among which we mention the so-called Gneiting, Iaco-Césare or Porcu type; variants are available in the CompRandFld R package of Padoan and Bevilacqua (2015) (see Montero et al. (2015) for a complete overview).
The model defined by (1a) has been considered in different works, for example by Huang and Cressie (1996). These authors did not include the error term in the observation equation; in addition, the variability of the error term was incorporated as part of the covariance structure of the spatial innovation process, and a simple moment-based procedure for estimating the parameters in the model was developed.
Another class of methodology that deals with a space–time statistical model is the well-known kriged Kalman filter (KKF), which couples the methodology of KF and kriging; see Mardia et al. (1998). These authors decrease the computational cost by applying the concept of dimension reduction.
Wikle and Cressie (1999) and Cressie and Wikle (2014) proposed a model in which the state equation, the process model, introduces a component that tries to capture a large part of the variability produced by the different spatio-temporal scales, using the Karhunen–Loève expansion. From a usual descriptive point of view, this approach is applied in order to reduce the dimensionality of the number of different sources of spatio-temporal variability, represented by the, say, m most relevant parameters. This removes any structure from the error component, leaving it as a nugget effect. This is different from our proposed model, where the nugget effect is part of the observation equation, the data model. The essence of this proposal is to reduce the dimensionality of the space–time data, see Li and Qi (2011) and Cressie et al. (2010), via Fixed Rank Kriging to achieve an efficient (in time) computational development of estimation and prediction procedures using the orthogonal approximation. In Cressie and Wikle (2014) the spatial dynamics are directly constructed from the data, from the complete and orthonormal sequence of deterministic spatial functions, which does not have a predefined behavior. In our proposed model, the spatial structure is captured from a predefined and admissible parametric model that fits the data well, and whose parameters have a clear interpretation for the spatial continuity. Also, Cressie and Wikle (2014) apply the KF algorithm to the time series, again produced from the data, from the main components a(t). In contrast, we consider a common predefined parametric model with an autoregressive structure for each time series, which allows the physical phenomenon to be interpreted in terms of these parameters. In a different context, Fassò and Cameletti (2010) consider the following model
\[ Z_t(s) = X_t(s)\beta + K(s) Y_t + e_t(s), \tag{7a} \]
\[ Y_t = G Y_{t-1} + \eta_t, \tag{7b} \]
where $X_t(s)$ is an l-dimensional spatio-temporal field of known covariates observed at time t at location s, $Y_t$ is a d-dimensional vector, which is constant in space, and $K(s)$ is a matrix which defines a d-dimensional field of known coefficients. The term $K(s)$ may be based on the observed data through an EOF decomposition (Wikle and Cressie, 1999) or it may be constant over the geographical space. Finally, $e_t(s) = \omega_t(s) + \varepsilon_t(s)$, with $\varepsilon_t(s)$ a Gaussian instrumental error which is white noise in space and time with variance $\sigma_\varepsilon^2$ (nugget effect), whereas $\omega_t(s)$ is white noise in time, but correlated over space with a covariance function $C^{\omega}$. The term $\eta_t$ is a d-dimensional Gaussian white noise process with variance–covariance matrix $\Sigma_\eta$. The error $e_t(s)$ comes from a zero-mean Gaussian random field with covariance function $C^{e} \equiv \sigma_\omega^2 \Gamma$, where $\Gamma$ is the scaled spatial covariance function given by
\[ \Gamma(h) = \begin{cases} 1 + \sigma_\varepsilon^2/\sigma_\omega^2, & h = 0, \\ C^{\omega}(h), & h > 0. \end{cases} \]
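We then have
\[ \begin{aligned} C^{Z}(t, t'; s, r) &:= \mathrm{Cov}[Z_t(s), Z_{t'}(r)] \\ &= K(s)\, \mathrm{Cov}[Y_t, Y_{t'}]\, K(r)^{\top} + \mathrm{Cov}[\omega_t(s), \omega_{t'}(r)] + \mathrm{Cov}[\varepsilon_t(s), \varepsilon_{t'}(r)]. \end{aligned} \]
For the particular case, Fassò and Cameletti (2010) used a model based on only one term in the truncation of the EOF decomposition, so that
\[ \begin{aligned} C^{Z}(t, t'; s, r) &= K(s) K(r)\, \frac{\phi^{|t - t'|}}{1 - \phi^{2}}\, \sigma_\eta^2 + C^{\omega}(t, t'; s, r)\, \delta_{\{t = t'\}} + \sigma_\varepsilon^2\, \delta_{\{t = t';\, s = r\}} \\ &= K(s) K(r)\, \frac{\phi^{|t - t'|}}{1 - \phi^{2}}\, \sigma_\eta^2 + \sigma_\omega^2\, \Gamma(h)\, \delta_{\{t = t'\}}, \end{aligned} \]
where $\delta$ is the indicator function, $\phi = \sum_{i=1}^{n} K(s_i) b(s_i) / \sum_{i=1}^{p} K^{2}(s_i)$, and the $b(s_i)$'s satisfy equation (9) of Wikle and Cressie (1999).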
Models more closely related to the one we propose here are those considered in Sahu (2012) and Cameletti et al. (2013), for which inference is carried out through Bayesian techniques. Cameletti et al. (2013) highlight the use and application of the Integrated Nested Laplace Approximation algorithm (INLA) (Rue et al., 2009).
Recently, Ferreira et al. (2017) have proposed a hierarchical space–time model that generalizes the construction of the temporal dependence to the long memory case, keeping the space–time covariance separable. The authors write the part associated with the time dimension as an MA(∞) and, using Chan and Palma (1998), apply the KF algorithm to the model as a state-space system in the temporal dimension, without any relation to the spatial dimension.
The model and methods of analysis that we propose are markedly different. First, we give great importance to the nugget effect, which should not be omitted when the data are in a point-referenced format (see Matheron, 1971; Thioulouse et al., 1993; Cressie and Wikle, 2015). This effect is incorporated into the observation equation (10a), explaining most of the spatial variability due to short-range dependence and the measurement error. Note that the incorporation of this parameter into the model can bring difficulties in the fitting of the model, while being an important ingredient for the predictions. Secondly, we show explicitly the procedure used for the temporal and spatial predictions, while the focus in Ferreira et al. (2017) is more on the temporal dimension.
Finally, we emphasize that our proposal follows a frequentist approach. We use the KF algorithm for estimation and prediction of the space–time processes, but on the basis of a new updating scheme of the unobserved state vector. This scheme differs from the above-mentioned proposals because the process $\eta_{D\times T}$ is a sequence of temporally independent and spatially stationary Gaussian processes with covariance function given by (2), and because we propose an AR(p) process to deal with the temporal dynamics rather than a first-order Markovian temporal structure.
3. Statistical inference: estimation and prediction methods
3.1. Classical kriging
Diggle and Ribeiro (2007) and Banerjee et al. (2014) implemented the classical or simple kriging (SK, henceforth) as a prediction method built upon the minimum mean squared prediction error for the model
\[ Z_t(s) = \mu_t(s) + \varepsilon_t(s) + \omega_t(s), \qquad s \in D \subset \mathbb{R}^2, \tag{8} \]
where $\{\omega_{D\times t}\}_{t=1}^{T}$ with $\omega_{D\times t} := \{\omega_t(s) : s \in D\}$ and $\{\varepsilon_{D\times t}\}_{t=1}^{T}$ with $\varepsilon_{D\times t} := \{\varepsilon_t(s) : s \in D\}$ are two temporally independent and spatially stationary Gaussian processes, with covariance function as in (2). In order to evaluate the SK methodology for the purely spatial model (8), we develop independent SK of Z at each time, obtaining a heterogeneous behavior in the estimates of the covariance structure of the underlying process $\varepsilon_{D\times T}$, i.e., $\mathrm{Cov}[\varepsilon_t(s), \varepsilon_t(r)] = C^{\varepsilon_t}(s, r)$ for $t \in \{1, \ldots, T\}$. In this context, and on the assumption of Gaussianity, the SK predictor of Z at time t and location $s_0$, given the information $Z_t = (Z_t(s_1), \ldots, Z_t(s_n))^{\top}$, is given by
\[ Z^{(SK)}_{t|t}(s_0) := E[Z_t(s_0) \mid Z_t] = \mu_t(s_0) + E[\varepsilon_t(s_0) \mid Z_t], \]
with
\[ E[\varepsilon_t(s_0) \mid Z_t] = E[\varepsilon_t(s_0)] + G_t(s_0) \Delta_t^{-1} (Z_t - E[Z_t]), \]
and its prediction variance is given by
\[ \Delta^{(SK)}_{t|t}(s_0) := \mathrm{Var}[Z_t(s_0) \mid Z_t] = \mathrm{Var}[\varepsilon_t(s_0) \mid Z_t] + \sigma^2_{\omega_t}, \]
with
\[ \mathrm{Var}[\varepsilon_t(s_0) \mid Z_t] = \mathrm{Var}[\varepsilon_t(s_0)] - G_t(s_0) \Delta_t^{-1} G_t(s_0)^{\top}, \]
where $E[\varepsilon_t(s_0)] = 0$, $\mathrm{Var}[\varepsilon_t(s_0)] = \sigma^2_{\varepsilon_t}$, $\Delta_t := \mathrm{Var}[Z_t] = C^{\varepsilon_t} + \sigma^2_{\omega_t} I_n$ and
\[ \begin{aligned} G_t(s_0) &:= \mathrm{Cov}[\varepsilon_t(s_0), Z_t] \\ &= (\mathrm{Cov}[\varepsilon_t(s_0), Z_t(s_1)], \ldots, \mathrm{Cov}[\varepsilon_t(s_0), Z_t(s_n)]) \\ &= (C^{\varepsilon_t}(s_0, s_1), \ldots, C^{\varepsilon_t}(s_0, s_n)). \end{aligned} \]
Note that the innovation at location $s_j$, defined as $Z_t(s_j) - E[Z_t(s_j)]$, incorporates an unconditional expectation which corresponds to the systematic trend denoted here by $\mu_t := E[Z_t]$. Thus the prediction of $Z_t(s_0)$ given the information $Z_t$, with its respective prediction variance, is given by
\[ Z^{(SK)}_{t|t}(s_0) = \mu_t(s_0) + G_t(s_0) \Delta_t^{-1} (Z_t - \mu_t), \tag{9a} \]
\[ \Delta^{(SK)}_{t|t}(s_0) = \sigma^2_{\varepsilon_t} - G_t(s_0) \Delta_t^{-1} G_t(s_0)^{\top} + \sigma^2_{\omega_t}. \tag{9b} \]
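Equations (9a) and (9b) translate directly into a few lines of linear algebra. The following Python sketch is a minimal illustration under the assumption that the trend and the covariances are already known; the function name `simple_kriging`, its argument names and the one-site example values are hypothetical and are not taken from the paper.

```python
import numpy as np

def simple_kriging(z, mu, mu0, C_eps, c0, sigma2_eps, sigma2_omega):
    """Simple kriging predictor and prediction variance, following Eqs. (9a)-(9b).

    z            : (n,) observations Z_t at the sampled locations
    mu, mu0      : trend at the sampled locations (n,) and at s0 (scalar)
    C_eps        : (n, n) covariance of eps_t among the sampled locations
    c0           : (n,) vector G_t(s0) of covariances between eps_t(s0) and Z_t
    sigma2_eps   : Var[eps_t(s0)]
    sigma2_omega : nugget variance
    """
    z = np.asarray(z, dtype=float)
    c0 = np.asarray(c0, dtype=float)
    n = z.size
    delta = np.asarray(C_eps, dtype=float) + sigma2_omega * np.eye(n)  # Delta_t
    weights = np.linalg.solve(delta, c0)            # Delta_t^{-1} G_t(s0)^T
    pred = mu0 + weights @ (z - np.asarray(mu, dtype=float))           # Eq. (9a)
    var = sigma2_eps - weights @ c0 + sigma2_omega                     # Eq. (9b)
    return pred, var

# Hypothetical one-site example with zero trend.
pred, var = simple_kriging(z=[2.0], mu=[0.0], mu0=0.0, C_eps=[[0.9]],
                           c0=[0.45], sigma2_eps=0.9, sigma2_omega=0.1)
```

Solving the linear system instead of forming $\Delta_t^{-1}$ explicitly is a numerical design choice of this sketch, not something prescribed by the text.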
In the case that model (8) is specified via parameters, i.e. $\mu_t(s) = \beta_t^{\top} X_t(s)$ and $C^{\varepsilon_t}(s, r) := C^{\varepsilon_t}(s, r; \theta_t)$, a useful estimation procedure is based on the maximization of the likelihood, specifically the maximum likelihood method (ML), see Mardia and Marshall (1984), Zimmerman and Zimmerman (1991), and Bevilacqua et al. (in press) and references therein. Note that the ML estimator is applied at each time, therefore we estimate the parameter vector $\Theta_t = (\beta_t^{\top}, \theta_t^{\top}, \sigma^2_{\omega_t})^{\top}$, T times. In the case that we consider a fixed temporal domain T in (8), the results obtained could be useful for a descriptive analysis of the non-stationary behavior of the parameters of the process.
Another approach to inference and prediction for the space–time process $Z_{D\times T}$ is given by the well-known space–time kriging methods, see Montero et al. (2015), Chap. 6. They proposed a goodness-of-fit analysis of the theoretical space–time semivariogram, where the location index $s \in D \subset \mathbb{R}^2$ is considered as $s' := (s, t) \in D' \subset \mathbb{R}^2 \times \mathbb{Z}^{+}$ (or it can be easily extended to $D' \subset \mathbb{R}^2 \times \mathbb{R}^{+}$) with the corresponding metric over $D'$. The CompRandFld R package of Padoan and Bevilacqua (2015) is concerned with these analyses by using the space–time kriging procedure.
Regarding the estimation procedure and prediction using space–time kriging, it is more common in geostatistics to consider the observation spatial domain D as a fixed domain. When the spatial domain is fixed and bounded, the only way to obtain more data is by sampling within the domain. Unlike the spatial domain, the unidirectional flow of time forces us to modify the temporal support when new information is incorporated over time, implying a modification of the scale parameter associated with the temporal dimension. As new information is incorporated in time, if the same correlation structure is assumed, this parameter would be refitted and become smaller, which modifies its interpretation.
Implications of the above are the following. First, the construction of non-separable space–time semivariogram models generally follows the construction of usual spatial models. Second, the scale parameter associated with the temporal dimension should be represented in the same way as in the purely spatial case. However, this is confusing when we incorporate new information over time. Specifically, the interpretation of the range parameter (scale) will be modified. It may happen that the semivariogram parameterized by the scale parameter is erroneously specified, mismatching a semivariogram parameterized by the nugget parameter.
3.2. Kalman filter
Unlike the kriging methodology, the KF uses all available information at time t, i.e., it incorporates past data as well as current data to predict at a location $s_0$ where no observations are available.
The application of the KF requires writing the model in a state-space form. State-space systems provide a useful framework for the efficient calculation of estimates and forecasts. In this section, we review the application of this representation to the case of space–time processes $Z_{D\times T}$. Consider the following state-space system for a location in the spatial domain $D \subset \mathbb{R}^2$,
\[ Z_t(s) = \beta^{\top} X_t(s) + H^{\top} \xi_t(s) + \omega_t(s), \tag{10a} \]
\[ \xi_t(s) = F \xi_{t-1}(s) + V_t(s), \tag{10b} \]
where $Z_t(s)$ is the observation for time t at location s, $\beta = (\beta_1, \ldots, \beta_l)^{\top}$ is a vector of parameters, $X_t(s) = (X_t^{(1)}(s), \ldots, X_t^{(l)}(s))$ is an l-dimensional vector of non-stochastic regressors, H is an observation operator that can be dependent on s (Mardia et al., 1998), $\xi_t(s)$ is a state vector, and $\omega_t(s)$ is an observational noise with variance $\sigma_\omega^2$. On the other hand, F is a state transition operator, and $V_t(s)$ is a state noise. For $\mathcal{M}_{\mathbb{R}}(p \times q)$ being the space of real-valued matrices of dimension $p \times q$, we have $H \in \mathcal{M}_{\mathbb{R}}(p \times 1)$, $\xi_t \in \mathcal{M}_{\mathbb{R}}(p \times 1)$, $\omega_t \in \mathcal{M}_{\mathbb{R}}(1 \times 1)$, $F \in \mathcal{M}_{\mathbb{R}}(p \times p)$ and $V_t \in \mathcal{M}_{\mathbb{R}}(p \times 1)$. The temporal-stationary space–time AR(p) model represented here by Eqs. (1a)–(1b) can be written in a state-space system as follows:
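\[ H = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix}, \quad \xi_t(s) = \begin{pmatrix} \varepsilon_t(s) \\ \varepsilon_{t-1}(s) \\ \vdots \\ \varepsilon_{t-(p-1)}(s) \end{pmatrix}, \quad V_t(s) = \begin{pmatrix} \eta_t(s) \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix}, \quad F = \begin{pmatrix} \phi_1 & \phi_2 & \cdots & \phi_{p-1} & \phi_p \\ 1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}, \tag{11} \]
for $s \in D$, $t \in T \subset \mathcal{T}$.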
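To make the structure of (11) concrete, the sketch below builds H and the companion matrix F from a vector of AR coefficients and checks the stationarity condition of Section 2 through the eigenvalues of F, which coincide with the roots of $\lambda^p - \sum_{m=1}^{p} \phi_m \lambda^{p-m} = 0$. The function names `ar_state_space` and `is_stationary` and the example coefficients are assumptions made for illustration.

```python
import numpy as np

def ar_state_space(phi):
    """Return (H, F) of Eq. (11) for AR coefficients phi = (phi_1, ..., phi_p)."""
    phi = np.asarray(phi, dtype=float)
    p = phi.size
    H = np.zeros((p, 1))
    H[0, 0] = 1.0                  # selects eps_t(s) from the state vector
    F = np.zeros((p, p))
    F[0, :] = phi                  # first row carries the AR coefficients
    F[1:, :-1] = np.eye(p - 1)     # sub-diagonal of ones shifts the lagged values
    return H, F

def is_stationary(phi):
    """True if all eigenvalues of the companion matrix F lie inside the unit circle."""
    _, F = ar_state_space(phi)
    return bool(np.all(np.abs(np.linalg.eigvals(F)) < 1.0))

# Hypothetical AR(2) coefficients.
H, F = ar_state_space([0.5, 0.3])
```

For p = 1 the construction reduces to H = 1 and F = φ, so the stationarity check becomes |φ| < 1.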
Given the state-space representation (11) of $Z_t(s)$, the KF equations can be used for estimating model parameters, state vectors and future observations. Note that without considering a spatially correlated error, model (10a) reduces to the one described by Prado and West (2010). Under the Gaussianity assumption, the KF produces the minimum mean square estimator of the state vector along with its mean square error matrix, conditional on past information. To this end, we define the optimal space–time predictor of $\xi_t(s)$ and the prediction covariance between the locations s and $r \in D$ and $t \in T$ as
\[ \xi_{t|t-1}(s) := E[\xi_t(s) \mid Z_{1:t-1}], \tag{12a} \]
\[ P_{t|t-1}(s, r) := \mathrm{Cov}[\xi_t(s), \xi_t(r) \mid Z_{1:t-1}], \tag{12b} \]
which are linear functions of the finite past $Z_j = (Z_j(s_1), \ldots, Z_j(s_n))^{\top}$, $j = 1, \ldots, t - 1$, that we denote as $Z_{1:t-1}$. Given the one-step prediction and the prediction covariance defined in (12a) and (12b), respectively, the updating Kalman equations are given by
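\[ \xi_{t|t}(s) = \xi_{t|t-1}(s) + G_{t|t-1}(s) \Delta_{t|t-1}^{-1} (Z_t - Z_{t|t-1}), \tag{13a} \]
\[ P_{t|t}(s, r) = P_{t|t-1}(s, r) - G_{t|t-1}(s) \Delta_{t|t-1}^{-1} G_{t|t-1}(r)^{\top}, \tag{13b} \]
where
\[ \begin{aligned} G_{t|t-1}(s) &:= \mathrm{Cov}[\xi_t(s), Z_t \mid Z_{1:t-1}] \\ &= \big( \mathrm{Cov}[\xi_t(s), Z_t(s_1) \mid Z_{1:t-1}], \ldots, \mathrm{Cov}[\xi_t(s), Z_t(s_n) \mid Z_{1:t-1}] \big) \\ &= \big( P_{t|t-1}(s, s_1) H, \ldots, P_{t|t-1}(s, s_n) H \big), \end{aligned} \]
\[ \begin{aligned} \Delta_{t|t-1} &:= \mathrm{Var}[Z_t \mid Z_{1:t-1}] \\ &= \begin{pmatrix} \mathrm{Var}[Z_t(s_1) \mid Z_{1:t-1}] & \cdots & \mathrm{Cov}[Z_t(s_1), Z_t(s_n) \mid Z_{1:t-1}] \\ \vdots & \ddots & \vdots \\ \mathrm{Cov}[Z_t(s_n), Z_t(s_1) \mid Z_{1:t-1}] & \cdots & \mathrm{Var}[Z_t(s_n) \mid Z_{1:t-1}] \end{pmatrix} \\ &= \begin{pmatrix} H^{\top} P_{t|t-1}(s_1, s_1) H & \cdots & H^{\top} P_{t|t-1}(s_1, s_n) H \\ \vdots & \ddots & \vdots \\ H^{\top} P_{t|t-1}(s_n, s_1) H & \cdots & H^{\top} P_{t|t-1}(s_n, s_n) H \end{pmatrix} + I_n \sigma_\omega^2, \end{aligned} \]
\[ \begin{aligned} Z_{t|t-1} &= E[Z_t \mid Z_{1:t-1}] = \begin{pmatrix} E[Z_t(s_1) \mid Z_{1:t-1}] \\ \vdots \\ E[Z_t(s_n) \mid Z_{1:t-1}] \end{pmatrix} \\ &= \begin{pmatrix} \beta^{\top} X_t(s_1) + H^{\top} \xi_{t|t-1}(s_1) \\ \vdots \\ \beta^{\top} X_t(s_n) + H^{\top} \xi_{t|t-1}(s_n) \end{pmatrix}, \end{aligned} \]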
where $G_{t|t-1}(s) \in \mathcal{M}_{\mathbb{R}}(p \times n)$. The prediction of $Z_t(s)$ given the data up to time t, and its respective prediction variance, are given by
\[ Z^{(KF)}_{t|t}(s) = \beta^{\top} X_t(s) + H^{\top} \xi_{t|t}(s), \tag{14a} \]
\[ \Delta^{(KF)}_{t|t}(s) = H^{\top} P_{t|t}(s, s) H + \sigma_\omega^2. \tag{14b} \]
Finally, the algorithm is completed by updating the values of $\xi_{t|t}(s)$ and $P_{t|t}(s, r)$ defined in (13) in order to make a one-step-ahead prediction
\[ \xi_{t+1|t}(s) = F \xi_{t|t}(s), \]
\[ P_{t+1|t}(s, r) = F P_{t|t}(s, r) F^{\top} + Q(s, r), \]
where
\[ Q(s, r) := \mathrm{Cov}(V_t(s), V_t(r)) = \begin{pmatrix} C^{\eta}(s, r) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 0 \end{pmatrix}, \]
and $Q(s, r) \in \mathcal{M}_{\mathbb{R}}(p \times p)$.
Recalling that the process $\varepsilon_{D\times T}$ has mean zero, a reasonable choice for the initial predictor is $\xi_{1|0}(s) := E[\xi_1(s)] = 0$, and the prediction covariance between locations s and r is
\[ P_{1|0}(s, r) := \mathrm{Cov}[\xi_1(s), \xi_1(r)] = \begin{pmatrix} C_0^{\varepsilon}(s, r) & \cdots & C_{p-1}^{\varepsilon}(s, r) \\ \vdots & \ddots & \vdots \\ C_{p-1}^{\varepsilon}(s, r) & \cdots & C_0^{\varepsilon}(s, r) \end{pmatrix}, \]
with $C_j^{\varepsilon}(s, r)$ coming from (5).
Observe that the SK prediction $Z^{(SK)}_{t|t}(s_0)$ given in (9a) is not recursive and does not depend on the data up to $t - 1$. This is different from the prediction given in (14a), which uses all available information at time t.
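The recursion formed by (13a), (13b), the one-step-ahead prediction and the initial conditions can be sketched compactly for the AR(1) case, where p = 1, H = 1 and F = φ. The code below is an illustrative sketch, not the authors' implementation: it assumes a zero trend (β = 0), stacks the states of all N sites (monitored and prediction sites) in one vector, and uses hypothetical names (`kf_ar1_filter`, `obs_idx`) and toy input values.

```python
import numpy as np

def kf_ar1_filter(Z, C_eta, phi, sigma2_omega, obs_idx):
    """Kalman filter for the space-time AR(1) case (p = 1, H = 1, F = phi), zero trend.

    Z       : (T, n) observations at the n monitored locations
    C_eta   : (N, N) innovation covariance over all N sites (monitored + prediction)
    obs_idx : indices (length n) of the monitored locations among the N sites
    Returns the filtered states xi_{t|t} (T, N), their variances P_{t|t}(s, s) (T, N)
    and the Gaussian log-likelihood up to an additive constant.
    """
    Z = np.asarray(Z, dtype=float)
    C_eta = np.asarray(C_eta, dtype=float)
    T, n = Z.shape
    N = C_eta.shape[0]
    xi_pred = np.zeros(N)                      # xi_{1|0}(s) = 0
    P_pred = C_eta / (1.0 - phi ** 2)          # P_{1|0}(s, r) = C_0^eps(s, r), Eq. (6)
    xi_filt = np.empty((T, N))
    var_filt = np.empty((T, N))
    loglik = 0.0
    for t in range(T):
        G = P_pred[:, obs_idx]                                     # G_{t|t-1}, (N, n)
        Delta = P_pred[np.ix_(obs_idx, obs_idx)] + sigma2_omega * np.eye(n)
        innov = Z[t] - xi_pred[obs_idx]                            # Z_t - Z_{t|t-1}
        K = np.linalg.solve(Delta, G.T).T                          # G Delta^{-1}
        xi_upd = xi_pred + K @ innov                               # Eq. (13a)
        P_upd = P_pred - K @ G.T                                   # Eq. (13b)
        _, logdet = np.linalg.slogdet(Delta)
        loglik += -0.5 * (logdet + innov @ np.linalg.solve(Delta, innov))
        xi_filt[t] = xi_upd
        var_filt[t] = np.diag(P_upd)
        xi_pred = phi * xi_upd                                     # xi_{t+1|t} = F xi_{t|t}
        P_pred = phi ** 2 * P_upd + C_eta                          # F P F^T + Q
    return xi_filt, var_filt, loglik

# Toy example: site 0 is monitored, site 1 is a prediction location.
C_eta_toy = 0.51 * np.array([[1.0, 0.5], [0.5, 1.0]])
xi, var, ll = kf_ar1_filter(np.array([[1.0], [0.5]]), C_eta_toy, 0.7, 0.1, [0])
```

Under these assumptions, the KF prediction (14a) at any site is the corresponding entry of `xi`, and the prediction variance (14b) is the matching entry of `var` plus `sigma2_omega`.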
3.3. Estimation
The estimation of the parameter vector of the model (1a)–(1b) is carried out through the KF described above via maximum likelihood, i.e., we maximize the log-likelihood function $\ell(\cdot)$ (up to a constant) with respect to $\Theta$,
\[ \ell(\Theta \mid Z_{1:T}) = -\frac{1}{2} \sum_{t=1}^{T} \Big( \log |\Delta_{t|t-1}| + (Z_t - Z_{t|t-1})^{\top} \Delta_{t|t-1}^{-1} (Z_t - Z_{t|t-1}) \Big), \tag{15} \]
where the n-dimensional vector $Z_t = (Z_t(s_1), \ldots, Z_t(s_n))^{\top}$ denotes the complete sample information at time t, and $\Theta = (\beta^{\top}, \phi^{\top}, \theta^{\top}, \sigma_\omega^2)^{\top}$, with $\beta = (\beta_1, \ldots, \beta_l)^{\top}$, $\phi = (\phi_1, \ldots, \phi_p)^{\top}$, $\theta = (\sigma_\eta^2, \alpha)^{\top}$. The solution of $\nabla_\Theta \ell = 0$ (where $\nabla$ denotes the gradient or vector of first derivatives) is approximated by partitioning $\Theta = (\beta^{\top}, \Theta^{*\top})^{\top}$. A closed-form solution for $\beta$ is given by
\[ \hat{\beta} = \left( \sum_{t=1}^{T} X_t^{\top} \Delta_{t|t-1}^{-1} X_t \right)^{-1} \sum_{t=1}^{T} X_t^{\top} \Delta_{t|t-1}^{-1} (Z_t - \Lambda \xi_{t|t-1}), \tag{16} \]
holding the remaining parameters $\hat{\Theta}^{*}$ fixed at their current values. Since there is no closed form for $\Theta^{*}$, we use the well-known Newton–Raphson algorithm
\[ \hat{\Theta}^{*\{m\}} = \hat{\Theta}^{*\{m-1\}} - \big( \nabla \nabla^{\top} \ell(\hat{\Theta}^{\{m-1\}} \mid Z_{1:T}) \big)^{-1} \nabla \ell(\hat{\Theta}^{\{m-1\}} \mid Z_{1:T}), \tag{17} \]
where $\nabla \ell(\Theta \mid Z_{1:T})$ is again the gradient, and $\nabla \nabla^{\top} \ell(\Theta \mid Z_{1:T})$ is the Hessian (matrix of second derivatives) of the log-likelihood. The iteration is stopped when a convergence criterion, defined according to a given tolerance level, is met.
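The Newton–Raphson update (17) and its stopping rule can be sketched for a generic log-likelihood function. The code below is a hedged illustration only: it approximates the gradient and Hessian by central finite differences (an assumption, since the text does not say how the derivatives are obtained), and it is demonstrated on a hypothetical concave quadratic `toy_loglik` rather than on the KF log-likelihood (15). All function names are illustrative.

```python
import numpy as np

def num_grad(f, x, h=1e-4):
    """Central finite-difference gradient of f at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def num_hess(f, x, h=1e-4):
    """Central finite-difference Hessian of f at x, symmetrized."""
    x = np.asarray(x, dtype=float)
    d = x.size
    hess = np.zeros((d, d))
    for i in range(d):
        e = np.zeros(d)
        e[i] = h
        hess[:, i] = (num_grad(f, x + e, h) - num_grad(f, x - e, h)) / (2.0 * h)
    return 0.5 * (hess + hess.T)

def newton_raphson(loglik, theta0, tol=1e-8, max_iter=50):
    """Iterate theta <- theta - Hessian^{-1} gradient until the step is below tol."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(num_hess(loglik, theta), num_grad(loglik, theta))
        theta = theta - step                    # update of the form (17)
        if np.max(np.abs(step)) < tol:          # convergence criterion
            break
    return theta

# Hypothetical concave quadratic log-likelihood with maximizer m_true.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
m_true = np.array([0.7, 0.1])

def toy_loglik(theta):
    d = theta - m_true
    return -0.5 * d @ A @ d

theta_hat = newton_raphson(toy_loglik, [0.0, 0.0])
```

In practice, parameters such as the variances and the range are constrained to be positive, so an implementation would typically work on a transformed scale; that refinement is outside this sketch.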
4. Simulation study
4.1. Performance of the estimation procedure
We conducted a Monte Carlo study to gain some insight into the finite sample performance of the Kalman filter estimator discussed in Section 3. Specifically, we considered the model
\[ \begin{aligned} Z_t(s) &= \mu_t(s) + \varepsilon_t(s) + \omega_t(s), \\ \varepsilon_t(s) &= \phi\, \varepsilon_{t-1}(s) + \eta_t(s), \end{aligned} \tag{18} \]
with an exponential model as the spatial covariance structure for $\eta_{D\times T}$, i.e., $C^{\eta}(s, r) = \sigma_\eta^2 \exp(-\|s - r\|/\alpha)$. We assumed that $\mu_t(s) = \beta_0 = 0$ and $C^{\eta}(s, s) = \sigma_\eta^2$, so that $\mathrm{Var}[Z_t(s)] = \mathrm{Var}[\varepsilon_t(s)] + \mathrm{Var}[\omega_t(s)] = \sigma_\eta^2/(1 - \phi^2) + 0.1 = 1$. Hence $\sigma_\eta^2 = 0.9(1 - \phi^2)$, which gives the values 0.459, 0.675 and 0.819 used in Table 1 for $\phi = 0.7$, 0.5 and 0.3, respectively.
The simulation scheme is as follows. The process (18) is obtained for 1000 random samples of size n = 25 spatial locations and T = 400 temporal observations with different parameter values.

Table 1
Results for the KF estimates for a space–time AR(1) model. The observed locations are in the square $[0, 1]^2$.

Case 1   β0 = 0      φ = 0.7     α = 0.8     σ_η² = 0.459   σ_ω² = 0.1
Mean     −0.00474    0.69763     0.79423     0.45896        0.09961
Sd        0.04685    0.01435     0.05698     0.01990        0.00860
B        −0.00474   −0.00237    −0.00577    −0.00262       −0.00039
MSE       0.00222    0.00021     0.00328     0.00040        0.00007

Case 2   β0 = 0      φ = 0.5     α = 0.8     σ_η² = 0.675   σ_ω² = 0.1
Mean     −0.00118    0.49772     0.78996     0.67334        0.09767
Sd        0.04839    0.01752     0.05887     0.02873        0.01542
B        −0.00118   −0.00228    −0.01004    −0.00166       −0.00233
MSE       0.00234    0.00031     0.00356     0.00083        0.00024

Case 3   β0 = 0      φ = 0.3     α = 0.8     σ_η² = 0.819   σ_ω² = 0.1
Mean      0.00031    0.29848     0.79295     0.81424        0.09942
Sd        0.04473    0.01590     0.05348     0.03371        0.01161
B         0.00031   −0.00152    −0.00705    −0.00476       −0.00058
MSE       0.00200    0.00025     0.00291     0.00116        0.00013

Case 4   β0 = 0      φ = 0.7     α = 0.4     σ_η² = 0.459   σ_ω² = 0.1
Mean     −0.00340    0.69907     0.39763     0.45755        0.09949
Sd        0.03333    0.01396     0.03095     0.02317        0.01919
B        −0.00340   −0.00093    −0.00237    −0.00145       −0.00051
MSE       0.00112    0.00020     0.00096     0.00054        0.00037

Case 5   β0 = 0      φ = 0.5     α = 0.4     σ_η² = 0.675   σ_ω² = 0.1
Mean     −0.00162    0.49856     0.39680     0.67435        0.09794
Sd        0.03605    0.01869     0.02972     0.03251        0.02668
B        −0.00162   −0.00144    −0.00320    −0.00065       −0.00206
MSE       0.00130    0.00035     0.00089     0.00106        0.00072

Case 6   β0 = 0      φ = 0.3     α = 0.4     σ_η² = 0.819   σ_ω² = 0.1
Mean     −0.00238    0.29916     0.39674     0.81831        0.09696
Sd        0.03524    0.01759     0.02910     0.03604        0.03027
B        −0.00238   −0.00084    −0.00326    −0.00069       −0.00304
MSE       0.00125    0.00031     0.00086     0.00130        0.00092
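One replicate of the simulation scheme for model (18) could be generated as in the following sketch. Several details are assumptions made for illustration and are not stated in the text: the locations are drawn uniformly in the unit square, the initial state is drawn from the stationary distribution, and the function name `simulate_ar1_field` and the seeds are hypothetical.

```python
import numpy as np

def simulate_ar1_field(coords, T, phi, alpha, sigma2_omega=0.1, total_var=1.0, seed=0):
    """Simulate Z_t(s) = eps_t(s) + omega_t(s) from model (18) with mu_t(s) = 0."""
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Choose sigma2_eta so that sigma2_eta / (1 - phi^2) + sigma2_omega = total_var.
    sigma2_eta = (total_var - sigma2_omega) * (1.0 - phi ** 2)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    C_eta = sigma2_eta * np.exp(-dist / alpha)        # exponential covariance
    L = np.linalg.cholesky(C_eta)                     # L @ N(0, I) has covariance C_eta
    # Assumed initialization: value before the first time point, from the stationary law.
    eps = L @ rng.standard_normal(n) / np.sqrt(1.0 - phi ** 2)
    Z = np.empty((T, n))
    for t in range(T):
        eps = phi * eps + L @ rng.standard_normal(n)                  # state equation
        Z[t] = eps + np.sqrt(sigma2_omega) * rng.standard_normal(n)   # add nugget
    return Z, sigma2_eta

# n = 25 locations in [0, 1]^2 and T = 400 time points, as in the text (Case 1 values).
sites = np.random.default_rng(1).uniform(0.0, 1.0, size=(25, 2))
Z, sigma2_eta = simulate_ar1_field(sites, T=400, phi=0.7, alpha=0.8)
```

Repeating this with different seeds and re-estimating the parameters on each replicate would give the kind of Monte Carlo sample summarized in Table 1.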
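Then the KF estimator $\hat\theta$ of a single parameter $\theta$ is evaluated by using the mean (Mean), standard deviation (Sd), bias (B) and mean squared error (MSE), defined as
\[
\mathrm{Mean}(\hat\theta) = \frac{1}{1000}\sum_{i=1}^{1000}\hat\theta_i, \qquad
\mathrm{Sd}(\hat\theta) = \left(\frac{1}{1000}\sum_{i=1}^{1000}\bigl(\hat\theta_i - \mathrm{Mean}(\hat\theta)\bigr)^2\right)^{1/2},
\]
\[
\mathrm{B}(\hat\theta) = \mathrm{Mean}(\hat\theta) - \theta, \qquad
\mathrm{MSE}(\hat\theta) = \frac{1}{1000}\sum_{i=1}^{1000}(\hat\theta_i - \theta)^2,
\]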
where $\hat\theta_i$ is the KF estimate of the parameter $\theta$ in the $i$th simulation, with $\theta \in \Theta = (\beta_0, \phi, \alpha, \sigma_\eta^2, \sigma_\omega^2)^\top$. Table 1 reports the results of the Monte Carlo simulations for several parameter values. As shown in the first row of each case, the means of the estimated parameters are close to the true values. The Sd, B and MSE are small throughout: no bias exceeds about 0.01 in absolute value and no MSE exceeds 0.004. The bias is negative in almost every case, so the procedure slightly underestimates the true values; this bias is expected to tend to zero as the sample size increases. These results suggest that the KF estimator for an AR space–time model can be very efficient. Moreover, the KF estimates do not show the problems of positive definiteness of the covariance matrices given in Eq. (13b), and we can estimate $\sigma_\omega^2$ directly, without the re-parameterization $\gamma := \sigma_\omega^2/\sigma_\eta^2$ in the definition of $\Sigma_e$ in (7), as suggested by Fassò and Cameletti (2009).
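The four summary criteria defined above can be computed as in the following sketch. The helper name `summarize_estimates` and the simulated `phi_hat` values are hypothetical and serve only as an illustration; they are not the estimates behind Table 1.

```python
import numpy as np

def summarize_estimates(estimates, true_value):
    """Mean, Sd, bias and MSE of Monte Carlo estimates, with divisor = number of replicates."""
    est = np.asarray(estimates, dtype=float)
    mean = est.mean()
    sd = np.sqrt(np.mean((est - mean) ** 2))
    bias = mean - true_value
    mse = np.mean((est - true_value) ** 2)
    return {"Mean": mean, "Sd": sd, "B": bias, "MSE": mse}

# Hypothetical estimates of phi scattered around the true value 0.7.
rng = np.random.default_rng(1)
phi_hat = 0.7 + 0.015 * rng.standard_normal(1000)
summary = summarize_estimates(phi_hat, true_value=0.7)
```

With these definitions MSE = Sd² + B² holds exactly, which gives a quick consistency check on a table of results; for instance, for β0 in Case 1 of Table 1, 0.04685² + 0.00474² ≈ 0.00222.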
Table 2
Gain obtained through a cross-validation method using simple kriging and the Kalman filter algorithm (rows: φ; columns: σω2).
φ \ σω2   0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
0.1       0.01%   0.01%   0.01%   0.01%   0.03%   0.04%   0.02%   0.01%   0.00%
0.2       0.05%   0.06%   0.14%   0.10%   0.04%   0.09%   0.03%   0.06%   0.03%
0.3       0.02%   0.12%   0.19%   0.28%   0.18%   0.25%   0.14%   0.11%   0.10%
0.4       0.21%   0.26%   0.35%   0.43%   0.38%   0.32%   0.17%   0.19%   0.15%
0.5       0.20%   0.39%   0.67%   0.72%   0.48%   0.57%   0.43%   0.44%   0.22%
0.6       0.60%   0.76%   0.97%   0.82%   0.91%   0.60%   0.41%   0.55%   1.09%
0.7       1.33%   1.10%   1.41%   1.22%   1.16%   1.09%   0.60%   0.99%   1.66%
0.8       0.99%   1.66%   2.19%   2.31%   2.07%   2.10%   1.64%   1.23%   0.81%
0.9       1.71%   2.92%   3.44%   3.69%   3.62%   3.33%   2.72%   1.88%   1.15%
4.2. Performance of the prediction procedure
The aim of this section is to compare the prediction performance of the Kalman filter under a space–time AR(p) model with that of the simple kriging predictor under a purely spatial model, when the temporal and spatial sources of variability are controlled by the autoregressive coefficient and the nugget effect, respectively.
For the data generation scheme, the parameter values $\phi_i = 0.1i$ and $\sigma_{\omega,j}^2 = 0.1j$, with $i, j = 1, \ldots, 9$, were combined, giving a total of 81 simulation scenarios. The spatial innovation variance was set to $\sigma_{\eta,ij}^2 = (1 - \sigma_{\omega,j}^2)(1 - \phi_i^2)$ so that $\mathrm{Var}[Z_t(\mathbf{s})] = 1$ in all cases, and the remaining parameters were fixed at $\beta_0 = 0$ and $\alpha = 0.8$. Cross-validation is implemented by removing the data $\{Z_t(\mathbf{s}_i) : t = 1, \ldots, 50\}$ for each $i$ and predicting $Z_{50}(\mathbf{s}_i)$ from the remaining data. Observe that the space–time predictor (14a) uses the data associated with the times $\{1, 2, \ldots, 50\}$, while the simple kriging predictor (9a) uses only the data at time 50.
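The grid of 81 scenarios can be written down in a few lines. The sketch below assumes that the innovation variance is (1 − σω²)(1 − φ²), which is what the unit-variance condition requires under the stationary AR(1) variance ση²/(1 − φ²) used in Section 4.1; the array names are illustrative only.

```python
import numpy as np

phis = 0.1 * np.arange(1, 10)      # phi_i = 0.1 i, i = 1, ..., 9
nuggets = 0.1 * np.arange(1, 10)   # sigma_omega^2_j = 0.1 j, j = 1, ..., 9
phi_grid, nugget_grid = np.meshgrid(phis, nuggets, indexing="ij")

# Innovation variance chosen so that the marginal variance of Z_t(s) equals 1.
sigma_eta2 = (1.0 - nugget_grid) * (1.0 - phi_grid**2)

# Check: stationary AR(1) variance plus nugget.
total_var = sigma_eta2 / (1.0 - phi_grid**2) + nugget_grid
```

For φ = 0.7 and σω² = 0.1 this gives ση² = 0.459, the value used in Case 1 of Table 1.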
For the $j$th simulation, the resulting predicted value of $Z_{50}^{j}(\mathbf{s}_i)$ is denoted by $Z_{50|50}^{(\cdot)j}(\mathbf{s}_i)$. Finally, the criterion used to quantify the predictive power was the mean squared error of prediction (MSEP), given by
\[
\mathrm{MSEP}\bigl(Z_{50|50}^{(\cdot)}(\mathbf{s})\bigr) = \frac{1}{1000}\sum_{j=1}^{1000}\sum_{i=1}^{25}\bigl[Z_{50}^{j}(\mathbf{s}_i) - Z_{50|50}^{(\cdot)j}(\mathbf{s}_i)\bigr]^2,
\]
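and the gain, which quantifies how much better the Kalman filter performs, is given by
\[
\mathrm{Gain} = \left(\frac{\mathrm{MSEP}\bigl(Z_{50|50}^{(SK)}(\mathbf{s})\bigr)}{\mathrm{MSEP}\bigl(Z_{50|50}^{(KF)}(\mathbf{s})\bigr)} - 1\right) \times 100\%.
\]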
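As a sketch of how the two criteria above combine, the following functions compute the MSEP and the percentage gain from arrays of true and predicted values. The names `msep` and `gain_percent` and the toy arrays are hypothetical; the array layout (simulations in rows, sites in columns) is an assumption of this example.

```python
import numpy as np

def msep(z_true, z_pred):
    """MSEP for arrays of shape (n_simulations, n_sites)."""
    err2 = (np.asarray(z_true, dtype=float) - np.asarray(z_pred, dtype=float)) ** 2
    # Sum over sites, then average over simulations, as in the definition.
    return err2.sum(axis=1).mean()

def gain_percent(msep_sk, msep_kf):
    """Relative excess of the SK error over the KF error, in percent."""
    return (msep_sk / msep_kf - 1.0) * 100.0

# Toy example: SK errors of 0.2 and KF errors of 0.1 at every site.
z_true = np.zeros((2, 3))
z_sk = np.full((2, 3), 0.2)
z_kf = np.full((2, 3), 0.1)
g = gain_percent(msep(z_true, z_sk), msep(z_true, z_kf))
```

A positive gain means that simple kriging has the larger MSEP; a gain of 0% means both predictors perform equally.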
Table 2 shows the gain in MSEP, obtained through cross-validation, of the Kalman filter over simple kriging for different values of $\sigma_\omega^2$ and $\phi$. An important aspect of these simulations is that the gain of KF prediction over SK is, in general, most evident when the temporal dependence is strong, i.e., when $|\phi|$ is large, and it increases as $\phi$ approaches the boundary of the temporal stationarity region. All the results and source codes are available from the authors upon request.
We note that the nugget effect is a very important parameter for achieving good prediction performance with the KF. Additionally, an excess of smoothing in the prediction is detected when the measurement error is quite large. From Table 2 we note that the gains for KF are non-negative and not uniform over $(\phi, \sigma_\omega^2)$. The gains are largest, approximately, in the region $\phi \times \sigma_\omega^2 = (0.7; 0.9) \times (0.2; 0.6)$. This again shows that the prediction gains of the KF are larger when the one-step temporal correlations are strong.
5. Real data analysis
This section analyzes part of the integrated Agromet network, which comprises more than 100 meteorological stations throughout Chile and records daily air temperature, soil temperature, rainfall, humidity, solar radiation, wind speed and direction, among other variables. This dataset is reported and updated by the Institute of Agricultural Research (INIA) and is available from the website http://www.agromet.cl.
We focus on three regions, namely Maule, Biobío and Araucanía, which represent a portion of south-central Chile. This area is bounded by mountains (the Coastal Range to the west and the Andes to the east). According to the 2007 national agricultural census, this area has a surface of 99,206 km². The average cultivated area is 22%, and extreme temperatures affect the production and quality of agricultural and fruit products. It is therefore of great interest to study the spatio-temporal variability of this meteorological process and to develop proposals that help explain such variability in a coherent and appropriate way.
In particular, we study the space–time variability of the average daily temperature recorded during 2016 at 22 meteorological stations located in the Maule, Biobío and Araucanía regions. Table 3 summarizes the meteorological stations considered and Fig. 1 displays their spatial distribution.
Table 3
Average daily mean temperature and standard error (SE) by station (year 2016).
Region      Station             ID   lat         long        Mean      SE
Maule       Coronel del Maule   1    −36.05909   −72.47769   13.8227   4.8971
Maule       Los despachos       2    −36.06211   −72.37144   13.9820   5.1669
Maule       Chanco              3    −35.70658   −72.51119   13.0156   2.7998
Maule       Santa Sofía         4    −35.97764   −72.35986   13.5443   5.0680
Maule       Sauzal              5    −35.71488   −72.11131   14.0101   5.1313
Maule       San Clemente        6    −35.53065   −71.47951   14.0924   5.0423
Biobío      Coronel             7    −37.00482   −73.14041   13.0732   2.9039
Biobío      Chiguayante         8    −36.91349   −73.03546   13.8568   3.8689
Biobío      Human               9    −37.43356   −72.24398   12.9713   4.6506
Biobío      Cañete              10   −37.89207   −73.41174   15.0568   5.1646
Biobío      Nueva Aldea         11   −36.64890   −72.51349   12.2486   2.7765
Biobío      Ninhue              12   −36.39811   −72.39520   14.5508   5.1507
Biobío      Navidad             13   −36.90730   −71.93560   12.8937   4.6920
Biobío      Punta Parra         14   −36.67239   −72.96468   13.1339   3.3800
Biobío      Sta Rosa            15   −36.53520   −71.91643   13.4098   4.8871
Araucanía   Dominguez           16   −38.91861   −73.24028   12.0036   2.7602
Araucanía   C. Llollinco        17   −38.97694   −72.99778   11.4738   3.6989
Araucanía   Cuarta Faja         18   −39.11556   −72.60528   11.9440   4.0598
Araucanía   Quiripio            19   −38.63972   −73.24417   11.4760   2.9514
Araucanía   San Luis            20   −38.40167   −71.90250   10.6738   4.4276
Araucanía   Tranapuente         21   −38.69083   −73.35333   11.8169   3.0236
Araucanía   Sta. Adela          22   −38.75611   −72.89139   11.8995   3.9698
Fig. 1. Locations of 22 meteorological stations in the Maule, Biobío and Araucanía regions, Chile.
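5.1. The model
We consider the following model to analyze the Agromet temperature data in space and time
\[
Z_t(\mathbf{s}) = \mu_t(\mathbf{s}) + \varepsilon_t(\mathbf{s}) + \omega_t(\mathbf{s}), \qquad \omega_{D\times T} \sim GWN(0, \sigma_\omega^2),
\]
\[
\mu_t(\mathbf{s}) = \beta_0 + \beta_1 \cos\left(\frac{2\pi t}{365.25}\right) + \beta_2 \sin\left(\frac{2\pi t}{365.25}\right),
\]
\[
\varepsilon_t(\mathbf{s}) = \phi\,\varepsilon_{t-1}(\mathbf{s}) + \eta_t(\mathbf{s}),
\]
\[
C^{\eta}(\mathbf{s}, \mathbf{r}) = \sigma_\eta^2 \exp\left(-\frac{\|\mathbf{s} - \mathbf{r}\|}{\alpha}\right),
\]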
which exhibits a systematic component that explains the global seasonality of the phenomenon, a latent process that evolves in time with AR(1) dynamics, and an innovation process with a spatial correlation structure of exponential type. The estimates are shown in Table 4. The estimated autoregressive coefficient is high, $\hat\phi = 0.784$, which suggests that the temperature is highly correlated with that of the previous day. Additionally, the estimated range parameter, $\hat\alpha = 2.668$, is high in relation to the spatial sampling scheme, with a maximum distance between stations of $h_{\max} = 381.8$ km. This indicates that the spatial process has a high spatial continuity, as reflected in the smoothness of the predicted temperature maps for each month, see Fig. 2. Finally, the estimated nugget effect is quite small, $\hat\sigma_\omega^2 = 0.063$, indicating that the measurement error variance explains less than 1% of the total process variance. This favors accurate space–time predictions.
Table 4
Estimates of the space–time model parameters.
β̂0       β̂1      β̂2       φ̂       α̂       σ̂η2     σ̂ω2
15.247   4.413   −1.092   0.784   2.668   3.286   0.063
Fig. 2. Space–time predictions of the fortnights of each month of the year 2016.
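The quantities discussed in this paragraph can be recovered from the estimates in Table 4, as in the sketch below. It uses the total variance ση²/(1 − φ²) + σω² of the AR(1) model; the function name `seasonal_mean` and the convention that t is measured in days are assumptions made for illustration.

```python
import math

# Estimates reported in Table 4.
beta0, beta1, beta2 = 15.247, 4.413, -1.092
phi, sigma_eta2, sigma_omega2 = 0.784, 3.286, 0.063

def seasonal_mean(t):
    """Harmonic mean function mu_t with an annual period (t in days, assumed)."""
    w = 2.0 * math.pi * t / 365.25
    return beta0 + beta1 * math.cos(w) + beta2 * math.sin(w)

# Amplitude of the annual cycle implied by the two harmonic coefficients.
amplitude = math.hypot(beta1, beta2)

# Total process variance and the share due to the nugget (measurement error).
total_var = sigma_eta2 / (1.0 - phi**2) + sigma_omega2
nugget_share = sigma_omega2 / total_var
```

With these values the total variance is about 8.59 and the nugget share about 0.7%, in line with the "less than 1%" statement; the annual cycle has an amplitude of roughly 4.5 around the level 15.247.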
Table 5
Estimation of parameters of the purely spatial model, through the maximum likelihood method, for the measurements of December 31, 2016.
β̂0,366    α̂366     σ̂ε2,366   σ̂ω2,366
16.6165   0.6318   3.2444    0.1617
5.2. Prediction
We obtained predictions at sites where no information is available, using the full dataset (from January 1 to December 31, 2016). Using the proposed model and the Kalman filter, spatio-temporal predictions were obtained, providing a total of 366 prediction maps. Due to space constraints, only a subset of the predictions is presented in Fig. 2, which displays the predictions for the 15th of each month of the year.
Examining the prediction maps, and omitting the interpolations associated with the Andes mountains, a fairly marked general seasonality can be observed: lower temperatures are associated with the winter months and higher ones with the summer season. It should be noted that in the June and July maps the coast presents higher temperatures than the intermediate depression, with the exception of Cañete (ID 10), evidencing the regulatory effect of the sea in coastal areas. On the other hand, a bipolar pattern can be seen between the Nueva Aldea (ID 11) and Ninhue (ID 12) zones. This is highlighted when analyzing the time series of these two stations: Nueva Aldea shows a smaller large-scale variability than Ninhue, which generates an interaction effect between the two.
In order to quantify the prediction improvement of the Kalman filter over simple kriging, in terms of mean squared error of prediction and prediction variance, a purely spatial model was fitted to the data of December 31, 2016,
\[
Z_{366}(\mathbf{s}) = \beta_{0,366} + \varepsilon_{366}(\mathbf{s}) + \omega_{366}(\mathbf{s}), \qquad \omega_{D\times 366} \sim GWN(0, \sigma^2_{\omega_{366}}),
\]
\[
C^{\varepsilon_{366}}(\mathbf{s}, \mathbf{r}) = \sigma^2_{\varepsilon_{366}} \exp\left(-\frac{\|\mathbf{s} - \mathbf{r}\|}{\alpha_{366}}\right),
\]
where time 366 refers to December 31, 2016. In the same way as in Section 4, cross-validation was performed for this time point, and the following scores were considered to quantify the squared prediction error
\[
I^{(SK)}_{366}(\mathbf{s}_j) := \frac{\bigl(Z_{366}(\mathbf{s}_j) - Z^{(SK)}_{366|366}(\mathbf{s}_j)\bigr)^2}{\hat\sigma^2_{Z_{366}}},
\]
\[
I^{(KF)}_{366}(\mathbf{s}_j) := \frac{\bigl(Z_{366}(\mathbf{s}_j) - Z^{(KF)}_{366|366}(\mathbf{s}_j)\bigr)^2}{\hat\sigma^2_{Z}},
\]
where $\hat\sigma^2_{Z_{366}} = \hat\sigma^2_{\varepsilon_{366}} + \hat\sigma^2_{\omega_{366}}$ and $\hat\sigma^2_{Z} = \hat\sigma^2_\eta/(1 - \hat\phi^2) + \hat\sigma^2_\omega$, see Tables 4 and 5. From these results we observe that the ratio of the average scores of the SK and KF methods is 0.721/0.307 ≈ 2.349 > 1, see Table 6. We conclude that the normalized mean squared prediction error of the SK method is approximately 2.3 times that of the KF.
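The normalizing variances and the score ratio can be checked directly from the reported estimates. In the sketch below the numbers are taken from Tables 4, 5 and 6; the function name `score` is hypothetical.

```python
# Normalizing variance of the purely spatial model (Table 5).
sigma2_z_sk = 3.2444 + 0.1617

# Normalizing variance of the space-time AR(1) model (Table 4).
sigma2_z_kf = 3.286 / (1.0 - 0.784**2) + 0.063

def score(z_obs, z_pred, sigma2_z):
    """Squared prediction error scaled by the model's total variance."""
    return (z_obs - z_pred) ** 2 / sigma2_z

# Ratio of the average scores reported in the last row of Table 6.
ratio = 0.721 / 0.307
```

Here `sigma2_z_sk` is about 3.41 and `sigma2_z_kf` about 8.59. Because the two scores are divided by different variance estimates, the ratio compares errors relative to each model's total variance rather than raw squared errors.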
From the estimated prediction variances shown in Table 6, we are interested in detecting which of the two procedures provides more certainty relative to the total variance delivered by each model. For this, we considered the estimated prediction variances at two locations: one within a cluster (with information around the place where we want to predict the attribute), say j = 2, and one isolated location (without information around the place to predict the attribute), say j = 20. We obtained
\[
\frac{\Delta^{(SK)}_{366|366}(\mathbf{s}_{20})}{\Delta^{(SK)}_{366|366}(\mathbf{s}_{2})} \approx 3.6, \qquad
\frac{\Delta^{(KF)}_{366|366}(\mathbf{s}_{20})}{\Delta^{(KF)}_{366|366}(\mathbf{s}_{2})} \approx 7.6,
\]
noting that the predictor generated by the KF method makes better use of the information: with KF the prediction variance at location $\mathbf{s}_{20}$ is approximately 8 times larger than at location $\mathbf{s}_2$, whereas with SK it is approximately 4 times larger.
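The two ratios follow from the prediction variances reported in Table 6 for stations 2 and 20, as this short check illustrates (the dictionary names are illustrative only).

```python
# Prediction variances from Table 6, keyed by station index j.
delta_sk = {2: 0.788, 20: 2.850}   # simple kriging
delta_kf = {2: 0.431, 20: 3.277}   # Kalman filter

# Isolated site (j = 20) relative to clustered site (j = 2).
ratio_sk = delta_sk[20] / delta_sk[2]
ratio_kf = delta_kf[20] / delta_kf[2]
```

A larger ratio means that the prediction variance reacts more strongly to the absence of nearby stations, which is the sense in which the text says that the KF makes better use of the available information.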
Table 6
Results of cross validation for the measurements of December 31, 2016.
j         Z^(SK)_366|366(s_j)   Z^(KF)_366|366(s_j)   Δ^(SK)_366|366(s_j)   Δ^(KF)_366|366(s_j)   I^(SK)_366(s_j)   I^(KF)_366(s_j)
1         18.066                18.067                0.892                 0.609                 0.001             0.001
2         18.379                18.498                0.788                 0.431                 0.000             0.001
3         18.507                18.484                1.507                 1.470                 1.067             0.413
4         18.182                18.181                0.821                 0.491                 0.079             0.031
5         18.130                18.151                1.415                 1.397                 0.276             0.105
6         18.396                19.031                3.027                 3.218                 0.027             0.013
7         17.135                17.299                1.163                 0.878                 0.447             0.228
8         16.165                15.949                0.982                 0.671                 0.205             0.129
9         15.715                15.527                1.940                 2.169                 0.564             0.288
10        15.260                15.313                2.279                 2.628                 5.033             1.944
11        17.678                17.792                1.238                 1.141                 1.268             0.559
12        17.234                17.139                1.121                 0.978                 0.815             0.361
13        17.449                17.552                1.545                 1.533                 0.212             0.106
14        16.708                16.681                1.283                 1.177                 0.076             0.027
15        17.607                17.791                1.484                 1.473                 0.234             0.059
16        15.195                15.333                1.134                 0.942                 0.003             0.006
17        15.139                15.286                1.092                 0.892                 0.062             0.011
18        14.835                15.340                1.880                 1.958                 0.094             0.000
19        15.830                15.685                0.971                 0.675                 1.733             0.608
20        16.306                16.606                2.850                 3.277                 3.405             1.599
21        14.710                14.204                1.026                 0.709                 0.183             0.195
22        15.014                14.741                1.208                 1.099                 0.069             0.067
Average   16.711                16.757                1.438                 1.355                 0.721             0.307
6. Conclusions
We highlight that one of the major advantages of the KF methodology is the use of a single model that explains the space–time evolution of the phenomenon of interest using as much of the available information as possible. This clearly contrasts with the more classical geostatistical techniques, which use only the most recent information and thus discard most of the temporal information. As a consequence, parameter estimates are worse in the purely spatial context, as there are many situations where the purely spatial methodology is unable to capture the complete spatial autocorrelation. This is even worse when the sample is small, as is frequently the case for data recorded in space–time. We have given a number of facts that support the use of the KF methodology rather than classical kriging techniques.
We note that the autoregressive structure we imposed here can be extended to other temporal structures, and a corresponding adaptation of the Kalman filter techniques to these cases would be worth working on.
Acknowledgments
The first author thanks the VRID grant 216.014.026-1.0 from the University of Concepción. Powered@NLHPC: This research was partially supported by the supercomputing infrastructure of the NLHPC (ECM-02). J. Mateu has been partially supported by project MTM2016-78917-R of the Spanish government.
References
Amisigo, B., Van De Giesen, N., 2005. Using a spatio-temporal dynamic state-space model with the EM algorithm to patch gaps in daily riverflow series. Hydrol. Earth Syst. Sci. Discuss. 9 (3), 209–224.
Anderson, J.L., 2001. An ensemble adjustment Kalman filter for data assimilation. Mon. Weather Rev. 129 (12), 2884–2903.
Banerjee, S., Carlin, B.P., Gelfand, A.E., 2014. Hierarchical Modeling and Analysis for Spatial Data. CRC Press.
Barton, M.J., Robinson, P.A., Kumar, S., Galka, A., Durrant-Whyte, H.F., Guivant, J., Ozaki, T., 2009. Evaluating the performance of Kalman-filter-based EEG source localization. IEEE Trans. Biomed. Eng. 56 (1), 122–136.
Bevilacqua, M., Faouzi, T., Furrer, R., Porcu, E., 2019. Estimation and prediction using generalized Wendland covariance functions under fixed domain asymptotics. Ann. Stat. (in press).
Brockwell, P.J., Davis, R.A., 1991. Time Series: Theory and Methods, second ed. Springer, New York.
Cameletti, M., Lindgren, F., Simpson, D., Rue, H., 2013. Spatio-temporal modeling of particulate matter concentration through the SPDE approach. AStA Adv. Stat. Anal. 97 (2), 109–131.
Chan, N.H., Palma, W., 1998. State space modeling of long-memory processes. Ann. Stat. 26, 719–740.
Cressie, N., 1993. Statistics for Spatial Data. John Wiley & Sons.
Cressie, N., Shi, T., Kang, E.L., 2010. Fixed rank filtering for spatio-temporal data. J. Comput. Graph. Stat. 19 (3), 724–745.
Cressie, N., Wikle, C.K., 2014. Space-time Kalman filter. Wiley StatsRef Stat. Ref. Online.
Cressie, N., Wikle, C.K., 2015. Statistics for Spatio-temporal Data. John Wiley & Sons.
Diggle, P., Ribeiro, P.J., 2007. Model-based Geostatistics. Springer Series in Statistics. Springer.
Durbin, J., Koopman, S.J., 2001. Time Series Analysis by State Space Methods. Oxford Statistical Science Series, vol. 24, Oxford University Press, Oxford.
Fassò, A., Cameletti, M., 2009. The EM algorithm in a distributed computing environment for modelling environmental space–time data. Environ. Model. Softw. 24 (9), 1027–1035.
Fassò, A., Cameletti, M., 2010. A unified statistical approach for simulation, modeling, analysis and mapping of environmental data. Simulation 86 (3), 139–153.
Ferreira, G., Mateu, J., Porcu, E., 2017. Spatio-temporal analysis with short- and long-memory dependence: a state-space approach. Test 27, 1–25.
Ferreira, G., Rodríguez, A., Lagos, B., 2013. Kalman filter estimation for a regression model with locally stationary errors. Comput. Stat. Data Anal. 62, 52–69.
Gillijns, S., Mendoza, O.B., Chandrasekar, J., De Moor, B., Bernstein, D., Ridley, A., 2006. What is the ensemble Kalman filter and how well does it work? In: 2006 American Control Conference. IEEE, pp. 6–pp.
Gneiting, T., Genton, M.G., Guttorp, P., 2006. Geostatistical space-time models, stationarity, separability, and full symmetry. Monogr. Stat. Appl. Probab. 107, 151–175.
Grassi, S., de Magistris, P.S., 2014. When long memory meets the Kalman filter: A comparative study. Comput. Stat. Data Anal. 76, 301–319.
Grewal, M.S., 2011. Kalman Filtering. Springer.
Grewal, M.S., Weill, L.R., Andrews, A.P., 2007. Global Positioning Systems, Inertial Navigation, and Integration. John Wiley & Sons.
Harvey, A.C., 1992. Forecasting Structural Time Series and the Kalman Filter. Cambridge University Press, Cambridge.
Hefley, T.J., Broms, K.M., Brost, B.M., Buderman, F.E., Kay, S.L., Scharf, H.R., Tipton, J.R., Williams, P.J., Hooten, M.B., 2017. The basis function approach for modeling autocorrelation in ecological data. Ecology 98 (3), 632–646.
Huang, H., Cressie, N., 1996. Spatio-temporal prediction of snow water equivalent using the Kalman filter. Comput. Stat. Data Anal. 22, 159–175.
Hughes, J.P., Guttorp, P., Charles, S.P., 1999. A non-homogeneous hidden Markov model for precipitation occurrence. J. R. Stat. Soc. Ser. C 48 (1), 15–30.
Kalman, R.E., 1960. A new approach to linear filtering and prediction problems. J. Basic Eng. 82 (1), 35–45.
Katzfuss, M., Cressie, N., 2011. Spatio-temporal smoothing and EM estimation for massive remote-sensing data sets. J. Time Ser. Anal. 32 (4), 430–446.
Li, H.-X., Qi, C., 2011. Spatio-Temporal Modeling of Nonlinear Distributed Parameter Systems: a Time/space Separation Based Approach, Vol. 50. Springer Science & Business Media, http://dx.doi.org/10.1007/978-94-007-0741-2.
Mardia, K.V., Goodall, C., Redfern, E.J., Alonso, F.J., 1998. The kriged Kalman filter. Test 7 (2), 217–282.
Mardia, K.V., Marshall, R., 1984. Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika 71, 135–146.
Matérn, B., 2013. Spatial Variation. Vol. 36. Lecture Notes in Statistics, Springer Science & Business Media, http://dx.doi.org/10.1007/978-1-4615-7892-5.
Matheron, G., 1971. Theory of regionalized variables and its applications. Cah. Cent. Morphologie Math. Ecole Natl. Superieure des Mines de Paris. 5, 211.
Maybeck, P.S., 1979. Square root filtering. Stoch. Models Estimation Control 1, 368–409.
Meinhold, R.J., Singpurwalla, N.D., 1983. Understanding the Kalman filter. Amer. Statist. 37 (2), 123–127.
Montero, J.M., Fernández-Avilés, G., Mateu, J., 2015. Spatial and Spatio-Temporal Geostatistical Modeling and Kriging. Vol. 998. John Wiley & Sons.
Naveau, P., Genton, M.G., Shen, X., 2005. A skewed Kalman filter. J. Multivariate Anal. 94 (2), 382–400.
Padoan, S.A., Bevilacqua, M., 2015. Analysis of random fields using CompRandFld. J. Stat. Softw. 63 (9), 1–27.
Prado, R., West, M., 2010. Time Series: Modeling, Computation, and Inference. CRC Press.
Rehman, M.J., Dass, S.C., Asirvadam, V.S., 2016. Nonlinear dynamical system identification using unscented Kalman filter. In: AIP Conference Proceedings. Vol. 1787. AIP Publishing, p. 020003.
Rezaie, J., Eidsvik, J., 2014. Kalman filter variants in the closed skew normal setting. Comput. Stat. Data Anal. 75, 1–14.
Rue, H., Martino, S., Chopin, N., 2009. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J. R. Stat. Soc. Ser. B Stat. Methodol. 71 (2), 319–392.
Sahu, S.K., 2012. Hierarchical Bayesian models for space–time air pollution data. In: Handbook of Statistics, Vol. 30. Elsevier, pp. 477–495.
Stroud, J.R., Stein, M.L., Lesht, B.M., Schwab, D.J., Beletsky, D., 2010. An ensemble Kalman filter and smoother for satellite data assimilation. J. Am. Stat. Assoc. 105 (491), 978–990.
Thioulouse, J., Royet, J., Ploye, H., Houllier, F., 1993. Evaluation of the precision of systematic sampling: nugget effect and covariogram modelling. J. Microsc. 172 (3), 249–256.
Wikle, C.K., 2003. Hierarchical models in environmental science. Int. Stat. Rev. 71 (2), 181–199.
Wikle, C.K., Cressie, N., 1999. A dimension-reduced approach to space-time Kalman filtering. Biometrika 86 (4), 815–829.
Xu, K., Wikle, C.K., 2007. Estimation of parameterized spatio-temporal dynamic models. J. Stat. Plan. Inference 137 (2), 567–588.
Zammit-Mangion, A., Cressie, N., Fixed Rank Kriging: The R package, URL https://cran.r-project.org/web/packages/FRK/index.html.
Zimmerman, D.L., Zimmerman, M.B., 1991. A comparison of spatial semivariogram estimators and corresponding ordinary kriging predictors. Technometrics 33 (1), 77–91.
