Geostatistical Modeling for Large Data Sets
Whitney Huang
Department of Statistics
Purdue University
Motivation
Methods
Covariance
tapering
Motivation Low–rank
approximation
Likelihood
approximation
Gaussian Markov
random field
approximation
Methods
Covariance tapering
Low–rank approximation
Likelihood approximation
Gaussian Markov random field approximation

Gaussian process (GP) geostatistics
Model:
$$Y(s) = \mu(s) + \eta(s) + \varepsilon(s), \quad s \in S \subset \mathbb{R}^d$$
where
$$\mu(s) = X^T(s)\beta, \qquad \{\eta(s)\}_{s \in S} \sim \mathrm{GP}(0, C(\cdot,\cdot))$$
Log-likelihood for observations $Y = (Y(s_1), \dots, Y(s_n))^T$ with nugget variance $\tau^2$:
$$\ell_n(\beta, \theta, \sigma^2, \tau^2) \propto -\frac{1}{2}\log\left|\Sigma(\theta,\sigma^2) + \tau^2 I_n\right| - \frac{1}{2}(Y - X\beta)^T\left[\Sigma(\theta,\sigma^2) + \tau^2 I_n\right]^{-1}(Y - X\beta)$$
where $\Sigma(\theta,\sigma^2)_{i,j} = \sigma^2\rho_\theta(\|s_i - s_j\|)$, $i, j = 1, \dots, n$
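
To make the cost of this likelihood concrete, here is a minimal sketch that evaluates it with a dense Cholesky factorization. The exponential correlation $\rho_\theta(h) = \exp(-h/\theta)$ and the function name `gp_loglik` are assumptions made for the example; the slides do not fix a correlation family.

```python
import numpy as np

def gp_loglik(y, X, coords, beta, theta, sigma2, tau2):
    """Gaussian log-likelihood (up to an additive constant) of
    Y = X beta + eta + eps, with Cov(eta) = sigma2 * rho_theta and nugget tau2.

    Assumption for this sketch: rho_theta(h) = exp(-h / theta).
    """
    # Pairwise Euclidean distances ||s_i - s_j||
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sigma(theta, sigma2) + tau2 * I_n
    cov = sigma2 * np.exp(-dist / theta) + tau2 * np.eye(len(y))
    # Dense Cholesky factorization: the O(n^3) step
    L = np.linalg.cholesky(cov)
    resid = y - X @ beta
    z = np.linalg.solve(L, resid)              # solves L z = resid
    logdet = 2.0 * np.log(np.diag(L)).sum()    # log|cov| from the Cholesky factor
    return -0.5 * logdet - 0.5 * z @ z
```

Every evaluation factorizes an n × n dense matrix, so both memory (n² entries) and time (order n³) grow quickly with the number of locations.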

“Big n Problem” in geostatistics

Modeling strategies in the literature
- Covariance tapering (Furrer et al. 06, Kaufman et al. 08, Du et al. 09)
- Low–rank approximation (Cressie & Johannesson 08, Banerjee et al. 08)
- Likelihood approximation (Vecchia 88, Stein 04)
- Gaussian Markov random field approximation (Rue & Tjelmeland 02, Rue & Held 05, Lindgren et al. 11)

Covariance tapering (Furrer et al. 06)
Replace the covariance matrix by its Schur product with a taper matrix:
$$\Sigma_{\mathrm{tap}} = \Sigma(\theta, \sigma^2) \circ T(\gamma), \qquad T(\gamma)_{i,j} = \rho_{\mathrm{tap}}(\|s_i - s_j\|; \gamma)$$
where $\rho_{\mathrm{tap}}(h; \gamma)$ is an isotropic correlation function with compact support ($\rho_{\mathrm{tap}}(h; \gamma) = 0$ if $h \ge \gamma$) and $\circ$ denotes the Schur product.
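
As an illustration of the Schur product with a compactly supported correlation, the sketch below tapers an exponential covariance with the Wendland-1 function. The choice of taper, the exponential covariance model, and the names `wendland1` and `tapered_cov` are assumptions for the example, not taken from the slides.

```python
import numpy as np

def wendland1(h, gamma):
    """Wendland-1 taper: (1 - h/gamma)_+^4 (1 + 4 h/gamma); zero for h >= gamma."""
    x = h / gamma
    return np.where(x < 1.0, (1.0 - x) ** 4 * (1.0 + 4.0 * x), 0.0)

def tapered_cov(coords, sigma2, theta, gamma):
    """Schur (elementwise) product of an exponential covariance with the taper."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    cov = sigma2 * np.exp(-dist / theta)       # assumed covariance model
    return cov * wendland1(dist, gamma)        # '*' on arrays is the Schur product

rng = np.random.default_rng(1)
coords = rng.uniform(size=(200, 2))            # hypothetical locations in the unit square
cov_tap = tapered_cov(coords, sigma2=1.0, theta=0.5, gamma=0.15)
sparsity = np.mean(cov_tap == 0.0)             # fraction of exact zeros
```

The tapered matrix stays symmetric positive definite while most entries become exactly zero, which is what lets sparse-matrix factorizations replace the dense one. A dense array is used here only to keep the sketch short.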

Covariance tapering cont’d
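
Low–rank approximation
$$\eta = H\alpha + \xi, \qquad \xi \sim \mathrm{MVN}(0, \Sigma_\xi)$$
$$\alpha \sim \mathrm{MVN}(0, \Sigma_\alpha)$$
where $\alpha = (\alpha_1, \dots, \alpha_p)^T$ with $p \ll n$, and $H$ is a mapping from the latent process, $\alpha$, to the true spatial process of interest, $\eta$. $\Sigma_\varepsilon$ and $\Sigma_\xi$ are diagonal.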

Low–rank approximation cont’d
The covariance matrix of $Y$ is then
$$H\Sigma_\alpha H^T + V$$
where $V = \Sigma_\varepsilon + \Sigma_\xi$.
Sherman–Morrison–Woodbury formula:
$$(A + BCD)^{-1} = A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}$$
In the case of the low–rank model, we have
$$\left(H\Sigma_\alpha H^T + V\right)^{-1} = V^{-1} - V^{-1}H\left(\Sigma_\alpha^{-1} + H^T V^{-1}H\right)^{-1}H^T V^{-1}$$
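
The identity above can be sketched numerically as follows. The function name `lowrank_solve` and the randomly generated `H`, `Sigma_alpha`, and diagonal of `V` are hypothetical inputs chosen only to exercise the formula.

```python
import numpy as np

def lowrank_solve(H, Sigma_alpha, v_diag, b):
    """Solve (H Sigma_alpha H^T + V) x = b with V = diag(v_diag),
    using the Sherman-Morrison-Woodbury identity.
    Only a p x p system is solved (p = number of columns of H)."""
    Vinv_b = b / v_diag                        # V^{-1} b, V is diagonal
    Vinv_H = H / v_diag[:, None]               # V^{-1} H
    # p x p matrix: Sigma_alpha^{-1} + H^T V^{-1} H
    small = np.linalg.inv(Sigma_alpha) + H.T @ Vinv_H
    return Vinv_b - Vinv_H @ np.linalg.solve(small, H.T @ Vinv_b)

rng = np.random.default_rng(2)
n, p = 500, 10
H = rng.normal(size=(n, p))
A = rng.normal(size=(p, p))
Sigma_alpha = A @ A.T + p * np.eye(p)          # any positive-definite p x p matrix
v_diag = rng.uniform(0.5, 2.0, size=n)         # diagonal of V = Sigma_eps + Sigma_xi
b = rng.normal(size=n)
x = lowrank_solve(H, Sigma_alpha, v_diag, b)
```

Because $V$ is diagonal, the only matrix that is inverted or factorized is p × p, so the cost is linear in n for fixed p.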

Fixed Rank Kriging (Cressie & Johannesson 08)
$$Y = X\beta + ZW^* + \varepsilon$$
Use a model

Likelihood approximation
Partition the observation vector $Y$ into sub-vectors $Y_1, \dots, Y_b$ and let $Y_{(j)} = (Y_1^T, \dots, Y_j^T)^T$.
The exact likelihood:
$$p(Y; \beta, \theta) = p(Y_1; \beta, \theta)\prod_{j=2}^{b} p(Y_j \mid Y_{(j-1)}; \beta, \theta)$$
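
The factorization can be checked numerically, and it also shows where an approximation enters. In the sketch below, `m=None` conditions each block on all previous blocks and reproduces the exact likelihood, while an integer `m` conditions only on the `m` most recent blocks. That truncation is a Vecchia-style approximation assumed here for illustration; the function names, the 1-D locations, and the exponential-plus-nugget covariance are likewise hypothetical.

```python
import numpy as np

def mvn_logpdf(y, mean, cov):
    """Log density of N(mean, cov) evaluated at y."""
    r = y - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (len(y) * np.log(2 * np.pi) + logdet + r @ np.linalg.solve(cov, r))

def block_loglik(y, mean, cov, blocks, m=None):
    """Sum of log p(Y_j | conditioning set) over the blocks.

    blocks: list of index arrays partitioning 0..n-1.
    m=None conditions on all previous blocks (exact likelihood);
    an integer m conditions only on the m most recent blocks.
    """
    total = 0.0
    for j, idx in enumerate(blocks):
        start = 0 if m is None else max(0, j - m)
        prev = blocks[start:j]
        if not prev:                           # first block, or nothing to condition on
            total += mvn_logpdf(y[idx], mean[idx], cov[np.ix_(idx, idx)])
            continue
        c = np.concatenate(prev)
        # K = C_jc C_cc^{-1}, used for the conditional mean and covariance
        K = np.linalg.solve(cov[np.ix_(c, c)], cov[np.ix_(c, idx)]).T
        cmean = mean[idx] + K @ (y[c] - mean[c])
        ccov = cov[np.ix_(idx, idx)] - K @ cov[np.ix_(c, idx)]
        total += mvn_logpdf(y[idx], cmean, ccov)
    return total

rng = np.random.default_rng(3)
n = 60
s = np.sort(rng.uniform(0, 10, size=n))        # ordered 1-D locations
cov = np.exp(-np.abs(s[:, None] - s[None, :])) + 0.1 * np.eye(n)
mean = np.zeros(n)
y = rng.multivariate_normal(mean, cov)
blocks = np.array_split(np.arange(n), 6)
exact = block_loglik(y, mean, cov, blocks)         # full conditioning
approx = block_loglik(y, mean, cov, blocks, m=1)   # previous block only
```

With truncated conditioning, the largest matrix that has to be solved against is the size of the conditioning set rather than n.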

Gaussian Markov Random Fields (GMRF)
Definition
Let the neighbors of a point $i$ be the points $N_i$ that are "close" to $i$. A Gaussian random field $X \sim N(\mu, \Sigma = Q^{-1})$ that satisfies
$$p(x_i \mid \{x_j : j \neq i\}) = p(x_i \mid \{x_j : j \in N_i\})$$
is a Gaussian Markov random field.
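
For a Gaussian field, conditional independence of $x_i$ and $x_j$ given the rest corresponds to a zero in the precision matrix $Q$. A small sketch using a stationary AR(1) process, chosen here only as a familiar example with neighbors $N_i = \{i-1, i+1\}$:

```python
import numpy as np

n, phi = 8, 0.7
idx = np.arange(n)
lag = np.abs(idx[:, None] - idx[None, :])
# Stationary AR(1) covariance with unit innovation variance:
# Sigma_ij = phi^|i-j| / (1 - phi^2), a fully dense matrix
Sigma = phi ** lag / (1.0 - phi ** 2)
Q = np.linalg.inv(Sigma)                       # precision matrix
# Entries of Q between non-neighbors (more than one step apart)
max_far = np.abs(Q[lag > 1]).max()             # zero up to rounding error
dense_frac = np.mean(np.abs(Sigma) > 1e-12)    # fraction of nonzero covariances
```

The covariance matrix has no zeros, yet the precision matrix is tridiagonal, so it is the sparsity of $Q$ rather than of $\Sigma$ that makes GMRF computations cheap.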

- +: GP model is widely used in modeling continuously indexed spatial data in which the covariance function characterizes the process properties
- –: Inference involves factorizing covariance matrices
- +: GMRF model is computationally efficient due to the

Stochastic partial differential equation (SPDE) connection (Whittle 1954, 1963)
A Gaussian process $Y(s)$ with Matérn covariance function is a stationary solution to the linear fractional stochastic partial differential equation:
$$(\alpha^2 - \Delta)^{\kappa/2}\, Y(s) = W(s), \qquad \kappa = \nu + \frac{d}{2}, \quad \nu > 0$$
where
- $W(s)$ is a spatial Gaussian white noise
- $\Delta = \sum_i \frac{\partial^2}{\partial s_i^2}$ is the Laplacian operator
- $d$ is the dimension of the spatial domain
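
A one-dimensional finite-difference sketch of this connection, using the slide's notation ($\alpha$ as the scale, $\kappa$ as the exponent). It assumes $d = 1$ and $\kappa = 2$ (so $\nu = 3/2$), a regular grid with Dirichlet boundaries, and cell-averaged white noise. This is not the finite-element construction of Lindgren et al.; it only illustrates that discretizing the operator yields a sparse precision matrix whose inverse is close to a Matérn covariance.

```python
import numpy as np

alpha, h, n = 1.0, 0.05, 801                   # scale, grid spacing, grid size
# Second-difference approximation of the Laplacian (tridiagonal)
lap = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
       + np.diag(np.ones(n - 1), -1)) / h ** 2
L = alpha ** 2 * np.eye(n) - lap               # discrete (alpha^2 - Laplacian)
# Cell-averaged white noise has covariance I / h, so L Y = W gives Q = h L^T L
Q = h * L.T @ L
rows, cols = np.nonzero(np.abs(Q) > 1e-9)
bandwidth = np.abs(rows - cols).max()          # Q is banded (pentadiagonal)

# Compare the implied covariance at the grid center with the Matern (nu = 3/2)
# covariance in d = 1: (1 + alpha r) exp(-alpha r) / (4 alpha^3)
cov = np.linalg.inv(Q)
mid = n // 2
lags = np.arange(60) * h
cov_grid = cov[mid, mid:mid + 60]
cov_matern = (1.0 + alpha * lags) * np.exp(-alpha * lags) / (4.0 * alpha ** 3)
max_rel_err = np.max(np.abs(cov_grid - cov_matern) / cov_matern)
```

The dense inverse is computed here only to check the covariance; in practice the point of the construction is to work with the banded $Q$ directly.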

An explicit link between GP and GMRF via SPDE (Lindgren et al. 11)
- Establishes the link between a GP with Matérn covariance function (where $\nu + d/2$ is an integer) and a GMRF
- (Bayesian) inference can be done using the integrated nested Laplace approximation (INLA) approach
- Extensions to nonstationary models, models on manifolds, multivariate models, and spatio-temporal models are relatively easy

Extensions
- non-stationary model on a sphere
$$\left(\alpha^2(s) - \Delta\right)^{\kappa/2}\,\tau(s)Y(s) = W(s), \quad s \in \mathbb{S}^2$$
- non-separable anisotropic space-time model
$$\left(\frac{\partial}{\partial t} + \left(\alpha^2 + m\cdot\nabla - \nabla\cdot H\nabla\right)\right)^{\kappa/2} Y(s,t) = W(s,t)$$
where $(s,t) \in \mathbb{S}^2 \times \mathbb{R}$

For Further Reading
- Lindgren, F., Rue, H., & Lindström, J. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. JRSSB, 73:423–498, 2011.
- Rue, H., & Tjelmeland, H. Fitting Gaussian Markov Random Fields to Gaussian Fields. Scandinavian Journal of Statistics, 29:31–49, 2002.
- Stein, M. L., Chi, Z., & Welty, L. J. Approximating Likelihoods for Large Spatial Data Sets. JRSSB, 66:275–296, 2004.
- Vecchia, A. V. Estimation and Model Identification for Continuous Spatial Processes. JRSSB, 50:297–312, 1988.