Professional Documents
Culture Documents
a r t i c l e i n f o a b s t r a c t
Article history: We consider current (1971–2000) and future (2041–2070) average seasonal surface temperature fields
Received 14 July 2011 from two regional climate models (RCMs) driven by the same atmosphere–ocean general circulation
Accepted 10 December 2011 model (GCM) in the North American Regional Climate Change Assessment Program (NARCCAP) Phase
II experiment. We analyze the difference between future and current temperature fields for each RCM
Keywords: and include the factor of season, the factor of RCM, and their interaction in a two-way ANOVA model.
Posterior distribution
Noticing that classical ANOVA approaches do not account for spatial dependence, we assume that the
Regional climate model (RCM)
main effects and interactions are spatial processes that follow the Spatial Random Effects (SRE) model.
Spatial Random Effects (SRE) model
Spatial statistics
This enables us to model the spatial variability through fixed spatial basis functions, and the computations
associated with an ANOVA of high-resolution RCM outputs can be carried out without having to resort
to approximations. We call the resulting model a spatial two-way ANOVA model. We implement it in
a Bayesian framework, and we investigate the variability of climate-change projections over seasons,
RCMs, and their interactions. We find that projected temperatures in North America are credibly higher,
that the associated warming effects differ in spatial areas and in seasons, and that they are of much larger
magnitude than the variability between RCMs.
© 2012 Elsevier B.V. All rights reserved.
0303-2434/$ – see front matter © 2012 Elsevier B.V. All rights reserved.
doi:10.1016/j.jag.2011.12.007
4 E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15
NARCCAP is an international program designed to investigate using triangularization, and it is not clear whether the prediction
the uncertainties in RCMs and provide high-resolution (approxi- performance is sensitive to these approximations. The approach
mately 50 km) climate-output data for the North American region in Lindgren et al. (2011) is limited in the way that nonstation-
(Mearns et al., 2009). A set of six RCMs and various experiments ary processes are modeled. Within the domains of chosen basis
are included in NARCCAP: The Phase I experiment is designed to functions, they assume that parameters of the stochastic partial dif-
explore the variability in RCM outputs for the current period where ferential equation are constant. This is a general comment that can
all six RCMs were run with common boundary conditions provided be made regarding the use of homogeneous Markov random fields
by the NCEP Reanalysis II data (e.g., Kanamitsu et al., 2002). Kang to model spatial dependence over large, heterogeneous regions,
et al. (2012) proposed a Bayesian hierarchical model to combine namely, flexibility of the spatial model is compromised to solve
RCM outputs for this part of the experiment. Further, Sain and the computational challenge of inverting a very large covariance
Furrer (2010) and Christensen and Sain (2012) used two differ- matrix (e.g., Sain et al., 2010; Lindgren et al., 2011).
ent classical spatial correlation models to capture the dependence Cressie and Johannesson (2006, 2008) proposed a model with
between the six RCMs. flexible spatial covariance structure that they called the Spatial
The most important part of the experiment in NARCCAP is called Random Effects (SRE) model, which is computationally scalable
the Phase II experiment, which involves not only multiple RCMs, for large-to-massive spatial datasets. Kang and Cressie (2011)
but also multiple runs with different boundary conditions provided extended it to the Bayesian framework, and Kang et al. (2012)
by different GCMs. In Phase II, RCMs are run, not only for the cur- successfully used the SRE model to analyze high-resolution RCM
rent period but also for a future period, and thus climate-change outputs in the NARCCAP Phase I experiment. In this article, we
projections are available from the Phase II experiment. incorporate the SRE model as one component in the spatial two-
Since NARCCAP Phase II is an ongoing experiment, in this article, way ANOVA model; this enables us not only to have a flexible
we consider a subset of the Phase II runs that are available. In partic- covariance structure without assuming homogeneity (e.g., sta-
ular, we consider current and future outputs from two RCMs with tionarity or isotropy), but also to complete a Bayesian analysis
boundary conditions provided by the same GCM. Our statistical without using approximations as in Furrer et al. (2006) and
analysis, based only on outputs from the RCMs, is not a validation Banerjee et al. (2008). In this article, we use the SRE model to
study, since it cannot detect climate-change factors missing from analyze variabilities of climate-change projections from two RCMs
them. In this article, we investigate the variability in the outputs due to various sources, including RCM, season, and their interac-
due to two RCMs, four Boreal seasons, and their interactions, which tions.
results in a quantification of the projected climate change from In Section 2, we give a detailed description of the NARC-
these RCMs. Specifically, we do this with a spatial two-way analy- CAP RCM outputs studied in this article. We present a spatial
sis of variance (ANOVA) model in a Bayesian framework. Although two-way ANOVA model in Section 3, and we give distributional
applications of Bayesian modeling to the climate and environmen- assumptions for the data model, the process model, and the
tal sciences is not new (e.g., Berliner et al., 2000; Tebaldi et al., prior. Section 4 describes how this model is implemented on
2005), in previous analyses of multi-climate-model outputs, most the NARCCAP Phase II outputs, and the results of our analy-
focus on GCMs, and no spatial dependence is accounted for (e.g., sis are presented there. Discussion and conclusions are given
Tebaldi et al., 2005; Berliner and Kim, 2008; Smith et al., 2009). In in Section 5, and Appendices A and B contain details on the
this article, we view the RCM outputs as spatial fields whose spa- prior specifications and the Markov chain Monte Carlo (MCMC)
tial variability is modeled and incorporated into a two-way ANOVA algorithm used to obtain the posterior distribution, respec-
model. tively.
It should also be noted that due to the high resolution of RCM
outputs, the size of the datasets (i.e., model outputs) is very large. As
pointed out by Banerjee et al. (2008) and Cressie and Johannesson 2. NARCCAP RCM outputs
(2008), such large or even massive datasets will cause computa-
tional difficulties for traditional spatial statistical methods, since The North American Regional Climate Change Assessment Pro-
the computations require inversion of a large covariance matrix. For gram (NARCCAP) is an international program designed to explore
example, the Bayesian ANOVA model suggested by Kaufman and uncertainties in the regional climate projections from RCMs. A set
Sain (2010), which is based on classical Matérn covariance func- of six RCMs are included in NARCCAP to produce high-resolution
tions, cannot be directly implemented on the outputs we analyze in (approximately 50 km) outputs over the spatial domain covering
this article. To alleviate the computational difficulty posed by mas- most of Canada, the 48 contiguous states in the United States, and
sive datasets, several approximation methods have been proposed, northern Mexico, as well as the adjacent Atlantic and Pacific Oceans.
including covariance tapering and predictive processes (e.g., Furrer In the Phase II experiment of NARCCAP, these six RCMs are coupled
et al., 2006; Banerjee et al., 2008; Salazar et al., 2011). However, for with a collection of four different GCMs. For each RCM+GCM com-
these methods, it is not always clear how well the approximations bination, two climate-model runs are specified: A current run is
perform in relation to the underlying modeled spatial processes. implemented from 1971 to 2000, with boundary conditions pro-
Approaches using Markov random fields have also been sug- duced by the “Climate of the 20th Century” GCMs; a future run is
gested. Sain et al. (2010) and Sain and Kaufman (2011) deal with implemented from 2041 through 2070, with boundary conditions
computation for large datasets, by using homogeneous Markov ran- produced by the GCMs with the SRES A2 emissions scenario for the
dom fields, assuming conditional dependence between only first- 21st century (Nakicenovic et al., 2000).
and second-order neighbors. Sain et al. (2011) generalized this In order to explore developments in the global environment,
model for multivariate processes and apply it to analyze RCM out- especially the future production of greenhouse gases and aerosol
puts of both temperature and precipitation. Moreover, Lindgren precursor emissions, scientists construct emissions scenarios by
et al. (2011) construct Markov random fields for Gaussian processes characterizing the dynamics and relationships between various
motivated by the stationary Matérn family of covariance functions. factors such as demographic, politico-societal, economic, and tech-
Their representation of a Gaussian process with Markov random nological growth. In Nakicenovic et al. (2000), a set of four scenarios
fields is based on using approximate solutions to a specific type of families are developed and are later used as the “markers” scenarios
stochastic partial differential equations. They use a further approx- in the IPCC Fourth Assessment Report (Solomon et al., 2007). Among
imation to put irregularly sampled locations onto regular grids by them, the SRES A2 emissions scenario assumes the world to be
E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15 5
Table 1 Table 2
Summary of the two RCMs and the GCM used. ANOVA table of {Dij }.
2070 The upper-right panel in Fig. 2 shows the spatial field of {D.. (·)}
future Zij (sk ; t)
Zij (sk ) ≡ , (2) from which it can be seen that, overall, future temperatures will be
30 higher than current temperatures. The warming effect is stronger
t=2041
on the land than on the ocean generally, and the Hudson Bay area
where Zij (sk ; t) is the ith RCM’s average temperature for the jth sea- shows the highest temperature difference.
son in pixel sk during year t, for i = 1, 2, j = 1, . . ., 4, {sk : k = 1, . . ., n}, Note that for a given s ∈ D, {Dij (s)} can be viewed as data from a
and n = 98 × 120 = 11, 760. Although seasonal average temperatures classical two-factor experiment without replication, with the factor
6 E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15
Fig. 1. Left column: Regional climate-change projections from RCM3, D2j , for four Boreal seasons, j = 1 (spring), j = 2 (summer), j = 3 (autumn), j = 4 (winter) from the top down.
Right column: Regional climate-change projections differences between CRCM and RCM3, D1j − D2j , for four Boreal seasons, j = 1 (spring), . . ., j = 4 (winter) from the top down.
To avoid distortion, the color scale on the left stops at 5 ◦ C, although there are higher temperatures for a few pixels on the maps (max. temperature difference = 7.18 ◦ C, in
the bottom-left panel).
‘RCM’ at two levels and the factor ‘season’ at four levels. Therefore, 3.1. Data model
a classical ANOVA of the type given in Table 2, can be implemented
for each pixel s ∈ D, independently. However, this ignores the spa- Climate models, which are complicated coupled systems of dif-
tial dependence over the field, and the information from the dataset ferential equations, are typically run over long time periods, and
would not be used efficiently. In this article, we expand the classi- climatological statistics, such as long-run averages, are extracted
cal ANOVA approach in the spatial setting, and we propose a spatial and modeled as one single realization equal to the underlying pro-
two-way ANOVA model to do this. This will be discussed in the next cess plus noise (e.g., Furrer et al., 2007; Sain et al., 2011; Kang et al.,
section. 2012). In our analysis, these long-run averages, {Dij (·)}, are in fact
average differences between future and current 30-year runs from
two RCMs (i = 1 and 2) and for four Boreal seasons (j = 1, . . ., 4). To
3. Bayesian hierarchical spatial ANOVA model
represent the variability that is inherent in such constructions, we
consider the following data model:
In this section, we introduce the Bayesian hierarchical spatial
ANOVA model for analyzing regional climate-change projections,
{Dij (·)}. The formulation can be applied to any spatial dataset from Dij (s) = Yij (s) + εij (s), s ∈ D, (6)
a two-way factorial experiment. All the three levels of the hier-
archical model, namely the data, process, and parameter (or prior) where {Yij (·)} represents the projection of climate change from
models are presented. the ith RCM for the jth season, and {εij (·)} represents the residual
E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15 7
Fig. 2. Upper-left panel: The posterior mean of Y.. . Upper-right panel: D.. . Middle-left panel: The posterior standard deviation of Y.. . Middle-right panel: Differences between
D.. and the posterior mean of Y.. . Lower panels: Pixelwise posterior 2.5th (lower-left) and 97.5th (lower-right) percentiles of Y.. .
variability after temporal averaging. Similar to Dij in (4), we define In applications beyond this one, the matrices {Vij } will be known
the corresponding n-dimensional vectors: from instrument specification or estimated separately (e.g., Shi and
Cressie, 2007; Kang et al., 2010).
Yij ≡ (Yij (s1 ), . . . , Yij (sn )) , Equivalently, the data model in (6) and (7) can be written as:
and
εij ≡ (εij (s1 ), . . . , εij (sn )) . Dij |{Yij }∼Gau(Yij , ε2 Vij ), independently, for i = 1, 2, j = 1, . . . , 4.
We assume that the noise components {εij } follow Gaussian (Gau) (8)
distributions:
Note that the assumption of conditional independence of {Dij } does
εij ∼Gau(0, ε2 Vij ), independently, for i = 1, 2, j = 1, . . . , 4, (7)
not mean that they are unconditionally independent. In fact, we
where ε2 is an unknown parameter, and {Vij } are known n × n expect that {Dij } are dependent, since they are all projections of
diagonal matrices determined by variability of replications of the climate change for the same region D.
outputs over time. That is, the kth diagonal element of Vij is specified
as:
3.2. Process model
1
2070
(Zij (sk ; t) − Zij (sk ; t − 70) − Dij (sk ))2
Vij (sk ) = . The projections of climate change from RCMs for all four seasons,
30 29
t=2041 {Yij (·)}, are modeled statistically with a spatial two-way ANOVA
8 E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15
model. Specifically, for s ∈ D, we first model the projection from We consider now the prior distributions for the r × r, positive-
the ith RCM for the jth season as follows: definite covariance matrices Km , for m = 1, . . ., 5. Previous articles on
random-effects models have usually assumed a very simple struc-
Yij (s) = (s) + ai (s) + bj (s) + (ab)ij (s), i = 1, 2, j = 1, . . . , 4, (9)
ture, such as: Km is a multiple of the identity matrix (e.g., Zhao et al.,
where (·) represents the baseline; {ai (·)} are the main effects rep- 2006; Baladandayuthapani et al., 2008), or Km is a diagonal matrix
resenting differences between RCMs; {bj (·)} are the main effects (e.g., Furrer et al., 2007; Lopes et al., 2008). Then priors such as the
representing differences between seasons; {(ab)ij (·)} are the inter- inverse-gamma prior have been assigned to the associated scale
actions between the RCM factor and the season factor. Note that parameters. For the SRE model, Kang and Cressie (2011) suggested
effects in (9) are now functions of location s ∈ D, through which the Givens-angle prior for the covariance matrix, and they have
spatial dependencies will be taken into account. shown that it allows for much more flexibility and has better pre-
Kaufman and Sain (2010) discussed a similar model as in (9), diction performance, when compared to the previously mentioned
although their choice of process model for {ai (·)}, {bj (·)}, and priors. It has also been applied by Kang et al. (2012) to analyze
{(ab)ij (·)} leads to very slow MCMC algorithms to sample the the outputs in the NARCCAP Phase I experiment. In this article,
posterior distributions. In our situation, the RCM outputs are because we use the SRE model to capture spatial dependence, we
high-resolution and, hence, any one spatial field is very large: can similarly use the Givens-angle prior on {Km }.
n = 98 × 120 = 11, 760. As pointed out by Banerjee et al. (2008) and We first express Km , for m = 1, . . ., 5, in terms of its spectral
Cressie and Johannesson (2008), such large or even massive n will decomposition,
cause computational difficulties for traditional spatial statistical
methods. This is the case in Kaufman and Sain (2010), since inver- Km = Pm m Pm , (14)
sions of n × n covariance matrices are required during each of their
where m is the diagonal matrix, diag(1,m , . . ., r,m ), whose diago-
MCMC sampling steps.
nal elements are the eigenvalues 1,m ≥ 2,m ≥ · · · ≥ r,m > 0, and Pm
In this article, we choose to model main and interaction effects
is the corresponding orthogonal matrix of eigenvectors such that
with the SRE model, due to its flexible spatial modeling and its com- P = P P = I. As in Kang and Cressie (2011), we express P in
Pm m m m m
putational advantages. Specifically, {ai (·)}, {bj (·)}, and {(ab)ij (·)} are
terms of the r(r − 1)/2 Givens angles, denoted by ij,m , for i = 1, . . .,
modeled as follows:
r − 1 and j = i + 1, . . ., r:
ai (s) = S(s) ıi , bj (s) = S(s) j , and(ab)ij (s) = S(s) ij ,
Pm = (G12,m G13,m · · ·G1r,m ) · (G23,m · · ·G2r,m )· · ·G(r−1)r,m . (15)
where S(·) ≡ (S1 (·), . . ., Sr (·)) is a vector of r deterministic, known,
multi-resolutional, not-necessarily-orthogonal spatial basis func- In (15), Gij,m is the Givens rotation matrix corresponding to ij,m ,
tions. The r-dimensional vectors, ıi ≡ (ı1,i , . . ., ır,i ) , j ≡ ( 1,j , . . ., defined as a modification of the r × r identity matrix whose ith and
r,j ) , and ij ≡ ( 1,ij , . . ., r,ij ) are Gaussian random effects and are jth diagonal elements of 1 are replaced by cos( ij,m ), and whose (i,
modeled with independent Gaussian distributions, as follows: j) and (j, i) elements of 0 are replaced by −sin( ij,m ) and sin( ij,m ),
respectively.
ıi ∼Gau(0, K1 ), We further assume independence of
j ∼Gau(0, K2 ), j = 1, 3,
m ≡ (1,m , . . . , r,m )
j ∼Gau(0, K3 ), j = 2, 4, (10)
and
ij ∼Gau(0, K4 ), i = 1, 2, j = 1, 3,
ij ∼Gau(0, K5 ), i = 1, 2, j = 2, 4. m ≡ ( 12,m , . . . , 1r,m , 23,m , . . . , 2r,m , . . . , (r−1)r,m ) ;
Notice that we allow the covariance matrices to differ between the that is,
mild (spring and autumn) seasons and the extreme (summer and
winter) seasons. [m , m ] = [m ] · [ m ], m = 1, . . . , 5.
Define S to be the n × r matrix whose jth column is Sj (·), and
define the n-dimensional vector, ≡ ((s1 ), . . ., (sn )) . Then the Priors on m and m imply a prior distribution on Km that is very
model in (9) can be rewritten as: flexible. We assign priors on m and m as suggested by Kang and
Cressie (2011), so that the prior on Km has a multi-resolutional
Yij = + Sıi + S j + S ij , (11) structure brought about by the choice of multi-resolutional basis
functions in the SRE model; Kang and Cressie (2011) call this the
where the random effects {ıi }, { j }, and { ij } are independently
Givens-angle prior. Other than the multi-resolutional structure, the
distributed as in (10), for i = 1, 2, j = 1, . . ., 4.
Givens-angle prior aims to be noninformative; more details, includ-
ing hyperparameter specification, can be found in Appendix A and
3.3. Parameter model
in Kang and Cressie (2011).
Finally, we independently assign an inverse-gamma-
To complete the Bayesian hierarchical model, we specify the
distribution prior to ε2 :
parameter model (i.e., the prior distribution) for the various param-
eters, specifically, the baseline , the r × r covariance matrices K1 , ε2 ∼IG(a, b), (16)
. . ., K5 , and the variance parameter ε2 . First, we assume indepen-
dence: where IG(a, b) denotes the inverse-gamma distribution with shape
parameter a and scale parameter b, whose density is proportional
[, K1 , . . . , K5 , ε2 ] = [] · [K1 ]· · ·[K5 ] · [ε2 ]. (12)
to,
Then we assign prior models for each of the components, as follows. −b
We assume that has a Gaussian prior, ( 2 )−(a+1) exp . (17)
2
∼Gau(0 , 02 I), (13)
Both a and b are hyperparameters, and their specifications are given
where 0 and 02 are known hyperparameters (see Appendix A). in Appendix A.
E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15 9
Fig. 4. Upper panels: Posterior mean (upper-left) and posterior standard deviation (upper-right) of Y1. − Y2. (CRCM − RCM3). Lower panels: Posterior mean (lower-left) and
posterior standard deviation (lower-right) of Y.2 − Y.4 (summer − winter).
Fig. 5. Upper panels: Posterior mean (upper-left) and posterior standard deviation (upper-right) of Y12 − Y22 (CRCM − RCM3, for summer). Lower panels: Posterior mean
(lower-left) and posterior standard deviation (lower-right) of Y14 − Y24 (CRCM − RCM3, for winter).
E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15 11
posterior probability that Y.. (s) lies in the interval from Ŷ..(2.5) (s) to differences between the summer and winter temperature increase
Ŷ..(97.5) (s) is 0.95, for each pixel s ∈ D. are positive in the south and negative in the north. One can see
We also consider the posterior-distributional properties of var- that, particularly in the Hudson Bay, winter increases in tempera-
ious contrasts, which are defined as linear combinations of two ture are much greater than summer increases. The reverse is true in
or more factor-level means with coefficients adding up to zero. In the Midwest of the United States. It should also be noticed that the
the upper panels in Fig. 4, we present the posterior means and the posterior mean of Y.2 (·) − Y.4 (·), is of larger magnitude than that of
posterior standard deviations of Y1. (·) − Y2. (·), which is the differ- Y1. (·) − Y2. (·). That is, seasonal differences are much more important
ence between CRCM and RCM3, averaged over seasons. It indicates than RCM differences.
that CRCM projects more temperature increase in southern Texas In addition, we consider the differences between RCMs given
and northern Mexico areas, while its projections are cooler on a certain season, summer or winter. That is, we investigate the
the coast, as compared to RCM3. In the lower panels in Fig. 4, we contrasts, Y12 (·) − Y22 (·) and Y14 (·) − Y24 (·), respectively, whose pos-
present the posterior means and the posterior standard deviations terior means and standard deviations are shown in Fig. 5. Compared
of Y.2 (·) − Y.4 (·), the difference between the summer and winter to D12 (·) − D22 (·) and D14 (·) − D24 (·) in Fig. 1, the posterior means of
temperature increase, averaged over RCMs. It can be seen that the Y12 (·) − Y22 (·) and Y14 (·) − Y24 (·) are much smoother (as expected).
0.1 0.4
6 CRCM
RCM3
5 0.05 0.2
4
3 0 0
2
−0.05 −0.2
1
0 −0.1 −0.4
Spring Summer Fall Winter CRCM RCM3 Spring Summer Fall Winter
0.1 0.4
6 CRCM
RCM3
5 0.05 0.2
4
3 0 0
2
−0.05 −0.2
1
0 −0.1 −0.4
Spring Summer Fall Winter CRCM RCM3 Spring Summer Fall Winter
0.1 0.4
6 CRCM
RCM3
5 0.05 0.2
4
3 0 0
2
−0.05 −0.2
1
0 −0.1 −0.4
Spring Summer Fall Winter CRCM RCM3 Spring Summer Fall Winter
0.1 0.4
6 CRCM
RCM3
5 0.05 0.2
4
3 0 0
2
−0.05 −0.2
1
0 −0.1 −0.4
Spring Summer Fall Winter CRCM RCM3 Spring Summer Fall Winter
Fig. 6. Left column: Plots of posterior means of {Y.. + vj : j = 1, . . . , 4}, with j = 1 indicating spring , . . ., j = 4 indicating winter. Middle column: Plots of posterior means of
{ui : i = 1, 2}, with i = 1 indicating CRCM and i = 2 indicating RCM3. Right column: Plots of posterior means of {wij : i = 1, 2, j = 1, . . . , 4}, with i = 1 indicating CRCM (red line
with circles) and i = 2 indicating RCM3 (blue line with asterisks), and j = 1 indicating spring , . . ., j = 4 indicating winter. The rows correspond to the locations s0 : Top row
(Hudson Bay pixel), 2nd row (Great Lakes pixel), 3rd row(Mid-West pixel), bottom row (Rocky Mountains pixel).
12 E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15
From Fig. 5, it can be seen that in winter, CRCM tends to project 5. Discussion and conclusions
stronger warming effects than RCM3 in the northern region, espe-
cially in the Hudson Bay, while in summer it is the opposite. In In this article, we develop a spatial two-way ANOVA model
both winter and summer, CRCM seems to project more tempera- in a Bayesian framework that allows a coherent statistical anal-
ture increase in southern Texas and northern Mexico. Additionally, ysis of RCM climate-change projections from NARCCAP Phase II.
notice that the posterior standard deviations of Y12 (·) − Y22 (·) are We investigate the variabilities due to RCMs, Boreal seasons, and
smaller than those of Y14 (·) − Y24 (·). their interactions. Our spatial ANOVA is based on the Spatial Ran-
Besides inferences over the entire domain D, we are able to dom Effects (SRE) model that allows us to carry out the Bayesian
investigate temperature increases for single pixels. For a fixed pixel computations in an efficient manner. Additionally, the SRE model
s0 ∈ D, we decompose Yij (s0 ) as follows: allows a very flexible class of spatial covariance structures without
assuming stationarity or isotropy.
Yij (s0 ) = Y.. (s0 ) + ui (s0 ) + vj (s0 ) + wij (s0 ), In our analysis, we obtain inferences for the temperature-
where change projections based on posterior distributions. We find that
the warming effects differ in areas and seasons substantially: For
ui (s0 ) = Yi. (s0 ) − Y.. (s0 ) example, the warming effects are much stronger in the north in
winter, and they are stronger in the south in summer. We also
vj (s0 ) = Y.j (s0 ) − Y.. (s0 ) find that although the two RCMs produce different outputs, the
variability between RCMs is very small, when compared to the
projected warming effects. Additionally, from our Bayesian anal-
wij (s0 ) = Yij (s0 ) − Yi. (s0 ) − Y.j (s0 ) + Y.. (s0 ).
ysis, we are able to obtain both point and interval estimates, and
Four locations were chosen for s0 , as shown by the triangles in we investigate various contrasts between factor levels of RCM and
Fig. 3. We then computed and plotted the posterior means of season.
Y.. (s0 ) + vj (s0 ), ui (s0 ), and wij (s0 ) for these four locations, in the The SRE model applied in this article can be employed for ana-
left, middle, and right columns of Fig. 6, respectively. The purpose lyzing massive environmental datasets. Particularly when spatial
of Fig. 6 is to illustrate that the effects of climate change, season, observations at some locations are missing (not the case for RCMs),
RCM, and their interaction are different at different locations. The the SRE model can be used to generate optimal spatial predic-
left column of Fig. 6 indicates warming on the order of 3 ◦ C for the tions through the spatial best linear unbiased predictor (spatial
four locations, with seasonal warming from 1 ◦ C to 6 ◦ C in the Hud- BLUP, or kriging). These can be computed efficiently, along with
son Bay pixel. The middle column of Fig. 6 shows the magnitude of their associated uncertainty measures, as shown in Cressie and
variation due to choice of climate model to be miniscule in com- Johannesson (2008) and Kang and Cressie (2011). The multi-way
parison. The interaction between climate model and season shown SRE models we present in this article can potentially be used
in the right column of Fig. 6 is small and indicates local (in time and for analyzing observations from various instruments on different
space) differences in the way the RCMs capture the flows of energy remote-sensing platforms, where the size of the datasets is typically
and water. large or even massive.
We shall now investigate how large the variability due to RCM The SRE models allow exact computation even when the dataset
is, compared to the projected temperature change. To provide a is massive, compared to other geostatistical approximation-based
quantitative measure, for a given region S0 ⊂ D, we define approaches such as Banerjee et al. (2008). Another advantage
of the SRE model is that it allows a flexible family of spatial
s∈S0
(Y1. (s) − Y2. (s))2 /(2|S0 |) covariance functions without assuming homogeneity, and it has
R0 ≡ , (19) no constraints on data observed at irregular locations. For many
Y (s)/|S0 |
s∈S0 .. other approaches such as those in Sain et al. (2011) and Sain
and Kaufman (2011), a homogeneous Markov random field is
which is unit-free and analogous to the coefficient of variation mea-
assumed.
sure. In Table 3, we choose S0 to be the entire domain D, the land,
In this article, we consider a subset of the Phase II runs in NAR-
the Great Lakes, the Hudson Bay, the coastline, and the ocean.
CCAP, and the model and methods presented here are ready to be
Notice that the numerator in (19) gives a measure of the variability
extended to the full NARCCAP Phase II runs, when they become
between the two RCMs in the region S0 , while the denominator
available. Recall that the Phase II experiment includes multiple
is the projected climate change averaged over the same region.
runs of RCMs with different boundary conditions provided by dif-
Hence, R0 provides the relative magnitude of variability between
ferent GCMs. Thus, instead of the spatial two-way ANOVA model
RCMs, relative to the averaged projected climate change. The pos-
introduced in this article, we can include another factor for ‘GCM’
terior means, 2.5th and 97.5th percentiles of R0 are summarized in
and build a spatial three-way ANOVA model. Currently, NARCCAP
Table 3, where it can be seen that for all the regions S0 considered, R0
includes only one emissions scenario (including greenhouse gases),
is estimated to be very small, less than 5%. That is, the variability due
the SRES A2 emissions scenario. If others are considered for future
to RCM is much smaller than the projected climate change, indicat-
runs, we can generalize the model to a spatial four-way ANOVA
ing that our overall conclusions about climate-change projections
model by adding the factor for emissions scenario. Note that with
are not RCM-dependent.
the SRE model, although adding factor(s) increases computation
time, it is linear in the number of data and hence is feasible. Further-
Table 3 more, the associated computation time is miniscule in comparison
(2.5) (97.5)
Posterior mean of R0 and 95% credible interval (R̂0 , R̂0 ), for various regions S0 .
to the computation time to run the physical models to produce the
Region S0 Post. mean of R0
(2.5)
(R̂0
(97.5)
, R̂0 ) output of the RCMs themselves.
Another important and natural extension of the current model is
D 0.0455 (0.0443, 0.0467)
Land 0.0424 (0.0410, 0.0438)
to consider multivariate processes. For example, as well as temper-
Great Lakes 0.0322 (0.0224, 0.0428) ature, RCM outputs for precipitation can be studied simultaneously,
Hudson Bay 0.0246 (0.0200, 0.0296) as do Sain et al. (2011). More generally, a spatial model could be
Coastline 0.0494 (0.0470, 0.0518) built by linking RCM outputs to other variables in regional ecosys-
Ocean 0.0583 (0.0564, 0.0860)
tems, allowing environmental concerns, such as local aerosol alerts,
E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15 13
In this section, we give details on the priors on parameters, where is a positive inflation factor for the variance, chosen as
2 }, and 2 , as
= 6 in this article. Similarly, we specify {cl,m }, {l,m
including hyperparameter specification; sometimes we use data 0,m
to guide us in the choice of hyperparameters. follows:
We choose a Gaussian prior, Gau(0 , 02 I), for . Here, the hyper-
parameter 0 is chosen to be the empirical baseline D.. , and 02 is k( ˜ ij,m )
cl,m = ,
chosen to be 106 , so that the prior is approximately noninformative. |Nl |
(i,j)∈Nl
For the prior on Km , for m = 1, . . ., 5, we wish to capture the multi-
(k( ˜ ij,m ) − cl,m )2
resolutional structure of the spatial basis functions; as in Kang and 2 =
l,m ,
Cressie (2011), we put priors on m : (|Nl | − 1)
(i,j)∈Nl
k( ˜ ij,m )2
[1,m , . . . , r,m ] = [1,1,m , . . . , 1,q1 ,m ]· · ·[L,1,m , . . . , L,qL ,m |L−1,qL−1 ,m ], (A.1) 2
0,m = ,
|N0 |
(i,j)∈N0
where l,1,m , . . . , l,ql ,m are eigenvalues corresponding to the ql
L
basis functions from the lth resolution, l = 1, . . ., L, and q =
l=1 l
where k(·) is given by (A.2).
r. We consider the basis functions given by elevation and four We choose inverse-gamma distributions, IG(a, b), for the prior
indicator functions (Section 4) to be part of the first resolution. on ε2 with a = b = 0.001, so that the prior distributions are approx-
In (A.1), l,1,m , . . . , l,ql ,m are distributed as the order statistics imately noninformative.
corresponding to i.i.d. lognormally truncated distributed random
variables with known hyperparameters, mean l,m and variance
2 ; l = 1, . . ., L, where the underlying truncated normal distribu-
l,m Appendix B. Markov chain Monte Carlo algorithm
tion is restricted to (−∞, log l−1,ql−1 ,m ).
The prior put on ij,m is defined indirectly through a prior on a We implement the MCMC procedure with a Gibbs sam-
logit transformation of the angle: pler, incorporating Metropolis–Hastings steps where necessary.
The full conditional distributions, as well as details of the
Metropolis–Hastings steps are described as follows.
(
/2 + ij,m )
k( ij,m ) ≡ log . (A.2) Since {Yij } are fully determined by , {ıi }, { j }, and { ij }, we only
(
/2 − ij,m )
investigate the joint posterior distribution of , {ıi }, { j }, { ij }, and
unknown parameters, but not {Yij } directly.
That is, independently,
The posterior distribution of all the unknowns, , {ıi }, { j }, { ij },
2 K1 , . . ., K5 , and ε2 , given the data {Dij }, is proportional to the joint
k( ij,m )∼Gau(cl,m , l,m ), (A.3)
distribution, which can then be simplified:
14 E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15
[, ıi , j , ij , K1 , . . . , K5 , ε2 |D11 , . . . , D24 ] ∝ [D11 , . . . , D24 |, ıi , j , ij , ε2 ]×[ı1 , ı2 |K1 ]×[ 1 , 3 |K2 ]×[ 2 , 4 |K3 ] × [ 11 , 21 , 13 , 23 |K4 ]
2
4
×[ 12 , 22 , 14 , 24 |K5 ] × [, K1 , . . . , K5 , ε2 ] ∝ |ε2 Vij |−1/2 exp −(Dij − − S(ıi + j + ij )) (ε2 Vij )−1 (Dij − − S(ıi + j + ij ))/2
i=1 j=1
2
2
× |K1 |−1/2 exp{−ıi K1−1 ıi /2} × |K2 |−1/2 exp{− j K2−1 j /2} × |K3 |−1/2 exp{− j K3−1 j /2} × |K4 |−1/2 exp{− ij K4−1 ij /2}
i=1 j={1,3} j={2,4} i=1 j={1,3}
2
× |K5 |−1/2 exp{− ij K5−1 ij /2} × (02 )−n/2 exp{−( − 0 ) ( − 0 )/(202 )} × (ε2 )−(a+1) exp{−b/ε2 }
i=1 j={2,4}
5
5
2 2 2 2 2
× [m |1,m , . . . , L,m , 1,m , . . . , L,m ]× [ m |c1,m , . . . , cL,m , 0,m , 1,m , . . . , L,m ]. (B.1)
m=1 m=1
where
2
4
b∗ ≡ b + (Dij − − S(ıi + j + ij )) Vij−1 (Dij − − S(ıi + j + ij ))/2.
⎛ ⎞
2
4 i=1 j=1
|rest ≡ |rest ⎝0−2 0 + (ε2 Vij )−1 (Dij − S(ıi + j + ij ))⎠ , For the parameters {m } and { m }, no closed-form full con-
i=1 j=1 ditional distributions can be obtained, and a random-walk
Metropolis–Hastings algorithm is applied (e.g., Gilks et al.,
and 1996). The MCMC procedure, which is a Gibbs sampler with
⎛ ⎞−1 Metropolis–Hastings updates according to the full conditional dis-
2
4 tributions outlined above, is iterated many times to generate a
|rest ≡ ⎝0−2 I + (ε2 Vij )−1 ⎠ . sample from the joint posterior distribution of the unknowns given
i=1 j=1 the data.
ıi |rest∼Gau(ıi |rest , ıi |rest ), Baladandayuthapani, V., Mallick, B.K., Hong, M.Y., Lupton, J.R., Turner, N.D., Carroll,
R.J., 2008. Bayesian hierarchical spatially correlated functional data analysis with
where application to colon carcinogenesis. Biometrics 64, 64–73.
Banerjee, S., Gelfand, A.E., Finley, A.O., Sang, H., 2008. Gaussian predictive process
models for large spatial datasets. Journal of the Royal Statistical Society Series B
4
70, 825–848.
ıi |rest ≡ ıi |rest S (ε2 Vij )−1 (Dij − − S( j + ij )), Berliner, L.M., Kim, Y., 2008. Bayesian design and analysis for superensemble-based
j=1 climate forecasting. Journal of Climate 21, 1891–1910.
Berliner, L.M., Wikle, C.K., Cressie, N., 2000. Long-lead prediction of Pacific SSTs via
Bayesian dynamic modeling. Journal of Climate 13, 3953–3968.
and Christensen, W., Sain, S.R., 2012. Latent variable modeling for integrat-
⎛ ⎞−1 ing output from multiple climate models. Mathematical Geosciences 44,
4 doi:10.1007/s11004-011-9321-1.
ıi |rest ≡ ⎝K1−1 + S (ε2 Vij )−1 S ⎠
Cressie, N., Johannesson, G.,2006. Spatial prediction for massive datasets. In: Mas-
. tering the Data Explosion in the Earth and Environmental Sciences: Proceedings
j=1 of the Australian Academy of Science Elizabeth and Frederick White Conference.
Australian Academy of Science, Canberra, pp. 1–11.
Cressie, N., Johannesson, G., 2008. Fixed rank kriging for very large datasets. Journal
The full conditional distribution for j is: of the Royal Statistical Society Series B 70, 209–226.
Cressie, N., Kang, E.L., 2010. High-resolution digital soil mapping: Kriging for very
j |rest∼Gau(j |rest , j |rest ), large datasets. In: Viscarra-Rossel, R., McBratney, A., Minasny, B. (Eds.), Proximal
Soil Sensing. Springer, Dordrecht, pp. 49–63.
Fennessy, M.J., Shukla, J., 2000. Seasonal prediction over North America with a
where regional model nested in a global model. Journal of Climate 13, 2605–2627.
Furrer, R., Genton, M.G., Nychka, D., 2006. Covariance tapering for interpolation
2
of large spatial datasets. Journal of Computational and Graphical Statistics 15,
j |rest ≡ j |rest S (ε2 Vij )−1 (Dij − − S(ıi + ij )), 502–523.
Furrer, R., Sain, S.R., Nychka, D., Meehl, G.A., 2007. Multivariate Bayesian analysis of
i=1 atmosphere–ocean general circulation models. Environmental and Ecological
Statistics 14, 249–266.
and Gelfand, A., Smith, A.F.M., 1990. Sampling based approaches to calculating marginal
−1 densities. Journal of the American Statistical Association 85, 398–409.
2 Gilks, W.R., Richardson, S., Spiegelhalter, D.J., 1996. Markov Chain Monte Carlo in
j |rest ≡ −1
Km +S
(ε2 Vij )−1 S , Practice. Chapman & Hall/CRC, Boca Raton, FL.
Hastings, W.K., 1970. Monte Carlo sampling methods using Markov chains and their
i=1 applications. Biometrika 57, 97–109.
E.L. Kang, N. Cressie / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 3–15 15
Kanamitsu, M., Ebisuzaki, W., Woollen, J., Yang, S.K., Hnilo, J.J., Fiorino, M., Potter, Sain, S.R., Furrer, R., Cressie, N., 2011. A spatial analysis of multivariate output from
G.L., 2002. NCEP-DOE AMIP-II reanalysis (R-2). Bulletin of the American Meteo- regional climate models. Annals of Applied Statistics 5, 150–175.
rological Society 83, 1631–1644. Sain, S.R., Kaufman, C., 2011. Hierarchical modeling of variability in regional climate
Kang, E.L., Cressie, N., 2011. Bayesian inference for the spatial random effects model. models using Markov random fields, under submission.
Journal of the American Statistical Association 106, 972–983. Sain, S.R., Nychka, D., Mearns, L.O., 2010. Functional ANOVA and regional climate
Kang, E. L., Cressie, N., Sain, S.R., 2012. Combining outputs from the North American experiments: a statistical analysis of dynamic downscaling. Environmetrics 22,
regional climate change assessment program by using a Bayesian hierarchical 700–711.
model. Journal of the Royal Statistical Society, Series C (Applied Statistics) 61, Salazar, E., Sansó, B., Finley, A., Hammerling, D., Steinsland, I., Wang, X., Delamater, P.,
doi:10.1111/j.1467-9876.2011.01010.x. 2011. Comparing and blending regional climate model predictions for the Amer-
Kang, E.L., Cressie, N., Shi, T., 2010. Using temporal variability to improve spatial ican southwest. Journal of Agricultural, Biological, and Environmental Statistics
mapping of satellite data. Canadian Journal of Statistics 38, 271–289. 16, 586–605.
Kaufman, C.G., Sain, S.R., 2010. Bayesian functional ANOVA modeling using Gaussian Shi, T., Cressie, N., 2007. Global statistical analysis of MISR aerosol data: a massive
process prior distributions. Bayesian Analysis 5, 123–150. data product from NASA’s Terra satellite. Environmetrics 18, 665–680.
Lindgren, F., Rue, H., Lindström, J., 2011. An explicit link between Gaussian fields Smith, R.L., Tebaldi, C., Nychka, D., Mearns, L.O., 2009. Bayesian modeling of uncer-
and Gaussian Markov random fields: the stochastic partial differential equation tainty in ensembles of climate models. Journal of the American Statistical
approach. Journal of the Royal Statistical Society Series B 73, 423–498. Association 104, 97–116.
Lopes, H.F., Salazar, E., Gamerman, D., 2008. Spatial dynamic factor analysis. Bayesian Solomon, S., Qin, D., Manning, M., Chen, Z., Marquis, M., Averyt, K.B., Tignor, M.,
Analysis 3, 759–792. Miller, H.L., 2007. Climate Change 2007: The Physical Science Basis. Contribution
Mearns, L.O., Gutowski, W.J., Jones, R., Leung, L.Y., McGinnis, S., Nunes, A.M.B., Qian, of Working Group I to the Fourth Assessment Report of the Intergovernmental
Y., 2009. A regional climate change assessment program for North America. Eos. Panel on Climate Change (IPCC). Cambridge University Press, UK and New York,
Transactions, American Geophysical Union 90, 311–312. NY, USA.
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E., 1953. Tebaldi, C., Smith, R.L., Nychka, D., Mearns, L.O., 2005. Quantifying uncertainty in
Equation of state calculations by fast computing machines. Journal of Chemical projections of regional climate change: a Bayesian approach to the analysis of
Physics 21, 1087–1091. multimodel ensembles. Journal of Climate 18, 1524–1540.
Nakicenovic, N., Alcamo, J., Davis, G., de Vries, B., Fenhann, J., Gaffin, S., Gregory, K., Xue, Y., Vasic, R., Janjic, Z., Mesinger, F., Mitchell, K.E., 2007. Assessment of dynamic
Grubler, A., Jung, T.Y., Kram, T., et al.,2000. Special Report on Emissions Scenarios: downscaling of the continental US regional climate using the Eta/SSiB Regional
A Special Report of Working Group III of the Intergovernmental Panel on Climate Climate Model. Journal of Climate 20, 4172–4193.
Change. Technical Report. Environmental Molecular Sciences Laboratory, Pacific Zhao, Y., Staudenmayer, J., Coull, B.A., Wand, M.P., 2006. General design Bayesian
Northwest National Laboratory, Richland, WA, USA. generalized linear mixed models. Statistical Science 21, 35–51.
Sain, S.R., Furrer, R., 2010. Combining climate model output via model correlations.
Stochastic Environmental Research and Risk Assessment 24, 821–829.