
Received: 22 June 2023 | Accepted: 8 November 2023

DOI: 10.1111/2041-210X.14259

RESEARCH ARTICLE

Generative spatial generalized dissimilarity mixed modelling (spGDMM): An enhanced approach to modelling beta diversity

Philip A. White1,2 | Henry A. Frye3 | Jasper A. Slingsby4,5 | John A. Silander Jr3 | Alan E. Gelfand6

1Berry Consultants, Austin, Texas, USA; 2Department of Statistics, Brigham Young University, Provo, Utah, USA; 3Department of Ecology & Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA; 4Department of Biological Sciences and Centre for Statistics in Ecology, Environment and Conservation, University of Cape Town, Cape Town, South Africa; 5Fynbos Node, South African Environmental Observation Network, Cape Town, South Africa and 6Department of Statistical Science, Duke University, Durham, North Carolina, USA

Correspondence
Philip A. White
Email: philip@berryconsultants.net

Funding information
National Science Foundation, Grant/Award Number: DEB-1046328; NASA Earth and Space Science and Technology (FINESST), Grant/Award Number: 80NSSC20K1659; NASA BioSCape Award, Grant/Award Number: 80NSSC22K1383; National Research Foundation, Grant/Award Number: 150926, 142438 and 118593

Handling Editor: Robin Boyd

Abstract

1. Turnover, or change in the composition of species over space and time, is one of the primary ways to define beta diversity. Inferring what factors impact beta diversity is not only important for understanding biodiversity processes but also for conservation planning. At present, a popular approach to understanding the drivers of compositional turnover is through generalized dissimilarity modelling (GDM). We argue that the current GDM approach suffers several limitations and provide an alternative modelling approach that remedies these issues.
2. We propose using generative spatial random effects models implemented in a Bayesian framework. We offer hierarchical specifications to yield full regression and spatial predictive inference, both with associated full uncertainties. The approach is illustrated by examining dissimilarity in three datasets: tree survey data from Panama's Barro Colorado Island (BCI), plant occurrence data from southwest Australia and plant abundance surveys from the Greater Cape Floristic Region (GCFR) of South Africa. We select a best model using out-of-sample predictive performance.
3. We find that the form of the best model differs across the three datasets, but our models provide performance ranging from comparable to significant improvement over GDMs. Within the GCFR, the spatial random effects play a more important role in the modelling than all the environmental variables.
4. We have proposed a model that provides several improvements to the current GDM framework. This includes advantages such as a flexible spatially varying mean function, spatial random effects that capture dependence unaccounted for by explanatory variables, and spatially heterogeneous variance structure. All these features are offered in a model that can adequately handle a large incidence
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium,
provided the original work is properly cited.
© 2023 The Authors. Methods in Ecology and Evolution published by John Wiley & Sons Ltd on behalf of British Ecological Society.

214 | 
wileyonlinelibrary.com/journal/mee3 Methods Ecol Evol. 2024;15:214–226.

2041210x, 2024, 1, Downloaded from https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.14259 by CAPES, Wiley Online Library on [11/01/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
WHITE et al. 215

of total dissimilarity through 'one-inflation', as would be expected from highly biodiverse areas with steep turnover gradients.

KEYWORDS
biodiversity, coherent/generative models, hierarchical models, model comparison and
validation, spatial random effects, warping functions

1 | INTRODUCTION

Change in the composition of biotic assemblages in space and time is important for revealing the ecological processes that structure and maintain biodiversity spatially across landscapes, detecting biodiversity loss or change in the composition of assemblages, or tracking global biodiversity change (Ferrier et al., 2020; Hoskins et al., 2020). Such turnover is referred to as beta diversity and has been applied to taxonomic, functional, phylogenetic and spectral measures of diversity (see, e.g. Graham & Fine, 2008; Schweiger & Laliberté, 2022; Whittaker, 1960). Unfortunately, modelling beta diversity presents many unique challenges and, while several methods have been proposed, there are many issues that remain to be overcome. Here we reformulate the generalized dissimilarity modelling (GDM) method of Ferrier (2002) and Ferrier et al. (2007) and propose several significant advances that should aid modelling of beta diversity and the many applications it supports.

While beta diversity can be expressed as a single metric representing variation across multiple samples in a region (Whittaker, 1960), it is more commonly expressed as differences in composition between pairs of samples (Anderson et al., 2011). Modelling beta diversity based on pairwise analyses requires a distance-based statistical framework. Several approaches have been proposed (Anderson et al., 2011), though each has its limitations or flawed assumptions. Early efforts include the Mantel test of correlation between distance matrices (Legendre, 1993), ordination of assemblages based on their pairwise compositional differences, for example, non-metric multidimensional scaling (NMDS; Prentice, 1977), and linear matrix regression (Manly, 1986), where the level of difference between pairs of assemblages is related linearly to differences in environment or space.

Unfortunately, these properties are often not linear because: (i) Most ecological dissimilarity measures are constrained to range from 0, when assemblages are identical, to saturating at a maximum value of 1, once pairs of assemblages are completely different. In biodiverse regions, datasets may contain a high proportion of 1s, which are not adequately explained by models based on standard distributional assumptions. This is akin to the issue of 0-inflation commonly faced in population and community modelling (Blasco-Moreno et al., 2019). (ii) The rate of change in assemblage composition in relation to environmental variables can vary nonlinearly along a gradient (Fitzpatrick et al., 2013; Oksanen & Tonteri, 1995), which necessitates modelling explicitly in a spatial context (Ferrier et al., 2007).

Responding to issues of nonlinearity in relating beta diversity to environmental gradients, Ferrier and colleagues developed GDM (Ferrier, 2002; Ferrier et al., 2004, 2007). This approach was argued as an extension of generalized linear modelling of pairwise distances and has become increasingly popular for analysing compositional turnover (Mokany et al., 2022). GDM underpins the calculation of several Global Biodiversity Change Indicators proposed by the Group on Earth Observations Biological Observation Network (GEO BON), which are used to support global initiatives like the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) and the Convention on Biological Diversity's (CBD) Post-2020 Global Biodiversity Framework (Ferrier et al., 2020; Hoskins et al., 2020). Unfortunately, GDM itself retains some flaws and limitations, which could have consequential implications for global biodiversity monitoring and management. Here, we highlight issues with the GDM method and then propose and implement a richer, generative modelling approach that is fully spatial. For us, generative means coherent, that is, a process specification which could have generated the observed data.

1.1 | A review and critique of generalized dissimilarity modelling

Generalized dissimilarity modelling treats between-site dissimilarities (beta diversity measures) as the response of interest. In the paper, we also use GDM to represent generalized dissimilarity model. For two sites, between-site dissimilarities are calculated as a function of community composition, considering differences in the presence/absence of taxa (at some taxonomic level) and, for some metrics, differences in their relative abundance. The GDM imagines that dissimilarities are explained by differences between monotonically warped environmental variables; the warping enables differences between environmental variables to relate nonlinearly to dissimilarity. Following Ferrier et al. (2007), GDM describes dissimilarities using a linear combination of differences in monotone transformations of a vector of site-level environmental covariates, the latter denoted by $X(s)$ where $s \in \mathcal{D}$, the study domain of interest. In addition, they may employ a monotone transformation of spatial distance to account for additional dissimilarity that is not effectively explained by environmental variables.

Adopting a dissimilarity measure for two sites, $Z(s, s')$, the GDM does the following. It proposes to model a transformed/predicted ecological mean distance, $\eta$ on $\mathbb{R}^+$, and then links it to the interval

$[0, 1]$ through $\mu = 1 - e^{-\eta}$. It further suggests a scaled variance function proportional to $\mu(1 - \mu)$. With this link and variance function, iteratively re-weighted least squares (IRLS) fitting is implemented to infer the parameters in the specification of $\eta$. The proposed form for $\eta(s, s')$ is

$$\eta(s, s') = \beta_0 + h(d(s, s')) + \sum_{p=1}^{P} \lvert f_p(X_p(s)) - f_p(X_p(s')) \rvert \tag{1}$$

where $h(\cdot)$ and $f_p(\cdot)$ are unique monotone increasing functions for distance and all the environmental covariates respectively. Ferrier et al. (2007) specify $h(\cdot)$ and $f_p(\cdot)$ using I-spline basis functions with positive coefficients, to ensure that the transformation of the environmental covariate is monotone (Ramsay, 1988). The number and choice of I-spline basis functions are supplied for each of the regressors individually, before the IRLS fitting.

There are several features of this modelling/fitting approach that raise cause for concern.

• First, there is no formal likelihood for the data. Fitting, using the proposed link and variance function, applies an ad hoc optimization criterion. It corresponds to employing a weighted version of a normal likelihood for data that is clearly not normally distributed. More concerning is that it ignores the evident dependence between the sites, or Z's, for example, between $Z(s, s')$ and $Z(s, s'')$.
• Second, the choice of a binomial variance function seems to be motivated by thinking about $Z(s_i, s_j)$ arising as some sort of proportion associated with a sum of independent and identically distributed (i.i.d.) Bernoulli trials. Even if the ecological dissimilarity measure is based on presence/absence data, there are no such trials to imagine such a variance function. When applied to abundance data, for example, per cent plant cover in a sample site, such a variance function is clearly inappropriate. Furthermore, GDM assumes that variance in dissimilarity behaves proportionally to $\mu(1 - \mu)$, equivalently in terms of the transformed ecological distance, $\eta$. We explore whether this assumption is appropriate below. We note that Ferrari and Cribari-Neto (2004) and Ferrier et al. (2007) suggest that other variance functions associated with distributions on $[0, 1]$ could be investigated. This would not remedy the concern regarding the absence of a likelihood.
• Third, bootstrapping uncertainty is tenuous, even if the proposed optimization criterion were suitable, because it assumes independent observations and clearly $Z(s_i, s_j)$ and $Z(s_i, s_k)$ are not independent. However, such dependence across sites can be envisaged through structured spatial dependence.
• Fourth, the GDM ignores the need for a point mass at 1, that is, the one-inflation problem. In our species-level dataset below, essentially half of the dissimilarities are equal to 1. These pairs cannot be ignored in the model fitting and modelling them to be less than 1 with probability 1 denies the reality of the dissimilarity process realization over $\mathcal{D} \times \mathcal{D}$. One might suggest deleting the perfect dissimilarities. This can certainly introduce sampling bias. Furthermore, apart from losing a potentially large amount of informative data (≈ 50% in our GCFR species-level data), it is unclear how to do this. If one deletes a $Z_{ij}$ because it is exactly 1, in principle should one delete site $i$, that is, all $Z_{ij'}$, $j' \neq j$? Similarly, should one delete site $j$, that is, all other $Z_{i'j}$, $i' \neq i$? With a high incidence of 1s, if one implemented this over all of the 1s, likely little data would remain.
• Finally, the modelling is not generative and could not have produced the dissimilarities we observe. That is, the implicit likelihood being optimized is inappropriate and is not a likelihood for the data. Given the strong interest in illuminating the complex processes of biodiversity, there is a need for models that extend beyond a patchwork of statistical pieces. We would also note that adoption of such regression modelling necessitates investigation of model adequacy and model comparison, not currently considered with the GDM.

1.2 | A proposed remedy: Generative spatial generalized dissimilarity mixed modelling

Given the criticisms listed in the previous section, we propose an alternative hierarchical Bayesian modelling framework to remedy these criticisms. We refer to our proposed approach as generative spatial generalized dissimilarity mixed modelling (spGDMM). In this paper, we also use spGDMM to represent spatial generalized dissimilarity mixed model. The main aspects of our contribution are:

(i) for study region, $\mathcal{D}$, we offer fully spatial modelling for the dissimilarity $Z(s, s')$ between $s$ and $s'$ with $(s, s') \in \mathcal{D} \times \mathcal{D}$. That is, conceptually, the latter is the space over which pairwise dissimilarity exists. This requires introduction of novel spatial random effects modelling, with attractive interpretation, added to the mean regression term. Furthermore, these random effects enable us to capture the evident probabilistic dependence between, for example, $Z(s, s')$ and $Z(s, s'')$. Additionally, we demonstrate the need for a flexible heterogeneity of variance specification for $Z(s, s')$;
(ii) our approach enables spatial interpolation of dissimilarity. In principle, this can be done with the GDM approach. However, prediction incorporating spatial dependence performs better, as we demonstrate through kriging, holding out pairs of sites. In this regard, we offer novel out-of-sample model validation and comparison;
(iii) we offer a formal stochastic treatment for the incidence of 1s, employing right truncation to provide a point mass at 1. The point mass is induced through transformation of a latent dissimilarity; and
(iv) we do all the above through specification of an explicit hierarchical model. Such modelling, fitted within a Bayesian framework, enables us to obtain full inference and uncertainty. Furthermore, such modelling is generative, and thus equivalently, coherent, that is, it prescribes a probabilistic specification which can generate the dissimilarity data we observe. Importantly, it can generate perfect dissimilarities.
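As a concrete reading of the GDM specification reviewed in Section 1.1, the following minimal Python sketch evaluates the linear predictor of expression (1), the link $\mu = 1 - e^{-\eta}$ and the variance function $\mu(1 - \mu)$ for a single pair of sites. The warping functions, coefficient values and function names are hypothetical stand-ins chosen for illustration (simple monotone functions rather than fitted I-splines); this is not the 'gdm' package implementation.

```python
import numpy as np

def gdm_eta(d, x_s, x_t, beta0, h, f_list):
    """Linear predictor of expression (1) for one pair of sites.

    d      : distance between the two sites
    x_s    : covariate values at site s
    x_t    : covariate values at site s'
    h      : monotone increasing warping of distance (assumed form)
    f_list : one monotone increasing warping per covariate (assumed forms)
    """
    env = sum(abs(f(a) - f(b)) for f, a, b in zip(f_list, x_s, x_t))
    return beta0 + h(d) + env

def gdm_mean(eta):
    """GDM link: maps eta >= 0 to a mean dissimilarity 1 - exp(-eta)."""
    return 1.0 - np.exp(-eta)

def gdm_variance_weight(mu):
    """Binomial-type variance function, proportional to mu * (1 - mu)."""
    return mu * (1.0 - mu)

# Hypothetical warpings, used here only as stand-ins for fitted I-splines.
h = lambda d: 0.01 * d
f_list = [np.sqrt, lambda x: 0.5 * x]

eta = gdm_eta(d=50.0, x_s=[4.0, 1.0], x_t=[9.0, 3.0],
              beta0=0.1, h=h, f_list=f_list)
mu = gdm_mean(eta)
print(f"eta = {eta:.3f}, mu = {mu:.4f}, variance weight = {gdm_variance_weight(mu):.4f}")
```

In this sketch the mean approaches 1 only as $\eta$ grows without bound, which illustrates why the link by itself places no point mass on perfect dissimilarities.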

2 | MATERIALS AND METHODS

2.1 | Data and dissimilarity

We fit the spGDMM to three datasets. To anchor our analysis, two of these datasets are prominent and accessible examples that are well described in the GDM literature. The first and primary dataset we analyse is from van der Merwe et al. (2008a, 2008b) consisting of per cent cover data of all plant species at 413 sites in the Hantam–Tanqua–Rogeveld subregion of South Africa's Greater Cape Floristic Region (GCFR; Figure 1). Given that these data have not been described in the context of GDMs, we describe the ecological aspects of the data in Supporting Information A. Since the level of species turnover is particularly high, we also model the rates of turnover at the family level. The sites contain 288 species, 144 genera and 52 families. For each site $s$, we observe a vector of per cent cover for each taxonomic group (species or family) $Y(s)$.

For 413 sites, we have $\binom{n}{2} = 85{,}078$ pairwise dissimilarities $Z(s, s')$, observed in $\mathcal{D}$. Specifically, we use Bray–Curtis (BC) dissimilarity on the available per cent cover data. Because per cent cover is a continuous variable, the BC dissimilarity is

$$Z(s, s') = \frac{\sum_j \lvert Y_j(s) - Y_j(s') \rvert}{\sum_j \lvert Y_j(s) + Y_j(s') \rvert}. \tag{2}$$

For $Z(s, s') = 0$, $Y_j(s) = Y_j(s')$ for all $j$. For $Z(s, s') = 1$, locations $s$ and $s'$ must have no species (or families) in common. We note that the Panama and southwest Australia data have a low proportion of locations with complete dissimilarity, while this is quite prevalent in the South African dataset, particularly at the species level.

The computed dissimilarity depends on the taxonomic scale considered. If computing the dissimilarity at species scale, then $Z(s, s')$ arises as a sum over 288 terms. At family scale, we sum over 52 terms. In this manuscript, we focus our discussion on dissimilarities calculated at family level; however, we also include some results for a species-level analysis and present more detailed results in Supporting Information E.

We plot the family- and species-level dissimilarities against the spatial distance between sites in Figure 2. In addition, we plot mean dissimilarity and the proportion of dissimilarities equal to 1 as a function of distance. At family scale, 4.6% of the observed dissimilarities are exactly 1. This increases to 49% at the species scale, frequently occurring across all spatial distances. In fact, mean dissimilarity is very large even at short distances. An appropriate generative model for data like this must include a point mass at 1.

We selected a set of seven explanatory variables for the GCFR data that captured a range of climatic, topographic and edaphic features that could likely drive turnover in taxa. These variables are described in further detail in Supporting Information A. For the Panama and southwest Australia data, we use explanatory variables that were provided with datasets described by Ferrier et al. (2007) and Fitzpatrick et al. (2013) respectively.

The second dataset consists of 39 survey sites of 225 rainforest tree species from Panama's Barro Colorado Island (BCI), initially described by Condit et al. (2002) and presented in the GDM context by Ferrier et al. (2007). We use only 39 of the 50 sites because 11 are not geocoded. The third dataset consists of occurrence data for 856 plant species across 94 locations in southwest Australia. This is the example dataset provided in the 'gdm' R Package (Fitzpatrick et al., 2022a) which is a subset of the data described by Fitzpatrick et al. (2013).
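The Bray–Curtis calculation in expression (2) can be sketched in a few lines of Python. The cover matrix below is a small made-up example, not the GCFR data, and the function names are ours; the sketch assumes that no pair of sites is empty in both rows (otherwise the denominator would be zero).

```python
import numpy as np

def bray_curtis(y_s, y_t):
    """Bray-Curtis dissimilarity of expression (2) for two cover vectors."""
    y_s = np.asarray(y_s, dtype=float)
    y_t = np.asarray(y_t, dtype=float)
    return np.abs(y_s - y_t).sum() / np.abs(y_s + y_t).sum()

def pairwise_bray_curtis(Y):
    """Return site indices (i, j) and the n*(n-1)/2 dissimilarities
    for an (n sites x taxa) matrix of per cent cover."""
    n = Y.shape[0]
    i, j = np.triu_indices(n, k=1)  # all unordered pairs with i < j
    z = np.array([bray_curtis(Y[a], Y[b]) for a, b in zip(i, j)])
    return i, j, z

# Toy per cent cover data: 4 sites x 3 taxa (illustrative values only).
Y = np.array([[10.0, 5.0, 0.0],
              [10.0, 5.0, 0.0],   # identical to site 0 -> dissimilarity 0
              [0.0, 0.0, 20.0],   # shares no taxa with sites 0, 1 -> dissimilarity 1
              [5.0, 5.0, 5.0]])
i, j, z = pairwise_bray_curtis(Y)
for a, b, val in zip(i, j, z):
    print(f"Z(s{a}, s{b}) = {val:.3f}")

# Number of pairs for the 413 GCFR sites, matching the count in the text.
n_pairs_gcfr = 413 * 412 // 2
```

The toy matrix reproduces the two boundary cases described above: identical composition gives 0, and no shared taxa gives exactly 1.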

3 | OUR MODELLING APPROACH

3.1 | Model specifications

We model $Z(s_i, s_j)$, the dissimilarity between sites $s_i$ and $s_j$, explicitly through $Z(s_i, s_j) = \min\left(1, e^{V(s_i, s_j)}\right)$, where

$$V(s_i, s_j) \sim N\left(\mu(s_i, s_j) + \eta(s_i, s_j),\ \sigma^2(s_i, s_j)\right). \tag{3}$$

So, we immediately obtain a point mass at 1 for the distribution of $Z(s_i, s_j)$ whenever $V(s_i, s_j) \geq 0$, and $Z(s_i, s_j) \in (0, 1)$ when $V(s_i, s_j) < 0$. That is, we adopt a version of the usual Tobit regression setting, introducing $V(s_i, s_j)$ as a 'latent dissimilarity' on $\mathbb{R}^1$.
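The way the truncation in expression (3) induces a point mass at 1 can be checked with a short simulation. This is a minimal sketch for a single pair of sites with a fixed latent mean and standard deviation; the numerical values and function names are assumptions chosen for illustration, not estimates from the paper. Since $Z = 1$ exactly when $V \geq 0$, the probability of a perfect dissimilarity is $\Phi(m/\sigma)$, where $m$ denotes the latent mean $\mu + \eta$.

```python
import numpy as np
from math import erf, sqrt

def simulate_dissimilarity(mean, sd, size, rng):
    """Draw Z = min(1, exp(V)) with V ~ N(mean, sd^2), as in expression (3)."""
    v = rng.normal(mean, sd, size=size)
    return np.minimum(1.0, np.exp(v))

def prob_one(mean, sd):
    """P(Z = 1) = P(V >= 0) = Phi(mean / sd), using the standard normal cdf."""
    return 0.5 * (1.0 + erf(mean / (sd * sqrt(2.0))))

rng = np.random.default_rng(0)
# Hypothetical latent mean (mu + eta) and standard deviation for one pair.
z = simulate_dissimilarity(mean=-0.2, sd=0.5, size=200_000, rng=rng)
share_ones = np.mean(z == 1.0)

print(f"simulated share of exact 1s: {share_ones:.4f}")
print(f"theoretical P(Z = 1):        {prob_one(-0.2, 0.5):.4f}")
```

The simulated values lie in $(0, 1]$, and the share of exact 1s tracks $\Phi(m/\sigma)$, so a larger latent mean or variance yields more perfect dissimilarities without a separate inflation component.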
FIGURE 1 Location of sites in the Greater Cape Floristic Region in South Africa overlaid on a map of biome boundaries. Biome boundaries are derived from South African National Biodiversity Institute (2006).

FIGURE 2 Two-dimensional histogram of the Bray–Curtis dissimilarities plotted against distance between sites, calculated at (Left) family scale and (Right) species scale for data within the Greater Cape Floristic Region. The cyan line marks the mean BC dissimilarity over 25 km bins, and the green line plots the proportion of BC dissimilarities equal to 1 over 25 km bins. Axes: Distance (km) against Bray–Curtis dissimilarity; colour scale: Count.

Whenever the latent dissimilarity becomes large enough ($\geq 0$) the observed dissimilarity becomes 1. Unlike many inflation approaches, such as hurdle models or 0- and one-inflated beta models (Ospina & Ferrari, 2012), the point mass is driven solely by the dissimilarity model rather than a logistic regression model for each

point mass. As a result, our model provides coherence for the clear one-inflation present in observed BC dissimilarities. An alternative link function that would allow both 0- and one-inflation, if needed, is mentioned in Supporting Information C.1.

So, our modelling effort is to consider specifications for $\mu(s_i, s_j)$, $\eta(s_i, s_j)$ and $\sigma^2(s_i, s_j)$. Our primary contributions come through our specifications of $\eta(s_i, s_j)$ and $\sigma^2(s_i, s_j)$. In particular, for $\mu(s_i, s_j)$ we consider the form

$$\mu(s_i, s_j) = \beta_0 + \beta_1 h\left(\lVert s_i - s_j \rVert\right) + \sum_k \alpha_k \lvert f_k(X_k(s_i)) - f_k(X_k(s_j)) \rvert, \tag{4}$$

analogous to expression (1). The parameter $\beta_0$ represents the baseline log-dissimilarity. That is, it determines the mean of $V(s_i, s_j)$ when $s_i = s_j$ (see below and Supporting Information B.1 for further discussion regarding $\beta_0$). Here, $h(\cdot)$ will be an increasing function, arguing that, with $\beta_1 > 0$, transformed dissimilarity increases with distance between sites. Here, the $X_k$'s are covariates with $f_k(\cdot)$ increasing, providing the 'warping' function for covariate $X_k$.

As in Ferrier et al. (2007), we adopt cubic I-spline basis functions to specify $h(\cdot)$ and $f_k(\cdot)$ (Ramsay, 1988), each with two interior knots at the 33% and 67% of the predictor. However, we use a reparametrization of the I-spline coefficients so that the warping functions $h$ and $f_k$ map to $[0, 1]$ for all $k$. This standardizes the predictors, making all variables comparable through $\alpha_k$ and $\beta_1$. When interpreting our model output, $\beta_1$ and the $\alpha_k$ parameters become variable importance parameters. Specifically, we express $h(x) = \sum_{j=1}^{5} \beta_{h,j} I_j(x)$ and $f_k(x) = \sum_{j=1}^{5} \beta_{f_k,j} I_j(x)$, where $I_j(x)$ are I-spline bases, $\sum_j \beta_{h,j} = 1$, $\sum_j \beta_{f_k,j} = 1$, and $\beta_{h,j}, \beta_{f_k,j} > 0$. Together, these constraints make $h(\cdot)$ and $f_k(\cdot)$ monotone increasing and bounded in $[0, 1]$. Although our specification of $\mu(s_i, s_j)$ is analogous to that which is often proposed for GDMs (see, e.g. Ferrier et al., 2007), the variable importance parameters add interpretability to the GDM framework. In sufficiently rich datasets, $h$ and/or $f_k$ could be spatially varying and estimated by combining approaches in White et al. (2021, 2022).

The role of $\eta(s_i, s_j)$ is twofold. First, as in usual spatial modelling, we are adding a spatial random effect to provide adjustment to the mean, to potentially capture the difference effects of unmeasured variables. That is, as we specify $\eta(s, s')$, it directly takes the form of a non-negative difference in a latent covariate and enters the expression for $V(s_i, s_j)$ as a covariate would. In a different view, it provides pairwise site adjustment to the global intercept $\beta_0$. It is clearly different from $h(\lVert s_i - s_j \rVert)$, which is a fixed effect as a function of distance between sites. However, it is not a usual Gaussian process (GP) adjustment in the 'local' sense since it is a function on $\mathbb{R}^2 \times \mathbb{R}^2$. In fact, we consider three choices for $\eta$ (and employ model comparison below):

• $\eta(s, s') = 0$, no spatial random effect
• $\eta(s, s') = (\psi(s) - \psi(s'))^2$, the squared difference between normals, suggesting a chi-square type of process
• $\eta(s, s') = \lvert \psi(s) - \psi(s') \rvert$, the absolute difference between normals, suggesting a folded (or half) normal type of process

Continuing, the $\eta(s, s')$ process, with two arguments, is more challenging than customary spatial random effects with a single spatial argument. It provides spatial random effects over two spatial locations and is not a function of the distance between the two locations. The interpretation as a difference, squared or absolute, between a Gaussian process realization $\psi(s)$ and $\psi(s')$, in terms of adjusting the dissimilarity on the $V(s, s')$ scale is intuitively clear. As an adjustment, capturing potentially unmeasured or unobserved predictors, it must be positive; it can only increase expected dissimilarity as with the fixed effects in the mean. In different words, since a dissimilarity is a function of two spatial arguments, so must a spatial random effect in the mean (on a transformed scale). As an adjustment, presumably, the closer the spatial locations, the smaller the expected adjustment. However, for a pair of locations at a given distance apart, this adjustment need not be the same as we move around the study region. The Gaussian process and the 'difference' process that it induces enable this.

For the second and third specifications above, $\eta(s, s')$, as a function of a GP, is a realization of a stochastic process. This provides the second role for the $\eta$'s. They create expected dependence between the V's and, hence, the Z's. In fact, using the second and third specifications, we can explicitly calculate the covariance between $V(s_1, s_2)$ and $V(s_3, s_4)$ and, further, between $Z(s_1, s_2)$ and $Z(s_3, s_4)$ (see Supporting Information C.2). No dependence is introduced in the absence of the $\eta$'s. Furthermore, the V's, hence the Z's, are conditionally independent given

the 𝜂's, enabling us to write the likelihood over the Z's as a product form We defer this calculation to Supporting Information C.1 where we also
( ( ) ( ))
which is convenient for model fitting. In Supporting Information C.2, show the behaviour of E Z s, s’ | 𝜂 s, s’ . The calculation to obtain cov
( ( ) ( ))
we digress technically to briefly consider properties arising under the Z s1 , s2 , Z s3 , s4 requires the bivariate normal cdf and becomes too
( )
foregoing specification of the random effect, 𝜂 s, s′ in the model for messy to be useful.
( ) ( ) ( ( ))
Z s, s′ . We remark that, with 𝜂 s, s� = g 𝜓(s) − 𝜓 s� , 𝜂 is not a sta- We use weakly informative prior distributions for all model
tionary function. parameters. To fit the model we use Markov chain Monte Carlo
( )
We will see in Section 4 that the choice of 𝜂 s, s′ depends on (MCMC) methods to sample from the posterior distribution 𝜋(𝜽| Z) of
the taxonomic scale at which we calculate the BC dissimilarity. all model parameters 𝜽 conditioned on all dissimilarities Z. Because
We prefer the squared form at the species level but the absolute the posterior conditional distribution of 𝜎 2𝜓 is an inverse gamma dis-
( )
form at the family level. Furthermore, 𝜂 si , sj introduces the dif- tribution, we carry out a Gibbs update for this parameter. For further
ference between a Gaussian process realization at si and at sj. So, details regarding priors, model fitting and prediction estimation see
𝜂(s, s) = 0 with probability 1. We discuss this point in Supporting Supporting Information B.1. Code and data are provided at https://​
Information C.2. zenodo.​org/​recor​ds/​10091442 (White, 2023).
( )
We also consider three choices for 𝜎2 si , sj to capture variance
patterns suggested by Supporting Information A.2:
3.2 | Model comparison
• 𝜎 2 constant, the homogeneous variance case
� � ��
• log 𝜎 2 s, s� = g( ‖ s − s ‖ ), with g a random function, giving a het- To assess the performance of spGDMM, model comparison is accom-
erogeneous variance as a function of distance. In this case, we plished through 10-fold cross-validation on all three datasets. To do
let g( ⋅ ) be a cubic polynomial in distance. There is an intuitive ar- the cross-validation, we randomly distribute all sites into 10 groups of
gument that the variance of the difference should increase as a nearly equal size, and, in each stage of the cross-validation, we hold out
function of distance (in the spirit of a variogram). However, this in- the 10% of sites associated with each partition and all dissimilarities
tuition is contradicted by the patterns in Figure A.1 of Supporting associated with those sites. After fitting the model to dissimilarities
Information A. Thus, we let g( ⋅ ) be unconstrained. associated with the training subset of sites, we predict the hold-out
( ( )) ( ( ))
• Finally, we consider log 𝜎 2 s, s� = g 𝜇 s, s� where now g is a dissimilarities. We repeat this procedure 10 times so that each site
( ′)
function of 𝜇 s, s . Again, g( ⋅ ) is estimated, in the form of a cubic in our dataset is held out exactly once. Because each dissimilarity is
polynomial. In preliminary modelling, we found that this formula- associated with two sites, every dissimilarity is held out as test data
tion is only estimated effectively with large datasets, so we only twice. We average predictive performance across the 10 experiments
present it for the South Africa data. using the criteria discussed below.
Ultimately, we choose the final model specification through cross-validation (Section 3.2).

In summary, our models are more complex than the customary GDM. However, in order to remedy our concerns with the latter, our extension has only added the novel spatial random effects and more flexible variance structure. Our proposed spGDMM can be used for customary regression inference and for prediction, as in Ferrier et al. (2020) and Hoskins et al. (2020). With prediction as a primary target, our model selection criteria are motivated by performance in the data space, that is, in out-of-sample predictive performance. Criteria such as AIC, BIC or WAIC judge performance in parameter space. Ultimately, we choose across a set of model specifications through cross-validation (Section 3.2). We do not employ leave-one-out (loo) validation; it is a 'simple' version of the cross-validation that we do. We learn more about model performance if we hold out a larger set and replicate.

To compare models, we primarily use the (continuous) ranked probability score (see Brown, 1974; Matheson & Winkler, 1976, for early discussion), which we abbreviate as CRPS, because it is a strictly proper scoring rule (Gneiting & Raftery, 2007). Although we use the abbreviation CRPS, we emphasize that we are working with a mixture of continuous values and a point mass at 1 (see Tang et al., 2023, for similar discussion). Thus, strictly, this is not a continuous ranked probability score. In practice, however, we estimate the CRPS using the empirical CDF of the predictions, where each prediction is associated with each posterior sample from our MCMC model fitting (Krüger et al., 2021). For ease of comparison with the GCFR data, we divide all CRPS values by the smallest CRPS to give 'relative CRPS', which we abbreviate as rCRPS. In addition to CRPS, we offer root mean squared error (RMSE) and mean absolute error (MAE) as out-of-sample predictive criteria because they allow us to compare our models to those proposed by Ferrier et al. (2007).
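As a rough illustration of the sample-based CRPS and the rCRPS rescaling described above, the sketch below scores the empirical CDF of a set of posterior predictive draws against one observed dissimilarity. It uses the standard identity CRPS(F, y) = E|X − y| − ½ E|X − X′| applied to the empirical distribution. The function names are hypothetical, and this is not claimed to be the estimator used in the authors' code.

```python
def crps_from_samples(samples, y):
    """CRPS of the empirical CDF of `samples`, evaluated at observation y.

    Uses mean|x_i - y| - (1 / (2 m^2)) * sum_ij |x_i - x_j|, which also
    applies when some draws sit exactly at the point mass at 1.
    """
    m = len(samples)
    term1 = sum(abs(x - y) for x in samples) / m
    term2 = sum(abs(a - b) for a in samples for b in samples) / (2 * m * m)
    return term1 - term2


def relative_crps(crps_by_model):
    """Divide every model's CRPS by the smallest CRPS (rCRPS)."""
    best = min(crps_by_model.values())
    return {name: value / best for name, value in crps_by_model.items()}


# Two draws at 0 and 1 scored against an observation of 0.5.
example_score = crps_from_samples([0.0, 1.0], 0.5)

# Rounded family-level CRPS values for Models 1 and 6 from Table 1.
example_rcrps = relative_crps({"Model 1": 0.1104, "Model 6": 0.0971})
```

Because the inputs here are the rounded CRPS values, the resulting ratios can differ from the reported rCRPS in the last digit.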
Under these specifications, we can calculate the expected dissimilarity, $E(Z(s, s'))$, and $\mathrm{var}(Z(s, s'))$ given $\eta(s, s')$. For the mean, we obtain

$$E\big(Z(s, s') \mid \eta(s, s')\big) = \Phi\left(\frac{\mu(s, s') + \eta(s, s')}{\sigma(s, s')}\right) + e^{(\mu(s, s') + \eta(s, s')) + \sigma^2(s, s')/2}\, \Phi\left(-\frac{\mu(s, s') + \eta(s, s')}{\sigma(s, s')} - \sigma(s, s')\right). \tag{5}$$

For the GCFR data, we also include a summary of how well the model is capturing the exact 1s in the data. Specifically, we calculate the proportion of 1s predicted correctly (PPC) in the data using the posterior median of predictions. If we let $\tilde{Z}(s, s')$ be the posterior predictive median for an unobserved $Z(s, s')$ and $n_1$ be the number of ones in the test dataset, then $\mathrm{PPC} = \frac{1}{n_1} \sum_i \sum_j \mathbf{1}\big(Z(s_i, s_j) = 1, \tilde{Z}(s_i, s_j) = 1\big)$, for all sites $i$ in the test set, while $j$ sums over all sites. This PPC criterion enables
2041210x, 2024, 1, Downloaded from https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.14259 by CAPES, Wiley Online Library on [11/01/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
examination of how well the model predicts exact 1s. The standard GDM cannot predict exact 1s, so it always has a PPC of 0.

Using the model comparison approach in Section 3.2, we compare all combinations of $\eta(s, s')$ and $\sigma^2(s, s')$ discussed in Section 3 using CRPS, RMSE and MAE. In addition, using the same 10-fold cross-validation approach, we fit the Ferrier et al. (2007) GDM model implemented in R (Fitzpatrick et al., 2022a), employing the same number of spline basis functions as we use. However, because the Ferrier et al. (2007) fitting does not provide predictive distributions, we only compute RMSE and MAE. We also include a naive model that predicts all dissimilarities to be the sample mean of dissimilarities in the training set. For the family- and species-level analyses of GCFR, as well as the Panama and Australia data, we present the results of this comparison in Table 1.

4 | RESULTS

4.1 | Model comparison results

In comparison with the naive model (scalar mean only), for the family-level analysis of the Greater Cape Floristic Region (GCFR), the Ferrier model improves by about 7% and 8% in terms of RMSE and MAE respectively (Table 1). Under our specifications, all but our Model 3 performed better than the Ferrier model in terms of RMSE and MAE. For the family-level dissimilarity, two of our three models without spatial random effects are marginally better in prediction than the Ferrier et al. (2007) model. The models with spatial random effects $\eta(s, s')$ are much better in prediction than those without spatial random effects. The models using the $\chi^2$ specification of $\eta(s, s')$ were between 7% and 10% better than those where $\eta(s, s') = 0$, depending on the specification of $\sigma^2(s, s')$. The models using the folded normal specification of $\eta(s, s')$ were between 11% and 14% better than those where $\eta(s, s') = 0$, depending on the specification of $\sigma^2(s, s')$.

For the GCFR data, the Ferrier model was 14.2% and 17.1% worse than our best model in terms of RMSE and MAE respectively. Thus, at family level, our modelling offers a consequential improvement relative to the Ferrier model. Using only RMSE and MAE, the spatial random effects models provide the greatest improvement. More dramatically, our models can predict ones exactly, so all our models show some capacity in terms of PPC, while the Ferrier model offers none. Interestingly, at family level, our models cannot predict ones with any accuracy unless they have spatial random effects, suggesting that the covariates and distance are insufficient to explain complete assemblage dissimilarity. Our best models correctly predict 46% of the observed 1s exactly using the posterior predictive median. In terms of CRPS, the best model was Model 6, which uses the folded normal specification of $\eta(s, s')$ and a variance that is a function of $\mu(s, s')$. Thus, we use this model to present our results. Model 5 was competitive with Model 6 in CRPS and outperformed Model 6 in terms of RMSE and MAE. However, we ultimately use CRPS for model selection.

For the species-level analysis, the Ferrier model improves on the naive model by 7% and 11% in terms of RMSE and MAE respectively. Our best model improves upon the Ferrier model by 2.5% and 5.5% using RMSE and MAE respectively. Greater improvement over the GDM model occurs at the family level because the dissimilarities are smaller (mean dissimilarity of roughly 0.6) relative to those at the species level (mean dissimilarity of 0.9), allowing more scale to improve. In other words, the relative improvement is more substantial than it appears. This is apart from the empirical fact that roughly 50% of species-level dissimilarities are 1, mandating the need to specify a point mass at 1 in the modelling to enable exact prediction of a 1. Again, the GDM model cannot predict a dissimilarity of exactly 1; thus, its PPC is 0. This point is emphasized by the results in Table 1, where our best model (in terms of predicting ones correctly) is able to correctly predict 62% of the exact 1s, and predicted approximately 45% of the dissimilarities to be exactly 1. When analysing dissimilarities at species scale, the best model in terms of CRPS was Model 9, which uses the $\chi^2$ specification for $\eta(s, s')$ and variance as a function of $\mu(s, s')$. This reveals that different taxonomic scales can influence the selected model.

Compared to the best model for the BCI data, the Ferrier model is 14% and 16% worse in prediction using RMSE and MAE. The Ferrier model is better than the non-spatial models allowing one-inflation. Models with spatial random effects consistently outperform the Ferrier model, highlighting the significance of incorporating the spatial random effects for improved prediction accuracy. The Australia data serves as an example where our framework demonstrates marginal benefit. Compared to our best model, the Ferrier model shows only a 1% increase in prediction error measured by RMSE and MAE. Surprisingly, the Ferrier model outperforms all but one of the spGDMM models in terms of RMSE and MAE. We speculate that this can be attributed to the presence of very low levels of one-inflation and a relatively weak spatial residual structure in the dissimilarities.

4.2 | Inference

4.2.1 | Variable importance and response curves

Because we constrain the environmental warping functions $f_k(\cdot)$ and distance warping function $h(\cdot)$ to be monotone increasing from 0 to 1, we can compare the relative importance of each variable through the variable importance parameters: $\beta_1$ and $\alpha_k$. We emphasize that these parameters enter the model through the shape given by $f_k$. We plot the posterior means and 90% credible intervals for the variable importance parameters in Figure 3 for the GCFR, BCI and southwest Australia data. Inferential results of the GCFR data focus on our family-level analysis.

Four of the seven environmental variables in the GCFR data have posterior mean variable importances greater than 0.1: annual precipitation, heat load index, elevation and soil nitrogen (listed in order of importance). The BCI data had two of three variables with importances greater than 0.1 (distance and elevation), while the southwest
TABLE 1 Cross-validation model comparison results for (Top) the family-level analysis of GCFR dissimilarities, (Middle) the species-level analysis of GCFR dissimilarity and (Bottom) BCI and Australia dissimilarities. Best performances are bolded. We do not include models with $g(\mu(s, s'))$ in the BCI and Australia data due to small sample size. Model numbering is consistent across Tables.

(Top) GCFR, family level

| Model | $\eta(s, s')$ | $\sigma^2(s, s')$ | rCRPS | CRPS | RMSE | MAE | PPC |
|---|---|---|---|---|---|---|---|
| Naive | — | — | — | — | 0.2075 | 0.1711 | 0.0000 |
| Ferrier | — | — | — | — | 0.1922 | 0.1569 | 0.0000 |
| 1 | 0 | $\sigma^2$ | 1.1372 | 0.1104 | 0.1916 | 0.1566 | 0.0184 |
| 2 | 0 | $g(\lVert s - s' \rVert)$ | 1.1389 | 0.1106 | 0.1919 | 0.1568 | 0.0212 |
| 3 | 0 | $g(\mu(s, s'))$ | 1.1432 | 0.1110 | 0.1926 | 0.1573 | 0.0060 |
| 4 | $\lvert \psi(s) - \psi(s') \rvert$ | $\sigma^2$ | 1.0115 | 0.0982 | 0.1701 | 0.1350 | 0.4500 |
| 5 | $\lvert \psi(s) - \psi(s') \rvert$ | $g(\lVert s - s' \rVert)$ | 1.0007 | 0.0972 | 0.1683 | 0.1340 | 0.4605 |
| 6 | $\lvert \psi(s) - \psi(s') \rvert$ | $g(\mu(s, s'))$ | 1.0000 | 0.0971 | 0.1698 | 0.1342 | 0.3467 |
| 7 | $(\psi(s) - \psi(s'))^2$ | $\sigma^2$ | 1.0464 | 0.1016 | 0.1757 | 0.1401 | 0.4318 |
| 8 | $(\psi(s) - \psi(s'))^2$ | $g(\lVert s - s' \rVert)$ | 1.0508 | 0.1020 | 0.1763 | 0.1413 | 0.3740 |
| 9 | $(\psi(s) - \psi(s'))^2$ | $g(\mu(s, s'))$ | 1.0760 | 0.1045 | 0.1823 | 0.1454 | 0.3286 |

(Middle) GCFR, species level

| Model | $\eta(s, s')$ | $\sigma^2(s, s')$ | rCRPS | CRPS | RMSE | MAE | PPC |
|---|---|---|---|---|---|---|---|
| Naive | — | — | — | — | 0.1028 | 0.0801 | 0.0000 |
| Ferrier | — | — | — | — | 0.0954 | 0.0714 | 0.0000 |
| 1 | 0 | $\sigma^2$ | 1.0499 | 0.0467 | 0.0962 | 0.0741 | 0.3179 |
| 2 | 0 | $g(\lVert s - s' \rVert)$ | 1.0474 | 0.0466 | 0.0962 | 0.0736 | 0.3663 |
| 3 | 0 | $g(\mu(s, s'))$ | 1.0484 | 0.0467 | 0.0959 | 0.0720 | 0.5029 |
| 4 | $\lvert \psi(s) - \psi(s') \rvert$ | $\sigma^2$ | 1.0079 | 0.0449 | 0.0934 | 0.0681 | 0.6198 |
| 5 | $\lvert \psi(s) - \psi(s') \rvert$ | $g(\lVert s - s' \rVert)$ | 1.0128 | 0.0451 | 0.0938 | 0.0684 | 0.6141 |
| 6 | $\lvert \psi(s) - \psi(s') \rvert$ | $g(\mu(s, s'))$ | 1.0097 | 0.0450 | 0.0938 | 0.0669 | 0.4316 |
| 7 | $(\psi(s) - \psi(s'))^2$ | $\sigma^2$ | 1.0099 | 0.0450 | 0.0936 | 0.0699 | 0.4830 |
| 8 | $(\psi(s) - \psi(s'))^2$ | $g(\lVert s - s' \rVert)$ | 1.0034 | 0.0447 | 0.0933 | 0.0690 | 0.5484 |
| 9 | $(\psi(s) - \psi(s'))^2$ | $g(\mu(s, s'))$ | 1.0000 | 0.0445 | 0.0930 | 0.0675 | 0.4746 |

(Bottom) BCI and Australia datasets

| Model | $\eta(s, s')$ | $\sigma^2(s, s')$ | BCI CRPS | BCI RMSE | BCI MAE | Australia CRPS | Australia RMSE | Australia MAE |
|---|---|---|---|---|---|---|---|---|
| Naive | — | — | — | 0.1269 | 0.1036 | — | 0.1574 | 0.1307 |
| Ferrier | — | — | — | 0.0934 | 0.0716 | — | 0.0737 | 0.0549 |
| 1 | 0 | $\sigma^2$ | 0.0527 | 0.0954 | 0.0779 | 0.0439 | 0.0790 | 0.0595 |
| 2 | 0 | $g(\lVert s - s' \rVert)$ | 0.0511 | 0.0937 | 0.0734 | 0.0435 | 0.0805 | 0.0608 |
| 4 | $\lvert \psi(s) - \psi(s') \rvert$ | $\sigma^2$ | 0.0490 | 0.0878 | 0.0690 | 0.0473 | 0.0840 | 0.0629 |
| 5 | $\lvert \psi(s) - \psi(s') \rvert$ | $g(\lVert s - s' \rVert)$ | 0.0479 | 0.0879 | 0.0654 | 0.0454 | 0.0820 | 0.0626 |
| 7 | $(\psi(s) - \psi(s'))^2$ | $\sigma^2$ | 0.0472 | 0.0856 | 0.0658 | 0.0414 | 0.0731 | 0.0545 |
| 8 | $(\psi(s) - \psi(s'))^2$ | $g(\lVert s - s' \rVert)$ | 0.0450 | 0.0821 | 0.0618 | 0.0407 | 0.0748 | 0.0556 |
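The PPC column of Table 1 can be illustrated with a short sketch. It assumes the held-out dissimilarities and their posterior predictive draws have already been flattened into two parallel lists; the function name `ppc` and the toy numbers are hypothetical and are not taken from the authors' code or data.

```python
from statistics import median


def ppc(observed, posterior_draws):
    """Proportion of observed 1s whose posterior predictive median is exactly 1.

    observed: held-out dissimilarities, one per site pair.
    posterior_draws: one list of posterior predictive draws per site pair.
    """
    n_ones = 0
    hits = 0
    for z, draws in zip(observed, posterior_draws):
        if z == 1.0:
            n_ones += 1
            if median(draws) == 1.0:
                hits += 1
    return hits / n_ones if n_ones else float("nan")


# Toy example: three observed 1s, two of which have a predictive median of 1.
observed = [1.0, 1.0, 0.4, 1.0]
draws = [[1.0, 1.0, 0.9], [0.8, 0.7, 1.0], [0.3, 0.5, 0.4], [1.0, 1.0, 1.0]]
toy_ppc = ppc(observed, draws)
```

A model with no point mass at 1 produces draws that are never exactly 1, so its PPC under this definition is 0.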

Australian data had three of four variables with importances greater than 0.1 (winter precipitation, distance and soil phosphorus).

The model suggests that annual precipitation (gmap) is the most important driver of beta diversity in the South Africa data, holding all other variables constant. Annual precipitation appears to be almost twice as important as the second most important environmental variable: heat load index (Theobald et al., 2015). Precipitation was also found to be the most important variable for the southwest Australian data, though the distance and phosphorus variables were also quite close in importance. Holding all other variables constant, both elevation and soil nitrogen are also important drivers of beta diversity in the South Africa data.

To clarify how these environmental variables drive beta diversity in the spGDMM, we plot the estimated $\alpha_k f_k(\cdot)$ for all covariates in Figure 4. The warping functions, $f_k(\cdot)$, are plotted without the variable importance scaling in Supporting Information D. In these plots,
FIGURE 3 90% credible intervals on variable importance parameters for (Top) Family-level analysis of GCFR data, (Bottom-Left) Panama data and (Bottom-Right) Australia Data.
[Figure 3 panels: y-axis "Variable importance" (0.0 to 0.4); x-axis lists the covariates and distance for each dataset.]

we fix the scale of the y-axis so that the relative importance of these variables is clear. In addition, we add the estimated warping curve using the Ferrier model. Our model for the warping is on the scale of the latent log-dissimilarity and the Ferrier model is on the scale of $1 - e^{-\mu}$. To allow comparison between the inference provided by our model and the Ferrier GDM model, we multiply all estimated curves from the Ferrier model by a common scaling factor so that the highest point on any curve is equal to the highest point on the posterior mean curves.

For the Panama data, the Ferrier model showed weak effects (< 0.1%) for precipitation on beta diversity while the spGDMM found none. The spGDMM found weak effects of elevation on beta diversity that increased between 200 and 600 m and tapered after 600 m. The Ferrier model found similar results, though at stronger effect levels. Both models found that distance had strong effects for beta diversity with the spGDMM finding a steep effect increasing between 0 and 20 km while the Ferrier model showed a steadier increase across the distance range.
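The common rescaling of the Ferrier curves described earlier in this section can be sketched as below. This is an illustrative reading of that description, assuming each curve is stored as a list of values on a shared grid. The function name `rescale_reference_curves` and the toy curves are hypothetical.

```python
def rescale_reference_curves(reference_curves, target_curves):
    """Multiply all reference curves by one common factor.

    The factor is chosen so that the highest point over all reference
    curves equals the highest point over all target curves.
    """
    ref_max = max(max(curve) for curve in reference_curves.values())
    target_max = max(max(curve) for curve in target_curves.values())
    factor = target_max / ref_max
    scaled = {
        name: [factor * value for value in curve]
        for name, curve in reference_curves.items()
    }
    return scaled, factor


# Toy curves (not from the paper): reference curves play the role of the
# Ferrier fits, target curves the role of posterior mean curves.
ferrier = {"precipitation": [0.0, 0.5, 1.0], "elevation": [0.0, 0.2, 0.4]}
posterior_mean = {"precipitation": [0.0, 0.1, 0.35], "elevation": [0.0, 0.05, 0.2]}
scaled, factor = rescale_reference_curves(ferrier, posterior_mean)
```

Using a single factor for all curves preserves the relative heights of the reference curves, so their ordering of variable importance is unchanged by the rescaling.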
Holding other covariates constant in the South Africa data, annual average precipitation has almost no estimated impact on beta diversity until the average exceeds 300 mm according to estimates from the spGDMM. Beyond 300 mm, annual precipitation has a very strong impact on beta diversity. This suggests that precipitation averages below 300 mm are effectively similar in terms of beta diversity. The Ferrier model found a similar result, but showed an additional impact on beta diversity around 150 mm of rainfall. The warping of elevation is approximately linear, with some tapering for elevations above 1300 m. The Ferrier model results behaved similarly but had a strong impact for elevation above 1300 m instead of tapering. From the estimated warping for heat loads in Figure 4, heat loads less than 170 are effectively the same in terms of beta diversity. When heat loads exceed 170, there are significant effects on beta diversity. The Ferrier model estimated that heat load had an effect at much smaller values, but tapered off around estimates of 190. Uniquely, the beta diversity driven by soil total nitrogen occurs at low values (< 0.1%) for the spGDMM. Above 0.1% in soil total nitrogen, there is effectively no difference in beta diversity. The Ferrier model had similar dynamics but showed a higher effect from soil nitrogen. We speculate that the differences in results between the spGDMM and the Ferrier model arise in part from the inclusion of spatial random effects and the use of different link functions.

The spGDMM and Ferrier model results often overlapped in the Australia data. The most significant departure between the two models was for the distance variable with the Ferrier model finding a weak, but steadily increasing effect of distance on beta diversity. The spGDMM found a strong effect that sharply increased between 0 and 150 km.

4.3 | Spatial random effects

We visualize the estimated spatial random effect surface in two ways in Figure 5. First, we plot the posterior mean of $\psi(s)$. Second, we plot $\eta(s, s') = \lvert \psi(s) - \psi(s') \rvert$, by fixing $s$ and plotting the posterior mean as a function of $s'$. Every point $s$ generates a surface that adjusts the dissimilarities at $s'$. These fixed $s$ sites are located where we would expect to find vegetation belonging to the Succulent Karoo and Fynbos biomes respectively (see Figure 1). The scale of the differences in $\eta(s, s')$ is large relative to the scale of the scaled warping functions in Figure 4. This suggests that the spatial random effect contributes to the mean dissimilarity more than the environmental covariates, even when considering the additive effect of many environmental variables. As is discussed at length in the spatial confounding literature (see, e.g. Khan & Calder, 2022; Reich et al., 2006), it is likely that the spatial random effects diminish or alter the estimated effect of some
FIGURE 4 Product of variable importance parameter ($\alpha_k$ or $\beta_1$) and warping function. Posterior mean in black, 90% credible interval in grey, and scaled curves from Ferrier model in red. (Top/Top-Middle) all covariates for the family level analysis of the GCFR data; (Bottom-Middle) precipitation, elevation and distance curves for Panama data; (Bottom) winter precipitation, soil phosphorus, max temperature and distance for Australia data. Within a dataset, all curves are plotted on fixed y-scale.
[Figure 4 panels, by x-axis. GCFR: annual precipitation (mm), rainfall concentration, elevation (m), heat load, mean temperature (°C), soil conductivity, soil nitrogen, distance (km). Panama: precipitation (mm), elevation (m), distance (km). Australia: winter precipitation (mm), soil phosphorus, max temperature (°C), distance (km).]

of the environmental predictors. We note that much of the random effect signal we observed in the posterior means likely corresponds with biome transitions from Succulent Karoo to Fynbos biomes along the edge of the escarpment.

5 | DISCUSSION

Understanding and predicting turnover across landscapes is critical to biodiversity studies and conservation planning. GDMs have, perhaps, become a standard tool for many ecologists (Mokany et al., 2022) because of their capacity to handle the nonlinear relationship between compositional dissimilarity and environmental distance as well as their accessibility in software implementation (Fitzpatrick et al., 2022b). While the original GDM framework has advanced our understanding of beta diversity patterns and what drives them, it does not offer a coherent solution that could generate the data observed. In response, we have proposed a generative spatial model that we refer to as the spGDMM.

We believe that the spGDMM approach is a useful step forward for explaining turnover in a rigorous manner. Using a suitable link
FIGURE 5 Posterior mean of (Left) $\psi(s)$ using the $\eta(s, s') = \lvert \psi(s) - \psi(s') \rvert$ specification for the family-level analysis and (Centre-Right) $\eta(s, s') = \lvert \psi(s) - \psi(s') \rvert$ for two fixed points indicated by points at $s = (20.0, -31.5)$ and $(19.6, -32.3)$ respectively.
[Figure 5 panels: three maps with axes lon (19 to 21) and lat (−33.0 to −30.5), coloured by posterior mean.]
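The centre and right panels show a posterior mean of $\eta(s, s')$ for a fixed site. A minimal sketch of that calculation is given below, assuming posterior draws of $\psi$ are available as one list of site values per MCMC draw. The function name `posterior_mean_eta` and the toy draws are hypothetical. The sketch averages $\eta$ draw by draw, which generally differs from plugging the posterior mean of $\psi$ into the formula.

```python
def posterior_mean_eta(psi_draws, fixed_index, squared=False):
    """Posterior mean of eta(s, s') for one fixed site s, over all sites s'.

    psi_draws: list of posterior draws, each a list of psi values by site.
    squared=False uses |psi(s) - psi(s')|;
    squared=True uses (psi(s) - psi(s'))^2.
    """
    n_draws = len(psi_draws)
    n_sites = len(psi_draws[0])
    totals = [0.0] * n_sites
    for draw in psi_draws:
        ref = draw[fixed_index]
        for j, value in enumerate(draw):
            diff = ref - value
            totals[j] += diff * diff if squared else abs(diff)
    return [total / n_draws for total in totals]


# Toy example with two draws at three sites (not the paper's data).
psi_draws = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]
surface_abs = posterior_mean_eta(psi_draws, fixed_index=0)
surface_sq = posterior_mean_eta(psi_draws, fixed_index=0, squared=True)
```

In the toy example the posterior mean of $\psi$ is the same at every site, so a plug-in calculation would give zero everywhere, while the draw-by-draw average does not.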

function (others are available, see Supporting Information C), we have specified a model with a flexible spatially varying mean function as well as spatially varying variances. Perhaps more importantly, we have incorporated novel spatial random effects to capture dependence between dissimilarities as well as to improve out-of-sample prediction of dissimilarities. We have also implemented one-inflation, acknowledging that, at the species level in the Greater Cape Floristic Region (GCFR), essentially half of the observed dissimilarities are 1.

We found that our spGDMM yielded consequential predictive improvement over the GDM approach of Ferrier et al. (2007), particularly for the GCFR plant dataset. Our results confirm adequate predictive capacity for other ecosystems when our models were applied to the Barro Colorado Island (BCI) and southwest Australia data (Section 4.1). In terms of inference, the model conclusions were often similar, though we note that the differences in results between our models and the Ferrier model arise in part from the inclusion of spatial random effects and the use of different link functions. We highlight that our model is able to predict dissimilarities with a value of 1 (total dissimilarity) while the Ferrier model cannot. We feel that this is a valuable contribution because many biodiversity hotspots, for example, the GCFR, are characterized by their large degree of species turnover (Latimer et al., 2005), which in turn would lead to a high proportion of 1s even within small spatial extents that are well sampled. This will also be the case in broad-scale studies that examine biodiversity across regional to global scales. Furthermore, given the time and resources needed to survey an area in terms of its species, one can imagine a scenario where limited samples within a diverse area can easily inflate the proportions of 1s, creating a need for models that handle their prevalence.

In terms of inferring what drives plant family turnover within the GCFR, we found that mean annual precipitation, heat load index, elevation and soil nitrogen were the most important of the eight variables that we examined. Broadly, these results align with previous efforts to examine turnover in the region. Factors such as climate, topography and edaphic features have been highlighted in driving turnover in taxonomic composition and subsequently turnover in biomes (Mucina & Rutherford, 2006; Power et al., 2017). Out of all the variables examined, we found that mean annual precipitation was the most important in explaining beta diversity. This result is also echoed in the southwest Australia data where the amount of winter rainfall was the most important factor in driving plant species turnover in the Mediterranean portion of the continent, matching the findings of Fitzpatrick et al. (2013). Mean annual precipitation and heat load index had relatively sharp thresholds with their relationship to beta diversity. We hypothesize that the sharp increases of these two variables are likely indications of turnover in biomes, particularly the transition of the arid Succulent Karoo to fire-prone Fynbos biomes. The mean annual precipitation for the Succulent Karoo is around 170 mm with the wettest areas reaching levels of 300 mm (Mucina et al., 2006). Strikingly, the threshold of 300 mm is also where our results for annual precipitation start to strongly impact beta diversity. The threshold of the heat load index is also likely the point where moisture availability favours Succulent Karoo over Fynbos vegetation either through the absence of fire or greater seed viability for Succulent Karoo vegetation in drier areas (Rebelo et al., 2006).

The spatial random effects in our models play an important role, contributing more to explaining mean dissimilarity than our environmental covariates. This makes a strong case for the need to include spatial random effects in future dissimilarity modelling. Within our data, we believe that the spatial random effects are capturing unmeasured local and regional factors that are likely related to biome turnover. As seen in Figure 5, the two fixed points are roughly located in an area of Succulent Karoo (left) and Fynbos (right). The magnitudes of the random effect are lower where we would expect the same biome of the fixed point and higher where we would expect a different biome.

There are several future applications that can be developed based on the generative spatial models we describe. The incorporation of spatial random effects can produce a number of hypotheses regarding spatial patterns and processes that can subsequently be explored and tested explicitly (Latimer et al., 2006, 2009). Future work could include the investigation of the temporal dynamics of turnover leading to a model that considers $Z(Y_{t_1}(s), Y_{t_2}(s))$, that is, the dissimilarity of a site between times $t_1$ and $t_2$. An even richer effort would consider spatio-temporal turnover, $Z(Y_{t_1}(s), Y_{t_2}(s'))$, to better understand change in biodiversity in the past and to project future changes. This
would enable assessment of within-site behaviour/variability versus between-site behaviour/variability of $Z$. Another direction could consider spatially varying coefficients in $\mu$ to enable the relationship to be local rather than global. This will be novel but very challenging because, like the intercept adjustment, these coefficients will need two arguments. A different path will consider biome-specific dissimilarity when there are several biomes within the study region. Given the complexity of biodiversity patterns and processes, the present modelling of the spatial and environmental aspects of taxonomic turnover only scratches the surface of what can be done in terms of understanding beta diversity.

AUTHOR CONTRIBUTIONS
Philip A. White and Alan E. Gelfand led the conceptualization, development and implementation of the statistical analyses. They also led the writing of the statistical review and critiques of current generalized dissimilarity model approaches. Jasper A. Slingsby, Henry A. Frye and John A. Silander led the writing process for the ecological contextualization and interpretation of the models. They also helped guide the conceptualization of the manuscript. All authors contributed critically to the drafts and gave final approval for publication.

ACKNOWLEDGEMENTS
We thank Matthew Aiello-Lammens, Douglas Euston-Brown, Hayley Kilroy Mollmann, Cory Merow, Helga van der Merwe and Adam Wilson for their contributions in the data collection and curation. Special thanks to Cape Nature and the Northern Cape Department of Environment and Nature Conservation for permission for the collection of leaf spectra and traits. Data collection efforts were made possible by funding from National Science Foundation grant DEB-1046328 to J.A. Silander. Additional support was provided by the University of Connecticut, a Future Investigators in NASA Earth and Space Science and Technology (FINESST) grant award (80NSSC20K1659) and a NASA BioSCape Award (80NSSC22K1383) to H.A. Frye and J.A. Silander, as well as National Research Foundation grant awards (118593, 142438 and 150926) to J.A. Slingsby.

CONFLICT OF INTEREST STATEMENT
The authors state that there is no conflict of interest.

PEER REVIEW
The peer review history for this article is available at https://www.webofscience.com/api/gateway/wos/peer-review/10.1111/2041-210X.14259.

DATA AVAILABILITY STATEMENT
The Barro Colorado Island and Southwest Australia datasets are accessible from sources cited in the main manuscript. Data from the Greater Cape Floristic Region, as well as code for the models implemented using NIMBLE (de Valpine et al., 2017), are provided at https://zenodo.org/records/10091442 (White, 2023).

ORCID
Philip A. White https://orcid.org/0000-0003-0907-9221
Henry A. Frye https://orcid.org/0000-0002-2066-5742
Jasper A. Slingsby https://orcid.org/0000-0003-1246-1181
John A. Silander Jr https://orcid.org/0000-0001-8535-4710
Alan E. Gelfand https://orcid.org/0000-0002-5671-9212

REFERENCES
Anderson, M. J., Crist, T. O., Chase, J. M., Vellend, M., Inouye, B. D., Freestone, A. L., Sanders, N. J., Cornell, H. V., Comita, L. S., & Davies, K. F. (2011). Navigating the multiple meanings of β diversity: A roadmap for the practicing ecologist. Ecology Letters, 14(1), 19–28.
Blasco-Moreno, A., Pérez-Casany, M., Puig, P., Morante, M., & Castells, E. (2019). What does a zero mean? Understanding false, random and structural zeros in ecology. Methods in Ecology and Evolution, 10(7), 949–959.
Brown, T. A. (1974). Admissible scoring systems for continuous distributions. Technical Report P-5235. The Rand Corporation.
Condit, R., Pitman, N., Leigh, E. G., Chave, J., Terborgh, J., Foster, R. B., Núñez, P., Aguilar, S., Valencia, R., Villa, G., Muller-Landau, H. C., Losos, E., & Hubbell, S. P. (2002). Beta-diversity in tropical forest trees. Science, 295(5555), 666–669.
de Valpine, P., Turek, D., Paciorek, C. J., Anderson-Bergman, C., Lang, D. T., & Bodik, R. (2017). Programming with models: Writing statistical algorithms for general model structures with NIMBLE. Journal of Computational and Graphical Statistics, 26(2), 403–413.
Ferrari, S., & Cribari-Neto, F. (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31(7), 799–815.
Ferrier, S. (2002). Mapping spatial pattern in biodiversity for regional conservation planning: Where to from here? Systematic Biology, 51(2), 331–363.
Ferrier, S., Harwood, T. D., Ware, C., & Hoskins, A. J. (2020). A globally applicable indicator of the capacity of terrestrial ecosystems to retain biological diversity under climate change: The bioclimatic ecosystem resilience index. Ecological Indicators, 117, 106554.
Ferrier, S., Manion, G., Elith, J., & Richardson, K. (2007). Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Diversity and Distributions, 13(3), 252–264.
Ferrier, S., Powell, G. V., Richardson, K. S., Manion, G., Overton, J. M., Allnutt, T. F., Cameron, S. E., Mantle, K., Burgess, N. D., & Faith, D. P. (2004). Mapping more of terrestrial biodiversity for global conservation assessment. Bioscience, 54(12), 1101–1109.
Fitzpatrick, M., Mokany, K., Manion, G., Nieto-Lugilde, D., & Ferrier, S. (2022a). gdm: Generalized dissimilarity modeling.
Fitzpatrick, M., Mokany, K., Manion, G., Nieto-Lugilde, D., & Ferrier, S. (2022b). gdm: Generalized dissimilarity modeling. R package version 1.5.0-9.1.
Fitzpatrick, M. C., Sanders, N. J., Normand, S., Svenning, J.-C., Ferrier, S., Gove, A. D., & Dunn, R. R. (2013). Environmental and historical imprints on beta diversity: Insights from variation in rates of species turnover along gradients. Proceedings of the Royal Society B: Biological Sciences, 280(1768), 20131201.
Gneiting, T., & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477), 359–378.
Graham, C. H., & Fine, P. V. A. (2008). Phylogenetic beta diversity: Linking ecological and evolutionary processes across space in time. Ecology Letters, 11(12), 1265–1277.
Hoskins, A. J., Harwood, T. D., Ware, C., Williams, K. J., Perry, J. J., Ota, N., Croft, J. R., Yeates, D. K., Jetz, W., Golebiewski, M., Purvis, A., Robertson, T., & Ferrier, S. (2020). BILBI: Supporting global
biodiversity assessment through high-resolution macroecological modelling. Environmental Modelling & Software, 132, 104806.
Khan, K., & Calder, C. A. (2022). Restricted spatial regression methods: Implications for inference. Journal of the American Statistical Association, 117(537), 482–494.
Krüger, F., Lerch, S., Thorarinsdottir, T., & Gneiting, T. (2021). Predictive inference based on Markov chain Monte Carlo output. International Statistical Review, 89(2), 274–301.
Latimer, A. M., Banerjee, S., Sang, H., Jr., Mosher, E. S., & Silander, J. A., Jr. (2009). Hierarchical models facilitate spatial analysis of large data sets: A case study on invasive plant species in the northeastern United States. Ecology Letters, 12(2), 144–154.
Latimer, A. M., Silander, J. A., & Cowling, R. M. (2005). Neutral ecological theory reveals isolation and rapid speciation in a biodiversity hot spot. Science, 309(5741), 1722–1725.
Latimer, A. M., Wu, S., Gelfand, A. E., & Silander, J. A., Jr. (2006). Building statistical models to analyze species distributions. Ecological Applications, 16(1), 33–50.
Legendre, P. (1993). Spatial autocorrelation: Trouble or new paradigm? Ecology, 74(6), 1659–1673.
Manly, B. F. (1986). Randomization and regression methods for testing for associations with geographical, environmental and biological distances between populations. Researches on Population Ecology, 28, 201–218.
Matheson, J. E., & Winkler, R. L. (1976). Scoring rules for continuous probability distributions. Management Science, 22(10), 1087–1096.
Mokany, K., Ware, C., Woolley, S. N. C., Ferrier, S., & Fitzpatrick, M. C. (2022). A working guide to harnessing generalized dissimilarity modelling for biodiversity analysis and conservation assessment. Global Ecology and Biogeography: A Journal of Macroecology, 31(4), 802–821.
Mucina, L., Jürgens, N., le Roux, A., Rutherford, M. C., Schmiedel, U., Esler, K. J., Powrie, L. W., Desmet, P. G., & Milton, S. J. (2006). Succulent Karoo Biome. In The Vegetation of South Africa, Lesotho and Swaziland, Strelitzia. South African National Biodiversity Institute.
Mucina, L., & Rutherford, M. C. (Eds.). (2006). The vegetation of South Africa, Lesotho and Swaziland. Number 19 in Strelitzia. South African National Biodiversity Institute. OCLC: ocn137259974.
Oksanen, J., & Tonteri, T. (1995). Rate of compositional turnover along gradients and total gradient length. Journal of Vegetation Science, 6(6), 815–824.
Ospina, R., & Ferrari, S. L. (2012). A general class of zero-or-one inflated beta regression models. Computational Statistics & Data Analysis, 56(6), 1609–1623.
Power, S. C., Anthony Verboom, G., Bond, W. J., & Cramer, M. D. (2017). Environmental correlates of biome-level floristic turnover in South Africa. Journal of Biogeography, 44(8), 1745–1757.
Prentice, I. C. (1977). Non-metric ordination methods in ecology. The Journal of Ecology, 65, 85–94.
Ramsay, J. O. (1988). Monotone regression splines in action. Statistical Science, 3(4), 425–441.
Rebelo, A. G., Boucher, C., Helme, N., Mucina, L., & Rutherford, M. C. (2006). Fynbos Biome. In The vegetation of South Africa, Lesotho, and Swaziland, Strelitzia. South African National Biodiversity Institute.
Reich, B. J., Hodges, J. S., & Zadnik, V. (2006). Effects of residual smoothing on the posterior of the fixed effects in disease-mapping models. Biometrics, 62(4), 1197–1206.
Schweiger, A. K., & Laliberté, E. (2022). Plant beta-diversity across biomes captured by imaging spectroscopy. Nature Communications, 13(1), 2767.
South African National Biodiversity Institute. (2006). The vegetation map of South Africa, Lesotho, and Swaziland.
Tang, B., Frye, H. A., Gelfand, A. E., & Silander, J. A. (2023). Zero-inflated Beta distribution regression modeling. Journal of Agricultural, Biological and Environmental Statistics, 28(1), 117–137.
Theobald, D. M., Harrison-Atlas, D., Monahan, W. B., & Albano, C. M. (2015). Ecologically-relevant maps of landforms and physiographic diversity for climate adaptation planning. PLoS One, 10(12), e0143619.
van der Merwe, H., van Rooyen, M. W., & van Rooyen, N. (2008a). Vegetation of the Hantam-Tanqua-Roggeveld subregion, South Africa. Part 1: Fynbos biome related vegetation. Koedoe, 50(1), 61–71.
van der Merwe, H., van Rooyen, M. W., & van Rooyen, N. (2008b). Vegetation of the Hantam-Tanqua-Roggeveld subregion, South Africa. Part 2: Succulent Karoo biome related vegetation. Koedoe, 50(1), 160–183.
White, P. A. (2023). Data and code from: Generative spatial generalized dissimilarity mixed modeling (spGDMM): An enhanced approach to modelling beta diversity. Zenodo. https://doi.org/10.5281/zenodo.10091441
White, P. A., Frye, H., Christensen, M. F., Gelfand, A. E., & Silander, J. A. (2022). Spatial functional data modeling of plant reflectances. The Annals of Applied Statistics, 16(3), 1919–1936.
White, P. A., Keeler, D. G., & Rupper, S. (2021). Hierarchical integrated spatial process modeling of monotone West Antarctic snow density curves. The Annals of Applied Statistics, 15(2), 556–571.
Whittaker, R. H. (1960). Vegetation of the Siskiyou mountains, Oregon and California. Ecological Monographs, 30(3), 279–338.

SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.
Supporting Information S1. We provide additional data explanations and exploratory analyses in Section A. We introduce prior distributions, model fitting details, and prediction methods in Section B. In Section C, we present an alternative link function and present distribution theory for the spatial random effects. In Sections D and E, we show additional results for the South Africa data at the family and species levels. In Section F, we display the spatial random effect for the Barro Colorado Island and Southwest Australia datasets.

How to cite this article: White, P. A., Frye, H. A., Slingsby, J. A., Silander, J. A. Jr, & Gelfand, A. E. (2024). Generative spatial generalized dissimilarity mixed modelling (spGDMM): An enhanced approach to modelling beta diversity. Methods in Ecology and Evolution, 15, 214–226. https://doi.org/10.1111/2041-210X.14259
