
What are the important spatial scales in an ecosystem?
Pierre Legendre
Département de sciences biologiques
Université de Montréal

Pierre.Legendre@umontreal.ca
http://www.bio.umontreal.ca/legendre/

“Meet the experts” seminar
Université catholique de Louvain, Louvain-la-Neuve, April 19, 2012
Outline of the talk
1. Motivation
2. Variation partitioning
3. Multi-scale analysis
• The dbMEM method
• Simulation study (summary)
4. Several applications to real ecological data
5. Recent developments: MEM (general forms), AEM
Setting the stage
Ecologists want to understand and model spatial [or temporal]
community structures through the analysis of species assemblages.
• Species assemblages are the best response variable available to
estimate the impact of [anthropogenic] changes in ecosystems.
•   Difficulty: species assemblages form multivariate data tables
(sites x species).
Setting the stage
Beta diversity is the variation in species composition among sites.
Beta diversity is organized in communities. It displays spatial
structures.
Ile Callot, Finistère.
Photo P. Legendre
Stonehenge, Wiltshire,
southern England.
Photo P. Legendre
Setting the stage
Spatial structures in communities indicate that some process has
been at work to create them. Two families of mechanisms can
generate spatial structures in communities:
•   Induced spatial dependence: forcing (explanatory) variables are
responsible for the spatial structures found in the species assemblage.
They represent environmental or biotic control of the species
assemblages, or historical dynamics. Generally broad-scaled.
• Community dynamics: the spatial structures are generated by the
species assemblages themselves, creating autocorrelation¹ in the
response variables (species). Mechanisms: neutral processes such as
ecological drift and limited dispersal, and interactions among species.
Spatial structures are generally fine-scaled.

¹ Spatial autocorrelation (SA) is technically defined as the dependence, due to geographic
proximity, present in the residuals of a [regression-type] model of a response variable y which
takes into account all deterministic effects due to forcing variables. Model: $y_i = f(X_i) + SA_i + \varepsilon_i$.
Multivariate variation partitioning
Borcard, Legendre & Drapeau 1992 [1722 citations]
Borcard & Legendre 1994
and many published application papers
The variation of the community composition data table Y is divided into four fractions:
[a] explained by the environmental data matrix X alone
[b] explained jointly by X and the spatial data matrix W
[c] explained by W alone
[d] = Residuals
X explains [a] + [b]; W explains [b] + [c].

Figure – Venn diagram illustrating a partition of the variation of a
response matrix Y (e.g., community composition data) between
environmental (matrix X) and spatial (matrix W) explanatory
variables. The rectangle represents 100% of the variation of Y.

Method described in:
Borcard, Legendre & Drapeau, 1992; Borcard & Legendre, 1994; Legendre & Legendre, 2012.
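A minimal sketch of how the four fractions can be obtained by subtraction from three adjusted R² values, in the spirit of the method cited above. The function names (`adjusted_r2`, `partition_variation`) and the simulated tables are hypothetical, and the fit is a plain least-squares regression rather than any specific package implementation.

```python
import numpy as np

def adjusted_r2(Y, X):
    """Adjusted R-squared of the least-squares regression of Y on X (intercept included)."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    Y = Y.reshape(len(Y), -1)
    X = X.reshape(len(X), -1)
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)            # centring replaces the intercept
    Xc = X - X.mean(axis=0)
    coef, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ coef
    r2 = (fitted ** 2).sum() / (Yc ** 2).sum()
    m = np.linalg.matrix_rank(Xc)      # number of independent explanatory variables
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)

def partition_variation(Y, X, W):
    """Fractions [a], [b], [c], [d] obtained by subtraction of adjusted R-squared values."""
    r2_x = adjusted_r2(Y, X)                          # [a] + [b]
    r2_w = adjusted_r2(Y, W)                          # [b] + [c]
    r2_xw = adjusted_r2(Y, np.column_stack([X, W]))   # [a] + [b] + [c]
    return {
        "a": r2_xw - r2_w,
        "b": r2_x + r2_w - r2_xw,
        "c": r2_xw - r2_x,
        "d": 1.0 - r2_xw,
    }

# Simulated (hypothetical) data: 60 sites, 4 species, 2 environmental and 3 spatial variables.
rng = np.random.default_rng(42)
n = 60
X = rng.normal(size=(n, 2))
W = rng.normal(size=(n, 3)) + X[:, :1]     # W is correlated with X, so [b] is not zero
Y = X @ rng.normal(size=(2, 4)) + W @ rng.normal(size=(3, 4)) + rng.normal(size=(n, 4))

fractions = partition_variation(Y, X, W)
print(fractions)
```

Fraction [b] is obtained by subtraction only; it is not fitted by any model of its own, which is why it can occasionally be negative.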
How to combine environmental and spatial variables in modeling
community composition data?
• A single response variable: partial multiple regression.

Partial multiple regression is computed as follows:
1. Compute the residuals Xres of the regression of X on W: $X_{res} = X - W (W'W)^{-1} W'X$
2. Regress y on Xres
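The two steps above can be sketched with NumPy as follows. The data are simulated and all variables are assumed to be centred; `residualize` is a hypothetical helper name. The last lines check a known property of this procedure: the coefficient of Xres equals the coefficient of X in the regression of y on X and W together.

```python
import numpy as np

def residualize(X, W):
    """Return Xres = X - W (W'W)^-1 W'X, the part of X not linearly explained by W."""
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    coef, *_ = np.linalg.lstsq(W, X, rcond=None)   # least-squares solution of W b = X
    return X - W @ coef

# Simulated, centred data (hypothetical): one environmental and two spatial variables.
rng = np.random.default_rng(1)
n = 40
W = rng.normal(size=(n, 2))
W = W - W.mean(axis=0)
X = 0.5 * W[:, [0]] + rng.normal(size=(n, 1))
X = X - X.mean(axis=0)
y = 2.0 * X[:, 0] + W[:, 1] + rng.normal(scale=0.1, size=n)
y = y - y.mean()

# Step 1: residuals of X on W.  Step 2: regress y on those residuals.
X_res = residualize(X, W)
b_partial, *_ = np.linalg.lstsq(X_res, y, rcond=None)

# For comparison: coefficients of the regression of y on X and W together.
b_full, *_ = np.linalg.lstsq(np.column_stack([X, W]), y, rcond=None)
print(b_partial[0], b_full[0])
```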
How to combine environmental and spatial variables in modelling
community composition data?
• Multivariate data: partial canonical analysis (RDA or CCA).

Table               Symbol   Content
Response table      Y        Community composition data
Explanatory table   X        Environmental variables
Explanatory table   W        Spatial base functions

Partial canonical analysis is computed as follows:
1. Compute the residuals Xres of the regression of X on W: $X_{res} = X - W (W'W)^{-1} W'X$
2. Regress Y on Xres to obtain Yfit. Compute PCA of Yfit.
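A sketch of these two steps for a partial RDA, assuming simulated tables and a hypothetical function name `partial_rda`. The PCA of Yfit is done here by singular value decomposition; scaling conventions for the scores differ among software packages and are not addressed.

```python
import numpy as np

def partial_rda(Y, X, W):
    """Partial RDA of Y by X, controlling for W (least-squares sketch)."""
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    Wc = W - W.mean(axis=0)
    n = Yc.shape[0]
    # Step 1: residuals of X on W
    coef_w, *_ = np.linalg.lstsq(Wc, Xc, rcond=None)
    X_res = Xc - Wc @ coef_w
    # Step 2: regress Y on Xres to obtain the fitted values Yfit
    coef_x, *_ = np.linalg.lstsq(X_res, Yc, rcond=None)
    Y_fit = X_res @ coef_x
    # PCA of Yfit through its singular value decomposition
    U, s, Vt = np.linalg.svd(Y_fit, full_matrices=False)
    eigenvalues = s ** 2 / (n - 1)
    keep = eigenvalues > 1e-12 * eigenvalues.max()   # drop numerically null axes
    site_scores = U[:, keep] * s[keep]
    species_loadings = Vt[keep].T
    return eigenvalues[keep], site_scores, species_loadings

# Simulated (hypothetical) tables: 30 sites, 5 species, 2 environmental, 3 spatial variables.
rng = np.random.default_rng(7)
n = 30
W = rng.normal(size=(n, 3))
X = rng.normal(size=(n, 2)) + 0.5 * W[:, :2]
Y = X @ rng.normal(size=(2, 5)) + W @ rng.normal(size=(3, 5)) + rng.normal(size=(n, 5))

eigenvalues, site_scores, species_loadings = partial_rda(Y, X, W)
print(eigenvalues)
```

With two environmental variables there are at most two canonical axes, and the site scores are linearly independent of W by construction.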
Geographic base functions
First (simple) representation: Polynomial function of geographic
coordinates (polynomial trend-surface analysis).
Example 1: 20 sampling sites in the Thau lagoon, southern France.

$\hat{z} = f(X,Y) = b_0 + b_1 X + b_2 Y + b_3 X^2 + b_4 XY + b_5 Y^2 + b_6 X^3 + b_7 X^2 Y + b_8 XY^2 + b_9 Y^3$
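As an illustration of the cubic trend-surface equation, the sketch below builds the nine monomials and estimates b0 to b9 by least squares. The coordinates and the response are simulated (not the Thau lagoon data), and the function names are hypothetical. Centring the coordinates before computing the monomials is a common way to reduce collinearity; here the simulated coordinates are already roughly centred.

```python
import numpy as np

def cubic_monomials(x, y):
    """Columns: X, Y, X^2, XY, Y^2, X^3, X^2 Y, X Y^2, Y^3."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.column_stack([x, y, x**2, x*y, y**2, x**3, x**2 * y, x * y**2, y**3])

def fit_trend_surface(x, y, z):
    """Least-squares estimates b0..b9 and fitted values of the cubic trend surface."""
    design = np.column_stack([np.ones(len(x)), cubic_monomials(x, y)])
    b, *_ = np.linalg.lstsq(design, z, rcond=None)
    return b, design @ b

# Simulated example: 20 sites and a response generated from a known surface.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=20)
y = rng.uniform(-1.0, 1.0, size=20)
z = 1.0 + 2.0 * x - 1.0 * y + 0.5 * x * y

b, z_hat = fit_trend_surface(x, y, z)
print(np.round(b, 3))
```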
Small textbook example:
20 sampling sites in the Thau lagoon, southern France.
Response (Y): 2 types of aquatic heterotrophic bacteria (log-transformed)
Environmental (X): NH4, phaeopigments, bacterial production
Spatial (W): selected geographic monomials $X^2$, $X^3$, $X^2Y$, $XY^2$, $Y^3$
In real-life studies, partitioning is carried out on larger data sets.
Spatial eigenfunctions

Second representation:
Distance-based Moran’s eigenvector maps
(dbMEM)
(formerly called Principal Coordinates of Neighbor Matrices, PCNM)
leading to multiscale analysis in variation partitioning
Borcard & Legendre 2002, 2004
Dray, Legendre & Peres-Neto 2006
Figure – Graphs of ten of the 49 dbMEM eigenfunctions that represent the spatial
variation along a transect with 50 equally-spaced points. Abscissa, from left to right:
sites 1 to 50. Ordinates: values along the dbMEM eigenfunctions.
Steps shown in the schematic:
1. Data: observed variable y and spatial coordinates x of the sites.
2. Compute the matrix of Euclidean distances among the sites.
3. Truncate it to obtain the neighbor matrix: distances from 1 to max are kept; larger distances are replaced by 4 × max.
4. Principal coordinate analysis of the truncated matrix produces eigenvectors with positive (+), null (0) and negative (–) eigenvalues.
5. The eigenvectors with positive eigenvalues (= PCNM variables) form the explanatory table X used in the multiple regression or canonical analysis of Y.

Figure – Schematic description of dbMEM analysis. The descriptors of spatial
relationships (dbMEM eigenfunctions) are obtained by principal coordinate analysis
of a truncated matrix of Euclidean (geographic) distances among the sampling sites.
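A minimal NumPy sketch of this construction, assuming a hypothetical function name `dbmem` and a numerical tolerance `tol` chosen for the example. The diagonal of the truncated matrix is left at 0 here (the classical PCNM form). The eigenvectors are returned with unit norm; packages may scale them differently.

```python
import numpy as np

def dbmem(coords, threshold, tol=1e-8):
    """Eigenvectors with positive eigenvalues from the PCoA of a truncated distance matrix."""
    coords = np.asarray(coords, dtype=float)
    coords = coords.reshape(len(coords), -1)       # accept 1-D transects or 2-D maps
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))          # Euclidean distances among sites
    # Truncation: distances larger than the threshold are replaced by 4 x threshold
    D_trunc = np.where(D <= threshold, D, 4.0 * threshold)
    # Principal coordinate analysis: double-centre -0.5 * D^2, then eigen-decompose
    J = np.eye(n) - np.ones((n, n)) / n
    G = J @ (-0.5 * D_trunc ** 2) @ J
    eigenvalues, eigenvectors = np.linalg.eigh(G)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    keep = eigenvalues > tol * eigenvalues[0]      # positive eigenvalues only
    return eigenvalues[keep], eigenvectors[:, keep]

# Transect with 50 equally spaced points; neighbours are 1 unit apart.
transect = np.arange(1, 51)
values, vectors = dbmem(transect, threshold=1.0)
print(vectors.shape)
```

The columns of `vectors` can then be used as the spatial explanatory table in a regression or canonical analysis.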
MEM eigenfunctions display spatial autocorrelation

A spatial correlation coefficient (Moran’s I) can be computed for each
MEM. The eigenvalues are actually proportional to Moran’s I.
Example: 50-point transect, 49 MEMs.
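For reference, Moran's I of a vector v observed at n sites, with spatial weights w_ij, can be written as below. The second expression states the proportionality mentioned above for the case where v_k is an eigenvector, with eigenvalue λ_k, of the double-centred weighting matrix W (notation chosen for this note; the weights used for a given type of MEM are an assumption to be checked against the method description).

```latex
I(\mathbf{v}) = \frac{n}{\sum_{i}\sum_{j} w_{ij}} \cdot
\frac{\sum_{i}\sum_{j} w_{ij}\,(v_i-\bar{v})(v_j-\bar{v})}{\sum_{i}(v_i-\bar{v})^2},
\qquad
I(\mathbf{v}_k) = \frac{n}{\mathbf{1}'\mathbf{W}\mathbf{1}}\,\lambda_k
```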
How to find the truncation distance in 2-dimensional problems?
Compute a minimum spanning tree

Truncation distance ≥ length of the longest link. Longest link here: D(7, 8) = 3.0414
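A sketch of how the longest link of a minimum spanning tree can be found, using Prim's algorithm in plain Python. The four points are hypothetical and are not the sites shown on the slide; `longest_mst_edge` is a name chosen for the example.

```python
import math

def longest_mst_edge(points):
    """Length of the longest edge of the minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # shortest known distance from each point to the tree
    best[0] = 0.0              # start the tree at the first point
    longest = 0.0
    for _ in range(n):
        # Add the point that is closest to the current tree
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        longest = max(longest, best[u])
        # Update the distances of the remaining points to the tree
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return longest

# Hypothetical site coordinates
sites = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 3.0)]
threshold = longest_mst_edge(sites)
print(threshold)
```

Any truncation distance at least as large as this value keeps all sites connected in a single network.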
Technical notes on MEM eigenfunctions

MEM variables represent a spectral decomposition of the spatial
relationships among the study sites. They can be computed for
regular or irregular sets of points in space or time.

MEM eigenfunctions are orthogonal. If the sampling design is
regular, they look like sine waves. This is a property of the
eigen-decomposition of the centred form of a distance matrix.

⇒ Consider a matrix of 0/1 connections among points, with
• 1 between connected points
• 0 between unconnected points and on the diagonal
Double-centre the matrix to have the sums of rows and columns equal
to 0. The eigenvectors of that matrix, plotted on a map of the
geographic coordinates of the points, are sine-shaped.
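The double-centring described above can be tried on a transect of 50 points where each point is connected to its immediate neighbours. This sketch also checks numerically that Moran's I of each eigenvector is proportional to its eigenvalue. The connection scheme and the helper name `morans_i` are assumptions made for the example.

```python
import numpy as np

def morans_i(v, B):
    """Moran's I of vector v for the spatial weights in B."""
    z = v - v.mean()
    return len(v) / B.sum() * (z @ B @ z) / (z @ z)

# 0/1 connection matrix of a 50-point transect: each point linked to its two neighbours.
n = 50
B = np.zeros((n, n))
idx = np.arange(n - 1)
B[idx, idx + 1] = 1.0
B[idx + 1, idx] = 1.0

# Double-centring: the sums of rows and columns become 0.
J = np.eye(n) - np.ones((n, n)) / n
C = J @ B @ J

vals, vecs = np.linalg.eigh(C)
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# Drop the null eigenvalue, which corresponds to the constant vector.
keep = np.abs(vals) > 1e-10
vals, vecs = vals[keep], vecs[:, keep]

moran = np.array([morans_i(v, B) for v in vecs.T])
print(np.round(moran[:3], 3), np.round(n / B.sum() * vals[:3], 3))
```

Plotting the first few columns of `vecs` against the site positions is a quick way to see the sine-like shapes.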
Simulation study
Type I error study
Simulations showed that the procedure is honest: it does not generate
more significant results than it should for a given significance level α.
Power study
Simulations showed that dbMEM (PCNM) analysis is capable of
detecting spatial structures of many kinds:
• random autocorrelated data,
•   bumps and sine waves of various sizes, without or with random
noise, representing deterministic structures,
as long as the structures are larger than the truncation value used to
create the dbMEM (PCNM) eigenfunctions.
Detailed results are found in Borcard & Legendre 2002.
A difficult test case
Figure – A difficult test case with simulated data (abscissa: 0 to 100; ordinate: dependent variable).
g) Data (100%): simulated data.
h) Detrended data (step 1: detrending).
i) Spatial model with 8 PCNM base functions (R² = 0.433): PCNM analysis of detrended data.
j) Broad-scale submodel (R² = 0.058): PCNM #2.
k) Intermediate-scale submodel (R² = 0.246): PCNM #6, 8, 14.
l) Fine-scale submodel (R² = 0.128): PCNM #28, 33, 35, 41.
Selection of explanatory variables?
Significance of the adjustment of a MEM model can be tested using
the full set of MEM variables modelling positive spatial correlation,
without selection of any kind.

The adjusted R² gives a correct estimate of the variation explained by
the MEM by correcting for the number of explanatory variables in the
model.¹

Before constructing submodels, forward selection of MEMs can be
done by combining two criteria during model selection: the alpha
significance level and the adjusted R² of the model containing all
MEM eigenfunctions.²

¹ Peres-Neto, P. R., P. Legendre, S. Dray and D. Borcard. 2006. Variation partitioning of species data
matrices: estimation and comparison of fractions. Ecology 87: 2614-2625.
² Blanchet F. G., P. Legendre and D. Borcard. 2008. Forward selection of explanatory variables.
Ecology 89: 2623-2632.
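As a reminder, the adjusted R² is usually computed with Ezekiel's correction, where n is the number of sites and m the number of explanatory variables (symbols chosen for this note, not taken from the slides). The second line restates the two stopping criteria: a candidate MEM k is not added if its p-value exceeds α, or if adding it would push the adjusted R² above that of the model containing all MEM.

```latex
R^2_{a} = 1 - \frac{n-1}{n-m-1}\,\left(1 - R^2\right)

\text{stop before adding MEM } k \text{ if } \quad p_k > \alpha
\quad \text{or} \quad
R^2_{a}(\text{selected} + k) > R^2_{a}(\text{all MEM})
```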
Example 1
Regular one-dimensional transect in upper Amazonia¹
Data: abundance of the fern Adiantum tomentosum in quadrats.
Sampling design: 260 adjacent, square (5 m x 5 m) subplots forming a
transect in the region of Nauta, Peru.
Questions
• At what spatial scales is the abundance of this species structured?
• Are these scales related to those of the environmental variables?
Pre-treatment
• The abundances were square-root transformed
• and detrended (significant linear trend: R² = 0.102, p = 0.001)

¹ Data from Tuomisto & Poulsen 2000, reanalysed in Borcard, Legendre, Avois-Jacquet &
Tuomisto 2004.
Forward selection
50 dbMEM eigenfunctions (PCNM) were selected (permutation test, 999 permutations).
The dbMEM were arbitrarily divided into 4 submodels. The submodels are orthogonal to one another.

Significant wavelengths (periodogram analysis):
Very-broad-scale: 250, 355-440 m
Broad-scale: 180 m
Medium-scale: 90 m
Fine-scale: 50, 65 m

Figure – Data and dbMEM (PCNM) submodels along the transect (abscissa: subplots 1 to 260).
(a) Data and PCNM model (50 significant PCNMs out of 176), R² = 0.815
(b) Very-broad-scale submodel, 10 PCNMs, R² = 0.333
Broad-scale submodel, 8 PCNMs, R² = 0.239
(c) Medium-scale submodel, 12 PCNMs, R² = 0.126
(d) Fine-scale submodel, 20 PCNMs, R² = 0.117
Interpretation: regression on the environmental variables
Use of dbMEM in variation partitioning: Adiantum tomentosum at Nauta,
Peru (adjusted R², R²a).
Scalogram of the fern Adiantum tomentosum multiscale structure
along another transect called Huanta (Peru). Abscissa: the 129
dbMEM eigenfunctions with positive Moran’s I. Ordinate: absolute
values of the t-statistics. The 26 eigenfunctions selected by forward
selection (p ≤ 0.05) are identified by black squares.
Example 2
Regular two-dimensional sampling grid
Chlorophyll a in a brackish lagoon¹
Data: Chlorophyll a concentrations at 63 sites on a geographic surface.
Sampling design: 63 sites forming a regular grid (1-km mesh) in the
Thau marine lagoon (19 km x 5 km).
Questions
• At what spatial scales is chlorophyll a structured?
• Are these scales related to the environmental variables?
Pre-treatment
• None.
Forward selection
• 12 dbMEM (PCNM) eigenfunctions were selected out of 45.
¹ Data first analyzed by Legendre & Troussellier 1988; reanalyzed in Borcard et al. 1992 and
Borcard, Legendre, Avois-Jacquet & Tuomisto 2004.
Example 3
Gutianshan forest plot in China¹
• Evergreen forest in Gutianshan Forest Reserve, Zhejiang Province.
• Fully-surveyed 24-ha forest plot in subtropical forest, 29°15'N.
• Plot divided into 600 cells of 20 m × 20 m.
• 159 tree species. Richness: 19 to 54 species per cell.
• Data collection: 2005

Legendre, P., X. Mi, H. Ren, K. Ma, M. Yu, I. F. Sun, and F. He. 2009. Partitioning beta
diversity in a subtropical broad-leaved forest of China. Ecology 90: 663-674.

¹ The Gutianshan forest plot is a member of the Center for Tropical Forest Science (CTFS).
Details on the plot available at http://www.ctfs.si.edu/site/Gutianshan/.
Example 3
Gutianshan forest plot in China
Questions
• How much of the variation in species composition among sites (beta
diversity) is spatially structured?
• Of that, how much is related to the environmental variables?

⇒ Four environmental variables developed in cubic polynomial form:
Altitude: altitude, altitude², altitude³
Convexity: convexity, convexity², convexity³
Slope: slope, slope², slope³
Aspect (circular variable): sin(aspect), cos(aspect)
(Soil cores collected in 2007. Soil chemistry data not available yet.)
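A small sketch of how such explanatory terms can be built. The values are hypothetical, aspect is assumed to be measured in degrees, and the helper names are chosen for the example. Whether the original variables were centred or standardized before taking powers is not stated on the slide.

```python
import math

def cubic_terms(values):
    """Return (v, v^2, v^3) for each value of a quantitative variable."""
    return [(v, v ** 2, v ** 3) for v in values]

def aspect_terms(aspect_degrees):
    """Code a circular variable (assumed in degrees) by its sine and cosine."""
    angle = math.radians(aspect_degrees)
    return (math.sin(angle), math.cos(angle))

altitude = [450.0, 520.5, 610.0]   # hypothetical altitudes (m)
aspect = [359.0, 1.0, 180.0]       # hypothetical aspects (degrees)

altitude_poly = cubic_terms(altitude)
aspect_sc = [aspect_terms(a) for a in aspect]

# 359 and 1 degrees are almost the same direction; the sine/cosine coding reflects that.
gap_close = math.dist(aspect_sc[0], aspect_sc[1])
gap_opposite = math.dist(aspect_sc[0], aspect_sc[2])
print(round(gap_close, 4), round(gap_opposite, 4))
```

The sine and cosine coding avoids the artificial jump between 359 and 0 that a raw angle would introduce into a linear model.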

⇒ 599 dbMEM eigenfunctions. 200 model positive spatial correlation.
⇒ Nearly all of them are significant: spatial variation at all scales.
Figure – Example of a regular grid with 8 × 12 = 96 points. Maps showing ten of
the 48 dbMEM eigenfunctions that display positive spatial correlation.
Shades of grey: values in each eigenvector, from white (largest
negative value) to black (largest positive value).
Variation in species data Y (beta diversity):

Fraction                                        Value
Variation explained by Environment = [a] + [b]  0.307
Variation explained by PCNM = [b] + [c]         0.626
[a]                                             0.029
[b]                                             0.278
[c]                                             0.348
[d] = Unexplained (residual)                    0.344

• 63% of the among-cell variation (adjusted R², R²a) of the community composition
(159 species) is spatially structured and explained by the 339 PCNM.
• Nearly half of that 63% is also explained by the four environmental
variables. Soil chemistry to be added to the model when available.
• Scales of spatial variation: the dominant structure is broad-scaled.
⇒ Balance between neutral processes and environmental control.
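The arithmetic behind these statements can be checked from the reported fractions. The small difference between 0.345 and the reported [d] = 0.344 comes from rounding of the fractions to three decimals.

```latex
\begin{aligned}
[a] + [b] &= 0.029 + 0.278 = 0.307 && \text{(environment)} \\
[b] + [c] &= 0.278 + 0.348 = 0.626 && \text{(PCNM)} \\
[a] + [b] + [c] &= 0.029 + 0.278 + 0.348 = 0.655 \\
[d] &= 1 - 0.655 = 0.345 \approx 0.344 \\
\frac{[b]}{[b] + [c]} &= \frac{0.278}{0.626} \approx 0.44 && \text{(nearly half of the spatial fraction)}
\end{aligned}
```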
Applications of dbMEM eigenfunctions
a) We proceeded as follows in the first three examples:
• dbMEM analysis of the response table Y;
• Division of the significant dbMEM eigenfunctions into submodels;
• Interpretation of the submodels using explanatory variables.
The objective was to divide the variation of Y into submodels and
relate those to explanatory environmental variables.

b) dbMEM eigenfunctions can also be used in the framework of
variation partitioning, as in Example #4. The variation of Y is then
partitioned with respect to a table of explanatory variables X and (for
example) several tables W1, W2, W3, containing dbMEM submodels.
Applications of dbMEM eigenfunctions
c) dbMEM can be used to model spatial and temporal variation in the
study of spatio-temporal data, and test for the space-time interaction.
Refer to the space-time interaction talk, Wednesday.
d) dbMEM can efficiently model spatial structures in data. They can
be used to control for spatial autocorrelation in tests of significance of
the species-environment relationship (fraction [a]).¹

¹ Peres-Neto, P. R. and P. Legendre. 2010. Estimating and controlling for spatial structure in
the study of ecological communities. Global Ecology and Biogeography 19: 174-184.
Stéphane Dray: MEM analysis

The Moran’s eigenvector maps (MEM) method is a
generalization of dbMEM (PCNM) to different
types of spatial weights. The result is a set of spatial
eigenfunctions, as in dbMEM analysis.

Eigen-decomposition of a spatial weighting matrix W:

W = B * A (Hadamard product)
B = 0/1 connectivity matrix among sites
A = edge weighting matrix

Dray, Legendre and Peres-Neto (2006); Legendre and Legendre (2012, Chapter 14).
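A sketch of this construction with NumPy. The connectivity rule (sites closer than a threshold are connected) and the weighting function 1 − (d / dmax)² are illustrative choices made for this example; `mem_eigenfunctions` is a hypothetical name, and the grid of sites is simulated.

```python
import numpy as np

def mem_eigenfunctions(coords, threshold, tol=1e-10):
    """Eigen-decomposition of the double-centred spatial weighting matrix W = B * A."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # B: 0/1 connectivity among sites (no self-connection)
    B = ((D <= threshold) & ~np.eye(n, dtype=bool)).astype(float)
    # A: edge weights, here a decreasing function of distance (illustrative choice)
    A = 1.0 - (D / D.max()) ** 2
    W = B * A                                  # Hadamard (element-wise) product
    J = np.eye(n) - np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(J @ W @ J)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    keep = np.abs(vals) > tol                  # drop null eigenvalues
    return W, vals[keep], vecs[:, keep]

# Hypothetical 4 x 5 regular grid of sites, 1 unit apart.
grid = np.array([(i, j) for i in range(4) for j in range(5)], dtype=float)
W, values, vectors = mem_eigenfunctions(grid, threshold=1.0)
print(vectors.shape, int((values > 0).sum()))
```

Eigenvectors with positive eigenvalues model positive spatial correlation; those with negative eigenvalues model negative spatial correlation.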
Difference between classical PCNM and dbMEM

Classical PCNM: a site is connected to itself (D = 0).
dbMEM: a site is not connected to itself (D = 4 × threshold).

The eigenvalues are different but the eigenvectors
(i.e. the spatial eigenfunctions) are the same.
Other forms of Moran’s Eigenvector Maps (generalized
MEM) can be constructed (Dray et al. 2006):
• Binary MEM: double-centre matrix B, then compute its eigenvalues
and eigenvectors.
• Replace matrix A by some function of the distances.
• Replace A by some other weights, e.g. resistance of the landscape.

W = B * A (Hadamard product), where B = 0/1 connectivity matrix among sites
and A = edge weighting matrix.
Guillaume Blanchet: AEM analysis

Asymmetric eigenvector maps (AEM) is a spatial
eigenfunction method developed to model species
spatial distributions generated by an asymmetric,
directional physical process.¹

¹ Blanchet, F.G., P. Legendre and D. Borcard. 2008. Modelling directional spatial processes in
ecological data. Ecological Modelling 215: 325-336.

The following slides were provided by F. G. Blanchet from his talk: Modelling directional spatial
processes in ecological data. Spatial Ecological Data Analysis with R (SEDAR) Workshop, Université
Lyon I, May 26, 2008.
Constructing asymmetric eigenvector maps
(AEM eigenfunctions)

Link
Spatial asymmetry

1 = presence of a link
0 = absence of a link

# Link number
# Site number
Row of the sites-by-links table for site 8 (1 = presence of a link, 0 = absence of a link):

Link     1 2 3 4 5 6 7 8 9 10 11 12 13 14 … 57
Site 8   0 1 1 1 0 0 0 0 0  1  0  1  0  1 0  0

Matrix of edges E

Weights can be put on the edges.

Compute the AEM eigenfunctions: either the object scores from PCA of matrix E,
or SVD of matrix E centred by columns.
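A minimal sketch of the construction on a tiny, hypothetical directional network (five sites on a branching stream, not the network shown on the slides). Each site is coded by the edges found on its path to a fictitious origin 0, and the AEM eigenfunctions are taken from the SVD of the column-centred matrix E. Edge weights are omitted; they could be applied by multiplying each column of E by its weight.

```python
import numpy as np

# Hypothetical network: parent[s] is the site immediately upstream of s (0 = origin).
parent = {1: 0, 2: 1, 3: 2, 4: 2, 5: 4}
sites = sorted(parent)
edges = [(parent[s], s) for s in sites]      # one edge arrives at each site

# Sites-by-edges matrix E: 1 if the edge lies on the path from the origin to the site.
E = np.zeros((len(sites), len(edges)))
for i, s in enumerate(sites):
    node = s
    while node != 0:
        E[i, edges.index((parent[node], node))] = 1.0
        node = parent[node]

# AEM eigenfunctions: SVD of E centred by columns (equivalent to PCA object scores).
Ec = E - E.mean(axis=0)
U, sv, Vt = np.linalg.svd(Ec, full_matrices=False)
keep = sv > 1e-10                            # drop null singular values
aem = U[:, keep] * sv[keep]

print(E)
print(aem.shape)
```

With n sites, at most n − 1 eigenfunctions are obtained, because centring removes one dimension.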
Eigenfunctions (explanatory spatial variables)
Figure – Maps of some of the AEM eigenfunctions: AEM 1, AEM 2, AEM 3, [...], AEM 22, AEM 23, AEM 24.
Symbols distinguish negative values from positive values.
Three applications of AEM analysis are presented in the following paper:

Blanchet, F. G., P. Legendre, R. Maranger, D. Monti, and P. Pepin.
2011. Modelling the effect of directional spatial ecological processes
at different scales. Oecologia 166: 357-368.
Example 1 – Roxane Maranger, U. de Montréal
Bacterial production in Lake St. Pierre

Sampling was done on the morning of August 18th 2005.
Frenette et al. - Limnology & Oceanography (2006)
Map annotation: current direction.
AEM model: R²adj = 51.4% using 4 selected AEMs
Example 2 – Dominique Monti, UAG
Atya innocous

94 sites were sampled in Rivière Capesterre.
Map annotation: current direction.

93 AEM variables were constructed based on this connection diagram;
38 measured positive autocorrelation. 12 were selected.
AEM model: R²adj = 59.8% using 12 selected AEMs
Example 3: 6 larval stages of Calanus finmarchicus on
the Newfoundland and Labrador oceanic shelf
– Pierre Pepin, DFO, St. John’s, NL
AEM model: R²adj = 38.4% using 2 selected AEMs
Computer programs
in the R statistical language
On R-Forge: http://r-forge.r-project.org/R/?group_id=195
PCNM package (P. Legendre)
AEM package (F. G. Blanchet)
SPACEMAKER: an R package to compute PCNM and MEM (S. Dray)
PACKFOR: R package for selection of explanatory variables (S. Dray)
On the CRAN page: http://cran.r-project.org
VEGAN package (Oksanen et al. 2011):
function varpart() for multivariate variation partitioning
function ordistep() for selection of explanatory variables
A new package, ADESPATIAL, is in preparation, that will contain all functions for
spatial eigenfunction analysis presently found on the R-Forge page, and more.
To appear in 2012 (August or before)
References
Available in PDF at http://numericalecology.com/reprints/
Blanchet, F. G., P. Legendre, and D. Borcard. 2008a. Modelling directional spatial
processes in ecological data. Ecological Modelling 215: 325-336.
Blanchet, F. G., P. Legendre, R. Maranger, D. Monti, and P. Pepin. 2011. Modelling
the effect of directional spatial ecological processes at different scales. Oecologia
166: 357-368.
Borcard, D., P. Legendre & P. Drapeau. 1992. Partialling out the spatial component
of ecological variation. Ecology 73: 1045-1055.
Borcard, D. & P. Legendre. 1994. Environmental control and spatial structure in
ecological communities: an example using Oribatid mites (Acari, Oribatei).
Environmental and Ecological Statistics 1: 37-61.
Borcard, D. & P. Legendre. 2002. All-scale spatial analysis of ecological data by
means of principal coordinates of neighbour matrices. Ecological Modelling 153:
51-68.
Borcard, D., P. Legendre, C. Avois-Jacquet & H. Tuomisto. 2004. Dissecting the
spatial structure of ecological data at multiple scales. Ecology 85: 1826-1832.
Dray, S., P. Legendre & P. Peres-Neto. 2006. Spatial modelling: a comprehensive
framework for principal coordinate analysis of neighbour matrices (PCNM).
Ecological Modelling 196: 483-493.
Guénard, G., P. Legendre, D. Boisclair, and M. Bilodeau. 2010. Multiscale
codependence analysis: an integrated approach to analyze relationships across
scales. Ecology 91: 2952-2964.
Oksanen, J., G. Blanchet, R. Kindt, P. Legendre, P. R. Minchin, R. B. O’Hara, G. L.
Simpson, P. Solymos, M. H. H. Stevens, and H. Wagner. 2011. vegan: Community
Ecology Package. R package version 2.0-0. http://cran.r-project.org/package=vegan.
Peres-Neto, P. R., P. Legendre, S. Dray and D. Borcard. 2006. Variation partitioning
of species data matrices: estimation and comparison of fractions. Ecology 87:
2614-2625.
Peres-Neto, P. R. and P. Legendre. 2010. Estimating and controlling for spatial
structure in the study of ecological communities. Global Ecology and Biogeography
19: 174-184.
The End