ENVIRONMENTAL SCIENCE, ENGINEERING AND TECHNOLOGY

ECOLOGICAL MODELING

No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or
by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no
expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of information
contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in
rendering legal, medical or any other professional services.

Additional books in this series can be found on Nova's website under the Series tab.

WENJUN ZHANG
EDITOR

Nova Science Publishers, Inc.
New York

Copyright © 2012 by Nova Science Publishers, Inc.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or
transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical
photocopying, recording or otherwise without the written permission of the Publisher.

For permission to use material from this book please contact us:
Telephone 631-231-7269; Fax 631-231-8175
Web Site: http://www.novapublishers.com

NOTICE TO THE READER


The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or
implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of
information contained in this book. The Publisher shall not be liable for any special,
consequential, or exemplary damages resulting, in whole or in part, from the readers' use of, or
reliance upon, this material. Any parts of this book based on government reports are so indicated
and copyright is claimed for those parts to the extent applicable to compilations of such works.

Independent verification should be sought for any data, advice or recommendations contained in
this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage
to persons or property arising from any methods, products, instructions, ideas or otherwise
contained in this publication.

This publication is designed to provide accurate and authoritative information with regard to the
subject matter covered herein. It is sold with the clear understanding that the Publisher is not
engaged in rendering legal or any other professional services. If legal or any other expert
assistance is required, the services of a competent person should be sought. FROM A
DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE
AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.

Additional color graphics may be available in the e-book version of this book.

Library of Congress Cataloging-in-Publication Data

Ecological modeling / editor, Wen-Jun Zhang.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-62417-275-5 (eBook)
1. Ecology--Simulation methods. I. Zhang, Wen-Jun.
QH541.15.S5E275 2011
577.01'13--dc23
2011013661

Published by Nova Science Publishers, Inc. † New York


CONTENTS

Preface

Chapter 1 Artificial Neural Network Simulation of Spatial Distribution of Arthropods: A Multi-Model Comparison
WenJun Zhang and GuangHua Liu

Chapter 2 Multispectral Vegetation Indices in Remote Sensing: An Overview
George P. Petropoulos and Chariton Kalaitzidis

Chapter 3 Development of a Decision Support System for the Estimation of Surface Water Pollution Risk From Olive Mill Waste Discharges
Anas Altartouri, Kalliope Pediaditi, George P. Petropoulos, Dimitris Zianis and Nikos Boretos

Chapter 4 Analysis of Green Oak Leaf Roller Population Dynamics in Various Locations
L. V. Nedorezov

Chapter 5 Individual Based Modelling of Planktonic Organisms
Daniela Cianelli, Marco Uttieri and Enrico Zambianchi

Chapter 6 The Effectiveness of Artificial Neural Networks in Modelling the Nutritional Ecology of a Blowfly Species
Michael J. Watts, Andre Bianconi, Adriane Beatriz S. Serapiao, Jose S. Govone and Claudio J. Von Zuben

Chapter 7 Development and Utility of an Ecological-based Decision-Support System for Managing Mixed Coniferous Forest Stands for Multiple Objectives
Peter F. Newton

Chapter 8 Ecological Niche Models in Mediterranean Herpetology: Past, Present and Future
A. Márcia Barbosa, Neftalí Sillero, Fernando Martínez-Freiría and Raimundo Real

Chapter 9 Some Aspects of Phytoplankton and Ecosystem Modelling in Freshwater and Marine Environments: Consideration of Indirect Interactions, and the Implications for Interpreting Past and Future Overall Ecosystem Functioning
V. Krivtsov and C.F. Jago

Chapter 10 Modeling Population Dynamics, Division of Labor and Nutrient Economics of Social Insect Colonies
Thomas Schmickl and Karl Crailsheim

Chapter 11 Observation and Control in Density- and Frequency-Dependent Population Models
Manuel Gámez

Chapter 12 Environmental Noise and Nonlinear Relaxation in Biological Systems
B. Spagnolo, D. Valenti, S. Spezia, L. Curcio, N. Pizzolato, A. A. Dubkov, A. Fiasconaro, D. Persano Adorno, P. Lo Bue, E. Peri and S. Colazza

Chapter 13 Landscape Structural Modeling: A Multivariate Cartographic Exegesis
Alessandro Ferrarini

Chapter 14 Basic Concepts for Modelling in Different and Complementary Ecological Fields: Plants Canopies Conservation, Thermal Efficiency in Buildings and Wind Energy Producing
Mohamed Habib Sellami

Index

PREFACE

Ecological modeling is a fast-growing science. More and more innovative methodologies
and theories on ecological modeling are emerging around the world. This book presents
models, methods and theories on ecological modeling. Topics discussed include artificial
neural networks; individual-based modeling; ecological niche models; landscape and GIS
modeling; population dynamics; nutritional ecology; remote sensing; and decision support
systems. This book reflects recent achievements of scientists in the study of ecological
modeling.
Chapter 1 - Probability distribution functions have been widely used to model the spatial
distribution of arthropods. Aggregation types (i.e., randomly distributed, uniformly
distributed, aggregately distributed, etc.) of arthropods can be detected based on probability
distribution functions, but these functions cannot predict the abundance at a given location.
This study aimed to present an artificial neural network to simulate the spatial distribution of
arthropods. A response surface model and a spline function were compared and evaluated against
the neural network model for their simulation performance.
The results showed that the artificial neural network exhibited good simulation
performance. The simulated spatial distribution was in close accordance with the observed one.
Overall, the neural network performed better in the case of lower total abundance of
arthropods. The response surface model could fit the spatial distribution of arthropods, but its
simulation performance was worse than that of the neural network. Cross validation revealed that the neural
network performed better than the response surface model and the spline function in predicting
the spatial distribution of arthropods. A confidence interval of predicted abundance could be
obtained using randomized submission of quadrat sequences in the neural network
simulation. It is concluded that the artificial neural network is a valuable model for simulating the
spatial distribution of arthropods.
Chapter 2 – Remote sensing has generally demonstrated great potential in mapping
spatial patterns of vegetation. By employing the amount of reflected radiation at particular
regions of the electromagnetic spectrum, it is possible to make estimates of certain
characteristics of vegetation. The use of radiometric vegetation indices is a fast and efficient
method for vegetation monitoring, exploiting information acquired from remote sensing data.
These indices are dimensionless radiometric measures that generally function as indicators of
relative abundance and activity of green vegetation.
Throughout the years, a large number of multispectral vegetation indices have been
formulated. Each has a variable degree of efficiency in estimating one or more vegetation
parameters, such as health status, nutrient or water deficiency, crop yield, vegetation cover
fraction, leaf area index, absorbed photosynthetically active radiation, net primary production
and above-ground biomass. Additionally, some of them also consider atmospheric effects and/or
the soil background for an enhanced retrieval. The present chapter aims to provide an
overview of the use of radiometric vegetation indices developed over the last few decades,
utilizing spectral information acquired from multispectral optical remote sensing sensors.
The overview is preceded by an introduction to some important principles of remote sensing
relevant to the vegetation spectral response, as this was considered
necessary to better understand the context of the present overview.
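As a concrete illustration of a dimensionless radiometric index, the sketch below computes the Normalized Difference Vegetation Index (NDVI), a widely used index chosen here for illustration only; the abstract above does not name it. The function name and the reflectance values are hypothetical and are not taken from the chapter.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from near-infrared and red reflectance.

    NDVI = (NIR - Red) / (NIR + Red); dimensionless, in [-1, 1] for
    non-negative reflectances.
    """
    total = nir + red
    if total == 0:
        return None  # index is undefined when both bands are zero
    return (nir - red) / total


# Hypothetical (NIR, red) reflectance pairs, for illustration only.
pixels = {
    "dense canopy": (0.50, 0.05),
    "sparse cover": (0.30, 0.15),
    "bare soil": (0.25, 0.20),
}

for label, (nir, red) in pixels.items():
    print(f"{label}: NDVI = {ndvi(nir, red):.2f}")
```

Higher values indicate stronger near-infrared reflectance relative to red, which is generally associated with denser or more active green vegetation.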
Chapter 3 – According to the Water Framework Directive (WFD, 2000/60/EC),
Integrated River Basin Management Plans (RBMP) are required at different scales in order to
prevent, amongst other things, water resource deterioration and to ensure water pollution
reduction. An integrated river basin management approach underpins a risk-based land
management framework for all activities within a spatial land-use planning framework. To
this end, a risk assessment methodology is required to identify water pollution hazards in
order to set appropriate environmental objectives and in turn design suitable mitigation
measures. Surface water pollution as a result of Olive Mill Waste (OMW) discharge is a
serious hazard in the olive oil producing regions of the Mediterranean. However, there is no
standardised method to assess the risk of water pollution from olive mill waste for any given
river basin. The present chapter reports a study that addresses this issue by designing a
detailed risk assessment methodology, which uses GIS modelling to classify the risk of
water pollution from olive mill waste discharges for each sub-catchment within a watershed.
The chapter presents the proposed criteria and
calculations required to estimate sub-catchment risk significance and comments on the
method's potential for wider application. It combines elements from risk assessment
frameworks, Multi Criteria Analysis (MCA), and Geographic Information Systems (GIS).
MCA is used to aggregate different aspects and elements associated with this environmental
problem, while GIS modeling tools help to obtain many criterion values and provide
insight into how different objects interact in nature and how these interactions influence risk
at the watershed level. The proposed method was trialed in the Keritis watershed in Crete,
Greece and the results indicated that this method has the potential to be a useful guide to
prioritise risk management actions and mitigation measures which can subsequently be
incorporated in river basin management plans.
Chapter 4 - This chapter is devoted to the problem of population time series analysis with
various discrete-time models of population dynamics. Applications of various statistical
criteria, which are normally used for the determination of mathematical model parameters, are
discussed. Using a particular example of green oak leaf roller (Tortrix viridana L.)
population fluctuations, presented in publications by Rubtsov (1992) and
Korzukhin and Semevskiy (1992) for three different locations in Europe, the potential of
the approach under consideration for the analysis of population dynamics is demonstrated. For
approximation of the empirical datasets, well-known discrete-time models of population
dynamics (Kostitzin model, Skellam model, Moran–Ricker model, Morris–Varley–Gradwell
model, and discrete logistic model) were applied. For every model, the final decision
about whether the model can be used to approximate the datasets is based on
analysis of deviations between theoretical (model) and empirical trajectories: the
correspondence of the distribution of deviations to a Normal distribution with zero mean was
checked with Kolmogorov–Smirnov and Shapiro–Wilk tests, and the presence or absence of
serial correlation was determined with the Durbin–Watson criterion. It was shown that for two
empirical trajectories the Kostitzin model and the discrete logistic model give good
approximations; this means that the population dynamics can be explained as a result of the
influence of intra-population self-regulative mechanisms only. The third empirical
trajectory requires more complicated mathematical models for fitting.
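The workflow described for Chapter 4 (iterate a discrete-time model, compare it with an observed series, then test the deviations for serial correlation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's implementation: the parameterisations x(k+1) = a·x(k)·exp(−b·x(k)) for the Moran–Ricker model and x(k+1) = a·x(k)·(1 − b·x(k)) for the discrete logistic model are common textbook forms, and the observed series and parameter values are hypothetical. The normality tests mentioned in the abstract are omitted.

```python
import math


def ricker_step(x, a, b):
    """One step of the Moran-Ricker map: x_next = a * x * exp(-b * x)."""
    return a * x * math.exp(-b * x)


def discrete_logistic_step(x, a, b):
    """One step of the discrete logistic map, truncated at zero."""
    return max(0.0, a * x * (1.0 - b * x))


def trajectory(step, x0, n, a, b):
    """Iterate a one-step map n - 1 times, returning n values starting at x0."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(step(xs[-1], a, b))
    return xs


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals. Values near 2 suggest no
    first-order serial correlation."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical observed densities and parameters (illustration only).
observed = [10.0, 13.0, 14.5, 13.2, 14.1, 13.7, 14.0, 13.8]
model = trajectory(ricker_step, observed[0], len(observed), a=2.0, b=0.05)
residuals = [o - m for o, m in zip(observed, model)]
print("Durbin-Watson statistic:", durbin_watson(residuals))
```

In practice the parameters a and b would be estimated from the data (for example by least squares) before the deviations are examined.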
Chapter 5 - In the last decades, numerical modelling has gained increasing consensus in
the scientific world, and particularly in the framework of behavioural and population ecology.
Through numerical models it is possible to reconstruct what is observed in the environment or
in the laboratory and to get a more in-depth comprehension of the factors regulating the
phenomena under examination.
Numerous approaches have been developed in this framework, but probably one of the
most promising is individual-based modelling. With this type of approach it is relatively
straightforward to investigate aspects related to the ecology of a population starting from the
characterisation of processes taking place at the scale of the individual organism.
This contribution is intended to provide a general view of the main features of
individual-based models and of their peculiarities in comparison to other modelling strategies.
Special emphasis will be given to applications in the field of phyto- and zooplankton ecology
and behaviour, and results from the available literature on this topic will be used as examples.
Chapter 6 - The larval phase of most blowfly species is considered a critical
developmental period in which intense limitation of feeding resources frequently occurs.
Furthermore, such a period is characterised by complex ecological processes occurring at
both individual and population levels. These processes have been analysed by means of
traditional statistical techniques such as simple and multiple linear regression models.
Nonetheless, it has been suggested that some important explanatory variables could well
introduce non-linearity into the modelling of the nutritional ecology of blowflies. In this
context, dynamic aspects of the life history of blowflies could be clarified and detailed by the
deployment of machine learning approaches such as artificial neural networks (ANNs), which
are mathematical tools widely applied to the resolution of complex problems. A
distinguishing feature of neural network models is that their effective implementation is not
precluded by the theoretical distribution of the data used. Therefore, the principal aim of this
investigation was to use neural network models (namely multi-layer perceptrons and fuzzy
neural networks) in order to ascertain whether these tools would be able to outperform a
general quadratic model (that is, a second-order regression model with three predictor
variables) in predicting pupal weight values (outputs) of experimental populations of
Chrysomya megacephala (F.) (Diptera: Calliphoridae), using initial larval density (number of
larvae), amount of available food, and pupal size as input variables. These input variables
may have generated non-linear variation in the output values, and fuzzy neural networks
provided more accurate outcomes than the general quadratic model (i.e. the statistical model).
The superiority of fuzzy neural networks over a regression-based statistical method does
represent an important fact, because more accurate models may well clarify several intricate
aspects regarding the nutritional ecology of blowflies. Additionally, the extraction of fuzzy
rules from the fuzzy neural networks provided an easily comprehensible way of describing
what the networks had learnt.
Chapter 7 - An ecological-based decision-support system and corresponding algorithmic
analogue for managing natural black spruce (Picea mariana (Mill) BSP.) and jack pine (Pinus
banksiana Lamb.) mixed stands were developed. The integrated hierarchical system consisted
of six sequentially-linked estimation modules. The first module consisted of a key set of
empirical yield-density relationships and theoretically-based functions derived from allometry
and self-thinning theory that were used to describe overall stand dynamics including temporal
size-density interrelationships and expected stand development trajectories. The second
module was comprised of a Weibull-based parameter prediction equation system and an
accompanying composite height-diameter function that were used to recover diameter and
height distributions. The third module included a set of species-specific composite taper
equations that were used to derive log product distributions and volumetric yields. The fourth
module was composed of a set of species-specific allometric-based composite biomass
equations that were used to estimate mass distributions and associated carbon-based
equivalents for each above-ground component (bark, stem, branch and foliage). The fifth
module incorporated a set of species-specific end-product and value equations that were used
to predict chip and lumber volumes and associated monetary equivalents by sawmill type
(stud and randomized length mill configurations). The sixth module encompassed a set of
species-specific composite equations that were used to derive wood and log quality metrics
(specific gravity and mean maximum branch diameter, respectively). The stand dynamic and
structural recovery modules were developed employing 382 stand-level measurements
derived from 155 permanent and temporary sample plots situated throughout the central
portion of the Canadian Boreal Forest Region; the taper and end-product modules were
developed employing published results from taper and sawmill simulation studies; and the
biomass and fibre attribute modules were developed using data from density control
experiments.
The potential of the system in facilitating the transformative change towards the
production of higher value end-products and a broader array of ecosystem services was
exemplified by simultaneously contrasting the consequences of density management regimes
involving commercial thinning treatments in terms of overall productivity, end-product
yields, economic efficiency, and ecological impact. This integration of quantitative
relationships derived from applied ecology, plant population biology and forest science into a
common analytical platform illustrates the synergy that can be realized through a
multi-disciplinary approach to forest modeling.
Chapter 8 – The authors present a review of the concepts and methods associated with
ecological niche modeling, illustrated with the published works on amphibians and reptiles of
the Mediterranean Basin, one of the world's biodiversity hotspots for conservation priorities.
They start by introducing ecological niche models, analyzing the various concepts of niche
and the modeling methods associated with each of them. The authors list some conceptual and
practical steps that should be followed when modeling, and highlight the pitfalls that should
be avoided. The authors then outline the history of ecological modeling of Mediterranean
amphibians and reptiles, including a variety of aspects: identification of the ecological niche;
detection of common distribution areas (chorotypes) and other biogeographical patterns;
analysis and prediction of species richness patterns; analysis of the expansion of native and
invasive species; integration of molecular data with spatial modeling; identification of contact
zones between related taxa; assessment of species' conservation status; and prediction of
future conservation problems, including the effects of global change. They conclude this
review with a discussion of the research that still needs to be developed in this area.
Chapter 9 - Numerical techniques (e.g. correlation, multiple regression and factor
analysis, path analysis, methods of network analysis, and, in particular, simulation modelling)
may be very helpful in investigations of indirect relationships in aquatic ecosystems. Here we
give a brief overview of some examples of the relevant studies, and focus on 1) a case study
of a freshwater eutrophic lake, where statistical analysis of the datasets obtained within a
comprehensive monitoring programme, together with sensitivity analysis of the mathematical model
'Rostherne', helped to reveal previously overlooked relationships between the Si and P
biogeochemical cycles coupled through the dynamics of primary producers, and 2) an
overview of how the coupling of physical, chemical, and biological processes in marine
ecosystem models offers a basis for investigations of indirect interactions in continental shelf
seas. Complex aquatic ecosystem models provide a numerical simulation of biogeochemical
fluxes underpinned by coupling physical forcing functions with definitions simulating
biological and chemical processes, and offer a potential for quantitative interpretation of
sediment proxies in the stratigraphic record. Combination of models and sediment proxies,
calibrated by training sets, can provide information on water column structure, surface
heating, mixing, and water depth, thus providing a basis for reconstructing the past and
predicting future environmental dynamics.
Chapter 10 - In the evolution of social insects, the colony and not the (often sterile)
individual worker should be considered the major unit of selection. Thus, social insect
colonies are considered to be 'super-organisms', which have – like all other organisms – to
perform behaviors which affect their outside environment and which alter their own future
internal status. The way these behaviors are coordinated is by means of communication,
which is either direct or indirect and which involves information exchange either by
transmitting signals or by exploiting cues. Therefore, social insect colonies perform
information processing in a rather similar way as multicellular organisms do, where behaviors
result from the exchange of information among their sub-modules (cells). In many cases, self-
organization allows a colony to evaluate massive amounts of information in parallel and to
decide about the colony's future behavioral responses. Many feedback systems that govern
self-organization of workers have been investigated empirically and theoretically. Here, the
authors discuss models which have been proposed to explain division of labor and task
selection in social insects. The authors demonstrate how the collective regulation of labor in
eusocial insect colonies is studied by means of top-down modeling and by bottom-up models,
often analyzed with multi-agent computer simulations.
Chapter 11 - The paper is a review of a research line initiated two decades ago. At the
beginning the research concentrated on basic qualitative properties of ecological and
population-genetic models, such as observability and controllability. For a population system,
observability means that, e.g., from partial observation of the system (observing only certain
indicator species), in principle the whole state process can be recovered. Recently, for
different ecosystems, so-called observer systems (or state estimators) have been
constructed that enable us to effectively estimate the whole state process from the
observation. The methodology of observer design can also be applied to estimate unknown
changes in ecological parameters of the system. Clearly, both observation (i.e. monitoring)
and control are important issues in conservation ecology. For an ecological system, in an
appropriate setting, controllability implies that a disturbed ecosystem can be steered back to
an equilibrium state by an abiotic human intervention. Recent research concerns the effective
calculation of such control functions. While the considered ecological models are
density-dependent, observability and controllability problems also naturally arise in
frequency-dependent models of population genetics. As for the frequency-dependent case, observation
systems typically occur in case of phenotypic observation of genetic processes; control
systems can be used to model e.g. artificial selection. In this survey, in addition to the basic
methodology and its applications, the recent developments of the field are also reported.
Chapter 12 – The authors analyse the effects of environmental noise in three different
biological systems: (i) mating behavior of individuals of Nezara viridula (L.) (Heteroptera:
Pentatomidae); (ii) polymer translocation in crowded solution; (iii) an ecosystem described by
a Verhulst model with a multiplicative Lévy noise. Specifically, they report on experiments
on the behavioral response of N. viridula individuals to sub-threshold deterministic signals in
the presence of noise. The authors analyze the insect response by directionality tests
performed on a group of male individuals at different noise intensities. The percentage of
insects which react to the sub-threshold signal shows a nonmonotonic behavior, characterized
by the presence of a maximum, for increasing values of the noise intensity. This is the
signature of the non-dynamical stochastic resonance phenomenon. By using a "hard"
threshold model the authors find that the maximum of the signal-to-noise ratio occurs in the
same range of noise intensity values for which the behavioral activation shows a maximum.
In the second system, the noise-driven translocation of short polymers in crowded solutions is
analyzed. An improved version of the Rouse model for a flexible polymer has been adopted
to mimic the molecular dynamics, by taking into account the interactions between
adjacent monomers and introducing a Lennard-Jones potential between non-adjacent beads. A
bending recoil torque has also been included in the model. The polymer dynamics is
simulated in a two-dimensional domain by numerically solving the Langevin equations of
motion. Thermal fluctuations are taken into account by introducing a Gaussian uncorrelated
noise. The mean first translocation time of the polymer center of inertia shows a minimum as
a function of the frequency of the oscillating forcing field. In the third system, the
transient dynamics of the Verhulst model perturbed by arbitrary non-Gaussian white noise is
investigated. Based on the infinitely divisible distribution of the Lévy process, the authors study the
nonlinear relaxation of the population density for three cases of white non-Gaussian noise: (i)
shot noise, (ii) noise with a probability density of increments expressed in terms of the Gamma
function, and (iii) Cauchy stable noise. The authors obtain exact results for the probability
distribution of the population density in all cases, and for Cauchy stable noise the exact
expression of the nonlinear relaxation time is derived. Moreover, starting from an initial delta
function distribution, they find a transition induced by the multiplicative Lévy noise from a
trimodal probability distribution to a bimodal probability distribution in the asymptotic regime. Finally,
the authors find a nonmonotonic behavior of the nonlinear relaxation time as a function of the
Cauchy stable noise intensity.
Chapter 13 - Landscape modelling is founded on the idea that the patterning of landscape
elements strongly influences ecological characteristics, thus the ability to quantify landscape
structure is a prerequisite to the study of landscape function and change over time as well. For
this reason, much emphasis has been placed until now on developing methods to quantify
landscape structure.
Unfortunately, on the one hand, landscape (i.e., landcover or landuse) and vegetation maps are
very complex mosaics of thousands of patches, and this makes the interpretation of their
structure very challenging. On the other hand, methods developed so far to quantify landscape
structure just return numerical results that are not linked to cartographic outputs. Last,
landscape pattern indices are numerous, and the need for a synthetic representation is
increasingly pressing.
I provide here the description and application of a novel approach to landscape structural
modelling based on the combined use of GIS (Geographical Information Systems) and
multivariate statistics. First, landscape structure of the study area (Ceno valley, Italy) is
analyzed through 5 patch-based, non-redundant indicators (area, isolation, compactness,
shape complexity, interspersion) with indirect link to functional aspects. Second, PCA
(principal component analysis) is used in order to synthesize structural indicators, and
cartographic output is given. Third, KCA (k-means cluster analysis) is applied in order to
group landscape patches into homogeneous clusters, and again GIS output is supplied. Last,
LDA (linear discriminant analysis) is employed to provide evidence for the differences
among clusters.
This modelling approach provides the chance for a deep and cost-effective exegesis of
landscape structure, with promising consequences on conjecture formulation about functional
aspects as well.
Chapter 14 - Nowadays, climatic change, manifested by intense and sudden precipitation,
violent wind and long drought, directly damages plant canopies
(forests, sylviculture, oases, pastoral lands and agricultural fields), thereby threatening the human
food supply from both plants and animals (caprine, ovine, bovine, cameline, etc.), exhausting
water resources, and increasing the need for energy in buildings used for all activities (industrial,
agricultural and services). What solutions can ecological modelling contribute,
in the short and long term, to buffer the effects of climatic change and to meet the
need for food and clean energy for humans? In this chapter we present the basic concepts
for modelling the plant architecture (species, densities, positions and orientation) most
adaptable to sudden calamity, the energy use efficiency of buildings (construction
materials, insulation system, organisation of accessories and apparatus), and the production of
clean energy from wind (finding wind sources and evaluating regional wind
potential offshore and inshore, designing wind turbines and testing their efficiencies).
In: Ecological Modeling ISBN: 978-1-61324-567-5
Editor: WenJun Zhang, pp. 1-14 © 2012 Nova Science Publishers, Inc.

Chapter 1

ARTIFICIAL NEURAL NETWORK SIMULATION OF
SPATIAL DISTRIBUTION OF ARTHROPODS:
A MULTI-MODEL COMPARISON

WenJun Zhang 1,2,* and GuangHua Liu 3,+

1 Research Institute of Entomology, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
2 International Academy of Ecology and Environmental Sciences, Hong Kong
3 Guangdong AIB Polytech College, Guangzhou 510507, China

ABSTRACT
Probability distribution functions have been widely used to model the spatial
distribution of arthropods. Aggregation types (i.e., randomly distributed, uniformly
distributed, aggregately distributed, etc.) of arthropods can be detected based on
probability distribution functions, but these functions cannot predict the abundance at a
given location. This study aimed to present an artificial neural network to simulate the
spatial distribution of arthropods. A response surface model and a spline function were
compared and evaluated against the neural network model for their simulation
performance.
The results showed that the artificial neural network exhibited good simulation
performance. The simulated spatial distribution was in close accordance with the observed
one. Overall, the neural network performed better in the case of lower total abundance of
arthropods. The response surface model could fit the spatial distribution of arthropods, but its
simulation performance was worse than that of the neural network. Cross validation revealed that
the neural network performed better than the response surface model and the spline function in
predicting the spatial distribution of arthropods. A confidence interval of predicted abundance
could be obtained using randomized submission of quadrat sequences in the neural
network simulation. It is concluded that the artificial neural network is a valuable model for
simulating the spatial distribution of arthropods.

* Correspondence: zhwj@mail.sysu.edu.cn.
+ Correspondence: ghliu@gdaib.edu.cn.

Keywords: Artificial neural network; response surface model; spline function; arthropods;
spatial distribution; simulation.

1. INTRODUCTION
Arthropods account for 90% of global species. The biomass of arthropods reaches 1,000 kg/ha
in a temperate grassland, which is lower than that of plants (20,000 kg/ha) and
microorganisms (7,000 kg/ha) but higher than that of mammals (1.2 kg/ha) and birds
(0.3 kg/ha) (Pimental et al., 1992). They control the structures and functions of ecosystems
(Wilson, 1987). Arthropods have been used as sensitive indicators of environmental health
(Brown, 1991; Kremen et al., 1993).
In ecological research, spatial distribution means the two-dimensional distribution of
animal or plant individuals in the field. Many methods, such as GIS (Dantas et al., 2009) and
probability theory, are used in spatial pattern analysis. In arthropod research, many
probability distribution functions have been developed and used to describe the spatial
distribution of arthropod individuals (Krebs, 1989; Zhang, 2007b). In such methods the
number of individuals found in a sample (plot, quadrat, etc.) is assumed to be a random
variable that follows some probability distribution, e.g., the binomial, Poisson, negative
binomial, or Neyman’s distribution. Because these functions contain no spatial variables,
they cannot be used to predict the abundance at a given location (Krebs, 1989; Zhang,
2007b). This is a substantial problem arising from the use of probability distribution
functions in spatial distribution research: spatial information is not available from those
models. Owing to the lack of theoretical background, it is also hard to construct a
mechanistic model that calculates individual distribution from spatial information.
Questions on spatial distribution are therefore data-driven (Schultz and Wieland, 1997). The
relationship between individual distribution and spatial information is usually nonlinear
(Pastor-Barcenas et al., 2005).
Artificial neural networks are known to be flexible and adaptable function approximators
for nonlinear relationships (Bianconi et al., 2010; Cereghino et al., 2001; Marchant and
Onyango, 2003; Acharya et al., 2006; Filippi and Jensen, 2006; Nour et al., 2006; Zhang,
2007a,b; Zhang and Barrion, 2006; Zhang et al., 2007; Zhang et al., 2008). They can offer the
advantages of simplified and more automated model synthesis and analytical input-output
models (Abdel-Aal, 2004; Tan et al., 2006). Many applications of neural networks have been
reported. For example, they are considered more effective in time series prediction than
earlier procedures based on dynamical system theory (Ballester et al., 2002). They have been
used to forecast short- and medium-term concentration levels (Viotti et al., 2002), in
subsurface modeling (Almasri and Kaluarachchi, 2005), to model hourly temperature with the
alternative abductive networks (Abdel-Aal, 2004), to model sediment transfer (Abrahart and
White, 2001), for rule extraction (Drumm et al., 1999), to predict subsurface drain outflow,
nitrate-nitrogen concentration in tile effluent, and surface ozone (Sharma et al., 2003;
Pastor-Barcenas et al., 2005), to estimate endoparasitic load using morphological
descriptors (Loot et al., 2002), to describe nitrogen dioxide dispersion using
backpropagation (BP) networks (Nagendra and Khare, 2006), and to predict reservoir
eutrophication (Kuo et al., 2007). Empirical models have regained popularity in recent
years due to the complexity and nonlinearity of ecosystems (Tan et al., 2006). Various
conventional models, including empirical models, have thus been compared with neural
networks in terms of simulation performance. For instance, a neural network was shown to be
superior to linear models, generalized additive models, and classification and regression
trees (Moisen and Frescino, 2002). Neural networks were also shown to outperform other
models, such as multiple regression, logistic regression, and multiple discriminant models,
in predicting the number of salmonids and community composition (McKenna, 2005; Olden et
al., 2006). In research on arthropods and related taxonomic groups, neural networks have
been used for simulation and prediction. A stream classification based on characteristic
invertebrate species assemblages was satisfactorily conducted using a self-organizing map
(SOM) neural network (Cereghino et al., 2001). Neural networks were used to explain the
observed structure of functional feeding groups of aquatic macro-invertebrates (Jorgensen et
al., 2002). A SOM neural network was used to determine pest species assemblages for global
regions (Worner and Gevrey, 2006). BP and radial basis function (RBF) neural networks were
used to simulate and predict species richness of rice arthropods (Zhang and Barrion, 2006),
reconstruct the spatial pattern of insects (Zhang et al., 2007), and simulate survival
dynamics of insects (Zhang and Zhang, 2008).
This study aimed to present several models and to evaluate their effectiveness in the
simulation of the spatial distribution of arthropods. Arthropods were investigated on a
grassland. A neural network and a partial differential equation were developed to model the
above-ground distribution of arthropods (Zhang, 2010). The models were validated and
compared for their predictive power. Details of developing and using the neural network are
discussed.

2. MATERIALS AND METHODS


2.1. Field Investigation

The investigation was conducted on a grassland with 8×8 quadrates. Each quadrate had an
area of 1×1 m². Arthropods were collected, identified, and counted for every quadrate.
Insects were sorted and identified to order level, and the other arthropods were identified
to class level.

2.2. Artificial Neural Network

The artificial neural network for simulating spatial distribution of arthropods is a
mapping from input space (with the spatial coordinates of quadrate as the element) to output
space (with number of arthropod individuals in the quadrate as the element),
$U: \mathbb{R}^2 \to \mathbb{R}$ and $u(\mathbf{x}) = v$, where
$u \in U = \{u \mid u: \mathbb{R}^2 \to \mathbb{R}\}$. For an input set,
$\mathbf{x}_i \in \mathbb{R}^2$, and the output set, $v_i \in \mathbb{R}$, there is a mapping
$f$ that satisfies $f(\mathbf{x}_i) = v_i$, $i = 1, 2, \ldots, n$. A mapping
$u \in U = \{u \mid u: \mathbb{R}^2 \to \mathbb{R}\}$, represented by this network, should
approximate $f(\mathbf{x})$ and satisfy the following condition:

$|u(\mathbf{x}) - f(\mathbf{x})| < \varepsilon, \quad \mathbf{x} \in \mathbb{R}^2$

where $\mathbf{x} = (x, y)^T$, and $\varepsilon > 0$ is the known threshold for error.


A three-layer neural network was developed for simulating spatial distribution of
arthropods (Figure 1; Zhang, 2010). Both the first and second layers contained thirty
neurons, and a bias was applied to each layer. The transfer functions for layers 1 to 3 were
the hyperbolic tangent sigmoid transfer function:

tansig(x) = 2/(1+exp(-2*x))-1,

the logarithmic sigmoid transfer function:

logsig(x) = 1/(1+exp(-x)),

and the linear transfer function:

purelin(x) = x,

respectively. The network, and the weights and bias of each layer, were initialized by a
function that initializes each layer i (i=1,2,3) according to its own initialization
function (Hagan et al., 1996; Mathworks, 2002; Fecit, 2003). The network was trained using
the Levenberg-Marquardt backpropagation algorithm. The performance function was the mean
squared error (mse). The first and second layers received inputs from the input space and
produced outputs for the third layer. There was a closed loop for the third layer. For each
layer, the net input function calculated the layer’s net input by combining its weighted
inputs and biases.
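
As a minimal sketch of the three transfer functions and the net input described above, the following Python snippet may help. It is not the authors' Matlab code: the function `net_input` and its argument names are illustrative assumptions, and only the formulas for tansig, logsig and purelin are taken from the text.

```python
import numpy as np

def tansig(x):
    """Hyperbolic tangent sigmoid, 2/(1+exp(-2x)) - 1 (numerically equal to tanh)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def logsig(x):
    """Logarithmic sigmoid, 1/(1+exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def purelin(x):
    """Linear transfer function."""
    return x

def net_input(weights, inputs, bias):
    """Net input of a layer: weighted inputs plus bias (hypothetical helper)."""
    return weights @ inputs + bias

# Example: one neuron receiving the coordinates (3, 4) of a quadrate.
z = net_input(np.array([[1.0, 2.0]]), np.array([3.0, 4.0]), np.array([0.5]))
activation = tansig(z)
```

The output of tansig lies in (-1, 1) and that of logsig in (0, 1), so only the linear third layer can produce unbounded abundance values.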

Figure 1. Artificial neural network developed in present study.



Mathematically, the network output is:

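$f(\mathbf{x}) \approx u(\mathbf{x}) = \sum_{k=1}^{3} \omega_k a_k(\cdot)$    (1)

where

$a_1(\cdot) = 2/(1 + \exp(-2(\omega_{11} x + \omega_{21} y + b_1))) - 1$

$a_2(\cdot) = 1/(1 + \exp(-(\omega_{12} x + \omega_{22} y + b_2)))$

$a_3(\cdot) = \sum_{k=1}^{3} \omega_{k3} a_k(\cdot) + b_3$

In eq. (1), $\mathbf{x} = (x, y)^T$ is the input and $u = u(\mathbf{x})$ is the output;
$\omega_i$, $i = 1, 2, 3$; $\omega_{ij}$, $i, j = 1, 2$; $\omega_{i3}$, $i = 1, 2, 3, \ldots$;
and $b_i$, $i = 1, 2, 3$, are the parameters.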
The artificial neural network was developed using Matlab (Mathworks, 2002). Simulation
performance of the neural network was expressed as mse, Pearson correlation coefficient, and
significance level for the linear regression between the simulated and observed abundances.

2.3. Response Surface Model (RSM)

The response surface model (He, 2001; Mathworks, 2002), i.e., the trend surface model
(Zhang and Fang, 1982), was also used in the present simulation:

$u(\mathbf{x}) = a + \mathbf{b}^T \mathbf{x} + \mathbf{x}^T \mathbf{c}\, \mathbf{x}$    (2)

where $u(\mathbf{x})$ is the arthropod abundance (individuals per quadrate);
$\mathbf{x} = (x, y)^T$ is the coordinate of the quadrate; $\mathbf{b} = (b_1, b_2)^T$ and
$\mathbf{c} = (c_1, c_2)^T$ are parametric vectors; and $a$ is a constant.
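
A quadratic trend surface of this kind can be fitted by ordinary least squares. The sketch below is an illustration only: it assumes that the quadratic term in eq. (2) expands to c1·x² + c2·y² (no cross term), which the chapter does not state explicitly, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def design_matrix(coords):
    """Columns 1, x, y, x^2, y^2 for u = a + b1*x + b2*y + c1*x^2 + c2*y^2 (assumed form)."""
    x, y = coords[:, 0], coords[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x**2, y**2])

def fit_trend_surface(coords, counts):
    """Least-squares estimate of (a, b1, b2, c1, c2)."""
    params, *_ = np.linalg.lstsq(design_matrix(coords), counts, rcond=None)
    return params

def predict_trend_surface(params, coords):
    """Abundance predicted by the fitted surface at the given coordinates."""
    return design_matrix(coords) @ params

# Synthetic 8x8 grid of quadrate coordinates (1..8 in each direction).
gx, gy = np.meshgrid(np.arange(1, 9), np.arange(1, 9))
coords = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)

# Made-up parameters used only to generate noise-free test data.
true_params = np.array([2.0, 0.5, -0.3, 0.1, 0.05])
counts = predict_trend_surface(true_params, coords)
fitted = fit_trend_surface(coords, counts)
```

With noise-free synthetic data the fit recovers the generating parameters; field counts would of course leave residuals.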

2.4. Spline Function

Spline interpolation is one of the most efficient interpolation models, among which
cubic spline interpolation is widely used (Burden and Faires, 2001). The cubic spline
function used in the present study was:

$u(x) = \frac{M_{i+1}(x - x_i)^3}{6 l_i} + \frac{M_i (x_{i+1} - x)^3}{6 l_i} + \left(\frac{f(x_{i+1})}{l_i} - \frac{M_{i+1} l_i}{6}\right)(x - x_i) + \left(\frac{f(x_i)}{l_i} - \frac{M_i l_i}{6}\right)(x_{i+1} - x)$,

$x \in [x_i, x_{i+1}]$, $i = 0, 1, \ldots, n-1$    (3)

where $x_i = i + 1$ and $M_i = S''(x_i)$, $i = 0, 1, \ldots, n$; $l_i = x_{i+1} - x_i$,
$i = 0, 1, \ldots, n-1$. The $M_i$, $i = 0, 1, \ldots, n$, were obtained from the
three-moment (bending moment) equations (Zhang, 2007b).
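
The following sketch shows one way the moments M_i and eq. (3) might be computed in one dimension. The chapter does not state the boundary conditions, so natural boundary conditions (M_0 = M_n = 0) are assumed here; the function names and sample values are illustrative.

```python
import numpy as np

def spline_moments(xs, fs):
    """Solve the three-moment equations for M_i, assuming a natural spline (M_0 = M_n = 0)."""
    n = len(xs) - 1
    lengths = np.diff(xs)                      # l_i = x_{i+1} - x_i
    A = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    A[0, 0] = 1.0                              # assumed boundary condition M_0 = 0
    A[n, n] = 1.0                              # assumed boundary condition M_n = 0
    for i in range(1, n):
        A[i, i - 1] = lengths[i - 1]
        A[i, i] = 2.0 * (lengths[i - 1] + lengths[i])
        A[i, i + 1] = lengths[i]
        rhs[i] = 6.0 * ((fs[i + 1] - fs[i]) / lengths[i]
                        - (fs[i] - fs[i - 1]) / lengths[i - 1])
    return np.linalg.solve(A, rhs)

def spline_eval(x, xs, fs, M):
    """Evaluate eq. (3) on the interval [x_i, x_{i+1}] that contains x."""
    i = int(np.clip(np.searchsorted(xs, x) - 1, 0, len(xs) - 2))
    li = xs[i + 1] - xs[i]
    return (M[i + 1] * (x - xs[i]) ** 3 / (6 * li)
            + M[i] * (xs[i + 1] - x) ** 3 / (6 * li)
            + (fs[i + 1] / li - M[i + 1] * li / 6) * (x - xs[i])
            + (fs[i] / li - M[i] * li / 6) * (xs[i + 1] - x))

xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])       # nodes x_i = i + 1
fs = np.array([3.0, 1.0, 4.0, 1.0, 5.0])       # made-up abundances at the nodes
M = spline_moments(xs, fs)
value_between_nodes = spline_eval(2.5, xs, fs, M)
```

By construction the spline reproduces the data at every node, which is consistent with its use as a fitting model and says nothing about how well it predicts removed quadrates.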

2.5. Data Description

1) Training data. In the simulation of spatial distribution, a total of 64 quadrates (n=64)
   were used to train the neural network and the response surface model. The input space
   was a two-dimensional space (coordinates of quadrate, e.g., (1,2), (5,7), etc.), and the
   output space was a one-dimensional space (arthropod abundance).
2) Cross validation. There are several cross validation methods. We used a widely
   applicable method (Olden et al., 2006). Using this method, each quadrate was removed in
   turn from the input set of 64 quadrates, the remaining quadrates were used to train the
   model, and the trained model was used to predict the removed quadrate. As a consequence,
   cross validation could be conducted within the data set of the same study. The predicted
   and observed arthropod abundances were compared, and the Pearson correlation coefficient
   (r) and its statistical significance were calculated to validate the models.
3) Quadrates were submitted to the neural network in two ways, i.e., in their fixed
   sequences and in randomized sequences.
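
The leave-one-out procedure in item 2 can be sketched as below. This is an illustration under stated assumptions: the `fit` and `predict` callables are hypothetical placeholders for any of the three models, a simple plane fit stands in for them here, and the data are synthetic. The significance test of r is omitted.

```python
import numpy as np

def leave_one_out(coords, counts, fit, predict):
    """Remove each quadrate in turn, train on the rest, predict the removed one."""
    n = len(counts)
    predicted = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        model = fit(coords[keep], counts[keep])
        predicted[i] = predict(model, coords[i:i + 1])[0]
    r = np.corrcoef(predicted, counts)[0, 1]       # Pearson correlation coefficient
    mse = np.mean((predicted - counts) ** 2)
    return predicted, r, mse

def fit_plane(coords, counts):
    """Stand-in model (assumption): u = a + b1*x + b2*y fitted by least squares."""
    design = np.column_stack([np.ones(len(coords)), coords])
    params, *_ = np.linalg.lstsq(design, counts, rcond=None)
    return params

def predict_plane(params, coords):
    return np.column_stack([np.ones(len(coords)), coords]) @ params

# Synthetic 8x8 grid with made-up, exactly planar abundances.
gx, gy = np.meshgrid(np.arange(1, 9), np.arange(1, 9))
coords = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
counts = 1.0 + 2.0 * coords[:, 0] + 3.0 * coords[:, 1]

predicted, r, mse = leave_one_out(coords, counts, fit_plane, predict_plane)
```

Because every prediction comes from a model that never saw the quadrate being predicted, r and mse here measure predictive rather than fitting performance.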

3. RESULTS
Most arthropods found on the grassland were insects, which belong to the orders
Homoptera (523 individuals), Orthoptera (230 individuals), Hymenoptera (110 individuals),
Coleoptera (55 individuals), and Diptera (40 individuals), etc. Other arthropods were sparsely
distributed on the grassland. The spatial distribution of arthropods exhibited a saddle-like
shape, which was similar to the most abundant Homoptera (Figure 2).

Figure 2. Observed spatial distribution of individuals of arthropods, Orthoptera, Hymenoptera, and
Homoptera on the grassland.

3.1. Simulating Spatial Distribution with Neural Network and Response Surface Model

The artificial neural network developed above (Figure 1, eq. (1)) was used to simulate the
spatial distributions of arthropods and of the most abundant orders Orthoptera, Hymenoptera,
and Homoptera. The neural network was trained for 10000 epochs and the desired accuracy
(mse) was 0.00001. The results revealed that the neural network exhibited excellent
simulation performance. The simulated spatial distribution coincided with the observed one
(intercept≈0, slope≈1, r≈1, p<0.0001), as illustrated by Figure 3.
Using a deviation function, $\mathrm{stmse} = \mathrm{mse}/u^2$, where $u$ is the average
number of individuals per quadrate, together with Figure 4, it was found that lower
abundance overall led the neural network to yield better simulation performance (Arthropods
($\mathrm{stmse} = 6.39 \times 10^{-3}$), Homoptera ($\mathrm{stmse} = 2.16 \times 10^{-2}$),
Orthoptera ($\mathrm{stmse} = 5.88 \times 10^{-7}$), Hymenoptera
($\mathrm{stmse} = 1.46 \times 10^{-6}$)).
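
The deviation function can be illustrated with a short sketch. It assumes that u is the mean of the observed counts per quadrate, which is one reading of the definition above; the numbers are made up for illustration and are not the study's data.

```python
import numpy as np

def stmse(observed, simulated):
    """Standardized mse: mse divided by the squared mean observed abundance (assumed)."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    mse = np.mean((simulated - observed) ** 2)
    return mse / np.mean(observed) ** 2

# Made-up counts for three quadrates: mse = 3, mean = 4, so stmse = 3/16.
example = stmse([2, 4, 6], [2, 4, 9])
```

Dividing by the squared mean makes the error dimensionless, so taxa with very different abundances can be compared on the same scale.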

Figure 3. Simulating spatial distribution of arthropods using neural network. Quadrates were submitted
to neural network in their fixed sequences.

The response surface model could also fit the spatial distribution of arthropods. However,
its simulation performance was lower than that of the neural network (Figure 4).

Figure 4. Simulating spatial distribution of arthropods using response surface model.

3.2. Cross Validation of Models

The cross validation, in which quadrates were submitted to the neural network in their
fixed sequences, demonstrated that the neural network performed much better than the
response surface model in predicting unknown quadrates (Figure 5). In most cases, the
response surface model yielded a negative correlation between the predicted and observed
abundances. As a result, the response surface model is not recommended for predicting the
spatial distribution of arthropods.
In contrast to the simulation results above, the neural network exhibited better prediction
performance in the case of larger abundance than in the case of lower abundance on the
grassland (Figure 5). This means that, compared with simulation, the neural network needs
more information to train itself to produce a reasonable extrapolation.
In an additional cross validation of the neural network for predicting the spatial
distribution of arthropods, quadrates were submitted in randomized sequences and five
randomizations were used. The results showed that the neural network yielded better
performance (r=0.5323, p<0.0001). More than 60% of the quadrates were correctly predicted,
that is, they fell inside the 95% confidence intervals of the predicted abundances
(Figure 6).
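
The chapter does not state how the 95% confidence intervals were derived from the five randomizations. One plausible construction, shown here purely as an assumption, is a Student t interval around the mean prediction of each quadrate across runs, followed by counting how many observed values fall inside. All names and numbers below are hypothetical.

```python
import numpy as np

T_975_DF4 = 2.776  # two-sided 95% Student t critical value for 4 degrees of freedom (5 runs)

def prediction_interval(runs, t_crit=T_975_DF4):
    """Per-quadrate interval, mean +/- t * standard error, over repeated runs."""
    runs = np.asarray(runs, dtype=float)           # shape (n_runs, n_quadrates)
    mean = runs.mean(axis=0)
    sem = runs.std(axis=0, ddof=1) / np.sqrt(runs.shape[0])
    return mean - t_crit * sem, mean + t_crit * sem

def coverage(observed, lower, upper):
    """Fraction of observed abundances lying inside their intervals."""
    observed = np.asarray(observed, dtype=float)
    return np.mean((observed >= lower) & (observed <= upper))

# Made-up predictions for two quadrates from five randomized runs.
runs = [[10, 1], [12, 1], [11, 1], [9, 1], [13, 1]]
lower, upper = prediction_interval(runs)
share_inside = coverage([12, 3], lower, upper)
```

With only five runs the intervals are wide and sensitive to single runs, which fits the later suggestion in the chapter that more randomizations be used.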

Figure 5. Cross validation of neural network and response surface model for predicting spatial
distribution of arthropods. Quadrates were submitted to neural network in their fixed sequences.

Figure 6. Cross validation of neural network for predicting distribution of arthropods. Quadrates were
submitted to neural network in randomized sequences. Five randomizations were conducted.

The cross validation indicated that the spline function performed worse than both the
neural network and the response surface model (Table 1).

Table 1. Cross validation of spline interpolation

Taxon        Regression equation                  r        p               mse
Arthropods   Observed=15.6520-0.0041*Simulated    -0.0100  >0.01 (0.9341)  1164.4
Orthoptera   Observed=3.8052-0.0545               -0.0975  >0.01 (0.4436)  37.7524
Hymenoptera  Observed=1.7092+0.0067*Simulated     0.0100   >0.01 (0.9289)  15.4195
Homoptera    Observed=8.3154-0.0148*Simulated     -0.0316  >0.01 (0.8072)  445.9147

4. CONCLUSION AND DISCUSSION


An artificial neural network is developed to model spatial distribution. It exhibits
excellent performance in the simulation of the spatial distribution of arthropods: the
simulated spatial distribution coincides with the observed one. Overall, lower total
abundance leads the neural network to yield better simulation performance. The response
surface model and the spline function can also be used to fit the spatial distribution of
arthropods, but the simulation performance of the response surface model is lower than that
of the neural network. The cross validation confirms that the neural network performs much
better than the response surface model and the spline function in predicting unknown
quadrates. Submitting quadrates in randomized sequences helps to yield a confidence interval
for the result of the neural network simulation.
There are many kinds of artificial neural networks, and new neural network algorithms
continue to be developed in various sciences. Both classic algorithms, like BP, SOM, and RBF
(Nagendra and Khare, 2006; Worner and Gevrey, 2006; Zhang and Barrion, 2006), and newly
developed algorithms, such as the one developed in the present study, can be used in the
simulation of the spatial distribution of arthropods. New algorithms may be designed
according to specific requirements, and some details should be considered carefully, such as
the number of layers, neurons, biases, and targets; transfer functions and training
functions; layer, input, output, bias, and target connections; and input delays, input
weights, layer weights, and initialization functions.
In addition to network settings, the quality of the input set should also be ensured in
order to obtain a better neural network. Data quality may be improved through a reasonable
experimental or sampling design, elimination of data redundancy, and randomization
procedures (Kilic et al., 2007). As demonstrated in the present study, randomized submission
of quadrates helps the neural network eliminate the correlation between inputs in the
training procedure. A larger number of randomizations is suggested in order to produce a
reliable interval for simulation and prediction.
Over-learning in neural network simulation should be avoided in order to produce the
best predictive performance. It can be addressed by limiting the complexity of the neural
network (layers, neurons, etc.), by training the neural network with noise, and by using
techniques such as weight decay (Ozesmi et al., 2006).
Both spatial information (Euclidean coordinates) and environmental factors, like plant
composition and climate conditions, can be considered as input components. By doing this, we
obtain a more interpretable neural network, which may be used to interpret data. Several
methods for interpreting neural networks, like sensitivity analysis, inference rule
extraction, the randomization approach, and the neural interpretation diagram, are now
available for this purpose (Bradshaw et al., 2002; Olden and Jackson, 2002; Gevrey et al.,
2006).
Similar to the conclusion of the present study, most previous studies showed that neural
networks outperformed conventional models (Lek et al., 1997; Paruelo and Tomasel, 1997;
Brosse et al., 1999; Abrahart and White, 2001; Yu et al., 2006; Zhang, 2007a,b), although
varied results were also produced in using neural networks (Marchant and Onyango, 2003;
Filippi and Jensen, 2006). This study demonstrates that the artificial neural network is
more robust than the conventional models in the modelling of spatial distribution. It can
provide a feasible alternative to more classical spatial statistical techniques (Pearson et
al., 2002). However, further research on more complicated spatial distributions is needed.
This study used a data set of 64 quadrates to model the spatial distribution of arthropods,
and good performance was achieved. Because these are fitting models, their simulation
performance is expected to improve as the size of the data set increases. However, more
studies with larger data sets are still needed to further validate these models.

ACKNOWLEDGMENTS
This project was supported by the “National Basic Research Program of China” (973
Program) (No. 2006CB102005). We thank all participants of the arthropod investigation, Mr. WG
Zhou, HQ Dai, and undergraduates of ecology 2004, Sun Yat-sen University, China.

REFERENCES
Abdel-Aal RE. Hourly temperature forecasting using abductive networks. Engineering
Applications of Artificial Intelligence, 17, 543-556, 2004.
Abrahart RJ, White SM. Modelling Sediment Transfer in Malawi: Comparing
Backpropagation Neural Network Solutions Against a Multiple Linear Regression
Benchmark Using Small Data Set. Phys. Chem. Earth (B), 26(1), 19-24, 2001.
Acharya C, Mohanty S, Sukla LB, et al. Prediction of sulphur removal with Acidithiobacillus
sp. using artificial neural networks. Ecological Modelling, 190(1-2), 223-230, 2006.
Almasri MN, Kaluarachchi JJ. Modular neural networks to predict the nitrate distribution in
ground water using the on-ground nitrogen loading and recharge data. Environmental
Modelling and Software, 20, 851-871, 2005.
Ballester EB, Valls GCI, Carrasco-Rodriguez JL, et al. Effective 1-day ahead prediction of
hourly surface ozone concentrations in eastern Spain using linear models and neural
networks. Ecological Modelling, 156 (1), 27-41, 2002.

Bianconi A, Zuben CJV, de Souza Serapião AB, et al. The use of artificial neural
networks in analysing the nutritional ecology of Chrysomya megacephala (F.) (Diptera:
Calliphoridae), compared with a statistical model. Australian Journal of Entomology, 49,
201-212, 2010.
Bradshaw CJA, Davis LS, Purvis M, et al. Using artificial neural networks to model the
suitability of coastline for breeding by New Zealand fur seals (Arctocephalus forsteri).
Ecological Modelling, 148 (2), 111-131, 2002.
Brown KS Jr. Conservation of Neotropical insects: Insects as indicators. In: Collins, NM,
Thomas, JA (eds.), The Conservation of Insects and Their Habitats. Academic Press,
London, 349-404, 1991.
Burden R, Faires JD. Numerical Analysis (Seventh Edition). Thomson Learning, Inc., 2001.
Cereghino R, Giraudel JL, Compin A. Spatial analysis of stream invertebrates distribution in
the Adour-Garonne drainage basin (France), using Kohonen self organizing maps.
Ecological Modelling, 146, 167-180, 2001.
Dantas A, Yamamoto K, Lamar MV, Yamashita Y. Neural geo-spatial model: a
strategic planning tool for urban transportation. Transactions of the Wessex Institute,
DOI: 10.2495/UT000251, 2009.
Drumm D, Purvis M, Zhou QQ. Spatial ecology and artificial neural networks: modeling the
habitat preference of the sea cucumber (Holothuria leucospilota) on Rarotonga, Cook
Islands. SIRC 99 – The 11th Annual Colloquium of the Spatial Information Research
Centre, University of Otago, Dunedin, New Zealand December 13-15th, 1999.
Fecit. Analysis and Design of Neural Networks in MATLAB 6.5, Electronics Industry Press,
Beijing, China, 2003.
Filippi AM, Jensen JR. Fuzzy learning vector quantization for hyperspectral coastal
vegetation classification. Remote Sensing of Environment, 100, 512-530, 2006.
Gevrey M, Dimopoulos I, Lek S. Two-way interaction of input variables in the sensitivity
analysis of neural network models. Ecological Modelling, 195 (1-2), 43-50, 2006.
Hagan MT, Demuth HB, Beale MH. Neural Network Design. PWS Publishing Company,
USA, 1996.
He RB. MATLAB 6: Engineering Computation and Applications. Chongqing University
Press, Chongqing, China, 2001.
Jørgensen SE, Verdonschot P, Lek S. Explanation of the observed structure of functional
feeding groups of aquatic macro-invertebrates by an ecological model and the maximum
exergy principle. Ecological Modelling, 158 (3), 223-231, 2002.
Kilic H, Soyupak S, Tuzun I et al. An automata networks based preprocessing technique for
artificial neural network modelling of primary production levels in reservoirs. Ecological
Modelling, 201 (3-4), 359-368, 2007.
Krebs CJ. Ecological Methodology. HarperCollinsPublishers, New York, 1989.
Kremen C, Colwell RK, Erwin TL, et al. Invertebrate assemblages: their use as indicators in
conservation planning. Conservation Biology, 7, 796-808, 1993.
Kuo JT, Hsieh MH, Lung WS et al. Using artificial neural network for reservoir
eutrophication prediction. Ecological Modelling, 200(1-2), 171-177, 2007.
Lek S, Baran P. Estimations of trout density and biomass: a neural networks approach.
Nonlinear Analysis, Theory, Methods & Applications, 30(8), 4985-4990, 1997.

Loot G, Giraudel JL, Lek S. A non-destructive morphometric technique to predict Ligula
intestinalis L. plerocercoid load in roach (Rutilus rutilus L.) abdominal cavity. Ecological
Modelling, 156 (1), 1-11, 2002.
Marchant JA, Onyango CM. Comparison of a Bayesian classifier with a multilayer feed-
forward neural network using the example of plant/weed/soil discrimination. Computers
and Electronics in Agriculture, 39, 3-22, 2003.
Mathworks. Neural Network Toolbox, MATLAB 6.5, 2002.
McKenna JE. Application of neural networks to prediction of fish diversity and salmonid
production in the Lake Ontario basin. Transactions of The American Fisheries Society,
134(1), 28-43, 2005.
Moisen GG, Frescino TS. Comparing five modelling techniques for predicting
forest characteristics. Ecological Modelling, 157 (2-3), 209-225, 2002.
Nagendra SMS, Khare M. Artificial neural network approach for modelling nitrogen dioxide
dispersion from vehicular exhaust emissions. Ecological Modelling, 190(1-2), 99-115,
2006.
Nour MH, Smith DW, El-Din MG, et al. The application of artificial neural networks to flow
and phosphorus dynamics in small streams on the Boreal Plain, with emphasis on the role
of wetlands. Ecological Modelling, 191(1), 19-32, 2006.
Olden JD, Jackson DA. Illuminating the "black box": a randomization approach for
understanding variable contributions in artificial neural networks. Ecological Modelling,
154 (1-2), 135-150, 2002.
Olden JD, Joy MK, Death RG. Rediscovering the species in community-wide predictive
modeling. Ecological Applications, 16 (4), 1449-1460, 2006.
Ozesmi SL, Tan CO, Ozesmi U. Methodological issues in building, training, and testing
artificial neural networks in ecological applications. Ecological Modelling, 195 (1-2), 83-
93, 2006.
Pastor-Barcenas O, Soria-Olivas E, Martín-Guerrero JD. Unbiased sensitivity analysis and
pruning techniques in neural networks for surface ozone modeling. Ecological Modelling,
182, 149-158, 2005.
Pearson RG, Dawson TP, Berry PM, et al. SPECIES: A Spatial Evaluation of Climate Impact
on the Envelope of Species. Ecological Modelling, 154 (3), 289-300, 2002.
Pimental D, Stachow U, Takacs DA, et al. Conserving biological diversity in
agricultural/forestry systems. Bioscience, 42(5), 354-362, 1992.
Schultz A, Wieland R. The use of neural networks in agroecological modeling. Computers and
Electronics in Agriculture, 18, 73-90, 1997.
Sharma V, Negi SC, Rudra RP, et al. Neural networks for predicting nitrate-nitrogen in
drainage water. Agricultural Water Management, 63, 169–183, 2003.
Tan CO, Ozesmi U, Beklioglu M, et al. Predictive models in ecology: Comparison of
performances and assessment of applicability. Ecological Informatics, 1(2), 195-211,
2006.
Viotti P, Liuti G, Di Genova P. Atmospheric urban pollution: applications of an artificial
neural network (ANN) to the city of Perugia. Ecological Modelling, 148(1), 27-46, 2002.
Wilson EO. The little things that run the world. Conservation Biology, 1, 344-346, 1987.
Worner SP, Gevrey M. Modelling global insect pest species assemblages to determine risk of
invasion. Journal of Applied Ecology, 43 (5), 858-867, 2006.

Yu R, Leung PS, Bienfang P. Predicting shrimp growth: Artificial neural network versus
nonlinear regression models. Aquacultural Engineering, 34, 26–32, 2006.
Zhang WJ. Computational Ecology: Artificial Neural Networks and Their Applications.
World Scientific, Singapore, 2010.
Zhang WJ. Supervised neural network recognition of habitat zones of rice invertebrates.
Stochastic Environmental Research and Risk Assessment, 21, 729-735, 2007a.
Zhang WJ. Methodology on Ecology Research. Sun Yat-sen University Press, Guangzhou,
China, 2007b.
Zhang WJ, Barrion AT. Function approximation and documentation of sampling data using
artificial neural networks. Environmental Monitoring and Assessment, 122, 185-201,
2006.
Zhang WJ, Bai CJ, Liu GD. Neural network modeling of ecosystems: a case study on
cabbage growth system. Ecological Modelling, 201, 317-325, 2007.
Zhang WJ, Liu GH, Dai HQ. Simulation of food intake dynamics of holometabolous insect
using functional link artificial neural network. Stochastic Environmental Research and
Risk Assessment, 22, 123-133, 2008.
Zhang WJ, Zhang XY. Neural network modeling of survival dynamics of holometabolous
insects: a case study. Ecological Modelling, 211, 433-443, 2008.
Zhang WJ, Zhong XQ, Liu GH. Recognizing spatial distribution patterns of grassland insects:
neural network approaches. Stochastic Environmental Research and Risk Assessment, 22,
207-216, 2008.
Zhang YT, Fang KT. Introduction to Multivariate Statistics. Science Press, Beijing, China,
1982.
In: Ecological Modeling ISBN: 978-1-61324-567-5
Editor: WenJun Zhang, pp. 15-39 © 2012 Nova Science Publishers, Inc.

Chapter 2

MULTISPECTRAL VEGETATION INDICES IN REMOTE SENSING: AN OVERVIEW

George P. Petropoulos1* and Chariton Kalaitzidis2


1 Department of Natural Resources Development & Agricultural Engineering, Agricultural University of Athens, 75, Iera Odos St., Athens, Greece, Email: petropoulos.george@gmail.com
2 Department of Geoinformation, Mediterranean Agronomic Institute of Chania, Alsyllio Agrokipiou, Crete, Greece, Email: harry.kalaitzidis@gmail.com

ABSTRACT
Remote sensing has demonstrated great potential in mapping spatial patterns of
vegetation. By employing the amount of reflected radiation at particular regions of the
electromagnetic spectrum, it is possible to estimate certain characteristics of vegetation.
The use of radiometric vegetation indices is a fast and efficient method for vegetation
monitoring, exploiting information acquired from remote sensing data. These indices are
dimensionless radiometric measures that generally function as indicators of the relative
abundance and activity of green vegetation.
Over the years, a large number of multispectral vegetation indices have been
formulated. Each has a variable degree of efficiency in estimating one or more vegetation
parameters, such as health status, nutrient or water deficiency, crop yield, vegetation
cover fraction, leaf area index, absorbed photosynthetically active radiation, net primary
production and above-ground biomass. Additionally, some of them also account for
atmospheric effects and/or the soil background for an enhanced retrieval. The present
chapter aims to provide an overview of the use of radiometric vegetation indices
developed over the last few decades, utilizing spectral information acquired from
multispectral optical remote sensing sensors. This overview is preceded by an introduction
to some important principles of remote sensing relevant to the vegetation spectral
response, which was considered necessary to better understand the context of the present
overview.

Keywords: radiometric vegetation indices, vegetation mapping, optical satellites, remote sensing.

1. INTRODUCTION
Remote sensing can be generally defined as the technique of gathering information about
an object (or target) without making actual contact with it. Information is extracted by
analyzing the reflected or emitted electromagnetic radiation (EMR) energy of the object
recorded by the sensor of a remote sensing system. The first instances of remote sensing
occurred in 1858 with the first aerial photographs, taken from balloons for mapping and
military reconnaissance. Satellite remote sensing began in 1960 with the launch of the first
meteorological satellite, TIROS-1. Nowadays, remote sensing data are collected by both
multispectral and hyperspectral sensors operating on both airborne and satellite platforms,
with a very large number of satellite systems operating in orbit over the Earth’s surface.
The advent of satellite-based remote sensing over the last few decades has led to a
considerable amount of work on determining its potential usefulness in many
disciplines and applications. The main advantages of remote sensing include its ability
to provide synoptic views of large areas in a spatially contiguous fashion and in a repetitive
manner, without disturbing the area to be surveyed and without accessibility
issues at the site to be studied. Remote sensing observations from microwave sensors offer
all-weather capability and both daytime and nighttime observations, which, combined with their
strong dependence on the dielectric properties of the target, make them potentially very
powerful for the estimation of various parameters, such as soil moisture content.
Specifically, the potential use of remote sensing for mapping vegetation condition and
health has been explored by many scientists over the last decades. A common approach
employed for this purpose involves the use of radiometric vegetation indices computed from
multispectral remote sensing observations. The present chapter aims to provide an overview
of the approaches used to estimate vegetation health and vigor by
computing radiometric vegetation indices from multispectral optical remote sensing
observations. Before that, the principles of remote sensing in the
reflective part of the EMR, namely the visible and near-infrared (VNIR) and shortwave infrared
(SWIR), are discussed. This was deemed necessary, as understanding the characteristics of
EMR and how it interacts with land surface targets is crucial for understanding the
information that is extracted from remote sensing data and subsequently used in the
estimation of vegetation condition by means of these radiometric indices.

2. RADIOMETRIC PRINCIPLES AND CONSIDERATIONS IN THE REFLECTIVE PART OF EMR
In this section, some basic principles of remote sensing are introduced to help the reader
better understand the basis on which the different radiometric
vegetation indices are computed. The discussion focuses only on the reflective
part of the EMR (0.4 – 2.5 µm), as the overview of the radiometric indices that follows is
based on remote sensing data from this part of the EMR.
Multispectral Vegetation Indices in Remote Sensing 17

2.1. Atmospheric and Radiometric Properties of Remote Sensing

The satellite sensor measures the intensity of the electromagnetic waves reflected by the
surface in different parts of the spectrum. Comparing the radiometric and spectral
characteristics of the reflected energy with those of the incident energy yields the
surface reflectivity, which is analyzed to determine the physical and chemical properties of
the surface. However, as the solar waves propagate through the atmosphere, they
interact with atmospheric constituents, leading to significant effects on the intensity and
spectral composition of the energy recorded by a remote sensor. These effects are caused
principally through the mechanisms of atmospheric scattering and absorption (e.g. Cambell,
1981). Both effects are directly related to the atmospheric path length and the
wavelengths involved. Scattering arises when particles of various sizes present in the
atmosphere, or large gas molecules, interact with the EMR, resulting in a redirection of the
radiation from its original path. The magnitude of scattering depends on various factors, most
importantly the wavelength of the radiation, the abundance of particles or gases, and the
distance the radiation travels through the atmosphere (e.g. Verbyla, 1995). Atmospheric
absorption normally involves absorption of EMR energy at particular wavelengths by certain
gases present in the atmosphere. The dominant gases responsible for most of this absorption
are water vapor (H2O), carbon dioxide (CO2) and ozone (O3), and depending on atmospheric
conditions and the amount of radiation emitted from the surface one of these effects will be
dominant. Nevertheless, there are certain spectral domains within the EMR spectrum that are
relatively free from the effects of scattering and absorption, known as atmospheric windows.
Figure 1 illustrates these "absorption-free" regions within the EMR, which are very important
in remote sensing, as in these regions atmospheric effects on radiation are minimal in comparison
with other wavelengths.

Figure 1. Atmospheric windows within the reflective part of the EMR.


All wavelengths shorter than 0.30 µm are unavailable for remote sensing, and strong
absorption bands also exist in the near-infrared, particularly around 0.76, 0.95, 1.12, 1.4 and
1.9 µm. The first significant atmospheric window begins at 0.30 µm, providing good
transparency in the visible spectrum, between 0.30 and 0.75 µm. The atmospheric window
continues, with interruptions, in the NIR region (i.e. 0.77-0.91 µm). Further into the
near-infrared and shortwave infrared there are several atmospheric windows in narrow wavebands
between 1.0-1.12, 1.19-1.34, 1.55-1.75 and 2.05-2.4 µm. Consequently, correction of
the atmospheric effects described above can be particularly useful for improving the
quality of remotely sensed data. Descriptions of such atmospheric correction approaches
can be found elsewhere, including Slater (1980) and Rees (2001).
As the solar waves propagate through the Earth's atmosphere within these atmospheric
windows, radiation that is not absorbed or scattered in the atmosphere can reach the
Earth's surface. There it interacts with surface
objects through the mechanisms of reflection, transmission and absorption. The interrelationship of
these three mechanisms is expressed by a direct application of the law of conservation of energy,
as in the equation below:

$$E_I(\lambda) = E_R(\lambda) + E_A(\lambda) + E_T(\lambda), \tag{1}$$

where EI(λ) denotes the incident energy, ER(λ) is the reflected energy, EA(λ) the absorbed
energy and ET(λ) is the transmitted energy.
There are two very important points that should be considered regarding the
aforementioned equation. The first is that the proportions of energy
reflected, absorbed and transmitted vary for different Earth surface targets, depending on
their material type and condition. These differences allow the different
features recorded on a satellite image to be distinguished. The second is that, even for a
given feature type, the proportions of energy reflected, absorbed and transmitted vary at different
wavelengths (Elachi, 1987). Thus, two features may be indistinguishable in one
spectral region but very different in another wavelength band. Another
important consideration in remote sensing is the geometric manner in which an
object reflects energy, which is a function of the surface roughness of the object. Specular
reflectors are flat surfaces that manifest mirror-like reflections, where the angle of reflection
equals the angle of incidence. Diffuse (or Lambertian) reflectors are rough surfaces that
reflect uniformly in all directions.
The satellite sensor records a digital number (DN) for each pixel and spectral band.
However, for the majority of practical applications in remote sensing, the DN values must be
converted into measurements of the amount of energy reaching the sensor in each band, using
an appropriate sensor calibration equation. This energy is expressed by the radiance (L).
Radiance is the radiant flux per unit solid angle leaving
an extended area source in a given direction per unit projected source area in that direction,
and is measured in W m⁻² sr⁻¹. Radiance is measured in a particular
viewing direction; the perceived surface brightness is often assumed to be constant over the
entire hemisphere above the surface. The amount of energy reflected from the
surface can be related to the amount of energy arriving at the surface from the Sun by
introducing the reflectance of the surface, which is mathematically defined as (e.g.
Lillesand & Kieffer, 1994):

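$$\rho_\lambda = \frac{E_R(\lambda)}{E_I(\lambda)} \times 100 = \frac{\text{energy of wavelength } \lambda \text{ reflected from the object}}{\text{energy of wavelength } \lambda \text{ incident upon the object}} \times 100, \tag{2}$$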

where reflectance ρλ is expressed as a percentage and all the other parameters are as defined
in Equation 1 above.
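
As a worked illustration of Equations 1 and 2, the short sketch below computes the percentage reflectance from incident and reflected energy and uses the energy balance to recover the transmitted part. The function names and the sample energies are hypothetical and chosen only for illustration.

```python
def reflectance_percent(reflected, incident):
    """Equation 2: reflected energy as a percentage of incident energy."""
    if incident <= 0:
        raise ValueError("incident energy must be positive")
    return reflected / incident * 100.0


def transmitted_energy(incident, reflected, absorbed):
    """Equation 1 rearranged: E_T = E_I - E_R - E_A."""
    return incident - reflected - absorbed


# Hypothetical energies at a single wavelength (arbitrary units)
e_incident, e_reflected, e_absorbed = 200.0, 90.0, 20.0

rho = reflectance_percent(e_reflected, e_incident)
e_transmitted = transmitted_energy(e_incident, e_reflected, e_absorbed)
print("reflectance (%):", rho)
print("transmitted energy:", e_transmitted)
```
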
Conventional research in remote sensing has concentrated on the use of spectral response
based on nadir or near-nadir reflectance measurements. However, for remote sensing
applications the spectral response of a target will also depend upon factors such as the
orientation of the Sun (solar azimuth), the height of the Sun in the sky (solar elevation angle),
the direction in which the sensor is pointing relative to nadir (the look angle) and the
wavelength used (Sabins, 1997). All these factors are combined in the bidirectional
reflectance distribution function (BRDF), which is a theoretical concept that describes
directional reflectance phenomena by relating the incident irradiance from one given direction
to its contribution to the reflected radiance in another specific direction (Nicodemus et al.,
1977). The bidirectional reflectance distribution function is normally defined as the ratio of
reflected radiance to incident irradiance at a particular wavelength:

$$\rho(\Omega_i; \Omega_r; \lambda) = \frac{dL_r(\Omega_i; \Omega_r; \lambda)}{dE_i(\Omega_i; \lambda)}, \tag{3}$$

where the subscripts i and r denote incident and reflected respectively, Ω is the direction of
light propagation, λ is the wavelength of light, L is radiance, and E is irradiance.
Field devices called "field goniometers", which are essentially goniometric radiometric
instruments, have been used for a number of years to assess in practice the BRDF of natural
and man-made surfaces under natural illumination conditions (e.g. Jackson et al., 1990;
Hosgood et al., 1999). Several BRDF models
have been developed to predict bidirectional reflectance properties (e.g. Verstraete et al.,
1990; Strahler and Jupp, 1991; Qin, 1993; Albuelgasim and Strahler, 1994). Although
detailed reference to these models is beyond the purposes of this discussion, it should be
mentioned that the existing approaches to the analytical modelling of the BRDF
fall mainly into two groups: physical and statistical models. Physical
models relate the BRDF to various internal properties of the surface, relying on physical
parameters (e.g. Myneni et al., 1990; Goel, 1988). Statistical models, on the other hand,
characterize the shape of the BRDF using statistical parameters
(e.g. Kieffer et al., 1977; Pinty & Ramond, 1986).
Studies have addressed the importance of bidirectional effects in remote sensing (e.g.
Holben and Kimes, 1986; Schaaf and Strahler, 1994). Such studies, among others, showed
that correction or normalization of Sun and view angle effects is very important for
quantitative analysis of remotely sensed data. However, it should be taken into consideration
that, in general, most remote sensing applications assume Lambertian reflectance, which is
usually acceptable. Moreover, accurate BRDF determination must also account for the
influence of atmospheric conditions, which in practice change constantly across the
different viewing angles (Deering and Eck, 1987).

2.2. Properties of Earth Surface Materials

In remote sensing, by measuring the energy that is reflected by targets on the Earth's
surface over a variety of different wavelengths, it is possible to compile a spectral response
for each target. A spectral reflectance curve describes the spectral response of a target as a
function of wavelength, covering the visible to near-infrared region of the electromagnetic
spectrum. The shape of spectral reflectance curves is particularly important in remote
sensing, as it offers an insight into the spectral characteristics of an object and has a strong
influence on the choice of wavelength regions in which remotely sensed data should be
acquired. This important property makes it possible to identify different substances or
classes and separate them by their spectral signatures, as expressed by their spectral curves.
The spectral reflectance curves of some typical surface materials are depicted in Figure 2
and their reflectance properties are briefly discussed below. Figure 2 shows reflectance
spectra for three different types of terrain cover, i.e. vegetation, soil and water. The
horizontal axis shows the wavelength of the incident energy, whereas the vertical axis shows
the percentage of incident energy reflected at the different wavelengths. Although the lines
for each cover type in Figure 2 represent average reflectance curves, it is evident how
distinctive the curves are for each surface feature.

Figure 2. Spectral signatures of some terrestrial materials in the reflective part of the EMR (adapted
from http://www.satreponline.org/landsaf/print.htm#page_3.1.0).

As illustrated in Figure 2 above, the reflectance of healthy vegetation is
generally low in the visible part of the EMR. The vegetation curve shows relatively low
values in the red and the blue regions of the visible spectrum, with a small peak in the green
spectral region. These peaks and troughs are caused by the absorption of blue and red energy
by plant pigments, which are found in the chloroplasts within the leaf mesophyll. The
majority of those pigments use the absorbed energy to power the process of photosynthesis.
The most common of those pigments are chlorophylls, carotenes and xanthophylls. The
chlorophyll molecules exhibit the most dominant absorption in the visible region, particularly
in the blue (400 – 500 nm) and red (600 – 700 nm) regions (Figure 2; Buschmann and Nagel,
1991). Green light is absorbed far less strongly and is largely reflected, which is why most
plants appear green. By the same principle, thicker leaves have lower visible reflectance, due
to increased chlorophyll content, and higher NIR reflectance, due to increased scattering
(Gausman and Allen, 1973).
From the vegetation spectral reflectance curve it is also apparent that plants generally
reflect radiation strongly in the NIR region. The sharp increase in reflectance
between the red and NIR regions of the spectrum is known as the "red edge" (Filella
and Penuelas, 1994). This slope is known to be affected by the amount of chlorophyll in the
leaves. At high chlorophyll concentrations, energy absorption in the red region increases and
the absorption feature in this part of the spectrum widens, causing the red edge to
shift to longer wavelengths. On the other hand, at low chlorophyll concentrations (in stressed
plants, for example), the red edge moves towards shorter wavelengths (Gates et al.,
1965). The high reflectance of vegetation in the NIR region is mainly due to the large air/cell
interface area within leaves; as the air gaps within the leaf become larger, multiple
scattering decreases and near-infrared reflectance drops (Gausman et al.,
1973). This reflectance is independent of wavelength. On the other hand, when light of a
particular wavelength encounters particles of similar size (i.e. chloroplasts), only the
energy at that particular wavelength is scattered (Buschmann and Nagel, 1991). In general,
the amount of reflected energy in the NIR depends on: 1) the proportion of leaf mesophyll
exposed to intercellular spaces, 2) the presence or absence of leaf bi-colouration (between the
top and bottom leaf surfaces), and 3) the thickness of the leaf cuticle (Slaton et al., 2001).
Because the difference in refractive index between cell wall and water is smaller than that
between cell wall and air, when the plant has a high water content the intercellular spaces
fill up with water and the refraction of the NIR energy decreases. As a result, less energy is
reflected upwards and more is transmitted through the leaf (Knipling, 1970; Gausman et al.,
1974).
Plant reflectance in the range 0.7 to 1.3 μm is primarily dependent on the internal
structure of the plant leaves. Beyond 1.3 μm, energy incident upon vegetation is essentially
absorbed or reflected, with little to no transmittance of energy. Sharp reductions in reflectance
occur at 1.4, 1.9 and 2.7 μm, because water in the leaf absorbs strongly at these wavelengths.
Accordingly, these spectral regions are referred to as water absorption bands. Throughout the
range beyond 1.3 μm, leaf reflectance is inversely related to the total water present in the leaf.
This total is a function of both the moisture content and the thickness of the leaf. Plants with
different internal structure will often vary greatly in NIR reflectance. Figure 2 effectively
summarises the dominant factors affecting vegetation reflectance within the
VNIR.
The soil reflectance curve shows considerably less variation in reflectance. The spectral
reflectance curves of soils are generally characterised by a rise in reflectivity as wavelength
increases. One of the most important parameters controlling reflectance from soil surfaces is
soil moisture content, which is inversely related to reflectance, especially in the mid-infrared
region. Furthermore, dry fine-textured soils (such as clay) will usually have higher reflectance
than dry coarse-textured soils (such as sand) (Verbyla, 2001). In addition, organic matter has
been found to have a strong influence on soil reflectance: spectral reflectance generally
decreases over the entire shortwave region as organic matter content increases (Stoner and
Baumgardner, 1980). The presence of iron oxide in the soil will also significantly decrease
reflectance, at least in the visible wavelengths (Asrar, 1989). Last but not least, soil particle
size has been found to affect the reflectance of particular soil minerals: as particle size
decreases, reflectance generally increases and the contrast of the absorption features decreases.
Huete (1989) gives a summary of the influences of soil background on measurements of
vegetation spectra.
Finally, the spectral reflectance curve of water shows a general reduction in reflectance
with increasing wavelength, so that in the NIR the reflectance of deep, clear water is
effectively almost zero. Clear water reflects very little in most spectral regions. However,
turbid water reflects significant amounts of radiation, especially in the red and NIR spectral
regions. There is a shift in the spectral reflectance regions as water increases in turbidity
(Verbyla, 2001). In addition, the spectral reflectance of water is affected by the presence and
concentration of dissolved and suspended organic and inorganic material, and by the depth of
the water body (Jensen, 2000). Thus, the intensity and distribution of the radiance upwelling
from a water body are indicative of the nature of the dissolved and suspended matter in the
water and of the water depth. The peak of the reflectance curve moves to progressively longer
wavelengths as the concentration of these materials increases.

3. OVERVIEW OF VEGETATION INDICES IN REMOTE SENSING


3.1. Indices Linked with Biomass Estimation

Since the late 1960s, beginning with the initial attempt by Jordan (1969), scientists have
extracted and modelled various vegetation biophysical variables from remotely sensed data using
mathematical formulae referred to as vegetation indices, defined as dimensionless
radiometric measures that function as indicators of relative abundance and activity of green
vegetation (Jensen, 2000). A vegetation index is effectively a unitless numerical value
resulting from the mathematical combination of radiance values at particular wavebands. The
measurements of those radiance values are almost always collected concurrently, and
represent the state of vegetation at that particular moment in time, under the specific
conditions in place when the data were acquired. Comparison of
vegetation indices between different vegetation targets, or for the same vegetation target at
different times and under different conditions, can potentially provide information on the
differences between the targets or on the effects of the varying conditions on the vegetation.
The remaining part of this chapter provides an overview of the
radiometric vegetation indices developed over the past decades that are based on
multispectral remote sensing observations acquired in the reflective part of the EMR. A
summary of the indices reviewed herein is provided in Table 2.

Table 2. List of indices used to directly or indirectly estimate vegetation biomass

Short name   Radiometric index name                     Reference
RVI (SR)     Ratio VI or Simple Ratio                   Jordan, 1969
NDVI         Normalised Difference VI                   Rouse et al., 1974
TVI          Transformed Vegetation Index               Rouse et al., 1974
PVI          Perpendicular VI                           Richardson & Wiegand, 1977
SAVI         Soil-Adjusted VI                           Huete, 1988
WDVI         Weighted Difference VI                     Clevers, 1989
SAVI2        2nd version of Soil-Adjusted VI            Major et al., 1990
TSAVI        Transformed Soil-Adjusted VI               Baret & Guyot, 1991
ARVI         Atmospherically Resistant VI               Kaufman & Tanre, 1992
GEMI         Global Environmental Monitoring Index      Pinty & Verstraete, 1992
NDVIc        Corrected NDVI                             Nemani et al., 1993
SARVI        Atmospherically Resistant SAVI             Huete et al., 1994
MSAVI        Modified Soil-Adjusted VI                  Qi et al., 1994
EVI          Enhanced Vegetation Index                  Huete et al., 1997
OSAVI        Optimised SAVI                             Rondeaux et al., 1996
GARI         Green Atmospherically Resistant VI         Gitelson et al., 1996
GNDVI        Green NDVI                                 Gitelson et al., 1996
MGVI         MERIS Global Vegetation Index              Gobron et al., 1999
GESAVI       Generalised Soil-Adjusted VI               Gilabert et al., 2002
VARI         Visible Atmospherically Resistant Index    Gitelson et al., 2002
LVI          Linearised Vegetation Index                Unsalan & Boyer, 2004
WDRVI        Wide Dynamic Range Vegetation Index        Gitelson, 2004
MSI          Moisture Stress Index                      Hunt & Rock, 1989
LSWI         Land Surface Water Index                   Xiao et al., 2002
GVMI         Global Vegetation Moisture Index           Ceccato et al., 2002a,b

The first use of a ratio between NIR and red radiance was reported by Jordan (1969),
who measured transmittance values at the floor of a tropical forest to estimate the Leaf
Area Index (LAI). As explained by Knipling (1970), increased LAI values
result in lower red and higher NIR reflectance. The former is caused by the increased amount
of chlorophyll in the sensor's field of view, leading to increased absorption and reduced
reflectance. The increase in NIR reflectance is attributed to the increased number of cells
present in the field of view, resulting in an increased amount of scattered NIR radiation
reaching the sensor. The ratio was coined as a vegetation index by Pearson and Miller (1972),
when they introduced the Ratio Vegetation Index (RVI), which is the ratio of the NIR over
the red signal (Eq. 4), in order to estimate grass canopy biomass.

$$\mathrm{RVI} = \frac{\mathrm{NIR}}{\mathrm{RED}} \tag{4}$$

The simple ratio of NIR reflectance over red reflectance has been shown on numerous
occasions to be related to crop yield and dry matter accumulation (Markham et al., 1981;
Tucker et al., 1981).
An evolved version of the RVI is the Normalised Difference Vegetation Index (NDVI),
introduced for the first time by Deering et al. (1975). This index is the ratio between the
difference and the sum of NIR and red radiation (Eq. 5).

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{red}}{\mathrm{NIR} + \mathrm{red}} \tag{5}$$

Typically, NDVI values range between -1 and +1, with water surfaces typically having
an NDVI value less than 0, bare soils between 0 and 0.1, clouds about 0.23, snow and ice
about 0.38 and vegetation over 0.1 (Jensen, 2000). NDVI is considered to be superior to the
RVI. The effectiveness of NDVI arises because chlorophyll absorbs light in the visible
(0.58-0.68 µm) and foliage reflects light in the NIR part of the EMR (0.72-1.10 µm). The
combination of NIR and red reflectance in one index succeeds in combining information
on the chlorophyll content of the vegetation with information about the leaf
anatomy: higher photosynthetic activity results in lower reflectance in the
red channel and higher reflectance in the NIR channel. Also, the normalisation of the
difference makes the index more robust and less affected by variations in the illumination
intensity, allowing for comparisons of targets at different locations, with data acquired
at different times.
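
To make Eqs. 4 and 5 concrete, the following minimal sketch computes both indices for a single pixel and checks two properties that follow directly from the formulae: multiplying both bands by the same illumination factor leaves NDVI unchanged, and NDVI can be rewritten as (RVI - 1) / (RVI + 1). The reflectance values and the scaling factor are hypothetical.

```python
def rvi(nir, red):
    """Ratio Vegetation Index (Eq. 4): NIR / red."""
    return nir / red


def ndvi(nir, red):
    """Normalised Difference Vegetation Index (Eq. 5)."""
    return (nir - red) / (nir + red)


# Hypothetical reflectances of a vegetated pixel
nir, red = 0.50, 0.08
print("RVI :", rvi(nir, red))
print("NDVI:", ndvi(nir, red))

# The same pixel under a uniformly dimmer illumination (factor k)
k = 0.6
print("NDVI, scaled bands:", ndvi(k * nir, k * red))

# NDVI expressed through RVI
ratio = rvi(nir, red)
print("(RVI - 1) / (RVI + 1):", (ratio - 1) / (ratio + 1))
```

For non-negative reflectances NDVI stays within -1 and +1, whereas RVI is unbounded and undefined when the red signal is zero.
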
NDVI became very popular and is probably the most broadly used index in
applications related to vegetation. Many studies have found NDVI to be positively
correlated with the amount of green biomass, leaf area index, vegetation percentage cover,
plant vigour and health, plant stress, photosynthetic activity and agricultural crop yield (Asrar
et al., 1984; Sellers, 1987). However, it should be mentioned here that NDVI values can vary
significantly as a function of sensor calibration (Goward et al., 1991), atmospheric conditions
(Myneni and Asrar, 1994), directional surface reflectance effects (Holben et al., 1986), terrain
relief and soil background effects (Major et al., 1990). What is more, when the vegetation
cover is complete and chlorophyll content reaches a certain level, the relationship between
NDVI and any characteristic that is derived from the amount of chlorophyll saturates
(Huete et al., 1997), and further increases in chlorophyll content do not result in proportional
increases of the NDVI. Last but not least, another important limitation of the NDVI is its
sensitivity to the contribution of the background beneath the vegetation when vegetation
cover is not complete. The contribution of soil reflectance to total canopy reflectance in the
NIR is significant (Allen and Richardson, 1968). For a given amount of vegetation and
vegetation cover, darker soils in the background result in higher values of vegetation indices
(Elvidge and Lyon, 1985; Huete et al., 1985).
The Transformed Vegetation Index (TVI) proposed by Rouse et al. (1974) is effectively the
NDVI with a constant of 0.5 added and the square root of the sum taken (Eq. 6). The TVI
was produced in order to avoid the negative values of NDVI, and also to avoid the possibility
that the variances of the ratio would be proportional to the mean values.

$$\mathrm{TVI} = \sqrt{\mathrm{NDVI} + 0.5} \tag{6}$$
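
A minimal sketch of Eq. 6 follows. Note that the square root is only real for NDVI values of -0.5 or higher; raising an error below that threshold is a choice made for this illustration, not part of the original definition.

```python
import math


def tvi(ndvi_value):
    """Transformed Vegetation Index (Eq. 6): sqrt(NDVI + 0.5)."""
    shifted = ndvi_value + 0.5
    if shifted < 0:
        # Assumption of this sketch: reject inputs with no real square root.
        raise ValueError("NDVI below -0.5 has no real square root")
    return math.sqrt(shifted)


# Hypothetical NDVI values for water, bare soil, sparse and dense vegetation
for value in (-0.3, 0.0, 0.42, 0.9):
    print(value, tvi(value))
```
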

When dealing with vegetation canopies, the contribution of the soil reflectance to the total
canopy reflectance is significant, particularly in the NIR region, where there is no pigment
absorption (Allen and Richardson, 1968). In order to discriminate between vegetation and soil
signal, Richardson and Wiegand (1977) proposed the Perpendicular Vegetation Index (PVI;
Eq. 7). The index calculates the difference between the soil and vegetation signals in both the
red (red_soil, red_veg) and NIR (NIR_soil, NIR_veg) regions, and combines those differences.

$$\mathrm{PVI} = \sqrt{(\mathrm{red}_{soil} - \mathrm{red}_{veg})^2 + (\mathrm{NIR}_{soil} - \mathrm{NIR}_{veg})^2} \tag{7}$$

PVI was one of the first attempts to discriminate between vegetation and soil background
reflectance; however, it proved to be very sensitive to variable soil brightness, and the index
value increased for brighter soils when vegetation cover remained constant (Elvidge and
Lyon, 1985; Huete, 1988; Baret and Guyot, 1991). Even though the soil line causes the
NIR/red ratio to remain constant, the actual slope and intercept parameters of the soil line
vary depending on the soil properties (Huete et al., 1984). In addition to soil brightness, other
properties, depending on the soil type, could affect the greenness assessment, even when the
percentage of vegetation cover was as high as 75% (Huete et al., 1985).
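
The PVI of Eq. 7 is simply the Euclidean distance between the soil and vegetation signals in the red-NIR plane. The sketch below implements the equation as written; the function name and the sample reflectances are hypothetical.

```python
import math


def pvi(red_soil, nir_soil, red_veg, nir_veg):
    """Perpendicular Vegetation Index as written in Eq. 7."""
    return math.sqrt((red_soil - red_veg) ** 2 + (nir_soil - nir_veg) ** 2)


# Hypothetical soil and vegetation reflectances
print(pvi(red_soil=0.20, nir_soil=0.25, red_veg=0.08, nir_veg=0.50))

# A pixel whose signal coincides with the soil signal yields zero
print(pvi(red_soil=0.20, nir_soil=0.25, red_veg=0.20, nir_veg=0.25))
```
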
The Soil-Adjusted Vegetation Index (SAVI; Eq. 8) was later introduced by Huete (1988)
as an alternative index dealing with the background signal. This index employs an L
factor, which depends on the amount of vegetation present and the extent of vegetation cover. For
total vegetation cover L takes a value of 0 and the index effectively becomes NDVI,
while for very low vegetation cover L takes a value near 1. When the extent of
vegetation cover is unknown, the author suggested a value of 0.5 as optimal. For this value
and for intermediate vegetation cover, SAVI was found to be superior to both the NDVI and
PVI (Huete, 1988).

$$\mathrm{SAVI} = \frac{\mathrm{NIR} - \mathrm{red}}{\mathrm{NIR} + \mathrm{red} + L}\,(1 + L) \tag{8}$$
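
The role of the L factor in Eq. 8 can be seen in a few lines of code. The sketch below uses a default of L = 0.5, as suggested when the vegetation cover is unknown, and shows that L = 0 reproduces NDVI. Function names and reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index (Eq. 5)."""
    return (nir - red) / (nir + red)


def savi(nir, red, L=0.5):
    """Soil-Adjusted Vegetation Index (Eq. 8) with soil adjustment factor L."""
    return (nir - red) / (nir + red + L) * (1 + L)


nir, red = 0.50, 0.08  # hypothetical reflectances
print("SAVI, L = 0.5:", savi(nir, red))
print("SAVI, L = 0  :", savi(nir, red, L=0.0))
print("NDVI         :", ndvi(nir, red))
```
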
The index performs best when the slope of the soil line (a) is equal to 1 and the intercept
(b) equals zero. Deviations from these values tend to reduce the accuracy of SAVI (Baret et
al., 1989). Rondeaux et al. (1996) later suggested that the optimal value of the L factor
for SAVI was 0.16, thus coining the OSAVI index. To address the issue of fixed L values,
another version of the SAVI index was introduced by Baret et al. (1989). Instead of simply
adding a subjective factor, the NDVI was equipped with information on the soil line, namely
its slope (a) and intercept (b), producing the Transformed Soil-Adjusted Vegetation Index
(TSAVI; Eq. 9). The index ranges from zero for bare soil to a maximum of around 0.7 for
very dense vegetation cover. The index was shown to be able to compensate for changes in
solar elevation and canopy structure (Baret et al., 1989).

aNIR - a red - b 
TSAVI  (9)
red  a NIR - ab
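
As an illustration of Eq. 9, the sketch below evaluates TSAVI for a pixel lying exactly on the soil line (NIR = a · red + b), where the index is zero as stated above for bare soil, and for a vegetated pixel. The soil line parameters and reflectances are hypothetical.

```python
def tsavi(nir, red, a, b):
    """Transformed Soil-Adjusted VI (Eq. 9); a = soil line slope, b = intercept."""
    return a * (nir - a * red - b) / (red + a * nir - a * b)


a, b = 1.2, 0.04  # hypothetical soil line: NIR_soil = a * red_soil + b

# Bare soil pixel located on the soil line
red_soil = 0.20
nir_soil = a * red_soil + b
print("bare soil :", tsavi(nir_soil, red_soil, a, b))

# Vegetated pixel located above the soil line
print("vegetation:", tsavi(0.50, 0.08, a, b))
```
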
Another incarnation of the SAVI index was proposed by Major et al. (1990), employing
once again the soil line slope and intercept. In this instance, instead of using the NDVI as a
basis, the information was added to the denominator of the RVI, producing SAVI2 (Eq. 10).

$$\mathrm{SAVI}_2 = \frac{\mathrm{NIR}}{\mathrm{red} + b/a} \tag{10}$$
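
A minimal sketch of Eq. 10 is given below. With a zero soil line intercept (b = 0) the added term vanishes and SAVI2 reduces to the simple ratio of Eq. 4. The parameter values are hypothetical.

```python
def savi2(nir, red, a, b):
    """SAVI2 (Eq. 10): NIR / (red + b / a), with soil line slope a and intercept b."""
    return nir / (red + b / a)


nir, red = 0.50, 0.10  # hypothetical reflectances
print("SAVI2, a = 1.2, b = 0.04:", savi2(nir, red, a=1.2, b=0.04))
print("SAVI2, b = 0 (equals RVI):", savi2(nir, red, a=1.2, b=0.0))
```
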

SAVI was derived specifically to reduce the effects of ground reflectance and
implicitly assumes a linear relationship between red and near-infrared ground reflectances. Qi
et al. (1994) suggested that SAVI was significantly less susceptible to changes associated with
soil variations than SR and NDVI. On the other hand, Rondeaux et al. (1996) compared
NDVI, SAVI and TSAVI and determined that TSAVI was least prone to perturbations
associated with soil changes.
At the time when Huete (1988) was incorporating the soil line into NDVI to produce SAVI,
Clevers (1989) proposed the Weighted Difference Vegetation Index (WDVI), stating that this
index could estimate LAI, under the assumption that the ratio of red to NIR soil reflectance is
independent of soil moisture content. However, further studies by Baret and Guyot (1991)
found that the index had no particular advantage over the PVI and shared the same
weaknesses.
Qi et al. (1994) suggested an iterative process for the determination of the L factor,
in which the initial L value combines the NDVI and WDVI and also employs a
primary soil-line parameter γ (a value of 1.06 was used in that study; Eq. 11). The initial
L value yields a first MSAVI estimate (Eq. 12), and each subsequent
L value is calculated as 1 minus the MSAVI obtained in the previous iteration (Eq. 13).

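$$L_0 = 1 - 2\gamma \cdot \mathrm{NDVI} \cdot \mathrm{WDVI} \tag{11}$$

$$\mathrm{MSAVI}_0 = \frac{\mathrm{NIR} - \mathrm{red}}{\mathrm{NIR} + \mathrm{red} + L_0}\,(1 + L_0) \tag{12}$$

$$L_n = 1 - \mathrm{MSAVI}_{n-1} \tag{13}$$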

A further attempt to create a more accurate index in the SAVI family was made by
Gilabert et al. (2002). They proposed a generalised soil-adjusted vegetation index
(GESAVI), which utilises the red and NIR reflectance, along with the soil parameters a and b,
plus a soil adjustment coefficient (Eq. 14). The difference between this index and the previous
SAVI indices is that the vegetation isolines in the NIR-red plane are neither parallel
to the soil line (as required by PVI) nor converging at the same point (as required by the
NDVI), but somewhere in between.

$$\mathrm{GESAVI} = \frac{\mathrm{NIR} - b \cdot \mathrm{red} - a}{\mathrm{red} + z} \tag{14}$$

The z factor is related to the red reflectance at the point where the soil line and the
vegetation isolines converge. The authors suggested a value of 0.35 in case
additional data to derive the actual z value are not available.
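
A minimal sketch of Eq. 14 follows. It reads the equation as written, so it assumes the soil line has the form NIR = b·red + a (b the slope, a the intercept); this convention should be checked against the one used for the other soil-line indices. The default z = 0.35 is the value quoted above, and the sample numbers are hypothetical.

```python
def gesavi(nir, red, soil_slope, soil_intercept, z=0.35):
    """GESAVI as in Eq. 14: (NIR - b*red - a) / (red + z).

    soil_slope (b) and soil_intercept (a) are ASSUMED to describe the soil
    line NIR = b * red + a; z defaults to the suggested value of 0.35.
    """
    return (nir - soil_slope * red - soil_intercept) / (red + z)


if __name__ == "__main__":
    b, a = 1.2, 0.04                         # hypothetical soil-line parameters
    print(gesavi(b * 0.2 + a, 0.2, b, a))    # bare soil on the soil line: ~0
    print(gesavi(0.45, 0.05, b, a))          # hypothetical vegetated pixel: > 0
```

With this reading, the numerator is the vertical distance of the pixel from the soil line, so bare-soil pixels map to approximately zero and vegetated pixels to positive values.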
A different approach in dealing with partial vegetation cover was followed by Nemani et
al. (1993), who introduced a middle infrared band to the traditional NDVI, producing
the corrected NDVI (NDVIc; Eq. 15). The middle infrared helped account for understory
effects when the vegetation cover was partial and the understory vegetation had a
significantly different spectral signature from the tree canopies. As a result, the index was
more closely related to the LAI of conifer forests.

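$$\mathrm{NDVI}_c = \mathrm{NDVI} \left(1 - \frac{\mathrm{MIR} - \mathrm{MIR}_{min}}{\mathrm{MIR}_{max} - \mathrm{MIR}_{min}}\right) \tag{15}$$

In Eq. 15, MIR is the middle infrared reflectance (band 5 of Landsat TM), and MIRmin
and MIRmax are the middle infrared reflectance signals from a completely closed and a
completely open canopy respectively, within the general study area. The correction
therefore leaves the NDVI of a closed canopy unchanged and reduces it progressively as
the canopy opens.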
However, the atmosphere can considerably affect vegetation indices (Kaufman and Sendra,
1988). As discussed earlier, the interference of the atmosphere with the incident solar
radiation affects its intensity and makes it more diffuse through scattering. The NDVI, like
most vegetation indices, suffers from those effects. Consequently, in addition to
minimizing the effect of background, spectral radiance values must be corrected for
atmospheric effects to recover the vegetation signal.
In an attempt to correct these atmospheric effects, Pinty and Verstraete (1992) proposed
the Global Environmental Monitoring Index (GEMI), a complex non-linear polynomial
equation combining the red and near-infrared reflectance. The GEMI was shown to be
more useful in comparing observations under variable atmospheric conditions, and more
representative of surface conditions, than the simple ratio (SR) and the NDVI. Also
aiming to create an index resistant to atmospheric influence, Kaufman and Tanre (1992)
developed the Atmospherically Resistant Vegetation Index (ARVI) for the MODIS sensor.
This index uses the blue and red channels to isolate the atmospheric effects and
subsequently applies the corrections to the red and NIR reflectance. Kaufman and Tanre
(1992) showed that ARVI is four times less sensitive to atmospheric changes than NDVI.
The same principle was applied to SAVI, correcting the red band with the
information contained in the blue band, to produce the Soil-adjusted Atmospherically
Resistant Vegetation Index (SARVI; Huete et al., 1994). Another index which accounts for
residual atmospheric contamination (e.g., aerosols) and variable soil background reflectance
is the Enhanced Vegetation Index (EVI) developed by Huete et al. (1997; Eq. 16), which
normalizes the reflectance in the red band as a function of the reflectance in the blue band.
Evaluation of the radiometric and biophysical performance of EVI, as implemented for the
Moderate Resolution Imaging Spectroradiometer (MODIS), revealed that EVI
remained sensitive to canopy variations (Huete et al., 2002).

$$\mathrm{EVI} = G \cdot \frac{\mathrm{NIR} - \mathrm{red}}{\mathrm{NIR} + C_1 \cdot \mathrm{red} - C_2 \cdot \mathrm{blue} + L} \tag{16}$$

In Eq. 16, G is a gain factor, C1 and C2 are coefficients that correct for aerosol effects and
L is a coefficient to account for canopy background effects. The index was produced for use
with the MODIS sensor.
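
For illustration, Eq. 16 can be coded directly. The default coefficients below (G = 2.5, C1 = 6, C2 = 7.5, L = 1) are the values commonly quoted for the MODIS EVI product; they are not given in this chapter and are included here as an assumption. The function name and the sample reflectances are hypothetical.

```python
def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, canopy_l=1.0):
    """EVI as in Eq. 16.

    The defaults are the coefficients commonly quoted for MODIS
    (an assumption: the chapter does not state their values).
    """
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + canopy_l)


if __name__ == "__main__":
    # Hypothetical surface reflectances for a vegetated pixel.
    print(evi(nir=0.5, red=0.1, blue=0.05))
```

The blue term enters the denominator with a negative sign, so a brighter blue band (for example under heavier aerosol loading) raises the index for the same red and NIR values, offsetting the accompanying increase in red reflectance.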

Gitelson et al. (1996) found the spectral region between 520 and 630 nm (primarily
in the green region of the spectrum) to be more sensitive to chlorophyll fluctuations, even at
very high concentrations. The difference between the red and green bands is that the former is
not affected by the presence of carotenoids. However, the authors found that the 530–570
nm region is not affected by carotenoid absorption either. Instead of using the red band,
they adapted the NDVI and ARVI indices to employ the narrower green band (530–570
nm), creating the two "Green" indices, the Green NDVI (GNDVI) and the Green
Atmospherically Resistant Vegetation Index (GARI).
With the aim of creating an index able to ignore atmospheric effects while
being sensitive to the fraction of photosynthetically active radiation absorbed by
vegetation (fAPAR), Gobron et al. (1999) formulated a vegetation index to be used with MERIS
sensor data. The MERIS Global Vegetation Index (MGVI) uses a red and a NIR band at 681
and 865 nm respectively, in order to estimate the fAPAR. Before the index is calibrated, the
information in the blue band (442 nm) is used to remove the atmospheric effects
present in the red and NIR bands. Comparison of the index with the traditional NDVI has
shown that the MGVI is equally efficient and additionally has a global application. Similarly
to the MGVI, a global vegetation index was designed to be used with SeaWiFS data (Gobron
et al., 2001). This index also used a red and NIR band combination and was corrected for
atmospheric and geometric effects, before being adjusted to maximise its sensitivity to fAPAR.
Due to the limitations of NIR reflectance when vegetation canopy cover is not
complete (Colwell, 1974), Gitelson et al. (2002) formulated an index that only employs
visible reflectance data. The Visible Atmospherically Resistant Index (VARI) uses the
green and red bands for the estimation of vegetation fraction (VF), and the blue band
to compensate for atmospheric effects in the other two bands (Eq. 17).

$$\mathrm{VARI} = \frac{R_{green} - R_{red}}{R_{green} + R_{red} - R_{blue}} \tag{17}$$

Despite the fact that the contrast between the bands in the visible region is not as high as
that between red and NIR, the index was found to be more sensitive than NDVI at high VF,
and the error in estimating VF did not exceed 10% (Gitelson et al., 2002).
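
Eq. 17 needs only the three visible bands, as the following sketch shows. The function name and the reflectance values are hypothetical and chosen only to illustrate the sign of the index.

```python
def vari(green, red, blue):
    """VARI as in Eq. 17: (green - red) / (green + red - blue)."""
    return (green - red) / (green + red - blue)


if __name__ == "__main__":
    # Hypothetical reflectances: vegetation reflects more green than red,
    # whereas many bare soils reflect more red than green.
    print(vari(green=0.12, red=0.08, blue=0.05))   # positive
    print(vari(green=0.08, red=0.12, blue=0.05))   # negative
```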
The well-known issue of early saturation of many vegetation indices has also been the
focus of recent research. Unsalan and Boyer (2004) analysed the
statistical framework for the NDVI. Subsequently, they represented the index as a slope,
converting its relationship with LAI into a linear form and, in effect, reducing the extent of
saturation. An alternative path was followed by Gitelson (2004), who observed that, while red
reflectance exhibits a flat response once LAI exceeds a value of 2, NIR reflectance
remains sensitive to LAI values between 2 and 6. On the other hand, the sensitivity of NIR
reflectance was reduced when its value exceeded 30%. The Wide Dynamic Range
Vegetation Index (WDRVI) is effectively the NDVI with a weighting factor a between 0.1 and 0.2
applied to the NIR reflectance (Eq. 18):

$$\mathrm{WDRVI} = \frac{a \cdot R_{NIR} - R_{red}}{a \cdot R_{NIR} + R_{red}} \tag{18}$$

The value of the a factor is dependent on the vegetation fraction (VF). The new index
was found to be up to three times more sensitive to moderate-to-high LAI values (between 2
and 6). The cause for the increased sensitivity is the effective linearisation of the relationship
with vegetation fraction.
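
The effect of the weighting factor can be illustrated with a short sketch comparing the NDVI and WDRVI responses to the same increase in NIR reflectance. The value a = 0.2 lies in the range quoted above; the reflectances are hypothetical values meant to mimic a dense canopy, not data from Gitelson (2004).

```python
def ndvi(nir, red):
    """Standard NDVI."""
    return (nir - red) / (nir + red)


def wdrvi(nir, red, a=0.2):
    """WDRVI as in Eq. 18; with a = 1 it reduces to the NDVI."""
    return (a * nir - red) / (a * nir + red)


if __name__ == "__main__":
    red = 0.03                          # hypothetical dense-canopy red reflectance
    low_nir, high_nir = 0.30, 0.50      # hypothetical NIR range
    print(ndvi(high_nir, red) - ndvi(low_nir, red))     # small change
    print(wdrvi(high_nir, red) - wdrvi(low_nir, red))   # larger change
```

For these illustrative values the WDRVI changes by considerably more than the NDVI over the same NIR range, which is the wider dynamic range the index is named for.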
In contrast with the numerous studies using the red and NIR spectral bands
discussed so far, a limited number of studies have explored the SWIR spectral bands (e.g., 1.6
and 2.1 µm) for vegetation study. A number of studies have suggested that a combination of
NIR and SWIR bands has the potential to retrieve leaf and canopy water content (e.g.
Hunt & Rock, 1989; Ceccato et al., 2001). One such index is the Moisture Stress Index
(MSI), proposed by Hunt and Rock (1989) to estimate the leaf relative water content (%)
and equivalent water thickness (g cm⁻²) of different plant species. It is calculated as a
simple ratio between the SWIR (1.6 µm) and NIR (0.82 µm) spectral bands (Eq. 19).

$$\mathrm{MSI} = \frac{\mathrm{SWIR}}{\mathrm{NIR}} \tag{19}$$

In analyses of 10-day composites of VGT data, another water index was calculated as
the normalized difference between the NIR (0.78–0.89 µm) and SWIR (1.58–1.75 µm)
spectral bands (Xiao et al., 2002); here it is called the Land Surface Water Index (LSWI; Eq. 20):

$$\mathrm{LSWI} = \frac{\mathrm{NIR} - \mathrm{SWIR}}{\mathrm{NIR} + \mathrm{SWIR}} \tag{20}$$

More recently, Ceccato et al. (2002a, b) proposed the Global Vegetation Moisture Index
(GVMI) to retrieve equivalent water thickness (g cm⁻²) at canopy level, using images from
the SPOT-VGT sensor. This index uses the reflectance values of the rectified NIR band,
which are derived from a complex procedure that involves the blue spectral band and uses the
apparent reflectance as seen at the top of the atmosphere (Gobron et al., 1999), as shown in
Eq. 21 below:

$$\mathrm{GVMI} = \frac{(\rho^{*}_{nir} + 0.1) - (\rho^{*}_{swir} + 0.02)}{(\rho^{*}_{nir} + 0.1) + (\rho^{*}_{swir} + 0.02)} \tag{21}$$

A comparison between GVMI and NDVI showed that the former provided information
related to canopy water content (EWT), while NDVI supplied the information related to
vegetation greenness (Ceccato et al., 2002a).
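
The three water-related indices in Eqs. 19 to 21 can be sketched side by side. The rectification of the NIR band used by the GVMI is not reproduced here: the sketch simply assumes that an already rectified NIR reflectance is supplied. Function names and sample reflectances are hypothetical.

```python
def msi(swir, nir):
    """Moisture Stress Index, Eq. 19: SWIR / NIR."""
    return swir / nir


def lswi(nir, swir):
    """Land Surface Water Index, Eq. 20: normalized NIR-SWIR difference."""
    return (nir - swir) / (nir + swir)


def gvmi(nir_rectified, swir):
    """Global Vegetation Moisture Index, Eq. 21.

    nir_rectified is ASSUMED to be the rectified NIR reflectance; the
    rectification procedure itself is outside the scope of this sketch.
    """
    shifted_nir = nir_rectified + 0.1
    shifted_swir = swir + 0.02
    return (shifted_nir - shifted_swir) / (shifted_nir + shifted_swir)


if __name__ == "__main__":
    nir, swir = 0.40, 0.20      # hypothetical reflectances of a moist canopy
    print(msi(swir, nir), lswi(nir, swir), gvmi(nir, swir))
```

Because SWIR reflectance falls as the water content rises, a wetter canopy gives a lower MSI and higher LSWI and GVMI values.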

3.2. Indices Linked with LAI

The Leaf Area Index (LAI) is a very important plant parameter because its magnitude
affects the amount of radiation that can be absorbed by the canopy. As a result, the LAI has
been related to leaf mass and overall biomass (Wiegand et al., 1990). Vegetation indices
have been evaluated for their relationship with LAI in many studies, for both forest species
and agricultural crops. Estimation of LAI was found to be possible using the simple ratio of
NIR/red (Asrar et al., 1985a; Maas, 1993) and the NDVI (Asrar et al., 1985a) in wheat.
Running et al. (1986) reached the same conclusion for coniferous forests (r² = 0.76 for the
simple ratio and r² = 0.55 for the NDVI). Studying the same area, Peterson et al. (1987)
improved the relationship of the simple ratio with LAI from an r² of 0.83 to 0.91 by using a
log-linear transformation.
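
As a rough illustration of what a log-linear transformation involves, the sketch below fits LAI = c0 + c1·ln(SR) by ordinary least squares and reports the coefficient of determination. Which variable the cited studies transformed is not stated here, so taking the logarithm of the simple ratio is an assumption; the function name and the data are hypothetical.

```python
import math


def fit_log_linear(simple_ratio, lai):
    """Least-squares fit of LAI = c0 + c1 * ln(SR); returns (c0, c1, r2).

    Applying the logarithm to the simple ratio is an ASSUMPTION of this sketch.
    """
    x = [math.log(sr) for sr in simple_ratio]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(lai) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, lai))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    ss_res = sum((yi - (c0 + c1 * xi)) ** 2 for xi, yi in zip(x, lai))
    ss_tot = sum((yi - mean_y) ** 2 for yi in lai)
    return c0, c1, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    sr = [2.0, 4.0, 8.0, 16.0]          # hypothetical NIR/red ratios
    lai = [1.1, 2.4, 3.9, 5.2]          # hypothetical field LAI values
    print(fit_log_linear(sr, lai))
```

Comparing the r² of this fit with that of a plain linear fit on the same plots is the kind of check that would show whether the transformation helps for a given data set.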
The Normalised Difference Vegetation Index (NDVI) has been the most commonly used
index in studies estimating LAI, and the accuracy of those estimates has varied.
The highest coefficient of determination between NDVI
and LAI was r² = 0.95, in a study with corn (Gilabert et al., 1996). However, the limitations
of using the NDVI for LAI estimation are quite severe. The index has been shown to be very
sensitive to the contribution of soil or background vegetation at low LAI values (Baret and
Guyot, 1991). On the other hand, the NDVI-LAI relationship seems to saturate at high LAI
values, because of the reduced contribution of the lower canopy leaves to the overall canopy
reflectance (Asrar et al., 1984; Turner et al., 1999). A corrected version of the NDVI using
middle-infrared information was shown to improve the estimation of LAI (Nemani et al.,
1993), and the short-wave infrared reflectance signal has also been suggested as being able to
estimate LAI, taking advantage of the intense water absorption features in that region (Gong
et al., 2003). In addition to the soil sensitivity and saturation issues, the NDVI-LAI
relationship also appears to be affected by leaf orientation (Baret et al., 1989) and growth
stage (Hatfield et al., 1984). In particular, Curran et al. (1992) showed that the
coefficient of determination between the NDVI and LAI for slash pine is a low r² = 0.35
during February, increasing to r² = 0.86 in March and falling to r² = 0.75 for September data.
Alternative indices have also been evaluated in an attempt to deal with the issues faced
with the NDVI. The Green Vegetation Index (GVI; Jackson, 1983) was evaluated on various
fields of corn and was found to provide accurate estimates of LAI, with r² = 0.78–0.93 (Wiegand
et al., 1990). In order to account for background soil contribution, the PVI was used to
estimate green LAI (Maas, 1988). Wiegand et al. (1990) evaluated the GVI and PVI on corn
and found that a function of those two indices provided a coefficient of determination of r² =
0.937, higher than those produced by the simple ratio and the NDVI. The introduction of
vegetation indices such as SAVI (Huete, 1988) and TSAVI (Baret et al., 1989), which tend to
deal better with soil contribution, prompted their comparison with the more traditional PVI and
NDVI indices. The first two indices appeared to be superior to the latter two, due to their
ability to account for soil contribution (Baret and Guyot, 1991). Introducing the
atmospheric resistance of the ARVI index (Kaufman and Tanre, 1992) into SAVI
created a stronger version of the index (SARVI), which appeared to halve the error
in estimating LAI in comparison with NDVI (Huete et al., 1994).
All vegetation indices have an asymptotic relationship with LAI that saturates at high
LAI values (Spanner et al., 1990; Baret and Guyot, 1991; Turner et al., 1999). Fassnacht et
al. (1997) found that a linear relationship was sufficient to connect LAI and a VI, but a
log-linear transformation could be required if a larger range of LAIs was to be investigated.
Gilabert et al. (2002) found that the generalised soil-adjusted vegetation index (GESAVI)
was less sensitive to soil contribution, and Gitelson (2004) showed that the Wide Dynamic
Range Vegetation Index (WDRVI) was more resistant to saturation, in comparison with the
NDVI. The Enhanced Vegetation Index (EVI; Huete et al., 1997) was shown to be able to make
accurate LAI estimates in the range of LAI between 0 and 8 (Houborg et al., 2007). In order
to deal with the saturation issue, Unsalan and Boyer (2004) suggested representing the
index as a slope and using its inverse tangent to linearise the measure, yielding a new
index, the Linearised Vegetation Index (LVI).

3.3. Indices Associated with the Fraction of Absorbed PAR (fAPAR)

The Photosynthetically Active Radiation (PAR) is the spectral range of light that can be
used by vegetation for the process of photosynthesis. This region is between 400 and 700 nm
(the "visible" region), because in this region the vegetation pigments absorb energy. The
amount of energy that is actually absorbed is known as Absorbed PAR (APAR), and when it
is expressed as a fraction of the total incident radiation it is referred to as the Fraction of
APAR (fAPAR). This energy is very closely related to the primary productivity of plants and
the production of biomass. Dry matter production and the accumulated intercepted
photosynthetically active radiation in the 400–700 nm region of the electromagnetic spectrum
were also shown to be closely related (Biscoe et al., 1975; Monteith, 1977; Gallagher and
Biscoe, 1978). In addition, a quantitative relationship between dry matter production and
intercepted radiation can be established, as Hodges and Kanemasu (1977) showed for barley
canopies. Hence, it is possible to use remote sensing to estimate the solar radiation that is
intercepted by canopies and then converted into dry matter (Daughtry et al., 1983).
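
The chain of reasoning above, from intercepted radiation to dry matter, can be sketched as a simple accumulation. The sketch assumes a constant conversion efficiency and treats remotely sensed fAPAR as the fraction of incident PAR captured by the canopy; the function names, the efficiency value and the time series are hypothetical and are not taken from the cited studies.

```python
def accumulated_apar(fapar, incident_par):
    """Sum of absorbed PAR over paired observations (e.g. daily, in MJ m-2)."""
    if len(fapar) != len(incident_par):
        raise ValueError("series must have the same length")
    return sum(f * p for f, p in zip(fapar, incident_par))


def dry_matter(fapar, incident_par, efficiency):
    """Dry matter = efficiency * accumulated APAR.

    A constant efficiency (e.g. in g per MJ) is an ASSUMPTION of this sketch.
    """
    return efficiency * accumulated_apar(fapar, incident_par)


if __name__ == "__main__":
    fapar_series = [0.2, 0.5, 0.8]        # hypothetical fAPAR estimates from a VI
    par_series = [9.0, 10.0, 11.0]        # hypothetical incident PAR, MJ m-2
    print(dry_matter(fapar_series, par_series, efficiency=3.0))
```

In practice the efficiency term is unlikely to be constant, which is one reason why relationships of this kind can be difficult to transfer between crops and growth stages.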
Early studies showed that both the simple ratio (NIR/red) and the NDVI were closely
related to dry matter accumulation (Tucker et al., 1981). The simple ratio (NIR/red) was used
to derive an empirical relationship with fAPAR on sugarbeet, but that relationship was not
transferable to different crops or different growth stages (Steven et al., 1983). Christensen
and Goudriaan (1993) also used the simple ratio to calculate the cumulative PAR, and the
result was found to be closely related to the above-ground biomass. The NDVI was also
used in many cases for the estimation of PAR. Asrar et al. (1985b) found a strong correlation
between the index and PAR, which was then used to calculate the above-ground phytomass of
wheat. Steinmetz et al. (1990) also found a good relationship between NDVI and PAR, but
highlighted the fact that the relationship is affected by nitrogen and water stress and also that
the rate of conversion from PAR to biomass was dependent on the growth stage of wheat.
The relationship between the NDVI and fAPAR has a low signal-to-noise ratio (North,
2002) and is linear, but it is only valid during the growth stage of the crops (Ruimy et al.,
1994). A possible reason is that the canopy continues absorbing radiation at later crop
stages but contains fewer photosynthetic pigments, which leads to a decrease in the NDVI
(Hatfield et al., 1984; Gallo et al., 1985). On the other hand, when fAPAR is high, NDVI is
less sensitive to fAPAR changes. In those cases the WDRVI (Gitelson, 2004) appears to be
more sensitive, and in cases where hyperspectral data are available (e.g. ESA's MERIS or
NASA's Hyperion sensors), the red-edge NDVI is the most sensitive index (Vina and
Gitelson, 2005).

Other indices were also evaluated for their ability to estimate PAR, fAPAR and, indirectly,
biomass, and in many cases they provided superior results to both the simple ratio and NDVI.
In a study comparing the GVI, PVI, NDVI and RVI, Wiegand et al. (1990) found that a
combination of GVI and PVI gave the most accurate estimates of fAPAR (r² = 0.94). On the
other hand, a similar study showed that the NDVI and PVI were the most accurate indices in
the estimation of fAPAR in cotton (Wiegand et al., 1991). Gobron et al. (1999) evaluated the
MERIS Global Vegetation Index (MGVI) using MERIS data and found that the index
provided more information than the simple NDVI. The WDRVI was used on maize and
soybean plants, employing field spectra, and it was found to be more sensitive than the NDVI
at high fAPAR values (Vina and Gitelson, 2005).

CONCLUSION
This chapter provided an overview of the different radiometric vegetation indices
utilising multispectral remote sensing observations acquired in the reflective part of the EMR
spectrum. To this end, it first reviewed the main properties of
remote sensing in this part of the EMR spectrum, as this was deemed necessary to cover the
theoretical background required for understanding the basic principles on which these
radiometric indices are based.
As indicated by this overview, a wide range of radiometric indices has
been developed in order to establish relationships between such data and biomass, or other
vegetation characteristics that can be indirectly linked to the amount of biomass present. The
overview also made clear that the factors affecting the efficiency
of vegetation indices relate to the characteristics of the recorded signal, which is
affected by bidirectional and atmospheric effects, canopy structure and background
vegetation or soil contribution, scattering, spatial heterogeneity, adjacency effects, non-linear
mixing and topographic effects. These factors are a major concern for the transferability of an
established methodology to a different plant, location, or time. What is more, on many
occasions the reported relationships between vegetation indices and vegetation properties were
empirical in nature, performing well in the particular study but facing transferability issues
when the same indices were evaluated on a different vegetation type, a different location or
even a different time of the year.
As was also indicated in the present review, methods of using vegetation indices for the
estimation of the leaf area index (LAI) have had variable results. Canopy structure and
background contribution appear to play a crucial role in the determination of the canopy
reflectance signal, influencing the performance of any VI–LAI relationship. When
vegetation cover is low, the presence of understory vegetation (primarily in the case of
forests) and the spectral characteristics of the underlying soil affect canopy reflectance and
give erroneous LAI estimates. Soil-adjusted indices that use either information on the soil line
(NIR/red ratio) underneath the canopy, or a factor estimating the vegetation cover, tend to
mitigate the background contribution issue at low LAI values. However, those
indices still present weaknesses that render their operational use questionable. On the other
hand, when the LAI is high, the multiple layers of leaves within the canopy contribute
unequally to the canopy reflectance signal, causing the VI–LAI relationship to saturate.
Estimation of the fraction of absorbed photosynthetically active radiation (fAPAR)
through the use of vegetation indices is another popular method of indirectly assessing the
amount of biomass present. This method is particularly applicable in the case of crops, where
the growth cycle lasts for less than one year. There have been many suggestions on the
optimal period of the growth cycle for the data to be collected, for a variety of crops. It
appears that the best performing method is to use an integrated series of measurements and
calculations of a vegetation index, in order to account for variability through the growth
season and for the fact that crops have a different efficiency in biomass
production for a given fAPAR at different stages of their growth cycle. Initial studies used
integrated NDVI values to estimate biomass production through the fAPAR. However, the
WDRVI was shown to be superior to NDVI, especially at high fAPAR values (Gitelson,
2004).
Recent advances in remote sensing technology have allowed the development, during the
last decades, of a wide range of space-borne multispectral remote sensing systems,
providing a capability for observing land cover at broad spatial scales
and at intervals that were not previously achievable. The recent evolution of remote sensing
technology has also resulted in the development of hyperspectral sensors. Unlike
multispectral sensors, hyperspectral remote sensing systems record spectral information on
land surface targets in numerous narrow contiguous spectral bands. This allows these
systems to provide an enhanced level of information for atmospheric correction and also to
use specific spectral information recorded by selected channels of the sensor, according to
the characteristics of the specific problem under analysis (Hansen and Schjoerring, 2003;
Galvao et al., 2005; Dalponte et al., 2009). A number of both airborne and satellite
hyperspectral remote sensing systems have been developed and launched in recent years;
a review of the most recent ones is provided by Dalponte et al. (2009). The
availability of such rich spectral information can potentially foster the development
of new radiometric vegetation indices, which may overcome the current limitations
and at the same time open up pathways to better map and monitor vegetation health and vigor
and related parameters. Last but not least, if the operational use of such indices at
large scale is intended, the empirical relationships
between vegetation indices and vegetation variables need to be tested on a variety of vegetation
types, locations and conditions, also taking advantage of recent developments in
spaceborne remote sensor technology.

ACKNOWLEDGMENTS
Part of this work was undertaken within the frame of an FP7 funded programme with the
acronym CEUBIOM and full title "Classification of European Biomass Potential for
Bioenergy Using Terrestrial and Earth Observations". The authors wish to thank the anonymous
reviewers for their valuable comments, which resulted in the improvement of the originally
submitted chapter. Dr. Petropoulos is also grateful to INFOCOSMOS E.E.
(http://www.infocosmos.eu) for supporting his participation in the present work.

REFERENCES
Albuelgasim, A. A., and Strahler, A. H., 1994. Modeling bidirectional radiance measurements
collected by the Advanced Solid-state Array Spectroradiometer over Oregon transect
conifer forests, Remote Sens. Environ. 47:261-275.
Allen, W.A. and Richardson, A.J. 1968. Interaction of light with plant canopy. Journal of the
Optical Society of America, 58, 1023-1028.
Asrar, G., Fuchs, M., Kanemasu, E. T. and Hatfield, J. L. 1984. Estimating absorbed
photosynthetic radiation and leaf area index from spectral reflectance in wheat.
Agronomy Journal, 76, 300-306.
Asrar, G., Kanemasu, E. T. and Yoshida, M. 1985a. Estimates of leaf area index from spectral
reflectance of wheat under different cultural practices and solar angle. Remote Sensing of
Environment, 17, 1-11.
Asrar, G., Kanemasu, E. T., Jackson, R. D. and Pinter, P. J. 1985b. Estimation of total above
ground phytomass production using remotely sensed data. Remote Sensing of
Environment, 17, 211-220.
Baret, F. and Guyot, G. 1991. Potential and limits of vegetation indices for LAI and APAR
assessment. Remote Sensing of Environment, 35, 161-173.
Baret, F., Guyot, G. and Major, D. J. 1989. TSAVI: A vegetation index which minimizes soil
brightness effects on LAI and APAR estimation. In Proceedings IGARSS '89 / 12th
Canadian Symposium on Remote Sensing, Vancouver, Canada, 10-14 July 1989, vol. 3,
pp. 1355-1358.
Biscoe, P. V., Gallagher, J. N., Littleton, E. J., Monteith, J. L. and Scott, R. K. 1975. Barley
and its environment, IV. Sources of assimilate for the grain. Journal of Applied Ecology,
12, 295-318.
Buschmann, C. and Nagel, E. 1991. Reflection spectra of terrestrial vegetation as influenced
by pigment-protein complexes and the internal optics of the leaf structure. In: Putkonen J,
editor. Proceedings of the International Geoscience and Remote Sensing Symposium
(IGARSS '91). Espoo, Finland, 3-7 June 1991: New York: IEEE. p 1909-1912.
Cambell, J. E., 1981. Introduction to Remote Sensing, 2nd Ed., New York, Guilford Press.
Ceccato, P., Flasse, S., & Gregoire, J. M. 2002a. Designing a spectral index to estimate
vegetation water content from remote sensing data: Part 2. Validation and applications.
Remote Sensing of Environment, 82, 198– 207
Ceccato, P., Flasse, S., Tarantola, S., Jacquemoud, S., & Gregoire, J. M. 2001. Detecting
vegetation leaf water content using reflectance in the optical domain. Remote Sensing of
Environment, 77, 22–33
Ceccato, P., Gobron, N., Flasse, S., Pinty, B., & Tarantola, S. 2002b. Designing a spectral
index to estimate vegetation water content from remote sensing data: Part 1. Theoretical
approach. Remote Sensing of Environment, 82, 188– 197
Christensen, S. and Goudriaan, J. 1993. Deriving light interception and biomass from spectral
reflectance ratio. Remote Sensing of Environment, 43, 87-95.
Clevers, J. G. P. W. 1988. The derivation of a simplified reflectance model for the estimation
of leaf area index. Remote Sensing of Environment, 25, 53-69.
Clevers, J. G. P. W. 1989. The application of a weighted infrared-red vegetation index for
estimating leaf area index by correcting for soil moisture. Remote Sensing of
Environment, 29, 25-37.
Colwell, J. E. 1974. Vegetation canopy reflectance. Remote Sensing of Environment, 3, 175-
183.
Curran, P.J., Dungan, J. L. and Gholz, H. L. 1992. Seasonal LAI in slash pine estimated with
Landsat TM. Remote Sensing of Environment, 39, 3-13.
Dalponte, M. Bruzzone, L., Vescovo, L. and D. Gianelle, 2009. The role of spectral resolution
and classifiers complexity in the analysis of hyperspectral images of forest areas. Remote
Sensing of Environment, in press.
Daughtry, C. S. T., Gallo, K. P. and Bauer, M. E. 1983. Spectral estimates of solar radiation
intercepted by corn canopies. Agronomy Journal, 75, 527-531.
Deering, D.J., Rouse, J.W., Haas, R.H. and Schell, J.A. 1975. Measuring production of
grazing units from Landsat MSS data. Proceedings of the 10th International Symposium
on Remote Sensing of Environment, ERIM, Ann Arbor, Michigan, August 23-25, pp. 1169-1178.
Deering, D.W. and Eck, T.F., 1987. Atmospheric optical depth effects on angular anisotropy
Elachi, C., 1987. Introduction to the Physics and techniques of remote Sensing, Wiley, New
York,
Elvidge, C. D. and Lyon, R. J. P., 1985. Influence of rock-soil spectral variation on
assessment of green biomass. Remote Sensing of Environment, 17, 265-279.
Fassnacht, K. S., Gower, S. T., MacKenzie, M. D., Nordheim, E. V. and Lillesand, T. M.
1997. Estimating the leaf area index of North Central Wisconsin forests using the Landsat
Thematic Mapper. Remote Sensing of Environment, 61, 229-245.
Filella, I. and Penuelas, J. 1994. The red edge position and shape as indicators of plant
chlorophyll content, biomass and hydric status. International Journal of Remote Sensing,
15, 1459-1470.
Gallagher, J. N. and Biscoe, P. V. 1978. Radiation absorption, growth and yield of cereals.
Journal of Agricultural Science, 91, 47-60.
Gallo, K. P., Daughtry, C. S. T. and Bauer, M. E. 1985. Spectral estimation on absorbed
photosynthetically active radiation in corn canopies. Remote Sensing of Environment, 17,
221-232.
Galvao, L. S., Formaggio, A. R., Tisot, D. A., 2005. Discrimination of sugarcane varieties in
Southeastern Brazil with EO-1 Hyperion data. Remote Sens. Environ. 94, 523-534.
Gates, D. M., Keegan, H. J., Schleter, J. C. and Weidner, V. R. 1965. Spectral properties of
plants. Applied Optics, 4, 11-20.
Gausman, H. W, Allen, W. A. and Escobar, D. E. 1974. Refractive index of plant cell walls.
Applied Optics, 13, 109-111.
Gausman, H. W. and Allen, W. A. 1973. Optical parameters of leaves of 30 plant species.
Plant Physiology, 52, 57-62.
Gilabert, M. A., Gandia, S. and Melia, J. 1996. Analyses of spectral-biophysical relationships
for a corn canopy. Remote Sensing of Environment, 55, 11-20.
Gilabert, M. A., Gonzalez-Piqueras, J., Garcia-Haro, F. J. and Melia, J. 2002. A generalized
soil-adjusted vegetation index. Remote Sensing of Environment, 82, 303-310.
Gitelson, A. A. 2004. Wide dynamic range vegetation index for remote quantification of
biophysical characteristics of vegetation. Journal of Plant Physiology, 161, 165-173.

Gitelson, A. A., Kaufman, Y. J. and Merzlyak, M. N. 1996. Use of a green channel in remote
sensing of global vegetation from EOS-MODIS. Remote Sensing of Environment, 58,
289-298.
Gitelson, A. A., Kaufman, Y. J., Stark, R. and Rundquist, D. 2002. Novel algorithms for
remote estimation of vegetation fraction. Remote Sensing of Environment, 80, 76-87.
Gobron, N., Melin, F., Pinty, B., Verstraete, M. M., Widlowski, J. L. and Bucini, G. 2001. A
global vegetation index for SeaWiFS: Design and applications. In: Beniston M, and
Verstraete, M. M., (eds.) Remote Sensing and Climate Modeling: Synergies and
Limitations, Kluwer Academic Publications. p 5-21.
Gobron, N., Pinty, B., Verstraete, M. and Govaerts, Y. 1999. The MERIS Global Vegetation
Index (MGVI): description and preliminary application. International Journal of Remote
Sensing, 20, 1917-1927.
Goel, N. S., 1988. Models of vegetation canopy reflectance and their use in estimation of
biophysical parameters from reflectance data, Remote Sens. Rev. 4:1-222.
Gong, P., Pu, R., Biging, G. S. and Larrieu, M. R. 2003. Estimation of forest leaf area index
using vegetation indices derived from Hyperion hyperspectral data. IEEE Transactions
on Geoscience and Remote Sensing, 40, 1355-1362.
Goward, S.N, Marklam, B.L., Dye, D.G., and Yang, J. 1991. Normalised difference
vegetation index measurements from the Advanced Very High Resolution Radiometer,
Rem. Sen. Envir., 35, pp. 257-277.
Hansen, P.M., Schjoerring, J.K., 2003. Reflectance measurement of canopy biomass and
nitrogen status in wheat crops using normalized difference vegetation indices and partial
least squares regression. Remote Sens. Environ. 86, 542–553.
Hatfield, J. L., Asrar, G. and Kanemasu, E. T. 1984. Intercepted photosynthetically active
radiation estimated by spectral reflectance. Remote Sensing of Environment, 14, 65-75.
Hodges, T. and Kanemasu, E. T. 1977. Modeling daily dry matter production of winter wheat.
Agronomy Journal, 69, 974-978.
Holben, B., and Kimes, D., 1986. Directional reflectance response in AVHRR red and near-
IR bands for three cover types and varying atmospheric conditions, Remote Sens.
Environ. 19:213-236.
Hosgood, B., Sandmeier, S., Piironen, J., Andreoli, G., and Koechler, C., 1999. Goniometers.
Encyclopedia of Electrical and Electronics Engineering, Wiley, New York, pp. 424-433.
Houborg, R., Soegaard, H. and Boegh, E. 2007. Combining vegetation index and model
inversion methods for the extraction of key vegetation biophysical parameters using
Terra and Aqua MODIS reflectance data. Remote Sensing of Environment, 106, 39-58.
Huete, A. R. 1988. A soil-adjusted vegetation index (SAVI). Remote Sensing of Environment,
25, 295-309.
Huete, A. R., Liu, H. Q., Batchily, K. and van Leeuwen, W. 1997. A comparison of
vegetation indices over a global set of TM images for EOS-MODIS. Remote Sensing of
Environment, 59, 440-451.
Huete, A., Didan, K., Miura, T., Rodriguez, E. P., Gao, X., & Ferreira, L. G. 2002. Overview
of the radiometric and biophysical performance of the MODIS vegetation indices. Remote
Sensing of Environment, 83, 195– 213
Huete, A., Justice, C. and Liu, H. 1994. Development of vegetation and soil indices for
MODIS-EOS. Remote Sensing of Environment, 49, 224-234.
Huete, A.R., Jackson, R.D. and Post, D.F. 1985. Spectral response of a plant canopy with
different soil backgrounds. Remote Sensing of Environment, 17, 37-53.
Huete, A.R., Post, D.F. and Jackson, R.D. 1984. Soil spectral effect on 4-space vegetation
discrimination. Remote Sensing of Environment, 15, 155-165.
Hunt, E. R., & Rock, B. N. 1989. Detection of changes in leaf water content using near-
infrared and middle-infrared reflectance. Remote Sensing of Environment, 30, 43– 54
Jackson, R. D. 1983. Spectral indices in n-Space. Remote Sensing of Environment, 13, 409-
421.
Jackson, R. D., Teillet, P. M., Slater, P. N., Fedosejevs, G., Jasinski, M. F., Aase, J. K., and
Moran, M. S., 1990. Bidirectional measurements of surface reflectance for view angle
corrections of oblique imagery. Remote Sens. Environ. 32: 189-202
Jensen, J. R., 2000. Remote Sensing of the Environment: An Earth Resource Perspective.
Prentice Hall: Saddle River, N.J.
Jordan, C. F. 1969. Derivation of leaf area index from quality of light on the forest floor.
Ecology, 50, 663-666.
Kaufman, Y. J. and Tanre, D. 1992. Atmospherically resistant vegetation index (ARVI) for
EOS-MODIS. IEEE Transactions on Geoscience and Remote Sensing, 30, 261-270.
Kaufman, Y.J. and Sendra, C. 1988. Algorithm for automatic atmospheric corrections to
visible and near-IR satellite imagery. Int. J. Rem. Sens., 9, pp:1357-1381
Kieffer, H. H., Martin, T. Z., Peterfreund, A. R., and Jakosky, B. M., 1977. Thermal and
albedo mapping of Mars during the Viking primary mission, J. Geophys. Res. 82:4249-
4291.
Knipling, E. B. 1970. Physical and physiological basis for the reflectance and near infrared
radiation from vegetation. Remote Sensing of Environment, 1, 155-159.
Lillesand, T. M. and Kiefer, R. W. 1994. Remote Sensing and Image Interpretation, New York,
John Wiley and Sons, 750 pp.
Maas, S. J. 1988. Using satellite data to improve model estimates of crop yield. Agronomy
Journal, 80, 655-662.
Maas, S. J. 1993. Within-season calibration of modeled wheat growth using remote sensing
and field sampling. Agronomy Journal, 85, 669-672.
Major, D. J., Baret, F. and Guyot, G. 1990. A ratio vegetation index adjusted for soil
brightness. International Journal of Remote Sensing, 11, 727-740.
Markham, B. L., Kimes, D. S. and Tucker, C. J. 1981. Temporal spectral response of a corn
canopy. Photogrammetric Engineering and Remote Sensing, 48, 1599-1605.
Monteith, J. L. 1977. Climate and the efficiency of crop production in Britain. Philosophical
Transactions of the Royal Society in London Ser. B., 281, 277-294.
Myneni, R. B., Ross, J., and Asrar, G., 1990. A review on the theory of photon transport in
leaf canopies, Agric. For. Meteorol. 45:1-153.
Nemani, R. P. L., Running, S. W. and Band, L. 1993. Forest ecosystem processes at the
watershed scale: Sensitivity to remotely sensed leaf area index estimates. International
Journal of Remote Sensing, 14, 2519-2534.
Nicodemus, F. E., Richmond, J. C., Hsia, J. J., Ginsberg, I. W.,and Limperis, T., 1977.
Geometrical considerations and nomenclature for reflectance. Nat. Bur. Standards Mono.
160:52.
North, P. R. J. 2002. Estimation of fAPAR, LAI, and vegetation fractional cover from ATSR-
2 imagery. Remote Sensing of Environment, 80, 114-121.
of plant canopy reflectance. Int. J. Remote Sens. 8, pp. 893–916


Pearson, R. L. and Miller, L. D. 1972. Remote mapping of standing crop biomass for
estimation of productivity of the shortgrass prairie, Pawnee National Grasslands,
Colorado. In Proceedings of the 8th International Symposium on Remote Sensing of
Environment, vol. 2, Environmental Research Institute of Michigan, Ann Arbor, pp.
1355-1381.
Peterson, D. L., Spanner, M. A., Running, S. W. and Teuber, K. B. 1987. Relationship of
Thematic Mapper simulator data to leaf area index of temperate coniferous forests.
Remote Sensing of Environment, 22, 323-341.
Pinty, B. and Verstraete, M. M. 1992. GEMI: A non-linear index to monitor global vegetation
from satellites. Vegetatio, 101, 15-20.
Pinty, B., and Ramond, D., 1986. A simple bidirectional reflectance model for terrestrial
surfaces, J. Geophys. Res. 91:7803-7808.
Qi, J., Chehbouni, A., Huete, A.R., Kerr, Y.H. and Sorooshian, S. 1994. A modified soil
adjusted vegetation index. Remote Sensing of Environment, 48, 119-126.
Qin, W., 1993. Modeling bidirectional reflectance of multicomponent vegetation canopies,
Remote Sens. Environ. 46: 235-245.
Rees, W.G. (2001). Physical Principles of Remote Sensing. 2nd Edition, Cambridge
University Press, 343 pp
Richardson, A. J. and Wiegand, C. L. 1977. Distinguishing vegetation from soil background
information. Photogrammetric Engineering and Remote Sensing, 43, 1541-1552.
Rondeaux, G., Steven, M. and Baret, F. 1996. Optimization of soil-adjusted vegetation
indices. Remote Sensing of Environment, 55, 95-107.
Rouse, J. W. Jr., Haas, R. H., Schell, J. A. and Deering, D. W. 1974. Monitoring vegetation
systems in the Great Plains with ERTS. In Third ERTS Symposium, NASA SP-351, U.S.
Government Printing Office, Washington, DC, vol. 1, pp. 309-317.
Ruimy, A., Saugier, B. and Dedieu, G. 1994. Methodology for the estimation of terrestrial net
primary production from remotely sensed data. Journal of Geophysical Research, 99,
5263-5283.
Running, S. W., Peterson, D. L., Spanner, M. A. and Teuber, K. B. 1986. Remote sensing of
coniferous forest leaf area. Ecology, 67, 273-276.
Sabins, F. F., Jr., 1997. Remote Sensing: Principles and Interpretation, New York: W. H.
Freeman and Co., 494 pp.
Schaaf, C. B., and Strahler, A. H., 1994. Validation of bidirectional and hemispherical
reflectance from a geometric optical model using ASAS imagery and pyranometer
measurements of a spruce forest, Remote Sens. Environ. 49: 138-144.
Allen, W.A. and Richardson, A.J. 1968. Interaction of light with plant canopy. Journal of the
Optical Society of America, 58, 1023-1028.
Sellers, P. J. 1987. Canopy reflectance, photosynthesis and transpiration II. The role of the
biophysics in the linearity of their dependence. Remote Sensing of Environment, 21, 143-
183.
Slater, P.N., 1980. Remote Sensing: Optics and optical Systems, New York: Addison-
Wesley, Inc., 575 pp
Slaton, M. R., Hunt Jr., R. and Smith, W. K. 2001. Estimating near-infrared leaf reflectance
from leaf structural characteristics. American Journal of Botany, 88, 278-284.
Steinmetz, S., Guerif, M., Delecolle, R. and Baret, F. 1990. Spectral estimates of the absorbed
photosynthetically active radiation and light-use efficiency of a winter wheat crop
subjected to nitrogen and water deficiencies. International Journal of Remote Sensing,
11, 1797-1808.
Steven, M. D., Biscoe, P. V. and Jaggard, K. W. 1983. Estimation of sugarbeet productivity
from reflection in the red and infrared spectral bands. International Journal of Remote
Sensing, 4, 325-334.
Strahler, A. H., and Jupp, D. L. B., 1991. Geometric-optical modeling of forests as
remotely-sensed scenes composed of three-dimensional, discrete objects, in Photon-Vegetation
Tucker, C. J., Holben, B. N., Elgin, J. H. and McMurtrey, J. E. 1981. Remote sensing of total
dry matter accumulation in winter wheat. Remote Sensing of Environment, 11, 171-189.
Turner, D. P., Cohen, W. B., Kennedy, R. E., Fassnacht, K. S. and Briggs, J. M. 1999.
Relationships between Leaf Area Index and Landsat TM spectral vegetation indices
across three temperate zone sites. Remote Sensing of Environment, 70, 52-68.
Unsalan, C. and Boyer, K. L. 2004. Linearized vegetation indices based on formal statistical
framework. IEEE Transactions on Geoscience and Remote Sensing, 42, 1575-1585.
Verbyla, D. L. 2001. Practical GIS Analysis. Taylor & Francis Press. 12 chapters.
Verbyla, D., 1995. Satellite Remote Sensing of Natural resources. Lewis publ., Florida.
Verstraete, M. M., Pinty, B., and Dickinson, R. E., 1990. A physical model of the bidirectional
reflectance of vegetation canopies, 1: Theory, J. Geophys. Res. 95:11,755-11,765.
Vina, A. and Gitelson, A. A. 2005. New developments in the remote estimation of the
fraction of absorbed photosynthetically active radiation in crops. Geophysical Research
Letters, 32, L17403.
Wiegand, C. L., Gerbermann, A. H., Gallo, K. P., Blad, B. L. and Dusek, D. 1990. Multisite
analyses of spectral-biophysical data for corn. Remote Sensing of Environment, 33, 1-16.
Wiegand, C. L., Maas, S. J., Aase, J. K., Hatfield, J. L., Pinter Jr., P. L., Jackson, R. D.,
Kanemasu, E. T. and Lapitan, R. L. 1992. Multisite analyses of spectral-biophysical data
for wheat. Remote Sensing of Environment, 42, 1-21.
Wiegand, C. L., Richardson, A. J., Escobar, D. E. and Gerbermann, A. H. 1991. Vegetation
indices in crop assessments. Remote Sensing of Environment, 35, 105-119.
Xiao, X., Boles, S., Liu, J. Y., Zhuang, D. F., & Liu, M. L. (2002). Characterization of forest
types in Northeastern China, using multi-temporal SPOT-4 VEGETATION sensor data.
Remote Sensing of Environment, 82, 335– 348
In: Ecological Modeling ISBN: 978-1-61324-567-5
Editor: WenJun Zhang, pp. 41-64 © 2012 Nova Science Publishers, Inc.

Chapter 3

DEVELOPMENT OF A DECISION SUPPORT SYSTEM
FOR THE ESTIMATION OF SURFACE WATER
POLLUTION RISK FROM OLIVE MILL
WASTE DISCHARGES

Anas Altartouri¹, Kalliope Pediaditi²,³, George P. Petropoulos²,⁴,
Dimitris Zianis² and Nikos Boretos⁵

¹ School of Science and Technology, Aalto University, Niemenkatu 73,
15140 Lahti, Finland. Email: anas.altartouri@aalto.fi
² Department of Environmental Management,
Mediterranean Agronomic Institute Chania,
Alsyllion Agrokepion, Chania, Crete, 73100, Greece.
Email: dzianis_2000@yahoo.com
³ Ministry of Environment, Energy and Climate Change, 17 Amaliados str.,
11523 Athens, Greece. Email: kalliapediaditi@hotmail.com
⁴ Department of Natural Resources Development & Agricultural Engineering,
Agricultural University of Athens, 75, Iera Odos St., Athens, Greece.
Email: petropoulos.george@gmail.com
⁵ Department of Information Systems and Technology,
Mediterranean Agronomic Institute Chania,
Alsyllion Agrokepion, Chania, Crete, 73100, Greece.

ABSTRACT
According to the Water Framework Directive (WFD, 2000/60/EC), Integrated River
Basin Management Plans (RBMP) are required at different scales in order to prevent,
amongst other things, water resource deterioration and to ensure water pollution reduction.
An integrated river basin management approach underpins a risk-based land management
framework for all activities within a spatial land-use planning framework. To this end, a
risk assessment methodology is required to identify water pollution hazards in order to
set appropriate environmental objectives and in turn design suitable mitigation measures.
Surface water pollution as a result of Olive Mill Waste (OMW) discharge is a serious
hazard in the olive oil producing regions of the Mediterranean. However, there is no
standardised method to assess the risk of water pollution from olive mill waste for any
given river basin. The present chapter reports a study that addresses this gap by
designing a detailed risk assessment methodology, which uses GIS modelling to
classify the individual sub-catchments of a watershed according to their risk of
water pollution from olive mill waste discharges. The chapter presents the
proposed criteria and calculations required to estimate sub-catchment risk significance
and comments on the method's potential for wider application. The methodology combines
elements from risk assessment frameworks, Multi Criteria Analysis (MCA), and Geographic
Information Systems (GIS). MCA is used to aggregate different aspects and elements
associated with this environmental problem, while GIS modeling tools help in
obtaining many criterion values and provide insight into how different objects interact
in nature and how these interactions influence risk at the watershed level. The proposed
method was trialed in the Keritis watershed in Crete, Greece, and the results indicated that
it has the potential to be a useful guide for prioritising risk management actions
and mitigation measures, which can subsequently be incorporated in river basin
management plans.

Keywords: Decision Support Systems (DSS), Geographic Information System (GIS), Multi
Criteria Analysis (MCA), Olive Mill Wastewater (OMW), risk assessment, water
pollution.

1. INTRODUCTION
A major environmental issue in the Mediterranean region is the pollution of aquatic
ecosystems through the discharge of industrial and domestic effluents into water bodies
(Karageorgis et al., 2003). One of the main polluting activities is the discharge of effluents
generated by the olive mill industry. Olive mill wastewater (OMW) is the liquid
by-product generated during olive oil production. OMW contains pollutants and hazardous
materials in varying concentrations, which may harm natural water bodies and,
consequently, human and environmental health. Indicatively, Paliatziki (2006)
states that 50 m³ of olive oil mill wastewater are equivalent to the waste produced by 30,000
citizens. The main reasons for the environmental degradation caused by OMW are the
dispersed spatial location of a large number of small-sized olive oil mills together with
the concentration and seasonal production of OMW; the Mediterranean region accounts
for 95% of the global OMW production (Kapellakis et al., 2002; Niaounakis and
Halvadakis, 2006).
According to Niaounakis and Halvadakis (2006), OMW is composed of water (80-83%),
organic compounds (15-18%), and inorganic compounds (mainly potassium, salts and
phosphates, 2%). It contains phytotoxic and biotoxic substances and is non-biodegradable. It
has a high organic load and is classified among the ‘strongest’ industrial effluents, with a
Chemical Oxygen Demand (COD) of up to 220 g/l. As a consequence, the oxygen dissolved
in the receiving water bodies is heavily consumed, which harms living organisms and
may unbalance the whole ecosystem (Niaounakis and
Halvadakis, 2006). Similar effects can also result from the high phosphorus content of OMW,
which accelerates the growth of algae, leading to eutrophication. Besides the high number of
bacteria and fungi in OMW, the high concentration of nutrients may cause
infection of water bodies, since it provides an ideal medium for pathogens to multiply in
them. Moreover, OMW contains long-chain fatty acids and phenolic compounds, with a high
percentage of dissolved mineral salts (ibid). All these contaminants and their impacts on
natural water bodies can have significant consequences for the environment and for people
who may come into contact with these water bodies (Morrison et al., 2001; Ogunfowokan et al.,
2005; Niaounakis and Halvadakis, 2006).
The EU Water Framework Directive (WFD, 2000/60/EC) aims to harmonize existing
European water policies and to improve water quality in all aquatic environments within the
community area. It emphasizes the need for new Integrated River Basin Management Plans
(RBMP) at national and regional/local scale, resulting in the protection and improvement of
the sustainable use of all waters (Heathwite et al., 2005; Rekolainen et al., 2003). The main
objectives of the RBMP include the prevention of further deterioration of water resources and
the promotion of sustainable water use that ensures the progressive reduction of pollution.
These elements in the EU legislation, policies, and programs underpin the need to apply a
risk-based land management framework to all activities within a spatial land-use planning
framework (Heathwite et al., 2005). However, there is no standardised method to assess the
risk of water pollution from OMW for any given river basin. Due to the small scale and
dispersed nature of OMW processing and disposal, regulation requires a risk-based
approach that uses planning (for example, through RBMP) to prevent pollution, rather
than relying on post-development or post-incident mitigation. To this end, a risk
assessment methodology is needed for this point-source water pollution in order to set
appropriate environmental objectives and risk zones to integrate within RBMP, as well as to
design suitable mitigation measures.
This chapter presents a risk assessment method based on general frameworks to address
different aspects of the environmental problems associated with OMW pollution. It
provides an analytical approach based on an in-depth investigation of the elements of OMW
pollution risk and the linkages between them, resulting in a conceptual model for
understanding and coping with this problem. This model provides insight into how different
objects interact in nature and how these interactions influence water resources in a
watershed. As a number of tools and techniques are needed for the risk assessment process to
effectively support decision makers (Allan et al., 2006), the proposed method combines field
and desk-based techniques and uses different tools, such as Geographic Information Systems
(GIS) and Multi Criteria Analysis (MCA). Integrating GIS and MCA in the developed risk
assessment method takes advantage of the analytical capabilities of MCA on the one hand
and, on the other, of the information processing and display capabilities of GIS, since
risk assessment should involve an analysis of spatial variability that considers differences
between locations (Allan et al., 2006; Lapucci et al., 2005). The chapter presents the proposed
approach, consisting of all criteria and calculation models required to quantitatively estimate
risk significance at each sub-catchment within the river basin under investigation, and
comments on the method's potential for wider application.
2. RISK ASSESSMENT FRAMEWORK AND CONCEPTUAL
MODEL OF OMW POLLUTION
In order to develop a risk assessment process for water pollution from OMW, it is
important to describe the theoretical risk assessment frameworks as well as the conceptual
model of the risk-generating process of OMW pollution and its controlling factors. Briefly,
there are several risk assessment frameworks which provide a range of definitions of risk
(DEFRA, 2002; Maltby, 2006; enHealth, 2004; EPA, 1998). Whilst the fundamental processes
are usually similar, slightly different terminologies are used internationally to describe
components of the risk assessment process (Power and McCarty, 1998). For the purpose of
this research, risk is defined as “a combination of the probability, or frequency, of occurrence
of a defined hazard and the magnitude of the consequences of the occurrence” (DEFRA,
2002, p2). Accordingly, risk can be expressed as: Risk = Probability × Magnitude (Donoghue,
2001; Pediaditi et al., 2005; Billington, 2005).
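
The product form of this definition can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `risk_score`, the assumption that both factors are already scaled to the range [0, 1], and the sub-catchment values are hypothetical and are not taken from the chapter.

```python
def risk_score(probability: float, magnitude: float) -> float:
    """Risk = Probability x Magnitude, with both inputs assumed to lie in [0, 1]."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if not 0.0 <= magnitude <= 1.0:
        raise ValueError("magnitude must lie in [0, 1]")
    return probability * magnitude


# Hypothetical sub-catchments, each with a (probability, magnitude) pair.
sub_catchments = {"A": (0.8, 0.25), "B": (0.25, 0.5), "C": (0.5, 0.5)}

# Rank sub-catchments from highest to lowest risk.
ranked = sorted(
    sub_catchments,
    key=lambda name: risk_score(*sub_catchments[name]),
    reverse=True,
)
print(ranked)
```

Because the score is a product, a sub-catchment with a high probability but a small magnitude (A) can rank below one with moderate values of both (C).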

Box 1. Components and criteria of OMW pollution risk

Risk magnitude components
• Component 1: the spatial scale of consequences
    Criterion 1 (Cr.1): extent of potentially harmed receptors
• Component 2: the temporal scale of consequences
    Criterion 2 (Cr.2): possible sedimentation areas

Risk probability components
• Component 3: the probability of hazard occurring
    Criterion 3 (Cr.3): precipitation
    Criterion 4 (Cr.4): waste volume to lagoon capacity ratio
    Criterion 5 (Cr.5): lagoon conditions
• Component 4: the probability of receptors being exposed to hazard
    Criterion 6 (Cr.6): length of the flow path to surface water bodies
• Component 5: the probability of harm resulting
    Criterion 7 (Cr.7): surface water quality

As this surface water OMW pollution risk assessment method has been
designed to be used in RBMP, it was based on the generic framework of the DEFRA
Guidelines for Environmental Risk Assessment and Management (2002), which makes clear
links between risk assessment and risk management, focusing on the practical implementation
of risk assessment results to generate risk management solutions (Power & McCarty, 1998). In
addition, it provides a clear breakdown of risk into basic components upon which several
criteria have been developed (see Box 1). To determine these components, the
olive mill waste process was studied from cradle to grave using field observations, expert
opinion, and literature review. Based on the above, the conceptual model illustrated in Figure
1 was developed, which subsequently served as the basis for the
quantitative assessment of risk from OMW pollution.
As illustrated in Figure 1, it is essential for the risk assessment process to define the three
elements of risk: sources, pathways, and receptors, which subsequently help in developing
quantitative criteria. A source of stressors can be defined as the place where the stressor
originates or is released (EPA, 1998), which in this case are the lagoons where OMW is
gathered. These lagoons are the first element of the exposure pathway and the entities where
hazardous events, such as heavy storms or spillage, may occur and, hence, the probability of
hazard occurring is strongly associated with them (Component 3, Box 1).

Figure 1. Illustration of the factors considered in criterion development.

In the case of OMW pollution risk, the pathway consists of the mechanism by which
receptors are exposed to the water polluted by OMW and, therefore, it controls the probability
of receptors being exposed to the hazard (Component 4, Box 1). Water bodies, ground and
surface, are considered the potential pathways by which receptors may be exposed to the
hazard of OMW. However, surface water bodies are more likely to
transmit the pollutants released from the lagoons. This is because the hazardous event
most likely to cause a lagoon to overflow is a heavy storm, during which a
large amount of water runs off to the drainage network, meaning that released OMW is
more likely to be washed into surface water bodies than to infiltrate into
groundwater bodies.
The last elements in this chain are the receptors which in this case are people or actual
environmental values to be protected. Studying the probability of adverse effects resulting
requires a clear definition of the receptors to be addressed (Allan et al., 2006). For the
purpose of this study, humans and high-value ecological sites (such as NATURA 2000 sites) are
considered the potential receptors that may be harmed by water pollution from OMW.
The choice of these two generic receptors is underpinned by the fact that they are the most
sensitive and protected according to EU legislation. As specific dose-response exposure
data are limited in the literature, the use of highly valued receptors protected by legislation
is the first step in the practical implementation of this method in the context of the WFD and
RBMP.

3. DESCRIPTION OF DEVELOPED RISK-EVALUATING CRITERIA


The methodology presented in this chapter calculates the risk of OMW to
human health and protected areas through surface water bodies. Using the components of the
DEFRA Guidelines (Box 1), it proposes a number of criteria specific to the risk of OMW
pollution. The process of OMW pollution risk assessment is treated as a multi-criteria
evaluation, since the components of risk depend on several factors which are combined in a
specific way (Mendoza et al., 2002; Eastman, 2003). The criteria have been selected on the
basis that they play a role in the risk-generating process, considering the perspectives of all
agents and the behaviors of all the environmental elements playing a part in this process
(Lapucci et al., 2005). The rationale behind each criterion is described below.

3.1. Extent of Possible Harmed Receptors (Cr.1)

This criterion is proposed to calculate the magnitude of OMW pollution risk at the spatial
scale (Component 1, Box 1). As the main receptor groups of OMW pollution risk are humans
and protected areas, the calculation of this criterion is strongly associated with the spatial
distribution of these receptors and is carried out separately for each receptor group.
Regarding the human receptor group, the magnitude of consequences can be estimated by
the number of affected inhabitants: the larger the population, the greater
the potential consequences of water pollution in the watershed. Thus, the distribution of
towns and population within the sub-catchments of the river basin under investigation should
be mapped. However, the term ‘potentially exposed inhabitants’ also includes all people who
may be exposed to the polluted water in the area regardless of their place of residence. This
may include people whose drinking water source is located in this area, as well as people who
may come into contact with the surface water bodies, such as tourists or farmers.
For the other receptor group, the extent of protected areas of
high ecological value should be taken into account in order to calculate the spatial
scale of the magnitude of consequences (Component 1, Box 1). The larger the extent of
these protected areas in the river basin, the greater the number of impacted species.
The potentially exposed area within these sites can be delineated by creating a buffer around
the surface water bodies through which the pollutants may be transported. Historical
observations of the heavy storms and floods in the region should be used in determining the
extension of
the buffer zone. For the EU territories, NATURA 2000 sites can represent these ecologically
valuable areas. NATURA 2000 is a European network of protected sites which represent
areas of the highest environmental value for natural habitats (for several plant and animal
species) which are rare, endangered or vulnerable in the European Community. However,
designated environmental sites other than NATURA 2000 sites, where they exist, should also
be included as receptors¹. Ideally, an ecological baseline assessment of the areas under
consideration should be undertaken in order to identify areas of potentially high ecological
value which might not be designated. However, following discussion with potential end users
of this tool, the feasibility of carrying this out was questioned. Therefore, at a coarser level
and as a first step, the area of potentially exposed protected sites can be used, although a
baseline assessment is preferable.
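
The buffer-based delineation of exposed protected area described above can be illustrated on a small raster. The sketch below is a simplified, hypothetical stand-in for a GIS buffer operation: it assumes the stream network and the protected sites are available as boolean grids of equal shape, and it uses a square (Chebyshev) buffer measured in cells rather than a true Euclidean buffer. The function name, the grids and the cell size are illustrative assumptions.

```python
def exposed_protected_area(stream, protected, buffer_cells, cell_area):
    """Area of protected cells lying within `buffer_cells` of any stream cell.

    `stream` and `protected` are equally sized grids of 0/1 flags.
    Distance is counted in cells with the Chebyshev metric (square buffer).
    """
    rows, cols = len(stream), len(stream[0])
    stream_cells = [
        (r, c) for r in range(rows) for c in range(cols) if stream[r][c]
    ]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not protected[r][c]:
                continue
            # A protected cell is exposed if any stream cell is close enough.
            if any(
                max(abs(r - sr), abs(c - sc)) <= buffer_cells
                for sr, sc in stream_cells
            ):
                count += 1
    return count * cell_area


# Hypothetical 4 x 5 grid with 10 m cells (100 m2 each).
stream = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],  # a stream running along the second row
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
protected = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],  # protected site occupying the two right-hand columns
]
area_m2 = exposed_protected_area(stream, protected, buffer_cells=1, cell_area=100.0)
print(area_m2)
```

In practice the buffer width would be chosen from the historical storm and flood observations mentioned in the text; here it is simply a parameter.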

3.2. Possible Sedimentation Areas Criterion (Cr.2)

The sedimentation in a specific sub-catchment is considered to be a key factor in relation
to the temporal aspect of the magnitude of consequences (Component 2, Box 1) for both
receptor groups. Sediments may contain hazardous materials and may extend the time scale of
the consequences over a long period. Therefore, the larger the sedimentation areas in a sub-
catchment, the greater the temporal magnitude of consequences. Sediments may settle in long
flat areas with low stream current velocity and, therefore, field investigation of the
topographic nature of the area as well as expert opinion are essential, as sedimentation may
not be applicable in some cases.

3.3. Precipitation Criterion (Cr.3)

Precipitation is one of the factors that may lead to hazard occurrence (Component 3, Box
1). It is important in terms of the potential overflow (hazardous event) of the lagoons where
the waste is deposited. Overflowing causes the pollutants to reach the water bodies
and, therefore, the hazard occurs. The greater the precipitation over the area, the higher the
probability of overflowing. Records of occasional heavy storm events should be considered
in estimating the probability of a lagoon overflowing. Alternatively, the maximum monthly
precipitation may be used as a substitute in estimating the overflow probability.
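
As a minimal sketch of the alternative mentioned above, the maximum monthly precipitation can be derived from a daily rainfall record as follows. The record format (ISO date string, millimetres), the function name and the sample values are assumptions made for illustration; they are not data from the study area.

```python
from collections import defaultdict


def max_monthly_precipitation(daily_records):
    """Return (month, total_mm) for the wettest month in the record.

    `daily_records` is an iterable of ("YYYY-MM-DD", millimetres) pairs.
    """
    totals = defaultdict(float)
    for date, mm in daily_records:
        totals[date[:7]] += mm  # "YYYY-MM" identifies the month
    wettest = max(totals, key=totals.get)
    return wettest, totals[wettest]


# Hypothetical daily rainfall record (mm).
records = [
    ("2010-11-03", 12.0),
    ("2010-11-20", 30.5),
    ("2010-12-01", 55.0),
    ("2010-12-15", 10.0),
    ("2011-01-09", 20.0),
]
month, total_mm = max_monthly_precipitation(records)
print(month, total_mm)
```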

3.4. Waste Volume to Lagoon Capacity Ratio Criterion (Cr.4)

The ratio between generated waste volume and lagoon capacity is another factor which
affects the probability of hazard occurrence, i.e. the probability that OMW lagoons overflow
(Component 3, Box 1). If the amount of produced waste is larger than the lagoon capacity,
then the hazard is more likely to occur. Conversely, if the lagoons are well designed to
include a buffer margin which can accommodate excess precipitation, then the probability

¹ This is particularly relevant for the application of this method in non-EU countries, which
do not belong to the NATURA 2000 network.
of hazard occurrence approaches zero unless other events, such as an extreme rainstorm, occur.
Therefore, the higher the ratio, the higher the probability of the hazard occurring.
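
The ratio itself is a simple quotient, sketched below. The function names, the cubic-metre units and the example volumes are illustrative assumptions; how the ratio is converted into a probability score is not specified in this section and is deliberately left out.

```python
def volume_capacity_ratio(waste_volume_m3, lagoon_capacity_m3):
    """Ratio of generated waste volume to lagoon capacity (Cr.4)."""
    if lagoon_capacity_m3 <= 0:
        raise ValueError("lagoon capacity must be positive")
    return waste_volume_m3 / lagoon_capacity_m3


def exceeds_capacity(waste_volume_m3, lagoon_capacity_m3):
    """True when the waste produced is larger than the lagoon can hold."""
    return volume_capacity_ratio(waste_volume_m3, lagoon_capacity_m3) > 1.0


# Hypothetical lagoons: (waste volume, capacity) in cubic metres.
print(volume_capacity_ratio(1200, 1000), exceeds_capacity(1200, 1000))
print(volume_capacity_ratio(600, 1000), exceeds_capacity(600, 1000))
```

A ratio well below 1 corresponds to the well-designed case with a buffer margin, while a ratio above 1 corresponds to the case in which the hazard is more likely to occur.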

3.5. Lagoon Conditions Criterion (Cr.5)

The structural condition of the lagoons where the waste is gathered has to be
considered. It is an essential factor which affects the probability of the hazard occurring
(Component 3, Box 1). Infiltration of pollutants through the base of the lagoons and
spillage through the walls, due to poor lagoon conditions (e.g. lack of maintenance or
structural malfunction), cause the hazard to occur. Therefore, the better the structural
condition of the lagoons, the lower the probability of pollutants reaching water bodies. In
order to calculate this criterion, site visits to the lagoons are required, as well as an
investigation of their specifications and permits.

3.6. Length of the Flow Path to Surface Water Bodies Criterion (Cr.6)

Water bodies close to lagoons have a higher probability of being contaminated should an
OMW discharge event occur. The longer the flow path to a sub-catchment, the greater the
probability that pollutants are diluted before reaching receptors. This criterion plays a major
role in estimating the probability of stressor and receptor co-occurrence (Component 4,
Box 1). The actual length of the flow paths can be determined using the 3D tools in GIS
together with a Digital Elevation Model (DEM). Several flow paths can be drawn for a
single lagoon. Each of these flow paths starts at the lagoon and ends at the point where it
joins the water bodies in a receiving sub-catchment.
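
As a rough illustration of how a flow-path length might be obtained once the path vertices and their DEM elevations are known, the sketch below sums the three-dimensional segment lengths of a polyline. The function name, the (x, y, z) tuple layout in metres, and the sample coordinates are assumptions made for illustration only; in practice the 3D tools of a GIS package would perform this step.

```python
import math


def flow_path_length_3d(vertices):
    """Return the 3D length of a polyline given as (x, y, z) vertices in metres."""
    total = 0.0
    # Walk consecutive vertex pairs and add the straight-line 3D distance.
    for (x0, y0, z0), (x1, y1, z1) in zip(vertices, vertices[1:]):
        total += math.sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2)
    return total


# Hypothetical path from a lagoon to the point where it joins a stream.
path = [(0.0, 0.0, 120.0), (30.0, 40.0, 120.0), (30.0, 40.0, 108.0)]
length_m = flow_path_length_3d(path)
```

When several flow paths are drawn for one lagoon, the same function could be called once per path, giving one Cr.6 value per receiving sub-catchment.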

3.7. Surface Water Quality Criterion (Cr.7)

The probability that receptors are harmed by exposure to the hazard of water
pollution (Component 5, Box 1) depends on the concentrations of polluting substances in the
water bodies with which they come into contact. Water quality parameters are the key factors for
estimating the impact of these pollutants on the receptors. If the concentrations of polluting substances
are within the limits allowed by legislation, then the probability of harm is close to zero. Conversely,
the probability of harm is very high when the concentrations exceed the
contamination thresholds. For example, if a lagoon is located in a catchment, then iron (Fe)
could be one of the pollutants (hazards) found in water samples taken from the
water bodies of that catchment. According to the WFD, the guide level and the maximum
admissible concentration of iron in drinking water are 50 µg/l and 200 µg/l, respectively.
Therefore, a scale from zero to one is established to express the likelihood of harm.
On this scale, a probability of 0 is assigned to samples with an iron concentration of 50 µg/l or less,
and 1 to those with an iron concentration of 200 µg/l. Intermediate
concentrations are assigned values between 0 and 1 according to the linear equation that
interpolates between these two limit values (section 4.2). This should be applied to all the pollutants
associated with OMW. Table 1 lists the chemicals associated with OMW (Niaounakis
Development of a Decision Support System … 49

and Halvadakis, 2006) and their guide and maximum admissible levels according to the
WFD (2000/60/EC).
However, as the WFD provides dose-response data only for humans, this
method proposes the degree of dilution as an alternative way to calculate this criterion when
addressing the risk of OMW pollution to protected areas. The concentrations of pollutants
in the surface water bodies located in protected areas can be inferred by
taking into consideration the pathway by which these pollutants are transported. Pollutants
can be carried by direct surface runoff or, alternatively, by streams. In direct surface
runoff, pollutant concentrations remain almost the same as at the original pollution
source, or are only slightly diluted by rainwater, whereas pollutants carried by streams are
subject to greater dilution, which decreases their concentration before they reach
ecological sites. The degree and speed of dilution depend on stream velocity as
well as on the quantity of stream water. These two parameters can be expressed in terms of
stream order. Therefore, the higher the stream order, the faster the dilution and, consequently, the
lower the probability that protected areas are harmed.

Table 1. Water quality parameters associated with OMW

No.       Parameter                      Guide level (mg/l)   Maximum admissible level (mg/l)
Cr.7.1    Copper (Cu)                    0.1                  3.0
Cr.7.2    Iron (Fe)                      0.05                 0.2
Cr.7.3    Lead (Pb)                      0                    0.05
Cr.7.4    Magnesium (Mg)                 30.0                 50.0
Cr.7.5    Manganese (Mn)                 0.02                 0.05
Cr.7.6    Nickel (Ni)                    0.0                  0.05
Cr.7.7    Nitrogen (N)                   0                    1.0
Cr.7.8    pH                             6.5                  8.5
Cr.7.9    Phenols                        0.0                  0.005
Cr.7.10   Phosphorus (P)                 0.4                  5.0
Cr.7.11   Potassium (K)                  10.0                 12.0
Cr.7.12   Sodium (Na)                    20.0                 150.0
Cr.7.13   Zinc (Zn)                      0.1                  5.0
Cr.7.14   Microbiological contaminants   0                    0

Source: WFD, 2000.

4. QUANTITATIVE APPROACH FOR RISK ASSESSMENT FROM OMW


This section provides guidelines for applying the developed quantitative method for assessing the risk
of OMW pollution. As an MCA-based approach, the proposed method consists of
four steps (Mendoza et al., 1999). The initial step deals mainly with data collection and
database preparation in order to obtain a value for every criterion. This is followed by a
standardization step, which brings criteria measured on different scales onto a comparable dimensionless
scale. Then, criteria are weighted according to their relative significance to the risk components (Box
1). Finally, standardized criterion values and their weights are aggregated using a calculation
model which consists of a set of formulae. These steps are illustrated in Figures 3, 4, and 5 and
discussed in the following subsections.

4.1. Calculating Criterion Values

In practice, some initial steps have to be performed before this
method can be applied. These steps mainly deal with data collection and the preparation of appropriate datasets
which serve as input to the calculation model in order to obtain quantitative values of
risk. These data consist of the initial values of all criteria, which result from the application of
a variety of data processing procedures. Such procedures may include hydrological
processing, map derivation, tabular calculations, and other procedures depending on the
available data.
A hydrological model is needed in order to define surface water bodies, namely streams, and to
determine some criterion values (Cr.6, Box 1). A spatial hydrology
modeling tool simulates water flow and transport over a specified area using GIS data;
in this analysis it is used to define the drainage network in the
target watershed, including streams and possible flow paths from the sources of hazard to
surface water bodies. In addition, the hydrological model is essential for dividing the
watershed under investigation into sub-catchments which, together with the
lagoons, constitute the analysis units.
Additionally, field data are necessary to determine the criterion values of the water
quality parameters (Cr.7). Samples from surface water bodies in the target watershed should
be collected following a pre-planned sampling strategy and analyzed for the chemicals and
parameters associated with the polluting source, namely OMW. The first consideration is the
timing of the sampling process. For OMW, water samples should be collected from the
surface water bodies (streams) in the target watershed during the period when lagoons contain
the maximum volume of wastewater and water is still present in the drainage network.
The proposed sampling strategy consists of collecting samples at the points where possible
flow paths from lagoons join the receiving streams. This sampling strategy is designed
so that the chemical tests show the worst case, in which the pollutants are at the highest
possible concentrations and have not yet been affected by dilution. Such an
approach restricts biased sampling and minimizes the influence of one criterion on
another. The concentrations of the analyzed chemicals in each sample should be assigned to the
corresponding lagoon, as shown in the analysis below.

4.2. Standardization of Criterion Values

Because criteria are measured on different scales, it is necessary to
standardize them before they are combined. There are various standardization procedures,
typically using the minimum and maximum values as scaling points (Eastman, 2003). These
functions transform dimensional values into dimensionless values between 0 and 1,
which makes criteria of different dimensions comparable (Mendoza et al., 1999). The
functions are implemented either as a benefit (the higher the criterion score, the higher the
likelihood or magnitude of risk) or as a cost (the lower the criterion score, the higher the
likelihood or magnitude of risk). However, different standardization functions are proposed
for different criteria according to their nature. The first standardization method, which is
applied to all the developed criteria except the waste volume to lagoon capacity ratio
criterion, the lagoon condition criterion, and the water quality criteria, consists of the
following linear scaling functions:

$$ c_i = \frac{R}{R_{\max}} \quad \text{for benefit criteria, and} \tag{1} $$

$$ c_i = \frac{R_{\min}}{R} \quad \text{for cost criteria} \tag{2} $$

where:

$c_i$ = the standardized criterion value; $R$ = the original criterion value; $R_{\min}$ = the minimum
value of the criterion over all units (lagoons or sub-catchments); and $R_{\max}$ = the maximum value
of the criterion over all units (lagoons or sub-catchments).
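
To make Equations (1) and (2) concrete, the following minimal sketch standardizes a list of raw criterion values across analysis units. The function names and sample values are hypothetical, and the sketch assumes strictly positive values for the cost case, since Equation (2) is undefined when R is zero.

```python
def standardize_benefit(values):
    """Equation (1): c_i = R / R_max, computed over all analysis units."""
    r_max = max(values)
    return [r / r_max for r in values]


def standardize_cost(values):
    """Equation (2): c_i = R_min / R; assumes every R is strictly positive."""
    r_min = min(values)
    return [r_min / r for r in values]


# Hypothetical raw values for three analysis units.
inhabitants = [200, 500, 1000]          # benefit: more inhabitants, higher score
flow_path_m = [250.0, 500.0, 1000.0]    # cost: longer flow path, lower score

benefit_scores = standardize_benefit(inhabitants)
cost_scores = standardize_cost(flow_path_m)
```

With these inputs the unit with the most inhabitants and the unit with the shortest flow path each receive a score of 1, and every other unit is scaled relative to them.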
The other standardization method is applied to the water quality parameters (Cr.7, Box 1).
As the probability of harm is strongly correlated with the amounts of the different
hazardous OMW chemicals in the water, it can be expressed as a function of their concentrations.
These functions (see Figure 2) consist of linear multi-segments based on the following linear
scaling functions:

$$ c_i = \frac{R - R_{\min}}{R_{\max} - R_{\min}} \quad \text{for benefit criteria, and} \tag{3} $$

$$ c_i = \frac{R_{\max} - R}{R_{\max} - R_{\min}} \quad \text{for cost criteria} \tag{4} $$

where:

$c_i$ = the standardized criterion value; $R$ = the original criterion value; $R_{\min}$ = the guide level
according to the WFD (see Table 1); and $R_{\max}$ = the maximum admissible level according
to the WFD (see Table 1).
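
A minimal sketch of the benefit form, Equation (3), applied to a water quality parameter is given below, using the iron limits from Table 1. The function name is hypothetical. Clamping to 0 below the guide level follows the text; assigning 1 to every concentration above the maximum admissible level is an assumption, as is the omission of pH, whose limits form an acceptable range rather than a single threshold.

```python
def standardize_water_quality(concentration, guide_level, max_admissible):
    """Equation (3) with the result clamped to the interval [0, 1]."""
    if concentration <= guide_level:
        return 0.0  # at or below the guide level: probability of harm is 0
    if concentration >= max_admissible:
        return 1.0  # at or above the maximum admissible level (assumed clamp)
    return (concentration - guide_level) / (max_admissible - guide_level)


# Iron (Fe) limits from Table 1, in mg/l; sample concentrations are hypothetical.
IRON_GUIDE, IRON_MAX = 0.05, 0.2
iron_scores = [
    standardize_water_quality(c, IRON_GUIDE, IRON_MAX) for c in (0.03, 0.125, 0.2)
]
```

The middle sample lies halfway between the two limits, so its score is approximately 0.5.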
Finally, some criteria are assigned dimensionless values and therefore need not be
standardized. These criteria are the ratio between waste volume and lagoon capacity and
the lagoon conditions (see Table 2). The ratio is obtained by dividing the amount of waste
produced by the volume of the lagoon in which it is stored; it does not exceed one,
since the permit for an olive mill requires a lagoon larger than the expected volume of
waste. The lagoon condition criterion, in turn, is assigned a Boolean value of zero
where the lagoon has no structural problems, or one where obvious structural
problems are detected.
Figure 2. Scaling functions of water quality parameters.

4.3. Assignment of Criterion Weights

Acknowledging the context-specific nature of risk, this method is designed to allow
weights to be assigned according to the significance of each criterion in context. Hence, criteria that
are deemed more significant indicators of risk in the river basin under consideration can be
assigned higher weights, thereby giving them greater importance in the estimation of the risk
of OMW pollution (Mendoza et al., 2002). These weights are assigned separately within
every risk component (Box 1) to show the importance of every criterion in relation to that
component. The Analytic Hierarchy
Process (AHP), a multi-criteria decision support method based on
pairwise comparisons, can be used to calculate the weights of multiple criteria (Saaty, 1990).
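
The weighting step can be sketched as follows. In AHP the weights are taken from the principal eigenvector of the pairwise comparison matrix; the sketch approximates that eigenvector by power iteration. The function name, the iteration count, and the comparison judgements are hypothetical, and the consistency check that normally accompanies AHP is left out for brevity.

```python
def ahp_weights(pairwise, iterations=100):
    """Approximate the normalized principal eigenvector of a pairwise matrix."""
    n = len(pairwise)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        # Multiply the matrix by the current weight vector, then renormalize.
        product = [
            sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)
        ]
        total = sum(product)
        weights = [p / total for p in product]
    return weights


# Hypothetical judgements for three criteria within one risk component:
# criterion A is twice as important as B and six times as important as C.
comparisons = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(comparisons)
```

Because these example judgements are perfectly consistent, the resulting weights are close to 0.6, 0.3, and 0.1 and sum to one.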
Thereafter, the analysis is split into two directions according to the receptors
addressed. The first direction addresses the hazard of OMW transported by surface water
bodies and affecting the population, while the second addresses the hazard of OMW
transported by surface water bodies and affecting protected areas. This implies producing two risk
maps (one for each direction) as a result of this analysis. This separation is due to the slight
differences in the criteria used for each receptor (see Table 3). Assigning weights to
criteria is a subjective process based on the pairwise comparisons, and the
specifics of each site greatly influence it. Hence, for more accurate and
unbiased weight assignment, expert opinion should be sought and site
visits conducted.

Table 2. Summary of developed criteria

Criterion                                                 Unit to which     Component            Direction
                                                          it is assigned    (Box 1)              (receptors addressed)
Cr.1   Number of potentially exposed inhabitants          Sub-catch.        Component 1          Humans
Cr.2   Possible sedimentation areas                       Sub-catch.        Component 2          Humans
Cr.3   Precipitation                                      Lagoon            Component 3          Humans
Cr.4   Waste volume to lagoon capacity ratio              Lagoon            Component 3          Humans
Cr.5   Lagoon conditions                                  Lagoon            Component 3          Humans
Cr.6   Length of the flow path to surface water bodies    Lagoon            Component 4          Humans
Water quality parameters:
Cr.7   Copper (Cu)                                        Lagoon            Component 5          Humans
Cr.8   Iron (Fe)                                          Lagoon            Component 5          Humans
Cr.9   Lead (Pb)                                          Lagoon            Component 5          Humans
Cr.10  Magnesium (Mg)                                     Lagoon            Component 5          Humans
Cr.11  Manganese (Mn)                                     Lagoon            Component 5          Humans
Cr.12  Nickel (Ni)                                        Lagoon            Component 5          Humans
Cr.13  Nitrogen (N)                                       Lagoon            Component 5          Humans
Cr.14  pH                                                 Lagoon            Component 5          Humans
Cr.15  Phenols                                            Lagoon            Component 5          Humans
Cr.16  Phosphorus (P)                                     Lagoon            Component 5          Humans
Cr.17  Potassium (K)                                      Lagoon            Component 5          Humans
Cr.18  Sodium (Na)                                        Lagoon            Component 5          Humans
Cr.19  Zinc (Zn)                                          Lagoon            Component 5          Humans

Cr.1   Area of potentially exposed NATURA sites           Sub-catch.        Component 1          Protected areas
Cr.2   Possible sedimentation areas                       Sub-catch.        Component 2          Protected areas
Cr.3   Precipitation                                      Lagoon            Component 3          Protected areas
Cr.4   Waste volume to lagoon capacity ratio              Lagoon            Component 3          Protected areas
Cr.5   Lagoon conditions                                  Lagoon            Component 3          Protected areas
Cr.6   Length of the flow path to NATURA sites            Lagoon            Component 4          Protected areas
Cr.7   Dilution degree expressed by stream orders         Lagoon            Component 5          Protected areas

4.4. Applying an Aggregation Rule

In order to obtain risk results for every sub-catchment within the river basin under
investigation, the standardized criterion values have to be combined using a well-defined
aggregation rule. A set of formulae has been developed to arrive at an
evaluation of the risk. These formulae are structured in a calculation model which takes into
account the influence of every lagoon on each sub-catchment within the river basin. The
development of the calculation model was based on an understanding of the relationships
between risk elements (sources, pathways, and receptors) and components (probability and
magnitude, see Box 1), as well as on the conceptual model of the problem. Multi-criteria
evaluation can be achieved by the weighted linear combination (WLC) procedure, wherein
standardized criteria are combined by means of a weighted average (Eastman, 2003). The
basic equation of WLC is:

$$ R = \sum_i w_i c_i \tag{5} $$

where:

$R$ = the risk value of a specific sub-catchment; $w_i$ = the weight of criterion $i$; and $c_i$ = the
score of criterion $i$.
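
Equation (5) can be illustrated with a few lines of code. The function name and the sample scores and weights are hypothetical; the weights are assumed to have been derived beforehand and to sum to one.

```python
def weighted_linear_combination(scores, weights):
    """Equation (5): sum of w_i * c_i over all criteria."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * c for w, c in zip(weights, scores))


# Hypothetical standardized scores and weights for three criteria.
scores = [0.2, 0.5, 1.0]
weights = [0.5, 0.3, 0.2]
combined = weighted_linear_combination(scores, weights)
```

Because both the scores and the weights lie between 0 and 1 and the weights sum to one, the combined value also stays between 0 and 1.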
At this point, every criterion has its standardized values assigned either to a lagoon or to a
sub-catchment (the analysis units), as well as its relative weight. When these criteria are aggregated,
every sub-catchment will have a single value which indicates the combined magnitude
components (Components 1 and 2, Box 1). Likewise, every lagoon will have a single value
for each of the first and third probability components (Components 3 and 5, Box 1).
In contrast, every lagoon will have as many values of the second probability component
as there are possible receiving sub-catchments. These values are obtained using the formulae
below, which are essentially substitutions into the WLC formula, and they are applied to both
directions (receptor groups).
The first step is to calculate the two components of the magnitude of consequences
(Components 1 and 2, Box 1). This should be done for both directions of the analysis using
the following formulae:

$$ \text{1st magnitude}_{\text{subcatchment } y} = C_{M11,\,\text{subcatchment } y} \cdot W_{C_{M11}} \tag{6} $$

$$ \text{2nd magnitude}_{\text{subcatchment } y} = C_{M21,\,\text{subcatchment } y} \cdot W_{C_{M21}} \tag{7} $$

where:

$\text{1st magnitude}_{\text{subcatchment } y}$ is the spatial scale of the magnitude of consequences in
sub-catchment y (Component 1, Box 1); $\text{2nd magnitude}_{\text{subcatchment } y}$ is the temporal scale of the
magnitude of consequences in sub-catchment y (Component 2, Box 1); $C_{M1n,\,\text{subcatchment } y}$ is
the value of criterion n of the 1st magnitude component regarding sub-catchment y;
$C_{M2n,\,\text{subcatchment } y}$ is the value of criterion n of the 2nd magnitude component regarding
sub-catchment y; and $W_{C_{Mmn}}$ is the weight of criterion n of the mth magnitude component.
The next step is to calculate the three components of probability of risk. The same
formulae are used for both analysis directions with a slight difference in the formula of the 3rd
probability component as shown below:
$$ \text{1st probability}_{\text{lagoon } x} = \sum_{n=1}^{3} C_{P1n,\,\text{lagoon } x} \cdot W_{C_{P1n}} \tag{8} $$

$$ \text{2nd probability}_{\text{lagoon } x,\,\text{subcatchment } y} = C_{P21,\,\text{lagoon } x,\,\text{subcatchment } y} \cdot W_{C_{P21}} \tag{9} $$

$$ \text{3rd probability}_{\text{lagoon } x} = \sum_{n=1}^{13} C_{P3n,\,\text{lagoon } x} \cdot W_{C_{P3n}} \quad \text{(receptors: population)} \tag{10.a} $$

$$ \text{3rd probability}_{\text{lagoon } x} = C_{P31,\,\text{lagoon } x} \cdot W_{C_{P31}} \quad \text{(receptors: protected areas)} \tag{10.b} $$

where:

$\text{1st probability}_{\text{lagoon } x}$ = the probability of the hazard occurring from lagoon x
(Component 3, Box 1);
$\text{2nd probability}_{\text{lagoon } x,\,\text{subcatchment } y}$ = the probability that the hazard generated from lagoon x
reaches sub-catchment y (Component 4, Box 1);
$\text{3rd probability}_{\text{lagoon } x}$ = the probability of harm resulting from the hazard
generated from lagoon x (Component 5, Box 1);
$C_{P1n,\,\text{lagoon } x}$ = the value of criterion n of the 1st probability
component regarding lagoon x;
$C_{P2n,\,\text{lagoon } x,\,\text{subcatchment } y}$ = the value of criterion n of the 2nd probability
component regarding lagoon x and sub-catchment y;
$C_{P3n,\,\text{lagoon } x}$ = the value of criterion n of the 3rd probability component
regarding lagoon x; and
$W_{C_{Pmn}}$ = the weight of criterion n of the mth probability
component.
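
As an informal illustration of Equations (8), (9), and (10.a), the sketch below computes the three probability components for a single lagoon with population as the receptor. All names, scores, weights, and sub-catchment labels are hypothetical, and only three water quality scores are used instead of the thirteen parameters of Equation (10.a), purely to keep the example short.

```python
def wlc(scores, weights):
    """Weighted linear combination of standardized scores."""
    return sum(c * w for c, w in zip(scores, weights))


# Hypothetical standardized inputs for one lagoon.
hazard_scores = [0.8, 0.6, 1.0]        # Cr.3, Cr.4, Cr.5
hazard_weights = [0.5, 0.3, 0.2]
path_scores = {"SC6": 0.25, "SC9": 1.0}  # Cr.6, one score per receiving sub-catchment
path_weight = 1.0                        # a single criterion carries the full weight
quality_scores = [0.0, 0.5, 1.0]       # shortened list of water quality scores
quality_weights = [0.2, 0.3, 0.5]

# Equation (8): one value per lagoon.
first_probability = wlc(hazard_scores, hazard_weights)
# Equation (9): one value per lagoon and receiving sub-catchment.
second_probability = {sc: s * path_weight for sc, s in path_scores.items()}
# Equation (10.a): one value per lagoon.
third_probability = wlc(quality_scores, quality_weights)
```

The second component is kept as a mapping because a lagoon has one value for each sub-catchment that its flow paths can reach.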

The next step for both directions is to combine the two magnitude components
(Components 1 and 2, Box 1) into one 'magnitude' component and the three probability
components (Components 3, 4, and 5, Box 1) into one 'probability' component. This can be
done using the following formulae:

$$ \text{MAGNITUDE}_{\text{subcatchment } y} = \text{1st magnitude}_{\text{subcatchment } y} \cdot \text{2nd magnitude}_{\text{subcatchment } y} \tag{11} $$

$$ \text{PROBABILITY}_{\text{lagoon } x,\,\text{subcatchment } y} = \text{1st probability}_{\text{lagoon } x} \cdot \text{2nd probability}_{\text{lagoon } x,\,\text{subcatchment } y} \cdot \text{3rd probability}_{\text{lagoon } x} \tag{12} $$

where:

$\text{MAGNITUDE}_{\text{subcatchment } y}$ = the overall magnitude of the consequences in
sub-catchment y caused by lagoon x (Components 1 and 2, Box 1); and
$\text{PROBABILITY}_{\text{lagoon } x,\,\text{subcatchment } y}$ = the overall probability of the risk in sub-catchment y
caused by lagoon x (Components 3, 4, and 5, Box 1).

Applying the formulae above yields the magnitude of the risk in
a specific sub-catchment and the probability that a specific lagoon contributes to this risk. Hence,
the risk value in a sub-catchment, taking into account the contribution probabilities of all possible
lagoons, is determined as follows:

$$ \text{RISK}_{\text{subcatchment } y} = \text{MAGNITUDE}_{\text{subcatchment } y} \cdot \frac{\prod_{x=1}^{n} \text{PROBABILITY}_{\text{lagoon } x,\,\text{subcatchment } y}}{\left( \prod_{x=1}^{n} \text{PROBABILITY}_{\text{lagoon } x,\,\text{subcatchment } y} \right)_{\max}} \tag{13} $$

where:

$\text{RISK}_{\text{subcatchment } y}$ = the risk value in sub-catchment y;
$\prod_{x=1}^{n} \text{PROBABILITY}_{\text{lagoon } x,\,\text{subcatchment } y}$ = the product of the probabilities of all lagoons
which may contribute to the risk in sub-catchment y; and
$\left( \prod_{x=1}^{n} \text{PROBABILITY}_{\text{lagoon } x,\,\text{subcatchment } y} \right)_{\max}$ = the maximum of all the products defined
above.
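
The final aggregation in Equations (11) to (13) can be sketched as below. The aggregation operator over lagoons is read here as a product, following the wording of the definition; this reading is an assumption, so the operator is exposed as a parameter and `sum` can be passed instead if the equation is read as a summation. The function names, the sub-catchment labels, and all numeric inputs are hypothetical.

```python
import math


def overall_probability(first, second, third):
    """Equation (12): product of the three probability components."""
    return first * second * third


def risk_by_subcatchment(magnitude, probabilities, aggregate=math.prod):
    """Equation (13), with the operator over lagoons assumed to be a product.

    magnitude: {sub-catchment: MAGNITUDE value from Equation (11)}
    probabilities: {sub-catchment: [PROBABILITY of each contributing lagoon]}
    """
    # A sub-catchment with no contributing lagoon gets a combined value of 0.
    combined = {
        sc: (aggregate(values) if values else 0.0)
        for sc, values in probabilities.items()
    }
    largest = max(combined.values(), default=0.0)
    if largest == 0.0:
        return {sc: 0.0 for sc in magnitude}
    return {sc: magnitude[sc] * combined.get(sc, 0.0) / largest for sc in magnitude}


# Hypothetical inputs: Equation (11) magnitudes and Equation (12) probabilities.
magnitude = {"SC1": 0.3, "SC6": 0.5 * 0.8, "SC9": 1.0 * 0.5}
probabilities = {"SC1": [], "SC6": [0.2], "SC9": [0.5, 0.8]}
risk = risk_by_subcatchment(magnitude, probabilities)
```

Dividing by the largest combined value rescales the probability term so that the sub-catchment with the highest combined probability receives its full magnitude as its risk value.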
The flowchart of the proposed calculation model is illustrated in Figures 3, 4 (a and b),
and 5. Figure 3 illustrates the calculations for each 'sub-catchment' unit within the
river basin, which result in a value of risk magnitude in that sub-catchment (combination of
Components 1 and 2, Box 1). Figure 4 (a and b), in turn, illustrates the
calculations for each 'lagoon' unit located in the river basin, which result in multiple values of
probability (combination of Components 3, 4, and 5, Box 1) that the corresponding lagoon
causes risk in each 'sub-catchment' unit within the river basin, regarding humans or protected
areas, respectively. Finally, Figure 5 illustrates the flowchart resulting in a risk value for a
'sub-catchment' unit based on aggregation of the magnitude (resulting from the flowchart in
Figure 3 for the corresponding sub-catchment) and the probability (resulting from the flowchart
in Figure 4 (a or b) for every lagoon within the river basin).

Figure 3. Flowchart of calculation model at the sub-catchment level (for both receptor groups).
[Flowchart for lagoon x: the criteria feeding the 1st, 2nd, and 3rd probability components are standardized (Std) and combined by WLC, with the 2nd probability computed separately for the flow path to each receiving sub-catchment (1, 2, ..., y). Multiplying the three components gives PROBABILITY for lagoon x and each sub-catchment, i.e. the probability of lagoon x to contribute to the risk in that sub-catchment.]

Figure 4a. Flowchart of calculation model at the lagoon level (for humans as receptors).
[Flowchart for lagoon x: the criteria feeding the 1st, 2nd, and 3rd probability components are standardized (Std) and combined by WLC, with the 2nd probability computed separately for the flow path to the protected areas in each sub-catchment (1, 2, ..., y). Multiplying the three components gives PROBABILITY for lagoon x and each sub-catchment, i.e. the probability of lagoon x to contribute to the risk in that sub-catchment.]

Figure 4b. Flowchart of calculation model at the lagoon level (for protected areas as receptors).
Figure 5. Summary of flowcharts.

The proposed methodology has been implemented in the Keritis watershed in western Crete,
Greece (see Figure 6). A 17,855 ha basin, the Keritis watershed faces the risk of OMW pollution,
as olive mills are the main agricultural industry in the area. Nine sub-catchments have been
delineated within the basin, in which five OMW lagoons are located (see Figure
6). With the proposed methodology implemented at a detailed level of analysis, two risk
maps were produced (see Figure 6). The first map shows the risk of OMW pollution
which may harm the population, while the other shows the risk of OMW pollution which
may harm the protected areas (NATURA 2000 sites) in the watershed.
The resulting risk maps indicate that the developed criteria and calculation method
are consistent with the way risk is generated in nature. It can be seen from Figure 6 that
some sub-catchments are estimated to have a zero risk value, which is due to the absence of the
addressed receptors (Figure 6b, sub-catchments 3 and 8), to the absence of connecting
pathways (Figure 6a and b, sub-catchments 1, 2, and 3), or to both (Figure 6b, sub-catchment 8).
In contrast, the study indicated a high risk value in sub-catchments 9 and 6 regarding
population and protected areas, respectively. The reasons for the former were the
existence of two hazard sources (Component 3, Box 1) and their relatively close proximity to
the surface water bodies (Component 4, Box 1), as well as the presence of the largest
population (Component 1, Box 1); for the latter, they were the existence of large protected
NATURA sites (Component 1, Box 1) and the potential of multiple sources to contribute to
risk in this sub-catchment (Component 4, Box 1).

Figure 6. Risk maps of OMW pollution in Keritis watershed: (A) risk concerning population; (B) risk concerning protected NATURA 2000 sites.

To conclude, the breakdown of risk into its primary components (Box 1) and a clear
understanding of the risk-generating process, its elements (sources, pathways, and receptors),
and its controlling factors (represented by the developed criteria) are the main features of this
method, resulting in a realistic and unbiased estimation of risk. Without taking these
issues into consideration, risk assessment would no longer be an integrated
approach combining several components, but merely a generic description based on a small number of
criteria.

CONCLUSION
This quantitative approach to risk assessment was built on the risk assessment
framework proposed in the DEFRA guidelines. Having analyzed risk elements, components, and
controlling factors as well as the relationships between them, a conceptual model of the
risk-generating process was built and a range of criteria was formulated. This was
followed by an MCA based on the developed criteria. The method proposes
different scaling functions to standardize the initial raw values of these criteria. Then,
a calculation model, consisting of a set of formulae, was developed in order to obtain a
quantitative assessment of risk in every sub-catchment within the watershed under
investigation. It has been designed to calculate the overall potential risk to which a
sub-catchment may be exposed as a result of the contribution of all point sources, namely the
OMW lagoons, located in the watershed. This is one of the main strengths of this method, as it is
able to estimate risk in every single sub-catchment by considering and analyzing inputs from
all sources of pollution (lagoons) in the whole basin.
This method can be widely applied in the Mediterranean region, where water pollution by
OMW is a common environmental problem. It serves as a decision support tool which
informs risk managers and decision makers about the risk assessment results so that they can take
appropriate management measures, e.g. implementing mitigation measures or properly
locating new OMW lagoons. However, site specifics should be taken into account
when applying this approach. The developed method was implemented in the case study of the
Keritis watershed (see Section 4). This implementation has shown the applicability of the
method and its potential as a decision support tool.
Besides being replicable, a strength of this methodology is its flexibility:
criteria can be added or removed and their weights changed according to the specific needs of
different case studies without affecting the calculation model. This is very important,
since the controlling factors of environmental problems are likely to change spatially
and temporally. In other words, different cases of the same environmental problem may be
subject to different factors, which requires modifying the criteria and their weights for
more accurate results. Moreover, this method can be applied to other stressors. It can
be refined to address other point sources of water pollution. In this case, the
calculation model may need some modification in light of the new problem formulation.
However, the generic framework of this method, consisting of the main steps, remains valid. In
addition, this methodology has the potential to be automated and computerized within a GIS
environment. The calculation model can be programmed using a scripting language and
applied to the related geo-database. This geo-database should contain the layers to which
the different criteria are assigned and in which other calculation elements are found.
However, this method has some weaknesses common to all MCA weight-assignment
methods. In addition, the proposed method does not assess the overall risk to the river basin
under consideration, that is, to both surface and groundwater bodies. Although
OMW is more likely to be transported via surface water bodies, owing to its nature as previously
discussed (section 2), investigating its impacts on groundwater bodies, as a pathway,
would result in a more accurate risk map. However, the stated difficulties in modeling the
natural processes associated with the risk-generating process need further investigation
aimed at simulating such processes.
ACKNOWLEDGMENTS
This research was carried out within the context of the Remotely Accessed Decision Support
System for Transnational Environmental Risk Management (STRiM), an INTERREG IIIB
CADSES project, which enabled access to the required data and gave the opportunity for the
research to be reviewed, at its different stages, by experts in several academic and research
institutions within its consortium.

REFERENCES
Allan, I. J., Mills, G. A., Vrana, B., Knutsson, J., Holmberg, A., Guigues, N., Laschi, S.,
Fouillac, A. & Greenwood, R. (2006). Strategic monitoring for the European Water
Framework Directive. Trends in Analytical Chemistry, 25(7):704-715.
Billington, K. (2005). The River Murray and Lower Lakes Catchment Risk Assessment
Project for Water Quality - Concepts and Method. Environment Protection Authority.
DEFRA, Department for Environment, Food and Rural Affairs (2002). Guidelines for
Environmental Risk Assessment and Management.
Donoghue, A. M. (2001). The design of hazard risk assessment matrices for ranking
occupational health risks and their application in mining and minerals processing.
Occupational Medicine, 51(2):118-123.
Eastman, J. R. (2003). IDRISI Kilimanjaro Guide to GIS and Image Processing. Clark Labs,
Clark University, USA.
enHealth, Council and Department of Health and Ageing (2004). Environmental Health Risk
Assessment, Guidelines for assessing human health risks from environmental hazards.
EPA, Environmental Protection Agency (1998). Guidelines for Ecological Risk Assessment.
Heathwaite, A. L., Dils, R. M., Liu, S., Carvalho, L., Brazier, R. E., Pope, L., Hughes, M.,
Phillips, G. & May, L. (2005). A tiered risk-based approach for predicting diffuse and
point source phosphorus losses in agricultural areas. Science of the Total Environment,
344:225-239.
Kapellakis, I. E., Tsagarakis, K. P., Avramaki, Ch. & Angelakis, A. N. (2006). Olive mill
wastewater management in river basins: A case study in Greece. Agricultural Water
Management, 82:354-370.
Karageorgis, A. P., Nikolaidis, N. P., Karamanos, H. & Skoulikidis, N. (2003). Water and
sediment assessment of the Axios River and its coastal environment, Continental Shelf
Research, 23:1929-1944.
Lapucci, A., Lombardo, S., Retri, M. & Santucci, A. (2005). A KDD based multicriteria
decision making model for fire risk evaluation. Association Geographic Information
Laboratories Europe
[http://plone.itc.nl/agile_old/Conference/estoril/papers/23_Alessandra%20Lapucci.pdf]
[Viewed on 15.06.2011].
Maltby, L. (2006). Environmental risk assessment. Environmental Science and Technology,
22:84-101.
Mendoza, G. A. & Macoun, P. (1999). Guidelines for Applying Multi-Criteria Analysis to the
Assessment of Criteria and Indicators. (Jakarta: Center for International Forestry
Research (CIFOR)).
Mendoza, G. A., Anderson, A. B. & Gertner, G. Z. (2002). Integrating multi-criteria analysis
and GIS for land condition assessment: Part I – Evaluation and restoration of military
training areas. Journal of Geographic Information and Decision Analysis GIDA, 6(1):1-
16.
Morrison, G., Fatoki, O. S., Persson, L. & Ekberg, A. (2001). Assessment of the impact of
point source pollution from the Keiskammahoek Sewage Treatment Plant on the
Keiskamma River - pH, electrical conductivity, oxygen-demanding substance (COD)
and nutrients. Water SA, 27:475-480.
Niaounakis, M. & Halvadakis, C. P. (2006). Olive Processing Waste Management, Literature
Review and Patent Survey. (Oxford: Elsevier)
Ogunfowokan, A. O., Okoh, E. K., Adenuga, A. A. & Asubiojo, O. I. (2005). An assessment
of the impact of point source pollution from a university sewage treatment oxidation
pond on a receiving stream: a preliminary study. Journal of Applied Sciences, 5(1):36-
43.
Paliatziki, A. (2006). Analysis of Environmental Pressures and Impacts in the Koiliaris River
Watershed. Dissertation, Technical University of Crete.
Pediaditi, K., Wehrmeyer, W. & Chenoweth, J. (2005). Brownfield redevelopment,
integrating sustainability and risk management. In Environmental Health Risk III,
Brebbia C A, Popov V, Fayzieva D (eds.), WIT press, 21-30.
Peltonen, L. (2006). Recommendations for a risk mitigation oriented European spatial policy.
Geological Survey of Finland, Special Paper, 42:153-167
Power, M. & McCarrty, L.S. (1998). A comparative analysis of environmental risk
assessment/risk management frameworks. Environmental Science and Technology,
32:224A-231A.
Rekolainen, S., Kämäri, J. & Hiltunen, M. (2003). A conceptual framework for identifying
the need and role of models in the implementation of the Water Framework Directive.
International Journal on River Basin Management, 1(4):347–352.
Saaty, T. L. (1990). How to make a decision: The analytic hierarchy process. European
Journal of Operational Research, 48(1):9-26.
WFD, EU Water Framework Directive (2000). Official Journal of the European communities.
[http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:32000L0060:EN:NOT]
[Viewed on 15.06.2011]
In: Ecological Modeling ISBN: 978-1-61324-567-5
Editor: WenJun Zhang, pp. 65-81 © 2012 Nova Science Publishers, Inc.

Chapter 4

ANALYSIS OF GREEN OAK LEAF ROLLER POPULATION DYNAMICS IN VARIOUS LOCATIONS

L. V. Nedorezov*
The Research Center for Interdisciplinary Environmental Cooperation (INENCO) of
Russian Academy of Sciences, Kutuzov seafront 14, Saint Petersburg 191187, Russia.

ABSTRACT
This publication is devoted to the problem of analyzing population time series with various
discrete-time models of population dynamics. The statistical criteria that are normally used
for estimating the parameters of mathematical models are discussed. The approach is
demonstrated on a particular example: fluctuations of the green oak leaf roller (Tortrix
viridana L.) population, presented in the publications by Rubtsov (1992) and by Korzukhin
and Semevskiy (1992) for three different locations in Europe. The empirical datasets were
approximated with well-known discrete-time models of population dynamics (Kostitzin model,
Skellam model, Moran – Ricker model, Morris – Varley – Gradwell model, and discrete logistic
model). For every model, the final decision about whether it can be used to approximate the
datasets is based on an analysis of the deviations between the theoretical (model) and
empirical trajectories: the correspondence of the distribution of deviations to a Normal
distribution with zero average was checked with the Kolmogorov – Smirnov and Shapiro – Wilk
tests, and the existence or absence of serial correlation was determined with the Durbin –
Watson criterion. It was shown that for two experimental trajectories the Kostitzin model
and the discrete logistic model give good approximations; this means that the population
dynamics can be explained as a result of the influence of intra-population self-regulative
mechanisms only. The third empirical trajectory requires more complicated mathematical
models for fitting.

Keywords: mathematical discrete time models, estimation of model parameters, analysis of
time series, green oak leaf roller population dynamics.

* E-mail address: l.v.nedorezov@gmail.com.
1. INTRODUCTION
At present there is a huge number of mathematical models in ecological modeling that are
used to describe the dynamics of isolated populations and of elementary ecosystems with a
small number of interacting species (Gause, 1934; Kostitzin, 1937; Kot, 2001; Begon,
Mortimer, 1981; Varley et al, 1975; Maynard Smith, 1976; Poluektov et al, 1980; Nedorezov,
1986 and others). At the same time, only a small number of models have been compared with
experimental time series on a quantitative level. In most cases authors confine themselves
to a qualitative comparison of theoretical and real datasets, which does not allow any
conclusion about the adequacy of the models used to the observations.
Analysis of empirical time series requires the use of dynamical models. It is important for
forecasting changes of population size in time, for estimating the probability that a pest
population outbreak begins, and so on. It can also be important for finding optimal methods
of population management, which requires deep knowledge of all basic elements of the
structure of the population phase portrait (Isaev et al, 1980, 2001, 2009; Nedorezov,
1986, 1999; Berryman, 1981, 1990, 1991).
The abundance of mathematical models found in the modern ecological literature leads to
serious problems in the selection process: first of all, in finding the model that gives the
best fit to an empirical time series. This is partly connected with the absence of criteria
for selecting the mathematical expressions that adequately describe one or another
biological mechanism affecting population size (Isaev et al, 2001, 2009).
There is one more important problem in modern ecological modeling: the selection of
statistical criteria for estimating the values of dynamic model parameters. A large number
of different criteria are used in the literature (Wood, 2001 a, b; Turchin et al., 2003;
Nedorezov, 1986 and others).
But which criterion should be chosen in one situation or another, and why, is in most cases
a puzzle. In this paper we analyze both problems: the selection of a mathematical model (and
the analysis of the suitability of the selected model for fitting real trajectories of
population dynamics) and the selection of the best statistical criterion for estimating the
parameters of a dynamic model. The results obtained are applied to the analysis of datasets
on green oak leaf roller (Tortrix viridana L.) population dynamics (Rubtsov, 1992;
Korzukhin, Semevskiy, 1992; Nedorezov, Sadykova, 2005, 2008; Nedorezov, Sadykov,
Sadykova, 2010).

2. SELECTION OF MATHEMATICAL MODEL


Before creating a new mathematical model of population dynamics to describe the
fluctuations of a concrete species, it is important to be sure that none of the existing
models (which can be found in the modern ecological literature) is suitable for solving the
problem at hand (i.e. that these models cannot be applied for fitting real trajectories).
Suppose that a new model is to be constructed in the following form

$$x_{k+1} = x_k F(x_k, \vec{\alpha}), \tag{1}$$

where $x_k$ is the population size (or population density) in the $k$-th year (or at the
$k$-th moment of observation), $F$ is the population birth rate (Isaev et al, 1980, 2001,
2009),

$$y_k = \frac{x_{k+1}}{x_k} = F(x_k, \vec{\alpha}), \tag{2}$$

and $\vec{\alpha}$ is a vector of (unknown) model parameters. Then a natural question arises:
why can we not choose and use one of the models from table 1? It is well known that among
the recursive equations in this table one can find models with a very rich set of dynamical
regimes. Moreover, some of these models have been applied successfully to the description
of the dynamics of various natural populations (see, for example, May, 1975; Hassell, 1975,
1978; Varley et al, 1975; Nedorezov, 1986; Nedorezov, Nedorezova, 1995; Kot, 2001).
Furthermore, if a simple model of type (1) gives a sufficiently good description of
population dynamics (more precisely, if the applied statistical criteria do not allow us to
reject the hypothesis that the model is suitable for fitting the time series under
consideration), then we have no grounds for creating a new and more complicated
mathematical model that contains the first model as a particular case. Obviously, the more
complicated model will also give satisfactory results.
Taking this into account, in our opinion the analysis of population dynamics should always
start with a group of the simplest mathematical models, which describe the influence of a
minimal set of regulative mechanisms on the population. For example, we can start with the
models presented in table 1: all of them describe the influence of intra-population
self-regulative mechanisms only. If we can prove that the analyzed datasets cannot be
explained as a result of the influence of these self-regulative mechanisms alone, we have to
use more complicated models, which contain additional equations for the dynamics of
population regulators: predators, parasites, climatic factors etc.
Let us assume that the model parameters $\vec{\alpha}$ have been estimated for a given sample
with the help of some statistical criterion. The following question arises: how can we check
the correspondence between the theoretical population values (obtained with the
mathematical model) and the empirical time series?
There is only one way to solve this problem: we have to analyze the time series formed by
the differences between the theoretical (model) and empirical values. Following common
practice, we assume that the model gives a good approximation of the empirical sample if the
following requirements hold: there are no reasons for rejecting the hypothesis that the
average of the residuals is equal to zero and that the distribution of the residuals is
Normal (this can be checked, for example, by the Kolmogorov – Smirnov test, the Shapiro –
Wilk test etc.; Bolshev, Smirnov, 1983; Shapiro et al, 1968); and there is no serial
correlation in the sequence of residuals (Durbin – Watson test; Draper, Smith, 1986, 1987).
If all these conditions hold together, there are no reasons for rejecting the hypothesis
that the model under consideration is suitable for fitting the empirical time series.
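The residual checks described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the chapter: the function name `residual_diagnostics` and the toy numbers are hypothetical, only the mean, its standard error and the Durbin – Watson statistic are computed, and the normality tests (Kolmogorov – Smirnov, Shapiro – Wilk) are left to a statistics library.

```python
import math

def residual_diagnostics(empirical, theoretical):
    """Return (mean, standard error of the mean, Durbin-Watson d) of the residuals."""
    residuals = [obs - mod for obs, mod in zip(empirical, theoretical)]
    n = len(residuals)
    mean = sum(residuals) / n
    # Unbiased sample variance of the residuals.
    variance = sum((r - mean) ** 2 for r in residuals) / (n - 1)
    standard_error = math.sqrt(variance / n)
    # Durbin-Watson statistic: sum of squared successive differences
    # divided by the sum of squared residuals.
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n))
    denominator = sum(r ** 2 for r in residuals)
    return mean, standard_error, numerator / denominator

# Hypothetical toy data (not from the chapter): the residuals are +1, -1, +1, -1.
mean, se, dw = residual_diagnostics([1.0, 2.0, 3.0, 4.0], [0.0, 3.0, 2.0, 5.0])
print(mean, se, dw)
```

A value of the statistic close to 2 is the usual sign of uncorrelated residuals; the strictly alternating toy residuals give a value well above 2.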
If the statistical criteria do not allow us to reject the hypothesis that the model is
suitable for fitting the analyzed time series, we can try to solve another problem: to
identify the regime of population dynamics. We have to note that there are no grounds to
say that the population dynamics corresponds to the dynamical regime realized in the model
with the obtained (estimated) parameters. The initial sample is a sequence of stochastic
values and, accordingly, the estimated model parameters are stochastic values too. These
parameters are not equal to the real population characteristics. This means that there is a
serious problem in determining the real type of population dynamics. On the other hand, it
is possible to estimate the probabilities that the observed trajectory corresponds to
population extinction, to asymptotic stabilization of the population at a non-zero level,
etc. In other words, analysis of the initial sample allows us to obtain a distribution of
the dynamical regimes that can be realized for the population.
The probabilities that one or another dynamical regime is realized for the population can
be estimated in the following way. In the space of model parameters, confidence domains can
be obtained with well-known methods (Draper, Smith, 1986, 1987). In the same space there is
a set of bifurcation surfaces, which depends on the selected model. These bifurcation
surfaces cut the confidence domains into sub-domains, each corresponding to one or another
dynamical regime. The desired probabilities can then be estimated with standard Monte-Carlo
methods.
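One way to picture this Monte-Carlo step is sketched below. Everything in it is an assumption made for illustration: the confidence domain is taken to be the set of parameter points whose loss does not exceed a critical level, points are drawn uniformly from a bounding box, and `toy_loss` is a hypothetical stand-in for a real loss surface. The regime rule for the Kostitzin model (extinction for a not above 1, stabilization otherwise) follows the description of that model later in the chapter.

```python
import random

def regime_shares(loss, q_critical, box, classify, n_samples=20000, seed=0):
    """Estimate the share of each dynamical regime inside a confidence domain.

    Assumption: the domain is {(a, b): loss(a, b) <= q_critical} and points are
    sampled uniformly from the bounding box ((a_low, a_high), (b_low, b_high)).
    """
    rng = random.Random(seed)
    (a_low, a_high), (b_low, b_high) = box
    counts = {}
    accepted = 0
    for _ in range(n_samples):
        a = rng.uniform(a_low, a_high)
        b = rng.uniform(b_low, b_high)
        if loss(a, b) <= q_critical:  # the point lies inside the confidence domain
            accepted += 1
            regime = classify(a, b)
            counts[regime] = counts.get(regime, 0) + 1
    return {regime: count / accepted for regime, count in counts.items()}

def kostitzin_regime(a, b):
    # The bifurcation line a = 1 separates extinction from stabilization.
    return "extinction" if a <= 1.0 else "stabilization"

def toy_loss(a, b):
    # Hypothetical elliptic loss surface centered at (1.2, 0.05); not real data.
    return ((a - 1.2) / 0.3) ** 2 + ((b - 0.05) / 0.03) ** 2

shares = regime_shares(toy_loss, 1.0, ((0.8, 1.6), (0.0, 0.1)), kostitzin_regime)
print(shares)
```

With a real dataset, `toy_loss` would be replaced by the fitted loss function and `q_critical` by the level that bounds the chosen confidence domain.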
Thus, the general scheme of population dynamics analysis must include the following basic
stages:

• Selection of a group of mathematical models that describe the influence of a minimal
  number of regulative mechanisms on population dynamics.
• Estimation of parameter values for all selected models.
• Analysis of deviations between theoretical and empirical trajectories.
• Determination of the distribution(s) of dynamical regimes for the model(s).

If the statistical criteria show negative results for all selected models, we have to
choose a group of more complicated mathematical models. Models from this new group may
describe the influence of additional regulators on population dynamics (parasites, quality
of food, weather factors etc.), may take into account a time lag in the response of
self-regulative mechanisms to changes in population size, or may have additional variables
for sex groups, age groups or other intra-population structures.

3. SELECTION OF STATISTICAL CRITERIA


Let us consider model (1) and let $\{x_k^*\}$, $k = 1, 2, \ldots, N$, be an empirical time
series of population size ($N$ is the total number of observations). The problem is to
estimate the parameters of model (1) from the existing time series.
One very popular criterion is the following (see, for example, Berryman, 1991; Turchin et
al., 2003; Nedorezov, Sadykova, 2005, 2008; Tonnang, 2009):

$$Q(\vec{\alpha}) = \sum_{k=2}^{N} \left( x_k^* - x_{k-1}^* F(x_{k-1}^*, \vec{\alpha}) \right)^2 \to \min_{\vec{\alpha}} \tag{3}$$

Various modifications of criterion (3) are possible. For example, when expression (2) for
the birth rates is used in (3), the functional to be minimized takes the following form:

$$Q(\vec{\alpha}) = \sum_{k=2}^{N} \left( y_{k-1}^* - F(x_{k-1}^*, \vec{\alpha}) \right)^2 \to \min_{\vec{\alpha}}, \tag{4}$$

where $y_k^*$ are the values of the birth rates calculated for the empirical time series.
When log-transformed values of population size and the respective log-transformed
expressions in (1) are used (sometimes this strongly simplifies the final formula),
expression (3) can be presented as follows:

$$Q(\vec{\alpha}) = \sum_{k=2}^{N} \left( \log(x_k^*) - \log(x_{k-1}^*) - \log(F(x_{k-1}^*, \vec{\alpha})) \right)^2 \to \min_{\vec{\alpha}} \tag{5}$$

Expressions (3)-(5) have one common property: in these formulas model (1) is used as a
non-linear regression curve rather than as a dynamical model. The real trajectory
$\{x_k^*\}$ is compared with a set of values that, taken together, do not belong to any
single model trajectory (more precisely, these points belong to one and the same trajectory
only in the ideal case, which is unrealistic for real datasets). In other words, in
expressions (3)-(5) we compare objects of two qualitatively different types.
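To make the structure of criteria (3)-(5) concrete, the following sketch evaluates the three one-step-ahead functionals for a given series. It is an illustration only: the function names are hypothetical, the Moran – Ricker birth rate from table 1 is used as the example for F, and the test series are synthetic rather than the chapter's data.

```python
import math

def ricker_birth_rate(x, a, b):
    # Birth rate F of the Moran-Ricker model: x_{k+1} = a * x_k * exp(-b * x_k).
    return a * math.exp(-b * x)

def loss_one_step(series, birth_rate, params):
    # Criterion (3); Python index k = 1..N-1 corresponds to k = 2..N in the text.
    return sum((series[k] - series[k - 1] * birth_rate(series[k - 1], *params)) ** 2
               for k in range(1, len(series)))

def loss_birth_rate(series, birth_rate, params):
    # Criterion (4): empirical birth rates y*_{k-1} = x*_k / x*_{k-1} against F.
    return sum((series[k] / series[k - 1] - birth_rate(series[k - 1], *params)) ** 2
               for k in range(1, len(series)))

def loss_log(series, birth_rate, params):
    # Criterion (5): the same comparison after a log transformation.
    return sum((math.log(series[k]) - math.log(series[k - 1])
                - math.log(birth_rate(series[k - 1], *params))) ** 2
               for k in range(1, len(series)))

# Synthetic series lying exactly on one model trajectory (the "ideal case").
exact = [5.0]
for _ in range(9):
    exact.append(exact[-1] * ricker_birth_rate(exact[-1], 2.0, 0.1))

# The same series with one perturbed observation.
noisy = list(exact)
noisy[4] *= 1.2

for loss in (loss_one_step, loss_birth_rate, loss_log):
    print(loss.__name__, loss(exact, ricker_birth_rate, (2.0, 0.1)),
          loss(noisy, ricker_birth_rate, (2.0, 0.1)))
```

In every term the model is restarted from the previous observed value, which is why these functionals treat model (1) as a regression curve rather than as a trajectory.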
In this situation a natural question arises: why use the one-step-ahead regression curve
for estimating the model parameters? Maybe a two-step-ahead or three-step-ahead curve would
give better results? Which criterion gives the best results, and why?
Moreover, expressions (3)-(5) contain "double standards": when we use (3), (4), or (5) we
assume that elements of the initial empirical sample have two qualitatively different
properties. For example, in the first term of sum (3) we assume that $x_2^*$ is a constant
that was measured with a Normally distributed error. But in the next term of sum (3) we
assume that the expression with $F(x_2^*, \vec{\alpha})$ gives the theoretical value of the
population that we would observe in the ecosystem (if $x_3^*$ were measured without any
error). This means that in the second term of sum (3) we assume that $x_2^*$ was estimated
without error.
For these reasons we think that criteria of type (3)-(5) cannot be applied to the
estimation of real population parameters. On the other hand, these criteria can be applied,
for example, for constructing forecasts of population size.
One of the most promising ways of estimating model parameters is "global fitting" (Wood S,
2001a, b), in which we look, in the set of all model trajectories, for the trajectory that
gives the best fit to the real trajectory. Let $\tilde{x}_k(\vec{\alpha}, x_1)$ be the
solution of model (1) for a given vector $\vec{\alpha}$ and initial population size $x_1$
(we assume that this value is an unknown model parameter too). Then the criterion can be
presented in the form (Tonnang et al., 2009; Nedorezov, Lohr, Sadykova, 2008):

$$Q(\vec{\alpha}, x_1) = \sum_{k=1}^{N} \left( x_k^* - \tilde{x}_k(\vec{\alpha}, x_1) \right)^2 \to \min_{\vec{\alpha}, x_1} \tag{6}$$


Note that in (6) there are no "double standards": all elements of the sample have the same
properties. Criterion (6) has some other important properties. For example, if we have the
same sample but the model has more equations (in particular, an equation for predators,
whose variable is not observed), we can use (6) to estimate all model parameters without
modifying expression (6). Criterion (3), in contrast, requires additional information and
serious modification.
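A minimal sketch of criterion (6) follows. The crude grid search used to minimize it is a stand-in chosen for brevity and is not claimed to be the optimizer used in the chapter; the function names, the grids and the synthetic "observed" series are all hypothetical, and the Moran – Ricker model from table 1 serves as the example.

```python
import itertools
import math

def ricker_birth_rate(x, a, b):
    # Birth rate F of the Moran-Ricker model.
    return a * math.exp(-b * x)

def model_trajectory(birth_rate, x1, params, n):
    # Solution of model (1) started from x1: a whole trajectory, not one-step maps.
    xs = [x1]
    for _ in range(n - 1):
        xs.append(xs[-1] * birth_rate(xs[-1], *params))
    return xs

def loss_global(series, birth_rate, params, x1):
    # Criterion (6): squared distance between the data and one model trajectory.
    model = model_trajectory(birth_rate, x1, params, len(series))
    return sum((obs - mod) ** 2 for obs, mod in zip(series, model))

def grid_search(series, birth_rate, a_grid, b_grid, x1_grid):
    # Exhaustive search over a small grid; the initial value x1 is fitted as well.
    best = None
    for a, b, x1 in itertools.product(a_grid, b_grid, x1_grid):
        q = loss_global(series, birth_rate, (a, b), x1)
        if best is None or q < best[0]:
            best = (q, a, b, x1)
    return best

# Synthetic "observations" generated by the model itself (not real data).
observed = model_trajectory(ricker_birth_rate, 5.0, (2.0, 0.1), 10)
best_q, best_a, best_b, best_x1 = grid_search(
    observed, ricker_birth_rate,
    a_grid=[1.5, 2.0, 2.5], b_grid=[0.05, 0.1, 0.15], x1_grid=[4.0, 5.0, 6.0])
print(best_q, best_a, best_b, best_x1)
```

Because the loss depends on the data only through the simulated trajectory, the same function could be reused for a model with unobserved variables by simulating all of them and comparing only the observed component.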
Below we illustrate this approach to population dynamics analysis with the example of
green oak leaf roller fluctuations (Rubtsov, 1992; Korzukhin, Semevskiy, 1992; Nedorezov,
Sadykova, 2005, 2008; Nedorezov, Sadykov, Sadykova, 2010). This object was chosen because
several good empirical time series exist and because biological opinions about the type of
green oak leaf roller dynamics are diverse. The latter is partly connected with the absence
of a common opinion about the main population regulators (Hunter et al., 1997; Hassell et
al., 1998; Rubtsov, 1983).

4. DYNAMICS OF GREEN OAK LEAF ROLLER


4.1. Phase Portrait of Population Dynamics

A large number of publications are devoted to the problems of green oak leaf roller
population dynamics and its mathematical models (see, for example, Varley et al., 1975;
Korzukhin, Semevskiy, 1992; Hunter et al., 1997; Rubtsov, 1983, 1992; Rubtsov, Shvytov,
1980 and others). But a set of serious problems (in particular, the problem of the main
population regulators) remains open and under discussion (Hunter et al., 1997; Hassell et
al., 1998; Nedorezov, Sadykova, 2005, 2008).
Analysis of phase portrait structures in Rubtsov's publications (Rubtsov, 1983, 1992;
Rubtsov, Shvytov, 1980) allowed him to describe the basic laws of the population dynamics.
In particular, the author showed that in some locations tortrix can realize an outbreak
(Isaev et al., 1980, 2001, 2009; pulse eruptive outbreak in Berryman's classification of
insect population dynamic types; Berryman, 1990, 1991). This regime can be realized within
the framework of predator – prey system dynamics and is characterized by the existence of
three non-zero stationary states in the phase space. Following the basic stages of
population dynamics analysis described above, before using complicated mathematical models
that include equations for several interacting populations we have to check the
possibilities of simpler models (table 1). First of all, we have to be sure that the
observed trajectories cannot be explained by the influence of self-regulative
intra-population mechanisms alone.
Table 1. Models for fitting of empirical time series

No.  Models*                               References                                   Name of the model (common and/or used in current publication)
1    $x_{k+1} = a x_k (1 + b x_k)^{-1}$    Kostitzin, 1937                              Kostitzin** model
2    $x_{k+1} = a x_k (b - x_k)$           Moran, 1950; Ricker, 1954                    Discrete logistic model
3    $x_{k+1} = a (1 - e^{-b x_k})$        Skellam, 1951                                Skellam model
4    $x_{k+1} = a x_k^{1-b}$               Morris, 1959; Varley, Gradwell, 1960, 1970   Morris – Varley – Gradwell model
5    $x_{k+1} = a x_k e^{-b x_k}$          Moran, 1950; Ricker, 1954                    Moran – Ricker model

* The model numbers are the same in all tables.
** This model is also known in the literature as the Skellam model (Skellam, 1951) and the
Beverton – Holt model (Beverton, Holt, 1957), but it was first presented in the monograph
by V.A. Kostitzin (1937).
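The five recursions of table 1 can be written as birth-rate functions F in the sense of equation (1), which makes them interchangeable in one simulation routine. The sketch below is illustrative: the function names are hypothetical, positive population sizes are assumed, and the only numbers taken from the chapter are the Kostitzin estimates for the first time series in table 2.

```python
import math

# Birth-rate functions F(x, a, b) such that x_{k+1} = x_k * F(x_k, a, b).
def kostitzin(x, a, b):
    return a / (1.0 + b * x)

def discrete_logistic(x, a, b):
    return a * (b - x)

def skellam(x, a, b):
    # x_{k+1} = a * (1 - exp(-b * x_k)); the limit of F as x -> 0 is a * b.
    return a * b if x == 0 else a * (1.0 - math.exp(-b * x)) / x

def morris_varley_gradwell(x, a, b):
    # x_{k+1} = a * x_k ** (1 - b); assumes x > 0.
    return a * x ** (-b)

def moran_ricker(x, a, b):
    return a * math.exp(-b * x)

def trajectory(birth_rate, x1, a, b, n):
    """Iterate model (1) for n points starting from x1."""
    xs = [x1]
    for _ in range(n - 1):
        xs.append(xs[-1] * birth_rate(xs[-1], a, b))
    return xs

# Kostitzin model with the table 2 estimates for the first time series.
xs = trajectory(kostitzin, 21.89, 1.394, 0.032, 22)
# Non-zero stationary state of the Kostitzin model: F(x) = 1 at x = (a - 1) / b.
x_stationary = (1.394 - 1.0) / 0.032
print(xs[-1], x_stationary)
```

For these parameter values the trajectory decreases monotonically towards the stationary state, which is the asymptotic stabilization regime of this model.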

Figure 1. Fluctuations of green oak leaf roller on the plane "population size – birth rate"
in various locations of the European part of the Russian Federation. a, b: time series from
Rubtsov (1992). Abscissa axis: number of egg-layings per 1 m of branches in autumn. Point 1
corresponds to 1969. c: time series from Korzukhin and Semevskiy (1992). Abscissa axis:
pupae density per 1000 leaves. Point 1 corresponds to 1962.

Using the models from table 1 means that we assume a priori that the phase portrait of the
population dynamics is characterized by the existence of one non-zero stationary state (if
the population does not go extinct for every initial value of population size). This
stationary state can be stable or unstable, and in the latter case we can observe cyclic or
chaotic population fluctuations. In the Isaev – Khlebopros classification of insect
population dynamics (Isaev et al., 2001, 2009) species of this kind belong to the group of
prodromal species (and not to the group of eruptive species, which can realize an
outbreak). This hypothesis is well grounded if and only if for every analyzed real
trajectory (Figure 1) we find at least one model from table 1 that gives a good fit to the
observed trajectory and for which all statistical tests give positive results.

4.2. Mathematical Models

All models used for fitting the trajectories under consideration are presented in table 1.
The parameters of the discrete logistic model (table 1) must satisfy the condition
$ab \le 4$ (Maynard Smith, 1976; Poluektov et al., 1980; Nedorezov, Nedorezova, 1995). For
fitting real time series, however, we used the modified discrete logistic model:

$$x_{k+1} = \begin{cases} a x_k (b - x_k), & x_k \le b, \\ 0, & x_k > b. \end{cases}$$

This model imposes no additional limits on its non-negative parameters $a$ and $b$. But the
origin (if $ab > 4$) becomes a complicated stationary state.
The additional assumption that the product $ab$ can be bigger than 4 can be interpreted as
follows: we assume that the population size can cross the limit level $b$, but this leads
immediately to the destruction of the ecosystem and the elimination of the local
population. Such a situation is typical for various outbreak species (Isaev et al., 1980,
2001, 2009; Berryman, 1981, 1990, 1991; Varley et al., 1975).
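The modified discrete logistic model is easy to explore numerically. The sketch below is an illustration with hypothetical function names; it uses the estimates a = 0.174 and b = 24.4 reported for the first time series in table 2, for which the product ab exceeds 4. It does not try to reproduce the elimination years mentioned in the chapter, because trajectories in this regime are sensitive to the rounding of the parameters.

```python
def modified_logistic_step(x, a, b):
    # One step of the modified discrete logistic model: a population that
    # exceeds the limit level b is eliminated at the next step.
    return a * x * (b - x) if x <= b else 0.0

def first_elimination_index(x0, a, b, max_steps):
    """Return the first step k at which the trajectory reaches zero, or None."""
    x = x0
    for k in range(1, max_steps + 1):
        x = modified_logistic_step(x, a, b)
        if x == 0.0:
            return k
    return None

a, b = 0.174, 24.4  # table 2, first time series; a * b is about 4.25
# The map attains its maximum a * b**2 / 4 at x = b / 2.
peak = modified_logistic_step(b / 2.0, a, b)
print(a * b, peak, first_elimination_index(b / 2.0, a, b, 10))
```

When ab is above 4 the maximum of the map lies above b, so any trajectory that visits a neighbourhood of b/2 overshoots the limit level and is eliminated one step later.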
All other models were used for fitting in the forms presented in table 1. All models share
one property: within the framework of every model, population dynamics is determined by the
influence of intra-population self-regulative mechanisms only. Several models (Kostitzin
model, Skellam model, and Morris – Varley – Gradwell model) have the poorest set of
dynamical regimes: the population either goes extinct (if $a \le 1$; table 1, Kostitzin
model and Skellam model) or asymptotically stabilizes at a non-zero level (if $a > 1$; for
all values of parameter $a$ the Morris – Varley – Gradwell model contains the regime of
asymptotic stabilization only). The two other models contain very rich sets of dynamical
regimes, which include cyclic regimes of all lengths and chaos.

4.3. Results for First Time Series (Figure 1a)

If we assume that the dynamics of green oak leaf roller on Figure 1a corresponds to purely
stochastic fluctuations near some stationary level $x_s$, then $x_s = 13.671$ with
$SE = 2.243$ (standard error). The minimum value of functional (6) is equal to 2324.1. This
value is an upper bound for the minimum values of functional (6) for all models from table
1 for the first time series. The Kolmogorov – Smirnov test ($d = 0.1247$ with $p > 0.2$)
and the Shapiro – Wilk test ($W = 0.9145$ with $p = 0.0584$) show that there are no reasons
for rejecting the hypothesis of normality (with zero average) of the distribution of
residuals (Bolshev, Smirnov, 1983; Shapiro et al., 1968); the Durbin – Watson criterion
($d = 1.5619$; for sample size 22 and one predictor variable the inequality
$d < d_L = 1.24$ means that there is a positive serial correlation in the sequence of
residuals; if $1.43 = d_U < d < 2$ there is no serial correlation; the critical values
$d_L$ and $d_U$ are given for the 5% level of significance; for the 1% level of
significance the critical values are equal to 1.00 and 1.17 respectively) shows that there
is no serial correlation in the sequence of residuals (Draper, Smith, 1986, 1987). Thus,
there are no reasons for rejecting the hypothesis that on Figure 1a we have purely
stochastic fluctuations near the average value.
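The reported standard error and the minimum of functional (6) for the constant model can be cross-checked against each other. The sketch assumes that the constant model is fitted by the sample mean, so that the minimum of (6) is the sum of squared deviations from the mean, and that SE is the standard error of the mean; the sample size of 22 is the one quoted for the Durbin – Watson critical values.

```python
import math

n = 22            # sample size quoted in the text for the first time series
q_min = 2324.1    # minimum of functional (6) for the constant model

# Sum of squared deviations -> unbiased variance -> standard error of the mean.
sample_variance = q_min / (n - 1)
standard_error = math.sqrt(sample_variance / n)
print(round(standard_error, 3))
```

Under these assumptions the result agrees with the reported value of 2.243 to three decimals.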
If the last hypothesis holds, the population regulators are very weak on the considered
interval of population size. If we assume that population dynamics can be described by the
Kostitzin model, the value of loss-function Q (6) is less than in the previous case (table
2). At the same time there are no reasons for rejecting the hypothesis that the Kostitzin
model is suitable for fitting the experimental time series (table 3). Note that coefficient
$b$, which describes the influence of self-regulative mechanisms on population size in the
model, is rather small.
Table 2. Model parameter estimations and values of functional form Q (6) for all time
series on green oak leaf roller changing in time

Models   x0              a           b          Q
Estimations for first time series (Figure 1a)
1        21.89           1.394       0.032      2190.8
2        15.472          0.174       24.4       1238.0
3        21.636          27.755      0.048      2194.8
4        0.874           40.516      1.467      1476.0
5        8.728           27.04       0.232      1515.6
Estimations for second time series (Figure 1b)
1        0.0074          28603.0     1436.2     3571.6
2        2.19            0.116       37.0       1261.6
3        0.682           19.92       3.514      3571.9
4        20.08           1.414       0.119      3940.7
5        0.972           24.26       0.158      3337.6
Estimations for third time series (Figure 1c)
1        2.7·10^-12      731489.6    60263.0    6442.3
2        0.103           0.0896      50.8       1310.9
3        0.00014         12.138      8.24       6442.3
4        7.85·10^-258    10.78       0.95       6456.2
5        0.05            147.1       0.708      4578.5

Table 3. Analysis of deviations between empirical and theoretical trajectories

Models   Average±SE       Left limit^1   Right limit^1   KS^2             SW^3               DW^4
Results for first time series
1        -0.034±2.178     -4.562         4.495           0.1187/p>0.2     0.9311/p=0.129     1.7064
2        0.87±1.626       -2.511         4.252           0.1476/p>0.2     0.9361/p=0.165     1.6634
3        -0.038±2.18      -4.571         4.495           0.1203/p>0.2     0.9302/p=0.124     1.7026
4        0.448±1.785      -3.263         4.16            0.1131/p>0.2     0.9376/p=0.177     0.7935
5        -0.454±1.809     -4.215         3.307           0.1027/p>0.2     0.95/p=0.315       0.7027
Results for second time series
1        0.0042±2.78      -5.778         5.786           0.1673/p>0.2     0.9253/p=0.098     0.995
2        0.7093±1.645     -2.712         4.131           0.1468/p>0.2     0.9793/p=0.905     2.263
3        -0.025±2.781     -5.808         5.757           0.167/p>0.2      0.9259/p=0.101     0.995
4        -0.0005±2.92     -6.074         6.073           0.18/p>0.2       0.9305/p=0.126     0.9972
5        -0.126±2.69      -5.715         5.463           0.136/p>0.2      0.951/p=0.331      1.0197
Results for third time series
1        0.1392±3.148     -6.344         6.622           0.2222/p<0.1     0.7442/p=0.00002   0.935661
2        3.1161±1.276     0.488          5.744           0.2379/p<0.1     0.8524/p=0.0016    1.961968
3        0.1379±3.148     -6.346         6.622           0.2222/p<0.1     0.7441/p=0.00002   0.935704
4        0.1011±3.152     -6.39          6.592           0.2233/p<0.1     0.7426/p=0.00002   0.935025
5        4.06±2.527       -1.144         9.264           0.3405/p<0.01    0.6281/p<10^-5     1.824773

^1 Limits for 95% confidence interval
^2 KS – values and probabilities of Kolmogorov – Smirnov criteria
^3 SW – values and probabilities of Shapiro – Wilk criteria
^4 DW – values of Durbin – Watson criteria.
The best result is observed for the discrete logistic model (table 2); good results are
also observed for the Morris – Varley – Gradwell model and the Moran – Ricker model. For
all models we can conclude that the estimated values of population parameters belong to the
"biological" zone. From table 3 we can see that for all models we cannot reject the
hypotheses that the distributions of residuals are Normal with zero averages. At the same
time, positive serial correlations are observed in the residuals for the Morris – Varley –
Gradwell model and the Moran – Ricker model. This means that these models cannot be applied
for fitting the time series under consideration. For the other models the Durbin – Watson
criterion shows that there are no serial correlations.
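The Durbin – Watson decisions for the first time series can be reproduced from the values in table 3 and the 5% critical values quoted earlier (1.24 and 1.43 for sample size 22 and one predictor). The function below is a sketch with a hypothetical name; mirroring values above 2 as 4 - d is the usual textbook convention for two-sided use and is an addition not spelled out in the chapter.

```python
def durbin_watson_verdict(d, d_lower, d_upper):
    # Values above 2 are mirrored so that the same bounds can be used (assumption).
    folded = min(d, 4.0 - d)
    if folded < d_lower:
        return "serial correlation"
    if folded <= d_upper:
        return "inconclusive"
    return "no serial correlation"

D_LOWER, D_UPPER = 1.24, 1.43  # 5% critical values quoted in the text
# Durbin-Watson values from table 3, first time series, keyed by model number.
dw_first_series = {1: 1.7064, 2: 1.6634, 3: 1.7026, 4: 0.7935, 5: 0.7027}

verdicts = {model: durbin_watson_verdict(d, D_LOWER, D_UPPER)
            for model, d in dw_first_series.items()}
for model, verdict in verdicts.items():
    print(model, verdict)
```

Models 4 and 5 fall below the lower bound, while models 1 to 3 lie above the upper bound, in agreement with the conclusions stated in the text.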
On Figure 2 the empirical trajectory of tortrix fluctuations is compared with the
theoretical trajectories obtained with the estimated parameters (table 2) for the Kostitzin
model and the discrete logistic model. It is important to note that with the estimated
parameters the discrete logistic model predicts population elimination in 1992; this does
not correspond to reality and can be interpreted as an additional limit on the prognostic
properties of this model.
Figure 3 shows the 99% confidence domain for parameters $a$ and $b$ of the discrete
logistic model (with fixed value $x_0 = 15.472$) together with some bifurcation curves.
Taking into account that the boundary of the confidence domain does not intersect the curve
$ab = 3$, we can conclude that the probability of realization of the regime of population
stabilization or the regime of population elimination (within the framework of this model)
is less than 0.01. The observed minimum of functional (6) lies at the intersection of the
straight lines $L_1$ ($x = 0.174$) and $L_2$ ($y = 24.4$) (Figure 3).

Figure 2. Population density changing in time: curve 1 corresponds to the empirical time
series (Figure 1a), curve 2 is the trajectory of the Kostitzin model, and curve 3 is the
trajectory of the discrete logistic model.

Figure 3. Domain on the plane $(a, b)$ where the values of functional (6) are less than
2217.3 (with fixed value $x_0 = 15.472$) for the discrete logistic model. $ab = 2$,
$ab = 3$, and $ab = 4$ are the bifurcation curves. The intersection of lines $L_1$ and
$L_2$ gives the point of the estimated minimum of loss-function (6).
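One possible reading of the level 2217.3 in the caption of Figure 3 is offered here as an assumption, not as the author's stated procedure. If the 99% domain is the approximate joint confidence region of non-linear least squares found in regression textbooks, with $p = 3$ estimated quantities ($a$, $b$, $x_0$), $N = 22$ observations and the tabulated quantile $F_{0.99}(3, 19) \approx 5.01$, then the arithmetic reproduces the quoted level from the minimum $Q_{\min} = 1238.0$ in table 2:

```latex
Q_{\mathrm{crit}}
  = Q_{\min}\left[ 1 + \frac{p}{N - p}\, F_{0.99}(p,\, N - p) \right]
  = 1238.0 \left[ 1 + \frac{3}{19} \cdot 5.01 \right]
  \approx 1238.0 \cdot 1.791
  \approx 2217.3
```

On this reading, the figure shows the slice of the three-parameter region at the fixed estimate of $x_0$.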

Figure 4 shows the boundaries of the confidence domains for the Kostitzin model parameters
on the plane $(a, b)$ at a fixed value of the initial population size $x_0$ ($x_0 = 21.89$,
table 2). As one can see from this picture, the boundaries of the confidence domains
intersect the bifurcation line $a = 1$, but the probability of asymptotic population
extinction is very small.
Finally, the analysis and the comparison of theoretical trajectories with the observed time
series (Figure 1a) show that the population fluctuations can be explained as a result of
the influence of self-regulative intra-population mechanisms only. We have no reasons to
reject the hypothesis that the analyzed empirical trajectory corresponds to asymptotic
stabilization at a non-zero level (Kostitzin model). Analysis of the modified discrete
logistic model shows (Figure 3) that with high probability there are periodic fluctuations
in the population dynamics.

Figure 4. Curves 1, 2 and 3 are the boundaries of the confidence domains for 90%, 95% and
99% respectively for the Kostitzin model. The bifurcation line $a = 1$ is the boundary
between the domains of population extinction ($a \le 1$) and asymptotic stabilization at a
non-zero level ($a > 1$).

4.4. Results for Second Time Series (Figure 1b)

If we assume that the dynamics of green oak leaf roller on Figure 1b corresponds to purely
stochastic fluctuations near some stationary level $x_s$, then $x_s = 18.95$ with a
standard error of the average of 2.922. The minimum of functional (6) is equal to 3943.9.
The Kolmogorov – Smirnov test ($d = 0.1861$ with $p > 0.2$) and the Shapiro – Wilk test
($W = 0.9197$ with $p = 0.0749$) show that there are no reasons for rejecting the
hypothesis of normality of the distribution of the residuals between theoretical and
empirical values (Bolshev, Smirnov, 1983; Shapiro et al., 1968); the Durbin – Watson test
shows ($d = 0.9927$) that there is a positive serial correlation (Draper, Smith, 1986,
1987) in the sequence of residuals. Consequently, the hypothesis that Figure 1b shows
stochastic fluctuations near the average must be rejected.
Note that the estimates obtained for the Kostitzin model (table 2) belong to the
non-biological zone of population parameters (the maximum value of birth rate $a$ is much
bigger than the maximum fecundity of individuals). This can be one of the reasons for
rejecting the hypothesis that this model is suitable for fitting the empirical time series.
Analysis of the sequences of residuals shows (table 3) that the discrete logistic model is
the only one without positive serial correlation. This means that we have reasons for
rejecting the hypotheses that the other models are suitable for fitting the empirical time
series. On the other hand, none of the tests used allows us to reject the same hypothesis
for the discrete logistic model.
Thus, as in the previous case, the population fluctuations can be explained as a result of
the influence of self-regulative intra-population mechanisms only. On the other hand, the
use of the discrete logistic model (table 1) meets with serious problems. In particular,
for the estimated values we cannot determine the asymptotic regime of population dynamics:
the model shows that the population must go extinct after 14 years (in 2002), which does
not correspond to reality. On Figure 5 the real trajectory of tortrix fluctuations is
compared with the theoretical trajectory obtained with the estimated parameters (table 2)
for the discrete logistic model.

Figure 5. Population density changing in time: curve 1 corresponds to empirical time series (Figure 1b),
and curve 2 is the trajectory of discrete logistic model.
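Extinction in finite time is a known feature of the discrete logistic map when the trajectory overshoots the density at which reproduction vanishes. The sketch below assumes the form x(k+1) = a·x(k)·(1 − b·x(k)), with the density set to zero whenever the right-hand side is not positive; the form, the helper names and the parameter values are assumptions for illustration and are not the estimates reported in the tables.

```python
def discrete_logistic_step(x, a, b):
    """Assumed discrete logistic map; non-positive values are treated as extinction."""
    nxt = a * x * (1.0 - b * x)
    return nxt if nxt > 0.0 else 0.0


def trajectory(x0, a, b, steps):
    """Return the trajectory [x0, x1, ..., x_steps]."""
    xs = [x0]
    for _ in range(steps):
        xs.append(discrete_logistic_step(xs[-1], a, b))
    return xs


def extinction_time(xs):
    """Index of the first zero density, or None if the population persists."""
    for k, x in enumerate(xs):
        if x == 0.0:
            return k
    return None


# Hypothetical parameters: a > 4 lets the map overshoot 1/b, after which the population is lost.
traj = trajectory(x0=20.0, a=4.2, b=0.01, steps=50)
t_ext = extinction_time(traj)
print("extinction step:", t_ext)
```

With a not exceeding 4 the assumed map keeps densities between 0 and 1/b, so extinction in finite time of the kind described above requires parameter estimates outside that range or an initial density at the boundary.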
78 L. V. Nedorezov

4.5. Results for Third Time Series (Figure 1c)

If we assume that the population dynamics in Figure 1c correspond to purely stochastic
fluctuations near a stable level x_s, then x_s = 10.927 with SE = 3.217. The minimum
value of functional form (6) is equal to 6997.7. Obviously, this value is an upper bound for
all values of functional form (6) obtained when the models from table 1 are fitted to the third
time series. The Kolmogorov–Smirnov test (d = 0.2624 with p < 0.05)
and the Shapiro–Wilk test (W = 0.6872 with p < 10^-5) show that the hypothesis of
normally distributed residuals must be rejected (Bolshev, Smirnov, 1983; Shapiro
et al., 1968). Consequently, we have to reject the hypothesis that the observed fluctuations are
stochastic oscillations near the average.
As in the previous case, the results obtained for the Kostitzin model (table 2) belong
to a non-biological domain of the population parameters. Analysis of the deviations (table 3)
shows that none of the considered models gives a sufficient approximation of the empirical
dataset. Consequently, this can be considered as the basis for the following hypothesis: to
explain the population fluctuations in the third case (Figure 1c) we have to use more
complicated mathematical models, which take into account the influence of external factors
or of intra-population effects (Allee effect, group effect, time lag in the reactions of
self-regulative mechanisms, etc.) on population dynamics.

CONCLUSION
In the analysis of population dynamics the following important steps can be identified:

1) At the beginning we have to select a model or a group of models with similar
properties (for example, all models describe the influence of intra-population
self-regulative mechanisms on changes in population size in the simplest variant only,
with no time lags in the reactions of self-regulative mechanisms, etc.), or we must
construct a new model if the existing models do not describe some important elements of
population structure, regulative mechanisms, etc. In other words, we have to construct
a new model if none of the existing models corresponds to the situation under
consideration. During the selection process it is very important to take into account that
every selected model has its own limits of applicability to real datasets. In particular,
choosing the Kostitzin model means that we assume a priori that population size can change
monotonically only. Consequently, all deviations of empirical values from this law
can be (and must be) explained as a result of the influence of external stochastic factors
(for example, weather conditions).
2) At the next step we have to select a statistical criterion for determining the model's
parameters. If the model is intended only for obtaining forecasts of changes in population
size over time, it is expedient to use criteria of the type (3)-(5). But if the model
is not intended only for solving purely practical problems, and the main
goal of the time series analysis is to determine the type of population dynamics, it is
much better to use criteria of the type (6).
Estimation of model parameters is also important for finding population
characteristics that cannot be determined by direct experiment, for example
the maximum birth rate or the coefficient of influence
of intra-population self-regulative mechanisms. Since all
elements of the initial sample are stochastic values, the estimates of the model parameters
are stochastic values too. Consequently, together with the estimates we have to indicate
the confidence domains for the selected significance levels (Draper, Smith, 1986,
1987).
3) One of the most important stages in the analysis of population dynamics is finding the
bifurcation structure of the confidence domains. It allows estimating the probabilities of
realization of one or another dynamical regime, as determined by the
initial sample (particular cases are presented in figures 3 and 4).
4) One of the important elements in the analysis of the correspondence between a model and
empirical datasets is the determination of the basic properties of the time series
formed by the deviations. If the distribution of the deviations does not correspond to
a Normal distribution with zero mean, or there is serial correlation in the sequence of
deviations, this gives us grounds for rejecting the hypothesis that
the considered model can be used for fitting the sample.
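The steps listed above can be sketched end to end on synthetic data. The example below is only an illustration under stated assumptions: the model is the assumed Kostitzin form x(k+1) = a·x(k) / (1 + b·x(k)), the criterion is a global sum of squared deviations between the whole model trajectory and the data (which may differ from the criteria (3)-(6) of this chapter), the minimisation is a coarse grid search, and the "data" are noise-free values generated from the same model with hypothetical parameters.

```python
import itertools


def kostitzin_trajectory(x0, a, b, n):
    """Trajectory of length n for the assumed map x -> a*x / (1 + b*x)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(a * xs[-1] / (1.0 + b * xs[-1]))
    return xs


def sum_of_squares(params, data):
    """Squared deviations between the whole model trajectory and the data."""
    x0, a, b = params
    model = kostitzin_trajectory(x0, a, b, len(data))
    return sum((m - d) ** 2 for m, d in zip(model, data))


def grid_fit(data, x0_grid, a_grid, b_grid):
    """Return the grid point (x0, a, b) with the smallest sum of squares."""
    candidates = itertools.product(x0_grid, a_grid, b_grid)
    return min(candidates, key=lambda p: sum_of_squares(p, data))


# Synthetic "observations" (hypothetical parameters, no noise).
data = kostitzin_trajectory(2.0, 1.5, 0.05, 15)

x0_grid = [1.0, 2.0, 3.0]
a_grid = [1.0 + 0.1 * i for i in range(11)]   # 1.0 ... 2.0
b_grid = [0.01 * i for i in range(1, 11)]     # 0.01 ... 0.10

best = grid_fit(data, x0_grid, a_grid, b_grid)
fitted = kostitzin_trajectory(best[0], best[1], best[2], len(data))
deviations = [d - m for d, m in zip(data, fitted)]

print("estimated (x0, a, b):", best)
print("largest absolute deviation:", max(abs(e) for e in deviations))
```

With real data the deviations would then be examined for normality and serial correlation, as described in step 4, and a proper optimiser with confidence domains would replace the grid.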

If none of the statistical criteria used allows rejecting the hypothesis that the
considered model is suitable for approximating the time series, the model can in turn be used
to obtain correct solutions of some other problems. For example, it allows us to identify
the type of population dynamics, or at least to describe the set of mechanisms
that influence population size. It can also help to estimate missing
data points ("holes" in the time series) and to give a correct (in the mathematical sense)
formulation of the problem of optimal population management, etc.
Application of several mathematical models with discrete time (table 1) to the
approximation of green oak leaf roller population dynamics (Figure 1) showed that in the first
case (Figure 1a) we could not reject the hypothesis that the observed dynamical regime
corresponds to purely stochastic fluctuations near a stationary level. The influence of
regulative mechanisms is rather weak.
For the second case (Figure 1b) it was found that the observed fluctuations can be
explained as a result of the influence of intra-population self-regulative mechanisms only.
In the last case it was impossible to obtain a good fit with any of the models. This gives us
grounds for the following hypothesis: in this situation the dynamics of the green oak leaf
roller population cannot be explained as a result of the influence of self-regulative
mechanisms only. It also means that for fitting this time series we have to use more complex
mathematical models, which describe the influence of some other regulators on population
dynamics or take into account some additional intra-population effects.

REFERENCES
Begon M., Mortimer M. Population Ecology: a unified study of animals and plants. Oxford
etc.: Blackwell sci. publ., 1981
Berryman AA. Population systems: a general introduction. New York: Plenum Press, 1981
Berryman AA. 1990. Identification of Outbreaks Classes. Math. Comput. Modelling, 13: 105–
116.
Berryman AA. 1991. Population theory: an essential ingredient in pest prediction,
management and policy making. Am. Ent., 37: 138-142.
Beverton RJ., Holt SJ. 1957. On the dynamics of the exploited fish populations. Great Brit.
Min. Agr. Fish, Food, Fish. Invest., 2-19: 1-533.
Bolshev LN., Smirnov NV. Tables of Mathematical Statistics. Moscow: Nauka, 1983. (in
Russian)
Draper NR., Smith H. Applied Regression Analysis. V.1. Moscow: Finance and Statistics,
1986. (in Russian)
Draper NR., Smith H. Applied Regression Analysis. V.2. Moscow: Finance and Statistics,
1987. (in Russian)
Gause GF. The Struggle for Existence. Baltimore: Williams and Wilkins, 1934.
Hassell MP. 1975. Density-dependence in single-species populations. J. Anim. Ecol., 44: 283-
295.
Hassell MP. The Dynamics of Arthropod Predator-Prey Systems. Princeton: Princeton
University Press, 1978.
Hassell MP., Crawley MJ., Godfray HCJ., Lawton JH. 1998. Top-down versus bottom-up and
Ruritanian bean bug. Proc. Nat. Acad. Sci USA, 95: 10661-10664.
Hunter MD., Varley GC., Gradwell GR. 1997. Estimating the relative roles of top-down and
bottom-up forces on insect herbivore populations: A classic study revisited. Proc. Natl.
Acad. Sci. USA, 94: 9176-9181.
Isaev AS., Khlebopros RG., Kondakov YP., Kiselev VV., Nedorezov LV., Soukhovol‘sky
VG. 2009. Forest Insect Population Dynamics. Euroasian Entomol. J., 8: 3-115.
Isaev AS., Khlebopros RG., Nedorezov LV., Kondakov YP., Kiselev VV., Soukhovol‘sky
VG. Forest Insect Population Dynamics. Moscow: Nauka, 2001. (in Russian).
Isaev AS., Nedorezov LV., Khlebopros RG. 1980. Qualitative Analysis of the
Phenomenological Model of the Forest Insect Number Dynamics. Pest and Pathogen
Control, 9: 1-44.
Korzukhin MD., Semevskiy FN. Forest sinecology. Saint Petersburg: Gidrometeoizdat, 1992.
(in Russian)
Kostitzin VA. La Biologie Mathematique. Paris: A.Colin, 1937.
Kot M. Elements of Mathematical Ecology. Cambridge: Cambridge University Press, 2001.
May RM. 1975. Biological populations obeying difference equations: stable points, stable
cycles and chaos. J. Theor. Biol., 51: 511-524.
Maynard Smith JM. Models in ecology. Moscow: Mir, 1976. (in Russian)
Maynard Smith JM, Slatkin M. 1973. The stability of predator-prey systems. Ecology, 54:
384-391.
Moran PAP. 1950. Some remarks on animal population dynamics. Biometrics, 6: 250-258.
Morris RF. 1959. Single-factor analysis in population dynamics. Ecology, 40: 580-588.
Nedorezov LV. Modeling of Forest Insect Outbreaks. Novosibirsk: Nauka, 1986. (in Russian)
Nedorezov LV. 1999. Restoration of Phase Portrait Structure for the Dynamics of a Forest
Pest, the Pine Moth (Dendrolimus pini L.). Ecol. Modell., 115: 35-44.
Nedorezov LV., Lohr BL., Sadykova DL. 2008. Assessing the importance of self-regulating
mechanisms in diamondback moth population dynamics: Application of discrete
mathematical models. J. Theor. Biol., 254: 587–593.
Nedorezov LV., Nedorezova BN. 1995. Correlation between Models of Population Dynamics
in Continuous and Discrete Time. Ecological Modelling, 82: 93-97.
Nedorezov LV., Sadykov AM., Sadykova DL. 2010. Population dynamics of green oak leaf
roller: applications of discrete-continuous models with non-monotonic density-dependent
birth rates. J. Gen. Biol., 71: 41-51.
Nedorezov LV., Sadykova DL. 2005. Toward a problem of selection of mathematical model
of population dynamics (on an example of green oak leaf roller). Euro-Asian Ent. J., 4:
263-272.
Nedorezov LV., Sadykova DL. 2008. Green oak leaf roller moth dynamics: An application of
discrete time mathematical models. Ecological Modelling, 212: 162-170.
Poluektov RA., Pykh JuA., Shvitov IA. Dynamic models of ecological systems. Leningrad:
Gidrometeoizdat, 1980. (in Russian)
Ricker WE. 1954. Stock and recruitment. J. Fish.Res. board of Canada, 11: 559-623.
Rubtsov VV. 1983. Mathematical model for development of leaf-eating insects (oak leaf
roller taken as an example). Ecological Modelling, 18: 269-289.
Rubtsov VV. Models of oscillating processes in forest ecosystems. Moscow: Forest Institute,
1992. (in Russian)
Rubtsov VV., Shvytov IA. 1980. Model of the dynamics of the density of forest leaf-eating
insects. Ecological Modelling, 8: 39-47.
Shapiro SS., Wilk MB., Chen HJ. 1968. A comparative study of various tests of normality. J.
of the American Statistical Association, 63: 1343-1372.
Skellam JG. 1951. Random dispersal in theoretical populations. Biometrika, 38: 196-218.
Tonnang H., Nedorezov LV., Owino J., Ochanda H., Löhr B. 2009. Evaluation of discrete
host-parasitoid models for diamondback moth and Diadegma semiclausum field time
population density series. Ecological Modelling, 220: 1735-1744.
Turchin P., Wood SN., Ellner SP. et al. 2003. Dynamical effects of plant quality and
parasitism on population cycles of larch bud moth. Ecology, 84: 1207-1214.
Varley GC., Gradwell GR. 1960. Key factors in population studies. J. Anim. Ecol., 29: 399-
401.
Varley GC., Gradwell GR. 1970. Recent advances in insect population dynamics. Ann. Rev.
Ent., 15: 1-24.
Varley GC., Gradwell GR., Hassell MP. Insect Population Ecology. An analytical approach.
London: Blackwell Scientific Publications, 1975.
Wood SN. 2001a. Minimizing model fitting objectives that contain spurious local minima by
bootstrap restarting. Biometrics, 57: 240-244.
Wood SN. 2001b. Partially specified ecological models. Ecol. Monographs, 71: 1-25.
In: Ecological Modeling ISBN: 978-1-61324-567-5
Editor: WenJun Zhang, pp. 83-96 © 2012 Nova Science Publishers, Inc.

Chapter 5

INDIVIDUAL BASED MODELLING OF PLANKTONIC ORGANISMS

Daniela Cianelli (1, 2), Marco Uttieri (1) and Enrico Zambianchi (1)

(1) Department of Environmental Sciences, University of Naples "Parthenope",
Centro Direzionale di Napoli Isola C4, 80143 Naples, Italy.
(2) ISPRA – Institute for Environmental Research and Protection,
Via di Casalotti 300, 00166 Rome, Italy.

ABSTRACT
In the last decades, numerical modelling has gained increasing consensus in the
scientific world, and particularly in the framework of behavioural and population
ecology. Through numerical models it is possible to reconstruct what is observed in the
environment or in the laboratory and to get a more in-depth comprehension of the factors
regulating the phenomena under examination.
Numerous approaches have been developed in this framework, but probably one of
the most promising is individual-based modelling. With this type of approach it is
relatively straightforward to investigate aspects related to the ecology of a population
starting from the characterisation of processes taking place at the scale of the individual
organism.
This contribution is intended to provide a general view of the main features of the
individual-based models and of their peculiarities in comparison to other modelling
strategies. Special emphasis will be given to applications in the field of phyto- and
zooplankton ecology and behaviour, and results from the available literature on this topic
will be used as examples.

Keywords: individual-based modelling, phytoplankton photophysiology, zooplankton behaviour.

1. INTRODUCTION
Natural processes are quite often the result of complex and hardly predictable interactions
among the components of a system. Experimental studies, either in situ or in the laboratory,
provide important insights into some of these mechanisms; however, owing to the intrinsic
complexity of these processes and of their interactions, only a few of them can be investigated
at a given time. Numerical modelling often represents an affordable approach to get a more
holistic view of natural systems and of the processes acting in them. As discussed in Mac
Nally (1997), experimental methodology is typically based on the Popperian paradigm of the
hypothesis-deduction approach. Numerical models aid the understanding of
ecological processes: in a correct approach, the scope of numerical simulations is to
reproduce natural phenomena and provide new elements to understand their underlying
dynamics, rather than to accept model results blindly (Wissel, 1992; Grimm, 1999).
Models are thus "purposeful representations" (Starfield et al., 1990) and powerful
problem-solving tools that make the main properties of the considered system emerge and explain
the observed phenomenon (Grimm, 1999). In a numerical model, the dynamics of several
variables are integrated through interactions of processes (Wroblewski, 1983). As a general
rule of thumb, when modelling a complex system the simplest components should be
identified and the interactions among them and with the environment investigated
(Wroblewski, 1983).
As underlined by Judson (1994), two fundamental aspects make ecology different from
other natural sciences: the lack of first principles and, on the other hand, the strong presence
of the unifying principle of Darwinian evolution. Since May (1974 and 1976), there has been
increasing awareness of the possible development of chaotic dynamics even in
simple systems. Models of population dynamics are no exception to this rule (Łomnicki,
1999), and for this reason ecological modelling is still in continuous progress.
Over the years several modelling approaches have been developed to address key ecological
topics. Among them, individual-based models (IBMs) have emerged as a promising
framework to relate individual behaviour to the patterns observed at community and
population levels (Grimm, 1999).
Classically, ecological state-variable models describe a population in terms of bulk
properties averaged over a large number of individuals, without considering the variability
among them, and use continuous functions changing in time and space (Fennel and Osborn,
2005). But within a population individuals are not all the same, and their differences are
reflected in the structure and dynamics of the population itself (Łomnicki, 1999). In an IBM
the focus of interest is the individual, considered as the central elemental component of
the system (with an approach equivalent to that of experimental biology), and a population is
assumed to be made up of individuals differing in their properties (Uchmański and Grimm,
1996). IBMs belong to the family of agent-based models (or multi-agent-based models), a
modelling approach designed to investigate the interactions among numerous components of
a system. These models are a natural extension of the Ising model (Ising, 1925) and of
cellular-automata-like models (Wolfram, 1994). In an IBM the modelled "agent" is simply an
individual, treated as a unique and discrete entity with at least one property evolving through
time. IBMs explicitly include heterogeneity among individuals (e.g., spatial location, body
size, physiological parameters, etc.), and this makes them particularly suitable to link the
individual with aggregated levels (e.g., population dynamics, community structure, spatial
distribution, etc.), while at the same time stabilising the model itself (Cope, 2005). By
modelling the probabilistic behaviour of individual organisms and averaging over a
reasonably high number of them, IBMs use a "bottom-up" approach (Souissi et al., 2005) and are
capable of delineating population-level dynamics as emergent properties due to the
interactions among the individuals and between the individual and its environment
(Railsback, 2001).
Since the rules governing the individual can account for several processes, IBMs permit
more realistic assumptions than those used for state-variable models (Souissi et al., 2004). In
addition, by using observed biological inputs IBMs do not introduce any mathematical artifact
in the numerical representation (Scheffer et al., 1995). Some IBMs are defined as spatially
explicit when the individual is associated with a position in space, which can be either
continuous or discrete; they may also include mobility if the individual is allowed to move
inside its environment (as for simulations of animal behaviour). IBMs can also be considered
as i-state configuration models (Metz and Diekmann, 1986; Caswell and John, 1992; Maley
and Caswell, 1993), where for each individual a set of i-states (e.g., age, size, weight, etc.) is
defined at each time step.
The first models using individuals as basic units date to the early 1980s (e.g., DeAngelis
et al., 1980; Beyer and Laurence, 1980), but it was only after the work by Huston et al. (1988)
that the IBM approach was unequivocally defined. The possibility of considering IBMs
as a unifying ecological theory was discussed in Huston et al. (1988) and then reviewed a
decade later by Grimm (1999), while Fennel and Osborn (2005) proposed an alternative
framework to relate state variables and individuals.
In the literature a number of seminal reviews about IBMs are available, focusing on the
potential and on the applications of this modelling framework to ecological issues (e.g.,
DeAngelis et al., 1990 and 1994; DeAngelis and Gross, 1992; Judson, 1994; Uchmański and
Grimm, 1996; Grimm, 1999; Grimm et al., 1999; Łomnicki, 1999). Of course, all that glitters
is not gold, and some caveats must be mentioned. Since IBMs can virtually model all the
individuals of a population, a downside is the costly demand for computational and storage
resources needed to run the model smoothly. Although the computing performance of present
personal computers has made giant leaps and is still evolving very rapidly, some resampling
techniques are commonly adopted. Since realistic numbers of individuals need to be
represented in the model for a reliable representation of natural phenomena, simply reducing
the number of modelled individuals is not a suitable solution. This procedure would in
fact decrease the variability in the variables modelled, with possible development of irregular
dynamics (Scheffer et al., 1995). A common procedure consists in treating each modelled
individual as a group of individuals sharing some common characteristic. This is the root of
the "Lagrangian-ensemble method" (Woods and Onken, 1982) and of the "super-individual"
approach (Scheffer et al., 1995). In this way, every modelled individual represents a varying
number of specimens sharing the same fate, and for each of them a number of variables and
processes can be simulated and their time-dependent variation studied. Such a methodology
may however modify the spatial and temporal model dynamics, especially when a large
number of individuals is aggregated (Parry and Evans, 2008). Parallel computing can provide
a helpful way to overcome this limitation while maintaining the original model structure (e.g.,
Parry and Evans, 2008).
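The idea of a modelled individual that stands for a varying number of specimens can be sketched as follows. This is a minimal illustration in the spirit of the super-individual approach, not the implementation of any of the cited papers; the class, the field names and the numbers are hypothetical. Mortality is applied by thinning the number of specimens each super-individual represents, so the list of modelled entities stays short while total abundance still declines.

```python
import random
from dataclasses import dataclass


@dataclass
class SuperIndividual:
    weight: float        # state shared by all represented specimens
    n_represented: int   # how many real specimens this entity stands for


def total_abundance(population):
    """Number of real specimens represented by the whole population."""
    return sum(s.n_represented for s in population)


def apply_mortality(population, survival_prob, rng):
    """Thin each super-individual's internal count; drop those that reach zero."""
    survivors = []
    for s in population:
        n_alive = sum(1 for _ in range(s.n_represented) if rng.random() < survival_prob)
        if n_alive > 0:
            survivors.append(SuperIndividual(s.weight, n_alive))
    return survivors


rng = random.Random(42)
population = [
    SuperIndividual(weight=rng.uniform(0.5, 1.5), n_represented=1000)
    for _ in range(50)
]

before = total_abundance(population)
population = apply_mortality(population, survival_prob=0.9, rng=rng)
after = total_abundance(population)

print("entities:", len(population), "specimens before:", before, "after:", after)
```

Here 50 entities stand for 50 000 specimens, which shows the saving in storage; the cost, as noted above, is that all specimens within one entity share the same state and position.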
Judson (1994) criticised the absence of a detailed description of models in IBM
approaches. This criticism prompted the development of the ODD (Overview, Design
concepts, Details) protocol by Grimm et al. (2006 and 2010). This protocol consists of seven
tasks aimed at providing a standard description of individual-based and agent-based models
through a detailed description of the variables used, of their initialisation and of the
sub-models implemented.
It is worth stressing that IBMs are extremely versatile, being capable of representing
animals, plants, human beings or vehicles (e.g., Benenson et al., 2008; Berger et al., 2008).
The focus of this work will be the applications of the IBM approach to the dynamics of
plankton ecosystems. Their components have often been described as continuum fields,
characterized by integrating sets of differential equations parameterizing physical, biological
and chemical processes. While computationally favourable, this approach proved not
exhaustive for plankton ecology (e.g., Woods and Barkmann, 1994) and over the last thirty
years it has been integrated with an individual-based description. The next sections will
provide a synthesis of the studies where the IBM framework has been applied to
phytoplankton and zooplankton organisms, and of the insights gained into the dynamics of
aquatic systems.

2. PHYTOPLANKTON
Phytoplankton is composed of autotrophic, photosynthetic organisms belonging to
different taxa and living in all aquatic ecosystems (Mann and Lazier, 1996). They are moved
by water currents both horizontally and vertically, even though some species are capable
of some degree of autonomous motility. Although most organisms are too small to be seen with
the naked eye, they often form large aggregates in the form of blooms. Phytoplankton represents
about 1-2% of the total biomass of the world ocean, but at the global scale these organisms are
able to fix at least 30-60% of the total organic carbon (Falkowski et al., 1994).
Light and nutrients are the basic resources used by phytoplanktonic organisms as a source
of energy to perform their biosynthetic processes. As light intensity and nutrient
concentration show both diel and seasonal changes, phytoplankton has adapted to these
resource variations that regularly occur in the aquatic environment. In particular
phytoplankton has developed a photosynthetic apparatus, which adapts to the changes in light
intensity frequently observed in the water column. Phytoplanktonic organisms respond to
light variations both by adjusting the rates of biochemical processes and by changing the
organization, composition and functioning of the photosynthetic apparatus (Falkowski and
LaRoche, 1991). Different phytoplankton organisms exposed to the same resource
distributions will show different bio-chemical composition and photosynthetic performances
(Falkowski and LaRoche, 1991). This in turn affects the growth rate of the entire
phytoplankton population in the water column, modulating the species abundances and
altering their distributions and occurrence.
In marine ecosystems the physical-biological conditions change continuously, which underlies
the wide range of photophysiological responses adopted by phytoplanktonic organisms.
Among other factors, the high variability in the light regime and nutrient availability
experienced by the cells, as well as the frequent vertical displacements in the water column
induced by turbulent mixing and convection, are crucial in inducing the different
responses of organisms (e.g., Lewis et al., 1984; Figueiras et al., 1999).
Nevertheless, until recently the phytoplankton community has frequently been described
in marine ecosystem models through ensemble averages, treating the organisms as a
continuum. For example, in these bulk (Eulerian) models primary production has been
simulated by computing the depth-averaged light intensity experienced by the phytoplankton
population and then calculating the growth over time. As shown by Woods and Onken
(1982), such an approach, which averages non-linear equations before integration, causes
inaccuracy in the model results.
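The inaccuracy introduced by averaging before applying a non-linear relationship can be seen in a small numerical sketch. The example assumes an exponential decay of light with depth and a saturating photosynthesis-light curve of the form P = Pmax·(1 − exp(−α·I / Pmax)); both forms and all parameter values are assumptions chosen for illustration and are not taken from the cited models.

```python
import math


def light_at_depth(z, surface_light=1000.0, attenuation=0.1):
    """Assumed exponential attenuation of light with depth z (metres)."""
    return surface_light * math.exp(-attenuation * z)


def photosynthesis(light, p_max=2.0, alpha=0.01):
    """Assumed saturating photosynthesis-light curve (concave in light)."""
    return p_max * (1.0 - math.exp(-alpha * light / p_max))


# Hypothetical cells spread evenly over a 50 m layer.
depths = [0.5 + i for i in range(50)]
lights = [light_at_depth(z) for z in depths]

# Bulk estimate: apply the curve to the depth-averaged light.
bulk_rate = photosynthesis(sum(lights) / len(lights))

# Individual-based estimate: apply the curve to each cell, then average.
individual_rate = sum(photosynthesis(light) for light in lights) / len(lights)

print("rate from averaged light:", bulk_rate)
print("average of individual rates:", individual_rate)
```

Because the assumed curve is concave, the rate computed from the averaged light is larger than the average of the individual rates (Jensen's inequality); the size of the gap depends on the curve and on how unevenly light is distributed among cells.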
In order to reproduce realistic dynamics of phytoplankton populations the individual
physiology and the interactions with other organisms and the environment have to be taken
into account (e.g., DeAngelis and Gross, 1992).
IBMs currently represent the most suitable tool for reconstructing the time evolution
of a phytoplankton community in terms of the temporal and spatial histories of the individual
organisms. The individual-based approach allows treating populations as composed of a large
number of organisms, whose individual histories determine both the physiological response to
environmental conditions and the species composition in the water column.
When primary production is modelled by means of the IBM approach, the light perceived by
each individual is first computed, and then the growth of each organism is integrated. The
growth of the entire population emerges from these individual responses and gives the primary
production estimate for the water column, which may differ significantly from the estimates
obtained through bulk Eulerian models.
The numerical approach to the study of phytoplankton behaviour in the water column
was first introduced in the late 1970s (e.g., Marra, 1978a and 1978b; Kamykowski, 1979;
Falkowski and Wirick, 1981), but it was only after the work by Woods and Onken (1982) that
the IBM description for phytoplankton received wider attention. Here we briefly summarize
some of the most relevant studies applying the IBM approach to simulate a large number of
complex physical-biological processes acting simultaneously and at different scales in the
water column.
The first numerical studies, conducted by Marra (1978a and 1978b), investigated the
interaction between mixing and light regimes, showing the different photophysiological
responses of phytoplankton to the variable light regime it experiences. Kamykowski (1979) was
the first to model the interplay between individual phytoplankters and a variable flow field
associated with a semidiurnal internal tide, while the distribution of phytoplanktonic
organisms in Langmuir circulations was simulated by Evans and Taylor (1980). In their paper,
Falkowski and Wirick (1981) analysed the effects of variations in the light regime due to
vertical mixing on primary productivity. Phytoplankton cells were allowed to light-shade adapt
on a fixed time scale by varying their Chla:C ratios in response to variations in the light
regime. Their results showed that despite the physiological adaptation to light, vertical
mixing may have little effect on the integrated water column primary productivity.
However, the utility of the IBM approach was fully recognized only after the
reference paper by Woods and Onken (1982), who developed a Lagrangian ensemble model
coupled with a one-dimensional model of the upper ocean. The authors used a Lagrangian
biological model to study how turbulent transport of cells through the diurnal light gradient
affected the depth distribution and energy uptake of a phytoplankton population. The model
showed that as the surface mixed layer deepened during the night, phytoplankton cells and
particles produced in a shallow mixed layer during one day were mixed downward at rates far
larger than their still-water settling rates.
Several additional contributions subsequently appeared in the literature addressing
specific aspects of phytoplankton photophysiology in a rapidly changing environment. Lande
and Lewis (1989) questioned the rationale for using the time-consuming Lagrangian approach
as opposed to the traditional Eulerian one, while a more recent attempt to overcome the limit
imposed by handling average quantities within an Eulerian framework has been proposed by
Janowitz and Kamykowski (1999). Yamazaki and Kamykowski (1991) and Kamykowski et
al. (1994) investigated the interactions between vertical swimming and photo-adaptation
response by a random walk representation of dinoflagellates swimming in the surface mixed
layer. The results highlighted the interactions of individual organisms with the simulated
turbulent mixing. Subsequently Kamykowski et al. (1994) also analysed by means of the IBM
approach the effect of the photoinhibition process on primary production estimates. The
model clearly demonstrated that vertical mixing regime strongly determined how the
individual light history affected each population‘s statistical characteristics. More recently
Nagai et al. (2003) coupled the Lagrangian photoresponse model of Kamykowski et al.
(1994) with a 2nd-order turbulence closure model (Mellor and Yamada, 1982). In this study
Nagai et al. (2003) investigated the effects of wind mixing and diurnal photoresponse on the
daily phytoplankton production in a realistic water column. Their results suggested that
intense wind mixing in a lower-transparency water column determined greater phytoplankton
production. Consequently, vertical mixing was not a relevant factor for the photoresponse in
open-ocean water, while in a coastal condition it played a more relevant role.
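A random-walk representation of cells in a mixed layer, of the general kind mentioned above, can be sketched as follows. This is not the model of any of the cited studies: it assumes a constant vertical diffusivity, reflecting boundaries at the surface and at the base of the layer, exponential light attenuation, and hypothetical parameter values and function names. Each cell accumulates its own light history, which is the quantity an individual-based photoresponse model would then act upon.

```python
import math
import random


def reflect(z, depth):
    """Reflect a position back into the interval [0, depth]."""
    while z < 0.0 or z > depth:
        if z < 0.0:
            z = -z
        if z > depth:
            z = 2.0 * depth - z
    return z


def simulate_cells(n_cells=200, depth=30.0, diffusivity=1e-3, dt=60.0,
                   n_steps=600, surface_light=1000.0, attenuation=0.15, seed=1):
    """Random walk of cells in a mixed layer; returns final depths and mean light per cell."""
    rng = random.Random(seed)
    step_sd = math.sqrt(2.0 * diffusivity * dt)  # displacement scale per time step
    positions = [rng.uniform(0.0, depth) for _ in range(n_cells)]
    mean_light = [0.0] * n_cells
    for _ in range(n_steps):
        for i in range(n_cells):
            z = reflect(positions[i] + rng.gauss(0.0, step_sd), depth)
            positions[i] = z
            mean_light[i] += surface_light * math.exp(-attenuation * z) / n_steps
    return positions, mean_light


positions, light_history = simulate_cells()
print("mean light, least exposed cell:", min(light_history))
print("mean light, most exposed cell:", max(light_history))
```

The spread between the least and most exposed cells shows why individual light histories differ even when all cells share the same mixed layer; stronger mixing (a larger assumed diffusivity) narrows that spread over a given time.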
Cianelli et al. (2004) developed an individual-based model describing the spatial and
temporal evolution of phytoplankton organisms moving in the Antarctic mixed layer during
summer. While previous studies used simplified descriptions of cell photo-response, this
study was the first to combine the dynamic photoacclimation of the pigment content with a
mechanistic description of the photoinhibition process. Moreover, in order to simulate
phytoplankton photo-physiology in detail, the model explicitly considered the dynamics of organic
carbon (C) and chlorophyll a (Chl a) along with the vertical structure of the water column. As
the paper focused on the role of light variability on phytoplankton growth in the Antarctic
mixed layer, the authors simulated both a nutrient-replete and an iron-replete scenario, thus
reproducing the conditions frequently observed in coastal areas (Martin et al., 1990) or at the
onset of the growth season in Antarctica. The effect of different turbulent regimes and mixed
layer depths on the integrated primary production was investigated using in situ measured
parameters and kinetic constants consistent with the measured photosynthetic rates. The
coupling of different mixing levels with photoacclimation strategies led to a wide range of
photophysiological responses which underlined the role of the individual physiological
histories in determining the growth of the entire population. The results showed that the
highest rate of cell accumulation was reached when the vertical mixing compensated for
photoinhibition, thus suggesting that photoacclimation to low irradiance and strong mixing
regimes represented a crucial factor in the photosynthetic performance of Antarctic
phytoplankton.
In their recent paper Esposito et al. (2009) used the IBM approach to analyse how a
fluctuating light environment may influence the carbon assimilation by phytoplankton cells.
In particular this model included the dynamics of photoacclimation and photodamage-repair
mechanisms. Different light regimes (steady, square wave, sinusoidal light–dark cycles and
fluctuating regimes) experienced by phytoplankton organisms were simulated. A realistic
ocean mixed layer, reproduced by a large eddy simulation, was also modelled. The results
showed a decrease of carbon assimilation in the fluctuating-light scenario, as compared to
the steady light regime, due to the temporal delay between light fluctuations and photoresponses.
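The idealised light regimes listed above can be sketched as simple forcing functions. The example below is only an assumption about how such regimes might be constructed for a fair comparison: the three regimes are scaled so that they deliver the same daily mean irradiance. The function names, the 24 h period and the 12:12 light–dark split are hypothetical, and the turbulence-driven fluctuating regime is not covered.

```python
import math


def steady(t_hours, mean_light):
    """Constant irradiance."""
    return mean_light


def square_wave(t_hours, mean_light, period=24.0):
    """Light on for the first half of each period, dark for the second half.

    The 'on' level is twice the mean so that the daily mean is preserved.
    """
    return 2.0 * mean_light if (t_hours % period) < period / 2.0 else 0.0


def sinusoidal(t_hours, mean_light, period=24.0):
    """Half-sine light phase followed by darkness.

    A half-rectified sine with peak A has a mean of A / pi over the full
    period, hence the peak is set to pi * mean_light.
    """
    phase = (t_hours % period) / period
    if phase >= 0.5:
        return 0.0
    return math.pi * mean_light * math.sin(2.0 * math.pi * phase)


def daily_mean(regime, mean_light, period=24.0, n=2400):
    """Midpoint-rule average of a regime over one period."""
    dt = period / n
    return sum(regime((k + 0.5) * dt, mean_light) for k in range(n)) / n
```

Because the daily dose is the same in all three cases, any difference in simulated carbon assimilation between regimes would arise from the timing of the light supply rather than from its total amount.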
During the last 15 years, a few studies have also applied the IBM approach to
phytoplankton competition dynamics. Dippner (1998) implemented a mixed Lagrangian–
Eulerian model to investigate the nutrient competition of two pelagic phytoplankton species.
The model results suggested that in coastal waters a shift in the composition of functional
groups was due to a phosphate increase and a silicate reduction. Broekhuizen (1999) used a
combination of the Eulerian and Lagrangian method to study the role of motility in promoting
the persistence or the co-existence of dinoflagellates with diatoms. The author described the
nutrient field and the organic matter distributions on an Eulerian grid while the phytoplankton
organisms were modelled using a Lagrangian description. His findings showed that
dinoflagellates were able to coexist with diatoms by means of nutrient storage capacity and by
their great motility.
More recently, Nogueira et al. (2006) used the Lagrangian ensemble method to analyse,
over a three-year period, the phytoplankton competition of a size-structured population whose
organisms belonged to the same functional group but differed in size and competed for two
resources (light and nutrient-nitrogen). This one-dimensional IBM was coupled with an NPZD
food-chain plankton ecosystem model, forced by astronomical and climatological conditions
of a subtropical area. The model reproduced the seasonal pattern of the environmental
variables and of the phytoplankton biomass and displayed seasonality in relative demography.
The outcomes also showed that the species co-existence was achieved over the simulated
period despite substantial seasonal variations in competitive advantage.
After Nogueira et al. (2006), Cianelli et al. (2009a) applied the IBM approach to simulate
the dynamics of two coexisting phytoplankton species in the mid-latitude mixed layer. The
model was aimed at investigating whether turbulent mixing affected the dominance of one
species over another on the time scales of maximum abundance of a bloom forming species
(20–30 days). The species were characterized by different photophysiological behaviour and
shared the same resources (light and nutrient) whose availability was determined by the
turbulent mixing. The physiological complexity of the individual organisms was described
explicitly taking into account the time-dependence of biomass and the chemical content of the
cells (carbon, nitrogen and chlorophyll a) in response to variable environmental resources.
The space and time variability of nutrient concentration and turbulent mixing was reproduced
by introducing vertical profiles of measured eddy diffusivity. Three case studies were simulated
to analyse the role of environment–individual interactions in determining the outcome of
competition of the selected species. Starting from a low complexity level, where the two
species only shared light and nutrient resources in almost stationary conditions, a further
factor of environmental variability was added for the subsequent simulated scenarios.
In the modelled conditions individual organisms experienced recurrent fluctuations of
light, temperature, and nutrient concentration gradients, due to the turbulent mixing in the
water column. Such variability of environmental constraints had significant effects on the
growth of the phytoplankton populations as a whole but did not support the prevalence of one
species over the other over the simulated time scale (20 days). The model results showed that
turbulent mixing might favour both species; in particular a stably stratified water column
sustained the optimal growth conditions of both populations, while a variable turbulent
mixing limited their growth reducing the photophysiological differences between the species.
Cianelli et al. (2009a) also investigated how the photophysiological responses of an
individual species to the environmental forcings were affected by the concomitant presence of
the other one. The comparison between individual species (each species developing alone in
the same environmental conditions) and coexisting species simulations suggested that the two
species mutually affected their photosynthetic capability in idealized environmental
conditions. On the other hand, in a more realistic scenario the turbulent mixing might support
the diversity of phytoplankton species composition in the water column.
Since an exhaustive review of all the IBM studies applied to phytoplankton
organisms is beyond the scope of the present work, we have illustrated here some of the most
representative IBMs analysing the influence of individual variance and photosynthetic
responses on the growth of bulk phytoplankton populations.

3. ZOOPLANKTON
Water column ecosystems are populated by a great variety of microscopic metazoans
collectively named zooplankton. They comprise the majority of taxa, covering a wide size
spectrum, from 10⁻⁶ m (microzooplankton) to 10¹ m (megazooplankton) (Sieburth et al.,
1978). In marine environments the most abundant zooplanktonic organisms are copepods,
while in freshwaters cladocerans are the most represented taxon; they both are crustaceans
with an average body length of 0.2-20 mm (mesozooplankton). Besides their numerical and
geographical importance, freely swimming zooplankters are primary actors in the functioning
of pelagic ecosystems. Zooplanktonic organisms are crucial for the transfer of matter and
energy between lower (e.g., phytoplankton) and higher (e.g., fish) trophic levels (Fowler and
Knauer, 1986), as well as for linking the inertial and the viscous realms (Naganuma, 1996).
Predation upon phytoplankton provides the energy for metabolic requirements, but
also results in the egestion of fecal pellets which, by passive sinking, enhance the vertical
fluxes of carbon from the upper layers towards the deep ocean. In addition, these small
inhabitants of aquatic systems can be efficient indicators of global-scale climate change
(Richardson, 2008). For these reasons, it is clear that understanding the ecology of these small
organisms can improve the current knowledge of the functioning of aquatic ecosystems and
their criticality with respect to global-scale warming issues. The function played by
zooplankton in large-scale dynamics, developing over spatial scales of the order of tens and
hundreds of meters and over time periods of hours and days, is mediated by the behaviour
displayed at the individual level, which instead occurs at the millimetre scale and over time
scales of the order of seconds. Despite the gap between them, these two scales are intimately
correlated: small-scale interactions with other organisms (prey, predators and mates) are regulated by the
individual behaviour and by the environmental abiotic factors (e.g., light, temperature), which
in turn affect the patterns observed at larger scales.
In the last two decades, the use of numerical models in zooplankton ecology has
burgeoned, as summarised by Carlotti et al. (2000). In this framework, models are mainly
used for three objectives (Carlotti et al., 2000):

1) to evaluate the fluxes of energy and matter in an ecological entity;
2) to study population dynamics as a function of changes in the environmental
properties;
3) to investigate the behaviour of different species.

Depending on the task addressed, different types of models can be used (Carlotti et
al., 2000). Despite their great potential, IBMs in this research area are still underexploited.
While the number of applications to fisheries research is substantial (as reviewed, e.g., in
Carlotti et al., 2000 and Neuheimer et al., 2010), the use of these models for copepods and
cladocerans is much less developed. In the following we will review applications of the
individual-based approach to copepods only, though several IBMs have been developed for
cladocerans also (e.g., Mooij and Boersma, 1996; Zadereev et al., 2003).
The first application of the individual-based approach to copepods is the work by
Batchelder and Miller (1989), who modelled the growth, development, reproduction and
death of Metridia pacifica over a one-year simulation. This model was subsequently refined
(Batchelder and Williams, 1995) to explain the consequences of the vertical distribution of
food for copepod growth.
IBMs can be coupled with other models, such as water circulation or ecosystem ones.
Miller et al. (1998) modelled the life history and population dynamics of the copepod
Calanus finmarchicus in the Georges Bank region. The IBM included sex- and stage-
dependent biological traits, while the circulation model was used to move the modelled
copepods in the fluid, both horizontally and vertically. Such integrated investigation allowed
the authors to identify aggregative hot-spots in the region and to relate them to possible
restocking mechanisms.
Carlotti and Wolf (1998) implemented an IBM of C. finmarchicus based on the
"Lagrangian-ensemble method" (Woods and Onken, 1982) and integrated it with a food
supply deriving from a one-dimensional Eulerian NPZD ecosystem model. The individual-
based part also included a realistic description of individual swimming over the water column
(vertical migrations). The results of this model matched the observed dynamics of C.
finmarchicus in the Norwegian Sea.
Souissi et al. (2004) built an IBM to simulate the whole life cycle of Centropages
abdominalis, while Souissi et al. (2005) used an IBM to study the population dynamics of
Eurytemora affinis. Their results identified specific space-time patterns describing the
dynamics of the copepods, with a good match between simulated and in situ observed
patterns.
Gentleman et al. (2008) compared different model typologies for copepod development
and proposed an alternative approach, the stage-based model of individuals, to represent the
progressive development through stages rather than using growth equations. The benefits of
this approach were demonstrated by replicating the development time for C. finmarchicus as
derived from laboratory trials. This approach was then used by Neuheimer et al. (2009) to
study the time-varying mortality in the nauplii of C. finmarchicus and by Neuheimer et al.
(2010) to model the recruitment of C. finmarchicus in the North-Western Atlantic, evaluating
the effects of temperature and chlorophyll-a on the physiological traits of this species.
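The idea of following individuals through discrete developmental stages, rather than integrating a growth equation, can be sketched as follows. This is only an illustration of the general principle and not the formulation of Gentleman et al. (2008): the stage names, the mean stage durations and the Gaussian individual variability are all hypothetical assumptions.

```python
import random


def simulate_cohort(n_individuals, mean_durations, cv, seed=0):
    """Return, for each individual, the age (days) at entry into each stage.

    mean_durations maps each pre-adult stage to a hypothetical mean
    duration in days (in developmental order); cv is the assumed
    coefficient of variation of stage duration among individuals.
    """
    rng = random.Random(seed)
    cohort = []
    for _ in range(n_individuals):
        age = 0.0
        entry_age = {}
        for stage, mean in mean_durations.items():
            entry_age[stage] = age
            # Individual stage duration, truncated at zero.
            age += max(0.0, rng.gauss(mean, cv * mean))
        # The age at the final moult is the total development time.
        entry_age["adult"] = age
        cohort.append(entry_age)
    return cohort
```

For example, `simulate_cohort(1000, {"egg": 2.0, "nauplius": 10.0, "copepodite": 20.0}, cv=0.2)` yields a distribution of development times across the cohort, which is the kind of output that could be compared with laboratory rearing data.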
Dur et al. (2009) modelled the reproduction of E. affinis using physiological parameters
such as female longevity, clutch size and interclutch duration. The model, validated through
laboratory experiments, underlined the effect of temperature on the above mentioned
parameters, while daily survival was shown to mostly affect the number of clutches produced.
Despite the Greek etymology, zooplanktonic organisms do not drift passively with the
water currents, but are actually capable of moving on their own through the rhythmic beating
of their swimming appendages. They typically swim in search of food and mates, while at the
same time they use escape strategies to avoid contact with predators. At the individual
scale, zooplankters exhibit a great variety of swimming repertoires (e.g., Strickler, 1977;
Mazzocchi and Paffenhöfer, 1999; Uttieri et al., 2004 and 2008), with species-specific and
intraspecific stage-dependent differences. The analysis of individual-scale motion contributes
to the comprehension of the adaptations to the varying ambient conditions, while at the same
time casting light on the large-scale behaviour as well as the population dynamics of a target
species (Dodson et al., 1997).
As mentioned above, IBMs are particularly suited to introduce behavioural rules, and
movement may be an important ingredient to stabilise the model and reproduce observed
patterns (Hosseini, 2006). In zooplankton ecology, a preliminary IBM-like approach to
copepod behaviour was used by Tiselius et al. (1993) who modelled the motility of the
copepod Acartia tonsa and of a ciliate of the genus Strombidium as a random walk with three
mobility patterns accounting for different kinetic behaviour. They aimed at verifying the role
of food patchiness and of functional responses on the foraging strategy and the predation risk
in zooplankton, and their results indicated that the risk of being captured must be traded off
with food intake to select an optimal strategy.
Later on, Leising and Franks (2000) implemented a 1D IBM of zooplankton motion in
search of food using a theoretical distribution of phytoplankton. Their simulations showed
that an area-restricted search behaviour increased the foraging efficiency, and indicated that a
model incorporating a behavioural rule was more beneficial than a pure random motion.
These outcomes were then corroborated by subsequent implementations of the model in a 2D
environment (Leising, 2001 and 2002), supporting the evidence that microscale patches are
critical for copepod growth and for ensuring sufficient energy intake.
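The contrast between a pure random walk and an area-restricted search can be sketched in one dimension. The example below is a toy illustration and not the model of Leising and Franks (2000): it assumes a single Gaussian food patch on a periodic domain and a forager that shortens its steps when the local food concentration exceeds a threshold. All names and parameter values are hypothetical.

```python
import math
import random


def food(x, patch_centre=50.0, patch_width=5.0):
    """Hypothetical Gaussian food patch (values between 0 and 1)."""
    return math.exp(-((x - patch_centre) / patch_width) ** 2)


def forage(n_steps, domain=100.0, base_step=2.0, restricted=True,
           threshold=0.5, seed=1):
    """Return the cumulative food intake and final position of one forager."""
    rng = random.Random(seed)
    x = rng.uniform(0.0, domain)
    intake = 0.0
    for _ in range(n_steps):
        local = food(x)
        intake += local
        # Area-restricted search: shorter steps where food is abundant,
        # so that the forager tends to remain inside the patch.
        if restricted and local > threshold:
            step = 0.2 * base_step
        else:
            step = base_step
        x += rng.choice((-1.0, 1.0)) * step
        x %= domain  # periodic domain
    return intake, x
```

Calling `forage(n_steps, restricted=True)` and `forage(n_steps, restricted=False)` over many seeds would allow the two rules to be compared; the outcome of such a comparison depends on the assumed patch size, step lengths and threshold, and the sketch makes no claim to reproduce the published results.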
Wiggert et al. (2005 and 2008) modelled the swimming behaviour and the foraging mode
of three tropical species (Clausocalanus furcatus, Oithona plumifera and Paracalanus
aculeatus) to evaluate the effects of turbulence, prey-size spectrum and frequency on their
grazing rates. Their simulations indicated that: C. furcatus preferred medium-sized, slowly
moving prey and moderate turbulent intensities; O. plumifera maximised the success of
encounter with large food particles and was favoured by high turbulent intensities; P.
aculeatus was capable of persisting in oligotrophic conditions and capturing smaller cells,
without being significantly affected by turbulence.
The consequences of turbulence upon individual copepod behaviour were also studied by
Mariani et al. (2005 and 2008) through an object-oriented IBM. In these works they focused
on the combined effect of homogeneous isotropic turbulence and contact duration between a
predator and its prey, both from a theoretical perspective (Mariani et al., 2005) and from
applications to real copepods (O. similis: Mariani et al., 2005 – O. davisae: Mariani et al.,
2008). Their simulations indicated that at realistic levels of turbulence the contact duration
was a limiting factor, and increasing intensities of turbulence might be detrimental to copepod
sensitivity. In addition, it was noticed that copepod reactiveness to turbulence was present
only as long as this stimulus remained below a given threshold (approximately one order of magnitude
higher than the typical copepod speed).
Theoretical IBMs by Uttieri et al. (2007) and Cianelli et al. (2009b) examined the role
of zooplankton motion behaviour and of the pattern of distribution of resources in determining the
probability of encountering a prey. The results of these works demonstrated that swimming
movement characterised by higher morphological complexity tallied more encounters than
smoother trajectories in uniform (Uttieri et al., 2007) and patchy (Cianelli et al., 2009b) prey
distributions. Uttieri et al. (2010) developed an IBM to compare the search strategy and
encounter success of two co-occurring marine copepods (C. furcatus and O. plumifera)
showing different motion rules and sensory performances. This model included the numerical
description of the swimming motion of the two copepods (C. furcatus: Mazzocchi and
Paffenhöfer, 1999; Uttieri et al., 2008 – O. plumifera: Paffenhöfer and Mazzocchi, 2002), as
well as a realistic reconstruction of their perceptive fields (C. furcatus: Uttieri et al., 2008 –
O. plumifera: Paffenhöfer and Mazzocchi, 2002). The encounter success of the two copepods
was tested in homogeneous and patchy distributions of prey, showing that C. furcatus scored
more encounters than O. plumifera, which, however, recorded higher search efficiencies.
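The basic bookkeeping behind such encounter experiments can be sketched as follows. This is a simplified illustration and not the code of any of the cited models: it assumes a two-dimensional correlated random walk whose tortuosity is set by the standard deviation of the turning angle, motionless prey, and a circular perceptive field. The function names and parameters are hypothetical.

```python
import math
import random


def make_track(n_steps, step, turn_sd, seed=0):
    """Correlated random walk; turn_sd (radians) controls the tortuosity."""
    rng = random.Random(seed)
    x, y, heading = 0.0, 0.0, 0.0
    track = [(x, y)]
    for _ in range(n_steps):
        heading += rng.gauss(0.0, turn_sd)
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        track.append((x, y))
    return track


def count_encounters(track, prey, radius):
    """Count prey lying within `radius` of any track point (each prey once)."""
    found = set()
    for x, y in track:
        for k, (px, py) in enumerate(prey):
            if k not in found and math.hypot(px - x, py - y) <= radius:
                found.add(k)
    return len(found)
```

With `turn_sd=0.0` the track is a straight line, while larger values give more convoluted paths; counting encounters for both kinds of track against uniform or clustered prey positions gives a rough analogue of the comparisons described above, under the stated assumptions.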

CONCLUSION
Plankton populations are composed of microscopic organisms that, even when belonging
to the same species, differ at the individual level from each other and interact with the aquatic
environment in a unique way. Planktonic organisms, being a product of evolutionary
processes, have developed adaptation strategies to better exploit the environmental resources.
The inter-individual and individual-environment interactions determine the emergent
properties of the entire population; as a consequence the investigation of the processes taking
place at the level of the individual provides a deeper comprehension of the functioning of
aquatic systems.
An important contribution to the understanding of plankton behaviour and ecology is thus
provided by the individual-based numerical approach (IBMs) which is particularly
appropriate to reproduce the complexity of plankton ecosystems. This promising tool has
been extensively used for a number of applications, revealing important aspects about
population dynamics and behavioural adaptations. In this work we have briefly summarized
the applications of IBMs to the field of phyto- and zooplankton ecology and behaviour. The
results synthesized here highlight that the dynamics of the individual physiological responses
is often non-linear and that extreme behaviour may play a crucial role. In such a framework,
an approach based on individuals is obviously the best candidate for a realistic numerical
description of plankton ecology.

REFERENCES
Batchelder, H. P.; Miller, C. B. Ecol. Model. 1989, 48, 113-136.
Batchelder, H. P.; Williams, R. ICES J. Mar. Sci. 1995, 52, 469-482.
Benenson, I.; Martens, K.; Birfir, S. Comput. Environ. Urban. 2008, 32, 431-439.
Berger, U.; Piou, C.; Schiffers, K.; Grimm, V. Perspect. Plant. Ecol. 2008, 9, 121-135.
Beyer, J. E.; Laurence, G. C. Ecol. Model. 1980, 8, 109-132.
Broekhuizen, N. J. Plankton Res. 1999, 21, 1191–1216.
Carlotti, F.; Wolf, K. U. Fish. Oceanogr. 1998, 7, 191-204.
Carlotti, F.; Giske, J.; Werner, F. In ICES Zooplankton Methodology Manual; Harris, R. P.;
Wiebe, P. H.; Lenz, J.; Skjoldal, H. R.; Huntley, M. (Eds.); Academic Press: San Diego,
2000; pp 571-667.
Caswell, H.; John, A. M. In Individual-Based Models and Approaches in Ecology:
Populations, Communities and Ecosystems; DeAngelis, D. L.; Gross, L. J. (Eds.);
Chapman and Hall: New York, 1992; pp 36-61.
Cianelli, D.; Ribera d'Alcalà, M.; Saggiomo, V.; Zambianchi, E. Antarct. Sci. 2004, 16, 133-
142.
Cianelli, D.; Sabia, L.; Ribera d'Alcalà, M.; Zambianchi, E. Ecol. Model. 2009a, 220, 2380-
2392.
Cianelli, D.; Uttieri, M.; Strickler, J. R.; Zambianchi, E. Ecol. Model. 2009b, 220, 596-604.
Cope, D. R. Nonlinear Anal.-Real. 2005, 6, 691-704.
DeAngelis, D. L.; Gross, L. J. Individual-Based Models and Approaches in Ecology:
Populations, Communities and Ecosystems; Chapman and Hall: New York, 1992; pp
525.
DeAngelis, D. L.; Cox, D. K.; Coutant, C. C. Ecol. Model. 1980, 8, 133-148.
DeAngelis, D. L.; Rose, K. A.; Huston, M. A. In Frontiers in Mathematical Biology; Levin,
S. A. (Ed.); Springer: Berlin, 1994; pp 390-410.
DeAngelis, D. L.; Barnthouse, L. W.; van Winkle, W.; Otto, R. G. J. Great Lakes Res. 1990,
16, 576-590.
Dippner, J. J. Mar. Syst. 1998, 14, 181-198.
Dodson, S. I.; Ryan, S.; Tollrian, R.; Lampert, W. J. Plankton Res. 1997, 19, 1537-1552.
Dur, G.; Souissi, S.; Devreker, D.; Ginot, V.; Schmitt, F. G.; Hwang, J.-S. Ecol. Model. 2009,
220, 1073-1089.
Esposito, S.; Botte, V.; Iudicone, D.; Ribera d'Alcalà, M. J. Theor. Biol. 2009, 261, 361-371.
Evans, G. T.; Taylor, F. J. R. Limnol. Oceanogr. 1980, 25, 840–845.
Falkowski, P.; Wirick, C. D. Mar. Biol. 1981, 65, 69-75.
Falkowski, P.; La Roche, J. J. Phycol. 1991, 27, 8-14.
Falkowski, P. G.; Greene, R.; Kolber, Z. In Photoinhibition of Photosynthesis from Molecular
Mechanisms to the Field; Baker, N. R.; Bowyer, J. R. (Eds.); Bios Scientific Publ.:
Oxford, 1994; pp 407-432.
Fennel, W.; Osborn, T. Deep Sea Res. II. 2005, 52, 1344-1357.
Figueiras, F. G.; Arbones, B.; Estrada, M. Limnol. Oceanogr. 1999, 44, 1599-1608.
Fowler, S. W.; Knauer, G. A. Prog. Oceanogr. 1986, 16, 147-194.
Gentleman, W. C.; Neuheimer, A. B.; Campbell, R. G. ICES J. Mar. Sci. 2008, 65, 399-413.
Grimm, V. Ecol. Model. 1999, 115, 129-148.
Grimm, V.; Wyszomirski, T.; Aikman, D.; Uchmanski, J. Ecol. Model. 1999, 115, 275-282.
Grimm, V.; Berger, U.; DeAngelis, D. L.; Polhill, J. G.; Giske, J.; Railsback, S. F. Ecol.
Model. 2010, 221, 2760-2768.
Grimm, V.; Berger, U.; Bastiansen, F.; Eliassen, S.; Ginot, V.; Giske, J.; Goss-Custard, J.;
Grand, T.; Heinz, S. K.; Huse, G.; Huth, A.; Jepsen, J. U.; Jørgensen, C.; Mooij, W. M.;
Müller, B.; Pe'er, G.; Piou, C.; Railsback, S. F.; Robbins, A. M.; Robbins, M. M.;
Rossmanith, E.; Rüger, N.; Strand, E.; Souissi, S.; Stillman, R. A.; Vabø, R.; Visser, U.;
DeAngelis, D. L. Ecol. Model. 2006, 198, 115-126.
Hosseini, P. R. Ecol. Model. 2006, 194, 357-371.
Huston, M.; DeAngelis, D. L.; Post, W. BioScience. 1988, 38, 682-691.
Ising, E. Zeitschr. Physik. 1925, 31, 253-258.
Janowitz, G. S.; Kamykowski, D. Ecol. Model. 1999, 118, 237-247.
Judson, O. P. TREE. 1994, 9, 9-14.
Kamykowski, D. Mar. Biol. 1979, 50, 289-303.
Kamykowski, D.; Yamazaki, H.; Janowitz, G. S. J. Plankton Res. 1994, 16, 1059-1069.
Lande, R.; Lewis, M. R. Deep Sea Res. I. 1989, 36, 1161-1175.
Leising, A. W. Mar. Ecol. Prog. Ser. 2001, 216, 167-179.
Leising, A. W. Mar. Models. 2002, 2, 1-18.
Leising, A. W.; Franks, P. J. S. J. Plankton Res. 2000, 22, 999-1024.
Lewis, M. R.; Cullen, J. J.; Platt, T. Mar. Ecol. Prog. Ser. 1984, 15, 141-149.
Łomnicki, A. Ecol. Model. 1999, 115, 191-198.
Mac Nally, R. Ecol. Model. 1997, 99, 229-245.
Maley, C. C.; Caswell, H. Ecol. Model. 1993, 68, 75-89.
Mann, K. H.; Lazier, J. R. N. Dynamics of Marine Ecosystems: Biological Physical
Interactions in the Oceans; Blackwell Scientific Publications: Oxford, 1996; pp 394.
Mariani, P.; Botte, V.; Ribera d'Alcalà, M. Deep Sea Res. II. 2005, 52, 1287-1307.
Mariani, P.; Botte, V.; Ribera d'Alcalà, M. J. Mar. Syst. 2008, 70, 273-286.
Marra, J. Mar. Biol. 1978a, 46, 203-208.
Marra, J. Mar. Biol. 1978b, 46, 191-202.
Martin, J. H.; Fitzwater, S. E.; Gordon, R. M. Global Biogeochem. Cy. 1990, 4, 5-12.
May, R. M. Science. 1974, 186, 645-647.
May, R. M. Nature. 1976, 261, 459-467.
Mazzocchi, M. G.; Paffenhöfer, G.-A. J. Plankton Res. 1999, 21, 1501-1518.
Mellor, G. L.; Yamada, T. Rev. Geophys. Space Phys. 1982, 20, 851-875.
Metz, J. A. J.; Diekmann, O. The Dynamics of Physiologically Structured Populations.
Lecture Notes in Biomathematics; Springer-Verlag: Berlin, 1986; pp 511.
Miller, C. B.; Lynch, D. R.; Carlotti, F.; Gentleman, W.; Lewis, C. V. W. Fish. Oceanogr.
1998, 7, 219-234.
Mooij, W. M.; Boersma, M. Ecol. Model. 1996, 93, 139-153.
Nagai, T.; Yamazaki, H.; Kamykowski, D. Mar. Ecol. Prog. Ser. 2003, 265, 17–30.
Naganuma, T. Mar. Ecol. Prog. Ser. 1996, 136, 311-313.
Neuheimer, A. B.; Gentleman, W. C.; Galloway, C. L.; Johnson, C. L. Fish. Oceanogr. 2009,
18, 147-160.
Neuheimer, A. B.; Gentleman, W.; Pepin, P.; Head, E. J. H. J. Mar. Syst. 2010, 81, 122-133.
Nogueira, E.; Woods, J.; Harris, C.; Field, A. J.; Talbot, S. Ecol. Model. 2006, 198, 1-22.
Paffenhöfer, G.-A.; Mazzocchi, M. G. J. Plankton Res. 2002, 24, 129-135.
Parry, H. R.; Evans, A. J. Ecol. Model. 2008, 214, 141-152.
Railsback, S. F. Ecol. Model. 2001, 139, 47-62.
Richardson, A. J. ICES J. Mar. Sci. 2008, 65, 279-298.
Scheffer, M.; Baveco, J. M.; DeAngelis, D. L.; Rose, K. A.; van Nes, E. H. Ecol. Model.
1995, 80, 161-170.
Sieburth, J. M.; Smetacek, V.; Lenz, J. Limnol. Oceanogr. 1978, 23, 1256-1263.
Souissi, S.; Ginot, V.; Seuront, L.; Uye, S.-I. In Handbook of Scaling Methods in Aquatic
Ecology - Measurements, Analysis, Simulation; Seuront, L.; Strutton, P. G. (Eds.); CRC
Press: Boca Raton, 2004; pp 523-542.
Souissi, S.; Seuront, L.; Schmitt, F. G.; Ginot, V. Nonlinear Anal.-Real. 2005, 6, 705-730.
Starfield, A. M.; Smith, K. A.; Bleloch, A. L. How to Model It: Problem Solving for the
Computer Age; McGraw-Hill: New York, 1990; pp 206.
Strickler, J. R. Limnol. Oceanogr. 1977, 22, 165-170.
Tiselius, P.; Jonsson, P.; Verity, P. G. Bull. Mar. Sci. 1993, 53, 247-264.
Uchmański, J.; Grimm, V. TREE. 1996, 11, 437-441.
Uttieri, M.; Paffenhöfer, G.-A.; Mazzocchi, M. G. Mar. Biol. 2008, 153, 925-935.
Uttieri, M.; Nihongi, A.; Mazzocchi, M. G.; Strickler, J. R.; Zambianchi, E. J. Plankton Res.
2007, 29, i17-i26.
Uttieri, M.; Sabia, L.; Cianelli, D.; Strickler, J. R.; Zambianchi, E. J. Mar. Syst. 2010, 81,
112-121.
Uttieri, M.; Mazzocchi, M. G.; Nihongi, A.; Ribera d'Alcalà, M.; Strickler, J. R.; Zambianchi,
E. J. Plankton Res. 2004, 26, 99-105.
Wiggert, J. D.; Hofmann, E. E.; Paffenhöfer, G.-A. ICES J. Mar. Sci. 2008, 65, 379-398.
Wiggert, J. D.; Haskell, A. G. E.; Paffenhöfer, G.-A.; Hofmann, E. E.; Klinck, J. M. J.
Plankton Res. 2005, 27, 1013-1031.
Wissel, C. Ecol. Model. 1992, 63, 1-12.
Wolfram, S. Cellular Automata and Complexity: Collected Papers; Addison-Wesley:
Reading, 1994; pp 596.
Woods, J.; Onken, R. J. Plankton Res. 1982, 4, 735-756.
Woods, J.; Barkmann, W. Phil. Trans. R. Soc. London B. 1994, 343, 27-31.
Wroblewski, J. S. Ocean Sci. Eng. 1983, 8, 245-285.
Yamazaki, H.; Kamykowski, D. Deep Sea Res. I. 1991, 38, 219-241.
Zadereev, E. S.; Prokopkin, I. G.; Gubanov, V. G.; Gubanov, M. V. Ecol. Model. 2003, 162,
15-31.
In: Ecological Modeling ISBN: 978-1-61324-567-5
Editor: WenJun Zhang, pp. 97-114 © 2012 Nova Science Publishers, Inc.

Chapter 6

THE EFFECTIVENESS OF ARTIFICIAL NEURAL
NETWORKS IN MODELLING THE NUTRITIONAL
ECOLOGY OF A BLOWFLY SPECIES

Michael J. Watts¹, Andre Bianconi²*, Adriane Beatriz S. Serapiao³,
Jose S. Govone³ and Claudio J. Von Zuben²
¹ School of Earth and Environmental Sciences, The University of Adelaide.
² Departamento de Zoologia, Instituto de Biociências – Unesp – São Paulo
State University, (postcode 13506-900),
Avenida 24-A, 1515, Bela Vista, Rio Claro-SP, Brazil.
³ DEMAC – Unesp – São Paulo State University.

ABSTRACT
The larval phase of most blowfly species is considered a critical developmental
period in which intense limitation of feeding resources frequently occurs. Furthermore,
such a period is characterised by complex ecological processes occurring at both
individual and population levels. These processes have been analysed by means of
traditional statistical techniques such as simple and multiple linear regression models.
Nonetheless, it has been suggested that some important explanatory variables could well
introduce non-linearity into the modelling of the nutritional ecology of blowflies. In this
context, dynamic aspects of the life history of blowflies could be clarified and detailed by
the deployment of machine learning approaches such as artificial neural networks
(ANNs), which are mathematical tools widely applied to the resolution of complex
problems. A distinguishing feature of neural network models is that their effective
implementation is not precluded by the theoretical distribution of the data used.
Therefore, the principal aim of this investigation was to use neural network models
(namely multi-layer perceptrons and fuzzy neural networks) in order to ascertain whether
these tools would be able to outperform a general quadratic model (that is, a second-order
regression model with three predictor variables) in predicting pupal weight values
(outputs) of experimental populations of Chrysomya megacephala (F.) (Diptera:
Calliphoridae), using initial larval density (number of larvae), amount of available food,
and pupal size as input variables. These input variables may have generated non-linear
variation in the output values, and fuzzy neural networks provided more accurate
outcomes than the general quadratic model (i.e. the statistical model). The superiority of
fuzzy neural networks over a regression-based statistical method is an
important result, because more accurate models may well clarify several intricate aspects
of the nutritional ecology of blowflies. Additionally, the extraction of fuzzy rules
from the fuzzy neural networks provided an easily comprehensible way of describing
what the networks had learnt.

* Email address: drebianconi@yahoo.com.br

Keywords: regression models; life history; neural algorithms; larval phase; pupal mass.

1. INTRODUCTION
It is a well-known fact that several species of blowflies can be mechanical vectors of
pathogenic microorganisms (Zumpt, 1965; Guimarães et al., 1978; Furlanetto et al., 1984;
Laurence, 1986; Lima and Luz, 1991). In addition, blowflies have been increasingly utilised
in forensic studies with the purpose of determining post-mortem intervals (Greenberg, 1991;
Catts and Goff, 1992; Arnaldos et al., 2005; Gomes and Von Zuben, 2005; Shiao and Yeh,
2008; Cammack and Nelder, 2010). From an ecological perspective, detailed studies of the
dynamics of these insects are also very useful because some blowflies may have important
implications for the field of invasion ecology (Cammack and Nelder, 2010). Therefore,
comprehending every developmental stage of blowflies may well be considered essential both
for medico-criminal analyses and for ecological research as a whole (Bianconi et al.,
2010a, 2010b).
Chrysomya megacephala (Fabricius) (Diptera: Calliphoridae), for example, represents a
blowfly of well-known medical and veterinary importance that is able to cause facultative
myiasis in humans and animals (Zumpt, 1965; Guimarães et al., 1978; Furlanetto et al., 1984;
Laurence, 1986; Lima and Luz, 1991; Gabre et al., 2005). This species is native to the
Oriental zoogeographic region, but it is now established in parts of South America following
accidental introduction (Guimarães et al., 1978; Laurence, 1981; Wells, 1991). Moreover, in
the field of forensic entomology, the genus Chrysomya has been utilised as a useful biological
indicator (Greenberg, 1991; Catts and Goff, 1992; Arnaldos et al., 2005; Gomes and Von
Zuben, 2005; Shiao and Yeh, 2008; Cammack and Nelder, 2010). Owing to the medico-
criminal, biological, and ecological utility of this species, several studies have been conducted
for analysing its bionomic traits in a detailed manner (e.g. Von Zuben et al., 1993, 2000,
2001; Bianconi et al., 2010a, 2010b; Hu et al., 2010).
With respect to the larval phase of C. megacephala, this developmental stage is
considered a critical period in which intense limitation of resources may well occur (Levot et
al., 1979; Goodbrod and Goff, 1990; Reis et al., 1994). This limitation can lead to dynamic
competitive processes (Shiao and Yeh, 2008), wherein each larva attempts to feed off the
available resources, scrambling to exploit the feeding substrate before the depletion of the
food resource (Ullyett, 1950; de Jong, 1976; Lomnicki, 1988; Von Zuben et al., 2001).
Therefore, such exploitative events are characterised by interrelated processes that take place
at both the individual and population levels (Wijesundara, 1957; Herzog et al., 1992; Von
Zuben et al., 2001; Tammaru et al., 2004). Moreover, it has been suggested that both larval
The Effectiveness of Artificial Neural Networks … 99

density and availability of food affect the competition for food in a concurrent manner.
Hence, it is very useful and important to investigate the crowding level of immature
individuals on the feeding resources by means of simultaneous variations of larval densities
and amounts of food (Von Zuben et al., 2000; Ireland and Turner, 2006).
The outcomes of exploitative competition for food resources may determine population
parameters such as survival, fecundity, weight and size of the resultant adults, so that the
variation of these bionomic features could well be influenced by the immatures' population
density (Von Zuben et al., 2000; Ireland and Turner, 2006; Shiao and Yeh, 2008). Von Zuben
et al. (1993), for instance, regard the number of emerging adults as a variable that tends to
decrease with an increase in the number of immature individuals of C. megacephala. Hence,
profitable mass-production techniques usually demand the determination of feasible cost-
benefit relationships between larval density and amount of available food (Papandroulakis et
al., 2000; Von Zuben et al., 2001).
Regarding the analysis of bionomic features of insects, pupal weight is considered to be
an important variable for comprehending potential relationships between larval and pupal
weight of Cochliomyia hominivorax (Coquerel) (Calliphoridae) (Peterson II and Candido,
1987). Investigating the thermal requirements for development of nymphalid species, Bryant
et al. (1997) took pupal weight into consideration with the purpose of determining possible
connections between relative performance and distribution. Tammaru et al. (1996) analysed
the pupal weight of Epirrita autumnata (Borkhausen) (Lepidoptera: Geometridae) in order to
assess the relationship between body size and realised fecundity, and Tammaru et al. (2004)
stated that the pupal weight of this same species could be affected by starvation treatments.
Pupal mass was also utilised in analysing dietary specialisation aspects of a micro-
lepidopteran culture that had been maintained on an artificial diet for approximately 350
generations (Warbrick-Smith et al., 2009). Furthermore, pupal weight has been widely
utilised in several works concerning use of different substrates for rearing dipteran larvae
(Brewer, 1992; Friese, 1992; Chaudhury and Alvarez, 1999; Tachibana and Numata, 2001;
Chang et al., 2004).
Unsuitable or inaccurate predictive models may hinder the effective comprehension of
the underlying principles that govern several biological processes, including the nutritional
ecology of blowflies. For example, mass rearing of larvae and pupae with the purpose of
using these immatures as food or bait for other animals constitutes a process in which
adequate models could effectively ameliorate its implementation (Von Zuben et al., 1993,
2000, 2001). Additionally, the suitable conditions which yield pupae containing higher
concentrations of energy (higher biomass) could be assessed and established, using analytical
tools that permit the implementation of mathematical models with higher prediction
capability (Von Zuben et al., 1993, 2000; Bianconi et al., 2010a, 2010b).
In this context, the complexities of the nutritional ecology of blowflies could be clarified
and detailed by the deployment of appropriate modelling techniques such as artificial neural
networks (ANNs), which are mathematical tools widely applied to the resolution of complex
biological problems. A notable feature of artificial neural networks is their independence
from any assumptions about the theoretical distribution of the data used (Bishop, 1995;
Haykin, 1999; Bryant and Shreeve, 2002; Pearson et al., 2002; Zhang and Barrion, 2006).
Moreover, if the dimensional features of a specific system are too complex for a conventional
regression statistical model, artificial neural networks may represent a more effective
100 Michael J. Watts, Andre Bianconi, Adriane Beatriz S. Serapiao et al.

modelling tool (Schultz and Wieland, 1997; Haykin, 1999; Schultz et al., 2000; Zhang and
Wei, 2009).
Neural network algorithms are founded on the construction of models that may possess a
large number of simple processing units (that is, neurons or nodes) that contain several
connections between them and are usually lined up in layers. The number of neurons in the
input layer represents the variables that will be used to feed the neural network and should be
the most relevant variables for the problem in question (Bishop, 1995; Haykin, 1999). ANNs
were conceived with the aim of imitating the functionality of the human brain (Haykin, 1999;
Huang, 2009; Huang et al., 2010). Thus, part of the terminology used in the area of artificial
neural networks, namely neurons, synapses, learning, layers, etc., is due to such a fact.
However, it is important to emphasise that those terms are only associated with mathematical
functions or the method of implementing them.
Studies of process-based neural network models are not as frequent in ecological and
environmental areas as they are in engineering applications (Zhang and Zhang, 2008; Huang,
2009). However, considerable improvements have been achieved in recent years (Zhang and
Zhang, 2008; Huang, 2009; Zhang and Wei, 2009). In the broader context of ecosystem
dynamics, neural networks have been successfully utilised. For example, the abundance of
selected water insects in a small stream was predicted by means of neural models (Obach et
al., 2001). Environmental data collected as part of a study of microhabitat use by butterflies
were utilised with the aim of evaluating the potential for using neural modelling in developing
predictive models of microhabitat temperature (Bryant and Shreeve, 2002). Furthermore,
other multidisciplinary studies have described the development of scale-independent models,
based on coupling artificial neural networks with climate-hydrological process models in
order to simulate species' distribution, including insect species (Pearson et al., 2002; Pearson
and Dawson, 2003; Harrison et al., 2006).
Worner and Gevrey (2006) used a self-organising map, which is an artificial neural
network model, with the purpose of identifying global pest species assemblages and potential
invasive insects, including dipteran species. Zhang and Barrion (2006) conducted function
approximation and documentation on sampling data using neural networks, based on
invertebrate data sampled in an irrigated rice field. Howe et al. (2007) demonstrated the value
of using neural networks to predict body temperature and activity of insects by means of
modelling the body temperature and activity of a widespread butterfly species in relation to
weather. With the purpose of improving the prediction accuracy of potential species invasion,
Watts and Worner (2008) deployed two types of biotic factors in ANN models to predict
global establishment of selected phytophagous insect species, and such predictions were then
combined with those derived from an ANN model based on abiotic (climate) factors.
With respect to mass-production techniques, the adequate development of feasible
automatic feeding devices for rearing larval individuals could well benefit from the use of
neural network-based models (Papandroulakis et al., 2000), and effective control strategies of
insect pests usually rely on the knowledge derived from basic research on oviposition rate,
adult emergence, larval development until pupation, etc. (Von Zuben et al., 2000; Köppler et
al., 2009). In this context, neural networks may be useful to establish the best possible cost-
benefit relationships between larval density and amount of food, with the aim of producing
heavier pupae and minimising production costs.
Zhang and Zhang (2008) analysed the effectiveness of neural networks in modelling
survival process and mortality distribution of Spodoptera litura (Fabricius) (Lepidoptera:
Noctuidae), emphasising the importance of five different temperatures to such an evaluation.
Zhang et al. (2008a) fitted and recognised spatial distribution patterns of grassland insects
using various neural networks, and Zhang et al. (2008b) employed conventional models and
functional link artificial neural networks in modelling the accumulated food intake of larvae
of S. litura. Additionally, the risk of insect species invasion was assessed by means of neural
models (Watts and Worner, 2009). Zhang and Wei (2009) proposed a neural network model
for state space modelling, using data derived from a natural grassland area that contained
dipteran species as well as other arthropods. With respect to blowfly species, the nutritional
ecology of C. megacephala was investigated by Bianconi et al. (2010a, 2010b), using both a
relatively small sample size (Bianconi et al., 2010a) and a relatively large data set
(Bianconi et al., 2010b).
The nutritional ecology of blowflies has rarely been analysed by means of neural models,
and the same is true of other insect species. The two studies which investigated bionomic
features of blowflies (i.e. Bianconi et al., 2010a, 2010b) provided reasonable outcomes.
Nonetheless, these same authors did suggest that their investigation represented only a first
approach to modelling the nutritional ecology of C. megacephala. Specifically, Bianconi et al.
(2010b) utilised three well-known neural networks in order to ascertain whether these tools
would be able to outperform a classical statistical method (i.e. first-order multiple linear
regression) in predicting pupal weight values (i.e. output variable) of experimental
populations of C. megacephala, using larval density (i.e. initial number of larvae), amount of
available food, and pupal size as input data. Therefore, the current investigation is aimed at
implementing more accurate neural models than those utilised by Bianconi et al. (2010b),
using exactly the same input and output data. Owing to the well-known importance of C.
megacephala for forensic and ecological analyses, more accurate outcomes should be derived
from more robust neural network models.

2. MATERIALS AND METHODS


2.1. Chrysomya megacephala Collection and Rearing

Adult blowflies were captured around the campus of the Universidade Estadual de
Campinas-Unicamp in Campinas, São Paulo state, Brazil (22° 49′ 9.52″ S and 47° 4′ 12.54″
W). The specimens were then identified and kept in nylon net cages (30 × 30 × 48 cm). Prior
to identifying the blowflies, the individuals were anaesthetised in a freezer at −18 °C for 30 s.
These insects were provided with water and refined sugar ad libitum and taken as the parental
generation of this investigation. The cages were kept at room temperature (25 ± 1 °C), 60 ±
10% relative humidity, and light:dark regime of 12:12 h. In order to induce the development
of the gonotrophic cycle, females were supplied with beef liver. For the formation of the next
generations, ovipositions were obtained using small pots containing decaying beef that were
put into the cages in order to stimulate egg laying.
Five proportions of larvae to amount of food were considered (i.e. 5, 10, 20, 30, and 40
larvae/g) in the current work, and they were obtained as follows: 75, 150, 300, 450, and 600
larvae were put into pots that contained 15g of food; 150, 300, 600, 900, and 1200 larvae in
pots containing 30g of food; 300, 600, 1200, 1800, and 2400 larvae in pots containing 60g of
food; and 450, 900, 1800, 2700, and 3600 larvae in pots with 90g of an artificial diet
proposed by Leal et al. (1982). Glass pots (8 cm in height x 7 cm in diameter) containing four
amounts (i.e. 15, 30, 60, and 90g) of the artificial diet were utilised. As to the pupal stage,
pupae were individually weighed, and pupal sizes were taken on the eighth day after the peak
of the larval hatching period. Additional experimental details regarding the formation of the
larval densities and amounts of food are provided in Von Zuben et al. (2000) and Bianconi et
al. (2010a).
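The arithmetic behind this design can be checked with a short sketch. The variable names are hypothetical; the only assumption is that the initial number of larvae in each pot equals the larval proportion (larvae/g) multiplied by the amount of diet (g), which reproduces the figures listed above.

```python
# Larval proportions (larvae per gram) and amounts of diet (g) listed in the text.
densities_per_gram = [5, 10, 20, 30, 40]
food_amounts_g = [15, 30, 60, 90]

# Initial number of larvae per pot = proportion (larvae/g) * amount of food (g).
design = {
    food: [density * food for density in densities_per_gram]
    for food in food_amounts_g
}

for food, larvae in design.items():
    print(f"{food:>2} g of diet: {larvae}")

# Number of density-by-food treatments in the design.
n_treatments = sum(len(larvae) for larvae in design.values())
print("treatments:", n_treatments)
```

The four rows printed by the loop correspond to the four pot sizes, giving 20 combinations of larval number and amount of food.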

2.2. Variables

Pupal weight was the dependent variable (output), and the other three were considered
the independent or explanatory variables (inputs), namely larval density (that is, initial
number of larvae), amount of available food (in grams), and pupal size (in mm). These
variables were chosen based on their biological significance for the comprehension of the
nutritional ecology of blowflies. Combinations of the three input variables provided 1114
output values.

2.3. Statistical Model

In several practical situations, neural networks and statistical models can both be utilised,
because they may respond to the same question, albeit with important differences in the
accuracy of the outcomes. Therefore, the performance of neural models is usually compared
with that derived from statistical methods (Schultz and Wieland, 1997; Bianconi et al.,
2010b). Hence, a general quadratic regression method was used as the statistical 'counterpart'
of the neural models deployed in the current investigation.
Bianconi et al. (2010a; 2010b) utilised a first-order linear regression method in order to
compare the outcomes derived from implementing well-known neural network models to the
results obtained by using the statistical model. Nonetheless, these same authors suggested that
each one of the independent variables might have produced distinct output responses. Overall,
such explanatory attributes might have generated non-linear variation in the output variable
(i.e. pupal weight), and the accuracy of conventional regression methods such as the first-
order regression models may be significantly reduced in the presence of non-linearity (Neter
et al., 1996; Schultz and Wieland, 1997). Therefore, in the present investigation, instead of
using the traditional first-order regression model, we utilised a second-order regression model
(i.e. a general quadratic model) as the statistical benchmark against which the performance of
three types of artificial neural network (ANN) algorithms was compared.
It is important to emphasise that the general quadratic model (GQM) is still a linear
regression model in the statistical sense, as is the first-order model deployed by Bianconi et
al. (2010a, 2010b). The term 'linear model' refers to the fact that the model is linear in the
parameters; this linearity is not associated with the shape of the response surface. In
simple terms, a general quadratic regression (GQM) may be described in the following form:

$$\hat{y} = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + b_{12} x_1 x_2 + b_{13} x_1 x_3 + b_{23} x_2 x_3 + b_{11} x_1^2 + b_{22} x_2^2 + b_{33} x_3^2 \qquad (1)$$

in which ŷ represents the output (i.e. predicted pupal weight values); x1 = initial larval density,
x2 = amount of available food (g), x3 = pupal size (mm); a is a constant; and bi, bii, and bij
represent regression coefficients. The regression coefficients bij (i.e. b12, b13, and b23) may be
termed 'interaction effect coefficients' for interactions between pairs of predictor variables. A
general quadratic regression such as (1) is considered to be a second-order regression model
because the input variables may be expressed in the model to the first and second powers
(Kutner et al., 2004). On the other hand, the traditional regression model implemented by
Bianconi et al. (2010a, 2010b) is deemed to be a first-order regression model in which the
effects of the explanatory variables on the mean response of the output variable do represent
additive effects that do not interact with one another (Kutner et al., 2004). Hence, it is
expected that the second-order model described in the present work would be capable of
providing more accurate outcomes than the traditional first-order regression model deployed
by Bianconi et al. (2010b) in predicting pupal weight values.
The entire data set was utilised in fitting the general quadratic model, because the
division of the data set into training and test subsets is a procedure suited to neural network
modelling, whereas a regression model is conventionally derived from the whole data set.
That is, if entomologists were to deploy this quadratic statistical model in conducting
experiments, they would fit it to the entire data set.
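As an illustration of how a model of the form of Equation (1) may be fitted, the following is a minimal sketch of ordinary least squares in plain Python. The chapter does not state which software was used, so the function names (quadratic_features, solve_linear_system, fit_gqm) and the synthetic data are assumptions made purely for demonstration; no experimental data are reproduced.

```python
def quadratic_features(x1, x2, x3):
    """Return the ten regressors of Equation (1), in the order of the equation."""
    return [1.0, x1, x2, x3, x1 * x2, x1 * x3, x2 * x3, x1 ** 2, x2 ** 2, x3 ** 2]


def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def fit_gqm(rows, y):
    """Least-squares estimates via the normal equations (X'X) beta = X'y."""
    X = [quadratic_features(*row) for row in rows]
    p = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    Xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


# Synthetic check: noise-free data generated from known (made-up) coefficients
# on a 3 x 3 x 3 grid, so the fit should recover those coefficients.
true_beta = [1.0, -0.5, 0.25, 2.0, 0.1, -0.2, 0.3, 0.05, -0.15, 0.7]
rows = [(a, b, c) for a in range(3) for b in range(3) for c in range(3)]
y = [sum(t * f for t, f in zip(true_beta, quadratic_features(*r))) for r in rows]
beta = fit_gqm(rows, y)
print([round(b, 4) for b in beta])
```

The normal equations are used here only for brevity. With predictors on very different scales, such as larval numbers in the thousands alongside pupal sizes of a few millimetres, a QR-based least-squares routine from a numerical library would be numerically safer.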
In the current paper, the coefficient of determination (R²) and the root mean square error
(RMSE) were utilised in assessing the performance of the statistical and neural network
models. Apart from these metrics, a residual plot (i.e. plot of residuals against fitted values)
was utilised in order to examine the appropriateness of the regression model in relation to the
constancy of the variance of the error terms. The statistical assumptions of the general
quadratic model (GQM) followed the detailed descriptions in Kutner et al. (2004). The ANN
models used in this work will be described in the following section.
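The two metrics may be sketched as follows. The chapter does not specify which definition of R² was applied; the version below assumes the common form 1 − SSres/SStot, and the example values are invented for illustration only.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented example values (not taken from the experiment).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE and a higher R² both indicate closer agreement between predicted and observed pupal weights, but the two metrics need not rank models identically when they are computed on different subsets of the data.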

2.4. Artificial Neural Networks

Artificial neural networks are able to model complex non-linear systems, even when the
exact nature of any relationships is unknown (Schultz and Wieland, 1997; Schultz et al.,
2000; Bryant and Shreeve, 2002; Howe et al., 2007; Zhang et al., 2008a). The framework for
deploying neural network models is underpinned by the concept of artificial neurons (that is,
processing elements) that are usually laid out in a parallel fashion (Haykin, 1999). On the
whole, interconnected layers in combination with their neurons constitute the architecture of
neural networks, in which each neuron in a given layer is connected only to neurons of
other layers. Neurons are usually interconnected by means of 'synaptic weights' (Cheng
and Titterington, 1994; Haykin, 1999). In outline, the training of a neural network comprises
the act of presenting the neural network model with input and/or output data; the creation and
implementation of connections that are able to recognise patterns; and the learning of such
patterns based on the relationship between input and output data sets via the adaptation of
specific synaptic weights to varying input data (Cheng and Titterington, 1994; Bishop, 1995;
Haykin, 1999).
A widely used ANN is the multi-layer perceptron, or MLP (Crick, 1989). This model
consists of three layers of artificial neurons: an input layer, where there is one neuron for each
input variable; a hidden neuron layer, where each neuron has incoming connections from the
input layer; and the output layer, where there is one neuron for each output variable, and each
output neuron has incoming connections from the hidden layer neurons. The hidden layer gets
its name from the fact that it is not directly exposed to either the input or output variables: it is
'hidden' from the outside world. Training a MLP involves setting the values of the
connection weights, and the most commonly used method of training MLP is
backpropagation of errors (Rumelhart et al., 1986). This is a gradient-descent error
minimisation algorithm, where errors at the output layer are propagated back through the
structure of the MLP, with the errors at each layer being used to adjust the incoming
connection weights of each neuron.
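The structure and training procedure described above can be sketched in plain Python. This is a minimal illustration and not the implementation used in the study: the class name TinyMLP, the sigmoid hidden units, the linear output neuron, the weight initialisation range, and the toy data are all assumptions, and the momentum term is omitted for brevity.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """Three-layer perceptron: inputs -> sigmoid hidden layer -> one linear output."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each hidden neuron holds one weight per input plus a bias (last entry).
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        # The output neuron holds one weight per hidden neuron plus a bias.
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(ws[:-1], x)) + ws[-1])
                  for ws in self.w_hidden]
        output = sum(w * h for w, h in zip(self.w_out[:-1], hidden)) + self.w_out[-1]
        return hidden, output

    def train_epoch(self, data, learning_rate=0.1):
        """One pass of online backpropagation; returns the mean squared error."""
        total = 0.0
        for x, target in data:
            hidden, output = self.forward(x)
            error = output - target
            total += error ** 2
            # Output error propagated back to each hidden neuron
            # (computed before the output weights are changed).
            deltas = [error * self.w_out[j] * h * (1.0 - h)
                      for j, h in enumerate(hidden)]
            # Gradient-descent step on the output weights and bias.
            for j, h in enumerate(hidden):
                self.w_out[j] -= learning_rate * error * h
            self.w_out[-1] -= learning_rate * error
            # Gradient-descent step on the hidden weights and biases.
            for j, delta in enumerate(deltas):
                for i, v in enumerate(x):
                    self.w_hidden[j][i] -= learning_rate * delta * v
                self.w_hidden[j][-1] -= learning_rate * delta
        return total / len(data)


# Toy data (invented): the target is the sum of two inputs scaled to [0, 1].
data = [((a / 4, b / 4), a / 4 + b / 4) for a in range(5) for b in range(5)]
net = TinyMLP(n_inputs=2, n_hidden=4, seed=1)
errors = [net.train_epoch(data) for _ in range(200)]
print("MSE in first epoch:", errors[0], "MSE in last epoch:", errors[-1])
```

Inputs are scaled to a small range in this example because gradient descent with a fixed learning rate is sensitive to the magnitude of the inputs; whether and how the study rescaled its variables is not stated in this section.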
An alternative method of training MLP is via evolutionary programming (EP) (Fogel et
al., 1965). This is an evolutionary algorithm (Fogel et al., 1997) whereby populations of
solution attempts are evolved over generations; the performance of each solution attempt in
the population is evaluated over the problem, and the solution attempts with the best
performance in each generation are used to generate the next generation. When applied to
training MLP, the populations consist of MLP, the performance of each MLP is evaluated
over the training data, and new solutions are generated by applying normally distributed
changes to the connection weights. We measured the performance of each MLP as the mean
squared error (MSE) over the training data set. For both BP and EP training of MLP, one
hundred trials were carried out over the selected training parameters. The MLP that had the
best performance over the validation data partition was used to predict over the test partition.
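A minimal sketch of evolutionary training, under stated assumptions, is given below. Each individual is a flat list of MLP connection weights, fitness is the MSE over the training data, and offspring are produced by adding normally distributed changes to the weights of a parent. The tanh hidden units, the retention of parents in the next generation, the fixed mutation size (sigma) and the small population are assumptions chosen for brevity, not details taken from the study.

```python
import math
import random


def mlp_predict(weights, x, n_hidden):
    """Forward pass of a one-hidden-layer MLP stored as a flat weight list."""
    n_in = len(x)
    out_start = n_hidden * (n_in + 1)   # index of the first hidden-to-output weight
    output = weights[-1]                # output bias
    for j in range(n_hidden):
        base = j * (n_in + 1)
        z = weights[base + n_in] + sum(weights[base + i] * x[i] for i in range(n_in))
        output += weights[out_start + j] * math.tanh(z)
    return output


def mse(weights, data, n_hidden):
    return sum((mlp_predict(weights, x, n_hidden) - t) ** 2 for x, t in data) / len(data)


def evolve(data, n_inputs, n_hidden, pop_size=20, n_parents=4,
           generations=50, sigma=0.1, seed=0):
    rng = random.Random(seed)
    n_weights = n_hidden * (n_inputs + 1) + n_hidden + 1
    population = [[rng.gauss(0.0, 0.5) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Evaluate every MLP on the training data and keep the best as parents.
        population.sort(key=lambda w: mse(w, data, n_hidden))
        parents = population[:n_parents]
        history.append(mse(parents[0], data, n_hidden))
        # Parents survive unchanged (an assumption); offspring are copies of a
        # parent with normally distributed changes applied to every weight.
        population = list(parents)
        while len(population) < pop_size:
            parent = rng.choice(parents)
            population.append([w + rng.gauss(0.0, sigma) for w in parent])
    best = min(population, key=lambda w: mse(w, data, n_hidden))
    return best, history


# Toy data (invented): the target is the sum of two inputs scaled to [0, 1].
data = [((a / 4, b / 4), a / 4 + b / 4) for a in range(5) for b in range(5)]
best, history = evolve(data, n_inputs=2, n_hidden=3)
print("best MSE, first generation:", history[0], "last generation:", history[-1])
```

Because the parents are carried over unchanged, the best training error in this sketch can never increase from one generation to the next. Unlike backpropagation, no gradient of the error is required, at the cost of many more evaluations of the network.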
For each trial with MLP, the contributions of each input neuron to the output of the
network were also determined. Many methods have been proposed for determining the
importance of each of the input neurons of a MLP. These include the methods of Garson
(1991), Milne (1995), Gevrey et al. (2003) and Olden and Jackson (2002). Since it was
desirable to identify features that negatively affect pupal weight and the methods of Milne
(1995) and Gevrey et al. (2003) return unsigned values, these methods were rejected. Of the
remaining methods, the work of Olden et al. (2004) has shown that the method of Olden and
Jackson (2002) is the least biased, and it has been previously used in ecological modelling
applications (Joy and Death, 2004). Thus, this method was selected.
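The core calculation of the connection weight approach, as described by Olden et al. (2004), is the product of each input-to-hidden weight and the corresponding hidden-to-output weight, summed over the hidden neurons, which preserves the sign of each contribution. The sketch below covers only this product-sum for a single output neuron; the function name and the weight values are hypothetical.

```python
def connection_weight_contributions(w_input_hidden, w_hidden_output):
    """Signed contribution of each input to a single output neuron.

    w_input_hidden[i][j] is the weight from input i to hidden neuron j;
    w_hidden_output[j] is the weight from hidden neuron j to the output.
    """
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, w_hidden_output))
            for row in w_input_hidden]


# Made-up weights for three inputs and two hidden neurons.
w_input_hidden = [[-1.0, 0.5],   # input x1
                  [0.2, 0.1],    # input x2
                  [1.0, 2.0]]    # input x3
w_hidden_output = [2.0, 1.0]

contributions = connection_weight_contributions(w_input_hidden, w_hidden_output)
print(contributions)
```

With these made-up weights the first input receives a negative score, the third a large positive score, and the second a score close to zero, which is the kind of signed ranking that motivated the choice of this method.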
A more advanced ANN is the Fuzzy Neural Network FuNN (Kasabov et al., 1997).
FuNN was designed to provide an easy method of combining the advantages of fuzzy logic
(Zadeh, 1965) with the advantages of neural networks. FuNN are ANN with fuzzy logic
elements embedded within them, specifically fuzzy membership functions attached to their
input and output variables, and are trained using BP. Fuzzy logic allows concepts that are not
crisply defined, such as 'Low', 'Medium' or 'High', to be expressed in fuzzy rules that are
more easily understood, are more succinct and are more robust than rules that rely on crisply
defined concepts. A chief advantage of FuNN is the ability to extract and insert fuzzy rules
from and into a FuNN structure. Fuzzy rules are extracted from a trained FuNN, and are
useful for explaining in a comprehensible manner the knowledge that the network has
captured.
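To illustrate how a crisp value may be mapped onto labels such as Low, Medium and High, the following sketch uses three evenly spaced triangular membership functions. The triangular shape, the even spacing and the example range are assumptions made for illustration; the membership functions actually used in the study are not described in this section.

```python
def triangular(x, left, centre, right):
    """Triangular membership: 0 at left and right, 1 at centre."""
    if x <= left or x >= right:
        return 0.0
    if x <= centre:
        return (x - left) / (centre - left)
    return (right - x) / (right - centre)


def fuzzify(x, low, high):
    """Degrees of membership in Low, Medium and High for x within [low, high]."""
    mid = (low + high) / 2.0
    span = mid - low
    x = min(max(x, low), high)  # clamp values outside the assumed range
    return {
        "Low": triangular(x, low - span, low, mid),
        "Medium": triangular(x, low, mid, high),
        "High": triangular(x, mid, high, high + span),
    }


# Example with an arbitrary range of 0 to 100.
print(fuzzify(25, 0, 100))
```

A value of 25 on this arbitrary scale is partly Low and partly Medium, which is how fuzzy sets avoid the abrupt boundaries of crisply defined categories.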
A potential disadvantage of ANN is the inability of the network to respond suitably to data
which were not previously presented to it (lack of generalisation ability, or overfitting).
Normally, to avoid overfitting of the network, the number of neural connections should be
lower than the number of training examples. However, there are no definite rules or methods
for establishing the suitable number of hidden layers and neurons (Watts and Worner, 2008;
Zhang and Zhang, 2008; Huang, 2009; Huang et al., 2010). Therefore, cross-validation (CV)
may be a useful way of avoiding such drawbacks, by presenting a cross-validation subset to
the network in order to monitor its performance during training. In regard to the use of neural
modelling in entomology, Zhang and Zhang (2008), for example, utilised cross-validation
procedures.
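A simple way of forming training, validation and test partitions is sketched below. The proportions (70/15/15), the random seed and the function name are assumptions; the partition sizes used in the study are not given in this section. Only the total of 1114 examples is taken from the text.

```python
import random


def partition(n_examples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle example indices and split them into train, validation and test."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n_examples)
    n_val = int(fractions[1] * n_examples)
    train = indices[:n_train]
    validation = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains after the first two
    return train, validation, test


train, validation, test = partition(1114)
print(len(train), len(validation), len(test))
```

The validation partition is used to choose among trained networks, and the test partition is held back so that the reported accuracy reflects data that played no part in either training or model selection.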

3. RESULTS
Table 1 shows the coefficients of determination (R²) and root mean square errors (RMSE) that
were derived from the data subsets utilised for testing the neural network models. Regarding
the statistical model (i.e. general quadratic regression), the entire data set (i.e. 1114 examples)
was utilised. The following equation was derived by fitting Equation (1) to the entire data set:

$$\hat{y} = 6.2049 - 0.0047 x_1 + 0.2351 x_2 - 1.5892 x_3 + 0.00004 x_1 x_2 - 0.0007 x_1 x_3 - 0.0056 x_2 x_3 + 0.00000596 x_1^2 - 0.0011 x_2^2 + 0.7071 x_3^2 \qquad (2)$$

in which ŷ represents the pupal weight predicted by the quadratic model estimated from the
entire data set.
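Equation (2) can be evaluated directly, as in the sketch below. The coefficients are copied from Equation (2); the function name and the input values are hypothetical and chosen only to make the arithmetic easy to follow, so the result should not be read as an experimental prediction.

```python
def predict_pupal_weight(x1, x2, x3):
    """Evaluate Equation (2).

    x1: initial larval density, x2: amount of food (g), x3: pupal size (mm).
    """
    return (6.2049 - 0.0047 * x1 + 0.2351 * x2 - 1.5892 * x3
            + 0.00004 * x1 * x2 - 0.0007 * x1 * x3 - 0.0056 * x2 * x3
            + 0.00000596 * x1 ** 2 - 0.0011 * x2 ** 2 + 0.7071 * x3 ** 2)


# Hypothetical inputs used purely to illustrate the calculation.
print(predict_pupal_weight(100, 10, 1))
```

Because the model contains squared and interaction terms, the effect of changing one input depends on the current values of all three inputs, so single coefficients should not be interpreted in isolation.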

Table 1. Coefficients of determination (R²) and root mean square errors (RMSE)
derived from test subsets of each predictive model. MLP-BP: multi-layer perceptron
(MLP) trained using backpropagation. MLP-EP: MLP trained with evolutionary
programming (EP). FuNN: fuzzy neural network. GQM: general quadratic model. For the
statistical model (GQM), the entire data set (n = 1114) was utilised in obtaining the metrics.

Neural models (test subsets)      R²        RMSE
MLP-BP                            0.5659    5.8509
MLP-EP                            0.5680    6.2415
FuNN                              0.6394    5.1099

Statistical model (n = 1114)      R²        RMSE
GQM                               0.5761    7.7530

The training parameters that were found to be the most effective for BP-trained MLP
were 15 hidden neurons, trained for 100 epochs with a learning rate and momentum of 0.5.
The optimal parameters for EP-trained MLP were 12 hidden neurons, a population size of 200
MLP, over 7500 generations. The best 40 MLP of each generation were used as the parents of
the next generation. Finally, for FuNN, the optimal parameters were 15 hidden neurons,
trained for 100 epochs with a learning rate and momentum of 0.5. There were three fuzzy
membership functions attached to each input neuron and the output neuron.
The means and standard deviations of the input contributions derived from BP-trained
MLP were x1 = -6.6061±0.5057, x2 = 3.7892±0.4177, and x3 = 12.6191±0.3450. The input
contributions derived from EP-trained MLP were x1 = -8.7523±2.8487, x2 = 4.9177±3.6100,
and x3 = 9.5995±3.9692.
Three fuzzy rules were extracted from the trained FuNN and are presented below. The
numbers following the labels Low, Medium and High are confidence factors:

if x1 is Medium 3.05073 and x2 is Medium 3.48426 and x3 is Low 2.97521
then y1 is Low 1.81613

if x1 is Medium 4.88913 and x3 is High 1.31923
then y1 is Medium 2.07653 and y1 is High 2.05552

if x1 is Low 2.26085 and x3 is High 1.31822
then y1 is High 1.67556

A threshold value was applied to the rule extraction process, which eliminated rules and
rule elements that did not contribute strongly to the model: thus, not all variables are included
in the rule antecedents.
The R² obtained by means of the general quadratic model (GQM) was slightly higher (R²
= 0.5761, and RMSE = 7.7530) than that derived from the conventional first-order multiple
linear regression model (i.e. R² = 0.5720, and RMSE = 7.8320) implemented by Bianconi et
al. (2010b). On the other hand, the fuzzy neural network FuNN provided both a higher R²
(0.6394) and lower RMSE value (5.1099) than the general quadratic model (GQM) utilised in
the present investigation, while the BP-trained MLP and EP-trained MLP both yielded lower
RMSE values (5.8509 and 6.2415 respectively), but slightly lower R² values (0.5659 and
0.5680), than the GQM.
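The size of these differences can be expressed as a relative reduction in RMSE with respect to the GQM. The sketch below uses only the values reported in Table 1; the variable names are hypothetical. It should be borne in mind that the neural metrics were computed on test subsets whereas the GQM metrics were computed on the entire data set, so the comparison is indicative rather than strictly like-for-like.

```python
# RMSE values copied from Table 1.
rmse_gqm = 7.7530
rmse_neural = {"MLP-BP": 5.8509, "MLP-EP": 6.2415, "FuNN": 5.1099}

# Relative reduction in RMSE with respect to the general quadratic model.
reduction = {name: (rmse_gqm - value) / rmse_gqm
             for name, value in rmse_neural.items()}

for name, fraction in reduction.items():
    print(f"{name}: {100 * fraction:.1f}% lower RMSE than GQM")
```

On these figures the RMSE of FuNN is about 34% lower than that of the GQM, compared with roughly 25% for the BP-trained MLP and 19% for the EP-trained MLP.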

Figure 1. Residuals (i.e. the difference between predicted and observed values) as a function of the
predicted values. The whole data set is represented (n = 1114). Each point represents the difference
between the observed pupal weight (actual value) and the pupal weight predicted by the general
quadratic model (Equation 2).

In addition, it is important to note that the residuals derived from the general quadratic
model (Figure 1) are not 'biased'. That is, the residuals neither increase nor decrease with the
increase in the predicted output values.

4. DISCUSSION
Bryant and Shreeve (2002) compared the predictive power of a feed-forward
backpropagation neural network with that provided by a conventional multiple regression
model in estimating microhabitat temperature, and the neural network method was found to
exhibit a higher correlation between predicted and observed values. Howe et al. (2007)
modelled the body temperature and activity of a widespread butterfly species (Polyommatus
icarus) by means of a multilayer feed-forward backpropagation network, and this neural
model was deemed superior to a generalised linear modelling approach to predicting body
temperature.
Similarly, the MLP trained using backpropagation and evolutionary programming and the
fuzzy neural network FuNN deployed in the current study yielded lower RMSE values than
a second-order regression-based model (i.e. general quadratic model), although only FuNN
also achieved a higher R² (Table 1). Furthermore, Watts and
Worner (2008) successfully used multilayer perceptrons (Cascaded MLP) in order to improve
the prediction accuracy of potential insect species invasions, and improved test accuracy was
obtained through the utilised neural models.
Simple first-order linear regression models (that is, one single output as a function of one
single input variable) have been utilised in studies concerning the nutritional ecology of
blowflies (Von Zuben et al., 1993, 2000). Gomes et al. (2009), for instance, examined the
burrowing behaviour of blowflies in response to different conditions of temperature by means
of a linear regression method in order to determine the relationship between increase in
temperature and decrease in pupal weight. Bianconi et al. (2010a, 2010b) deployed a
conventional multiple linear regression model in analysing the nutritional ecology of C.
megacephala, and this statistical regression technique was less accurate than neural networks.
Bianconi et al. (2010b) suggested that the input variables (i.e. initial number of larvae,
amount of available food, and pupal size) could have introduced some non-linearity into the
output response. Such non-linearity may well hinder the implementation of first-order
multiple linear regression models, because first-order models are linear both in the
parameters and in the response surface. On the other hand, second-order models such as
the general quadratic model utilised in the current work (Equation 1) are not restricted to
linear responses only. Thus, this type of model could be more appropriate for predicting pupal
weight values than the first-order model utilised by Bianconi et al. (2010b).
Nonetheless, the R² value derived from the general quadratic model implemented in the
current investigation (R² = 0.5761, Table 1) was only slightly higher than that (i.e. R² = 0.5720)
obtained by means of a traditional first-order regression model (Bianconi et al., 2010b).
Therefore, it could be concluded that traditional multiple regression models such as first- and
second-order models were not capable of incorporating the non-linearity present in the input
variables into the fitted model. Moreover, the RMSE derived from FuNN was lower than all
of the other outcomes (i.e. Bianconi et al., 2010b, and the current investigation).
The input contribution analysis of both the BP and EP-trained MLP indicated that the
amount of food available (x2) was the least important variable, as this had the lowest
absolute contribution score for both sets of results. This interpretation agrees with the results
of the statistical model, which assigned relatively small coefficients to that variable, and with
the fuzzy rules extracted from FuNN, where this variable was included in only one of the three
extracted rules. Pupal size x3 had a large positive contribution, that is, a large value for pupal
size would contribute to a large value for pupal weight (y1). To some extent, this
interpretation agrees with the statistical model, where a relatively large positive coefficient
was assigned to the squared term (i.e. 0.7071, Equation 2), and with the fuzzy rules, as pupal
size was included in each of the three extracted rules. Finally, initial larval density (x1) had a
large negative contribution for both the BP and EP-trained MLP. This means that a large value for
initial larval density would contribute to a small value for pupal weight. In the statistical
model very small coefficients were assigned to this variable, and in the third extracted fuzzy
rule, a low value of this variable (if x1 is Low) led to a high pupal weight (then y1 is high). In
summary, the neural models utilised in predicting the output variable revealed that a large
value of larval density x1 contributes to a small pupal weight, a large value of pupal size x3
contributes to a high pupal weight, and the amount of food available x2 contributes the least to
the pupal weight.
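A signed input contribution of this kind can be sketched as follows. This is only an illustration: it uses the connection-weight product (the sum, over hidden units, of the input-to-hidden weight multiplied by the hidden-to-output weight), which is one of the approaches compared by Olden et al. (2004), and it is not necessarily the contribution measure applied in the present work. The weight values and the function name are hypothetical.

```python
import numpy as np

def connection_weight_contributions(w_input_hidden, w_hidden_output):
    """Signed contribution of each input to a single output neuron.

    w_input_hidden  : array of shape (n_inputs, n_hidden)
    w_hidden_output : array of shape (n_hidden,)
    """
    w_input_hidden = np.asarray(w_input_hidden, dtype=float)
    w_hidden_output = np.asarray(w_hidden_output, dtype=float)
    # Sum over hidden units of (input-to-hidden) * (hidden-to-output)
    return w_input_hidden @ w_hidden_output

# Hypothetical weights for three inputs (x1, x2, x3) and two hidden units
W = np.array([[-1.2, -0.8],    # x1: initial larval density
              [0.1, -0.05],    # x2: amount of food available
              [0.9, 1.1]])     # x3: pupal size
v = np.array([1.0, 0.5])

scores = connection_weight_contributions(W, v)
least_important = int(np.argmin(np.abs(scores)))
print(scores, least_important)
```

With these assumed weights the score for x1 is negative, the score for x3 is positive, and x2 has the smallest magnitude, which mirrors the qualitative pattern described above.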
Zhang et al. (2008a) investigated the spatial distribution pattern of grassland insects by
means of neural models. They concluded from their results that neural networks were more
flexible than a conventional model. Additionally, these authors indicated that further research
based on more complex distribution patterns should be conducted with the aim of obtaining
conclusions that are more reliable. Similarly, neural networks were deemed to be superior to a
conventional model (i.e. general quadratic model) in the current study. Nevertheless, the
present work utilised simple experimental designs and only three explanatory variables
(inputs). Therefore, more complex experimental designs should be implemented in order to
explain the portion of the total variance that was not accounted for by the models.
Furthermore, larger sample sizes are highly desirable, because the number of examples
(sample size) that is usually utilised in nutritional ecology may prevent the full utilisation of
the potential of neural networks (Bianconi et al., 2010a, 2010b).
The sample size used in the present investigation (n = 1114) may be sufficient to conduct
most ecological and biological approaches to analysing the nutritional ecology of blowflies
(Bianconi et al., 2010a, 2010b). Nonetheless, it is important to note that two of the input
variables (i.e. larval density and amount of food available) provided the utilised methods with
few distinct values, because only 11 larval densities and four distinct amounts of available
food were used (see Section 2 for details). Therefore, the larger number of distinct values
measured for the output variable (i.e. pupal weight), using the same combination of values of
those input variables, may have caused a lack of fit between input data and predicted output
values that may have decreased the predictive capability of the statistical and neural network
methods (Bianconi et al., 2010b).
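The effect of repeated input combinations can be made concrete with a small sketch. When several observations share exactly the same input values, any deterministic model must return one prediction for all of them, so the within-group (pure error) variation places a ceiling on the attainable R². The function name and the toy numbers below are assumptions for illustration only; in the present data set pupal size also varied among individuals, so the ceiling computed from larval density and food amount alone would not apply directly.

```python
import numpy as np
from collections import defaultdict

def max_r2_given_replicates(X, y):
    """Upper bound on R^2 for any model that gives identical inputs one prediction."""
    groups = defaultdict(list)
    for row, value in zip(map(tuple, X), y):
        groups[row].append(value)
    y = np.asarray(y, dtype=float)
    ss_total = ((y - y.mean()) ** 2).sum()
    # Pure-error sum of squares: spread of the output within each input combination
    ss_pure = sum(((np.array(v) - np.mean(v)) ** 2).sum() for v in groups.values())
    return 1.0 - ss_pure / ss_total

# Toy data: two distinct input combinations, two differing outputs for each
X_toy = [[1, 1], [1, 1], [2, 1], [2, 1]]
y_toy = [1.0, 3.0, 5.0, 7.0]
bound = max_r2_given_replicates(X_toy, y_toy)
print(bound)
```

In this toy case the total sum of squares is 20 and the pure-error sum of squares is 4, so no model can exceed an R² of 0.8 on these data, however flexible it is.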
Nonetheless, the number of distinct values of larval densities deployed in the present
work was considerably larger than those widely used in experimental designs on the
nutritional ecology of blowflies (Von Zuben et al., 2000; Bianconi et al., 2010b). Moreover,
such studies would be impractical if an even larger number of larval densities were used (Von
Zuben et al., 1993, 2001; Bianconi et al., 2010a, 2010b). Hence, the utilisation of artificial
neural networks is justified in this case, since the neural algorithms coped with the lack of fit
better than the general quadratic model (Table 1).
Future studies may consider other complex variables that were not assessed in the current
work. Ambient temperature, for example, was utilised by Zhang et al. (2008b). These authors
deployed neural network algorithms (functional link artificial neural networks) in order to
model the food intake dynamics of larvae of S. litura (Lepidoptera). Six different
temperatures were used for measuring the food intake, and the neural network approach was
deemed accurate. Moreover, temperature is regarded as an important variable in analysing the
dynamics of blowflies (Shiao and Yeh, 2008; Hwang and Turner, 2009; Cammack and
Nelder, 2010; Hu et al., 2010).
Even the utilisation of a simple linear regression may well require a non-trivial amount of
statistical expertise, and the deployment of a multiple non-linear regression model such as an MLP
requires still more knowledge and experience (Sarle, 1994). Therefore, it is important to
highlight the usefulness of multidisciplinary studies with the aim of conducting effective
investigations of bionomic parameters, because it is possible that unresolved problems
concerning the bionomics of immature and adult individuals of blowflies could be
disentangled and clarified by the use of neural models (Bianconi et al., 2010a, 2010b).
The employment of these complex analytical tools may well help entomologists to
feasibly schematise the sort of practical situations in which the utilisation of artificial neural
networks is able to provide new insights into the nutritional ecology of blowflies (Bianconi et
al., 2010a, 2010b). Additionally, the extraction of fuzzy rules from the fuzzy neural networks
provided an easily comprehensible way of describing what the networks had learnt.
Therefore, the outcomes of the present investigation may stimulate the use of artificial neural
networks in entomological research as a whole.

ACKNOWLEDGEMENTS
The Conselho Nacional de Desenvolvimento Científico e Tecnológico provided A.
Bianconi and C.J. Von Zuben with financial support.

REFERENCES
Arnaldos MI, García MD, Romera E, Presa JJ, Luna A. 2005. Estimation of postmortem
interval in real cases based on experimentally obtained entomological evidence. Forensic
Science International, 149:57-65.
Bianconi A, Von Zuben CJ, Serapião ABS, Govone JS. 2010a. Artificial neural networks: A
novel approach to analysing the nutritional ecology of a blowfly species, Chrysomya
megacephala. Journal of Insect Science, 10:1-18 (Article 58).
Bianconi A, Von Zuben CJ, Serapião ABS, Govone JS. 2010b. The use of artificial neural
networks in analysing the nutritional ecology of Chrysomya megacephala (F.) (Diptera:
Calliphoridae), compared with a statistical model. Australian Journal of Entomology,
49:201-212.
Bishop CM. Neural Networks for Pattern Recognition. Oxford: Oxford University Press,
1995.
Brewer FD. 1992. Gel extenders in larval diet of Cochliomyia hominivorax (Diptera:
Calliphoridae). Journal of Economic Entomology, 85:445-450.
Bryant SR, Thomas CD, Bale JS. 1997. Nettle-feeding nymphalid butterflies: temperature,
development and distribution. Ecological Entomology, 22:390-398.
Bryant SR, Shreeve TG. 2002. The use of artificial neural networks in ecological analysis:
estimating microhabitat temperature. Ecological Entomology, 27:424-432.
Cammack JA, Nelder MP. 2010. Cool-weather activity of the forensically important hairy
maggot blow fly Chrysomya rufifacies (Macquart) (Diptera: Calliphoridae) on carrion in
Upstate South Carolina, United States. Forensic Science International, 195:139-142.
Catts EP, Goff ML. 1992. Forensic entomology in criminal investigations. Annual Review of
Entomology, 37:253-272.
Chang CL, Caceres C, Jang EB. 2004. A novel liquid larval diet and its rearing system for
melon fly, Bactrocera cucurbitae (Diptera: Tephritidae). Annals of the Entomological
Society of America, 97:524-528.
Chaudhury MF, Alvarez LA. 1999. A new starch-grafted gelling agent for screwworm
(Diptera: Calliphoridae) larval diet. Journal of Economic Entomology, 92:1138-1141.
Cheng B, Titterington DM. 1994. Neural Networks: a review from a statistical perspective.
Statistical Science, 9:2-30.
Crick F. 1989. The recent excitement about neural networks. Nature, 337:129–132.
de Jong G. 1976. A model of competition for food. I. Frequency-dependent viabilities. The
American Naturalist, 110:1013-1027.
Fogel DB, Wasson EC, Boughton EM, Porto VW. 1997. A step toward computer-assisted
mammography using evolutionary programming and neural networks. Cancer Letters,
119:93–97.
Fogel LJ, Owens AJ, Walsh MJ. Artificial intelligence through a simulation of evolution. In
Maxfield M, Callahan A, Fogel LJ (editors). Biophysics and Cybernetic Systems:
Proceedings of the 2nd Cybernetic Sciences Symposium, pages 131–155. 1965.
Friese DD. 1992. Calf milk replacers as substitutes for milk in the larval diet of the
screwworm (Diptera: Calliphoridae). Journal of Economic Entomology, 85:1830-1834.
Furlanetto SMP, Campos MLC, Harsi CM, Buralli GM, Ishihata GK. 1984. Microorganismos
enteropatogênicos em moscas africanas pertencentes ao gênero Chrysomya (Diptera:
Calliphoridae) no Brasil. Brazilian Journal of Microbiology, 15:170-174.
Gabre RM, Adham FK, Chi H. 2005. Life table of Chrysomya megacephala (Fabricius)
(Diptera: Calliphoridae). Acta Oecologica, 27:179-183.
Garson GD. 1991. Interpreting neural-network connection weights. AI Expert, 6:47-51.
Gevrey M, Dimopoulos I, Lek S. 2003. Review and comparison of methods to study the
contribution of variables in artificial neural network models. Ecological Modelling,
160:249-264.
Gomes L, Von Zuben CJ. 2005. Postfeeding radial dispersal in larvae of Chrysomya albiceps
(Diptera: Calliphoridae): implications for forensic entomology. Forensic Science
International, 155:61-64.
Gomes L, Gomes G, Von Zuben CJ. 2009. The influence of temperature on the behavior of
burrowing in larvae of the blowflies, Chrysomya albiceps and Lucilia cuprina, under
controlled conditions. Journal of Insect Science, 9:1-5 (Article 14).
Goodbrod JR, Goff ML. 1990. Effects of larval population density on rates of development
and interactions between two species of Chrysomya (Diptera: Calliphoridae) in laboratory
culture. Journal of Medical Entomology, 27:338-343.
Greenberg B. 1991. Flies as forensic indicators. Journal of Medical Entomology, 28:565-577.
Guimarães JH, Prado AP, Linhares AX. 1978. Three newly introduced blowfly species in
Southern Brazil (Diptera, Calliphoridae). Revista Brasileira de Entomologia, 22:53-60.
Harrison PA, Berry PM, Butt N, New M. 2006. Modelling climate change impacts on
species' distributions at the European scale: implications for conservation policy.
Environmental Science and Policy, 9:116-128.
Haykin S. Neural Networks: A Comprehensive Foundation, 2nd edition. Upper Saddle River:
Prentice Hall, 1999.
Herzog JDA, Milward-Azevedo EMV, Ferreira YL. 1992. Observações preliminares sobre o
ritmo horário de oviposição de Chrysomya megacephala (Fabricius) (Diptera,
Calliphoridae). Anais da Sociedade Entomológica do Brasil, 21:101-106.
Howe PD, Bryant SR, Shreeve TG. 2007. Predicting body temperature and activity of adult
Polyommatus icarus using neural network models under current and projected climate
scenarios. Oecologia, 153:857-869.
Hu Y, Yuan X, Zhu F, Lei C. 2010. Development time and size-related traits in the oriental
blowfly, Chrysomya megacephala along a latitudinal gradient from China. Journal of
Thermal Biology, 35:366-371.
Huang Y. 2009. Advances in artificial neural networks – methodological development and
application. Algorithms, 2:973-1007.
Huang Y, Lan Y, Thomson SJ, Fang A, Hoffmann WC, Lacey RE. 2010. Development of
soft computing and applications in agricultural and biological engineering. Computers
and Electronics in Agriculture, 71:107-127.
Hwang CC, Turner BD. 2009. Small-scaled geographical variation in life-history traits of the
blowfly Calliphora vicina between rural and urban populations. Entomologia
Experimentalis et Applicata, 132:218-224.
Ireland S, Turner B. 2006. The effects of larval crowding and food type on the size and
development of the blowfly, Calliphora vomitoria. Forensic Science International,
159:175-181.
Joy MK, Death RG. 2004. Predictive modelling and spatial mapping of freshwater fish and
decapod assemblages using GIS and neural networks. Freshwater Biology, 49:1036-
1052.
Kasabov NK, Kim J, Watts MJ, Gray AR. 1997. FuNN/2 – a fuzzy neural network
architecture for adaptive learning and knowledge acquisition. Information Sciences –
Applications, 101:155-175.
Köppler K, Kaffer T, Vogt H. 2009. Substantial progress made in the rearing of the European
cherry fruit fly, Rhagoletis cerasi. Entomologia Experimentalis et Applicata, 132:283-
288.
Kutner MH, Nachtsheim CJ, Neter J. Applied Linear Regression Models, 4th edition. New
York: Irwin, 2004.
Laurence BR. 1981. Geographical expansions of the range of Chrysomya blowflies.
Transactions of the Royal Society of Tropical Medicine and Hygiene, 75:130-131.
Laurence BR. 1986. Old World blowflies in the New World. Parasitology Today, 2:77-79.
Leal TTS, Prado AP, Antunes AJ. 1982. Rearing the larvae of the blowfly Chrysomya
chloropyga (Wiedemann) (Diptera, Calliphoridae) on oligidic diets. Revista Brasileira de
Zoologia, 1:41-44.
Levot GW, Brown KR, Shipp E. 1979. Larval growth of some calliphorid and sarcophagid
Diptera. Bulletin of Entomological Research, 69:469-475.
Lima MLPS, Luz E. 1991. Espécies exóticas de Chrysomya (Diptera, Calliphoridae) como
veiculadoras de enterobactérias patogênicas em Curitiba, Paraná, Brasil. Acta Biológica
Paranaense, 20:61-83.
Lomnicki A. Population Ecology of Individuals. Princeton: Princeton University Press, 1988.
Milne LK. 1995. Feature selection with neural networks with contribution measures. In:
Proceedings of the Australian Conference on Artificial Intelligence AI’95, Canberra.
Neter J, Kutner MH, Nachtsheim CJ, Wasserman W. Applied Linear Statistical Models, 4th
edition. New York: Irwin, 1996.
Obach M, Wagner R, Werner H, Schmidt HH. 2001. Modelling population dynamics of
aquatic insects with artificial neural networks. Ecological Modelling, 146:207-217.
Olden JD, Jackson DA. 2002. Illuminating the "black box": a randomization approach for
understanding variable contributions in artificial neural networks. Ecological Modelling,
154:135-150.
Olden JD, Joy MK, Death RG. 2004. An accurate comparison of methods for quantifying
variable importance in artificial neural networks using simulated data. Ecological
Modelling, 178:389-397.
Papandroulakis N, Markakis G, Divanach P, Kentouri M. 2000. Feeding requirements of sea
bream (Sparus aurata) larvae under intensive rearing conditions. Development of a fuzzy
logic controller for feeding. Aquacultural Engineering, 21:285-299.
Pearson RG, Dawson TP. 2003. Predicting the impacts of climate change on the distribution
of species: are bioclimate envelope models useful? Global Ecology & Biogeography,
12:361-371.
Pearson RG, Dawson TP, Berry PM, Harrison PA. 2002. SPECIES: A spatial evaluation of
climate impact on the envelope of species. Ecological Modelling, 154:289-300.
Peterson II RD, Candido AO. 1987. Larval and pupal weight relationships of six strains of
screwworm (Diptera: Calliphoridae) reared in the laboratory and in wounds. Journal of
Economic Entomology, 80:1213-1217.
Reis SF, Stangenhaus G, Godoy WAC, Von Zuben CJ, Ribeiro OB. 1994. Variação em
caracteres bionômicos em função da densidade larval em Chrysomya megacephala e
Chrysomya putoria (Diptera: Calliphoridae). Revista Brasileira de Entomologia, 38:33-
46.
Rumelhart DE, Hinton GE, Williams RJ. 1986. Learning representations by back-propagating
errors. Nature, 323:533–536.
Sarle WS. 1994. Neural networks and statistical models. In: Proceedings of the Nineteenth
Annual SAS Users Group International Conference. pp. 1538-1550. SAS Institute Inc.
Schultz A, Wieland R. 1997. The use of neural networks in agroecological modelling.
Computers and Electronics in Agriculture, 18:73-90.
Schultz A, Wieland R, Lutze G. 2000. Neural networks in agroecological modelling – stylish
application or helpful tool? Computers and Electronics in Agriculture, 29:73-97.
Shiao SF, Yeh TC. 2008. Larval competition of Chrysomya megacephala and Chrysomya
rufifacies (Diptera: Calliphoridae): behavior and ecological studies of two blow fly
species of forensic significance. Journal of Medical Entomology, 45:785-799.
Tachibana SI, Numata H. 2001. An artificial diet for blow fly larvae, Lucilia sericata
(Meigen) (Diptera: Calliphoridae). Applied Entomology and Zoology, 36:521-523.
Tammaru T, Kaitaniemi P, Ruohomäki K. 1996. Realized fecundity in Epirrita autumnata
(Lepidoptera: Geometridae): relation to body size and consequences to population
dynamics. Oikos, 77:407-416.
Tammaru T, Nylin S, Ruohomäki K, Gotthard K. 2004. Compensatory responses in
lepidopteran larvae: a test of growth rate maximisation. Oikos, 107:352-362.
Ullyett GC. 1950. Competition for food and allied phenomena in sheep-blowfly populations.
Philosophical Transactions of the Royal Society of London, 234:77-174.
Von Zuben CJ, Reis SF, do Val JBR, Godoy WAC, Ribeiro OB. 1993. Dynamics of a
mathematical model of Chrysomya megacephala (Diptera: Calliphoridae). Journal of
Medical Entomology, 30:443-448.
Von Zuben CJ, Stangenhaus G, Godoy WAC. 2000. Larval competition in Chrysomya
megacephala (F.) (Diptera: Calliphoridae): effects of different levels of larval
aggregation on estimates of weight, fecundity and reproductive investment. Revista
Brasileira de Biologia, 60:195-203. (in Portuguese)
Von Zuben CJ, Von Zuben FJ, Godoy WAC. 2001. Larval competition for patchy resources
in Chrysomya megacephala (Dipt., Calliphoridae): implications of the spatial distribution
of immatures. Journal of Applied Entomology, 125:537-541.
Warbrick-Smith J, Raubenheimer D, Simpson SJ, Behmer ST. 2009. Three hundred and fifty
generations of extreme food specialisation: testing predictions of nutritional ecology.
Entomologia Experimentalis et Applicata, 132:65-75.
Watts MJ, Worner SP. 2008. Comparing ensemble and cascaded neural networks that
combine biotic and abiotic variables to predict insect species distribution. Ecological
Informatics, 3:354-366.
Watts MJ, Worner SP. 2009. Estimating the risk of insect species invasion: Kohonen self-
organising maps versus k-means clustering. Ecological Modelling, 220:821-829.
Wells JD. 1991. Chrysomya megacephala (Diptera: Calliphoridae) has reached the
continental United States: review of its biology, pest status, and spread around the world.
Journal of Medical Entomology, 28:471-473.
Wijesundara DP. 1957. The life-history and bionomics of Chrysomya megacephala (Fab.).
Ceylon Journal of Science, 25:169-185.
Worner SP, Gevrey M. 2006. Modelling global insect pest species assemblages to determine
risk of invasion. Journal of Applied Ecology, 43:858-867.
Zadeh LA. 1965. Fuzzy sets. Information and Control, 8:338-353.
Zhang WJ, Barrion A. 2006. Function approximation and documentation of sampling data
using artificial neural networks. Environmental Monitoring and Assessment, 122:185-
201.
Zhang WJ, Zhang XY. 2008. Neural network modeling of survival dynamics of
holometabolous insects: a case study. Ecological Modelling, 211:433-443.
Zhang WJ, Zhong XQ, Liu GH. 2008a. Recognizing spatial distribution patterns of grassland
insects: neural network approaches. Stochastic Environmental Research and Risk
Assessment, 22:207-216.
Zhang WJ, Liu GH, Dai HQ. 2008b. Simulation of food intake dynamics of holometabolous
insect using functional link artificial neural network. Stochastic Environmental Research
and Risk Assessment, 22:123-133.
Zhang WJ, Wei W. 2009. Spatial succession modeling of biological communities: a multi-
model approach. Environmental Monitoring and Assessment, 158:213-230.
Zumpt F. Myiasis in Man and Animals in the Old World. London: Butterworths, 1965.
In: Ecological Modeling ISBN: 978-1-61324-567-5
Editor: WenJun Zhang, pp. 115-172 © 2012 Nova Science Publishers, Inc.

Chapter 7

DEVELOPMENT AND UTILITY OF AN ECOLOGICAL-BASED
DECISION-SUPPORT SYSTEM FOR MANAGING
MIXED CONIFEROUS FOREST STANDS FOR
MULTIPLE OBJECTIVES

Peter F. Newton*
Canadian Wood Fibre Centre, Canadian Forest Service,
Natural Resources Canada, Sault Ste. Marie, Ontario, Canada, P6A 2E5.
* Correspondence: peter.newton@nrcan.gc.ca

ABSTRACT
An ecological-based decision-support system and corresponding algorithmic
analogue for managing natural black spruce (Picea mariana (Mill.) BSP) and jack pine
(Pinus banksiana Lamb.) mixed stands were developed. The integrated hierarchical system
consisted of six sequentially-linked estimation modules. The first module consisted of a
key set of empirical yield-density relationships and theoretically-based functions derived
from allometry and self-thinning theory that were used to describe overall stand dynamics,
including temporal size-density interrelationships and expected stand development
trajectories. The second module comprised a Weibull-based parameter prediction
equation system and an accompanying composite height-diameter function that were used
to recover diameter and height distributions. The third module included a set of species-
specific composite taper equations that were used to derive log product distributions and
volumetric yields. The fourth module was composed of a set of species-specific
allometric-based composite biomass equations that were used to estimate mass
distributions and associated carbon-based equivalents for each above-ground component
(bark, stem, branch and foliage). The fifth module incorporated a set of species-specific
end-product and value equations that were used to predict chip and lumber volumes and
associated monetary equivalents by sawmill type (stud and randomized length mill
configurations). The sixth module encompassed a set of species-specific composite
equations that were used to derive wood and log quality metrics (specific gravity and
mean maximum branch diameter, respectively). The stand dynamic and structural
recovery modules were developed employing 382 stand-level measurements derived
from 155 permanent and temporary sample plots situated throughout the central portion
of the Canadian Boreal Forest Region; the taper and end-product modules were
developed employing published results from taper and sawmill simulation studies; and
the biomass and fibre attribute modules were developed using data from density control
experiments.
The potential of the system in facilitating the transformative change towards the
production of higher value end-products and a broader array of ecosystem services was
exemplified by simultaneously contrasting the consequences of density management
regimes involving commercial thinning treatments in terms of overall productivity, end-
product yields, economic efficiency, and ecological impact. This integration of
quantitative relationships derived from applied ecology, plant population biology and
forest science into a common analytical platform, illustrates the synergy that can be
realized through a multi-disciplinary approach to forest modeling.

Keywords: Self-thinning rule; Forest production theory; Plant population biology;
Allometry; Operational utility.

1. INTRODUCTION
Density management within even-aged forest stands consists of regulating species
composition, density-stress levels and structural characteristics, by manipulating initial
planting densities at the time of establishment (initial espacement; IE) and (or) reducing stand
densities during subsequent stages of stand development (e.g., precommercial thinning (PCT)
at the sapling stage, and (or) commercial thinning (CT) at the semi-mature stage).
Ecologically, these treatments redistribute the finite environmental resources on a given site
(e.g., solar radiation, moisture, nutrients, and physical growing space) to selected crop trees
by controlling the intensity and frequency of symmetrical and asymmetrical competitive
interactions. Regulating site occupancy through density management has been a cornerstone
of silvicultural practice since it was first introduced in forestry by Reventlow in 1879
(Pretzsch, 2009). Density management continues to be an important component of intensive
forest management, as evidenced by current efforts: over 500,000 ha of productive forest land
receive IE, PCT or CT treatments every year in both Canada and Finland (CCFM (2009) and
Peltola (2009), respectively).
A broad array of benefits at the tree, stand and landscape scale can be obtained when
implementing proper density management treatment protocols or prescriptions. Productivity
related benefits include increased growth and resultant yields leading to enhanced end-
products (e.g., Kang et al., 2004), early stand operability (e.g., Erdle, 2000), reduced self-
thinning rates and thus lower mortality losses (e.g., Pelletier and Pitt, 2008), spatial and
structural uniformity resulting in lower extraction, processing and manufacturing costs (e.g.,
Tong et al., 2005), and increased carbon sequestration rates (e.g., Nilsen and Strand, 2008).
Other tangible benefits include the production of coarse woody debris (e.g., Sturtevant et al.,
1996) and provision of hiding requirements (e.g., Smith and Long, 1987) for wildlife habitat,
controlling successional pathways in order to prevent the establishment and development of
ericaceous shrub species (e.g., Lindh and Muir, 2004), and enhanced biological diversity
(e.g., Verschuyl et al., 2011). Conversely, however, density management can result in
negative outcomes if regimes have not been optimally designed for a given management
objective. Thinning treatments which remove too many trees resulting in large inter-tree
distances can promote the development of large branches and resultant knots thus lowering
lumber grades during the manufacturing stage (e.g., Zhang et al., 2005), extend the period of
juvenile wood production resulting in lower specific gravity and associated reductions in end-
product quality and value (e.g., Tong et al., 2009), and (or) reduce merchantable volume
productivity due to prolonged periods of insufficient site occupancy (e.g., Fleming et al.,
2005). Therefore, in order to attain the maximum benefits of density management treatments,
minimize exposure to the serious negative ramifications of incorrect prescriptions, and
provide an objective and transparent framework for justifying treatment decisions, decision-
support tools are required.
The stand density management diagram (SDMD) is a proven density management
decision-support tool which has been utilized by resource managers throughout many of the
world's forest regions. For example, SDMDs have been developed and utilized in managing
Japanese red pine (Pinus densiflora Siebold and Zucc.) plantations in Japan and South Korea
(Ando (1962, 1968) and Kim et al. (1987), respectively), Monterey pine (Pinus radiata D.
Don.) plantations in New Zealand and Spain (Drew and Flewelling (1977) and Castedo-
Dorado et al. (2009), respectively), Douglas fir (Pseudotsuga menziesii (Mirb.) Franco.)
plantations in the Pacific Northwest (Drew and Flewelling (1979) and Long et al. (1988)) and
Spain (López-Sánchez and Rodríguez-Soallerio, 2009), lodgepole pine (Pinus contorta var.
latifolia Engelm.) stands in the western USA (Flewelling and Drew (1985), McCarter and
Long (1986), and Smith and Long (1987)), black spruce plantations in central Canada
(Newton and Weetman, 1994), loblolly pine (Pinus taeda L.) plantations in the southern USA
(Dean and Baldwin, 1993), Scots pine (Pinus sylvestris L.) and Austrian black pine (Pinus
nigra Arn.) plantations in eastern Europe (Stankova and Shibuya, 2007), and Merkus pine
(Pinus merkusii Jungh. et de Vriese) plantations in Indonesia (Heriansyah et al., 2009). These
ecological-based decision-support tools are built upon (1) quantitative functional relationships
derived from applied ecology, plant population biology and forest production theory which
include the reciprocal equations of the competition–density and yield–density effect (Kira,
1953; Shinozaki and Kira, 1956), self-thinning rule (Yoda et al., 1963), and site occupancy-
production relationships (Drew and Flewelling, 1979), and (2) empirical allometric
relationships which include size–density relationships for describing the effect of population
density-stress on tree dimensions such as quadratic mean diameter, mean volume, and mean
live crown ratio (e.g., Drew and Flewelling, 1977). These relationships represent the
cumulative effect of density-dependent resource competition processes on productivity,
allometry and survivorship patterns at both the individual and population levels.
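Two of the relationships named above have simple closed forms, which the following minimal sketch evaluates. It assumes the commonly cited forms of the reciprocal competition-density equation, 1/w = aN + b (Shinozaki and Kira, 1956), and of the self-thinning rule, w = kN^(-3/2) (Yoda et al., 1963), where w is mean plant size and N is density. The parameter values and function names are hypothetical and are not taken from any fitted SDMD; operational models typically estimate the coefficients, and often the exponent, empirically.

```python
def mean_size_cd(density, a, b):
    """Reciprocal competition-density equation: 1/w = a*N + b."""
    return 1.0 / (a * density + b)

def self_thinning_limit(density, k, exponent=-1.5):
    """Self-thinning rule: upper limit of mean plant size, w = k * N**exponent."""
    return k * density ** exponent

# Hypothetical parameters (arbitrary units)
a, b, k = 0.001, 1.0, 1000.0

w_cd = mean_size_cd(1000.0, a, b)           # mean size at N = 1000
w_max = self_thinning_limit(100.0, k)       # size limit at N = 100
yield_dense = 1.0e9 * mean_size_cd(1.0e9, a, b)  # yield per unit area at very high density
print(w_cd, w_max, yield_dense)
```

Under the reciprocal equation, yield per unit area (N multiplied by w) approaches 1/a as density increases, whereas the self-thinning rule implies that doubling density lowers the attainable mean size by a factor of 2^1.5.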
Historically, the analytical development of SDMDs has been characterized by a sequence
of continuous incremental advancements in which increasingly complex and innovative
model variants have been proposed. The first model forms, static SDMDs, were initially
developed in Japan by Ando (1962) and later introduced into North America by Drew and
Flewelling (1977, 1979). However, the lack of mortality submodels to describe stand
dynamics during the self-thinning stage limited their use in density management decision-
making. In response, dynamic SDMDs in which an embedded mortality submodel was
explicitly incorporated within the SDMD model structure, were proposed in the mid-1990s
(e.g., Newton and Weetman 1993, 1994). Later, acknowledging the paradigm shift in
management focus from volumetric yield maximization to end-product recovery and value
maximization (e.g., Barbour and Kellogg, 1990; Emmett 2006), and realizing the limitations
of both the static and dynamic variants in terms of addressing these new objectives, the
structural SDMD was introduced (Newton et al., 2004, 2005). Specifically, the structural
model incorporated a parameter prediction equation system for recovering diameter
distributions within the dynamic SDMD model structure and hence enabled the estimation of
size-dependent end-product and value attributes.
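The idea of recovering a diameter distribution from predicted distribution parameters can be sketched as follows. The example assumes a three-parameter Weibull distribution whose location, scale and shape values have already been predicted from stand-level variables; the parameter values, the stand density and the function names are hypothetical, and the actual parameter prediction equations are not reproduced here.

```python
import math

def weibull_cdf(d, loc, scale, shape):
    """Three-parameter Weibull cumulative distribution function."""
    if d <= loc:
        return 0.0
    return 1.0 - math.exp(-(((d - loc) / scale) ** shape))

def stems_per_class(density, loc, scale, shape, class_width=2.0, n_classes=20):
    """Return (class midpoint in cm, stems/ha) pairs for classes starting at 0 cm."""
    table = []
    for i in range(n_classes):
        lower = i * class_width
        upper = lower + class_width
        # Proportion of stems whose diameter falls within this class
        share = weibull_cdf(upper, loc, scale, shape) - weibull_cdf(lower, loc, scale, shape)
        table.append((lower + class_width / 2.0, density * share))
    return table

# Hypothetical stand: 1500 stems/ha, assumed Weibull parameters in cm
table = stems_per_class(density=1500.0, loc=4.0, scale=10.0, shape=2.5)
total = sum(stems for _, stems in table)
for midpoint, stems in table[:8]:
    print(f"{midpoint:5.1f} cm  {stems:8.1f} stems/ha")
```

Once stem counts by diameter class are available, size-dependent attributes such as product volumes or values can be computed class by class and summed, which is the role the recovered distribution plays in the structural model.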
The most recent iteration of the SDMD modeling framework is represented by the
integrated modular-based structural stand density management model (SSDMM) developed
for natural and managed jack pine (Pinus banksiana Lamb.) stand-types (Newton, 2009). The
rationale for developing this model was to provide resource managers with a comprehensive
density management decision-support tool that could address volumetric, end-product,
economic and ecological objectives, simultaneously. While this modeling platform has been
successfully applied to monospecific stand-types, the mixed stand analogue has yet to be
developed. Consequently, the first objective of this study was to develop a modular-based
SSDMM and associated algorithmic analogue for natural (naturally regenerated stands
without a history of density regulation) black spruce (Picea mariana (Mill) BSP.) and jack
pine mixed stands. The second objective was to demonstrate the utility of the resultant model
by contrasting operationally plausible density control regimes using a broad array of
productivity, economic and ecological performance metrics (e.g., volumetric yield outcomes,
log-product distributions, biomass production and carbon yields, recoverable products and
associated values, economic efficiency, duration of optimal site occupancy, structural
stability, fibre attributes, and operability status). Refer to Newton (2010) for a general
overview of SDMDs in terms of their history, modeling approach, foundation studies, and
current modeling activities, and to Drew and Flewelling (1979), Newton and Weetman (1993)
and Newton et al. (2004) for specific details regarding the quantitative foundation of the
static, dynamic and structural model variants, respectively.

2. METHOD
The hierarchically designed SSDMM consisted of six sequentially-linked estimation
modules which were denoted according to the following nomenclature: Module A - Dynamic
SDMD; Module B - Diameter and Height Recovery; Module C - Taper Analysis and Log
Estimation; Module D - Biomass and Carbon Estimation; Module E - Product and Value
Estimation; and Module F - Fibre Attribute Estimation. Analytically, Module A
involved the development of a dynamic SDMD via the parameterization and integration of a
set of static and dynamic yield–density relationships. Module B consisted of the development
of a Weibull-based parameter prediction equation system and an associated composite height-
diameter prediction equation for recovering diameter and height distributions. Module C was
developed through the employment of species-specific dimensionally compatible taper
equations derived from the literature which were used to predict log products (number of
pulplogs and sawlogs) and stem volumes. Module D utilized species-specific allometric-
based composite biomass equations derived from both the literature (jack pine) and density
control experimental data (black spruce) to predict oven-dried masses and associated carbon
equivalents for all four above-ground components: bark, stem, branch and foliage. Module E
Development and Utility of an Ecological-based Decision-Support System … 119

employed species and sawmill (stud and randomized length mill) specific end-product and
value equations derived from the literature to predict the volume and monetary worth of the
manufactured end-products (wood chips and dimensional lumber). Module F involved the
development of composite equations for estimating wood density and mean maximum branch
diameter. An algorithmic analogue of the resultant modular-based SSDMM was also
developed and its utility exemplified by simultaneously contrasting a set of complex density
management regimes in terms of productivity metrics, log-product distributions, biomass
production and carbon yields, quantity and quality of recoverable end-products, economic
efficiency, duration of optimal site occupancy, and structural stability.
In presenting the methods, it should be noted that in cases where the required
relationships were previously developed in a concurrent investigation (i.e., parameter
prediction equation system, asymptotic size–density relationship, and the composite height-
diameter function) or reported in the literature (e.g., taper, end-product and value functions),
only an abridged version of the pertinent methods and results is included. In other cases
where the required relationships could be derived neither from the core calibration data set
(described below) nor from the literature, supplemental data sets were acquired and analyzed
from which the required functions were developed (e.g., data from Nelder plots were used to
develop mean live crown ratio and mean maximum branch diameter prediction functions, and
data from other types of density control experiments were used to parameterize the biomass
and wood density prediction functions). Although similarities exist with regard to the
monospecific model presented by Newton (2009) for jack pine stand-types, the mixed model
variant presented in this study expands, modifies and extends the modeling platform through
the introduction of species invariant relationships for bivariate mixtures, a response delay
function for accounting for post-thinning effects on stand dynamics, and new performance
metrics. In order to facilitate replication and application to other species, the principal
analytical methods, model structure, and the computational sequence utilized, are provided in
their entirety.

2.1. Stand-type Description, Core Data Set and Preliminary Calculations

Stands belonging to the natural even-aged black spruce and jack pine mixed stand-type,
denoted PImPNb(N), were defined as those which had the following attributes (Table 1): (1)
comprised principally of black spruce and jack pine in which the (i) density percentage had to
constitute a minimum of 90% and 10% on a collective and individual species basis,
respectively, for stands at the establishment stage of development, or alternatively, (ii) basal
area percentage had to constitute a minimum of 90% and 10% on a collective and individual
species basis, respectively, for stands past the establishment stage of development; (2)
situated on mineral soils and regenerated naturally following a stand-replacing disturbance
(e.g., wildfire or forest harvesting); and (3) a silvicultural history free of artificial
regeneration, precommercial thinning (PCT) or commercial thinning (CT) treatments.
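
The species-composition thresholds in criterion (1) can be expressed as a simple check. The following is a minimal sketch only: the function name and the use of fractional shares are illustrative assumptions, and the caller is assumed to supply density shares for stands at the establishment stage and basal area shares for older stands.

```python
def is_mixed_stand_type(spruce_share, pine_share):
    """Check the species-composition criterion for the PImPNb(N) stand-type.

    Shares are fractions (0-1) of total density (establishment stage)
    or of total basal area (past the establishment stage).
    """
    # Both species together must make up at least 90% of the stand.
    collective_ok = (spruce_share + pine_share) >= 0.90
    # Each species on its own must make up at least 10% of the stand.
    individual_ok = min(spruce_share, pine_share) >= 0.10
    return collective_ok and individual_ok
```

For example, a stand with 60% black spruce and 35% jack pine by basal area would satisfy the criterion, whereas one with 85% black spruce and 8% jack pine would not. Criteria (2) and (3) concern site and history and are not covered by this sketch.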
The calibration data for Modules A and B were derived from measurements obtained
from temporary and permanent sample plots (denoted TSPs and PSPs, respectively) located
throughout the central region of the Canadian Boreal Forest Region. In total, 382 tree-list
measurements derived from 155 sample plots were utilized. Geographically, these plots were
largely concentrated within Forest Sections B-4, B-7, B-8, B-9, B-10, B-11 and B-14 (Rowe,
120 Peter F. Newton

1972) which fall within the northeastern, northcentral and northwestern administrative
regions of the Province of Ontario. Historically, the plots were initially established by various
forest-based industrial corporations and government agencies during the 1930-1990 period
and were distributed across the landscape using a stratified pseudo-random sampling design
(n.b., strata were based on the age-class structure and site quality variation within a given
region or tenured landbase). The plots were circular, square or rectangular in shape with a
mean overall area of 0.082 ha (standard deviation (SD) = 0.021 ha; minimum/maximum =
0.040/0.120 ha) and density of 189 trees/plot (SD = 104 trees/plot; minimum/maximum =
42/926 trees/plot). For the purposes of this study, the plots were also differentiated based on their
utility in describing temporal change (stand dynamics): static plots were those that were only
measured once whereas dynamic plots were those that were measured more than once and
hence capable of providing change estimates in terms of survivor growth, ingress, ingrowth
and mortality. Table 1 summarizes the plot measurements according to the source
organization, series and type (static or dynamic) along with the number and frequency of
measurements obtained.
For each plot, its geographic location, disturbance history and the measurement
techniques utilized, were known. Generally, the measurements on each plot included species-
specific tree-lists consisting of diameter measurements at breast-height (1.37 m or 1.3 m) -
outside bark (D; ±0.25 cm) for all biotic trees greater than 2.54 cm in D, and D and total
height (H; ±0.30 m) measurements on a subset of approximately 10 black spruce and jack
pine sample trees, usually selected from across a plot's diameter range.
Combining the individual tree values (D and H measurements or estimates derived from
species-specific allometric-based height-diameter functions) with regional volume equations
and the plot area information, the following stand-level variables were calculated: (1) mean
dominant height (Hd; m), defined as the mean height of the trees within the largest height
quintile; (2) quadratic mean diameter (Dq; cm); (3) basal area (G; m²/ha); (4) total volume
(Vt; m³/ha), computed as the sum of individual tree total volumes (vT; m³/tree) as determined
from the D and H values inputted into region-wide standardized volume equations (Honer et
al., 1983); (5) merchantable volume (Vm; m³/ha), computed as the sum of the individual tree
merchantable volumes (vM; m³/tree) for all trees greater than 9 cm in D as determined from
the D and H values, vT estimate and specified merchantability limits (i.e., 0.15 m stump-height
and a 7.62 cm top diameter (inside-bark); Honer et al., 1983); (6) total density (N;
stems/ha); and (7) relative density index (Pr; %/100), defined as the ratio of N to the
maximum N attainable in a stand with the same mean volume (Drew and Flewelling, 1979;
Newton and Weetman, 1993; Newton, 2006a). The mensurational characteristics of the
datasets are summarized in Table 2.
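
The density, basal area, quadratic mean diameter and dominant height calculations can be illustrated with a short sketch. It is not the processing code used in the study: the function name is hypothetical, every tree is assumed to have a height value (measured or estimated), and "largest height quintile" is interpreted here as the tallest 20% of trees on the plot. The volume variables are omitted because they depend on the Honer et al. (1983) equations, which are not reproduced here.

```python
import math


def stand_variables(diameters_cm, heights_m, plot_area_ha):
    """Return N (stems/ha), G (m2/ha), Dq (cm) and Hd (m) for one plot."""
    n_trees = len(diameters_cm)
    density = n_trees / plot_area_ha
    # Basal area of one tree in m2 from D in cm: pi * (D / 200)^2.
    basal_area = sum(math.pi * (d / 200.0) ** 2 for d in diameters_cm) / plot_area_ha
    # Quadratic mean diameter: square root of the mean squared diameter.
    dq = math.sqrt(sum(d * d for d in diameters_cm) / n_trees)
    # Assumed reading of "largest height quintile": tallest 20% of trees.
    k = max(1, math.ceil(0.2 * n_trees))
    tallest = sorted(heights_m, reverse=True)[:k]
    hd = sum(tallest) / k
    return {"N": density, "G": basal_area, "Dq": dq, "Hd": hd}
```

A useful internal check is that G equals (π/40000)·Dq²·N, which follows directly from the definitions above.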
Table 1. Defining attributes of the natural black spruce and jack pine mixed
stand-type and plot data sources, types and their measurement frequency

Stand-type denotation: PImPNb(N)

Principal defining attributes: Even-aged black spruce and jack pine mixed stands situated on
mineral soils which regenerated naturally following a stand-replacing disturbance (e.g.,
wildfire or forest harvesting) with no prior history of artificial regeneration or density
management treatments (e.g., precommercial or commercial thinning). Black spruce and jack
pine collectively constituted >90% of the total density at the time of establishment, or >90%
of the basal area of established stands, with each species constituting a minimum of 10% of
either measure. Similar to the JP2 working group analog (Watt et al., 2001).

Plot series and number of plots by measurement type

Series(a)                      Static(b)   Dynamic(b)
Kimberly Clark (PSPs)              -           20
American-Can (PSPs)                -           33
Boreal-Growth (OMNR/TSPs)         66            -
PGP (TSPs)                        36            -

Measurement sequence and number

# of consecutive measurements   # of plots
1                                  102
2                                    1
3                                    2
4                                    9
5                                   17
6                                   19
7                                    3
8                                    2
Total # of measurements: 382

a - Plot series denoted according to Ontario-centric nomenclature where the host organization responsible for initial plot establishment is
acknowledged: corporation or governmental agency where OMNR refers to the Ontario Ministry of Natural Resources. Note, TSPs and PSPs
denote temporary and permanent sample plots, respectively.
b - Dynamic plots were those for which a remeasurement was obtained whereas static plots were those for which only a single measurement was
obtained.

2.2. Development of the Dynamic SDMD for Mixed Stands (Module A)

As described in detail below, the development of the dynamic SDMD consisted of the
parameterization of the following relationships: self-thinning line and the derived relative
density index function; yield–density relationships and the associated isolines for quadratic
mean diameter, mean dominant height and mean live crown ratio; mean volume–density
relationships at the time of initial crown closure and those delineating the zone of maximum
production; and net density change function for predicting post-crown-closure size–density
trajectories. Subsequent integration of these relationships within the traditional modeling
framework resulted in the dynamic SDMD for mixed stands.

2.2.1. Self-thinning Line and Relative Density Index Equation


For stands incurring density-dependent mortality, the asymptotic logarithmic relationship
between mean volume per stem ($\bar{v}$; dm³) and density per unit area (N; stems/ha), commonly
referred to as the self-thinning line (Yoda et al., 1963), was quantified using Eq. (1).

$$\log_{10}(\bar{v}) = \beta_0 + \beta_1 \log_{10}(N) + \log_{10}(\varepsilon) \tag{1}$$

where $\beta_0$ and $\beta_1$ are intercept and slope (self-thinning) coefficients, respectively, and $\varepsilon$ is
an error term. The parameter estimates for self-thinning mixed stands, as previously reported
by Newton (2006a), were used in this study (Table 3). The function for calculating relative
density index (Pr; Drew and Flewelling, 1979), defined as the ratio of a stand's observed
density ($N_o$) to that of the maximum density ($N_{max}$) attainable in a stand with the same mean
volume ($\bar{v}_o$), was derived from the self-thinning rule (Eq. (2); Table 3).

$$P_r = N_o \Big/ \left( \frac{\bar{v}_o}{10^{\beta_0}} \right)^{1/\beta_1} \tag{2}$$

Thus solving Eq. (2) for $\log_{10}(\bar{v})$ yielded the isoline function for a given Pr value (Eq.
(3)).

$$\log_{10}(\bar{v}) = \log_{10}\left( \frac{10^{\beta_0}}{P_r^{\beta_1}} \right) + \beta_1 \log_{10}(N) \tag{3}$$
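
The relationship between the self-thinning line, the relative density index and its isolines can be sketched as follows. This is an illustrative sketch rather than the study's implementation: the function names are hypothetical, the default coefficients are the Eq. (1) estimates listed in Table 3, and Pr is assumed to be expressed as a proportion (%/100).

```python
import math

B0, B1 = 6.145, -1.181  # Eq. (1) intercept and slope estimates (Table 3)


def max_density(v_mean, b0=B0, b1=B1):
    """Density (stems/ha) on the self-thinning line for a mean volume (dm3)."""
    return 10 ** ((math.log10(v_mean) - b0) / b1)


def relative_density_index(n_obs, v_mean, b0=B0, b1=B1):
    """Eq. (2): observed density over the maximum density at the same mean volume."""
    return n_obs / max_density(v_mean, b0, b1)


def pr_isoline_log_v(n, pr, b0=B0, b1=B1):
    """Eq. (3): log10 mean volume on the isoline of a given Pr at density n."""
    return b0 - b1 * math.log10(pr) + b1 * math.log10(n)
```

As a consistency check, a point generated on the Pr = 0.5 isoline, e.g. `10 ** pr_isoline_log_v(2000, 0.5)`, returns a relative density index of 0.5 when passed back through `relative_density_index`, and any point on the self-thinning line itself returns 1.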
Table 2. Mensurational characteristics of the sample trees and stands
utilized to develop the modular-based SSDMM for mixed stands

Stand-type(a) (np; ns; nt): PImPNb(N) (382; 362; 1290)

Variable(b) (units)    Mean      Standard     Minimum    Maximum
                                 deviation
A (yr)                 100       26           33         170
Hd (m)                 18.78     2.80         7.86       23.78
Dq (cm)                14.78     3.49         5.80       25.00
G (m²/ha)              35.16     7.94         3.24       52.82
v̄ (dm³)                145.54    81.22        9.22       463.32
Vt (m³/ha)             270.60    71.84        11.29      428.52
Vm (m³/ha)             198.79    72.19        0.00       369.78
N (stems/ha)           2387      1281         543        8342
Pr (%)                 83.38     21.00        4.12       126.92
SI (m)                 14.84     2.18         9.75       25.44
Dmin (cm)              3.24      1.62         1.30       9.90
â                      1.91      1.19         0.09       6.203
b̂                      13.16     4.00         3.39       23.38
ĉ                      2.57      0.80         1.00       4.44
D (cm)                 17.18     6.09         2.50       35.30
H (m)                  15.73     4.43         2.45       25.60
a - As defined in Table 1; np denotes the number of plot measurements utilized to establish the relationships used in the dynamic SDMD; ns denotes
the number of plot measurements associated with the development of the diameter distribution parameter prediction equation system (Newton and
Amponsah, 2005); and nt denotes the number of individual diameter at breast-height (D) and height (H) measurement pairs used to develop the
composite H-D relationship (Newton and Amponsah, 2007).
b - A, Hd, Dq, G, v̄, Vt, Vm, N, Pr, SI and Dmin denote the following stand-level variables: mean stand-age, mean dominant height, quadratic mean
diameter, basal area, mean volume, total volume, merchantable volume, total density, relative density index, mean site index value, and minimum
diameter observed within the empirical diameter frequency distributions, respectively. â, b̂ and ĉ denote maximum likelihood estimates of the
location, scale and shape parameters, respectively, of the 3-parameter Weibull probability density function. Note, site index was calculated as the
mean of the species-specific values using the functions developed by Carmean et al. (2001, 2006).

2.2.2. Yield-density Relationships and Associated Isolines for Dq, Hd and Live Crown Ratio

The yield-density relationships and associated isolines for quadratic mean diameter and
mean dominant height were developed using the model specifications and parameterization
techniques described by Newton and Weetman (1993). Firstly, Dq was expressed as a function
of $\bar{v}$ and N using Eq. (4).

$$\log_{10}(D_q) = \alpha_0 + \alpha_1 \log_{10}(\bar{v}) + \alpha_2 \log_{10}(N) + \log_{10}(\varepsilon) \tag{4}$$

where $\alpha_i$, $i = 0, 1, 2$, are parameters estimated by ordinary least squares (OLS) regression
analysis, and $\varepsilon$ is an error term. The resultant relationship was evaluated on the basis of its
statistical significance, proportion of variation explained, and compliance with the principal
regression assumptions underlying OLS. Residual graphical analysis and associated statistical
indices were used to identify and remove outliers and influential observations. Specifically,
studentized deleted residuals and Cook's distance were used to detect outliers and influential
observations, respectively, where the probability level for exclusion was set at 0.01 for both
measures (Neter et al., 1990). The regression results suggested that the fitted function described
the relationship well in that it was significant (p ≤ 0.05), explained a large proportion of
variation as determined from the multiple coefficient of determination (R²), and did not
violate the constant error variance, correct model specification and normality assumptions as
inferred from residual analyses (i.e., graphical examination of raw and studentized residual
plots, and normal probability plots). Table 3 lists the resultant parameter estimates and
associated regression statistics for Eq. (4).
The Dq isoline function which defines the relationship between $\log_{10}(\bar{v})$ and
$\log_{10}(N)$ for a given Dq value (Eq. (5)) was derived by solving Eq. (4) for $\log_{10}(\bar{v})$.

$$\log_{10}(\bar{v}) = \frac{\log_{10}(D_q) - \alpha_0 - \alpha_2 \log_{10}(N)}{\alpha_1} \tag{5}$$
Table 3. Parameter estimates and associated statistics for the regression relationships developed in Module A (Dynamic SDMD)

Relationship: Asymptotic $\log_{10}\bar{v}$ - $\log_{10}N$ (Eq. (1))
Parameter estimates: $\hat{\beta}_0$ = 6.145; $\hat{\beta}_1$ = -1.181
Statistical notes:
(1) Non-parametric bias-adjusted bootstrap parameter estimates obtained via bisector ordinary least squares (OLS) regression analysis (Isobe et al., 1990).
(2) Bias-corrected 95% percentile confidence intervals (Meyer et al., 1986): 5.908 ≤ $\hat{\beta}_0$ ≤ 6.353 and -1.242 ≤ $\hat{\beta}_1$ ≤ -1.111.
(3) Regression statistics: degrees of freedom for regression (nreg) and residual error (nres) = 1, 31, respectively; and product-moment correlation coefficient (r) = -0.993.
(4) Derived relative density index (Pr (%/100)) equation (Eq. (2)): $P_r = N_o / (\bar{v}_o / 10^{6.145})^{-0.805}$.
(5) Source: Newton (2006b).

Relationship: Quadratic mean diameter = f(mean volume and density) (Eq. (4))
Parameter estimates: $\hat{\alpha}_0$ = 0.5155; $\hat{\alpha}_1$ = 0.3674; $\hat{\alpha}_2$ = -0.0377
Statistical notes:
(1) Parameter estimates obtained from OLS regression analysis.
(2) The intercept parameter estimate includes a correction factor for the bias introduced via the logarithmic transformation (Baskerville, 1972; Sprugel, 1983).
(3) Regression statistics: nreg = 2; nres = 358; multiple coefficient of determination (R²) = 0.981; standard error of the estimate (SEE; log10(cm)) = 0.0147; and F-ratio = 9430* where * denotes a significant (p ≤ 0.05) relationship.

Relationship: Mean volume = f(mean dominant height and quadratic mean diameter) (Eq. (6))
Parameter estimates: $\hat{\delta}_0$ = -1.3715; $\hat{\delta}_1$ = 0.8576; $\hat{\delta}_2$ = 2.0495
Statistical notes:
(1) Parameter estimates obtained from OLS regression analysis.
(2) The intercept parameter includes a correction factor as described above.
(3) Regression statistics: nreg = 2; nres = 356; R² = 0.995; SEE (log10(dm³)) = 0.0183; and F-ratio = 38547*.

Relationship: Mean live crown ratio = f(mean volume and density) (Eq. (8))
Parameter estimates: $\hat{\gamma}_0$ = 3.4027; $\hat{\gamma}_1$ = -0.3684
Statistical notes:
(1) Parameter estimates obtained from OLS regression analysis.
(2) The intercept parameter includes a correction factor as described above.
(3) Regression statistics: nreg = 2; nres = 1960; R² = 0.545; SEE (log10(%)) = 0.1139; and F-ratio = 1170*.

Relationship: Mean volume = f(density) at crown closure (Eq. (9))
Parameter estimates: $\hat{\phi}_0$ = 7.5197; $\hat{\phi}_1$ = -2.0724
Statistical notes:
(1) Parameter estimates obtained from OLS regression analysis.
(2) The intercept parameter includes a correction factor as described above.
(3) Regression statistics: nreg = 1; nres = 18; R² = 0.999; SEE (log10(dm³)) = 0.0113; and F-ratio = 33402*.

Relationship: Net density change model (Eq. (10))
Parameter estimates: $\hat{\theta}_0$ = -0.1010; $\hat{\theta}_1$ = 0.2756; $\hat{\theta}_2$ = -0.3055; $\hat{\theta}_3$ = -0.3716
Statistical notes:
(1) Parameter estimates obtained from nonlinear regression analysis.
(2) Regression statistics: nreg = 3; nres = 215; SEE (log10(stems/ha)) = 18.28; and F-ratio = 1164004*.

The Hd isoline function which defines the relationship between $\log_{10}(\bar{v})$ and
$\log_{10}(N)$ for a given Hd value (Eq. (7)) was derived by expressing $\bar{v}$ as a function of Hd and
Dq using Eq. (6), substituting Eq. (4) into Eq. (6), and then solving for $\log_{10}(\bar{v})$.

$$\log_{10}(\bar{v}) = \delta_0 + \delta_1 \log_{10}(H_d) + \delta_2 \log_{10}(D_q) + \log_{10}(\varepsilon) \tag{6}$$

$$\log_{10}(\bar{v}) = \frac{\delta_0 + \delta_2 \alpha_0 + \delta_1 \log_{10}(H_d) + \delta_2 \alpha_2 \log_{10}(N)}{1 - \delta_2 \alpha_1} \tag{7}$$

where $\delta_i$, $i = 0, 1, 2$, are parameters estimated by OLS regression analysis, $\alpha_i$ are the
parameters of Eq. (4), and $\varepsilon$ is an error term. Applying to Eq. (6) the same outlier and
influential observation detection procedures and statistical evaluation protocol as described
for Eq. (4) indicated that the fitted
relationship was significant (p ≤ 0.05), explained a large proportion of the variation, and was in
general compliance with the constant error variance, correct model specification and normality
assumptions. Table 3 lists the resultant parameter estimates and associated regression
statistics for Eq. (6).
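
The Dq and Hd isolines (Eqs. (5) and (7)) can be traced numerically as sketched below. The function and constant names are illustrative assumptions; the coefficient values are the Eq. (4) and Eq. (6) estimates listed in Table 3.

```python
import math

A0, A1, A2 = 0.5155, 0.3674, -0.0377   # Eq. (4) estimates (Table 3)
D0, D1, D2 = -1.3715, 0.8576, 2.0495   # Eq. (6) estimates (Table 3)


def dq_isoline_log_v(n, dq):
    """Eq. (5): log10 mean volume (dm3) at density n for a fixed Dq (cm)."""
    return (math.log10(dq) - A0 - A2 * math.log10(n)) / A1


def hd_isoline_log_v(n, hd):
    """Eq. (7): log10 mean volume (dm3) at density n for a fixed Hd (m)."""
    numerator = D0 + D2 * A0 + D1 * math.log10(hd) + D2 * A2 * math.log10(n)
    return numerator / (1.0 - D2 * A1)
```

Evaluating either function over a range of densities for a fixed Dq or Hd value gives the points of one isoline on the logarithmic mean volume-density plane. A point on an Hd isoline can be verified by inserting it into Eq. (4) to obtain Dq and then confirming that Eq. (6) returns the same mean volume.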
Live crown ratio (Lr) is defined as the length of the live crown divided by total stem
height and is expressed as a percentage. The Lr isoline within the context of the SDMD
describes the relationship between $\log_{10}(\bar{v})$ and $\log_{10}(N)$ for a given Lr value. In this
study, the isoline was derived by expressing Lr as a function of $\bar{v}$ and N using Eq. (8), and
then solving the resultant equation for $\log_{10}(\bar{v})$, as described below.

$$\log_{10}(L_r) = \gamma_0 + \gamma_1 \log_{10}(\bar{v}) + \gamma_2 \log_{10}(N) + \log_{10}(\varepsilon) \tag{8}$$

where $\gamma_i$, $i = 0, 1, 2$, are parameters estimated by OLS regression analysis, and $\varepsilon$ is an error
term. The calibration data set used to parameterize Eq. (8) consisted of 1960 Lr, $\bar{v}$ and N
measurements obtained from thirty 40 year-old Nelder plots which were established within
the northeast region of the Province of Ontario. Briefly, these plots were established in the
late 1960's using the 1a configuration design as defined by Nelder (1962). Specifically, each
plot consisted of 24 concentric arcs transected by 60 equally-spaced radii or spokes. The radii
were 47 m in length and randomly clustered into twelve 5-radii sectors consisting of jack
pine, black spruce or white spruce (Picea glauca (Moench) Voss). Along each spoke, 24 trees
were planted at geometrically increasing intertree distances, initiating at a 0.5 m and
terminating at a 4.32 m intertree radial distance (nominal density equivalents for these
spacings were 42698 stems/ha and 429 stems/ha, respectively). Employing these Lr, $\bar{v}$ and N
observations in combination with the same regression procedures as that previously described
for Eq. (4), OLS parameter estimates were obtained. Regression results indicated that the
equation described the relationship moderately well given that the relationship was significant
(p ≤ 0.05), explained a moderate proportion of the variation, and was in general compliance
with the constant error variance, correct model specification and normality assumptions. The
actual parameter estimates and associated regression statistics for Eq. (8) are listed in Table 3.
To derive the isolines for Lr values of 35, 40, 50, 60, 70 and 80%, Eq. (8) was
rearranged with respect to $\log_{10}(\bar{v})$, i.e., $\log_{10}(\bar{v}) = (\log_{10}(L_r) - \gamma_0 - \gamma_2 \log_{10}(N)) / \gamma_1$,
and $\log_{10}(\bar{v})$ was calculated across the density range for the specified Lr value.
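
The isoline calculation for the listed Lr values can be sketched as follows. The function name and the density grid are illustrative assumptions. The intercept and mean volume coefficients follow the Eq. (8) values listed in Table 3, whereas the density coefficient used here is a made-up placeholder included only so that the example runs; it is not a fitted value.

```python
import math


def lr_isoline_log_v(n, lr_percent, g0, g1, g2):
    """Rearranged Eq. (8): log10 mean volume at density n for a fixed Lr (%)."""
    return (math.log10(lr_percent) - g0 - g2 * math.log10(n)) / g1


# g0 and g1 follow Table 3; G2_PLACEHOLDER is hypothetical, not a fitted value.
G0, G1, G2_PLACEHOLDER = 3.4027, -0.3684, -0.30

densities = [500, 1000, 2000, 4000, 8000]  # stems/ha, illustrative grid
isolines = {
    lr: [lr_isoline_log_v(n, lr, G0, G1, G2_PLACEHOLDER) for n in densities]
    for lr in (35, 40, 50, 60, 70, 80)
}
```

Each entry of `isolines` holds the log10 mean volumes of one Lr isoline across the density grid; substituting the fitted density coefficient for the placeholder is required before the values carry any meaning.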

2.2.3. Size-density Relationship at the Time of Initial Crown Closure


In order to generate $\bar{v}$-N values at the time of initial crown closure, species-specific
allometric equations for the relationship between crown width (Cw; m) and D, and between H
and D, for open grown trees, were used in combination with total stem volume equations, and
spatial pattern and crown cover assumptions. Linear regression analysis was then used to
establish relationships between the $\bar{v}$ and N values, thereby quantifying the size-density
condition at the time of initial crown closure.
More specifically, Cw, H and vT values were estimated across the range of Dq values
observed within the core dataset (Table 2). For black spruce, Cw and H for a given D were
calculated employing the allometric Cw-D and H-D equations developed by Newton and
Weetman (1993) for open grown trees situated on mineral sites throughout central insular
Newfoundland ($C_w = 0.870D^{0.513}$ and $H = 0.974D^{0.769}$, respectively). Given the resultant H
estimate, vT for a given D was estimated employing the total volume equation developed by
Honer et al. (1983). Similarly, for jack pine, Cw for a given D was estimated employing the
allometric Cw-D equations developed by Bella (1967; $C_w = 0.844 + 0.208D$), Vezina (1963;
$C_w = 0.536 + 0.245D$) and Newton (this study; $C_w = 1.166D^{0.535}$) for open grown trees situated on
mineral sites in Manitoba and Quebec. A regional-invariant value for Cw was calculated as the
arithmetic mean of the 3 separate Cw estimates. H for a given D was estimated employing the
H-D allometric equation developed by Newton (this study; $H = 1.856D^{0.395}$) for open grown
trees situated on mineral sites in Manitoba. Given the resultant H estimate, vT for a given D
was estimated employing the Honer et al. (1983) total volume equation. Note, (1) Newton's
Cw-D and H-D allometric equations for jack pine as described above were developed from
851 tree measurements obtained in 1973 from the 2.4 m and 3.0 m spacing treatments within
the Moodie spacing trial (Bella and de Franceschi, 1974); (2) all parameter estimates were
obtained via OLS regression analysis on logarithmically transformed data; and (3) resultant
regression statistics and residual analyses indicated the resultant equations adequately
described the Cw-D and H-D relationships (i.e., coefficients of determination (r²) = 0.653
(Cw-D) and 0.847 (H-D); standard errors of the estimate (SEE; loge(m)) = 0.182 (Cw-D) and 0.080
(H-D); and significant (p ≤ 0.05) F-ratios = 1588.57 (Cw-D) and 4723.5 (H-D)). Thus, given a
mean Cw estimate for a fixed D, unadjusted density estimates (N'; stems/ha) corresponding
to complete crown closure (contiguous crown cover of 10000 m² of projected crown area per
ha) were estimated based on the expected mean interplant distance – density relationship for
randomly dispersed spatial patterns (Table 1 in De vos (1973)): $N' = 10000(1.0937/C_w)^2$,
where Cw was used as a surrogate for the mean interplant distance between 4 nearest
neighbours within a randomly dispersed population. Acknowledging that complete crown
closure will not be attainable under the assumption that crown bases are circular in shape (i.e.,
circular crowns can only occupy a maximum of 78.54% of the available area (Smith, 1989)), the
unadjusted density estimates were reduced by 21.46% ($N = 0.7854N'$). Using vT as a
surrogate for $\bar{v}$, this sequence of computations resulted in a
data set consisting of multiple $\bar{v}$-N data pairs for the initial crown closure condition. These
data pairs were then used to parameterize Eq. (9).

$$\log_{10}(\bar{v}) = \phi_0 + \phi_1 \log_{10}(N) + \log_{10}(\varepsilon) \tag{9}$$

where $\phi_i$, $i = 0, 1$, are parameters estimated employing OLS regression analysis, and $\varepsilon$ is an
error term. Employing the same detection procedures and evaluation protocol as that specified
for Eq. (4), regression results suggested that the relationship was adequate in describing the
size-density condition at the time of initial crown closure (i.e., the relationship was significant (p ≤
0.05), explained a moderate proportion of the variation, and was in compliance with the constant
error variance, correct model specification and normality assumptions). Table 3 lists the
parameter estimates and associated regression statistics for Eq. (9).
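
The crown width and crown closure density steps that precede the fitting of Eq. (9) can be sketched as shown below. The function names are hypothetical and the sketch stops at the density estimate: the corresponding mean volumes require the Honer et al. (1983) total volume equation, which is not reproduced here, and the species are kept separate because the manner in which they were combined for the mixed stand condition is not detailed in this section.

```python
def crown_width_black_spruce(d_cm):
    """Open-grown black spruce crown width (m); Newton and Weetman (1993)."""
    return 0.870 * d_cm ** 0.513


def crown_width_jack_pine(d_cm):
    """Mean of the three open-grown jack pine crown width estimates (m)."""
    estimates = (
        0.844 + 0.208 * d_cm,    # Bella (1967)
        0.536 + 0.245 * d_cm,    # Vezina (1963)
        1.166 * d_cm ** 0.535,   # Newton (this study)
    )
    return sum(estimates) / len(estimates)


def crown_closure_density(cw_m):
    """Stems/ha at initial crown closure for a mean crown width (m)."""
    unadjusted = 10000.0 * (1.0937 / cw_m) ** 2  # N' for a random spatial pattern
    return 0.7854 * unadjusted                   # circular-crown adjustment
```

For instance, a mean crown width of 2 m corresponds to an unadjusted density of roughly 2990 stems/ha and an adjusted crown closure density of roughly 2349 stems/ha.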

2.2.4. An Approximate Optimal Density Management Window for Mixed Stands


Net production within stands of a given species, site quality and stage of stand
development will increase with increasing site occupancy until a maximum level is reached,
according to the forest production theories espoused by Langsaeter (1941), Mar:Möller
(1954) and Assmann (1970). The generality of these hypotheses can be used by forest
managers to define the site occupancy or stocking levels at which forest productivity is
maximized. Output from a hybrid biomass/carbon production model for upland black
spruce stands (Newton, 2006b) suggests that this productivity asymptote is achieved when
relative density indices are between 0.32 and 0.45. However, similar results for monospecific
jack pine stands or for black spruce and jack pine mixed stands have yet to be reported.
Consequently, the 0.32-0.45 relative density range was employed as an
approximation to the optimal density management window for the mixed stand condition.
Given that both species share similarities in terms of allometric and scaling relationships,
competition processes and density-dependent mortality patterns, this approximation was
tentatively accepted.

2.2.5. Net Density Change Function


A modified version of Khil'mi's (1957) survival model was used to describe the temporal
pattern of net density change (survivor density – mortality density + ingress density) within
mixed stands. Specifically, assuming that the temporal change in stand density is dependent
on absolute density, intensity of competition, stage of stand development and site quality,
Khil'mi's model was modified accordingly, i.e., by embedding surrogate measures of these
factors (current density, Pr, stand age (A) and site index (SI), respectively) directly within the
functional form (Eq. (10)).

$$N_{t+1} = N_{\min(t)} \left( \frac{N_t}{N_{\min(t)}} \right)^{\exp\left( \theta_0 P_r^{\theta_1} A^{\theta_2} S_I^{\theta_3} \right)} + \varepsilon \tag{10}$$

where $N_{t+1}$ is the density at time t+1 (t = 1,…, T-1; T = rotation age), $N_t$ is the density at time
t, $N_{\min(t)}$ is the minimum asymptotic density at which density-dependent processes initiate,
defined as the density at the time of crown closure ($N_{\min(t)} = 10^{(\log_{10}(\bar{v}) - \phi_0)/\phi_1}$ as derived from
Eq. (9)), $\theta_i$, $i = 0, \ldots, 3$, are model parameters, and $\varepsilon$ is an error term. Computationally, mean
annual net density change estimates were derived from the dynamic sample plots as follows.
Initial and final densities for each successive plot measurement that had attained crown
closure status were calculated from which the mean annual net density change estimate was
derived: ΔN = (N2-N1)/(A2-A1) where ΔN is in units of stems/ha/yr, and N2 and N1 are the
densities (stems/ha) at stand ages A2 and A1, respectively. These change estimates (Nt (i.e., N1)
and Nt+1 (Nt + ΔN)) along with the corresponding Pr, A and SI values were used in
combination with nonlinear regression analysis to parameterize Eq. (10). Using the same
evaluation protocol as that described for Eq. (4), the regression results suggested that the
relationship was significant (p ≤ 0.05) and was in compliance with the constant error variance,
correct model specification and normality assumptions. Table 3 lists the resultant parameter
estimates and associated regression statistics for Eq. (10).
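
A single annual step of the net density change function, together with the mean annual change calculation used to prepare the calibration data, can be sketched as follows. The function names are hypothetical, the coefficients are the Eq. (9) and Eq. (10) estimates listed in Table 3, the error term is omitted, and Pr is assumed to enter the function as a proportion (%/100).

```python
import math

PHI0, PHI1 = 7.5197, -2.0724                  # Eq. (9) estimates (Table 3)
THETA = (-0.1010, 0.2756, -0.3055, -0.3716)   # Eq. (10) estimates (Table 3)


def crown_closure_n(v_mean):
    """N_min(t): crown closure density (stems/ha) for a mean volume (dm3)."""
    return 10 ** ((math.log10(v_mean) - PHI0) / PHI1)


def next_density(n_t, v_mean, pr, age, site_index, theta=THETA):
    """One annual step of Eq. (10), without the error term."""
    n_min = crown_closure_n(v_mean)
    t0, t1, t2, t3 = theta
    exponent = math.exp(t0 * pr ** t1 * age ** t2 * site_index ** t3)
    return n_min * (n_t / n_min) ** exponent


def mean_annual_change(n1, n2, a1, a2):
    """Mean annual net density change (stems/ha/yr) between two measurements."""
    return (n2 - n1) / (a2 - a1)
```

Because the fitted exponent is slightly below one, a stand whose density exceeds the crown closure density moves gradually towards it from one year to the next, with the rate governed by Pr, age and site index.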

2.2.6. Resultant Dynamic SDMD


Superimposing the parameterized form of the asymptotic size-density relationship,
isolines for Hd, Dq, Pr and Lr, size-density condition at the time of initial crown closure,
lower and upper Pr isolines delineating the optimal density management window, and
site-specific size-density trajectories, on a logarithmic $\bar{v}$-N bivariate graph, yielded the
graphical version of the dynamic SDMD for black spruce and jack pine mixtures (n.b.,
presented later in Figure 3). As described below, this dynamic SDMD represents the core
prediction system within the modular-based SSDMM in that its output is used as input in all
the remaining modules.

2.3. Description of the Diameter Distribution Recovery and Height Prediction Equations (Module B)

2.3.1. Parameter Prediction Equation System for Diameter Distribution Recovery


At each annual step of stand development, the grouped-diameter frequency distribution
was recovered from the stand-level variables generated from the dynamic SDMD employing
a parameter prediction equation (PPE) system. Conceptually, this technique is similar to the
parameter prediction method that is used to develop stand-level diameter distribution yield
models, in that, parameters of a specified probability density function (PDF) characterizing
the diameter frequency distribution are predicted from a set of stand-level variables (sensu
Hyink and Moser, 1983). Briefly, the method used to develop the PPE system consisted of
first modeling the diameter distribution within each of the PSPs and TSPs using the
3-parameter Weibull PDF (Eq. (11); Weibull, 1951) and obtaining the resultant maximum
likelihood estimates (MLEs) for the location, scale and shape parameters (number of plot
measurements utilized = 362; Table 2).

$$f(D; a, b, c) = \begin{cases} \dfrac{c}{b} \left( \dfrac{D - a}{b} \right)^{c-1} \exp\left[ -\left( \dfrac{D - a}{b} \right)^{c} \right] & \text{if } a \le D \\ 0 & \text{if } D < a \end{cases} \tag{11}$$

where a is the location parameter and is less than or equal to the minimum D value (Dmin), b
is the scale parameter which reflects the range of the distribution, and c is the shape parameter
which reflects the degree of skewness within the D distribution. These parameter estimates
were then expressed as direct or indirect functions of a set of stand-level variables, as shown
in Eqs. (12-15), using multiple regression analysis and cumulative distribution function regression
(CDFR; Cao, 2004) analysis.

aˆ  0.5 Dmin (12)

Dmin = 0 +1  H d or log e H d   2  Dq or log e Dq   3  G or log e G 


+ 4  v or log e v   5 Vt or log e Vt  +6  N or log e N  (13)
 7  Pr or log e Pr   

bˆ = 0 +1  H d or log e H d   2  Dq or log e Dq   3  G or log e G 


+4  v or log e v   5 Vt or log e Vt  +6  N or log e N  (14)
 7  Pr or log e Pr   

cˆ =  0 +1  H d or log e H d    2  Dq or log e Dq    3  G or log e G 


+ 4  v or log e v  +5 Vt or log e Vt  + 6  N or log e N  (15)
  7  Pr or log e Pr   

where  l , l and  l , l  0,..., 7 , are equation-specific parameters, and ε is an equation-


specific error term. Table 4 lists the resultant parameter estimates and associated regression
statistics for the CDFR-based PPE system. The actual grouped-diameter frequency
distribution for a specified set of stand-level variables is estimated using the cumulative
distribution function (CDF) analogue of Eq. (11), as shown in Eq. (16).

$$
F(D; a, b, c) = 1 - \exp\left[-\left(\frac{D-a}{b}\right)^{c}\right] \tag{16}
$$

Table 4. Parameter estimates and associated statistics for the regression relationships used in Module B (Diameter and Height
Recovery): parameter prediction equation (PPE) system and the composite height-diameter function

Parameter estimates by parameter index l (a dash denotes a parameter not included in the fitted equation)

Relationship                               l=0        l=1        l=2       l=3        l=4       l=5      l=6        l=7
Weibull-based PPE system: Dmin (Eq. (13))  20.9051    0.1547     -0.4202   0.3728     0.0170    -        (-2.5931)  -12.1053
Weibull-based PPE system: b̂ (Eq. (14))     -10.2463   (1.4935)   0.9551    (-0.7256)  0.0021    0.0269   (0.9719)   -8.2411
Weibull-based PPE system: ĉ (Eq. (15))     -          (-1.0673)  0.2005    0.0589     -0.0060   0.0666   (0.3549)   -22.8471
Height-diameter model (Eq. (17))           11.0470    0.5677     -0.2329   0.0111     -         1.0711   -          -0.3878

Statistical notes (a)

Dmin (Eq. (13)):
(1) Parameter estimates obtained employing OLS regression analysis where the independent variables were selected employing an
all-possible best-subset regression procedure using the Cp (Mallows, 1973) criterion; values in parentheses are associated with
logarithmically transformed independent variables.
(2) Regression statistics: nreg = 5, nres = 356, R2 = 0.336, SEE = 1.329, F-ratio = 30* and Cp statistic = 6.15.
(3) Source: Newton and Amponsah (2005).

b̂ (Eq. (14)):
(1) Parameter estimates obtained employing CDFR (Cao, 2004) where values in parentheses are associated with logarithmically
transformed independent variables.
(2) Regression statistics calculated from the observed and predicted values and hence are approximations: nreg = 7, nres = 354,
R2 = 0.779, SEE = 1.896, and F-ratio = 178.4*.
(3) Source: Newton and Amponsah (2005).

ĉ (Eq. (15)):
(1) Parameter estimates obtained employing CDFR (Cao, 2004) where values in parentheses are associated with logarithmically
transformed independent variables.
(2) Regression statistics calculated from the observed and predicted values and hence are approximations: nreg = 7, nres = 354,
R2 = 0.578, SEE = 0.525, and F-ratio = 81*.
(3) Source: Newton and Amponsah (2005).

Height-diameter model (Eq. (17)):
(1) Parameter estimates obtained from OLS regression analysis where the independent variables were selected employing an
all-possible best-subset regression procedure using the Cp (Mallows, 1973) criterion.
(2) The intercept parameter estimate includes a correction factor for the bias introduced via the logarithmic transformation
(Baskerville, 1972; Sprugel, 1983).
(3) Regression statistics: nreg = 5, nres = 1272, R2 = 0.872, SEE (loge(m)) = 0.1163, F-ratio = 1737*, and Cp statistic = 5.48.
(4) Source: Newton and Amponsah (2007).

a - Cp denotes Mallows' (1973) Cp statistic; all other denotations are as defined in Table 3.
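
The recovery step described above (predict the Weibull parameters, then difference the CDF of Eq. (16) across diameter-class limits) can be illustrated with a minimal Python sketch. The 2-cm class width, the 60-cm upper limit, and the parameter values used in the usage lines are hypothetical choices made for illustration only; they are not taken from Table 4, and the 10% truncation threshold rule mentioned later in the chapter is not implemented here.

```python
import math


def weibull_cdf(d, a, b, c):
    """Three-parameter Weibull CDF (Eq. (16)); zero below the location parameter a."""
    if d < a:
        return 0.0
    return 1.0 - math.exp(-(((d - a) / b) ** c))


def class_frequencies(n_total, a, b, c, class_width=2.0, d_max=60.0):
    """Allocate n_total stems/ha to diameter classes by differencing the CDF.

    Returns a dict mapping class midpoint (cm) to stems/ha.
    class_width and d_max are illustrative assumptions, not values from the chapter.
    """
    freqs = {}
    lower = 0.0
    while lower < d_max:
        upper = lower + class_width
        # Proportion of stems falling between the lower and upper class limits.
        p = weibull_cdf(upper, a, b, c) - weibull_cdf(lower, a, b, c)
        if p > 0.0:
            freqs[lower + class_width / 2.0] = n_total * p
        lower = upper
    return freqs


# Hypothetical stand: 2000 stems/ha with a = 4 cm, b = 10 cm, c = 2.5.
example = class_frequencies(2000.0, a=4.0, b=10.0, c=2.5)
for midpoint in sorted(example):
    print(f"{midpoint:5.1f} cm  {example[midpoint]:8.1f} stems/ha")
```

Because the class proportions are differences of one CDF, they sum to the CDF value at the upper limit, so the class densities add back to the stand density as long as the upper limit lies well into the right tail.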

2.3.2. Composite Height-diameter Function


The composite regression function previously developed for black spruce and jack pine
mixtures by Newton and Amponsah (2007) was used to describe the relationship between H
and D. Briefly, in this companion study, a calibration dataset consisting of 1290 H-D pairs
and associated stand-level variables derived from 88 sample plots was used to parameterize
5 nonlinear H-D models. The most applicable form, based on a comprehensive set of
evaluation criteria (e.g., goodness-of-fit measures, lack-of-fit indices, and predictive ability),
was selected. Specifically, a multivariate allometric-based composite model with the
inclusion of stand-level predictor variables reflecting both density-stress and stage of
development (Pr and Hd, respectively) performed the best (Eq. (17)), and hence was utilized.

$$
H = \delta_0\, D^{\,\delta_1 + \delta_2 P_r + \delta_3 H_d + \delta_4 P_r H_d}\; P_r^{\,\delta_5}\, H_d^{\,\delta_6}\, (P_r H_d)^{\delta_7}\, \varepsilon \tag{17}
$$

where $\delta_l$, $l = 0, \ldots, 7$, are parameters, and ε is an error term. The resultant parameter
estimates and associated regression statistics are given in Table 4.
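
To make the structure of the composite function concrete, the following Python sketch evaluates a multiplicative allometric form in which the exponent on D varies with Pr and Hd. Reading Eq. (17) in this way is an assumption about the typeset equation, the error term is omitted, and the coefficient tuples in the usage lines are hypothetical rather than the Table 4 estimates.

```python
def composite_height(d, pr, hd, p):
    """Evaluate a composite height-diameter function of the assumed form of Eq. (17).

    d  : diameter at breast height
    pr : relative density index
    hd : dominant height
    p  : sequence of eight coefficients p[0]..p[7]; terms that were not
         selected in the regression are passed as 0.0
    """
    # Stand-dependent allometric exponent on diameter.
    exponent = p[1] + p[2] * pr + p[3] * hd + p[4] * pr * hd
    # Multiplicative stand-level modifiers.
    return p[0] * d ** exponent * pr ** p[5] * hd ** p[6] * (pr * hd) ** p[7]


# Hypothetical coefficients: with all stand-level terms set to zero the
# function collapses to the simple allometric equation p0 * D**p1.
simple = (1.5, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
print(composite_height(16.0, 0.5, 12.0, simple))
```

Passing the coefficients as a single sequence keeps the same function usable for any fitted subset of predictors, since excluded terms simply carry a zero coefficient.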

2.4. Description of the Taper Equations Utilized (Module C)

Because data limitations prevented the development of a taper equation for the
natural mixed stand-type, composite variable exponent taper equations developed previously
for monospecific black spruce (Sharma and Parton, 2009) and jack pine (Newton, 2009)
stands were utilized (Eqs. (18a) and (18b), respectively).

$$
d = \theta_0\, D \left(\frac{h}{1.3}\right)^{\theta_1 + \theta_2 \left(\frac{h}{H}\right) + \theta_3 \left(\frac{h}{H}\right)^{2} + \theta_4 \left(\frac{G}{D^{2}}\right)} \left(\frac{H - h}{H - 1.3}\right) + \varepsilon \tag{18a}
$$

$$
d = \theta_0\, D \left(\frac{h}{1.3}\right)^{\theta_1 + \theta_2 \left(\frac{h}{H}\right) + \theta_4 \left(\frac{G}{D^{2}}\right)} \left(\frac{H - h}{H - 1.3}\right) + \varepsilon \tag{18b}
$$

where d is the inside-bark diameter (cm) at height h (m), $\theta_i$, $i = 0, \ldots, 4$, are model
parameters, and ε is an equation-specific error term. These equations arose from a concurrent
study in which the objective was to develop dimensionally compatible taper equations for a
suite of boreal species (e.g., Newton and Sharma, 2008; Sharma and Parton, 2009). Briefly,
the equations were parameterized using stem profile data obtained from percent-height
destructive stem analysis sampling procedures on 113 jack pine and 1189 black spruce trees.
The sample trees were selected using a size-based stratified random sampling protocol within
9 jack pine and 25 black spruce stands. The actual parameter estimates are given in Table 5.

2.5. Description of the Biomass Equations Utilized (Module D)

The lack of available biomass data for mixed stands negated the development of stand-
specific equations. Alternatively, however, composite biomass functions previously
developed for estimating bark (periderm) mass per tree (Mp; oven-dry g), stem mass per tree
(Ms; oven-dry g), branch mass per tree (Mb; oven-dry g) and foliage mass per tree (Mf; oven-
dry g) within monospecific black spruce (Newton, unpublished) and jack pine (Newton,
2009) stands, were employed. Briefly, the composite models were based on a multivariate
extension of the simple equation of allometry in which stand-level measures of relative
density stress (Pr) and stage of development (Hd) were explicitly incorporated (e.g., Jolliffe et
al., 1988; Newton, 2006b). The resultant equations predict the oven-dried mass of each
component per tree using D, H, Pr and Hd as predictor variables (Eq. (19)).

$$
m_{(i)} = \lambda_{0(i)} \left(D^{2}H\right)^{\lambda_{1(i)} + \lambda_{2(i)} P_r + \lambda_{3(i)} H_d + \lambda_{4(i)} P_r H_d}\; P_r^{\,\lambda_{5(i)}}\, H_d^{\,\lambda_{6(i)}} \left(P_r H_d\right)^{\lambda_{7(i)}} \varepsilon_{(i)} \tag{19}
$$

where $m_{(i)}$ is the mass of the ith biomass component, $\lambda_{j(i)}$, $j = 0, \ldots, 7$, are parameters specific
to the ith component estimated via OLS regression analysis (all-possible best-subset
regression procedure employing Mallows' (1973) Cp selection criterion) following a double-
logarithmic transformation, and $\varepsilon_{(i)}$ is the error term specific to the ith component. For black
spruce, the parameterization data set consisted of mp, ms, mb, mf, D, H, Pr and Hd data derived
from 161 destructively sampled black spruce trees located in 52 variable-sized (minimum of
100 trees/plot) area plots which were established within 18 monospecific even-aged stands
located throughout Forest Section B28b (Rowe, 1972). The stands were selected according to
a stratified pseudo-random sampling approach in which an approximately equal number of
stands within each of 5 age classes (15, 30, 45, 60 and 75 year) were sampled. Similarly, for
jack pine, the parameterization procedure consisted of obtaining estimates via OLS regression
analysis using mp, ms, mb, mf, D, H, Pr and Hd data derived from 186 semi-mature jack pine
trees situated within 6 stands located in northern and northeastern Ontario. The sample trees
were selected for destructive sampling according to a size-based stratified random sampling
protocol. Table 6 lists the resultant component-specific parameter estimates and associated
regression statistics.
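
The fitting approach described above (OLS after a double-logarithmic transformation, followed by a correction of the back-transformed intercept) can be sketched for the simplest one-predictor case. The sketch below assumes the commonly used form of the logarithmic bias correction, in which the back-transformed intercept is multiplied by exp(SEE^2/2); the function names and the synthetic data are hypothetical and do not reproduce the Table 6 estimates or the best-subset selection step.

```python
import math


def fit_loglog(x, y):
    """OLS fit of ln(y) = b0 + b1 * ln(x); returns (b0, b1, see2).

    see2 is the residual mean square on the logarithmic scale (SEE squared).
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (w - mean_y) for u, w in zip(lx, ly))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((w - (b0 + b1 * u)) ** 2 for u, w in zip(lx, ly))
    return b0, b1, sse / (n - 2)


def corrected_intercept(b0, see2):
    """Back-transform the intercept with the exp(SEE^2 / 2) bias correction."""
    return math.exp(b0 + see2 / 2.0)


# Synthetic, noise-free data: mass = 3 * (D^2 H)^0.9 (hypothetical values).
d2h = [100.0, 500.0, 2000.0, 8000.0]
mass = [3.0 * v ** 0.9 for v in d2h]
b0, b1, see2 = fit_loglog(d2h, mass)
print(corrected_intercept(b0, see2), b1)
```

With noise-free data the residual variance is effectively zero, so the correction factor is one; with real data the factor exceeds one and raises the back-transformed predictions accordingly.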
Table 5. Parameter estimates and associated statistics for the composite
variable-exponent taper equations used in Module C (Taper Analysis and Log Estimation)

Parameter estimates (a) by parameter index i

Species (equation)          i=0      i=1       i=2      i=3       i=4      Source
Black Spruce (Eq. (18a))    0.9088   -0.0667   0.5410   -0.3636   0.0755   Eq. (7) in Table 3 of Sharma and Parton (2009)
Jack Pine (Eq. (18b))       0.9292   -0.0534   0.4035   -         0.1877   Eq. (18) in Table 5 of Newton (2009)

a - As reported in the source publication.

Table 6. Parameter estimates and associated statistics for the regression relationships
used in Module D (Biomass and Carbon Estimation): composite biomass equations by component (Eq. (19))

                          Parameter estimates (a) by parameter index j                           Regression statistics (b)
Species       Comp.     j=0'    j=1     j=2      j=3      j=4      j=5     j=6  j=7       nreg  nres  Cp    R2     SEE     F-ratio
Black Spruce  Bark      4.8     0.9524  -        -        -        -       -    -0.1507   2     156   2.39  0.992  0.2261  4613*
Black Spruce  Stem      14.2    0.9325  -        -        -        0.0006  -    0.2611    3     154   1.90  0.998  0.1265  11996*
Black Spruce  Branch    4.1     1.2048  -        -0.0127  -        -       -    -0.5282   3     154   2.34  0.898  0.5409  454*
Black Spruce  Foliage   16.6    0.9611  0.1522   -        -0.0121  -       -    -0.6127   4     154   3.96  0.892  0.4964  318*
Jack Pine     Bark      17.1    0.5787  0.2083   0.0136   -0.0175  -       -    -         4     181   3.74  0.808  0.2425  191*
Jack Pine     Stem      66.0    1.0096  -0.1471  -        -        0.6895  -    -         3     182   4.12  0.930  0.1716  802*
Jack Pine     Branch    330.3   -       1.1513   0.0570   -0.0503  -       -    -2.6615   4     181   3.95  0.863  0.3191  286*
Jack Pine     Foliage   345.1   -       1.1635   0.0551   -0.0498  -       -    -2.8438   4     181   3.91  0.836  0.3448  231*

a - Component-specific parameter estimates obtained from OLS regression analysis where the independent variables were selected
according to an all-possible best-subset regression procedure employing Mallows' (1973) Cp criterion. Note, the intercept parameter
estimates (index 0') include a correction factor for the bias introduced via the logarithmic transformation (Baskerville, 1972;
Sprugel, 1983), and Comp. denotes Component.
b - As defined in Table 3.

2.6. Description of the End-Product and Value Functions Utilized (Module E)

Species-based sawmill-specific product recovery and value functions, reported previously
in the literature, were utilized for mixed stands: Eq. (20a) for black spruce (Liu and Zhang,
2005; Zhang et al., 2006) and Eq. (20b) for jack pine (Newton, 2009).

$$
Y_{(i)} = \phi_{0(i)}\, D^{\,\phi_{1(i)}}\, H^{\,\phi_{2(i)}}\, \varepsilon_{(i)} \tag{20a}
$$

$$
Y_{(i)} = \phi_{0(i)} \left(D^{2}H\right)^{\phi_{1(i)} + \phi_{2(i)} P_r + \phi_{3(i)} H_d + \phi_{4(i)} P_r H_d}\; P_r^{\,\phi_{5(i)}}\, H_d^{\,\phi_{6(i)}} \left(P_r H_d\right)^{\phi_{7(i)}} \varepsilon_{(i)} \tag{20b}
$$

where $Y_{(i)}$ is either (1) the total lumber volume per tree ($v_{l(s)}$ (dm3)), total lumber value per tree
($p_{l(s)}$ (CAN$(2002))), or total product value per tree ($p_{t(s)}$ (CAN$(2002))) recovered under a
stud sawmill processing protocol, or (2) the total lumber volume per tree ($v_{l(r)}$ (dm3)), total lumber
value per tree ($p_{l(r)}$ (CAN$(2002))) or total product value per tree ($p_{t(r)}$ (CAN$(2002)))
recovered under a randomized length sawmill processing protocol, and $\phi_{j(i)}$, $j = 0, \ldots, 7$, and
$\varepsilon_{(i)}$ are equation-specific parameters and error terms associated with the ith dependent
variable. These equations were parameterized employing simulation results derived using
virtual taper profiles from tree measurements obtained from sample plots distributed throughout
the central region of the Canadian Boreal Forest. Specifically, the Optitek sawing simulator
(Forintek Canada Corp. 1994) software, employing conventional stud sawmill and
randomized length sawmill processing protocols, was used to derive product volumes and
value estimates. The stud sawmill processing protocol consisted of bucking each virtual stem
of a given diameter, length, taper and wane into 2.44 m sections and then sawing each section
into dimensional lumber products according to an algorithm which maximized lumber value
recovery. The randomized length sawmill processing protocol consisted of optimally bucking
the virtual stem into sections of variable length (4.88 to 1.22 m by approximately 0.6 m
intervals) and then sawing each section into dimensional lumber products employing a sawing
algorithm which similarly maximized lumber value recovery. The resultant sawdust, chip and
lumber volumes in combination with market-based economic values for the calendar year
2002, were used to generate tree-level estimates of lumber value and total product value by
species and sawmill-type. Combining these estimates along with the stand-level variables,
simple (Eq. (20a)) and composite (Eq. (20b)) equations were developed by sawmill-type.
Table 7 lists the resultant component-specific parameter estimates, regression statistics and
predictive performance measures for each equation.
Table 7a. Parameter estimates and associated statistics for the regression relationships used in Module E
(Product and Value Estimation): sawmill-specific product recovery and value functions (Eq. (20a))

                          Parameter estimates (a)        Calibration data set       Validation data set
Species       Variable    j=0      j=1      j=2          R2     RMSE   MAE          R2     RMSE   MAE
Black Spruce  vl(s)       0.0001   2.4594   1.6032       0.939  27.66  15.74        0.901  29.41  18.71
Black Spruce  pl(s)       0.0000   2.8846   1.6010       0.944  6.46   3.34         0.924  7.57   4.23
Black Spruce  pt(s)       0.0004   2.5544   1.2134       0.975  4.59   2.58         0.949  4.65   2.84
Black Spruce  vl(r)       0.0014   2.4516   1.1957       0.914  13.43  8.44         0.897  15.87  10.11
Black Spruce  pl(r)       0.0002   2.7569   1.2085       0.919  7.04   4.11         0.887  9.03   7.01
Black Spruce  pt(r)       0.0007   2.4463   1.2020       0.960  5.69   3.24         0.950  4.89   3.21

Source: vl(s), pl(s) and pt(s) from Model 4 in Table 4 as reported by Liu and Zhang (2005); vl(r), pl(r) and pt(r) from Model 4 in
Table 4 as reported by Zhang et al. (2006).
a - As derived from the source publications; note, RMSE and MAE denote root mean square error and mean absolute error,
respectively, and are in units of dm3/tree for variables vl(s) and vl(r) and CAN$(2002) for variables pl(s), pt(s), pl(r) and pt(r).

Table 7b. Parameter estimates and associated statistics for the regression relationships used in Module E (Product and Value
Estimation): composite sawmill-specific product recovery and value functions (Eq. (20b))

                       Parameter estimates (a) by parameter index j                                   Regression statistics (b)
Species    Var.     j=0'     j=1     j=2     j=3      j=4      j=5      j=6     j=7       nreg  nres  Cp    R2     SEE     F-ratio
Jack Pine  vl(s)    0.00054  1.2528  0.2017  -        -0.0069  -0.5462  0.2597  -         5     946   5.00  0.947  0.1561  3367*
Jack Pine  pl(s)    0.00005  1.5637  -       -0.0030  -        -        -       -0.0461   3     948   2.30  0.959  0.1561  7291*
Jack Pine  pt(s)    0.00002  1.5305  0.0435  -0.0088  -        -0.2880  0.5576  -         5     946   5.01  0.975  0.1133  7438*
Jack Pine  vl(r)    0.00004  1.5929  -       -0.0156  -        -0.0993  -       1.2756    4     947   5.34  0.952  0.1525  4703*
Jack Pine  pl(r)    0.00001  1.6632  -       -0.0127  -        -0.1313  0.9862  -         4     947   3.04  0.969  0.1319  7335*
Jack Pine  pt(r)    0.00001  1.6130  -       -0.0125  -        -0.1248  0.9748  -         4     947   3.45  0.973  0.1180  8595*

a - Component-specific parameter estimates obtained from OLS according to an all-possible best-subset regression procedure
employing Mallows' (1973) Cp criterion. The intercept parameter estimate (index 0') includes a logarithmic-based correction
factor (Baskerville, 1972; Sprugel, 1983).
b - As defined in Table 3; SEE in units of loge(dm3) for dependent variables vl(s) and vl(r) and loge(CAN$(2002)) for dependent
variables pl(s), pt(s), pl(r) and pt(r).

2.7. Description of the Fibre Attribute Equations Utilized (Module F)

The composite wood density and mean maximum branch diameter functions previously
developed for black spruce (Newton, unpublished) and jack pine (Newton, 2009) were
utilized. Briefly, species-specific cross-sectional area-weighted mean wood density per tree
($w_D$ (g/cm3)) was estimated using Eq. (21), and species-specific mean maximum branch
diameter within the first 4.9 m sawlog per tree ($b_D$ (cm)) was estimated using Eq. (22).

$$
w_D = \eta_0 \left(D^{2}H\right)^{\eta_1 + \eta_2 P_r + \eta_3 H_d + \eta_4 P_r H_d}\; P_r^{\,\eta_5}\, H_d^{\,\eta_6} \left(P_r H_d\right)^{\eta_7} \varepsilon \tag{21}
$$

$$
b_D = \kappa_0 \left(D^{2}H\right)^{\kappa_1 + \kappa_2 P_r + \kappa_3 H_d + \kappa_4 P_r H_d}\; P_r^{\,\kappa_5}\, H_d^{\,\kappa_6} \left(P_r H_d\right)^{\kappa_7} \varepsilon \tag{22}
$$

where $\eta_j$, $j = 0, \ldots, 7$, and $\kappa_j$, $j = 0, \ldots, 7$, are model parameters estimated via OLS
regression analysis following a double-logarithmic transformation using an all-possible best-
subset regression procedure based on Mallows' (1973) Cp selection criterion, and ε is an
equation-specific error term. Table 8 lists the resultant parameter estimates and associated
regression statistics.

3. RESULTS AND DISCUSSION


3.1. The Modular-based SSDMM and Associated Computational Framework for Mixed Stands

Integrating the Weibull-based PPE system and composite height-diameter function
(Module B), dimensional-compatible variable exponent taper equations (Module C),
allometric-based component biomass functions (Module D), sawmill-specific product
recovery and value functions (Module E), and composite fibre quality attribute functions
(Module F), within the dynamic SDMD modelling framework, resulted in the modular-based
SSDMM for black spruce and jack pine mixtures (Figure 1). Computationally, the yield–
density relationships within the dynamic SDMD are used to predict the temporal size–density
trajectory for a given site quality and density management regime (Dynamic SDMD Module).
Most, if not all, decision-support models built on the SDMD modeling approach assume
that the size-density trajectory of a recently thinned stand immediately follows that of a stand
which was always managed at the lower post-thinned density. However, trees within recently
thinned stands require a period of time to morphologically adjust to their newly allocated
growing space and additional resources. Consequently, a response delay function was
developed to account for this effect.
Table 8. Parameter estimates and associated statistics for the regression relationships used in
Module F (Fibre Attribute Estimation): composite wood density and mean maximum branch diameter functions

                                                          Parameter estimate
Relationship                                   Index j    Black Spruce   Jack Pine
Composite wood density equations (Eq. (21))    0'         0.5247         11.6252
                                               1          -0.0075        -
                                               2          -0.0115        -0.2495
                                               3          -              0.0034
                                               4          -              0.0066
                                               5          -              0.6445
                                               6          -              -0.9660
                                               7          -              -
Composite mean maximum branch diameter         0'         3.6137         0.4901
equation (Eq. (22))                            1          -              0.2976
                                               2          -              -0.0338
                                               3          -              -
                                               4          -              0.0024
                                               5          -0.0427        -
                                               6          -              -
                                               7          -              -0.2021

Statistical notes (a)

Composite wood density equations (Eq. (21)):
(1) Parameter estimates obtained from OLS regression analysis where the independent variables were selected according to an
all-possible best-subset regression procedure employing Mallows' (1973) Cp criterion.
(2) The intercept parameter estimate includes a correction factor for the bias introduced via the logarithmic transformation
(Baskerville, 1972; Sprugel, 1983).
(3) Regression statistics: nreg = 2, nres = 157, R2 = 0.219, SEE (loge(g/cm3)) = 0.0860, Cp statistic = 0.00, and F-ratio = 22*,
for black spruce; and nreg = 5, nres = 171, R2 = 0.364, SEE (loge(g/cm3)) = 0.0470, Cp statistic = 6.17, and F-ratio = 18*, for
jack pine.

Composite mean maximum branch diameter equation (Eq. (22)):
(1) Parameter estimates obtained from OLS regression analysis where the independent variables were selected according to an
all-possible best-subset regression procedure employing Mallows' (1973) Cp criterion.
(2) The intercept parameter estimate includes a correction factor for the bias introduced via the logarithmic transformation
(Baskerville, 1972; Sprugel, 1983).
(3) Regression statistics: nreg = 1, nres = 117, R2 = 0.053, SEE (loge(cm)) = 0.1147, Cp statistic = 2.92, and F-ratio = 7*, for
black spruce; and nreg = 4, nres = 510, R2 = 0.483, SEE (loge(cm)) = 0.1440, Cp statistic = 3.38, and F-ratio = 119*, for jack
pine.

a - Denotations as defined in Table 3.


[Figure 1 schematic. Module A (Dynamic SDMD; Figure 2(a)): yield-density relationships for estimating mean-tree and stand-level
volumetric yield estimates. Module A feeds Module B (Diameter and Height Recovery; Figure 2(b)): PPE system for diameter
distribution recovery and composite height-diameter function for height estimation. Module B feeds four modules:
Module C (Taper Analysis and Log Estimation; Figure 2(c)): species-specific composite taper equations, yielding diameter-class
and stand-level estimates of the number of sawlogs and pulplogs, residual merchantable stem tip volumes, and merchantable and
total taper-based stem volumes.
Module D (Biomass and Carbon Estimation; Figure 2(d)): species-specific composite biomass equations, yielding diameter-class and
stand-level biomass and biomass-based carbon estimates for bark, stem, branch and foliage components.
Module E (Product and Value Estimation; Figure 2(e)): species-specific sawmill-specific product recovery and value equations,
yielding diameter-class and stand-level estimates of recoverable dimensional lumber volumes and values, residual chip volumes
and values, and the total value of dimensional lumber and residual chips, all by sawmill-type.
Module F (Fibre Attribute Estimation; Figure 2(f)): species-specific composite wood density and mean maximum branch diameter
equations, yielding diameter-class-weighted stand-level estimates of specific gravity and of mean maximum branch diameter
within the first sawlog.]

Figure 1. Schematic illustration of the modular-based SSDMM.



The function was based on the difference in live crown ratios between trees within the
pre-thinned and post-thinned stand condition as initially conceptualized by Newton (2003).
To illustrate this idea, it is instructive to consider the empirical results derived from an IE
black spruce experiment as they are reported in the literature (i.e., Table 1 in McClain et al.,
1994). The differential in mean live crown ratio between a high density-stressed stand
(surrogate for the thinned stand just before the time of thinning (Lr = 44%)) and a low
density-stressed stand (surrogate for the thinned stand immediately following thinning (Lr =
80%)), situated on the same site and of equal age, was 36% after 41 years. If the high density-
stressed stand was thinned at an age of 41, SDMD-based models would assume there was no
differential between the stands and erroneously describe the post-treatment size-density
trajectory as being identical to that of a stand which was always managed at the lower
density. Thus in this example, the temporal duration of the response delay is approximated by
the length of time required for trees within the thinned stand to achieve a mean live crown
ratio of 80%. More precisely, by extending this concept and accounting for the intrinsic
change in live crown ratio within the lower density stand in the years immediately following
thinning, the exact number of years required for the live crown ratios in both stands to
converge can be determined. Specifically, the response delay period was defined as the number
of years required for the tree of mean dominant height to rebuild its live crown ratio (based
on a static live crown base) to a value equivalent to that predicted for a similar tree growing
at the lower density. Given that trees within recently thinned stands would not be fully
occupying their newly allocated space, and hence would be neither competing nor incurring
density-dependent mortality until their live crown ratios recovered, stand densities were
constrained to remain constant during this period (i.e., the net density change function was
disabled during this adjustment period).
The temporal rate at which the size–density trajectory tracks through the SDMD is site-
dependent and hence described by the mean values derived from species-specific height-age
functions (Eq. (23a) for black spruce (Carmean et al., 2006) and Eq. (23b) for jack pine
(Carmean et al., 2001)).

$$
H_d = 1.3 + 16.95\,(S_I - 1.3)^{0.1136}\left(1 - K^{A_B/50}\right)^{0.6167\,(S_I - 1.3)^{0.3116}},
\quad \text{where } K = 1 - \left[\frac{S_I - 1.3}{16.95\,(S_I - 1.3)^{0.1136}}\right]^{\frac{1}{0.6167\,(S_I - 1.3)^{0.3116}}} \tag{23a}
$$

$$
H_d = 1.3 + 4.1459\,(S_I - 1.3)^{0.6224}\left(1 - K^{A_B/50}\right)^{1.3723\,(S_I - 1.3)^{0.0802}},
\quad \text{where } K = 1 - \left[\frac{S_I - 1.3}{4.1459\,(S_I - 1.3)^{0.6224}}\right]^{\frac{1}{1.3723\,(S_I - 1.3)^{0.0802}}} \tag{23b}
$$

where $A_B$ is mean breast-height age (yr).

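
A minimal Python sketch of the height-age relationship is given below. It assumes the constrained form suggested by Eqs. (23a) and (23b), in which K is chosen so that the curve passes through the site index at a breast-height age of 50 years; the coefficient tuples are transcribed from the printed equations with the exponent signs taken as they appear, which is itself an assumption, and the function and constant names are hypothetical.

```python
def dominant_height(age_bh, site_index, b1, b2, b3, b4):
    """Dominant height (m) at breast-height age age_bh (yr) for a given site index (m).

    Assumed form: Hd = 1.3 + b1*s**b2 * (1 - K**(age_bh/50))**(b3*s**b4),
    with s = site_index - 1.3 and K fixed so that Hd equals site_index at age 50.
    """
    s = site_index - 1.3
    asymptote = b1 * s ** b2      # upper asymptote of height above 1.3 m
    shape = b3 * s ** b4          # site-dependent shape exponent
    k = 1.0 - (s / asymptote) ** (1.0 / shape)
    return 1.3 + asymptote * (1.0 - k ** (age_bh / 50.0)) ** shape


# Coefficients as they appear in Eqs. (23a) and (23b) (transcription assumed).
BLACK_SPRUCE = (16.95, 0.1136, 0.6167, 0.3116)
JACK_PINE = (4.1459, 0.6224, 1.3723, 0.0802)

for age in (20, 50, 80):
    print(age, round(dominant_height(age, 15.0, *BLACK_SPRUCE), 2))
```

By construction the curve returns the site index at a breast-height age of 50 years whatever the coefficient values, which gives a convenient internal check on any transcription of the equations.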



An abridged version of the computational sequence is as follows. Essentially, Module A
(Dynamic SDMD) provides a set of stand-level variables which are required as input to
Modules B-F. Module B utilizes the PPE system and the composite height-diameter function
to recover the grouped-diameter frequency distribution and estimate corresponding tree
heights (Diameter and Height Recovery Module), and similar to Module A, provides
prerequisite input to the remaining modules. The taper equations are used to derive a pooled
mean estimate of the upper stem diameters for each tree from which the number of sawlogs
and pulplogs, residual tip volumes, and merchantable and total stem volumes are calculated
(Taper Analysis and Log Estimation Module). The composite biomass equations are used to
derive a pooled mean mass and carbon equivalent estimate for each above-ground component
(Biomass and Carbon Estimation Module). The product recovery and value functions are used
to derive pooled mean estimates of the sawmill-specific chip and lumber volumes and
associated market-based monetary values (Product and Value Estimation Module). The
composite fibre attribute functions are used to derive mean pooled estimates of wood density
and mean maximum branch diameter (Fibre Attribute Estimation Module).
To facilitate interpretation of the model structure and the associated computational
sequence, a comprehensive schematic illustration is provided in Figure 2, and a corresponding
descriptive synthesis is given as follows. Module A (Figure 2(a)) provides estimates of the
system's driving variables for each time step (year; $\hat{H}_{d(t)}$, $\hat{N}_{(t)}$, $\hat{v}_{(t)}$, $\hat{D}_{q(t)}$, $\hat{G}_{(t)}$, $\hat{V}_{t(t)}$ and
$\hat{P}_{r(t)}$) using the dynamic SDMD and its embedded functions. Module B (Figure 2(b))
recovers the corresponding location, scale and shape parameter estimates for the 3-parameter
Weibull PDF using the PPE system (Eqs. (12)-(15)) in combination with the stand-level
variable estimates derived from Module A, from which density estimates are derived for each
recovered diameter class using Eq. (16) and a 10% truncation threshold rule (Clutter et al.,
1983). Height estimates for each recovered diameter class are then calculated employing the
stand-level and diameter-class-specific variable estimates ($\hat{P}_{r(t)}$ and $\hat{H}_{d(t)}$ from Module A,
and $\hat{D}_{(t,j)}$, respectively) via the composite height-diameter function (Eq. (17)). Module C
(Figure 2(c)) computes tree and diameter-class estimates of merchantable and total stem
volumes, log-type distributions and residual merchantable tip volumes at each time step via
the taper equations (Eqs. (18a) and (18b)) using stand-level and diameter-class-specific
predictor variable estimates ($\hat{G}_{(t)}$ from Module A, and $\hat{D}_{(t,j)}$, $\hat{H}_{(t,j)}$ and $\hat{N}_{(t,j)}$ from Module
B, respectively). Module D (Figure 2(d)) provides bark, stem, branch, foliage and total
biomass estimates at the tree, diameter-class and stand levels for each time step using the
composite biomass equations (Eq. (19)) in combination with the stand-level and diameter-
class-specific variable estimates ($\hat{P}_{r(t)}$ and $\hat{H}_{d(t)}$ from Module A, and $\hat{D}_{(t,j)}$, $\hat{H}_{(t,j)}$ and
$\hat{N}_{(t,j)}$ from Module B, respectively). Corresponding carbon equivalents at the tree, diameter-
class and stand levels are estimated using a carbon/biomass ratio estimator. Module E (Figure
2(e)) calculates tree, diameter-class and stand-level estimates of the volume of chip and
lumber end-products and their associated monetary values by sawmill-type at each time step
using the product and value equations (Eqs. (20a) and (20b)) in combination with the stand-
level and diameter-class-specific variable estimates ($\hat{P}_{r(t)}$ and $\hat{H}_{d(t)}$ from Module A, and
$\hat{D}_{(t,j)}$, $\hat{H}_{(t,j)}$ and $\hat{N}_{(t,j)}$ from Module B, respectively). Module F (Figure 2(f)) provides tree
and diameter-class estimates of specific gravity (wood density) and branch diameters at each
time step using the composite fibre attribute equations (Eqs. (21) and (22), respectively) in
combination with the stand-level and diameter-class variable estimates ($\hat{P}_{r(t)}$ and $\hat{H}_{d(t)}$ from
Module A, and $\hat{D}_{(t,j)}$ and $\hat{H}_{(t,j)}$ from Module B, respectively). Stand-level estimates of
mean wood density for the merchantable tree population, and mean maximum branch
diameter for the sawlog-sized tree population, are subsequently calculated using a diameter-
class density-based weighting factor.
The above sequence represents the essential computations required for deriving volumetric
yields, diameter distributions, tree heights, log assortments, component-specific biomass and
carbon outcomes, sawmill-specific products and associated values, and fibre attributes, for a
given density management regime, site quality and rotation age.
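
The final aggregation step for the fibre attributes, in which diameter-class values are combined into a stand-level value using class densities as weights, can be sketched as follows. This is a minimal illustration of a density-weighted mean; the function name and the class values in the usage lines are hypothetical, and the selection of which classes count as merchantable or sawlog-sized is left to the caller.

```python
def stand_level_mean(class_values, class_densities):
    """Density-weighted mean of a diameter-class attribute.

    class_values    : attribute value for each diameter class (e.g., wood density)
    class_densities : stems/ha in each class, in the same order
    """
    total = sum(class_densities)
    if total <= 0:
        raise ValueError("class densities must sum to a positive value")
    weighted = sum(v * n for v, n in zip(class_values, class_densities))
    return weighted / total


# Hypothetical wood densities (g/cm3) and stems/ha for three diameter classes.
print(stand_level_mean([0.40, 0.44, 0.48], [100.0, 300.0, 100.0]))
```

Weighting by class density means that the most populous diameter classes dominate the stand-level value, which is the behaviour the density-based weighting factor is meant to provide.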
(a) Module A - Dynamic SDMD
[Figure 2(a) schematic. Input variables and constants: user-specified input variables ($S_I$, $N_I$, $T_{2A(i)}$, $T_{2N(i)}$, $T_{3A(i)}$ and
$T_{3N(i)}$) and fixed constants (parameter vectors $\beta_{S_I(PIm)}$, $\beta_{S_I(PNb)}$, $\beta_S$, $\beta_v$, $\beta_{Dq}$, $\beta_{Pr}$ and $\beta_{Lr}$).
Computations:
$\hat{H}_{d(t)} = f(\beta_{S_I(PIm)}, \beta_{S_I(PNb)}, S_I, A_{(t)})$
$\hat{N}_{(t)} = f(\beta_S, N_I, \hat{N}_{(t-1)}, \hat{P}_{r(t)}, A_{(t)}, S_I, T_{2N(i)}, T_{3N(i)})$
$\hat{v}_{(t)} = f(\beta_v, \hat{H}_{d(t)}, \hat{N}_{(t)})$
$\hat{D}_{q(t)} = f(\beta_{Dq}, \hat{v}_{(t)}, \hat{N}_{(t)})$
$\hat{G}_{(t)} = f(\hat{D}_{q(t)}, \hat{N}_{(t)})$
$\hat{V}_{t(t)} = f(\hat{v}_{(t)}, \hat{N}_{(t)})$
$\hat{P}_{r(t)} = f(\beta_{Pr}, \hat{v}_{(t)}, \hat{N}_{(t)})$
$\hat{L}_{r(t)} = f(\beta_{Lr}, \hat{v}_{(t)}, \hat{N}_{(t)})$
Output variables: $A_{(t)}$, $\hat{H}_{d(t)}$, $\hat{N}_{(t)}$, $\hat{v}_{(t)}$, $\hat{D}_{q(t)}$, $\hat{G}_{(t)}$, $\hat{V}_{t(t)}$, $\hat{P}_{r(t)}$ and $\hat{L}_{r(t)}$.]

(b) Module B - Diameter and Height Recovery
[Figure 2(b) schematic.
(1) Recovery of the grouped-diameter frequency distribution. Input variables: $\hat{H}_{d(t)}$, $\hat{N}_{(t)}$, $\hat{v}_{(t)}$, $\hat{D}_{q(t)}$, $\hat{G}_{(t)}$, $\hat{V}_{t(t)}$ and
$\hat{P}_{r(t)}$; fixed constants: parameter vectors $\beta_{Dmin}$, $\beta_{W(b)}$ and $\beta_{W(c)}$.
Computations (recovering Weibull parameters via PPEs):
$\hat{a}_{(t)} = 0.5\,\hat{D}_{min}$, where $\hat{D}_{min} = f(\beta_{Dmin}, \hat{H}_{d(t)}, \hat{N}_{(t)}, \hat{v}_{(t)}, \hat{D}_{q(t)}, \hat{G}_{(t)}, \hat{P}_{r(t)}, \hat{V}_{t(t)})$
$\hat{b}_{(t)} = f(\beta_{W(b)}, \hat{H}_{d(t)}, \hat{N}_{(t)}, \hat{v}_{(t)}, \hat{D}_{q(t)}, \hat{G}_{(t)}, \hat{P}_{r(t)}, \hat{V}_{t(t)})$
$\hat{c}_{(t)} = f(\beta_{W(c)}, \hat{H}_{d(t)}, \hat{N}_{(t)}, \hat{v}_{(t)}, \hat{D}_{q(t)}, \hat{G}_{(t)}, \hat{P}_{r(t)}, \hat{V}_{t(t)})$
Computations (obtaining the resultant diameter frequency distribution): the density of the jth diameter class, $\hat{N}_{(t,j)}$, is
obtained by multiplying $\hat{N}_{(t)}$ by the difference between the Weibull CDF (Eq. (16)), with parameters $\hat{a}_{(t)}$, $\hat{b}_{(t)}$ and $\hat{c}_{(t)}$,
evaluated at the limits of the class. Output variables: $\hat{N}_{(t,j)}$ and $\hat{D}_{(t,j)}$.
(2) Height estimation by diameter class. Input variables: $\hat{D}_{(t,j)}$, $\hat{P}_{r(t)}$ and $\hat{H}_{d(t)}$; fixed constants: parameter vector $\beta_H$.
Computation: $\hat{H}_{(t,j)} = f(\beta_H, \hat{D}_{(t,j)}, \hat{P}_{r(t)}, \hat{H}_{d(t)})$. Output variable: $\hat{H}_{(t,j)}$.]

(c) Module C - Taper Analysis and Log Estimation

(1) Estimating upper stem diameter at 2.59 m height intervals from stump height 
 
 Computation: upper stem diameter 
 

 Input Variables and Constants  dˆ     
f βT( PIm ) , h(t , j ) , Dˆ (t , j ) , Hˆ (t , j ) , Gˆ ( t )  f βT( PNb ) , h( t , j ) , Dˆ ( t , j ) , Hˆ ( t , j ) , Gˆ ( t ) 

    
  Fixed    (t , j )
2
 Input   

 Variables  Constants     Hˆ ( t , j )  0.3   Hˆ ( t , j )  0.3  
ˆ
    
  where h( t , j ) =0.3, 0.3+1
 100
 ,0.3+2 
  100
 ,..., H ( t , j )


 Dˆ (t , j )   
d
 
s
    
   dp  
ˆ


 Hˆ       if d  d then increment sawlog count per tree ( ˆ
n ) by 1 
      
(t, j) s ls ( t , j )

 (t , j ) dt
 Nˆ     if dˆ  d and dˆ = d then increment pulplog count per tree (nˆ 
lp (t , j ) ) by 1 
  ( t , j )  dm    
( t , j ) s ( t , j ) p


 Nˆ (t )     if dˆ  d p and dˆ = d t then calculate residual tip volume per tree (vˆr (t , j ) ) 
  βT( PIm ) 
   
(t , j ) (t , j )

Gˆ (t )  β
  T( PNb )
   ˆ
 if d (t , j ) = d m then calculate merchantable stem volume per tree (vˆm (t , j ) ) 
      
  ˆ
if d ( t , j ) = 0 then calculate total stem volume per tree (vˆt (t , j ) ) 
  

 
(2) Calculating diameter-class and stand-level estimates of the number of 
 
sawlogs and pulplogs, residual tip volumes, and merchantable and total stem 
 
 volumes 
 J  Output Variables 
 Nˆ ls (t , j )  Nˆ (t , j )  nˆls (t , j )  Nˆ ls (t )   Nˆ ls (t , j )  ˆ 
 j 5   N ls (t ) 
  ˆ 
 Nˆ
J
 N
 lp (t ) 
  lp (t , j )  N ( t , j )  nˆlp ( t , j )  N lp ( t )   Nˆ lp (t , j )
ˆ ˆ
  ˆ 
 j 5
 Vr ( t ) 
 J  ˆ 
Vr ( t , j )  N ( t , j )  vˆr ( t , j )  Vt ( t )   Vˆr ( t , j )
ˆ ˆ ˆ  Vm (t ) 
 j 5  vˆ 
   m ( t , j ) 
J

Vm ( t , j )  N ( t , j )  vˆm ( t , j )  Vm ( t )   Vˆm (t , j ) 
ˆ ˆ ˆ '

 ˆ  Vˆ '  
j 5
  ˆ )   m(t ) 
 J
V m ( t ) V t ( t
 Vˆ '  
Vˆt ( t , j )  Nˆ ( t , j )  vˆt ( t , j )  Vˆ '   Vˆt (t , j )   t (t )  
 t(t)
j 1

 
(d) Module D - Biomass and Carbon Estimation

 Estimating component-specific and total biomass and carbon equivalents 


 


 mˆ p
 ( t , j )( PIm )

 f β M p ( PIm ) , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )  
 Calculating component-specific biomass and 



 ˆ ˆ
 mˆ s( t , j ) ( PIm )  f β M s ( PIm ) , D(t , j ) , H (t , j ) , H d (t ) , Pr (t )
ˆ ˆ   
 carbon equivalents per stand


  


 mˆ b( t , j ) ( PIm )  f β M b ( PIm ) , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )    ˆ J  mˆ p( t , j )( PIm )  mˆ p( t , j )( PNb )  
 M p( t )   N (t , j )  
 ˆ  
  2  
  mˆ f
 ( t , j ) ( PIm )

 f β M f ( PIm ) , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )   
 
j 1
  
  mˆ s( t , j )( PIm )  mˆ s( t , j )( PNb )  
 mˆ   Mˆ  Nˆ  
J

  s( t ) 
  mˆ p( t , j )  mˆ s( t , j )  mˆ b( t , j )  mˆ f( t , j )  
  t( t , j )( PIm ) ( PIm ) ( PIm ) ( PIm ) ( PIm )
(t , j )
 2  
  
j 1
   

  ˆ ˆ
 mˆ p( t , j )( PNb )  f β M p ( PNb ) , D(t , j ) , H (t , j ) , H d (t ) , Pr (t )
ˆ ˆ    J  mˆ b( t , j )( PIm )  mˆ b( t , j )( PNb ) 

 Output Variables 
 Input Variables and Constants     Mˆ  Nˆ    Mˆ 
  b( t ) 
 

 Fixed  
 ˆ ˆ
  mˆ s( t , j ) ( PNb )  f β M s ( PNb ) , D(t , j ) , H (t , j ) , H d (t ) , Pr (t )
ˆ ˆ   
j 1
(t , j )


2 
   p( t )
 Mˆ


   

 Input  
Constants    mˆ
   b( t , j ) ( PNb ) 
 f β M b ( PNb ) , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )    J
  Mˆ f   Nˆ (t , j )  
 mˆ f( t , j )( PIm )  mˆ f (t , j )( PNb ) 





s( t )

Mˆ b( t )


     ( t ) j 1  2   

 Variables  β
 β m p ( PIm ) 
 ˆ
  mˆ f( t , j ) ( PNb )  f β M f ( PNb ) , D(t , j ) , H (t , j ) , H d (t ) , Pr (t )
ˆ ˆ ˆ      
  Mˆ 
  ˆ        J  mˆ t(t , j )( PIm )  mˆ t(t , j )( PNb )    f( t ) 
  mˆ t( t , j ) ( PNb )  mˆ p( t , j )( PNb )  mˆ s( t , j ) ( PNb )  mˆ b( t , j ) ( PNb )  mˆ f( t , j ) ( PNb )
m
  Mˆ t( )   Nˆ (t , j )    
s ( PIm )

  (t , j )
D  β     t

   Mˆ t( t ) 
     2 
  Hˆ   mb ( PIm )
    j 1
    
  ( t , j )    β       ˆ  ˆ  ˆ
C p( t ) 
 
  Nˆ   m f ( PIm )   cˆ   ˆ c c
 0.5  mˆ p( t , j )
J
ˆ   
 C p(t )   N (t , j )  
p ( , ) p( , )
( PIm ) ( PNb )

t j t j

 (t , j )  β m    p( t , j )( PIm )   ˆ
Cs( t ) 
( PIm )

    2

 Hˆ d (t )
p ( PNb )
  cˆ  0.5  mˆ s( t , j )   j 1
    
  β ms ( PNb )    ˆ
s( t , j )( PIm ) ( PIm )
   cˆs( t , j )( PIm )  cˆs( t , j )( PNb )   Cb( t ) 
 Pˆr (t )     cˆ ˆ  Cˆ  Nˆ  
J
 0.5 
  s( t ) 
m    
   β mb ( PNb )    b( t , j )( PIm ) b( t , j )
( PIm ) (t , j )
 2   Cˆ f 
    cˆ  0.5  mˆ  
j 1
    (t )

 β m f ( PNb )    f( t , j )( PIm ) f( t , j )
   Cˆt 
 cˆb( t , j )( PIm )  cˆb( t , j )( PNb ) 
( PIm )
J
cˆt  Cˆ  Nˆ    
  b( t ) 
  ˆ
c  cˆs( t , j )  cˆb( t , j )  cˆ f( t , j )   (t )

 ( t , j )( PIm ) (t , j )
 
p

( t , j )( PIm ) ( PIm ) ( PIm ) ( PIm )
j 1
 2  
  cˆ p( t , j )  0.5  mˆ p( t , j )   
  ( PNb ) ( PNb )
  J  cˆ f( t , j )( PIm )  cˆ f( t , j )( PNb )  
 cˆs  0.5  mˆ s( t , j )  Cˆ f   Nˆ (t , j )    
  ( t , j )( PNb ) ( PNb )
  ( t ) j 1  2  
cˆb( t , j )    
  0.5  mˆ b( t , j ) 
  ( PNb ) ( PNb )
  J  cˆt( t , j )( PIm )  cˆt( t , j )( PNb )  
 cˆ f  0.5  mˆ  Cˆt( t )   Nˆ (t , j )    
  ( t , j )( PNb )
f (t , j )
( PNb )    2  
j 1
 
 cˆt  ˆ
c  cˆs( t , j )  cˆb( t , j )  cˆ f( t , j ) 
  ( t , j )( PNb ) p ( t , j )( PNb ) ( PNb ) ( PNb ) ( PNb )

 
 
(e) Module E - Product and Value Estimation

 Input Variables and Constants 


  Computations: stand-level 
    
   chip volumes, lumber volumes 
 Computations: tree-level chip volumes,  and product values by 
    
lumber volumes and product values by  sawmill type
 Input Variables and Constants   
  sawmill type   J 
 Fixed     Vˆc ( s )   Nˆ (t , j ) vˆc ( s )  


Constants   
   vˆ   
f β vl ( s ) , Dˆ (t , j ) , Hˆ (t , j )  f β vl ( s ) , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )   
 
( t )
j 5
(t , j )

 Output Variables 
Vˆ 
l ( s )(t , j ) 
( PIm ) ( PNb )

β vl ( s )   
J
2  Vˆ
  l ( s ) ( t ) 
  Nˆ (t , j ) vˆl ( s )   c(s) (t ) 
  ( PIm )
  vˆ ˆ ˆ   
   c ( s ) ( t , j )  vm (t , j )  vl ( s ) (t , j )
(t , j )

β pl ( s )  
j 5 ˆ
  Vl ( s ) ( t ) 
 
   
( PIm )
   ˆ J
 
  Pl ( s ) ( t )   N (t , j ) pˆ l ( s ) ( t , j )

 Input  β  ˆ 
     f β pl ( s ) , Dˆ (t , j ) , Hˆ (t , j )  f β pl ( s ) , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t ) 
ˆ
 Pl ( s ) ( t ) 
 pt ( s )
 Variables      pˆ j 5
( PIm )

( PIm ) ( PNb )
    
  Dˆ β v    l ( s )(t , j ) 2   ˆ J
 ˆ
 Pt ( s ) 
  Pt ( s ) ( t )   N (t , j ) pˆ t ( s ) ( t , j )
  ( PIm )  ˆ
 
   
( )
 
l r
  (t , j ) 
(t )
    f β Pt ( s ) , Dˆ (t , j ) , Hˆ (t , j )  f β Pt ( s ) , Dˆ (t , j ) , Hˆ ( t , j ) , Hˆ d ( t ) , Pˆr ( t )    j 5 ˆ 
  Hˆ (t , j )  β pl ( r ) ( PIm )         Pc ( s ) ( t ) 
    pˆ t ( s ) (t , j ) 
( PIm ) ( PNb )

     ˆ
J
  

 Nˆ (t , j )    β    2   Pc ( s ) ( t )   Nˆ (t , j ) pˆ c ( s )
(t , j )  Vˆc ( r ) 
    t ( r ) ( PIm )    pˆ c ( s )  pˆ t ( s )  pˆ l ( s )
p
j 5
    (t )

  ˆ
 Pr (t )   β    (t, j) (t, j) (t, j)
  ˆ J
 Vˆ 
     Vc ( r ) ( t )   N ( t , j ) vˆc ( r ) ( t , j )
  
vl ( s )( PNb )
 ˆ
    f β vl( r ) , Dˆ ( t , j ) , Hˆ ( t , j )  f β vl( r ) , Dˆ ( t , j ) , Hˆ ( t , j ) , Hˆ d ( t ) , Pˆr ( t )   l (r )(t )

  Hˆ d (t )  β p      j 5
  Pˆ 
  l ( s ) ( PNb )   vˆl ( r ) ( t , j ) 
( PIm ) ( PNb )

  2   J
  l (r )(t ) 

 vˆm (t , j )   
 β pt ( s ) ( PNb )   vˆ  Vˆl ( r ) ( t )   Nˆ ( t , j ) vˆl ( r ) ( t , j )  ˆ 
   c( r )  vˆm ( t , j )  vˆl ( r )   j 5
 P
 t (r )(t ) 
 β    (t, j) (t, j)
   ˆ 
   
J
    Pˆl ( r )   Nˆ ( t , j ) pˆ l ( r )
vl ( r )
 ( PNb )
  f β pl ( r ) , Dˆ (t , j ) , Hˆ (t , j )  f β pl ( r ) , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )   Pc ( r ) ( t ) 
 β       ( t ) j 5 (t , j )

pˆ l ( r ) 
( PIm ) ( PNb )

   pl ( r )
( PNb )      
 
(t, j)
2 J
      Pˆt ( r )   Nˆ ( t , j ) pˆ t ( r ) 
   
β
  pt ( r ) ( PNb )    f β pt ( r ) , Dˆ (t , j ) , Hˆ (t , j )  f β pt ( r ) , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )    ( t ) j 5 (t , j )

 ˆ   

( PIm ) ( PNb )

 p   Pˆ
J

  c ( r ) ( t ) 
 t ( r ) (t , j )
2  Nˆ ( t , j ) pˆ c ( r ) 
  (t , j )

pˆ  pˆ t ( r )  pˆ l ( r ) j 5
  c ( r ) ( t , j ) (t , j ) (t , j ) 
(f) Module F - Fibre Attribute Estimation
 Estimating wood density and mean maximum branch diameter per tree 
 
 Input Variables and Constants  
  
 
 Input 

 Variables  
  Fixed   
   Computations 
    Constants    
  Dˆ (t , j )
 

β wD    
    wˆ D( t , j ) 

f β wD , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )   f β wD( PNb ) , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )  
 
 
( PIm )
ˆ   ( PIm ) 
 H     2
 ( t , j ) β   
  Nˆ
   (t , j )
  bD( PIm )
  β
  
  ˆ
bD (t , j ) 

f βbD , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )
( PIm )
  f β bD( PNb ) , Dˆ (t , j ) , Hˆ (t , j ) , Hˆ d (t ) , Pˆr (t )  

   
wD( PNb )

 Hˆ d (t )  
 2 

   βb   
  ˆ   D( PNb )  

  P r ( t ) 
  
Calculating mean wood density and 
mean maximum branch diameter per 
 
stand 
 
 
 J  Output Variables 

ˆ
j 5
ˆ ˆ
wD ( t , j ) N ( t , j ) 


ˆ


WD ( t )  J
if D( t , j )  10   WD ( t ) 
  ˆ 
  Nˆ ( t , j )
  B 
j 5  D (t ) 
 
 ˆ 
J

  bD (t , j ) Nˆ (t , j )

 Bˆ D (t ) 
j 8
if D(t , j )  16 
 
J

  Nˆ ( t , j )

 j 8 

Figure 2. Schematic illustration of the computational framework utilized in the modular-based SSDMM: (a) Module A - Dynamic SDMD; (b) Module
B - Diameter and Height Recovery; (c) Module C - Taper Analysis and Log Estimation; (d) Module D - Biomass and Carbon Estimation; (e) Module E
- Product and Value Estimation; and (f) Module F - Fibre Attribute Estimation. Refer to Table 9 for variable definitions and additional computational
details.
Table 9. Variable definitions and associated computational details for the SSDMM, as
schematically illustrated in Figure 2

Variable Description and Computational Details


Module A - Dynamic SDMD (Figure 2(a))
$S_I$  Site index (m): mean dominant height at a breast-height age of 50 yr (Eqs. (23a) and (23b); Carmean et al., 2006 and 2001, respectively).
$N_I$  Initial density (stems/ha) at time of stand establishment.
$T_{2A(i)}$  Regime 2 thinning treatment(s): stand age (yr) at the time of the ith thinning event (i = 1,…,I; I = 4).
$T_{2N(i)}$  Regime 2 thinning treatment(s): density reduction (stems/ha) during the ith thinning event (i = 1,…,I; I = 4).
$T_{3A(i)}$  Regime 3 thinning treatment(s): stand age (yr) at the time of the ith thinning event (i = 1,…,I; I = 4).
$T_{3N(i)}$  Regime 3 thinning treatment(s): density reduction (stems/ha) during the ith thinning event (i = 1,…,I; I = 4).
$\hat{H}_{d(t)}$  Predicted mean dominant height (m) at time t. Estimated for each stand age ($A_{(t)}$, t = 1,…,T) using the site-specific height-age functions, where $\beta_{SI(PIm)}$ and $\beta_{SI(PNb)}$ denote the associated parameter estimate vectors for black spruce (Eq. (23a); Carmean et al., 2006) and jack pine (Eq. (23b); Carmean et al., 2001), respectively. Notes: (1) based on the range of stand ages within the data sets (Table 2), the maximum stand age (T) was set at 170 yr; (2) site-specific conversion of total to breast-height ages required an estimate of the number of years required for dominant height to reach 1.3 m; consequently, Eqs. (23a) and (23b) were rearranged with respect to the breast-height age term and solved for a zero height value; the absolute value of the returned value for a given site index is an estimate of the number of years required for dominant height to reach 1.3 m for the specified site quality; and (3) given (2), estimates of dominant height for each year up to breast height were determined via linear interpolation.
$\hat{N}_{(t)}$  Predicted density (stems/ha) at time t. Estimated using Eq. (10), where $\beta_S$ denotes the associated vector of parameter estimates (indexed l = 0,…,3; Table 3).
$\hat{v}_{(t)}$  Predicted mean volume per tree (dm3) at time t. Estimated employing Eq. (6), where $\beta_v$ denotes the associated vector of parameter estimates (indexed l = 0, 1, 2; Table 3).
$\hat{D}_{q(t)}$  Predicted quadratic mean diameter (cm) at breast height at time t. Estimated employing Eq. (4), where $\beta_{Dq}$ denotes the associated vector of parameter estimates (indexed l = 0, 1, 2; Table 3).
$\hat{G}_{(t)}$  Predicted basal area (m2/ha) at time t: $\hat{G}_{(t)} = 0.00007854 \cdot \hat{D}_{q(t)}^{2} \cdot \hat{N}_{(t)}$.
$\hat{V}_{t(t)}$  Predicted total volume (m3/ha) at time t: $\hat{V}_{t(t)} = \hat{v}_{(t)} \cdot \hat{N}_{(t)} / 1000$ (conversion from dm3 to m3).
$\hat{P}_{r(t)}$  Predicted relative density index (%/100) at time t. Estimated employing Eq. (2), where $\beta_{Pr}$ denotes the associated vector of parameter estimates (two estimates, indexed 0 and 1; Table 3).
$\hat{L}_{r(t)}$  Predicted mean live crown ratio (%) at time t. Estimated employing Eq. (8), where $\beta_{Lr}$ denotes the associated vector of parameter estimates (indexed l = 0, 1, 2; Table 3).

Module B - Diameter and Height Recovery (Figure 2(b))
$\hat{a}_{(t)}$  Predicted location parameter of the Weibull PDF (Eq. (11)) at time t. Calculated using the minimum diameter estimate $\hat{D}_{min}$ according to Eqs. (12) and (13), where $\beta_{Dmin}$ denotes the vector of parameter estimates associated with Eq. (13) (indexed l = 0,…,7; Table 4).
$\hat{b}_{(t)}$  Predicted scale parameter of the Weibull PDF (Eq. (11)) at time t. Estimated using Eq. (14), where $\beta_{W(b)}$ denotes the associated vector of parameter estimates (indexed l = 0,…,7; Table 4).
$\hat{c}_{(t)}$  Predicted shape parameter of the Weibull PDF (Eq. (11)) at time t. Estimated using Eq. (15), where $\beta_{W(c)}$ denotes the associated vector of parameter estimates (indexed l = 0,…,7; Table 4).
$\hat{N}_{(t,j)}$  Predicted number of trees (stems/ha) within the jth two-centimetre-wide diameter class (j = 1, 2,…, J) at time t. Estimated using Eq. (16) in combination with Eqs. (12), (13), (14) and (15).
$\hat{D}_{(t,j)}$  Predicted diameter class midpoint (cm) for the jth two-centimetre-wide diameter class at time t.
$\hat{H}_{(t,j)}$  Predicted height (m) of the trees within the jth two-centimetre-wide diameter class at time t. Estimated employing Eq. (17), where $\beta_H$ denotes the associated vector of parameter estimates (indexed l = 0,…,7; Table 4).

Module C - Taper Analysis and Log Estimation (Figure 2(c))
$\hat{d}_{(t,j)}$  Predicted inside-bark diameter (cm) at stem height $h_{(t,j)}$ at time t. Estimated employing Eqs. (18a) and (18b), where $\beta_{T(PIm)}$ (parameter estimates indexed l = 0,…,4) and $\beta_{T(PNb)}$ (parameter estimates indexed l = 0, 1, 2, 4) denote the associated parameter estimate vectors for black spruce and jack pine, respectively (Table 5).
$\hat{n}_{ls(t,j)}$  Predicted cumulative number of sawlogs per tree (sawlogs/tree) within the jth merchantable-sized diameter class ($\hat{D}_{(t,j)} \geq 10$) at time t. Estimated employing Eqs. (18a) and (18b) according to specified merchantability limits for sawlogs (2.59 m log lengths with a minimum small-end inside-bark diameter of 14.0 cm ($d_s$)).
$\hat{n}_{lp(t,j)}$  Predicted cumulative number of pulplogs per tree (pulplogs/tree) within the jth merchantable-sized diameter class at time t. Estimated employing Eqs. (18a) and (18b) according to specified merchantability limits for pulplogs (2.59 m log lengths with a minimum small-end inside-bark diameter of 10.0 cm ($d_p$)).
$\hat{v}_{r(t,j)}$  Predicted residual log tip volume per tree (m3/tree) within the jth merchantable-sized diameter class at time t. Estimated employing Eqs. (18a) and (18b) according to specified dimensional requirements for merchantable volume ($d_t$, the specified minimum $\hat{d}_{(t,j)}$ that defines the merchantable height of the tree) via numerical integration: sectional volume summation between the height at the top of the last recovered pulplog or sawlog and the height ($h_{(t,j)}$) at which the $\hat{d}_{(t,j)}$ value is equivalent to the specified merchantability limit for merchantable stems (minimum inside-bark diameter of 10.0 cm ($d_m$)).
$\hat{v}_{m(t,j)}$  Predicted merchantable stem volume per tree (m3/tree) within the jth merchantable-sized diameter class at time t. Estimated employing Eqs. (18a) and (18b) according to specified dimensional requirements for merchantable volume via numerical integration: sectional volume summation between a stump height of 0.30 m and the height ($h_{(t,j)}$) at which the $\hat{d}_{(t,j)}$ value is equivalent to the specified merchantability minimum inside-bark diameter ($d_m$; 10.0 cm).
$\hat{v}_{t(t,j)}$  Predicted total stem volume per tree (m3/tree) within the jth diameter class at time t. Estimated employing Eqs. (18a) and (18b): (1) numerical integration via summation of sectional volumes between a stump height of 0.30 m and the height ($h_{(t,j)}$) at which a zero $\hat{d}_{(t,j)}$ value first occurs; (2) calculating the volume of the 0.30 m stump; and (3) summing both volumes (stump + stem) to obtain an estimate of total stem volume.
$\hat{N}_{ls(t,j)}$  Predicted number of sawlogs within the jth merchantable-sized diameter class (sawlogs/jth diameter class) at time t.
$\hat{N}_{lp(t,j)}$  Predicted number of pulplogs within the jth merchantable-sized diameter class (pulplogs/jth diameter class) at time t.
$\hat{V}_{r(t,j)}$  Predicted residual log tip volume within the jth merchantable-sized diameter class (m3/jth diameter class) at time t.
$\hat{N}_{ls(t)}$  Predicted number of sawlogs per unit area (sawlogs/ha) at time t.
$\hat{N}_{lp(t)}$  Predicted number of pulplogs per unit area (pulplogs/ha) at time t.
$\hat{V}_{r(t)}$  Predicted residual log tip volume per unit area (m3/ha) at time t.
$\hat{V}_{m(t,j)}$  Predicted merchantable stem volume within the jth merchantable-sized diameter class (m3/jth diameter class) at time t.
$\hat{V}'_{m(t)}$  Approximated merchantable volume per unit area (m3/ha) at time t (estimate derived from the taper equations).
$\hat{V}_{t(t,j)}$  Predicted total stem volume within the jth diameter class (m3/jth diameter class) at time t.
$\hat{V}'_{t(t)}$  Approximated total stem volume per unit area (m3/ha) at time t (estimate derived from the taper equations).
$\hat{V}_{m(t)}$  Predicted merchantable volume per unit area (m3/ha) at time t.

Module D - Biomass and Carbon Estimation (Figure 2(d))
$\hat{m}_{k(t,j)(n)}$ (k = p, s, b, f, t)  Predicted bark (periderm; k = p), stem (k = s), branch (k = b), foliage (k = f) and total (k = t) oven-dry mass per tree (g/tree) within the jth diameter class at time t for the nth species (n = PIm or PNb). Estimated employing Eq. (19), where $\beta_{m_k(n)}$ (k = p, s, b, f) denotes the parameter estimate vector specific to the kth component and nth species (indexed l = 0,…,7; Table 6). Total mass is calculated as the arithmetic sum of the individual component masses.
$\hat{c}_{k(t,j)(n)}$ (k = p, s, b, f, t)  Predicted bark (periderm; k = p), stem (k = s), branch (k = b), foliage (k = f) and total (k = t) carbon equivalents per tree (g/tree) within the jth diameter class at time t for the nth species. Calculated as a fixed proportion of the component mass estimates.
$\hat{M}_{k(t)}$ (k = p, s, b, f, t)  Predicted bark (k = p), stem (k = s), branch (k = b), foliage (k = f) and total (k = t) oven-dry mass per unit area (g/ha) at time t.
$\hat{C}_{k(t)}$ (k = p, s, b, f, t)  Predicted bark (k = p), stem (k = s), branch (k = b), foliage (k = f) and total (k = t) carbon per unit area (g/ha) at time t.

Module E - Product and Value Estimation (Figure 2(e))
$\hat{v}_{l(k)(t,j)(n)}$ (k = s, r)  Predicted lumber volume recovered by a stud (k = s) and randomized length (k = r) sawmill processing protocol per tree (dm3/tree) within the jth merchantable-sized diameter class at time t for the nth species. Estimated employing Eqs. (20a) and (20b), where $\beta_{vl(k)(PIm)}$ (indexed l = 0, 1, 2) and $\beta_{vl(k)(PNb)}$ (indexed l = 0,…,7) denote the black spruce and jack pine parameter estimate vectors for lumber volume, respectively, specific to the kth processing protocol (Table 7).
$\hat{v}_{c(k)(t,j)(n)}$ (k = s, r)  Predicted chip volume recovered by a stud (k = s) and randomized length (k = r) sawmill processing protocol per tree (dm3/tree) within the jth merchantable-sized diameter class at time t for the nth species. Note: calculated as the arithmetic difference between the taper-based merchantable volume per tree estimate and the lumber volume per tree estimate.
$\hat{p}_{l(k)(t,j)(n)}$ (k = s, r)  Predicted value of dimensional lumber recovered by a stud (k = s) and randomized length (k = r) sawmill processing protocol per tree (CAN$(2002)/tree) within the jth merchantable-sized diameter class at time t for the nth species. Estimated employing Eqs. (20a) and (20b), where $\beta_{pl(k)(PIm)}$ (indexed l = 0, 1, 2) and $\beta_{pl(k)(PNb)}$ (indexed l = 0,…,7) denote the black spruce and jack pine parameter estimate vectors for lumber value, respectively, specific to the kth processing protocol (Table 7).
$\hat{p}_{t(k)(t,j)(n)}$ (k = s, r)  Predicted value of lumber and chip products recovered by a stud (k = s) and randomized length (k = r) sawmill processing protocol per tree (CAN$(2002)/tree) within the jth merchantable-sized diameter class at time t for the nth species. Estimated employing Eqs. (20a) and (20b), where $\beta_{pt(k)(PIm)}$ (indexed l = 0, 1, 2) and $\beta_{pt(k)(PNb)}$ (indexed l = 0,…,7) denote the black spruce and jack pine parameter estimate vectors for total value, respectively, specific to the kth processing protocol (Table 7).
$\hat{p}_{c(k)(t,j)(n)}$ (k = s, r)  Predicted mean value of chips recovered by a stud (k = s) and randomized length (k = r) sawmill processing protocol per tree (CAN$(2002)/tree) within the jth merchantable-sized diameter class at time t for the nth species. Note: calculated as the arithmetic difference between the total and lumber mean value estimates.
$\hat{V}_{c(k)(t)}$ (k = s, r)  Predicted chip volume recovered by a stud mill (k = s) and randomized length (k = r) sawmill processing protocol per stand for all merchantable-sized trees at time t (dm3/ha).
$\hat{V}_{l(k)(t)}$ (k = s, r)  Predicted lumber volume recovered by a stud mill (k = s) and randomized length (k = r) sawmill processing protocol per stand for all merchantable-sized trees at time t (dm3/ha).
$\hat{P}_{l(k)(t)}$ (k = s, r)  Predicted value of dimensional lumber recovered by a stud mill (k = s) and randomized length (k = r) sawmill processing protocol per stand for all merchantable-sized trees at time t (CAN$(2002)/ha).
$\hat{P}_{t(k)(t)}$ (k = s, r)  Predicted value of all products (lumber and chip) recovered by a stud mill (k = s) and randomized length (k = r) sawmill processing protocol per stand for all merchantable-sized trees at time t (CAN$(2002)/ha).
$\hat{P}_{c(k)(t)}$ (k = s, r)  Predicted value of chips recovered by a stud mill (k = s) and randomized length (k = r) sawmill processing protocol per stand for all merchantable-sized trees at time t (CAN$(2002)/ha).
Module F - Fibre Attribute Estimation (Figure 2(f))
$\hat{w}_{D(t,j)(n)}$  Predicted stem cross-sectional area-weighted mean wood density per tree within the jth diameter class at time t (g/cm3) for the nth species. Estimated employing Eq. (21), where $\beta_{wD(n)}$ denotes the parameter estimate vector specific to the nth species (indexed l = 0,…,7; Table 8).
$\hat{b}_{D(t,j)(n)}$  Predicted mean maximum branch diameter (cm) within the first 4.9 m sawlog per tree within the jth sawlog-sized diameter class ($\hat{D}_{(t,j)} \geq 16$) at time t. Estimated employing Eq. (22), where $\beta_{bD(n)}$ denotes the parameter estimate vector specific to the nth species (indexed l = 0,…,7; Table 8).
$\hat{W}_{D(t)}$  Predicted diameter-class-weighted mean wood density within the merchantable-sized tree population ($\hat{D}_{(t,j)} \geq 10$) at time t (g/cm3).
$\hat{B}_{D(t)}$  Predicted diameter-class-weighted mean maximum branch diameter within the sawlog-sized tree population ($\hat{D}_{(t,j)} \geq 16$) at time t (cm).
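
Two of the relationships tabulated above lend themselves to a short numerical illustration: the basal area identity given for Module A, and the allocation of stand density to two-centimetre diameter classes in Module B. The sketch below is a minimal Python illustration, not the SSDMM implementation. Eq. (16) is not reproduced in this section, so the class allocation is assumed here to follow the standard three-parameter Weibull cumulative distribution; the class limits (midpoint 2j cm, limits 2j - 1 and 2j + 1 cm) are likewise an assumption, inferred from the merchantable (j = 5, 10 cm) and sawlog (j = 8, 16 cm) class thresholds. All function names are hypothetical.

```python
import math


def basal_area(dq_cm, n_per_ha):
    """Basal area (m2/ha) from quadratic mean diameter (cm) and density (stems/ha).

    Uses the Table 9 identity G = 0.00007854 * Dq^2 * N.
    """
    return 0.00007854 * dq_cm ** 2 * n_per_ha


def weibull_cdf(d, a, b, c):
    """Three-parameter Weibull CDF (location a, scale b, shape c)."""
    if d <= a:
        return 0.0
    return 1.0 - math.exp(-(((d - a) / b) ** c))


def class_frequencies(n_per_ha, a, b, c, n_classes=40, width=2.0):
    """Allocate stems/ha to diameter classes (assumed form of the recovery step).

    Returns a list of (class midpoint in cm, stems/ha) tuples. Class j is
    assumed to have midpoint width * j and limits midpoint +/- width / 2.
    """
    rows = []
    for j in range(1, n_classes + 1):
        midpoint = width * j
        lower = midpoint - width / 2.0
        upper = midpoint + width / 2.0
        share = weibull_cdf(upper, a, b, c) - weibull_cdf(lower, a, b, c)
        rows.append((midpoint, n_per_ha * share))
    return rows


if __name__ == "__main__":
    print(round(basal_area(20.0, 1000.0), 3))
    for midpoint, stems in class_frequencies(1000.0, 1.0, 15.0, 3.0)[:10]:
        print(midpoint, round(stems, 1))
```

With a location parameter of 1 cm and a sufficiently large number of classes, the class frequencies sum to the stand density, which is a convenient check on any implementation of the recovery step.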

3.2. Performance Indices for Comparing Density Management Outcomes

A standardized set of stand-level performance indices was derived from the model's
output in order to simplify the decision-making process in terms of comparing alternative
density management regimes. These indices reflect overall productivity, type of products
produced, economic efficiency, degree of optimal site occupancy, structural stability, and
fibre quality attributes over the rotation. Indices representing merchantable volume
production (mean annual merchantable volume increment; $R_{MAI}$; Eq. (24)), biomass
accumulation rate (mean annual biomass increment; $R_{BMI}$; Eq. (25)), and carbon sequestration
potential (mean annual carbon increment; $R_{CAI}$; Eq. (26)) were used to quantify overall
productivity for a given regime.

$$R_{MAI} = \frac{V_m + \sum_{k=1}^{K} V_{m(k)}}{A_{(T)}} \qquad (24)$$

where $V_m$ is the standing merchantable volume (m3/ha) at rotation ($A_{(T)}$) and $V_{m(k)}$ is the
merchantable volume removed during the kth thinning entry (k = 1,…,K; K = 4),

$$R_{BMI} = \frac{M_t + \sum_{k=1}^{K} M_{t(k)}}{A_{(T)}} \qquad (25)$$

where $M_t$ is the standing total aboveground biomass (t/ha) at rotation and $M_{t(k)}$ is the total
aboveground biomass (t/ha) removed during the kth thinning entry (k = 1,…,K; K = 4),

$$R_{CAI} = \frac{C_t + \sum_{k=1}^{K} C_{t(k)}}{A_{(T)}} \qquad (26)$$

where $C_t$ is the standing total aboveground carbon (t/ha) at rotation and $C_{t(k)}$ is the total
aboveground carbon (t/ha) removed during the kth thinning entry (k = 1,…,K; K = 4).
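
Eqs. (24) to (26) share one form: the standing yield at rotation plus the yields removed at each thinning entry, divided by the rotation age. A minimal Python sketch of that shared calculation follows; the function name and the example figures are hypothetical and are not taken from the chapter.

```python
def mean_annual_increment(standing, removed, rotation_age):
    """Mean annual increment in the form of Eqs. (24)-(26).

    standing     -- yield standing at rotation (e.g., m3/ha or t/ha)
    removed      -- yields removed at each thinning entry (same units)
    rotation_age -- stand age at rotation, A(T), in years
    """
    if rotation_age <= 0:
        raise ValueError("rotation_age must be positive")
    return (standing + sum(removed)) / rotation_age


if __name__ == "__main__":
    # Hypothetical regime: 250 m3/ha standing at age 80, two thinnings
    # that removed 20 and 30 m3/ha.
    print(mean_annual_increment(250.0, [20.0, 30.0], 80))
```

The same function serves all three indices because only the yield variable changes (merchantable volume, total aboveground biomass, or total aboveground carbon); an unthinned control regime is represented by an empty list of removals.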
Relative indices reflecting the type and quality of logs produced (percentage of sawlogs
produced; $R_{SL}$; Eq. (27)) and the resultant end-products manufactured (percentage of lumber
volume recovered by sawmill type; $R_{LV(m)}$; Eq. (28)) were used to differentiate each regime in
terms of its potential to produce commercial-grade end-products.

$$R_{SL} = 100 \cdot \frac{N_{ls} + \sum_{k=1}^{K} N_{ls(k)}}{\left( N_{ls} + \sum_{k=1}^{K} N_{ls(k)} \right) + \left( N_{lp} + \sum_{k=1}^{K} N_{lp(k)} \right)} \qquad (27)$$

where $N_{ls}$ and $N_{lp}$ are the total number of sawlogs (logs/ha) and pulplogs (logs/ha) at
rotation, respectively, and $N_{ls(k)}$ and $N_{lp(k)}$ are the total number of sawlogs (logs/ha) and
pulplogs (logs/ha) removed during the kth thinning entry (k = 1,…,K; K = 4), respectively,

$$R_{LV(m)} = 100 \cdot \frac{V_{l(m)} + \sum_{k=1}^{K} V_{l(k,m)}}{\left( V_{l(m)} + \sum_{k=1}^{K} V_{l(k,m)} \right) + \left( V_{c(m)} + \sum_{k=1}^{K} V_{c(k,m)} \right)} \qquad (28)$$

where $V_{l(m)}$ and $V_{c(m)}$ are the lumber and chip volumes (m3/ha) recovered employing the mth
sawmill processing protocol (m = 1 (stud mill) or 2 (random length mill)) from the
merchantable-sized trees at rotation, respectively, and $V_{l(k,m)}$ and $V_{c(k,m)}$ are the lumber and
chip volumes (m3/ha) recovered employing the mth sawmill processing protocol from the
merchantable-sized trees removed during the kth thinning entry (k = 1,…,K; K = 4),
respectively.
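
Eqs. (27) and (28) are both percentage shares: the rotation-plus-thinning total of one product class (sawlogs or lumber) divided by the combined total of both classes (sawlogs plus pulplogs, or lumber plus chips). The Python sketch below illustrates that calculation under this reading; the function name and example figures are hypothetical.

```python
def percent_share(primary_standing, primary_removed,
                  other_standing, other_removed):
    """Percentage share in the form of Eqs. (27) and (28).

    primary_* -- sawlog counts (Eq. 27) or lumber volumes (Eq. 28)
    other_*   -- pulplog counts (Eq. 27) or chip volumes (Eq. 28)
    *_standing is the value at rotation; *_removed lists the values
    removed at each thinning entry.
    """
    primary = primary_standing + sum(primary_removed)
    other = other_standing + sum(other_removed)
    total = primary + other
    if total == 0:
        raise ValueError("no logs or volume produced")
    return 100.0 * primary / total


if __name__ == "__main__":
    # Hypothetical regime: 300 sawlogs/ha and 600 pulplogs/ha at rotation,
    # plus 50 of each removed in a single thinning.
    print(percent_share(300, [50], 600, [50]))
```

For Eq. (28) the calculation is repeated once per sawmill processing protocol (stud mill and random length mill), using the lumber and chip volumes specific to that protocol.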
Economic efficiency was measured by the land expectation value (i.e., the maximum an
investor could pay for bare land to achieve a specified rate of return (discount rate)) of the
manipulated regimes, relative to the control regime ($E_{(m)}$; Eq. (29)).

$$E_{(m)} = 100 \cdot \frac{L^{T}_{E(m)} - L^{C}_{E(m)}}{L^{C}_{E(m)}} \qquad (29)$$

where

$$L^{T}_{E(m)} = \frac{\left[ P^{T}_{t(m)} (1+I_r)^{A_{(T)}} + \sum_{k=1}^{K} P^{T}_{t(k,m)} (1+I_r)^{A_{(k)}} (1+I_r)^{A_{(T)}-A_{(k)}} \right] - \left[ C_{FE} (1+I_r)^{A_{(T)}} + \sum_{k=1}^{K} C^{T}_{F-T(k)} (1+I_r)^{A_{(T)}-A_{(k)}} + \sum_{k=1}^{K} C^{T}_{V-T(k)} (1+I_r)^{A_{(T)}-A_{(k)}} + C^{T}_{V-H} \right]}{(1+D_r)^{A_{(T)}} - 1}$$

with

$$P^{T}_{t(m)} = P_{t(m)} (1+I_r)^{Y_s-2002}, \qquad P^{T}_{t(k,m)} = P_{t(k,m)} (1+I_r)^{Y_s-2002},$$

$$C^{T}_{F-T(k)} = c^{T}_{F-T(k)} (1+I_r)^{A_{(k)}}, \qquad C^{T}_{V-T(k)} = c^{T}_{V-T(k)} (1+I_r)^{A_{(k)}} \hat{V}_{m(k)}, \qquad C^{T}_{V-H} = c^{T}_{V-H} (1+I_r)^{A_{(T)}} \hat{V}_{m(T)},$$

and

$$L^{C}_{E(m)} = \frac{P^{C}_{t(m)} (1+I_r)^{A_{(T)}} - C_{FE} (1+I_r)^{A_{(T)}} - C^{C}_{V-H}}{(1+D_r)^{A_{(T)}} - 1}$$

with

$$P^{C}_{t(m)} = P_{t(m)} (1+I_r)^{Y_s-2002}, \qquad C^{C}_{V-H} = c^{C}_{V-H} (1+I_r)^{A_{(T)}} \hat{V}_{m(T)}$$

where $L^{T}_{E(m)}$ and $L^{C}_{E(m)}$ are the land expectation values at rotation attained employing the
mth sawmill processing protocol within the density manipulated and control stands,
respectively, $P^{T}_{t(m)}$ and $P^{C}_{t(m)}$ are the inflation-adjusted (to the year of simulation; $Y_s$) total
product values ($/ha) recovered employing the mth sawmill processing protocol from the
merchantable-sized trees within the density manipulated and control stands at rotation,
respectively, $P^{T}_{t(k,m)}$ is the inflation-adjusted total product value ($/ha) recovered employing
the mth sawmill processing protocol from the merchantable-sized trees removed during the
kth thinning entry (k = 1,…,K; K = 4), $C_{FE}$ is the fixed cost ($/ha) incurred at the time of stand
establishment (e.g., regeneration assessment or vegetation management expenses), $c^{T}_{F-T(k)}$
and $C^{T}_{F-T(k)}$ are the inflation unadjusted and adjusted fixed costs ($/ha) incurred during the
kth thinning entry, respectively (e.g., logistical costs such as those associated with
transporting thinning equipment), $c^{C}_{V-H}$ and $C^{C}_{V-H}$ are the inflation unadjusted and adjusted
variable costs (dollars per cubic metre of merchantable volume harvested ($/m3)),
respectively, associated with crown charges (stumpage and renewal fees), harvesting,
transportation, processing and manufacturing at the time of harvest within the control stand,
$c^{T}_{V-H}$ and $C^{T}_{V-H}$ are the inflation unadjusted and adjusted variable costs ($/m3), respectively,
at the time of harvest within the treated stand, $c^{T}_{V-T(k)}$ and $C^{T}_{V-T(k)}$ are the inflation
unadjusted and adjusted variable costs ($/m3), respectively, at the time of the kth thinning
entry within the treated stand, $I_r$ and $D_r$ are the inflation and discount rates, respectively, and
$A_{(k)}$ and $A_{(T)}$ are the stand ages at the time of the kth thinning entry and at rotation,
respectively.
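
The structure of Eq. (29) can be sketched numerically as follows. This is a minimal Python illustration under stated assumptions, not the author's implementation: revenues and costs are compounded to rotation at the inflation rate, their difference is divided by $(1 + D_r)^{A_{(T)}} - 1$, and product values are assumed to have already been adjusted to the simulation year. The function names, dictionary keys and example figures are all hypothetical.

```python
def inflate(value, rate, years):
    """Compound value at rate for the given number of years."""
    return value * (1.0 + rate) ** years


def lev_control(p_t, c_fe, c_vh, v_m, age_rot, i_r, d_r):
    """Land expectation value of the unthinned control stand.

    p_t  -- total product value at rotation ($/ha), simulation-year dollars
    c_fe -- fixed establishment cost ($/ha)
    c_vh -- unadjusted variable harvest cost ($/m3)
    v_m  -- merchantable volume harvested at rotation (m3/ha)
    """
    revenue = inflate(p_t, i_r, age_rot)
    costs = inflate(c_fe, i_r, age_rot) + inflate(c_vh, i_r, age_rot) * v_m
    return (revenue - costs) / ((1.0 + d_r) ** age_rot - 1.0)


def lev_treated(p_t, c_fe, c_vh, v_m, age_rot, thinnings, i_r, d_r):
    """Land expectation value of a thinned stand.

    thinnings -- list of dicts with hypothetical keys: age (yr), p_t ($/ha),
                 c_fixed ($/ha), c_var ($/m3) and v_m (m3/ha removed)
    """
    revenue = inflate(p_t, i_r, age_rot)
    costs = inflate(c_fe, i_r, age_rot) + inflate(c_vh, i_r, age_rot) * v_m
    for th in thinnings:
        carry = age_rot - th["age"]  # years from thinning entry to rotation
        revenue += inflate(inflate(th["p_t"], i_r, th["age"]), i_r, carry)
        costs += inflate(inflate(th["c_fixed"], i_r, th["age"]), i_r, carry)
        costs += inflate(inflate(th["c_var"], i_r, th["age"]) * th["v_m"],
                         i_r, carry)
    return (revenue - costs) / ((1.0 + d_r) ** age_rot - 1.0)


def relative_lev(lev_t, lev_c):
    """Percentage difference of the treated relative to the control LEV."""
    return 100.0 * (lev_t - lev_c) / lev_c


if __name__ == "__main__":
    control = lev_control(10000.0, 500.0, 20.0, 200.0, 50, 0.02, 0.04)
    thinning = {"age": 30, "p_t": 2000.0, "c_fixed": 300.0,
                "c_var": 20.0, "v_m": 30.0}
    treated = lev_treated(9000.0, 500.0, 20.0, 180.0, 50, [thinning],
                          0.02, 0.04)
    print(round(relative_lev(treated, control), 2))
```

A discount rate of zero makes the denominator zero, so the sketch presumes $D_r > 0$; when treated and control stands share the same rotation age, the denominator cancels in the relative index.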
To evaluate the degree to which each regime optimally occupied the site during the
rotation, the percentage of years in which the size-density trajectory was within the
optimal density management window was calculated ($S_O$; Eq. (30)).

$$S_O = 100 \cdot \frac{Y_O}{Y_N} \qquad (30)$$

where YO is the number of years in which the size-density trajectory was within the
conceptual optimal relative density management zone as delineated by lower and upper
relative density index thresholds of 0.32 and 0.45, respectively, and YN is the rotation length
in years. The weighted mean height/diameter ratio for trees within the dominant crown
classes (SS; Eq. (31)) was used to differentiate the regimes in terms of stand stability.

\[ SS = \frac{1}{T} \sum_{t=i}^{T} \left[ \frac{\sum_{j=1}^{J} \left( \dfrac{H_{(t,j)}}{D_{(t,j)}/100} \right) N_{(t,j)}}{\sum_{j=1}^{J} N_{(t,j)}} \right] \quad \text{if } D_{(t,j)} \ge D_{80} \qquad (31) \]

where \(D_{80}\) is the 80th percentile of the recovered diameter frequency distribution as
calculated from the Weibull (1951) scale (\(\hat{b}\)) and shape (\(\hat{c}\)) parameters (Bailey and Dell, 1973):
\(D_{80} = \hat{b} \left( -\log_e(0.20) \right)^{1/\hat{c}}\). Similarly, the weighted mean wood density for trees within the
merchantable tree population (WD; Eq. (32)) and the weighted mean maximum branch
diameter (BD; Eq. (33)) within the sawlog-sized tree population were used as indices of
wood and log quality, respectively.

\[ WD = \frac{1}{T} \sum_{t=1}^{T} \left[ \frac{\sum_{j=5}^{J} w_{D(t,j)} N_{(t,j)}}{\sum_{j=5}^{J} N_{(t,j)}} \right] \qquad (32) \]

where \(w_{D(t,j)}\) is the stem cross-sectional area-weighted mean wood density at time t of trees
within the jth merchantable-sized diameter class (\(D_{(t,j)} \ge 10\) cm).

\[ BD = \frac{1}{T} \sum_{t=i}^{T} \left[ \frac{\sum_{j=8}^{J} b_{D(t,j)} N_{(t,j)}}{\sum_{j=8}^{J} N_{(t,j)}} \right] \qquad (33) \]

where \(b_{D(t,j)}\) is the mean maximum branch diameter within the first 4.9 m sawlog at time t
within the jth sawlog-sized diameter class (\(D_{(t,j)} \ge 16\) cm). Note, in the case of the thinned
stands, WD and BD values are calculated for the period since the last treatment, exclusively.
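The index calculations in Eqs. (30) to (33) can be sketched in a few lines of Python. This is an illustrative sketch only and not the VisualBasic.Net implementation; the function names and the example inputs are hypothetical, and the per-year stem-weighted mean shown here corresponds only to the bracketed ratio inside Eqs. (31) to (33).

```python
import math


def site_occupancy_index(relative_density_by_year, lower=0.32, upper=0.45):
    """SO (Eq. 30): percentage of rotation years spent inside the window."""
    years_in_window = sum(
        1 for pr in relative_density_by_year if lower <= pr <= upper
    )
    return 100.0 * years_in_window / len(relative_density_by_year)


def weibull_percentile(scale_b, shape_c, p=0.80):
    """Percentile of a two-parameter Weibull: b * (-ln(1 - p)) ** (1 / c)."""
    return scale_b * (-math.log(1.0 - p)) ** (1.0 / shape_c)


def stem_weighted_mean(class_values, class_stems):
    """Stem-weighted mean over diameter classes for a single year."""
    total_stems = sum(class_stems)
    weighted_sum = sum(v * n for v, n in zip(class_values, class_stems))
    return weighted_sum / total_stems


if __name__ == "__main__":
    # Hypothetical four-year trajectory: two years fall inside 0.32-0.45.
    print(site_occupancy_index([0.30, 0.35, 0.45, 0.50]))
    # Hypothetical Weibull scale and shape estimates.
    print(round(weibull_percentile(15.0, 3.0), 2))
    # Hypothetical H/D ratios for two diameter classes holding 1 and 3 stems.
    print(stem_weighted_mean([100.0, 80.0], [1, 3]))
```

Averaging the yearly stem-weighted means over the relevant years then gives the rotation-level SS, WD and BD values.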

3.3. Application and Utility in Multiple Resource Management

In order to facilitate the implementation of complex models in natural resource
management, software analogues are usually required. Consequently, given the computational
burden associated with calculating the volumetric yields, log products, biomass and carbon
outcomes, recoverable products and associated monetary values, fibre attributes and stand-
level performance indices, a VisualBasic.Net program was developed. This program predicts
site-dependent annual and rotational estimates for these variables at both the diameter-class
and stand levels. The program is structured so that 3 density management regimes can be
evaluated for any given simulation: a control regime with no thinning treatments versus two
other regimes which can include 1 to 4 thinning treatments each. The user must specify the
site quality as defined by site index, initial densities, thinning treatments in terms of time of
entry and removable densities, length of rotation, fixed and variable cost estimates, and
interest and discount rate information. Essentially, the program carries out all the required
computations as described in Figure 2, calculates the performance indices, and presents the
results in both graphical and tabular formats.
Although the computational requirements are mitigated with the development of the
software program, designing the optimal density regime for a given objective is still a major
challenge considering the number of decision variables involved. For example, a silviculturist
must decide which sites should be treated, the number, type, intensity (removable densities)
and timing of thinning treatments, when to conduct the final harvest (rotation length), and
what economic values are most appropriate (fixed and variable cost estimates, and interest
and discount rates). Furthermore, the candidate regimes must comply with the local statutory
framework which may introduce a broad array of additional conditions that must be met
before treatments can be implemented. Thus in order to exemplify the complexity of density
management decision-making and illustrate the potential utility of the SSDMM, the software
was used to determine and evaluate the consequences of implementing CT treatments within
density-stressed mixed stands situated on sites of moderate productivity.
In this example, CT is used as a stand improvement practice in order to realize a diverse
set of stand-level objectives. These included (1) avoidance of density-dependent mortality
within the merchantable-sized classes during the later stages of stand development so that
potentially valuable crop trees are retained on the site until final harvest, (2) provision of
interim fibre supplies via the mid-rotational partial harvests (thinning yields), (3) reduction in
vertical and horizontal structure heterogeneity by increasing spatial pattern uniformity and
decreasing size variation within the crop tree population thus reducing variable costs
associated with harvesting, transportation and manufacturing, and (4) enhancing carbon
sequestration potential by increasing biomass production and reducing the generation of
abiotic masses.
Preferable CT density management regimes are those which increase mean tree size
without incurring declines in merchantable volume production, do not unacceptably increase
the risk of volume losses to wind, snow, insects, and disease, and minimize the occurrence of
density-dependent mortality within the merchantable-sized diameter classes. Although
legislative and operational constraints placed on CT treatments vary by stand type and
jurisdiction, the ones proposed for coniferous stands within the central portion of the
Canadian Boreal Forest Region are used in this example (McKinnon et al., 2006): (1) CT
treatments should be implemented when density-dependent mortality is occurring or
imminent within the merchantable-sized classes; (2) CT should occur within stands where the
mean live crown ratio exceeds 35% and the stand basal area exceeds 25 m2/ha; and (3) CT
treatments should reduce basal areas by no more than 30-35% per entry. Furthermore, if the
goal is to enable stands to achieve an optimal level of production before thinning commences,
stands should be allowed to attain the upper threshold relative density value of at least 0.45,
before treatment.
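The operational constraints listed above lend themselves to a simple screening check. The following Python sketch is illustrative only and is not part of the SSDMM software: the names `StandState`, `meets_ct_criteria` and `removal_within_limit` are hypothetical, criterion (1) (density-dependent mortality occurring or imminent) is not modelled, and the thresholds are simply those quoted in the text.

```python
from dataclasses import dataclass


@dataclass
class StandState:
    live_crown_ratio_pct: float   # mean live crown ratio (%)
    basal_area_m2_ha: float       # stand basal area (m2/ha)
    relative_density: float       # relative density index (0-1)


def meets_ct_criteria(stand, min_lcr_pct=35.0, min_ba=25.0, min_rd=0.45):
    """Screen a stand against criterion (2) and the 0.45 relative density rule."""
    return (
        stand.live_crown_ratio_pct > min_lcr_pct
        and stand.basal_area_m2_ha > min_ba
        and stand.relative_density >= min_rd
    )


def removal_within_limit(ba_before, ba_removed, max_fraction=0.35):
    """Criterion (3): basal area removal per entry must not exceed the cap."""
    return ba_removed / ba_before <= max_fraction


if __name__ == "__main__":
    # Illustrative stand: 41% crown ratio, 29.7 m2/ha, relative density 0.64.
    stand = StandState(41.0, 29.7, 0.64)
    print(meets_ct_criteria(stand))
    # Removing 31% of the basal area stays under the 35% upper limit.
    print(removal_within_limit(29.7, 0.31 * 29.7))
```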
Thus within the context of implementing CT treatments within density-stressed mixed
stands, a control regime (untreated) consisting of a stand with an initial density of 5000
stems/ha (NI) which naturally established on a moderately good site quality (SI = 17)
following forest harvesting was contrasted with two stands with identical initial conditions,
but subject to the following CT treatments. The first of two CT treatments was equivalent in
both stands: removing 1500 stems/ha from stands which had achieved all of the above CT
criteria in addition to having attained merchantable status (i.e., having a minimum quadratic
mean diameter of 10 cm). Similarly, once the treated stands had re-achieved the criteria for
CT treatments, they were subsequently thinned again. Specifically, at a stand age of 65 yr,
30% of the basal area in the first CT stand, and 35% of the basal area in the second CT stand,
was removed. The control and treated stands were denoted Regimes 1, 2 and 3, where Regime
3 was the stand that received the heaviest second CT treatment. The rotation age was set at 85
yr for all 3 regimes. The economic and operability variables were specified as follows: (1) a
cost of $100/ha was applied to all 3 regimes to cover the expense of a regeneration
assessment survey; (2) the fixed costs associated with equipment transport and basic
administration for each CT treatment were set at $100/ha; (3) the variable costs associated with
stumpage, renewal, harvesting, transportation and manufacturing, in units of dollars per cubic
metre of merchantable volume harvested at rotation, were set at $100/m3 for the control
regime, and $60/m3 for the regimes involving the CT treatments; note, this cost differential is
principally due to the reduction in harvesting and manufacturing costs due to increased spatial
pattern uniformity and decreased piece-size variation commonly attributed to thinning
treatments; (4) similar to (3), the variable costs associated with the first and second thinning
yields were set at $100/m3 and $70/m3, respectively; (5) interest and discount rates were set at
2 and 4%, respectively; (6) a quadratic mean diameter of 16 cm was set as an operability
target; and (7) the simulation year was set to 2011.
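For readers who wish to keep track of the simulation inputs, the three regimes described above can be written down as a small data structure. The Python sketch below is a hypothetical encoding and does not reflect the input format of the VisualBasic.Net program; all class and field names are assumptions, and only the numerical values are taken from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Thinning:
    age_yr: int
    stems_removed_per_ha: Optional[int] = None    # removal given as a stem count
    basal_area_fraction: Optional[float] = None   # or as a basal area fraction
    fixed_cost_per_ha: float = 100.0
    variable_cost_per_m3: float = 100.0


@dataclass
class Regime:
    name: str
    initial_density_per_ha: int = 5000
    site_index: float = 17.0
    rotation_age_yr: int = 85
    establishment_cost_per_ha: float = 100.0
    harvest_variable_cost_per_m3: float = 100.0
    thinnings: List[Thinning] = field(default_factory=list)


@dataclass
class Economics:
    interest_rate: float = 0.02
    discount_rate: float = 0.04
    operability_dq_cm: float = 16.0
    simulation_year: int = 2011


def make_regimes():
    """Build the control regime and the two commercial thinning regimes."""
    def first_ct():
        return Thinning(age_yr=45, stems_removed_per_ha=1500,
                        variable_cost_per_m3=100.0)

    def second_ct(fraction):
        return Thinning(age_yr=65, basal_area_fraction=fraction,
                        variable_cost_per_m3=70.0)

    return [
        Regime("Regime 1 (control)"),
        Regime("Regime 2", harvest_variable_cost_per_m3=60.0,
               thinnings=[first_ct(), second_ct(0.30)]),
        Regime("Regime 3", harvest_variable_cost_per_m3=60.0,
               thinnings=[first_ct(), second_ct(0.35)]),
    ]


if __name__ == "__main__":
    for regime in make_regimes():
        print(regime.name, len(regime.thinnings), "thinning entries")
```

Keeping the removal either as a stem count or as a basal area fraction mirrors how the two thinning entries are specified in the text.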
Figure 3 illustrates the resultant mean volume-density trajectories for each regime within
the context of the traditional SDMD graphic. The stands at the time of the first thinning (45
yr) were at the midway point between the 14 and 16 m dominant height isolines, slightly
above the 10 cm quadratic mean diameter isoline, midway between the 0.6 and 0.7 relative
density isolines, and just above the 40% mean live crown ratio isoline. The interpolated
values for these and other variables were as follows: mean dominant height of 15.2 m, quadratic
mean diameter of 10.3 cm, relative density of 0.64, mean live crown ratio of 41%, mean
volume of 52 dm3, stand density of 3575 stems/ha, and basal area of 29.7 m2/ha. Thus the
stands had met the minimum conditions for CT.
The first thinning reduced density stress levels by approximately 30% (relative density
indices declining from 0.64 to 0.45) and placed the residual stands exactly on the upper
threshold of the optimal density management window (i.e., 0.45 relative density index). The
treatment removed approximately 31% of the basal area and resulted in a merchantable
volume recovery of 33 m3/ha. At the time of the second thinning (65 yr), both of the treated
stands were almost on the 19 m dominant height isoline, midway between the 14 and 16 cm
quadratic mean diameter isolines, just above the 0.7 relative density isoline, and
approximately intersecting the 35% mean live crown ratio isoline, which is the minimum
value allowed for implementing a CT treatment. At this stage, the interpolated values for
these and other variables were as follows: mean dominant height of 18.9 m, quadratic mean
diameter of 15.1 cm, relative density of 0.71, mean live crown ratio of 36%, mean volume of
138 dm3, stand density of 1750 stems/ha, and basal area of 31.4 m2/ha. For Regimes 2 and 3,
the second CT treatment removed 30% and 35% of the basal area, respectively, which
resulted in a merchantable volume recovery of 52 m3/ha and 61 m3/ha, respectively. Table 10
provides a complete list of the rotational and thinning yield estimates by regime. The derived
stand-level performance indices are given in Table 11.
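The interpolated values quoted above are internally consistent, which can be checked with the standard identity linking basal area G (m2/ha), quadratic mean diameter Dq (cm) and density N (stems/ha): G = (π/40000) Dq² N. The short Python sketch below is an independent arithmetic check using hypothetical function names, not output from the SSDMM software; small differences from the reported basal areas reflect rounding of the interpolated inputs.

```python
import math


def basal_area_m2_ha(dq_cm, stems_per_ha):
    """Basal area from quadratic mean diameter (cm) and density (stems/ha)."""
    return math.pi / 40000.0 * dq_cm ** 2 * stems_per_ha


def relative_change(before, after):
    """Proportional reduction from 'before' to 'after'."""
    return (before - after) / before


if __name__ == "__main__":
    # Stand at the first thinning (45 yr): reported basal area 29.7 m2/ha.
    print(round(basal_area_m2_ha(10.3, 3575), 1))
    # Stand at the second thinning (65 yr): reported basal area 31.4 m2/ha.
    print(round(basal_area_m2_ha(15.1, 1750), 1))
    # Relative density decline from 0.64 to 0.45, reported as roughly 30%.
    print(round(100 * relative_change(0.64, 0.45)))
```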

Figure 3. Graphical illustration of the dynamic SDMD for black spruce and jack pine mixtures. Note
the following principal components of the SDMD: (1) isolines for mean dominant height (Hd; 6-24 m
by 2 m intervals), quadratic mean diameter (Dq; 4-24 cm by 2 cm intervals), mean live crown ratio (Lr;
35, 40, 50,…, 80%), relative density index (Pr; 0.1-1.0 by 0.1 intervals); (2) self-thinning rule at a Pr =
1.0 (solid diagonal line delineating the outer boundary of the size-density region); (3) lower and upper
Pr values delineating the optimal density management window (Dm; 0.32 ≤ Pr ≤ 0.45); (4) crown
closure line (solid diagonal line delineating the inner boundary of the size-density region); and (5)
expected 85 yr size-density trajectories with 1 yr intervals denoted for 3 user-specified density
management regimes for stands situated on sites of moderate productivity (SI = 17). Regime 1 consisted
of a regeneration density of 5000 stems/ha with no thinning. Regime 2 consisted of a regeneration
density of 5000 stems/ha with 2 commercial thinnings: one at 45 yr in which 1500 stems/ha were
removed and a subsequent one at 65 yr in which 700 stems/ha were removed. Similarly, Regime 3
consisted of a regeneration density of 5000 stems/ha with 2 commercial thinnings: one at 45 yr in
which 1500 stems/ha were removed and a subsequent one at 65 yr in which 805 stems/ha were
removed. Source: SSDMM algorithm.

Evaluating the regimes in terms of the realization of the multiple objectives under
consideration, it is evident that commercial thinning is a viable treatment when managing
mixed stands. Density-dependent mortality rates within the thinned stands were considerably
less than those within the unthinned stand during the time from the first treatment to the time
of the second treatment: total densities declined by 1049 stems/ha within the control stand
versus 325 stems/ha within the thinned stands. Similar trends were observed from the time of
the second thinning to rotation age: total densities declined by 712 stems/ha within the control
stand versus 60 stems/ha for Regime 2, and 17 stems/ha for Regime 3.
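The density figures quoted above can be reconciled with simple bookkeeping. The Python sketch below starts from the 3575 stems/ha present at the first thinning and subtracts the removals and mortality losses reported in the text; the function name is hypothetical, and the results agree with the rotation-age densities listed in Table 10 (1815, 991 and 928 stems/ha) to within one stem of rounding.

```python
def density_after(start_density, events):
    """Subtract each (label, stems/ha) removal or mortality loss in turn."""
    density = start_density
    for _label, stems_lost in events:
        density -= stems_lost
    return density


DENSITY_AT_45_YR = 3575  # stems/ha at the time of the first thinning

control = density_after(DENSITY_AT_45_YR, [
    ("mortality, 45-65 yr", 1049),
    ("mortality, 65-85 yr", 712),
])
regime_2 = density_after(DENSITY_AT_45_YR, [
    ("first CT", 1500),
    ("mortality, 45-65 yr", 325),
    ("second CT", 700),
    ("mortality, 65-85 yr", 60),
])
regime_3 = density_after(DENSITY_AT_45_YR, [
    ("first CT", 1500),
    ("mortality, 45-65 yr", 325),
    ("second CT", 805),
    ("mortality, 65-85 yr", 17),
])

if __name__ == "__main__":
    print(control, regime_2, regime_3)
```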

Table 10. Regime-specific interim and rotational yield estimates for black spruce and jack pine
mixtures subjected to mid-rotation commercial thinning treatments. Values in parentheses denote
yields derived from the two CT treatments (ordered by time of treatment)

Attribute^a (t = T)                      Regime^b
                                         1        2                    3
\(A_{(t)}\) (yr)                         85       85                   85
\(\hat{H}_{d(t)}\) (m)                   21.1     21.1                 21.1
\(\hat{D}_{q(t)}\) (cm)                  17.3     19.0                 19.2
\(\hat{G}_{(t)}\) (m2/ha)                43       28                   27
\(\hat{v}_{(t)}\) (dm3)                  202      244                  249
\(\hat{V}_{t(t)}\) (m3/ha)               366      241 (58, 72)         231 (58, 83)
\(\hat{V}_{m(t)}\) (m3/ha)               311      209 (33, 52)         201 (33, 61)
\(\hat{N}_{(t)}\) (stems/ha)             1815     991 (1500, 700)      928 (1500, 805)
\(\hat{P}_{r(t)}\) (%/100)               1.02     0.47                 0.47
\(\hat{N}_{lp(t)}\) (logs/ha)            3753     1793 (666, 1161)     1660 (666, 1338)
\(\hat{N}_{ls(t)}\) (logs/ha)            3177     1790 (179, 125)      1813 (179, 160)
\(\hat{V}_{r(t)}\) (m3/ha)               18       15 (4, 3)            9 (4, 4)
\(\hat{M}_{p(t)}\) (t/ha)                19       15 (4, 5)            14 (4, 6)
\(\hat{M}_{s(t)}\) (t/ha)                238      150 (39, 48)         145 (39, 55)
\(\hat{M}_{b(t)}\) (t/ha)                11       14 (3, 3)            15 (3, 4)
\(\hat{M}_{f(t)}\) (t/ha)                9        11 (4, 3)            12 (4, 4)
\(\hat{M}_{t(t)}\) (t/ha)                276      190 (50, 59)         186 (50, 68)
\(\hat{C}_{p(t)}\) (t/ha)                10       7 (2, 2)             7 (2, 3)
\(\hat{C}_{s(t)}\) (t/ha)                119      75 (20, 24)          72 (20, 28)
\(\hat{C}_{b(t)}\) (t/ha)                5        7 (2, 2)             7 (2, 2)
\(\hat{C}_{f(t)}\) (t/ha)                4        6 (2, 3)             6 (2, 2)
\(\hat{C}_{t(t)}\) (t/ha)                138      95 (26, 31)          93 (26, 35)
\(\hat{V}_{c(s)(t)}\) (m3/ha)            127      78 (16, 26)          75 (16, 30)
\(\hat{V}_{l(s)(t)}\) (m3/ha)            184      125 (14, 26)         122 (14, 31)
\(\hat{V}_{c(r)(t)}\) (m3/ha)            101      59 (11, 20)          56 (11, 23)
\(\hat{V}_{l(r)(t)}\) (m3/ha)            210      144 (19, 33)         140 (19, 38)

a - As defined in Table 9.
b - Regime 1 - NI = 5000 (control); Regime 2 - NI = 5000 with \(T_{2A(1)}\) = 45 / \(T_{2N(1)}\) = 1500 + \(T_{2A(2)}\) =
65 / \(T_{2N(2)}\) = 700; Regime 3 - NI = 5000 with \(T_{3A(1)}\) = 45 / \(T_{3N(1)}\) = 1500 + \(T_{3A(2)}\) = 65 / \(T_{3N(2)}\) = 805.

Table 11. Stand-level performance indices for black spruce and jack pine mixtures
subjected to mid-rotation commercial thinning treatments

Index^a               Regime^b
                      1        2        3
RMAI (m3/ha/yr)       3.7      3.5      3.5
RBMI (t/ha/yr)        3.2      3.5      3.6
RCAI (t/ha/yr)        1.6      1.8      1.8
RSL (%)               46       37       37
RLV(s) (%)            59       58       58
RLV(r) (%)            68       68       69
EP(s) (%)             -        15       16
EP(r) (%)             -        5        5
SO (%)                7        8        8
SS (m/m)              102      95       94
WD (g/cm3)            0.45     0.47     0.47
BD (cm)               2.75     2.79     2.79

a - As defined in the text.
b - As defined in Table 10.
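The first three indices in Table 11 can be reproduced from the Table 10 yields if each is taken to be the cumulative yield (rotational yield plus the two thinning yields) divided by the 85 yr rotation age. That definition is an assumption made for this illustration, since the formal definitions appear earlier in the text; the Python names below are likewise hypothetical.

```python
ROTATION_AGE_YR = 85

# (rotational yield, thinning yields) by regime, copied from Table 10.
MERCHANTABLE_VOLUME = {1: (311, ()), 2: (209, (33, 52)), 3: (201, (33, 61))}
TOTAL_BIOMASS = {1: (276, ()), 2: (190, (50, 59)), 3: (186, (50, 68))}
TOTAL_CARBON = {1: (138, ()), 2: (95, (26, 31)), 3: (93, (26, 35))}


def mean_annual_increment(rotation_yield, thinning_yields, age_yr):
    """Cumulative yield (final harvest plus thinnings) per year of rotation."""
    return (rotation_yield + sum(thinning_yields)) / age_yr


def index_by_regime(yields):
    """Apply the increment calculation to every regime, rounded to 0.1."""
    return {
        regime: round(mean_annual_increment(final, thinned, ROTATION_AGE_YR), 1)
        for regime, (final, thinned) in yields.items()
    }


if __name__ == "__main__":
    print("RMAI", index_by_regime(MERCHANTABLE_VOLUME))
    print("RBMI", index_by_regime(TOTAL_BIOMASS))
    print("RCAI", index_by_regime(TOTAL_CARBON))
```

Under this assumption the computed values match the RMAI, RBMI and RCAI rows of Table 11 for all three regimes.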

In terms of site occupancy, the thinned stands had fully reoccupied their sites by rotation
age as measured by their relative densities (0.65 and 0.62 for Regimes 2 and 3, respectively;
Table 10). Crop trees within the thinned stands had fully adjusted to their newly given space
from the second thinning just prior to rotation age (i.e., stand ages of 80 and 83 yr for Regimes
2 and 3, respectively). Based on a 16 cm quadratic mean diameter threshold, the thinned
stands achieved operability status 10 years earlier than the control regime (66 versus 76 yr).
Although merchantable volumetric yields were slightly less for the thinned stands, biomass
productivity and carbon sequestration potential were greater (Table 11). The percentage of
lumber volume recovered over the rotation was approximately equivalent among the regimes,
irrespective of sawmill-type. The economic-based metrics suggested that CT resulted in an
approximate 10% gain in efficiency relative to the control regime. This differential is partially
due to the time-sensitive nature of the net revenues recovered from the end-products obtained
from the CT treatments. The lower costs of harvesting, transportation and manufacturing also
contributed to this result. The duration of optimal site occupancy did not vary much among
the regimes and with the exception of the years immediately following the CT treatments
(response delay period), all the regimes had fully occupied their sites once they had attained
initial crown closure status. Stand stability and branch diameters increased slightly with
thinning whereas mean wood density decreased slightly. Although these results present CT in
a positive light in terms of capturing expected mortality, increasing biomass productivity and
carbon sequestration potential, improving economic efficiency and enhancing stand stability,
the results are dependent on the specified regime chosen and the economic assumptions
employed. Apart from the specific applicability of this example, the demonstration clearly
illustrates the utility of the SSDMM in terms of managing for multiple objectives within the
black spruce and jack pine mixed stand-type.

3.4. Ecological Foundation and Linkages

A large number of relationships derived from applied ecology, plant population biology
and forest science, are used in the development of SDMDs (Drew and Flewelling, 1977; Jack
and Long, 1996; Newton, 1997). These include the reciprocal equations of the competition–
density and yield–density effect (Kira, 1953; Shinozaki and Kira, 1956), self-thinning rule
(Yoda et al., 1963), relative density indices (Ando, 1962; Drew and Flewelling, 1979), and
forest production theories (Langsaeter, 1941; Drew and Flewelling, 1979). The integration of
these biologically-based constructs within the modular-based SSDMM provides an ecological
foundation for density management decision-making.
One of the principal relationships employed is the self-thinning rule which is used to
represent the asymptotic size-density relationship within stands undergoing density-dependent
mortality. Quantitatively, the power exponent of this relationship for black spruce and jack
pine mixtures was found to be -1.2 which is significantly (p ≤ 0.05) different from the
theoretical expected value of -1.5, as postulated under the rule's traditional geometric
derivation (Yoda et al., 1963). However, the self-thinning exponent is much closer to the value
of -1.3 which was proposed for a wide range of tree species based on a mechanistic
reformulation (Enquist et al., 1998) employing the universal scaling law (West et al., 1997).
This derivation is applicable to plant species which have a fractal-like biological resource
distribution network (West et al., 1997). Assuming that (1) individuals compete for spatially
limited resources, (2) resource use per individual (\(Q\)) scales with their mass (\(m\)) according
to \(Q \propto m^{3/4}\), and (3) growth continues until all available resources have been utilized,
the maximum number of plants which can be supported per unit area (\(N_{max}\)) can be related to
the rate of resource supply per unit area (\(R\)) and the average rate of resource use per
individual (\(\bar{Q}\)), according to the relationship \(R = N_{max}\bar{Q} \propto N_{max} m^{3/4}\). As a population
reaches full site occupancy and the rate of resource use and the rate of resource supply
approach equivalency within a given environment, R becomes a constant. Hence,
\(N_{max} \propto m^{-3/4}\), or more generally, \(m = \beta_0 N_{max}^{-4/3}\) where \(\beta_0\) is a constant of proportionality.
Therefore by extension, self-thinning black spruce and jack pine mixed stands may be at a
stationary equilibrium condition in which the rate of resource supply is equivalent to the rate
of resource use.
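The inversion step in this argument can be written out explicitly. In the sketch below, c is a hypothetical constant of proportionality introduced only for illustration (it does not appear in the original derivation), and R is held constant as assumed at full site occupancy:

\[
R = N_{max}\bar{Q} = c\,N_{max}\,m^{3/4}
\quad\Longrightarrow\quad
N_{max} = \frac{R}{c}\,m^{-3/4}
\quad\Longrightarrow\quad
m = \left(\frac{R}{c}\right)^{4/3} N_{max}^{-4/3}
\]

so that the constant of proportionality equals \((R/c)^{4/3}\) and, on a logarithmic scale, \(\log m = \log \beta_0 - \tfrac{4}{3}\log N_{max}\). The slope of \(-4/3 \approx -1.33\) lies between the geometric value of -1.5 and the empirical value of -1.2 reported for the mixed stands.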
Self-thinning derivations proposed by Yoda et al. (1963) and Enquist et al. (1998) assume
that asymmetric competition for finite resources is the principal determinant underlying
density-dependent mortality within plant populations. Although this assumption has been
widely supported by numerous studies, other factors may also be at play within density-
stressed coniferous tree populations. Physical competition for space in which neighbouring
crowns collide during high wind events commonly results in spatially noncontiguous
canopies due to the loss of branches and associated leaf area (Rudnicki et al., 2001). Over
time, such wind-induced crown abrasions and resultant loss of foliar mass for trees within the
lower size classes may produce a mortality pattern analogous to that observed for a light-
based dominance-suppression competitive relationship (i.e., resource pre-emption
competition process in which the smaller individuals do not receive sufficient solar radiation
due to shading from the larger individuals which eventually results in declining growth rates
and subsequent mortality (e.g., Newton and Jolliffe, 2003)). Newton (2006a) derived a
mechanical-based reformulation of the self-thinning rule for monospecific jack pine stands
based on this concept. The sway dynamics, crown collisions and noncontiguous
nature of crown cover (crown shyness) commonly observed within self-thinning mixed
coniferous stands at the later stages of stand development, together with the concordance between the
empirical self-thinning exponent for mixed stands (\(\hat{\beta}_1\) = -1.2) and that predicted by Newton's
(2006a) reformulation (\(-1.3 \le \beta_1 \le -1.0\)), suggest that this mechanical reformulation and its
assumptions (e.g., competition for physical space is partially responsible for the observed
self-thinning pattern) may be applicable to mixed stands as well.
Another important linkage between the SSDMM and ecological theory relates to the
relationship between net production and site occupancy, which is used to define the optimal
density management window. Conceptually, net production increases with increasing site
occupancy until an asymptote is reached for a given species, site quality and stage of stand
development (Langsaeter, 1941; Mar:Möller, 1954; Assmann, 1970). Additional increases in
site occupancy result in a decline in net production, principally due to density-dependent
mortality arising from the self-thinning process. Thus if the objective is to maximize biomass
productivity and by extension carbon sequestration potential, then stands should be allowed to
achieve full occupancy. Once attained, they should be maintained at the asymptotic
occupancy-productivity condition via density management treatments. For black spruce and
jack pine mixed stands, this condition is represented by the optimal density management
window as delineated by relative density indices between 0.32 and 0.45. Hence,
conceptually, mixed stands managed below the 0.32 relative density threshold are not
optimally occupying the site and thus not achieving their full productivity potential.
Conversely, stands in which density-stress levels are above the 0.45 relative density threshold
are likely to incur substantial density-dependent mortality which may result in productivity
losses and negative carbon balances. Operationally, however, maintaining stands within this
optimal window is currently impractical given the necessity to implement numerous light and
costly thinning treatments. Nevertheless, the maturing bio-economy along with developing
carbon markets may make this management proposition more economically viable in the
future.
Ecologically, density management treatments directly affect the competitive
interrelationships within forest tree populations. Treatments such as PCT and CT within
density-stressed stands immediately reduce the intensity and occurrence of symmetrical and
asymmetrical competitive interactions. The temporary reduction or elimination of these
competition pressures increases the availability of moisture, nutrients, solar radiation and
physical growing space, for the remaining crop trees. Thinning from below in which the
smallest trees are removed provides the residual crop trees with an immediate influx of
moisture and nutrients which are shared equally. This competition process is also known as a
resource depletion process in which all competitors passively acquire an equal share of the
available below ground resources on a per-unit size basis (Newton and Jolliffe, 2003).
Conversely, selection thinning in which the trees with the greatest potential to respond to
treatment are left and all others are removed, commonly results in a residual population
comprised mostly of trees from the co-dominant crown classes. Essentially, selection thinning
results in an immediate increase in the availability of both below and above ground resources.
In contrast to the passive nature of the resource depletion process, competition for solar
radiation and physical space is an asymmetrical process in which larger-sized competitors
acquire a disproportionate share of these resources at the expense of the smaller-sized
competitors on a per-unit size basis (Newton and Jolliffe, 2003). The self-thinning stage of
stand development in which the smallest individuals incur mortality is largely driven by a
resource pre-emption process. Thus understanding resource competition processes and their
effects is an essential prerequisite to sound density management decision-making.
In essence, the modular-based SSDMM presented in this study for mixed stands provides
forest managers with an ecological-based decision-support tool for manipulating competitive
relationships within forest tree populations in order to realize single or multiple stand-level
management objectives. In relation to the evolving bio-economy and carbon markets, the
SSDMM can also be used to derive optimal density management regimes for these objectives.
For example, in a general sense, stands should be managed so that they (1) do not excessively
self-thin and thereby do not generate large amounts of abiotic plant material that
could contribute to a negative carbon budget, (2) allocate a greater proportion of their
resources to foliage production in order to increase CO2 sequestration rates, and (3) produce
large volumes of dimensional lumber products so that the sequestered carbon is stored
indefinitely.

3.5. Concluding Notes

Globally, the modular-based SSDMM provides the quantitative foundation for
forecasting and contrasting the volumetric, product, economic and ecological rotational outcomes
of density manipulation in mixed stands. The SSDMM represents a consequential
contribution to the growing list of comprehensive models that have been developed for
addressing multiple resource management objectives (e.g., Di Lucca, 1999; Pretzsch et al.,
2002; Hynynen et al., 2005; Kotze and Malan, 2007) and likewise will enable boreal resource
managers to address a broader range of management objectives. As management objectives
shift towards the production of high-value end-products, bio-energy and carbon
sequestration outcomes, the modular-based SSDMM has the potential to be at the forefront in
facilitating this transformative change. The approach employed in which concepts from
ecology, plant population biology and forest science are combined and integrated within a
common analytical framework, exemplifies the synergy that can be realized through a multi-
disciplinary approach to model development.

ACKNOWLEDGEMENTS
The author expresses his appreciation to: (1) Dave Wood, Forest Ecosystem Science Co-
operative Inc, Thunder Bay, Ontario, Canada, for access to the temporary and permanent
sample plot measurements and records used for model calibration; (2) Dr. I. Amponsah,
Sustainable Resource Development, Alberta, Canada, Dr. D. Reid, Research Scientist, Centre
for Northern Forest Ecosystem Research, Lakehead University, Thunder Bay, Ontario,
Canada, Dr. M. Sharma, Research Scientist, Ontario Forest Research Institute, Sault Ste.
Marie, Ontario, Canada, and John Parton, Provincial Growth and Yield Coordinator, Ontario
Ministry of Natural Resources (OMNR), Timmins, Ontario, Canada, for analytical advice; (3)
NaturaLogic Inc., North Bay, Ontario, Canada, for VB.Net software design assistance; and
(4) Forestry Research Partnership and the Canadian Wood Fibre Centre for fiscal support.

REFERENCES
Ando T. 1962. Growth analysis on the natural stands of Japanese red pine (Pinus densiflora
Sieb. et. Zucc.). II. Analysis of stand density and growth (in Japanese; English summary).
Government of Japan, Bulletin of the Government Forest Experiment Station (Tokyo,
Japan), No. 147.
Ando T. 1968. Ecological studies on the stand density control in even-aged pure stands (in
Japanese; English summary). Government of Japan, Bulletin of the Government Forest
Experiment Station (Tokyo, Japan) No. 210.
Assmann E. 1970. The Principles of Forest Yield Study. 1st English Edition, Pergamon Press
Ltd. Oxford, England.
Bailey R.L., Dell T.R. 1973. Quantifying diameter distributions with the Weibull function.
Forest Science, 19: 97-104.
Barbour R.J., Kellogg R.M. 1990. Forest management and end-product quality: A Canadian
perspective. Canadian Journal of Forest Research, 20: 405-414.
Baskerville G.L. 1972. Use of logarithmic regression in the estimation of plant biomass.
Canadian Journal of Forest Research, 2: 49-53.
Bella I.E. 1967. Crown width/diameter relationship of open-growing jack pine on four site
types in Manitoba. Canadian Department of Forest and Rural Development, Bi-monthly
Research Notes, 23: 5-6.
Bella I.E., De Franceschi J.P. 1974. Early results of spacing studies of three indigenous
conifers in Manitoba. Environment Canada, Forestry Service, Northern Forestry Research
Center, Edmonton, Alberta. Information Report, NOR-X-113.
Cao Q.V. 2004. Predicting parameters of a Weibull function for modeling diameter
distribution. Forest Science, 50: 682-685.
Carmean W.H., Hazenberg G., Deschamps K.C. 2006. Polymorphic site index curves for
black spruce and trembling aspen in northwest Ontario. Forestry Chronicle, 82: 231-242.
Carmean W.H., Niznowski G.P., Hazenberg G. 2001. Polymorphic site index curves for jack
pine in Northern Ontario. Forestry Chronicle, 77: 141-150.
Castedo-Dorado F., Crecente-Campo F., Álvarez-Álvarez P., Barrio A.M. 2009. Development
of a stand density management diagram for radiata pine stands including assessment of
stand stability. Forestry, 82: 1-16.
CCFM (Canadian Council of Forest Ministers). 2009. National forestry database.
http://nfdp.ccfm.org/silviculture/national_e.php.
Clutter J.L., Fortson J.C., Pienaar L.V., Brister G.H., Bailey R.L. 1983. Timber Management:
A Quantitative Approach. 1st Edition, John Wiley and Sons, New York, NY, USA.
Dean T.J., Baldwin Jr. V.C. 1993. Using a density management diagram to develop thinning
schedules for loblolly pine plantations. Government of the United States of America,
Department of Agriculture, Forest Service, Southern Forest Experiment Station, New
Orleans, Louisiana. Research Paper SO-275.
De Vos S. 1973. The use of nearest neighbour methods. Tijdschrift voor Economische en
Sociale Geografie, 64: 308-319.
Di Lucca C.M. 1999. TASS/SYLVER/TIPSY: systems for predicting the impact of
silvicultural practices on yield, lumber value, economic return and other benefits. In: C.
Barnsey, Editor, Stand Density Management Planning and Implementation Conference,
Edmonton, Alberta. Clear Lake Publishing Ltd., Edmonton, Alberta, Canada. pp. 7-16.
Drew T.J., Flewelling J.W. 1977. Some recent Japanese theories of yield-density relationships
and their application to Monterey pine plantations. Forest Science, 23: 517-534.
Drew T.J., Flewelling J.W. 1979. Stand density management: an alternative approach and its
application to Douglas-fir plantations. Forest Science, 25: 518-532.
Emmett B. 2006. Increasing the value of our forest. Forestry Chronicle, 82: 3-4.
Enquist B.J., Brown J.H., West G.B. 1998 Allometric scaling of plant energetics and
population density. Nature, 395: 163-166.
Erdle T. 2000. Forest level effects of stand level treatments: using silviculture to control the
AAC via the allowable cut effect. In: P.F. Newton, Editor, Expert Workshop on the
Impact of Intensive Forest Management on the Allowable Cut, Canadian Ecology Centre,
Mattawa (2000), Ontario, Canada. pp. 19-30.
http://www.forestresearch.ca/Projects/fibre/IFMandACE.pdf.
Fleming R.L., Mossa D.S., Marek G.T. 2005. Upland black spruce stand development 17
years after cleaning and precommercial thinning. Forestry Chronicle, 81: 31-41.
Flewelling J.W., Drew T.J. 1985. A stand density management diagram for lodgepole pine. In
D.M. Baumgarter, R.G. Krebill, J.T. Arnott, and G.F. Weetman (Editors), Lodgepole
pine: the species and its management, Pullman, Washington, USA: Washington State
University.
Heriansyah I., Bustomi S., Kanazawa Y. 2009. Density effects and stand density management
diagram for merkus pine in the humid tropics of Java, Indonesia. Journal of Forestry
Research, 5: 91-113.
Honer T.G., Ker M.F., Alemdag I.S. 1983. Metric timber tables for the commercial tree
species of central and eastern Canada. Government of Canada, Department of
Agriculture, Canadian Forestry Service, Maritimes Forest Research Centre, Fredericton,
New Brunswick. Information Report M-X-140.
Hyink D.M., Moser J.W. 1983. A generalized framework for projecting forest yield and stand
structure using diameter distributions. Forest Science, 29: 85–95.
Hynynen J., Ahtikoski A., Siitonen J., Sievanen R., Liski J. 2005. Applying the MOTTI
simulator to analyse the effects of alternative management schedules on timber and non-
timber production. Forest Ecology and Management, 207: 5-18.
Isobe T., Feigelson E.D., Akritas M.G., Babu G.J. 1990. Linear regression in astronomy. I.
Astrophysical Journal, 364: 104-113.
Jack S.B., Long J.N. 1996. Linkages between silviculture and ecology: an analysis of density
management diagrams. Forest Ecology and Management, 86: 205–220.
Jolliffe P.A., Eaton G.W., Potdar M.V. 1988. Plant growth analysis: allometry, growth and
interference in Orchardgrass and Timothy. Annals of Botany, 62: 31-42.
Development and Utility of an Ecological-based Decision-Support System … 169

Kang K.Y., Zhang S.Y., Mansfield S.D. 2004. The effects of initial spacing on wood density,
fibre and pulp properties in jack pine (Pinus banksiana Lamb.). Holzforschung, 58: 455-
463.
Khil‘mi G.F. 1957. Theoretical Forest Biogeophysics. Academy of Sciences of the USSR
(Translated from Russian by the National Science Foundation, Washington, D.C.).
Kim D.K., Kim J.W. Park S.K. Oh M.Y., Yoo J.H. 1987. Growth analysis of natural pure
young stand of red pine in Korea and study on the determination of reasonable density (in
Korean; English abstract). Government of Korea, Research Reports of the Forestry
Institute (Seoul, Korea), 34: 32-40.
Kira T., Ogawa H., Sakazaki N. 1953. Intraspecific competition among higher plants. I.
Competition-yield-density interrelationship in regularly dispersed populations. Journal of
the Institute of Polytechnics (Osaka City University, Japan), Series D, 4: 1-16.
Kotze H., Malan F. 2007. Further progress in the development of prediction models for
growth and wood quality of plantation-grown Pinus patula sawtimber in South Africa.
In: Dykstra, D.P., Monserud, R.A. (Tech. Eds), Forest Growth and Timber Quality:
Crown Models and Simulation Methods for Sustainable Forest Management, Proceedings
of an International Conference, August 7-10, 2007, Portland, OR, USA. Department of
Agriculture, Forest Service, Pacific Northwest Research Station, General Technical
Report, PNW-GTR-791: 113-123.
Langsaeter A. 1941. Om tynning i enaldret gran- og furuskog (About thinning in even-aged
stands of spruce, fir and pine). Meddel. F. d. Norske Skogforsoksvesen, 8: 131-216.
Lindh B.C., Muir P.S. 2004. Understory vegetation in young Douglas-fir forests: does
thinning help restore old-growth composition. Forest Ecology and Management, 192:
285-296.
Liu C., Zhang S.Y. 2005. Models for predicting product recovery using selected tree
characteristics of black spruce. Canadian Journal of Forest Research, 35: 930-937.
Long J.N., McCarter J.B., Jack S.B. 1988. A modified density management diagram for
coastal Douglas-fir. Western Journal of Applied Forestry, 2: 6-10.
López-Sánchez C., Rodríguez-Soalleiro R. 2009. A density management diagram including
stand stability and crown fire risk for Pseudotsuga Menziesii (Mirb.) Franco in Spain.
Mountain Research and Development, 29: 169-176.
Mallows C.L. 1973. Some comments on Cp. Technometrics, 15: 661-675.
Mar:Möller C.M. 1954. The influence of thinning on volume increment. I. Results of
investigations. In: C. Mar:Möller, J. Abell, T. Jagd, and F. Juncker, Editors, Thinning
problems and practices in Denmark, State University of New York, College of Forestry,
Syracuse, NY, USA, Technical Publication, 76: 5-32.
McCarter J.B., Long J.N. 1986. A lodgepole pine density management diagram. Western
Journal of Applied Forestry, 1: 6-11.
McClain K.M., Morris D.M., Hills S.C., Buse L.J. 1994. The effects of initial spacing on
growth and crown development for planted northern conifers: 37-year results. Forestry
Chronicle, 70: 174–182.
McKinnon L.M., Kayahara G.J., White R.G. 2006. Biological framework for commercial
thinning evenaged single-species stands of jack pine, white spruce, and black spruce in
Ontario. Ontario Ministry of Natural Resources, Science and Information Branch,
Northeast Science and Information Section. Technical Report, TR-046.
170 Peter F. Newton

Meyer J.S., Ingersoll C.G., McDonald L.L., Boyce M.S. 1986. Estimating uncertainty in
population growth rates: jackknife vs. bootstrap techniques. Ecology, 67: 1156-1166.
Nelder J.A. 1962. New kinds of systematic designs for spacing experiments. Biometrics, 18:
283-307.
Neter J., Wasserman W., Kutner M.H. 1990. Applied Linear Statistical Models, 3rd Edition,
Irwin, Boston, MA, USA.
Newton P.F. 1997. Stand density management diagrams: review of their development and
utility in stand-level management planning. Forest Ecology and Management, 98: 251-
265.
Newton P.F. 2003. Yield prediction errors of a stand density management program for black
spruce and consequences for model improvement. Canadian Journal of Forest
Research, 33: 490-499.
Newton P.F. 2006a. Asymptotic size–density relationships within self-thinning black spruce
and jack pine stand-types: Parameter estimation and model reformulations. Forest
Ecology and Management, 226: 49-59.
Newton P.F. 2006b. Forest production model for upland black spruce stands—Optimal site
occupancy levels for maximizing net production. Ecological Modelling, 190: 190–204.
Newton P.F. 2009. Development of an integrated decision-support model for density
management within jack pine stand-types. Ecological Modelling, 220: 3301-3324.
Newton P.F. 2010. Stand Density Management Diagrams. SciTopics. Retrieved November
30, 2010, from http://www.scitopics.com/Stand_Density_Management_Diagrams.html.
Newton P.F., Amponsah I.G. 2005. Evaluation of Weibull-based parameter prediction
equation systems for black spruce and jack pine stand types within the context of
developing structural stand density management diagrams. Canadian Journal of Forest
Research, 35: 2996-3010.
Newton P.F., Amponsah I.G. 2007. Comparative evaluation of five height–diameter models
developed for black spruce and jack pine stand-types in terms of goodness-of-fit, lack-of-
fit and predictive ability. Forest Ecology and Management, 247: 149–166.
Newton P.F., Jolliffe P.A. 2003. Aboveground dry matter partitioning responses of black
spruce to directional-specific indices of local competition. Canadian Journal of Forest
Research, 33: 1832-1845.
Newton P.F., Sharma M. 2008. Evaluation of sampling design on taper equation performance
in plantation-grown Pinus banksiana. Scandinavian Journal of Forest Research, 23: 358-
370.
Newton P.F., Weetman G.F. 1993. Stand density management diagrams and their utility in
black spruce management. Forestry Chronicle, 69: 421-430.
Newton P.F., Weetman G.F. 1994. Stand density management diagram for managed black
spruce stands. Forestry Chronicle, 70: 65-74.
Newton P.F., Lei Y., Zhang S.Y. 2004. A parameter recovery model for estimating black
spruce diameter distributions within the context of a stand density management diagram.
Forestry Chronicle, 80: 349-358.
Newton P.F., Lei Y., Zhang S.Y. 2005. Stand-level diameter distribution yield model for
black spruce plantations. Forest Ecology and Management, 209: 181-192.
Nilsen P., Strand L.T. 2008. Thinning intensity effects on carbon and nitrogen stores and
fluxes in a Norway spruce (Picea abies (L.) Karst.) stand after 33 years. Forest Ecology
and Management, 256: 201–208.
Development and Utility of an Ecological-based Decision-Support System … 171

Pelletier G., Pitt D.G. 2008. Silvicultural responses of two spruce plantations to midrotation
commercial thinning in New Brunswick. Canadian Journal of Forest Research, 38: 851-
867.
Peltola A. 2009. Finnish Statistical Yearbook of Forestry. http://www.metla.fi/julkaisut/
metsatilastollinenvsk/index-en.htm.
Pretzsch H. 2009. Forest Dynamics, Growth and Yield. Springer, Verlag, Berlin and
Heidelberg.
Pretzsch H., Biber P., Dursky J. 2002. The single tree-based stand simulator SILVA:
construction, application and evaluation. Forest Ecology and Management, 162: 3–21.
Rowe J.S. 1972. Forest regions of Canada. Government of Canada, Department of
Environment, Canadian Forestry Service, Ottawa, Ontario. Publication No. 1300.
Rudnicki M., Silins U., Lieffers V.J., Josi G. 2001. Measure of simultaneous tree sways and
estimation of crown interactions among a group of trees. Trees, 15: 83-90.
Sharma M., Parton J. 2009. Modeling stand density effects on taper for jack pine and black
spruce plantations using dimensional analysis. Forest Science, 55: 268-282.
Shinozaki K., Kira, T. 1956. Intraspecific competition among higher plants. VII. Logistic
theory of the C-D effect. Journal of the Institute of Polytechnics (Osaka City University,
Japan), Series D, 12: 69-82.
Smith F.W., Long J.N. 1987. Elk hiding and thermal cover guidelines in the context of
lodgepole pine stand density. Western Journal of Applied Forestry, 2: 6-10.
Smith N.J. 1989. A stand-density control diagram for western red cedar, Thuja plicata. Forest
Ecology and Management, 27: 235-244.
Sprugel D.G. 1983. Correcting for bias in log-transformed allometric equations. Ecology, 64:
209-210.
Stankova T.V., Shibuya M. 2007. Stand Density Control Diagrams for Scots pine and
Austrian black pine plantations in Bulgaria. New Forests, 34: 123-141
Sturtevant B.R., Bissonette J.A., Long J.N. 1996. Temporal and spatial dynamics of boreal
forest structure in western Newfoundland: silvicultural implications for marten habitat
management. Forest Ecology and Management, 87: 13-25.
Tong Q.J., Zhang S.Y., Thompson M. 2005. Evaluation of growth response, stand value and
financial return for pre-commercially thinned jack pine stands in Northwestern Ontario.
Forest Ecology and Management, 209: 225-235.
Tong Q.J., Fleming R.L., Tanguay F., Zhang S.Y. 2009. Wood and lumber properties from
unthinned and precommercially thinned black spruce plantations. Wood and Fibre
Science, 41: 168-179.
Verschuyl J., Riffell S., Miller D., Wigley T.B. 2011. Biodiversity response to intensive
biomass production from forest thinning in North American forests – A meta-analysis.
Forest Ecology and Management, 261: 221-232.
Vezina P.E. 1963. More about the crown competition factor. Forestry Chronicle, 39: 313-
317.
Watt W.R, Parton J., Chen H., Lucking G., Houle N., Levesque S., Luke A. 2001. Standard
forest units for northeastern Ontario boreal Forests. Ontario Ministry of Natural
Resources, Northeast Science and Technology Unit, Information Report (Draft dated
14/02/01).
Weibull W. 1951. A statistical distribution function of wide applicability. Journal of Applied
Mechanics, 18: 293-297.
172 Peter F. Newton

West G.B., Brown J.H., Enquist B.J. 1997. A general model for the origin of allometric
scaling laws in biology. Science, 276: 122-126.
Yoda K., Kira T., Ogawa H., Hozumi K. 1963. Self-thinning in overcrowded pure stands
under cultivated and natural conditions. Journal of Biology, (Osaka City University,
Japan) 14: 107-129.
Zhang S.Y., Lei Y.C., Bowling C. 2005. Quantifying stem quality characteristics in relation
to initial spacing and modeling their relationship with tree characteristics in black spruce
(Picea mariana). Northern Journal of Applied Forestry, 22: 85-93.
Zhang S.Y., Liu C., Jiang Z.H. 2006. Modeling product recovery in relation to selected tree
characteristics in black spruce using an optimized random sawing simulator. Forest
Products Journal, 56: 93-99.
In: Ecological Modeling ISBN: 978-1-61324-567-5
Editor: WenJun Zhang, pp. 173-204 © 2012 Nova Science Publishers, Inc.

Chapter 8

ECOLOGICAL NICHE MODELS IN MEDITERRANEAN HERPETOLOGY: PAST, PRESENT AND FUTURE

A. Márcia Barbosa1,2*, Neftalí Sillero3, Fernando Martínez-Freiría4 and Raimundo Real5

1 'Rui Nabeiro' Biodiversity Chair, CIBIO (Centro de Investigação em Biodiversidade e Recursos Genéticos) – University of Évora, 7004-516 Évora, Portugal.
2 Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot (Berkshire) SL5 7PY, United Kingdom.
3 CICGE (Centro de Investigação em Ciências Geo-Espaciais), Faculty of Sciences, University of Porto, Rua do Campo Alegre 687, 4169-007 Porto, Portugal.
4 CIBIO (Centro de Investigação em Biodiversidade e Recursos Genéticos) – University of Porto, Instituto de Ciências Agrárias de Vairão, R. Padre Armando Quintas, 4485-661 Vairão, Portugal.
5 Biogeography, Diversity and Conservation Lab, Department of Animal Biology, Faculty of Sciences, University of Málaga, 29071 Málaga, Spain.

ABSTRACT
We present a review of the concepts and methods associated with ecological niche
modeling, illustrated with the published works on amphibians and reptiles of the
Mediterranean Basin, one of the world's biodiversity hotspots for conservation priorities.
We start by introducing ecological niche models, analyzing the various concepts of niche
and the modeling methods associated with each of them. We list some conceptual and
practical steps that should be followed when modeling, and highlight the pitfalls that
should be avoided. We then outline the history of ecological modeling of Mediterranean
amphibians and reptiles, including a variety of aspects: identification of the ecological
niche; detection of common distribution areas (chorotypes) and other biogeographical
patterns; analysis and prediction of species richness patterns; analysis of the expansion of
native and invasive species; integration of molecular data with spatial modeling;
identification of contact zones between related taxa; assessment of species' conservation
status; and prediction of future conservation problems, including the effects of global
change. We conclude this review with a discussion of the research that still needs to be
developed in this area.

Keywords: ecological niche models, Mediterranean Basin, amphibians, reptiles.

1. PART ONE: INTRODUCTION TO ECOLOGICAL NICHE MODELING


1.1. What Are Ecological Niche Models?

Ecological niche models (ENM) are empirical or mathematical approaches to the
ecological niche of a species (for reviews see Guisan and Zimmermann, 2000, Austin, 2002,
Rushton et al., 2004, Guisan and Thuiller, 2005, Araújo and Guisan, 2006, Guisan et al.,
2006; Peterson, 2006, Austin, 2007, Jiménez-Valverde et al., 2008, Morin and Lechowicz,
2008). The primary objective of an ENM is to relate different types of eco-geographical
(environmental, topographical, human, or purely spatial) variables to the distribution of a
species, in order to identify the factors that limit and define its niche. The final result of an
ENM may be a spatial representation of the habitats that favor the presence of a species
(Guisan and Zimmermann, 2000). An ENM can be used to predict suitable habitats in poorly
sampled areas (Engler et al., 2004), or in the future under expected environmental changes
(e.g. Shugart, 1990, Sykes et al., 1996, Teixeira and Arntzen, 2002, Araújo et al., 2006). It
can also be related to trends in species abundance (Araújo and Williams, 2000, Real et al.,
2009) or to their probability of persistence in certain areas (Araújo and Williams, 2000).
ENMs have become popular due to the need for efficiency in the design and implementation
of conservation management (Bulluck et al., 2006).

1.2. What is the Ecological Niche?

Several definitions of ecological niche have been proposed over time. The first one is due
to Grinnell (1917), who understood the ecological niche as a subdivision of the habitat
containing the environmental conditions that allow the individuals of a species to survive and
reproduce. This concept is based on variables for which the species do not compete (climatic or
scenopoetic variables according to Soberón, 2007; see also Hirzel and Le Lay, 2008, Wiens
et al., 2009). On the other hand, Elton (1927) emphasized the functional role of a species in a
community, especially its position in the food chain, depending on variables that can be
consumed by the species (nutrients or bionomic variables in Soberón, 2007). Finally,
Hutchinson (1957) defined mathematically the fundamental and the realized niche (Figure 1).
The fundamental niche is an n-dimensional volume of environmental space within which a
species can maintain a viable population and persist over time without immigration. Each
dimension is an environmental variable that influences the niche. The realized niche is a part
of the fundamental niche where the species is not excluded by competition. The main
difference between Grinnell's and Elton's niche concepts and Hutchinson's is that
the former two used the term niche to refer to places in the environment that can
accommodate the species, while for Hutchinson, species, and not the environment, have
niches.
Jackson and Overpeck (2000) replaced the concept of realized niche by that of potential
niche, which is the part of the fundamental niche that is available for the species; some parts
are not available because not all possible combinations of variables under which the species
could survive currently exist in the environment. Similarly, Colwell and Rangel (2009) and
Soberón and Nakamura (2009) considered that there are three different niches: the
fundamental niche, the potential niche (i.e. the existing part of the fundamental niche), and
the realized niche. Finally, Pearson (2007) introduced the concept of occupied niche, to which
species distributions are limited by historical, geographical, and biotic factors (dispersal
ability, competition, predation, parasitism, symbiosis).

Figure 1. The easiest way to visualize the different ecological niches is the BAM (biotic, abiotic,
movement) diagram (see Soberón and Peterson, 2005, Soberón, 2007), which represents the theoretical
environmental space divided into the three main factors that limit the distribution of a species. The
suitable habitat corresponds to the area common to all three factors, which represents the occupied
niche (ON; sensu Pearson, 2007). The area shared by A and M represents Grinnell's niche (GN). The
area shared by A and B is Elton's niche (EN). The whole A area is Hutchinson's fundamental niche
(FN). A species can live in climatically favorable regions to which it has been able to disperse and from
which it is not excluded by biotic interactions. Regions that fail to meet all these conditions are not
suitable for the species' presence.
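
The set relationships described in the caption of Figure 1 can be sketched with plain Python sets. This is only an illustration: the grid-cell identifiers and the membership of each set are invented, and the labels simply follow the caption (FN, GN, EN, ON).

```python
# Hypothetical grid cells; the membership of each set is invented for illustration.
A = {1, 2, 3, 4, 5, 6}   # cells with suitable abiotic conditions
B = {3, 4, 5, 6, 7, 8}   # cells with favorable biotic interactions
M = {1, 2, 3, 4, 9}      # cells the species has been able to reach (movement)

fundamental_niche = A          # FN: the whole A area
grinnell_niche = A & M         # GN: area shared by A and M
elton_niche = A & B            # EN: area shared by A and B
occupied_niche = A & B & M     # ON: area common to all three factors

print("Occupied niche cells:", sorted(occupied_niche))
```

Cells outside the triple intersection fail at least one of the three conditions and so would not be expected to hold the species.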

To these concepts we must add two other important ones: the source-sink theory
(Pulliam, 1988, Pulliam, 2000) and dispersal limitation (Holt, 2003). According to the source-
sink theory, some populations may occupy unsuitable habitats (sinks) due to immigration
from healthier nearby populations (sources). Although individuals in the sinks may die due to
the adverse conditions, they are replaced by new immigrants. Here, the realized distribution
goes beyond the fundamental niche, as the species occupies habitats that are inadequate and
not contained in the niche (Pulliam 1988, Pulliam 2000). With dispersal limitation, a species
can be absent from suitable habitats for historical reasons or due to limitations in its ability to
disperse to those habitats (Holt, 2003).
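
The source-sink argument can be made concrete with a minimal numerical sketch. The linear recurrence used here, N[t+1] = r N[t] + I, and all parameter values are assumptions chosen for illustration; they are not taken from Pulliam (1988, 2000). With a local growth rate r below 1 the population declines toward zero when isolated, whereas a constant number of immigrants I per year keeps it near I / (1 - r).

```python
def simulate_sink(n0, growth_rate, immigrants, years):
    """Project a sink population with N[t+1] = growth_rate * N[t] + immigrants."""
    sizes = [float(n0)]
    for _ in range(years):
        sizes.append(growth_rate * sizes[-1] + immigrants)
    return sizes

# Hypothetical values: a growth rate below 1 means local decline without immigration.
isolated = simulate_sink(n0=100, growth_rate=0.8, immigrants=0, years=50)
subsidized = simulate_sink(n0=100, growth_rate=0.8, immigrants=10, years=50)

print("Isolated sink after 50 years:", round(isolated[-1], 3))
print("Subsidized sink after 50 years:", round(subsidized[-1], 3))
```

In this sketch the species would be recorded as present in the subsidized sink even though local conditions alone could not sustain it, which is why such presences can mislead a niche model.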

1.3. Types of Ecological Niche Models

ENMs can be classified into mechanistic (explanatory) and statistical/correlative
(predictive). Mechanistic models are based on hypothetical cause-effect relationships between
the variables and the species' distribution, which makes them more ecologically meaningful.
They use variables that, according to existing theory or experimental results, have a direct
effect on the species' survival, such as temperature or humidity. In contrast, correlative ENMs
are based on statistical correlations between species occurrence and variables that do not
necessarily have a direct effect on the species, such as altitude or latitude, but that summarize
the effects of various direct factors, and are easier to measure (Guisan and Zimmermann,
2000). Correlative models tend to provide more accurate predictions than mechanistic ones,
and they can also have an explanatory component: more than simply predicting the species'
geographic distribution, they may reflect important aspects of its biology and natural history,
and suggest underlying ecological factors not included in the existing theory (e.g. Peterson
and Cohoon, 1999).
Stoms et al. (1992) proposed an alternate classification into deductive and inductive
models, based on the conceptual approach used to define the species-environment
relationship. Deductive models use expert opinion on the ecological requirements of a species
to infer where the appropriate areas are within the studied territory. Such models are
subjective and limited to the few species and habitats whose relationship is sufficiently
known (Araújo et al., 2005). Conversely, inductive models perform an environmental
characterization, through statistical analyses, of the species' distribution range, to infer its
ecological preferences. Then, following a more objective deductive process, these preferences
are extrapolated to the studied territory (Pereira and Itami, 1991, Aspinall and Matthews,
1994, Woodward and Cramer, 1996).
The existence of a correlation between a variable and the distribution of a species does
not imply a cause-effect relationship. The explanatory interpretation of ENMs should thus be
taken with caution, as the causal effect of one variable on the species can be masked by the
effects of other non-causal variables that are correlated with it (MacNally, 2000; see section
1.7). In fact, Kearney (2006) believes that only mechanistic models can predict the niche of a
species, as they are the only ones that rely on a theoretical foundation for such cause-effect
relationships. However, the theoretical basis necessary for building mechanistic ENMs (e.g.
MacNally, 2000) is seldom available. If there is not sufficient knowledge about the species to
identify the direct determinants of its distribution, correlative ENMs based on statistical
relationships can be very helpful, as long as the constraints and limitations inherent to the
statistical analyses are taken into account (MacNally, 2000).

1.4. Correlative Niche Modeling Methods

The correlative methods of modeling species' niches can be classified into three main
groups: presence/absence methods, profile methods, and presence-only methods. The first
kind relates a binary dependent variable (i.e., with only two possible values, such as presence
and absence, one and zero) to a series of independent variables, so it infers the conditions
that make a species present rather than absent. This kind of method includes generalized
linear models (GLM), such as discriminant analysis (Lachenbruch, 1975), logistic regression
(Hosmer and Lemeshow, 1989), and the favorability function (Real et al., 2006); and
generalized additive models (GAM; Hastie and Tibshirani, 1990), which are more complex
and usually fit the data better, but may be less general, i.e., less applicable to other data sets.
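
As a minimal sketch of the presence/absence approach, the functions below compute a probability of presence from an already fitted logistic regression, and then the favorability value as it is usually formulated following Real et al. (2006): the odds of the probability relative to the odds of the species' prevalence in the data set. The coefficient values, the single predictor and the numbers of presences and absences are hypothetical.

```python
import math

def logistic_probability(intercept, coefficients, values):
    """Probability of presence from a fitted logistic regression (logit link)."""
    y = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-y))

def favorability(p, n_presences, n_absences):
    """Favorability: the probability odds weighed against the prevalence odds."""
    odds = p / (1.0 - p)
    return odds / (n_presences / n_absences + odds)

# Hypothetical model with one predictor, fitted on 30 presences and 70 absences.
p = logistic_probability(intercept=-2.0, coefficients=[0.25], values=[10.0])
f = favorability(p, n_presences=30, n_absences=70)
print("Probability:", round(p, 3), "Favorability:", round(f, 3))
```

Under this formulation, a site whose probability equals the prevalence of the species gets a favorability of 0.5, so the value no longer depends on how common the species is in the sample.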
Profile methods compare the environmental conditions in the observed presence areas
with the conditions available in the whole study area, thus outlining presence against a
background. These methods include ecological niche factor analysis (ENFA; Perrin, 1984,
Hirzel et al., 2002), the genetic algorithm for rule-set production (GARP; Stockwell and
Noble, 1992), and maximum entropy (Maxent; Phillips et al., 2004). The definition of these
methods as "presence-only" is incorrect, as they compare presence areas with all the
environmental space analyzed (the so-called background), which includes both presence and
non-presence areas (e.g. Phillips et al., 2009). From this background, some authors select
pseudo-absences (e.g. Chefaoui and Lobo, 2008).
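
The idea of outlining presences against a background can be illustrated for a single variable. The sketch below measures how far the mean of the presence cells lies from the background mean, in units of the background standard deviation. It is loosely in the spirit of the marginality concept used in ENFA, but it is a simplified assumption for illustration and not the ENFA algorithm; all values are invented.

```python
import statistics

def standardized_shift(presence_values, background_values):
    """Distance of the presence mean from the background mean, in background SDs."""
    mean_bg = statistics.mean(background_values)
    sd_bg = statistics.pstdev(background_values)
    return (statistics.mean(presence_values) - mean_bg) / sd_bg

# Hypothetical mean annual temperature (degrees C) in every cell of the study area.
background = [2, 4, 6, 8, 10, 12, 14, 16]
# Values in the cells where the species was recorded (a subset of the background).
presence = [12, 14, 16]

shift = standardized_shift(presence, background)
print("Standardized shift:", round(shift, 3))
```

Note that the background includes the presence cells themselves, which is the reason the text above argues that such methods are not strictly presence-only.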
Presence-only data can be modeled, for example, with overlap analysis (Brito et al., 1999,
Arntzen and Teixeira, 2006), which overlays the species' presence area onto the environmental
variables to derive the range of environmental conditions under which the species can live.
Mahalanobis distance (Etherington et al., 2009), the multidimensional envelope (MDE) used
in BIOCLIM (Busby, 1991), and the HABITAT (Walker and Cocks, 1991) and DOMAIN
models (Carpenter et al., 1993) are other examples of truly presence-only methods. Another
way to analyze only presence data is to use binary models such as logistic regression to
confront, rather than presence and absence of a species, the presence of one species or
subspecific variant against the presence of another one (Romero and Real, 1996, Brito and
Crespo, 2002, Arntzen and Alexandrino, 2004, Real et al., 2005, Arntzen and Espregueira
Themudo, 2008).
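
A rectilinear environmental envelope, the simplest presence-only idea mentioned above, can be sketched as follows. This is not the BIOCLIM implementation: it uses the full minimum-maximum range of each variable as a simplifying assumption, and the variable names and values are hypothetical.

```python
def fit_envelope(presence_records):
    """Return the (min, max) range of each variable across the presence records."""
    variables = presence_records[0].keys()
    return {
        v: (min(r[v] for r in presence_records), max(r[v] for r in presence_records))
        for v in variables
    }

def inside_envelope(site, envelope):
    """A site is considered suitable only if every variable falls within its range."""
    return all(low <= site[v] <= high for v, (low, high) in envelope.items())

# Hypothetical presence records (temperature in degrees C, rainfall in mm).
presences = [
    {"temp": 12.0, "rain": 600.0},
    {"temp": 15.5, "rain": 450.0},
    {"temp": 14.0, "rain": 800.0},
]
envelope = fit_envelope(presences)

print(inside_envelope({"temp": 13.0, "rain": 700.0}, envelope))
print(inside_envelope({"temp": 20.0, "rain": 700.0}, envelope))
```

No absence or background information enters the calculation, so the envelope describes where the species has been found rather than where it is missing.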
There are more complex methods, such as random forests, classification and regression
trees (CART), multivariate adaptive regression splines (MARS), and artificial neural
networks (ANN). Moisen and Frescino (2002) compared the effectiveness of various methods
in modeling the distribution of simulated and actual data on forest variables. The models built
with more sophisticated techniques showed better results with simulated data, but for real data
the differences were not significant and a simple linear model worked almost as well as
complex models. Complex methods may describe species' distributions more accurately, but
produce models that are more difficult to interpret from an ecological point of view. Moreover, they
seldom provide intelligible information on which environmental variables are related
to species' distributions and how, information that can be valuable from a conservation standpoint.
A compromise between complexity and intelligibility may be desirable in ENMs meant for
use in conservation and management.
Some absence records may actually correspond to presences that were not detected for
various reasons, including insufficient or no surveying effort. This affects not only
presence/absence modeling methods, but also profile methods, which will include these false
absences in the background but outside the presence area; and presence-only methods, which
will fail to include these undetected presences in the analysis. When the survey is not too
deficient or spatially biased (Reese et al., 2005), explicitly including absence data can
improve the ENM by providing information on locations that may be less suitable for the
species (Hirzel et al., 2001; see also section 1.5) due to historical (e.g. barriers to dispersal),
biotic (competition, predation) or human restrictions (Guisan and Zimmermann, 2000,
Anderson et al., 2002). Even when the data include false absences, these are often associated
with low local abundance of the species, so their inclusion can improve the relationship between
ENM predictions and actual species abundance (Real et al., 2009, Barbosa et al., 2009).
If the aim is to obtain the fundamental niche of the species, the use of absence data may
have an undesirable effect by excluding suitable environmental areas where the species is not
present due to historical limitations or biotic interactions. But if the goal is to approach the
realized niche and the data are of generally good quality, it is advisable to explicitly include
absences, even if they are not all correct. In any case, the quality of absence data is as
important for the models as the quality of the presence data.
The application of profile or presence-only methods may be more desirable in the case of
very limited or dispersed presence data, such as those taken from herbaria or museum
collections (Elith et al., 2006) or from samples identified through molecular analysis (Real et
al., 2005). Presence/absence models based on distribution atlases have been widely used with
success and have shown good relationships with independent data on species abundance (Real
et al., 2009) and good extrapolation ability, both to contiguous geographic areas (Barbosa et
al., 2009) and to finer resolution scales (Barbosa et al., 2010). To be able to model presences
and absences with more reliable data, it would be useful to publish information on areas that
have been surveyed but where the target species were not found.
Abundance data can also be modeled (e.g. Anadón et al. 2010). Abundance provides
more information than the simple presence or absence of species, but it also contains more
noise and is more costly to measure, so it is rarely available. Abundance data can be analyzed
with generalized linear models that assume a Poisson distribution, suitable for count data, or a
negative binomial distribution, when there is over-dispersion of these data (i.e., when the
variance is greater than the mean). When the data contain many more zeros than would be
expected according to any of these distributions, zero-inflated models are more appropriate
(Zuur et al., 2009).
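
Before choosing among these distributions, a quick descriptive check of the counts can help. The sketch below compares the variance with the mean (a ratio well above 1 suggests over-dispersion relative to a Poisson distribution) and compares the observed number of zeros with the number expected under a Poisson distribution with the same mean. The counts are invented, and this is an exploratory summary under those assumptions, not a formal test.

```python
import math
import statistics

def dispersion_summary(counts):
    """Summarize over-dispersion and excess zeros relative to a Poisson distribution."""
    mean = statistics.mean(counts)
    variance = statistics.variance(counts)  # sample variance
    return {
        "mean": mean,
        "variance": variance,
        "dispersion_ratio": variance / mean,
        "observed_zeros": counts.count(0),
        # Under a Poisson distribution, P(count = 0) = exp(-mean).
        "expected_zeros_poisson": len(counts) * math.exp(-mean),
    }

# Hypothetical counts of individuals in ten surveyed plots.
counts = [0, 0, 0, 0, 0, 0, 1, 2, 3, 14]
summary = dispersion_summary(counts)
for name, value in summary.items():
    print(name, round(value, 3))
```

In this invented data set the variance far exceeds the mean and there are many more zeros than a Poisson distribution would produce, the situation in which the negative binomial or zero-inflated models mentioned above become relevant.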

1.5. Which Niche Do ENMs Represent?

Not all ENMs represent the same niche: the result varies depending on the method and
the type of variables used (Soberón and Nakamura, 2009; Figures 1 and 2). It is widely
accepted that mechanistic ENMs predict the fundamental niche (Pearson and Dawson, 2003,
Kearney and Porter, 2004, Kearney and Porter, 2009, Rödder et al., 2009) and correlative
ENMs are closer to the realized niche, since the recorded presences are determined by biotic
and abiotic factors (Pearson and Dawson, 2003, Araújo and Guisan, 2005, Guisan and
Thuiller, 2005, Soberón and Peterson, 2005, Kearney, 2006, Morin and Lechowicz, 2008,
Pearman et al., 2008, Colwell and Rangel, 2009, Lobo et al., 2010). Within correlative ENMs,
those calculated with presence/absence data yield the probability of finding the species in
each portion of the study area. Above a chosen probability threshold, these models can be
considered to represent the spatial distribution of the habitats that are both suitable and
occupied by the species (Guisan and Zimmermann, 2000; Pearson, 2007). Models based only
on presence data provide an indication of the suitability of the habitat, not necessarily
implying that the species will be found there.
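
Applying a threshold to the output of a presence/absence model can be sketched in a few lines. The cell names, the predicted probabilities and the threshold of 0.5 are all hypothetical; the choice of threshold is a modeling decision that this sketch does not address.

```python
# Hypothetical predicted probabilities of presence for four grid cells.
predicted = {"cell_a": 0.91, "cell_b": 0.48, "cell_c": 0.12, "cell_d": 0.67}

def cells_above(predictions, threshold):
    """Return the cells whose predicted probability exceeds the threshold."""
    return sorted(cell for cell, p in predictions.items() if p > threshold)

suitable_and_occupied = cells_above(predicted, threshold=0.5)
print(suitable_and_occupied)
```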
Correlative methods forecast the distribution of the species through correlations among
environmental variables. The values of these variables are determined by the geographical
positions of the species' records. In other words, correlative ENMs are sensitive to the
topology of presences (their geographical positions and the relationships between them). If
we want to calculate a species' potential distribution without considering its geographical
records, we should use a mechanistic modeling method.

Figure 2. The idea of habitat suitability actually corresponds to a gradient (Soberón, 2010) with
extremely favorable habitats on one end, and completely unfavorable ones (where the species simply
cannot survive) on the other. The position of the species in this gradient depends on the intensity with
which climate, dispersal and biotic interactions act. In habitats with unfavorable climate, the species can
survive for some time if there is immigration and there are no competing species. In contrast, the
species may be absent from a climatically favorable area if it is excluded by other species or if
geographical barriers prevent its access. Intermediate situations represent small populations that subsist
despite unfavorable climatic conditions and biotic interactions. See abbreviations in Figure 1.
180 A. Márcia Barbosa, Neftalí Sillero, Fernando Martínez-Freiría et al.

However, not all authors acknowledge that ENMs predict the ecological niche. For
Kearney (2006), the niche is a mechanistic rather than a descriptive concept. In this view,
correlative ENMs model only habitats. In contrast, mechanistic ENMs model the niche and
are the only ones that establish a mechanistic connection between the model and the species.
Kearney (2006) considers that the niche is determined by the factors that allow a species to
survive, which implies a mechanistic relationship between the species and these factors, while
the habitat is simply the place where the species lives. Jiménez-Valverde et al. (2008) and
Lobo (2008) also argue that correlative ENMs predict habitats rather than niches: if the
absences of the species are recorded in areas where they are excluded by biotic factors, the
model calculates the realized distribution; if they are taken from areas where the species is
excluded only by abiotic factors, the model calculates the potential distribution. However,
these "distributions" do not correspond to the realized and fundamental niches, respectively.
Godsoe (2010) goes further by proving mathematically that it is impossible to calculate the
niche: the only thing that can be determined is whether the set of variables used to calculate
an ENM belongs to the niche or not.

1.6. Which Conceptual and Practical Steps Should Be Followed When Modeling, and Which Situations Should Be Avoided?

The first step in modeling the distribution of a species is to decide on the purpose of
the work (for example, to determine the potential distribution or the limiting factors) and on the most
appropriate modeling parameters (distribution data, environmental variables, study area,
modeling method, spatial resolution). Once these are defined, we must gather chorological (i.e.
location) data of the species and create a database, for example from distribution atlases
(Sillero et al., 2009), museum collections (Brito et al., 2008) or systematic surveys (Hirzel
and Guisan, 2002, Martínez-Freiría et al., 2008, Sillero, 2009). Such surveys can be oriented
to establish only the presence of the species (Martínez-Freiría et al., 2008) or also to record
its absence (Anadón et al., 2006). The spatial distribution of absences determines the type
of ecological niche that will be predicted and, therefore, can significantly alter the outcome of
the models (Figure 1). It is also important that the database has no geographical errors (Sillero
et al., 2005). Even if the ENM is to be produced with a low (i.e., coarse) spatial resolution, it
is recommended that the records of the species are taken with a GPS. This will minimize
errors in the geographical coordinates and allow the data to be used for ENMs with different
spatial resolutions (Carretero et al., 2008, Kaliontzopoulou et al., 2009).
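The advantage of precise coordinates can be illustrated with a short Python sketch: the same records are assigned to grid cells of any size, whereas records stored only as coarse cells cannot be refined afterwards. The function name and the coordinates are hypothetical, and we assume projected coordinates in metres.

```python
from math import floor

def to_grid_cells(records, cell_size):
    """Map (x, y) coordinates to the set of occupied grid cells of a given size."""
    return {(floor(x / cell_size), floor(y / cell_size)) for x, y in records}

# Hypothetical GPS records in projected coordinates (metres)
records = [(1200.0, 800.0), (1350.0, 950.0), (9800.0, 400.0), (10400.0, 300.0)]

cells_1km = to_grid_cells(records, 1000)     # fine resolution
cells_10km = to_grid_cells(records, 10000)   # coarse resolution
print(len(cells_1km), "cells at 1 km;", len(cells_10km), "cells at 10 km")
```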
The chorological records should be independent of one another, i.e. should not be
spatially autocorrelated (Koenig, 1999, Dormann, 2007, Dormann et al., 2007). This
condition can only be fulfilled with systematic surveys, in which the surveying
effort is uniform. Thus, the distribution of the records should correspond as much as
possible to the actual distribution of the species, or at least to the one observed in the field.
For example, the distribution of a species can be strongly aggregated or fragmented. This
variation in the degree of clustering observed in the sample of distribution records must have
a correspondence with reality. An appropriate sampling design with a consistent surveying
effort is the best way to minimize the problem of autocorrelation in the data.
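One common way to quantify spatial autocorrelation in a variable measured at the survey localities (for instance surveying effort, or the residuals of a model) is Moran's I. The Python sketch below is a minimal version under stated assumptions: binary neighborhood weights defined by a distance cut-off that we choose arbitrarily, and invented data. Values near zero suggest little autocorrelation, positive values indicate clustering, and negative values indicate alternation.

```python
from math import hypot

def morans_i(points, values, max_dist):
    """Moran's I with binary weights: 1 for pairs no farther apart than max_dist."""
    n = len(values)
    mean_v = sum(values) / n
    dev = [v - mean_v for v in values]
    cross_products = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            if dist <= max_dist:
                cross_products += dev[i] * dev[j]
                weight_sum += 1.0
    return (n / weight_sum) * cross_products / sum(d * d for d in dev)

# Hypothetical transect of six localities, one distance unit apart
points = [(float(x), 0.0) for x in range(6)]
clustered = morans_i(points, [1, 1, 1, 0, 0, 0], max_dist=1.0)
alternating = morans_i(points, [1, 0, 1, 0, 1, 0], max_dist=1.0)
print(clustered, alternating)
```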
Sample size also influences the results of ENMs (Hirzel and Guisan, 2002, McPherson et
al., 2004, Pearson et al., 2007, Wisz et al., 2008). The accuracy of the ENM first increases
substantially with sample size, and then stabilizes, reaching an asymptote (Stockwell and
Peterson, 2002). Teixeira and Arntzen (2006) study how the number of records of
Chioglossa lusitanica has varied over time and how this affects ENMs: beyond a certain number of
records, the ENM does not improve significantly. There is a minimum sample size below
which it is not possible to calculate an ENM, which depends on the modeling method used.
The types of modeling algorithms that can be used depend on the chorological data
available. Not all methods are equivalent or useful in all situations. To build a mechanistic
ENM, knowledge of the biology and physiology of the species must be substantial
(Kearney and Porter, 2004, Kearney and Porter, 2009). For correlative ENMs, the
chorological data available will determine the type of modeling method that can be used. If
we have both presence and absence data, we can apply a variety of methods such as GLM or
GAM. If we have presence data within a wider background with environmental data, we can
apply profile methods such as Maxent (Phillips et al., 2004, Phillips et al., 2006), ENFA
(Hirzel et al., 2002) or GARP (Stockwell and Noble, 1992). When data are available only for
the presence area, we can apply presence-only methods such as Mahalanobis distance
(Etherington et al., 2009), BIOCLIM (Busby, 1991), HABITAT (Walker and Cocks, 1991) or
DOMAIN (Carpenter et al., 1993). In each case, we will get an approximation to a different
ecological niche (Soberón and Nakamura, 2009). Moreover, depending on the position of
absences in the BAM diagram (Figure 1), i.e., which factor they derive from, the ENM will
also be different (Lobo et al., 2010).
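The decision rule described in this paragraph can be summarized as a small lookup, sketched here in Python. The function and argument names are hypothetical; the method families are simply those listed in the text, not an exhaustive catalogue.

```python
def candidate_methods(has_absences, has_background):
    """Return the method families named in the text for a given kind of data."""
    if has_absences:
        # Presence and absence data
        return ["GLM", "GAM"]
    if has_background:
        # Presences within a wider background with environmental data
        return ["Maxent", "ENFA", "GARP"]
    # Environmental data available only for the presence area
    return ["Mahalanobis distance", "BIOCLIM", "HABITAT", "DOMAIN"]

print(candidate_methods(has_absences=False, has_background=True))
```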
Lobo et al. (2010) classify absences into three different types: contingent absences, which
correspond to environmentally suitable areas that are not occupied for historical or biotic
reasons; environmental absences, when the environment is in fact unsuitable for species
presence; and methodological absences, caused by survey deficiency. Contingent absences
are outside the realized niche but inside the fundamental one; environmental absences are
outside both niches; and methodological absences are included in both niches (Lobo et al.,
2010). The latter can be very important in amphibian and reptile species, as it is difficult to
ensure that a species is really absent from a place where it has not been detected. This is why
profile and presence-only modeling methods are largely used in herpetology. For comparison,
in the Mediterranean Basin in the last three years, thirteen modeling works were published in
which profile methods were used (Ficetola et al., 2007, Brito et al., 2008, Carretero et al.,
2008, Kaliontzopoulou et al., 2008, Martínez-Freiría et al., 2008, Ficetola et al., 2009, Santos
et al., 2009, Martínez-Freiría et al., 2009, Ribeiro et al., 2009, Rödder and Lötters, 2009,
Sillero, 2009, Sillero et al., 2009, Sillero, 2010), against four works with presence/absence
models (Arntzen and Espregueira Themudo, 2008, Real et al., 2008, Bombi et al., 2009, Real
et al., 2010).
There are differences in the way each method works. For example, ENFA cannot work
with categorical variables: they must be converted to quantitative values. GLM, GAM, GARP
and Maxent models can be projected to other geographical areas or sets of variables; ENFA,
on the other hand, does not allow the projection of the ENM. All methods mentioned here are
affected by correlations between variables, except for ENFA, which first transforms the
variables into a set of uncorrelated factors, similar to the axes of principal component analysis
(Hirzel et al., 2002). The ability of many methods to discriminate between presences and
absences can be assessed, for example, with the Area Under the ROC (receiver operating
characteristic) Curve (AUC), although Lobo et al. (2008) warn against equating higher
discrimination with higher accuracy when models differing in species prevalence are
compared. It is thus not advisable to compare the AUC of models for the same species in
different areas, or for different species in the same area (VanDerWal et al., 2009). ENFA does
not make it possible to calculate the AUC for comparison with other modeling methods and
algorithms. In some cases (GLM, GAM, ENFA) the result is unique, as there is not a random
process associated (e.g. Arntzen, 2006, Espregueira Themudo and Arntzen, 2007, Soares and
Brito, 2007, Arntzen and Espregueira Themudo, 2008, Santos et al., 2006, Ribeiro et al.,
2009, Santos et al., 2009; Sillero et al., 2009). Maxent, on the other hand, produces a different
ENM each time, requiring the calculation of multiple replicates (e.g. Martínez-Freiría et al.,
2008, Martínez-Freiría et al., 2009, Sillero, 2009, Sillero, 2010) and a final averaged ENM
(ensemble model; Araújo and New, 2007). In conclusion, each modeling method has its own
peculiarities, which must be taken into account before undertaking any modeling project.
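Two of the operations mentioned above can be sketched in a few lines of Python. The first function computes the AUC through its rank interpretation (the probability that a randomly chosen presence receives a higher score than a randomly chosen absence, with ties counted as one half); the second averages several replicate predictions cell by cell. Function names and scores are hypothetical, and the simple mean is only one possible way of combining replicates.

```python
def auc(presence_scores, absence_scores):
    """Share of presence/absence pairs ranked correctly (ties count one half)."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))

def average_replicates(replicates):
    """Cell-by-cell mean of several replicate prediction vectors."""
    return [sum(cell) / len(cell) for cell in zip(*replicates)]

# Hypothetical model scores at three presence and three absence localities
auc_value = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])

# Hypothetical predictions of two replicates for the same two cells
mean_map = average_replicates([[0.2, 0.8], [0.4, 0.6]])
print(auc_value, mean_map)
```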
The next step is the selection of variables that can theoretically influence and limit the
species' distribution. This selection is a subjective process and depends on the ecological
relevance of these variables for the modeled species and on their availability, particularly in a
spatially geo-referenced format (see Sillero and Tarroso, 2010 for free environmental data
sources on the Internet). However, ENMs are not necessarily spatial. Actually, in some
methods the last step is the spatialization of the ENM, which is just a mathematical formula
that relates several variables. When the variables are available in a spatial format that can be
incorporated in a Geographic Information System (GIS), the ENM can be represented in
space (e.g. Brito and Crespo, 2002). To comply with statistical theory, the variables should
not be correlated (Keitt et al., 2002, Diniz-Filho et al., 2003, Betts et al., 2006, Segurado et
al., 2006; see section 1.7), which is rarely possible in the real world. At the very least, we should
remove one variable from each pair of variables that are highly correlated (for example,
above 70-75%; e.g. Martínez-Freiría et al., 2008) and that point to the same causal factor.
However, we cannot tell if we are eliminating the most causal variable (if it is one of the two)
or the variable that correlates with it.
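A minimal Python sketch of this filtering step follows. It assumes that the 70-75% figure refers to an absolute Pearson correlation of about 0.7, which is our reading, and it uses invented variable names and values. Variables are examined in the order given, so the variable that survives from a correlated pair depends on that order, which illustrates the caveat made above.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def drop_correlated(variables, threshold=0.7):
    """Keep variables in order, skipping any with |r| above threshold to a kept one."""
    kept = []
    for name in variables:
        if all(abs(pearson(variables[name], variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical values of three predictors at five localities
variables = {
    "temperature": [10, 12, 14, 16, 18],
    "altitude": [900, 800, 650, 500, 300],
    "rainfall": [600, 300, 700, 400, 500],
}
kept = drop_correlated(variables)
print(kept)
```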
The use of a large number of variables in relation to the number of localities can increase
the risk of over-parameterization of the model, i.e. of including variables whose relationship
with the species is due to chance, and that can only describe the species' distribution in the
study area. It is also important to note that the factors determining the ecological niche in the
ENM are expressed differently depending on the scale: historical and abiotic factors are more
relevant on coarse scales (i.e., large areas), while biotic factors act mainly on local scales
(Peterson, 2006).
Once the variables are selected, we should ensure that the study area is adequate. This is
an important step, though it is rarely taken into account, and should be decided at the
beginning of the process along with the modeling aims. Ideally, the study area should include
the entire ecological range of the species: if the area is too small, part of the ranges of some
variables may be left out of the analysis and lead to incorrect results. However, Thuiller et al.
(2004) found no difference between ENMs when they reduced the size of the study area, but
only when they projected ENMs outside their geographic boundaries (see also Barbosa et al.,
2009). In addition, environmental data are often not available for the entire distribution area
of a species. Albert and Thuiller (2008) recommend not using large areas if the distribution of
the species is small. Stockwell and Peterson (2002) propose dividing the area, since ENMs
may react differently to different sample sizes, showing that the species' response to
environmental variables is not uniform throughout the study area. Other authors consider that
the definition of the study area should respond to biogeographical criteria (e.g. Sillero et al.,
2009). Perhaps the best method is to check that the response curves of the variables are not
truncated (Guisan and Thuiller, 2005). If a curve is truncated, the study area should be
increased until a complete (non-truncated) curve is obtained (and then the correlations between variables should
be measured again). In any case, we should avoid defining study areas by political criteria
such as national borders when they do not correspond to natural limits (e.g. Brito et al.,
1996, Teixeira et al., 2001, Teixeira and Arntzen, 2002, Arntzen, 2006, Arntzen and Teixeira,
2006), although data availability is often limited by political boundaries.
ENMs assume an equilibrium between the species and the environment, i.e., that the
species occupies all available favorable habitats and is absent from all unfavorable ones
(Araújo and Pearson, 2005, Wiens et al., 2009). However, it is very unlikely that this
condition is met. If a species is still expanding its distribution, chorological records do not
represent the breadth of its ecological niche, and the ENM may not identify all potentially
favorable areas. One solution is to increase the size of the study area; another is to divide the
data into groups that simulate the dispersion and calculate the respective ENMs over the same
area (Saddler, 2010): if the ENMs are similar, they can be considered robust. We can also
limit the chorological sample to proven breeding sites (Ficetola et al., 2008, Ficetola et al.,
2009), which correspond to populations and not to dispersing individuals (which are not in
equilibrium). In the case of introduced species, the best approach may be to model the
distribution in their original area and then project the ENM to the introduced area (Ficetola et
al., 2007, Pearman et al., 2008, Beaumont et al., 2009, Rödder and Lötters, 2009).
Once the modeling method and the study area are defined, the species' chorological data
are gathered, and the predictor (and uncorrelated) variables are selected, we can calculate the
ENM. Although each modeling method provides a different type of information, an ENM
should be composed primarily of three items: a map of the predictions (as long as it is a
spatial model), the contribution of each variable to the model, and a
measure of the model's accuracy. The map depicts the favorable habitats for the species, and is usually
composed of continuous values, bounded between zero (completely unfavorable) and one
(completely favorable). In profile models, this result is often called a habitat suitability map
(HSM; Hirzel et al., 2002). However, in some cases, it is of interest to convert the HSM into a
(potential) distribution map, with only values of "present" and "absent". To do this, we have
to choose a threshold within the continuous range of the HSM to distinguish unfavorable
habitats (below the threshold) from favorable ones (above the threshold). The choice of
threshold is up to the researcher and depends on the modeling aims (for reviews see Liu et al.,
2005, Jiménez-Valverde and Lobo, 2007).
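The conversion of a continuous suitability map into a binary map can be sketched as follows in Python. The grid values and the two thresholds are invented, and treating a value exactly equal to the threshold as favorable is our assumption. Running the same map through two thresholds shows how strongly the resulting distribution depends on this choice.

```python
def to_binary_map(suitability, threshold):
    """Return 1 where suitability reaches the threshold and 0 elsewhere."""
    return [[1 if value >= threshold else 0 for value in row] for row in suitability]

# Hypothetical habitat suitability values on a 2 x 3 grid
suitability = [
    [0.10, 0.40, 0.80],
    [0.30, 0.60, 0.90],
]
permissive = to_binary_map(suitability, 0.5)
strict = to_binary_map(suitability, 0.7)
print(permissive)
print(strict)
```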

1.7. When is Multicollinearity a Problem?

Multicollinearity arises when several predictor variables are linked by a linear
combination, so that the influence of each of them on the distribution of the species cannot be
distinguished, as they overlap each other. Strictly speaking, this occurs when the correlation
between the variables is 1, in which case the inclusion of one of these variables provides all
the predictive power of all of them. However, the explanatory power remains undetermined,
because we run the risk of attributing causal power to one variable when the one that actually
affects the species could be one excluded from the ENM.

When there is correlation between the variables but it is lower than 1, which is what
occurs in the real world, the exclusion of correlated variables produces 1) a decrease in the
predictive ability of the ENM which is greater when the correlation between the variables is
lower, and 2) a loss of explanatory power of the variable kept in the ENM that increases with
increasing correlation. Therefore, in correlative ENMs, the elimination of highly correlated
variables simplifies the ENM with little loss of predictive power. However, ENMs can be
seriously affected in their explanatory power if the hypothetically relevant variables are not
all included, even when they are, and precisely because they are, correlated.
For example, Cartron et al. (2000) point out that if some of the relationships between
variables are negative, the effect that is operating on the variables may be weakened by
another, stronger mechanism, due to the relationships between them. Thus, in a system with
three variables, if two of the correlations are positive and one is negative, the predicted
relationships may not all be seen in the relationships between each pair of variables, since
there are effects operating in different directions (Bárcena et al., 2004). If, for example, a
species is favored by high temperatures and rainfall, but these variables are negatively
correlated in the analyzed territory (i.e., precipitation is higher where the temperature is
lower), the effect of each variable is weakened by the effect of the other. Only the inclusion
of the two correlated variables in the same ENM will allow detecting the actual effect of each
variable.
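The temperature and rainfall example can be reproduced numerically. The Python sketch below uses ordinary least squares on an invented continuous response instead of a presence/absence model, which is a deliberate simplification, and all values and names are hypothetical. The response is built as temperature plus rainfall, so the true effect of each variable is 1, while the two predictors have a correlation of -0.8.

```python
def centered_products(x, y):
    """Sum of products of deviations from the means of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def simple_slope(x, y):
    """Slope of y regressed on x alone."""
    return centered_products(x, y) / centered_products(x, x)

def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes of y regressed on x1 and x2 together."""
    s11, s22 = centered_products(x1, x1), centered_products(x2, x2)
    s12 = centered_products(x1, x2)
    s1y, s2y = centered_products(x1, y), centered_products(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

# Hypothetical, negatively correlated predictors and a response favored by both
temperature = [1, 2, 3, 4, 5]
rainfall = [4, 5, 2, 3, 1]
response = [t + r for t, r in zip(temperature, rainfall)]

alone_t = simple_slope(temperature, response)
alone_r = simple_slope(rainfall, response)
joint_t, joint_r = two_predictor_slopes(temperature, rainfall, response)
print(alone_t, alone_r, joint_t, joint_r)
```

Taken one at a time, each variable shows a slope of only 0.2. When both are included, each recovers its actual slope of 1.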

1.8. What is it Really to Validate an Ecological Niche Model?

ENMs obtained inductively from the recorded distribution of a species follow certain
rules that guarantee their consistency with that distribution, and therefore do not require
validation with respect to their starting data. However, it is possible to validate whether an
ENM remains accurate when applied to similar chorological data sets in other geographical
areas or moments in time, or when it is used to determine other population parameters of the
species. An ENM should thus be validated with data different from those used to build it,
according to, and specifically for, the purpose for which it was built. For example, an ENM
made to be transferred to the future, such as those that predict species' potential distributions
under climate change scenarios (e.g. Araújo and Pearson, 2005, Araújo et al., 2006, Araújo et
al., 2008, Sillero, 2010) cannot be validated in the present.
We can, however, determine if the ENM is spatially transferable within the analyzed
area. To do this, we can divide the territory into sub-areas of recalibration and pseudo-
validation. The general ENM, obtained with data from the complete study area, can be
recalibrated in the first sub-area to then check if it works equally well in the second sub-area.
The two sub-areas can be defined either randomly or by applying the formula proposed by
Fielding and Bell (1997):

$[1 + (p - 1)^{1/2}]^{-1}$,

where p is the number of predictors in the ENM. The degree of agreement between the results
of recalibration and pseudo-validation can be estimated by comparing their respective values
of Kappa (Cohen, 1960), sensitivity, specificity, correct classification rate, or AUC (Fielding
and Bell, 1997, Manel et al., 2001). However, we must keep in mind that a discrepancy
between recalibration and pseudo-validation does not imply that the recalibrated ENM is
incorrect, and certainly not that the general ENM is incorrect. It simply indicates that the
factors that act predominantly in the area of recalibration differ from those that do in the area
of pseudo-validation. The ENM to be used should, in any case, be the one based on the entire
study area, as general models tend to work better than those based on subsets of data (e.g.
Barbosa et al., 2009).

1.9. Are We Really Modeling the Ecological Niche, or Can We Overcome the
Concept of Niche When Modeling?

Environmental processes are inherently complex and vary with spatial and temporal
scales. This complexity is often handled by defining laws or assumptions about the way
these processes work, usually expressed in the form of mathematical or logical relationships,
which are the core of ENMs. Mechanistic ENMs, like correlative ones, are based on the
previous acceptance of relationships between a species and the factors that supposedly
determine its distribution. A considerable degree of idealization is therefore necessary to
describe the distributions of species with mechanistic ENMs.
Among other things, a mechanistic ENM requires accepting a mechanism on which to
base it. This approach is based on Newtonian mechanics, called into question since the early
Twentieth Century by quantum mechanics, which is formulated on the basis of subatomic
phenomena. Biological phenomena are also subject to uncertainty, which is one of the pillars
of quantum mechanics. This uncertainty arises not only from the inability to identify all the
factors involved in the process, but also from the inability to determine the final outcome of
the process even when all relevant factors are controlled. The different degrees of coherence
between phenomena are mathematically translated to correlations. Correlative ENMs may
thus represent coherence between natural phenomena more adequately than mechanistic
Newtonian models. Furthermore, living organisms not only respond to environmental factors:
environmental factors are also strongly conditioned by the action of living organisms. The
classic formulation of logic and mathematics is not well equipped to handle such reciprocal relationships
in physical systems, and even less so in living systems and the relationships between them. A
more realistic interpretation of the complex ecological and biological systems can come from
the application of fuzzy logic.
Fuzzy logic is a form of multi-valued logic that allows for several values of truth, but also
takes into account that these values are imprecise. Fuzzy logic and its applications have their
origin in the theory of fuzzy sets proposed by Zadeh (1965), who established that a fuzzy set
is characterized by a membership function that assigns to each object in the set a degree of
membership ranging (with continuous values) from zero to one. The need for fuzzy sets arises
in situations where it is difficult to determine if an element belongs to a set or not. Classical
sets are special cases of fuzzy sets in which only two degrees of membership (0 and 1) are
allowed. Fuzzy logic has been used to predict species' distributions (Robertson et al., 2004),
to detect favorable areas for species (Real et al., 2005, Real et al., 2008, Real et al., 2009), to
analyze gaps in the protection of biodiversity using distribution models (Estrada et al., 2007,
2008, Real et al., 2006b), and to assess the impact of climate change on species' distributions
(Levinsky et al., 2007, Real et al., 2010).
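The basic operations on fuzzy sets can be sketched in Python using the minimum, maximum and complement operators usually attributed to Zadeh (1965). The membership values below are invented degrees of favorability of four localities for two hypothetical species; the function names are ours.

```python
def fuzzy_and(a, b):
    """Fuzzy intersection: degree to which a locality belongs to both sets."""
    return [min(x, y) for x, y in zip(a, b)]

def fuzzy_or(a, b):
    """Fuzzy union: degree to which a locality belongs to at least one set."""
    return [max(x, y) for x, y in zip(a, b)]

def fuzzy_not(a):
    """Fuzzy complement: degree to which a locality does not belong to the set."""
    return [1.0 - x for x in a]

# Hypothetical membership of four localities in the favorable areas of two species
favorable_a = [0.0, 0.25, 0.75, 1.0]
favorable_b = [0.5, 0.5, 0.25, 1.0]

both = fuzzy_and(favorable_a, favorable_b)
either = fuzzy_or(favorable_a, favorable_b)
unfavorable_a = fuzzy_not(favorable_a)
print(both, either, unfavorable_a)
```

When all memberships are restricted to 0 and 1, the same three functions reduce to the classical intersection, union and complement, in line with the remark that classical sets are special cases of fuzzy sets.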

The theory of complexity can also be used in the interpretation of ENMs. Historical and
ecological factors, together with the idiosyncratic response of each species to these factors,
determine the current configuration of species' distributions. These factors dynamically
delimit the biogeographic responses of biodiversity in time and space. In this type of complex
scenario, distributions are the geographical response of the species to the past and present
factors that act(ed) in a particular area. Interpreting an ENM as the degree of
membership of each locality in the fuzzy set of suitable areas for a species yields a
measure of how strongly the various factors have drawn the species towards each location.

2. PART TWO: ECOLOGICAL MODELING OF MEDITERRANEAN HERPETOFAUNA

The Mediterranean Basin is one of the world's biodiversity hotspots for conservation
priorities (Mittermeier et al., 2004). It presents broad topographical and environmental
variation, and has undergone a rich and eventful biogeographic history. Amphibians and
reptiles display particularly high specific and genetic diversity, with numerous endemics to
the Mediterranean region. This has provided an ideal setting for modeling studies.
Besides species' distributions and ecological niches, it is also possible to model spatial
patterns in many other biological or ecological variables, such as species richness,
morphological characters, or genetic diversity. In this second part of the chapter, we review
the published works on different types of ecological modeling of amphibians and reptiles in
the Mediterranean Basin.

2.1. Identification of the Ecological Niche

One of the most important applications of ENMs is the identification of the species'
ecological niche. Various studies have specifically aimed at identifying the potential
distribution of species and the variables that determine it within the Mediterranean Basin, for
example in Portugal (Brito et al., 1996, Sá-Sousa, 2000, Teixeira et al., 2001, Teixeira and
Ferrand, 2002, Arntzen, 2006), in Spain (Real et al., 2005, Anadón et al., 2006, Roman et al.,
2006, Carretero et al., 2010) and, recently, in Morocco (Beukema et al., 2010). Among these
works, Brito et al.'s (1996) is the first one on ecological niche modeling within the
Mediterranean Basin, and uses logistic regression to calculate the potential distribution of
Lacerta schreiberi in Portugal. Using similar methodologies, Sá-Sousa (2000) and Teixeira et
al. (2001) analyze, respectively, the biogeography of Podarcis hispanica and Chioglossa
lusitanica in Portugal. Sá-Sousa (2000) separates the two forms of P. hispanica (type 1 and
type 2), confirming that their distributions are parapatric and influenced by different factors.
Teixeira et al. (2001) calculate an ENM of C. lusitanica in Portugal and project it to Spain.
Like Sá-Sousa (2000), Real et al. (2005) use logistic regression to distinguish the niches of
two cryptic species with parapatric distributions: Discoglossus galganoi and D. jeanneae.
Anadón et al. (2006) and Roman et al. (2006) provide examples of modeling applications at
the local level; both studies use GLM, the former for Testudo graeca in the Spanish region of
Murcia, and the latter for Podarcis carbonelli in Doñana (Southern Spain), although without
developing the spatial component of the ENM. Carretero et al. (2010) use Maxent to model
the distribution of the endemic lizard Algyroides marchi at three scales in southern Iberia, and
identify suitable areas and environmental factors related to its presence. Beukema et al.
(2010) combine distributional and genetic data of Salamandra algira in Morocco to study
its phylogeny and biogeography. They model the distributions of viviparous and oviparous
populations of this species using Maxent, and compare both niches using ENMTools
(Beukema et al., 2010).
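A niche comparison of this kind is often summarized with an overlap index. The Python sketch below computes Schoener's D between two suitability maps, a statistic commonly used for this purpose; the text does not state which statistic was used in the study cited above, so this is only an illustration with invented values. D ranges from 0 (no overlap) to 1 (identical maps).

```python
def schoeners_d(suit_a, suit_b):
    """Schoener's D between two suitability vectors defined over the same cells."""
    total_a, total_b = sum(suit_a), sum(suit_b)
    # Each map is rescaled to sum to one before the cell-wise comparison
    return 1.0 - 0.5 * sum(abs(a / total_a - b / total_b)
                           for a, b in zip(suit_a, suit_b))

# Hypothetical suitability of three cells for two groups of populations
d_partial = schoeners_d([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
d_none = schoeners_d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
d_full = schoeners_d([0.2, 0.3, 0.5], [0.2, 0.3, 0.5])
print(d_partial, d_none, d_full)
```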

2.2. Identification of Common Distribution Patterns

Ecological modeling is also used to identify chorotypes, i.e., distinct distribution patterns,
often shared by several species and significantly different from other distribution patterns.
Chorotypes can be determined from the observed distributions of various taxa (e.g. Real et
al., 1992, 1997) or by previously modeling these distributions (e.g. Sillero et al., 2009). Real
et al. (1997) document a gradual longitudinal replacement of reptile species in the eastern part
of the Rif region (northern Morocco). They attribute this to the northward movement of the
Saharan boundaries, which have not yet reached biogeographical equilibrium. Thus, Saharan
reptiles enter the Rif from the east, through the lower basin of the River Moulouya. Seven
reptile chorotypes were identified in the western part of the Rif, and these comprise
Mediterranean species and others endemic to the Maghreb (the region that spans most of
North-western Africa, excluding the Sahara). These chorotypes are segregated from one
another according to altitude. Historical and ecological processes can account for the
distributions shared by these species, which have inhabited the Rif for longer than eastern
reptiles. Sillero et al. (2009) analyze the biogeography of Iberian herpetofauna and identify
seven chorotypes for amphibians and seven for reptiles from the classification of a
presence/absence matrix calculated from ENFA models and environmental variables obtained
from satellite images. These chorotypes separate the species of Atlantic and Mediterranean
affinity. Flores et al. (2004) and Real et al. (2008) use ecological modeling to characterize
the environment of chorotypes previously identified from observed distributions. Aragón et al.
(2010) compare the influence of climatic and non-climatic factors on the distribution of
Iberian species of endotherms and ectotherms. They use GAM and find that amphibians and
reptiles are more influenced by precipitation and temperature than birds and mammals
(Aragón et al., 2010). Rueda et al. (2010) generate analytically derived regionalizations for
multiple groups of European plants and animals and explore potential influences on the
regions for each taxonomic group. They use GLM for modeling the obtained coherent clusters
and identify a discernable biogeographic structure in the European biota, mainly influenced
by climate (Rueda et al., 2010).
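Classifying species into chorotypes starts from a measure of similarity between their distributions. The Python sketch below uses the Jaccard index on sets of occupied grid cells as a generic example; the studies cited above may rely on other similarity indices and on a subsequent clustering step not shown here. Species names and cell identifiers are invented.

```python
def jaccard(cells_a, cells_b):
    """Shared cells divided by cells occupied by at least one of the two species."""
    a, b = set(cells_a), set(cells_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical presence data: identifiers of the grid cells occupied by each species
ranges = {
    "species_a": {1, 2, 3, 4},
    "species_b": {2, 3, 4, 5},
    "species_c": {8, 9},
}
names = sorted(ranges)
similarity = {(s, t): jaccard(ranges[s], ranges[t]) for s in names for t in names}
print(similarity[("species_a", "species_b")], similarity[("species_a", "species_c")])
```

Species with high mutual similarity, such as the first two here, would be candidates for the same chorotype.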

2.3. Identification of Other Biogeographic Patterns

ENMs can also be used to study and identify other biogeographical patterns, such as
spatial variations in species' morphology and genetics. The vast majority of species show
geographic differences in morphological traits in response to changing selective pressures of
the environments in which they live (Stearns and Hoekstra, 2000, West-Eberhard, 2003), both
biotic (e.g. predation, competition) and abiotic (e.g. temperature, precipitation). The study of
geographic variation in phenotypic traits has been a recurring theme during the second half of
the Twentieth Century (see Thorpe, 1987) and, currently, has resurfaced under a new
approach owing to the use of GIS and ENMs. Within the Mediterranean area, two articles
address the geographical variation in the morphology of Iberian vipers using ENMs as a tool:
Brito et al. (2008) study the morphological variation of Vipera latastei throughout its
distribution range, and Martínez-Freiría et al. (2009) study the morphological variation and
convergence of V. aspis and V. latastei, both along their whole contact area in northeastern
Spain and, on a more local scale, at the contact zone of the upper Ebro River (northern Spain).
Both works employ the same methodology: a combination of geo-statistics and GIS to obtain
uni- and multivariate patterns of geographic variation in morphology. They then use ENMs to
analyze correlations between these patterns and environmental variables. Tomović et al.
(2010) study the morphological variation of V. ammodytes in its European distribution area,
using a similar approach to that of Brito et al. (2008) and Martínez-Freiría et al. (2009), and
identify three morphologically different groups and their climatic requirements using Maxent.
Luiselli (2006) uses logistic regression to examine which environmental factors have led two
species of whip snakes that are not phylogenetically close, Hierophis viridiflavus in Italy
(Europe) and Psammophis phillipsii in Nigeria (Africa), to converge morphologically and
ecologically. Ficetola et al. (2010b) identify the relationship between bioclimatic variables
and body size predicted a priori by several alternative hypotheses for the newt T. carnifex in
Italy. They explore the correlations among these features and use an information theoretic
approach (Akaike's information criterion) to select the best model (Ficetola et al., 2010b). In
a similar way, Romano et al. (2010) analyze the relationships between body size and climate
for two sister species of salamander (Salamandrina perspicillata and S. terdigitata) endemic
to Italy, using GLM and an information theoretic approach.
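
The information-theoretic selection step mentioned above can be illustrated with a minimal sketch. It assumes that each candidate model has already been fitted and that its maximized log-likelihood is known. The model names, log-likelihood values, and parameter counts are hypothetical and are not taken from Ficetola et al. (2010b) or Romano et al. (2010).

```python
import math

def aic(log_likelihood, n_params):
    """Akaike's information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def akaike_weights(aic_values):
    """Relative support for each candidate model; the weights sum to 1."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical candidates: (hypothesis, maximized log-likelihood, number of parameters)
candidates = [
    ("temperature only", -120.4, 2),
    ("precipitation only", -118.9, 2),
    ("temperature + precipitation", -115.2, 3),
]

scores = {name: aic(ll, k) for name, ll, k in candidates}
weights = dict(zip(scores, akaike_weights(list(scores.values()))))
best_model = min(scores, key=scores.get)  # lowest AIC is preferred
```

The model with the lowest AIC is retained, and the Akaike weights express how strongly the data support it relative to the alternatives.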
As with morphology, the study of genetic variability and structuring of species may lead
to the identification of geographical patterns (e.g. Alexandrino et al., 2004). However, the
importance of the geographical component in typical studies of phylogeography has been
recognized only recently (see Manel et al., 2003, Kidd and Ritchie, 2006). This new approach
is known as landscape genetics. Broadly speaking, these studies combine GIS and ENMs to
identify geographic patterns in the variation of genetic markers, to delimit genetically similar
groups, to identify routes of interconnectivity between populations, and to study their
relationships with eco-geographical variables (e.g. Spear et al., 2005, Cushman et al., 2006).
However, to our knowledge, no work has yet been published on Mediterranean herptile
species using the methodologies employed in this field.
Alexandrino et al. (2004) combine the study of both morphological and genetic
variations, comparing ENMs obtained for C. lusitanica (Teixeira et al., 2001) with maps of
genetic and phenetic variation. However, this comparison is made in a descriptive way,
without investigating in depth the environmental factors that can be related to both
geographic variations.

2.4. Prediction of Potential Species Richness

Species richness can also be estimated using ENMs, either through direct modeling
(Nogués-Bravo and Martínez-Rica, 2004, Araújo et al., 2008) or through the addition of
individual species' ENMs (Soares and Brito, 2007, Estrada et al., 2007, Estrada et al., 2008,
Sillero et al., 2009). In any case, the arithmetic difference between observed and estimated
species richness shows the areas where there may be a deficit in knowledge, that is, where
there are probably species yet to record (Sillero et al., 2009). Thus, ecological modeling is a
useful tool to manage and plan chorological atlas surveys (Loureiro and Sillero, 2010). ENMs
may also allow identifying environmental factors that influence species richness (Soares and
Brito, 2007, Araújo et al., 2008). However, it is necessary to model separately the
distributions of species with different biogeographic affinities (see Nogués-Bravo and
Martínez-Rica, 2004, Ribeiro et al., 2009).
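
The second route described above (adding individual species' ENMs and comparing the result with observed richness) can be sketched as follows. This is a minimal illustration under stated assumptions: the ENM outputs are taken to be already thresholded to presence/absence, the grid cells and values are hypothetical, and the sign convention (estimated minus observed) is our choice rather than one stated in the cited works.

```python
def stack_richness(binary_maps):
    """Sum per-species presence/absence predictions cell by cell."""
    return [sum(cells) for cells in zip(*binary_maps)]

def richness_deficit(estimated, observed):
    """Estimated minus observed richness; positive values flag cells
    where species are predicted but have not yet been recorded."""
    return [e - o for e, o in zip(estimated, observed)]

# Hypothetical thresholded ENM outputs (1 = suitable) for three species
# over five grid cells, plus the richness actually recorded in an atlas.
enm_predictions = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 1, 1],
]
observed_richness = [2, 1, 0, 2, 1]

estimated_richness = stack_richness(enm_predictions)
deficit = richness_deficit(estimated_richness, observed_richness)

# Cells with a positive deficit are candidates for additional survey effort.
priority_cells = [i for i, d in enumerate(deficit) if d > 0]
```

Stacking thresholded maps is only one way to add models; summing continuous suitability values is an alternative that avoids choosing a threshold.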

2.5. Expansion of Native and Invasive Species

ENMs can be used to identify areas not yet occupied by expanding species (e.g. Hyla
meridionalis in the Iberian Peninsula; Sillero, 2009, Sillero, 2010) or by invasive species (e.g.
Rana catesbeiana and Trachemys scripta in Italy; Ficetola et al., 2007, Ficetola et al., 2009;
Ficetola et al., 2010a).
Sillero (2009) and Sillero (2010) develop ENMs at local scale (Salamanca province,
Spain) and at continental scale (Europe and North Africa) in which they note that H.
meridionalis occupies almost all suitable habitats available. They conclude that this species
cannot expand its distribution much more and that it is valid to assume that it is in equilibrium
with the environment.
Ficetola et al. (2007) and Ficetola et al. (2009) use Maxent to identify suitable areas for
the expansion of R. catesbeiana and the reproductive populations of T. scripta in Italy,
respectively. Also with Maxent, Ficetola et al. (2010a) model the historical and current
distributions of R. catesbeiana in Italy to infer how changes in the landscape are involved in
the expansion of this invasive species. Moreover, they use five scenarios of landscape
variation (derived from the ALARM project) to predict the future expansion of this species
(Ficetola et al., 2010a).
ENMs have also been used to model chorotypes of several invasive species in the Iberian
Peninsula, including the red-eared slider Trachemys scripta, determining the most influential
environmental and human factors (Real et al., 2008). Chorotype modeling can aid
conservation or management measures for the entire set of species considered; it can also help
determine the areas that are most prone to invasions (Real et al., 2008).

2.6. Integration of Molecular Data in Models

The "taxonomic inflation" (Isaac et al., 2004, Harris and Froufe, 2005) that has affected
both amphibians and reptiles in recent years has led to the subdivision of species into several
new species whose presence records cannot be distinguished a posteriori. Moreover, many of
these species are cryptic (not distinguishable by morphological characters), making the
correct identification of these records difficult even after their definition as new species. The
clear-cut distinction of cryptic species often requires genetic or molecular analyses that can be
very expensive and time-consuming. In cases like these, ENMs may help to distinguish the
distributions of such species from the characteristics that define their areas of occurrence.
Starting from the genetic identification of a sample of individuals, we can build ENMs that
allow inferring to which species or subspecific variety other individuals will belong, based on
their location (e.g. lineages of C. lusitanica in Portugal: Arntzen and Alexandrino, 2004; D.
galganoi and D. jeanneae in Spain: Real et al., 2005). Spatial modeling of genetic data can
also serve to compare the niches of distinct populations of a species, as is done by Beukema
et al. (2010) with viviparous and oviparous forms of Salamandra algira in Morocco (see
section 2.1).

2.7. Identification of Contact Zones

Contact zones, mainly those that occur between phylogenetically close species, are very
important in the study of evolutionary processes (Hewitt, 1988). They usually occur at the
limits of species' distributions, in areas of environmental transition (ecotones), where
environmental factors and biotic interactions play a major role in the local distribution and
population dynamics of the species in contact (e.g. Martínez-Freiría et al., 2008, Martínez-
Freiría et al., 2010). ENMs allow identifying environmental factors that affect these species,
the responses of each species to the variations in these factors, and the areas where species
can potentially coexist (areas of potential sympatry).
In the study of contact zones, Brito and Crespo (2002) use logistic regression to calculate
for the first time a single ENM for two species simultaneously, with the aim of identifying
environmental requirements and areas of sympatry between Vipera latastei and V. seoanei in
northern Portugal. Since the distributions of both species are parapatric, they calculate two
separate ENMs where the absence data for each species correspond to the presences of the
other species. Espregueira Themudo and Arntzen (2007) use the same methodology to
determine the factors that influence the local distribution of Triturus marmoratus and T.
pygmaeus in the area of Caldas da Rainha (Portugal). However, they only calculate one ENM
(for T. marmoratus). Arntzen and Alexandrino (2004) use logistic regression to analyze the
distribution of C. lusitanica in northern Portugal and obtain several models that identify the
environmental factors more closely related to the distribution of each of the two genetically
distinct groups. Although they do not represent the likely areas of contact, they do identify the
environmental variables to which the two forms have a similar response, and identify a zone
of contact and eco-morphological transition for the two groups (north of the Mondego River,
Portugal). Martínez-Freiría et al. (2008) analyze the distribution of the three Iberian species of
vipers at their contact zone in the upper Ebro River (northern Spain). They use presence data
and Maxent, with which they obtain the responses of these species to environmental factors
and the areas of potential occurrence and sympatry. By representing and comparing the
responses of these species to certain environmental factors common to them, they identify
those factors for which these species show a similar pattern, and that thus allow their
coexistence; and the factors for which the species show different responses, which thus lead
to habitat segregation.
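
The reciprocal-absence design described above for parapatric species can be sketched in a few lines. This is a deliberately simplified illustration, not the procedure of any cited study: a single hypothetical predictor (altitude), invented record values, a one-variable logistic regression fitted by plain gradient ascent, and an arbitrary 0.3 cut-off for potential sympatry are all assumptions.

```python
import math

def standardize(values):
    """Rescale values to zero mean and unit standard deviation."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values], m, sd

def fit_logistic(x, y, lr=0.1, n_iter=5000):
    """Fit p = 1 / (1 + exp(-(b0 + b1 * x))) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p
            g1 += (yi - p) * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def predict(coefs, x):
    """Predicted probability of presence at standardized predictor value x."""
    b0, b1 = coefs
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Hypothetical altitudes (m) of presence records for two parapatric species.
records_a = [200, 300, 400, 700]
records_b = [600, 900, 1000, 1100]

altitude, mean_alt, sd_alt = standardize(records_a + records_b)
# Reciprocal coding: the presences of one species act as absences of the other.
y_a = [1] * len(records_a) + [0] * len(records_b)
y_b = [1 - v for v in y_a]

model_a = fit_logistic(altitude, y_a)
model_b = fit_logistic(altitude, y_b)

# Potential sympatry: cells where both species keep a non-negligible probability.
grid = [250, 450, 650, 850, 1050]
sympatry_zone = [
    alt for alt in grid
    if predict(model_a, (alt - mean_alt) / sd_alt) >= 0.3
    and predict(model_b, (alt - mean_alt) / sd_alt) >= 0.3
]
```

With only two species and reciprocal coding, the two fitted models are mirror images of each other, so the band of potential sympatry is simply where neither species clearly dominates. Presence-only methods such as Maxent do not share this property.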

2.8. Assessment of Species’ Conservation Status

ENMs can also be used to infer the conservation status of species, to identify factors of
threat, and to propose management measures. The earliest such work with Mediterranean
herpetofauna is the one carried out by Brito et al. (1999) on the identification of priority areas
for conservation, delimitation of areas of high extinction risk, assessment of the degree of
protection, and definition of a conservation strategy for L. schreiberi in Portugal. The authors
combine the results of previous works (ENM: Brito et al., 1996; habitat selection:
Brito et al., 1998) with other variables such as the protected area network in Portugal,
detected density, and allozyme electrophoresis data for this species. Similarly, Teixeira and
Ferrand (2002) identify important areas for the conservation of the genetic diversity of C.
lusitanica by combining 1) ENMs for the current conditions, made with logistic regression
(Teixeira et al., 1996, Teixeira et al., 2001), discriminant analysis, classification trees, and
overlay analysis; 2) ENMs for the years 2050 and 2080, made with logistic regression and
discriminant analysis (Teixeira and Arntzen, 2002); and 3) analysis of the geographic
variation in molecular data (Alexandrino et al., 2004). Santos et al. (2006) and Santos et al.
(2009) use ENFA to identify biotic and abiotic factors involved in the distributions of V.
latastei and C. austriaca, respectively, in the Iberian Peninsula, and evaluate the conservation
status of both species. Santos et al. (2006) find that V. latastei, though it should be
widespread in the Mediterranean part of the Iberian Peninsula due to its high environmental
adaptability, is relegated to mountainous regions by human activities. Santos et al. (2009)
combine ENMs of C. austriaca at regional scale in Iberia with local analyses of the isolated
populations of southern Spain (performed through intensive sampling), to infer why this
species has small isolated populations in the southern Iberian Peninsula.
Analyses of species richness are very important to determine the areas that should be
protected to preserve the largest possible number of taxa (Soares and Brito, 2007). For
example, Rey Benayas et al. (2006) evaluate the potential impact of future infrastructures on
Spanish herpetofaunal diversity. Estrada et al. (2007) and Estrada et al. (2008) use ENMs in
Andalusia (southern Spain) to assess the degree of agreement between the protected area
network and the important areas for amphibians and reptiles, according to their species
richness, rarity, endemism, and vulnerability. From another point of view, but similar to the
previous one, Ribeiro et al. (2009) use ENMs to evaluate the impact of human activities on
reptile diversity in Catalonia (north-eastern Spain). The authors correlate the differences
between observed and potential species richness (ENMs calculated from 25 species) with
different types of land use; agricultural areas appear to be the least favorable to reptiles,
having the largest differences between observed and potential richness (Ribeiro et al., 2009).

2.9. Prediction of Future Conservation Problems

The current availability of climatic data for various possible scenarios of future climate
change (e.g. WorldClim: www.worldclim.org/futdown.htm; World Climate Research
Programme: http://ccr.aos.wisc.edu/model/ipcc10min/index.html) has increased the number of
studies that predict the response of organisms to this process. Due to the strong dependence of
amphibians and reptiles on environmental conditions, these studies can be a valuable tool to
identify the vulnerability of these species to climate change and to develop effective
conservation measures.
In the Mediterranean Basin, the first published ENM predicting the future range of a
species is the one by Teixeira and Arntzen (2002), which focuses on C. lusitanica in Spain
and Portugal. The modeling methods are logistic regression and discriminant analysis, and the
climatic variables include contemporary conditions and the predictions of the Intergovernmental
Panel on Climate Change from the year 2001, which predicted increases in July temperature
of 2 and 3 °C for the years 2050 and 2080, respectively. Whilst climate change does not
consist simply of an increase in temperature, in 2002 there was not enough information on
how precipitation and other variables would be altered. Araújo et al. (2006) use ENMs,
through a combination of GLM, GAM, classification trees and artificial neural networks, to
determine the potential effects of climate change (WorldClim data for the years 2020 and
2050 at a resolution of 50x50 km) on the distributions of European amphibians and reptiles.
Similarly, Carvalho et al. (2010) model the distributions of 37 Iberian endemics or quasi-endemics
(15 amphibians and 22 reptiles) under current climatic conditions, and project them
to the future climatic conditions predicted for the years 2020, 2050, and 2080. This study uses
a finer scale (10x10 km) and combines various modeling methods, such as Maxent, GLM,
GAM, classification trees, artificial neural networks, Generalized Boosting Models (GBM),
random forests, mixed discriminant analysis, and MARS. Real et al. (2010) and Márquez et
al. (2011) evaluate the predicted effect of different greenhouse gas emission scenarios on the
distributions of Alytes dickhilleni and Vipera latastei in continental Spain using two global
circulation models. Ficetola et al. (2010a) forecast the future expansion of invasive R.
catesbeiana in Italy under five landscape variation scenarios (see section 2.5).
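
Several of the studies above combine predictions from different modeling methods. The text does not state how the cited authors merged their models, so the following is only a minimal sketch of two common consensus rules (mean suitability and committee voting). The suitability values, the four grid cells, and the 0.5 threshold are hypothetical.

```python
from statistics import mean

def ensemble_mean(predictions):
    """Average predicted suitability across models, cell by cell."""
    return [mean(cell) for cell in zip(*predictions.values())]

def committee_vote(predictions, threshold=0.5):
    """Fraction of models predicting presence (suitability >= threshold) per cell."""
    n_models = len(predictions)
    return [
        sum(p >= threshold for p in cell) / n_models
        for cell in zip(*predictions.values())
    ]

# Hypothetical projected suitability for one species over four grid cells.
predictions = {
    "GLM": [0.82, 0.40, 0.10, 0.65],
    "GAM": [0.78, 0.55, 0.05, 0.60],
    "classification tree": [0.90, 0.30, 0.20, 0.45],
    "neural network": [0.70, 0.60, 0.15, 0.70],
}

consensus = ensemble_mean(predictions)
agreement = committee_vote(predictions)
```

Cells where the models disagree (agreement near 0.5) are those where the projection is most sensitive to the choice of method.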

2.10. Concluding Remarks: The Future of Ecological Niche Models in the Mediterranean Basin

Ecological modeling is an important tool in studies of biogeography and ecology, and
there is no doubt of its usefulness for amphibian and reptile species. Ecological niche
modeling in the Mediterranean Basin has nearly 20 years of development. There are many
research lines open: species richness modeling (Sillero et al., 2009, Ribeiro et al., 2009),
conservation status assessment (Santos et al., 2006, Estrada et al., 2008; Santos et al., 2009),
responses to climate change (Carvalho et al., 2010; Sillero, 2010; Real et al., 2010), hybrid
and contact zones (Martínez-Freiría et al., 2008, Martínez-Freiría et al., 2010), and expansion of
native species (Sillero, 2009, Sillero, 2010). However, for other areas there are still no
published studies regarding Mediterranean herpetofauna: landscape genetics, adequacy of
protected areas, new modeling methods, local-scale modeling (resolution <1x1 km),
expansion of invasive species, and modeling of past climate scenarios. This shows there is still a
long way to go in ecological modeling of the Mediterranean herpetofauna.
In addition, ecological niche modeling of Mediterranean amphibians and reptiles is an
optimal setting for works addressing very diverse issues in ecology, biogeography and
conservation. This is due to: 1) the high diversity of species, subspecies and genetic lineages
of amphibians and reptiles, many of them endemic to the Mediterranean Basin, which is one
of the world's biodiversity hotspots (Mittermeier et al., 2004); 2) the broad environmental
gradients that exist in this region, which facilitate the use and characterization of the species'
niches; 3) the particular biogeographic history of this Basin; and 4) the human occupation of
this area for thousands of years, which has changed quite a few patterns of distribution. In
particular, it may be especially important to develop studies that jointly address the
distribution of species with genetic tools, studies on morphological variation, and niche
modeling analysis. With these three tools we can address numerous aspects related to
speciation, niche differentiation, or the expansion of distribution areas, which go beyond the
purely herpetological interest.

ACKNOWLEDGMENTS
A version of this review focused on the Iberian Peninsula was previously published in
Spanish in the Boletín de la Asociación Herpetológica Española (Bulletin of the Spanish
Herpetological Association). We thank its editors Xavier Santos and Alexander Richter-Boix
for permission to reproduce it here in English. A.M.B., N.S. and F.M.-F. are supported by
post-doctoral fellowships (SFRH/BPD/26666/2006, SFRH/BPD/40387/2007 and
SFRH/BPD/69857/2010) from Fundação para a Ciência e a Tecnologia (Portugal), co-financed
by the European Social Fund. The 'Rui Nabeiro' Biodiversity Chair is financed by
Delta Cafés.

REFERENCES
Albert, CH, Thuiller, W. 2008. Favourability functions versus probability of presence:
advantages and misuses. Ecography, 31: 417-422.
Alexandrino, J., Teixeira, J., Arntzen, JW, Ferrand, N. 2004. Historical biogeography and
conservation of the golden-striped salamander (Chioglossa lusitanica) in northwestern
Iberia: integrating ecological, phenotypic and phylogeographic data. 189-205. In: Weiss,
S., Ferrand, N. (eds.), Phylogeography of Southern European Refugia. Netherlands.
Springer.
Anadon, JD, Gimenez, A., Martinez, M., Martinez, J., Perez, I., Esteve, MA 2006. Factors
determining the distribution of the spur-thighed tortoise Testudo graeca in south-east
Spain: a hierarchical approach. Ecography, 29: 339-346.
Anadón, JD, Giménez, A., Ballestar, R. 2010. Linking local ecological knowledge and
habitat modelling to predict absolute species abundance at large scales. Biodiversity and
Conservation, 19: 1443-1454.
Anderson, RP, Peterson, AT, Gomez Laverde, M. 2002. Using niche-based GIS modeling to
test geographic predictions of competitive exclusion and competitive release in South
American pocket mice. Oikos, 98: 3-16.
Aragón, P, Lobo, JM, Olalla-Tárraga, MA, Rodríguez, MA 2010. The contribution of
contemporary climate to ectothermic and endothermic vertebrate distributions in a glacial
refuge. Global Ecology and Biogeography, 19: 40–49.
Araújo, MB, Guisan, A. 2006. Five (or so) challenges for species distribution modelling.
Journal of Biogeography, 33: 1677-1688.
Araújo, MB, New, M. 2007. Ensemble forecasting of species distributions. Trends in
Ecology & Evolution, 22: 42-47.
Araújo, MB, Nogués-Bravo, D., Diniz-Filho, JAF, Haywood, AM, Valdes, PJ, Rahbek, C.
2008. Quaternary climate changes explain diversity among reptiles and amphibians.
Ecography, 31: 8-15.
Araújo, MB, Pearson, RG 2005. Equilibrium of species' distributions with climate.
Ecography, 28: 693-695.
Araújo, MB, Thuiller, W., Pearson, RG 2006. Climate warming and the decline of
amphibians and reptiles in Europe. Journal of Biogeography, 33: 1712-1728.
Araújo, MB, Whittaker, RJ, Ladle, RJ, Erhard, M. 2005. Reducing uncertainty in projections
of extinction risk from climate change. Global Ecology and Biogeography, 14: 529-538.
Araújo, MB, Williams, PH 2000. Selecting areas for species persistence using occurrence
data. Biological Conservation, 96: 331-345.
Arntzen, JW 2006. From descriptive to predictive distribution models: a working example
with Mediterranean amphibians and reptiles. Frontiers in Zoology, 3.
Arntzen, JW, Alexandrino, J. 2004. Ecological modelling of genetically differentiated forms
of the Mediterranean endemic golden-striped salamander, Chioglossa lusitanica.
Herpetological Journal, 14: 137-141.
Arntzen, JW, Espregueira Themudo, G. 2008. Environmental parameters that determine
species geographical range limits as a matter of time and space. Journal of Biogeography,
35: 1177-1186.
Arntzen, JW, Teixeira, J. 2006. History and new developments in the mapping and modelling
of the distribution of the golden-striped salamander, Chioglossa lusitanica. Zeitschrift für
Feldherpetologie, Supplement: 1-14.
Aspinall, RJ, Matthews, K. 1994. Climate change impact on distribution and abundance of
wildlife: An analytical approach using GIS. Environmental Pollution, 86: 217-223.
Austin, MP 2002. Spatial prediction of species distribution: an interface between ecological
theory and statistical modelling. Ecological Modelling, 157: 101-118.
Austin, MP 2007. Species distribution models and ecological theory: a critical assessment and
some possible new approaches. Ecological Modelling, 200: 1-19.
Barbosa, AM, Real, R., Vargas, JM 2009. Transferability of environmental favourability
models in geographic space: The case of the Mediterranean desman (Galemys
pyrenaicus) in Portugal and Spain. Ecological Modelling, 220: 747-754.
Barbosa, AM, Real, R., Vargas, JM 2010. Use of coarse-resolution models of species'
distributions to guide local conservation inferences. Conservation Biology, early view.
doi: 10.1111/j.1523-1739.2010.01517.x.
Bárcena, S., Real, R., Olivero, J., Vargas, JM 2004. Latitudinal trends in breeding
waterbird species richness in Europe and their environmental causes. Biodiversity and
Conservation, 13: 1997-2014.
Beaumont, LJ, Gallagher, RV, Thuiller, W., Downey, PO, Leishman, MR, Hughes, L. 2009.
Different climatic envelopes among invasive populations may lead to underestimations of
current and future biological invasions. Diversity and Distributions, 15: 409-420.
Beukema, W, de Pous, P, Donaire, D, Escoriza, D, Bogaerts, S, Toxopeus, AG, de Bie,
CAJM, Roca, J, Carranza, S. 2010. Biogeography and contemporary climatic
differentiation among Moroccan Salamandra algira. Biological Journal of the Linnean
Society, 101: 626–641.
Betts, MG, Diamond, AW, Forbes, GJ, Villard, MA, Gunn, JS 2006. The importance of
spatial autocorrelation, extent and resolution in predicting forest bird occurrence.
Ecological Modelling, 191: 197-224.
Bombi, P., Salvi, D., Vignoli, L., Bologna, MA 2009. Modelling Bedriaga's rock lizard
distribution in Sardinia: An ensemble approach. Amphibia-Reptilia, 30: 413-424.
Brito, JC, Brito-e-Abreu, F., Paulo, OS, Rosa, HD, Crespo, EG 1996. Distribution of
Schreiber's green lizard (Lacerta schreiberi) in Portugal: a predictive model.
Herpetological Journal, 6: 43-47.
Brito, JC, Crespo, EG 2002. Distributional analysis of two vipers (Vipera latastei and V.
seoanei) in a potential area of sympatry in the Northwestern Mediterranean Basin. 129-138.
In: Schuett, GW, Hoggren, M., Douglas, ME, Greene, HW (eds.), Biology of the
Vipers. Eagle Mountain, Utah. Eagle Mountain Publishing, LC.
Brito, JC, Crespo, EG, Paulo, OS 1999. Modelling wildlife distributions: Logistic Multiple
Regression vs Overlap Analysis. Ecography, 22: 251-260.
Brito, JC, Godinho, R., Luís, C., Paulo, OS, Crespo, EG 1999. Management strategies for
conservation of the lizard Lacerta schreiberi in Portugal. Biological Conservation, 89:
311-319.
Brito, JC, Santos, X., Pleguezuelos, JM, Sillero, N. 2008. Inferring Evolutionary Scenarios
with Geostatistics and Geographical Information Systems (GIS) for the viperid snakes
Vipera latastei and V. monticola. Biological Journal of the Linnean Society, 95: 790-806.
Bulluck, L., Fleishman, E., Betrus, C., Blair, R. 2006. Spatial and temporal variations in
species occurrence rate affect the accuracy of occurrence models. Global Ecology and
Biogeography, 15: 27-38.
Busby, JR 1991. BIOCLIM - A Bioclimatic Analysis and Prediction System. 64-68. In:
Margules, CR, Austin MP (eds.), Nature Conservation: Cost Effective Biological Surveys
and Data Analysis. CSIRO. Canberra.
Carpenter, G., Gillison, AN, Winter, J. 1993. DOMAIN: a flexible modelling procedure for
mapping potential distributions of plants and animals. Biodiversity and Conservation, 2:
667-680.
Carretero, MA, Sillero, N., Ayllón, E., Kaliontzopoulou, A., Lima, A., Hernández-Sastre, PL,
Godinho, R., Harris, DJ 2008. Multidisciplinary approaches for conserving Southern
isolates of Atlantic lizards in the Mediterranean Basin. Herpetologia Sardiniae: 251-255.
Carretero, MA, Ceacero, F, García-Muñoz, E, Sillero, N, Olmedo, MI, Hernández-Sastre, PL,
Rubio, JL 2010. Seguimiento de Algyroides marchi. Informe final. Monografías SARE.
Asociación Herpetológica Española – Ministerio de Medio Ambiente y Medio Rural y
Marino. Madrid.
Cartron, JL. E., Kelly, JF, Brown, JH 2000. Constraints on patterns of covariation: a case
study in strigid owls. Oikos, 90: 381-389.
Cohen, JA 1960. A coefficient of agreement for nominal scales. Educational and
Psychological Measurement, 20: 37-46.
Colwell, RK, Rangel, TF 2009. Hutchinson's duality: The once and future niche. Proceedings
of the National Academy of Sciences, 106: 19651-19658.
Chefaoui, RM, Lobo, JM 2008. Assessing the effects of pseudo-absences on predictive
distribution model performance. Ecological Modelling, 210: 478-486.
Diniz-Filho, JAF, Bini, LM, Hawkins, BA 2003. Spatial autocorrelation and red herrings in
geographical ecology. Global Ecology and Biogeography, 12: 53-64.
Dormann, CF 2007. Effects of incorporating spatial autocorrelation into the analysis of
species distribution data. Global Ecology and Biogeography, 16: 129-138.
Dormann, CF, McPherson, JM, Araújo, MB, Bivand, R., Bolliger, J., Carl, G., Davies, R.,
Hirzel, A., Jetz, W., Kissling, W., Kuhn, I., Ohlemuller, R., Peres-Neto, P., Reineking,
B., Schroder, B., Schurr, FM, Wilson, R. 2007. Methods to account for spatial
autocorrelation in the analysis of species distributional data: a review. Ecography, 30:
609-628.
Elith, J., Graham, C., Anderson, R., Dudik, M., Ferrier, S., Guisan, A., Hijmans, R.,
Huettmann, F., Leathwick, J., Lehmann, A., Li, J., Lohmann, L., Loiselle, B., Manion,
G., Moritz, C., Nakamura, M., Nakazawa, Y., Overton, J., Peterson, AT, Phillips, SJ,
Richardson, K., Scachetti-Pereira, R., Schapire, R., Soberón, J., Williams, S., Wisz, M.,
Zimmermann, NE 2006. Novel methods improve prediction of species' distributions from
occurrence data. Ecography, 29: 129-151.
Elith, J., Graham, CH 2009. Do they? How do they? Why do they differ? On finding reasons
for differing performances of species distribution models. Ecography, 32: 66-77.
Elton, C. 1927. Animal Ecology. London. Sidgwick and Jackson.
Engler, R., Randin, CF, Vittoz, P., Czáka, T., Beniston, M., Zimmermann, NE, Guisan, A.
2009. Predicting future distributions of mountain plants under climate change: does
dispersal capacity matter? Ecography, 32: 34-45.
Espregueira Themudo, G., Arntzen, JW 2007. Newts under siege: range expansion of
Triturus pygmaeus isolates populations of its sister species. Diversity and Distributions,
13: 580-586.
Estrada, A., Márquez, AL, Real, R., Vargas, JM 2007. Utilidad de los espacios naturales
protegidos de Andalucía para preservar la riqueza de especies de anfibios. Munibe, 25:
74-81.
Estrada, A., Márquez, AL, Real, R., Vargas, JM 2008. ¿En qué medida preservan los espacios
naturales la riqueza de reptiles en Andalucía? 367-373. In: Redondo, MM, Palacios, MT,
López, FJ, Santamaría, T., Sánchez, D. (eds.), Avances en Biogeografía. Departamento
Análisis Geográfico Regional y Geografía Física. Universidad Complutense de Madrid.
Madrid.
Etherington, TR, Ward, AI, Smith, GC, Pietravalle, S., Wilson, GJ 2009. Using the
Mahalanobis distance statistic with unplanned presence-only survey data for
biogeographical models of species distribution and abundance: a case study of badger
setts. Journal of Biogeography, 36: 845-853.
Ficetola, GF, Thuiller, W., Miaud, C. 2007. Prediction and validation of the potential global
distribution of a problematic alien invasive species: the American bullfrog. Diversity and
Distributions, 13: 476-485.
Ficetola, GF, Thuiller, W., Padoa-Schioppa, E. 2008. From introduction to the establishment
of alien species: a preliminary analysis of bioclimatic differences between presence and
reproduction localities in Trachemys scripta. Herpetologia Sardiniae: 266-269.
Ficetola, GF, Thuiller, W., Padoa-Schioppa, E. 2009. From introduction to the establishment
of alien species: bioclimatic differences between presence and reproduction localities in
the slider turtle. Diversity and Distributions, 15: 108-116.
Ficetola, GF, Maiorano, L, Falcucci, A, Dendoncker, N, Boitani, L, Padoa-Schioppa, E,
Miaud, C, Thuiller, W 2010a. Knowing the past to predict the future: land-use change
and the distribution of invasive bullfrogs. Global Change Biology, 16: 528–537.
Ficetola, GF, Scali, S, Denoël, M, Montinaro, G, Vukov, TD, Zuffi, MAL, Padoa-Schioppa,
E. 2010b. Ecogeographical variation of body size in the newt Triturus carnifex:
comparing the hypotheses using an information-theoretic approach. Global Ecology and
Biogeography, 19: 485–495.
Fielding, AH, Bell, JF 1997. A review of methods for the assessment of prediction errors in
conservation presence/absence models. Environmental Conservation, 24: 38-49.
198 A. Márcia Barbosa, Neftalí Sillero, Fernando Martínez-Freiría et al.

Flores, T., Puerto, MA, Barbosa, AM, Real, R., Gosalvez, RU 2004. Agrupación en corotipos
de los anfibios de la provincia de Ciudad Real (España). Revista Española de
Herpetologia, 18: 41-53.
Godsoe, W. 2010. I can't define the niche but I know it when I see it: a formal link between
statistical theory and the ecological niche. Oikos, 1: 53-60.
Grinnell, J. 1917. The niche-relationships of the California Thrasher. Auk, 34: 427-433.
Guisan, A., Lehmann, A., Ferrier, S., Austin, M., Overton, J.Mc.C., Aspinall, R., Hastie, T.
2006. Making better biogeographical predictions of species' distributions. Journal of
Applied Ecology, 43: 386-392.
Guisan, A., Thuiller, W. 2005. Predicting species distribution: offering more than simple
habitat models. Ecology Letters, 8: 993-1009.
Guisan, A., Zimmermann, NE 2000. Predictive habitat distribution models in ecology.
Ecological Modelling, 135: 147-186.
Harris, DJ, Froufe, E. 2005. Taxonomic inflation: species concept or historical geopolitical
bias? Trends in Ecology & Evolution, 20: 6-7.
Hastie, TJ, Tibshirani, RJ 1990. Generalized Additive Models. Chapman & Hall. London. New
York.
Hernandez, PA, Graham, CH, Master, LL, Albert, DL 2006. The effect of sample size and
species characteristics on performance of different species distribution modeling
methods. Ecography, 29: 773-785.
Hewitt, GM 1996. Some genetic consequences of ice ages, and their role in divergence and
speciation. Biological Journal of the Linnean Society, 58: 247-276.
Hirzel, AH, Guisan, A. 2002. Which is the optimal sampling strategy for habitat suitability
modelling? Ecological Modelling, 157: 331-341.
Hirzel, AH, Hausser, J., Chessel, D., Perrin, N. 2002. Ecological-niche factor analysis: how to
compute habitat suitability maps without absence-data? Ecology, 83: 2027-2036.
Hirzel, AH, Helfer, V., Metral, F. 2001. Assessing habitat-suitability models with a virtual
species. Ecological Modelling, 145: 111–121.
Hirzel, AH, Le Lay, G. 2008. Habitat suitability modelling and niche theory. Journal of
Applied Ecology, 45: 1372-1381.
Holt, RD 2003. On the evolutionary ecology of species' ranges. Evolutionary Ecology
Research, 5: 159–178.
Hosmer, DWJ, Lemeshow, S. 1989. Applied Logistic Regression. John Wiley & Sons. New
York.
Hutchinson, GE 1957. Concluding remarks. Cold Spring Harbor Symposia on Quantitative
Biology, 22: 415-427.
Isaac, NJB, Mallet, J., Mace, GM 2004. Taxonomic inflation: its influence on macroecology
and conservation. Trends in Ecology & Evolution, 19: 464-469.
Jackson, ST, Overpeck, JT 2000. Responses of Plant Populations and Communities to
Environmental Changes of the Late Quaternary. Paleobiology 26: 194-220.
Jiménez-Valverde, A., Lobo, JM, 2007. Threshold criteria for conversion of probability of
species presence to either–or presence–absence. Acta Oecologica, 31: 361-369.
Jiménez-Valverde, A., Lobo, JM, Hortal, J. 2008. Not as good as they seem: the importance
of concepts in species distribution modelling. Diversity and Distributions, 14: 885-890.
Kaliontzopoulou, A., Brito, JC, Carretero, MA, Larbes, S., Harris, DJ 2008. Modelling the
partially unknown distribution of wall lizards (Podarcis) in North Africa: ecological
affinities, potential areas of occurrence, and methodological constraints. Canadian
Journal of Zoology, 86: 992-1001.
Kearney, M. 2006. Habitat, environment and niche: what are we modelling? Oikos, 115: 186-
191.
Kearney, M., Porter, WP 2004. Mapping the fundamental niche: physiology, climate, and the
distribution of a nocturnal lizard. Ecology, 85: 3119-3131.
Kearney, M., Porter, WP 2009. Mechanistic niche modelling: combining physiological and
spatial data to predict species' ranges. Ecology Letters, 12: 334-350.
Keitt, TH, Bjornstad, ON, Dixon, PM, Citron-Pousty, S. 2002. Accounting for spatial pattern
when modeling organism-environment interactions. Ecography, 25: 616-625.
Kidd, DM, Ritchie, MG 2006. Phylogeographic information systems: putting the geography
into phylogeography. Journal of Biogeography, 33: 1851-1865.
Koenig, WD 1999. Spatial autocorrelation of ecological phenomena. Trends in Ecology &
Evolution, 14: 22-26.
Lachenbruch, PA 1975. Discriminant Analysis. Hafner. New York.
Liu, C., Berry, PM, Dawson, TP, Pearson, RG 2005. Selecting thresholds of occurrence in the
prediction of species distributions. Ecography, 28: 385-393.
Lobo, JM 2008. More complex distribution models or more representative data? Biodiversity
Informatics, 5
Lobo, JM, Jiménez-Valverde, A., Hortal, J. 2010. The uncertain nature of absences and their
importance in species distribution modelling. Ecography, 33: 103-114.
Loureiro, A., Sillero, N. 2010. Metodologia. 66-74. In: Loureiro, A., Ferrand, N., Carretero,
MA, Paulo, O. (eds.), Atlas dos anfíbios e répteis de Portugal. Lisboa. Esfera do Caos.
Luiselli, L. 2006. Ecological modelling of convergence patterns between European and
African 'whip' snakes. Acta Oecologica, 30: 62-68.
MacNally, R. 2000. Regression and model-building in conservation biology, biogeography
and ecology: The distinction between-and reconciliation of -predictive and explanatory
models. Biodiversity and Conservation, 9: 655-671.
Manel, S., Schwartz, MK, Luikart, G., Taberlet, P. 2003. Landscape genetics: combining
landscape ecology and population genetics. Trends in Ecology & Evolution, 18: 189-197.
Manel, S., Williams, HC, Ormerod, SJ 2001. Evaluating presence/absence models in ecology:
the need to account for prevalence. Journal of Applied Ecology, 38: 921-931.
Márquez, AL, Real, R, Olivero, J. and Estrada, A. 2011. Combining climate with other
influential factors for modelling climate change impact on species distribution. Climatic
Change, online first (DOI 10.1007/s10584-010-0010-8).
Martinez-Freiria, F., Sillero, N., Lizana, M., Brito, JC 2008. GIS-based niche models identify
environmental correlates sustaining a contact zone between three species of European
vipers. Diversity and Distributions, 14: 452-461.
Martínez-Freiría, F., Santos, X., Pleguezuelos, JM, Lizana, M., Brito, JC 2009. Geographical
patterns of morphological variation and environmental correlates in contact zones: a
multi-scale approach using two Mediterranean vipers (Serpentes). Journal of Zoological
Systematics and Evolutionary Research, 47: 357-367.
Martínez-Meyer, E., Peterson, AT 2006. Conservatism of ecological niche characteristics in
North American plant species over the Pleistocene-to-Recent transition. Journal of
Biogeography, 33: 1779-1789.
Martínez-Meyer, E., Peterson, AT, Hargrove, WW 2004. Ecological niches as stable
distributional constraints on mammal species, with implications for Pleistocene
extinctions and climate change projections for biodiversity. Global Ecology and
Biogeography, 13: 305-314.
McPherson, JM, Jetz, W., Rogers, DJ 2004. The effects of species' range sizes on the
accuracy of distribution models: ecological phenomenon or statistical artefact? Journal of
Applied Ecology, 41: 811-823.
Mittermeier, R.A., P. Robles Gil, M. Hoffmann, J. Pilgrim, T. Brooks, C. Goettsch
Mittermeier, J. Lamoreux, and G.A.B. da Fonseca. 2004. Hotspots revisited: earth's
biologically richest and most endangered terrestrial ecoregions. Cemex, Monterrey, and
University of Chicago Press, Chicago.
Moisen, GG, Frescino, TS 2002. Comparing five modelling techniques for predicting forest
characteristics. Ecological Modelling, 157: 209-225.
Morin, X., Lechowicz, MJ 2008. Contemporary perspectives on the niche that can improve
models of species range shifts under climate change. Biology Letters, 4: 573-576.
Mouillot, D., Gaston, K. 2009. Spatial overlap enhances geographic range size conservatism.
Ecography, 32: 671-675.
Nogués-Bravo, D., Martínez-Rica, JP 2004. Factors controlling the spatial species richness
pattern of four groups of terrestrial vertebrates in an area between two different
biogeographic regions in northern Spain. Journal of Biogeography, 31: 629-640.
Pearman, PB, Guisan, A., Broennimann, O., Randin, CF 2008. Niche dynamics in space and
time. Trends in Ecology & Evolution, 23: 149-158.
Pearson, RG 2007. Species' Distribution Modeling for Conservation Educators and
Practitioners. Synthesis. American Museum of Natural History : http://ncep.amnh.org.
Pearson, RG, Dawson, TP 2003. Predicting the impacts of climate change on the distribution
of species: are bioclimate envelope models useful? Global Ecology and Biogeography,
12: 361-371.
Pearson, RG, Raxworthy, CJ, Nakamura, M., Peterson, AT 2007. Predicting species
distributions from small numbers of occurrence records: a test case using cryptic geckos
in Madagascar. Journal of Biogeography, 34: 102-117.
Pereira, JMC, Itami, RM 1991. GIS-based habitat modelling using logistic multiple
regression: a study of the Mt. Graham Red Squirrel. Photogrammetric Engineering &
Remote Sensing, 57: 1475-1486.
Perrin, N., 1984. Contribution à l'écologie du genre Cepaea (Gastropoda): Approche
descriptive et expérimentale de l'habitat et de la niche écologique. Tesis Doctoral.
Universidad de Lausana.
Peterson, AT 2003. Predicting the Geography of Species' Invasions via Ecological Niche
Modeling. The Quarterly Review of Biology, 78: 419-433.
Peterson, AT 2006. Uses and Requirements of Ecological Niche Models and Related
Distributional Models. Biodiversity Informatics, 3: 59-72.
Peterson, AT, Cohoon, KP 1999. Sensitivity of distributional prediction algorithms to
geographic data completeness. Ecological Modelling, 117: 159-164.
Phillips, SJ, Anderson, RP, Schapire, RE 2006. Maximum entropy modeling of species
geographic distributions. Ecological Modelling, 190: 231-259.
Phillips, SJ, Dudík, M., Elith, J., Graham, CH, Lehmann, A., Leathwick, J., Ferrier, S. 2009.
Sample selection bias and presence-only distribution models: implications for
background and pseudo-absence data. Ecological Applications, 19: 181-197.
Phillips, SJ, Dudík, M., Schapire, RE 2004. A maximum entropy approach to species
distribution modeling. Proceedings of the Twenty-First International Conference on
Machine Learning, 655-662.
Pleguezuelos, JM, Márquez, R., Lizana, M. 2002. Atlas de distribución y Libro Rojo de los
Anfibios y Reptiles de España. Dirección General de Conservación de la Naturaleza-
Asociación Herpetológica Española, 2ª impresión. Madrid.
Prinzing, A., Durka, W., Klotz, S., Brandl, R. 2002. Geographic variability of ecological
niches of plant species: are competition and stress relevant? Ecography, 25: 721-729.
Pulliam, HR 1988. Sources, Sinks, and Population Regulation. The American Naturalist, 132:
652-661.
Pulliam, HR 2000. On the relationship between niche and distribution. Ecology Letters, 3:
349-361.
Real, R. 1992. Las tendencias geográficas de la riqueza específica. 85-94. In: Vargas, JM,
Real, R., Antúnez, A. (eds.), Objetivos y método biogeográficos. Aplicaciones en
Herpetología. Monografías Herpetológicas. Asociación Herpetológica Española.
Valencia.
Real, R., Barbosa, AM, Martínez-Solano, I., Garcia-Paris, M. 2005. Distinguishing the
distributions of two cryptic frogs (Anura: Discoglossidae) using molecular data and
environmental modeling. Canadian Journal of Zoology, 83: 536-545.
Real, R., Barbosa, AM, Rodríguez, A., García, FJ, Vargas, JM, Palomo, LJ, Delibes, M.
2009. Conservation biogeography of ecologically interacting species: the case of the
Mediterranean lynx and the European rabbit. Diversity and Distributions, 15: 390-400.
Real, R., Barbosa, AM, Vargas, JM 2006. Obtaining environmental favourability functions
from logistic regression. Environmental and Ecological Statistics, 13: 237-245.
Real, R., Estrada, A., Barbosa, AM, Vargas, JM, 2006b. Aplicación de la lógica difusa al
concepto de rareza para su uso en Gap Analysis: el caso de los mamíferos terrestres en
Andalucía. Serie Geográfica 13, 99-116. http://www.geogra.uah.es/inicio/revista/index-
13.php
Real, R, Pleguezuelos, JM and Fahd, S. 1997. The distribution patterns of reptiles in the Riff
region, Northern Morocco. African Journal of Ecology, 35: 312-325
Real, R., Marquez, AL, Estrada, A., Munoz, AR, Vargas, JM 2008. Modelling chorotypes of
invasive vertebrates in mainland Spain. Diversity and Distributions, 14: 364-373.
Real, R., Márquez, AL, Olivero, J., Estrada, A., 2010. Species distribution models in climate
change scenarios are still not useful for informing policy planning: an uncertainty
assessment using fuzzy logic. Ecography 33, 304-314.
Real, R., Vargas, JM, Guerrero, JC 1992. Análisis biogeográfico de clasificación de áreas y
de especies. 73-84. In: Vargas, JM, Real, R., Antúnez, A. (eds.), Objetivos y método
biogeográficos. Aplicaciones en Herpetología. Monografías Herpetológicas. Asociación
Herpetológica Española. Valencia.
Reese, GC, Wilson, KR, Hoeting, JA, Flather, CH 2005. Factors affecting species distribution
predictions: a simulation modeling experiment. Ecological Applications, 15: 554-564.
Rey Benayas, JM, Montana, E., Belliure, J., Eekhout, XR 2006. Identifying areas of high
herpetofauna diversity that are threatened by planned infrastructure projects in Spain.
Journal of Environmental Management, 79: 3.
Ribeiro, R., Santos, X., Sillero, N., Carretero, MA, Llorente, GA 2009. Biodiversity and Land
Uses: Is Agriculture the Biggest Threat for reptiles' assemblages? Acta Oecologica, 35:
327-334.
Robertson, MP, Villet, MH, Palmer, AR, 2004. A fuzzy classification technique for
predicting species' distributions: applications using invasive alien plants and indigenous
insects. Diversity and Distributions 10, 461-474.
Román, R., Ruiz, G., Delibes, M., Revilla, E. 2006. Factores ambientales condicionantes de la
presencia de la lagartija de Carbonell Podarcis carbonelli (Pérez–Mellado, 1981) en la
comarca de Doñana. Animal Biodiversity and Conservation, 29: 73-82.
Romano, A, Ficetola, GF. 2010. Ecogeographic variation of body size in the spectacled
salamanders (Salamandrina): influence of genetic structure and local factors. Journal of
Biogeography, 37: 2358-2370.
Romero, J., Real, R. 1996. Macroenvironmental factors as ultimate determinants of the
distribution of common toad and natterjack toad in the south of Spain. Ecography, 19:
305-312.
Rueda, M, Rodríguez, MA, Hawkins, BA. 2010. Towards a biogeographic regionalization of
the European biota. Journal of Biogeography, 37: 2067-2076.
Rushton, SP, Ormerod, SJ, Kerby, G. 2004. New paradigms for modelling species
distributions? Journal of Applied Ecology, 41: 193-200.
Rödder, D., Lötters, S. 2009. Niche shift versus niche conservatism? Climatic characteristics
of the native and invasive ranges of the Mediterranean house gecko (Hemidactylus
turcicus). Global Ecology and Biogeography, 18: 674-687.
Sa-Sousa, P. 2000. A predictive distribution model for the Mediterranean wall lizard
(Podarcis hispanicus) in Portugal. Herpetological Journal, 10: 1-11.
Santos, X., Brito, JC, Caro, J., Abril, AJ, Lorenzo, M., Sillero, N., Pleguezuelos, JM 2009.
Habitat suitability, threats and conservation of isolated populations of the smooth snake
(Coronella austriaca) in the southern Mediterranean Basin. Biological Conservation,
142: 344-352.
Santos, X., Brito, JC, Sillero, N., Pleguezuelos, JM, Llorente, GA, Fahd, S., Parellada, X.
2006. Inferring habitat-suitability areas with ecological modelling techniques and GIS: A
contribution to assess the conservation status of Vipera latastei. Biological Conservation,
130: 416-425.
Segurado, P., Araújo, MB, Kunin, WE 2006. Consequences of spatial autocorrelation for
niche-based models. Journal of Applied Ecology, 43: 433-444.
Shugart, HH 1990. Using ecosystem models to assess potential consequences of global
climatic change. Trends in Ecology and Evolution, 5: 303-307.
Sillero, N. 2009. Potential distribution of the new populations of Hyla meridionalis in
Salamanca (Spain). Acta Herpetologica, 4: 83-98.
Sillero, N. 2010. Modelling new suitable areas for Hyla meridionalis in a current and future
expansion scenario. Amphibia-Reptilia, 31: 37-50.
Sillero, N., Brito, JC, Toxopeus, B., Skidmore, AK 2009. Biogeographical patterns derived
from remote sensing variables: the amphibians and reptiles of the Mediterranean Basin.
Amphibia-Reptilia, 30: 185-206.
Sillero, N., Celaya, L., Martín-Alfageme, S. 2005. Using GIS to Make An Atlas: A Proposal
to Collect, Store, Map and Analyse Chorological Data for Herpetofauna. Revista
Española de Herpetologia, 19: 87-101.
Sillero, N., Tarroso, P. 2010. Free GIS for herpetologists: free data sources on Internet and
comparison analysis of proprietary and free/open source software. Acta Herpetologica, 5:
63-85.
Soares, C., Brito, JC 2007. Environmental correlates for species richness and biogeographic
relationships among amphibians and reptiles in a climate transition area. Biodiversity and
Conservation, 16: 1087-1102.
Soberón, J. 2007. Grinnellian and Eltonian niches and geographic distributions of species.
Ecology Letters, 10: 1115-1123.
Soberón, J. 2010. Niche and area of distribution modeling: a population ecology perspective.
Ecography, 33: 159-167.
Soberón, J., Nakamura, M. 2009. Niches and distributional areas: Concepts, methods, and
assumptions. Proceedings of the National Academy of Sciences, 106: 19644-19650.
Soberón, J., Peterson, AT 2005. Interpretation of Models of Fundamental Ecological Niches
and Species Distributional Areas. Biodiversity Informatics, 2.
Stearns, SC, Hoekstra, R. 2000. Evolution: an introduction. Oxford University Press. Oxford.
Stockwell, DRB, Noble, IR 1992. Induction of sets of rules from animal distribution data: A
robust and informative method of data analysis. Mathematics and Computers in
Simulation, 33: 385-390.
Stockwell, DRB, Peterson, AT 2002. Effects of sample size on accuracy of species
distribution models. Ecological Modelling, 148: 1-13.
Stoms, DM, Davis, FW, Cogan, CB 1992. Sensitivity of wildlife models to uncertainties in
GIS data. Photogrammetric Engineering & Remote Sensing, 58: 843-850.
Sykes, MT, Prentice, IC, Cramer, W. 1996. A bioclimatic model for the potential distribution
of north European tree species under present and future climates. Journal of
Biogeography, 23: 203-233.
Teixeira, J., Arntzen, JW 2002. Potential impact of climate warming on the distribution of the
Golden-striped salamander, Chioglossa lusitanica, on the Mediterranean Basin.
Biodiversity and Conservation, 11: 2167-2176.
Teixeira, J., Ferrand N. 2002. The application of distribution models and Geographical
Information Systems for the study of biogeography and conservation of herpetofauna:
Chioglossa lusitanica as a case study. Revista Española de Herpetologia, Vol. Especial:
119-130.
Teixeira, J., Ferrand, N., Arntzen, JW 2001. Biogeography of the golden-striped salamander
Chioglossa lusitanica: a field survey and spatial modelling approach. Ecography, 24:
614-624.
Thorpe, RS 1987. Geographic variation: a synthesis of cause, data, pattern and congruence in
relation to subspecies, multivariate analysis and phylogenesis. Bollettino di Zoologia, 54:
3-11.
Thuiller, W., Brotons, L., Araújo, MB, Lavorel, S. 2004. Effects of restricting environmental
range of data to project current and future species distributions. Ecography, 27: 165-172.
Tomović, L, Crnobrnja-Isailović, J, Brito, JC. 2010. Geostatistics and Geographical
Information Systems uncover the evolutionary history of the nose-horned viper (Vipera
ammodytes) on the Balkans. Biological Journal of the Linnean Society, 101: 651-666.
VanDerWal, J., Shoo, LP, Graham, C., Williams, SE 2009. Selecting pseudo-absence data for
presence-only distribution modeling: How far should you stray from what you know?
Ecological Modelling, 220: 589-594.
Vieites, DR, Nieto-Román, S., Wake, DB 2009. Reconstruction of the climate envelopes of
salamanders and their evolution through time. Proceedings of the National Academy of
Sciences, 106: 19715-19722.
Walker, PA, Cocks, KD 1991. HABITAT: a procedure for modelling a disjoint environmental
envelope for a plant or animal species. Global Ecology and Biogeography Letters, 1:
108-118.
West-Eberhard M.-J. 2003. Developmental plasticity and evolution. Oxford University Press,
Oxford.
Wiens, JA, Stralberg, D., Jongsomjit, D., Howell, CA, Snyder, MA 2009. Niches, models,
and climate change: Assessing the assumptions and uncertainties. Proceedings of the
National Academy of Sciences, 106: 19729-19736.
Wiens, JJ, Graham, CH 2005. Niche conservatism: Integrating Evolution, Ecology, and
Conservation Biology. Annual Review of Ecology, Evolution, and Systematics, 36: 519-
539.
Wisz, MS, Hijmans, RJ, Li, J., Peterson, AT, Graham, CH, Guisan, A., NCEAS Predicting
Species Distributions Working Group. 2008. Effects of sample size on the performance
of species distribution models. Diversity and Distributions, 14: 763-773.
Woodward, FI, Cramer, W. 1996. Plant functional types and climatic changes: Introduction.
Journal of Vegetation Science, 7: 306-308.
Zadeh, LA 1965. Fuzzy sets. Information and Control, 8: 338–353.
Zuur, AF, Ieno, EN, Walker, N., Saveliev, AA, Smith, GM 2009. Mixed Effects Models and
Extensions in Ecology with R. Springer, New York.
In: Ecological Modeling ISBN: 978-1-61324-567-5
Editor: WenJun Zhang, pp. 205-222 © 2012 Nova Science Publishers, Inc.

Chapter 9

SOME ASPECTS OF PHYTOPLANKTON AND
ECOSYSTEM MODELLING IN FRESHWATER AND
MARINE ENVIRONMENTS: CONSIDERATION OF
INDIRECT INTERACTIONS, AND THE IMPLICATIONS
FOR INTERPRETING PAST AND FUTURE OVERALL
ECOSYSTEM FUNCTIONING

V. Krivtsov1,2* and C.F. Jago1

1 School of Ocean Sciences, Bangor University, Menai Bridge, Anglesey, UK.
2 Department of Ecology, Kharkov State University,
4 Svobody (Dzerzhinskiy) Square, Kharkov, USSR (Ukraine).
School of Ocean Sciences, University of Wales, Bangor,
Marine Science Laboratories, Menai Bridge, Bangor, Gwynedd LL59 5AB UK.

ABSTRACT
Numerical techniques (e.g. correlation, multiple regression and factor analysis, path
analysis, methods of network analysis, and, in particular, simulation modelling) may be
very helpful in investigations of indirect relationships in aquatic ecosystems. Here we
give a brief overview of some relevant studies, and focus on 1) a case
study of a freshwater eutrophic lake, where statistical analysis of the datasets obtained
within a comprehensive monitoring programme, together with sensitivity analysis of the
mathematical model 'Rostherne', helped to reveal previously overlooked
relationships between the Si and P biogeochemical cycles, coupled through the dynamics of
primary producers, and 2) an overview of how the coupling of physical, chemical,
and biological processes in marine ecosystem models offers a basis for investigations
of indirect interactions in continental shelf seas. Complex aquatic ecosystem models
provide a numerical simulation of biogeochemical fluxes underpinned by coupling
physical forcing functions with definitions simulating biological and chemical processes,
and offer potential for quantitative interpretation of sediment proxies in the stratigraphic
record. A combination of models and sediment proxies, calibrated by training sets, can
provide information on water column structure, surface heating, mixing, and water depth,
thus providing a basis for reconstructing the past and predicting future
environmental dynamics.

* Corresponding author. Present address: SBE, Institute for Infrastructure and Environment, Heriot-Watt University,
Edinburgh EH14 4AS, UK.
E-mail: e96kri69@netscape.net.

Keywords: phytoplankton, sediments, algae, nutrients, light penetration, freshwater lakes,
continental shelf seas, SPM, indirect effects, ecosystem modeling, CTEA.

INTRODUCTION
Natural ecosystems are complex, and are characterised by a multitude of interconnected
relationships between ecosystem components (Krivtsov, 2008). The understanding of these
complex interactions is paramount for sustainable management of environmental resources.
However, ecological research tends to concentrate on investigations of direct
relationships, whilst indirect interactions (especially the less obvious ones, e.g. delayed
interactions) are often overlooked or understudied. (NB: in this paper, all relations not restricted to
the effects of a direct transaction of matter and energy between adjacent ecosystem
components are treated as indirect.) Mathematical techniques (e.g. correlation, multiple
regression and factor analysis, simulation modelling, path analysis and methods of network
analysis) may be very helpful in investigations of indirect relationships in ecosystems.
The origin of algorithms capable of studying indirect interactions within an ecological
context can be traced back to the 18th century (Krivtsov, 2004). Here we refer to some
examples of relevant aquatic studies and, in particular, give a brief account of how
mathematical techniques have been helpful in investigating indirect effects in a freshwater
eutrophic lake (see Krivtsov et al., 1998, 1999a,b,c, 2001, and references therein), where
indirect relationships appeared to occur on (and across) various levels of organisation. We
also give an overview of how the coupling of physical, chemical, and biological processes in
models of continental shelf seas offers potential for a breakthrough in the investigation
of indirect effects. Finally, we argue that, owing to the importance of aquatic ecosystems, the
existence of comprehensive models linking physical, chemical, and biological processes, and
the experience accumulated in studies of indirect effects in the aquatic environment, complex
aquatic models are likely to continue to play a leading role in the investigation of indirect
effects, in particular within the framework of comparative theoretical ecosystem analysis
(sensu Krivtsov, 2002, 2004).

EXAMPLES OF MODELLING STUDIES OF INDIRECT EFFECTS IN
AQUATIC ENVIRONMENT
Despite claims (Wardle, 2002) that aquatic scientists have only recently recognised
and started to study indirect effects, awareness of indirect interactions in the aquatic environment
has a considerably long history (e.g. Mortimer, 1941, 1942; Hutchinson, 1957;
Reynolds, 1984). In particular, an earlier review even suggested that most studies
specifically addressing behaviour-mediated indirect effects tend to be conducted in freshwater
ecosystems, while many of the early demonstrations of density-mediated indirect effects were
made in community studies in marine habitats (see Abrams et al., 1996 and references
therein). Likewise, much of the knowledge related to indirect ecological interactions has been
contributed through the development and application of the methods of simulation modelling
(e.g. Jorgensen, 1980; Jorgensen, 1994) and network analysis (e.g. Patten, 1985; Patten et al.,
1976, and references therein) in relation to the aquatic environment. Consequently, simulation
models capable of demonstrating indirect interactions in aquatic biogeocenoses (e.g. the Lake
2 model of J. Solomonsen; see Jorgensen, 1994) are widely used for teaching in
educational establishments across the world. Studies of indirect effects in the aquatic
environment have involved, for example, the application of various statistical techniques, methods of network
analysis, and simulation modelling using 'what-if' scenarios and sensitivity analysis.
Modifications of the model CASM were used by Bartell et al. (1999) and
Naito et al. (2002) to study direct and indirect effects in the aquatic ecosystems
of Canada (Quebec) and Japan (Lake Suwa), respectively. Numerical sensitivity analysis was
applied in both cases (the sensitivity of a state variable to changes in a parameter was measured
as the percent change from the reference situation). For the Canadian case study it was found
that variability in the production of the macrophyte population determines an indirect risk
component of toxic Hg effects on phyto- and zooplankton, periphyton and fish. In the
Japanese case study it was found that the annual production of piscivorous fish was
considerably influenced by the optimal consumption temperature of certain benthic insects.
Another interesting finding was that the physiological parameters of the diatom Melosira
were important sources of variability in the production of the cyanobacterium Microcystis.
Although the authors did not give a detailed interpretation of the latter relationship, their
results suggest that the underlying mechanism might be a common inverse relationship
between spring diatom and summer cyanobacterial blooms (see references related to the
Rostherne Mere case study below).
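The percent-change measure of sensitivity mentioned in the preceding paragraph can be sketched in a few lines. This is only a minimal illustration and not the CASM implementation: the toy equilibrium-biomass function, its parameter names (r, m, K), the reference values and the 10% perturbation are all assumptions introduced here for the example.

```python
def equilibrium_biomass(r, m, K):
    """Toy model (assumed for illustration): equilibrium biomass of a
    logistic population with growth rate r, loss rate m and capacity K."""
    return K * (1.0 - m / r)


def percent_sensitivity(model, params, name, perturbation=0.10):
    """Percent change in model output when one parameter is perturbed,
    measured relative to the reference (unperturbed) run."""
    reference = model(**params)
    perturbed = dict(params)
    perturbed[name] = params[name] * (1.0 + perturbation)
    return 100.0 * (model(**perturbed) - reference) / reference


# Hypothetical reference parameter set (not taken from any cited study).
reference_params = {"r": 1.0, "m": 0.2, "K": 10.0}

# One-at-a-time sensitivity of the output to each parameter.
sensitivities = {
    name: percent_sensitivity(equilibrium_biomass, reference_params, name)
    for name in reference_params
}

for name, value in sensitivities.items():
    print(f"{name}: {value:+.2f}% change in equilibrium biomass")
```

In a full ecosystem model the same loop would be run over many parameters and state variables, and a large percent change in a variable that has no direct link to the perturbed parameter is one possible signature of an indirect effect.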
Dippner (1998) addressed indirect interrelations between Si and P availability.
On the basis of a simple numerical model it was concluded that an indirect effect of silicate
reduction in coastal waters is an increased flagellate bloom, owing to the high availability of
riverborne nutrient loads. These conclusions are closely in line with the results related to Lakes
Suwa (referred to earlier) and Rostherne Mere, described below.
The simultaneous application of ANOVA and ANCOVA allowed the
investigator to elicit direct and indirect effects among the biota inhabiting a North American
intertidal rocky shore (Wootton, 2002). The results indicated that consumers may have a
major influence on the dynamics of ecological succession. In an earlier study by the
same author, path analysis was helpful in predicting which direct and indirect effects were
important in a seabird exclosure experiment (Wootton, 1994b). Statistical techniques
(including linear regressions and a number of variations of ANOVA) were also
helpful in another intertidal rocky shore study (Navarrete and Menge, 1996).
Hanratty and Liber (1996) studied indirect effects of the pollutant
diflubenzuron on the growth of larval bluegill sunfish in a littoral enclosure. At very high
concentrations the model predictions were good, but at intermediate concentrations the
accuracy was variable, with some indirect responses being exaggerated owing to cascading
effects through the ecosystem trophic levels.
Hulot and coauthors (Hulot et al., 2000) compared the performance of linear food chain
models and an intermediate complexity model, applied to data from a mesocosm experiment
simulating lake nutrient enrichment. The intermediate complexity model (with separation of
trophic levels into functional groups according to size and diet) was the only one which
performed satisfactorily, thus highlighting the importance of functional diversity and indirect
interactions.
Another modelling study (Loladze et al., 2000) investigated how the interactions between
phytoplankton and zooplankton change if the Lotka-Volterra model incorporates chemical
heterogeneity for both trophic levels. It was found that indirect competition between two
populations for P can shift the relationship from a usual (+, −) type to an unusual (−, −) type,
leading to very complex overall dynamics.
Structural equation modelling (a technique combining path, factor, and regression
analyses) was used by Malaeb et al. (Malaeb et al., 2000) to estimate the contribution of
indirect effects of sediment contamination and natural variability on biodiversity and growth
potential in a selection of North American estuaries. They found that a positive indirect effect
of natural variability (mediated through biodiversity) on growth potential exceeded a direct
negative effect, resulting in the overall positive relationship.
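The decomposition behind this kind of result can be illustrated with a short sketch. In path-based analyses the total effect of one variable on another is commonly taken as the direct path coefficient plus, for every indirect route, the product of the coefficients along that route. The coefficients below are hypothetical and were chosen only to show how a positive indirect effect can outweigh a negative direct one; they are not values reported by Malaeb et al. (2000).

```python
def total_effect(direct, indirect_paths):
    """Return (total, indirect) effects from path coefficients.

    direct         -- coefficient of the direct path
    indirect_paths -- list of paths, each a list of coefficients along the path
    """
    indirect = 0.0
    for path in indirect_paths:
        product = 1.0
        for coeff in path:
            product *= coeff
        indirect += product
    return direct + indirect, indirect


# Hypothetical standardized path coefficients (illustrative only)
direct_nv_to_growth = -0.20      # natural variability -> growth potential
nv_to_biodiversity = 0.60        # natural variability -> biodiversity
biodiversity_to_growth = 0.50    # biodiversity -> growth potential

total, indirect = total_effect(
    direct_nv_to_growth,
    [[nv_to_biodiversity, biodiversity_to_growth]],
)
print(f"indirect = {indirect:+.2f}, total = {total:+.2f}")
```

With these assumed numbers the indirect effect (+0.30) exceeds the magnitude of the direct effect (−0.20), so the overall relationship is positive.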
A simulation model of the Mediterranean infralittoral rocky bottom was used by
McClanahan and Sala (McClanahan and Sala, 1997) to study possible effects of various management options.
Running a number of ‘what–if’ scenarios they concluded that many of the potential changes are
likely to be indirect effects caused by changes in trophic composition. For example, if
invertivorous fish were removed as part of a management scenario, sea urchins would reduce
algal abundance and primary production, leading to competitive exclusion of herbivorous
fish. Although similar interactions were known from tropical seas, these results were not
anticipated by previous field studies in the Mediterranean.
Carrer and Opitz (Carrer and Opitz, 1999) investigated indirect interactions in the Lagoon
of Venice using ‘Ecopath’, a software implementing methods of network analysis. Among
other interesting relationships, they found that about half of the food of nectonic benthic
feeders and nectonic necton feeders passed through detritus at least once, whilst there was no
direct transfer of such food according to the diet matrix. The paper contains a number of
references to other studies where network analysis was used to analyse indirect relationships
among ecosystem components (see also Patten, 1992; Fath and Patten, 1998; Fath and Patten,
1999, and references therein), as well as a reference to the Ecopath web site, which, in turn,
gives a list of studies (mainly aquatic) where this software was applied. A couple of examples
from this list are reviewed below.
Ortiz and Wolff (Ortiz and Wolff, 2002a, 2002b) used Ecopath with Ecosim software
to study benthic communities in Chile. They found that a simulated harvest of the clam
Mulinia generated a complex interplay involving direct and indirect effects, and drastically
changed the properties of the whole system.
Another study utilizing network analysis provided an analysis of the extended path and
flow structure for the well documented oyster reef model (Whipple, 1999). Few simple paths
and a large number of compound paths were counted. The study provided structural evidence
for feedback control in ecosystems, and illustrated the importance of non-living compartments (in
this case detritus) for the ecosystem’s functioning. Even for the model with a low cycling
index (i.e. 11%) multiple cyclic passage paths provided a considerable (22%) flow
contribution. Therefore, it was envisaged that for ecosystems with higher cycling indexes the
patterns observed should be even more pronounced.
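The way network analysis separates direct from indirect transfers can be sketched with a small input-output calculation. The sketch assumes the standard throughflow formulation of network (environ) analysis, in which the matrix of direct transfer fractions G is summed over all path lengths, N = I + G + G² + … = (I − G)⁻¹, so that N − I − G collects every transfer that passes through at least one intermediate compartment. The three-compartment network and its flow values are hypothetical; they are not taken from the oyster reef model or from any Ecopath study.

```python
import numpy as np

# Hypothetical steady-state flows: F[i, j] = flow from compartment j to i
# Compartments: 0 = filter feeders, 1 = detritus, 2 = deposit feeders
F = np.array([[0.0, 0.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 3.0, 0.0]])
z = np.array([10.0, 0.0, 0.0])      # boundary inputs to each compartment

T = F.sum(axis=1) + z               # throughflow = internal inflows + boundary input
G = F / T                           # G[i, j] = fraction of T[j] passed directly to i

identity = np.eye(3)
N = np.linalg.inv(identity - G)     # integral matrix: sum of G**k over all path lengths
indirect = N - identity - G         # transfers along paths of length two or more

print("direct  0 -> 2:", G[2, 0])
print("indirect 0 -> 2:", round(indirect[2, 0], 3))
```

In this assumed network there is no direct flow from filter feeders to deposit feeders, yet 30% of the filter feeder throughflow reaches them via detritus, including material that cycles between detritus and deposit feeders more than once.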
Application of mathematical modelling techniques has been helpful in furthering the
understanding of a number of aspects of marine ecosystem dynamics in the Menai Strait
(Krivtsov et al., 2008a) and in the Irish Sea and Liverpool Bay (see Krivtsov et al., 2008b
and references therein). In particular, these studies helped to elucidate the complex role played
in the ecosystem by suspended particulate matter (SPM), which is arguably one
of the most important ecosystem constituents, affecting the majority of ecological processes
(Krivtsov et al., 2011). At present, most ecosystem models are in dire need of improving the
representation of the SPM dynamics, and it has been argued (Krivtsov et al., 2008b) that
many of the discrepancies between measurements and simulations commonly observed in
aquatic modelling studies may, in fact, have been caused by inadequate representation of the
SPM subsystem.

CASE STUDY OF ROSTHERNE MERE


Detailed attention to indirect effects was given in a number of studies conducted at
Rostherne Mere, one of the best studied lakes in the UK (see Krivtsov et al., 1998, 1999a,
1999b, 1999c, 2000a, 2001b, 2001c, 2002b, 2003a, and references therein). Indirect effects
were shown to occur on (and across) various levels of organisation, including intracellular,
population and ecosystem levels. A comprehensive monitoring data set was analysed by
means of statistical techniques, which facilitated the construction of a dynamic simulation
model. Statistical analysis of the observed data sets and sensitivity analysis (using the
mathematical model ‘Rostherne’) were used to elicit the hidden relationships between Si and
P biogeochemical cycles coupled through the dynamics of primary producers (Krivtsov,
2001; Krivtsov et al., 2000b). These results were then confirmed by new statistical analysis,
and ultimately resulted in changes of the contemporary ecological theory. It was shown that
there is an inverse relationship between spring diatom and summer cyanobacterial blooms,
which could be utilised as a new method of eutrophication control.
If the spring phytoplankton is limited by Si, then artificial Si additions should alleviate Si
limitation and result in the increased P and N removal from the water column (Krivtsov et al.,
2000a). This would consequently lead to a decreased peak of blue–greens in summer. Further
investigations (Krivtsov et al., 2000b), however, revealed a complex interplay between direct
and indirect effects in the ecosystem, including those related to the influences of temperature,
light, inflow/outflow characteristics, and interactions among nutrients, algae, detritus,
zooplankton and fish. Dynamic ecosystem modelling suggested that in cases where factors
other than nutrients make a considerable contribution to the limitation of spring diatom
increase, any event or measure which diminishes the negative effect of these factors should
also result in a decreased summer cyanobacterial maximum, due to enhanced nutrient removal
in spring by diatoms.
Some of the indirect relationships studied within the Rostherne Mere research were
classified in relation to the underlying mechanisms (i.e. in this case directly and indirectly
mediated), which facilitated extrapolation of the conclusions for other types of ecosystems
(see Krivtsov, 2002, 2004, and references therein). A number of ‘what–if’ scenarios examined
provided information on the differences in the manifestation of the indirect effect of Si on the
cyanobacterial bloom in relation to, e.g., hydrological and morphological parameters, thus
assessing differences between ecosystem types (e.g. deep versus shallow lakes, lakes with
long versus lakes with short retention times). These analyses have led to the derivation of the
‘indirect regulation rule for consecutive stages of ecological succession’, which generalised
the most notable interdependencies observed for other types of ecosystems (Krivtsov et al.,
2000c), and to a general classification of the ecosystem effects (Krivtsov, 2001). It is intended
that further work should involve application of structural equation modelling, and the
comparison with the indirect relationships revealed for a terrestrial ecosystem (Krivtsov et al.,
2004), thus providing further basis for the ongoing development of the comparative
theoretical ecosystem analysis (CTEA) framework.

MODELLING BIOGEOCHEMICAL FLUXES IN CONTINENTAL SHELF SEAS

Representation of a biogeochemical cycle in an ecological model should involve all
relevant physical forcing functions, biological and abiotic processes, and chemical fractions
associated with biota, dissolved nutrient pool, bottom sediments, and suspended particulate
matter (SPM). Among those, SPM is often the least appreciated, and is, consequently,
underrepresented in ecological models. The discussion below reviews a number of important
aspects related to biogeochemical cycling, with a particular emphasis on SPM.

SPM

Suspended particulate matter (SPM) is an important ecosystem constituent in shelf seas,
and its characteristics influence overall ecosystem functioning through a wide range of
biogeochemical processes (Tett et al., 1993; Krivtsov et al., 2011). The dynamics of SPM
fluxes in shelf seas is governed both by horizontal circulation and, on a shorter time scale, by
vertical mixing. SPM in shelf seas usually consists of aggregated material (organic and
inorganic) forming flocs. Flocs aggregate and rupture on short time scales in response to the
varying turbulence regime (e.g. aggregation at slack water and rupture at peak flows) with the
result that their size and settling velocity also vary on short time scales. SPM is important
ecologically as it constrains primary productivity while in the water column and constrains
benthic productivity when it accumulates rapidly on the seabed as benthic fluff. The vertical
flux of SPM is also the primary pathway for transfer of organic carbon from plankton blooms
to the seabed.
The vertical flux of particles primarily depends on the entrainment rate from the sea bed,
settling velocity (which, in turn, is dependent upon the size and density of the particles which
are usually in the form of flocs), and vertical mixing. Therefore, the parameters which may be
of interest for relevant mathematical models are biological (e.g. benthic and pelagic
production), hydrodynamic (e.g. hydrochemical and hydrophysical profiles, tidal regime), and
sedimentological (particle composition, aggregation rate, settling velocity). Consequently,
relevant ecosystem models (e.g. Sharples and Tett, 1994; Luytens et al., 1999; Sharples,
1999; Smith and Tett, 2000) capable of describing biogeochemical fluxes in shelf seas have,
albeit to a variable extent, included tidal mixing, water column dynamics, algal production
and loss processes, and SPM/benthic fluff settling/resuspension.

Tidal Mixing

In continental shelf seas, daily inputs of turbulent kinetic energy from tidal and wind
stirring, and convective mixing, are opposed by buoyancy inputs due to solar heating. The
balance between these depends on seasonal variability of the surface heat flux. In deeper
regions, seasonal stratification develops whereby a thermocline separates surface and bottom
mixed layers (SML and BML, mixed by wind and tide, respectively – see Figure 1).
Shallower regions remain mixed for most of the time. During summer, stratification develops
and a tidal mixing front (which is the surface expression of the thermocline) is positioned
between stratified and mixed regions, where the intensity of tidal turbulent mixing is
sufficient to overcome the tendency to stratify due to surface heating. By equating turbulent
mixing to the potential energy difference before and after mixing, it has been shown that the
critical determinant of water column structure is tidal stirring, predicted by h/u³, where u is a
depth-mean tidal current and h is water depth (Simpson and Hunter, 1974). A
reasonable precision of front prediction can be achieved by a combined wind and tide stirring
model (for details see Simpson and Bowers, 1981). Further improvements can be obtained
through formulations of water column structure based on consideration of turbulent kinetic
energy (tke) (e.g. Sharples and Tett, 1994; Simpson et al., 1996; Luytens et al., 1996;
Sharples, 1999). In such models, a turbulence closure scheme (e.g. Mellor and Yamada,
1982) links vertical stratification, driven by surface heating, and turbulence generated by tidal
friction at the seabed and wind stress at the surface; in effect, water column stability is related
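to the efficiency of vertical turbulent transport. Thus, changes in tke (q²/2, where q is the
turbulent velocity scale) are given by:

$$\frac{\partial}{\partial t}\left(\frac{q^2}{2}\right) = \frac{\partial}{\partial z}\left[K_q \frac{\partial}{\partial z}\left(\frac{q^2}{2}\right)\right] + N_z\left[\left(\frac{\partial u}{\partial z}\right)^2 + \left(\frac{\partial v}{\partial z}\right)^2\right] + \frac{g}{\rho} K_z \frac{\partial \rho}{\partial z} - \frac{q^3}{B_1 l} \qquad (1)$$

where t is time, z is the vertical coordinate, Kq is the vertical eddy diffusivity of tke, Nz is
the depth-dependent coefficient of eddy viscosity, Kz is the vertical eddy diffusivity, u and v
are the x and y components of current velocity, g is the acceleration due to gravity, ρ is water
density, B1 is a constant of the turbulence closure scheme, and l is the constant turbulent
length scale. This formulation contains the vertical diffusion of tke, shear production of
turbulence, work done against buoyancy, and the dissipation of tke (the terms on the right
hand side of Eq. 1, respectively).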
Distribution of heat input from solar radiation is mediated by attenuation due to SPM and
algal biomass in the water column. Such models have been very successful in simulating
vertical mixing processes in general, and genesis of the thermocline in particular.
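As a rough illustration of how the local source and sink terms of such a tke budget can be evaluated on a vertical grid, the sketch below computes shear production, the buoyancy term and dissipation from discrete profiles. The functional forms are the ones commonly quoted for Mellor and Yamada type closures and are an assumption here, as are the value B1 = 16.6, the upward-positive z axis, and all profile values, which are hypothetical.

```python
import numpy as np

G_ACCEL = 9.81   # acceleration due to gravity, m s^-2
B1 = 16.6        # closure constant (assumed value)


def tke_source_terms(z, u, v, rho, q, Nz, Kz, l):
    """Return (shear production, buoyancy term, dissipation) on the grid z.

    z is taken positive upwards, so stable stratification has d(rho)/dz < 0
    and the buoyancy term is then a sink of tke.
    """
    du_dz = np.gradient(u, z)
    dv_dz = np.gradient(v, z)
    drho_dz = np.gradient(rho, z)
    shear = Nz * (du_dz**2 + dv_dz**2)
    buoyancy = (G_ACCEL / rho) * Kz * drho_dz
    dissipation = q**3 / (B1 * l)
    return shear, buoyancy, dissipation


# Hypothetical 50 m water column with a thermocline near 20 m depth
z = np.linspace(-50.0, 0.0, 26)                        # m, surface at z = 0
u = 0.5 * (z + 50.0) / 50.0                            # m/s, linear shear
v = np.zeros_like(z)
rho = 1026.0 - 0.5 * (1.0 + np.tanh((z + 20.0) / 3.0)) # kg/m3, lighter on top
q = np.full_like(z, 0.02)                              # m/s
shear, buoyancy, dissipation = tke_source_terms(
    z, u, v, rho, q, Nz=1.0e-3, Kz=1.0e-3, l=1.0)

print("max shear production:", shear.max())
print("strongest buoyancy sink:", buoyancy.min())
```

The buoyancy sink peaks at the thermocline, which is where the suppression of vertical turbulent transport by stratification is expected to be strongest.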
From Jago and Jones (2002).

Figure 1. Conceptual diagram of water column structure and water quality in tidal shelf seas. * = factor
which limits algal growth. 2.7 is a critical transition value of h/u³.
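A minimal sketch of how the stratification parameter can be used to classify a location is given below. It assumes that the critical value of 2.7 applies to log10(h/u³) with h in metres and u in metres per second, which is how the parameter is often reported; that interpretation, the function names and the example depths and currents are assumptions rather than values taken from the studies cited.

```python
import math

CRITICAL_LOG10 = 2.7   # assumed critical value of log10(h / u^3), SI units


def stratification_parameter(depth_m, tidal_speed_ms):
    """log10 of the Simpson-Hunter parameter h / u^3."""
    return math.log10(depth_m / tidal_speed_ms**3)


def regime(depth_m, tidal_speed_ms, critical=CRITICAL_LOG10):
    """Classify a site as seasonally 'stratified' or 'mixed'."""
    chi = stratification_parameter(depth_m, tidal_speed_ms)
    return "stratified" if chi > critical else "mixed"


# Hypothetical sites: (water depth in m, tidal current amplitude in m/s)
for name, h, u in [("deep, weak tide", 80.0, 0.3),
                   ("shallow, strong tide", 30.0, 1.0)]:
    print(name, round(stratification_parameter(h, u), 2), regime(h, u))
```

Because the current enters as a cube, a modest change in tidal amplitude moves a site across the threshold far more readily than a comparable change in depth.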

It should be noted that although frontal zones tend to be quite narrow (ca. 5 km wide), a
front is not an abrupt transition. The frontal density structure implies a pressure gradient
perpendicular to the front which drives an along-front current. The geostrophic balance is not
perfect, due to friction, and a small cross-stream flow is generated down the pressure
gradient. Consideration of the cross-frontal dynamical balance, including friction (James,
1978; Garrett and Loder, 1981), suggests a weak cross-frontal circulation with a surface
convergence close to the region of maximum horizontal gradient (e.g. Pingree et al., 1974).
The Simpson–Hunter stratification parameter predicts a lateral migration of fronts during the
lunar cycle of 10–20 km. Even after removing tidal advection, Simpson and Bowers (1979)
concluded that fronts migrate laterally by a few kilometers in response to the lunar cycle.
Numerical models (Simpson and Bowers, 1981; Sharples and Simpson, 1996) suggest that the
position of the frontal zone lags the spring-neap cycle by a few days. A major consequence of
lateral migration is that water from the mixed region is incorporated into the stratified region
as the front advances into shallower water on neap tides (Simpson and Hunter, 1974;
Simpson and Bowers, 1979). Furthermore, infrared satellite imagery and direct observations
show that baroclinic instabilities along the front grow into large eddies (typically 25–40 km)
which make a large contribution to cross-frontal mixing (Pingree and Griffiths, 1978). In
addition, some water from the BML may be mixed across the thermocline by spring tide
currents. Finally, some diffusion of the thermocline might occur due to dissipation of internal
tides.

Advection/Resuspension

Observations of SPM in continental shelf seas can be reasonably reproduced (Jago and
Jones, 1998) by the following conceptual model:

$$S(t) = S_0 + \frac{dS}{dx}\int_0^t U_x \, dt + k\,|U_x| , \qquad (2)$$

where

S0 is a background concentration at t = 0,
Ux is the rectilinear current velocity,
k is a function describing the combined effect of entrainment from the sea bed and the
vertical distribution through the water column.

Values of the background concentration, the combined function k, and the gradient dS/dx along the
tidal stream are usually calculated by linear regressions between observations of S(t) and
tidal current displacement and speed.
The formulation presented above proved to be reasonable, and is capable of reproducing
the so-called ‘twin peaks signal’ in SPM concentration, which results from the
superimposition of the quarter-diurnal signal due to the resuspension of benthic fluff and the
semi-diurnal signal caused by longitudinal concentration gradients in SPM in long-term
suspension. Further improvements in the performance can be gained by introducing a
numerical model which includes horizontal advection of a particulate concentration gradient
and vertical diffusion, and bottom boundary conditions including fluxes due to
resuspension and diffusion, with the entrainment rate being a function of the bed shear stress (for
details see Jago and Jones, 1998, and references therein).
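The conceptual model and its calibration by regression can be sketched as follows. A synthetic SPM record is generated as background plus advection of an along-stream gradient (proportional to tidal displacement) plus resuspension (proportional to current speed), and the three parameters are then recovered by multiple linear regression on displacement and speed. The tidal amplitude and all parameter values are hypothetical, and the use of current speed |Ux| in the resuspension term is an assumption consistent with the quarter-diurnal signal described above.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration
S0_TRUE = 5.0         # background concentration, mg/l
DSDX_TRUE = -2.0e-4   # along-stream concentration gradient, mg/l per m
K_TRUE = 4.0          # resuspension coefficient, mg/l per (m/s)


def spm_concentration(s0, dsdx, k, displacement, speed):
    """Background + advection of a horizontal gradient + resuspension."""
    return s0 + dsdx * displacement + k * speed


# Semi-diurnal tide sampled every 10 minutes over two tidal cycles
period = 12.42 * 3600.0                          # s
t = np.arange(0.0, 2.0 * period, 600.0)          # s
u_x = 0.6 * np.sin(2.0 * np.pi * t / period)     # m/s

# Tidal displacement: cumulative trapezoidal integral of U_x from 0 to t
increments = 0.5 * (u_x[1:] + u_x[:-1]) * np.diff(t)
displacement = np.concatenate(([0.0], np.cumsum(increments)))
speed = np.abs(u_x)

s_obs = spm_concentration(S0_TRUE, DSDX_TRUE, K_TRUE, displacement, speed)

# Multiple linear regression of S(t) on displacement and speed
design = np.column_stack((np.ones_like(t), displacement, speed))
(s0_fit, dsdx_fit, k_fit), *_ = np.linalg.lstsq(design, s_obs, rcond=None)
print(s0_fit, dsdx_fit, k_fit)
```

The speed term repeats twice per tidal cycle while the displacement term repeats once, so their sum produces two unequal concentration peaks per cycle. With field data the fit would of course include measurement noise and would not recover the parameters exactly.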

Phytoplankton Population Dynamics

Primary production in shelf seas is determined by nutrient availability, grazing pressure,
and growth rate. Growth rate mainly depends on nutrient availability (Droop, 1983) and light,
while peak biomass is constrained by nutrients (Tett et al., 1993). Tidal stirring controls the
availability of both nutrients and light.
In mid and high latitude shelves, where plankton dynamics are dominated by transients
such as the spring bloom, it is likely that growth rate is the most important regulator of the
observed changes in population density (Tett et al., 1993). The temporal evolution of
phytoplankton biomass is determined by growth and vertical turbulent transport, tempered by
grazing (Sharples and Tett, 1994; Sharples, 1999; Sharples et al., 2001):

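$$\frac{\partial X}{\partial t} = \frac{\partial}{\partial z}\left(K_z \frac{\partial X}{\partial z}\right) + \mu X - gX \qquad (3)$$

where X is biomass, μ is the specific growth rate, Kz is the vertical eddy diffusivity, and g is
the specific rate of loss of algal biomass to grazers.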
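A minimal numerical sketch of this balance is given below. It assumes that growth and grazing loss are both first-order in biomass, uses an explicit finite-difference scheme in flux form, and closes the surface and seabed to turbulent flux. The grid, the diffusivity, the exponential decrease of growth rate with depth and the grazing rate are hypothetical choices made only to produce a runnable example.

```python
import numpy as np


def step_biomass(X, Kz_face, mu, g, dz, dt):
    """Advance biomass X (cell centres) by one explicit time step.

    Kz_face holds the eddy diffusivity at the n - 1 interior cell faces;
    mu and g are specific growth and grazing loss rates (s^-1).
    """
    flux = -Kz_face * np.diff(X) / dz              # turbulent flux at interior faces
    flux = np.concatenate(([0.0], flux, [0.0]))    # no flux through surface or seabed
    mixing = -np.diff(flux) / dz                   # flux divergence in each cell
    return X + dt * (mixing + (mu - g) * X)


n, dz, dt = 20, 2.5, 600.0                         # cells, m, s
depth = (np.arange(n) + 0.5) * dz                  # cell-centre depth below surface, m
Kz_face = np.full(n - 1, 1.0e-3)                   # m2/s
assert dt <= dz**2 / (2.0 * Kz_face.max())         # explicit stability limit

mu = (1.0 / 86400.0) * np.exp(-0.2 * depth)        # light-limited growth, s^-1
g = 0.1 / 86400.0                                  # grazing loss, s^-1

X = np.full(n, 1.0)                                # initial biomass, arbitrary units
for _ in range(int(5 * 86400 / dt)):               # five days
    X = step_biomass(X, Kz_face, mu, g, dz, dt)
print("surface:", X[0], "bottom:", X[-1])
```

In a coupled model Kz would come from the turbulence scheme and μ from the nutrient and light sub-models, rather than being prescribed as here.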
It should be noted that in summer 30–80% of the total algal production in the euphotic
zone takes place in the thermocline (Fransz and Gieskes, 1984) and that the greatest
production occurs near tidal mixing fronts (Pingree et al., 1975, 1978; Savidge, 1976;
Holligan, 1981; Holligan et al., 1983; Loder and Platt, 1985; Tett et al., 1993; Tett and
Walne, 1995). This could be a passive response to the convergent flows which have been
observed at fronts but it is more likely to result from in situ growth of plankton. This is where
the combination of light and nutrients is optimal (Dufour and Stretta, 1973): nutrient renewal
during the summer, due to mixing by tide and wind, and surface stabilisation and reduction of
h1 during fair weather and neap tides. Thus a chlorophyll maximum is observed at the
thermocline and at fronts. The optimal conditions for rapid algal growth are at the front, as
lateral mixing across the front is greater than vertical mixing across the thermocline (Garrett
and Loder, 1981; Tett, 1981; Tett et al., 1986).
The growth rate of phytoplankton may be nutrient-limited:

$$\mu = \mu_m\left(1 - \frac{k_Q}{Q}\right) \qquad (4)$$

where μm is the maximum growth rate, kQ is the subsistence cell nutrient quota, and Q is the
cell nutrient quota (= the ratio of algal cell internal nutrient concentration to chlorophyll
biomass). Unlike in most freshwater ecosystems, in continental shelf seas nitrogen is usually
less available for growth, and is therefore more limiting, than is phosphorus (Tett and Droop,
1988), while silicon may be more limiting than nitrogen for diatoms (Brzezinski, 1985).
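The quota dependence of growth can be illustrated with a few lines of code, assuming the cell-quota (Droop) form μ = μm(1 − kQ/Q), which matches the symbols defined above. Setting growth to zero at or below the subsistence quota is an additional assumption of this sketch, and the numerical values are hypothetical.

```python
def droop_growth_rate(Q, k_Q, mu_max):
    """Specific growth rate from the cell nutrient quota Q.

    k_Q is the subsistence quota and mu_max the maximum growth rate.
    Growth is set to zero when Q <= k_Q (an assumption of this sketch).
    """
    if Q <= k_Q:
        return 0.0
    return mu_max * (1.0 - k_Q / Q)


# Hypothetical values: quota in mmol N per mg chl, rates per day
k_Q, mu_max = 0.2, 1.5
for Q in (0.2, 0.4, 1.0, 2.0):
    print(Q, droop_growth_rate(Q, k_Q, mu_max))
```

Growth reaches half of its maximum when the quota is twice the subsistence quota and approaches the maximum only asymptotically as the quota increases.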
In addition to nutrient limitation, phytoplankton growth may be slowed down by low
light and/or temperature. There are numerous formulations of how exactly these limitations may
be combined with the limitations imposed by nutrient availability, and a detailed
consideration of the matter would be beyond the scope of this paper. Low light due to high
SPM concentrations is a major control in mixed regions of shelf seas. One of
the simpler representations can be found in the definitions of the freshwater ecosystem
model Rostherne presented elsewhere (see Krivtsov et al., 2000b).

EXAMPLES OF COUPLED ENVIRONMENTAL MODELS


There are a number of models which provide coupling of the relevant physical,
biological, and chemical processes manifesting in the continental shelf seas. For example,
SEDBIOL (Smith and Tett, 2000) is a 1-D depth-resolving model which couples water
column dynamics, algal production, and SPM/benthic fluff settling/resuspension. The model
provides seasonally varying turbulent diffusivities which drive nutrient cycles and
interactions with phytoplankton (grazed by zooplankton) and SPM, including settling flux
and deposition of benthic fluff. The model predicts annual net primary productivity and
carbon fluxes to the seabed. COHERENS (Luytens et al., 1999) is an advanced 3-D coupled
model which uses turbulence closure schemes to provide the dynamical framework for
plankton cycling and SPM settling and exchange with the benthic fluff layer. It simulates
plankton dynamics, settling flux and fluff deposition rate and resolves mesoscale and seasonal
scale processes. Such models provide conceptual insights and numerical solution of
biogeochemical fluxes in stratified and mixed regions of tide-driven shelf seas. Importantly,
the coupled models provide numerical relationships between factors such as water
temperature and seabed anoxia (which are potentially recorded in the sediments by biological
and geochemical proxies) and governing variables such as water depth, mixing, and
turbulence.
Another relevant example in this respect is POLCOMS (the Proudman Oceanographic
Laboratory Coastal Ocean Modelling System, see www.pol.ac.uk/home/research/polcoms), a
three-dimensional modelling system whose main elements are a three-dimensional baroclinic
hydrodynamic model (Holt and James, 2001) linked to a surface wave model (Wolf et al.,
2002; Osuna et al., 2007), a sediment resuspension and transport model (Holt and James,
1999) and an ecosystem model (ERSEM, the European regional seas ecosystem model, see
e.g. Baretta et al., 1995; Blackford et al., 2004) with benthic and pelagic components. This
modelling system has been developed primarily to investigate physical–biogeochemical
interactions in shelf seas, see for example Proctor et al. (2003).
Coupled models are capable of reproducing a complex interplay between ecosystem
components, and simulating spatial and temporal variation of algal productivity and the
dynamics in tidal shelf seas. Simulations suggest that plankton are constrained in mixed
regions by the transparency of the water column (high nutrients but high turbidity), and that
productivity seems to be greater in areas where λh1 (i.e. the 'mixed layer optical thickness',
where λ is an attenuation coefficient and h1 is the thickness of the layer through which algae are
transported by vertical turbulence) is small rather than where nutrients are greatest (Tett et al.,
1993). Plankton are also constrained in the BML of stratified regions (high nutrients but low
light). In the SML of stratified regions, productivity is initially high in spring (high nutrients
and good light) but rapidly diminishes during the summer as nutrients are consumed and not
replaced. Grazing by herbivorous zooplankton and settling of algal cells (associated with
SPM aggregates) to the seabed reduce the standing stock in the SML. Subsequent predation
on zooplankton reduces grazing pressure, and an increase of nutrients due to wind-driven mixing
stimulates a secondary autumn algal bloom in the SML. As the spring bloom wanes,
enhanced chlorophyll concentrations (up to 100 mg m⁻³) and maximum phytoplankton
biomass are generally seen at the thermocline in seasonally stratified regions (e.g. Anderson,
1969; Cullen and Eppley, 1981; Holligan et al., 1983). Turbulent tidal mixing generates new
production by periodically supplying nitrate from the BML. However, turbulent tidal mixing
also entrains algae into the BML where they are lost from the productive regions of the water
column (Sharples et al., 2001).
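The role of the mixed layer optical thickness can be made concrete with a short calculation. Assuming exponential (Beer-Lambert) attenuation of light with depth, the light averaged over a layer of thickness h1 is I0(1 − exp(−τ))/τ, where τ is the product of the attenuation coefficient and h1. The function name and the attenuation and depth values below are hypothetical.

```python
import math


def mean_mixed_layer_light(surface_light, attenuation, h1):
    """Depth-averaged light in a layer of thickness h1 (m), assuming
    exponential attenuation with coefficient `attenuation` (m^-1)."""
    tau = attenuation * h1                 # optical thickness of the layer
    if tau == 0.0:
        return surface_light
    return surface_light * (-math.expm1(-tau)) / tau


# Hypothetical cases: clear stratified surface layer versus turbid mixed column
clear = mean_mixed_layer_light(100.0, attenuation=0.1, h1=20.0)    # tau = 2
turbid = mean_mixed_layer_light(100.0, attenuation=0.5, h1=40.0)   # tau = 20
print(round(clear, 1), round(turbid, 1))
```

For large optical thickness the mean light falls off roughly as 1/τ, which illustrates why algae circulating through a deep, turbid mixed column can remain light-limited despite abundant nutrients.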
The nutrient status of algal cells has been shown to affect their cohesiveness (Kiorboe et
al., 1990) and enhanced post-bloom plankton agglutination has been attributed to an increase
in stickiness due to nutrient depletion (Logan and Alldredge, 1989; Smetacek, 1985). Such
stickiness has been linked to SPM aggregation with respect to flagellates and dinoflagellates
(Passow and Wassman, 1994; Jones et al., 1998) as well as to diatoms. Strong biological
mediation of SPM aggregation during the late stages of a flagellate bloom, resulting in
increased particle size and settling flux of SPM to the seabed, has been measured (Jago et al.,
2007). Links between algal activity and particulate settling flux suggest that the greatest flux
per unit area of particulate matter to the seabed should occur in regions of greatest algal
production per unit volume, i.e. in frontal zones. In the aftermath of blooms, the enhanced
settling flux gives rise to low density, organic-rich, benthic fluff on the bed (Jago et al.,
1993). A correct understanding of the exact nature and dynamics of this fluff is crucial
because of its links with a number of important biological and chemical processes and its
overall effect on biogeochemical cycling. This understanding may be greatly facilitated by further
investigations combining experimental approaches with comprehensive monitoring and
modelling studies.

POTENTIAL FOR INVESTIGATING INDIRECT EFFECTS AND IMPLICATIONS FOR INTERPRETING PAST AND FORECASTING FUTURE ENVIRONMENTAL DYNAMICS

Comprehensive aquatic models are characterized by the coupling of physical forcing
functions with definitions describing relevant chemical and biological processes, and can
therefore assist in revealing the complex multivariate interplay among the components and processes
involved, as well as in analysing indirect interactions and their role in overall ecosystem
functioning (Krivtsov, 2002, 2000). Therefore, comprehensive aquatic models have a
considerable potential to aid interpretation of the past, and prediction of the future
environmental dynamics.
For example, in the fossil records of Rostherne Mere, Livingston (Livingston, 1979) was
the first to notice an inverse relationship between diatom and cyanobacterial remains
deposited, respectively, during the spring and the summer of the same year. However, it was not
until much later that this relationship was explained after careful interpretation of a
comprehensive set of ‘what–if’ scenarios simulated using the model ‘Rostherne’, and was
attributed to the indirect effect of coupling between Si and P biogeochemical cycles by the
dynamics of primary producers (Krivtsov, 2001).
Since models such as COHERENS provide a numerical simulation of biogeochemical
fluxes underpinned by sophisticated treatment of shelf dynamics, they offer the potential for
quantitative interpretation of sediment proxies in the stratigraphic record. Combination of
models and sediment proxies (e.g. stable isotope chemistry of foraminifera), calibrated by
training sets, can provide information on water column structure, surface heating, mixing, and
water depth, thus providing a basis for reconstruction of past shelf sea regimes (see, for
example, Jago and Jones, 2002).
It has previously been argued (Krivtsov, 2002) that the correct understanding of indirect
relationships in environmental systems is crucial for sustainable development of humankind.
Considering the importance of aquatic ecosystems and the experience accumulated in studies
of indirect effects in the aquatic environment, complex aquatic models are likely to continue
playing a leading role in these investigations, in particular within the framework of the
comparative theoretical ecosystem analysis.

REFERENCES
Abrams P. A., Menge B. A., Mittelbach G. G., Spiller D. A. & Yodzis P. (1996) The role of
indirect effects in food webs. In: Food webs: integration of patterns and dynamics. (eds.
G. Polis & W. K.) pp. 371 - 395. Chapman and Hall, New York.
Anderson, G.C. Subsurface chlorophyll maximum in the Northeast Pacific Ocean. Limnol.
Oceanogr. 14 (1969), pp. 386–391.
Baretta J.W., W. Ebenhoeh and P. Ruardij, The European regional seas ecosystem model, a
complex marine ecosystem model. Neth. J. Sea Res. 33 (3–4) (1995), pp. 233–246.
Bartell S. M., Lefebvre G., Kaminski G., Carreau M. & Campbell K. R. (1999) An ecosystem
model for assessing ecological risks in Quebec rivers, lakes, and reservoirs. Ecological
Modelling, 124: 43-67.
Blackford J.C., J.I. Allen and F.J. Gilbert (2004) Ecosystem dynamics at six contrasting sites:
a generic model study, J. Mar. Syst. 52: 217–234.
Brzezinski, M.A. The Si:C:N ratio of marine diatoms: interspecific variability and the effect
of some environmental variables. J. Phycol. 21 (1985), pp. 347–357.
Carrer S. & Opitz S. (1999) Trophic network model of a shallow water area in the northern
part of the Lagoon of Venice. Ecological Modelling, 124: 193-219.
Cullen, J.J. and R.W. Eppley, Chlorophyll maximum layers of the Southern-California Bight
and possible mechanisms of their formation and maintenance. Oceanol. Acta, 4 (1981),
pp. 23–32.
Dippner J. W. (1998) Competition between different groups of phytoplankton for nutrients in
the southern North Sea. Journal of Marine Systems, 14: 181-198.
Droop, M.R. 25 years of algal growth kinetics - a personal view. Bot. Mar. 26 (1983), pp.
99–112.
Dufour, P., Stretta, J.M., 1973. Fronts thermique et thermohalins dans la région du Cap Lopez
(Golfe de Guinée) juin-juillet 1972: phytoplancton, zooplancton, micronecton et pêche
thonière. Documents Scientifiques, Centre de Recherches Océanographiques, Abidjan
4/99-142.
Fath B. D. & Patten B. C. (1998) Network synergism: Emergence of positive relations in
ecological systems. Ecological Modelling, 107: 127-143.
Fath B. D. & Patten B. C. (1999) Review of the foundations of network environ analysis.
Ecosystems, 2: 167-179.
Fransz, H.G. and W.W.C. Gieskes, The imbalance of phytoplankton and copepods in the
North Sea. Rapp. P.-v Réun. Cons. perm. int. Explor. Mer 183 (1984), pp. 218–225.
Garrett, C.J.R. and J.W. Loder, Dynamical aspects of shallow sea fronts. Phil. Trans. R. Soc.
London A, 302 (1981), pp. 563–581.
Hanratty M. P. & Liber K. (1996) Evaluation of model predictions of the persistence and
ecological effects of diflubenzuron in a littoral ecosystem. Ecological Modelling, 90: 79-95.
Holligan, P.M. Biological implications of fronts on the northwest European shelf. Phil. Trans.
R. Soc. London A, 302 (1981), pp. 547–562.
Holligan, P.M., M. Viollier, C. Dupouy and J. Aiken, Satellite studies on the distribution of
chlorophyll and dinoflagellates in the western English Channel. Cont. Shelf Res. 2 (1983),
pp. 81–96.
Holt J.T. and I.D. James (1999) A simulation of the Southern North Sea in comparison with
measurements from the North Sea Project. Part 2. Suspended particulate matter, Cont.
Shelf Res. 19: 1617–1642.
Holt, J.T. and I.D. James (2001) An s-coordinate density evolving model of the north west
European continental shelf. Part 1. Model description and density structure, J. Geophys.
Res. 106 (C7), 14015–14034.
Hulot F. D., Lacroix G., Lescher-Moutoue F. O. & Loreau M. (2000) Functional diversity
governs ecosystem response to nutrient enrichment. Nature, 405: 340-344.
Hutchinson G. E. (1957) A Treatise on Limnology. Chapman and Hall Ltd., London.
Jago C.F. and S.E. Jones (1998). Observation and modelling of the dynamics of benthic fluff
resuspended from a sandy bed in the southern North Sea. Cont. Shelf Res. 18:1255–1282.
Jago C.F. A.J. Bale, M.O. Green, M.J. Howarth, S.E. Jones, I.N. McCave, G.E. Millward,
A.W. Morris, A.A. Rowden and J.J. Williams (1993) Resuspension processes and seston
dynamics. Phil. Trans. R. Soc. London A, 343: 475–491.
Jago, C.F., Jones, S.E., Kennaway, G., Latter, R.J., McCandliss, R.R., Rippeth, T., Simpson,
J.H., 2002. Mediation of shelf sea suspended particle properties by plankton and
turbulence. EOS Trans. AGU 83(4), Ocean Sciences, Meet. Suppl., Abstract OS22D-213.
Jago, C.F. and Jones, S.E., 2002. Diagnostic criteria for reconstruction of tidal continental
shelf regimes: changing the paradigm. Marine Geology, 191, 95-117.
Jago, C.F., Kennaway G, Novarino G, Jones S E, 2007. Size and settling velocity of
suspended flocs during a Phaeocystis bloom in the tidally-stirred Irish Sea, N W
European shelf. Marine Ecology Progress Series, 345, 51-62
James , I.D. A note on the circulation induced by a shallow sea front. Estuar. Coast. Mar. Sci.
7 (1978), pp. 197–202.
Jones, S.E. , C.F. Jago, A.J. Bale, D. Chapman, R. Howland and J. Jackson , Aggregation and
resuspension of suspended particulate matter at a seasonally stratified site in the southern
North Sea: physical and biological controls. Cont. Shelf Res. 18 (1998), pp. 1283–1310.
Jørgensen S. E. (1980) Lake management. Pergamon, Oxford.
Jørgensen S. E. (1994) Fundamentals of Ecological Modelling (2nd Edition). Elsevier,
Amsterdam.
Kiørboe, T., K.P. Andersen and H.G. Dam, Coagulation efficiency and aggregate formation
in marine phytoplankton. Mar. Biol. 107 (1990), pp. 235–245.
Krivtsov V. (2001) Study of cause-and-effect relationships in the formation of biocenoses:
Their use for the control of eutrophication. Russian Journal of Ecology, 32: 230-234.
Krivtsov V. (2002) Indirect Effects in Ecosystems: a Review of Recent Modelling Studies
and a Methodological Framework for Comparative Theoretical Analysis. In: Integrated
Assessment and Decision Support. Proceedings of the first biennial meeting of the
International Environmental Modelling and Software Society (also available at
http://www.iemss.org/iemss2002/proceedings/pdf/volume%20uno/410_Krivtsov.pdf)
(eds. A. E. Rizzoli & A. J. Jakeman) pp. 233 - 238, Lugano.
Krivtsov V., Bellinger E. & Sigee D. (2000a) Incorporation of the intracellular elemental
correlation pattern into simulation models of phytoplankton uptake and population
dynamics. Journal of Applied Phycology, 12: 453-459.
Krivtsov V., Bellinger E. & Sigee D. (2003a) Ecological study of Stephanodiscus rotula
during a spring diatom bloom: dynamics of intracellular elemental concentrations and
correlations in relation to water chemistry, and implications for overall geochemical
cycling in a temperate lake. Acta Oecologica, 24: 265-274.
Krivtsov V., Bellinger E. G. & Sigee D. C. (1999a) Modelling of elemental associations in
Anabaena. Hydrobiologia, 414: 77-83.
Krivtsov V., Sigee D., Corliss J. & Bellinger E. (1999b) Examination of the phytoplankton of
Rostherne Mere using a simulation mathematical model. Hydrobiologia, 414: 71-76.
Krivtsov V., Tien C., Sigee D. & Bellinger E. (1999c) X-ray microanalytical study of the
protozoan Ceratium hirundinella from Rostherne Mere (Cheshire, UK): Dynamics of
intracellular elemental concentrations, correlations and implications for overall
ecosystem functioning. Netherlands Journal of Zoology, 49: 263-274.
Krivtsov V., Bellinger E., Sigee D. & Corliss J. (1998) Application of SEM XRMA data to
lake ecosystem modelling. Ecological Modelling, 113: 95-123.
Krivtsov V., Bellinger E., Sigee D. & Corliss J. (2000b) Interrelations between Si and P
biogeochemical cycles - a new approach to the solution of the eutrophication problem.
Hydrological Processes, 14: 283-295.
Krivtsov V., Corliss J., Bellinger E. & Sigee D. (2000c) Indirect regulation rule for
consecutive stages of ecological succession. Ecological Modelling, 133: 73-82.
Krivtsov V., Goldspink C., Sigee D. C. & Bellinger E. G. (2001b) Expansion of the model
'Rostherne' for fish and zooplankton: role of top-down effects in modifying the prevailing
pattern of ecosystem functioning. Ecological Modelling, 138: 153-171.
Krivtsov V., Sigee D. & Bellinger E. (2001c) A one-year study of the Rostherne Mere
ecosystem: seasonal dynamics of water chemistry, plankton, internal nutrient release, and
implications for long-term trophic status and overall functioning of the lake.
Hydrological Processes, 15: 1489-1506.
Krivtsov V., Sigee D. & Bellinger E. (2002b) Elemental concentrations and correlations in
winter micropopulations of Stephanodiscus rotula: an autecological study over a period
of cell size reduction and restoration. European Journal of Phycology 37: 27-35.
Krivtsov V., J. Gascoigne and M. W. Skov (2011) Dynamics of suspended particulate matter
in the Menai Strait (UK) and its implications for ecosystem functioning and management.
In A. D. Nemeth (Editor). The Marine Environment: Ecology, Management and
Conservation. NovaScience Publishers, pp 179-197.
Krivtsov V., Walker S. J. J., Staines H. J., Watling R., Burt-Smith G. & A. G. (2004b)
Integrative analysis of ecological patterns in an untended temperate woodland utilising
standard and customised software. Environmental Modelling and Software: Volume 19,
Issue 3, March 2004, Pages 325-335.
Krivtsov V., Gascoigne J. and Jones S. E. 2008a. Harmonic analysis of suspended particulate
matter in the Menai Strait (UK). Ecological Modelling, 212, 53-67.
Krivtsov V., Howarth M. J., Jones S. E., Souza A. J. and Jago C. F. 2008b. Monitoring and
modelling of the Irish Sea and Liverpool Bay: An overview and an SPM case study.
Ecological Modelling, 212, 37-52.
Krivtsov V. 2004. Investigations of indirect relationships in ecology and environmental
sciences: a review and the implications for comparative theoretical ecosystem analysis.
Ecological Modelling, 174, 37-54
Krivtsov V. 2008. Indirect Effects in Ecology. In: Sven Erik Jorgensen and Brian Fath,
Editors-in-Chief, Encyclopedia of Ecology, Academic Press, Oxford. Pages 1948-1958,
ISBN 978-0-08-045405-4, DOI: 10.1016/B978-008045405-4.00693-5.
Loder, J.W., Platt, T., 1985. Physical controls on phytoplankton production at tidal fronts. In:
Gibbs, P.E. (Ed.), Proc. 19th European Mar. Biol. Symposium, Cambridge University
Press, Cambridge, pp. 3–21.
Logan, B.E. and A.L. Alldredge , Potential for increased nutrient uptake by flocculating
diatoms. Mar. Biol. 101 (1989), pp. 443–450.
Loladze I., Kuang Y. & Elser J. J. (2000) Stoichiometry in producer-grazer systems: Linking
energy flow with element cycling. Bulletin of Mathematical Biology 62: 1137-1162.
Luyten, P.J., E. Deleersnijder, J. Ozer and K.G. Ruddick, Presentation of a family of
turbulence closure models for stratified shallow water flows and preliminary application
to the Rhine outflow region. Cont. Shelf Res. 16 (1996), pp. 101–130.
Luyten, P.J., Jones, J.E., Proctor, R., Tabor, A., Tett, P., Wild-Allen, K., 1999. COHERENS
- a coupled hydrodynamical-ecological model for regional and shelf seas: users
documentation. Management Unit of the Mathematical Models of the North Sea,
Brussels, MUMM Internal Document, 911 pp.
Malaeb Z. A., Summers J. K. & Pugesek B. H. (2000) Using structural equation modeling to
investigate relationships among ecological variables. Environmental and Ecological
Statistics, 7: 93-111.
McClanahan T. R. & Sala E. (1997) A Mediterranean rocky-bottom ecosystem fisheries
model. Ecological Modelling, 104: 145-164.
Mellor, G.L. and T. Yamada , Development of a turbulence closure model for geophysical
fluid problems. Rev. Geophys. Space Phys. 20 (1982), pp. 841–875.
Mortimer C. H. (1941) The exchange of dissolved substances between mud and water in
lakes. Journal of Ecology, 29: 280 - 329.
Mortimer C. H. (1942) The exchange of dissolved substances between mud and water in
lakes. Journal of Ecology, 30: 147-201.
Naito W., Miyamoto K., Nakanishi J., Masunaga S. & Bartell S. M. (2002) Application of an
ecosystem model for aquatic ecological risk assessment of chemicals for a Japanese lake.
Water Research, 36: 1-14.
Navarrete S. A. & Menge B. A. (1996) Keystone predation and interaction strength:
Interactive effects of predators on their main prey. Ecological Monographs, 66: 409-429.
Ortiz M. & Wolff M. (2002a) Application of loop analysis to benthic systems in northern
Chile for the elaboration of sustainable management strategies. Marine Ecology-Progress
Series, 242: 15-27.
Ortiz M. & Wolff M. (2002b) Dynamical simulation of mass-balance trophic models for
benthic communities of north-central Chile: assessment of resilience time under
alternative management scenarios. Ecological Modelling, 148: 277-291.
Osuna P., A.J. Souza and J. Wolf (2007) Effects of the deep-water wave breaking dissipation
on the wind-wave modelling in the Irish Sea, J. Mar. Syst. 67: 59–72.
Passow, U. and P. Wassman , On the trophic fate of Phaeocystis pouchetii (Hariot). IV. The
formation of marine snow by P. pouchetii. Mar. Ecol. Progr. Ser. 104 (1994), pp. 153–
161.
Patten B. C. (1985) Energy Cycling in the Ecosystem. Ecological Modelling, 28: 1-71.
Patten B. C. (1992) Energy, Emergy and Environs. Ecological Modelling, 62: 29-69.
Patten B. C., Bosserman R. W., Finn J. T. & Gale W. G. (1976) Propagation of cause in
ecosystems. In: Systems analysis and simulation in ecology (ed. B. C. Patten) pp. 457 -
579. Academic Press, New York.
Pingree, R.D. and D.K. Griffiths , Tidal fronts on the shelf seas around the British Isles. J.
Geophys. Res. 83 (1978), pp. 4615–4622.
Pingree, R.D. , P.M. Holligan, G.T. Mardell and R.N. Head , The effects of vertical stability
on phytoplankton distribution in the summer on the northwest European shelf. Deep Sea
Res. 25 (1978), pp. 1011–1028.
Pingree, R.D., G.R. Forster and G.K. Morrison , Turbulent convergent tidal fronts. J. Mar.
Biol. Assoc. UK. 54 (1974), pp. 469–479.
Pingree, R.D., P.R. Pugh, P.M. Holligan and G.R. Forster , Summer phytoplankton blooms
and red tides along tidal fronts in the approaches to the English Channel. Nature, 258
(1975), pp. 672–677.
Proctor R., J.T. Holt, J.I. Allen and J. Blackford (2003) Nutrient fluxes and budgets for the
north west European Shelf from a three-dimensional model., Sci. Total Environ. 314–
315: 769–785.
Reynolds C. S. (1984) The ecology of freshwater phytoplankton. Cambridge University
Press, Cambridge.
Savidge, G. A preliminary study of the distribution of chlorophyll a in the vicinity of fronts in
the Celtic and western Irish Sea. Estuar. Coast. Mar. Sci. 4 (1976), pp. 617–625.
Sharples, J. and P. Tett , Modelling the effect of physical variability on the midwater
chlorophyll maximum. J. Mar. Res. 52 (1994), pp. 219–238.
Sharples, J. , C.M. Moore, T. Rippeth, P.M. Holligan, D.J. Hydes, N.R. Fisher and J.H.
Simpson , Phytoplankton distribution and survival in the thermocline. Limnol. Oceanogr.
46 (2001), pp. 486–496.
Sharples, J., 1999. Investigating the seasonal vertical structure of phytoplankton in shelf seas.
Prog. Oceanogr. Suppl. S, 3–38.
Sharples, J., Simpson, J.H., 1996. The influence of the springs-neaps cycle on the position of
shelf fronts. In: Aubrey, D.G., Friedrichs, C.T. (Eds.), Buoyancy effects on coastal and
estuarine dynamics, Am. Geophys. Union, Washington, DC, pp. 71–82.
Simpson, J.H. and D. Bowers, Models of stratification and frontal movement in shelf seas. Deep
Sea Res. 28A (1981), pp. 727–738.
Simpson, J.H. and D. Bowers , Shelf sea front adjustments revealed by satellite IR imagery.
Nature, 280 (1979), pp. 648–651.
Simpson, J.H. and J.R. Hunter , Fronts in the Irish Sea. Nature, 250 (1974), pp. 404–406.
Simpson, J.H. , W.R. Crawford, T.P. Rippeth, A.R. Campbell and J.V.S. Cheok, The vertical
structure of turbulent dissipation in shelf seas. J. Phys. Oceanogr. 26 (1996), pp. 1579–
1590.
Smetacek, V.S. Role of sinking in diatom life history cycles: ecological, evolutionary and
geological significance. Mar. Biol. 84 (1985), pp. 239–251.
Smith, C.L. and P. Tett , A depth-resolving numerical model of physically forced
microbiology at the European shelf edge. J. Mar. Syst. 675 (2000), pp. 1–36.
Tett, P. Modelling of phytoplankton production at shelf sea fronts. Phil. Trans. R. Soc.
London A, 302 (1981), pp. 605–615.
Tett, P. and A. Walne , Observations and simulations of hydrography, nutrients and plankton
in the southern North Sea. Ophelia, 42 (1995), pp. 371–416.
Tett, P. , I. Joint, D. Purdie, M. Baars, S. Oosterhuis, G. Daneri, F. Hannah, D.K. Mills, D.
Plummer, A. Pomroy, A.W. Walne and H.J. Witte , Biological consequences of tidal
stirring gradients in the North Sea. Phil. Trans. R. Soc. London A, 340 (1993), pp. 493–
508.
Tett, P., A. Edwards and K.J. Jones , A model for the growth of shelf sea phytoplankton in
summer. Estuar. Coast. Shelf Sci. 23 (1986), pp. 641–672.
Tett, P., Droop, M.R., 1988. Cell quota models and planktonic primary production. In:
Wimpenny, J.W.T. (Ed.), Handbook of Laboratory Model Systems for Microbial
Ecosystems, vol. 2. CRC Press, Boca Raton, FL, pp. 177–233.
Wardle D. A. (2002) Communities and ecosystems. Linking the aboveground and
belowground components. Princeton University Press, Princeton.
Whipple S. J. (1999) Analysis of ecosystem structure and function: extended path and flow
analysis of a steady-state oyster reef model. Ecological Modelling, 114: 251-274.
Wolf J., S.L. Wakelin and J.T. Holt (2002) A coupled model of waves and currents in the Irish
Sea, Proceedings of the 12th International Offshore and Polar Engineering Conference
Kitakyushu, Japan, vol. 3 May 26–31(2002), pp. 108–114.
Wootton J. T. (1994) Predicting Direct and Indirect Effects - an Integrated Approach Using
Experiments and Path-Analysis. Ecology, 75: 151-165.
Wootton J. T. (2002) Mechanisms of successional dynamics: Consumers and the rise and fall
of species dominance. Ecological Research, 17: 249-260.
In: Ecological Modeling ISBN: 978-1-61324-567-5
Editor: WenJun Zhang, pp. 223-265 © 2012 Nova Science Publishers, Inc.

Chapter 10

MODELING POPULATION DYNAMICS, DIVISION OF LABOR
AND NUTRIENT ECONOMICS OF SOCIAL INSECT COLONIES

Thomas Schmickl* and Karl Crailsheim


Artificial Life Laboratory of the Department of Zoology
Karl-Franzens University Graz, Universitätsplatz 2, 8010 Graz, Austria.

ABSTRACT
In the evolution of social insects, the colony and not the (often sterile) individual
worker should be considered the major unit of selection. Thus, social insect colonies are
considered to be 'super-organisms', which have – like all other organisms – to perform
behaviors which affect their outside environment and which alter their own future
internal status. The way these behaviors are coordinated is by means of communication,
which is either direct or indirect and which involves information exchange either by
transmitting signals or by exploiting cues. Social insect colonies therefore process
information in much the same way as multicellular organisms do, where
behaviors result from the exchange of information among their sub-modules (cells). In
many cases, self-organization allows a colony to evaluate massive amounts of
information in parallel and to decide on the colony's future behavioral responses.
Many feedback systems that govern self-organization of workers have been investigated
empirically and theoretically. Here, we discuss models which have been proposed to
explain division of labor and task selection in social insects. We demonstrate how the
collective regulation of labor in eusocial insect colonies is studied by means of top-down
modeling and by bottom-up models, often analyzed with multi-agent computer
simulations.

Keywords: social insects, division of labor, task selection, colony integration, multi-agent
modeling, honeybees.

* Tel: +43 316 380 8759, Fax: +43 316 380 9875.

1. INTRODUCTION
For any living organism, choosing an appropriate behavior is crucial for survival and
reproduction. Usually, the actions that compose a behavior are attributed to a certain
function, such as 'foraging for food', 'reproduction', or 'self-defense'. In most cases, the
successful performance of such behaviors significantly alters the constitution and the mode of
operation of the organism, for example by increasing the activity of specific
organs in multicellular organisms or of specific organelles in unicellular organisms.
In our field of study, social insects, similar behaviors are observed at the level of insect
colonies. Such colonies are usually conceived of as 'super-organisms' (Moritz & Fuchs 1998,
Hölldobler & Wilson 2008). The reasoning is that in social insects whole colonies,
and not individual workers, can be seen as the units of selection, as reviewed by Tarpy et
al. (2004) for honeybees. In most cases, female workers are sterile and colonies
reproduce by division of whole colonies (e.g., swarming in honeybees) or by producing
queens and males that found new colonies (e.g., most ants, termites, wasps, bumblebees).
We adopt this super-organism approach and discuss colony-level decision-making and
self-regulation as a result of division of labor and of communication and interaction among
workers. We focus here on those eusocial insects that exhibit division of labor (DOL) and,
in part, also task partitioning (TP). DOL refers to the fact that the
working duties in the colony are not distributed at random. In many social insect species,
specific groups of workers perform specific sets of tasks within the colony's collective
physiology; they thus have many similarities to organs in metazoa and to organelles in
protozoa and microbes. These analogies were already pointed out by Wilson (1985) as well as
by Robinson (1992), who showed the importance of adaptability and of self-organizational
aspects in DOL.
The predictions of our mathematical models of honeybees, which are discussed in this
chapter, suggest that the processes that allow a social insect colony to perform collective
decision making and collective homeostasis are influenced by the colony-wide physiological
network, in which specific pathways (flow of nutrients, flow of workforce, flow of information
...) exist. In turn, colony-level decision making significantly affects this colony-level
physiology, making the collective physiology an important component of significant
feedback loops. Colony-level physiology is itself regulated by feedback loops that establish
colony-level homeostasis. However, in most cases these feedback loops involve specific
behaviors of individual workers, so modeling these networks quickly leads to models that
cannot be solved in closed mathematical form, and numerical simulation becomes
important. Often the behavior of the individual agents that constitute a colony, as well as their
environmental conditions, are so complex that individual-based modeling is
needed to generate plausible models of these focal systems. Loss of generality is
often the price that has to be paid whenever individual-based models are used. We are well
aware of this in our own approach, as our models tend to describe the modeled system in
a very detailed manner. However, this level of detail is needed to allow our models to answer
our focal scientific questions. We try to model the physiology of the whole colony by
modeling the physiology of the individuals with as much precision as is feasible. Our concept
is to fuse many physiological building blocks (worker physiology) into one single higher-level
physiology (colony). In turn, behavioral decisions of individual agents are mostly results of
their physiological status and their local environment, which are summarized as 'proximate
mechanisms' (Mayr 1961, Tinbergen 1963). These individual behaviors ultimately constitute
the collective behavior of the colony. Our modeling approach allows us to investigate how
such colony-level decision making and self-regulation affects energy economics and brood
production of insect colonies. Thus, these studies ultimately allow us to discuss also 'ultimate
causation' of colony-level behaviors (Mayr 1961, Tinbergen 1963).
In our work, too, the main scientific questions focus on proximate mechanisms
and on the ultimate causation of colony-level reactions (or group-level behavior), which are the
response of a colony to a specific environmental stimulation. These questions can be grouped
into two distinct sets:

• Q1: How does the composition of the colony (different specialists) change in
response to these stimuli? How does colony structure (e.g. age structure) change?
How does the flow of nutrients and energy through the colony change? The answers
to this set of questions form the 'substrate' for investigating the second set of
questions, as the colony status restricts the nutrients, building material,
energy and workforce available to the colony.
• Q2: How does individual worker decision making affect the colony's global
physiological status and vice versa? How is the physiology of individual workers
correlated with the global physiology of the collective? How is the ratio of gain to
costs of specific worker actions related to the colony-wide economics of energy and
nutrients? Which information is spread, stored and filtered in the colony by specific
actions performed by workers? This set of questions investigates the proximate
mechanisms involved in DOL. However, because we also model the physiology of workers,
the studies aimed at answering these questions also deliver predictions of colony-level
fitness parameters, namely survival and fecundity of colonies. Thus, these studies
also link back to the ultimate causation of observed behaviors. In this article we will
describe how studies of DOL can be performed with a set of models of honeybee
population dynamics and nectar energetics.

2. DIVISION OF LABOR IN SOCIAL INSECTS

The degree of specialization found in social insect workers differs among species.
Specialization refers to the fact that a worker is more likely to perform one
task rather than another, compared with other workers in similar situations. In contrast,
plasticity refers to the fact that one worker engages in a variety of tasks or changes the
task it performs over time. Specialization of workers is advantageous if the specialist
worker performs its specialized task more efficiently than an average worker can (Jeanne
1986a). This is most obvious in ants and termites, where there are sometimes huge
morphological differences among worker castes (Wilson 1985). In honeybees, in addition to the
morphological differences between workers, drones and queens, there are also significant
physiological differences between worker specialists. Owing to this physiological
specialization, a worker bee can be more efficient in its specialized task. Seeley (1982)
additionally notes that age-related groups are better at locating specific age-related
tasks in their local environment than a hypothetical generalist worker would be: for
example, young bees readily find brood cells to clean nearby, as they emerged
from such cells only a few hours earlier. But specialization can also be disadvantageous: it
makes a colony vulnerable to the loss of one specific caste, and it increases the time
specialists need to find the next suitable working site.
In honeybees, the observed correlation between task specialization and worker age is
called 'age polyethism' (AP). DOL, TP, and AP are very prominent features in eusocial
insects, as we will describe in more detail in the upcoming sections.
Wilson (1984, 1985) reports the case of the ants Pheidole guilelmimuelleri and Pheidole
pubiventris, where even huge morphological differences between small workers (minors) and
huge workers (majors) do not prevent one caste from engaging in a task that is typical of the
other caste when that other caste has been experimentally removed. Workers with sufficient
plasticity might find the next spot to work sooner than specialists. We assume that natural
selection has found a near-optimal trade-off between these two features for every species,
according to the species' ecological constraints. Computational models help us to understand
the ultimate reasons why a particular mode of DOL emerged in a specific species.
In addition to specialization and plasticity, some social insects show also the
phenomenon of task partitioning. In contrast to DOL, which refers to splitting the workers
into task-specific groups, TP refers to splitting a task among several workers. In most cases of
task partitioning found in social insects, it was observed that the foraging task was
partitioned: Searching, cutting, transporting and storing of food is often performed by
specialist workers. In leaf-cutter ants, TP was reported for almost all tasks that involve
handling of material (also waste removal, see Hart et al., 2002). Ratnieks and Anderson
(1999) reviewed this issue in detail and discussed several important aspects of TP: (1) It
allows workers to specialize in specific parts of a whole task, thus possibly increasing
efficiency. (2) By doing so, the reliability of a single sub-task increases, as specialized workers
accomplish their sub-tasks more reliably, but (3) the reliability of the whole task decreases,
because if one sub-task fails, the whole task cannot be performed. In addition, (4) food-transfer
times have to be short enough that the benefits of sub-task specialization are not
wasted. Here again, computational models can support ultimate reasoning about TP.
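The trade-off between points (2) and (3) can be illustrated with a short calculation. The sketch
below is not taken from Ratnieks and Anderson (1999); it assumes, purely for illustration, that
sub-tasks succeed independently of each other and that the whole task is completed only if every
sub-task succeeds. The function name and the success probabilities are hypothetical.

```python
from math import prod


def whole_task_reliability(subtask_success):
    """Probability that a partitioned task is completed.

    Illustrative assumption: sub-tasks succeed independently, and the
    whole task fails as soon as any single sub-task fails.
    """
    return prod(subtask_success)


# Hypothetical values: one generalist doing the whole task in one go,
# versus four specialists (search, cut, transport, store) in a chain.
generalist = whole_task_reliability([0.85])
specialists = whole_task_reliability([0.95, 0.95, 0.95, 0.95])

print(f"generalist: {generalist:.3f}, partitioned: {specialists:.3f}")
```

With these made-up numbers each specialist is more reliable than the generalist (0.95 versus
0.85), yet the chain as a whole completes the task less often, because 0.95 to the fourth power
is about 0.815.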
The organs in non-eusocial metazoan organisms show significant changes in activity
during specific behaviors. Also the task-specific groups in eusocial insects change their group
sizes and the activity levels of their members in association with specific colony-level
behaviors. It was found that those mechanisms that allow a social insect colony to decide
collectively for nesting sites or for important foraging targets are in most cases regulated in a
decentralized and self-organized way (Camazine et al., 2001), often also referred to by the
term 'swarm intelligence'. In such systems, a network of feedbacks regulates DOL and TP,
which arises from the flow of substances (food, pheromones) within the colony, as well as
from worker-to-worker interactions and communication. However, this is not totally different
from non-eusocial metazoan organisms, where hormones and other cell-to-cell interactions
also contribute to selection and shaping of the organisms' behaviors.
When reflecting on an observed behavior, scientists should concentrate on two distinct
sets of questions (Mayr 1961): The first set of questions refers to the individual local
mechanisms that cause the behavior, for example "How does it work?" and "When and how
is it triggered?". The second set of questions refers to ultimate reasoning of the behavior, like
"Why has natural selection favored this behavior?". Computational models support scientists
in elaborating on both sets of questions, as we will discuss in the following sections of this
article. There we describe two major approaches by which DOL and TP are
investigated by means of computational modeling. On the one hand, there are top-down
models, which often consider the population structure of the modeled colony or predict the
population dynamics that arise from DOL. These models often aim at an ultimate reasoning
of DOL. On the other hand, there are many individual-based models, which often aim at
investigating the proximate mechanisms. By describing our own model 'TaskSelSim' in
detail, we show how an individual-based model is used to address questions of ultimate
reasoning.

3. TOP-DOWN MODELING OF DIVISION OF LABOR

The system of feedbacks that allows DOL in an insect society can be studied well by top-down
approaches, for example by stock&flow modeling. This quantity-based modeling
approach has only rarely been used to investigate the interplay between age demography and
DOL in social insects. However, we think that this modeling technique is an excellent way to
describe social insect populations: Stock&flow models are graphical representations of
coupled differential equations. Besides being easily understandable for the
broad public owing to their graphical nature, they implicitly ensure conservation of mass, as
they track quantities in 'stocks' and 'flows' separately from other informational variables. In
such a stock&flow model of DOL in eusocial insects, available workers are the quantities that
reside in a 'stock' and flow to other 'stocks', which represent the worker groups engaged in
specific tasks. These 'flows' thus represent the outcome of the recruitment system of the
modeled species. Another flow of workers connects back from the task-related stock to the
stock of un-recruited workers. In addition, birth of new animals increases the available
workforce over time, a process which might be affected by successful performance of specific
tasks. In parallel, death decreases the workforce, whereby mortality rates might be task-
related as well. This scheme represents a sort of system-dynamics skeleton that demonstrates
how DOL and population dynamics are easily described from a top-down perspective. As
DOL should be explained by some ultimate reasoning (e.g. by a fitness enhancement of the
colony), such a point of view is advantageous, as it helps to investigate how DOL potentially
enhances colony growth or stabilizes a colony's population dynamics.
Here we first develop a basic skeleton of a top-down model of DOL. Figure 1
shows a simple stock&flow model of DOL, which we use as a starting point for our
elaboration: Birth is a flow from a source into a stock, as it produces new workers. In parallel,
workers are subject to death, depicted as flows from stocks into sinks. Workers that engage
in a task are tracked in another stock, which exchanges workers with the stock
of unrecruited workers via two flows, called 'recruitment' and 'abandonment'. Abandonment
is proportional to the number of recruited workers (fixed decay rate). Recruitment is
proportional to the number of recruited workers (active worker-to-worker recruitment), to the
number of unrecruited workers and to the current workload (stimulus). Birth steadily
produces new workload but also creates new workforce. In contrast, task performance
reduces the workload and, in parallel, enhances birth rates. This simplified sketch of a
model demonstrates how population dynamics within a social insect colony affect
DOL and how DOL in turn affects population dynamics by influencing birth rates and by
task-specific mortality rates.

[Figure 1 diagram. Stocks: 'Recruited workers', 'Unrecruited workers', 'Work to be done'. Flows: 'births', 'recruitment', 'abandonment', 'death of task group', 'normal deaths', 'new work', 'work done'.]

Figure 1. A stock&flow diagram of DOL that is embedded in a simple model of population dynamics.
Boxes indicate quantities of workers or quantities of work. Solid arrows indicate flows (transitions) of
quantities. Dashed arrows indicate causal relationships. Cloud symbols indicate sinks and sources.
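
The verbal description of Figure 1 translates directly into three coupled rate equations. The
following minimal sketch integrates them with a forward-Euler scheme. It is an illustration
under stated assumptions, not the authors' implementation: the functional forms beyond those
named in the text (work done proportional to recruited workers times workload, births as a
baseline plus a term proportional to work done), all parameter values and the function name
simulate_dol are hypothetical.

```python
def simulate_dol(steps=2000, dt=0.01,
                 b0=0.5, b1=0.2,      # baseline births; extra births per unit of work done
                 m_u=0.01, m_r=0.02,  # death rates of unrecruited / recruited workers
                 a=0.3,               # abandonment rate (fixed decay)
                 r=0.002,             # recruitment coefficient
                 c=0.05,              # work done per recruited worker and unit workload
                 w=1.0,               # new work created per birth
                 U=100.0, R=1.0, W=50.0):
    """Forward-Euler integration of a three-stock DOL skeleton.

    U: unrecruited workers, R: recruited workers, W: work to be done.
    All functional forms and parameter values are illustrative assumptions.
    """
    born = died = 0.0
    history = []
    for _ in range(steps):
        # Flows, evaluated from the current state of the stocks.
        work_done = c * R * W
        births = b0 + b1 * work_done   # task performance enhances births
        recruitment = r * R * U * W    # recruiters x candidates x stimulus
        abandonment = a * R
        deaths_u = m_u * U
        deaths_r = m_r * R
        new_work = w * births          # births create new workload

        # Stocks change only through their inflows and outflows.
        U += dt * (births - recruitment + abandonment - deaths_u)
        R += dt * (recruitment - abandonment - deaths_r)
        W += dt * (new_work - work_done)
        born += dt * births
        died += dt * (deaths_u + deaths_r)
        history.append((U, R, W))
    return history, born, died


history, born, died = simulate_dol()
U_end, R_end, W_end = history[-1]
# Workers are conserved: recruitment and abandonment only move them between stocks.
print(abs((U_end + R_end) - (101.0 + born - died)) < 1e-6)
```

Because workers enter or leave the two worker stocks only through births and deaths, the total
workforce at any time equals the initial workforce plus cumulative births minus cumulative
deaths. This is the conservation property of stock&flow models mentioned above. Note that with
this recruitment term at least one worker must be recruited initially, otherwise recruitment
never starts.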

Starting from this basic modeling sketch, the recruitment and abandonment mechanisms
can be elaborated further. In addition, the age structure should be modeled at higher
resolution to depict DOL in a specific species. In some social insect species, workers are
morphologically (and possibly also physiologically) indistinguishable, so the model depicted
above holds for these species. In other species, especially in ants and termites,
there are often strong morphological differences, which predispose worker castes to
specific jobs. Figure 2 sketches the basic principles of a model that describes DOL in a
species (of ants) that has two distinct morphological castes. As described by Wilson (1984,
1985), the huge 'majors' engage in brood care only seldom. However, when 'minors' were
experimentally removed, the majors started brood care as early as one hour after the
experimental disturbance of the colony's age structure. From the colony-level perspective, the
stock&flow diagram depicted in Figure 2 describes the observed flows of workers, as well as the
change in population structure. For the sake of simplicity, we omitted most of the arrows that
indicate causal relationships and depicted just those that are most significant for DOL. The
thickness of the arrows indicates the strength of the causal relationship.

[Figure 2 diagram. Stocks: 'Recruited majors', 'Unrecruited majors', 'Unrecruited minors', 'Recruited minors', 'Jobs to be done inside of the nest'. Flows for each caste: births, recruitment, abandonment, normal deaths, task deaths. Flows for the job stock: new work, work done.]

Figure 2. A stock&flow diagram of DOL in a species having morphological castes. The model of DOL
is embedded in a simple model of population dynamics. Boxes indicate quantities of workers or
quantities of work. Solid arrows indicate flows (transitions) of quantities. Dashed arrows indicate causal
relationships. Cloud symbols indicate sinks and sources. Only relationships significant for DOL are
drawn.

Another alternative is 'age polyethism' (AP) or 'temporal polyethism' (TP), where
specialization occurs multiple times throughout the lifetime of a worker. These changes of
preferred tasks often correlate with distinct physiological and sometimes also morphological
(e.g., gland development) changes. Although 'minor' workers can engage in the job of
'majors', they do not become 'majors'. In contrast, in bumblebees, morphological and
physiological changes stretch over a time span of one season: Spring-born bumblebees are
smaller and behave differently than late-summer-born bumblebees. In honeybees, a worker
bee engages in several different tasks as it gets older (Seeley 1982; Wilson 1985). Figure 3
shows a simplified stock&flow sketch of a colony that exhibits AP. In such models, tracking
the age structure of the colony is an important aspect, and recruitment rates to specific tasks
vary between the age classes. If an age class has a low population size, less recruitment
occurs to the task usually performed by workers of that age, thus less work will be
performed. In turn, the workload accumulates, enhancing the recruitment rates also for the
other age classes, which have a lower but non-zero recruitment probability for this task. This
way, more workforce is recruited from age classes that are less predisposed for the specific
task, in turn decreasing the effects of the experimentally induced disturbance of DOL. Such
phenomena are described in empirical studies of honeybees (e.g., Robinson et al., 1992). As
figure 3 shows, the feedback between DOL and colony demography is an important feature
for predicting the dynamics of DOL in social insects that exhibit AP, especially when
predictions are made over a longer time period.

[Figure 3 diagram. Stocks: 'brood', 'age group 1', 'age group 2', 'in-nest job'. Flows: births, hatching, getting older, deaths, new work, work done. Both age groups are linked to the tasks 'brood care' and 'foraging'.]

Figure 3. A stock&flow diagram of DOL in a species that shows AP concerning two tasks (foraging,
nursing). The model of DOL is embedded in a simple model of population dynamics. Boxes indicate
quantities of workers or quantities of work. Solid arrows indicate flows (transitions) of quantities.
Dashed arrows indicate causal relationships. Cloud symbols indicate sinks and sources. Only
relationships significant for DOL are drawn. Mortality of brood and age classes are omitted, to keep the
sketch simple.
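
The compensation effect described for AP (a depleted age class lets the workload accumulate, which in turn recruits more workers from less predisposed age classes) can be illustrated with a deliberately minimal calculation. The Python sketch below is a toy under stated assumptions and not a model from the cited literature: it assumes that each age class recruits a number of workers equal to a class-specific constant times the class size times the workload, and that the workload settles where work done equals newly arising work. The function name and all constants are hypothetical.

```python
def equilibrium_recruitment(n_young, n_old, k_young=0.05, k_old=0.005,
                            new_work=10.0, work_per_worker=1.0):
    """Return (workload, recruited_young, recruited_old) at equilibrium.

    Assumptions (illustrative only):
      recruited_c = k_c * n_c * workload      for each age class c
      work done   = work_per_worker * (recruited_young + recruited_old)
      equilibrium : work done == new_work
    The result is only meaningful while k_c * workload <= 1, i.e. while
    no class would have to recruit more workers than it contains.
    """
    capacity = work_per_worker * (k_young * n_young + k_old * n_old)
    if capacity <= 0.0:
        raise ValueError("at least one age class must be able to work")
    workload = new_work / capacity
    return workload, k_young * n_young * workload, k_old * n_old * workload


if __name__ == "__main__":
    print("intact colony:     ", equilibrium_recruitment(100, 100))
    print("young class reduced:", equilibrium_recruitment(10, 100))
```

In this toy setting the total number of recruited workers stays the same after the younger class is reduced, but the equilibrium workload is higher and a larger share of the work is carried by the older class.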

Wakano et al. (1998) modeled a social insect colony with age structure, in which one
vector holds the populations of all age classes. They modeled two tasks (in-nest and foraging).
The degree of specialization to foraging or in-nest work for every age class n is defined by
setting one parameter x_n. Their model incorporates task-specific mortality rates. Brood
production is limited by the performance of both tasks: with no brood care or with no foraging
at all, no brood is produced. Using this model, and by simulating several parameterizations,
the authors achieved a balance between the two modeled tasks. They identified life
expectancy as the ultimate explanation for the fact that the risky foraging task is always
performed by the older workers in social insects with age polyethism. The model was used to
test three regimes of AP: Hard: age determines the job; Soft: every age has a certain
probability to perform the job; Not: no age polyethism at all. The authors found that each kind
of AP is predicted to be advantageous under specific environmental and ecological
conditions. If environmental fluctuations affect the performance of both inside and outside
jobs, soft AP was found to be most adaptive. If fluctuations affect only outside tasks
(foraging), then hard AP was found to be most adaptive. The model of Wakano et al. (1998)
does not address a specific species; it is a general modeling approach to AP in social insects.
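
The three regimes of AP can be made concrete by writing down one specialization value per age class. The following Python sketch is only an illustration of the idea and does not reproduce the equations of Wakano et al. (1998): the step function for the hard regime, the logistic curve for the soft regime, the constant value 0.5 for the regime without AP, and the names `foraging_share`, `switch_class` and `steepness` are all assumptions introduced here.

```python
import math


def foraging_share(n_classes, regime, switch_class=None, steepness=1.0):
    """Return a list x where x[n] is the foraging share of age class n.

    0.0 means pure in-nest work, 1.0 means pure foraging.
    The functional forms are illustrative assumptions.
    """
    if switch_class is None:
        switch_class = n_classes / 2.0
    if regime == "hard":
        # Age alone decides the job.
        return [1.0 if n >= switch_class else 0.0 for n in range(n_classes)]
    if regime == "soft":
        # Every age forages with some probability that rises with age.
        return [1.0 / (1.0 + math.exp(-steepness * (n - switch_class)))
                for n in range(n_classes)]
    if regime == "none":
        # No age polyethism: all ages behave alike.
        return [0.5] * n_classes
    raise ValueError("regime must be 'hard', 'soft' or 'none'")


def foragers(populations, shares):
    """Foraging workforce, given one population value per age class."""
    return sum(p * x for p, x in zip(populations, shares))


if __name__ == "__main__":
    ages = [10.0, 10.0, 10.0, 10.0]  # hypothetical population vector
    for regime in ("hard", "soft", "none"):
        print(regime, foragers(ages, foraging_share(len(ages), regime)))
```

Holding the populations in one list and the specialization values in a second list of equal length mirrors the vector representation mentioned in the text.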
The first issue to be solved in studying DOL in eusocial insects is to know which workers
are available for specific work. In connection with AP, this means knowing how many
workers of each age class are available. In the case of morphologically different workers, it is
important to know how many workers of each morph exist. As honeybees show an adaptive
regime of AP quite prominently (e.g., Seeley 1982), the age demography of honeybees was
studied intensively with classical top-down models and only rarely with bottom-up models. In
our work, we followed both modeling approaches. Empirical studies showed that a honeybee
colony can change its own colony structure in reaction to environmental conditions (e.g.,
seasonal changes), for example by changing the egg laying behavior of the queen, the nursing
intensity of larvae or by cannibalizing brood (Schmickl & Crailsheim 2001). Thus, studying
these dynamics is an important aspect in the investigation of DOL.
AP is most prominent and most complex in honeybees, thus many demographic models
have been made in this field. Some of them also incorporate aspects of AP, in a way that
either age structure affects AP or vice versa. Omholt (1986, 1988, 1992) suggested several
closely related honeybee models that predict the egg laying of the queen, the workers deriving
from these eggs and how these workers in turn affect the egg laying. These models predict
plausible population dynamics, they predict foraging intensity (honey yields) and they show
that worker longevity could be a response to the nursing load a bee has encountered.
DeGrandi-Hoffman et al. (1989) published a model called BEEPOP, which models the
queen's laying of female (workers) and male (drones) eggs, and the resulting populations
(with age structure). The model assumed that bees up to a specific age work inside of the
colony and that older bees work outside. Thus, BEEPOP implements 'hard AP', according to
the definitions of Wakano et al. (1998). The model is parameterized by weather profiles and it
predicts population dynamics as well as the foraging workforce throughout the year.
Makela et al. (1993) described an object-oriented model called 'AHBsim', which models
the population dynamics and AP of Africanized honeybee colonies. What makes this work
outstanding is the fact that parameters were coded in a sort of 'artificial genome'. Colony-level
reproduction (swarming) and 'drone production' were also part of this model. Thus, Makela
et al. (1993) programmed what may be the first 'Artificial Life'-like model of social insects, in
which evolution could be studied on the colony level. AHBsim also contains a hierarchical
order in which worker bees are assigned to specific tasks.
In Schmickl & Crailsheim (2007), we presented a model called 'HoPoMo' (honeybee
population model), which models the population dynamics of a honeybee colony from a given
egg laying pattern of the queen. This model, consisting of more than 60 equations,
incorporates a system of 'soft AP', in which adult workers are recruited on a daily basis for the
tasks of 'nursing', 'nectar handling', 'pollen foraging' and 'nectar foraging'. This is done based
on a hierarchical system of task priorities and according to the current status of supply and
demand on the current day. The model incorporates weather data, a seasonal environmental
resource profile and age demography. After elaborating on the basic model, the reproductive
act of a colony (swarming) was simulated, demonstrating that the population model is able to
predict even significant disturbances of colony structure in a qualitatively and quantitatively
plausible way. In addition, the model is able to predict the effects of changes in the colony's
age demography on the emerging DOL and resource dynamics inside of the colony. In turn,
these variables affect the future intra-colonial population dynamics, thus establishing a time
delayed feedback loop between population dynamics and DOL. Figure 4 depicts the most
important feedback loops that are modeled in HoPoMo.
In our HoPoMo model, DOL is based on a hierarchical system of task priorities, which
also considers the status of supply and demand on the current day: The model assigns
the available pool of worker bees (workforce) to tasks according to the current colony
demand (nutrient flow, brood nursing demands …). Because the model contains a complex
system of interwoven feedback loops, the modeled colony achieves homeostatic regulation of DOL and
nutrient stores (see figure 4). After elaborating on the basic model, which was validated
against many sets of empirical measurements on honeybees, and after exhaustive sensitivity
analysis of the model, the reproductive act of a colony (swarming) was simulated (see Figure
5, Schmickl & Crailsheim 2007). In the future, the model will be used to interpret data from
empirical colony-level experiments on honeybees and to develop mesoscopic honeybee models,
as we have described them in the discussion section.
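
The idea of assigning a daily workforce along a hierarchy of task priorities can be sketched in a few lines. The Python example below is a simplified illustration and not the HoPoMo equations: it fills the tasks greedily in priority order until the workforce is exhausted. The priority order used in the example, the demand figures and the function name `allocate_workforce` are hypothetical.

```python
def allocate_workforce(workforce, demands):
    """Assign workers to tasks in descending priority.

    workforce : number of available adult workers on this day
    demands   : list of (task_name, workers_demanded), highest priority first
    Returns a dict mapping each task name to the workers assigned, plus the
    key 'unassigned' for workers left over.
    """
    allocation = {}
    remaining = float(workforce)
    for task, demand in demands:
        # A task never receives more than it demands or more than is left.
        assigned = min(remaining, max(float(demand), 0.0))
        allocation[task] = assigned
        remaining -= assigned
    allocation["unassigned"] = remaining
    return allocation


if __name__ == "__main__":
    todays_demand = [          # hypothetical priority order and demands
        ("nursing", 400),
        ("nectar handling", 300),
        ("pollen foraging", 200),
        ("nectar foraging", 500),
    ]
    print(allocate_workforce(1000, todays_demand))
```

Recomputing the demands every simulated day from the current brood and store levels, and then calling such an allocation step, is one way to obtain the kind of supply-and-demand feedback described in the text.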

Figure 4. Major feedback loops regulating DOL and population dynamics in HoPoMo. Reprinted from
Schmickl & Crailsheim (2007). Boxes represent stocks that hold quantities (workers, resources). Grey
arrows represent flows. Black arrows represent causal relationships. Circular shapes represent sources
and sinks of quantities. ELR: daily egg laying rate of the queen. SUPcombs: Egg laying suppression
due to little space available on the combs.

Flows of nutrients have a regulating effect on DOL not only in
bees but also in other species: Karsai & Balazsi (2002) showed that the inflow of water into a
wasp colony (Polybia and Metapolybia wasps) dynamically regulates the pulp foraging and
thus regulates the nest building behavior. This was shown by a differential equation model,
yielding results that correspond well to empirical results (Jeanne 1986b, Karsai & Wenzel
1998, 2000). A corresponding individual-based model was published recently by Karsai &
Runciman (2008).
Models that combine mechanisms of DOL in social insects with colony demography are
rare. As computers have become ever faster over the last decades, there has been a trend
toward investigating DOL with microscopic, individual-based, bottom-up
approaches. Most of these studies focus on investigating proximate mechanisms of DOL and
TP, as is described in the upcoming section.

Source: Schmickl & Crailsheim (2007).

Figure 5. Application of the model HoPoMo. Using this model we simulated the population dynamics
and the change in colony weights (bees, nutrient stores and comb) during colony reproduction. Top
row: Predicted intra-colonial population dynamics and weight development of a colony that does not
swarm. Lower two rows: Same predictions for a colony that duplicates by swarming. Middle row:
Colony that stays in the hive. Bottom row: Colony that is formed by the swarm that left the hive for a
new home. All: Day 0 corresponds to the 1st of January in a year.

4. INDIVIDUAL-BASED MODELS OF DIVISION OF LABOR

Division of labor is a colony-level phenomenon. It is the result of individual task
selection of hundreds or thousands of workers in parallel. Thus, much research has been done
on the proximate mechanisms that allow the workers to decide upon engaging in specific
tasks. Several classes of models have been proposed to describe which proximate
mechanisms make a single worker engage in a specific task and how this is regulated on
the colony level. In the majority of models on DOL, a basic assumption is that the
engagement of an animal in a specific task is triggered by a neuronal or hormonal mechanism
that can be represented by a threshold. A stimulus that is intense enough to meet this
threshold triggers a behavior (Lorenz 1978). Beshers and Fewell (2001) give an overview
of the classes of models that have been proposed to explain the observed patterns of DOL by
different proximate mechanisms working on the individual worker level. These models are
not necessarily incompatible with each other, as mechanisms might differ with species or with
the kind of regulated tasks.
Bonabeau et al. (1996) showed that a model consisting of two distinct castes ('minors' and
'majors'), having different thresholds to perform the task of brood care, suffices to simulate
the colony-level plasticity in Pheidole ants reported by Wilson (1984, 1985). In this
individual-based model, the probability of each individual to engage in the brood caring task
is modeled by the following equation

$$P_{i,t} = \frac{s_t^2}{s_t^2 + \Theta_i^2}, \tag{1}$$

where $P_{i,t}$ denotes the probability of an individual of caste i to engage in the nursing task. The
variable t refers to the current time step in the simulation, $s_t$ refers to the current stimulus
strength (workload to be performed) and $\Theta_i$ refers to a fixed threshold of caste i that has to be
met by the stimulus to trigger the task. In their simulations they set $\Theta_1 = 8.0$ and $\Theta_2 = 1.0$.
Equation 1 has been used several times in the literature to model behavioral response thresholds.
It became a quasi-standard, although a simple step-like function ('if s>Θ then act()')
would suffice in most cases. According to the parameter values used in Bonabeau et al.
(1996), the probabilities to engage in the brood-care task will scale with stimulus intensity as
depicted in figure 6. In their model, individuals abandoned a task with a fixed rate, assuming
a fixed time period for task performance. The dynamics of the stimulus were modeled as
follows: The stimulus increased linearly over time (by a constant amount per time step) and
was decreased in proportion to the fraction of all available workers, of both castes, that were
active. Thus, the authors assumed equal efficiency of both castes in performing the task.
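
Equation 1 and the stimulus dynamics described above can be combined into a short individual-based simulation. The Python sketch below is a minimal illustration, not the original code of Bonabeau et al. (1996). The two thresholds 8.0 and 1.0 are taken from the text, and their assignment to 'majors' and 'minors' follows the caption of figure 6. Everything else (colony size, the quitting probability `quit_prob`, the stimulus increment `delta`, the work efficiency `alpha`, the number of steps and the function names) is an assumption chosen for illustration.

```python
import random


def engage_probability(stimulus, threshold):
    """Equation 1: P = s^2 / (s^2 + Theta^2), for threshold > 0."""
    s2 = stimulus * stimulus
    return s2 / (s2 + threshold * threshold)


def mean_major_activity(n_majors=50, n_minors=50, steps=2000,
                        theta_major=8.0, theta_minor=1.0,
                        quit_prob=0.2, delta=1.0, alpha=3.0, seed=1):
    """Mean fraction of active majors over the second half of a run."""
    rng = random.Random(seed)
    thresholds = [theta_major] * n_majors + [theta_minor] * n_minors
    active = [False] * len(thresholds)
    stimulus = 0.0
    activity_sum = 0.0
    for step in range(steps):
        for k, theta in enumerate(thresholds):
            if active[k]:
                # Active workers abandon the task at a fixed rate.
                if rng.random() < quit_prob:
                    active[k] = False
            elif rng.random() < engage_probability(stimulus, theta):
                active[k] = True
        # Constant increase, reduced in proportion to the active fraction;
        # both castes are assumed to work equally efficiently.
        n_active = sum(active)
        stimulus = max(0.0, stimulus + delta - alpha * n_active / len(thresholds))
        if step >= steps // 2:
            activity_sum += sum(active[:n_majors]) / n_majors
    return activity_sum / (steps - steps // 2)


if __name__ == "__main__":
    print("majors active, intact colony: ", mean_major_activity())
    print("majors active, minors removed:", mean_major_activity(n_minors=0))
```

Running the function once with and once without minors gives a qualitative impression of the plasticity discussed in the text: when the low-threshold caste is missing, the stimulus rises until the high-threshold caste takes over the task.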

[Figure 6 plot. Horizontal axis: stimulus intensity (logarithmic scale, 0.1 to 100). Vertical axis: probability of task engagement. Curves: minors, majors.]

Figure 6. Probabilities of task performance of 'majors' and 'minors' in the model of Bonabeau et al.
(1996). A higher stimulus intensity is needed to trigger the nursing task in a 'major' than in a 'minor'.

Using this model of fixed thresholds, Bonabeau et al. (1996) could predict the dynamics
of experimental removal of 'minors' in Pheidole ants by assigning different values of $\Theta_i$ for
'minors' and 'majors' in a way that was comparable to empirical observations (Wilson 1984,
1985). This model was extended to a system with two tasks, where each caste is specialized in
one of the two tasks. The authors called the fixed-threshold model "...the arguably simplest
model that connects flexibility at the worker level with the resiliency observed at the colony
level..." (Bonabeau et al. 1998). Although the model is simple, they report in their extensive
paper that they were able to simulate several aspects of dynamic DOL: Not only the
plasticity observed in Pheidole ants, but also spatial specialization of workers and AP could
be modeled. The authors also introduced genetic (threshold) variability. For modeling AP,
they did not vary the individual's threshold over time; instead, the exposure or sensitivity to a
certain stimulus depended on a worker's age. The fact that workers of different ages
usually occupy different spatial locations in the colony and are thus exposed to different sets
of stimuli was also considered by the 'foraging-for-work' hypothesis (Tofts 1993, Tofts &
Franks 1992, Franks & Tofts 1994): This hypothesis is based on the assumption that spatial
displacement of workers, as they grow older, moves them from the central brood nest to the
periphery of the colony, as new workers constantly hatch in the brood nest. In consequence,
the encountered bouquet of local stimuli changes with this displacement. Thus workers
engage sequentially in tasks that correlate with their age, without requiring any intrinsic
specialization mechanism that explicitly reflects a worker's age.
The above-mentioned models assumed fixed thresholds that stay constant for every
individual through time. Lorenz (1978) discussed the importance of such behavioral
thresholds and also described their adaptation: Both the exposure to stimuli and the
successful performance of a stimulus-triggered behavior modulate such thresholds. The
phenomena of 'sensitization' and 'facilitation' refer to the fact that once a stimulus has
triggered a behavior, the animal is more likely to repeat this behavior, even at a lower
stimulus level. This is equivalent to lowering behavioral thresholds. After some time without
such a stimulus, these effects disappear, which is equivalent to increasing these thresholds.
These mechanisms were incorporated into the previously described threshold-reinforcement
models, which assume that the behavioral threshold $\Theta_{i,j,t}$ of individual i to engage in task j at
time t is decreased each time the worker engages in that task. Thus the worker becomes more
specialized in task j. The same threshold is increased in each time step in which the task is
not performed. Variants of this model were analyzed and published by Deneubourg et al.
(1987), by Plowright & Plowright (1988), and by Theraulaz et al. (1991, 1998): In these
models, the probability of individual i to engage in task j is described by

$$P_{i,j,t} = \frac{s_{i,j,t}^2}{s_{i,j,t}^2 + \Theta_{i,j,t}^2} \tag{2}$$

This is a variant of equation 1 in which $s_{i,j,t}$ is the associated stimulus to perform task j, as it is
sensed by worker i at time t.
As in the fixed-threshold models, performance of a task leads to a local decrease of the
associated stimulus strength, but stimuli accumulate again as new workload emerges
constantly. Using this mechanism, simulations showed (Theraulaz et al. 1998) that the
modeled colony members quickly split up into two groups: one specialized on the first task
and one specialized on the second task. It was also shown that removal of one worker group
leads to an increase in stimulus strength of the associated task and this fact ultimately allows
specialists of another task group to re-specialize. Theraulaz et al. (1998) modeled this
mechanism in a time-continuous way, considering the fractions of time spent by agent i with
and without engaging in task j. In contrast to that, Gautrais et al. (2002) modeled this process
in an individual-based model. They showed that such effects occur more prominently in
bigger simulated colonies compared to small colonies. Equation 3 gives the details for the
threshold adaptation mechanism:

$$\Theta_{i,j,t+1} = \begin{cases} \Theta_{i,j,t} - \xi_j & \text{if task } j \text{ was performed during time step } t \\ \Theta_{i,j,t} + \varphi_j & \text{else} \end{cases} \tag{3}$$

The self-reinforcement model incorporates a strong positive feedback, which drives the
system into a state in which individuals are quickly caught by one of several basins of
attraction. If there are enough other individuals around performing the other task, they will
stay in this basin (of specialization) for a long time. Merkle & Middendorf (2004) identified
several shortcomings of the Gautrais et al. (2002) model: The observed non-specialization in
small colonies was shown to be an effect of initial conditions, because in the initial phase of
the simulation runs, workload (demand) grew much more slowly in small colonies than in
bigger colonies. As only frequent task performance leads to specialization of a worker, these
colonies developed mainly generalists. The authors also identified the maximum value of
thresholds as a crucial parameter: The bigger this maximum value is, the more often an
individual has to perform a task within a short window of time to become a perfect specialist.
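
The threshold-reinforcement mechanism can be sketched for a single worker and a single task. The Python example below is an illustration under stated assumptions, not the code of any of the cited studies: the threshold is lowered by a decrement (called `xi` here) whenever the task is performed and raised by an increment (called `phi` here) otherwise, and it is clamped between a minimum and a maximum value. The concrete numbers, the clamping bounds and the function names are hypothetical.

```python
def engage_probability(stimulus, threshold):
    """Response-threshold probability s^2 / (s^2 + Theta^2)."""
    s2 = stimulus * stimulus
    denominator = s2 + threshold * threshold
    # Guard against 0/0 when both stimulus and threshold are zero.
    return s2 / denominator if denominator > 0.0 else 0.0


def update_threshold(theta, performed, xi, phi,
                     theta_min=0.0, theta_max=1000.0):
    """Lower the threshold after performing the task, raise it otherwise.

    The result is clamped to [theta_min, theta_max]; the bounds are an
    assumption of this sketch.
    """
    theta = theta - xi if performed else theta + phi
    return min(theta_max, max(theta_min, theta))


if __name__ == "__main__":
    theta = 10.0       # hypothetical starting threshold
    stimulus = 5.0     # hypothetical constant stimulus
    print("before:", engage_probability(stimulus, theta))
    for _ in range(3):  # the worker performs the task three times in a row
        theta = update_threshold(theta, True, xi=2.0, phi=1.0)
    print("after: ", engage_probability(stimulus, theta))
```

A larger `theta_max` means that a worker that has drifted to the upper bound needs more consecutive task performances to return to a low threshold, which is the sensitivity to the maximum threshold value noted in the text.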
In addition to changes of parameterization and simulation procedures, Merkle &
Middendorf (2004) also introduced extensions to the original model: They modeled a limited
life span of workers and introduced new (grown) workers into the system. In addition, they
varied maximum thresholds with age (comparable to suggestions in Bonabeau et al., 1998),
thus extending the model to depict AP as well. Their model extensions also contained a
'competition-for-work' aspect, where those individuals with the lowest thresholds for a given
task were