
Genetics Research and Issues Series
GENETIC DIVERSITY

No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or
by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no
expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of information
contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in
rendering legal, medical or any other professional services.

Sex Chromosomes: Genetics, Abnormalities, and Disorders
Cynthia N. Weingarten and Sally E. Jefferson (Editors)
2009. ISBN: 978-1-60741-304-2

Genetic Diversity
Conner L. Mahoney and Douglas A. Springer (Editors)
2009. ISBN: 978-1-60741-176-5

GENETIC DIVERSITY

CONNER L. MAHONEY
AND
DOUGLAS A. SPRINGER
EDITORS

Nova Science Publishers, Inc.
New York

Copyright 2009 by Nova Science Publishers, Inc.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or
transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical
photocopying, recording or otherwise without the written permission of the Publisher.

For permission to use material from this book please contact us:
Telephone 631-231-7269; Fax 631-231-8175
Web Site: http://www.novapublishers.com

NOTICE TO THE READER
The Publisher has taken reasonable care in the preparation of this book, but makes no expressed
or implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of
information contained in this book. The Publisher shall not be liable for any special,
consequential, or exemplary damages resulting, in whole or in part, from the reader's use of, or
reliance upon, this material. Any parts of this book based on government reports are so indicated
and copyright is claimed for those parts to the extent applicable to compilations of such works.

Independent verification should be sought for any data, advice or recommendations contained in
this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage
to persons or property arising from any methods, products, instructions, ideas or otherwise
contained in this publication.

This publication is designed to provide accurate and authoritative information with regard to the
subject matter covered herein. It is sold with the clear understanding that the Publisher is not
engaged in rendering legal or any other professional services. If legal or any other expert
assistance is required, the services of a competent person should be sought. FROM A
DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE
AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.

Library of Congress Cataloging-in-Publication Data

Genetic diversity / [edited by] Conner L. Mahoney and Douglas A. Springer.
p. ; cm.
Includes bibliographical references and index.
ISBN 978-1-60876-541-6 (E-Book)
1. Variation (Biology) I. Mahoney, Conner L. II. Springer, Douglas A.
[DNLM: 1. Genetic Variation--genetics. 2. Adaptation, Biological--genetics. 3. Biodiversity. 4.
Evolution, Molecular. QU 500 G328 2009]
QH401.G46 2009
576.5'8--dc22 2009012468

Published by Nova Science Publishers, Inc. New York

Contents

Preface
Chapter 1 Analysis of Sequence Diversity at Two Mitochondrial Genes on
Different Taxonomic Levels. Applicability of DNA Based Distance
Data in Genetics of Speciation and Phylogenetics
Y. Ph. Kartavtsev
Chapter 2 Chromosomal Variability and the Origin of Citrus Species
Marcelo Guerra
Chapter 3 Genetic Diversity of Mycobacterium Tuberculosis Population in
Bulgaria
Violeta Valcheva, Igor Mokrousov, Olga Narvskaya,
Nalin Rastogi and Nadya Markova
Chapter 4 Genetic Diversity in Switchgrass: A Potential Bioenergy Crop
B. Narasimhamoorthy, M. C. Saha, H. S. Bhandari and
J. H. Bouton
Chapter 5 Genetic Variability in the Fescue-Ryegrass Complex
F. M. Kirigwi, A. A. Hopkins and M. C. Saha
Chapter 6 Genetic Diversity of the Population of Russia: Gene Pool and
Genegeography
Sergei Rychkov, Oksana Naumova, Alexei Evsyukov,
Irina Morozova, Yuri Shneider and Olga Zhukova
Chapter 7 Genetic Variability within Cypella fucata Ravenna in Southern
Brazil
Évilin Giordana de Marco, Luana Olinda Tacuatiá,
Lilian Eggers, Eliane Kaltchuk-Santos and
Tatiana Teixeira de Souza-Chies

Chapter 8 Genetic and Functional Diversity of Phosphate Solubilizing
Fluorescent Pseudomonads and Their Simultaneous Role in
Promotion of Plant Growth and Soil Health
K. Badri Narayanan, M. Jaharamma, G. Raman and N. Sakthivel
Chapter 9 Genetic Diversity and Population Structure of Alpine Plants
Endemic to Qinghai-Tibetan Plateau, with Implications for
Conservation under Global Warming
Yupeng Geng, John Cram and Yang Zhong
Chapter 10 Bayesian Inference under Complex Evolutionary Scenarios Using
Microsatellite Markers: Multiple Divergence and Genetic
Admixture Events in the Honey Bee, Apis Mellifera
Jean-Marie Cornuet, Laurent Excoffier, Pierre Franck and
Arnaud Estoup
Chapter 11 Geographic Structure of Craniometric Variation and the Estimates
of Possible Dispersal Routes of Major Human Populations
Tsunehiko Hanihara
Chapter 12 Intra-Specific Genetic Variation in Mosses: A Novel Approach to
Detect Environmental Changes
Valeria Spagnuolo, Stefano Terracciano and Simonetta Giordano
Index

Preface

Genetic diversity is a level of biodiversity that refers to the total number of genetic
characteristics in the genetic makeup of a species. It is distinguished from genetic variability,
which describes the tendency of genetic characteristics to vary. Research has found that
genetic diversity and biodiversity are dependent upon each other, that diversity within a
species is necessary to maintain diversity among species, and vice versa. If any one type is
removed from the system, the cycle can break down, and the community may become
dominated by a single species. Thus, genetic diversity plays a major role in the survival and
adaptability of a species. This book provides research on genetic diversity in plant, animal
and human species. Relationships to environmental changes and global warming are also
studied.
Chapter 1 - Algorithms of nucleotide diversity estimates and other measures of genetic
divergence for the two genes Cyt-b (cytochrome b) and Co-1 (cytochrome oxidase 1) are
analyzed. Based on the theory and algorithms of distance estimates on DNA sequences, as
well as on the observed distance values retrieved from the literature, it is recommended, for
realistic tree building, to choose a specific nucleotide substitution model from the at least 56
available in Modeltest 3.7 or other software, depending on the specific set of nucleotide
sequences. Using a database of p-distances and similar measures gathered from published
sources and GenBank (http://www.ncbi.nlm.nih.gov) sequences, genetic divergence of
populations (1) and taxa of different rank, such as subspecies, semispecies and/or sibling
species (2), species within a genus (3), species from different genera within a family (4), and
species from separate families within an order (5) has been compared.
Empirical data for 18,192 vertebrate and invertebrate species demonstrate that the data
series are realistic and interpretable when p-distance and its various derivatives are used. The
focus was on vertebrates, and fish species in particular, and on the newest dataset obtained in the
framework of FishBOL (http://www.fishbol.org). Distance data revealed various and
increasing levels of genetic divergence of the sequences of the two genes Cyt-b and Co-1 in
the five groups compared. Mean unweighted scores of p-distances for the five groups are: Cyt-b
(1) 1.46±0.34, (2) 5.35±0.95, (3) 10.46±0.96, (4) 17.99±1.33, (5) 26.36±3.88 and Co-1 (1)
0.72±0.16, (2) 3.78±1.18, (3) 10.87±0.66, (4) 15.00±0.90, (5) 19.97±0.80. The estimates
show good correspondence with former analyses. This testifies to the applicability of
p-distance for most intraspecies and interspecies comparisons of genetic divergence up to the
order level for the two genes compared. As seen from the numbers above, and from a
regression analysis, there is no sign of saturation, usually expected from a homoplasy
effect.
Differences in divergence between the genes themselves at the five hierarchical levels
were also found. This conforms to the ample evidence showing different and nonuniform
evolution rates of these and other genes and their various regions. The results of the analysis
of nucleotide as well as allozyme divergence within species and higher taxa of animals
are, firstly, in good agreement with previous results and show the stability of a general
trend, and, secondly, suggest that in animals, phyletic evolution is likely to prevail at the
molecular level, and speciation mainly corresponds to the geographic model (type D1). The
prevalence of the D1 speciation mode does not mean that other modes are absent. There are
at least seven possible modes of speciation. How we can recognize them formally with
operational genetic criteria is a key question for establishing a quantifiable genetic model
(theory) of speciation. An approach is suggested that allows a step forward in this direction.
Research was supported by the Russian Foundation for Fundamental Research grants
#07-04-00186, #08-04-91200 and the Far Eastern Branch of the Russian Academy of
Sciences (RAS) grant #08-3B-06-031, RAS Board Programs, grant #09-1P23-06.
Chapter 2 - The genus Citrus includes some of the most important crop plants in the
world although its taxonomy remains one of the most controversial among angiosperms.
Most species are of hybrid origin and some of them may include germplasm from other
genera. Cytologically, Citrus species are characterized by a stable chromosome number and a
highly variable pattern of heterochromatic bands. Most accessions display heteromorphic
chromosome pairs, suggesting that they originated from cross hybridization. On the
other hand, citron (C. medica), pummelo (C. maxima), a few mandarin accessions, and most
wild Citrus species and related genera exhibit chromosome pairs that are homomorphic for
similar heterochromatic bands. Based on these findings, hybrids and non-hybrid accessions
were identified and the possible origin and relationship among most accessions were
reconsidered.
Chapter 3 - Tuberculosis remains an important public health issue for Bulgaria, a Balkan
country located in a world region with a contrasting epidemiological situation for
tuberculosis. Here, we present results of recent studies on the genetic diversity of
the Mycobacterium tuberculosis population in Bulgaria, evaluated with various DNA
fingerprinting methods (spoligotyping, 24-MIRU-VNTR and IS6110-RFLP typing). The
spoligotype-based population structure of M. tuberculosis in Bulgaria was shown to be
sufficiently heterogeneous. It is dominated by several worldwide distributed spoligotypes
ST53 and ST47 and Balkan-specific spoligotypes ST125 and ST41. The Beijing genotype
strains were not found in Bulgaria in spite of close links with Russia in the recent and
historical past. Comparison with international database SITVIT2 (Pasteur Institute of
Guadeloupe) showed that spoligotype ST53 is found in similar and rather high proportion in
the neighboring Greece and Turkey and almost equally distributed across different regions of
Bulgaria. Contrarily, ST125 is not found elsewhere and is specific for Bulgaria; furthermore
it appears to be mainly confined to the southern part of the country. Novel 15/24-loci format
of MIRU-VNTR typing was found to be the most discriminatory tool compared to
spoligotyping and IS6110-RFLP typing of M. tuberculosis strains in Bulgaria. Furthermore,
VNTR typing was shown useful for resolving ambiguous phylogeny of some spoligotypes, in
particular, those classified as LAM/S by bioinformatics approach. In practical terms, a
reduced Bulgaria-specific 5-locus set (MIRU40, Mtub04, Mtub21, QUB-11b, QUB-26)
provided a sufficiently high differentiation and may be preliminarily recommended for a first-
line typing of M. tuberculosis isolates in Bulgaria although further studies are needed to
validate this scheme. At the same time, a comprehensive secondary subtyping of the clustered
isolates should target all 15 discriminatory loci. We additionally investigated molecular basis
of drug resistance of the studied strains. Three types of the rpoB mutations were found in 20
of 27 RIF-resistant isolates; rpoB S531L was the most frequent. Eleven (48%) of 23 INH-
resistant isolates had katG S315T mutation. inhA -15C>T mutation was detected in one INH-
resistant isolate (that also had katG315 mutation) and three INH-susceptible isolates. A
mutation in embB306 was found in 7 of 11 EMB-resistant isolates. Consequently, rpoB and
embB306 mutations may serve for rapid genotypic detection of the majority of the RIF and
EMB-resistant strains in Bulgaria; the results on INH resistance are complex and further
investigation of more genes is needed. Comparison with spoligotyping and 24-VNTR locus
typing data suggested that emergence and spread of drug-resistant and MDR-TB in Bulgaria
are not associated with any specific spoligotype or MIRU-VNTR genotype. A local
circulation of the particular clones appears to be an important factor to take into consideration
in the molecular epidemiological studies of tuberculosis in Bulgaria.
Chapter 4 - Switchgrass (Panicum virgatum L.) is a warm-season C4 perennial grass
belonging to the family Poaceae. It is native to North America. Persistence across a wide
geographical range, in addition to high biomass production with minimum inputs, makes it an
excellent choice for a sustainable bioenergy crop. Switchgrass is a highly heterozygous,
self-incompatible and out-crossing species. Broad species adaptation, natural selection and
photoperiodism have combined to create considerable ecotypic differentiation in switchgrass.
The natural population is classified into two distinct cytotypes: upland and lowland. Upland
cytotypes are mostly octaploid (2n = 8x = 72) and lowlands are tetraploid (2n = 4x = 36);
however, multiple ploidy levels ranging from diploid (2n = 2x = 18) to dodecaploid (2n = 12x
= 108) have been reported in switchgrass. In the USA, uplands are adapted to the mid and
northern latitudes, while lowlands are in the southern parts of the country. In addition, these
ecotypes differ with respect to photosynthesis, drought tolerance and N-use efficiency.
Knowledge on the amount of genetic diversity and polymorphism in switchgrass is necessary
to enhance the effectiveness of breeding programs and germplasm conservation efforts. In the
past two decades, several studies have been conducted to evaluate the genetic variability in
switchgrass populations. Molecular markers, such as RFLPs, RAPDs and SSRs, were used to
find within and among population variation in a wide range of switchgrass cytotypes. Hybrid
cultivars can be an attractive option for improving biomass production. Molecular marker and
phenotypic data suggest that lowland and upland genotypes represent different heterotic
groups that can potentially be used to produce F1 hybrid cultivars. This review summarizes
the current understandings on the genetic diversity available in P. virgatum populations, with
a focus on studies performed at the Noble Foundation, where the genetic variability and the
relationships within and among switchgrass populations were determined with simple
sequence repeat markers and ploidy analysis.
Chapter 5 - Fescues and ryegrasses in the Lolium genus are widely used as forage and
turf, especially in temperate regions of the world. These highly productive grass species
provide feed and fodder for livestock and wild animals, play a major role as turf on golf
courses and lawns worldwide, and prevent soil erosion. Among these grasses, tall fescue
[Lolium arundinaceum (Schreb.) Darbysh.] germplasm is classified into five botanical
varieties that range from tetraploid to decaploid and into two major germplasm pools,
"Continental" and "Mediterranean", as well as into two functional groups, forage and turf
types. Important species in the genus Lolium include the outcrossing Lolium perenne L.
(perennial ryegrass) and the self-pollinated L. temulentum L. subsp. temulentum (darnel,
darnel ryegrass). The majority of the Lolium are self-infertile, have a strong self-
incompatibility system and are, therefore, highly heterogeneous. Grazing or selection may
lead to loss of rare alleles that may be useful in adaptation in extreme environments, e.g.,
when these cool-season grasses are grown in warmer, drier areas. Understanding the levels of
genetic diversity within and genetic relationships between populations is therefore important
for not only breeding, but also for ensuring adaptability and persistence, quality and disease
resistance of germplasm accessions, breeding lines and populations. At the Noble Foundation,
efforts have been concentrated on collecting tall fescue and L. temulentum germplasm, and
the development of molecular tools for these species. Molecular tools developed in-house
were employed to study genetic diversity and to understand the utility of various marker tools
for diversity studies. In this chapter, we review the genetic diversity work carried out in
Lolium, with an emphasis on our work at the Noble Foundation. Various marker systems
have been found to be useful in the Lolium genus, with SSRs in particular being transferable
across the fescue-ryegrass complex.
Chapter 6 - Genetic differentiation of the population of Russia is investigated. The work
is based on data on the polymorphism of immuno-biochemical and molecular markers in about
1,500 populations from 62 ethnoses belonging to six main linguistic families and having
different cultural traditions. Genetic diversity is studied by cartographic and statistical
methods and is presented in the form of genegeographical maps. The position of the Russian
gene pool against the Eurasian background is described. The genetic relief of Russia is
investigated, and the main structural components of the gene pool are revealed. Analysis of these
components from the ethno-historical point of view revealed their connection with different
regions of Eurasia (West and Central Europe, Central and East Asia).
Chapter 7 - Iridaceae is a relatively large family of monocots comprising over 2,030
species in 65-75 genera. Cypella fucata Ravenna is characterized as a perennial herb which
presents bulbous and beautiful orange flowers that have ornamental value. The distribution of
the species comprises Brazil, in the states of Rio Grande do Sul and Santa Catarina, and
Uruguay. This study aims to compare two geographically distinct survey areas of C. fucata
using molecular approaches and to offer a contribution to the knowledge of genetic variation
of the species. Cypella fucata specimens were collected in the State of Rio Grande do Sul,
Brazil, at two sites: the municipalities of Piratini (26 specimens) and Capão do Leão (28
specimens). Survey sites were located along a road, 22 km from each
other. Specimens were analyzed by ISSR-PCR (Inter Simple Sequence Repeats) since ISSR
markers have a high capacity to reveal polymorphism and offer a great potential to determine
intra- and interspecific levels of variation. Nine primers were tested, generating 201
fragments (bands) with sizes ranging from 150 bp to 2,000 bp and an average of 22 bands per
primer. A matrix of presence and absence of fragments was constructed and Jaccard's
coefficient was calculated. A dendrogram based on these values was generated to reveal the
genetic structure of both populations. The patterns were highly polymorphic within each
collection site, with samples aggregated into two major groups, corresponding to the
surveyed populations. In addition, ΦST was calculated and may indicate some interpopulation
gene flow (ΦST = 0.0851) and an intermediate structure. Nei's genetic distance showed a
high identity between the two collection sites analyzed (98%). Since the sampled areas were
near each other, our data may suggest that they in fact correspond to two subpopulations
derived from a single original one. These data may indicate that C. fucata is cross-pollinated
and that vegetative propagation does not play an important role in the maintenance
of the populations. Specimens from other sites will be analyzed to confirm the mating system.
This study is the first contribution to the knowledge of evolutionary aspects of this species.
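
As an illustration of the band-matrix step summarized above, the following is a minimal sketch of Jaccard's coefficient for two specimens scored for presence (1) or absence (0) of ISSR fragments. The function name and the band profiles are hypothetical and are not taken from the chapter; they only show how shared and unshared bands enter the coefficient, with bands absent from both specimens ignored.

```python
from typing import Sequence


def jaccard_coefficient(bands_a: Sequence[int], bands_b: Sequence[int]) -> float:
    """Jaccard similarity between two presence/absence band profiles.

    J = a / (a + b + c), where a counts bands present in both specimens,
    and b and c count bands present in only one of them. Bands absent
    from both specimens do not contribute.
    """
    if len(bands_a) != len(bands_b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for x, y in zip(bands_a, bands_b) if x and y)
    only_a = sum(1 for x, y in zip(bands_a, bands_b) if x and not y)
    only_b = sum(1 for x, y in zip(bands_a, bands_b) if y and not x)
    total = shared + only_a + only_b
    if total == 0:
        raise ValueError("no bands present in either profile")
    return shared / total


# Hypothetical profiles for two specimens scored at six band positions.
specimen_1 = [1, 1, 0, 1, 0, 1]
specimen_2 = [1, 0, 0, 1, 1, 1]
print(jaccard_coefficient(specimen_1, specimen_2))  # 3 shared / 5 informative bands
```

A full pairwise matrix of such values is the usual input to a clustering method that produces a dendrogram; which clustering method the authors used is not stated in this summary.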
Chapter 8 - Soil microbes that solubilize the insoluble phosphates play a vital role in
maintaining soil fertility, plant health and subsequent enhancement of crop yield. The fluorescent
pseudomonad group of bacteria is often predominant among bacterial species associated
with the plant rhizosphere. Due to their innate capability for plant growth promotion, plant
disease suppression and their potential for biodegradation of agricultural chemical pollutants,
fluorescent pseudomonads have been a major focus for investigators around the world. In
recent years, rich knowledge has been generated on the diversity and functional potential of
fluorescent pseudomonads. This chapter describes the genetic and functional diversity of
fluorescent pseudomonads and their role in phosphate solubilization, biological control and
soil fertility.
Chapter 9 - The Qinghai-Tibetan Plateau is one of the most important centers of
biodiversity for alpine species in the world and is among the areas that are most sensitive to
global warming. Knowledge about population genetics is essential for understanding the
dispersal ability and evolutionary potential of alpine species in a warming world. In this
chapter, we review the genetic diversity and population structure of 19 alpine plant species
endemic to the Qinghai-Tibetan Plateau. Generally, population genetic variation can
vary greatly among species, and endangered species have much lower levels of
genetic diversity than co-occurring common species. Although a few species showed
increased levels of genetic diversity with altitude, we detected no significant correlation
between diversity and altitude in most species. In addition, the isolation-by-distance model
cannot explain the spatial genetic structure in most alpine species that have been investigated,
which may be partially due to the discontinuous distribution of alpine species shaped by complex
geomorphology in the Qinghai-Tibetan Plateau. The implications of these results for the
conservation of alpine plants during global warming are discussed.
Chapter 10 - Making inference from molecular data on the demographic parameters of
complex evolutionary scenarios remains methodologically challenging. The approximate
Bayesian computation (ABC) method has the potential to treat such scenarios (Beaumont et
al., 2002). We have developed a user-friendly methodological framework based on ABC that
allows one to make inferences from microsatellite data under evolutionary scenarios
including any combination of admixture, divergence and (discontinuous) effective population
size variation events, and this for any number of populations. We illustrate here the potential
of this methodological framework by making inferences on a complex scenario involving
four A. mellifera populations sharing two divergence and two admixture events. Four groups
of honey bee populations belonging to two genetic lineages (M and C) and genotyped at eight
microsatellite loci have been analysed twice to evaluate estimation stability. In addition,
mean relative bias and errors have been computed from 500 data sets simulated with known
values of parameters (close to estimates on real data), showing that the order of magnitude of
all parameters is correctly estimated. Time estimates of divergences between populations are
compatible with previous estimates: ~0.6 My for lineages M and C divergence and ~0.2 My
for French and Italian M lineage divergence. The estimated proportion of lineage M alleles in
the subspecies ligustica, amounting to 12%, is intermediate between estimates obtained by
two different methods. Furthermore, our ABC analysis allows decomposing the previous
estimate of 35% of lineage M alleles in the recently admixed population as 23% from the
local mellifera subspecies and 77% × 12% (9.2%) from the imported ligustica, making a total
of 32.2%. The most unexpected result concerns the time of the admixture of lineages M and
C that gave rise to the subspecies ligustica. It is estimated at 2,000 years with an
approximate credibility interval of (~1,000, ~7,000).
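
The decomposition of lineage M alleles quoted above can be checked with a few lines of arithmetic. This is only a sketch of one reading of the summary: it assumes that the admixed population draws 23% of its genes from the local mellifera subspecies (taken as entirely lineage M) and 77% from imported ligustica, which itself carries an estimated 12% of lineage M alleles. The variable names are illustrative.

```python
# Proportions as quoted in the chapter summary (expressed as fractions).
from_local_mellifera = 0.23      # contribution of local mellifera, assumed all lineage M
from_imported_ligustica = 0.77   # contribution of imported ligustica
lineage_m_in_ligustica = 0.12    # estimated share of lineage M alleles within ligustica

# Lineage M alleles that arrive indirectly through ligustica.
m_via_ligustica = from_imported_ligustica * lineage_m_in_ligustica

# Total lineage M share in the recently admixed population.
m_total = from_local_mellifera + m_via_ligustica

print(f"via ligustica: {100 * m_via_ligustica:.1f}%")  # 9.2%
print(f"total lineage M: {100 * m_total:.1f}%")        # 32.2%
```

The total of about 32.2% is the figure the summary compares with the previous estimate of 35%.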
Chapter 11 - In the last decade, a near consensus has emerged in support of a single
African origin of modern humans. However, the timing of dispersal out of Africa and the
routes taken are far from obvious and remain a focus of debate. In the present study, possible dispersal
routes taken across Eurasia and finally to the New World and the Pacific were investigated using
a craniometric dataset consisting of 34 measurements. The degree of intra-regional variation
shows that sub-Saharan Africans are the most diverse and that the diversity of non-Africans is
negatively correlated with geographic distance to East Africa. The relationship between
regional variation and geographic distance from sub-Saharan Africa tested by linear
regression analysis supports a possible dispersal route proposed from the research of mtDNA
haplotype variation, the Horn of Africa (the route across the Bab el Mandeb Strait) as a
passageway in major human migration out of Africa. The results obtained support, moreover,
the multiple migration hypothesis for the peopling of the East/Northeast Asian region: mainly
from central/western Asia with a minor contribution from Southeast Asia. Nonlinear regression
(exponential approximation) analysis using geographic distance measured along a
hypothetical dispersal route shows that phenotypic similarity between populations decreases
as the geographic distance increases. Such findings suggest that geographic distance is a
primary and significant determinant of not only genetic but also craniometric variation
between major human population groups. The present study illustrates that modern human
cranial diversity patterns fit an evolutionary model of neutral expectation and a dispersal
model of iterative founder effects with an African origin.
Chapter 12 - Intra-specific genetic variation is considered an important factor for
evaluating biodiversity; indeed, the higher the genetic variation within a species, the higher its
ability to survive. The loss of suitable habitats for moss species involves demographic
decreases and genetic impoverishment. Mosses have a short generation time compared to
phanerogamic vegetation, particularly trees, and therefore may exhibit all these effects
earlier, predicting the destiny of higher plant communities and the ongoing changes in natural
landscapes. Indeed, intra-specific genetic variation in moss species may represent an ideal
model system for investigating species fitness consequent to natural and man-driven
environmental changes, both at a local level, and at a large scale. At a local level these
studies provide useful information for territory management since they promptly signal local
environmental changes; whereas, over a large scale they highlight historical processes which
have affected taxon origin, distribution, radiation, in relation to the main geological events.
Genetic variation and structure within moss species is influenced by reproductive strategy
and dispersal, giving information about gene exchange, occurrence of sexual reproduction,
selfing/outcrossing rates. Demographic constraints and especially ongoing demographic
fluctuations also help to shape population genetic diversity and structure, revealing
phenomena such as the relative importance of the founder effect and the occurrence of bottlenecks
and genetic drift. Moss genetic variation may highlight environmental disturbance caused
both by natural events and by land use and human pressure. Among disturbances, habitat
fragmentation is one of the most studied due to the increasing loss of suitable habitats for
moss species. In general, it can be stated that intraspecific genetic variation in mosses reflects
environmental gradients, with a high amount of variation in natural environments versus a low
level of variation in threatened environments.
The rapid transformation of the environment into a network of patches due to habitat
fragmentation, and the increasing environmental disturbance, lead to genetic erosion in
isolated populations, with a consequent increase in extinction risk. Thus, intraspecific genetic
variation in mosses appears to be a suitable tracer of environmental disturbance due to the global
ubiquity and the fast generation time of these plants.

In: Genetic Diversity ISBN 978-1-60741-176-5
Editors: C. L. Mahoney and D. A. Springer © 2009 Nova Science Publishers, Inc.

Chapter 1

Analysis of Sequence Diversity at Two
Mitochondrial Genes on Different
Taxonomic Levels. Applicability of DNA
Based Distance Data in Genetics of
Speciation and Phylogenetics

Y. Ph. Kartavtsev
A. V. Zhirmunsky Institute of Marine Biology of the Far Eastern
Branch of the Russian Academy of Sciences,
Vladivostok 690041, Russia

Abstract

Algorithms of nucleotide diversity estimates and other measures of genetic
divergence for the two genes Cyt-b (cytochrome b) and Co-1 (cytochrome oxidase 1) are
analyzed. Based on the theory and algorithms of distance estimates on DNA sequences,
as well as on the observed distance values retrieved from the literature, it is recommended,
for realistic tree building, to choose a specific nucleotide substitution model from the at least 56
available in Modeltest 3.7 or other software, depending on the specific set of
nucleotide sequences. Using a database of p-distances and similar measures gathered
from published sources and GenBank (http://www.ncbi.nlm.nih.gov) sequences, genetic
divergence of populations (1) and taxa of different rank, such as subspecies, semispecies
and/or sibling species (2), species within a genus (3), species from different genera
within a family (4), and species from separate families within an order (5) has been
compared.
Empirical data for 18,192 vertebrate and invertebrate species demonstrate that the
data series are realistic and interpretable when p-distance and its various derivatives are
used. The focus was on vertebrates, and fish species in particular, and on the newest dataset
obtained in the framework of FishBOL (http://www.fishbol.org). Distance data revealed
various and increasing levels of genetic divergence of the sequences of the two genes
Cyt-b and Co-1 in the five groups compared. Mean unweighted scores of p-distances for
the five groups are: Cyt-b (1) 1.46±0.34, (2) 5.35±0.95, (3) 10.46±0.96, (4) 17.99±1.33, (5)
26.36±3.88 and Co-1 (1) 0.72±0.16, (2) 3.78±1.18, (3) 10.87±0.66, (4) 15.00±0.90, (5)
19.97±0.80. The estimates show good correspondence with former analyses. This
testifies to the applicability of p-distance for most intraspecies and interspecies
comparisons of genetic divergence up to the order level for the two genes compared. As
seen from the numbers above, and from a regression analysis, there is no sign of
saturation, usually expected from a homoplasy effect.
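
For readers new to the measure, the p-distance referred to above is simply the proportion of aligned nucleotide sites at which two sequences differ. The following is a minimal sketch under stated assumptions: the function name and the toy sequences are hypothetical, the sequences are taken to be already aligned, and sites with gaps or ambiguous bases are skipped, which is one common convention and not necessarily the one used in the chapter.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned nucleotide sequences.

    Returns the fraction of compared sites that differ. Sites where either
    sequence has a gap or an ambiguous base are excluded from the count.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps ('-') and ambiguity codes such as 'N'
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments (not real Cyt-b or Co-1 data): one difference in ten sites.
seq_1 = "ATGCATGCAT"
seq_2 = "ATGCATGAAT"
print(f"{100 * p_distance(seq_1, seq_2):.1f}%")  # 10.0%
```

Because the p-distance makes no correction for multiple substitutions at the same site, it tends to underestimate divergence between distant taxa, which is why the absence of saturation noted above matters for its use up to the order level.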
Differences in divergence between the genes themselves at the five hierarchical
levels were also found. This conforms to the ample evidence showing different and
nonuniform evolution rates of these and other genes and their various regions. The
results of the analysis of the nucleotide as well as allozyme divergence within species
and higher taxa of animals are, firstly, in good agreement with previous results and
show the stability of a general trend, and, secondly, suggest that in animals, phyletic
evolution is likely to prevail at the molecular level, and speciation mainly corresponds to
the geographic model (type D1). The prevalence of the D1 speciation mode does not
mean that other modes are absent. There are at least seven possible modes of speciation.
How we can recognize them formally with operational genetic criteria is a key question
for establishing a quantifiable genetic model (theory) of speciation. An approach is
suggested that allows a step forward in this direction.
Research was supported by the Russian Foundation for Fundamental Research
grants #07-04-00186, #08-04-91200 and the Far Eastern Branch of the Russian
Academy of Sciences (RAS) grant #08-3B-06-031, RAS Board Programs, grant #09-
1P23-06.

Introduction

Currently the field of molecular phylogenetics is humming with activity, stimulated
by such global initiatives as the Tree of Life Project (http://tolweb.org/tree/), CBOL
(Consortium for the Barcode of Life; http://www.barcoding.si.edu/) and FishBOL
(http://www.fishbol.org). Databases have grown rapidly, and many researchers in the
field are newcomers. However, analysis of genetic variation and tree
building is not a routine task even for those with experience. Many experimental papers,
reviews, monographs and software (e.g. Kumar et al., 1993; Avise, 1994; Li, 1997; Avise,
Wollenberg, 1997; Johns, Avise, 1998; Posada, Crandall, 1998; Nei, Kumar, 2000; Swofford
et al., 1996; 2000; Hall, 2001; Hebert et al., 2002a; Felsenstein, 2004; etc.) are available for
consulting. Still, not much attention is paid to general recommendations in the field for such
newcomers and experts from other disciplines of genetics and evolutionary biology, which is
quite important for general biology and general genetics. Specifically, such a review is
required for molecular taxonomic differentiation and the genetics of speciation. Thus,
investigation of the molecular divergence of organisms over time must take into account
basic genetic properties of the organisms and their groups, forming in nature such
reproduction units as populations and biological species.
It seems logical to combine the issues of population genetics and molecular evolution to
avoid a contradiction between the Biological Species Concept (BSC) and Phylogenetic
Species Concept (PSC), a contradiction that is more apparent than real (Avise, Wollenberg,
1997).
Analysis of Sequence Diversity at Two Mitochondrial Genes 3
Temporal population genetic dynamics cannot be separated from spatial population
dynamics and from an understanding of the bases of intraspecies genetic differentiation. Misled by the
vast possibilities of phylogenetic reconstructions inferred from the primary DNA sequences,
some authors even reject the analysis of spatial divergence altogether, opposing the PSC to the
BSC (Cracraft, 1983; de Queiroz, Donoghue, 1988). Fortunately, many geneticists are far
from such extreme views, understanding the common nature of many intraspecies and
interspecies divergence mechanisms (Altukhov, 1983; Ayala, 1984; Nei, 1987; Avise,
Wollenberg, 1997; Avise, 2001). These, as well as some other issues, are considered in the
current review, which is intended for geneticists of different specialties. Kartavtsev and Lee
(2006) considered these issues three years ago, but many new data have appeared since then. This
review both updates the earlier information and looks at molecular phylogenetics and the
genetics of speciation from a somewhat different angle. Further, the English edition of the former
review was not satisfactorily translated in some places, and the current review attempts to
remedy this. Since the author is a marine biologist, many examples are from the
literature in this field.
Here, I have mainly summarized and analyzed the evidence on the proportion of
nucleotide substitutions in populations within species and in taxa of different ranks, although
there have been earlier generalizations on similar topics (Avise, 1994; Li, 1997; Powell,
1997; Johns, Avise, 1998; Graur, Li, 1999). I was motivated by two reasons: first, the rapid
growth in the amount of information in this field and, second, the fact that estimates for
different genes had not been compared, except by Kartavtsev and Lee (2006).
In the last decade, mitochondrial genes for cytochrome b (Cyt-b) and cytochrome oxidase
1 (Co-1) have been most frequently used for taxonomic and phylogenetic analysis at the
species-and-family level. These genes proved to be useful for estimating divergence in taxa
up to the family level in many animal groups (Johns, Avise, 1998; Graur, Li, 1999; Hebert et
al., 2002 a, b; Greer et al., 2003; Sazaki et al., 2007, Kartavtsev, Hanzawa, 2007). A survey
of the evidence on intraspecies divergence of mitochondrial genes in 256 vertebrate, mostly
sexually reproducing species, indicated that 56% of them form distinct intraspecies maternal
lines, which typically are confined geographically (Avise, Walker, 1999). Thus, the polytypic
species or subdivision into groups of most species is documented by sources that are
independent from other ecological or demographic data and in good agreement with the
latter. In the present paper I do not consider problems related to the construction and analysis
of phylogenetic trees and related phylogenetic issues. This is a specific topic discussed
elsewhere (Li, Zarkhih, 1995; Swofford et al, 1996; Avise, 2000; Nei, Kumar, 2000; Hall,
2001; Sanderson, Shaffer, 2002; Felsenstein, 2004). The main objective of this study is
to consider the levels of nucleotide diversity in animal populations and in taxa of four different
ranks. For convenience, I will refer to these categories as comparison groups. In connection
with the main objective, the aims of the review are as follows: (1) comparing statistical
algorithms for analysis of molecular variation and evolution; (2) comparing estimates of
nucleotide divergence, or the proportion of nucleotide substitutions in a sampled pair of sequences
(p-distance); and (3) briefly summarizing the views on the species in genetic terms and
showing whether and how molecular genetic variability and divergence are related to
speciation.

1. Material and Methods

The primary nucleotide sequences of genes (hereafter, sequences for short) are the focus
of this paper. The conclusions are mainly based on information from a database of
p-distances for two genes, Cyt-b and Co-1, presented in the table (see Appendix). A
considerable part of the genetic distances in this table was either computed from sequences or
taken directly or indirectly as the authors' estimates, both for Cyt-b (Johns, Avise, 1998) and
Co-1 (Hebert et al., 2002 a, b). Most sequences were retrieved by the authors of the cited works
from GenBank (Release 103.0, 131). For Cyt-b, 2821 gene sequences were examined and for
Co-1, 655 and 13320 sequences. Sequence length varied for Cyt-b from 200 bp (Johns,
Avise, 1998) up to a nearly complete 1200 bp (Hardman, 2004; Kartavtsev et al., 2007 a, b,
etc.) and for Co-1 varied in different group comparisons from 619 to 669 bp (Hebert et al.,
2002 a, b; Ward et al., 2005; Kartavtsev et al., 2008; 2009a, b, etc.). In each group compared,
the p-distance or its derivative was estimated (see section 2.1). My analysis consisted of
computing and comparing the mean values from which the database was formed, which
included much data from other sources (see table in the Appendix for references).
The information was retrieved from the literature sources by means of the following three
methods. (1) If the distance matrices were available, the arithmetic means were calculated
directly, using each of the pairs once relative to the other units of comparison: e.g., 1-2 and 1-
3, but not 2-3 of the three possible pairs. This principle, which permits avoiding restriction of
random choice, imposed by the matrix, was also employed earlier (Johns, Avise, 1998).
Hebert et al. (2002b) compared all possible pairs of n(n - 1)/2, while in Hebert et al. (2002 a),
the comparison principles were different for different taxa compared. I also made all pairwise
comparisons in the cases where only a few sequences were available or when a choice within
data matrix was complicated. (2) When the distance matrices were not available, I
estimated the distances from the scores presented on plots and dendrograms (this can be
readily accomplished using the scales of the graphs and dendrograms). (3) In many works, the
required p-distances between the comparison groups were presented directly. Note that
virtually all values from (Johns, Avise, 1998) were computed from plots. This procedure
inevitably entails some approximation. However, in view of very high intragroup (intrataxon)
distance variance, these errors were negligible for comparative group analysis. In addition to
distances, some other measures of DNA marker variability were examined. The literature
data were screened using the Thomson Institute for Scientific Information Science Citation
Index (SCI) database and other sources. Articles from 1995 through 2008 were examined. Our
work also included analysis and obtaining analytical expressions for the statistics used. Since
this part of the work is indirectly related to examination of observed data on molecular
variation, it is only briefly outlined here (section 2.1). Statistical analysis was performed
using the STATISTICA (1994) software package. From this package, we employed the basic
module for calculating mean and variance parameters, as well as those for parametric analysis
of variance (ANOVA, and multi-dimensional version, MANOVA) and Kruskal-Wallis
nonparametric ANOVA.
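
The pair-selection scheme in method (1) can be illustrated with a minimal Python sketch. It assumes one reading of the scheme, in which the first unit is compared with every other unit (1-2, 1-3, but not 2-3), and contrasts it with the mean over all n(n - 1)/2 pairs, as used by Hebert et al. (2002b). The function names and the three distances are hypothetical and are not taken from the database.

```python
from itertools import combinations
from statistics import mean


def reference_pairs(labels):
    """Pairs of the first unit with every other unit: (1,2), (1,3), ..."""
    first, *rest = labels
    return [(first, other) for other in rest]


def all_pairs(labels):
    """All n(n - 1)/2 unordered pairs."""
    return list(combinations(labels, 2))


def mean_distance(dist, pairs):
    """Arithmetic mean of the distances for the chosen pairs."""
    return mean(dist[frozenset(pair)] for pair in pairs)


# Hypothetical p-distances (in %) among three sequences.
dist = {
    frozenset(("s1", "s2")): 2.0,
    frozenset(("s1", "s3")): 4.0,
    frozenset(("s2", "s3")): 6.0,
}
labels = ["s1", "s2", "s3"]

mean_reference = mean_distance(dist, reference_pairs(labels))
mean_all = mean_distance(dist, all_pairs(labels))
print(mean_reference, mean_all)
```

With these invented values the two schemes give different means, which is why the text records which scheme each source used.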

2. Intraspecies and Interspecies DNA Variation

The BSC implies that a species is an isolated reproductive unit. Molecular data,
especially those pertaining to mitochondrial DNA (mtDNA), show that, on the one hand, natural
hybridization between species may lead to introgression of genes from one gene pool into the
other. On the other hand, sequences of individual genes show that the variability
of DNA markers increases with the rank of the taxon (Johns, Avise, 1998; Hebert et al.,
2002a; Ward et al., 2005). Hence, I believe it is expedient to compare the data on nucleotide
divergence for several genes, from several data sources, and, in addition, to substantiate both
the variability and distance parameters. The latter is important for understanding the essence
of estimating divergence at the DNA level and its connection to speciation. Some
complications, as mentioned in the Introduction above, may obscure real DNA
variability. There may be other hidden factors. In particular, various genes may encode
different functional properties of the phenotype (of macromolecules in the first place). To compare
such genes we have to know some of these properties. For example, protein-coding genes have biased
proportions of pyrimidines (T, C) and purines (A, G). Bias in the ratio of (T+C) : (A+G) is
well described in the literature for some protein-coding genes (e.g. Kim et al. 2004).
However, it is frequently stated without statistical substantiation. The analysis presented for
Cyt-b in flatfish (Figure 1) validated this bias on a firm statistical basis and also
emphasized that taxonomic differences are discernible (Figure 1; Kartavtsev et al., 2007 b).

Figure 1. Plot of the average proportion of four nucleotides at Cyt-b gene in flatfish species, order
Pleuronectiformes (1-2) in comparison with representatives of Perciformes (3) (From Kartavtsev et al.,
2007b). Cyt-b gene nucleotide content presented for all three nucleotide positions. Results of one-factor
MANOVA are given. Groups 1 to 3: 1, Species from study by Kartavtsev et al. (2007b); 2, Species
taken from GenBank; 3, GenBank data on Perciformes. The T+C : A+G ratio deviates significantly from 1:1.
Significance of the impact for this factor and comparison groups is given on the top.
The same bias was substantiated for Cyt-b in catfish (Kartavtsev et al., 2007 a) and for
Co-1 in flatfish and two other taxa (Kartavtsev et al., 2008 a, b, c). It is believed that the
nucleotide bias reflects the hydrophobic properties of the encoded proteins (Naylor et al.
1996). However, the taxonomic differences, if observed (Figure 1), are more relevant to taxa
evolution and reflect their separate divergence. Thus, in such cases distance estimates may
have an unexpected impact.
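
A deviation of the (T+C) : (A+G) ratio from 1:1 can be checked for a single sequence with a simple goodness-of-fit statistic. The sketch below is only an illustration: it uses a chi-square test with one degree of freedom instead of the MANOVA reported for Figure 1, and the sequence is invented rather than a real Cyt-b fragment.

```python
from collections import Counter


def pyrimidine_purine_chi2(seq):
    """Return (pyrimidines, purines, chi-square) for a test against a 1:1 ratio."""
    counts = Counter(seq.upper())
    pyrimidines = counts["T"] + counts["C"]
    purines = counts["A"] + counts["G"]
    expected = (pyrimidines + purines) / 2  # expected count under 1:1
    chi2 = ((pyrimidines - expected) ** 2 + (purines - expected) ** 2) / expected
    return pyrimidines, purines, chi2


# Invented 100-nucleotide sequence, rich in pyrimidines.
seq = "TTTTCCCCAG" * 10
pyr, pur, chi2 = pyrimidine_purine_chi2(seq)
print(pyr, pur, chi2)
```

A chi-square value above 3.84 (one degree of freedom, 5% level) indicates a significant departure from 1:1. Comparing several taxa, as in Figure 1, requires a multi-group analysis rather than this single-sequence check.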

2.1. Polymorphism of DNA Sequences. Nucleotide Diversity

Understanding DNA sequence polymorphism as a result of nucleotide substitution is of
primary interest for molecular phylogenetics. The amino acid substitution rate is also
important to estimate, but this is beyond the scope of this paper. If the nucleotide sequence for a
particular set of loci or alleles in a population sample is known, DNA polymorphism can be
assessed in several ways. The best measures of DNA sequence divergence are nucleotide
diversity, π, as a per-site measure (Nei, 1987) and the proportion of different nucleotide sites
in a pair of randomly chosen sequences, the p-distance, denoted P, or its estimate p (Nei, Kumar, 2000,
p. 33).

$$\pi = \sum_{ij} x_i x_j \pi_{ij}, \qquad (2.1)$$

where $x_i$ and $x_j$ are the population frequencies of the $i$th and $j$th types of DNA sequences, and $\pi_{ij}$ is the proportion of different nucleotides between the $i$th and $j$th types of DNA sequences. In a panmictic population, $\pi$ is usually referred to as heterozygosity at a nucleotide level. Its estimates can be found either by

$$\hat{\pi} = \frac{n}{n-1} \sum_{ij} \hat{x}_i \hat{x}_j \pi_{ij} \qquad (2.2)$$

or by

$$\hat{\pi} = \sum_{i<j} \pi_{ij} / n_c. \qquad (2.3)$$

Here $n$, $\hat{x}_i$ and $n_c$ are the number of DNA sequences examined, the frequency of the $i$th type of DNA sequence in the sample, and the total number of sequence comparisons [$n(n-1)/2$], respectively. In equation (2.3), $i$ and $j$ refer to the $i$th and $j$th sequences rather than to the $i$th and $j$th types of sequences.
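
The two estimators of nucleotide diversity, (2.2) and (2.3), can be compared on a toy sample. The following is a minimal Python sketch, assuming aligned sequences of equal length without gaps; the four sequences and the function names are invented for illustration. On the same sample the two estimators should coincide, since (2.2) groups identical sequences into types whereas (2.3) averages over all pairs of individual sequences.

```python
from collections import Counter
from itertools import combinations


def proportion_different(a, b):
    """Proportion of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pi_pairwise(seqs):
    """Equation (2.3): mean difference over all n(n - 1)/2 sequence pairs."""
    pairs = list(combinations(seqs, 2))
    return sum(proportion_different(a, b) for a, b in pairs) / len(pairs)


def pi_from_types(seqs):
    """Equation (2.2): sum over sequence types weighted by sample frequencies."""
    n = len(seqs)
    freq = {seq_type: count / n for seq_type, count in Counter(seqs).items()}
    total = sum(
        freq[a] * freq[b] * proportion_different(a, b) for a in freq for b in freq
    )
    return n / (n - 1) * total


# Invented sample: four sequences of 10 nucleotides, three distinct types.
sample = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
print(pi_pairwise(sample), pi_from_types(sample))
```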
P-distance may be estimated as follows:

$$\hat{p} = n_d / n, \qquad (2.4)$$

where $n_d$ is the number of nucleotides differing between the DNA sequences X and Y, and $n$ is the total number of analyzed nucleotides. Although both $\pi$ and $p$ measure sequence diversity, their numerical values are different. Variance for $\pi$ is quite complicated and given elsewhere (Nei, Tajima 1981; Kimura, 1983) and summarized by Nei (1987), Kartavtsev (2005), etc. Variance for $p$, to detect the standard error of the mean (SE), is a simple binomial:

$$V(\hat{p}) = p(1-p)/n. \qquad (2.5)$$

In actual computations, all $p$ as population frequencies are substituted by their estimates, $\hat{p}$.
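
Equations (2.4) and (2.5) translate directly into a few lines of Python. The sketch below assumes two aligned sequences of equal length without gaps or ambiguous bases; the sequences are invented, and the standard error is taken as the square root of the binomial variance with p replaced by its estimate.

```python
from math import sqrt


def p_distance(x, y):
    """Return the p-distance (2.4) and its standard error from (2.5)."""
    if len(x) != len(y):
        raise ValueError("sequences must be aligned and of equal length")
    n = len(x)                                  # analyzed nucleotides
    n_d = sum(a != b for a, b in zip(x, y))     # differing nucleotides
    p_hat = n_d / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se


# Invented pair of 100-nucleotide sequences differing at 10 sites.
seq_x = "ACGT" * 25
seq_y = "TCGT" * 10 + "ACGT" * 15
p_hat, se = p_distance(seq_x, seq_y)
print(p_hat, se)
```

For a fragment of 100 nucleotides and p = 0.1 the standard error is 0.03, which shows how wide the sampling error of a single pairwise distance is for short fragments.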
To understand the essence of the process of substitution, an appropriate mathematical
model should be used. We have analyzed the substantiation of the measures, their analytic
expressions (models), and the variance estimates on the basis of four sources (Nei, 1987; Li,
1997; Nei, Kumar, 2000; Felsenstein, 2004). Two popular models are the Jukes-Cantor model,
which assumes equal substitution rates for all nucleotides, and the two-parameter Kimura model,
which allows different rates for transitional (α) and transversional (β) substitutions. At least eight
substitution models are known: (1) Jukes-Cantor, JC; (2) two-parameter Kimura, K2P; (3)
Equal-input; (4) Tamura; (5) Hasegawa-Kishino-Yano, HKY; (6) Tamura-Nei, TrN; (7)
General time reversible, GTR; and (8) Unrestricted. In combination with some other
parameters available, the total number of models reaches 56 (Posada, Crandall, 1998).
In the K2P model, the equilibrium frequencies of all four nucleotides are 0.25. However,
the proposed algorithms could be applied irrespective of the initial frequencies (Rzhetsky,
Nei, 1995; Nei, Kumar, 2000, p. 38). In this respect, this model is similar to the Jukes-Cantor
model and, like the latter, can be applied to a wider range of empirical data than the remaining
six models. Note that in the Kimura model R = α/2β, but many authors and software
packages employ the ratio k = α/β. This should be kept in mind to avoid erroneous results in
comparisons. Furthermore, an examination of the model algorithms showed that a thoughtful
choice of the model for data analysis is advisable. Sometimes it may be worthwhile to spend
some additional time and select a more complex model for estimating the number of nucleotide
substitutions (nos. 3-8) instead of following the routine software option, leading to
K2P, in order to get more correct results. However, we would like to note that in the more
complicated models, the greater number of parameters results in relatively higher standard
errors (Nei, Kumar, 2000). In the analyzed array of studies, the authors most often (49 and
33% according to the data table, see Appendix) employ simply an uncorrected p value or use
the K2P model. There are examples of use of the HKY model (Volker, 1999, Tarjuelo et al.,
2001, Zheng et al., 2003), TrN model (Jerome et al., 2003, Bertsch et al., 2005), GTR model
(Piaggio, Spicer, 2001, Quatro et al., 2006), etc. A numerical simulation for an infinite
number of nucleotides showed that if the substitution number is low (< 20%), all models give
similar values (Nei, Kumar, 2000, Figure 3.1). However, as the substitution number and
homoplasy increase, the p value is the first to be biased. Empirical results for several beetle
taxa gave very similar patterns, with the estimates of p scores deviating most from
expectations and GTRs least (Martinez-Navarro et al., 2005, Figure 6). Correction by means
of the gamma-distribution is important to make unbiased estimates of distance in relation to
non-uniform substitution rate in different sequence regions (Rzhetsky, Nei, 1995; Swofford et
al., 1996; Li, 1997; Felsenstein, 2004). The MODELTEST 3.06 program (Posada, Crandall,
1998) and later versions (Abascal et al., 2005) are widely used for selecting a model suitable
for particular empirical data. Valuable information on the properties of the models and their
applicability to various data types is presented in the papers (Rzhetsky, Nei, 1995; Hall,
2001, Sanderson, Shaffer, 2002; Felsenstein, 2004). Different options for computing p-
distances and other distance estimates are implemented in software packages PAUP*
(Swofford, 2000), MEGA (MEGA2 to MEGA4; Kumar et al., 1993; 2000; Tamura et al.,
2007), and others. New software or updated versions of existing software appear on a regular
basis in the journal Bioinformatics. I recommend that interested readers consult this
journal. A helpful interface and various statistical possibilities, including those for analysis of
DNA sequences, haplotypes and genotypes, are presented in the ARLEQUIN package
(Schneider et al., 2000). A very good guide to phylogenetic analysis is given by Hall (2001)
and later editions. It is mainly intended for PAUP*, but also presents the general
principles of phylogenetic analysis of DNA variation in a popular form.
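
The behaviour of the uncorrected p value relative to the two simplest corrected distances can be sketched in Python. The standard Jukes-Cantor and K2P distance formulas are used here as given in the textbooks cited above (e.g., Nei, Kumar, 2000); the function names and the example sequences are assumptions made for illustration. P and Q denote the proportions of transitional and transversional differences, so that p = P + Q.

```python
from math import log

TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def ts_tv_proportions(x, y):
    """Proportions of transitional (P) and transversional (Q) differences."""
    n = len(x)
    ts = sum(a != b and frozenset((a, b)) in TRANSITIONS for a, b in zip(x, y))
    tv = sum(a != b and frozenset((a, b)) not in TRANSITIONS for a, b in zip(x, y))
    return ts / n, tv / n


def jc_distance(p):
    """Jukes-Cantor distance from the uncorrected proportion p."""
    return -0.75 * log(1 - 4 * p / 3)


def k2p_distance(P, Q):
    """Kimura two-parameter distance from P (transitions) and Q (transversions)."""
    return -0.5 * log(1 - 2 * P - Q) - 0.25 * log(1 - 2 * Q)


def k_from_R(R):
    """Convert R = alpha / (2 * beta) into k = alpha / beta."""
    return 2 * R


# Invented pair: one transition (A-G) and one transversion (C-A) in 10 sites.
P, Q = ts_tv_proportions("ACGTACGTAC", "GCGTACGTAA")
print(P, Q, k2p_distance(P, Q))

# The correction is small at low divergence and grows with p.
for p in (0.05, 0.10, 0.30):
    print(p, round(jc_distance(p), 4))
```

When transitions and transversions occur in the 1:2 proportion expected under equal rates (P = p/3, Q = 2p/3), the K2P formula reduces to the Jukes-Cantor one, which is one way to check an implementation.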

2.2. Divergence at DNA Markers within Species and at Different levels of
the Taxonomical Hierarchy: Analysis of Empirical Data

Intraspecies differentiation at DNA markers. With the exclusion of deletions and insertions,
the most suitable measure of polymorphism at the nucleotide sequence level is nucleotide
diversity (see equations 2.1-2.4). However, examination of large samples for population
analysis is often impractical, as nucleotide sequence analysis is very labor-consuming.
Usually, RFLPs are analyzed for this purpose or, more recently, microsatellites, which permit
estimation of allele frequencies and other differentiation measures such as Fst, etc. However,
these data are beyond the scope of the present review.
In spite of high labor costs, intraspecies nucleotide diversity has been studied in many
species; recently even complete mtDNA sequences have been compared in copepods (Burton
et al., 2007). Nei (1987) presented a summary of earlier results for various mtDNA regions
and nuclear genes (-globin, alcohol dehydrogenase, histone H4, hemagglutinin, insulin, and
two immunoglobulins) of human, monkeys, and other organisms (in total, nine species). Most
of these estimates were obtained using restriction analysis. The nucleotide diversity varies
from 0.002 to 0.019 in eukaryotes and is rather similar for mitochondrial and nuclear genes:
mean = 0.007, i.e., 0.7% for both groups (Nei, 1987, Table 10.6; means are given in my
recalculation). In two fish species, nucleotide diversity in a Cyt-b gene fragment was
π = 0.59% and π = 0.08% (Baker et al., 1995). The intraspecies nucleotide diversity in the
control mtDNA region in the fish Pterois miles reaches π = 1.9% (Kochzius, Blohm, 2005).
Latitudinal differences in nucleotide diversity were found in two copepod species, the
subarctic species, Calanus finmarchicus (π = 0.37%, SD = 0.26), proving to have lower
diversity than the temperate species, Nanocalanus minor (π = 0.50%, SD = 0.32) (Bucklin,
Wiebe, 1998). The p-distance (K2P) for a 600-bp Co-1 gene sequence was estimated for 107
intraspecies groups of various species from five butterfly families (Lepidoptera: Arctidae,
Geometridae, Noctuidae, Notodontidae, and Sphingidae) and shown to exhibit low variation
with mean values ranging from 0.17 to 0.36% (Hebert et al., 2002 a). Recalculation for these
groups produced the grand mean K2P = 0.25 ± 0.04% (here and in further text, ± is followed
by SE). On average (arithmetic mean, M), the intrapopulation p-distances were M =
1.46 ± 0.34% and M = 0.72 ± 0.16%, respectively for genes Cyt-b and Co-1 (Appendix). A
similar value was obtained for a 2214-bp mtDNA fragment, treated with restriction
endonuclease HindIII in five individuals of Oncorhynchus mykiss from different populations
(Beckenbach et al., 1990): mean for p = 0.254 ± 0.025%. These variation values, related to
single nucleotides, show a reservoir of intraspecies variability, if recalculated for a
relatively short gene, say, 1000 bp in size: 0.25% × 1000 = 2.5, or on average 2.5 nucleotide differences per pair of gene copies.
In some cases, however, nucleotide diversity is fairly low. For instance, mtDNA of Indians
from Venezuela is completely monomorphic (Johnson et al., 1983). Apparently, this
population has recently passed through a bottleneck. Very low nucleotide diversity was
recorded for Co-1 in cnidarians (Creer et al., 2003).
Data on p-distances for two genes at the population level (Appendix) show that
intraspecies divergence is under a strong influence of common population-genetic factors,
predominantly isolation (migration), population size, and, apparently, natural selection (some
evidence on selection is considered below). The space limits and the aim of this article do not
allow us to dwell on these issues. We would only like to note that, for example, mammal
populations (the genus Apodemus) from the main Japanese islands are less differentiated at Cyt-b
(K2P = 0.96%) than populations from small islands (K2P = 1.54%). Also, geographically
distant populations of another group of mammals (the genus Martes) have larger distances
(TrN = 3.2%) than the geographically close populations (TrN = 0.4%) (Appendix). Some
nuclear genes also exhibit quite high differentiation levels, e.g. for bovine PRNP p = 0.811
(Rongyan et al., 2008), and several others also show comparable diversity with mtDNA genes
in rodents (e.g. Suzuki et al., 2008). Obviously, the reproduction system may play an
important role too. In the review, we focus on out-breeding animals. However, among
obligate hermaphrodites or parthenogenetic forms, nucleotide divergence between lines of
different geographic distances within nominal species can reach high values. For instance, in
freshwater crustaceans, Potamoneutes P = 1.5-2.0% (Daniels et al., 2002), in Artemia p =
3.8% (Schon et al., 1998) and in copepod, Tigriopus californicus JC = 2.43 ± 0.40, =
0.0026 ± 0.0002 for Cyt-b and JC = 1.36 ± 0.33 for Co-1 (Willet, Burton, 2004; my
recalculation of mean values for synonymous and nonsynonymous substitutions jointly). In
such organisms, genetic differentiation may develop because of their isolation and relatively
small effective population size Ne for separate lines with the total population size of hundreds
of millions (Bucklin, Wiebe, 1998).
Generally, mtDNA shows maternal inheritance and exists in the form of haplotypes. Hence,
Ne values for mtDNA are expected to be equal to one-fourth of the value obtained from
nuclear gene variation, which must reduce the level of drift-mutation equilibrium and,
consequently, π. However, the substitution rate is higher for mtDNA than for the nuclear
genome. Recent estimates, reported for 14 to 150 taxa of various insect groups, showed that
even by the conservative estimates based on the Jukes-Cantor model, the ratio of Co-1/EF-1
(i.e., mitochondrial and nuclear genes) substitution rates varied from 1.9 ± 0.3 to 5.4 ± 1.7
(Johnson et al., 2003). The effects of these two compensating factors probably lead to a
situation where mean π and p values are nearly equal in mitochondrial and nuclear genomes.
However, recent observations showed greater sequence diversity of mtDNA genes (Willet,
Burton, 2004).
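
The compensation argument can be made explicit with a back-of-the-envelope calculation. The sketch below rests on an assumption that is not stated in the text: that at drift-mutation equilibrium under neutrality, nucleotide diversity is proportional to the product of effective size and substitution rate. Under that assumption the expected mtDNA-to-nuclear diversity ratio is simply the Ne factor (one-fourth) times the rate ratio. The function name is hypothetical.

```python
def expected_diversity_ratio(rate_ratio, ne_factor=0.25):
    """Expected pi(mtDNA) / pi(nuclear) if pi is proportional to Ne * rate.

    rate_ratio: mitochondrial / nuclear substitution rate
    ne_factor:  mitochondrial / nuclear effective size (one-fourth in the text)
    """
    return ne_factor * rate_ratio


# Range of Co-1/EF-1 rate ratios quoted from Johnson et al. (2003).
for rate_ratio in (1.9, 5.4):
    print(rate_ratio, expected_diversity_ratio(rate_ratio))
```

For the quoted range the expected ratio falls between roughly 0.5 and 1.4, that is, around unity, which is in line with the statement that the two factors approximately cancel.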
Silent or synonymous polymorphism. It is of interest to examine polymorphism that does
not manifest phenotypically even at the simplest level of amino acid sequences in proteins.
Ample literature exists on the subject, to which we refer the reader for further information
(Nei, 1987; Li, 1997; Swofford et al., 1996; Graur, Li, 1999; Zhimulev, 2002; Barns, 2003).
Here, we confine ourselves to discussing several simple issues. The neutrality theory predicts
that this so-called silent polymorphism occurs more frequently than polymorphism
implemented at the level of amino acid sequences, because silent mutations would undergo
less strong selection than non-silent ones. Conversely, if polymorphism is generally
maintained by most common directional selection (in case of a negligibly small drift effect)
then silent polymorphism would be less frequent than non-silent polymorphism. One of the
ways to test these assumptions is examining polymorphism at the first, second and third
codon positions in functional structural genes, as well as at pseudogenes, which currently are
nonfunctional.
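
Counting differences separately at the first, second and third codon positions is straightforward once two coding sequences are aligned in frame. The following is a minimal Python sketch, assuming that both sequences start at the first position of a codon and contain no gaps; the sequences are invented.

```python
def differences_by_codon_position(x, y):
    """Count differing sites at codon positions 1, 2 and 3 (in-frame alignment)."""
    counts = [0, 0, 0]
    for index, (a, b) in enumerate(zip(x, y)):
        if a != b:
            counts[index % 3] += 1
    return counts


# Invented in-frame fragments of four codons each.
seq_x = "ATGGCCAAATTT"
seq_y = "ATGGCTAAGTTC"
print(differences_by_codon_position(seq_x, seq_y))
```

In this toy pair all three differences fall at third positions, the pattern expected when most of the observed polymorphism is silent.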
Li and co-authors (Li et al., 1981; 1984) were among the first to address these issues. In
their studies of the myoglobin gene in comparison with four pseudogenes in human, mouse,
rabbit, and goat, these authors have shown that (1) the nucleotide substitution rate in
functional genes is the highest at the third codon position and (2) the nucleotide substitution
rate in pseudogenes is twice as high as the corresponding parameter even at the third codon
position. The R index, R =
2
/ (Li, 1997, p.232), estimated for 20 loci, supports these old
conclusions for more extensive data for non-synonymous (N) and synonymous (S)
substitutions, the means being $R_N$ = 8.26 and $R_S$ = 14.41 (Li, 1997, Table 8.8). The
differences in the nucleotide substitution rates (r) of nuclear genes in human (47 genes) and
Drosophila (32 genes) are even more contrasting: $r_N$ = 0.74 (0.67; in brackets is standard
deviation); $r_S$ = 3.51 (1.01) and $r_N$ = 1.91 (1.42), $r_S$ = 15.6 (5.5), respectively (Li, 1997,
Tables 7.1, 7.6). In Tigriopus californicus, the simple proportion N/S ($d_n/d_s$) for 3 pair-wise
interpopulation comparisons were 0.017, 0.018, and 0.025 (Willet, Burton, 2004).
Recently a novel method to address this issue was presented (Tennessen, 2008), which
demonstrated that divergence in bacteria-killing activity among animal antimicrobial peptides
is positively correlated with the log of the $d_N/d_S$ ratio. The primary cause of this pattern
appears to be that positively selected substitutions change protein function more than neutral
substitutions do. Tennessen (2008) thus believes that the $d_N/d_S$ ratio is an accurate estimator
of adaptive functional divergence. Earlier, using another gene set, the substitution rates for N-
and S-codons in evolution were also shown to differ: 8.26 and 14.41, respectively (Gillespie,
1989). On average, the nucleotide substitution rate in pseudogenes is $4.7 \times 10^{-9}$ per
nucleotide per year and is thought to be close to the neutral process (Nei, 1987). Analysis of
another multigene family, amylases, revealed clear differences in p-distances for synonymous
(1) and nonsynonymous (2) nucleotide substitutions in the sequences in three Drosophila
species: (1) p = 0.398 ± 0.043 and (2) p = 0.068 ± 0.008 (Brown et al., 1990). Thus,
summarized analysis of extensive data showed that for a randomly selected coding sequence,
the ratio of synonymous and nonsynonymous substitutions is approximately 25 : 75%, while
this proportion is reversed (69 : 31%) for the third position (Li, 1997, Table 1.4). Note that
the N/S ratio is significantly higher in human and close anthropoid ape species than in other
monkey groups, owing to greater Ne (Wu et al., 2000) and inevitably stronger natural
selection. The increased proportion of nonsynonymous substitutions in hominids is attributed
to the rapid adaptive evolution in this group.
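
The distinction between synonymous and nonsynonymous differences can be illustrated by translating aligned codons and checking whether the encoded amino acid changes. The sketch below is a deliberately simple count of differing codons, not one of the published estimators of the numbers of substitutions per synonymous and nonsynonymous site, which also normalize by the number of sites of each kind. It assumes the standard nuclear genetic code; mitochondrial genes such as Cyt-b and Co-1 would need the appropriate mitochondrial code table instead. The sequences are invented.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks stops).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(x, y):
    """Count differing codons that keep (synonymous) or change the amino acid."""
    synonymous = nonsynonymous = 0
    usable = len(x) - len(x) % 3  # ignore an incomplete trailing codon
    for start in range(0, usable, 3):
        codon_x, codon_y = x[start:start + 3], y[start:start + 3]
        if codon_x == codon_y:
            continue
        if GENETIC_CODE[codon_x] == GENETIC_CODE[codon_y]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Invented in-frame fragments: Phe-Ala-Lys-Met versus Phe-Ala-Lys-Ile.
seq_x = "TTTGCCAAAATG"
seq_y = "TTCGCTAAGATA"
print(classify_codon_differences(seq_x, seq_y))
```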
A search for selective response at the electron transport system in Tigriopus californicus
showed no impact on two mtDNA genes Cyt-b, Cyt-c and one nDNA gene RISP with some
impact at nuclear Cyt-c (Willet, Burton, 2004). In this and other cases, evidence suggests that
interactions between nuclear and non-nuclear genes may play an important role in the selective response (Gerber
et al., 2001; Willet, Burton, 2004). The above evidence suggests that (1) genes and their
regions with and without functional significance accumulate mutations and diverge at
different rates and (2) the presence of purifying selection on coding sequences of structural
genes is a well-established fact. However, we have to keep in mind that measuring selection
is quite a complicated matter and frequently we may not have enough information for a
proper solution (Ohta, Gillespie, 1996). At 4 out of 5 loci above there was no sign of
selective impact at protein coding genes (Willet, Burton, 2004). In other cases of positive
impacts (e.g. Plotkin et al., 2004), bulk may be more apparent than real (Hahn et al., 2004).
Compared data sets, taken from GenBank for instance, may not be randomized enough for
neutrality tests, as may be the case for some of the evidence in Rand and Cann (1998). However,
the general conclusion that mildly deleterious mutations prevailed at mtDNA genes is
reasonable (Rand, Cann, 1998).
Genealogical relationships of genes within and among populations. The phylogenetic
relationships for one gene or DNA fragment may be inferred from the DNA polymorphism in
nucleotide sequences or restriction sites. If a phylogenetic tree is constructed on the basis of
genes sampled from several populations connected by migration, theoretically we deal with
mixed genealogies (Malecote, 1973). MtDNA in human races, which today are actively
intermixing, provides an example of such genealogical mixture. The mode of clustering of
members of various races in a phylogenetic tree unequivocally demonstrates its "mixed"
branches (Nei, 1985). Similar results were obtained in other studies of humans (Cann, 1982,
Cann et al., 1982; Ingman, Gyllensten, 2007); this ambiguous clustering was interpreted as
showing migration among races (Cann, 1982). The most ancient mtDNA divergence, dating
back some 300,000 years, occurred in the members of Mongoloid and Negroid races.
Apparently, the mtDNA divergence preceded the divergence of the races themselves, as
follows from the estimates of their divergence based on other genes (Nei, 1987). Note,
however, that genealogical mtDNA mixing for various populations is expected, even in the
absence of migration, if the ancestral population was polymorphic and the time since
divergence was relatively short (Nei, 1987; Avise, 2000). Lineage sorting of individuals by
populations, which are isolated after that, yields a phenomenon of gene genealogies being
older than population lineages. Later evolution will convert this into a difference between
gene and species trees.
The evidence given in section 2.1 above, in the subsection Intraspecies differentiation
at DNA markers, demonstrates a relatively low percentage of nucleotide divergence within
species. Yet, the available data suggest that the divergence of populations within a species in
some cases produces stable, geographically distinct spatial groups, phyletically marked by
mitochondrial genes. This was found for the bottlenose dolphin, Tursiops truncatus (Dowling,
Brown, 1993), the Canada goose, Branta canadensis (Van Wagner, Baker, 1990), the fishes
Fundulus heteroclitus and Stizostedion vitreum (Gonzalez-Willasenor, Powers, 1990;
Billington, Strange, 1995), and a number of other organisms (Stepien, Faber, 1998). Thus,
migration and gene flow can be restricted, while intraspecies phyletic groups are as real as
the stable population units of species detected in analyses of spatial genetic differentiation of
particular generations or their mixtures (Avise, Walker, 1999). In further sections of the
paper, we will consider the question whether and to what extent these data are associated
with the genetics of speciation.
Introgression of mitochondrial DNA. Investigation of mtDNA genotypes, in combination
with nuclear DNA markers or isozyme loci, has sometimes demonstrated the ability of
mtDNA to introgress from one species into another, if the hybrids between these
species and their progeny are fertile and in this case make an impact on the nuclear vs
cytoplasmic background. This introgressive hybridization requires successful backcrosses of
the ancestral hybrid female with males of the parental species or other taxa. This
introgression is independent of recombination and segregation events, occurring in the
nuclear genome, if natural selection, maintaining nuclear cytoplasmic compatibility, is
absent (Takahata, Slatkin, 1984; Nei, 1987). However, evidence of this kind appears
increasingly often, indicating operation of subtle mechanisms that maintain the interaction of
nuclear and, for example, mitochondrial genes (Gerber et al., 2001). A large number of cases
of mtDNA introgression (see below) show that this selection, if it exists at all, is not
sufficiently strong to prevent hybridization and introgression. Thus, the instances of
possession of foreign mtDNA in natural species hybrids, identified by other methods, may be
a proof of hybridization of closely related species (taxa). Such interspecies mtDNA transfer
has been found in species of invertebrates (Drosophila) and vertebrates (Mus and Rana)
(Yonekawa et al., 1981; Powell, 1983; Ferris et al., 1983; Spolsky, Uzzell, 1984; Yonekawa
et al., 2000; Suzuki et al., 2007; 2008). The literature on the topic has already been reviewed
for the purpose of comparative analysis (Campton, 1987; Avise, Wollenberg, 1997; Yonekawa et al.,
2000; Suzuki et al., 2007). Based on an analytical approach for analysis of nuclear
cytoplasmic equilibrium (Clark, 1984; Asmussen et al., 1987), an original method has been
developed for testing the direction of hybridization and the intensity of introgression (Avise,
2001). In this section, we only briefly touch upon the issue of mtDNA introgression, to
elucidate its relationship with the species status in BSC.
Firstly, let us consider some examples. In southern Denmark, there is a hybrid zone
between two house mouse species, Mus musculus and M. domesticus (previously assigned to
two distinct subspecies). Northern Denmark is inhabited mainly by M. musculus.
Examination of mtDNA of these species has shown (Ferris et al., 1983) that, in contrast to
the Eastern European form of M. musculus, in Denmark M. musculus to the north of the
hybrid zone also possesses mtDNA of M. domesticus, which shows a 5% divergence of
mtDNA nucleotide sequence from M. musculus. This part of M. domesticus is restricted to
northern Denmark and some Swedish regions. Because of this, mtDNA introgression from M.
domesticus to M. musculus seems to have appeared relatively recently. As reported, M.
musculus species can be divided into at least three subspecies groups, domesticus, castaneus
and musculus, which were genetically differentiated roughly one million years ago (Moriwaki
et al, 2001). In Asia, the castaneus subspecies group inhabits mainly Southeast Asia, Taiwan
and mainland China south of the Yangtze River, whereas the musculus subspecies group
inhabits mainland China north of the Yangtze River, northeast China, Far East Russia, the
Korean Peninsula and Japan. The former group has w1- and p-haplotypes of beta-hemoglobin
(Hbb) alleles, the latter group has d-haplotype Hbb alleles. Recent molecular analysis of this
Hbb DNA revealed that the nucleotide sequence of the Hbbp-b1 gene is almost identical to that
of Hbbd-b1, and that of the Hbbp-b2 gene to that of Hbbw1-b2. This result strongly suggests that
the p-haplotype is a recombinant between the b1-d and b2-w1 genes, probably between the
two subspecies groups, castaneus and musculus. Comparisons of the nucleotide sequences of
genomic DNA between Hbbp-b1 and Hbbd-b1, and also between Hbbp-b2 and Hbbw1-b2,
suggest that the recombination occurred more than 0.1 million years ago. Further analysis
of the nucleotide sequence in the DNA stretch between the b1 and b2 genes in p-haplotype
samples obtained from several geographically distant localities demonstrated a common
break point near the b1 gene ORF. Those findings allow us to assume that the wild mouse
populations carrying p-Hbb haplotype began to spread geographically quite recently in an
evolutionary sense. Although at this moment it is still not possible to pinpoint the area where
the d-w1 recombination occurred, the broader distribution of p-Hbb haplotype could suggest
that it was somewhere in the central region in Asia (Moriwaki et al, 2001). Interestingly, the
nuclear genes did not show evidence of introgression. It may well be that introgression at
nuclear genes in these mammalian species is prevented by sterility or non-viability of
hybrids, which is caused by the nuclear genes, whereas mtDNA, which does not affect
fitness, can be inherited and transmitted independently.
MtDNA has been used for investigation of natural hybridization in fish and marine
invertebrates since the mid-1980s (Avise et al., 1983; 1984; Avise, Saunders, 1984). Avise
and Saunders (1984) used mtDNA combined with allozymes to study the hybridization rate
among nine sunfish species (genus Lepomis) from two localities in the southeastern
United States. The results of this study can be summarized in the following four points. (1)
Hybridization occurs at a relatively low rate, but involves five out of nine species examined.
(2) No mtDNA or allozyme evidence of gene introgression among the Lepomis species was
found; all hybrids proved to be F1 progeny. (3) Each of the hybrids was produced by a cross
between the most common and the rarest species. (4) In six out of seven possible hybrid
combinations, the maternal parent was from the rare species, as shown by the mtDNA
genotype. This was explained by intense mating competition among males and general
promiscuity of females. In many fish groups, hybridization and introgression are quite
common (Campton, 1987), although in many cases introgression occurs sporadically, as a
result of a past climatic shift, which, in particular, was demonstrated for two char species of
the genus Salvelinus (Glemet et al., 1998). The list of examples of mtDNA analysis can be
easily extended. We will consider only one further example, the mussels of the Mytilus spp.
complex.
Among 12 samples of mussels collected in southwestern British Columbia and on
Vancouver Island, the distribution of alien alleles at two marker loci (PLIIa and ITS) differs among
sampling sites, which implies differential introgression (Heath et al., 1995). The
wide distribution of alien alleles, combined with the evidence for intense hybridization
between the native and the introduced (alien) species, indicates that the introduced alleles may
have existed for some time in the mussel population of British Columbia (Heath et al., 1995).
One of the markers used in Heath et al. (1995) is a nuclear gene, ITS. Other nuclear DNA
markers also provide strict proof for hybridization in marine organisms (Rawson et al., 1999).
In mussels in Peter the Great Bay, Sea of Japan, the proportion of hybrid animals, estimated
using the DNA marker Me-5 and the allozyme locus MPI*, varied from 1.6 ± 0.9 to 8.9 ± 1.7%,
indicating the ongoing process of the hybrid zone formation (Skurikhina et al., 2001;
Kartavtsev, 2005; Kartavtsev et al., 2005). Examining hybridization by a single marker does
not clearly distinguish F1 hybrids from the individuals produced in backcrosses (Campton,
1987; Kartavtsev, 2005). For Peter the Great Bay, Sea of Japan, there is a body of data
testifying to gene introgression between two mussel forms (Kartavtsev, 2005; Kartavtsev et
al., 2005). Here, introgression occurs from the more ancient form, M. trossulus, to a younger
form, M. galloprovincialis (Skibinski et al., 1983; Gardner, Skibinski, 1988; Wilhelm, 1993;
Rawson et al., 1996; Kartavtsev et al., 2005). Investigation of the common mussel based on a
set of marker genes (allozyme, mitochondrial and nuclear) in another part of the range, in
England, yielded interesting results (Skibinski et al., 1983; Gardner, Skibinski, 1988;
Wilhelm, 1993; Rawson et al., 1996). The latter authors, using an enzyme gene and two
nuclear DNA markers, confirmed the presence of a previously reported large hybrid zone,
occupied by hybrid animals (F1, F2, and different Fb) and a patchy distribution of M. edulis
and M. galloprovincialis (Skibinski et al., 1983; Gardner, Skibinski, 1988; Wilhelm, 1993).
In addition, it was shown that hybrid mussels from Whitsand Bay, United Kingdom, carry
alleles that had appeared as a result of intragenic recombination (Rawson et al., 1996). A high
(10%) frequency of these recombinant alleles within the hybrid population suggests either
frequent recombination at this gene or significant hybridization between M. edulis and M.
galloprovincialis that has occurred during a long period of time in the evolutionary history of
the taxa. Presence of a cline, as in Rawson and co-authors (1996) for instance, may be
interpreted as evidence of selection. Though such evidence is usually difficult to obtain under
normal conditions, the above association, together with other evidence from the hybrid zones
of this species group (Skibinski et al., 1983; Gardner, Skibinski, 1988; Wilhelm, 1993),
testifies to an increase in the intensity of natural selection in these zones, including selection
against the hybrids. An assessment of introgression among all of the three forms of the
Mytilus complex showed that it is maximal between M. edulis and M. galloprovincialis and
minimal between M. trossulus and M. galloprovincialis. Restricted introgression between the
latter species pair was found along the Pacific coast of the United States (Rawson et al.,
1999). Gene introgression in the Sea of Japan is also low and asymmetric (Kartavtsev et al.,
2005). Asymmetric introgression in the direction edulis → galloprovincialis was found using
RFLP markers of the mitochondrial genome and a sequenced DNA fragment (Rawson,
Hilbish, 1998). Laboratory crosses showed that M. trossulus × M. galloprovincialis hybrids
show considerably poorer morphological development than hybrids between M.
trossulus and M. edulis (Beaumont et al., 2005). This suggests that in this mussel species
complex, M. trossulus is closest to completing the RIB (Reproductive Isolating Barriers)
formation and its species status is the most definite among the three forms. The remaining
two forms are likely semispecies (Kartavtsev et al., 2005).
The above evidence indicates that mtDNA can cross species boundaries and stably exist
over many generations in the gene pools of species whose reproductive and biological
integrity is confirmed by means of other molecular markers or phenotypic traits. This is
illustrated by the data on mice (Yonekawa et al., 2000), frogs, fish, mussels (see above), and
other organisms (Avise, 2001). Asymmetry of introgression, as was shown for two frog
species from the genus Hyla, is particularly clearly demonstrated by data on nuclear-cytoplasmic
disequilibrium, which is a common phenomenon in nature (Avise, 2001). Thus,
the crossing of interspecies barriers by mtDNA, and probably by some mobile elements from alien
genomes, does not necessarily lead to disintegration of the species, and in some cases, as
predicted by BSC, may play a role in subsequent RIB formation. The appearance of RIBs
depends on the establishment of further nuclear-cytoplasmic relationships and on other
biological changes (e.g. accumulation of substitutions at different genes) and climatic events,
a process that is probably occurring at present in the mussel species complex. To date, monitoring of
hybridization and introgression in various biological species seems to be among the most
relevant tasks of general and evolutionary genetics.
Divergence of DNA nucleotide sequences on the interspecies level. As measures for
comparison, we employ uncorrected p-distances, distances of the two-parameter Kimura
model (K2P), or other indices, used in the literature for the genes Cyt-b and Co-1 (Appendix).
The possibility of their use follows from the theory and from numerical simulation, as noted
in Section 2.1. The optimistic expectations at the preliminary stage of comparison were
generated by similarity of K2P and p at the Co-1 gene in butterflies (Lepidoptera) in two
studies (Hebert et al., 2002 a, b). They also follow from theoretical considerations (Nei, Kumar,
2000) and empirical relationships (Martinez-Navarro et al., 2005). However, below we will
examine the actual comparability of the p-distances for the five comparison groups and for the
two genes.
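To make the two distance measures concrete, the following minimal Python sketch computes the uncorrected p-distance and the K2P distance for one pair of aligned sequences. The function names and the toy sequences are illustrative assumptions, not data from the Appendix, and the handling of gaps (such sites are simply skipped) is a simplification of this sketch rather than the procedure of any study cited here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1, seq2):
    """Return (sites, transitions, transversions) for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption of this sketch: skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return sites, transitions, transversions


def p_distance(seq1, seq2):
    """Observed proportion of differing sites (no correction for multiple hits)."""
    sites, ts, tv = count_differences(seq1, seq2)
    return (ts + tv) / sites


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance: d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q))."""
    sites, ts, tv = count_differences(seq1, seq2)
    P = ts / sites  # proportion of transitions
    Q = tv / sites  # proportion of transversions
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


# Hypothetical toy alignment: one transition (A/G) and one transversion (A/C).
seq_a = "AAAAAAAAAA"
seq_b = "AGAAACAAAA"
print(p_distance(seq_a, seq_b), k2p_distance(seq_a, seq_b))
```

For this toy pair the p-distance is 0.2 and the K2P distance is about 0.234, which illustrates that the corrected distance exceeds the observed one and that the gap between them widens as divergence grows.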
Variation rows of pair-wise K2P comparisons for sequences of the Cyt-b gene, presented,
for instance, in a review of data on vertebrate animals, show a far from normal distribution
(Johns, Avise, 1998).

Figure 2. Cyt-b (top) and Co-1 (bottom) genetic distance frequency distribution plotted against different
species compiled in table of appendix for comparison groups, 1-5.


Figure 3. Resulting graphs of two-way ANOVA and mean p-distance values at five levels of
differentiation in animal species. Top is variation among comparison groups without weighting the
distance scores on the species number (n). Main effect, i.e. the differences among 5 comparison groups
is exemplified: F = 84.90, d.f. = 4, 178, P < 0.000001. Statistically significant is also the difference in
mean distance score for the two genes, interaction of two factors is non-significant (see text). Bottom is
variation among 5 comparison groups with weighting the distance scores on the n. All effects are
significant. Interaction effect, i.e. the differences among the distance scores at two genes in 5
comparison groups is exemplified: F = 268.63, d.f. = 4, 18295, P < 0.0001. Bars are confidence
intervals for mean (95%). All p-distances and means for 5 comparison groups at each of two genes Cyt-
b and Co-1 are presented in the data table (Appendix). Comparison groups: 1. Intraspecies, among
individuals of the same species; 2. Intrasibling species (plus semispecies, and subspecies), 3. Intragenus,
among morphologically distinct species of the same genus; 4. Intrafamily, among genera of the same
family; 5. Intraorder, families of a certain order.

This creates additional problems for analyzing this and other genes, in which the distance
distributions also seem to deviate from normality. We have analyzed their distribution, based
on the data table (Appendix), for all five comparison groups (1-5) and for the genes Cyt-b
and Co-1, respectively. Indeed, the original data showed great variability and different patterns of
distribution both for the two genes and for the comparison groups (Figure 2). In such cases,
means of the estimates generally provide more satisfactory variance distributions. The
distribution of the mean p-distances in fact did not differ from normality according to the
Kolmogorov-Smirnov (K-S) test and had a unimodal shape. The variation row estimates for the
two genes were as follows: Cyt-b, K-S d = 0.0939, P > 0.20; mean = 11.38 ± 0.91, sample size, n =
85, standard deviation, SD = 8.43, Skewness = 0.80 ± 0.26, Kurtosis = 0.33 ± 0.35; Co-1, K-S d
= 0.1076, P < 0.20; mean = 9.37 ± 0.68, n = 103, SD = 6.85, Skewness = 0.26 ± 0.24, Kurtosis
= 0.77 ± 0.47. For both genes: K-S d = 0.0915, P < 0.10; mean = 10.28 ± 0.56, n = 188, SD =
7.65, Skewness = 0.66 ± 0.18, Kurtosis = 0.33 ± 0.35.
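The summary statistics above can be cross-checked with a few lines of Python. The sketch assumes that the figure following each mean and each skewness value is its standard error; the SD and n values are those reported in the text, while the function names are ours and the skewness formula is the conventional sample formula, not one stated in the source.

```python
import math


def se_mean(sd, n):
    """Standard error of the mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)


def se_skewness(n):
    """Conventional standard error of sample skewness for a sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


# SD and n as reported in the text for each data set.
reported = {
    "Cyt-b": {"sd": 8.43, "n": 85},
    "Co-1": {"sd": 6.85, "n": 103},
    "both genes": {"sd": 7.65, "n": 188},
}

for label, v in reported.items():
    print(label, se_mean(v["sd"], v["n"]), se_skewness(v["n"]))
```

Under these assumptions the recomputed values agree with the reported ones to within rounding of the published SD.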
A one-way ANOVA (model with random effects for groups of the same size) showed
that mean distances in the five groups analyzed were significantly different for the two genes:
Cyt-b, F = 32.50, d.f. = 4, 80; P < 0.0001; Co-1, F = 60.81, d.f. = 4, 98; P < 0.0001.
Accordingly, pooling of the data in a two-way MANOVA (see scheme below) for the two
genes produced a statistically significant increase in the p-distances in the hierarchy of the
comparison groups: F = 84.90, d.f. = 4, 178, P < 0.000001 (Figure 3, top). Interaction of
factors in this data set is not statistically significant: F = 1.966, d.f. = 4, 178, P = 0.1017.
However, this pooling is not quite correct for all of the DNA sequences compared, because it
includes heterogeneous groups of different size. Consequently, a categorized representation of
mean values, with each individual score weighted by its sample size (n) for each gene, is more
correct (Figure 3, bottom). However, both approaches showed that the distance for the two genes
increases with the rank. Mean unweighted distances for the five groups were as follows: Cyt-b
(1) 1.46 ± 0.34, (2) 5.35 ± 0.95, (3) 10.46 ± 0.96, (4) 17.99 ± 1.33, (5) 26.36 ± 3.88, and Co-1 (1)
0.72 ± 0.16, (2) 3.78 ± 1.18, (3) 10.87 ± 0.66, (4) 15.00 ± 0.90, (5) 19.97 ± 0.80 (Appendix).
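For readers who wish to see how an F ratio of the kind quoted above is formed, here is a minimal sketch of the textbook one-way ANOVA computation. It is not the authors' software or their random-effects model; the function name and the toy scores are hypothetical. It does show where the degrees of freedom come from: k - 1 between groups and N - k within groups, which for five groups gives 4 and 80 with N = 85 (Cyt-b) and 4 and 98 with N = 103 (Co-1), as reported.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared deviation of group mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical toy distance scores for three comparison groups.
toy_groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
f_ratio, df_b, df_w = one_way_anova_f(toy_groups)
print(f_ratio, df_b, df_w)
```

For the toy data the between-group mean square is 27 and the within-group mean square is 1, so F = 27 with 2 and 6 degrees of freedom.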
Taking into account variation in sample size (n) for each i-th distance measure in
comparison groups (Appendix), we performed a two-way MANOVA with p-distances
weighted by n (factor 1, comparison groups: 1, populations within species; 2, sibling species,
etc.; 3, morphologically distinct species within genera; 4, genera within a family, and 5,
families within an order; factor 2, genes: Cyt-b and Co-1; also, a model with random effect of
factors was applied) (Figure 3, bottom). In this MANOVA, the effect of factor 1 (i.e.,
comparison group) was significant F = 4715.42, d.f. = 4, 18295; P < 0.000001. The effect of
factor 2 (mean p-distance differences for two genes) proved to be significant: F = 15.40, d.f.
= 1, 18295; P = 0.00009. The interaction between factors 1 and 2 was significant too: F =
268.63, d.f. = 4, 18295; P < 0.000001. The categorized graph of the distribution of mean
weighted p-distance values supported the earlier conclusion on the increase of distances with
the rank of the groups compared. Fit of the bivariate distribution (the taxa rank, taxa,
against the distance score, distance) showed that there is accordance with the linear
regression model, although factorial impact is moderate, 44-72%: for Cyt-b, taxa =
1.7911 + 0.1021*distance (t = 96.10, d.f. = 3107, P < 0.0001, $R^2$ = 0.7247; $r_p$ = 0.77); for Co-1,
taxa = 1.4332 + 0.1471*distance (t = 194.14, d.f. = 15194, P < 0.0001, $R^2$ = 0.4383; $r_p$ = 0.85).
Thus, it is possible to conclude that there is little, if any, impact of saturation at both
genes up to the order level in our data set. The lower graph in Figure 3 clearly shows the
meaning of the factor interaction: the p-distance values, or their derivatives, of these two genes
differ among some of five comparison groups; i.e., the substitution rates are different for Cyt-
b and Co-1 at least in some of the groups of animal taxa compared. This conclusion, on an
extended data set, validates the same conclusion made earlier for these two genes
(Kartavtsev, Lee, 2006).
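The two regression lines reported above can be evaluated directly. The short sketch below uses only the intercepts and slopes given in the text; the function name is ours, and the assumption that distance is entered in percent (the scale of the group means) is an interpretation, not something the source states explicitly.

```python
# Intercept and slope of taxa rank regressed on distance, as reported in the text.
REPORTED_FITS = {
    "Cyt-b": (1.7911, 0.1021),
    "Co-1": (1.4332, 0.1471),
}


def predicted_rank(gene, distance_percent):
    """Predicted comparison-group rank (1-5) for a distance given in percent (assumed)."""
    intercept, slope = REPORTED_FITS[gene]
    return intercept + slope * distance_percent


# Evaluate each line at the unweighted mean distance of group 3 for that gene.
print(predicted_rank("Cyt-b", 10.46))
print(predicted_rank("Co-1", 10.87))
```

At the group 3 means the lines give ranks of about 2.86 (Cyt-b) and 3.03 (Co-1). The two fits are close near the genus level and differ in slope, in line with the factor interaction discussed above.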
The data presented in Figures 2 and 3 demonstrate that both genes show a trend of
increasing mean p-distances with increasing rank of the groups compared, from populations
to orders. Because of the importance of this conclusion, the data presented in Figures 2 and 3
were additionally tested using the nonparametric Kruskal-Wallis ANOVA. In this case,
unweighted scores were used to obtain a more conservative estimate. For gene Cyt-b, H =
57.01, d.f. = 4, n = 85, P = 0.0001. For gene Co-1, H = 74.05, d.f. = 4, n = 103, P = 0.0001.
Thus, the comparative analysis of the data for nucleotide sequences of genes Cyt-b and Co-1,
performed for groups of increasing rank for each of the genes separately, demonstrates
(with a probability of error P < 0.0001) that in animals, genetic divergence increases with the
taxon rank. Heterogeneity of gene evolution rate, also significant in our data for the two
genes (Figure 3), is widely known in literature (e.g. Li, 1997; Machordom, Macpherson,
2004), which was noted previously.
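As an illustration of the rank-based test used here, the following sketch computes the Kruskal-Wallis H statistic from scratch. It is a simplified version under stated assumptions: tied values receive average ranks, but no tie correction is applied to H, and the toy scores are hypothetical rather than taken from the Appendix. With five comparison groups the statistic has k - 1 = 4 degrees of freedom, as in the values reported above.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom); no tie correction in this sketch."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return h, len(groups) - 1


# Hypothetical toy scores for three groups with no overlap between groups.
h_stat, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h_stat, df)
```

For the toy data the rank sums are 6, 15 and 24, giving H = 7.2 with 2 degrees of freedom.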
Let us take one more look at the detected differences to explain the essence of them. The
differences in p-distance estimates between the two genes can have the following
interpretations. Firstly, the substitution rate may in fact be different in the two genes but
hidden somehow. For instance, the data on taxonomic groups from the most representative
sources (Johns, Avise, 1998; Hebert et al., 2002 a, b), which can differ in divergence level,
may be differently represented in our database. Actually, heterogeneity of K2P values at Cyt-
b gene was found for the vertebrate groups examined: amphibians and reptiles have the
highest, and birds, the lowest variability (Johns, Avise, 1998). Significant heterogeneity of
the nucleotide diversity was obtained for Co-1 among flatfish genera (Figure 4).
Interspecies heterogeneity of nucleotide diversity estimates at Cyt-b can be found even
within a single fish genus (Garcia-Machado et al., 2004). Secondly, in the two most
representative works on Co-1, several different measures were used (Hebert et al., 2002 a, b).
In addition, instead of K2P and other similar measures (expected distance), non-corrected p-
distance (observed distance) was employed in many studies. In general, a shortcoming of
analysis of such data array is high biological heterogeneity of the material and presence of
some unknown or not identifiable components of the estimates (some of them were
mentioned above). For instance, p-distances and other distance measures can differentially
represent one and the same group of comparison. However, non-weighted p-distances in the
most numerous comparison group (morphologically distinct species within a genus) did not
statistically significantly differ between two groups, (1) p-distance and (2) other distance
estimates (K2P, GTR, TrN, etc.). The results of ANOVA based on the data table (see Appendix)
were as follows: Cyt-b, F = 0.18; d.f. = 1, 30; P = 0.6707; Co-1, F = 0.52; d.f. = 1, 41; P =
0.8197. For both genes: F = 0.28; d.f. = 1, 73; P = 0.5981. However, as noted earlier, the
uncorrected p-distance is affected by homoplasy sooner, i.e., it should be smaller than the expected
values of K2P, GTR, TrN, etc. (see section 2.1). The differences between these groups are
also non-significant when n is used as a covariate in ANOVA of the distance scores.
However, the differences between the groups are significant, if the distance scores are
weighted by n: Cyt-b, F = 231.38; d.f. = 1, 943; P < 0.01; Co-1, F = 207.60; d.f. = 1, 13888;
P < 0.01. The latter differences apparently are caused by unequal representation of taxa in
compared groups and also their different numeric representation. This effect is still obscure;
e.g. no correlation was detected between the distance score and n: Spearman's
correlation coefficient, $r_s$ = 0.1125, P = 0.1252. For the Cyt-b gene, all groups consist almost
exclusively of vertebrates, which may on average differ in p-distances from the
invertebrates that were mostly tested on Co-1 (see Appendix). For the Co-1 gene, 50% of
group 3 consists of the variable insects (see Appendix, Lepidoptera, Arthropoda).
Note, however, that the differences in mean p-distances at the two genes run in different
directions in the two distance groups compared. So, this source of variation should not
mean much.

[Figure 4 plot. x-axis: Genera (ticks 0 to 4); y-axis: p-Distance (ticks 0 to 0.2).]

Figure 4. Plot of p-distances for Co-1 gene sequence data within flatfish genera (After Kartavtsev et al.,
2008a). The x-axis shows three flatfish groups: 1, Pseudopleuronectes, 2, Verasper + Hippoglossoides, 3,
Cynoglossus. The y-axis shows p-distance scores for intragroup comparisons.

3. Biological Species: Genetic Variability,
Divergence and Introduction of an Operational
Criterion for Delimiting a Speciation Mode in
Genetic Terms

In this part of the review, we briefly present a concept of the species (section 3.1),
compare molecular genetic and biochemical genetics data (section 3.2), and draw conclusions
from this evidence (section 3.3).

3.1. Species Examined

Let us clarify what is usually considered a species in most studies.
According to the BSC, the definition of the species is as follows. A species is a
biological group, consisting of one or several crossbreeding individuals that are
reproductively isolated from other such groups, are stable in nature, and occupy a particular
area. This definition was formulated earlier by the author of the present article
(Kartavtsev, 2005), but it is very close to the one given earlier in the monograph by
Timofeev-Ressovsky, Vorontsov, and Yablokov (1977). In principle, this is a definition
typical for the BSC. For instance, one of the BSC definitions is formulated by Mayr as
follows: "A species is a reproductive community of individuals (reproductively isolated from
others), occupying in nature a certain habitat" (Mayr, 1982, p. 273). In the text below, we will
take this definition as a basis for discussing the BSC (which is largely limited to higher
bisexual organisms) (Mayr, 1963; Timofeev-Ressovsky et al., 1977; Templeton, 1998). As
the concept of the BSC is closest to population genetic theory, it seems convenient to use
it as the basis of the discussion, despite the above limitation. Several other species concepts,
with their attendant advantages and restrictions, have been critically analyzed (Krasilov,
1977; King, 1993; Altukhov, 1997; Templeton, 1998). Conceptual analysis of BSC and its
contraposition to the typological species concept were provided by Altukhov (1974; 1983;
1997). Most authors, in spite of criticisms, accept the BSC as the main modern paradigm. We
confine ourselves to listing the existing concepts of the species: (1) Linnaean species, (2)
BSC; (3) BSC modified by Mayr (1963); (4) BSC, modification II (Mayr, 1982); (5) concept
of species recognition (Paterson, 1978; 1985); (6) concept of species cohesion (Templeton,
1998); (7) evolutionary concept of the species; (8) Simpson's evolutionary concept of the
species (Simpson 1961); (9) Wiley's evolutionary concept of the species (Wiley, 1978); (10)
ecological concept of the species (Van Valen, 1976); (11) phylogenetic concept of the species
(Cracraft, 1983), and others (see, e.g., Howard, 1998; DeQuieros, 1998).

3.2. Brief Analysis of Biochemical Genetics Data and Their Comparison to
Nucleotide Divergence

Let us briefly consider the evidence on variability of structural protein-coding genes as
assessed by electrophoretic allozyme analysis. Although such data are now less popular, they
give quite a representative view of variability for nuclear genes. Mean heterozygosity per
individual (locus) has been recognized as the best measure of variability (Lewontin, 1978;
Zhivotovsky, 1983; Nei, 1987). Many statistics have been used to measure taxon divergence
during evolution (Nei, 1975; Zhivotovsky, 1983; Pasekov, 1983), but the most popular
among them are Nei's standard distance, Dn, and the inverse measure, normalized similarity I
(Nei, 1972). To assess differentiation at the intraspecies level, minimal distance and
standardized variance of allele frequencies are more convenient (these data are not
considered in this review as being analyzed previously; e.g. Altukhov, 1983; 1989; 1999;
DeWoody, Avise, 2000). Examination of genetic diversity of natural species requires analysis
of heterozygosity (diversity) and distances (differences), assessing different aspects of
variability, which is not always taken into account. Heterozygosity (and its equivalent,
nucleotide diversity) estimates are weighted variabilities of individuals in a population
(species), while distance/similarity measures are the pairwise differences between
populations (species) at marker genes or molecular sequences. Note, however, that p-distance
and nucleotide diversity can be used both as measures of variability and as measures of distance. Comparing
individuals from one or several populations of a species, one can estimate intraspecies
diversity (heterozygosity), while comparison of individuals of different species provides an
estimate of their divergence (distance).
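The relationship between Nei's normalized similarity I and the standard distance Dn (Nei, 1972) can be sketched in a few lines of Python. The code follows the usual definition, I = Jxy / sqrt(Jx Jy) with the J terms averaged over loci and Dn = -ln I, without the small-sample correction; the function name and the toy allele frequencies are hypothetical.

```python
import math


def nei_identity_and_distance(freqs_x, freqs_y):
    """Nei's normalized identity I and standard distance D = -ln(I).

    freqs_x, freqs_y: one list per locus of allele frequencies in populations
    X and Y, with alleles listed in the same order in both populations.
    """
    n_loci = len(freqs_x)
    jx = sum(sum(p * p for p in locus) for locus in freqs_x) / n_loci
    jy = sum(sum(q * q for q in locus) for locus in freqs_y) / n_loci
    jxy = sum(
        sum(p * q for p, q in zip(lx, ly)) for lx, ly in zip(freqs_x, freqs_y)
    ) / n_loci
    identity = jxy / math.sqrt(jx * jy)
    return identity, -math.log(identity)


# Hypothetical single diallelic locus in two populations.
pop_x = [[0.8, 0.2]]
pop_y = [[0.6, 0.4]]
identity, distance = nei_identity_and_distance(pop_x, pop_y)
print(identity, distance)
```

For the toy locus I is about 0.942 and Dn about 0.060, values of the order typical for comparisons within a species.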
Brief results of comparing H and I. Mean heterozygosity per individual H varies widely
in plant and animal taxa. The total mean H = 0.076; in vertebrates, H = 0.054, for
invertebrates, H = 0.100 (Nevo et al., 1984). A number of other surveys give similar data
(Aronshtam et al., 1977; Hedgecock, Nelson, 1981; Nei, Koehn, 1983; Hedgecock, 1986;
Ward et al., 1992; Kartavtsev, 2005, etc.). The H value underestimates the actual genetic
diversity by approximately one-third, owing to technical restrictions of protein
electrophoresis, which is commonly used to estimate variability at that level (Nei, 1975;
1987; Altukhov, 1989; 1999, and others).
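As a reminder of how a mean heterozygosity value of this kind is obtained, the sketch below computes the expected (Hardy-Weinberg) heterozygosity per locus, 1 minus the sum of squared allele frequencies, and averages it over loci, monomorphic ones included. This is one common estimator; the surveys cited above may instead report observed heterozygosity, and the function names and toy frequencies are hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_freqs)


def mean_heterozygosity(loci):
    """Average expected heterozygosity over all loci, monomorphic ones included."""
    return sum(expected_heterozygosity(freqs) for freqs in loci) / len(loci)


# Hypothetical toy data: two polymorphic loci and one monomorphic locus.
toy_loci = [[0.5, 0.5], [1.0], [0.9, 0.1]]
H = mean_heterozygosity(toy_loci)
print(H)
```

The per-locus values here are 0.5, 0 and 0.18, so the mean is about 0.227. Including monomorphic loci in the average is what brings survey-wide means down to the range of values quoted above.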
Coefficients of genetic distance or similarity at enzyme loci show in comparable scale
genetic divergence in taxa of various ranks, from subspecies to families (Nei, 1975; 1987;
Lewontin, 1978). Comparison of higher-rank taxa at this level is hindered by high probability
of synonymous substitutions increasing nonlinearity of genetic similarity (distance) and
divergence time (Lewontin, 1978; Nei, 1987). Coefficients of intraspecies genetic similarity
or difference were estimated in many groups of animals. The mean genetic similarity at this
level is I = 0.95 (Lewontin, 1978; Nei, 1987; Altukhov, 1983; 1989; Kartavtsev, 2005).
According to our database, which comprises more than 300 populations of 80 animal species,
I = 0.94 ± 0.01 (Kartavtsev, Lee, 2006). In the hierarchy of animal taxa, subspecies have
coefficients of similarity I ranging from 0.6 to 1.0, with a mode of approximately 0.9; the
variation range is I = 0.5-1.0 (mode about 0.8) for semispecies and sibling species; the
variation range is 0.5-1.0 (mode about 0.7) for species within a genus and 0.0-1.0 (mode
0.4) for genera within a family (Avise, Aquadro, 1982; Nei, 1987; Thorpe, 1983; Kartavtsev,
2005; Kartavtsev, Lee, 2006). This means that genetic similarity significantly decreases with
increasing rank of the taxon and conversely, distance increases with increasing taxon rank
(Kartavtsev, 2005; Kartavtsev, Lee, 2006).
Thus, the current molecular genetic evidence (section 2.2) and the results of analysis of
protein marker genes support, first, the basic BSC idea that taxon formation necessarily
requires isolation of gene pools and, second, that the geographic (divergent) speciation mode
prevails in nature, implying gradual accumulation of small genetic differences after
separation of gene pools. Yet, there are facts warning against simplified conclusions on
modes of speciation. For instance, it has long been known that the genetic "weight" of the
species, say, on the Dn scale, may be different for different animal taxa. For example, Dn is
on average 1.1 in amphibians, which is an order of magnitude higher than the corresponding
value in birds (Dn = 0.1) (Avise, Aquadro, 1982). Other examples of this trend can be found
(Avise, 1994). The range of nucleotide diversity also shows that some animal taxa display a
high divergence level among the species, while others are characterized by a very low value
of this measure. As already noted above, avian taxa are substantially less differentiated at
Cyt-b than amphibians and reptiles (Johns, Avise, 1998; see also table in Appendix). For
three main geographic phyletic groups of Oryzias latipes, the nucleotide diversity of Cyt-b
was found to be comparable to the within-genus divergence: p = 11.3-11.8% (Takehana et
al., 2003). For the other gene, Co-1, congeneric species of Cnidaria have p = 1%,
while in crustaceans p = 15.4% (Hebert et al., 2002b). The difference among flatfish genera at
Co-1 sequences was shown above (see Figure 4). The data in the table presented (Appendix) allow
assessment of heterogeneity among animal taxa at the level of genus for both Cyt-b and
Co-1. One-way parametric ANOVA and Kruskal-Wallis (K-W) ANOVA support the conclusion that
taxa are heterogeneous: for Cyt-b, F = 265.08, d.f. = 3, 10654, P < 0.01; K-W H = 10.87,
d.f. = 3, n = 32, P = 0.01, and for Co-1, F = 196.91, d.f. = 3, 13886, P < 0.01; K-W H = 12.11,
d.f. = 3, n = 43, P = 0.007.
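The two test statistics named above can be sketched as follows. This is a minimal pure-Python illustration of the one-way ANOVA F ratio and the Kruskal-Wallis H statistic (without tie correction), not the software used for the analysis reported here; the function names and any input groups are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

def kruskal_wallis_h(groups):
    """Return the Kruskal-Wallis H statistic (no correction for ties)."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        # Tied values share the mean of the ranks i+1 .. j.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)
```

With four taxonomic groups, as in the comparison above, both statistics have 3 degrees of freedom between groups. H is then compared with the chi-square distribution and F with the F distribution on (df_between, df_within) degrees of freedom.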
Some studies show that the concept of natural selection is necessary to explain joint
variation of H and environmental variability (Nevo et al., 1984), an association of individual
heterozygosity at enzyme genes (H_o) with physiological, morphological, and other
components of phenotypic variation in population environment gradient (Nei, Koehn, 1983;
Aronshtam et al., 1977; Koehn, 1978; Zouros et al., 1980; Ayala, 1981; Hedgecock, Nelson,
1981; Avise, Aquadro, 1982; Koehn et al., 1983; 1988; Hedgecock, 1986; Zouros, 1987;
Zouros, Foltz, 1987; Powers, 1987; Altukhov, 1989; 1999; Kartavtsev, 1992; Takehana et al.,
2003; Kartavtsev, Svinyna, 2003). The data on genetic similarity may be interpreted in the
same way. For instance, frequencies of genetic similarity coefficients for enzyme loci,
estimated for various species, follow a U-shaped distribution, whereas neutrality implies a
reverse association with the expected differentiation (Ayala, 1975), i.e., a distribution close to
normal. Nevertheless, nearly normal distribution of coefficients of similarity has been found
for some protein loci, e.g., duplicated hemoglobin loci of salmonid fishes (Kartavtsev, 2005,
Figure 8.3.5). In general, the observed temporal differentiation at many loci is consistent with
the neutral process of drift (King, Jukes, 1969; Kimura, 1969; Kimura, 1983; Nei, 1987;
Ohta, Gillespie, 1996). On the other hand, as stressed at the end of section 2.1, the role of
natural selection in determining molecular diversity of various genes and their different
regions has been conclusively demonstrated. Thus, the early expectations of predominantly
selective neutrality of variation in DNA sequences and other markers, including
mitochondrial DNA markers, have not been supported by observations. The problems of
selectivity/neutrality of mtDNA markers were considered in special reviews (Ohta, Gillespie,
1996; Rand, Cann, 1998; Gerber et al., 2001). In particular, it was pointed out that
assessments of genotype expression in different nuclear backgrounds in many cases reveal
differential fitness, caused by co-evolution. Experimental manipulations also showed that
particular haplotypes are selectively advantageous (Gerber et al., 2001). However, generally
data are complicated and ambiguous. First, as known since early studies by Mukai and co-
authors (1980), it is virtually impossible to experimentally assess weak effects of molecular
markers on fitness and second, there is a complex of factors that may disrupt stochastic
processes, but these factors are not necessarily adaptive ones. In particular, the gene bank
data show that half of the species pairs examined do not substantially deviate from
neutrality expectations, while the other half exhibit a significant excess of amino acid
polymorphism in structural genes (Rand, Cann, 1998). Gillespie (2001) has offered his view
on the ratio of stochastic and selective processes, expressed as the genetic draft model. Some
novel ideas on using molecular data for proving the role of natural selection (Plotkin et al.,
2004) received strong criticism (Hahn et al., 2005; Nielsen, Hubisz, 2005).

3.3. Applicability of Molecular Evolution Data to Speciation Genetics

It is of interest to determine whether the evidence obtained is relevant to the genetic
aspects of speciation. As shown in the previous sections of this review, genetic differences
are acquired gradually in newly formed isolated populations or groups of populations.
The process of divergence then proceeds further, differentiating semispecies and sibling species,
genera, and so on. The presented data on nucleotide sequences of the genes Cyt-b and Co-1 and on
protein markers conclusively demonstrate that this process extends up to the order level
(see Figures 2-3), and other molecular markers provide good evidence in favor of
phyletic evolution as the main process of divergence for higher-rank taxa as well (Nei, 1987; Li,
1997). We cannot cover all aspects of speciation in a short paper. This issue has been
addressed to different extents by a number of authors (Templeton, 1981; Ayala, Fitch, 1997;
Avise, Walker, 1999). I will present my own views on these processes.
It is important to emphasize that evolutionary genetics lacks a speciation theory in the
strict scientific sense, implying a formal, analytic model and prediction of future events on its
basis. In a particular case, such a model must predict the formation of a species or at least
distinguish different speciation modes on the basis of quantitatively defined parameters and
their empirical estimates. The attempts made in this direction (Templeton, 1981; 1998;
DeQuieros, 1998) do not meet the above requirements. To fill this gap, a scheme and an
algorithmic approach have been developed (Kartavtsev, 2000; Kartavtsev et al., 2002;
Kartavtsev, 2005) to distinguish speciation modes (models) on the basis of key population
genetic parameters and their estimates available in the literature.
This approach, which I will call the operation-and-genetic approach for delimiting
speciation modes, may lay the foundation for a future genetic theory of speciation.
As a basis for the evolutionary genetic concept of speciation, descriptions by Templeton
(1981) were used. As a result, a classification scheme for seven known modes of speciation
was developed (Kartavtsev et al., 2002; Kartavtsev, 2005). Here, I present for illustration the
main elements of this scheme for types D1-D3 (divergent speciation) and T1-T4
(transformative or transilience speciation) (Figure 5).
This approach leads to a relatively simple experimental scheme, which allows us to (1)
organize further investigation of speciation in various groups of organisms, based on a
focused genetic approach and (2) obtain analytic expressions (equations) for each of the
speciation modes (Figure 6). Using the proposed scheme (Kartavtsev et al., 2002; Kartavtsev,
2005, Figure 7.4.1), one can determine the conditions required for speciation (necessity
conditions or necessary conditions) and sufficient for the formation of a species (adequacy
conditions or sufficient conditions).
Importantly, in addition to the general definition of the sufficient conditions, four (1-4)
experimentally measured descriptors are introduced (their number can be increased, if
necessary) to clarify how, and in which form, these conditions are manifested in a particular
case of speciation or in a potential model. For instance, the divergent type of speciation D1
explains classic geographic (or allopatric) speciation (see Figure 5).
According to the BSC, this model implies that large populations are isolated (disruption
of the gene flow) and evolve separately, accumulating mutations, while reproductive isolating
barriers (RIBs) are caused by pleiotropic effects. The longer the time elapsed from the
isolation event, the greater the distances between the corresponding taxa. Accordingly, in my
notation a descriptor is introduced: (1) D_T > D_S (where subscripts T and S indicate genetic
distances in the putative parental taxon and in conspecific populations, or at the higher and
lower levels of taxonomic hierarchy in a statu nascendi situation).


DESCRIPTORS:
D, genetic distance at structural genes: D_T, in suggested parent taxa; D_S, among conspecific demes; D_D, among subspecies or sibling species.
H_D, mean heterozygosity/diversity in suggested daughter population.
H_P, mean heterozygosity/diversity in suggested parent population.
E_P, divergence in regulatory genes among suggested parent taxa.
E_D, divergence in regulatory genes among suggested daughter taxa.
TM+, test for modification (positive); TM-, test for modification (negative).
RIB, reproductive isolation barriers.

DIVERGENCE SM

D3. HABITAT
Necessary conditions for speciation: a) Selection over multiple habitats with no isolation by distance; b) Origin of RIB by disruptive selection at genes determining behavior.
Sufficient conditions for speciation: Lack of efficient hybridization inside and outside the zone of contact.
Experimentally measurable features and possible descriptors for the model (theory), (S) 3: 1. D_T = D_S; 2. E_D [?] E_P; 3. H_D <= H_P; 4. TM+.

D2. CLINAL
Necessary conditions for speciation: a) Selection on a cline with isolation by distance; b) Pleiotropic origin of RIB.
Sufficient conditions for speciation: Lack of efficient hybridization outside the zone of contact.
Experimentally measurable features and possible descriptors for the model (theory), (S) 2: 1. D_T > D_S; 2. E_D [?] E_P; 3. H_D = H_P; 4. TM-.

D1. ADAPTIVE
Necessary conditions for speciation: a) Erection of extrinsic reproductive isolating barriers (RIB) followed by gene flow break; b) Pleiotropic origin of RIB over a long time.
Sufficient conditions for speciation: Lack of efficient hybridization in the zone of contact.
Experimentally measurable features and possible descriptors for the model (theory), (S) 1: 1. D_T > D_S; 2. E_D = E_P; 3. H_D = H_P; 4. TM-.

TRANSILIENCE SM

T4. HYBRIDOGENIC 2
Necessary conditions for speciation: a) Hybridization of incompatible parental species followed by inbreeding and selection for a stabilized recombinant; b) RIB origin as a cause of hybrid dysgenesis.
Sufficient conditions for speciation: Lack of efficient hybridization inside and outside the zone of contact.
Experimentally measurable features and possible descriptors for the model (theory), (S) 7: 1. D_T > D_S; 2. E_D [?] E_P; 3. H_D < H_P; 4. TM-.

T3. HYBRIDOGENIC 1
Necessary conditions for speciation: a) Hybridization of incompatible parental species followed by selection for maintenance of the hybrid state; b) RIB origin as a cause of hybrid dysgenesis.
Sufficient conditions for speciation: Lack of efficient hybridization in the zone of contact.
Experimentally measurable features and possible descriptors for the model (theory), (S) 6: 1. D_T > D_D; 2. E_D [?] E_P; 3. H_D > H_P; 4. TM-.

T2. CHROMOSOMAL
Necessary conditions for speciation: a) Inbreeding and drift causing fixation of strongly underdominant chromosomal mutations; b) RIB origin as a cause of hybrid dysgenesis.
Sufficient conditions for speciation: Lack of efficient [...]
Experimentally measurable features and possible descriptors for the model (theory), (S) 5: 1. D_T = D_D; 2. E_D = E_P; 3. H_D > H_P; 4. TM-.

T1. GENETIC
Necessary conditions for speciation: a) Founder event causing a rapid shift in a previously stable genetic system; b) RIB origin as a byproduct of one or a small number of gene substitutions.
Sufficient conditions for speciation: Lack of efficient [...]
Experimentally measurable features and possible descriptors for the model (theory), (S) 4: 1. D_T = D_D; 2. E_D = E_P; 3. H_D <= H_P; 4. TM-.

Figure 5. Schematic representation of the divergent speciation type (ST), based on population genetic
principles (From Kartavtsev, 2005 with modifications). D1-D3, divergent speciation modes; T1-T4,
transformative (transilience) speciation modes.

Figure 6. Analytic representation of seven speciation modes (From Kartavtsev, 2005 with
modifications). D1-D3, divergent speciation modes; T1-T4, transformative (transilience) speciation
modes. Descriptors: D, genetic distances for structural genes; D_T: in putative parental taxon;
D_S: among conspecific demes; D_D: among subspecies or sibling species; H_D: mean
heterozygosity/diversity in putative daughter population; H_P: mean heterozygosity/diversity in
putative parental population; E_P: divergence at regulatory genes in putative parental taxon;
E_D: divergence at regulatory genes in putative daughter taxon; TM+: test for modification
(positive); TM-: test for modification (negative).

Likewise, since upon implementation of the D1 mode no significant genetic diversity
differences appear in either the structural genes or the regulatory part of the genome (because the
initial and derived taxa are large), we introduce parameters (2) H_D = H_P and (3) E_D = E_P
(differences in heterozygosity/diversity and gene expression between the daughter and the
parental taxon are absent). Finally, in some types of speciation, not only variability and
genetic distances are of importance, but also some quantitative loci (polygenes), which
cannot be distinguished at the molecular level but lead to RIB formation. Hence, we
introduce TM (TM+ vs TM-; an experimental test for modification), which also makes it possible to
distinguish between epigenetic variation and taxonomic differences.
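The way the four descriptors narrow down a speciation mode can be sketched as a simple lookup. This is an illustrative sketch only, not the analytic expressions of Figure 6: the relations are transcribed from the descriptor lists of Figure 5, the key names and the function are hypothetical, and relations whose symbol is not legible in the figure as reproduced here are left unconstrained.

```python
# Descriptor relations per mode, transcribed from Figure 5.
# A missing key means the relation is not legible in the figure and is not tested.
MODES = {
    "D1": {"DT_vs_DS": ">", "ED_vs_EP": "=", "HD_vs_HP": "=", "TM": "-"},
    "D2": {"DT_vs_DS": ">", "HD_vs_HP": "=", "TM": "-"},
    "D3": {"DT_vs_DS": "=", "HD_vs_HP": "<=", "TM": "+"},
    "T1": {"DT_vs_DD": "=", "ED_vs_EP": "=", "HD_vs_HP": "<=", "TM": "-"},
    "T2": {"DT_vs_DD": "=", "ED_vs_EP": "=", "HD_vs_HP": ">", "TM": "-"},
    "T3": {"DT_vs_DD": ">", "HD_vs_HP": ">", "TM": "-"},
    "T4": {"DT_vs_DS": ">", "HD_vs_HP": "<", "TM": "-"},
}

# Observed outcomes ("<", "=", ">", "+", "-") that satisfy each required relation.
ALLOWED = {
    "=": {"="}, "<": {"<"}, ">": {">"}, "<=": {"<", "="},
    "+": {"+"}, "-": {"-"},
}

def compatible_modes(observed):
    """Return the modes not ruled out by the observed descriptor relations.

    observed: dict with any of the keys used in MODES; descriptors that were
    not measured can simply be omitted and are then not used for exclusion.
    """
    matches = []
    for mode, required in MODES.items():
        if all(
            observed.get(key) is None or observed[key] in ALLOWED[relation]
            for key, relation in required.items()
        ):
            matches.append(mode)
    return matches
```

For instance, an observation with D_T > D_S, D_T > D_D, E_D = E_P, H_D = H_P and a negative test for modification leaves D1 and D2 as candidates; D2 cannot be excluded here only because its regulatory relation is not available in the transcription.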
Do all these data imply that speciation always corresponds to the D1 type? Apparently
not. Here is an example supporting this answer. In a Swedish mountain lake, two trout (Salmo
trutta) forms were known. It was unclear whether their gene pools were isolated. A genetic
examination (Ryman et al., 1979) revealed in these forms two different fixed alleles, which
unambiguously proved total reproductive isolation of these sympatric trout forms. The gene
pools of these taxa were found to differ by five out of seven polymorphic loci examined
(Ryman et al., 1979). There are other examples of bursts of fish evolution, documented by
molecular markers (Rutaisire et al., 2004; Duftner et al., 2005). These, as well as other data,
for instance from our database of coefficients of similarity, indicate that sometimes very
small differences in structural genes may result in the appearance of RIBs (and thus
reproductively isolated biological entities). In the case of the trout mentioned above, the
genetic distance between the two forms is Dn = 0.02 (Ryman et al., 1979), which
corresponds to the level of intraspecies genetic differentiation. There are many other
examples for salmonid fishes (Kartavtsev, 2005), supporting the view that in these fishes,
small changes can generate biological species during a short period of time. This evidence
also suggests an alternative speciation mode, such as the transformative (T1) or other types
(Figures 5-6), though in general, the D1 speciation mode prevails in this group.
Thus, we can now accept that speciation does not necessarily involve large changes in
structural genes; these changes can be very small (at the level typical for populations of the species).
Conversely, in some cases of speciation we can expect substantial rearrangements of
regulatory genes (Wilson, 1976), chromosomal or other reorganizations of the genome. Data
on regulatory changes upon speciation are scarce in the literature, because exact investigation of
regulatory shifts or changes in gene expression is very labor-consuming. Moreover, the
classification of genes into structural and regulatory ones is rather arbitrary (Wilson, 1976;
Klug, Cummings, 2002). However, even without precise estimation of differences in
expression, very valuable comparative information for speciation studies can be obtained
approximately. In particular, considerable regulatory differences (in the expression level of
enzyme genes) were found for two sibling char species, in which up to 32% of loci diverged
in this respect, whereas the distance was Dn = 0.08, i.e., nearly at the level characteristic of
populations within a species (Kartavtsev et al., 1983). Similar results were obtained for a
group of species in statu nascendi, in the family of whitefish and graylings in Lake
Baikal. In this case, genetic differences Dn between several fish forms ranged from 0.01 to
0.03, whereas the divergence in the expression level reached 9-27% (Kartavtsev,
Mamontov, 1983). These and other similar data (Ferris, Whitt, 1978; 1979; Laurie-Ahlberg,
1982; Kartavtsev et al., 2002) suggest that correct judgment on the mode of speciation (and
the critical species features from the genetic viewpoint) should be based not only on distances,
but also on heterozygosity (diversity) and variability of other genomic elements, and should include
other operational criteria (such as the TM descriptor, which tests for modification as suggested
above, and others).
Developments in evolutionary genetics have been made in several directions. I will touch on only
a few that are close to the topic of this paper. For instance, the method of distance scaling
along phyletic lines was suggested by Avise and Walker (1998). It was designed for the
normalization of taxa weights, and the unification of systematics is expected as an outcome.
The estimation of gene tree cohesion was suggested by Templeton (2001) to decide on
species boundaries. The second approach includes the notion of genetic exchangeability
and/or ecological interchangeability among lineages belonging to the same species
(Templeton, 2001). Both approaches are operational for species delimitation, but it seems that
these techniques will hardly solve the above-mentioned problem of the rigidity of species and
species boundaries without formalization of the species notion. Some authors reached similar
conclusions on the basis of independent analysis of different characters and approaches for
species delimitation (Ferguson, 2002; Wiens, Penkrot, 2002; Sites, Marshall, 2004). In
particular, the latter authors emphasize the diffuse nature of the species concept
and species boundaries and, consequently, the necessity and applicability of several sets of
operational criteria in a multiple approach for species identification (Sites, Marshall, 2004).
This is also emphasized in the approach suggested here (Figures 5-6). The scheme presented
in the current paper was originally designed to define a speciation mode. However, it
also contains logical criteria for whether species have or have not yet originated. Thus, this
approach is quite suitable for species delimitation as a complex and empirically operational
approach. It shares a weakness with all current methods, both non-tree-based and tree-based
(Sites, Marshall, 2004): in some cases, the approach will require researchers
to make qualitative judgments because of the infinite ways for species to originate.
Potentially, the approach being developed is close to the Population Aggregation Analysis (PAA)
in Davis's (1999) version because it is based on population-based parameters like D_T, H_D, etc.
(see Figures 5-6). However, this PAA1 approach could easily be converted to the mode PAA2
as defined by Brower (1999) (his notations; see also Sites, Marshall, 2004) and may even
have properties of the tree-based method (see below). As in PAA2, it is suggested to use not
only genotypic scores (character states) but also other suitable descriptors (qualitative and
quantitative traits: QT, QTL, etc.); they could be represented as per-individual sets of
records or as vector scores for implementing a multi-dimensional analysis (Canonical, PCA,
PAA, etc.) with the aims of (a) testing a null hypothesis (H1) of the absence of vector
clusters and, if it is rejected, testing the alternative hypothesis H2 for discrimination
among them and reaching a decision within the logic suggested (Figure 5), and (b) determining
whether the vectors show genetic (= phylogenetic) unity, both as a distance
value and a coalescent signature, again testing H1 and H2. To obtain a phyletic signal it will
be necessary to develop new descriptors in the Figure 5 scheme and introduce them into the set
of equations D1-T4 (and others when developed) in Figure 6. These special descriptors, like
the branch length or the parsimony outcomes for current OTUs (Operational Taxonomic Units)
in a strict consensus tree built from several gene sequences, could be operational criteria among
others. The approach is basically empirical but differs from others, reviewed for
instance by Sites, Marshall (2004), in having (1) a basis in general genetics and population genetics
theory and (2) a formalization as equations of set theory. Such an approach has its
own limitations and advantages. One limitation is that it is restricted to sexually reproducing
species, for which basic population genetic principles are more or less clear. The other
limitation is that generalizations (deductions) are only possible in a framework of the genetic
terms defined. But individuals comprising species are phenotypes. Thus, genotype/phenotype
correspondence should be defined in an appropriate form and genotype-and-environment
interaction or ecological interchangeability should also be introduced somehow. An advantage
is that this approach is wider than many others suggested for species delimitation (see Sites,
Marshall, 2004) in its ability to define different speciation modes (or take into account the
differences in species types). Also, by weighting the members of the equations in a specific way
it is possible to further develop the approach as a framework for a future genetic
theory of speciation.
In the conclusion of this paper, I have to discuss two complications observed in comparisons of
p-distance data. The first is connected with the possible contradiction between
gradual species formation, as evidenced by the p-distance increase with increasing
taxonomic rank, and data on environmentally caused flux in species number (Bernatchez,
Wilson, 1998; Ruber, Zardoya, 2005). The second comes from recent observations on a
bifurcation impact hidden in molecular phylogenetic trees vs distances (Pagel et al., 2006).
As to the first, the contradiction seems more apparent than real, because environmental shifts may
stimulate both an increase in species number and a decrease in genetic distance (through reduced
time for substitutions to accumulate when the time for species origin shortens). In such
circumstances D1 may not prevail, but, perhaps, the D2, T3 or T4 modes do (see Figures 5-6). That
trend should create differences in mean distances between the taxa that undergo such speciation
modes and those that do not; and this may be a reason for the observed heterogeneity of
distance scores among taxa of the same rank. However, the genetic trend toward larger
distances with time since gene pool separation is an innate property of modes D2, T3, and T4
(see Figures 5-6). That is why, over a long time span, genetic distance will increase as
taxonomic rank increases, especially bearing in mind the prevalence of the D1 mode. Also, averaging
of distance scores across numerous taxa should smooth the gradual, proportional dependence.
In considering the second complication, I have analyzed my own data on p-distances and
estimated whether the distance scores are indeed correlated with the branch number or OTU
number in a tree; a statistically significant positive correlation was obtained for Cyt-b and
Co-1 data that included sequences of flatfish and catfish genera: r_s = 0.54, t = 2.51, k = 17,
OTU number = 530, P = 0.0241. However, regression analysis showed that the factorial impact is
insignificant here (P = 0.4802), despite a significant intercept (P < 0.001). Less directly, the
analysis of the correlation between distance score and species number, n (not OTU), given
earlier (section 2.2, last paragraph) showed the same weak impact, if any, in agreement with
earlier observations (Avise, Ayala, 1979; Kartavtsev et al., 1984).
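The significance test for the rank correlation quoted above can be checked with a short calculation. The sketch below assumes the usual t approximation for a correlation coefficient, t = r * sqrt((k - 2) / (1 - r^2)) with k - 2 degrees of freedom; whether exactly this formula was used for the reported values is an assumption, and the function name is hypothetical.

```python
import math

def correlation_t(r, k):
    """t statistic for a correlation coefficient r based on k paired observations."""
    return r * math.sqrt((k - 2) / (1.0 - r ** 2))

# Reported values: r_s = 0.54, k = 17.
t_value = correlation_t(0.54, 17)
print(round(t_value, 2))  # 2.48
```

The result is close to the reported t = 2.51; the small gap would be consistent with r_s having been rounded to two decimal places before publication.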
I am aware that the facts and interpretations provided here present only one point of
view on molecular genetic data with respect to evolution and species origin. I have omitted
consideration of such events as horizontal transfer through mobile elements, chromosome
change, gene and genome duplications, deletions/insertions, organelle vs organism
commensalisms and others and their impact. Other views are possible if different markers or
time spans are considered. More drastic effects of transformative evolution may become
evident in this case. Anyway, quantitative data analyses of any kind are always welcome and
here I take one step in this direction by applying statistical analysis and some formal genetic
notations together with the equations of set theory.

Conclusion

(1) The theory and the algorithms for calculating genetic distances from nucleotide
DNA sequences suggest that a suitable model should be thoughtfully selected for
analysis of empirical data. However, the observed data for nearly 20000 species
confirm the realistic character and interpretability of the data sets analyzed for
p-distance or its derivatives. This testifies to the possibility of using this measure for
most interspecies and intraspecies comparisons of genetic divergence up to the order
level.
(2) The data on p-distances show different levels of genetic divergence of sequences of
the compared Cyt-b and Co-1 genes in the five comparison groups examined.
Differences between genes themselves were also found. This is in good agreement
with ample data on different evolution rates of genes and their regions.
(3) The results of our analysis of nucleotide and allozyme divergence within animal
species and taxa of different ranks, first, are in good agreement with other similar
data, including protein gene markers and, second, these data allow a generalization
that phyletic evolution prevails in the animal kingdom at the molecular level, while
speciation mainly follows the D1 type (the geographic mode).
(4) The prevalence of the type D1 speciation does not preclude other speciation modes.
There are at least seven such modes. Recognition of different speciation modes is a
task requiring the construction of a quantitative genetic model (theory) of speciation.
In view of the vast diversity of the possible causes of RIBs and species origin, some
of the newly appearing questions remain unanswered and species delimiting requires
further work. Their solution is likely to lie in an increase of the number of
descriptors and members of the equations (D1-T4, Figures 5-6) on the basis of
DNA markers and other genomic characteristics and phenotype tests.
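The distance measures referred to in conclusion (1) can be illustrated with a minimal sketch that computes the p-distance and one of its model-based derivatives, the Kimura two-parameter (K2P) distance, for a pair of aligned sequences. The function name and the handling of gaps (sites with any character other than A, C, G, T are skipped) are assumptions made for this illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def p_and_k2p(seq1, seq2):
    """Return (p-distance, K2P distance) for two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p_ts = transitions / sites
    q_tv = transversions / sites
    p_distance = p_ts + q_tv
    # K2P: d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q); undefined for saturated pairs.
    k2p = -0.5 * math.log((1 - 2 * p_ts - q_tv) * math.sqrt(1 - 2 * q_tv))
    return p_distance, k2p
```

For two 10-site sequences differing by a single transition, the sketch gives p = 0.1 and a K2P distance of about 0.112. The two measures stay close at low divergence and drift apart as substitutions accumulate, which is why model choice matters more for comparisons at higher taxonomic ranks. Multiplying by 100 gives percentages.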

Appendix

Average Genetic Distances within and between Species for Two mtDNA Genes (Cyt-b and Co-1)
in Five Comparison Groups of Increasing Categorical (Taxa) Rank

Columns: Distance; Model of distance estimate; Species number, n; Taxa; Reference

Cyt-b
Intraspecies, among individuals of the same species (1)
1.1* K2P 5 Mammalia Lepus Halanych et al., 1999
0.4 K2P 7 Mammalia Microtus Mazurok et al., 2001
3.2 TrN 1 Mammalia Martes Stone, Cook, 2002
1.54 K2P 9 Mammalia Apodemus Suzuki et al., 2004
0.96 K2P 9 Mammalia Apodemus Suzuki et al., 2004
0.28* p 2 Aves Cyanopica, Pica Kryukov et al., 2004
4 GTR 2 Amphibia Rana Sumida, Ogata, 1998
0.32 K2P 20 Pisces Mormiridae Kramer et al., 2003
3.09 p 9 Pisces Siluriformes Hardman, 2004
1.61 TVM 2 Pisces Molidae Bass et al., 2005
1.59 p 29 Pisces Siluriformes Kartavtsev et al., 2007a
0.46 p 34 Pisces Pleuronectiformes Kartavtsev et al., 2007b
Mean distance = 1.46 ± 0.34, k = 13, n = 134
Intragenus, among sibling species, semispecies and subspecies (2)
5.5 K2P 87 Mammalia - Johns, Avise, 1998
12 p 2 Mammalia Rhabdomys Rambau et al., 2003
4.8 HKY 2 Mammalia Peromiscus Zheng et al., 2003
0.9 K2P 2 Mammalia Lepus Halanych et al., 1999
3.5 K2P 94 Aves - Johns, Avise, 1998
3.8 HKY 12 Aves Motacillidae Volker, 1999
5.7* p 2 Aves
Cyanopica,
3.5 K2P 96 Pisces - Johns, Avise, 1998
8.69 p 2 Pisces Urocampus Chernoweth et al., 2002
2.5 p 2 Pisces Pollimyrus Kramer et al., 2003
8 p 2 Pisces Cyprinidae Johnson et al., 2003
Mean distance = 5.35 ± 0.95, k = 11, n = 303
Intragenus, among morphologically distinct species of the same genus (3)
9.4 K2P 7 Mammalia Microtus Mazurok et al., 2001
12.5* GTR 6 Mammalia Sciuridae Piaggio, Spicer, 2001
14 K2P 2 Mammalia Apodemus Serizawa et al., 2000
8.5 GTR 23 Mammalia Neotamias Piaggio, Spicer, 2001
22 TrN 2 Mammalia Mustella Stone, Cook, 2002
13.5 HKY 2 Mammalia Peromyscus Zheng et al., 2003
12 p 2 Mammalia Rhabdomys Rambau et al., 2003
8.95* K2P 11 Mammalia Lepus Halanych et al., 1999
14.5 K2P 67 Mammalia Rodents
Rocha-Olivares et al.,
1999a
11 K2P 15 Aves Pollimirus Kimbal et al., 1999
7.4 K2P 7 Aves Alectoris Kimbal et al., 1999
12.3 p 2 Aves
Cyanopica,
12 K2P 11 Reptilia - Johns, Avise, 1998
14 K2P 16 Amphibia - Johns, Avise, 1998
14.8 K2P 8 Amphibia Rana Sumida et al., 2000
26.2 p 2 Amphibia Rana Sumida, Ogata, 1998
11.8 K2P 81 Pisces - Johns, Avise, 1998
1.43 p 15 Pisces Sebastomus
1999a
2.3 p 15 Pisces Sebastomus
1999b
9 p 45 Pisces Sebastes
1999b
7.89* K2P 285 Pisces Several orders
1999a
12 p 2 Pisces Rhabdomys Rambau et al., 2003
3.5 p 19 Pisces Zoarcidae Moller, Gravlund, 2003
12.5 TrN 6 Pisces Clupeidae Jerome et al., 2003
1.8 K2P 13 Pisces Pollimyrus Kramer et al., 2003
5.46 p 31 Pisces Siluriformes Hardman, 2004
Kartavtsev et al.,
2007a
17.51 p 34 Pisces Pleuronectiformes Kartavtsev et al., 2007b
Mean distance = 10.46 ± 0.96, k = 32, n = 945
Intrafamily, among genera of the same family (4)
23 K2P 67 Mammalia Rodents
1999a
14.7 K2P 2 Mammalia Murinae Serizawa et al., 2000
32.8 p 5 Mammalia Scuridae Piaggio, Spicer, 2001
16.7 TrN 9 Mammalia Scuridae Stone, Cook, 2002
19.26* K2P 13 Mammalia Leporidae Halanych et al., 1999
31.3 K2P 25 Aves Phasianinae Kimbal et al., 1999
14.5 p 15 Aves Falconidae Griffits, 1997
20.5 K2P 18 Reptilia - Johns, Avise, 1998
19.5 K2P 3 Amphibia - Johns, Avise, 1998
31 K2P 8 Amphibia Rana/Xenopus Sumida et al., 2000
15.3* K2P 285 Pisces Several orders
1999a
24.8 TrN 6 Pisces Clupeidae Jerome et al., 2003
13.2 K2P 19 Pisces Mormiridae Kramer et al., 2003
9.5 p 19 Pisces Zoarcidae Moller, Gravlund, 2003
6.6 p 32 Pisces Cottidae Kontula et al., 2003
16.27 p 861 Perciformes Sparidae Orrell, 2000
12.28 p 1 Perciformes Lutjanidae Orrell, 2000
18.33 p 1 Perciformes Haemulidae Orrell, 2000
17.02 p 1 Perciformes Lethrinidae Orrell, 2000
22.81 p 1 Perciformes Nemipteridae Orrell, 2000
Kartavtsev et al.,
2007a
11.74 p 34 Pisces Pleuronectiformes Kartavtsev et al., 2007b

Mean distance = 17.99 ± 1.33, k = 25, n = 1541
Intraorder, among families of the same order (5)
22.58 p 121 Actinopterigii Perciformes Orrell, 2000
37.44 TVM 2 Pisces Tetraodontiformes Bass et al., 2005
Kartavtsev et al.,
2007a
25.60 p 34 Pisces Pleuronectiformes Kartavtsev et al., 2007b
Mean distance = 26.36 ± 3.88, k = 4, n = 186
Co-1
Intraspecies, among individuals of the same species (1)
0.39 K2P 173 Pisces Several orders Ward et al., 2005
3.3 GTR 2 Pisces Sphyrna Quatro et al., 2006
0.41 K2P 4 Teleostei Mugilidae Papasotiropoulos et al., 2007
0.34* K2P 13 Pisces Several orders Ward et al., 2008
0.17 p 8 Pisces Pleuronectiformes Kartavtsev et al., 2008
0.09 p 5 Pisces Pleuronectiformes Sharina, Kartavtsev, 2008
0.11 p 5 Pisces Perciformes Kartavtsev et al., 2009b
1.00 p 9 Pisces Scorpaeniformes Kartavtsev et al., 2009a
1.4 p 2 Agnata Letentheron Yamazaki et al., 2003
0.49 GTR 3 Echinodermata Zoroasteridae Howell et al., 2004
<1 p 2 Mollusca Cephalopoda Herke, Foltz, 2002
0.98* p 1 Crustacea Potamonautes Daniels et al., 2002
0.33 K2P 13 Lepidoptera Arctidae Hebert et al., 2002a
0.23 K2P 30 Lepidoptera Geometri Hebert et al., 2002a
0.17 K2P 42 Lepidoptera Noctuida Hebert et al., 2002a
0.36 K2P 14 Lepidoptera Notodontidae Hebert et al., 2002a
0.17 K2P 8 Lepidoptera Sphingidae Hebert et al., 2002a
1 p 12 Coleoptera Carabidae Martinez-Navarro et al., 2005
1.5 p 6 Arthropoda Theridiidae Garb et al., 2004
1.43* p 7 Crustacea Decapoda Machordom, Macpherson, 2004
0.8 p 16 Collembola Hexapoda Hogg, Hebert, 2004
0.17 TrN 3 Hymenoptera Apidae Bertsch et al., 2005

Mean distance = 0.72 ± 0.16, k = 22, n = 378
Intragenus, among sibling species, semispecies and subspecies (2)
0.27 GTR 1 Pisces Sphyrna lewini Quatro et al., 2006
2.2* K2P 2 Pisces Z. faber, L. caudatus Ward et al., 2008
9.1 p 2 Agnata Letentheron Yamazaki et al., 2003
5.4 p 2 Arthropoda Theridiidae Garb et al., 2004
4.75 HKY 2 Tunicata Ascidiacea Tarjuelo et al., 2001
0.4 K2P 4 Mollusca Dreissena Therriault et al., 2004
Mean distance = 3.78 ± 1.18, k=7, n=16
Intragenus, among morphologically distinct species of the same genus (3)
15.1 TrN 36 Actinopterigii Cyprinodontiformes Webb et al., 2004
2007
12.4 p 10 Actinopterigii Pleuronectiformes
11.98 p 9 Pisces Pleuronectiformes
2008
12.67 p 4 Actinopterigii Perciformes Kartavtsev et al., 2009b, in press
Scorpaeniformes Kartavtsev et al., 2009a, in press
9.6 p 964 Chordata - Hebert et al., 2002b
10.9 p 86 Echinodermata - Hebert et al., 2002b
18.3 HKY 18 Ascidiacea Clavelina Williams et al., 2001
15.5 K2P 24 Gastropoda Tegula Hellberg, 1998
18.3 K2P 2 Mollusca Dreissena Therriault et al., 2004
14 p 2 Mollusca Cephalopoda Herke, Foltz, 2002
11.1 p 1155 Mollusca - Hebert et al., 2002b
7 K2P 4 Lepidoptera Arctidae Hebert et al., 2002a
5.8 K2P 12 Lepidoptera Noctuidae Hebert et al., 2002a
5.5 p 4 Lepidoptera 2 genera Hwang et al., 1999
6.3 GTR 51 Coleoptera 7 genera Martinez-Navarro et al., 2005
9.2 p 15 Coleoptera 3 genera Farrel, 2001
4 TrN 3 Hymenoptera Bombus Bertsch et al., 2005
5.6 K2P 12 Diptera Drosophila Goto, Kimura, 2001
11.2 p 7 Arthropoda Latrodectus Garb et al., 2004
13 p 19 Arthropoda Latrodectus Garb et al., 2004
3.9 p 2 Arthropoda Chlorina Dijikstra et al., 2003
9.1 p 3 Arthropoda Chlorina Dijikstra et al., 2003
14.4 p 1249 Arthropoda Chelicerata Hebert et al., 2002b
15.4 p 1781 Arthropoda Crustacea Hebert et al., 2002b
11.2 p 891 Arthropoda Coleoptera Hebert et al., 2002b
9.3 p 1429 Arthropoda Diptera Hebert et al., 2002b
11.5 p 2993 Arthropoda Hymenoptera Hebert et al., 2002b
6.6 p 882 Arthropoda Lepidoptera Hebert et al., 2002b
10.1 p 1458 Arthropoda Other orders Hebert et al., 2002b
5.5 p 2 Lepidoptera Bombyx, Antheraea Hwang et al., 1999
13.3 p 154 Other taxa - Hebert et al., 2002b
13.0* p 96 Crustacea Decapoda Machordom, Macpherson, 2004
19.0 p 3 Hexapoda Collembola Hogg, Hebert, 2004
11 p 49 Nematoda - Hebert et al., 2002b
14.4 p 84 Platyhelminthes - Hebert et al., 2002b
15.7 p 128 Annelida Annelida Hebert et al., 2002b
1 p 17 Cnidaria - Hebert et al., 2002b
Mean distance = 10.87 ± 0.66, k=43, n=13725
Intrafamily, among genera of the same family (4)
Cyprinodontiformes Webb et al., 2004
2007
Pleuronectiformes
11.98 p 9 Pisces Pleuronectiformes
2008
Kartavtsev et al., 2009b
Scorpaeniformes Kartavtsev et al., 2009a
22.7 K2P 4 Mollusca 3 genera Therriault et al., 2004
10 K2P 18 Lepidoptera Arctidae Hebert et al., 2002a
10.4 K2P 90 Lepidoptera Noctuidae Hebert et al., 2002a
14 p 2 Lepidoptera 2 genera Hwang et al., 1999
17.1 p 18 Coleoptera 2 genera Farrel, 2001
12.8 GTR 59 Coleoptera Carabidae Martinez-Navarro et al., 2005
13.8 p 23 Arthropoda Dressenidae Garb et al., 2004
13.3 p 2 Arthropoda Delphacini Dijikstra et al., 2003
20.1 p 3 Arthropoda Delphacini Dijikstra et al., 2003
16.1 p 2 Arthropoda Delphacidae Dijikstra et al., 2003
19.9 p 2 Arthropoda Delphacidae Dijikstra et al., 2003
10 K2P 18 Lepidoptera Arctidae Hwang et al., 1999
Mean distance = 15.00 ± 0.90, k=24, Total n=617
Intraorder, among families of the same order (5)
Pleuronectiformes
Pleuronectiformes
2008
Kartavtsev et al., 2009b
Scorpaeniformes Kartavtsev et al., 2009a
Cyprinodontiformes Webb et al., 2004
Mean distance = 19.97 ± 0.80, k=7, Total n=307
Note: A dash indicates absence of information. An asterisk (*) denotes recalculation of the original
estimates by the author. Arithmetic means are followed by their standard errors (SE) after a ± sign;
k is the number of groups (sample size) from which the mean and SE were estimated; n is the number
of species in comparisons. Distance models: p, p-distance (observed proportion of nucleotide
substitutions); K2P, Kimura two-parameter distance; GTR, general time-reversible distance model;
HKY, Hasegawa-Kishino-Yano distance; TrN, Tamura-Nei distance.
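
The quantities defined in the note can be illustrated with a short sketch. It is not the author's procedure; it only shows, under stated assumptions, how a p-distance, a Kimura two-parameter (K2P) distance, and a group mean with its SE could be computed. The two toy sequences and all function names are hypothetical. The list of values is the Co-1 intraspecies group from the table, with the entry "<1" entered as 1.0 (an assumption) and the tabulated distances assumed to be percentages. Under these assumptions the sketch gives a mean and SE that round to the tabulated 0.72 ± 0.16.

```python
import math
from statistics import mean, stdev

# Purine-purine and pyrimidine-pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def _check_aligned(seq1, seq2):
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and aligned to equal length")


def p_distance(seq1, seq2):
    """Observed proportion of sites that differ between two aligned sequences."""
    _check_aligned(seq1, seq2)
    return sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance: -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)),
    where P and Q are the proportions of transitions and transversions."""
    _check_aligned(seq1, seq2)
    n = len(seq1)
    pairs = list(zip(seq1, seq2))
    P = sum(pair in TRANSITIONS for pair in pairs) / n
    Q = sum(a != b and (a, b) not in TRANSITIONS for a, b in pairs) / n
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


def mean_and_se(values):
    """Arithmetic mean and standard error (sample SD divided by sqrt(k))."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# Hypothetical toy alignment: one transition (site 1) and one transversion (site 20).
SEQ_A = "ACGTACGTACGTACGTACGT"
SEQ_B = "GCGTACGTACGTACGTACGA"

# Co-1 intraspecies distances as listed in the table; "<1" is entered as 1.0 (assumption).
CO1_INTRASPECIES = [
    0.39, 3.3, 0.41, 0.34, 0.17, 0.09, 0.11, 1.00, 1.4, 0.49, 1.0,
    0.98, 0.33, 0.23, 0.17, 0.36, 0.17, 1.0, 1.5, 1.43, 0.8, 0.17,
]

if __name__ == "__main__":
    print(f"p-distance: {100 * p_distance(SEQ_A, SEQ_B):.2f}%")
    print(f"K2P distance: {100 * k2p_distance(SEQ_A, SEQ_B):.2f}%")
    m, se = mean_and_se(CO1_INTRASPECIES)
    print(f"mean = {m:.2f} ± {se:.2f}, k = {len(CO1_INTRASPECIES)}")
```

The K2P value is slightly larger than the p-distance for the same pair because the model corrects for multiple substitutions at a site.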

Acknowledgments

I am very thankful to Drs. R. Ward and H. Suzuki for proofreading the manuscript and for
useful comments. I am also grateful to Dr. N. Hanzawa for the quiet space he provided during
the concluding stage of writing the manuscript.

References

Abascal, F., Zardoya, R., Posada, D. ProtTest: selection of best-fit models of protein
evolution. Bioinformatics, 2005, 21 (9), 2104-2105.
Altukhov, Yu.P., Populyatsionnaya genetika ryb (Fish Population Genetics). Moscow:
Pishchevaya Promyshlennost', 1974.
Altukhov, Yu.P. Geneticheskie protsessy v populyatsiyakh (Genetic Processes in
Populations). Moscow: Nauka Publ.; 1983.
Altukhov, Yu.P. Genetic Processes in Populations. 2nd ed. Moscow: Nauka Publ.;1989.
Altukhov, Yu.P. Species and Speciation, Soros. Obrazovat. Zh. (Soros Educational J.), 1997,
4, 2-10.
Altukhov, Yu.P. Genetic Processes in Populations, 3rd ed. Moscow: Nauka Publ.;1999.
Aronshtam, A.A., Borkin, L.Ya., Pudovkin, A.I. Isozymes in Population and Evolutionary
Genetics, in Genetika izofermentov (Genetics of Isozymes). Moscow: Nauka, 1977, 199-249.
Asmussen, M.A., Arnold, J., Avise, J.C. Definition and Properties of Disequilibrium Statistics
for Associations between Nuclear and Cytoplasmic Genotypes, Genetics, 1987, 115,
755-768.
Avise, J.C., Ayala, F.J. Genetic differentiation in speciose versus depauperate phylads:
evidence from the California Minnows. Evolution, 1976. 30, 46-58.
Avise, J.C., Aquadro, C.F. A Comparative Summary of Genetic Distances in the Vertebrates:
Pattern and Correlations, Evol. Biol., 1982, 15, 151-185.
Avise, J.C., Shapira, J.F., Daniel, S.W., et al. Mitochondrial DNA Differentiation during the
Speciation Process in Peromyscus, Mol. Biol. Evol., 1983, 1, 38-56.
Avise, J.C., Saunders, N.C. Hybridization and Introgression among Species of Sunfish
(Lepomis): Analysis by Mitochondrial DNA and Allozyme Markers, Genetics, 1984,
108, 237-250.
Avise, J.C., Bermingham, E., Kessler, L.G., Saunders, N.C. Characterization of
Mitochondrial DNA Variability in a Hybrid Swarm between Subspecies of Bluegill Sunfish
(Lepomis macrochirus), Evolution, 1984, 38, 931-941.
Avise, J.C., Wollenberg, K. Phylogenetics and Origin of Species. Proc. Natl. Acad. Sci. USA,
1997, 94, 7748-7755.
Avise, J.C. Molecular Markers, Natural History and Evolution, New York: Chapman and
Hall; 1994.
Avise, J.C., Walker, D. Species Realities and Numbers in Sexual Vertebrates: Perspectives
from an Asexually Transmitted Genome, Evolution, 1999, 9(3), 992-995.
Avise, J.C. Phylogeography: The History and Formation of Species, Cambridge: Harvard
Univ. Press; 2000.
Avise, J.C. Cytonuclear Genetic Signatures of Hybridization Phenomena: Rationale Utility
and Empirical Examples from Fishes and Other Aquatic Animals. Rev. Fish Biol.
Fisheries, 2001, 10, 253-263.
Ayala, F.J. Scientific Hypotheses, Natural Selection and Neutrality Theory of Protein
Evolution. The Role of Natural Selection in Human Evolution. In: Salzano, F.M., Ed.
North-Holland, 1975, 19-42.
Ayala, F.J. Mekhanizmy evolyutsii, evolyutsiya (Evolution Mechanisms and Evolution).
Moscow: Mir, 1981.
Ayala, F.J. Vvedenie v populyatsionnuyu i evolyutsionnuyu genetiku (Introduction to
Population and Evolutionary Genetics), Moscow: Mir; 1984.
Ayala, F.J., Fitch, W.M. Genetics and the Origin of Species: An Introduction. Proc. Natl.
Acad. Sci. USA, 1997, 94, 7691-7697.
Baker, C.S., Perry, A., Chambers, G.K., Smith, P.J. Population Variation in the
Mitochondrial Cytochrome-b Gene of the Orange Roughy Hoplostethus atlanticus and
the Hoki Macruronus novaezelandiae. Marine Biol., 1995, 122 (4), 503-509.
Barnes, M.R. Predictive Functional Analysis of Polymorphisms: An Overview, Bioinformatics
for Geneticists, Barnes, M.R., Gray, I.C., Eds. Chichester: Wiley, 2003, 249-271.
Bass, A.L., Dewar, H., Thys, T., Streelman, J.T., Karl, S.A. Evolutionary divergence among
lineages of the ocean sunfish family, Molidae (Tetraodontiformes). Marine Biology,
2005, 148, 405-414.
Beaumont, A.R., Turner, G., Wood, A.R., Skibinski, D.O.F. Laboratory Hybridizations
between Mytilus Species and Performance of Pure Species and Hybrid Veliger Larvae at
Lowered Salinity. J. Molluscan Studies, 2005, 71(3), 303-306.
Beckenbach, A.T., Thomas, W.K., Sohrabi, H. Intraspecific Sequence Variation in the
Mitochondrial Genome of Rainbow Trout. Genome, 1990, 33(1), 13-15.
Bernatchez, L., Wilson, C.C. Comparative phylogeography of nearctic and palearctic fishes.
Molecular Ecology, 1998, 7(4), 431-452.
Bertsch, A., Schweer, H., Tanaka, H. Male Labial Gland Secretions and Mitochondrial DNA
Markers Support Species Status of Bombus cryptarum and B. magnus (Hymenoptera,
Apidae), Insect. Soc., 2005, 52, 45-54.
Billington, N., Strange, R.M. Mitochondrial DNA Analysis Confirms the Existence of a
Genetically Divergent Walleye Population in Northeastern Mississippi. Trans. Am. Fish.
Soc., 1995, 124(5), 770-776.
Brower, A.V.Z. Delimitation of phylogenetic species with DNA sequences: A critique of
Davis and Nixon's population aggregation analysis. Systematic Biology, 1999, 48, 199-213.
Brown, C.J., Aquadro, C.F., Anderson, W.W. DNA Sequence Evolution of the Amylase
Multigene Family in Drosophila pseudoobscura. Genetics, 1990, 126, 131-138.
Bucklin, A., Wiebe, P.H. Low Mitochondrial Diversity and Small Effective Population Sizes
of the Copepods Calanus finmarchicus and Nannocalanus minor: Possible Impact of
Climatic Variation during Recent Glaciation. J. Hered., 1998, 89(5), 383-392.
Burton, R.S., Byrne, R.J., Rawson, P.D. Three divergent mitochondrial genomes from
California populations of the copepod Tigriopus californicus. Gene, 2007, 403 (1-2), 53-59.
Campton, D.E. Natural Hybridization and Introgression in Fishes: Method of Detection and
Genetic Interpretation. Population Genetics and Fishery Management. In: Ryman, N.,
Utter, F., Eds. 1987, 161-192.
Cann, R.L. The Evolution of Human Mitochondrial DNA. Ph.D. Thesis, Berkeley: Univ. of
California, 1982.
Cann, R.L., Brown, W.M., Wilson, A.C. Evolution of Human Mitochondrial DNA: A
Preliminary Report, Human Genetics. Part A.: The Unfolding Genome. Bonne-Tamir, B.,
Ed., New York: Liss, 1982, 157-165.
Chenoweth, S.F., Hughes, J.M., Connolly, R.C. Phylogeography of the pipefish, Urocampus
carinirostris, suggests secondary intergradation of ancient lineages. Marine Biology,
2002. 141(3), 541-547.
Clark, A.G. Natural Selection with Nuclear and Cytoplasmic Transmission: I. A
Deterministic Model, Genetics, 1984, 107, 679-701.
Cracraft, J. Species Concepts and Speciation Analysis. Curr. Ornithol., 1983, 1, 159-187.
Creer, S., Malhotra, A., Thorpe, R.S., Assessing the Phylogenetic Utility of Four
Mitochondrial Genes and a Nuclear Intron in the Asian Pit Viper Genus, Trimeresurus:
Separate, Simultaneous, and Conditional Data Combination Analyses, Mol. Biol. Evol.,
2003, 20(8), 1240-1251.
Daniels, S.R., Stewart, B.A., Cook, P.A. Congruent Pattern of Genetic Variation in a
Burrowing Freshwater Crab Revealed by Allozymes and mtDNA Sequence Analysis.
Hydrobiologia, 2002, 468, 171-179.
Davis, J.I., Nixon, K.C. Populations, genetic-variation, and the delimitation of phylogenetic
species. Systematic Biology, 1992, 41, 421-435.
de Queiroz, K., Donoghue, M.J. Phylogenetic Systematics and the Species Problem.
Cladistics, 1988, 4, 317-338.
de Queiroz, K. The General Lineage Concept of Species, Species Criteria, and the Process of
Speciation: A Conceptual Unification and Terminological Recommendations. Endless
Forms: Species and Speciation, Howard, D.J., Berlocher, S.H., Eds. New York: Oxford
Univ. Press, 1998, 57-78.
DeWoody, J.A., Avise, J.C., Microsatellite Variation in Marine, Freshwater and Anadromous
Fishes Compared with Other Animals. J. Fish. Biol., 2000, 56(3), 461-473.
Dijikstra, E., Rubio, J.M., Post, R.J. Resolving Relationship Over a Wide Taxonomic Range
in Delphacidae (Homoptera) Using the CO1 Gene. Sys. Entomol., 2003, 28, 89-100.
Dowling, T.E., Brown, W.M. Population Structure of the Bottle-Nosed Dolphin (Tursiops
truncatus) As Determined by Restriction Endonuclease Analysis of Mitochondrial DNA,
Marine Mamm. Sci., 1993, 9(2), 138-155.
Duftner, N., Koblmuller, S., Sturmbauer, C. Evolutionary Relationships of the
Limnochromini, a Tribe of Benthic Deepwater Cichlid Fish Endemic to Lake
Tanganyika, East Africa. J. Mol. Evol., 2005, 60(3), 277-289.
Endless Forms: Species and Speciation. In: Howard, D.J., Berlocher, S.H., Eds. New York:
Oxford Univ. Press, 1998.
Evolution of Genes and Proteins. Nei, M., Koehn, R.K., Eds. Sunderland: Sinauer Ass., 1983.
Farrel, B.D. Evolutionary Assembly of the Milkweed Fauna: Cytochrome Oxidase 1 and the
Age of Tetraopes Beetles. Mol. Phylogenet. Evol., 2001, 18(3), 467-478.
Felsenstein, J. Inferring Phylogenies, Sunderland: Sinauer Ass.; 2004.
Ferguson, J.W.H. On the Use of Genetic Divergence for Identifying Species. Biol. J. Linn.
Soc., 2002, 75(4), 509-516.
Ferris, S.D., Whitt, G.S. Phylogeny of Tetraploid Catostomid Fishes Based on the Loss of
Duplicate Gene Expression. Syst. Zool., 1978, 27, 189-206.
Ferris, S.D., Whitt, G.S. Evolution of the Differential Regulation of Duplicate Genes after
Polyploidization. J. Mol. Evol., 1979, 12(3), 267-317.
Ferris, S.D., Sage, R.D., Huang, C.-M., et al. Flow of Mitochondrial DNA across a Species
Boundary. Proc. Natl. Acad. Sci. USA, 1983, 80, 2290-2294.
Garb, J.E., Gonzales, A., Gillespie, A.G. The Black Widow Spider Genus Latrodectus
(Araneae: Theridiidae): Phylogeny Biogeography and Invasion History. Mol. Phylogenet.
Evol., 2004, 31, 1127-1142.
Garcia-Machado, E., Chevalier Monteagudo, P.P., Solignac, M. Lack of mtDNA
Differentiation among Hamlets (Hypoplectrus, Serranidae). Marine Biol., 2004, 144, 147-152.
Gardner, J.P.H., Skibinski, D.O.F. Historical and Size-Dependent Genetic Variation in
Hybrid Mussel Populations, Heredity, 1988, 61, 93-105.
Gerber, A.S., Tibbets, C.A., Dowling, T.E. The Role of Introgressive Hybridization in the
Evolution of the Gila robusta Complex (Teleostei: Cyprinidae). Evolution, 2001, 55 (10),
2028-2039.
Gillespie, J.H. Lineage Effects and the Index of Dispersion Molecular Evolution. Mol. Biol.
Evol., 1998, 6, 636-647.
Gillespie, J.H. Is the Population Size of a Species Relevant to Its Evolution? Evolution, 2001,
55(11), 2161-2169.
Glemet, H., Blier, P., Bernatchez, L. Geographical Extent of Arctic Char (Salvelinus alpinus)
mtDNA Introgression in Brook Char Populations (S. fontinalis) from Eastern Quebec
Canada, Mol. Ecol., 1998, 7 (12), 1655-1662.
Gonzalez-Villasenor, L.I., Powers, D.A. Mitochondrial DNA Restriction Site
Polymorphisms in the Teleost Fundulus heteroclitus Support Secondary Intergradation.
Evolution, 1990, 44 (1), 27-37.
Goto, S.G., Kimura, M.T. Phylogenetic Utility of Mitochondrial COI and Nuclear Gpdh
Genes in Drosophila. Molecular Phylogenetics and Evolution, 2001, 18(3), 404-422.
Graur, D., Li, W.H. Fundamentals of Molecular Evolution. Sunderland: Sinauer Ass.; 1999.
Griffiths, C.S. Correlation of Functional Domains and Rates of Nucleotide Substitutions at
Cytochrome b. Mol. Phylogenet. Evol., 1997, 7(3), 352-365.
Hahn, M.W., Mezey, J.G., Begun, D.J., et al. Evolutionary Genomics: Codon Bias and
Selection on Single Genomes. Nature, 2005, 433, brief communication, p. E5
(10.1038/nature03221).
Halanych, K.M., Demboski, J.R., van Vuuren, B.J., Klein, D.R., Cook J.A. Cytochrome b
Phylogeny of North American Hares and Jackrabbits (Lepus, Lagomorpha) and the
Effects of Saturation in Outgroup Taxa. Molecular Phylogenetics and Evolution, 1999,
11(2), 213-221.
Hardman, M. The phylogenetic relationships among Noturus catfishes (Siluriformes:
Ictaluridae) as inferred from mitochondrial gene cytochrome b and nuclear
recombination activating gene 2. Molecular Phylogenetics and Evolution, 2004, 30, 395-408.
Hall, B. Phylogenetic Trees Made Easy: A How-To Manual for Molecular Biologists,
Sunderland: Sinauer Ass.; 2001.
Heath, D.A., Rawson, P.D., Hilbish, T.J. PCR-Based Nuclear Markers Identify Alien Blue
Mussel (Mytilus spp.) Genotypes on the West Coast of Canada. Can. J. Fish. Aquat. Sci.,
1995, 52, 2621-2627.
Hebert, P.D.N., Cywinska, A., Ball, S.L. Biological Identification Through DNA Barcodes,
Proc. R. Soc. London, B, 2002, 270, 1512, 02PB0653.1-02PB0653.9.
Hebert, P.D.N., Ratnasingham, S., Barcoding Animal Life: Cytochrome c Oxidase Subunit 1
Divergences among Closely Related Species, Proc. R. Soc. London, B, 2002, 270(1512),
03BL0066.S1-03BL0066.S4.
Hedgecock, D. Population Genetic Bases for Improving Cultured Crustaceans, FIFAC/FAO
Symp. on Selection, Hybridization and Genetic Engineering in Aquaculture of Fish and
Shellfish for Consumption and Restocking. Bordeaux, 1986.
Hedgecock, D., Nelson, C. Genetic Variation of Enzymes and Adaptive Strategies in
Crustaceans. In: Genetika i razmnozhenie morskikh zhivotnykh (Genetics and
Reproduction of Marine Animals). Vladivostok: Dal'nevost. Nauchn. Tsentr Akad. Nauk
SSSR, 1981, 105-129.
Hellberg, M.E. Sympatric sea shells along the sea's shore: The geography of speciation in the
marine gastropod Tegula. Evolution, 1998. 52(5), 1311-1324.
Herke, S.W., Foltz, D.W. Phylogeography of two squid (Loligo pealei and L. plei) in the Gulf
of Mexico and northwestern Atlantic Ocean. Marine Biology, 2002, 140(1), 103-115.
Hogg, I.D., Hebert, P.D.N. Biological identification of springtails (Hexapoda : Collembola)
from the Canadian Arctic, using mitochondrial DNA barcodes. Canadian Journal Of
Zoology-Revue Canadienne De Zoologie, 2004, 82(5), 749-754.
Howard, D.J. Unanswered Questions and Future Directions in the Study of Speciation.
Endless Forms: Species and Speciation, In: Howard, D.J., Berlocher, S.H., Eds. New
York: Oxford Univ. Press, 1998, 439-448.
Howell, K.L., Rogers, A.D., Tyler, P.A., Billett, D.S.M. Reproductive isolation among
morphotypes of the Atlantic sea star species Zoroaster fulgens (Asteroidea:
Echinodermata). Marine Biology, 2004, 144, 977-984.
Hwang, J.S., Lee, J.S., Goo, T.W., et al. The Comparative Molecular Study between
Bombycidae and Saturniidae Based on mtDNA RFLP and Cytochrome Oxidase 1 Gene
Sequences: Implication for Molecular Evolution. Z. Naturforschung (J. Biosci.), 1999,
54(7-8), 587-594.
Ingman, M., Gyllensten, U. European Journal of Human Genetics, 2007, 15, 115-120.
Jerome, M., Lemaire, C., Bautista, J.M. Molecular Phylogeny and Species Identification of
Sardines. J. Agric. Food Chem., 2003, 51, 43-50.
Johns, G.C., Avise, J.C. A Comparative Summary of Genetic Distances in the Vertebrates
from the Mitochondrial Cytochrome b Gene, Mol. Biol. Evol., 1998, 15 (11), 1481-1490.
Johnson, M.J., Wallace, D.C., Farris, C.D., et al. Radiation of Human Mitochondria DNA
Types Analyzed by Restriction Endonuclease Cleavage Patterns. J. Mol. Evol., 1983, 19,
255-271.
Johnson, K.P., Cruickshank, R.H., Adams, R.J., et al. Dramatically Elevated Rate of
Mitochondrial Substitution in Lice (Insecta: Phthiraptera). Mol. Phylogenet. Evol., 2003,
26, 231-242.
Kartavtsev, Yu.P., Kartavtseva, I.V., Vorontsov, N.N. Population genetics and gene
geography of wild mammals. 5. Genetic distances between representatives of different
genera of Palearctic hamsters (Rodentia, Cricetini). Russian J. Genetics, 1984. 20(6),
961-969.
Kartavtsev, Yu.F., Glubokovskii, M.K., Chereshnev, I.A., Genetic Variation and
Differentiation of Sympatric Char Species (Salvelinus, Salmonidae) from Chukotka.
Genetika (Moscow), 1983, 19(4), 584-593.
Kartavtsev, Yu.F., Mamontov, A.M. Electrophoretic Estimation of Protein Variation and
Similarity in Omul, Two Forms of Lake Herring (Coregonidae), and Grayling
(Thymallidae) from the Lake Baikal. Genetika (Moscow), 1983, 19(11), 1895-1902.
Kartavtsev, Y.P. Allozyme Heterozygosity and Morphological Homeostasis in Pink Salmon
Fry Oncorhynchus gorbuscha (Pisces: Salmonidae): Evidences from the Family
Analysis. J. Fish. Biol., 1992, 40(1), 17-24.
Kartavtsev, Y. Genetic Aspects of Speciation, Species Differentiation and Biodiversity. Proc.
Int. Meet. of Biodiversity in Asia 2000, September 2000, Tokyo, p. 27.
Kartavtsev, Yu.F., Sviridov, V.V., Hanzawa, N., Sasaki, T. Genetic Divergence of Far-
Eastern Dace Species. Rus. J. Genet., 2002, 38(11), 1285-1257.
Kartavtsev, Y.P., Svinyna, O.V. Allozyme Markers and Morphometric Variability in
Gastropod Mollusk, Nucella heyseana (Mollusca, Gastropoda) and Their Association
with Environmental Change. Korean J. Genet., 2003, 25(4), 1-12.
Kartavtsev, Y.P., Hanzawa, N. Inferences in Leuciscinae (Pisces, Cyprinidae) phylogeny and
taxonomy based on cytochrome b sequence distances and on enzyme loci diversity.
Korean J. Genetics, 2007, 29(4), 427-435.
Kartavtsev, Yu.F. Molekulyarnaya evolyutsiya i populyatsionnaya genetika (Molecular
Evolution and Population Genetics), Vladivostok: Dal'nevost. Gos. Univ. (Far Eastern
State Univ. Publ.), 2005.
Kartavtsev, Y.Ph., Chichvarkhin, A.Y., Kijima, A., et al. Allozyme and Morphometric
Analysis of Two Common Mussel Species of Mytilus Genus (Mollusca, Mytilidae) in
Korea, Japan and Russia Waters. Korean J. Genet. 2005, 27(4), 289-306.
Kartavtsev, Y.P., Lee, Y.-M., Jung, S.-O., Byeon, H.-K., Son, Y., Lee, J.-S. The complete
mitochondrial genome of the bullhead torrent catfish, Liobagrus obesus (Siluriformes,
Amblycipididae): Genome description and phylogenetic considerations inferred from the
Cyt b gene. Gene, 2007a, 396, 13-27.
Kartavtsev, Y.P., Park, T.-J., Vinnikov, K.A., Ivankov, V.N., Sharina, S.N., Lee, J.-S.
Cytochrome b (Cyt-b) gene sequences analysis in six flatfish species (Pisces,
Pleuronectidae), with phylogenetic and taxonomic insights. Marine Biol., 2007b, 152(4),
757-773.
Kartavtsev, Y.Ph., Sharina, S.N., Goto, T., Chichvarkhin, A.Y., Balanov, A.A., Vinnikov,
K.A., Ivankov, V.N., Hanzawa, N. Cytochrome oxidase 1 (Co-1) gene sequence analysis
in six flatfish species (Teleostei, Pleuronectidae) of Russia Far East with inferences in
phylogeny and taxonomy. Mitochondrial DNA, 2008, 19(6), 479-489.
Kartavtsev Y.Ph., Sharina S.N., Goto T., Balanov A.A., Hanzawa N. Sequence diversity at
cytochrome oxidase 1 (Co-1) gene among sculpins (Scorpaeniformes, Cottidae) and
some other scorpionfish of Russia Far East with phylogenetic and taxonomic insights.
Genes and Genomics, 2009b, 31(2), 191-205.
Kartavtsev Y.Ph., Sharina S.N., Goto T., Rutenko O.A., Zemnukhov V.V., Semenchenko
A.A., Hanzawa N. Sequence diversity at cytochrome oxidase 1 (Co-1) gene among
pricklebacks (Actinopterigii, Stichaeidae) and some other percoids fishes of Russia Far
East with inferences in phylogeny and taxonomy. Aquatic Biology, 2009a. (Accepted).
Kimball, R.T., Braun, E.L., Zwartjes, P.W., et al. A Molecular Phylogeny of Pheasants and
Partridges Suggests That These Lineages Are Not Monophyletic. Mol. Phylogenet. Evol.,
1999, 11(1), 38-54.
Kimura, M. The Number of Heterozygous Nucleotide Sites Maintained in a Finite Population
Due to Steady Flux of Mutations. Genetics, 1969, 61, 893-903.
Kimura, M. The Neutral Theory of Molecular Evolution. Cambridge: Cambridge Univ.,
1983.
King, J.L., Jukes, T.H. Non-Darwinian Evolution. Science, 1969, 164, 788-798.
King, M. Species Evolution: The Role of Chromosome Change. Cambridge: Cambridge Univ.
Press, 1993.
Klug, W.S. and Cummings, M.R., Essential Genetics, Prentice Hall, 2002.
Kochzius, M., Blohm, D. Genetic Population Structure of the Lionfish Pterois miles
(Scorpaenidae, Pteroinae) in the Gulf of Aqaba and Northern Red Sea. Gene, 2005, 347(2),
295-301.
Koehn, R.K. Physiology and Biochemistry of an Enzyme Variation: The Interface of Ecology
and Population Genetics. Ecological Genetics: The Interface. In: Brussard, P., Ed. New
York: Springer-Verlag, 1978, 51-72.
Koehn, R.K., Zera, A.J., Hall, J.G. Enzyme Polymorphism and Natural Selection, Evolution
of Genes and Proteins. In: Nei, M., Koehn, R.K., Eds. Sunderland: Sinauer Ass., 1983,
115-136.
Koehn, R.K., Diehl, W.J., Scott, T.M. The Differential Contribution by Individual Enzymes
of Glycolysis and Protein Catabolism to the Relationship between Heterozygosity and
Growth Rate in the Coot Clam, Mulinia lateralis. Genetics, 1988, 118, 121-130.
Kontula, T., Kirilchik, S.V., Vainola, R. Endemic Diversification of the Monophyletic
Cottoid Fish Species Flock in the Lake Baikal Explored with mtDNA Sequencing, Mol.
Phylogenet. Evol., 2003, 27, 143-155.
Kramer, B., van der Bank, H., Flint, N., et al. Evidence for Parapatric Speciation in the
Mormyrid Fish, Pollimyrus castelnaui (Boulenger, 1911), from the Okavango-Upper
Zambezi River Systems: P. marianne sp. nov., Defined by Electric Organ Discharges.
Morphology and Genetics, Environ. Biol. Fishes, 2003, 67(1), 47-70.
Krasilov, V.A. Evolyutsiya i biostratigrafiya (Evolution and Biostratigraphy). Moscow:
Nauka, 1977.
Kryukov A.P., Iwasa M.A., Suzuki H., Pinsker W., Haring E. Synchronic east-west
divergence in azure-winged magpies (Cyanopica cyanus) and magpies (Pica pica). J.
Zool. Syst. Evol. Research, 2004. 42: 342-351.
Kumar, S., Tamura, K., Nei, M., MEGA: Molecular Evolutionary Genetics Analysis (With a
130-Page Printed Manual), University Park: Pennsylvania Univ., 1993.
Kumar, S., Tamura, K., Nei, M., MEGA3: Molecular Evolutionary Genetics Analysis, 2000,
Web-based version (www.megasoftware.net).
Laurie-Ahlberg, C.C., Maroni, G., Buley, G.C., et al. Quantitative Genetic Variation of
Enzyme Activities in Natural Populations of Drosophila melanogaster. Proc. Natl. Acad.
Sci. USA, 1982, 77, 1073-1077.
Lewontin, R.C. Geneticheskie osnovi evolutsii. Moscow: Mir, 1978. Transl. From The
Genetic Basis of Evolutionary Change. New York: Columbia Univ., 1974.
Li, W.-H., Gojobori, T., Nei, M. Pseudogenes as a Paradigm of Neutral Evolution. Nature,
1981, 292, 237-239.
Li, W.-H., Wu, C.-I., Luo, C.-C. Nonrandomness of Point Mutation As Reflected in
Nucleotide Substitutions in Pseudogenes and Its Evolutionary Implications. J. Mol. Evol.,
1984, 21, 58-71.
Li, W.H., Zharkikh, A. Statistical Tests of DNA Phylogenies. Syst. Biol., 1995, 44, 49-63.
Li, W.H. Molecular Evolution, Sunderland: Sinauer Ass., 1997.
Machordom, A., Macpherson, E. Rapid Radiation and Cryptic Speciation in Squat Lobsters
of the Genus Munida (Crustacea, Decapoda) and Related Genera in the South West
Pacific: Molecular and Morphological Evidence. Mol. Phylogenet. Evol., 2004, 33(2),
259-279.
Makarieva, A.M., Variance of Protein Heterozygosity in Different Species of Mammals with
Respect to the Number of Loci Studied. Heredity, 2001, 87, 41-51.
Malecot, G. Isolation by Distance, Genetic Structure of Populations. Morton, N.E., Ed.,
Honolulu: Univ. of Hawaii, 1973, 72-75.
Martinez-Navarro, E.M., Galian, J., Serrano, J., Phylogeny and Molecular Evolution of the
Tribe Harpalini (Coleoptera, Carabidae) Inferred from Mitochondrial Cytochrome-
Oxidase I, Mol. Phylogenet. Evol., 2005, 35, 127-146.
Mazurok, N.A., Rubtsova, N., Isaenko, A.A., et al. Comparative Chromosome and
Mitochondrial DNA Analysis and Phylogenetic Relationships in Common Voles
(Microtus arvicoliodae). Chromosome Res., 2001, 9, 107-120.
Mayr, E. Process of Speciation in Animals. Mechanisms of Speciation. In: Barigozzi, C., Ed.,
New York: Liss, 1982, 1-20.
Mayr, E. Animal Species and Evolution. Cambridge: Harvard Univ. Press, 1963.
Moller, P.R., Gravlund, P. Phylogeny of the Eelpout Genus Lycodes (Pisces, Zoarcidae) As
Inferred from Mitochondrial Cytochrome b and 12S rDNA. Mol. Phylogenet. Evol.,
2003, 26, 369-388.
Moriwaki, K., Hirai, K., Kohigashi, C., Miyashita, N., Kryukov, A., Frisman, L., Yamaguchi, Y.
2001. Geographical distribution of a recombinant beta-hemoglobin haplotype P and
possible migration of wild mouse populations in Asia. Evolution, genetics, ecology and
biodiversity: International conference. Editors Alexei P. Kryukov, Yuri Ph. Kartavtsev.
Vladivostok - Vostok Marine Biological Station. September 24-30, 2001: Abstracts.
Vladivostok: Institute of Marine Biology; 2001; P.37.
Mukai, T., Tachida, H., Ichinose, M. Selection for Viability at Loci Controlling Protein
Polymorphism in Drosophila melanogaster is Very Weak at Most. Proc. Natl. Acad. Sci.
USA, 1980, 77, 4857-4860.
Naylor, G.J., Collins, T.M., Brown, W.M. Hydrophobicity and phylogeny. Nature, 1996, 373,
565-566.
Nei, M. Genetic Distances between Populations. Am. Nat., 1972, 106(949), 283-292.
Nei, M. Molecular Population Genetics and Evolution. Amsterdam: North Holland, 1975.
Nei, M. Koehn, R. Evolution of Genes and Proteins. Nei, M., Koehn, R.K. Eds. Sunderland,
Mass.: Sinauer Assoc., 1983.
Nei, M. Human Evolution at Molecular Level, Population Genetics and Molecular Evolution,
Ohta, T., Aoki, K., Eds. Tokyo: Japan Sci. Soc., 1985, 41-64.
Nei, M. Molecular Evolutionary Genetics, New York: Columbia Univ. Press; 1987.
Nei, M., Kumar, S. Molecular Evolution and Phylogenetics, New York: Oxford Univ. Press;
2000.
Nevo, E., Beiles, A., Ben-Shlomo, R. The Evolutionary Significance of Genetic Diversity:
Ecological, Demographic and Life History Correlates. Evolutionary Dynamics of Genetic
Diversity. In: Mani, G.S., Ed., Lect. Notes Biomath., 1984, 53, 4-213.
Nielsen, R., Hubisz, M.J. Detecting Selection Needs Comparative Data. Nature, 2005, 433,
brief communication, p. E6 (10.1038/nature03221).
Ohta, T., Gillespie, J.H. Development of Neutral and Nearly Neutral Theories. Theor. Popul.
Biol., 1996, 49(2), 128-142.
Orrell, T.M. A Molecular Phylogeny of the Sparidae (PERCIFORMES: PERCOIDEI). Ph.D.
Thesis. A Dissertation Presented to The Faculty of the School of Marine Science, The
College of William and Mary in Virginia, USA, 2000.
Pagel, M., Venditti, C., Meade A. Large Punctuational Contribution of Speciation to
Evolutionary Divergence at the Molecular Level. Science, 2006, 314(5796), 119-121.
Papasotiropoulos, V., Klossa-Kilia, E., Alahiotis, S.N., Kilias, G. Molecular Phylogeny of
Grey Mullets (Teleostei: Mugilidae) in Greece: Evidence from Sequence Analysis of
mtDNA Segments. Biochemical Genetics, 2007, 45(7-8), 1573-4927.
Pasekov, V.P. Genetic Distances, Geneticheskie rasstoyaniya, Itogi Nauki Tekhn.: Obshch.
Genet., 1983, 8, 4-75.
Paterson, H.E.H. More Evidence Against Speciation by Reinforcement. J. South African Sci.,
1978, 74, 369-371.
Paterson, H.E.H. The Recognition Concept of Species, Species and Speciation. In: Vrba, E.S.,
Ed., Pretoria: Transvaal Museum Monograph, 1985, 21-29.
Piaggio, A.J., Spicer, G.S. Molecular Phylogeny of the Chipmunk Inferred from
Mitochondrial Cytochrome b and Cytochrome Oxidase II Gene Sequences. Mol.
Phylogenet. Evol., 2001, 20(3), 335-350.
Plotkin, J.B., Dushoff, J., Fraser, H.B. Detecting Selection Using a Single Genome Sequence
of M. tuberculosis and P. falciparum. Nature, 2004, 428(6986), 942-945.
Posada, D., Crandall, K.A. MODELTEST: Testing the Model of DNA Substitution,
Bioinformatics, 1998, 14, 817-818.
Powell, J.R. Interspecific Cytoplasmic Gene Flow in the Absence of Nuclear Gene Flow:
Evidence from Drosophila, Proc. Natl. Acad. Sci. USA, 1983, 80, 492-495.
Powell, J.R. Progress and Prospects in Evolutionary Biology: The Drosophila Model, New
York: Oxford Univ. Press; 1997.
Powers, D.A., A Multidisciplinary Approach to the Study of Genetic Variation in Species,
New Directions in Physiological Ecology. In: Feder, M.L., Bennet, A.F., Eds. New
York: Cambridge Univ. Press, 1987, 102-134.
Quattro, J.M., Stoner, D.S., Driggers, W.B., Anderson, C.A., Priede, K.A., Hoppmann, C.,
Campbell, N.H., Duncan, K.M., Grady, J.M. Genetic evidence of cryptic speciation
within hammerhead sharks (Genus Sphyrna). Marine Biology, 2006, 148, 1143-1155.
Rambau, R.V., Robinson, T.J., Stanyon, R., Molecular Genetics of Rhabdomys pumilio
Subspecies Boundaries: mtDNA Phylogeography and Karyotypic Analysis by
Fluorescence In Situ Hybridization. Mol. Phylogenet. Evol., 2003, 28(3), 564-575.
Rand, D.M., Kann, L.M. Mutation and Selection at Silent and Replacement Sites in the
Evolution of Animal Mitochondrial DNA. Genetics, 1998, 102-103, 393-407.
Rawson, P.D., Secor, C.L., Hilbish, T.J. The Effect of Natural Hybridization on the
Regulation of Doubly Uniparental mtDNA Inheritance in Blue Mussels (Mytilus spp.).
Genetics, 1996, 144, 241 248.
Rawson, P.D., Hilbish, T.J. Asymmetric Introgression of Mitochondrial DNA among
European Populations of Blue Mussels (Mytilus spp.). Evolution, 1998, 52(1), 100 108.
Rawson, P.D., Argawal, V., Hilbish, T.J. Hybridization between the Blue Mussels Mytilus
galloprovincialis and M. trossulus Along the Pacific Coast of North America: Evidence
for Limited Introgression. Marine Biol., 1999, 134(1), 201 211.
Rocha-Olivares, A., Rosenblatt, R.H., Vetter, R.D. Molecular Evolution, Systematics and
Zoogeography of the Rockfish Subgenus Sebastomus (Sebastes, Scorpenidae) Based on
Mitochondrial Cytochrome b and Control Region Sequences. Mol. Phylogenet. Evol.,
1999a, 11(3), 441 458.
Rocha-Olivares, A., Kimbell, C.A., Eitner, B.J., Vetter, R.D. Evolution of Mitochondrial
Cytochrome b Gene Sequence in the Species-Rich Genus Sebastes (Teleostei,
Scorpenidae) and Its Utility in Testing Monophyly in the Subgenus Sebastomus. Mol.
Phylogenet. Evol., 1999b, 11(3), 426 440.
Rongyan, Z., Xianglong, L., Lanhui, L., Xiangyun, L., Fujun, F. Evolution and
Differentiation of the Prion Protein Gene (PRNP) among Species. Journal of Heredity,
2008, 99(6), 647-652.
Ruber, L, Zardoya, R. Rapid cladogenesis in marine fishes revisited. Evolution, 2005, 59(5),
1119-1127.
Rutaisire, J., Boot, A.J., Masemba, C., et al. Evolution of Labeo victorianus Predates the
Pleistocene Desiccation of Lake Victoria: Evidence from Mitochondrial DNA Sequence
Variation. South African J. Sci., 2004, 100(11-12), 607 608.
Ryman, N.F., Allendorf, F.W., Stahl, G. Reproductive Isolation with Little Genetic
Divergence in Sympatric Populations of Brown Trout (Salmo trutta), Genetics, 1979, 92,
247 262.
Rzhetsky, A., Nei, M. Test of Applicability of Several Substitution Models for DNA
Sequence Data. Mol. Biol. Evol., 1995, 12, 131 151.
Sanderson, M.J., Shaffer, H.B. Troubleshooting Molecular Phylogenetic Analyses, Annual
Rev. Ecol. Syst., 2002, 33, 49 72.
Sasaki, T., Kartavtsev, Y.P., Uematsu, T., Sviridov, V.V., Hanzawa, N. Phylogenetic
independence of Far Eastern Leuciscinae (Pisces: Cyprinidae) inferred from
mitochondrial DNA analysis. Gene and Genetic Systems, 2007, 82, 329-340.
Schneider, S., Roessli, D., Excoffier, L. Arlequine Ver. 2.000: A Software for Population
Genetic Data Analysis, Geneva: Univ. of Geneva, 2000.
Schon, I., Rubin, R., Griffits, H., Martins, K. Slow Molecular Evolution in Ancient Asexual
Ostracod, Proc. R. Soc. London, B, 1998, 265, 235 242.
Serizawa, K., Suzuki, H., Tsuchia, K. Phylogenetic View on Species Radiation in Apodemus
Inferred from Variation of Nuclear and Mitochondrial Genes. Biochem. Genet., 2000,
38(1 2), 27 40.
Sharina, S.N. Kartavtsev, Y.Ph. Phylogeny of Far Eastern flatfish species (Pisces,
Pleuronectiformes) based on primary sequence of nucleotide cytochrome oxidase 1 gene.
DNA Barcoding and Molecular Phylogenetics: The International Workshop.
Vladivostok, September 7-14, 2008: Program and Abstracts. Vladivostok, 2008. 17
p.Engl., p.15.
Simpson, G.G. Principles of Animal Taxonomy: The Species and Lower Categories, New
York: Columbia Univ. Press, 1961.
Sites, J.W., Marshall, J.C. Operational Criteria for Delimiting Species. Annual Rev. Ecol.
Evol. Syst., 2004, 35, 199 227.
Skibinski, D.O.F., Beardmore, J.A., Cross, T.F. Aspects of the Population Genetics of
Mytilus (Mytilidae: Mollusca) in the British Isles. Biol. J. Linn. Soc., 1983, 19, 173
183.
Skurikhina, L.A., Kartavtsev, Yu.F., Chichvarkhin, A.Yu., Pan'kova, M.V. Study of Two
Species of Mussels, Mytilus trossulus and Mytilus galloprovincialis (Bivalvia, Mytilidae)
and Their Hybrids in Peter the Great Bay of the Sea of Japan with the Use of PCR
Markers. Rus. J. Genet., 2001, 37(12), 1448 1451.
Spolsky, C., Uzzell, T. Natural Interspecies Transfer of Mitochondrial DNA in Amphibians.
Proc. Natl. Acad. Sci. USA, 1984, 81, 5802 5805.
STATISTICA, Statistica for Windows: Users Guide, Tulsa: StatSoft; 1994.
Stepien, C.A., Faber, J.E. Population Genetic Structure Phylogeography and Spawning
Phylopatry in Walleye (Stizostedion vitreum) from Mitochondrial DNA Control Region
Sequences. Mol. Ecol., 1998, 7, 1757 1769.
Stone, K.D., Cook, J.A., Molecular Evolution of Holarctic Martens (Genus Martes,
Mammalia: Carnivora: Mustellidae). Mol. Evol. Phylogenet., 2002, 24, 169 176.
Sumida, M., Ogata, M. Intraspecific Differentiation in the Japanese Brown Frog Rana
japonica Inferred from Mitochondrial DNA Sequences of the Cytochrome b Gene. Zool.
Sci., 1998, 15(6), 989 1000.
Sumida, M., Ogata, M., Nishioka, M. Molecular Phylogenetic Relationships of Pond Frogs
Distributed in Palearctic Region Inferred from DNA Sequences of Mitochondrial 12S
Ribosomal RNA and Cytochrome b Genes. Mol. Phylogenet. Evol., 2000, 16(2), 278
285.
Suzuki, H., Yasuda, S.P., Sakaizumi, M., et al. Differential Geographic Patterns of
Mitochondrial DNA Variation in Two Sympatric Species of Japanese Wood Mice,
Apodemus speciosus and A. argenteus. Genes Gene Syst., 2004, 79, 165 176.
Suzuki, H., Nunome, M., Moriwaki, K., Yonekawa, H., Tsuchiya, K., Kryukov, A.P. A
thirty-years project of the genetic survey on the wild mice. Modern Achievements in
Population, Evolutionary, and Ecological Genetics: International Symposium,
Vladivostok Vostok Marine Biological Station, September 914, 2007: Program and
Abstracts. Vladivostok, 2007. 44 p. Engl. P. 38.
Suzuki, H, Filippucci, M.G., Chelomina, G.N., Sato, J., Serizawa, K., Nevo, E. A
biogeographic view of Apodemus in Asia and Europe inferred from nuclear and
mitochondrial gene sequences. Biochemical Genetics, 2008, 46(5-6), 329-346.
Swofford, D.L., Olsen, G.J., Waddel, P.J., Hillis, D.M., Phylogenetic Inference, Molecular
Systematics, Hillis, D.M., Moritz, C., Mable, B., Eds. Sunderland, Mass.: Sinauer Ass.,
1996, 407 514.
Swofford, D.L. PAUP*: Phylogenetic Analysis Using Parsimony and Other Methods
(Software), Sunderland: Sinauer Ass.; 2000.
Takahata, N., Slatkin, M. Mitochondrial Gene Flow. Proc. Natl. Acad. Sci. USA, 1984, 81,
1764 1767.
Tamura, K., Dudley, J., Nei, M., Kumar, S. MEGA4: Molecular Evolutionary Genetics
Analysis (MEGA) software version 4.0. Molecular Biology and Evolution
10.1093/molbev/msm092. 2007.
Takehana, Y., Nagai, N., Matsuda, M., et al. Geographic Variation and Diversity of the
Cytochrome b Gene in Japanese Wild Populations of Medaka, Oryzias latipes. Zool. Sci.,
2003, 20(10), 1279 1291.
Tarjuelo, I., Posada, D., Crandall, K., et al. Cryptic Species of Clavelina (Ascidiacea) in Two
Different Habitats: Harbors and Rocky Littoral Zones in the Northwestern
Mediterranean. Marine Biol., 2001, 139, 455 462.
Templeton, A.R., Mechanisms of Speciation Population Genetic Approach, Annual Rev.
Ecol. Syst., 1981, 12, 23 48.
Templeton, A.R. Species and Speciation: Geography, Population Structure, Ecology and
Gene Trees. Endless Forms: Species and Speciation. In: Howard, D.J., Berlocher, S.H.,
Eds. New York: Oxford Univ. Press, 1998, 32 43.
Templeton, A.R. Using phylogeographic analyses of gene trees to test species status and
processes. Molecular Ecology, 2001. 10(3), 779-791.
Tennessen J.A. Positive selection drives a correlation between non-synonymous/synonymous
divergence and functional divergence. Bioinformatics, 2008, 24(12), 1421-1425.
Terriault, T.W., Docker, M.F., Orlova, M.I., et al. Molecular Resolution of the Family
Dreissenidae (Mollusca: Bivalvia) with Emphasis on Ponto-Kaspian Species, Including
First Report of Mytilopsis leucophaeata in the Black Sea Basin. Mol. Phylogenet. Evol.,
2004, 30, 479 489.
Thorpe, J.P., Enzyme Variation, Genetic Distance and Evolutionary Divergence in Relation
to Levels of Taxonomic Variation, Protein Polymorphism: Adaptive and Taxonomic
Significance, Oxford, J.S., Rollinson, D., Eds. London: Academic, 1983, 131 152.
Timofeev-Resovsky, N.V., Vorontsov, N.N., Yablokov, A.V. Kratkii ocherk teorii evolyutsii
(Brief Essay of the Evolution Theory). Moscow: Nauka, 1977.
Van Valen, L. Ecological Species, Multispecies and Oaks. Taxon, 1976, 25, 233 239.
Van Wagner, C.E., Baker, A.J. Association between Mitochondrial DNA and Morphological
Evolution in Canada Geese. J. Mol. Evol., 1990, 31, 5, 373 382.
Volker, G. Molecular Evolutionary Relationships in Avian Genus Anthus (Pipits:
Mothacillidae). Mol. Phylogenet. Evol., 1999, 11(1), 84 94.
Ward, R.D., Skibinski, D.O.F., Woodwark, M. Protein Heterozygosity, Protein Structure, and
Taxonomic Differentiation. Evolutionary Biology. In: Hecht, M.K., et al., Ed., New
York: Plenum, 1992, 73 159.
Ward RD, Zemlak TS, Innes BH, Last PA, and Hebert PDN DNA barcoding Australia fish
species. Philosophical Transactions of the Royal Society of London B, 2005, 360, 1847-
1857.
Ward, R.D., Costa, F.O., Holmes, B.H., Steinke, D. DNA barcoding of shared fish species
from the North Atlantic and Australasia: minimal divergence for most taxa, but Zeus
faber and Lepidopus caudatus each probably constitute two species. Aquat. Biol., 2008,
3, 7178.
Webb, S.A., Graves, J.A., Macias-Garcia, C., Magurran, A.E., Foighil, D.O, Ritchie, M.G.
Molecular phylogeny of the livebearing Goodeidae (Cyprinodontiformes) phylogeny of
the livebearing Goodeidae (Cyprinodontiformes). Molecular Phylogenetics and
Evolution, 2004, 30, 527544.
Wiens, J.J., Penkrot, T.A. Delimiting species using DNA and morphological variation and
discordant species limits in spiny lizards (Sceloporus). Systematic Biology, 2002, 51(1),
69-91.
Wiley, E.O. The Evolutionary Species Concept Reconsidered. Syst. Zool., 1978, 27, 17 26.
Wilhelm, R., Genotype-Specific Selection within a Hybrid Population of the Mussel Genus
Mytilus, Master Thesis. Columbia: Univ. of South Carolina, 1993.
Willett, C.S., Burton, R.S. Evolution of Interacting Proteins in the Mitochondrial Electron
Transport System in a Marine Copepod. Mol. Biol. E, 2004, 24(3), 443 453.
Williams, S.T., Knowlton, N., Weight, L.A., Jara, J.A. Evidence for Three Major Clades
within Snapping Shrimps Genus Alpheus Inferred from Nuclear and Mitochondrial
Sequence Data. Mol. Phylogenet. Evol., 2001, 20(3), 375 389.
Wilson, A.C., Gene Regulation in Evolution. Molecular Evolution. In: Ayala, F.J., Ed.
Sunderland: Sinauer Ass., 1976, 225 234.
Wu, W., Schmidt, T.R., Goodman, M., Grossman, L., Molecular Evolution of Cytochrome c
Oxidase Subunit 1 in Primates: Is There Coevolution between Mitochondrial and Nuclear
Genomes? Mol. Phylogenet. Evol., 2000, 17 (2), 294 304.
Yamazaki, Y., Goto, A., Nishida, M. Mitochondrial DNA Sequence Divergence between
Two Cryptic Species of Lethenteron, with Reference to an Improved Identification
Technique. J. Fish. Biol., 2003, 62(3), 591 609.
Yonekawa, H., Moriwaki, K., Gotoh, O. et al. Evolutionary Relationships among Five
Subspecies of Mus musculus Based on Restriction Enzyme Cleavage Patterns of
Mitochondrial DNA. Genetics, 1981, 98, 801 816.
Yonekawa, H., Tsuda, K., Tsuchia, K. et al. Genetic Diversity, Geographic Distribution and
Evolutionary Relationships of Mus musculus Subspecies Based on Polymorphism of
Mitochondrial DNA. Problems of Evolution, In: Kryukov, A.P., Yakimenko, L.V., Eds.
Vladivostok: Dalnauka, 2000, 90 108.
Zheng, X., Arbogast, B.S., Kenagy, G.J. Historical Demography and Genetic Structure of
Sister Species: Deer mice (Peromiscus) in the North American Temperate Rain Forest.
Mol. Ecol., 2003, 12, 711 724.
Zhimulev, I.F. Obshchaya i molekulyarnaya genetika (General and Molecular Genetics).
Novosibirsk: Novosib. Univ., 2002.
Zhivotovsky, L.A. Statistical Methods of Analyzing Gene Frequencies in Natural
Populations, Itogi Nauki Tekhn.: Obshch. Genet., 1983, 8, 76 104.
Zouros, E., Singh, S.M., Miles, H.E. Growth Rate in Oysters: An Overdominant Phenotype
and Its Possible Explanations. Evolution, 1980, 34(5), 856 867.
Zouros, E. On the Relation between Heterozygosity and Heterosis: An Evaluation of the
Evidence from Marine Mollusks Isozymes. Curr. Topics Biol. Med. Res., 1987, 15, 255
270.
Zouros, E., Folts, D.W. The Use of Allelic Isozyme Variation for the Study of Heterosis:
Isozymes, Curr. Topics Biol. Med. Res., 1987, 13, 1 59.

A Short Review

I have read this paper and have made some minor editorial suggestions. These were
mostly to improve the English style of the paper, although the original was quite reasonable.
The paper analyses nucleotide sequence divergence for two mitochondrial genes, Cyt-b
and CO-I, in large numbers of invertebrate and vertebrate species. It shows that while there
are significant differences in mean values for these two genes, increasing taxonomic rank
increases divergence for both genes. These analyses are done both ably and carefully. These
are the central analyses of this paper, which are then placed into context of evolutionary
theory, and especially modes of speciation. The author describes seven different modes, three
of which he categorises as Divergence modes, and four as Transilience modes. The bulk of
the genetic data appear to be consistent with expectations of one of the Divergence modes,
Adaptive Divergence. The author thoughtfully and indeed philosophically describes how
genetic sequence data can help to distinguish among these various speciation models, making
use of and describing species data sets that are particularly well-worked and informative.
However, arguably these different models could do with some further elaboration in the text
rather than being largely restricted to figures.
This is a significant step forward in our molecular understanding of modes, origins and
outcomes of speciation events, and certainly warrants publication.

Dr Robert D Ward

Principal Research Scientist
CSIRO Marine and Atmospheric Research
GPO Box 1538
Hobart, Tasmania 7001, Australia.
bob.ward@csiro.au

Chapter 2

Chromosomal Variability and the Origin
of Citrus Species

Marcelo Guerra
Department of Botany, Federal University of Pernambuco
50.670-420 Recife, PE, Brazil

Abstract

The genus Citrus includes some of the most important crop plants in the world
although its taxonomy remains one of the most controversial among angiosperms. Most
species are of hybrid origin and some of them may include germplasm from other genera.
Cytologically, Citrus species are characterized by a stable chromosome number and a
highly variable pattern of heterochromatic bands. Most accessions display heteromorphic
chromosome pairs, suggesting that they originated by cross hybridization. On the
other hand, citron (C. medica), pummelo (C. maxima), a few mandarin accessions, and
most wild Citrus species and related genera exhibit chromosome pairs that are
homomorphic for similar heterochromatic bands. Based on these findings, hybrids and
non-hybrid accessions were identified and the possible origin and relationship among
most accessions were reconsidered.

Introduction

The genus Citrus belongs to the family Rutaceae, subfamily Aurantioideae, and
originated in southern China and north-eastern India. It comprises some of the most important
fruit crops in the world, including sweet oranges, mandarins, lemons, and several other citrus
fruits. However, its taxonomy remains one of the most controversial among all angiosperms
(reviewed by Nicolosi, 2007). The two main classification systems differ largely in
the number of species accepted: 16 species according to Walter T. Swingle and 157 species
according to Tyôzaburô Tanaka (revised by Swingle and Reece, 1967). The correspondence
between the species names of the two systems was detailed by Blondel (1978). The genus was
subdivided by Swingle into the subgenus Papeda, a basal group of species, and the subgenus
Citrus, which includes all commercially important species, whereas the subdivision of
Tanaka's system is more complex and less accepted. Barrett and Rhodes (1976), based on a
phylogenetic analysis of 146 morphological and biochemical traits of 43 accessions,
concluded that there exist only three cultivated species [C. reticulata Blanco, C. medica L.
and C. maxima (Burm.) Merrill, formerly referred to as C. grandis (L.) Osbeck], whereas the
remaining accessions were hybrids. Scora (1975), based on biochemical studies of several
species, had previously argued for the existence of the same three species plus a fourth
one, C. halimii Stone, but no further support for the latter has been reported. In spite of the
evidence largely favouring a small number of species, Tanaka's nomenclature remains by
far the most widely used.
The taxonomic conflict concerning the Citrus species has been attributed to several
factors, mainly the intense apomictic reproduction by nucellar embryony; interspecific and
intergeneric cross compatibility, resulting in many natural and artificial hybrids that are
highly vigorous and fertile; cultivar selection, often from spontaneous somatic mutations; and
a long history of cultivation (Cameron and Frost, 1968; Davies and Albrigo, 1994; Moore,
2001). As an additional complication, many Citrus-related genera, like Poncirus, Fortunella,
and Microcitrus, are also able to cross with the Citrus species and produce fertile
descendants. It is not surprising that the circumscription of each species has become almost
entirely blurred. Stebbins (1969) concluded that a biosystematic classification of Citrus is
impractical in the foreseeable future.
More recently, several authors have tried to clarify the relationships between Citrus
accessions, using mainly isozymes and molecular markers (Herrero et al., 1996; Machado et
al., 1996; Federici et al., 1998; Nicolosi et al., 2000; Li et al., 2007). Although they seem to
agree that C. maxima, C. medica, and C. reticulata may be the basic species of the group
(Moore, 2001), the relationships between the many accessions remain confusing. In this
work, I will summarize the contribution of cytogenetic studies to the understanding of the
variation and evolution of the Citrus species.

General Chromosome Characteristics
of Citrus Species

Extensive chromosome analyses have shown that the members of subfamily
Aurantioideae possess a stable chromosome number of 2n=18, with the exception of a few
natural polyploids. Conventional chromosome staining with Feulgen, Giemsa or acetic dyes
has revealed small chromosomes, very similar in size and morphology and rare chromosome
markers within or between karyotypes. We analysed 51 accessions of citrus plants with
conventional chromosome staining and found that karyotype differentiation was
restricted to the number and position of secondary constrictions (Guerra et al., 1997).
During prometaphase one can observe the presence of numerous heteropycnotic blocks,
which seem to distinguish most chromosomes of each karyotype (Guerra, 1985; Befu et al.,
2000). However, such heteropycnotic patterns are greatly dependent upon chromosome
condensation and do not offer a reliable karyotype characterization. The chromosome size
ranges between 2 and 4 µm, although fluorochrome-stained chromosomes seem to be a bit
larger than conventionally stained chromosomes (compare, for example, Agarwal, 1987, and
Befu et al., 2002).
C-banding analysis has permitted the differentiation of heterochromatic blocks in
metaphase chromosomes, but because of the harsh treatment involved in C-banding methods
and the small size of the chromosomes and bands of Citrus species, the banding pattern is not
always clear (Guerra, 1985; Liang, 1988; Wei et al., 1988). Reliable karyotype differentiation
of Citrus species was first observed after double staining with the fluorochromes
chromomycin A3 (CMA) and 4′,6-diamidino-2-phenylindole (DAPI) (Guerra, 1993). These fluorochromes
preferentially bind to DNA rich in guanine-cytosine or adenine-thymine base pairs,
respectively, allowing for easy identification of heterochromatic regions richer in one of these
base pairs (Schweizer, 1976).
After CMA/DAPI double staining, the chromosomes of all Citrus species exhibit many
CMA-bright, DAPI-dull bands (CMA+/DAPI− bands), mainly located at the terminal region
of long arms. Secondary constrictions (also identified as nucleolus organizing regions,
NORs) are usually deeply stained with CMA, but sometimes they stain weakly or neutrally
with CMA and are better recognized as DAPI− bands (Figure 1). No DAPI+ bands are
observed, although some chromocentres in the interphase nuclei are more intensely stained
with DAPI than with CMA.

Figure 1. Metaphase cells of Citrus species stained with DAPI and CMA showing CMA+/DAPI− bands,
mainly at the terminal region of long arms. a-c, C. sinensis metaphase stained with CMA (a), the same
cell stained with DAPI (b), and CMA/DAPI merged images (c). d-f, C. reticulata cv. Cravo stained with
CMA (d) and DAPI (e), and CMA/DAPI merged images of C. reticulata cv. Ponkan (f). Arrows in a-c
point to the proximal band of B chromosomes. Arrow in d points to a satellite with a distended
secondary constriction of a D chromosome. Photos: a-c, André Marques; d-f, Ana Paula de Moraes.
These CMA− chromocentres should represent another kind of heterochromatin, probably
corresponding to some proximal heteropycnotic blocks observed by conventional staining
(Guerra, 1985, 1987) or to the proximal bands revealed by Ito et al. (1993) in most
chromosomes of C. sinensis using an HKG banding technique. On the other hand, DAPI
staining allows a clearer observation of chromosome morphology and is often very useful for
identifying some chromosomes.
Fluorescent in situ hybridization (FISH) with 5S and 45S rDNA probes adds specific
chromosome marks which help to distinguish some chromosomes in several plant genera
(see, for example, Vaio et al., 2005; Almeida et al., 2006; Baeza et al., 2007). In Citrus, the
sequential staining with CMA/DAPI and FISH with 5S and 45S rDNA probes has
contributed to a much finer chromosome differentiation and karyotype comparison (Carvalho
et al., 2005; Moraes et al., 2007a,b). FISH with telomeric DNA and microsatellite probes
was also investigated in Citrus chromosomes (Matsuyama et al., 1996, 1999), but it did
not improve the karyotype differentiation. Kang et al. (2008) hybridized in situ the main
satellite DNA sequence of Citrus and a 45S rDNA probe against metaphases of 13 Citrus
species. The satellite sequence hybridized exclusively on terminal regions, probably
corresponding to the CMA+ bands detected by other authors.

Chromosome Types Found in Citrus Species

Analysis of CMA banding in many Citrus species revealed the presence of at least eight
different chromosome types (reviewed by Carvalho et al., 2005). According to the number
and position of bands they were designated as follows (see Figure 2): type A (two terminal and
one proximal band), type B (one terminal and one proximal band), type C (two terminal bands),
type D (a single terminal band), type E (an interstitial band), type F (without bands), type FL
(the largest F chromosome), and type G (a terminal and a subterminal band in the long
chromosome arm). Some authors have used different designations for some of these types.
However, to make the different karyotype formulae easier to compare, the
chromosome types described by other authors are here converted to this nomenclature.
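
The band-based definitions above can be written down as a small lookup. The following is only an illustrative sketch, not part of the published method: the names TYPE_BANDS and classify are hypothetical, band positions are reduced to simple labels, and the sketch ignores which arm carries a band (so the long-arm requirement of type G is not checked). Type FL is left out because it is recognized by chromosome length rather than by its bands.

```python
from collections import Counter

# Band positions for each chromosome type, as listed in the text.
# Hypothetical encoding: one label per CMA+ band, arm position ignored.
TYPE_BANDS = {
    "A": ("terminal", "terminal", "proximal"),
    "B": ("terminal", "proximal"),
    "C": ("terminal", "terminal"),
    "D": ("terminal",),
    "E": ("interstitial",),
    "F": (),
    "G": ("terminal", "subterminal"),
}


def classify(bands):
    """Return the type letter whose band set matches `bands`, or None."""
    observed = Counter(bands)
    for type_name, pattern in TYPE_BANDS.items():
        if Counter(pattern) == observed:
            return type_name
    return None
```

For example, classify(["proximal", "terminal"]) returns "B" regardless of the order in which the bands are listed, and a band set that matches none of the described types returns None.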

Figure 2. Chromosome types observed in Citrus species, according to Carvalho et al. (2005).
The chromosome type FL was distinguished from other F chromosomes because it is
longer and easily identified in all accessions of Citrus and related genera analyzed so far
(Guerra et al., 2000). In some accessions the FL chromosome may exhibit a fine
terminal or subterminal band, in the homozygous or heterozygous condition. For example,
C. sunki and C. reshni, two closely related wild species, have an identical karyotype, differing
only in the small terminal band of the FL chromosomes, which is larger in C. sunki and weaker
or absent in C. reshni (Cornélio et al., 2003). On the other hand, in some hybrid accessions,
such as limes and lemons, the FL pair is heteromorphic for the presence of this fine terminal
band (Carvalho et al., 2005). Since the FL chromosome with the fine terminal band (FL+) is
homologous to that without a band (FL0), as observed in some accessions with both FL
subtypes, it was characterized as an F chromosome type, in spite of having a terminal band.
Based on the number of different chromosome types, most accessions can be
characterized by a single karyotype formula. In general, we use the simplest karyotype
formula, avoiding details about the subtypes of FL or the presence or absence of rDNA sites.
For instance, the karyotype formula of C. sinensis can be written as 2B + 2C + 7D + 7F,
although this does not indicate the variation in chromosome size, band size, subtypes of FL,
position of rDNA sites, and other karyotype details characteristic of the species (see Guerra,
1993; Cornélio et al., 2003). Analyses of the karyotype formulae of many accessions showed
that each Citrus species displays several chromosomes of types D and F but only a small
number of types A, B or C. Therefore, the number of A, B, and C chromosome types is usually
the most important feature to characterize the accessions.
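
A karyotype formula of this simple kind is easy to check mechanically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, only the simple type letters A to G are accepted (FL subtypes and rDNA annotations are deliberately not parsed), and the expected total of 18 comes from the 2n = 18 chromosome number reported for the group.

```python
import re


def parse_formula(formula):
    """Parse a formula such as '2B + 2C + 7D + 7F' into a dict of counts."""
    counts = {}
    for term in formula.split("+"):
        match = re.fullmatch(r"\s*(\d+)\s*([A-G])\s*", term)
        if match is None:
            raise ValueError(f"unrecognised term: {term!r}")
        number, chrom_type = match.groups()
        counts[chrom_type] = counts.get(chrom_type, 0) + int(number)
    return counts


def is_diploid_complement(formula, expected=18):
    """Check that the chromosome counts add up to the diploid number."""
    return sum(parse_formula(formula).values()) == expected
```

Applied to the C. sinensis formula, parse_formula("2B + 2C + 7D + 7F") gives two B, two C, seven D and seven F chromosomes, which add up to the expected 18.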
The main problem with the identification of karyotype formulae based on chromosome
types lies in the detection of small or weak CMA bands, which may or may not be identified
depending on the staining technique used. The best CMA banding differentiation is obtained
after simultaneous staining with CMA and DAPI (Guerra, 1993; Matsuyama, 1996; Miranda
et al., 1997a), although CMA bands have also been revealed by a combination of CMA plus
quinacrine mustard or distamycin A (Befu et al., 2000) or by CMA only (Yamamoto and
Tominaga, 2003). In CMA/DAPI double staining, DAPI plays two important roles in the
differentiation of CMA bands: on the one hand, it intensifies the contrast between GC-rich
CMA bands and the euchromatin; on the other hand, it allows the CMA+ bands to be observed
as DAPI-negative (DAPI−) bands. A good contrast between heterochromatin and euchromatin
is especially important for the smallest or weakest CMA+ bands, such as the small band
of the FL+ chromosomes. Nevertheless, even with CMA/DAPI a small band sometimes goes
undetected, changing the karyotype formula significantly. For example, Cornélio et al. (2003)
described the karyotype formula of C. aurantium as 1A + 1B + 1C + 7D + 8F. Later, we
detected in some cultivars of this species a small terminal band in a D chromosome which
had not been previously detected (unpublished data), changing the formula to 1A + 1B + 1C
+ 8D + 7F (see also Befu et al., 2002).
In Citrus species, the least conspicuous CMA+ bands are those associated with the 45S
rDNA, whose reaction with this fluorochrome is quite variable. Moraes et al. (2007a,b)
observed that CMA+ bands co-localized with 45S rDNA show different brightness intensities,
from positive to neutral, even on the same slide, but they were always unequivocally
distinguished as DAPI− bands. The weak staining of the proximal CMA bands has been
reported since the first analysis of CMA bands in Citrus species (Guerra, 1993). Miranda et
al. (1997a) also reported that proximal bands of five Citrus and two Fortunella species were
CMA positive or CMA neutral. In most cells, the small proximal CMA+ band of B
chromosomes is better observed as DAPI− than as CMA+ (Cornélio et al., 2003; Moraes et
al., 2007a).

Chromosome Types and rDNA Sites

In most plant species the 45S rDNA sites co-localize with CMA+ bands, although not all
CMA+ bands co-localize with 45S rDNA sites (see, e.g., Almeida et al., 2006). The proximal
CMA+ bands of chromosome types A and B of Citrus and Poncirus species have always been
reported to co-localize with 45S rDNA sites (Matsuyama et al., 1996; Miranda et al., 1997b;
Roose et al., 1998; Pedrosa et al., 2000; Carvalho et al., 2005; Brasileiro-Vidal et al., 2007;
Moraes et al., 2007a,b). Other 45S rDNA sites were found co-localizing with CMA+ bands on
one or two D chromosomes in most karyotypes of the Citrus species investigated. They also
co-localized with the CMA+ band of two D chromosomes of Poncirus trifoliata and two C
chromosomes of Fortunella crassifolia (Miranda et al., 1997b). On the other hand, the 5S
rDNA sites were found on B or D chromosomes adjacent to the 45S rDNA sites (but not
co-localized with CMA+ bands) or in the euchromatic region of D or F chromosomes (Pedrosa
et al., 2000; Carvalho et al., 2005; Moraes et al., 2007a,b). In P. trifoliata, the 5S rDNA sites
were always closely linked to the 45S rDNA site, both in one D and in two B chromosome
pairs (Roose et al., 1998; Brasileiro-Vidal et al., 2007). In representatives of several other
Citrus-related genera, such as Severinia and Murraya, the two different rDNA sites are also
linked to each other. Figure 3 illustrates the distribution of 5S and 45S rDNA sites and CMA+
bands in two Citrus species and in Murraya paniculata.

Figure 3. Distribution of 5S (red signals) and 45S rDNA sites (green signals) in Citrus species and in a
related genus. a,b, Metaphase of C. keraji sequentially stained with CMA/DAPI (a) and FISH showing
the rDNA sites (b); c,d, Chromosomes of Murraya paniculata stained with CMA (c) and FISH (d); e,f,
Metaphase of C. paradisi stained with CMA/DAPI (e) and FISH (f). Two F chromosomes are partially
superimposed in e, f (arrowhead). Observe the co-localization of CMA+ bands with 45S rDNA sites but
not with 5S rDNA ones. Arrows point to the small 5S rDNA sites. Photos: a,b,e,f, Ana Paula de
Moraes; c,d, Ana Emilia Barros e Silva.
Chromosome types bearing 5S and 45S rDNA sites should be distinguished from similar
types without rDNA sites. Thus, we can distinguish four types of D chromosomes: D
(without rDNA sites), D/5S (bearing a 5S rDNA site far from the CMA+ band), D/45S (bearing
a 45S rDNA site co-localized with a CMA+ band), and D/5S-45S (bearing both sites closely
associated). Likewise, B, C and F chromosomes can be divided into the subtypes B, B/45S,
B/5S-45S, C, C/45S, F, and F/5S. The A chromosome type does not need to be divided into
subtypes, since all A chromosomes analyzed so far have a proximal 45S rDNA site.

Karyotype Diversity and Chromosome Homology

The Citrus species of Tanaka's system exhibit a wide karyotype diversification after
CMA banding. Table 1 summarizes the karyotype formulae of some important Citrus
accessions, among more than 40 species and 60 accessions already analyzed with CMA.

Table 1. Karyotype formula of some species of Citrus and Poncirus trifoliata

| Species | Common name | Karyotype formula | Chromosome subtypes | Ref.* |
|---|---|---|---|---|
| Citrus aurantifolia Swingl. | Mexican lime | 2B + 10D + 5F + 1G | 2B/45S + 1D/45S + 1D/5S-45S + 1FL0 | a |
| C. clementina hort. ex Tanaka | Clementine mandarin | 1B + 1C + 11D + 5F | 1B/45S + 2D/5S-45S + 2FL0 | b |
| C. deliciosa Ten. cv. Comum | Mediterranean mandarin | 2C + 10D + 6F | 2D/5S-45S + 2FL0 | b |
| C. jambhiri Lush. cv. Mazoe | rough lemon | 1B + 11D + 6F | 1B/45S + 1D/5S + 1D/5S-45S + 2FL0 | a |
| C. limettioides Tanaka | Palestine sweet lime | 2B + 10D + 6F | 2B/45S + 1D/5S + 1D/5S-45S + 1FL0 + 1FL+ | a |
| C. limon (L.) Burm. cv. Lisboa | lemon | 1B + 1C + 9D + 7F | 1B/45S + 1D/5S + 1D/5S-45S + 1FL0 + 1FL+ | a |
| C. maxima (Burm.) Merrill cv. Israel | pummelo | 4A + 2C + 4D + 8F | 2F/5S + 2FL0 | c |
| C. medica L. cv. Ethrog | citron | 2B + 8D + 8F | 2B/45S + 2D/5S + 2FL0 | a |
| C. paradisi Macf. cv. Marsh | grapefruit | 2A + 1B + 3C + 5D + 7F | 1B/45S + 1D/45S + 2FL0 | c |
| C. reshni hort. ex Tanaka | Cleopatra mandarin | 14D + 4F | 2D/5S-45S + 2FL+/0 | b |
| C. reticulata Blanco cv. Cravo | cravo mandarin | 2C + 10D + 6F | 2D/5S-45S + 2FL0 | b |
| C. sinensis (L.) Osbeck cv. Pêra | sweet orange | 2B + 2C + 7D + 7F | 2B/45S + 1D/5S-45S + 1F/5S + 2FL0 | d |
| C. sunki (Hayata) hort. ex Tanaka | sunki mandarin | 14D + 4F | 2D/5S-45S + 2FL+ | b |
| C. tachibana (Makino) Tanaka | tachibana | 1B + 1C + 10D + 3E + 3F | 1B/45S + 2D/5S-45S + 2FL0 | b |
| C. unshiu Marcow | satsuma mandarin | 1B + 2C + 10D + 5F | 1B/45S + 2D/5S-45S + 2FL0 | b |
| Poncirus trifoliata (L.) Raf. cv. Pomeroy | trifoliate orange | 4B + 8D + 6F | 4B/5S-45S + 2D/5S-45S + 2FL0 | e |

* References: a, Carvalho et al., 2005; b, Moraes et al., 2007a; c, Moraes et al., 2007b; d, Pedrosa et al.,
2000; e, Brasileiro-Vidal et al., 2007.
Figure 4 shows the idiogrammatic representation of four homozygous species where each
chromosome pair is represented by a single chromosome. In the vast majority of hybrid
accessions it is difficult to identify the chromosome pairs, since we do not know the
homology between identical chromosome types from different species. For example, a cross
between C. maxima (4A + 2C + 4D + 8F) and C. sunki (14D + 4F) will produce a hybrid with
the karyotype formula 2A + 1C + 9D + 6F. Based only on the karyotype formula of the
hybrid, we would say that it has a single heteromorphic pair (1C/1D), while based on the
karyotype formulae of the parents we could assume that it has two chromosome pairs in
homozygosis (2D + 2F) and seven others in heterozygosis. The two A chromosomes of this
hybrid are non-homologous and they would certainly be homeologous to D or F
chromosomes of C. sunki. This means that chromosomes of the same type observed in different
species are not necessarily homeologous. Therefore, the heterozygosity level cytologically
detected in many hybrids is probably largely underestimated.
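
The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both parents are treated as structurally homozygous, so each gamete carries exactly half of every chromosome type, and the helper names (parse_formula, gamete, format_formula, min_heteromorphic_pairs) are hypothetical, not part of any published tool. The second function reproduces the estimate made from the hybrid formula alone, which is a lower bound for the reasons given above.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Turn a karyotype formula such as '4A + 2C + 4D + 8F' into type counts."""
    counts = Counter()
    for number, chrom_type in re.findall(r"(\d+)\s*([A-G])", formula):
        counts[chrom_type] += int(number)
    return counts


def gamete(parent: Counter) -> Counter:
    """Haploid set of a structurally homozygous parent: half of each type."""
    if any(n % 2 for n in parent.values()):
        raise ValueError("parent is not homozygous for chromosome types")
    return Counter({t: n // 2 for t, n in parent.items()})


def format_formula(counts: Counter) -> str:
    """Write type counts back as a karyotype formula, types in alphabetical order."""
    return " + ".join(f"{counts[t]}{t}" for t in sorted(counts) if counts[t] > 0)


def min_heteromorphic_pairs(counts: Counter) -> int:
    """Lower bound inferred from the formula alone: every type present in an
    odd number must be paired with a chromosome of another type."""
    return sum(n % 2 for n in counts.values()) // 2


maxima = parse_formula("4A + 2C + 4D + 8F")   # C. maxima, Table 1
sunki = parse_formula("14D + 4F")             # C. sunki, Table 1
hybrid = gamete(maxima) + gamete(sunki)

print(format_formula(hybrid))            # 2A + 1C + 9D + 6F
print(min_heteromorphic_pairs(hybrid))   # 1 (the 1C/1D pair)
```

The formula-only count of one heteromorphic pair is what a cytologist would read from the hybrid without knowing its parents; the sketch makes no attempt to decide which chromosomes are truly homologous.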
A reliable identification of chromosome homology or homeology, in this case, is only
possible by meiotic pairing analysis or chromosome mapping of single copy sequences by
FISH. Analysis of meiotic pairing with CMA/DAPI, aiming to identify the individual
chromosome types involved in each meiotic bivalent, is rather complicated. Because most
CMA⁺ bands are terminally located, it is difficult to say whether a band of a bivalent belongs
to one chromosome or to the other. Figure 5 a,b shows a diplotene stage cell of Murcott mandarin
apparently displaying the following pairing: AD + BD + 4 DD + 3 FF. On the other hand,
a chromosome map based on BAC-FISH is not yet available for Citrus species. Recently, Moraes
et al. (2008) localized by FISH a set of 13 bacterial artificial chromosomes (BACs)
containing fragments of the Poncirus trifoliata genome on mitotic metaphases of this species and
identified molecular markers for seven out of its nine chromosome pairs. Figure 5 c,d shows
one of these BACs hybridized on the short arm of a B chromosome pair. In situ hybridization
of the same BAC set from P. trifoliata on chromosomes of C. medica revealed that some
BACs hybridized on the same chromosome types in both species whereas others hybridized
on different chromosome types (Sandra Mendes, personal communication). Once again we
see that homeologous chromosomes from different species may present different band
patterns. Therefore, chromosome types are good karyotype markers but they are not a reliable
indication of homeology between species.

Figure 4. Idiograms showing only the distribution of CMA⁺ bands in homozygous Citrus species.
Individual chromosome morphology is not represented.

Figure 5. Different approaches to identify chromosome homologies within Citrus species. a,b, Meiotic
cell of Murcott mandarin in diplotene stage stained with CMA/DAPI (a) and interpretative drawing of
the nine bivalents of the same cell (b); c,d, Identification of an individual chromosome pair of Poncirus
trifoliata using BAC-FISH (arrows). One of the two B chromosome pairs has a single copy DNA
sequence detected by FISH using the BAC 28A07 as probe (red signals in d). Photos: a,b, Filipe
Felinto; c,d, Ana Paula de Moraes.

The Karyotypes of Citron, Lemons and Limes

In this group of species, only C. medica (citron) exhibits an entirely homozygous
karyotype with 2B + 8D + 8F (Carvalho et al., 2005; Befu et al., 2001; Yamamoto et al.,
2007). Analysis of the distribution of rDNA sites (Carvalho et al., 2005) in C. medica has
further confirmed its structural homozygosity, supporting the assumption that it is one of the
true citrus species (Scora, 1975; Barrett and Rhodes, 1976).
Four cultivars of true lemon (C. limon) analysed by Carvalho et al. (2005) showed
identical karyotype formulae (1B + 1C + 9D + 7F), displaying three chromosome pairs
heteromorphic for CMA⁺ bands and two pairs heteromorphic for rDNA sites. The
conservation of such a complex heteromorphism indicates that these accessions descend from
a single hybrid which was subsequently asexually propagated and diversified into several
cultivars by gene mutation.
The closely related lemon species C. volkameriana (Volkamer lemon), C. limonia
(Rangpur lemon) and C. jambhiri (rough lemon) display very similar karyotypes and were
also heteromorphic for the same pairs of chromosome types (1B + 11D + 6F). They diverged
slightly from C. limon because this species has a C chromosome absent in the other lemons
(see also Yamamoto et al., 2007). All these species display some chromosomes with a single
subterminal CMA⁺ band (indicated by arrows in Carvalho et al., 2005, figure 3), which could
be better characterized as E chromosomes. However, because these bands were very near to
the chromosome end, they have been referred to as D chromosomes. All these lemons are
clearly related to the completely homozygous C. medica (citron), which also has a
chromosome pair with a subterminal CMA⁺ band (see Carvalho et al., 2005, figure 3j).
Note in Figure 4 that an entire haploid chromosome set of C. medica is represented in
C. limon (see Carvalho et al., 2005, for a detailed comparison with other lemons and limes).
These data support the assumption that lemons are hybrids derived from crosses between C.
medica and other species. Other evidence indicates that C. aurantifolia could be the
other parent (Nicolosi et al., 2000; Moore, 2001).
The limes, C. aurantifolia and C. limettioides, are strongly heterozygous and
karyotypically related to each other. Like the lemons, their karyotypes include an entire
haploid complement of citron, supporting the hypothesis that C. medica is one of their
ancestors (Federici et al., 1998; Nicolosi et al., 2000; Carvalho et al., 2005).
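
The statement that lemon and lime karyotypes include an entire haploid complement of citron can be checked directly against the formulae in Table 1. The following Python sketch is a minimal illustration: it compares chromosome types only, so it says nothing about true homology, and the names parse_formula, format_formula, contained and remainders are hypothetical helpers introduced here.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Turn a karyotype formula such as '2B + 8D + 8F' into type counts."""
    counts = Counter()
    for number, chrom_type in re.findall(r"(\d+)\s*([A-G])", formula):
        counts[chrom_type] += int(number)
    return counts


def format_formula(counts: Counter) -> str:
    """Write type counts back as a karyotype formula, types in alphabetical order."""
    return " + ".join(f"{counts[t]}{t}" for t in sorted(counts) if counts[t] > 0)


# Haploid set of the homozygous citron (2B + 8D + 8F): 1B + 4D + 4F.
citron_haploid = Counter(
    {t: n // 2 for t, n in parse_formula("2B + 8D + 8F").items()}
)

# Lemons and limes with formulae taken from Table 1.
accessions = {
    "C. limon": "1B + 1C + 9D + 7F",
    "C. jambhiri": "1B + 11D + 6F",
    "C. aurantifolia": "2B + 10D + 5F + 1G",
    "C. limettioides": "2B + 10D + 6F",
}

contained = {}
remainders = {}
for name, formula in accessions.items():
    karyotype = parse_formula(formula)
    # The citron set is "represented" if every type is present at least as often.
    contained[name] = all(karyotype[t] >= n for t, n in citron_haploid.items())
    # Chromosome types left over once the citron set is set aside.
    remainders[name] = format_formula(karyotype - citron_haploid)
    print(name, contained[name], remainders[name])
```

For C. limon the check succeeds and the remaining nine chromosomes are 1C + 5D + 3F. Such a remainder is only a rough indication of what the other parent might have contributed, since chromosomes of the same type are not necessarily homeologous between species.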

The Karyotypes of Mandarins

The most surprising and taxonomically controversial group of Citrus species is that of the
mandarins. According to Tanaka's classification system, there are 36 mandarin species,
whereas Swingle recognised only three species (Swingle and Reece, 1967). Phenotypically,
they are a highly diversified group, including both monoembryonic and polyembryonic, self-
fertile and self-incompatible cultivars. Most of them are supposed to be interspecific hybrids
whereas others seem to be true species (Hodgson, 1967; Swingle and Reece, 1967; Barrett
and Rhodes, 1976). Nevertheless, a remarkable similarity among mandarin cultivars has been
reported at the isoenzymatic and molecular level (Esen and Scora, 1977; Machado et al.,
1996; Coletta Filho et al., 1998; Li et al., 2007).
Karyologically, the mandarins are highly diversified, including many heteromorphic and
some homomorphic cultivars. Among the accessions analysed by Cornélio et al. (2003) and
Moraes et al. (2007a) there are four distinct groups: I) the wild species C. sunki (Sunki) and
C. reshni (Cleopatra), with the simplest karyotype formula (14D + 4F); II) the
Mediterranean mandarins (C. deliciosa cv. Rio, Montenegrina and Comum) plus C. reticulata
Cravo, with a homomorphic karyotype (2C + 10D + 6F), and C. tangerina Dancy, which
has a heteromorphic pair but displays the same chromosome types as the accessions of this
group (1C + 11D + 6F); III) most heterozygous accessions, with B chromosomes and without
A chromosomes, including C. amblycarpa, C. clementina (Clementine de Nules), C.
depressa (Shiikuwash), C. reticulata (Batangas, Cravo, Oneco, Ponkan), C.
tachibana, C. unshiu, and the tangelos (C. paradisi x C. tangerina) Page and Orlando; IV)
accessions having A chromosomes, which include only C. nobilis (King) and Murcott (C.
sinensis x C. reticulata).
In situ hybridization with 5S and 45S rDNA probes confirmed that all species of the first
two groups were homozygous in relation to the number and position of rDNA sites, including
Dancy. Handa et al. (1986) also observed that Dancy is closer to the
Mediterranean mandarins, sharing a similar rubisco subunit. All these cultivars exhibited a
pair of D/5S-45S chromosomes, whereas the cultivars of group III had a pair of D/5S-45S
and a single B/45S. Murcott (group IV) had a highly heterozygous and complex karyotype,
bearing 1A + 1B/45S + 1D/5S-45S + 9D + 5F + 1F/5S (Moraes et al., 2007a).
Yamamoto and Tominaga (2003) investigated 17 accessions of mandarins with CMA and
observed the same karyotype formulae for C. sunki and C. reshni described above.
Concerning the accessions of group II, they found the formula 1C + 10D + 7F for Dancy
and the Mediterranean mandarin Tardivo di Ciaculli. All other accessions were
heterozygous for one or more chromosome pairs, except satsuma mandarin (C. unshiu),
which was homozygous (2C + 8D + 8E) in the cultivar Juman but heterozygous (1A + 1C +
8D + 8E) in the cultivar Okitsu Wase. However, the authors emphasized that one C
chromosome of Juman probably corresponds to the A chromosome of Okitsu Wase.
Indeed, in the metaphase illustrating the karyotype of Juman cultivar there is a C type
chromosome with a proximal secondary constriction weakly stained with CMA, which is
typical for A chromosomes. Therefore, both cultivars seem to have 2A + 8D + 8E.
If homozygous karyotypes, constituted by D and F or C, D, and F chromosomes, are the
original karyotypes of the mandarins, one can suppose that the accessions of group III and
IV, which have some A or B chromosome types, arose from hybridization with other citrus
groups. In this case, only C. sunki, C. reshni, C. deliciosa, and C. reticulata Cravo seem to
be true biological species. It is possible that all edible mandarins are related to C. deliciosa or
a similar accession with the karyotype formula 2C + 10D + 6F, such as C. reticulata cv. Cravo.
However, these two species are not known in the wild and their homozygous karyotype may
have been derived from other true species. On the other hand, if the wild species C. sunki and
C. reshni were involved in the origin of the cultivated mandarins they should have been
hybridized with other citrus species bearing A, B, or C chromosomes, improving the edible
qualities of their acid and small fruits.

The Karyotypes of Pummelos and Grapefruits

The karyotypes of pummelos (C. maxima) and grapefruits (C. paradisi) are characterized
mainly by the presence of A chromosomes, which seems to be a reliable marker for this
group (see Table 1). Befu et al. (2000, 2001, 2002) reported the occurrence of one to three A
chromosomes in the karyotypes of different pummelo cultivars. Similarly, Yang et al. (2002)
found one to three A chromosomes in each one of the 38 seedlings derived from a cross
between two pummelo cultivars. On the other hand, Moraes et al. (2007b) found a single
karyotype formula for Israel and Pink pummelos (4A + 2C + 4D + 8F). A similar formula
was reported by Yamamoto et al. (2007) for Hayasaki pummelo (3A + 3C + 4D + 8F). If
one of these C chromosomes had a weak proximal CMA⁺ band, it would turn out to be an A
chromosome and the karyotype formula would be identical to that of Pink and Israel. The number
of A chromosomes observed may depend on the technique used since some A chromosomes
have one or two faint bands. The presence of the weak proximal band can be better revealed
by in situ hybridization with 45S rDNA probe (Moraes et al., 2007b).
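
The reinterpretation of the Hayasaki formula is a simple bookkeeping step, sketched below in Python. It is an illustration only: the function names are hypothetical, and the sketch merely assumes, as suggested above, that one chromosome scored as type C carries a faint proximal band and is therefore rescored as type A.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Turn a karyotype formula such as '3A + 3C + 4D + 8F' into type counts."""
    counts = Counter()
    for number, chrom_type in re.findall(r"(\d+)\s*([A-G])", formula):
        counts[chrom_type] += int(number)
    return counts


def reclassify(counts: Counter, old: str, new: str, n: int = 1) -> Counter:
    """Rescore n chromosomes of type `old` as type `new`, leaving the
    chromosome number unchanged."""
    if counts[old] < n:
        raise ValueError(f"fewer than {n} chromosomes of type {old}")
    result = Counter(counts)
    result[old] -= n
    result[new] += n
    return +result  # unary plus drops types whose count fell to zero


hayasaki = parse_formula("3A + 3C + 4D + 8F")      # Yamamoto et al. (2007)
israel_pink = parse_formula("4A + 2C + 4D + 8F")   # Moraes et al. (2007b)

rescored = reclassify(hayasaki, old="C", new="A")
print(rescored == israel_pink)   # True
```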
The presence of three or four A chromosomes, as observed in pummelos, has never been
reported in any other Citrus species. Grapefruits, which are derived from a cross between
pummelo and sweet orange (Gmitter, 1995), have two non-homologous A chromosomes, as
expected (Moraes et al., 2007b). Likewise, it is quite possible that all citrus accessions
bearing A chromosomes, including some mandarin, lime and lemon accessions, are
directly or indirectly derived from hybridizations with pummelos. Pummelos are also
characterized by the exclusive presence of a pair of F/5S chromosomes. So far, this
chromosome type has been found only in pummelos, grapefruits, sweet orange, tangelo Orlando,
and the tangor Murcott. All these hybrid accessions are derived from crosses with C. maxima
and all of them display a single F/5S chromosome (Pedrosa et al., 2000; Moraes et al.,
2007b). On the other hand, secondary hybrids of pummelos do not necessarily have A or
F/5S chromosomes, due to the random segregation of non-homologous chromosomes in F1
hybrids. For example, neither of the two A chromosomes of grapefruit was transmitted to the
tangelo Orlando (Moraes et al., 2007b).
Although the literature suggests that all grapefruit cultivars are apomictic clones derived
from a single hybrid between C. sinensis and C. maxima (Barrett and Rhodes, 1976), this
assumption is controversial (Bowman and Gmitter, 1990). Karyologically, Moraes et al.
(2007b) have demonstrated that there are at least two different grapefruit cytotypes. The
grapefruits Duncan and Foster have the karyotype formula 2A + 1B + 2C + 6D + 7F,
whereas Flame, Henderson, Marsh, and Rio Red have 3C + 5D instead of 2C + 6D.
Most importantly, the D chromosomes are not identical in these two groups: the former has 5D
+ 1D/5S-45S while the latter has 3D + 1D/5S-45S + 1D/45S. Therefore, grapefruit cultivars
are not derived from a single hybrid; there are at least two different lineages.

The Karyotypes of Sweet Orange and Sour Orange

In spite of the chromosomal variability among Citrus species, the analysis of fourteen
different sweet orange (C. sinensis) cultivars has revealed a heterozygous karyotype (2B +
2C + 7D + 7F) strictly conserved in all samples investigated (Guerra, 1993; Matsuyama et
al., 1996; Miranda et al., 1997a; Befu et al., 2000; Pedrosa et al., 2000; Cornélio et al., 2003;
Yamamoto et al., 2007). These data suggest that all sweet orange cultivars originated
from a single hybrid which has been asexually propagated for hundreds or thousands of years,
forming different cultivars by gene mutations (sports). This single hybrid probably
originated from a cross between pummelo and mandarin (reviewed by Moore, 2001).
Sour orange (C. aurantium) is another putative hybrid between pummelo and mandarin
(Scora, 1975; Barrett and Rhodes, 1976; Asíns et al., 1998). Supporting this assumption, it
has an A chromosome type which is usually found in C. maxima and related hybrids. The
karyotype reported for sour orange by Cornélio et al. (2003) is highly heteromorphic (1A +
1B + 1C + 8D + 7F). Befu et al. (2002) reported a similar formula for another cultivar
of sour orange but they did not detect the A chromosome and described two C
chromosomes instead, perhaps due to confusion between C and A chromosomes. The
occurrence of 1A + 1B + 1C chromosomes was confirmed by Yamamoto et al. (2007) in
another cultivar of C. aurantium. The single seedling of sour orange analysed by Guerra
(1993), with two A chromosomes, was obtained from a botanical garden and was probably a
zygotic embryo.
The hypothesis that C. sinensis and C. aurantium are hybrids between C. maxima and the
polytypic C. reticulata is hardly sustainable on the basis of their karyotype formulae. Since
C. maxima has two pairs of A chromosomes (Moraes et al., 2007b), a hybrid directly
descending from this parent should have at least two A chromosomes (as observed in
grapefruits), which are not found in these species. On the other hand, C. sinensis has two
different B chromosomes, whereas the mandarin accessions investigated so far have no
more than one pair of B chromosomes and pummelos have no B chromosome type.
Therefore, both sweet orange and sour orange may contain germplasm from C. maxima, as
much evidence indicates (Moore, 2001), and most probably they are derived from secondary
hybrids containing germplasm from two or more species (Barrett and Rhodes, 1976).
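
The A chromosome argument can be written out as a small check in Python. This is a sketch under the assumption made in the text, namely that the four A chromosomes of C. maxima form two homologous pairs, so that every regular gamete carries two of them. The helper parse_formula and the variable names are hypothetical, and the formulae are those quoted in this chapter.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Turn a karyotype formula such as '4A + 2C + 4D + 8F' into type counts."""
    counts = Counter()
    for number, chrom_type in re.findall(r"(\d+)\s*([A-G])", formula):
        counts[chrom_type] += int(number)
    return counts


maxima = parse_formula("4A + 2C + 4D + 8F")

# Two pairs of A chromosomes give two A chromosomes per pummelo gamete,
# so a direct (F1) hybrid should show at least this many.
expected_min_a = maxima["A"] // 2

putative_hybrids = {
    "sweet orange": "2B + 2C + 7D + 7F",
    "sour orange": "1A + 1B + 1C + 8D + 7F",
    "grapefruit (Marsh)": "2A + 1B + 3C + 5D + 7F",
}

consistent = {}
for name, formula in putative_hybrids.items():
    observed_a = parse_formula(formula)["A"]
    # True if the accession could be a direct hybrid of C. maxima
    # as far as the number of A chromosomes is concerned.
    consistent[name] = observed_a >= expected_min_a
    print(f"{name}: {observed_a} A observed, {expected_min_a} expected -> {consistent[name]}")
```

Only the grapefruit passes this check, in line with the conclusion that sweet and sour orange are more likely secondary hybrids. The check concerns A chromosomes alone and does not by itself establish parentage.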

The Evolution of Citrus Species

The vast majority of accessions of the subgenus Citrus analysed by CMA banding seem
to represent hybrids with different combinations of chromosome types (see Table 1). The
heteromorphic chromosome pairs are largely a consequence of the intensive hybridization of
a few base species, as also indicated by morphological, isoenzymatic and molecular data
(Scora, 1975; Barrett and Rhodes, 1976; Moore, 2001). Since most species of subgenus
Papeda and other Citrus-related genera were not exposed to this kind of artificial
hybridization, such a high level of heterozygosity is not expected and it has not been
experimentally found by us (Guerra et al., 2000, and unpublished results). Indeed, all other
Citrus-related genera we have investigated exhibited rather homomorphic karyotypes (Guerra
et al., 2000; Brasileiro-Vidal et al., 2007). However, Yamamoto et al. (2007, 2008) reported
heterozygous karyotypes for several species of subgenus Papeda and Citrus-related genera.
The cytological data suggest that among the edible, widely cultivated species of Citrus,
only C. medica, C. maxima, and C. deliciosa, as well as C. reticulata cv. Cravo, are true
biological species. The main features of lemons, limes, grapefruits, sweet orange, and
mandarins, such as the distinct fruit sizes, shapes, peel colours and flavours, are present in these
basic species. Among the investigated non-commercial species of subgenus Citrus, only C.
reshni and C. sunki are undoubtedly homozygous and may represent true biological species.
Although the fruits of these two species are acid and small compared with the large and less
acidic fruits of C. maxima and C. medica, they may have contributed to the origin of a high
number of hybrids with similar fruit and flavor qualities. According to Gmitter and Hu
(1990), C. sunki grows in the wild in the province of Yunnan, southern China, together with
C. medica, C. maxima and some of the most primitive members of Citrus subgenus Papeda.
These authors found a substantial portion of the Citrus gene pool represented in this region
and suggested that the Yunnan area was part of the primitive centre of origin of the modern
Citrus species.
Because most Citrus species are hybrids between a few base species, it is difficult to
organize Tanaka's species into subgroups. Herrero et al. (1996), based on isoenzyme
diversity of 198 cultivars of Citrus and related genera, were able to distinguish two main
groups of cultivated species: the orange-mandarin group and the lime-lemon-citron-pummelo
group. On the other hand, based on the karyotype formulae, one can roughly distinguish four
groups of Citrus species and associated hybrids: the citron group, comprising citrons, limes
and lemons; the pummelo group, including pummelos and grapefruits; the mandarin group,
comprising the wild and most cultivated mandarins; and a group of hybrids, which is made up
of those species whose origin is difficult to ascertain and which probably result from two or
more crosses among representatives of different groups, possibly including germplasm
from Papeda and from other genera (see also Barrett and Rhodes, 1976). In this hybrid group
we can include some mandarin accessions, such as King, Murcott, and some other accessions
described by Yamamoto et al. (2005, 2007), the sour and sweet oranges, and several others.
The most important markers distinguishing germplasm of the first three groups are: the
chromosome type D/5S, which was found in all investigated species from the citron group,
the exclusive presence of 2D/5S-45S among the mandarins, and the A and F/5S
chromosomes characteristic of the pummelo group (Pedrosa et al., 2000; Carvalho et al.,
2005; Moraes et al., 2007a,b). The occurrence of some uncommon chromosome types, like
type E in some lime, lemon and mandarin accessions (Yamamoto and Tominaga, 2003) and
type G in C. aurantifolia, C. latifolia and in Microcitrus (unpublished results), points to a
more complex hybridization history of this genus and should be better investigated.

Acknowledgments

The author is grateful to his students and former students Ana Emilia Barros e Silva, Ana
Paula de Moraes, André Marques, and Filipe Felinto for the photographs presented in this
chapter, to Magdalena Vaio and Ana Emilia for careful text revision, and to Artur Fonseca and
Gustavo Souza for helping with the plates.

References

Almeida C.C.S., Carvalho P.C.L., and Guerra M. (2007) Karyotype differentiation among
Spondias species and the putative hybrid Umbu-cajá (Anacardiaceae). Bot. J. Linn. Soc.
155: 541-547.
Agarwal P. K. (1987) Karyotype of Citrus tamurana (Tan.). Chrom. Inf. Serv. 42: 3-5.
Asíns M. J., Mestre P. F., Herrero R., Navarro L., and Carbonell E. A. (1998) Molecular
markers: a continuously growing biotechnology area to help Citrus improvement. Fruits
53: 293-302.
Baeza C., Schrader O., and Budahn H. (2007) Characterization of geographically isolated
accessions in five Alstroemeria L. species (Chile) using FISH of tandemly repeated DNA
sequences and RAPD analysis. Plant Syst. Evol. 269: 1-14.
Barrett H. C., and Rhodes A. M. (1976) A numerical taxonomic study of affinity
relationships in cultivated Citrus and its close relatives. Syst. Bot. 1: 105-136.
Befu M., Kitajima A., and Hasegawa K. (2001) Chromosome composition of some Citrus
species and cultivars based on the chromomycin A3 (CMA) banding patterns. J.
Japan. Soc. Hort. Sci. 70: 83-88.
Befu M., Kitajima A., Ling Y.X., and Hasegawa K. (2000) Classification of Tosa Buntan
pummelo (Citrus grandis [L.] Osb.), Washington navel orange (C. sinensis [L.] Osb.)
and trifoliate orange (Poncirus trifoliata [L.] Raf.) chromosomes using young leaves. J.
Japan. Soc. Hort. Sci. 69: 22-28.
Befu M., Kitajima A., and Hasegawa K. (2002) Classification of the Citrus chromosomes
with same types of chromomycin A banding patterns. J. Japan. Soc. Hort. Sci. 71: 394-400.
Blondel L. (1978) Classification botanique des espèces du genre Citrus. Fruits 33: 695-720.
Bowman K. D., and Gmitter F.G. Jr. (1990) Caribbean forbidden fruit: grapefruit's missing
link with the past and bridge to the future. Fruit Var. J. 44: 41-44.
Brasileiro-Vidal A. C., Santos-Serejo J., Soares Filho W. S., and Guerra M. (2007) A simple
chromosomal marker can reliably distinguish Poncirus from Citrus species. Genetica
129: 273-279.
Bret M. P., Ruiz C., Pina J. A., and Asíns M. J. (2001) The diversification of Citrus
clementina Hort. ex Tan., a vegetatively propagated crop species. Mol. Phylogenet. Evol.
21: 285-293.
Cameron J. W., and Frost H. B. (1968) Genetics, breeding and nucellar embryony in Citrus.
In: W. Reuther, L. D. Batchelor, H. J. Weber (Eds.), The Citrus industry: Anatomy,
physiology, genetics and reproduction, vol II (pp. 325-366). Berkeley, University of
California Press.
Carvalho R., Soares Filho W. S., Brasileiro-Vidal A. C., and Guerra M. (2005). The
relationships among lemons, limes and citron: a chromosomal comparison. Cytogenet.
Genome Res. 109: 276-282.
Coletta Filho H. D., Machado M. A., Targon M. L. P. N., Moreira M. C. P. G., and Pompeu
Jr. J. (1998) Analysis of the genetic diversity among mandarins (Citrus spp.) using
RAPD markers. Euphytica 102: 133-139.
Cornélio M.T.M.N., Figueirôa A. R. S., Santos K. G. B., Carvalho R., Soares Filho W.S., and
Guerra M. (2003) Chromosomal relationships among cultivars of Citrus reticulata
Blanco, its hybrids and related species. Plant Syst. Evol. 240: 149-161.
Davies F. S., and Albrigo L. G. (1994) Citrus. Crop Production Science in Horticulture 2
(pp. 19-43). Wallingford, CAB International.
Deng Z. N., Gentile A., Nicolosi E., Continella G., and Tribulato E. (1996) Parentage
determination of some Citrus hybrids by molecular markers. Proc. Int. Soc. Citricul. 2:
849-854.
Esen A., and Scora R. W. (1977) Amylase polymorphism in Citrus and some related genera.
Amer. J. Bot. 64: 305-309.
Federici C. T., Fang D. Q., Scora R. W., and Roose M. L. (1998) Phylogenetic relationships
within the genus Citrus (Rutaceae) and related genera as revealed by RFLP and RAPD
analysis. Theor. Appl. Genet. 96: 812-822.
Gmitter F. G. Jr., and Hu X. (1990) The possible role of Yunnan, China, in the origin of
contemporary Citrus species (Rutaceae). Econ. Bot. 44: 267-277.
Gmitter F. G. Jr. (1995) Origin, evolution, and breeding of the grapefruit. Plant Breed. Rev.
13: 345-363.
Guerra M. (1985) Cytogenetics of Rutaceae. III. Heterochromatin patterns. Caryologia 38:
335-346.
Guerra M. (1987) Cytogenetics of Rutaceae. IV. Structure and systematic significance of the
interphase nuclei. Cytologia 52: 213-222.
Guerra M. (1993) Cytogenetics of Rutaceae. V. High chromosomal variability in Citrus
species revealed by CMA/DAPI staining. Heredity 71: 234-241.
Guerra M., Pedrosa A., Silva A. E. B., Cornélio M. T. M., Santos K. G. B., and Soares Filho W.
S. (1997) Chromosome number and secondary constriction variation in 51 accessions of
a Citrus germplasm bank. Braz. J. Genet. 20: 489-496.
Guerra M., Santos K. G. B., Silva A. E. B., and Ehrendorfer F. (2000) Heterochromatin
banding patterns in Rutaceae-Aurantioideae - A case of parallel chromosomal evolution.
Amer. J. Bot. 87: 735-747.
Guerra M. (2008) Chromosome numbers in plant cytotaxonomy: concepts and implications.
Cytogenet. Genome Res. 120: 339-350.
Handa T., Ishizawa Y., and Oogaki C. (1986) Phylogenetic study of fraction I protein in the
genus Citrus and its close related genera. Jpn. J. Gen. 61: 15-24.
Herrero R., Asíns M. J., Carbonell E. A., and Navarro L. (1996) Genetic diversity in the
orange subfamily Aurantioideae. I. Intraspecies and intragenus genetic variability. Theor.
Appl. Genet. 92: 599-609.
Hodgson R. W. (1967) Horticultural varieties of Citrus. In: W. Reuther, H. J. Weber, L. D.
Batchelor (Eds.), The Citrus industry: History, world distribution, botany and varieties,
vol I (pp. 431-591). Berkeley, University of California Press.
Ito Y., Omura M., and Nesumi H. (1993) Improvement of chromosome observation methods
for Citrus. In: Hayashi, T., Omura, M.; Scott, N. S. (Eds.), Techniques on gene diagnosis
and breeding (pp. 31-38). Tsukuba, FTRS.
Kang S., Lee D., An H., Park J., Yun S., Moon Y., Bang J., Hur Y., and Koo D. (2008)
Extensive chromosomal polymorphism revealed by ribosomal DNA and satellite DNA
loci in 13 Citrus species. Mol. Cells 26: 319-322.
Li Y., Cheng Y., Tao N., and Deng X. (2007) Phylogenetic analysis of mandarin
landraces, wild mandarins, and related species in China using nuclear LEAFY second
intron and plastid trnL-trnF sequence. J. Amer. Soc. Hort. Sci. 132: 796-806.
Liang G. (1988) Studies on the Giemsa C-banding patterns of some Citrus and its related
genera. Acta Genet. Sin. 15: 409-415.
Machado M. A., Coletta Filho H. D., Targon M. L. P. N., and Pompeu Jr. J. (1996) Genetic
relationship of Mediterranean mandarins (Citrus deliciosa Tenore) using RAPD markers.
Euphytica 92: 321-326.
Marcon A.B., Barros I. C. L., and Guerra M. (2003) A karyotype comparison between two
closely related species of Acrostichum. Amer. Fern J. 93: 116-125.
Matsuyama T., Akihama T., Ito Y., Omura M., and Fukui K. (1996) Characterization of
heterochromatic regions in Trovita orange (Citrus sinensis Osbeck) chromosomes by
the fluorescent staining and FISH methods. Genome 39: 941-945.
Matsuyama T., Akihama T., Ito Y., Omura M., and Fukui K. (1999) Distribution of TGG
repeat-related sequences in Trovita orange (Citrus sinensis Osbeck) chromosomes.
Genome 42: 1251-1254.
Miranda M., Ikeda F., Endo T., Moriguchi T., and Omura M. (1997a) Comparative analysis
on the distribution of heterochromatin in Citrus, Poncirus and Fortunella chromosomes.
Chromosome Res. 5: 86-92.
Miranda M., Ikeda F., Endo T., Moriguchi T., and Omura M. (1997b) rDNA sites and
heterochromatin in Meiwa kumquat (Fortunella crassifolia Swing.) chromosomes
revealed by FISH and CMA/DAPI staining. Caryologia 50: 333-340.
Moore G. A. (2001) Oranges and lemons: clues to the taxonomy of Citrus from molecular
markers. Trends Genet. 17: 536-540.
Moraes A. P., Lemos R. R., Brasileiro-Vidal A. C., Soares Filho W. S., and Guerra M.
(2007a) Chromosomal markers distinguish hybrids and non-hybrid accessions of
mandarin. Cytogenet. Genome Res. 119: 275-281.
Moraes A. P., Soares Filho W. S., and Guerra M. (2007b) Karyotype diversity and the origin
of grapefruit. Chromosome Res. 15: 115-121.
Nicolosi E. (2007) Origin and taxonomy. In: Khan, I. A. (ed.) Citrus, Genetics and
Biotechnology. CAB International, Wallingford, pp 19-43.
Nicolosi E., Deng Z. N., Gentile A., La Malfa S., Continella G., and Tribulato E. (2000)
Citrus phylogeny and genetic origin of important species as investigated by molecular
markers. Theor. Appl. Genet. 100: 1155-1166.
Pedrosa A., Schweizer D., and Guerra M. (2000) Cytological heterozygosity and the hybrid origin
of sweet orange [Citrus sinensis (L.) Osbeck]. Theor. Appl. Genet. 100: 361-367.
Roose M. L., Schwarzacher T., and Heslop-Harrison J. S. (1998) The chromosomes of Citrus
and Poncirus species and hybrids: Identification of characteristic chromosomes and
physical mapping of rDNA loci using in situ hybridization and fluorochrome banding. J.
Hered. 89: 83-86.
Schweizer D. (1976) Reverse fluorescent chromosome banding with chromomycin and
DAPI. Chromosoma 58: 307-324.
Schweizer D., and Ambros P. F. (1994) Chromosome banding. In: Gosden JR (ed) Methods
in molecular biology, vol. 29, Chromosome analysis protocols, Humana Press, Totowa,
pp 97-113.
Scora R. W. (1975) IX. On the history and origin of Citrus. Bull. Torrey. Bot. Club. 102:
369-375.
Stebbins G.L. (1969) The effect of asexual reproduction on higher plant genera with special
reference to Citrus. Proc. First Intern. Citrus Symp., vol. 1: 455-458.
Swingle W. T., and Reece P. C. (1967) The botany of Citrus and its wild relatives. In: W.
Reuther, H. J. Weber, L. D. Batchelor (Eds.), The Citrus Industry: History, world
distribution, botany and varieties, vol I (pp. 190-430). Berkeley, University of California
Press.
Vaio M., Speranza P., Valls J. F., Guerra M., and Mazzella C. (2005) Localization of the 5S
and 45S rDNA sites and cpDNA sequence analysis in species of the Quadrifaria group of
Paspalum (Poaceae, Paniceae). Ann. Bot. 96: 191-200.
Wei W., Cheng Y. and Duan Y. (1988) Studies on the evolution of Citrus based on karyotype
and C-banding patterns. Acta Hort. Sin. 15: 223-228.
Yamamoto M., Abkenar A.A., Matsumoto R., Nesumi H., Yoshida T., Kuniga T., Kubo T.,
and Tominaga S. (2007) CMA banding patterns of chromosomes in major Citrus species.
J. Japan. Soc. Hort. Sci. 76: 36-40.
Yamamoto M., Abkenar A.A., Matsumoto R., Kubo T., and Tominaga S. (2008) CMA
staining analysis of chromosomes in several species of Aurantioideae. Genet. Resour.
Crop Evol. 55: 1167-1173.
Yamamoto M., Kubo T., and Tominaga S. (2005) CMA banding patterns of chromosome of
mid- and late- maturing citrus and acid citrus grown in Japan. J. Japan. Soc. Hort. Sci.
74: 476-478.
Yamamoto M., and Tominaga S. (2003). High chromosomal variability of mandarins (Citrus
spp.) revealed by CMA banding. Euphytica 129: 267-274.
Yang X., Kitajima A., and Hasegawa K. (2002) Chromosome pairing set and the presence of
unreduced gametes explain the possible origin of polyploid progenies from the diploids
Tosa-Butan X Suisho-Butan pummelo. J. Japan. Soc. Hort. Sci. 71: 538-543.


Chapter 3

Genetic Diversity of Mycobacterium
Tuberculosis Population in Bulgaria

Violeta Valcheva¹, Igor Mokrousov², Olga Narvskaya², Nalin Rastogi³ and Nadya Markova¹

¹ Department of Pathogenic Bacteria, The Stephan Angeloff Institute of Microbiology,
Bulgarian Academy of Sciences, Sofia, 1113 Bulgaria
² Laboratory of Molecular Microbiology, St. Petersburg Pasteur Institute,
197101 St. Petersburg, Russia
³ Unité de la Tuberculose et des Mycobactéries,
Institut Pasteur de Guadeloupe, Abymes 97183 Guadeloupe

Abstract

Tuberculosis remains an important public health issue for Bulgaria, a Balkan country
located in a region with contrasting epidemiological situations for tuberculosis.
Here, we present the results of recent studies on the genetic diversity of the Mycobacterium
tuberculosis population in Bulgaria, evaluated with several DNA fingerprinting
methods (spoligotyping, 24-locus MIRU-VNTR and IS6110-RFLP typing). The spoligotype-based
population structure of M. tuberculosis in Bulgaria was shown to be sufficiently
heterogeneous. It is dominated by the globally distributed spoligotypes ST53 and
ST47 and the Balkan-specific spoligotypes ST125 and ST41. Beijing genotype strains
were not found in Bulgaria in spite of its close links with Russia in the recent and historical
past. Comparison with the international database SITVIT2 (Pasteur Institute of Guadeloupe)
showed that spoligotype ST53 is found in similar and rather high proportions in
neighboring Greece and Turkey and is almost equally distributed across the different regions of
Bulgaria. In contrast, ST125 is not found elsewhere and is specific for Bulgaria;
furthermore, it appears to be mainly confined to the southern part of the country. The novel
15/24-locus format of MIRU-VNTR typing was found to be the most discriminatory tool
compared to spoligotyping and IS6110-RFLP typing of M. tuberculosis strains in

Correspondence: Igor Mokrousov, Ph.D., St. Petersburg Pasteur Institute, 14 Mira street, St. Petersburg, 197101
Russia. Email: imokrousov@mail.ru
Bulgaria. Furthermore, VNTR typing proved useful for resolving the ambiguous
phylogeny of some spoligotypes, in particular those classified as LAM/S by the
bioinformatics approach. In practical terms, a reduced Bulgaria-specific 5-locus set
(MIRU40, Mtub04, Mtub21, QUB-11b, QUB-26) provided sufficiently high
differentiation and may be preliminarily recommended for first-line typing of M.
tuberculosis isolates in Bulgaria, although further studies are needed to validate this
scheme. At the same time, comprehensive secondary subtyping of the clustered isolates
should target all 15 discriminatory loci. We additionally investigated the molecular basis of
drug resistance of the studied strains. Three types of rpoB mutations were found in 20
of 27 RIF-resistant isolates; rpoB S531L was the most frequent. Eleven (48%) of 23
INH-resistant isolates had the katG S315T mutation. The inhA -15C>T mutation was detected in
one INH-resistant isolate (which also had a katG315 mutation) and in three INH-susceptible
isolates. A mutation in embB306 was found in 7 of 11 EMB-resistant isolates.
Consequently, rpoB and embB306 mutations may serve for rapid genotypic detection of
the majority of RIF- and EMB-resistant strains in Bulgaria; the results on INH
resistance are complex, and further investigation of more genes is needed. Comparison
with spoligotyping and 24-locus VNTR typing data suggested that the emergence and spread
of drug-resistant and MDR-TB in Bulgaria are not associated with any specific
spoligotype or MIRU-VNTR genotype. Local circulation of particular clones
appears to be an important factor to take into consideration in molecular
epidemiological studies of tuberculosis in Bulgaria.

Introduction

Tuberculosis (TB) infects a significant proportion of the world population and constitutes
a major public health problem, particularly, in the developing regions. A reemergence of TB
accompanied by an increasing number of drug resistant and multidrug-resistant (i.e. resistant
to at least rifampin [RIF] and isoniazid [INH]) Mycobacterium tuberculosis strains has been
noted since the mid-1980s. Management of tuberculosis is complicated by the emergence of
drug resistant M. tuberculosis strains, which has become a serious health problem worldwide
(WHO, 2008ab).
Tuberculosis (TB) remains an important public health issue for Bulgaria, yet no
genotypic data on the circulating Mycobacterium tuberculosis strains had been published
from this Balkan country. Although the number of new cases has shown a steady decline since
2001 (48.6/100,000), the TB incidence rate in Bulgaria is still high (41/100,000
in 2006) (WHO, 2008b). Geographically, Bulgaria is located in a region with contrasting
epidemiological situations for tuberculosis. The southern neighbour, Greece, has reported
gradually decreasing TB rates, with an incidence of only 6.9/100,000 in 2005. The
reported TB rates for Romania and Turkey are significantly higher and have been increasing,
reaching 135.2/100,000 in Romania and 28.1/100,000 in Turkey in 2005 (EuroTB, 2007).
The rate of MDR-TB among newly diagnosed TB patients in Bulgaria was estimated
to be 10.7% (95% CLs 1.8-44.7), which is higher than in neighboring countries such as
Romania (2.8% [95% CLs 1.8-4.2]), Greece (1.1% [95% CLs 0.2-7.4]) or Turkey (1.4%
[95% CLs 0.2-9.0]) and closer to the estimated rates in Ukraine (16% [95% CLs
13.7-18.4]) and Russia (13% [95% CLs 11.3-14.8]) (WHO, 2008a). However, one should take
note of the confidence limits (CLs) of these estimates, which are particularly wide for Bulgaria.
Recent advances in molecular techniques have enabled the development of a variety of
genotyping methods for differentiation of clinical isolates of M. tuberculosis (van Embden et
al., 1993; van Soolingen et al., 2001; Moström et al., 2002). In particular, repetitive and
insertion sequences have proven useful for studying both the epidemiology and phylogeography of
M. tuberculosis (Supply et al., 2001; Sola et al., 2001; Brudey et al., 2006ab; Mokrousov,
2007; Zozio et al., 2005; van Soolingen et al., 2001; Moström et al., 2002; Al-Hajoj et al.,
2007), and regularly updated genetic diversity databases are available for this pathogen
(Filliol et al., 2003; Brudey et al., 2006b; Mokrousov et al., 2005; Mokrousov, 2007, 2008;
Weniger et al., 2007; El-Sahly et al., 2004; Kremer et al., 2004).
The direct repeat (DR) locus, a chromosomal region containing a large number of direct
repeats (DRs) interspersed with unique spacer sequences, is the target of the spoligotyping
(spacer oligonucleotide typing) technique (Kamerbeek et al., 1997) (Figure 1). This method has
been widely applied to study the molecular epidemiology and evolutionary genetics of TB. Since
the technique is PCR based, it requires less DNA than conventional IS6110 restriction fragment
length polymorphism analysis, which is the most widely applied and standardized molecular
typing method (van Embden et al., 1993; van Soolingen et al., 2001).
In recent years, various novel DNA typing methods have been developed which are
faster and easier to perform than the IS6110-RFLP method. Among them, VNTR typing is
probably the most popular approach. This method is based on the variable-number tandem
repeats of mycobacterial interspersed repetitive units (MIRU-VNTR) scattered throughout
the genome, and each isolate is typed based on the number of copies of the repeated units
(Supply et al., 2001). Using a large number of loci is expected to achieve high
discrimination. This relatively new method, which requires only basic PCR and agarose
electrophoresis equipment, was shown with different strain samples to possess a higher
discriminatory power than that of spoligotyping and only slightly below that of IS6110-RFLP
typing (Supply et al., 2001), although this may vary depending on the local population structure
(Zozio et al., 2005; Mokrousov et al., 2004). The apparent advantage of the VNTR approach
(compared to IS6110 typing) is its portability: the generated profiles are easily digitized,
which simplifies interlaboratory exchange as well as the creation and maintenance of
databases. Since 1998, VNTR typing of M. tuberculosis has undergone remarkable
improvement. Whereas the initial scheme used only six exact tandem repeat loci
(Frothingham and Meeker-O'Connell, 1998), a more recently developed and
already classical MIRU set involved 12 loci (Supply et al., 2001); finally, the most recently
proposed format for MIRU typing includes 24 loci (Supply et al., 2006) (Figure 1).
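Since each VNTR allele is simply a repeat copy number inferred from the length of a PCR product, the principle can be sketched in a few lines. This is a minimal illustration only: the flanking length and repeat unit size below are hypothetical values, not those of any real MIRU-VNTR locus, and in practice alleles are read from locus-specific size correspondence tables because some loci contain partial repeats.

```python
def copies_from_amplicon(amplicon_bp: float, flank_bp: int, repeat_bp: int) -> int:
    """Estimate the repeat copy number from an amplicon length.

    Assumes a simple locus: amplicon = flanking sequence + copies * repeat unit.
    """
    if repeat_bp <= 0:
        raise ValueError("repeat_bp must be positive")
    copies = round((amplicon_bp - flank_bp) / repeat_bp)
    if copies < 0:
        raise ValueError("amplicon is shorter than the flanking sequence")
    return copies


# Hypothetical locus parameters (illustrative assumptions, not a real locus).
HYPOTHETICAL_FLANK_BP = 150
HYPOTHETICAL_REPEAT_BP = 53

# Hypothetical band sizes as they might be estimated from an agarose gel.
gel_estimates_bp = [205, 310, 420]
alleles = [
    copies_from_amplicon(bp, HYPOTHETICAL_FLANK_BP, HYPOTHETICAL_REPEAT_BP)
    for bp in gel_estimates_bp
]
print(alleles)
```

Rounding to the nearest whole copy absorbs the few base pairs of sizing error that are typical when band sizes are read against a ladder.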
The development and application of MIRU-VNTR typing for M. tuberculosis became
an important methodological achievement towards a better understanding of the molecular
epidemiology of tuberculosis. The first paper on the new 24-locus format dealt mainly with a
cosmopolitan, geographically diverse set of strains (Supply et al., 2006). Although it made a
critically important step in evaluating a wide array of loci and selecting those most
appropriate, more common in-field studies are carried out, by definition, in
geographically limited settings with possibly biased local population structures of the
circulating strains. A population-based study in Hamburg, Germany, concluded that
15- and 24-locus VNTR typing combined with spoligotyping represents the first PCR-based
method with operating parameters (specificity and sensitivity) comparable to those of the
gold standard IS6110 fingerprinting and thus can be used as a stand-alone approach to
study TB transmission (Oelemann et al., 2007). Since then, this new 24-locus set has been
evaluated in four studies (Allix-Beguec et al., 2008a; Iwamoto et al., 2007; Jiao et al., 2008;
Mokrousov et al., 2008). Three of them were carried out in specific world areas (Japan,
China, Russia) with biased population structures of M. tuberculosis, i.e., dominated by a
single and homogeneous clonal group, the Beijing genotype. In fact, two of these studies
focused exclusively on Beijing genotype strains (Jiao et al., 2008; Mokrousov et al.,
2008). Finally, a large-scale three-year population-based study in the Brussels-Capital Region
in Belgium included strains of highly diverse origins; in particular, 76% of patients were
foreign-born, from 69 countries, mostly in Africa (Allix-Beguec et al., 2008a).
The early detection of resistance to first line anti-TB drugs is essential for the efficient
treatment and constitutes one of the priorities of TB control of MDR strains. Patients infected
with drug resistant strains are less likely to be cured, and their treatment is more toxic and
expensive than the treatment for patients infected with susceptible organisms. Inadequate
and/or interrupted therapy allows for the selection of spontaneous mutations in favor of
resistant organisms while sequential acquisition of these mutations in different genome loci
results in the development of resistance to multiple drugs. Therefore, a correct and rapid
detection of resistant strains is necessary for the appropriate and timely anti-TB therapy and
the reduction of total treatment cost.
Multiple genes responsible for conferring resistance to the major anti-TB drugs have
been identified for M. tuberculosis. A majority of rifampin (RIF) resistant strains harbor
mutations in the 81-bp hot-spot region (rifampin resistance determining region, RRDR) of the
rpoB gene encoding the DNA-dependent RNA polymerase β-subunit, a target of the drug
(Telenti et al., 1993; Ramaswamy, Musser, 1998; Martin, Portaels, 2007).

Figure 1. (a) Position of the 24 MIRU-VNTR and DR loci on the M. tuberculosis H37Rv chromosome;
(b) their structure and (c) example of spoligoprofiles.
Isoniazid (INH) resistance is controlled by a complex genetic system that involves
several genes, katG, inhA, ahpC, kasA, and ndh (Ramaswamy, Musser, 1998; Slayden, Barry,
2000; Lee et al., 2001; Martin, Portaels, 2007). Ethambutol (EMB) resistance is most
frequently associated with mutations in the embCAB operon, whose product, arabinosyl
transferase, is involved in mycolic acids metabolism, and particularly with mutations in embB
codon 306 (Telenti et al., 1997; Sreevatsan et al., 1997; Ramaswamy et al., 2000). More
recently, Mokrousov et al. (2002b) highlighted the presence of embB306 mutations in
EMB-susceptible strains, and Hazbon et al. (2005) suggested an association of embB306 mutations
with broad drug resistance and clustering rather than with EMB resistance.
This chapter reviews our recent molecular studies of M. tuberculosis strains circulating in
Bulgaria, as a necessary step towards the implementation and better understanding of the
molecular epidemiology of TB in this country. We further examined our data on a global scale
through comparison with the international database SITVIT2. Different typing methods, including
IS6110-RFLP and MIRU-VNTR were applied to M. tuberculosis strains from Bulgaria. The
objective was to assess new versus traditional molecular markers for epidemiological studies
of M. tuberculosis in Bulgaria. The general interest of our study was to evaluate the
performance of the newly proposed 24-locus standard (i) in the relatively heterogeneous M.
tuberculosis population (ii) circulating in the setting of a single country (iii) devoid of the
significant influx of the foreign-born population. Characterization of the molecular basis of
drug resistance in a survey area constitutes a first step towards an implementation of the
methods permitting its fast detection. Here, we analyzed the molecular basis of drug
resistance in M. tuberculosis strains currently circulating in Bulgaria. We also compared the
distribution of the drug resistance in the main genotypic clusters defined using spoligotyping
and VNTR typing.

Methods

Bacterial Isolates

One hundred and thirty-three M. tuberculosis isolates were randomly selected among M.
tuberculosis strains isolated from newly diagnosed, adult, HIV-negative pulmonary TB patients
in different regions of Bulgaria from December 2004 to March 2006. The patients were
permanent residents of the country and were proven to be unlinked on the basis of a
standard epidemiological investigation.
No preliminary selection of strains based on their drug resistance or patient data was
made. These isolates corresponded to all newly isolated M. tuberculosis cultures available at
the time of collection, hence these clinical isolates may be interpreted as a snapshot of the
circulating tubercle bacilli clones in Bulgaria.
Susceptibility testing for isoniazid (INH), rifampin (RIF), ethambutol (EMB), and
streptomycin (STR) was carried out by the absolute concentration method on Löwenstein-Jensen
medium as recommended (WHO, 1998). The critical concentrations for INH, RIF,
EMB, and STR were 0.25, 10, 2.0, and 4 mg/l, respectively.

DNA Fingerprinting

The DNA of the studied strains was extracted from 4- to 6-week-old cultures on
Löwenstein-Jensen medium using the recommended method (van Embden et al., 1993).
Spoligotyping was used to analyze variation in the DR locus (absence/presence of 43
different spacers) as described previously (Kamerbeek et al., 1997) (Figure 1). The individual
spoligotyping patterns were entered in an Excel spreadsheet and compared with the international
database SITVIT2 (Institut Pasteur de Guadeloupe), which is an updated version of the
published SpolDB 4.0 database (Brudey et al., 2006). At the time of this comparison
(September 4, 2007), SITVIT2 contained a total of 2880 shared-types corresponding to
66846 clinical isolates from 122 isolation countries and 166 countries of origin. Major
phylogenetic clades were assigned according to signatures provided in SpolDB4 (Brudey et
al., 2006b), which defines 62 genetic lineages/sub-lineages. These included specific
signatures for various M. tuberculosis complex sub-species such as M. bovis, M. microti,
M. caprae, M. pinnipedii, M. africanum, as well as rules defining major lineages/sub-lineages
for M. tuberculosis sensu stricto. The latter included the Central Asian (CAS) clade (2
sub-lineages), the East African Indian (EAI) clade (9 sub-lineages), the Haarlem (H) clade (34
sub-lineages), the Latin-American-Mediterranean (LAM) clade (12 sub-lineages), the
"Manu" family (3 sub-lineages), the Beijing family, the S clade, the IS6110-low banding X
clade (3 sub-lineages), and an ill-defined T clade (5 sub-lineages).
IS6110-RFLP typing was performed mainly as previously described (van Embden et al., 1993).
Briefly, M. tuberculosis DNA was digested with PvuII, electrophoresed, Southern-blotted and
hybridized with a DIG-labeled 245-bp PCR-generated IS6110 probe. Each Southern blot included
DNA of M. tuberculosis strain 14323 as an external molecular weight marker. The hybridization
profiles were visualized as banding patterns on the membrane using an alkaline phosphatase
(Roche Applied Science, USA) catalyzed colorimetric reaction.
Each of the 24 MIRU-VNTR loci was amplified individually with primers specific for
sequences flanking the MIRU units as described by Supply et al. (2001, 2006) (Figure 1). The
amplicons were evaluated on 1.5% standard agarose gels using a 100-bp DNA ladder (GE
Healthcare). The H37Rv strain was run as an additional control of the performance of the
method. Size analysis of the PCR fragments and assignment of the VNTR alleles were done using
TotalLab TL100 software (Nonlinear Dynamics Ltd., UK) and by comparison with
correspondence tables kindly provided by Philip Supply. Some PCR reactions were repeated,
and allele scoring was done independently by two technicians.
Analysis of the IS6110 element specific for the LAM genetic family was done as described
previously (Marais et al., 2006). In brief, a 205-bp band indicates a LAM strain by the
presence of an IS6110 element at a specific site in the genome, whereas a 141-bp band
indicates a non-LAM strain lacking the IS6110 element at this site.
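The interpretation rule for the LAM-specific PCR described above can be written as a small decision function. The two band sizes come from the text; the sizing tolerance and the "indeterminate" outcome are assumptions added here for illustration and are not part of the published protocol.

```python
LAM_BAND_BP = 205      # IS6110 present at the LAM-specific site (from the text)
NON_LAM_BAND_BP = 141  # IS6110 absent at that site (from the text)


def call_lam(band_bp: float, tolerance_bp: float = 15.0) -> str:
    """Classify a strain from the observed PCR band size.

    tolerance_bp is a hypothetical allowance for gel sizing error.
    """
    if abs(band_bp - LAM_BAND_BP) <= tolerance_bp:
        return "LAM"
    if abs(band_bp - NON_LAM_BAND_BP) <= tolerance_bp:
        return "non-LAM"
    return "indeterminate"  # band matches neither expected product


print(call_lam(203), call_lam(145), call_lam(170))
```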

Resistance Mutations Typing

Mutations were detected in rpoB RRDR, katG315, inhA promoter region (positions from
-9 to -25), and embB306, as described previously (Morcillo et al., 2002; Mokrousov et al.,
2002ab, 2004, 2006). A PCR-RFLP method was used to detect mutations in embB306 and
katG315, while a reverse hybridization method was used to detect mutations in the rpoB RRDR
and the -15 C>T mutation in the inhA promoter region, using membranes prepared at the
St. Petersburg Pasteur Institute as described previously (Mokrousov et al., 2004, 2006).

Quality Control

To minimize the risk of laboratory cross-contamination during PCR amplification, each
procedure (preparation of the PCR mixes, the addition of the DNA, the PCR amplification,
and the electrophoretic fractionation) was conducted in physically separated rooms. Negative
controls (water) were included to control for reagent contamination.

Statistical Analysis

EpiCalc 2000 version 1.02 software (Gilman, Myatt, 1998) was used for statistical
analysis to calculate odds ratios with 95% confidence intervals and p-values.
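For readers unfamiliar with the statistic, an odds ratio and its 95% confidence interval can be computed from a 2x2 table as sketched below. This is an assumption-laden illustration: it uses the common log-based (Woolf) interval, which may differ from the method EpiCalc applies, and the counts are hypothetical rather than data from this study.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a, b: numbers with and without the outcome in group 1
    c, d: numbers with and without the outcome in group 2
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: resistant/susceptible strains in genotype X versus all others.
or_value, ci_low, ci_high = odds_ratio_ci(10, 20, 5, 40)
print(f"OR={or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1 would suggest an association between the genotype and the resistance phenotype at the 5% level.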
PAUP* 4.0 package (Swofford, 2002) was used to reconstruct the most parsimonious
dendrogram of the VNTR digital profiles treated as categorical variables.
The Hunter-Gaston index (HGI) was calculated as described previously (Hunter, Gaston,
1988) and was used to evaluate the discriminatory power of the typing methods and the allelic
diversity of the VNTR loci. HGI is the probability that two strains consecutively taken from a
given population would be placed into different types by the typing method; the lower the
index value, the less discriminative the typing method. The mean HGI was calculated as the
mean of the HGI values of the 24 loci. The HGI was calculated using the following formula:

\[
\mathrm{HGI} = 1 - \frac{1}{N(N-1)} \sum_{j=1}^{s} n_j (n_j - 1)
\]

where N is the total number of strains in the typing scheme, s is the total number of distinct
patterns discriminated by the typing method, and n_j is the number of strains belonging to the
jth pattern (Hunter, Gaston, 1988).
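The index defined above is straightforward to compute from the number of strains in each type. The sketch below is illustrative rather than the authors' own script; the counts are transcribed from the per-spoligotype strain numbers listed in Table 1 of this chapter, in table order.

```python
def hunter_gaston_index(type_sizes):
    """Hunter-Gaston index from the number of strains in each distinct type."""
    n_total = sum(type_sizes)
    if n_total < 2:
        raise ValueError("at least two strains are required")
    same_type_pairs = sum(n * (n - 1) for n in type_sizes)
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Strains per spoligotype, as listed in Table 1 (this study column).
table1_counts = [
    33, 1, 2, 7, 1, 1, 1, 1, 3, 1, 6, 1, 1, 1, 2, 2, 1, 1,
    7, 1, 2, 1, 3, 1, 1, 1, 1, 1, 2, 1, 5, 7, 2, 5, 1, 25,
]

total_strains = sum(table1_counts)
spoligotyping_hgi = hunter_gaston_index(table1_counts)
print(total_strains, round(spoligotyping_hgi, 3))
```

The same function applies to a single VNTR locus if the list holds the number of strains carrying each allele.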
The TotalLab TL100 software (Nonlinear Dynamics Ltd., UK) was used to calculate
molecular weight of the fragments in the IS6110-RFLP profiles; the resulting molecular
weights matrix was used by Taxotron package (Grimont, 2002) to build a UPGMA
(unweighted pair-group method of averages) dendrogram.
Relationships between spoligoprofiles were evaluated as a spoligoforest graph according
to a deletion model of the evolution of the DR locus (Reyes et al., 2008; Tang et al., 2008) by
using SpolTools program (http://www.emi.unsw.edu.au/spolTools). The spoligoforest burst
layout was generated with SpolTools program using Fruchterman-Reingold algorithm (Reyes
et al., 2008; Tang et al., 2008).

Results and Discussion

Population structure of M. tuberculosis in Bulgaria

A study sample included 133 strains from different regions of the country. Spoligotyping
was used as a primary typing tool; it subdivided these strains into 37 types, including 15
clusters and 22 singletons (Table 1; Figure 2).

Figure 2. Relationships of the M. tuberculosis spoligotypes identified in 133 strains from Bulgaria
visualized as a forest graph. Circle size is proportional to the number of strains. Solid edges are unique
relationships between spoligotypes. Broken-line edges are chosen among multiple edges. Dotted lines
have probability <0.5, while dashed lines have probability measure >0.5.
Twenty-two spoligotypes represented single isolates; the other 111 isolates were grouped
into 15 clusters (2 to 33 isolates). HGI was 0.893. The spoligotype designation was attributed
by online comparison of the obtained profiles presented in binary code with those included in
the international SITVIT2 database (Institut Pasteur de Guadeloupe) (Table 1). This
comparison showed a noticeable presence of two globally distributed shared types, ST53
(24.8%) and ST47 (5.3%). Twenty-five (19.0%) and six (4.5%) strains belonged to ST125
(LAM/S subfamily) and ST41 (LAM7_TUR subfamily), respectively. Eight spoligoprofiles (14
strains) were not found in SITVIT2 and were designated as new; two of them constituted the new
shared types ST2905 and ST2906, while the other six new profiles remained orphans (Table 1).
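The binary code mentioned above records the presence (1) or absence (0) of each of the 43 spacers. For database comparison such profiles are often condensed into a 15-digit octal code (14 triplets plus the final spacer); this convention is not described in the chapter and is assumed here. The example profile, lacking spacers 33-36, is the pattern usually reported for ST53, but it should be treated as an illustrative assumption because the profiles themselves are not reproduced in this text.

```python
def spoligo_binary_to_octal(binary: str) -> str:
    """Convert a 43-character 0/1 spoligotype into the 15-digit octal code."""
    if len(binary) != 43 or set(binary) - {"0", "1"}:
        raise ValueError("expected exactly 43 characters, each 0 or 1")
    # Spacers 1-42 form 14 triplets; each triplet becomes one octal digit.
    triplets = [binary[i:i + 3] for i in range(0, 42, 3)]
    # Spacer 43 is appended unchanged as the 15th digit.
    return "".join(str(int(t, 2)) for t in triplets) + binary[42]


def absent_spacers(binary: str):
    """Return the 1-based numbers of the spacers that are absent."""
    return [i + 1 for i, c in enumerate(binary) if c == "0"]


# Illustrative profile: spacers 1-32 present, 33-36 absent, 37-43 present.
example_profile = "1" * 32 + "0" * 4 + "1" * 7
example_octal = spoligo_binary_to_octal(example_profile)
print(example_octal, absent_spacers(example_profile))
```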

Figure 3. Regional distribution of the major spoligotypes and clades identified in 133 M. tuberculosis
strains from Bulgaria. Circle size is roughly proportional to the number of strains from an area in
Bulgaria. Data on Turkey and Greece are based on the SITVIT2 database at Institut Pasteur de
Guadeloupe.
The distribution of the major spoligotypes was plotted on the map of the country; this also
demonstrated the geographic diversity of our collection (Figure 3). The spoligotype-defined
population structure of M. tuberculosis in Bulgaria appears to be both sufficiently
heterogeneous (HGI=0.893) and dominated by two spoligotypes, ST125 (19%) and ST53
(25%), whose distribution patterns differ strikingly. Spoligotype ST53 is found in similar and
rather high proportions in neighboring Greece and Turkey and is almost equally distributed
across the different regions of Bulgaria (Figure 3). In contrast, ST125 is not found elsewhere
(Valcheva et al., 2008a) and is specific for Bulgaria; furthermore, it appears to be mainly
confined to the southern part of the country (Figure 3).

Table 1. Description of M. tuberculosis spoligotypes found in Bulgaria and compared to SITVIT2 database

Spoligotype(a) | Spoligoprofile | No. of strains, this study | Clade, SITVIT1 | No. of strains in SITVIT2: Romania | Turkey | Greece | Russia
ST53 33 T1 1 167 41 67
ST612 1 T
ST50 2 H3 1 52 10 10
ST47 7 H1 1 21 2 6
ST602 1 U 9 1
Orphan A* 1
ST44 1 T5
ST61 1 LAM10_CAM 1
ST453 3 T
ST60 1 LAM4 1
ST41 6 LAM7_TUR 1 154 3
ST40 1 T4 2 2 4
ST1252 1 T 1 1
ST878 1 X1-VAR 3
ST2906* 2 U
ST37 2 T3 8 2
Orphan B* 1
ST2577 1
ST34 7 S 3 4
ST262 1 H1-VAR 10 25
ST280 2 T1_RUS2 2 5
ST144 1 T
ST154 3 T 2 4 1
Orphan C* 1 H
ST2139 1 U 2
ST205 1 T 1
Orphan D* 1 U

ST90 1 U
Orphan E* 2 H
OrphanF* 1 U
ST2905* 5 U
ST284 7 T1 26 4 1
ST1588 2 LAM
ST4 5 LAM/S 22 3 2
ST1280 1 T1_RUS2 2 5
ST125 25 LAM/S
Total No. of strains 133 14 897 172 1241
(a) An asterisk (*) designates a new profile.

A large proportion of the studied strains belonged to spoligotype ST53 (Figure 2, Table
1). This globally distributed spoligotype represents 6.0% of strains in SITVIT2. In our
sample it constituted a much higher proportion (24.8%). Comparison with geographical
(Turkey, Greece, and Romania) and historical (Russia) neighbours revealed that this
spoligotype is present in almost as high a proportion in Turkey (18.6%) and Greece (23.8%)
but not in Russia (5.4%) or Romania (7.1%), although the latter figure may be biased by
a very small sample size (Table 1; Figure 3). Consequently, in spite of the otherwise global
circulation of genotype ST53, its Bulgarian strains were likely brought to Bulgaria as a
result of Balkan intraregional human movement. In particular, a significant increase in
human exchange between Bulgaria and Turkey has been noted since the end of the 1980s
(Vasileva, 1992). However, the similar and high proportion of these strains not only in
Bulgaria and Turkey but also in Greece leads us to hypothesize a historically more distant time
for their importation, driven by the medieval expansion of the Ottoman Empire
(http://www.euratlas.com/big/big1600.htm). In particular, it may be noted that, starting with
the earliest conquests in Thrace (modern Bulgaria) in the 1350s, the Ottoman state employed a
policy of forced population transfers that over the next three centuries would transport
thousands of subjects from Asia into Europe (Hooper, 2003).
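The regional proportions of ST53 discussed in this paragraph can be recomputed directly from the strain counts and column totals given in Table 1. The short sketch below does only that arithmetic; the dictionary layout is an illustrative choice.

```python
# (ST53 strains, total strains) per setting, taken from Table 1.
st53_counts = {
    "Bulgaria (this study)": (33, 133),
    "Romania": (1, 14),
    "Turkey": (167, 897),
    "Greece": (41, 172),
    "Russia": (67, 1241),
}

# Percentage of ST53 among all strains from each setting, to one decimal place.
st53_percent = {
    setting: round(100 * n_st53 / n_total, 1)
    for setting, (n_st53, n_total) in st53_counts.items()
}

for setting, pct in st53_percent.items():
    print(f"{setting}: {pct}%")
```

The Romanian figure rests on a single ST53 strain among 14, which is why it is flagged in the text as potentially biased.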
The other three frequently found spoligotypes in our collection were ST47, ST41 and
ST125 (Table 1). In particular, ST41 belongs to the LAM7_TUR subfamily and is mainly
confined to Turkey (17.2%), for which its phylogeographical specificity has been
suggested (Zozio et al., 2005), whereas only rare isolates have been described in Greece
and Romania. It is possible that ST41 reached its high rate in Turkey during the
course of the 20th century and has not yet penetrated the neighboring countries in
significant proportions.
Comparison with previously published data revealed that the characteristic two-band
IS6110-RFLP profile found in ST41 strains from Sofia and Plovdiv (Figure 4) is also present
in ~40% of ST41 strains in Turkey (Zozio et al., 2005). A closer look at the MIRU profiles of
the same strains showed some similarity between the Turkish and Bulgarian ST41 isolates.
However, all MIRU-typed Turkish strains had 1 copy in locus MIRU26 (the signature 215125113322
being prevalent and likely ancestral), while Bulgarian strains of this spoligotype had 1 or 5
copies in MIRU26. This suggests genetic divergence between the extant geographical
sub-lineages within ST41 in Bulgaria and Turkey.

Figure 4. MIRU and IS6110-RFLP profiles of the spoligotype ST41 strains.
Summing up, these observations lead us to speculate that the two-band profile found in
the ST41 strains from Plovdiv and Sofia may have evolved from the variant brought from
Turkey and, in turn, may have become ancestral to the seven-band profile that evolved
in situ and is found in both strains from Shumen (Figure 4). This would require a relatively
long-term evolution and, consequently, could reflect a long-term presence of this two-band
RFLP variant of ST41 in Bulgaria.
On the other hand, a comparison with SITVIT2 revealed a high gradient for ST125 in
Bulgaria (Table 1) and negligible presence of this spoligotype outside Bulgaria and, in
particular, in the neighboring countries. A similarity of the IS6110-RFLP profiles confirmed
a true relatedness of the spoligotype ST125 strains whereas high diversity of the 12-MIRU
loci (Figure 5) suggested a long-term evolution of this spoligotype in Bulgaria. These
findings lead us to suggest a Bulgarian phylogeographic specificity of the spoligotype ST125.

Figure 5. 12-MIRU-loci based minimum spanning tree of spoligotype ST125 strains. Each circle, node
or tip, is described by MIRU type number (inside a circle), strain origin/number of strains (if more than
one), MIRU 12-digit profile (in italic) and IS6110-RFLP one-letter profile designation (in bold). MIRU
types numbering within ST125 was done only for convenience of analysis and discussion. Circle size is
roughly proportional to the number of strains sharing a respective MIRU profile.
Although a detailed comparison with drug susceptibility data is presented below, we
noted a high rate of multidrug-resistant (MDR) strains in the studied Bulgarian sample (12%).
In Russia, for comparison, the MDR phenotype was found in 48.6% of the Beijing genotype
strains versus 29.4% of the non-Beijing strains, suggesting that current transmission of MDR-TB
in Russia is greatly influenced by the ongoing dissemination of the Beijing family strains
(Narvskaya et al., 2005). Comparison with the global database revealed that several spoligotypes
were shared by Bulgarian and Russian strains (Table 1), which is readily explained by
close links and extensive human movement between the two countries until the end of the
20th century. Nevertheless, the Beijing genotype was not identified in the studied strains
from Bulgaria. Consequently, the current situation with MDR-TB in Bulgaria cannot be
explained by transmission of the Beijing genotype, which apparently has not yet reached this
country.

High-Resolution Typing and Comparison of Typing Methods

Seventy-three strains had a sufficient quantity of DNA for traditional IS6110-RFLP typing.
Accordingly, this sub-sample served to compare all three methods in this study:
spoligotyping, IS6110-RFLP typing and the newly proposed 24-locus VNTR scheme (Supply et
al., 2006). One should note that the reduction in sample size decreased neither the
genetic diversity (spoligotyping HGI₁₃₃=0.893 versus HGI₇₃=0.939) nor the geographical
representativeness (city of isolation based HGI₁₃₃=0.838 versus HGI₇₃=0.873) of the
collection as a whole.
Table 2 shows a comparison of the discriminatory capacity of the different VNTR sets
and IS6110-RFLP typing. IS6110 fingerprinting subdivided 73 M. tuberculosis isolates into
39 unique types and 12 clusters. The IS6110 copy number varied between 2 and 13 copies
per profile although it was generally high (Figure 6, Table 2). Assuming a low-copy number
as less than 5, only three strains in this study were low-banders making an outgroup in the
IS6110-RFLP tree (strain 46 and cluster XII in Figure 6).

Table 2. Discriminatory power of the genotyping methods evaluated with 73 strains

Method*              No. of   No. of unique   No. of clustered   No. of     Cluster size   HGI
                     types    isolates        isolates           clusters   (range)
MIRU-VNTR 24 loci    66       61              12                 5          2-3            0.997
MIRU-VNTR 15 loci    65       59              14                 6          2-3            0.996
MIRU-VNTR 12 loci    62       55              18                 7          2-4            0.994
MIRU-VNTR 5 loci     45       28              45                 17         2-4            0.984
IS6110-RFLP          51       39              34                 12         2-7            0.983
Spoligotyping        31       18              55                 13         2-14           0.939

* MIRU-VNTR 12, 15, 24 loci: Supply et al., 2001, 2006. The 5-locus scheme: the 5 MIRU-VNTR
loci found to be the most polymorphic in this study: MIRU40, Mtub04, Mtub21, QUB-11b, and QUB-26.

Figure 6. IS6110-RFLP based dendrogram of M. tuberculosis strains from Bulgaria compared to their
24-locus VNTR digital haplotypes, 43-signal spoligoprofiles. SIT, spoligotype international type.
IS6110-RFLP clusters in the dendrogram are designated with Roman numerals from I to XII. A
designates 11 repeat units in a VNTR locus. VNTR profiles of the strains included in the IS6110-RFLP
clusters are in boxes; minor variable alleles within these clusters are in bold.
The 24 published MIRU-VNTR loci (Supply et al., 2006) were further analyzed in this
study (Table 2). Examples of different alleles for the most polymorphic loci are shown in
Figure 7. The allelic diversity differed significantly among VNTR loci (Table 3). The highest
allelic diversity among all strains was observed for QUB-26 (0.827), and the null allelic
diversity was found for the monomorphic MIRU24. The lowest diversity (HGI~0.1) was
found for six loci MIRU20, MIRU27, MIRU31, MIRU39, ETR-B, Mtub34.


Figure 7. Examples of the VNTR alleles of the most variable loci in this study: QUB-11b (a), QUB-26
(b), and Mtub21 (c). M, molecular weight marker, 100 bp DNA ladder (GE Healthcare).

A comparison of different combinations of the VNTR loci revealed that the classical
12-locus combination was the least discriminatory of the three published sets; it identified
55 unique and 18 clustered strains (HGI=0.994). A better resolution, with 59 unique and 14
grouped strains (HGI=0.996), was observed with the 15-locus MIRU-VNTR system. Finally, the full
set of the 24 loci permitted us to identify 61 unique and 12 clustered isolates (HGI=0.997).
Compared to the 15-locus scheme, the 24-locus scheme split the cluster of strains 17 and 25
owing to a difference in the Mtub34 locus; it may be noted that these two strains were
identical in the other VNTR loci as well as in their IS6110-RFLP profiles and
spoligoprofiles (Figure 6). Apart from this example, the 9 auxiliary loci of the 24-locus
scheme, which are moderately polymorphic (Table 3), did not contribute any additional
differentiation of strains compared to the 15-locus scheme.
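
The HGI values quoted above can be checked from the counts in Table 2. The sketch below assumes that HGI denotes the Hunter-Gaston discriminatory index in its usual form, 1 - sum(n_j(n_j - 1)) / (N(N - 1)), where n_j is the number of strains of type j and N is the total number of strains. The individual cluster sizes are not listed in Table 2; they are inferred here from the number of clusters, the number of clustered isolates and the cluster size range, which leave only one possible split for the 24-locus and 15-locus schemes. The function name is illustrative.

```python
def hunter_gaston_index(type_sizes):
    """Hunter-Gaston discriminatory index from the number of strains in each type."""
    n = sum(type_sizes)
    if n < 2:
        raise ValueError("at least two strains are required")
    # Number of ordered strain pairs that share a type.
    same_type_pairs = sum(size * (size - 1) for size in type_sizes)
    return 1.0 - same_type_pairs / (n * (n - 1))


# 24-locus scheme: 61 unique isolates plus 12 isolates in 5 clusters of size 2-3.
# The only split that fits is two clusters of 3 and three clusters of 2.
sizes_24 = [1] * 61 + [3, 3, 2, 2, 2]

# 15-locus scheme: 59 unique isolates plus 14 isolates in 6 clusters of size 2-3,
# i.e. two clusters of 3 and four clusters of 2.
sizes_15 = [1] * 59 + [3, 3, 2, 2, 2, 2]

print(round(hunter_gaston_index(sizes_24), 3))  # 0.997
print(round(hunter_gaston_index(sizes_15), 3))  # 0.996
```

Both results agree with the HGI column of Table 2, which supports the assumed formula; the 12-locus and 5-locus rows cannot be checked in the same way because their cluster size distributions are not uniquely determined by the table.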

Table 3. Allelic diversity of 24 VNTR loci in M. tuberculosis strains from Bulgaria and
other locations

             Bulgarian strains (this study)        HGI in other collections
VNTR         No. of    No. of repeats   HGI        Global   Japan,        Japan,    China,
locus (a)    alleles   (range)                     set      non-Beijing   Beijing   Beijing
MIRU4        5         0-4              0.557      0.38     0.469         0.086     0.120
MIRU10       4         2-5              0.532      0.74     0.794         0.419     0.144
MIRU16       4         1-4              0.524      0.53     0.610         0.310     0.068
MIRU26       7         1-7              0.422      0.75     0.739         0.383     0.353
MIRU31       3         2-4              0.106      0.72     0.537         0.322     0.169
MIRU40       6         1-6              0.806      0.73     0.752         0.327     0.194
Mtub04       4         1-4              0.655      0.71     0.471         0.459     0.306
Mtub21       5         1-5              0.669      0.76     0.599         0.393     0.556
Mtub30       3         1-4              0.430      0.62     0.580         0.403     0.068
Mtub39       5         2-6              0.465      0.69     0.735         0.186     0.171
ETR-A        4         1-4              0.554      0.75     0.554         0.147     0.232
ETR-C        4         2-5              0.419      0.69     0.230         0.022     0.094
QUB-11b      7         1-7              0.773      0.82     0.748         0.772     0.651
QUB-26       9         2-11             0.827      0.84     0.798         0.741     0.518
QUB-4156     5         0-4              0.447      0.67     0.665         0.611     0.395
MIRU2        3         1-3              0.249      0.16     0.105         0         0
MIRU20       2         1-2              0.129      0.30     0.226         0.022     0.014
MIRU23       5         3-8              0.399      0.65     0.597         0.176     0.014
MIRU24       1         1                0          0.35     0.105         0         0
MIRU27       3         1-4              0.106      0.25     0.036         0.115     0.014
MIRU39       2         1-2              0.153      0.45     0.497         0.221     0.119
ETR-B        3         1-3              0.106      0.44     0.370         0.033     0.014
Mtub29       4         2-5              0.204      0.48     0.262         0.043     0.119
Mtub34       3         2-4              0.154      0.27     0.036         0.065     0.014
Mean                                    0.404      0.573    0.480         0.260     0.181

References: global set, Supply et al., 2006; Japan (non-Beijing types and Beijing genotype),
Iwamoto et al., 2007; China (Beijing genotype), Jiao et al., 2008.
(a) The five most polymorphic loci in this study (in bold in the original table) are MIRU40,
Mtub04, Mtub21, QUB-11b, and QUB-26.

We further tested various combinations of VNTR loci in order to find one based on a
reduced number of loci and close in discrimination to the 24-locus typing. The criteria
applied were the number of alleles and the individual diversities of the loci assessed as
HGI (not shown). Finally, the most obvious combination, that of the five most polymorphic
loci (HGI>0.6), was shown to achieve good discrimination, below that of the 15-locus scheme
but still higher than that of IS6110-RFLP typing (Table 2).
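
The selection rule described above (loci with HGI>0.6) can be applied directly to the Bulgarian HGI column of Table 3. The following is a minimal sketch of that filtering step only; the dictionary and function names are illustrative, and the number-of-alleles criterion mentioned in the text is not modeled.

```python
# Per-locus HGI values for the Bulgarian strains, copied from Table 3.
BULGARIA_HGI = {
    "MIRU4": 0.557, "MIRU10": 0.532, "MIRU16": 0.524, "MIRU26": 0.422,
    "MIRU31": 0.106, "MIRU40": 0.806, "Mtub04": 0.655, "Mtub21": 0.669,
    "Mtub30": 0.430, "Mtub39": 0.465, "ETR-A": 0.554, "ETR-C": 0.419,
    "QUB-11b": 0.773, "QUB-26": 0.827, "QUB-4156": 0.447, "MIRU2": 0.249,
    "MIRU20": 0.129, "MIRU23": 0.399, "MIRU24": 0.0, "MIRU27": 0.106,
    "MIRU39": 0.153, "ETR-B": 0.106, "Mtub29": 0.204, "Mtub34": 0.154,
}


def most_polymorphic(hgi_by_locus, threshold):
    """Return the loci whose HGI exceeds the threshold, most diverse first."""
    selected = [locus for locus, hgi in hgi_by_locus.items() if hgi > threshold]
    return sorted(selected, key=lambda locus: hgi_by_locus[locus], reverse=True)


selected = most_polymorphic(BULGARIA_HGI, 0.6)
mean_hgi = sum(BULGARIA_HGI.values()) / len(BULGARIA_HGI)

print(selected)            # ['QUB-26', 'MIRU40', 'QUB-11b', 'Mtub21', 'Mtub04']
print(round(mean_hgi, 3))  # 0.404
```

The threshold returns exactly the five loci of the reduced scheme, and the mean reproduces the value given in the last row of Table 3.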
In the present study, of the three methods used, spoligotyping, not unexpectedly, showed
the lowest discrimination. At the same time, as in the German study (Oelemann et al., 2007),
spoligotyping in our setting, albeit the least discriminatory method, contributed
to the subdivision of two of the five 24-locus VNTR clusters (clusters A and C in Figure 8).
Conversely, spoligotype ST41 is the most apparent example of the slower evolution of the
DR locus compared to the VNTR haplotypes or IS6110-RFLP profiles: the ST41 strains differ in 7
out of 24 loci, although they remained weakly related and located in the same part of
the 24-VNTR dendrogram (Figure 8). An interesting finding of this study is that the gold
standard, IS6110-RFLP, appeared to be an even less variable marker than the classical 12-locus
MIRU scheme (Figure 6, Table 2). Most of the IS6110 clusters in Figure 6 were completely or
partially differentiated by use of the 24-VNTR set. On the other hand, all three VNTR clusters
included strains with identical RFLP profiles (not shown). A remarkable evolutionary
stability of some IS6110-RFLP profiles is especially manifested in the ST125/ST4 cluster of
strains, the largest cluster in this study (see cluster I in Figure 6). Perhaps mapping of the
IS6110 insertions in the genomes of these strains would help to explain this intriguing
situation.
It may also be noted that adding VNTR typing to IS6110-RFLP typing allowed for
more precise tracing of the local clones at the city level. For example, IS6110 cluster IV
(spoligotype ST2905) was further subdivided by VNTR typing: three strains from Pleven
remained identical and differed in three loci from a strain from Shumen (Fig. 6). Previous
studies on the 24-locus format showed a general congruence of IS6110 and 24-VNTR results,
while the latter was suggested to be overall more accurate for cluster analysis (Oelemann et
al., 2007; Supply et al., 2006). In the Belgian study, of the 23 IS6110-RFLP clusters with
high copy numbers, 20 were found to be completely identical by MIRU-VNTR typing. Of the
three remaining IS6110-RFLP clusters, two were fully subdivided, both by 4 to 7 MIRU-VNTR
loci (Allix-Béguec et al., 2008a). In this sense, our result of the superior
discrimination achieved by the 24-locus VNTR scheme compared to IS6110 fingerprinting is
not so surprising.
Various sets of MIRU-VNTR loci demonstrated different levels of discriminatory power
(Table 2). Only one locus, MIRU24, was monomorphic, which is in agreement with the previous
observation that this locus is phylogenetically conserved and discriminates between the large
ancestral and modern M. tuberculosis lineages, with and without the TbD1 genome region,
respectively (Sun et al., 2004). Compared to IS6110-RFLP typing, the 12-locus MIRU scheme
already showed good discrimination, but the addition of the new VNTR loci, mainly those from
the discriminatory 15-locus set, improved discrimination by reducing the number of clusters
and clustered isolates (Table 2). A closer look at the individual diversities of the loci
(Table 3) showed that the loci found to be the most polymorphic in Bulgaria were also among
the most polymorphic loci in the global set of strains (Supply et al., 2006). At the same time,
some globally variable loci showed low polymorphism in the Bulgarian collection, e.g., MIRU31.

Figure 8. The 24-VNTR-loci based dendrogram of M. tuberculosis strains from Bulgaria compared to
their 43-signal spoligoprofiles. Spoligotype number and family were attributed based on comparison
with global SITVIT2 database at Institut Pasteur de Guadeloupe.
Altogether and not unexpectedly, the mean per-locus diversity was higher for the global
set of strains (Table 3). Comparison with available data from other published studies in Japan
and China (Table 3) re-confirmed a strong phylogeographical structure of M. tuberculosis
that appeared to have a direct impact on the observed diversity of the VNTR loci. Both China
and Japan are dominated by the Beijing genotype strains (Iwamoto et al., 2007; Jiao et al.,
2008; Millet et al., 2007; van Soolingen et al., 1995), a closely related clonal group of strains;
this leads to a low mean per-locus diversity as well as a low diversity of most VNTR loci,
including those from the 15-locus discriminatory set (Table 3). A lower mean HGI for
Beijing genotype samples (Japan and China) versus non-Beijing genotype samples (global
set, Japan, Bulgaria) may result from a stronger clonality in the Beijing genotype strains
compared to the much more diverse strains of other genotypes. On the other hand, the lower
mean HGI in the Beijing genotype samples in China versus Japan may be explained by the much
smaller sample size in the Chinese study. Further, although non-Beijing strains are likely to
be of diverse origins, these origins differ and depend on the area of isolation.
Nevertheless, individual and mean HGI values in the Bulgarian collection were similar to the
respective values for the non-Beijing sub-sample from Japan and the global collection (Table
3). This observation concerns geographically very distant locations, such as Bulgaria
and Japan, and it apparently lends additional support to the new 24-locus format of M.
tuberculosis genotyping. Admittedly, the mean HGI value in the non-Beijing sample from Japan
is higher than in Bulgaria, mainly owing to the loci MIRU31, MIRU39 and ETR-B, which show low
polymorphism in our collection. A general explanation, albeit speculative, may lie in
different levels of clonality, more or less recent dissemination of the strains in a survey
area, or more diverse origins of the circulating strains. These factors may be additionally
influenced by human population size and the level of urbanization.
An increase in the number of targeted VNTR loci is expected to result in increased
discrimination. Nevertheless, it also makes such multi-locus schemes rather time-consuming
and expensive in settings with relatively limited resources. It appears that primary
typing may reasonably be limited to a few loci if they still achieve sufficiently high
discrimination. As the population structure of the circulating M. tuberculosis strains varies
across world regions, these first-line typing schemes may be country-dependent and could
include different loci. The five loci that were most polymorphic in our study, used together,
achieved an HGI higher than that of IS6110-RFLP typing (Table 2). In this view, the
utility of the newly proposed 24-locus format (Supply et al., 2006) is manifested by
the fact that 4 of the 5 loci of the Bulgaria-specific reduced set are among these new loci
(Mtub04, Mtub21, QUB-11b, QUB-26), while only one locus was retained from the earlier
12-locus scheme (MIRU40). Accordingly, we preliminarily suggest these five loci
(Table 2) for first-line typing of M. tuberculosis strains in Bulgaria, although
further studies are undoubtedly required to test the proposed provisional scheme.

Figure 9. Visual presentation of the decision rules for definition of the LAM and S spoligotype families
(Filliol et al., 2002) and their application for ST125 and related ambiguous spoligotypes.

24-VNTR format: Phylogenetic Utility

On the basis of spoligotyping, the 133 strains were subdivided into 37 distinct
spoligotypes (Table 1; Figure 2). Application of the published rules for definition of the
major spoligotype clades (Brudey et al., 2006; Filliol et al., 2002) and comparison with
SITVIT2 global database permitted us to assign most of the 133 strains to the known
spoligotype families (Table 1). At the same time, spoligotypes ST4, ST125, and ST1280 were
classified as LAM/S since the absence of spacers 21-24 and 33-36 is specific for LAM family
whereas the absence of spacers 9-10 and 33-36 is specific for S family (Filliol et al., 2002)
(Figure 9). We additionally used a recently proposed PCR approach to the definition of the
LAM family (Marais et al., 2006) and found that ST4, ST125 and ST1280 strains did not
harbor a LAM-specific IS6110 insertion (Figure 10).
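
The ambiguity described above can be illustrated with a short sketch that applies only the two spacer-absence criteria quoted in the text to a 43-character binary spoligotype ("1", spacer present; "0", spacer absent). This is a deliberate simplification: the full decision rules of Filliol et al. (2002) shown in Figure 9 are not reproduced, and the example patterns are hypothetical rather than the actual profiles of ST4, ST125 or ST1280.

```python
def spacers_absent(spoligotype, first, last):
    """True if every spacer from first to last (1-based, inclusive) is absent."""
    return all(signal == "0" for signal in spoligotype[first - 1:last])


def lam_or_s(spoligotype):
    """Classify a 43-signal spoligotype by the two absence criteria quoted in the text."""
    if len(spoligotype) != 43 or set(spoligotype) - {"0", "1"}:
        raise ValueError("expected a string of 43 characters, each '0' or '1'")
    if not spacers_absent(spoligotype, 33, 36):
        return "other"
    lam = spacers_absent(spoligotype, 21, 24)  # LAM: spacers 21-24 and 33-36 absent
    s = spacers_absent(spoligotype, 9, 10)     # S: spacers 9-10 and 33-36 absent
    if lam and s:
        return "LAM/S"
    if lam:
        return "LAM"
    if s:
        return "S"
    return "other"


def pattern_without(absent_spacers):
    """Build a hypothetical spoligotype in which only the listed spacers are absent."""
    return "".join("0" if i in absent_spacers else "1" for i in range(1, 44))


lam_like = pattern_without({21, 22, 23, 24, 33, 34, 35, 36})
s_like = pattern_without({9, 10, 33, 34, 35, 36})
ambiguous = pattern_without({9, 10, 21, 22, 23, 24, 33, 34, 35, 36})

print(lam_or_s(lam_like), lam_or_s(s_like), lam_or_s(ambiguous))  # LAM S LAM/S
```

A profile that satisfies both criteria cannot be resolved by spoligotyping alone, which is why an independent marker such as the LAM-specific IS6110 insertion was used.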

Figure 10. LAM-specific PCR: (a) schematic view of the genome region harboring the LAM-specific
IS6110 insertion in strain F11. Three primers (arrows) are used in one reaction: LAM-F, LAM-R, XhoI.
Non-LAM strain (without this IS6110 insertion): primers LAM-R and LAM-F amplify a 141 bp
fragment. LAM strain (with this IS6110 insertion): primers LAM-R and XhoI amplify a 205 bp fragment;
(b) gel electrophoresis. Lanes 1, 2, 4, 5: LAM strain. M, molecular weight marker, 100 bp DNA
ladder (Amersham Bioscience).

The phylogenetic position of these strains (ST4, ST125 and ST1280) was further
investigated in the light of the VNTR data. Interestingly, strains of these three spoligotypes
were grouped closely in the 24-locus-based VNTR dendrogram, together with ST34, which
is a prototype of the S family (the cluster marked by * in Figure 8). It appears that spoligotypes
ST125, ST4, and ST1280 may indeed belong to the S family, although further studies
targeting VNTR loci in strains of these spoligotypes from other world regions are needed to
clarify their phylogenetic clade position.

Molecular Basis of Drug Resistance

The study collection included 37 drug-resistant and 96 susceptible M. tuberculosis strains
(Table 4). Monoresistance was identified in 15 of 37 drug-resistant strains, mostly
to RIF (7/15) or INH (5/15) (Table 4). Sixteen strains (12.0%)
were resistant to both RIF and INH and thus classified as multidrug resistant.
It should be noted that in Bulgaria, a total of 1360 TB cases (42% of all new TB cases)
were confirmed by culture in 2006; 1108 of them were subjected to DST and 24 (2.2%) of the
DST-screened cultures were found to be multidrug resistant (Euro TB, 2008). A total of 22
MDR M. tuberculosis strains were identified in Bulgaria in 2005 (4.6% of all DST-screened
cultures from all new TB cases) (Euro TB, 2007). The relative data for 2007 are not yet
available. Nevertheless, the extrapolation of the published information (Euro TB, 2007,
2008) allows us to estimate the total number of MDR M. tuberculosis strains identified in
Bulgaria between January 2005 and June 2007 (the survey period of this study) to be 58 strains.
Consequently, regarding the representativeness of our study panel, we note that the studied
sub-sample of the MDR isolates represents 29% (17/58) of the MDR M. tuberculosis cultures
isolated from newly diagnosed TB patients in Bulgaria within the survey period.

Table 4. Resistance patterns of 133 M. tuberculosis strains isolated in different regions of Bulgaria

Columns, in order: resistance type; phenotypic resistance profile (a); No. of isolates (%); rpoB wild
type; rpoB531 TCG>TTG; rpoB526 CAC>TAC; rpoB mutant del wt5 (b); katG315 wild type; katG315 AGC>ACC;
inhA -9 -25 (wild type); inhA -15C>T; embB306 wild type; embB306 mutant. Empty cells are not shown.

MDR                       HR (c)   9 (6.8)     3 6 5 4 6 6
MDR                       HRE      5 (3.7)     1 3 1 2 3 3 1 1 4
MDR                       HRS      3 (2.2)     1 2 1 2 3 3
Polyresistant (non-MDR)   RE       3 (2.2)     1 2 3 2 2 1 2
Polyresistant (non-MDR)   HE       1 (0.8)     1 1 1 1
Polyresistant (non-MDR)   ES       1 (0.8)     1 1 1 1
Monoresistant             R        7 (5.3)     1 4 2 7 6 1 7
Monoresistant             H        5 (3.7)     5 3 2 5 5
Monoresistant             E        1 (0.8)     1 1 1 1
Monoresistant             S        2 (1.5)     2 2 2 2
Fully susceptible                  96 (72.2)   96 96 96 96

(a) One-letter abbreviations of drug resistance: H, INH; R, RIF; E, EMB; S, STR.
(b) Absence of hybridization with wild type probe #5, i.e., a mutation in rpoB codons 530-534 (Morcillo et al., 2002; Mokrousov et al., 2006).
(c) No information on inhA and embB306 mutations was available for three HR strains.

This study found a high specificity and a sufficiently good sensitivity of the molecular
methods for detecting RIF- and EMB-resistant strains; the results for INH resistance are more
complex.
Three types of rpoB RRDR mutations were found in 20 of 27 RIF-resistant strains,
with rpoB S531L (TCG>TTG) being the most frequent (Table 4). The remaining 7 RIF-resistant
strains and all 106 RIF-susceptible strains had no mutation in the targeted rpoB hot-spot
region. Interestingly, 62.5% (10/16) of MDR strains were found to harbor the S531L mutation
in the rpoB hot-spot region. Sensitivity and specificity of the genotypic method to
detect RIF resistance were 74.1% and 100%, respectively.
Regarding RIF resistance, the high rate of the rpoB S531L (TCG>TTG) mutation
compared to the very low rate of the other rpoB mutations found in this study is striking (Table
4). A similar situation was described, e.g., for Russia and Kazakhstan, but there it was associated
with the Beijing genotype (Mokrousov et al., 2003; Hillemann et al., 2005). In other studies,
the rpoB 531TTG allele was found at a similar rate of ~50% in Beijing and non-Beijing
RIF-resistant strains from East Asia (Mokrousov et al., 2006; Qian et al., 2002), Taiwan (Jou
et al., 2005), and Latvia (Tracevska et al., 2003), and was even less represented in the Beijing
genotype RIF-resistant strains from Korea (41% vs 66% [Park et al., 2005]). A variation in
the prevalence of this rpoB S531L mutation among Beijing strains in different countries may
reflect not only the increased capacity of the Beijing family strains to readily acquire the most
frequently observed rpoB mutation but also some specific features of the national TB control
programs in different countries (Balabanova et al., 2004; Samarina et al., 2007). In our study,
the Beijing genotype was not found among the 133 clinical isolates from Bulgaria;
hence the current situation with MDR-TB in Bulgaria cannot be explained by global
dissemination of the Beijing genotype, which apparently has not yet reached this country.
Summing up, whether the very high rate of the rpoB S531L mutation is a surrogate marker of the
failure of the national TB control program or is hypothetically linked to another molecular
mechanism related to acquisition of RIF resistance needs to be addressed in further
investigations in different settings.
A mutation in embB306 was found in 7 of 11 EMB-resistant strains. No such mutation
was detected in EMB-susceptible strains (Table 4). Sensitivity and specificity of the
genotypic method to detect EMB-resistance were 63.6% and 100%, respectively. The results
on embB306 variation obtained in this and a recent German study (Plinke et al., 2006) are in
line with earlier findings that correlated mutations in embB306 with EMB resistance
(Sreevatsan et al., 1997; Ramaswamy et al., 2000; van Rie et al., 2001). They are in
contradiction with more recently reported discrepancies between genotypic and phenotypic
EMB resistance (Mokrousov et al., 2002b; Tracevska et al., 2004; Lee et al. 2004; Hazbon et
al., 2005). A number of explanations of these contradictory findings have been proposed.
Plinke et al. (2006) suggested that there is a small difference between the critical
concentration used for EMB susceptibility testing and the MIC, making susceptibility testing
more problematic. Mokrousov et al. (2002b) hypothesized an unknown mechanism in MDR
M. tuberculosis strains that leads to susceptibility to EMB. Hazbon et al. (2005) suggested
that the clear association between mutations in embB306 and EMB resistance found in
several earlier studies might be due to the use of pansusceptible strains as control groups.
Regarding this latter point, our study in the Bulgarian setting found the embB306 mutation in 7
strains, of which 6 were resistant to more than one drug. Indeed, the embB306 mutation
was not found in fully susceptible or monoresistant strains (except for one EMB-monoresistant
strain). However, all EMB-susceptible MDR strains in this study had the embB306 wild
type allele. Accordingly, it appears that the hypothesis of Hazbon et al. (2005) about embB306
as a marker of multidrug resistance is not completely supported by our data.
Molecular investigation of the genetic basis of INH resistance in M. tuberculosis strains in
Bulgaria targeted the two most frequently reported mutations related to INH resistance, katG
315AGC>ACC and inhA -15C>T (Baker et al., 2005; Guo et al. 2006; Nikolayevskyy et al.,
2007, and references therein). The KatG S315T (AGC>ACC) mutation was detected in 10 (45%)
of 22 INH-resistant isolates and in none of 110 INH-susceptible isolates (Table 4). Additional
analysis of the inhA promoter region revealed four strains that harbored the inhA -15C>T
mutation: one INH-resistant strain that also had the katG315 mutation and 3 INH-susceptible
strains. Of the latter, two strains were RIF- and EMB-resistant (rpoB531 and embB306
mutations), and one strain was RIF-resistant (rpoB531 mutation). Sensitivity and specificity
of the genotypic method to detect INH resistance were 45.5% and 99.1%, respectively.
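
The sensitivity and specificity figures reported for the three drugs follow from the counts given in the text, taking phenotypic drug susceptibility testing as the reference. The sketch below uses the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); the function names are illustrative. The specificity values for EMB and INH are not recomputed because the text does not give all of the counts they require.

```python
def sensitivity(true_pos, false_neg):
    """Percentage of phenotypically resistant strains that carry a targeted mutation."""
    return 100.0 * true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Percentage of phenotypically susceptible strains without a targeted mutation."""
    return 100.0 * true_neg / (true_neg + false_pos)


# RIF: rpoB hot-spot mutation in 20 of 27 resistant and 0 of 106 susceptible strains.
rif_sensitivity = sensitivity(20, 27 - 20)
rif_specificity = specificity(106, 0)

# EMB: embB306 mutation in 7 of 11 resistant strains.
emb_sensitivity = sensitivity(7, 11 - 7)

# INH: katG315 mutation in 10 of 22 resistant strains.
inh_sensitivity = sensitivity(10, 22 - 10)

print(round(rif_sensitivity, 1), round(rif_specificity, 1))  # 74.1 100.0
print(round(emb_sensitivity, 1))                             # 63.6
print(round(inh_sensitivity, 1))                             # 45.5
```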
The global prevalence of the katG S315T substitution in INH-resistant strains highlights
the selective advantage conferred by this mutation, which provides the optimal balance
between decreased catalase activity and a sufficiently high level of peroxidase activity in
KatG. Mutations in the inhA promoter region are thought to increase the InhA protein
expression, thereby elevating the drug target levels and producing INH resistance by a drug
titration mechanism (Mdluli et al., 1996). A large-scale study (Hazbon et al., 2006) showed
that mutations in katG315 were significantly more common in MDR isolates while mutations
in the inhA promoter were significantly more common in INH-monoresistant isolates.
The prevalence of the katG315 AGC>ACC mutation among INH-resistant M.
tuberculosis strains in the world varies but remains high, e.g., 47% in Finland (Marttila et al.,
2008), 61% in China (Jiao et al., 2007), 64% in India (Nusrath Unissa et al., 2007), 71% in
Vietnam (Caws et al., 2006), 92-94% in Russia (Mokrousov et al., 2002a; Voronina et al.,
2004; Afanasev et al., 2007). Accordingly, it appears that katG S315T mutation alone can be
used to reliably predict a high proportion of the INH-resistant strains in many world regions.
For Bulgaria this is not the case. Only 10 of 22 INH-resistant strains would be detected
genotypically through an analysis of the two targeted mutations, and this result is surprising.
Furthermore, the inhA mutation was found in only one INH-resistant strain, which also harbored a
katG315 mutation. Three other strains with the inhA mutation were INH-susceptible. It has been
suggested that the inhA -15C>T mutation can be present by itself and is associated with
low-level INH resistance, 0.2 mg/l (Guo et al., 2006). Hazbon et al. (2006) even observed a strong
negative association between mutations in katG315 and mutations in the inhA promoter
region (p<0.01). In this study, the MIC used for INH was 0.25 mg/l, and it may be that the
three INH-susceptible strains with the inhA promoter mutation (katG315 wild type) had
low-level INH resistance below the MIC used. Perhaps mutations in other parts of the katG gene
and in other gene regions, such as the ahpC promoter region, the inhA coding sequence, or other
genes, may account for resistance in the other INH-resistant isolates in this study.

Figure 11. UPGMA dendrogram of M. tuberculosis strains from Bulgaria based on 24 MIRU-VNTR
loci. A sensu lato cluster designates a group of strains differing by single- or double-locus
variation. Black and white circles define multi-city sensu lato clusters; grey circles define
one-city sensu lato clusters. Loci order in the 24-VNTR loci digital profile: MIRU2, MIRU4,
MIRU10, MIRU16, MIRU20, MIRU23, MIRU24, MIRU26, MIRU27, MIRU31, MIRU39, MIRU40, Mtub4, Mtub21,
Mtub30, Mtub39, ETRA, ETRC, QUB11b, QUB26, QUB4156, Mtub29, Mtub34, ETRB; B denotes
11 repeat copies in a locus; * designates missing data (PCR failure). Drug resistance: H, INH; R, RIF;
E, EMB; S, STR.

Geographic Distribution of Drug Resistant Strains and Comparison with
Clustering

A specific association of drug resistance properties and particular or predominant clones
may be a reason behind a high rate of drug resistance (Drobniewski et al., 2005; Narvskaya et
al., 2005; Nikolayevskyy et al., 2007). We investigated this issue by means of the molecular
fingerprinting approach.
Strain differentiation was performed by two standardized DNA fingerprinting techniques,
spoligotyping and 24-locus MIRU-VNTR typing, in order to assess the relationships among
strains at different levels of genetic relatedness. On the basis of spoligotyping, all 133 strains
were subdivided into 37 distinct spoligotypes (Table 1). A selection of 98 strains belonging
to different lineages was subjected to high-resolution VNTR typing using 24 MIRU-VNTR
loci. One should note that the reduction in sample size did not affect the genetic
diversity and representativeness of the studied collection (spoligotyping HGI for 133
strains = 0.893 versus HGI for 98 strains = 0.912). Previously, we demonstrated the superior
discriminatory power of MIRU typing over IS6110-RFLP typing for M. tuberculosis strains in
Bulgaria (Valcheva et al., 2008a, b). This explains our choice of the 24-locus MIRU-VNTR format
for secondary typing in this study. This method differentiated most of the studied strains
(Figure 11): five sensu stricto clusters (identical 24-locus profiles) consisted of two strains
each, two other sensu stricto clusters consisted of three and four strains, respectively, and
the 81 remaining isolates had unique 24-locus digital profiles; the HGI was 0.997. Additionally,
we designated as sensu lato clusters those that included strains with 0.05 dissimilarity
(Figure 11). This definition permitted us to identify 19 sensu lato clusters that included
from 2 to 6 strains (shown by grey or black/white circles in Figure 11).
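
As an illustration of how digital profiles are compared, the sketch below groups strains with identical 24-character profiles into sensu stricto clusters and counts the loci at which two profiles differ. The strain names and profiles are hypothetical and are not taken from Figure 11; one character per locus is assumed, with a letter standing for alleles above 9 repeats as in the figure legend. The UPGMA procedure and the 0.05 dissimilarity cutoff used for the sensu lato clusters are not reproduced here.

```python
from collections import defaultdict


def locus_differences(profile_a, profile_b):
    """Number of loci at which two digital profiles of equal length differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def sensu_stricto_clusters(profiles):
    """Groups of two or more strains that share an identical profile."""
    groups = defaultdict(list)
    for strain, profile in profiles.items():
        groups[profile].append(strain)
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Hypothetical 24-locus profile; positions follow the loci order of the digital profile.
base = "223325153322443244423352"
profiles = {
    "strain_1": base,
    "strain_2": base,                               # identical to strain_1
    "strain_3": base[:19] + "B" + base[20:],        # differs at locus 20 only
    "strain_4": "1" + base[1:5] + "1" + base[6:],   # differs at loci 1 and 6
}

print(sensu_stricto_clusters(profiles))                             # [['strain_1', 'strain_2']]
print(locus_differences(profiles["strain_1"], profiles["strain_3"]))  # 1
print(locus_differences(profiles["strain_1"], profiles["strain_4"]))  # 2
```

In this toy set, strain_3 and strain_4 would be candidates for a sensu lato relationship with the strain_1/strain_2 cluster as single- and double-locus variants, respectively.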
A comparison with spoligotyping data revealed both the heterogeneity of our collection
and a prevalence of two spoligotypes, ST53 and ST125 that differed in their MDR/drug-
resistant TB rate (Table 5).

Table 5. Distribution of drug resistant strains in spoligotypes in this study (a)

Spoligotype   All strains   Drug resistant   MDR
ST125         25            3                0
ST154         3             0                0
ST2905        5             1                0
ST284         7             3                1
ST34          7             1                0
ST4           5             0                0
ST41          6             3                0
ST453         3             1                1
ST47          7             2                1
ST53          33            11               7
singletons    20            11               7

(a) The total number of strains in this table does not correspond to the total number of strains in
this study (n=133) since only the data on the main shared types (more than two strains) and
singletons are shown.

ST125 did not include MDR strains, and the difference in the rate of drug-resistant strains
between ST53 and ST125 was noticeable but not statistically significant (p=0.11), perhaps due
to the small sample size. It may be noted that the sixteen MDR strains exhibited a sufficiently
high spoligotype diversity, as they belonged to four shared types (ST53, ST47, ST453 and ST284)
and 7 singletons. Furthermore, 11 of 20 spoligotyping-based singletons were drug-resistant
(Table 5). Taken together, these findings suggest that the emergence and spread of
drug-resistant and MDR-TB in Bulgaria are not linked to the spoligotype-defined population
structure of M. tuberculosis in this country.
Application of the high-resolution 24-locus MIRU typing revealed a significant
heterogeneity of our collection. Seven sensu stricto clusters of two to four strains were
identified (Figure 11); they may speculatively correlate with recent transmission or close
relatedness of these strains, even though an epidemiological link between patients was not
established. At the same time, mathematical modeling of the evolution of VNTR loci in M.
tuberculosis estimated a very slow mutation rate for the repeats (Grant et al., 2008), and sensu
lato clusters may reflect a long-term historical evolution of clones. A comparison with the city
of isolation revealed that 13 sensu lato clusters included strains from the same city whereas
6 sensu lato clusters included strains from more than one city. The one-city clusters may
reflect a biologically vertical, family/household-based transmission that is confined to a
particular geographic area. Conversely, the multi-city clusters may reflect a horizontal
transmission due to human migration across the country. Since the first case is somewhat
more frequent, local circulation of clones appears to be the more significant factor to take into
consideration in molecular epidemiological studies of tuberculosis in Bulgaria.
A closer look at the VNTR tree did not identify a cluster associated with drug resistance;
instead, drug-resistant strains were evenly dispersed across the dendrogram (Fig. 11). Fifteen
of 51 broadly clustered strains (sensu lato clusters) were drug-resistant; of these, 8 were
polyresistant. Compared to 12 drug-resistant (8 polyresistant) out of 47 non-clustered strains,
there was no statistical difference in the prevalence of (multiple) drug resistance between
clustered and non-clustered isolates.
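
The comparison above can be illustrated with a 2x2 test on the reported counts (15 of 51 clustered versus 12 of 47 non-clustered strains being drug-resistant). The text does not state which test was used; the sketch below assumes a Pearson chi-square test without continuity correction, and the function name is illustrative. For one degree of freedom the p-value equals erfc(sqrt(chi2 / 2)).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: clustered (sensu lato) and non-clustered strains.
# Columns: drug-resistant and drug-susceptible strains.
chi2, p_value = chi_square_2x2(15, 51 - 15, 12, 47 - 12)

print(round(100 * 15 / 51, 1), round(100 * 12 / 47, 1))  # 29.4 25.5
print(round(chi2, 3), p_value > 0.05)                    # 0.184 True
```

Under this assumed test the two proportions (29.4% and 25.5%) do not differ at the 0.05 level, in line with the statement in the text.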

Conclusion

This study was undertaken in order to gain a first insight into the population structure
of M. tuberculosis in Bulgaria. Spoligotyping was used as the primary typing tool because of its
ease of use and the straightforward coding and interpretation of results; furthermore, the
availability of the international database permitted us to view our results in the context of
the globally and locally circulating M. tuberculosis clones. The Bulgarian M. tuberculosis
population is both sufficiently heterogeneous and dominated by several worldwide distributed
and Balkan-specific spoligotypes. The majority of the studied Bulgarian strains (54%) belonged
to four spoligotypes. It is possible that they represent either clones (ST125) or subclones
(within ST41, ST53, ST47) phylogeographically specific to the Balkan region and could have
been brought here through the Bulgarian-Turkish interaction, either very recent or, conversely,
in the distant past.
The present study also evaluated new markers for molecular typing of Mycobacterium
tuberculosis with a collection of strains circulating in Bulgaria. We demonstrated that
MIRU-VNTR typing is more discriminatory than spoligotyping and IS6110-RFLP
typing of M. tuberculosis. Consequently, the new 24-locus MIRU-VNTR format (Supply et al.,
2006) appears to be taking an increasingly leading position as the primary method for M.
tuberculosis epidemiological typing. A reduced 5-locus set (MIRU40, Mtub04, Mtub21,
QUB-11b, and QUB-26) provided sufficiently high differentiation and may be used for
first-line typing of M. tuberculosis isolates in Bulgaria, although further studies are needed to
validate this scheme. At the same time, a comprehensive secondary subtyping of clustered
isolates should target all 15 loci of the discriminatory set of Supply et al. (2006), at least for
the time being.
A detailed discussion of the reasons for the much higher MDR-TB rate in both newly and
previously diagnosed TB patients in Bulgaria compared to all neighboring countries is
beyond the scope of this study. Basically, these reasons may lie in poor institutional TB
control policies, including management of MDR-TB, and in inappropriate medical prescribing or
self-prescribing. For example, monoresistance, found in 15 of 37 drug-resistant isolates in this
study, is known to arise mainly due to non-compliance or wrong prescribing. Accordingly, it
may be an additional indication (along with the very high rate of the rpoB S531L mutation) of
insufficient anti-TB control in Bulgaria.
Analysis of the molecular basis of drug resistance revealed that rpoB RRDR and
embB306 mutations may serve for rapid genotypic detection of the majority of the RIF- and
EMB-resistant M. tuberculosis strains in Bulgaria. The results for INH resistance are complex
and further investigation of more genes is needed. Comparison with spoligotyping and
24-locus VNTR typing data did not reveal a significant difference in the distribution of drug
resistance between clustered and non-clustered isolates. The emergence and spread of
drug-resistant and MDR-TB in Bulgaria are not associated with any specific spoligotype or
MIRU-VNTR cluster. Local circulation of particular area-specific clones appears to be an
important factor to take into consideration in molecular epidemiological studies of
tuberculosis in Bulgaria.

Acknowledgments

We are grateful to all colleagues from regional TB laboratories for kindly providing
mycobacterial isolates. This work was supported by NATO's Public Diplomacy Division in
the framework of the Science for Peace program (grant SFP-982319 "Detect drug-resistant
TB") and by a grant from the European Commission to Igor Mokrousov (Marie Curie Fellowship
contract No MIF1-CT-2007-039389).

References

Afanas'ev, M.V., Ikryannikova, L.N., Il'ina, E.N., Sidorenko, S.V., Kuz'min, A.V.,
Larionova, E.E., Smirnova, T.G., Chernousova, L.N., Kamaev, E.Y., Skorniakov, S.N.,
Kinsht, V.N., Cherednichenko, A.G., Govorun, V.M. (2007) Molecular characteristics of
rifampicin- and isoniazid-resistant Mycobacterium tuberculosis isolates from the Russian
Federation. J. Antimicrob. Chemother. 59, 1057-1064.
Al-Hajoj, S.A., Zozio, T., Al-Rabiah, F., Mohammad, V., Al-Nasser, M., Sola, C., Rastogi,
N. (2007) First insight into the population structure of Mycobacterium tuberculosis in
Saudi Arabia. J. Clin. Microbiol. 45, 2467-2473.
Allix-Bguec, C., Fauville-Dufaux, M., Supply, P. (2008) Three-year population-based
evaluation of standardized mycobacterial interspersed repetitive unit-variable number of
tandem repeat typing of Mycobacterium tuberculosis. J. Clin. Microbiol. 46, 1398-1406.
Baker, L.V., Brown, T.J., Maxwell, O., Gibson, A.L., Fang, Z., Yates, M.D., Drobniewski,
F.A. (2005) Molecular analysis of isoniazid-resistant Mycobacterium tuberculosis
isolates from England and Wales reveals the phylogenetic significance of the ahpC 46A
polymorphism. Antimicrob. Agents. Chemother. 49, 14551464.
Balabanova, Y., Fedorin, I., Kuznetsov, S., Graham, C., Ruddy, M., Atun, R., Coker, R.,
Drobniewski, F. (2004) Antimicrobial prescribing patterns for respiratory diseases
including tuberculosis in Russia: a possible role in drug resistance ? J. Antimicrob.
Chemother. 54, 673-679.
Brudey, K., Filliol, I., Ferdinand, S., Guernier, V., Duval, P., Maubert, B., Sola, C., Rastogi,
N. (2006a) Long-term population-based genotyping study of Mycobacterium
tuberculosis complex isolates in the French departments of the Americas. J. Clin.
Microbiol. 44, 183-191.
Brudey, K., Driscoll, J.R., Rigouts, L., Prodinger, W.M., Gori, A., Al-Hajoj, S.A., Allix, C.,
Aristimuo, L., Arora, J., Baumanis, V., Binder, L., Cafrune, P., Cataldi, A., Cheong, S.,
Diel, R., Ellermeier, C., Evans, J.T., Fauville-Dufaux, M., Ferdinand, S., Garcia de
Viedma, D., Garzelli, C., Gazzola, L., Gomes, H.M., Guttierez, M.C., Hawkey, P.M., van
Helden, P.D., Kadival, G.V., Kreiswirth, B.N., Kremer, K., Kubin, M., Kulkarni, S.P.,
Liens, B., Lillebaek, T., Ho, M.L., Martin, C., Martin, C., Mokrousov, I., Narvskaa, O.,
Ngeow, Y.F., Naumann, L., Niemann, S., Parwati, I., Rahim, Z., Rasolofo-
Razanamparany, V., Rasolonavalona, T., Rossetti, M.L., Rsch-Gerdes, S., Sajduda, A.,
Samper, S., Shemyakin, I.G., Singh, U.B., Somoskovi, A., Skuce, R.A., van Soolingen,
D., Streicher, E.M., Suffys, P.N., Tortoli, E., Tracevska, T., Vincent, V., Victor, T.C.,
Warren, R.M., Yap, S.F., Zaman, K., Portaels, F., Rastogi, N., Sola, C. (2006b)
Mycobacterium tuberculosis complex genetic diversity: mining the fourth international
spoligotyping database (SpolDB4) for classification, population genetics and
epidemiology. BMC Microbiol. 6, 23.
Caws, M., Duy, P.M., Tho, D.Q., Lan, N.T., Hoa, D.V., Farrar, J. (2006) Mutations prevalent
among rifampin- and isoniazid-resistant Mycobacterium tuberculosis isolates from a
hospital in Vietnam. J. Clin. Microbiol. 44, 2333-2337.
Drobniewski, F., Balabanova, Y., Nikolayevsky, V., Ruddy, M., Kuznetzov, S., Zakharova,
S., Melentyev, A., Fedorin, I., 2005. Drug-resistant tuberculosis, clinical virulence, and
the dominance of the Beijing strain family in Russia. JAMA 293, 2726-2731.
El Sahly, H.M., Wright, J.A., Soini, H., Bui, T.T., Williams-Bouyer, N., Escalante, P.,
Musser, J.M., Graviss, E.A. (2004) Recurrent tuberculosis in Houston, Texas: a
population-based study. Int. J. Tuberc. Lung Dis. 8, 333-340.
Euro TB. Report on tuberculosis cases notified in 2006. Saint-Maurice, France : EuroTB
Institut de Veille Sanitaire, 2008.
Filliol, I., Driscoll, J.R., Van Soolingen, D., Kreiswirth, B.N., Kremer, K., Valtudie, G.,
Anh, D.D., Barlow, R., Banerjee, D., Bifani, P.J., Brudey, K., Cataldi, A., Cooksey, R.C.,
Cousins, D.V., Dale, J.W., Dellagostin, O.A., Drobniewski, F., Engelmann, G.,
Ferdinand, S., Gascoyne-Binzi, D., Gordon, M., Gutierrez, M.C., Haas, W.H., Heersma,
H., Kllenius, G., Kassa-Kelembho, E., Koivula, T., Ly, H.M., Makristathis, A.,
Mammina, C., Martin, G., Mostrm, P., Mokrousov, I., Narbonne, V., Narvskaya, O.,
Nastasi, A., Niobe-Eyangoh, S.N., Pape, P.W., Rasolofo-Razanamparany, V., Ridell, M.,
Rossetti, M.L., Stauffer, F., Suffys, P.N., Takiff, H., Texier-Maugein, J., Vincent, V., De
Waard, J.H., Sola, C., Rastogi, N. (2002) Global distribution of Mycobacterium
tuberculosis spoligotypes. Emerg. Infect. Dis. 8, 1347-1349.
Felsenstein, J. (2004) PHYLIP (Phylogeny Inference Package) version 3.6b. Department of
Genome Sciences, University of Washington, Seattle.
Frothingham, R., Meeker-O'Connell. W.A. (1998) Genetic diversity in the Mycobacterium
tuberculosis complex based on variable numbers of tandem DNA repeats. Microbiology
144, 1189-1196.
Grimont, P.A.D. Taxotron package. Taxolab, Institut Pasteur, Paris, 2000.
Gilman, J., Myatt, M. (1998) EpiCalc 2000, version 102, London, UK: Brixton Books
Grant, A., Arnold, C., Thorne, N., Gharbia, S., Underwood, A. (2008) Mathematical
modelling of Mycobacterium tuberculosis VNTR loci estimates a very slow mutation rate
for the repeats. J. Mol. Evol. 66, 565-574.
Guo, H., Seet, Q., Denkin, S., Parsons, L., Zhang, Y. (2006) Molecular characterization of
isoniazid-resistant clinical isolates of Mycobacterium tuberculosis from the USA. J. Med.
Microbiol. 55, 1527-1531.
Hazbn, M.H., Bobadilla del Valle, M., Guerrero, M.I., Varma-Basil, M., Filliol, I., Cavatore,
M., Colangeli, R., Safi, H., Billman-Jacobe, H., Lavender, C., Fyfe, J., Garca-Garca, L.,
Davidow, A., Brimacombe, M., Len, C.I., Porras, T., Bose, M., Chaves, F., Eisenach,
K.D., Sifuentes-Osornio J., Ponce de Len, A., Cave, M.D., Alland, D. (2005) Role of
embB codon 306 mutations in Mycobacterium tuberculosis revisited: a novel association
with broad drug resistance and IS6110 clustering rather than ethambutol resistance.
Antimicrob. Agents Chemother. 49, 37943802.
Hazbn, M.H., Brimacombe, M., Bobadilla del Valle, M., Cavatore, M., Guerrero, M.I.,
Varma-Basil, M., Billman-Jacobe, H., Lavender, C., Fyfe, J., Garca-Garca, L., Len,
C.I., Bose, M., Chaves, F., Murray, M., Eisenach, K.D., Sifuentes-Osornio, J., Cave,
M.D., Ponce de Len, A., Alland, D. (2006) Population genetics study of isoniazid
resistance mutations and evolution of multidrug-resistant Mycobacterium tuberculosis.
Antimicrob. Agents Chemother. 50, 2640-2649.
Hillemann, D., Kubica, T., Agzamova, R., Venera, B., Rusch-Gerdes, S., Niemann, S. (2005)
Rifampicin and isoniazid resistance mutations in Mycobacterium tuberculosis strains
isolated from patients in Kazakhstan. Int. J. Tuberc. Lung Dis. 9, 1161-1167.
History Map of Europe, Year 1600. http://www.euratlas.com/big/big1600.htm
Hunter, P.R., Gaston, M.A. (1988) Numerical index of the discriminatory ability of typing
systems: an application of Simpsonss index of diversity. J. Clin. Microbiol. 26, 2465-
2466.
Hooper, P.L. Forced Population Transfers in Early Ottoman Imperial Strategy: a Comparative
Approach. Thesis. Princeton University, Princeton, 2003.
Iwamoto, T., Yoshida, S., Suzuki, K., Tomita, M., Fujiyama, R., Tanaka, N., Kawakami, Y.,
Ito, M. (2007) Hypervariable loci that enhance the discriminatory ability of newly
proposed 15-loci and 24-loci variable-number tandem repeat typing method on
Mycobacterium tuberculosis strains predominated by the Beijing family. FEMS
Microbiol. Lett. 272, 282-283.
Jiao, W.W., Mokrousov I., Sun G.Z., Guo Y.J., Vyazovaya A., Narvskaya O., Shen A.D.
(2008) Evaluation of new variable-number tandem repeat typing systems of
Mycobacterium tuberculosis with Beijing genotype isolates from Beijing, China. J. Clin.
Microbiol. 46, 1045-1049.
Jiao, W.W., Mokrousov, I., Sun, G.Z., Li, M., Liu, J.W., Narvskaya, O., Shen, A.D., 2007.
Molecular characteristics of rifampin and isoniazid resistant Mycobacterium tuberculosis
strains from Beijing, China. Chin. Med. J. (Engl.) 120, 814-819.
Jou, R., Chen, H.Y., Chiang, C.Y., Yu, M.C., Su, I.J. (2005) Genetic diversity of multidrug-
resistant Mycobacterium tuberculosis isolates and identification of 11 novel rpoB alleles
in Taiwan. J. Clin. Microbiol. 43, 1390-1394.
Kamerbeek, J., Schouls, L., Kolk, A., van Agterveld, M., van Soolingen, D., Kuijper, S.,
Bunschoten, A., Molhuizen, H., Shaw, R., Goyal, M., van Embden, J. (1997) Rapid
detection and simultaneous strain differentiation of Mycobacterium tuberculosis for
diagnosis and tuberculosis control. J. Clin. Microbiol. 35, 907-914.
Kremer, K., van Soolingen, D., Frothingham, R., Haas, W.H., Hermans, P.W.M., Martin, C.,
Palittapongarnpim, P., Plikaytis, B.B., Riley, L.W., Yakrus, M.A., Musser, J.M., van
Embden, J.D.A. (1999) Comparison of methods based on different molecular
epidemiological markers for typing Mycobacterium tuberculosis complex strains:
interlaboratory study of discriminatory power and reproducibility. J. Clin. Microbiol. 37,
26072618.
Kremer, K., Glynn, J. R., Lillebaek, T., Niemann, S., Kurepina, N.E., Kreiswirth, B.N.,
Bifani, P.J., van Soolingen, D. (2004) Definition of the Beijing/W Lineage of
Mycobacterium tuberculosis on the Basis of Genetic Markers. J. Clin. Microbiol. 42,
40404049.
Lee, A.S.G., Teo, A.S.M., Wong, S.Y. (2001) Novel mutations in ndh in isoniazid-resistant
Mycobacterium tuberculosis isolates. Antimicrob. Agents Chemother. 45, 2157-2159.
Lee, A.S., Othman, S.N., Ho, Y.M., Wong, S.Y. (2004) Novel mutations within the embB
gene in ethambutol-susceptible clinical isolates of Mycobacterium tuberculosis.
Antimicrob. Agents. Chemother. 48, 44474449.
Marais, B.J., Victor, T.C., Hesseling, A.C., Barnard, M., Jordaan, A., Brittle, W., Reuter, H.,
Beyers, N., van Helden, P.D., Warren, R.M., Schaaf, H.S. (2006) Beijing and Haarlem
genotypes are overrepresented among children with drug-resistant tuberculosis in the
Western Cape Province of South Africa. J. Clin. Microbiol. 44, 3539-3543.
Martin, A., Portaels, F. (2007) Drug resistance and drug resistance detection. In: Palomino
JC, Leo SC, Ritacco V, eds. Tuberculosis. From basic science to patient care.
www.TuberculosisTextbook.com, pp. 635-660.
Marttila, H.J., Mkinen, J., Marjamki, M., Ruutu, P., Soini, H. (2008) Molecular genetics of
drug-resistant Mycobacterium tuberculosis isolates in Finland, 19952004. Int. J.
Tuberc. Lung Dis. 12, 338343.
Mdluli, K., Sherman, D.R., Hickey, M.J., Kreiswirth, B.N., Morris, S., Stover, C.K., Barry,
3rd C.E. (1996) Biochemical and genetic data suggest that InhA is not the primary target
for activated isoniazid in Mycobacterium tuberculosis. J. Infect. Dis. 174, 1085-1090.
Millet, J., Miyagi-Shiohira, C., Yamane, N., Sola, C., Rastogi, N. (2007) Assessment of
mycobacterial interspersed repetitive unit-QUB markers to further discriminate the
Beijing genotype in a population-based study of the genetic diversity of Mycobacterium
tuberculosis clinical isolates from Okinawa, Ryukyu Islands, Japan. J. Clin. Microbiol.
45, 3606-3615.
Mokrousov, I. (2007) Towards a quantitative perception of human-microbial co-evolution.
Front. Biosci. 12, 4818-4825.
Mokrousov, I. (2008) Genetic geography of Mycobacterium tuberculosis Beijing genotype: a
multifacet mirror of human history? Infect. Genet. Evol. 8, 777-785.
Mokrousov, I., Bhanu, N.V., Suffys, P.N., Kadival, G.V., Yap, S.F., Cho, S.N., Jordaan,
A.M., Narvskaya, O., Singh, U.B., Gomes, H.M., Lee, H., Kulkarni, S.P., Lim, K.C.,
Khan, B.K., van Soolingen, D., Victor, T.C., Schouls, L.M. (2004) Multicenter
evaluation of reverse line blot assay for detection of drug resistance in Mycobacterium
tuberculosis clinical isolates. J. Microbiol. Meth. 57, 323-335.
Mokrousov, I., Jiao, W.W., Sun, G.Z., Liu, J.W., Li, M., Narvskaya, O., Shen, A.D (2006)
Evaluation of the rpoB macroarray assay to detect rifampin resistance and comparison
with population structure of Mycobacterium tuberculosis in Beijing, China. Eur. J. Clin.
Microbiol. Infect. Dis. 25, 703-710.
Mokrousov, I., Ly, H.M., Otten, T., Lan, N.N., Vyshnevskyi, B., Hoffner, S., Narvskaya, O.
(2005) Origin and primary dispersal of the Mycobacterium tuberculosis Beijing
genotype: clues from human phylogeography. Genome Res. 15, 1357-1364.
Mokrousov, I., Narvskaya, O., Limeschenko, E., Vyazovaya, A., Otten, T., Vyshnevskiy, B.
(2004) Analysis of the allelic diversity of the mycobacterial interspersed repetitive units
in Mycobacterium tuberculosis strains of the Beijing family: practical implications and
evolutionary considerations. J. Clin. Microbiol. 42, 2438-2444.
Mokrousov, I., Narvskaya, O., Otten, T., Limeschenko, E., Steklova, L., Vyshnevskiy, B.,
(2002a) High prevalence of KatG Ser315Thr substitution among isoniazid-resistant
Mycobacterium tuberculosis clinical isolates from northwestern Russia, 1996 to 2001.
Antimicrob. Agents Chemother. 46, 1417-1424.
Mokrousov, I., Narvskaya, O., Vyazovaya, A., Millet, J., Otten, T., Vishnevsky, B., Rastogi,
N. (2008) Mycobacterium tuberculosis Beijing genotype in Russia: in search of
informative VNTR loci. J. Clin. Microbiol. doi:10.1128/JCM.00414-08.
Mokrousov, I., Otten, T., Vishnevsky, B., Narvskaya, O. (2005) Molecular basis of anti-
tuberculosis drug resistance and its genotypic detection in Russia. In: Read, M.M., ed.
Trends in DNA fingerprinting research. New York: Nova Science Publishers, Inc, pp.83-
109.
Mokrousov, I., Otten, T., Vyazovaya, A., Limeschenko, E., Filipenko, M., Sola, C., Rastogi,
N., Steklova, L., Vishnevsky, B., Narvskaya, O. (2003) PCR based methodology for
detecting multi-drug resistant strains of Mycobacterium tuberculosis Beijing family
circulating in Russia. Eur. J. Clin. Microbiol. Infect. Dis. 22, 342-348.
Mokrousov, I., Otten, T., Vyshnevskiy, B., Narvskaya, O. (2002b) Detection of embB306
mutations in ethambutol-susceptible clinical isolates of Mycobacterium tuberculosis from
northwestern Russia: implications for genotypic resistance testing. J. Clin. Microbiol. 40,
38103813.
Morcillo, N., Zumarraga, M., Alito, A., Dolmann, A., Schouls, L., Cataldi, A., Kremer, K.,
van Soolingen, D. (2002) A low cost, home-made, reverse-line blot hybridisation assay
for rapid detection of rifampicin resistance in Mycobacterium tuberculosis. Int. J.
Tuberc. Lung Dis. 6, 959-965.
Mostrm, P., Gordon, M., Sola, C., Ridell, M., Rastogi, N. (2002) Methods used in the
molecular epidemiology of tuberculosis. Clin. Microbiol. Infect. 8, 694-704.
Narvskaya, O., Mokrousov, I., Otten, T., Vishnevsky, B. (2005) Molecular markers:
application for studies of Mycobacterium tuberculosis population in Russia In: Read
MM, ed. Trends in DNA fingerprinting research. New York: Nova Science Publishers
Inc, pp.111-125.
Nikolayevskyy, V.V., Brown, T.J., Bazhora, Y.I., Asmolov, A.A., Balabanova, Y.M.,
Drobniewski, F.A. (2007) Molecular epidemiology and prevalence of mutations
conferring rifampicin and isoniazid resistance in Mycobacterium tuberculosis strains
from the southern Ukraine. Clin. Microbiol. Infect. 13, 129-138.
Nusrath Unissa, A., Selvakumar, N., Narayanan, S., Narayanan, P.R. (2008) Molecular
analysis of isoniazid-resistant clinical isolates of Mycobacterium tuberculosis from India.
Int. J. Antimicrob. Agents 31, 71-75.
Oelemann, M.C., Diel, R., Vatin, V., Haas, W., Rsch-Gerdes, S., Locht, C., Niemann, S.,
Supply, P. (2007) Assessment of an optimized mycobacterial interspersed repetitive-
unit-variable-number tandem-repeat typing system combined with spoligotyping for
population-based molecular epidemiology studies of tuberculosis. J. Clin. Microbiol. 45,
691-697.
Park, Y.K., Shin, S., Ryu, S., Cho, S.N., Koh, W.J., Kwon, O.J., Shim, Y.S., Lew, W.J., Bai,
G.H. (2005) Comparison of drug resistance genotypes between Beijing and non-Beijing
family strains of Mycobacterium tuberculosis in Korea. J. Microbiol. Methods 63, 165-
172.
Plinke, C., Rsch-Gerdes, S., Niemann, S. (2006) Significance of mutations in embB codon
306 for prediction of ethambutol resistance in clinical Mycobacterium tuberculosis
isolates. Antimicrob. Agents Chemother. 50, 1900-1902.
Qian, L., Abe, C., Lin, T.P., Yu, M.C., Cho, S.N., Wang, S., Douglas, J.T. (2002) rpoB
genotypes of Mycobacterium tuberculosis Beijing family isolates from East Asian
countries. J. Clin. Microbiol. 40, 1091-1094.
Ramaswamy, S., Musser, J. (1998) Molecular genetic basis of antimicrobial agent resistance
in Mycobacterium tuberculosis: 1998 update. Tuberc. Lung Dis. 79, 3-29.
Ramaswamy, S.V., Amin, A.G., Goksel, S., Stager, C.E., Dou, S.J., El Sahly, H., Moghazeh,
S.L., Kreiswirth, B.N., Musser, J.M. (2000) Molecular genetic analysis of nucleotide
polymorphisms associated with ethambutol resistance in human isolates of
Mycobacterium tuberculosis. Antimicrob. Agents Chemother. 44, 326336.
Reyes, J.F., Francis, A.R., Tanaka, M.M. (2008) Models of deletion for visualizing bacterial
variation: an application to tuberculosis spoligotypes. Submitted.
Samarina, A., Zhemkov, V., Zakharova, O., Hoffner, S. (2007) Tuberculosis in St Petersburg
and the Baltic Sea region. Scand. J. Infect. Dis. 39, 308-314.
Slayden, R.A., Barry, C.E., 3
rd
. (2000) The genetics and biochemistry of isoniazid resistance
in Mycobacterium tuberculosis. Microbes Infect. 2, 659-669.
Sola, C., Filliol, I., Legrand, E., Mokrousov, I., Rastogi, N. (2001) Mycobacterium
tuberculosis phylogeny reconstruction based on combined numerical analysis with
IS1081, IS6110, VNTR, and DR-based spoligotyping suggests the existence of two new
phylogeographical clades. J. Mol. Evol. 53, 680-689.
Sreevatsan, S., Stockbauer, K.E., Pan, X., Kreiswirth, B.N., Moghazeh, S.L., Jacobs, W.R.,
Jr, Telenti, A., Musser, J.M. (1997) Ethambutol resistance in Mycobacterium
tuberculosis: critical role of embB mutations. Antimicrob. Agents Chemother. 41, 1677
1681.
Sun, Y.J., Bellamy, R., Lee, A.S., Ng, S.T., Ravindran, S., Wong, S.Y., Locht, C., Supply, P.,
Paton, N.I. (2004) Use of mycobacterial interspersed repetitive unit-variable-number
tandem repeat typing to examine genetic diversity of Mycobacterium tuberculosis in
Singapore. J. Clin. Microbiol. 42, 1986-1993.
Supply, P., Allix, C., Lesjean, S., Cardoso-Oelemann, M., Rusch-Gerdes, S., Willery, E.,
Savine, E., de Haas, P., van Deutekom, H., Roring, S., Bifani, P., Kurepina, N.,
Kreiswirth, B., Sola, C., Rastogi, N., Vatin, V., Gutierrez, M.C., Fauville, M., Niemann,
S., Skuce, R., Kremer, K., Locht, C., van Soolingen, D. (2006) Proposal for
standardization of optimized mycobacterial interspersed repetitive unit-variable-number
tandem repeat typing of Mycobacterium tuberculosis. J. Clin. Microbiol. 44, 4498-4510.
Supply, P., Lesjean, S., Savine, E., Kremer, K., van Soolingen, D., Locht, C. (2001)
Automated high-throughput genotyping for study of global epidemiology of
Mycobacterium tuberculosis based on mycobacterial interspersed repetitive units. J. Clin.
Microbiol. 39, 35633571.
Swofford, D.L. PAUP*: Phylogenetic Analysis Using Parsimony (and Other Methods) 4.0
Beta. Sinauer Associates, Sunderland, Massachusetts, 2002.
Tang, C., Reyes, J.F., Luciani, F., Francis, A.R., Tanaka, M.M. (2008) spolTools: online
utilities for analyzing spoligotypes of the Mycobacterium tuberculosis complex.
Bioinformatics. Published online 18 August 2008.
Telenti, A., Imboden, P., Marchesi, F., Lowrie, D., Cole, S., Colston, M.J., Matter, L.,
Schopfer, K., Bodmer, T. (1993) Detection of rifampicin-resistant mutations in
Mycobacterium tuberculosis. Lancet 341, 647-650.
Telenti, A., Philipp, W.J., Sreevatsan, S., Bernasconi, C., Stockbauer, K.E., Wieles, B.,
Musser, J.M., Jacobs, W.R., Jr. (1997) The emb operon, a gene cluster of Mycobacterium
tuberculosis involved in resistance to ethambutol. Nature Med. 3, 567-570.
Tracevska, T., Jansone, I., Baumanis, V., Marga, O., Lillebaek, T. (2003) Prevalence of
Beijing genotype in Latvian multidrug-resistant Mycobacterium tuberculosis isolates. Int.
J. Tuberc. Lung. Dis. 7, 1097-1103.
Tracevska, T., Jansone, I., Nodieva, A., Marga, O., Skenders, G., Baumanis, V. (2004)
Characterisation of rpsL, rrs and embB mutations associated with streptomycin and
ethambutol resistance in Mycobacterium tuberculosis. Res. Microbiol. 155, 830834.
Valcheva, V., Mokrousov, I., Rastogi, N., Narvskaya, O., Markova, N. (2008a) Molecular
characterization of Mycobacterium tuberculosis isolates from different regions of
Bulgaria. J. Clin. Microbiol. 46, 1014-1018.
Valcheva, V., Mokrousov, I., Narvskaya, O., Rastogi, N., Markova, N. (2008b) Utility of new
24-locus variable-number tandem-repeat typing for discriminating Mycobacterium
tuberculosis clinical isolates collected in Bulgaria. J. Clin. Microbiol. 46, 3005-3011.
Valcheva, V., Mokrousov, I., Narvskaya, O., Rastogi, N., Markova, N. (2008c) Molecular
snapshot of drug-resistant and drug-susceptible Mycobacterium tuberculosis strains
circulating in Bulgaria. Infect. Genet. Evol. 8, 657-663.
van Embden, J., Cave, M., Crawford, J., Dale, J., Eisenach, K., Gicquel, B., Hermans, P.,
Martin, C., McAdam, R., Shinnick, T., Small, P. (1993) Strain identification of
Mycobacterium tuberculosis by DNA fingerprinting: Recommendation for a standardized
methodology. J. Clin. Microbiol. 31, 406-409.
Van Rie, A., Warren, R., Mshanga, I., Jordaan, A.M., van der Spuy, G.D., Richardson, M.,
Simpson, J., Gie, R.P., Enarson, D.A., Beyers, N., Van Helden, P.D., Victor, T.C. (2001)
Analysis for a limited number of gene codons can predict drug resistance of
Mycobacterium tuberculosis in a high-incidence community. J. Clin. Microbiol. 39, 636
641.
van Soolingen, D. (2001) Molecular epidemiology of tuberculosis and other mycobacterial
infections: main methodologies and achievements. J. Intern. Med. 249, 1-26.
van Soolingen, D., Qian, L., de Haas, P.E.W., Douglas, J.T., Traore, H., Portaels, F., Quing,
Z., Enkhasaikan, D., Nymadawa, P., van Embden, J.D.A. (1995) Predominance of a
single genotype of Mycobacterium tuberculosis in countries of East Asia. J. Clin.
Microbiol. 33, 32343238.
Vasileva, D. (1992) Bulgarian Turkish emigration and return. Int. Migr. Rev. 26, 342-352.
Voronina, E.N., Vikhrova, M.A., Khrapov, E.A., Kinsht, V.N., Norkina, O.V., Gorbunova,
E.V., Shabaldin, A.V., Glushkov, A.N., Krasnov, V.A., Filipenko, M.L. (2004) KatG
Ser315Thr mutation as the main reason of isoniazid resistance in Mycobacterium
tuberculosis isolated in the Novosibirsk and Kemerovo Regions. Mol. Gen. Mikrobiol.
Virusol. (3), 8-11. In Russian.
World Health Organisation. Laboratory Services in Tuberculosis Control. Part III: Culture.
Geneva, Switzerland. 1998.
World Health Organisation. Anti-Tuberculosis Drug Resistance in the World Fourth Global
Report. Geneva, Switzerland. 2008a.
World Health Organisation. Global tuberculosis control: surveillance, planning, financing.
Geneva, Switzerland. 2008b.
Zozio, T., Allix, C., Gunal, S., Saribas, Z., Alp, A., Durmaz, R., Fauville-Dufaux, M.,
Rastogi, N., Sola, C. (2005) Genotyping of Mycobacterium tuberculosis clinical isolates
in two cities of Turkey: Description of a new family of genotypes that is
phylogeographically specific for Asia Minor. BMC Microbiol. 5, 44.


Chapter 4

Genetic Diversity in Switchgrass: A Potential Bioenergy Crop

B. Narasimhamoorthy, M. C. Saha, H. S. Bhandari and J. H. Bouton

Forage Improvement Division, The Samuel Roberts Noble Foundation, Inc.,
2510 Sam Noble Parkway, Ardmore, OK 73401, USA

Abstract

Switchgrass (Panicum virgatum L.) is a warm-season C4 perennial grass belonging
to the family Poaceae. It is native to North America. Persistence across a wide
geographical range, in addition to high biomass production with minimum inputs, makes
it an excellent choice for a sustainable bioenergy crop. Switchgrass is a highly
heterozygous, self-incompatible and out-crossing species. Broad species adaptation,
natural selection and photoperiodism have combined to create considerable ecotypic
differentiation in switchgrass. The natural population is classified into two distinct
cytotypes: upland and lowland. Upland cytotypes are mostly octaploid (2n = 8x = 72) and
lowlands are tetraploid (2n = 4x = 36); however, multiple ploidy levels ranging from
diploid (2n = 2x = 18) to dodecaploid (2n = 12x = 108) have been reported in
switchgrass. In the USA, uplands are adapted to the mid and northern latitudes, while
lowlands are adapted to the southern parts of the country. In addition, these ecotypes
differ with respect to photosynthesis, drought tolerance and N-use efficiency. Knowledge
of the amount of genetic diversity and polymorphism in switchgrass is necessary to enhance
the effectiveness of breeding programs and germplasm conservation efforts. In the past two
decades, several studies have been conducted to evaluate the genetic variability in
switchgrass populations. Molecular markers, such as RFLPs, RAPDs and SSRs, were
used to assess within- and among-population variation in a wide range of switchgrass
cytotypes. Hybrid cultivars can be an attractive option for improving biomass production.
Molecular marker and phenotypic data suggest that lowland and upland genotypes
represent different heterotic groups that can potentially be used to produce F1 hybrid
cultivars. This review summarizes the current understanding of the genetic diversity
available in P. virgatum populations, with a focus on studies performed at the Noble
Foundation, where the genetic variability and the relationships within and among
switchgrass populations were determined with simple sequence repeat markers and
ploidy analysis.

Author for correspondence: Malay C. Saha, Forage Improvement Division, The Samuel Roberts Noble
Foundation, Inc., 2510 Sam Noble Parkway, Ardmore, OK 73401, USA. E-mail: mcsaha@noble.org. Phone:
+1-580-224-6840. Fax: +1-580-224-6802.
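
The ploidy figures quoted in the abstract all follow from a single base chromosome number, x = 9 (implied by 2n = 4x = 36). The short Python sketch below only restates that arithmetic; the function and constant names are illustrative assumptions and are not taken from the studies reviewed here.

```python
# Minimal sketch of the ploidy arithmetic quoted in the abstract.
# Names are illustrative; only the numbers come from the text.

BASE_CHROMOSOME_NUMBER = 9  # x = 9, implied by 2n = 4x = 36

# Ploidy levels named in the abstract.
PLOIDY_NAMES = {2: "diploid", 4: "tetraploid", 8: "octaploid", 12: "dodecaploid"}


def somatic_chromosome_number(ploidy_level, base=BASE_CHROMOSOME_NUMBER):
    """Return 2n for a given ploidy level (e.g. 8 for an octaploid)."""
    return ploidy_level * base


def ploidy_from_count(chromosome_count, base=BASE_CHROMOSOME_NUMBER):
    """Return the ploidy level for a somatic chromosome count that is a multiple of x."""
    if chromosome_count % base != 0:
        raise ValueError("count is not a whole multiple of the base number")
    return chromosome_count // base


if __name__ == "__main__":
    for level, name in sorted(PLOIDY_NAMES.items()):
        print(f"{name}: 2n = {level}x = {somatic_chromosome_number(level)}")
```

Counts that are not whole multiples of nine are rejected rather than rounded, which is a simplification chosen for this sketch.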

Introduction

Switchgrass or tall panic grass (Panicum virgatum L.) belongs to the Paniceae tribe in the
subfamily Panicoideae of the Poaceae (Gramineae) family. It is one of the predominant
species of North American tallgrass prairies, found along with big bluestem (Andropogon
gerardii), indiangrass (Sorghastrum nutans [L.] Nash) and eastern gamagrass (Tripsacum
dactyloides). Historically, switchgrass was a natural component of the tallgrass prairie, which
covered most of the Great Plains, but it was also found on the prairie soils in the Black Belt
of Alabama and Mississippi. It has multiple uses: it is grazed by certain animals, farmed as
forage for livestock and used as ground cover to control soil erosion by both wind and
water. It has long been used for restoring native ranges and for conservation plantings
through the Conservation Reserve Program (CRP). Switchgrass is well known among wildlife
conservationists as a favorite forage and habitat, due to the abundance of wildlife attracted by
switchgrass stands.
Switchgrass is an erect, rapidly growing perennial grass that can grow up to 4 m in
height, while the base of the plant at ground level can reach about 50 cm. It produces scaly
underground stems called rhizomes that are capable of producing new tillers each year. It is
a warm-season perennial grass with a C4 photosynthesis pathway [Moss et al., 1969; Koshi et
al., 1982], producing most of its biomass during summer. It is capable of producing very
high amounts of biomass with minimum fertilizer and moisture, and can be established from
seed. Switchgrass is capable of growing on marginal, erodible and drought-prone lands, and is
tolerant of moderate soil salinity and acidity, growing in soil pH ranging from 4.5 to 7.6. The
combination of heat, cold, and drought tolerance within the species results in an adequate
level of adaptation for nearly all of the U.S. east of the Rocky Mountains and much of eastern
Canada. It has the potential for sequestering large amounts of atmospheric carbon in
permanent grasslands [Liebig et al., 2008].

Biofuel Potential

The Bioenergy Feedstock Development Program (BFDP) of the Department of Energy (DOE)
was initiated to identify the most promising species for use as bioenergy feedstock to produce
biofuel [McLaughlin and Kszos, 2005]. Switchgrass was chosen by this intensive program as the
main herbaceous bioenergy crop for future research. The DOE-initiated research on
switchgrass estimated a 25% reduction in projected production cost, a 50% biomass yield
increase through development of improved cultivars, a reduction in the amount (up to 40%) of
nitrogen fertilizer together with changes in its timing, and high carbon sequestration
potential, as summarized in McLaughlin and Kszos [2005]. The importance of herbaceous biofuel
crops like switchgrass was reaffirmed as part of the goal of replacing U.S. petroleum supplies
with biofuels [Perlack et al., 2005]. The DOE and the United States Department of Agriculture
(USDA) initiated substantial research programs on biomass genomics and also established
bioenergy research centers (BESC) to reduce biomass recalcitrance and improve production.
Switchgrass is an important component of these research initiatives. These efforts have
positioned switchgrass as a major crop in terms of research focus, which can increase the
economic value of switchgrass as a dual-purpose crop for both forage and the bioenergy industry.
Switchgrass biomass can be converted into energy by fermentation, gasification or
combustion [McLaughlin et al., 1999]. It has a high on-farm potential, with estimated
ethanol yields between 2,500 and 3,600 L ha⁻¹, and is capable of producing 540% more
renewable than non-renewable energy consumed [Schemer, 2008]. It can also be directly
combusted or co-fired with coal to produce electricity or heat, which can lower the emissions
associated with the burning of coal. Thus, a perennial range species traditionally used
for grazing and forage, the original energy feedstocks for draft animal power, is now
considered a second-generation bioenergy crop [Lange, 2007]. Biomass could bring back a
21st century version of the prairie. And along with the prairie, it could bring a new crop to
America's farms, a boost to U.S. energy independence and brighter prospects for a clean,
sustainable future [http://bioenergy.ornl.gov/papers/misc/switgrs.html].
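
The two figures quoted above can be turned into simple working numbers. The Python sketch below assumes that "540% more renewable than non-renewable energy consumed" is read literally, that is, renewable output equals 6.4 times the non-renewable input, and it scales the per-hectare ethanol range to an arbitrary planted area. The function names and the 100 ha example are illustrative assumptions, not values from the cited studies.

```python
from typing import Tuple

# Figures quoted in the text (estimated on-farm ethanol yield range).
ETHANOL_YIELD_LOW_L_PER_HA = 2500.0
ETHANOL_YIELD_HIGH_L_PER_HA = 3600.0


def energy_output_to_input_ratio(percent_more_renewable: float) -> float:
    """Convert 'X% more renewable than non-renewable energy' into an output/input ratio.

    Assumes a literal reading: output = input * (1 + X / 100).
    """
    return 1.0 + percent_more_renewable / 100.0


def ethanol_volume_range(area_ha: float) -> Tuple[float, float]:
    """Return the (low, high) ethanol volume in litres for a planted area in hectares."""
    return (area_ha * ETHANOL_YIELD_LOW_L_PER_HA, area_ha * ETHANOL_YIELD_HIGH_L_PER_HA)


if __name__ == "__main__":
    ratio = energy_output_to_input_ratio(540.0)
    low, high = ethanol_volume_range(100.0)  # hypothetical 100 ha planting
    print(f"Renewable output per unit of non-renewable input: {ratio:.1f}")
    print(f"Estimated ethanol from 100 ha: {low:,.0f} to {high:,.0f} L")
```

Under this reading, each unit of non-renewable energy invested returns roughly 6.4 units of renewable energy.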

Breeding Goals

Until recently, switchgrass breeding efforts were directed towards enhancing its
nutritional value as a forage crop [Vogel et al., 1989]. The primary use of switchgrass since
the early 1940s has been for conservation purposes and for warm-season pastures in the Great
Plains and Midwestern states, with less use in other regions [Vogel, 2004]. Increasing forage
yield and improving seedling establishment and forage qualities, particularly digestibility,
were the primary focus of forage crop improvement. The numerous reports on switchgrass as a
forage crop are helpful, but not fully applicable to biomass research, which has been limited.
High biomass yield and increased digestibility will remain a priority for the biofuel industry.
Increased fiber content of biomass with large amounts of fermentable or combustible sugars
would be beneficial, while lignin is difficult to break down during fermentation and ash is
detrimental to combustion, due to slagging and fouling of biomass boilers, increasing
maintenance costs and creating a disposal problem for fly ash [Miles and Miles, 1994].
Therefore, the most desirable switchgrass cultivar as a general-purpose bioenergy feedstock
requires high biomass production, combined with a high fiber concentration, low lignin and
low ash content. Similarly, the biomass yield response of a biofuel crop to fertilizer
management and harvest management will differ from that of a forage crop.
To improve its economic value as a sustainable bioenergy crop, switchgrass cultivars
with improved biomass yield potential need to be developed. In addition, there is a need to
genetically improve switchgrass for its quality attributes to suit the rising demand for energy
feedstocks. In stark contrast to development of food crops like corn (Zea mays. L), rice
(Oryza sativa. L) and wheat (Triticum aestivum. L), there are very few switchgrass cultivars
available, e.g., Alamo, Blackwell, Cave-in-rock, Dacotah, Forestburg, Kanlow, Nebraska 28,
Shawnee, Summer, Sunburst, Pathfinder and Trailblazer [Berdhal and Redfearn, 2007]. To
capture the heterosis, classification and organization of germplasms into genetically divergent
pools or heterotic groups is required. The extent of heterosis and the different heterotic
groups in switchgrass have not yet been identified [Martinez-Reyna and Vogel, 2008]. The
hybrids between upland Summer and lowland Kanlow seem to represent different
heterotic groups [Vogel, personal communication]. The most obvious potential heterotic
groups of a species are either geographically separated populations or separate subspecies
[Brummer, 1999]. Knowledge of the amount of genetic diversity, polymorphism and the
genetic relationships among populations would be helpful in switchgrass cultivar
improvement. Selecting desirable parental genotypes for crosses based on these genetic
relationships would be valuable in breeding programs and germplasm conservation efforts.

Genetic Diversity

Ecotypes

Switchgrass is a highly heterozygous, self-incompatible and out-crossing species [Talbert
et al., 1983]. Cross-pollination is enforced by a gametophytic self-incompatibility that is
similar to the S-Z incompatibility system found in other Poaceae members [Martinez-Reyna
and Vogel, 2002]. Less than 1% self-compatibility, as measured by seed set from bagged
panicles, has been reported [Martinez-Reyna and Vogel, 2002]. A protocol for clonal
propagation using nodes or immature inflorescences has been developed [Alexandrova et al.,
1996]. Switchgrass can thus be propagated by seeds or by clonal propagules.
The natural population of switchgrass is classified into two distinct ecotypes: uplands
and lowlands. Two distinct cytoplasm types, L and U, are also associated with lowland and
upland ecotypes, respectively [Hultquist et al., 1996]. Upland ecotypes are adapted to the mid
and northern latitudes of the U.S. that are not subject to flooding, while lowland ecotypes are
adapted to the southern U.S. on flood plains, heavy soils and other areas subjected to
inundation [Porter, 1966; Brunken and Estes, 1975; Casler et al., 2004]. Lowland ecotypes
are taller, more robust and coarse, and generally more rust (Puccinia spp.) resistant, with a
more bunch-type growth habit than their upland counterparts. The two ecotypes also differ with
respect to photosynthesis [Warner et al., 1987], drought tolerance [Nickell, 1972] and N-use
efficiency [Porter, 1966]. Switchgrass is a highly adaptable photoperiod-sensitive species, and the
latitude of origin of germplasm is a primary determinant of its area of adaptation [Vogel,
2004; Casler et al., 2007a, b]. Day length and tolerance to cold and heat mostly control the
adaptation zone of the switchgrass populations. For this reason, the two ecotypes were further
subdivided into northern and southern uplands, and northern and southern lowlands based on
latitudinal adaptation. Most of the populations cannot be moved north or south beyond one
hardiness zone without adversely affecting flowering, vigor and survival rate [Casler et al.,
2004].
Distinct chloroplast DNA polymorphisms associated with lowland and upland ecotypes
were used to identify the cytotypes. A deletion of 49 nucleotides in chloroplast trnL (UAA)
sequences appears to be specific to lowland accessions and may be useful as a DNA marker
for the classification of upland and lowland germplasm [Missaoui et al., 2006]. Although
multiple ploidy levels have been reported in switchgrass, ranging from diploid (2n = 2x = 18)
to dodecaploid (2n = 12x = 108) [Church, 1940; Burton, 1942; Nielson, 1944], only two
major ploidy levels, the tetraploids (2n = 4x = 36) and the octoploids (2n = 8x = 72) [Hopkins
et al., 1996; Lu et al., 1998], are primarily seen. The L cytotypes, i.e., the lowlands, are
tetraploids while the U cytotypes, i.e., the uplands, can be either tetraploids or octoploids.
Hybridization between the two ecotypes is restricted to plants of a similar
ploidy level due to a post-fertilization endosperm incompatibility system that inhibits seed set
or produces abnormal seeds across ploidy levels [Martinez-Reyna et al., 2001].

Germplasm Collection and Repository

Most of the existing switchgrass accessions in the USDA-Germplasm Resources
Information Network (GRIN) were collected in the USA. A total of 163 active
switchgrass Plant Introductions (PIs) are available at the GRIN germplasm repository
[http://www.ars-grin.gov], representing collections from 20 states within the U.S. and three
other countries (Table 1). These collections include commercial cultivars, registered
germplasms, and native populations from different geographical areas. The majority of them
(115) are collections from North Dakota [obtained from Dr. Arvid Boe, Native Grasses
Curator]. Some of the upland varieties like Summer, Cave-in-rock, Shawnee, Trailblazer,
Pathfinder, Dacotah, Caddo, Blackwell, Sunburst, Shelter, Grenville and Falcon, and lowland
varieties Alamo and Kanlow are among these collections. Among these, Pathfinder [Newell,
1968], Forestburg [Barker et al., 1988], Dacotah [Barker et al., 1990], Trailblazer [Vogel et
al., 1991], Shawnee [Vogel et al., 1996] and Sunburst [Boe and Ross, 1998] are registered
cultivars, while PI607838 (TEM-SEC) [Tischler et al., 2001] is a registered germplasm. The
collections were made as early as 1953, and the most recent collection is a breeding material
PI636438 (TEM-dorm) obtained in 2004.
There are relatively few varieties of switchgrass [Berdahl and Redfearn, 2007] listed for
commercial sale under seed certification. Switchgrass breeding efforts have begun only
recently, suggesting that this crop is still undomesticated in terms of breeding and selection.
Most switchgrass cultivars are either seed increases of source-identified collections or
products of a limited number of breeding cycles tracing to many of these remnant-prairie sites
[Alderson and Sharp, 1994]. Three types of switchgrass cultivars are available. The first type
consists of cultivars that came from direct seed increases of collections from prairie
remnants, e.g., Alamo, Blackwell, Cave-in-rock, Shelter and Summer. Although these cultivars
have not undergone breeding, they have been selected for vigor and agronomic traits from
among other populations collected from prairie-remnant sites, following evaluation in a
common nursery. The second type consists of eco-populations such as Sunburst, developed by
polycrossing plants from several upland ecotypes. The genotypes involved in the crosses
represent some selection among ecotypes, but no true breeding. The third type consists of
cultivars such as Pathfinder, Shawnee, Trailblazer and Nebraska 28, which have been developed
with considerable emphasis placed on increasing biomass yield and survival [Vogel, 2004;
Taliaferro et al., 1999]. In collaboration with the University
of Georgia, the Noble Foundation has recently placed two new lowland cultivars, Blade
EG1101 and Blade EG1102, into the commercial seed trade for use as biofuel feedstocks
(http://www.bladeenergy.com/switchproducts.aspx). All these new cultivars still have
significant variation within and between them for yield and forage quality, and in-vitro dry
matter digestibility [Vogel, 2004; Lemus et al., 2002]. Development of synthetic cultivars
and F1 hybrids using population improvement procedures is underway, which will aid in
achieving high biomass yield.

Table 1. The origin and number of switchgrass collections available in GRIN

Origins          Number   Details
Arkansas         3        Observation, Geographical
Argentina        1        Geographical
Belgium          1        Geographical
Colorado         2        Geographical
Florida          2        Geographical
Illinois         1        Cultivar
Kansas           3        Cultivar, Observation
Kentucky         1        Geographical
Maryland         3        Observation, Geographical
Mississippi      1        Geographical
Missouri         1        Cultivar
North Carolina   5        Observation
North Dakota     115      Geographical
Nebraska         7        Variety, Observation, Geographical
New Jersey       1        Geographical
New Mexico       4        Cultivar, Geographical
New York         1        Geographical
Oklahoma         3        Cultivar
Oregon           1        Cultivar
South Dakota     2        Observation, Geographical
Turkey           1        Geographical
Texas            3        Registered germplasm, Observation
West Virginia    1        Geographical
Source: http://www.ars-grin.gov.
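
The totals quoted in the text can be checked against Table 1 with a short tally. The sketch below simply transcribes the counts from the table; the variable names are illustrative and are not part of GRIN.

```python
# Counts transcribed from Table 1 (origin -> number of GRIN accessions).
table1 = {
    "Arkansas": 3, "Argentina": 1, "Belgium": 1, "Colorado": 2, "Florida": 2,
    "Illinois": 1, "Kansas": 3, "Kentucky": 1, "Maryland": 3, "Mississippi": 1,
    "Missouri": 1, "North Carolina": 5, "North Dakota": 115, "Nebraska": 7,
    "New Jersey": 1, "New Mexico": 4, "New York": 1, "Oklahoma": 3, "Oregon": 1,
    "South Dakota": 2, "Turkey": 1, "Texas": 3, "West Virginia": 1,
}
# Origins outside the U.S., as named in the text.
non_us = {"Argentina", "Belgium", "Turkey"}

total = sum(table1.values())
n_states = sum(1 for origin in table1 if origin not in non_us)
nd_share = table1["North Dakota"] / total

print(f"total accessions: {total}")            # 163
print(f"U.S. states represented: {n_states}")  # 20
print(f"North Dakota share: {nd_share:.1%}")   # 70.6%
```

The tally agrees with the 163 accessions, 20 states and three other countries given above, and shows that roughly seven in ten accessions come from a single state.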

Characterizing the Switchgrass Collections

The collections held in the GRIN and the released varieties are generally more adapted to
the regions of their origin. The upland and lowland ecotypes represent genetically distinct
populations with limited gene flow between them. Their use in breeding programs needs
thorough evaluation to understand their genetic potential. Due to the cross-pollinated nature
of the crop, an assessment of genetic diversity within and among ecotypes, cultivars, and
populations is essential.

Genetic Variability

Self-incompatibility and polyploidy create high amounts of phenotypic and genetic
variability in switchgrass due to a higher frequency of heterozygotes than in diploid
counterparts. Using sorghum cpDNA, the restriction patterns of 15 cultivars and three
experimental strains with differing ploidy levels or ecotype classification were examined
[Hultquist et al., 1996]. This was the first report to classify the upland and lowland ecotypes of
switchgrass using molecular tools. The lowland cultivars contained a change in a restriction
enzyme site that was not evident in the upland cytotypes. However, this study did not
differentiate genotypes or cultivars within the ecotypes or geographical locations. Gunter et
al. [1996] carried out a broad assessment of the genetic relationship among 137 genotypes
representing 14 populations of upland and lowland switchgrass using 91 polymorphic RAPD
markers and differentiated the uplands from the lowlands. In addition, the individual
genotypes within each population were clustered into discrete groups within the ecotypes
except for Blackwell and Caddo which were collected from the same geographic location.
Evidence from both studies using cpDNA and RAPD markers suggested that Cave-in-rock is
an upland cytotype despite its classification as lowland cytotype by USDA-SCS (1991).
The genetic variation in the nuclear gene encoding plastid acetyl-CoA carboxylase
(ACCase) from six switchgrass cultivars was investigated using DNA sequence comparisons
to establish the relationship among these populations [Huang et al., 2003]. Tetraploid and
octoploid switchgrass ecotypes formed two separate groups. The close relationship between
Blackwell and Caddo cultivars was reflected by the identity of some of the ACCase gene
sequence. This study estimated that the most recent polyploidization events that established
modern switchgrass lineages occurred less than 2 million years ago. However, several
sequence variants of the Acc gene were seen within the cultivars due to the out-crossing
nature of the crop, suggesting the difficulty of using this approach for phylogenetic analysis.
Genetic variation among 21 switchgrass genotypes randomly selected from two lowland
(Alamo and Kanlow) and one upland (Summer) synthetic cultivars was estimated using 85
restriction fragment length polymorphism (RFLP) markers [Missaoui et al., 2006]. A great
extent of divergence between the uplands and lowlands was observed. In addition, the upland
Summer showed a higher degree of genetic variation than the lowland ecotypes. The fraction
of polymorphic loci within Summer, Kanlow and Alamo was 64%, 52% and 60%,
respectively. Phylogenetic analysis of chloroplast non-coding region of the trnL (UAA) DNA
sequences showed a deletion of 49 nucleotides that was specific to lowland cytotypes
[Missaoui et al., 2006]. In a related study, 46 prairie-remnant populations and 11 cultivars
were analyzed with RAPD markers to identify variation in switchgrass populations from the
northern and central U.S. [Casler et al., 2007a]. The structural patterns and spatial variation
as indicated by markers suggested that there are different switchgrass gene pools in different
regions of the northern and eastern U.S. Any prairie-remnant population within a plant
adaptation region (PAR) can be used to represent the primary gene pool of that region. This
study also reiterated that because most of the genetic variability occurs within populations, a
relatively small number of collection sites are sufficient to maintain genetic variability of the
gene pool.
These genetic diversity studies provide valuable information for switchgrass breeders and
researchers, but are limited to specific regions and a small number of populations.
Information regarding the amount of genetic diversity present among a larger, more diverse
switchgrass collection would be valuable and enhance their effective use in germplasm
improvement and cultivar development efforts. Attempts were undertaken to examine the
level of genetic diversity and patterns of relatedness within and among switchgrass
accessions obtained from USDA GRIN collections [Narasimhamoorthy et al., 2008]. A total
of 31 unique accessions from 168 available in GRIN were chosen covering all the geographic
locations. The selected accessions represented 18 U.S. states and three other countries:
Turkey, Belgium and Argentina. Expressed sequence tag-simple sequence repeat (EST-SSR)
markers were used to identify variability within and among the collections. Greater variation
was found within (79.6%) than among (20.4%) populations of the switchgrass GRIN collections,
which is common in out-crossing species such as perennial ryegrass (Lolium perenne L.),
meadow fescue (Festuca pratensis), orchardgrass (Dactylis glomerata L.), rhodesgrass
(Chloris gayana Kunth) and hardinggrass (Phalaris aquatica L.)
[Huff, 1997; Kolliker et al., 1998; Ubi et al., 2003; Mian et al., 2005]. Results from this study
support the previous report [Taliaferro et al., 1999] that recurrent selection within any
defined population could serve as a good breeding strategy for switchgrass.
Approximately 45% similarity exists among the GRIN collections [Narasimhamoorthy et
al., 2008]. The 186 genotypes of 31 accessions were grouped into three major clusters and
several subclusters (Table 2). The upland and lowland ecotypes were broadly classified into
northern and southern uplands and lowlands, respectively. In general, the genotypes were
grouped into different adaptive zones based on the geographical locations of collections. All
the uplands fell into one cluster while the lowlands were found separating into distinct
subclusters suggesting that there could be more variation among lowlands compared to the
uplands. The three collections from Argentina, Turkey and Belgium were found to cluster
with the uplands.
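
As an illustration of how a similarity coefficient is obtained from marker data, the sketch below computes pairwise Jaccard similarity from band presence/absence scores. The chapter does not state which coefficient was used, so the choice of Jaccard is an assumption, and the three genotypes and their band profiles are invented for the example.

```python
def jaccard(a, b):
    """Jaccard similarity between two equal-length 0/1 band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same markers")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0

# Hypothetical band scores (1 = band present); not data from the study.
profiles = {
    "upland_A":  [1, 1, 0, 1, 0, 1, 0, 0],
    "upland_B":  [1, 1, 0, 1, 1, 1, 0, 0],
    "lowland_C": [0, 1, 1, 0, 0, 1, 1, 1],
}

names = list(profiles)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        s = jaccard(profiles[first], profiles[second])
        print(f"{first} vs {second}: {s:.2f}")
```

In this toy data the two upland profiles share most of their bands and would join first in a clustering, while the lowland profile stays apart, which is the kind of pattern summarized in Table 2.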

Ploidy Variations

Ploidy level and DNA contents of several switchgrass populations have been estimated
using microscopy and flow cytometry techniques. Cytogenetic analysis combined with flow
cytometry has demonstrated that plants with average nuclear DNA contents of 2.7 to 3.3 pg are
tetraploid and those with 4.7 to 6.0 pg are octoploid [Hopkins et al., 1996; Lu et
al., 1998]. The flow cytometry results indicated that many remnant prairie sites in the
Midwestern states contain both tetraploid and octoploid plants that might be separate
breeding populations [Hultquist et al., 1997].

Table 2. Genotype and accession clusters from GRIN generated by the similarity coefficient method using NTSYS 2.0
[reproduced from Bioenergy Research (2008) 2:136-146]

I II III
IA IB IIIA IIIB
IB.1 IB.2
AR1(3/6)

AR1(2/6) KS3(1/6) AR1(1/6) NJ(5/6) AR2 MD1 NC2(1/6)
Sunburst-SD2(1/6) Argentina Shawnee-NE1(1/6) Kanlow-KS2 MD2
Belgium Greenville-NM1(1/6) MI NC1
CO Falcon-NM2(1/6) Alamo-TX(3/6) NC2(5/6)
Cave-in-rock-IL Blackwell-OK1(2/6) NJ(1/6)
Blackwell-KS1
KS3(5/6)
KY
MO
ND1
ND2
Shawnee-NE1(5/6)
Trailblazer-NE2
Greenville-NM1(5/6)
Falcon-NM2(5/6)
NY
Blackwell-OK1(4/6)
Caddo-OK2
Dacotah-OR
Summer-SD1
Sunburst-SD2(5/6)
Turkey
Alamo-TX(3/6)
Numbers in parentheses after an accession are the number of plants, out of the six assayed, that fell in that cluster.

Figure 1. DNA estimation using flow cytometry analysis of tetraploid (cultivar Alamo) and octoploid
(cultivar Trailblazer) switchgrass along with the standard diploid barley (cultivar Stark).

Nuclear DNA contents of the 31 GRIN switchgrass populations were estimated using
flow cytometry, with Stark, a spring diploid barley (Hordeum vulgare L.), as the
standard (Figure 1). The mean DNA content per plant was estimated using the formula:
Nuclear DNA content = (mean position of unknown peak)/(mean position of known peak) x
DNA content of barley standard. Twenty accessions were classified as tetraploids, with
average DNA content ranging from 2.4 to 3.4 pg, which included some of the varieties with
known ploidy such as Summer and Alamo. Four accessions were clearly classified
as octoploids, namely Trailblazer (Figure 1), Caddo, Blackwell and the Argentina collection, with
average DNA content ranging from 4.5 to 5.8 pg [Narasimhamoorthy et al., 2008]. Seven out
of 31 accessions were identified with mixed ploidy levels, including well characterized
varieties such as Cave-in-rock, Blackwell, Shawnee and Alamo.
Seedlings from the commercial seed lots of Alamo were all tetraploids (mean
DNA content of 2.97 pg), while those of Blackwell and Shawnee were octoploids (mean
DNA content of 4.76 pg). However, in Cave-in-rock, one seedling each was
designated as tetraploid and hexaploid with a mean DNA content of 2.96 and 3.68 pg,
respectively.
The remaining accessions were octoploids with a mean DNA content of 4.75 pg. Out-
crossing among Cave-in-rock and other accessions of different ploidy levels may have led to
ploidy level variation within the population. The occurrence of a tetraploid plant in Cave-in-
rock can best be explained if it existed from its inception in the population. In addition, the
clustering pattern seen among the accessions and genotypes within accession may not
entirely be attributed to the differences in ploidy level [Narasimhamoorthy et al., 2008].
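
The peak-ratio formula and the DNA-content ranges given above can be sketched in a few lines. The function names are illustrative, the ranges are those quoted for the GRIN accessions (2.4 to 3.4 pg and 4.5 to 5.8 pg), and the DNA content of the barley standard is not stated in this chapter, so the 10.0 pg used in the example call is only a placeholder, as are the peak positions.

```python
def nuclear_dna_content(sample_peak, standard_peak, standard_pg):
    """DNA content (pg) = sample peak / standard peak x DNA content of standard."""
    if standard_peak <= 0:
        raise ValueError("standard peak position must be positive")
    return sample_peak / standard_peak * standard_pg

def classify_ploidy(dna_pg, tetra=(2.4, 3.4), octo=(4.5, 5.8)):
    """Label a plant using the DNA-content ranges quoted for the GRIN accessions."""
    if tetra[0] <= dna_pg <= tetra[1]:
        return "tetraploid"
    if octo[0] <= dna_pg <= octo[1]:
        return "octoploid"
    # Values between the two ranges (e.g. the 3.68 pg Cave-in-rock seedling
    # described as hexaploid in the text) are left for cytological follow-up.
    return "unclassified"

# Placeholder peak positions and standard value, for illustration only.
example_pg = nuclear_dna_content(sample_peak=50, standard_peak=200, standard_pg=10.0)
print(example_pg, classify_ploidy(example_pg))

# Mean DNA contents reported in the text for commercial seed lots.
for name, pg in [("Alamo", 2.97), ("Blackwell and Shawnee", 4.76),
                 ("Cave-in-rock seedling", 3.68)]:
    print(name, pg, classify_ploidy(pg))
```

Keeping the ranges as parameters makes it easy to substitute the slightly different limits reported by Hopkins et al. [1996] and Lu et al. [1998] (2.7 to 3.3 pg and 4.7 to 6.0 pg).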

Biomass Potential

Switchgrass, being a perennial, has the advantage of building soil organic matter by
storing more soil carbon throughout the year, in addition to reducing the cost of planting
every year. The recent focus on switchgrass is to maintain high yields while keeping costs low,
giving the best economic returns for the producer. One way to reduce costs and increase
farm productivity is to consider multiple uses such as grazing and biomass production.
Switchgrass is nutritious if grazed prior to the boot stage of development and is used as a
livestock forage by some farmers from the South to the Midwest. In order to obtain both
optimum livestock forage and biomass tonnage, it has been recommended to graze the
switchgrass to no less than 6 inches in the spring and/or early summer, and allow the grass to
regrow for a late fall or winter biomass harvest. However, switchgrass stands tend to decline
over the years with more frequent defoliation events (grazing, haying, or biomass harvest),
suggesting that timing of harvest, number of harvests and regrowth potential are crucial. For
many forage species, including switchgrass, the two-cut system would maximize the
economic yield under appropriate nitrogen (N) and harvest management systems. However,
for switchgrass in a biomass system, the one-cut system provides highest sustainable biomass
yields in the long term [Parrish and Fike, 2005]. Leaving the biomass standing until
senescence gives the mineral nutrients more opportunity to leach into the soil [Bakker and
Jenkins, 2003], which would improve not only the feedstock quality but also the soil nutrient
status. The two cytotypes also differ in response to frequency of cuttings. Upland cytotypes
produced on average 38 percent more biomass yield in two-cut systems compared to single
cut, but lowland cytotypes were not responsive [Fike et al., 2005]. Multiple harvesting of
Alamo switchgrass in Texas also indicated that two-cut systems may not be
suitable for lowland cytotypes [Sanderson et al., 1999].
Switchgrass takes three years for the stands to reach full potential. The higher yield
potential of lowlands at lower latitudes is generally observed. The yield gradient from higher
to lower latitudes is partially related to the length of the growing season in addition to
adequate moisture and early establishment. McLaughlin and Kszos [2005] reported the
biomass yields of nine switchgrass cultivars evaluated for 10 years across 13 states in 18
sites. Over all sites and regions, the best one-year dry yield was 34.6 Mg ha⁻¹ for Alamo cut
twice in Alabama. The mean of the best yields per year regardless of variety across years was
24.4 ± 4.2 Mg ha⁻¹; however, variability as high as 50% was seen between best and poorest
sites within a region in any given year. Across regions, average single-cut Alamo yields,
derived from averaging intervals ranging from five to 10 years, ranged from 12 to 19 Mg ha⁻¹
yr⁻¹. Under the same conditions, Kanlow, a northern lowland cultivar, had average yields
ranging from 11.6 to 15.5 Mg ha⁻¹ yr⁻¹. Among the uplands, Cave-in-rock, a southern upland,
produced the highest yield that was either equivalent or exceeded the yields of the lowland
varieties in some years at southeastern sites, but only for a two-harvest system. The average
Cave-in-rock biomass yields in the two-cut system ranged from 13.5 to 18.6 Mg ha⁻¹ yr⁻¹.
Six cultivars in sward plantings were evaluated in South Dakota and Wisconsin [Boe and
Casler, 2005]. The magnitude of difference between cultivar means for morphological traits
was often a reflection of distance between regions of cultivar origin. Shawnee, a selection
from Cave-in-rock, produced the highest average biomass yields (14.2 Mg ha⁻¹). Trailblazer
(13.9 Mg ha⁻¹), Sunburst (12.1 Mg ha⁻¹) and Cave-in-rock (11.3 Mg ha⁻¹) were high yielders
while Forestburg (8.9 Mg ha⁻¹) and Dacotah (6.2 Mg ha⁻¹) were low yielders. Thirty-eight
switchgrass collections from 33 prairie-remnant sites and 11 switchgrass cultivars were
evaluated for biomass yield, survival, dry matter, lodging, maturity, plant height,
holocellulose, lignin and ash for two years at two locations [Casler, 2005]. The maximum
yield seen among 38 switchgrass collections was 20 Mg ha⁻¹ while the maximum yield seen
among 11 switchgrass cultivars was 21.3 Mg ha⁻¹. Populations from several of the
westernmost collection sites clustered with cultivars from the Great Plains, suggesting an
ecological basis for some of the phenotypic variation observed. Lemus et al. [2002] observed
higher biomass yield for lowland ecotypes compared to their upland counterparts in southern
Iowa. But the yield advantage was not as great as it was in some southern locations.
Population hybrids made between populations and specific hybrids made from specific
genotypes indicated that the lowland tetraploids and the upland tetraploids represent two
distinct heterotic groups [Martinez-Reyna and Vogel, 2008]. The lack of heterosis reported among
progeny of both population and specific hybrids of the upland octoploid cultivars and
experimental strains suggests less variability within this germplasm pool or that they belong to the
same heterotic group. Similar results were observed where all the uplands were clustered
together and showed less variation compared to the lowlands [Casler et al., 2007a;
Narasimhamoorthy et al., 2008]. In general, the lowlands have higher biomass potential than
the uplands. The expression of superior biomass yield potential of southern-origin cultivars in
the Northern latitudes is dependent on winter survival [Berdahl et al., 2005] and adequate soil
moisture for phytomer development and biomass accumulation during late summer [Lee and
Boe, 2005].

Morphological Variations

Substantial morphological variation among genotypes within a population, among
populations and among ecotypes was recorded. Variation in phenology among cultivars and
ecotypes is strongly related to its biomass potential.
Variation for inflorescence type, heading and spring regrowth: The flowers of
switchgrass are borne in a well-developed panicle that grows up to 60 cm in length and bears ripe
seeds from single-flowered spikelets. It has a diffuse panicle type and seed head with its
spikelets positioned at the end of long branches. The first floret of the spikelet is fertile and
produces the seed while the second floret is staminate [Bouton, 2007]. The panicle structure
ranged from highly compact heads to completely open heads, with many indeterminate types
among accessions collected from GRIN (Figure 2). Switchgrass is a short-day plant; its
reproductive development is more tied to reduced photoperiod. Flowering is delayed for
southern populations when moved to northern latitudes, increasing the vegetative biomass
yield, while moving northern populations to southern latitudes hastens flowering, thereby
reducing biomass yield [Sanderson et al., 1996]. In southern Oklahoma, the lowlands
collected from GRIN are generally late heading and late maturing types compared to the
uplands. In this location, an upland cultivar (Dacotah) headed within 138 days, but it was
delayed until 172 days in Trailblazer, a southern upland cultivar (personal observation).
Heading among lowland accessions varied from 181 (BN-14669-92-Mississippi) to 189 days
(Alamo). Spring regrowth after a killing freeze was earlier in lowlands than in
uplands. The emergence of new tillers after winter in lowlands was seen between 75 (Alamo)
and 88 (Kanlow) Julian days, while in uplands the emergence of new tillers was noted
between 88 (Caddo) and 95 (central Iowa Germplasm) Julian days in this location.

Figure 2. Wide variability for inflorescence types, from compact (left) to open (right) heads, is
observed in switchgrass genotypes.

Variation for density and tiller weight: Switchgrass is a clonal modular organism [Boe
and Casler, 2005] where the tillers are made up of successive segments called phytomers,
which are composed of a growing point, a stem, leaves, roots, nodes, internodes and latent
buds, all of which can arise from crown tissue buds or rhizomes. The process by which these
new aerial shoots emerge is called tillering. A tiller may flower if exposed to the necessary
growing conditions; otherwise it will remain vegetative. The number of tillers and phytomers
per tiller, and the rate of phytomer development, are important components of biomass and
seed production in switchgrass [Boe, 2007].
Cultivars from lower latitudes have larger tillers with more phytomers/tiller [Berdahl et
al., 2005; Boe and Casler, 2005] and higher biomass yield potential than their upland
counterparts due to their later maturity and more rapid stem elongation rate [Casler et al.,
2004]. The lowland cultivars such as Alamo and Kanlow produced thicker tillers compared to
uplands in our field trials in southern Oklahoma. Some genotypes were short with more
foliage, making them good candidates for a forage crop, while others were tall with thick
stems, making them good candidates for a biofuel crop. In a different study, two uplands,
Summer and Sunburst, that have a similar phenology were compared for tiller density and
size [Boe, 2007]. Summer (which yielded 12.6 Mg ha⁻¹) produced 20% more vegetative biomass
than Sunburst and had a higher percentage of reproductive tillers (62% vs. 40%) and more
phytomers/tiller (7.9 vs. 6.4), while Sunburst had more tillers m⁻² (677 vs. 530). In a study
comparing six cultivars in stands of different ages at three locations to determine the traits
that affect biomass production, Boe and Casler [2005] reported the highest tiller density
(1090 tillers m⁻²) in Dacotah and the lowest (520 tillers m⁻²) in Cave-in-rock.
Cultivar differences for biomass production were attributed to variations at tiller and
phytomer levels. The highest biomass swards of each cultivar in this study [Boe and Casler,
2005] were composed of predominantly reproductive tillers, with the maximum number of
phytomers/tiller for that cultivar. On the other hand, the lowest biomass swards were not
necessarily associated with low tillers m⁻², but rather with a high frequency of vegetative
tillers with fewer phytomers and lower weight/phytomer than reproductive tillers.
Variation for plant growth habit, plant height and foliage color: The switchgrass
genotypes and ecotypes vary greatly for erectness or lodging, leaf/stem ratio (stemminess),
foliage color and forage mass [Boe, 2007]. Genotypes with spreading, semi-spreading,
intermediate, semi-erect and erect growth habits were noted among the GRIN collections.
The uplands tend to be more erect and upright while the lowlands tend to be spreading, as
observed in the field trials in southern Oklahoma. Among the GRIN collections, PI315725
(BN-14669-92-Mississippi) was found to have the most erect genotypes, while the
collection from Turkey (PI204907) had the most spreading genotypes. Lowlands were
generally taller than uplands, with a lower leaf/stem ratio. The uplands are more prone to lodging,
which is a negative trait for bioenergy feedstock production. When grown in southern
Oklahoma, the lowland ecotypes (Alamo) were much taller than the upland
ecotypes (Summer) (Figure 3). Plant height among accessions from GRIN varied extensively
in this location; the lowland accessions grew 2 to 3 m and the upland
accessions grew 1 to 2 m in a growing year. Switchgrass foliage color varies from
copper blue to pale green with intermediate colors. Many of the lowlands have the
characteristic copper blue while the uplands have the characteristic pale green color. Among
the GRIN collections, Summer had the palest green foliage while the BN-14668-65-Arkansas
collection had the most bluish foliage.
Variation for rust infection: Considerable variation in the response of switchgrass
genotypes and ecotypes to different strains of rust has been reported. When thirty-four
accessions of switchgrass from different geographic sources were screened with Puccinia
graminicola, germplasms originating from North Dakota and Nebraska were extremely
susceptible to rust, while those from Oklahoma and Texas were highly resistant
[Cornelius and Johnson, 1941].

Figure 3. Two major cytotypes of switchgrass: the lowlands (left) are taller, more robust and
bunchier than the uplands (right).

Accessions originating from Kansas were either resistant or moderately to highly
susceptible. In southern Oklahoma, a rust outbreak was seen in the switchgrass research
fields in two consecutive years, 2007 and 2008; the pathogen was later identified as Puccinia
emaculata. The impact was severe in the upland and low to moderate in the lowland
populations (personal observation). Earlier, Gustafson et al. [2003] reported a significant
amount of variation among switchgrass families originating from Summer and Sunburst
populations, suggesting that rust resistance is chiefly attributable to additive genes. The
severity of the rust observed suggests that disease in a low-input crop
like switchgrass could turn into an important issue that has to be addressed when large areas
are brought under cultivation.

Nutrient Uptake and Carbon Sequestration

The nitrogen use efficiency of any biomass plant is affected by harvest management,
timing and frequency of harvest, plant biomass removed and soil mineralization rates. In
single-cut harvests of switchgrass, the nitrogen (N) removed was approximately one-third to
one-half of the N removed by two-harvest systems [Reynolds et al., 2000]. In the field trials
as summarized by McLaughlin and Kszos [2005], delaying harvest past late September can
result in yield losses of up to 20%; however, after initial losses associated with translocation
of nutrients, further yield declines over winter appeared minimal, and yields the following
year often benefit from the conserved nutrients. This report recommended that a 50%
reduction in the N requirement for switchgrass would still give comparable yields. However,
high N could cause lodging of switchgrass when grown as biomass crop. Lodging has been
reported at high N rates in Texas after drought [Ocumpaugh et al., 1997]. On heavy soils with
high N content, switchgrass will often not show a response to nitrogen for several years after
establishment [Christian and Elbersen, 1998]. Therefore, N application rates would differ with
soil type, length of the growing season and the number of harvests per year.
Most research on switchgrass fertility has focused on its use as a forage crop. Grazing
livestock require protein, and higher N applications can ensure not only high yields, but
better quality feed. Nitrogen and carbon naturally cycle from shoots to below-ground parts
(roots) at the end of the growing season as a nutrient-conserving strategy. A review of the
literature suggests that switchgrass can be grown on soils of moderate fertility without
fertilizing, or with limited additions of fertilizer, and still maintain productivity [Parrish and
Fike, 2005]. Lowland genotypes had lower N concentration (5.7 vs. 6.3 g kg⁻¹), but higher soil
N removal rates (83 vs. 41 kg ha⁻¹ yr⁻¹) than the upland genotypes [Cassida et al., 2005].
Switchgrass has been suggested for use in sequestering excess P and reducing its loss to
streams, while also making use of manure as a substitute for inorganic fertilizers
[Sanderson et al., 2001; Missaoui, 2003]. The concentration of total reactive P in surface
runoff was reduced by an average of 47% to 76% after passing through a production-filter
strip system of Alamo switchgrass treated with dairy manure [Missaoui, 2003]. Most
studies on phosphate fertilization report that switchgrass does not respond to P
fertilization even if soil values are low [Jung et al., 1988; Jung et al., 1990; Ocumpaugh et al.,
1997]. However, Cassida et al. [2005] reported that, in general, the lowland genotypes had
higher P removal rates than upland genotypes (12 vs. 6 kg ha⁻¹ yr⁻¹). During the growing
season, calcium and magnesium concentrations did not change much, while K, P and total
ash were found to decline significantly [Parrish et al., 1997; Sanderson and Wolf, 1995].
The use of perennial energy crops and renewable materials contributes to reductions of
CO₂ emissions. Switchgrass is considered essentially carbon-neutral because it absorbs as
much carbon dioxide during growth as is emitted when the ethanol produced from it is burnt.
The high level of resource allocation to underground root production, also called carbon
storage sinks, while slowing aboveground growth during establishment, is one of the
desirable attributes of switchgrass. It has much higher belowground biomass (7.2 Mg ha⁻¹)
than corn (1.6 Mg ha⁻¹) [Zan et al., 1997].
McLaughlin and Kszos [2005] reported that in the DOE-sponsored field trials the projected
annual carbon accumulation rates can reach up to 1.4 Mg C ha⁻¹ yr⁻¹ over 10 years on
degraded soils in warmer climates, and an average accumulation rate of 0.78 Mg C ha⁻¹ yr⁻¹
was seen across diverse regions in the eastern U.S. [Liebig et al., 2008]. Switchgrass
plantings have great potential to store significant amounts of soil carbon; soil organic
carbon to 0.9-m depth increased at a rate of 1.01 kg C m⁻² yr⁻¹ [Frank et al., 2004].

Biomass Recalcitrance

Lignocellulosic bioenergy crops that hold promise as sustainable sources of biomass for
ethanol production are composed predominantly of the polysaccharide-rich primary and
secondary walls which are made up of lignin, cellulose and hemicellulose. Biomass is first
processed (saccharification) to convert hemicellulose and cellulose to their constituent
pentose and hexose sugars by sequential acid and enzymatic hydrolysis; the released sugars
are then fermented to ethanol. Recalcitrance to saccharification is recognized as the major
limitation to efficient conversion of lignocelluloses to ethanol. Without overcoming biomass
recalcitrance, cellulosic biofuels will be more expensive than corn biofuels. Research has
been directed to reducing the recalcitrance by targeted modification of plant cell wall
structure and composition. Transgenic alfalfa (Medicago sativa L.) plants with reduced lignin
content were found to yield more than twice as much fermentable sugar from cell wall
polysaccharides as did wild-type plants, suggesting that lignin modification could even
bypass the need for acid treatment [Chen and Dixon, 2007]. Reducing bioconversion
recalcitrance by lowering lignin content is therefore a promising line of research.
Switchgrass has higher amounts of carbohydrates on a weight basis than alfalfa, suggesting
better bioconversion potential, but carbohydrate recovery appears to be inversely
correlated with maturity and lignin content [Dien et al., 2006]. A targeted approach has been
applied to probe the lignin biosynthetic pathway in switchgrass using genomics tools.
Gene families involved in the lignin biosynthetic pathway will be recovered and used as
candidate genes for genetic engineering [Karuppiah et al., 2008].
Genotypes with high IVDMD (in vitro dry matter digestibility) or low lignin
concentration may have lower survival than genotypes with lower IVDMD. The effects of
genetic changes in IVDMD on plant survival of switchgrass were evaluated by Vogel et al.
[2002]. No significant differences in IVDMD were seen among the populations obtained from
crosses between randomly selected plants from pasture trials of Trailblazer and Pathfinder.
In addition, it was concluded that it should be feasible to continue to breed for high IVDMD
in switchgrass, but survival will need to be an additional selection criterion.

Nine switchgrass populations (six lowland and three upland) were evaluated in five
south-central U.S. regions to estimate biofuel components [Cassida et al., 2005]. The lowland
group consistently yielded more lignocellulose across these regions. The lowland genotypes had
greater cellulose concentrations (394 vs. 388 g kg⁻¹) and greater yields of cellulose (6.03 vs.
2.63 Mg ha⁻¹), lignin (1.37 vs. 0.63 Mg ha⁻¹) and total lignocellulose (7.40 vs. 3.26 Mg ha⁻¹)
than upland genotypes. Among the lowlands, the southern lowlands had greater yields of
cellulose (5.57 vs. 4.04 Mg ha⁻¹), lignin (1.29 vs. 0.92 Mg ha⁻¹) and total lignocellulose (6.85
vs. 4.95 Mg ha⁻¹) than the northern lowlands.
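
As a quick consistency check on the Cassida et al. [2005] figures quoted above, total lignocellulose should equal the sum of the cellulose and lignin yields. The following is a minimal Python sketch of that check. The dictionary layout, the function names and the 0.02 Mg per hectare rounding tolerance are assumptions made for illustration and are not part of the original study.

```python
# Yields in Mg per hectare, as quoted in the text from Cassida et al. [2005].
# The grouping keys and the tolerance below are illustrative assumptions.
YIELDS = {
    "lowland": {"cellulose": 6.03, "lignin": 1.37, "total": 7.40},
    "upland": {"cellulose": 2.63, "lignin": 0.63, "total": 3.26},
    "southern lowland": {"cellulose": 5.57, "lignin": 1.29, "total": 6.85},
    "northern lowland": {"cellulose": 4.04, "lignin": 0.92, "total": 4.95},
}

ROUNDING_TOLERANCE = 0.02  # Mg/ha; allows for rounding of the reported means


def component_gap(group):
    """Return reported total minus (cellulose + lignin) for one group."""
    values = YIELDS[group]
    return values["total"] - (values["cellulose"] + values["lignin"])


def is_consistent(group):
    """True if the reported total matches its components within tolerance."""
    return abs(component_gap(group)) <= ROUNDING_TOLERANCE


def lignin_share(group):
    """Fraction of the reported total lignocellulose yield that is lignin."""
    values = YIELDS[group]
    return values["lignin"] / values["total"]


if __name__ == "__main__":
    for group in YIELDS:
        print(f"{group}: gap = {component_gap(group):+.2f} Mg/ha, "
              f"lignin share = {lignin_share(group):.1%}, "
              f"consistent = {is_consistent(group)}")
```

Under these assumptions the two ecotype-level totals match their components exactly, and the two lowland sub-group totals differ from the component sums by 0.01 Mg per hectare, which is what rounding of the reported means would produce.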

Conclusions

Switchgrass has historically been used as ground cover to control erosion and farmed as
forage for livestock. Recently it has attained the status of a dedicated herbaceous bioenergy
crop and is still evolving for this use. Adaptation is important in naturalizing switchgrass
populations to particular climates, photoperiods and soil conditions. Switchgrass is classified
into two distinct ecotypes: upland and lowland. Day length and tolerance to cold and heat
largely control the adaptation zone of these switchgrass populations. Lowland ecotypes
adapted to the southern latitudes are larger, more robust and higher yielding than their upland
counterparts, which are generally adapted to the mid and northern latitudes. Switchgrass has a
tremendous amount of diversity owing to the high levels of heterozygosity maintained by its
self-incompatibility and outcrossing nature, along with its polyploid genome. This genetic
diversity gives the species the ability to adapt to changing environments, including new pests
and diseases and new climatic conditions. Understanding this genetic diversity provides
options for developing cultivars through selection and breeding, and for genetic and genomic
manipulation studies. Molecular markers have shown a clear distinction between the uplands
and lowlands, which seem to belong to different heterotic groups. Greater variation exists within
a population of switchgrass than among populations, which facilitates the use of a recurrent
selection strategy within a defined population to develop cultivars suited to each
geographic region. In addition, variation for morphological traits, plant characteristics,
biomass potential and nutrient uptake exists between the two ecotypes. The lowlands, when
grown in southern Oklahoma, are generally late-flowering types with more vegetative
biomass, and are taller and have thicker stems than the uplands. The lowlands tend to
remove more soil N and P during the growing season than the uplands. Down-regulation
of lignin pathway genes in switchgrass to reduce biomass recalcitrance is emphasized for its
use as a biofuel crop. Diseases may emerge as a major
limitation in monoculture. Although the future is bright for switchgrass as a dedicated energy
crop, with millions of hectares projected to be planted in order to meet the DOE goals
[Bouton, 2007], much more program coordination among the DOE, USDA and other federal,
state and local agencies will be necessary to attain the billion-ton feedstock goal [Perlack et
al., 2005].

References

Alderson, J., & Sharp, W.C. (1994). Grass varieties in the United States. Agricultural
Handbook No. 170. USDA, Soil Conservation Service, Washington, DC.
Alexandrova, K.S., Denchev, P.D., & Conger, B.V. (1996). Micropropagation of switchgrass
by node culture. Crop Sci. 36:1709-1711.
Bakker, R.R., & Jenkins, B.M. (2003). Feasibility of collecting naturally leaching rice straw
for thermal conversion. Biomass Bioenergy. 25:597-614.
Barker, R.E., Hass, R.J., Jacobson, E.T., & Berdahl, J.D. (1988). Registration of
Forestburgh switchgrass. Crop Sci. 28:192-193.
Barker, R.E., Hass, R.J., Berdahl, J.D., & Jacobson, E.T. (1990). Registration of Dacotah
switchgrass. Crop Sci. 30:1158.
Berdahl, J.D., Frank, A.B., Krupinsky, J.M., Carr, P.M., & Hanson, J.D. (2005). Biomass
yield, phenology, and survival of diverse switchgrass cultivars and experimental strains
in western North Dakota. Agron. J. 97:549-555.
Berdahl, J.D., & Redfearn, D.D. (2007). Grasses for arid and semiarid areas. In Barnes RF,
Nelson CJ, Moore KJ & Collins M. (6th edition, vol. II). Forages: The Science of
Grassland Agriculture. Blackwell Publishing: pp. 221-244.
Boe, A., & Ross, J.G. (1998). Registration of Sunburst switchgrass. Crop Sci. 38:540.
Boe, A., & Casler, M.D. (2005). Hierarchical Analysis of Switchgrass Morphology. Crop Sci.
45:2465-2472.
Boe, A. (2007). Variation between two switchgrass cultivars for components of vegetative
and seed biomass. Crop Sci. 47:636-640.
Bouton, J.H. (2007). Molecular breeding of switchgrass for use as biofuel crop. Current
Opinion in Genet. and Dev. 17:553-558.
Brunken, J.N., & Estes, J.R. (1975). Cytological and morphological variation in Panicum
virgatum L. Southwest Nat 19:379-385.
Brummer, E.C. (1999). Capturing heterosis in forage crop cultivar development. Crop Sci.
39:943-954.
Burton, G. (1942). A cytological study of some species in the tribe Paniceae. Am. J. Bot.
29:355-359.
Casler, M.D., Vogel, K.P., Taliaferro, C.M., & Wynia, R.L. (2004). Latitudinal Adaptation of
Switchgrass Populations. Crop Sci. 44:293-303.
Casler, M.D., Stendal, C.A., Kapich, L., & Vogel, K.P. (2007a). Genetic diversity, plant
adaptation regions, and gene pools for switchgrass. Crop Sci. 47:2261-2273.
Casler, M.D., Vogel, K.P., Taliaferro, C.M., Ehlke, N.J., Berdahl, J.D., Brummer, E.C.,
Kallenbach, R.L., West, C.P., & Mitchell, R.B. (2007b). Latitudinal and longitudinal
adaptation of switchgrass populations. Crop Sci. 47:2249-2260.
Casler, M.D. (2005). Ecotypic Variation among Switchgrass Populations from the Northern
USA. Crop Sci 45:388-398.
Cassida, K.A., Muir, J.P., Hussey, M.A., Read, J.C., Venuto, B.C., & Ocumpaugh, W.R.
(2005). Biofuel component concentrations and yield of switchgrass in south central U.S.
environments. Crop Sci 45:682-692.
Chen, F., & Dixon, R. (2007). Lignin modification improves fermentable sugar yields for
biofuel production. Nat Biotechnol 25:759-761.
Christian, D.G., & Elbersen, H.W. (1998). Prospects of using Panicum virgatum as biomass
energy crop. In. ElBassam ed.
Church, G.L. (1940). Cytotaxonomic studies in the Gramineae Spartina, Andropogon, and
Panicum. Am J Bot 27:263-271.
Cornelius, D.R., & Johnson, C.O. (1941). Differences in plant type and reaction to rust
among several collections of Panicum virgatum L. Journal of American society of
agronomy: 33:115-124.
Dien, B.S., Jung, H.G., Vogel, K.P., Casler, M.D., Lamb, J.F.S., Weimer, P.J., Iten, L.,
Mitchell, R.B., & Sarath, G. (2006). Chemical composition and response to dilute-acid
pretreatment and enzymatic saccharification of alfalfa, reed canarygrass, and switchgrass.
Biomass Bioenergy 30:880-891.
Frank, A.B., Berdahl, J.D., Hanson, J.D., Liebig, M.A., & Johnson, H.A. (2004). Biomass and
carbon partitioning in switchgrass. Crop Sci 44:1391-1396.
Fike, J.H., Parrish, D.J., Wolf, D.D., Balasko, J.A., Green, J.T. Jr., Rasnake, M., & Reynolds,
J.H. (2005). Long-term yield potential of switchgrass for biofuel systems. Biomass
Bioenergy 30:198-206
Gustafson, D.M., Boe, A., & Jin, Y. (2003). Genetic variation for Puccinia emaculata
infection in switchgrass. Crop Sci 43:755-759.
Gunter, L.E., Tuskan, G.A., & Wullschleger, S.D. (1996). Diversity among populations of
switchgrass based on RAPD markers. Crop Sci 36:1017-1022.
Hopkins, A.A., Taliaferro, C.M., Murphy, C.D., & Christian, D. (1996). Chromosome
number and nuclear DNA content of several switchgrass populations. Crop Sci 36:1192-
1195.
Huang, S., Su, X., Haselkorn, R., & Gornicki, P. (2003). Evolution of switchgrass (Panicum
virgatum L.) based on sequences of the nuclear gene encoding plastid acetyl-CoA
carboxylase. Plant Sci. 164:43-49.
Huff, D.R. (1997). RAPD characterization of heterogeneous perennial ryegrass cultivars.
Crop Sci. 37:557-564.
Hultquist, S.J., Vogel, K.P., Lee, D.J., Arumuganathan, K., & Kaeppler, S. (1996).
Chloroplast DNA and nuclear DNA content variations among cultivars of switchgrass,
Panicum virgatum L. Crop Sci 36:1049-1052.
Hultquist, S.J., Vogel, K.P., Lee, D.J., Arumuganathan, K., & Kaeppler, S. (1997). DNA
content and chloroplast DNA polymorphisms among switchgrass from remnant mid-
western prairies. Crop Sci 37:595-598.
Jung, G.A., Shaffer, J.A., & Stout, W.L. (1988). Switchgrass and big bluestem responses to
amendments on strongly acid soil. Agron. J. 80:669-676.
Jung, G.A., Shaffer, J.A., Stout, W.L., & Panciera, M.J. (1990). Warm-season grass diversity
in yield, plant morphology, and nitrogen concentration and removal in northeastern USA.
Agron. J. 82:21-26.
Karuppiah, P., Allen, S., Ma, J., Dixon, R., Blancaflor, E., & Tang, Y. (2008). Abstract
PAGXVI conference http://www.intl-pag.org/16/abstracts/PAG16_P02c_88.html.
Kolliker, R., Stadelmann, F.J., Reidy, B., & Nosberger, J. (1999). Genetic variability of
forage grass cultivars: A comparison of Festuca pratensis Huds, Lolium perenne L. and
Dactylis glomerata L. Euphytica 106:261-270.
Koshi, P.T., Stubbendieck, J., Eck, H.V., & McCully, W.G. (1982). Switchgrass: Forage
yield, forage quality, & water use efficiency. J. Range Mgt. 35:623-627.
Lange, J.P. (2007). Lignocellulose conversion: an introduction to chemistry, process and
economics. Biofuels, Bioproducts and Biorefining 1:39-48.
Lemus, R., Brummer, E.C., Moore, K.J., Molstad, N.E., Burras, C.L., & Barker, M.F. (2002).
Biomass yield and quality of 20 switchgrass populations in southern Iowa, USA.
Biomass Bioenergy 23:433-442.
Lee, D.K., & Boe, A. (2005). Biomass production of switchgrass in central South Dakota.
Crop Sci. 45:2583-2590.
Liebig, M.A., Schmer, M.R., Vogel, K.P., & Mitchell, R.B. (2008). Soil Carbon Storage by
Switchgrass Grown for Bioenergy. Bioenerg. Res. 1:215-222.
Lu, K., Kaeppler, S.M., Vogel, K.P., Arumuganathan, K., & Lee, D.J. (1998). Nuclear DNA
content and chromosome numbers in switchgrass. Great Plains Res 8:269-280.
Martinez-Reyna, J.M., Vogel, K.P., Caha, C., & Lee, D.J. (2001). Meiotic stability, chloroplast
DNA polymorphisms, and morphological traits of upland × lowland switchgrass
reciprocal hybrids. Crop Sci 41:1579-1583.
Martinez-Reyna, J.M., & Vogel, K.P. (2002). Incompatibility systems in switchgrass. Crop
Sci 42:1800-1805.
Martinez-Reyna, J.M., & Vogel, K.P. (2008). Heterosis in switchgrass: Spaced plants. Crop
Sci 48:1312-1320.
McLaughlin, S., Bouton, J., Bransby, D., Conger, B., Ocumpaugh, W., Parrish, D.,
Taliaferro, C.M., Vogel, K.P., & Wullschleger, S.D. (1999). Progress in developing
switchgrass as a bioenergy feedstock. In Janick J. (ed.) Perspectives on new crops and
new uses. Am. Soc. Hortic. Sci. Press., Alexandria, VA. pp 282-298.
McLaughlin, S.B., & Kszos, L.A. (2005). Development of switchgrass (Panicum virgatum)
as a bioenergy feedstock in the United States. Biomass Bioenergy 28:515-535.
Mian, M.A.R., Zwonitzer, J.C., Chen, Y., Saha, M.C., & Hopkins, A.A. (2005). AFLP
diversity within and among hardinggrass populations. Crop Sci 45:2591-2597.
Miles, T.R., & Miles, T.R. Jr. (1994). Alkalis in alternative fuels. p. 152-160. In J. Farrell et
al. (ed.) Sixth Natl. Bioenergy Conf., Reno, NV. 26 Oct. (1994). Western Reg. Biomass
Energy Prog., Golden, CO.
Missaoui, A. (2003). Molecular investigation of the genetic variation and polymorphism in
switchgrass (Panicum virgatum L.) cultivars and development of a DNA marker for the
classification of switchgrass germplasm. Ph.D. thesis, The University of Georgia.
Missaoui, A.M., Paterson, A.H., & Bouton, J.H. (2006). Molecular markers for the
classification of switchgrass (Panicum virgatum L.) germplasm and to assess genetic
diversity in three synthetic switchgrass populations. Genet. Resources Crop Evol
53:1291-1302.
Moss, D.N., Krenzer Jr., E.G., & Brun, W.A. (1969). Carbon Dioxide Compensation Points
in Related Plant Species. Science 164:187-188.
Narasimhamoorthy, B., Saha, M.C., Swaller, T., & Bouton, J.H. (2008). Genetic diversity in
switchgrass collections assessed by EST-SSR markers. Bioenerg. Res. 1:136-146.
Newell, L.C. (1968). Registration of Pathfinder switchgrass. Crop Sci. 8:516.
Nielsen, E.L. (1944). Analysis of variation in Panicum virgatum. J. Agric. Res. 69:327-353.
Nickell, G.L. (1972). The physiological ecology of upland and lowland Panicum virgatum.
Ph.D. dissertation (Diss. Abstr. No. 7304957). Univ. of Oklahoma, Norman.
Ocumpaugh, W.R., Sanderson, M.A., Hussey, M.A., Read, J.C., Tischler, C.R., & Reed, R.L.
(1997). Evaluation of switchgrass cultivars and cultural methods for biomass production
in the southcentral U.S. Final report. Oak Ridge National Laboratory, Oak Ridge, TN.
contract #19X-SL128C.
Parrish, D.J., Wolf, D.D., & Daniels, W.L. (1997). Switchgrass as a biofuels crop for the upper
Southeast: Variety trials and cultural improvements. Report to Oak Ridge National
Laboratory, Virginia Tech, Blacksburg, VA.
Parrish, D.J., & Fike, J.H. (2005). The biology and agronomy of switchgrass for biofuels.
Critical Reviews in Plant Sciences 24:423-459.
Perlack, R.D., Wright, L.L., Turhollow, F.F., Graham, R.L., Stokes, B.J., & Erbach, D.C.
(2005). Biomass as feedstock for a bioenergy and bioproducts industry: the technical
feasibility of a billion-ton annual supply. http://feedstockreview.ornl.gov/pdf/billion_ton_vision.pdf.
Porter, C.L. (1966). An analysis of variation between upland and lowland switchgrass,
Panicum virgatum L., in central Oklahoma. Ecology 47.
Reynolds, J.H., Walker, C.L., & Kirchener, M.J. (2000). Nitrogen removal in switchgrass
biomass under two harvest systems. Biomass and Bioenergy. 19:281-286.
Sanderson, M.A., & Wolf, D.D. (1995). Morphological development of switchgrass in
diverse environments. Agron. J. 87:908-915.
Sanderson, M.A., Reed, R.L., Ocumpaugh, W.R., Hussey, M.A., Van Esbroeck, G., Read,
J.C., Tischler, C.R., & Hons, F.M. (1999). Switchgrass cultivars and germplasm for
biomass feedstock production in Texas. Bioresour Technol. 67:209-219.
Sanderson, M.A., Jones, R.M., McFarland, M.J., Stroup, J., Reed, R.L., & Muir, J.P. (2001).
Nutrient movement and removal in a switchgrass biomass-filter strip system treated with
dairy manure. J. Environ. Qual. 30:210-216.
Sanderson, M.A., Reed, R.L., McLaughlin, S.B., Wullschleger, S.D., Conger, B.V., Parrish,
D.J., Wolf, D.D., Taliaferro, C.M., Hopkins, A.A., Ocumpaugh, W.R., Hussey, M.A., Read,
J.C., & Tischler, C.R. (1996). Switchgrass as a sustainable bioenergy crop. Bioresour
Technol. 56:83-93.
Schmer, M.R., Vogel, K.P., Mitchell, R.B., & Perrin, R.K. (2008). Net energy of cellulosic
ethanol from switchgrass. PNAS 105:464-469.
Talbert, L.E., Timothy, D.H., Burns, J.C., Rawlings, J.O., & Moll, R.H. (1983). Estimates of
genetic parameters in switchgrass. Crop Sci 23:725-728.
Taliaferro, C.M., Vogel, K.P., Bouton, J.H., McLaughlin, S.B., & Tuskan, G.A. (1999).
Reproductive characteristics and breeding improvement potential of switchgrass. In
Proceedings of Fourth Biomass Conference of the Americas, Aug. 29-Sept. 2, 1999,
Oakland, CA. pp. 147-153.
Tischler, C.R., Elbersen, H.W., Hussey, M.A., Ocumpaugh, W.R., Reed, R.L., & Sanderson,
M.A. (2001). Registration of TEM-SLC and TEM-SEC switchgrass germplasms. Crop
Sci. 41:1654-1655.
Ubi, B.E., Kolliker, R., Fujimori, M., & Komatsu, T. (2003). Genetic diversity in diploid
cultivars of rhodesgrass determined on the basis of amplified fragment length
polymorphism markers. Crop Sci 43:1516-1522.
Vogel, K.P. (2004). Switchgrass. p. 561-588. In L.E. Moser, L. Sollenberger, & B. Burson
(ed.) Warm-season (C4) grasses. ASA, CSSA, and SSSA, Madison, WI.
Vogel, K.P., Gorz, H.J., & Haskins, F.A. (1989). Breeding Grasses for the Future. Crop
Science Society of America. Contributions from Breeding Forage and Turf Grasses.
CSSA Special Pub.
Vogel, K.P., Haskins, F.A., Gorz, H.J., Anderson, B.A., & Ward, J.K. (1991). Registration of
'Trailblazer' switchgrass. Crop Sci 31:1388.
Vogel, K.P., Hopkins, A.A., Moore, K.J., Johnson, K.D., & Carlson, I.T. (1996). Registration
of "Shawnee" switchgrass. Crop Sci 36:1713.
Vogel, K.P., Hopkins, A.A., Moore, K.J., Johnson, K.D., & Carlson, I.T. (2002). Winter
survival in switchgrass populations bred for high IVDMD. Crop Sci 42:1857-1862.
Warner, D.A., Ku, M.S.B., & Edwards, G.E. (1987). Photosynthesis, Leaf Anatomy, and
Cellular Constituents in the Polyploid C4 Grass Panicum virgatum. Plant Physiol
84:461-466.
Zan, C., Fyles, J., Girouard, P., Samson, R., & Doan, M. (1997). Carbon storage in
switchgrass and short-rotation willow plantations. p. 355-361. In R.P. Overend and E.
Chornet (ed.) Making a business from biomass in energy, environment, chemicals, fibers,
and materials. Vol. 1. Elsevier Science. Oxford, UK.


Chapter 5

Genetic Variability in the Fescue-Ryegrass Complex

F. M. Kirigwi, A. A. Hopkins and M. C. Saha

Affiliation: Forage Improvement Division,
The Samuel Roberts Noble Foundation, Inc.,
2510 Sam Noble Parkway, Ardmore, OK 73401, USA

Abstract

Fescues and ryegrasses in the Lolium genus are widely used as forage and turf,
especially in temperate regions of the world. These highly productive grass species
provide feed and fodder for livestock and wild animals, play a major role as turf on golf
courses and lawns worldwide, and prevent soil erosion. Among these grasses, tall fescue
[Lolium arundinaceum (Schreb.) Darbysh.] germplasm is classified into five botanical
varieties that range from tetraploid to decaploid and into two major germplasm pools,
Continental and Mediterranean, as well as into two functional groups, forage and turf
types. Important species in the genus Lolium include the outcrossing Lolium perenne L.
(perennial ryegrass) and the self-pollinated L. temulentum L. subsp. temulentum (darnel,
darnel ryegrass). The majority of Lolium species are self-infertile, have a strong
self-incompatibility system and are, therefore, highly heterogeneous. Grazing or selection
may lead to loss of rare alleles that may be useful for adaptation to extreme environments,
e.g., when these cool-season grasses are grown in warmer, drier areas. Understanding the
levels of genetic diversity within and genetic relationships between populations is
therefore important not only for breeding, but also for ensuring adaptability and
persistence, quality and disease resistance of germplasm accessions, breeding lines and
populations. At the Noble Foundation, efforts have been concentrated on collecting tall
fescue and L. temulentum germplasm, and the development of molecular tools for these
species. Molecular tools developed in-house were employed to study genetic diversity
and to understand the utility of various marker tools for diversity studies. In this chapter,
we review the genetic diversity work carried out in Lolium, with an emphasis on our
work at the Noble Foundation. Various marker systems have been found to be useful in
the Lolium genus, with SSRs in particular being transferable across the fescue-ryegrass
complex.

Author for correspondence: Malay C. Saha, Forage Improvement Division, The Samuel Roberts Noble
Foundation, Inc., 2510 Sam Noble Parkway, Ardmore, OK 73401, USA. E-mail: mcsaha@noble.org. Phone:
+1-580-224-6840. Fax: +1-580-224-6802

Introduction

Fescues belong to the Poaceae family and subfamily Festucoideae. The reader is referred
to Craven et al. [2007] for a thorough review of the taxonomy and evolution of fescues.
Previously, more than 400 species were placed in the Festuca genus [Clayton and Renvoize
1986], which was divided into six sections based on leaf morphology and ovary structure:
Bovinae, Montanae, Subbulbosae, Scariosae, Ovinae and Variae [Hackel, cited in Jauhar,
1993]. Tall fescue is a member of the section Bovinae (= subgenus Schedonorus), which
comprises the broad-leaved fescues that have been reclassified into the Lolium
genus, while the fine-leaved fescues, in the section Ovinae (= subgenus Festuca), remain in
the Festuca genus. It has been proposed that tall fescue be classified as Schedonorus
arundinaceus (Schreb.) Dumort. [Soreng et al., 2001]; however, in this report we follow the
classification of Darbyshire [1993].
The broad-leaved fescues are a polyploid complex with 2n chromosome numbers
ranging from 14 to 70 [Borrill et al., 1971]. The ryegrasses in the Lolium genus consist of
eight recognized species, all of which are diploids (2n = 14) [Loos, 1993a; Terrell, 1968].
The outcrossing ryegrasses include L. perenne L., L. multiflorum Lam. and L.
rigidum Gaud, whereas L. temulentum L. subsp. temulentum (darnel, darnel ryegrass) and L.
persicum Boiss. and Hohen. ex Boiss. (Persian darnel) are self-pollinated species. An
additional outcrossing species, L. loliaceum, is considered conspecific with L. rigidum
[Terrell, 1968]. Based on the analysis of morphological and quantitative data, Abbas and
Sarita [2006] could not separate L. loliaceum from L. rigidum, indicating that the two are
inseparable at the species level. A new species of Lolium, L. saxatile, was reported by Scholz
and Scholz [2005]. This Lolium species is related to L. multiflorum, but has a perennial habit
and other distinguishing features that include erect culms and a distinct characteristic
appearance of the culm caused by basal shoots [Scholz and Scholz, 2005]. Although native to
Europe, temperate Asia and North Africa, the genus Lolium has been introduced to most
temperate regions of the world [Loos, 1993b]. The most important species are L. perenne L.
(perennial ryegrass) and L. multiflorum (Italian ryegrass). Perennial ryegrass is mainly
used for grazing and turf purposes while Italian ryegrass is mainly used for grazing or hay
and silage making, among other uses.
Tall fescue is a short-day hemicryptophyte grass species that is predominantly self-sterile.
It is a cool-season perennial hay and pasture grass in the temperate regions of the world
and is native to Europe [Gibson and Newman, 2001]. Tall fescue is a deep-rooted, upright,
coarse-leaved bunch grass. However, most types have short rhizomes and can form sods
when kept short through grazing or mowing. Tall fescue is an important grass in the United
States where it is grown on about 15 million hectares [Buckner et al., 1979]. It forms the
forage base for beef cow-calf production in the east-central and southeastern U.S., and is
important in turf and conservation applications [Sleper and Buckner, 1995]. In cytological
terms, tall fescue is an allogamous allohexaploid (2n = 6x = 42) whose genome size is
approximately 5.27 to 5.83 × 10⁶ kb [Seal, 1983]. The genome constitution of tall fescue is
PPG₁G₁G₂G₂ [Sleper and West, 1996]. The progenitor species are meadow fescue [L.
pratense (Huds.) Darbysh.; 2n = 2x = 14] and F. glaucescens Hegetschw. and Heer (2n = 4x = 28).
Meadow fescue is believed to be the donor of the P genome while the tetraploid species, F.
glaucescens, is the donor of the G₁G₂ genomes [Xu et al., 1994]. It should be noted that
Craven et al. [2007] refer to F. glaucescens as 4x L. arundinaceum (= F. arundinacea ssp.
Fenas).
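
The chromosome numbers quoted above can be cross-checked with simple arithmetic. The Python sketch below derives the base number x = 7 from 2n = 2x = 14 and confirms that the quoted somatic numbers correspond to the stated ploidy levels. The route modeled in `doubled_hybrid` (fusion of reduced gametes followed by chromosome doubling) is the textbook model of allopolyploid formation and is an assumption here; the chapter states only the progenitor species and the genome constitution. All names in the code are illustrative.

```python
# Somatic (2n) chromosome numbers quoted in the text.
MEADOW_FESCUE_2N = 14   # L. pratense, 2n = 2x = 14 (P genome donor)
GLAUCESCENS_2N = 28     # F. glaucescens, 2n = 4x = 28 (G1G2 genome donor)
TALL_FESCUE_2N = 42     # tall fescue, 2n = 6x = 42

BASE_NUMBER = MEADOW_FESCUE_2N // 2   # x = 7, from 2n = 2x = 14


def ploidy_level(somatic_number, base=BASE_NUMBER):
    """Return the number of basic chromosome sets in a somatic count."""
    if somatic_number % base != 0:
        raise ValueError("somatic number is not a multiple of the base number")
    return somatic_number // base


def doubled_hybrid(parent_a_2n, parent_b_2n):
    """Assumed route: reduced gametes (n) from each parent fuse, then the
    chromosome complement of the hybrid doubles."""
    hybrid = parent_a_2n // 2 + parent_b_2n // 2
    return 2 * hybrid


if __name__ == "__main__":
    print("tall fescue ploidy:", ploidy_level(TALL_FESCUE_2N))
    print("doubled hybrid 2n:", doubled_hybrid(MEADOW_FESCUE_2N, GLAUCESCENS_2N))
```

The same function also maps the 2n range of 14 to 70 reported for the broad-leaved fescues onto ploidy levels from diploid to decaploid.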

Origin, Adaptation and Sources of Germplasm

The Poeae (Festuceae) tribe originates primarily in Western Europe [Lamp et al., 2001;
Meyer and Watkins, 2003]. According to Buckner et al. [1979], tall fescue can be seen in
pastures in Europe, North Africa, the mountains of East Africa and in Madagascar. European
settlers introduced tall fescue to North and South America although the actual introduction
date in the United States remains unknown. It is suspected that tall fescue was introduced as a
contaminant in meadow fescue which was introduced from England prior to 1800 [Kennedy,
1900]. Perennial ryegrass (L. perenne) is the most popular forage grass in temperate Europe
because of its higher nutritive quality. Although not a major forage crop in the United States,
meadow fescue has been used as a germplasm source in the development of interspecific
hybrids with both perennial and annual ryegrass [Thomas and Humphreys, 1991].
After the introduction of tall fescue in the 1800s, many lawns and pastures were sown.
Grazing and mowing over the years were responsible for selection and development of the
current germplasm [Meyer and Watkins, 2003]. Germplasm can be acquired through
collections from old stands in the East Coast states of the U.S. in areas such as graveyards,
old lawns and in docking yards. Among the first collections of tall fescue were those made by
Dr. C.R. Funk in 1962 and subsequently by W.A. Meyer [Meyer and Watkins, 2003].
Additionally, germplasm collection can be carried out in areas of origin in Europe and the
Mediterranean countries [e.g. Meyer and Watkins, 2003]. The Noble Foundation has
collected numerous tall fescue accessions within the United States, as well as northern
Mexico, over the years [A.A. Hopkins, personal communication]. New collections in the
geographical areas of origin also present unique opportunities to collect, at the same time,
new fungal endophyte germplasm that can be used to develop new grass-endophyte
combinations for enhanced tolerance to abiotic and biotic stresses [Clement et al., 2001].
The National Genetic Resources Program of the USDA has the responsibility of
acquiring, characterizing, preserving, documenting and distributing germplasm that has
importance to food and agricultural production. The Agricultural Research Service hosts a
Web server, the Germplasm Resources Information Network [GRIN, http://www.ars-grin.gov/],
which is an online database providing information on plant, animal, microbial and
invertebrate germplasm. As of November 2008, GRIN held 2469 accessions classified as
Festuca, including 84 species collected from 68 different countries (Table 1). Of these, 1033
accessions are listed as F. arundinacea Schreb. (= L. arundinaceum). In the same database, the
genus Lolium is represented by 1435 accessions from 19 species and 63 different countries.

Table 1. The number of accessions listed as Festuca and Lolium held by the National
Genetic Resources Program of the USDA-ARS and accessible through the Germplasm
Resources Information Network [GRIN, http://www.ars-grin.gov/]

Genus     Center   Accessions   Countries   Species
Festuca   NSSL             17           1         2
          W6             2452          67        82
Lolium    COR              42           3         3
          NSSL              6           2         2
          PGQO             23           1         2
          W6             1364          57        12

NSSL = National Center for Genetic Resources Preservation, Colorado.
W6 = Western Regional PI Station, Washington.
COR = National Germplasm Repository - Corvallis, Oregon.
PGQO = Plant Germplasm Quarantine Program, Maryland.
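
The genus-level totals quoted in the text can be recovered by summing the rows of Table 1. The short Python sketch below does this. The tuple layout and the function name are illustrative assumptions, and summing the Countries and Species columns assumes no overlap between holding centers, which is an assumption that happens to reproduce the totals given in the text.

```python
# Rows of Table 1: (genus, center, accessions, countries, species).
TABLE_1 = [
    ("Festuca", "NSSL", 17, 1, 2),
    ("Festuca", "W6", 2452, 67, 82),
    ("Lolium", "COR", 42, 3, 3),
    ("Lolium", "NSSL", 6, 2, 2),
    ("Lolium", "PGQO", 23, 1, 2),
    ("Lolium", "W6", 1364, 57, 12),
]


def genus_totals(rows):
    """Sum accessions, countries and species per genus across centers."""
    totals = {}
    for genus, _center, accessions, countries, species in rows:
        acc, cty, spp = totals.get(genus, (0, 0, 0))
        totals[genus] = (acc + accessions, cty + countries, spp + species)
    return totals


if __name__ == "__main__":
    for genus, (acc, cty, spp) in genus_totals(TABLE_1).items():
        print(f"{genus}: {acc} accessions, {cty} countries, {spp} species")
```

With these rows the sums are 2469 accessions, 68 countries and 84 species for Festuca, and 1435 accessions, 63 countries and 19 species for Lolium, in agreement with the figures stated in the text.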

The Lolium Complex

Interspecific hybridization between species in the Lolium complex has been exploited in
the development of forage germplasm of high quality and winter hardiness, and in the
introgression of abiotic stress tolerance traits [Humphreys et al., 2005]. The ryegrass
members of the Lolium genus are high yielding and produce fodder of high quality and
digestibility. These grasses are considered to be Europe's most important forage grasses because
of their quality and yield, but only where drought and freezing stress are absent, because they
tolerate these stresses poorly [Kosmala et al., 2008]. On the other hand, the fescue members of
the genus Lolium have good resistance to abiotic stresses resulting in winter hardiness, drought
resistance and persistence, and are a source of useful tolerance traits for ryegrass species
[Humphreys et al., 2005; Kosmala et al., 2006a; Kosmala et al., 2008]. Moreover, biotic
stress resistance genes have been introgressed from fescue to ryegrass species [Armstead et al., 2006].
Combining the attributes of both groups is feasible through hybridization because
chromosomes of the two have high homology and a high frequency of recombination which
enables gene transfer from one homoeologous chromosomal region to another, resulting in
fertile offspring [Humphreys and Pašakinskienė, 1996; Humphreys et al., 2003; Jauhar, 1975;
Jauhar, 1993; King et al., 1999; Naganowska et al., 2001; Terrell, 1968; Zwierzykowski et al.,
1999]. Furthermore, introgressed segments of chromosomes from one species through
interspecific hybridization can be distinguished by genomic in situ hybridization (GISH)
[Kosmala et al., 2006b; Thomas et al., 2003]. Interspecific hybridization between ryegrasses
and fescues can be used for the production of androgenic plants that may display
transgressive resistance for abiotic stresses [Humphreys et al., 2003].
The transfer of stress resistance genes from the fescues to ryegrasses can be carried out
via several schemes. Subgenomes of L. arundinaceum [Humphreys, 1989] or one of its
progenitors such as meadow fescue [Humphreys et al., 2005; Morgan et al., 2001] can be
used. For example, drought resistance was transferred from L. arundinaceum into L.
multiflorum using partially fertile pentaploid hybrids [Humphreys and Thomas, 1993], and
from fescue to ryegrass via a backcross-breeding program using partially fertile tetraploid
hybrids between L. multiflorum [4x] x 4x L. arundinaceum [Humphreys et al., 2005]. Frost
resistance was transferred from L. pratense to L. multiflorum by a backcross-breeding
program [Kosmala et al., 2006a]. Similarly, winter hardiness and frost tolerance genes were
transferred from L. arundinaceum into winter-sensitive L. multiflorum via pentaploid hybrids
using a backcross-breeding program [Kosmala et al., 2008].
A member of the genus Lolium, L. temulentum L. subsp. temulentum (darnel,
darnel ryegrass), is closely related to other members of the fescues and ryegrasses [Mian et
al., 2005a]. Darnel ryegrass has been used as a model in the study of a senescence-induced
degradation gene in L. pratense [Thomas et al., 1999] and can be exploited for introgression
of its self-compatibility genes into the outcrossing members of the genus Lolium [Yamada, 2001].
For example, L. temulentum was used to study the stay green trait of L. pratense [Hauck et
al., 1997; Thomas and Stoddart, 1975]. To transfer the trait, L. pratense was crossed with L.
multiflorum as a bridging species, and then to L. temulentum using both backcross and
embryo-rescue techniques [Thomas et al., 1999].

Population Heterogeneity for Genetic Diversity Assessment

Grass species of economic importance can be divided into two groups: autogamous and
vegetatively propagated species, whose cultivars are characterized by homozygous genotypes
or uniform stands, and outcrossing species, whose cultivars consist of heterogeneous
individuals because of a strong self-incompatibility system [Cornish et al., 1979]. Outcrossing
reshuffles alleles in each cycle of mating, resulting in a high degree of genetic
variation within populations. There has been less use of molecular markers for cultivar
identification in outcrossing species due to a great deal of heterogeneity among and within
populations leading to difficulties in finding cultivar-specific molecular markers [Busti et al.,
2004]. Tall fescue cultivar development normally involves intercrossing selected genotypes
and advancing their offspring through several generations of random mating. Consequently,
tall fescue populations exhibit high heterogeneity among individuals within a population and
among populations. This complicates the conventional use of phenotypic traits for cultivar
identification. However, Veronesi and Falcinelli [1988] used a multivariate approach to
assess the genetic diversity of 48 accessions of tall fescue collected from northern to southern
Italy and concluded that the method appeared to be a valid system for tall fescue germplasm
evaluation.
Other techniques used in assessing genetic diversity include the use of prolamine seed
protein fraction [Abernethy et al., 1989] and high performance liquid chromatography
[Freeman et al., 1996]. These methods are limited by the number of polymorphisms they can
generate to distinguish individual genotypes or bulks from different populations. DNA-based
molecular markers have been applied in the assessment of genetic diversity in allogamous
species that include tall fescue [Mian et al., 2002], darnel ryegrass [Kirigwi et al., 2007] and
perennial ryegrass [Bolaric et al., 2005]. To assess diversity in germplasm and cultivars of a
cross-pollinated species, individuals are randomly picked and their separate DNA profiles used in
analyses [Xu et al., 1994]. These profiles are useful in assessing within-population variation
of a cultivar or population. In order to determine diversity among populations, bulk DNA
samples from several individuals of a population are profiled [Caceres et al., 2000; Fu, 2003].

Molecular Markers in the Assessment of Genetic Diversity

Molecular markers have become a reliable tool for the detection of genetic variability.
Molecular markers are elegant because they directly reveal changes at the DNA sequence
level. Different molecular marker systems have been used to assay the genetic diversity of tall
fescue. These marker systems include the amplified fragment length polymorphism (AFLP)
[Mian et al., 2002; Roldan-Ruiz et al., 2000], restriction fragment length polymorphisms
(RFLP) [Busti et al., 2004; Caceres et al., 2000; Xu et al., 1994], random amplified
polymorphic DNA (RAPD), isozyme polymorphism, and microsatellites or simple sequence
repeats (SSRs) [Saha et al., 2006; Sun et al., 1999]. Chloroplast microsatellite (cpSSR)
markers have also been employed in diversity studies [McGrath et al., 2007, and citations
therein]. Molecular markers have been applied in the development of conservation strategies
[van Hintum, 1999], in the analysis of population genetics and population structure [Mian et
al., 2005a; Powell et al., 1996] and in cultivar identification [Busti et al., 2004; Fu et al.,
2003].
Simple sequence repeats (SSRs) are among the most variable DNA sequences. SSRs
have become an important marker class because they are mostly co-dominant, abundant in
genomes and highly reproducible, and some have high rates of transferability across species
[Saha et al., 2004; Thiel et al., 2003]. SSR markers have the highest level of discrimination
between genotypes [Pejic et al., 1998] and can reveal genetic diversity among different
populations. Consequently, SSRs have become an important marker system in cultivar
fingerprinting, diversity studies, molecular mapping and in marker-assisted selection
[Goldstein and Schlötterer, 1999].
Mian et al. [2002] used AFLP markers generated by CNG methylation-insensitive
(EcoRI/MseI) enzyme combinations. This combination produces markers that cluster along
the hypermethylated regions of the genome such as the centromeres [Menz et al., 2002]. At
the Noble Foundation, we have used markers derived from both the genic and the genomic
regions of the genome, some of which are unmapped. Menz et al. [2004] compared the use of
both SSR and AFLP marker classes. Their findings indicate that the nature and genome
distribution of molecular markers is important in the classification of germplasm. It is
important then to consider the marker system due to different marker distributions in the
genome, which can lead to some regions having a higher contribution to the classification of
germplasm [Menz et al., 2002].
Xu et al. [1994] estimated that a bulk from 16 plants would be the minimum required to
distinguish between tall fescue cultivars using RFLP markers. In other diversity studies, 15
plants were used for AFLP analysis of the crested wheatgrass complex (Agropyron spp. Gaertn)
[Mellish et al., 2002], 15 plants for RAPD analysis of natural, diploid sources of dioecious
buffalograss (Buchloe dactyloides [Nutt.] Engelm.) [Huff et al., 1993] and 16 plants in AFLP
analysis of tall fescue [Mian et al., 2002]. Microsatellites have a higher level of
discrimination between and among genotypes relative to other markers [Pejic et al., 1998];
therefore the use of 15 plants and over 80 microsatellite markers [Staub et al., 2000] would
be sufficient to adequately discriminate between the tall fescue accessions/cultivars.

Transferability of Tall Fescue SSRs to Other Species

Although the transferability of genomic SSR markers across genera and beyond is
generally low [Peakall et al., 1998; Roa et al., 2000], high rates of transferability across
species within a genus have been reported [Gaitán-Solís et al., 2002]. Comparative mapping
reveals a relatively high sequence similarity among members of the Poaceae family [Kantety
et al., 2002]. Consequently, evaluation of SSRs for utility across different grass genera is a
worthwhile objective.
Molecular markers developed at the Noble Foundation for tall fescue have found utility
in other species and genera. The amplification rate of EST-SSRs was 86% in ryegrass (L.
perenne/multiflorum), 83% in meadow fescue (L. pratense), 82% in tetraploid fescue (L.
arundinaceum var. glaucescens), 59% in rice (Oryza sativa) and 71% in wheat (Triticum
aestivum). In a different study, tall fescue genomic-SSRs were assessed for their utility in six
different grass species [Saha et al., 2006]. The rate of transferability of tall fescue-based
marker sequences ranged from 74.2% (for rice) to 84.9% (for wheat and tetraploid fescue),
while for ryegrass the rate was 83.4%. Kirigwi et al. [2007] used 40 SSR primers derived
from tall fescue expressed sequence tags (TF EST-SSRs) and 62 fescue-ryegrass (FL)
genomic SSRs to screen for amplification in L. temulentum on a subset of eight genotypes. A
total of 30 TF EST-SSRs and 32 FL genomic SSRs were selected based on clean
amplification products. This translates into a transfer rate of 71% and 53% for the TF-EST-
and the FL genomic-SSRs, respectively. Saha et al. [2004] evaluated the utility of EST-
SSRs across several grass species. These studies indicate that SSR primer pairs developed
from one species could be used to detect the presence of SSRs in related species
[Dirlewanger et al., 2002; Kuleung et al., 2004; Saha et al., 2004; 2006; Yu et al., 2004].
Consequently, the marker resources developed at the Noble Foundation for tall fescue have
application to other crops and could be employed in comparative mapping.

Tall Fescue and Genetic Differentiation

Heterogeneous individuals characterize perennial grass pastures of outcrossing species
with concomitant reshuffling of alleles from generation to generation. This results in a high
degree of genetic variation within populations. When biotic and abiotic agents of natural
selection act on pasture populations, genetic shifts are likely to occur. Plant populations may
undergo genetic differentiation due to biological agents such as cattle grazing [Allen and
Marlow, 1994; Brummer and Bouton, 1991; Singh et al., 1995; Weiguo et al., 1999] or
abiotic agents [Weiguo et al., 1999, and citations therein]. The changes are likely cumulative
unless there is reseeding with seed of the original variety. Changes in genetic variation of tall
fescue stands occurred within three years for paddocks planted with GA-5 EF, GA-5 EI and
Johnstone [Vaylay and van Santen, 2002]. This is a relatively short time and has implications
for plant material collections from fields. Detection of patterns of genetic structure present in
a field is necessary to enable appropriate pasture sampling designs [Weiguo et al.,
1999]. Overall, seeded cultivars grown under conditions of severe natural selection or grazing
will be different from the original seed lot used for establishment [Vaylay and van Santen,
1999].
Genetic diversity can be correlated to heterosis. Moutray and Frakes [1973] found that
the greatest heterosis for plant height, anthesis date, panicle number, seed yield and fall vigor
rating in tall fescue was between genotypes selected for diverse morphology, origin and
anthesis date relative to crosses of clones with similar morphology, origin or anthesis date.
Crosses between maturity groups resulted in the greatest heterosis above the midparent for all
characteristics.

Genetic Diversity in Lolium

In a study aimed at assessing the genetic diversity of L. temulentum, 41 accessions were
assayed with 40 tall fescue EST-SSRs [TF EST SSRs from Saha et al., 2004] and 60 fescue-
ryegrass (FL) genomic SSRs [Kirigwi et al., 2007]. Both marker sources were useful in
assessing genetic diversity in L. temulentum accessions. Seed shape and size of one of the
accessions, L6, closely resembled meadow fescue and was distinct from other L. temulentum
accessions. Molecular marker profiles and seed characteristics indicated that L6 may be a
meadow fescue and not a darnel ryegrass. Although the accession had leaf morphology and
plant structure similar to some L. temulentum accessions, the molecular characterization
demonstrated the power of DNA markers for distinguishing materials in germplasm
collections [Kirigwi et al., 2007]. In perennial ryegrass, RAPD markers were used to assess
22 ryegrass cultivars, mainly of European origin [Bolaric et al., 2005]. Analysis of molecular
variance (AMOVA) revealed a larger genetic variation within cultivars (66%) than between
them (34%). In switchgrass, assessment of genetic diversity revealed high within (79.6%)
relative to among (20.4%) population variation [Narasimhamoorthy et al., 2008]. The higher
within- than among-population variance has been documented in other self-sterile species like
perennial ryegrass [L. perenne, Bolaric et al., 2005], meadow fescue (F. pratensis),
orchardgrass (Dactylis glomerata L.), rhodesgrass (Chloris gayana Kunth) and hardinggrass
(Phalaris aquatica L.), among others [Huff, 1997; Kölliker et al., 1998; Mian et al., 2005b;
Ubi et al., 2003]. Using cpSSR, McGrath et al. [2007] characterized chloroplast genetic
diversity at allelic and haplotypic levels of 104 accessions of L. perenne, other Lolium
species, Festuca species and Festulolium cultivars. The within-population variance for L.
perenne, and for L. perenne and Festulolium cultivars was 61% and 64%, respectively. The
authors were able to identify plastid gene pools and maternal lineages for L. perenne, and a
possible migration route of L. perenne from southern regions of Europe northwards could be
inferred.

Endophyte Infection and Interaction in the Fescue-Ryegrass Complex

Fungal endophytes, including Neotyphodium species, naturally infect members of the
Poaceae family and can affect growth and physiology of their host grasses. Tall fescue and
ryegrass are naturally infected with endemic endophytes, Neotyphodium coenophialum
([Morgan-Jones and Gams] Glenn, Bacon and Hanlin) and N. lolii, respectively, that live in
the intercellular spaces, act as mutualists with their hosts and are asymptomatic on the plant.
Asexual endophytes, such as Neotyphodium species, are seed transmitted. Endophytes confer
pest resistance [Funk et al., 1983; Prestridge et al., 1982], deeper root development
[Richardson et al., 1990], increased plant tillering [Vaylay and van Santen, 1999], improved
utilization of soil nitrogen [Arachevaleta et al., 1989], stand persistence [Bouton et al., 2002]
and drought tolerance [Bouton et al., 1993a], through improved tiller and whole plant
survival [Shelby and Dalrymple, 1993; West et al., 1993]. Persistence of tall fescue stands in
the coastal plain of the southeastern U.S. was attributed to increased ecological fitness
possibly due to increased drought tolerance in the summer [Bouton et al., 1993a]. Infection
frequencies of endophytes can be affected by grazing pressure as well as by altitude. The
distribution of infection of fungal symbionts in grasses may be affected by both biotic and
abiotic factors. The distribution of infection of Epichloë festucae, a common fungal symbiont
of the genus Festuca, in natural populations of F. rubra, F. ovina and F. vivipara, was
inconsistent from species to species, given different grazing histories and altitude [Granath et
al., 2007].
Livestock grazing tall fescue infected with an endophyte may exhibit symptoms such as
nervousness, reduced weight gain, rough hair coat, elevated body temperature and low
conception rates. These health disorders in animals are collectively called fescue toxicosis
and are caused by ergot-like alkaloids produced by the endophyte [Stuedemann and
Hoveland, 1988]. Over 90% of tall fescue grown in the United States is infected with the
Neotyphodium endophyte [Ball et al., 2002]. The benefits offered by endophyte infection and
the concomitant negative effects on animal health were tempered by the discovery in New
Zealand of nontoxic N. lolii strains that could be used to create novel, nontoxic L.
perenne/endophyte associations [Latch, 1997]. Similarly, endophytes that do not produce
ergot alkaloids in tall fescue plants were found that resulted in little or no loss in plant fitness
[Bouton et al., 2000]. Tall fescue cultivars that are infected with a novel nontoxic endophyte
include Jesup [Bouton et al., 1997], Georgia-5 [Bouton et al., 1993b] and ArkPlus™, which
was developed from HiMag [Sleper et al., 2002] tall fescue and the endophyte strain 4 NE+
[Gunter and Beck, 2004]. Weight gains for animals grazing tall fescue infected with a novel,
nontoxic endophyte have been much greater than those for animals grazing tall fescue infected
with an endemic toxic endophyte, while plant persistence has been comparable [Bouton et al.,
2002; Nihsen et al., 2004].

Tall Fescue Genetic Diversity, a Case Study

Introduction

Several tall fescue populations have been developed at the Noble Foundation from
known cultivars and from collections in Oklahoma and other states. Diversity information
about different cultivars/accessions is important as a strategic tool for the improvement of
crop plants. We carried out a study to assess the genetic variation of tall fescue cultivars and
populations using EST- and genomic SSR markers.

Materials and Methods

Plant Materials, DNA and Molecular Markers
The populations used to examine within-population diversity were GA-5 infected with its
endemic endophyte (GA5-EI), GA-5 infected with the novel endophytes AR584 (GA5-584) and
AR542 (GA5-542), the tall fescue breeding line PDF infected with AR584 (PDF-584) and KY31.
Plants used for DNA extraction were grown in the greenhouse from seed. Approximately 200 mg of
leaf tissue was collected from each of 15 seedlings per population and stored separately at
-80 °C. Genomic DNA was extracted using the DNeasy Plant Mini Kit (Qiagen Inc., Valencia,
Calif.). A set of 43 EST- and 43 genomic SSR primer pairs developed at the Noble Foundation were selected [Saha et
al., 2006] for this study. All polymerase chain reactions (PCRs) were performed under
standard conditions for all primers [Saha et al., 2004], and the PCR products were resolved
using an ABI 3730 DNA Sequencer in the genotyping mode [Schuelke, 2000]. An analysis of
molecular variance (AMOVA) was performed to determine the genetic structure using Arlequin version 3.01.
Similarity matrices for the genotypes were calculated using NTSYS-PC 2.10 (Applied
Biostatistics, Setauket, N.Y., USA). The genetic similarity among genotypes was calculated
by the SIMQUAL procedure using the DICE similarity coefficient [Dice, 1945].
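
As an illustration of the similarity calculation, the Dice coefficient between two genotypes scored for presence (1) or absence (0) of each SSR fragment is 2a / (2a + b + c), where a is the number of fragments shared by both genotypes and b and c are the numbers found in only one of them. The sketch below is a minimal, assumed re-implementation in Python and is not the NTSYS-PC SIMQUAL code; the plant names and fragment scores are hypothetical.

```python
from itertools import combinations


def dice_similarity(profile_a, profile_b):
    """Dice coefficient 2a / (2a + b + c) for two 0/1 fragment profiles.

    a = fragments present in both genotypes,
    b, c = fragments present in only one of them.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of fragments")
    shared = sum(1 for x, y in zip(profile_a, profile_b) if x == 1 and y == 1)
    present_total = sum(profile_a) + sum(profile_b)  # equals 2a + b + c
    if present_total == 0:
        return 0.0  # assumption: two empty profiles are treated as dissimilar
    return 2 * shared / present_total


def similarity_matrix(profiles):
    """Return {(name_i, name_j): Dice similarity} for every pair of genotypes."""
    matrix = {(name, name): 1.0 for name in profiles}
    for first, second in combinations(profiles, 2):
        value = dice_similarity(profiles[first], profiles[second])
        matrix[(first, second)] = matrix[(second, first)] = value
    return matrix


# Hypothetical presence (1) / absence (0) scores for six SSR fragments.
profiles = {
    "plant_A": [1, 1, 0, 1, 0, 1],
    "plant_B": [1, 0, 0, 1, 1, 1],
    "plant_C": [0, 1, 1, 0, 0, 1],
}
matrix = similarity_matrix(profiles)
print(round(matrix[("plant_A", "plant_B")], 3))
```

Because the Dice coefficient ignores fragments absent from both genotypes, shared absences do not inflate similarity, which is one reason it is often preferred for marker band data.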

Results and Discussions

Genetic Variation within and among Populations
Analysis of molecular variance (AMOVA) shows that most of the variance for GA5-EI,
GA5-584, GA5-542, PDF-584 and KY31 was attributable to within-population variation
(94.7% of the variance) whereas the among-population variation was 5.3% (Table 2). Despite
the high within-population variance, AMOVA could distinguish the five populations based
on a sample of 15 plants per population. Similar results have been found for buffalograss (Buchloe
dactyloides [Nutt.] Engelm.) using RAPD markers [Huff et al., 1993], the crested wheatgrass
complex (Agropyron spp. Gaertn) using AFLP markers [Mellish et al., 2002] and in
switchgrass [Narasimhamoorthy et al., 2008] using SSR markers.

Table 2. AMOVA of GA5-EI, GA5-584, GA5-542, PDF-584 and KY31 based on 225 genomic SSR
and 102 EST-SSR fragments. The distance method used was the number of different alleles
[FST]

Source of Variation   d.f.   Sum of squares   Variance components   P-value
Among Groups             2            241.5   1.83 [5.3%]           0.000001
Within Populations      70           2302.3   32.89 [94.7%]         0.000001
Total                   74           2543.8   34.72 [100%]

P-values after 1023 permutations.
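
The percentages in Table 2 follow directly from the variance components. As a rough check, and assuming a simple two-level partition (among and within populations), the fixation index is the among-population component divided by the total variance. The short Python sketch below reproduces that arithmetic from the rounded table values; the function name is illustrative only and this is not the Arlequin procedure.

```python
def variance_partition(sigma_among, sigma_within):
    """Split total variance into among/within percentages and an FST estimate.

    Assumes a two-level AMOVA, where FST = sigma_among / (sigma_among + sigma_within).
    """
    total = sigma_among + sigma_within
    return {
        "total": total,
        "pct_among": 100 * sigma_among / total,
        "pct_within": 100 * sigma_within / total,
        "fst": sigma_among / total,
    }


# Variance components as printed in Table 2 (rounded to two decimals there).
result = variance_partition(1.83, 32.89)
print(f"among: {result['pct_among']:.1f}%  within: {result['pct_within']:.1f}%")
print(f"FST estimate: {result['fst']:.4f}")
```

Computed from the rounded components, the estimate agrees with the reported overall fixation index to about three decimal places; the small remaining difference is expected from rounding in the table.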

The cultivar GA-5 originates from tall fescue ecotypes collected from the southeastern
USA [Bouton et al., 1993b] and may include genotypes tracing to KY31. The population
pairwise FST values between GA5-EI and the derived populations were relatively low (Table 3).
Based on the DICE similarity coefficient, GA5-EI and the derived populations formed a
cluster at a similarity of 0.85 (Figure 1).
The derived populations from GA5-EI were re-selections that differ in, among other
things, flowering time and persistence, and look distinct from the original populations (J.H.
Bouton, personal communication). The novel endophytes AR584 and AR542 were each
inoculated into subsets of 200 GA-5 plants. Differences between the original and derived
populations can arise at two stages: when the initial sample of plants is selected, cured of
the endophyte and infected with a novel endophyte, and later, when seeds carrying the new
endophyte are germinated. Host-endophyte interactions could cause a bottleneck effect.
However, the FST values indicate moderate genetic differentiation between populations and
there seemed to be little effect of the endophyte introduction (Figure 1); any differences
seemed to be due to the selection effect on the original population (founder effect), not to
genotype-endophyte interactions.

Table 3. Population pair-wise FSTs [number of different alleles] and population specific
FST calculated by distance method. All FST P-values were significant [P<0.05] except
for the comparison between GA5-EI and GA5-584 which had a P-value of 0.13514 ± 0.0279

           GA5-EI    GA5-584   GA5-542   PDF584    Population Specific FST
GA5-EI                                             0.04836
GA5-584    0.00724                                 0.05225
GA5-542    0.02367   0.03583                       0.05503
PDF584     0.06339   0.08962   0.07965             0.05412
KY31       0.04816   0.08209   0.05310   0.04074   0.05398

The overall fixation index was 0.05275.


Figure 1. A phenogram constructed based on DICE similarity coefficients calculated from both EST-
and genomic SSR marker data. Forty-three EST-SSRs and 43 genomic SSRs giving 102 and 225
fragments, respectively, were used.

All other pairwise tests of interpopulation distances calculated from individual
populations were significant (Table 3), meaning that each population was differentiated and
that the FST values observed were not random. However, all the other pairwise FSTs
indicated that at least 4% of the genetic variation was due to differences in allele frequencies.
All the population specific FSTs were within the 0.05-0.15 classification indicating moderate
genetic differentiation (Table 3) [Wright, 1978].
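
For readers who want to apply the same qualitative scale to other FST values, the following Python sketch encodes the commonly cited ranges attributed to Wright [1978]: below 0.05 little, 0.05-0.15 moderate, 0.15-0.25 great, and above 0.25 very great differentiation. Only the 0.05-0.15 class is stated in this chapter; the other class limits and the handling of values that fall exactly on a boundary are assumptions of this sketch.

```python
def classify_fst(fst):
    """Return a qualitative differentiation class for an FST value.

    Class limits follow the commonly cited guideline attributed to Wright (1978).
    Assigning a boundary value to the upper class is an assumption made here.
    """
    if fst < 0.05:
        return "little"
    if fst < 0.15:
        return "moderate"
    if fst < 0.25:
        return "great"
    return "very great"


# Values taken from Table 3 of this chapter.
examples = {
    "overall fixation index": 0.05275,
    "GA5-EI vs GA5-584": 0.00724,
    "PDF584 vs GA5-584": 0.08962,
}
for label, value in examples.items():
    print(f"{label}: FST = {value} -> {classify_fst(value)} differentiation")
```

These classes are descriptive guidelines rather than statistical tests, so they complement, and do not replace, the permutation P-values reported with Table 3.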
The relationship between the five populations is displayed in Figure 1. The background of
cultivar GA-5 may include genotypes tracing to KY31 [Bouton et al., 1993b]. PDF is an
Oklahoma pasture ecotype collected from Carter County in south central Oklahoma. KY31 is
the predominant tall fescue cultivar in the eastern U.S. [Ball et al., 1993] and is likely a seed
source for some of the pastures planted in Oklahoma [Mian et al., 2002]. The relationship
between the populations [Figure 1] suggests that the PDF ecotype has KY31 in its
background.

Conclusion

Heterogeneous individuals characterize perennial grass pastures of outcrossing species
with concomitant reshuffling of alleles from generation to generation. This results in a high
degree of genetic variation within populations. Additionally, plant populations may undergo
genetic differentiation due to natural selection. In breeding programs, complex allelic
changes in gene composition occur that are difficult to follow. However, molecular markers
have made it possible to tag individual genes responsible for producing particular
phenotypes, select plants with low linkage drag, design germplasm conservation strategies,
study population structure and identify cultivars.
Tall fescue and other outcrossing species have high within-population genetic variation.
Our study also shows that despite high within-population variance, the AMOVA could
distinguish between the populations. The preponderance of within-population variation
indicates that breeding strategies should focus on selecting within populations using various
recurrent selection schemes. Therefore, future breeding efforts could concentrate on elite
populations to continue exploiting indigenous variation.
Reselection within GA-5, along with novel endophyte insertion, resulted in little genetic
differentiation among populations based on Wright's index. Likewise, the insertion of novel
endophytes did not seem to cause a genetic shift in GA-5. The derived populations retained
levels of within-population variance similar to original populations. The implication for
breeding is that we can continue to infect elite tall fescue cultivars and elite populations with
novel endophytes without shifting the genetics of the host population. Furthermore, the
derived cultivars may be registered as the original population for plant variety protection
purposes.

References

Abbas, M.S., & Sarita, B.J. (2006). Morphological variation in population of the genus
Lolium [Poaceae] in Iran. International J Botany 2:286-292.
Abernethy, R.H., Steiner, J.J., Wofford, D.S., & Thiel, D.S. (1989). Classification and
pedigree verification of tall fescue cultivars utilizing the prolamine seed protein fraction.
Crop Sci 29:791-797.
Allen, D.R., & Marlow, C.B. (1994). Shoot population dynamics of beaked sedge following
cattle grazing. J Range Manage 47:64-69.
Arachevaleta, M., Bacon, C.W., Hoveland, C.S., & Radcliffe, D.E. (1989). Effect of the tall
fescue endophyte on plant response to environmental stress. Agron J 81:83-90.
Armstead, I.P., Harper, J.A., Turner, L.B., Skøt, L., King, I.P., Humphreys, M.O., Morgan,
W.G., Thomas, H.M., & Roderick, H.W. (2006). Introgression of crown rust (Puccinia
coronata) resistance from meadow fescue (Festuca pratensis) into Italian ryegrass
(Lolium multiflorum): genetic mapping and identification of associated molecular
markers. Plant Pathology 55:62-67.
Ball, D.M., Pedersen, J.F., & Lacefield, G.D. (1993). The tall fescue endophyte. Am Sci
81:370-380.
Ball, D.M., Hoveland, C.S., & Lacefield, G.D. (2002). Fescue toxicity. Pages 198-205 In
Southern Forages: Modern Concepts for Forage Crop Management. 3rd ed. Graphic
Communications Corp., Lawrenceville, GA.
Bolaric, S., Barth, S., Melchinger, A.E., & Posselt, U.K. (2005). Genetic diversity in
European perennial ryegrass cultivars investigated with RAPD markers. Plant Breeding
124: 161-166.
Borrill, M., Tyler, B., & Lloyd-Jones, M. (1971). Studies in Festuca. 1. A chromosome atlas
of Bovinae and Scariosae. Cytologia 36:1-14.
Bouton, J.H., Duncan, R.R., Gates, R.N., Hoveland, C.S., & Wood, D.T. (1997). Registration
of Jesup tall fescue. Crop Sci 37:1011-1012.
Bouton, J.H., Gates, R.N., Belesky, D.P., & Owsley, M. (1993a). Yield and persistence of tall
fescue in the Southeastern Coastal Plain after removal of its endophyte. Agron J 85:52-55.
Bouton, J.H., Gates, R.N., Hill, G.M., Owsley, M., & Wood, D.T. (1993b). Registration of
Georgia 5 tall fescue. Crop Sci 33:1405.
Bouton, J.H., Latch, G.C.M., Hill, N.S., Hoveland, C.S., McCann, M.A., Watson, R.H.,
Hawkins, L.L., & Thompson, F.N. (2002). Re-infection of tall fescue cultivars with
non-ergot alkaloid producing endophytes. Agron J 94:567-574.
Bouton, J., Hill, N., Hoveland, C., McCann, M., Thompson, F., Hawkins, L., & Latch, G.
(2000). Performance of tall fescue cultivars infected with nontoxic endophytes. In Proc.
4th Intern. Neotyphodium/Grass Interactions Symp. V.H. Paul and P.D. Dapprich [ed.]
Soest, Germany, 27-29 Sept. 2000. p. 179-185.
Brummer, E.C., & Bouton, J.H. (1991). Plant traits associated with grazing-tolerant alfalfa.
Agron J 83:996-1000.
Buckner, R.C., Powell, J.B., & Frakes, R.V. (1979). Historical development. p. 1-8. In R.C.
Buckner and L.P. Bush [ed.] Tall fescue. Amer Soc Agron, Madison, WI.
Busti, A., Caceres, M.E., Calderini, O., Arcioni, S., & Pupilli, F. (2004). RFLP markers for
cultivar identification in tall fescue [Festuca arundinacea Schreb.] Genetic Resources
and Crop Evolution 51: 443-448.
Caceres, M.E., Pupilli, F., Piano, E, & Arcioni, S. (2000). RFLP markers are an effective tool
for the identification of creeping bentgrass [Agrostis stolonifera L.] cultivars. Genetic
Resources and Crop Evolution 47:455-459.
Clayton, W.D., & Renvoize, S.A. (1986). Genera Graminum. Grasses of the world. Kew Bull
Addit Ser 13.
Clement, S.L., Elberson, L.R., Youssef, N.N., Davitt, C.M., & Doss, R.P. (2001). Incidence
and diversity of Neotyphodium fungal endophytes in tall fescue from Morocco, Tunisia
and Sardinia. Crop Sci 41:570-576.
Cornish, M.A., Hayward, M.D., & Lawrence, M.J. (1979). Self-incompatibility in ryegrass. I.
Genetic control in diploid Lolium perenne L. Heredity 43:95-106.
Craven, K.D., Clay, K., & Schardl C.L. (2007). Systematics and morphology. In Fribourg
H.A and Hannaway D.B. [ed.] Tall Fescue Online Monograph. Online at
http://forages.oregonstate.edu/is/tfis/book.cfm?PageID=366andchapter=3andsection=0.
Darbyshire, S.J. (1993). Realignment of Festuca subgenus Schedonorus with the genus
Lolium [Poaceae]. Novon 3:239-243.
Dice, L. (1945). Measures of the amount of ecological association between species. Ecology
26:297-302.
Dirlewanger, E., Cosson, P., Tavaud, M., Aranzana, M.J., Poizat, C., Zanetto, A., Arús, P., &
Laigret, F. (2002). Development of microsatellite markers in peach [Prunus persica [L.]
Batsch] and their use in genetic diversity analysis in peach and sweet cherry [Prunus
avium L]. Theor Appl Genet 105:127-138.
Freeman, G.W., Wagg, C.A., & Mileva, M.M. (1996). Identification of turfgrass cultivars
using reverse-phase high-performance liquid chromatography. Seed Sci Technol 24:495-504.
Fu, Y.B. (2003). Applications of bulking in molecular characterization of plant germplasm: a
critical review. Plant Genet Res 1:161-167.
Fu, Y.B., Peterson, G.W., Scoles, G., Rossnagel, B., Schoen, D.J., & Richards, K.W. (2003).
Allelic diversity changes in 96 Canadian oat cultivars released from 1886 to 2001. Crop
Sci 43:1989-1995.
Funk, C.R., Halisky, P.M., & Hurley, R.H. (1983). Implications of endophytic fungi in
breeding for insect resistance. Proc. Forage and Turfgrass Endophyte Workshop,
Corvallis, OR. 3-4 May. p. 67-75.
Gaitán-Solís, E., Duque, M.C., Edwards, K.J., & Tohme, J. (2002). Microsatellite repeats in
common bean [Phaseolus vulgaris]: isolation, characterization and cross-species
amplification in Phaseolus ssp. Crop Sci 42:2128-2136.
Gibson, D.J., & Newman, J.A. (2001). Festuca arundinacea Schreber [F. elatior L. ssp.
arundinacea [Schreber] Hackel]. The Journal of Ecology 89:304-324.
Goldstein, D., & Schlötterer, C. (1999). Microsatellites: Evolution and Applications. Oxford:
Oxford University Press Inc., NY.
Gunter, S.A., & Beck, P.A. (2004). Novel endophyte-infected tall fescue for growing beef
cattle. Anim Sci 82:75-82.
Granath, G., Vicari, M., Bazely, D.R., Ball, J.P., Puentes, A., & Rakocevic, T.I. (2007).
Variation in the abundance of fungal endophytes in fescue grasses along altitudinal and
grazing gradients. Ecography 30:422-430.
Hauck, B., Gay, A.P., Macduff, J., Griffiths, C.M., & Thomas, H. (1997). Leaf senescence in
a non-yellowing mutant of Festuca pratensis: implications of the stay green mutation for
photosynthesis, growth and nitrogen nutrition. Plant Cell and Environment 20:1007-
1018.
Huff, D.R. (1997). RAPD characterization of heterogenous perennial ryegrass cultivars. Crop
Sci 37:783791.
Huff, D.R., Peakall, R., & Smouse, P.E. (1993). RAPD variation within and among natural
populations of outcrossing buffalograss [Buchloe dactyloides [Nutt.] Engelm.]. Theor
Appl Genet 6:927934.
Humphreys, M.W. (1989). The controlled introgression of Festuca arundinacea genes into
Lolium multiflorum. Euphytica 42:105116.
Humphreys, M.W., & Thomas, H. (1993). Improved drought resistance in introgression lines
derived from Lolium multiflorum Festuca arundinacea hybrids. Plant Breed 111:155-
161.
Humphreys, M.W., & Paakinskien, I. (1996). Chromosome painting to locate genes for
drought resistance transferred from Festuca arundinacea into Lolium multiflorum.
Heredity 77:530-534.
Humphreys, M.W., Canter, P., & Thomas, H.M. (2003). Advances in introgression
technologies for precision breeding within the Lolium-Festuca complex. Ann Appl Biol
143:110.
Humphreys, J., Harper, J.A., Armstead, I.P., & Humphreys, M.W. (2005). Introgression-
mapping of genes for drought resistance transferred from Festuca arundinacea var.
glaucescens into Lolium multiflorum. Theor Appl Genet 110:579587.
Jauhar, P.P. (1975). Chromosome relationships between Lolium and Festuca [Gramineae].
Chromosoma 52:103121.
Jauhar, P.P. (1993). Cytogenetics of the Festuca-Lolium complex: Relevance to breeding.
Springer-Verlag, New York. 255 pp.
Kantety, R.V., Rota, M.L., Matthews, D.E., & Sorrells, M.E. (2002). Data mining for simple
sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and
wheat. Plant Mol Biol 48:501510.
Kennedy, P.B. (1900). Cooperative experiments with grasses and forage plants. USDA Bull.
22. USDA, Washington, D.C.
King, I.P., Morgan, W.G., Harper, J.A., & Thomas, H.M. (1999). Introgression mapping in
the grasses. II. Meiotic analysis of the Lolium perenne/Festuca pratensis triploid hybrid.
Heredity 82:107112.
Kirigwi, F.M., Zwonitzer, J.C., Mian, M.A.R., Wang, Z.-Y, & Saha, M.C. (2007).
Microsatellite markers and genetic diversity assessment in Lolium temulentum. Genetic
Resources and Crop Evolution 55:105-114.
Kolliker, R., S.F., Reidy, B., & Nosberger, J. (1998). Genetic variability of forage grass
cultivars: A comparison of Festuca pratensis Huds, Lolium perenne L. and Dactylis
glomerata L. Euphytica 106:261-270.
Kosmala, A., Zwierzykowski, Z., G sior, D., Rapacz, M., Zwierzykowska, E., & Humphreys,
M.W. (2006a). GISH/FISH mapping of genes for freezing tolerance transferred from
Festuca pratensis to Lolium multiflorum. Heredity 96:243251.
Kosmala, A., Zwierzykowska, E., & Zwierzykowski, Z. (2006b). Chromosome pairing in
triploid intergeneric hybrids of Festuca pratensis with Lolium multiflorum, revealed by
GISH. J Appl Genet 47:215220.
Kosmala, A., Zwierzykowski, Z., Zwierzykowska, E., Luczak, M., Rapacz, M., Gasior, D., &
Humphreys, M.W. (2008). Introgression mapping of genes for winter hardiness and frost
tolerance transferred from Festuca arundinacea into Lolium multiflorum. J Heredity
Advance Access published on July 9, 2007, DOI 10.1093/jhered/esm047.
Kuleung, C., Baenziger, P.S., & Dweikat, I. (2004). Transferability of SSR markers among
wheat, rye and triticale. Theor Appl Genet 108:11471150.
Lamp, C.A., Forbes, S.J., & Cade, J.W. (2001). Grasses of temperate Australia - A field
guide. Inkata Press [1st Edition] and C.H. Jerram and Associates Science Publishers.
Latch, G.C.M. (1997). An overview of Neotyphodium-grass interactions. p. 1-11. In C. W.
Bacon and N. S. Hill (ed.) Neotyphodium/Grass Interactions. Plenum Press, NY.
Loos, B.P. (1993a). Allozyme variation within and between populations in Lolium [Poaceae].
Plant Syst Evol 188: 101-113.
Loos, B.P. (1993b). Morphological variations in Lolium [Poaceae] as a measurement of
species relationships. Plant Syst Evol 188: 87-89.
McGrath, S., Hodkinson, T.R., & Barth, S. (2007). Extremely high cytoplasmic diversity in
natural and breeding populations of Lolium [Poaceae]. Heredity 99:531 544.
Mellish, A., Coulman, B., & Ferdinandez, Y. (2002). Genetic relationships among selected
crested wheatgrass cultivars and species determined on the basis of AFLP markers. Crop
Sci 42:1662-1668.
Menz, M.A., Klen, R.R., Mullet, J.E., Obert, J.A., Unruh, N.C., & Klein, P.E. (2002). A high-
density genetic map of Sorghum bicolor (L.) Moench based on 2926 AFLP
, RFLP and
SSR markers. Plant Mol Biol 48:483499.
Menz, M.A., Klein, R.R., Unruh, N.C., Rooney, W.L., Klein, P.E., Mullet, J.E. (2004).
Genetic Diversity of Public Inbreds of Sorghum Determined by Mapped AFLP and SSR
Markers. Crop Sci 44:1236-1244.
Meyer, W.A., & Watkins, E. (2003). Tall fescue (Festuca arundinacea). In: Turfgrass
Biology, Genetics and Cytotaxonomy, Eds. M. Casler and R.Duncan, Jonh Wiley and
Sons, Inc., pp. 107-127.
Mian, M.A.R., Hopkins, A.A., & Zwonitzer, J.C. (2002). Determination of genetic diversity
in tall fescue with AFLP markers. Crop Sci 42: 944-950.
Mian, M.A.R., Saha, M.C., Hopkins, A.A., & Wang, Z. (2005a). Use of tall fescue EST-SSR
markers in phylogenetic analysis of cool-season forage grasses. Genome 48:637647.
Mian, M.A.R., Zwonitzer, J.C., Chen, Y., Saha, M.C., & Hopkins A.A. (2005b). AFLP
diversity within and among hardinggrass populations. Crop Sci 45:2591-2597.
Morgan, W.G., King, I.P., Koch, S., Harper, J.A., & Thomas, H.M. (2001). Introgression of
chromosomes of Festuca arundinacea var. glaucescens into Lolium multiflorum revealed
by genomic in situ hybridization [GISH]. Theor Appl Genet 103:696701.
Moutray, J.B. Jr., & Frakes, R.V. (1973). Effects of genetic diversity on heterosis in tall
fescue. Crop Sci 13:1-4.
Naganowska, B., Zwierzykowski, Z., & Zwierzykowska, E. (2001). Meiosis and fertility of
reciprocal hybrids of Lolium multiflorum with F. pratensis. J Appl Genet 42:247255.
Narasimhamoorthy, B., Saha, M.C., Swaller, T., & Bouton, J.H. (2008). Genetic diversity and
population differentiation in switchgrass collections assessed by microsatellite markers.
Bio-energy Res 1:136-146.
Nihsen, M.E., Piper, E.L., West, C.P., Crawford, R.J. Jr., Denard, T.M., Johnson, Z.B.,
Roberts, C.A., Spiers, D.A., & Rosenkrans, Jr. C.R. (2004). Growth rate and physiology
of steers grazing tall fescue inoculated with novel endophytes. J Anim Sci 82:878-883.
Peakall, R., Gilmore, S., Keys, W., Morgante, M., & Rafalski, A. (1998). Cross species
amplification of soybean [Glycine max] simple sequence repeat [SSRs] within the genus
and other legume genera: Implication for transferability of SSRs in plants. Mol Biol Evol
15:12751287.
Pejic, I., Ajmore-Marsan, P., Morgante, M., Kozumplick, V., Castiglioni, P., Taramino, G. &
Motto, M. (1998). Comparative analysis of genetic similarity among maize inbred lines
detected by RFLPs, RAPDs, SSRs and AFLPs. Theor Appl Genet 97:1248-1255.
Powell, W., Morgante, M., Andre, C., Hanafey, M., Vogel, J., Tingey, S., & Rafalski, A.
(1996). The comparison of RFLP, RAPD, AFLP and SSR (microsatellite) markers for
germplasm analysis. Mol Breed 2:225-238.
Prestridge, R.A., Pottinger, R.P. & Barker, G.M. (1982). An association of Lolium endophyte
with ryegrass resistance to Argentine stem weevil. Proc. N.Z. Weed Pest Conf. 35:119-
122.
Richardson, M.D., Hill, N.S., & Hoveland, C.S. (1990). Rooting patterns of endophyte-
infected tall fescue grown under drought stress. Agron. Abst., ASA, Madison, WI. p. 129.
Roa, A.C., Chavarriaga-Aguirre, P., Duque, M.C., Maya, M.M., Bonierbale, M.W., Iglesias,
C., & Tohme, J. (2000). Cross-species amplification of cassava [Manihot esculenta]
[Euphorbiaceae] microsatellites: allelic polymorphism and degree of relationship. Am J
Bot 87:16471655.
Roldan-Ruiz, I., Dendauw, J., Van Bockstaele, E., Depicker, A., & De Loose, M. (2000).
AFLP markers reveal high polymorphic rates in ryegrasses [Lolium spp.]. Mol Breed
6:125134.
Saha, M.C., Cooper, J.D., Mian, M.A.R., Chekhovskiy, K., & May, G.D. (2006). Tall fescue
genomic SSR markers: development and transferability across multiple grass species.
Theor Appl Genet 113:1432-2242.
Saha, M.C., Mian, M.A.R., Eujayl, I., Zwonitzer, J.C., Wang, L., & May G.D. (2004). Tall
fescue EST-SSR markers with transferability across several grass species. Theor Appl
Genet 109:783791.
Seal, A.G. (1983). DNA variation in Festuca. Heredity 50:225-236.
Scholz, S., & Scholz, H. (2005). A new species of Lolium [Gramineae] from Fuerteventura
and Lanzarote [Canary Islands, Spain]. Willdenowia 35:281-286.
Schuelke, M. (2000). An economic method for the fluorescent labeling of PCR fragment. A
poor mans approach to genotyping for research and high-throughput diagnostics. Nat
Biotechnol 18:223-33.
Shelby, R.A., & Dalrymple, L.W. (1993). Long-term changes of endophyte infection in tall
fescue stands. Grass and Forage Sci 48:356-361.
Singh, S.N., Singh, R.P., & Pandey, D.D. (1995). Effect of grazing on biomass, productivity
and nutritive value of bermudagrass. Environ Ecol 13:147-150.
Sleper, D.A., & Buckner, R.C. (1995). The fescues. In. Barnes R.F., Miller D.A., Nelson C.J.,
Heath M.E. [ed]. Forages p. 345356. Iowa State University Press, Ames, Iowa, USA.
Sleper, D.A., & West, C.P. (1996.) Tall fescue. In Moser L.E., Buxton D.R., Casler M.D.
[ed.]. Cool-season forage grasses. p. 471502. ASA-CSSA-SSSA, Madison, Wisconsin,
USA.
Sleper, D.A., Maryland, H.F., Crawford, R.J.Jr., Shewmaker, G.E., & Massie, M.D. (2002).
Registration of HiMag tall fescue germplasm. Crop Sci 42:318319.
Soreng, R.J., Terrell, E.E., Wiersema, J., & Darbyshire, S.J. (2001). Proposal to conserve the
name Schedonorus arundinaceus [Schreb.] Dumort. against Schedonorus arundinaceus
Roem. and Schult. [Poaceae: Poeae]. Taxon 50: 915-917.
Staub, J.E., Danin-Poleg, Y., Fazio, G., Horejsi, T., Reis, N., & Katsir, N. (2000).
Comparative analysis of cultivated melon groups [Cucumis melo L.] using random
amplified polymorphic DNA and simple sequence repeat markers. Euphytica 115:225
241.
Stuedemann, J.A., & Hoveland, C.S. (1988). Fescue endophyte: history and impact on animal
agriculture. J Prod Agric 1:39-44.
Sun, G.L, Diaz, O., Salomon, B, & von Bothmer, R. (1999). Genetic diversity in Elymus
caninus as revealed by isozyme, RAPD and microsatellite markers. Genome 42:420-
431.
Terrell, E.E. (1968). A taxonomic revision of the genus Lolium. Tech Bull US Dept Agric
1392.
Thiel, T., Michalek, W., Varshney, R.K., & Graner, A. (2003). Exploiting EST databases for
the development and characterization of gene-derived SSR-markers in barley [Hordeum
vulgare L.]. Theor Appl Genet 106:411422.
Thomas, H., & Humphreys, M.O. (1991). Progress and potential of interspecific hybrids of
Lolium and Festuca. J Agric Sci 117:18.
Thomas, H., Morgan, W.G., Thomas, A.M., & Ougham, H.J. (1999). Expression of the stay-
green character introgressed into Lolium temulentum Ceres from a senescence mutant of
Festuca pratensis. Theor Appl Genet 99:92-99.
Thomas, H., & Stoddart, L. (1975). Separation of chlorophyll degradation from other
senescence processes in leaves of a mutant genotype of meadow fescue (Festuca
pratensis L.). Plant Physiol 56:438-41.
Thomas, H.M., Morgan, W.G., & Humphreys, M.W. (2003). Designing grasses with a future
combining the attributes of Lolium and Festuca. Euphytica 133:19-26.
Ubi, E., Kolliker, R., Fujimori, M., & Komatsu, T. (2003). Genetic diversity in diploid
cultivars of rhodesgrass determined on the basis of amplified fragment length
polymorphism markers. Crop Sci 43:1516-1522.
van Hintum, T.J.L. (1999). The general methodology for creating a core collection. In:
Johnson R.D., Hodgkin T. [eds] Core collections for today and tomorrow. International
Plant Genetic Resources Institute, Rome, pp 1017.
Vaylay, R., & van Santen, E. (1999). Grazing induces a patterned selection response in tall
fescue. Crop Sci 39:4451.
Vaylay, R., & van Santen, E. (2002). Application of canonical discriminant analysis for the
assessment of genetic variation in tall fescue. Crop Sci 42:534-539.
Veronesi, F., & Falcinelli, M. (1988). Evaluation of an Italian germplasm collection of
Festuca arundinacea Schreb. through a multivariate analysis. Euphytica 38: 211220.
Weiguo, L., Guertal, E.A., & van Santen, E. (1999). Population differentiation, spatial
variation and sampling of tall fescue under grazing. Agron J 91:801-806.
West, C.P., Izekor, E., Turner, K.E., & Elmi, A.A. (1993). Endophyte effects on growth and
persistence of tall fescue along a water supply gradient Agron J 85: 264-270.
Wright, S. (1978). Evolution and the genetics of populations Vol 4. Variability within and
among natural populatlons. Chicago: University of Chicago Press.
Xu, W.W., Sleper, D.A., & Krause, G.F. (1994). Genetic diversity of tall fescue germplasm
based on RFLPs. Crop Sci 34:246252.
Yamada, T. (2001). Introduction of a self-compatible gene of Lolium temulentum L. to
perennial ryegrass (Lolium perenne L.) for the purpose of the production of inbred lines
of perennial ryegrass. Euphytica 122:213-217.
Yu, J.-K., La Rota, C.M., Kantety, R.V., & Sorrells, M.E. (2004). EST-derived SSR markers
for comparative mapping in wheat and rice. Mol Gen Genet 271:742751.
Zwierzykowski, Z., Lukaszewski, A.J., Naganowska, B., & Lesniewska, A. (1999). The
pattern of homoeologous recombination in triploid hybrids of Lolium multiflorum with
Festuca pratensis. Genome 42:720-726.

Chapter 6

Genetic Diversity of the Population of
Russia: Gene Pool and Genegeography

Sergei Rychkov, Oksana Naumova, Alexei Evsyukov,
Irina Morozova, Yuri Shneider and Olga Zhukova
Lab of Human Genetics, Vavilov Institute of General Genetics,
Russian Acad. of Sci., Russia

Abstract

Genetic differentiation of the population of Russia is investigated. The work is based
on data on the polymorphism of immuno-biochemical and molecular markers in about 1,500
populations from 62 ethnoses belonging to six main linguistic families and having
different cultural traditions. Genetic diversity is studied by cartographic and statistical
methods and is presented in the form of genegeographical maps. The position of the
Russian gene pool against the Eurasian background is described. The genetic relief of Russia
is investigated, and the main structural components of the gene pool are revealed. Analysis of
these components from the ethno-historical point of view revealed their connection with
different Eurasian regions (West and Central Europe, Central and East Asia).

Introduction

Human Gene Pool. Definition and Description

We stand on the shore of a vast sea. Thousands and thousands of valuable or harmful
substances, genes, are dissolved in it... The sea is heavy. Every moment, noiselessly,
mutations burst out in it, presenting us with new treasures or polluting the sea with new
poisons. Slowly, the genes disperse, covering increasingly large areas. Multicolored,
sparkling streams mix and turn, giving rise to novel gene combinations that are yet unknown
to humanity... The name of this sea is gene pool. [Serebrovsky 1928, p. 22]
This poetic description of the gene pool belongs to the outstanding Russian geneticist
Alexander S. Serebrovsky, who introduced the notion of the gene pool and defined it
as the total set of genes in a population. The main aims of investigating a population's gene
pool, as stated by Serebrovsky, included listing the genes of the particular gene pool and
assessing the frequencies of all alleles of these genes, the mode of combination of the genes in
the population, the distribution of the alleles over the area or the elements of the population,
and the general structure of the population (its homogeneity or ongoing admixture of several
populations). These aims coincide with the goals of population genetics, the science
that studies population gene pools.
The term "gene pool" is universal. It is therefore somewhat vague and requires definition in
each particular case. On the one hand, with the advance of genetics, the very concept of the gene
has been rapidly changing and becoming more complex, which changes and complicates the
notion of the gene pool. Today, when speaking of polymorphism, we mean not only structural genes
but also diverse DNA fragments differing in size, structure, and function. On the other hand,
the term "population", as applied to humans, is particularly vague. In contrast to populations of
any other species, human populations have been formed not only by natural history but, to an
even greater extent, by social history. Human populations are characterized by self-identification
and self-awareness. This is what determines the boundaries of the population as
a relatively independently evolving group of individuals.
Thus, the term "gene pool" applied to humans has the following meaning. "The gene pool of a
human population is a geographically defined and historically arranged set of genes (and any
replicating parts of DNA), retained in the boundaries of its area by self-awareness of the
population, reproduced in generations, and maintained by systemic and stochastic evolutionary
factors in dynamic equilibrium with the state of the environment, in which the population
exists and which is changed by it" [Gene Pool 2003, p. 9].
The hierarchy of human populations, which is formed primarily by social factors, is
tremendously complex. Worldwide, the most universal human population structure is
hierarchical: local population (village, town, etc.), ethnic group, and state. Within these
subdivisions of the population structure, the self-awareness of individuals and their groups
is most pronounced. Each of the subdivisions has a strictly defined area, which may be the
boundaries of the local population, a historically formed ethnic territory, or state
boundaries. In this study, we consider the gene pool of the Russian population at exactly
these structural levels, with special reference to the ethnic division, which is determined
by ethnic self-awareness and is clearly formed historically. This approach seems most
reasonable for studying the population of Russia, a vast and multiethnic country. For
centuries, the many native ethnic groups of Russia have shared a common history.

The Population of Russia. Ethnoses, Races, Languages

To date, there are over 100 ethnic groups living in Russia, whose population, according to
the census of 2002, is 145,167,000. Most of these groups (93%) are native peoples, for which
Russia is the main or even the only place of residence. Generally, these populations are
restricted to their historical ethnic areas. As of the 1990s, only 20% of the population of
Russia resided outside of their historic territories. Figure 1 depicts ethnic areas of the
autochthonous peoples of Russia, their historically formed territories, and places of current
residence.

Ethnoses:
1 Abazians
2 Adygeys
3 Aleuts
4 Altaians (Northern)
5 Altaians (Southern)
6 Balkars
7 Bashkirs
8 Buryats
9 Veps
10 Dagestani
11 Dolgans
12 Ingushs
13 Itelmens
14 Kabardians
15 Kalmyks
16 Karagashs
17 Karachays
18 Karelians
19 Kets
20 Komi-zyryans
21 Komi-permyaks
22 Koryaks
23 Kumyks
24 Mansi
25 Mari
26 Mordvinians
27 Nanai
28 Nganasans
29 Negidals
30 Nenets (Forest)
31 Nenets (Tundra)
32 Nivkhs
33 Nogays
34 Oroks
35 Oroshs
36 Ossetians
37 Russians
38 Sami
39 Selkups
40 Tatars
41 Tatars (Siberian)
42 Todzhinians
43 Tofalars
44 Tuvinians
45 Udmurts
46 Udegeys
47 Ulchis
48 Khakassians
49 Khants
50 Circassians
51 Chechens
52 Chuvashians
53 Chukchi
54 Chulyms
55 Shors
56 Evenks (Eastern)
57 Evenks (Western)
58 Evens
59 Enets
60 Eskimos
61 Yukaghirs
62 Yakuts
Figure 1. Population of Russia. Ethnic areas.
In anthropological terms, the population of Russia consists of members of two major
human races, Mongoloid and Caucasoid, and two transitional races, Uralic and South Siberian.
The main areas of the latter two races are reflected in their names: the Urals with
neighboring territories and South Siberia, respectively. Members of the Caucasoid race
reside predominantly in the European part of Russia and the Caucasus, while the Mongoloid race
includes the native peoples of Siberia, mostly of its eastern regions.
The peoples of Russia speak more than a hundred different languages assigned to six
linguistic families [Ethnoses 1988]. The areas occupied by these language families and
large linguistic groups are presented in Figure 2. The Indo-European language family in
Russia is represented by the Slavic (Russians) and Iranian (Ossetians) language groups. As
suggested by its name, the members of the North Caucasian language family inhabit the North
Caucasus. The peoples speaking languages of the Uralic family (Finno-Ugric and Samoyedic
languages) reside mainly in the Urals region and West Siberia. The extreme northeastern
region of Russia is inhabited by peoples of the Chukotka-Kamchatka and Eskimo-Aleutian
language families. The vast lands of East and South Siberia are populated by the peoples of
the Altaic family, which comprises the Tunguso-Manchurian, Mongolic, and Turkic linguistic
groups. Some peoples of Russia (Kets, Nivkhs and Yukaghirs) speak languages that do not
belong to any of these families and are classified as isolated languages.
Above, we described the population of Russia in terms of two generally accepted
classifications, racial and linguistic. They are based on different traits and differ in
nature. Racial groups have a primarily biological basis, whereas linguistic
classification rests on social characteristics and is directly related to ethnic
classification, as language is one of the key characteristics of an ethnos.

Figure 2. Population of Russia. Language families and groups.
The racial and language diversity of the population is a result of its natural and social
histories. As the gene pools of the populations develop under the impact of the same
processes, we expect if not coincidence, then at least parallel patterns in the linguistic, racial,
and genetic diversity of the population.

Genegeography. The History of the Idea

The population, as the carrier of the gene pool, has two basic properties: reproduction over
generations and an area. Consequently, these properties also characterize the gene pool as a
set of genes occurring in a particular territory. Thus, the gene pool is primarily a territorial
set of genes, a system in space that has a certain size, complexity, and differentiation.
Geographic space is among the main factors determining the gene pool structure. First of all,
space shapes gene flows, both directly (by determining the direction of migration upon
colonization of new lands) and indirectly (by determining different cultural and economic
aspects of the population's life, and thus cultural boundaries or, conversely, cultural and
genetic relationships). Studying the population gene pool and its development requires that
the gene pool be considered together with the corresponding geographic area. This line of
research in population genetics was termed genegeography.
The concept of genegeography (together with the concept of the gene pool) was introduced
in the 1920s to define a new field of research at the intersection of genetics, geography,
evolution and history [Serebrovsky 1928]. The aim of genegeography was to investigate the
processes that occur in gene pools: gene dispersal from the centers of origin, gene flows
caused by migration of populations, introduction of new genes into populations via mutation
and admixture, random processes of gene pool change, etc. The problems of genegeography
were stated and analyzed in the late 1920s with reference to farm animals [Serebrovsky
1928], and examined in relation to the centers of origin and genetic diversity of cultivated
plants [Vavilov 1927; 1992]. Later, these studies became relevant to the
human gene pool.
The concepts of genegeography and gene pool long remained a poorly studied
scientific heritage. The association of gene pool studies with geography was
lost. Gene pool studies used the statistical approach of population genetics, which, in
contrast to the genegeographic one, does not make it possible to study the genetic properties
of the population together with the geographic area it inhabits. Population genetics is
based entirely on probabilistic statistical methods of analysis; population geneticists examine
populations by removing them from geographic space and transferring them into the space
of genetic parameters. By contrast, in genegeography, researchers examine the actual (or close
to actual) distribution of genes over the territory of populations. The original genetic
information becomes a subject of geographic investigation. Thus, genegeography extends the
possibilities of studying population genetic processes by taking into account the space where
these processes occur. Geographic investigation of objects implies primarily their mapping.
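To make the contrast concrete, the following minimal sketch shows what "transferring populations into the space of genetic parameters" can look like in practice. It is an illustration only: the chapter does not name a particular statistic at this point, so Wright's F_ST for a single biallelic locus is used here as a familiar example, and the function names and the allele frequencies in the usage note are hypothetical. Note that the result depends only on allele frequencies (and optional population sizes); the geographic coordinates of the populations play no role, which is exactly the information genegeography sets out to retain.

```python
def fst_biallelic(freqs, sizes=None):
    """Wright's F_ST for one biallelic locus.

    freqs -- allele frequency of one allele in each subpopulation
    sizes -- optional subpopulation sizes used as weights (equal weights if None)
    """
    n = len(freqs)
    if sizes is None:
        weights = [1.0 / n] * n
    else:
        total = float(sum(sizes))
        weights = [s / total for s in sizes]

    # Mean allele frequency in the pooled (total) population.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    # Expected heterozygosity of the total population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # locus is monomorphic everywhere: no differentiation to measure
    # Mean expected heterozygosity within subpopulations.
    h_within = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))
    return (h_total - h_within) / h_total


def mean_fst(loci):
    """Average per-locus F_ST over a sample of loci (a simple unweighted mean)."""
    values = [fst_biallelic(freqs) for freqs in loci]
    return sum(values) / len(values)
```

For example, `fst_biallelic([0.2, 0.8])` describes two equally weighted populations with strongly different frequencies, while `fst_biallelic([0.5, 0.5])` describes two populations with no differentiation at that locus. Where the two populations live does not enter the calculation at all.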
The revival of genegeography in this sense (as genetic map-making) began in the 1970s and
1980s, when genegeographic notions were first combined with mapping
methods [Mourant et al. 1976; Rychkov, Sheremetyeva 1977; Menozzi et al. 1978; Piazza et
al. 1981a,b; 1983]. Genegeographic pictures of the world population gene pool were
presented in the treatise The History and Geography of Human Genes [Cavalli-Sforza et al.
1994]. In this work, the authors presented basic plans of genetic differentiation of the
populations of continents, showed parallels with their racial and linguistic differentiations,
and proposed an explanation of the genetic diversity pattern in terms of global historic
events. Thus, it was convincingly demonstrated that the gene pool bears the imprint of
history, and that genegeography makes it possible to read the genetic chronicle of historical
processes.
In the above study and in other surveys of genetic polymorphism of the world
population, the gene pool of Russia remained a gap in the general picture, although data
on genetic diversity in the Russian populations had been rapidly accumulating. This gap was
largely filled by a summary of the data on gene distribution in the Russian population and
a genegeographic atlas of the population of Russia and adjacent countries published by our
laboratory [Gene Pool 2000; 2003]¹. The present work is mostly based on the ideas of this
publication. Its aim is to describe the genetic diversity of the Russian population and other
peoples inhabiting the Eurasian continent. Two scales of genetic mapping, Russian and
Eurasian, are presented. The former represents the gene pool structure of the population of
Russia, its specific features and patterns of geographic variation. The latter permits
estimating the scale of genetic variation of the population of Russia in Eurasia, assessing the
integration of the genetic relief of this population into the general Eurasian genetic landscape
and the extent to which interaction between the peoples of Russia and those
from other Eurasian regions is manifested.

Genegeography of the Population of Russia

Genes and Markers for Studying Gene Pool

Recent studies conducted in the framework of the Human Genome Project have shed light on
the diversity of genes and other structural elements of the genome². To date, it is impossible
to examine all the genes in the gene pool; for many of them, the distribution in populations
is still unknown. However, one can approach an understanding of the gene pool structure by
examining a sample of genes from the total population gene pool. This sample should meet
the following requirements.
First, the sample should be random. The specific function of a gene is associated
with the mode and character of selection acting on it, and thus with the distribution of its
frequency in populations. If the sample is random with regard to gene function and mode
of selection, then averaging the statistical properties of such genes reveals the most general
properties of the population, related to such microevolutionary factors as genetic drift,
migration and mutation.
Second, the sample should include primarily polymorphic genes, because these genes are
of key importance for characterizing the gene pool studied. Their polymorphisms underlie the
hereditary diversity, which should be revealed and assessed in the population.
Third, the sample should include the maximum number of genes that have been studied in most
detail in human populations. These primarily include genes determining biochemical
(proteins and enzymes of blood serum and erythrocytes, biologically active substances of
various tissues), immunological (blood group systems, immunoglobulins and tissue
incompatibility), and physiological (perception of taste and color) polymorphism in humans.
The history of investigation of these polymorphic systems is about a century long³.
Thoroughly studied genetic systems in humans also include mitochondrial DNA (mtDNA),
which for the last two decades has been a helpful and attractive tool of human population
genetics.
The description of the gene pool of the Russian population that we present here relies on
the sampling principles listed above. This description is based on data on
the distribution of 101 alleles of 37 gene loci and 19 mtDNA haplogroups (Table 1). At
first glance, the total number of genes examined is not high. Can this gene set adequately
characterize the gene pool and reflect the genetic properties of the populations? A reliable
criterion of the adequacy of a gene sample is provided by non-genetic, demographic data on
the migration, size and composition of the populations examined. A demographic prediction of
differentiation in the populations of Russia based on such data [Rychkov 1984; 1986] proved
to fit the level of differentiation expected from the genetic data.

¹ This work was published under the editorship of Prof. Yuri G. Rychkov, who in the 1970s-1990s developed
the ideas of gene pool and genegeography in relation to human population genetics, as well as the theoretical
and methodological foundations of genegeography of the USSR population.
² The size of the whole human genome is 3 billion base pairs, with its polymorphic part constituting less than 1%.
The human genome contains 20,000-25,000 protein-coding genes, which account for about 2% of the total
genome. Noncoding sequences, which may perform structural and regulatory functions, constitute the
remaining 98% [www.ornl.gov/sci/techresources/Human_Genome/home.shtml].
³ Historically, the earliest discovered system of human genetic markers is thought to be the AB0 blood group
system, described at the beginning of the 20th century [Landsteiner 1901]. Population studies
of human polymorphisms date back to World War I, when differences in AB0 blood group
frequencies were reported on the Macedonian front, where many different peoples and races were gathered.

Geography of Single Genetic Markers

Genetic cartography, like any other thematic cartography, consists of two parts. One
part pertains to geography and includes traditional geographic data: the location of the
objects and their presentation on the map. The other part is related to the theme, in our case
population genetics, and contains population parameters (gene frequencies in the
populations).
Thanks to the efforts of several generations of researchers, a vast store of knowledge has
accumulated on the frequencies of various genetic markers and the genetic diversity of
populations from many world regions. For Russia, we have information on gene frequencies in
about 1,500 populations of 62 ethnic groups⁴. These populations, with their precise geographic
locations, are presented in Figure 3. It is these populations, each with a precise
geographic address and characterized by certain gene frequencies, that are used as data points
for building genegeographic maps. The irregular distribution of the data points over the
territory of Russia is explained primarily by the different population densities of Russian
regions. For example, the average population density in the European part of Russia is 36.5
individuals per square kilometer, while in Siberia it is about 5 individuals per square kilometer.

⁴ The source of data in this study is the Genofond Database (1996-2008), developed in our laboratory (Lab. of
Human Genetics, Vavilov Institute of General Genetics, Russ. Acad. Sci.). It contains information on allele
frequencies at polymorphic loci of immunogenetic, biochemical, and physiological genetic systems and
polymorphic mtDNA markers in human populations. These data were taken from literature (beginning from

the late 19th century), materials of research centers, and reports on frequencies of genetic markers in the world
populations [Mourant et al. 1976; Tills et al. 1983; Nei, Roychoudhury 1988; Cavalli-Sforza et al. 1994].

Table 1. Genes and markers for studying the gene pool of Russia

Polymorphic Systems                     Loci                    Alleles \ Haplotypes

Systems of antigenic polymorphism
Secretion of ABH antigens               FUT2                    Se, se
Blood groups:
  AB0                                   AB0                     0, A, A1, A2, B
  Diego                                 DI                      Di^a, Di^b
  Duffy                                 FY                      Fy^a, Fy^b
  Kell                                  KEL-K                   K, k
                                        KEL-Kp                  Kp^a, Kp^b
  Kidd                                  JK                      Jk^a, Jk^b
  Lewis                                 FUT3                    Le, le
  Lutheran                              LU                      Lu^a, Lu^b
  MN and Ss                             GYPA, GYPB              M, N, S, s \ MS, Ms, NS, Ns
  P                                     P1                      P1, P2
  Rhesus                                RHD, RHCE               C, c, D, d, E, e \ CDE, CDe, cDE, cDe, Cde, cdE, CdE, cde
Immunoglobulins:
  Gm                                    IGHG1, IGHG2, IGHG3     \ a;g, a,x;g, a;b^01345, a;b^035,s,t, a,f;b^01345, f;b^01345
  Km                                    IGKC                    Km^1, Km^1,2, Km^3
Human leukocyte antigens                HLA-A                   A1, A2, A3, A9, A10, A11, A28, Aw19
                                        HLA-B                   B5, B7, B8, B12, B13, B14, B16, B17, B18, B21, B22, B27, B35, B40
                                        HLA-C                   Cw1, Cw2, Cw3, Cw4, Cw5

Blood serum proteins
Alpha-1-antitrypsin                     PI                      Pi^M1, Pi^M2, Pi^M3, Pi^S, Pi^Z
C3 component of complement              C3                      C3^F, C3^S
Group-specific component                GC                      Gc^1, Gc^2
Haptoglobin                             HP                      Hp^1, Hp^2
Serum cholinesterase                    CHE2                    C^5+, C^5-
Transferrin                             TF                      Tf^C, Tf^B0-1, Tf^Dchi

Erythrocytic blood enzymes
Acid phosphatase                        ACP1                    p^a, p^b, p^c
Adenylate kinase 1                      AK1                     AK^1, AK^2
Esterase D                              ESD                     ESD^1, ESD^2
Glucose-6-phosphate dehydrogenase       G6PD                    Gd^+, Gd^-
Glyoxalase 1                            GLO1                    GLO^1, GLO^2
Phosphoglucomutase 1                    PGM1                    PGM_1^1, PGM_1^2
6-Phosphogluconate dehydrogenase        PGD                     PGD^A, PGD^C

Physiological genetic polymorphism
Cerumen types                           Cer                     W, d
Color blindness                         CB                      CB+, CB-
Phenylthiocarbamide taste perception    PTC                     T^1, T^2, t

Mitochondrial DNA haplogroups                                   A, B, C, D, F, G, H, HV, I, J, K, M, T, U, V, W, X, Y, Z

The number of data points for Russia is 1224, and for Eurasia is 4416.
Figure 3. Location of data points for the genegeographical study of Russia.
Arranging the primary data at the population level and mapping them makes it possible
to present discrete distributions of genetic characters. The advantage of the
mapping approach in genetics is fully manifested in the transition from discrete to
continuous distributions. In this case, genegeographic maps cease to be illustrations of
numerical genetic data and become precisely calculated cartographic models that reflect
regularities in the spatial gene distributions. This transition is based on the interpolation
procedure.
In constructing the maps, we used weighted average interpolation [Serbenyuk et al. 1990].
In essence, the interpolated value is directly related to the values at the data points and
inversely related to the distances from them, so the method rests on one of the major models
of interpopulation interaction, isolation by distance [Wright 1943].
The interpolation procedure was carried out under the assumption of homogeneity of the space
separating the populations, because the anisotropy level is constant neither spatially nor
temporally. For instance, certain water bodies at some points in history could serve as a
route connecting different populations, while at other times the same water bodies would
become insurmountable barriers to gene flow. Another illustrative example of changes in
space with regard to its gene permeability pertains to geographic transformations of the
continent that have occurred since its colonization by humans. Thus, a large part of land,
Beringia, which formerly connected Asia and North America, has disappeared under water.
Interpolation makes it possible to detect, in geographical gene distributions, the
barriers that prevented gene flow and the pathways along which these flows were directed.
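
The weighted-average principle described above can be illustrated with a minimal sketch. It is not the procedure of Serbenyuk et al. [1990], whose weighting function is not given here; it assumes inverse-distance weights with an exponent of 2, planar coordinates, and a homogeneous space. The function name and the data points are hypothetical.

```python
import math

def idw_interpolate(points, x, y, power=2.0):
    """Weighted-average estimate of a gene frequency at map node (x, y).

    points: list of (x_i, y_i, frequency_i) data points in planar coordinates.
    power:  exponent of the distance decay (2.0 is an assumed default).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, freq in points:
        dist = math.hypot(x - px, y - py)
        if dist == 0.0:
            return freq  # the node coincides with a data point
        weight = 1.0 / dist ** power  # closer populations contribute more
        weighted_sum += weight * freq
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical data points: (x, y, allele frequency)
data_points = [(0.0, 0.0, 0.40), (10.0, 0.0, 0.10), (0.0, 10.0, 0.30)]

# Interpolated values on a coarse 3 x 3 grid of map nodes
grid = [[idw_interpolate(data_points, gx, gy) for gx in (0, 5, 10)]
        for gy in (0, 5, 10)]
midpoint = idw_interpolate(data_points, 5.0, 0.0)
```

Every interpolated value is a weighted mean of the observed frequencies, so it always lies between the smallest and the largest frequency among the data points.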
A model of the gene landscape constructed by interpolation is rendered on the plane as an
isoline map, which contains the main parameters of the gene distribution over the mapped area
and is easily readable. The map reports such parameters as the mean gene frequency ($M$), the
range of its variation (min, max), and the frequency variance ($S^2$). The frequency intervals
are marked by the intensity of color, which changes from darker to lighter shades with decreasing
frequency. The map legends present the scale of these intervals and the distribution of the
areas of the mapped territory and of the reference data over the corresponding intervals
of gene frequencies. These distributions show the size of the territories with the given gene
frequencies (high or low). The shape of the isolines on the map is also significant: the bends and
bulges of the curves reflect the dynamics and direction of the geographic dispersal of the
gene.
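
As a rough illustration of the legend parameters listed above, the following sketch computes the mean, range, and variance of interpolated frequencies and the share of map nodes falling into each frequency interval. It assumes an equal-area grid, so that the share of nodes stands in for the share of territory, and a population (not sample) variance; the function name, the interval edges, and the values are hypothetical.

```python
def map_summary(grid_values, interval_edges):
    """Legend statistics for a list of interpolated gene frequencies.

    grid_values:    frequencies at the map nodes (equal-area nodes assumed).
    interval_edges: increasing edges of the frequency intervals of the legend.
    """
    n = len(grid_values)
    mean = sum(grid_values) / n
    variance = sum((v - mean) ** 2 for v in grid_values) / n
    shares = []
    last = len(interval_edges) - 2
    for k, (lo, hi) in enumerate(zip(interval_edges[:-1], interval_edges[1:])):
        # Intervals are half-open, except the last one, which includes its upper edge.
        inside = sum(1 for v in grid_values
                     if lo <= v < hi or (k == last and v == hi))
        shares.append(inside / n)
    return {"M": mean, "min": min(grid_values), "max": max(grid_values),
            "S2": variance, "area_shares": shares}

# Hypothetical interpolated frequencies and legend intervals
values = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40]
summary = map_summary(values, [0.0, 0.1, 0.2, 0.3, 0.5])
```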
The collection developed includes maps of the distribution of over a hundred genetic
characters on the territory of Russia (Table 1). Each of them has its own specific features of
distribution over the gene pool area, its specific landscape, and its direction of variation.
In parallel, we have developed maps of the distribution of the genetic markers in Eurasia.
Their aim is to present the scale of variation in genetic marker frequency in the Russian
population and to show to what extent the landscapes of the markers on the territory of
Russia correspond to the Eurasian gene distributions. In this paper, we cannot present the
entire atlas of genetic maps for reasons of space; it is available at
http://humgenlab.vigg.ru/russiamap/default.htm. Here, we show two arbitrarily chosen maps
demonstrating markedly different gene frequency landscapes. These are a map of the
distribution of the d allele of locus RHD, which in the homozygous state determines the
Rhesus-negative phenotype, and a map of the mtDNA haplogroup C distribution.
The RH-d allele in Russia exhibits clinal variation of gene frequencies (Figure 4a). Its
frequency decreases from west to east (correlation with longitude $r_{long} = 0.84$, $P = 0.99$),
from the maximum frequencies in the population of European Russia through
intermediate frequencies in the Urals region and West Siberia to extremely low, nearly zero
frequencies in East Siberia.
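
A longitudinal cline of this kind can be quantified by the correlation between allele frequency and longitude. The sketch below computes a plain Pearson coefficient; the longitudes and frequencies are invented for illustration and are not taken from the Genofond Database. For a frequency that falls eastwards the coefficient is negative, so the positive values quoted in the text are presumably magnitudes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical populations ordered from west to east
longitudes = [30.0, 45.0, 60.0, 90.0, 120.0, 150.0]   # degrees east
rh_d_freq = [0.40, 0.38, 0.33, 0.20, 0.08, 0.02]      # allele frequency

r_long = pearson_r(longitudes, rh_d_freq)
cline_strength = abs(r_long)  # magnitude of the longitudinal trend
```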


Figure 4. Distribution of RH-d allele in Russia (a) and Eurasia (b).

Figure 5. Distribution of mtDNA haplogroup C in Russia (a) and Eurasia (b).
The mtDNA haplogroup C shows quite a different type of distribution (Figure 5a). Its
frequency drops in orthogonal directions away from the focus of high frequencies in East
Siberia. The maps of Eurasia show the same features of the gene landscapes. The longitudinal
cline of the RH-d frequencies becomes even steeper in Eurasia ($r_{long} = 0.92$, $P = 0.99$)
(Figure 4b); the general character of the haplogroup C frequency distribution is also
preserved, with the area of highest frequencies in East Siberia (Figure 5b).
Is this correspondence of the gene landscapes of Russia and Eurasia a general rule for the
whole set of markers? This question is considered in the section on the geography of the gene
pool. The distribution maps for individual markers suggest a positive answer to this question.
Statistical parameters of the gene frequency variation also support this expectation. As shown
in Figures 4 and 5, the minimum and the maximum of the Eurasian frequencies are found in
Russia, i.e., the whole range of the RH-d and haplogroup C frequencies is represented on the
territory of this country. The variances $S^2$ of these frequencies in Eurasia as a whole and in
the third of its territory covered by Russia are also comparable. This observation pertains to
the great majority of the genes examined, which is illustrated by the distribution of means and
variances of frequencies of 140 genetic markers in the Eurasian and Russian populations
(Figure 6).

Figure 6. The means (a) and variances (b) of the genetic marker frequencies in Russia (black line) and
Eurasia (white line).

Geography of Gene Pool

Each of the maps describes the fate of a particular gene in the population gene pool. The
diversity of frequency landscapes among the genes is explained, first, by the different
functions and physiological roles of different genes, and thus their different responses to
selection, and second, by the heterogeneity of the population. Averaged over a number of genes
having different selective significance, such a gene set will be selectively neutral [Gene
Pool 2003]. In this case, synthetic characteristics of the gene set become a function of gene
drift and migration under conditions of a selectively neutral environment. This allows one to
neglect the specific properties of particular genes and to find what they have in common: the
shared characteristics that combine these genes into a new entity, a gene pool.
There are several currently accepted methods of characterizing a set of genes. Each of these
methods describes a particular part of the information about the gene pool, its structure, and
its processes. Parameters of gene diversity characterize the gene pool at each given point of
its area. The geographic landscape of these parameters is formed by areas in which similar
diversity values are compactly located. The content of these parameters allows one, to some
extent, to describe the areas in terms of the genetic demographic processes occurring in the
population. Another way to detect and estimate the traces of processes that have occurred in
the gene pool is to build a general portrait of the spatial variability of the gene frequencies
and reveal geographically connected distributions of the many genes constituting the gene pool.

Genetic Diversity

Total gene diversity $H_T$ [Nei 1975] is one of the major gene pool characteristics. As
mentioned above, averaged over a number of unlinked gene loci, this value becomes a
function of gene drift and migration. The same applies to the $H_T$ components, the
intrapopulation ($H_S$) and interpopulation ($D_{ST}$) diversities, which are related to $H_T$
as follows: $H_T = D_{ST} + H_S$.
In this case, the differences in intra- and interpopulation diversities between population
groups reflect primarily differences in the social history and demographic scenarios of the
formation of these groups. A similar level of total gene diversity in different population
groups may result from different contributions of its intra- and interpopulation
components. For instance, if the initial population is panmictic and lacks subdivision, the
indices of total ($H_T$) and intrapopulation ($H_S$) gene diversity will converge in the absence
of interpopulation differences. In the opposite case, the interpopulation component would
account for the main part of the total diversity.
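
The partition $H_T = D_{ST} + H_S$ and the two limiting cases just mentioned can be traced in a short sketch for a single locus. It assumes unweighted averaging over subpopulations and applies no sample-size correction, which may differ from the estimators actually used in this study; the function names and the allele frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def partition_diversity(populations):
    """Return (H_T, H_S, D_ST, F_ST) for one locus.

    populations: list of allele-frequency lists, one per subpopulation,
                 all in the same allele order and each summing to 1.
    """
    n = len(populations)
    n_alleles = len(populations[0])
    # H_S: mean diversity within subpopulations
    h_s = sum(gene_diversity(p) for p in populations) / n
    # H_T: diversity of the pooled (mean) allele frequencies
    mean_freqs = [sum(p[i] for p in populations) / n for i in range(n_alleles)]
    h_t = gene_diversity(mean_freqs)
    d_st = h_t - h_s                      # interpopulation component
    f_st = d_st / h_t if h_t > 0 else 0.0  # its share of the total diversity
    return h_t, h_s, d_st, f_st

# Limiting case 1: no subdivision, identical subpopulations
panmictic = partition_diversity([[0.5, 0.5], [0.5, 0.5]])
# Limiting case 2: subpopulations fixed for different alleles
subdivided = partition_diversity([[1.0, 0.0], [0.0, 1.0]])
```

In the first case $H_S$ equals $H_T$ and $D_{ST}$ is zero; in the second, $H_S$ is zero and the whole of $H_T$ is interpopulation diversity.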
Here, we consider the geography of all three gene diversity parameters in the
population of Russia: total ($H_T$), intrapopulation ($H_S$), and interpopulation diversity. The
latter parameter is represented by $F_{ST}$, which measures the proportion of the interpopulation
component in the total diversity. It reflects the degree of similarity of each local population
to its geographic neighbors. Technically, the maps of the diversity parameters are averaged
over 38 loci listed in Table 1. Each of these maps was constructed using the moving window
method [Gene Pool 2003]. The geographic variation in the genetic diversity
parameters on the map is described in the map legend, with designations corresponding to
those presented above.
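
The details of the moving window method are given in [Gene Pool 2003] and are not reproduced here. The sketch below only conveys the general idea under stated assumptions: a circular window of fixed radius in planar coordinates, a single locus, and unweighted averaging over the populations inside the window. All names and data are hypothetical.

```python
import math

def gene_diversity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def window_fst(populations, cx, cy, radius):
    """F_ST among the populations lying within `radius` of the node (cx, cy).

    populations: list of (x, y, allele_freqs) tuples.
    Returns None when fewer than two populations fall inside the window.
    """
    inside = [f for x, y, f in populations
              if math.hypot(x - cx, y - cy) <= radius]
    if len(inside) < 2:
        return None
    n = len(inside)
    h_s = sum(gene_diversity(f) for f in inside) / n
    mean_freqs = [sum(f[i] for f in inside) / n for i in range(len(inside[0]))]
    h_t = gene_diversity(mean_freqs)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0

# Hypothetical populations: two similar ones in the west, two contrasting ones in the east
pops = [(0.0, 0.0, [0.6, 0.4]), (1.0, 0.0, [0.6, 0.4]),
        (10.0, 0.0, [0.9, 0.1]), (11.0, 0.0, [0.1, 0.9])]

west = window_fst(pops, 0.5, 0.0, 2.0)    # neighbors alike
east = window_fst(pops, 10.5, 0.0, 2.0)   # neighbors differ strongly
empty = window_fst(pops, 5.0, 0.0, 1.0)   # no data inside the window
```

Sliding the window node by node over the map and repeating the calculation for every locus, then averaging over loci, would give one value per node for each diversity parameter.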
The information on gene frequency distributions indicates that the genetic variability of
the Russian population is high even as compared to the population of Eurasia. Thus, it was
expected that such general characteristics of the gene pool as gene diversity and
among-population genetic differentiation would also be high. Table 2 lists total genetic
diversity $H_T$, its intrapopulation component $H_S$, and the index of interpopulation
differentiation $F_{ST}$ in Russia and Eurasia, both for each locus and averaged over loci.

Table 2. Genetic diversity (total H_T, intrapopulation H_S, interpopulation F_ST) in Russia
and Eurasia

                        Russia                      Eurasia
Loci                    H_T     H_S     F_ST        H_T     H_S     F_ST
FUT2                    0.3708  0.3340  0.0993      0.4766  0.4326  0.0922
AB0                     0.5452  0.5311  0.0258      0.5545  0.5418  0.0230
DI                      0.1142  0.1081  0.0535      0.0875  0.0824  0.0584
FY                      0.4159  0.3657  0.1208      0.4451  0.3552  0.2019
KEL-K                   0.0711  0.0687  0.0340      0.0687  0.0663  0.0352
KEL-Kp                  0.0711  0.0694  0.0236      0.0689  0.0672  0.0253
JK                      0.4921  0.4655  0.0541      0.5000  0.4837  0.0327
FUT3                    0.5000  0.4680  0.0641      0.4998  0.4717  0.0562
LU                      0.0483  0.0458  0.0509      0.0494  0.0471  0.0465
GYPA                    0.4877  0.4627  0.0512      0.4793  0.4656  0.0286
GYPB                    0.3902  0.3413  0.1252      0.3759  0.3380  0.1009
P1                      0.4273  0.3962  0.0727      0.4470  0.4110  0.0805
RHCE                    0.7288  0.6785  0.0690      0.6952  0.6243  0.1020
RHD                     0.2022  0.1651  0.1832      0.2735  0.2357  0.1381
IGHG1, IGHG2, IGHG3     0.7107  0.6149  0.1348      0.7609  0.5988  0.2130
IGKC                    0.1533  0.1485  0.0310      0.2337  0.2173  0.0702
HLA-A                   0.8114  0.7795  0.0394      0.8464  0.8075  0.0460
HLA-B                   0.8985  0.8682  0.0336      0.9174  0.8874  0.0327
HLA-C                   0.7315  0.6982  0.0456      0.6844  0.6494  0.0511
PI                      0.2633  0.2589  0.0168      0.3784  0.3672  0.0294
C3                      0.1267  0.1224  0.0346      0.1285  0.1215  0.0547
GC                      0.3704  0.3527  0.0480      0.3620  0.3532  0.0243
HP                      0.4433  0.4347  0.0194      0.4265  0.4122  0.0336
CHE2                    0.0925  0.0914  0.0122      0.0911  0.0892  0.0207
TF                      0.0435  0.0425  0.0229      0.0314  0.0307  0.0207
ACP1                    0.4472  0.4319  0.0342      0.4340  0.4159  0.0417
AK1                     0.0400  0.0394  0.0141      0.0591  0.0580  0.0187
ESD                     0.2947  0.2871  0.0259      0.3360  0.3224  0.0407
G6PD                    0.0280  0.0276  0.0128      0.0724  0.0665  0.0809
GLO1                    0.4439  0.4254  0.0416      0.4254  0.4064  0.0448
PGM1                    0.5236  0.5167  0.0132      0.5541  0.5473  0.0123
PGD                     0.1302  0.1255  0.0358      0.1198  0.1164  0.0277
Cer                     0.3751  0.3018  0.1954      0.4431  0.3488  0.2130
CB                      0.0700  0.0684  0.0222      0.1071  0.1054  0.0156
PTC                     0.5763  0.5476  0.0498      0.5640  0.5382  0.0458
mtDNA haplogroups       0.8795  0.7891  0.1027      0.9299  0.8236  0.1142
Average                 0.3700  0.3465  0.0559      0.3869  0.3585  0.0631
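
The internal consistency of Table 2 can be checked with a few lines of code, assuming that F_ST is defined per locus as (H_T - H_S) / H_T. The sketch uses three Russian rows and the Russian F_ST column copied from the table; small gaps in the last digit are expected because the tabulated H_T and H_S are themselves rounded. The averaged F_ST in the bottom row matches the mean of the per-locus values, not the ratio computed from the averaged H_T and H_S.

```python
# (H_T, H_S, F_ST) for three Russian rows of Table 2
rows = {
    "FUT2": (0.3708, 0.3340, 0.0993),
    "RHD": (0.2022, 0.1651, 0.1832),
    "mtDNA haplogroups": (0.8795, 0.7891, 0.1027),
}

# Largest gap between the tabulated F_ST and (H_T - H_S) / H_T
max_gap = max(abs((h_t - h_s) / h_t - f_st) for h_t, h_s, f_st in rows.values())

# Russian F_ST column of Table 2, in table order (36 loci)
fst_russia = [
    0.0993, 0.0258, 0.0535, 0.1208, 0.0340, 0.0236, 0.0541, 0.0641, 0.0509,
    0.0512, 0.1252, 0.0727, 0.0690, 0.1832, 0.1348, 0.0310, 0.0394, 0.0336,
    0.0456, 0.0168, 0.0346, 0.0480, 0.0194, 0.0122, 0.0229, 0.0342, 0.0141,
    0.0259, 0.0128, 0.0416, 0.0132, 0.0358, 0.1954, 0.0222, 0.0498, 0.1027,
]
mean_fst = sum(fst_russia) / len(fst_russia)          # mean of per-locus values
ratio_of_means = (0.3700 - 0.3465) / 0.3700           # from the averaged H_T, H_S
```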

In the context of the present study, the similarity and numerous parallels in the estimates
of genetic polymorphism in Russia and Eurasia are of particular interest. The similarity of the
estimates and their synchronous variation from locus to locus are striking, while the
difference between the mean estimates of the genetic diversity per locus in Russia and
Eurasia is statistically nonsignificant (P = 0.99).
How exactly are the main parameters of population gene diversity distributed in space,
are there any trends in the geographic variation of these parameters, and are these trends, if
any, the same in the populations of Russia and Eurasia? Answers to these questions are found
in the maps of the genetic diversity parameters averaged over loci, which are presented here.
The total gene diversity of the Russian population exhibits a cline with the diversity
index decreasing from the west and southwest eastwards and northeastwards (Figure 7a).
Index $H_S$, which measures the intrapopulation diversity accounting for most of the total
diversity, shows the same trend (Figure 7b). The distribution of $F_{ST}$ demonstrates the
contribution of the interpopulation component to the total gene diversity and the difference
among the populations in each region of Russia (Figure 7c). This distribution shows the same
clinal pattern, but with the opposite direction of the cline. In contrast to the intrapopulation
component, the interpopulation diversity increases in the eastward direction on the territory
of Russia.
Based on these results (Figure 7), we distinguish two different regions on the territory.
The first region encompasses the European part of Russia, the Urals, and South Siberia. The
population of this region exhibits high total and intrapopulation genetic diversity, whereas the
diversity among these populations is low. The second region, East Siberia, is inhabited by
genetically highly differentiated populations with low intrapopulation variability. In the
context of racial and linguistic classifications of the populations, the boundaries of these
regions correspond to the borders of distribution of different anthropological types and
language groups (Figure 2). Thus, European Russia, the Urals, and South Siberia, as noted above,
are peopled by members of the Caucasoid and transitional races (speaking Indo-European, North
Caucasian, Finno-Ugric, and Turkic languages). East Siberia is the territory peopled by
Siberian Mongoloids (speaking Samoyedic, Tungus, and Paleo-Asiatic languages). This pattern
of genetic diversity of the population of Russia and its correspondence to the racial and
ethnolinguistic diversity of the population may point to the historical roots of its origin, in
particular, the non-uniform rate of the historical process.


Figure 7. Genetic diversity of population of Russia: total HT (a); intrapopulation HS (b), and
interpopulation FST (c).
This non-uniformity dates back to the Neolithic [Neolithic 1996] and is manifested in the
simultaneous coexistence of different cultural epochs on the territory of the country. It is also
associated with different rates of demographic development.
The historical process proceeded with particular intensity in the west and the south of
Russia under the impact of the Mediterranean and Central Asian centers of civilization.
This intensity is reflected in the high gene diversity and low among-population differences in
these regions. In Siberia, the indigenous peoples still preserve the traditional forms of
occupation, including hunting of forest and sea animals, reindeer herding, etc. The demographic
development there is especially slow, the population density is low, and links among the
populations are weak. As a result, the gene pool of the Siberian population, in particular of its
eastern part, exhibits low gene diversity and high among-population differentiation.
Such is the structure of the population of Russia. In all probability, this structure reflects
the historical and demographic conditions of its development. The distribution of the
corresponding parameters of genetic diversity on the territory of Eurasia (Figure 8) shows
that the longitudinal trend in the genetic diversity variation is characteristic of the whole
Eurasian continent. The gene frequency landscapes described for Russia reflect Eurasian
landscapes and trends. The total level of genetic diversity of the population decreases and the
level of interpopulation genetic differentiation increases from the west eastwards. The
European part of Russia is included in the western Eurasian region of high population genetic
diversity, which covers Europe, the Near East, West Asia, and Central Asia. The territory of
Siberia shows the specific features of the eastern Eurasian region, whose population displays
rather low genetic diversity together with high among-population genetic differentiation.



Figure 8. Genetic diversity of Eurasian population: total HT (a); intrapopulation HS (b), and
interpopulation FST (c).
This variation in Siberia is high even on the Eurasian scale. The map shows that Asian
populations from the Himalayan highlands and the Pamir have a similar, albeit somewhat
lower, variation level.

Generalized Variability of Gene Frequencies

In terms of genetic diversity, the gene pool of Russia seems to consist of two components,
western and eastern. A question arises: is this gene pool architecture reflected in the gene
frequencies and, if so, to what extent? This question can be answered by considering
geographically associated distributions of the genes included in the gene pool of the Russian
population.
We revealed the most general types of geographic variation of gene frequencies on the
territory of Russia using principal component analysis, which is widely employed in
statistics and population genetics. The principal components were calculated from a
correlation matrix. The essence of principal component maps is the detection of repeated
landscape structures in the variety of genetic landscapes of different genes. On the map,
negative and positive component values define stably repeated areas of the distribution of
extreme gene frequency values (minimums and maximums), indirectly pointing to territories
inhabited by the groups of people whose gene distributions differ most.
Minimums and maximums of a principal component determine the main directions of population
differentiation, and intermediate (close to 0) component values describe zones of
transgression of different population groups. The direction of such transgressions is shown by
the shape of the isolines, while the distribution of areas by the intervals of the principal
component values measures the areas occupied by different principal component values.
What can be inferred from maps of principal components, which are in essence synthetic
maps of the generalized variance of gene frequencies? Repeatability of the distribution of
biologically independent (unlinked) genetic characters can be caused only by the continuous
systematic or random action of population genetic factors (migration, mutation, and gene
drift), directed and regulated by natural and social environmental factors. Accordingly, the
maps of principal components can be read as traces of the effects of these factors on the gene
pool. The depth of these traces is measured by the rank of the principal component, which
depends on the proportion of the total variance that it explains. In this study, we consider the
first two principal components, which account for about 40% of the total genetic variation of
the population gene pool of Russia.
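
A minimal sketch of the calculation of a first principal component from a correlation matrix, and of the share of total variance it explains, is given here for orientation. It is not the software used for the maps: it relies on plain power iteration, treats each row as one population (or map node) and each column as one gene frequency, and uses invented data with three markers.

```python
import math

def correlation_matrix(data):
    """Return (corr, z): correlation matrix of the columns and standardized data."""
    n, m = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(m)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / n)
           for j in range(m)]
    z = [[(row[j] - means[j]) / sds[j] for j in range(m)] for row in data]
    corr = [[sum(z[k][i] * z[k][j] for k in range(n)) / n for j in range(m)]
            for i in range(m)]
    return corr, z

def first_principal_component(corr, iterations=500):
    """Leading eigenvector and eigenvalue of corr by power iteration."""
    m = len(corr)
    # Uneven start vector, so it is unlikely to be orthogonal to the leading axis.
    v = [1.0 / (j + 1) for j in range(m)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(m))
                     for i in range(m))
    return v, eigenvalue

# Hypothetical frequencies of three markers in five populations, west to east
data = [
    [0.40, 0.35, 0.02],
    [0.35, 0.30, 0.05],
    [0.25, 0.28, 0.20],
    [0.10, 0.15, 0.40],
    [0.05, 0.12, 0.55],
]
corr, z = correlation_matrix(data)
axis, eigenvalue = first_principal_component(corr)
explained_share = eigenvalue / len(corr)   # the trace of corr equals the number of markers
scores = [sum(zr[j] * axis[j] for j in range(len(axis))) for zr in z]
```

The scores are the values that would be mapped; their sign is arbitrary, so only the contrast between the two ends of the cline is meaningful.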
The landscape described for the genetic diversity maps is manifested on the map of the first
principal component (PC) of the gene frequency variation. The spatial distribution of the first
PC is clinal, from negative minimum values in the west to high positive ones in the east, with
a zone of intermediate, close to zero values (Figure 9a). Zones of extreme values delineate
territories inhabited by the members of two major races, Caucasoid and Mongoloid. The same
longitudinal gradient of gene frequencies occurs in Eurasia (Table 3, Figure 9b). Here also,
the opposite values of the principal component define the distribution areas of Caucasoid
populations in the west and Mongoloid populations in the east. The first PC accounts for
32% of the total variation of the Russian gene pool and 28% of that of the Eurasian
population (Table 3). In view of this, we conjecture that about one-third of the genetic
variation of both the Russian and the Eurasian population is caused by the affiliation of the
population of these regions to two major human races.
In this case, intermediate, close to zero values of the principal component would
delineate areas of close contacts and deep introgression of the two different parts of the gene
pool. In Eurasia, this contact zone is particularly wide, being very pronounced in Central
Asia (Figure 9b), where the vast steppes served for centuries as an area of contact between
genetically different populations and a zone of intense migrations. In Russia, such a zone of
genetic transgression is represented by the territories adjacent to the Urals and particularly
southern Siberia (Figure 9a).

Table 3. Principal components of gene frequency variation, their contributions to the total
geographic variation, and their correlations with geographical coordinates

      % of total variance     Correlation with          Correlation with
                              longitude (P = 0.99)      latitude (P = 0.99)
PC    Russia    Eurasia       Russia    Eurasia         Russia    Eurasia
1     32        29            0.86      0.93            0.38      0.13
2     10        15            0.15      0.05            0.68      0.89


Figure 9. The first principal component of gene pool variability in Russia (a) and Eurasia (b).

Defining this zone as a contact zone is in good agreement with the evaluation of its role in
the history of the Russian population in terms of archeology and paleoanthropology. According
to the archeological data on variation of the material culture, as early as the late Paleolithic
(26 to 12 thousand years ago), the territories of West and South Siberia and Central Asia were a
theater of interaction of three distinct cultural worlds: European, Siberian, and Central
Asian. The impact of the first of them prevailed in the late stage of the Paleolithic, 15 to 12
thousand years ago [Grekhova et al. 1996]. In the Neolithic and the Bronze Age, the border
between Mongoloid and Caucasoid tribes was much farther to the east than today. The area
of the Caucasoid race then included territories of Central Asia and South Siberia. At that
time, the main Siberian and Central Asian types of the Mongoloid race were formed, and in the
contact zone of Caucasoids and Mongoloids (predominantly in Central Asia), the eastern
variant of the Caucasoid race appeared. In the early Iron Age (first half of the 1st millennium
A.D.), a wide and gradual migration of Central Asian Mongoloids to the west began,
continuing in the Middle Ages. These processes of admixture explain the modern
anthropological diversity in the population of Central Asia and West and South Siberia
[Alekseev, Gokhman 1984].
Thus, the global historical process, the interaction between the east and the west, has left a
deep impression on the gene distribution on the territory of Eurasia and on the third of it
that today constitutes the territory of Russia. It has determined the main, longitudinal
direction of the geographic variation pattern of many polymorphic genes, which is reflected in
the first PC map.
The impact of global, albeit less potent, processes is manifested in the map of the second
principal component of gene frequency variation. This component accounts for 10% of the
total geographical variation of the gene frequencies in Russia and 15% in Eurasia. Both in
the country and on the continent, the second PC exhibits a latitudinal cline (Figures 10a, 10b),
which is supported by the corresponding coefficients of correlation of the second PC with
geographical coordinates (Table 3). This variation pattern is particularly marked in the Asian
part of the area, where maximums (in the north) and minimums (in the south) of the PC
values are located (Figures 10a, 10b). The map of the second principal component reveals
extremely high differentiation between the northern and southern parts of the East Asian
population, irrespective of whether the whole Eurasian continent or only its northern region
(Russia) is taken into consideration.
The geographical landscape of the second PC variation in the Eurasian gene pool
completely reproduces the natural latitudinal zonal structure of the continent (Figure 11). For
instance, in southeastern Eurasia, three extreme negative PC values delineate territories of the
tropical and subtropical climatic zones (Hindustan and Indochina, southeastern China). The
interval of close to zero second PC values defines the steppe and semi-desert zone of Asia and
the northern Black Sea region. In addition to these, many parallels between the genetic landscape
reflected in the second PC and natural latitudinal variation on the territory of Eurasia can be
listed (Figures 10b, 11).
Thus, natural zonality apparently ranks second among the factors controlling the
structure of the Eurasian gene pool. The area of human habitation encompasses virtually all
types of landscapes and natural zones, from Arctic tundra to tropical forests. Having a dual,
biological and social, nature, humans have also a dual system of adaptation to the
environment.

Figure 10. The second principal component of gene pool variability in Russia (a) and Eurasia (b).

Figure 11. Climate map of Eurasia.
In this case, natural zonality determines both the natural and the social constituents of the
latitudinal population variation. The climatic conditions are responsible for economic and
cultural lifestyles, differing among the belts of natural zonality, and determining both
pathways of development and forms and directions of contacts between different cultures.
On the territory of Russia, the latitudinal zonal structure of the Upper Paleolithic culture
continued in the distribution of Neolithic cultures: the agricultural and stock-raising Neolithic
of the southern steppes and the forest Neolithic of hunters and fishers of the north. The very
colonization of northern Eurasia, the territories of today's Russia, was directed
latitudinally, from south northwards. This direction was preserved during the whole history
of the population of Russia. The trend was most pronounced in Siberia. For instance, the
Neolithic culture of taiga hunters, represented to date by Tungus-speaking Evenks, migrated
from the Lake Baikal region to the north and the northeast [Okladnikov 1970]. From the first
centuries B.C. to the 17th century, Turkic-speaking populations moved from the south
(Central Asia) to Siberia; at present, Turkic-speaking peoples inhabit Altai and the Sayans,
the southern steppes of West Siberia, and the Lena River basin (Yakuts) in East Siberia (see
Figure 2).
The latitudinal strata of genetic deposits of all epochs of the Russian gene pool history,
beginning from the colonization of this land in ancient times, are reflected in the
geographic distribution of the second PC (Figure 10a). The second PC maximum marks
sub-Arctic regions: the Taimyr and Chukotka peninsulas. The positive values of this principal
component delineate the genetic territorial complex of the Urals region and Siberia (the area
of dispersal of Finno-Ugric and Paleo-Asiatic populations), which in the south is bordered by a
genetically different population pool with the opposite, negative component values. The
location of the zone of minimum second PC values suggests that this process was most intense in
southeastern Siberia.
In the European part of Russia, the general latitudinal type of the second PC variation has
a somewhat different direction. Here, the area of negative component values moves from the
west and southwest to the east and northeast. This direction corresponds to the migration of
Slavic tribes from Western and Central Europe to the territory of modern European Russia in
the middle of the 1st millennium A.D. These tribes, having assimilated the Baltic and Finnish
tribes, gave rise to the Russian ethnos, which to date is the predominant population of this
region [Sedov 1995].
Thus, in addition to the general latitudinal direction, corresponding to the Eurasian
landscape, the geographic pattern of the second PC distribution shows specific features of this
landscape. These specific features reflect historical processes on the territory of Russia,
the peopling of the region, and various modes of interaction of Russian populations with their
neighbors: in Siberia, primarily with Central Asian populations, and in the European part of
Russia, with the population of Western and Central Europe.

Conclusion

In this work, we present the major trends in differentiation of the gene pool of Russia.
These trends are reflected by geographic foci that exhibit extreme values of gene diversity
and opposite values of the gene frequency variation components. Two constituents, western
and eastern, European and Siberian, exhibit very different sets of alleles and genetic
demographic scenarios. These constituents account for the main part of the gene pool
variation in Russia. The heterogeneity of this gene pool noted above is to some extent
determined by the heterogeneity of its carrier, the population, which includes various ethnic
groups and even races, with different cultural and economic lifestyles. A question arises:
what can unite these two poles into an integral, ordered genetic territorial structure? To what
extent, if at all, do the different parts of the Russian gene pool interact and intermix?
This integrity can be evaluated on the basis of the general distribution of many different
genetic markers of the gene pool. If we abstract ourselves from the gene frequencies in local
populations and consider only the distribution of their deviations from their means, then the
shape of this distribution is of importance. A multimodal distribution would suggest a
substantial heterogeneity of the gene pool, consisting of disconnected gene pools of peoples
or populations devoid of uniting links, whereas a unimodal distribution would be observed if
the gene pools of different peoples are united in an integral historically formed entity. The
latter was observed in the gene pool of the population of Russia (Figure 12). The distribution
of deviations of the frequencies of the genes examined from the mean value for the total
population of Russia is symmetrical and unimodal.

Figure 12. Distribution of gene frequency deviations from their means in gene pool of Russia.
This indicates the balance of centrifugal and centripetal forces that provide stability and
integrity of the Russian population gene pool. A deviation from this balance would reflect on
the shape of the distribution, making it asymmetric: shifted to the right with prevailing
panmixia (centrifugal forces) or to the left with prevailing isolation (centripetal forces). Thus,
the gene pool of Russia appears as an entity consisting of two interacting and introgressing
parts, eastern and western; these parts are equidistant from the mean.
Synthetic maps of the gene pool characteristics (gene diversity and components of gene
frequency variation) show that the transition of the gene pool from one state to the other is
gradual. We marked the zone of this transition as transgressive on the principal
component maps and as the area of medium values of gene diversity. The map given in Figure 13
shows how the postulated introgression of the parts of the Russian gene pool is expressed over the
geographic area of this gene pool. The map reflects the genetic distance [Cavalli-Sforza,
Edwards 1967] of the populations at each point of Eurasia from the average gene pool of
Russia. As expected, the area of minimum and close to minimum genetic distances was
observed on the territory of Russia, namely on the West Siberian Plain. This region was
described above as the contact zone of the eastern and western populations of Russia. Here,
we would like to make a special reference to the following features of the genetic distance
landscape. First, note the wide expanse of the area of minimum and small (below average)
distances, which suggests long-term and dense contacts between the eastern and the western
parts of the gene pool of Russia. Second, note the extension of the small distance values to the
south, beyond the boundaries of Russia. The southern limit of this genetic distance zone
reaches the highest and longest mountains in Eurasia, the Himalayas. It is precisely this natural
barrier that proved insurmountable for the gene flow.
Thus, genes do not know administrative and state borders. It would seem that state
subdivisions and formations affect the fate of gene pools by changing the demographic
situation, direction and intensity of gene flows. However, in reality this is only a moment in
the history of a gene pool, whose lifetime is measured by generations and millenniums of the
population evolution. For this reason, the gene pool of Russia does not appear as a distinct
entity within the Eurasian gene pool. As an integral part of the Eurasian gene pool, the gene
pool of Russia shares with it the same space and the same history.

Figure 13. Genetic distances to the average gene pool of Russia.
The specificity of the gene pool of Russia lies elsewhere. This gene pool, occupying a
peripheral position in Eurasia, has characteristics that fully correspond to those of the gene
pool of the continent. As shown above, the minimums and maximums of Eurasian
frequencies of many genes are located on the territory of Russia, that is, this northern part of
the continent often encompasses the whole Eurasian range of variation.
Thus, the gene diversity of the gene pool of Russia, irrespective of whether we consider
distinct genes or their set, either corresponds to that of the whole of Eurasia or
encompasses most of it.
The main processes that have formed the Eurasian gene pool acted with the same
strength and precision in the gene pool of Russia. Apparently, the gene pool of the population
of Russia is one of the most representative parts of the Eurasian gene pool, and the
knowledge of the former is indispensable for understanding the latter.
This work was supported by the program "Biological Diversity and Dynamics of Gene Pools
of Plants, Animals and Humans" of the Presidium of the Russian Academy of Sciences.

References

Alekseev, V. P., Gokhman, I. I. (1984). Anthropology of Asian Part of USSR. Moscow:
Nauka. (in Russian)
Cavalli-Sforza, L. L., Menozzi, P., Piazza, A. (1994). The History and Geography of Human
Genes. Princeton, New Jersey: Princeton University Press.
Edwards, A. W. F., Cavalli-Sforza, L. L. (1972). Affinity as Revealed by Differences in Gene
Frequencies. In The Assessment of Population Affinities in Man. J. S. Weiner and J.
Huizinga (Eds.), pp. 37-47. Oxford: Oxford University Press.
Ethnoses of the World. The History-ethnographic Handbook. (1988). Yu. V. Bromlei (Ed.).
Moscow: Sovjetskaya Entsiklopedia. (in Russian)
Grekhova, D. V., Balanovskaya, E. V., Rychkov, Yu. G. (1996). Technology of Development
of Computer Regional Atlases: The Late Paleolithic Age in Northern Eurasia. In The
Humanities in Russia: Soros Laureates. History, archeology, cultural anthropology and
ethnography, pp. 286-304. (in Russian)
Landsteiner, K. (1901). Über Agglutinationserscheinungen normalen menschlichen Blutes. Wiener
Klin. Wochenschr., 14, 1132-1134.
Menozzi, P., Piazza, A., Cavalli-Sforza, L. L. (1978). Synthetic Maps of Human Gene
Frequencies in Europe. Science, 201, 786-792.
Mourant, A. E., Kopec, A. C., Domaniewska-Sobczak, K. (1976). The Distribution of the
Human Blood Groups and Other Polymorphisms. London: Oxford University Press.
Nei, M. (1975). Molecular Population Genetics and Evolution. Amsterdam: North-Holland
Publishing Company.
Nei, M., Roychoudhury A. K. (1988). Human Polymorphic Genes: World Distribution. New
York: Oxford University Press.
Neolithic Age in Northern Eurasia. (1996). S. V. Oshibkin (Eds). Moscow: Nauka. (in
Russian)
Okladnikov, A. P. (1970). Stone Age on the Territory of USSR: The Neolithic of Siberia and
Far East. Moscow: Nauka. (in Russian).
Piazza, A., Menozzi, P. (1983). Geographic Variation in Human Gene Frequencies. In
Numerical Taxonomy: Proceedings of a NATO Advanced Study Institute. J. Felsenstein
(Ed.), pp. 444-450. Berlin: Springer.
Piazza, A., Menozzi, P., Cavalli-Sforza, L. L. (1981a). Synthetic Gene Frequency Maps of
Man and Selective Effects of Climate. Proc. Natl. Acad. Sci. USA, 78, 2638-2642.
Piazza, A., Menozzi, P., Cavalli-Sforza, L. L. (1981b). The Making and Testing of
Geographic Gene Frequency Maps. Biometrics, 37, 635-659.
Rychkov, Yu. G. (Ed.) (2003). Gene Pool and Genegeography of Population:
Genegeographical Atlas of Population of Russia and Contiguous Countries. St. Petersburg:
Nauka. (in Russian)
Rychkov, Yu. G. (Ed.) (2000). Gene Pool and Genegeography of Population: Gene Pool of
Population of Russia and Contiguous Countries. St. Petersburg: Nauka. (in Russian)
Rychkov, Yu. G. (1984). Space and Time in Genegeography. Vestnik of the Academy of
Medical Sciences USSR, 7, 111-116. (in Russian)
Rychkov, Yu. G. (1986). Genetic Chronology of Historical Events. Voprosy Anthropologii,
77, 3-18. (in Russian)
Rychkov, Yu. G., Sheremetyeva, V. A. (1977). The Genetic Process in the System of Ancient
Human Isolates in North Asia. In Population Structure and Human Variation. G. A.
Harrison (Ed.), pp. 47-108. Cambridge: Cambridge University Press.
Sedov, V. V. (1995). Slavs in the Early Middle Ages. Moscow: Fond Arheologii. (in Russian)
Serbenyuk, S. N., Koshel, S. M., Musin, O. R. (1990). Methods of Modeling of Geo-fields by
Data in Irregularly Located Points. Geodezia i kartographia, 11, 31-35. (in Russian)
Serebrovsky, A. S. (1928). Genegeography and Gene Pool of Agricultural Animals.
Nauchnoe Slovo, 9, 3-22. (in Russian)
Tills, D., Kopec, A. C., Tills, R. E. (1983). The Distribution of the Human Blood Groups and
Other Polymorphisms. Oxford: Oxford University Press.
Vavilov, N. I. (1927). Geographical Regularities in Relation to the Distribution of the Genes
of Cultivated Plants. Works of Applied Botany, Genetic and Plant Breeding, 17, 411-428.
(in Russian)
Vavilov, N. I. (1992). Origin and Geography of Cultivated Plants. Cambridge: Cambridge
University Press.
Wright, S. (1943). Isolation by Distance. Genetics, 28, 114-138.

Chapter 7

Genetic Variability within Cypella
fucata Ravenna in Southern Brazil

Évilin Giordana de Marco, Luana Olinda Tacuati, Lilian Eggers,
Eliane Kaltchuk-Santos and Tatiana Teixeira de Souza-Chies

Department of Genetics
Department of Botany, Universidade Federal do Rio Grande do Sul,
Av. Bento Gonçalves, 9500, CEP 91501-970, Porto Alegre,
Rio Grande do Sul, Brazil

Abstract

Iridaceae is a relatively large family of monocots comprising over 2,030 species in
65-75 genera. Cypella fucata Ravenna is a perennial, bulbous herb with attractive orange
flowers of ornamental value. The distribution of the
species comprises Brazil, in the states of Rio Grande do Sul and Santa Catarina, and
Uruguay. This study aims to compare two geographically distinct survey areas of C.
fucata using molecular approaches and to offer a contribution to the knowledge of
genetic variation of the species. Cypella fucata specimens were collected in the State of
Rio Grande do Sul, Brazil, in two sites: the municipalities of Piratini (26 specimens) and
Capão do Leão (28 specimens). Survey sites were located along a road, and were 22
km distant from each other. Specimens were analyzed by ISSR-PCR (Inter Simple
Sequence Repeats) since ISSR markers have a high capacity to reveal polymorphism and
offer a great potential to determine intra- and interspecific levels of variation. Nine
primers were tested, generating 201 fragments (bands) with sizes ranging from 150 bp to
2,000 bp and an average of 22 bands per primer. A matrix of presence and absence of
fragments was constructed and Jaccard's coefficient was calculated. A dendrogram
based on these values was generated to reveal the genetic structure of both populations.
The patterns were highly polymorphic within each collection site, with samples
aggregated into two major groups, corresponding to the surveyed populations. In
addition, ΦST was calculated and may indicate some interpopulation gene flow (ΦST =
0.0851) and an intermediate structure. Nei's genetic identity was high
between the two collection sites analyzed (98%). Since the sampled areas were near each
other, our data may suggest that they in fact correspond to two subpopulations derived
from a single original one. These data may indicate that C. fucata is cross-pollinated
and that vegetative propagation does not play an important role in the
maintenance of the populations. Specimens from other sites will be analyzed to confirm
the mating system. This study is the first contribution to the knowledge of evolutionary
aspects of this species.

Introduction

Iridaceae is a family of plants that includes 65-75 genera and 2,030 extant species
(Goldblatt et al., 2008), most of which are found in the Southern Hemisphere. Africa is the
centre of diversity of the family, and the majority of the species is concentrated in temperate
and Mediterranean regions in the Southern part of the continent (Goldblatt et al., 1995).
Many genera of Iridaceae are economically important because of their ornamental value. This
family is easily recognizable among other monocots by having isobilateral equitant leaves,
flowers with three stamens and inferior ovary (Goldblatt, 1990). Representatives of Iridaceae
occur preferentially in open environments such as fields, prairies and humid areas. They are mostly
perennial herbs, and bulbous plants of different genera are frequent. Basal leaves vary in
number, and they are cylindrical or flat, ensiform and usually inconspicuous in vegetation
coverage (Dahlgren et al., 1985).
In Brazil, the family Iridaceae is represented by little-known native and exotic species.
Overall, the family is represented by 14 genera and 110 species (Innes, 1985). In Rio Grande
do Sul (RS), the southern state of the country, ten genera include native species, and it is
believed that some are not yet known by botanists. Many species in the family, especially
those that are endemic, are under considerable risk of extinction due to anthropic activities in
the natural environment. More studies are needed to increase knowledge of these
species, and more efforts should be made to preserve these elements of our flora. The native
species of RS are noteworthy in spring, which is the flowering period of the plants. The
estimated number of Iridaceae species in RS is about 40, and five of them belong to Cypella
Herb. (Ravenna, 1981, 1983).
Iridaceae belongs to Asparagales (Fay, 2000; APG, 2003), although it was previously placed in
Liliales (Stevenson and Loconte, 1995). Combined plastid DNA and morphological data
place Iridaceae within Asparagales, as an early-diverging family
in that order.
Iridaceae is currently considered to have seven subfamilies (Isophysidoideae,
Patersonioideae, Geosiridoideae, Aristeoideae, Nivenioideae, Crocoideae and Iridoideae;
Goldblatt et al., 2008), instead of the four previously recognized (Manning and
Goldblatt, 2001). The subfamily Iridoideae is represented by plants with flowers with
perigonal nectaries and style branches that are stigmatic only at the top, as a consequence of
the conduplicate margins of the branches that fall below the level of the anthers. This
subfamily is divided into five tribes, including Tigridieae, whose taxonomy has not been fully
investigated (Goldblatt et al., 2008). Many new and mainly monotypic genera have been
described by means of analysis based only on morphological characters, which led Goldblatt
(1990) to suggest that careful cladistic analysis seems to be a way to contribute to a more
acceptable taxonomy. Early studies regarding classification of Iridaceae using modern
methods of cladistics have been presented by Goldblatt (1990), considering data from
different approaches such as cytology, anatomy, pollen morphology, flavonoids and amino
acids.
The genus Cypella is native to Central and South America and belongs to the tribe
Tigridieae (Dahlgren et al., 1985). Plants have brown to blackish tunicate bulbs and plicate
leaves. Flowers are usually yellow or orange, but may also be purple or white, with
unequal tepals, of which the outer three are larger and the inner are partly enclosed by folds. The
anthers are attached at least apically to the opposed style branches, which are well developed
and usually divided at the top into characteristic prominent acute crests. Cypella is the
second largest genus in Tigridieae (Tigridia Juss. being the largest) and occurs from South
America to Mexico and Cuba, with 21 species (Goldblatt et al., 1998; Howard, 2001;
Ravenna, 2005). Plants usually inhabit dry, stony soils, although there are species such as C.
aquatilis Ravenna that can be found in small rivers and on flooded river banks (Howard, 2001;
Ravenna, 1981a).
There is no taxonomic revision of the genus. Information about the occurrence of species
in different countries or regions is available only in general floras. Ravenna (1968) reported
two Cypella species for the Buenos Aires province in Argentina, the well-known C. herbertii
(Lindl.) Herb. and C. coelestis (Lehm.) Diels. For Uruguay, Lombardo (1984) mentioned only
C. herbertii. Goldblatt et al. (2008) showed that Cypella species form a paraphyletic group in a recent
phylogenetic study of Iridaceae, which provides evidence that the taxonomy of the genus
needs closer analysis.
Cypella fucata Ravenna is a wild plant with populations often formed by few individuals.
Ravenna (1981b) named eight new Cypella species from the floras of Argentina, Brazil,
Paraguay and Uruguay, and described C. fucata from grassy fields of Santa Catarina and Rio
Grande do Sul (Brazil) and northeastern Uruguay. Cypella fucata belongs to the section
Cypella and has large orange flowers with markedly reflexed outer tepals, filaments
connate along half or more of their length, and arcuate adaxial crests on the style branches.
Cypella fucata (Figure 1) is related both to Cypella herbertii (Lindl.) Herb. and to C.
osteniana Beauv. It resembles C. osteniana in the size of the plant and in its flowers, which
have dark-veined outer tepals. However, the relatively large bulb, the whitish color of the
perigone, and the rather long and divergent style crests make C. osteniana unmistakable
(Ravenna, 1981b). Cypella herbertii, in contrast, has much smaller style crests.
Iridaceae displays considerable problems concerning the taxonomy of its species. The
scarcity of studies on its genera, along with their varied morphology and the subtle differences
among morphological characters, has hampered analyses in this field and shows the need
for further investigation, for instance at the molecular level. The few phylogenetic
studies based on plastid regions, namely rps4 (Souza-Chies et al., 1997) and rbcL (both
protein-coding genes), the trnL intron and the trnL-F intergenic spacer (Reeves et al., 2001),
and the rps16 intron, the matK exon and others previously studied for Iridaceae (Goldblatt et al.,
2008), have revealed the need for more specific studies to better ascertain the circumscription
of many genera in the family. Some studies on Iridaceae involving intrapopulation analysis
using molecular markers (AFLP) have found polymorphism in Romulea bulbocodium Sebast. &
Mauri, together with marked morphological variability (Colasante et al., 2008).

Figure 1. Lateral and top view of Cypella fucata flower.
Spier et al. (2008) evaluated the use of RFLP markers to identify species of the
subfamily Iridoideae. The results showed Cypella species grouped with Herbertia Sweet,
both of the tribe Tigridieae, denoting the efficiency of this molecular marker. Other studies
dealing with the genetic variability in species of the family Iridaceae include those with
Sisyrinchium micranthum Cav. (Tacuati et al., unpublished data), in which different ploidy
levels are being analysed together with variation in morphological traits.
DNA markers have proved to be valuable in crop breeding research, especially in
genetic diversity studies. The commonly used polymerase chain reaction (PCR)-based
DNA marker systems are Random Amplified Polymorphic DNA (RAPD), Amplified
Fragment Length Polymorphism (AFLP) and, more recently, Simple Sequence Repeats
(SSRs) or microsatellite assays (Staub et al., 1996; Gupta and Varshney, 2000).
Inter Simple Sequence Repeat (ISSR) is a rapid, easy and cheap technique that
involves polymerase chain reaction (PCR) amplification of regions between adjacent,
inversely oriented microsatellites. The primers contain simple sequence repeat (SSR) motifs
(di-, tri-, tetra- or penta-nucleotides) and are either unanchored or anchored at the 3' or 5'
end by two to four arbitrary, often degenerate nucleotides.
ISSR markers have a high capacity to reveal polymorphism, offering great potential to
determine intra- and interspecific levels of variation (Wu et al., 1994; Zietkiewicz et al.,
1994). Such markers have been widely used in studies of genetic diversity (Li and Ge, 2001;
Brantestam et al., 2004), phylogenetics (Joshi et al., 2000; Xu and Sun, 2001; Yockteng et al.,
2003) and population genetics (Camacho and Liston, 2001), as well as in studies of interspecific
genetic relationships (Ajibade et al., 2000; Reddy et al., 2002).
ISSR markers show some advantages over randomly amplified
polymorphic DNA (RAPD): the primers are longer (approximately 14 bp or more), allowing for
more stringent annealing temperatures (Wolfe and Liston, 1998) and providing a higher
reproducibility of bands than RAPD (Souframanien and Gopalakrishna, 2004). In
addition, the primers anneal to SSRs, which are abundant throughout the eukaryotic genome and
evolve rapidly; hence the potential to reveal high polymorphism levels (Zietkiewicz et al.,
1994; Li and Ge, 2001). The universal applicability of these markers, the capacity to access unknown
genomes, and the lack of need for species-specific primers are important aspects of the ISSR-PCR
technique.
The analysis of genetic variability at the inter- and/or intrapopulation level is of extreme
importance in studies that aim at the characterization of a biological species and its
conservation. It allows a link to be drawn between the current condition of a species and its
evolutionary history, and enables proposals for its management and maintenance to be outlined,
whether or not it is endangered.
This chapter presents a study that aims to compare two geographically distinct
populations of C. fucata using a molecular approach, in order to contribute to the knowledge of the
genetic variation of this species.


Population Sampling

In 2006, a total of 54 specimens of Cypella fucata were sampled in two municipalities,
Piratini (26 individuals) and Capão do Leão (28 individuals), along a
country road (BR 293) in Rio Grande do Sul (RS), Brazil. The accessions were named ESC
191 (Eggers and Souza-Chies 191, accession from Piratini, 31°43'35.3"S 52°51'57.7"W) and
ESC 192 (accession from Capão do Leão, 31°44'04.8"S 52°39'33.7"W). The populations
were 22 km away from each other. Voucher specimens have been deposited in the ICN
Herbarium, Instituto de Biociências, Universidade Federal do Rio Grande do Sul.
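
As a rough cross-check of the separation between the two sites, the sketch below computes the great-circle (haversine) distance between the accession coordinates. It assumes that the coordinates are to be read as degrees, minutes and seconds, and it uses a mean Earth radius of 6,371 km; the function names are illustrative only. The straight-line distance obtained this way is somewhat shorter than the 22 km reported in the text, which presumably refers to the distance along the road.

```python
import math

def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points (assumed mean Earth radius)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# Coordinates of ESC 191 (Piratini) and ESC 192 (Capao do Leao), read as DMS.
esc191 = (dms_to_decimal(31, 43, 35.3, "S"), dms_to_decimal(52, 51, 57.7, "W"))
esc192 = (dms_to_decimal(31, 44, 4.8, "S"), dms_to_decimal(52, 39, 33.7, "W"))

distance_km = haversine_km(*esc191, *esc192)
print(f"Straight-line distance: {distance_km:.1f} km")
```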

DNA Isolation and ISSR-PCR Amplification

Total genomic DNA was extracted from 10-50 mg of silica gel-dried leaves using
hexadecyltrimethylammonium bromide (CTAB) according to the Doyle and Doyle (1987) method,
with modifications. The DNA samples were quantified and standardized at a
concentration of 10 ng/µL. The PCR reactions were standardized to a total volume of 25
µL: 14.8 µL of sterilized Milli-Q water, 0.2 µL of Taq DNA polymerase (5 U/µL)
(CentBiot, Brazil), 2.5 µL of MgCl2 (50 mM), 2.5 µL of 10X buffer, 1.5 µL of primer (10
pmol), 1.5 µL of dNTP (10 mM of each dNTP) (Invitrogen, Brazil), 1 µL of DMSO (100%),
and 1 µL of DNA (10 ng/µL). Amplifications were performed in a PALMGEN
thermocycler. The amplification conditions were: one 5-min step at 92 °C, followed by 35
cycles of 1 min at 94 °C, 1 min at 45 °C for annealing, and 2.5 min at 72 °C, then a 5-min final
extension at 72 °C and a hold at 4 °C (Tian et al., 2008; Medraoui et al., 2007). The ISSR amplification
products were stained with GelRed (5 ng/ml) and separated by horizontal electrophoresis
on a 1.5% agarose gel in 1X TBE buffer (50 mM Tris, 50 mM boric acid, 2.5 mM
EDTA, pH 8.3) at 100 mA. The electrophoresis gels were visualized and photographed under
ultraviolet light. The size of the amplified products was determined by comparison with a
100-bp molecular weight ladder (Amersham Biosciences). Ten primers were tested and nine
of them were analyzed. Eight anchored oligonucleotide primers, (GA)8T, (CTC)4RC, (CT)8G,
(AG)8YC, (GA)8C, (AC)8T and (GT)8A, and one nonanchored primer, (GACA)4, were used to
amplify all samples. These primers were selected based on the number of amplification
products and the quality of the profiles obtained using one sample from one species.
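
The reaction mix above can be checked with simple dilution arithmetic (C1 x V1 = C2 x V2). The sketch below is only an illustration: it takes the volumes and stock concentrations as listed in the text, assumes all volumes are in microliters, and uses variable names of our own choosing. It confirms that the components add up to the stated 25 µL and derives the final concentrations in the reaction.

```python
# Volumes in microliters, as listed in the text (units assumed to be uL).
components_ul = {
    "water": 14.8,
    "taq_5U_per_ul": 0.2,
    "mgcl2_50mM": 2.5,
    "buffer_10x": 2.5,
    "primer": 1.5,
    "dntp_10mM_each": 1.5,
    "dmso_100pct": 1.0,
    "dna_10ng_per_ul": 1.0,
}

total_ul = sum(components_ul.values())

def final_concentration(stock, volume_ul, total=None):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    total = total_ul if total is None else total
    return stock * volume_ul / total

mgcl2_mM = final_concentration(50, components_ul["mgcl2_50mM"])      # mM
dntp_mM = final_concentration(10, components_ul["dntp_10mM_each"])   # mM of each dNTP
buffer_x = final_concentration(10, components_ul["buffer_10x"])      # X
dmso_pct = final_concentration(100, components_ul["dmso_100pct"])    # percent (v/v)
taq_units = 5 * components_ul["taq_5U_per_ul"]                       # units per reaction
dna_ng = 10 * components_ul["dna_10ng_per_ul"]                       # ng per reaction

print(f"total volume: {total_ul:.1f} uL")
print(f"MgCl2 {mgcl2_mM:.1f} mM, dNTP {dntp_mM:.1f} mM each, buffer {buffer_x:.0f}X, "
      f"DMSO {dmso_pct:.0f}%, Taq {taq_units:.0f} U, DNA {dna_ng:.0f} ng")
```

Under these assumptions each 25 µL reaction contains 5 mM MgCl2, 0.6 mM of each dNTP, 1X buffer, 4% DMSO, 1 U of Taq polymerase and 10 ng of template DNA.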

Statistical Analysis

Bands were scored as a binary variable: (1) for presence and (0) for absence of each
fragment size. The binary matrix (1/0) was used to calculate the similarity between each pair
of samples according to Jaccard's coefficient.
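
As an illustration of this step, the sketch below computes Jaccard's coefficient, a / (a + b + c), where a is the number of bands shared by two samples and b and c are the bands present in only one of them; shared absences are ignored. The band profiles are hypothetical toy data, not the scores from this study, and returning 0 for two profiles with no bands at all is a convention chosen here.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard coefficient for two equal-length 1/0 band profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of band positions")
    shared = sum(1 for x, y in zip(profile_a, profile_b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(profile_a, profile_b) if x == 1 or y == 1)
    # Convention (assumption): similarity is 0 when neither sample has any band.
    return shared / either if either else 0.0

# Hypothetical presence/absence scores for three samples at six band positions.
profiles = {
    "s1": [1, 1, 0, 1, 0, 1],
    "s2": [1, 0, 0, 1, 1, 1],
    "s3": [0, 1, 1, 0, 0, 1],
}

names = sorted(profiles)
similarity = {
    (p, q): jaccard_similarity(profiles[p], profiles[q])
    for p in names
    for q in names
}

for p in names:
    print(p, [round(similarity[(p, q)], 3) for q in names])
```

The resulting matrix is symmetric with ones on the diagonal; one minus the coefficient can be used as a dissimilarity for clustering.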
The relationships among the specimens were evaluated by building a dendrogram using the
UPGMA (Unweighted Pair Group Method Using Arithmetic Averages) algorithm.
The cophenetic correlation was calculated to measure the goodness of fit between the
dendrogram and the similarity matrix.
These statistical analyses were performed using NTSYS-pc version 2.1 (Rohlf, 2001), and the
bootstrap analysis (10,000 permutations) was performed using Bood version 3.01 (Coelho,
2000). Nei's genetic identity (Nei, 1978) and ΦST were calculated using TFPGA version
1.3 (Tools for Population Genetic Analyses) (Miller, 1997) and ARLEQUIN version 3.11
(Excoffier et al., 2005), respectively.
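
The clustering and goodness-of-fit steps can be sketched in a few lines of plain Python. This is not the NTSYS-pc implementation used in the study; it is a minimal illustration of UPGMA (average linkage) on a hypothetical four-sample distance matrix, taking one minus Jaccard's coefficient as the distance (an assumption), followed by the cophenetic correlation, i.e. the Pearson correlation between the original distances and the distances implied by the tree.

```python
import math
from itertools import combinations

def upgma(dist):
    """dist maps frozenset({a, b}) -> distance between leaves a and b.
    Returns (merges, cophenetic): the merge history and, for each leaf pair,
    the distance at which the two leaves are first joined."""
    leaves = sorted({leaf for pair in dist for leaf in pair})
    clusters = [(leaf,) for leaf in leaves]  # a cluster is a tuple of leaves
    between = {frozenset({(a,), (b,)}): dist[frozenset({a, b})]
               for a, b in combinations(leaves, 2)}
    merges, cophenetic = [], {}
    while len(clusters) > 1:
        # Join the closest pair of clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: between[frozenset(p)])
        d_ab = between[frozenset({a, b})]
        merges.append((a, b, d_ab))
        for x in a:
            for y in b:
                cophenetic[frozenset({x, y})] = d_ab
        merged = a + b
        clusters = [c for c in clusters if c not in (a, b)]
        for c in clusters:
            # Size-weighted mean = arithmetic average over all leaf pairs.
            between[frozenset({merged, c})] = (
                len(a) * between[frozenset({a, c})]
                + len(b) * between[frozenset({b, c})]
            ) / len(merged)
        clusters.append(merged)
    return merges, cophenetic

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical distances (1 - Jaccard) among four samples.
toy = {("A", "B"): 0.20, ("A", "C"): 0.60, ("A", "D"): 0.70,
       ("B", "C"): 0.65, ("B", "D"): 0.75, ("C", "D"): 0.30}
dist = {frozenset(pair): d for pair, d in toy.items()}

merges, cophenetic = upgma(dist)
pairs = list(dist)
r = pearson([dist[p] for p in pairs], [cophenetic[p] for p in pairs])

for a, b, d in merges:
    print(f"join {a} + {b} at distance {d:.3f}")
print(f"cophenetic correlation r = {r:.3f}")
```

In this toy case the tree reproduces the input distances closely, so r is high; values of r between 0.7 and 0.8 are conventionally read as a poor fit.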

Results and Discussion

In this study, ISSR markers were used to assess genetic variation and determine the
relationships between two sampling sites of Cypella fucata in RS, Brazil. Analyses
with other molecular markers, such as RAPD and PCR-RFLP, were also performed, but the
preliminary results were not satisfactory enough to be pursued. The present study is the first to survey
genetic variability at the species level in the genus Cypella. The nine primers analyzed
generated 201 bands, on average 22 bands per primer, ranging from 150 bp to over 2,000 bp
(Table 1).

Table 1. Sequence of the primers plus size and number of fragments amplified

Primer   Sequence    Size of fragments    Polymorphic   Unique fragments
                     amplified (bp)       fragments     ESC191   ESC192
P1       (AC)8T      400 to over 2,000    24            0        0
P2       (GA)8T      320 to over 2,000    52            1        0
P3       (CTC)4RC    150-2,000            22            2        2
P4       (CT)8G      230 to over 2,000    25            0        0
F3       (AG)8C      300-2,000            12            2        0
F5       (CA)8T      Not analyzed         0             0        0
F4       (GA)8C      390 to over 2,000    11            0        3
F7       (GT)8       400-1,000            9             1        0
F11      (GACA)4     400 to over 2,000    10            0        2
F12      (GTGC)4     300-2,000            20            2        1

Among the 201 bands amplified, 16 were unique, four of them obtained with
primer P3. This primer also showed the widest range of fragment sizes.
Of the 201 bands, 185 were polymorphic, a high percentage (92%). The number of
unique markers tends to be lower in young populations than in older ones
(Despres et al., 2002 apud Wróblewska and Brzosko, 2006), which suggests that the
two sample sites analyzed are young and may have been established
recently. All primers produced polymorphic bands, with an average of 22 ISSR markers
per primer being scored; the largest number (52) was obtained with primer P2, and
the lowest (9) with primer F7. Figures 2 and 3 show the amplification
pattern obtained with primer F4.
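
The summary figures quoted above can be recomputed from Table 1. The sketch below simply tallies the table columns for the nine analyzed primers (F5 is omitted because it was not analyzed); the dictionary layout and variable names are ours.

```python
# Per primer: (polymorphic fragments, unique in ESC191, unique in ESC192), from Table 1.
table1 = {
    "P1": (24, 0, 0),
    "P2": (52, 1, 0),
    "P3": (22, 2, 2),
    "P4": (25, 0, 0),
    "F3": (12, 2, 0),
    "F4": (11, 0, 3),
    "F7": (9, 1, 0),
    "F11": (10, 0, 2),
    "F12": (20, 2, 1),
}
total_bands = 201  # total number of bands reported in the text

polymorphic = sum(row[0] for row in table1.values())
unique = sum(row[1] + row[2] for row in table1.values())
percent_polymorphic = 100 * polymorphic / total_bands
mean_bands_per_primer = total_bands / len(table1)
most_polymorphic = max(table1, key=lambda p: table1[p][0])
least_polymorphic = min(table1, key=lambda p: table1[p][0])

print(f"polymorphic: {polymorphic} of {total_bands} ({percent_polymorphic:.0f}%)")
print(f"unique: {unique}; mean bands per primer: {mean_bands_per_primer:.1f}")
print(f"most polymorphic: {most_polymorphic}; least polymorphic: {least_polymorphic}")
```

The tallies agree with the text: 185 polymorphic bands (92% of 201), 16 unique bands (8 in each accession), and about 22 bands per primer.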

Legend: Specimens (a-z) and M (100-bp ladder).
Figure 2. Electrophoresis pattern obtained for ISSR primer F4 for accessions of Cypella fucata -
Population of Piratini.

Legend: Specimens (a-c1) and M (100-bp ladder).
Figure 3. Electrophoresis pattern obtained for ISSR primer F4 for accessions of Cypella fucata -
Population of Capão do Leão.
The UPGMA dendrogram (Figure 4) based on Jaccard's index showed a
poor fit (cophenetic correlation 0.7 ≤ r < 0.8). The tree distributes the samples into two large groups and
a third group of four isolated specimens. Each of the two main clusters is composed of
specimens from a single collection site. For three of the four isolated
individuals (a2, l2, b21), some primers failed to amplify, and these specimens
may also have contributed to the low bootstrap values. The fourth individual (c21) had
no missing data, but showed an amplification pattern that differed from
those of the other samples of its population.
Our data reveal high genetic diversity in both accessions and low differentiation
between them. ISSR markers proved to be a powerful tool for detecting the genetic
diversity of the accessions. Nei's genetic identity (Nei, 1978) between the sites
sampled was high (98%).
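
Nei's genetic identity I and genetic distance D are linked by the standard relation D = -ln(I). As a quick check, the short sketch below converts the 98% identity reported here into a distance; the variable names are illustrative.

```python
import math

identity = 0.98                   # Nei's genetic identity between the two sites
distance = -math.log(identity)    # Nei's genetic distance, D = -ln(I)

print(f"D = {distance:.4f}")
```

The result rounds to 0.02, in agreement with the genetic distance reported for the two sampling areas.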
In addition, according to the results obtained with Arlequin, 8.51% (ΦST = 0.0851, P < 0.001)
of the total variability was attributable to differences among populations, which corresponds
to an intermediate structure, so ESC 191 and ESC 192 may not correspond to two truly
separate populations. Fourteen groups had bootstrap values above 50% and are
shown in Figure 4. The lack of high bootstrap support for the remaining groups may be explained by
the high identity between the two sampled sites.
Since the surveyed sites presented an intermediate population structure in addition to a
very low genetic distance, our data may suggest that they correspond to two subpopulations
derived from a single original one. As the sampling areas were located along a country road,
22 km from each other, we can hypothesize that a former
population may once have covered the whole of the current distribution without any
gap. That original population might then have been affected by anthropogenic
processes, which in turn could have resulted in a spatial separation into two sub-areas of distribution,
since an incipient differentiation between the suggested subpopulations was observed in the
present study.
The estimated Nei's genetic distance between the two sampling areas was 0.02, and the
comparison between the Jaccard's coefficient matrix and the UPGMA clustering also
indicates that populations spatially near each other tend to be genetically similar (Zong et
al., 2008). This value indicates great similarity between the two
accessions studied. A similar result has been reported within Iridaceae, using RAPD
markers, in Iris aphylla L. (Wróblewska et al., 2003).
The low gene flow can be attributed to the difficulty pollinators face in traveling
long distances, such as the 22 km separating the accessions. This aspect deserves further study
because few data are available on breeding systems, seed dispersal and pollination
biology in Cypella.
According to the literature, with regard to the breeding system, outcrossing, clonal,
long-lived and endangered species exhibit high variation within populations, while
inter-population variation is rather low [e.g. in Adenophorus periens L.E. Bishop
(Grammitidaceae); Allium aaseae Ownbey (Liliaceae) (Smith and Pham, 1996) and
Polygonella Michx. (Polygonaceae) (Lewis and Crowford, 1995)]. Sometimes, however, the
reverse situation has been noted: populations of rare and geographically restricted plant
species with a similar life history revealed low levels of genetic diversity within populations,
e.g. in the glacial endemic Iris lacustris Nutt. (Iridaceae) (Hannan and Orick, 2000), Ranunculus
reptans L. (Ranunculaceae) (Fisher et al., 2000) and Amentotaxus formosana H.L. Li
(Taxaceae) (Wang et al., 1996). Many authors (Loveless and Hamrick, 1984; Hamrick and
Godt, 1989; Fisher et al., 2000) have suggested that the levels of genetic variability may be a
consequence of the breeding system. Even a low level of sexual reproduction favors high
genetic diversity within a population, and the longevity of a species further promotes the
exchange of genes among individuals of different generations (Brzosko et al., 2002a;
Brzosko et al., 2002b). On the other hand, rhizomatous growth may protect against loss of
genetic variability due to stochastic environmental perturbations (Arnold, 2000).

Figure 4. UPGMA dendrogram of individuals from two populations (ESC 191 and ESC 192) constructed with Jaccard's coefficient. Bootstrap values greater
than 50% are indicated. Legend: Red - ESC 191 - Population of Piratini. Blue - ESC 192 - Population of Capão do Leão. Green - specimens isolated from the two
populations. The names of the individuals are indicated on each branch.

Asexual reproduction (through bulbs, as in the species analyzed in the
present study) tends to occur when plants grow in disturbed areas (as a result of
anthropogenic activities, for example) or under environmental stress, a process that
would reduce the variability within the population (Zong et al., 2008). Although the
samples studied in the present work were collected near a road, genetic diversity is
high among individuals in each surveyed area; we may therefore suppose that sexual
reproduction is predominant in this case.
Holtsford and Ellstrand (1989) found a strong influence of the breeding system upon the
distribution of the genetic variation within and among the populations of the annual herb
Clarkia tembloriensis Vasek (Onagraceae). For this species, outcrossing populations had
greater genetic variation and a lower differentiation among populations than the group of
selfing plants.
Zong et al. (2008) suggested that there is a relation between population size and genetic
variability, proposing that large populations express high variation. This hypothesis could
also explain the high genetic diversity observed for both collection sites, considering that
sample sizes from both sites are large. Studies with populations of Dysosma pleiantha
(Hance) Woodson (Berberidaceae) in China show that large populations have a relatively
high level of genetic variation, indicating that the balance between the vegetative
reproduction and sexual reproduction is greater in favor of sexual reproduction in large
populations than in small populations. When populations are small and isolated from each
other, the genetic drift also influences the genetic structure and increases the differentiation
between populations (Ellstrand and Elam, 1993).
Concerning the low inter- and high intra-populational variation observed in our study, a
similar result has been found in one investigation on populations of three species of the genus
Lathyrus L. (Fabaceae). In that study, the authors found low differentiation among some
populations of different species as well as for populations of the same species. All species
present autogamous (or preferentially autogamous) plants and the low genetic distance found
among the populations may be due to a common domestication origin, since two of them,
Lathyrus sativus L. and L. cicera L., are crop species (Belaïd et al., 2006).
Devoto and Medan (2003), in a study on the effects of grazing disturbance on the
reproduction of Cypella herbertii, concluded that the species clearly exhibits a mixed-mating
system but has remarkably low spontaneous fruit formation. Although it is highly self-compatible,
it shows a significant decline at the seed-set stage when self-pollinated. Little is known about
the reproductive biology of Cypella species, but our results indicate the occurrence of sexual
reproduction, possibly by cross-fertilization, since the collection sites presented high genetic
variability.
Studies of species biology are a helpful tool for informing the conservation
of those species. Places disturbed by the presence of livestock or intense vehicle traffic,
such as where the sample collections of the present study were performed, might influence
the reproductive biology. Presence of grazing, for example, may restrict the occurrence of
nesting sites of the pollinators, resulting in high self-fertilization indexes, which directly
relates to the reduction in variability (Devoto and Medan, 2003). The sampled sites showed a
great genetic variability, but future generations of plants could present a different situation as
a result of the human interaction.
As a great number of species of Iridaceae are characterized by the presence of bulbs, this
morphological trait always allows for vegetative reproduction. Although
vegetative propagation reduces the variability among individuals, it can compensate for the lack
of a pollinator and for low reproductive success through self-fertilization, thus promoting the
conservation of genetic diversity and preventing the elimination of individuals.
The nine ISSR primers used in the present study could represent only a few repetitive
DNA regions in the C. fucata genome. At the moment, no information is available about the
proportion of the genome that is covered with such microsatellites or about their genomic
distribution. Hence, the genetic relationship determined based on these nine ISSR primers
would mainly reflect the diversity of such populations (Kumar et al., 2001). Despite having
assessed a small fragment of the complexity of the genome, this study showed the high
diversity found in the composition of the populations and will contribute to future research
involving the species, genus or family.
Assessment of genetic relationships among populations could also be affected by the
number of markers used and their distribution in the genome (Nei, 1978). It has been
recommended that at least 50 different polymorphic loci should be used for a precise
estimation of genetic distance. In the present study, nine ISSR primers generated 185
polymorphic loci. For more precise estimates of genetic distance, ideally, markers that are
randomly distributed and span the whole genome should be selected (Kumar et al., 2001).
Henderson (1976) emphasized the importance of using a large sample of individuals, as
the identification can be difficult when there is a wide intrapopulational variation
encompassing various external morphological characters, which is an obstacle to the clear
identification of species. In the present work, a high variability among individuals was
observed, but in this case an increase in sample sizes would retain the same pattern of
diversity.

Conclusion

The present study is the first to address divergence within the genus Cypella. Our
results indicate that the two sample sites evaluated present high variability within each
accession and high identity between the two sites. This study improves the
knowledge of evolutionary aspects of Cypella fucata and provides new information on
Cypella as a genus.
The ISSR markers used to explore the variability of this taxonomic group were shown to be
very useful, since a high number of markers was obtained. These markers allowed the
formulation of a hypothesis about population structure and variation within the species.
Further research into floral biology and morphological and genetic
variability would contribute to the conservation of Cypella fucata.

References

Ajibade, S. R.; Weeden, N. F. and Chite, S. M. (2000). Inter-simple sequence repeat analysis
of genetic relationships in the genus Vigna. Euphytica, 111, 47-55.
APG II. (2003). An update of the Angiosperm Phylogeny Group classification for the orders
and families of flowering plants: APG II. Botanical Journal of the Linnean Society, 141,
399-436.
Arnold, M. L. (2000). Anderson's paradigm: Louisiana Irises and the study of evolutionary
phenomena. Molecular Ecology, 9, 1687-1698.
Belaïd, Y.; Chtourou-Ghorbel, N.; Marrakchi, M. and Trifi, N. (2006). Genetic diversity
within and between populations of Lathyrus genus (Fabaceae) revealed by ISSR markers.
Genetic Resources and Crop Evolution, 53, 1413-1418.
Brantestam, A. K.; Bothmer, R. V.; Dayteg, C.; Rashal, I.; Tuvesson, S. and Weibull, J.
(2004). Inter simple sequence repeat analysis of genetic diversity and relationships in
cultivated barley of Nordic and Baltic origin. Hereditas, 141(2), 186-187.
Brzosko, E.; Ratkiewicz, M. and Wróblewska, A. (2002a). Allozyme diversity in island
population of Cypripedium calceolus in the Biebrza Valley. Botanical Journal of the
Linnean Society, 138, 433-440.
Brzosko, E.; Wróblewska, A. and Ratkiewicz, M. (2002b). Allozyme differentiation and
spatial clonal structure in isolated Cypripedium calceolus populations (NE Poland).
Molecular Ecology, 11, 2499-2509.
Camacho, F. J. and Liston, A. (2001). Population structure and genetic diversity of
Botrychium pumicola (Ophioglossaceae) based on inter-simple sequence repeats (ISSR).
American Journal of Botany, 88, 1065-1070.
Coelho, A. S. G. (2000). BOOD: avaliação de dendrogramas baseados em estimativas de
distâncias/similaridades genéticas através do procedimento de bootstrap. Goiânia: UFG.
Colasante, M.; Cozzolino, S. and Tarquini, F. (2008). Intrapopulation polymorphism in
Romulea bulbocodium. In: Abstracts the Fourth International Conference: The
Comparative Biology of the Monocotyledons. Copenhagen, Denmark, 74.
Dahlgren, R. M. T.; Clifford, H. T. and Yeo, P. F. (1985). The families of the
Monocotyledons - structure evolution and taxonomy. Berlin, Heidelberg, New York:
Springer.
Despres, L.; Loriot, S. and Gaudeul, M. (2002). Geographic patterns of genetic variation in
the European globeflowers Trollius europaeus L. (Ranunculaceae) inferred from
amplified fragment length polymorphism markers. Molecular Ecology, 11, 2337-2347.
Devoto, M. and Medan, D. (2003). Effects of grazing disturbance on the reproduction of a
perennial herb, Cypella herbertii (Lindl.) Herb. (Iridaceae). Plant Systematics and
Evolution, 243, 165-173.
Doyle, J. J. and Doyle, J. L. (1987). A rapid DNA isolation procedure for small quantities of
fresh leaf tissue. Phytochemical Bulletin, 19, 11-15.
Ellstrand, N. C. and Elam, D. R. (1993). Population genetics consequences of small
population size: implications for plant conservation. Annual Review of Ecology and
Systematics, 24, 217-242.
Excoffier, L.; Laval, G. and Schneider, S. (2005). Arlequin v. 3.0: An integrated software
package for population genetics data analysis. Evolutionary Bioinformatics Online, 1,
47-50.
Fay, M. F. (2000). Phylogenetic studies of Asparagales based on four plastid DNA regions.
In: K. L. Wilson and D. A. Morrison, Monocots: Systematics and evolution, Royal
Botanic Gardens, 360-371. Kollingwood, Australia: CSIRO.
Fisher, M.; Husi, R.; Prati, D.; Pentinger, M.; Kleunen, M. and Schmid, B. (2000). RAPD
variation among and within small and large populations of the rare clonal plant
Ranunculus reptans (Ranunculaceae). American Journal of Botany, 87, 1128-1137.
Goldblatt, P. (1990). Phylogeny and classification of Iridaceae. Annals of the Missouri
Botanical Garden, 77, 607-627.
Goldblatt, P.; Manning, J. C. and Bernhardt, P. (1995). Pollination in Lapeirousia subgenus
Lapeirousia (Iridaceae: Ixioideae). Annals of the Missouri Botanical Garden, 82,
517-534.
Goldblatt, P.; Manning, J. C. and Rudall, P. (1998). Iridaceae. In: Kubitzki, K. The families
and genera of vascular plants. Berlin: Springer, 295-333.
Goldblatt, P.; Rodriguez, A.; Powell, M. P.; Davies, T. J.; Manning, J. C.; Van der Bank, M.
and Savolainen, V. (2008). Iridaceae Out of Australasia? Phylogeny, biogeography,
and divergence time based on plastid DNA sequences. Systematic Botany, 33(3), 495-
508.
Gupta, P. K. and Varshney, R. K. (2000). The development and use of microsatellite markers
for genetic analysis and plant breeding with emphasis on bread wheat. Euphytica, 113,
163-185.
Hamrick, J. L. and Godt, M. J. W. (1989). Allozyme diversity in plant species. In: Brown, H.
D.; Clegg, M. T.; Kahler, A. L. and Weir, B. S. [eds.] Plant population genetics,
breeding and genetic resources, 43-63. Sunderland: Sinauer Associates.
Hannan, G. L. and Orick, M. W. (2000). Isozymes diversity in Iris cristata and the threatened
glacial endemic I. lacustris (Iridaceae). American Journal of Botany, 87, 293-301.
Henderson, C. R. (1976). A simple method for computing the inverse of a numerator
relationship matrix used in prediction of breeding values. Biometrics, 32, 69.
Holtsford, T. P. and Ellstrand, N. C. (1989). Variation in outcrossing rate and population
genetic structure of Clarkia tembloriensis (Onagraceae). Theoretical and Applied
Genetics, 78, 480-488.
Howard, T. M. (2001). Bulbs for warm climates. Austin: University of Texas Press. 171-195.
Innes, C. (1985). The world of Iridaceae - a comprehensive record. Holly Gate Internacional
Ltda. England.
Joshi, S. P.; Gupta, V. S.; Aggarwal, R. K.; Ranjeka, P. K. and Brar, D. S. (2000). Genetic
diversity and phylogenetic relationship as revealed by inter-simple sequence repeat
(ISSR) polymorphism in the genus Oryza. Theoretical and Applied Genetics, 100,
1311-1320.
Kumar, L.; Sawant, A.; Gupta, V. and Ranjekar, P. (2001). Comparative Analysis of Genetic
Diversity Among Indian Populations of Scirpophaga incertulas by ISSR-PCR and
RAPD-PCR. Biochemical Genetics, 39, 9-10.
Lewis, P. O. and Crawford, D. J. (1995). Pleistocene refugium endemics exhibit greater
allozymic diversity than widespread congeners in the genus Polygonella (Polygonaceae).
American Journal of Botany, 82, 141-149.
Li, A. and Ge, S. (2001). Genetic variation and clonal diversity of Psammochloa villosa
(Poaceae) detected by ISSR markers. Annals of Botany, 87, 585-590.
Lombardo, A. (1984). Flora Montevidensis. Montevideo: Intendencia Municipal de
Montevideo, 387-398.
Loveless, M. D. and Hamrick, J. L. (1984). Ecological determinants of genetic structure in
plant populations. Annual Review of Ecology and Systematics, 15, 65-95.
Manning, J. C. and Goldblatt, P. (2001). A synoptic review of Romulea (Iridaceae:
Crocoideae) in sub-Saharan Africa, The Arabian Peninsula and Socotra including new
species, biological notes, and a new infrageneric classification. Adansonia, 23, 59-108.
Medraoui, L.; Ater, M.; Benlhabib, O.; Msikine, D. and Filali-Maltouf, A. (2007). Evaluation
of genetic variability of sorghum (Sorghum bicolor L. Moench) in northwestern Morocco
by ISSR and RAPD markers. Comptes Rendus Biologies, 330, 789-797.
Miller, M. P. (1997). Tools for Population Genetic Analyses (TFPGA), Version 1.3. A
windows program for the analysis of allozyme and molecular population genetic data.
Computer software distributed by author.
Nei, M. (1978). Estimation of average heterozygosity and genetic distance from a small
number of individuals. Genetics, 89, 583-590.
Ravenna, P. (1968). Iridaceae. In: Cabrera, A.L. Flora de la Provincia de Buenos Aires.
Buenos Aires: INTA, 539-565.
Ravenna, P. (1981). Kellissa, a new genus of Iridaceae from South Brazil. Bulletin du
Muséum National d'Histoire Naturelle, sér. B, Adansonia, 3, 105-110.
Ravenna, P. (1981a). A submerged new species of Cypella (Iridaceae), and a new section for
the genus (s.str.). Nordic Journal of Botany, 1, 489-492.
Ravenna, P. (1981b). Eight new species and two new subspecies of Cypella (Iridaceae).
Wrightia, 7, 13-22.
Ravenna, P. (1983). Catila and Onira, two new genera of South American Iridaceae. Nordic
Journal of Botany, 3(2), 197-205.
Ravenna, P. (2005). New species of South American bulbous Iridaceae. Onira, Botanical
Leaflets, 10 (13), 39-45.
Reddy, M. P.; Sarla, N. and Siddiq, E. A. (2002). Inter simple sequence repeat (ISSR)
polymorphism and its application in plant breeding. Euphytica, 128, 9-17.
Reeves, G.; Chase, M.; Goldblatt, P.; Rudall, P.; Fay, M.; Cox, A.; Lejeune, B. and Souza-
Chies, T. (2001). Molecular systematics of Iridaceae: Evidence from four plastid DNA
regions. American Journal of Botany, 88, 2074-2087.
Rohlf, F. J. (2001). NTSYSpc: Numerical Taxonomy and Multivariate Analysis System,
Version 2.1, Exeter Software, Setauket, New York.
Smith, J. F. and Pham, T. V. (1996). Genetic diversity of the narrow endemic Allium aaseae
(Alliaceae). American Journal of Botany, 83, 717-726.
Souframanien, J. and Gopalakrishna, T. (2004). A comparative analysis of genetic diversity
in blackgram genotypes using RAPD and ISSR markers. Theoretical and Applied
Genetics, 109, 1687-1693.
Souza-Chies, T. T.; Bittar, G.; Nadot, S.; Carter, L.; Besin, E. and Lejeune, B. (1997).
Phylogenetic analysis of Iridaceae with parsimony and distance methods using the plastid
gene rps4. Plant Systematics and Evolution, 204, 109-123.
Spier, F. F.; Tacuati, L. O.; Agostini, G.; Eggers, L. and Souza-Chies, T. T. (2008). Uso de
marcadores PCR-RFLP como ferramenta na identificação de espécies da subfamília
Iridoideae (Iridaceae) presentes no Parque Estadual de Itapuã, Viamão, RS, Brasil.
Brazilian Journal of Biosciences, 6, 159-165.
Staub, J. E.; Serquen, F. C. and Gupta, M. (1996). Genetic markers, map construction, and
their application in plant breeding. HortScience, 31, 729-739.
Stevenson, D. W. and Loconte, H. (1995). Cladistic analysis of monocot families. In: Rudall,
P. J.; Cribb, P. J.; Cutler, D. F. Monocotyledons: Systematics and evolution. Royal
Botanic Gardens, 543-578.
Tian, H.; Xue, J.; Wen, J.; Mitchell, G. and Zhou, S. (2008). Genetic diversity and
relationships of lotus (Nelumbo) cultivars based on allozyme and ISSR markers. Scientia
Horticulturae, 116, 421-429.
Wang, C. T.; Wang, W. Y.; Chiang, C. H.; Wang, Y. N. and Lin, T. P. (1996). Low genetic
variation in Amentotaxus formosana Li revealed by isozyme analysis and random
amplified polymorphic DNA markers. Heredity, 77, 388-395.
Wolfe, A. D. and Liston, A. (1998). Contributions of PCR-Based Methods to Plant
Systematics and Evolutionary Biology. In: Soltis, D. E.; Soltis, P. S.; Doyle, J. J. eds.
Molecular Systematics of Plant II: DNA Sequencing. New York: Chapman and Hall.
Wróblewska, A. and Brzosko, E. (2006). The genetic structure of the steppe plant Iris aphylla
L. at the northern limit of its geographical range. Botanical Journal of the Linnean
Society, 152, 245-255.
Wróblewska, A.; Brzosko, E.; Czarnecka, B. and Nowosielski, J. (2003). High levels of
genetic diversity in populations of Iris aphylla L. (Iridaceae), an endangered species in
Poland. Botanical Journal of the Linnean Society, 142, 65-72.
Wu, K.; Jones, R.; Dannaeberger, L. and Scolnik, P. A. (1994). Detection of microsatellite
polymorphisms without cloning. Nucleic Acids Research, 22, 3257-3258.
Xu, F. and Sun, M. (2001). Comparative analysis of phylogenetic relationships of grain
amaranths and their wild relatives (Amaranthus; Amaranthaceae) using internal
transcribed Spacer, Amplified Fragment Length Polymorphism, and Double-Primer
Fluorescent Intersimple Sequence Repeat Markers. Molecular Phylogenetics and
Evolution, 21, 372-387.
Yockteng, R.; Ballard, H.; Mansion, G.; Dajoz, I. and Nadot, S. (2003). Relationship among
pansies (Viola section Melanium) investigated using ITS and ISSR markers. Plant
Systematics and Evolution, 241, 153-170.
Zietkiewicz, E.; Rafalski, A. and Labuda, D. (1994). Genome fingerprinting by simple
sequence repeat (SSR) anchored polymerase chain reaction amplification. Genomics,
20, 176-183.
Zong, M.; Liu, H.; Qiu, Y.; Yang, S.; Zhao, M. and Fu, C. (2008). Genetic Diversity and
Geographic Differentiation in the Threatened Species Dysosma pleiantha in China as
Revealed by ISSR Analysis. Biochemical Genetics, 46, 180-196.


Chapter 8

Genetic and Functional Diversity of
Phosphate Solubilizing Fluorescent
Pseudomonads and Their Simultaneous
Role in Promotion of Plant Growth and
Soil Health

K. Badri Narayanan, M. Jaharamma, G. Raman and N. Sakthivel

Department of Biotechnology, School of Life Sciences, Pondicherry University, Kalapet,
Puducherry 605014, India

Abstract

Soil microbes that solubilize insoluble phosphates play a vital role in maintaining
soil fertility and plant health and in the subsequent enhancement of crop yield. The fluorescent
pseudomonad group of bacteria is often predominant among bacterial species associated
with the plant rhizosphere. Due to their innate capability for plant growth promotion and
plant disease suppression, and their potential for biodegradation of agricultural chemical
pollutants, fluorescent pseudomonads have been a major focus for investigators around
the world. In recent years, rich knowledge has been generated on the diversity and functional
potential of fluorescent pseudomonads. This chapter describes the genetic and functional
diversity of fluorescent pseudomonads and their role in phosphate solubilization,
biological control and soil fertility.

Corresponding author: Dr. N. Sakthivel, Professor and Head. E-mail address: puns2005@gmail.com.
Tel: +91-413-2654430; fax: +91-413-2655255
1. Introduction

Phosphorus is a major essential macronutrient for plants, and its deficiency is a severe
constraint on crop production. Plants absorb only inorganic forms of phosphorus, and the level
of inorganic phosphorus in soil is very low because most phosphorus is present
in insoluble mineral forms such as hydroxyapatites, oxyapatites and apatites. In ferralitic soils,
mineral phosphates are associated with the poorly soluble, unassimilable forms of
hydrated oxides of Fe, Al and Mn. In acid soils phosphorus is fixed by free oxides and
hydroxides of Al and Fe, whereas in alkaline soils it is fixed by Ca, which
results in the poor solubility of phosphorus fertilizers. The level of available phosphorus in
soil is reported to be less than 1 ppm (Goldstein, 1994). Phytopathogens reduce crop yield all
over the world, and chemical control is one of the major approaches to
disease control. The overuse of chemical pesticides and chemical fertilizers in agricultural
soils has had lethal consequences for useful arthropods and other beneficial microbes, and
has led to soil pollution and the accumulation of phosphorus in insoluble form (Rodriguez
and Fraga, 1999). Regular application of chemical pesticides and fertilizers has escalated soil
problems such as structural degradation, reduction of organic matter and soil colloidal content,
and accumulation of agricultural residues and phosphorus in soil. Crop residues consisting
of cellulose and hemicellulose contain 53-75% carbohydrate. Cellulose is a polymer of
glucose, and hemicellulose consists of xylose, arabinose, glucose, galactose and mannose
(Mosier et al. 2005). Phosphate solubilizing bacteria are known to utilize the carbohydrates of
crop residues. Therefore, phosphate solubilizing bacteria with multiple functional traits such
as plant growth promotion and crop protection are preferred to enhance the mineralization and
decomposition of crop residues. The fluorescent pseudomonad group of bacteria, with its innate
ability to promote growth, control diseases and degrade pollutants, may play a vital role in
promoting agricultural yield.

2. Microbial Phosphate Solubilization

Soil microorganisms can solubilize insoluble phosphates and thereby
improve soil health and fertility. Phosphate solubilizing microbes have been
reported to promote plant growth and enhance yield (Kapoor et al. 1989; Rodriguez and
Fraga, 1999). Evaluation of their potential to mobilize soil phosphate has been a subject of
interest for soil microbiologists (Rodriguez and Fraga, 1999; Whitelaw, 2000).
Microorganisms play an essential role in making soil phosphate available to the
root system and enhance the mobilization of phosphate in soil (Richardson, 2001). The efficacy
of phosphate solubilizing microorganisms has been assessed on the basis of kinetics and
phosphorus accumulation.
Fungal genera such as Aspergillus, Penicillium and Rhizopus have been reported for
phosphate solubilization (Rodriguez and Fraga, 1999; Whitelaw, 2000; Vassilev and
Vassileva, 2003). The phosphate uptake by plants and subsequent growth promotion in plant-
soil systems are more pronounced when phosphate solubilizing microorganisms are co-
inoculated with arbuscular mycorrhizal fungi that offer beneficial symbiosis with plant roots
(Smith and Read, 1997). Phosphate solubilizing fungi have been reported to solubilize 9.0 to
34.6% of total phosphate in synthetic medium (Narsian et al. 1994). These fungi are capable
of mobilizing soluble inorganic phosphate by the excretion of H+ after the utilization of
ammonium by the hyphae (Yao et al. 2001). Several bacterial genera such as Pseudomonas,
Azospirillum, Bacillus, Rhizobium, Burkholderia, Alcaligenes, Serratia, Enterobacter,
Acinetobacter, Flavobacterium and Erwinia have been reported to solubilize tricalcium
phosphate, dicalcium phosphate, hydroxyapatite and rock phosphates (Rodriguez et al. 1996;
Goldstein, 1986). Among these phosphate solubilizing bacteria, plant rhizosphere associated
fluorescent pseudomonads are considered important due to their simultaneous potential of
plant growth promotion, biocontrol of pathogens and biodegradation of soil pollutants (Bano
and Musarrat, 2004; O'Sullivan and O'Gara, 1992; Ravindra Naik and Sakthivel, 2006).

3. Microbial Mechanisms That Mediate
Phosphate Solubilization

Phosphate mineralization from non-specific substrates by several bacterial genera such as
Pseudomonas, Burkholderia, Enterobacter, Citrobacter, Proteus and Serratia, phosphate
mineralization from inositol phosphate by Bacillus and Pseudomonas, and
mineralization of phosphate from phosphonoacetate by Pseudomonas have been reported. The
production of organic acids by these bacteria seems to be the main cause of phosphate
solubilization. These bacteria produce monocarboxylic (acetic, formic),
dicarboxylic (oxalic, succinic) and tricarboxylic hydroxy (citric) acids in liquid media. The role
of organic acids in dissolving mineral phosphates and phosphorylated minerals can be
attributed to the lowering of pH and to the formation of stable complexes with
cations such as Ca2+, Mg2+, Fe3+ and Al3+. Phosphatase and phytase enzymes secreted by
these bacteria play an essential role in phosphate solubilization because of the predominant
presence of their substrates in soil.
Phosphorus can be released from organic compounds in soil by three groups
of enzymes: nonspecific phosphatases, which dephosphorylate
phospho-ester or phospho-anhydride bonds in organic matter; phytases, which
specifically release phosphorus from phytic acid; and phosphonoacetases
and carbon-phosphorus (C-P) lyases, which cleave the C-P bond in
organophosphonates. Secretion of organic acids and phosphatase enzymes has been
reported as a common mechanism that facilitates the conversion of insoluble forms of
phosphorus to accessible forms such as orthophosphate. Although several phosphate
solubilizing bacteria occur in soil, their numbers are usually not sufficient to compete
with other rhizobacteria. Therefore, inoculation of plants with a target microorganism at a
much higher concentration is necessary. It has been reported that phosphate mineralization by
fluorescent pseudomonads is mediated by the enzymes acid
phosphatase and phosphonoacetate hydrolase in Pseudomonas fluorescens strains and by the phytase
enzyme in P. putida. At least 30-48% of culturable soil microbes utilize phytate (Greaves
and Webley, 1965). This suggests that phytase producing P. putida and
phosphatase producing fluorescent pseudomonad strains may play an efficient role in
phosphate solubilization. In agricultural fields, most soils are in the acidic to neutral
pH range and therefore acid phosphatase producing fluorescent pseudomonads are considered
important microbial inoculants for enhancing soil fertility.

4. Microbial Diversity of Phosphate Solubilizing
Fluorescent Pseudomonads

Plant growth promoting bacteria such as fluorescent pseudomonads and rhizobia are able
to solubilize both organic phosphates (Abd-Alla, 1994) and inorganic phosphates (Antoun et
al. 1998). The advantage of using this group of bacteria, with its other innate biocontrol and
biofertilizing traits, is the combined beneficial effect resulting from
phosphate mobilization, nitrogen fixation and suppression of phytopathogens. These bacteria
are well documented for their synergistic interactions with arbuscular mycorrhizal fungi (Peix
et al. 2001). Several species of fluorescent pseudomonads such as P. fluorescens NJ-101
(Bano and Musarrat, 2004), P. fluorescens EM85 (Dey et al. 2004), P. aeruginosa (Bano and
Musarrat, 2004), Pseudomonas sp. (Lehinos and Vacek, 1994; Lehinos, 1994), P.
chlororaphis, P. savastanoi, P. pickettii (Cattelan et al. 1999), P. lutea OK2 (Peix et al.
2004), P. rhizosphaerae LMG21640, P. graminis DSM11363 (Peix et al. 2003), P. striata
(Gaind and Gaur, 2002) and P. corrugata (Pandey and Palni, 1998) have been reported as
efficient phosphate solubilizers. It has been reported that 18% of fluorescent pseudomonad
strains tested positive for the solubilization of tri-calcium phosphate (Ca3(PO4)2) by the
formation of visible dissolution halos on Pikovskaya's agar (Ravindra Naik et al. 2008). A
high degree of genetic diversity among phosphate solubilizing fluorescent pseudomonad
strains has been reported on the basis of phenotypic characterization and 16S rRNA gene
phylogenetic analyses. All these phosphate solubilizing fluorescent pseudomonad strains
have been identified as P. aeruginosa, P. mosselii, P. monteilii, P. plecoglossicida, P. putida,
P. fulva and P. fluorescens (Ravindra Naik et al. 2008).

4.1. Functional Diversity of Phosphate Solubilizing Fluorescent
Pseudomonads

4.1.1. Plant Growth Promoting Traits

4.1.1.1. Siderophore
Siderophore production by fluorescent pseudomonads is determined using the FeCl3 test
(Neilands, 1995) and the chrome azurol S (CAS) agar assay (Schwyn and Neilands, 1987).
Bacterial suspensions are dropped onto the center of a CAS plate and, after incubation at 25°C
for 3 days, siderophore production is assessed on the basis of a change in the colour of the medium
from blue to orange. Fluorescent pseudomonads are known to produce hydroxamate-type
siderophores.

4.1.1.2. Protease
Protease production by fluorescent pseudomonads is determined using skim milk agar medium.
Bacterial cells are spot inoculated and, after 2 days of incubation at 28°C, proteolytic activity
is identified by the formation of clear zones around the cells (Smibert and Krieg, 1994). Several
species of fluorescent pseudomonads have been reported to produce protease.

4.1.1.3. Indole-3-Acetic Acid
The production of the phytohormone indole-3-acetic acid (IAA) is determined using a
standard method (Bric et al. 1991). A single colony is streaked onto Luria-Bertani (LB) agar
amended with 5 mM L-tryptophan, 0.06% sodium dodecyl sulphate and 1% glycerol.
Bioassay plates are overlaid with Whatman no. 1 filter paper and the bacterial strain is
allowed to grow for a period of 3 days. After the incubation period, the paper is removed and
treated with Salkowski's reagent (Gordon and Weber, 1951), formulated as 2% of 0.5
M ferric chloride in 35% perchloric acid. Membranes are saturated in a Petri dish by soaking
directly in Salkowski's reagent and the production of IAA is identified by the formation of a
characteristic red halo within the membrane immediately surrounding the colony.
Quantification of IAA is done by the colorimetric method described earlier (Patten
and Glick, 2002). A single colony of the bacterium is propagated overnight in 5 ml of DF
minimal salt medium and an aliquot of bacterial suspension is transferred into 5 ml of DF
minimal salt medium amended with 500 µg/ml of L-tryptophan. After 40 hours of growth at
25°C in a rotary shaker at 180 rpm, the cells are removed by centrifugation. Salkowski's
reagent is added to the supernatant, mixed well and allowed to stand at room
temperature for 20 minutes. The absorbance is measured in a spectrophotometer at 535 nm.
An un-inoculated control with Salkowski's reagent is used as reference. The concentration of
IAA is determined by comparison with the standard curve. Selected strains of phosphate
solubilizing fluorescent pseudomonads have also been reported for the production of IAA.

4.1.1.4. 1-Aminocyclopropane-1-carboxylate (ACC) Deaminase
The ACC deaminase activity of fluorescent pseudomonads is assessed using DF salts
medium (Dworkin and Foster, 1958). A solution of ACC (3 mM) is filter sterilized through
a 0.2 µm membrane (Millipore), spread over the agar plates, allowed to dry for 10
minutes and inoculated with bacteria. Observations are made after 2 days of incubation at 28°C.
Growth on the medium is considered positive for ACC deaminase production (Penrose and
Glick, 2002). Strains of fluorescent pseudomonads have been reported to show ACC deaminase
activity.

4.1.1.5. N-Acyl Homoserine Lactone (AHL)
Production of AHL by fluorescent pseudomonads is screened as described by Molina et
al. (2003). Briefly, the AHL biosensor Chromobacterium violaceum CV026 is streaked in a line
on plates of LB agar. The AHL donor Erwinia carotovora is used as a control and applied in
spots 16 mm from the C. violaceum CV026 line. A test strain is spotted between the
biosensor and the AHL donor, at a distance of 6 mm from the C. violaceum CV026 line. Plates
are incubated at 28°C for 2 days. Migration of AHL from the donor E. carotovora and the test
bacterium is confirmed by the production of the purple-pigmented antibiotic violacein in the
biosensor.

4.1.2. Biocontrol Traits

4.1.2.1. Antagonism
Phosphate solubilizing strains of fluorescent pseudomonads are tested for in vitro
antagonism towards plant pathogens by the standard co-inoculation technique on
potato dextrose agar or nutrient agar (Sakthivel and Gnanamanickam, 1987; Ayyadurai et al.
2007). Briefly, bacterial plugs are removed from a 48 hour culture and transferred to the
centre of potato dextrose agar or nutrient agar plates that have been inoculated with a fungal
spore or bacterial cell suspension. Assay plates are incubated at 28°C to 42°C for 2 to 3 days,
the growth-inhibition zones that appear around the plugs are measured, and strains producing
such zones are identified as antagonists.

4.1.2.2. Fungal Cell Wall Degrading Enzyme
The production of the fungal cell wall degrading enzyme chitinase by fluorescent
pseudomonads can be tested on chitin agar medium (Renwick et al. 1991). Briefly, overnight
grown bacterial cells are spot inoculated onto chitin agar plates and, after 5 days of incubation
at 30°C, chitinase activity is identified by the formation of clear zones around the cells. In a
recent study from our laboratory, 9% of the strains of phosphate solubilizing fluorescent
pseudomonads showed chitinase activity (Ravindra Naik et al. 2008).

4.1.2.3. Production of Enzymes for Decomposition of Crop Residues

4.1.2.3.1. Cellulase
Strains are screened for cellulase production by plating onto M9 medium agar amended
with 10 g of cellulose and 1.2 g of yeast extract per litre. After 8 days of incubation at 28°C,
colonies surrounded by clear halos are considered positive for cellulase production (Cattelan
et al. 1999). Cellulase production by selected strains of phosphate solubilizing fluorescent
pseudomonads has been reported (Ravindra Naik et al. 2008).

4.1.2.3.2. Pectinase
Pectinase production is determined using M9 medium amended with 4.8 g of pectin per
litre. After 2 days of incubation at 28°C, plates are flooded with 2 mol/l HCl and strains
surrounded by clear halos are considered positive for pectinase production (Cattelan et al. 1999).
Pectinase production by selected strains of phosphate solubilizing fluorescent pseudomonads
has been reported (Ravindra Naik et al. 2008).

4.1.2.4. Rapid Detection of Antibiotic Genes and Antimicrobial Metabolites
Detection of the genes that encode the known antimicrobial metabolites of fluorescent
pseudomonads, such as 2,4-diacetylphloroglucinol (DAPG), phenazine-1-carboxylic acid
(PCA), phenazine-1-carboxamide (PCN), pyrrolnitrin (PRN), pyoluteorin (PLT) and
hydrogen cyanide (hcnBC), is done by PCR using gene-specific primer pairs: Phl2a
(5'-GAGGACGTCGAAGACCACCA-3') and Phl2b (5'-ACCGCAGCATCGTGTATGAG-3')
for DAPG (Mavrodi et al. 2001b); PCA2a (5'-TGCCAAGCCTCGCTCCAAC-3') and
PCA3b (5'-CCCGTTTCAGTAAGTCTTCCATGATGCG-3') for PCA (Raaijmakers et al.
1997); PhzHup (5'-CGCACGGATCCTTTCAGAATGTTC-3') and PhzHlow
(5'-GCCACGCCAAGCTTCACGCTCA-3') for PCN (Mavrodi et al. 2001a); Prncf
(5'-CCACAAGCCCGGCCAGGAGC-3') and Prncr (5'-GAGAAGAGCGGGTCGATGAAGCC-3')
for PRN; PltBf (5'-CGGAGCATGGACCCCCAGC-3') and PltBr
(5'-GTGCCCGATATTGGTCTTGACCGAG-3') for PLT (Mavrodi et al. 2001b); and ACa
(5'-ACTGCCAGGGGCGGATGTGC-3') and ACb (5'-ACGATGTGCTCGGCGTAC-3') for
hcnBC (Ramette et al. 2003). To test the production of antifungal metabolites, strains
are inoculated into fermentation media. Production of DAPG and PCA is tested by growing
bacteria in pigment producing medium (per litre: 20 g peptone, 20 ml glycerol, 5 g
NaCl and 1 g KNO3, pH 7.2) for 5 days at 28°C (Gurusiddaiah et al. 1986). Production of
PLT is tested by growing bacteria in KB broth for 14 days at 25°C (de Souza and
Raaijmakers, 2003). PRN production is tested by growing bacteria in a minimal medium (per
litre: 30 g glycerol, 3 g K2HPO4, 0.5 g KH2PO4, 5 g NaCl, 0.5 g MgSO4·7H2O, 0.35
mM ZnSO4, 0.5 mM Mo7(NH4)6O24·H2O and 0.61 g D-tryptophan) for 24 hours at 25°C,
followed by incubation at 25°C in the dark for 4 days (de Souza and Raaijmakers, 2003). The
culture supernatants are extracted with ethyl acetate and the production of antibiotics is
verified by analytical HPLC as described (Sunish Kumar et al. 2005; Ayyadurai et al. 2007).
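
Before ordering or pooling the primer pairs listed above, it can be useful to tabulate their
length and GC content. The following is a minimal Python sketch, not part of the cited
protocols: the sequences are copied from the text, while the dictionary layout and the function
name gc_content are illustrative assumptions.

```python
# Primer sequences as given in the text; grouping by target metabolite is for illustration.
PRIMERS = {
    "DAPG": {"Phl2a": "GAGGACGTCGAAGACCACCA", "Phl2b": "ACCGCAGCATCGTGTATGAG"},
    "PCA": {"PCA2a": "TGCCAAGCCTCGCTCCAAC", "PCA3b": "CCCGTTTCAGTAAGTCTTCCATGATGCG"},
    "PCN": {"PhzHup": "CGCACGGATCCTTTCAGAATGTTC", "PhzHlow": "GCCACGCCAAGCTTCACGCTCA"},
    "PRN": {"Prncf": "CCACAAGCCCGGCCAGGAGC", "Prncr": "GAGAAGAGCGGGTCGATGAAGCC"},
    "PLT": {"PltBf": "CGGAGCATGGACCCCCAGC", "PltBr": "GTGCCCGATATTGGTCTTGACCGAG"},
    "hcnBC": {"ACa": "ACTGCCAGGGGCGGATGTGC", "ACb": "ACGATGTGCTCGGCGTAC"},
}


def gc_content(seq):
    """Return the percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


if __name__ == "__main__":
    for target, pair in PRIMERS.items():
        for name, seq in pair.items():
            print(f"{target:6s} {name:8s} {len(seq):3d} nt  GC {gc_content(seq):5.1f}%")
```

Large differences in GC content within a pair may indicate that annealing conditions need to be
checked against the original publications.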

4.2. Phenotypic and Genotypic Diversity of Phosphate Solubilizing Fluorescent Pseudomonads

Phosphate solubilizing fluorescent pseudomonad strains were positive for
cytochrome oxidase and arginine dihydrolase, and showed variability for traits such as gelatin
hydrolysis, levan production and growth at 4°C and 42°C. All these strains have been
reported to utilize dextrose, galactose, mannose and citrate but exhibited varying degrees of
utilization of other carbon sources such as lactose, xylose, fructose, melibiose,
L-arabinose, glycerol, ribose, α-methyl-D-mannoside, xylitol, esculin, D-arabinose, malonate,
sorbose, trehalose, sorbitol, mannitol, adonitol and glucosamine. These strains did not utilize
maltose, sucrose, inulin, salicin, dulcitol, inositol, α-methyl-D-gluconate, rhamnose,
cellobiose, melezitose, xylitol and ONPG. Numerical analysis of phenotypic characteristics
grouped the strains into 3 major phenons, revealing a high degree of polymorphism (Ravindra
Naik et al. 2008). Cluster analysis of the BOX-PCR fingerprints of phosphate solubilizing
fluorescent pseudomonads, based on pair-wise similarity coefficients and UPGMA, resolved
3 distinct genomic clusters and 26 distinct BOX profiles (Ravindra Naik et al. 2008). The
strains showed wide variation in fingerprinting pattern and were distributed into different
clusters, indicating a high degree of genetic variability among different species of phosphate
solubilizing fluorescent pseudomonads.
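
The cluster analysis described above can be illustrated with a small Python sketch. It assumes
that each BOX-PCR fingerprint has been scored as a presence/absence vector of bands and that
the Jaccard coefficient is used as the pair-wise similarity; the chapter does not name the
coefficient, and the four band profiles below are hypothetical, not data from the cited study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two band presence/absence vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 1.0


def upgma(names, profiles):
    """Return the UPGMA merge order as (members_a, members_b, average distance)."""
    n = len(names)
    # Leaf-to-leaf distances are 1 - similarity.
    d = [[1.0 - jaccard(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]
    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # UPGMA distance: unweighted mean over all leaf pairs of the two clusters.
            avg = sum(d[i][j] for i in a for j in b) / (len(a) * len(b))
            if best is None or avg < best[0]:
                best = (avg, a, b)
        avg, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((sorted(names[i] for i in a), sorted(names[i] for i in b), avg))
    return merges


if __name__ == "__main__":
    # Hypothetical band profiles (1 = band present, 0 = band absent).
    names = ["S1", "S2", "S3", "S4"]
    profiles = [
        [1, 1, 0, 1, 0, 1],
        [1, 1, 0, 1, 0, 0],
        [0, 0, 1, 0, 1, 1],
        [0, 0, 1, 0, 1, 0],
    ]
    for left, right, dist in upgma(names, profiles):
        print(left, "+", right, f"at distance {dist:.3f}")
```

The sketch recomputes average distances from the leaf matrix at each step, which is simple to
follow but slow for large strain collections; dedicated software would normally be used.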

5. Genes Involved in Phosphate Solubilization

The ability to solubilize mineral phosphate has been shown to be related to the
production of organic acids (Rodríguez and Fraga, 1999). Direct oxidation of glucose
to gluconic acid (GA) has been reported as a major mechanism for mineral
phosphate solubilization (Goldstein, 1994). In fluorescent pseudomonads, gluconic acid
biosynthesis is mediated by glucose dehydrogenase (GDH) and its co-factor,
pyrroloquinoline quinone (PQQ). Although some genes involved in mineral
phosphate solubilization in different bacterial species have been reported, the
genetic basis of mineral phosphate solubilization is not well understood (Goldstein and Liu,
1987). Any gene involved in organic acid synthesis might have an effect on mineral
phosphate solubilization. The mps gene from Erwinia herbicola, which is involved in mineral
phosphate solubilization, has been cloned (Goldstein and Liu, 1987). Expression of the
mps gene mediated the production of gluconic acid and mineral phosphate solubilization
activity in E. coli HB101. Sequence analysis of the mps gene (Liu et al. 1992) suggested a role
in the synthesis of PQQ synthase, which directs the synthesis of
PQQ, a co-factor necessary for the formation of the holoenzyme glucose dehydrogenase
(GDH)-PQQ. This enzyme catalyzes the formation of gluconic acid from glucose by the direct
oxidation pathway. The mineral phosphate solubilization gene gabY from P. cepacia has also
been isolated (Babu-Khan et al. 1995). Expression of the gabY gene led to mineral phosphate
solubilization via gluconic acid production in Escherichia coli JM109. The gabY gene
could play an alternative role in the expression and regulation of the direct oxidation
pathway in P. cepacia. It showed no apparent homology with the previously
cloned PQQ synthase genes (Liu et al. 1992; Goosen et al. 1989). Many acid
phosphatase genes from Gram-negative bacteria have been identified (Rossolini et
al. 1998). These genes are an important source of genetic material for
transfer to other plant growth promoting strains. In general, the acid phosphatases
encoded by these genes are capable of performing well in soil. The acpA gene from Francisella
tularensis expresses an acid phosphatase with optimum activity at pH 6 and a wide
range of substrate specificity (Reilly et al. 1996), and the genes encoding the non-specific
acid phosphatases of class A (PhoC) and class B (NapA) from Morganella morganii
are also promising genetic materials (Thaller et al. 1994, 1995b).

6. Molecular Tools Used for Isolation and
Characterization of Phosphate Solubilizing Fluorescent Pseudomonads

6.1. Isolation and Screening of Phosphate Solubilizing Strains

Phosphate-solubilizing bacteria are isolated from rhizospheric samples by plating serial
dilutions of rhizospheric soil extracts on Pikovskaya's agar medium (Pikovskaya, 1948). The
medium contains insoluble tri- or bi-calcium phosphate, allowing the detection of phosphate
solubilizing bacteria by the formation of a halo around their colonies. The addition of
bromophenol blue, which produces yellow halos around the colonies in response to
the pH drop caused by the release of organic acids or by proton release in exchange for cation
uptake, gives more reproducible results than the simple halo method (Gupta et al. 1994).
Although phosphate solubilizing capability remains stable in most strains, a few strains
lose it after several cycles of inoculation (Halder et al. 1990; Illmer and
Schinner, 1992).

6.2. Estimation of Phosphate Solubilization

Strains are inoculated into phosphate solubilization estimation medium (per litre:
0.5 g yeast extract, 10 g dextrose, 5 g CaCl2, 0.5 g (NH4)2SO4, 5 g Ca3(PO4)2, 0.2 g KCl, 0.1
g MgSO4, 0.0001 g MnSO4 and 0.0001 g FeSO4, pH 7.0) and grown at 28°C on a rotary shaker
at 180 rpm (King, 1936). At different time intervals (1, 3, 5, 7 and 10 days), samples are
drawn and used for the estimation of soluble phosphate and for checking the pH of the
culture medium. For estimation of soluble phosphate, 1 ml of culture filtrate
is added to 4.5 ml of chloromolybdic acid (1.5 g of ammonium molybdate dissolved in
40 ml of warm water, to which 34.2 ml of 12 N HCl is added; the solution is allowed to cool
and made up to 100 ml with distilled water) in a test tube and vortexed. To this, 0.025 ml of
chlorostannous acid (2.5 g of SnCl2·H2O dissolved in 10 ml of 12 N HCl and made up to
100 ml with distilled water) is added, the volume is immediately made up to 5 ml and the
optical density (OD) is measured at 600 nm. A standard curve is prepared using 50, 100, 150,
200, 250 and 300 µg/ml concentrations of potassium dihydrogen phosphate.
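
Reading a sample concentration from the standard curve amounts to fitting a straight line to
the standards and inverting it. The following is a minimal Python sketch of that step; the
standard concentrations are those given above, but the OD readings are hypothetical
placeholders, and an ordinary least-squares line through the linear range is an assumption
rather than a procedure stated in the source.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration(od, slope, intercept):
    """Invert the standard curve to get concentration from an OD reading."""
    return (od - intercept) / slope


if __name__ == "__main__":
    # Standard concentrations from the text, in micrograms per ml.
    standards = [50, 100, 150, 200, 250, 300]
    # Hypothetical OD600 readings for those standards (illustration only).
    od_readings = [0.11, 0.21, 0.31, 0.41, 0.51, 0.61]
    slope, intercept = fit_line(standards, od_readings)
    sample_od = 0.36  # hypothetical reading for a culture filtrate
    print(f"Estimated soluble phosphate: {concentration(sample_od, slope, intercept):.1f} ug/ml")
```

Readings that fall outside the range of the standards should be diluted and measured again
rather than extrapolated.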

6.3. Evaluation of Strains for Efficient Phosphate Solubilization

The effectiveness of phosphate solubilizing fluorescent pseudomonads may be evaluated
by testing their effect on crop plants of various species and genotypes. Differential
rhizosphere effects of these bacteria may be due to variability in root exudates, which may
play an important role in microbial phosphate solubilization. The efficacy of phosphate
solubilization by microbes should also be tested under field conditions of varying phosphate
content, soil pH and other soil characteristics.

6.4. Phenotypic Characterization

Phenotypic characterization of phosphate solubilizing bacteria relies upon biochemical
tests and carbon assimilation profiles. For rapid identification of soil bacteria including
Pseudomonas, Bacillus and Acinetobacter, commercial kits such as the BIOLOG system
(BIOLOG, California, USA), the API 20NE system (bioMérieux, France) and HiCarbohydrate
kits (HiMedia, Mumbai, India) are available.

6.5. Molecular Characterization

6.5.1. 16S rRNA, gyrB and rpoD Amplifications
Amplification of the 16S rRNA gene is performed from genomic DNA using the universal
primers fD1 (5'-GAGTTTGATCCTGGCTCA-3') and rP2
(5'-ACGGCTACCTTGTTACGACTT-3') (Weisburg et al. 1991). The gyrB and rpoD genes are
amplified using the primer pairs GYRBF
(5'-CAGGAAACAGCTATGACCAYGSNGGNGGNAARTTYRA-3') and GYRBR
(5'-TGTAAAACGACGGCCAGTGCNGGRTCYTTYTCYTGRCA-3') for gyrB, and RPODF
(5'-ACGACTGACCCGGTACGCATGTAYATGMGNGARATGGGNACNGT-3') and RPODR
(5'-ATAGAAATAACCAGACGTAAGTTNGCYTCNACCATYTCYTTYTT-3') for rpoD
(Yamamoto et al. 2000).
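
The gyrB and rpoD primers above contain degenerate positions written in IUPAC ambiguity codes
(for example Y = C or T, N = any base). As an illustration only, the following Python sketch
counts how many distinct oligonucleotides a degenerate primer represents; the function names
are hypothetical and the code table is the standard IUPAC nucleotide set, which is not
reproduced in the chapter.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence of a (short) degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    primers = {
        "GYRBF": "CAGGAAACAGCTATGACCAYGSNGGNGGNAARTTYRA",
        "GYRBR": "TGTAAAACGACGGCCAGTGCNGGRTCYTTYTCYTGRCA",
    }
    for name, seq in primers.items():
        print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

Full expansion with expand is only practical for short or weakly degenerate sequences, since the
number of variants grows multiplicatively with each ambiguous position.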

6.5.2. Sequencing and Phylogenetic Tree Analyses
Sequences of the 16S rRNA, gyrB and rpoD genes are subjected to a BLAST search against the
NCBI database for bacterial strain identification. The reference sequences required for
comparison are downloaded from the European Molecular Biology Laboratory (EMBL)
database available on the site http://www.ncbi.nlm.nih.gov/GenBank. Sequences are aligned
with the multiple sequence alignment program CLUSTAL V (Higgins et al. 1992). The
aligned sequences are then checked manually for gaps, arranged in blocks of 600 bp per
row and saved in molecular evolutionary genetics analysis (MEGA) format in the software
MEGA v3.0. The pairwise evolutionary distances are computed with the Kimura
2-parameter model (Kimura, 1980). To obtain confidence values, the original data set is
resampled 1000 times using the bootstrap method. The bootstrapped data sets are used to
calculate multiple distance matrices, and these matrices are then used to construct
phylogenetic trees by the neighbor-joining (NJ) method (Saitou and Nei, 1987). All these
analyses are performed with the aid of MEGA v3.0 (Kumar et al. 2004).

6.5.3. Amplified Ribosomal DNA Restriction Analysis (ARDRA)
ARDRA is done by digestion of the amplified 16S rRNA gene with discriminative restriction
endonucleases such as AluI, DdeI, HinfI and MspI (Laguerre et al. 1994). The digested PCR
products are separated by horizontal electrophoresis, and the gels are stained with ethidium
bromide and photographed under UV illumination. Depending on their ARDRA profiles, the
strains are distributed into genotypic groups (Laguerre et al. 1994; Frey et al. 1997).
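
When a 16S rRNA gene sequence is already available, the expected ARDRA fragment sizes can be
predicted in silico before running a gel. The following Python sketch is one way to do this. The
recognition sites and cut positions (AluI AG^CT, DdeI C^TNAG, HinfI G^ANTC, MspI C^CGG) are
the standard ones for these enzymes and are not given in the chapter; the function name and
the test sequence are hypothetical, and only the top-strand cut of a linear amplicon is
modelled.

```python
import re

# Recognition site and top-strand cut offset for each enzyme (N = any base).
ENZYMES = {
    "AluI": ("AGCT", 2),
    "DdeI": ("CTNAG", 1),
    "HinfI": ("GANTC", 1),
    "MspI": ("CCGG", 1),
}


def fragment_lengths(seq, enzyme):
    """Return fragment lengths from digesting a linear sequence with one enzyme."""
    site, offset = ENZYMES[enzyme]
    # A lookahead pattern lets overlapping recognition sites be found as well.
    pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
    cuts = [m.start() + offset for m in re.finditer(pattern, seq.upper())]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical 20-nt sequence containing one AluI and one MspI site.
    amplicon = "AAAAAGCTTTTTCCGGAAAA"
    for enzyme in ENZYMES:
        print(enzyme, fragment_lengths(amplicon, enzyme))
```

Strains whose predicted fragment patterns are identical for all enzymes would fall into the same
ARDRA group, although very small fragments may not be resolved on an agarose gel.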

6.5.4. Electrophoresis
Single dimensional (1-D) electrophoresis has permitted optimum separation of stable low
molecular weight RNA (Cruz-Sanchez et al. 1997) that includes 5S rRNA and tRNA of
bacteria (Velazquez et al. 1988, 1998 and 2001; Palomo et al. 2000). These RNA molecules
are of great interest for taxonomic affiliation due to their highly conserved nature and have
been successfully used to identify bacterial strains.

6.5.5. Random Amplified Polymorphic DNA (RAPD)
RAPD profiles can be used to identify bacteria at the species and subspecies levels. At least
three different families of short intergenic repeated sequences, namely REP (repetitive
extragenic palindromic elements), ERIC (enterobacterial repetitive intergenic consensus) and
BOX elements, have been reported in the genomes of eubacteria (Louws et al. 1994). PCR
based techniques using primers specific to these repetitive DNA sequences (REP, ERIC and
BOX) are collectively known as rep-PCR. This rep-PCR based RAPD analysis is a universal
tool for assessing genomic variation and microbial biodiversity.

Conclusion

Plant growth and yield are often limited by insufficient phosphate availability. The low
solubility of common phosphates such as Ca3(PO4)2, hydroxyapatite and aluminium
phosphate causes low phosphate availability in agricultural soil. Soil fertility is one of the
most important factors for agricultural production (Richardson, 2001). The excessive use of
chemical pesticides in agricultural fields reduces the solubility of chemical fertilizers,
making them unavailable to plants. The availability of soil phosphorus is largely controlled by
biologically mediated processes such as gross mineralization and immobilization.
Microorganisms play a major role in phosphate solubilization, and the use of phosphate
solubilizing microorganisms in agriculture is a low-cost, environment-friendly
approach that does not disturb the ecological balance. Microbially mediated
solubilization of insoluble phosphates in soil is believed to occur through the release of
organic acids and microbial metabolites (Gyaneshwar et al. 1998; Carrillo et al. 2002;
Rodriguez et al. 2004). However, in addition to acid production, other mechanisms can cause
phosphate solubilization (Nautiyal et al. 2000). Phosphate solubilization has also been
reported to depend on the structural complexity and particle size of the phosphates and on
the quantity of organic acid secreted by microbes (Gaur, 1990).
Fluorescent pseudomonads are a predominant group of rhizobacteria (Glick et al. 1995;
Antoun and Kloepper, 2001). They are classified into two groups: strains that
have the capability of synthesizing phytohormones and strains that have the ability to
suppress the growth of phytopathogens (Bashan and Holguin, 1998). Fluorescent
pseudomonads enhance plant growth by improving soil nutrient status, producing plant
growth hormones and enzymes, and suppressing the growth of phytopathogenic fungi (Antoun
and Kloepper, 2001; Albert and Anderson, 1987). Plant growth promoting rhizobacterial
types of fluorescent pseudomonads improve plant growth through an array of mechanisms,
such as solubilization of inorganic phosphate and iron and production of vitamins,
phytohormones and antimicrobial metabolites. These mechanisms can probably be active
simultaneously or sequentially at different stages of plant growth, and improve plant nutrient
uptake and tolerance to stress, salinity, metal toxicity and pesticides. Fluorescent pseudomonad
strains such as P. monteilii, P. putida, P. plecoglossicida, P. fulva and P.
aeruginosa (Ravindra Naik et al. 2008), P. fluorescens (Bano and Musarrat, 2004; Dey et al.
2004; Dalla, 1986), Pseudomonas sp. (Lehinos and Vacek, 1994; Lehinos, 1994), P.
chlororaphis, P. savastanoi, P. pickettii (Cattelan et al. 1999) and P. corrugata (Pandey and
Palni, 1998) have been reported as phosphate solubilizers.
Production of antimicrobial metabolites such as 2,4-diacetylphloroglucinol (DAPG),
phenazine-1-carboxylic acid (PCA), phenazine-1-carboxamide (PCN), pyrrolnitrin (PRN) and
pyoluteorin (PLT) by fluorescent pseudomonads is considered a key mechanism for the
suppression of phytopathogens. Several enzymes produced by fluorescent pseudomonads are
also involved in lysis and fragmentation of the fungal cell wall and in suppression of
phytopathogenic fungi. Utilization of a variety of carbon sources by phosphate solubilizing
fluorescent pseudomonads may also play an important role in their adaptation to a variety of
crop plants and soil types. The phytohormone produced by fluorescent pseudomonads is known
to have a dual role: besides influencing plant growth, it is involved in biocontrol, acting
together with glutathione-S-transferases in defense-related plant reactions and inhibiting
spore germination and mycelial growth of different pathogenic fungi (Brown and Hamilton,
1993; Strittmatter, 1994). Chitinases are known to be involved in antagonistic activity against
phytopathogenic fungi and insects (Chernin, 1995; Dunn et al. 1997). Many phosphate
solubilizing antagonistic fluorescent pseudomonad strains exhibit multiple traits such as
production of IAA, proteases, chitinases and cellulases. Selected microbial producers of
chitinases are also reported to be efficient phosphate solubilizers (Greaves and Webley, 1965).
Considering the complexity of soil conditions, the environmental factors that affect phosphate
solubilization and other functional traits of fluorescent pseudomonads in soils should be
studied in detail. Better knowledge of the genetic and functional diversity of phosphate
solubilizing fluorescent pseudomonads is required to study their ecological role in soil and to
design suitable strategies for their use. Fluorescent pseudomonads with phosphate
solubilization and pesticide degradation potential and the ability to excrete phytohormones
and antimicrobial metabolites may be used as potent biological inoculants for sustainable
agriculture.

Acknowledgment

We thank the Department of Biotechnology, Government of India, New Delhi, for
financial support through a research project awarded to Prof. N. Sakthivel.

References

Abd-alla, M. H. (1994). Phosphates and the utilization of organic phosphorous by Rhizobium
leguminosarum biovar viceae. Lett. Appl. Microbiol, 18, 294-296.
Albert, F. and Anderson, A. J. (1987). The effect of Pseudomonas putida colonization on root
surface peroxidases. Plant Physiol, 85, 535-541.
Antoun, H. Beauchamp, C. J. Goussard, N. Chabot, R. and Lalande, R. (1998). Potential of
Rhizobium and Bradyrhizobium species as growth promoting bacteria on non-legumes;
effect on radishes (Raphanus sativus L.), Plant and Soil, 204, 57-67.
Antoun, H; Kloepper, JW. Plant growth promoting rhizobacteria. In: Brenner S, Miller JF
editors. Encyclopedia of Genetics. San Diego, Academic Press; 2001; 1477-1480.
Ayyadurai, N. Ravindra Naik, P. and Sakthivel, N. (2007). Functional characterization of
antagonistic fluorescent pseudomonads associated with rhizospheric soil of rice (Oryza
sativa L.). J. Microbiol. Biotechnol., 17, 919-927.
Babu-Khan, S. Yeo, T. C. Martin, W. L. Duron, M. R. Rogers, R. D. and Goldstein, A. H.
(1995). Cloning of a mineral phosphate-solubilizing gene from Pseudomonas cepacia.
Appl. Environ. Microbiol., 61, 972-978.
Bano, N. and Musarrat, J. (2004). Characterization of a novel carbofuran degrading
Pseudomonas sp. with collateral biocontrol and plant growth promoting potential. FEMS
Microbiol. Lett, 231, 13-17.
Bashan, Y. and Holguin, G. (1998). Proposal for the division of plant growth-promoting
rhizobacteria into two classifications: biocontrol-PGPB (plant growth-promoting
bacteria) and PGPB. Soil Biol. Biochem., 30, 1225-1228.
Bric, J. M. Bostock, R. M. and Silverstone, S. E. (1991). Rapid in situ assay for indoleacetic
acid production by bacteria immobilized on nitrocellulose membrane. Appl. Environ.
Microbiol., 57, 535-538.
Brown, A. E. and Hamilton, J. T. G. (1993). Indole-3-ethanol produced by
Zygorrhynchus moelleri, an indole-3-acetic acid analogue with antifungal activity.
Mycol Res., 96, 71-74.
Carrillo, A. E. Li, C. Y. and Bashan, Y. (2002). Increased acidification in the rhizosphere of
cactus seedlings induced by Azospirillum brasilense. Naturwissenschaften, 89, 428-432.
Cattelan, A. J. Hartel, P. G. and Fuhrmann, J. J. (1999). Screening for plant growth
promoting rhizobacteria to promote early soybean growth. Soil Sci. Soc. Am. J., 63,
1670-1680.
Chernin, I. Ismailov, Z. Haran, S. and Chet, I. (1995). Chitinolytic Enterobacter agglomerans
antagonistic to fungal plant pathogens. Appl. Environ. Microbiol., 61, 1720-1726.
Cruz-Sanchez, J. M. Velazquez, E. Mateos, P. F. and Martinez-Molina, E. (1997).
Enhancement of resolution of low molecular weight RNA profiles by staircase
electrophoresis. Electrophoresis, 18, 1909-1911.
Dalla, G. C. (1986). Esperienze di lotta biologica contro la fusariosi vascolare del garofano
(Trials on the biological control of vascular wilt of carnations). Ann. Inst. Sper. Floric,
17, 3012.
de Souza, J. T. and Raaijmakers, J. M. (2003). Polymorphisms within the PrnD and PltC
genes from pyrrolnitrin and pyoluteorin-producing Pseudomonas and Burkholderia spp.
FEMS Microbiol. Ecol., 43, 21-34.
Dey, R. Pal, K. K. Bhatt, D. M. and Chauhan, S. M. (2004). Growth promotion and yield
enhancement of peanut (Arachis hypogaea L.) by application of plant growth-promoting
rhizobacteria. Microbiol. Res, 159, 371-39.
Dunn, C. Crowley, J. J. Moenne-Loccoz, Y. Dowling, D. N. de Bruijn, F. J. and O'Gara, F.
(1997). Biological control of Pythium ultimum by Stenotrophomonas maltophilia W18 is
mediated by an extracellular proteolytic activity. Microbiology, 143, 3921-3931.
Dworkin, M. and Foster, W. J. (1958). Experiments with some microorganisms which utilize
ethane and hydrogen. J. Bacteriol., 75, 592-601.
Frey, P. Frey-Klett, P. Garbaye, J. Berge, O. and Heulin, T. (1997). Metabolic and genotypic
fingerprinting of fluorescent pseudomonads associated with the Douglas fir-Laccaria
bicolor mycorrhizosphere. Appl. Environ. Microbiol., 63, 1852-1860.
Gaur, A. C. (1990). Phosphate solubilizing microorganisms as bio-fertilizers. New Delhi,
Omega Scientific Publication.
Gaind, S. and Gaur, A. C. (2002). Impact of fly ash and phosphate solubilizing bacteria on
soybean productivity. Biores. Tech., 58, 313-315.
Glick, B. R. Karaturovic, D. M. and Newell, P. C. (1995). A novel procedure for rapid
isolation of plant growth promoting Pseudomonas. Can. J. Microbiol., 41, 533-536.
Goldstein, A. H. and Liu, S. T. (1987). Molecular cloning and regulation of a mineral
phosphate solubilizing gene from Erwinia herbicola. Bio/Technology, 5, 72-74.
Goldstein, A. H. (1986). Bacterial solubilization of mineral phosphates: historical
perspective and future prospects. Am. J. Altern. Agri, 1, 51-57.
Goldstein, AH. Involvement of the quinoprotein glucose dehydrogenase in the solubilization
of exogenous phosphates by gram-negative bacteria. In: Torriani-Gorini A, Yagil E,
Silver S, editors. Phosphate in Microorganisms: Cellular and Molecular Biology.
Washington, DC: ASM Press; 1994; 197-203.
Goosen, N. Horsman, H. P. Huinen, R.G. and van de Putte, P. (1989). Acinetobacter
calcoaceticus genes involved in biosynthesis of the coenzyme pyrrolo-quinoline-
quinone: nucleotide sequence and expression in Escherichia coli K-12. J Bacteriol, 171,
447-455.
Gordon, S. A. and Weber, R. P. (1951). Colorimetric estimation of Indoleacetic acid. Plant
Physiol, 26, 192-195.
Greaves, M. P. and Webley, D. M. (1965). A study of the breakdown of organic phosphates
by microorganisms from the root region of certain pasture grasses. J. Appl. Bacteriol.,
28, 454-465.
Gupta, R. Singal, R. Sankar, A. Chander, R. M. and Kumar, R. S. (1994). A modified plate
assay for screening phosphate solubilizing microorganisms. J. Gen. Appl. Microbiol., 40,
255-260.
Gurusiddaiah, S. Weller, M. D. Sarkar, A. and Cook, J. R. (1986). Characterization of an
antibiotic produced by a strain of Pseudomonas fluorescens inhibitory to
Gaeumannomyces graminis var. tritici and Pythium spp. Antimicrob Agents Chemother,
29, 488-495.
Gyaneshwar, P. Kumar, G. N. and Parekh, L. J. (1998). Effect of buffering on the phosphate-
solubilizing ability of microorganisms. World J. Microbiol. Biotechnol., 14, 669-673.
Halder, A. K. Mishra, A. K. Bhattacharyya, P. and Chakrabartty, P. K. (1990). Solubilization
of inorganic phosphate by Rhizobium and Bradyrhizobium. J. Gen. Appl. Microbiol., 36,
81-92.
Higgins, D. G. Bleasby, A. T. and Fuchs, R. (1992). Clustal V: Improved multiple sequence
alignment. Comput Appl. Biosci, 8, 189-191.
Illmer, P. and Schinner, F. (1992). Solubilization of inorganic phosphates by microorganisms
isolated from forest soil. Soil Biol. Biochem., 24, 389-395.
Jones, D. L. and Darrah, P. R. (1994). Role of root derived organic acids in the mobilization
of nutrients from the rhizosphere. Plant soil, 166, 247-257.
Kapoor, K. K. Mishra, M. M. and Kuhkreja, K. (1989). Phosphate solubilization by soil
microorganisms-a review. Indian J. Microbiol., 29, 119-127.
Kimura, M. (1980). A simple method for estimating evolutionary rates of base substitutions
through comparative studies of nucleotide sequences. J. Mol. Evol., 16, 111-120.
King, J. E. (1936). The colorimetric determination of phosphorous. Biochem. J., 26, 292-295.
Kumar, S. Tamura, K. and Nei, M. (2004). MEGA3: Integrated software for molecular
evolutionary genetic analysis and sequence alignment. Brief Bioinform, 5, 1596-1599.
Laguerre, G. Rigottier-Gois, L. and Lemanceau, P. (1994). Fluorescent Pseudomonas species
categorized by using polymerase chain reaction (PCR)/restriction fragment analysis of
16S rRNA. Mol. Ecol., 3, 479-487.
Lehinos, V. and Vacek, O. (1994). Biosynthesis of auxin by phosphate-solubilizing
rhizobacteria from wheat (Triticum aestivum) and rye (Secale cereale). Microbiol. Res.,
149, 31-35.
Lehinos, V. (1994). Effects of pH and glucose on auxin production of phosphate-solubilizing
rhizobacteria in vitro. Microbiol. Res., 149, 135-138.
Liu, T. S. Lee, L. Y. Tai, C. Y. Hung, C. H. Chang, Y. S. Wolfram, J. H. Rogers, R. and
Goldstein, A. H. (1992). Cloning of an Erwinia herbicola gene necessary for gluconic
acid production and enhanced mineral phosphate solubilization in Escherichia coli
HB101: Nucleotide sequence and probable involvement in biosynthesis of the coenzyme
pyrroloquinoline quinone. J. Bacteriol., 174, 5814-5819.
Louws, F. J. Fulbright, D. W. Stephens, C. T. and de Bruijn, F. J. (1994). Specific genomic
fingerprints of phytopathogenic Xanthomonas and Pseudomonas pathovars and strains
generated with repetitive sequences and PCR. Appl. Environ. Microbiol., 60, 2286-2295.
Mavrodi, D. V. Bonsall, R. F. Delaney, S. M. Soule, M. J. Phillips, G. and Thomashow, L. S.
(2001a). Functional analysis of genes for biosynthesis of pyocyanin and phenazine-1-
carboxamide from Pseudomonas aeruginosa PAO1. J. Bacteriol., 183, 6454-6465.
Mavrodi, O. V. Gardener, B. B. M. Mavrodi, D. V. Bonsall, R. F. Weller, D. M. and
Thomashow, L. S. (2001b). Genetic diversity of PhlD from 2,4-diacetylphloroglucinol-
producing fluorescent Pseudomonas spp. Phytopathol, 91, 35-43.
Molina, L. Constantinescu, F. Michel, L. Reimmann, C. Duffy, B. and Defago, G. (2003).
Degradation of pathogen quorum sensing molecule by soil bacteria: a preventive and
curative biocontrol mechanism. FEMS Microbiol Ecol., 45, 71-81.
Mosier, N. Wyman, C. Dale, B. Elander, R. Lee, Y. Y. Holtzapple, M. and Ladisch, M.
(2005). Features of promising technologies for pretreatment of lignocellulosic biomass.
Bioresource Technol., 96, 673-686.
Narsian, V. Thakkar, J. and Patel, H. H. (1994). Isolation and screening of phosphate
solubilizing fungi. Indian J. Microbiol., 34, 113-118.
Nautiyal, C. S. Bhadauria, S. Kumar, P. Lal, H. Mondal, R. and Verma, D. (2000). Stress
induced phosphate solubilization in bacteria isolated from alkaline soils. FEMS
Microbiol Lett., 182, 291-296.
Neilands, J. B. (1995). Siderophore: structure and function of microbial iron transport
compounds. J. Biol. Chem., 270, 26723-26726.
OSullivan, D. J. and OGara, F. (1992). Traits of fluorescent Pseudomonas spp. involved in
suppression of plant root pathogens. Microbiol. Rev., 56, 662-676.
Palomo, J. L. Velazquez, E. Mateos, P. F. Garcia-Benavides, P. and Martinez-Molina, E.
(2000). Rapid identification of Clavibacter michiganensis subspecies sependonicus
based of the stable Low Molecular weight RNA (LMW RNA) profiles, Eur. J. Plant
Pathol., 106, 789-793.
Pandey, A. and Palni, L. M. S. (1998). Isolation of Pseudomonas corrugata from Sikkim
Himalaya. World J Microbiol Biotechnol, 14, 411-413.
Patten, C. L. and Glick, B. R. (2002). Role of Pseudomonas putida indoleacetic acid in
development of the host plant root system. Appl. Environ. Microbiol., 68, 3795-3801.
Peix, A. Rivas-Boyero, A. A. Mateos, P. F. Rodriguez-Barrueco, C. Martinez-Molina, E. and
Velazquez, E. (2001). Growth promotion of chickpea and barley by a phosphate
solubilizing strain of Mesorhizobium mediterraneum under growth chamber conditions.
Soil Biol. Biochem., 33, 103-110.
Peix, A. Rivas, R. Mateos, P. F. Martinez-Molina, E. Rodriguez-Barrueco, C. and Velazquez,
E. (2003). Pseudomonas rhizosphaerae sp. nov., a novel species that actively solubilizes
phosphate in vitro. Int. J. syst. Evol. Microbiol., 53, 2067-2072.
Peix, A. Rivas, R. Santa-Regina, I. Mateos, P. F. Martinez-Molina, E., Rodriguez-Barrueco,
C. and Velazquez, E. (2004). Pseudomonas lutea sp. nov., a novel phosphate-solubilizing
bacterium isolated from the rhizosphere of grasses. Int. J. syst. Evol. Microbiol., 54, 847-
850.
Penrose, D. M. and Glick, B. (2002). Methods for isolating and characterizing ACC
deaminase containing plant growth promoting rhizobacteria. Physiol. Plant, 118, 10-15.
Pikovskaya, R. I. (1948). Mobilization of phosphorous in soil in connection with vital
activity of some microbial species. Mikrobiologiya, 17, 363-370.
Raaijmakers, J. Weller, D. M. and Thomashow, L. S. (1997). Frequency of antibiotic
producing Pseudomonas spp. in natural environments. Appl. Environ. Microbiol., 63,
881-887.
Ramette, A. Moenne-Loccoz, Y. and Defago, G. (2003). Prevalence of fluorescent
pseudomonads producing antifungal phloroglucinols and/or hydrogen cyanide in soils
naturally suppressive or conducive to tobacco black root rot. FEMS Microbiol. Ecol., 44,
35-43.
Ravindra Naik, P. and Sakthivel, N. (2006). Functional characterization of a novel
hydrocarbonoclastic Pseudomonas sp. strain PUP6 with plant-growth-promoting traits
and antifungal potential. Res. Microbiol., 157, 538-546.
Ravindra Naik, P. Raman, G. Badri Narayanan, K. and Sakthivel, N. (2008). Assessment of
genetic and functional diversity of phosphate solubilizing fluorescent pseudomonads
isolated from rhizospheric soil. BMC Microbiol., 8, 230.
Reilly, T. J. Baron, G. S. Nano, F. and Kuhlenschmidt, M. S. (1996). Characterization
and sequencing of a respiratory burst-inhibiting acid phosphatase from
Francisella tularensis. J. Biol. Chem., 271, 10973-10983.
Renwick, A. Campbell, R. and Coe, S. (1991). Assessment of in vivo screening systems for
potential biocontrol agents of Gaeumannomyces graminis. Plant Pathol., 40, 524-532.
Richardson, A. E. (2001). Prospects for using soil microorganisms to improve the acquisition
of phosphorous by plants. Aust. J. Plant Physiol., 28, 897-906.
Rodriguez, H. and Fraga, R. (1999). Phosphate solubilizing bacteria and their role in plant
growth promotion. Biotechnol Adv, 17, 319-339.
Rodríguez, H. Goire, I. and Rodríguez, M. (1996). Caracterización de cepas de
Pseudomonas solubilizadoras de fósforo. Rev. ICIDCA, 30, 47-54.
Rodriguez, H. Gonzalez, T. Goire, I. and Bashan, Y. (2004). Gluconic acid production and
phosphate solubilization by the plant growth promoting bacterium Azospirillum spp.
Naturwissenschaften, 91, 552-555.
Rossolini, G. M. Shipa, S. Riccio, M. L. Berlutti, F. Macaskie, L. E. and Thaller, M. C.
(1998). Bacterial non-specific acid phosphatases: physiology, evolution, and use as
tools in microbial biotechnology. Cell Mol. Life Sci., 54, 833-850.
Saitou, N. and Nei, M. (1987). The neighbor-joining method: a new method for
reconstructing phylogenetic trees. Mol. Biol. Evol., 4, 406-425.
Sakthivel, N. and Gnanamanickam, S. S. (1987). Evaluation of Pseudomonas fluorescens for
suppression of sheath-rot disease and for enhancement of grain yields in rice (Oryza
sativa L). Appl. Environ. Microbiol., 53, 2056-2059.
Schwyn, B. and Neilands, J. B. (1987). Universal chemical assay for the detection and
determination of siderophores. Anal. Biochem., 160, 47-56.
Smibert, RM; Krieg, NR. (1994). Phenotypic characterization. In: Gerhardt P, Murray RGE,
Wood WA, Krieg NR editors. Methods for General and Molecular Bacteriology.
American Society of Microbiology, Washington DC: 607-654.
Smith, S. E. and Read, D. J. (1997). Mycorrhizal symbiosis: (2nd edition). San Diego,
Academic Press.
Strittmatter, H. K. (1994). Pathogen-defense gene prp1-1 from potato encodes an auxin-
responsive glutathione-s-transferase. Eur. J. Biochem., 226, 619-626.
Sunish Kumar, R. Ayyadurai, N. Pandiaraja, P. Reddy, A. V. Venkateswarlu, Y. Prakash, O.
Sakthivel, N. (2005). Characterization of antifungal metabolite produced by a new strain
Pseudomonas aeruginosa PUPa3 that exhibits broad-spectrum antifungal activity and
biofertilizing traits. J. Appl. Microbiol., 98, 145-154.
Thaller, M. C. Berlutti, F. Schippa, S. Lombardi, G. and Rossolini, G. M.
(1994). Characterization and sequence of PhoC, the principal phosphate-
irrepressible acid phosphatase of Morganella morganii. Microbiol, 140, 1341-1350.
Thaller, M. C. Lombardi, G. Berlutti, F. Schippa, S. and Rossolini, G. M. (1995b).
Cloning and characterization of the NapA acid phosphatase/phosphotransferase of
Morganella morganii: identification of a new family of bacterial acid phosphatase
encoding genes. Microbiol, 140, 147-151.
Vassilev, N. and Vassileva, M. (2003). Biotechnological solubilization of rock phosphate on
media containing agro-industrial wastes. Appl. Microbiol. Biotechnol., 61, 435-440.
Velazquez, E. Igual, J. M. Willens, A. Fernandez, M. P. Munoz, E. Mateos, P. F. Abril, A.
Toro, N. Normand, P. Cervants, E. Gillis, M. and Martinez-Molina E. (2001).
Mesorhizobium chacoense sp. nov., a novel species that nodulates Prosopis alba in the
Chaco Arido region (Argentina). Int. J. Syst. Evol. Microbiol., 51, 1011-1021.
Velazquez, E. Cervants, E. and Igual, J. M. et al. (1988). Analysis of LMW RNA profiles of
Frankia strains by staircase electrophoresis. Syst. Appl. Microbiol., 21, 539-545.
Velazquez, E. Cruz-Sanchez, J. M. Mateos, P. F. and Martinez-Molina, E. (1998). Analysis
of stable low-molecular-weight RNA profiles of members of the family Rhizobiaceae.
Appl. Environ. Microbiol., 64, 1555-1559.
Weisburg, W. G. Barns, S. M. and Lane, D. J. (1991). 16S ribosomal DNA amplification for
phylogenetic study. J. Bacteriol., 173, 697-703.
Whitelaw, M. A. (2000). Growth promotion of plants inoculated with phosphate- solubilizing
fungi. Adv. Agron., 69, 99-151.
Yamamoto, S. Kasai, H. Arnold, D. L. Jackson, R. W. Vivian, A. and Harayama, S. (2000).
Phylogeny of the genus Pseudomonas: intrageneric structure reconstructed from the
nucleotide sequences of gyrB and rpoD genes. Microbiol., 146, 2385-2394.
Yao, Q. Li, X. Feng, G. and Christie, P. (2001). Mobilization of sparingly soluble inorganic
phosphates by the external mycelium of an arbuscular mycorrhizal fungus. Plant Soil,
230, 279-285.

Chapter 9

Genetic Diversity and Population
Structure of Alpine Plants Endemic
to Qinghai-Tibetan Plateau, with
Implications for Conservation under
Global Warming

Yupeng Geng¹, John Cram² and Yang Zhong³
¹ School of Life Sciences, Fudan University, Shanghai 200433, China
² China-UK HUST-Rres Genetic Engineering and Genomics Joint Laboratory,
Huazhong University of Science and Technology, Wuhan 430074, China
³ School of Life Sciences, Fudan University, Shanghai 200433, and Institute of
Biodiversity Science and Geobiology, Tibet University, Lhasa 850000, China

Abstract

The Qinghai-Tibetan Plateau is one of the most important centers of biodiversity for
alpine species in the world and is among the areas that are most sensitive to global
warming. Knowledge about population genetics is essential for understanding the
dispersal ability and evolutionary potential of alpine species in a warming world. In this
chapter, we review the genetic diversity and population structure of 19 alpine plant
species endemic to the Qinghai-Tibetan Plateau. Generally, population genetic
variation varies greatly among species, and the endangered species have much
lower levels of genetic diversity than the co-occurring common species. Although a few
species showed increased levels of genetic diversity with increasing altitude, we detected no
significant correlation between diversity and altitude in most species. In addition, the
isolation-by-distance model cannot explain the spatial genetic structure in most alpine
species that have been investigated, which may be partially due to the discontinuous
distribution of alpine species shaped by the complex geomorphology of the Qinghai-Tibetan
Plateau. The implications of these results for the conservation of alpine plants during
global warming are discussed.

Introduction

The Qinghai-Tibetan Plateau is the highest and largest plateau in the world. The major
part of the plateau is located in China and has an area of 2.5 million km², including Qinghai
Province, the Tibet Autonomous Region and parts of adjacent Chinese provinces (e.g. Gansu,
Sichuan, and Yunnan) as well as Xinjiang Autonomous Region (see Figure 1). As a
geographic term, the plateau also includes a few adjacent regions (25-40° N, 74-104° E)
outside China, in Nepal, Bhutan, Afghanistan, Pakistan, Tajikistan, Kyrgyzstan, and India.
The Qinghai-Tibetan Plateau is often called the third polar region of the earth,
comparable to the Arctic and the Antarctic. With an average altitude of more than 4000 m, it
is a unique biogeographic region, where various landscapes, altitudinal belts, alpine
ecosystems, and endangered and endemic species have developed. In particular, the
southeastern part of Qinghai-Tibetan Plateau belongs to one of the 25 global biodiversity
hotspots (the South-Central China area) (Myers et al. 2000). Taking vascular plants as an
example, it is estimated that the plateau contains more than 12,000 species in 1500 genera,
among which more than 20% are endemic to this region (Wu 1988; Wu et al. 1995).
Additionally, many wild relatives of crops (e.g. wild barley) and medicinal plants (e.g.
Rhodiola species) are distributed on the plateau. The rich biodiversity and unique
environments make the Qinghai-Tibetan Plateau a special laboratory for botanists, ecologists,
and evolutionary biologists.

Figure 1. Location and range of the Qinghai-Tibetan Plateau. The shaded area indicates the approximate
range of one of the 25 global biodiversity hotspots - South-Central China.
The plateau is also among the areas of the world that are most sensitive to global
warming (Qin 1998; Weng and Zhou 2006; Xu and Liu 2007). A warmer climate and altered
precipitation patterns may have significant effects on the composition and distribution of
biodiversity in alpine areas (Baker and Moseley 2007). In addition to the changing climate,
intensive anthropogenic disturbance (e.g. overgrazing by livestock or over-harvesting of medicinal
plants) will further accelerate habitat loss and environmental degradation on the plateau. As a
result, biodiversity in the Qinghai-Tibetan Plateau is at great risk and effective conservation
efforts are urgently needed. Knowledge of genetic diversity and population structure provides
an important basis for the conservation of threatened species, as the long-term survival of a
species largely depends on the maintenance of genetic variability within and among
populations to accommodate new selection pressures resulting from inevitable environmental
changes like global warming (Kinnison et al. 2007).
In this chapter, the general levels and patterns of genetic variation for plant species
endemic to the Qinghai-Tibetan Plateau are outlined. We focus on intraspecific genetic
variation and microevolutionary processes, in an attempt to provide a better understanding of the
potential ecological and evolutionary responses of alpine plants in a warmer world. Recently,
macroevolutionary patterns and processes in the Tibetan flora (e.g. adaptive radiation and
phylogenetic relationships between closely related species) have received increased attention.
This topic is interesting but beyond the scope of this review. Furthermore, we have not
attempted to make a comprehensive review of all published work, but have instead paid more
attention to papers published in recent years. Our aim is to assess the current state of this field
and raise a few questions that deserve attention in future work.

Brief History of Plant Genetic Diversity Studies in
Qinghai-Tibetan Plateau

The earliest exploration of the Qinghai-Tibetan Plateau by western explorers goes back
to the 19th century, starting with the plant collections made from Sikkim to Tibet by Joseph Dalton
Hooker (Liu 2000). In the 1930s-40s, several Chinese scientists, e.g. Shene Liu (T. N. Liou),
Dejun Yu (T. T. Yu), Jinzhi Xu, and JianChu Sun, also conducted a few scientific
investigations of botany, geography, and meteorology. But these activities were confined to
limited areas until the foundation of the People's Republic of China in 1949. The systematic
scientific investigation of the Qinghai-Tibetan Plateau began in the 1950s. A large group was
set up by the Chinese Academy of Sciences (also called Academia Sinica) in 1973 and
conducted extensive investigations during the following twenty years (Liu 2000). Based on
large-scale collections of plant specimens, a number of monographs including the Flora of
Tibet and Vegetation of Tibet have been published, which are the most important basic data
sources for studies of the plants in the Qinghai-Tibetan Plateau even today.
In the early stages of investigating plant genetic diversity in Tibet, many studies focused
on the collection and evaluation of germplasm resources in crops, especially barley
(Hordeum vulgare L. var. nudum Hook. f.), one of the major crops on the plateau, and their
wild relatives. Zhou et al. (1984) analyzed the karyotype and chromosome Giemsa N-banding
patterns of two-rowed wild barley (H. spontaneum C. Koch). Using isozyme markers, Dai
and Zhang (1989) analyzed the genetic diversity of 463 accessions of cultivated barley.
Recently, in addition to several economically important species (e.g. crop relatives and
medicinal plants), ecologically important species (e.g. dominant species in local communities)
have also received increasing attention. With regard to geographical regions, earlier studies
usually focused on areas on the edge of the plateau, including Gansu, Sichuan, Yunnan,
Qinghai, East Tibet and areas adjacent to Lhasa in south Tibet. Recent studies have expanded
to areas located in central Tibet. However, northwest Tibet, e.g. Kekexili (Hoh Xil) and Ali
(Ngari), where the average altitude is 4500 m, has still received little attention, which may
partially be due to the harsh environment and poor road systems.
Allozymes were among the most commonly used markers in early studies, but DNA-based
markers (e.g. RAPD, ISSR, AFLP, microsatellites, and DNA sequences) are becoming more
and more widely used in studies of plant genetic diversity. As highly variable co-dominant
markers, microsatellites (i.e. SSRs) have several advantages in population
genetic studies (Selkoe and Toonen 2006). However, the lack of sequence information has
restricted the use of microsatellites in Tibetan plants. Instead, several anonymous markers (e.g.
RAPD and ISSR) have frequently been used because of their convenience and low cost. With
the advance of sequencing techniques, more microsatellite primers can be developed in the
near future. In addition, haplotype data based on chloroplast or mitochondrial DNA
sequences have been commonly used in the investigation of the phylogeographic structure of
plant species.

Amount and Distribution of Genetic Variation

To assess the levels and partitioning of genetic variation in plants from the Qinghai-
Tibetan Plateau, we have reviewed the papers published in recent years. Only data based on
samples from natural populations have been considered, and the studies focusing on crop
varieties have been excluded. In addition, some studies based on inadequate sampling
strategies (e.g. only 3-4 individuals from each population) were also excluded.
Data for 19 species are summarized in Table 1. Generally, most studies were based on
RAPD and/or ISSR markers and most species were perennial herbs distributed above the tree
line, i.e. alpine plants (sensu Körner 1995). As mentioned earlier, these species can be
classified into four major categories: 1) wild relatives of crops, e.g. Elymus sibiricus L. and
Roegneria thoroldiana (Oliv.) Keng; 2) endangered species, e.g. Pinus squamata X. W. Li
and Pedicularis dunniana Bonati; 3) medicinal plants, e.g. Anisodus tanguticus (Maximowicz)
Pascher, Lamiophlomis rotata (Benth.) Kudo, Swertia przewalskii Pissjaukova, and several
Rhodiola species; and 4) ecologically important species (i.e. dominant species in local
communities), e.g. Androsace tapete Maxim., Polygonum viviparum L., and Sophora
moorcroftiana (Benth.) Baker. Please note that this is not a strict classification and the four
categories are not mutually exclusive.

1. Genetic Diversity at Population Level

Because of the unequal numbers of populations of a species investigated in different
studies (ranging from 2 to 10), the species level genetic diversity may not be comparable.

Table 1. Genetic diversity of endangered species or wild relatives of crops in Tibet. Hs, Nei's gene diversity index; I,
Shannon's diversity index; Gst, Nei's genetic differentiation coefficient; Φst, an analogue of Gst based on AMOVA.

| Category | Species (Family) | Geographic region | Altitude range (m) | Marker type | Genetic diversity | Genetic differentiation | Reference |
|---|---|---|---|---|---|---|---|
| Wild relatives of crops | Elymus sibiricus (Poaceae) | Sichuan | 3200-3600 | ISSR | Hs = 0.181 | Gst = 0.33; Φst = 0.425 | Ma et al. (2008) |
| Wild relatives of crops | Roegneria thoroldiana (Poaceae) | Qinghai, Tibet | 4015-4710 | SSR | Hs = 0.49 | Gst = 0.23 | Jiang et al. (2005) |
| Endangered plants | Pinus squamata (Pinaceae) | Northeast Yunnan | 1900 | RAPD; ISSR | RAPD: Hs = 0.017, I = 0.025; ISSR: Hs = 0.025, I = 0.039 | RAPD: Φst = 0.011; ISSR: Φst = 0.024 | Zhang et al. (2005) |
| Endangered plants | Pedicularis dunniana (Scrophulariaceae) | Sichuan, Yunnan | Not reported | ISSR | Hs = 0.062, I = 0.099 | Φst = 0.7462 | Xia & Guo (2006) |
| Endangered plants | Swertia przewalskii (Gentianaceae) | Qinghai | 3280-3660 | RAPD; ISSR | RAPD: I = 0.27; ISSR: I = 0.25 | RAPD: Φst = 0.52; ISSR: Φst = 0.56 | Zhang et al. (2007) |
| Endangered plants | Anisodus tanguticus (Solanaceae) | Qinghai, Sichuan, Southeast Tibet | 3200-4100 | RAPD | Hs = 0.195 | Gst = 0.3505; Φst = 0.3298 | Zheng et al. (2008) |
| Medicinal plants | Rhodiola crenulata (Crassulaceae) | Yunnan, South Tibet | 3890-5150 | ISSR | I = 0.268 | Φst = 0.474 | Lei et al. (2006) |
| Medicinal plants | R. chrysanthemifolia (Crassulaceae) | Southeast Tibet | 3600-4800 (estimated) | RAPD | I = 0.1351 | Φst = 0.773 | Xia et al. (2007) |
| Medicinal plants | R. alsia (Crassulaceae) | Qinghai, Gansu, East Tibet | 3470-4900 | ISSR | I = 0.1369 | Φst = 0.703 | Xia et al. (2005) |
| Medicinal plants | Lamiophlomis rotata (Lamiaceae) | Qinghai, Yunnan, East Tibet | 4200-5100 | RAPD; ISSR | RAPD: Hs = 0.166, I = 0.248; ISSR: Hs = 0.166, I = 0.251 | RAPD: Gst = 0.430; ISSR: Gst = 0.422 | Liu et al. (2006) |
| Ecologically important species | Sophora moorcroftiana (Fabaceae) | South Tibet | 2947-4100 | Isozyme | Hs = 0.122 | Fst = 0.199 | Liu et al. (2006) |
| Ecologically important species | Kobresia humilis (Cyperaceae) | Gansu, Sichuan, Qinghai | 2800-3820 | RAPD | Hs = 0.2126, I = 0.3185 | Gst = 0.1891 | Zhao et al. (2006) |
| Ecologically important species | K. royleana (Cyperaceae) | Gansu, Qinghai | 2750-3860 | RAPD | Hs = 0.2446, I = 0.3662 | Gst = 0.1066 | Zhao et al. (2006) |
| Ecologically important species | K. kansuensis (Cyperaceae) | Gansu, Sichuan | 3450-3820 | RAPD | Hs = 0.2266, I = 0.3369 | Gst = 0.1438 | Zhao et al. (2006) |
| Ecologically important species | K. tibetica (Cyperaceae) | Gansu, Sichuan, Qinghai | 3230-3550 | RAPD | Hs = 0.2521, I = 0.3772 | Gst = 0.1884 | Zhao et al. (2006) |
| Ecologically important species | K. setchwanensis (Cyperaceae) | Gansu, Sichuan | 3140-3600 | RAPD | Hs = 0.1997, I = 0.2998 | Gst = 0.2101 | Zhao et al. (2006) |
| Ecologically important species | Megacodon stylophorus (Gentianaceae) | Yunnan, Sichuan | 3300-4000 | ISSR | Hs = 0.0532, I = 0.0792 | Gst = 0.727 | Ge et al. (2005) |
| Ecologically important species | Androsace tapete (Primulaceae) | South Tibet | 4830-5010 | ISSR | Hs = 0.3193, I = 0.4665 | Gst = 0.1251; Φst = 0.1385 | Geng et al. (2008) |
| Ecologically important species | Polygonum viviparum (Polygonaceae) | Gansu | 2000-3900 | RAPD | Hs = 0.1227, I = 0.1804 | Gst = 0.5743; Φst = 0.6659 | Lu et al. (2008) |

Note: Some species may fall into two categories, e.g. Rhodiola species are both endangered and medicinal species.


Figure 2a. Androsace tapete, a cushion plant, is an important ecosystem engineer species in the alpine
ecosystem of the Qinghai-Tibetan Plateau.

Figure 2b. Rhodiola crenulata, a medicinal plant endemic to the Qinghai-Tibetan Plateau.


Figure 2c. Lamiophlomis rotata, a medicinal plant endemic to the Qinghai-Tibetan Plateau.
Accordingly, we only considered intrapopulation variation that is measured using Nei's
gene diversity index (Hs) and Shannon's diversity index (I).
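
As a minimal sketch of how these two indices are computed at a single locus, the
following Python functions apply H = 1 - Σp² and I = -Σp ln p to a list of allele
frequencies. The function names and the example frequencies are hypothetical and are not
taken from any study in Table 1; the natural logarithm is assumed for I. For dominant
markers such as RAPD and ISSR, allele frequencies must first be estimated from band
frequencies, which typically requires additional assumptions (e.g. Hardy-Weinberg
equilibrium) that are not shown here.

```python
import math

def nei_gene_diversity(freqs):
    """Nei's gene diversity at one locus: H = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def shannon_index(freqs):
    """Shannon's diversity index: I = -sum(p_i * ln p_i); zero frequencies are skipped."""
    return -sum(p * math.log(p) for p in freqs if p > 0)

# Hypothetical two-allele locus; these frequencies are illustrative only.
freqs = [0.8, 0.2]
h = nei_gene_diversity(freqs)
i = shannon_index(freqs)
print(f"H = {h:.3f}, I = {i:.3f}")
```

Population-level values would then be obtained by averaging the per-locus values over all
loci scored in that population.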
Previous studies have suggested that the levels of genetic variation within populations
are significantly affected by several factors including life form, mating system, population
history, and effective population size (Hamrick and Godt 1996, Booy et al. 2000). Here, 18 of
the 19 species under investigation are perennial herbs. In most cases, information about
the mating system is not available. Generally, the degree of population genetic variation
observed varied greatly between species, with Hs ranging from 0.017 to 0.319 and a mean of
0.173. This value is lower than the average value of both short-lived (Hs = 0.20) and long-lived
perennial plant species (Hs = 0.25) (Nybom 2004). The lower average diversity in
plants endemic to the Qinghai-Tibetan Plateau may be partially due to the unusually low
values in two endangered species. Specifically, Zhang et al. (2005) detected an extremely low
level of genetic diversity (Hs = 0.017, I = 0.025) in two populations of Pinus squamata,
which is a highly endangered pine consisting of only 32 individuals. A similarly low value
was found in the endangered Pedicularis dunniana (Hs = 0.062), with a single population of
no more than 50 individuals. In contrast, Geng et al. (2008) found high levels of genetic
diversity (Hs = 0.319, I = 0.467) in five populations of Androsace tapete. Such high diversity
could be attributed to its unique life history characteristics (longevity and long juvenile
phase) and large population size (Geng et al. 2008).
Given the large altitude ranges in the Qinghai-Tibetan Plateau, an interesting question is
how genetic diversity changes with altitude. The answer is very important because it provides
the information necessary to characterise the evolutionary potential and genetic structure of
alpine plants, and can help to predict the responses of vertical vegetation zones to climate
change (Kinnison and Hairston 2007).
In a recent review, Ohsawa and Ide (2007) summarized four common patterns of genetic
variation change with altitude: 1) L < M > H, i.e. the lower and higher populations have
less diversity than those at intermediate levels; 2) L < M < H, i.e. the lower populations
have less diversity; 3) L > M > H, i.e. the lower populations have greater diversity; and 4)
L = M = H, i.e. no significant change with altitude. Some published studies amongst those
considered here compared explicitly the genetic diversity within populations from different
altitudes. For example, Zhao et al. (2006) investigated the genetic diversity of five Kobresia
species from the eastern Qinghai-Tibetan Plateau and found no significant correlation
between diversity and altitude. In contrast, using allozyme markers, Liu et al. (2006) found
that the genetic diversity of Sophora moorcroftiana increased significantly with altitude in
terms of expected heterozygosity (Hs) but not observed heterozygosity (Ho).
For other species for which the authors did not perform such statistical analyses, we plotted
the diversity value against altitude, based on data in the original papers, and looked for
possible non-linear patterns (i.e. L < M > H). Where the data suggested a linear increase or
decrease, statistical analyses were performed to examine the possible correlation of
variability with altitude. Our results revealed that most species show no significant
correlation between diversity and altitude (Figure 3), suggesting the existence of other factors
that affect genetic diversity more strongly than altitude.
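
The kind of linear check described above can be sketched as follows. The text does not
state which correlation statistic was used, so Pearson's r with its t statistic is an
assumption here, and the altitude and diversity values are hypothetical placeholders
rather than data from the cited studies.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), compared against t with n - 2 d.f."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical example: seven populations sampled along an altitudinal gradient.
altitude_m = [3200, 3400, 3600, 3800, 4000, 4200, 4400]
diversity = [0.21, 0.19, 0.23, 0.20, 0.22, 0.18, 0.21]

r = pearson_r(altitude_m, diversity)
t = t_statistic(r, len(altitude_m))
print(f"r = {r:.3f}, t = {t:.3f}, d.f. = {len(altitude_m) - 2}")
```

With so few populations per species, only a strong trend would reach significance, which
is one reason to restrict such tests to studies with enough sampled populations.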
Another point is that, in most cited studies, populations from different altitudes are
collected from areas that are also far apart, often from different mountains. In other words,
the difference of genetic diversity may represent the combined effects of both vertical and
horizontal gradients.

2. Genetic Differentiation between Populations

Knowledge of genetic structure, i.e. the distribution of diversity within and between
populations of a species, is important for the conservation of alpine species because it
provides useful insights into how the species may respond to climate changes. For example, if
a large proportion of the diversity resides within populations, this would seem good for in
situ conservation of alpine species for at least two reasons: 1) the local populations may have
high evolutionary potential and thus increase their chances to pass through the environmental
filter caused by changed selection regimes; and 2) a large proportion of diversity within
populations usually means effective gene exchange between populations, which would help
the warm-adapted alleles in low altitude populations to spread into higher populations and
thus decrease the risk of local extinction by warming.
A commonly used statistical parameter for genetic differentiation is Gst (Nei, 1973),
which provides a measure of the proportion of the total diversity occurring between
populations. The values of Gst (or Φst, an analogue of Gst based on AMOVA) for 19 species
endemic to the Qinghai-Tibetan Plateau are presented in Table 1. Most species show
considerable genetic differentiation between populations, with Gst ranging from 0.1066 to
0.727 and Φst ranging from 0.011 to 0.773.
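
For readers unfamiliar with the statistic, Gst can be written as (Ht - Hs) / Ht, where Hs
is the mean within-population gene diversity and Ht is the gene diversity of the pooled
populations. The sketch below illustrates this for a single locus, assuming equal
population weights; the function names and allele frequencies are hypothetical and not
drawn from Table 1.

```python
def gene_diversity(freqs):
    """Gene diversity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def gst(pop_freqs):
    """Gst = (Ht - Hs) / Ht for one locus.

    pop_freqs holds one allele-frequency list per population; populations are
    weighted equally (an assumption). Undefined if the locus is monomorphic (Ht = 0).
    """
    k = len(pop_freqs)
    hs = sum(gene_diversity(f) for f in pop_freqs) / k
    n_alleles = len(pop_freqs[0])
    mean_freqs = [sum(f[a] for f in pop_freqs) / k for a in range(n_alleles)]
    ht = gene_diversity(mean_freqs)
    return (ht - hs) / ht

# Hypothetical locus scored in two populations.
value = gst([[0.9, 0.1], [0.5, 0.5]])
print(f"Gst = {value:.3f}")
```

In this example Hs = 0.34 and Ht = 0.42, so about 19% of the total diversity lies between
the two populations.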

Figure 3. The change of genetic diversity (measured as Hs or I) with altitude in alpine plants endemic to
the Qinghai-Tibetan Plateau. The correlation between genetic diversity and altitude is not significant in
most species. Only studies including more than six populations are analyzed.
The mean value (Gst = 0.300, Φst = 0.481) is largely comparable to the average for
short-lived perennial plant species (Gst = 0.32, Φst = 0.41) and higher than that for long-lived
perennial plants (Gst = 0.25, Φst = 0.19) (Nybom, 2004). Generally, several endangered
and/or medicinal plants including the genera Rhodiola, Pedicularis, Lamiophlomis, Swertia,
and Anisodus show high genetic differentiation between populations, which may be
partially ascribed to their fragmented habitats and shrinking population size because of
overexploitation (Liu et al. 2006). An exception is the highly endangered Pinus squamata, in
which both limited genetic diversity within populations and low genetic differentiation
between populations were found as a result of extremely small population size (Zhang et al.
2005). In contrast, several widespread plants like Androsace tapete and Kobresia species,
whose life histories are similar to those of long-lived trees, have relatively low genetic
differentiation between populations (Geng et al. 2008).
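
The two means quoted above can be reproduced from the values listed in Table 1. The short
script below is a sketch of one treatment that yields them: where a species has estimates
from two marker systems, the two are averaged first, and the Fst reported for Sophora
moorcroftiana is left out. This treatment is inferred from the numbers and is an
assumption, not a procedure stated in the text.

```python
from statistics import mean

# Gst values from Table 1 (one list per species; two entries = RAPD and ISSR).
gst = {
    "Elymus sibiricus": [0.33],
    "Roegneria thoroldiana": [0.23],
    "Anisodus tanguticus": [0.3505],
    "Lamiophlomis rotata": [0.430, 0.422],
    "Kobresia humilis": [0.1891],
    "K. royleana": [0.1066],
    "K. kansuensis": [0.1438],
    "K. tibetica": [0.1884],
    "K. setchwanensis": [0.2101],
    "Megacodon stylophorus": [0.727],
    "Androsace tapete": [0.1251],
    "Polygonum viviparum": [0.5743],
}

# AMOVA-based differentiation values from Table 1.
phi_st = {
    "Elymus sibiricus": [0.425],
    "Pinus squamata": [0.011, 0.024],
    "Pedicularis dunniana": [0.7462],
    "Swertia przewalskii": [0.52, 0.56],
    "Anisodus tanguticus": [0.3298],
    "Rhodiola crenulata": [0.474],
    "R. chrysanthemifolia": [0.773],
    "R. alsia": [0.703],
    "Androsace tapete": [0.1385],
    "Polygonum viviparum": [0.6659],
}

def species_mean(table):
    """Average within species first, then across species."""
    return mean(mean(values) for values in table.values())

mean_gst = species_mean(gst)
mean_phi = species_mean(phi_st)
print(f"mean Gst = {mean_gst:.3f}, mean PhiST = {mean_phi:.3f}")
# prints "mean Gst = 0.300, mean PhiST = 0.481"
```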
Several studies also investigated the genetic differentiation between populations in a
spatial context. One of the most widely considered models is isolation-by-distance. In this
case the genetic differentiation between populations is predicted to be quantitatively
correlated with the corresponding geographic distance (Wright 1943). A Mantel test can be
used to examine the correlation between genetic distances and geographic distances (Mantel
et al. 2003). For example, Liu et al. (2006) investigated the spatial genetic structure of ten
populations of Sophora moorcroftiana along the Brahmaputra River (known within Tibet as
the Yarlung Zangbo River) and reported a significant correlation between genetic and geographic
distances (r² = 0.50, p = 0.002). Similar findings were reported in Anisodus tanguticus (r =
0.345, p = 0.020), Rhodiola crenulata (r = 0.677, p = 0.006), and Lamiophlomis rotata (r =
0.688, p = 0.001). In contrast, no significant correlations were found in Androsace tapete (r =
0.042, p = 0.446), Elymus sibiricus (r = 0.744, p = 0.993), and Megacodon stylophorus (r =
0.531, p = 0.146). The lack of significant correlations between genetic and geographical
distances in the last three species suggests that the isolation-by-distance model cannot fully explain
the spatial genetic structure of populations of alpine plant species in the Qinghai-Tibetan
Plateau. It is notable that some of the assumptions of the isolation-by-distance model may be
invalid in the case of the Tibetan plateau. Specifically, the model assumes continuous and
homogeneous populations, and ignores the effects of habitat characteristics and demographic
variation within the range of a species (Slatkin and Maruyama 1975, McRae 2006).
However, most plants in the Qinghai-Tibetan Plateau occur within a limited range of
altitudes, resulting in a belt of habitat along land at those altitudes. Given the complex geomorphology
and interlaced valleys (i.e. low altitude areas), the distributions of most alpine plants are not
spatially continuous. Recently, a more refined model of isolation-by-resistance has been
proposed, which may be more useful in explaining the spatial genetic structure of alpine
plants in the Tibetan plateau.
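
The logic of a Mantel test can be sketched in a few lines of Python. The cited studies do
not all describe their implementations, so the version below (Pearson correlation of the
two distance matrices with a one-sided permutation test) is an assumption, and the five
populations and their distances are hypothetical.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def upper_triangle(m):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def mantel_test(gen, geo, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(gen)
    geo_vec = upper_triangle(geo)
    r_obs = pearson_r(upper_triangle(gen), geo_vec)
    hits = 1  # the observed arrangement counts as one permutation
    order = list(range(n))
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of the genetic matrix together.
        permuted = [[gen[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson_r(upper_triangle(permuted), geo_vec) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)

# Hypothetical populations along a transect (positions in km); genetic distance
# is made to grow with the square root of geographic distance.
positions_km = [0, 10, 20, 40, 80]
geo = [[abs(a - b) for b in positions_km] for a in positions_km]
gen = [[0.05 * math.sqrt(d) for d in row] for row in geo]

r, p = mantel_test(gen, geo)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Rows and columns are permuted together because pairwise distances are not independent
observations; shuffling individual matrix cells would give a misleading null distribution.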

3. Spatial Genetic Structure on a Large Scale: Effects of Landscape
Barriers

The Qinghai-Tibetan Plateau has a few outstanding features that make it very different
from its counterparts elsewhere in the world. Firstly, it occupies an extremely large area (i.e.
2.5 million km²) and spans considerable latitude and longitude ranges (i.e. 25-40° N,
74-104° E). It is not surprising that the plateau possesses highly diverse vegetation types, including
not only vertical zones between different altitudes on the same mountain but also horizontal
zones crossing different latitudinal or longitudinal areas (see The Vegetation of Tibet,
Institute of Botany at the Chinese Academy of Sciences, 1988). Alpine plants on the Tibetan
plateau may respond to climate warming by both upward migration along the same mountain,
and northward migration in some areas (e.g. northern Tibet) where many plants are
continuously distributed. Secondly, many major landscape features (e.g. ridges with peaks of
more than 5000 m and valleys with basins below 4000 m) run west-east across the plateau,
and present significant barriers to the northward migration of alpine plants whose habitats are
usually constrained between altitudes of 4000 and 5000 m. Thus, knowledge of spatial genetic
structure at a large scale, especially across major landscape features, is essential for
predicting the response of alpine plants to climate change.
Among the outstanding landscape features of the southern Qinghai-Tibetan Plateau is the
Brahmaputra River, which is the largest and longest river on the Tibetan plateau and forms a
huge west-east valley, about 1,500 km in length and 200 km in maximum width. Several
studies have investigated the effect of the Brahmaputra River on the spatial genetic structure
of alpine plants endemic to Tibet. Using ISSR markers, Xia et al. (2007) investigated the
genetic structure of Rhodiola chrysanthemifolia, in which five populations were collected to
the south and five to the north of the Brahmaputra River. AMOVA revealed that the genetic
variation between populations located in the south and north side of the Brahmaputra River
was only 4.6%, and most variation (73.1%) was found among populations within regions,
suggesting limited genetic differentiation across the Brahmaputra River.
Similarly, Liu et al. (2006) used isozyme markers and analyzed ten populations of the
endemic shrub Sophora moorcroftiana along the Brahmaputra River. Although they did not
explicitly test the effect of the river on the genetic differentiation among populations of this
species, the genetic distances presented in Table 6 of Liu et al. (2006) and the results of
cluster analysis (Figure 3 in Liu et al. 2006) give no indication of genetic discontinuity across
the Brahmaputra River. The lack of genetic differentiation across landscape features may
result from extensive current and/or past gene exchanges between populations located in
different geographical regions. The relative importance of the two factors (current versus
historical gene exchanges) can be inferred through the analysis of gene flow at different
spatial scales. For example, in a recent study, Geng et al. (2008) explored the spatial genetic
structure of Androsace tapete at both fine-scale (several meters) and landscape-scale
(hundreds of km). On a fine scale, Androsace tapete showed significant genetic spatial
autocorrelation within a short distance (less than 10 m), suggesting limited current gene
dispersal via pollen and/or seeds. On a landscape scale, however, the Brahmaputra River
played a weak role in shaping the spatial population structure of this species. The contrasting
results of spatial genetic structure at different scales suggest that historical gene exchanges,
rather than current gene flow, might have played an important role in shaping the genetic
structure of this species across landscape features like the Brahmaputra River. Besides the
Brahmaputra River, a few other landscape barriers (e.g. the Nianqingtanggula and Tanggula
mountains) have also been considered in studies of genetic structure in alpine plants endemic to
the Qinghai-Tibetan Plateau (Xia et al. 2005). More studies are needed to make a fuller
assessment of the effects of landscape barriers on the spatial genetic structure of alpine species
in the Qinghai-Tibetan Plateau.

Conclusions

The population genetics of alpine plants endemic to the Qinghai-Tibetan Plateau have
received increasing attention in recent years, as these species are among the most sensitive to
climate change. Despite significant progress reviewed here, several challenges remain for
better management and conservation. Generally, whether or not alpine plants can survive the
ongoing climate changes will largely depend on their ability to disperse into suitable new
habitats, and on their ability to adapt to the changed environment in situ through rapid
evolution (Pulido and Berthold 2004). As the dispersal abilities of alpine plants are often
difficult to measure using traditional methods, indirect methods of population genetic
analysis based on neutral molecular markers represent a promising alternative to assess the
gene flow between populations. However, there are few studies of this sort. In addition, the
knowledge of current vertical and horizontal genetic differentiation can provide important
insights into long-term gene exchange and historical dispersal patterns of alpine plants, which
are useful for better prediction of their probable responses to future climate change. In
addition, efforts are needed to assess the adaptive potentials in alpine plant populations in
more accurate ways. Neutral molecular markers (e.g. ISSR and RAPD) are the most
widely used tools for measuring genetic diversity within natural populations, on the assumption
that marker diversity is positively correlated with additive genetic variance.
However, this assumption has not been tested rigorously, and neutral
markers may fail to detect genetic differentiation of great adaptive significance. Thus, there is
a need for more well-designed research in Tibet, using both neutral molecular markers and
quantitative traits with explicit ecological significance, to examine the amount and
distribution of genetic variation along the altitudinal gradients and across the horizontal
zones.

Acknowledgments

We would like to thank Dr. Tashi Tersing and other colleagues at Tibet University for
their help in field investigation and Dr. Yidong Lei, Dr. Jimei Liu, Dr. Qingbiao Wang, Dr.
Li Wang and Dr. Liyan Zeng for their assistance in experiments and data analyses. Special
thanks to Professors Suhua Shi and Shaoqing Tang for their guidance and support to our
studies on plant genetic diversity. This work was supported by Shanghai Science and
Technology Committee (07XD14025), China Postdoctoral Science Foundation (200801171
and 20070410163), and Doctoral Fund of Ministry of Education of China (200802461047).

References

Baker B.B., Moseley R.K. (2007). Advancing treeline and retreating glaciers: implications
for conservation in Yunnan, PR China. Arctic, Antarctic, and Alpine Research 39, 200-209.
Booy G., Hendriks R.J.J., Smulders M.J.M., Van Groenendael J.M., and Vosman B. (2000).
Genetic diversity and the survival of populations. Plant Biology 2, 379-395.
Dai X.K. and Zhang Q.F. (1989). Genetic diversity of six isozyme loci in cultivated barley of
Tibet. Theoretical and Applied Genetics 78, 281-286.
Ge X.J., Zhang L.B., Yuan Y.M., Hao G., and Chiang T.Y. (2005). Strong genetic
differentiation of the East-Himalayan Megacodon stylophorus (Gentianaceae) detected
by Inter-Simple Sequence Repeats (ISSR). Biodiversity and Conservation 14, 849-861.
Geng Y.P., Tang S.Q., Tashi T., Song Z.P., Zhang G.R., Zeng L.Y., Zhao J.Y., Wang L., Shi
J., Chen J.K., and Zhong Y. (2008) Fine- and landscape-scale spatial genetic structure of
cushion rockjasmine, Androsace tapete (Primulaceae), across southern Qinghai-Tibetan
Plateau. Genetica (In press).
Hamrick, J.L., and Godt M.J.W. (1996). Effects of life history traits on genetic diversity in
plant species. Phil. Trans. Roy. Soc. London Biol. Sci. 351, 1291-1298.
Institute of Botany at the Chinese Academy of Sciences. (1988). Vegetation of Tibet.
Scientific Press, Beijing.
Jiang Z.L., Yang X.M., Wang R., Gao A.N. and Li L.H. (2005). Genetic diversity of
Roegneria thoroldiana (Oliv.) Keng populations based on SSR analyses. Journal of
Plant Genetic Resources 6, 315-318.
Kinnison M.T., Hairston N.G. (2007). Eco-evolutionary conservation biology: contemporary
evolution and the dynamics of persistence. Functional Ecology 21, 444-454.
Körner C. (1995). Alpine plant diversity: A global survey and functional interpretations. In
Chapin F.S. III and Körner C. (eds), Arctic and Alpine Biodiversity. Ecological Studies,
113, Springer-Verlag, Berlin, pp 45-62.
Lei Y.D., Gao H., Tashi T., Shi S.H., and Zhong Y. (2006). Determination of genetic
variation in Rhodiola crenulata from the Hengduan Mountains Region, China using
inter-simple sequence repeats. Genetics and Molecular Biology 29, 339-344.
Liu D.S. (2000). Implications of fifty years scientific investigation in Qinghai-Tibetan Plateau.
Resources Science 22, 1-5.
Liu J.M., Wang L., Geng Y.P., Wang Q.B., Luo L.J., and Zhong Y. (2006). Genetic diversity
and population structure of Lamiophlomis rotata (Lamiaceae), an endemic species of
Qinghai-Tibet Plateau. Genetica 128, 385-394.
Liu Z.M., Zhao A.M., Kang X.Y., Zhou S.L., and Lopez-Pujol J. (2006). Genetic diversity,
population structure, and conservation of Sophora moorcroftiana (Fabaceae), a shrub
endemic to the Tibetan Plateau. Plant Biology 8, 81-92.
Lu J.Y., Yang X.M., and Ma R.J. (2008) Genetic diversity of clonal plant Polygonum
viviparum based RAPD in eastern Qinghai-Tibet Plateau of China. Journal of Northwest
Normal University 44, 66-72.
Ma X., Zhang X.Q., Zhou Y.H., Bai S.Q., and Liu W. (2008). Assessing genetic diversity of
Elymus sibiricus (Poaceae: Triticeae) populations from Qinghai-Tibet Plateau by ISSR
markers. Biochemical Systematics and Ecology 36, 514-522.
McRae B.H. (2006). Isolation by resistance. Evolution 60, 1551-1561.
Myers, N., Mittermeier, R.A., Mittermeier, C.G., da Fonseca, G.A.B. and Kent, J. (2000).
Biodiversity hotspots for conservation priorities. Nature 403, 853-858.
Nei M. (1973). Analysis of gene diversity in subdivided populations. Proceedings of the
National Academy of Sciences, USA 70, 3321-3323.
Nybom H. (2004). Comparison of different nuclear DNA markers for estimating intraspecific
genetic diversity in plants. Molecular Ecology 13, 1143-1155.
Ohsawa T. and Ide Y. (2007). Global patterns of genetic variation in plant species along
vertical and horizontal gradients on mountains. Global Ecology and Biogeography 17,
156-163.
Pulido F. and Berthold P. (2004). Microevolutionary response to climatic change. In: Moller
et al. (eds) Effects of climatic change on birds. Elsevier, Amsterdam, pp 151-184.
Qin D.H. (1998). The glaciers and ecological environments of the Qinghai-Tibet Plateau.
China Tibetology Publisher, Beijing.
Slatkin M., Maruyama T. (1975). The influence of gene flow on genetic distance. American
Naturalist 109, 597-601.
Selkoe K.A. and Toonen R.J. (2006). Microsatellites for ecologists: a practical guide to using
and evaluating microsatellite markers. Ecology Letters 9, 615-629.
Wu C.Y. (1988). Hengduan Mountain flora and her significance. Journal of Japanese Botany
63, 297-311.
Wu S.G, Yang Y.P. and Fei Y. (1995). On the flora of the alpine region in the Qinghai-
Xizang (Tibet) plateau. Acta Botanica Yunnanica 17, 233-250.
Weng E.S., Zhou G.S. (2006). Modeling distribution changes of vegetation in China under
future climate change. Environmental Modeling and Assessment 11, 45-58.
Xia J. and Guo Y.H. (2006). ISSR analysis for genetic diversity of Pedicularis dunniana.
Journal of Wuhan Botanical Research 24, 565-568.
Xia T., Chen S.L., Chen S.Y., and Ge X.J. (2005). Genetic variation within and among
populations of Rhodiola alsia (Crassulaceae) native to the Tibetan Plateau as detected by
ISSR markers. Biochemical Genetics 43, 87-101.
Xia T., Chen S.L., Chen S.Y., Zhang D.F., Zhang D.J., Gao Q.B., and Ge X.J. (2007). ISSR
analysis of genetic diversity of the Qinghai-Tibet Plateau endemic Rhodiola
chrysanthemifolia (Crassulaceae). Biochemical Systematics and Ecology 35, 209-214.
Xu W.X. and Liu X.D. (2007). Response of vegetation in the Qinghai-Tibet Plateau to global
warming. Chinese Geographical Science 17, 151-159.
Zhang D.F., Chen S.L., Chen S.Y., Zhang D.J., and Gao Q.B. (2007). Patterns of genetic
variation in Swertia przewalskii, an endangered endemic species of the Qinghai-Tibet
Plateau. Biochemical Genetics 45, 33-50.
Zhang Z.Y., Chen Y.Y., and Li D.Z. (2005). Detection of low genetic variation in a critically
endangered Chinese pine, Pinus squamata, using RAPD and ISSR markers. Biochemical
Genetics. 43, 239-249.
Zhao Q.F., Wang G., Li Q.X., Ma S.R., Cui Y., and Grillo M. (2006). Genetic diversity of
five Kobresia species along the eastern Qinghai-Tibet Plateau in China. Hereditas 143,
33-40.
Zheng W., Wang L.Y., Meng L.H., and Liu J.Q. (2008). Genetic variation in the endangered
Anisodus tanguticus (Solanaceae), an alpine perennial endemic to the Qinghai-Tibetan
Plateau. Genetica 132, 123-129.
Zhou Z.Q., Shao Q.Q., and Jiang X.C. (1984). Comparison of karyotype and chromosome N-
banding pattern of Hordeum spontaneum of Qing-Zang Plateau and that of the Middle
East. Acta Genetica Sinica 11, 120-124.


Chapter 10

Bayesian Inference under Complex
Evolutionary Scenarios Using
Microsatellite Markers:
Multiple Divergence and Genetic
Admixture Events in the Honey Bee,
Apis mellifera

Jean-Marie Cornuet (1), Laurent Excoffier (2), Pierre Franck (3) and Arnaud Estoup (1)

(1) Centre de Biologie et de Gestion des Populations, INRA,
Campus International de Baillarguet, CS 30016 Montferrier-sur-Lez,
34988 Saint-Gély-du-Fesc Cedex, France
(2) Computational and Molecular Population Genetics Lab (CMPG),
Zoological Institute, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
(3) UR1115 Plantes et Systèmes de culture Horticoles, INRA,
F-84000 Avignon cedex 9, France

Abstract

Making inferences from molecular data about the demographic parameters of complex
evolutionary scenarios remains methodologically challenging. The approximate Bayesian
computation (ABC) method has the potential to treat such scenarios (Beaumont et al.,
2002). We have developed a user-friendly methodological framework based on ABC that
allows one to make inferences from microsatellite data under evolutionary scenarios
including any combination of admixture, divergence and (discontinuous) effective
population size variation events, for any number of populations. We illustrate
here the potential of this methodological framework by making inferences on a complex
scenario involving four A. mellifera populations sharing two divergence and two
admixture events. Four groups of honey bee populations belonging to two genetic
lineages (M and C) and genotyped at eight microsatellite loci have been analysed twice
to evaluate estimation stability. In addition, mean relative bias and errors have been
computed from 500 data sets simulated with known values of parameters (close to
estimates on real data), showing that the order of magnitude of all parameters is correctly
estimated. Time estimates of divergences between populations are compatible with
previous estimates: about 0.6 My for the divergence of lineages M and C and about 0.2 My
for the divergence of the French and Italian M lineages. The estimated proportion of lineage
M alleles in the subspecies ligustica, amounting to 12%, is intermediate between estimates
obtained by two different methods. Furthermore, our ABC analysis allows decomposing the
previous estimate of 35% of lineage M alleles in the recently admixed population as 23% from
the local mellifera subspecies and 77% × 12% (9.2%) from the imported ligustica, making a
total of 32.2%. The most unexpected result concerns the time of the admixture of
lineages M and C that gave rise to the subspecies ligustica. It is estimated at about 2,000
years ago, with an approximate credibility interval of 1,000 to 7,000 years.

Corresponding author: Jean-Marie Cornuet. Tel: +44 20 7594 3420; email: cornuet@supagro.inra.fr

Keywords: inference, evolutionary history, microsatellite, introgression, approximate
Bayesian computation, Apis mellifera.

Introduction

Reconstructing the evolutionary history of populations from molecular data is crucial for
various questions addressed in both academic and applied sciences. For instance, the genetic
variation within and between populations can have different implications for conservation
biology and genetic breeding programs depending on the historical events shared by those
populations. Although those historical events are of various types and complex, it seems
possible to summarize the population history by a sequence of events of a limited number of
types. These types mainly include geographic expansions or reductions of population
ranges and splits or mergers of population ranges. The spatial and temporal aspects of
such events are essential. One can, however, as a first approximation, reduce the spatial
dimension of such processes by considering non-spatially explicit populations (defined as sets
of individuals sharing the same territory) and by assuming that two populations separated by
a major geographical barrier do not exchange genes. The evolutionary history of the
populations of a species can then be summarized by a series of split (i.e. divergence) and
merge (i.e. admixture) events with some potential variation of effective population sizes.
Despite such a simplification of population processes, there is no generic method that
allows inferences to be drawn about such historical events from molecular data, especially when the
evolutionary history of populations includes many split and/or merge and/or effective
population variation events. Inferences in this case concern various demographic parameters
(admixture rate, effective population size), historical parameters (splitting time, time of
admixture) and also the mutational parameters of the genetic markers when the evolutionary
scales considered are too large to neglect mutation processes. For instance, Wilson and
Balding (1998) proposed an inferential method for scenarios including series of population
splits without the possibility of merge/admixture events. Their method also assumes a
stepwise mutation model (SMM), a mutation model which imperfectly reflects the mutational
modalities of microsatellites, one of the most popular and versatile molecular markers used
for addressing questions in population genetics and evolution (Estoup and Angers, 1998).
Other methods have been recently published (e.g. Williamson and Slatkin, 1999; Chikhi et
al., 2001; Nielsen and Wakeley, 2001; Berthier et al., 2002; Wang, 2003) but each of these
methods can be applied only to a very limited set of evolutionary scenarios.
Considering complex demographic histories including many split and/or merge and/or
effective population variation events and marker mutation models more sophisticated than the
SMM represents a difficult methodological challenge (see Stephens, 2003). In particular, the
complexity of the processes involved makes calculation of the likelihood of the model very
difficult. The approximate Bayesian computation (ABC) method has been developed to
circumvent this difficulty (Tavaré et al., 1997; Pritchard et al., 1999; Beaumont et al.,
2002). It avoids the computation of the likelihood of the data set, and uses instead summary
statistics, which makes computations tractable and fast. The ABC method has already been
applied to various problems in population genetics, as well as in epidemiology and
palaeontology (Estoup et al., 2001; Estoup and Clegg, 2003; Estoup et al., 2004; Miller et
al., 2005; Plagnol and Tavaré, 2004; Excoffier et al., 2005; Hamilton et al., 2005; Shriner et
al., 2006; Tanaka et al., 2006).
We are presently developing a user-friendly methodological framework based on ABC
that allows one to make inferences from microsatellite data under evolutionary scenarios
including any combination of admixture, divergence and (discontinuous) effective population
size variation events, for any number of populations. Inference bears on (i) three
categories of historico-demographic parameters, namely effective population sizes, times of
events and admixture rates and (ii) parameters of the mutation model (e.g. mutation rates). We
first describe here the principle of our methodological framework. We then illustrate the
potential of the method by analysing a microsatellite data set that corresponds to a complex
scenario involving four honey bee populations (Apis mellifera) sharing two divergence and
two admixture events. Finally, we compute mean relative bias and errors from data sets
simulated with known values of parameters (close to estimates on real data) to assess the
accuracy of the method.


A User-Friendly Versatile ABC Program

Approximate Bayesian Computation (ABC) is a powerful Bayesian alternative to
likelihood computation for parameter estimation that has been introduced recently (Fu and Li,
1997; Tavaré et al., 1997; Pritchard et al., 1999; Estoup et al., 2001; Beaumont et al.,
2002; Marjoram et al., 2003). This approach does not require the computation of likelihoods,
but simply relies on the comparison of summary statistics computed on observed data with
those computed on data simulated under a model for which the parameters of interest are
known (Beaumont et al., 2002; Marjoram et al., 2003). Hence, by construction, ABC
methods have the potential to consider models of any complexity, provided only that data can
be simulated under the model.
The rationale and the description of the ABC method are given in Beaumont et al.
(2002). In short, the approach involves three successive steps (Figure 1). The first step
consists of simulating many data sets (typically a few hundred thousand to a few
million) with characteristics similar to the observed data set (same number of samples, same
number of individuals per sample, same number of loci, same geographic location of sampled
sites) using parameter values drawn from prior distributions (as defined in Tables 1 and 2).

[Figure 1, flow chart. Step 1 (Simulations): draw parameter values from prior distributions;
simulate genetic data; compute summary statistics from simulated data; write simulated
parameters and summary statistics into the reference file; repeat until there are enough
simulations. Step 2 (Rejection): compute summary statistics on observed data; compute the
distance between observed and simulated summary statistics; retain the simulations closest to
the observations. Step 3 (Estimation): estimate parameters by local and weighted linear
regression on retained simulations; repeat for other estimations if required, otherwise stop.]

Figure 1. Flow-chart of computations in an ABC analysis.

Table 1. Uniform [a,b] prior distributions for natural parameters

natural parameter              a           b
$N_e$                          300         3 000
$t_1$                          250 000     1 000 000
$t_2$                          25 000      250 000
$t_3$                          250         25 000
$t_4$                          10          250
admixture proportion 1         0           1
admixture proportion 2         0           1
$\mu$ (mean mutation rate)     0.0001      0.001

Table 2. Modes and quantiles (5 and 95%) of prior distributions for composite
parameters

composite parameter     mode      5% quantile     95% quantile
$4N_e\mu$               1.88      0.60            8.73
$\mu t_1$               199       76              751
$\mu t_2$               39        12.5            183
$\mu t_3$               3.4       0.56            17.9
$\mu t_4$               0.036     0.008           0.180

The simulation outputs and their associated parameters are stored in a reference file. The
second step consists of comparing the simulations to the observations by means of summary
statistics such as those described below for the honey bee application, and discarding those
simulations that are very different from the observations. The difference between sets of
statistics is computed as the Euclidean distance between them. The $n_r$ simulation
outcomes (typically a few thousand) with the smallest Euclidean distance from the
observations are retained. The third step is the estimation of the parameters by a local linear
regression of parameters on summary statistics using the $n_r$ retained simulation outcomes.
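The rejection step described above can be illustrated with a minimal sketch. It is not the authors' program: the function name `abc_rejection`, the toy model (a normal mean with two summary statistics) and the choice to scale each statistic by its standard deviation across simulations are all assumptions made for illustration.

```python
import numpy as np


def abc_rejection(obs_stats, sim_params, sim_stats, n_retain):
    """Return the n_retain simulations closest to the observed statistics."""
    obs_stats = np.asarray(obs_stats, dtype=float)
    sim_params = np.asarray(sim_params, dtype=float)
    sim_stats = np.asarray(sim_stats, dtype=float)
    # Assumption: each statistic is scaled by its standard deviation across
    # simulations so that no single statistic dominates the distance.
    scale = sim_stats.std(axis=0)
    scale[scale == 0] = 1.0
    dist = np.sqrt((((sim_stats - obs_stats) / scale) ** 2).sum(axis=1))
    keep = np.argsort(dist)[:n_retain]  # indices sorted by increasing distance
    return sim_params[keep], sim_stats[keep], dist[keep]


# Toy "reference file": one parameter (a mean) and two summary statistics.
rng = np.random.default_rng(0)
n_sims = 20_000
theta = rng.uniform(0.0, 10.0, n_sims)                 # draws from a uniform prior
data = rng.normal(theta[:, None], 1.0, (n_sims, 50))   # one simulated data set per draw
stats = np.column_stack([data.mean(axis=1), np.median(data, axis=1)])

obs = np.array([4.0, 4.0])                             # placeholder "observed" statistics
kept_theta, kept_stats, kept_dist = abc_rejection(obs, theta, stats, n_retain=200)
print(kept_theta.mean(), kept_theta.std())
```

The retained parameter values form a rough sample from the approximate posterior; the regression step would then be applied to them.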
Recent coalescent-based packages (e.g. Hudson, 2002; Laval and Excoffier, 2004)
provide efficient tools for simulating genetic data under complex scenarios. Such versatile
simulation packages are the core components needed to perform parameter estimation under an
ABC framework. However, none of these programs allows one to perform an ABC analysis from A
to Z as detailed in Figure 1. The program that we are presently developing, dubbed "do it
yourself ABC" (diyABC), allows a complete ABC analysis to be made for any evolutionary
scenario including split and/or merge and/or (discontinuous) effective population size
variation events, for any number of populations. It is a user-friendly program in a
fully clickable environment that will be easy to handle even for biologists unfamiliar with
inference algorithms. The main limitations of diyABC are: (i) there is no gene flow between
populations once they have split, (ii) the mutation model is restricted to the case of
microsatellites, therefore DNA sequence data cannot be analysed, and (iii) the program
cannot generate data for partially or fully linked markers.

Application to the Honey Bee

Biological model and samples: It is commonly believed that the honey bee, Apis
mellifera, branched off from its closest relative Apis cerana a few million years ago in Asia,
where all other species of the genus Apis can be found, and subsequently expanded into Europe
and Africa (Ruttner, 1988). In Europe, invasions took several routes, allowing the emergence of
two evolutionary lineages, M and C, that were found to be strongly differentiated in both the
mitochondrial and nuclear genomes (Garnery et al., 1992; Estoup et al., 1995; Franck et al.,
2000; Whitfield et al., 2006). Several geographical subspecies have been previously
described based on morphological variations within each of these two lineages, such as A. m.
ligustica in the Italian Peninsula and A. m. mellifera in north-western Europe. However,
the Italian subspecies A. m. ligustica, which was initially associated with lineage C, is likely to
result from an ancient mixture of lineages C and M, as suggested by both mtDNA (Garnery et
al., 1992; Franck et al., 2000) and microsatellite (Excoffier et al., 2005) data. More
specifically, the potential hybrid nature of A. m. ligustica may explain the wide distribution of
microsatellite allele lengths previously observed in populations of this subspecies by
Excoffier et al. (2005). In addition to this ancient (though not precisely dated) admixture
event, more recent admixture events have been observed in the populations from the Italian
Alpine border between A. m. mellifera (belonging to the M lineage) and A. m. ligustica
(Franck et al., 2000).
We have illustrated the potential of our versatile ABC approach by analysing a
microsatellite data set that corresponds to a complex scenario involving four honeybee
populations and including two divergences and two admixture events (Figure 2). The
honeybee data set we analysed has been previously partly described in Franck et al. (2000)
and Choisy et al. (2004). Each population is represented by a sample of worker bees. The
focal population is located in Courmayeur (n=33, one bee per colony) at the extreme north of
the Aosta valley (north-western Italy). It is considered to be an admixed population between
two different honeybee subspecies, A. m. mellifera and A. m. ligustica. The A. m. mellifera
parental population is represented by a sample from the sanctuary of Ouessant (Bretagne,
France; n=49) or a sample from Sabres (Landes, France; n=50). The A. m. ligustica parental
population is represented by a sample from Forli (Emilia-Romagna, Italy; n=19) or from
Bergamo (Lombardia, Italy; n=31). Two population samples of each parental type were
considered to take into account our uncertainty regarding the exact parental populations of
the hybrid population from Courmayeur and to evaluate the effect of considering different
parental population samples in our inferences. Because the Italian subspecies A. m. ligustica
is itself admixed between lineages C and M, we have added to the analysis a sample from
Novska (Croatia, subspecies A. m. carnica; n=30) as a reference for a pure lineage C
population. All sampled honeybees were genotyped at eight microsatellite loci as described in
Franck et al. (2000).
Historico-demographic and mutation models: The evolutionary scenario representing the
historical events shared by the analysed honey bee populations is summarized in Figure 2.
The two lineages M and C diverged at time $t_1$ in the past. The French and Italian populations
of lineage M split at time $t_2$. The Italian populations were invaded by lineage M at time
$t_3$, giving birth to the M/C hybrid subspecies A. m. ligustica. Finally, the admixture between
A. m. ligustica and A. m. mellifera in the population from Courmayeur occurred at time $t_4$.
Other demographic parameters include the effective population size ($N_e$), which was assumed
to be the same in all populations, and the two admixture proportions (subscripted 1 and 2),
which measure the proportion of genes from the subspecies A. m. mellifera (lineage M) in the
populations of Courmayeur and Forli (or Bergamo), respectively. Our model hence includes
seven historico-demographic parameters.

Figure 2. Evolutionary scenario describing the history of the four sampled populations of honeybees.

As noted above, two population samples of each parental type were considered. For
each analysis, a single A. m. mellifera parental population (Ouessant or Sabres) and a single
A. m. ligustica parental population (Forli or Bergamo) were used: we hence made a total
of four independent analyses corresponding to the four combinations of parental populations.
Because we used microsatellite markers, we assumed a generalized stepwise mutation model
(GSM; Estoup et al., 2002) requiring two parameters: the mean mutation rate over loci ($\mu$)
and the coefficient of the geometric distribution of the length by which a new mutant allele
differs from its ancestor. Individual locus mutation rates were considered to be proportional to
$\mu$; the coefficient of proportionality was computed from the expected heterozygosity (Nei,
1987) of each locus in the A. m. mellifera reference population (Ouessant or Sabres). This
coefficient for locus $i$, written here as $c_i = \mu_i / \mu$, is computed from the relationship
between the mutation rate and the effective population size $N_e$ for a SMM locus in a
mutation-drift equilibrium population ($8N_e\mu = (1/J)^2 - 1$, with $J$ being the sum of squared
allele frequencies; Estoup and Cornuet, 1999). It is equal to

$$c_i = \frac{(1/J_i)^2 - 1}{\sum_{j=1}^{L} \left[ (1/J_j)^2 - 1 \right]} \qquad (1)$$

where $L$ is the number of loci. See also Pritchard et al. (1999) for a similar computation
based on the allele size variance.
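As an illustration of the computation just described, the following sketch derives per-locus coefficients from allele frequencies using the relationship $8N_e\mu = (1/J)^2 - 1$. The function name and the example frequencies are hypothetical, and normalising by the sum over loci is an assumption about how the coefficients are scaled.

```python
import numpy as np


def locus_rate_coefficients(allele_freqs_per_locus):
    """Relative mutation-rate coefficient of each locus from its homozygosity J."""
    # J_i: sum of squared allele frequencies at locus i
    J = np.array([np.sum(np.square(f)) for f in allele_freqs_per_locus])
    # Quantity proportional to the locus mutation rate under the SMM at equilibrium
    scaled_rate = (1.0 / J) ** 2 - 1.0
    # Assumption: coefficients are normalised by the sum over all loci
    return scaled_rate / scaled_rate.sum()


# Hypothetical allele frequencies at two loci
freqs = [
    [0.5, 0.5],                # J = 0.5  -> (1/J)^2 - 1 = 3
    [0.25, 0.25, 0.25, 0.25],  # J = 0.25 -> (1/J)^2 - 1 = 15
]
coefficients = locus_rate_coefficients(freqs)
print(coefficients)
```

A locus with more evenly distributed alleles (lower J) receives a larger share of the overall mutation rate.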
For the sake of simplicity, the coefficient of the geometric distribution of the length by
which a new mutant allele differs from its ancestor was fixed at 0.22 for all loci (Dib et al.,
1996; Estoup et al., 2002). The only mutation parameter of interest was hence the mean
mutation rate over loci ($\mu$).
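To make the role of the geometric coefficient concrete, here is a small sketch of how mutation step sizes could be drawn under a generalized stepwise model. The parametrisation is an assumption: a step of k repeat units is taken to have probability (1 - p) p^(k-1) with p = 0.22, and the direction of the change is taken to be symmetric. The function name is hypothetical.

```python
import numpy as np


def gsm_mutation_steps(n, rng, p=0.22):
    """Draw n signed step sizes (in repeat units) under an assumed GSM."""
    # NumPy's geometric distribution takes the success probability, here 1 - p,
    # and returns values k >= 1 with P(k) = (1 - p) * p**(k - 1).
    sizes = rng.geometric(1.0 - p, size=n)
    # Assumption: gains and losses of repeats are equally likely.
    signs = rng.choice([-1, 1], size=n)
    return signs * sizes


rng = np.random.default_rng(1)
steps = gsm_mutation_steps(100_000, rng)
print(np.abs(steps).mean())          # close to 1 / (1 - 0.22)
print((np.abs(steps) == 1).mean())   # close to 0.78
```

Under this parametrisation about 78% of mutations change the allele by a single repeat unit, and setting p = 0 recovers the strict stepwise mutation model.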
We have also considered several composite parameters that are in principle better
estimated (e.g. Excoffier et al., 2005): $4N_e\mu$, $\mu t_1$, $\mu t_2$, $\mu t_3$ and
$\mu t_4$.
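The link between the uniform priors on natural parameters and the implied priors on composite parameters can be checked by simulation. The sketch below assumes the bounds listed in Table 1, takes the last row of that table to be the mean mutation rate, and forms the composites as 4 × Ne × mu and mu × t. The dictionary keys and function names are hypothetical.

```python
import numpy as np

# Uniform [a, b] bounds for the natural parameters (values from Table 1).
PRIOR_BOUNDS = {
    "Ne": (300.0, 3000.0),
    "t1": (250_000.0, 1_000_000.0),
    "t2": (25_000.0, 250_000.0),
    "t3": (250.0, 25_000.0),
    "t4": (10.0, 250.0),
    "admix1": (0.0, 1.0),
    "admix2": (0.0, 1.0),
    "mu": (1e-4, 1e-3),
}


def draw_natural(n, rng):
    """Draw n independent values of every natural parameter from its uniform prior."""
    return {name: rng.uniform(a, b, n) for name, (a, b) in PRIOR_BOUNDS.items()}


def composite(nat):
    """Composite parameters: scaled population size and scaled event times."""
    comp = {"4Ne*mu": 4.0 * nat["Ne"] * nat["mu"]}
    for t in ("t1", "t2", "t3", "t4"):
        comp["mu*" + t] = nat["mu"] * nat[t]
    return comp


rng = np.random.default_rng(42)
comp = composite(draw_natural(200_000, rng))
for name, values in comp.items():
    q05, q95 = np.quantile(values, [0.05, 0.95])
    print(f"{name}: 5% quantile = {q05:.3g}, 95% quantile = {q95:.3g}")
```

The printed quantiles can be compared with those reported in Table 2; products of uniform variables are not uniform, which is why the composite priors are non-rectangular.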
ABC estimation: Each natural parameter was drawn from a rectangular (uniform) prior
distribution (Table 1), which translated into non-rectangular prior distributions for the
composite parameters (Table 2). We extracted 30 summary statistics from both the observed and
simulated data sets to estimate parameters under the ABC approach. For each sample, we
took the mean number of alleles, the mean expected heterozygosity (Nei, 1987), the mean
allele size variance in base pairs and the mean ratio of the number of alleles over the range
of allele sizes in base pairs (Excoffier et al., 2005). For each pair of populations, we also
took the $F_{ST}$ (Weir and Cockerham, 1984) and the $(\delta\mu)^2$ distances (Goldstein et
al., 1995). Finally, we added two admixture coefficients (ML coefficient of Choisy et al.,
2004): one with Courmayeur as the admixed population and mellifera and ligustica as parental
populations, and the other with ligustica as the admixed population and mellifera and carnica
as parental populations. With four samples and six pairs of populations, this gives
4 × 4 + 2 × 6 + 2 = 30 statistics.
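Some of the summary statistics listed above are simple to compute from allele sizes. The sketch below covers three within-sample statistics and a between-sample squared difference in mean allele size; the ratio of allele number to size range, the Weir and Cockerham estimator and the admixture coefficients are left out. The data layout (one row per gene copy, one column per locus, sizes in base pairs), the function names and the use of an unbiased heterozygosity estimator are assumptions.

```python
import numpy as np


def locus_stats(sizes):
    """Number of alleles, unbiased expected heterozygosity and size variance at one locus."""
    sizes = np.asarray(sizes, dtype=float)
    n = sizes.size
    alleles, counts = np.unique(sizes, return_counts=True)
    freqs = counts / n
    n_alleles = alleles.size
    heterozygosity = n / (n - 1) * (1.0 - np.sum(freqs ** 2))
    size_variance = sizes.var(ddof=1)
    return n_alleles, heterozygosity, size_variance


def sample_stats(genotypes):
    """Means over loci of the per-locus statistics (genotypes: gene copies x loci)."""
    per_locus = np.array([locus_stats(col) for col in np.asarray(genotypes).T])
    return per_locus.mean(axis=0)


def delta_mu_squared(geno_a, geno_b):
    """Mean over loci of the squared difference in mean allele size between two samples."""
    diff = np.asarray(geno_a, float).mean(axis=0) - np.asarray(geno_b, float).mean(axis=0)
    return float(np.mean(diff ** 2))


# Hypothetical samples: 4 gene copies genotyped at 2 loci
pop_a = np.array([[100, 200], [100, 202], [102, 204], [104, 204]])
pop_b = pop_a + 4
print(sample_stats(pop_a))
print(delta_mu_squared(pop_a, pop_b))
```

In an ABC analysis the same functions would be applied to the observed data and to every simulated data set, so that both are summarised in exactly the same way.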
Each reference file (step 1 in Figure 1) included $8 \times 10^6$ simulated data sets. The
difference between the observed and simulated sets of statistics has been computed following
Beaumont et al. (2002) as the Euclidean distance between them (step 2, Figure 1). The
5,000 simulation outcomes with the smallest Euclidean distances have been retained for the
regression step (step 3, Figure 1). In order to reduce heteroscedasticity (i.e. inequality of
variances among parameters) in the regression, all demographic parameters were
log-transformed prior to the regression and back-transformed to obtain posterior densities on
the original scale (Estoup et al., 2004).
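The regression step with log-transformation can be sketched as follows. This is a simplified illustration in the spirit of the local linear regression of Beaumont et al. (2002), not the authors' code: the function name is hypothetical, Epanechnikov weights are assumed, and the example data are constructed so that the log of the parameter is exactly linear in the statistic.

```python
import numpy as np


def abc_regression_adjust(kept_params, kept_stats, kept_dist, obs_stats):
    """Adjust retained parameter values towards the observed summary statistics."""
    kept_dist = np.asarray(kept_dist, dtype=float)
    n = kept_dist.size
    # Log-transform the (positive) parameters before the regression.
    phi = np.log(np.asarray(kept_params, dtype=float).reshape(n, -1))
    centred = np.asarray(kept_stats, dtype=float).reshape(n, -1) - np.asarray(obs_stats, dtype=float)
    # Assumed kernel: Epanechnikov weights, zero for the furthest retained simulation.
    weights = 1.0 - (kept_dist / kept_dist.max()) ** 2
    design = np.column_stack([np.ones(n), centred])
    sw = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(design * sw, phi * sw, rcond=None)  # weighted least squares
    slopes = coef[1:]
    # Project every retained value onto the observed statistics, then back-transform.
    adjusted = np.exp(phi - centred @ slopes)
    return adjusted, weights


# Constructed example: log(parameter) = 0.5 + 2 * statistic, observed statistic = 0.2
stat = np.linspace(-1.0, 1.0, 21)[:, None]
params = np.exp(0.5 + 2.0 * stat[:, 0])
dist = np.abs(stat[:, 0] - 0.2)
adjusted, weights = abc_regression_adjust(params, stat, dist, np.array([0.2]))
print(adjusted[:3, 0], np.exp(0.9))
```

Because the example is exactly linear on the log scale, every adjusted value collapses onto exp(0.5 + 2 × 0.2); with real simulations the adjusted values, weighted by the kernel, would approximate the posterior distribution.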
Accuracy of estimation: The performance of the ABC method for parameter estimation
was assessed using 500 data sets simulated under the scenario described in Figure 2 with
fixed parameter values close to our estimates for these honey bee data sets (i.e. the modes of
posterior distributions). For each simulated data set, we computed the modes of the estimated
posterior distributions of the parameters, which were then used to compute, for each parameter,
the relative bias, the relative root mean square error, and Factor 2 (the proportion of
simulated data sets for which the estimated value lies between half and twice the true value)
(Excoffier et al., 2005).
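The three accuracy measures can be written down in a few lines. The sketch below assumes the usual definitions (mean relative deviation, root mean square error divided by the true value, and the fraction of estimates within a factor of two); the function name and the example estimates are hypothetical.

```python
import numpy as np


def accuracy_measures(estimates, true_value):
    """Relative bias, relative RMSE and Factor 2 of point estimates of one parameter."""
    est = np.asarray(estimates, dtype=float)
    rel_bias = np.mean((est - true_value) / true_value)
    rel_rmse = np.sqrt(np.mean((est - true_value) ** 2)) / true_value
    factor2 = np.mean((est >= true_value / 2.0) & (est <= 2.0 * true_value))
    return rel_bias, rel_rmse, factor2


# Hypothetical point estimates from four simulated data sets, true value 100
rel_bias, rel_rmse, factor2 = accuracy_measures([50.0, 100.0, 150.0, 250.0], 100.0)
print(rel_bias, rel_rmse, factor2)
```

In this example only the estimate of 250 falls outside the interval from 50 to 200, so Factor 2 is 0.75.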

Results

Each of the four independent analyses corresponding to the four combinations of parental
populations was replicated on a second reference file of $8 \times 10^6$ simulated data sets in
order to roughly assess the variance of our estimates inherent to the present ABC design (mainly
the numbers of simulated data sets in the reference file and of retained simulations for the
regression step). The results of all eight analyses are summarized in Table 3. Posterior and
prior distributions are illustrated in Figure 3 for one analysis of the population group
including Sabres, Forli, Courmayeur and Novska (replicates 1 and 2 of SFCN in Table 3).
For most parameters, the estimates obtained in the different analyses differ only slightly.
This suggests that our ABC design allows stable estimation of most parameters
and that these estimates are robust to the choice of different parental population samples of
A. m. mellifera and A. m. ligustica.
The mean point estimates (i.e. mean of the modes of the posterior distributions) of admixture
proportions were around 0.23 (with a 90% credibility interval of 0.16-0.36) for the population
from Courmayeur and 0.12 (0.03-0.40) for the populations from Forli or Bergamo (Table
3). Regarding splitting times, we obtained mean point estimates around 150, 35, 0.5 and 0.04
for τ1, τ2, τ3 and τ4, respectively. Estimating the natural parameters t1, t2, t3 and t4
requires the mutation rate to be fixed (see Table 4). If a mean mutation rate of 5 × 10^-4 is
assumed (a value usually considered for microsatellite loci in most species; e.g. Estoup and
Angers, 1998), we obtain around 150/(5 × 10^-4) = 300,000 generations for the divergence of
the two lineages M and C, which translates into 600,000 years when taking a mean generation
time of two years (Franck et al., 2000). We obtained around 70,000 generations (140,000
years) for the divergence of the M lineages of France and Italy, 1,000 generations (2,000
years) for the introgressive invasion of the Italian populations by lineage C, and 80
generations (160 years) for the admixture between the A. m. ligustica and A. m. mellifera
subspecies in the population of Courmayeur.
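The conversion from composite to natural parameters is simple arithmetic and can be checked with a short sketch. The function and variable names are illustrative. The relation used for the effective size (composite size parameter = 4 × Ne × mutation rate) is an assumption; it is consistent with the values reported in Table 4 but is not stated in this section.

```python
MU = 5e-4       # mean mutation rate per locus per generation assumed in the text
GEN_TIME = 2    # mean generation time in years (Franck et al., 2000)

def scaled_time_to_years(tau, mu=MU, gen_time=GEN_TIME):
    """Convert a splitting time scaled by the mutation rate into generations and years."""
    generations = tau / mu
    return generations, generations * gen_time

def scaled_size_to_ne(theta, mu=MU):
    """Effective size from a composite size parameter, assuming theta = 4 * Ne * mu."""
    return theta / (4 * mu)

# Mean point estimates of the four scaled splitting times quoted in the text.
for label, tau in [("t1", 150), ("t2", 35), ("t3", 0.5), ("t4", 0.04)]:
    gens, years = scaled_time_to_years(tau)
    print(f"{label}: {gens:,.0f} generations, {years:,.0f} years")

print(f"Ne: {scaled_size_to_ne(4.5):,.0f} individuals")
```

Because every natural parameter is obtained by dividing by the assumed mutation rate, halving that rate would double all divergence times and the effective size.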
The amount of information brought by our microsatellite data for each parameter can be
qualitatively assessed by comparing the posterior distributions to the prior distributions. A
weak contrast between the two distributions indicates that the microsatellite data carry little
information and casts doubt on the estimate. A first indication is given by
comparing the modal values as well as the 5% and 95% quantile values of the prior and
posterior distributions (Tables 2 and 3). Those values differ substantially for θ, τ3 and the
two admixture rates, whereas they are relatively similar for τ1, τ2 and τ4, although the 95%
quantiles usually correspond to lower values in the posterior distributions.

Table 3. Estimations of historico-demographic parameters

Parameter         OFCN                      SFCN                      OBCN                      SBCN                      Mean
                  Replicate 1  Replicate 2  Replicate 1  Replicate 2  Replicate 1  Replicate 2  Replicate 1  Replicate 2
θ                 3.4          3.4          5.0          5.1          3.7          3.9          5.3          5.9          4.5
                  2.5-5.0      2.5-4.9      3.7-7.1      3.7-7.2      2.7-5.4      2.8-5.5      3.9-7.2      4.3-7.9      3.3-6.3
τ1                96           155          150          182          141          142          185          208          157
                  52-262       84-400       87-392       110-469      78-367       82-372       111-484      118-548      90-412
τ2                40           30           30           36           21           36           38           41           34
                  14-123       13-104       11-89        13-102       9-72         14-111       16-123       15-127       13-106
τ3                0.40         0.43         0.78         0.78         0.39         0.44         0.48         0.34         0.51
                  0.15-1.49    0.17-1.56    0.33-2.65    0.34-2.49    0.14-1.40    0.18-1.56    0.20-1.68    0.13-1.25    0.21-1.76
τ4                0.051        0.033        0.058        0.046        0.024        0.033        0.025        0.034        0.038
                  0.018-0.163  0.010-0.088  0.018-0.151  0.015-0.116  0.007-0.067  0.010-0.087  0.008-0.065  0.010-0.095  0.012-0.104
Admixture rate 1  0.21         0.28         0.21         0.24         0.24         0.25         0.20         0.18         0.23
                  0.15-0.35    0.21-0.42    0.15-0.32    0.17-0.37    0.17-0.38    0.18-0.37    0.15-0.34    0.13-0.30    0.16-0.36
Admixture rate 2  0.15         0.14         0.08         0.10         0.10         0.15         0.15         0.07         0.12
                  0.03-0.61    0.03-0.45    0.02-0.26    0.02-0.35    0.02-0.34    0.04-0.45    0.04-0.51    0.02-0.26    0.03-0.40
Note: The estimated value (upper line) and 90% credibility interval (lower line) are given for each composite parameter and admixture rate, for each set of
population samples and each replicate of the analysis. The sets of population samples are denoted by the initials of each location (B=Bergamo,
C=Courmayeur, F=Forli, N=Novska, O=Ouessant, S=Sabres).

Table 4. Estimations of natural historico-demographic parameters

Parameter           Mean of modes    Mean 90% credibility interval
Ne (individuals)    2,250            1,650 - 3,150
t1 (years)          628,000          360,000 - 1,648,000
t2 (years)          136,000          52,000 - 424,000
t3 (years)          2,040            840 - 7,040
t4 (years)          152              48 - 416
Note: Estimating the natural parameters requires the mutation rate to be fixed; here a mean
mutation rate of 5 × 10^-4 was assumed (a value usually considered for microsatellite loci in
most species; e.g. Estoup and Angers, 1998). We took a mean generation time of two years
(Franck et al., 2000) to express divergence times in years.


Figure 3. Prior (indicated by an arrow) and posterior distributions of the composite parameters
(θ, τ1, τ2, τ3 and τ4) and admixture rates (1 and 2).
Note: Distributions correspond to the analysis of the set of population samples Sabres-Forli-
Courmayeur-Novska (SFCN). The two posterior distributions correspond to two independent
replicates of the complete analysis.

A more explicit indication is given in Figure 3, which shows the posterior and prior
distributions of the seven parameters for one analysis performed on the population group
SFCN. This figure confirms that our microsatellite data are informative for θ, τ3 and the two
admixture rates, whereas the level of information for the other parameters is considerably lower.
A quantitative measure of the accuracy of parameter estimation was obtained by
analysing 500 data sets simulated under the scenario described in Figure 2, with fixed
parameter values chosen to be close to the best estimates for our honey bee microsatellite
data set. Table 5 shows a strong positive bias for τ1 and τ2 (54% and 70.3%,
respectively) and a substantial negative bias for τ4 (-33.1%).

Table 5. Performance of the ABC method for inferring parameters assessed from
simulated data sets

Parameter (true value)      Mean relative bias    RMSE     Factor 2
θ (5.0)                     -0.028                0.195    0.998
τ1 (150)                     0.540                0.662    0.857
τ2 (30)                      0.703                0.829    0.844
τ3 (0.78)                    0.006                0.527    0.820
τ4 (0.058)                  -0.331                0.431    0.717
Admixture rate 1 (0.21)     -0.124                0.210    0.994
Admixture rate 2 (0.08)      0.065                0.558    0.810
Note: Mean relative bias, relative root mean square error (RMSE), and Factor 2 were estimated on
500 data files simulated with the parameter values indicated in the first column of the table. These
fixed parameter values are close to our estimates for the analysed honey bee data sets (i.e. the modes
of the posterior distributions). This analysis was performed with the first reference file (replicate 1)
of the SFCN group.

On the other hand, the bias was virtually null for τ3 (0.6%) and very low for θ and
admixture rates 1 and 2 (-2.9%, -12.4% and 6.5%, respectively). Although non-negligible, the
RMSE values remain acceptable and the Factor 2 values high for all parameters, with
consistently lower RMSE and higher Factor 2 values for θ and admixture rate 1 than for the
other parameters. This suggests that, provided that the evolutionary scenario for the honey bee
has been correctly modelled, one can be reasonably confident in the order of magnitude of our
estimates for all parameters, with the most accurate estimates being those of θ and admixture
rate 1.

Discussion

Realistic models of admixture may be much more complex than the standard model
assumed by most studies estimating admixture coefficients (e.g. Long, 1991; Bertorelle
and Excoffier, 1998; Wang, 2003; Choisy et al., 2004), which includes a single hybrid
population and two isolated parental populations at mutation-drift equilibrium. They may
indeed involve: (i) more than two source populations (Dupanloup and Bertorelle, 2001), (ii)
regular (and thus not instantaneous) admixture events over relatively long periods
(reviewed in Chakraborty, 1986), (iii) subdivided source populations, so that the actual
parental population is only partially sampled, and (iv) parental population(s) that are not at
mutation-drift equilibrium, due to population size fluctuations or introgression event(s) in a
more or less recent past. This study illustrates the potential of ABC methods to make
inferences on such complex evolutionary scenarios involving several population split
(divergence) and merge (admixture) events. Our study also illustrates the ability of ABC
methods to assess their own performance (bias, mean square error, etc.) at almost no extra
computational cost. For other estimation methods, performance studies generally require a
time-consuming analysis of independently produced simulated data sets (e.g. Choisy et al.,
2004), whereas this is intrinsic to the ABC approach (cf. Figure 1). Indeed, the same
ABC process used to build the reference table can be adapted to produce test data sets with
known parameter values. The same rejection and regression steps can then be applied to
these data sets to produce parameter estimates that can be compared to their known true
values. It is therefore relatively quick and easy to evaluate the performance of the method for
any subset of the parameter space under a given model. The applicability of the ABC method
to particular cases does, however, depend on available computing power: depending on the
scenario and the data set considered, one to a few days of computing time (on a single
processor) are often necessary to obtain the large number of simulated summary statistics on
which the estimation procedure relies (e.g. 10^6 iterations). However, reasonable point
estimates can be obtained using far fewer simulations (e.g. a ten times lower number of
iterations in the present case study; results not shown) and hence shorter computation times.
It seems reasonable to anticipate that progress in simulation algorithms and higher computing
power will become available in future years, promoting the ABC method as the method of
choice for analysing complex evolutionary scenarios.
The potential of ABC for inference on admixture scenarios was already studied in Excoffier
et al. (2005), but the latter study focussed on a simple (i.e. standard) admixture scenario and
was based on a different simulation program, Simcoal (Laval and Excoffier, 2004). This
coalescent-based package also provides a versatile tool for simulating genetic data under
complex scenarios. Simcoal is considerably slower, however, than the program diyABC,
which allows switching from the slow generation-by-generation coalescent algorithm to the
fast continuous coalescent algorithm (Hudson, 1990) when appropriate. Moreover, in contrast
to Simcoal, diyABC allows one to perform a complete ABC analysis from start to finish.
Finally, diyABC is a user-friendly Windows program with a graphical interface that is easy to
handle even for biologists unfamiliar with inference algorithms. It is worth noting, however,
that in contrast to diyABC, Simcoal implements gene flow between populations after they have
split as well as the simulation of partially or fully linked markers, including DNA sequence
data. We hope to include at least some of those aspects in a second version of diyABC.
Our results on the honey bee data set indicate that, provided that the evolutionary
scenario has been correctly modelled, one can be reasonably confident in the order of
magnitude of our estimates for all parameters, with the most accurate estimates being those
of θ and admixture rate 1. Several features of our results should, however, be discussed here.
The variation of estimates observed when considering different potential parental samples is
expected because intra-population genetic diversity, and hence effective population size,
varies between the samples. This, however, translates into only slightly different
estimates for most parameters. As in all Bayesian analyses, one should be aware that the
prior distributions (chosen as flat priors for the natural parameters here) may have a
substantial effect on the posterior distributions, especially for the parameters about
which the data bring little information. One should hence expect a small effect of the
prior distributions for θ, τ3 and the two admixture rates, and a stronger effect for the other
parameters. Additional test simulations using different (non-flat) priors confirmed this
expectation (results not shown). Figure 3 shows that, whereas the posterior distributions of
replicate analyses superimpose for some parameters (θ and τ3), this is not the case for the
five other parameters. It hence seems that more than 8 × 10^6 simulated data sets are
necessary to eliminate any variation of parameter estimates between reference files. This is
not surprising if one considers the deciles of a prior distribution. The probability of drawing a
value in a particular decile is 1/10 for a single parameter and 10^-n for a set of n parameter
values (10^-7 in our setting). There is also substantial variability of genetic data, and hence of
summary statistic values, for a given set of parameter values. It is hence not surprising that a
very high number of simulations is needed to eliminate any variation of parameter estimates
among replicate analyses. Fortunately, this variation remains limited in comparison to that
intrinsic to the inferential process. One alternative would be to explore the parameter space
more efficiently by favouring locations that provide the summary statistics closest to the
target observed values. Marjoram et al. (2003) proposed coupling the ABC procedure to a
Markov chain Monte Carlo method to do so. However, much work is still needed to perform
this coupling efficiently for complex evolutionary scenarios (unpublished results). Other
algorithms similar to the Sampling Importance Resampling algorithms (Gelman et al., 1995)
are presently under study as a possible alternative for a more efficient exploration of the
parameter space in an ABC framework. Finally, one of the most interesting challenges for the
ABC method concerns the choice of the statistics summarizing genetic information, which
remains relatively arbitrary. So far, attempts to optimize this choice have turned out to be
unsatisfactory, and additional work is clearly needed in this specific field to optimize ABC methods.
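The decile argument made in this paragraph can be quantified with a one-line calculation. The sketch below is only an illustration of that argument (the function name is hypothetical): it computes how many simulated parameter sets are expected to fall in any one joint cell when each of the n parameter ranges is cut into ten deciles.

```python
def expected_draws_per_cell(n_simulations, n_parameters, n_bins=10):
    """Expected number of prior draws falling in one joint cell of the parameter grid."""
    return n_simulations / n_bins ** n_parameters

# Seven parameters and a reference file of 8 x 10^6 simulated data sets, as in the text.
per_cell = expected_draws_per_cell(8_000_000, 7)
print(f"expected simulations per joint decile cell: {per_cell}")
```

With fewer than one simulation expected per cell, which cells happen to be sampled differs from one reference file to the next, which is in line with the residual variation between replicate analyses.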
Regarding the evolutionary history of the honey bee, this study confirms some previously
published results and brings new insights. The complex evolutionary scenarios considered
here, involving several population divergence and admixture events between populations of
lineages M and C, have been suggested mostly by mtDNA data (Garnery et al., 1992; Franck
et al., 2000), but also by previous analyses of microsatellite data (Excoffier et al., 2005). Our
ABC analysis of microsatellite data confirms an ancient divergence of two evolutionary lineages
(M in the North-West and C in the South-East of Europe) around -0.6 My (90% credibility
interval: -0.36 to -1.67 My), which is compatible with the divergence times of -0.67 My (Arias
and Sheppard, 1996) and -1 My (Garnery et al., 1992) estimated from mtDNA sequence data
(credibility intervals not computed). The divergence time estimated for the French and Italian
populations of lineage M (-0.14 My; -0.05 to -0.44 My) is also compatible with the
estimate provided by mtDNA (-0.19 My; Franck et al., 2000). Interestingly, our ABC
treatment indicates a relatively recent time of admixture of the Italian populations of
lineage M by populations from lineage C (-2,000 years; -1,000 to -7,000 years). The
introgressive invasion of the Italian populations of lineage M by populations from
lineage C hence occurred after the last glacial event. This evolutionary event could not be
precisely dated from mtDNA data. Our ABC treatment suggests that the admixture between
the A. m. mellifera and A. m. ligustica subspecies in Italy is very recent and may have started
during the 19th century, at least in the Val d'Aoste area. The credibility interval on this date
allows for an even more recent event, which would be compatible with the trading of
honey bee queens that developed in the second part of the 20th century. However, the close
similarity between the prior and posterior distributions should prompt us to be careful in our
conclusions on this parameter. Finally, the admixture rate values obtained with the present
ABC approach are in agreement with previous estimates based on microsatellite markers
(Franck et al., 2000; Choisy et al., 2004; Excoffier et al., 2005). It is worth noting, however,
that previous estimates (around 35% of genes of A. m. mellifera origin) did not differentiate
the A. m. mellifera genes of local origin from those brought by A. m. ligustica. Because the
evolutionary model treated in the ABC approach involved two admixture events instead of
one, we could infer that ca. 23% of the genes in the admixed population from Courmayeur
have a local A. m. mellifera origin and that, of the remaining 77%, 12% (i.e. 77% × 12% = 9.2%
of the total) are A. m. mellifera genes of the M lineage that persisted in A. m. ligustica
parental populations. The total proportion of A. m. mellifera genes can therefore be estimated
as 23% + 9.2% = 32.2%, a value close to the previously estimated value of 35%.
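The combination of the two admixture events at the end of this paragraph can be written as a short calculation. The function and argument names below are illustrative only; the two rates are the mean point estimates quoted in the text.

```python
def total_m_lineage_fraction(local_rate, m_rate_in_ligustica):
    """Total M-lineage share in the admixed population after two admixture events.

    local_rate          : share of genes of local A. m. mellifera origin
    m_rate_in_ligustica : share of M-lineage genes within the A. m. ligustica parent
    """
    inherited_via_ligustica = (1.0 - local_rate) * m_rate_in_ligustica
    return local_rate + inherited_via_ligustica

total = total_m_lineage_fraction(0.23, 0.12)
print(f"total M-lineage fraction: {total:.4f}")
```

Without intermediate rounding the product 0.77 × 0.12 is 0.0924, so the total is 0.3224; the text rounds the intermediate term to 9.2% and reports 32.2%.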

Acknowledgments

This research was financially supported by a grant from the French Bureau des
Ressources Génétiques and the French Agence Nationale de la Recherche grant No NT05-4-42230,
as well as an INRA department SPE grant to JMC and AE. LE was supported by a
Swiss National Science Foundation grant No 3100A0-112072, as well as a grant from the
Institut de la Recherche Agronomique during his sabbatical visit at the CBGP. The program
"do it yourself ABC" (diyABC) used to perform our ABC computations is available on
request from JMC or AE.

References

Arias M.C., Sheppard W.S. 1996. Molecular phylogenetics of honey bee subspecies (Apis
mellifera L.) inferred from mitochondrial DNA sequence. Mol. Phyl. Evol., 5:557-566.
Beaumont M.A., Zhang W., Balding D.J. 2002. Approximate Bayesian computation in
population genetics. Genetics 162:2025-2035.
Berthier P., Beaumont M.A., Cornuet J-M., Luikart G. 2002. Likelihood-based estimation of
the effective population size using temporal changes in allele frequencies: a genealogical
approach. Genetics 160:741-751.
Bertorelle G., Excoffier L. 1998. Inferring admixture proportions from molecular data. Mol.
Biol. Evol. 15:1298-1311.
Chakraborty R. 1986. Gene admixture in human populations: models and predictions. Yearb.
Phys. Anthropol. 29:1-43.
Chikhi L., Bruford M.W., Beaumont M.A. 2001. Estimation of admixture proportions: a
likelihood-based approach using Markov chain Monte Carlo. Genetics 158:1347-1362.
Choisy M.P., Franck P., Cornuet J-M. 2004. Estimating admixture proportions with
microsatellites: comparison of methods based on simulated data. Mol. Ecol. 13:955-968.
Dib C., Faure S., Fizames C., Samson D., Drouot N., et al. 1996. A comprehensive map of
the human genome based on 5,264 microsatellites. Nature 380:152-154.
Dupanloup I., Bertorelle G. 2001. Inferring admixture proportions from molecular data:
extension to any number of parental populations. Mol. Biol. Evol. 18:672-675.
Estoup A., Garnery L., Solignac M., Cornuet J-M. 1995. Microsatellite variation in honeybee
(Apis mellifera L.) populations: hierarchical genetic structure and test of the infinite
allele and stepwise mutation model. Genetics 140:679-695.
Estoup A., Angers B. 1998. Microsatellites and minisatellites for molecular ecology:
theoretical and empirical considerations. In Carvalho G. R. (ed.) Advances in molecular
ecology. Nato Sciences Series, IOS Press, Amsterdam, pp. 55-86.
Estoup A., Cornuet J-M. 1999. Microsatellite evolution: inferences from population data.
In: Microsatellites: evolution and applications (eds. Goldstein D.B., Schlötterer C.),
Oxford University Press, Oxford, pp. 50-65.
Estoup A., Wilson I.J., Sullivan C., Cornuet J-M., Moritz C. 2001. Inferring population
history from microsatellite and enzyme data in serially introduced cane toads, Bufo
marinus. Genetics 159:1671-1687.
Estoup A., Jarne P., Cornuet J-M. 2002. Homoplasy and mutation model at microsatellite loci
and their consequence for population genetics analysis. Mol. Ecol. 11:1591-1604.
Estoup A., Clegg S.M. 2003. Bayesian inferences on the recent island colonization history by
the bird Zosterops lateralis lateralis. Mol. Ecol. 12:657-674.
Estoup A., Beaumont M.A., Sennedot F., Moritz C., Cornuet J-M. 2004. Genetic analysis of
complex demographic scenarios: spatially expanding populations of the cane toad, Bufo
marinus. Evolution 58:2021-2036.
Excoffier L., Estoup A., Cornuet J-M. 2005. Bayesian analysis of an admixture model with
mutations and arbitrarily linked markers. Genetics 169:1727-1738.
Franck P., Garnery L., Celebrano G., Solignac M., Cornuet J-M. 2000. Hybrid origins of
honeybees from Italy (Apis mellifera ligustica) and Sicily (A. m. sicula). Mol. Ecol.
9:907-921.
Fu Y.X., Li W.H. 1997. Estimating the age of the common ancestor of a sample of DNA
sequences. Mol. Biol. Evol. 14:195-199.
Garnery L., Cornuet J-M., Solignac M. 1992. Evolutionary history of the honey bee Apis
mellifera inferred from mitochondrial DNA analysis. Mol. Ecol. 1:145-154.
Gelman A., Carlin J.B., Stern H.S., Rubin D.B. 1995. Bayesian Data Analysis. Chapman and
Hall, London.
Goldstein D.B., Ruiz Linares A., Cavalli-Sforza L.L., Feldman M.W. 1995. Genetic absolute
dating based on microsatellites and the origin of modern humans. Proc. Natl. Acad. Sci.
USA 92:6723-6727.
Hamilton G., Stoneking M., Excoffier L. 2005. Molecular analysis reveals tighter social
regulation of immigration in patrilocal populations than in matrilocal populations. Proc.
Natl. Acad. Sci. USA 102:7476-7480.
Hudson R.R. 1990. Gene genealogies and the coalescent process. In: Antonovics J. (ed.)
Oxford surveys in evolutionary biology. Oxford University Press, Oxford, pp. 1-44.
Hudson R.R. 2002. Generating samples under a Wright-Fisher neutral model of genetic
variation. Bioinformatics 18:337-338.
Laval G., Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over
large recombining regions in a subdivided population with a complex history.
Bioinformatics 20:2485-2487.
Long J.C. 1991. The genetic structure of admixed populations. Genetics 127:417-428.
Marjoram P., Molitor J., Plagnol V., Tavaré S. 2003. Markov chain Monte Carlo without
likelihoods. Proc. Natl. Acad. Sci. USA 100:15324-15328.
Miller N., Estoup A., Toepfer S., Bourguet D., Lapchin L. et al. 2005. Multiple transatlantic
introductions of the western corn rootworm. Science 310:992.
Nei M. 1987. Molecular Evolutionary Genetics. Columbia University Press, New York.
Nielsen R., Wakeley J. 2001. Distinguishing migration from isolation: a Markov chain Monte
Carlo approach. Genetics 158:885-896.
Plagnol V., Tavaré S. 2004. Approximate Bayesian computation and MCMC. In Niederreiter
H. (ed.) Monte Carlo and Quasi-Monte Carlo methods, Springer-Verlag, pp. 99-114.
Pritchard J., Seielstad M., Perez-Lezaun A., Feldman M. 1999. Population growth of human
Y chromosomes: a study of Y chromosome microsatellites. Mol. Biol. Evol. 16:1791-
1798.
Ruttner F. 1988. Biogeography and taxonomy of honeybees. Springer-Verlag Berlin.
Shriner D., Liu Y., Nickle D. C., Mullins J. I. 2006. Evolution of intrahost HIV-1 genetic
diversity during chronic infection. Evolution 60:1165-76.
Stephens M. 2003. Inference under the coalescent. In D.J. Balding, M. Bishop and C.
Cannings (eds). Handbook of Statistical Genetics. Wiley, Chichester, pp. 213-238.
Tanaka M.M., Francis R.F., Luciani F., Sisson S.A. 2006. Using approximate Bayesian
computation to estimate tuberculosis transmission parameters from genotype data.
Genetics 173:1511-1520.
Tavaré S., Balding D.J., Griffiths R.C., Donnelly P. 1997. Inferring coalescence times from
DNA sequence data. Genetics 145:505-518.
Wang J. 2003. Maximum-likelihood estimation of admixture proportions from genetic data.
Genetics 164:747-765.
Weir B. S., Cockerham C.C. 1984. Estimating F-statistics for the analysis of population
structure. Evolution 38:1358-1370.
Whitfield C.W., Behura S.K., Berlocher S.H., Clark A.G., Johnston J.S. et al. 2006. Thrice
out of Africa: ancient and recent expansions of the honey bee, Apis mellifera. Science
314:642-645.
Williamson E.G., Slatkin M. 1999. Using maximum likelihood to estimate population size
from temporal changes in allele frequencies. Genetics 152:755-761.
Wilson I.J., Balding D.J. 1998. Genealogical inference from microsatellite data. Genetics
150:499-510.


Chapter 11

Geographic Structure of Craniometric
Variation and the Estimates of Possible
Dispersal Routes of Major Human
Populations

Tsunehiko Hanihara

Department of Anatomy and Biological Anthropology,
Saga Medical School, 5-1-1 Nabeshima, Saga 849-8501, Japan

Abstract

In the last decade, a near consensus has emerged in support of a single African origin
of modern humans. However, the timing of dispersal out of Africa and the routes taken
are far from obvious and remain a focus of debate. In the present study, possible dispersal
routes taken across Eurasia and finally to the New World and the Pacific were investigated
using a craniometric dataset consisting of 34 measurements. The degree of intra-regional
variation shows that sub-Saharan Africans are the most diverse and that the diversity of
non-Africans is negatively correlated with geographic distance from East Africa. The
relationship between regional variation and geographic distance from sub-Saharan Africa,
tested by linear regression analysis, supports a possible dispersal route proposed from
research on mtDNA haplotype variation: the Horn of Africa (the route across the Bab el
Mandeb Strait) as a passageway in the major human migration out of Africa. The results
obtained support, moreover, the multiple migration hypothesis for the peopling of the
East/Northeast Asian region, mainly from central/western Asia with a minor contribution
from Southeast Asia. Nonlinear regression (exponential approximation) analysis using
geographic distance measured along a hypothetical dispersal route shows that phenotypic
similarity between populations decreases as geographic distance increases. Such
findings suggest that geographic distance is a primary and significant determinant of not
only genetic but also craniometric variation between major human population groups.
The present study illustrates that modern human cranial diversity patterns fit an
evolutionary model of neutral expectation and a dispersal model of iterative founder
effects with an African origin.

Correspondence to: Tsunehiko Hanihara, Department of Anatomy and Biological Anthropology, Saga Medical
School, 5-1-1 Nabeshima, Saga, 849-8501, Japan; E-mail: hanihara@cc.saga-u.ac.jp

Keywords: phenotypic diversity, intra-regional variation, migration, R-matrix method,
neutral expectation.

Introduction

Understanding how modern human diversity is structured is important for
understanding the process of modern human evolution, because the patterns of human
variation are intimately linked with the origin and dispersals of modern humans (Cavalli-
Sforza et al., 1994; Relethford, 1994, 2002; Relethford and Harpending, 1994; Relethford
and Jorde, 1999; Jorde et al., 2000; Thomson et al., 2000; Ke et al., 2001; Underhill et al.,
2000; Wells, 2002; Oppenheimer, 2003; Mellars, 2006). Recent genetic and
morphological findings show not only that sub-Saharan Africans have greater
diversity than other regional populations, but also that there is a sequential decrease in
diversity with distance from Africa (Relethford and Harpending, 1994; Harpending and
Rogers, 2000; Relethford, 2004a, 2004b; Li et al., 2008).
The greater diversity of sub-Saharan Africans is explained by differences in
population size, differences in time since the founding of the populations, or both (Relethford
and Harpending, 1994; Jorde et al., 1997; Relethford and Jorde, 1999; Excoffier, 2002).
Several lines of evidence for gradients of genetic and phenotypic diversity among major
geographic populations point to a process in which modern human population groups
occupied their present range through iterative bottleneck effects (Harpending and Rogers, 2000;
Ayub et al., 2003; Relethford, 2004b; Prugnolle et al., 2005; Ramachandran et al., 2005; Liu
et al., 2006; Manica et al., 2007; Cramon-Taubadel and Lycett, 2008).
In relation to the patterns of human variation, possible expansion and colonization routes
throughout the Eurasian continent and their time scales have been extensively studied, but
several issues remain controversial. First, regarding the migration routes between Africa and
Eurasia, two major migratory pathways are presumed: the Levant corridor and the Horn of
Africa (Cavalli-Sforza et al., 1994; Lahr, 1996; Stringer, 2000; Bosch et al., 2001; Underhill
et al., 2001; Kivisild et al., 2004; Luis et al., 2004; Forster and Matsumura, 2005). The results
obtained by Y-chromosome analyses suggest that the Levant corridor, the northern route,
may be of major importance in human migratory movements between Africa and Eurasia
(Underhill et al., 2001; Luis et al., 2004). On the other hand, recent mtDNA analyses indicate
the importance of the Horn of Africa, or southern route, which crosses the Bab el Mandeb
Strait and follows the Indian Ocean coastline (Quintana-Murci et al., 1999; Stringer, 2000;
Oppenheimer, 2003; Forster and Matsumura, 2005; Macaulay et al., 2005; Chandrasekar et
al., 2007; Hudjashov et al., 2007). One more possible migration route, the Strait of Gibraltar
connecting Iberia and the Maghreb (northwestern Africa), made only a minor contribution to
gene flow between Africa and Eurasia (Bosch et al., 2001).
Another major focus of interest is the peopling of East/Northeast Asia. Several
researchers indicate that the human occupation of East/Northeast Asia resulted from a
northward expansion of Southeast Asian populations in the late Pleistocene (Turner, 1987, 1990;
Ballinger et al., 1992; Disotell, 1999; Li and Su, 2000; Oppenheimer, 2003; Shi et al., 2005).
On the other hand, some genetic and phenotypic studies suggest multiple migrations to
East/Northeast Asia from somewhere around the western half of Eurasia, Central Asia, and
South Siberia (Underhill et al., 2001; Wells, 2002; Uinuk-Ool et al., 2003; Hanihara, 2006,
2008; Hill et al., 2007).
In recent years, a quantitative genetic approach, the R-matrix method, has made it possible
to examine human phenotypic variation within and among geographic regions in terms of the
patterns of modern human diversity and demographic history (Relethford, 1994, 1996, 2001;
Relethford and Harpending, 1994; Hanihara and Ishida, 2005; Hanihara, 2008). The results
obtained provide several lines of evidence that modern human cranial variation, when
considered as a whole, varies across regions in a manner matching neutral evolution (Lynch
and Hill, 1989; Relethford, 1994, 2002, 2004a, 2004b; Roseman and Weaver, 2004; Cramon-
Taubadel and Lycett, 2008; Hanihara, 2008).
If neutral evolution is to a large extent responsible for the diversity among modern
human craniofacial features, it should be possible to evaluate temporal and spatial aspects
of human dispersal from Africa, namely its timing and the routes taken. However, no
estimates of the possible dispersal routes taken by early anatomically modern humans based
on morphological data have so far been reported.
Against this background, the purpose of the present study is to explore the possible
migration and colonization routes taken across Eurasia and finally to the New World and
Oceania based on cranial measurements.


A worldwide craniometric dataset drawn from 14 major geographic regions,
totaling 9614 male and 3600 female adult specimens, was used in this study. Brief
information on the samples used is given in Table 1. Detailed information on country of
origin, tribal affiliation, and cultural background is given elsewhere (Hanihara, 2008).
Phenotypic variation was assessed using 34 craniofacial measurements (Table 2). All the data
were recorded by the author to avoid possible interobserver error.
To assess the relationship between the degree of diversity of each geographic
population and geographic distance from sub-Saharan Africa, the analytical approach used
here is the R-matrix method (Relethford and Blangero, 1990; Relethford, 1994; Relethford and
Harpending, 1994). In the present study, moreover, the isolation-by-distance model
developed by Relethford (2004a) was applied to examine the level of correlation between
geographic and phenotypic distance and to confirm the relevance of the estimated dispersal
route. This model can be expressed in terms of the elements of the R-matrix and related
parameters as follows,

$r_{ij} = (F_{ST} - r_{\min})\, e^{-bd} + r_{\min}$    [1]

Table 1. Materials used in this study

Regional sample           Male  Female  Local populations
Sub-Saharan Africa         817     177  West Africa (Gambia, Guinea, Ivory Coast, Liberia, Senegal, Sierra Leone),
                                        Ghana/Ashanti, Nigeria/Ibo, Cameroon, Congo, Gabon, Somalia, Ethiopia,
                                        Kenya, Tanzania, Uganda, Rwanda,
                                        South Africa (Zambia, Zimbabwe, Malawi, Mozambique, Lesotho),
                                        South Africa/Zulu, South Africa/Khoi-San
North Africa*              582     105  Pre Dynasty (Badari, Naqada), Early Dynasty (Lisht),
                                        Middle Dynasty (Cairo, Gizeh, Omdurman), Recent Egypt,
                                        Nubia/Dynasty, Nubia/Recent, Morocco
West Asia                  350      47  Afghanistan, Pakistan, Iran, Iraq, Israel, Syria, Palestine, Turkey, Cyprus
Europe                    1354     320  Russia, Czech, Poland, Hungary, Rumania, Greece, Yugoslavia, Italy,
                                        Finland, Sweden, Norway, Holland, Austria, Germany, Switzerland, France,
                                        Spain, Portugal, United Kingdom
South Asia                 532      60  Nepal, Assam-Sikkim, Bengal, Punjab, Bombay, Malabar Coast,
                                        Mysore, Madras, Ceylon/Veddah, Bangladesh, Bhutan
East/Northeast Asia       1140     529  Buryats, Amur Basin, Mongol, Japan, North China, South China, Tibet
Southeast Asia             781     139  Vietnam, Thailand, Malay, Laos, Cambodia, Myanmar, Andaman, Nicobar,
                                        Sumatra, Java, Borneo, Celebes, Molucca, Lesser Sunda, Philippines,
                                        Negritos/Philippines
Australia                  355     110  Northern Territory, Queensland, New South Wales, Victoria,
                                        South Australia, Tasmania, Western Australia
Melanesia                  621     286  New Guinea, Torres Strait, New Britain, New Ireland, Fiji,
                                        Solomon, Vanuatu, New Caledonia
Micronesia                  98      63  Mariana, Caroline
Polynesia                  501      93  Tonga, Samoa, Society, Cook, Marquesas, Hawaii, Easter,
                                        New Zealand/Maori, Chatham Islands/Moriori
Arctic                     660     526  Aleuts, Chukchis, Inuits/Asia, Inuits/Alaska, Inuits/Canada,
                                        Inuits/Greenland
North America             1360     908  Subarctic, Northwest Coast, California, Plateau, Great Basin,
                                        Arizona, New Mexico, Plain/North, Plain/South, Northeast Woodland,
                                        Southeast Woodland
Central/South America      442     233  Mexico, Carib, Intermediate, Peru, South Andes, Patagonia, Fuego

*: R-matrix theory and analysis are synchronic in nature. However, both the Egyptian and Nubian cranial
series, from pre-Dynastic to recent times through the Christian period, exhibit relative homogeneity,
suggesting overall post-Neolithic diachronic and regional population continuity (Irish, 2005, 2006;
Hanihara, 2008).

Table 2. List of 34 craniofacial measurements

1. Maximum cranial length (GOL)
2. Nasion-opisthocranion (NOL)
3. Cranial base length (BNL)
4. Maximum cranial breadth (XCB)
5. Minimum frontal breadth (M9)
6. Maximum frontal breadth (XFB)
7. Biauricular breadth (AUB)
8. Biasterionic breadth (ASB)
9. Basion bregma height (BBH)
10. Sagittal frontal arc (M26)
11. Sagittal parietal arc (M27)
12. Sagittal occipital arc (M28)
13. Nasion-bregma chord (FRC)
14. Bregma-lambda chord (PAC)
15. Lambda-opisthion chord (OCC)
16. Basion prosthion length (BPL)
17. Breadth between Frontomalare temporale (M43)
18. Bizygomatic breadth (ZYB)
19. Middle facial breadth (M46)
20. Nasion prosthion height (NPH)
21. Interorbital breadth (DKB)
22. Orbital breadth (M51)
23. Orbital height (OBH)
24. Nasal breadth (NLB)
25. Nasal height (NLH)
26. Palate breadth (MAB)
27. Mastoid height (MDH)
28. Mastoid width (MDB)
29. Frontal chord (M43(1))
30. Frontal subtense (No 43c)
31. Simotic chord (M57, WNB)
32. Simotic subtense (No 57a, SIS)
33. Zygomaxillary chord (M46b, ZMB)
34. Zygomaxillary subtense (No 46c, SSS)
For additional description, see Howells (1973, 1989).
M: Martin and Saller (1957); No: Bräuer (1988).

According to Relethford (2004a), under isolation by distance, the expected correlation
between populations i and j is of the form $k e^{-bd}$, where d is the geographic distance
between the two populations and b is the rate of distance decay.
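
As a rough illustration of the exponential decay in equation [1], the sketch below evaluates the expected similarity at a given distance and the distance at which similarity has fallen halfway from Fst towards r_min, which follows from the exponential form as ln 2 / b. The function names and all three numerical values are assumptions chosen for illustration; they are not estimates from this study.

```python
import math

def expected_similarity(d_km, fst, r_min, b):
    """Expected R-matrix element at geographic distance d_km under equation [1]."""
    return (fst - r_min) * math.exp(-b * d_km) + r_min

def half_distance(b):
    """Distance at which similarity has fallen halfway from fst towards r_min."""
    return math.log(2.0) / b

# Illustrative values only (assumptions, not estimates from this study).
FST, R_MIN, B = 0.19, -0.10, 0.0002

at_origin = expected_similarity(0.0, FST, R_MIN, B)
at_half = expected_similarity(half_distance(B), FST, R_MIN, B)
print(f"d = 0 km: {at_origin:.4f}")
print(f"d = {half_distance(B):.0f} km: {at_half:.4f}")
```

At d = 0 the model returns Fst, and at very large distances it approaches r_min, so b alone controls how quickly similarity decays between those two limits.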
To apply the R-matrix method, an estimate of the average heritability of the craniometric
traits used is required. In the present study, the average heritability of $h^2 = 0.55$
obtained by Devor (1987) and adopted in many other studies (Relethford, 1994, 2004a;
Relethford and Harpending, 1994; Donnelly and Konigsberg, 1998; and many others) was used.
For each pair of regional samples, geographic distance was calculated in kilometers
based on great circle distances (Relethford, 2004a; Manica et al., 2005; Ramachandran et al.,
2005). Following Ramachandran et al. (2005) and Cramon-Taubadel and Lycett (2008),
pairwise geographic distances were calculated using 12 waypoints (as shown in Figure 1) to
make the estimates of between-regional population distances more reflective of human
migration patterns.


Figure 1. Maps showing waypoints (solid circles) and possible colonization routes through them.
The distance between two regions is the sum of the great circle distances between the
regional center of each geographic region and the waypoint in the path connecting them, plus
the great circle distances between waypoints if two or more waypoints exist (Ramachandran
et al., 2005). That is, I took the geographic center of the local samples within each regional
cluster as the geographic coordinates of that cluster, and then computed the great circle
distances between these 14 geographic points, adjusting for waypoints. The point of origin
for modern humans is tentatively set in Nairobi, Kenya, a likely region of origin of
anatomically modern humans (Harpending et al., 1993; Lahr, 1996; Ke et al., 2001;
Manica et al., 2005, 2007; Prugnolle et al., 2005; Ramachandran et al., 2005; Liu et al., 2006;
Mellars, 2006).
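
The waypoint procedure described above can be sketched as follows. This is a minimal illustration using the haversine formula and an assumed mean Earth radius of 6371 km; the coordinates, the choice of a single waypoint, and the function names are hypothetical and are not the regional centers or the 12 waypoints actually used in this study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant

def great_circle_km(p, q):
    """Haversine great circle distance between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def route_km(origin, destination, waypoints=()):
    """Sum of great circle legs: origin -> waypoints in order -> destination."""
    stops = [origin, *waypoints, destination]
    return sum(great_circle_km(a, b) for a, b in zip(stops, stops[1:]))

# Hypothetical coordinates for illustration (lat, lon in degrees).
nairobi = (-1.29, 36.82)
cairo_waypoint = (30.04, 31.24)
europe_center = (48.0, 15.0)

direct = great_circle_km(nairobi, europe_center)
via_waypoint = route_km(nairobi, europe_center, [cairo_waypoint])
print(f"direct: {direct:.0f} km, via waypoint: {via_waypoint:.0f} km")
```

Routing through a waypoint can only lengthen a path relative to the direct great circle, which is why the choice of waypoints changes the distances assigned to each hypothetical route.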
In the present study, four dispersal routes out of Africa are assumed (Figure 1). The first
(Figure 1a) is the route along the Nile River and across the Sinai peninsula and the Levant
leading into western Asia, Europe, the northern part of eastern Eurasia and the New World via
the Bering Strait, and into Southeast Asia along the Indian Ocean coastline and Australasia
via the Wallace Strait (Jones et al., 1992; Kingdon, 1993; Lewin, 1993). The second, known as
the multiple exodus hypothesis (Figure 1b), argues for at least a northern route via the
Levant region to Europe, onward to northeastern Asia and finally to the New World, while
acknowledging the possibility of an earlier southern route from eastern Africa across the
Bab-el-Mandeb Strait into southern Arabia and along the South Asian coast of the Indian
subcontinent to Southeast Asia and Australia (Lahr and Foley, 1994; Lahr, 1996;
Quintana-Murci et al., 1999; Stringer, 2000; Underhill et al., 2001; Luis et al., 2004;
Forster and Matsumura, 2005; Thangaraj et al., 2005; Macaulay et al., 2005). The third
(Figure 1c) is a single southern route out of Africa via the Bab-el-Mandeb Strait at the Red
Sea and along the Indo-Pacific coast to Southeast Asia and Australia. It includes, on the one
hand, a northern expansion from Southeast Asia to China, Japan and the Northeast Asian region,
and finally to the New World (Turner, 1987, 1990) and, on the other hand, a migration from the
Arabian (Persian) Gulf to the Levant and onward to Europe and North Africa (Oppenheimer,
2003). The fourth (Figure 1d) is similar to the third, but hypothesizes the peopling of the
East/Northeast Asian region from west/central Asia via northern Siberian routes
(Wells, 2002; Uinuk-Ool et al., 2003; Chandrasekar et al., 2007).

Results

Under the assumption of equal effective population sizes, interregional variation
among the 14 geographic groups was estimated by Fst values. The Fst values obtained using the
average heritability of craniometric traits, $h^2 = 0.55$, and the minimum Fst values
($h^2 = 1.00$) are shown in Table 3 for the male and female samples separately. The results
show that craniometric variation across worldwide regions is fairly limited. Moreover, the
standard errors, being sufficiently small relative to the Fst values, support the significance
of the parameters used in the isolation-by-distance model.

Table 3. Minimum and estimated Fst values for both male and female cranial series

          Minimum Fst    S.E.      Fst 1)    S.E.
Male      0.1132         0.0009    0.1884    0.0010
Female    0.1188         0.0019    0.1969    0.0021

1) Average heritability of $h^2 = 0.55$.
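
The two Fst columns of Table 3 are linked through the assumed heritability. The text does not state the conversion explicitly, so the sketch below assumes the relationship commonly used in R-matrix analysis of quantitative traits, Fst = Fst_min / (Fst_min + h^2 (1 - Fst_min)); under that assumption the tabulated values are reproduced to four decimal places. The function name is hypothetical.

```python
def adjust_fst(min_fst, h2):
    """Convert a minimum Fst (computed with h2 = 1) to an estimate for heritability h2.

    Assumed relationship: Fst = Fst_min / (Fst_min + h2 * (1 - Fst_min)).
    """
    return min_fst / (min_fst + h2 * (1.0 - min_fst))

# Minimum Fst values taken from Table 3; h2 = 0.55 as assumed in the text.
for label, min_fst in (("Male", 0.1132), ("Female", 0.1188)):
    print(label, round(adjust_fst(min_fst, 0.55), 4))
```

Because the denominator shrinks as heritability falls, any h^2 below 1 raises the estimate above the minimum Fst, which is why the minimum value is a conservative lower bound.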

Table 4 gives the intraregional variation calculated by applying Relethford and
Blangero's (1990) method, together with the geographic distances from sub-Saharan Africa
(Nairobi, Kenya) to the major geographic regions measured along the four hypothetical
migration routes shown in Figure 1. The relationships between intraregional variation and
geographic distance from eastern Africa (Table 4) are presented for both sexes in Figure 2.
The scatterplots show an inverse linear relationship between within-regional phenotypic
variance and geographic distance, with samples more geographically distant from sub-Saharan
Africa (eastern Africa) characterized by less phenotypic variation. The fourth route
(Figure 1d) yields significant inverse linear relationships between within-group craniometric
variance and geographic distance.

Table 4. Observed variance for 14 geographic samples and the geographic distance (km)
from eastern Africa (Nairobi, Kenya) measured by great circle distances along the
hypothetical dispersal routes shown in Figure 1

                        Regional variance  Distance from eastern Africa (km)
Regional group              Male  Female   Route 1   Route 2   Route 3   Route 4
Sub-Saharan Africa        1.0856  1.1482      0.00      0.00      0.00      0.00
North Africa              0.9890  0.9845   3977.64   3977.64   8565.55   8565.55
West Asia                 0.9982  0.9459   4885.95   4885.95   5848.91   5848.91
Europe                    1.0190  1.0059   6628.75   6628.75   8370.26   8370.26
South Asia                0.9604  1.0288   9048.15   6792.99   6792.99   6792.99
East/Northeast Asia       1.0351  1.0893  11774.22  11774.22  12872.05  10409.70
Southeast Asia            0.9973  1.1115  12233.32   9978.16   9978.16   9978.16
Australia                 0.9936  1.0269  17134.45  14879.29  14879.29  14879.29
Melanesia                 0.9931  1.0152  18262.97  16007.81  16007.81  16007.81
Micronesia                0.9880  0.9011  16137.94  13882.78  13882.78  13882.78
Polynesia                 0.9587  1.0060  22940.25  20685.09  20685.09  20685.09
Arctic                    0.9577  1.0458  16844.48  16844.48  19464.62  15479.96
North America             1.0199  0.9727  20283.37  20283.37  22903.51  18918.85
Central/South America     0.9826  0.8994  26632.84  26632.84  29252.98  25268.32
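
The direction of the relationship between regional variance and distance can be checked directly from the Table 4 values. The sketch below computes the Pearson correlation and the least-squares slope for the route 4 distances; it is an illustration only, does not reproduce the significance tests reported in this chapter, and uses hypothetical function names.

```python
import math

# Regional variances and route 4 distances (km), copied from Table 4.
male = [1.0856, 0.9890, 0.9982, 1.0190, 0.9604, 1.0351, 0.9973,
        0.9936, 0.9931, 0.9880, 0.9587, 0.9577, 1.0199, 0.9826]
female = [1.1482, 0.9845, 0.9459, 1.0059, 1.0288, 1.0893, 1.1115,
          1.0269, 1.0152, 0.9011, 1.0060, 1.0458, 0.9727, 0.8994]
route4_km = [0.00, 8565.55, 5848.91, 8370.26, 6792.99, 10409.70, 9978.16,
             14879.29, 16007.81, 13882.78, 20685.09, 15479.96, 18918.85, 25268.32]

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

for label, variances in (("male", male), ("female", female)):
    r = pearson_r(route4_km, variances)
    slope = ols_slope(route4_km, variances)
    print(f"{label}: r = {r:.3f}, slope = {slope:.3e} per km")
```

Repeating the calculation with the route 1 to route 3 columns of Table 4 allows the four hypothetical routes to be compared on the same footing.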

The decrease of phenotypic diversity observed with increasing distance from sub-Saharan
Africa shown in Figure 2 may reflect a rapid expansion with serial bottlenecks of
ancient populations originating in Africa (Relethford and Harpending, 1994; Manica et al.,
2005, 2007; Ramachandran et al., 2005; Cramon-Taubadel and Lycett, 2008). If so, isolation by
distance should have a significant effect on average patterns of phenotypic similarity.
Against this background, I extend the analysis of the relationships between patterns of
phenotypic variation and geographic distance to the isolation-by-distance model.


Figure 2. Relationships between intra-regional variations given in Table 4 and distance from East
Africa using the dispersal routes shown in Figure 1.
Geographic distances between every pair of regional samples, measured as great circle
distances along the possible dispersal routes through landmasses shown in Figure 1d, are
given in Table 5. Tables 6 and 7 present the R-matrices between pairs of samples based on the
34 craniometric measurements for the male and female series, respectively. Figure 3 shows the
relationship between $r_{ij}$ and geographic distance in the male and female samples. For each
dataset, the isolation-by-distance model given in equation [1] was fitted by nonlinear
regression (exponential approximation) (Relethford, 2004a). The two datasets show
roughly the expected decline in biological similarity with geographic distance.
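
A nonlinear fit of this kind can be sketched as follows. The example is a simplified stand-in rather than the procedure used in this study: it assumes Fst and r_min are held fixed and only the decay rate b is estimated, it uses a coarse-to-fine grid search in place of a dedicated regression routine, and it is run on synthetic data generated from the model itself. All names and numerical values are hypothetical.

```python
import math

def ibd_model(d_km, fst, r_min, b):
    """Equation [1]: expected similarity at geographic distance d_km."""
    return (fst - r_min) * math.exp(-b * d_km) + r_min

def sum_sq_error(b, distances, r_values, fst, r_min):
    """Residual sum of squares of the model for a trial decay rate b."""
    return sum((r - ibd_model(d, fst, r_min, b)) ** 2
               for d, r in zip(distances, r_values))

def fit_decay_rate(distances, r_values, fst, r_min,
                   lo=1e-6, hi=1e-2, rounds=6, steps=50):
    """Coarse-to-fine grid search for the b that minimizes the squared error."""
    best = lo
    for _ in range(rounds):
        width = (hi - lo) / steps
        grid = [lo + k * width for k in range(steps + 1)]
        best = min(grid, key=lambda b: sum_sq_error(b, distances, r_values, fst, r_min))
        # Narrow the search interval to one grid step either side of the best value.
        lo, hi = max(best - width, 0.0), best + width
    return best

# Synthetic check: data generated from the model with a known decay rate.
FST, R_MIN, TRUE_B = 0.19, -0.10, 0.0002  # illustrative values, not study estimates
distances = [0, 2000, 5000, 8000, 12000, 16000, 20000, 25000]
r_values = [ibd_model(d, FST, R_MIN, TRUE_B) for d in distances]

b_hat = fit_decay_rate(distances, r_values, FST, R_MIN)
print(f"recovered b = {b_hat:.6f} per km")
```

With real data the off-diagonal elements of Tables 6 and 7 would be paired with the corresponding distances in Table 5, and the residuals around the fitted curve would indicate which population pairs depart from the model.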

Figure 3. Pattern of isolation by distance based on craniometric data. Solid line indicates the fit of
nonlinear regression model (exponential approximation).

Table 5. Geographic distances (km) between every pair of samples measured along the routes shown in Figure 1d

                                   1         2         3         4         5         6         7         8         9        10        11        12        13        14
 1  Sub-Saharan Africa          0.00
 2  North Africa             8565.55      0.00
 3  West Asia                5848.91   3665.89      0.00
 4  Europe                   8370.26   5408.69   3720.00      0.00
 5  South Asia               6792.99   7666.62   4949.98   7471.33      0.00
 6  East/Northeast Asia     10409.70  11639.97   8923.33  11444.68   8102.88      0.00
 7  Southeast Asia           9978.16  10922.33   8205.69  10727.04   6845.23   4567.45      0.00
 8  Australia               14879.29  15823.46  13106.82  15628.17  11746.36  17953.71   6574.69      0.00
 9  Melanesia               16007.81  16951.98  14235.34  16756.69  12874.88  19082.23   7703.21   4159.91      0.00
10  Micronesia              13882.78  14826.95  12110.31  14631.66  10749.85   8472.07   5578.18  10479.31  11607.83      0.00
11  Polynesia               20685.09  21629.26  18912.62  21433.97  17552.16  15274.38  12380.49   8837.19   7948.92   8519.77      0.00
12  Arctic                  15479.96  16424.13  13707.49  16228.84  12887.04   7030.17  11160.02  16061.15  17189.67  15064.64  21866.95      0.00
13  North America           18918.85  19863.02  17146.38  19667.73  16325.93  10469.06  14598.91  19500.04  20628.56  18503.53  25305.84   5148.84      0.00
14  Central/South America   25268.32  26212.49  23495.85  26017.20  22675.40  16818.53  20948.38  25849.51  26978.03  24853.00  31655.31  11498.31   6632.03      0.00

Table 6. R-matrix calculated based on 34 craniofacial measurements for male samples

                                  1        2        3        4        5        6        7        8        9       10       11       12       13       14
 1  Sub-Saharan Africa       0.2146
 2  North Africa            -0.0666   0.2024
 3  West Asia               -0.0007   0.1383   0.1413
 4  Europe                  -0.0479   0.0958   0.1336   0.2881
 5  South Asia               0.1048   0.1837   0.0963   0.0186   0.3821
 6  East/Northeast Asia     -0.1185  -0.1055  -0.0847   0.0013  -0.1553   0.2436
 7  Southeast Asia           0.0149   0.0013  -0.0426  -0.0277   0.1076   0.0582   0.1544
 8  Australia                0.1685   0.0161   0.0015  -0.1094   0.0553  -0.2137  -0.0890   0.3785
 9  Melanesia                0.0954   0.0285  -0.0055  -0.1133   0.1044  -0.1395  -0.0145   0.1998   0.1805
10  Micronesia              -0.0278  -0.0754  -0.0687  -0.0790  -0.1069   0.0523   0.0041  -0.0261  -0.0132   0.1077
11  Polynesia               -0.0699  -0.1011  -0.0643  -0.0853  -0.1836   0.0746  -0.0760  -0.0243  -0.0263   0.0863   0.2143
12  Arctic                  -0.1762  -0.2565  -0.1621  -0.0767  -0.3515   0.2524  -0.0699  -0.1776  -0.1583   0.1066   0.1955   0.5571
13  North America           -0.1017  -0.1316  -0.0571  -0.0165  -0.1979   0.0736  -0.0473  -0.0706  -0.0826   0.0417   0.0783   0.2227   0.1862
14  Central/South America   -0.1157  -0.0626  -0.0192  -0.0181  -0.0575   0.0610   0.0265  -0.1089  -0.0554  -0.0017  -0.0183   0.0945   0.1027   0.1364

Table 7. R-matrix calculated based on 34 craniofacial measurements for female samples

                                  1        2        3        4        5        6        7        8        9       10       11       12       13       14
 1  Sub-Saharan Africa       0.2181
 2  North Africa            -0.0163   0.1808
 3  West Asia               -0.0435   0.1417   0.1836
 4  Europe                  -0.0720   0.1087   0.1684   0.3259
 5  South Asia               0.0154   0.0990   0.1138   0.0784   0.2789
 6  East/Northeast Asia     -0.0695  -0.0642  -0.0997  -0.0497  -0.1199   0.1841
 7  Southeast Asia          -0.0182  -0.0124  -0.0139   0.0334   0.1018   0.0236   0.2150
 8  Australia                0.1381   0.0075   0.0085  -0.1109   0.0321  -0.1475  -0.1209   0.3741
 9  Melanesia                0.0775   0.0006  -0.0098  -0.1106   0.0417  -0.0800  -0.0315   0.1875   0.1607
10  Micronesia               0.0001  -0.0571  -0.0955  -0.1463  -0.0427   0.0697   0.0138  -0.0172   0.0351   0.1483
11  Polynesia               -0.0039  -0.0644  -0.1194  -0.1636  -0.1483   0.0739  -0.1058   0.0650   0.0175   0.0536   0.2842
12  Arctic                  -0.0659  -0.2045  -0.1820  -0.1034  -0.3004   0.1907  -0.1340  -0.1575  -0.1484   0.0700   0.1577   0.5987
13  North America           -0.0399  -0.0989  -0.0703  -0.0393  -0.1592   0.0572  -0.0485  -0.1061  -0.0653   0.0071   0.0439   0.2617   0.1961
14  Central/South America   -0.1200  -0.0205   0.0182   0.0811   0.0093   0.0313   0.0977  -0.1529  -0.0751  -0.0388  -0.0903   0.0171   0.0614   0.1816

Discussion

The results confirm the relatively small Fst values, the largest intra-regional
variance in sub-Saharan Africa, and the negative correlation between the phenotypic variances
of non-African populations and their geographic distance from East Africa. These findings are,
as a whole, concordant with those obtained by genetic studies, indicating that global patterns
of craniofacial variation are similar to those expected under a neutral model of genetic
drift balanced by gene flow. If this is true, as seems likely (Relethford, 2004a, 2004b;
Manica et al., 2007; Cramon-Taubadel and Lycett, 2008), the present patterns of human
craniometric diversity may be a good predictor for inferring possible colonization routes from
East Africa to major geographic regions (Relethford, 2001, 2004a).
Our current understanding is that modern humans evolved c. 150,000-180,000 years ago in
eastern Africa (reviewed by Excoffier, 2002). A small group of individuals subsequently
migrated out of eastern Africa, and their descendants eventually expanded into most of
today's populations (Kingdon, 1993; Goldstein et al., 1995; Hammer et al., 1998; Klein, 1999;
Thomson et al., 2000; Ke et al., 2001; Zhivotovsky et al., 2003; Mellars, 2006; Fagundes et
al., 2007; Scholz et al., 2007). However, the number of dispersals from Africa and the routes
taken remain unresolved. Was there only one major dispersal? Or were there multiple
dispersals at different times? Which route did the first Eurasians take out of Africa: the
Levant corridor, the Horn of Africa, or both?
The classic interpretation perhaps favors the route along the Nile and across the Sinai
peninsula leading into the Levant region (Jones et al., 1992; Lewin, 1993; Cavalli-Sforza et
al., 1994; Quintana-Murci et al., 1999; Manni et al., 2002; Salas et al., 2002). More recently,
Luis et al. (2004) emphasized, on the basis of Y-chromosome analysis, that migratory movements
between Africa and Eurasia occurred mainly across the Levant corridor. However, as pointed out
by Forster and Matsumura (2005), if that were so, why was adjacent Europe settled thousands
of years later than distant Australia? In fact, after an early dispersal during the Riss-Würm
interglacial through the Levant corridor, as indicated by the 90,000-year-old Skhul and Qafzeh
fossils found in Israel, no further movements of anatomically modern humans within the
Middle East and into Eurasia through the Middle East took place (Lahr, 1996). In addition,
the dates of colonization of Southeast Asia through South Asia and finally Australia fall
within the time span (the Würm glacial period) in which the Levant corridor was closed because
of the expansion of the Sahara (Lahr, 1996).
The results of mtDNA analyses, together with the archaeological evidence, suggest the
importance of the Horn of Africa in the human expansion (Lahr, 1996; Stringer, 2000;
Oppenheimer, 2003; Forster and Matsumura, 2005; Macaulay et al., 2005; Thangaraj et al.,
2005). However, proponents of the northern route hypothesis have pointed out that use of
the southern route, the Horn of Africa route, may have been restricted to intervals of low sea
levels and mild monsoonal conditions (Underhill et al., 2001; Luis et al., 2004). Although the
Bab-el-Mandeb Strait never dried up during the last glacial sea-level changes, its width
fluctuated markedly (Lahr, 1996).
Palaeoanthropological evidence suggests both the Levant (northern route) and the Horn
of Africa (southern route) as possible migratory corridors between Africa and Eurasia (Lahr
and Foley, 1994; Lahr, 1996). This multiple dispersal hypothesis indicates that West Asia and
Europe were colonized by lineages traced back to populations restricted to North
Africa during and soon after the last interglacial, around 45,000 years B.P. (Lahr, 1996;
Underhill et al., 2001). Recent mtDNA analysis, on the other hand, points to the southern
region of the Zagros Mountains, part of the Fertile Crescent, as a core homeland for the early
modern Europeans and North Africans. Moreover, the route of entry into Europe and North
Africa was most likely via the Levant region (Oppenheimer, 2003).
The recent African origin hypothesis implies a rapid expansion with serial bottlenecks,
leading to the decrease of genetic and phenotypic diversity observed with increasing distance
from sub-Saharan Africa along possible colonization routes (Relethford, 2004a, 2004b;
Serre and Pääbo, 2004; Manica et al., 2005, 2007; Prugnolle et al., 2005; Liu et al., 2006; Li
et al., 2008). If so, the results of the linear regression analysis shown in Figure 2 indicate
that modern human craniometric variation patterns fit a model of iterative founder effects
along the colonization route from an African origin shown in Figure 1d. This may allow us to
suggest the importance of a southeastern dispersal route for the emigration out of Africa and
subsequent expansion along the shorelines of Arabia towards Southeast Asia and eventually
Australia on the one hand, and into Europe and North Africa through the Levant region on
the other.
Another important issue is the process of settlement of the East/Northeast Asian
region. The classic view held that the human occupation of East/Northeast Asia
resulted from the expansion of the late Pleistocene Southeast Asian (Sundaland) people
(Turner, 1987, 1990; Ballinger et al., 1992; Scott and Turner, 1997; Disotell, 1999). This
Southeast Asian model of East/Northeast Asian origins is supported by recent mtDNA
analysis (Li and Su, 2000; Oppenheimer, 2003; Shi et al., 2005). However, several sets of
Y-chromosome data shed new light on this issue, suggesting that the populations of
East/Northeast Asia are largely composed of descendants of central and western Asian
populations, rather than Southeast Asian peoples (Underhill et al., 2001; Wells, 2002;
Uinuk-Ool et al., 2003; Hill et al., 2006, 2007).
The present findings shown in Figure 2d may be compatible with those suggested by
recent Y-chromosome analyses. At the same time, however, the results of this study
indicate relatively large intra-regional variation in the East/Northeast Asian region in both
the male and female analyses (Table 4), suggesting a long-term population history and/or
long-range gene flow from outside sources (Relethford and Harpending, 1994). Such findings,
together with previous dental and cranial morphological and genetic studies, may be
consistent with the multiple migration hypothesis for the peopling of the East/Northeast Asian
region (Underhill et al., 2001; Hanihara, 2006, 2008; Hanihara and Ishida, 2009).
Recent studies show that isolation by distance has a primary effect on average patterns of
not only genetic but also craniometric affinities on a global level (Cavalli-Sforza et al., 1994;
Howells, 1995; Eller, 1999; Relethford, 2004a, 2004b; Ramachandran et al., 2005). In the
present study, an increase in dissimilarity between populations with geographic distance is
evident on a global scale, as shown in Figure 3. This suggests that the possible migration and
colonization route indicated in Figure 2d is supported by the fit of the isolation-by-distance
model to global patterns of craniometric variation. However, the correlations between
geographic distances and pairwise biological distances are not particularly high, and some
populations do not strictly fit the model. The deviations from the fitted line may be
explicable by admixture, extreme isolation, population-specific selection, environmental
influence, or more likely some combination of such factors (Relethford, 2004a, 2004b;
Ramachandran et al., 2005; Templeton, 2007). In any case, further research, particularly
focusing on fluctuation of morphological traits, should bring us closer to understanding the
complexity of the migration and adaptive processes that have shaped human diversity.

Acknowledgments

I wish to express my sincere thanks to T. Molleson, R. Kruszynski, L.T. Humphrey, and
C. Stringer of the Natural History Museum, London; R. Foley, M.M. Lahr, and M. Bellatti of
the Department of Biological Anthropology, University of Cambridge; A. Langaney and
M.A. Pereira da Silva of the Laboratoire d'Anthropologie Biologique, Musée de l'Homme, Paris;
D. Hunt, D. Owsley, S. Ousley, R. Potts, M. London, and D.H. Ubelaker of the Department
of Anthropology, National Museum of Natural History, Washington D.C.; I. Tattersall, K.
Mowbray, and G. Sawyer of the Department of Anthropology, American Museum of Natural
History, New York; G. Feinman, B. Bronson, and W.J. Pestle of the Department of
Anthropology, Field Museum, Chicago; J. Specht, P. Gordon, L. Bonshek, and N. Goodsell
of the Department of Anthropology, Australian Museum, Sydney; J. Stone and D. Donlon of
the Department of Anatomy and Histology, University of Sydney; D. Henley of the New
South Wales Aboriginal Land Council, Sydney; M. Chow, a dentist in Sydney; M. Hanihara
of the School of Languages, Macquarie University, Sydney; C. Pardoe and G.L. Pretty of the
Department of Anthropology, South Australian Museum, Adelaide; G. Suwa of the
Department of Anthropology, University Museum of the University of Tokyo; for their kind
permission to study the materials under their care.
This study was supported in part by Grant-in-Aid for Scientific Research (No. 18570220)
from the Ministry of Education, Science and Culture in Japan; a Japan Fellowship for
Research in United Kingdom from the Japan Society for the Promotion of Science; and
Smithsonian Opportunities for Research and Study: Smithsonian Institution Fellowship
Program (Senior Fellow in 2001-2002).

References

Ayub Q, Mansoor A, Ismail M, Khaliq S, Mohyuddin A, Hameed A, Mazhar K, Rehman S,
Siddiqi S, Papaioannou M, Piazza A, Cavalli-Sforza LL, Mehdi SQ. 2003.
Reconstruction of human evolutionary tree using polymorphic autosomal microsatellites.
Am. J. Phys. Anthropol. 122:259-268.
Ballinger SW, Schurr TG, Torroni A, Gan YY, Hodge JA, Hassan K, Chen KH, Wallace DC.
1992. Southeast Asian mitochondrial DNA analysis reveals genetic continuity of ancient
Mongoloid migrations. Genet. 130:139-152.
Bosch E, Calafell F, Comas D, Oefner PJ, Underhill PA, Bertranpetit J. 2001. High-
resolution analysis of human Y-chromosomes variation shows a sharp discontinuity and
limited gene flow between northwestern Africa and the Iberian peninsula. Am. J. Hum.
Genet. 68:1019-1029.
Bräuer G. 1988. Osteometrie: a Kraniometrie. In: Knußmann R, editor. Anthropologie:
Handbuch der Vergleichenden Biologie des Menschen, Band I. Stuttgart: Gustav Fischer.
p 160-192.
Cavalli-Sforza LL, Menozzi P, Piazza A. 1994. The history and geography of human genes.
Princeton: Princeton University Press.
Chandrasekar A, Saheb SY, Gangopadyaya P, Gangopadyaya S, Mukherjee A, Basu D,
Lakshmi GR, Sahani AK, Das B, Battacharya S, Kumar S, Xaviour D, Sun D, Rao VR.
2007. YAP insertion signature in South Asia. Ann. Hum. Biol. 34:582-586.
von Cramon-Taubadel N, Lycett SJ. 2008. Human cranial variation fits iterative founder
effect model with African origin. Am. J. Phys. Anthropol. 136:108-113.
Devor EJ. 1987. Transmission of human craniofacial dimensions. J. Craniofacial Genet. Dev.
Biol. 7:95-106.
Disotell TD. 1999. Human evolution: the southern route to Asia. Curr. Biol. 9: R925-R928.
Donnelly SM, Konigsberg LW. 1998. Interpretation of population structure when group
structure is unknown. Am. J. Phys. Anthropol. Suppl. 26:106.
Eller E. 1999. Population substructure and isolation by distance in three continental regions.
Am. J. Phys. Anthropol. 108:147-15.
Excoffier L. 2002. Human demographic history: refining the recent African origin model.
Curr. Opinion Genet. Develop. 12:675-682.
Fagundes NJR, Ray N, Beaumont M, Neuenschwander S, Salzano FM, Bonatto SL, Excoffier
L. 2007. Statistical evaluation of alternative models of human evolution. Proc. Natl.
Acad. Sci. USA 104:17614-17619.
Forster P, Matsumura S. 2005. Did early humans go north or south? Science 308:965-966.
Goldstein DB, Linares AR, Cavalli-Sforza LL, Feldman MW. 1995. Genetic absolute dating
based on microsatellite and the origin of modern humans. Proc. Natl. Acad. Sci. USA
92:6723-6727.
Hammer MF, Karafet T, Rasanayagam A, Wood ET, Altheide TK, Jenkins T, Griffiths RC,
Templeton AR, Zegura SL. 1998. Out of Africa and back again: nested cladistic analysis
of human Y chromosome variation. Mol. Biol. Evol. 15: 427-441.
Hanihara T. 2006. Interpretation of craniofacial variation and diversification of East and
Southeast Asians. In: Oxenham M, Tayles N. editors. Bioarchaeology of Southeast Asia.
Cambridge: Cambridge University Press. p 91-111.
Hanihara T. 2008. Morphological variation of major human populations based on nonmetric
dental traits. Am. J. Phys. Anthropol. 136:169-182.
Hanihara T, Ishida H. 2005. Metric dental variation of major human populations in the world.
Am. J. Phys. Anthropol. 121:241-251.
Hanihara T, Ishida H. 2009. Regional differences in craniofacial diversity and population
history of Jomon Japan. Am. J. Phys. Anthropol. 139:278-289.
Harpending H, Rogers A. 2000. Genetic perspectives of human origins and differentiation.
Annu. Rev. Genomics Hum. Genet. 1:361-85.
Harpending HC, Sherry ST, Rogers AR, Stoneking M. 1993. Genetic structure of ancient
human populations. Curr. Anthropol. 34:483-496.
Hill C, Soares P, Mormina M, Macaulay V, Meehan W, Blackburn J, Clarke D, Raja JM,
Ismail P, Bulbeck D, Oppenheimer S, Richards M. 2006. Phylogeography and
ethnogenesis of aboriginal Southeast Asia. Mol. Biol. Evol. 23:2480-2491.
Hill C, Soares P, Mormina M, Macaulay V, Clarke D, Blumbach PB, Vizuete-Forster M,
Forster P, Bulbeck D, Oppenheimer S, Richards M. 2007. A mitochondrial stratigraphy
for island Southeast Asia. Am. J. Hum. Genet. 80:29-43.
Howells WW. 1973. Cranial variation in man: a study by multivariate analysis of patterns of
difference among recent human populations. Papers of the Peabody Museum of
Archaeology and Ethnology 67, Cambridge, MA: Harvard University.
Howells WW. 1989. Skull shapes and the map: craniometric analyses in the dispersion of
modern Homo. Papers of the Peabody Museum of Archaeology and Ethnology 79,
Cambridge, MA: Harvard University.
Howells WW. 1995. Who's Who in Skulls: Ethnic Identification of Crania from
Measurements. Papers of the Peabody Museum of Archaeology and Ethnology 82,
Cambridge, MA: Harvard University.
Hudjashov G, Kivisild T, Underhill PA, Endicott P, Sanchez JJ, Lin AA, Shen P, Oefner P,
Renfrew C, Villems R, Forster P. 2007. Revealing the prehistoric settlement of Australia
by Y chromosome and mtDNA analysis. Proc. Natl. Acad. Sci. USA 104:8726-8730.
Irish JD. 2005. Population continuity vs. discontinuity revisited: dental affinities among late
Paleolithic through Christian-era Nubians. Am. J. Phys. Anthropol. 128:520-535.
Irish JD. 2006. Who were the ancient Egyptians? Dental affinities among Neolithic through
postdynastic peoples. Am. J. Phys. Anthropol. 129:529-543.
Jones S, Martin R, Pilbeam D. 1992. The Cambridge encyclopedia of human evolution.
Cambridge: Cambridge University Press.
Jorde LB, Rogers AR, Bamshad M, Watkins WS, Krakowiak PA, Sung S, Kere J,
Harpending HC. 1997. Microsatellite diversity and the demographic history of modern
humans. Proc. Natl. Acad. Sci. USA 94:3100-3103.
Jorde LB, Watkins WS, Bamshad MJ, Dixon ME, Ricker CE, Seielstad MT, Batzer MA.
2000. The distribution of human genetic diversity: a comparison of mitochondrial,
autosomal, and Y-chromosome data. Am. J. Hum. Genet. 66:979-988.
Ke Y, Su B, Song X, Lu D, Chen L, Li H, Qi C, Marzuki S, Deka R, Underhill P, Xiao C,
Shriver M, Lell J, Wallace D, Wells RS, Seielstad M, Oefner P, Zhu D, Jin J, Huang W,
Chakraborty R, Chen Z, Jin L. 2001. African origin of modern humans in East Asia: a
tale of 12,000 Y-chromosomes. Science 292:1151-1153.
Kingdon J. 1993. Self-made man and his undoing. London: Simon and Schuster.
Kivisild T, Reidla M, Metspalu E, Rosa A, Brehm A, Pennarun E, Parik J, Geberhiwot T,
Usanga E, Willem R. 2004. Ethiopian mitochondrial DNA heritage: tracking gene flow
across and around the gate of tears. Am. J. Hum. Genet. 75:752-770.
Klein R. 1999. The human career. Chicago: University of Chicago Press.
Lahr MM. 1996. The evolution of modern human diversity: a study of cranial variation.
Cambridge: Cambridge Univ Press.
Lahr MM, Foley RA. 1994. Multiple dispersals and modern human origins. Evol. Anthropol.
3:48-60.
Lewin R. 1993. The origin of modern humans. New York: Scientific American Library.
Li J, Su B. 2000. Natives or immigrants: modern human origin in East Asia. Nature Rev.
Genet. 1:126-133.
Li JZ, Absher DM, Tang H, Southwick AM, Casto AM, Ramachandran S, Cann HM, Barsh
GS, Feldman M, Cavalli-Sforza LL, Myers RM. 2008. Worldwide human relationships
inferred from genome-wide patterns of variation. Science 319:1100-1104.
Liu H, Prugnolle F, Manica A, Balloux F. 2006. A geographically explicit genetic model of
worldwide human-settlement history. Am. J. Hum. Genet. 79:230-237.
Luis JR, Rowold DJ, Regueiro M, Caeiro B, Cinniolu C, Roseman C, Underhill PA,
Cavalli-Sforza LL, Herrera RJ. 2004. The Levant versus the Horn of Africa: evidence for
bidirectional corridors of human migrations. Am. J. Hum. Genet. 74:532-544.
Lynch M, Hill WG. 1986. Phenotypic evolution by neutral mutation. Evolution (Lawrence,
Kans) 40:915-935.
Macaulay V, Hill C, Achilli A, Rengo C, Clarke D, Meehan W, Blackburn J, Semino O,
Scozzari R, Cruciani F, Taha A, Shaari NK, Raja JM, Ismail P, Zainuddin Z, Goodwin
W, Bulbeck D, Bandelt H-J, Oppenheimer S, Torroni A, Richiards M. 2005. Single,
rapid coastal settlement of Asia revealed by analysis of complete mitochondrial genomes.
Science 308:1034-1036.
Manica A, Prugnolle F, Balloux F. 2005. Geography is a better determinant of human genetic
differentiation than ethnicity. Hum. Genet. 118:366-371.
Manica A, Amos W, Balloux F, Hanihara T. 2007. The effect of ancient population
bottlenecks on human phenotypic variation. Nature 448:346-349.
Manni F, Leonardi P, Barakat A, Rouba H, Heyer E, Klintschar M, McElreavey K, Quintana-
Murchi L. 2002. Y-chromosome analysis in Egypt suggests a genetic regional continuity
in northeastern Africa. Hum. Biol. 74:645-658.
Martin R, Saller K. 1957. Lehrbuch der Anthropologie, Vol. 1. Stuttgart: Gustav Fischer
Verlag.
Mellars P. 2006. Why did modern human populations disperse from Africa ca. 60,000 hears
ago? A new model. Proc. Natl. Acad. Sci. USA 103:9381-9386.
Oppenheimer S. 2003. Out of Eden: the peopling of the world. London: Constable and
Robinson Ltd.
Prugnolle F, Manica A, Balloux F. 2005. Geography predicts neutral genetic diversity of
human populations. Curr. Biol. 15:159-160.
Quintana-Murci L, Semino O, Bandelt H-J, Passarino G, McElreavey K, Santachiara-
Benerecetti S. 1999. Genetic evidence of an early exit of Homo sapiens sapiens from
Africa through eastern Africa. Nature Genet. 23:437-441.
Ramachandran S, Deshpande O, Roseman CC, Rosenberg NA, Feldman MW, Cavalli-Sforza
LL. 2005. Support from the relationship of genetic and geographic distance in human
populations for a serial founder effect originating in Africa. Proc. Natl. Acad. Sci. USA
102:15942-15947.
Relethford JH. 1994. Craniometric variation among modern human populations. Am. J. Phys.
Anthropol. 95:53-62.
Relethford JH. 1996. Genetic drift can obscure population history: problem and solution.
Hum. Biol. 68:29-44.
Relethford JH. 2001. Global analysis of regional differences in craniometric diversity and
population substructure. Hum. Biol. 73:629-636.
Relethford JH. 2002. Apportionment of global human genetic diversity based on
craniometrics and skin color. Am. J. Phys. Anthropol. 118:393-398.
Relethford JH. 2004a. Global patterns of isolation by distance based on genetic and
morphological data. Hum. Biol. 76:499-513.
Relethford JH. 2004b. Boas and beyond: migration and craniometric variation. Am. J. Hum.
Biol. 16:379-386.
Relethford JH, Blangero J.1990. Detection of differential gene flow from patterns of
quantitative variation. Hum. Biol. 62:5-25.
Relethford,JH, Harpending HC. 1994. Craniometric variation, genetic theory, and modern
human origins. Am. J. Phys. Anthropol. 95:249-270.
Relethford JH, Jorde LB. 1999. Genetic evidence for larger African population size during
recent human evolution. Am. J. Phys. Anthropol. 108:251-260.
Roseman CC. Weaver TD. 2004. Multivariate apportionment of global human craniometric
diversity. Am. J. Phys. Anthropol. 125:257-263.
Salas A, Richards M, De la fe T, Lareu MV, Sobrino B, Sanchez-Diz P, Macaulay V,
Carracedo A. 2002. The making of the African mtDNA landscape. Am. J. Hum. Genet.
71:1082-1111.
Scholz CA, Johnson TC, Cohen AS, King JW, Peck JA, Overpeck JT, Talbot MR, Brown E,
Kalindekafe L, Amoako PYO, Lyons RP, Shanahan TM, Castaneda IS, Heil CW,
Forman SL, McHargue LR, Beuning KR, Gomez J, Pierson J. 2007. East African
megadroughts between 135 and 75 thousand hears ago and bearing on early-modern
human origins. Proc. Natl. Acad. Sci. USA 104:16416-16421.
Scott GR, Turner CG II. 1997. The Anthropology of Modern Human Teeth: Dental
Morphology and its Variation in Recent Human Populations. Cambridge: Cambridge
University Press.
Serre D, Pbo S. 2004. Evidence for gradients of human genetic diversity within and among
continents. Genome Res. 14:1679-1685.
Shi H, Yong-li D, Bo W, Chun-jie X, Underhill PA, Pei-dong S, Chakraborty R, Li J, Bing S.
2005. Y-chromosome evidence of southern origin of the East Asian-specific haplogroup
O3-M112. Am. J. Hum. Genet. 77:408-419.
Stringer, C., 2000. Coasting out of Africa. Nature 405:24-26.
Thangaraj K, Chaubey G, Kivisild T, Reddy AG, Singh VK, Rasalkar AA, Singh L. 2005.
Reconstructing the origin of Andaman islanders. Science 308:996.
Thomson R, Pritchard JK, Shen P, Oefner PJ, Feldman MW. 2000. Recent common ancestry
of human Y chromosomes: evidence from DNA sequence data. Proc. Natl. Acad. Sci.
USA 97:7360-7365.
Turner CG II. 1987. Late Pleistocene and Holocene population history of East Asia based on
dental variation. Am. J. Phys. Anthropol. 73:305-321.
Turner CG II. 1990. Major features of sundadonty and sinodonty, including suggestions
about East Asian microevolution, population history, and late Pleistocene relationships
with Australian Aboriginals. Am. J. Phys. Anthropol. 82:295-317.
Uinuk-Ool TS, Takezaki N, Klein J. 2003. Ancestry and kinships of native Siberian
populations: the HLA evidence. Evol. Anthropol. 12:231-245.
Underhill PA, Shen P, Lin AA, Li J, Passarino G, Wei HY, Kauffman E, Bonn-Tamir B,
Bertranpetit J, Francalacci P, Ibrahim M, Jenkins T, Kidd JR, Mehdi SQ, Seielstad MT,
Wells RS, Piazza A, Davis RW, Feldman MW, Cavalli-Sforza LL, Oefner PJ. 2000. Y
chromosome sequence variation and the history of human populations. Nature Genet.
26:358-361.
Underhill PA, Passarino G, Lin AA, Shen P, Lahr MM, Foley RA, Oefner PJ, Cavalli-Sforza
LL. 2001. The phylogeography of Y chromosome binary haplotypes and the origins of
modern human populations. Ann. Hum. Genet. 65:43-62.
Wells S. 2002. The journey of man: a genetic odyssey. Princeton: Princeton University Press.
Yuehai K, Su B, Xiufeng S, Daru L, Lifeng C, Hongyu L, Chunjian Q, Marzuki S, Deka R,
Underhill P, Chunjie X, Shriver M, Lell J, Wallace D, Wells RS, Seielstad M, Oefner P,
Dingliang Z, Jianzhong J, Wei H, Chakraborty R, Zhu C, Li J. 2001. African origin of
modern humans in East Africa: a tale of 12,000 Y chromosome. Science 292:1151-1153.
Zhivotovsky LA, Rosenberg NA, Feldman MW. 2003. Features of evolution and expansion
of modern humans, inferred from genomewide microsatellite markers. Am. J. Hum.
Genet. 72:1171-1186.


Chapter 12

Intra-Specific Genetic Variation in
Mosses: A Novel Approach to Detect
Environmental Changes

Valeria Spagnuolo, Stefano Terracciano and Simonetta Giordano
Dipartimento di Biologia Strutturale e Funzionale
Università degli Studi di Napoli Federico II

Abstract

Intra-specific genetic variation is considered an important factor for evaluating
biodiversity; indeed, the higher the genetic variation within a species, the greater its
ability to survive. The loss of suitable habitats for moss species involves demographic
decreases and genetic impoverishment. Mosses have a short generation time compared to
phanerogamic vegetation, particularly trees, and may therefore exhibit all these effects
earlier, predicting the destiny of higher plant communities and the ongoing changes in
natural landscapes. Indeed, intra-specific genetic variation in moss species may represent
an ideal model system for investigating species fitness in response to natural and
man-driven environmental changes, both at a local level and at a large scale. At a local
level these studies provide useful information for territory management, since they
promptly signal local environmental changes, whereas over a large scale they highlight
historical processes that have affected taxon origin, distribution and radiation in relation
to the main geological events. Genetic variation and structure within moss species are
influenced by reproductive strategy and dispersal, giving information about gene exchange,
the occurrence of sexual reproduction and selfing/outcrossing rates. Demographic
constraints, and especially ongoing demographic fluctuations, also contribute to shaping
population genetic diversity and structure, revealing phenomena such as the relative
importance of the founder effect and the occurrence of bottlenecks and genetic drift. Moss
genetic variation may highlight environmental disturbance caused both by natural events
and by land use and human pressure. Among disturbances, habitat fragmentation is one of
the most studied, owing to the increasing loss of suitable habitats for moss species. In
general, it can be stated that intraspecific genetic variation in mosses reflects
environmental gradients, with a high amount of variation in natural environments versus a
low level of variation in threatened environments.
The rapid transformation of the environment into a network of patches due to habitat
fragmentation, together with increasing environmental disturbance, leads to genetic erosion
in isolated populations, with a consequent increase in extinction risk. Thus, intraspecific
genetic variation in mosses appears to be a suitable tracer of environmental disturbance,
owing to the global ubiquity and the fast generation time of these plants.

Introduction

Intra-specific genetic variation is assumed to be an important factor for evaluating
biodiversity; indeed, the higher the genetic variation within a species, the greater its
ability to survive. Environmental disturbance and pollution lead to a gradual shrinking of
plant populations up to local extinction, to a loss of sexual reproduction and to an
interruption of gene flow among populations. All these effects can be observed as a loss of
genetic variation, with a consequent intra-specific genetic impoverishment (Young et al.
1996). Genetic diversity within moss species also provides invaluable information about
reproductive strategies (Cronberg 2002; Eppley et al. 2007), gene flow among populations
(Hassel et al. 2005; McDaniel and Shaw 2005; Shaw et al. 1990), demographic constraints,
such as bottlenecks and founder effects, and all the main factors shaping genetic structure
(Shaw 1991, 2000). Indeed, a different pattern of genetic variation will be observed
depending on sexual or asexual reproduction, the occurrence of gene exchange among
conspecific populations, or genetic isolation due to habitat fragmentation. The latter is one
of the most frequent threats to natural plant communities, multiplying the edge effect and
favouring genetic drift, because isolated population fragments tend to be more genetically
distant (Gibbs 2001; Hylander 2005; Pharo et al. 2004; Pharo and Zartman 2007).
Moreover, mosses, due to their short generation time compared to phanerogamic
vegetation, and particularly trees, may exhibit all these effects earlier, predicting the destiny
of higher plant communities and therefore the ongoing changes of the natural landscapes
(Young et al. 1996).
On this basis, intra-specific genetic variability in mosses may represent a new approach to
the evaluation of biodiversity in more complex plant communities; therefore, these
molecular methods may be coupled with classic biomonitoring investigations (decrease in
cover/frequency, accumulation of trace elements and radionuclides) in integrated
experimental procedures.
Intra-specific genetic variation over a large geographic scale also highlights historical
processes that affect intercontinental genetic patterns in widespread plant species. The
genetic structure of cosmopolitan moss species may be linked to current knowledge of
geological changes. In addition, bryophytes, owing to their ability to colonize extreme
habitats, may prove particularly suitable for elucidating ongoing climatic changes, such as
the effect of the Antarctic ozone hole on somatic mutation rate (Clarke et al. 2008).
Intra-specific genetic variation in mosses may therefore represent an ideal model system
for investigating natural and man-driven environmental changes, both at a local level and at
a large scale. In the present paper we try to draw a picture of the main findings
reported in the literature about intraspecific genetic variation in mosses in relation to
environmental changes; in particular, we consider molecular biodiversity in relation to
reproductive strategies, demography, habitat fragmentation and human pressure.

Genetic Variation, Reproductive Strategies and
Diaspore Dispersal

Genetic variation and structure of moss populations are influenced by the occurrence of
sexual or asexual reproduction. Some authors (During 1979; Longton 1988a, 1992)
observed that fugitive and colonist ephemeral mosses invest a great amount of energy in
sexual reproduction, which generally involves the production of millions of very small,
resistant spores; by contrast, the effort put into the production of spores or specialized
vegetative propagules is relatively low in perennial stayers, which have a long life span and
show a growth pattern mainly dependent on lateral expansion, frequently observed in clonal
species (Cronberg 2002). Estimating the occurrence of sexual reproduction in mosses is
important, since most species abandon sexual reproduction in favour of vegetative growth
under adverse environmental conditions. This change in reproductive pattern involves a loss
of haplotypes, with consequent genetic erosion.
Multilocus linkage disequilibrium (Agapow and Burt 2001) gives an estimate of the
degree of association among loci, or of the amount of recombination due to meiosis, and
therefore an evaluation of the occurrence of sexual reproduction. Indeed, if sexually
produced spores are the main agent of dispersal, no linkage among loci is expected, whereas
a high linkage value is expected if asexual reproduction by fragments dominates (Hassel et
al. 2005).
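The contrast between recombining and clonal populations can be illustrated with a minimal sketch of a standardised index of association, written here following the commonly used definition associated with Agapow and Burt (2001). The function names and the two toy data sets are hypothetical, and the haplotypes are assumed to be haploid presence/absence profiles scored at the same loci.

```python
from itertools import combinations
from math import sqrt


def pairwise_locus_distances(haplotypes):
    """For each locus, list the 0/1 distances over all pairs of individuals."""
    n_loci = len(haplotypes[0])
    pairs = list(combinations(haplotypes, 2))
    return [[int(a[j] != b[j]) for a, b in pairs] for j in range(n_loci)]


def variance(values):
    """Variance over all pairs (divides by the number of pairs)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def rbar_d(haplotypes):
    """Standardised index of association for haploid multilocus data."""
    per_locus = pairwise_locus_distances(haplotypes)
    totals = [sum(ds) for ds in zip(*per_locus)]  # summed distance per pair
    v_obs = variance(totals)                      # observed variance
    locus_vars = [variance(ds) for ds in per_locus]
    v_exp = sum(locus_vars)                       # expected under no linkage
    denom = 2 * sum(sqrt(vj * vk) for vj, vk in combinations(locus_vars, 2))
    if denom == 0:
        raise ValueError("need at least two polymorphic loci")
    return (v_obs - v_exp) / denom


# Hypothetical data: two clones differing at every locus (strong linkage)
clonal = [(0, 0, 0), (0, 0, 0), (1, 1, 1), (1, 1, 1)]
# Hypothetical data: every combination of two loci present (free recombination)
recombined = [(0, 0), (0, 1), (1, 0), (1, 1)]

print(rbar_d(clonal), rbar_d(recombined))
```

In this toy case the clonal sample gives the maximum value of the index, while the fully recombined sample gives a value at or below zero; with real data, significance would normally be assessed by permutation, which is not shown here.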
Several moss species are known that do not develop sporophytes, especially in extreme
environments (Spagnuolo et al. 2007a), or that have physically disjunct male and female
gametophytes, as observed in Climacium americanum (Meagher and Shaw 1990) and in
Bryum argenteum, where male plants are restricted to protected areas of subantarctic regions,
whereas female plants are widespread (Longton 1988b). Although only some species develop
specialised propagules for asexual reproduction, all moss species are in theory able to create
new, independent shoots by gametophyte fragmentation. The vegetative growth pattern, genetic
variation being equal, can influence genetic structure within populations (Spagnuolo et al.
2009a). Indeed, vegetative growth can produce large and continuous genets if the moss
considered is a so-called phalanx species, whereas the high level of intermingling that
characterises guerrilla species produces fragmented clones of varying extent (Cronberg
et al. 2006).
Another important feature affecting genetic variation and clonal structure is the timing of
spore/propagule recruitment (Cronberg 2002). If recruitment takes place only at the early
stage of establishment, genetic variation tends to decrease over time, whereas if recruitment
takes place continuously, a given level of molecular differentiation can be maintained over
time. Studies of genetic variation in widespread species must also consider long-range
transport of spores; it is reported in the literature that most spores fall to the ground
within a radius of a few dozen centimetres from the capsule where they develop, but a small
fraction of spores can cover intercontinental distances before germinating. This phenomenon
plays an important role in shaping large-scale genetic structure in cosmopolitan moss
species, and it sometimes makes it possible to date intercontinental patterns in relation to
geological changes (Shaw et al. 1990; Spagnuolo et al. 2009a).

Genetic Variation and Breeding Systems

In plant species that reproduce sexually, breeding systems influence patterns of
genetic diversity and structure because they are responsible for gene transmission across
generations. Breeding systems range from the condition in which only a single sex function
is assigned to each individual to that in which both sex functions can be expressed in the
same individual (hermaphroditism), with the possibility of self-fertilization. Theoretical
population genetic models predict that selfing may be selected against, since it increases
levels of homozygosity in offspring, which in turn allows the expression of recessive,
deleterious alleles (Fisher 1949; Wright 1965; Nei et al. 1975; Charlesworth and
Charlesworth 1998); however, as explained below, some factors can favour selfing. Indeed,
both selfing and outcrossing are stable states, depending on levels of inbreeding depression
(Lande and Schemske 1985), and both complete selfing and complete outcrossing occur in
most species (Barrett and Eckert 1990). Outcrossing is selected for in historically large
species, whereas selfing is favoured in species or populations that have undergone severe
bottlenecks and that, once purged of deleterious alleles, can find in selfing a guarantee of
their survival. Therefore, the rates of selfing and outcrossing within a population can signal
an ongoing environmental change.
Whereas selfing in diploid organisms results in a 50% reduction in heterozygosity, in
haploid organisms selfing can occur in different ways and can result in more than a 50%
reduction in heterozygosity (Klekowski 1972). Indeed, mating between gametes from
different haploid individuals produced by the same diploid parent (intergametophytic
selfing) results in a 50% reduction in heterozygosity, which is equivalent to selfing in animals
and seed plants, producing offspring formed by haploid sibs. In contrast, mating between
gametes produced by the same haploid individual (intragametophytic selfing) (Klekowski
1972) leads to complete homozygosity in a single generation (Hedrick 1987). If bryophyte
colonies consist primarily of close haploid sibs and secondarily of more distant relatives from
previous generations, mating patterns probably involve several degrees of inbreeding;
therefore, true outcrossing (i.e. mating between unrelated gametophytes) will depend on the
extent to which populations are established and on their ability to exchange propagules from
more distant sources (Shaw 2000). Heterozygote frequency can be evaluated by Wright's
inbreeding coefficients or fixation indices (Wright 1965), which are defined as the correlation
between homologous alleles within individuals, with reference to a local population
($F_{IS}$) or the total population ($F_{IT}$), and describe the departure (deviation) from
Hardy-Weinberg genotypic frequencies:

$F_I = 1 - (h_{obs}/h_{exp})$,

where $h_{obs}$ and $h_{exp}$ are the heterozygosity observed and expected at a given locus.
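As a worked illustration of the fixation index defined above, the following sketch computes it for a single locus from diploid (sporophyte) genotypes. The function name and the genotype samples are hypothetical, and the expected heterozygosity is taken, as an assumption, to be the Hardy-Weinberg value of one minus the sum of squared allele frequencies, without small-sample correction.

```python
from collections import Counter


def fixation_index(genotypes):
    """F = 1 - h_obs / h_exp for one locus; genotypes are (allele, allele) pairs."""
    n = len(genotypes)
    # Observed heterozygosity: proportion of heterozygous individuals
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity from allele frequencies (Hardy-Weinberg)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    h_exp = 1 - sum((c / total) ** 2 for c in allele_counts.values())
    if h_exp == 0:
        raise ValueError("locus is monomorphic; F is undefined")
    return 1 - h_obs / h_exp


# Hypothetical sample in Hardy-Weinberg proportions (random mating)
random_mating = [("A", "A")] * 25 + [("A", "a")] * 50 + [("a", "a")] * 25
# Hypothetical sample with no heterozygotes (complete selfing)
complete_selfing = [("A", "A")] * 50 + [("a", "a")] * 50

print(fixation_index(random_mating), fixation_index(complete_selfing))
```

With these toy samples the index is 0 under random mating and 1 when heterozygotes are entirely absent, which brackets the range of positive values discussed in the text.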
In a study focussed on self-fertilization in mosses (Eppley et al. 2007), sporophyte
allozyme profiles were investigated in several monoicous and dioicous species. The
authors found extremely high inbreeding coefficients in monoicous species (0.62-0.98), and
much lower ones in dioicous species. Moss species generally exhibit low levels of
heterozygosity, particularly those with combined sexes, which in turn show a high frequency
of selfing. The authors also found a general heterozygosity deficiency. In mosses, both high
and low heterozygosity levels have been reported in different species by using allozymes:
Shaw (1991) found a value of $F_{IS}$ close to 1 in Funaria hygrometrica, whereas Innes
(1990) found very low values of $F_{IS}$ in the unisexual species Polytrichum juniperinum.
Although multilocus isozyme genotypes indicated the occurrence of sexual reproduction, the
analyses of sporophytes showed that matings were mostly between gametes from plants
growing in the same or a nearby population, supporting the hypothesis of reduced gene flow.
This result confirms that genetic differentiation due to sexual reproduction is affected by
several features, not always directly related to species-specific mating systems, but also
depending on population history.

Genetic Variation and Demography

Frequency and cover values are demographic indices commonly used for cryptogamic
plants in biomonitoring the degree of naturalness or alteration of most terrestrial ecosystems.
Frequency and cover can be assessed by using dedicated grids or plots and counting how
many times each species occurs (Dulière et al. 2000). Demographic parameters of bryophytes
are very important in the evaluation of environmental status, since bryophytes are major
components of forest ecosystems, where they can contribute greatly to the biomass of the
ground layer (Longton 1992). In addition, they are known as bioindicators of air quality
(Giordano et al. 2004; Sardans and Peñuelas 2005; Sim-Sim et al. 2000), and offer the
advantage of a short generation time (Young et al. 1996) compared to most seed plants. This
latter feature makes it possible to obtain substantial demographic information in a relatively
short time, to promptly detect demographic fluctuations and, subsequently, to carry out
continuous biomonitoring, which is valuable in forest management. Besides species richness
and species frequency and cover, the biodiversity of ecosystems may also be assessed by
using molecular markers. In particular, a high intraspecific genetic variation detected for a
given species in a certain habitat supports the idea of a high level of naturalness for the
investigated area, and vice versa. The observation of lower intraspecific genetic
differentiation may be the result of an ongoing extinction, but may also indicate a founder
effect (Spagnuolo et al. 2007b).
It is reported in the literature that, in theory, the larger the population size, the higher the
genetic variation (Ellstrand and Elam 1993) and, in particular, that the effects of genetic drift
depend on the number of generations for which a population remains small; they are therefore
displayed earlier in small populations of plants with a short generation time, such as
mosses (Young et al. 1996). Another important factor influencing genetic variation is
random versus non-random mating; indeed, kinship analyses show that matings mostly occur
between neighbouring gametophytes, creating genetically homogeneous spots within the
population and reducing genetic variation; in addition, the genetic variability observed in
established populations is generally lower than that found in spore progeny, owing to
mortality during the establishment phase (Innes 1990). If the investigated population
exchanges genes with outside populations, then the immigration of genetically different
spores or propagules could offset the loss of individuals and genets, restoring equilibrium
conditions.
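The dependence of drift on the number of generations spent at small size can be made concrete with a small sketch. It assumes an idealised, neutral haploid Wright-Fisher population of constant size with no mutation or immigration, in which expected gene diversity declines by a factor of (1 - 1/N) per generation; the population size, the time span and both generation times are hypothetical values chosen only for illustration.

```python
def expected_diversity(h0, n_haploid, generations):
    """Expected gene diversity after drift in an idealised haploid population."""
    return h0 * (1 - 1 / n_haploid) ** generations


def generations_elapsed(years, generation_time_years):
    """Number of whole generations that fit into an observation period."""
    return int(years / generation_time_years)


# Hypothetical scenario: 50 years of isolation at a size of 20 individuals
YEARS = 50
N = 20
H0 = 0.5

moss_generations = generations_elapsed(YEARS, 2)    # assumed 2-year generation
tree_generations = generations_elapsed(YEARS, 25)   # assumed 25-year generation

moss_diversity = expected_diversity(H0, N, moss_generations)
tree_diversity = expected_diversity(H0, N, tree_generations)

print(moss_generations, round(moss_diversity, 3))
print(tree_generations, round(tree_diversity, 3))
```

Under these assumptions the short-lived organism passes through many more generations in the same period and therefore loses a much larger share of its diversity, which is the reasoning behind using mosses as early indicators.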
Demographic events, such as population expansion or subdivision, can alter the
distribution of genetic variation in a sample of homologous gene sequences from a species.
For example, sequences sampled from an expanding population are expected to show an
excess of low-frequency mutations. Samples from subdivided populations, on the other hand,
will show an excess of mid-frequency mutations (Tajima 1989). Thus, patterns of molecular
variation permit inferences about past demographic processes.
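The contrast between an excess of low-frequency and an excess of mid-frequency mutations is what the D statistic of Tajima (1989) summarises. The sketch below computes it for a set of aligned sequences; the function name and the two toy alignments are hypothetical, and the alignment is assumed to contain no gaps or missing data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for aligned sequences of equal length (no gaps assumed)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    # S: number of segregating (polymorphic) sites
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    # Constants of the variance term
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical alignment in which every mutation is carried by one sequence only
rare_variants = ["AAA", "TAA", "ATA", "AAT"]
# Hypothetical alignment in which every mutation is carried by half the sample
mid_frequency = ["AAA", "AAA", "TTT", "TTT"]

print(tajimas_d(rare_variants), tajimas_d(mid_frequency))
```

The first alignment gives a negative value and the second a positive one, matching the expectations for expanding and subdivided populations respectively; samples this small are used only to keep the arithmetic visible.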
Genetic variation in expanding populations of clonal moss species has also been
explained by invoking somatic mutations that could occur during gametophyte growth
(Skotnicki et al. 1999). However, to demonstrate somatic mutation in mosses, point mutations
in homologous sequences should be observed in different ramets of the same genet, which is
long and expensive work. Ongoing demographic fluctuations were also demonstrated by
ISSR molecular markers in the epiphytic moss Leptodon smithii (Spagnuolo et al. 2007b).
The authors investigated genetic variation in some urban and extra-urban populations; they
found high frequency and cover values and high gene diversity in the extra-urban populations.
They also observed that one urban site showed a high number of haplotypes, similar to that
detected in extra-urban sites. However, only one haplotype had a high frequency, while all the
others exhibited low frequencies, indicating an ongoing demographic contraction.
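The point that haplotype number alone can mislead is easy to show numerically. The sketch below uses a standard unbiased gene diversity estimator, n/(n-1) times one minus the sum of squared haplotype frequencies; the haplotype counts are invented for illustration and are not the values reported by Spagnuolo et al. (2007b).

```python
def gene_diversity(haplotype_counts):
    """Unbiased gene diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = sum(haplotype_counts)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    sum_sq = sum((c / n) ** 2 for c in haplotype_counts)
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical site with five haplotypes at even frequencies
even_site = [4, 4, 4, 4, 4]
# Hypothetical site with five haplotypes, one of them dominant
skewed_site = [16, 1, 1, 1, 1]

print(round(gene_diversity(even_site), 3), round(gene_diversity(skewed_site), 3))
```

Both hypothetical sites carry the same number of haplotypes, yet the site dominated by a single haplotype has less than half the gene diversity of the even one, which is the kind of signal interpreted in the text as a demographic contraction.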
Demographic indices and intra-specific genetic variation in bryophytes can provide
useful data for estimating the occurrence of important phenomena, such as bottlenecks,
founder effects and genetic drift, but they are especially valuable in forest management,
owing to their predictive value for the fate of whole plant communities.

Genetic Differentiation, Habitat Fragmentation
and Edge Effect

Habitat fragmentation, that is, the reduction of continuous habitat into several smaller,
spatially isolated remnants, is a significant threat to the maintenance of biodiversity in many
terrestrial ecosystems. The rapid, widespread transformation of the environment into a
network of habitat patches creates an urgent need to better understand how fragmentation
alters both the ecological stability and the evolutionary potential of the surviving flora. In this
respect, molecular markers provide a fast and convenient tool to assess the condition of
populations of a wide variety of species (Spielman et al. 2004).
In theory, habitat subdivision constricts the genetic neighbourhood of surviving species
by reducing population size, decreasing levels of gene flow, and increasing average
interpopulation distances (Templeton et al. 1990; Young et al. 1996). These changes (induced
by habitat fragmentation) erode local genetic variability, augment genetic differentiation
among populations because of genetic drift, and increase the likelihood of breeding among
related individuals (inbreeding) (Frankham et al. 2002; Arnaud et al. 2003), with a
consequent increase in the linkage among loci and in the multilocus linkage disequilibrium
index. Evidence from plant studies indicates that decreased genetic variation, resulting from
inbreeding and genetic drift, may lower fitness and increase extinction risk in isolated
populations (Charlesworth and Charlesworth 1987; Ellstrand and Elam 1993; Menges 1991;
Newman and Pilson 1997). Fragmentation also affects the genetic structure of populations,
with isolated fragments tending to be more genetically distinct than would be expected in a
continuous population on a similar spatial scale. Fragmentation significantly alters
demographic patterns by inducing a decline in colonization rates; this decline, extended over
generations, leads to local extinction within a time that depends on the per-generation
mortality rate (Zartman and Shaw 2006).
Due to their sessile nature, plants are predicted to be especially sensitive to the
population genetic consequences of increased insularity resulting from habitat fragmentation
(Young et al. 1996).
Many bryophytes are well suited for investigating the genetic effects of fragmentation,
addressing both the ecological and evolutionary impacts of habitat fragmentation, due to
global ubiquity, fast generation times, substrate specificity, and dominant haploid condition.
Indeed, many bryophyte taxa have a distribution across more than one continent at the
generic and familial level: a feature which allows for the unique opportunity of examining
habitat fragmentation impacts in disparate geographic areas while minimising the
confounding effects caused by the use of different taxa.
In addition, the fast colonisation-extinction rates, the high substrate specificity, and the
high turnover rates of habitat patches for many bryophyte taxa (Snäll et al., 2005; Söderström
and Herben, 1997) offer the unique opportunity to quantify population parameters, such as
patch colonisation and extinction, within experimentally tractable time periods in order to test
metapopulation theory in light of habitat fragmentation (e.g., Zartman and Shaw, 2006).
The bryophyte studies that have addressed the effects of habitat fragmentation on
population genetic structure generally provide evidence that isolation decreases genetic
diversity and increases population structure. For example, bryophyte populations of both
fragmented peat bogs (Wilson and Provan 2003), and deciduous forests (Wyatt et al. 1989)
show a loss of genetic variability, and exhibit increased interpopulation genetic
differentiation. Reduced genetic diversity in bryophytes may result from the random loss of
alleles due to drift acting on these isolated populations (Wyatt et al. 1989).
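The joint pattern of reduced within-population diversity and increased differentiation can be summarised by a statistic such as G_ST, the proportion of total gene diversity found among populations. The sketch below is one simple way to compute it from haplotype frequencies, assuming equal weighting of populations and no sample-size correction; the function names and frequencies are hypothetical.

```python
def diversity(freqs):
    """Gene diversity from haplotype frequencies: 1 - sum of squares."""
    return 1 - sum(p * p for p in freqs)


def g_st(populations):
    """G_ST = (H_T - H_S) / H_T for populations given as frequency lists.

    Each list holds frequencies for the same haplotypes in the same order.
    """
    n_pops = len(populations)
    # H_S: mean diversity within populations
    h_s = sum(diversity(p) for p in populations) / n_pops
    # H_T: diversity of the pooled (averaged) frequencies
    mean_freqs = [sum(column) / n_pops for column in zip(*populations)]
    h_t = diversity(mean_freqs)
    if h_t == 0:
        raise ValueError("no variation in the total sample")
    return (h_t - h_s) / h_t


# Hypothetical connected patches sharing the same haplotype frequencies
connected = [[0.5, 0.5], [0.5, 0.5]]
# Hypothetical isolated patches that drifted towards different haplotypes
fragmented = [[0.9, 0.1], [0.1, 0.9]]

print(g_st(connected), round(g_st(fragmented), 2))
```

In the hypothetical fragmented case each patch retains little diversity of its own while most of the total diversity lies between patches, which is the signature of drift in isolation described above.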
Edge effects play a central role in the biology of fragmented populations (Laurance et al.
2002; Murcia 1995). The abrupt changes in microclimatic conditions, such as increased
temperatures, increased solar radiation due to tree windfalls and, in many cases, decreased
ambient humidity (Kapos et al., 1997), associated with exposure to the differing conditions of
the surrounding matrix, affect local growth and community composition of bryophytes
(Zartman and Shaw, 2006). For example, Hylander et al. (2005) found significantly lower
growth rates in two mosses (Hylocomium splendens and Hylocomiastrum umbratum) on the
more exposed, south-facing forest edges than on the sheltered, north-facing edges in boreal
forest of northern Sweden.
In a study on the intraspecific genetic variation of the epiphytic moss Leptodon smithii,
collected on Quercus ilex bark, the authors (Spagnuolo et al. 2007b) observed that
fragmentation and edge effects can also occur under conditions of high naturalness; indeed,
at a volcanic site (Vesuvius), different phanerogamic communities (Q. ilex, pine wood,
Robinia pseudacacia, mixed wood, open macchia) and soil structures (presence of native
volcanic soil), abruptly succeeding one another, have caused fragmentation of the ilex wood,
with a consequent low level of genetic diversity in the epiphytic moss population.
In general the presence of suitable vegetation corridors among populations maintains
gene flow, favouring step-by-step dispersal that, in turn, increases colonization events and
population expansion.

Genetic Variation and Environmental Disturbance

Environmental disturbance causes a general loss of habitats suitable for moss species,
with a subsequent demographic decrease, accompanied by a major or minor loss of
haplotypes, depending on low or high haplotype frequencies, respectively. Therefore,
intraspecific genetic variation in mosses appears a suitable tracer of environmental
disturbance, providing useful data for environmental monitoring and management. The
duration of the disturbance in relation to the life span of the species is very important.
Indeed, among mosses, fugitive and pioneer species are not suitable for intraspecific
genetic analysis along an environmental gradient, since their short life cycle is
accompanied by an extremely short life span. By contrast, long-lived species and perennial
stayers combine a short generation time with sufficiently long persistence in the
environment, making it possible to observe the changes in genetic variation and structure
induced by environmental disturbance. The advantage of a short generation time, as in
mosses, for displaying the genetic effects of disturbance is generally accepted; most
angiosperms, and especially trees, would instead spread these effects over time spans too
long to be compatible with experimental timing. Investigations of the relationship between
moss genetic variation and environmental disturbance date only to very
recent years. Scientific interest in this topic was prompted by the observation of
surprisingly high levels of genetic variation in some moss species in the absence of sexual
reproduction, particularly in mosses collected in the Antarctic region, such as Bryum
argenteum (Skotnicki et al. 2005) and Ceratodon purpureus (Skotnicki et al. 2004). The high
genetic differentiation detected with RAPD markers in Antarctic mosses has been claimed to
reflect somatic mutation, possibly due to elevated UV-B radiation. Using microsatellites,
Clarke et al. (2008) found that genetic variation in C. purpureus was lower in Antarctic
populations than in populations from temperate regions. The authors conclude that climatic
change represents a strong challenge for Antarctic populations of C. purpureus, which appear
weakly interconnected and have less potential than temperate populations to adapt to
environmental changes.
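The dependence of haplotype loss on haplotype frequency, noted at the start of this
section, can be illustrated with a minimal sketch. It assumes, purely for illustration,
that a disturbance leaves n survivors drawn at random and independently from the original
population, so that a haplotype at frequency p is lost with probability (1 - p)^n. The
function names, haplotype frequencies, and survivor numbers below are hypothetical and are
not taken from any of the studies cited.

```python
def prob_haplotype_lost(freq, survivors):
    """Chance that a haplotype at frequency `freq` is absent from
    `survivors` individuals sampled at random with replacement."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie between 0 and 1")
    if survivors < 0:
        raise ValueError("survivors must be non-negative")
    return (1.0 - freq) ** survivors


def expected_haplotypes_retained(freqs, survivors):
    """Expected number of distinct haplotypes still present after the decline."""
    return sum(1.0 - prob_haplotype_lost(p, survivors) for p in freqs)


# Hypothetical haplotype frequencies in an undisturbed population (they sum to 1).
freqs = [0.60, 0.25, 0.10, 0.04, 0.01]

for n in (100, 20, 5):  # hypothetical numbers of surviving individuals
    losses = ", ".join(f"{prob_haplotype_lost(p, n):.3f}" for p in freqs)
    kept = expected_haplotypes_retained(freqs, n)
    print(f"survivors={n:3d}  P(lost) per haplotype: {losses}  expected retained: {kept:.2f}")
```

Under these simplifying assumptions, rare haplotypes are lost far more readily than common
ones as the number of survivors falls, which is the pattern described in the text.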
In a study of the moss Leptodon smithii, Spagnuolo et al. (2009b) investigated major and
trace metal accumulation in moss tissues from populations located along a disturbance
gradient, from remote areas to urban gardens, and compared the bioaccumulation data with
intrapopulation genetic diversity measured by ISSR markers. They found higher trace metal
accumulation coupled with lower genetic differentiation in mosses from urban sites, where
urbanisation, vehicular traffic, and a dry microclimate favouring erosion and dust
resuspension are relevant features. By contrast, a high level of genetic variation and
lower element content were coupled in mosses from remote sites. The authors also found a
high and significant correlation between the two independent data sets, and therefore
suggest that metal accumulation and genetic differentiation may prove a useful tool to
highlight, in an integrated way, the occurrence of human-driven disturbance, to evaluate
its sustainability, and to discover deviations from reference situations, particularly
when disturbance effects are hidden or do not clearly display their demographic and
phytosociological consequences.
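
The comparison of two independent data sets of this kind can be sketched as follows. The
example computes a Pearson correlation coefficient between a per-site metal load and a
per-site genetic diversity value. The choice of Pearson's r, the function name, and all
numbers are assumptions made for illustration; they are not the data or the statistical
procedure of Spagnuolo et al. (2009b).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sites ordered from remote to urban (illustrative values only).
metal_load = [1.0, 1.5, 2.2, 3.8, 4.5]            # arbitrary units
gene_diversity = [0.30, 0.27, 0.22, 0.15, 0.11]   # e.g. a within-population diversity index

r = pearson_r(metal_load, gene_diversity)
print(f"Pearson r between metal load and genetic diversity: {r:.3f}")
```

A strongly negative coefficient on real data would correspond to the pattern reported in
the text, with higher metal content at sites showing lower genetic variation; a
significance test would still be needed before drawing conclusions.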

Conclusion

Bryophytes are traditionally considered secondary elements of the vegetation, except in
some ecosystems, such as bogs and tundra.
However, several advantageous traits (their peculiar life cycle with a dominant haploid
phase, their short generation time, the widespread distribution of most species and, above
all, their importance as biomonitors) converge in mosses, making these organisms particularly
suitable as model plants for detecting environmental changes. They have been employed as
bioindicators of air quality since the 1960s. Recently, Frahm (2008) reported that spores, and
even vegetative propagules, are ubiquitous, and are also dispersed into regions where the
species cannot grow. The ubiquitous presence of spores is another reason for using
bryophytes as tracers of climatic change. For example, spores of southern species dispersed
beyond the former borders of their ranges are able to germinate as soon as the climate
warms, and can also remain viable for more than 100 years (Frahm and Klaus 2001). Thus,
a change in the distribution of a given species, whether widespread or endemic, may
indicate ongoing climatic change. Moreover, the short generation time and a life span
largely compatible with experimental timing are two further important advantages of
using mosses as tracers of environmental change, providing the opportunity to gain
information about demographic trends and the reproductive strategies adopted through the
generations. All these objectives can be achieved by field observations, but also by molecular
analyses. Moss genetic variation supports the other methodological approaches, providing an
independent data set that makes it possible to quantify accurately the effects of
environmental changes and to predict their trend over time, especially when their
demographic and phytosociological consequences are hidden or not clearly displayed.
The most important challenges for the future include the ability to investigate and
promptly identify environmental changes which will occur under the influence of global
climatic changes, as responses to human activities and pressure, in order to make suitable
interventions in the management of threatened environments and protected areas.

References

Agapow, P.M. and Burt, A. (2001). Indices of multilocus linkage disequilibrium. Molecular
Ecology Notes, 1, 101-102.
Arnaud, J.F.; Madec, L.; Guiller, A. and Deunff, J. (2003). Population genetic diversity in a
human-disturbed environment: a case study in the land snail Helix aspersa (Gastropoda:
Pulmonata). Heredity, 90, 451-458.
Barrett, S.C.H. and Eckert, C.G. (1990). Variation and evolution of mating systems in seed
plants. In S. Kawano (Ed.), Biological Approaches and Evolutionary Trends in Plants.
(Academic Press, pp. 229-254). Tokyo, Japan.
Charlesworth, B. and Charlesworth, D. (1998). Some evolutionary consequences of
deleterious mutations. Genetica, 103, 3-19.
Charlesworth, D. and Charlesworth, B. (1987). Inbreeding depression and its evolutionary
consequences. Annual Review of Ecology and Systematics, 18, 237-268.
Clarke, L.J.; Ayre, D.J. and Robinson, S.A. (2008). Somatic mutation and the Antarctic ozone
hole. Journal of Ecology, 96, 378-385.
Cronberg, N. (2002). Colonization dynamics of the clonal moss Hylocomium splendens on
islands in a Baltic land uplift area: reproduction, genet distribution and genetic variation.
Journal of Ecology, 90, 925-935.
Cronberg, N.; Rydgren, K. and Økland, R.H. (2006). Clonal structure and genet-level sex
ratios suggest different roles of vegetative and sexual reproduction in the clonal moss
Hylocomium splendens. Ecography, 29, 95-103.
Dulière, J.; De Bruyn, R. and Malaisse, F. (2000). Changes in the moss layer after liming in
a Norway spruce (Picea abies (L.) Karst.) stand of Eastern Belgium. Forest Ecology and
Management, 136, 97-105.
During, H.J. (1979). Life strategies of Bryophytes: a preliminary review. Lindbergia, 5, 2-18.
Ellstrand, N.C. and Elam, D.R. (1993). Population genetic consequences of small population
size: Implications for plant conservation. Annual Review of Ecology and Systematics, 24,
217-242.
Eppley, S.M.; Taylor, P.J. and Jesson, L.K. (2007). Self-fertilization in mosses: a comparison
of heterozygote deficiency between species with combined versus separate sexes.
Heredity, 98, 38-44.
Fisher, R.A. (1949). The Theory of Inbreeding. London, UK. Oliver and Boyd.
Frahm, J-P. and Klaus, D. (2001). Bryophytes as indicators for past and present climate
fluctuations. Lindbergia, 26, 97-104.
Frahm, J-P. (2008). Diversity, dispersal and biogeography of bryophytes (mosses).
Biodiversity and Conservation, 17, 277-284.
Frankham, R.; Ballou, J.D. and Briscoe, D.A. (2002). Introduction to Conservation Genetics.
Cambridge, UK. Cambridge University Press.
Gibbs, J.P. (2001). Demography versus habitat fragmentation as determinants of genetic
variation in wild populations. Biological Conservation, 100, 15-20.
Giordano, S.; Sorbo, S.; Adamo, P.; Basile, A.; Spagnuolo, V. and Castaldo Cobianchi, R.
(2004). Biodiversity and trace element content of epiphytic bryophytes in urban and
extraurban sites of southern Italy. Plant Ecology, 170, 1-14.
Gupta, M.; Chyi, Y.S.; Romero-Severson, J. and Owen, J.L. (1994). Amplification of DNA
markers from evolutionarily diverse genomes using single primers of simple-sequence
repeats. Theoretical and Applied Genetics, 89, 998-1006.
Hassel, K.; Såstad, S.M.; Gunnarsson, U. and Söderström, L. (2005). Genetic variation and
structure in the expanding moss Pogonatum dentatum (Polytrichaceae) in its area of
origin and in a recently colonized area. American Journal of Botany, 92, 1684-1690.
Hedrick, P.W. (1987). Genetic load and the mating system in homosporous ferns. Evolution,
41, 1282-1289.
Hylander, K. (2005). Aspect modifies the magnitude of edge effects on bryophyte growth in
boreal forests. Journal of Applied Ecology, 42, 518-525.
Hylander, K.; Dynesius, M.; Jonsson, B.G. and Nilsson, C. (2005). Substrate form determines
the fate of bryophytes in riparian buffer strips. Ecological Applications, 15, 674-688.
Innes, D.J. (1990). Microgeographic genetic variation in the haploid and diploid stages of the
moss Polytrichum juniperinum Hedw. Heredity, 64, 331-340.
Jarne, P. and Charlesworth, B. (1993). The evolution of the selfing rate in functionally
hermaphrodite plants and animals. Annual Review of Ecology and Systematics, 24,
441-466.
Kapos, V.; Wandelli, E.; Camargo, J. and Ganade, G. (1997). Edge-related changes in
environment and plant responses due to forest fragmentation in central Amazonia. In
W.F. Laurance and R.O. Bierregaard Jr. (Eds.), Tropical forest remnants: ecology,
management, and conservation of fragmented communities. (University of Chicago
Press; pp. 33-44). Chicago, USA.
Klekowski, E.J. Jr. (1972). Genetical features of ferns as contrasted to seed plants. Annals of
the Missouri Botanical Garden, 59, 138-151.
Lande, R. and Schemske, D.W. (1985). The evolution of self-fertilization and inbreeding
depression in plants. I. Genetic models. Evolution, 39, 24-40.
Laurance, W.F. and Yensen, E. (1991). Predicting the impacts of edge effects in fragmented
habitats. Biological Conservation, 55, 77-92.
Longton, R.E. (1988a). Life-history strategies among bryophytes of arid regions. Journal of
the Hattori Botanical Laboratory, 64, 15-28.
Longton, R.E. (1988b). The biology of polar bryophytes and lichens. Cambridge, UK.
Cambridge University Press.
Longton, R.E. (1992). The role of bryophytes and lichens in terrestrial ecosystems. In J.W.
Bates and A.M. Farmer (Eds.), Bryophytes and Lichens in a Changing Environment.
(Clarendon Press; pp. 32-76). Oxford, UK.
McDaniel, S.F. and Shaw, A.J. (2005). Selective sweeps and intercontinental migration in
the cosmopolitan moss Ceratodon purpureus (Hedw.) Brid. Molecular Ecology, 14,
1121-1132.
Meagher, T.R. and Shaw, A.J. (1990). Clonal structure of the moss Climacium americanum
Brid. Heredity, 64, 233-238.
Menges, E. (1991). Seed germination percentage increases with population size in a
fragmented prairie species. Conservation Biology, 5, 158-164.
Murcia, C. (1995). Edge effects in fragmented forests: implications for conservation. Trends
in Ecology and Evolution, 10, 58-62.
Nei, M.; Maruyama, T. and Chakraborty, R. (1975). The bottleneck effect and genetic
variability in populations. Evolution, 29, 1-10.
Newman, D. and Pilson, J. (1997). Increased probability of extinction due to decreased
genetic effective population size: Experimental populations of Clarkia pulchella.
Evolution, 51, 354-362.
Pharo, E.J. and Zartman, C.E. (2007). Bryophytes in a changing landscape: The hierarchical
effects of habitat fragmentation on ecological and evolutionary processes. Biological
Conservation, 135, 315-325.
Pharo, E.J.; Lindenmayer, D.B. and Taws, N. (2004). The effects of large-scale fragmentation
on bryophytes in temperate forests. Journal of Applied Ecology, 41, 910-921.
Sardans, J. and Peñuelas, J. (2005). Trace element accumulation in the moss Hypnum
cupressiforme Hedw. and the trees Quercus ilex L. and Pinus halepensis Mill. in
Catalonia. Chemosphere, 60, 1293-1307.
Shaw, A.J. (1991). The genetic structure of sporophytic and gametophytic populations of the
moss, Funaria hygrometrica Hedw. Evolution, 45, 1260-1274.
Shaw, A.J.; Werner, O. and Ros, R. (1990). Intercontinental Mediterranean disjunct mosses:
morphological and molecular patterns. American Journal of Botany, 90, 540-550.
Shaw, A.J. (2000). Population ecology, population genetics and microevolution. In A.J.
Shaw and B. Goffinet (Eds.), Bryophyte Biology. (Cambridge University Press; pp. 369-402).
Cambridge, UK.
Sim-Sim, M.; Carvalho, P. and Sérgio, C. (2000). Cryptogamic epiphytes as indicators of air
quality around an industrial complex in the Tagus valley, Portugal. Factoral analysis and
environmental variables. Cryptogamie Bryologie, 21, 153-170.
Skotnicki, M.L.; Ninham, J.A. and Selkirk, P.M. (1999). Genetic diversity and dispersal of
the moss Sarconeurum glaciale on Ross Island, East Antarctica. Molecular Ecology, 8,
753-762.
Snäll, T.; Ehrlén, J. and Rydin, H. (2005). Colonization-extinction dynamics of an epiphyte
metapopulation in a dynamic landscape. Ecology, 86, 106-115.
Söderström, L. and Herben, T. (1997). Dynamics of bryophyte metapopulations. In R.E.
Longton (Ed.), Advances in Bryology: Population Studies. (J. Cramer Press; pp. 205-240).
Berlin, Germany.
Spagnuolo, V.; Muscariello, L.; Cozzolino, S.; Castaldo Cobianchi, R. and Giordano, S.
(2007a). Ubiquitous genetic diversity in ISSR markers between and within populations of
the asexually producing moss Pleurochaete squarrosa. Plant Ecology, 188, 91-101.
Spagnuolo, V.; Muscariello, L.; Terracciano, S. and Giordano, S. (2007b). Molecular
biodiversity in the moss Leptodon smithii (Neckeraceae), in relation to habitat
disturbance and fragmentation. Journal of Plant Research, 120, 595-604.
Spagnuolo, V.; Terracciano, S. and Giordano, S. (2009a). Clonal diversity and geographic
structure in Pleurochaete squarrosa (Pottiaceae): different sampling scale approach.
Journal of Plant Research, 122, 161-170.
Spagnuolo, V.; Terracciano, S. and Giordano, S. (2009b). Trace element content and
molecular biodiversity in the epiphytic moss Leptodon smithii: two independent tracers
of human disturbance. Chemosphere, 74, 1158-1164.
Spielman, D.; Brook, B.W. and Frankham, R. (2004). Most species are not driven to
extinction before genetic factors impact them. Proceedings of the National Academy of
Sciences, 101, 15261-15264.
Tajima, F. (1989). Statistical method for testing the neutral mutation hypothesis by DNA
polymorphism. Genetics, 123, 585-595.
Templeton, A.R.; Shaw, K.; Routman, E. and Davis, S.K. (1990). The genetic consequences
of habitat fragmentation. Annals of the Missouri Botanical Garden, 77, 13-27.
Wilson, P.J. and Provan, J. (2003). Effect of habitat fragmentation on levels and patterns of
genetic diversity in natural populations of the peat moss Polytrichum commune.
Proceedings of the Royal Society of London Biological Science, 270 (1517), 881-886.
Wright, S. (1965). The interpretation of population structure by F-statistics with special
regard to systems of mating. Evolution, 19, 395-420.
Wyatt, R.; Odrzykoski, I.J. and Stoneburner, A. (1989). High levels of genetic variability in
the haploid moss Plagiomnium ciliare. Evolution, 43, 1085-1096.
Young, A.; Boyle, T. and Brown, T. (1996). The population genetic consequences of habitat
fragmentation for plants. Trends in Ecology and Evolution, 11, 413-418.
Zartman, C.E. and Shaw, A.J. (2006). Metapopulation extinction thresholds in rainforest
remnants. The American Naturalist, 167, 177-189.

Index

#
1G, 57
A
ABC, xi, 229, 231, 232, 233, 234, 236, 237, 240,
241, 242, 243
abiotic, 131, 132, 135, 137
Aboriginal, 264
ACC, 90, 92, 199, 210
accounting, 164
accuracy, 231, 240, 279
acetate, 201
acetic acid, 199, 207
achievement, 71
acid, 6, 9, 23, 61, 63, 68, 121, 123, 124, 183, 196,
197, 199, 200, 202, 203, 205, 206, 207, 208, 209,
210, 211
acidic, 63, 198
acidification, 207
acidity, 106
Acinetobacter, 197, 203, 208
acute, 181
Adams, 42
adaptability, vii, x, 129
adaptation, ix, x, 105, 106, 108, 111, 122, 123, 129,
172
adaptive radiation, 215
adenine, 53
administrative, 175
adult, 73, 249
Afghanistan, 214, 250
Africa, xii, 39, 72, 100, 130, 131, 180, 192, 234,
246, 247, 248, 249, 250, 255, 256, 257, 259, 260,
261, 262, 263, 265, 267, 268, 269
agar, 198, 199, 200, 202
AGC, 90, 92
age, 11, 234, 245
agent, 102, 273
agents, 135, 210
aggregation, 38
agricultural, xi, 131, 173, 195, 196, 198, 205
agricultural residue, 196
agriculture, 146, 205, 206
aid, 110, 204
air, 58, 275, 279, 282
air quality, 275, 279, 282
Alabama, 106, 116
Alaska, 251
alcohol, 8
alfalfa, 121, 123, 142
algorithm, 75, 184, 241, 242
alkaline, 196, 209
alkaloids, 137
allele, 8, 21, 74, 91, 92, 140, 155, 158, 159, 234,
235, 236, 244, 246
alleles, x, xii, 6, 12, 13, 26, 74, 83, 84, 85, 86, 99,
129, 133, 135, 139, 140, 150, 155, 174, 221, 230,
236, 274, 277
alternative, 27, 28, 125, 202, 225, 231, 242, 265
alternative hypothesis, 28
alters, 276, 277
aluminium, 205
amendments, 124
amino, 9, 23, 181
amino acid, 9, 23, 181
amino acids, 181
ammonium, 183, 197, 203
amphibians, 19, 22
Amsterdam, 45, 177, 227, 244
analysis of variance, 4
anatomy, 181
Andes, 251
animal agriculture, 146
animal health, 137
animals, viii, 2, 9, 13, 15, 19, 22, 106, 137, 153, 166,
274, 281
Animals, 37, 39, 41, 44, 177, 178
anisotropy, 158
annealing, 182, 183
ANOVA, 4, 17, 18, 19, 22
antagonism, 200
antagonistic, 206, 207
antagonists, 200
Antarctic, 214, 226, 272, 278, 280
anthropic, 180, 215
anthropogenic, 186, 188
anthropological, 152, 165, 171
anthropology, 177
antibiotic, 200, 208, 210
antibiotics, 201
apatites, 196
API, 203
appendix, 16
application, 71, 88, 99, 101, 102, 120, 135, 192, 193,
196, 207, 233
Arabia, 97, 255, 263
Araneae, 40
arbuscular mycorrhizal fungi, 196, 198
arbuscular mycorrhizal fungus, 212
archeology, 171, 177
Arctic, 40, 41, 171, 174, 214, 226, 251, 256, 259,
260, 261
Argentina, 110, 112, 113, 115, 181, 211
arginine, 201
arid, 123, 281
arithmetic, 4, 8, 36
Arizona, 251
Arkansas, 110, 119
ARS, 132
Arthropoda, 19, 33, 34, 35, 36
arthropods, 196
asexual, 67, 188, 272, 273
ash, 107, 116, 120, 208
Asia, x, xii, 12, 42, 44, 48, 80, 91, 103, 104, 130,
149, 158, 166, 169, 171, 173, 178, 234, 247, 249,
250, 251, 255, 256, 259, 260, 261, 262, 263, 265,
266, 267, 268
Asian, xii, 39, 74, 102, 166, 168, 171, 174, 177, 247,
249, 255, 263, 264, 268
assessment, 14, 111, 134, 136, 144, 147, 225
assignment, 74
assimilation, 203
assumptions, 10, 223
asymptomatic, 137
asymptotically, 2
Atlantic, 41, 49
Atlantic Ocean, 41
Atlas, 177
Australasia, 49, 255
Australia, 49, 50, 144, 191, 251, 255, 256, 259, 260,
261, 262, 263, 266
Austria, 250
autocorrelation, 224
availability, 96, 196, 205
averaging, 29, 116, 154
awareness, 150
B
BAC, 58, 59
bacilli, 73
Bacillus, 197
back, 11, 107, 155, 166, 215, 263, 265
backcross, 133
bacteria, xi, 10, 195, 196, 197, 198, 199, 201, 202,
203, 204, 205, 206, 207, 208, 209, 211
bacterial, xi, 58, 102, 195, 197, 199, 200, 202, 204,
211
bacterial cells, 200
bacterial strains, 204
bacterium, 199, 200, 210, 211
Baikal, 27, 42, 43, 173
Bangladesh, 250
banks, 181
barley, 114, 144, 147, 190, 210, 214, 215, 226
barrier, 175, 230
barriers, 14, 24, 158, 224
base pair, 53, 154, 236
Bayesian, vi, xi, 229, 230, 231, 242, 244, 245
Bayesian analysis, 242, 244
beef, 130, 143
Beijing, viii, 69, 72, 74, 82, 85, 87, 91, 98, 99, 100,
101, 102, 103, 226, 227
Belgium, 72, 110, 112, 113, 280
benefits, 137
Bhutan, 214, 250
bias, xii, 6, 230, 231, 237, 240, 241
bifurcation, 28
bioaccumulation, 278
biochemistry, 102
biocontrol, 197, 198, 206, 207, 209, 210
bioconversion, 121
biodegradation, xi, 195, 197
biodiversity, vii, xi, xii, 44, 205, 213, 214, 215, 271,
272, 273, 275, 276, 282
biofuel, 106, 107, 110, 118, 121, 122, 123, 124
biofuels, 107, 121, 125
biogeography, 191, 280
bioindicators, 279
bioinformatics, ix, 70
biological control, xi, 195, 207
biomass, ix, 105, 106, 107, 109, 115, 116, 117, 118,
119, 120, 121, 122, 123, 125, 126, 127, 146, 209,
275
biomonitoring, 272, 275
biosynthesis, 202, 208, 209
biotechnology, 64, 211
biotic, 131, 132, 135, 137
birds, 19, 22, 227
birth, 235
Black Sea, 48, 171
blindness, 157
blocks, 52, 53, 122
blood, 155, 156
blood group, 155
blot, 74, 100, 101
body temperature, 137
bogs, 277
boilers, 107
bonds, 197
bootstrap, 184, 186, 190, 204
boreal forest, 277, 281
boric acid, 183
Borneo, 251
Bose, 98
Botanical Garden, 191, 281, 283
bottleneck, xiii, 9, 139, 248, 271, 272, 281
bottlenecks, 256, 263, 267, 274, 276
bovine, 9
Bradyrhizobium, 206
Brazil, v, x, 51, 179, 180, 181, 183, 184, 192
Brazilian, 193
breakdown, 208
breeding, ix, x, 9, 65, 66, 105, 107, 108, 109, 110,
112, 122, 123, 126, 129, 133, 138, 140, 141, 143,
144, 182, 186, 188, 191, 192, 193, 230, 274, 276
Britain, 251
British Columbia, 13
Brussels, 72
bryophyte, 274, 277, 281, 282
Buenos Aires, 181, 192
buffer, 183, 281
bulbs, 181, 188, 189
Bulgaria, v, viii, 69, 70, 73, 76, 77, 78, 80, 81, 82,
83, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
103
Burkholderia, 197, 207
burning, 107
butterfly, 8
bypass, 121
C
Ca2+, 197
calcium, 120, 198, 202
calf, 130
CAM, 78
Cambodia, 251
Cameroon, 250
Canada, 40, 41, 49, 106, 251
candidates, 118
capsule, 273
carbohydrate, 196
carbohydrates, 121, 196
carbon, 106, 115, 119, 120, 123, 124, 125, 127, 197,
201, 203, 206
carboxylic, 200, 206
Caribbean, 65
carrier, 153, 174
CAS, 74, 198
case study, 241, 280
catalase, 92
catfish, 6, 29, 42
cation, 203
cattle, 136, 141, 143
Caucasian, 152, 165
Caucasus, 152
cell, 53, 58, 59, 121, 200, 206
cellulose, 121, 196, 200
cellulosic, 121, 126
cellulosic ethanol, 126
Central Asia, 74, 166, 169, 171, 173, 174, 249
Central Europe, x, 149, 174
centrifugal forces, 175
certification, 109
changing environment, 122
chemicals, 127
children, 100
Chile, 64
China, 12, 51, 63, 65, 66, 72, 85, 87, 92, 99, 100,
171, 188, 194, 213, 214, 215, 225, 226, 227, 228,
250, 255
chitin, 200
chloride, 199
chlorophyll, 147
chloroplast, 108, 111, 124, 125, 136, 216
cholinesterase, 156
Chordata, 34
chromosome, viii, 29, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 66, 67, 68, 72, 124, 130,
142, 215, 228, 245, 248, 263, 266, 267, 268
chromosome map, 58
chromosomes, 52, 53, 54, 55, 56, 58, 60, 61, 62, 63,
64, 65, 66, 67, 68, 132, 145, 245, 264, 266, 268
circulation, ix, 70, 80, 95, 96
citrus, 52, 59, 61, 62, 68
classes, 134
classical, 71, 85, 86
classification, 24, 27, 51, 52, 60, 97, 108, 109, 111,
125, 130, 134, 140, 152, 181, 190, 191, 192, 216
cleavage, 197
climate change, 221, 224, 225, 227
climate warming, 224
clonality, 88
cloning, 193, 208
cluster analysis, 86, 224
clustering, 11, 73, 98, 115, 186
clusters, 73, 76, 77, 82, 83, 86, 93, 94, 95, 112, 113,
185, 201
Co, vii, 1, 3, 4, 6, 8, 9, 15, 16, 17, 18, 19, 20, 22, 23,
29, 30, 33, 42, 43
CO2, 120
coal, 107
coding, 5, 10, 11, 21, 94, 96, 111, 154, 181, 202
codon, 10, 73, 98, 102
codons, 10, 90, 103
coenzyme, 208, 209
cohesion, 21, 27
Coleoptera, 33, 34, 35, 36, 44
collaboration, 109
collateral, 207
colonisation, 277
colonization, 153, 158, 173, 174, 206, 244, 248, 249,
254, 262, 263, 277, 278
Colorado, 110, 132
colors, 118
Columbia, 13, 44, 45, 47, 49, 245
Columbia University, 245
combined effect, 221
combustion, 107
communication, 40, 45, 58, 108, 131, 139
communities, xii, 216, 271, 272, 276, 277, 281
community, vii, 21, 103, 272, 277
compatibility, 12, 52, 108, 133
competition, 13
complement, 60, 156
complexity, 153, 189, 205, 206, 231, 232, 264
compliance, 96
complications, 5, 28
components, x, 19, 23, 118, 121, 123, 139, 149, 162,
168, 169, 174, 175, 233, 275
composition, 65, 121, 123, 140, 155, 189, 214, 277
compounds, 209
computation, xi, 229, 230, 231, 236, 241, 244, 245
computing, 4, 7, 191, 241
concentration, 73, 92, 107, 120, 121, 124, 183, 197,
199
conception, 137
concrete, 7, 162
condensation, 52
confidence, 17, 75, 204
confidence interval, 17, 75
confidence intervals, 17
conflict, 52
confusion, 62
Congress, iv
congruence, 86
conjecture, 169
consensus, xii, 28, 205, 247
conservation, ix, xi, 59, 105, 106, 107, 108, 131,
134, 141, 183, 189, 190, 191, 213, 215, 221, 225,
226, 227, 230, 280, 281
constraints, xiii, 271, 272
construction, 3, 30, 193, 231
consulting, 2
contaminant, 131
contamination, 75
continuity, 251, 264, 266, 267
control, 8, 72, 74, 75, 91, 92, 96, 99, 104, 106, 108,
121, 142, 196, 199, 207
control group, 92
conversion, 121, 122, 124, 197
Copenhagen, 190
copepods, 8
copper, 118
COR, 132
corn, 107, 120, 121, 245
correlation, xi, 19, 29, 48, 158, 168, 169, 171, 184,
213, 221, 222, 223, 249, 274, 279
correlation coefficient, 19
correlations, 223, 262, 263
corridors, 262, 267, 278
costs, 8, 107, 115
country of origin, 249
coupling, 242
covering, 112, 149, 186
craniofacial, 249, 252, 260, 261, 262, 265
credibility, xii, 230, 237, 238, 239, 243
creep, 142
criticism, 23
Croatia, 234
crop production, 196
crop residues, 196
crops, 107, 120, 121, 125, 135, 214, 215, 216, 217
crossbreeding, 20
cross-fertilization, 188
CRP, 106
crustaceans, 9, 22
CTAB, 183
Cuba, 181
culm, 130
cultivation, 52, 119
culture, 74, 89, 122, 171, 173, 200, 201, 203, 229
cyanide, 201
cycles, 109, 183
Cyprus, 250
cytochrome, vii, 1, 3, 40, 43, 47, 201
cytochrome oxidase, vii, 1, 3, 43, 47, 201
cytology, 181
cytometry, 112, 114
cytoplasm, 108
cytosine, 53
D
Dace, 42
dairy, 120, 126
data analysis, 7, 191, 243
data set, xii, 11, 18, 29, 50, 204, 230, 231, 232, 234,
236, 237, 240, 241, 242, 279
database, vii, viii, 1, 4, 19, 22, 69, 73, 74, 77, 78, 82,
87, 89, 96, 97, 131, 204
dating, 11, 245, 265
decay, 188, 252
deciduous, 277
decomposition, 196
defense, 206, 211
deficiency, 196, 275, 280
definition, 20, 24, 71, 88, 89, 94, 150
degenerate, 182
degradation, 133, 147, 196, 206
degrading, 200, 207
dehydrogenase, 8, 156, 157, 202, 208
demographic data, 3, 155
demography, 273
Denmark, 12, 190
density, 118, 145, 156
dentist, 264
Department of Energy (DOE), 106
dephosphorylation, 197
deposits, 174
depression, 274, 276, 280, 281
desert, 171
detection, ix, 55, 70, 72, 73, 96, 99, 100, 101, 134,
168, 186, 202, 211
deviation, 175, 264, 274
dextrose, 200, 201, 203
differentiation, ix, x, 2, 3, 8, 9, 11, 17, 21, 23, 26, 37,
52, 53, 54, 55, 64, 70, 71, 85, 94, 96, 99, 105,
136, 139, 140, 141, 145, 147, 149, 153, 154, 155,
163, 166, 169, 171, 174, 186, 188, 190, 217, 218,
221, 223, 224, 225, 226, 265, 267, 273, 275, 276,
277, 278
digestibility, 107, 110, 121, 132
digestion, 204
diploid, ix, 105, 109, 111, 114, 126, 135, 142, 147,
274, 281
diplotene, 58, 59
direct repeats, 71
directional selection, 10
discontinuity, 224, 264, 266
discriminant analysis, 147
discrimination, 28, 71, 86, 88, 134, 135
discriminatory, viii, 69, 71, 75, 82, 85, 86, 88, 94,
96, 99
diseases, 97, 119, 122, 196
disequilibrium, 14, 273, 276, 279
dispersion, 186, 266
distilled water, 203
distribution, x, xi, xiii, 7, 13, 15, 16, 17, 18, 23, 44,
56, 58, 59, 66, 67, 73, 77, 96, 98, 134, 137, 150,
153, 154, 155, 156, 158, 161, 164, 165, 166, 168,
169, 171, 173, 174, 175, 179, 185, 186, 188, 189,
213, 214, 221, 225, 227, 234, 235, 236, 237, 242,
266, 271, 276, 277, 279, 280
divergence, vii, viii, xi, 1, 2, 3, 5, 6, 9, 10, 11, 12,
19, 21, 22, 23, 26, 27, 29, 38, 43, 48, 49, 50, 80,
111, 189, 191, 229, 230, 231, 237, 239, 241, 243
diversification, 57, 65, 265
division, 57, 150, 207
DNA, v, vii, viii, 1, 3, 4, 5, 6, 8, 11, 12, 13, 15, 18,
23, 29, 30, 37, 38, 39, 40, 41, 42, 44, 45, 46, 47,
49, 53, 54, 59, 64, 66, 69, 71, 72, 74, 75, 82, 84,
89, 94, 98, 101, 103, 108, 111, 112, 114, 115,
124, 125, 134, 136, 138, 146, 150, 157, 180, 182,
183, 189, 191, 192, 193, 204, 205, 212, 216, 227,
233, 242, 245, 246, 268, 280, 283
domestication, 188
dominance, 98
donor, 131, 199
down-regulation, 122
draft, 23, 107
Drosophila, 10, 12, 35, 38, 40, 44, 45, 46
drought, ix, 105, 106, 108, 120, 132, 133, 137, 143,
144, 146
drug resistance, ix, 70, 73, 90, 94, 95, 96, 97, 98,
100, 101, 103
drug-resistant, ix, 70, 89, 94, 95, 96, 97, 100, 103
drugs, 72
dry matter, 110, 116, 121
duration, 278
dust, 278
dyes, 52
E
E. coli, 202
earth, 214
East Asia, x, 91, 102, 103, 149, 171, 266, 267, 268
Eastern Europe, 12
ecological, 3, 21, 27, 116, 137, 142, 205, 206, 215,
225, 227, 276, 277, 282
ecologists, 214, 227
ecology, 44, 125, 244, 281, 282
economics, 124
ecosystem, 219
ecosystems, 214, 275, 276, 279, 281
Eden, 267
Egypt, 250, 267
elaboration, 50
electricity, 107
electron, 10
electrophoresis, 22, 71, 89, 183, 204, 207, 211
elongation, 118
email, 229
embryo, 63, 133
emigration, 103, 263
encoding, 72, 111, 124, 202, 211
endonuclease, 8
endosperm, 109
energy, 107, 120, 122, 123, 126, 127, 145, 273
England, 14, 97, 131, 191
environment, vii, xiii, 23, 28, 127, 150, 162, 172,
180, 188, 205, 206, 216, 225, 233, 272, 276, 278,
280
environmental change, xiii, 215, 271, 272, 274, 278,
279
environmental conditions, 273
enzymatic, 121, 123
enzymes, 155, 197, 205, 206
epidemiology, 71, 73, 97, 101, 102, 103, 231
epigenetic, 26
epiphytes, 282
equilibrium, 7, 9, 12, 150, 236, 241, 276
ERIC, 205
erosion, xiii, 121, 272, 273, 278
erythrocytes, 155
Escherichia coli, 202, 208, 209
EST, 112, 125, 135, 136, 138, 139, 140, 145, 146,
147
ester, 197
estimating, 3, 5, 154, 209, 227
estimator, 10
ethane, 207
ethanol, 107, 120, 121, 126, 207
Ethiopia, 250
Ethiopian, 266
ethnic groups, 150, 155, 174
ethnicity, 267
ethyl acetate, 201
euchromatin, 55
eukaryotes, 8
Eurasia, x, xii, 149, 154, 157, 158, 159, 160, 161,
163, 164, 166, 169, 170, 171, 172, 173, 175, 176,
177, 247, 248, 249, 255, 262
Euro, 89, 98
Europe, 48, 80, 99, 130, 131, 137, 166, 177, 234,
243, 250, 255, 256, 259, 260, 261, 262, 263
European Commission, 97
Europeans, 263
evolution, viii, 2, 3, 6, 10, 11, 19, 21, 23, 24, 26, 29,
30, 37, 52, 66, 68, 75, 81, 86, 95, 99, 100, 130,
153, 176, 190, 191, 193, 211, 225, 226, 231, 244,
248, 249, 265, 266, 267, 268, 269, 278, 280, 281
evolutionary process, 282
exclusion, 8
excretion, 197
expansions, 230, 246
exposure, 277
expressed sequence tag, 135, 144
extinction, xiii, 180, 221, 272, 275, 277, 282, 283
extraction, 138
extrapolation, 91
eyes, 75
F
factorial, 18, 29
failure, 91, 93
familial, 277
family, vii, ix, x, 1, 3, 10, 17, 18, 22, 27, 32, 35, 38,
51, 74, 82, 87, 89, 91, 95, 98, 99, 100, 101, 102,
104, 105, 106, 130, 135, 137, 152, 179, 180, 181,
182, 189, 211, 212
FAO, 41
Far East, viii, 1, 2, 12, 42, 43, 47, 177
farmers, 115
farming, 166
farms, 107
fax, 195
feedstock, 106, 107, 115, 118, 122, 125, 126
females, 13
fermentation, 107, 122, 201
Fertile Crescent, 263
fertility, xi, 120, 145, 195, 196, 198, 205
fertilization, 109, 120, 280
fertilizer, 106, 107, 120
fertilizers, 106, 120, 196, 205, 208
fiber, 107
fiber content, 107
fibers, 127
field trials, 118, 120
Fiji, 251
financial support, 206
financing, 104
fingerprinting, viii, 69, 72, 82, 86, 94, 101, 103, 134,
194, 201, 208
fingerprints, 209
Finland, 92, 100, 250
fish, vii, 1, 8, 11, 13, 14, 19, 26, 27, 49
FISH, 54, 56, 58, 59, 64, 66, 67, 144
fishers, 173
fitness, xii, 13, 23, 137, 271, 277
fixation, 139, 274
flavonoids, 181
flavor, 63
flood, 108
flooding, 108
flora, 180, 215, 227, 276
flow, xi, 11, 24, 110, 112, 114, 158, 175, 179, 186,
224, 225, 227, 248, 262, 263, 265, 266, 268, 272,
275, 276, 278
flow cytometry analysis, 114
fluctuations, xiii, 241, 271, 276, 280
focusing, 216, 264
food, 107, 131
foreign-born population, 73
forest ecosystem, 275
forest management, 275, 276
forests, 277, 281, 282
fouling, 107
founder effect, xii, xiii, 139, 248, 263, 265, 267, 271,
272, 275, 276
fractionation, 75
fragmentation, xiii, 206, 271, 272, 273, 276, 277,
280, 281, 282, 283
fragmented forests, 281
France, 98, 203, 229, 234, 237, 250
FRC, 252
freezing, 144
frequency distribution, 16, 161, 163
freshwater, 9
frog, 14
frost, 133, 144
fructose, 201
fruits, 51, 61, 63
fugitive, 273, 278
fungal, 131, 137, 142, 143, 200, 206, 207
fungi, 143, 196, 198, 205, 206, 209, 212
fungus, 212
G
Gabon, 250
gametes, 68, 274, 275
gametophyte, 273, 276
gasification, 107
gel, 89, 183
gelatin, 201
gels, 74, 183, 204
GenBank, vii, 1, 4, 5, 11, 204
gene combinations, 149
gene expression, 26, 27
gene pool, x, 5, 14, 22, 26, 29, 63, 111, 123, 137,
149, 150, 153, 154, 155, 156, 158, 161, 162, 163,
166, 168, 169, 170, 171, 172, 174, 175, 176
gene transfer, 132, 202
generalization, 29
generalizations, 3, 28
generation, xii, xiii, 135, 140, 237, 239, 242, 271,
272, 274, 275, 277, 278, 279
genes, vii, viii, ix, 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 14,
15, 17, 18, 19, 21, 22, 23, 26, 27, 29, 50, 70, 72,
73, 96, 119, 121, 122, 132, 133, 141, 143, 144,
149, 150, 153, 154, 155, 161, 162, 168, 171, 175,
176, 181, 188, 200, 202, 204, 207, 208, 209, 211,
212, 230, 235, 243, 265
genetic diversity, vii, viii, ix, x, xi, xiii, 21, 22, 26,
65, 69, 71, 82, 94, 97, 100, 102, 105, 108, 111,
112, 122, 125, 129, 133, 134, 136, 142, 144, 145,
153, 154, 155, 162, 163, 164, 166, 168, 169, 182,
186, 188, 189, 190, 193, 198, 213, 215, 216, 220,
221, 222, 223, 225, 226, 227, 242, 245, 266, 267,
268, 271, 274, 277, 278, 280, 282, 283
genetic drift, xiii, 188, 262, 271, 272, 275, 276, 277
genetic factors, 9, 169, 282
genetic information, 153, 242
genetic marker, 155, 156, 158, 161, 174, 230
genetics, xi, 2, 3, 11, 15, 20, 24, 27, 42, 44, 65, 71,
97, 99, 100, 102, 134, 141, 147, 150, 153, 154,
155, 157, 168, 182, 191, 204, 213, 225, 231, 244,
282
Geneva, 47, 104
genome, 9, 12, 14, 26, 27, 29, 42, 58, 71, 72, 74, 86,
89, 122, 131, 134, 154, 182, 189, 205, 244, 267
genomes, 9, 14, 38, 131, 134, 182, 234, 267, 280
genomic, 13, 27, 30, 122, 132, 134, 135, 136, 138,
140, 145, 146, 183, 189, 201, 204, 205, 209, 245
genomic regions, 134
genomics, 107, 121
genotype, viii, 13, 23, 28, 69, 72, 80, 82, 85, 87, 91,
99, 100, 101, 103, 139, 147, 245
genotypes, ix, 8, 12, 88, 100, 101, 102, 104, 105,
108, 109, 111, 112, 115, 116, 117, 118, 119, 120,
121, 133, 134, 135, 136, 138, 139, 140, 193, 203,
275
genre, 65
geography, 41, 42, 100, 153, 155, 161, 162, 215, 265
Georgia, 110, 125, 137, 142
Germany, 71, 142, 250, 282
germination, 206, 273, 281
Gibbs, 272, 280
glaciers, 226, 227
global warming, vi, vii, xi, 213, 214, 227
glucose, 196, 202, 208, 209
glutathione, 206, 211
glycerol, 199, 201
goals, 122, 150
gold, 72, 86
gold standard, 72, 86
goodness of fit, 184
government, iv
GPO, 50
grain, 193, 211
Gram-negative, 202
gram-negative bacteria, 208
grants, viii, 2
grapefruit, 57, 62, 66, 67
graph, 18, 75, 76
grass, ix, x, 105, 106, 115, 124, 129, 130, 131, 135,
140, 144, 146
grasses, x, 126, 129, 132, 137, 143, 144, 145, 146,
147, 208, 210
grasslands, 106
grazing, 107, 115, 130, 136, 137, 141, 142, 143, 145,
146, 147, 188, 189, 190
Greece, viii, 45, 69, 70, 77, 78, 79, 80, 250
greenhouse, 138
Greenland, 251
grids, 275
GRIN, 109, 110, 112, 113, 114, 117, 118, 131, 132
groups, vii, ix, x, xi, xii, 1, 2, 3, 4, 5, 8, 9, 10, 11, 12,
13, 15, 16, 17, 18, 19, 20, 22, 23, 24, 29, 36, 60,
61, 62, 64, 92, 105, 108, 111, 116, 129, 132, 136,
146, 150, 152, 155, 156, 162, 165, 168, 174, 179,
185, 186, 197, 204, 205, 230, 247, 248, 255, 256
growth, xi, 106, 108, 118, 120, 137, 143, 147, 188,
195, 196, 198, 199, 200, 201, 202, 205, 206, 207,
208, 210, 211, 245, 273, 276, 277, 281
growth hormone, 205
growth rate, 277
GSM, 235
guanine, 53
guerrilla, 273
Guinea, 250, 251
Gulf of Mexico, 41
H
H1, 28, 78
H2, 28
habitat, xiii, 21, 106, 215, 223, 271, 272, 273, 275,
276, 277, 280, 282, 283
habitation, 171
halos, 198, 200, 203
haploid, 60, 274, 277, 279, 281, 283
haplotype, xii, 12, 44, 216, 247, 276, 278
haplotypes, 8, 9, 12, 23, 83, 86, 269, 273, 278
Harvard, 37, 44, 266
harvest, 107, 115, 116, 119, 126
harvesting, 115, 215
Hawaii, 44, 251
health, xi, 70, 137, 195, 196
heat, 106, 107, 108, 122
height, 106, 116, 118, 136, 252
Helix, 280
hemagglutinin, 8
hemicellulose, 121, 196
hemoglobin, 12, 23, 44
herbs, 180, 216, 220
heritability, 252, 255
hermaphrodite, 281
heterochromatic, viii, 51, 53, 66
heterochromatin, 53, 55, 67
heterogeneity, 19, 22, 29, 94, 95, 133, 162, 174
heterogeneous, viii, x, 18, 69, 73, 77, 96, 124, 129,
133
heteroscedasticity, 236
heterosis, 108, 116, 123, 136, 145
heterozygosity, 6, 21, 23, 26, 27, 58, 63, 67, 122,
192, 221, 235, 236, 274
heterozygote, 280
heterozygotes, 111
high resolution, 95
highlands, 168
high-performance liquid chromatography, 143
histone, 8
HIV, 73, 245
HIV-1, 245
HLA, 156, 269
HLA-B, 156
Holland, 37, 45, 177
Holocene, 268
holoenzyme, 202
hominids, 10
homogeneity, 150, 158, 251
homologous chromosomes, 62
homology, 58, 132, 202
homozygosity, 59, 274
honey, xii, 230, 231, 233, 234, 236, 240, 241, 242,
243, 244, 245, 246
hormones, 205
Horticulture, 65
hospital, 98
host, 137, 141, 210
host population, 141
household, 95
HPLC, 201
HRS, 90
human, vii, xii, xiii, 8, 10, 11, 80, 82, 88, 95, 100,
102, 150, 152, 153, 154, 155, 169, 171, 189, 244,
245, 247, 248, 249, 253, 262, 263, 264, 265, 266,
267, 268, 269, 271, 273, 279, 280, 282
human genome, 154, 244
Human Genome Project, 154
humanity, 149
humans, xii, 11, 150, 158, 172, 245, 247, 248, 249,
254, 262, 265, 266, 269
humidity, 277
Hungary, 250
hunting, 166
hybrid, viii, ix, 12, 13, 51, 55, 58, 59, 62, 63, 64, 67,
105, 144, 234, 235, 241
hybridization, viii, 5, 12, 13, 15, 51, 54, 58, 60, 61,
63, 64, 67, 74, 75, 90, 109, 132, 145
hybrids, viii, 12, 13, 51, 52, 58, 60, 62, 63, 65, 67,
108, 110, 116, 125, 131, 133, 143, 144, 145, 147
hydrogen, 201, 207, 210
hydrogen cyanide, 201, 210
hydrolysis, 121, 201
hydrophobic, 6
hydrophobic properties, 6
hydroxides, 196
hydroxyapatite, 197, 205
hydroxyapatites, 196
hydroxyl, 197
hypothesis, xii, 28, 60, 63, 92, 188, 190, 247, 255,
262, 263, 275, 283
I
identification, 27, 41, 53, 55, 58, 99, 103, 133, 134,
141, 142, 150, 189, 203, 204, 210
identity, xi, 111, 179, 184, 186, 189
Illinois, 110
illumination, 204
images, 53
immigrants, 267
immigration, 245, 276
immobilization, 205
immunoglobulins, 8, 155
immunological, 155
implementation, 26, 73
in situ, 54, 61, 67, 81, 132, 145, 207, 221, 225
in situ hybridization, 54, 61, 67, 132, 145
in transition, 157
in vitro, 121, 200, 209, 210
in vivo, 210
inbreeding, 274, 275, 276, 277, 281
inbreeding coefficient, 274, 275
incidence, 70, 103
incompatibility, x, 108, 109, 111, 122, 129, 133,
142, 155
incubation, 198, 199, 200
incubation period, 199
independence, 47, 107
India, 51, 92, 101, 195, 203, 206, 214
Indian, 74, 192, 209, 248, 255
Indian Ocean, 248, 255
Indians, 9
indication, 58, 96, 224, 237, 240
indicators, 280, 282
indices, 15, 162, 274, 275, 276
indigenous, 141, 166
indigenous peoples, 166
Indochina, 171
indole, 199, 207
Indo-Pacific, 255
industrial, 211, 282
industrial wastes, 211
industry, 65, 66, 107, 125
inequality, 236
infection, 119, 124, 137, 142, 146, 245
infections, 103
inferences, xi, 42, 43, 229, 231, 234, 235, 241, 244,
276
infertile, x, 129
infinite, 7, 28, 244
inherited, 13
inhibition, 200
inhibitory, 208
injury, iv
inoculation, 197, 200, 203
inorganic, 120, 196, 197, 198, 205, 208, 212
inositol, 197, 201
insects, 19, 206
insertion, 71, 89, 141, 265
insertion sequence, 71
insight, 95, 97
inspection, 8
instability, 203
insulin, 8
integration, 154
integrity, 14, 174, 175
interaction, 12, 17, 18, 28, 96, 154, 158, 171, 174,
186, 189
interaction process, 186
interactions, 10, 139, 144, 198
interface, 8, 242
interphase, 53
inter-population, 186
interstitial, 54
interval, xii, 171, 230, 237, 238, 239, 243
intra-population, 188, 242
intrinsic, 241, 242
intron, 66, 181
inulin, 201
invertebrates, 12, 13, 19, 22
Iran, 141, 250
Iraq, 250
Ireland, 251
iron, 205, 209
Iron Age, 171
iron transport, 209
island, 190, 244, 266
isolation, xi, 9, 22, 24, 26, 41, 74, 82, 88, 95, 143,
158, 175, 191, 208, 213, 223, 245, 249, 252, 255,
256, 258, 263, 265, 268, 272, 277
isoniazid, 70, 73, 97, 98, 99, 100, 101, 102, 103
isozyme, 12, 134, 146, 193, 215, 224, 226, 275
isozymes, 52
Israel, 57, 250, 262
Italian population, 234, 235, 237, 243
Italy, 133, 234, 237, 243, 245, 250, 280
J
JAMA, 98
Japan, 12, 13, 42, 45, 47, 65, 68, 72, 85, 87, 100,
247, 250, 255, 264, 265, 280
Japanese, 9, 47, 48, 227
Java, 251
joining, 204, 211
judgment, 27
Jung, 42, 120, 123, 124
K
K-12, 208
karyotype, 52, 53, 54, 55, 57, 58, 59, 60, 61, 62, 63,
64, 66, 68, 215, 228
karyotypes, 52, 56, 59, 60, 61, 63
Kazakhstan, 91, 99
Kentucky, 110
Kenya, 250, 254, 255, 256
killing, 10, 117
kinase, 156
kinetics, 196
King, 21, 23, 43, 132, 141, 144, 145, 203, 209, 268
Kolmogorov, 17
Korea, 42, 91, 101
Korean, 12, 42
Kyrgyzstan, 214
L
labeling, 146
labor, 8, 27
lactose, 201
lambda, 252
land, xiii, 158, 174, 223, 271, 280
land use, xiii, 271
landscapes, xii, 158, 161, 162, 166, 168, 171, 214,
271, 272
language, 152, 153, 165
language diversity, 153
Laos, 251
large-scale, 72, 92, 215, 282
Latvia, 91
leaching, 122
legume, 145
legumes, 206
Lepidoptera, 8, 15, 19, 33, 34, 35, 36
leukocyte, 156
Levant, 248, 255, 262, 263, 267
Liberia, 250
life cycle, 278, 279
life span, 273, 278, 279
lifestyles, 173, 174
lifetime, 176
lignin, 107, 116, 121, 122
lignocellulose, 121
likelihood, 231, 244, 246, 276
limitation, 21, 28, 121, 122
limitations, 28, 233
linear, xii, 18, 221, 233, 247, 256, 263
linear regression, xii, 18, 233, 247, 263
linguistic, x, 149, 152, 153, 154, 165
linkage, 141, 273, 276, 279
links, viii, 69, 82, 166, 174
liquid chromatography, 133, 143
livestock, x, 106, 115, 120, 121, 129, 137, 189, 215
LMW, 210, 211
localization, 56, 156, 188
location, 111, 117, 118, 155, 162, 174, 232, 238
locus, ix, 13, 21, 70, 71, 73, 74, 75, 80, 82, 83, 85,
86, 87, 88, 89, 93, 94, 95, 96, 103, 158, 163, 164,
236, 274
London, 41, 47, 48, 49, 98, 177, 226, 245, 264, 266,
267, 280, 283
long distance, 186
long period, 14, 241
longevity, 188, 220
losses, 120
Louisiana, 190
low molecular weight, 204, 207
low-level, 92
lysis, 206
M
M.O., 141, 147
macromolecules, 5
Madison, 126, 142, 146
magnesium, 120
magnetic, iv
maintenance, xi, 71, 107, 180, 183, 215, 276
maize, 144, 145
males, 12, 13
maltose, 201
mammal, 9
mammals, 9, 42
management, xiii, 96, 107, 115, 119, 183, 225, 271,
275, 276, 278, 279, 281
manipulation, 122
mannitol, 201
MANOVA, 4, 5, 18
manure, 120, 126
Maori, 251
mapping, 58, 67, 86, 134, 135, 141, 144, 147, 153,
154, 157
marker genes, 14, 21, 22
Markov, 242, 244, 245
Markov chain, 244, 245
Maryland, 110, 132, 146
Massachusetts, 102
maternal, 3, 9, 13, 137
maternal inheritance, 9
matrix, xi, 4, 75, 168, 179, 184, 186, 191, 204, 248,
249, 251, 252, 258, 260, 261, 277
Maya, 146
MDB, 252
MDH, 252
MDR, ix, 70, 72, 82, 89, 90, 91, 92, 94, 95, 96
measurement, 144
measures, vii, 1, 4, 6, 7, 8, 11, 15, 19, 21, 164
media, 197, 201, 211
medical plant, 214, 215, 216, 223
Mediterranean, x, 48, 57, 60, 61, 66, 74, 129, 131,
166, 180, 282
Mediterranean countries, 131
meiosis, 273
Melanesia, 251, 256, 259, 260, 261
melon, 146
membranes, 75
metabolism, 73
metabolite, 211
metabolites, 200, 205, 206
metals, 278
metaphase, 53, 61
methylation, 134
Mexican, 57
Mexico, 110, 131, 181, 251
Mg2+, 197
MgSO4, 201, 203
mice, 14, 48, 50
microbes, xi, 102, 195, 196, 197, 203, 205
microbial, 100, 131, 198, 203, 205, 206, 209, 210,
211
Microbial, 196, 197, 198
microclimate, 278
Micronesia, 251, 256, 259, 260, 261
microorganism, 197, 205
microorganisms, 196, 207, 208, 209, 210
microsatellites, 8, 134, 146, 182, 189, 231, 233, 244,
245, 264, 278
microscopy, 112
Middle Ages, 171, 178
Middle East, 228, 262
migration, xii, 9, 11, 44, 95, 137, 153, 155, 162, 169,
171, 174, 224, 245, 247, 248, 249, 253, 255, 263,
268, 281
milk, 199
mineralization, 119, 196, 197, 205
minerals, 197
mining, 97, 144
minisatellites, 244
Ministry of Education, 225, 264
mirror, 100
missions, 120
Mississippi, 38, 106, 110, 117, 118
Missouri, 110, 191, 281, 283
mitochondrial, 3, 5, 8, 9, 11, 12, 14, 23, 38, 40, 41,
42, 47, 48, 50, 155, 216, 234, 244, 245, 264, 266,
267
mitochondrial DNA, 5, 12, 23, 41, 47, 155, 216, 244,
245, 264, 266
mitotic, 58
mixing, 11
modalities, 231
model system, xii, 271, 272
modeling, 95
models, 7, 24, 36, 37, 50, 157, 158, 223, 231, 232,
234, 241, 244, 265, 274, 281
moisture, 106, 116
molecular biology, 67
molecular markers, x, 14, 23, 24, 26, 52, 65, 67, 73,
133, 134, 140, 141, 149, 181, 184, 225, 231, 275,
276
molecular weight, 74, 75, 84, 89, 183, 204, 207
molecules, 204
monkeys, 8
Monte Carlo, 242, 244, 245
Monte Carlo method, 245
Moon, 66
Morocco, 142, 192, 250
morphological, 14, 23, 49, 52, 63, 116, 117, 122,
123, 125, 130, 180, 181, 182, 189, 190, 234, 248,
249, 263, 264, 268, 282
morphology, 52, 54, 58, 124, 130, 136, 142, 180,
181
mortality, 275, 277
mortality rate, 277
Moscow, 37, 38, 42, 43, 44, 48, 177, 178
mountains, 131, 175, 221, 224, 227
mouse, 10, 12, 44
movement, 80, 82, 126
moving window, 162
Mozambique, 250
MPI, 13
mtDNA, xii, 5, 8, 9, 10, 11, 12, 13, 14, 23, 30, 39,
40, 41, 43, 45, 46, 155, 158, 160, 161, 164, 234,
243, 247, 248, 262, 263, 266, 268
multidrug resistance, 92
multivariate, 133, 147, 266
mutant, 90, 143, 147, 235, 236
mutation, ix, 9, 59, 70, 75, 90, 91, 92, 95, 96, 98,
103, 143, 155, 169, 230, 231, 233, 234, 235, 236,
237, 239, 241, 244, 267, 272, 276, 278, 280, 283
mutation rate, 95, 98, 231, 235, 236, 237, 239, 272
mutations, ix, 10, 11, 24, 52, 62, 70, 72, 73, 75, 90,
91, 92, 96, 98, 99, 100, 101, 102, 103, 149, 153,
244, 276, 280
Myanmar, 251
mycelium, 206, 212
mycobacterial infection, 103
Mycobacterium, v, viii, 69, 70, 96, 97, 98, 99, 100,
101, 102, 103, 104
myoglobin, 10
N
NA, 13, 54, 66, 112, 124, 210, 267, 269
NaCl, 201
Nash, 106
National Academy of Sciences, 227, 282
National Science Foundation, 243
native population, 109
native species, 180
NATO, 177
natural, ix, xii, 5, 9, 10, 12, 13, 14, 21, 23, 52, 105,
106, 108, 135, 137, 140, 143, 144, 147, 150, 153,
169, 171, 173, 175, 180, 210, 216, 225, 233, 236,
237, 239, 242, 271, 272, 283
natural environment, xiii, 180, 210, 272
natural selection, ix, 9, 10, 12, 14, 23, 105, 136, 140
Near East, 166
Nebraska, 108, 109, 110, 119
neglect, 162, 230
Nepal, 214, 250
nervousness, 137
nesting, 189
network, xiii, 272, 276
New Jersey, 110, 177
New Mexico, 110, 251
New South Wales, 251, 264
New World, xii, 247, 249, 255
New York, iii, iv, 37, 39, 41, 43, 44, 45, 46, 47, 48,
49, 101, 110, 144, 177, 190, 192, 193, 245, 264,
266
New Zealand, 137, 251
Ni, 180
Nielsen, 23, 45, 125, 231, 245
Nigeria, 250
Nile, 255, 262
nitrogen, 106, 115, 119, 124, 137, 143, 198
nitrogen fixation, 198
Nixon, 39
nodes, 108, 118
nonparametric, 4, 18
non-renewable, 107
nontoxic, 137, 142
non-uniform, 7, 165, 166
non-uniformity, 166
normal, 14, 15, 23
normal conditions, 14
normal distribution, 15, 23
normalization, 27
North Africa, 130, 131, 250, 255, 256, 259, 260,
261, 263
North America, ix, 40, 46, 50, 105, 106, 158, 251,
256, 259, 260, 261
North Atlantic, 49
North Carolina, 110
Northeast, xii, 217, 247, 249, 250, 251, 255, 256,
260, 261, 263
Northeast Asia, xii, 247, 249, 250, 255, 256, 263
Norway, 250, 280
Norway spruce, 280
nuclear, 8, 9, 10, 12, 13, 14, 21, 23, 40, 48, 66, 111,
112, 124, 227, 234
nuclear genome, 9, 12, 234
nuclei, 53, 66
nucleolus, 53
nucleotide sequence, vii, 1, 4, 6, 8, 11, 12, 15, 19,
23, 50, 208, 209, 212
nucleotides, 5, 6, 7, 9, 108, 111, 182
null hypothesis, 28
numerical analysis, 102
nutrient, 115, 120, 122, 200, 205
nutrients, 115, 120, 205, 208
nutrition, 143
O
oat, 143
obligate, 9
observations, 9, 23, 28, 81, 233, 279
Oceania, 249
oil, 122, 196, 205
Oklahoma, 110, 117, 118, 119, 122, 125, 126, 138,
140
online, 77, 103, 131
operon, 73, 103
optical, 203
optical density, 203
Oregon, 110, 132
organelle, 29
organic, 115, 120, 196, 197, 198, 202, 203, 205, 206,
208
organic compounds, 197
organic matter, 115, 196, 197
organism, 29, 118
Ottoman Empire, 80
ovary, 130, 180
overexploitation, 223
overgrazing, 215
oxalic, 197
oxidation, 202
oxides, 196
ozone, 272, 280
ozone hole, 272, 280
P
P. falciparum, 45
PAA, 28
Pacific, xii, 14, 44, 46, 247, 255
pairing, 58, 68, 144
Pakistan, 214, 250
Palestine, 57, 250
Paraguay, 181
parameter, 7, 10, 15, 36, 162, 204, 221, 231, 232,
233, 236, 237, 238, 240, 241, 242, 243
parameter estimation, 231, 233, 236, 242
parents, 58
Paris, 98, 264
pasture, 121, 130, 136, 140, 208
pastures, 107, 131, 135, 140
pathogenic, 206
pathogens, 197, 200, 207, 209
pathways, 158, 173, 248
patient care, 100
patients, 70, 72, 73, 91, 95, 96, 99
PCA, 28, 200, 206
PCR, x, 41, 47, 71, 74, 75, 89, 93, 101, 138, 146,
179, 182, 183, 184, 192, 193, 201, 204, 205, 209
peat, 277, 283
pectin, 200
pedigree, 141
Pennsylvania, 44
peptides, 10
perception, 100, 155, 157
permanent resident, 73
permeability, 158
permit, 8, 276
personal communication, 58, 108, 131, 139
perturbations, 188
Peru, 251
pesticide, 205, 206
pesticides, 196, 205
pests, 122
Petri dish, 199
petroleum, 107
pH, 106, 183, 197, 198, 201, 202, 203, 209
phalanx, 273
phenazine, 200, 206, 209
phenotype, 5, 28, 30, 82, 158
phenotypes, 28, 141
phenotypic, ix, xii, 14, 23, 91, 105, 111, 116, 133,
198, 201, 247, 248, 249, 256, 262, 263, 267
phenotypic variances, 262
Philippines, 251
phosphatases, 197, 202, 211
phosphate, vi, xi, 120, 156, 195, 196, 197, 198, 199,
200, 201, 202, 203, 205, 206, 207, 208, 209, 210,
211, 212
phosphates, xi, 195, 196, 197, 198, 205, 208, 212
phosphorus, 196, 197, 205, 206, 209, 210
photographs, 64
photoperiod, 108, 117
photoperiodism, ix, 105
photosynthesis, ix, 105, 106, 108, 143
phylogenetic, 3, 8, 11, 21, 28, 38, 39, 40, 42, 43, 52,
74, 89, 97, 111, 145, 181, 191, 193, 198, 204,
211, 212, 215
phylogenetic tree, 3, 11, 28, 204, 211
phylogeny, ix, 42, 43, 45, 49, 67, 70, 102, 181
physiological, 23, 125, 155, 162
physiology, 65, 137, 145, 211
phytopathogens, 196, 198, 205, 206
pica, 43
Pinus halepensis, 282
pioneer species, 278
planning, 104
plant growth promoting rhizobacteria, 207, 210
plants, viii, xi, xiii, 51, 52, 109, 112, 113, 121, 125,
132, 135, 137, 138, 139, 141, 144, 145, 153, 180,
188, 189, 190, 191, 196, 197, 203, 205, 206, 210,
212, 213, 214, 215, 216, 217, 220, 221, 222, 223,
224, 225, 227, 272, 273, 274, 275, 277, 280, 281,
283
plastid, 66, 111, 124, 137, 191, 192, 193
play, x, xi, 9, 10, 14, 129, 180, 195, 196, 197, 202,
203, 205, 206
Pleistocene, 46, 192, 249, 263, 268
ploidy, ix, 105, 109, 111, 115, 182
point mutation, 276
point of origin, 254
poisons, 149
Poland, 190, 193, 250
pollen, 181, 224
pollination, xi, 108, 180, 186
pollinators, 186, 189
pollutants, xi, 195, 196, 197
pollution, 196, 272
polygenes, 26
polymer, 196
polymerase, 72, 138, 182, 194, 209
polymerase chain reaction, 138, 182, 194, 209
polymerase chain reactions, 138
polymorphism, ix, x, 6, 8, 9, 11, 23, 65, 66, 71, 97,
105, 108, 111, 125, 126, 134, 146, 147, 149, 150,
154, 155, 156, 157, 164, 179, 181, 182, 190, 192,
201, 283
polymorphisms, 102, 108, 124, 125, 133, 134, 155,
193
Polynesia, 251, 256, 259, 260, 261
polyploid, 68
polyploidization, 111
polyploidy, 111, 122, 130
polysaccharide, 121
polysaccharides, 121
pools, x, 22, 26, 108, 129, 150, 153, 174
poor, 96, 132, 146, 185, 196, 216
population density, 156, 166
population group, xii, 162, 169, 237, 240, 247, 248
population size, xi, 9, 88, 188, 191, 220, 223, 229,
230, 231, 233, 235, 236, 241, 242, 244, 246, 248,
255, 268, 275, 276, 280, 281, 282
portability, 71
Portugal, 250, 282
positive correlation, 29
potassium, 203
potato, 200, 211
power, 71, 75, 82, 86, 94, 99, 107, 136, 241
precipitation, 214
prediction, 24, 102, 155, 191, 225
press, 34, 226, 245
pressure, xiii, 137, 271, 273, 279
Pretoria, 45
preventive, 209
primary data, 157
Primates, 49
probability, 19, 22, 75, 76, 166, 242, 282
probe, 54, 59, 61, 74, 90, 121
producers, 206
production, ix, 105, 106, 107, 115, 118, 120, 121,
123, 124, 125, 126, 130, 131, 132, 147, 197, 198,
199, 200, 201, 202, 205, 206, 207, 209, 211, 273
productivity, 115, 120, 146, 208
progenitors, 133
progeny, 12, 13, 116, 275
program, 7, 75, 91, 97, 106, 108, 122, 133, 192, 204,
233, 241, 243, 245
promoter, 74, 92
promoter region, 74, 92
propagation, xi, 108, 180
property, iv, 29
proportionality, 235
proteases, 206
protected area, 273, 279
protected areas, 273, 279
protection, 141, 196
protein, 5, 10, 11, 21, 22, 23, 24, 29, 37, 66, 92, 120,
133, 141, 154, 181, 203
protein function, 10
proteins, 6, 9, 39, 43, 45, 49, 155, 156
protocols, 67
prototype, 89
Pseudomonas, 197, 198, 203, 205, 206, 207, 208,
209, 210, 211, 212
Pseudomonas aeruginosa, 209, 211
Pseudomonas spp, 209, 210
public, viii, 69, 70
public health, viii, 69, 70
pure line, 234
purines, 5
P-value, 139
Q
Quebec, 40
Quercus, 282
Quercus ilex, 282
quinacrine, 55
quinone, 202, 208, 209
quorum, 209
R
race, 152, 171
radiation, xiii, 41, 44, 47, 215, 271, 277, 278
radionuclides, 272
rainforest, 283
Raman, vi, 195, 210
random, 4, 18, 62, 133, 134, 140, 146, 153, 154, 169,
193, 277
random amplified polymorphic DNA, 134, 146, 193
random mating, 133
randomly amplified polymorphic DNA, 182
range, ix, x, 7, 14, 22, 82, 85, 105, 107, 129, 158,
161, 176, 193, 198, 202, 214, 217, 218, 223, 236,
248, 263, 273, 274
RAPD, 64, 65, 66, 111, 124, 134, 136, 138, 141,
143, 145, 146, 182, 184, 186, 191, 192, 193, 205,
216, 217, 218, 225, 226, 227, 278
RAS, viii, 2
reagent, 75, 199
reality, 175
recognition, 21
recombination, 12, 13, 14, 40, 132, 147, 273
reconstruction, 102
recovery, 121
refining, 265
reflection, 116
regional, xii, 96, 247, 248, 250, 251, 253, 254, 255,
256, 257, 258, 262, 263, 267, 268
regression, viii, xii, 2, 18, 29, 233, 236, 237, 241,
247, 258, 263
regression analysis, viii, xii, 2, 29, 247, 263
regrowth, 115, 117
regular, 8, 241
regulation, 122, 202, 208, 245
rejection, 241
relationship, viii, xii, 12, 51, 66, 111, 140, 146, 158,
184, 189, 191, 215, 236, 247, 256, 258, 267
relationships, ix, x, 11, 14, 15, 40, 52, 64, 65, 76, 94,
106, 108, 129, 144, 145, 153, 182, 184, 189, 190,
193, 249, 255, 256, 267, 268, 278
relatives, 64, 67, 193, 214, 215, 216, 217, 274
relevance, 249
renewable energy, 107
representativeness, 82, 91, 94
reproduction, xiii, 2, 9, 52, 65, 67, 153, 188, 189,
190, 271, 272, 273, 274, 275, 278, 280
reptiles, 19, 22
reservoir, 9
residues, 196
resistance, ix, x, 70, 72, 73, 90, 91, 92, 93, 94, 95,
96, 97, 98, 99, 100, 101, 102, 103, 119, 129, 132,
133, 137, 141, 143, 144, 145, 223, 227
resolution, 85, 94, 95, 207, 264
resource allocation, 120
resources, 88, 135, 191, 215
respiratory, 97, 210
restriction enzyme, 111
restriction fragment length polymorphism, 71, 111,
134
returns, 115
Reynolds, 120, 124, 126
RFLP, viii, 14, 41, 65, 69, 71, 73, 74, 75, 80, 81, 82,
83, 85, 86, 88, 94, 96, 111, 134, 135, 142, 145,
182, 184, 193
rhizobia, 198
Rhizobium, 197, 206, 208
rhizosphere, xi, 195, 197, 203, 207, 208, 210
ribose, 201
ribosomal, 212
rice, 107, 122, 135, 144, 147, 207, 211
rigidity, 27
risk, xiii, 75, 180, 215, 221, 272, 277
rivers, 181
RNA, 47, 72, 204, 207, 210, 211, 212
rodents, 9
Romania, 70, 78, 79, 80
Rome, 147
room temperature, 199
Royal Society, 49, 283
Rumania, 250
Russia, v, viii, x, 1, 12, 42, 43, 69, 70, 72, 78, 79, 80,
82, 91, 92, 97, 98, 101, 149, 150, 151, 152, 154,
155, 156, 157, 158, 159, 160, 161, 162, 163, 164,
166, 168, 169, 170, 171, 172, 173, 174, 175, 176,
177, 178, 250
Russian, viii, x, 1, 2, 42, 82, 97, 104, 149, 150, 154,
155, 156, 158, 161, 163, 164, 168, 169, 171, 174,
175, 177, 178
Russian Academy of Sciences, viii, 1, 2, 177
rust, 108, 119, 123, 141
Rwanda, 250
rye, 144, 209
S
salinity, 106, 205
salt, 199
salts, 199
Samoa, 251
sample, 6, 17, 18, 36, 76, 80, 82, 88, 91, 94, 95, 138,
139, 154, 155, 184, 185, 188, 189, 214, 232, 234,
236, 245, 276
sampling, 13, 136, 147, 155, 184, 186, 216, 282
satellite, 53, 54, 66
saturation, viii, 2, 18
Saudi Arabia, 97
scaling, 27
scarcity, 181
Schmid, 191
scores, vii, 1, 4, 7, 17, 18, 19, 20, 28, 29
SCS, 111
sea level, 262
search, 10, 101, 204
Seattle, 98
second generation, 107
seed, 106, 108, 109, 115, 117, 118, 123, 133, 136,
137, 138, 140, 141, 186, 188, 274, 275, 280, 281
seedlings, 61, 138, 207
seeds, 108, 109, 117, 139, 224
segregation, 12, 62
selecting, 7, 71, 141
selectivity, 23
Self, 111, 142, 266, 280
self-awareness, 150
self-fertilization, 189, 274, 275, 281
semiarid, 123
Senegal, 250
senescence, 115, 133, 143, 147
sensing, 209
sensitivity, 71, 91
separation, 22, 29, 186, 204
sequencing, 210, 216
series, vii, 1, 230, 255, 258
serum, 155, 156
services, iv
set theory, 28, 29
settlers, 131
severity, 119
sex, 274, 280
sex ratio, 280
sexual reproduction, xiii, 188, 271, 272, 273, 274,
275, 278, 280
Shanghai, 213, 225
shape, xiii, 17, 136, 158, 169, 174, 175, 271
shaping, 224, 272, 274
shares, 176
sharing, xii, 61, 81, 229, 230, 231
short period, 27
shortage, 4
Siberia, 152, 156, 158, 161, 164, 166, 168, 169, 171,
173, 174, 177, 249
sibling, vii, 1, 18, 22, 23, 26, 27, 30, 34
Sierra Leone, 250
sign, viii, 2, 11, 36
signals, 56, 59
silica, 183
similarity, xii, 15, 21, 22, 23, 26, 60, 80, 81, 112,
113, 135, 138, 139, 140, 145, 162, 164, 184, 186,
201, 243, 247, 256, 258
simulation, 7, 15, 233, 236, 241
simulations, 233, 237, 241, 242
Singapore, 102
SIS, 252
sites, x, 9, 11, 13, 55, 56, 59, 60, 67, 109, 112, 116,
179, 184, 185, 186, 188, 189, 232, 276, 278, 280
skin, 268
Smithsonian, 264
Smithsonian Institution, 264
social environment, 169
social factors, 150
social regulation, 245
sodium, 199
software, vii, 1, 2, 4, 7, 48, 74, 75, 191, 192, 204,
209
soil, x, xi, 106, 115, 116, 119, 120, 122, 124, 129,
137, 195, 196, 197, 202, 203, 205, 206, 207, 208,
209, 210, 273, 278
soil erosion, x, 106, 129
soil pollution, 196
soils, 106, 108, 120, 181, 196, 198, 206, 209, 210
solar, 277
solubility, 196, 205
Somalia, 250
somatic mutations, 52
sorbitol, 201
Sorghum, 145, 192
sorting, 11
South Africa, 45, 46, 100, 250
South America, 131, 181, 192, 251, 256, 259
South Asia, 250, 255, 256, 259, 260, 261, 262, 265
South Carolina, 49
South Dakota, 110, 116, 124
Southeast Asia, xii, 12, 247, 249, 251, 255, 256, 259,
260, 261, 262, 263, 264, 265, 266
Southern blot, 74
Southern Hemisphere, 180
soybean, 145, 207, 208
spacers, 74, 89
Spain, 146, 250
spatial, xi, 3, 11, 111, 147, 157, 162, 169, 186, 190,
213, 223, 224, 226, 230, 249, 277
speciation, viii, 2, 3, 5, 11, 22, 23, 24, 25, 26, 27, 29,
30, 41, 46, 50
species richness, 275
specificity, 71, 80, 81, 91, 92, 176, 277
spectrum, 211
spore, 200, 206, 273, 275
sporophyte, 275
sports, 62
St. Petersburg, 69, 75, 177, 178
stability, viii, xii, 2, 86, 125, 175, 230, 276
stable states, 274
stages, 205, 215, 281
stamens, 180
standard deviation, 10, 17
standard error, 6, 7, 36, 255
standard model, 241
standardization, 102
state borders, 175
statistical analysis, 29, 75, 221
statistics, 4, 21, 168, 231, 233, 236, 241, 242, 246,
283
sterile, 130, 136
stochastic, 23, 150, 188
stochastic processes, 23
stock, 173
storage, 120, 127
strain, 71, 74, 81, 82, 86, 89, 92, 98, 99, 137, 199,
204, 208, 210, 211
strains, viii, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 85, 86, 87, 88, 89, 90, 91, 92, 93,
94, 95, 96, 99, 100, 101, 103, 111, 116, 119, 122,
137, 197, 198, 199, 200, 201, 202, 203, 204, 205,
206, 209, 211
strategies, 134, 141, 206, 216, 272, 273, 279, 280,
281
streams, 120, 149
strength, 176
stress, 132, 133, 141, 146, 188, 205
structural gene, 10, 11, 23, 26, 27, 150
structural protein, 21
students, 64, 249
subgroups, 63
sub-Saharan Africa, xii, 192, 247, 248, 249, 250,
255, 256, 262, 263
substances, 149, 155
substitution, vii, 1, 6, 7, 9, 10, 18, 19, 92, 100
substrates, 197
sucrose, 201
sugar, 121, 123
sugars, 107, 121
sulphate, 199
Sumatra, 251
summer, 115, 116, 137
Sun, 86, 99, 100, 102, 134, 146, 182, 193, 215, 265
superimpose, 242
supernatant, 199
supply, 126, 147
suppression, xi, 195, 198, 206, 209, 211
surface water, 120
surprise, 92
surveillance, 104
survival, vii, 108, 109, 116, 121, 122, 126, 137, 215,
226, 274
survival rate, 108
surviving, xii, 271, 272, 276
susceptibility, 82, 92
suspensions, 198
sustainability, 279
Sweden, 250, 277
switching, 242
Switzerland, 104, 250
symbiont, 137
symbiosis, 196, 211
symptoms, 137
synchronous, 164
synergistic, 198
synthesis, 202
Syria, 250
systematics, 192
T
taiga, 173
Taiwan, 12, 91, 99
Tajikistan, 214
tandem repeats, 71
Tanzania, 250
taste, 155, 157
taxa, vii, viii, 1, 2, 3, 4, 6, 7, 9, 12, 14, 18, 19, 21,
22, 24, 26, 27, 29, 30, 35, 49, 277
taxonomic, 2, 3, 5, 6, 19, 24, 26, 28, 42, 43, 50, 52,
64, 147, 181, 189, 204
taxonomy, viii, 42, 43, 51, 67, 130, 180, 181, 190,
245
technicians, 74
TEM, 109, 126
temperature, 137
temporal, 23, 230, 244, 246, 249
territorial, 153, 174
territory, xiii, 150, 153, 156, 158, 161, 164, 166,
168, 171, 173, 174, 175, 176, 230, 271
test data, 241
Texas, 98, 110, 116, 119, 120, 126, 191
Thailand, 251
therapy, 72
Thomson, 248, 262, 268
threat, 272, 276
threatened, xiii, 191, 215, 272, 279
thresholds, 283
thymine, 53
Tibet, 213, 214, 215, 217, 218, 223, 224, 225, 226,
227, 228, 250
time consuming, 241
time periods, 277
timing, xii, 106, 115, 119, 247, 249, 262, 273, 278,
279
tissue, 118, 138, 155, 191
titration, 92
tobacco, 210
Tokyo, 42, 45, 264, 280
tolerance, ix, 105, 106, 108, 122, 131, 132, 133, 137,
144, 205
Tonga, 251
toxic, 72, 137
toxicity, 141, 205
trace elements, 272
tracers, 279, 282
tracking, 266
trade, 110
trading, 243
traffic, 189, 278
traits, 14, 28, 52, 109, 116, 118, 122, 125, 132, 133,
142, 152, 182, 196, 198, 201, 206, 210, 211, 225,
226, 252, 255, 264, 265, 279
trans, 135, 237, 242
transfer, 12, 29, 132, 133, 135, 202
transformation, xiii, 272, 276
transformations, 158
Transgenic, 121
transgression, 169
transition, 157, 175
translocation, 120
transmission, 72, 82, 95, 245, 274
transport, 10, 80, 209, 273
travel, 186
tree-based, 28
trees, xii, 11, 48, 204, 223, 271, 272, 278, 282
tribal, 249
tribes, 171, 174, 180
triploid, 144, 147
tropical forest, 171
tropical forests, 171
trout, 26
tryptophan, 199, 201
tuberculosis, v, viii, 45, 69, 70, 71, 72, 73, 74, 76,
77, 78, 82, 83, 85, 86, 87, 88, 89, 90, 91, 92, 93,
94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104,
245
tundra, 171, 279
Tunisia, 142
Turkey, viii, 69, 70, 77, 78, 79, 80, 81, 104, 110,
112, 113, 118, 250
turnover, 277
two-way, 17, 18
U
UFG, 190
Uganda, 250
Ukraine, 70, 101
ultraviolet (UV), 183, 204, 278
ultraviolet light, 183
uncertainty, 234, 235
unification, 27
uniform, 7, 133, 165
United Kingdom, 14, 250, 264
United States, 13, 14, 107, 122, 125, 130, 131, 137
Urals, 152, 158, 164, 169, 174
urbanization, 88, 278
Uruguay, x, 179, 181
USDA, 107, 109, 111, 112, 122, 131, 132, 144
USSR, 154, 177, 178
V
Valencia, 138
values, vii, xi, xii, 1, 4, 6, 7, 8, 9, 17, 18, 19, 50, 70,
75, 88, 120, 139, 140, 158, 162, 168, 169, 171,
174, 175, 179, 186, 187, 191, 204, 220, 221, 230,
231, 232, 236, 237, 240, 241, 242, 243, 255, 262,
275, 276, 278
Vanuatu, 251
variability, vii, ix, 3, 4, 5, 9, 17, 19, 21, 22, 23, 26,
27, 62, 66, 68, 105, 111, 112, 116, 117, 124, 134,
144, 162, 163, 164, 170, 172, 181, 182, 183, 184,
186, 188, 189, 192, 201, 203, 215, 221, 242, 272,
275, 276, 277, 281, 283
variables, 75, 282
variance, 4, 7, 17, 21, 136, 138, 141, 158, 169, 225,
236, 237, 256, 262
vector, 28
vegetation, xii, 180, 221, 224, 227, 271, 272, 278,
279
vegetative reproduction, 188, 189
Venezuela, 9
vertebrates, vii, 1, 12, 19, 21
Victoria, 46, 251
Vietnam, 92, 98, 251
village, 150
virulence, 98
visible, 198
vision, 126
vitamins, 205
W
Wales, 97, 251, 264
warrants, 50
water, 75, 106, 120, 124, 147, 158, 183, 203
weakness, 28
Weibull, 190
weight gain, 137
Weinberg, 274
West Africa, 250
Western Cape Province, 100
Western Europe, 131, 234
wheat, 107, 135, 144, 147, 191, 209
white-fish, 27
wild animals, x, 129
wild type, 90, 92, 94
wildlife, 106
wildlife conservation, 106
wind, 106
windows, 192
winter, 115, 116, 117, 120, 132, 133, 144
Wisconsin, 116, 146
wood, 277
World Health Organization (WHO), 70, 73, 104
World War, 155
World War I, 155
writing, 36
Y
Y chromosome, 245, 262, 265, 266, 268, 269
yeast, 200, 203
yield, xi, 11, 106, 107, 109, 115, 116, 117, 118, 120,
121, 122, 123, 124, 132, 136, 195, 196, 205, 207,
256
yield loss, 120
Yugoslavia, 250
Z
Zambezi, 43
Zea mays, 107
