
KYUSHU UNIVERSITY

Genetic variation and phylogeography of tree species from


North and Southeast Asia

A dissertation submitted in partial satisfaction of the requirements for


the Degree of Doctor of Philosophy in Biology

by

Neyton Hideki Tadeu Araki

Fukuoka – JAPAN
2008
CONTENTS

Chapter 1

Phylogeography of Larix sukaczewii Dyl. and L. sibirica L. inferred from nucleotide

variation of nuclear genes

Abstract

Introduction

Materials and methods

Results

Discussion

References

Tables 1-4

Figures 1-5

Chapter 2

Genetic structure of Dipterocarpus alatus Roxb. populations from Thailand revealed by

nuclear microsatellites

Abstract

Introduction

Materials and methods

Results

Discussion

References

Tables 1-7

Figures 1-2

Acknowledgements

CHAPTER 1

Phylogeography of Larix sukaczewii Dyl. and L. sibirica L. inferred from

nucleotide variation of nuclear genes

ABSTRACT

The phylogeography of Larix sukaczewii and L. sibirica was investigated using nucleotide

variation at the four following nuclear gene regions: 5.8S rDNA (including two internal transcribed

spacers (ITS)), glyoxysomal malate dehydrogenase (gMDH), cinnamyl alcohol dehydrogenase

(CAD) and phytochrome-O (PHYO). Sequences of the 4-coumarate: coenzyme A ligase (4CL) gene

region obtained in a recent study were also included. CAD and PHYO showed very low nucleotide

variation, but ITS, 4CL and gMDH had levels of variation similar to those reported for other

conifers. Neutrality tests showed significant deviations at the gMDH region. Namely, positive

values of Tajima’s D, Fu and Li’s D and F (with and without outgroup) were observed in all but one

population of L. sukaczewii, but negative values were observed in all populations of L. sibirica.

Pleistocene refugia have been hypothesized to exist in the southern Urals and south-central Siberia,

where four out of nine of the investigated populations occur. Moderate to high levels of population

differentiation were found in some pair-wise comparisons, suggesting limited gene flow and

independent evolution of some refugial populations. In L. sukaczewii, low levels of differentiation

were found among populations from areas glaciated during the Pleistocene, indicating their recent

origin. Results of this study also suggest these populations were created by migrants from multiple,

genetically distinct refugia. Furthermore, some haplotypes observed in populations from previously

glaciated areas were not found in putative refugial ones, suggesting these populations might have

contributed little to the extant populations created after the Last Glacial Maximum (LGM). Some

authors regard L. sukaczewii and L. sibirica as a single species, while others consider them as

separate species. The observed conspicuous differences in haplotype composition and distribution

between L. sukaczewii and L. sibirica, together with high values of FST between populations of the

two species appear to support the latter classification.

INTRODUCTION

The history of postglacial dispersal of many plant species has been clarified by

phylogeographic studies. However, there is still little knowledge on the phylogeography of the

genus Larix Mill. The biogeographic history of Larix and other plants in Eurasia has been shaped

by Pleistocene glaciations (Hewitt 2000). During the Last Glacial Maximum (LGM) most of

northwestern Eurasia was covered by glaciers or tundra, to approximately 57°N in European

Russia, as well as in western and central Siberia (Svendsen et al. 1999; Tarasov et al. 2000).

According to fossil data, forest refugia were present in the southern mid-latitudes of Eurasia, such

as the north of the Sea of Azov and east of the Ural watershed (north of Caspian Sea, northwest of

Aral Sea and southern Urals; Hewitt 2004; Tarasov et al. 2000). Other refugia were in the Tien-

Shan Mountains (Kazakhstan) and in northern Mongolia (Tarasov et al. 2000). Furthermore, in the

Altai region forests could only grow at altitudes lower than 1000 m during the LGM and migration

of trees into the Altai from nearby refugia occurred solely after deglaciation (Blyakharchuk et al.

2004).

In addition to this complex history of Eurasia, the reproductive biology of Larix species

suggests their populations may have high levels of differentiation, since under normal conditions,

their pollen and seeds usually disperse over less than 100 m (Brown et al. 1988; Duncan 1954; Hall

1986; Knowles et al. 1992). Therefore, extant populations of Larix species are likely to have

complex origins and genetic structures. Yet, most previous studies on Larix suggested relatively

simple pictures such as low levels of genetic differentiation at both inter- and intraspecific levels.

Recent speciation on the geological time scale, lack of reproductive isolation and recent divergence

of extant populations were given as explanations for such low differentiation (Gros-Louis et al.

2005; Larionova et al. 2004; Lewandowski 1997; Semerikov and Lascoux 2003; Semerikov and

Lascoux 1999; Timerjanov 1997; Wei et al. 2003; Wei and Wang 2004b).

The classification status of Larix populations occurring from western Russia to central

Siberia is controversial. Some authors have considered these populations as a single species,

Larix sibirica L. (Kullman 1998; Milyutin and Vishnevetskaia 1995; Semerikov et al. 1999; Wei

and Wang 2004a). However, because they display a slight geographic gradient of morphological

traits along their distribution range, populations westward of the Irtysh and Ob rivers are considered

by some authors as an independent species: Larix sukaczewii Dylis (Bashalkhanov et al. 2003;

Dylis 1947). In this classification, L. sibirica refers to populations found mainly in central Siberia,

while L. sukaczewii refers to those found in western Russia (Abaimov et al. 2002; Abaimov et al.

1998; Timerjanov 1997).

While there is some information about Larix species from DNA markers (microsatellites,

AFLP, etc.), there is still very little information about levels and patterns of nucleotide variation in

the coding regions of the nuclear genome. As demonstrated in our recent study of the Eurasian Larix

species, such information can give important insights into history and classification of this genus

(Khatab et al. 2008).

In this study, partial regions of the 5.8S rDNA gene including two internal transcribed

spacers ITS1 and ITS2 (hereafter referred to as ITS), the glyoxysomal malate dehydrogenase

(gMDH), the cinnamyl alcohol dehydrogenase (CAD) and the phytochrome-O (PHYO) were

directly sequenced. Sequence data for the partial region of the 4-coumarate: coenzyme A ligase

(4CL) gene obtained in a previous study (Khatab et al. 2008) were also included. Six populations of

L. sukaczewii and three populations of L. sibirica were examined. All investigated populations of

L. sibirica came from locations that were not glaciated during the Pleistocene and some of them are

regarded as glacial refugia (Tarasov et al. 2000). On the other hand, four of the investigated

populations of L. sukaczewii (populations 1 through 4) are located in a previously glaciated area,

while the two remaining populations (5 and 6) are located in areas regarded as glacial refugia

(Tarasov et al. 2000).

The main objectives of the present study were: (i) to determine whether, as suggested by

previous studies, populations of L. sukaczewii and L. sibirica are weakly differentiated; (ii) to verify

the taxonomic status of L. sukaczewii and L. sibirica; and (iii) to provide new information about the

demographic history of both species.

MATERIALS AND METHODS

Seed samples of L. sukaczewii and L. sibirica were collected from natural forests in Russia

(Abaimov et al. 2002). Details on number of samples per population and locations of the nine

populations used in this study are shown in Table 1. Sample sizes were not uniform among the

investigated DNA regions and populations due to either non-amplification of the DNA target, or

depletion of DNA stock during experiments.

DNA extraction, amplification and sequencing of the target DNA regions

Seeds were kept for 2 ~ 3 days on moist sterilized paper to facilitate separation of the

maternal haploid megagametophyte tissues from seed coats and embryos. Total genomic DNA was

isolated from megagametophytes using the SDS method (Ish-Horowicz 1989) with some

modifications: separated megagametophyte tissue was placed in a 1.5 ml tube and homogenized

with 200 µl of extraction buffer (0.1 M Tris HCl pH 8.0, 10 mM EDTA, 0.5% SDS and 0.1 mg/ml

Proteinase K) using a pestle and the mixture was incubated at 37 °C for 2 hours; then 200 µl of TE

solution (10 mM Tris HCl, 1 mM EDTA pH 8.0) was added. One volume (400 µl) of Tris saturated

phenol was added and mixed gently for 15 minutes. The mixture was centrifuged at 10,000 g for 5

minutes and the aqueous phase was transferred to a new tube. The lysate was treated with RNase A

(final concentration 50 µg/ml) and incubated at 37 °C for 1 hour. One volume of

phenol/chloroform (1:1) was added, mixed gently for 15 minutes and centrifuged at 10,000 g for 5

minutes. The aqueous phase was transferred to a new 1.5 ml tube. The DNA was precipitated by

adding 1/10 volume of 3 M sodium acetate and 2.5 volumes of cold 100% ethanol. The solution was

kept at -80 °C for 15 minutes and centrifuged at 10,000 g for 5 minutes. The DNA pellet was

washed two times with 70% ethanol and centrifuged at 10,000 g for 5 minutes. The dried DNA

pellet was dissolved in 100 ~ 200 µl of TE solution.

The multi-copy ITS region is assumed to evolve in a concerted fashion (Wei and Wang

2003). Previous phylogeographic studies of trees and annual plants, such as Fraxinus sp. (Jeandroz

et al. 1997), Olea europaea L. (Hess et al. 2000), Helenium virginicum (Simurda and Knox 2000),

Saxifraga oppositifolia (Holderegger and Abbott 2003), Pritzelago alpina (Kropf et al. 2003), and

Clausia aprica (Franzke et al. 2004), have successfully used the ITS region as DNA marker. The

4CL is a low-copy gene that has been used in phylogenetic studies of Pinaceae (Wang et al. 2000).

Two to three copies of the 4CL gene exist in Larix species (Wei and Wang 2004a). The gMDH

gene has been reported to exist as a single copy (Kim and Smith 1994). It encodes an enzyme involved in lipid

metabolism in seeds. The CAD gene has been reported to exist as a single copy in Pinus taeda

(MacKay et al. 1995) and as a small gene family in Picea abies (Schubert et al. 1998). Both the

4CL and the CAD genes play roles in the lignin biosynthetic pathway (Wei and Wang 2004a;

Whetten and Sederoff 1995). The phytochrome-O (PHYO) gene was used in a study of nucleotide

diversity along a latitudinal cline in Pinus sylvestris (Garcia-Gil et al. 2003). Phytochrome acts as

the photoreceptor that mediates red light effects on various physiological and molecular responses

in plants (Sharrock and Quail 1989).

Primers for the ITS, CAD and PHYO gene regions were designed based on conserved DNA

sequence regions of Larix species from GenBank using Primer3 (Rozen and Skaletsky 2000)

and GeneFisher (Reeder et al. 2006), both of which are website-based primer designers. Primers for

the first PCR (1st Fwd, 1st Rev) of the gMDH gene region were designed using Cryptomeria

japonica genomic DNA, cDNA sequences (GenBank BP175785) and Pinus taeda EST (expressed

sequence tag) sequences. Because amplification with the pair of primers for the 1st PCR was not

sufficient, nested primers were designed (2nd Fwd and 2nd Rev) for a second PCR, using a sequence

of one individual of L. olgensis, which was obtained by direct sequencing of the first PCR products.

The primer sequences (5' – 3') used in this study were as follows: ITS Fwd:

TGCGGTAGGATCATTGATAGCA, Rev: AGCCCAAACCTATCCATCCGA; gMDH 1st PCR:

Fwd: AATCCGTTGGTCTCAGTCCTTCA, Rev: ACCTCTGTGCCCCCATTTTGAATAC, 2nd

PCR: Fwd: ATATGGACACCACTGCCGTT, Rev: TTTTAGAAAATGATGGATTGAC; CAD

Fwd: CACTTACACTCTCAGGTACA, Rev: GAAGGGCCAGATAAGGTTCCA; PHYO Fwd:

GAGGTAGTTGCAGAGATGAGA, Rev: ATATTGGGAGTCTGAGACACA. The PCR mixture

was prepared to a total volume of 50 µl containing 50 ~ 100 ng DNA template, 50 mM KCl, 10

mM Tris-HCl pH 8.3, 1.5 mM MgCl2, 2.5 pmol of each primer and 2 mM each of dATP, dGTP,

dCTP and dTTP (Amersham Bioscience, USA) and 1 unit of Taq polymerase. The amplification of

the ITS region was carried out after denaturing the DNA at 95 °C for 5 minutes followed by 35

cycles of 30 seconds at 95 °C, 45 seconds at 55 °C for annealing, 60 seconds at 72 °C, and ending

with 7 minutes at 72 °C for further extension. The gMDH region was amplified as follows: the

temperature profile was one cycle of 95 °C for 3 minutes, 32 cycles of 30 seconds at 95 °C, 30

seconds at 55 °C, one minute at 72 °C, and then one cycle of 7 minutes at 72 °C. The first PCR

products were used as templates for the second PCR and the number of PCR cycles was changed

from 32 to 15. The amplification of the PHYO region was as follows: 95 °C for 5 minutes, 35

cycles of 30 seconds at 95 °C, 30 seconds at 55 °C, 30 seconds at 72 °C and then a further extension

of 7 minutes at 72 °C. The amplification of the CAD region was: 95 °C for 5 minutes, followed by

35 ~ 40 cycles of 30 seconds at 95 °C, 45 seconds at 50 ~ 55 °C, 60 seconds at 72 °C, ending with 7

minutes at 72 °C. The amplification of the 4CL region was performed according to Wang et al.

(2000). All PCR products were purified using Wizard® SV Gel and PCR Clean-Up System

(Promega, USA) following the manufacturer’s instructions. Purified PCR products were directly

sequenced on the ABI Prism 3100 Genetic Analyzer (Applied Biosystems), using the BigDye™

Terminator (v 3.1) Cycle Sequencing Ready Reaction Kit (PE Applied Biosystems) according to

manufacturer’s instructions, and sequences were determined for both strands. Additional internal

primers were used during sequencing (data not shown).
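
As a quick sanity check on the primers listed above, their length, GC content and an approximate melting temperature can be tabulated. The following is a minimal sketch only: the function names are hypothetical, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is a rough approximation intended for short oligonucleotides. It is not the method used by Primer3 or GeneFisher and was not part of the original analysis.

```python
# Primer sequences (5' -> 3') copied from the text above.
PRIMERS = {
    "ITS Fwd": "TGCGGTAGGATCATTGATAGCA",
    "ITS Rev": "AGCCCAAACCTATCCATCCGA",
    "gMDH 1st Fwd": "AATCCGTTGGTCTCAGTCCTTCA",
    "gMDH 1st Rev": "ACCTCTGTGCCCCCATTTTGAATAC",
    "gMDH 2nd Fwd": "ATATGGACACCACTGCCGTT",
    "gMDH 2nd Rev": "TTTTAGAAAATGATGGATTGAC",
    "CAD Fwd": "CACTTACACTCTCAGGTACA",
    "CAD Rev": "GAAGGGCCAGATAAGGTTCCA",
    "PHYO Fwd": "GAGGTAGTTGCAGAGATGAGA",
    "PHYO Rev": "ATATTGGGAGTCTGAGACACA",
}


def gc_fraction(seq):
    """Fraction of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature (degrees C) by the Wallace rule.

    Assumption: a crude rule of thumb for short oligos, used here only
    for illustration; it ignores salt and primer concentration.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name:14s} len={len(seq):2d} "
              f"GC={gc_fraction(seq):.2f} Tm~{wallace_tm(seq)} C")
```

Such a table only helps to spot primer pairs with very different estimated melting temperatures; the annealing temperatures reported in the text remain the authoritative values.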

Data analyses

Sequences of both strands were checked using the Sequence Navigator 1.01 (Applied

Biosystems, Foster City, California) and the ATGC program ver. 4 (GENETYX CORPORATION).

Complete sequences of individuals were aligned using the ClustalX program ver. 1.83 (Thompson et

al. 1997). The DnaSP program ver. 4.10.9 (Rozas et al. 2003) was used to perform the following

sequence analyses: 1) nucleotide diversity per site (π; Nei 1987); 2) nucleotide polymorphism (θ)

(Watterson 1975); 3) the number of haplotypes (H); and 4) the following neutrality tests were

carried out: Tajima’s D (Tajima 1989), Fu and Li’s D and F (with and without outgroup; Fu and Li

1993), and Hudson, Kreitman and Aguadé's (HKA) test (Hudson et al. 1987). The HKA test was

carried out in direct input mode, since multilocus analyses could not be performed because of

differences in number of samples between the investigated DNA regions (Table 1). Measures of

population differentiation (conventional F-statistics; FST; Weir and Hill 2002) with 1000

permutations were calculated using the Arlequin program ver. 3.11 (Excoffier et al. 2005). Two

types of treatments were used: one where gaps were considered as segregating sites and the other

where they were excluded.
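
The summary statistics listed above can be illustrated on a toy alignment. The sketch below is not DnaSP's implementation; it assumes a gap-free alignment of equal-length haploid sequences, counts each variable column as one segregating site, and uses hypothetical function names. It computes nucleotide diversity per site (π), Watterson's θ per site and Tajima's D following the standard formulas of Tajima (1989).

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Number of alignment columns with more than one state (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def pi_per_site(seqs):
    """Nucleotide diversity per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def theta_w_per_site(seqs):
    """Watterson's estimator per site: S / a1 / L."""
    a1 = sum(1 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a1 / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D; returns NaN when it is undefined (e.g. S = 0)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if s == 0:
        return float("nan")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return float("nan")
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(variance)


if __name__ == "__main__":
    # Toy data only; these are not sequences from this study.
    toy = ["AAAA", "AAAT", "AATT", "ATTT"]
    print(segregating_sites(toy), pi_per_site(toy),
          theta_w_per_site(toy), tajimas_d(toy))
```

In this formulation an excess of singletons lowers π relative to θ and pushes D below zero, whereas an excess of intermediate-frequency variants pushes D above zero.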

Haplotype networks were constructed using the median-joining method as implemented in the

NETWORK program ver. 4.2.0.1 (Bandelt et al. 1999) to visualize relationships and frequencies of

individual haplotypes (indels were considered in the analysis).

RESULTS

Sequence variation

The obtained lengths of the aligned sequences (including indels) were: the ITS region =

1777 bp, the 4CL region = 758 bp, the gMDH = 1285 bp, the CAD region = 1331 bp, the PHYO =

565 bp. In the ITS region the total number of segregating sites (S) was 31, including 17 singletons

and one indel. Twenty nine segregating sites, including 16 singletons, were found in the ITS1

region (18 ~ 1390 bp position). Two segregating sites including one singleton were observed in the

ITS2 region (1553 ~ 1777 bp) in L. sibirica (Fig. 1). No variation was found in the 5.8S rDNA

region (1391 ~ 1552 bp). Eleven segregating sites were found in the 4CL region, including five

singletons and two indels. Eight segregating sites including one replacement were found in exon 1

(1 ~ 654 bp). Three segregating sites (one singleton and two indels) were found in the intron (655 ~

736 bp) and no variation was observed in exon 2 (737 ~ 758 bp; Fig. 2). The gMDH region

consisted of two exons (1 ~ 274, 599 ~ 688 bp) and two introns. In total, 27 polymorphic sites were

detected in all nine populations. Three segregating sites were observed in the first exon and only

one segregating site was observed in the second exon, a replacement at position 619 bp (haplotype

H04), which was observed only in populations 7 and 8 of L. sibirica. All other segregating sites

were found in introns (Fig. 3). Three indels were observed; the first indel was composed of only

one nucleotide at position 440 bp, the second was composed of three nucleotides at positions 540,

541 and 542 bp; the third and longest indel was composed of 14 nucleotides at positions 868 ~ 881

bp (Fig. 3). In the CAD region, only three haplotypes were found among 48 sequences of L.

sukaczewii and five sequences of L. sibirica. Forty seven sequences

of L. sukaczewii and one of L. sibirica represented only one haplotype. Four sequences of L.

sibirica represented another haplotype, which differed from the previous one by only one non-

synonymous substitution at 607 bp position in exon 3. The third haplotype was found in only one

individual of L. sukaczewii and differed from the most common haplotype by one indel of two bp at

248 ~ 249 bp positions in intron 1. The partial sequence of the PHYO region analyzed in our study is

composed of only one exon. Only one synonymous segregating site (a ‘T/C’ nucleotide

substitution) was found in this region at 367 bp position in both L. sukaczewii and L. sibirica.

However, this site was ambiguous (showing both ‘T’ and ‘C’ nucleotides) in several sequences of

both species. This result might have been caused, for example, by recent duplication of the gene or by

contamination with diploid tissue. Since both CAD and PHYO regions showed very low nucleotide

variation, they were excluded from further analyses.

The nucleotide diversity (πall sites) in the ITS region ranged from 0.0007 (population 3) to

0.0026 (population 9) and the nucleotide polymorphism (θall sites) from 0.0007 (population 3) to

0.0026 (population 8). In the 4CL region, πall sites ranged from 0.0013 (population 6) to 0.0036

(population 9) and θall sites from 0.0014 (population 1) to 0.0037 (population 4). In the gMDH

region, πall sites ranged from 0.0019 (population 8) to 0.0082 (population 6) and θall sites from 0.0032

(population 1) to 0.0062 (population 6). Polymorphisms at non-synonymous, non-coding and

synonymous, as well as silent sites (synonymous and non-coding) are shown in Tables 2a ~ c.

In the ITS and 4CL regions, values of πall sites and θall sites were generally lower in L.

sukaczewii (ITS/4CL over all populations: πall sites = 0.0010/0.0020; θall sites = 0.0013/0.0026) than in

L. sibirica (ITS/4CL over all populations: πall sites = 0.0026/0.0033; θall sites = 0.0031/0.0027; Tables

2a and 2b), while similar levels of πall sites and θall sites were found in comparisons between

populations of L. sukaczewii from putative refugia (5 and 6) and populations created after

deglaciation (1 ~ 4; Tables 2a and 2b). On the other hand, in the gMDH region higher values were

observed in L. sukaczewii in overall population comparisons (L. sukaczewii: πall sites = 0.0059; θall

sites = 0.0042; L. sibirica: πall sites = 0.0027; θall sites = 0.0044). In L. sukaczewii, higher values of πall

sites and θall sites were observed in population 6 (putative refugium; Table 2c).

Haplotypes

The constructed haplotype networks, including indels, are shown in Figs. 4a (ITS), 4b (4CL)

and 4c (gMDH). Twenty five haplotypes (including indels) that relate to each other in a complex

network were found in the ITS region. Ten haplotypes were found in L. sukaczewii and 16 in L.

sibirica and only one haplotype (H21) was shared by both species (Fig. 4a). Haplotypes H06, H21

and H23 were the most frequent in L. sukaczewii while in L. sibirica H14 was the most frequent

haplotype (Fig. 4a). Some haplotypes differed from each other by only one mutational step (e.g.,

H06 and H07 differed only by an indel at 1200 bp position; H19 and H21 differed by one nucleotide

substitution at 1191 bp position), while others were several mutational steps apart (e.g., H16 and

H25, the two most isolated haplotypes; Figs. 1 and 4a). Haplotypes H02 ~ H07 found in L.

sukaczewii and haplotypes H14 and H15 found in L. sibirica differed by eight or more mutational

steps and formed the two most distinct groups in the network (Figs. 1 and 4a).

The haplotype network of the 4CL region (Fig. 4b) was simpler than that of the ITS region

(Fig. 4a). Thirteen haplotypes (indels included) were found in this region. Five of them were found

only in L. sukaczewii, and one was found only in L. sibirica; seven haplotypes were shared by both

species. As in the ITS region, some haplotypes differed from each other by only one mutational

step (e.g., H06 and H08), others were up to five mutational steps apart from each other (e.g., H01

and H12; Figs. 2 and 4b). Haplotypes H02 ~ H08 and H09 ~ H13 appeared to form two separate

groups and the haplotype H01 appeared to be isolated from these two groups. Haplotypes H02,

H04, H05 and H06 of the first group were more frequent in L. sukaczewii than in L. sibirica, while

the haplotype H10 of the second group as well as the haplotype H01 were more frequent in L.

sibirica (Figs. 2 and 4b).

The haplotype network of the gMDH region, including indels, shown in Fig. 4c revealed the

presence of three main groups of diverged haplotypes. The first group included haplotype H01 and

the low frequency haplotype H02. The second group included haplotype H03 and the low

frequency haplotypes H04, H05, H07, H08 and H09. Finally, the third group included only one

haplotype H06, which was isolated from all other haplotypes. Haplotypes H01 and H02 differed

from haplotype H03 by respectively 27 and 26 mutational steps and from haplotype H06 by 32 and

31 mutational steps respectively. However, this was due to the presence of a single indel composed

of 14 nucleotides (Fig. 3). Haplotype H03 was the most common and was found in all populations (Fig. 4c).
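
The effect of a long indel on the number of mutational steps can be shown with a small sketch. The text suggests that each gap position of the 14-nucleotide indel was counted as a separate step; the hypothetical function below counts differences between two aligned haplotypes either per site or with each contiguous gap run treated as a single event. It uses toy sequences and is not the algorithm of the NETWORK program.

```python
def mutational_steps(hap_a, hap_b, indel_as_single_event=True):
    """Count differences between two aligned haplotypes ('-' marks a gap).

    Assumption: when indel_as_single_event is True, a contiguous run of
    columns in which exactly one haplotype has a gap counts as one step.
    """
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must be aligned to the same length")
    steps = 0
    in_gap_run = False
    for x, y in zip(hap_a, hap_b):
        gap_difference = (x == "-") != (y == "-")
        if gap_difference:
            # Count every gap column, or only the first one of a run.
            if not (indel_as_single_event and in_gap_run):
                steps += 1
            in_gap_run = True
        else:
            in_gap_run = False
            if x != y:
                steps += 1  # ordinary nucleotide substitution
    return steps


if __name__ == "__main__":
    # Toy haplotypes: one substitution plus one 14-nucleotide indel.
    a = "ACGT" + "-" * 14 + "AC"
    b = "ACGA" + "G" * 14 + "AC"
    print(mutational_steps(a, b, indel_as_single_event=False))
    print(mutational_steps(a, b, indel_as_single_event=True))
```

Counted per site, the toy pair differs by many more steps than when the indel is treated as one event, which is the same kind of inflation described for haplotypes H01, H02 and H06 above.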

In the ITS region, population 6 differed from other populations of L. sukaczewii mainly in

haplotype frequencies rather than haplotype composition. Each population of L. sibirica, however,

appeared to be unique in haplotype composition with only a few shared haplotypes observed among

populations 7, 8 and 9 (Fig. 5). In the 4CL region population 6 of L. sukaczewii and population 8 of

L. sibirica were most distinct. Population 6 shared haplotype H04 with populations 2, 3 and 4 and

haplotype H05 with population 1 but frequencies of these haplotypes differed. Populations 5 and 6

did not share any haplotype. Among the four haplotypes observed in population 8, two haplotypes

(H04 and H11) were absent in populations 7 and 9. The haplotypes H01 and H10 were shared

among all three populations of L. sibirica, but in population 8 their frequencies differed from

populations 7 and 9 (Fig. 5). In the gMDH region, differences in haplotype frequencies rather than haplotype

composition were the main cause of population differentiation (Fig. 5). However, the most marked

characteristic of the haplotype pattern, especially in the ITS and 4CL regions, was the apparent

distinction between L. sukaczewii and L. sibirica in both composition and frequencies of haplotypes

(Fig. 5).

Genetic differentiation among populations

The FST values obtained with and without indels were similar to each other, therefore only

FST values with indels are presented. In the ITS region, the highest values of FST were found in

comparisons between populations of L. sukaczewii and L. sibirica (Tables 3a ~ c). The range of

pair-wise FST values in this region varied from negative (e.g., populations 1 ~ 4 of L. sukaczewii) to

as high as 0.523 (pop. 5 vs. 7, and pop. 6 vs. 7; Table 3a). All pair-wise FST values among L.

sibirica populations were moderate to high. All FST values for pair-wise comparisons between L.

sukaczewii and L. sibirica populations were moderate to high and statistically significant (p < 0.05;

Table 3a).

In the 4CL region, all comparisons involving population 6 and most comparisons between L.

sukaczewii and L. sibirica showed moderate to high values of FST. The FST value for the

comparison between populations 6 (L. sukaczewii) and 8 (L. sibirica) was the highest (0.401; Table

3b). Low levels of population differentiation and/or non-significant FST values were found among

populations 1 ~ 5 and between populations 7 and 9 of L. sibirica (Table 3b).

In the gMDH region, most of the high FST values were also observed in pair-wise

comparisons involving population 6 (e.g. pop. 6 vs. 7, FST = 0.287; Table 3c).
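
To make the notion of pair-wise differentiation concrete, a simplified haplotype-frequency version of FST can be sketched as (HT - HS) / HT, where HS is the mean within-population gene diversity and HT the diversity of the averaged frequencies. This is an illustrative assumption only: it is not the estimator implemented in Arlequin, it applies no sample-size correction (so it cannot produce the negative values reported in Table 3a), and the function names and toy data are hypothetical.

```python
from collections import Counter


def haplotype_frequencies(haplotypes):
    """Relative frequency of each haplotype label in one population."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {hap: c / total for hap, c in counts.items()}


def gene_diversity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared frequencies."""
    return 1 - sum(p * p for p in freqs.values())


def pairwise_fst(pop_a, pop_b):
    """Simplified FST = (HT - HS) / HT for two populations."""
    fa = haplotype_frequencies(pop_a)
    fb = haplotype_frequencies(pop_b)
    hs = (gene_diversity(fa) + gene_diversity(fb)) / 2
    # Unweighted mean frequencies across the two populations.
    mean = {h: (fa.get(h, 0) + fb.get(h, 0)) / 2 for h in set(fa) | set(fb)}
    ht = gene_diversity(mean)
    return 0.0 if ht == 0 else (ht - hs) / ht


if __name__ == "__main__":
    # Toy haplotype lists; not data from this study.
    pop_x = ["H01", "H01", "H03", "H03", "H03"]
    pop_y = ["H03", "H03", "H03", "H06", "H06"]
    print(round(pairwise_fst(pop_x, pop_y), 3))
```

Under this simplification, two populations with identical haplotype frequencies give a value of zero, and two populations fixed for different haplotypes give a value of one.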

Tests of neutrality and population size changes

In the ITS and 4CL regions, no statistically significant result was obtained in any of the

neutrality tests (Tajima’s D; Fu and Li’s D* and F*, D and F; and HKA) and there was no tendency

toward negative or positive values in Tajima’s D and Fu and Li’s D* and F*, D and F (data not

shown). Therefore, no deviations from neutrality were detected.

However, in the gMDH region, for all but one population (4) of L. sukaczewii, Tajima’s D,

Fu and Li’s D* and F*, and D and F were positive and some of them were significant. On the other

hand, these tests showed significant negative values in populations 7 and 8 of L. sibirica (Table 4).

However, the HKA test failed to detect significant deviations from neutrality in this region (data not

shown).

DISCUSSION

DNA sequences and polymorphism

The ITS region is perhaps the most commonly used sequence in population genetic and

phylogenetic studies (Alvarez and Wendel 2003). However, some authors have argued that for

various reasons such as the presence of multiple copies, compensatory base changes and

difficulties in alignment, the use of ITS for such studies is problematic (Alvarez and Wendel 2003;

Bailey et al. 2003; Campbell et al. 2005; Gernandt and Liston 1999). Indeed, the presence of

multiple copies of the ITS region was reported for some Larix species (Gernandt and Liston 1999;

Gernandt et al. 2001; Wei et al. 2003; Wei and Wang 2004b). Yet, there is also evidence

suggesting that multiple and different copies of the ITS region were not amplified in our study. If

such copies were present in this material, one would expect to observe multiple peaks during

sequencing such as those reported by Gernandt et al. (2001). Yet, the ITS chromatograms obtained

using ABI 3100 automatic sequencer had no ambiguous nucleotide sites. Therefore, the direct

sequencing method used in this study has probably detected only one copy of the ITS region or

multiple copies, which had identical sequence. Based on this study’s data alone one cannot

determine the reason why additional copies of the ITS region were not detected. Nevertheless, such

selective amplification has been often reported in other studies and its possible causes have been

reviewed by e.g., Wagner et al. (1994).

In spite of the fact that three copies of the 4CL region exist in the genus Larix, the direct sequencing

method used by Khatab et al. (2008) detected only the 4CL-B copy (as determined by comparisons

with 4CL sequences of Larix from the GenBank). To date, the copy status of the gMDH region has

not been studied in Larix; however, it is reported to exist as a single copy in cucumber (Kim and

Smith 1994).

It is often assumed that long non-coding regions of the DNA harbor more nucleotide

variation than shorter coding regions. Although this may be true in most cases, in this study most

segregating sites (eight out of 11) in the 4CL region were found in exon 1 (size = 654 bp). On

the other hand, the CAD region was almost monomorphic, despite its total size of 1331 bp including

more than 600 bp of introns. The reasons for such low nucleotide variation in the CAD region

remain a question for further investigation. The low nucleotide variation in the CAD and PHYO

regions and the ambiguity observed at the only segregating site in the PHYO region prevented their

use in this study.

The levels of nucleotide diversity (π; Table 2) revealed in this study were similar to

nucleotide variation reported in other studies on conifers using nuclear gene regions. For example,

values of π were in approximately the same order of magnitude as those reported for Pinus taeda

(ranges of 19 loci: πall sites = 0.00027 ~ 0.01728, πsilent = 0.00042 ~ 0.01975; Brown et al. 2004); P.

sylvestris (PHYP: πall sites = 0.0010, πsyn = 0.0020; PHYO: πall sites = 0.0004, πsyn = 0.0013; Garcia-Gil

et al. 2003); P. sylvestris (pal1: πall sites = 0.0014, πsyn = 0.0049; Dvornyk et al. 2002); P.

tabuliformis, P. yunnanensis, P. densata (ranges of seven loci: πall sites = 0.0064 ~ 0.0092; πsilent =

0.0087 ~ 0.0128; Ma et al. 2006) and Cryptomeria japonica (ranges of seven loci: πall sites = 0.00004

~ 0.00519; πsilent = 0.00017 ~ 0.00813; Kado et al. 2003). Similar levels of nucleotide diversity

were also observed in the C3H nuclear gene region of L. sukaczewii (πall sites = 0.0016) and L.

sibirica (πall sites = 0.0020; Khatab et al. 2008).

In overall population comparisons the values of π in the ITS and 4CL regions were lower in

L. sukaczewii (ITS: πall sites = 0.0010, πnon coding = 0.0011; 4CL: πall sites = 0.0020, πsilent = 0.0057) than

in L. sibirica (ITS: πall sites = 0.0026, πnon coding = 0.0028; 4CL: πall sites = 0.0033, πsilent = 0.0102; Table

2). This result was concordant with that reported for the C3H region, where the levels of π were
also slightly lower in L. sukaczewii (Khatab et al. 2008). However, similar levels of variation in

nuclear AFLPs between these two species were reported (Semerikov and Lascoux 2003). This

difference could be due to the different ways in which AFLP markers and sequenced nuclear gene loci

sample the existing genetic variation. On the other hand, in the gMDH region, the values of πall sites

were higher in L. sukaczewii than in L. sibirica. Considering that deviations from neutrality were

detected in the gMDH region: positive values in five out of six populations of L. sukaczewii and

negative in all populations of L. sibirica (Table 4), and since demographic events, such as

population bottlenecks, should affect the whole genome, selection acting upon this gene might be

responsible for the observed differences in the levels of polymorphism observed in the gMDH

compared to the other two DNA regions. Balancing selection could be acting upon this region in L.

sukaczewii populations (except population 4), while positive selection (selective sweep) could be

acting in all studied populations of L. sibirica. However, the exact reasons for the situation

observed at this gene cannot be determined in this study. Hence, further studies that focus on the

gMDH gene should be considered in the future. Considerable differences observed among

individual loci included in this study indicate that the available data are still insufficient to make

general inferences about the levels of polymorphism in L. sukaczewii and L. sibirica.

Population differentiation

Low levels of genetic differentiation among local populations of conifers are usually

expected because of their outbreeding and wind pollination (Loveless and Hamrick

1984). In the genus Larix, however, high levels of population differentiation could be expected

because its pollen does not have air sacs (Owens et al. 1998) and thus cannot disperse over long

distances. For example, it has been reported that, under normal conditions, most L. laricina

pollen falls less than 50 m away from the parent tree (Hall 1986; Knowles et al. 1992). Seeds are

not easily disseminated either, being generally dispersed over distances equivalent to less than two

tree heights (Brown et al. 1988; Duncan 1954; Knowles et al. 1992). Therefore, geographic

isolation has been considered as a barrier to gene flow among Larix populations (Lewandowski et

al. 1994; Young and Young 1992). Yet, most previous studies on Larix revealed low population

differentiation (Larionova et al. 2004; Lewandowski 1997; Semerikov and Lascoux 2003;

Semerikov and Lascoux 1999; Timerjanov 1997; Wei et al. 2003; Wei and Wang 2004b). Recent

divergence of extant populations was suggested as the cause of the low genetic differentiation

within and among Eurasian species from the genus Larix (Semerikov and Lascoux 2003; Wei et al.

2003; Wei and Wang 2004b). In this study, both low and high levels of differentiation among

populations were found. The lack of differentiation among populations 1 through 4 (most FST

values were close to zero and/or not statistically significant; Tables 3a ~ c), which occupy previously

glaciated areas on the plains of northwestern Russia, is concordant with results from previous studies

and is consistent with their recent divergence on a geological time scale. No population

differentiation was observed in the C3H region for the same populations either (Khatab et al. 2008).

Moderate to high levels of differentiation among populations (FST > 0.050) were found in many

pair-wise comparisons involving populations 6, 7, 8 and 9, which (except population 7) occur in, or

near areas of putative refugia (Tables 3a ~ c). These results are consistent with a history of long-term

isolation of these populations, or of their respective source populations, during the Pleistocene.
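The qualitative ranges used here to interpret pair-wise FST values (Wright 1978; see the legend of Table 3) can be expressed as a short Python sketch. The function name is hypothetical, and two choices are assumptions rather than part of Wright's scheme: negative estimates are treated as zero, and values falling exactly on a boundary are assigned to the lower class.

```python
def classify_fst(fst):
    """Return the qualitative differentiation class for a pair-wise FST estimate."""
    value = max(fst, 0.0)  # assumption: negative estimates are treated as zero
    if value <= 0.05:
        return "little"
    if value <= 0.15:
        return "moderate"
    if value <= 0.25:
        return "great"
    return "very great"


# Example values taken from Table 3a (ITS region).
examples = {
    "1 vs 2": -0.066,
    "5 vs 6": 0.118,
    "7 vs 8": 0.232,
    "1 vs 7": 0.495,
}
classes = {pair: classify_fst(value) for pair, value in examples.items()}
```

Applied to the ITS values above, the comparison between populations 1 and 2 falls in the "little" class, whereas the comparison between populations 1 and 7 falls in the "very great" class.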

Only a few other results showing moderate to high levels of population differentiation in

Larix species have been reported. In RAPD analyses of Larix species, Kozyrenko et al. (2004)

found an overall GST = 0.1864. In a study of L. sukaczewii using allozymes, one highly

differentiated population from the southern Urals (near the location of population 6) was reported, in

spite of a low overall population differentiation (FST = 0.028; Timerjanov 1997). The author

concluded that this result may be due to isolation of this area from other parts of the L. sukaczewii

distribution during the LGM. An additional reason for the high levels of differentiation among

some populations revealed in this study could be the fact that four out of nine of them are located in,

or near, areas of different and isolated putative Pleistocene refugia, which have rarely been

investigated before. Populations from these areas might have evolved independently for a long time

with little, if any, gene flow among individual refugia.

Two recent studies on populations of Larix species have also revealed moderate to high

levels of population differentiation. For example, in the C3H gene region populations 6 and 8

showed significant levels of population differentiation when compared to other populations of the

corresponding species (Khatab et al. 2008). In a study of mtDNA variation (Semerikov et al. 2007),

the observed overall FST of 45.7% was similar to the levels of population differentiation revealed in

the present study (Table 3). The divergent haplotype distribution between L. sukaczewii and L.

sibirica, as well as among L. sibirica populations, reported by Semerikov et al. (2007) was also

similar to the results found in this study, especially those for the ITS region (Fig. 5).

Demography

Populations of L. sukaczewii were sampled both from areas of recent colonization (1, 2, 3

and 4) and from putative Pleistocene refugia (5 and 6). The former were probably founded by migrants coming

from southern refugia (likely including areas near the Sea of Azov) and might have started

occupying their extant locations around 7500 ~ 8700 years before present (Kullman 1998). Some

haplotypes found in populations 1 through 4 were not found in populations 5 and 6 in both ITS and

4CL regions (ITS: H02, H03, H05, H07 and H24; 4CL: H03, H07, H08, H09 and H10; Figs. 4a, 4b

and 5), while in the gMDH region the main differences among populations were due to different

haplotype frequencies (Figs. 4c and 5). Nevertheless, the differences in haplotype composition observed

between populations of L. sukaczewii from refugial areas in the southwestern Urals (5 and 6) and

populations from northwestern Russia, which was glaciated during the LGM (1 through 4; Tarasov

et al. 2000), suggest that population 5 and especially population 6 are not the likely sources of

post-glacial expansion into that region. It is possible that the extant populations in northwestern Russia

have originated from several sources located in other refugial areas that existed during the LGM,

such as the surroundings of the Sea of Azov and other locations within the Urals watershed

(Tarasov et al. 2000). However, to our knowledge, the part of the southern Urals where populations 5

and 6 are located is currently the southernmost limit of extant populations of L. sukaczewii, and

Larix species no longer grow in areas farther south or near the Sea of Azov because those areas

are now dominated by steppe vegetation or desert. It thus appears that some refugial populations,

which gave rise to the extant populations in northwestern Russia, went extinct. The similar levels

of nucleotide diversity (π) and nucleotide polymorphism (θ) observed in comparisons between

populations from putative refugia (5 and 6) and populations from newly colonized areas (1 ~ 4;

Table 2) confirm the findings reported by Khatab et al. (2008) and are also concordant with this

study’s suggestion that populations 1, 2, 3 and 4 were created by migrants from different refugia,

i.e., the admixture effect proposed by Widmer (2001). Populations occurring in refugial areas, or

created by migrants from different and genetically distinct refugia, are expected to harbor higher

levels of genetic diversity than those occurring in newly colonized deglaciated areas.

Populations 8 and 9 of L. sibirica are located in areas of putative refugia in south-central

Siberia and the Altai (Blyakharchuk et al. 2004; Tarasov et al. 2000). On the other hand, there is no

information about the presence of Larix refugia in the Upper Tunguska region where population 7 is

located, near the banks of the Angara River. It is possible that this population was created by

migrants from refugia other than those where populations 8 and 9 occur, since in the ITS region,

population 7 showed moderate to high levels of differentiation in relation to the other two

populations (Table 3a and Fig. 5). The Angara River, which flows out of Lake Baikal, could have

been the main route of colonization of that area, most likely from northern Mongolia through the

surroundings of this lake. If this scenario is correct, our results for the ITS region give support to

the results reported by Semerikov et al. (2007), where haplotype frequencies observed in

populations of L. sibirica from the southern coast of Lake Baikal also suggested their independent

origin from western populations. Population 9 (Altai region) appears to have been created by

migrants from nearby refugial areas located at lower altitudes because no forest was present at its

current altitude (1630 m) during Pleistocene glaciation (Blyakharchuk et al. 2004; Tarasov et al.

2000). Finally, the unique haplotype composition of population 8 observed in the ITS and 4CL regions,

and the high FST values in the ITS region in pair-wise comparisons with the other two L. sibirica

populations (7 and 9; Table 3a and Fig. 5), suggest that, despite its relative geographic proximity, it has

evolved in isolation from populations occurring in other parts of the Siberian Central Plateau.

Larix sukaczewii and L. sibirica

Following an extensive study, Dylis (1947) found that populations from western Russia

differ from those occurring in central and eastern Siberia with respect to a considerable number of

characters, such as cone variability, seeds, shoots, crown shape, stem, and the physical and mechanical

properties of wood. Based on these results, he proposed to regard populations from western Siberia

as a separate taxon: L. sukaczewii. Results from karyotypic analyses (Muratova 1991) gave further

support to such classification, and analysis of phylogenetic relationships between L. sibirica and L.

sukaczewii using the chloroplast DNA trnK intron sequences (Bashalkhanov et al. 2003) revealed

interspecific levels of genetic distances between L. sibirica and L. sukaczewii. In this study,

haplotype composition of the investigated populations showed a conspicuous separation between L.

sibirica and L. sukaczewii in the ITS region. Among the 25 ITS haplotypes, only one (H21) was

shared by both taxa (Figs. 4a and 5); this haplotype was frequent in L. sukaczewii but was

found in only one individual of L. sibirica. Further evidence of the divergence between these two

taxa was given by the haplotype network, which showed two groups of haplotypes (H02 ~ H07, L.

sukaczewii and H13 ~ H16, L. sibirica) separated by several mutational steps (Figs. 1 and 4a). In

the 4CL region, haplotypes of the two species were more similar to each other than those observed

in the ITS region. Seven out of 13 haplotypes were shared, but noticeable differences in haplotype

frequencies were also observed when populations of L. sibirica were compared to those of L.

sukaczewii. Some 4CL haplotypes that were frequent in L. sibirica (e.g., H01 and H10) were rarely

found in L. sukaczewii and vice versa (e.g., H02; Figs. 4b and 5). Few differences in haplotype

composition were observed in the gMDH region between these two taxa, but there were noticeable

differences in haplotype frequencies (Figs. 4c and 5). The high FST values obtained in most pair-

wise comparisons in the ITS region when populations of L. sibirica were compared to populations

of L. sukaczewii also suggest a considerable divergence between these two taxa (Tables 3a ~ c).

Therefore, results of this study provide partial support for the classification of L. sibirica and L.

sukaczewii as two distinct taxa. However, as the phylogeography of L. sukaczewii and L. sibirica

seems to be much more complex than previously suggested, further studies that include L. sibirica

populations from areas colonized after the LGM are necessary for a better comprehension of the

post-glacial history of these species.
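The comparison of haplotype composition between the two taxa can be sketched in a few lines of Python: given haplotype counts per taxon, the shared haplotypes and their relative frequencies are tabulated. The counts below are hypothetical placeholders chosen only to mimic the pattern described for the ITS region (a single shared haplotype that is frequent in one taxon and rare in the other); they are not the data shown in Figs. 4 and 5, and the function names are illustrative.

```python
def shared_haplotypes(counts_a, counts_b):
    """Return the sorted list of haplotypes present in both taxa."""
    return sorted(set(counts_a) & set(counts_b))


def haplotype_frequencies(counts):
    """Convert haplotype counts into relative frequencies."""
    total = sum(counts.values())
    return {haplotype: count / total for haplotype, count in counts.items()}


# Hypothetical counts, for illustration only.
sukaczewii_counts = {"H02": 6, "H05": 4, "H21": 10}
sibirica_counts = {"H13": 9, "H16": 10, "H21": 1}

shared = shared_haplotypes(sukaczewii_counts, sibirica_counts)
freq_sukaczewii = haplotype_frequencies(sukaczewii_counts)
freq_sibirica = haplotype_frequencies(sibirica_counts)
```

With these placeholder counts, the only shared haplotype is H21, with a frequency of 0.5 in one taxon and 0.05 in the other.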

References

Abaimov AP, Barzut VM, Berkutenko AN, Buitnk J, Martinsson O, Milyutin LI, Polezhaev A,

Putenikhin VP, Katsuhiko T (2002) Seed collection and seed quality of Larix spp. from Russia -

Initial Phase of the Russian-Scandinavian Larch project. Eur J For Research 4:39-49

Abaimov AP, Lesinski JA, Martinsson O, Milyutin LI (1998) Variability and ecology of Siberia

Larch species. Swedish University of Agricultural Sciences, Department of Silviculture, Report No.

43, 123pp, Umeå

Alvarez I, Wendel JF (2003) Ribosomal ITS sequences and plant phylogenetic inference. Mol Phyl

Evol 29:417-434

Bailey CD, Carr TG, Harris SA, Hughes CE (2003) Characterization of angiosperm nrDNA

polymorphism, paralogy, and pseudogenes. Mol Phyl Evol 29:435-455

Bandelt HJ, Forster P, Rohl A (1999) Median-joining networks for inferring intraspecific

phylogenies. Mol Biol Evol 16:37-48

Bashalkhanov SI, Konstantinov YM, Verbitskii DS, Kobzev VF (2003) Reconstruction of

phylogenetic relationships of larch Larix sukaczewii Dyl. based on chloroplast DNA trnK intron

sequences. Rus J Genet 39:1322-1327

Blyakharchuk TA, Wright HE, Borodavko PS, van der Knaap WO, Ammann B (2004) Late Glacial

Holocene vegetational changes on the Ulagan high-mountain plateau, Altai Mountains, southern

Siberia. Palaeogeo Palaeoclim Palaeoecol 209:259-279

Brown KR, Zobel D, Zasada J (1988) Seed dispersal, seedling emergence and early survival of

Larix laricina (DuRoi) K. Koch in the Tanana Valley, Alaska. Can J Forest Res - Revue Can Rech

For 18:306-314

Brown GR, Gill GP, Kuntz RJ, Langley CH, Neale DB (2004) Nucleotide diversity and linkage

disequilibrium in loblolly pine. Proc Natl Acad Sci 101:15255-15260

Campbell CS, Wright WA, Cox M, Vining TF, Smoot Major C, Arsenault MP (2005) Nuclear

ribosomal DNA internal transcribed spacer 1 (ITS1) in Picea (Pinaceae): sequence divergence and

structure. Mol Phyl Evol 35:165-185

Duncan D (1954) A study of some of the factors affecting the natural regeneration of tamarack

(Larix laricina) in Minnesota. Ecology 35:498-521

Dvornyk V, Sirvio A, Mikkonen M, Savolainen O (2002) Low nucleotide diversity at the pal1 locus

in the widely distributed Pinus sylvestris. Mol Biol Evol 19:179-188

Dylis NV (ed) (1947) Sibirskaya listvennitsa. Moskovskoye Obshchestvo Ispytatelnej Prirody,

Novaya seria, Otdel’ botanicheskij, Moskva

Excoffier L, Laval B, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for

population genetics data analysis. Evolutionary Bioinformatics Online 1:47-50

Franzke A, Hurka H, Janssen D, Neuffer B, Friesen N, Markov M, Mummenhoff K (2004)

Molecular signals for Late Tertiary Early Quaternary range splits of an Eurasian steppe plant:

Clausia aprica (Brassicaceae). Mol Ecol 13:2789-2795

Fu Y-X, Li W-H (1993) Statistical tests of neutrality of mutations. Genetics 133:693-709

Garcia-Gil MR, Mikkonen M, Savolainen O (2003) Nucleotide diversity at two phytochrome loci

along a latitudinal cline in Pinus sylvestris. Mol Ecol 12:1195-1206

Gernandt DS, Liston A (1999) Internal transcribed spacer region evolution in Larix and

Pseudotsuga (Pinaceae). Am J Bot 86:711-723

Gernandt DS, Liston A, Pinero D (2001) Variation in the nrDNA ITS of Pinus Subsection

Cembroides: Implications for Molecular Systematic Studies of Pine Species Complexes. Mol Phyl

Evol 21:449-467

Gros-Louis MC, Bousquet J, Pâques LE, Isabel N (2005) Species-diagnostic markers in Larix spp.

based on RAM and nuclear, cpDNA, and mtDNA gene sequences, and their phylogenetic

implications. Tree Genet Genom 1:50-63

Hall J (1986) Growth and development of larch in Newfoundland. In: Murray MB (ed) Sixth

International Workshop on Forest Regeneration: The Yield Advantages of Artificial Regeneration

at High Altitudes. USDA Forest Service General Technical Report PNW-194

Hess J, Kadereit JW, Vargas P (2000) The colonization history of Olea europaea L. in Macaronesia

based on internal transcribed spacer 1 (ITS-1) sequences, randomly amplified polymorphic DNAs

(RAPD), and intersimple sequence repeats (ISSR). Mol Ecol 9:857-868

Hewitt G (2000) The genetic legacy of the Quaternary ice ages. Nature 405:907-913

Hewitt GM (2004) Genetic consequences of climatic oscillations in the Quaternary. Phil Transact

Royal Soc London: Biol Sci 359:183-195

Holderegger R, Abbott RJ (2003) Phylogeography of the Arctic-Alpine Saxifraga oppositifolia

(Saxifragaceae) and some related taxa based on cpDNA and its sequence variation. Am J Bot

90:931-936

Hudson R (2002) Generating samples under Wright-Fisher neutral model of genetic variation.

Bioinformatics 18:337-338

Hudson R, Kreitman M, Aguade M (1987) A test of neutral molecular evolution based on

nucleotide data. Genetics 116:153-159

Hudson RR, Coyne JA (2002) Mathematical consequences of the genealogical species concept.

Evolution 56:1557-1565

Ish-Horowicz D (1989) Isolation of DNA from adult flies. In: Ashburner M (ed) Drosophila: A

Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor

Jeandroz S, Roy A, Bousquet J (1997) Phylogeny and phylogeography of the circumpolar genus

Fraxinus (Oleaceae) based on internal transcribed spacer sequences of nuclear ribosomal DNA.

Mol Phyl Evol 7:241-251

Kado T, Yoshimaru H, Tsumura Y, Tachida H (2003) DNA Variation in a Conifer, Cryptomeria

japonica (Cupressaceae sensu lato). Genetics 164:1547-1559

Khatab IA, Ishiyama H, Inomata N, Wang X-R, Szmidt AE (2008) Phylogeography of Eurasian

Larix species inferred from nucleotide variation in two nuclear genes. Genes Genet Syst 83: in press

Kim D, Smith S (1994) Expression of a single gene encoding microbody NAD-malate

dehydrogenase during glyoxysome and peroxisome development in cucumber. Plant Mol Biol

26:1833-1841

Knowles P, Perry D, Foster H (1992) Spatial genetic structure in two tamarack (Larix laricina (Du

Roi) K. Koch) populations with differing establishment histories. Evolution 46:572-576

Kozyrenko MM, Artyukova EV, Reunova GD, Levina EA, Zhuravlev YN (2004) Genetic Diversity

and Relationships Among Siberian and Far Eastern larches Inferred from RAPD Analysis. Rus J

Genet 40:401-409

Kropf M, Kadereit J, Comes H (2003) Differential cycles of range contraction and expansion in

European high mountain plants during the late Quaternary: insights from Pritzelago alpina (L.)

Kuntze (Brassicaceae). Mol Ecol 12:931-949

Kullman L (1998) Paleoecological, biogeographical and paleoclimatological implications of early

Holocene immigration of Larix sibirica into the Scandes mountains, Sweden. Global Ecol Biogeo

Letters, 5

Larionova AY, Yakhneva NV, Abaimov AP (2004) Genetic diversity and differentiation of Gmelin

larch Larix gmelinii populations from Evenkia (Central Siberia). Rus J Genet 40:1127-1133

Lewandowski A (1997) Genetic relationships between European and Siberian larch, Larix spp.

(Pinaceae), studied by allozymes. Is the Polish larch a hybrid between these two species? Plant Syst

Evol 204:65-73

Lewandowski A, Nikkanen T, Burczyk J (1994) Production of hybrid seed in a seed orchard of two

Larch Species, Larix sibirica and Larix decidua. Scand J For Res 9:214-217

Loveless MD, Hamrick JL (1984) Ecological determinants of genetic structure in plant populations.

Annu Rev Ecol Syst 15:65-95

Ma XF, Szmidt AE, Wang XR (2006) Genetic structure and evolutionary history of a diploid hybrid

pine Pinus densata inferred from the nucleotide variation at seven gene loci. Mol Biol Evol 23:807-

816

MacKay JJ, Liu W, Whetten R, Sederoff RR, O'Malley DM (1995) Genetic analysis of cinnamyl

alcohol dehydrogenase in loblolly pine: single gene inheritance, molecular characterization and

evolution. Mol Genet Genom 247:537-545

Milyutin L, Vishnevetskaia K (1995) Larch and Larch Forest in Siberia. In: Schmidt WC,

McDonald KJ (eds) Ecology and Management of Larix forests: a look ahead. U.S.D.A. Forest

Service Intermountain Research Station GTR-INT-319, Whitefish, Montana, U.S.A., pp 19-29

Muratova E (1991) Karyotypic analysis of Larix sibirica (Pinaceae) from Various Parts of the

Species Area. Bot Zh 76:1586-1595

Nei M (1987) Molecular evolutionary genetics. Columbia University Press, New York

Owens J, Takaso T, Runions C (1998) Pollination in conifers. Trends Plant Sci 3:479-485

Reeder J, Höchsmann M, Rehmsmeier M, Voss B, Giegerich R (2006) Beyond Mfold: recent

advances in RNA bioinformatics. J Biotec 124:41-55

Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R (2003) DnaSP, DNA polymorphism

analyses by the coalescent and other methods. Bioinformatics 19:2496-2497

Rozen S, Skaletsky H (2000) Primer3 on the WWW for general users and for biologist

programmers. In: Krawetz S, Misener S (eds) Bioinformatics Methods and Protocols. Meth Mol

Biol 132:365-386

Schubert R, Sperisen C, Müller-Starck G, Lascala S, Ernst D, Sandermann Jr H, Hager KP (1998)

The cinnamyl alcohol dehydrogenase gene structure in Picea abies (L.) Karst.: genomic sequences,

Southern hybridization, genetic analysis and phylogenetic relationships. Trees - Struct Funct

12:453-463

Semerikov V, Lascoux M (2003) Nuclear and Cytoplasmic Variation Within and Between Eurasian

Larix (Pinaceae) Species. Am J Bot 90:1113-1123

Semerikov VL, Iroshnikov AI, Lascoux M (2007) Mitochondrial DNA Variation Pattern and

Postglacial History of the Siberian Larch (Larix sibirica Ledeb.). Rus J Ecol 38:163-171

Semerikov VL, Lascoux M (1999) Genetic relationship among Eurasian and American Larix

species based on allozymes. Heredity 83:62-70

Semerikov VL, Semerikov LF, Lascoux M (1999) Intra- and interspecific allozyme variability in

Eurasian Larix Mill. species. Heredity 82:193-204

Sharrock RA, Quail PH (1989) Novel phytochrome sequences in Arabidopsis thaliana: structure,

evolution, and differential expression of a plant regulatory photoreceptor family. Genes Develop

3:1745-1757

Simurda M, Knox J (2000) ITS sequence evidence for the disjunct distribution between Virginia

and Missouri of the narrow endemic Helenium virginicum (Asteraceae). J Torrey Bot Soc 127:316-

323

Svendsen JI, Astakhov VI, Bolshiyanov DY, Demidov I, Dowdeswell JA, Gataullin V, Hjort C,

Hubberten HW, Larsen E, Mangerud J, Melles M, Möller P, Saarnisto M, Siegert MJ (1999)

Maximum extent of the Eurasian ice sheets in the Barents and Kara Sea region during the

Weichselian. Boreas 28:134-242

Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA

polymorphism. Genetics 123:585-595

Tarasov PE, Volkova VS, Webb T III, Guiot J, Andreev AA, Bezusko LG, Bezusko TV, Bykova GV,

Dorofeyuk NI, Kvavadze EV, Osipova IM, Panova NK, Sevastyanov DV (2000) Last glacial

maximum biomes reconstructed from pollen and plant macrofossil data from northern Eurasia. J

Biogeo 27:609-620

Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The CLUSTAL_X

windows interface: flexible strategies for multiple sequences alignment aided by quality analysis

tools. Nucl Acids Res 25:4876-4882

Timerjanov AS (1997) Lack of allozyme variation in Larix sukaczewii Dyl. from the southern

Urals. Silvae Genet 46:61-64

Wagner A, Blackstone N, Cartwright P, Dick M, Misof B, Snow P, Wagner GP, Bartels J, Murtha

M, Pendleton J (1994) Surveys of gene families using polymerase chain reaction - PCR selection

and PCR drift. Syst Biol 43:250-261

Wang XQ, Tank DC, Sang T (2000) Phylogeny and divergence times in Pinaceae: Evidence from

three genomes. Mol Biol Evol 17:773-781

Watterson G (1975) On the number of segregating sites in genetical models without recombination.

Theor Pop Biol 7:256-276

Wei H-X, Wang X-Q, Hong D-Y (2003) Marked intergenomic heterogeneity and geographical

differentiation of nrDNA ITS in Larix potaninii (Pinaceae). J Mol Evol 57:623-635

Wei XX, Wang XQ (2003) Phylogenetic split of Larix: evidence from paternally inherited cpDNA

trnT-trnF region. Plant Syst Evol 239:67-77

Wei XX, Wang XQ (2004a) Evolution of 4-coumarate: coenzyme A ligase (4CL) gene and

divergence of Larix (Pinaceae). Mol Phyl Evol 31:542-553

Wei XX, Wang XQ (2004b) Recolonization and radiation in Larix (Pinaceae): evidence from

nuclear ribosomal DNA paralogues. Mol Ecol 13:3115-3123

Weir BS, Hill WG (2002) Estimating F-statistics. Annu Rev Genet 36:721-750

Whetten R, Sederoff R (1995) Lignin biosynthesis. Plant Cell 7:1001-1013

Widmer A (2001) Glacial refugia: sanctuaries for allelic richness, but not for gene diversity. Trends

Ecol Evol 16:267-269

Wright S (1978) Evolution and the genetics of populations. University of Chicago Press, Chicago

Young J, Young C (1992) Larix Mill., larch. In: Dudky T (ed) Seeds of woody plants in North

America. Dioscorides Press, Portland

Table 1. Sample sizes and locations of Larix populations used in this study. (n.a. = not available).

Original population designations used by Abaimov et al. (2002) are given in parentheses

Table 2. Summary of total number of segregating sites (S), nucleotide diversity (π) and nucleotide

polymorphism (θ) in the: 2a) ITS, 2b) 4CL and 2c) gMDH regions

Table 3. Pair-wise FST for: 3a) the ITS region; 3b) the 4CL region; and 3c) the gMDH region

Statistical significance is marked with asterisks (p < 0.05 = *; p < 0.02 = **; p < 0.01 = ***)

Range interpretation: 0 ~ 0.05 = little differentiation; 0.05 ~ 0.15 = moderate; 0.15 ~ 0.25 = great; > 0.25 = very great (Wright 1978)

Table 3a
Pop.  1          2          3          4          5          6          7          8          9
1     ―
2     -0.066     ―
3     -0.082     -0.055     ―
4     -0.054     -0.034     -0.108     ―
5     -0.029     -0.051     0.003      0.030      ―
6     0.000      0.015      0.081      0.078      0.118      ―
7     0.495 ***  0.487 ***  0.481 ***  0.484 ***  0.523 ***  0.523 ***  ―
8     0.324 ***  0.321 ***  0.293 ***  0.299 ***  0.362 ***  0.398 ***  0.232 ***  ―
9     0.371 ***  0.368 ***  0.359 ***  0.359 ***  0.413 ***  0.377 ***  0.134      0.134 *    ―

Table 3b
Pop.  1          2          3          4          5          6          7          8          9
1     ―
2     -0.080     ―
3     -0.016     -0.004     ―
4     -0.018     -0.005     -0.013     ―
5     -0.069     -0.031     -0.023     -0.010     ―
6     0.179 ***  0.089      0.096      0.123 *    0.237 ***  ―
7     -0.042     -0.026     0.017      -0.061     -0.020     0.128 *    ―
8     0.298 ***  0.228 **   0.317 ***  0.292 ***  0.266 **   0.401 ***  0.217 **   ―
9     0.023      0.015      0.095      -0.004     0.000      0.232 ***  -0.048     0.087      ―

Table 3c
Pop.  1          2          3          4          5          6          7          8          9
1     ―
2     0.135      ―
3     0.107      -0.007     ―
4     -0.082     0.025      0.011      ―
5     0.186 *    -0.044     -0.081     0.069      ―
6     0.086      0.198 *    0.013      0.053      0.124      ―
7     0.123      -0.020     0.134      0.034      0.124      0.287 ***  ―
8     0.003      0.068      0.161      -0.034     0.196      0.245 **   -0.032     ―
9     -0.037     -0.038     0.014      -0.093     0.040      0.121      -0.050     -0.075     ―

Table 4. Results of neutrality tests for the gMDH region; Tajima’s D, Fu & Li’s D* and F*

(without outgroup) and Fu & Li’s D and F (with outgroup: L. kaempferi Lamb)

Statistical significance is marked with asterisks (Tajima’s D: p < 0.05 = *; p < 0.01 = **;

Fu & Li’s D*, F*, D and F: p < 0.05 = *; p < 0.02 = **)

Population    Tajima's D    Fu & Li's D*  Fu & Li's F*  Fu & Li's D   Fu & Li's F
L. sukaczewii
1             1.2076        1.4762 **     1.5932 *      1.6747 *      1.8643 *
2             0.9971        1.4840 **     1.5435        1.6883 *      1.8018 *
3             0.9141        0.1695        0.3999        0.0181        0.3286
4             -0.5087       -0.1424       -0.2634       -0.3934       -0.5183
5             2.0217 *      1.2359        1.6112 *      1.3890        1.8916 *
6             1.4615        1.0236        1.2872        1.1512        1.5120
L. sibirica
7             -1.9887 **    -2.3355 **    -2.5332 **    -3.1838 **    -3.4542 **
8             -1.9612 *     -2.2986 **    -2.4929 **    -3.0950 **    -3.3628 **
9             -1.4211       -1.3216       -1.5080       -2.0016       -2.2407

Figure 1. Summary of segregating sites in the ITS region. H01 ~ H25 represent haplotypes. The

dash (―) represents an indel

Figure 2. Summary of segregating sites in the 4CL region. The nucleotide ‘T’ at 244 bp position

of the H12 haplotype (third column) is a replacement. Dashes (―) represent indels

Figure 3. Summary of segregating sites in the gMDH region. H01 ~ H09 represent haplotypes. Dashes (―) represent indels. One indel is composed

of three nucleotides (540 ~ 542 bp) and another is composed of 14 nucleotides (868 ~ 881 bp)

Figure 4. Haplotype networks (unrooted minimum spanning trees): (4a) the ITS region; (4b) the

4CL region; and (4c) the gMDH region. Small gray circles in Figs. 4a and 4c represent

nodes. All other circles represent haplotypes. The sizes of circles are proportional to the

haplotype frequency. Branch lengths longer than one mutational step are marked with

numbers

Figure 4a

Figure 4b

Figure 4c

Figure 5. Distribution of ITS, 4CL and gMDH haplotypes among populations. Larix sukaczewii populations are represented by black squares and L.

sibirica by gray circles

CHAPTER 2

Genetic structure of Dipterocarpus alatus Roxb. populations from Thailand

revealed by nuclear microsatellites

ABSTRACT

Four populations of Dipterocarpus alatus from Thailand were investigated using nuclear

microsatellite loci. Two populations were from mainland Thailand, one from Samui Island and one from

the Malay Peninsula. Nine pairs of primers originally designed for the related species Shorea curtisii were

tested. However, only four loci were successfully amplified, and only two of them (Shc02 and Shc07)

were polymorphic. Null alleles appeared to be present at the Shc07 locus, but a significant deviation

from Hardy-Weinberg equilibrium (HWE) was found only in the Samui population at this locus.

Levels of genetic variation observed in D. alatus were similar to those revealed in other studies

on tropical trees using microsatellites. The levels of population differentiation (FST and RST) for

microsatellites were lower than those observed for isozymes in another study on D. alatus that included

three out of four populations investigated in this study. Genetic distances, however, were generally

consistent between microsatellites and isozymes in pair-wise comparisons, except between Kuphrakona

(mainland) and Hat Yai (Malay Peninsula), where microsatellite results suggested these two

geographically distant populations were relatively close to each other, but isozymes suggested the

opposite. This discrepancy could have been caused by homoplasy at microsatellite loci in these two

populations.

The genetic distances showed that Hat Yai was the most isolated population, indicating that it has

been isolated for a relatively long period of time. The two mainland populations were closer to each

other, and the population from Samui had an intermediate position between the mainland populations and Hat Yai in

terms of genetic distance, which is consistent with its geographical position. Therefore, gene flow could have

occurred between Samui and other populations in the past.

INTRODUCTION

Dipterocarpaceae is a dominant family of tree species of tropical Asia. It is composed of often

large trees that are usually insect-pollinated and have heavy seeds, which are not likely to disperse far

away from mother trees. Dipterocarps, as they are known, often show asynchronous flowering

(Appanah and Chan 1981) and their pollinators are weak-flying insects such as thrips, bees and small

moths (Ghazoul 1997; Smitinand and Santisuk 1981), which cannot forage far beyond 100 m (Appanah

and Chan 1981). It is thus possible that mating often occurs among adjacent individuals. These factors

may limit gene flow among populations and help create groups of related individuals within stands.

Concerning the conservation of a species, fundamental questions are the amount of genetic

variation existing within and among populations, the patterns of gene flow and the mating system

(Changtragoon and Szmidt 1993). Another issue is the history of extant populations (phylogeography).

Sea levels varied greatly during the last 20,000 years, and many areas that are now covered

by sea were dry land during the Last Glacial Maximum (LGM). The dominant vegetation

during that time was savannah, including in the places where the four populations of this study occur

(Gathorne-Hardy et al. 2002).

Dipterocarpus alatus has a straight trunk 45 ~ 55 m in height, providing commercially

valuable timber. It is found from Bangladesh (Chittagong) to South Vietnam and Northern Malaysia

(Smitinand et al. 1980). In Thailand, it occurs in most parts of the country. However, the total forested

area of Thailand has been reduced from more than 53% in 1961 to about 25% by the end of the 20th

century (Charuppat 1998; Lakanavichian 2001), possibly reducing the genetic

variability of D. alatus, as well as other forest species. Although D. alatus is an important forest tree

species, there is little information on its genetic variation, phylogeography and mating system.
There are only a few previous studies on the intraspecific genetic variation of D. alatus, and these

used isozyme markers (e.g., Changtragoon and Boontawee 1999; Changtragoon 2001). Microsatellites

are, like isozymes, codominant markers, but they usually have more alleles per locus and therefore

greater resolution is expected in population genetic studies (Chase et al. 1996; Maguire et al. 2000).

They were used in this study to determine the extent and pattern of genetic variation within and the

differentiation among four populations of D. alatus from Thailand. The primers used in this study were

developed for a related species, Shorea curtisii Dyer ex King (Ujino et al. 1998). The utility of these

microsatellites loci for studies on population genetics of other Dipterocarpaceae species was

demonstrated in some studies e.g., (Konuma et al. 2000; Takeuchi et al. 2004; Ng et al. 2006).

MATERIALS AND METHODS

Seeds were randomly collected from trees separated by at least 100 meters in the Samngao, Tak (1); Kuphrakona, Roiet (2); and Hat Yai, Songkhla (3) populations. Leaves were collected from the coastal area and along road 4169 on Samui Island (4), except for the western part of the island (Table 1; Fig. 1).

DNA Extraction, PCR and Polyacrylamide Gel Electrophoresis

DNA was extracted from leaves of seedlings raised from the seeds collected from populations 1 ~ 3 and from the leaves collected on Samui Island. Total DNA was isolated using the hexadecyltrimethylammonium bromide (CTAB) method (Doyle and Doyle 1987) with a few modifications: 25 ~ 50 mg of dried leaves were ground in liquid nitrogen and suspended in 600 µl of pre-warmed (60 °C) 2× CTAB buffer (2% CTAB, 10 mM Tris-HCl pH 8.0, 20 mM EDTA pH 8.0, 1.4 M NaCl, 1% polyvinylpyrrolidone, 0.2% β-mercaptoethanol). The mixture was incubated at 65 °C for 1 ~ 2 hours. Subsequently, 200 µl of sodium acetate (3 M) was added and the mixture was kept at –20 °C for 30 minutes. An equal volume of 24:1 chloroform-isoamyl alcohol (CIA) was added and mixed by inverting for 10 minutes. The lysate was separated by centrifugation at 10,000 g for 10 minutes, treated with 2.5 µl of RNase (10 mg/ml) and incubated for 30 minutes at 37 °C. An equal volume of CIA was added, the sample was mixed and centrifuged, and the aqueous layer was transferred to a new tube. DNA was precipitated by adding two-thirds volume of ice-cold isopropanol and incubating for 15 minutes at 4 °C. Subsequently, the DNA pellet was washed three times with 70% ethanol, resuspended in 100 µl of TE buffer (10 mM Tris-HCl pH 8.0; 1 mM EDTA pH 8.0) and kept overnight at 4 °C to dissolve further prior to PCR. The extracted DNA was checked by agarose gel (0.8%) electrophoresis.

Nine pairs of primers were tested. They were designed to amplify the following microsatellite

loci: Shc01, Shc02, Shc03, Shc04, Shc07, Shc08, Shc09, Shc11 and Shc17, which were developed for the related species Shorea curtisii (Ujino et al. 1998). Forward primers were fluorescently labeled; reverse primers were not.

Polymerase chain reaction (PCR) was performed as described in Ujino et al. (1998).

Acrylamide gel electrophoresis of the amplified SSRs was carried out using the ABI 377 Genetic Analyzer (Perkin-Elmer Co. Ltd) following the manufacturer’s instructions, with the additional procedures described in Fernando et al. (2001) to avoid electrophoresis artifacts. Fragment sizes were determined using the GeneScan program version 2.1 (Perkin-Elmer ABI Co. Ltd) and the GenoTyper program version 2.0 (Perkin-Elmer ABI Co. Ltd). PCR amplification and allele detection were performed twice to verify the genotyping. DNA samples of S. curtisii were used as a positive control.

DNA Cloning and Sequencing

To verify the nature of the amplified microsatellite PCR products, a few samples from Samui and from at least one mainland/peninsular population were purified for each locus with the GeneClean DNA Turbo purification kit (Bio101) and cloned into the pGEM-T Easy vector (Promega) following the manufacturer’s instructions. The products were then sequenced using the ABI Prism 3100 Genetic Analyzer (Applied Biosystems) according to the manufacturer’s instructions.

Data analysis

Possible scoring errors caused by the presence of null alleles, i.e., heterozygotes that were scored as homozygotes (Brookfield 1996), were checked using the Micro-Checker software (Van Oosterhout et al. 2004). Genetic variability was assessed under two mutation models: the infinite allele model (IAM; Kimura and Crow 1964; Wright 1949), using statistics based on allele frequencies (F-statistics), and the stepwise mutation model (SMM; Kimura and Ohta 1978), using statistics based on both allele frequencies and allele sizes (R-statistics). The total number of alleles (Ao) and the allelic richness (RS) per locus and population were obtained using the FSTAT program, version 2.9.3 (Goudet 1995). RS was calculated by the rarefaction method (Hurlbert 1971), as described in Elmousadik and Petit (1996) and Tsuda and Ide (2005), for a minimum sample size of 20 diploid individuals (the number of individuals in the smallest investigated sample; Table 1). FSTAT was also used to obtain unbiased inbreeding coefficients (FIS), the estimators of genetic differentiation (FST; Weir and Cockerham 1984), the statistical significance of the F-statistics, and the genotypic (linkage) disequilibrium between loci.
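The rarefaction step behind RS can be sketched in a few lines. This is a minimal illustration, assuming the standard rarefaction formula of Hurlbert (1971) as applied to alleles by Elmousadik and Petit (1996); it is not the FSTAT implementation. The function name `allelic_richness` and the allele counts are hypothetical and introduced here only for illustration. A sample of 20 diploid individuals corresponds to 2 × 20 = 40 gene copies.

```python
from math import comb


def allelic_richness(allele_counts, n_genes):
    """Expected number of distinct alleles in a random subsample of n_genes gene copies.

    allele_counts: observed copy number of each allele at one locus in one population.
    n_genes: rarefied sample size in gene copies (2 x number of diploid individuals).
    """
    total = sum(allele_counts)
    if n_genes > total:
        raise ValueError("cannot rarefy to more gene copies than were sampled")
    # For each allele: 1 - P(the allele is absent from the subsample).
    return sum(1 - comb(total - c, n_genes) / comb(total, n_genes) for c in allele_counts)


# Hypothetical counts for six alleles in 30 diploid individuals (60 gene copies).
counts = [25, 15, 10, 6, 3, 1]
print(round(allelic_richness(counts, 40), 3))  # rarefied to 20 individuals
print(allelic_richness(counts, 60))            # no rarefaction: the raw allele count
```

When the rarefied size equals the number of gene copies actually sampled, the function returns the raw allele count. This is consistent with Ao and RS coinciding for the smallest sample (Samngao, n = 20) in Table 2.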

Nei’s genetic distances (Ds; Nei 1978) were calculated using the TFPGA (Tools for Population Genetic Analyses) program, version 1.3 (Miller 1997). Unbiased heterozygosities were obtained using the PopGene program, version 1.32 (Yeh and Boyle 1997). Rho, an unbiased estimator of RST that corrects for potential biases resulting from unequal sample sizes (Slatkin 1995), and (δµ)², the genetic distance based on the SMM, were calculated using the RST Calc program, version 2.2 (Goodman 1997).
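As an illustration of what an unbiased heterozygosity estimate involves, the sketch below applies the small-sample correction of Nei (1978), He = 2n(1 − Σp²)/(2n − 1). The source does not state which estimator PopGene applied, so this choice is an assumption; the function name and the allele frequencies are hypothetical.

```python
def unbiased_heterozygosity(allele_freqs, n_individuals):
    """Nei's (1978) unbiased expected heterozygosity for one locus.

    allele_freqs: sample allele frequencies (should sum to 1).
    n_individuals: number of diploid individuals sampled.
    """
    two_n = 2 * n_individuals
    homozygosity = sum(p * p for p in allele_freqs)
    return two_n / (two_n - 1) * (1 - homozygosity)


# Hypothetical frequencies of three alleles in a sample of 20 individuals.
freqs = [0.5, 0.3, 0.2]
biased = 1 - sum(p * p for p in freqs)
print(round(biased, 4), round(unbiased_heterozygosity(freqs, 20), 4))
```

The correction factor 2n/(2n − 1) matters most for small samples such as Samngao (n = 20) and shrinks toward 1 as the sample grows.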

Finally, Mantel tests were used to test the correlation between genetic and geographic distances. The tests were carried out using the IBD (Isolation by Distance) program, version 1.52 (Bohonak 2002), which is web-based software (http://www.bio.sdsu.edu/pub/andy/IBD.html).
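The logic of a Mantel test can be sketched as follows. This is an illustrative exact-permutation version, not the IBD program: the great-circle (haversine) distances, the one-tailed p-value and all function names are assumptions made here. The coordinates are converted from Table 1 and the Ds values are copied from Table 7.

```python
from itertools import permutations
from math import asin, cos, radians, sin, sqrt

# Latitude and longitude in decimal degrees, converted from Table 1.
COORDS = {
    "Samngao": (17.3417, 99.0500),
    "Kuphrakona": (15.5667, 103.8333),
    "Hat Yai": (7.4667, 100.4433),
    "Samui": (9.5833, 100.0000),
}

# Nei's (1978) Ds between population pairs, copied from Table 7.
NEI_DS = {
    frozenset(("Samngao", "Kuphrakona")): 0.038,
    frozenset(("Samngao", "Hat Yai")): 0.142,
    frozenset(("Samngao", "Samui")): 0.065,
    frozenset(("Kuphrakona", "Hat Yai")): 0.040,
    frozenset(("Kuphrakona", "Samui")): 0.095,
    frozenset(("Hat Yai", "Samui")): 0.188,
}


def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * radius_km * asin(sqrt(h))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def exact_mantel(names, genetic, geographic):
    """Return (observed r, one-tailed p) over every relabelling of the populations."""
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    x = [genetic[frozenset(p)] for p in pairs]

    def corr(relabel):
        y = [geographic[frozenset((relabel[a], relabel[b]))] for a, b in pairs]
        return pearson(x, y)

    observed = corr({n: n for n in names})
    null = [corr(dict(zip(names, perm))) for perm in permutations(names)]
    p_value = sum(r >= observed - 1e-12 for r in null) / len(null)
    return observed, p_value


names = list(COORDS)
geo = {frozenset((a, b)): haversine_km(COORDS[a], COORDS[b])
       for i, a in enumerate(names) for b in names[i + 1:]}
r, p = exact_mantel(names, NEI_DS, geo)
print(f"r = {r:.3f}, one-tailed p = {p:.3f} (smallest possible p = {1 / 24:.3f})")
```

With four populations there are only 4! = 24 relabellings, so the smallest attainable one-tailed p-value is 1/24 (about 0.042). A test on so few populations therefore has very little power.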

RESULTS

Nine pairs of primers were tested by PCR amplification and acrylamide gel electrophoresis to check for polymorphism. No PCR amplification was obtained at five loci: Shc01, Shc04, Shc08, Shc09 and Shc17. Of the four loci that amplified (Shc02, Shc03, Shc07 and Shc11), Shc03 and Shc11 were monomorphic and were therefore excluded from further analyses. In contrast, Shc02 and Shc07 showed good amplification and high polymorphism and were therefore selected for this study.

A total of 16 alleles were detected at the Shc02 locus and 17 alleles at the Shc07 locus among the four investigated populations (Table 5). At the Shc02 locus, four alleles were found only in the Samui population, two only in Kuphrakona and two only in Hat Yai. All alleles found in the Samngao population were shared with other populations (data not shown). At the Shc07 locus, seven alleles were found only in Samui and one only in Kuphrakona, while all alleles found in Samngao and Hat Yai were shared with other populations (data not shown).

The original number of samples was 185. However, 22 samples were excluded from the analyses because at least one locus failed to amplify, leaving a total of 163 individuals. Seventeen of the 22 discarded samples failed to amplify at the Shc07 locus. The Micro-Checker analysis detected the presence of null alleles at this locus, where the FIS values were all positive, varying from 0.102 in the Kuphrakona population to 0.281 in the Samui population. In contrast, the observed FIS values at the Shc02 locus were all negative, varying from -0.25 in the Kuphrakona population to -0.031 in the Hat Yai population. The only FIS value that deviated significantly from zero was found at the Shc07 locus in the Samui population (Table 3).
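The sign of FIS follows directly from the relation between observed and expected heterozygosity. The sketch below uses the simple ratio FIS ≈ 1 − Ho/He with the values of Table 4 and compares the result with Table 3. FSTAT uses the Weir and Cockerham (1984) estimator, so the simple ratio is only an approximation and small differences are expected; the function and variable names are introduced here for illustration.

```python
# (Ho, He) per locus and population, copied from Table 4.
HETEROZYGOSITY = {
    ("Shc02", "Samngao"): (0.550, 0.464),
    ("Shc02", "Kuphrakona"): (0.636, 0.511),
    ("Shc02", "Hat Yai"): (0.640, 0.621),
    ("Shc02", "Samui"): (0.835, 0.706),
    ("Shc07", "Samngao"): (0.550, 0.713),
    ("Shc07", "Kuphrakona"): (0.697, 0.775),
    ("Shc07", "Hat Yai"): (0.640, 0.720),
    ("Shc07", "Samui"): (0.565, 0.784),
}

# FIS per locus and population, copied from Table 3.
REPORTED_FIS = {
    ("Shc02", "Samngao"): -0.191,
    ("Shc02", "Kuphrakona"): -0.25,
    ("Shc02", "Hat Yai"): -0.031,
    ("Shc02", "Samui"): -0.184,
    ("Shc07", "Samngao"): 0.233,
    ("Shc07", "Kuphrakona"): 0.102,
    ("Shc07", "Hat Yai"): 0.113,
    ("Shc07", "Samui"): 0.281,
}


def fis_from_heterozygosity(ho, he):
    """Approximate FIS: positive when heterozygotes are deficient (Ho < He)."""
    return 1 - ho / he


for key, (ho, he) in HETEROZYGOSITY.items():
    approx = fis_from_heterozygosity(ho, he)
    print(key, round(approx, 3), REPORTED_FIS[key])
```

At Shc02 every population has Ho > He, which gives negative values; at Shc07 every population has Ho < He, which gives positive values.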


At the Shc02 locus the allelic richness (RS) ranged from 5.390 in Kuphrakona to 8.609 in Samui, and at the Shc07 locus from 5.000 in Samngao to 10.175 in Samui (Table 2). The RS over all populations was 8.639 at the Shc02 locus and 9.258 at the Shc07 locus (Table 5).

Sequences obtained from cloned PCR products confirmed that both amplified loci were the expected targets. However, some sequences had indels in the flanking regions (Figs. 2a and 2b). Those indels could have caused erroneous scoring of the number of repeats in some alleles.

Genetic distances and F/R-statistics were obtained both from the original data set and from the data set corrected for null alleles according to Brookfield (1996), an optional procedure in the Micro-Checker program. The results obtained from the corrected data were very similar to those from the original, uncorrected data (data not shown). Therefore, only the original data were used. The Micro-Checker program also detected the presence of slippage at both loci.
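For orientation, the size of the null-allele frequencies implied by the heterozygote deficit at Shc07 can be approximated from Table 4. The sketch below assumes the first estimator of Brookfield (1996), r = (He − Ho)/(1 + He); the source does not state which variant Micro-Checker applied, and the function name is hypothetical.

```python
def brookfield_null_frequency(ho, he):
    """Brookfield (1996) estimator 1 of null allele frequency: (He - Ho) / (1 + He)."""
    return (he - ho) / (1 + he)


# (Ho, He) at the Shc07 locus, copied from Table 4.
SHC07 = {
    "Samngao": (0.550, 0.713),
    "Kuphrakona": (0.697, 0.775),
    "Hat Yai": (0.640, 0.720),
    "Samui": (0.565, 0.784),
}

for population, (ho, he) in SHC07.items():
    print(population, round(brookfield_null_frequency(ho, he), 3))
```

The estimate grows with the gap between He and Ho, so it is largest where the heterozygote deficit is largest (Samui in this data set).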

The expected heterozygosity (He) values were lowest in Samngao and highest in Samui at both loci. They ranged from 0.464 to 0.706 at Shc02 and from 0.713 to 0.784 at Shc07. The He values over all loci ranged from 0.589 to 0.745 (Table 4). The observed heterozygosity (Ho) values ranged from 0.550 in Samngao (at both loci) to 0.835 in Samui at Shc02 and to 0.697 in Kuphrakona at Shc07. The Ho values over all loci ranged from 0.550 in Samngao to 0.700 in Samui (Table 4). The RS and the He were similar among all populations at Shc07. At the Shc02 locus, however, these values were slightly higher in the southern populations of Samui and Hat Yai (Tables 2 and 4). The lowest Ho at the Shc02 locus was found in Samngao (0.550) and the highest in the Samui population (0.835). In contrast, at the Shc07 locus the values of Ho were similar between Samngao (0.550) and Samui (0.565; Table 4).

No linkage disequilibrium was observed between Shc02 and Shc07; therefore, these two loci were considered to segregate independently. The FST over all loci and populations was 0.041, and it was lower at Shc02 (FST = 0.025) than at Shc07 (FST = 0.053; Table 5). In the pair-wise comparisons, the FST over all loci ranged from 0.012 between the Kuphrakona and Hat Yai populations to 0.065 between Samngao and Hat Yai (Table 6). The RST over all populations and loci was 0.029 and significantly different from zero. It was lower at Shc02 (0.004) than at Shc07 (0.055; Table 5). Among populations, the RST over all loci varied from -0.014 between Kuphrakona and Hat Yai to 0.046 between Samngao and Hat Yai (Table 6).

Nei’s (1978) genetic distances (Ds) and (δµ)² are shown in Table 7. Most values of the two measures were consistent with each other; the only discrepant result was observed between Kuphrakona and Hat Yai (Ds = 0.040; (δµ)² = 0.009). One possible explanation for the smaller value of (δµ)² compared to Ds between Kuphrakona and Hat Yai is that these populations differed more in allele frequencies than in allele sizes. Mantel tests did not detect significant correlations between genetic and geographic distances (data not shown).
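The contrast between a frequency-based and a size-based distance can be made concrete with a toy example. The sketch below uses hypothetical allele frequencies for two populations whose mean allele sizes are identical. For simplicity it computes Nei's standard distance without the small-sample correction of Nei (1978), and takes (δµ)² as the squared difference in mean allele size at a single locus; both simplifications and all names are assumptions made here.

```python
from math import log, sqrt


def nei_standard_distance(p, q):
    """Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx * Jy)), for one locus."""
    alleles = set(p) | set(q)
    jx = sum(f * f for f in p.values())
    jy = sum(f * f for f in q.values())
    jxy = sum(p.get(a, 0.0) * q.get(a, 0.0) for a in alleles)
    return -log(jxy / sqrt(jx * jy))


def delta_mu_squared(p, q):
    """Squared difference between mean allele sizes (in repeat units) for one locus."""
    mean_p = sum(size * f for size, f in p.items())
    mean_q = sum(size * f for size, f in q.items())
    return (mean_p - mean_q) ** 2


# Hypothetical populations: allele size (number of repeats) -> frequency.
pop_a = {10: 0.1, 11: 0.8, 12: 0.1}
pop_b = {10: 0.4, 11: 0.2, 12: 0.4}

print(round(nei_standard_distance(pop_a, pop_b), 3))
print(round(delta_mu_squared(pop_a, pop_b), 6))
```

The two populations differ strongly in allele frequencies, so the frequency-based distance is clearly positive, while their mean allele sizes coincide and the size-based distance is essentially zero.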

DISCUSSION

Allelic or gene diversities and heterozygosity

The efficiency of PCR amplification declines as the genetic distance increases between the species for which the primers were designed and the species in which they are used (Roa et al. 2000). Dipterocarpus alatus is more distantly related to Shorea curtisii, the species for which the primers used in this study were designed (Gamage et al. 2006), than are the other dipterocarp species in which these primers were used in previous studies (e.g., Konuma et al. 2000; Takeuchi et al. 2004; Ng et al. 2006). This can explain the presence of null alleles at the Shc07 locus, which the Micro-Checker program inferred from the general excess of homozygotes (positive FIS) for most allele size classes (Tables 3 and 5). Most discarded samples (17 out of 22) failed to amplify at the Shc07 locus, which is consistent with individuals carrying two null alleles. Null alleles are believed to be caused by mutations in the flanking sequence in at least one of the priming sites (Callen et al. 1993; Koorey et al. 1993), and they can lead to mistaken interpretations of the level of inbreeding in a population (Pemberton et al. 1995). However, most results were not significantly affected by the presence of null alleles, since the excess of homozygotes relative to Hardy-Weinberg (HW) equilibrium was statistically significant only in the Samui population (Table 3).

The sequences of cloned PCR products of the Shc02 and Shc07 loci were similar to those reported in Ujino et al. (1998). However, length variations (indels) were observed not only within the SSRs themselves but also in the flanking regions (Figs. 2a and 2b). If those indels were not produced by PCR errors, two or more different alleles could have been scored as the same allele because they have the same length in bp, causing a homoplasy-like effect. The indels in the flanking regions could also have altered the expected number of nucleotide repeats at these loci, and probably caused the Micro-Checker program to interpret at least part of those readings as slippage.

The number of alleles per locus found at the Shc02 and Shc07 loci was similar to, and in some cases larger than, those found in previous studies on Dipterocarpaceae. At the Shc02 locus, 16 alleles were found in D. alatus, while only two were found in S. curtisii from Semangkok, Malaysia (Ujino et al. 1998), nine in Neobalanocarpus heimii from Pasoh, Malaysia (Konuma et al. 2000), seven in S. leprosula and six in S. ovalis ssp. sericea (Ng et al. 2004). Moreover, at this locus, seven and six alleles were found in Hopea dryobalanoides and S. parvifolia, respectively, while the locus was fixed in S. acuminata (Ng et al. 2006). The number of alleles found at the Shc07 locus (17) was higher than that found in N. heimii (11) and S. curtisii (9), but it was similar to the numbers observed in three populations of S. leprosula (17 ~ 20) and in S. ovalis ssp. sericea (13 ~ 18; Ng et al. 2004). The Shc07 locus was also variable in other Shorea species, and the number of alleles at several other microsatellite loci ranged from three to nine in three dipterocarp species, H. dryobalanoides, S. parvifolia and S. acuminata (Ng et al. 2006).

The Shc03 locus was reported to be variable in Shorea species (Ng et al. 2006; Takeuchi et al.

2004), but it was monomorphic in D. alatus (this study) and in H. dryobalanoides (Takeuchi et al.

2004). The Shc11 locus was also monomorphic in D. alatus, as it was in N. heimii (Konuma et al.

2000). However, in S. curtisii, the presence of four alleles was reported at this locus (Ujino et al.

1998).

The range of mean Ho over the two loci (0.550 ~ 0.700; Table 4) was similar to the average Ho over four loci found in N. heimii (0.675; Konuma et al. 2000). The range of mean He from this study (0.589 ~ 0.745) was also similar to that reported for N. heimii (0.775; Konuma et al. 2000), S. curtisii (Ujino et al. 1998), three Shorea species (0.700 ~ 0.800; Ng et al. 2006) and H. dryobalanoides (0.560 ~ 0.700; Takeuchi et al. 2004). Similar levels of observed and expected heterozygosity were also found in microsatellite analyses of non-dipterocarp tropical tree species, for example Melaleuca alternifolia (Ho = 0.724 and He = 0.781; Rossetto et al. 1999) and Symphonia globulifera (Ho = 0.604 ~ 0.833 and He = 0.760 ~ 0.827; Aldrich et al. 1998). However, the level of He observed in Japanese birch (Betula maximowicziana), a temperate tree species, was lower (mean He = 0.361) than those observed in tropical trees (Tsuda and Ide 2005). Thus, the levels of observed and expected heterozygosity found in this study were in the same range as those observed at microsatellite loci in other tropical tree species.

The generally lower values of Ho observed at the Shc07 locus in the Samui population, compared to those observed at the Shc02 locus, were probably caused by the presence of null alleles at the former locus (Table 4). Since the overall inbreeding coefficients (FIS) did not deviate significantly from the values expected under Hardy-Weinberg equilibrium, there is no indication that inbreeding has taken place in the recent past in D. alatus. Its mating system remains unknown, but self-incompatibility was reported in a related species, Dipterocarpus tempehes (Tanaka et al. 2002).

Genetic Distances and Population Differentiation

Nei’s (1978) genetic distances (Ds) and (δµ)² showed that, in pair-wise comparisons, the Hat Yai and Samui populations were the most distant, while the Samngao and Kuphrakona populations, both from mainland Thailand, were closer to each other (Table 7). Gene flow between the two mainland populations has probably taken place in the past, or they could have originated from the same sources after the LGM. The Samui population is located between mainland Thailand and Hat Yai, and the genetic distance between Samui and the other populations is concordant with its geographical location (Fig. 1; Table 7). The Hat Yai population was the most distant of all. It might have been isolated from other populations for a longer period of time than the populations from mainland Thailand and Samui. However, no significant correlation between genetic and geographical distances was observed. One possible cause of this result is the low number of populations used in this study (four). Moreover, a ‘leverage’ effect (Goodall 1993) might have negatively influenced the correlation, because Samui is not geographically distant from Hat Yai, in contrast to the large genetic distance observed between them (Table 7). The main issue is that Samui is an island, and the correlation did not take into account the geographical barrier represented by the sea, or possibly other factors. When either the Samui or the Hat Yai population was excluded from the analyses, the correlation obtained was positive but still not statistically significant (data not shown).

The genetic distances observed in this study were consistent with those reported for isozyme markers, where Samngao and Kuphrakona were genetically closer to each other, and Samngao and Hat Yai were genetically distant (Changtragoon 2001). In this study, the genetic distances between Kuphrakona and Hat Yai (Ds = 0.040, (δµ)² = 0.009) were lower than those between Samngao and Hat Yai (Ds = 0.142, (δµ)² = 0.085; Table 7). In the study of Changtragoon (2001), however, the genetic distance observed between the Kuphrakona and Hat Yai populations (Ds = 0.0104) was similar to that observed between Samngao and Hat Yai (Ds = 0.0109). Effects of homoplasy between the Kuphrakona and Hat Yai populations could explain the lower value of Ds observed in this study, compared to the Ds value observed for isozymes. Another indication that homoplasy could be the main cause of the low genetic distance between Kuphrakona and Hat Yai is the low level of population differentiation observed between them (FST = 0.012, RST = -0.014; Table 6). These two populations are separated by approximately 980 km, which most likely prevents any gene flow between them and greatly reduces the probability that they originated from the same sources (Fig. 1; Table 1).

The FST value over all populations and loci of 0.041 (Table 5), with a range of 0.012 ~ 0.065 (Table 6), was similar to that reported for Symphonia globulifera (FST = 0.031; Aldrich et al. 1998), but much lower than that reported for isozymes in a previous study on D. alatus (FST = 0.128; Changtragoon 2001). As with the unexpected Ds result between Kuphrakona and Hat Yai, effects of homoplasy could also explain the lower level of FST observed in this study compared to that obtained using isozymes. A comparison between the microsatellite results of this study and nucleotide variation at a nuclear gene (Pgi) can be made for the Kuphrakona and Samui populations. The FST value for this pair of populations obtained in this study was 0.037 (the same value was obtained for both FST and RST; Table 6). It was higher than, but of the same order of magnitude as, the value reported for the Pgi gene (0.014; Tsuida 2003). The results of Changtragoon and Boontawee (1999) and Changtragoon (2001) were based on much larger samples (> 40 individuals) than those used in the present study. Therefore, sampling differences between these two studies and the present study could be the cause of the discrepant results between isozymes and DNA markers. Alternatively, the discrepancy could be caused by the different ways in which isozyme markers, sequences of nuclear gene loci and microsatellites sample genetic diversity.

The overall RST (0.029) from this study (with a range of -0.014 ~ 0.046; Tables 5 and 6) was similar to but slightly lower than the overall FST (0.041). This indicates that there are more differences in allele frequencies than in allele sizes among these populations.

Phylogeography

In general, the populations from Hat Yai (Malay Peninsula) and the island of Samui showed higher levels of intra-population genetic variation than the mainland populations of Samngao and Kuphrakona (Table 2). This result may reflect the fact that Samui Island is located near a large putative Pleistocene refugium on the western side of the Malay Peninsula (Thomas 2000) that could have been Samui’s main source. However, the Samui population could also have received migrants from mainland Thailand and from gallery forests along putative rivers in the area that is today the Gulf of Thailand, which was dry land dominated by savannahs during the LGM (Gathorne-Hardy et al. 2002). This scenario could explain the intermediate position of Samui, between the two mainland populations and Hat Yai, in genetic distances (Table 7).

The Hat Yai population has probably originated from putative refugia in nearby mountainous areas of what is today Peninsular Malaysia (Gathorne-Hardy et al. 2002). This area could have included small refugia in which forest survived the Pleistocene along rivers, on mountain slopes, etc. (Delmar et al. 2000; Gathorne-Hardy et al. 2002; Thomas 2000; Thomas and Thorp 1995). The Samngao population, on mainland Thailand, exists today in the mountainous area near the border between Thailand and Myanmar, where forest refugia could have existed during the Pleistocene (Gathorne-Hardy et al. 2002; Thomas 2000). The Kuphrakona population, on the other hand, is located near Laos, in eastern mainland Thailand. The low levels of population differentiation and genetic distance observed between the Samngao and Kuphrakona populations suggest that these two populations could have originated from the same sources, or that the level of gene flow throughout mainland Thailand was relatively high, possibly favored by the dispersal of seeds along rivers such as the Mekong.

To date, knowledge of the genetic variation, structure and origins of D. alatus, which is necessary for its conservation, is very scarce, and further studies on this species are needed.

References

Aldrich PR, Hamrick JL, Chavarriaga P, Kochert G (1998) Microsatellite analysis of demographic
genetic structure in fragmented populations of the tropical tree Symphonia globulifera. Mol Ecol 7:933-
944

Appanah S, Chan HT (1981) Thrips: the pollinators of some dipterocarps. Malaysian Forester 44:234-
252

Bohonak AJ (2002) IBD (Isolation By Distance): A program for analyses of isolation by distance. J of
Heredity 93:153-154

Brookfield J (1996) A simple new method for estimating null allele frequency from heterozygote
deficiency. Mol Ecol 5:453-455

Callen DF, Thompson AD, Shen Y, Phillips HA, Richards RI, Mulley JC, Sutherland GR (1993)
Incidence and origin of "null" alleles in the (AC)n microsatellite markers. The American Journal of
Human Genetics 52:922-927

Changtragoon S (2001) Evaluation of genetic diversity of Dipterocarpus alatus genetic resources in
Thailand using isozyme gene markers. In: Thielges BT, D. SS, A. R (eds) In situ and Ex situ
Conservation of Commercial Tropical Trees. Tropical Timber Organization (ITTO) Project PD16/96
Rev.4 (F), Faculty of Forestry, GMU, Yogyakarta, Indonesia, p 573

Changtragoon S, Boontawee B (1999) The study of genetic diversity of Dipterocarpus alatus by
isoenzyme gene markers (English Abstract). Seminars on Dipterocarpus alatus and Dipterocarpaceae,
Kasetsart University, Bangkok, Thailand, pp 107-114

Changtragoon S, Szmidt AE (1993) An integrated population genetic approach to conserve forest tree
diversity in Thailand. In: ASEAN Institute of Forest Management KL, Malaysia (ed) Management and
conservation of biodiversity. ASEAN Institute of Forest Management, Kuala Lumpur, Malaysia

Charuppat T (1998) Forest Situation in the Past 37 years (1961-1998) [in Thai]. Royal Forest Dept,
Bangkok, p 116

Chase M, Kessel R, Bawa K (1996) Microsatellites markers for population and conservation genetics
of tropical trees. Am J Bot 83:51-57

Delmar M, Stergiopoulos K, Homma N, Calero G, Morley G, EkVitorin JF, Taffet SM (2000) A
molecular model for the chemical regulation of connexin43 channels: The ''ball-and-chain'' hypothesis.
Gap Junctions, pp 223-248

Doyle JJ, Doyle JL (1987) A rapid DNA isolation procedure for small quantities of fresh leaf tissue.
Phytochemical Bulletin 19:11-15

Elmousadik A, Petit RJ (1996) High level of genetic differentiation for allelic richness among
populations of the argan tree [Argania spinosa (L) Skeels] endemic to Morocco. Theor Appl Gen
92:832-839

Fernando P, Evans B, Morales J, Melnick D (2001) Electrophoresis artefacts – a previously
unrecognized cause of error in microsatellite analysis. Mol Ecol Notes 1:325-328

Gamage DT, de Silva MP, Inomata N, Yamazaki Y, Szmidt AE (2006) Comprehensive molecular
phylogeny of the sub-family Dipterocarpoideae (Dipterocarpaceae) based on chloroplast DNA
sequences. Genes Genet Syst 81:1-12

Gathorne-Hardy FJ, Jones DT, Syaukani (2002) A regional perspective on the effects of human
disturbance on the termites of Sundaland. Biodivers Conserv 11:1991-2006

Ghazoul J (1997) The pollination and breeding system of Dipterocarpus obtusifolius
(Dipterocarpaceae) in dry deciduous forests of Thailand. J Nat Hist 31:901-916

Goodall CR (1993) Computation using the QR decomposition. Elsevier, North-Holland, Amsterdam, NL

Goodman S (1997) RSTCalc: a collection of computer programs for calculating estimates of genetic
differentiation from microsatellite data and determining their significance. Mol Ecol 6:881-885

Goudet J (1995) FSTAT (Version 1.2): A computer program to calculate F-statistics. J Heredity
86:485-486

Hurlbert SH (1971) The nonconcept of species diversity: a critique and alternative parameters. Ecology
52

Kimura M, Crow JF (1964) The number of alleles that can be maintained in a finite population.
Genetics 49:725-738

Kimura M, Ohta T (1978) Stepwise mutation model and distribution of allelic frequencies in a finite
population. Proc Natl Acad Sci of the USA 75

Konuma A, Tsumura Y, Lee C-T, Lee S-L, Okuda T (2000) Estimation of gene flow in the tropical-
rainforest tree Neobalanocarpus heimii (Dipterocarpaceae). Mol Ecol 9:1843-1852

Koorey DJ, Bishop GA, McCaughan JW (1993) Allele non-amplification: a source of confusion in
linkage studies employing microsatellite polymorphisms. Hum Mol Gen 2:289-291

Lakanavichian S (2001) Forest Policy and History in Thailand. Working Paper No 9. Research Centre
on Forest and People in Thailand, p 63

Maguire T, Saenger P, Baverstock P, Henry R (2000) Microsatellite analysis of genetic structure in the
mangrove species Avicennia marina (Forsk.) Vierh. (Avicenniaceae). Mol Ecol 9:1853-1862

Miller MP (1997) Tools for Population Genetics Analysis (TFPGA). 1.3 edn. furnished by the author

Nei M (1978) Estimation of average heterozygosity and genetic distance from a small number of
individuals. Genetics 89:583-590

Ng KKS, Lee SL, Koh CL (2004) Spatial structure and genetic diversity of two tropical tree species
with contrasting breeding systems and different ploidy levels. Mol Ecol 13:657-669

Ng KKS, Lee SL, Saw LG, Plotkin JB, Koh CL (2006) Spatial structure and genetic diversity of three
tropical tree species with different habitat preferences within a natural forest. Tree Genet Genom
2:121-131

Pemberton J, Slate J, Bancroft DR, Barrett J (1995) Nonamplifying alleles at microsatellite loci: a
caution for parentage and population studies. Mol Ecol 4:249-252

Powell A, Rowley AF (2007) The effect of dietary chitin supplementation on the survival and immune
reactivity of the shore crab, Carcinus maenas. Comp Biochem Physiol PT A 147:122-128

Roa A, Chavarriaga-Aguirre P, Duque MC, Maya M, Bonierbale MW, Iglesias C, Tohme J (2000)
Cross-species amplification of cassava (Manihot esculenta; Euphorbiaceae) microsatellites: allelic
polymorphism and degree of relationship. Am J Bot 87:1647-1655

Rossetto M, Slade RW, Baverstock PR, Henry RJ, Lee LS (1999) Microsatellite variation and
assessment of genetic structure in tea tree (Melaleuca alternifolia Myrtaceae). Mol Ecol 8:633-643

Slatkin M (1995) A measure of population subdivision based on microsatellite allele frequencies.
Genetics 139:457-462

Smitinand T, Santisuk T (1981) Dipterocarpaceae of Thailand with special reference to silvicultural
ecology. Malaysian Forester 44:377-385

Smitinand T, Thawatchai S, Phengklai C (1980) The manual of Dipterocarpaceae of mainland south-east
Asia. Royal For Dept, Bangkok

Takeuchi Y, Ichikawa S, Konuma A, Tomaru N, Niiyama K, Lee SL, Muhammad N, Tsumura Y
(2004) Comparison of the fine-scale genetic structure of three dipterocarp species. Heredity 92:323–328

Tanaka K, Shimizu K, Nakagawa M, Okada K, Hamid AA, Nakashizuka T (2002) Multiple factors
contribute to outcrossing in a tropical emergent Dipterocarpus tempehes, including a new pollen-tube
guidance mechanism for self-incompatibility. Am J Bot 89:60-66

Thomas MF (2000) Late Quaternary environmental changes and the alluvial record in humid tropical
environments. Quatern Intl 72:23-26

Thomas MF, Thorp MB (1995) Geomorphic response to rapid climatic and hydrologic change during
the Late Pleistocene and Early Holocene in the humid and sub-humid tropics. Quatern Sci Reviews
14:193-207

Tsuda Y, Ide Y (2005) Wide-range analysis of genetic structure of Betula maximowicziana, a long-
lived pioneer tree species and noble hardwood in the cool temperate zone of Japan. Mol Ecol 14:3929-
3941

Tsuida Y (2003) Nucleotide Polymorphism in Two Geographically Isolated Populations of
Dipterocarpus alatus. Dept Biol. Kyushu University, Fukuoka, p 11

Tuttle AM, Gauley J, Chan N, Heikkila JJ (2007) Analysis of the expression and function of the small
heat shock protein gene, hsp27, in Xenopus laevis embryos. Comp Biochem Physiol PT A 147:112-121

Ujino T, Kawahara T, Tsumura Y, Nagamitsu T, Yoshimaru H, Ratnam W (1998) Development and
polymorphism of simple sequence repeat DNA markers for Shorea curtisii and other Dipterocarpaceae
species. Heredity 81:422-428

Van Oosterhout C, Hutchinson WF, Wills DPM, Shipley P (2004) MICRO-CHECKER: software for
identifying and correcting genotyping errors in microsatellite data. Mol Ecol Notes 4:535-538

Weir BS, Cockerham CC (1984) Estimating F-statistics for the analysis of population structure.
Evolution 38:1358-1370

Wright SE (1949) Genetics of Populations. Encyclopaedia Britannica, 14 edn, London, pp 111–112

Yeh FC, Boyle TJB (1997) Population genetic analysis of co-dominant and dominant markers and
quantitative traits. Belgian J Bot 129:157

Table 1. List of the investigated populations of Dipterocarpus alatus and sample sizes (n)

Population               n      Long.             Lat.
(1) Samngao, Tak         20     99°03’00” E       17°20.5’00” N
(2) Kuphrakona, Roiet    33     103°50’00” E      15°34’00” N
(3) Hat Yai, Songkhla    25     100°26’36” E      07°28’00” N
(4) Samui Island         85     ≈100°00’00” E     09°35’00” N
Total # of indiv.        163

Table 2. Total number of alleles (Ao) observed within populations and allelic richness (RS) per locus and population based on minimum sample size of 20 diploid individuals

Locus    Samngao        Kuphrakona     Hat Yai        Samui
         Ao   RS        Ao   RS        Ao   RS        Ao   RS
Shc02    6    6.000     6    5.390     9    8.157     12   8.609
Shc07    5    5.000     7    6.434     6    5.563     16   10.175

Table 3. Inbreeding coefficient (FIS) per population

Locus      Samngao    Kuphrakona    Hat Yai    Samui
Shc02      -0.191     -0.25         -0.031     -0.184
Shc07      0.233      0.102         0.113      0.281 *
Overall    0.067      -0.037        0.047      0.061

* = significant at 0.05 level

Table 4. Levels of heterozygosity within populations. Ho = observed heterozygosity; He = expected
heterozygosity; s.d. = standard deviation

Population    Shc02            Shc07            Overall loci
              Ho      He       Ho      He       Ho      s.d.     He      s.d.
Samngao       0.550   0.464    0.550   0.713    0.550   0        0.589   0.176
Kuphrakona    0.636   0.511    0.697   0.775    0.667   0.043    0.643   0.187
Hat Yai       0.640   0.621    0.640   0.720    0.640   0        0.671   0.070
Samui         0.835   0.706    0.565   0.784    0.700   0.191    0.745   0.055
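
As a rough cross-check between Tables 3 and 4, the inbreeding coefficient can be approximated from the heterozygosities as F = 1 - Ho/He. The sketch below applies that simple ratio to the per-locus values of Table 4. It is an illustration only: the function names are hypothetical, and the ratio is an assumption, not the estimator used for Table 3, which appears to be an unbiased one (Weir and Cockerham 1984 is in the reference list). The two are therefore expected to agree only approximately.

```python
# Observed (Ho) and expected (He) heterozygosity per locus, copied from Table 4.
TABLE4 = {
    "Samngao":    {"Shc02": (0.550, 0.464), "Shc07": (0.550, 0.713)},
    "Kuphrakona": {"Shc02": (0.636, 0.511), "Shc07": (0.697, 0.775)},
    "Hat Yai":    {"Shc02": (0.640, 0.621), "Shc07": (0.640, 0.720)},
    "Samui":      {"Shc02": (0.835, 0.706), "Shc07": (0.565, 0.784)},
}


def approx_fis(ho, he):
    """Simple ratio estimate F = 1 - Ho/He (an assumed shortcut, not the
    Weir and Cockerham estimator)."""
    return 1.0 - ho / he


def fis_table(table):
    """Apply approx_fis to every population and locus, rounded to 3 decimals."""
    return {
        pop: {locus: round(approx_fis(ho, he), 3) for locus, (ho, he) in loci.items()}
        for pop, loci in table.items()
    }


if __name__ == "__main__":
    for pop, loci in fis_table(TABLE4).items():
        print(pop, loci)
```

With the Table 4 values, every approximate value has the same sign as the corresponding entry in Table 3 and lies within about 0.01 of it: negative (heterozygote excess) at Shc02 and positive (heterozygote deficit) at Shc07 in all four populations.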

Table 5. Genetic diversity parameters over all populations: Ao = total number of alleles; RS = allelic
richness; FIS = inbreeding coefficient; FST and RST (Rho) = population differentiation based
on IAM and SMM respectively (see text for explanation). All parameters are unbiased.

Locus      Ao      RS       FIS        FST        RST
Shc02      16      8.639    -0.171     0.025 *    0.004
Shc07      17      9.258    0.215 *    0.053 *    0.055 *
Overall    n.a.    n.a.     0.080      0.041 *    0.029 *

* = significant at 0.05 level; n.a. = not applicable.

Table 6. Population differentiation: RST (above diagonal) and FST (below diagonal)

              Samngao    Kuphrakona    Hat Yai    Samui
Samngao       –          0.013         0.046 *    0.016
Kuphrakona    0.014      –             -0.014     0.037 *
Hat Yai       0.065 *    0.012         –          0.042 *
Samui         0.032      0.037 *       0.061 *    –

* = significant at 0.05 level

Table 7. Genetic distances: (δμ)² (above diagonal) and Nei’s Ds (1978) (below diagonal)

              Samngao    Kuphrakona    Hat Yai    Samui
Samngao       –          0.058         0.085      0.041
Kuphrakona    0.038      –             0.009      0.108
Hat Yai       0.142      0.040         –          0.101
Samui         0.065      0.095         0.188      –

Figure 1. Map of Thailand showing locations of the four investigated populations

Figure 2. Polymorphic sites at the Shc02 (Fig. 2a) and Shc07 (Fig. 2b) loci (only polymorphic sites
are shown). Dashes (–) represent indels. Position 149 bp in Fig. 2a and positions 145-150 bp in
Fig. 2b are indels in flanking regions. Fig. 2b also shows a nucleotide substitution at 144 bp (G/A).
Other nucleotide positions belong to SSRs.

Figure 2a (Shc02 locus)

Figure 2b (Shc07 locus)

Acknowledgements

I would like to thank my supervisor, Dr. A. E. Szmidt, for his continuous guidance, support and comments throughout my study; my former supervisor, Dr. T. Yamazaki, for his comments, advice and support; Drs. N. Inomata and E. Nitasaka for their valuable teaching and comments; and Drs. D. T. Gamage, S. Sirikantaramas, N. Nikandrov, H. Goto, H. Ishiyama, M. Iwasaki, K. K. G. U. Hemamali, I. Khatab, Mr. R. Yamauchi and Ms. A. Saitoh for their assistance in my experiments.

I wish to thank Dr. Ove Martinsson (JiLU, Bispgården, Sweden) and Dr. Katsuhiko Takata (Institute of Wood Technology, Akita Prefectural University, Japan) for providing seed samples of Larix spp.; Dr. Vladimir L. Semerikov (Institute of Plant and Animal Ecology, Ural Division, Russian Academy of Sciences, Yekaterinburg, Russia) for help in obtaining literature related to morphological studies on L. sukaczewii and L. sibirica; and Dr. S. Changtragoon (Forest Genetics and Biotechnology Division, Forest and Plant Conservation Research Office, National Park, Wildlife and Plant Conservation Department, Thailand) for providing part of the samples of Dipterocarpus alatus.

I also wish to express my gratitude for the invaluable support I received from Drs. I. Emmanuel, P. S. Martins, R. Vencovsky, R. D. d’Arce, N. Takahata, Y. Shimamoto, J. Nason, X.-R. Wang, R. Leung, and Messrs. J. L. A. Furlan and M. C. Mathias. Above all, I offer my deepest gratitude to my loving parents for their dedication to my life and encouragement.

My research was partly supported by Grant No. 16-260 from the Sasakawa Scientific Research Grant, The Japan Science Society, and by Grants No. 13575002 and 17405032 from the Ministry of Education, Culture, Sports, Science and Technology of Japan.
