
Review

TRENDS in Biotechnology

Vol.23 No.1 January 2005

Genic microsatellite markers in plants: features and applications


Rajeev K. Varshney¹, Andreas Graner¹ and Mark E. Sorrells²

¹ Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstrasse 3, D-06466 Gatersleben, Germany
² Department of Plant Breeding, Cornell University, Ithaca, NY 14853, USA

Expressed sequence tag (EST) projects have generated a vast amount of publicly available sequence data from plant species; these data can be mined for simple sequence repeats (SSRs). These SSRs are useful as molecular markers because their development is inexpensive, they represent transcribed genes and a putative function can often be deduced by a homology search. Because they are derived from transcripts, they are useful for assaying the functional diversity in natural populations or germplasm collections. These markers are valuable because of their higher level of transferability to related species, and they can often be used as anchor markers for comparative mapping and evolutionary studies. They have been developed and mapped in several crop species and could prove useful for marker-assisted selection, especially when the markers reside in the genes responsible for a phenotypic trait. Applications and potential uses of EST-SSRs in plant genetics and breeding are discussed.

The analysis of DNA sequence variation is of major importance in genetic studies. In this context, molecular markers are a useful tool for assaying genetic variation, and have greatly enhanced the genetic analysis of crop plants. A variety of molecular markers, including restriction fragment length polymorphisms (RFLPs), random amplification of polymorphic DNAs (RAPDs), amplified fragment length polymorphisms (AFLPs) and microsatellites or simple sequence repeats (SSRs), have been developed in different crop plants [1,2]. Among different classes of molecular markers, SSR markers are useful for a variety of applications in plant genetics and breeding because of their reproducibility, multiallelic nature, codominant inheritance, relative abundance and good genome coverage [3].
SSR markers have been useful for integrating the genetic, physical and sequence-based physical maps in plant species, and simultaneously have provided breeders and geneticists with an efficient tool to link phenotypic and genotypic variation (for review, see [4]). With the establishment of expressed sequence tag (EST) sequencing projects for gene discovery programs in several plant species, a wealth of DNA sequence information has been generated and deposited in online databases [5]. In addition, sequence data for many fully
Corresponding author: Rajeev K. Varshney (rajeev@ipk-gatersleben.de or rajeevkvarshney@hotmail.com). Available online 25 November 2004

characterized genes and full-length cDNA clones have been generated for some plant species such as rice [6]. Using computer programs, the sequence data for ESTs, genes and cDNA clones can be downloaded from GenBank and scanned for identification of SSRs, which are typically referred to as EST-SSRs or genic microsatellites (Figure 1). Subsequently, locus-specific primers flanking EST- or genic SSRs can be designed to amplify the microsatellite loci present in the genes. Thus, the generation of (genic) SSR markers is relatively easy and inexpensive because they are a byproduct of the sequence data from genes or ESTs that are publicly available. However, the generation of genic SSR markers is largely limited to those species or close relatives for which there is a sufficiently large number of ESTs available. Genic SSRs have some intrinsic advantages over genomic SSRs because they are quickly obtained by electronic sorting, and are present in expressed regions of the genome. The usefulness of these genic SSRs also lies in their expected transferability because the primers are designed from the more conserved coding regions of the genome. Because of the advantages of genic SSR markers over genomic SSR markers and the public availability of large quantities of sequence data, genic SSRs have been identified, developed and used in a variety of studies, for several plant species. In this article, we review the current status of research on genic microsatellites in plants and present a critical appraisal of the relative use of genic SSRs and genomic SSRs for specific purposes, showing a shifting paradigm in microsatellite research for crop breeding with a particular emphasis on cereals.

Identification, frequency and distribution of genic SSRs

Identification of SSRs in gene sequences of plant species was carried out as early as 1993 by Morgante and Olivieri [7].
However, at that time the volume of sequence data available for SSR analysis was limited (<5000 kb) and therefore only a few genic SSRs were reported. Only one SSR per 64.6 kb in monocotyledonous and one per 21.2 kb in dicotyledonous species were identified [8]. Subsequently, the sudden increase in the volume of sequence data generated from EST projects in several plant species facilitated the identification of genic SSRs in large numbers. For the identification of SSRs in publicly available EST and gene sequences, regular expression matching or BLASTN tools were initially used in the FASTA or BLAST2 formatted sequences [9,10]. Subsequently, several Perl scripts, search

www.sciencedirect.com 0167-7799/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibtech.2004.11.005


[Figure 1: flowchart. Sequence sources (characterized and annotated genes; full-length cDNA clones; shotgun-sequenced ESTs as singletons, unigenes and tentative consensi) feed public databases such as NCBI(a) and EMBL(b). The available sequence data from genes or ESTs then pass through database mining (identification of SSRs in sequence data of ESTs or genes), primer designing for genic SSRs and amplification of genic loci. Applications: functional genomics; association mapping; diversity analysis; genome mapping; transferability and comparative mapping; gene tagging and QTL analysis.]

Figure 1. A schematic representation of the development and application of genic simple sequence repeat (SSR) markers. (a) NCBI, National Center for Biotechnology Information, Bethesda, MD, USA (http://www.ncbi.nih.gov/); (b) EMBL, European Molecular Biology Laboratory, Heidelberg, Germany (http://www.embl-heidelberg.de/). These databases can be used to download publicly available ESTs or sequence data for a plant species. Abbreviation: QTL, quantitative trait loci.

modules or programs have been developed for recognition of SSR patterns in the sequence files (Table 1). Among different programs available in the public domain, the MIcroSAtellite (MISA) search module has some features that are useful for EST quality control and for designing the primer pairs for EST-SSRs in a batch file [11] (see also http://pgrc.ipk-gatersleben.de/misa/). MISA has been used in several studies, in different laboratories [11–15]. Another SSR finder, called Sputnik, has the useful feature of enabling the user to specify the percent imperfection allowed in the SSR [16] (see also C. Abajian; http://abajian.net/sputnik/index.html), and Perl scripts have been written to facilitate routing the output to a relational database and batch primer design for Primer3 (http://wheat.pw.usda.gov/ITMI/EST-SSR/LaRota/).
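The regular-expression approach to SSR identification mentioned above can be illustrated with a minimal Python sketch. It is not the algorithm of MISA, Sputnik or any other tool in Table 1; the function name `find_ssrs`, the default thresholds (six dinucleotide, five trinucleotide and four tetranucleotide repeats) and the example sequence are all assumptions chosen for illustration, and only perfect repeats are detected.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return perfect SSRs as (start, motif, repeat_count) tuples.

    min_repeats maps motif length to the minimum number of tandem copies.
    The defaults below are hypothetical search criteria, not a standard.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands n further tandem copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. AGAG, AAA),
            # so a dinucleotide run is not also reported as a tetranucleotide.
            if any(unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len)):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)

# Made-up EST fragment: an (AG)8 run and a (CTG)6 run in unique flanking sequence.
est = "ATGGC" + "AG" * 8 + "TTCACC" + "CTG" * 6 + "AATCG"
print(find_ssrs(est))                 # [(5, 'AG', 8), (27, 'CTG', 6)]
print(find_ssrs(est, {2: 10, 3: 7}))  # [] : stricter criteria find nothing
```

The second call shows why reported SSR frequencies are sensitive to the search criteria: the same sequence yields two SSRs or none, depending only on the minimum repeat number chosen.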
Table 1. Tools for database mining
Script or program | Refs
MIcroSAtellite (MISA) | http://pgrc.ipk-gatersleben.de/misa/; [11]
SSRFinder | [21]
BuildSSR | [23]
SSR Identification Tool (SSRIT) | [24]
Tandem Repeat Finder (TRF) | [59]
Tandem Repeat Occurrence Locator (TROLL) | [60]
CUGIssr | http://www.genome.clemson.edu/projects/ssr/
Sputnik | C. Abajian; http://abajian.net/sputnik/index.html
Modified Sputnik | [16]
Modified Sputnik II | http://wheat.pw.usda.gov/ITMI/EST-SSR/LaRota/
SSRSEARCH | ftp://ftp.gramene.org/pub/gramene/software/scripts/ssr.pl

Because limited genomic sequence data are available for many plant species, EST databases have been screened for the development of genic SSRs. For example, ESTs have been scanned for the presence of SSRs in Arabidopsis [16], cotton [17,18], Festuca species [19], grapes [9], Medicago species [20], soybean [21], sugarcane [22], spruce [23] and cereals including barley [11–13,17,24], maize [13,16,21,24], rice [10,13,16,17,24], rye [13,15,25], sorghum [13,24] and wheat [13,16,17,21,24,26,27]. The abundance of SSRs (perfect and imperfect) in unigenes can range from 1 in every 100 to 1 in every 2 unigenes of rice, depending on the minimum length of the SSR repeat motif (M. La Rota et al., unpublished). Varshney et al. [13] estimated the density of SSRs in expressed regions for 75.2 Mb of barley, 54.7 Mb of maize,


43.9 Mb of rice, 3.7 Mb of rye, 41.6 Mb of sorghum and 37.5 Mb of wheat and found the overall average density of (redundant) SSRs to be 1 per 6.0 kb. Higher frequencies, however, were reported by Morgante et al. [16], with 2.1, 1.1 and 1.3 per kb for rice, maize and wheat, respectively. In this context, it is important to note that the overall frequency and the frequency of different lengths of SSRs and repeat motifs depend on the criteria used to identify SSRs in the database mining, and therefore varied widely in different studies. In wheat, for example, the frequency of SSRs in ESTs has been reported as 1 in 6.2 kb [13], 1 in 0.74 kb [16], 1 in 17.42 kb [21], 3.2% or ~1 in 1 kb [24], 1 in 9.2 kb [26] and 7.5% of the contigs [27]. In some of these studies, a redundant set of SSRs (see below) was taken into account for estimating the SSR frequencies [13,21,24,26]. Furthermore, different SSR search tools (MISA [13]; Sputnik [16]; SSRFinder [21]; SSRIT [24]; macro [26]; SSRSEARCH, TRF and RepeatMasker [27]) with different SSR search criteria and different datasets were used for EST database mining. In general, when the minimum repeat length is 20 bp, SSRs of various plant species are present in ~5% of the ESTs (for examples, see http://www.genome.clemson.edu/projects/ssr). Trinucleotide repeats (TNRs) are the most common, followed by either dinucleotide repeats (DNRs) or tetranucleotide repeats (TTNRs), depending on the report. For example, Varshney et al. [13] reported that among cereal species, TNRs were the most frequent (54–78%) followed by DNRs (17.1–40.4%) and TTNRs (3–6%). Frequencies and distribution of different repeat motifs varied substantially in studies by Morgante et al. [16], Gao et al. [21] and Kantety et al. [24]. In these reports, wheat TNRs ranged from 49% to 83% but DNRs and TTNR frequencies were similar.
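The two summary statistics used in this comparison, SSR density (kb of sequence per SSR) and the share of each repeat class, are simple to compute once a list of SSR motifs is available. The sketch below is illustrative only: `summarize_ssrs` is a hypothetical helper, and the motif list and the 60 kb of sequence are invented numbers, not data from any of the cited studies.

```python
from collections import Counter

# Repeat classes as abbreviated in the text, keyed by motif length.
CLASS_NAMES = {2: "DNR", 3: "TNR", 4: "TTNR"}

def summarize_ssrs(motifs, total_bp):
    """Return (kb of sequence per SSR, {repeat class: percent of SSRs})."""
    if not motifs:
        raise ValueError("no SSRs to summarize")
    counts = Counter(CLASS_NAMES.get(len(m), "other") for m in motifs)
    kb_per_ssr = (total_bp / 1000) / len(motifs)
    shares = {name: 100 * n / len(motifs) for name, n in counts.items()}
    return kb_per_ssr, shares

# Invented example: 10 SSRs found in 60 kb of EST sequence.
motifs = ["CTG"] * 6 + ["AG"] * 3 + ["AATC"]
density, shares = summarize_ssrs(motifs, total_bp=60_000)
print(density)  # 6.0, i.e. one SSR per 6.0 kb
print(shares)   # {'TNR': 60.0, 'DNR': 30.0, 'TTNR': 10.0}
```

Because both numbers depend on which SSRs enter the list, running such a summary on a redundant EST set, or with different minimum repeat lengths, would be expected to shift the results in the way the studies above differ from one another.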
Proportions of the different rice SSRs were similar in these reports but for maize there were large differences, many of which could be attributed to different methods of screening and analyzing SSRs and to differences in the sources of DNA sequence. In a recent survey, the proportions of DNRs, TNRs and TTNRs and motifs observed varied with the length of the SSRs within and among barley, wheat and rice (M. La Rota et al., unpublished). Yu et al. [14] reported that 74% of the TNRs were found in coding regions, 20% in 5′ UTRs and 6% in 3′ UTRs. By contrast, only 19% of the DNRs were in coding regions and 42% and 39% were in 5′ and 3′ UTRs, respectively. The abundance of trimeric SSRs in ESTs was attributed to the absence of frameshift mutations in coding regions when there is length variation in these SSRs [28]. Also, among the TNRs, codon repeats corresponding to small hydrophilic amino acids are perhaps more easily tolerated, and selection pressure probably eliminates codon repeats encoding hydrophobic and basic amino acids [29].

Some inherent issues of genic SSRs

Redundancy

Large-scale EST sequencing projects have been performed for several plant species [5,30]. However, random or shotgun sequencing within cDNA libraries leads to a high proportion of redundant ESTs [31]. For development of unique genic SSR markers, a nonredundant EST

dataset (after clustering the redundant set of ESTs and defining the unigene set) should be used for identification and development of EST-SSR markers. In some studies, the redundant EST dataset has been scanned first for the presence of ESTs containing SSRs (SSR-ESTs) and then the smaller dataset of redundant SSR-ESTs has been used to identify nonredundant SSR-ESTs or EST-SSRs after clustering and defining the unigene SSR-ESTs [11–13,24]. The frequency of SSRs in nonredundant ESTs (or SSR-ESTs) more accurately reflects the density of SSRs in the transcribed portion of the genome. Alternatively, all available ESTs can be assembled and consensus sequences from unigene datasets such as the gene indices at The Institute for Genomic Research (TIGR; http://www.tigr.org/tdb/tgi/) or other sources can be used for proper development of nonredundant marker sets [24,27].

Robustness and high-quality markers

The practical use of polymerase chain reaction-based markers, especially in germplasm analysis, in which data integration and comparison are crucial, requires that each SSR marker be validated for quality and robustness of the amplification product. However, a portion of genomic SSRs, developed in the past, have produced faint bands or stuttering, as observed in wheat [32] and barley [33]. By contrast, SSR markers derived from genes have produced a high proportion of high-quality markers with strong bands and distinct allelic peaks in most reports [11,12,14,19,27,34,35]. High quality and robustness of amplification patterns, along with other merits (see later) associated with EST-SSR markers, enhance their value, especially for germplasm characterization.

Amplification rate and null alleles

Primer design is not an exact science, and a success rate of 60–90% amplification for both genomic and EST-SSRs has been reported in different studies [11,12,14,19,22,26].
Possible explanations for failed amplification include: (i) one or both primers of the EST-SSR extend across a splice site; (ii) the presence of large introns in genomic DNA sequence; (iii) the use of questionable sequence information for primer development; and (iv) primers were derived from chimeric cDNA clones. Thus, the quality of the SSR-EST sequence for designing the primer pairs is important. In a survey, up to 9% of cereal ESTs were of low quality [30] and should be rejected for designing primer pairs for EST-SSRs [11]. Furthermore, compared with genomic SSRs, amplicon size more frequently deviated from expectation [11,12,14,22,27]. This deviation probably results from the presence of introns and insertions–deletions (in-dels) in the corresponding genomic sequence, as was substantiated by sequence analysis [19]. Large in-dels (20+ bp) in the SSR-ESTs can alter amplicon size sufficiently to enable visualization of polymorphism on agarose gels, which, compared with the use of acrylamide gels, significantly reduces costs and increases throughput [14]. Null alleles (alleles that do not give a polymerase chain reaction product) were observed by using EST-SSR markers in studies on kiwifruit [36], rice [34], spruce [23] and wheat [26,35]. In wheat, occurrence of null alleles is


common and has been reported earlier using genomic SSRs (for references, see [4]). Occurrence of null alleles can be explained by: (i) deletion of the microsatellite at a specific locus [37]; or (ii) mutations (in-dels or substitutions) in the primer binding site [38]. Occurrence of null alleles complicates the interpretation of segregation data because heterozygotes cannot be identified and reaction failures cannot be detected. The latter can result in deviation from the expected Mendelian segregation ratios [36].

Level of polymorphism

EST-SSR markers have been reported to be less polymorphic than genomic SSRs in crop plants because of greater DNA sequence conservation in transcribed regions [9,23,34,35,39,40]. It is noteworthy that for detection of polymorphism, EST-SSRs derived from 3′-ESTs were found to be superior to those derived from 5′-ESTs [9,21,26,41]. Owing to the process of cDNA generation (poly-T priming), there is a preferential selection of untranslated regions (UTRs) within 3′-ESTs, resulting in more variation than in 5′-ESTs. Scott et al. [9] also reported that there were polymorphism differences among microsatellites derived from the 3′ UTR (most polymorphic at cultivar level), the 5′ UTR (most polymorphic between cultivar and species) and microsatellites within the coding sequence (most polymorphic between species and genera). Interestingly, in a recent study on identification and genome mapping of EST-SSRs in kiwifruit (Actinidia spp.), 93.5% of the markers were polymorphic and segregating in a mapping population derived from the intraspecific cross between two genotypes of diploid Actinidia chinensis [36]. Saha et al. [19] reported that 66% of the tall fescue-derived EST-SSR primer pairs were polymorphic between parents of tall fescue and ryegrass populations, and 43% and 38% of these were polymorphic in rice and wheat, respectively.
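Deviation from expected Mendelian segregation ratios, noted earlier in this section as one consequence of null alleles, is commonly checked with a chi-square goodness-of-fit test. The following Python sketch shows the calculation under stated assumptions: the function name `chi_square_segregation` and the genotype counts are hypothetical, and the critical values are the standard 5% thresholds for 1 and 2 degrees of freedom. It does not reproduce the analysis of any cited study.

```python
# Chi-square critical values at the 5% level, keyed by degrees of freedom.
CRITICAL_0_05 = {1: 3.841, 2: 5.991}

def chi_square_segregation(observed, expected_ratio):
    """Chi-square statistic for observed class counts against a Mendelian ratio."""
    if len(observed) != len(expected_ratio):
        raise ValueError("observed and expected_ratio must have equal length")
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    chi2 = 0.0
    for obs, part in zip(observed, expected_ratio):
        expected = total * part / ratio_sum
        chi2 += (obs - expected) ** 2 / expected
    return chi2

# Invented F2 counts for a codominant marker (expected 1:2:1).
# An apparent deficit of heterozygotes is one pattern a null allele can produce.
f2_counts = (40, 30, 30)
chi2 = chi_square_segregation(f2_counts, (1, 2, 1))
df = len(f2_counts) - 1
print(chi2)                      # 18.0
print(chi2 > CRITICAL_0_05[df])  # True: the 1:2:1 ratio is rejected at the 5% level
```

For doubled haploid or recombinant inbred line populations the same function can be called with an expected ratio of (1, 1) and one degree of freedom.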
Applications

Genetic mapping

Microsatellite markers, developed from genomic libraries, can belong to either the transcribed region or the nontranscribed region of the genome, and rarely is there information available regarding their functions. By contrast, genic microsatellite markers often have known or putative functions and are gene-targeted markers with the potential of representing functional markers in those cases where polymorphisms in the repeat motifs affect the function of the gene in which they reside [42]. Putative functions for a significant proportion of EST-SSR markers have been reported [11,14,18,43]. EST-SSR markers are one class of marker that can contribute to direct allele selection, if they are shown to be completely associated or even responsible for a targeted trait [44]. For example, recently, a Dof homolog (DAG1 gene that showed a strong effect on seed germination in Arabidopsis [45]) has been mapped on chromosome 1B of wheat by using wheat EST-SSR primers [43]. Similarly, Yu et al. [14] identified two EST-SSR markers linked to the photoperiod response gene (ppd) in wheat. Finally, mapping candidate genes can

facilitate genome alignment across distantly related species [46,47]. In recent years, the EST-SSR loci have been integrated, or genome-wide genetic maps have been prepared, in several plant (mainly cereal) species (Table 2). A large number of genic SSRs have been placed on the genetic maps of wheat [14,27,41,43]. EST-SSRs have been mapped as a part of the transcript map of barley (R.K. Varshney et al., unpublished) [11]. Unlike genomic SSRs, genic microsatellite markers were not clustered around the centromere but, as expected, were concentrated in gene-rich regions [11,14,43]. It is believed that the distribution of genic SSRs in the genetic maps mirrors the distribution of genes along the genetic map. In some earlier reports dealing with genomic SSRs, microsatellite markers were associated with repetitive DNA or retrotransposons [4,48]; however, recent reports indicated that they are predominantly associated with nonrepetitive DNA (M. La Rota et al., unpublished) [16,49].

Functional diversity

Characterization of genetic variation within natural populations and among breeding lines is crucial for effective conservation and exploitation of genetic resources for crop improvement programs. Molecular markers have proven useful for assessment of genetic variation in germplasm collections [50]. Evaluation of germplasm with SSRs derived from genes or ESTs might enhance the role of genetic markers by assaying the variation in transcribed and known-function genes, although there is a higher probability of bias owing to selection. Expansion and contraction of SSR repeats in genes of known function can be tested for association with phenotypic variation or, more desirably, biological function [51].
The presence of SSRs in the transcripts of genes suggests that they might have a role in gene expression or function; however, it remains to be seen whether any unusual phenotypic variation might be associated with the length of SSRs in coding regions, as was reported for several diseases in humans [49,52]. It has been shown that variation in repeat units of SSRs: (i) in the 5′ UTR affects gene transcription and/or translation; (ii) in the coding region can inactivate or activate genes or truncate the protein; and (iii) in the 3′ UTR might be responsible for gene silencing or transcription slippage. However, the function of genes that contain SSRs and the role of the SSR motif in the function of the plant genes are poorly understood. In a computational study, microsatellite markers in the transcribed regions of rice and Arabidopsis were more frequently found in the 5′ UTRs than in coding regions or 3′ UTRs, suggesting that they can potentially function as factors in regulating gene expression [53]. In an experimental study in rice, variation in the number of GA or CT repeats in the 5′ UTR of the waxy gene was correlated with amylose content [51,54]. Similarly, microsatellite markers (CCG)n in 5′ UTRs of some ribosomal protein genes of maize were believed to be involved in the regulation of fertilization [55]. Thus, the mechanisms found in human or animal systems might also have a role in generating phenotypic diversity in plant species. However, the variation associated


Table 2. Genome mapping using genic simple sequence repeat (SSR) markers
Plant species | Number of genic SSR loci mapped | Mapping population used | Refs
Barley | 185 | 3 DHs (Igri × Franka, Steptoe × Morex, OWBDom × OWBRec) | [11], R.K. Varshney et al. unpublished
Barley | 39 | F2s (Lerche × BGR41936), DHs (Igri × Franka), wheat–barley addition lines | [56]
Cotton | 111 | BC1 lines [(TM1 × Hai7124) × TM1] | [18]
Kiwifruit | 138 | Intraspecific cross | [36]
Raspberry | 8 | Full-sib family (Glen Moy × Latham) | [61]
Rice | 91 | DHs (IR64 × Azucena), RILs (Milyang 23 × Gihobyeo, Lemont × Teqing, BS125 × WL02) | [10]
Rye | 39 | 4 mapping populations derived from reciprocal crosses (P87 × P105, N6 × N2, N7 × N2, N7 × N6) | [15]
Ryegrass | 91 | Three-generation population (Floregon × Manhattan) | [62]
Tall fescue (Festuca spp.) | 91 | Pseudo-test cross-population (HD2856 × R4364) | [57]
Wheat | 149 | RILs (W7984 × Opata85) | [14]
Wheat | 126 | RILs (W7984 × Opata85) | [27]
Wheat | 101 | RILs (W7984 × Opata85, Wenmai 6 × Shanhongmai), DHs (Lumai14 × Hanxuan 10) | [43]
White clover | 449 | Pseudo-test cross-population (6525/5 × 364/7) | [63]

Abbreviations: BC1, backcross population; DHs, doubled haploids; RILs, recombinant inbred lines.

with deleterious characters is less likely to be represented in the germplasm collections of crop species than among natural populations because undesirable mutations are commonly culled from agricultural populations [34]. Several studies have found that genic SSRs are useful for estimating genetic relationships (Table 3), and at the same time provide opportunities to examine functional diversity in relation to adaptive variation [35,40]. In comparison to genomic SSRs, genic SSRs revealed less polymorphism (low polymorphic information content value) in germplasm characterization and genetic diversity studies [9,11,34,35,39,40,56].

Transferability and comparative mapping

Perhaps the most important feature of the genic SSR markers is that these markers are transferable among distantly related species, whereas the genomic SSRs are not suitable for this purpose. Transferability of such

markers to related species or genera has been demonstrated in several studies (Table 4). Recently, the potential use of EST-SSRs developed for barley and wheat has been demonstrated for comparative mapping in wheat, rye and rice [46,47]. These studies suggested that EST-SSR markers could be used in related plant species for which little information is available on SSRs or ESTs. In addition, the genic SSRs are good candidates for the development of conserved orthologous markers for genetic analysis and breeding of different species. For example, a set of 12 barley EST-SSR markers was identified that showed significant homology with the ESTs of four monocotyledonous species (wheat, maize, sorghum and rice) and two dicotyledonous species (Arabidopsis and Medicago) and could potentially be used across these species [47]. Two issues of importance for cross-species utilization are frequency of amplification for a given set of primers

Table 3. Utilization of genic simple sequence repeat (SSR) markers for estimation of genetic diversity
Plant species: Alpine lady-fern; Barley; Coffea spp.; Fescue spp.; Medicago spp.; Rice; Rye; Sugarcane; Wheat
Number of EST-SSR markers used: 10; 38; 75; 10; 8; 17 (barley and wheat); 22; 9; 145; 39; 129; 100; 21; 20; 22; 10 (wheat and barley); 52; 20; 64
Details of genotypes used: 186 individuals (6 populations); 54 cultivars; 7 genotypes (parents of 3 DH mapping populations); 23 genotypes representing different geographic regions; 8 spring barley cultivars, 8 Jordanian and Syrian landraces and 8 wild barley lines; 11 varieties; 28 German barley cultivars and 2 wild barley accessions; 15 C. arabica and 8 C. robusta species; 5 Fescue genotypes and 2 genotypes each of wheat and rice; 24 species and subspecies of Medicago; 14 genotypes (parents of six intersubspecific crosses and one interspecific cross); 15 accessions (13 inbred lines and two open-pollinated cultivars); 5 genotypes; 52 elite exotic wheat genotypes; 64 durum wheat accessions; 15 varieties; 68 advanced wheat lines; 56 old and new UK wheat varieties; 18 species of Triticum–Aegilops complex
Average PIC: 0.49; 0.45; 0.60; 0.38; 0.36; 0.38; 0.32; 0.66; 0.46; 0.62; 0.44; 0.62; 0.45; Average alleles 3.3; 0.40; Average alleles 6.8
Refs: [64]; [11]; [12]; [39]; [40]; [41]; [56]; [65]; [57]; [19]; [34]; [25]; [22]; [26]; [35]; [41]; [66]; [67]; [68]

Abbreviations: EST, expressed sequence tag; DH, doubled haploid; PIC, polymorphic information content (unless otherwise specified).
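For readers unfamiliar with the PIC values reported in Table 3, the sketch below computes polymorphic information content from allele frequencies using the formula of Botstein et al. (1980), PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². Which formula each cited study used is not stated here; some marker studies report the simpler 1 − Σ p_i² instead. The function names and the allele calls are hypothetical.

```python
from collections import Counter

def allele_frequencies(calls):
    """Relative frequency of each allele in a list of allele calls."""
    counts = Counter(calls)
    total = sum(counts.values())
    return [n / total for n in counts.values()]

def pic(freqs):
    """Polymorphic information content (Botstein et al. 1980 formula)."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    sum_sq = sum(p * p for p in freqs)
    cross = 0.0
    for i in range(len(freqs) - 1):
        for j in range(i + 1, len(freqs)):
            cross += 2 * freqs[i] ** 2 * freqs[j] ** 2
    return 1 - sum_sq - cross

# Invented allele calls (fragment sizes in bp) for one EST-SSR locus.
calls = [152, 152, 155, 155]
print(pic(allele_frequencies(calls)))  # 0.375 for two equally frequent alleles
print(pic([0.25] * 4))                 # 0.703125 for four equally frequent alleles
```

A monomorphic marker gives a PIC of 0, and the value rises with both the number of alleles and the evenness of their frequencies, which is why the less polymorphic genic SSRs tend to show lower PIC than genomic SSRs.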



Table 4. Interspecific and generic transferability of genic simple sequence repeat (SSR) markers
Plant species, genic SSRs developed | Species, transferability recorded | Refs
Alpine lady-fern | 9 species from Woodsiaceae | [64]
Apricot | 21 Prunus accessions, one pear and six apple cultivars | [69]
Barley | Wheat, rye, rice | [11,47]
Barley and wheat | Wheat and barley | [41]
Coffee | 12 Coffea species and 4 Psilanthus species | [65]
Cotton | 2 cotton species | [17]
Grape | 7 species from 2 Vitaceae genera | [9]
Grape | 46 species from Vitaceae family | [69]
Grape | 25 species from 5 Vitaceae genera | [70]
Grape | 8 species from 4 Vitaceae genera | [71]
Loblolly pine | Different subspecies and species of pine | [58]
Medicago (M. truncatula) | 6 Medicago species | [20]
Rice | Wild species of rice | [34]
Spruce (Picea spp.) | 23 spruce species | [23]
Sugarcane (Saccharum spp.) | Erianthus and Sorghum | [22]
Tall fescue (Festuca arundinacea) | Lolium spp., rice, wheat | [19]
Wheat | Barley, maize, rice | [14,46]
Wheat | Barley, maize, rice, rye and oats | [26]
Wheat | Rice, maize and soybean | [43]
Wheat | Aegilops and Triticum species | [68]

and probability of amplifying the same (orthologous) gene in multiple species. Studies have estimated that 44% to 60% of EST-SSR primer pairs designed for wheat or barley will also yield amplicons in rice [11,14,21,41,46,47]. Of tall fescue primers, 59% successfully amplified rice and 71% amplified wheat DNA [57]. Similarly, 96% of the primers designed for Medicago truncatula produced amplicons in six other Medicago species [19]. In a study of the transferability of Loblolly pine SSR markers to other pine species, Liewlaksaneeyanawin et al. [58] compared microsatellite markers developed from ESTs, unscreened genomic DNA, low-copy genomic DNA and undermethylated genomic DNA. Although all eight of the EST-SSR markers produced amplicons on all four species, the three groups of genomic SSR markers were only evaluated for transferability to Pinus contorta ssp. latifolia and 29%, 23% and 30% produced amplicons, respectively. In a comparison of methods for primer design, Yu et al. [46] found that aligning consensus sequences from two or more species to identify conserved regions for primer design was less efficient than designing species-specific primers and then testing them on other species. Orthology can only be determined by comparing both similarity of amplicon sequences and genome location across species [46,47]. For example, Saha et al. [19] sequenced the products of one EST-SSR primer pair for three fescue species, ryegrass, rice and wheat, and all sequences were >85% similar. Sequence-based comparison of mapped barley SSR-ESTs with genetically and/or physically mapped markers in wheat, rye and rice revealed several markers that showed an orthologous relationship between examined cereal species [47]. Comparison of genome locations of polymorphic EST-SSR markers mapped in both wheat and rice also confirmed previously known genome relationships with most of the markers examined [46].
However, the assessment of colinearity was complicated by the detection of multiple polymorphic loci in either wheat or rice by 85% of the primer pairs. The tendency of EST-SSR primer pairs to detect more loci than genomic SSRs was also reported for tall fescue [57].

Comparative account on genic and genomic microsatellite markers

A comparative analysis of genomic SSRs and genic SSRs reveals advantages to both; however, because of lower polymorphism, EST-SSRs are not as efficient as genomic SSRs for distinguishing closely related genotypes (for references, see [4]). Furthermore, the development of genic SSRs is restricted to those species for which there are sufficient sequence data (for ESTs or genes) available because SSRs are present in only 2% to 5% of the unigenes examined. Nevertheless, EST-SSR markers developed for a given species can successfully be used in a related species for a variety of purposes, including fingerprinting or diversity studies, comparative mapping and marker-assisted selection. Genic SSR- and genomic SSR markers tend to be complementary for genome mapping, with genic microsatellites being less polymorphic but concentrated in the gene-rich regions. For assessment of functional diversity, the genic SSRs are useful; however, because of higher polymorphism, genomic SSRs are superior for fingerprinting or varietal identification studies.

Future directions of microsatellite marker research

With more DNA sequence data being generated daily, the trend is towards cross-referencing genes and genomes using sequence- and map-based tools. Because polymorphism is a major limitation for many species, microsatellite markers are a valuable tool for plant genetics and breeding. Clearly, the most significant application of EST-SSRs is for comparative mapping, with good examples in graminaceous and leguminous species. A database of EST-SSR primer pairs that would amplify orthologous loci across species and that are uniformly distributed over the rice, Medicago and Arabidopsis genomes would be very useful to breeders and geneticists, especially for minor or underfunded crop species.
In the longer term, development of allele-specific markers for the genes controlling agronomic traits will be important for advancing the science of plant breeding. In this context, genic microsatellites are but one class of


marker that can be deployed, along with single nucleotide polymorphisms and other types of markers that target functional polymorphisms within genes. The choice of the most appropriate marker system needs to be made case by case and will depend on many issues, including the availability of technology platforms, costs for marker development, species transferability, information content and ease of documentation.
References
1 Philips, R.L. and Vasil, I.K. eds (2001) DNA-Based Markers in Plants, Kluwer Academic Publishers
2 Varshney, R.K. et al. (2004) Molecular maps in cereals: methodology and progress. In Cereal Genomics (Gupta, P.K. and Varshney, R.K. eds), pp. 35-82, Kluwer Academic Publishers
3 Powell, W. et al. (1996) Polymorphism revealed by simple sequence repeats. Trends Plant Sci. 1, 215-222
4 Gupta, P.K. and Varshney, R.K. (2000) The development and use of microsatellite markers for genetic analysis and plant breeding with emphasis on bread wheat. Euphytica 113, 163-185
5 Rudd, S. (2003) Expressed sequence tags: alternative or complement to whole genome sequences? Trends Plant Sci. 8, 321-329
6 Kikuchi, S. et al. (2003) Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice. Science 301, 376-379
7 Morgante, M. and Olivieri, A.M. (1993) PCR-amplified microsatellites as markers in plant genetics. Plant J. 3, 175-182
8 Wang, Z. et al. (1994) Survey of plant short tandem DNA repeats. Theor. Appl. Genet. 88, 1-6
9 Scott, K.D. et al. (2000) Analysis of SSRs derived from grape ESTs. Theor. Appl. Genet. 100, 723-726
10 Temnykh, S. et al. (2000) Mapping and genome organization of microsatellite sequences in rice (Oryza sativa L.). Theor. Appl. Genet. 100, 697-712
11 Thiel, T. et al. (2003) Exploiting EST databases for the development of cDNA derived microsatellite markers in barley (Hordeum vulgare L.). Theor. Appl. Genet. 106, 411-422
12 Kota, R. et al. (2001) Generation and comparison of EST-derived SSRs and SNPs in barley (Hordeum vulgare L.). Hereditas 135, 145-151
13 Varshney, R.K. et al. (2002) In silico analysis on frequency and distribution of microsatellites in ESTs of some cereal species. Cell. Mol. Biol. Lett. 7, 537-546
14 Yu, J.K. et al. (2004) Development and mapping of EST-derived simple sequence repeat (SSR) markers for hexaploid wheat. Genome 47, 805-818
15 Khlestkina, E. et al. (2004) Mapping of 99 new microsatellite-derived loci in rye (Secale cereale L.) including 39 expressed sequence tags. Theor. Appl. Genet. 109, 725-732
16 Morgante, M. et al. (2002) Microsatellites are preferentially present with non-repetitive DNA in plant genomes. Nat. Genet. 30, 194-200
17 Saha, S. et al. (2003) Simple sequence repeats as useful resources to study transcribed genes of cotton. Euphytica 130, 355-364
18 Han, Z.-G. et al. (2004) Genetic mapping of EST-derived microsatellites from the diploid Gossypium arboreum in allotetraploid cotton. Mol. Genet. Genomics 272, 308-327
19 Saha, M.C. et al. (2004) Tall fescue EST-SSR markers with transferability across several grass species. Theor. Appl. Genet. 109, 783-791
20 Eujayl, I. et al. (2004) Medicago truncatula EST-SSRs reveal cross-species genetic markers for Medicago spp. Theor. Appl. Genet. 108, 414-422
21 Gao, L.F. et al. (2003) Analysis of microsatellites in major crops assessed by computational and experimental approaches. Mol. Breed. 12, 245-261
22 Cordeiro, G.M. et al. (2001) Microsatellite markers from sugarcane (Saccharum spp.) ESTs cross transferable to erianthus and sorghum. Plant Sci. 160, 1115-1123
23 Rungis, D. et al. (2004) Robust simple sequence repeat markers for spruce (Picea spp.) from expressed sequence tags. Theor. Appl. Genet. 109, 1283-1294
24 Kantety, R.V. et al. (2002) Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Mol. Biol. 48, 501-510
25 Hackauf, B. and Wehling, P. (2002) Identification of microsatellite polymorphisms in an expressed portion of the rye genome. Plant Breed. 121, 17-25
26 Gupta, P.K. et al. (2003) Transferable EST-SSR markers for the study of polymorphism and genetic diversity in bread wheat. Mol. Genet. Genomics 270, 315-323
27 Nicot, N. et al. (2004) Study of simple sequence repeat (SSR) markers from wheat expressed sequence tags (ESTs). Theor. Appl. Genet. 109, 800-805
28 Metzgar, D. et al. (2000) Selection against frameshift mutations limits microsatellite expansion in coding DNA. Genome Res. 10, 72-80
29 Katti, M.V. et al. (2001) Differential distribution of simple sequence repeats in eukaryotic genome sequences. Mol. Biol. Evol. 18, 1161-1167
30 Sreenivasulu, N. et al. (2002) Mining functional information from cereal genomes - the utility of expressed sequence tags. Curr. Sci. 83, 965-973
31 Varshney, R.K. et al. (2004) A simple hybridization-based strategy for the generation of non-redundant EST collections - a case study in barley (Hordeum vulgare L.). Plant Sci. 167, 629-634
32 Stephenson, P. et al. (1998) Fifty new microsatellite loci for the wheat genetic map. Theor. Appl. Genet. 97, 946-949
33 Ramsay, L. et al. (2000) A simple sequence repeat-based linkage map of barley. Genetics 156, 1997-2005
34 Cho, Y.G. et al. (2000) Diversity of microsatellites derived from genomic libraries and GenBank sequences in rice (Oryza sativa L.). Theor. Appl. Genet. 100, 713-722
35 Eujayl, I. et al. (2001) Assessment of genotypic variation among cultivated durum wheat based on EST-SSRs and genomic SSRs. Euphytica 119, 39-43
36 Fraser, L.G. et al. (2004) EST-derived microsatellites from Actinidia species and their potential for mapping. Theor. Appl. Genet. 108, 1010-1016
37 Callen, D. et al. (1993) Incidence and origin of null alleles in the (AC)n microsatellite markers. Am. J. Hum. Genet. 52, 922-927
38 Lehman, T. et al. (1996) An evaluation of evolutionary constraints on microsatellite loci using null alleles. Genetics 144, 1155-1163
39 Chabane, K. et al. (2005) EST versus genomic derived microsatellite markers for genotyping wild and cultivated barley. Genet. Resour. Crop Evol. (in press)
40 Russell, J. et al. (2004) A comparison of sequence-based polymorphism and haplotype content in transcribed and anonymous regions of the barley genome. Genome 47, 389-398
41 Holton, T.A. et al. (2002) Identification and mapping of polymorphic SSR markers from expressed gene sequences of barley and wheat. Mol. Breed. 9, 63-71
42 Anderson, J.R. and Lubberstedt, T. (2003) Functional markers in plants. Trends Plant Sci. 8, 554-560
43 Gao, L.F. et al. (2004) One hundred and one new microsatellite loci derived from ESTs (EST-SSRs) in bread wheat. Theor. Appl. Genet. 108, 1392-1400
44 Sorrells, M.E. and Wilson, W.A. (1997) Direct classification and selection of superior alleles for crop improvement. Crop Sci. 37, 691-697
45 Papi, M. et al. (2000) Identification and disruption of an Arabidopsis zinc finger gene controlling seed germination. Genes Dev. 14, 28-33
46 Yu, J.K. et al. (2004) EST-derived SSR markers for comparative mapping in wheat and rice. Mol. Genet. Genomics 271, 742-751
47 Varshney, R.K. et al. (2005) Interspecific transferability and comparative mapping of barley EST-SSR markers in wheat, rye and rice. Plant Sci. 168, 195-202
48 Schulman, A. et al. (2004) Organization of retrotransposons and microsatellites in cereal genomes. In Cereal Genomics (Gupta, P.K. and Varshney, R.K. eds), pp. 83-118, Kluwer Academic Publishers
49 Li, Y.C. et al. (2004) Microsatellites within genes: structure, function, and evolution. Mol. Biol. Evol. 21, 991-1007
50 Mohammadi, S.A. and Prasanna, B.M. (2003) Analysis of genetic diversity in crop plants - salient statistical tools and considerations. Crop Sci. 43, 1235-1248
51 Ayers, N.M. et al. (1997) Microsatellites and a single nucleotide polymorphism differentiate apparent amylose classes in an extended pedigree of US rice germplasm. Theor. Appl. Genet. 94, 773-781


52 Cummings, C.J. and Zoghbi, H.Y. (2000) Trinucleotide repeats: mechanisms and pathophysiology. Annu. Rev. Genomics Hum. Genet. 1, 281-328
53 Fujimori, S. et al. (2003) A novel feature of microsatellites in plants: a distribution gradient along the direction of transcription. FEBS Lett. 554, 17-22
54 Bao, S. et al. (2002) Microsatellites in starch-synthesizing genes in relation to starch physicochemical properties in waxy rice (Oryza sativa L.). Theor. Appl. Genet. 105, 898-905
55 Dresselhaus, T. et al. (1999) Novel ribosomal genes from maize are differentially expressed in the zygotic and somatic cell cycles. Mol. Gen. Genet. 261, 416-427
56 Pillen, K. et al. (2000) Mapping new EMBL-derived barley microsatellites and their use in differentiating German barley cultivars. Theor. Appl. Genet. 101, 652-660
57 Saha, M.C. et al. (2004) A high-density linkage map of tall fescue based on SSR and AFLP markers. Theor. Appl. Genet. (in press)
58 Liewlaksaneeyanawin, C. et al. (2004) Single-copy, species-transferable microsatellite markers developed from loblolly pine ESTs. Theor. Appl. Genet. 109, 361-369
59 Benson, G. (1999) Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573-580
60 Castelo, A.T. et al. (2002) TROLL - Tandem Repeat Occurrence Locator. Bioinformatics 18, 634-636
61 Graham, J. et al. (2004) The construction of a genetic linkage map of red raspberry (Rubus idaeus subsp. idaeus) based on AFLPs, genomic-SSR and EST-SSR markers. Theor. Appl. Genet. 109, 740-749
62 Warnke, S.E. et al. (2004) Genetic linkage mapping of an annual x perennial ryegrass population. Theor. Appl. Genet. 109, 294-304
63 Barrett, B. et al. (2004) A microsatellite map of white clover. Theor. Appl. Genet. 109, 596-608
64 Woodhead, M. et al. (2003) Development of EST-SSRs from the alpine lady-fern, Athyrium distentifolium. Mol. Ecol. Notes 3, 287-290
65 Bhat, P. et al. (2004) Identification and characterization of gene (EST)-derived SSR markers from robusta coffee variety CxR (an interspecific hybrid of Coffea canephora × C. congensis). Mol. Ecol. Notes DOI:10.1111/j-1471-8286.2004.00839
66 Dreisigacker, S. et al. (2003) SSR and pedigree analyses of genetic diversity among CIMMYT wheat lines targeted to different megaenvironments. Crop Sci. 44, 381-388
67 Leigh, F. et al. (2003) Assessment of EST- and genomic microsatellite markers for variety discrimination and genetic diversity studies in wheat. Euphytica 133, 359-366
68 Bandopadhyay, R. et al. (2004) DNA polymorphism among 18 species of Triticum-Aegilops complex using wheat EST-SSRs. Plant Sci. 166, 349-356
69 Decroocq, V. et al. (2003) Development and transferability of apricot and grape EST microsatellite markers across taxa. Theor. Appl. Genet. 106, 912-922
70 Arnold, C. et al. (2002) The application of SSRs characterized for grape (Vitis vinifera) to conservation studies in Vitaceae. Am. J. Bot. 89, 22-28
71 Rossetto, M. et al. (2002) Evaluating the potential of SSR flanking regions for examining taxonomic relationships in the Vitaceae. Theor. Appl. Genet. 104, 61-66
