With the advent of molecular markers, a new generation of markers has been introduced over the last two decades, which has revolutionized the biological sciences. DNA-based molecular markers have acted as versatile tools and have found their own position in various fields like taxonomy, physiology, embryology, genetic engineering, etc. They are no longer looked upon as simple DNA fingerprinting markers in variability studies or as mere forensic tools. Ever since their development, they have been constantly modified to enhance their utility and to bring about automation in the process of genome analysis. The discovery of PCR (polymerase chain reaction) was a landmark in this effort and proved to be a unique process that brought about a new class of DNA profiling markers. This facilitated the development of marker-based gene tags, map-based cloning of agronomically important genes, variability studies, phylogenetic analysis, synteny mapping, marker-assisted selection of desirable genotypes, etc., thus giving new dimensions to concerted efforts of breeding and marker-aided selection that can reduce the time span of developing new and better varieties and will make the dream of super varieties come true. These DNA markers offer several advantages over traditional phenotypic markers, as they provide data that can be analysed objectively. In this article, DNA markers developed during the last two decades of molecular biology research and utilized for various applications in the area of plant genome analysis are reviewed.
PLANTS have always been looked upon as a key source of energy for survival and evolution of the animal kingdom, thus forming a base for every ecological pyramid. Over the last few decades plant genomics has been studied extensively bringing about a revolution in this area. Molecular markers, useful for plant genome analysis, have now become an important tool in this revolution. In this article we attempt to review most of the available DNA markers that can be routinely employed in various aspects of plant genome analysis such as taxonomy, phylogeny, ecology, genetics and plant breeding. During the early period of research, classical strategies including comparative anatomy, physiology and embryology were employed in genetic analysis to determine inter- and intra-species variability. In the past decade, however, molecular markers have very rapidly complemented the classical strategies. Molecular markers include biochemical constituents (e.g. secondary metabolites in plants) and macromolecules, viz. proteins and deoxyribonucleic acids (DNA). Analysis of secondary metabolites is, however, restricted to those plants that produce a suitable range of metabolites which can be easily analysed and which can distinguish between varieties. These metabolites which are being used as markers should be ideally neutral to environmental effects or management practices. Hence, amongst the molecular markers used, DNA markers are more suitable and ubiquitous to most of the living organisms.
DNA-based molecular markers
Genetic polymorphism is classically defined as the simultaneous occurrence, in the same population, of two or more discontinuous variants or genotypes of a trait. Although DNA sequencing is a straightforward approach for identifying variations at a locus, it is expensive and laborious. A wide variety of techniques have, therefore, been developed in the past few years for visualizing DNA sequence polymorphism. The term DNA fingerprinting was introduced for the first time by Alec Jeffreys in 1985 to describe bar-code-like DNA fragment patterns generated by multilocus probes after electrophoretic separation of genomic DNA fragments. The emerging patterns make up a unique feature of the analysed individual and are currently considered to be the ultimate tool for biological individualization. More recently, the term DNA fingerprinting/profiling has come to describe the combined use of several single-locus detection systems, which are being used as versatile tools for investigating various aspects of plant genomes. These include characterization of genetic variability, genome fingerprinting, genome mapping, gene localization, analysis of genome evolution, population genetics, taxonomy, plant breeding, and diagnostics.
Properties desirable for ideal DNA markers
• Highly polymorphic nature
• Co-dominant inheritance (determination of homozygous and heterozygous states of diploid organisms)
• Frequent occurrence in genome
• Selective neutral behaviour (the DNA sequences of any organism are neutral to environmental conditions or management practices)
• Easy access (availability)
• Easy and fast assay
• High reproducibility
• Easy exchange of data between laboratories
It is extremely difficult to find a molecular marker which would meet all the above criteria. Depending on the type of study to be undertaken, a marker system can be identified that would fulfil at least a few of the above characteristics. Various types of molecular markers are utilized to evaluate DNA polymorphism and are generally classified as hybridization-based markers and polymerase chain reaction (PCR)-based markers. In the former, DNA profiles are visualized by hybridizing the restriction enzyme-digested DNA to a labelled probe, which is a DNA fragment of known origin or sequence. PCR-based markers involve in vitro amplification of particular DNA sequences or loci, with the help of specifically or arbitrarily chosen oligonucleotide sequences (primers) and a thermostable DNA polymerase enzyme. The amplified fragments are separated electrophoretically and banding patterns are detected by different methods such as staining and autoradiography. PCR is a versatile technique invented during the mid-1980s (ref. 3). Ever since thermostable DNA polymerase was introduced in 1988 (ref. 4), the use of PCR in research and clinical laboratories has increased tremendously. The primer sequences are chosen to allow base-specific binding to the template in reverse orientation. PCR is extremely sensitive and operates at a very high speed. Its application for diverse purposes has opened up a multitude of new possibilities in the field of molecular biology.
For simplicity, we have divided the review into two parts. The first part is a general description of most of the available DNA marker types, while the second covers their application in plant genomics and breeding programmes.
Types and description of DNA markers
Single or low copy probes
Restriction fragment length polymorphism (RFLP). RFLPs are simply inherited, naturally occurring Mendelian characters. They have their origin in the DNA rearrangements that occur due to evolutionary processes, point mutations within the restriction enzyme recognition site sequences, insertions or deletions within the fragments, and unequal crossing over. In RFLP analysis, restriction enzyme-digested genomic DNA is resolved by gel electrophoresis and then blotted (ref. 6) on to a nitrocellulose membrane. Specific banding patterns are then visualized by hybridization with a labelled probe. These probes are mostly species-specific single-locus probes of about 0.5–3.0 kb in size, obtained from a cDNA library or a genomic library. Genomic libraries are easy to construct and include almost all sequence types; however, their inserts contain many interspersed repeats, which detect a large number of restriction fragments and form complex patterns. In plants, this problem is overcome to some extent by using the methylation-sensitive restriction enzyme PstI. This helps to obtain low-copy DNA sequences of small fragment size, which are preferred in RFLP analysis. cDNA libraries, on the other hand, are difficult to construct, but they are more popular because actual genes are analysed and they contain fewer repeat sequences. The appropriate source of RFLP probes varies with the requirements of the particular application under consideration. Though genomic library probes may exhibit greater variability than gene probes from cDNA libraries, a few studies reveal the converse. This observation may be because cDNA probes detect variation not only in the coding regions of the corresponding genes but also in the regions flanking the genes and in their introns. RFLP markers were used for the first time in the construction of genetic maps by Botstein et al.
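The way a point mutation inside a recognition site changes a restriction fragment pattern can be illustrated with a minimal in silico digestion. This is only a sketch under stated assumptions: the two allele sequences are hypothetical, the function name `digest` is invented for illustration, and the cut is placed at the first base of the recognition site rather than at the enzyme's true cleavage position. It also lists every fragment, whereas a Southern blot would show only the fragments that overlap the labelled probe.

```python
import re


def digest(sequence: str, site: str) -> list:
    """Return fragment lengths after cutting a linear sequence at each site.

    Simplification: the cut is placed at the first base of the recognition
    site, which is not where the real enzyme cleaves.
    """
    cuts = [m.start() for m in re.finditer(f"(?={re.escape(site)})", sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]


ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

# Hypothetical alleles: allele_b carries a point mutation in the first site.
allele_a = "ATGC" * 5 + ECORI_SITE + "TTAACCGG" * 3 + ECORI_SITE + "CGTA" * 4
allele_b = allele_a.replace(ECORI_SITE, "GAGTTC", 1)

print(digest(allele_a, ECORI_SITE))  # [20, 30, 22]
print(digest(allele_b, ECORI_SITE))  # [50, 22]
```

Loss of the first site fuses the 20 and 30 unit fragments into a single 50 unit fragment, which is the kind of length difference an RFLP probe spanning this region would reveal.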
RFLPs, being codominant markers, can detect the coupling phase of DNA molecules, as DNA fragments from all homologous chromosomes are detected. They are very reliable markers in linkage analysis and breeding, and can easily determine whether a linked trait is present in a homozygous or heterozygous state in an individual, information that is highly desirable for recessive traits (ref. 12). However, their utility has been hampered by the large amount of DNA required for restriction digestion and Southern blotting. The requirement of radioactive isotopes makes the analysis relatively expensive and hazardous. The assay is time-consuming and labour-intensive, and only one out of several markers may be polymorphic, which is highly inconvenient, especially for crosses between closely related species. Their inability to detect single base changes restricts their use in detecting point mutations occurring within the regions at which they are detecting polymorphism.
Restriction landmark genomic scanning (RLGS). This method, introduced for the first time by Hatada et al. (ref. 13) for genomic DNA analysis of higher organisms, is based on the principle that restriction enzyme sites can be used as landmarks. It employs direct labelling of genomic DNA at the restriction site and two-dimensional (2D) electrophoresis to resolve and identify these landmarks. The technique has proven its utility in genome analysis of closely related cultivars and for obtaining polymorphic markers that can be cloned by the spot target method. It has been used as a new fingerprinting technique for rice cultivars.
RFLP markers converted into PCR-based markers
Sequence-tagged sites (STS). RFLP probes specifically linked to a desired trait can be converted into PCR-based STS markers: primers are designed from the nucleotide sequence of the probe that gives a polymorphic band pattern, so as to obtain a specific amplicon. Using this technique, the tedious hybridization procedures involved in RFLP analysis can be avoided. This approach is extremely useful for studying the relationship between various species. When these markers are linked to some specific traits, for example the powdery mildew resistance gene or the stem rust resistance gene in barley, they can be easily integrated into plant breeding programmes for marker-assisted selection of the trait of interest.
Allele-specific associated primers (ASAPs). To obtain an allele-specific marker, a specific allele (either in homozygous or heterozygous state) is sequenced and specific primers are designed for amplification of the DNA template to generate a single fragment at stringent annealing temperatures. These markers tag specific alleles in the genome and are more or less similar to SCARs.
Expressed sequence tag markers (EST). This term was introduced by Adams et al. Such markers are obtained by partial sequencing of random cDNA clones. Once generated, they are useful in cloning specific genes of interest and in synteny mapping of functional genes in various related organisms. ESTs are popularly used in the full genome sequencing and mapping programmes underway for a number of organisms, and for identifying active genes, thus helping in the identification of diagnostic markers. Moreover, an EST that appears to be unique helps to isolate new genes. EST markers have been identified to a large extent for rice, Arabidopsis, etc., wherein thousands of functional cDNA clones are being converted into EST markers.
Single strand conformation polymorphism (SSCP). This is a powerful and rapid technique for gene analysis, particularly for detection of point mutations and typing of DNA polymorphism. SSCP can identify heterozygosity of DNA fragments of the same molecular weight and can even detect changes of a few nucleotide bases, as the mobility of single-stranded DNA changes when a change in its sequence (e.g. in GC content) alters its conformation. To overcome problems of reannealing and complex banding patterns, an improved technique called asymmetric-PCR SSCP was developed, wherein the denaturation step was eliminated and a large-sized sample could be loaded for gel electrophoresis, making it a potential tool for high-throughput detection of DNA polymorphism. It was found useful in the detection of heritable human diseases. In plants, however, it is not well developed, although its application in discriminating progenies can be exploited once suitable primers are designed for agronomically important traits.
Multi locus probes
Repetitive DNA
A major step forward in genetic identification was the discovery that about 30–90% of the genome of virtually all species is constituted by regions of repetitive DNA, which are highly polymorphic in nature. These regions contain genetic loci comprising several hundred alleles, differing from each other with respect to length, sequence or both, and they occur ubiquitously, either interspersed or in tandem arrays. The repetitive DNA regions play an important role in absorbing mutations in the genome. Of the mutations that occur in the genome, only inherited mutations play a vital role in evolution or polymorphism. Thus repetitive DNA and the mutational forces functional in nature together form the basis of a number of marker systems that are useful for various applications in plant genome analysis. The markers belonging to this class are both hybridization-based and PCR-based.
Microsatellites and minisatellites
The term microsatellites was coined by Litt and Luty, while the term minisatellites was introduced by Jeffreys. Both are multilocus probes creating complex banding patterns and are usually non-species-specific, occurring ubiquitously. They essentially belong to the repetitive DNA family. Fingerprints generated by these probes are also known as oligonucleotide fingerprints. The methodology has been derived from RFLP, and specific fragments are visualized by hybridization with a labelled micro- or minisatellite probe. Minisatellites are tandem repeats with a monomer repeat length of about 11–60 bp, while microsatellites or short tandem repeats/simple sequence repeats (STRs/SSRs) consist of a 1 to 6 bp long monomer sequence that is repeated several times. These loci contain tandem repeats that vary in the number of repeat units between genotypes and are referred to as variable number of tandem repeats (VNTRs) (i.e. a single locus that contains a variable number of tandem repeats between individuals) or hypervariable regions (HVRs) (i.e. numerous loci containing tandem repeats within a genome, generating high levels of polymorphism between individuals). Microsatellites and minisatellites thus form an ideal marker system, creating complex banding patterns by simultaneously detecting multiple DNA loci. Among their prominent features, they can be used both as dominant fingerprinting markers and as codominant STMS (sequence-tagged microsatellite site) markers. Many alleles exist in a population, the level of heterozygosity is high and they follow Mendelian inheritance.
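The definition of a microsatellite given above (a 1 to 6 bp monomer repeated several times in tandem) lends itself to a short pattern search. The following is a minimal sketch, not a published SSR-mining tool: the function name `find_ssrs`, the threshold of four repeats and the test sequence are all assumptions made for illustration.

```python
import re


def find_ssrs(sequence: str, min_repeats: int = 4) -> list:
    """Return (start, motif, repeat_count) for tandem repeats of 1-6 bp motifs.

    The lazy quantifier makes the search prefer the shortest motif, so a
    (CA)n run is reported as 'CA' rather than 'CACA'.
    """
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical sequence with a (CA)8 and an (ATT)5 microsatellite.
seq = "GGTC" + "CA" * 8 + "TGGACC" + "ATT" * 5 + "GCGC"
print(find_ssrs(seq))  # [(4, 'CA', 8), (26, 'ATT', 5)]
```

Two genotypes that differ in the repeat count reported at the same locus would correspond to two VNTR alleles in the sense used above.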
Minisatellite and microsatellite sequences converted into PCR-based markers
Sequence-tagged microsatellite site markers (STMS). This method detects DNA polymorphism using specific primers designed from the sequence data of a specific locus. Primers complementary to the flanking regions of the simple sequence repeat loci yield highly polymorphic amplification products. Polymorphisms appear because of variation in the number of tandem repeats (VNTR loci) in a given repeat motif. Tri- and tetranucleotide microsatellites are more popular for STMS analysis because they present a clear banding pattern after PCR and gel electrophoresis. However, dinucleotides are generally abundant in genomes and have been used as markers, e.g. (CA)n, (AG)n and (AT)n (ref. 30). The di- and tetranucleotide repeats are present mostly in the non-coding regions of the genome, while 57% of trinucleotide repeats are shown to reside in or around the genes. A very good relationship has been observed between the number of alleles detected and the total number of simple repeats within the targeted microsatellite DNA. Thus the larger the repeat number in the microsatellite DNA, the greater is the number of alleles detected in a large population.
Direct amplification of minisatellite DNA markers (DAMD-PCR). This technique, introduced by Heath et al., has been explored as a means of generating DNA probes useful for detecting polymorphism. DAMD-PCR clones can yield individual-specific DNA fingerprinting patterns and thus have potential as markers for species differentiation and cultivar identification.
Inter simple sequence repeat markers (ISSR). In this technique, reported by Zietkiewicz et al., primers based on microsatellites are utilized to amplify inter-SSR DNA sequences. Here, various microsatellites anchored at the 3′ end are used for amplifying genomic DNA, which increases their specificity. These are mostly dominant markers, though occasionally a few of them exhibit codominance.
A very large number of primers can be synthesized for various combinations of di-, tri-, tetra- and pentanucleotide repeats (e.g. 4^3 = 64 trinucleotide and 4^4 = 256 tetranucleotide motifs), each with an anchor made up of a few bases, and these can be exploited for a broad range of applications in plant species.
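The motif counts quoted above follow from simple enumeration over the four bases, which a short sketch can make explicit. The helper names `repeat_motifs` and `anchored_issr_primer`, the choice of eight repeats and the single-base anchor are assumptions for illustration only. Note also that the raw count of 4^k treats homopolymers (e.g. AAA) and cyclic permutations of the same repeat (e.g. CAG and AGC) as distinct motifs.

```python
from itertools import product

BASES = "ACGT"


def repeat_motifs(k: int) -> list:
    """Enumerate every possible repeat unit of length k (4**k in total)."""
    return ["".join(p) for p in product(BASES, repeat=k)]


def anchored_issr_primer(motif: str, repeats: int, anchor: str) -> str:
    """Build a 3'-anchored primer: the repeat run followed by the anchor bases."""
    return motif * repeats + anchor


print(len(repeat_motifs(3)), len(repeat_motifs(4)))  # 64 256
print(anchored_issr_primer("GA", 8, "T"))  # GAGAGAGAGAGAGAGAT
```

Combining each motif with different anchors multiplies the number of distinct primers further, which is why the primer pool is so large in practice.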
Other repetitive DNA-type markers
A large number of transposable repeat elements have been studied in plants; however, only a few have been exploited as molecular markers. In evolutionary terms, they have contributed to genetic differences between species and individuals by playing a role in retrotransposition events promoting unequal crossing over. Retrotransposon-mediated fingerprinting has been shown to be an efficient fingerprinting method for detection of genetic differences between different species.
This strategy was developed to fingerprint genotypes using semispecific primers complementary to repetitive DNA elements called ‘Alu repeats’ in human genome analysis. Alu repeats are a class of randomly interspersed repeated DNA, preferentially used for Alu-PCR as they reveal considerable levels of polymorphism (ref. 35). Alu elements belong to the short interspersed nuclear elements, known as SINEs. They are approximately 300 bp in size and are thought to have originated from special RNA species that were reintegrated at a rate of approximately one integration event per 10,000 years. These repeats have been studied largely in humans, while their function in plants remains largely unexplored.
Repeat complementary primers
As an alternative to the interspersed repeats, primers complementary to other repetitive sequence elements have also been used successfully for the generation of polymorphisms, e.g. intron/exon splice junctions, tRNA genes, 5S rRNA genes and Zn-finger protein genes. Primers complementary to specific exons, resulting in the amplification of the intervening introns, have been studied by Lessa et al. One of the strengths of these new strategies is that they are more amenable to automation than the conventional hybridization-based techniques.
Arbitrary sequence markers
Randomly-amplified polymorphic DNA markers (RAPD)
In 1991 Welsh and McClelland developed a new PCR-based genetic assay, namely randomly amplified polymorphic DNA (RAPD). This procedure detects nucleotide sequence polymorphisms in DNA by using a single primer of arbitrary nucleotide sequence. In this reaction, a single species of primer anneals to the genomic DNA at two different sites on complementary strands of the DNA template. If these priming sites are within an amplifiable range of each other, a discrete DNA product is formed through thermocyclic amplification. On average, each primer directs amplification of several discrete loci in the genome, making the assay useful for efficient screening of nucleotide sequence polymorphism between individuals (ref. 42). However, due to the stochastic nature of DNA amplification with random sequence primers, it is important to optimize and maintain consistent reaction conditions for reproducible DNA amplification. RAPDs are dominant markers and hence have limitations in their use as markers for mapping, which can be overcome to some extent by selecting those markers that are linked in coupling. The RAPD assay has been used by several groups as an efficient tool for identification of markers linked to agronomically important traits, which are introgressed during the development of near isogenic lines. RAPDs and their related modified markers have been widely applied in variability analysis and individual-specific genotyping, but they are less popular due to problems such as poor reproducibility, faint or fuzzy products, and difficulty in scoring bands, which lead to inappropriate inferences.
Some variations in the RAPD technique include DNA amplification fingerprinting (DAF). Caetano-Anolles et al. (ref. 44) employed single arbitrary primers as short as 5 bases to amplify DNA using polymerase chain reaction. In the spectrum of products obtained, simple patterns are useful as genetic markers for mapping, while more complex patterns are useful for DNA fingerprinting.
Band patterns are reproducible and can be analysed using polyacrylamide gel electrophoresis and silver staining. DAF requires careful optimization of parameters; however, it is extremely amenable to automation and fluorescent tagging of primers for early and easy determination of amplified products. DAF profiles can be tailored by employing various modifications such as predigesting of template. This technique has been useful in genetic typing and mapping.
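The RAPD principle described above, in which one arbitrary primer must find two sites on complementary strands within an amplifiable distance, can be sketched in silico. This is a simplified model that requires perfect matches and ignores the mismatch tolerance of real low-stringency annealing. The function names, the 10-mer primer, the templates and the 2000 bp limit are all hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(seq: str, sub: str) -> list:
    """Return every start position of sub in seq, allowing overlaps."""
    positions = []
    pos = seq.find(sub)
    while pos != -1:
        positions.append(pos)
        pos = seq.find(sub, pos + 1)
    return positions


def rapd_products(template: str, primer: str, max_len: int = 2000) -> list:
    """Return sizes of products primed by one primer on both strands.

    A product needs the primer sequence on the top strand (priming rightwards)
    and, downstream of it, the primer's reverse complement (priming leftwards).
    """
    n = len(primer)
    forward = find_all(template, primer)
    reverse = find_all(template, reverse_complement(primer))
    sizes = [j + n - i for i in forward for j in reverse
             if j >= i + n and j + n - i <= max_len]
    return sorted(sizes)


primer = "GGTGCGGGAA"  # hypothetical arbitrary 10-mer
site_rc = reverse_complement(primer)  # "TTCCCGCACC"
individual_1 = "AT" * 10 + primer + "TCAG" * 75 + site_rc + "AT" * 10
# individual_2 has a point mutation in the second priming site.
individual_2 = individual_1.replace(site_rc, "TTCCCGCTCC")

print(rapd_products(individual_1, primer))  # [320]
print(rapd_products(individual_2, primer))  # []
```

The band is simply present or absent, which is why such markers are scored as dominant: a heterozygote carrying one amplifiable allele would show the same 320 bp band as the homozygote.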
Arbitrary primed polymerase chain reaction (AP-PCR)
This is a special case of RAPD, wherein discrete amplification patterns are generated by employing single primers of 10–50 bases in length in PCR of genomic DNA (ref. 45). In the first two cycles, annealing is under non-stringent conditions. The final products are structurally similar to RAPD products. Compared to DAF, this variant of RAPD is not very popular as it involves autoradiography. Recently, however, it has been simplified by separating the fragments on agarose gels and using ethidium bromide staining for visualization.
Sequence characterized amplified regions for amplification of specific band (SCAR)
Michelmore et al. and Martin et al. introduced this technique, wherein the RAPD marker termini are sequenced and longer primers (22–24 nucleotide bases long) are designed for specific amplification of a particular locus. These are similar to STS markers (ref. 48) in construction and application. The presence or absence of the band indicates variation in sequence. SCARs are more reproducible than RAPDs. They are usually dominant markers; however, some of them can be converted into codominant markers by digesting them with tetra-cutting restriction enzymes, and polymorphism can then be deduced by either denaturing gel electrophoresis or SSCP. Compared to arbitrary primers, SCARs exhibit several advantages in mapping studies (codominant SCARs are more informative for genetic mapping than dominant RAPDs), map-based cloning (as they can be used to screen pooled genomic libraries by PCR), physical mapping, locus specificity, etc. SCARs also allow comparative mapping or homology studies among related species, making this an extremely adaptable concept for the near future.
Cleaved amplified polymorphic sequences (CAPs)
These polymorphic patterns are generated by restriction enzyme digestion of PCR products. Such digests are compared for their differential migration during electrophoresis (refs 49, 50). PCR primers for this process can be synthesized based on the sequence information available in databanks of genomic or cDNA sequences, or from cloned RAPD bands. These markers are codominant in nature.
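The codominant nature of such markers can be made concrete with a small sketch in which a diploid genotype is scored from the digested amplicons of its two alleles. The 300 bp amplicons, the function names and the placement of the cut at the first base of the site are assumptions made purely for illustration.

```python
def digest(amplicon: str, site: str) -> list:
    """Return fragment lengths after cutting the amplicon at every site.

    Simplification: the cut is placed at the first base of the site.
    """
    cuts = []
    pos = amplicon.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = amplicon.find(site, pos + 1)
    bounds = [0] + cuts + [len(amplicon)]
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]


def caps_pattern(allele_1: str, allele_2: str, site: str) -> list:
    """Return the distinct band sizes expected for a diploid individual."""
    return sorted(set(digest(allele_1, site)) | set(digest(allele_2, site)))


SITE = "GAATTC"  # EcoRI recognition sequence
cut_allele = "A" * 100 + SITE + "C" * 194      # hypothetical 300 bp amplicon
uncut_allele = "A" * 100 + "GAGTTC" + "C" * 194  # same amplicon, site mutated

print(caps_pattern(cut_allele, cut_allele, SITE))      # [100, 200]
print(caps_pattern(uncut_allele, uncut_allele, SITE))  # [300]
print(caps_pattern(cut_allele, uncut_allele, SITE))    # [100, 200, 300]
```

Each of the three genotypes gives a different band pattern, so the heterozygote can be told apart from both homozygotes.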
Randomly amplified microsatellite polymorphisms (RAMPO)
In this PCR-based strategy, genomic DNA is first amplified using arbitrary (RAPD) primers. The amplified products are then electrophoretically separated and the dried gel is hybridized with microsatellite oligonucleotide probes. Several advantages of oligonucleotide fingerprinting (ref. 51), RAPD (ref. 52) and microsatellite-primed PCR are thus combined, these being the speed of the assay, the high sensitivity, the high level of variability detected and the non-requirement of prior DNA sequence information. This technique has been successfully employed in the genetic fingerprinting of tomato, kiwi fruit and closely related genotypes of D. bulbifera.
Amplified fragment length polymorphism (AFLP)
A recent approach by Zabeau et al. (ref. 55), known as AFLP, is a technique based on the detection of genomic restriction fragments by PCR amplification and can be used for DNAs of any origin or complexity. The fingerprints are produced, without any prior knowledge of sequence, using a limited set of generic primers. The number of fragments detected in a single reaction can be ‘tuned’ by selection of specific primer sets. The AFLP technique is reliable since stringent reaction conditions are used for primer annealing. This technique thus shows an ingenious combination of RFLP and PCR techniques and is extremely useful in detection of polymorphism between closely related genotypes. The AFLP procedure mainly involves three steps:
(a) Restriction of DNA using a rare-cutting and a frequent-cutting restriction enzyme simultaneously (such as EcoRI and MseI, respectively), followed by ligation of oligonucleotide adapters of defined sequence, including the respective restriction enzyme sites.
(b) Selective amplification of sets of restriction fragments, using specifically designed primers. To achieve this, the 5' region of each primer is complementary to the respective adapter and the adjacent restriction enzyme site, while the 3' end extends for a few arbitrarily chosen nucleotides into the restriction fragment.
(c) Gel analysis of the amplified fragments.
AFLP analysis depicts unique fingerprints regardless of the origin and complexity of the genome. Most AFLP fragments correspond to unique positions on the genome and hence can be exploited as landmarks in genetic and physical mapping. AFLPs are extremely useful as tools for DNA fingerprinting (ref. 58) and also for cloning and mapping of variety-specific genomic DNA sequences. Similar to RAPDs, the bands of interest obtained by AFLP can be converted into SCARs. Thus AFLP provides a newly developed, important tool for a variety of applications.
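How the choice of primer sets can tune the number of AFLP fragments may be easier to see in a toy model of the selective amplification step. In the sketch below, each fragment stands for the sequence lying between the two ligated adapters, and a fragment is kept only if its ends match the selective nucleotides at the 3' ends of both primers. The fragment set, the filler sequence and the function names are invented for illustration. Real genomes are not this uniform, so the fourfold reduction per selective base is only an expectation.

```python
from itertools import product

BASES = "ACGT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def selectively_amplified(fragments: list, sel_left: str, sel_right: str) -> list:
    """Keep fragments whose ends match the selective bases of both primers.

    sel_left is read on the top strand at the left end of the fragment.
    sel_right belongs to the primer on the opposite strand, so on the top
    strand it appears as its reverse complement at the right end.
    """
    right_end = reverse_complement(sel_right)
    return [f for f in fragments
            if f.startswith(sel_left) and f.endswith(right_end)]


# Toy fragment pool: every combination of two bases at each end (256 fragments).
ends = ["".join(p) for p in product(BASES, repeat=2)]
fragments = [left + "GATTACA" + right for left in ends for right in ends]

for sel_left, sel_right in [("", ""), ("A", "C"), ("AC", "CA")]:
    kept = selectively_amplified(fragments, sel_left, sel_right)
    print(f"+{len(sel_left)}/+{len(sel_right)}: {len(kept)} of {len(fragments)}")
# +0/+0: 256 of 256
# +1/+1: 16 of 256
# +2/+2: 1 of 256
```

In this idealized pool every added selective nucleotide on one primer removes three quarters of the fragments, so longer 3' extensions give sparser, more readable fingerprints.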
Applications of molecular markers in plant genome analysis and breeding
Molecular markers have been looked upon as tools for a large number of applications, ranging from localization of a gene to improvement of plant varieties by marker-assisted selection. They have also become extremely popular markers for phylogenetic analysis, adding new dimensions to the evolutionary theories. If we look at the history of the development of these markers, it is evident that they have been improved over the last two decades to provide easy, fast and automated assistance to scientists and breeders. Genome analysis based on molecular markers has generated a vast amount of information and a number of databases are being generated to preserve and popularize it.
Mapping and tagging of genes: Generating tools for marker-assisted selection in plant breeding
Plant improvement, either by natural selection or through the efforts of breeders, has always relied upon creating, evaluating and selecting the right combination of alleles. The manipulation of a large number of genes is often required for improvement of even the simplest of characteristics. With the use of molecular markers it is now routine to trace valuable alleles in a segregating population and to map them. Once mapped, these markers enable dissection of complex traits into component genetic units more precisely, thus providing breeders with new tools to manage these complex units more efficiently in a breeding programme.
The very first genome map in plants was reported in maize, followed by rice, Arabidopsis, etc., using RFLP markers. Maps have since been constructed for several other crops like potato, barley, banana, members of Brassicaceae, etc. Once the framework maps are generated, a large number of markers derived from various techniques are used to saturate the maps as much as possible. Microsatellite markers, especially STMS markers, have been found to be extremely useful in this regard. Because they follow clear Mendelian inheritance, they can be easily used in the construction of index maps, which can provide an anchor or reference point for specific regions of the genome. About 30 microsatellites have already been assigned to five linkage groups in Arabidopsis, while their integration into the genetic linkage maps is still in progress in rice, soybean, maize, etc. The very first attempt to map microsatellites in plants was made by Zhao and Kochert (ref. 68) in rice using (GGC)n, followed by mapping of (GA)n and (GT)n by Tanksley et al. (ref. 69) and of (GA/AG)n, (ATC)10 and (ATT)14 by Panaud et al. in rice. The most recent microsatellite map has been generated by Milbourne et al. for potato. As with microsatellites, the pattern of variation generated by retrotransposons has led to the proposal that these markers, apart from revealing genetic variability, are ideal for integrating genetic maps. Once mapped, these markers are efficiently employed in tagging several individual traits that are extremely important for a breeding programme, like yield, disease resistance, stress tolerance, seed quality, etc. A large number of monogenic and polygenic loci for various traits have been identified in a number of plants and are currently being exploited by breeders and molecular biologists together, so as to make the dream of marker-assisted selection come true.
Tagging of useful genes, like the ones responsible for conferring resistance to plant pathogens, synthesis of plant hormones, drought tolerance and a variety of other important developmental pathway genes, is a major target. Such tagged genes can also be used for detecting the presence of useful genes in the new genotypes generated in a hybrid programme or by other methods like transformation, etc. RFLP markers have proved their importance as markers for gene tagging and are very useful in locating and manipulating quantitative trait loci (QTL) in a number of crops. The very first reports on gene tagging were from tomato, providing the means for identification of markers linked to genes involved in several traits like water use efficiency, resistance to Fusarium oxysporum (the I2 gene), leaf rust resistance genes Lr9 and Lr24, and root knot nematodes (Meloidogyne sp.) (the Mi gene).