A Summary of DNA Sequencing, Technologies, Applications and Future Scope
PESU I/O: DNA Sequencing and Nanopore Technologies
Subgroup-6

1.1 Abstract:
Deoxyribonucleic acid (DNA) is the fundamental component of the genetic makeup of any organism. DNA has the property of being unique to every organism, even within a particular species. The basic component of DNA that is responsible for the storage of the genetic code is termed a nucleotide. DNA sequencing is the process of identifying these nucleotides and the order in which they are found in a sample of DNA. This paper aims to describe in detail the past techniques used to sequence DNA, the applications of DNA sequencing and the future avenues that can be explored with this technique.

2.1 Introduction:
DNA is an informative macromolecule in that it transmits the genetic code and characteristics from one generation to another. Owing to this property, the technique of sequencing DNA has played a pivotal role in deepening our understanding of the evolution of life on this planet and the order it followed. By comparing the genetic information of different species, it is possible to identify where they overlap and where they diverge. The more closely species are related on the phylogenetic tree, the greater the amount of overlap in their DNA sequences. It can be concluded that DNA sequencing plays a major role in the present-day study of evolutionary biology, among many other fields, and has proven to be an effective and accurate method of scientific study.

3.1 Working principle:
In one method of DNA sequencing, called chain-termination sequencing, the DNA to be sequenced, called the template DNA, is first prepared as single-stranded DNA. Next, a short oligonucleotide is annealed, or joined, to the same position on each template strand. The oligonucleotide acts as a primer for the synthesis of a new DNA strand that will be complementary to the template DNA. This technique requires that four nucleotide-specific reactions (one each for G, A, C, and T) be performed on four identical samples of DNA. The four sequencing reactions require the addition of all the components necessary to synthesize and label new DNA, including:
- A DNA template,
- A primer tagged with a radioactive material or a photoemissive chemical,
- DNA polymerase,
- Four deoxynucleotides (G, A, C, T), and
- One dideoxynucleotide (ddG, ddA, ddC, or ddT).

After the first deoxynucleotide is added to the growing complementary sequence, DNA polymerase moves along the template and continues to add base after base. The strand synthesis reaction continues until a dideoxynucleotide is added, blocking further elongation. This is because dideoxynucleotides are missing a chemical group, the 3'-hydroxyl group, that is needed to form a connection with the next nucleotide. Only a small amount of a dideoxynucleotide is added to each reaction, allowing different chains to extend for various lengths of time until, by chance, DNA polymerase inserts a dideoxynucleotide, terminating the reaction. The result is therefore a set of new chains, all of different lengths.

To read the newly generated sequence, the four reactions are run side by side on a polyacrylamide sequencing gel. The family of molecules generated in the presence of ddATP is loaded into one lane of the gel, and the other three families, generated with ddCTP, ddGTP, and ddTTP, are loaded into three adjacent lanes. After electrophoresis, the DNA sequence can be read directly from the positions of the bands in the gel.

4.1 Past methods used to sequence DNA:
In the past, the sequencing of DNA has been performed by two primary methods that retain some relevance to this day. They are
- Maxam-Gilbert sequencing, and
- Sanger-Coulson sequencing.

4.2 Sanger sequencing
Sanger sequencing is a method of DNA sequencing based on the selective incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication. The classical chain-termination method requires a single-stranded DNA template, a DNA primer, a DNA polymerase, normal deoxynucleotide triphosphates (dNTPs), and modified dideoxynucleotide triphosphates (ddNTPs), the latter of which terminate DNA strand elongation. These chain-terminating nucleotides lack the 3'-OH group required for the formation of a phosphodiester bond between two nucleotides, causing DNA polymerase to cease extension of the DNA when a modified ddNTP is incorporated. The ddNTPs may be radioactively or fluorescently labelled for detection in automated sequencing machines.

The DNA sample is divided into four separate sequencing reactions, each containing all four of the standard deoxynucleotides (dATP, dGTP, dCTP and dTTP) and the DNA polymerase. To each reaction is added only one of the four dideoxynucleotides (ddATP, ddGTP, ddCTP, or ddTTP), while the other added nucleotides are ordinary ones. The dideoxynucleotide concentration should be approximately 100-fold lower than that of the corresponding deoxynucleotide (e.g. 0.005 mM ddTTP : 0.5 mM dTTP) to allow enough fragments to be produced while still transcribing the complete sequence. In short, four separate reactions are needed in this process, one for each of the four ddNTPs.

Following rounds of template DNA extension from the bound primer, the resulting DNA fragments are heat denatured and separated by size using gel electrophoresis. This separation is frequently performed using a denaturing polyacrylamide-urea gel, with each of the four reactions run in one of four individual lanes (lanes A, T, G, C). The DNA bands may then be visualized by autoradiography or UV light, and the DNA sequence can be read directly off the X-ray film or gel image. In the original publication of 1977, the formation of base-paired loops of ssDNA was a cause of serious difficulty in resolving bands at some locations.

4.3 Maxam-Gilbert sequencing
Maxam-Gilbert sequencing is based on nucleobase-specific partial chemical modification of DNA and subsequent cleavage of the DNA backbone at sites adjacent to the modified nucleotides.

Although Maxam and Gilbert published their chemical sequencing method two years after Frederick Sanger and Alan Coulson published their work on plus-minus sequencing, Maxam-Gilbert sequencing rapidly became more popular, since purified DNA could be used directly, while the initial Sanger method required that each read start be cloned for production of single-stranded DNA. However, with the improvement of the chain-termination method, Maxam-Gilbert sequencing has fallen out of favour due to its technical complexity, which prohibits its use in standard molecular biology kits, its extensive use of hazardous chemicals, and difficulties with scale-up.

Maxam-Gilbert sequencing requires radioactive labeling at one 5′ end of the DNA fragment to be sequenced (typically by a kinase reaction using gamma-32P ATP) and purification of the DNA. Chemical treatment generates breaks at a small proportion of one or two of the four nucleotide bases in each of four reactions (G, A+G, C, C+T). For example, the purines (A+G) are depurinated using formic acid, the guanines (and to some extent the adenines) are methylated by dimethyl sulfate, and the pyrimidines (C+T) are hydrolysed using hydrazine. The addition of salt (sodium chloride) to the hydrazine reaction inhibits the reaction of thymine, giving the C-only reaction. The modified DNAs may then be cleaved by hot piperidine, (CH2)5NH, at the position of the modified base. The concentration of the modifying chemicals is controlled to introduce on average one modification per DNA molecule. Thus a series of labeled fragments is generated, from the radiolabeled end to the first "cut" site in each molecule.

The fragments in the four reactions are electrophoresed side by side in denaturing acrylamide gels for size separation. To visualize the fragments, the gel is exposed to X-ray film for autoradiography, yielding a series of dark bands, each showing the location of identical radiolabeled DNA molecules. From the presence and absence of certain fragments, the sequence may be inferred.

5.1 Applications of DNA sequencing:
General applications of DNA sequencing include usage in fields like
- Metagenomics: Study of genetic material recovered directly from environmental samples
- Medicine: Genetic disorder identification, gene therapy
- Forensic science: Crime investigation, paternity testing, DNA fingerprinting
- Molecular biology: Protein synthesis, physiological analysis, drug response mechanisms, phenotype identification
- Evolution studies: Timeline formation and so on.

5.1.1 Specific Application: Evolution of Organisms
The patterns and sequences within DNA are analyzed to arrive at a logical hierarchy related to the evolution of organisms.
The components of study include primarily
1. The phylogenetic tree,
2. Evolutionary clock theory,
3. Extinction,
4. Morphology of organisms,
5. Habitat of organisms, and
6. Behavior of organisms.

The phylogenetic tree: More overlap between the DNA samples of two organisms implies that they are more closely related than organisms whose samples show less overlap.

Evolutionary clock: Genetic samples help determine the era during which two species diverged from a common ancestor. This is measured on a relative scale using known data, as the absolute time cannot be known from a sequencing routine alone. Larger differences in the DNA indicate a longer time since divergence. This method is, of course, limited in utility by factors such as limited population size, changes in protein function, changes in the mechanism of natural selection, species-specific anomalies, and varying generation times.

Extinction: Divergence offers clues about mass extinctions, as well as single extinctions and their probable cause, for example the probable presence of a mutated gene in mammoths.

Morphology of organisms: As an example, it is known that Neanderthal DNA differs from modern human DNA by 0.3%. This includes the MC1R gene, which provides instructions to make a protein called the melanocortin 1 receptor, responsible for skin and hair colors. Neanderthals are believed to have had reddish hair and light skin, and their mating with modern humans resulted in individuals who now inhabit Europe and Asia.

Habitat of organisms: As an example, the Denisovans were a class of humans who inhabited present-day Siberia. Similarities between their DNA and that of Aborigines, Tibetans and Eskimos indicated that they could survive in high-altitude conditions with frigid temperatures. Habitat analysis can likewise help explain the extinction of the mammoths through inbreeding, as they tended to live on isolated islands.

Behavior of organisms: Changes in the FOXP2 gene are linked to the human use of language and expression, while apes, which lack this mutation, cannot use language. The 'fruitless' gene of Drosophila melanogaster determines mating and courtship behavior. Feeding, foraging and psychological behavior are also genetically influenced to an extent.

5.1.2 Past experiments and hypotheses
The Genographic Project: In 2005, the National Geographic Society (NGS) and IBM launched the Genographic Project to retrace the earliest human migrations using DNA sequencing, with samples donated by people all over the world. This project provided valuable insight into the origin of humans as well as the later migrations that brought them to occupy all corners of the globe. By calculating the pattern of genetic diversity that arises in different populations due to mutations, the age and ancestry of people living in different regions can be deduced. From this work, it is now known that humans originated in Africa and left the continent about 60,000 years ago to populate the planet. The project drew on the fact that the Y chromosome is passed down virtually unchanged from father to son and that the DNA in the cell's mitochondria is passed down from the mother.

Molecular clock hypothesis: The molecular clock hypothesis states that DNA and protein sequences evolve at a rate that is relatively constant over time and among different organisms. Therefore, if the hypothesis is true, it is an extremely useful method for estimating evolutionary timelines.

Bacterial classification: DNA sequencing has also facilitated the identification of pathogens such as Helicobacter pylori and the Mycobacterium species. Pyrosequencing of the 16S rRNA gene has also been used to develop a molecular Gram stain in order to rapidly classify bacteria using molecular methods.

6.1 Future applications and scope
- Some organisms which no longer exist, or certain traits associated with them, could be brought back to life.
- Exaptation is the repurposing of a certain sequence for a different function. This happens in non-coding sequences. It has happened a lot in the last 400 million years of evolution and can be exploited in the future.
- Epigenetics concerns the blocking or accessing of certain genes. Blocking by DNA methylation, or even changing the sequence altogether, can be done for specific purposes.

7.1 Conclusion and afterthoughts
From the above findings, it can be argued with reasonable validity that DNA sequencing as a field has a lot of future potential, and that continued research in this field will serve to deepen our understanding of ourselves and of what led to our existence. It can also be argued with reasonable validity that the practice of gene manipulation can be beneficial to medicine and to victims of mutations induced by nuclear disasters and the like.

Appendix-I

A-I.1: Conventional DNA sequencing (figure)

A-I.2 References
- Sanger F, Coulson AR (May 1975). "A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase". J. Mol. Biol. 94
- Maxam AM, Gilbert W (February 1977). "A new method for sequencing DNA". Proc. Natl. Acad. Sci. U.S.A.
- King TE, Jobling MA (2009). "What's in a name? Y chromosomes, surnames and the genetic genealogy revolution". Trends in Genetics. 25
- Wells S (2013). "The Genographic Project and the Rise of Citizen Science". Southern California Genealogical Society (SCGS)
- Harry D, Kanehe LM (Winter 2005). "Genetic Research: Collecting Blood to Preserve Culture?". Cultural Survival. 29.4