This module looks at evidence that nucleic acids are the genetic material, and examines the structure of nucleic acids (DNA and RNA), which underlies their function.
1. Understand the experiments that led DNA to be identified as the genetic material.
2. Know that DNA is a polymer of nucleotides, and that each nucleotide consists of a sugar, a phosphate, and a nitrogenous base. Know that there are four types of bases routinely used in DNA, and that they fall into two basic types: purines and pyrimidines. It is not necessary to know the exact molecular structure of each base.
3. Understand that RNA is similar in structure to DNA, but know the differences between the two types of molecules.
4. Understand how nucleic acids denature.
Genetic Material
When Mendel proposed the existence of his 'factors' (see the discussion of Mendel's postulates), he had no idea exactly what form those factors might have. Even in the first half of the twentieth century, there was considerable debate about the nature of the genetic material. Some thought it must be made of protein, whereas others believed it was made of nucleic acid. In spite of this uncertainty, there was consensus about what properties the genetic material must have.
Properties of the Genetic Material
The genetic material must be able to store hereditary information.
The genetic material must be able to be replicated (reproduced).
The information stored in the genetic material must be able to be expressed.
The genetic material must be able to undergo change (mutation/recombination).
These properties can be grouped into three essential functions of the genetic material: the genotypic function (storage and replication of hereditary information to be passed on to future generations), the phenotypic function (expression of the stored information as a physical or behavioral trait), and the evolutionary function (change in the information over time so that organisms can adapt to environmental changes). The first real evidence of the nature of the genetic material came from studies of genetic changes in bacteria.
Bacterial Transformation
The discovery of the principle of bacterial transformation provided strong evidence to support DNA as the genetic material. This discovery came from research on Diplococcus pneumoniae, a bacterium that causes pneumonia. There are two strains of this bacterium: the type 'S' strain, which forms smooth colonies on agar plates and is virulent (causes disease) because the bacteria are covered with a polysaccharide capsule; and the type 'R' strain, which forms rough colonies on agar plates and is avirulent (does not cause disease) because the bacteria do not have a capsule. The virulence properties are summarized in the series of diagrams below.
When mice are injected with type S cells (from smooth colonies), the mice die of pneumonia.
When mice are injected with type R cells (from rough colonies), the mice survive.
When type S cells are heat killed and then injected, the mice survive, indicating that the cells must be alive to be virulent.
When living type R cells were mixed with heat-killed type S cells and injected, the mice surprisingly died. Even though either cell type alone is harmless, when combined, they become lethal. What's more, live type S cells could be isolated from the dead mice! It was concluded that the presence of the dead type S cells was somehow transforming the live type R cells into type S cells. In other words, the dead type S cells were causing a genetic change in the type R cells. But what was the transforming substance? DNA as the Transforming Material To try to identify the transforming material, Avery, MacLeod, and McCarty made extracts from heat-killed type S cells. The extracts were mixed with live type R cells. Antibodies against type R cells were then used to precipitate and remove any type R cells from the mixture, and the remaining mixture was cultured to see what types of colonies would grow. In this instance, type S colonies grew, indicating that some of the type R cells were transformed. The experiment was then repeated, but the extracts were pretreated with enzymes that degraded either DNA, RNA, or protein, then mixed with the type R cells. As shown in the summary diagram below, treatment of the extract with DNase prevented the transformation. Since degrading the DNA prevented transformation, the transformation must have been caused by DNA from the dead type S cells.
In a separate line of investigation, Hershey and Chase demonstrated that DNA, and not protein, was necessary for reproduction of bacterial viruses. These results taken together indicate that DNA is the genetic material of prokaryotes*. But what about eukaryotes? Unfortunately, direct evidence of the nature of genetic material was harder to come by, because eukaryotic cells are much more complex than prokaryotes. For many years, there was only indirect circumstantial evidence supporting DNA as the eukaryotic genetic material. For example, it was known that somatic cells are diploid (2n) in terms of genetic material whereas gametes are haploid (n). Measurement of DNA content of each type of cells revealed that somatic cells contained twice as much DNA as gametes did, which is the expected result if DNA is the genetic material. With the advent of molecular biology and the ability to manipulate DNA, direct evidence finally was obtained that implicated DNA as the eukaryotic genetic material. Many studies have shown that by manipulating DNA, it is possible to change phenotypes of individual cells and entire organisms. At this point, there remains no doubt that DNA is the genetic material. * It has since been discovered that some viruses actually have RNA as their genetic material. DNA and RNA Structure The nucleic acids, DNA and RNA, are composed of repeating subunits called nucleotides. Each nucleotide is in turn made of three structures: A nucleotide = a phosphate + a 5-carbon sugar + a cyclic nitrogenous base
DNA In the case of DNA, the 5-carbon sugar found in nucleotides is called 2'-deoxyribose. (The name means that compared to ribose, this sugar is lacking a hydroxyl group at the 2' position, as shown in the diagram above.) There are four bases: adenine, thymine, cytosine, and guanine (abbreviated A, T, C, G). These bases fall into two groups, based on molecular structure: adenine and guanine are purines, while thymine and cytosine are pyrimidines. DNA is a polynucleotide, meaning it is a polymer consisting of nucleotides joined together. Nucleotides are joined by a phosphodiester bond, in which the phosphate (which is attached to the 5' carbon of deoxyribose) of one nucleotide becomes connected to the hydroxyl (OH) group (attached to the 3' carbon) of the other nucleotide. This creates a chain of alternating phosphate and sugar groups, with the bases sticking out the side. The nucleotides are all connected by a 5' phosphate joined to a 3' hydroxyl. Any chain of nucleotides will therefore have a free (unattached) 5' phosphate at one end of the chain, and a free 3' hydroxyl at the other end. These are referred to as the 5' end and the 3' end of the DNA strand, respectively. DNA is usually found as a double-stranded molecule. This means that two chains are joined together in a ladder configuration. The strands are joined by hydrogen bonding between the bases on the nucleotides. These hydrogen bonds are called base pairs. Hydrogen bonds are very weak bonds, but enough of them acting together are strong enough to hold two strands of DNA together. Chromosomes have hundreds of millions of such base pairs, which are more than enough to hold the strands together. Base pairing must be specific in order to work. Only certain combinations of bases will hydrogen bond with each other. For example, adenine will only base pair with thymine, and cytosine will only base pair with guanine. 
Therefore, in order for two strands of DNA to base pair, the appropriate bases must be present in corresponding
positions on the two strands. Two such strands are said to be complementary. The two strands of DNA run in opposite directions when base paired (the 5' end of one strand is next to the 3' end of the other strand). The two strands are therefore said to be antiparallel. When double stranded, DNA forms a double helix, which is like a spiral staircase in which both rails spiral around a central point.
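The pairing and antiparallel rules described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example sequence are made up for this sketch, and strands are assumed to be written 5' to 3' using the letters A, T, C, and G.

```python
# Base-pairing rules from the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementary_strand(strand_5_to_3: str) -> str:
    """Return the complementary DNA strand, also written 5' to 3'."""
    # Pair each base with its partner, position by position.
    paired = [PAIRS[base] for base in strand_5_to_3.upper()]
    # The partner strand runs in the opposite direction (antiparallel),
    # so reverse it to read it from its own 5' end.
    return "".join(reversed(paired))

top = "ATGCCG"
bottom = complementary_strand(top)
print("5'-" + top + "-3'")
print("5'-" + bottom + "-3'  (complementary strand)")
```

Applying the function twice returns the original sequence, which is one way to see that complementarity is a mutual relationship between the two strands.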
RNA RNA structure is similar to DNA structure. There are some differences, however. For one thing, the sugar found in RNA nucleotides is ribose rather than deoxyribose. (Check the structure in the nucleotide diagram.) Like DNA, RNA nucleotides contain the bases adenine, cytosine, and guanine, but in RNA thymine is replaced by uracil (uracil base pairs with adenine). RNA occasionally occurs as a double-stranded molecule, but mostly, RNA is single stranded (not base-paired to another strand). There are three major classes of RNA: messenger RNA (mRNA), ribosomal RNA (rRNA), and transfer RNA (tRNA). Each of these types of RNA plays a different role in the cell, as we will see elsewhere. All types of RNA are produced as complementary copies of one strand of DNA, by the process of transcription. Denaturation Because the two strands of double-stranded nucleic acids are held together by hydrogen bonds, which are individually weak, if enough energy is added to a double-stranded molecule, it will separate into two individual strands. In other words, if you heat up a DNA molecule to a high enough temperature, it will separate into two strands. This process is known as denaturation or melting. If the temperature is then lowered again, then complementary strands will renature, or reform a double-stranded molecule. This property of nucleic acids forms the basis of some powerful analytical techniques, as will be seen in the module on molecular techniques.
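As a rough illustration of why the amount of energy needed for denaturation depends on the sequence, the sketch below counts the hydrogen bonds holding a DNA duplex together. It assumes the standard values of two hydrogen bonds per A-T pair and three per G-C pair, which are not stated in this module. The function name is hypothetical, and a bond count is only a crude proxy for actual melting behavior.

```python
# Assumed hydrogen bonds per base pair (standard values, not from this module):
# an A-T pair has two, a G-C pair has three.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def hydrogen_bonds(strand: str) -> int:
    """Count hydrogen bonds in a duplex formed by this strand and its complement."""
    return sum(H_BONDS[base] for base in strand.upper())

at_rich = "ATATAT"
gc_rich = "GCGCGC"
print(at_rich, hydrogen_bonds(at_rich))
print(gc_rich, hydrogen_bonds(gc_rich))
```

Two duplexes of the same length can therefore differ in how many hydrogen bonds must be broken before the strands separate.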
Nucleic Acids: Summary of Key Points
DNA is the genetic material of all organisms except for a few viruses. This was initially demonstrated in bacteria, where DNA was shown to be able to transform one type of bacterial cell into another. DNA is made of subunits called nucleotides. A nucleotide consists of a 5-carbon sugar (2'-deoxyribose) linked to a phosphate group and a nitrogenous base. There are four nitrogenous bases used in DNA: adenine, thymine, guanine and cytosine. Adenine and guanine are purines; cytosine and thymine are pyrimidines.
Nucleotides are joined together in a chain. A DNA molecule consists of two such chains, joined to each other by hydrogen bonding between the nitrogenous bases on the nucleotides of each chain. Adenine bonds to thymine, and guanine bonds to cytosine. RNA is similar to DNA, except that the bases include uracil instead of thymine, and the sugar is ribose instead of deoxyribose. Adding energy (heat) to DNA can overcome the hydrogen bonds, denaturing the double-stranded DNA into two single strands. Cooling the molecules can reverse the process.
This module looks at the way DNA reproduces itself, which is one of the necessary properties for its role as hereditary material.
1. Understand the semiconservative nature of DNA replication. Realize that the process begins at unique origins of replication, and proceeds bidirectionally.
2. Know that DNA synthesis is catalyzed by a family of enzymes called DNA polymerases. Understand that DNA polymerase has a requirement for a template on which to synthesize the new DNA strand, and for a primer from which to extend the DNA strand.
3. Understand the various functions of the DNA polymerases, such as exonuclease and polymerase activities, and their function in the replication process.
4. Know the other components of the replication apparatus, and their basic functions. Understand the principles of leading strand synthesis and lagging strand synthesis.
5. Realize that eukaryotes require the activity of telomerase to complete the synthesis of their linear chromosomes.
The Semiconservative Nature of DNA Replication
One property of the genetic material necessary for its function is the ability to replicate (reproduce) itself. After it was established that DNA is the genetic material, attention turned toward how DNA replicates in living organisms. The Watson-Crick model of DNA structure (as outlined in the module on nucleic acids) suggested a possible mechanism for replication of DNA molecules. The nature of base pairing meant that if the two strands of a DNA molecule were separated, they could each serve as a template for the creation of a complementary strand by bringing in individual nucleotides to base pair with their complementary base on the template, and joining the new nucleotides together. Thus, each DNA molecule after replication would consist of one of the original strands plus one newly synthesized strand. This model of DNA replication is called semiconservative. Semiconservative replication was not the only proposed model, however. Other proposed models included conservative replication and dispersive replication.
Conservative replication proposed that after replication, one DNA molecule consists entirely of newly synthesized DNA whereas the other molecule is entirely original DNA. Dispersive replication suggested that each DNA molecule after replication might consist of segments of new and old DNA interspersed. It would be difficult to devise a mechanism by which this latter outcome might occur, but until evidence to the contrary was produced, it had to be considered. The three possible models of DNA replication are depicted below.
To distinguish between these possibilities, Meselson and Stahl did the following experiment: First, they grew bacteria for many generations in a growth medium containing 15N. This is a heavy isotope of nitrogen (in contrast to the normal isotope, 14N), which over many generations would be incorporated into all nitrogen-containing molecules of the cells, including DNA. DNA isolated from these cells could be distinguished from normal DNA because it would have a higher density. The bacteria grown in heavy nitrogen were then transferred to growth medium containing 14N for one round of replication. This lighter isotope would be incorporated into any newly synthesized DNA. If semiconservative replication occurred, then each DNA molecule after replication would contain heavy nitrogen and light nitrogen, and would therefore have a density intermediate between the two. Conservative replication would produce one DNA molecule containing heavy nitrogen and one molecule containing light nitrogen, so there would be two different densities. Dispersive replication would produce a single intermediate density, just like semiconservative replication. The observed density of the DNA after one round of replication was intermediate. Replication was therefore either semiconservative or dispersive. These possibilities could be distinguished after a second round of replication. After two rounds, semiconservative replication would produce two DNA molecules containing only light nitrogen, and two DNA molecules containing one light strand and one heavy strand. Therefore there would be two different densities: light and intermediate. Two rounds of dispersive replication would produce four DNA molecules, each of which would contain mostly light nitrogen and some heavy nitrogen. There would be a single density (we'll call it 'slightly heavy').
When density of the DNA was measured after two rounds, two densities were observed: light and intermediate, indicating that DNA replication is semiconservative, and not dispersive or conservative.
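The predictions of the three models can be checked with a small simulation. This is only a sketch under simplifying assumptions that are not part of the original experiment's description: each strand is represented by the fraction of heavy nitrogen it contains, a molecule's density is taken to be the average of its two strands, and dispersive replication is assumed to spread the old material evenly over all four resulting strands. All names are hypothetical.

```python
from fractions import Fraction

HEAVY, LIGHT = Fraction(1), Fraction(0)  # heavy-nitrogen fraction of a strand

def semiconservative(duplex):
    # Each old strand is kept and paired with a brand-new light strand.
    a, b = duplex
    return [(a, LIGHT), (b, LIGHT)]

def conservative(duplex):
    # The original molecule stays intact; the copy is entirely new (light).
    return [duplex, (LIGHT, LIGHT)]

def dispersive(duplex):
    # Assumption: old material is spread evenly across all four strands.
    share = (duplex[0] + duplex[1]) / 4
    return [(share, share), (share, share)]

def densities(model, rounds):
    """Distinct heavy fractions (density bands) after the given number of rounds."""
    population = [(HEAVY, HEAVY)]  # start with fully heavy DNA
    for _ in range(rounds):
        population = [new for duplex in population for new in model(duplex)]
    return sorted({(a + b) / 2 for a, b in population})

for model in (semiconservative, conservative, dispersive):
    for rounds in (1, 2):
        bands = [str(band) for band in densities(model, rounds)]
        print(model.__name__, "round", rounds, "bands:", bands)
```

In this sketch a band of 0 is light, 1 is heavy, and 1/2 is intermediate. Only the semiconservative model gives a single intermediate band after one round followed by a light band and an intermediate band after two rounds, which matches the observed result.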
Origins of Replication
Although the semiconservative nature of DNA replication had been confirmed, many questions about replication remained. One of these questions was: is replication initiated at a specific site on the chromosome, or is it initiated at random sites, or even multiple sites? The answer to this question depends somewhat on the organism being considered. Bacteria, for example, have a single specific origin of replication; in other words, bacterial replication begins at the same spot on the chromosome every time. In E. coli, this site is called OriC. OriC contains a 9 base-pair (bp) sequence that is repeated four times within the region. Eukaryotes also have specific sites at which replication is originated. However, because eukaryotic cells contain much more DNA than bacteria (humans have approximately 1500 times as much DNA as E. coli), there must be multiple origins of replication on each chromosome in order to replicate all of the DNA in a timely fashion. The amount of DNA replicated from a single origin is called a replicon. Other research has revealed that DNA replication proceeds bidirectionally from an origin of replication. This means that replication proceeds in opposite directions away from the origin:
Note in the diagram how each original DNA molecule branches, or forks, at the point where replication is occurring. These branch points are called replication forks. Because replication is bidirectional, two replication forks form at each origin of replication. (Some rare examples have been seen where replication is unidirectional from the origin.) The open area of the chromosome between the replication forks is called a replication bubble.
DNA Polymerase I
DNA replication is catalyzed by a family of enzymes called DNA polymerases. The first of these enzymes to be discovered, DNA polymerase I, was isolated from bacteria (specifically, E. coli). Characterization of the activity of this enzyme in vitro revealed that it had certain requirements for activity. It needed 5'-triphosphate forms of the four nucleotides, and it required the presence of preexisting DNA. The DNA serves two purposes: 1) it serves as a template for the synthesis of the new DNA (the template determines the sequence of the new DNA strand, through the specificity of base pairing), and 2) it serves as a primer for DNA synthesis. It turns out that DNA polymerase I cannot initiate DNA synthesis without having a free 3'-OH to add a new nucleotide to. DNA synthesis therefore needs a primer, a preexisting piece of nucleic acid to serve as an initiator of DNA synthesis. DNA polymerase I synthesizes DNA by forming a bond between the 5' phosphate of the incoming nucleotide (the other two phosphate groups from the nucleotide triphosphate are lost) and the 3' OH group of the nucleotide at the end of the growing DNA chain. If you draw this out for yourselves, you'll realize that this means the DNA chain being synthesized grows in a 5' to 3' direction. This is an important rule to remember: DNA polymerase synthesizes DNA only in a 5' to 3' direction. In addition to its polymerase activity, DNA polymerase I has two other enzymatic activities, both of which are exonuclease activities. Exonucleases are enzymes that digest DNA (cleave phosphodiester bonds), chewing away at nucleotides from the end of the DNA chain. DNA polymerase I has 3' to 5' exonuclease activity, which degrades DNA in a direction opposite to that of synthesis. This provides the enzyme with a proofreading function: if a wrong nucleotide gets inserted into a growing chain, the enzyme can digest it out with the 3' to 5' exonuclease activity (almost like using the delete key while word processing), and insert the correct nucleotide. DNA polymerase I also has a 5' to 3' exonuclease activity, which degrades nucleic acids in the same direction as synthesis. As we'll see, this activity is used to remove primers during DNA replication.
Multiple Polymerases
DNA polymerase I, it turns out, is not the main enzyme involved in replicating the bacterial chromosome. When the gene encoding the enzyme was mutated in E. coli, the bacteria were still able to replicate their chromosomes. They were, however, deficient in DNA repair. This suggested that DNA polymerase I is primarily involved in DNA repair (although it does play a role in replication, as we will see), and that another yet-to-be-discovered enzyme would be responsible for replication. Eventually, two other DNA polymerases were identified, and named DNA polymerases II and III. These both had 5' to 3' polymerase activity like DNA polymerase I, and 3' to 5' exonuclease activity. Neither of these enzymes had the 5' to 3' exonuclease activity found in DNA polymerase I.
The various enzyme activities of the different polymerases are summarized in the table below.
Enzyme Activity         DNA Polymerase I   DNA Polymerase II   DNA Polymerase III
5' to 3' polymerase     Yes                Yes                 Yes
3' to 5' exonuclease    Yes                Yes                 Yes
5' to 3' exonuclease    Yes                No                  No
DNA polymerase III turns out to be the main enzyme involved in DNA replication. DNA polymerase II is a minor enzyme involved in DNA repair. DNA polymerase I is the main polymerase involved in DNA repair, and plays a specialized role in DNA replication, using its 5' to 3' exonuclease activity.
The Mechanism of Prokaryotic DNA Replication
As mentioned above, DNA replication in E. coli begins at OriC. It starts when the polypeptide products of the dnaA gene bind to the origin. These polypeptides cause localized strand separation. This allows a complex of the protein products of the dnaB and dnaC genes to bind. This complex acts as a helicase, which functions to unwind the DNA further. This unwinding produces the two replication forks. The unwound single-stranded region is kept single stranded through the action of single-strand binding proteins. There are other proteins that are found at the replication forks at this time, but their function is not well understood, so they will not be addressed. At this point, a primer is needed so that DNA polymerase III can begin to act. As mentioned earlier, DNA synthesis needs a primer, so how is a primer produced? An enzyme called primase serves this purpose, by synthesizing a short stretch of RNA (generally from 5 to 15 nucleotides in length). RNA synthesis does not require a primer, so primase (which is a type of enzyme called an RNA polymerase) is able to synthesize a short primer where needed. Once this primer is made, DNA synthesis can begin, extending the polynucleotide chain originating with the RNA primer. Priming and synthesis occur on both strands. As the helicase complex moves along the parental DNA, it shifts the replication fork, allowing synthesis to continue. This leads to another problem that has to be solved. As more DNA unwinds, and the replication fork moves along, synthesis of one strand (the lower strand in the diagram) can just continue, following the movement of the replication fork. The other strand being synthesized, however, cannot do this. As the replication fork moves along, it leaves a gap behind, as shown in panel B of the figure.
To compensate for this, a second RNA primer must be synthesized a bit behind the first one, and DNA synthesized until it reaches the first primer (this is shown in panel C). You can easily imagine that as the replication fork progresses a bit further, this process will have to be repeated. Therefore, at each replication fork, the synthesis of one new DNA strand (the lower one in the figure) is continuous, while synthesis of the other strand must be accomplished in small increments, short stretch after short stretch; this type of synthesis is termed discontinuous. The strand of DNA that is synthesized continuously is called the leading strand, and the strand that is synthesized discontinuously is called the lagging strand. The small fragments of DNA making up the lagging strand are named Okazaki fragments, after the researcher who discovered them. Okazaki fragments are typically about 1000 to 2000 nucleotides each.
Discontinuous replication solves one problem but still leaves one matter unsettled: the lagging strand will be composed of individual, unjoined fragments of DNA and RNA. This is where DNA polymerase I comes into play. DNA polymerase I uses its 5' to 3' exonuclease activity to digest away the primer RNA, and replaces the primer with DNA by extending the strand from the adjacent Okazaki fragment. At this point all that is left to be done is to physically join the Okazaki fragments. This is accomplished by an enzyme known as DNA ligase. DNA ligase is able to join the 5' end of one DNA strand to the 3' end of another DNA strand.
Eukaryotic DNA Replication
Synthesis of DNA in eukaryotes is less well understood, but the process appears to be basically the same as in prokaryotes, with a few notable exceptions. For one thing, eukaryotic DNA is complexed with histones to form chromatin. Every round of replication, therefore, requires that the histones be removed, then replaced after replication is complete. This requirement understandably slows the whole replication process down. Eukaryotic cells are also much more complex than prokaryotes, because they contain organelles such as mitochondria and chloroplasts (in plants) that contain their own DNA, which must also be replicated. Eukaryotic cells therefore have more than three DNA polymerases; five DNA polymerases have been identified so far. Eukaryotes have another difference: eukaryotic chromosomes are linear, rather than circular as in prokaryotes.
DNA Replication: Summary of Key Points
DNA replication is semiconservative, with each existing strand serving as a template for the synthesis of a new strand. Replication begins at specific locations called origins of replication. Replication requires a primer, and proceeds bidirectionally from the point of origin, creating an expanding replication bubble. On one strand (the leading strand), synthesis is continuous; on the other strand (the lagging strand), synthesis is discontinuous, producing a series of Okazaki fragments that must be ligated together.
Genes and Transcription
This module looks at gene structure at the molecular level (in both prokaryotes and eukaryotes), and examines the process of transcription, by which the information encoded in DNA is copied into an RNA messenger. RNA processing, which produces a mature RNA molecule from an initial transcript, will also be considered.
1. Understand the central dogma of gene expression.
2. Know how genes are defined, and what their basic structure is.
3. Understand the difference between exons and introns in eukaryotic genes.
4. Know the basic structure of regulatory regions of genes (promoters and enhancers), and understand their function.
5. Understand the role of RNA polymerase in transcription, and know the essential steps of transcription.
6. Know the basic differences between prokaryotic and eukaryotic transcription.
7. Understand how eukaryotic RNAs are processed after transcription to produce functional mRNAs.
The Central Dogma of Molecular Biology In addition to being able to replicate itself (see DNA replication), the genetic material must be able to store genetic information, and be able to express that information. How information is stored in DNA will be considered elsewhere (see the module on the Genetic Code). Expression of the information occurs in the following way: the information coded in the DNA is transmitted to an intermediary molecule, RNA. The information in the RNA is then interpreted and translated into a sequence of amino acids, which makes up a polypeptide. The polypeptide, either on its own or in conjunction with other polypeptides, carries out the function specified by the information in the DNA. This flow of genetic information is represented in the following diagram:
This module deals with the process of transcription. A consideration of how the information in RNA is used to produce protein can be found in the modules on the Genetic Code and Translation.
Genes
Not all of the DNA in a cell has its genetic information transmitted to RNA intermediaries. The regions of DNA that do transmit information to RNA are called genes. A gene, therefore, is defined by the initial RNA molecule that is transcribed from it. In the case of prokaryotes, the initial RNA molecule, or initial transcript, is equivalent to the final mature RNA. Eukaryotic RNAs are another story, however. The initial transcript in most eukaryotic genes is processed so that the mature RNA is significantly different. For example, if you compare the nucleotide sequence of the final RNA to the sequence of the gene from which it was transcribed, an interesting property becomes apparent. Nucleotide sequences that are contiguous on the RNA molecule are separated by other nucleotide sequences in the gene. In other words, there are extra nucleotide sequences in the DNA (represented by white areas in the gene) that are not represented in the final RNA molecule, and that "intervene" between the segments of DNA whose sequences are represented in the final RNA (represented by colored blocks in the diagram). These intervening sequences are usually called introns; they do not code for polypeptides. The sequences that are represented in the final RNA and code for amino acids in a polypeptide are called exons. Most eukaryotes have genes that contain introns, whereas prokaryotic genes do not contain introns. Eukaryotic RNAs, after transcription, must therefore have the intron sequences removed in order to produce the final mature RNA. This process of intron removal will be addressed later in the module.
Regulatory Regions
There is more to genes than just the sequences that are used to produce RNAs. Not all genes need to be transcribed all the time.
This is especially true in higher eukaryotes (such as ourselves), where certain genes will only be transcribed in specific tissues (such as muscle, for example). It would therefore be a tremendous waste of a cell's resources to transcribe unnecessary genes. Genes need to be controlled so that they are transcribed only when and where they are needed. To accomplish this, there are regions of the DNA that regulate transcription of individual genes. These are known collectively as regulatory regions. There are two major types of regulatory regions: promoters and enhancers. Promoters are found immediately adjacent to genes in the upstream direction. ('Upstream' and 'downstream' refer to directions relative to the direction of transcription of a gene. Upstream means in the direction
opposite that of transcription.) Promoters are found associated with genes in both prokaryotes and eukaryotes. A promoter contains specific DNA sequences that act as 'molecular switches' to turn on transcription. Examples of these sequences will be discussed later in the module. Eukaryotic genes have enhancers in addition to promoters (prokaryotic genes do not have enhancers). Enhancers are specific DNA sequences that function to enhance transcription (as their name implies). Unlike promoters, enhancers are not found in any specific location relative to a gene. An enhancer does not need to be adjacent to a gene, nor does it need to be upstream. Enhancers have been found upstream of genes, downstream of genes, even within introns! An enhancer can also be found up to several thousand base pairs away from a gene. Summary: Differences between promoters and enhancers
1. A promoter must be immediately adjacent to the gene it controls. An enhancer can act from a long distance away. 2. A promoter has to be upstream of the gene it controls. An enhancer can act from upstream, downstream, or even within a gene. 3. A promoter can only act in one orientation. In other words, if a promoter is turned around 180 degrees, it will cease to function. An enhancer is functional in any orientation.
The role of enhancers will be more closely examined in the module on eukaryotic gene regulation. The function of promoters will be addressed in this module. RNA Polymerase Transcription requires an enzyme to catalyze the synthesis of the RNA molecule. The enzyme responsible for this is called RNA polymerase. (RNA polymerases are a class of enzyme, based on their function. We've seen a member of this class elsewhere - primase from DNA replication is an RNA polymerase. However, the enzyme for transcription is named 'RNA polymerase'.) RNA polymerase, when it was first isolated in the late 1950's, was found to be able to synthesize RNA in vitro, and to have some of the same requirements as DNA polymerase. For example, RNA polymerase required a DNA template. It also required nucleotide triphosphates (although in this case it was ribonucleotide triphosphates instead of deoxyribonucleotide triphosphates, as it was for DNA polymerase). (For more on the structure of nucleotides, see the module on nucleic acid structure.) One way that RNA polymerase differed from DNA polymerase was that RNA polymerase did not require a primer to initiate synthesis. RNA polymerase, just like DNA polymerase, synthesized only in the 5' to 3' direction. Transcription in Prokaryotes RNA polymerase in prokaryotes is made up of five subunits: two alpha subunits, one beta subunit, one beta prime subunit, and one sigma subunit. This configuration is called the RNA polymerase holoenzyme. The whole enzyme has a molecular weight of about 500,000. Of the various subunits, the beta and beta prime subunits are most important for catalytic activity; the sigma subunit is involved in regulation of transcription.
Transcription consists of three basic steps: initiation, elongation, and termination. Each of these steps will be considered in turn. Initiation Initiation involves recognition of the promoter by RNA polymerase. Bacterial promoters generally contain two important DNA sequences that are involved in regulation of transcription. One of these is found at the -10 position (ten base pairs upstream of the transcription start site) and has the sequence TATAAT; the other is found at -35 and has the sequence TTGACA. RNA polymerase binds to DNA, and scans along the DNA until it encounters a promoter. The sigma subunit recognizes the -35 sequence, and causes the polymerase to bind more tightly. The sigma subunit separates from the polymerase complex, leaving a four subunit core enzyme. The DNA unwinds at the -10 sequence, because that sequence is A/T-rich, and A-T base pairs are weaker than G-C base pairs. The unwound region is known as a transcription bubble, and averages about 18 bp in length. RNA polymerase now begins synthesizing RNA at the appropriate position on the gene. Elongation Elongation is much like DNA replication: nucleotides are attached to the 3' end of the growing RNA chain, by the formation of a phosphodiester bond between the 5' phosphate of the incoming nucleotide and the 3' hydroxyl at the end of the RNA chain. There is one major difference between transcription and replication, however. In DNA replication, both DNA strands serve as templates for the synthesis of new DNA. In contrast, only one of the two DNA strands serves as a template for transcription. For a particular gene, it will always be the same strand that serves as a template. This strand is known as the template strand. The other strand, which never serves as a template, is called the non-template or partner strand. 
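The template/non-template relationship described above can be sketched in a few lines of Python. This is only an illustration: the example sequence and the function names are made up, and the template strand is assumed to be written 3' to 5' so that the RNA comes out 5' to 3'.

```python
# Base added to the RNA opposite each base of the DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base found on the partner (non-template) DNA strand opposite each template base.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5' to 3') copied from a template strand written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


def non_template(template_3to5: str) -> str:
    """Return the partner strand (5' to 3') of a template strand written 3' to 5'."""
    return "".join(DNA_TO_DNA[base] for base in template_3to5.upper())


# Hypothetical nine-base stretch of a template strand, written 3' to 5'.
template = "TACGGATTC"
rna = transcribe(template)
partner = non_template(template)
print(rna)      # AUGCCUAAG
print(partner)  # ATGCCTAAG
```

Note that the RNA has the same sequence as the non-template strand, with U in place of T.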
As synthesis continues, a transient RNA-DNA hybrid duplex is formed, but the RNA quickly dissociates from the DNA, so that only a few RNA-DNA base pairs are present at a given time. As the polymerase proceeds along the gene, the DNA rewinds itself behind the enzyme. As a result, the transcription bubble appears to move along the gene with the polymerase. Termination Termination occurs when the RNA polymerase encounters a termination signal in the gene. Prokaryotic genes have two kinds of termination, rho-independent termination and rho-dependent termination, depending on whether termination requires the action of a terminator protein called rho. Each type of termination has a different termination signal in the gene. For rho-independent termination, there is a G/C-rich stretch of nucleotides, followed by an A/T-rich stretch of nucleotides. When the G/C-rich stretch is transcribed into RNA, the sequence of the nucleotides is such that the RNA molecule forms a short double-stranded region called a hairpin.
The hairpin significantly slows down the RNA polymerase (imagine it acting as an anchor or a parachute off the back of the polymerase), causing it to pause in the A/T-rich region. The weaker A-T base pairs in the RNA-DNA duplex allow the transcription complex to fall apart, ending transcription. Rho-dependent termination does not involve a specific sequence, although there is a region of the DNA rich in C. As transcription occurs, a protein called rho attaches to the 5' end of the growing RNA molecule, and begins moving along the RNA toward the polymerase. When the polymerase reaches the C-rich sequence, it pauses, and the rho protein is able to catch up to the polymerase and knock it and the newly synthesized RNA off the gene. Eukaryotic Transcription Overall, the process of RNA synthesis in eukaryotes is similar to that of prokaryotes. There are some real differences, however. For one thing, initial transcripts in eukaryotes contain introns, which must be removed after transcription (this will be examined later). Eukaryotes also have three RNA polymerases, instead of just one. Each of these polymerases transcribes a different class of genes, as outlined in the table below:
RNA Polymerase I      Genes encoding ribosomal RNA
RNA Polymerase II     Genes encoding messenger RNA
RNA Polymerase III    Genes encoding transfer RNA
Our consideration of eukaryotic transcription will focus on genes transcribed by RNA polymerase II (known as class II genes). As with prokaryotes, the transcription process can be broken down into the steps of initiation, elongation, and termination. In eukaryotes, there is also the additional step of RNA processing, which occurs during and after transcription. Initiation Initiation in eukaryotes is much more complex than it is in prokaryotes. Eukaryotic genes must be much more carefully regulated, because many genes are only expressed in specific cells or tissues at specific times in the organism's life. To achieve this careful regulation, eukaryotes have evolved a more complicated initiation scheme than prokaryotes. In addition to promoters, eukaryotic genes also have regulatory regions called enhancers. Both elements (promoter and enhancer) are required for full, correct expression of eukaryotic genes. As a result of this added complexity, eukaryotic RNA polymerases do not have anything equivalent to the sigma subunit found in prokaryotic RNA polymerases. Instead, eukaryotes have groups of transcription
factors, which are proteins, independent of the RNA polymerases, that recognize promoter and enhancer sequences. Eukaryotic promoters, like prokaryotic promoters, contain conserved sequences that are important for initiation. (Eukaryotes, because of their added complexity, tend to have more conserved sequences in their promoters than do prokaryotes.) One important sequence in most eukaryotic promoters is found around -30, and has the sequence TATAAA (or something close to it). This promoter element, known as the TATA Box, is analogous to the -10 element in prokaryotes. Other promoter sequences vary from gene to gene, but a common one is GGCCAATCT, otherwise known as the CCAAT Box (for the central bases in the sequence), which tends to occur around -80. A group of basal transcription factors helps to initiate transcription of class II genes. Each member of this group is named "TFII" for Transcription Factor, class II genes. The individual factors are assigned a separate letter designation. For example, TFIID, a factor made of multiple polypeptides, recognizes and binds to the TATA box. This factor and the other factors (TFIIA, TFIIB, TFIIE, TFIIF, TFIIH, and TFIIJ) form a complex on the DNA that recruits RNA polymerase II to the promoter, and promotes initiation of transcription. These transcription factors are sufficient to get a basal (minimal) level of transcription. Other transcription factors binding to other promoter and enhancer elements are necessary for higher levels of transcription (enhancers and their transcription factors will be addressed in the module on eukaryotic gene regulation). Elongation Elongation in eukaryotes is just like in prokaryotes. Termination Termination in eukaryotes is quite different from termination in prokaryotes. Unlike prokaryotic genes, eukaryotic genes have no strong termination sequences. Instead, RNA polymerase II continues transcribing up to 1000 to 2000 nucleotides beyond where the 3' end of the mature mRNA will be. 
The actual 3' end will be determined during RNA processing. Processing Eukaryotic class II transcripts are processed in order to produce the final mRNA. Processing of the initial transcript includes capping, polyadenylation, and intron removal. Ribosomal and transfer RNAs are also processed, but differently; they are neither capped nor polyadenylated. Capping of the RNA occurs at the 5' end. A methylated guanine nucleotide is added to the transcript in a 5' to 5' phosphodiester linkage (it's like a nucleotide added to the 5' end in the backwards direction). This 'cap' is important for recognition of the mRNA by ribosomes during translation.
Polyadenylation involves cleavage of the RNA to produce the proper 3' end, and addition of a string of adenine nucleotides. The position of the 3' end is determined by a sequence within the RNA itself. This sequence, AAUAAA, is known as the polyadenylation signal. When this signal is recognized by the appropriate enzymes, the RNA is cleaved 10 to 30 nucleotides downstream of the signal, and a series of adenine nucleotides is added. This polyadenylation is done without a template - the As are simply added one after another to the 3' end of the RNA. This poly(A) tail, which averages about 200 nucleotides in length, helps protect the RNA from degradation, and plays other regulatory roles that are beyond the scope of our discussion. Introns in some RNAs (particularly mitochondrial RNAs) are capable of self-splicing (or autocatalytic splicing). In these splicing reactions, no protein enzyme is required - the enzyme activity resides within the intron RNA itself! Such RNA enzymes are termed ribozymes. Class II RNAs (pre-mRNAs) from most eukaryotes, however, do require protein enzymes to remove their introns. The splicing of these RNAs is carried out by large protein/RNA complexes called spliceosomes. Spliceosomes are made up of five different snRNPs (pronounced 'snurps'; short for small nuclear ribonucleoprotein), called U1, U2, U4, U5, and U6. Each snRNP consists of a specific small nuclear RNA (snRNA) molecule complexed with protein. Spliceosomes are able to detect intron/exon boundaries, cleave the RNA at the appropriate point, and join adjacent exons together to produce the mature mRNA.
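The cleavage-and-polyadenylation step described above can be illustrated with a short Python sketch. It is a toy model under stated assumptions: the transcript is invented, the function name is hypothetical, and the cleavage offset (20 nucleotides) and tail length (200 nucleotides) are simply values picked from the ranges quoted in the text. Real cells choose the cleavage site with the help of additional sequence elements and proteins.

```python
POLY_A_SIGNAL = "AAUAAA"  # polyadenylation signal named in the text


def cleave_and_polyadenylate(transcript, cleave_offset=20, tail_length=200):
    """Cleave downstream of the first AAUAAA and add an untemplated poly(A) tail.

    cleave_offset is the number of nucleotides kept after the signal
    (the text gives 10 to 30); tail_length is the number of As added.
    Returns None when no usable signal is found.
    """
    signal_start = transcript.find(POLY_A_SIGNAL)
    if signal_start == -1:
        return None  # no polyadenylation signal in this transcript
    cleavage_site = signal_start + len(POLY_A_SIGNAL) + cleave_offset
    if cleavage_site > len(transcript):
        return None  # transcript ends before the cleavage site
    return transcript[:cleavage_site] + "A" * tail_length


# Hypothetical primary transcript: the polymerase has run well past the signal.
primary = "GGCUAC" + "AAUAAA" + "C" * 20 + "G" * 50
mature = cleave_and_polyadenylate(primary)
print(len(primary), len(mature))  # 82 232
```

Everything transcribed beyond the cleavage site (the run of Gs in this made-up example) is discarded, which is why the polymerase can overshoot without affecting the final mRNA.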
Transcription: Summary of Key Points
Transcription is carried out by enzymes called RNA polymerases, which synthesize RNA in a 5' to 3' direction. Prokaryotes have one RNA polymerase; eukaryotes have three RNA polymerases, each of which transcribes a different class of gene. Transcription occurs in three basic steps: initiation, elongation, and termination. Initiation in prokaryotes involves the recognition of promoter sequences by the sigma subunit of RNA polymerase. Initiation in eukaryotes involves the recognition of promoter sequences by transcription factors, which then recruit RNA polymerase to the promoter. Termination in prokaryotes occurs at termination sequences. Depending upon the gene, termination may or may not involve the termination protein rho. Termination in eukaryotes is less specific: it occurs well downstream of the 3' end of the mature mRNA. Eukaryotic transcripts must be processed to produce mature mRNAs; prokaryotic RNAs are not processed. Processing involves the addition of a 5' cap, the production of the correct 3' end by cleavage and polyadenylation, and the removal of introns by spliceosomes.
This module looks at the mechanics involved in translating the information encoded in an mRNA molecule into a polypeptide. The components of the translation (protein synthesis) machinery will be introduced, and the process of protein synthesis will be outlined.
1. Know how ribosomes are built from their component parts. Know the identities of the different ribosomal RNAs. 2. Know the basic structure of transfer RNA. Understand how tRNA becomes charged with an amino acid. 3. Know the events involved in formation of a translation initiation complex. 4. Understand the events involved in elongation, and the formation of peptide bonds. 5. Understand the process of translation termination. Protein Synthesis Translation of the genetic code involves using the genetic information to produce a polypeptide. Therefore, translation is synonymous with protein synthesis. Proteins, or to be more precise, polypeptides, are linear chains of amino acids. (You don't need to know all of the amino acids, but you should work to become more familiar with them.) As outlined in the module on the genetic code, genetic information encoded in DNA, and transferred to mRNA, is used to determine the sequence of amino acids in a polypeptide. The actual synthesis of polypeptides is carried out by ribosomes. Ribosomes Protein synthesis is one of the most important processes in a cell, since most cellular functions are mediated by proteins. Therefore, cells contain many ribosomes to ensure that they are able to synthesize enough protein. Bacteria contain approximately 10,000 ribosomes, and eukaryotic cells contain over 50,000 ribosomes. Amphibian eggs, which are highly specialized single cells, contain more than a million ribosomes (!) because a tremendous amount of protein synthesis occurs after fertilization. Ribosomes are composed of two subunits, one large and one small. The intact ribosome, with both subunits present, is called a monosome. Each of the subunits is a complex of RNA and protein. The specific type of RNA used in ribosomes is called ribosomal RNA, or rRNA. There are four specific types of rRNA in eukaryotes, and these are designated by their size: 28S, 18S, 5.8S, and 5S. 
('S' refers to the Svedberg unit, which is a unit of relative size of a molecule, based on sedimentation as a result of centrifugation. Svedberg units are not additive; they are merely
relative measurements of size.) Of the ribosomal RNAs, the 28S, 5.8S and 5S are found in the large subunit, and the 18S is found in the small subunit. Subunit sizes and general makeup of the eukaryotic ribosomal subunits are shown in the following figure:
Prokaryotic ribosomes are constructed in a similar fashion, although the sizes are somewhat different. The prokaryotic monosome is 70S, and is made of a 50S large subunit and a 30S small subunit. The 50S subunit is composed of 23S rRNA (analogous to the eukaryotic 28S), 5S
rRNA (analogous to eukaryotic 5.8S), and 31 ribosomal proteins. The 30S subunit is composed of 16S rRNA (analogous to eukaryotic 18S) and 21 ribosomal proteins. In all, the differences between eukaryotic and prokaryotic ribosome constituents (rRNA sizes, number of proteins) simply reflect an increase in complexity of the eukaryotic ribosome over its prokaryotic counterpart. Transfer RNA The other player in the translation process that we have not yet considered is transfer RNA, or tRNA. As we shall see, tRNA serves as an adaptor or intermediary between mRNA and amino acids. tRNAs are among the best characterized RNA molecules - they are quite short (75 to 90 nucleotides long) and have nearly identical sequences in eukaryotes and prokaryotes. tRNA molecules are unusual in that they contain several modified nucleotides, such as inosine, pseudouridine, and hypoxanthine. The sequence of each individual tRNA molecule is such that base pairing occurs between strands in different regions of the same molecule. This gives tRNA molecules a characteristic 'cloverleaf' shape. There are two main functional regions of the tRNA molecule. The middle loop of the cloverleaf contains three unpaired bases known as the anticodon. The anticodon base pairs with the complementary codon on mRNA during translation. Directly opposite the anticodon is a region with no loop - it contains both ends of the linear tRNA molecule. This region, particularly the 3' end of the tRNA, is where a specific amino acid will bind in preparation for protein synthesis. A tRNA molecule with a particular anticodon sequence will only bind to one amino acid (for example, the tRNA with AGU as an anticodon sequence will only bind to the amino acid serine). In this way, specificity of the genetic code is maintained. tRNA molecules are joined to their specific amino acid in a reaction known as charging. 
The 3' end of the tRNA molecule is covalently linked to the correct amino acid by an enzyme called aminoacyl-tRNA synthetase. This enzyme recognizes the appropriate tRNA and amino acid, and uses the energy of ATP to join the two. Because the recognition of the tRNA and amino acid by the enzyme is so specific, there must be a different aminoacyl-tRNA synthetase for each amino acid. Therefore, there are at least 20 different aminoacyl-tRNA synthetases. Prokaryotic Translation Translation, or protein synthesis, is quite similar in prokaryotes and eukaryotes. We will look at the details of the process in prokaryotes, and consider the differences in eukaryotes afterwards. The process of translation can be divided into three basic steps: initiation, elongation, and termination. Each of these steps will be considered in turn.
Initiation Initiation requires a large and small ribosomal subunit (the subunits exist separately when not actively engaged in protein synthesis), a molecule of mRNA, a set of proteins known as initiation factors, GTP (for energy), and an initiator tRNA. The initiator tRNA has UAC for its anticodon, to allow it to base pair with the AUG start codon. It is charged with a special amino acid, formyl methionine, or fmet for short.
1. The small ribosomal subunit binds to Initiation Factor 3 (IF3).
2. The small subunit/IF3 complex binds to the mRNA. Specifically, it binds to the sequence AGGAGG, known as the Shine-Dalgarno sequence, which is found in all prokaryotic mRNAs.
3. Meanwhile, the fmet-tRNA binds to Initiation Factor 2 (IF2), which promotes binding of the tRNA to the start codon.
4. The small subunit/IF3 complex scans along the mRNA until it encounters the start codon. The tRNA/IF2 complex also binds to the start codon. This complex of the small ribosomal subunit, IF3, initiator tRNA, and IF2 is called the initiation complex.
At this point, the large ribosomal subunit joins in. A molecule of GTP is hydrolyzed, and the initiation factors are released. The ribosomal complex is now ready for protein synthesis.
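The sequence-recognition part of initiation (bind the AGGAGG sequence, scan to the start codon, pair the initiator anticodon with AUG) can be sketched as a simple string search. This is an illustrative sketch only: the mRNA sequence and function names are invented, and the initiation factors, GTP, and the ribosomal subunits themselves are not modeled.

```python
SHINE_DALGARNO = "AGGAGG"  # ribosome-binding sequence named in the text
START_CODON = "AUG"
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def find_start_codon(mrna):
    """Return the index of the first AUG downstream of the AGGAGG sequence.

    Returns None if the mRNA lacks the binding sequence or a start codon after it.
    """
    binding_site = mrna.find(SHINE_DALGARNO)
    if binding_site == -1:
        return None
    start = mrna.find(START_CODON, binding_site + len(SHINE_DALGARNO))
    return None if start == -1 else start


def anticodon_for(codon):
    """Return the bases that pair with a codon, position by position."""
    return "".join(RNA_COMPLEMENT[base] for base in codon)


# Hypothetical prokaryotic mRNA, written 5' to 3'.
mrna = "UUAGGAGGUAACCAUGCCAUAA"
start = find_start_codon(mrna)
print(start)                               # 13
print(mrna[start:start + 3])               # AUG
print(anticodon_for(mrna[start:start + 3]))  # UAC, the initiator anticodon
```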
When the ribosome is assembled, two tRNA binding sites are created; these are designated 'P' and 'A' (P stands for peptidyl, A stands for aminoacyl). (Some books also discuss an 'E' site, from which used tRNA molecules are ejected, but for the sake of simplicity, we won't consider it here.) The initiator tRNA is in the P site, and the A site will be filled by the tRNA with the anticodon that is complementary to the codon next to the start. (In this case, it is the tRNA that binds proline.)
When the second tRNA base pairs with the appropriate codon in the mRNA, an enzyme called peptidyltransferase (one of the 31 proteins in the large ribosomal subunit) catalyzes the formation of a peptide bond between the two amino acids present (while breaking the bond between fmet and its tRNA).
At this point, the whole ribosome shifts over one codon. This shift requires several elongation factors (not shown) and energy from the hydrolysis of GTP. The result of the shift is that the uncharged tRNA that was in the P site is ejected, and the tRNA that was in the A site is now in the P site. The A site is free to accept the tRNA molecule with the appropriate anticodon for the next codon in the mRNA.
The next tRNA base pairs with the next codon, and peptidyltransferase catalyzes the formation of a peptide bond between the new amino acid and the growing peptide chain.
Once again, the ribosome shifts over, so that the uncharged tRNA is expelled, and the tRNA with the peptide chain occupies the P site. (This is why this site is called the 'peptidyl' site - after the shift, it contains the tRNA with the growing peptide chain. The other site will accept a tRNA with an amino acid, hence the name 'aminoacyl' site.) The process of shifting and peptide bond formation continues over and over until a termination codon is encountered. The elongation process is fairly rapid, with prokaryotic ribosomes able to add 15 amino acids to the growing polypeptide every second. The process is also relatively error-free. Only one mistake is made every 10,000 amino acids. For large proteins of 1000 amino acids, that would mean one wrong amino acid in every 10 polypeptides.
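The speed and accuracy figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text (15 amino acids per second, one mistake per 10,000 amino acids, a 1000-amino-acid protein); the function name is hypothetical, and errors are assumed to occur independently at each position.

```python
ELONGATION_RATE = 15        # amino acids added per second (from the text)
ERROR_RATE = 1 / 10_000     # mistakes per amino acid added (from the text)


def translation_stats(length):
    """Return (seconds to synthesize, expected errors, P(at least one error))."""
    seconds = length / ELONGATION_RATE
    expected_errors = length * ERROR_RATE
    # Assuming each position is an independent trial with the same error rate.
    p_at_least_one = 1 - (1 - ERROR_RATE) ** length
    return seconds, expected_errors, p_at_least_one


seconds, expected_errors, p_error = translation_stats(1000)
print(round(seconds, 1))          # 66.7
print(round(expected_errors, 3))  # 0.1
print(round(p_error, 3))          # 0.095
```

An expected value of 0.1 errors per chain is the "one wrong amino acid in every 10 polypeptides" of the text; the fraction of chains carrying at least one error is slightly lower (about 9.5%) because a few chains carry more than one.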
When a termination codon enters the A site, translation halts. This is because there is no tRNA with an anticodon that is complementary to any of the stop codons.
Instead, the termination codon is recognized and bound by a release factor. The release factor is a protein, like the initiation factors and the elongation factors, that is independent of the ribosome.
The release factor causes the translation complex to fall apart, and cleaves the polypeptide from the final tRNA. The polypeptide product is now free to function in the cell. The mRNA molecule is now available to be translated again. Very often, more than one ribosome will translate a single mRNA at the same time. One ribosome will initiate translation, and after it moves down the mRNA a bit, another ribosome will initiate, then another, and so on. The structure consisting of multiple ribosomes translating a single mRNA molecule is called a polysome. Eventually, the mRNA is degraded, and translation of that particular message will cease.
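The overall flow of initiation, elongation, and termination can be summarized in a short Python sketch that reads an mRNA one codon at a time. It is a simplification under stated assumptions: the codon table below is a deliberately tiny subset of the standard genetic code, the mRNA is invented, the function name is hypothetical, and the ribosome, tRNAs, and protein factors are reduced to a loop.

```python
# Small subset of the standard genetic code, enough for the example below.
CODON_TABLE = {"AUG": "Met", "CCA": "Pro", "UUU": "Phe", "GGC": "Gly", "AAA": "Lys"}
STOP_CODONS = {"UAA", "UAG", "UGA"}  # no tRNA pairs with these


def translate(mrna):
    """Translate from the first AUG until a stop codon enters the A site."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so no initiation
    peptide = []
    for i in range(start, len(mrna) - 2, 3):   # shift one codon per cycle
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break                               # release factor step: synthesis ends
        # Codons missing from the partial table are marked rather than guessed.
        peptide.append(CODON_TABLE.get(codon, "???"))
    return peptide


# Hypothetical message: start codon, three sense codons, then a stop codon.
message = "GGAUGCCAUUUGGCUAAGG"
print(translate(message))  # ['Met', 'Pro', 'Phe', 'Gly']
```

In prokaryotes the first residue would be formyl methionine rather than ordinary methionine; the sketch does not distinguish the two.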
As mentioned previously, eukaryotic translation is very similar overall to prokaryotic translation. There are a few notable differences, however. These include the following:
Eukaryotic mRNAs do not contain a Shine-Dalgarno sequence. Instead, ribosomal subunits recognize and bind to the 5' cap of eukaryotic mRNAs. In other words, the 5' cap takes the place of the Shine-Dalgarno sequence. Eukaryotes do not use formyl methionine as the first amino acid in every polypeptide; ordinary methionine is used. Eukaryotes do have a specific initiator tRNA, however. Eukaryotic translation involves many more protein factors than prokaryotic translation. (For example, eukaryotic initiation involves at least 10 factors, instead of the 3 in prokaryotes.)
Translation: Summary of Key Points
Translation is the synthesis of a polypeptide using the information encoded in an mRNA molecule. The process involves mRNA, tRNA, and ribosomes. Ribosomes are large organelles made of two subunits, each of which is composed of ribosomal RNA and proteins. tRNA has a unique structure that exposes an anticodon, which binds to codons in an mRNA, and an opposite end that binds to a specific amino acid. Binding of an amino acid to a tRNA is carried out by an enzyme called aminoacyl-tRNA synthetase in a process called charging. Translation consists of three basic steps: initiation, elongation, and termination. Initiation involves the formation of the ribosome/mRNA/initiator tRNA complex. Elongation is the actual synthesis of the polypeptide chain, by formation of peptide bonds between amino acids. Termination dissociates the translation complex and releases the finished polypeptide chain. Each of these steps requires the activity of a specific set of protein factors in addition to the ribosome, tRNA, and mRNA. Ribosomes serve as sites that encourage tRNA binding to the appropriate codon. The tRNA molecules are shuttles, bringing the correct amino acids into position so they can be connected to each other by peptidyltransferase.
Mutation, DNA Repair, and Recombination
This module looks at changes in the information stored in the genetic material, how they occur, what the consequences of such changes are, and strategies cells use to minimize DNA changes and damage.
1. Understand the differences between somatic and gametic mutations, regarding where they occur, and what the consequences are. 2. Understand the difference between spontaneous and induced mutations. 3. Know the various possible effects of mutations on organisms. 4. Understand the molecular basis and possible consequences of base substitutions and frameshift mutations. 5. Understand the general concept of mutagenicity testing using the Ames test. 6. Know how thymine dimers form, and the various repair mechanisms (photoreactivation, excision repair, recombination repair, SOS repair) used to fix thymine dimers. 7. Understand how proofreading and mismatch repair are used to prevent base substitutions in DNA. 8. Understand the mechanism by which DNA molecules recombine. 9. Understand the concept of gene conversion, and how it can result from recombination. Mutation One of the properties of the genetic material, as outlined in the module on nucleic acids, is the ability to exhibit variation over time. This property was necessary to explain why individuals within a population are not all genetically identical, and to explain how organisms evolve. Mutation is defined as a failure to store genetic information faithfully. Changes in genetic information can be reflected in the expression of that information (i.e. in the proteins produced). In other words, mutation accounts for the variability in the genetic information. Mutation is therefore a double-edged sword. On one hand, mutation is necessary to introduce variation into the gene pool of a population. Genetic variation has been shown to correlate with species fitness. On the other hand, most mutations are deleterious to the individuals in which they occur. So mutation is good for the population, but generally not so good for the individual. Somatic vs. Gametic Mutations The consequences of a mutation depend upon where in an individual it occurs. Some mutations occur in regular body cells; these are somatic mutations. 
For example, someone who spends too much time suntanning might experience a mutation in a skin cell. The consequences of such a
mutation are felt only by the individual. The skin cell may develop some problem (such as cancer, perhaps) as a result of the mutation, but because the mutation occurred only in a skin cell, it would not be passed on to subsequent generations. Some mutations occur in germline cells. These cells produce the gametes; therefore, they are gametic mutations. In most cases, such mutations wouldn't even be noticed by the individual. After all, the gametes don't play a prominent role in the day-to-day function of the individual. These mutations, in contrast to the somatic mutations, will be passed on to the next generation, because they occur in the cells that produce the next generation. Spontaneous vs. Induced Mutations Some mutations arise as natural errors in DNA replication (or as a result of unknown chemical reactions); these are known as spontaneous mutations. The rates of such mutations have been determined for many species. E. coli has a spontaneous mutation rate of 1/10^8 (one error in every 10^8 nucleotides replicated). Humans have a higher spontaneous mutation rate: between 1/10^6 and 1/10^5 (probably as a result of the higher complexity of human replication). Mutations can also be caused by agents in the environment; these are induced mutations. Induced mutations increase the mutation rate over the spontaneous rate. Looking at a single mutation in an individual, one cannot tell if the mutation was spontaneous or induced. Induced mutations can only be discerned by looking at the mutation rate in a population, and comparing it to the spontaneous mutation rate for the species. If the observed mutation rate is higher, then induced mutations can be assumed. Agents in the environment that cause an increase in the mutation rate are called mutagens. Mutations: Random and Reversible The spontaneity of many mutations should suggest to you that the process is random. Mutations do not occur in response to a stimulus. 
In other words, bacteria do not mutate to become antibiotic resistant as a response to exposure to antibiotics. Instead, out of all of the mutations occurring in a population of bacteria, some (a minuscule percentage) will cause antibiotic resistance. If that antibiotic is encountered, those bacterial cells with that particular mutation will survive; the vast majority of the cells that do not have the mutation will die. Mutations can be reversible. If a mutation occurs once in a gene, there is a very small probability that the mutated base could mutate back to its original form. Alternatively, there are occasions when a mutation in a second, separate gene will return the phenotype of the organism to a wild type appearance (a rare case of two wrongs making a right). This kind of mutation is known as a suppressor mutation. Effects of Mutation Mutations can affect individuals in a variety of ways. Among the consequences of mutation are the following:
Change in a morphological trait. This refers to an obvious change in some physical characteristic of an organism. Most of the mutant phenotypes we have observed in this course have been of this type (for example, short plants instead of tall). Nutritional or biochemical variation. A mutation may occur in a gene that encodes an enzyme involved in a metabolic pathway, such as an enzyme involved in the biosynthesis of an amino acid. If this occurs, the organism can no longer synthesize the amino acid, and must obtain it from dietary sources. Change in behavior. These are hard to characterize, and there are few known examples of specific behaviors affected by a single gene. In one example, Drosophila mating behavior was found to be affected by a mutation. Mutant male flies were no longer able to distinguish between males and females, and tried to mate with any fly available! Changes in gene regulation. If a mutation occurs in a gene encoding a transcription factor, it could affect when and where the genes controlled by that transcription factor are expressed. This will be addressed in the modules on gene regulation. Lethality. Some mutations are lethal to an organism, like the yellow coat color allele in mice (as outlined in the module on extensions of Mendelism) or the Huntington's allele of humans.
The Molecular Basis of Mutation There are two basic types of mutations:
base substitutions - this is the replacement of one base by another. For example, if a DNA molecule usually contains guanine at a certain position, but adenine takes the place of the guanine, then a base substitution has occurred. There are two types of base substitutions:
    transitions - these involve the replacement of a purine with the other purine, or the replacement of a pyrimidine with the other pyrimidine.
    transversions - these involve the replacement of a purine with a pyrimidine or vice versa.
(Question: if a transition occurs on one strand of DNA, what type of change must occur on the complementary strand in order to maintain complementary base pairing?)
frameshift mutations - these change the reading frame of a gene. There are two types of frameshift mutations:
    insertions - as the name implies, these involve the insertion of one or more extra nucleotides into a DNA chain.
    deletions - these result from the loss of one or more nucleotides from a DNA chain.
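The two categories above lend themselves to a small Python sketch: one function that labels a base substitution as a transition or a transversion, and one that splits a sequence into triplets so the effect of a single-nucleotide insertion or deletion on the reading frame can be seen. The sequence and function names are hypothetical, and the sketch assumes DNA written with the four standard bases.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original, mutant):
    """Label a single-base change as a 'transition' or a 'transversion'."""
    if original == mutant:
        raise ValueError("no substitution: the bases are identical")
    pair = {original, mutant}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def codons(sequence):
    """Split a sequence into complete triplets, reading from the first base."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


print(classify_substitution("G", "A"))  # transition (purine to purine)
print(classify_substitution("G", "T"))  # transversion (purine to pyrimidine)

# Hypothetical stretch of coding DNA.
gene = "ATGGCATTC"
inserted = gene[:3] + "T" + gene[3:]   # one extra nucleotide after the first codon
deleted = gene[:3] + gene[4:]          # one nucleotide lost after the first codon
print(codons(gene))      # ['ATG', 'GCA', 'TTC']
print(codons(inserted))  # ['ATG', 'TGC', 'ATT']
print(codons(deleted))   # ['ATG', 'CAT']
```

In both frameshift cases every triplet downstream of the change is different from the original, even though only one nucleotide was added or lost.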
To illustrate the effects of these mutations, consider the following phrase, read as a triplet code (just like the genetic code!): The fat cat ate the hot dog.
A base substitution might have an effect like this: The fat car ate the hot dog. or perhaps: The fat cat ate the hot hog. In each case, the phrase still makes sense, but the meaning has been slightly changed. An insertion, on the other hand, would have a more profound effect: The fmatca tat eth ehotdo g. Insertion of a single letter ('m' in this case) causes the phrase to become gibberish, because the reading frame has been changed. A deletion would have the same effect: The atcatatethehotdog. (Insertion or deletion of one or two nucleotides will change the reading frame of the genetic code. What would happen to the reading frame if three nucletides were inserted or deleted? To further consider the effects of frameshift mutations on reading of the genetic code, go to the genetic code module.) Base substitutions and insertions or deletions of one nucleotide are also known as point mutations(because they occur at a single point on a chromosome). Causes of Mutations Base substitutions are generally caused by changes in the way that nucleotides base pair. One way this occurs is through tautomeric shifts. The chemical nature of the bases of DNA is such that rare but natural, spontaneous fluctuations in the bonds of the bases can occur. These fluctuations can briefly affect the way a base forms hydrogen bonds. For example, adenine, when it undergoes a tautomeric shift, will base pair with cytosine. Therefore, if a tautomeric shift occurs during replication, the wrong nucleotide can be inserted in the newly-synthesized DNA. The bases usually switch back to their normal form quickly, but by that time, it might be too late. Base substitutions can also be caused by chemical modification of the bases. One type of chemical modification is caused by alkylating agents, such as ethylmethanesulfonate and methylmethanesulfonate. These agents donate alkyl groups (such as methyl and ethyl groups) to bases, affecting their base pairing. 
For example, when guanine is alkylated, producing 7-ethylguanine, it will base pair with thymine. Once again, if this occurs during replication, the wrong nucleotide can be inserted in the molecule being synthesized, leading to a mutation.
Frameshift mutations can be caused by intercalating agents. These are chemical agents that insert between adjacent base pairs (like inserting between the rungs of a ladder). The intercalation causes a conformational change in the double helix, so that when replication occurs, the aberrant conformation causes small deletions or insertions to occur in the newly synthesized DNA.
Radiation
Radiation is also capable of inducing mutations in DNA. Ionizing radiation, such as gamma rays and X-rays, depending on the energy of the radiation, can create free radicals that result in problems ranging from point mutations to chromosome breaks. Ultraviolet (non-ionizing) radiation can cause mutations as well. The primary effect of UV on DNA is the creation of thymine dimers. Thymine dimers occur when two thymines are adjacent on a strand of DNA. UV radiation can cause the formation of a covalent bond between the two thymines, which prevents their participation in base pairing. Thymine dimers are very deleterious to a cell - they can completely interrupt replication, effectively causing a cell to die. As we'll see later in this module, one of the last-chance mechanisms of repair of thymine dimers causes the insertion of random nucleotides in place of the region containing the thymine dimer, resulting in several base substitutions at once.
Screening for Mutagenicity
Many chemicals found in the environment (both natural and synthetic) are capable of causing mutations. It is useful to know whether a particular substance is mutagenic (able to cause an increase in the mutation rate), because many mutagens are also carcinogens (cancer-causing agents). This is because cancer is generally caused by mutations in genes that control cell division. Chemical compounds are tested for mutagenicity using the Ames test. This test uses auxotrophic (mutant) strains of the bacterium Salmonella typhimurium that require medium supplemented with histidine in order to grow. The bacteria are exposed to the compound being tested, then plated on minimal medium (with no histidine).
Only bacteria that underwent a reverse mutation, allowing them to synthesize their own histidine, would be able to grow under these conditions. The more mutagenic a compound is, the more likely such a reverse mutation would be. Therefore the bacteria growing on minimal medium can be counted, and this gives a relative measure of how mutagenic a compound is. Using this technique, a 90% correlation has been observed between mutagenicity and carcinogenicity.
DNA Repair
Cells have developed a number of systems designed to repair DNA damage and correct mutations. Obviously, these mechanisms are not perfectly successful, but as we'll see, without them mutation rates would be much higher.
Repair of Thymine Dimers
Several mechanisms are available for the removal or correction of thymine dimers from DNA. Which mechanism is used depends upon the circumstances of the cell.
Photoreactivation - It has been observed that a brief exposure to blue light following UV exposure can reverse the effects of the UV radiation. In other words, the blue light can cause a thymine dimer to be corrected. This is due to the function of an enzyme called photolyase or photoreactivation enzyme (PRE), which cleaves the covalent bonds linking the thymine dimers using the energy from a photon of blue light. This is essentially a reversal of the reaction that produced the thymine dimer in the first place.
Excision Repair - This is a repair system that doesn't require light. Instead of just breaking the bonds of the thymine dimer (as was done by photolyase), the excision repair system removes (excises) the region surrounding the offending nucleotides. Several proteins are involved in this process (in prokaryotes these are the products of the 'uvr' genes, for 'UV repair'). The steps of excision repair in prokaryotes are as follows:
1. The distortion in the DNA (caused by the thymine dimer) is recognized by a protein complex.
2. A pair of endonucleases makes nicks in the DNA strand on either side of the thymine dimer (generally the nicks are 12 nucleotides apart).
3. The 12-nucleotide piece of DNA between the nicks is removed, and DNA polymerase I fills in the gap left behind.
4. DNA ligase seals the final nick in the DNA.
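The steps of excision repair can be sketched as a short Python function. This is a simplified model, not a description of the actual enzymes: the thymine dimer is marked with lowercase "tt" (a hypothetical notation), both strands are written left to right with the antiparallel orientation ignored, and the 12-nucleotide patch is assumed to be roughly centred on the dimer.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Return the base-pairing partner of each base, position by position."""
    return "".join(COMPLEMENT[base] for base in strand)

def excision_repair(damaged, template, patch_len=12):
    """Repair a strand carrying one dimer ('tt') using the opposite strand."""
    # Recognition: locate the distortion caused by the dimer.
    site = damaged.find("tt")
    if site == -1:
        return damaged  # nothing to repair
    # Nicking: two cuts patch_len apart, kept inside the strand.
    start = max(0, min(site - (patch_len - 2) // 2, len(damaged) - patch_len))
    end = start + patch_len
    # Removal and resynthesis: the polymerase copies the undamaged template.
    patch = complement(template[start:end])
    # Ligation: the new patch is joined to the rest of the strand.
    return damaged[:start] + patch + damaged[end:]

intact = "GATTACAGGCTTAACGTAGC"
template = complement(intact)                 # undamaged partner strand
damaged = intact[:10] + "tt" + intact[12:]    # dimer at positions 10-11
repaired = excision_repair(damaged, template)
print(repaired == intact)                     # True
```

The sketch makes one point explicit: the repair only works because an undamaged template strand is available to copy.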
Recombination Repair - Sometimes, DNA replication will begin before a thymine dimer can be repaired by one of the other mechanisms. When the replication machinery hits the dimer, replication stops. Occasionally, the replication will reinitiate just beyond the dimer, leaving a gap in the DNA.
This leaves the cell with a curious predicament: if it tries to fix the dimer by excision repair, there is no template to use for resynthesis of the DNA. How, then, does the DNA get repaired? The answer is that the cell uses recombination to provide a template strand for repair synthesis. Here's how: First, the damaged region undergoes recombination with the complementary strand on the other DNA molecule. One strand is exchanged between the two DNA molecules.
This essentially transfers the gap to the DNA molecule that doesn't have the dimer.
The gap can now be filled in by DNA polymerase I, and the dimer can be repaired by excision, since a template strand now exists.
A more detailed examination of the actual mechanism of recombination will be presented later in the module.
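The logic of recombination repair can be sketched in Python. This is a deliberately simplified model with several assumptions: the dimer is marked with lowercase "tt" and the gap with "-" (both hypothetical notations), strands are written left to right with orientation ignored, and the excised region is taken to be the same as the original gap.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Return the base-pairing partner of each base, position by position."""
    return "".join(COMPLEMENT[base] for base in strand)

def recombination_repair(old_damaged, new_gapped, old_sister, new_sister):
    """Repair a dimer that sits opposite a gap, using the sister molecule."""
    lo = new_gapped.index("-")
    hi = new_gapped.rindex("-") + 1
    # Strand exchange: the sister's old strand donates the missing segment,
    # which transfers the gap to the molecule without the dimer.
    new_gapped = new_gapped[:lo] + old_sister[lo:hi] + new_gapped[hi:]
    old_sister = old_sister[:lo] + "-" * (hi - lo) + old_sister[hi:]
    # Gap filling: the sister still has an intact template strand.
    old_sister = old_sister[:lo] + complement(new_sister[lo:hi]) + old_sister[hi:]
    # Excision repair of the dimer: a template strand now exists.
    old_damaged = old_damaged[:lo] + complement(new_gapped[lo:hi]) + old_damaged[hi:]
    return old_damaged, new_gapped, old_sister, new_sister

intact = "GATTACAGGCTTAACG"
partner = complement(intact)
old_damaged = intact[:10] + "tt" + intact[12:]        # parental strand with dimer
new_gapped = partner[:8] + "-" * 6 + partner[14:]     # replication skipped the dimer
result = recombination_repair(old_damaged, new_gapped, partner, intact)
print(result == (intact, partner, partner, intact))   # True
```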
SOS Repair - If the UV exposure is sufficiently severe, the DNA damage may overwhelm the other repair mechanisms. In such situations, DNA replication would almost certainly halt, and the cell would die. As a last-ditch effort to save itself, a cell activates the SOS repair system. This is a complex system, in which a whole battery of repair mechanisms is used to try to save the cell. One of these mechanisms allows replication to proceed across damaged templates, even though the template can't accurately be read. As a result, random nucleotides get inserted into the newly-synthesized DNA strand. This mechanism is therefore error-prone, and leads to mutations, which could be deleterious. In this case, however, the alternative is death, so mutation is preferable.
Repair of Mutations
Cells have mechanisms for minimizing the amount of mutation that takes place. As stated previously, these are not perfect, but they do greatly reduce the frequency of mutation. The two mechanisms we'll consider are proofreading and mismatch repair.
Proofreading - This occurs during DNA replication. As DNA polymerase III adds nucleotides to the growing chain, it checks each one for correct base pairing. If the correct nucleotide has not been inserted, the polymerase uses its 3' to 5' exonuclease activity to remove the incorrect nucleotide. The polymerase can then carry on and insert the correct nucleotide. This is very much like a word processor: if you type in the wrong letter, you just hit the delete key to remove it, allowing you to type in the correct letter. (For more on 3' to 5' exonuclease activity, see the module on DNA replication.) In bacteria, a wrong nucleotide gets inserted for every 10^5 nucleotides added during replication. Proofreading corrects most of these, so that the overall error rate in replication is one mistake for every 10^7 nucleotides added, a hundredfold improvement.
Mismatch Repair - This mechanism is used soon after replication, to correct errors that escaped proofreading. Because mismatched bases don't hydrogen bond, they create a distortion in the double helix, which can be recognized and repaired by excision repair. The question in this case is how does the repair system recognize which strand to repair? There are two nucleotides (one on each DNA strand) that won't base pair - which one is the wrong nucleotide? The answer comes from DNA methylation. DNA under normal circumstances is methylated; these methyl groups do not interfere with the function of the DNA in any way. Newly replicated DNA is not methylated, however; the methyl groups are added enzymatically after replication. If mismatch repair is done immediately after replication (before methylation occurs), the original DNA strand will be methylated, and the newly-synthesized strand (the one containing the error) will be unmethylated. The mismatch repair system therefore repairs the unmethylated strand.
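The strand-choice rule used by mismatch repair can be sketched as follows. This is a minimal model, assuming each strand simply carries a methylated flag; the Strand class and the function names are hypothetical, strand orientation is ignored, and only the mismatched base itself is corrected rather than a longer excised stretch.

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

@dataclass
class Strand:
    bases: str
    methylated: bool

def find_mismatches(template, target):
    """Positions where the target base cannot pair with the template base."""
    return [i for i, (t, b) in enumerate(zip(template, target))
            if COMPLEMENT[t] != b]

def mismatch_repair(strand_a, strand_b):
    """Correct the unmethylated (new) strand against the methylated (old) one."""
    if strand_a.methylated == strand_b.methylated:
        # Without a methylation difference the system cannot tell old from new.
        raise ValueError("cannot identify the newly synthesized strand")
    template, target = (strand_a, strand_b) if strand_a.methylated else (strand_b, strand_a)
    fixed = list(target.bases)
    for i in find_mismatches(template.bases, target.bases):
        fixed[i] = COMPLEMENT[template.bases[i]]
    return Strand("".join(fixed), target.methylated)

old = Strand("ATGCCGTA", methylated=True)
new = Strand("TACAGCAT", methylated=False)   # A at index 3 should be G
print(mismatch_repair(old, new).bases)       # TACGGCAT
```

The error raised when both strands have the same methylation state reflects the limit of this rule: the original strand can only be identified while the new strand is still unmethylated.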
Excision repair can be used to correct other problems as well. For example, if a deoxynucleotide containing uracil is ever inserted into a DNA molecule, the base is detected and removed by an enzyme called uracil DNA glycosylase. This enzyme removes the base, but leaves the sugar and phosphate in the DNA molecule. This base-less site is then recognized by specific endonucleases, which initiate excision repair.
Recombination
There is a mechanism of genetic variation other than mutation, and it involves exchange between DNA molecules with significant sequence similarity. Because the molecules must be homologous in order for recombination to occur, the process is often called homologous recombination. This process occurs in viruses, bacteria, and eukaryotes. In eukaryotes, recombination is the molecular basis of crossing over between homologous chromosomes. (For a review of crossing over, see the module on meiosis.)
Recombination begins with paired homologues, lined up so that the homologous sequences are adjacent.
A single-stranded nick is introduced at the same point on each molecule.
The nicked ends are displaced to the other molecule, and base pair with the complementary sequence on that molecule. (This is why the two molecules must be homologous - without homologous sequences, this base pairing can't occur.) DNA ligase seals the nick on each transplanted strand, creating a heteroduplex molecule from the two homologues. After the initial base pairing, more of each strand is displaced across to the other molecule in a zipper-like action. This process is known as branch migration, because the branch point between the two molecules (the crossover point) moves along the heteroduplex.
To understand how recombination is resolved, and how the heteroduplex is returned to two separate DNA molecules, we need to mentally manipulate the heteroduplex. First, we bend the 'arms' of each DNA molecule around the point of crossover. Then, the two molecules are rotated 180 degrees with respect to each other.
The result of the manipulations is the following structure. (Realize that these manipulations would not likely occur in a cell; they are simply done to aid our understanding.) To return the heteroduplex to two separate molecules, nucleases cut the DNA in two places. However, there is a choice of the two locations used: the cuts can occur at positions 1 and 2, or at positions 3 and 4. (Each possibility is used with equal frequency.)
If sites 1 and 2 are cut, the molecules produced would be as depicted at right. There has been substantial recombination between the homologues. If sites 3 and 4 are cut, then the molecules at right would be produced. In this case, there has been little recombination (only one small segment of one strand from each molecule has been exchanged). Even this small amount of recombination can have an effect, however - this is what occurs during recombination repair of DNA.
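The two outcomes can be sketched in Python by tracking where each stretch of strand came from. This is a rough model with labelled assumptions: strands are written as strings of "1" or "2" to show their molecule of origin, and sites 3 and 4 are taken to lie on the two exchanged strands while sites 1 and 2 lie on the two strands that were not exchanged. That assignment is inferred from the outcomes described above, since the numbered figure is not reproduced here. The function name and parameters are hypothetical.

```python
def resolve(length, nick, branch, cut_sites):
    """Return the two product molecules as (exchanged strand, other strand)."""
    top1, other1 = "1" * length, "1" * length
    top2, other2 = "2" * length, "2" * length
    # Nicking, strand exchange and ligation: the two exchanged strands now
    # cross from one molecule to the other at the nick position.
    x = top1[:nick] + top2[nick:]
    y = top2[:nick] + top1[nick:]
    # Branch migration has moved the crossover point from nick to branch.
    if cut_sites == (3, 4):
        # Cut the crossed strands at the branch point and rejoin them.
        x, y = x[:branch] + y[branch:], y[:branch] + x[branch:]
    elif cut_sites == (1, 2):
        # Cut the uncrossed strands at the branch point and rejoin them.
        other1, other2 = (other1[:branch] + other2[branch:],
                          other2[:branch] + other1[branch:])
    else:
        raise ValueError("cut_sites must be (1, 2) or (3, 4)")
    return (x, other1), (y, other2)

print(resolve(10, 3, 6, (3, 4)))
# (('1112221111', '1111111111'), ('2221112222', '2222222222'))
print(resolve(10, 3, 6, (1, 2)))
# (('1112222222', '1111112222'), ('2221111111', '2222221111'))
```

With cuts at 3 and 4, each molecule keeps its own arms and carries only a short patch from the other molecule on one strand. With cuts at 1 and 2, the arms on either side of the exchanged region come from different molecules.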
Recombination Machinery
Recombination does not just happen spontaneously on its own. Looking at the model of recombination, it is apparent that at the very least, an endonuclease is required (to produce the nicks that initiate the process, and separate the heteroduplex), as is DNA ligase. In fact, there is a specific set of proteins that functions to promote recombination. In bacteria, these are known as the rec (short for recombination) proteins. Rec A protein is a strand exchange promoter, while rec B, C, and D form a complex that nicks and unwinds the DNA. Other proteins involved in the process are less well characterized.
Gene Conversion
Even though recombination requires extensive homology between the two DNA molecules involved, the sequences do not have to be absolutely identical; for example, a point mutation would not interfere with the process. It is possible, therefore, that recombination could produce a DNA molecule in which there was a base mismatch. For example, if a segment of DNA with a wild-type allele was recombined with a segment of DNA from a mutant allele, the site of the mutation would be a mismatch.
Mismatch or excision repair may try to fix this situation, but since both strands will be methylated (after all, it would not be immediately post-replication), the sequence would be randomly converted to either the wild-type or the mutant form. Therefore, it would be possible for a mutant allele to be changed into a wild-type allele. This is gene conversion.
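The random outcome of repairing a mismatch when neither strand can be identified as new can be illustrated with a small simulation. This is a sketch under assumed values: the wild-type base pair is taken to be G-C and the mutant pair A-T (both chosen only for illustration), each strand is assumed to be equally likely to serve as the template, and the function name is hypothetical.

```python
import random
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def repair_without_strand_bias(top_base, bottom_base, rng):
    """Fix a mismatched pair by keeping one strand, chosen at random."""
    if rng.random() < 0.5:
        # Keep the top strand and rewrite the bottom one to match it.
        return top_base, COMPLEMENT[top_base]
    # Keep the bottom strand and rewrite the top one to match it.
    return COMPLEMENT[bottom_base], bottom_base

# Heteroduplex: top strand from the wild-type allele (G),
# bottom strand from the mutant allele (T).
rng = random.Random(0)
outcomes = Counter(repair_without_strand_bias("G", "T", rng) for _ in range(1000))
print(outcomes[("G", "C")], "converted to wild type")
print(outcomes[("A", "T")], "converted to mutant")
```

Under this assumption, roughly half of the heteroduplexes end up wild type and half end up mutant.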
Mutation, DNA Repair, and Recombination: Summary of Key Concepts
Mutations can occur spontaneously as errors of replication, or they can be induced by mutagens in the environment.
If mutations occur in somatic cells, they affect that individual only; if they occur in germline cells, they affect future generations.
Mutations can affect physical form, nutritional requirements, behavior, gene regulation, or survival.
There are two basic types of mutations: base substitutions and frameshift mutations. Each has a different effect on translation of the genetic code.
UV exposure can lead to DNA lesions called thymine dimers. These lesions can be repaired by a variety of mechanisms, including photoreactivation, excision repair, recombination repair, and SOS repair. The mechanism used depends in part on the particular situation.
Cells have mechanisms, such as proofreading and mismatch repair, that limit the mutation rate.
Cells contain a mechanism by which segments of homologous DNA molecules can be exchanged. This forms the basis for crossing over during meiosis.
There are proteins in cells that function to promote recombination.
Recombination can lead to gene conversion; that is, conversion of a mutant allele to a wild-type allele (or vice versa).