Lecture Week 15
Biochemistry of Nutrition, 560B
Dr. Charles Saladino
Introduction
For many years after Watson and Crick elucidated the structure of DNA, for which they received the Nobel Prize, the Central Dogma was the accepted theory. That theory basically stated that DNA encodes RNA, and that the RNA then carries the DNA code in such a manner as to direct the formation of a protein (now redefined as a polypeptide). The synthesis of RNA from DNA is called transcription, whereas the synthesis of protein from the RNA code is called translation. For decades, this was thought to be the only possible order of the sequence, until the mechanisms of retrovirus replication were elucidated (retroviruses are viruses carrying only an RNA code, instead of DNA). In this unusual case, the RNA actually codes for DNA. Therefore, with the exception of the retroviruses, which use a reverse transcriptase enzyme to allow RNA to code for DNA (as is the case with HIV), the Central Dogma is still correct for the encoding of specific polypeptides by DNA. In other words, gene expression is the transformation of DNA information into functional molecules. It is then that the DNA information becomes “useful.” This is a rich and complex subject, requiring many chapters in a good biochemistry text. I will do my best to emphasize the most important highlights as we detail the mechanisms of transcription and translation in this lecture.
Various RNA Types Described
Before describing the several types of RNA, let us examine characteristics common to all RNA species.
1. RNA is a single-stranded molecule. It is still composed of nucleotide units, but there is no base pairing with a separate complementary strand.
2. The abbreviation RNA stands for ribonucleic acid, because the sugar is ribose, not the deoxyribose found in DNA. If you look at the deoxyribose sugar from last week, you will see the lack of an OH group on carbon # 2, whereas the ribose sugar does contain an OH group at that same position. Otherwise, the sugars (pentoses) are the same.
3. Whereas DNA contains the G, C, A, and T heterocyclic nitrogen bases, RNA contains G, C, A, and U (uracil; see last week’s first figure). In other words, the pyrimidine base thymine of DNA is replaced by uracil (U) in RNA, and thus U is complementary to A, just the way G is complementary to C. Remember, by complementary we mean base-to-base pairing.
4. Except in the retroviruses, DNA is where the information of the genome is archived, whereas the various RNA types are for transcription and translation.
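The pairing rules in point 3 can be sketched in a few lines of Python. This is only an illustration, not something from the course materials: the function name `transcribe` and the six-base DNA strand are made up for the example, and the DNA strand being copied is assumed to be written 3’ to 5’ so that the RNA reads 5’ to 3’.

```python
# Base-pairing rules used when RNA is copied from DNA:
# each DNA base maps to its complementary RNA base (U takes the place of T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(dna_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') complementary to a DNA strand written 3'->5'.

    Hypothetical helper for illustration; raises KeyError on any
    character that is not A, T, G, or C.
    """
    return "".join(DNA_TO_RNA[base] for base in dna_3_to_5.upper())

# Made-up six-base DNA strand, written 3'->5'.
print(transcribe("TACGGT"))  # AUGCCA
```

Note that A in the DNA pairs with U in the RNA, never with T, which is the only difference from DNA-to-DNA pairing.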
Messenger RNA (mRNA)
mRNA is formed from a single, genetically active strand of DNA in complementary fashion. The DNA strand that is coding for an mRNA is called the template strand, whereas the opposite DNA strand that might not be active at that time is referred to as the coding strand (even though it is not coding for anything at that point in time). Functionally, mRNA is the template for translation, but that’s for later. For now, however, let us note that a distinct mRNA is produced for each gene expressed in eukaryotes. Therefore, mRNA is a heterogeneous class of biomolecules (500 – 6000 nucleotides). Unique to most mRNA is a poly-A tail (about 200 adenine (A) nucleotides) found on the 3’ end of the molecule. It is not transcribed from DNA. Rather, it is added by the enzyme polyadenylate polymerase after the mRNA is transcribed. A consensus sequence (a series of nucleotides that are involved in signaling and are not encoding genes) called the polyadenylation signal sequence (AAUAAA) is found near the 3’ end of the mRNA. It signals the attachment of poly-A. It is known that these tails help stabilize the messenger and are also involved in transport of the molecule out of the nucleus and into the cytoplasm. On the 5’ end is a CAP consisting of a 7-methyl-guanosine attached backwards through a triphosphate linkage, catalyzed by the nuclear enzyme guanylyltransferase. The addition of the methyl group is catalyzed by guanine-7-methyltransferase. We will mention the functionality of these two end regions later. However, between these two ends is the coding region of the mRNA. mRNA is first formed as a larger precursor molecule, called hnRNA (heterogeneous nuclear RNA). So the portion of the mRNA that actually carries a genetic code derived from DNA contains nucleotide sequences that are termed exons (one x, not like the Exxon gasoline), plus non-coding introns.
At first glance, it seems like a thermodynamic waste for the DNA to code for parts of the messenger that cannot be translated into protein and thus have no coding function. Those would be the introns, as opposed to the exons, which do carry a useful genetic code from the DNA. So what is going on here? Well, as an mRNA molecule matures, some of its sequences are removed, and these are known as introns. What remains are exons (think of the x of exon to remember “expressed”). What splices the introns out is the enzyme complex called a spliceosome. Whereas a small number of messengers contain no introns, most contain some, ranging in number up to around 50, as is the case with the primary transcripts of collagen. Obviously, there must be consensus sequences at each end of the intron that signal where the cut is to be made. Splicing out introns allows rearrangement of the exon sequences, if so desired. This would greatly increase the possibilities as to how a single gene could code for more than one protein. For example, there would not be enough genetic material to code for all the different antibodies that the body would require. By rearranging exons, different transcripts could be derived from the same gene, instead of requiring one gene for every possible antibody. So that is quite thermodynamically efficient after all! As a point of interest, you might have heard the term “snurps.” These are small nuclear RNAs (snRNAs) that associate with protein (hence, small nuclear ribonucleoprotein particles = snRNPs, pronounced “snurps”). They facilitate the splicing of exon segments by base pairing with the consensus sequences at the end of each intron. I know you have all heard of systemic lupus erythematosus, an important autoimmune disorder that affects women in a ratio of about 10 to 1 over men. The autoantibodies produced in lupus attack one’s own molecules, including the snRNPs.
Transfer RNA (tRNA)
This molecule folds upon itself and shows internal base pairing (again, with itself), forming a sort of cloverleaf structure. Many of its bases become modified post-transcriptionally. The tRNA is also made from a longer precursor, with an intron having to be removed from the anticodon loop (that term is explained later) and extra sequences trimmed from the 3’ and 5’ ends of the molecule, as shown in the figure below.
Other post-transcriptional modifications include adding a CCA sequence to the 3’ terminal end, catalyzed by a nucleotidyltransferase, as well as modifying bases at various points along the tRNA. The purpose of this molecule is to carry an amino acid in its activated form to the ribosome for peptide bond formation. There is at least one kind of tRNA for each of the twenty amino acids. The tRNA consists of about 75 nucleotides (about 25 kd in mass), which renders it one of the smallest RNA molecules.
Ribosomal RNAs (rRNA)
These are the major components of ribosomes. They serve a structural and a catalytic role in protein synthesis (translation). Remember I mentioned in our enzyme lecture that although almost all enzymes are proteins, some are not. This is because certain species of rRNA have catalytic power (to be explained a little later). In eukaryotes, rRNA synthesis starts with a single precursor molecule referred to as preribosomal RNA and produces the 5.8S, 18S, and 28S segments of rRNA. The S stands for Svedberg units, a relative measurement that combines molecular weight and shape, which together determine how a molecule sediments during centrifugation. Larger S values will usually indicate larger molecular weight RNAs. These smaller S-value RNAs are formed when the large precursor RNA is cleaved by ribonucleases to yield intermediates, which are further trimmed to produce the rRNA species just mentioned. Also, some of the proteins that are part of the ribosomal structure will associate with the large rRNA precursor before and during its post-transcriptional modification in the nucleolus of the nucleus. This nucleoprotein will eventually be transported into the cytosol of the cytoplasm, where the ribosome structure is assembled from its two main subunits. Whereas in bacteria, for example, a 70S ribosome is assembled from one 30S and one 50S ribonucleoprotein subunit, in eukaryotes there are cytosolic 80S ribosomes, which are assembled from 40S and 60S ribonucleoprotein subunits. By now you will have noticed that numerical addition of S values is not valid. Again, that is because an S value is a centrifugal sedimentation characteristic based on both molecular weight and shape. You can visualize, I am sure, how two molecules of the same molecular weight but different shapes would sediment differently.
RNA Polymerases
All cellular RNA is synthesized by a variety of RNA polymerases.
Again, the synthesis of RNA from a DNA template is called transcription. The requirements of the polymerases are several. First, a template is required, usually double-stranded DNA, although both complementary DNA strands do not have to be read at the same time. (RNA is not an effective template, nor are DNA-RNA hybrid molecules.) Second, activated precursors are required, meaning all four ribonucleoside triphosphates: ATP, UTP, GTP, and CTP. Third, the enzyme requires a divalent cation, either Mg2+ or Mn2+. There are three unique RNA polymerases in the eukaryotic nucleus that transcribe the various classes of RNA. These are large, multisubunit enzymes, with each type recognizing specific types of genes. For example, RNA pol I synthesizes the large rRNA precursor of which we have spoken. This occurs in the nucleolus of the nucleus, whereas tRNA and mRNA are synthesized in the nucleoplasm. RNA pol II catalyzes the synthesis of the large mRNA precursor (hnRNA). Apparently some viruses use this pol to produce viral RNA, and the enzyme can help synthesize some of the snRNAs. RNA pol III is required for tRNA synthesis. Now let’s see how this actually works, focusing on mRNA synthesis. Please refer to the diagram below.
So let’s start at the very top of the figure showing the DNA template strand coding for its complementary mRNA (hnRNA) strand. Note the DNA coding strand (which, remember, is not coding for anything at the moment) is not present in the figure. Now if you look at the mRNA, you will notice the 5’ end on the left. This is because RNA pols synthesize new RNA in the 5’ to 3’ direction. Ring a bell? You will also notice that the mRNA eventually will be read in triplets, that is, sets of three nucleotides that each designate one amino acid. I will now insert below what we will call a codon table. Each triplet base sequence in the mRNA is referred to as a codon. To see how to use the table, you will notice the first column on the left for base # 1 of the codon. Then there are four columns to choose from for the second base, and a final column for the third base of the mRNA codon. See if you can find how AUG designates the amino acid methionine (Met) and CCA designates proline (Pro). You will note that some amino acids have more than one possible codon to designate them, which is why I mentioned that there is at least one tRNA per amino acid; this will be explained in a little while. Print out this table (only) for the last exam, as you will need to refer to it during the exam. The rest of the test is not open book.
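For those who like to check a table lookup by computer, here is a minimal Python sketch of the standard codon table. It is an illustration only and does not replace the printed table for the exam. The names `CODON_TABLE`, `lookup`, and `codons_for` are made up for this example; the 64-letter string lists the standard genetic code in the same order as the table (first base, then second base, then third base, each running U, C, A, G), with `*` marking a stop codon.

```python
from itertools import product

BASES = "UCAG"  # row/column order used in the usual codon table
# One-letter amino acid codes for all 64 codons, in UCAG x UCAG x UCAG order.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"   # first base U
               "LLLLPPPPHHQQRRRR"   # first base C
               "IIIMTTTTNNKKSSRR"   # first base A
               "VVVVAAAADDEEGGGG")  # first base G
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}

def lookup(codon: str) -> str:
    """Three-letter amino acid name (or 'Stop') for one mRNA codon."""
    return THREE_LETTER[CODON_TABLE[codon.upper()]]

def codons_for(name: str) -> list[str]:
    """All codons that designate the given amino acid (three-letter name)."""
    return sorted(c for c, aa in CODON_TABLE.items() if THREE_LETTER[aa] == name)

print(lookup("AUG"))            # Met
print(lookup("CCA"))            # Pro
print(len(codons_for("Leu")))   # 6
```

The last line shows the point made above about more than one codon per amino acid: leucine has six codons, whereas methionine has only AUG.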
The rest of the figure preceding the above table is where we want to return to further explain mRNA synthesis. The main part of the figure shows an enhanced view of the DNA template strand that is coding for the mRNA. Notice the label “flanking region” on the 5’ side and a transcribed region to the right. Within the flanking region is a promoter region. Please find it, and let’s start there. You will notice a CAAT box and a TATA box. These are consensus sequences, because they are signals, not bases from which genes are transcribed. Obviously, for RNA pol to recognize specific genes from within huge stretches of DNA, it must “know” where the transcriptional unit starts. This is why DNA templates contain regions called promoter sites that specifically bind RNA pols, so as to determine where transcription begins. Eukaryotic genes have promoter sites with a TATAAA consensus sequence called the TATA box or Hogness box, centered at about -25 (negative, because it is about 25 nucleotides upstream of the beginning of the transcribing region). Many eukaryotic promoters also have a CAAT box (GGNCAATCT) centered at about -75 nucleotides, as well as a GC box (GGGCGG). Such boxes can vary from gene to gene. Also, gene transcription can be further stimulated by enhancer DNA sequences, which can be kilobases away from either end of the transcribing region. The crude diagram below illustrates some of this.
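To make the idea of a consensus sequence sitting at a negative position more concrete, here is a small Python sketch that scans a flanking region for the TATA and CAAT boxes. Everything specific in it is an assumption for illustration: the 100-nucleotide flanking sequence is invented (filler C's with the two boxes planted in it), the function name `find_box` is hypothetical, and the position reported is that of the first base of the box, with -1 meaning the nucleotide immediately before the start of the transcribing region. The N in GGNCAATCT is treated as "any base."

```python
import re

def find_box(flank: str, consensus: str) -> list[int]:
    """Positions (relative to the transcription start) where a consensus box begins.

    `flank` is the 5' flanking region written 5'->3'; its last base is
    position -1. 'N' in the consensus matches any of A, C, G, T.
    """
    pattern = consensus.replace("N", "[ACGT]")
    return [m.start() - len(flank) for m in re.finditer(pattern, flank)]

# Invented 100-nucleotide flanking region: filler C's with a CAAT box
# and a TATA box planted roughly where the lecture says they sit.
flank = "C" * 21 + "GGCCAATCT" + "C" * 42 + "TATAAA" + "C" * 22

print(find_box(flank, "TATAAA"))      # [-28]
print(find_box(flank, "GGNCAATCT"))   # [-79]
print(find_box(flank, "GGGCGG"))      # []
```

A TATA box beginning at -28 spans -28 to -23, so it is centered at about -25; likewise the CAAT box beginning at -79 is centered at about -75. The empty list for GGGCGG simply means this invented sequence has no GC box.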
In order to complete our explanation of the figure now above the codon table, you will notice the transcribing region beginning in the vicinity of the CAP site. In eukaryotes, CAP structures are attached to the mRNA 5’ end after transcription is complete, as are the poly A tails. The figure also shows the DNA sequences for exons and introns, which we described briefly above. Finally, within the transcribing region of the DNA is a stop signal sequence to be coded into the mRNA. However, this is a stop signal for protein synthesis, which we have not yet discussed. Stop signals are mRNA codons and can be found in your codon table. Instead of designating a particular amino acid, such a codon signals the end of a polypeptide being formed. Codons UAA, UAG, and UGA are the stop signals. Thus far, we have seen the rRNA synthesized in the nucleolus (using DNA as a template), combined with protein, and transported to the cytosol as ribonucleoprotein, which forms an assembly of ribosomal subunits and then a complete ribosome. The ribosome will be the site of protein synthesis. In the meantime, genes are transcribed from the DNA code to form an mRNA. It is first a big precursor, but after intron removal and exon assembly, the mRNA is capped and a poly A tail added, which aids in the stabilization and transport of the mRNA to the cytosol. This mRNA is carrying a DNA-directed genetic code (in the form of a series of usually contiguous, three-nucleotide codons) for the assembly of amino acids into a very specific polypeptide.
tRNA as the Adaptor for Protein Synthesis
While all of this is going on, the tRNAs have already been synthesized from their specific DNA segments, modified, and transported to the cytosol. There, each amino acid is activated and then linked to its specific tRNA molecule by an enzyme called aminoacyl-tRNA synthetase, which hooks the carboxyl end of the amino acid to the tRNA. There is at least one specific aminoacyl-tRNA synthetase and one tRNA for each amino acid.
If you think about it, by recognizing both the amino acid and the correct tRNA (because of the tRNA bases), this enzyme is really implementing the instructions of the genetic code. You will remember earlier in the lecture that I used the term anticodon (a three base sequence) within the tRNA structure. It is near a partial loop in the tRNA opposite to the end where the amino acid is attached. This is critical to understand: The anticodon will match to the appropriate codon in the mRNA by complementary base pairing, remembering that there is no T in any RNA, and that U replaces T in its base pairing to A. Thus, every properly made codon can accommodate an anticodon of a tRNA carrying an amino acid, unless the codon is a signal. This is taking place upon the ribosome, where the mRNA has bound. Polypeptide synthesis will be achieved when the amino acids of two adjacent tRNAs connected to their respective codons form a peptide bond between the two amino acids. The peptide bond is catalyzed by a peptidyltransferase. The enzymatic activity is intrinsic to one of the rRNAs (an example of a non-protein enzyme) within the ribosome. The polypeptide will be synthesized in the amino to carboxyl direction, and the mRNA will be translated in the 5’ to 3’ direction of the mRNA. What is important to note, is that the codons of the mRNA recognize the anticodons of the tRNA and not the amino acids carried
by the tRNA. In order to keep adding amino acids, a translocase mechanism moves the ribosome three nucleotides toward the 3’ end of the mRNA. This requires GTP as an energy source. This continues until a stop signal is reached, and the polypeptide can fall off, become associated with other polypeptides, and undergo a variety of chemical and structural modifications, including those we discussed at the beginning of the semester under the topic of proteins. The actual synthesis of the peptide bond and movement of the mRNA with the tRNA connected is difficult to explain here. Thus, I have included the following diagram to hopefully further clarify the process, trusting that it will not confuse you further. Let me know if there are interpretation problems. OK?
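The codon-anticodon matching and the three-nucleotide stepping described above can also be sketched in Python. This is a simplified illustration under stated assumptions, not a model of the real ribosome: the short mRNA is invented, the dictionary `CODON_TO_AA` is only a four-codon excerpt of the codon table, translation is assumed to begin at the first AUG, and the names `anticodon` and `translate` are made up for the example.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}   # RNA-RNA base pairing
STOP_CODONS = {"UAA", "UAG", "UGA"}
# Small excerpt of the codon table, just enough for the example below.
CODON_TO_AA = {"AUG": "Met", "CCA": "Pro", "GGU": "Gly", "UUC": "Phe"}

def anticodon(codon: str) -> str:
    """tRNA anticodon (written 5'->3') that pairs with an mRNA codon.

    The two strands pair antiparallel, so the codon is read backwards
    while each base is swapped for its partner.
    """
    return "".join(PAIR[base] for base in reversed(codon))

def translate(mrna: str) -> list[str]:
    """Read an mRNA 5'->3' in triplets from the first AUG until a stop codon."""
    start = mrna.find("AUG")          # simplifying assumption: first AUG starts
    if start == -1:
        return []
    peptide = []                      # grows from amino end to carboxyl end
    for i in range(start, len(mrna) - 2, 3):   # move three nucleotides per step
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:      # no tRNA matches a stop signal
            break
        peptide.append(CODON_TO_AA[codon])     # KeyError if not in the excerpt
    return peptide

print(anticodon("AUG"))                   # CAU
print(translate("GGAUGCCAGGUUUCUAAGC"))   # ['Met', 'Pro', 'Gly', 'Phe']
```

Notice that the lookup is driven entirely by the codon: the amino acid comes along only because it was attached to the matching tRNA beforehand, which is the point made above about codons recognizing anticodons and not amino acids.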
Metabolism of Nucleotide Bases
This subject is extremely complex and not necessary as a subject for an introductory course in biochemistry. However, in keeping with the old saying, “Sometimes a picture is worth a thousand words,” I have enclosed one more figure that gives you a feel for the synthesis of purines and pyrimidines. So I have two hopes, as I present this final figure, which I admittedly scribbled out on a piece of paper. First, I want you to realize the important role of amino acids and vitamins in the synthesis of these heterocyclic nitrogen bases, not memorize every detail. Second, I hope you will not use this figure to do a handwriting analysis on me!
Final Comments
There is obviously so much more to this story about the genetic code. Topics such as jumping genes, genetic recombination, switching genes on and off, the telomere and telomerase, oncogenes and cancer, and retroviruses, along with many others, are fascinating to explore. I hope that sometime, on your own, you will have the time to look into some of these very interesting and challenging subjects. I tell you, the day will come when coronary artery stents will be coated not with drugs, as that day is here already, but with genes!