Molecular Biology Notes
What do a human, a rose, and a bacterium have in common? Each of these things —
along with every other organism on Earth — contains the molecular instructions for life,
called deoxyribonucleic acid or DNA. Encoded within this DNA are the directions for traits as
diverse as the color of a person's eyes, the scent of a rose, and the way in which bacteria
infect a lung cell.
DNA is found in nearly all living cells. However, its exact location within a cell
depends on whether that cell possesses a special membrane-bound organelle called a nucleus.
Organisms composed of cells that contain nuclei are classified as eukaryotes, whereas
organisms composed of cells that lack nuclei are classified as prokaryotes. In eukaryotes,
DNA is housed within the nucleus, but in prokaryotes, DNA is located directly within the
cellular cytoplasm, as there is no nucleus available.
But what, exactly, is DNA? In short, DNA is a complex molecule that consists of
many components, a portion of which are passed from parent organisms to their offspring
during the process of reproduction. Although each organism's DNA is unique, all DNA is
composed of the same nitrogen-based molecules. So how does DNA differ from organism to
organism? It is simply the order in which these smaller molecules are arranged that differs
among individuals. In turn, this pattern of arrangement ultimately determines each organism's
unique characteristics, thanks to another set of molecules that "read" the pattern and stimulate
the chemical and physical processes it calls for.
Composed by:
LUBABA KOMAL Page 1
Molecular Biology.BOT-601 Recommended by:
BS-Botany-7th semester Dr. Muther Mansoor Qaisrani
Figure 1: A single nucleotide contains a nitrogenous base (red), a deoxyribose sugar molecule
(gray), and a phosphate group attached to the 5' side of the sugar (indicated by light gray).
Opposite to the 5' side of the sugar molecule is the 3' side (dark gray), which has a free
hydroxyl group attached (not shown).
At the most basic level, all DNA is composed of a series of smaller molecules
called nucleotides. In turn, each nucleotide is itself made up of three primary components: a
nitrogen-containing region known as a nitrogenous base, a carbon-based sugar molecule
called deoxyribose, and a phosphorus-containing region known as a phosphate group attached
to the sugar molecule (Figure 1). There are four different DNA nucleotides, each defined by a
specific nitrogenous base: adenine (often abbreviated "A" in science
writing), thymine (abbreviated "T"), guanine (abbreviated "G"), and cytosine (abbreviated
"C") (Figure 2).
Figure 2: The four nitrogenous bases that compose DNA nucleotides are shown in bright
colors: adenine (A, green), thymine (T, red), cytosine (C, orange), and guanine (G, blue).
Although nucleotides derive their names from the nitrogenous bases they contain,
they owe much of their structure and bonding capabilities to their deoxyribose molecule. The
central portion of this molecule contains five carbon atoms arranged in the shape of a ring,
and each carbon in the ring is referred to by a number followed by the prime symbol ('). Of
these carbons, the 5' carbon atom is particularly notable, because it is the site at which the
phosphate group is attached to the nucleotide. Appropriately, the area surrounding this carbon
atom is known as the 5' end of the nucleotide. Opposite the 5' carbon, on the other side of the
deoxyribose ring, is the 3' carbon, which is not attached to a phosphate group. This portion of
the nucleotide is typically referred to as the 3' end (Figure 1). When nucleotides join together
in a series, they form a structure known as a polynucleotide. At each point of juncture within
a polynucleotide, the 5' end of one nucleotide attaches to the 3' end of the adjacent nucleotide
through a connection called a phosphodiester bond (Figure 3). It is this alternating
sugar-phosphate arrangement that forms the "backbone" of a DNA molecule.
organism's DNA is unique to that individual, and it is this sequence that controls not only the
operations within a particular cell, but within the organism as a whole.
Figure 5: Rosalind Franklin's X-ray diffraction image of DNA. Images like this one enabled
the precise calculation of molecular distances within the double helix.
Around the same time, researchers James Watson and Francis Crick were
pursuing a definitive model for the stable structure of DNA inside cell nuclei. Watson and
Crick ultimately used Franklin's images, along with their own evidence for the double-
stranded nature of DNA, to argue that DNA actually takes the form of a double helix, a
ladder-like structure that is twisted along its entire length (Figure 6). Franklin, Watson, and
Crick all published articles describing their related findings in the same issue of Nature in
1953.
Figure 7: To better fit within the cell, long pieces of double-stranded DNA are tightly packed
into structures called chromosomes.
Most cells are incredibly small. For instance, one human alone consists of
approximately 100 trillion cells. Yet, if all of the DNA within just one of these cells were
arranged into a single straight piece, that DNA would be nearly two meters long! So, how can
this much DNA be made to fit within a cell? The answer to this question lies in the process
known as DNA packaging, which is the phenomenon of fitting DNA into dense compact
forms (Figure 7).
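The "nearly two meters" figure can be checked with a short calculation. The sketch below assumes about 6 billion base pairs per diploid human cell (the figure these notes use in the replication section) and a spacing of roughly 0.34 nm per base pair, the usual textbook value for the common B form of DNA. The spacing value and all names in the code are illustrative assumptions, not part of the original notes.

```python
# Rough estimate of the stretched-out length of the DNA in one human diploid cell.
BASE_PAIRS_PER_DIPLOID_CELL = 6e9  # figure used in the replication section of these notes
RISE_PER_BASE_PAIR_NM = 0.34       # assumption: typical spacing between base pairs in B-form DNA


def dna_length_m(base_pairs, rise_nm=RISE_PER_BASE_PAIR_NM):
    """Return the end-to-end length, in meters, of a straight double helix."""
    return base_pairs * rise_nm * 1e-9  # convert nanometers to meters


length_m = dna_length_m(BASE_PAIRS_PER_DIPLOID_CELL)
print(f"{length_m:.2f} m")  # close to 2 m
```

Under these assumptions the result is about 2 m, which is consistent with the estimate in the text.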
DNA Damage & Repair: Mechanisms for Maintaining DNA Integrity
DNA RENATURATION
DNA integrity is always under attack from environmental agents like skin cancer-causing UV
rays. How do DNA repair mechanisms detect and repair damaged DNA, and what happens
when they fail?
Because DNA is the storehouse of genetic information in each living cell, its
integrity and stability are essential to life. DNA, however, is not inert; rather, it is a chemical
entity subject to assault from the environment, and any resulting damage, if not repaired, will
lead to mutation and possibly disease. Perhaps the best-known example of the link between
environmentally induced DNA damage and disease is that of skin cancer, which can be caused
by excessive exposure to UV radiation in the form of sunlight (and, to a lesser degree,
tanning beds). Another example is the damage caused by tobacco smoke, which can lead to
mutations in lung cells and subsequent cancer of the lung. Beyond environmental agents,
DNA is also subject to oxidative damage from by-products of metabolism, such as free
radicals. In fact, it has been estimated that an individual cell can suffer up to one million
DNA changes per day (Lodish et al., 2005).
Figure 1
in replication. Because DNA is a molecule that plays an active and critical role in cell
division, control of DNA repair is closely tied to regulation of the cell cycle. (Recall that cells
transit through a cycle involving the G1, S, G2, and M phases, with DNA replication occurring
in the S phase and mitosis in the M phase.) During the cell cycle, checkpoint mechanisms
ensure that a cell's DNA is intact before permitting DNA replication and cell division to
occur. Failures in these checkpoints can lead to an accumulation of damage, which in turn
leads to mutations.
Defects in DNA repair underlie a number of human genetic diseases that affect a
wide variety of body systems but share a group of common traits, most notably a
predisposition to cancer (Table 2). These disorders include ataxia-telangiectasia (AT), a
degenerative motor condition caused by failure to repair oxidative damage in the cerebellum,
and xeroderma pigmentosum (XP), a condition characterized by sensitivity to sunlight and
linked to a defect in an important ultraviolet (UV) damage repair pathway. In addition, a
number of genes that have been implicated in cancer, such as the RAD group, have also been
determined to encode proteins critical for DNA damage repair.
Figure 3
Figure 2
replication. Relatively flexible areas of the DNA double helix are most susceptible to
damage. In fact, one "hot spot" for UV-induced damage is found within a commonly
mutated tumor suppressor gene, the p53 gene.
Bacteria and several other organisms also possess another mechanism to repair UV
damage called photoreactivation. This method is often referred to as "light repair," because it
is dependent on the presence of light energy. (In comparison, NER and most other repair
mechanisms are frequently referred to as "dark repair," as they do not require light as an
energy source.) During photoreactivation, an enzyme called photolyase binds pyrimidine
dimer lesions; in addition, a second molecule known as a chromophore converts light energy
into the chemical energy required to directly revert the affected area of DNA to its
undamaged form. Photolyases are found in numerous organisms, including fungi, plants,
invertebrates such as fruit flies, and vertebrates including frogs. They do not appear to exist
in humans, however (Sinha & Hader, 2002).
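Pyrimidine dimers form between two neighboring pyrimidines (T or C) on the same strand, so the places where such lesions could arise can be listed by scanning a sequence for adjacent pyrimidines. The following is a minimal sketch of that scan. The function name and the example sequence are hypothetical, and the code only marks candidate positions; it does not model how photolyase or NER recognizes real damage.

```python
PYRIMIDINES = {"C", "T"}


def adjacent_pyrimidine_sites(strand):
    """Return 0-based positions i where strand[i] and strand[i + 1] are both pyrimidines."""
    strand = strand.upper()
    return [
        i
        for i in range(len(strand) - 1)
        if strand[i] in PYRIMIDINES and strand[i + 1] in PYRIMIDINES
    ]


example = "AGTTCAGCCTA"  # hypothetical sequence
sites = adjacent_pyrimidine_sites(example)
print(sites)  # [2, 3, 7, 8] -> TT, TC, CC, CT
```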
Additional DNA Repair Mechanisms
Figure 4
NER and photoreactivation are not the only methods of DNA repair. For
instance, base excision repair (BER) is the predominant mechanism that handles the
spontaneous DNA damage caused by free radicals and other reactive species generated by
metabolism. Bases can become oxidized, alkylated, or hydrolyzed through interactions with
these agents. For example, methyl (CH3) chemical groups are frequently added to guanine to
form 7-methylguanine; alternatively, purine groups may be lost. All such changes result in
abnormal bases that must be removed and replaced. Thus, enzymes known as DNA
glycosylases remove damaged bases by literally cutting them out of the DNA strand
through cleavage of the covalent bonds between the bases and the sugar-phosphate
backbone. The resulting gap is then filled by a specialized repair polymerase and sealed by
ligase. Many such enzymes are found in cells, and each is specific to certain types of base
alterations.
Yet another form of DNA damage is double-strand breaks, which are caused by
ionizing radiation, including gamma rays and X-rays. These breaks are highly deleterious. In
addition to interfering with transcription or replication, they can lead to chromosomal
rearrangements, in which pieces of one chromosome become attached to another
chromosome. Genes are disrupted in this process, leading to hybrid proteins or inappropriate
activation of genes. A number of cancers are associated with such rearrangements. Double-
strand breaks are repaired through one of two mechanisms: nonhomologous end joining
(NHEJ) or homologous recombination repair (HRR). In NHEJ, an enzyme called DNA
ligase IV uses overhanging pieces of DNA adjacent to the break to join and fill in the ends.
Additional errors can be introduced during this process, which is the case if a cell has not
completely replicated its DNA in preparation for division. In contrast, during HRR,
the homologous chromosome itself is used as a template for repair.
Mutations in an organism's DNA are a part of life. Our genetic code is exposed to
a variety of insults that threaten its integrity. But, a rigorous system of checks and balances is
in place through the DNA repair machinery. The errors that slip through the cracks may
sometimes be associated with disease, but they are also a source of variation that is acted
upon by longer-term processes, such as evolution and natural selection.
Figure 1: Tautomeric shifts in nucleotide bases.
The purine and pyrimidine bases in DNA exist in two different tautomers, or
chemical forms. (A) Nucleotide bases shift from their common “keto” form to their rarer,
tautomeric “enol” form. (B) In common base pair arrangements, the common form of
thymine (T) binds with the common form of adenine (A), and the common form of cytosine
(C) binds with the common form of guanine (G). (C) Rare base-pairing arrangements result
when one nucleotide in a base pair is the rare form instead of the common form. Here, the
rare form of cytosine binds to the common form of adenine instead of guanine. The rare form
of guanine binds to the common form of thymine instead of cytosine.
Today, scientists suspect that most DNA replication errors are caused by
mispairings of a different nature: either between different but nontautomeric chemical forms
of bases (e.g., bases with an extra proton, which can still bind but often with a mismatched
nucleotide, such as an A with a G instead of a T) or between "normal" bases that nonetheless
bond inappropriately (e.g., again, an A with a G instead of a T) because of a slight shift in
position of the nucleotides in space (Figure 2). This type of mispairing is known as wobble. It
occurs because the DNA double helix is flexible and able to accommodate slightly misshaped
pairings (Crick, 1966).
in the newly synthesized, or primer, strand. Regions of DNA containing many copies of small
repeated sequences are particularly prone to this type of error.
Fixing Mistakes in DNA Replication
DNA polymerase enzymes are amazingly particular with respect to their choice of
nucleotides during DNA synthesis, ensuring that the bases added to a growing strand are
correctly paired with their complements on the template strand (i.e., A's with T's, and C's
with G's). Nonetheless, these enzymes do make mistakes at a rate of about 1 per
100,000 nucleotides. That might not seem like much, until you consider how much DNA
a cell has. In humans, with our 6 billion base pairs in each diploid cell, that would amount to
about 120,000 mistakes every time a cell divides!
Fortunately, cells have evolved highly sophisticated means of fixing most, but not all,
of those mistakes. Some of the mistakes are corrected immediately during replication through
a process known as proofreading, and some are corrected after replication in a process
called mismatch repair. When an incorrect nucleotide is added to the growing strand,
replication is stalled by the fact that the nucleotide's exposed 3′-OH group is in the "wrong"
position. (Recall that new nucleotides are added to the growing strand during replication by
means of their 5′-phosphate group binding to the 3′-OH group of the previous nucleotide on
the strand.) During proofreading, DNA polymerase enzymes recognize this and replace the
incorrectly inserted nucleotide so that replication can continue. Proofreading fixes about 99%
of these types of errors, but that's still not good enough for normal cell functioning.
After replication, mismatch repair reduces the final error rate even further. Incorrectly paired
nucleotides cause deformities in the secondary structure of the final DNA molecule. During
mismatch repair, enzymes recognize and fix these deformities by removing the incorrectly
paired nucleotide and replacing it with the correct nucleotide.
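The error arithmetic in this section can be written out as a short calculation. The sketch below uses only the figures quoted above (6 billion base pairs, one polymerase error per 100,000 nucleotides, proofreading fixing about 99% of errors). The variable names are illustrative, and mismatch repair is left out because the notes give no rate for it.

```python
BASE_PAIRS_PER_DIPLOID_CELL = 6_000_000_000
NUCLEOTIDES_TO_COPY = 2 * BASE_PAIRS_PER_DIPLOID_CELL  # both strands are copied
POLYMERASE_ERROR_ONE_IN = 100_000                      # about 1 mistake per 100,000 nucleotides
PROOFREADING_FIX_PERCENT = 99                          # proofreading corrects about 99%

# Mistakes made by the polymerase before any correction.
raw_errors = NUCLEOTIDES_TO_COPY // POLYMERASE_ERROR_ONE_IN

# Mistakes still present after proofreading, before mismatch repair.
errors_after_proofreading = raw_errors * (100 - PROOFREADING_FIX_PERCENT) // 100

print(raw_errors)                 # 120000
print(errors_after_proofreading)  # 1200
```

Even after proofreading, on the order of a thousand errors per division would remain under these assumptions, which is why mismatch repair is needed to lower the final rate further.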
Streisinger used this virus to show that most nucleotide insertion and deletion mutations
occur in areas of DNA that contain many repeated sequences (also called tandem repeats),
and he formulated the strand-slippage hypothesis to explain why this was the case
(Streisinger et al., 1966). (In Figure 3, notice the series of repeat T's on the template strand
where the slippage has occurred.) When slippage takes place, the presence of nearby
duplicate bases stabilizes the slippage so that replication can proceed. During the next round
of replication, when the two strands separate, the insertion or deletion on either the template
or primer strand, respectively, will be perpetuated as a permanent mutation. Scientists have
collected enough evidence to confirm Streisinger's strand-slippage hypothesis, and this type
of mutagenesis remains an active field of scientific research.
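Because slippage is favored where the same base is repeated many times, a simple scan for such runs gives a rough picture of where a template might be slippage-prone. This is a minimal sketch that only finds runs of a single repeated base; the function name, the threshold of four, and the example sequence are assumptions chosen for illustration, and longer repeat units (such as repeated dinucleotides) are not handled.

```python
import re


def repeat_runs(sequence, min_length=4):
    """Return (start, base, run_length) for each run of one base at least min_length long."""
    # ([ACGT]) captures one base; \1{n,} requires at least n further copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [
        (match.start(), match.group(1), len(match.group(0)))
        for match in pattern.finditer(sequence.upper())
    ]


template = "GCATTTTTTGCAACGGGGAT"  # hypothetical template strand
runs = repeat_runs(template)
print(runs)  # [(3, 'T', 6), (14, 'G', 4)]
```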
damage. Rather, they are usually caused by normal chemical reactions that go on in cells,
such as hydrolysis. These types of errors include depurination, which occurs when the bond
connecting a purine to its deoxyribose sugar is broken by a molecule of water, resulting in a
purine-free nucleotide that can't act as a template during DNA replication, and deamination,
which results in the loss of an amino group from a nucleotide, again by reaction with water.
Again, most of these spontaneous errors are corrected by DNA repair processes. But if this
does not occur, a nucleotide that is added to the newly synthesized strand can become a
permanent mutation.
Even Low Mutation Rates Can Be Cause for Concern
Mutation rates vary substantially among taxa, and even among different parts of
the genome in a single organism. Scientists have reported mutation rates as low as 1 mistake
per 100 million (10^-8) to 1 billion (10^-9) nucleotides, mostly in bacteria, and as high as 1
mistake per 100 (10^-2) to 1,000 (10^-3) nucleotides, the latter in a group of error-prone
polymerase genes in humans (Johnson et al., 2000).
Even mutation rates as low as 10^-10 can accumulate quickly over time,
particularly in rapidly reproducing organisms like bacteria. This is one reason why antibiotic
resistance is such an important public health problem; after all, mutations that accumulate in
a population of bacteria provide ample genetic variation with which to adapt (or respond) to
the natural selection pressures imposed by antibacterial drugs (Smolinski et al., 2003).
Take E. coli, for example. The genome of this common intestinal bacterium has about 4.2
million base pairs, or 8.4 million bases. Assuming a mutation rate of 10^-9 (i.e., midway
between reported estimates of 10^-8 and 10^-10), every time E. coli divides, each daughter cell
will have, on average, 0.0084 new mutations. Or, another way to think about it is like this:
Approximately 1% of bacterial cells will contain a new mutation. That may not seem like
much. However, because bacteria can divide as rapidly as twice per hour, a single bacterium
can grow into a colony of 1 million cells in only about 10 hours (2^20 = 1,048,576). At that
point, approximately 10,000 of these bacteria will have accumulated at least one mutation. As
the number of bacteria carrying different mutations increases, so too does the likelihood that
at least one of them will develop a drug-resistant phenotype.
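The E. coli estimate can be reproduced step by step. The sketch below follows the same back-of-envelope reasoning as the text and uses its figures; the variable names are illustrative. Like the text, it simply multiplies the final colony size by the per-division mutation rate, so it is a rough estimate rather than a population-genetics model.

```python
GENOME_BASE_PAIRS = 4_200_000
BASES_PER_GENOME = 2 * GENOME_BASE_PAIRS  # 8.4 million bases
MUTATION_RATE_PER_BASE = 1e-9             # assumed rate from the text
DIVISIONS_PER_HOUR = 2
HOURS = 10

# Expected new mutations in each daughter cell per division.
mutations_per_division = BASES_PER_GENOME * MUTATION_RATE_PER_BASE

# Number of doublings and resulting colony size from a single cell.
generations = DIVISIONS_PER_HOUR * HOURS
colony_size = 2 ** generations

# Rough number of cells in the colony carrying a new mutation.
cells_with_new_mutation = colony_size * mutations_per_division

print(f"{mutations_per_division:.4f}")   # 0.0084
print(colony_size)                       # 1048576
print(round(cells_with_new_mutation))    # roughly 8,800
```

The text rounds 0.0084 up to about 1%, which is how it arrives at approximately 10,000 cells; the unrounded figure is somewhat lower but of the same order.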
Likewise, in eukaryotes, cells accumulate mutations as they divide. In humans, if
enough somatic mutations (i.e., mutations in body cells rather than sperm or egg cells)
accumulate over the course of a person's lifetime, the end result could be cancer. Or, less
frequently, some cancer mutations are inherited from one or both parents; these are often
referred to as germ-line mutations. One of the first cancer-associated somatic mutations was
discovered in 1982, when researchers found that a mutated HRAS gene was associated with
bladder cancer (Reddy et al., 1982). HRAS encodes a protein that helps regulate cell
division. Since then, scientists have identified several hundred additional "cancer genes."
Some of them, like the handful of germ-line mutations associated with a form of colorectal
cancer known as hereditary nonpolyposis colorectal cancer (HNPCC), play crucial roles in
DNA repair (Wijnen et al., 1998).
Of course, not all mutations are "bad." But, because so many mutations can
cause cancer, DNA repair is obviously a crucially important property of eukaryotic cells.
However, too much of a good thing can be dangerous. If DNA repair were perfect and no
mutations ever accumulated, there would be no genetic variation—and this variation serves
as the raw material for evolution. Successful organisms have thus evolved the means to repair
their DNA efficiently but not too efficiently, leaving just enough genetic variability for
evolution to continue.
Alternatives to Colinearity
Figure 2
One of the first clues that the colinearity of DNA and amino acid sequences is not
as simple as what Crick had proposed was the discovery of RNA splicing in the 1970s. Using
common cold viruses as their experimental systems, English molecular chemist Richard
Roberts and American molecular biologist Philip Sharp independently discovered that genes
can be split into several segments along the genome (Berget et al., 1977; Chow et al., 1977).
Then, using electron microscopy, both scientists observed that a single messenger RNA
(mRNA) molecule hybridized not to a single stretch of DNA but to as many as four or more
discontinuous DNA segments (Figure 2).
Roberts and Sharp also noted that the genetic material actually breaks apart and
then re-forms itself at certain points in protein synthesis. Specifically, the sections of DNA
that encode protein production are known as exons, and the noncoding sections interspersed
among the exons are known as introns. During splicing, which occurs after transcription (i.e.,
the synthesis of RNA from a DNA template), the introns are removed and the exons are
joined, or spliced together.
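Splicing as described here can be pictured as cutting out intron segments and joining what remains. The following is a minimal sketch under stated assumptions: the transcript is invented, exon positions are supplied by hand as 0-based half-open intervals, and lowercase letters mark introns purely for readability. Real splice sites are recognized by the cell's splicing machinery, which this toy example does not model.

```python
def splice(pre_mrna, exon_ranges):
    """Join the exon segments given as (start, end) half-open, 0-based intervals."""
    return "".join(pre_mrna[start:end] for start, end in exon_ranges)


# Hypothetical primary transcript: uppercase = exon, lowercase = intron.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "GAUUAC" + "guaugcccacag" + "UGA"
exon_ranges = [(0, 6), (18, 24), (36, 39)]

mature_mrna = splice(pre_mrna, exon_ranges)
print(mature_mrna)  # AUGGCCGAUUACUGA
```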
Roberts's and Sharp's findings not only raised serious doubts about the concept
of a gene as a continuous, clearly demarcated segment of DNA, but they also led to a flurry
of research activity, with scientists curious about whether the same was true in other species.
As other researchers were quick to discover, discontinuous gene structure and splicing
during RNA processing are the norm, not the exception, in most eukaryotes. Some vertebrate
genes contain as many as 50 exons, and exons often make up only a small portion of the
transcribed region of a gene. For example, in one early splicing study that involved
examination of the intron-exon pattern of a chicken ovalbumin gene, Stein et al. (1980)
measured eight exons ranging in length from 20 to 181 base pairs and seven introns ranging
in length from 264 to 1,150 base pairs. Since that study, scientists have detected introns as
long as 50,000 base pairs or more in some species.
Figure 3
The final protein products encoded by any given intron-exon sequence also
vary in structure, depending on which exons are spliced back together during RNA
processing. This so-called "alternative splicing" is illustrated in Figure 3. Scientists have also
since learned that eukaryotic cells have evolved another "alternative" mRNA processing
pathway: the use of multiple 3' cleavage sites in a single exon. (Every intron has a 5' and 3'
splice site.) As illustrated in Figure 3, the end result is the same as with alternative splicing:
different mRNA molecules are produced from a single protein-coding gene. Clearly, contrary
to the conventional notion of a single gene encoding a single protein, a single continuous
stretch of DNA can encode multiple mRNA molecules and, ultimately, multiple protein
products.
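To illustrate how one gene can yield several mRNAs, the sketch below enumerates the transcripts possible when internal exons may be skipped. It assumes a simplified model in which the first and last exons are always kept and exon order never changes; the exon labels and function name are hypothetical. Real genes typically use only some of these combinations, so the count is an upper bound for this model, not a prediction.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """Yield isoforms that keep the first and last exon plus any in-order subset of internal exons."""
    first, *internal, last = exons
    for kept_count in range(len(internal) + 1):
        for kept in combinations(internal, kept_count):
            yield (first, *kept, last)


exons = ["E1", "E2", "E3", "E4"]  # hypothetical four-exon gene
isoforms = list(exon_skipping_isoforms(exons))
for isoform in isoforms:
    print("-".join(isoform))
# E1-E4
# E1-E2-E4
# E1-E3-E4
# E1-E2-E3-E4
```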
Transcription Units Instead of Genes
Given the vast quantity of DNA that appears to have little protein-encoding power and the
fact that so much of this DNA resides right in the middle of functional genes (as introns),
some scientists prefer to think in terms of "transcription units" rather than "genes." A
transcription unit is a linear sequence of DNA that extends from a transcription start site to a
transcription stop site (Figure 4).
Figure 4
Initiation
During initiation, so-called initiator proteins bind to the replication origin, a specific sequence of
base pairs known as oriC. This binding triggers events that unwind the DNA double helix into two
single-stranded DNA molecules. Several groups of proteins are involved in this unwinding (Figure 1).
For example, the DNA helicases are responsible for breaking the hydrogen bonds that join
the complementary nucleotide bases to each other; these hydrogen bonds are an essential feature
of James Watson and Francis Crick's three-dimensional DNA model. Because the newly unwound
single strands have a tendency to rejoin, another group of proteins, the single-strand-binding
proteins, keep the single strands stable until elongation begins. A third family of proteins, the
topoisomerases, reduce some of the torsional strain caused by the unwinding of the double helix.
Primer Synthesis
Primer synthesis marks the beginning of the actual synthesis of the new
DNA molecule. Primers are short stretches of nucleotides (about 10 to 12 bases in length)
synthesized by an RNA polymerase enzyme called primase. Primers are required because
DNA polymerases, the enzymes responsible for the actual addition of nucleotides to the new
DNA strand, can only add deoxyribonucleotides to the 3’-OH group of an existing chain and
cannot begin synthesis de novo. Primase, on the other hand, can add ribonucleotides de novo.
Later, after elongation is complete, the primer is removed and replaced with DNA
nucleotides.
Elongation
Finally, elongation--the addition of nucleotides to the new DNA strand--begins
after the primer has been added. Synthesis of the growing strand involves adding
nucleotides, one by one, in the exact order specified by the original (template) strand.
Recall that one of the key features of the Watson-Crick DNA model is that adenine is
always paired with thymine and cytosine is always paired with guanine. So, for example,
if the original strand reads A-G-C-T, the new strand will read T-C-G-A.
DNA is always synthesized in the 5'-to-3' direction, meaning that nucleotides are
added only to the 3' end of the growing strand. As shown in Figure 2, the 5'-phosphate group
of the new nucleotide binds to the 3'-OH group of the last nucleotide of the growing strand.
Scientists have yet to identify a polymerase that can add bases to the 5' ends of DNA strands.
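The pairing rule used during elongation is easy to express in code. The sketch below reproduces the A-G-C-T example from the text, then shows the same new strand written in its own 5'-to-3' direction. The second function relies on the standard fact that the two strands of a double helix run in opposite directions, which this section does not state explicitly; both function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(template):
    """Return the base paired with each template base, position by position."""
    return "".join(COMPLEMENT[base] for base in template.upper())


def new_strand_5_to_3(template_5_to_3):
    """Return the new strand read 5'->3', assuming the two strands are antiparallel."""
    return complementary_strand(template_5_to_3)[::-1]


print(complementary_strand("AGCT"))  # TCGA, as in the example above
print(new_strand_5_to_3("AAGC"))     # GCTT
```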
Using this method, Kornberg not only discovered DNA polymerases, but he also performed
some of the initial work demonstrating how enzymes add new nucleotides to growing DNA
chains (Kornberg, 1959).
Scientists have since identified a total of five different DNA polymerases in E. coli,
each with a specialized role. For example, DNA polymerase III does most of the elongation
work, adding nucleotides one by one to the 3' end of the new and growing single strand.
Other enzymes, including DNA polymerase I and RNase H, are responsible for removing the
RNA primer after DNA polymerase III has begun its work, replacing it with DNA
nucleotides (Ogawa & Okazaki, 1984). When these enzymes finish, they leave a nick
between the section of DNA that was formerly the primer and the elongated section of DNA.
Another enzyme called DNA ligase then acts to seal the bond between the two adjacent
nucleotides.
DNA Polymerase Only Moves in One Direction
After a primer is synthesized on a strand of DNA and the DNA strands unwind,
synthesis and elongation can proceed in only one direction. As previously mentioned, DNA
polymerase can only add to the 3' end, so the 5' end of the primer remains unaltered.
Consequently, synthesis proceeds immediately only along the so-called leading strand. This
immediate replication is known as continuous replication. The other strand (in the 5' direction
from the primer) is called the lagging strand, and replication along it is called discontinuous
replication. The double helix has to unwind a bit before the synthesis of another primer can
be initiated further up on the lagging strand. Synthesis can then occur from the 3' end of that
new primer. Next, the double helix unwinds a bit more, and another spurt of replication
proceeds. As a result, replication along the lagging strand can only proceed in short,
discontinuous spurts (Figure 3).
Figure 3: Replication of the leading DNA strand is continuous, while replication along the
lagging strand is discontinuous.
After a short length of the DNA has been unwound, synthesis must proceed in the 5'
to 3' direction; that is, in the direction opposite that of the unwinding.
The fragments of newly synthesized DNA along the lagging strand are called Okazaki
fragments, named in honor of their discoverer, Japanese molecular biologist Reiji Okazaki.
Okazaki and his colleagues made their discovery by conducting what is known as a pulse-
chase experiment, which involved exposing replicating DNA to a short "pulse" of isotope-
labeled nucleotides and then varying the length of time that the cells would be exposed to
nonlabeled nucleotides. This latter period is called the "chase" (Okazaki et al., 1968). The
labeled nucleotides were incorporated into growing DNA molecules only during the initial
few seconds of the pulse; thereafter, only nonlabeled nucleotides were incorporated during
the chase. The scientists then centrifuged the newly synthesized DNA and observed that the
shorter chases resulted in most of the radioactivity appearing in "slow" DNA. The
sedimentation rate was determined by size: smaller fragments sedimented more slowly than
larger fragments because of their lighter weight. As the investigators increased the length of
the chases, radioactivity in the "fast" DNA increased with little or no increase of radioactivity
in the slow DNA. The researchers correctly interpreted these observations to mean that, with
short chases, only very small fragments of DNA were being synthesized along the lagging
strand. As the chases increased in length, giving DNA more time to replicate, the lagging
strand fragments started integrating into longer, heavier, more rapidly sedimenting DNA
strands. Today, scientists know that the Okazaki fragments of bacterial DNA are typically
between 1,000 and 2,000 nucleotides long, whereas in eukaryotic cells, they are only about
100 to 200 nucleotides long.
The Challenges of Eukaryotic Replication
Bacterial and eukaryotic cells share many of the same basic features of replication;
for instance, initiation requires a primer, elongation is always in the 5'-to-3' direction, and
replication is always continuous along the leading strand and discontinuous along the lagging
strand. But there are also important differences between bacterial and eukaryotic replication,
some of which biologists are still actively researching in an effort to better understand the
molecular details. One difference is that eukaryotic replication is characterized by many
replication origins (often thousands), not just one, and the sequences of the replication origins
vary widely among species. On the other hand, while the replication origins for bacteria,
oriC, vary in length (from about 200 to 1,000 base pairs) and sequence, except among closely
related organisms, all bacteria nonetheless have just a single replication origin
(Mackiewicz et al., 2004).
Eukaryotic replication also utilizes a different set of DNA polymerase enzymes
(e.g., DNA polymerase δ and DNA polymerase ε instead of DNA polymerase III). Scientists
are still studying the roles of the 13 eukaryotic polymerases discovered to date. In addition, in
eukaryotes, the DNA template is compacted by the way it winds around proteins
called histones. This DNA-histone complex, called a nucleosome, poses a unique challenge
both for the cell and for scientists investigating the molecular details of eukaryotic
replication. What happens to nucleosomes during DNA replication? Scientists know from
electron micrograph studies that nucleosome reassembly happens very quickly after
replication (the reassembled nucleosomes are visible in the electron micrograph images), but
they still do not know how this happens (Annunziato, 2005).
Also, whereas bacterial chromosomes are circular, eukaryotic chromosomes
are linear. During circular DNA replication, the excised primer is readily replaced by
nucleotides, leaving no gap in the newly synthesized DNA. In contrast, in linear DNA
replication, there is always a small gap left at the very end of the chromosome because of the
lack of a 3'-OH group for replacement nucleotides to bind. (As mentioned, DNA synthesis
can proceed only in the 5'-to-3' direction.) If there were no way to fill this gap, the DNA
molecule would get shorter and shorter with every generation. However, the ends of linear
chromosomes—the telomeres—have several properties that prevent this.
DNA replication occurs during the S phase of the cell cycle. In E. coli, this means
that the entire genome is replicated in just 40 minutes, at a pace of approximately 1,000
nucleotides per second. In eukaryotes, the pace is much slower: about 40 nucleotides per
second. The coordination of the protein complexes required for the steps of replication and the
speed at which replication must occur in order for cells to divide are impressive, especially
considering that enzymes are also proofreading, which leaves very few errors behind.
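As a rough sanity check on the E. coli figures above, the arithmetic can be sketched in Python. The genome size (about 4.6 million base pairs) and the assumption of two replication forks moving in opposite directions from the single origin are not stated in the text; they are added here for illustration, as is the function name.

```python
# Back-of-the-envelope check of the replication figures quoted above.
# Assumptions (not from the text): an E. coli genome of about 4.6 million
# base pairs, copied by two forks that move in opposite directions.

def replication_minutes(genome_bp, nt_per_second, forks=2):
    """Minutes needed to copy genome_bp at a constant speed per fork."""
    seconds = genome_bp / (nt_per_second * forks)
    return seconds / 60

ecoli_minutes = replication_minutes(4_600_000, 1_000)
print(f"E. coli: about {ecoli_minutes:.0f} minutes")
```

With these assumed inputs the estimate lands close to the 40 minutes quoted above. The same function suggests why a slower fork speed, such as the 40 nucleotides per second cited for eukaryotes, goes hand in hand with many replication origins rather than one.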
Summary
The study of DNA replication started almost as soon as the structure of DNA was
elucidated, and it continues to this day. Currently, the stages of initiation, unwinding, primer
synthesis, and elongation are understood in the most basic sense, but many questions remain
unanswered, particularly when it comes to replication of the eukaryotic genome. Scientists
have devoted decades to the study of replication, and researchers such as Kornberg and
Okazaki have made a number of important breakthroughs. Nonetheless, much remains to be
learned about replication, including how errors in this process contribute to human disease.
DNA Transcription
The genetic code is frequently referred to as a "blueprint" because it contains the
instructions a cell requires in order to sustain itself. We now know that there is more to these
instructions than simply the sequence of letters in the nucleotide code, however. For example,
vast amounts of evidence demonstrate that this code is the basis for the production of various
molecules, including RNA and protein. Research has also shown that the instructions stored
within DNA are "read" in two steps: transcription and translation. In transcription, a portion
of the double-stranded DNA template gives rise to a single-stranded RNA molecule. In some
cases, the RNA molecule itself is a "finished product" that serves some
important function within the cell. Often, however, transcription of an RNA molecule is
followed by a translation step, which ultimately results in the production of a protein
molecule.
Visualizing Transcription
Figure 1
DNA is double-stranded, but only one strand serves as a template for transcription at
any given time. This template strand is called the noncoding strand. The nontemplate
strand is referred to as the coding strand because its sequence will be the same as that of the
new RNA molecule. In most organisms, the strand of DNA that serves as the template for
one gene may be the nontemplate strand for other genes within the same chromosome.
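The relationship between the template strand, the coding strand, and the RNA can be sketched in a few lines of Python. This is only an illustration: the nine-base sequence and the function names are made up, and both strands are written 5' to 3'.

```python
# Minimal sketch: transcribe a DNA template strand into RNA.
# Sequences are written 5'->3'; the example sequence is invented.

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(dna):
    """Return the complementary DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))

def transcribe(template):
    """RNA made from a template strand (both written 5'->3')."""
    return reverse_complement(template).replace("T", "U")

coding = "ATGGCCTAA"                   # hypothetical coding (nontemplate) strand
template = reverse_complement(coding)  # the strand RNA polymerase reads
rna = transcribe(template)
print(template)  # TTAGGCCAT
print(rna)       # AUGGCCUAA
```

The RNA matches the coding strand base for base, with U in place of T, which is why the nontemplate strand is the one called "coding."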
Transcription Initiation
Figure 3
Figure 2
The first step in transcription is initiation, when the RNA pol binds to the
DNA upstream (5′) of the gene at a specialized sequence called a promoter (Figure 2a). In
bacteria, promoters are usually composed of three sequence elements, whereas in eukaryotes,
there are as many as seven elements.
In prokaryotes, most genes have a sequence called the Pribnow box, with
the consensus sequence TATAAT positioned about ten base pairs away from the site that
serves as the location of transcription initiation. Not all Pribnow boxes have this exact
nucleotide sequence; these nucleotides are simply the most common ones found at each site.
Although substitutions do occur, each box nonetheless resembles this consensus fairly
closely. Many genes also have the consensus sequence TTGACA at a position 35 bases
upstream of the start site, and some have what is called an upstream element, which is an A-T
rich region 40 to 60 nucleotides upstream that enhances the rate of transcription (Figure 3). In
any case, upon binding, the RNA pol "core enzyme" binds to another subunit called the
sigma subunit to form a holoenzyme capable of unwinding the DNA double helix in order to
facilitate access to the gene. The sigma subunit conveys promoter specificity to RNA
polymerase; that is, it is responsible for telling RNA polymerase where to bind. There are a
number of different sigma subunits that bind to different promoters and therefore assist in
turning genes on and off as conditions change.
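The idea that a real Pribnow box only resembles the consensus can be illustrated with a small search for the closest-matching window. This is a sketch under stated assumptions: the 12-base promoter fragment is invented, the function names are hypothetical, and counting mismatches is a deliberately simple stand-in for how promoter similarity might be scored.

```python
# Sketch: find the window that best matches a consensus sequence.
# The promoter fragment below is invented for illustration.

def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def best_match(sequence, consensus):
    """Return (0-based position, window, mismatch count) of the closest window."""
    k = len(consensus)
    windows = [(i, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]
    pos, window = min(windows, key=lambda w: mismatches(w[1], consensus))
    return pos, window, mismatches(window, consensus)

promoter = "CCCTATGATGGG"  # hypothetical fragment with one substitution
print(best_match(promoter, "TATAAT"))  # (3, 'TATGAT', 1)
```

Here the best window differs from TATAAT at a single position, which is the kind of near-match the text describes.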
Eukaryotic promoters are more complex than their prokaryotic counterparts, in part
because eukaryotes have the aforementioned three classes of RNA polymerase that transcribe
different sets of genes. Many eukaryotic genes also possess enhancer sequences, which can
be found at considerable distances from the genes they affect. Enhancer sequences control
gene activation by binding with activator proteins and altering the 3-D structure of the DNA
to help "attract" RNA pol II, thus regulating transcription. Because eukaryotic DNA is tightly
packaged as chromatin, transcription also requires a number of specialized proteins that help
make the template strand accessible.
In eukaryotes, the "core" promoter for a gene transcribed by pol II is most often
found immediately upstream (5′) of the start site of the gene. Most pol II genes have a TATA
box (consensus sequence TATAAA) 25 to 35 bases upstream of the initiation site, which
affects the transcription rate and determines location of the start site. Eukaryotic RNA
polymerases use a number of essential cofactors (collectively called general transcription
factors), and one of these, TFIID, recognizes the TATA box and ensures that the correct start
site is used. Another cofactor, TFIIB, recognizes a different common consensus sequence,
G/C G/C G/C G C CC, approximately 38 to 32 bases upstream (Figure 4).
The terms "strong" and "weak" are often used to describe promoters and
enhancers, according to their effects on transcription rates and thereby on gene expression.
Alteration of promoter strength can have deleterious effects upon a cell, often resulting
in disease. For example, some tumor-promoting viruses transform healthy cells by inserting
strong promoters in the vicinity of growth-stimulating genes, while translocations in
some cancer cells place genes that should be "turned off" in the proximity of strong
promoters or enhancers.
Enhancer sequences do what their name suggests: They act to enhance the rate at
which genes are transcribed, and their effects can be quite powerful. Enhancers can be
thousands of nucleotides away from the promoters with which they interact, but they are
brought into proximity by the looping of DNA. This looping is the result of interactions
between the proteins bound to the enhancer and those bound to the promoter. The proteins
that facilitate this looping are called activators, while those that inhibit it are called
repressors.
Transcription of eukaryotic genes by polymerases I and III is initiated in a similar
manner, but the promoter sequences and transcriptional activator proteins vary.
Strand Elongation
Once transcription is initiated, the DNA double helix unwinds and RNA polymerase
reads the template strand, adding nucleotides to the 3′ end of the growing chain (Figure 2b).
At a temperature of 37 degrees Celsius, new nucleotides are added at an estimated rate of
about 42-54 nucleotides per second in bacteria (Dennis & Bremer, 1974), while eukaryotes
proceed at a much slower pace of approximately 22-25 nucleotides per second (Izban & Luse,
1992).
Transcription Termination
Inverted repeat sequences at the end of a gene allow folding of the newly
transcribed RNA sequence into a hairpin loop. This terminates transcription and stimulates
release of the mRNA strand from the transcription machinery.
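To make the notion of an inverted repeat concrete, the following Python sketch searches an RNA for a short segment that is followed, after a loop, by its own reverse complement. It is a simplified illustration: the RNA sequence, the stem and loop lengths, and the function names are assumptions, and only Watson-Crick pairs are considered.

```python
# Sketch: look for an inverted repeat that could fold into a hairpin.
# Only Watson-Crick pairs are considered; the RNA below is invented.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(rna))

def find_hairpin(rna, stem=4, min_loop=3):
    """Return (left arm start, right arm start) of the first inverted
    repeat with the given stem length, or None if there is none."""
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        left = rna[i:i + stem]
        partner = reverse_complement(left)
        j = rna.find(partner, i + stem + min_loop)
        if j != -1:
            return i, j
    return None

print(find_hairpin("AAGCCGUUUUCGGCAA"))  # (2, 10)
```

In this example GCCG (starting at position 2) can pair with CGGC (starting at position 10), leaving UUUU as the loop.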
Both polyadenylation and termination make use of the same consensus sequence,
and the interdependence of the processes was demonstrated in the late 1980s by work from
several groups. One group of scientists working with mouse globin genes showed that
introducing mutations into the consensus sequence AATAAA, known to be necessary for
poly(A) addition, inhibited both polyadenylation and transcription termination. They
measured the extent of termination by hybridizing transcripts with the different poly(A)
consensus sequence mutants with wild-type transcripts, and they were able to see a decrease
in the signal of hybridization, suggesting that proper termination was inhibited. They
therefore concluded that polyadenylation was necessary for termination (Logan et al., 1987).
Another group obtained similar results using a monkey viral system, SV40 (simian virus 40).
They introduced mutations into a poly(A) site, which caused mRNAs to accumulate to levels
far above wild type (Connelly & Manley, 1988).
Gene expression is linked to RNA transcription, which cannot happen without RNA
polymerase. However, this is where the similarities between prokaryote and eukaryote
expression end.
absorb sodium, while pancreatic cells produce insulin. How is this possible? The answer lies
in differential use of the genome; in other words, different cells within the body express
different portions of their DNA. This process, which begins with the transcription of DNA
into RNA, ultimately leads to changes in cell function. Changes in transcription are thus a
fundamental means by which cell function is regulated across species. In fact, even single-
celled organisms, such as bacteria, regulate gene transcription depending on cues in their
environments. Therefore, understanding how transcription is regulated is fundamental to
deciphering the mysteries of the genome.
Transcription: An Overview
In all species, transcription begins with the binding of the RNA
polymerase complex (or holoenzyme) to a special DNA sequence at the beginning of the
gene known as the promoter. Activation of the RNA polymerase complex enables
transcription initiation, and this is followed by elongation of the transcript. In turn, transcript
elongation leads to clearing of the promoter, and the transcription process can begin yet
again. Transcription can thus be regulated at two levels: the promoter level (cis regulation)
and the polymerase level (trans regulation). These elements differ among bacteria and
eukaryotes.
Transcription in Bacteria
In bacteria, all transcription is performed by a single type of RNA polymerase.
This polymerase contains four catalytic subunits and a single regulatory subunit known as
sigma (σ). Interestingly, several distinct sigma factors have been identified, and each of these
oversees transcription of a unique set of genes. Sigma factors are thus discriminatory, as each
binds a distinct set of promoter sequences.
process (Losick & Stragier, 1992). Each of these sigma factors recognizes the promoters of the
genes in its group, not those "seen" by other sigma factors. This simple example illustrates
how transcription can be regulated in both cis and trans to cause changes in cell function.
Therefore, while bacteria accomplish transcription of all genes using a single kind of RNA
polymerase, the use of different sigma factor subunits provides an extra level of control.
Transcription in Eukaryotes
Figure 1
Eukaryotic cells are more complex than bacteria in many ways, including in terms
of transcription. Specifically, in eukaryotes, transcription is achieved by three different types
of RNA polymerase (RNA pol I-III). These polymerases differ in the number and type of
subunits they contain, as well as the class of RNAs they transcribe; that is, RNA pol I
transcribes ribosomal RNAs (rRNAs), RNA pol II transcribes RNAs that will become
messenger RNAs (mRNAs) and also small regulatory RNAs, and RNA pol III transcribes
small RNAs such as transfer RNAs (tRNAs).
Figure 1: A gene is expressed through the processes of transcription and translation.
Figure 2: The amino acids specified by each mRNA codon. Multiple codons can code for the
same amino acid.
The codons are written 5' to 3', as they appear in the mRNA. AUG is an initiation codon;
UAA, UAG, and UGA are termination (stop) codons.
Within all cells, the translation machinery resides within a specialized organelle called
the ribosome. In eukaryotes, mature mRNA molecules must leave the nucleus and travel to
the cytoplasm, where the ribosomes are located. On the other hand, in prokaryotic organisms,
ribosomes can attach to mRNA while it is still being transcribed. In this situation, translation
begins at the 5' end of the mRNA while the 3' end is still attached to DNA.
In all types of cells, the ribosome is composed of two subunits, one large and one
small: 50S and 30S in prokaryotes, 60S and 40S in eukaryotes (S, for svedberg unit, is a
measure of sedimentation velocity and, therefore, mass). Each subunit exists separately in the cytoplasm, but the two
join together on the mRNA molecule. The ribosomal subunits contain proteins and
specialized RNA molecules—specifically, ribosomal RNA (rRNA) and transfer RNA
(tRNA). The tRNA molecules are adaptor molecules—they have one end that can read the
triplet code in the mRNA through complementary base-pairing, and another end that attaches
to a specific amino acid (Chapeville et al., 1962; Grunberger et al., 1969). The idea that
tRNA was an adaptor molecule was first proposed by Francis Crick, co-discoverer of DNA
structure, who did much of the key work in deciphering the genetic code (Crick, 1958).
Within the ribosome, the mRNA and aminoacyl-tRNA complexes are held
together closely, which facilitates base-pairing. The rRNA catalyzes the attachment of each
new amino acid to the growing chain.
So, what is the purpose of the UTR? It turns out that the leader sequence is
important because it contains a ribosome-binding site. In bacteria, this site is known as the
Shine-Dalgarno box (AGGAGG), after scientists John Shine and Lynn Dalgarno, who first
characterized it. A similar site in vertebrates was characterized by Marilyn Kozak and is thus
known as the Kozak box. In bacterial mRNA, the 5' UTR is normally short; in human
mRNA, the median length of the 5' UTR is about 170 nucleotides. If the leader is long, it may
contain regulatory sequences, including binding sites for proteins, that can affect
the stability of the mRNA or the efficiency of its translation.
A DNA transcription unit is composed, from its 3' to 5' end, of an RNA-coding
region (pink rectangle) flanked by a promoter region (green rectangle) and a
terminator region (black rectangle). Regions to the left, or moving towards the 3' end, of the
transcription start site are considered "upstream;" regions to the right, or moving towards
the 5' end, of the transcription start site are considered "downstream."
Figure 4: The translation initiation complex.
When translation begins, the small subunit of the ribosome and an initiator
tRNA molecule assemble on the mRNA transcript. The small subunit of the ribosome has
three binding sites: an amino acid site (A), a polypeptide site (P), and an exit site (E). The
initiator tRNA molecule carrying the amino acid methionine binds to the AUG start codon of
the mRNA transcript at the ribosome’s P site where it will become the first amino acid
incorporated into the growing polypeptide chain. Here, the initiator tRNA molecule is shown
binding after the small ribosomal subunit has assembled on the mRNA; the order in which
this occurs is unique to prokaryotic cells. In eukaryotes, the free initiator tRNA first binds the
small ribosomal subunit to form a complex. The complex then binds the mRNA transcript, so
that the tRNA and the small ribosomal subunit bind the mRNA simultaneously.
Although methionine (Met) is the first amino acid incorporated into any new
protein, it is not always the first amino acid in mature proteins—in many proteins, methionine
is removed after translation. In fact, if a large number of proteins are sequenced and
compared with their known gene sequences, methionine (or formylmethionine) occurs at
the N-terminus of all of them. However, not all amino acids are equally likely to occur
second in the chain, and the second amino acid influences whether the initial methionine is
enzymatically removed. For example, many proteins begin with methionine followed by
alanine. In both prokaryotes and eukaryotes, these proteins have the methionine removed, so
that alanine becomes the N-terminal amino acid (Table 1). However, if the second amino acid
is lysine, which is also frequently the case, methionine is not removed (at least in the sample
proteins that have been studied thus far). These proteins therefore begin with methionine
followed by lysine (Flinta et al., 1986).
Figure 5: The large ribosomal subunit binds to the small ribosomal subunit to complete the
initiation complex.
The initiator tRNA molecule, carrying the methionine amino acid that will
serve as the first amino acid of the polypeptide chain, is bound to the P site on the ribosome.
The A site is aligned with the next codon, which will be bound by the anticodon of the next
incoming tRNA.
Figure 6
The next phase in translation is known as the elongation phase (Figure 6).
First, the ribosome moves along the mRNA in the 5'-to-3' direction, which requires the
elongation factor G, in a process called translocation. The tRNA that corresponds to the
second codon can then bind to the A site, a step that requires elongation factors (in E. coli,
these are called EF-Tu and EF-Ts), as well as guanosine triphosphate (GTP) as an energy
source for the process. Upon binding of the tRNA-amino acid complex in the A site, GTP is
cleaved to form guanosine diphosphate (GDP), then released along with EF-Tu to be recycled
by EF-Ts for the next round.
This process is repeated until all the codons in the mRNA have been read
by tRNA molecules, and the amino acids attached to the tRNAs have been linked together in
the growing polypeptide chain in the appropriate order. At this point, translation must be
terminated, and the nascent protein must be released from the mRNA and ribosome.
Termination of Translation
There are three termination codons that are employed at the end of a protein-
coding sequence in mRNA: UAA, UAG, and UGA. No tRNAs recognize these codons. Thus,
in the place of these tRNAs, one of several proteins, called release factors, binds and
facilitates release of the mRNA from the ribosome and subsequent dissociation of the
ribosome.
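The reading of an mRNA from the AUG start codon to one of the three termination codons can be sketched as follows. The codon table is the standard genetic code written in a compact form; the example mRNA, the one-letter amino acid output, and the function name are assumptions made for illustration, and the sketch ignores everything the ribosome, tRNAs, and protein factors actually do.

```python
# Sketch: translate an mRNA from the first AUG to the first stop codon.
# The codon table is the standard genetic code; the mRNA is made up.

from itertools import product

BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(mrna):
    """Return one-letter amino acids from the first AUG up to a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":      # UAA, UAG, or UGA: no tRNA, so stop
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("GGAUGGCCAAAUAAGG"))  # MAK
```

The example reads AUG, GCC, and AAA (methionine, alanine, lysine) and stops at UAA; the bases before the AUG and after the stop codon are left untranslated.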
Aside from their role in composing proteins, amino acids have many
biologically important functions. They are also energy metabolites, and many of them are
essential nutrients. Amino acids can often function as chemical messengers in communication
between cells. For example, Arvid Carlsson discovered in 1957 that the amine
3-hydroxytyramine (dopamine) is not only a precursor for the synthesis of adrenaline from
tyrosine but also a key neurotransmitter. Certain amino acids, such as citrulline and
ornithine, which are intermediates in urea biosynthesis, are important intermediaries in
various pathways involving nitrogenous metabolism. Although other amino acids are
important in several pathways, S-adenosylmethionine acts as a universal methylating agent.
What follows is a discussion of amino acids, their biosynthesis, and the evolution of their
synthesis pathways, with a focus on tryptophan and lysine.
The way amino acids are synthesized has changed during the history of
Earth. The Hadean eon represents the time from which Earth first formed. The subsequent
Archean eon (approximately 3,500 million years ago) is known as the age of bacteria and
archaea. The Proterozoic eon saw oxygen accumulate in Earth's atmosphere, and the
Phanerozoic eon coincides with the major diversification of animals, plants, and fungi.
rather than emerging from an electrified primordial soup, amino acids emerge from
biosynthetic enzymatic reactions.
step. Similarly, asparagine and glutamine are synthesized from aspartate and glutamate,
respectively, by an amidation reaction step. The synthesis of other amino acids requires more
steps; between one and thirteen biochemical reactions are necessary to produce the different
amino acids from their precursors of the central metabolism (Figure 2). The relative uses of
amino acid biosynthetic pathways vary widely among species because different synthesis
pathways have evolved to fulfill unique metabolic needs in different organisms. Although
some pathways are present in certain organisms, they are absent in others. Therefore,
experimental results about amino acid metabolism that are achieved with model organisms
may not always have relevance for the majority of other organisms.
How do certain amino acids become essential for a given organism? Studies
in ecology and evolution give some clues. Organisms evolve under environmental
constraints, which are dynamic over time. If an amino acid is available for uptake,
the selective pressure to keep intact the genes responsible for that pathway might be lowered,
because they would not be constantly expressing these biosynthetic genes. Without the
selective pressure, the biosynthetic routes might be lost or the gene could allow mutations
that would lead to a diversification of the enzyme's function. Following this logic, amino
acids that are essential for certain organisms might not be essential for other organisms
subjected to different selection pressures. For example, in 2000, Ishikawa and colleagues
completed the genome sequence of the endosymbiont bacteria Buchnera, and in it they found
the genes for the biosynthetic pathways necessary for synthesizing essential amino acids
for its symbiotic host, the aphid. Interestingly, those genes for the synthesis of its
"nonessential" amino acids are almost completely missing (Shigenobu et al. 2000). In this
way, Buchnera provides the host with some amino acids and obtains the other amino acids
from the host (Baumann 2005; Pal et al. 2006).
Fani and coworkers performed a comparative analysis of the synthesis enzyme sequences and
their phylogenetic distribution that suggested that the syntheses of leucine, lysine, and
arginine were initially carried out with the same set of versatile enzymes. Over the course of
time came a series of gene duplication events and enzyme specializations that gave rise to the
unambiguous pathways we know today. Which of the pathways appeared earlier is still a
source of query and debate.
SeC, and similar mechanisms dependent on tRNA have been described for asparagine,
glutamine, and cysteine. Owing to the appearance of SeC across all three domains of life,
scientists wonder if it is an ancestral mechanism for amino acid biosynthesis or simply a
coincidence of selection pressures.
Summary
Scientists now recognize twenty-two amino acids as the building blocks of proteins: the
twenty common ones and two more, selenocysteine and pyrrolysine. Amino acids have
several functions. Their primary function is to act as the monomer unit in protein synthesis.
They can also be used as substrates for biosynthetic reactions; the nucleotide bases and a
number of hormones and neurotransmitters are derived from amino acids. Amino acids can
be synthesized from glycolytic or Krebs cycle intermediates. The essential amino acids, those
that are needed in the diet, require more steps to be synthesized. Some amino acids need to be
synthesized when charged onto their corresponding tRNAs. We have discussed only two
biosynthetic routes: the Trp pathway, which appears to have evolved only once, and the Lys
pathway, which seems to have evolved independently in different lineages. Prevailing
evidence suggests that metabolic pathways themselves seem to be evolving following the
patchwork assembly model, which proposes that pathways originated through
the recruitment of generalist enzymes that could react with a wide range of substrates. The
study of the evolution of amino acid metabolism has helped us understand the evolution of
metabolism in general.
nucleotides in mRNA: adenine (A), uracil (U), guanine (G), and cytosine (C). Thus, 20 amino
acids are coded by only four unique bases in mRNA, but just how is this coding achieved?
The Codon
The discordance between the number of nucleic acid bases and the number of
amino acids immediately eliminates the possibility of a code of one base per amino acid. In
fact, even two nucleotides per amino acid (a doublet code) could not account for 20 amino
acids (with four bases and a doublet code, there would only be 16 possible combinations [4² =
16]). Thus, the smallest combination of four bases that could encode all 20 amino acids
would be a triplet code. However, a triplet code produces 64 (4³ = 64) possible combinations,
or codons. Thus, a triplet code introduces the problem of there being more than three times
as many codons as amino acids. Either these "extra" codons produce redundancy,
with multiple codons encoding the same amino acid, or there must instead be numerous dead-
end codons that are not linked to any amino acid.
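The counting argument above can be checked by simply enumerating every possible "word" of one, two, or three bases. The short Python sketch below does this; the function name is an illustrative assumption.

```python
# Sketch: count the distinct codons possible for a given codon length.

from itertools import product

BASES = "AUGC"

def codon_count(length):
    """Number of distinct words of the given length over four bases."""
    return len(list(product(BASES, repeat=length)))

for length in (1, 2, 3):
    n = codon_count(length)
    verdict = "enough" if n >= 20 else "too few"
    print(f"{length} base(s) per codon: {n} combinations ({verdict} for 20 amino acids)")
```

Only the triplet reaches 20, and it overshoots to 64, which is the surplus the paragraph above describes.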
Figure 1
Once the budding molecular biology community was convinced about the
triplet code, the race to decode which triplets specified which amino acids began. The
simplest way to decipher the code would be to start with an mRNA molecule of known
sequence, use it to direct the synthesis of a protein, and then determine the amino acid
sequence of the synthesized protein. Then, comparison of the original mRNA sequence with
the amino acid sequence of the synthesized protein could provide a means for directly
decoding the genetic code (Figure 1).
However, at the time when this decoding project was conducted, researchers
did not yet have the benefit of modern sequencing techniques. To circumvent this challenge,
Marshall W. Nirenberg and Heinrich J. Matthaei (1962) made their own simple, artificial
mRNA and identified the polypeptide product that was encoded by it. To do this, they used
the enzyme polynucleotide phosphorylase, which randomly joins together any RNA
nucleotides that it finds. Nirenberg and Matthaei began with the simplest codes possible.
Specifically, they added polynucleotide phosphorylase to a solution of pure uracil (U), such
that the enzyme would generate RNA molecules consisting entirely of a sequence of U's;
these molecules were known as poly(U) RNAs. Each poly(U) RNA thus contained a pure
series of UUU codons, assuming a triplet code. These poly(U) RNAs were added to 20 tubes
containing components for protein synthesis (ribosomes, activating enzymes, tRNAs, and
other factors). Each tube contained one of the 20 amino acids, which were radioactively
labeled. Of the 20 tubes, 19 failed to yield a radioactive polypeptide product. Only one tube,
the one that had been loaded with the labeled amino acid phenylalanine, yielded a product.
Nirenberg and Matthaei had therefore found that the UUU codon could be translated into the
amino acid phenylalanine. Similar experiments using poly(C) and poly(A) RNAs showed that
proline was encoded by the CCC codon, and lysine by the AAA codon.
Figure 2
In further experiments to decode the other codons, Nirenberg and his colleagues
made artificial RNAs containing defined proportions of two or three different bases. As
previously mentioned, polynucleotide phosphorylase joins nucleotides randomly; as a result,
these artificial RNAs contained random mixtures of the bases in proportion to the amounts of
bases mixed. Hence, the resulting products provided clues that the researchers could use to
deduce potential codon–amino acid relationships.
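The reasoning behind the mixed-base experiments can be made concrete with a short calculation. The sketch below assumes that each position in the RNA is filled independently, with probability equal to the fraction of that base in the mixture; the function name `expected_codon_frequencies` and the 3:1 mixture of U and G are hypothetical values chosen for illustration, not data from the experiments.

```python
from itertools import product


def expected_codon_frequencies(base_fractions):
    """Expected frequency of every triplet in a randomly assembled RNA.

    base_fractions -- dict mapping each base to its fraction in the mixture
    Assumes bases are joined independently of their neighbours.
    """
    frequencies = {}
    for triplet in product(base_fractions, repeat=3):
        probability = 1.0
        for base in triplet:
            probability *= base_fractions[base]
        frequencies["".join(triplet)] = probability
    return frequencies


# Hypothetical mixture: three parts U to one part G.
mixture = {"U": 0.75, "G": 0.25}
ranked = sorted(expected_codon_frequencies(mixture).items(), key=lambda kv: -kv[1])
for codon, probability in ranked:
    print(f"{codon}: {probability:.4f}")
```

In such a mixture UUU would be the most common codon, codons with two U's and one G would be less common, and GGG would be rare. Comparing these expected frequencies with the relative amounts of each amino acid incorporated suggests the base composition of a codon, but not the order of its bases.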
Then, in 1965, H. Gobind Khorana and his colleagues used another method to
further crack the genetic code. These researchers had the insight to employ chemically
synthesized RNA molecules of known repeating sequences rather than random sequences.
For example, an artificial mRNA of alternating guanine and uracil nucleotides
(GUGUGUGUGUGU) should be read in translation as two alternating codons, GUG and
UGU, thus encoding a protein of two alternating amino acids. Translation of the artificial
GUGU mRNA yielded a protein of alternating cysteine and valine residues. However, this
technique alone could not determine whether GUG or UGU encoded cysteine, for example.
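The following Python sketch shows why a repeating GU sequence is read as two alternating codons no matter where reading begins. The function name `codons_by_frame` is introduced here for illustration only.

```python
def codons_by_frame(mrna, codon_size=3):
    """Map each reading-frame offset to the complete codons read from it."""
    frames = {}
    for offset in range(codon_size):
        last_start = len(mrna) - codon_size
        frames[offset] = [mrna[i:i + codon_size]
                          for i in range(offset, last_start + 1, codon_size)]
    return frames


for offset, codons in codons_by_frame("GUGUGUGUGUGU").items():
    print(offset, codons)
# 0 ['GUG', 'UGU', 'GUG', 'UGU']
# 1 ['UGU', 'GUG', 'UGU']
# 2 ['GUG', 'UGU', 'GUG']
```

Every frame contains only GUG and UGU in alternation, so the product alternates between two amino acids, but nothing in the output indicates which codon pairs with which amino acid. The same function applied to a poly(U) sequence returns only UUU in every frame.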
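Marshall Nirenberg and Philip Leder resolved such ambiguities by exploiting the fact that
an RNA as short as a single codon (three bases) could still bind to a ribosome, even if this short sequence was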
incapable of directing protein synthesis. The ribosome-bound codon could then base pair with
a particular tRNA that carried the amino acid specified by the codon (Figure 2).
Nirenberg and Leder thus synthesized many short mRNAs with known codons.
They then added the mRNAs one by one to a mix of ribosomes and aminoacyl-tRNAs with
one amino acid radioactively labeled. For each short mRNA, they determined whether the
labeled aminoacyl-tRNA was bound to the short mRNA-like sequence and ribosome, and was
therefore retained on a filter (unbound aminoacyl-tRNAs passed through the filter). These
experiments provided conclusive demonstrations of the particular aminoacyl-tRNA that bound
to each mRNA codon.
Figure 3: The amino acids specified by each mRNA codon. Multiple codons can code for
the same amino acid.
The codons are written 5' to 3', as they appear in the mRNA. AUG is an initiation codon;
UAA, UAG, and UGA are termination (stop) codons.
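The codon table in Figure 3 can be expressed as a lookup table and used to translate a short mRNA. The sketch below writes the standard genetic code as a compact 64-letter string of one-letter amino acid codes (with `*` for stop), ordered with the bases taken as U, C, A, G. The names `CODON_TABLE` and `translate`, and the example sequence, are illustrative assumptions. As a simplification, translation starts at the first AUG found in the sequence.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for UUU, UUC, UUA, UUG, UCU, ... GGG; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no initiation codon present
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # UAA, UAG, or UGA
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGUUUCCCAAAUAAGG"))  # MFPK (Met-Phe-Pro-Lys)
```

The table also makes the redundancy visible: 61 codons specify only 20 amino acids, so most amino acids are encoded by more than one codon.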