
TRANSCRIPTION

DNA-Dependent Synthesis of RNA


Our discussion of RNA synthesis begins with a
comparison between transcription and DNA
replication.

 Transcription resembles replication in its


fundamental chemical mechanism, its polarity
(direction of synthesis), and its use of a template.

 And like replication, transcription has initiation,


elongation, and termination phases
Transcription differs from replication in that it does
not require a primer and, generally, involves only
limited segments of a DNA molecule.

 Additionally, within transcribed segments only one


DNA strand serves as a template.
RNA Is Synthesized by RNA Polymerases
DNA-dependent RNA polymerase requires

I. A DNA template

II. All four ribonucleoside 5’-triphosphates (ATP, GTP,


UTP, and CTP) as precursors of the nucleotide
units of RNA

III. Mg2+ and Zn2+


The chemistry and mechanism of RNA synthesis
closely resemble those used by DNA polymerases.

RNA polymerase elongates an RNA strand by


adding ribonucleotide units to the 3'-hydroxyl end,
building RNA in the 5'→3' direction.

Unlike DNA polymerase, RNA polymerase does not


require a primer to initiate synthesis.

Initiation occurs when RNA polymerase binds at


specific DNA sequences called promoters.
The structure of RNA polymerase and the signals
that control transcription differ among organisms,
particularly between prokaryotes and eukaryotes.

Therefore, the discussions of prokaryotic and


eukaryotic transcription are presented separately.
TRANSCRIPTION OF PROKARYOTIC
GENES
Properties of prokaryotic RNA
polymerase
In bacteria, one species of RNA polymerase
synthesizes all of the RNA except for the short RNA
primers needed for DNA replication.
(RNA primers are synthesized by a specialized enzyme,
primase.)

RNA polymerase is a multisubunit enzyme that


recognizes a nucleotide sequence (the promoter
region) at the beginning of a length of DNA that is to be
transcribed.
It next makes a complementary RNA copy of the
DNA template strand.
It then recognizes the end of the DNA sequence
to be transcribed (the termination region).
RNA is synthesized from its 5'-end to its 3'-end,
antiparallel to its DNA template strand.
The template is copied as it is in DNA synthesis.
The two complementary DNA strands have
different roles in transcription.
The strand that serves as template for RNA
synthesis is called the template strand (anti-sense).
 The DNA strand complementary to the template,
the nontemplate strand, or coding strand (sense), is
identical in base sequence to the RNA transcribed
from the gene, with U in the RNA in place of T in the DNA.
 Transcription by RNA polymerase involves a core
enzyme and several auxiliary proteins.
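The relationship between the template strand, the coding strand, and the RNA described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are hypothetical, and strands are treated as plain strings.

```python
# Base pairing used when RNA is copied from the DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5_to_3: str) -> str:
    """Return the template (anti-sense) strand written 3'->5'."""
    return "".join(DNA_TO_DNA[base] for base in coding_5_to_3.upper())


def transcribe_template(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') copied from a template strand given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_to_rna(coding_5_to_3: str) -> str:
    """The RNA matches the coding (sense) strand, with U in place of T."""
    return coding_5_to_3.upper().replace("T", "U")


coding = "ATGGCCTTAGAC"                  # hypothetical coding strand, 5'->3'
template = template_from_coding(coding)  # same positions, written 3'->5'
rna = transcribe_template(template)      # transcript, 5'->3'
```

Both routes give the same transcript: copying the template strand, or reading the coding strand and swapping T for U.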
CORE ENZYME
Four of the enzyme’s peptide subunits,
2α, 1β, and 1β', are referred to as the core
enzyme
• 2α subunits are required for enzyme
assembly.
• β' is required for template binding.
• β is responsible for the 5'→3' RNA
polymerase activity.
However, the core enzyme cannot recognize the promoter
region on the DNA template.
The function of a fifth subunit, Ω (omega), is
unclear.
HOLOENZYME
The σ subunit (“sigma factor”) enables RNA
polymerase to recognize promoter regions on
the DNA.

The σ subunit plus the core enzyme make up


the holoenzyme.

Different σ factors are required for the


recognition of different groups of genes.
Steps in RNA Synthesis
The process of transcription of a typical gene of E.
coli can be divided into three phases:
•Initiation
•Elongation
•and termination.
A transcription unit extends from the promoter to
the termination region.
The initial product of transcription produced by
RNA polymerase is termed the primary transcript.
INITIATION
Transcription begins with the binding of the RNA
polymerase holoenzyme to a region of the DNA
known as the promoter, which is not transcribed.
The prokaryotic promoter contains characteristic
consensus sequences.
Consensus sequences are idealized sequences in
which the base shown at each position is the base
most frequently (but not necessarily always)
encountered at that position.
 Those that are recognized by prokaryotic RNA
polymerase σ factors include:

 –35 sequence

Pribnow box
By convention, the DNA base pairs that
correspond to the beginning of an RNA molecule
are given positive numbers, and those preceding
the RNA start site are given negative numbers.

The promoter region thus extends between
positions -70 and +30.
These sequences are important interaction sites for
the σ (sigma) subunit.

Although the sequences are not identical for all


bacterial promoters in this class, certain nucleotides
that are particularly common at each position form
a consensus sequence.

The consensus sequence at the -10 region is
(5')TATAAT(3'); the consensus sequence at the -35
region is (5')TTGACA(3').
The efficiency with which an RNA polymerase
binds to a promoter and initiates transcription is
determined by
• these sequences,
• the spacing between them,
• and their distance from the transcription start
site.
Mutations that affect the function of a given
promoter often involve a base pair in these
regions.
Variations in the consensus sequence also affect
the efficiency of RNA polymerase binding and
transcription initiation.
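A rough way to see how the two consensus sequences and the spacing between them define a promoter is to scan a nontemplate-strand sequence for both hexamers. The sketch below is an illustration only: the function names are hypothetical, and the allowed spacer lengths (16 to 18 bp) and the mismatch tolerance are assumptions, since the text says only that spacing matters.

```python
MINUS_35 = "TTGACA"  # consensus at the -35 region
MINUS_10 = "TATAAT"  # consensus at the -10 region (Pribnow box)


def mismatches(site: str, consensus: str) -> int:
    """Count positions where a candidate site differs from the consensus."""
    return sum(1 for a, b in zip(site, consensus) if a != b)


def find_promoters(coding: str, max_mismatch: int = 1, spacers=(16, 17, 18)):
    """Return (start of -35, start of -10, spacer length, total mismatches).

    Indices are 0-based positions on the nontemplate strand, read 5'->3'.
    The spacer lengths are an assumed window, not values from the text.
    """
    coding = coding.upper()
    hits = []
    for i in range(len(coding) - 6 + 1):
        m35 = mismatches(coding[i:i + 6], MINUS_35)
        if m35 > max_mismatch:
            continue
        for spacer in spacers:
            j = i + 6 + spacer
            if j + 6 > len(coding):
                break
            m10 = mismatches(coding[j:j + 6], MINUS_10)
            if m10 <= max_mismatch:
                hits.append((i, j, spacer, m35 + m10))
    return hits


# Hypothetical test sequence: perfect -35 and -10 boxes, 17 bp apart.
example = "GG" + "TTGACA" + "CG" * 8 + "C" + "TATAAT" + "GGGA"
promoter_hits = find_promoters(example)
```

A site with fewer mismatches to the consensus would, on this simple view, be expected to bind RNA polymerase more efficiently. Real promoter strength also depends on factors this sketch ignores.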
–35 Sequence
A consensus sequence (5'-TTGACA-3'), centered
about 35 bases to the left of the transcription
start site, is the initial point of contact for the
holoenzyme, and a closed complex is formed.

The regulatory sequences that control


transcription are, by convention, designated by
the 5'→3' nucleotide sequence on the
nontemplate strand.
A base in the promoter region is assigned a
negative number if it occurs prior to (to the
left of, toward the 5'-end of, or “upstream” of)
the transcription start site.

Therefore, the TTGACA sequence is centered


at approximately base –35.

The first base at the transcription start site is


assigned a position of +1. There is no base
designated “0.”
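The numbering convention, with the start site at +1 and no base numbered 0, can be written as a small helper. This is a sketch only: the function name and the use of 0-based string indices are assumptions made for illustration.

```python
def promoter_position(index: int, start_index: int) -> int:
    """Convert a 0-based index on the nontemplate strand to the +/- convention.

    start_index is the index of the first transcribed base, which is +1.
    Bases upstream of it are -1, -2, and so on; position 0 is never used.
    """
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset


start = 50  # hypothetical index of the transcription start site
positions = [promoter_position(i, start) for i in (15, 40, 49, 50, 51)]
```

With the start site at index 50, index 15 maps to -35 and index 40 maps to -10, matching the centers of the two consensus sequences.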
Pribnow Box
The holoenzyme moves and covers a second
consensus sequence (5'-TATAAT-3'), centered at
about –10, which is the site of initial DNA melting
(unwinding).
 Melting of a short stretch (about 14 bases)
converts the closed complex to an open one
known as a transcription bubble.
A mutation in either the –10 or the –35
sequence can affect the transcription of the gene
controlled by the mutant promoter
ELONGATION
Once the promoter region has been recognized
and bound by the holoenzyme,
• local unwinding of the DNA helix continues,
which is mediated by the RNA polymerase.
Unwinding generates supercoils in the DNA that
can be relieved by DNA topoisomerases.
 RNA polymerase begins to synthesize a transcript
of the DNA sequence, and several short pieces of
RNA are made.
The elongation phase is said to begin when the
transcript (typically starting with a purine)
exceeds ten nucleotides in length.

Sigma is then released, and the core enzyme is


able to leave (“clear”) the promoter and move
along the template strand in a processive
manner.

During transcription, a short DNA-RNA hybrid


helix is formed.
Like DNA polymerase, RNA polymerase uses
nucleoside triphosphates as substrates and
releases pyrophosphate each time a
nucleoside monophosphate is added to the
growing chain.

As with replication, transcription is always in


the 5'→3' direction.

In contrast to DNA polymerase, RNA


polymerase does not require a primer and
does not appear to have proofreading activity.
TERMINATION
The elongation of the single-stranded RNA
chain continues until a termination signal is
reached.

Termination can be
1. Intrinsic (spontaneous) or
2. Dependent upon the participation of a
protein known as the ρ (rho) factor.
ρ-Independent termination
 Most ρ-independent terminators have two
distinguishing features.
 The first is a region that produces an RNA
transcript with self-complementary sequences,
permitting the formation of a hairpin structure
centered 15 to 20 nucleotides before the projected
end of the RNA strand.
 The second feature is a highly conserved string of
three A residues in the template strand that are
transcribed into U residues near the 3’ end of the
hairpin.
 When a polymerase arrives at a termination
site with this structure, it pauses.
 Formation of the hairpin structure in the RNA
disrupts several A=U base pairs in the RNA-DNA
hybrid segment and may disrupt important
interactions between RNA and the RNA
polymerase.

 Thus facilitating dissociation of the transcript.


 Seen with most prokaryotic genes, this requires
that a sequence in the DNA template generate a
sequence in the nascent (newly made) RNA that is
self-complementary.

 This allows the RNA to fold back on itself, forming


a GC-rich stem (stabilized by H-bonds) plus a loop.

 This structure is known as a “hairpin”.

 Additionally, just beyond the hairpin, the RNA


transcript contains a string of Us at the 3'-end.
 The bonding of these Us to the
complementary As of the DNA template is
weak.

 This facilitates the separation of the newly


synthesized RNA from its DNA template, as the
double helix “zips up” behind the RNA
polymerase.
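The two features of a ρ-independent terminator described above, a self-complementary GC-rich stem and a run of Us just beyond it, can be searched for in a transcript. The sketch below is a simplified illustration: the stem length, loop sizes, GC threshold, and minimum U run are all assumed values, and only perfectly paired stems are found.

```python
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would base-pair with rna, read 5'->3'."""
    return "".join(RNA_PAIRS[base] for base in reversed(rna))


def find_intrinsic_terminators(rna, stem=6, loops=range(3, 9), min_u=4, min_gc=0.5):
    """Return (stem start, loop length, index just past the hairpin).

    A hit needs a GC-rich stem of `stem` perfectly paired bases, a loop,
    and at least `min_u` consecutive Us immediately after the hairpin.
    All thresholds are illustrative assumptions.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - 2 * stem):
        left = rna[i:i + stem]
        gc_fraction = (left.count("G") + left.count("C")) / stem
        if gc_fraction < min_gc:
            continue
        for loop in loops:
            j = i + stem + loop
            right = rna[j:j + stem]
            if len(right) < stem:
                break
            if right != reverse_complement(left):
                continue
            if rna[j + stem:j + stem + min_u] == "U" * min_u:
                hits.append((i, loop, j + stem))
    return hits


# Hypothetical transcript: GC-rich stem, 4-base loop, then a run of Us.
transcript = "AAGC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUGA"
terminators = find_intrinsic_terminators(transcript)
no_u_run = find_intrinsic_terminators(transcript.replace("UUUUUU", "ACACAC"))
```

Replacing the U run removes the hit, which mirrors the point in the text that the weak A-U pairing after the hairpin is part of the signal.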
ρ-Dependent termination
 This requires the participation of an additional
protein, rho (ρ), which is a hexameric adenosine
triphosphatase (ATPase) with helicase activity.
ρ binds a C-rich “rho recognition site” near the 5'-end
of the nascent RNA and, using its ATPase activity,
migrates in the 5'→3' direction along the RNA
until it reaches the transcription complex (RNA
polymerase + template strand) paused at the
termination site.
 The ATP dependent helicase activity of ρ separates
the RNA-DNA hybrid helix, causing the release of the
RNA.

 ATP is hydrolyzed by ρ protein during the


termination process.
 Helicases are the motor proteins that move
directionally along a nucleic acid phosphodiester
backbone.

 Thus separating two annealed nucleic acid strands


(i.e., DNA, RNA, or RNA-DNA hybrid) using energy
derived from ATP hydrolysis.
Specific Sequences Signal Termination
of RNA Synthesis

 RNA synthesis is processive (that is, the RNA


polymerase has high processivity), necessarily so,
because if an RNA polymerase released an RNA
transcript prematurely, it could not resume
synthesis of the same RNA but instead would have
to start over.
The average number of nucleotides added
before a polymerase dissociates defines its
processivity.

DNA polymerases vary greatly in processivity;


some add just a few nucleotides before
dissociating, others add many thousands.
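One simple way to put numbers on this definition is to assume that the polymerase has a fixed probability of dissociating before each nucleotide addition. That fixed-probability model is an assumption made here for illustration, not something the text states. Under it, the number of nucleotides added follows a geometric distribution with mean (1 - p) / p.

```python
def expected_processivity(p_dissociate: float) -> float:
    """Mean nucleotides added before dissociation.

    Assumes a constant probability p_dissociate of falling off the
    template before each addition (an illustrative model only).
    """
    if not 0 < p_dissociate <= 1:
        raise ValueError("p_dissociate must be in the interval (0, 1]")
    return (1 - p_dissociate) / p_dissociate


low = expected_processivity(0.2)      # dissociates readily
high = expected_processivity(0.0001)  # rarely dissociates
```

With p = 0.2 the enzyme adds only a few nucleotides on average, while p = 0.0001 gives an average of many thousands, the two extremes mentioned above.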
 However, an encounter with certain DNA
sequences results in a pause in RNA synthesis,
and at some of these sequences transcription
is terminated.
 The process of termination is not yet well
understood in eukaryotes, so our focus is again
on bacteria. E. coli has at least two classes of
termination signals:
one class relies on a protein factor called
ρ (rho), and
the other is ρ (rho) independent.
Action of antibiotics:
 Some antibiotics prevent bacterial cell growth by
inhibiting RNA synthesis.
 Rifampicin inhibits bacterial DNA-dependent
RNA synthesis by inhibiting bacterial DNA-
dependent RNA polymerase.
 For example, rifampin inhibits the initiation of
transcription by binding to the β subunit of
prokaryotic RNA polymerase,
 Thus interfering with the formation of the first
phosphodiester bond.
Resistance to rifampicin arises from mutations
that alter residues of the rifampicin binding site on
RNA polymerase, resulting in decreased affinity for
rifampicin.

 Resistant mutations map to the rpoB gene,


encoding RNA polymerase beta subunit.

 Rifampicin is useful in the treatment of


tuberculosis.
TRANSCRIPTION OF
EUKARYOTIC GENES.
 The transcriptional machinery in the nucleus of a
eukaryotic cell is much more complex than that in
bacteria.
 Eukaryotic transcription involves separate
polymerases for the synthesis of
 rRNA,
 tRNA and
 mRNA.
 In addition, a large number of proteins called
transcription factors (TFs) are also involved.
TFs bind to distinct sites on the DNA—either close
(proximal) to promoter region, or some distance
away (distal).
They are required for
 The assembly of a transcription complex at the
promoter and
 The determination of which genes are to be
transcribed.
 Each eukaryotic RNA polymerase has its own
promoters and TFs.
 For TFs to recognize and bind to their specific
DNA sequences, the chromatin structure in that
region must be altered (remodeled) in order to
allow the access to the DNA.
Eukaryotic Cells Have Three Kinds of
Nuclear RNA Polymerases.
 Eukaryotes have three RNA polymerases,
designated I, II, and III, which are distinct
complexes but have certain subunits in common.

 Each polymerase has a specific function and is


recruited to a specific promoter sequence.
 Each class of RNA polymerase recognizes
particular types of genes.
Mitochondrial RNA polymerase.

 In addition to nuclear RNA polymerases


eukaryotes also contain a mitochondrial RNA
polymerase, a single enzyme that more
closely resembles bacterial RNA polymerase than
the eukaryotic nuclear enzymes.
RNA polymerase I (Pol I) is responsible for the
synthesis of only one type of RNA, a transcript
called preribosomal RNA (or pre-rRNA), which
contains the precursor for the
 18S,
 5.8S, and
 28S rRNAs.
 Pol I promoters vary greatly in sequence from
one species to another.
The principal function of RNA polymerase II (Pol
II) is synthesis of mRNAs and some specialized
RNAs.

 This enzyme can recognize thousands of


promoters that vary greatly in sequence.

 Many Pol II promoters have a few sequence


features in common, including a TATA box
(eukaryotic consensus sequence TATAAA) near
base pair -30 and an Inr sequence (initiator) near
the RNA start site at +1.
In the Inr consensus sequence shown here, N
represents any nucleotide; Y, a pyrimidine
nucleotide.
This enzyme synthesizes the nuclear precursors
of mRNA that are subsequently translated to
produce proteins.

Polymerase II also synthesizes certain small
noncoding RNAs (ncRNAs), such as
 small nuclear RNA (snRNA),
 small nucleolar RNA (snoRNA) and
 microRNA (miRNA).
 RNA polymerase III (Pol III) makes
 tRNAs,
 the 5S rRNA,
 and some other small specialized RNAs.
 The promoters recognized by Pol III are well
characterized.
 Interestingly, some of the sequences required for
the regulated initiation of transcription by Pol III
are located within the gene itself, whereas others
are in more conventional locations upstream of
the RNA start site.
Four RNA Polymerases of Eukaryotic Cells

Polymerase     Type of Genes Transcribed
RNA pol I      5.8S, 18S, and 28S rRNA genes
RNA pol II     protein-coding genes, snoRNA genes, some snRNA genes, microRNAs
RNA pol III    tRNA genes, 5S rRNA genes, some snRNA genes, genes for other small RNAs
RNA pol IV     plants only; small interfering RNAs (siRNAs)
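The division of labor among the polymerases can be captured as a simple lookup. The entries below restate the table. The dictionary layout, the short RNA class labels, and the function name are choices made for this sketch.

```python
# Products of each polymerase, restated from the table.
POLYMERASE_PRODUCTS = {
    "RNA pol I": ["5.8S rRNA", "18S rRNA", "28S rRNA"],
    "RNA pol II": ["mRNA", "snoRNA", "snRNA", "miRNA"],
    "RNA pol III": ["tRNA", "5S rRNA", "snRNA", "other small RNAs"],
    "RNA pol IV": ["siRNA"],  # plants only
}


def polymerases_for(rna_class: str) -> list[str]:
    """Return every polymerase listed as making the given class of RNA."""
    return [
        polymerase
        for polymerase, products in POLYMERASE_PRODUCTS.items()
        if rna_class in products
    ]


snrna_sources = polymerases_for("snRNA")
```

The lookup shows at a glance that snRNA genes are split between two polymerases, whereas each rRNA is assigned to exactly one.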


Supercoiling in Eukaryotes.
 Eukaryotic DNA is associated with tightly
bound basic proteins, called histones.

 These serve to order the DNA into basic


structural units, called Nucleosomes that
resemble beads on a string.

 Nucleosomes are further arranged into


increasingly more complex structures that
organize and condense the long DNA
molecules into chromosomes.
 There are five classes of histones, designated
H1, H2A, H2B, H3, and H4.

 These small proteins are positively charged at


physiologic pH as a result of their high content
of lysine and arginine.

 Because of their positive charge, they form


ionic bonds with negatively charged DNA.


 Nucleosomes: Two molecules each of H2A,


H2B, H3, and H4 form the structural core of
the individual "beads."

 Around this core, a segment of the DNA


double helix is wound nearly twice, forming a
negatively supertwisted helix.

Nucleosome Structure

Peter J. Russell, iGenetics: Copyright © Pearson Education, Inc., publishing as Benjamin Cummings.
Histone H1, of which there are several related
species, is not found in the nucleosome core
but instead binds to the linker DNA chain
between the nucleosome beads.

Nucleosomes connected together by linker
DNA and H1 histone to produce the “beads-
on-a-string” extended form of chromatin

Chromatin structure and gene expression.
 The association of DNA with histones to form
nucleosomes affects the ability of the transcription
machinery to access the DNA to be transcribed.

 Most actively transcribed genes are found in a


relatively relaxed form of chromatin called
euchromatin, whereas most inactive segments of
DNA are found in highly condensed heterochromatin.
The interconversion of these forms is called
chromatin remodeling.

 A major mechanism by which chromatin is


remodeled is through acetylation of lysine
residues at the amino terminus of histone proteins.
Acetylation, mediated by histone
acetyltransferases (HATs), eliminates the positive
charge on the lysine and thereby decreases the
interaction of the histone with the negatively
charged DNA.

 Removal of the acetyl group by histone


deacetylases (HDACs) restores the positive charge,
and fosters stronger interactions between histones
and DNA.
RNA Polymerase II Requires Many
Other Protein Factors for Its Activity.
 RNA polymerase II is central to eukaryotic gene
expression and has been studied extensively.

 Although this polymerase is strikingly more


complex than its bacterial counterpart, the
complexity masks a remarkable conservation of
structure, function, and mechanism.
Pol II is a huge enzyme with 12 subunits.

The largest subunit (RPB1) exhibits a high
degree of homology to the βˊ subunit of bacterial
RNA polymerase.

Another subunit (RPB2) is structurally similar to
the bacterial β subunit, and two others (RPB3 and
RPB11) show some structural homology to the two
bacterial α subunits.
Promoters and transcription factors for
RNA polymerase II:
 In some genes transcribed by RNA polymerase II

 A sequence of nucleotides that is nearly identical


to that of the Pribnow box is found centered about
25 nucleotides upstream of the transcription start
site.

This promoter consensus sequence is called the


TATA or Hogness box.
 Between 70 and 80 nucleotides upstream of the
transcription start site often is found a second
consensus sequence known as the CAAT box.

 In other genes, for example, those that are always


(“constitutively”) expressed, no TATA box is typically
present.
A constitutive gene is one that is transcribed
continually, in contrast to a facultative gene,
which is transcribed only as needed.

 Instead, a GC-rich region (GC box) may be found.


 RNA polymerase II requires an array of other
proteins, called transcription factors, in
order to form the active transcription complex.

 The general transcription factors required at


every Pol II promoter (factors usually designated
TFII with an additional identifier) are highly
conserved in all eukaryotes.
 The process of transcription by Pol II can be
described in terms of several phases—
 assembly,
 initiation,
 elongation,
 termination
 Each associated with characteristic proteins.
 The step-by-step pathway described below leads
to active transcription in vitro.
Assembly of RNA Polymerase and
Transcription Factors at a Promoter.
 The formation of a closed complex begins when
 The TATA-binding protein (TBP) binds to the TATA
box.
 TBP is bound in turn by the transcription factor
TFIIB, which also binds to DNA on either side of
TBP.
 TFIIA binding, although not always essential, can
stabilize the TFIIB-TBP complex on the DNA.
The TFIIB-TBP complex is next bound by another
complex consisting of TFIIF and Pol II.

Finally, TFIIE and TFIIH bind to create the closed
complex.
 TFIIH has DNA helicase activity that promotes the
unwinding of DNA near the RNA start site (a
process requiring the hydrolysis of ATP), thereby
creating an open complex.
 Counting all the subunits of the various essential
factors (excluding TFIIA), this minimal active
assembly has more than 30 polypeptides.
RNA Strand Initiation and Promoter
Clearance
During synthesis of the initial 60 to 70 nucleotides

of RNA,

 First TFIIE and

 Then TFIIH is released, and

 Pol II enters the elongation phase of transcription


Elongation, Termination, and Release
 TFIIF remains associated with Pol II throughout
elongation.
 During this stage, the activity of the polymerase is
greatly enhanced by proteins called elongation
factors.
 The elongation factors suppress pausing during
transcription and also coordinate interactions
between protein complexes involved in the
posttranscriptional processing of mRNAs.
 Once the RNA transcript is completed,
transcription is terminated.

 Pol II is dephosphorylated and recycled, ready to


initiate another transcript.
 No one consensus sequence is found in all core
promoters.

 Because these sequences are on the same


molecule of DNA as the gene being transcribed,
they are called cis-acting elements.

 Such sequences serve as binding sites for TFs,


which in turn interact with each other and with
RNA polymerase II.
 Because TFs are encoded by different genes,
synthesized in the cytosol, and
must transit to their sites of action, they are
called trans-acting factors.

 General TFs are the minimal requirements for


 Recognition of the promoter,
 Recruitment of RNA polymerase II to the
promoter, and
 Initiation of transcription.
Eukaryotic general transcription factors, such as
the CAAT box-binding transcription factor (CTF),
specific protein 1 (SP1), and
TFIID,
bind to consensus sequences found in promoters
for RNA polymerase II.
 Note: The SP1 transcription factor contains a zinc
finger protein motif, by which it binds directly to
DNA GC rich consensus sequence and enhances
gene transcription.
 In contrast to the holoenzyme of prokaryotes,
eukaryotic RNA polymerase II does not itself
recognize and bind the promoter.

 Instead, TFIID recognizes and binds the TATA box,


and TFIIF brings the polymerase to the promoter.

 The helicase activity of TFIIH melts the DNA and


its kinase activity phosphorylates polymerase,
allowing it to clear the promoter.
 Specific TFs (transcriptional activators) bind to
sequences within and outside of the core
promoter.
 They are required to
 modulate the frequency of initiation,
 to mediate the response to signals such as
hormones and
 to regulate which genes are expressed at a given
point in time.
 A typical protein-coding eukaryotic gene has
binding sites for many such factors.

 In addition to binding DNA, specific TFs also bind


other proteins (“coactivators”), recruiting them to
the transcription complex.

 Coactivators include the HAT enzymes involved in


chromatin remodeling.
Role of enhancers in eukaryotic gene
regulation:

 Enhancers are special cis-acting DNA sequences


that increase the rate of initiation of transcription
by RNA polymerase II.

 Enhancers are on the same chromosome as the


gene whose transcription they stimulate.
However, they can
 1) be located upstream (to the 5'-side) or
downstream (to the 3'-side) of the transcription
start site;
 2) be close to or thousands of base pairs away
from the promoter and
 3) occur on either strand of the DNA.
 Enhancers contain DNA sequences called
“response elements” that bind specific TFs that
function as transcriptional activators.
By bending or looping the DNA, these enhancer-
binding factors can interact with other
transcription factors bound to a promoter and with
RNA polymerase II, thereby stimulating
transcription.

 Silencers are similar to enhancers in that they act


over long distances; however, they reduce gene
expression.
Inhibitors of RNA polymerase II:
This enzyme is inhibited by α-amanitin,
a potent toxin produced by the poisonous
mushroom Amanita phalloides (sometimes called
the “death cap”).

 α-Amanitin forms a tight complex with the


polymerase, thereby inhibiting mRNA synthesis
and, ultimately, protein synthesis.
RNA Processing.
Many of the RNA molecules in bacteria and
virtually all RNA molecules in eukaryotes are
processed to some degree after synthesis.

 Some of the most interesting molecular events in


RNA metabolism occur during this postsynthetic
processing.

 Intriguingly, several of the enzymes that catalyze


these reactions consist of RNA rather than protein.
 The discovery of these catalytic RNAs, or
ribozymes, has brought a revolution in thinking
about RNA function and about the origin of life.

 A newly synthesized RNA molecule is called a


primary transcript.

 Perhaps the most extensive processing of


primary transcripts occurs in
eukaryotic mRNAs and
in tRNAs of both bacteria and eukaryotes.
POSTTRANSCRIPTIONAL MODIFICATION OF
RNA.
 A primary transcript is the initial, linear, RNA
copy of a transcription unit—the segment of DNA
between specific initiation and termination
sequences.
 The primary transcripts of both prokaryotic and
eukaryotic tRNA and rRNA are
posttranscriptionally modified by cleavage of the
original transcripts by ribonucleases.
tRNAs are then further modified to help give each
species its unique identity.

 In contrast, prokaryotic mRNA is generally


identical to its primary transcript

 Whereas eukaryotic mRNA is extensively


modified both co- and posttranscriptionally.
 Noncoding tracts that break up the coding region
of the transcript are called introns, and the coding
segments are called exons.

 In a process called splicing, the introns are


removed from the primary transcript and the
exons are joined to form a continuous sequence
that specifies a functional polypeptide.
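Splicing, as described above, amounts to cutting out the intron intervals and joining what remains. The sketch below assumes the intron boundaries are already known and are supplied as 0-based, half-open (start, end) pairs. The function name and the example sequence are hypothetical, and the sketch does not model how the cell recognizes splice sites.

```python
def splice(primary_transcript: str, introns) -> str:
    """Remove introns and join the exons in their original order.

    introns is an iterable of (start, end) pairs, 0-based and half-open,
    that must not overlap or run past the end of the transcript.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or end > len(primary_transcript) or start > end:
            raise ValueError("introns overlap or fall outside the transcript")
        exons.append(primary_transcript[position:start])  # exon before this intron
        position = end
    exons.append(primary_transcript[position:])  # final exon
    return "".join(exons)


# Hypothetical primary transcript: exon 1, one intron, exon 2.
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "GAUUAA"
pre_mrna = exon1 + intron + exon2
mature = splice(pre_mrna, [(6, 18)])
```

Here the 12-nucleotide intron is removed and the two exons are joined into one continuous coding sequence.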
Ribosomal RNA
 rRNAs of both prokaryotic and eukaryotic cells
are generated from long precursor molecules
called pre-rRNAs.

 The 23S,
 16S, and
5S rRNA
of prokaryotes are produced from a single
pre-rRNA molecule.
In eukaryotes, a 45S pre-rRNA transcript is
processed in the nucleolus to form the
18S, 28S, and 5.8S rRNAs characteristic of
eukaryotic ribosomes.

 The 5S rRNA of most eukaryotes is made as a


completely separate transcript by a different
polymerase (Pol III instead of Pol I) and modified
separately.
The 45S precursor is methylated at more than 100 of
its 14,000 nucleotides, mostly on the 2’-OH groups of
ribose units retained in the final products.

 A series of enzymatic cleavages produces the 18S,


5.8S, and 28S rRNAs.

The cleavage reactions require RNAs found in the
nucleolus, called small nucleolar RNAs (snoRNAs).
 The pre-rRNAs are cleaved by ribonucleases to
yield intermediate-sized pieces of rRNA,

 Which are further processed (trimmed by


exonucleases and modified at some bases and
riboses) to produce the required RNA species.

 In eukaryotes, rRNA genes are found in long,


tandem arrays.
 rRNA synthesis and processing occur in the
nucleolus, with
base and
sugar modifications facilitated by small
nucleolar RNAs (snoRNA).

 Some of the proteins destined to become


components of the ribosome associate with
pre-rRNA prior to and during its modification.
Transfer RNA

 Most cells have 40 to 50 distinct tRNAs, and


eukaryotic cells have multiple copies of many of
the tRNA genes.

 Transfer RNAs are derived from longer RNA


precursors by enzymatic removal of nucleotides
from the 5’ and 3’ ends.
 In eukaryotes, introns are present in a few tRNA
transcripts and must be excised.

 Where two or more different tRNAs are


contained in a single primary transcript, they are
separated by enzymatic cleavage.

 The endonuclease RNase P, found in all


organisms, removes RNA at the 5’ end of tRNAs.

 This enzyme contains both protein and RNA.


 The RNA component is essential for activity,

 But in bacterial cells it can carry out its


processing function with precision even without
the protein component.

 RNase P is therefore another example of a


catalytic RNA.

 The 3’ end of tRNAs is processed by one or more


nucleases, including the exonuclease RNase D.
 Transfer RNA precursors may undergo further
posttranscriptional processing.

The 3’-terminal trinucleotide CCA(3’), to which
an amino acid will be attached during protein
synthesis, is absent from
some bacterial and
all eukaryotic tRNA precursors and
is added during processing.
 This addition is carried out by tRNA
nucleotidyltransferase,
 an unusual enzyme that binds the three
ribonucleoside triphosphate precursors in separate
active sites and catalyzes formation of the
phosphodiester bonds to produce the CCA(3’)
sequence.

 The creation of this defined sequence of


nucleotides is therefore not dependent on a DNA
or RNA template—the template is the binding site
of the enzyme.
 The final type of tRNA processing is the
modification of some of the bases by
 methylation,
 deamination, or
 reduction.

 In the case of pseudouridine (Ψ), the base


(uracil) is removed and reattached to the sugar
through C-5.
 Some of these modified bases occur at
characteristic positions in all tRNAs

 The final step is splicing of the 14-nucleotide


intron.

 Introns are found in some eukaryotic tRNAs but


not in bacterial tRNAs.
Eukaryotic mRNA
 The collection of all the primary transcripts
synthesized in the nucleus by
 RNA polymerase II is known as heterogeneous
nuclear RNA (hnRNA).

 The pre-mRNA components of hnRNA undergo

 Extensive co- and posttranscriptional


modification in the nucleus.
These modifications usually include

5' capping

Addition of a poly-A tail

Removal of introns

Alternative splicing of mRNA molecules


Eukaryotic mRNAs Are Capped at the 5ˊ End

Most eukaryotic mRNAs have a 5ˊ cap,
a residue of 7-methylguanosine linked to the
5ˊ-terminal residue of the mRNA through an unusual
5ˊ,5ˊ-triphosphate linkage.

 The 5ˊ cap helps protect mRNA from


ribonucleases.
 The cap also binds to a specific cap binding
complex (CBC) of proteins and

 Participates in binding of the mRNA to the


ribosome to initiate translation.
The 5ˊ cap is formed by condensation of a
molecule of GTP with the
triphosphate at the 5ˊ end of the transcript.

The guanine is subsequently
methylated at N-7, and additional methyl groups
are often added at the 2ˊ hydroxyls of the first and
second nucleotides adjacent to the cap.
 Creation of the cap requires;
 Removal of the γ phosphate from the 5’-
triphosphate of the pre-mRNA,

 Followed by addition of GMP (from GTP) by the


nuclear enzyme guanylyl transferase.

 Methylation of this terminal guanine occurs in the


nucleus, and is catalyzed by
guanine-7-methyltransferase.
 The methyl groups are derived from S-
adenosylmethionine.

 All these reactions occur very early in


transcription, after the first 20 to 30 nucleotides
of the transcript have been added.
 All three of the capping enzymes, and through
them the 5ˊ end of the transcript itself, are
associated with the RNA polymerase II until the cap
is synthesized.

 The capped 5ˊ end is then released from the


capping enzymes and bound by the cap-binding
complex.
 Additional methylation steps may occur.

 The addition of this 7-methylguanosine “cap”


helps stabilize the mRNA and
permits initiation of translation.

 Eukaryotic mRNAs lacking the cap are not


efficiently translated.
Addition of a poly-A tail:
The poly(A) tail consists of multiple adenosine
monophosphates;
 In other words, it is a stretch of RNA that has
only adenine bases.

 Most eukaryotic mRNA (with several notable


exceptions, including those coding for the
histones) have a chain of 40–200 adenine
nucleotides attached to the 3'-end.
 The poly-A tail is not transcribed from the DNA,
but rather is added after transcription by the
nuclear enzyme polyadenylate polymerase, using
ATP as the substrate.
 The mRNA is cleaved downstream of a consensus
sequence, called the polyadenylation signal
sequence (AAUAAA), found near the 3'-end of the
RNA, and the poly-A tail is added to the new
3'-end.
 These tails help stabilize the mRNA, facilitate
its exit from the nucleus, and aid in translation.
 After the mRNA enters the cytosol, the poly-A
tail is gradually shortened.
 The poly(A) tail is added in a multistep process.
 The transcript is extended beyond the site where
the poly(A) tail is to be added, then is cleaved
at the poly(A) addition site by an endonuclease
component of a large enzyme complex, again
associated with RNA polymerase II.
 The mRNA site where cleavage occurs is marked
by two sequence elements:
 The highly conserved sequence (5’)AAUAAA(3’),
10 to 30 nucleotides on the 5’ side (upstream) of
the cleavage site, and
 A less well-defined sequence rich in G and U
residues, 20 to 40 nucleotides downstream of the
cleavage site.
 Cleavage generates the free 3’-hydroxyl group
that defines the end of the mRNA, to which A
residues are immediately added by polyadenylate
polymerase, which catalyzes the reaction
RNA + nATP → RNA–(AMP)n + nPPi
where n = 80 to 250.
 This enzyme does not require a template but
does require the cleaved mRNA as a primer.
 Pol II synthesizes RNA beyond the segment of the
transcript containing the cleavage signal
sequences, including the highly conserved
upstream sequence (5’)AAUAAA.
 1. The cleavage signal sequence is bound by an
enzyme complex that includes an endonuclease, a
polyadenylate polymerase, and several other
multisubunit proteins involved in sequence
recognition, stimulation of cleavage, and
regulation of the length of the poly(A) tail.
 2. The RNA is cleaved by the endonuclease at a
point 10 to 30 nucleotides 3’ to (downstream of)
the sequence AAUAAA.
 3. The polyadenylate polymerase synthesizes a
poly(A) tail 80 to 250 nucleotides long, beginning
at the cleavage site.
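
The three steps above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the function name, the default offset of 20 nucleotides, and the default tail length are illustrative choices, the first AAUAAA found is treated as the signal, and the downstream GU-rich element is ignored.

```python
POLY_A_SIGNAL = "AAUAAA"  # conserved upstream signal named in the text


def cleave_and_polyadenylate(pre_mrna: str, offset: int = 20,
                             tail_length: int = 100) -> str:
    """Cleave `offset` nt downstream of the first AAUAAA, then add a poly(A) tail.

    `offset` and `tail_length` are hypothetical parameters for illustration.
    """
    if not 10 <= offset <= 30:
        raise ValueError("cleavage is described as 10 to 30 nt downstream of AAUAAA")
    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        raise ValueError("no AAUAAA signal found in the transcript")
    # The cleavage point is measured from the 3' end of the signal.
    cleavage_site = signal_start + len(POLY_A_SIGNAL) + offset
    if cleavage_site > len(pre_mrna):
        raise ValueError("transcript ends before the cleavage site")
    # Everything 3' of the cleavage site is discarded; the tail is not templated.
    return pre_mrna[:cleavage_site] + "A" * tail_length


# Made-up transcript: short body, signal, spacer, GU-rich downstream region.
transcript = "GGCC" + "AAUAAA" + "C" * 25 + "GUGUGU"
mature = cleave_and_polyadenylate(transcript, offset=20, tail_length=80)
print(len(mature))
```

The tail is simply appended because, as stated above, polyadenylate polymerase needs the cleaved mRNA as a primer but no template.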
Removal of introns:
 Maturation of eukaryotic mRNA usually involves
the removal of RNA sequences (introns, or
intervening sequences), which do not code for
protein, from the primary transcript.
 The remaining coding sequences, the exons, are
joined together to form the mature mRNA.
 The intron is also present in the RNA copy of
the gene and must be removed by a process called
“RNA splicing”.

[Figure: pre-mRNA (containing an intron) → RNA splicing → mRNA → translation → protein]
RNA Splicing
 The process of removing introns and joining
exons is called splicing.
 The molecular complex that accomplishes these
tasks is known as the spliceosome.
 A few eukaryotic primary transcripts contain no
introns, for example, those from histone genes.
Splicing a pre-mRNA involves two reactions
[Figure: pre-mRNA with the intron branchpoint A → Step 1 → intermediates → Step 2 → spliced mRNA]
Splicing occurs in a “spliceosome”, an RNA-protein complex
[Figure (simplified): the spliceosome (~100 proteins + 5 small RNAs) converts pre-mRNA to spliced mRNA]

Splicing works similarly in different organisms, for example in
yeast, flies, worms, plants and animals.
 Some eukaryotic primary transcripts contain
only a few introns, whereas others, such as the
primary transcripts for the α chains of collagen,
contain more than 50 intervening sequences that
must be removed before mature mRNA is ready for
translation.
Role of snRNAs:
 In association with proteins, uracil-rich small
nuclear RNAs (snRNAs) form small nuclear
ribonucleoprotein particles (snRNPs, or “snurps”,
designated as U1, U2, etc.) that mediate splicing.
 They facilitate the removal of introns by forming
base pairs with the consensus sequences at each
end of the intron.
RNA Catalyzes the Splicing of Introns

 There are four classes of introns.
 The first two, the group I and group II introns,
differ in the details of their splicing mechanisms
but share one surprising characteristic: they are
self-splicing, and no protein enzymes are involved.
 Group I introns are found in some nuclear,
mitochondrial, and chloroplast genes coding for
rRNAs, mRNAs, and tRNAs.
 Group II introns are generally found in the
primary transcripts of mitochondrial or
chloroplast mRNAs in fungi, algae, and plants.
Mechanism of splicing:
 Most introns are not self-splicing, and these
types are not designated with a group number.
 The third and largest class of introns includes
those found in nuclear mRNA primary transcripts.
 These are called spliceosomal introns, because
their removal occurs within and is catalyzed by a
large protein complex called a spliceosome.
 Within the spliceosome, the introns undergo
splicing by the same lariat-forming mechanism as
the group II introns.
 The spliceosome is made up of specialized
RNA-protein complexes, small nuclear
ribonucleoproteins (snRNPs, often pronounced
“snurps”).
 Each snRNP contains one of a class of
eukaryotic RNAs, 100 to 200 nucleotides long,
known as small nuclear RNAs (snRNAs).
 Five snRNAs involved in splicing reactions are
generally found in abundance in eukaryotic
nuclei.
 These are designated as U1, U2, U4, U5, and U6.
 The binding of snRNPs brings the sequences of
the neighboring exons into the correct alignment
for splicing.
 Spliceosomal introns generally have the
dinucleotide sequences GU and AG at the 5’ and 3’
ends, respectively, and these sequences mark the
sites where splicing occurs.
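
The GU...AG rule can be illustrated with a short Python sketch. It assumes the intron coordinates are already known and only checks the boundary dinucleotides; the function name and the example sequences are made up, and real splice-site recognition depends on much more than these four bases.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) index pairs, end exclusive.

    Each intron must begin with GU and end with AG, as described in the text.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


# Hypothetical two-exon transcript with one intron.
exon1, intron, exon2 = "AUGGCC", "GUAAGUCUAACCCUUUUAG", "GGAUAA"
pre_mrna = exon1 + intron + exon2
print(splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))]))
```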
 The U1 snRNA contains a sequence
complementary to sequences near the 5’ splice
site of nuclear mRNA introns, and the U1 snRNP
binds to this region in the primary transcript.
 U2 is paired to the intron at a position
encompassing the branch-point A residue during
the splicing reaction.
 Base pairing of U2 snRNA causes a bulge that
displaces and helps to activate the adenylate,
whose 2'-OH will form the lariat structure
through a 2' → 5'-phosphodiester bond.
RNA Splicing Chemistry

 Specificity of splicing is a product of
specific nucleotide sequences.
 Exon-intron boundary = 5’ splice site (donor)
 Intron-exon boundary = 3’ splice site (acceptor)
 Third site, in the interior of the intron = branch
point site (sequence), found toward the 3’ end near
the polypyrimidine tract (Py tract)
 GU of the 5’ site and AG of the 3’ site are the
most highly conserved sites; both lie within the
intron
Lariat Formation

 The branch site is a 3-way junction involving the
2’-OH, creating a “branch” point.

 Addition of the U2, U4, U5, and U6 snRNPs
leads to formation of the spliceosome.
 The snRNPs together contribute five RNAs and
about 50 proteins to the spliceosome, a
supramolecular assembly nearly as complex as the
ribosome.
 ATP is required for assembly of the spliceosome,
but the RNA cleavage-ligation reactions do not
seem to require ATP.
Assembly of spliceosomes.
 The U1 and U2 snRNPs bind, then the remaining
snRNPs (the U4/U6 complex and U5) bind to form an
inactive spliceosome.
 Internal rearrangements convert this species to
an active spliceosome in which U1 and U4 have been
expelled and U6 is paired with both the 5’ splice
site and U2.
 After introns have been removed and exons
joined, the mature mRNA molecules leave the
nucleus and pass into the cytosol through pores in
the nuclear membrane.
 The introns in tRNA are removed by a different
mechanism.
Effect of splice site mutations:
 Mutations at splice sites can lead to improper
splicing and the production of aberrant
proteins.
 It is estimated that 15% of all genetic diseases
are a result of mutations that affect RNA
splicing.
 For example, mutations that cause the
incorrect splicing of β-globin mRNA are
responsible for some cases of β−thalassemia—a
disease in which the production of the β-globin
protein is defective.
Examples of the potential consequences of mutations on splicing
Mutations A, B, and C occur on the DNA (in a gene with exons 1 2 3 4 5).
 No mutation: normal mRNA (1 2 3 4 5); normal protein; active.
 Mutation A: truncated mRNA (1 2); truncated protein; inactive.
 Mutation B: exon 3 skipped (1 2 4 5); protein of different size
(smaller or longer); inactive or aberrant function.
 Mutation C: longer exon 4 (1 2 3 4 5); protein of different size
(smaller or longer); inactive or aberrant function.
 Therefore, understanding the mechanism of RNA
splicing in normal cells, and how it is regulated
in different tissues and at different stages of
development, is essential for developing
strategies to correct aberrant splicing in human
pathologies.
Alternative splicing
 In humans, many genes contain multiple introns.
[Figure: a gene with exons 1 to 5 separated by introns 1 to 4, spliced to give the mRNA 1 2 3 4 5]
 Usually all introns must be removed before the
mRNA can be translated to produce protein.
 However, multiple introns may be spliced
differently in different circumstances, for
example in different tissues.
pre-mRNA: 1 2 3 4 5
Heart muscle mRNA: 1 2 3 5
Uterine muscle mRNA: 1 3 4 5


 Thus one gene can encode more than one protein.
 The proteins are similar but not identical and may have distinct
properties.
 This is important in complex organisms.
Alternative splicing of mRNA
molecules:
 The pre-mRNA molecules from some genes
can be spliced in alternative ways in different
tissues.
 This produces multiple variations of the mRNA
and, therefore, of its protein product.
 This appears to be a mechanism for producing
a diverse set of proteins from a limited set of
genes.
 For example, in eukaryotic cells the mRNA for
tropomyosin, an actin filament-binding protein
of the cytoskeleton (and of the contractile
apparatus in muscle cells), undergoes extensive
tissue-specific alternative splicing with
production of multiple isoforms of the
tropomyosin protein.
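
As a rough illustration of how one pre-mRNA can yield several mRNAs, the sketch below joins different subsets of exons. The exon sequences and the function name are invented for the example; the exon choices mirror the heart muscle (1 2 3 5) and uterine muscle (1 3 4 5) patterns shown earlier.

```python
# Hypothetical exon sequences; only the exon numbering comes from the text.
EXONS = {
    1: "AUGGCU",
    2: "GAAUUC",
    3: "CCGGAU",
    4: "UUGACC",
    5: "GGCUAA",
}


def assemble_mrna(exons: dict[int, str], included: list[int]) -> str:
    """Join the selected exons in their genomic order."""
    return "".join(exons[number] for number in sorted(included))


heart_mrna = assemble_mrna(EXONS, [1, 2, 3, 5])    # exon 4 skipped
uterine_mrna = assemble_mrna(EXONS, [1, 3, 4, 5])  # exon 2 skipped

print(heart_mrna)
print(uterine_mrna)
```

Both products start and end with the same exons but differ internally, which is why the resulting proteins are similar but not identical.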
Reverse Transcriptase Produces DNA
from Viral RNA:
 Certain RNA viruses that infect animal cells
carry within the viral particle an
RNA-dependent DNA polymerase called
reverse transcriptase.
 On infection, the single-stranded RNA viral
genome (~10,000 nucleotides) and the enzyme
enter the host cell.
 The reverse transcriptase first catalyzes the
synthesis of a DNA strand complementary to
the viral RNA, then degrades the RNA strand of
the viral RNA-DNA hybrid and replaces it with DNA.
 The resulting duplex DNA often becomes
incorporated into the genome of the eukaryotic
host cell.
 These integrated (and dormant) viral genes
can be activated and transcribed, and the gene
products (viral proteins and the viral RNA genome
itself) are packaged as new viruses.
 The RNA viruses that contain reverse
transcriptases are known as retroviruses (retro
is the Latin prefix for “backward”).
 Reverse transcriptases have become important
reagents in the study of DNA-RNA relationships
and in DNA cloning techniques.
 They make possible the synthesis of DNA
complementary to an mRNA template, and
synthetic DNA prepared in this manner, called
complementary DNA (cDNA), can be used to
clone cellular genes.
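
The base-pairing logic behind first-strand cDNA synthesis can be sketched as a reverse complement. This is only an illustration of complementarity: the function name is made up, and the sketch ignores priming, RNA degradation, and second-strand synthesis.

```python
# RNA base -> complementary DNA base
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the DNA strand complementary to `mrna`, written 5'->3'."""
    # The new strand is antiparallel to the template, so the template is
    # read from its 3' end while the DNA is written from its 5' end.
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


print(first_strand_cdna("AUGGCC"))
```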
Some Retroviruses Cause Cancer and
AIDS:
 Retroviruses have featured prominently in
recent advances in the molecular understanding
of cancer.
 Most retroviruses do not kill their host cells
but remain integrated in the cellular DNA,
replicating when the cell divides.
 Some retroviruses, classified as RNA tumor
viruses, contain an oncogene that can cause the
cell to grow abnormally.
 The human immunodeficiency virus (HIV),
which causes acquired immune deficiency
syndrome (AIDS), is a retrovirus.
 Identified in 1983, HIV has an RNA genome
with standard retroviral genes along with several
other unusual genes.
 Unlike many other retroviruses, HIV kills many
of the cells it infects (principally T lymphocytes)
rather than causing tumor formation.
 This gradually leads to suppression of the
immune system in the host organism.

 The reverse transcriptase of HIV is even more
error prone than other known reverse
transcriptases (ten times more so), resulting in
high mutation rates in this virus.
 One or more errors are generally made every
time the viral genome is replicated, so any two
viral RNA molecules are likely to differ.
Overview of Protein Synthesis.

 Genetic information, stored in the
chromosomes and transmitted to daughter cells
through DNA replication, is expressed through
transcription to RNA and, in the case of
messenger RNA (mRNA), subsequent translation
into proteins (polypeptide chains).
 The pathway of protein synthesis is called
translation because the “language” of the
nucleotide sequence on the mRNA is translated
into the “language” of an amino acid sequence.
 The process of translation requires a genetic
code, through which the information contained
in the nucleic acid sequence is expressed to
produce a specific sequence of amino acids.
 Any alteration in the nucleic acid sequence may
result in an incorrect amino acid being inserted
into the polypeptide chain, potentially causing
disease or even death of the organism.
 Newly made proteins undergo a number of
processes to achieve their functional form.
 They must fold properly, and misfolding can
result in degradation of the protein.
 Many proteins are covalently modified to
activate them or alter their activities.
 Finally, proteins are targeted to their final
intra- or extracellular destinations by signals
present in the proteins themselves.
THE GENETIC CODE
• The genetic code is a dictionary that identifies
the correspondence between a sequence of
nucleotide bases and a sequence of amino
acids.
• Each individual word in the code is composed
of three nucleotide bases.
• These genetic words are called codons.

Codons:
By the 1960s it had long been apparent
that at least three nucleotide residues of
DNA are necessary to encode each amino
acid.
A codon is a triplet of nucleotides that
codes for a specific amino acid.
 Codons are presented in the mRNA language
of adenine (A), guanine (G), cytosine (C), and
uracil (U).
Their nucleotide sequences are always written
from the 5'-end to the 3'-end.

 The four nucleotide bases are used to produce the
three-base codons (4³ = 64).
 There are, therefore, 64 different combinations of
bases, taken three at a time (a triplet code).
 Sixty-one of the 64 codons code for the 20
common amino acids.
 The four code letters of DNA (A, T, G, and C) in
groups of two can yield only 4² = 16 different
combinations, insufficient to encode 20 amino
acids.
 Groups of three, however, yield 4³ = 64 different
combinations.
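
The counting argument can be checked by enumerating the combinations directly. The sketch below uses the mRNA alphabet (U in place of T); the variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"  # mRNA alphabet

# All ordered pairs and triplets of bases.
doublets = ["".join(combo) for combo in product(BASES, repeat=2)]
triplets = ["".join(combo) for combo in product(BASES, repeat=3)]

print(len(doublets))  # 4**2 combinations, too few for 20 amino acids
print(len(triplets))  # 4**3 combinations, enough for 20 amino acids
```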
 Translation occurs in such a way that these
nucleotide triplets are read in a successive,
nonoverlapping fashion.
 A specific first codon in the sequence
establishes the reading frame, in which a new
codon begins every three nucleotide residues.

Several codons serve special functions

Initiation codon
 The initiation codon AUG is the most common
signal for the beginning of a polypeptide in all
cells, in addition to coding for Met residues in
internal positions of polypeptides.
Termination (“stop” or “nonsense”) codons:
 Three of the codons, UAG, UGA, and UAA, do not
code for amino acids, but rather are termination
codons.
 When one of these codons appears in an mRNA
sequence, it normally signals the end of
polypeptide synthesis; that is, synthesis of the
polypeptide coded for by that mRNA stops.
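
Reading frame, initiation, and termination can be put together in a small translation sketch. It assumes the standard codon table (written compactly in U, C, A, G order, with * for stop), takes the first AUG as the start, and uses a made-up example sequence; real initiation-site selection is more involved.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG, three bases at a time, until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiation codon, nothing to translate
    peptide = []
    # Successive, nonoverlapping triplets in the frame set by the AUG.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # termination codon ends synthesis
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GCCAUGGGCCAGGGUUGGUAAGC"))
```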
“Dictionary” of amino acid code words in
mRNAs.
 The third base of each codon (in bold type)
plays a lesser role in specifying an amino acid
than the first two.
 The three termination codons are shaded in
pink.
 The initiation codon AUG is shaded in green.
 All the amino acids except methionine and
tryptophan have more than one codon.
 In most cases, codons that specify the same
amino acid differ only at the third base.
 In a random sequence of nucleotides, 1 in
every 20 codons in each reading frame is, on
average, a termination codon.
 In general, a reading frame without a
termination codon among 50 or more codons is
referred to as an open reading frame (ORF).
 Long open reading frames usually correspond
to genes that encode proteins.
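
A minimal sketch of the ORF idea follows. It scans each of the three reading frames of one strand for the longest run of codons without a stop codon and applies the 50-codon threshold mentioned above; the function names are illustrative, and start codons and the complementary strand are deliberately ignored. The last line shows where the "1 in 20" figure comes from: 3 of the 64 codons are stops, roughly 1 in 21.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def longest_stop_free_run(seq: str, frame: int) -> int:
    """Longest run of consecutive non-stop codons in the given frame (0, 1, or 2)."""
    longest = current = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            current = 0  # a stop codon closes the current run
        else:
            current += 1
            longest = max(longest, current)
    return longest


def has_open_reading_frame(seq: str, min_codons: int = 50) -> bool:
    """True if any frame has at least `min_codons` codons without a stop."""
    return any(longest_stop_free_run(seq, frame) >= min_codons for frame in range(3))


# Fraction of all possible codons that are termination codons.
stop_fraction = len(STOP_CODONS) / 64
print(stop_fraction)
```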
 A striking feature of the genetic code is that an
amino acid may be specified by more than one
codon, so the code is described as degenerate.
 A degenerate code is one in which several code
words have the same meaning: in the genetic code
there are many instances in which different codons
specify the same amino acid.


 This does not suggest that the code is flawed:
although an amino acid may have two or more
codons, each codon specifies only one amino acid.
 The degeneracy of the code is not uniform.
For example:
 Methionine and tryptophan have single codons,
 Three amino acids (Leu, Ser, Arg) have six
codons,
 Five amino acids have four,
 Isoleucine has three, and
 Nine amino acids have two.
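
These counts can be checked against the standard codon table. The sketch below assumes the standard code written as a 64-letter string in U, C, A, G order (with * for stop) and tallies how many codons each amino acid has; the variable names are illustrative.

```python
from collections import Counter

# Standard genetic code in UCAG order; '*' marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Number of codons assigned to each amino acid (stops excluded).
codons_per_amino_acid = Counter(aa for aa in AMINO_ACIDS if aa != "*")

# How many amino acids have 1, 2, 3, 4, or 6 codons.
distribution = Counter(codons_per_amino_acid.values())

for codon_count in sorted(distribution):
    print(codon_count, "codon(s):", distribution[codon_count], "amino acid(s)")
```

The codon counts sum to 61, matching the number of sense codons given earlier.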
 The genetic code is nearly universal.
 With the intriguing exception of a few minor
variations in mitochondria, some bacteria, and
some single-celled eukaryotes, amino acid codons
are identical in all species examined so far.
 Human beings, E. coli, tobacco plants,
amphibians, and viruses share the same genetic
code.
 Thus it would appear that all life forms have a
common evolutionary ancestor, whose genetic
code has been preserved throughout biological
evolution.
 Even the variations reinforce this theme.



[Figure: an example DNA sequence shown with the amino acids it encodes in one-letter code. The reading frame begins at the ATG codon: ATG GGC CAG GGT TGG ... is read as M-G-Q-G-W ...]
