
Ribosome

The word ribosome is derived from ‘ribo’, from ribonucleic acid, and ‘some’, from the Greek word
‘soma’, which means ‘body’. Ribosomes are tiny, dense, spheroidal particles (150 to 200 Å in
diameter) that are found in both prokaryotic and eukaryotic cells.

Structure

The ribosome is a highly complex cellular machine. It is largely made up of specialized RNA
known as ribosomal RNA (rRNA) as well as dozens of distinct proteins (the exact number varies
slightly between species). The ribosomal proteins and rRNAs are arranged into two distinct
ribosomal pieces of different size, known generally as the large and small subunit of the ribosome.
Ribosomes consist of two subunits that fit together and work as one to translate the mRNA into a
polypeptide chain during protein synthesis. Because they are formed from two subunits of non-
equal size, they are slightly longer in the axis than in diameter.
Figure: Structure of ribosome.

Prokaryotic ribosomes

Prokaryotic ribosomes are around 20 nm (200 Å) in diameter and are composed of 65% rRNA and
35% ribosomal proteins. Eukaryotic ribosomes are between 25 and 30 nm (250–300 Å) in diameter
with an rRNA-to-protein ratio that is close to 1. Crystallographic work has shown that there are no
ribosomal proteins close to the reaction site for polypeptide synthesis. This suggests that the
protein components of ribosomes do not directly participate in peptide bond formation catalysis,
but rather that these proteins act as a scaffold that may enhance the ability of rRNA to synthesize
protein.

The unit of measurement used to describe the ribosomal subunits and the rRNA fragments is the
Svedberg unit, a measure of the rate of sedimentation in centrifugation rather than size. This
accounts for why fragment names do not add up: for example, bacterial 70S ribosomes are made
of 50S and 30S subunits.

Bacteria have 70S ribosomes, each consisting of a small (30S) and a large (50S) subunit. In E. coli,
for example, the small subunit has a 16S rRNA (consisting of 1540 nucleotides) that is bound to 21
proteins. The large subunit is composed of a 5S rRNA (120 nucleotides), a 23S rRNA (2900
nucleotides) and 31 proteins.

Eukaryotic Ribosome

Eukaryotic ribosomes are also known as 80S ribosomes, referring to their sedimentation
coefficients in Svedberg units, because they sediment faster than the prokaryotic (70S) ribosomes.
Eukaryotic ribosomes have two unequal subunits, designated small subunit (40S) and large subunit
(60S) according to their sedimentation coefficients. Both subunits contain dozens of ribosomal
proteins arranged on a scaffold composed of ribosomal RNA (rRNA). The small subunit monitors
the complementarity between tRNA anticodon and mRNA, while the large subunit catalyzes
peptide bond formation.

Compared to their prokaryotic homologs, many of the eukaryotic ribosomal proteins are enlarged
by insertions or extensions to the conserved core. Furthermore, several additional proteins are
found in the small and large subunits of eukaryotic ribosomes, which do not have prokaryotic
homologs. The 40S subunit contains an 18S ribosomal RNA (abbreviated 18S rRNA), which is
homologous to the prokaryotic 16S rRNA. The 60S subunit contains a 28S rRNA that is
homologous to the prokaryotic 23S ribosomal RNA. In addition, it contains a 5.8S rRNA that
corresponds to the 5' end of the 23S rRNA, and a short 5S rRNA. Both 18S and 28S have multiple
insertions to the core rRNA fold of their prokaryotic counterparts, which are called expansion
segments. Recent research suggests heterogeneity in the
ribosomal composition, i.e., that the stoichiometry among core ribosomal proteins in wild-type
yeast cells and embryonic stem cells depends both on the growth conditions and on the number of
ribosomes bound per mRNA.

Function

Ribosomes are minute particles consisting of RNA and associated proteins that function to
synthesize proteins. Proteins are needed for many cellular functions such as repairing damage or
directing chemical processes. Ribosomes can be found floating within the cytoplasm or attached
to the endoplasmic reticulum. Basically, their main function is to convert genetic code into an
amino acid sequence and to build protein polymers from amino acid monomers.

Ribosomes act as catalysts in two extremely important biological processes called peptidyl transfer
and peptidyl hydrolysis. The peptidyl transferase (PT) center is responsible for producing peptide
bonds during protein elongation.

Translation

Ribosomes are the workplaces of protein biosynthesis, the process of translating mRNA into
protein. The mRNA comprises a series of codons which are decoded by the ribosome so as to make
the protein. Using the mRNA as a template, the ribosome traverses each codon (3 nucleotides) of
the mRNA, pairing it with the appropriate amino acid provided by an aminoacyl-tRNA.
Aminoacyl-tRNA contains a complementary anticodon on one end and the appropriate amino acid
on the other. For fast and accurate recognition of the appropriate tRNA, the ribosome utilizes large
conformational changes (conformational proofreading). The small ribosomal subunit, typically
bound to an aminoacyl-tRNA carrying the first amino acid, methionine, binds to an AUG codon on
the mRNA and recruits the large ribosomal subunit. The ribosome contains three RNA binding
sites, designated A, P and E. The A-site
binds an aminoacyl-tRNA or termination release factors, the P-site binds a peptidyl-tRNA (a tRNA
bound to the poly-peptide chain); and the E-site (exit) binds a free tRNA. Protein synthesis begins
at a start codon AUG near the 5' end of the mRNA. mRNA binds to the P site of the ribosome first.
The ribosome recognizes the start codon by using the Shine-Dalgarno sequence of the mRNA in
prokaryotes and Kozak box in eukaryotes.
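
As a rough illustration of the decoding step described above, the following Python sketch finds the first AUG in an mRNA and then reads codons three nucleotides at a time until it reaches a stop codon. The codon table is deliberately partial (only the codons used in the example), and the function name `translate` and the example sequence are assumptions made for illustration. The sketch ignores Shine-Dalgarno or Kozak context, tRNA selection and proofreading.

```python
# Partial codon table: only the codons needed for the example below.
# The full standard genetic code has 64 entries.
CODON_TABLE = {
    "AUG": "Met",  # start codon, also codes for methionine
    "GCU": "Ala",
    "ACG": "Thr",
    "UUU": "Phe",
    "UAA": None,   # stop codons carry no amino acid
    "UAG": None,
    "UGA": None,
}


def translate(mrna):
    """Return the amino acids read from the first AUG to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in the partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:
            break  # stop codon: release the chain
        peptide.append(amino_acid)
    return peptide


# Hypothetical mRNA: short leader, AUG, three codons, stop codon, trailer.
print(translate("GGAUGGCUACGUUUUAAGG"))
```

Here the reading frame is set entirely by the position of the first AUG, which is why the leader nucleotides before it are never decoded.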

Cotranslational folding

The ribosome is known to actively participate in protein folding. The structures obtained in
this way are usually identical to the ones obtained during protein chemical refolding; however, the
pathways leading to the final product may be different. In some cases, the ribosome is crucial in
obtaining the functional protein form. For example, one of the possible mechanisms of folding of
the deeply knotted proteins relies on the ribosome pushing the chain through the attached loop.

Addition of translation-independent amino acids

Presence of a ribosome quality control protein Rqc2 is associated with mRNA-independent protein
elongation. This elongation is a result of ribosomal addition (via tRNAs brought by Rqc2) of CAT
tails: ribosomes extend the C-terminus of a stalled protein with random, translation-independent
sequences of alanines and threonines.

Figure: Function of RNA

Location of functional sites in the ribosome:


There are three operational or binding sites, A, P and E, named in order from the mRNA entry
site (conventionally the right-hand side).
Sites A and P span both ribosomal subunits, with a larger part residing in the large
subunit and a smaller part in the small subunit. Site E, the exit site, resides in the large
ribosomal subunit.
Figure: Ribosome with its functional sites

Table of the binding sites, positions and functions in a ribosome

Binding site | mRNA strand entry site | Biological term | Main process
Site A | 1st | Aminoacyl | Admission of codon of mRNA and ‘charged’ strand of tRNA. Checking and decoding and start of ‘handing over’ one amino acid molecule.
Site P | 2nd | Peptidyl | Peptide synthesis, consolidation, elongation and transfer of peptide chain to site A
Site E | 3rd | Exit to cytoplasm | Preparation of ‘uncharged’ tRNA for exit

Protein-tRNA interaction:

Figure: Ribosome-tRNA interaction


The small subunit, as seen in the image above, helps to hold the mRNA in place as the ribosome
translates it into protein. The larger subunit has various sites involved with different parts of the
protein synthesis process. When the tRNA first binds to the mRNA, the P site can bind to these
molecules. The P site is named after the polymerization, or construction of polymers, that occurs
there. Conformational changes occur in the proteins of the ribosome, which cause it to change
shape during the various steps of protein synthesis. As amino acids are added to the chain, tRNAs
move from the A site (where new amino acids with tRNAs enter) to the P site, and eventually to
the E site (not pictured), where they exit the ribosome without their amino acid. The rRNA that is
associated with the ribosome helps it attach to the tRNAs as they move through the ribosome and
has been found to help catalyze the formation of peptide bonds. This RNA is known as a ribozyme,
or RNA catalyst.

Genetics of ribosomal RNA:


Eukaryotes generally have many copies of the rRNA genes organized in tandem repeats. In
humans, approximately 300–400 repeats are present in five clusters, located on chromosomes 13
(RNR1), 14 (RNR2), 15 (RNR3), 21 (RNR4) and 22 (RNR5). Because each of these chromosomes
is present in two copies, a diploid human cell has 10 clusters of genomic rDNA, which in total
make up less than 0.5% of the human genome.
Mammalian cells have 2 mitochondrial (12S and 16S) rRNA molecules and 4 types of cytoplasmic
rRNA (the 28S, 5.8S, 18S, and 5S subunits). The 28S, 5.8S, and 18S rRNAs are encoded by a
single transcription unit (45S) separated by 2 internally transcribed spacers. The first spacer
corresponds to the one found in bacteria and archaea, and the other spacer is an insertion into what
was the 23S rRNA in prokaryotes. The 45S rDNA is organized into 5 clusters (each has 30–40
repeats) on chromosomes 13, 14, 15, 21, and 22. These are transcribed by RNA polymerase I. The
DNA for the 5S subunit occurs in tandem arrays (~200–300 true 5S genes and many dispersed
pseudogenes), the largest one on the chromosome 1q41-42. 5S rRNA is transcribed by RNA
polymerase III. The 18S rRNA in most eukaryotes is in the small ribosomal subunit, and the large
subunit contains three rRNA species (the 5S, 5.8S and 28S rRNAs in mammals; 25S instead of 28S
in plants).

RNA synthesis
The synthesis of RNA is performed by enzymes called RNA polymerases. In higher organisms
there are three main RNA polymerases, designated I, II, and III (or sometimes A, B, and C). Each
is a complex protein consisting of many subunits. RNA polymerase I synthesizes three of the four
types of rRNA (called 18S, 28S, and 5.8S RNA); therefore it is active in the nucleolus, where the
genes encoding these rRNA molecules reside. RNA polymerase II synthesizes mRNA, though its
initial products are not mature RNA but larger precursors, called heterogeneous nuclear RNA,
which are completed later (see below Processing of mRNA). The products of RNA polymerase III
include tRNA and the fourth RNA component of the ribosome, called 5S RNA. All three
polymerases start RNA synthesis at specific sites on DNA and proceed along the molecule, linking
selected nucleotides sequentially until they come to the end of the gene and terminate the growing
chain of RNA.

Figure: Eukaryotic Ribosomal genes


Energy for RNA synthesis comes from high-energy phosphate linkages contained in the nucleotide
precursors of RNA. Each unit of the final RNA product is essentially a sugar, a base, and one
phosphate, but the building material consists of a sugar, a base, and three phosphates. During
synthesis two phosphates are cleaved and discarded for each nucleotide that is incorporated into
RNA. The energy released from the phosphate bonds is used to link the nucleotides. The crucial
feature of RNA synthesis is that the sequence of nucleotides joined into a growing RNA chain is
specified by the sequence of nucleotides in the DNA template: each adenine in DNA specifies
uracil in RNA, each cytosine specifies guanine, each guanine specifies cytosine, and each thymine
in DNA specifies adenine. In this way the information encoded in each gene is transcribed into
RNA for translation by the protein-synthesizing machinery of the cytoplasm.
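
The base-pairing rule in the paragraph above can be written out as a small lookup table. The sketch below is illustrative only: the function name `transcribe` and the example template are assumptions, and the template strand is supplied in its 3'→5' orientation so that the RNA comes out reading 5'→3' from left to right.

```python
# Base-pairing rules stated in the text: DNA template base -> RNA base.
TEMPLATE_TO_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def transcribe(template_3_to_5):
    """Build the RNA specified by a DNA template strand.

    The template is given in its 3'->5' orientation, so the returned RNA
    reads 5'->3' in the same left-to-right order.
    """
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5)


# Hypothetical nine-base template.
print(transcribe("TACCGAATT"))  # AUGGCUUAA
```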

Regulation of RNA synthesis

The first level of regulation is mediated by variations in chromatin structure. In order to be
transcribed, a gene must be assembled into a structurally distinct form of active chromatin. A
second level of regulation is achieved by varying the frequency with which a gene in the active
conformation is transcribed into RNA by an RNA polymerase. There is evidence for regulation
of RNA synthesis at both these levels—for example, in response to hormone induction. At both
levels, protein factors are believed to perform the regulation—for example, by binding to special
promoter DNA regions flanking the transcribed gene.

Gene expression

Gene expression is the process by which information from a gene is used in the synthesis of a
functional gene product. These products are often proteins.

But in non-protein coding genes such as ribosomal RNA (rRNA), transfer RNA
(tRNA) or small nuclear RNA (snRNA) genes, the product is a functional RNA. The process of
gene expression is used by all known life.

Several steps in the gene expression process may be modulated, including:

• Transcription

• RNA processing

• Non-coding RNA maturation

• RNA export

• Translation

• Folding

• Post-translational modification of a protein.

Transcription

Transcription is the process by which the information in a strand of DNA is copied into a
new molecule of RNA. It is the first step of gene expression, in which a particular segment
of DNA is copied into RNA (especially mRNA) by the enzyme RNA polymerase. It results
in a complementary, antiparallel RNA strand called a primary transcript. The basic
mechanism of RNA synthesis by these eukaryotic RNA polymerases can be divided into
the following phases:

Initiation Phase

• During initiation, RNA polymerase recognizes a specific site on the DNA, upstream from
the gene that will be transcribed, called a promoter site, and then unwinds the DNA locally.

• Most promoter sites for RNA polymerase II include a highly conserved sequence located
about 25–35 bp upstream (i.e. to the 5′ side) of the start site, which has the consensus
TATA(A/T)A(A/T) and is called the TATA box.

• Since the start site is denoted as position +1, the TATA box is said to be located
at about position -25.

• The TATA box sequence resembles the -10 sequence in prokaryotes (TATAAT) except
that it is located further upstream.

• Both elements have essentially the same function, namely recognition by the RNA
polymerase in order to position the enzyme at the correct location to initiate transcription.

• The sequence around the TATA box is also important in that it influences the efficiency of
initiation. Transcription is also regulated by upstream control elements that lie 5′ to the
TATA box.

• Some eukaryotic protein-coding genes lack a TATA box and have an initiator element
instead, centered around the transcriptional initiation site.

• In order to initiate transcription, RNA polymerase II requires the assistance of several other
proteins or protein complexes, called general (or basal) transcription factors, which must
assemble into a complex on the promoter in order for RNA polymerase to bind and start
transcription.

• These all have the generic name of TFII (for Transcription Factor for RNA polymerase II).

• The first event in initiation is the binding of the transcription factor IID (TFIID) protein
complex to the TATA box via one of its subunits called TBP (TATA box binding protein).

• As soon as the TFIID complex has bound, TFIIA binds and stabilizes the TFIID-TATA
box interaction. Next, TFIIB binds to TFIID.

• However, TFIIB can also bind to RNA polymerase II and so acts as a bridging protein.

• Thus, RNA polymerase II, which has already complexed with TFIIF, now binds.

• This is followed by the binding of TFIIE and TFIIH. This final protein complex contains at
least 40 polypeptides and is called the transcription initiation complex.

• Those protein-coding genes that have an initiator element instead of a TATA box appear
to need another protein (or proteins) that binds to the initiator element.

• The other transcription factors then bind to form the transcription initiation complex in a
similar manner to that described above for genes possessing a TATA box promoter.
Figure: Initiation phase
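
The TATA box consensus quoted in the initiation steps, TATA(A/T)A(A/T), can be expressed directly as a regular expression. The following Python sketch is only an illustration of the position numbering (start site at +1, upstream positions negative); the function name `find_tata_boxes` and the toy sequence are assumptions, and real promoter prediction involves much more than a single motif match.

```python
import re

# Consensus from the text, TATA(A/T)A(A/T), written as a regular expression.
TATA_BOX = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(coding_strand, tss_index):
    """Return positions of TATA-box matches relative to the start site.

    coding_strand is the non-template strand read 5'->3'; tss_index is the
    0-based index of the +1 nucleotide. Upstream positions are negative
    (this numbering has no position 0).
    """
    upstream = coding_strand[:tss_index]
    return [m.start() - tss_index for m in TATA_BOX.finditer(upstream)]


# Toy sequence: the motif starts 30 nucleotides upstream of the start site.
sequence = "G" * 10 + "TATAAAA" + "C" * 23 + "ATGC"
print(find_tata_boxes(sequence, tss_index=40))  # [-30]
```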

Elongation Phase

TFIIH has two functions:

1. It is a helicase, which means that it can use ATP to unwind the DNA helix, allowing
transcription to begin.

2. In addition, it phosphorylates RNA polymerase II which causes this enzyme to change its
conformation and dissociate from other proteins in the initiation complex.

• The key phosphorylation occurs on a long C-terminal tail called the C-terminal domain
(CTD) of the RNA polymerase II molecule.

• Interestingly, only RNA polymerase II that has a non-phosphorylated CTD can initiate
transcription, but only an RNA polymerase II with a phosphorylated CTD can elongate
RNA.

• RNA polymerase II now starts moving along the DNA template, synthesizing RNA; that
is, the process enters the elongation phase.

• RNA synthesis occurs in the 5′ → 3′ direction, with the RNA polymerase catalyzing a
nucleophilic attack by the 3′-OH of the growing RNA chain on the alpha-phosphorus atom
of an incoming ribonucleoside 5′-triphosphate.

• The RNA molecule made from a protein-coding gene by RNA polymerase II is called a
primary transcript.

Termination Phase

Figure: Termination phase

• Elongation of the RNA chain continues until termination occurs.

• Unlike RNA polymerase in prokaryotes, RNA polymerase II does not terminate
transcription at a specific site; rather, transcription can stop at varying distances
downstream of the gene.

• Genes transcribed by RNA polymerase II lack any specific signals or sequences that
direct RNA polymerase II to terminate at specific locations.

• RNA polymerase II can continue to transcribe RNA anywhere from a few bp to thousands
of bp past the actual end of the gene.

• The transcript is cleaved at an internal site before RNA polymerase II finishes transcribing.
This releases the upstream portion of the transcript, which will serve as the initial RNA
prior to further processing (the pre-mRNA in the case of protein-encoding genes).

• This cleavage site is considered the “end” of the gene. The remainder of the transcript is
digested by a 5′-exonuclease (called Xrn2 in humans) while it is still being transcribed by
RNA polymerase II.

• When the 5′-exonuclease “catches up” to RNA polymerase II by digesting away all the
overhanging RNA, it helps disengage the polymerase from its DNA template strand, finally
terminating that round of transcription.

RNA processing

The primary eukaryotic mRNA transcript is much longer than the mature mRNA and is localised in
the nucleus, where it is also called heterogeneous nuclear RNA (hnRNA) or pre-mRNA.

It undergoes various processing steps to change into a mature RNA:

Cleavage

• Larger RNA precursors are cleaved to form smaller RNAs.

• Primary transcript is cleaved by ribonuclease-P (an RNA enzyme) to form 5-7 tRNA
precursors.

Capping and Tailing

• A cap (consisting of 7-methylguanosine, 7mG) is added at the 5′ end, and a poly(A) tail
is added at the 3′ end.

• The cap is a chemically modified molecule of guanosine triphosphate (GTP).


Splicing

• The eukaryotic primary mRNAs are made up of two types of segments: non-coding introns
and coding exons.

• The introns are removed by a process called RNA splicing, where ATP is used to cut the
RNA, releasing the introns and joining two adjacent exons to produce mature mRNA.

Nucleotide Modifications

• They are most common in tRNA: methylation (e.g., methyl cytosine, methyl guanosine),
deamination (e.g., inosine from adenosine), dihydrouracil, pseudouracil, etc.

Translation

For some RNA (non-coding RNA) the mature RNA is the final gene product. In the case of
messenger RNA (mRNA) the RNA is an information carrier coding for the synthesis of one or more
proteins. mRNA carrying a single protein sequence (common in eukaryotes) is monocistronic.

Figure: Ribosome translating messenger RNA into a chain of amino acids (protein).

During translation, a tRNA charged with an amino acid enters the ribosome and aligns with the correct
mRNA triplet. The ribosome then adds the amino acid to the growing protein chain.

Every mRNA consists of three parts: a 5′ untranslated region (5′UTR), a protein-coding region or open
reading frame (ORF), and a 3′ untranslated region (3′UTR). The coding region carries information for
protein synthesis encoded by the genetic code to form triplets. Each triplet of nucleotides of the coding
region is called a codon and corresponds to a binding site complementary to an anticodon triplet in
transfer RNA. Transfer RNAs with the same anticodon sequence always carry an identical type of
amino acid. Amino acids are then chained together by the ribosome according to the order of triplets
in the coding region. The ribosome helps transfer RNA to bind to messenger RNA and takes the amino
acid from each transfer RNA and makes a structure-less protein out of it. Each mRNA molecule is
translated into many protein molecules, on average ~2800 in mammals.
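
To make the three-part layout of an mRNA concrete, the sketch below splits a sequence into 5′UTR, open reading frame and 3′UTR. It assumes, as a simplification, that the ORF starts at the first AUG and ends at the first in-frame stop codon; the function name `split_mrna` and the example sequence are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna):
    """Split an mRNA into (5'UTR, ORF, 3'UTR) using the first AUG.

    The ORF returned includes the start and stop codons. Returns None if
    no AUG is found or no in-frame stop codon follows it.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    return None


# Hypothetical mRNA with two-nucleotide UTRs on either side of the ORF.
utr5, orf, utr3 = split_mrna("GGAUGGCUACGUUUUAAGG")
print(utr5, orf, utr3)
print(len(orf) // 3, "codons including start and stop")
```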

In prokaryotes translation generally occurs at the point of transcription (co-transcriptionally), often
using a messenger RNA that is still in the process of being created. In eukaryotes translation can
occur in a variety of regions of the cell depending on where the protein being written is supposed to
be. Major locations are the cytoplasm for soluble cytoplasmic proteins and the membrane of the
endoplasmic reticulum for proteins that are for export from the cell or insertion into a cell membrane.
Proteins that are supposed to be expressed at the endoplasmic reticulum are recognised part-way
through the translation process. This is governed by the signal recognition particle—a protein that
binds to the ribosome and directs it to the endoplasmic reticulum when it finds a signal peptide on the
growing (nascent) amino acid chain.

Folding

Each protein exists as an unfolded polypeptide or random coil when translated from a sequence of
mRNA into a linear chain of amino acids. This polypeptide lacks any developed three-dimensional
structure (the left hand side of the neighboring figure). The polypeptide then folds into its characteristic
and functional three-dimensional structure from a random coil. Amino acids interact with each other to
produce a well-defined three-dimensional structure, the folded protein (the right hand side of the figure)
known as the native state. The resulting three-dimensional structure is determined by the amino acid
sequence (Anfinsen's dogma).

The correct three-dimensional structure is essential to function, although some parts of functional
proteins may remain unfolded. Failure to fold into the intended shape usually produces inactive
proteins with different properties including toxic prions. Several neurodegenerative and other diseases
are believed to result from the accumulation of misfolded proteins. Many allergies are caused by the
incorrect folding of some proteins, because the immune system does not produce antibodies for
certain protein structures.

Enzymes called chaperones assist the newly formed protein to attain (fold into) the 3-dimensional
structure it needs to function. Similarly, RNA chaperones help RNAs attain their functional shapes.
Assisting protein folding is one of the main roles of the endoplasmic reticulum in eukaryotes.

Structures of Eukaryotic gene:

1. Exons

2. Introns

3. Promoter sequences

4. Terminator sequences

5. Upstream sequences

6. Downstream sequence

7. Enhancers and Silencers


Exons:

Exons are the name for the nucleotide sequences that remain in a mature mRNA. Introns are the
name for the regions that are removed (spliced out). The term 'exon' refers to both the DNA
sequence within a gene and to the corresponding sequence in mRNA (also known as transcripts).
During RNA splicing, introns are removed and exons are covalently joined to one another. Once
all introns are removed, the resulting mRNA or noncoding RNA gene-product is considered
mature.

Exon Structure

Exons are made up of stretches of DNA that will ultimately be translated into amino acids and
proteins. In the DNA of eukaryotic organisms, exons can be together in a continuous gene or
separated by introns in a discontinuous gene. When the gene is transcribed into pre-mRNA, the
transcript contains both introns and exons. The pre-mRNA is then processed and the introns are
spliced out of the molecule. Mature mRNAs can be a few hundred to several thousand nucleotides
long. The mature mRNA consists of exons and short untranslated regions (UTRs) on either end.
The exons make up the final reading frame which consists of nucleotides arranged in triplets. The
reading frame begins with a start codon (usually AUG) and ends in a termination codon. The
nucleotides are arranged in triplets as each amino acid is coded for by a three-nucleotide sequence.

Introns

RNA sequences between exons that are removed by splicing are known as introns. Introns have
been found in eukaryotic mRNA, tRNA and rRNA genes, as well as in chloroplast and mitochondrial
genes and in a phage of E. coli. Eubacteria are the only group in which introns have not been found.
In general, genes that are related by evolution have related organizations, with conservation of the
position of at least some introns. Furthermore, conservation of introns is also detected between
genes in related species. The number and size of introns vary greatly. The mammalian DHFR gene
has 6 exons that
total about 2000 bases, yet the gene is 31,000 bases. Likewise, the alpha-collagen has 50 exons
that range from 45-249 bases and the gene is about 40,000 bases. Clearly two genes of the same
size can have different number of introns, and introns that vary in size. Some species will have an
intron in a gene, but another species may not have an intron in the same gene. An example is the
cytochrome oxidase subunit II gene of plant mitochondria where some plant species have an intron
in this gene and others do not. Introns are a common eukaryotic event. Several features of
interrupted genes are:

1. The sequence order is the same as in the mRNA

2. The structure of an interrupted gene is identical in all tissues.


3. Introns of nuclear genes have termination codons in all three reading frames.

Features of Nuclear Splicing Junctions

1. No extensive homology exists between the ends of an intron.

2. The intron/exon junctions, though, do have well-conserved short sequences.

Figure: Splicing of Introns


Promoter Sequences
The promoter contains specific DNA sequences that are recognized by proteins known as
transcription factors. These factors bind to the promoter sequences, recruiting RNA polymerase,
the enzyme that synthesizes the RNA from the coding region of the gene. Promoter sequences are
DNA sequences that define where transcription of a gene by RNA polymerase begins. Promoter
sequences are typically located directly upstream or at the 5' end of the transcription initiation site.
RNA polymerase and accessory proteins (transcription factors) bind to the promoter to initiate
production of an mRNA transcript. The three eukaryotic RNA polymerases have different
structures and they transcribe different classes of genes.
Class II Promoters
Class II promoters can be considered as having two parts: the core promoter and the proximal
promoter. The core promoter attracts general transcription factors and RNA polymerase II at a
basal level and sets the transcription start site and direction of transcription. It consists of elements
lying within about 37 bp of the transcription start site. The core promoter is modular and can
contain almost any combination of the following elements.
Figure: Promoter sequence
The TATA box is centered at approximately position -28 (about -31 to -26) and has the consensus
sequence TATA(A/T)AA(G/A); the TFIIB recognition element (BRE) lies just upstream of the
TATA box and has the consensus sequence (G/C)(G/C)(G/A)CGCC; the initiator (Inr) is centered
on the transcription start site (position -2 to +4) and has the consensus sequence GCA(G/T)T(T/C)
in Drosophila, or PyPyAN(T/A)PyPy in mammals; the downstream promoter element (DPE) is
centered on position +30 (+28 to +32); the downstream core element (DCE) has three parts located
at approximately +6 to +12, +17 to +23, and +31 to +33, and these have the consensus sequences
CTTC, CTGT, and AGC, respectively; and the motif ten element (MTE) lies approximately between
positions +18 and +27. The proximal promoter helps attract general transcription factors and RNA
polymerase and includes promoter elements that can extend from about 37 bp up to 250 bp
upstream of the transcription start site. Proximal promoter elements are usually found upstream
of class II core promoters; elements of the proximal promoter are therefore also sometimes called
upstream promoter elements. They differ from the core promoter in that they bind to relatively
gene-specific transcription factors. For example, GC boxes bind the transcription factor Sp1,
while CCAAT boxes bind CTF (the CCAAT-binding transcription factor). The proximal promoter
elements, unlike the core promoter, can be orientation-independent, but they are relatively
position-dependent, unlike classical enhancers.
Class I Promoters
Class I promoters are not well conserved in sequence from one species to another, but the general
architecture of the promoter is well conserved. It consists of two elements, a core element
surrounding the transcription start site, and an upstream promoter element (UPE) about 100 bp
farther upstream. The core promoter covers the start site of transcription, from about -40 to about
+30. The promoter also contains an upstream control element located about 70 bp further 5',
extending from -170 to -110. The spacing between these two elements is important but the
promoter efficiency is more sensitive to deletions than to insertions between the two promoter
elements. RNA polymerase I then binds to a complex of DNA+UBF1+SL1 to initiate
transcription at the correct nucleotide and then elongates to make pre-rRNA.
Class III Promoters
RNA polymerase III transcribes a set of short genes. The classical class III genes (types I and II)
have promoters that lie wholly within the genes. The internal promoter of the type I class III gene
(the 5S rRNA gene) is split into three regions: box A, a short intermediate element, and box C.
The internal promoters of the type II genes (e.g., the tRNA genes) are split into two parts: box A
and box B. The promoters of the nonclassical (type III) class III genes resemble those of class II
genes. This promoter has internal control sequences. Deletion of 5' flanking DNA still permits
efficient transcription of (most) genes transcribed by RNA Pol III. Even the initial part of the gene
is expendable, as is the 3' end. Sequences internal to the gene (e.g. +55 to +80 in 5S rRNA genes)
are required for efficient initiation, in contrast to the familiar situation in bacteria, where most of
the promoter sequences are 5' to the gene. TFIIIA binds to the internal control region of genes that
encode 5S RNA (type 1 internal promoter). TFIIIC binds to internal control regions of genes for
5S RNA (alongside TFIIIA) and for tRNAs (type 2 internal promoters). The binding of TFIIIC
directs TFIIIB to bind to sequences (-40 to +11) that overlap the start site for transcription. One
subunit of TFIIIB is TBP, even though no TATA box is required for transcription. TFIIIA and
TFIIIC can now be removed without affecting the ability of RNA polymerase III to initiate
transcription. Thus TFIIIA and TFIIIC are assembly factors, and TFIIIB is the initiation factor.
RNA polymerase III binds to the complex of TFIIIB+DNA to accurately and efficiently initiate
transcription.
Terminator sequences
In genetics, a transcription terminator is a section of nucleic acid sequence that marks the end of a
gene or operon in genomic DNA during transcription. This sequence mediates transcriptional
termination by providing signals in the newly synthesized mRNA that trigger processes which
release the mRNA from the transcriptional complex. The three eukaryotic RNA polymerases
recognize different DNA sequences that lead to the end of transcription, known as termination.
1. Termination by RNA Pol II
a. No clear evidence for a discrete terminator for RNA polymerase II
b. 3' end of mRNA is generated by cleavage and polyadenylation
c. Signal for cleavage and polyadenylation:
AAUAAA, about 20 nt before the 3' end of the mRNA
Other sequences 3' to cleavage site
d. Cleavage enzyme not well characterized at this point; the U4 snRNP may play a role in cleavage.
A polyA polymerase has been identified.
e. Polyadenylation is required for termination by RNA Pol II; possibly also pausing by the RNA
polymerase
2. Termination by RNA Pol III:
Termination occurs at a run of 4-5 T's (on the non-template strand of DNA) surrounded by GC-
rich DNA
3. Termination by RNA Pol I:
Termination requires an 11 bp binding site for the protein Reb1p, which causes the polymerase to
pause, and a 46 bp segment located 5' to the Reb1p site, which may be required for release of the
polymerase.
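Two of the signals listed above are simple enough to search for directly. The following is a rough sketch, not a real terminator predictor: the function names are hypothetical, the distance is measured from the end of the AAUAAA hexamer to the 3' end, and the T-run search does not check that the flanking DNA is GC-rich.

```python
import re
from typing import Optional


def polya_signal_distance(mrna: str) -> Optional[int]:
    """Number of nt between the last AAUAAA and the 3' end, or None if no signal."""
    signal = "AAUAAA"
    pos = mrna.rfind(signal)
    if pos == -1:
        return None
    return len(mrna) - (pos + len(signal))


def pol3_t_runs(non_template_dna: str) -> list[int]:
    """0-based start positions of runs of four or more T's (non-template strand)."""
    return [m.start() for m in re.finditer(r"T{4,}", non_template_dna)]


# Made-up sequences for illustration only.
print(polya_signal_distance("GGCAUGCAAUAAA" + "C" * 20))
print(pol3_t_runs("GCGCTTTTTGCGC"))
```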

Enhancers
Enhancers are position-independent DNA elements. An enhancer is a short (50-1500 bp) region of DNA
that can be bound by proteins that stimulate the transcription of eukaryotic genes. These proteins
have several names: transcription factors, enhancer-binding proteins, or activators. They
appear to stimulate transcription by interacting with other proteins, called general transcription
factors, at the promoter. This interaction promotes formation of a preinitiation complex, which is
necessary for transcription. Thus, enhancers usually allow a gene to be induced (or sometimes
repressed) by activators. There are hundreds of thousands of enhancers in the human genome. For
example, an enhancer near the gene GADD45G has been described that may regulate brain growth in
chimpanzees and other mammals, but not in humans. The GADD45G regulator in mice and chimps
is active in regions of the brain where cells that form the cortex, ventral forebrain, and thalamus
are located and may suppress further neurogenesis.
Structure of a Eukaryotic Gene:

1. Exons

2. Introns

3. Promoter sequences

4. Terminator sequences

5. Upstream sequences

6. Downstream sequences

7. Enhancers and Silencers

Exons:

Exons are the nucleotide sequences that remain in a mature mRNA. Introns are the
regions that are removed (spliced out). The term 'exon' refers both to the DNA
sequence within a gene and to the corresponding sequence in the RNA transcript.
During RNA splicing, introns are removed and exons are covalently joined to one another. Once
all introns are removed, the resulting mRNA or noncoding RNA gene product is considered
mature.
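The joining of exons can be pictured as simple string slicing. This is a minimal sketch under stated assumptions: the function name `splice`, the toy transcript, and the exon coordinates are all hypothetical, and real splicing is carried out by the splicing machinery rather than by known coordinates.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments in order; everything between them (the introns) is dropped.

    Each exon is a (start, end) pair of 0-based, end-exclusive coordinates.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Toy transcript: exon 1, one intron (lowercase), exon 2.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUUAA"
exons = [(0, 6), (18, 24)]

mature = splice(pre_mrna, exons)
print(mature)  # AUGGCCUUUUAA
```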
Exon Structure

Exons are stretches of DNA whose sequence will ultimately be translated into the amino acids of a
protein. In the DNA of eukaryotic organisms, exons can lie together in a continuous gene or be
separated by introns in a discontinuous gene. When the gene is transcribed into pre-mRNA, the
transcript contains both introns and exons. The pre-mRNA is then processed and the introns are
spliced out of the molecule. Mature mRNAs can be a few hundred to several thousand nucleotides
long. The mature mRNA consists of exons and short untranslated regions (UTRs) on either end.
The exons make up the final reading frame, which consists of nucleotides arranged in triplets
because each amino acid is coded for by a three-nucleotide sequence. The reading frame begins
with a start codon (usually AUG) and ends with a termination codon.
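Reading a frame in triplets can be sketched in a few lines. The function name `reading_frame` and the example sequence are hypothetical, and the three stop codons (UAA, UAG, UGA) are those of the standard genetic code, which the text above does not list explicitly.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code


def reading_frame(mrna: str) -> list[str]:
    """Codons from the first AUG up to and including the first in-frame stop codon.

    If no in-frame stop codon is found, codons are returned up to the end of
    the sequence. An mRNA without AUG gives an empty list.
    """
    start = mrna.find("AUG")
    if start == -1:
        return []
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            break
    return codons


# Two leading and two trailing nucleotides stand in for the untranslated regions.
codons = reading_frame("GGAUGGCCUUUUAAGG")
print(codons)  # ['AUG', 'GCC', 'UUU', 'UAA']
```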

Introns

RNA sequences between exons that are removed by splicing are known as introns. Introns have
been found in eukaryotic mRNA, tRNA and rRNA genes, as well as in chloroplast and mitochondrial
genes and in a phage of E. coli. Introns are rare in eubacteria. In general,
genes that are related by evolution have related organizations, with conservation of the position
of at least some introns. Furthermore, conservation of introns is also detected between genes in
related species. The number and size of introns vary greatly. The mammalian DHFR gene has 6 exons
that total about 2000 bases, yet the gene is 31,000 bases long. Likewise, the alpha-collagen gene
has 50 exons that range from 45 to 249 bases, and the gene is about 40,000 bases long. Clearly, two
genes of the same size can have different numbers of introns, and introns that vary in size. Some
species have an intron in a gene where another species does not. An example is the
cytochrome oxidase subunit II gene of plant mitochondria, where some plant species have an intron
in this gene and others do not. Interrupted genes are common in eukaryotes. Several features of
interrupted genes are:

1. The sequence order is the same as in the mRNA.

2. The structure of an interrupted gene is identical in all tissues.

3. Introns of nuclear genes have termination codons in all three reading frames.

Features of Nuclear Splicing Junctions

1. No extensive homology exists between the ends of an intron.

2. The intron/exon junctions, though, do have well-conserved short sequences.
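The DHFR figures quoted earlier in this section (about 2000 bases of exon in a 31,000-base gene) show how much of an interrupted gene can be intron. A quick calculation, with the hypothetical helper `intron_fraction` and ignoring any distinction between introns and other non-exon sequence inside the gene:

```python
def intron_fraction(gene_length: int, total_exon_length: int) -> float:
    """Fraction of the gene that is not exon sequence."""
    return (gene_length - total_exon_length) / gene_length


dhfr = intron_fraction(gene_length=31_000, total_exon_length=2_000)
print(f"{dhfr:.1%}")  # 93.5%
```

The same calculation cannot be done for the alpha-collagen example, because only the range of exon sizes is given there, not their total.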


Figure: Splicing of Introns
Promoter Sequences
The promoter contains specific DNA sequences that are recognized by proteins known as
transcription factors. These factors bind to the promoter sequences, recruiting RNA polymerase,
the enzyme that synthesizes the RNA from the coding region of the gene. Promoter sequences are
DNA sequences that define where transcription of a gene by RNA polymerase begins. They
are typically located directly upstream of, or at the 5' end of, the transcription initiation site.
RNA polymerase and accessory proteins (transcription factors) bind to the promoter to initiate
production of an mRNA transcript. The three eukaryotic RNA polymerases have different
structures and they transcribe different classes of genes.
Class II Promoters
Class II promoters can be considered as having two parts: the core promoter and the proximal
promoter. The core promoter attracts general transcription factors and RNA polymerase II at a
basal level and sets the transcription start site and direction of transcription. It consists of elements
lying within about 37 bp of the transcription start site. The core promoter is modular and can
contain almost any combination of the following elements.

Figure: Promoter sequence


1. The TATA box is centered at approximately position -28 (about -31 to -26) and has the consensus
sequence TATA(A/T)AA(G/A).
2. The TFIIB recognition element (BRE) lies just upstream of the TATA box and has the consensus
sequence (G/C)(G/C)(G/A)CGCC.
3. The initiator (Inr) is centered on the transcription start site (position -2 to +4) and has the
consensus sequence GCA(G/T)T(T/C) in Drosophila, or PyPyAN(T/A)PyPy in mammals.
4. The downstream promoter element (DPE) is centered on position +30 (+28 to +32).
5. The downstream core element (DCE) has three parts located at approximately +6 to +12, +17 to
+23, and +31 to +33, and these have the consensus sequences CTTC, CTGT, and AGC, respectively.
6. The motif ten element (MTE) lies approximately between positions +18 and +27.

The proximal promoter helps attract general transcription factors and RNA polymerase
and includes promoter elements that can extend from about 37 bp up to 250 bp upstream of the
transcription start site. Proximal promoter elements are usually found upstream of class II core
promoters, so they are also sometimes called upstream promoter elements. They differ from
core promoter elements in that they bind relatively gene-specific
transcription factors. For example, GC boxes bind the transcription factor Sp1, while CCAAT
boxes bind CTF (the CCAAT-binding transcription factor). The proximal promoter elements,
unlike the core promoter, can be orientation-independent, but they are relatively
position-dependent, unlike classical enhancers.
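The consensus sequences given above for the TATA box, the BRE and the mammalian Inr translate directly into regular expressions. The sketch below is only an illustration: the names `CORE_ELEMENTS` and `find_elements` and the test sequence are made up, the reported positions are 0-based string indices rather than positions relative to the transcription start site, and a consensus match alone does not show that a sequence acts as a promoter.

```python
import re

# Consensus sequences from the text, written as character classes.
# Py (pyrimidine) = C or T; N = any base.
CORE_ELEMENTS = {
    "BRE": "[GC][GC][GA]CGCC",
    "TATA box": "TATA[AT]AA[GA]",
    "Inr (mammalian)": "[CT][CT]A[ACGT][AT][CT][CT]",
}


def find_elements(dna: str) -> dict[str, list[int]]:
    """Map each element name to the 0-based start positions of its matches."""
    hits = {}
    for name, pattern in CORE_ELEMENTS.items():
        # A lookahead is used so that overlapping matches are also reported.
        hits[name] = [m.start() for m in re.finditer(f"(?={pattern})", dna)]
    return hits


# Made-up sequence: BRE, TATA box, a 22 bp spacer, then an Inr.
promoter = "GGGCGCC" + "TATAAAAG" + "ACGTACGTACGTACGTACGTAC" + "CCAGTCC"
hits = find_elements(promoter)
print(hits)
```

In this toy sequence the TATA box starts 30 bp upstream of the start of the Inr match, roughly the spacing the text describes.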
Class I Promoters
Class I promoters are not well conserved in sequence from one species to another, but the general
architecture of the promoter is well conserved. It consists of two elements, a core element
surrounding the transcription start site, and an upstream promoter element (UPE) about 100 bp
farther upstream. The core promoter covers the start site of transcription, from about -40 to about
+30. The promoter also contains an upstream control element located about 70 bp further 5',
extending from -170 to -110. The spacing between these two elements is important, but
promoter efficiency is more sensitive to deletions than to insertions between the two promoter
elements. UBF1 binds to both the upstream control element and the core promoter, and SL1 binds
cooperatively with it. RNA polymerase I then binds to this complex of DNA+UBF1+SL1 to initiate
transcription at the correct nucleotide and then elongates to make pre-rRNA.
