
MBB 425
MOLECULAR GENETICS II
Overview of the Composition and Chemical
Structure of Nucleic Acids
THE FLOW OF GENETIC INFORMATION
A. The Central Dogma
DNA → (2. Transcription) → RNA → (3. Translation) → PROTEIN
DNA → (1. Replication) → DNA
1. Replication (DNA Synthesis)
2. Transcription (RNA Synthesis)
3. Translation (Protein Synthesis)
THE FLOW OF GENETIC INFORMATION
• Genetic diseases occur because of mutations in DNA.
• Many of these mutations affect the repair of other mutations that occur during DNA replication, or at other times, which in turn affect the flow of genetic information from DNA to RNA (transcription and processing) and from RNA to protein synthesis (translation).
• Many of these mutations also affect the structures of the resulting proteins, affecting their functions.
Central Dogma …
[Diagram: DNA → RNA → Protein]
Proposed by Francis Crick in 1958 to describe the flow of information in a cell.
Information stored in DNA is transferred residue-by-residue to RNA, which in turn transfers the information residue-by-residue to protein.
The Central Dogma was proposed by Crick to help scientists think about molecular biology. It has undergone numerous revisions in the past 45 years.
The Nobel Prize in Physiology or Medicine 1962

"…for their discoveries concerning the molecular structure of nucleic acids and its significance for information transfer in living material."
Central Dogma …
• The primary repository of genetic information is
typically DNA (it can be RNA in some organisms) which
must be replicated with fidelity and transmitted to
subsequent generations during reproduction to
propagate the species.

• DNA is copied or transcribed into RNA through a process known as transcription. The resulting transcript (if protein encoding) is subsequently decoded into protein by translation.
Central Dogma …

• Thus, RNA, specifically messenger RNA (mRNA), acts as an information intermediary that can be translated into protein. Proteins are among the major constituents of cells. The set of mRNAs expressed in a cell and their protein products largely determine the phenotype (type of cell) of a cell.
CENTRAL DOGMA THEORY
The structure of DNA as a double helix with base-pairing facilitates the replication and transcription of the information contained therein.

In the 1970s a modification to the central dogma was made when it was observed that RNA can be converted to DNA. In most organisms this is a rare event, whereas for some viruses such as HIV it is a necessary part of their life cycle.
B. Revised Central Dogma
DNA → (2. Transcription) → RNA → (3. Translation) → PROTEIN
RNA → (4. Reverse Transcription) → DNA
DNA → (1. Replication) → DNA

1. Replication (DNA Synthesis)
2. Transcription (RNA Synthesis)
3. Translation (Protein Synthesis)
4. Reverse Transcription (Synthesis of DNA from RNA)
THE CENTRAL DOGMA OF
MOLECULAR BIOLOGY
Structures of the DNA bases
[Figure: structures of the purines and the pyrimidines]
Be familiar with the structures of the purine bases, adenine (A) and guanine (G); and the pyrimidine bases, thymine (T) and cytosine (C).
DNA methylation

Directed DNA methylation on N6-adenine (6mA), N4-cytosine (4mC), and C5-cytosine (5mC) can potentially increase DNA coding capacity and regulate a variety of biological functions.

A common base modification in DNA results from the methylation of cytosine, giving rise to 5-methylcytosine (5mC). As we shall see subsequently, 5mC is highly mutagenic. It is believed that this methylation functions to regulate gene expression because 5mC residues are often clustered near the promoters of genes in so-called "CpG islands." (Along one strand of DNA the nucleotides are sometimes indicated by the base followed by a phosphate or "p", such as ApTpCpCpGpApCpTpGpGp; this sequence contains one CpG site.) The problem that arises from these methylations is that subsequent deamination of a 5mC results in the production of thymine, which is not foreign to DNA and is therefore not readily recognized as damage. As such, 5'-mCG-3' sites (or mCpG sites) are "hot-spots" for mutation, and when mutated are a common cause of cancer.
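The "p" notation and the CpG hot-spot idea can be illustrated with a short sketch. The function names below are hypothetical and chosen only for illustration; the code simply strips the "p" symbols, locates CpG sites on one strand, and shows the C-to-T change that deamination of a methylated cytosine would produce at such a site.

```python
def parse_p_notation(seq_p: str) -> str:
    """Convert 'ApTpCp...' notation into a plain base string such as 'ATC...'."""
    return seq_p.replace("p", "")


def cpg_sites(seq: str) -> list[int]:
    """Return the 0-based positions of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def deaminate_5mc(seq: str, position: int) -> str:
    """Model deamination of a 5-methylcytosine: the C at `position` becomes T."""
    if seq[position].upper() != "C":
        raise ValueError("position does not hold a cytosine")
    return seq[:position] + "T" + seq[position + 1:]


strand = parse_p_notation("ApTpCpCpGpApCpTpGpGp")
sites = cpg_sites(strand)
print(strand, sites)                      # plain sequence and its CpG positions
print(deaminate_5mc(strand, sites[0]))    # CpG converted to TpG
```

The example sequence is the one quoted above, so the single CpG site it reports matches the statement in the text.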
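Structures of the DNA Nucleotides and Nucleosides
[Figure: structures of a nucleoside and a nucleotide]
ii). Structure of the DNA polynucleotide chain; structure of the DNA double helix
[Figure: polynucleotide chain with the 5’ and 3’ ends labeled, alongside the double helix]
• polynucleotide chain
• 3’,5’-phosphodiester bond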
Strength of A-T and G-C Base pairs

[Figure: hydrogen bonding of the bases]

Chargaff’s rule: The content of A equals the content of T, and the content of G equals the content of C in double-stranded DNA from any species.
• Only 4 nucleotides (bases) make up all DNA, and each pairs with one partner:
  A (adenine) - T
  C (cytosine) - G
  T (thymine) - A
  G (guanine) - C
• DNA strands are antiparallel
• The two ends (5 prime and 3 prime) are not equivalent
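The pairing rules, the antiparallel arrangement, and Chargaff's rule can be checked together in a few lines. This is a minimal sketch with hypothetical helper names: it builds the complementary strand (written 5'→3', hence the reversal) and counts the bases in the resulting duplex.

```python
# Watson-Crick pairing partners for DNA.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand written 5'->3'.

    Because the strands are antiparallel, the partner is read in the
    opposite direction, so the input is reversed before complementing.
    """
    return "".join(PAIR[base] for base in reversed(strand.upper()))


def duplex_base_counts(strand: str) -> dict[str, int]:
    """Count A, T, G and C over both strands of the duplex."""
    duplex = strand.upper() + reverse_complement(strand)
    return {base: duplex.count(base) for base in "ATGC"}


top = "ATGCCG"
print(reverse_complement(top))
counts = duplex_base_counts(top)
print(counts)  # A equals T and G equals C, whatever the single-strand composition
```

Note that the equality holds for the duplex as a whole; a single strand on its own need not obey it.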
Double-stranded DNA
[Figure: "B" DNA double helix showing the major groove, the minor groove, and the two antiparallel strands (5’→3’ and 3’→5’)]
DNA is Double-Stranded
•Base-pairing between the complementary strands is
required for two important functions of DNA:
1) DNA replication involves an unwinding of the double
helix (right) followed by synthesis of a complementary
strand from each of the unpaired template strands,
and
2) DNA serves as a template for RNA synthesis by utilizing
the information in one strand to code for a
complementary RNA strand.
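As a small illustration of the second function, the sketch below copies a DNA template strand into its complementary RNA. It assumes the template is supplied in the 3'→5' direction, so the RNA comes out 5'→3' without reversal; the function name and the example sequence are hypothetical.

```python
# Base-pairing used during transcription: RNA carries U where DNA would carry T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') complementary to a DNA template given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


print(transcribe_template("TACAGA"))
```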
FORMS OF DNA
DNA in the "B" form has a major groove and a minor groove, and has 10 base pairs per turn of the double helix. Other forms include A, C, D, E and Z-DNA; for example, A-DNA has 11 bp per turn with a diameter of 2.3 nm, Z-DNA has 12 bp per turn, and C-DNA has 9.3 bp per turn.
DNA that is overwound or underwound, with fewer than or more than 10 base pairs per turn, respectively, is said to be "supercoiled".
Z-DNA is a left-handed double helical structure in which the helix winds to the left in a zigzag pattern, instead of to the right.
It should also be noted that the complementary strands in double helical DNA are antiparallel with respect to each other.
Each polynucleotide chain has a 5' end and a 3' end, and the two chains run in opposite directions.
FORMS OF DNA
Form  Direction of helix  Base pairs per turn  Distance between bp (Å)  Diameter of helix (Å)
A     Right-handed        11                   2.6                      23
B     Right-handed        10                   3.4                      19
C     Right-handed        9.3                  3.3                      19
Z     Left-handed         12                   3.7                      18
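One quantity that follows directly from the table is the length of one helical turn (the pitch). The sketch below assumes that "distance between bp" is the rise per base pair along the helix axis, so that pitch = base pairs per turn × rise; the dictionary and function names are illustrative only.

```python
# (base pairs per turn, rise per base pair in angstroms), taken from the table.
DNA_FORMS = {
    "A": (11, 2.6),
    "B": (10, 3.4),
    "C": (9.3, 3.3),
    "Z": (12, 3.7),
}


def helix_pitch(form: str) -> float:
    """Length of one full helical turn in angstroms for the given DNA form."""
    bp_per_turn, rise = DNA_FORMS[form]
    return bp_per_turn * rise


for name in DNA_FORMS:
    print(f"{name}-DNA pitch: {helix_pitch(name):.1f} Å")
```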
THE GENETIC CODE
The genetic code is the set of rules by which information encoded in genetic material (DNA or RNA sequences) is translated into proteins (amino acid sequences) by living cells.
It defines a mapping between tri-nucleotide sequences called codons and amino acids; each codon in a coding sequence specifies a single amino acid or, for three of the codons, a stop signal.
Properties of the Genetic Code
1. The code is a triplet code: Each codon consists of three successive nitrogenous bases.
2. The code is non-overlapping: A base in an mRNA is not used for different codons.
3. The code is commaless: After one amino acid is coded, the next amino acid is coded automatically by the next three letters.
4. The code is non-ambiguous: A particular codon will always code for the same amino acid.
5. The code has polarity: The code is always read in a fixed direction, i.e., in the 5′→3′ direction.
6. The code is degenerate: More than one codon may specify the same amino acid; this is called degeneracy of the code. E.g., UCU, UCC, UCA and UCG code for serine.
7. Some codons act as start codons: The AUG codon is the start or initiation codon, i.e., the polypeptide chain starts either with methionine (eukaryotes) or N-formylmethionine (prokaryotes). In rare cases, GUG also serves as the initiation codon, e.g., in bacterial protein synthesis. Normally, GUG codes for valine; GUG is used as the initiation codon only when the normal AUG codon is lost by deletion.
8. Some codons act as stop codons: The three codons UAG, UAA and UGA are the chain stop or termination codons.
9. The code is universal: The same genetic code is valid for nearly all organisms.
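Several of these properties (triplet, non-overlapping, commaless, read 5′→3′, start at AUG, stop at UAA/UAG/UGA, degeneracy) can be seen in a compact translation sketch. The table below is the standard genetic code written in one-letter amino acid symbols with "*" for stop; the function names are hypothetical, and the start-codon search is a simplification that takes the first AUG found.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5'->3') from the first AUG up to the first stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    # Step three bases at a time: non-overlapping and commaless reading.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def codons_for(amino_acid: str) -> list[str]:
    """List every codon that specifies the given amino acid (degeneracy)."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


print(translate("GGAUGUCUUCCUCAUCGUAAGG"))  # four different serine codons after AUG
print(codons_for("S"))
print(codons_for("*"))
```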
DNASES
Deoxyribonucleases (or DNases) are enzymes that cleave phosphodiester bonds in DNA.
Some are used for constructive purposes, such as proofreading during DNA replication, whereas others are used to degrade DNA.
There are two basic classes of DNases:
1. exonucleases and 2. endonucleases.
Exonucleases remove only the terminal nucleotide, whereas endonucleases cleave anywhere within the DNA double helix.
Chemistry of DNA
Forces affecting the stability of the DNA double helix
• hydrophobic interactions - stabilize
- hydrophobic inside and hydrophilic outside
• stacking interactions – stabilize
- relatively weak but additive van der Waals forces
• hydrogen bonding - stabilize
- relatively weak but additive and facilitates stacking
• electrostatic interactions - destabilize
- contributed primarily by the (negative) phosphates
- affect intrastrand and interstrand interactions
- repulsion can be neutralized with positive charges
(e.g., positively charged Na+ ions or proteins)
[Figure: stacking interactions between the bases and charge repulsion between the phosphate backbones]
Denaturation of DNA
[Figure: double-stranded DNA → strand separation and formation of single-stranded random coils]
• Extremes in pH or high temperature denature DNA
• A-T rich regions denature first
• Cooperative unwinding of the DNA strands
When the strands of DNA separate, the DNA is said to be denatured (when high temperature is used to denature DNA, the DNA is said to be melted).

Because some of the forces stabilizing the DNA double helix are contributed by base pairing interactions, and because A-T base pairs have only two hydrogen bonds in contrast to G-C base pairs which have three hydrogen bonds, regions of the DNA duplex that are A-T rich will denature first.
Electron Micrograph of Partially Melted DNA

[Figure: electron micrograph; the double-stranded, G-C rich DNA has not yet melted, while the A-T rich region of DNA has melted into a single-stranded bubble]
• A-T rich regions melt first, followed by G-C rich regions
HYPERCHROMICITY…
When a solution of double-stranded DNA is placed in a spectrophotometer cuvette and the absorbance of the DNA is determined across the electromagnetic spectrum, it characteristically shows an absorbance maximum at 260 nm (in the UV region of the spectrum).

If the same DNA solution is melted, the absorbance at 260 nm increases approximately 40%.

This property is termed "hyperchromicity": an increase in the absorbance (optical density) of a material. The hyperchromic shift is due to the fact that unstacked bases absorb more light than stacked bases.
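The size of the hyperchromic shift is a simple relative change in absorbance. A minimal sketch follows; the function name and the two absorbance readings are hypothetical, chosen so the result lands near the roughly 40% increase quoted above.

```python
def percent_hyperchromicity(a260_native: float, a260_melted: float) -> float:
    """Percent increase in absorbance at 260 nm when the duplex is melted."""
    if a260_native <= 0:
        raise ValueError("native absorbance must be positive")
    return (a260_melted - a260_native) / a260_native * 100.0


# Hypothetical readings for the same DNA solution before and after melting.
print(f"{percent_hyperchromicity(1.00, 1.40):.0f}% increase at 260 nm")
```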
Hyperchromicity

[Figure: absorbance versus wavelength (220 to 300 nm); the absorbance maximum for single-stranded DNA lies above the absorbance maximum for double-stranded DNA]

The absorbance at 260 nm of a DNA solution increases when the double helix is melted into single strands.
DNA melting curve

[Figure: melting curve plotting percent hyperchromicity (ticks at 50 and 100) against temperature (50 to 90 °C)]
• Tm is the temperature at the midpoint of the transition
DNA Melting Curve
• Hyperchromicity can be used to follow the denaturation of DNA as a function of increasing temperature. As the temperature of a DNA solution gradually rises above 50 °C, the A-T regions will melt first, giving rise to an increase in the UV absorbance. As the temperature increases further, more of the DNA will become single-stranded, further increasing the UV absorbance, until the DNA is fully denatured above 90 °C.
• The temperature at the mid-point of the melting curve is termed "melting temperature" and is abbreviated Tm. The Tm for a DNA depends on its average G+C content: the higher the G+C content, the higher the Tm.
• Note: G+C content, G-C content, and GC content are equivalent terms, that is, they are used interchangeably.
• The 2+4 Rule: Tm (°C) = 2 × (A+T) + 4 × (G+C), where (A+T) and (G+C) are the numbers of A or T bases and of G or C bases in the sequence. This rule of thumb gives an estimate for short oligonucleotides.
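The 2+4 rule is easy to apply by counting bases. The sketch below uses a hypothetical function name and made-up example oligonucleotides; it returns only the rule-of-thumb estimate in °C and makes no correction for salt or strand concentration.

```python
def tm_2_plus_4(oligo: str) -> int:
    """Estimate Tm in degrees C as 2 x (A+T) + 4 x (G+C) for a short oligonucleotide."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


# Same length, different G+C content: the G-C rich oligo gets the higher estimate.
for seq in ("AAAATTTT", "ATGCATGC", "GGGGCCCC"):
    print(seq, tm_2_plus_4(seq), "°C")
```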
Tm is dependent on the G-C content of the DNA
[Figure: percent hyperchromicity versus temperature (60 to 80 °C) for three DNAs; E. coli DNA is 50% G-C. Average G-C content can be determined from the Tm of DNA.]

This slide shows the dependence of Tm on the average G+C content of 3 different DNAs. Under the conditions used in this experiment, E. coli DNA, which has an average G+C content of about 50%, melted with a Tm of 69 °C. The curve on the left represents a DNA with a lower G+C content and the curve on the right represents a DNA with a higher G+C content. Tm is dependent on the ionic strength of the solution. At a fixed ionic strength there is a linear relation between Tm and G+C content.
DNA Renaturation…
• Denaturation of a DNA double helix produces single-stranded DNA.
• The reverse reaction (the formation of double-stranded DNA) can be carried out if the complementary DNA strands are incubated under conditions that will promote renaturation (reassociation).
• This process involves two steps.
• The first step is a slower, rate-limiting reaction in which the complementary strands attempt to find each other. The single strands randomly interact until complementary base pairing can occur.
• This may occur over only a short region of each strand, but may be enough to "nucleate" the reaction. Because the two reacting strands are at the same concentration in solution, the reaction is second-order, with the rate of the reaction being defined by the second-order rate constant, k2.
• The second step is fast: once nucleation has occurred, the complementary strands rapidly zipper up.
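The second-order nucleation step can be sketched numerically. For equal strand concentrations, integrating dC/dt = -k2·C² gives C/C0 = 1 / (1 + k2·C0·t), where C is the concentration of DNA still single-stranded and C0 its starting value. The function names, units, and example values below are illustrative assumptions, not figures from the slides.

```python
def fraction_single_stranded(c0: float, k2: float, t: float) -> float:
    """Fraction of DNA still single-stranded after time t (second-order kinetics).

    c0 : initial concentration of single-stranded DNA (e.g. mol nucleotide / L)
    k2 : second-order rate constant (L / (mol * s))
    t  : elapsed time (s)
    """
    return 1.0 / (1.0 + k2 * c0 * t)


def half_reassociation_time(c0: float, k2: float) -> float:
    """Time at which half of the DNA has reassociated: t_1/2 = 1 / (k2 * c0)."""
    return 1.0 / (k2 * c0)


# Hypothetical values chosen only to show the shape of the curve.
c0, k2 = 1e-4, 100.0
t_half = half_reassociation_time(c0, k2)
for t in (0.0, t_half, 10 * t_half):
    print(f"t = {t:8.1f} s  single-stranded fraction = {fraction_single_stranded(c0, k2, t):.3f}")
```

Doubling either k2 or the starting concentration halves the time needed to reach 50% reassociation, which is what makes the product of concentration and time a convenient axis for comparing DNAs.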
DNA reassociation (renaturation)

[Figure: denatured, single-stranded DNA → (k2: slower, rate-limiting, second-order process of finding complementary sequences to nucleate base-pairing) → (faster, zippering reaction to form long molecules of double-stranded DNA) → double-stranded DNA]
Polymerase Chain Reaction (PCR)
DNA RENATURATION...
Secondly, the complexity of a DNA is a function of how many base pairs it has. In other words, a low complexity DNA may consist of a few hundred or a few thousand base pairs, in contrast to a high complexity DNA that may contain millions of base pairs.

Low complexity sequences are able to find each other much faster, during a renaturation reaction, than are high complexity sequences, since the low complexity sequences have to make fewer collisions to find a base-pairing partner. Thus, the rate of a renaturation reaction is a function of the complexity of the DNA.

This means the complexity of a DNA can be determined by measuring its second-order reassociation rate constant.
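One way to turn this into a number is to compare against a reference DNA of known complexity. The sketch below assumes that, under identical conditions (same DNA concentration, salt, temperature, and fragment length), k2 is inversely proportional to complexity, so that N_sample = N_ref × k2_ref / k2_sample. The function name and all numerical values are hypothetical.

```python
def estimate_complexity(ref_complexity_bp: float, ref_k2: float, sample_k2: float) -> float:
    """Estimate sequence complexity (bp) from reassociation rate constants.

    Assumes k2 is inversely proportional to complexity when the reference
    and the sample are reassociated under identical conditions.
    """
    if ref_k2 <= 0 or sample_k2 <= 0:
        raise ValueError("rate constants must be positive")
    return ref_complexity_bp * ref_k2 / sample_k2


# Hypothetical reference: a 1,000 bp DNA with k2 = 10; the sample reassociates 1,000x slower.
print(f"{estimate_complexity(1_000, 10.0, 0.01):,.0f} bp")
```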
DNA REASSOCIATION
If a DNA molecule is melted and allowed to reassociate, the complexity of the genome dictates the rate at which duplex DNA will form.
The human genome consists of three populations of DNA: the fast and intermediate fractions make up about 10% and 15% of the genome, respectively, and the slow fraction makes up about 75% of the genome.

Most of the genes in the human genome are in the single-copy (slow) fraction. Repeated sequences can be of two types:
1. those that are interspersed throughout the genome, or
2. those that are tandemly repeated satellite DNAs.
DNA Re-association …
• Among the interspersed repetitive sequences are so-called "Alu" sequences, which are about 300 base pairs in length and are repeated about 300,000 times in the genome.
• They can be found adjacent to or within genes, and as illustrated later, their presence can sometimes lead to the occasional disruption of genes.
• The interspersed repetitive sequences also include VNTRs (variable number of tandem repeats), which are made up of short repeated sequences of only a few base-pairs, but of variable lengths.
• They, too, are interspersed throughout the genome, and are quite useful as landmarks for mapping genes because they are highly polymorphic (they differ in length or number of repeats from person to person).
Classes of Repetitive DNA

Interspersed (dispersed) repeats (e.g., Alu sequences)

GCTGAGG GCTGAGG GCTGAGG

Tandem repeats (e.g., microsatellites)

TTAGGGTTAGGGTTAGGGTTAGGG

Interspersed repeats are sequences that are repeated many times and scattered throughout the genome. In contrast, tandem repeats are sequences that are repeated many times adjacent to each other. The latter are usually found in the centromeres and telomeres of chromosomes.
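The difference between the two classes can be made concrete by asking how many copies of a motif sit directly next to each other. This is a minimal sketch with hypothetical function names; the interspersed example sequence is invented by placing the GCTGAGG motif between arbitrary spacer bases.

```python
import re


def motif_positions(seq: str, motif: str) -> list[int]:
    """Return the 0-based start position of every occurrence of the motif."""
    positions, start = [], seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def longest_tandem_run(seq: str, motif: str) -> int:
    """Largest number of back-to-back copies of the motif found in the sequence."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [match.group(0) for match in pattern.finditer(seq)]
    return max((len(run) // len(motif) for run in runs), default=0)


tandem = "TTAGGGTTAGGGTTAGGGTTAGGG"
interspersed = "GCTGAGGAAAGCTGAGGCCCGCTGAGG"  # hypothetical spacers between copies

print(longest_tandem_run(tandem, "TTAGGG"))         # copies are adjacent
print(longest_tandem_run(interspersed, "GCTGAGG"))  # copies are separated
print(motif_positions(interspersed, "GCTGAGG"))
```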
Biology of the RNA
RNA Structure
The major bases found in DNA and RNA

DNA        RNA
Adenine    Adenine
Cytosine   Cytosine
Guanine    Guanine
Thymine    Uracil (U)

[Figure: thymine-adenine base pair and uracil-adenine base pair]

• The DNA and RNA polynucleotide chains have similar structures, except for the presence of uridine in RNA (instead of thymidine) and for the presence of the 2' OH in the ribose sugar.
Secondary and Tertiary Structures of RNA
Secondary structure

Tertiary structure

RNAs frequently adopt secondary structure and tertiary structure in order to carry
out their functions. WHY?
Types And Functions Of RNA
Of the many types of RNA, the three most well-known and most commonly studied are 1. messenger RNA (mRNA), 2. transfer RNA (tRNA), and 3. ribosomal RNA (rRNA).
In protein synthesis, mRNA carries genetic codes from the DNA in the nucleus to ribosomes, the sites of protein translation in the cytoplasm. Ribosomes are composed of rRNA and protein. The ribosomal subunits are assembled from rRNA and ribosomal proteins in the nucleolus. Once fully assembled, they move to the cytoplasm, where, as key regulators of translation, they "read" the code carried by mRNA. A sequence of three nitrogenous bases in mRNA specifies incorporation of a specific amino acid in the sequence that makes up the protein. Molecules of tRNA (sometimes also called soluble, or activator, RNA), which contain fewer than 100 nucleotides, bring the specified amino acids to the ribosomes, where they are linked to form proteins.
Types And Functions Of RNA

In addition to mRNA, tRNA, and rRNA, RNAs can be broadly divided into coding (cRNA) and noncoding RNA (ncRNA). There are two types of ncRNAs:
1. Housekeeping ncRNAs (tRNA and rRNA)
2. Regulatory ncRNAs, which are further classified according to their size.
   a) Long ncRNAs (lncRNA) have at least 200 nucleotides.
   b) Small ncRNAs have fewer than 200 nucleotides.
      i) microRNA (miRNA)
      ii) small nucleolar RNA (snoRNA)
      iii) small nuclear RNA (snRNA)
      iv) small-interfering RNA (siRNA)
      v) PIWI-interacting RNA (piRNA)
Biogenesis of miRNA

(Asgari 2011)
Circular RNA (circRNA)
Circular RNA (circRNA) differs from other RNA types because its 5′ and 3′ ends are bonded together, creating a loop. The circRNAs are generated from many protein-encoding genes, and some can serve as templates for protein synthesis, similar to mRNA. They can also bind miRNA, acting as "sponges" that prevent miRNA molecules from binding to their targets. In addition, circRNAs play an important role in regulating the transcription and alternative splicing of the genes from which they were derived.

circRNA splicing and isoform diversity


Barrett and Salzman (2016)
Gene Structure
[Figure: gene structure showing the promoter region, the +1 position, exons (filled and unfilled boxed regions), introns (between exons), and the transcribed region; below it, the mRNA structure (5’ to 3’) with the translated region marked]
GENE STRUCTURE …
• RNA processing (we shall see this later) then removes the intron sequences, "splicing" together the exon sequences to produce the mature mRNA.
• The translated region of the mRNA (the region that encodes the protein) is indicated in blue.
• Note: There are untranslated regions at the 5' and 3' ends of mRNAs that are encoded by exon sequences but are not directly translated.
GENE STRUCTURE …
• The next slide shows examples of the wide variety of gene structures seen in the human genome. Some (very few) genes do not have introns. One example is the histone genes, which encode the small DNA-binding proteins, histones H1, H2A, H2B, H3, and H4. Shown here is a histone gene that is only 400 base pairs (bp) in length and is composed of only one exon.
• The beta-globin gene has three exons and two introns.
• The hypoxanthine-guanine phosphoribosyl transferase (HGPRT or HPRT) gene has nine exons and is over 100 times larger than the histone gene, yet has an mRNA that is only about 3 times larger than the histone mRNA (total exon length is 1,263 bp).
• This is due to the fact that introns can be very long, while exons are usually relatively short. An extreme example of this is the factor VIII gene, which has numerous exons (the blue boxes and blue vertical lines).
The (exon-intron-exon) structure of various genes
Gene           Total length    Total exon length
histone        400 bp          400 bp
β-globin       1,660 bp        990 bp
HGPRT (HPRT)   42,830 bp       1,263 bp
factor VIII    ~186,000 bp     ~9,000 bp
HGPRT (HPRT) = hypoxanthine-guanine phosphoribosyl transferase
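The gene and exon lengths listed for these four genes make it easy to check the size comparisons quoted earlier. The sketch below uses only those figures (the factor VIII values are the approximate ones given); the dictionary and function names are illustrative.

```python
# (total gene length in bp, total exon length in bp) as listed for each gene.
GENES = {
    "histone": (400, 400),
    "beta-globin": (1_660, 990),
    "HGPRT": (42_830, 1_263),
    "factor VIII": (186_000, 9_000),  # approximate values
}


def exon_fraction(gene: str) -> float:
    """Fraction of the gene length that is exon sequence."""
    total, exons = GENES[gene]
    return exons / total


def fold_difference(gene: str, reference: str, index: int) -> float:
    """Ratio of gene to reference for total length (index 0) or exon length (index 1)."""
    return GENES[gene][index] / GENES[reference][index]


for name in GENES:
    print(f"{name:12s} exons = {exon_fraction(name):6.1%} of the gene")

print(f"HGPRT gene is {fold_difference('HGPRT', 'histone', 0):.0f}x the histone gene")
print(f"HGPRT exons are {fold_difference('HGPRT', 'histone', 1):.1f}x the histone exons")
```

The output makes the point of the slide directly: the genes differ far more in total length than in exon length, because most of the extra length is intron.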


Properties of the human genome

Nuclear genome
• the haploid human genome has ~3 × 10^9 bp of DNA
• single-copy DNA comprises ~75% of the human genome
• the human genome contains ~30,000 to 40,000 genes
• most genes are single-copy in the haploid genome
• genes are composed of 1 to >75 exons
• genes vary in length from <100 to >2,300,000 bp
• Alu sequences are present throughout the genome
Mitochondrial genome
• circular genome of ~17,000 bp
• contains <40 genes
Principles of Anatomy and Physiology (2006). John Wiley & Sons
TELOMERES AND AGING
• The chromosome contains a single, long molecule of double-stranded DNA, and thus has two ends.
• These ends create two problems:
  - they are difficult to replicate, and
  - they have a tendency to fuse with other chromosome ends, causing karyotypic rearrangements.
• To prevent these problems, chromosomes have protective ends called "telomeres" that are composed of tandemly repeated, 5-8 bp sequences up to 12 kb in length.
TELOMERES AND AGING
In germline cells and in the cells of young individuals, telomeres are of maximal length, but with every round of somatic cell division telomeres get a little shorter.

After many rounds of replication and cell division, telomeres become too short to protect the chromosome ends from fusing with other chromosomes. At this stage, cells are said to be "senescent."

Telomere length is maintained in germline cells by an enzyme called "telomerase," which can restore any shortening that has occurred. Tumor cells also have telomerase and thus are immortal and can grow indefinitely.
Telomeres and aging
Telomeres are protective "caps" on chromosome ends consisting of short 5-8 bp tandemly repeated GC-rich DNA sequences that prevent chromosomes from fusing and causing karyotypic rearrangements.
[Figure: metaphase chromosome (telomere, centromere, telomere); telomere structure, <1 to >12 kb: (TTAGGG)many in young cells, (TTAGGG)few in senescent cells]
• telomerase (an enzyme) is required to maintain telomere length in germline cells
• most differentiated somatic cells have decreased levels of telomerase and therefore their chromosomes shorten with each cell division
