
DNA Structure and Chemistry

a). Evidence that DNA is the genetic information


i). DNA transformation – know this term
ii). Transgenic experiments – know this process
iii). Mutation alters phenotype – be able to define
genotype and phenotype
b). Structure of DNA
i). Structure of the bases, nucleosides, and nucleotides
ii). Structure of the DNA double helix
iii). Complementarity of the DNA strands
c). Chemistry of DNA
i). Forces contributing to the stability of the double helix
ii). Denaturation of DNA
THE FLOW OF GENETIC INFORMATION

DNA → DNA (1);  DNA → RNA (2);  RNA → PROTEIN (3)

1. REPLICATION (DNA SYNTHESIS)
2. TRANSCRIPTION (RNA SYNTHESIS)
3. TRANSLATION (PROTEIN SYNTHESIS)
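
The three steps can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are invented here, the codon table is the standard genetic code, and translation simply starts at the first base rather than searching for a start codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order ('*' = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def replicate(strand):
    """Step 1: return the complementary strand, written 5'->3'."""
    return strand.translate(COMPLEMENT)[::-1]


def transcribe(coding_strand):
    """Step 2: the mRNA matches the coding strand, with U in place of T."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """Step 3: read codons from the first base until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


gene = "ATGGCCTAA"  # invented example sequence (coding strand, 5'->3')
print(replicate(gene), transcribe(gene), translate(transcribe(gene)))
```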
Structures of the bases

Purines: Adenine (A), Guanine (G)

Pyrimidines: Thymine (T), Cytosine (C), 5-Methylcytosine (5mC)


Nucleoside

[structure of deoxyadenosine]

Nucleotide
Nomenclature

Base              Nucleoside          Nucleotide
                  (+deoxyribose)      (+phosphate)

Purines
adenine           adenosine
guanine           guanosine
hypoxanthine      inosine

Pyrimidines
thymine           thymidine
cytosine          cytidine
                  (+ribose)
uracil            uridine
ii). Structure of the DNA double helix

Structure of the DNA polynucleotide chain

5’

3’

• polynucleotide chain
• 3’,5’-phosphodiester bond
A-T base pair

Hydrogen bonding of the bases

G-C base pair

Chargaff’s rule: The content of A equals the content of T, and the content of G equals the content of C in double-stranded DNA from any species
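
Chargaff's rule follows directly from base pairing, which a short check can make concrete. The sketch below builds the complementary strand for an arbitrary (invented) sequence and counts the bases in the resulting duplex; the function name is illustrative.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def duplex_base_counts(strand):
    """Count bases over both strands of the duplex formed by `strand`."""
    partner = strand.translate(COMPLEMENT)[::-1]  # complementary strand, 5'->3'
    return Counter(strand + partner)


single = "ATGCGGATTTAC"  # invented example; A, T, G, C are unequal in one strand
counts = duplex_base_counts(single)
print(Counter(single))  # single strand: no requirement that A = T or G = C
print(counts)           # duplex: A = T and G = C
```

Note that the rule holds for the duplex as a whole, not for either strand on its own.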
Double-stranded DNA (“B” DNA)

[Figure: double helix with antiparallel strands (5’ to 3’ and 3’ to 5’), showing the major groove and the minor groove]
Chemistry of DNA

Forces affecting the stability of the DNA double helix


• hydrophobic interactions - stabilize
- hydrophobic inside and hydrophilic outside
• stacking interactions - stabilize
- relatively weak but additive van der Waals forces
• hydrogen bonding - stabilize
- relatively weak but additive and facilitates stacking
• electrostatic interactions - destabilize
- contributed primarily by the (negative) phosphates
- affect intrastrand and interstrand interactions
- repulsion can be neutralized with positive charges
(e.g., positively charged Na+ ions or proteins)
[Figure: model of double-stranded DNA showing three base pairs, with stacking interactions and charge repulsion labeled]
Denaturation of DNA

Double-stranded DNA
  → extremes in pH or high temperature: A-T rich regions denature first
  → cooperative unwinding of the DNA strands
  → strand separation and formation of single-stranded random coils
Electron micrograph of partially melted DNA

• Double-stranded, G-C rich DNA has not yet melted
• A-T rich region of DNA has melted into a single-stranded bubble
• A-T rich regions melt first, followed by G-C rich regions
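
One simple way to look for regions that would be expected to melt first is to scan a sequence for A-T rich windows. The sketch below does this with a sliding window; the window size, threshold, function names, and example sequence are all arbitrary choices made for illustration, not values from the lecture.

```python
def at_fraction(seq):
    """Fraction of bases in `seq` that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq, window=10, threshold=0.7):
    """Return (start, AT fraction) for each window at or above `threshold`."""
    hits = []
    for start in range(len(seq) - window + 1):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, frac))
    return hits


# Invented sequence: an A-T rich stretch flanked by G-C rich DNA.
seq = "GCGCGGCCGC" + "ATATTAAATT" + "GGCCGCGCGG"
for start, frac in at_rich_windows(seq):
    print(start, seq[start:start + 10], frac)
```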


Hyperchromicity

[Figure: absorbance vs. wavelength (220 to 300 nm), marking the absorbance maximum for single-stranded DNA and the absorbance maximum for double-stranded DNA]

The absorbance at 260 nm of a DNA solution increases when the double helix is melted into single strands.
DNA melting curve

[Figure: percent hyperchromicity (0 to 100) vs. temperature (50 to 90 °C)]

• Tm is the temperature at the midpoint of the transition
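
Given a measured melting curve, Tm can be read off as the temperature at 50% hyperchromicity. The sketch below does this by linear interpolation between data points; the function name and the data are made up for illustration, and the curve is assumed to rise monotonically with temperature.

```python
def melting_temperature(temps, hyperchromicity):
    """Interpolate the temperature at 50% hyperchromicity.

    Assumes `temps` is ascending and the curve rises through 50%.
    """
    points = list(zip(temps, hyperchromicity))
    for (t0, h0), (t1, h1) in zip(points, points[1:]):
        if h0 <= 50 <= h1 and h1 > h0:
            return t0 + (50 - h0) * (t1 - t0) / (h1 - h0)
    raise ValueError("curve does not cross 50% hyperchromicity")


# Made-up melting curve (temperature in deg C, percent hyperchromicity).
temps = [50, 60, 65, 70, 75, 80, 90]
percent = [0, 2, 10, 40, 90, 98, 100]
print(melting_temperature(temps, percent))
```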


Tm is dependent on the G-C content of the DNA

[Figure: melting curves, percent hyperchromicity vs. temperature (60 to 80 °C); E. coli DNA is 50% G-C]

Average base composition (G-C content) can be determined from the melting temperature of DNA
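
In practice this is done with a calibration: DNAs of known G-C content are melted under the same conditions, and an unknown is read off the fitted line. The sketch below assumes the relationship is linear over the range of interest; the calibration points are hypothetical numbers chosen for illustration, not measurements, and the function names are invented.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gc_percent_from_tm(tm, slope, intercept):
    """Invert the calibration line Tm = slope * %GC + intercept."""
    return (tm - intercept) / slope


# Hypothetical standards: (% G-C, Tm in deg C) under one fixed buffer condition.
standards_gc = [30.0, 50.0, 70.0]
standards_tm = [62.0, 70.0, 78.0]

slope, intercept = fit_line(standards_gc, standards_tm)
print(gc_percent_from_tm(74.0, slope, intercept))
```

Because Tm also depends on salt concentration and pH, a calibration is only valid for the conditions under which it was measured.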
Genomic DNA, Genes, Chromatin

a). Complexity of chromosomal DNA


i). DNA reassociation
ii). Repetitive DNA and Alu sequences
iii). Genome size and complexity of genomic DNA
b). Gene structure
i). Introns and exons
ii). Properties of the human genome
iii). Mutations caused by Alu sequences
c). Chromosome structure - packaging of genomic DNA
i). Nucleosomes
ii). Histones
iii). Nucleofilament structure
iv). Telomeres, aging, and cancer
DNA reassociation (renaturation)

Denatured, single-stranded DNA
  → (k2) Slower, rate-limiting, second-order process of finding complementary sequences to nucleate base-pairing
  → Faster, zippering reaction to form long molecules of double-stranded DNA
Double-stranded DNA
DNA reassociation kinetics for human genomic DNA

Cot1/2 = 1 / k2

k2 = second-order rate constant
Co = DNA concentration (initial)
t1/2 = time for half reaction of each component or fraction
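
The relation Cot1/2 = 1 / k2 comes from the integrated second-order rate law, C/Co = 1 / (1 + k2 Co t), where C is the concentration of DNA still single-stranded. A minimal sketch, with illustrative function names and an arbitrary k2:

```python
def fraction_reassociated(cot, k2):
    """Fraction of one component reassociated at a given Cot.

    Integrated second-order rate law: C/Co = 1 / (1 + k2 * Cot),
    so the reassociated fraction is 1 - C/Co.
    """
    return 1.0 - 1.0 / (1.0 + k2 * cot)


def cot_half(k2):
    """Cot at which half of the component has reassociated."""
    return 1.0 / k2


k2 = 0.01  # arbitrary rate constant, for illustration only
print(cot_half(k2), fraction_reassociated(cot_half(k2), k2))
```

A component with a high k2 (many copies of a short sequence) therefore reaches half-reassociation at a low Cot, and a single-copy component at a high Cot.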
[Figure: Cot curve, % DNA reassociated (0 to 100) vs. log Cot, showing three kinetic fractions, each with its own Cot1/2: fast (repeated), intermediate (repeated), and slow (single-copy)]

high k2: 10^6 copies per genome of a “low complexity” sequence of e.g. 300 base pairs
low k2: 1 copy per genome of a “high complexity” sequence of e.g. 300 x 10^6 base pairs
Type of DNA             % of Genome    Features

Single-copy (unique)    ~75%           Includes most genes (1)

Repetitive
  Interspersed          ~15%           Interspersed throughout genome between and within genes; includes Alu sequences (2) and VNTRs or mini (micro) satellites
  Satellite (tandem)    ~10%           Highly repeated, low complexity sequences usually located in centromeres and telomeres

[Figure: Cot curve with kinetic fractions labeled: fast ~10%, intermediate ~15%, slow (single-copy) ~75%]

(1) Some genes are repeated a few times to thousands-fold and thus would be in the repetitive DNA fraction.
(2) Alu sequences are about 300 bp in length and are repeated about 300,000 times in the genome. They can be found adjacent to or within genes in introns or nontranslated regions.
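
A three-step Cot curve of this kind can be approximated as a weighted sum of independent second-order components, each following fraction = k2 Cot / (1 + k2 Cot) with k2 = 1 / Cot1/2. In the sketch below the genome fractions are the ones given in the table, but the Cot1/2 values are hypothetical placeholders chosen only to separate the three steps; the names are illustrative.

```python
# (name, fraction of genome, Cot1/2). Cot1/2 values are hypothetical.
COMPONENTS = [
    ("fast", 0.10, 1e-3),
    ("intermediate", 0.15, 1.0),
    ("slow (single-copy)", 0.75, 1e3),
]


def total_fraction_reassociated(cot, components=COMPONENTS):
    """Sum the second-order reassociation of each kinetic fraction."""
    total = 0.0
    for _name, fraction, cot_half in components:
        k2 = 1.0 / cot_half
        total += fraction * (k2 * cot) / (1.0 + k2 * cot)
    return total


for exponent in range(-5, 6):
    cot = 10.0 ** exponent
    print(f"log Cot = {exponent:+d}: {100 * total_fraction_reassociated(cot):5.1f}% reassociated")
```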
Classes of repetitive DNA

Interspersed (dispersed) repeats (e.g., Alu sequences)

GCTGAGG GCTGAGG GCTGAGG

Tandem repeats (e.g., microsatellites)

TTAGGGTTAGGGTTAGGGTTAGGG
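
The difference between the two classes can be expressed computationally: a tandem repeat shows up as a long uninterrupted run of a motif, whereas an interspersed repeat occurs many times but never back to back. A small sketch using a regular expression follows; the function name and the spacer sequences in the interspersed example are invented.

```python
import re


def longest_tandem_run(seq, motif):
    """Number of motif copies in the longest uninterrupted tandem run."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq)
    return max((len(run) // len(motif) for run in runs), default=0)


tandem = "TTAGGGTTAGGGTTAGGGTTAGGG"
interspersed = "GCTGAGG" + "ACGTAC" + "GCTGAGG" + "TTCAG" + "GCTGAGG"

print(longest_tandem_run(tandem, "TTAGGG"))         # copies in one run
print(interspersed.count("GCTGAGG"),                # total copies
      longest_tandem_run(interspersed, "GCTGAGG"))  # longest run
```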
Genome sizes in nucleotide pairs (base-pairs)

[Figure: ranges of genome size on a log scale from 10^4 to 10^11 bp for plasmids, viruses, bacteria, fungi, plants, algae, insects, mollusks, bony fish, amphibians, reptiles, birds, and mammals]

The size of the human genome is ~3 x 10^9 bp; almost all of its complexity is in single-copy DNA.

The human genome is thought to contain ~30,000 to 40,000 genes.


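Gene structure

[Figure: gene showing the promoter region, the +1 position, and the transcribed region made up of exons (filled and unfilled boxed regions) and introns (between exons)]

mRNA structure

[Figure: mRNA, 5’ to 3’, with the translated region marked]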
The (exon-intron-exon)n structure of various genes

histone:        total = 400 bp; exon = 400 bp
β-globin:       total = 1,660 bp; exons = 990 bp
HGPRT (HPRT):   total = 42,830 bp; exons = 1,263 bp
factor VIII:    total = ~186,000 bp; exons = ~9,000 bp
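
The figures above make the point that exons can be a small fraction of a gene. A quick calculation of the exon percentage and intron length for each gene, using only the numbers listed (the factor VIII values are approximate in the source; the variable and function names are illustrative):

```python
# gene: (total length in bp, exon length in bp), as listed above
GENES = {
    "histone": (400, 400),
    "beta-globin": (1660, 990),
    "HGPRT (HPRT)": (42830, 1263),
    "factor VIII": (186000, 9000),  # approximate values
}


def exon_percent(total_bp, exon_bp):
    """Percentage of the gene's length that is exon sequence."""
    return 100.0 * exon_bp / total_bp


for name, (total, exons) in GENES.items():
    print(f"{name}: {exon_percent(total, exons):.1f}% exon, {total - exons} bp intron")
```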


Properties of the human genome

Nuclear genome

• the haploid human genome has ~3 x 10^9 bp of DNA
• single-copy DNA comprises ~75% of the human genome
• the human genome contains ~30,000 to 40,000 genes
• most genes are single-copy in the haploid genome
• genes are composed of from 1 to >75 exons
• genes vary in length from <100 to >2,300,000 bp
• Alu sequences are present throughout the genome

Mitochondrial genome

• circular genome of ~17,000 bp


• contains <40 genes
Alu sequences can be “mutagenic”
Familial hypercholesterolemia
• autosomal dominant
• LDL receptor deficiency

From Nussbaum, R.L. et al. "Thompson & Thompson Genetics in Medicine," 6th edition (Revised Reprint), Saunders, 2004.
LDL receptor gene

[Figure: exons 4, 5, and 6 of the LDL receptor gene, with Alu repeats present within the introns. Unequal crossing over (X) between Alu repeats on the two copies gives one product with a deleted exon 5 (4 - Alu - 6); the other product is not shown.]
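
The outcome of unequal crossing over can be worked through with a toy model in which each chromosome is a list of elements. The layout exon 4 - Alu - exon 5 - Alu - exon 6 is an assumption read from the figure labels, and the function name is invented; the sketch only illustrates the logic of the exchange.

```python
def unequal_crossover(chrom_a, chrom_b, index_a, index_b):
    """Exchange within chrom_a[index_a] and chrom_b[index_b]; return both products."""
    if chrom_a[index_a] != chrom_b[index_b]:
        raise ValueError("crossover requires homologous elements")
    product_1 = chrom_a[:index_a + 1] + chrom_b[index_b + 1:]
    product_2 = chrom_b[:index_b + 1] + chrom_a[index_a + 1:]
    return product_1, product_2


allele = ["exon 4", "Alu", "exon 5", "Alu", "exon 6"]

# Misalignment: the first Alu of one copy pairs with the second Alu of the other.
deletion, duplication = unequal_crossover(allele, allele, 1, 3)
print(deletion)     # product lacking exon 5
print(duplication)  # reciprocal product carrying exon 5 twice
```

The reciprocal product, which the slide does not show, carries a duplication of exon 5.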
Chromatin structure

EM of chromatin shows presence of nucleosomes as “beads on a string”
Nucleosome structure

Nucleosome core (left)
• 146 bp DNA; 1 3/4 turns of DNA
• DNA is negatively supercoiled
• two each: H2A, H2B, H3, H4 (histone octamer)
Nucleosome (right)
• ~200 bp DNA; 2 turns of DNA plus spacer
• also includes H1 histone
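
These numbers give a rough sense of scale. Assuming the whole haploid genome (~3 x 10^9 bp, as stated earlier in these slides) were packaged at one nucleosome per ~200 bp, a back-of-the-envelope estimate looks like this (variable names are illustrative):

```python
HAPLOID_GENOME_BP = 3_000_000_000  # ~3 x 10^9 bp, from the slides
BP_PER_NUCLEOSOME = 200            # ~200 bp including spacer
CORE_BP = 146                      # DNA wrapped around the histone octamer
CORE_HISTONES_PER_NUCLEOSOME = 8   # two each of H2A, H2B, H3, H4

nucleosomes = HAPLOID_GENOME_BP // BP_PER_NUCLEOSOME
core_histones = nucleosomes * CORE_HISTONES_PER_NUCLEOSOME
core_fraction = CORE_BP / BP_PER_NUCLEOSOME

print(f"{nucleosomes:,} nucleosomes per haploid genome")
print(f"{core_histones:,} core histone molecules")
print(f"{core_fraction:.0%} of the DNA is in nucleosome cores")
```

This is an upper-bound style estimate, since it assumes uniform spacing and complete packaging.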
Histones (H1, H2A, H2B, H3, H4)
• small proteins
• arginine or lysine rich: positively charged
• interact with negatively charged DNA
• can be extensively modified - modifications in general make them less positively charged
    Phosphorylation
    Poly(ADP) ribosylation
    Methylation
    Acetylation
        Hypoacetylation
            by histone deacetylase (facilitated by Rb)
            “tight” nucleosomes
            associated with transcriptional repression
        Hyperacetylation
            by histone acetylase (facilitated by TFs)
            “loose” nucleosomes
            associated with transcriptional activation
Nucleofilament structure
Condensation and decondensation
of a chromosome in the cell cycle
Telomeres and aging

Telomeres are protective “caps” on chromosome ends consisting of short 5-8 bp tandemly repeated G-rich DNA sequences that prevent chromosomes from fusing and causing karyotypic rearrangements.

[Figure: metaphase chromosome: telomere, centromere, telomere]

telomere structure (<1 to >12 kb)
young: (TTAGGG)many
senescent: (TTAGGG)few

• telomerase (an enzyme) is required to maintain telomere length in germline cells
• most differentiated somatic cells have decreased levels of telomerase and therefore their chromosomes shorten with each cell division
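
The consequence of losing telomerase can be illustrated with a deliberately simple model: a fixed number of base pairs is lost at each division unless telomerase is active. The starting length is taken from the upper end of the range on the slide; the loss per division is a hypothetical parameter, telomerase is treated as maintaining length perfectly (a simplification), and the function names are invented.

```python
REPEAT = "TTAGGG"


def telomere_length_bp(start_bp, loss_per_division_bp, divisions, telomerase_active=False):
    """Telomere length after a number of divisions under a constant-loss model."""
    if telomerase_active:
        return start_bp  # simplification: length fully maintained
    return max(0, start_bp - loss_per_division_bp * divisions)


def repeat_copies(length_bp):
    """Whole TTAGGG repeats that fit in a telomere of the given length."""
    return length_bp // len(REPEAT)


START_BP = 12_000          # ~12 kb, upper end of the range on the slide
LOSS_PER_DIVISION_BP = 100 # hypothetical value, for illustration only

for divisions in (0, 25, 50, 100):
    somatic = telomere_length_bp(START_BP, LOSS_PER_DIVISION_BP, divisions)
    germline = telomere_length_bp(START_BP, LOSS_PER_DIVISION_BP, divisions, telomerase_active=True)
    print(divisions, somatic, repeat_copies(somatic), germline)
```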
Class Assignment (for discussion on Sept 9th)

Botchkina GI, et al. “Noninvasive detection of prostate cancer by quantitative analysis of telomerase activity.” Clin Cancer Res. May 1;11(9):3243-3249, 2005

PDF of article is accessible on the website