
A LITTLE HISTORY OF DNA ...

X-ray fibre diffraction pattern of B-DNA
Unsung Heroine
Rosalind Franklin’s role as chief data provider for the discovery had faded
into obscurity until James Watson revived interest in her work with his
controversial 1968 book The Double Helix.
Watson portrayed Rosy – as he called her – in an unflattering light. He
made amends slightly in an epilogue in which he said his intention in the
book was to convey his actual impressions as a young man during the
discovery period and – particularly with Rosalind Franklin – his
impressions had often been wrong. Watson added the epilogue after he
received outraged reactions to his first draft of the book from Francis Crick
and Maurice Wilkins.

Rosalind Franklin had died in 1958 and was in no position to comment on
the book.

Despite edits Watson made to his drafts, Crick and Wilkins were not
satisfied, and they succeeded in persuading Harvard University Press to
withdraw their offer to publish it.

Watson found another publisher.


Unsung Heroine
One of the keys to DNA’s structure was Francis Crick’s
reading of a report by Rosalind Franklin in which she said
that the space group of DNA crystals is face-centered
monoclinic.

Although a number of other scientists were aware of this
fact, Crick saw what they had been blind to – he saw that
this space group implied that DNA was a dyad: if you rotate
a molecule of DNA by 180 degrees, you get back to where
you started; this allowed Crick to visualize the double helix
structure. It was a major breakthrough.

The face-centered monoclinic identification came first from
Dorothy Hodgkin at the University of Oxford, UK. Franklin
had ruled out all but 3 of 230 possible symmetry types for
DNA crystals. Hodgkin was then able to rule out 2 of
Franklin’s 3, leaving only one: face-centered monoclinic.
Unsung Heroes
In 1871 Friedrich Miescher, who had worked at the
University of Tübingen in Germany,
published his discovery of a new
substance.
He called it nuclein, because he had
extracted it from the nuclei of biological
cells.
Today we call it DNA.

In the 1880s and 1890s Albrecht Kossel
at the University of Berlin in Germany
identified the four bases (in the acid-base
sense) that can be found within the
DNA molecule: adenine, cytosine,
guanine, and thymine.
These bases are often abbreviated to A,
C, G and T.
Unsung Heroes
Phoebus Levene was head of the biochemical laboratory at the
Rockefeller Institute of Medical Research in New York, USA. He
could be seen as both a hero and a villain, albeit a well-
intentioned villain, in DNA’s story.
On the one hand Levene greatly advanced our understanding
of the chemical units in DNA, but on the other hand he stated
that DNA could not be genetic material.
Levene discovered deoxyribose (the D in DNA) in 1929.
He also coined the term nucleotide to describe another of his
major discoveries: that DNA is made of repeating units of
phosphate-sugar-base.
Levene wrongly believed that all DNA contained equal amounts
of the bases A, C, G and T, and therefore, because it lacked
variation, he ruled it out as genetic material. This proved rather
influential, so even after Oswald Avery and his colleagues
demonstrated in 1944 that DNA carries the genetic code, there
was a lot of resistance to the idea.
In 1938 Levene published work showing that DNA is a huge
molecule, with a molecular weight of 200,000 – 1,000,000. Until
1938 most scientists believed that DNA was the size of one
nucleotide – i.e. one phosphate-sugar-base unit.
Unsung Heroes
Florence Bell took some of the first X-ray diffraction
photos of DNA in 1938 and 1939.

Bell was a Ph.D. student working with William Astbury
at the University of Leeds in the UK.

Bell and Astbury identified correctly that the A, C, G,
and T bases are attached to a supporting backbone.
They described the arrangement of bases as ‘a pile of
pennies’ and were the first to measure the spacing
between each of the ‘pennies.’ The spacing they found
of 0.34 nanometers was correct, and it was the spacing
Watson and Crick used in their model of DNA.
Unsung Heroes
Sven Furberg
In 1949 Sven Furberg, a Ph.D. student at Birkbeck College, London, UK,
suggested the A, C, G, and T base units were held within a helix, at
right-angles to the length of the molecule. He suggested that another of
DNA’s chemical units, the phosphate group, is oriented to point outwards from
the molecule. He was correct.

Rudolf Signer
In 1938 Rudolf Signer at the University of Bern in Switzerland reported that
DNA is a huge molecule, with a molecular weight of 500,000 to 1,000,000.
In May 1950, at a meeting of the Faraday Society in London, Signer offered
very high quality DNA free to anyone who was interested. Maurice Wilkins
from King’s College London was very interested and Signer gave him a
sample. It was Signer’s beautifully prepared and freely given DNA sample that
Maurice Wilkins and then Rosalind Franklin used in their work.
Unsung Heroes
Alexander Stokes and Raymond Gosling
In 1950 Maurice Wilkins and his Ph.D. student Raymond Gosling took X-ray
diffraction photos of Signer’s DNA, showing a clear crystal pattern. Their colleague
Alexander Stokes told them the pattern suggested DNA had helical symmetry with
the bases stacked like a ‘pile of pennies.’
Stokes then did a mathematical analysis using Bessel functions to predict how X-ray
diffraction photos of helical structures would look; his analysis was used by both
Wilkins and Franklin.
Gosling and Wilkins’s photos grabbed the attention of James Watson, pulling him to
the UK in his quest to solve the mystery of the gene.

Elwyn Beighton
In May and June 1951, Elwyn Beighton, a postdoctoral researcher at the University of
Leeds, took X-ray diffraction photos of DNA that look remarkably similar to Franklin
and Gosling’s famous Photo 51, taken in May 1952. His photos reveal that Beighton
had also stumbled upon DNA’s B form, discovered in September 1951 by Franklin.
Unfortunately, neither Beighton nor his boss William Astbury saw the huge
significance of their work. Beighton’s work was an example of a monumental ‘if only….’
Unsung Heroes
In 1949 and 1951, Erwin Chargaff published
important papers adding two very important
pieces to the DNA jigsaw.

Firstly, in DNA from different sources Chargaff
found that the ratio of bases A:T was always 1:1
and the ratio of bases G:C was always 1:1. To
Crick and Watson, this suggested that in DNA
these bases were paired with one another.
Ultimately their double helix model would explain
the reason for base pairing.

Secondly, Chargaff discovered that the amount of
A and T relative to G and C varied from species to
species, countering Levene’s earlier idea that
DNA lacked variation and so could not be genetic
material.
Unsung Heroes
Jerry Donohue
Jerry Donohue shared an office with James Watson at the University of
Cambridge’s Cavendish Laboratory.

Watson had been trying without success to see how the four bases A, C,
G, and T could fit within Crick’s dyad double helix. He told Donohue his
problem and Donohue was able to tell Watson that textbooks were
wrong about the behavior of the G and T bases – they would exist in the
keto rather than the enol form. And Watson then had his revelation about
how the bases would line up within the double helix.

This was the final piece of the jigsaw. Watson and Crick built their model
of DNA and saw that its structure naturally suggested a method of
replication.
DNA envisioned by Alexander Todd
1953 Watson and Crick propose the double helix as the structure of DNA
based on the work of Erwin Chargaff, Jerry Donohue, Rosy Franklin
and John Kendrew
Secondary Structure of DNA: The Double Helix.
Initial “like-with-like”, parallel helix:
Does not fit with Chargaff’s Rule: A = T, G = C
[Figure: hydrogen-bonded like-with-like base pairs, purine-purine and pyrimidine-pyrimidine, drawn with the wrong tautomers]

Watson, J. D. The Double Helix, 1968


Two polynucleotide strands, running in opposite directions
(anti-parallel) and coiled around each other in a double helix.
The strands are held together by complementary hydrogen-
bonding between specific pairs of bases.
Antiparallel C-G Pair
[Figure: C-G base pair with three hydrogen bonds. C N4-H (donor) to G O6 (acceptor); G N1-H (donor) to C N3 (acceptor); G N2-H (donor) to C O2 (acceptor). The two sugar-phosphate strands run 5' to 3' in opposite directions.]
Antiparallel T-A Pair
[Figure: T-A base pair with two hydrogen bonds. A N6-H (donor) to T O4 (acceptor); T N3-H (donor) to A N1 (acceptor). The two sugar-phosphate strands run 5' to 3' in opposite directions.]
DNA double helix
[Figure: DNA double helix. Labels: major groove; 12 Å; one helical turn, 34 Å; minor groove; backbone: deoxyribose and phosphodiester linkage; bases.]
Tertiary Structure of DNA: Supercoils. Each cell contains
about two meters of DNA. DNA is “packaged” by coiling around
a core of proteins known as histones. The DNA-histone
assembly is called a nucleosome. Histones are rich in lysine
and arginine residues.
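
As a rough check on the "about two meters" figure, the extended length of B-DNA can be estimated from the 3.4 Å rise per base pair. The sketch below assumes a diploid human genome of roughly 6.4 × 10^9 base pairs; that number is supplied here for illustration and does not come from these slides, and the function name is hypothetical.

```python
# Rough estimate of DNA contour length from the rise per base pair.
# ASSUMPTION: ~6.4e9 base pairs per diploid human cell (illustrative value,
# not taken from the slides).

RISE_PER_BP_ANGSTROM = 3.4  # axial rise per base pair in B-DNA


def contour_length_m(base_pairs: float,
                     rise_angstrom: float = RISE_PER_BP_ANGSTROM) -> float:
    """Return the length of fully extended B-DNA in meters."""
    return base_pairs * rise_angstrom * 1e-10  # 1 Å = 1e-10 m


if __name__ == "__main__":
    length = contour_length_m(6.4e9)
    print(f"Estimated contour length: {length:.2f} m")
```

Under these assumptions the estimate comes out a little over two meters, which is why the DNA has to be compacted by winding around histones.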

PDB code 1KX5


Intro. to Nucleotides and Nucleic Acids
Nucleotides have a variety of roles in cellular metabolism. They are
the energy currency in metabolic reactions, the essential chemical
links in the response of cells to hormones and other stimuli, and the
structural components of a variety of enzyme cofactors and
metabolic intermediates. They are also constituents of the nucleic
acids, deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). The
amino acid sequence of every protein in a cell, and the nucleotide
sequence of every RNA, is specified by a nucleotide sequence in
genomic DNA. Segments of DNA specifying the synthesis of a
functional protein or RNA product are called genes. The storage and
transmission of biological information are the only known functions
of DNA. RNAs have a broader range of functions. Ribosomal RNAs
(rRNAs) are structural and catalytic components of ribosomes.
Messenger RNAs (mRNAs) carry genetic information specifying the
sequences of proteins. Transfer RNAs (tRNAs) are adaptor
molecules that participate in translating the information in mRNA
into a specific sequence of amino acids in a polypeptide. There is
also a wide variety of special-function RNAs, including some called
ribozymes that have enzymatic activity.
Structure of Nucleotides
Nucleotides contain three components: 1) a nitrogen-containing
base, 2) a pentose, and 3) one or more phosphates. In the absence
of the phosphate group(s), the molecule is called a nucleoside. The
nitrogenous bases are derivatives of pyrimidine and purine. The
numbering of the ring atoms of pyrimidines and purines is
illustrated here. The numbering of the pentose rings follows the
convention outlined as below, except that the carbon numbers in
the pentoses of nucleotides and nucleosides are given a prime (’)
designation to distinguish them from the numbered atoms of the
bases. The base of a nucleotide is joined covalently (at N-1 of
pyrimidines and N-9 of purines) in an N-β-glycosyl bond to the 1’
carbon of the pentose, and the phosphate is esterified commonly to
the 5’ carbon. Water is removed in the formation of the N-β-glycosyl
bond, as occurs in O-glycosidic bond formation.
Major Bases of Nucleic Acids
Both DNA and RNA contain two major purine bases, adenine (A)
and guanine (G). Both nucleic acids also contain the pyrimidine,
cytosine (C), and a second pyrimidine that is thymine (T) in DNA
and uracil (U) in RNA. Only occasionally does thymine occur in
RNA or uracil in DNA. In some cases the names of the bases
reflect the sources from which they originally were isolated.
Guanine, for example, was first isolated from guano (bird manure)
and thymine was first isolated from thymus tissue.
Nomenclature of Nucleosides & Nucleotides
The names of the nucleosides and nucleotides containing the five
common bases are listed.
The Pentoses of Nucleotides
Nucleotides have two kinds of pentoses. The recurring
deoxyribonucleotide units of DNA contain 2’-deoxy-D-ribose, and
the ribonucleotide units of RNA contain D-ribose. In both types
of nucleotides the pentoses exist in their β-furanose (closed
five-membered ring) forms. The β-D-ribofuranose ring forms from
the straight-chain aldehyde form of D-ribose in solution.
Deoxyribose undergoes a similar interconversion in
solution, but in DNA exists solely as β-2’-deoxy-D-ribofuranose.
Deoxyribofuranose and ribofuranose rings in nucleotides exist in
four different puckered conformations. In all cases, four of the
five ring atoms are nearly in the same plane. The fifth atom (C-2’
or C-3’) is on either the same (endo) or the opposite (exo) side of
the plane relative to the C-5’ atom.
Deoxyribonucleotides of DNA
The structures and names of the four major deoxyribonucleotides
(deoxyribonucleoside 5’-monophosphates) of DNA are shown below.
All nucleotides are shown in their free form at pH 7.0. The
deoxyribonucleotide units of DNA are usually symbolized as A, G,
T, and C, and sometimes as dA, dG, dT, and dC. In their free
forms, the deoxyribonucleotides are commonly abbreviated dAMP,
dGMP, dTMP, and dCMP. For each nucleotide in the figure, the
more common name is followed by the complete name in
parentheses. All abbreviations assume that the phosphate group is
at the 5’ position. The nucleoside portion of each molecule is
shaded in light red.
Ribonucleotides of RNA
The structures and names of the four major ribonucleotides
(ribonucleoside 5’-monophosphates) of RNA are shown below. All
nucleotides are shown in their free form at pH 7.0. The
ribonucleotide units of RNA are usually symbolized as A, G, U,
and C. In their free forms, the ribonucleotides are commonly
abbreviated AMP, GMP, UMP, and CMP. For each nucleotide in the
figure, the more common name is followed by the complete name
in parentheses. All abbreviations assume that the phosphate group
is at the 5’ position. The nucleoside portion of each molecule is
shaded in light red.
Phosphodiesters, Oligonucleotides, and Polynucleotides.
The chemical linkage between nucleotide units in nucleic acids is a
phosphodiester, which connects the 5’-hydroxyl group of one
nucleotide to the 3’-hydroxyl group of the next nucleotide.

By convention, nucleic acid sequences are written from left to right,
from the 5’-end to the 3’-end.

Nucleic acids are negatively charged.

[Figure: sugar-phosphate backbone with alternating 3' and 5' phosphodiester links and a base on each sugar; at the position marked X, RNA has X = OH and DNA has X = H]
Nucleic Acids.
1944: Avery, MacLeod & McCarty - Strong evidence that DNA
is genetic material
1950: Chargaff - careful analysis of DNA from a wide variety of
organisms. Content of A, T, C & G varied widely according
to the organism, however: A=T and C=G (Chargaff’s Rule)
1953: Watson & Crick - structure of DNA (1962 Nobel Prize with
M. Wilkins)
Nucleosides & Nucleotides
Primary & Secondary structure of DNA
Biosynthesis of Nucleic acids:
Phosphodiester Linkages in the Covalent
Backbone of DNA and RNA
The successive nucleotides in DNA and
RNA are covalently linked through
phosphate-group bridges in which the 5’-
phosphate of one nucleotide unit is joined
to the 3’-hydroxyl group of the next,
creating a phosphodiester linkage. Thus,
the covalent backbones of nucleic acids
consist of alternating phosphate and
pentose residues, and the nitrogenous
bases may be regarded as side groups
joined to the backbone at regular
intervals. The backbones of both DNA and
RNA are hydrophilic. The hydroxyl groups
of the sugar residues form hydrogen bonds
with water. The phosphate groups, with a
pKa near 0, are completely ionized and
negatively charged at pH 7. The negative
charges are generally neutralized by ionic
interactions with positive charges on
proteins, metal ions, and polyamines. All
the phosphodiester linkages in DNA and
RNA have the same orientation along the
chain giving each linear nucleic acid strand
a specific polarity and distinct 5’ and 3’
ends. By definition, the 5’ end lacks a
nucleotide at the 5’ position and the 3’ end
lacks a nucleotide at the 3’ position.
Hydrolysis of RNA by Alkali
The covalent backbone of DNA and RNA is subject to slow,
nonenzymatic hydrolysis of its phosphodiester bonds. In vitro, RNA
is hydrolyzed rapidly under alkaline conditions, but DNA is not.
This is because the 2’-hydroxyl group in the ribose moieties of
RNA is directly involved in the cleavage process. 2’,3’-cyclic
monophosphate nucleotides are the first products of the action of
alkali on RNA and are subsequently hydrolyzed further to yield a
mixture of 2’- and 3’-nucleoside monophosphates.
Representations of Nucleotides
The nucleotide sequences of nucleic acids can be represented as
illustrated below for a segment of DNA with five nucleotide units.
The phosphate groups are symbolized by circled Ps, and each
deoxyribose is symbolized by a vertical line, from C-1’ at the top
to C-5’ at the bottom. The connecting lines between nucleotides
(which pass through the P symbols) are drawn diagonally from the
middle (C-3’) of the deoxyribose of one nucleotide to the bottom
(C-5’) of the next. Some simpler representations of this
pentadeoxyribonucleotide are pA-C-G-T-AOH, pApCpGpTpA, and
pACGTA. Note that the sequence of a single strand of nucleic acid
is always written with the 5’ end at the left and the 3’ end at the
right, that is in the 5’3’ direction. A short nucleic acid such as
shown in the figure is referred to as an oligonucleotide. This term
is generally applied to nucleotides of 50 or fewer residues. Longer
nucleic acids are referred to as polynucleotides.
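
The shorthand notations above can be generated mechanically from a sequence written 5’ to 3’. The following is a minimal sketch; the function names and the dictionary keys are hypothetical and chosen only for illustration, and the strand is assumed to carry a 5’ phosphate and a free 3’-hydroxyl as in the example.

```python
# Hypothetical helpers for the shorthand notations described above.
# Input is a DNA sequence written 5'->3', assumed to have a 5' phosphate
# and a free 3'-OH.

OLIGO_MAX_RESIDUES = 50  # cutoff quoted in the text


def shorthand_forms(seq: str) -> dict:
    """Return three equivalent shorthand spellings of one strand."""
    seq = seq.upper()
    return {
        "hyphenated": "p" + "-".join(seq) + "OH",           # pA-C-G-T-AOH
        "phosphates": "".join("p" + base for base in seq),  # pApCpGpTpA
        "compact": "p" + seq,                               # pACGTA
    }


def size_class(seq: str) -> str:
    """Classify a strand by length using the 50-residue convention."""
    if len(seq) <= OLIGO_MAX_RESIDUES:
        return "oligonucleotide"
    return "polynucleotide"


if __name__ == "__main__":
    for name, text in shorthand_forms("ACGTA").items():
        print(name, text)
    print(size_class("ACGTA"))
```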
Tautomeric Forms of Uracil
Free pyrimidine and purine bases may exist in two or more
tautomeric forms depending on the pH. Uracil, for example,
occurs in lactam, lactim, and double lactim forms depending on
the pH. Certain tautomeric forms predominate at neutral pH,
and these are the structures shown for the five common
purines and pyrimidines. These are the tautomers that are
present in the bases in DNA and RNA.
Nucleotide Absorption Spectra
All nucleotide bases absorb UV light, and nucleic acids are
characterized by a strong absorption at wavelengths near 260 nm.
Plotted in this figure is the variation in molar extinction
coefficient, , as a function of wavelength. The molar extinction
coefficients at 260 nm are listed in the attached table. The
spectra of corresponding ribonucleotides and deoxyribonucleotides
are essentially identical. For mixtures of nucleotides, a wavelength
of 260 nm is used for absorption measurements.
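
A molar extinction coefficient turns an absorbance reading into a concentration through the Beer-Lambert law, A = ε · c · l. The slides do not state the law explicitly, so the sketch below is an illustration under that standard assumption. The example value of 15,400 M⁻¹ cm⁻¹ is a commonly tabulated figure for AMP at 260 nm, but because the table is not reproduced here it should be treated as an assumption; the function name is hypothetical.

```python
# Beer-Lambert law: A = epsilon * c * l, solved for concentration c.
# ASSUMPTION: epsilon = 15400 M^-1 cm^-1 is used as an illustrative value
# for AMP at 260 nm; substitute the value from the table you are using.


def concentration_molar(absorbance: float, epsilon: float,
                        path_cm: float = 1.0) -> float:
    """Return molar concentration from absorbance, epsilon and path length."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


if __name__ == "__main__":
    c = concentration_molar(absorbance=0.77, epsilon=15400.0)
    print(f"Concentration: {c * 1e6:.1f} micromolar")
```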
Watson and Crick Base-pairing in DNA
The functional groups of pyrimidines and purines are ring nitrogens,
carbonyl groups, and exocyclic amino groups. Hydrogen bonds
involving the amino and carbonyl groups are the most important
mode of interactions between two complementary strands of nucleic
acid. The most common hydrogen-bonding patterns are those
defined by Watson and Crick in 1953, in which A bonds specifically
to T (or U) and G bonds to C. These two types of base pairs
predominate in double-stranded DNA and RNA, and the tautomers
suggested by Jerry …………………..are responsible for these types of
base pairs. It is this specific pairing of bases that permits the
duplication of genetic information in DNA, as discussed below.
Watson-Crick Model for the Structure of
Double-helical DNA (I)
A model for the structure of DNA was proposed by Watson and
Crick in 1953. Their model was based on a number of pieces of
information that were available at the time about the composition
of DNA and the x-ray diffraction properties of DNA fibers. Most
importantly, x-ray diffraction studies of DNA fibers performed
by Rosalind Franklin and Maurice Wilkins showed that DNA
molecules are helical and exhibit two periodicities repeating along
the length of the fiber--a primary repeat of 3.4 Å and a
secondary repeat of 34 Å. In addition, Erwin Chargaff and
colleagues had shown through DNA compositional analysis that the
number of T residues always equals the number of A residues (A =
T), and the number of G residues always equals the number of C
residues (G = C). As a result, the sum of purine residues equals
the sum of pyrimidine residues (A + G =
T + C). Watson and Crick then set
about to develop a structure that was
consistent with these two sets of data
(next slide).
Watson-Crick Model for the Structure of
Double-helical DNA (II)
In DNA, Watson and Crick proposed
that two helical DNA chains are wound
around the same axis to form a right-
handed double helix. They speculated
that the two chains have an
antiparallel orientation, and this was
later proven to be true. The
hydrophilic backbones of alternating
deoxyribose and phosphate groups are
on the outside of the helix facing the
surrounding water. The furanose ring
of each deoxyribose is in the C-2’
endo conformation. The purine and
pyrimidine bases of both strands are
stacked inside the double helix with
their hydrophobic and nearly planar
ring structures very close together and
perpendicular to the axis of the helix.
(Continued on the next slide).
Watson-Crick Model for the Structure of
Double-helical DNA (III)
Base stacking accounts for the 3.4 Å
repeat along the length of the helix.
The secondary repeat of about 34 Å
was accounted for by the presence of
10 base pairs in each complete turn of
the double helix. This was later
modified to 10.5 base pairs per turn
for DNA in aqueous solution. In the
Watson-Crick model, A/T and G/C base
pairing was proposed based on the fact
that these combinations of bases fit
well inside the double helix. Finally, the
offset pairing of the two strands
creates a major and a minor groove on
the surface of the duplex. It should be
noted that the double helix not only is
stabilized by Watson-Crick base pairing
between residues in the helix, but is
also stabilized by base-stacking
interactions that remove the bases from
contact with water. The features of the
double-helical model of DNA structure
are supported by much chemical and
biochemical evidence.
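
The two periodicities are linked by simple arithmetic: the pitch of the helix is the rise per base pair times the number of base pairs per turn, and the rotation between neighbouring base pairs is 360° divided by that number. The sketch below works this through for 10 and 10.5 base pairs per turn. Holding the rise at 3.4 Å for the 10.5 case is a simplifying assumption, and the function names are hypothetical.

```python
# Helix geometry from the two repeats quoted in the text.
# ASSUMPTION: the 3.4 Å rise is held constant when bp-per-turn changes.

RISE_ANGSTROM = 3.4  # spacing between stacked base pairs


def helical_pitch(bp_per_turn: float, rise: float = RISE_ANGSTROM) -> float:
    """Length of one full helical turn along the axis, in Å."""
    return bp_per_turn * rise


def twist_per_bp_degrees(bp_per_turn: float) -> float:
    """Rotation about the helix axis between adjacent base pairs."""
    return 360.0 / bp_per_turn


if __name__ == "__main__":
    for n in (10.0, 10.5):
        print(f"{n} bp/turn: pitch {helical_pitch(n):.1f} Å, "
              f"twist {twist_per_bp_degrees(n):.1f}° per base pair")
```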
Double-helical Strand Complementarity
The two antiparallel chains of double-helical
DNA are not identical in either base sequence
or composition. Instead, they are
complementary to one another. Wherever
adenine occurs in one chain, thymine occurs in
the other. Similarly, guanine occurs opposite
cytosine in the two chains.
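
Complementarity plus antiparallel orientation means the partner strand, read 5’ to 3’, is the reverse complement of the first. The minimal sketch below builds the partner strand and counts bases across the duplex, which shows why A = T and G = C hold for double-stranded DNA even when a single strand is lopsided. The function names are hypothetical.

```python
from collections import Counter

# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def duplex_base_counts(strand: str) -> Counter:
    """Count bases over both strands of the duplex formed by `strand`."""
    return Counter(strand.upper() + reverse_complement(strand))


if __name__ == "__main__":
    top = "ATGCCG"
    print("top strand    5'-" + top + "-3'")
    print("bottom strand 5'-" + reverse_complement(top) + "-3'")
    print(dict(duplex_base_counts(top)))
```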
DNA Biosynthesis
DNA Replication
Nucleosides & Nucleotides
RNA transcription

• Only 2% of DNA is used to make proteins
• ENCODE: 400 scientists, 1200 experiments
• 150 cell types and ~21,000 proteins
• 30,000 genes code for RNAs that act like switches (ncRNAs)
Translation
Breaking the Genetic Code : Nirenberg and Khorana

When it became known that each amino acid was coded
for by a sequence of three nucleotide bases, scientists
eagerly sought to determine which triplets went with
which amino acids.

In 1964, Marshall Nirenberg and Har Gobind Khorana
worked out the puzzle of the genetic code. By using
radioactively labeled synthetic mRNA molecules, they
were able to assign specific triplets to each of the 20
amino acids.
Breaking the Genetic Code : Required organic chemistry

The key breakthrough in deciphering the genetic code came from an
unexpected direction.

In 1960, Marshall Nirenberg and J. H. Matthaei developed a system
for synthesizing proteins in vitro. They had learned that preparations
of disrupted cells soon ceased to make protein, and, in an attempt to
prolong the short period during which in vitro synthesis continued,
they added ribosomal RNA (rRNA) to the preparations. rRNA indeed prolonged
the period of in vitro protein synthesis and all 20 amino acids were
actively incorporated into newly-made protein.
As a control, they used an artificial RNA, reasoning that
only RNA sequences with physiological significance
should be active in in vitro protein synthesis. Artificial
RNA, because it was not naturally occurring, should not
prolong in vitro protein synthesis. Well, an experiment is
only as good as its controls, and in this case the control
proved far more important than the experiment itself (the
effect of rRNA on in vitro protein synthesis was later
shown to be indirect).
Nirenberg and Matthaei used the enzyme polynucleotide
phosphorylase, which synthesizes RNA chains randomly
from available precursors without a template, to make
the artificial RNA polyuridylic acid (poly-U) from UDP.
They added the poly-U to a fresh, disrupted cell
suspension (cell-free extract), expecting the rapid decay
of in vitro protein synthesis (they monitored the
incorporation of 14C amino acids into acid-precipitable
protein to detect protein synthesis). Instead, protein
synthesis was stimulated!
Activity was so great as to make the rRNA activity levels
seem minuscule by comparison. Only 10 micrograms of
poly-U yielded approximately 13,000 14C amino acid
counts per minute (CPM is a measurement of
radioactivity; higher levels of radioactivity are indicated
by higher counts per minute), while 2,400 micrograms of
rRNA yielded only about 200 CPM. Most importantly,
only 14C phenylalanine was incorporated into protein.
The acid-precipitable 14C label was in polyphenylalanine
(PHE-PHE-PHE--).
This immediately provided additional confirmation of
Brenner, Jacob, and Meselson’s mRNA hypothesis,
and suggested an additional hypothesis of first
importance : that the ribosomes could not distinguish
an artificial mRNA from a naturally-derived one. When
an artificial mRNA was presented carrying the code
word for phenylalanine (evidently UUU), the
ribosomes proceeded to read it with high efficiency. In
a similar manner, AAA = LYS, and CCC = PRO. It is
this approach, the synthesis of synthetic mRNA
molecules, which led directly and quickly to the full
deciphering of the genetic code.
At first, attempts were made to deduce the code from more complex artificial
mRNA molecules. By presenting polynucleotide phosphorylase with two
nucleotides present in varying proportions, RNA chains could be obtained with the
two nucleotides present in random sequence. This mRNA could then be employed
in in vitro protein synthesis and protein isolated with several amino acids present.
Their composition provided direct code information.

Imagine an initial mix of 3:1 U to G. The probability of UUU is (3/4)(3/4)(3/4), or
27/64; the probability of any one codon with two U’s and one G is (3/4)(3/4)(1/4), or
9/64; and the probability of any one codon with one U and two G’s is (3/4)(1/4)(1/4),
or 3/64. Thus, the ratio of PHE to each of the three codons with two U’s and one G
should be 3:1, and the ratio of PHE to each codon carrying only one U should be 9:1.
When one tries poly-UG (3:1) in in vitro protein synthesis, one obtains valine, leucine,
and cysteine incorporated about 1/3 as often as phenylalanine, suggesting that the
codons for VAL, LEU, and CYS each contain two U’s and one G. But which is which?
This approach cannot tell you that. Artificial mRNA of
random sequence can provide information only about
codon composition, not codon sequence. What was
required then was a sequence-specific probe.
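
The composition argument for a random copolymer can be checked by enumerating every triplet and multiplying the independent base probabilities. The sketch below does this for a 3:1 U:G mix using exact fractions. It assumes each position is filled independently in proportion to the precursor ratio, which is an idealization of what polynucleotide phosphorylase does; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def codon_probabilities(base_fractions: dict) -> dict:
    """Probability of each triplet when bases are chosen independently."""
    probs = {}
    for triplet in product(base_fractions, repeat=3):
        p = Fraction(1)
        for base in triplet:
            p *= base_fractions[base]
        probs["".join(triplet)] = p
    return probs


if __name__ == "__main__":
    # ASSUMPTION: idealized 3:1 U:G random copolymer.
    probs = codon_probabilities({"U": Fraction(3, 4), "G": Fraction(1, 4)})
    for codon in ("UUU", "UUG", "UGU", "GUU", "UGG"):
        ratio = probs[codon] / probs["UUU"]
        print(codon, probs[codon], "relative to UUU:", ratio)
```

The three codons with two U’s and one G (UUG, UGU, GUU) all come out at one third of the UUU frequency, which is exactly why composition data cannot say which of them encodes which amino acid.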
Nirenberg’s Experiment

The first such probes were indirect, but powerful. Marshall
Nirenberg and Philip Leder showed in 1964 that the simple
trinucleotide UUU, while it was incapable of acting as
mRNA, would promote the binding of 14C PHE-tRNA (the
phenylalanine-specific transfer RNA, charged with 14C-labeled
phenylalanine) to ribosomes. The binding required the
presence of several additional binding factor proteins and
GTP, and was specific: only 14C PHE-tRNA was bound to
ribosomes when the UUU trinucleotide was
employed. It was thus possible to carry out a simple triplet
binding assay.
A specific triplet (say UGU) was added to a mix containing
ribosomes, binding factors, GTP, and a variety of 14C amino acid-
charged tRNAs. This mixture was then passed through a filter. While
most radioactivity passed through the filter, a small amount remained
trapped on the filter surface because the ribosomes adhered to the
filter, and the ribosomes had bound to them the 14C amino acid-
tRNA that recognized UGU. When the filter was analyzed, it
contained 14C-cysteine, so UGU = CYS. Because all possible
trinucleotides could be readily synthesized, it was possible to decode
most three-base codons, despite the indirect nature of the assay.
Some 47 of the 64 possible combinations gave unambiguous
results.
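
The count of 64 comes from choosing one of four bases at each of three positions. A minimal sketch that enumerates the trinucleotides a full binding screen would have to cover is shown below; the base ordering and the function name are arbitrary choices made for illustration.

```python
from itertools import product

RNA_BASES = "UCAG"  # ordering is arbitrary; any order gives the same set


def all_triplets() -> list:
    """Return every possible RNA trinucleotide (4 ** 3 = 64 of them)."""
    return ["".join(t) for t in product(RNA_BASES, repeat=3)]


if __name__ == "__main__":
    triplets = all_triplets()
    print(len(triplets), "triplets, starting with", triplets[:4])
```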
Khorana’s Experiment

The remaining 17 triplets gave ambiguous results on triplet binding
assays, and decoding them required a more direct approach. Har
Gobind Khorana provided such an approach by setting out to directly
construct a series of artificial mRNA molecules of defined sequence.
He first constructed short defined sequences of DNA. He knew the
sequences of the DNA molecules that he synthesized because he
made the DNA from special chemical groups blocked so that only
certain base combinations were possible.
An over-simplified example might be to imagine G bound to a
column matrix, but T blocked chemically so that it could not
bind to the column. The blocked T was added to the column
under conditions that promoted the nucleotide condensation
reaction, and GT was obtained, with unused T washed out the
bottom of the column and all the initial G’s then bound by T.
Blocked G was then added to yield –GTG. In this way, defined
DNA double-helical models of 6 to 8 base pairs were
constructed. Khorana then used those DNA oligonucleotides
as templates for RNA polymerase, and produced specific RNA
molecules such as GUGUGUGU-----. Very long mRNA
molecules of known sequence could be produced in this
fashion.
From an mRNA segment such as GUGUGUGU---, there are
two alternating codons, GUG and UGU. When employed in
in vitro protein synthesis, this mRNA yielded a polypeptide of
alternating CYS-VAL-CYS-VAL---. Which was which? From
the triplet binding assay, Khorana knew that UGU coded for
CYS. Therefore, GUG must code for valine (VAL). By
constructing these and more complicated defined sequence
mRNAs, Khorana was able to verify the entire code.
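
The logic of the repeating-copolymer experiment can be sketched in a few lines: whichever reading frame the ribosome picks, a (GU)n message splits into the same two alternating codons. The codon table below is deliberately partial and holds only the two assignments discussed here; the function names are hypothetical.

```python
# Partial codon table: only the two assignments discussed in the text.
CODON_TABLE = {"UGU": "CYS", "GUG": "VAL"}


def codons_in_frame(mrna: str, frame: int) -> list:
    """Split an mRNA into complete triplets starting at offset `frame`."""
    return [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]


def translate(mrna: str, frame: int = 0) -> str:
    """Translate using CODON_TABLE; unknown codons are shown as '???'."""
    codons = codons_in_frame(mrna, frame)
    return "-".join(CODON_TABLE.get(codon, "???") for codon in codons)


if __name__ == "__main__":
    message = "GU" * 6  # GUGUGUGUGUGU
    for frame in range(3):
        print(frame, codons_in_frame(message, frame), translate(message, frame))
```

Every frame gives an alternating CYS/VAL polypeptide, so the product alone cannot say which codon is which; the assignment UGU = CYS from the triplet binding assay is what settles GUG = VAL.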
Genetic code
