FROM FACTS TO CONCEPTS
MARKUS ENGSTLER, THOMAS RUDEL, THOMAS DANDEKAR, MARKUS SAUER
WINTER 2023/4
LECTURE SERIES MOLECULAR BIOLOGY WISE 2023/4 ORGANISATION OF THE EUKARYOTIC GENOME
Example:
Markus Engstler
Introduces each module and lays the foundation for the other lectures.
Starts very basic but seeks to look at apparently simple stuff in an unexpected way.
For further reading: e.g. Alberts et al., Essential Cell Biology, Garland Science.
The Cell Biology lectures
More important than memorising facts is that you come to understand the principles, ideas
and gaps in knowledge of molecular biology. This includes seeing our current knowledge in
the light of its discovery.
Numbers are particularly important. Without a feeling for numbers, you cannot understand
the processes in and on cells.
Self-study is the key to success. This does not have to be boring. That's why I provide lists
of links that will take you on a special journey through molecular biology.
HOW CELLS READ THE GENOME
ORGANISATION OF THE EUKARYOTIC GENOME
MARKUS ENGSTLER
17.10.2023
https://www.jstor.org/stable/1629276?seq=1#metadata_info_tab_contents
https://www.biozentrum.uni-wuerzburg.de/zeb/research/topics/theodor-boveri/
Good question:
Why didn’t people believe Oswald Avery in 1944 and why did he never
get the Nobel prize?
https://www.nobelprize.org/prizes/themes/the-nobel-prize-in-physiology-or-medicine-1901-2000-2
DNA carries the genetic information
DNA as genetic material was disputed until Alfred Hershey and Martha
Chase did their “blender experiment” in 1952. What did they do?
DNA as genetic material was disputed until Alfred Hershey and Martha
Chase did their “blender experiment” in 1952. How did it work?
DNA contains linear information
Rosalind Elsie Franklin
https://www.sciencemag.org/careers/2018/08/rosalind-franklin-and-damage-gender-harassment
https://www.zeit.de/gesellschaft/2023-04/osalind-franklin-dna-entdeckung-nobelpreis-kriminalpodcast
DNA is packaged into chromosomes but remains accessible
Basics:
During the cell cycle DNA must be replicated, separated and partitioned.
Good question:
Chromosomes extended
Chromosomes condensed
Mitotic chromosome formation requires condensins
Schematic model of the S. cerevisiae condensin holo complex and 8.1-Å-resolution 3D map showing its overall architecture.
Flip-flop model of the condensin reaction cycle.
SMC2 SMC4
Large-scale conformational changes form the basis for the ability of condensin to translocate along DNA and to extrude DNA loops of kilobase pairs in length.
https://www.nature.com/articles/s41594-020-0457-x (2020)
[Figure: cross-section of a mitotic chromatid and the effects of condensin depletion and histone hyperacetylation]
Mitotic chromosomes form by condensin-mediated DNA loop extrusion and acetylation-regulated chromatin phase separation. Upper-left corner represents the cross-section of an unperturbed mitotic chromatid. Condensin enriches at a stiff axial core, surrounded by loops of immiscible, compact chromatin. At a local scale, the chromatin layer is fluid and forms a sharp phase boundary with the cytoplasm. Depletion of condensin disrupts chromatin morphology and stiffness without affecting the degree of chromatin compaction. Histone hyperacetylation induces swelling of the chromatin network, whereby additional depletion of condensin induces full dispersion of the chromatin fibre. DNA (black) and condensin (red) are shown. Dashed lines indicate degraded condensin.
required for the elastic stiffness of the chromosome axis [71]. Stiffness is instead established by condensin-mediated loop extrusion, which shapes chromosomes into cylindrical threads rather than the spheres expected of phase separation alone [12,34]. Thus, a combination of DNA loop extrusion and chromatin phase separation generates mitotic chromosomes that can resist microtubule pulling and pushing forces.

Regulation of chromosome surface properties
A model of immiscible chromatin in the mitotic cytosol raises the question of how single chromosomes can be maintained as separate entities, as phase-separated condensates have a natural tendency to minimise surface energy by fusion [72]. The protein Ki-67 has been identified to be essential in organising the chromosome periphery [73] and prevents chromosome coalescence by forming a repulsive molecular brush at the surface of mitotic chromosomes, a property characteristic of surface active agents (surfactants) [74]. Notably, the accumulation of Ki-67 at the chromosome surface depends on histone deacetylation [24], supporting a model where Ki-67 enriches at the phase boundary between immiscible chromatin and cytoplasm to function as a surfactant dispersing mitotic chromosomes (Figure 3).

In the immiscible mitotic state, chromatin excludes not only microtubules but also other cytoplasmic proteins and ribonucleoprotein complexes such as ribosomes, allowing pre-partitioning of cytoplasm from the assembling nucleus during mitotic exit [24,75,76]. The separation between emerging nuclear and cytosolic compartments is also regulated by Ki-67, which during mitotic exit loses its surfactant-like properties after collapse of its repulsive molecular brush structure. This leads to chromosome clustering, which sequesters cytoplasmic components during nuclear assembly [75] (Figure 3). Following chromosome clustering in late anaphase, the protein barrier to autointegration factor coats and cross-links the chromatids, such that the

3D Genome Chromatin Organization and Regulation (2023)
Current Opinion in Structural Biology 2023, 81:102617, www.sciencedirect.com
Figure 4. Tuning chromatin stiffness by loop extrusion processivity. (a) In interphase, cohesin forms loops that are interspersed with gaps. (b) Owing to gaps, interphase chromatin can be deformed by application of forces. (c) In mitosis, condensin forms arrays of consecutive, larger loops without large gaps in between them. (d) Mitotic chromatin is stiffer than interphase chromatin in response to pulling along the mitotic chromosome axis. DNA (black), cohesin (blue) and condensin (red) are shown. Arrows indicate direction of force.
between interphase loop extrusion and the regulation of genome expression and maintenance, are now beginning to be understood.

Conclusions and outlook
The expansion of biological studies to include concepts of polymer physics has yielded new perspectives in our understanding of chromatin structure and function. Chromatin material properties emerge from combinations of various governing principles. Dynamic condensin and cohesin cross-links organise the flexible chromatin fibre into a network, whose stiffness can be tuned at different cell cycle stages by changing levels of loop extrusion. Meanwhile, a regulated shift between soluble and insoluble chromatin states controls accessibility and generates surface tension, which may prevent inappropriate invasion of soluble factors or microtubules. Histone deacetylation appears to play a key role in driving this chromatin phase transition, with the precise regulatory pathways underlying this process during mitotic entry and exit still to be identified.

While condensin cross-links endow mitotic chromosomes with stiffness, additional factors are involved in mechanical stabilisation. In vitro manipulation of purified mitotic chromosomes suggests that histone methylation may modulate stiffness of the chromatin network [71], while further studies implicate both methylation and HP1α in imparting condensin-independent stiffness [102]. While the abundance of evidence supporting the role of condensins in mitotic chromosome formation puts this structural maintenance of chromosomes complex at the fore, the potential complementary roles of these and other factors will be interesting to explore. Piecing together the components involved in regulating chromatin material properties, and expanding upon models outlined in this review, will further our understanding of how core physical principles govern key genomic processes throughout the cell cycle.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
No data was used for the research described in the article.

Acknowledgements
The authors thank Antoine Coulon and Paul Batty for comments on the manuscript. Research in the laboratory of D.W.G. is supported by the Austrian Academy of Sciences, the Vienna Science and Technology Fund

The material properties of mitotic chromosomes (Spicer and Gerlich)

Regulation of the mitotic chromosome surface
The protein Ki-67 is essential in organising the chromosome periphery.

Figure 3. Regulation of the mitotic chromosome surface. In prometaphase and metaphase, Ki-67 forms a repulsive layer of molecular brushes on mitotic chromosomes, maintaining their separation and allowing independent motility on the mitotic spindle. In late anaphase, the molecular brushes collapse and Ki-67 promotes clustering of chromosomes to exclude large cytoplasmic particles. In telophase, barrier-to-autointegration factor (BAF) forms a network around each set of segregated chromosomes such that the reforming nuclear envelope enwraps them to form a single nucleus. The figure depicts only chromosomes and their surface regulators.
reassembling nuclear envelope enwraps entire sets of chromatids to form a single nucleus rather than fragmented micronuclei [77]. Dynamic control of the chromosome surface thereby allows distinct functions to be performed at different stages of mitosis.

Tuning chromatin elasticity and solubility during cell cycle progression
Once the genome has been equally distributed between two daughter cells, chromosomes decondense and acquire a specific morphology characteristic of interphase [78,79]. In contrast to the globally compacted state of mitotic chromosomes, interphase chromatin displays varied degrees of compaction. Transcriptionally active chromatin regions decondense after mitotic exit, while transcriptionally repressed constitutive heterochromatin retains high levels of compaction, perhaps through similar phase separation mechanisms to those observed in mitosis. Compact interphase chromatin may restrict the accessibility of nuclear components such as transcription factors through similar principles of phase separation and electrostatic repulsion as observed in mitosis [24,80,81]. In addition, it is important to note that interphase nuclei contain many other condensates

... between these condensates and the intrinsic phase separation properties of chromatin itself, and how these together contribute to nuclear function.

While volume phase transitions control chromatin accessibility, changes in DNA looping account for further differences between mitotic and interphase chromatin. Once mitosis is complete, condensins dissociate from chromatin and the binding of cohesin re-establishes interphase looping patterns [78,79]. Cohesin has a DNA looping activity similar to condensin [31,86,87] but has a shorter residence time on chromatin, resulting in the formation of smaller loops on interphase chromosomes compared to mitosis [39,88–93]. The less processive loop extrusion performed by cohesin results in a greatly altered interphase chromosome configuration, compared to that of mitosis. In contrast to the mitotic bottlebrush structure formed by consecutive loops extruded by condensins, interphase loops formed by cohesin are short-lived, regulated by boundaries imposed by the protein CTCF, and interspersed with gaps [90,94–97]. This altered cross-link distribution leads to a less constrained chromatin state than that observed in
Three DNA elements are essential for chromosomes
Eukaryotic chromosomes contain many replication origins. Why?
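One way to approach this question is with a rough calculation. The sketch below is only a back-of-the-envelope estimate and not part of the lecture material: the chromosome length (250 Mb, about the size of a large human chromosome), the fork speed (50 bp per second) and the function name replication_time_hours are assumptions chosen for illustration, and origins are assumed to be evenly spaced and to fire at the same time.

```python
# Back-of-the-envelope estimate; every number here is a rough, assumed value.
CHROMOSOME_LENGTH_BP = 250_000_000  # assumed: about the size of a large human chromosome
FORK_SPEED_BP_PER_S = 50            # assumed: order of magnitude for a eukaryotic replication fork


def replication_time_hours(length_bp, n_origins, fork_speed=FORK_SPEED_BP_PER_S):
    """Hours needed to copy a linear DNA molecule of length_bp.

    Simplifying assumptions: origins are evenly spaced and fire simultaneously.
    Each origin launches two forks running in opposite directions, so every
    fork has to cover length_bp / (2 * n_origins) base pairs.
    """
    if n_origins < 1:
        raise ValueError("at least one origin is required")
    distance_per_fork = length_bp / (2 * n_origins)
    return distance_per_fork / fork_speed / 3600


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        hours = replication_time_hours(CHROMOSOME_LENGTH_BP, n)
        print(f"{n:>5} origin(s): {hours:8.1f} h")
```

With these assumed numbers, a single origin would need roughly 29 days to copy the chromosome, whereas 1,000 origins bring this down to well under an hour. Real origins are neither evenly spaced nor fired all at once, so the figures only indicate the order of magnitude.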
Good question:
In Indian muntjacs, chromosomes have fused without large changes in gene number. How can this happen?
Mol Biol Evol (2000) 17 (9): 1326-1333. The molecular mechanism whereby the muntjac telomere and centromere repetitive sequences
induce frequent tandem fusions is unknown. …. By elucidating the driving force behind the tandem fusions, we may one day be able to
reconstruct the reduction process in laboratories.
https://academic.oup.com/mbe/article/17/9/1326/994705
ARTICLE
2022
https://doi.org/10.1038/s41467-021-27091-0 OPEN
Bingzhao Zhuo, Baowei Zhang, Jiang Chang, Haiyuan Qian, Yingmei Peng, Xianqing Chen, Lei Chen, Zhipeng Li, Qi Zhou ✉, Wen Wang ✉ & Fuwen Wei ✉
Muntjac deer have experienced drastic karyotype changes during their speciation, making it an ideal model for studying mechanisms and functional consequences of mammalian chromosome evolution. Here we generated chromosome-level genomes for Hydropotes inermis (2n = 70), Muntiacus reevesi (2n = 46), female and male M. crinifrons (2n = 8/9) and a contig-level genome for M. gongshanensis (2n = 8/9). These high-quality genomes combined with Hi-C data allowed us to reveal the evolution of 3D chromatin architectures during mammalian chromosome evolution. We find that the chromosome fusion events of muntjac species did not alter the A/B compartment structure and topologically associated domains near the fusion sites, but new chromatin interactions were gradually established across the fusion sites. The recently borne neo-Y chromosome of M. crinifrons, which underwent male-specific inversions, has dramatically restructured chromatin compartments, recapitulating the early evolution of canonical mammalian Y chromosomes. We also reveal that a complex structure containing unique centromeric satellite, truncated telomeric and palindrome repeats might have mediated muntjacs’ recurrent chromosome fusions. These results provide insights into the recurrent chromosome tandem fusion in muntjacs, early evolution of mammalian sex chromosomes, and reveal how chromosome rearrangements can reshape the 3D chromatin regulatory conformations during species evolution.
Chromosome numbers vary even between closely related species
[Figure: Fig. 1 of the Nature Communications article, https://doi.org/10.1038/s41467-021-27091-0 (2022).
Panel a: phylogenetic tree on a time axis in million years (-20 to 0), with diploid chromosome numbers: B. taurus 2n=60, R. tarandus 2n=70, H. inermis 2n=70, C. albirostris 2n=66, E. davidianus 2n=68, M. reevesi 2n=46, M. muntjak vaginalis 2n=6/7, M. gongshanensis 2n=8/9, M. crinifrons 2n=8/9. Legend: sequencing technologies (10X + Hi-C, Nanopore + Hi-C, PacBio + Hi-C, Nanopore) and event types (tandem fusion, Robertsonian fusion, fission); the fused H. inermis chromosome pairs are listed on the tree.
Panel b: effective population size (x10⁴) on a logarithmic axis from 10⁴ to 10⁷ for MCR, MGO, MMU and MRE.
Panel d: chromosome synteny between BTA, HIN and MRE.]
Fig. 1 Phylogeny, demographic histories, and distribution and chromosome synteny of muntjac species. a Maximum likelihood tree of muntjac and outgroup species with the respective sequencing technologies (red geometries), the divergence time (blue numbers) and number of chromosome fusion or fission events (red numbers) shown. Different combinations of black arrows represent different types of chromosome fusion and fission. The 31 fusion events leading to M. crinifrons are displayed in detail with the chromosome code (black numbers) of H. inermis, which are connected with the arrow mark on the phylogenic tree with dotted lines. Red hollow circles mark the nodes whose divergence times were used as calibration for estimating the divergence time among other species. b The demographic histories of M. reevesi (MRE), M. muntjak vaginalis (MMU), M. gongshanensis (MGO), and M. crinifrons (MCR) estimated by PSMC [37]. The gray box marks the time range of the Xixiabangma Glaciation (XG, 0.8-1.17 million years ago). c Topographic map on current geographic distribution of the four muntjac species. The colors of dashed line are consistent with the colors of distribution areas of a particular species. d The chromosome synteny between B. taurus, H. inermis, M. reevesi, female and male M. crinifrons with chromosome names shown above. 1p and 1q represent short arm and long arm of chromosome 1, respectively. The red line indicates the synteny blocks of female and male M. crinifrons in neo-Y inverted regions.

Who has the most chromosomes?
You (Homo sapiens): 46 chromosomes
Blue whale (Balaenoptera musculus): 48 chromosomes
NATURE COMMUNICATIONS | (2021)12:6858 | https://doi.org/10.1038/s41467-021-27091-0 | www.nature.com/naturecommunications
Chromatin is a complex of nuclear DNA and proteins
Chromatin at interphase = 30-nm thick threads
Nucleosomes become visible on experimentally decondensed chromatin fibers.
Are histones large or small molecules and are they charged?

This evidence suggests pairwise associations of the histones in chromatin but says nothing of details, such as whether the F2A1 and F3 pair, which occurs as an (F2A1)2(F3)2 tetramer in solution, also occurs as a tetramer in chromatin. The most direct evidence for an (F2A1)2(F3)2 tetramer in chromatin is that a complex formed from tetramers, F2A2-F2B oligomers, and DNA gives the same x-ray pattern as chromatin (Fig. 4, upper two traces). Tetramers and F2A2-F2B oligomers are both required to give the x-ray pattern (Fig. 4, lower two traces), but F1 is not, in keeping with previous observations (3, 23) that removing F1 from chromatin does not affect the x-ray pattern. Further implications of these

... chromatin than in solution, but cross-linked products up to pentamers are readily observed and call for further investigation.

References and Notes
1. Molecular weights are from R. J. DeLange and E. L. Smith, Accounts Chem. Res. 5, 368 (1972). Relative amounts of the histones are discussed in the accompanying article (24).
2. M. H. F. Wilkins, Cold Spring Harbor Symp. Quant. Biol. 21, 75 (1956); M. H. F. Wilkins, G. Zubay, H. R. Wilson, J. Mol. Biol. 1, 179 (1959); V. Luzzati and A. Nicolaieff, ibid. 7, 142 (1963).
3. B. M. Richards and J. F. Pardon, Exp. Cell Res. 62, 184 (1970).
4. J. F. Pardon and M. H. F. Wilkins, J. Mol. Biol. 68, 115 (1972).
5. P. A. Edwards and K. V. Shooter, Biochem. J. 114, 227 (1969).
6. R. Ziccardi and V. Schumaker, Biochemistry 12, 3231 (1973).
7. A. C. H. Durham, unpublished.
8. A. B. Barclay and R. Eason, Biochim. Biophys. Acta 269, 37 (1972).
17. Y. V. Ilyin, A. Ya. Varshavsky, U. N. Mickelsaar, G. P. Georgiev, Eur. J. Biochem. 22, 235 (1971).
18. R. J. DeLange, D. M. Fambrough, E. L. Smith, J. Bonner, J. Biol. Chem. 244, 5669 (1969).
19. L. Patthy, E. L. Smith, J. Johnson, ibid. 248, 6834 (1973).
20. G. S. Bailey and G. H. Dixon, ibid. p. 5463.
21. E. P. M. Candido and G. H. Dixon, Proc. Natl. Acad. Sci. U.S.A. 69, 2015 (1972).
22. S. C. Rall and R. D. Cole, J. Biol. Chem. 246, 7175 (1971).
23. K. Murray, E. M. Bradbury, C. Crane-Robinson, R. M. Stephens, A. J. Haydon, A. R. Peacocke, Biochem. J. 120, 859 (1970).
24. R. D. Kornberg, Science 184, 868 (1974).
25. G. Zubay and P. Doty, J. Mol. Biol. 1, 1 (1959).
26. S. Panyim, R. H. Jensen, R. Chalkley, Biochim. Biophys. Acta 160, 252 (1968).
27. O. H. Lowry, N. J. Rosebrough, A. L. Farr, R. J. Randall, J. Biol. Chem. 193, 265 (1951).
28. J. R. Pringle, Biochem. Biophys. Res. Commun. 39, 46 (1970).
29. K. Weber and M. Osborn, J. Biol. Chem. 244, 4406 (1969).
30. R. N. Perham and J. O. Thomas, FEBS (Fed.
Some facts:
These are known as the core histones. Histones are basic proteins that have
an affinity for DNA.
The histone core is disk-shaped
The fifth histone is called H1 and helps to pull nucleosomes together to form the 30-nm fiber.
H1 has a globular region and a pair of long tails at N- and C-terminus.
The globular region is possibly involved in constraining another 20 bp of DNA close to the nucleosome core.
The C-terminal tail binds to chromatin, but the position of both tails is not known.
Good question:
What is the function of the histone tails?
Article
Correspondence
mazubel@stanford.edu
In brief
Partial decondensation of mitotic
chromosomes and cryo-electron
tomography revealed the internal
structure of the chromosomal material.
Nucleosomes and linker DNA segments
between them were observed, permitting
chromatin fibers to be traced over
distances of 10 kb (50 nucleosomes) or
more. Patterns of coiling or folding in
specific gene regions can now be
determined.
Highlights
• Trajectories of chromatin fibers were irregular, unrelated to previous proposals of mitotic ...
• Condensation occurred without higher-order structure in the regions examined
tomogram revealed that 74.1% of linker DNA segments passed through multiple slices.

Attempts to segment chromatin fibers automatically using neural networks in EMAN2 were unsuccessful. Nucleosomes were readily located, but the linker DNA connecting them could not be unambiguously identified. Some 2,600 subtomograms, each containing a single nucleosome located by template matching, were extracted, aligned, and averaged. The resulting density map was then placed in the tomogram at the location of each nucleosome. Comparison with the results of manual docking showed close agreement in both nucleosome location and orientation (Figure S3A). However, the procedures employed for automatic segmentation of linker DNA failed to segment linker DNA passing through multiple slices of a tomogram, and even when the DNA was contained in a single slice of the tomogram, the identification was unreliable, with a variety of artifacts, such as false connectivity and dangling ends (Figure S3B).

Linker DNA was often seen to be curved or bent (Figure 5A). As expected from nuclease digestion analysis (Prunell and Kornberg, 1982), linker DNA varied in length. Distributions for regions analyzed from two chromosomes were centered on 43 ± 1 bp and 42 ± 2 bp (with dispersions of 16 and 19 bp) (Figure 5B), close to the value of 38 bp obtained by subtraction of the core particle DNA length of 146 bp from the nucleosome repeat length of 184 bp (measured by micrococcal nuclease digestion of HeLa metaphase chromosomes [De Ambrosis et al., 1987]). The core DNA length of 146 bp, corresponding to one and two-thirds turns around the nucleosome, is an intermediate in nuclease digestion and is reduced upon more extensive digestion to smaller sizes. The close agreement between our measurement by cryo-ET and the result from nuclease digestion shows that the 146-bp core particle, whose physiological relevance has never been established, is indeed representative of the actual extent of wrapping around the histone core of the nucleosome in chromosomes (but see below).

The length of linker DNA varied from one segment to the next. Long stretches with a consistent linker length were rare; among the 548 linkers analyzed, not more than six consecutive linkers were of the same length, within ±5 bp (Figure S4). There was no discernible pattern in the length difference between entry and exit linkers ("ΔL") for consecutive nucleosomes, except for
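The arithmetic behind these linker lengths is easy to check. The short sketch below simply recomputes the numbers quoted in the excerpt (146 bp core particle, 184 bp repeat length, cryo-ET distribution centres of 43 and 42 bp); the function and variable names are made up for illustration and do not come from the article.

```python
# Values quoted in the excerpt above.
CORE_PARTICLE_BP = 146        # DNA wrapped in the nucleosome core particle
REPEAT_LENGTH_BP = 184        # nucleosome repeat length (HeLa metaphase chromosomes)
CRYO_ET_CENTRES_BP = (43, 42) # centres of the two measured linker distributions


def linker_length(repeat_bp, core_bp):
    """Expected linker DNA per nucleosome: repeat length minus core DNA."""
    return repeat_bp - core_bp


def nucleosomes_in(fragment_bp, repeat_bp):
    """Approximate number of nucleosomes along a DNA fragment."""
    return fragment_bp / repeat_bp


if __name__ == "__main__":
    expected = linker_length(REPEAT_LENGTH_BP, CORE_PARTICLE_BP)
    print(f"expected linker from nuclease digestion: {expected} bp")
    for centre in CRYO_ET_CENTRES_BP:
        print(f"cryo-ET centre {centre} bp differs by {centre - expected} bp")
    count = nucleosomes_in(10_000, REPEAT_LENGTH_BP)
    print(f"a 10 kb stretch holds about {count:.0f} nucleosomes")
```

The subtraction gives the 38 bp stated in the text, and a 10 kb stretch at a 184 bp repeat corresponds to roughly 54 nucleosomes, in line with the order of magnitude of about 50 nucleosomes per 10 kb.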
Histone modifications may be inherited
After mitosis each daughter chromosome will contain two types of nucleosomes:
(1) Inherited and modified from parent chromosome (2) newly synthesized and not modified.
Modified histones are recognized and the modification is copied to the naked nucleosomes.
This is epigenetic inheritance that does not involve DNA.
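The copying step can be illustrated with a deliberately crude toy model. The sketch below is not a description of any real enzyme: it assumes that each parental nucleosome ends up on a given daughter chromosome with probability 0.5, that gaps are filled with unmodified nucleosomes, and that in each round an unmodified nucleosome acquires the mark if a direct neighbour carries it. All function names are hypothetical.

```python
import random


def replicate(marks, rng):
    """One daughter chromosome after replication (toy assumption).

    Each parental nucleosome is kept with probability 0.5; otherwise the
    position is filled by a newly synthesized, unmodified nucleosome (False).
    """
    return [mark if rng.random() < 0.5 else False for mark in marks]


def copy_marks(marks):
    """One round of copying: a nucleosome is modified afterwards if it, or
    one of its direct neighbours, carried the mark before the round."""
    n = len(marks)
    return [
        marks[i] or (i > 0 and marks[i - 1]) or (i < n - 1 and marks[i + 1])
        for i in range(n)
    ]


def restore(marks, rounds):
    """Apply several rounds of neighbour-to-neighbour copying."""
    for _ in range(rounds):
        marks = copy_marks(marks)
    return marks


if __name__ == "__main__":
    rng = random.Random(0)                 # fixed seed so the run is repeatable
    parent = [True] * 20                   # a fully modified domain of 20 nucleosomes
    daughter = replicate(parent, rng)
    print("after replication:", "".join("M" if m else "-" for m in daughter))
    restored = restore(daughter, rounds=len(daughter))
    print("after copying:    ", "".join("M" if m else "-" for m in restored))
```

In this toy model a domain that keeps at least one modified nucleosome is fully restored, while a domain that never carried the mark stays unmodified. The model has no boundaries, so a mark would also spread into neighbouring unmodified regions if they were included.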
https://www.nature.com/articles/cr201122
Andrew J Bannister, Tony Kouzarides
The Gurdon Institute and Department of Pathology, University of Cambridge, Cambridge CB2 1QN, UK

Chromatin is not an inert structure, but rather an instructive DNA scaffold that can respond to external cues to regulate the many uses of DNA. A principle component of chromatin that plays a key role in this regulation is the modification of histones. There is an ever-growing list of these modifications and the complexity of their action is only just beginning to be understood. However, it is clear that histone modifications play fundamental roles in most biological processes that are involved in the manipulation and expression of DNA. Here, we describe the known histone modifications, define where they are found genomically and discuss some of their functional consequences, concentrating mostly on transcription where the majority of characterisation has taken place.
Keywords: histone; modifications; chromatin
Cell Research (2011) 21:381-395. doi:10.1038/cr.2011.22; published online 15 February 2011

Introduction
Ever since Vincent Allfrey’s pioneering studies in the early 1960s, we have known that histones are post-translationally modified [1]. We now know that there are a large number of different histone post-translational modifications (PTMs). An insight into how these modifications could affect chromatin structure came from solving the high-resolution X-ray structure of the nucleosome in 1997 [2]. The structure indicates that highly basic histone amino (N)-terminal tails can protrude from their own nucleosome and make contact with adjacent nucleosomes. It seemed likely at the time that modification of these tails would affect inter-nucleosomal interactions and thus affect the overall chromatin structure. We now know that this is indeed the case. Modifications not only regulate chromatin structure by merely being there, but they also recruit remodelling enzymes that utilize the

... such as repair, replication and recombination.

Histone acetylation
Allfrey et al. [1] first reported histone acetylation in 1964. Since then, it has been shown that the acetylation of lysines is highly dynamic and regulated by the opposing action of two families of enzymes, histone acetyltransferases (HATs) and histone deacetylases (HDACs; for review, see reference [3]). The HATs utilize acetyl CoA as cofactor and catalyse the transfer of an acetyl group to the ε-amino group of lysine side chains. In doing so, they neutralize the lysine’s positive charge and this action has the potential to weaken the interactions between histones and DNA (see below). There are two major classes of HATs: type-A and type-B. The type-B HATs are predominantly cytoplasmic, acetylating free

... constitutive heterochromatin.
(a) Facultative heterochromatin consists of genomic regions containing genes that are differentially expressed through development and/or differentiation and which then become silenced. A classic example of this type of heterochromatin is the inactive X-chromosome present within mammalian female cells, which is heavily marked by H3K27me3 and the Polycomb repressor complexes (PRCs) [87]. This co-localization makes sense because the H3K27 methyltransferase EZH2 resides within the trimeric PRC2 complex. Indeed, recent elegant work has shed light on how H3K27me3 and PRC2 are involved in positionally maintaining facultative heterochromatin through DNA replication [88]. Once established, it seems that H3K27me3 recruits PRC2 to sites of DNA replication, facilitating the maintenance of H3K27me3 via the action of EZH2. In this way, the histone mark is ‘replicated’ onto the newly deposited histones and the facultative heterochromatin is maintained.
(b) Constitutive heterochromatin contains permanently silenced genes in genomic regions such as the centromeres and telomeres. It is characterised by relatively high levels of H3K9me3 and HP1α/β [87]. As discussed above, HP1 dimers bind to H3K9me2/3 via their chromodomains, but importantly they also interact with SUV39, a major H3K9 methyltransferase. As DNA replication proceeds, there is a redistribution of the existing modified histones (bearing H3K9me3), as well as the

... nucleosomes containing unmodified H3. Furthermore, this positive feedback mechanism helps to explain, at least in part, the highly dynamic nature of heterochromatin, not least its ability to encroach into euchromatic regions unless it is checked from doing so.

Euchromatin
In stark contrast to heterochromatin, euchromatin is a far more relaxed environment containing active genes. However, as with heterochromatin, not all euchromatin is the same. Certain regions are enriched with certain histone modifications, whereas other regions seem relatively devoid of modifications. In general, modification-rich ‘islands’ exist, which tend to be the regions that regulate transcription or are the sites of active transcription [86]. For instance, active transcriptional enhancers contain relatively high levels of H3K4me1, a reliable predictive feature [89]. However, active genes themselves possess a high enrichment of H3K4me3, which marks the transcriptional start site (TSS) [86, 90]. In addition, H3K36me3 is highly enriched throughout the entire transcribed region [91]. The mechanisms by which H3K4me1 is laid down at enhancers is unknown, but work in yeast has provided mechanistic detail into how the H3K4 and H3K36 methyltransferases are recruited to genes, which in turn helps to explain the distribution patterns of these two modifications (Figure 3). The scSet1
histones but not H3K4 methyltransferase
those already binds to the serine
deposited into chromatin.
deposition energy
of newly
derivedsynthesized histones
from the hydrolysis intotothe
of ATP repli- This
reposition 5 phosphorylated
class of HATs is highly CTDconserved
of RNAPII, and allthe initiating form
type-B
cated chromatin. SinceThe
nucleosomes. HP1 binds toof SUV39,
recruitment proteins and it complexes
is tempt- HATs of polymerase
share sequencesituated
homology atwith
the scHat1,
TSS [92]. In contrast, the
the found-
with specific
that enzymatic activities is nowaanfeedback
accepted ing member of this type of HAT. Type-B HATs acetylate
ing to speculate
Interplay of factors at an active gene in yeast
the proteins generate
dogma of how modifications mediate their function. As
scSet2 H3K36 methyltransferase
newly synthesized histone H4 at K5 and K12 (as well as
phosphorylated
binds to the serine 2
loop capablewe ofwillmaintaining
describe below,heterochromatin
in this way modificationspositioning
can in- certain sites within H3),CTD and thisof pattern
RNAPII, the transcriptional
of acetylation is
following DNA
fluence replication
transcription, [68]. In chromatin
but since other words, elongating
during important
is ubiquitous, form ofofpolymerase
for deposition [93].which
the histones, after Thus, the the two en-
DNA replication,
modificationsHP1also binds
affecttomany
nucleosomes
other DNA processesbearing marks zymes are are recruited
removed [4]. to genes via interactions with distinct
H3K9me2/3, thereby recruiting the SUV39 methyltrans- The type-A
forms HATs areand
of RNAPII, a moreit isdiverse familythe
therefore of location
en- of the
zymes than the type-Bs. Nevertheless, they can be classi-
ferase, which in turn methylates H3K9 in adjacent nu- fied different forms of RNAPII that defines where the modifi-
into at least three separate groups depending on ami-
Correspondence: Tony Kouzarides
Tel: +44-1223-334112; Fax: +44-1223-334089 no-acid sequence homology and conformational struc-
E-mail: t.kouzarides@gurdon.cam.ac.uk ture: GNAT, MYST and CBP/p300 families [5]. Broadly
Figure 3 Interplay of factors at an active gene in yeast (adapted from references [128] and [3]).
Trends in Biochemical Sciences
OPEN ACCESS Histone modifications 2023
Figure 3. Histone Modifications. Shown are chemical structures for various histone modifications that are expected to
change charge/sterics (top), hydrophobicity/sterics (bottom, left), or all upon conjugation of small proteins such as
ubiquitin or small ubiquitin-like modifier (SUMO) (bottom, right). These are all expected to alter tail/DNA interactions.
Abbreviation: PTM, post-translational modification.
(1) DNA as genetic material was disputed until Alfred Hershey and Martha
Chase did their “blender experiment” in 1952. What did they do?
Core histone tails and the fuzzy complex
https://www.cell.com/trends/biochemical-sciences/fulltext/S0968-0004(20)30324-8
Trends in Biochemical Sciences, OPEN ACCESS
Review

The core histone tails are critical in chromatin structure and signaling. Studies over the past several decades have provided a wealth of information on the histone tails and their interaction with chromatin factors. However, the conformation of the histone tails in a chromatin relevant context has remained elusive. Only recently has enough evidence emerged to start to build a structural model of the tails in the context of nucleosomes and nucleosome arrays. Here, we review these studies and propose that the histone tails adopt a high-affinity fuzzy complex with DNA, characterized by robust but dynamic association. Furthermore, we discuss how these DNA-bound conformational ensembles promote distinct chromatin structure and signaling, and that their fuzzy nature is important in transitioning between functional states.

Highlights
Eukaryotic DNA is wrapped around histone proteins to form nucleosomes that fold into higher-order chromatin structures, and the local chromatin structure regulates all DNA-templated processes.
All core histone proteins contain intrinsically disordered tail regions that protrude from the DNA-wrapped core and are known to be critical in chromatin regulation.
Recent studies have revealed that the core histone tails adopt multiple conformations on the nucleosomal and linker DNA; these tail/DNA interactions are robust, but exchange quickly between multiple conformations consistent with a so-called fuzzy complex.
Intra- versus inter-nucleosome contacts by the tails differentially contribute to the local chromatin state and thus regulation of DNA-templated processes.
Histone post-translational modifications and chromatin-associated factors can modulate these fuzzy conformational ensembles and tail accessibility, indicating that the tail/DNA interactions are an important regulatory mechanism of chromatin.

Glossary (excerpt)
[...] H3 either unmodified or methylated at Lys4, or bromodomains which recognize various acetylated lysines on histones.
Super helical location (SHL): a specific DNA helical turn within the nucleosome core particle; the major grooves facing the histone core are numbered +1 through +7 and -1 through -7 either direction starting from the dyad (which is denoted 0), and the minor grooves are numbered in half steps.

Histone Tails: Dynamic Hubs for Chromatin Signaling
The eukaryotic genome exists in the cell nucleus as chromatin, a complex between the genomic DNA and proteins known as histones. The most basic repeating unit of chromatin is the nucleosome core particle (NCP) (see Glossary), in which ~147 bp of DNA wrap around an octamer that contains two each of the core histone proteins H2A, H2B, H3, and H4 [1]. NCPs are flanked by linker DNA, which is of variable length (10–70 bp) depending on the local chromatin state [2]. The dynamic organization of these nucleosome particles, both spatially and temporally, is critical in regulation of the underlying genome and in the proper execution of all DNA templated processes [3]. Chromatin modulation is orchestrated by a slew of chromatin-associated proteins (CAPs). In addition, post-translational modifications (PTMs) on the histone proteins can directly regulate chromatin or indirectly regulate it through modulation of CAP activity [4].

Much effort has been placed on building an understanding of the structure and dynamics of nucleosomes and chromatin. Several near atomic resolution structures of NCPs have been solved, and lower resolution structures of nucleosome arrays and nucleosomes in complex with CAPs are more recently being tackled. Together, these have given us great insight into chromatin structure [5–8]. In addition, mechanisms of inherent nucleosome dynamics have been characterized including DNA breathing (i.e., spontaneous reversible unwrapping) of the DNA at the entry/exit points [9–11].
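To get a feeling for the numbers in the passage above, the following minimal Python sketch turns the quoted values (~147 bp of core DNA, 10–70 bp of linker DNA) into a repeat length, a nucleosome density per megabase and the fraction of DNA that sits in the core particle. The function names are hypothetical, and the model of exactly one linker per nucleosome is a simplifying assumption, not a statement from the review.

```python
# Values quoted in the text: ~147 bp wrap the histone octamer,
# linker DNA varies between 10 and 70 bp.
CORE_BP = 147
LINKER_RANGE_BP = (10, 70)


def repeat_length(linker_bp: int) -> int:
    """DNA per nucleosome repeat, assuming one linker per core particle."""
    return CORE_BP + linker_bp


def nucleosomes_per_megabase(linker_bp: int) -> float:
    """How many repeats fit into 1,000,000 bp of DNA."""
    return 1_000_000 / repeat_length(linker_bp)


def core_fraction(linker_bp: int) -> float:
    """Fraction of the DNA that is wrapped in the core particle."""
    return CORE_BP / repeat_length(linker_bp)


for linker in LINKER_RANGE_BP:
    print(
        f"linker {linker:2d} bp: repeat {repeat_length(linker)} bp, "
        f"{nucleosomes_per_megabase(linker):.0f} nucleosomes per Mb, "
        f"{core_fraction(linker):.0%} of DNA in the core"
    )
```

Under these assumptions a megabase of DNA carries roughly 4,600 to 6,400 nucleosomes, and between about two thirds and more than nine tenths of the DNA is wrapped around histones.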
Figure 1. Histone Composition and Nucleosome Structure. (A) Core histones H2A, H2B, H3, and H4. The core region is
represented by a rectangle flanked by the tail sequences. Shown are human sequences with residues that vary between organisms
(only in H2A and H2B) in italics. The positive residues are denoted by a (+). (B) Left, a crystal structure of the nucleosome core particle
(PDB ID 1AOI). Histones are shown in red and DNA in gray, with the H2A/H2B acidic patch residues shown as black
spheres. The super helical locations (SHLs) are marked with the negative SHLs italicized. Right, a crystal structure of
the chromatosome (PDB ID 5NL0). Histones are shown in red and DNA in gray, the globular domain of linker histone
H1 is shown in blue. (C) A model of the broad conformational ensemble adopted by the H3 tails in the context of the
nucleosomes. The tails are blurred to represent their dynamic exchange between states.
CAP = chromatin-associated protein
PTM = post-translational modification
https://www.cell.com/trends/biochemical-sciences/fulltext/S0968-0004(20)30324-8
Figure 2. Schematic Model of Histone Tail Contacts in Various Chromatin States and Regulatory Effects of
Tail–DNA Interactions. Histones are shown in red, DNA in gray, the H2A/H2B acidic patch as a black oval, and histone
post-translational modifications (PTMs) as cyan ovals or stars. The histone tails are blurred to represent dynamic
exchange within a broad conformational ensemble. Predicted inter-nucleosome interactions stabilizing the tetra-
nucleosome are shown. The intra- and inter-nucleosome interactions favored by compact and extended tails are shown.
A chromatin-associated protein (CAP) is shown with a star denoting a histone PTM binding pocket, and the effect of PTM
crosstalk on CAP binding is represented. The inhibitory effects of RNA polymerase II (RNA Pol II) and remodeler activity, as
well as the positive effect of tail–DNA weakening PTMs, are represented.
been shown to alter the conformational ensemble of the H3 and H4 tails on DNA [48]. Notably,
not all DNA-binding factors lead to increased accessibility of the tails. Binding of the linker histone
H1 to form the chromatosome (Figure 1B) was found to reduce H3 tail dynamics and accessi-
bility potentially stabilizing it on the linker DNA [28]. Similarly, a recent study found that the H3 tail
stabilizes an RNA–DNA triplex formed on the linker DNA of a nucleosome [67]. Thus, the acces-
sibility can be even more restricted by limiting the available conformational ensemble.
The weakening of tail/DNA interactions can also regulate machinery acting on the nucleosome
core independent of tail binding (Figure 2). For instance, the presence of histone tails decreases
progression of RNA polymerase II through a nucleosome. However, mutation of lysine residues to
mimic acetylation in the tails, which would weaken tail–DNA interactions, positively regulates RNA
polymerase II activity, enhancing progression [68,69]. Similar effects are seen with chromatin
Chromatin structure varies along an interphase chromosome
https://www.youtube.com/watch?v=OjPcT1uUZiE
Good question:
Available online at www.sciencedirect.com
Chromatin structure: does the 30-nm fibre exist in vivo?
Kazuhiro Maeshima1,2, Saera Hihara1,2 and Mikhail Eltsov3
1 Biological Macromolecules Laboratory, Structural Biology Center, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
2 Department of Genetics, School of Life Science, Graduate University for Advanced Studies (Sokendai), Mishima, Shizuoka 411-8540, Japan
3 European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69117 Heidelberg, Germany
Corresponding author: Maeshima, Kazuhiro (kmaeshim@lab.nig.ac.jp)
2010

[...] or cations, as described below.
In this review, in addition to a description of current progress in the field, we propose that the nucleosome fibres in nuclei or mitotic chromosomes exist in a highly disordered, interdigitated state, which is locally similar to a polymer melt with dynamic movements.

Introduction
The human body is made up of 60 trillion cells, each containing 2 m of genomic DNA in its nucleus. How is this genomic DNA organised into nuclei? Around 1880, W. Flemming discovered a nuclear substance that was clearly visible on staining under primitive light microscopes and named it 'chromatin'; this is now thought to be the basic unit of genomic DNA organisation [1]. Since long before DNA was known to carry genetic information, chromatin has fascinated biologists.
Deoxyribonucleic acid (DNA) has a negatively charged phosphate backbone that produces electrostatic repulsion between adjacent DNA regions, making it difficult for DNA to fold upon itself [2,3]. For the first level of folding, [...]

[...] some variations exist in this model [9], essentially nucleosomes are arranged in a zigzag manner, such that a nucleosome in the fibre is bound to the second neighbour, but not the first (Figure 2b and d).
In 2004, Richmond and co-workers found that their cross-linking study on nucleosomal arrays (12-nucleosome repeats) was in good agreement with the zigzag conformation of the two-start helix [10]. In addition, they succeeded in resolving the crystal structure of a tetranucleosome (four nucleosome cores) at a resolution of 9 Å (Figure 2d) [11]. Although the resolution of the structure is relatively low, they defined the positions of the linker DNA and nucleosomes in the fibre by replacing the coarse core region with the fine atomic structure of a nucleosome core particle [5]. Again, their results [...]
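The "2 m of genomic DNA" per nucleus quoted in the Maeshima introduction can be checked with a short back-of-the-envelope calculation. The sketch below is only an estimate: the haploid genome size (about 3.2 billion base pairs) and the rise per base pair of B-form DNA (0.34 nm) are assumed textbook values that do not appear in the excerpt, and the names are chosen for illustration.

```python
# Assumed textbook values (not from the excerpt):
HAPLOID_GENOME_BP = 3.2e9   # approximate size of the human haploid genome
RISE_PER_BP_NM = 0.34       # axial rise per base pair of B-form DNA

# Value from the excerpt:
CELLS_PER_BODY = 60e12      # "60 trillion cells"


def dna_length_m(base_pairs: float, rise_nm: float = RISE_PER_BP_NM) -> float:
    """Contour length of stretched-out double-stranded DNA in metres."""
    return base_pairs * rise_nm * 1e-9


diploid_length_m = dna_length_m(2 * HAPLOID_GENOME_BP)
body_length_m = CELLS_PER_BODY * diploid_length_m

print(f"DNA per diploid nucleus: {diploid_length_m:.2f} m")
print(f"DNA per human body:      {body_length_m:.2e} m")
```

With these assumptions a diploid nucleus holds a little over 2 m of DNA, which agrees with the figure in the text, and the whole body holds on the order of 10^14 m.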
(a, b) Under diluted conditions, the flexible nucleosome fibres may compact through selective close neighbour associations, forming the 30-nm
chromatin fibres. An increase in nucleosome concentration results in inter-fibre nucleosomal contacts, which interfere with the intra-fibre bonds (c).
The nucleosomes of adjacent fibres interdigitate and intermix. This disrupts the 30-nm folding and the nucleosomal fibres progress to a state of
‘polymer melt’ (d). (e) The concept of polymer melt implies dynamic polymer chains [40], that is, nucleosome fibres may be moving and rearranging
constantly. This may have several advantages in chromosome condensation and segregation during mitosis and the transcription and DNA replication
processes during interphase (see text). (f) ‘Chromatin liquid drop’: The transcriptional silencing can be established through a dynamic capturing of
transcriptional regions inside compact chromatin melt domains. These domains can be considered as drops of viscous liquid, which could be formed
by the nucleosome–nucleosome interaction and macromolecular crowding effect [42,43]. Active and inactive chromatins are shown in orange and
blue, respectively. Active chromatin regions are transcribed on the surfaces of the drops (shown in green).
[...] chromatin fibres required a strict cationic environment, namely a low-salt buffer containing 1–2 mM Mg2+; under such conditions, isolated nuclei or chromosomes become swollen. Accordingly, the local nucleosome concentration decreased and favoured intra-fibre nucleosome associations, leading to the formation of 30-nm chromatin fibres (Figure 3b). Furthermore, in conventional EM observations, the formation of 30-nm chromatin fibres [...]
Please cite this article in press as: Maeshima K, et al. Chromatin structure: does the 30-nm fibre exist in vivo?, Curr Opin Cell Biol (2010), doi:10.1016/j.ceb.2010.03.001
* The exabyte is a multiple of the unit byte for digital information. The prefix exa indicates multiplication by the sixth
power of 1000 (10^18) in the International System of Units (SI). Therefore, one exabyte is one quintillion bytes (short
scale). The symbol for the exabyte is EB.
* one quintillion (10^18) bytes, one billion gigabytes, one million terabytes, one thousand petabytes
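The unit relations in the footnote can be verified in a few lines. This is a minimal sketch using decimal SI prefixes only; the dictionary and function names are hypothetical.

```python
# Decimal SI prefixes as powers of ten (exa = 1000**6 = 10**18).
SI_EXPONENT = {"kilo": 3, "mega": 6, "giga": 9, "tera": 12, "peta": 15, "exa": 18}


def bytes_in(prefix: str) -> int:
    """Number of bytes in one <prefix>byte, e.g. bytes_in('giga') == 10**9."""
    return 10 ** SI_EXPONENT[prefix]


def one_exabyte_in(prefix: str) -> int:
    """How many <prefix>bytes make up one exabyte."""
    return bytes_in("exa") // bytes_in(prefix)


print(bytes_in("exa"))            # bytes in one exabyte
print(one_exabyte_in("giga"))     # gigabytes per exabyte
print(one_exabyte_in("tera"))     # terabytes per exabyte
print(one_exabyte_in("peta"))     # petabytes per exabyte
```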
Unlike most digital storage media, DNA storage is
not restricted to a planar layer and is often
readable despite degradation in non-ideal
conditions over millennia.
2021
ARTICLES
https://doi.org/10.1038/s41563-021-01021-3
DNA is an ultrahigh-density storage medium that could meet exponentially growing worldwide demand for archival data stor-
age if DNA synthesis costs declined sufficiently and if random access of files within exabyte-to-yottabyte-scale DNA data pools
were feasible. Here, we demonstrate a path to overcome the second barrier by encapsulating data-encoding DNA file sequences
within impervious silica capsules that are surface labelled with single-stranded DNA barcodes. Barcodes are chosen to repre-
sent file metadata, enabling selection of sets of files with Boolean logic directly, without use of amplification. We demonstrate
random access of image files from a prototypical 2-kilobyte image database using fluorescence sorting with selection sensitivity
of one in 10^6 files, which thereby enables one in 10^(6N) selection capability using N optical channels. Our strategy thereby
offers a scalable concept for random access of archival files in large-scale molecular datasets.
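The idea of selecting files by Boolean logic on their barcodes can be illustrated in software. The sketch below is an analogy, not the authors' method: Python sets stand in for the barcodes on each capsule, the file names and labels are invented for illustration, and the pool-size function simply evaluates the 10^(6N) scaling stated in the abstract.

```python
# Hypothetical mini database: file name -> barcode labels on its capsule.
FILES = {
    "cat_orange.bmp": {"cat", "orange"},
    "cat_black.bmp": {"cat", "black"},
    "tiger.bmp": {"wild", "orange"},
    "house.bmp": {"building"},
}


def select(files, all_of=(), any_of=(), none_of=()):
    """Return the names of files whose barcodes satisfy the query.

    all_of  -> every listed barcode must be present (AND)
    any_of  -> at least one listed barcode must be present (OR); ignored if empty
    none_of -> no listed barcode may be present (NOT)
    """
    selected = set()
    for name, barcodes in files.items():
        has_all = all(b in barcodes for b in all_of)
        has_any = not any_of or any(b in barcodes for b in any_of)
        has_none = not any(b in barcodes for b in none_of)
        if has_all and has_any and has_none:
            selected.add(name)
    return selected


def addressable_pool_size(n_channels: int, per_channel: int = 10**6) -> int:
    """Pool size from which one file can be picked: (10^6)^N = 10^(6N)."""
    return per_channel ** n_channels


print(select(FILES, all_of=["cat"], none_of=["orange"]))
print(addressable_pool_size(2))
```

A query such as "cat AND NOT orange" here corresponds to sorting for capsules that light up in one fluorescence channel but not in another.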
While DNA is the polymer selected by evolution for the storage and transmission of genetic information in biology, it can also be used for the storage of arbitrary digital information at densities far exceeding conventional data storage technologies such as flash and tape memory, at scales well beyond the capacity of the largest existing data centres [1,2]. Recent progress in nucleic acid synthesis and sequencing technologies continues to reduce the cost of writing and reading DNA, foreshadowing future commercially competitive DNA-based information storage [1,3–5]. Demonstrations of its viability as a general information storage medium include the storage and retrieval of books, images, computer programs, audio clips, works of art and Shakespeare's sonnets using a variety of encoding schemes [6–12], with data size limited primarily by the cost of DNA synthesis. In each case, digital information was converted to DNA sequences composed of ~100–200 nucleotide data blocks for ease of chemical synthesis and sequencing. Sequence fragments were then assembled to reconstruct the original, encoded information.

While considerable effort in DNA data storage has focused on increasing the scale of DNA synthesis, as well as improving encoding schemes, an additional crucial aspect of data storage systems is the ability to efficiently retrieve specific files or arbitrary subsets of files. To date, molecular random access has largely relied on conventional polymerase chain reaction (PCR) [8,10,12,13], which uses up to ~20–30 heating and cooling cycles with DNA polymerase to selectively amplify specific DNA sequences from a DNA data pool using primers. Nested addressing barcodes [14–16] have also been used to uniquely identify a greater number of files, as well as biochemical affinity tags to selectively pull down oligos for targeted amplification [17].

While powerful demonstrations of PCR have shown successful file retrieval from a 150 GB file system [18], notable limitations include, first, the length of DNA needed to uniquely label DNA data strands for file indexing, which reduces the DNA available for data storage. For example, for an exabyte-scale data pool, each file requires at least three barcodes [17], or up to sixty nucleotides in total barcode sequence length, thereby reducing the number of nucleotides that can be used for data encoding. Second, PCR-based retrieval requires an aliquot of the entire data pool to be irreversibly consumed for random access, and therefore additional PCR amplification of the entire data pool may periodically be needed to restore this loss of data. In this case, each PCR amplification may introduce stochastic variation in copy number of the file sequences, leading to up to 2% data loss per amplification [19] if using tenfold physical redundancy, as recently suggested [18]. Finally, avoiding spurious amplification of off-target files due to crosstalk of PCR primers with incorrect barcodes or main file sequences requires careful primer design [20]. While strategies exist to circumvent these preceding challenges, they generally reduce data density and might not be easily scalable to exabyte and larger file systems. For example, data loss due to periodic PCR amplification of the entire data pool [19] may be reduced by increasing the physical redundancy of the files in the main data pool, and PCR crosstalk can be mitigated by spatial segregation of data into distinct pools [21] or extraction of selected DNA using biochemical affinity [17,22].

As an alternative to PCR-based approaches, here we introduce a direct random access memory approach that retrieves specific files, or arbitrary subsets of files, directly using physical sorting, without a need for amplification, and without any potential for barcode–memory crosstalk, while also preserving non-selected files intact by recycling them into the original memory pool. To realize this file system, we first encapsulate DNA-based files physically within discrete, impervious silica capsules [9,23,24], which we subsequently surface-label with unique single-stranded DNA barcodes that offer Boolean-logic-based selection on the entire data pool via simple hybridization. Downstream file selection may then be optical, physical or biochemical, with sequencing-based read-out following de-encapsulation of the memory DNA from the silica capsule.

Each 'unit of information' encoded in DNA we term a 'file', which includes both the DNA encoding the main data as well as any additional components used for addressing, storage and retrieval. Each file contains a 'file sequence', consisting of the DNA encoding the main data, and 'addressing barcodes', or simply 'barcodes', which are additional short DNA sequences used to identify the file in solution
2021
NATURE MATERIALS ARTICLES
[Fig. 1 graphic. Recoverable labels: (i) Data (image database); (ii) Writing and storing (encapsulation, metadata tagging with labels such as 'cat' and 'orange', other files, molecular file database); (iii) Random access (encapsulation-based random access); (iv) Reading (DNA sequencing, data reconstruction); (v) Copying (bacterial transformation).]
Fig. 1 | Write–access–read cycle for a content-addressable molecular file system. a, A general framework for DNA data storage that uses PCR-based
random access and its associated challenges. b, We demonstrate here an alternative encapsulation-based file system that allows for scalable indexing
and Boolean logic selection and retrieval. Coloured images were converted into 26 × 26 pixel, black-and-white icon bitmaps. The black-and-white images
were then converted into DNA sequences using a custom encoding scheme (Methods). The DNA sequences that encoded the images (file sequences)
were inserted into a pUC19 plasmid vector and encapsulated into silica particles using sol–gel chemistry. Silica capsules were then addressed with content
barcodes using orthogonal 25-nucleotide ssDNA strands, which were the final forms of the files. Files were pooled to form the molecular file database. To
query a file or several files, fluorescently labelled 15-nucleotide ssDNA probes that were complementary to the file barcodes were added to the data pool.
Particles were then sorted with FAS using two to four fluorescence channels simultaneously. Files that were not selected were returned to the molecular
database. Addition of a chemical etching reagent into the target populations released the encapsulated DNA plasmid. Sequences for the encoded images
were validated using Sanger sequencing or Illumina MiniSeq. Because plasmids were used to encode information, retransformation of the released plasmids
into bacteria to replenish the molecular file database thereby closed the write–access–read cycle.
https://www.nature.com/articles/s41563-021-01021-3
using hybridization. We refer to a collection of files as a 'data pool' or 'database', and the set of procedures for storing, retrieving and reading out files is termed a 'file system' (Supplementary Section 0 for a full list of terms).

As a proof-of-principle of our archival DNA file system, we encapsulated 20 image files, each composed of a ~0.1 kilobyte image file encoded in a 3,000-base-pair plasmid, within monodisperse, 6 µm silica particles that were chemically surface labelled using up to three 25-nucleotide single-stranded DNA (ssDNA) oligonucleotide barcodes chosen from a library of 240,000 orthogonal primers [20], which allows for individual selection of up to ~10^15 possible distinct files using only three unique barcodes per file (Fig. 1). While we chose plasmids to encode DNA data in order to produce microgram quantities of DNA memory at low cost and to facilitate a renewable, closed-cycle write–store–access–read system using bacterial DNA data encoding and expression [25–28], our file system is equally applicable to ssDNA oligos produced using solid-phase chemical synthesis [2,6,7,9–12,17] or gene-length oligos produced enzymatically [29–32]. Fluorescence-activated sorting (FAS) was then used to select target subsets of the complete data pool by first annealing fluorescent oligonucleotide probes that are complementary to the barcodes used to address the database [33], enabling direct physical retrieval of specific, individual files from a pool of 10^(6N) total files, where N is the number of fluorescence channels employed, without enzymatic amplification or associated loss of nucleotides available for data encoding. We also demonstrate Boolean AND, OR, NOT logic to select arbitrary subsets of files with combinations of distinct barcodes to query the data pool, similar to the conventional Boolean logic applied in text and file searches on solid-state silicon devices.

While only 20 icon-resolution images were chosen as our image database, representing diverse subject matter including animals, plants, transportation and buildings (Supplementary Fig. 1), our file system may in principle be scaled to considerably larger sets of images, limited primarily by the cost of DNA synthesis and the need to develop strategies for high-throughput silica encapsulation of distinct file sequences and surface-based DNA labelling for barcoding (Supplementary Fig. 1). Because physical encapsulation separates file sequences from external barcodes that are used to describe the encapsulated information, our file system offers long-term environmental protection of encoded file sequences via silica encapsulation for permanent archival storage [9,23,24], where external barcodes may be renewed periodically, further protected with secondary encapsulation, or data pools may simply be stored using methods implemented in PCR-based random access, such as dehydrating the data pool and immersing the dried molecular database in oil [21].
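The statement that three barcodes drawn from a library of 240,000 can address on the order of 10^15 distinct files is a counting argument. The sketch below assumes that a file is identified by an unordered set of three different barcodes; this counting model is an assumption made for illustration, not a detail taken from the paper.

```python
import math

LIBRARY_SIZE = 240_000      # orthogonal barcode sequences, from the text
BARCODES_PER_FILE = 3       # barcodes attached to each file, from the text

# Assumption: a file is addressed by an unordered set of distinct barcodes,
# so the number of addresses is the binomial coefficient C(240000, 3).
distinct_addresses = math.comb(LIBRARY_SIZE, BARCODES_PER_FILE)

print(f"{distinct_addresses:,} possible barcode triples")
print(f"order of magnitude: 10^{len(str(distinct_addresses)) - 1}")
```

Under this assumption the count is about 2.3 × 10^15, consistent with the "~10^15" quoted in the excerpt.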
Currently it would cost €1 trillion to write one petabyte of data (1 million gigabytes)
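The price quoted on the slide is easier to grasp when scaled down to familiar file sizes. This is a minimal sketch that takes the slide's figure (€1 trillion per petabyte, with trillion read as 10^12) at face value and assumes that cost scales linearly with data volume; the names are hypothetical.

```python
from fractions import Fraction

EUR_PER_PETABYTE = 10 ** 12   # "€1 trillion", from the slide
BYTES_PER_PETABYTE = 10 ** 15


def writing_cost_eur(n_bytes: int) -> Fraction:
    """Cost of writing n_bytes into DNA, assuming linear scaling."""
    return Fraction(EUR_PER_PETABYTE * n_bytes, BYTES_PER_PETABYTE)


for label, size in [("1 kilobyte", 10 ** 3), ("1 megabyte", 10 ** 6), ("1 gigabyte", 10 ** 9)]:
    print(f"{label}: {float(writing_cost_eur(size)):,.0f} EUR")
```

At this rate a single kilobyte costs about one euro to write, and a gigabyte about one million euros.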
The flow of information from DNA to protein occurs in all cells
Do you remember?
3x3=9
possible direct transfers of information between 3 polymer classes
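The count 3 × 3 = 9 comes from pairing each of the three polymer classes, as source, with each of the three, as target. The sketch below enumerates the pairs and sorts them into the three classes used in Crick's 1970 article (general, special, unknown). The three general transfers are the ones named in this lecture's excerpt; the assignment of the remaining six follows Crick's published classification and is stated here as background, and the function name is hypothetical.

```python
from itertools import product

POLYMERS = ("DNA", "RNA", "protein")

# Classification after Crick (1970).
GENERAL = {("DNA", "DNA"), ("DNA", "RNA"), ("RNA", "protein")}
SPECIAL = {("RNA", "RNA"), ("RNA", "DNA"), ("DNA", "protein")}


def classify(transfer):
    """Label a (source, target) pair as general, special or unknown."""
    if transfer in GENERAL:
        return "general"
    if transfer in SPECIAL:
        return "special"
    return "unknown"   # every transfer that starts from protein


# All ordered (source, target) pairs of the three polymer classes.
transfers = list(product(POLYMERS, repeat=2))

for source, target in transfers:
    print(f"{source:>7} -> {target:<7} {classify((source, target))}")
print(len(transfers), "possible transfers")
```

The three "unknown" pairs are exactly those with protein as the source, which is the content of the central dogma: sequence information does not flow out of protein.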
[...] Laboratory of Molecular Biology [...]
[...] that such information cannot be transferred from protein to either [...] important today as when it was first proposed.

"The central dogma, enunciated by Crick in 1958 and the keystone of molecular biology ever since, is likely to prove a considerable over-simplification."
[...] very important work of Dr Howard Temin and others showing that an RNA tumour virus can use viral RNA as a template for DNA synthesis. This is not the first time that the idea of the central dogma has been misunderstood, in one way or another. In this article I explain why the term was originally introduced, its true meaning, and state why I think that, properly understood, it is still an idea of fundamental importance.
The central dogma was put forward at a period when much of what we now know in molecular genetics was not established. All we had to work on were certain fragmentary experimental results, themselves often rather uncertain and confused, and a boundless optimism that the basic concepts involved were rather simple and [...]

[...] analogous to thymine in DNA, thus giving four standard symbols for the components of nucleic acid.
The principal problem could then be stated as the [...] This could be compactly represented by the diagram of Fig. 1 (which was actually drawn at that time, though I am not sure that it was ever published) in which all possible simple transfers were represented by arrows. The arrows do not, of course, represent the flow of matter but the directional flow of detailed, residue-by-residue, sequence information from one polymer molecule to another.
Now if all possible transfers commonly occurred it would have been almost impossible to construct useful theories. Nevertheless, such theories were part of our everyday discussions. This was because it was being tacitly assumed that certain transfers could not occur. It occurred to me that it would be wise to state these [...]

[...] certainly existed, class II was probably rare or absent, and that class III was very unlikely to occur. The decision had to be made, therefore, whether to assume [...] class II should not be impossible. In fact, for all we knew, the replication of all RNA viruses could have gone by way of a DNA intermediate. On the other hand, there were good general reasons against all the three possible transfers in class III. In brief, it was most unlikely, for stereochemical reasons, that protein→protein transfer could be done in the simple way that DNA→DNA transfer was envisaged. The transfer protein→RNA (and the analogous protein→DNA) would have required (back) translation, that is, the transfer from one alphabet to a structurally quite different one. It was realized that forward translation involved very complex machinery. Moreover, it seemed unlikely on general grounds that this machinery could easily work backwards. The only reason- [...]

Fig. 3. A tentative classification for the present day. Solid arrows show general transfers; dotted arrows show special transfers. Again, the [...]

[...] called general transfers, special transfers and unknown transfers.

General and Special Transfers
A general transfer is one which can occur in all cells. The obvious cases are
DNA→DNA
DNA→RNA
RNA→Protein
Minor exceptions, such as the mammalian reticulocyte, which probably lacks the first two of these, should not exclude.
probably much tho same in all living things. In such e vptions explicitly. eble alt.eLtive wua &8t the oell had evolved en’ entirely A speoiel mfer ie one which ,doee not CMXXU in most
m
‘“$,$~~~&(r~).
situation web constructed theories ten phby 8 IY&]Y useful sepmate det of complicated machinery for back tranalstion, cells, but may occur in ape&l owoum&nce& Possible
‘0
fn afeting problems olearly and thun guiding experi- and of t&s there wsa no trace, and no reason to believe calldidttb are
thatitmightbeneeded. J
’ Flekhmm,
‘C!owmwr,
that in spite of the miscellaneous list of amino-ecids The& t&e the three them which the central dogma
found in proteins (as then ,&en in uU biochemical text- postulates never occur:
books) some of them, such as phosphoserine, were second- Proteii4Protein
ary modi5catioy ; and that there was probably a universal r’ _’ .i ProG.u+DNA
1970
minor moditlcations to the nuclei0 a&d bseee were ignored; about the rata at which&e ~~~MSHS work. ;
urecil in ‘RNA w&r considered to be informationdly dividedroughlyintotbree~upe.’ The5riftglwupwM (3) It was intended * 8pply only to presentday Stated in this w8y it is aleer ‘&&the epeei8l trum5ferr
6
those for whioh some e&denoe; direct or’indimat, 8BB668d orj@sme,andnottoevent8intheremotepast,suchae are those 8boUt WhiOh there is the &St unc&%inty. It
toexist.’ These’arenhownbythesolid~ in Fig. 3. the origin of l.iSeor the origin of the code. might indeed hsve “profo* impliostioas for moleoul6r
Theywere: :. : ‘;.
AUGUST
(4) Itianotthe6ame,aeit3oommonly tbsaumed, as the biology”1 if any of thm speciel t&bra could be ahoWn
I (a) DNA-+DNA aeq- hypothesis, which K(LB clearly distii t.0 be general, or-if not in all oel’lt I& fo’be’widely
DNA I (b) DtiA+RNA ~ fkom it in the aune srtiole4. In p&rticular the sequence ‘distributed. .8o liir, howetier, there L no evidenca for the
I (a) RNA~Protein’ hypothe&. ww IL poeitive titement, ss$ng that the iht two of the& except in a 041 infected with an RNA
. (over&l) tmnafbr nucleia aoid+protein did exist, whereas virus. In such e 0011the central dognia demanda that at
transfer.
I (d) RNAr’RNAl the central dogma wea a negative statement, saying that least one of the flmt two speaial frenttfem pbould oaour-
227
The la& of them tr&&ms WM presumed to ooaur~beo&isd trsn&m kom protein did not exist. this statement, iniridentally, shows the power of the
/!+i of the existence of RNA Grus~~. In looking +k I am struck not only bi the brashness central doemb in making theqretical predictions. Nor, (u
tih allowed UB to venture pore&l statements of a I have in&cat& is there any good theoretical reason why
VOL.
wed: ,,. Time br shown that not everybody appreciated our I know, have any of-my oolleaguee~
G the mferan& “6 +Yg* A.&&j I rwtMint. Although the detedn of the &am&&m $oF here
NATURE
Fly. 1. The umm rbor all ‘he poiulbb rtmpb truufm between Un n :(a$, &&2~~~ ( 80 muoh for the h of the subject. whst of the are plausible, our knowled@ of moleaular IO egg, even
lhree fmnuJa of polYmfua. They rwraurt the dh3uonal aor of
dewhxl qlluJIon 1nrormAon. ,II (6) !,DNA+Rotein ’ prmmt 1 I think it llltor i8 o Ed that the old alassiiloetion, in one oell-let alone for all the organimmr~.in natu~+-
though weful at the time, oould be improved, and I is atill fiu too inoompleta ti ally ua to amert d~tically
e that the nine ible traders bs regrouped that it in oormat. (There is, for exam le, the problem of
tmWively into three oiE6 . 1 propose that theee be the &emical nature of the vt of tfl e disecrre sar&ea:
The flow of information from DNA to protein occurs in all cells
In fungi this change happens from one generation to the next, i.e. Protein → Protein.
[Slide image: F. Crick, "Central Dogma of Molecular Biology", Nature, vol. 227, 8 August 1970, with the diagram of information transfers between DNA, RNA and protein. The three transfers that the central dogma postulates never occur are Protein → Protein, Protein → DNA and Protein → RNA.]
RNA polymerase II transcribes the DNA
TBP-DNA complex
REVIEWS: FOCUS ON TRANSCRIPTION
https://www.sciencedirect.com/science/article/pii/S0092867412006976?via%3Dihub https://ars.els-cdn.com/content/image/1-s2.0-S0092867412006976-mmc1.mov
Pol II is depicted in silver, initiation factors in different shades of green, and elongation factors
in different shades of orange. The DNA template strand is in dark blue, the nontemplate strand
in light blue, and the RNA in red.
https://www.sciencedirect.com/science/article/pii/S0092867412006976?via%3Dihub https://ars.els-cdn.com/content/image/1-s2.0-S0092867412006976-mmc1.mov
Binding of the 10-subunit Pol II core to the Pol II
subcomplex Rpb4/7 generates the complete, 12-subunit
enzyme.
DNA melting commences above the active center cleft, 20 base pairs
downstream of the TATA box.
DNA melting allows the template single strand to reach the active site and the
downstream DNA duplex to bind near the jaws of Pol II.
Backtracking and arrest can occur upon
attempts to elongate through specific DNA
sequences or a nucleosome.
https://www.youtube.com/watch?v=WsofH466lqk
https://www.ncbi.nlm.nih.gov/pubmed/11909516
The different steps in the pathway from gene to protein have traditionally been viewed as independent events, with each going to completion before the next begins.
Recent findings suggest that each step regulating gene expression is a subdivision of a continuous process. Each stage is physically and functionally connected to the next, ensuring that there is efficient transfer between manipulations and that no individual step is omitted.
Good question:
What is the message of the two schemes?
Which is faster: transcription or translation?
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3828032/
Ribosome profiling
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3225288/
N. Ingolia et al., Cell, 147:789, 2011
original papers
www.laskerfoundation.org
original papers
https://www.nature.com/articles/171737a0
https://www.nature.com/articles/171964b0
The race to crack the genetic code
Fundamental questions:
How is it replicated?
How is it translated?
Constraint at the time: no DNA sequencing.
The race to crack the genetic code
There are 20 kinds of amino acids in proteins but only four kinds of nucleotide
bases in DNA.
A one-to-one mapping from bases to amino acids cannot work: 4^1 = 4 codewords.
A two-to-one mapping cannot work either, since there are only 4^2 = 16 doublets of bases.
A three-to-one mapping could work: 4^3 = 64 triplets.
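The counting argument above can be checked in a few lines. This is a minimal Python sketch; the function names `codewords` and `minimal_codon_length` are made up for illustration.

```python
def codewords(alphabet_size, length):
    """Number of distinct words of a given length over an alphabet."""
    return alphabet_size ** length


def minimal_codon_length(n_symbols, alphabet_size):
    """Smallest word length that provides at least n_symbols distinct words."""
    length = 1
    while codewords(alphabet_size, length) < n_symbols:
        length += 1
    return length


# Four bases, word lengths 1 to 3.
for n in (1, 2, 3):
    print(n, codewords(4, n))

# Shortest codon length that can encode 20 amino acids with 4 bases.
print(minimal_codon_length(20, 4))
```

With 4 bases and 20 amino acids the shortest workable word length is 3, which leaves 64 - 20 = 44 triplets to be explained. That surplus is what the later coding schemes tried to account for.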
Less than a year after James Watson and Francis Crick discovered the molecular
structure of DNA, George Gamow, a professional physicist and amateur biologist,
proposed the first definite coding scheme for DNA.
In this letter to microbiologist Martynas Ycas, Gamow discusses his idea for the code.
https://www.genetics.org/content/211/3/789
The race to crack the genetic code
https://www.nature.com/articles/173318a0
Even with the sparse protein sequence data available in the mid-1950s, Crick was
able to show that the diamond code was ruled out by the experimental evidence.
There were known patterns of amino acid repetitions that the diamond code could
not produce.
The race to crack the genetic code
Gamow founded the "RNA Tie Club", limited to 20 regular members (one for each of the 20 amino acids) and
four honorary members (one for each of the 4 nucleotide bases).
Shown on the slide: Orgel, Crick, Watson, Rich
By the late 1950s, there was growing support for the idea of messenger RNA:
a single-stranded molecule acting as an intermediary between DNA and the
protein-synthesizing machinery.
At the same time Crick was formulating the "adaptor hypothesis," the idea that
amino acids do not interact directly with messenger RNA but are carried by
small molecules that recognize specific codons. The codons were by then
thought to be non-overlapping triplets of bases.
The frame-shift problem doesn't arise with an overlapping code, because all
three reading frames are simultaneously valid.
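The difference between reading one frame and reading all frames at once can be illustrated with a short Python sketch. The sequence and the function names `reading_frame` and `overlapping_triplets` are hypothetical and chosen only for this example.

```python
SEQUENCE = "AGUCCAGGA"  # arbitrary example sequence


def reading_frame(sequence, offset):
    """Non-overlapping triplets of the sequence, starting at offset 0, 1 or 2."""
    return [sequence[i:i + 3] for i in range(offset, len(sequence) - 2, 3)]


def overlapping_triplets(sequence):
    """Every triplet window, as a fully overlapping code would read them."""
    return [sequence[i:i + 3] for i in range(len(sequence) - 2)]


for offset in range(3):
    print(offset, reading_frame(SEQUENCE, offset))

print(overlapping_triplets(SEQUENCE))
```

A fully overlapping code reads every window, so its triplets are the union of the three frames and there is no frame to lose. A non-overlapping code has to pick one of the three frames, which is where the frame-shift problem comes from.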
[Figure: a sequence of codons read in three frames. Only the correct frame makes sense; the two shifted frames read as nonsense.]
The race to crack the genetic code
The codons AAA, CCC, GGG and UUU cannot appear in any
comma-free code, since they cannot combine with themselves without
generating reading-frame ambiguity.
The remaining 60 codons can be sorted into groups of three, where
the codons within each group are related by a cyclic permutation
(e.g. AGU, GUA and UAG).
A comma-free code can have no more than one codon from each of
these permutation classes.
Dividing 60 objects into groups of three produces exactly 20 groups,
so a comma-free triplet code can contain at most 20 codons.
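The argument above can be verified by brute force. The Python sketch below groups the 60 usable codons into cyclic-permutation classes and then checks one 20-codon code for comma-freeness. The construction in `example_code` (all codons xyz with x < y >= z under an arbitrary ordering of the bases) is one known way to reach the maximum and is supplied here for illustration; it is not taken from the slide, and all function names are assumptions.

```python
from itertools import product

BASES = "ACGU"


def rotations(codon):
    """Return the set of cyclic permutations of a triplet."""
    return {codon[i:] + codon[:i] for i in range(3)}


def permutation_classes():
    """Group the 60 non-homopolymer codons into cyclic-permutation classes."""
    classes = set()
    for letters in product(BASES, repeat=3):
        codon = "".join(letters)
        if len(set(codon)) == 1:  # AAA, CCC, GGG and UUU are excluded
            continue
        classes.add(frozenset(rotations(codon)))
    return classes


def is_comma_free(code):
    """True if no shifted triplet across two adjacent codons is itself a codon."""
    code = set(code)
    for first, second in product(code, repeat=2):
        pair = first + second
        if pair[1:4] in code or pair[2:5] in code:
            return False
    return True


def example_code():
    """Codons xyz with x < y >= z, using the arbitrary ordering A < C < G < U."""
    rank = {base: i for i, base in enumerate(BASES)}
    return {
        x + y + z
        for x, y, z in product(BASES, repeat=3)
        if rank[x] < rank[y] >= rank[z]
    }


classes = permutation_classes()
code = example_code()
print(len(classes))
print(len(code), is_comma_free(code))
```

Run as a script, this should report 20 permutation classes and a 20-codon code that passes the comma-free check. A set such as {AGU, GUA} fails the check, because AGU followed by AGU contains GUA in a shifted frame.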
The race to crack the genetic code
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC528468/
[Slide images: pages 416 and 418 of PHYSICS: CRICK ET AL., Proc. N. A. S. Recoverable text:]
CODES WITHOUT COMMAS
BY F. H. C. CRICK, J. S. GRIFFITH, AND L. E. ORGEL
MEDICAL RESEARCH COUNCIL UNIT, CAVENDISH LABORATORY, AND DEPARTMENT OF THEORETICAL CHEMISTRY, CAMBRIDGE, ENGLAND
Communicated by G. Gamow, February 11, 1957
This paper deals with a mathematical problem which arose in connection with protein synthesis. We present the solution here because it gives the "magic number" 20, so that our answer may perhaps be of biological significance. To make this clear, we sketch in the biochemical background first.
It is assumed in one of the more popular theories of protein synthesis that amino acids are ordered on a nucleic acid strand (see, for example, Dounce¹) and that the order of the amino acids is determined by the order of the nucleotides of the nucleic acid. There are some twenty naturally occurring amino acids commonly found in proteins, but (usually) only four different nucleotides. The problem of how a sequence of four things (nucleotides) can determine a sequence of twenty things (amino acids) is known as the "coding" problem.
This problem is a formal one. In essence, it is not concerned with either the chemical steps or the details of the stereochemistry. It is not even essential to specify whether RNA or DNA is the nucleic acid being considered. Naturally, all these points are of the greatest interest, but they are only indirectly involved in the formal problem of coding.
The first definite proposal was made by Gamow.² His code, which was suggested by the structure of DNA, was of the "overlapping" type. The meaning of this is illustrated in Figure 1. Gamow's code was also "degenerate": that is, several sets of three letters (picked in a special way) stood for a particular amino acid. However, all the 64 (4 × 4 × 4) possible sets of three letters stood for one amino acid or another, so that any sequence whatever of the four letters stood for a definite sequence of amino acids.
It is easy to see that codes of the overlapping type impose severe restrictions on the allowed amino acid sequences. Unfortunately, no such restrictions have been found, although considerable (unpublished) efforts have been made by a number of workers to find them. Part of this work has been reviewed by Gamow, Rich, [...]
[...] We further assume that all possible sequences of the amino acids may occur (that is, can be coded) and that at every point in the string of letters one can only read "sense" in the correct way. This is illustrated in Figure 3. In other words, any two triplets which make sense can be put side by side, and yet the overlapping triplets so formed must always be nonsense.
FIG. 3. The numbers represent the positions occupied by the four letters A, B, C, and D. It is shown which triplets make sense and which nonsense.
It is obvious that with these restrictions one will be unable to code 64 different amino acids. The mathematical problem is to find the maximum number that can be coded. We shall show (1) that the maximum number cannot be greater than 20 and (2) that a solution for 20 can be given.
To prove the first point, consider for the moment the restrictions imposed by placing each amino acid next to itself. Then, clearly, the triplet AAA must be nonsense, since, if it corresponded to an amino acid, α, then αα would be AAAAAA, and this sequence can be misinterpreted by associating α with the second to fourth, or third to fifth, letters. We can thus reject AAA, BBB, CCC, and DDD.
It is easy to see that the 60 remaining triplets can be grouped into 20 sets of three, each set of three being cyclic permutations of one another. Consider as an example ABC and its cyclic permutations BCA and CAB. It is clear that we can choose any one of these, but not more than one. For suppose that we let BCA stand for the amino acid β; then ββ is BCABCA, and so CAB and ABC must, by our rules, be nonsense.
can choose at the most no such restrictions
one triplet from each have been
found, although cyclic set, we cannot choose (unpublished)
considerable more than 20. Noefforts solutionhave made, which
beentherefore,
is possible, by a number
of workers, codestomore The
find thanthem. 20race
differentPartto ofcrack
amino acids.work
this the hasgenetic
been reviewed code by Gamow, Rich,
We have so far not considered the effects of putting unlike amino acids- to-
gether, to give pairs of the form a13 and ha. It might be thought that this would
still further reduce the possible number of amino acids, but this turns out not to be
so, since we can write down a construction which obeys all our rules and yet codes
20 different amino acids. One possible solution is
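[Diagram: one possible set of 20 triplets, written with stacked letters in the third position; layout not recoverable]
where AB followed by A stacked over B means ABA and ABB, etc. It is easy to see, by systematic enumer-
27.05.1961 (3:00 a.m.)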
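The counting argument (the four triplets AAA, BBB, CCC, DDD are excluded, and at most one triplet may be taken from each cyclic set) can be checked by brute force. The following Python sketch enumerates the cyclic sets and tests one candidate code for the "no out-of-frame reading" rule. The candidate used here (first letter alphabetically before the middle letter, last letter not after the middle letter) is an assumption: it is consistent with the example ABA and ABB, but it is not claimed to reproduce the original diagram. All function names are illustrative.

```python
from itertools import product

LETTERS = "ABCD"


def cyclic_class(triplet):
    """Return the set of cyclic rotations of a triplet, e.g. ABC, BCA, CAB."""
    return frozenset(triplet[i:] + triplet[:i] for i in range(3))


def is_comma_free(code):
    """True if no codeword can be read out of frame in any pair of adjacent codewords."""
    words = set(code)
    for first, second in product(words, repeat=2):
        joined = first + second
        # Positions 2-4 and 3-5 of the six-letter string are the two wrong frames.
        if joined[1:4] in words or joined[2:5] in words:
            return False
    return True


all_triplets = ["".join(t) for t in product(LETTERS, repeat=3)]

# Drop AAA, BBB, CCC, DDD and group the remaining 60 triplets by rotation.
classes = {cyclic_class(t) for t in all_triplets if len(set(t)) > 1}

# Assumed construction (not taken from the slide): first < middle >= last.
candidate = [t for t in all_triplets if t[0] < t[1] >= t[2]]

print(len(all_triplets), "triplets in total")
print(len(classes), "cyclic sets of three")
print(len(candidate), "triplets in the candidate code")
print("comma-free:", is_comma_free(candidate))
```

The check also shows why two members of the same cyclic set cannot coexist: a code containing both ABC and BCA fails, because ABCABC contains BCA out of frame.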
The race to crack the genetic code
A stable cell-free system has been obtained from E. coli which incorporates
C14-valine into protein at a rapid rate. It was shown that this apparent protein
synthesis was energy-dependent, was stimulated by a mixture of L-amino acids,
and was markedly inhibited by RNAase, puromycin, and chloramphenicol.1 The
present communication describes a novel characteristic of the system, that is, a
requirement for template RNA, needed for amino acid incorporation even in the
The race to crack the genetic code
https://www.sciencedirect.com/science/article/pii/S0968000403003025
This is an autobiographical description of the events that led to the breaking of the genetic code and the subsequent race to decipher the code. The code was deciphered in two stages over a five-year period between 1961 and 1966. During the first stage, the base compositions of codons were deciphered by directing cell-free protein synthesis with randomly ordered RNA preparations. During the second phase, the nucleotide sequences of RNA codons were deciphered by determining the species of aminoacyl-tRNA that bound to ribosomes in response to trinucleotides of known sequence. Views on general topics such as how to pick a research problem and competition versus collaboration also are discussed.
I would like to tell you how the genetic code was deciphered from a personal point of view. I came to the National Institutes of Health (NIH) in 1957 as a post-doctoral fellow with Dewitt Stetten, Jr, a wise, highly articulate scientist and administrator, immediately after obtaining a PhD in biochemistry from the University of Michigan in Ann Arbor. The next year, I started work with William Jakoby and, by enrichment culture, I isolated a Pseudomonad that grew on γ-butyrolactone and purified three enzymes involved in the catabolism of γ-hydroxybutyric acid [1].
There was a weekly seminar in Stetten's laboratory in which Gordon Tomkins, who worked in a different laboratory, participated. Gordon was brilliant, with a wonderful associative memory and a magnificent sense of humor. His seminars were superb, especially his description of the step-by-step developments in the problem that he intended to discuss. Towards the end of my post-doctoral fellowship, Gordon replaced Herman Kalckar as head of the Section of Metabolic Enzymes and offered me a position as an independent investigator in his laboratory. The other independent investigators in the laboratory were Elizabeth Maxwell and Victor Ginsberg, who were carbohydrate biochemists, and Todd Miles, a nucleic-acid biochemist. It was a wonderful opportunity and I decided then that if I was going to work this hard I might as well have the fun of exploring an important problem.
In my opinion, the most exciting work in molecular biology in 1959 were the genetic experiments of Monod and Jacob on the regulation of the gene that encodes β-galactosidase in Escherichia coli and that the mechanism of protein synthesis was one of the most exciting areas in biochemistry. Some of the best biochemists in the world were working on cell-free protein synthesis, and I had no experience with either gene regulation or protein synthesis, having previously worked on sugar transport, glycogen metabolism and enzyme purification. After thinking about this for a considerable time, I finally decided to switch fields. My immediate objective was to investigate the existence of mRNA by determining whether cell-free protein synthesis in E. coli extracts was stimulated by an RNA fraction or by DNA. In the longer term, my objective was to achieve the cell-free synthesis of penicillinase, a small inducible enzyme that
Figure 1. Gordon Tomkins. Gordon was brilliant, highly articulate and very funny. He was a charismatic individual who created a stimulating atmosphere and encouraged exploration. In 1958, towards the end of my post-doctoral fellowship at the NIH, he offered me a position as an independent investigator in his laboratory.
Corresponding author: Marshall Nirenberg (mnirenberg@nih.gov).
http://tibs.trends.com 0968-0004/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibs.2003.11.009
ADD Khorana!!!!
The RNA message is decoded in ribosomes
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC220908/
ON THE ROLE OF SOLUBLE RIBONUCLEIC ACID IN CODING FOR AMINO ACIDS
BY FRANCOIS CHAPEVILLE, FRITZ LIPMANN, GÜNTER VON EHRENSTEIN, BERNARD WEISBLUM, WILLIAM J. RAY, JR., AND SEYMOUR BENZER
THE ROCKEFELLER INSTITUTE, JOHNS HOPKINS UNIVERSITY, AND PURDUE UNIVERSITY
Communicated April 25, 1962
(PROC. N. A. S., VOL. 48, 1962, pp. 1086-1087)
In protein synthesis, each amino acid is first joined specifically with a corresponding sRNA through the mediation of an activating enzyme. These aminoacyl-sRNA's, by reaction with a ribosomal preparation,1, 2 form proteins with specific amino acid sequences.3 According to the "adaptor" hypothesis of Crick4 and Hoagland,5 the position of a particular amino acid would be determined not by the amino acid itself, but by hydrogen bonding between the RNA template and a complementary nucleotide sequence in the sRNA carrying the amino acid. The experiment described in this paper was designed as a direct test of the adaptor hypothesis, by attaching an amino acid to its normal sRNA and then, without breaking the bond, converting the amino acid to another one of the natural amino acids. It is then possible to determine whether the coding properties of this hybrid are determined by the sRNA or the amino acid. We have made use of the fact that cysteine can be altered by reductive desulfhydration with Raney Nickel to alanine.
This reaction can be performed while the cysteine is attached to its sRNA, producing the hybrid Ala-sRNA^CySH. A superscript denotes the normal amino acid accepting specificity of an sRNA, the actual amino acid attached being indicated as a prefix. Figure 1 illustrates this procedure.
To determine the coding properties of the hybrid molecule, we have utilized the finding by Nirenberg and Matthaei,6 Lengyel, Speyer, and Ochoa,7 Speyer, Lengyel, Basilio, and Ochoa,8 and Martin, Matthaei, Jones, and Nirenberg,9 that polyuridylic-guanylic acid will stimulate ribosomal incorporation into polypeptides of certain amino acids, including cysteine, but not alanine. As shown below, this difference between cysteine and alanine also applies when they are attached to their normal acceptors, i.e., CySH-sRNA^CySH is reactive with poly UG but Ala-sRNA^Ala is not. The hybrid molecule Ala-sRNA^CySH proved to be just as reactive as CySH-sRNA^CySH, leading to the conclusion that the sRNA moiety indeed determines the coding specificity.
Materials and Methods-Preparation of C14-L-cysteine: L-C14-cysteine of high specific activity was prepared from L-C14-serine. The yeast serine sulfhydrase described by Schlossmann and Lynen10 catalyzes the reaction:
CH2OH-CHNH2-COOH + H2S → CH2SH-CHNH2-COOH + H2O.
The enzyme was prepared from bakers' yeast (National) which had been frozen in liquid nitrogen and stored at -20°; 50 gm of frozen yeast were extracted by stirring at 3° for 5 hr with 100 ml of 0.05 M K2HPO4 and 0.05 M EDTA. From then on, the procedure of Schlossmann and Lynen was followed up to the ammonium sulfate step. The ammonium sulfate precipitate between 40 and 65 per cent saturation was dissolved in 15 ml of 0.05 M Tris HCl, pH 7.5, and dialyzed against 0.02 M Tris HCl buffer. The dialysate was stored at -20°. 0.75 µmole of C14-L-serine (65 µC/µmole, New England Nuclear Corporation) was dissolved in a total volume of 1.5 ml containing: 3 µmoles of EDTA; 0.5 ml of 0.5 M Tris HCl, pH 8.5, saturated with H2S; 0.3 µmole of pyridoxal phosphate; and 2 mg of enzyme. After incubation at 37° for 3 hr in nitrogen, 20 ml of ethanol containing 0.2 ml of 2 N HCl were added to [...] cysteine. After centrifugation, the supernatant was evaporated and the residue re-extracted with 10 ml ethanol plus 0.2 ml 2 N HCl, and the extraction and evaporation were repeated at least once more. Before the ethanol addition, 75 µmoles of C12-serine, 20 µmoles of C12-alanine, and 30 µmoles of C12-glycine were added to dilute residual radioactivity of the serine and possible degradation products. The repeated extraction-evaporation procedure was used, since cysteine is more soluble in ethanol than the other amino acids. The final product contained approximately 40 per cent of the input radioactivity. On paper electrophoresis at pH 1.85 in 7.8 per cent acetic acid and 2.5 per cent formic acid (70 volts/cm, 60 min), no detectable radioactivity was found except with cysteine. The dry product was stored at -20°.
Preparation of C14-CySH-sRNA^CySH: E. coli-sRNA was prepared as described,11 and a 105,000
FIG. 1.-Plan of experiment. Cysteine is attached to its normal acceptor sRNA through the mediation of the cysteine activating enzyme. By the action of Raney Nickel, the cysteine, while still attached, is converted to alanine. The coding properties of
Good question:
Can the ribosome check the amino acid attached to the tRNA? How do we know that?
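Why poly-UG distinguishes cysteine from alanine can be illustrated with hindsight, using the modern codon table, which was not yet known in 1962. The Python sketch below lists all triplets that can occur in an RNA built only from U and G and compares them with the four alanine codons. The dictionary holds only the eight standard-code assignments needed here; the variable and function names are illustrative.

```python
from itertools import product

# The eight standard-code triplets that contain only U and G.
UG_CODONS = {
    "UUU": "Phe", "UUG": "Leu", "UGU": "Cys", "UGG": "Trp",
    "GUU": "Val", "GUG": "Val", "GGU": "Gly", "GGG": "Gly",
}

# All four alanine codons start with GC, so each contains a C.
ALANINE_CODONS = {"GCU", "GCC", "GCA", "GCG"}


def codons_from(bases):
    """All triplets that can occur in a random polymer of the given bases."""
    return {"".join(t) for t in product(bases, repeat=3)}


poly_ug = codons_from("UG")
encoded = {UG_CODONS[codon] for codon in poly_ug}

print("amino acids a random poly-UG can code for:", sorted(encoded))
print("alanine codons present in poly-UG:", poly_ug & ALANINE_CODONS)
```

Cysteine (UGU) is among the possible triplets, alanine is not. Alanine that is incorporated in response to poly-UG must therefore have been brought in by an sRNA that reads a cysteine codon.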
What is bigger, tRNA or amino acid?
Ribosomes have one binding site for mRNA and 3 sites for tRNA
[Figure: ribosome with large subunit and small subunit and the three tRNA sites: E (exit), P (peptidyl-tRNA), A (aminoacyl-tRNA)]
Good question:
The eukaryotic ribosome can process about 2 amino acids per second.
Bacterial ribosomes can be up to 10 times faster.
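To get a feeling for these numbers, the following Python sketch converts the two rates into the time one ribosome needs for one protein. The protein length of 400 amino acids is a hypothetical example value, and the bacterial rate is simply taken as ten times the eukaryotic one, as stated above.

```python
def translation_time_seconds(length_aa, rate_aa_per_s):
    """Time one ribosome needs to polymerise a chain of length_aa residues."""
    return length_aa / rate_aa_per_s


EUKARYOTIC_RATE = 2.0                  # amino acids per second (from the slide)
BACTERIAL_RATE = 10 * EUKARYOTIC_RATE  # "10 times faster"
PROTEIN_LENGTH = 400                   # hypothetical protein, amino acids

euk_time = translation_time_seconds(PROTEIN_LENGTH, EUKARYOTIC_RATE)
bac_time = translation_time_seconds(PROTEIN_LENGTH, BACTERIAL_RATE)

print(f"eukaryotic ribosome: {euk_time:.0f} s")
print(f"bacterial ribosome:  {bac_time:.0f} s")
```

Under these assumptions a eukaryotic ribosome needs a few minutes per protein, a bacterial one well under a minute.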
Translation is a four-step cycle
Crystal structure solved in 2000
Good question:
Bonus
Who has the largest genome?
How many genes are in a genome?
Is replication time limiting for the cell cycle?
In early Drosophila development ≈120 million bp* are replicated once every ≈8 minutes
* Let's also consider the leopard lungfish with its 133 billion bp!
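What these numbers imply can be estimated with a few lines of Python. The sketch asks how many replication forks must work in parallel to copy a genome within a given time. The fork speed of 50 bp per second is an assumed round value, not a figure from the slide, and the full 8 minutes are used as an upper bound on the time available for replication.

```python
import math


def required_forks(genome_bp, seconds_available, fork_speed_bp_per_s):
    """Minimum number of forks that must run in parallel to finish in time."""
    return math.ceil(genome_bp / (seconds_available * fork_speed_bp_per_s))


DROSOPHILA_BP = 120_000_000    # from the slide
LUNGFISH_BP = 133_000_000_000  # from the slide
CYCLE_SECONDS = 8 * 60         # from the slide, used as an upper bound
FORK_SPEED = 50                # bp per second per fork, ASSUMED value

overall_rate = DROSOPHILA_BP / CYCLE_SECONDS
fly_forks = required_forks(DROSOPHILA_BP, CYCLE_SECONDS, FORK_SPEED)
fish_forks = required_forks(LUNGFISH_BP, CYCLE_SECONDS, FORK_SPEED)

print(f"required overall rate: {overall_rate:,.0f} bp per second")
print(f"forks needed, Drosophila embryo: {fly_forks:,}")
print(f"forks needed, lungfish genome in the same time: {fish_forks:,}")
```

With bidirectional origins, the number of origins is half the number of forks. The point of the estimate is the order of magnitude: thousands of origins have to fire almost simultaneously in the early fly embryo.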
E. coli
Large genome = complex organism?
Other
Conventional
RNA can form intramolecular base pairs
Good question:
What are the differences between DNA and RNA helices and what causes those differences?
During the cell cycle chromosomes have different states
Article
In Brief
How transcription is shut down as cells begin to condense chromosomes during mitosis is poorly understood. Liang et al. report the requirement of mitotic transcriptional activation by P-TEFb to release paused Pol II as a prerequisite for this process, and ultimately for proper cell-cycle progression.
Correspondence: ash@northwestern.edu
Liang et al., 2015, Molecular Cell 60, 435–445, November 5, 2015 ©2015 Elsevier Inc.
http://dx.doi.org/10.1016/j.molcel.2015.09.021
https://www.cell.com/action/showPdf?pii=S1097-2765(15)00741-8
centromere
The knobs on chromosomes 13, 14, 15, 21 and 22 indicate the positions of rRNA genes.
Karyotyping reveals genetic defects
Chromosome painting
https://www.nature.com/articles/nprot.2006.91
© 2006 Nature Publishing Group http://www.nature.com/natureprotocols