
MOLECULAR BIOLOGY

FROM FACTS TO CONCEPTS

MARKUS ENGSTLER, THOMAS RUDEL, THOMAS DANDEKAR, MARKUS SAUER

WINTER 2023/4

This semester >> Biophysics* is responsible for the organisation!

LECTURE SERIES MOLECULAR BIOLOGY WISE 2023/4 ORGANISATION OF THE EUKARYOTIC GENOME

How cells read the genome

How cells control gene expression

How cells store, transport and secrete

How the cytoskeleton shapes and moves the cell

How cells communicate


The modular structure of the lecture series

Example:

Module I: How cells read the genome (12 lectures)

3x Cell biology, 3x Microbiology, 3x Bioinformatics, 3x Biophysics

Markus Engstler

Introduces each module and lays the foundation for the other lectures.

Starts very basic but seeks to look at apparently simple stuff in an unexpected way.

Wants to be interactive; be prepared for lots of questions.

For further reading: e.g. Alberts et al., Essential Cell Biology, Garland Science.

The Cell Biology lectures

Only a limited number of exam-relevant facts are covered in each lecture. These are highlighted in grey.

More important than memorising facts is that you come to understand the principles, ideas
and gaps in knowledge of molecular biology. This includes seeing our current knowledge in
the light of its discovery.

Numbers are particularly important. Without a feeling for numbers, you cannot understand
the processes in and on cells.

Self-study is the key to success. This does not have to be boring. That's why I provide lists
of links that will take you on a special journey through molecular biology.


HOW CELLS READ THE GENOME

ORGANISATION OF THE EUKARYOTIC GENOME

MARKUS ENGSTLER

17.10.2023


The eukaryotic DNA is packed into chromosomes

Chromosomes become visible when cells prepare for cell division.

Theodor Boveri (and Walter Sutton) proposed the chromosome theory of inheritance in 1902.

https://www.jstor.org/stable/1629276?seq=1#metadata_info_tab_contents

https://www.biozentrum.uni-wuerzburg.de/zeb/research/topics/theodor-boveri/


DNA carries the genetic information

It is difficult to envisage how the chemically rather simple DNA can encode all biological information.
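
One way to get a feeling for this is to count. The following is a minimal sketch, assuming that each position holds one of four bases and that the haploid human genome has 3.2 × 10^9 nucleotide pairs. The function names are illustrative only.

```python
import math

GENOME_NT = 3.2e9  # haploid human genome size in nucleotide pairs


def bits_per_base(alphabet_size: int = 4) -> float:
    """Information per position if all bases are equally likely."""
    return math.log2(alphabet_size)


def sequences_of_length(n: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of n bases."""
    return alphabet_size ** n


genome_bits = GENOME_NT * bits_per_base()
genome_megabytes = genome_bits / 8 / 1e6

print(bits_per_base())          # 2.0
print(sequences_of_length(10))  # 1048576
print(round(genome_megabytes))  # 800
```

Ten bases already allow more than a million different sequences. The information lies in the order of the bases, not in the chemical complexity of the polymer.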

In 1944, Avery, MacLeod and McCarty revealed that DNA is the genetic material. But many scientists simply did not believe it.

Good question:

Why didn’t people believe Oswald Avery in 1944 and why did he never
get the Nobel prize?


DNA carries the genetic information

https://www.nobelprize.org/prizes/themes/the-nobel-prize-in-physiology-or-medicine-1901-2000-2

DNA carries the genetic information

DNA as genetic material was disputed until Alfred Hershey and Martha
Chase did their “blender experiment” in 1952. What did they do?

J. Gen. Physiol., 36 (1): 39-56 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2147348/


DNA carries the genetic information

DNA as genetic material was disputed until Alfred Hershey and Martha
Chase did their “blender experiment” in 1952. How did it work?


The DNA double helix is the most famous molecule

DNA is a right-handed double helix with about ten base pairs per turn.

This coiling creates two grooves, the major and the minor groove.

What is the biological significance of the two grooves?

DNA contains linear information

The DNA structure was solved in 1953.

Rosalind Elsie Franklin

https://www.sciencemag.org/careers/2018/08/rosalind-franklin-and-damage-gender-harassment


DNA contains linear information


https://www.zeit.de/gesellschaft/2023-04/osalind-franklin-dna-entdeckung-nobelpreis-kriminalpodcast

DNA is packaged into chromosomes but remains accessible

In eukaryotes, DNA is packaged into chromosomes that can easily be segregated between dividing cells.

Basics:

A diploid human cell contains about 2 m of DNA.

The haploid genome (1n) contains 3.2 × 10^9 (3.2 billion) nucleotide pairs in 23 chromosomes.
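
The 2 m figure can be checked with a back-of-the-envelope calculation. The sketch below assumes a rise of 0.34 nm per base pair (B-form DNA) and, purely for illustration, a nucleus 6 µm in diameter; neither value is given on the slide. It shows that one haploid genome is roughly 1.1 m long, so about 2 m corresponds to the two genome copies of a diploid cell.

```python
RISE_PER_BP_NM = 0.34      # assumed rise per base pair of B-form DNA
HAPLOID_BP = 3.2e9         # nucleotide pairs in the haploid genome (from the slide)
CHROMOSOMES_1N = 23        # chromosomes in the haploid set (from the slide)
NUCLEUS_DIAMETER_UM = 6.0  # assumed, illustrative nucleus diameter


def contour_length_m(base_pairs: float, rise_nm: float = RISE_PER_BP_NM) -> float:
    """Length of fully stretched DNA in metres."""
    return base_pairs * rise_nm * 1e-9


haploid_m = contour_length_m(HAPLOID_BP)
diploid_m = 2 * haploid_m
mean_chromosome_cm = haploid_m / CHROMOSOMES_1N * 100
length_to_nucleus_ratio = diploid_m / (NUCLEUS_DIAMETER_UM * 1e-6)

print(f"haploid genome: {haploid_m:.2f} m")
print(f"diploid cell:   {diploid_m:.2f} m")
print(f"mean chromosome: {mean_chromosome_cm:.1f} cm")
print(f"DNA length / nucleus diameter: {length_to_nucleus_ratio:.0f}")
```

Under these assumptions the DNA of one cell is several hundred thousand times longer than the nucleus is wide, which is why packaging matters.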


During the cell cycle chromosomes have different states

During the cell cycle DNA must be replicated, separated and partitioned.

Good question:

Is gene expression stalled during mitosis?

Chromosomes extended
Chromosomes condensed


Mitotic chromosome formation requires condensins

Mitotic chromosome formation requires condensins

Schematic model of the S. cerevisiae condensin holo complex and 8.1-Å-resolution 3D map showing its overall architecture.

Flip-flop model of the condensin reaction cycle

SMC2 SMC4

ATP binding to the Smc4 active site releases Ycs4.
This enables Ycg1 binding to Smc2.
Upon second ATP binding at the Smc2 active site, the head domains fully engage.
Kinking of the Smc2 coiled coil triggers release of Brn1N and opens the coiled coils from their rod-shaped conformation.

Large-scale conformational changes form the basis for the ability of condensin to translocate along DNA and to
extrude DNA loops of kilobase pairs in length. https://www.nature.com/articles/s41594-020-0457-x 2020
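
To get an intuition for what loop extrusion does to a fibre, here is a toy sketch. It is not the model from the cited paper. It assumes a one-dimensional lattice, two-sided extruders that never unbind, and extruders that stop when they meet a neighbour. The names `extrude_loops` and `covered_fraction` are made up for this illustration.

```python
import random


def extrude_loops(fibre_length, n_extruders, steps, seed=0):
    """Toy 1D loop extrusion; returns one (left, right) anchor pair per extruder."""
    rng = random.Random(seed)
    # Load extruders at distinct random sites; each starts as a loop of one site.
    starts = sorted(rng.sample(range(fibre_length), n_extruders))
    loops = [[s, s] for s in starts]
    for _ in range(steps):
        for i, loop in enumerate(loops):
            # A leg may move until it reaches the fibre end or a neighbouring loop.
            left_limit = loops[i - 1][1] + 1 if i > 0 else 0
            right_limit = loops[i + 1][0] - 1 if i < len(loops) - 1 else fibre_length - 1
            if loop[0] > left_limit:
                loop[0] -= 1
            if loop[1] < right_limit:
                loop[1] += 1
    return [tuple(loop) for loop in loops]


def covered_fraction(loops, fibre_length):
    """Fraction of lattice sites that sit inside a loop."""
    return sum(right - left + 1 for left, right in loops) / fibre_length


early = extrude_loops(fibre_length=200, n_extruders=5, steps=3)
late = extrude_loops(fibre_length=200, n_extruders=5, steps=200)
print(early, covered_fraction(early, 200))
print(late, covered_fraction(late, 200))
```

After a few steps most of the fibre is still outside the loops. After many steps the loops abut each other and the whole fibre is reeled into consecutive loops. Real condensin also loads, unloads and may extrude asymmetrically, which this sketch ignores.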


The material properties of mitotic chromosomes
Maximilian F. D. Spicer (1,2) and Daniel W. Gerlich (1)
Current Opinion in Structural Biology 2023, 81:102617
https://doi.org/10.1016/j.sbi.2023.102617

Addresses
1 Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), 1030, Vienna, Austria
2 Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, A-1030, Vienna, Austria
Corresponding author: Gerlich, Daniel W. (daniel.gerlich@imba.oeaw.ac.at)

This review comes from a themed issue on 3D Genome Chromatin Organization and Regulation (2023). Edited by Genevieve Almouzni and Juanma Vaquerizas. Available online 6 June 2023. 0959-440X/© 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Abstract
Chromosomes transform during the cell cycle, allowing transcription and replication during interphase and chromosome segregation during mitosis. Morphological changes are thought to be driven by the combined effects of DNA loop extrusion and a chromatin solubility phase transition. By extruding the chromatin fibre into loops, condensins enrich at an axial core and provide resistance to spindle pulling forces. Mitotic chromosomes are further compacted by deacetylation of histone tails, rendering chromatin insoluble and resistant to penetration by microtubules. Regulation of surface properties by Ki-67 allows independent chromosome movement in early mitosis and clustering during mitotic exit. Recent progress has provided insight into how the extraordinary material properties of chromatin emerge from these activities, and how these properties facilitate faithful chromosome segregation.

Keywords
Mitosis, Chromosomes, Chromatin, Condensin, Phase separation.

Introduction
The very long DNA molecules encoding eukaryotic genomes are folded extensively to fit into the small volume of cells. In this highly folded state, DNA molecules must be organised in a way that supports the transcription, replication, epigenetic modification and inheritance of genetic information. This organisation is accomplished by the wrapping of DNA around histones to form a 10 nm-wide nucleosome fibre [1,2] which is itself further folded into large loops, building chromosomes.

Chromosomes dynamically reorganise during the cell cycle to support different functions at distinct stages. During interphase, chromosomes undergo a variable degree of local compaction to regulate access to transcription factors and other nuclear components, and are folded into specific loop patterns to control regulatory interactions between distant DNA elements [3-5].

In cell division, the mesh of chromatin that occupies interphase nuclei converts into a set of discrete and compacted chromosomes [6,7]. As a result of this conversion, chromosomes gain the ability to move individually along the mitotic spindle, allowing efficient segregation of one genome copy to each of the daughter cells. Mitotic chromosomes acquire a characteristic shape with two parallel threads forming sister chromatids, each containing one of the replicated DNA molecules. Recent work has identified motor-driven DNA loop extrusion, chromatin phase separation and surface control as fundamental activities contributing to mitotic chromosome assembly. Here, we discuss how these activities are combined and coordinated to allow faithful genome segregation, focusing on vertebrate cells as a model system. Further, we outline the material properties of chromatin that emerge from these activities, and how these are tuned to suit contrasting functions across the cell cycle.

Condensins shape an interlinked chromatin network
A key factor in the shaping of mitotic chromosomes is the structural maintenance of chromosomes protein complex condensin, of which vertebrate cells encode two isoforms [8-13]. Condensins are highly abundant and enrich at a central axis in each sister chromatid [8,14-16]. This chromosome axis, which also contains topoisomerase II, was originally proposed to form a rigid proteinaceous scaffold from which DNA loops are suspended [17-19]. Consistent with this “scaffolding” model, condensins impart structure and stiffness to mitotic chromosomes, allowing resistance to pulling forces exerted by the spindle in cells [20-25] or to micromanipulations on purified chromosomes [26,27].

The model of a purely proteinaceous network scaffolding mitotic chromosomes, however, has been challenged by endonuclease treatments showing the integrity of the condensin core to depend on the presence of intact DNA [24,28,29]. That the structural and mechanical integrity of mitotic chromosomes depends on both DNA and protein supports a model not of rigid scaffolding, but of an arrangement in which the flexible chromatin fibre is cross-linked by condensins (Figure 1). The resulting cross-linked substance can be described as a gel: a network with many intra-polymer links, suspended in a fluid medium and capable of swelling.

Identification of the DNA loop extrusion activity of condensins [30-32] suggests that in contrast to a conventional gel with static and homogeneously distributed cross-links between separate fibres, condensins continuously move along a single chromatin fibre to cross-link the bases of loops. As a result of this motor-driven movement, chromatin loops are reeled outwards while condensins concentrate at the axial core. This model is supported by polymer simulations [33,34] and in vitro reconstitution experiments [35,36], both of which illustrate the ability of loop extrusion to shape chromosomes into cylindrical threads with condensins enriching at the axis. In silico and in vitro work has also shown the importance of topoisomerase II in catalysing DNA strand passage in this process, allowing the ordered formation of the bottlebrush chromosome structure observed in mitosis.

The two condensin isoforms contribute differently to the dimensions of mitotic chromosomes. Upon mitotic entry, condensin II is released from its inhibitor MCPH1 [37] and organises chromatin into thread-like structures during prophase. This is followed by the binding of cytoplasmic condensin I, which gains access to chromosomes after nuclear envelope disassembly to promote further lateral compaction of chromosomes [10,16,21,38]. Of the two isoforms, condensin II has the longer residence time on chromatin [16,21] and forms large loops that are then further partitioned into smaller loops by condensin I [39]. Condensin I mechanically stabilises the centromere region within the chromosome, allowing it to withstand spindle pulling forces acting on kinetochores [20,21,25].

Depletion of condensins leads to loss of cylindrical chromosome organisation, shown most clearly through protein-level depletion by immunoprecipitation in vitro [8,9] or targeted degradation in cells [24,40]. However, condensin-depleted chromatin still undergoes compaction [24,40,41]. Therefore, although essential for the organisation and segregation of mitotic chromosomes, condensins are not necessary for global volume compaction of mitotic chromatin.

Chromatin forms an immiscible condensate in mitosis
In vitro, chromatin isolated from cells [18,42] or reconstituted from recombinant core histones and DNA [43-46] compacts and decompacts by a process that is independent of condensins but highly dependent on …

Mitotic chromosome formation requires condensins

Figure 1: Condensins cause progressive extrusion of DNA loops and enrich axially

Depletion of condensins or endonuclease treatment disrupts the integrity of mitotic chromosomes


Condensins form a dynamic chromatin network. (a) Progressive extrusion of DNA loops by condensin results in axial enrichment of condensin within a
bottlebrush-like arrangement of chromatin fibre loops. (b) Depletion of condensin impairs structural and mechanical integrity of mitotic chromosomes. (c)
DNA endonucleases disrupt structural and mechanical integrity of mitotic chromosomes. DNA (black) and condensin (red) are shown. Dashed lines
indicate degraded condensin.

Mitotic chromosome formation requires condensins

Chromatin forms an immiscible condensate in mitosis


Figure 2

Panel labels:
Cross-section of mitotic chromatid: condensin enriches at stiff core. Sharp phase boundary to cytoplasm.
Swelling of chromatin network
Disruption of chromatin morphology and stiffness without affecting compaction
Full dispersion of chromatin fibre

Mitotic chromosomes form by condensin-mediated DNA loop extrusion and acetylation-regulated chromatin phase separation. Upper-left corner represents the cross-section of an unperturbed mitotic chromatid. Condensin enriches at a stiff axial core, surrounded by loops of immiscible, compact chromatin. At a local scale, the chromatin layer is fluid and forms a sharp phase boundary with the cytoplasm. Depletion of condensin disrupts chromatin morphology and stiffness without affecting the degree of chromatin compaction. Histone hyperacetylation induces swelling of the chromatin network, whereby additional depletion of condensin induces full dispersion of the chromatin fibre. DNA (black) and condensin (red) are shown. Dashed lines indicate degraded condensin.

… required for the elastic stiffness of the chromosome axis [71]. Stiffness is instead established by condensin-mediated loop extrusion, which shapes chromosomes into cylindrical threads rather than the spheres expected of phase separation alone [12,34]. Thus, a combination of DNA loop extrusion and chromatin phase separation generates mitotic chromosomes that can resist microtubule pulling and pushing forces.

Regulation of chromosome surface properties
A model of immiscible chromatin in the mitotic cytosol raises the question of how single chromosomes can be maintained as separate entities, as phase-separated condensates have a natural tendency to minimise surface energy by fusion [72]. The protein Ki-67 has been identified to be essential in organising the chromosome periphery [73] and prevents chromosome coalescence by forming a repulsive molecular brush at the surface of mitotic chromosomes, a property characteristic of surface active agents (surfactants) [74]. Notably, the accumulation of Ki-67 at the chromosome surface depends on histone deacetylation [24], supporting a model where Ki-67 enriches at the phase boundary between immiscible chromatin and cytoplasm to function as a surfactant dispersing mitotic chromosomes (Figure 3).

In the immiscible mitotic state, chromatin excludes not only microtubules but also other cytoplasmic proteins and ribonucleoprotein complexes such as ribosomes, allowing pre-partitioning of cytoplasm from the assembling nucleus during mitotic exit [24,75,76]. The separation between emerging nuclear and cytosolic compartments is also regulated by Ki-67, which during mitotic exit loses its surfactant-like properties after collapse of its repulsive molecular brush structure. This leads to chromosome clustering, which sequesters cytoplasmic components during nuclear assembly [75] (Figure 3). Following chromosome clustering in late anaphase, the protein barrier to autointegration factor coats and cross-links the chromatids, such that the …

Tuning chromatin stiffness by loop extrusion processivity

Figure 4

Tuning chromatin stiffness by loop extrusion processivity.
(a) In interphase, cohesin forms loops that are interspersed with gaps.
(b) Owing to gaps, interphase chromatin can be deformed by application of forces.
(c) In mitosis, condensin forms arrays of consecutive, larger loops without large gaps in between them.
(d) Mitotic chromatin is stiffer than interphase chromatin in response to pulling along the mitotic chromosome axis.
DNA (black), cohesin (blue) and condensin (red) are shown. Arrows indicate direction of force.
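
The link between processivity and gaps can be made concrete with a very rough sketch. It assumes that extruders load at regular intervals along the fibre and that each reels in a fixed length of chromatin before it stops. The function name and all numbers are illustrative assumptions, not measured values.

```python
def loop_and_gap(processivity_kb, spacing_kb):
    """Return (loop size in kb, fraction of fibre left outside loops).

    processivity_kb: chromatin length one extruder can reel in (assumed).
    spacing_kb: distance between neighbouring extruders (assumed).
    """
    # A loop cannot grow past the neighbouring extruder.
    loop_kb = min(processivity_kb, spacing_kb)
    gap_fraction = 1 - loop_kb / spacing_kb
    return loop_kb, gap_fraction


# Low processivity: loops are interspersed with gaps.
interphase_like = loop_and_gap(processivity_kb=100, spacing_kb=200)
# High processivity: consecutive, larger loops without gaps.
mitosis_like = loop_and_gap(processivity_kb=400, spacing_kb=200)

print(interphase_like)  # (100, 0.5)
print(mitosis_like)     # (200, 0.0)
```

In this picture a fibre with gaps has stretches that are not cross-linked and can be pulled out, whereas a gap-free array of loops resists pulling along the axis.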

… between interphase loop extrusion and the regulation of genome expression and maintenance, are now beginning to be understood.

Conclusions and outlook
The expansion of biological studies to include concepts of polymer physics has yielded new perspectives in our understanding of chromatin structure and function. Chromatin material properties emerge from combinations of various governing principles. Dynamic condensin and cohesin cross-links organise the flexible chromatin fibre into a network, whose stiffness can be tuned at different cell cycle stages by changing levels of loop extrusion. Meanwhile, a regulated shift between soluble and insoluble chromatin states controls accessibility and generates surface tension, which may prevent inappropriate invasion of soluble factors or microtubules. Histone deacetylation appears to play a key role in driving this chromatin phase transition, with the precise regulatory pathways underlying this process during mitotic entry and exit still to be identified.

While condensin cross-links endow mitotic chromosomes with stiffness, additional factors are involved in mechanical stabilisation. In vitro manipulation of purified mitotic chromosomes suggests that histone methylation may modulate stiffness of the chromatin network [71], while further studies implicate both methylation and HP1α in imparting condensin-independent stiffness [102]. While the abundance of evidence supporting the role of condensins in mitotic chromosome formation puts this structural maintenance of chromosomes complex at the fore, the potential complementary roles of these and other factors will be interesting to explore. Piecing together the components involved in regulating chromatin material properties, and expanding upon models outlined in this review, will further our understanding of how core physical principles govern key genomic processes throughout the cell cycle.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
No data was used for the research described in the article.

Acknowledgements
The authors thank Antoine Coulon and Paul Batty for comments on the manuscript. Research in the laboratory of D.W.G. is supported by the Austrian Academy of Sciences, the Vienna Science and Technology Fund …

Regulation of the mitotic chromosome surface

The protein Ki-67 is essential in organising the chromosome periphery

Figure 3

Regulation of the mitotic chromosome surface. In prometaphase and metaphase, Ki-67 forms a repulsive layer of molecular brushes on mitotic
In prometaphase
chromosomes, maintaining andallowing
their separation and metaphase, Ki-67 motility
independent forms ona repulsive
the mitoticlayer of In
spindle. molecular brushes
late anaphase, on mitoticbrushes
the molecular chromosomes,
collapse
and Ki-67 promotes clustering of chromosomes
maintaining to exclude
their separation large
and cytoplasmic
allowing particles. In
independent telophase,
motility barrier-to-autointegration
on the mitotic spindle. factor (BAF) forms a
network around each set of segregated chromosomes such that the reforming nuclear envelope enwraps them to form a single nucleus. The figure
depicts only chromosomes and anaphase,
In late their surface the
regulators.
molecular brushes collapse and Ki-67 promotes clustering of chromosomes to exclude large
cytoplasmic particles.

In telophase, barrier-to-autointegration factor (BAF) forms a network around each set of segregated chromosomes
reassembling nuclear envelope
such that enwraps
the reforming entire
nuclear setsenwraps
envelope between
them tothese
form acondensates
single nucleus.and the intrinsic phase
chromatids to form a single nucleus rather than frag- separation properties of chromatin itself, and how these
mented micronuclei [77]. Dynamic control of the together contribute to nuclear function.
chromosome surface thereby allows distinct functions to
be performed at different stages of mitosis. While volume phase transitions control chromatin
LECTURE SERIES MOLECULAR BIOLOGY WISE 2023/4 ORGANISATION OF THE EUKARYOTIC GENOME
accessibility, changes in DNA looping account for
Tuning chromatin elasticity and solubility during cell further differences between mitotic and interphase
cycle progression chromatin. Once mitosis is complete, condensins
Once the genome has been equally distributed between dissociate from chromatin and the binding of cohesin
two daughter cells, chromosomes decondense and ac- re-establishes interphase looping patterns [78,79].
quire a specific morphology characteristic of interphase Cohesin has a DNA looping activity similar to
[78,79]. In contrast to the globally compacted state of condensin [31,86,87] but has a shorter residence time
mitotic chromosomes, interphase chromatin displays on chromatin, resulting in the formation of smaller
varied degrees of compaction. Transcriptionally active loops on interphase chromosomes compared to mitosis
chromatin regions decondense after mitotic exit, while [39,88e93]. The less processive loop extrusion
transcriptionally repressed constitutive heterochromat- performed by cohesin results in a greatly altered
in retains high levels of compaction, perhaps through interphase chromosome configuration, compared to
similar phase separation mechanisms to those observed that of mitosis. In contrast to the mitotic bottlebrush
in mitosis. Compact interphase chromatin may restrict structure formed by consecutive loops extruded by
the accessibility of nuclear components such as tran- condensins, interphase loops formed by cohesin are
scription factors through similar principles of phase short-lived, regulated by boundaries imposed by the
separation and electrostatic repulsion as observed in protein CTCF, and interspersed with gaps [90,94e97].
mitosis [24,80,81]. In addition, it is important to note This altered cross-link distribution leads to a less
that interphase nuclei contain many other condensates constrained chromatin state than that observed in
fi
Three DNA elements are essential for chromosomes

Why do eukaryotic chromosomes have telomeres?

Eukaryotic chromosomes contain many replication origins. Why?

How does the centromere attach the duplicated chromosomes to the mitotic spindle?

Replication

MTOC, centrosome, kinetochore
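A feeling for numbers helps with the replication-origin question on this slide. The following is a minimal sketch, not a model of real replication timing: it assumes evenly spaced origins that all fire at once, a fork speed of about 50 bp per second, and a chromosome of 150 Mb. All three values are illustrative assumptions, not figures from the lecture.

```python
def replication_time_hours(chromosome_bp, n_origins, fork_speed_bp_per_s=50.0):
    """Time to copy a chromosome if origins are evenly spaced and fire together.

    Each origin launches two forks running in opposite directions, so each
    origin covers chromosome_bp / n_origins base pairs and each fork copies
    half of that stretch. The fork speed is an assumed round number.
    """
    if n_origins < 1:
        raise ValueError("need at least one origin")
    bp_per_fork = chromosome_bp / (2 * n_origins)
    return bp_per_fork / fork_speed_bp_per_s / 3600.0


CHROMOSOME_BP = 150_000_000  # assumed: a mid-sized human chromosome (~150 Mb)

single = replication_time_hours(CHROMOSOME_BP, n_origins=1)
many = replication_time_hours(CHROMOSOME_BP, n_origins=1000)

print(f"1 origin:     {single:.0f} h (about {single / 24:.0f} days)")
print(f"1000 origins: {many:.2f} h")
```

Under these assumptions a single central origin would need more than two weeks for one chromosome, whereas a thousand origins bring the time below one hour. S phase typically lasts a few hours, so one origin per chromosome cannot be enough.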


Chromosome numbers vary even between closely related species

Good question:

In Indian muntjacs
chromosomes have fused
without large changes in
gene number. How can
this happen?

Mol Biol Evol (2000) 17 (9): 1326-1333. The molecular mechanism whereby the muntjac telomere and centromere repetitive sequences
induce frequent tandem fusions is unknown. …. By elucidating the driving force behind the tandem fusions, we may one day be able to
reconstruct the reduction process in laboratories.

https://academic.oup.com/mbe/article/17/9/1326/994705


Chromosome numbers vary even between closely related species

ARTICLE
2022
Molecular mechanisms and topological consequences of drastic chromosomal rearrangements of muntjac deer

Yuan Yin, Huizhong Fan, Botong Zhou, Yibo Hu, Guangyi Fan, Jinhuan Wang, Fan Zhou, Wenhui Nie, Chenzhou Zhang, Lin Liu, Zhenyu Zhong, Wenbo Zhu, Guichun Liu, Zeshan Lin, Chang Liu, Jiong Zhou, Guangping Huang, Zihe Li, Jianping Yu, Yaolei Zhang, Yue Yang, Bingzhao Zhuo, Baowei Zhang, Jiang Chang, Haiyuan Qian, Yingmei Peng, Xianqing Chen, Lei Chen, Zhipeng Li, Qi Zhou, Wen Wang & Fuwen Wei

https://doi.org/10.1038/s41467-021-27091-0

Muntjac deer have experienced drastic karyotype changes during their speciation, making it
an ideal model for studying mechanisms and functional consequences of mammalian chromosome
evolution. Here we generated chromosome-level genomes for Hydropotes inermis
(2n = 70), Muntiacus reevesi (2n = 46), female and male M. crinifrons (2n = 8/9) and a
contig-level genome for M. gongshanensis (2n = 8/9). These high-quality genomes combined
with Hi-C data allowed us to reveal the evolution of 3D chromatin architectures during
mammalian chromosome evolution. We find that the chromosome fusion events of muntjac
species did not alter the A/B compartment structure and topologically associated domains
near the fusion sites, but new chromatin interactions were gradually established across the
fusion sites. The recently borne neo-Y chromosome of M. crinifrons, which underwent male-
specific inversions, has dramatically restructured chromatin compartments, recapitulating the
early evolution of canonical mammalian Y chromosomes. We also reveal that a complex
structure containing unique centromeric satellite, truncated telomeric and palindrome
repeats might have mediated muntjacs’ recurrent chromosome fusions. These results provide
insights into the recurrent chromosome tandem fusion in muntjacs, early evolution of
mammalian sex chromosomes, and reveal how chromosome rearrangements can reshape the
3D chromatin regulatory conformations during species evolution.
Chromosome numbers vary even between closely related species

NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-27091-0

[Figure 1, panels a to d. a: phylogenetic tree on a time axis (million years, -20 to 0) showing divergence times, sequencing technologies (10X + Hi-C, Nanopore + Hi-C, PacBio + Hi-C, Nanopore), tandem fusion, Robertsonian fusion and fission events, and diploid chromosome numbers: B. taurus 2n=60, R. tarandus 2n=70, H. inermis 2n=70, C. albirostris 2n=66, E. davidianus 2n=68, M. reevesi 2n=46, M. muntjak vaginalis 2n=6/7, M. gongshanensis 2n=8/9, M. crinifrons 2n=8/9. b: effective population size (x10⁴) over time for MCR, MGO, MMU and MRE. c: distribution map of M. crinifrons, M. reevesi, M. muntjak vaginalis and M. gongshanensis. d: chromosome synteny between BTA, HIN, MRE, and female (1, 2, 3, 4+X neo-X) and male (1q, 2, 3, 1p+4 neo-Y, X) MCR.]
Fig. 1 Phylogeny, demographic histories, and distribution and chromosome synteny of muntjac species. a Maximum likelihood tree of muntjac and outgroup species with the respective sequencing technologies (red geometries), the divergence time (blue numbers) and number of chromosome fusion or fission events (red numbers) shown. Different combinations of black arrows represent different types of chromosome fusion and fission. The 31 fusion events leading to M. crinifrons are displayed in detail with the chromosome code (black numbers) of H. inermis, which are connected with the arrow mark on the phylogenic tree with dotted lines. Red hollow circles mark the nodes whose divergence times were used as calibration for estimating the divergence time among other species. b The demographic histories of M. reevesi (MRE), M. muntjak vaginalis (MMU), M. gongshanensis (MGO), and M. crinifrons (MCR) estimated by PSMC (ref. 37). The gray box marks the time range of the Xixiabangma Glaciation (XG, 0.8-1.17 million years ago). c Topographic map on current geographic distribution of the four muntjac species. The colors of dashed line are consistent with the colors of distribution areas of a particular species. d The chromosome synteny between B. taurus, H. inermis, M. reevesi, female and male M. crinifrons with chromosome names shown above. 1p and 1q represent short arm and long arm of chromosome 1, respectively. The red line indicates the synteny blocks of female and male M. crinifrons in neo-Y inverted regions.

NATURE COMMUNICATIONS | (2021) 12:6858 | https://doi.org/10.1038/s41467-021-27091-0 | www.nature.com/naturecommunications

Who has the most chromosomes?

You (Homo sapiens): 46 chromosomes

Blue whale (Balaenoptera musculus): 48 chromosomes

Red viscacha rat (Tympanoctomys barrerae): 102 chromosomes

Atlas blue (Polyommatus atlanticus): 448-452 chromosomes


Chromosome number is not related to the size (or complexity) of the organism

Chromatin is a complex of nuclear DNA and proteins

Histones + non-histones + DNA = chromatin

There are huge amounts of histones in a cell (mass equals DNA)

Chromatin at interphase
= 30-nm thick threads
Nucleosomes become visible on experimentally decondensed fibers.

Are histones large or small molecules and are they charged?

This evidence suggests pairwise associations of the histones in chromatin but says nothing of details, such as whether the F2A1 and F3 pair, which occurs as an (F2A1)2(F3)2 tetramer in solution, also occurs as a tetramer in chromatin. The most direct evidence for an (F2A1)2(F3)2 tetramer in chromatin is that a complex formed from tetramers, F2A2-F2B oligomers, and DNA gives the same x-ray pattern as chromatin (Fig. 4, upper two traces). Tetramers and F2A2-F2B oligomers are both required to give the x-ray pattern (Fig. 4, lower two traces), but F1 is not, in keeping with previous observations (3, 23) that removing F1 from chromatin does not affect the x-ray pattern. Further implications of these results are discussed in the accompanying article (24).

We are currently studying associations of the histones in chromatin by cross-linking. There are two difficulties that do not arise in experiments on the histones in solution: the amino side chains are involved in salt linkages with the phosphate groups of DNA and are thus less available for chemical modification; and the presence of five rather than two histones complicates identification of products from molecular weights. Preliminary results do show less cross-linking of histones in chromatin than in solution, but cross-linked products up to pentamers are readily observed and call for further investigation.

References and Notes

1. Molecular weights are from R. J. DeLange and E. L. Smith, Accounts Chem. Res. 5, 368 (1972). Relative amounts of the histones are discussed in the accompanying article (24).
2. M. H. F. Wilkins, Cold Spring Harbor Symp. Quant. Biol. 21, 75 (1956); M. H. F. Wilkins, G. Zubay, H. R. Wilson, J. Mol. Biol. 1, 179 (1959); V. Luzzati and A. Nicolaieff, ibid. 7, 142 (1963).
3. B. M. Richards and J. F. Pardon, Exp. Cell Res. 62, 184 (1970).
4. J. F. Pardon and M. H. F. Wilkins, J. Mol. Biol. 68, 115 (1972).
5. P. A. Edwards and K. V. Shooter, Biochem. J. 114, 227 (1969).
6. R. Ziccardi and V. Schumaker, Biochemistry 12, 3231 (1973).
7. A. C. H. Durham, unpublished.
8. A. B. Barclay and R. Eason, Biochim. Biophys. Acta 269, 37 (1972).
9. D. R. van der Westhuyzen and C. von Holt, FEBS (Fed. Eur. Biochem. Soc.) Lett. 14, 333 (1971).
10. G. E. Davies and G. R. Stark, Proc. Natl. Acad. Sci. U.S.A. 66, 651 (1970).
11. S. Panyim and R. Chalkley, J. Biol. Chem. 246, 7557 (1971).
12. Electrophoretically pure F2A1 and F3 were gifts of Dr. E. W. Johns.
13. Preliminary work shows that the sedimentation coefficient of the van der Westhuyzen and von Holt (9) preparation of F2A1 and F3 is the same at pH 5 as at pH 7, so the two histones most likely occur as a tetramer in the Sephadex G-100 gel filtration (pH 5) just as they do in the cross-linking (pH 8) and sedimentation (pH 7) experiments described in Figs. 1 and 2.
14. A. J. Haydon and A. R. Peacocke, Chem. Soc. Spec. Publ. No. 23 (1968), p. 315.
15. R. I. Kelley, Biochem. Biophys. Res. Commun. 54, 1588 (1973).
16. U. K. Laemmli, Nature (Lond.) 227, 680 (1970).
17. Y. V. Ilyin, A. Ya. Varshavsky, U. N. Mickelsaar, G. P. Georgiev, Eur. J. Biochem. 22, 235 (1971).
18. R. J. DeLange, D. M. Fambrough, E. L. Smith, J. Bonner, J. Biol. Chem. 244, 5669 (1969).
19. L. Patthy, E. L. Smith, J. Johnson, ibid. 248, 6834 (1973).
20. G. S. Bailey and G. H. Dixon, ibid. p. 5463.
21. E. P. M. Candido and G. H. Dixon, Proc. Natl. Acad. Sci. U.S.A. 69, 2015 (1972).
22. S. C. Rall and R. D. Cole, J. Biol. Chem. 246, 7175 (1971).
23. K. Murray, E. M. Bradbury, C. Crane-Robinson, R. M. Stephens, A. J. Haydon, A. R. Peacocke, Biochem. J. 120, 859 (1970).
24. R. D. Kornberg, Science 184, 868 (1974).
25. G. Zubay and P. Doty, J. Mol. Biol. 1, 1 (1959).
26. S. Panyim, R. H. Jensen, R. Chalkley, Biochim. Biophys. Acta 160, 252 (1968).
27. O. H. Lowry, N. J. Rosebrough, A. L. Farr, R. J. Randall, J. Biol. Chem. 193, 265 (1951).
28. J. R. Pringle, Biochem. Biophys. Res. Commun. 39, 46 (1970).
29. K. Weber and M. Osborn, J. Biol. Chem. 244, 4406 (1969).
30. R. N. Perham and J. O. Thomas, FEBS (Fed. Eur. Biochem. Soc.) Lett. 15, 8 (1971).
31. S. M. McElvain and J. P. Schroeder, J. Am. Chem. Soc. 71, 40 (1949).
32. A. C. H. Durham, J. Mol. Biol. 67, 289 (1972).
33. E. M. Bradbury, H. V. Molgaard, R. M. Stephens, L. A. Bolund, E. W. Johns, Eur. J. Biochem. 31, 474 (1972).
34. E. R. M. Kay, N. S. Simmons, A. L. Dounce, J. Am. Chem. Soc. 74, 1724 (1952).
35. C. D. Laird, Chromosoma 32, 378 (1971).
36. F. W. Studier, J. Mol. Biol. 11, 373 (1965).
37. H. E. Huxley and W. Brown, ibid. 30, 383 (1967).
38. R. D. Kornberg, A. Klug, F. H. C. Crick, in preparation.
39. We thank Drs. A. Klug and F. H. C. Crick for helpful discussions and criticism of the manuscript. We thank Janet Francis for expert technical assistance. R.D.K. thanks the National Cystic Fibrosis Research Foundation for support during the early part of this work.

1974: birth of epigenetics
Introduction to Chromatin Structure

Chromatin Structure: A Repeating Unit of Histones and DNA

Chromatin structure is based on a repeating unit of eight histone molecules and about 200 DNA base pairs.

Roger D. Kornberg
https://science.sciencemag.org/content/184/4139/868.long

Evidence is given in the preceding article (1) for oligomers of the histones, both in solution and in chromatin. Here I wish to discuss this and other evidence in relation to the arrangement of histones and DNA in chromatin. In particular, I propose that the structure of chromatin is based on a repeating unit of two each of the four main types of histone and about 200 base pairs of DNA. A chromatin fiber may consist of many such units forming a flexibly jointed chain.

Chromatin of eukaryotes contains nearly equal weights of histone and DNA. This corresponds, on the basis of the molecular weights and relative amounts of the five main types of histone, F1, F2A1, F2A2, F2B, and F3, to roughly one of each type of histone per 100 base pairs of DNA with the exception of F1, of which there is half as much. The arrangement of histones and DNA involves repeats of structure. The first evidence of this comes from the work of Wilkins and co-workers (2) who obtained x-ray diffraction patterns from whole nuclei of cells showing relatively sharp bands. Chromatin isolated from the nuclei as a nearly pure complex of histone and DNA gives x-ray patterns with the same bands. Further x-ray work (3-5) has shown that these bands correspond

The author is a Junior Fellow of the Society of Fellows of Harvard University; he is working at the MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, England.

SCIENCE, VOL. 184, 868
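Kornberg's stoichiometry (one of each of four histones per 100 bp, half as much F1) can be checked against the statement that histone and DNA occur in nearly equal weights. The sketch below is only an illustration: the molecular masses are rounded approximations that I am assuming here, not values taken from the paper, and the modern names are used (F1 = H1, F2A1 = H4, F2A2 = H2A, F2B = H2B, F3 = H3).

```python
# Approximate molecular masses in kDa (assumed round values, not from the paper).
HISTONE_KDA = {"H1": 21.0, "H2A": 14.0, "H2B": 13.8, "H3": 15.3, "H4": 11.3}

# Kornberg's stoichiometry: one of each histone per 100 bp, but half as much F1 (H1).
COPIES_PER_100_BP = {"H1": 0.5, "H2A": 1.0, "H2B": 1.0, "H3": 1.0, "H4": 1.0}

BP_KDA = 0.65  # assumed average mass of one DNA base pair in kDa


def masses_per_repeat(n_bp=200):
    """Return (histone mass, DNA mass, ratio) for a repeat of n_bp base pairs."""
    units = n_bp / 100.0
    histone = sum(HISTONE_KDA[h] * COPIES_PER_100_BP[h] * units for h in HISTONE_KDA)
    dna = n_bp * BP_KDA
    return histone, dna, histone / dna


def core_histones_per_repeat(n_bp=200):
    """Number of core histone molecules (everything except H1) per repeat."""
    units = n_bp / 100.0
    return sum(COPIES_PER_100_BP[h] * units for h in HISTONE_KDA if h != "H1")


histone_kda, dna_kda, ratio = masses_per_repeat(200)
print(f"histone: {histone_kda:.1f} kDa, DNA: {dna_kda:.1f} kDa, ratio: {ratio:.2f}")
print(f"core histones per 200 bp: {core_histones_per_repeat(200):.0f}")
```

With these assumed masses the 200 bp repeat carries eight core histones plus one H1, and protein and DNA each weigh roughly 130 kDa, which is consistent with the "nearly equal weights" statement.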


Nucleosomes contain DNA wrapped around histones

The simplest packaging structure of DNA that is found in all eukaryotic chromosomes is the nucleosome.
DNA is wrapped 1.7-fold around an octamer of histones.
147 bp are wrapped around the core and the remaining bases link to the next nucleosome.
This structure causes negative supercoiling.

Some facts:

The nucleosome comprises about 200 bp of DNA (147 bp wrapped around the core plus linker) and a histone octamer that contains two copies each of the histone proteins H2A, H2B, H3 and H4.

These are known as the core histones. Histones are basic proteins that have an affinity for DNA.
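The numbers on this slide can be turned into a rough count of nucleosomes per cell. This is a back-of-the-envelope sketch: the haploid genome size of 3.2 billion bp is an assumed round figure for a human cell, and the 200 bp repeat is treated as uniform, which real chromatin is not.

```python
CORE_BP = 147                      # DNA wrapped around the histone octamer (from the slide)
REPEAT_BP = 200                    # approximate nucleosome repeat length (from the slide)
HISTONES_PER_OCTAMER = 8           # two copies each of H2A, H2B, H3 and H4
HAPLOID_GENOME_BP = 3_200_000_000  # assumed: rounded human haploid genome size

# DNA left over between two neighbouring core particles
linker_bp = REPEAT_BP - CORE_BP

# A diploid cell carries two copies of the genome
nucleosomes_per_diploid_cell = 2 * HAPLOID_GENOME_BP // REPEAT_BP
core_histone_molecules = nucleosomes_per_diploid_cell * HISTONES_PER_OCTAMER

print(f"linker DNA per nucleosome: {linker_bp} bp")
print(f"nucleosomes per diploid cell: {nucleosomes_per_diploid_cell:,}")
print(f"core histone molecules per diploid cell: {core_histone_molecules:,}")
```

Under these assumptions a single diploid nucleus holds tens of millions of nucleosomes and hundreds of millions of core histone molecules, which puts a number on the "huge amounts of histones" mentioned earlier.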

The histone core is disk-shaped

The high-resolution structure of the nucleosome core was solved in 1997.
https://www.nature.com/articles/38444

All core histones have N-terminal tails that extend out from the core particle.

Core histones are among the most highly conserved eukaryotic proteins.


Histone H1 is the linker histone

The fifth histone is called H1 and helps to pull nucleosomes together to form the 30-nm fiber.
H1 has a globular region and a pair of long tails at the N- and C-termini.
The globular region is possibly involved in constraining another 20 bp of DNA close to the nucleosome core.
The C-terminal tail binds to chromatin, but the position of both tails is not known.

Good question:
What is the function of the histone tails?
Article

Structure of mitotic chromosomes

Authors
Andrew J. Beel, Maia Azubel, Pierre-Jean Matteï, Roger D. Kornberg

Correspondence
mazubel@stanford.edu

In brief
Partial decondensation of mitotic chromosomes and cryo-electron tomography revealed the internal structure of the chromosomal material. Nucleosomes and linker DNA segments between them were observed, permitting chromatin fibers to be traced over distances of 10 kb (50 nucleosomes) or more. Patterns of coiling or folding in specific gene regions can now be determined.

Highlights
• Chromosomes were decondensed and recondensed in a physiologically relevant manner
• Cryo-ET revealed linker DNA and nucleosomes as double-layer discs
• Trajectories of chromatin fibers were irregular, unrelated to previous proposals
• Condensation occurred without higher-order structure in the regions examined

Beel et al., 2021, Molecular Cell 81, 4369–4376
November 4, 2021 © 2021 Elsevier Inc.
https://doi.org/10.1016/j.molcel.2021.08.020
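The tracing distance quoted in the summary, 10 kb for 50 nucleosomes, implies a repeat length and a DNA contour length that are easy to work out. This is a small sketch under one stated assumption: an axial rise of 0.34 nm per base pair for B-form DNA, a textbook value that does not come from the paper.

```python
RISE_NM_PER_BP = 0.34  # assumed: axial rise per base pair of B-form DNA


def repeat_length_bp(traced_bp, n_nucleosomes):
    """Average DNA length per nucleosome along a traced fiber."""
    return traced_bp / n_nucleosomes


def contour_length_um(n_bp, rise_nm=RISE_NM_PER_BP):
    """Length of the fully extended DNA double helix in micrometres."""
    return n_bp * rise_nm / 1000.0


traced_bp = 10_000   # "10 kb" from the summary
n_nucleosomes = 50   # "50 nucleosomes" from the summary

print(f"repeat length: {repeat_length_bp(traced_bp, n_nucleosomes):.0f} bp per nucleosome")
print(f"contour length of 10 kb: {contour_length_um(traced_bp):.1f} micrometres")
```

The summary's round numbers correspond to about 200 bp per nucleosome, in line with Kornberg's 1974 repeating unit, and the traced DNA would be a few micrometres long if stretched out.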
The structure of mitotic chromosomes (is surprising)

Figure 1. Decondensation and recondensation of chromosomes perfused with solutions of various composition
Chromosomes isolated in the condensed state in the presence of 0.375-mM spermidine (left-most column, +spd). The same chromosomes, decondensed after the removal of spermidine by perfusion with 5 mM Tris (pH 7.5), 2 mM KCl (second column from the left). Recondensation (third, fourth, and fifth columns from the left) brought about by perfusion with the chloride salts of spermine (spm) and magnesium (Mg) at the concentrations indicated. Scale bar, 2 μm.

neutral pH (Figure 1, top row, left panel). Chromosome decondensation was observed in the absence of spermidine in a solution of low ionic strength (Figure 1, top row, second panel). Chromosome recondensation was observed upon the addition of an agent with condensing activity, such as spermine (a polyamine bearing nearly four positive charges at neutral pH). By varying the concentration of spermine, the degree of chromosome condensation could be varied in an apparently continuous fashion (Figure 1, top row, second to fifth panels). Similar behavior was observed for a variety of other cations, such as magnesium (Figure 1, bottom row), consistent with prior studies (Maniotis et al., 1997; Poirier et al., 2002). Chromosome decondensation was evidently reversible in morphology, as revealed by light microscopy.

Relevance to chromosome condensation-decondensation in vivo
To assess the relevance of our perturbations of isolated chromosomes to the chromosomal dynamics of dividing cells, we made use of cells expressing GFP fused to H2B (Kanda et al., 1998), which allowed for comparative measurements of the lifetime of H2B-GFP in purified chromosomes (Figure 2A) and dividing cells (Figure 2B) as they underwent condensation-decondensation reactions. Fluorescent lifetime is exquisitely sensitive to the local environment, and, therefore, changes in lifetime are expected to accompany changes in condensation state. Previous studies employing small intercalating dyes have demonstrated a dependence of fluorescence lifetime on chromatin condensation, attributable to changes in Brownian polymer fluctuations and local viscosity (Spagnol and Dahl, 2016). The average fluorescence lifetime of purified, GFP-labeled chromosomes in the condensed state was 2.11 ± 0.08 ns (yellow curve in Figure 2C) and that of similarly labeled chromosomes during anaphase, when maximal chromosome condensation is attained (Mora-Bermúdez et al.,

The structure of mitotic chromosomes

Article

2007), was 1.95 ± 0.09 ns (yellow curve in Figure 2D). To compare these fluorescence lifetimes, it was necessary to account for refractive indices, in vivo and in vitro. Because lifetime is proportional to the inverse square of the refractive index (Suhling et al., 2002), transplantation of a chromosome from anaphase cytoplasm (n ≈ 1.36–1.39; Choi et al., 2007) to water (n ≈ 1.333) would, by the effect of refractive index alone, increase the observed lifetime from 1.94 ns to 2.03–2.12 ns, similar to that of purified chromosomes under conditions of maximal condensation. The fluorescence lifetime of maximally condensed, purified chromosomes increased by about 0.25 ns upon full decondensation in vitro (compare yellow and blue curves in Figure 2C), whereas an increase of about 0.18 ns was found upon decondensation in vivo in cells exiting mitosis (compare yellow and blue curves in Figure 2D). The temporal evolution of the H2B-GFP fluorescence lifetime throughout the cell cycle shows a clear demarcation between interphase and mitosis (Figures 2E and S1). The greater lifetime prolongation upon decondensation in vitro than in vivo merely reflects the capacity to induce full decondensation in vitro versus the partial decondensation characteristic of interphase chromatin in vivo, well known from electron microscopy of interphase nuclei and also consistent with the low euchromatin fraction typical of metazoan genomes. For the structural studies described below, we employed conditions promoting only partial decondensation, within the physiological range of chromatin condensation states.

Trajectories of chromatin fibers at subnucleosome resolution
Partial decondensation of mitotic chromosomes purified from human cells enabled the imaging of their internal structure by cryo-ET. The extent of decondensation in the regions analyzed was about 50% in any linear direction. A tilt series was collected with the use of a phase plate and an energy filter, with minimal accumulated electron dose, at a magnification corresponding to a pixel size of 1.34 Å. The resulting tomograms revealed the chromatin structure at a sub-nucleosomal resolution. The locations and orientations of nucleosomes could be determined, and in many views, the two gyres of DNA around the histone octamer could be distinguished (Figure 3). A procedure was devised (see below) whereby all linker DNA segments within a region could be discerned. A number of tomograms were obtained, and portions of two of them, denoted as tomograms 1 and 2, were subjected to manual segmentation (by the docking of nucleosomes and the tracing of linker DNA), yielding a complete, three-dimensional view of the internal structure.

Tracing linker DNA, heretofore an unattainable goal (Cai et al., 2018b; Ou et al., 2017), is the key to the elucidation of chromatin structure because linker DNA reveals the connectivity and, thus, the trajectories of chromatin fibers. Such tracing has been especially challenging, for linker DNA not contained in a plane perpendicular to the electron microscope beam (the xy-plane). Although in-plane linker DNA is clearly visualized as an element joining two nucleosomes, out-of-plane linker DNA segments only appear as small, round or oval spots (depending on the angle between the linker DNA and the xy-plane), which cannot be unambiguously assigned as DNA. Tomograms suffer from an anisotropic loss of resolution because of the missing wedge of information in the direction parallel to the beam. The resulting distortion hinders the recognition of features in xz and yz views, so the visualization and segmentation of tomograms is primarily done with the use of views obtained by representing the volume as thin xy-slices. Although linker DNA lying in the xy-plane could be traced by conventional display of xy-slices, a different strategy was required for tracing linker DNA passing through multiple slices. A slightly tilted, thin slab comprising several consecutive tomographic slices captured pertinent three-dimensional information without distortion or interference due to the overlap of density from additional slices. By means of this procedure, round or oval features of appropriate size present in consecutive slices could be confidently identified as linker DNA (Figure S2; Video S1). Translating the slab through the tomogram revealed the entire path of the linker DNA (Figure 4). This procedure is illustrated by an example of two apparently unconnected nucleosomes found in different slices of a tomogram (Figures 4A and 4B). The linker between them could not be recognized in individual slices (Figures 4C and 4D) but was revealed in thin slabs (Figures 4E and 4F) and was traced from one nucleosome to the other. Application of this procedure to the entire

Figure 2. Relevance to chromosome condensation-decondensation in vivo
(A) Fluorescence lifetime maps of H2B-GFP in condensed (left) and decondensed (right), purified chromosomes; scale bar, 1 μm. The color bar (bottom) shows the correspondence with H2B-GFP lifetime in units of nanoseconds.
(B) Lifetime maps of H2B-GFP in a dividing cell at the interphase (left), prophase, metaphase, and anaphase (right), respectively; scale bar, 5 μm. The color bar (right) shows the correspondence with H2B-GFP lifetime.
(C) Distribution of fluorescence lifetime from n = 94 purified chromosomes in condensed (yellow) and decondensed (blue) states (shading denotes the 95% confidence interval). The mean lifetimes (± SD) are 2.11 ± 0.08 (condensed) and 2.36 ± 0.08 ns (decondensed).
(D) Distribution of fluorescence lifetime from n = 13 live cells imaged at different stages of the cell cycle (shading denotes the 95% confidence interval). The mean lifetimes (± SD) are 2.13 ± 0.06 (interphase; n = 204 images), 2.08 ± 0.06 (prophase; n = 17), 2.01 ± 0.08 (metaphase; n = 37), and 1.95 ± 0.09 ns (anaphase; n = 8).
(E) Average relative fluorescence lifetime of GFP-H2B from late G2 phase to early G1 phase of the cell cycle, centered (t = 0) around the time of anaphase (n = 13 cells).
See also Figure S1.

Figure 3. Cryo-ET reconstruction of partially decondensed mitotic chromosomes
(A) Gallery of edge and face views of nucleosomes extracted from the tomograms. Scale bar, 10 nm.
(B and C) 4-nm slabs, comprising 10 slices of tomogram 1, displayed with inverted contrast. Scale bar, 10 nm.
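The refractive-index correction in the excerpt above follows from the stated proportionality of lifetime to the inverse square of the refractive index, so a lifetime measured in medium 1 scales by (n1/n2)² when the fluorophore is moved to medium 2. The sketch below applies that relation to the numbers in the text; the function name is hypothetical, and I start from the anaphase mean of 1.95 ns reported in the text, an assumption on my part because it reproduces the quoted 2.03–2.12 ns range (starting from 1.94 ns gives values about 0.01 ns lower).

```python
def lifetime_after_transfer(tau_ns, n_from, n_to):
    """Rescale a fluorescence lifetime for a change of refractive index.

    Assumes lifetime is proportional to 1 / n**2, as stated in the excerpt,
    so tau_to = tau_from * (n_from / n_to) ** 2.
    """
    return tau_ns * (n_from / n_to) ** 2


N_WATER = 1.333          # refractive index of water, from the excerpt
TAU_ANAPHASE_NS = 1.95   # mean anaphase lifetime in vivo, from the excerpt

# Range of refractive indices quoted for anaphase cytoplasm
low = lifetime_after_transfer(TAU_ANAPHASE_NS, 1.36, N_WATER)
high = lifetime_after_transfer(TAU_ANAPHASE_NS, 1.39, N_WATER)

print(f"expected lifetime in water: {low:.2f} to {high:.2f} ns")
```

The corrected range brackets the 2.11 ns measured for condensed, purified chromosomes, which is the comparison the authors draw.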

Figure 4. Tracing of linker DNA


(A and B) Slices of tomograms with two apparently unconnected nucleosomes (nucleosomes 217 and 471) circled in red. Insets: 2-fold magnification of the areas
around the circled nucleosomes, into which nucleosome core-particle maps (EMDB: EMD-8140) (Chua et al., 2016), displayed in white with a green outline, have
been manually docked.
(C) Tilted view of an intermediate slice with a small, round density indicated by the red arrow.
(D) Same as (C), with the DNA path, deduced below, shown by a white line.
(E and F) Tilted view of slabs revealing linker DNA density. Nucleosomes 217 and 417 are displayed using manually docked EMDB: EMD-8140 maps (shown in
white with a green outline). The path of the linker DNA is shown by a white line.
Scale bars, 10 nm. The box diagrams on either side are schematic representations of the slice/slab positions (gray) within the context of the tomogram (red
outline). See also Figure S2 and Video S1.

tomogram revealed that 74.1% of linker DNA segments passed through multiple slices.

Attempts to segment chromatin fibers automatically using neural networks in EMAN2 were unsuccessful. Nucleosomes were readily located, but the linker DNA connecting them could not be unambiguously identified. Some 2,600 subtomograms, each containing a single nucleosome located by template matching, were extracted, aligned, and averaged. The resulting density map was then placed in the tomogram at the location of each nucleosome. Comparison with the results of manual docking showed close agreement in both nucleosome location and orientation (Figure S3A). However, the procedures employed for automatic segmentation of linker DNA failed to segment linker DNA passing through multiple slices of a tomogram, and even when the DNA was contained in a single slice of the tomogram, the identification was unreliable, with a variety of artifacts, such as false connectivity and dangling ends (Figure S3B).

Linker DNA was often seen to be curved or bent (Figure 5A). As expected from nuclease digestion analysis (Prunell and Kornberg, 1982), linker DNA varied in length. Distributions for regions analyzed from two chromosomes were centered on 43 ± 1 bp and 42 ± 2 bp (with dispersions of 16 and 19 bp) (Figure 5B), close to the value of 38 bp obtained by subtraction of the core particle DNA length of 146 bp from the nucleosome repeat length of 184 bp (measured by micrococcal nuclease digestion of HeLa metaphase chromosomes [De Ambrosis et al., 1987]). The core DNA length of 146 bp, corresponding to one and two-thirds turns around the nucleosome, is an intermediate in nuclease digestion and is reduced upon more extensive digestion to smaller sizes. The close agreement between our measurement by cryo-ET and the result from nuclease digestion shows that the 146-bp core particle, whose physiological relevance has never been established, is indeed representative of the actual extent of wrapping around the histone core of the nucleosome in chromosomes (but see below).

The length of linker DNA varied from one segment to the next. Long stretches with a consistent linker length were rare; among the 548 linkers analyzed, not more than six consecutive linkers were of the same length, within ±5 bp (Figure S4). There was no discernible pattern in the length difference between entry and exit linkers ("ΔL") for consecutive nucleosomes, except for

The structure of mitotic chromosomes

The variation of linker DNA length precludes any uniform mode of coiling!
Figure 5. Linker DNA
(A) Densities cropped from tomogram 1 depicted in mesh representation, into which nucleosome core-particle maps (EMDB: EMD-8160) (Chua et al., 2016) have been manually docked. The lengths of linker DNA are indicated.
(B) Distributions of linker-DNA lengths in segmented regions of tomograms 1 (blue) and 2 (orange).
(C) Angles between DNA entering and exiting nucleosomes. Each point corresponds to the angle measured for a single nucleosome. The vertical white band contains all angular data from fibers with 10 or fewer nucleosomes. Each vertical colored band contains angular data for the nucleosomes, consecutively displayed, of single fibers with more than 10 nucleosomes (coloring matches that of Figure 6). The zone between the horizontal red lines encompasses 50% of the nucleosomes. The zone between the horizontal green lines, corresponding to the angular range 60° ± 5°, contains all nucleosomes with extra density, in a location expected for histone H1 (indicated by red dots).
(D) Tomographic density represented as a solid (top panels) or a mesh (bottom panels), shown alone (top left, with a red arrow indicating extra density in the location expected for histone H1) and after docking of a nucleosome core particle map (EMDB: EMD-8160) (top right), a ribbon representation of the 197-bp nucleosome-histone H1 complex (PDB: 5NL0; Bednar et al., 2017) (bottom left), or a surface representation of the globular domain of histone H1 (bottom right).
See also Figure S4.
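
The linker-length arithmetic quoted in the text can be checked with a few lines of Python. This is only a sketch: the three constants are the values given in the text, whereas the list `example_linkers_bp` and the helper `longest_uniform_run` are hypothetical, and "same length within ±5 bp" is read here as "within 5 bp of the first linker in the run", which is one possible interpretation and not necessarily the authors' definition.

```python
# Values quoted in the text.
NUCLEOSOME_REPEAT_LENGTH_BP = 184   # nuclease digestion of HeLa metaphase chromosomes
CORE_PARTICLE_DNA_BP = 146          # DNA wrapped in the core particle
MEASURED_MEAN_LINKER_BP = (43, 42)  # cryo-ET distributions, tomograms 1 and 2

# Expected linker length = repeat length minus core particle DNA.
expected_linker_bp = NUCLEOSOME_REPEAT_LENGTH_BP - CORE_PARTICLE_DNA_BP


def longest_uniform_run(linkers_bp, tolerance_bp=5):
    """Longest stretch of consecutive linkers that all lie within
    tolerance_bp of the first linker of that stretch (assumed definition)."""
    best = 0
    for start in range(len(linkers_bp)):
        length = 0
        for value in linkers_bp[start:]:
            if abs(value - linkers_bp[start]) > tolerance_bp:
                break
            length += 1
        best = max(best, length)
    return best


# Hypothetical linker lengths, invented purely for illustration.
example_linkers_bp = [38, 41, 40, 62, 25, 44, 47, 43, 71, 30]

print("expected linker length:", expected_linker_bp, "bp")
print("measured means:", MEASURED_MEAN_LINKER_BP, "bp")
print("longest uniform run:", longest_uniform_run(example_linkers_bp))
```

With real data, a regular coiling model would predict long uniform runs; the text reports that no run exceeded six linkers among the 548 analyzed.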

No angular bias was observed in the orientations of linker DNA segments!

some cases in which that difference alternated between large positive and negative values (|ΔL| > 30 bp) (Figure S4, green boxes). The angle between the DNA entering and exiting a nucleosome also varied, ranging from 0° to almost 180°, with angles for about half the nucleosomes falling between 40° and 80° (Figure 5C). Only about 10% of nucleosomes fell in the narrower range of 55°–65°, consistent with one and two-thirds turns of the DNA in a 146-bp particle (labeled by green lines in Figure 5C), and these nucleosomes included all members of a special class (labeled by red dots in Figure 5C) that exhibited extra density (Figure 5D) at the location expected for the globular domain of histone H1 (Bednar et al., 2017). Limitations because of the missing wedge of information from the tomograms might have prevented the identification of some members of this class. Nevertheless, nucleosomes were evidently more or less equally under- and overwrapped, so the 146-bp particle represents an average, rather than a unique state, of the nucleosome in chromosomes.

Every nucleosome in both tomograms was connected by linker DNA to two neighbors, except for nucleosomes at the edges of the regions analyzed. Chromatin fibers, some containing more than 50 nucleosomes, could therefore be traced throughout the tomograms (Figure 6; Video S2). The fibers followed irregular paths and were not entangled or intertwined, consistent with analysis by chromosome conformation capture (Tavares-Cadete et al., 2020). The close approximation of some fibers (Figure S5A) demonstrated a potential source of error in connectivity assignments that are based solely upon proximity of nucleosomes (Ou et al., 2017). Rotating irregular fibers showed that they could give the appearance of regular, 30-nm fibers, if viewed in certain orientations (Figure S5B).

Only about 10% of nucleosomes were found in coiling motifs similar to those previously reported for reconstituted chromatin (Robinson et al., 2006; Song et al., 2014), and such motifs were never longer than eight nucleosomes (Figure S5C). Moreover, such motifs exhibited greater irregularity than did those in reconstituted chromatin (Figure 7). For example, in regions in which the electron density resembled the tetrameric unit of the "zigzag" 30-nm fiber (Song et al., 2014) or where the density resembled "overlapping dinucleosomes" (Kato et al., 2017), significant alteration of the published structures would be required to fit the density (red arrows in Figure 7). The lengths of linker DNA in regions resembling the tetrameric unit were similar but not identical, and we find extensive variation of linker length, whereas all published coiling motifs require uniformity.

The segmented regions of tomograms 1 and 2 contained 520 and 102 nucleosomes, connected by 470 and 78 linkers, respectively (Figures S6). Based on these numbers, the densities of chromatin in the tomograms were about 10% (w/v), compared with an average density of about 36% for mitotic chromosomes (Bennett et al., 1983). The segmented regions were, therefore, decondensed by factors of three to four with respect to the average of the fully condensed state.

The structure of mitotic chromosomes

Figure 6. Chromatin fibers in a partially decondensed mitotic chromosome
(A) A 4-nm slab, comprising 10 slices of tomogram 1, displayed with inverted contrast. Selected fibers, with nucleosome core-particle maps (EMDB: EMD-8160; Chua et al., 2016) manually docked and linker DNA traced. Scale bars, 10 nm.
(B) A completely segmented volume from tomogram 1. Nucleosome core particles are colored according to fiber membership, and linker DNA is shown in black.
(C) Same as (B), but rotated 90° about a horizontal axis in the plane of (B).
(D) Examples of individual fibers from the segmented volume in (B) and (C).
See also Figures S3, S5, and S6 and Video S2.

DISCUSSION

Partial decondensation of chromosomes in vitro opens a window on their internal structure. It may be asked to what extent does such decondensation perturb the chromatin structure. Might any of the modes of chromatin fiber coiling previously reported for isolated or reconstituted chromatin (Dorigo et al., 2004; Finch and Klug, 1976; Robinson et al., 2006; Song et al., 2014), absent from our tomograms, nevertheless occur in the fully condensed state? Two observations argue against the disruption of previously reported modes of coiling by decondensation. First, assuming decondensation to be isotropic (Poirier et al., 2002; Beel et al., 2021), the densities of the regions we analyzed correspond to extension in any direction by only about 50%. It was not apparent how the fiber trajectories we observed could derive from any reported uniform mode of coiling by stretching to that extent. Second, the variation of linker DNA length that we observed (Figures 5, 7, and S4) precludes any uniform mode of coiling, and that variation could not have been caused by stretching because the average linker DNA length we measured was virtually identical to the linker DNA length determined by micrococcal nuclease digestion. The regular modes of coiling identified in prior studies have been based on the observation of artificial chromatin composed of regular arrays of strong nucleosome-positioning sequences. In addition, no angular bias was observed in the orientations of linker DNA segments.

Local density analysis of the chromosomal material revealed regions with densities comparable with those of 30-nm fibers assembled in vitro, despite the absence of evidence of a hierarchical organization (Video S3). The possibility remains that other organized structures are formed in more condensed states or through the actions of chromosomal proteins, such as heterochromatin protein 1 and Polycomb group proteins, in regions not sampled by our tomograms.

Our findings open the way to the elucidation of the structure in specific chromosomal regions. Partial decondensation of chromosomes will facilitate the penetration of reagents, such as antibody fragments and sequence-specific DNA-binding proteins, tagged with fluorescent dyes and heavy atom clusters. Structural features that are conserved among instances of the same chromosome may be investigated, and structural variation can be determined as well.

Limitations of the study
During specimen preparation for cryo-electron microscopy (cryo-EM), damage may be caused by adsorption to the electron-microscope grid, blotting to remove excess liquid, and freezing. Cryo-EM analysis is limited by maximal penetration of the electron beam to regions of sufficient thinness, which may have been produced in the present study by flattening during adsorption to the grid and blotting. Alternatively, such regions may represent the natural state of the material at the periphery of the chromosome. These possibilities remain to be resolved.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:
• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead contact
  ◦ Materials availability
  ◦ Data and code availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
• METHOD DETAILS
  ◦ Chromosome purification
  ◦ Widefield microscopy
  ◦ Fluorescence lifetime imaging
  ◦ Sample preparation and vitrification
  ◦ Cryo-ET data collection
  ◦ Cryo-ET data processing and analysis
• QUANTIFICATION AND STATISTICAL ANALYSIS

SUPPLEMENTAL INFORMATION
Supplemental information can be found online at https://doi.org/10.1016/j.molcel.2021.08.020.

ACKNOWLEDGMENTS
We thank Peter Geiduschek for reviewing this manuscript; Geoffrey Wahl and Teru Kanda for generously providing the H2B-GFP-expressing HeLa cell line; Jon Mulholland and the Stanford Cell Sciences Imaging Facility for light microscope access and training; and David Bushnell and the Stanford-SLAC EM

The structure of mitotic chromosomes (is surprising)

Linker DNA regions were traced, revealing the trajectories of the chromatin fibers. The trajectories were irregular, with almost no evidence of coiling and no short- or long-range order of the chromosomal material.


The 146-bp core particle, long known as a product of nuclease digestion, is identified as the native state of the nucleosome, with no regular spacing along the chromatin fibers.

The regular modes of coiling identified in prior studies have been based on the observation of artificial chromatin composed of regular arrays of strong nucleosome-positioning sequences.
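
The "extension by only about 50%" quoted from the paper follows from the two densities given there, if one assumes (as the authors do) that decondensation is isotropic. A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Densities quoted in the text (percent, w/v).
CONDENSED_DENSITY_PERCENT = 36.0  # average for mitotic chromosomes
TOMOGRAM_DENSITY_PERCENT = 10.0   # segmented regions of the tomograms

# Same mass in a larger volume: the volume grows by the density ratio.
volume_factor = CONDENSED_DENSITY_PERCENT / TOMOGRAM_DENSITY_PERCENT

# Isotropic swelling: each of the three dimensions grows by the same
# factor, so the linear factor is the cube root of the volume factor.
linear_factor = volume_factor ** (1.0 / 3.0)
extension_percent = (linear_factor - 1.0) * 100.0

print(f"volume factor:    {volume_factor:.1f}")
print(f"linear factor:    {linear_factor:.2f}")
print(f"linear extension: {extension_percent:.0f} %")
```

A three- to fourfold loss of density therefore corresponds to stretching in any one direction by only about half, which is why the authors argue that decondensation alone could not have erased a regular coil.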

Histone modifications may be inherited

After replication and mitosis, each daughter chromosome contains two types of nucleosomes:
(1) inherited from the parent chromosome and carrying its modifications; (2) newly synthesized and not yet modified.
The modified histones are recognized, and the modification is copied onto the neighbouring unmodified nucleosomes.
This is epigenetic inheritance: the information is passed on without any change to the DNA sequence.
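
The read-and-copy idea can be made concrete with a toy model. The sketch below is a deliberate oversimplification and not a description of any real enzyme: nucleosomes are a list of booleans, `replicate` hands each parental nucleosome to the daughter with probability 0.5 (an assumption), and `spread_once` lets a mark jump only to direct neighbours. All function names are hypothetical.

```python
import random


def replicate(marks, rng):
    """Daughter strand after replication: each parental nucleosome is kept
    with probability 0.5; the gaps are filled with new, unmodified ones."""
    return [m and rng.random() < 0.5 for m in marks]


def spread_once(marks):
    """One round of read-write copying: an unmodified nucleosome becomes
    modified if a direct neighbour carries the mark."""
    n = len(marks)
    return [
        marks[i]
        or (i > 0 and marks[i - 1])
        or (i < n - 1 and marks[i + 1])
        for i in range(n)
    ]


def restore(marks, max_rounds=100):
    """Repeat the copying step until the domain is fully marked again
    (or nothing can change). Returns the final state and the rounds used."""
    rounds = 0
    while any(marks) and not all(marks) and rounds < max_rounds:
        marks = spread_once(marks)
        rounds += 1
    return marks, rounds


parent = [True] * 20                       # fully modified parental domain
daughter = replicate(parent, random.Random(1))
restored, rounds = restore(daughter)
print("marks kept after replication:", sum(daughter), "of", len(parent))
print("fully restored:", all(restored), "after", rounds, "rounds")
```

In this toy model a domain that starts without any mark stays unmarked, and a domain that keeps at least one marked nucleosome is restored completely; real chromatin additionally needs boundaries that stop the mark from spreading.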


All histone domains can be post-translationally modified

https://www.nature.com/articles/cr201122

Cell Research (2011) 21:381-395.
© 2011 IBCB, SIBS, CAS All rights reserved 1001-0602/11 $ 32.00
REVIEW www.nature.com/cr

Regulation of chromatin by histone modifications

Andrew J Bannister1, Tony Kouzarides1

1The Gurdon Institute and Department of Pathology, University of Cambridge, Cambridge CB2 1QN, UK

Chromatin is not an inert structure, but rather an instructive DNA scaffold that can respond to external cues to regulate the many uses of DNA. A principal component of chromatin that plays a key role in this regulation is the modification of histones. There is an ever-growing list of these modifications and the complexity of their action is only just beginning to be understood. However, it is clear that histone modifications play fundamental roles in most biological processes that are involved in the manipulation and expression of DNA. Here, we describe the known histone modifications, define where they are found genomically and discuss some of their functional consequences, concentrating mostly on transcription where the majority of characterisation has taken place.
Keywords: histone; modifications; chromatin
Cell Research (2011) 21:381-395. doi:10.1038/cr.2011.22; published online 15 February 2011

Introduction

Ever since Vincent Allfrey's pioneering studies in the early 1960s, we have known that histones are post-translationally modified [1]. We now know that there are a large number of different histone post-translational modifications (PTMs). An insight into how these modifications could affect chromatin structure came from solving the high-resolution X-ray structure of the nucleosome in 1997 [2]. The structure indicates that highly basic histone amino (N)-terminal tails can protrude from their own nucleosome and make contact with adjacent nucleosomes. It seemed likely at the time that modification of these tails would affect inter-nucleosomal interactions and thus affect the overall chromatin structure. We now know that this is indeed the case. Modifications not only regulate chromatin structure by merely being there, but they also recruit remodelling enzymes that utilize the energy derived from the hydrolysis of ATP to reposition nucleosomes. The recruitment of proteins and complexes with specific enzymatic activities is now an accepted dogma of how modifications mediate their function. As we will describe below, in this way modifications can influence transcription, but since chromatin is ubiquitous, modifications also affect many other DNA processes such as repair, replication and recombination.

Histone acetylation

Allfrey et al. [1] first reported histone acetylation in 1964. Since then, it has been shown that the acetylation of lysines is highly dynamic and regulated by the opposing action of two families of enzymes, histone acetyltransferases (HATs) and histone deacetylases (HDACs; for review, see reference [3]). The HATs utilize acetyl CoA as cofactor and catalyse the transfer of an acetyl group to the ε-amino group of lysine side chains. In doing so, they neutralize the lysine's positive charge and this action has the potential to weaken the interactions between histones and DNA (see below). There are two major classes of HATs: type-A and type-B. The type-B HATs are predominantly cytoplasmic, acetylating free histones but not those already deposited into chromatin. This class of HATs is highly conserved and all type-B HATs share sequence homology with scHat1, the founding member of this type of HAT. Type-B HATs acetylate newly synthesized histone H4 at K5 and K12 (as well as certain sites within H3), and this pattern of acetylation is important for deposition of the histones, after which the marks are removed [4].

The type-A HATs are a more diverse family of enzymes than the type-Bs. Nevertheless, they can be classified into at least three separate groups depending on amino-acid sequence homology and conformational structure: GNAT, MYST and CBP/p300 families [5]. Broadly

Correspondence: Tony Kouzarides
Tel: +44-1223-334112; Fax: +44-1223-334089
E-mail: t.kouzarides@gurdon.cam.ac.uk

Interplay of factors at an active gene in yeast

Andrew J Bannister and Tony Kouzarides, Cell Research, page 389

constitutive heterochromatin.

(a) Facultative heterochromatin consists of genomic regions containing genes that are differentially expressed through development and/or differentiation and which then become silenced. A classic example of this type of heterochromatin is the inactive X-chromosome present within mammalian female cells, which is heavily marked by H3K27me3 and the Polycomb repressor complexes (PRCs) [87]. This co-localization makes sense because the H3K27 methyltransferase EZH2 resides within the trimeric PRC2 complex. Indeed, recent elegant work has shed light on how H3K27me3 and PRC2 are involved in positionally maintaining facultative heterochromatin through DNA replication [88]. Once established, it seems that H3K27me3 recruits PRC2 to sites of DNA replication, facilitating the maintenance of H3K27me3 via the action of EZH2. In this way, the histone mark is 'replicated' onto the newly deposited histones and the facultative heterochromatin is maintained.

(b) Constitutive heterochromatin contains permanently silenced genes in genomic regions such as the centromeres and telomeres. It is characterised by relatively high levels of H3K9me3 and HP1α/β [87]. As discussed above, HP1 dimers bind to H3K9me2/3 via their chromodomains, but importantly they also interact with SUV39, a major H3K9 methyltransferase. As DNA replication proceeds, there is a redistribution of the existing modified histones (bearing H3K9me3), as well as the deposition of newly synthesized histones into the replicated chromatin. Since HP1 binds to SUV39, it is tempting to speculate that the proteins generate a feedback loop capable of maintaining heterochromatin positioning following DNA replication [68]. In other words, during DNA replication, HP1 binds to nucleosomes bearing H3K9me2/3, thereby recruiting the SUV39 methyltransferase, which in turn methylates H3K9 in adjacent nucleosomes containing unmodified H3. Furthermore, this positive feedback mechanism helps to explain, at least in part, the highly dynamic nature of heterochromatin, not least its ability to encroach into euchromatic regions unless it is checked from doing so.

Euchromatin

In stark contrast to heterochromatin, euchromatin is a far more relaxed environment containing active genes. However, as with heterochromatin, not all euchromatin is the same. Certain regions are enriched with certain histone modifications, whereas other regions seem relatively devoid of modifications. In general, modification-rich 'islands' exist, which tend to be the regions that regulate transcription or are the sites of active transcription [86]. For instance, active transcriptional enhancers contain relatively high levels of H3K4me1, a reliable predictive feature [89]. However, active genes themselves possess a high enrichment of H3K4me3, which marks the transcriptional start site (TSS) [86, 90]. In addition, H3K36me3 is highly enriched throughout the entire transcribed region [91]. The mechanisms by which H3K4me1 is laid down at enhancers is unknown, but work in yeast has provided mechanistic detail into how the H3K4 and H3K36 methyltransferases are recruited to genes, which in turn helps to explain the distinct distribution patterns of these two modifications (Figure 3).

The scSet1 H3K4 methyltransferase binds to the serine 5 phosphorylated CTD of RNAPII, the initiating form of polymerase situated at the TSS [92]. In contrast, the scSet2 H3K36 methyltransferase binds to the serine 2 phosphorylated CTD of RNAPII, the transcriptional elongating form of polymerase [93]. Thus, the two enzymes are recruited to genes via interactions with distinct forms of RNAPII, and it is therefore the location of the different forms of RNAPII that defines where the modifi-

Figure 3 Interplay of factors at an active gene in yeast (adapted from references [128] and [3]).

www.cell-research.com | Cell Research

Histone modifications 2023

Trends in Biochemical Sciences (open access)

https://www.cell.com/trends/biochemical-sciences/fulltext/S0968-0004(20)30324-8

Figure 3. Histone Modifications. Shown are chemical structures for various histone modifications that are expected to change charge/sterics (top), hydrophobicity/sterics (bottom, left), or all upon conjugation of small proteins such as ubiquitin or small ubiquitin-like modifier (SUMO) (bottom, right). These are all expected to alter tail/DNA interactions.
Abbreviation: PTM, post-translational modification.

10 Trends in Biochemical Sciences, Month 2021, Vol. xx, No. xx

Chromatin-remodeling can reposition DNA on nucleosomes

The packaging of DNA must be dynamic.

The information must be available upon request.

Chromatin-remodeling complexes use the energy of ATP hydrolysis to change the position of DNA wrapped around nucleosomes.
In this way, the chromatin can be decondensed.


Lecture One

(1) DNA as genetic material was disputed until Alfred Hershey and Martha Chase did their “blender experiment” in 1952. What did they do?

(2) What is the biological significance of the two DNA grooves?

(3) Chromosome condensation is a complex, active process

(4) Three DNA elements are essential for chromosomes

(5) Chromosome number is not related to the size (or complexity) of the organism

(6) Chromatin-remodeling can reposition DNA on nucleosomes

(7) Histone modifications may be inherited

Core histone tails and the fuzzy complex

https://www.cell.com/trends/biochemical-sciences/fulltext/S0968-0004(20)30324-8

TIBS 1764 No. of Pages 15


2021

Trends in
Biochemical Sciences OPEN ACCESS

Review

Histone Tail Conformations: A Fuzzy Affair


with DNA
Mohamed Ghoneim,1 Harrison A. Fuchs,1 and Catherine A. Musselman 1,
*

The core histone tails are critical in chromatin structure and signaling. Studies Highlights
over the past several decades have provided a wealth of information on the his- Eukaryotic DNA is wrapped around his-
tone tails and their interaction with chromatin factors. However, the conforma- tone proteins to form nucleosomes that
fold into higher-order chromatin struc-
tion of the histone tails in a chromatin relevant context has remained elusive.
tures, and the local chromatin structure
Only recently has enough evidence emerged to start to build a structural model regulates all DNA-templated processes.
of the tails in the context of nucleosomes and nucleosome arrays. Here, we
review these studies and propose that the histone tails adopt a high-affinity All core histone proteins contain intrinsi-
cally disordered tail regions that protrude
fuzzy complex with DNA, characterized by robust but dynamic association. Fur- from the DNA-wrapped core and are
thermore, we discuss how these DNA-bound conformational ensembles pro- known to be critical in chromatin
mote distinct chromatin structure and signaling, and that their fuzzy nature is regulation.
important in transitioning between functional states.
Recent studies have revealed that the core histone tails adopt multiple conformations on the nucleosomal and linker DNA; these tail/DNA interactions are robust, but exchange quickly between multiple conformations consistent with a so-called fuzzy complex.

Intra- versus inter-nucleosome contacts by the tails differentially contribute to the regulation of local chromatin state and thus regulation of DNA-templated processes.

Histone post-translational modifications and chromatin-associated factors can modulate these fuzzy conformational ensembles and tail accessibility, indicating that the tail/DNA interactions are an important regulatory mechanism of chromatin.

Glossary

H3 either unmodified or methylated at Lys4, or bromodomains which recognize various acetylated lysines on histones.

Super helical location (SHL): a specific DNA helical turn within the nucleosome core particle; the major grooves facing the histone core are numbered +1 through +7 and -1 through -7 either direction starting from the dyad (which is denoted 0), and the minor grooves are numbered in half steps.

Trends in Biochemical Sciences, OPEN ACCESS

Histone Tails: Dynamic Hubs for Chromatin Signaling

The eukaryotic genome exists in the cell nucleus as chromatin, a complex between the genomic DNA and proteins known as histones. The most basic repeating unit of chromatin is the nucleosome core particle (NCP) (see Glossary), in which ~147 bp of DNA wrap around an octamer that contains two each of the core histone proteins H2A, H2B, H3, and H4 [1]. NCPs are flanked by linker DNA, which is of variable length (10–70 bp) depending on the local chromatin state [2]. The dynamic organization of these nucleosome particles, both spatially and temporally, is critical in regulation of the underlying genome and in the proper execution of all DNA templated processes [3]. Chromatin modulation is orchestrated by a slew of chromatin-associated proteins (CAPs). In addition, post-translational modifications (PTMs) on the histone proteins can directly regulate chromatin or indirectly regulate it through modulation of CAP activity [4].

Much effort has been placed on building an understanding of the structure and dynamics of nucleosomes and chromatin. Several near atomic resolution structures of NCPs have been solved, and lower resolution structures of nucleosome arrays and nucleosomes in complex with CAPs are more recently being tackled. Together, these have given us great insight into chromatin structure [5–8]. In addition, mechanisms of inherent nucleosome dynamics have been characterized including DNA breathing (i.e., spontaneous reversible unwrapping) of the DNA at the entry/exit points [9–11].
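The numbers quoted above (~147 bp per nucleosome core, 10–70 bp of linker DNA) are enough for a rough estimate of how many nucleosomes a human nucleus holds. The sketch below is only a back-of-the-envelope calculation: the haploid genome size of 3.2 × 10^9 bp and the rise of 0.34 nm per base pair are assumptions that do not appear in the excerpt, and the function names are purely illustrative.

```python
# Back-of-the-envelope estimate of the nucleosome count in a diploid human nucleus.
CORE_BP = 147                # DNA wrapped around one histone octamer (from the text)
LINKER_BP_RANGE = (10, 70)   # range of linker DNA lengths (from the text)
HAPLOID_GENOME_BP = 3.2e9    # ASSUMPTION: approximate human haploid genome size
RISE_PER_BP_NM = 0.34        # ASSUMPTION: rise per base pair of B-form DNA


def nucleosomes_per_genome(genome_bp, linker_bp):
    """Number of nucleosomes if every repeat is core DNA plus one linker."""
    return genome_bp / (CORE_BP + linker_bp)


def contour_length_m(genome_bp):
    """Length of the fully stretched DNA in metres."""
    return genome_bp * RISE_PER_BP_NM * 1e-9


diploid_bp = 2 * HAPLOID_GENOME_BP
for linker in LINKER_BP_RANGE:
    count = nucleosomes_per_genome(diploid_bp, linker)
    print(f"linker {linker:2d} bp: about {count / 1e6:.0f} million nucleosomes")
print(f"stretched DNA length: about {contour_length_m(diploid_bp):.1f} m")
```

Under these assumptions a nucleus contains on the order of 30 to 40 million nucleosomes, and the stretched DNA is roughly 2 m long, which is the figure commonly quoted for a human cell.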

Core histone tails and the fuzzy complex

Core histone tails adopt multiple conformations on the nucleosomal and linker DNA.

These tail/DNA interactions are robust, but exchange quickly between multiple conformations consistent with a so-called fuzzy complex.

However, one component has remained elusive: the histone tails. These are the N termini of all four histone proteins, as well as the C terminus of H2A, that protrude out from the nucleosome core. These tails are enriched in PTMs and are known to be hubs of chromatin signaling [12]. They largely do not resolve in the structures of the nucleosome or nucleosome arrays, indicating a high level of conformational dynamics, which is also demonstrated by their susceptibility to protease digestion [13]. However, a number of studies have indicated that they are not fully solvent exposed and have DNA binding potential [14]. The conformations of the histone tails and the mechanistic basis of their interactions with CAPs in the context of the nucleosome has thus remained elusive for a number of years. Here, we review a number of structural and biophysical

1 Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
*Correspondence: catherine.musselman@cuanschutz.edu (C.A. Musselman).

1 Trends in Biochemical Sciences, Month 2021, Vol. xx, No. xx https://doi.org/10.1016/j.tibs.2020.12.012
© 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Crystal structure of the nucleosome core particle. Histones are shown in red and DNA in gray, with the H2A/H2B acidic patch residues shown as black spheres. The super helical locations (SHLs) are marked with the negative SHLs italicized.

Crystal structure of the chromatosome. Histones are shown in red and DNA in gray, the globular domain of linker histone H1 is shown in blue.

A model of the broad conformational ensemble adopted by the H3 tails in the context of the nucleosomes. The tails are blurred to represent their dynamic exchange between states.

https://www.cell.com/trends/biochemical-sciences/fulltext/S0968-0004(20)30324-8


Figure 1. Histone Composition and Nucleosome Structure. (A) Core histones H2A, H2B, H3, and H4. The core region is
represented by a rectangle flanked by the tail sequences. Shown are human sequences with residues that vary between organisms
(only in H2A and H2B) in italics. The positive residues are denoted by a (+). (B) Left, a crystal structure of the nucleosome core particle
(PDB ID 1AOI). Histones are shown in red and DNA in gray, with the H2A/H2B acidic patch residues shown as black
spheres. The super helical locations (SHLs) are marked with the negative SHLs italicized. Right, a crystal structure of
the chromatosome (PDB ID 5NL0). Histones are shown in red and DNA in gray, the globular domain of linker histone
H1 is shown in blue. (C) A model of the broad conformational ensemble adopted by the H3 tails in the context of the
nucleosomes. The tails are blurred to represent their dynamic exchange between states.

Core histone tails and the fuzzy complex

CAP = chromatin-associated protein

PTM = post-translational modification

https://www.cell.com/trends/biochemical-sciences/fulltext/S0968-0004(20)30324-8

Figure 2. Schematic Model of Histone Tail Contacts in Various Chromatin States and Regulatory Effects of Tail–DNA Interactions. Histones are shown in red, DNA in gray, the H2A/H2B acidic patch as a black oval, and histone post-translational modifications (PTMs) as cyan ovals or stars. The histone tails are blurred to represent dynamic exchange within a broad conformational ensemble. Predicted inter-nucleosome interactions stabilizing the tetra-nucleosome are shown. The intra- and inter-nucleosome interactions favored by compact and extended tails are shown. A chromatin-associated protein (CAP) is shown with a star denoting a histone PTM binding pocket, and the effect of PTM crosstalk on CAP binding is represented. The inhibitory effects of RNA polymerase II (RNA Pol II) and remodeler activity, as well as the positive effect of tail–DNA weakening PTMs, are represented.

been shown to alter the conformational ensemble of the H3 and H4 tails on DNA [48]. Notably, not all DNA-binding factors lead to increased accessibility of the tails. Binding of the linker histone H1 to form the chromatosome (Figure 1B) was found to reduce H3 tail dynamics and accessibility, potentially stabilizing it on the linker DNA [28]. Similarly, a recent study found that the H3 tail stabilizes an RNA–DNA triplex formed on the linker DNA of a nucleosome [67]. Thus, the accessibility can be even more restricted by limiting the available conformational ensemble.

The weakening of tail/DNA interactions can also regulate machinery acting on the nucleosome core independent of tail binding (Figure 2). For instance, the presence of histone tails decreases progression of RNA polymerase II through a nucleosome. However, mutation of lysine residues to mimic acetylation in the tails, which would weaken tail–DNA interactions, positively regulates RNA polymerase II activity, enhancing progression [68,69]. Similar effects are seen with chromatin

Chromatin structure varies along an interphase chromosome

more condensed more extended

During interphase, chromosomes contain both condensed and more extended forms of chromatin.

Heterochromatin is localised around centromeres and telomeres.

Its formation is induced by specific histone tail modifications. These modifications recruit heterochromatin-specific proteins that induce the same modification at neighbouring nucleosomes, resulting in a wave of heterochromatin formation.

Most DNA that is permanently folded into heterochromatin does not contain genes.

Accidental heterochromatin formation can cause disease (e.g. a β-globin gene located close to heterochromatin can be silenced, leading to anaemia).


Chromosome packaging occurs on multiple levels

https://www.youtube.com/watch?v=OjPcT1uUZiE

Good question:

How does the 30-nm fibre look in vivo?

How does chromosome packaging work?

2010

https://www.sciencedirect.com/science/article/pii/S0955067410000256?via%3Dihub

Chromatin structure: does the 30-nm fibre exist in vivo?
Kazuhiro Maeshima1,2, Saera Hihara1,2 and Mikhail Eltsov3

A long strand of DNA is wrapped around the core histone and forms a nucleosome. Although the nucleosome has long been assumed to be folded into 30-nm chromatin fibres, their structural details and how such fibres are organised into a nucleus or mitotic chromosome remain unclear. When we observed frozen hydrated (vitrified) human mitotic cells using cryo-electron microscopy, which enables direct high-resolution imaging of the cellular structures in a close-to-native state, we found no higher order structures including 30-nm chromatin fibres in the chromosome. Therefore, we propose that the nucleosome fibres exist in a highly disordered, interdigitated state like a ‘polymer melt’ that undergoes dynamic movement. We postulate that a similar state exists in active interphase nuclei, resulting in several advantages in the transcription and DNA replication processes.

Addresses
1 Biological Macromolecules Laboratory, Structural Biology Center, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
2 Department of Genetics, School of Life Science, Graduate University for Advanced Studies (Sokendai), Mishima, Shizuoka 411-8540, Japan
3 European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69117 Heidelberg, Germany

Corresponding author: Maeshima, Kazuhiro (kmaeshim@lab.nig.ac.jp)

Current Opinion in Cell Biology 2010, 22:1–7
This review comes from a themed issue on Nucleus and gene expression
Edited by Ana Pomba and Dave Gilbert
0955-0674/$ – see front matter
© 2010 Elsevier Ltd. All rights reserved.
DOI 10.1016/j.ceb.2010.03.001

Introduction

The human body is made up of 60 trillion cells, each containing 2 m of genomic DNA in its nucleus. How is this genomic DNA organised into nuclei? Around 1880, W. Flemming discovered a nuclear substance that was clearly visible on staining under primitive light microscopes and named it ‘chromatin’; this is now thought to be the basic unit of genomic DNA organisation [1]. Since long before DNA was known to carry genetic information, chromatin has fascinated biologists.

Deoxyribonucleic acid (DNA) has a negatively charged phosphate backbone that produces electrostatic repulsion between adjacent DNA regions, making it difficult for DNA to fold upon itself [2,3]. For the first level of folding, the long negatively charged DNA string is wrapped around a basic protein complex called a core histone octamer, which consists of the histone H2A, H2B, H3 and H4 proteins, and forms a nucleosome (Figure 1) [4]. The structural details of the nucleosome core are now known at a resolution of 1.9 Å [5]. In the core particle, 147 bp of DNA are wrapped in 1.7 left-handed superhelical turns around the histone octamer. Each nucleosome core is connected by ‘linker DNA’ to make repetitive motifs. Accordingly, the nucleosome fibre was originally described as ‘beads on a string’ [1]. Since the core histones have tails with positively charged lysine and arginine residues, only ~60% of the negative charges of DNA are neutralised [6]; consequently, for further folding, the remaining ~40% of the DNA charge has to be neutralised by other factors, such as linker histone H1 or cations, as described below.

In this review, in addition to a description of current progress in the field, we propose that the nucleosome fibres in nuclei or mitotic chromosomes exist in a highly disordered, interdigitated state, which is locally similar to a polymer melt with dynamic movements.

30-nm chromatin fibre

More than 30 years ago, Finch and Klug first proposed that the nucleosome, with linker histone H1 or Mg2+ ions, is folded into ‘30-nm chromatin fibres’ (Figures 1 and 2) [7]. In fact, isolated nucleosomes looked like fibres with a diameter of 30 nm under transmission electron microscopy. In their model called the ‘solenoid’, consecutive nucleosomes are located next to each other in the fibre, folding into a simple one-start helix (Figures 1, 2a and 2c). Subsequently, a second model of the ‘two start helix’ was proposed on the basis of microscopic observations of isolated nucleosomes (Figure 2b and d) [8]. Although some variations exist in this model [9], essentially nucleosomes are arranged in a zigzag manner, such that a nucleosome in the fibre is bound to the second neighbour, but not the first (Figure 2b and d).

In 2004, Richmond and co-workers found that their cross-linking study on nucleosomal arrays (12-nucleosome repeats) was in good agreement with the zigzag conformation of the two-start helix [10]. In addition, they succeeded in resolving the crystal structure of a tetranucleosome (four nucleosome cores) at a resolution of 9 Å (Figure 2d) [11]. Although the resolution of the structure is relatively low, they defined the positions of the linker DNA and nucleosomes in the fibre by replacing the coarse core region with the fine atomic structure of a nucleosome core particle [5]. Again, their results

Chromosomes as polymer gel?

2010

Figure 3

(a, b) Under diluted conditions, the flexible nucleosome fibres may compact through selective close neighbour associations, forming the 30-nm chromatin fibres. An increase in nucleosome concentration results in inter-fibre nucleosomal contacts, which interfere with the intra-fibre bonds (c). The nucleosomes of adjacent fibres interdigitate and intermix. This disrupts the 30-nm folding and the nucleosomal fibres progress to a state of ‘polymer melt’ (d). (e) The concept of polymer melt implies dynamic polymer chains [40], that is, nucleosome fibres may be moving and rearranging constantly. This may have several advantages in chromosome condensation and segregation during mitosis and the transcription and DNA replication processes during interphase (see text). (f) ‘Chromatin liquid drop’: The transcriptional silencing can be established through a dynamic capturing of transcriptional regions inside compact chromatin melt domains. These domains can be considered as drops of viscous liquid, which could be formed by the nucleosome–nucleosome interaction and macromolecular crowding effect [42,43]. Active and inactive chromatins are shown in orange and blue, respectively. Active chromatin regions are transcribed on the surfaces of the drops (shown in green).

chromatin fibres required a strict cationic environment, namely a low-salt buffer containing 1–2 mM Mg2+; under such conditions, isolated nuclei or chromosomes become swollen. Accordingly, the local nucleosome concentration decreased and favoured intra-fibre nucleosome associations, leading to the formation of 30-nm chromatin fibres (Figure 3b). Furthermore, in conventional EM observations, the formation of 30-nm chromatin fibres

Maeshima K, et al. Chromatin structure: does the 30-nm fibre exist in vivo?, Curr Opin Cell Biol (2010), doi:10.1016/j.ceb.2010.03.001

As with typical web pages, we used Universal Character Set Transformation Format, 8-bit (UTF-8), a variable-width encoding, which is backwards compatible with ASCII and UNICODE for special characters and fonts. There are 11 images that are black-and-white and JPEG encoded (typically a 10:1 data compression with little loss in quality). These are embedded “inline” (i.e. not separate files) in the html in base64 format. A consensus bit error in the middle of any of these JPEG segments would only affect data downstream within that segment.
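The encoding choices described above can be illustrated with a short sketch: UTF-8 stores ASCII characters in one byte and other characters in several, and binary image data can be carried inside an HTML text file as base64. The byte string `fake_jpeg` is a hypothetical stand-in rather than a real image, and the helper names are invented for illustration; nothing here is taken from the quoted study.

```python
import base64


def inline_image_tag(jpeg_bytes: bytes) -> str:
    """Wrap raw image bytes in an HTML <img> tag using a base64 data URI."""
    payload = base64.b64encode(jpeg_bytes).decode("ascii")
    return f'<img src="data:image/jpeg;base64,{payload}">'


def extract_image_bytes(tag: str) -> bytes:
    """Recover the original bytes from a tag produced by inline_image_tag."""
    payload = tag.split("base64,", 1)[1].rsplit('"', 1)[0]
    return base64.b64decode(payload)


# Hypothetical placeholder bytes (JPEG start and end markers around dummy content).
fake_jpeg = bytes([0xFF, 0xD8]) + b"not a real image" + bytes([0xFF, 0xD9])

tag = inline_image_tag(fake_jpeg)
page = f"<p>DNA, β-globin</p>{tag}"

# UTF-8 is variable-width: ASCII characters take one byte, "β" takes two.
print(len("A".encode("utf-8")), len("β".encode("utf-8")))
print(len(page), "characters become", len(page.encode("utf-8")), "bytes")
print("round trip ok:", extract_image_bytes(tag) == fake_jpeg)
```

Because the whole page, images included, is a single text stream, it can be translated into nucleotides by one encoding scheme without a separate file system for the images.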


DNA storage is very dense.

At theoretical maximum, DNA can encode two bits per nucleotide (nt) or 455 exabytes* per gram of single-stranded DNA.

* The exabyte is a multiple of the unit byte for digital information. The prefix exa indicates multiplication by the sixth power of 1000 (10^18) in the International System of Units (SI). Therefore, one exabyte is one quintillion bytes (short scale). The symbol for the exabyte is EB.

1 EB = 1000^6 bytes = 10^18 bytes = 1,000,000,000,000,000,000 B = 1000 petabytes = 1 million terabytes = 1 billion gigabytes.
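The figure of 455 exabytes per gram can be checked with a few lines of arithmetic. The sketch below assumes an average mass of about 330.95 g/mol per nucleotide in single-stranded DNA; that value is not given on the slide and is labelled as an assumption in the code. With four possible bases, each nucleotide carries at most log2(4) = 2 bits.

```python
import math

AVOGADRO = 6.02214076e23     # particles per mole
NT_MASS_G_PER_MOL = 330.95   # ASSUMPTION: average mass of one nucleotide in ssDNA
BITS_PER_NT = math.log2(4)   # four bases, so at most 2 bits per nucleotide


def exabytes_per_gram() -> float:
    """Theoretical storage capacity of one gram of single-stranded DNA."""
    nucleotides_per_gram = AVOGADRO / NT_MASS_G_PER_MOL
    bytes_per_gram = nucleotides_per_gram * BITS_PER_NT / 8
    return bytes_per_gram / 1e18   # 1 EB = 10^18 bytes


print(f"about {exabytes_per_gram():.0f} EB per gram")
```

Under this assumption the result is close to 455 EB per gram. Real encoding schemes store less, because they add redundancy, addressing sequences and error correction.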

Unlike most digital storage media, DNA storage is
not restricted to a planar layer and is often
readable despite degradation in non-ideal
conditions over millennia.

JUNE 29, 2019

2021

ARTICLES
https://doi.org/10.1038/s41563-021-01021-3
https://www.nature.com/articles/s41563-021-01021-3

Random access DNA memory using Boolean search in an archival file storage system

James L. Banal1,4, Tyson R. Shepherd1,4, Joseph Berleant1,4, Hellen Huang, Miguel Reyes1,2, Cheri M. Ackerman2, Paul C. Blainey1,2,3 and Mark Bathe1,2 ✉

DNA is an ultrahigh-density storage medium that could meet exponentially growing worldwide demand for archival data storage if DNA synthesis costs declined sufficiently and if random access of files within exabyte-to-yottabyte-scale DNA data pools were feasible. Here, we demonstrate a path to overcome the second barrier by encapsulating data-encoding DNA file sequences within impervious silica capsules that are surface labelled with single-stranded DNA barcodes. Barcodes are chosen to represent file metadata, enabling selection of sets of files with Boolean logic directly, without use of amplification. We demonstrate random access of image files from a prototypical 2-kilobyte image database using fluorescence sorting with selection sensitivity of one in 10^6 files, which thereby enables one in 10^(6N) selection capability using N optical channels. Our strategy thereby offers a scalable concept for random access of archival files in large-scale molecular datasets.

While DNA is the polymer selected by evolution for the storage and transmission of genetic information in biology, it can also be used for the storage of arbitrary digital information at densities far exceeding conventional data storage technologies such as flash and tape memory, at scales well beyond the capacity of the largest existing data centres [1,2]. Recent progress in nucleic acid synthesis and sequencing technologies continues to reduce the cost of writing and reading DNA, foreshadowing future commercially competitive DNA-based information storage [1,3–5]. Demonstrations of its viability as a general information storage medium include the storage and retrieval of books, images, computer programs, audio clips, works of art and Shakespeare’s sonnets using a variety of encoding schemes [6–12], with data size limited primarily by the cost of DNA synthesis. In each case, digital information was converted to DNA sequences composed of ~100–200 nucleotide data blocks for ease of chemical synthesis and sequencing. Sequence fragments were then assembled to reconstruct the original, encoded information.

While considerable effort in DNA data storage has focused on increasing the scale of DNA synthesis, as well as improving encoding schemes, an additional crucial aspect of data storage systems is the ability to efficiently retrieve specific files or arbitrary subsets of files. To date, molecular random access has largely relied on conventional polymerase chain reaction (PCR) [8,10,12,13], which uses up to ~20–30 heating and cooling cycles with DNA polymerase to selectively amplify specific DNA sequences from a DNA data pool using primers. Nested addressing barcodes [14–16] have also been used to uniquely identify a greater number of files, as well as biochemical affinity tags to selectively pull down oligos for targeted amplification [17].

While powerful demonstrations of PCR have shown successful file retrieval from a 150 GB file system [18], notable limitations include, first, the length of DNA needed to uniquely label DNA data strands for file indexing, which reduces the DNA available for data storage. For example, for an exabyte-scale data pool, each file requires at least three barcodes [17], or up to sixty nucleotides in total barcode sequence length, thereby reducing the number of nucleotides that can be used for data encoding. Second, PCR-based retrieval requires an aliquot of the entire data pool to be irreversibly consumed for random access, and therefore additional PCR amplification of the entire data pool may periodically be needed to restore this loss of data. In this case, each PCR amplification may introduce stochastic variation in copy number of the file sequences, leading to up to 2% data loss per amplification [19] if using tenfold physical redundancy, as recently suggested [18]. Finally, avoiding spurious amplification of off-target files due to crosstalk of PCR primers with incorrect barcodes or main file sequences requires careful primer design [20]. While strategies exist to circumvent these preceding challenges, they generally reduce data density and might not be easily scalable to exabyte and larger file systems. For example, data loss due to periodic PCR amplification of the entire data pool [19] may be reduced by increasing the physical redundancy of the files in the main data pool, and PCR crosstalk can be mitigated by spatial segregation of data into distinct pools [21] or extraction of selected DNA using biochemical affinity [17,22].

As an alternative to PCR-based approaches, here we introduce a direct random access memory approach that retrieves specific files, or arbitrary subsets of files, directly using physical sorting, without a need for amplification, and without any potential for barcode–memory crosstalk, while also preserving non-selected files intact by recycling them into the original memory pool. To realize this file system, we first encapsulate DNA-based files physically within discrete, impervious silica capsules [9,23,24], which we subsequently surface-label with unique single-stranded DNA barcodes that offer Boolean-logic-based selection on the entire data pool via simple hybridization. Downstream file selection may then be optical, physical or biochemical, with sequencing-based read-out following de-encapsulation of the memory DNA from the silica capsule.

Each ‘unit of information’ encoded in DNA we term a ‘file’, which includes both the DNA encoding the main data as well as any additional components used for addressing, storage and retrieval. Each file contains a ‘file sequence’, consisting of the DNA encoding the main data, and ‘addressing barcodes’, or simply ‘barcodes’, which are additional short DNA sequences used to identify the file in solution
2021

[Figure 1, panel a: PCR-based random access. (i) Data: image database. (ii) Writing and storing: binary data, DNA encoding, molecular file database. (iii) Random access: querying ‘cat 2’, PCR amplification, selected files. (iv) Reading: DNA sequencing, data reconstruction.]

[Figure 1, panel b: Encapsulation-based random access. (ii) Writing and storing: binary data, DNA encoding, encapsulation, metadata tagging (‘cat’, ‘orange’), molecular file database. (iii) Random access: querying ‘cat AND orange’, fluorescence-activated sorting, selected files, other files. (iv) Reading: reverse encapsulation, DNA sequencing, data reconstruction. (v) Copying: bacterial transformation.]

Fig. 1 | Write–access–read cycle for a content-addressable molecular file system. a, A general framework for DNA data storage that uses PCR-based random access and its associated challenges. b, We demonstrate here an alternative encapsulation-based file system that allows for scalable indexing and Boolean logic selection and retrieval. Coloured images were converted into 26 × 26 pixel, black-and-white icon bitmaps. The black-and-white images were then converted into DNA sequences using a custom encoding scheme (Methods). The DNA sequences that encoded the images (file sequences) were inserted into a pUC19 plasmid vector and encapsulated into silica particles using sol–gel chemistry. Silica capsules were then addressed with content barcodes using orthogonal 25-nucleotide ssDNA strands, which were the final forms of the files. Files were pooled to form the molecular file database. To query a file or several files, fluorescently labelled 15-nucleotide ssDNA probes that were complementary to the file barcodes were added to the data pool. Particles were then sorted with FAS using two to four fluorescence channels simultaneously. Files that were not selected were returned to the molecular database. Addition of a chemical etching reagent into the target populations released the encapsulated DNA plasmid. Sequences for the encoded images were validated using Sanger sequencing or Illumina MiniSeq. Because plasmids were used to encode information, retransformation of the released plasmids into bacteria to replenish the molecular file database thereby closed the write–access–read cycle.

https://www.nature.com/articles/s41563-021-01021-3

using hybridization. We refer to a collection of files as a ‘data pool’ or ‘database’, and the set of procedures for storing, retrieving and reading out files is termed a ‘file system’ (Supplementary Section 0 for a full list of terms).

As a proof-of-principle of our archival DNA file system, we encapsulated 20 image files, each composed of a ~0.1 kilobyte image file encoded in a 3,000-base-pair plasmid, within monodisperse, 6 µm silica particles that were chemically surface labelled using up to three 25-nucleotide single-stranded DNA (ssDNA) oligonucleotide barcodes chosen from a library of 240,000 orthogonal primers [20], which allows for individual selection of up to ~10^15 possible distinct files using only three unique barcodes per file (Fig. 1). While we chose plasmids to encode DNA data in order to produce microgram quantities of DNA memory at low cost and to facilitate a renewable, closed-cycle write–store–access–read system using bacterial DNA data encoding and expression [25–28], our file system is equally applicable to ssDNA oligos produced using solid-phase chemical synthesis [2,6,7,9–12,17] or gene-length oligos produced enzymatically [29–32]. Fluorescence-activated sorting (FAS) was then used to select target subsets of the complete data pool by first annealing fluorescent oligonucleotide probes that are complementary to the barcodes used to address the database [33], enabling direct physical retrieval of specific, individual files from a pool of 10^(6N) total files, where N is the number of fluorescence channels employed, without enzymatic amplification or associated loss of nucleotides available for data encoding. We also demonstrate Boolean AND, OR, NOT logic to select arbitrary subsets of files with combinations of distinct barcodes to query the data pool, similar to the conventional Boolean logic applied in text and file searches on solid-state silicon devices.

While only 20 icon-resolution images were chosen as our image database, representing diverse subject matter including animals, plants, transportation and buildings (Supplementary Fig. 1), our file system may in principle be scaled to considerably larger sets of images, limited primarily by the cost of DNA synthesis and the need to develop strategies for high-throughput silica encapsulation of distinct file sequences and surface-based DNA labelling for barcoding (Supplementary Fig. 1). Because physical encapsulation separates file sequences from external barcodes that are used to describe the encapsulated information, our file system offers long-term environmental protection of encoded file sequences via silica encapsulation for permanent archival storage [9,23,24], where external barcodes may be renewed periodically, further protected with secondary encapsulation, or data pools may simply be stored using methods implemented in PCR-based random access, such as dehydrating the data pool and immersing the dried molecular database in oil [21].
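To get a feeling for the "~10^15 distinct files" figure quoted in the excerpt, the number of ways to pick three different barcodes from a library of 240,000 can be counted directly. This is only a sketch under the assumption that a file address is an unordered set of three distinct barcodes; the variable names are illustrative and do not come from the paper.

```python
import math

# Figures quoted in the excerpt above.
LIBRARY_SIZE = 240_000   # orthogonal barcode primers available
BARCODES_PER_FILE = 3    # unique barcodes attached to each file

# Assumption: an address is an unordered set of three distinct barcodes,
# so the number of addresses is the binomial coefficient C(240000, 3).
addresses = math.comb(LIBRARY_SIZE, BARCODES_PER_FILE)

print(f"{addresses:.2e} possible three-barcode addresses")
```

The result is on the order of 10^15, which matches the magnitude stated in the excerpt.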

NATURE MATERIALS | VOL 20 | SEPTEMBER 2021 | 1272–1280 | www.nature.com/naturematerials 1273

On Earth right now, there are about 10 trillion gigabytes of digital data, and every day, humans produce emails, photos, tweets, and other digital files that add up to another 2.5 million gigabytes of data.

Much of this data is stored in enormous facilities known as exabyte data centers, which can be the size of several football fields and cost around $1 billion to build and maintain.

Currently it would cost €1 trillion to write one petabyte of data (1 million gigabytes) into DNA.
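The numbers on this slide are easier to judge once they are scaled to familiar units. The following back-of-the-envelope sketch uses only the figures quoted above; the variable names are illustrative.

```python
# Figures quoted on the slide.
COST_PER_PETABYTE_EUR = 10**12       # EUR 1 trillion to write one petabyte
GIGABYTES_PER_PETABYTE = 10**6       # 1 petabyte = 1 million gigabytes
DAILY_NEW_DATA_GB = 2_500_000        # gigabytes of new data produced per day

# Cost of writing a single gigabyte at the quoted price.
cost_per_gigabyte = COST_PER_PETABYTE_EUR // GIGABYTES_PER_PETABYTE

# Cost of writing one day's worth of newly produced data at that price.
cost_of_one_day = DAILY_NEW_DATA_GB * cost_per_gigabyte

print(f"EUR {cost_per_gigabyte:,} per gigabyte")
print(f"EUR {cost_of_one_day:,} for one day of new data")
```

At the quoted price, a single gigabyte would cost about a million euros to write, which is why synthesis cost is the main limit mentioned in the excerpt above.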


Francis Harry Compton Crick (1916–2004)

The flow of information from DNA to protein occurs in all cells

How can genetic instructions direct the formation of all organisms?

Do you remember?

The central dogma of molecular biology was formulated by Francis Crick in 1958 and restated by him in 1970.

Can you define it? Is it still valid?
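As a reminder of what "flow of information" means residue by residue, here is a minimal sketch of the two steps DNA → RNA → protein. It assumes the standard genetic code and a coding-strand sequence read from its first base; the function names `transcribe` and `translate` and the example sequence are illustrative only.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_strand):
    """DNA -> RNA: the mRNA matches the coding strand with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """RNA -> protein: read codons from the first base until a stop codon."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        codon = mrna[pos:pos + 3].replace("U", "T")
        residue = CODON_TABLE[codon]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
protein = translate(mrna)
print(mrna, protein)
```

Note that there is no function for the reverse direction: nothing in the cell reads a protein sequence back into nucleic acid, which is the point of the central dogma.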


The central dogma of molecular biology

3 x 3 = 9 possible direct transfers of information between 3 polymer classes (DNA, RNA, protein)

Legend:
  believed to occur normally in most cells
  known to occur, but only under specific conditions, in the case of some viruses or in a laboratory
  believed never to occur
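The 3 x 3 scheme can be written out explicitly. The sketch below enumerates all nine transfers and sorts them into the three groups Crick proposed in 1970 (general, special, unknown); the set and function names are illustrative.

```python
from itertools import product

POLYMERS = ("DNA", "RNA", "Protein")

# Crick's 1970 grouping, written as (source, target) pairs.
GENERAL = {("DNA", "DNA"), ("DNA", "RNA"), ("RNA", "Protein")}
SPECIAL = {("RNA", "RNA"), ("RNA", "DNA"), ("DNA", "Protein")}


def classify(source, target):
    """Return the class of a transfer; anything not listed is 'unknown'."""
    if (source, target) in GENERAL:
        return "general"
    if (source, target) in SPECIAL:
        return "special"
    return "unknown"


# All 3 x 3 = 9 ordered pairs of polymer classes.
transfers = {pair: classify(*pair) for pair in product(POLYMERS, repeat=2)}

for (source, target), group in transfers.items():
    print(f"{source:8s}-> {target:8s} {group}")
```

The three transfers left in the "unknown" group are exactly those that start from protein, which is the content of the central dogma.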


The flow of information from DNA to protein occurs in all cells


Crick, F. Central Dogma of Molecular Biology. Nature 227, 561–563 (8 August 1970).
https://www.nature.com/articles/227561a0

Francis Crick, MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH

"The central dogma of molecular biology deals with the detailed residue-by-residue transfer of sequential information. It states that such information cannot be transferred from protein to either protein or nucleic acid."

[Slide shows the first two pages of the article. Key points:]

The article opens by quoting an unsigned Nature piece headed "Central dogma reversed", which reported the work of Howard Temin and others showing that an RNA tumour virus can use viral RNA as a template for DNA synthesis: "The central dogma, enunciated by Crick in 1958 and the keystone of molecular biology ever since, is likely to prove a considerable over-simplification."

Original grouping of the nine possible transfers, and the general opinion at the time:
  Class I (almost certainly existed): DNA→DNA, DNA→RNA, RNA→Protein, RNA→RNA
  Class II (probably rare or absent): RNA→DNA, DNA→Protein
  Class III (very unlikely to occur): Protein→Protein, Protein→RNA, Protein→DNA

Regrouping proposed for the present day (1970):
  General transfers (can occur in all cells): DNA→DNA, DNA→RNA, RNA→Protein
  Special transfers (do not occur in most cells, but may occur in special circumstances): RNA→RNA, RNA→DNA, DNA→Protein
  Unknown transfers (postulated by the central dogma never to occur): Protein→Protein, Protein→DNA, Protein→RNA

The central dogma was intended to apply only to present-day organisms, not to events in the remote past such as the origin of life or the origin of the code.

It is not the same as the sequence hypothesis. The sequence hypothesis is a positive statement: the (overall) transfer nucleic acid→protein exists. The central dogma is a negative statement: transfers from protein do not exist.

The flow of information from DNA to protein occurs in all cells

Prions are proteins that propagate themselves by inducing conformational changes in other molecules of the same type of protein.

In fungi this change is passed on from one generation to the next, i.e. Protein → Protein.

While this represents a transfer of information, prion propagation copies a conformation, not a residue-by-residue sequence, and so is not technically considered an exception to the central dogma.


The flow of information from DNA to protein occurs in all cells

As of 2010, prion replication has been shown to be subject to mutation and natural selection similar to other forms of replication.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2848070/

[Slide background: page from Crick, Nature 227, 8 August 1970]

RNA polymerase II transcribes the DNA


TATA-binding protein (TBP) distorts the DNA

TBP-DNA complex


Eukaryotic RNA pol II requires general transcription factors

Transcription in eukaryotic cells and bacteria is different.

Bacteria have just one type of RNA polymerase. We (eukaryotes) have three: RNA Pol I, II and III.

Bacterial RNA Pol (with its σ subunit) can initiate transcription on its own. Eukaryotes need general transcription factors (GTFs).

Transcription initiation in bacteria is simpler. In eukaryotes there can be huge distances between genes, and the DNA is packaged into nucleosomes.

REVIEWS | FOCUS ON TRANSCRIPTION (2015)

Table 1 | Subunits of Pol II and general transcription factors

Factor                                  Gene name           Mass (kDa)        UniProt accession number   Copies
                                        Yeast    Human      Yeast    Human    Yeast     Human

Pol II (RNAP*): transcribing enzyme
RPB1                                    RPO21    POLR2A     191.6    217.2    P04050    P24928           1
RPB2                                    RPB2     POLR2B     138.8    133.9    P08518    P30876           1
RPB3                                    RPB3     POLR2C     35.3     31.4     P16370    P19387           1
RPB4                                    RPB4     POLR2D     25.4     16.3     P20433    O15514           1
RPB5‡                                   RPB5     POLR2E     25.1     24.6     P20434    P19388           1
RPB6‡                                   RPO26    POLR2F     17.9     14.5     P20435    P61218           1
RPB7                                    RPB7     POLR2G     19.1     19.3     P34087    P62487           1
RPB8‡                                   RPB8     POLR2H     16.5     17.1     P20436    P52434           1
RPB9                                    RPB9     POLR2I     14.3     14.5     P27999    P36954           1
RPB10‡                                  RPB10    POLR2L     8.3      7.6      P22139    P62875           1
RPB11                                   RPB11    POLR2J     13.6     13.3     P38902    P52435           1
RPB12‡                                  RPB12    POLR2K     7.7      7.0      P40422    P53803           1
Total (12 subunits)                                         513.6    516.7

TFIIA§: TBP stabilization and counteracts repressive effects of negative co-factors
Large subunit                           TOA1     GTF2A1     32.2     41.5     P32773    P52655           1
Small subunit                           TOA2     GTF2A2     13.5     12.5     P32774    P52657           1
Total (2 subunits)                                          45.7     54.0

TFIIB: Pol II recruitment, TBP binding and TSS selection
TFIIB (TFB*)                            SUA7     GTF2B      38.2     34.8     P29055    Q00403           1

TFIID: Pol II recruitment and promoter recognition
TBP (TBP*): recognition                 TBP      TBP        27.0     37.7     P13393    P20226           1

Table 1 (cont.) | Subunits of Pol II and general transcription factors

TFIIF§: TSS selection and stabilization of TFIIB
TFIIFα                                  TFG1     GTF2F1     82.2     58.2     P41895    P35269           1
TFIIFβ                                  TFG2     GTF2F2     46.6     28.4     P41896    P13984           1
TFG3#                                   TAF14    NA         27.4     NA       P35189    NA               NA
Total (2–3 subunits)                                        156.2    86.6

TFIIH§ (core): promoter opening and DNA repair
Subunit 1 (p62)                         TFB1     GTF2H1     72.9     62.0     P32776    P32780           1
Subunit 2 (p44)                         SSL1     GTF2H2     52.3     44.4     Q04673    Q13888           1
Subunit 3 (p34)                         TFB4     GTF2H3     37.5     34.4     Q12004    Q13889           1
Subunit 4 (p52)                         TFB2     GTF2H4     58.5     52.2     Q02939    Q92759           1
Subunit 5 (p8)                          TFB5     GTF2H5     8.2      8.1      Q3E7C1    Q6ZYL4           1
XPD subunit: ATPase; DNA repair         RAD3     ERCC2      89.8     86.9     P06839    P18074           1
XPB subunit: ATPase; promoter opening   SSL2     ERCC3      95.3     89.3     Q00578    P19447           1
Total (7 subunits)                                          414.5    377.3

TFIIH (kinase module): CTD phosphorylation
Cyclin H                                CCL1     CCNH       45.2     37.6     P37366    P51946           1
CDK7                                    KIN28    CDK7       35.2     39.0     P06242    P50613           1
MAT1                                    TFB3     MNAT1      38.1     35.8     Q03290    P51948           1
Total (3 subunits)                                          118.5    112.4

CTD, C-terminal domain; NA, not available; Pol, RNA polymerase; TAF, TBP-associated factor; TBP, TATA-box-binding protein; TFIIA, transcription initiation factor IIA; TSS, transcription start site. *Archaeal homologue. ‡Factor shared between Pol I, Pol II
of the TATA box and Pol III. §No known archaeal homologue. ||Component of TFIID, TFIIF and chromatin remodelling complexes. ¶Approximate
molecular weight. #TFG3 is a component of TFIID, TFIIF and chromatin remodelling complexes; the yeast-specific subunit is
TAF1 TAF1 TAF1 120.7 212.7 P46677 P21675 1 non-essential as part of TFIIF and as part of TFIID212.
TAF2 TAF2 TAF2 161.5 137.0 P23255 Q6P1X5 1
TAF3 TAF3 TAF3 40.3 103.6 Q12297 Q5VWG9 1 FOCUS ON traNSCriptiON
the core initiation complex and the interaction of this TFIIB was found to be located on the Pol II dock domain
TAF4 TAF4 TAF4 42.3 110.1 P50105 O00268 2 complex with the auxiliary factor TFIIA. We then dis- using biochemical probing 26 and X-ray crystallography 27
cuss TFIIE and TFIIH and their roles in promoter DNA
REVIEWS
(BOX 1). As this domain was not present in the TFIIB–
TAF5 TAF5 TAF5 89.0 86.8 P38129 Q15542 2
TAF6 TAF6 TAF6 57.9 72.7 P53040 P49848 2 opening. Finally, we discuss how recent data on the TBP–DNA complex structure19, a model for the Pol II–
structure of TFIID provide insights into its functions in TFIIB–TBP–DNA complex had to be derived with the
TAF7 TAF7 TAF7 67.6 40.3 Q05021 Q15545 1
determining promoter specificity. use of site-specific protein cleavage probing 28,29 (BOX 1).
TAF8 TAF8 TAF8 58.0 34.3 Q03750 Q7Z7C8 1 In 2009, the structure of the complete 12-subunit
TAF9 TAF9 TAF9 17.3 29.0 Q05027 Q16594 2 A brief history of initiation complex architecture Pol II bound by TFIIB confirmed the location of the
TAF10 TAF10 TAF10 23.0 21.7 Q12030 Q12962 2 Structural analyses of the PIC started over two decades B-ribbon domain on the dock and positioned one of
TAF11 TAF11 TAF11 40.6 23.3 Q04226 Q15544 1 Structural basis of transcription
ago. Structures of TBP in free form14,15 and bound to two cyclin folds in the carboxy-terminal B-core domain
DNA16,17 revealed that TBP is a saddle-shaped molecule of TFIIB on the Pol II wall10. This enabled modelling of
TAF12
TAF13
TAF12
TAF13
TAF12
TAF13
61.1
19.1
17.9
14.3
Q03761
P11747
Q16514
Q15543
2
1
initiation by RNA polymerase II
that binds to the DNA minor groove and bends DNA Pol II–TFIIB–TBP–DNA complexes with closed and
by 90 degrees. In later structures, TFIIA and TFIIB were open promoter DNA10. An independent TFIIB structure
TAF14|| TAF14 NA 27.4 NA P35189 NA 3 seen to flank the TBP–DNA complex on either side18–20. bound to the 10-subunit Pol II core enzyme provided
Sarah Sainsbury*, Carrie Bernecky* and Patrick Cramer
When the yeast Pol II structure became available21–25, similar results11. The crystallographically derived models
Total 1,200¶ 1,300¶
(14–15 subunits)
it revealedAbstract | Transcription
many domains of eukaryotic
that have protein-coding
putative functions genes
also commences
resembled with the
the models assembly
that of
were derived biochemi-
a conserved
during initiation, initiation
including thecomplex, which
dock, wall andconsists
clampof RNA callypolymerase
28,29 II (Pol II)obtained
. A recently and the general
high-resolution crystal
TFIIE: recruitment of TFIIH and open DNA stabilization domains, andtranscr
an iption
RNA factors, at promoter
exit tunnel. DNA. After
The challenge then two decades
structureofof research, the structural
a Pol II–TFIIB complexbasis of 2a) additionally
(FIG.
TFIIEα (TFE*) TFA1 GTF2E1 54.7 49.5 P36100 P29083 1 transcription
was to understand the initiation is emerging.
relative positions of Crystal
TFIIB and structures of manyDNA
contained components
and a short of the
RNAinitiation
transcript, and revealed
complex
Pol II, because have been
subsequent resolved, and of
superimposition structural
the known information
detailsonabout
Pol II the
complexes with general
TFIIB ‘B-reader’ and ‘B-linker’ regions
TFIIEβ TFA2 GTF2E2 37.0 33.0 P36145 P29084 1
transcription
TFIIB–TBP–DNA complexfactors has recently
would been
reveal the obtained.
position of Although mechanistic
that connect detailstoawait
the B-ribbon the B-core and run through
Total 91.7 82.5 elucidation, available
TBP and promoter DNA on Poldata outline
II. The how Pol(also
B-ribbon II cooperates with the general
the polymerase activetranscription
centre cleft30. These models could
(2 subunits) factors to bind to and zinc
openribbon)
promoter DNA, and
known as the amino-terminal domain of how
bePol II directsto
extended RNA
thesynthesis and escapes
core initiation complex (FIG. 2b) by
from the promoter.

130 | MARCH 2015 | VOLUME 16 NATURE REVIEWS | MOLECULAR CELL BIOLOGY


www.nature.com/reviews/molcellbio Transcription of the eukaryotic genome is carried VOLUME
In the classical model, a Pol II–TFIIF complex | MARCH 2015 | 131
16 binds
out by nuclear RNA polymerase I (Pol I), Pol II and to a pre-formed TFIIB–TBP–DNA promoter complex,
© 2015 Macmillan Publishers Limited. All rights reserved Pol III. Whereas© 2015
Pol Macmillan
I transcribesPublishers
the rRNA Limited. All rights
precur- reserved
resulting in the formation of a core initiation complex
LECTURE SERIES MOLECULAR BIOLOGY WISE 2023/4 ORGANISATION OF THE EUKARYOTIC GENOME
sor, Pol III transcribes small non-coding RNAs such as (FIG. 1). The core initiation complex is conserved in the
tRNAs. Pol II is a 12-subunit enzyme that transcribes Pol I and Pol III transcription systems, which also use TBP
protein-coding genes to produce mRNAs. Pol II regu- and contain proteins with homologies to TFIIB and TFIIF
lation underlies cell differentiation, the maintenance of (reviewed in REF. 7). The core initiation complex binds to
cell identity and the responses of cells to environmen- TFIIE and TFIIH to form a complete PIC that contains
tal changes. It occurs at different stages of transcrip- closed, double-stranded promoter DNA (TABLE 1). In the
tion, although regulation at the stage of initiation is presence of nucleoside triphosphates, a central DNA
a key mechanism for the control of gene expression. region is melted, leading to a ‘transcription bubble’ and
Understanding Pol II regulation, therefore, requires the formation of the open promoter complex. In the open
detailed insights into the structure of the Pol II ini- promoter complex, the DNA template strand passes near
tiation complex and the molecular mechanisms of the Pol II active site and can programme DNA-templated
transcription initiation. RNA chain synthesis. Most general transcription fac-
For initiation, Pol II assembles with the general tors are modular and contain structured domains that
transcription factors TFIIB, TFIID, TFIIE, TFIIF are connected by flexible linkers. Upon assembly of the
and TFIIH, which are collectively known as the gen- PIC, these factors adopt their functional structure. Their
eral transcription factors, at promoter DNA to form linker regions fold on the Pol II surface, and their protein
the pre-initiation complex (PIC) (TABLE 1). According domains locate to sites where they can exhibit specialized
to exemplary studies with a subset of promoters, the functions. Detailed structural information on how gen-
Max Planck Institute for general transcription factors cooperate with Pol II to eral factors interact with Pol II is scarce, but recent studies
Biophysical Chemistry, bind to and open promoter DNA, and to initiate RNA have increased our understanding of the 3D architecture
Department of Molecular synthesis and stimulate the escape of Pol II from the of initiation complexes8–13.
Biology, Am Fassberg 11,
promoter. TFIID contains the TATA box-binding pro- In this Review, we summarize known structural
37077 Göttingen, Germany.
*These authors contributed tein (TBP) and several TBP-associated factors (TAFs). information on Pol II initiation complexes and discuss
equally to this work. Whereas TBP is required for transcription from all the functions of PIC components. We describe available
Correspondence to P.C. promoters, the TAFs have promoter-specific func- structures for the general transcription factors and their
e-mail: patrick.cramer@ tions. Order-of-addition experiments1 combined with complexes (TABLE 2; see Supplementary information S1
mpibpc.mpg.de
doi:10.1038/nrm3952
in vivo analysis led to the classical model of stepwise PIC (table)), following the order of the stepwise assembly
Published online assembly (reviewed in REFS 2–6), although alternative model of PIC formation. We start with the recruitment
18 February 2015 assembly pathways are possible. of initiation factors to promoter DNA, the formation of

NATURE REVIEWS | MOLECULAR CELL BIOLOGY VOLUME 16 | MARCH 2015 | 129

© 2015 Macmillan Publishers Limited. All rights reserved
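
A feeling for numbers helps when reading Table 1. The following minimal Python sketch adds up the twelve Pol II subunit masses listed in the table and compares the sums with the totals printed there. The dictionary name and the helper function are illustrative assumptions; only the masses themselves come from the table.

```python
# Masses in kDa as listed in Table 1: (yeast, human) for each Pol II subunit.
# The variable and function names are illustrative, not part of the source.
POL2_MASSES_KDA = {
    "RPB1": (191.6, 217.2),
    "RPB2": (138.8, 133.9),
    "RPB3": (35.3, 31.4),
    "RPB4": (25.4, 16.3),
    "RPB5": (25.1, 24.6),
    "RPB6": (17.9, 14.5),
    "RPB7": (19.1, 19.3),
    "RPB8": (16.5, 17.1),
    "RPB9": (14.3, 14.5),
    "RPB10": (8.3, 7.6),
    "RPB11": (13.6, 13.3),
    "RPB12": (7.7, 7.0),
}


def total_mass(masses, column):
    """Sum one column (0 = yeast, 1 = human) and round to 0.1 kDa."""
    return round(sum(entry[column] for entry in masses.values()), 1)


yeast_total = total_mass(POL2_MASSES_KDA, 0)
human_total = total_mass(POL2_MASSES_KDA, 1)

print(f"Yeast Pol II: {yeast_total} kDa")
print(f"Human Pol II: {human_total} kDa")
```

Both sums should reproduce the totals given in the table (513.6 kDa for yeast, 516.7 kDa for human), which makes Pol II a machine of roughly half a megadalton.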

Pol II is depicted in silver, initiation factors in different shades of green, and elongation factors in different shades of orange. The DNA template strand is in dark blue, the nontemplate strand in light blue, and the RNA in red.

https://www.sciencedirect.com/science/article/pii/S0092867412006976?via%3Dihub
https://ars.els-cdn.com/content/image/1-s2.0-S0092867412006976-mmc1.mov

Binding of the 10-subunit Pol II core to the Pol II subcomplex Rpb4/7 generates the complete 12-subunit enzyme.

Subsequent binding of TFIIF to Pol II generates the complete Pol II-TFIIF complex.

The Pol II-TFIIF complex binds the TBP-TFIIB-DNA complex, resulting in a minimal closed promoter complex. This model reveals the central role of TFIIB as a bridge between the promoter and the polymerase.

Docking of the TBP-TFIIB-DNA complex onto the Pol II-TFIIF complex involves the binding of the TFIIB N-terminal ribbon domain to the Pol II dock domain and the binding of the C-terminal TFIIB core domain to the polymerase wall. The TFIIB reader and linker regions connect the N- and C-terminal domains of TFIIB and extend through the Pol II cleft.

The formation of the open promoter complex involves DNA melting and formation of a transcription bubble.

DNA melting commences above the active center cleft, 20 base pairs downstream of the TATA box.

DNA melting allows the template single strand to reach the active site and the downstream DNA duplex to bind near the jaws of Pol II.

The open complex then initiates DNA-templated RNA synthesis from nucleoside triphosphate (NTP) substrates, forming the initially transcribing complex.

Initial RNA synthesis further involves "scrunching" (German: zerknüllen, to crumple) of the emerging DNA strands at the upstream edge of the bubble.

This scrunching occurs because the upstream DNA duplex remains on the polymerase surface, whereas the downstream DNA is pulled into the active center as the DNA-RNA hybrid grows.

Once the RNA-DNA hybrid contains eight base pairs, it is stably associated with Pol II.

The complete elongation complex (EC) is bound by the elongation factor Spt4/5. Spt4/5 binds to the polymerase clamp and is located adjacent to the nontemplate strand of the transcription bubble.

Spt4/5 can associate only upon promoter escape because its binding site is occupied in the initiation complex. The N-terminal NusG homology domain of Spt5 closes the active center cleft, apparently locking in the nucleic acids and enhancing EC processivity.

Backtracking and arrest can occur upon attempts to elongate through specific DNA sequences or a nucleosome.

During backtracking, RNA is extruded into the pore beneath the Pol II active site and is trapped in an RNA-binding site, which prevents forward translocation and NTP binding and results in arrest.

Reactivation of arrested Pol II requires the elongation factor TFIIS, which stimulates RNA cleavage at the Pol II active site.

First, domain II of TFIIS binds at the polymerase funnel and causes a conformational change in the polymerase that repositions the backtracked RNA.

Second, domain III of TFIIS inserts into the pore next to the backtracked RNA and reaches the active site with a hairpin that contains three charged residues that stimulate RNA cleavage, generating a new RNA 3′ end from which transcription resumes.
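
Backtracking, arrest and TFIIS-stimulated cleavage can be caricatured as bookkeeping on the RNA 3′ end. The following toy model is only a sketch under strong assumptions: the class and method names are hypothetical, the RNA is a plain string, and any backtracked state is treated as arrested, which ignores all structural and kinetic detail.

```python
from dataclasses import dataclass


@dataclass
class ElongationComplex:
    """Toy model: RNA written 5'->3'; `backtracked` counts 3' nucleotides in the pore."""
    rna: str
    backtracked: int = 0

    def can_elongate(self) -> bool:
        # In this toy model the 3' end must sit in the active site for NTP addition.
        return self.backtracked == 0

    def add_nucleotide(self, nt: str) -> None:
        if not self.can_elongate():
            raise RuntimeError("arrested: the RNA 3' end is not in the active site")
        self.rna += nt

    def backtrack(self, n: int) -> None:
        # Pol II moves backwards; n more nucleotides are extruded into the pore.
        if n < 0 or self.backtracked + n > len(self.rna):
            raise ValueError("cannot backtrack by that many nucleotides")
        self.backtracked += n

    def tfiis_cleave(self) -> str:
        """Remove the extruded RNA and return it; a new 3' end is generated."""
        if self.backtracked == 0:
            return ""
        fragment = self.rna[-self.backtracked:]
        self.rna = self.rna[:-self.backtracked]
        self.backtracked = 0
        return fragment


ec = ElongationComplex("AUGGCUAC")
ec.backtrack(3)                 # three nucleotides enter the pore: arrest
released = ec.tfiis_cleave()    # cleavage releases them and rescues the complex
ec.add_nucleotide("G")          # transcription resumes from the new 3' end
print(released, ec.rna)
```

The point of the sketch is the order of events: elongation is blocked while RNA sits in the pore, and cleavage, not forward movement, restores a usable 3′ end.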

Transcription copies one strand of DNA

What is wrong with transcription?


https://www.youtube.com/watch?v=WsofH466lqk
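
What "copying one strand" means can be sketched in a few lines of Python. The function name and the convention that the template strand is passed in written 5′→3′ are assumptions made for this illustration; the RNA is built complementary and antiparallel to the template, so it reads like the coding strand with U in place of T.

```python
# Base-pairing rules for a DNA template base and the RNA base placed opposite it.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5_to_3: str) -> str:
    """Return the RNA (5'->3') made from a DNA template strand given 5'->3'.

    The polymerase reads the template 3'->5', so the template is traversed
    in reverse while the RNA grows at its 3' end.
    """
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(template_5_to_3))


coding_strand = "ATGGCC"      # 5'->3', the non-template strand
template_strand = "GGCCAT"    # 5'->3', the reverse complement of the coding strand

rna = transcribe(template_strand)
print(rna)
```

The result has the same sequence as the coding strand, with uracil replacing thymine.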

https://www.ncbi.nlm.nih.gov/pubmed/11909516

A Traditional View of Gene Expression
The different steps in the pathway from gene to protein have traditionally been viewed as independent events, with each going to completion before the next begins.

A Contemporary View of Gene Expression
Recent findings suggest that each step regulating gene expression is a subdivision of a continuous process. Each stage is physically and functionally connected to the next, ensuring that there is efficient transfer between manipulations and that no individual step is omitted.

Good question: What is the message of the two schemes?

Eukaryotic RNAs are transcribed and processed simultaneously

The phosphorylation status of the C-terminal domain (CTD) of RNA Pol II acts as a timer for RNA processing.

Which is faster: transcription or translation?

Transcription in eukaryotes (in the nucleus):
H. G. Garcia et al., Current Biology 23, 2140–2145, 2013
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3828032/

Translation in eukaryotes (in the cytoplasm, following about 5 min of splicing etc.):
N. Ingolia et al., Cell 146, 789, 2011 (ribosome profiling)
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3225288/

Inhibiting translation initiation, followed after a delay by inhibition of elongation, creates a pattern of ribosome occupancy that depends on the time between the two treatments and on the rate of translation. Using modern sequencing techniques, this pattern can be quantified genome-wide and the translation rate measured for each transcript.
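
To compare the two processes they have to be expressed in the same unit. The sketch below does this in Python. The two rates (about 1.5 kb per minute for Pol II elongation and about 5.6 amino acids per second for ribosomes) are assumptions quoted from memory of the two cited papers and should be checked against the originals; the example gene and protein sizes are purely hypothetical.

```python
# ASSUMED rates, recalled from the cited studies; verify before relying on them.
TRANSCRIPTION_KB_PER_MIN = 1.5   # Pol II elongation (assumption)
TRANSLATION_AA_PER_S = 5.6       # ribosome elongation (assumption)

# Convert both to nucleotides per second (one codon = three nucleotides).
transcription_nt_per_s = TRANSCRIPTION_KB_PER_MIN * 1000 / 60
translation_nt_per_s = TRANSLATION_AA_PER_S * 3

# Hypothetical example: a 10 kb primary transcript encoding a 400 amino acid protein.
GENE_LENGTH_NT = 10_000
PROTEIN_LENGTH_AA = 400

time_to_transcribe_s = GENE_LENGTH_NT / transcription_nt_per_s
time_to_translate_s = PROTEIN_LENGTH_AA / TRANSLATION_AA_PER_S

print(f"Transcription: {transcription_nt_per_s:.1f} nt/s")
print(f"Translation:   {translation_nt_per_s:.1f} nt/s")
print(f"Transcribing the gene:   {time_to_transcribe_s:.0f} s")
print(f"Translating the protein: {time_to_translate_s:.0f} s")
```

Under these assumptions the two machines move along their templates at a similar pace per nucleotide, yet making the transcript takes much longer than translating it, because introns make the gene far longer than the coding sequence.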

Forget everything you know about translation

The race to crack the genetic code

"Some of the most intense competitions ever in the history of science..."
www.laskerfoundation.org

Original papers:
https://www.nature.com/articles/171737a0 (Watson and Crick, Nature 171, 737, 1953)
https://www.nature.com/articles/171964b0 (Watson and Crick, Nature 171, 964, 1953)

The race to crack the genetic code

How is the genetic information encoded?

How is it replicated?

How is it translated?

The concept of transcription did not exist.


The race to crack the genetic code

No DNA sequencing

Ribosomes had been noticed, but not related to inheritance.

mRNA was unknown

tRNA was unknown


The race to crack the genetic code

Fundamental questions:

DNA is a double helix. So, is the information on both strands?


If it is only on one strand, how do you know which one?
Which direction do you read in?

The race to crack the genetic code

The alphabet problem:

There are 20 kinds of amino acids in proteins but only four kinds of nucleotide bases in DNA.
No one-to-one mapping from bases to amino acids (4^1 = 4).
No two-to-one mapping, since there are only 16 doublets of bases (4^2 = 16).
Three-to-one could work: 64 triplets (4^3 = 64).

64 triplets: more than three times the number needed.

Explaining this excess became a major preoccupation of coding theorists.
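
The counting argument can be reproduced by brute force. This short Python sketch enumerates all words of a given length over the four bases and finds the shortest word length that offers at least 20 different words; the function and variable names are illustrative.

```python
from itertools import product

BASES = "ACGT"
N_AMINO_ACIDS = 20


def count_words(length: int) -> int:
    """Number of distinct base words of the given length (equals 4 ** length)."""
    return sum(1 for _ in product(BASES, repeat=length))


word_counts = {length: count_words(length) for length in (1, 2, 3)}

# Shortest word length that can encode all 20 amino acids.
minimal_length = next(n for n in range(1, 10) if count_words(n) >= N_AMINO_ACIDS)

print(word_counts)
print(minimal_length)
```

Singlets and doublets fall short of 20, so triplets are the shortest words that suffice, leaving 44 of the 64 triplets apparently surplus.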


The race to crack the genetic code

Less than a year after James Watson and Francis Crick discovered the molecular
structure of DNA, George Gamow, a professional physicist and amateur biologist,
proposed the first definite coding scheme for DNA.


The race to crack the genetic code

In this letter to microbiologist Martynas Ycas, Gamow discusses his idea for the code.
https://www.genetics.org/content/211/3/789

The race to crack the genetic code

https://www.nature.com/articles/173318a0

The race to crack the genetic code

dsDNA acts directly as a template for assembling amino acids into proteins (Gamow's diamond code).

The various combinations of bases along one of the grooves in the double helix form distinctively shaped cavities into which the side chains of amino acids fit. Each cavity attracts a specific amino acid. Once all the amino acids are lined up along the groove, an enzyme polymerizes them.

Although the diamonds have four corners, the paired bases along the horizontal diagonal are complementary, and so only one of them carries any information (a triplet code in disguise).


The race to crack the genetic code

Gamow noted that most amino acid side chains are symmetrical, and he therefore postulated that the diamonds could be flipped end-for-end or flopped side-to-side without changing their meaning.
When all such symmetries are taken into account, exactly 20 codons remain: just the number Gamow was looking for.
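
The number 20 can be checked by enumeration. The sketch below assumes a simple model of a diamond: a top base, one informative base of the complementary pair on the horizontal diagonal, and a bottom base. Flipping end-for-end swaps top and bottom; flopping side-to-side replaces the middle base by its pairing partner. This representation and all names are assumptions made for illustration, not Gamow's own notation.

```python
from itertools import product

BASES = "ACGT"
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def canonical(diamond):
    """Pick one representative for all diamonds equivalent under flip and flop."""
    top, middle, bottom = diamond
    equivalents = [
        (t, m, b)
        for t, b in ((top, bottom), (bottom, top))   # flip end-for-end
        for m in (middle, PARTNER[middle])           # flop side-to-side
    ]
    return min(equivalents)


all_diamonds = list(product(BASES, repeat=3))
diamond_classes = {canonical(d) for d in all_diamonds}

print(len(all_diamonds), len(diamond_classes))
```

Of the 64 diamonds, only 20 remain distinct once the two symmetries are applied: 10 unordered top/bottom combinations times 2 kinds of base pair.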


The race to crack the genetic code

Even with the sparse protein sequence data available in the mid-1950s, Crick was
able to show that the diamond code was ruled out by the experimental evidence.
There were known patterns of amino acid repetitions that the diamond code could
not produce.

The race to crack the genetic code

Gamow founded "The RNA Tie Club", limited to 20 regular members (one for each amino acid) and four honorary members (one for each nucleotide base).

Members included Orgel, Crick, Watson and Rich.

The race to crack the genetic code

By the late 1950s, there was growing support for the idea of messenger RNA, a single-stranded molecule acting as an intermediary between DNA and the protein-synthesizing machinery.

At the same time Crick was formulating the "adaptor hypothesis", the idea that amino acids do not interact directly with messenger RNA but are carried by small molecules that recognize specific codons. The codons were by then thought to be non-overlapping triplets of bases.

The race to crack the genetic code

The process of gene expression was imagined like this:

(1) The appropriate segment of DNA is transcribed into messenger RNA.
(2) Then the messenger RNA stretches out in the cytoplasm of the cell with its long row of codons exposed.
(3) Each adaptor molecule, already charged with the correct amino acid, latches onto the right codon.
(4) When all the codons are occupied, the amino acids are linked together, and the protein is complete.

The frame-shift problem: with non-overlapping triplets and no punctuation, how does an adaptor know where one codon ends and the next begins?

The race to crack the genetic code

The frame-shift problem doesn't arise with an overlapping code, because all
three reading frames are simultaneously valid.
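
The three reading frames of a message can be listed explicitly. The following Python sketch splits a sequence into non-overlapping triplets starting at offsets 0, 1 and 2; the function name and the example sequence are illustrative assumptions.

```python
def reading_frames(sequence: str) -> list[list[str]]:
    """Return the complete triplets of the three possible reading frames."""
    return [
        [sequence[i:i + 3] for i in range(offset, len(sequence) - 2, 3)]
        for offset in range(3)
    ]


message = "AUGGCUUAA"
frames = reading_frames(message)

for offset, triplets in enumerate(frames):
    print(offset, triplets)
```

With a non-overlapping code only one of the three lists is the intended message, and nothing in the sequence itself marks which one. With an overlapping code every triplet in all three lists is read, so the question does not arise.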


The race to crack the genetic code

In 1957 Crick devised a solution.

A comma-free code is constructed so that only the codons in one reading frame are meaningful; the overlapping triplets are nonsense.

A code with this property is said to be comma-free, since messages remain unambiguous even when words are run togetherwithoutcommasorspaces.

Reading frame 1: sense
Reading frame 2: nonsense
Reading frame 3: nonsense

The race to crack the genetic code

How many words can a comma-free code include?

(1) The codons AAA, CCC, GGG and UUU cannot appear in any comma-free code, since they cannot combine with themselves without generating reading-frame ambiguity.
(2) The remaining 60 codons can be sorted into groups of three, where the codons within each group are related by a cyclic permutation (e.g. AGU, GUA and UAG).
(3) A comma-free code can have no more than one codon from each of these permutation classes.
(4) Dividing 60 objects into groups of three produces exactly 20 groups.
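
The upper bound of 20 can be verified by enumerating the cyclic permutation classes. The sketch below also includes a small checker for the comma-free property, applied only to tiny toy codes. All names are illustrative, and the checker is not a construction of Crick's 20-word solution.

```python
from itertools import product

BASES = "ACGU"


def rotations(codon: str) -> frozenset:
    """All cyclic permutations of a triplet, e.g. AGU -> {AGU, GUA, UAG}."""
    return frozenset(codon[i:] + codon[:i] for i in range(3))


all_codons = ["".join(c) for c in product(BASES, repeat=3)]
classes = {rotations(c) for c in all_codons}

# Classes with a single member are AAA, CCC, GGG and UUU; they are unusable.
usable_classes = [cls for cls in classes if len(cls) == 3]


def is_comma_free(code) -> bool:
    """True if no out-of-frame triplet of any two adjacent words is itself a word."""
    words = set(code)
    for first, second in product(words, repeat=2):
        joined = first + second
        if joined[1:4] in words or joined[2:5] in words:
            return False
    return True


print(len(all_codons), len(classes), len(usable_classes))
print(is_comma_free({"ACG"}), is_comma_free({"ACG", "CGA"}), is_comma_free({"AAA"}))
```

Because a comma-free code can take at most one word from each usable class, the count of usable classes is the upper limit on the number of words.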

The race to crack the genetic code

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC528468/

CODES WITHOUT COMMAS
By F. H. C. Crick, J. S. Griffith, and L. E. Orgel
Medical Research Council Unit, Cavendish Laboratory, and Department of Theoretical Chemistry, Cambridge, England
Communicated by G. Gamow, February 11, 1957
(Proc. N. A. S., Physics, p. 416)

This paper deals with a mathematical problem which arose in connection with protein synthesis. We present the solution here because it gives the "magic number" 20, so that our answer may perhaps be of biological significance. To make this clear, we sketch in the biochemical background first.

It is assumed in one of the more popular theories of protein synthesis that amino acids are ordered on a nucleic acid strand (see, for example, Dounce [1]) and that the order of the amino acids is determined by the order of the nucleotides of the nucleic acid. There are some twenty naturally occurring amino acids commonly found in proteins, but (usually) only four different nucleotides. The problem of how a sequence of four things (nucleotides) can determine a sequence of twenty things (amino acids) is known as the "coding" problem.

This problem is a formal one. In essence, it is not concerned with either the chemical steps or the details of the stereochemistry. It is not even essential to specify whether RNA or DNA is the nucleic acid being considered. Naturally, all these points are of the greatest interest, but they are only indirectly involved in the formal problem of coding.

The first definite proposal was made by Gamow [2]. His code, which was suggested by the structure of DNA, was of the "overlapping" type. The meaning of this is illustrated in Figure 1. Gamow's code was also "degenerate", that is, several different sets of three letters (picked in a special way) stood for a particular amino acid. However, all the 64 (4 × 4 × 4) possible sets of three letters stood for one amino acid or another, so that any sequence whatever of the four letters stood for a definite sequence of amino acids.

It is easy to see that codes [...] the allowed amino acid sequences.

(p. 418)

[...] make nonsense. We further assume that all possible sequences of the amino acids may occur (that is, can be coded) and that at every point in the string of letters one can only read "sense" in the correct way. This is illustrated in Figure 3. In other words, any two triplets which make sense can be put side by side, and yet the overlapping triplets so formed must always be nonsense.

FIG. 3. The numbers represent the positions occupied by the four letters A, B, C, and D. It is shown which triplets make sense and which nonsense.

It is obvious that with these restrictions one will be unable to code 64 different amino acids. The mathematical problem is to find the maximum number that can be coded. We shall show (1) that the maximum number cannot be greater than 20 and (2) that a solution for 20 can be given.

To prove the first [...]
of
of the overlapping
Unfortunately,
type impose severe restrictions on
point, consider
werestrictions haveforbeenthe moment the restrictions imposed
found,The first
although by definite proposal
placing(unpublished)
considerable each amino
no such
wasbeen
acid
efforts have made
next made, by
tobyitself. Gamow.2
a number Then, clearly, Histhe code, which
triplet AAAwas mustsuggested
ofby theto structure of since,
workers, DNA,if was of the "overlapping"
find them. Part of this work has been reviewed by Gamow, Rich,
be nonsense, it corresponded to an amino acid, type.a., then Theaameaning
would beof this is
AAAAAA, and this
illustrated in Figure 1. Gamow's code was also "degenerate"-that the
sequence can be misinterpreted by associating a with is, several
sets of three secondletters to fourth, or third in
(picked to fifth, a special letters. We can thus for rejecta AAA, BBB, CCC,
and DDD. way) stood particular amino acid.
However, all It isthe easy 64 to see(4 thatX 4theX60 4)remaining possible setscan
triplets of bethree
grouped into 20stood
letters sets offorthree,one amino
acid or another, each set ofsothree thatbeing anycyclic sequence permutations whatever the fourConsider
of oneofanother. lettersasstoodan ex-for a de-
finite sequence ample ABC of aminoand itsacids. cyclic permutations BCA and CAB. It is clear that we can
choose any one of these, but not more than one. For suppose that we let BCA
It is easy to see that codes of the overlapping
stand for the amino acid 13; then 131 is BCABCA, and so CAB and ABC must, by type impose severe restrictions on
the allowed our rules,aminobe acid nonsense. sequences. Since we Unfortunately,
can choose at the most no such restrictions
one triplet from each have been
found, although cyclic set, we cannot choose (unpublished)
considerable more than 20. Noefforts solutionhave made, which
beentherefore,
is possible, by a number
of workers, codestomore The
find thanthem. 20race
differentPartto ofcrack
amino acids.work
this the hasgenetic
been reviewed code by Gamow, Rich,
We have so far not considered the effects of putting unlike amino acids- to-
gether, to give pairs of the form a13 and ha. It might be thought that this would
still further reduce the possible number of amino acids, but this turns out not to be
so, since we can write down a construction which obeys all our rules and yet codes
20 different amino acids. One possible solution is

A A A
A BABB A CB DB
B C CBCD
27.05.1961 (3:00 a.m.)
A
where ABB means ABA and ABB, etc. It is easy to see, by systematic enumer-
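The counting argument in the excerpt above can be checked by brute force. The following is a minimal sketch, not taken from the paper or the slides: the function names are hypothetical, and the candidate dictionary (all triplets whose first letter is smaller than the middle letter and whose last letter is not larger than it) is an assumed construction that is consistent with the "ABA and ABB" example but was chosen here only for illustration.

```python
from itertools import product

LETTERS = "ABCD"  # the four "nucleotides" of the formal problem


def cyclic_classes(letters=LETTERS):
    """Group all triplets into sets of cyclic permutations, skipping AAA, BBB, ..."""
    classes = set()
    for t in ("".join(p) for p in product(letters, repeat=3)):
        rotations = frozenset({t, t[1:] + t[:1], t[2:] + t[:2]})
        if len(rotations) == 3:  # AAA-type triplets have only one rotation
            classes.add(rotations)
    return classes


def is_comma_free(words):
    """True if no out-of-frame triplet of any two adjacent words is itself a word."""
    words = set(words)
    for w1, w2 in product(words, repeat=2):
        s = w1 + w2
        if s[1:4] in words or s[2:5] in words:
            return False
    return True


# Assumed construction (illustrative): first < middle >= last.
candidate = sorted(
    x + y + z for x, y, z in product(LETTERS, repeat=3) if x < y >= z
)

print(len(cyclic_classes()))     # upper bound on the number of sense triplets
print(len(candidate))            # size of the candidate dictionary
print(is_comma_free(candidate))  # whether the candidate obeys the rules
```

The first function reproduces the upper bound (one triplet at most from each cyclic set), the second tests the "overlapping triplets must be nonsense" rule directly on every pair of neighbouring words.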

The race to crack the genetic code

Marshall Warren Nirenberg (April 10, 1927 – January 15, 2010)


The race to crack the genetic code

Reprinted from the Proceedings of the National Academy of Sciences, Vol. 47, No. 10, pp. 1588-1602. October, 1961.

THE DEPENDENCE OF CELL-FREE PROTEIN SYNTHESIS IN E. COLI UPON NATURALLY OCCURRING OR SYNTHETIC POLYRIBONUCLEOTIDES

BY MARSHALL W. NIRENBERG AND J. HEINRICH MATTHAEI*

NATIONAL INSTITUTES OF HEALTH, BETHESDA, MARYLAND

Communicated by Joseph E. Smadel, August 3, 1961

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC223178/

A stable cell-free system has been obtained from E. coli which incorporates C14-valine into protein at a rapid rate. It was shown that this apparent protein synthesis was energy-dependent, was stimulated by a mixture of L-amino acids, and was markedly inhibited by RNAase, puromycin, and chloramphenicol.1 The present communication describes a novel characteristic of the system, that is, a requirement for template RNA,2 needed for amino acid incorporation even in the ...

The race to crack the genetic code

It took until 1965 for the genetic code to be solved completely: 12 years after the double helix.

The race to crack the genetic code

https://www.sciencedirect.com/science/article/pii/S0968000403003025

46 Review TRENDS in Biochemical Sciences Vol.29 No.1 January 2004

Historical review: Deciphering the genetic code – a personal account
Marshall Nirenberg
Laboratory of Biochemical Genetics, National Heart, Lung and Blood Institute, National Institutes of Health, 9000 Rockville Pike, MSC – 1654, Building 10, Room 7N-315, Bethesda, MD 20892-1654, USA

This is an autobiographical description of the events that led to the breaking of the genetic code and the subsequent race to decipher the code. The code was deciphered in two stages over a five-year period between 1961 and 1966. During the first stage, the base compositions of codons were deciphered by the directing cell-free protein synthesis with randomly ordered RNA preparations. During the second phase, the nucleotide sequences of RNA codons were deciphered by determining the species of aminoacyl-tRNA that bound to ribosomes in response to trinucleotides of known sequence. Views on general topics such as how to pick a research problem and competition versus collaboration also are discussed.

I would like to tell you how the genetic code was deciphered from a personal point of view. I came to the National Institutes of Health (NIH) in 1957 as a post-doctoral fellow with Dewitt Stetten, Jr, a wise, highly articulate scientist and administrator, immediately after obtaining a PhD in biochemistry from the University of Michigan in Ann Arbor. The next year, I started work with William Jakoby and, by enrichment culture, I isolated a Pseudomonad that grew on γ-butyrolactone and purified three enzymes involved in the catabolism of γ-hydroxybutyric acid [1].

There was a weekly seminar in Stetten's laboratory in which Gordon Tomkins (Figure 1), who worked in a different laboratory, participated. Gordon was brilliant, with a wonderful associative memory and a magnificent sense of humor. His seminars were superb, especially his description of the step-by-step developments in the problem that he intended to discuss. Towards the end of my post-doctoral fellowship, Gordon replaced Herman Kalckar as head of the Section of Metabolic Enzymes and offered me a position as an independent investigator in his laboratory. The other independent investigators in the laboratory were Elizabeth Maxwell and Victor Ginsberg, who were carbohydrate biochemists, and Todd Miles, a nucleic-acid biochemist. It was a wonderful opportunity and I decided then that if I was going to work this hard I might as well have the fun of exploring an important problem.

In my opinion, the most exciting work in molecular biology in 1959 were the genetic experiments of Monod and Jacob on the regulation of the gene that encodes β-galactosidase in Escherichia coli and that the mechanism of protein synthesis was one of the most exciting areas in biochemistry. Some of the best biochemists in the world were working on cell-free protein synthesis, and I had no experience with either gene regulation or protein synthesis, having previously worked on sugar transport, glycogen metabolism and enzyme purification. After thinking about this for a considerable time, I finally decided to switch fields. My immediate objective was to investigate the existence of mRNA by determining whether cell-free protein synthesis in E. coli extracts was stimulated by an RNA fraction or by DNA. In the longer term, my objective was to achieve the cell-free synthesis of penicillinase, a small inducible enzyme that ...

Figure 1. Gordon Tompkins. Gordon was brilliant, highly articulate and very funny. He was a charismatic individual who created a stimulating atmosphere and encouraged exploration. In 1958, towards the end of my post-doctoral fellowship at the NIH, he offered me a position as an independent investigator in his laboratory.

Corresponding author: Marshall Nirenberg (mnirenberg@nih.gov).
http://tibs.trends.com 0968-0004/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibs.2003.11.009

ADD Khorana!!!!

The RNA message is decoded in ribosomes

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC220908/

BIOCHEMISTRY: CHAPEVILLE ET AL., PROC. N. A. S., VOL. 48, 1962, pp. 1086-1087

ON THE ROLE OF SOLUBLE RIBONUCLEIC ACID IN CODING FOR AMINO ACIDS

BY FRANÇOIS CHAPEVILLE, FRITZ LIPMANN, GÜNTER VON EHRENSTEIN, BERNARD WEISBLUM, WILLIAM J. RAY, JR., AND SEYMOUR BENZER

THE ROCKEFELLER INSTITUTE, JOHNS HOPKINS UNIVERSITY, AND PURDUE UNIVERSITY

Communicated April 25, 1962

In protein synthesis, each amino acid is first joined specifically with a corresponding sRNA through the mediation of an activating enzyme. These aminoacyl-sRNA's, by reaction with a ribosomal preparation,1, 2 form proteins with specific amino acid sequences.3 According to the "adaptor" hypothesis of Crick4 and Hoagland,5 the position of a particular amino acid would be determined not by the amino acid itself, but by hydrogen bonding between the RNA template and a complementary nucleotide sequence in the sRNA carrying the amino acid. The experiment described in this paper was designed as a direct test of the adaptor hypothesis, by attaching an amino acid to its normal sRNA and then, without breaking the bond, converting the amino acid to another one of the natural amino acids. It is then possible to determine whether the coding properties of this hybrid are determined by the sRNA or the amino acid. We have made use of the fact that cysteine can be altered by reductive desulfhydration with Raney Nickel to alanine.

This reaction can be performed while the cysteine is attached to its sRNA, producing the hybrid Ala-sRNA^CySH. A superscript denotes the normal amino acid accepting specificity of an sRNA, the actual amino acid attached being indicated as a prefix. Figure 1 illustrates this procedure.

To determine the coding properties of the hybrid molecule, we have utilized the finding by Nirenberg and Matthaei,6 Lengyel, Speyer, and Ochoa,7 Speyer, Lengyel, Basilio, and Ochoa,8 and Martin, Matthaei, Jones, and Nirenberg,9 that polyuridylic-guanylic acid will stimulate ribosomal incorporation into polypeptides of certain amino acids, including cysteine, but not alanine. As shown below, this difference between cysteine and alanine also applies when they are attached to their normal acceptors, i.e., CySH-sRNA^CySH is reactive with poly UG but Ala-sRNA^Ala is not. The hybrid molecule Ala-sRNA^CySH proved to be just as reactive as CySH-sRNA^CySH, leading to the conclusion that the sRNA moiety indeed determines the coding specificity.

Materials and Methods-Preparation of C14-L-cysteine: L-C14-cysteine of high specific activity was prepared from L-C14-serine. The yeast serine sulfhydrase described by Schlossmann and Lynen10 catalyzes the reaction:

CH2OH-CHNH2-COOH + H2S → CH2SH-CHNH2-COOH + H2O.

The enzyme was prepared from bakers' yeast which had been frozen in liquid nitrogen and stored at -20°; 50 gm of frozen yeast were extracted by stirring at 3° for 5 hr with 100 ml of 0.05 M K2HPO4 and 0.05 M EDTA. From then on, the procedure of Schlossmann and Lynen was followed up to the ammonium sulfate step. The ammonium sulfate precipitate between 40 and 65 per cent saturation was dissolved in 15 ml of 0.05 M Tris HCl, pH 7.5, and dialyzed against 0.02 M Tris HCl buffer. The dialysate was stored at -20°. 0.75 µmole of C14-L-serine (65 µC/µmole, New England Nuclear Corporation) was dissolved in a total volume of 1.5 ml containing: 3 µmoles of EDTA; 0.5 ml of 0.5 M Tris HCl, pH 8.5, saturated with H2S; 0.3 µmole of pyridoxal phosphate; and 2 mg of enzyme. After incubation at 37° for 3 hr in nitrogen, [...] ml of ethanol containing 0.2 ml of 2 N HCl were added to [...]. After centrifugation, the supernatant was evaporated and the residue re-extracted with 10 ml ethanol plus 0.2 ml 2 N HCl, and the extraction and evaporation were repeated at least once more. Before the ethanol addition, 75 µmoles of C12-serine, 20 µmoles of C12-alanine, and 30 µmoles of C12-glycine were added to dilute residual radioactivity of the serine and possible degradation products. The repeated extraction-evaporation procedure was used, since cysteine is more soluble in ethanol than the other amino acids. The final product contained approximately 40 per cent of the input radioactivity. On paper electrophoresis at pH 1.85 in 7.8 per cent acetic acid and 2.5 per cent formic acid (70 volts/cm, 60 min), no detectable radioactivity was found except with cysteine. The dry product was stored at -20°.

Preparation of C14-CySH-sRNA^CySH: E. coli-sRNA was prepared as described,11 and a 105,000 ...

[Figure 1: cysteine is attached to the cysteine acceptor sRNA by the cysteine activating enzyme; Raney Nickel then converts the attached cysteine to alanine.]

FIG. 1.-Plan of experiment. Cysteine is attached to its normal acceptor sRNA through the mediation of the cysteine activating enzyme. By the action of Raney Nickel, the cysteine, while still attached, is converted to alanine. The coding properties of ...

Good question:

Can the ribosome check the amino acid attached to the tRNA? How do we know that?
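The logic of the Chapeville et al. experiment can be illustrated with a toy model. This is only a sketch under stated assumptions: the class and function names are hypothetical, the codon assignments (UGU/UGC for cysteine, GCN for alanine) come from the modern codon table rather than from the 1962 paper, and the random poly-UG copolymer is simplified to a repeating UGU template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChargedTRNA:
    """A tRNA (sRNA) with its normal specificity and the amino acid it actually carries."""
    specificity: str   # amino acid the tRNA normally accepts
    cargo: str         # amino acid actually attached
    codons: frozenset  # codons recognised by the anticodon


def translate(mrna, trna_pool):
    """Toy ribosome: selects a tRNA by codon-anticodon match only, never by cargo."""
    peptide = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        codon = mrna[i:i + 3]
        match = next((t for t in trna_pool if codon in t.codons), None)
        if match is None:  # no tRNA reads this codon: synthesis stops
            break
        peptide.append(match.cargo)
    return peptide


# Modern codon assignments, used here as an assumption for illustration.
CYS_CODONS = frozenset({"UGU", "UGC"})
ALA_CODONS = frozenset({"GCU", "GCC", "GCA", "GCG"})

cys_trna_cys = ChargedTRNA("Cys", "Cys", CYS_CODONS)  # CySH-sRNA^CySH
ala_trna_ala = ChargedTRNA("Ala", "Ala", ALA_CODONS)  # Ala-sRNA^Ala
ala_trna_cys = ChargedTRNA("Cys", "Ala", CYS_CODONS)  # hybrid after Raney Nickel

template = "UGU" * 4  # simplified stand-in for poly UG

print(translate(template, [cys_trna_cys]))  # cysteine is incorporated
print(translate(template, [ala_trna_ala]))  # nothing is incorporated
print(translate(template, [ala_trna_cys]))  # alanine is incorporated at Cys codons
```

In this model the ribosome never inspects the cargo, which is exactly the behaviour the hybrid molecule revealed: the tRNA, not the amino acid, determines where the residue is placed.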

Which is bigger, the mRNA or the encoded protein?

Amino acids ≈ 110 Da Nucleotides ≈ 330 Da 3 Nucleotides per codon ≈ 1000 Da

Which is bigger, a tRNA or an amino acid?

Amino acids ≈ 110 Da Nucleotides ≈ 330 Da ≈ 70-90 nucleotides per tRNA
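Both size questions reduce to simple arithmetic with the average masses given on the slides. A minimal sketch (the function names are hypothetical; the masses and tRNA lengths are the slide values):

```python
AMINO_ACID_DA = 110   # average mass of one amino acid residue (slide value)
NUCLEOTIDE_DA = 330   # average mass of one nucleotide (slide value)


def codon_to_residue_mass_ratio():
    """Mass of the three nucleotides of one codon relative to the residue they encode."""
    return (3 * NUCLEOTIDE_DA) / AMINO_ACID_DA


def trna_mass_da(n_nucleotides):
    """Approximate mass of a tRNA of the given length."""
    return n_nucleotides * NUCLEOTIDE_DA


print(codon_to_residue_mass_ratio())  # coding mRNA vs protein, per residue
for length in (70, 90):               # tRNA length range from the slide
    mass = trna_mass_da(length)
    print(length, mass, mass / AMINO_ACID_DA)  # tRNA vs the amino acid it carries
```

Per encoded residue, the coding region of the mRNA is roughly nine times heavier than the protein, and a tRNA outweighs its amino acid by a factor of a few hundred.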


Two adaptors are required to translate the genetic code

Aminoacyl-tRNA synthetase couples amino acid to tRNA (=charging).


tRNA anticodon base-pairs to mRNA.
Both adaptors are of equal importance for the decoding process.

Note how tiny the amino acid is compared to the tRNA.


Ribosomes have one binding site for mRNA and 3 sites for tRNA
[Figure: ribosome with large and small subunit and the three tRNA binding sites]
E site: Exit
P site: Peptidyl-tRNA
A site: Aminoacyl-tRNA

Good question:

How many sites of the ribosome are occupied at any one time?

The eukaryotic ribosome can process about 2 amino acids per second.
Bacterial ribosomes can be 10 times faster.
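To get a feeling for these rates, the time one ribosome needs to make one protein can be estimated. This is a rough sketch: the 400-residue protein is a hypothetical example, the bacterial rate is simply taken as ten times the eukaryotic slide value, and initiation and termination are ignored.

```python
EUKARYOTE_AA_PER_S = 2                         # slide value
BACTERIA_AA_PER_S = 10 * EUKARYOTE_AA_PER_S    # "10 times faster"


def elongation_time_s(n_residues, rate_aa_per_s):
    """Seconds a single ribosome needs to elongate a chain of n_residues."""
    return n_residues / rate_aa_per_s


protein_length = 400  # hypothetical protein, chosen only for illustration
print(elongation_time_s(protein_length, EUKARYOTE_AA_PER_S))  # seconds in a eukaryote
print(elongation_time_s(protein_length, BACTERIA_AA_PER_S))   # seconds in a bacterium
```

With these numbers a eukaryotic ribosome spends a few minutes on a single mid-sized protein, a bacterial ribosome well under a minute.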

Translation is a four-step cycle

This is what we (hopefully) all know, but ...


Ribosomes are made of 4 RNAs and >80 proteins

Crystal structure solved in 2000.

A ribosome has many more proteins than RNAs, but the ribosomal RNAs give the ribosome its shape.

The ribosome is a ribozyme.

Good question:

But what, then, is the function of all those proteins?


Bonus

Who has the largest genome?

133 billion base pairs: the marbled lungfish / leopard lungfish (Protopterus aethiopicus) (largest known genome of any vertebrate)

150 billion base pairs: Paris japonica

670 billion base pairs: Polychaos dubium (freshwater amoeba)

Jaume Pellicer, Michael F. Fay, Ilia J. Leitch. The largest eukaryotic genome of them all? Botanical Journal of the Linnean Society, 2010; 164 (1): 10. DOI: 10.1111/j.1095-8339.2010.01072.x

Why does all that matter?


Who has the largest genome?

Chromosome number and genome size are not related to size or complexity of organisms.


How many genes are in a genome?

How many genes are in a genome?

3,000,000,000 bp ÷ 1,000 bp per gene = 3,000,000 genes
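The back-of-the-envelope estimate above can be written out explicitly. In this sketch the function name is hypothetical, and the comparison value of about 20,000 protein-coding genes in the human genome is not from the slide; it is added as a commonly quoted approximate figure to show how far the naive estimate overshoots.

```python
def naive_gene_count(genome_bp, bp_per_gene):
    """Naive estimate: assume the whole genome is tiled with genes of equal length."""
    return genome_bp // bp_per_gene


genome_bp = 3_000_000_000   # slide value
bp_per_gene = 1_000         # slide value, read here as an average coding length

estimate = naive_gene_count(genome_bp, bp_per_gene)
approx_protein_coding_genes = 20_000  # assumption, not from the slide

print(estimate)
print(estimate / approx_protein_coding_genes)  # factor by which the estimate overshoots
```

The large gap between the two numbers is a reminder that most of the genome does not consist of densely packed coding sequence.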


How many genes are in a genome?


Gene number is only vaguely related to size or complexity of organisms.

Is replication time limiting for the cell cycle?


Is replication time limiting for the cell cycle?

How many origins in mouse? 1,000-100,000 (estimated range)

In Drosophila? 10,000 (estimate)

Each replisome proceeds at a rate of 4-40 bp/s or roughly 1 kb/min

In early Drosophila development ≈120 million bp* are replicated once every ≈8 minutes

We do not know the answer to the above question!

* Let's also consider the leopard lungfish with its 133 billion bp!
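The numbers on this slide can be combined into a rough estimate. This is a sketch under simplifying assumptions that are not stated on the slide: every origin fires at the same time, each origin launches two forks (bidirectional replication), forks move at the slide's round figure of 1 kb/min, and origins are evenly spaced. The function names and the 100,000-origin lungfish scenario are hypothetical.

```python
import math

FORK_BP_PER_MIN = 1_000   # slide value, "roughly 1 kb/min"
FORKS_PER_ORIGIN = 2      # assumption: bidirectional replication


def min_origins(genome_bp, available_min):
    """Fewest simultaneously firing origins that finish replication in time."""
    bp_per_origin = FORK_BP_PER_MIN * FORKS_PER_ORIGIN * available_min
    return math.ceil(genome_bp / bp_per_origin)


def replication_time_min(genome_bp, n_origins):
    """Minutes needed when n_origins evenly spaced origins fire together."""
    return genome_bp / (n_origins * FORKS_PER_ORIGIN * FORK_BP_PER_MIN)


# Early Drosophila embryo: about 120 million bp in about 8 minutes.
print(min_origins(120_000_000, 8))

# Leopard lungfish: 133 billion bp, with a hypothetical 100,000 origins.
print(replication_time_min(133_000_000_000, 100_000))
```

Under these assumptions the Drosophila embryo needs several thousand origins, the same order of magnitude as the estimate on the slide, while the lungfish genome would take many hours even with 100,000 origins.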


How much of a cell is DNA?

E. coli

Large genome = complex organism?

Gene number is only vaguely correlated with species complexity. Why?


RNA can form intramolecular base pairs

[Figure, three panels: (1) hypothetical fold considering only conventional base pairing; (2) hypothetical fold considering conventional and non-conventional base pairing; (3) existing RNA. Legend: conventional base pairs, other base pairs.]
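The distinction between conventional and other base pairs can be made concrete with a small classifier. This is only an illustrative sketch: the dot-bracket notation for the fold, the function names and the toy hairpin sequence are assumptions that do not come from the slide, and "conventional" is taken here to mean Watson-Crick pairs (A-U, G-C).

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pairs_from_dot_bracket(structure):
    """Return sorted (i, j) index pairs for matching brackets in a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return sorted(pairs)


def classify_pairs(sequence, structure):
    """Split the intramolecular base pairs into conventional and non-conventional."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    result = {"conventional": [], "non-conventional": []}
    for i, j in pairs_from_dot_bracket(structure):
        kind = "conventional" if (sequence[i], sequence[j]) in WATSON_CRICK else "non-conventional"
        result[kind].append((i, j))
    return result


# Toy hairpin (hypothetical): a three-pair stem closing a four-nucleotide loop.
toy_sequence = "GGCGAAAGUC"
toy_structure = "(((....)))"
print(classify_pairs(toy_sequence, toy_structure))
```

In the toy hairpin the middle pair of the stem is a G-U pair, which the classifier reports as non-conventional, while the two flanking pairs are Watson-Crick.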

RNA can form intramolecular base pairs

Good question:

What are the differences between DNA and RNA helices and what causes those differences?


During the cell cycle chromosomes have different states

Article

Mitotic Transcriptional Activation: Clearance of Actively Engaged Pol II via Transcriptional Elongation Control in Mitosis

Authors
Kaiwei Liang, Ashley R. Woodfin, Brian D. Slaughter, ..., Jeffrey S. Haug, Sue L. Jaspersen, Ali Shilatifard

Correspondence
ash@northwestern.edu

In Brief
How transcription is shut down as cells begin to condense chromosomes during mitosis is poorly understood. Liang et al. report the requirement of mitotic transcriptional activation by P-TEFb to release paused Pol II as a prerequisite for this process, and ultimately for proper cell-cycle progression.

Highlights
- Mitotic transcription inhibition occurs in early mitosis
- P-TEFb is required for mitotic transcriptional activation and release of paused Pol II
- Nascent RNA-seq and RNA FISH reveal active transcription at the onset of mitosis
- Inhibition of mitotic transcriptional activation delays cell-cycle progression

Accession Numbers
GSE71848

Liang et al., 2015, Molecular Cell 60, 435–445
November 5, 2015 ©2015 Elsevier Inc.
http://dx.doi.org/10.1016/j.molcel.2015.09.021

https://www.cell.com/action/showPdf?pii=S1097-2765(15)00741-8

Human cells contain two copies of each chromosome

Chromosomes differ in size.

Chromosome bands can be visualised using Giemsa staining.

Giemsa produces dark bands in A-T-rich regions.

The knobs on chromosomes 13, 14, 15, 21 and 22 indicate the positions of rRNA genes.

[Figure label: centromere]

Karyotyping reveals genetic defects

Chromosomes can be “painted” by in situ hybridisation.

Some inherited diseases and many forms of cancer can be detected by chromosome painting.


Chromosome painting

https://www.nature.com/articles/nprot.2006.91
© 2006 Nature Publishing Group http://www.nature.com/natureprotocols

