Leading Edge
Review
3D genome, on repeat:
Higher-order folding principles
of the heterochromatinized repetitive genome
Spencer A. Haws,1,2,3,4 Zoltan Simandi,1,2,3,4 R. Jordan Barnett,1,2,3 and Jennifer E. Phillips-Cremins1,2,3,*
1Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
2Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
3Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
4These authors contributed equally
*Correspondence: jcremins@seas.upenn.edu
https://doi.org/10.1016/j.cell.2022.06.052
SUMMARY
Nearly half of the human genome is comprised of diverse repetitive sequences ranging from satellite repeats to retrotransposable elements. Such sequences are susceptible to stepwise expansions, duplications, inversions, and recombination events which can compromise genome function. In this review, we discuss the higher-order folding mechanisms of compartmentalization and loop extrusion and how they shape, and are shaped by, heterochromatin. Using primarily mammalian model systems, we contrast mechanisms governing H3K9me3-mediated heterochromatinization of the repetitive genome and highlight emerging links between repetitive elements and chromatin folding.
HETEROCHROMATIN: A HISTORICAL PERSPECTIVE ON PROPERTIES, MECHANISMS, AND FUNCTIONS
Eukaryotic genomes are broadly partitioned into heterochromatin and euchromatin. The existence of heterochromatin was first reported by Emil Heitz in 1928 as high-intensity staining of DNA dye throughout the cell cycle in specific genome fragments (Heitz, 1928). In the nearly 100 years following Heitz’s initial discovery, the molecular properties and biological roles of heterochromatin have been elucidated in detail (Figure 1, top). Heterochromatin is relatively compact, replicates at the end of S phase, and is refractory to both enzymatic digestion and sonication (Becker et al., 2017; Lima-de-Faria and Jaworska, 1968; Wallrath and Elgin, 1995). A leading theory is that heterochromatin is relatively inaccessible to transcription factors due to its condensed state, and, therefore, the genes it encompasses are transcriptionally silent (Becker et al., 2016). By contrast, euchromatin is gene rich, transcriptionally active, and accessible to DNA binding factors that promote gene expression. Therefore, Heitz’s original observation of heterochromatin as a distinct nuclear feature has stood the test of time over nearly 100 years of imaging, genetics, genome engineering, and sequencing studies.
Significant advances have been made in our understanding of the classes of heterochromatin and their interplay with genome function. The causal effect of heterochromatin on gene silencing was demonstrated with classic genetic screens or transgene studies in Drosophila. As early as the 1930s, the phenomenon of position-effect variegation (PEV) was reported in which active genes are stochastically silenced upon ectopic placement in juxtaposition to heterochromatin (Elgin and Reuter, 2013; Muller and Altenburg, 1930). Moreover, in 1949, the inactive X chromosome was observed as a heterochromatinized Barr body to achieve dosage compensation of X-linked genes in females (Barr and Bertram, 1949). These classic studies, among many others, provided early insight which suggested that heterochromatin is generally associated with a repressive role on gene expression.
Heterochromatin also plays a chief role in the protection of genome stability by repressing repetitive DNA elements (Peters et al., 2001). Repetitive DNA elements constitute 54% of the human genome (Hoyt et al., 2022). Since the initial discovery of mobile transposable elements (TEs) in maize (McClintock, 1950), several other repetitive elements have been identified in eukaryotic genomes, including short and long interspersed nuclear elements (SINEs and LINEs, respectively) (Kit, 1961; Schmid and Deininger, 1975), pericentromeric and centromeric satellite repeats (Tyler-Smith and Brown, 1987), telomeric repeats (Blackburn and Gall, 1978), retroviral sequences (Martin et al., 1981), and short tandem repeats (STRs) (La Spada et al., 1991; Oberlé et al., 1991; Verkerk et al., 1991; Yu et al., 1991) (Figure 1, bottom). Targeted repression of repetitive elements is critical to counter their propensity for instability events (e.g., stepwise expansions, duplications, inversions, and recombination). Thus, heterochromatin has a critical role in controlling genome function at the interface between gene expression and repeat stability.
Heterochromatin can be earmarked by multiple known chromatin modifications. Repetitive DNA is packaged into what is canonically referred to as “constitutive heterochromatin,” a transcriptionally silent chromatin state characterized by the crosstalk
2690 Cell 185, July 21, 2022 © 2022 Published by Elsevier Inc.
between H3 Lys-9 tri-methylation (H3K9me3) and DNA methylation (Millán-Zambrano et al., 2022; Saksouk et al., 2015). Constitutive heterochromatin at telomeres and pericentromeres condenses vulnerable repetitive regions in an invariant manner across developmental lineages and phases of the cell cycle. H3K9me3 can also be developmentally regulated, as genome-wide changes in occupancy patterns have been linked to gene expression changes during mammalian differentiation and reprogramming (Magklara et al., 2011; Nicetto et al., 2019; Soufi et al., 2012). Moreover, cell-type-specific heterochromatin marked by the histone modification H3 Lys-27 tri-methylation (H3K27me3) plays a critical role in developmentally regulated gene expression (Millán-Zambrano et al., 2022). Together, these data highlight that distinct heterochromatin signatures can exhibit context-specific constitutive and developmentally regulated patterns of acquisition and maintenance in the mammalian genome.
An open question at the forefront of the last several decades has been the mechanisms governing context-specific deposition and maintenance of heterochromatin modifications. In the case of H3K9me3, genetic screens in Drosophila and S. pombe identified
more than fifty candidate genetic loci, termed Su(Var) group genes, as suppressors of PEV (Allshire et al., 1995; Reuter and Spierer, 1992). Two such genes, SUV39H1 and SUV39H2, encode histone methyltransferases (HMTs) which selectively methylate H3K9 in mammals. Gain- and loss-of-function studies for SUV39H1/2 have demonstrated their causal role in methylating H3K9 in mammalian cell lines (Firestein et al., 2000; Melcher et al., 2000; Rea et al., 2000). Another HMT, SET domain bifurcated histone lysine methyltransferase 1 (SETDB1), also possesses H3K9-specific tri-methylation activity (Schultz et al., 2002). In addition to HMTs, eukaryotes employ histone demethylases (HDMs) for targeted methyl-group removal, and the mammalian Jumonji domain-containing protein 2 (JMJD2/KDM4) isoforms A–D were identified as H3K9me3 HDM enzymes (reviewed in Kooistra and Helin [2012]). Similarly, H3K27me3 is also deposited in a cell-type-specific manner by the HMT activity of specific subunits of the Polycomb-repressive complex 2 (PRC2) (Yu et al., 2019), and can be removed by HDM KDM6 isoforms A–B (Kooistra and Helin, 2012). The existence of evolutionarily conserved H3K9me3 and H3K27me3 writers and erasers highlights the likely importance of precise spatiotemporal regulation of heterochromatin during mammalian development and adulthood.
In this review, we highlight the contrasting higher-order folding patterns of compartmentalization and loop extrusion and their interplay with repressive heterochromatin in the three-dimensional (3D) nucleus. In light of the first gapless, complete genome assembly created by the Telomere-to-Telomere (T2T) consortium (Hoyt et al., 2022), we summarize the diversity of repetitive DNA sequences and mobile genetic elements in the human genome. Using primarily mammalian model systems, we compare and contrast the mechanisms of H3K9me3-mediated heterochromatinization of repetitive elements and highlight emerging evidence suggesting a link between genome folding and the repetitive genome.
HIGHER-ORDER FOLDING PATTERNS OF REPRESSIVE HETEROCHROMATIN
Technological advances in imaging, sequencing, and computational biology over the last two decades have revealed that heterochromatin not only functions in a linear fashion as described above, but also in concert with higher-order genome folding structures. As imaging and sequencing technologies enabled the creation of genome-wide, ultra-high-kilobase-resolution maps of chromatin folding in single cells, we now have the ability to detect genome-wide locations of A/B compartments, subnuclear bodies, topologically associating domains (TADs), subTADs, and loops across many mammalian cell types (Figure 2). Here we discuss whether and how each distinct genome folding pattern is shaped by and informs H3K9me3-mediated heterochromatin, as well as the mechanistic principles underlying the establishment and maintenance of each folding feature.
B compartments can be established via HP1α
Compartments were originally identified in ensemble Hi-C maps binned at 1 Megabase resolution as a chromosome-wide plaid pattern of ultra-long-range intra-chromosomal and inter-chromosomal contacts (Lieberman-Aiden et al., 2009; Rao et al., 2014) (Figure 2A). The plaid pattern has been interpreted to suggest genome partitioning into A compartments of euchromatin and B compartments enriched for heterochromatin (Lieberman-Aiden et al., 2009). As Hi-C data has increased in read depth and quality over the last decade, it has enabled the additional partitioning of 1 kilobase-binned mammalian genomes into several subtypes of smaller sub-compartments marked by signature combinations of active and repressive chromatin modifications (Liu et al., 2021; Rao et al., 2014; Spracklin et al., 2021). Super-resolution and widefield imaging experiments with Oligopaints DNA fluorescence in situ hybridization (FISH) probes verified compartments independently via direct visualization in situ (Bintu et al., 2018; Mateo et al., 2019; Takei et al., 2021; Nir et al., 2018). Compartmentalization has also been observed in C. elegans and Drosophila (Bian et al., 2020; Rowley et al., 2017), suggesting that it may be an essential feature of genome folding across mammalian and non-mammalian eukaryotic organisms.
Following their initial stratification into A and B states (Lieberman-Aiden et al., 2009), compartments were further classified into multiple sub-compartments (e.g., A1, A2, B1, B2, B3) based on their combinations of chromatin modifications, DNA replication timing, and association with subnuclear structures (i.e., nuclear lamina and nucleolus) (Liu et al., 2021; Rao et al., 2014). B compartments are generally heterochromatic and gene poor and have since been stratified into sub-B compartments marked by (1) H3K9me3, HP1α, and HP1β; (2) H3K9me2 (but not H3K9me3) and the H2A.Z histone variant; or (3) H3K27me3 and PRC 1 and 2 (Spracklin et al., 2021). By contrast, A compartments are euchromatic and encompass accessible, gene-dense DNA enriched for histone modifications linked to transcriptional activation (e.g., H3K27ac and H3K4me3). Although the full identification of sub-compartment subtypes in each cell type is ongoing, these data indicate that multiple different heterochromatin mechanisms exist within the generalized B compartment, and thus they should be sub-stratified by signatures of chromatin marks when ascertaining function.
The mechanisms underlying B compartment formation are an active area of investigation. The adaptor protein heterochromatin protein 1 homolog alpha (HP1α) binds to H3K9me3 to maintain heterochromatin via the direct recruitment of additional effectors across the cell cycle and in development (Bannister et al., 2001; Lachner et al., 2001; Nakayama et al., 2001). HP1α has been directly implicated in the establishment of B compartments during the early developmental stage of zygotic genome activation (ZGA) in the Drosophila embryo (Zenk et al., 2021). Depletion of HP1α during ZGA causes a 20% decrease in B compartment strength and disrupted trans interactions between pericentromeric regions (Zenk et al., 2021). However, its knockdown does not perturb genome folding in somatic Drosophila S2 cells, thus suggesting that it is required for establishment and not maintenance of B compartments. Computational modeling of HP1α-H3K9me3-dependent chromatin compaction is only partially capable of reproducing the distinctive alternating plaid pattern exhibited by A/B compartments on Hi-C maps, suggesting additional organizing mechanisms may be involved (MacPherson et al., 2018). The apparent discrepancy might in part be explained by the ability of HP1α to nucleate into spherical droplets exhibiting the physical properties of liquid-like, membraneless condensates (Larson et al., 2017; Strom et al., 2017). Liquid-like droplets of HP1α can condense a stretched fragment of linear DNA into a dense focal point in vitro, thus raising the possibility that B compartments could form in principle through phase separation. Biochemical studies indicate that the ability of human HP1α to form liquid-like droplets is dependent on binding to DNA via its intrinsically disordered hinge region and N-terminal HP1α phosphorylation (Larson et al., 2017). While observations from these and other studies are suggestive of a role for HP1α in the establishment of B compartments via oligomerization or phase separation mechanisms, more perturbative gain- and loss-of-function studies will ascertain the extent of HP1α’s role in B compartment establishment in mouse and human systems across developmental stages.
H3K9me2/3 is necessary and sufficient for formation of LADs
The association of genomic loci with the nuclear lamina in lamina-associated domains (LADs) was reported using a DNA adenine methyltransferase-based assay (DamID) targeting B-type lamin proteins first in Drosophila Kc cells (Pickersgill et al., 2006) (Figure 2B). DamID was subsequently modified to target lamin B1 in Tig3 human lung fibroblasts, revealing that LADs are approximately 100 kb–10 Mb in size and span up to 40% of the human genome (Guelen et al., 2008). It is well established that LADs correlate with repressed genes and B compartments, and localization to the nuclear periphery has long been discussed as a model for gene silencing (van Steensel and Belmont, 2017).
The mechanisms by which LADs are established and maintained are of significant importance for understanding H3K9me2/3 deposition. LADs are enriched for H3K9me2/3, and reduction of H3K9me2/3 via depletion or inhibition of H3K9 HMTs results in disruption of chromatin contacts with the nuclear periphery across model organisms (Bian et al., 2013, 2020; Harr et al., 2015; Towbin et al., 2012). Moreover, dCas9 targeting of the HMT G9a for genomic locus-specific H3K9 di-methylation promotes lamina interactions in a methyltransferase-activity-dependent manner (See et al., 2020). HP1α is a candidate in the establishment of the interface between H3K9me2/3 and the lamina, as it binds directly to H3K9me2/3 through its chromodomain and can also bind directly to lamina-associated proteins (Holla et al., 2020; Ye et al., 1997). Together, these collective observations suggest a possible model where H3K9me2/3 is sufficient for the establishment and necessary for the maintenance of chromatin-lamina associations.
The functional relationship between gene expression and LAD localization remains an open question. In mammalian systems, thousands of genes are localized within LADs, and the majority are actively repressed or transcriptionally inactive (Leemans et al., 2019). Artificial tethering to LADs results in silencing of specific genes, which supports the argument that positioning at the periphery instructs gene expression (Finlan et al., 2008; Reddy et al., 2008). However, it is not yet fully clear whether specific lamina proteins or H3K9me2/3 drive silencing given the complexity of experiments decoupling periphery localization
with the factors involved. For example, depletion of lamina components does not typically lead to significant changes in gene expression (Amendola and van Steensel, 2015; Solovei et al., 2013). Moreover, at least 10% of genes in LADs retain their expression, and many such “escaper” genes are locally detached from the lamina and depleted of H3K9me2/3 signal, suggesting that mechanisms also exist for transcription to overcome the conventionally repressive environment at the nuclear periphery (Guelen et al., 2008; Leemans et al., 2019). Together, these studies suggest that tethering to the nuclear lamina can be causally linked to gene expression silencing but that there are escaper genes which are exceptions to this general rule.
An emerging diversity of subnuclear bodies exhibits distinct heterochromatin and genome-function signatures
In addition to compartments and LADs, chromatin loci can spatially co-localize with proteinaceous sub-nuclear bodies including nuclear pores, nuclear speckles, stress granules, and nucleoli-associated proteins (Figure 2B). For example, nucleolus-associated domains (NADs) were discovered by high-throughput sequencing of genomic DNA associated with purified nucleoli (Németh et al., 2010; van Koningsbruggen et al., 2010). NADs can encompass as much as 4% of the human genome and are enriched for repetitive ribosomal RNA genes, pericentromeric and centromeric satellite repeats, and specific telomere arms (Németh et al., 2010; Su et al., 2020). NADs can overlay with the heterochromatin marks of H3K9me3 (Type I) or H3K27me3 (Type II) (Bersaglieri et al., 2022; Vertii et al., 2019). Type I NADs have been hypothesized to facilitate repression and stabilization of repetitive DNA elements, whereas Type II NADs have been implicated in developmentally regulated gene expression (Vertii et al., 2019). The nucleolus might also exhibit the biophysical properties of liquid-like droplets (Frottin et al., 2019; Lafontaine et al., 2021), consistent with the possibility that repetitive elements may associate in the nucleolus via phase separation mechanisms. The roles and mechanisms by which nucleolar-associated DNA can be heterochromatinized and silenced remain an exciting area for future investigation.
Recently, the development of new genomics and computational technologies brought to light that repressive sub-compartments could be further refined beyond the foundational B1, B2, and B3 convention (Liu et al., 2021; Quinodoz et al., 2018; Wang et al., 2021). Tyramide signal amplification sequencing (TSA-seq) directly measures the cytological distance of chromatin fragments to nuclear bodies such as speckles or stress granules (Chen et al., 2018). The spatial position inference of the nuclear genome (SPIN) computational method integrates data from TSA-seq for speckles and the nuclear lamina, Hi-C, and DamID for the lamina and nucleolus and has successfully identified at least 4 active and 6 repressive subtypes of nuclear bodies (Wang et al., 2021) (Figure 2B). These observations open up new questions as to whether and how specific subnuclear bodies functionally contribute to or might be shaped by gene expression and heterochromatinization. The coupling of TSA-seq with synthetic biology technologies such as CRISPR-genome organization (CRISPR-GO) (Wang et al., 2018b) or light-activated-dynamic-looping (LADL) (Kim et al., 2019) to tether specific genomic segments to the nuclear periphery and subnuclear bodies will be critical for addressing these underexplored concepts. As genomics and computational technologies continue to advance, the granularity with which repressive sub-compartments are classified will improve, allowing mechanistic and functional perturbative studies to precisely dissect the genome’s structure-function relationship.
TADs and subTADs formed via loop extrusion antagonize B compartments and heterochromatin
Within larger A/B compartments and subnuclear bodies, the mammalian genome folds into TADs and subTADs (Dixon et al., 2012; Nora et al., 2012; Phillips-Cremins et al., 2013; Sexton et al., 2012). As defined algorithmically in low-resolution Hi-C data, TADs are Mb-scale domains in which genomic fragments have a higher interaction frequency with themselves compared to loci outside of the domains (Figure 2C). TADs and their nested subMb-sized counterparts subTADs (Figure 2D) are demarcated by boundaries, which are molecularly distinct regions of the genome refractory to permitting chromatin interactions between upstream and downstream genomic loci (Beagan and Phillips-Cremins, 2020; Norton and Phillips-Cremins, 2017; Phillips-Cremins, 2014) (orange segments, Figures 2C and 2D). Although TADs were initially reported as conserved across cell types, evidence is mounting that TAD and subTAD boundaries can vary significantly in strength across cell types and in disease (Beagan et al., 2016, 2017, 2020; Norton et al., 2018; Sun et al., 2018).
Over the last decade, major progress has been made toward understanding the molecular mechanisms that regulate TAD/subTAD formation (reviewed in detail in Beagan and Phillips-Cremins [2020]). Many TADs and subTADs are formed by the cohesin-based mechanism of loop extrusion in which the subunits of the structural maintenance of chromosomes (SMC) proteins form a ring-like structure which progressively moves in an ATP-dependent manner along the genome and reels flanking loci through the inner diameter of its ring (Figure 2E) (Banigan and Mirny, 2020; Davidson and Peters, 2021). Long-range looping interactions form when the cohesin complex stalls at TAD/subTAD boundaries bound by the architectural protein CCCTC-binding factor (CTCF) (Figure 2E) (Davidson et al., 2016; Fudenberg et al., 2016; Sanborn et al., 2015). Extrusion-based long-range looping interactions manifest structurally in ensemble Hi-C maps as dots, and, therefore, TADs/subTADs formed by extrusion exhibit corner dots (Figure 2F). Due to the transient nature of loop extrusion, a corner dot feature is not always readily detectable in TADs formed by extrusion at a population level (Beagan and Phillips-Cremins, 2020; Emerson et al., 2022; Rao et al., 2017). The majority of dot-like structure-indicative loops are lost upon short-term knockdown of CTCF and cohesin protein levels, thus reinforcing that these factors are essential for loop establishment and maintenance (Nora et al., 2017; Rao et al., 2017; Schwarzer et al., 2017). Hereinafter, we refer to TADs/subTADs as all domain-like structures in Hi-C and single-cell imaging data which form mechanistically via loop-extrusion mechanisms, but we note that CTCF/cohesin-independent mechanisms of TAD, subTAD, and loop formation are an active area of intense investigation.
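The algorithmic definition of a TAD given in this section (bins inside a domain contact each other more often than they contact bins outside it) can be illustrated with a toy insulation-style score. This is a minimal sketch under stated assumptions, not the method of any study cited here: the function names, the window size, and the simulated 8-bin contact matrix are all hypothetical, and a real Hi-C analysis would first normalize the matrix and use much larger windows.

```python
def insulation_scores(matrix, window):
    """Mean contact frequency crossing each candidate boundary.

    For the boundary between bin i-1 and bin i, average the contacts
    between the `window` bins upstream and the `window` bins downstream.
    A low score means few contacts cross that position.
    """
    n = len(matrix)
    scores = {}
    for i in range(window, n - window + 1):
        total = 0.0
        for a in range(i - window, i):        # upstream bins
            for b in range(i, i + window):    # downstream bins
                total += matrix[a][b]
        scores[i] = total / (window * window)
    return scores


def call_boundaries(scores):
    """Return positions whose score is a strict local minimum."""
    return [
        i for i in sorted(scores)
        if i - 1 in scores and i + 1 in scores
        and scores[i] < scores[i - 1] and scores[i] < scores[i + 1]
    ]


def toy_matrix(domain_sizes, inside=10.0, outside=1.0):
    """Hypothetical contact matrix: high counts within a domain, low between."""
    labels = [d for d, size in enumerate(domain_sizes) for _ in range(size)]
    return [[inside if la == lb else outside for lb in labels] for la in labels]


contacts = toy_matrix([4, 4])          # two simulated domains of 4 bins each
scores = insulation_scores(contacts, window=2)
boundaries = call_boundaries(scores)
print(scores)
print(boundaries)
```

On this toy matrix the score dips at position 4, the junction between the two simulated domains, which is the behavior expected of a boundary that is refractory to contacts between upstream and downstream loci.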
Figure 3. Types and distribution of repetitive DNA elements across the human genome
(A) The human genome consists of 54% repetitive sequences. Class I repeats comprise retrotransposable elements, including long interspersed nuclear el-
ements (LINEs, grey), short interspersed nuclear elements (SINEs, dark orange), and long terminal repeat (LTR) endogenous retroviruses (ERVs, light orange). The
human genome has primarily one abundant LINE, LINE-1, that is 6 kb long and encodes its own requisite reverse transcription and transposition machinery. The
most common class of SINE elements in the human genome are the 280-bp-long Alu-SINEs which are non-autonomous and rely on LINE-1 encoded machinery
to facilitate their transposition. ERV remnants are present in the human genome in fragments (HERVs) and in some cases can be actively transcribed. Tandem
repeats are non-mobile DNA sequences in which the copies of one or more nucleotides are repeated in a head-to-tail manner. They are distributed throughout the
genome and are most prevalent as satellite repeats at pericentromeric and centromeric chromatin regions.
(B) The recent gapless telomere-to-telomere (T2T) human genome assembly consists of assemblies for all 22 human autosomes and chromosome X and
comprises 3.05 gigabase pairs (Gbp) of nuclear DNA (CHM13v1) (adapted from Nurk et al. [2022]). Bar plot, recent T2T estimates of the percentage of the human
genome made up of each repeat class.
TSD, target site duplication; ORF, open reading frame; EN, endonuclease; RT, reverse transcriptase; A and B, split Pol III RNA promoter; LTR, long terminal
repeats; Gag, group antigens; Prt, protease; Pol2, reverse transcriptase; Env, envelope protein.
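The head-to-tail definition of tandem repeats in the Figure 3 legend can be made concrete with a short sketch that scans a sequence for uninterrupted runs of a motif. The function name, the minimum copy number, and the toy sequence are hypothetical choices for illustration; real repeat annotation relies on dedicated tools and tolerates imperfect copies, which this sketch does not.

```python
import re


def find_tandem_repeats(sequence, motif, min_copies=3):
    """Return (start, copies) for each perfect head-to-tail run of `motif`.

    Only exact, uninterrupted copies are counted; `start` is 0-based.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_copies))
    hits = []
    for match in pattern.finditer(sequence):
        copies = (match.end() - match.start()) // len(motif)
        hits.append((match.start(), copies))
    return hits


# Toy sequence: one run of five CGG copies and one run of two copies.
toy = "ATG" + "CGG" * 5 + "TTA" + "CGG" * 2 + "AC"
repeats = find_tandem_repeats(toy, "CGG", min_copies=3)
print(repeats)
```

Only the five-copy run is reported, because the second run falls below the chosen threshold of three copies.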
It is established that the loss of cohesin-based loop extrusion leads to gained compartments in the mammalian genome (Nora et al., 2017; Rao et al., 2017; Schwarzer et al., 2017). Knockdown of subunits of both the Mediator and cohesin complexes leads to lost loops, increased genome compartmentalization, and increases in heterochromatin domain formation (Figure 2G) (Haarhuis et al., 2022). Knockdown of the Setdb1 H3K9 HMT in mammalian neurons results in lost H3K9me3 and gained ectopic binding of CTCF (Jiang et al., 2017). Moreover, stabilization of chromatin loops by knockdown of the cohesin-unloading factor wings apart-like protein homolog (WAPL) can result in ablation of heterochromatin domains and a decrease in compartmentalization (Figure 2G) (Haarhuis et al., 2017, 2022). Altogether, these observations raise the possibility that stabilization of chromatin loops can counteract the formation of B compartments and suggest that mechanisms driving the loss or gain of chromatin loops could actively shape heterochromatin.
LINEAR HETEROCHROMATINIZATION OF THE REPETITIVE GENOME
Repeat composition of the complete gapless human genome assembly
The recent T2T-CHM13 human genome assembly released by the Telomere-to-Telomere (T2T) consortium offers the most accurate estimation to date regarding the composition of the repetitive genome (Figure 3). Repetitive sequences comprise 54% of the T2T-CHM13v1 assembly (Figures 3A and 3B) (Hoyt et al., 2022; Nurk et al., 2022). Transposable elements (TEs) are mobile DNA sequences capable of integrating into new genomic loci
and are the largest class of repetitive DNA elements. Class I TEs, or retrotransposable elements (RTEs) including long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs), and long terminal repeat (LTR) endogenous retroviruses (ERVs), are dominant in the CHM13v1 assembly due to their “copy-and-paste” mechanism where an RNA intermediate is reverse transcribed to cDNA prior to its integration into a new genomic locus (Chuong et al., 2017). By contrast, class II TEs, or DNA transposons, leverage a “cut-and-paste” mechanism where terminal inverted repeats are bound by the self-encoded transposase, facilitating TE excision and reintegration into a new genomic locus (Chuong et al., 2017). Non-mobile repetitive DNA sequences, including pericentric and centromeric satellite repeats, short tandem repeats (STRs), and variable number of tandem repeats (VNTRs), also occur at high frequency in the human genome. We anticipate that as our knowledge of the complexity and diversity of repetitive sequences among individuals increases, it will transform our understanding of the functions and roles for the repetitive genome in phenotypic variation in the coming years.
Repetitive elements threaten genome stability due to their susceptibility to stepwise expansions, duplications, inversions, and recombination events (Lupski and Stankiewicz, 2005). A principal role for H3K9me3 is the repression of repetitive DNA sequences and mobile genetic elements. Such mechanisms are a double-edged sword, however, because some repetitive elements have evolved to regulate their host’s genomes (reviewed in detail in Hermant and Torres-Padilla [2021] and Sundaram and Wysocka [2020]). Here, we discuss the distinguishing features of satellite repeats, STRs and VNTRs, and RTEs, as well as the linear H3K9-related heterochromatic mechanisms regulating their accessibility and expression.
Pericentromeric satellite repeat tracts: Crosstalk between H3K9me3 and DNA methylation facilitates silencing
In both humans and mice, pericentromeric and centromeric DNA is enriched for repetitive satellite repeat sequences (reviewed in Thakur et al. [2021]). Human centromeric DNA is enriched for repetitive α-satellite repeat tracts consisting of 171-bp monomer subunits. In their simplest form, α-satellite repeat units are dimers, but they can also be found in greater repeating arrays known as higher-order repeats. Mouse centromeric DNA contains primarily minor satellite (MiSat) tracts consisting of repeating 120-bp subunits. In human pericentromeres, there are 6 unique classes of satellite sequences: satellites I (17- and 25-bp), satellites II (10- to 80-bp), satellites III (5- and 10-bp), α-satellites (171-bp), β-satellites (68-bp), and γ-satellites (220-bp). Mouse pericentromeres are predominantly populated by major satellite (MajSat) tracts comprised of a repeating 234-bp subunit. As long-read sequencing and alignment technologies continue to improve, the further characterization of satellite repeat tract diversity and structure represents an exciting area for future scientific exploration.
Although centromeric repeats are actively transcribed and generally remain accessible to facilitate kinetochore assembly (McKinley and Cheeseman, 2016), the satellite repeats within pericentromeres are constitutively heterochromatinized by H3K9me2/3 and 5-methylcytosine (5mC) DNA methylation (reviewed in Thakur et al. [2021]). In mammals, H3K9me3 is deposited at pericentromeres by SUV39H1/2 HMTs (Peters et al., 2001; Maison et al., 2011; Rice et al., 2003) and can be recognized by the chromodomain of HP1α (Bannister et al., 2001; Lachner et al., 2001; Nakayama et al., 2001) (Figure 4A). HP1α can co-immunoprecipitate with the H3K9me3 HMTs SUV39H1/2 and SETDB1, as well as the DNA methyltransferase DNMT3b in mammalian cell lines (Lehnertz et al., 2003; Maeda and Tachibana, 2022; Smallwood et al., 2007) (Figure 4A). Double knockout of SUV39H1 and SUV39H2 in HeLa cells results in the depletion of H3K9me3 and HP1α on chromatin (Johnson et al., 2017). Once localized to pericentromeres, HP1α functions as an adaptor protein to facilitate heterochromatin spreading by acting as a scaffold for the recruitment of additional heterochromatin effectors (Lechner et al., 2005; Smothers and Henikoff, 2000). The mechanisms by which HMTs, DNMTs, HP1α, and other chromatin effectors interplay to heterochromatinize mammalian pericentromeres remain an exciting open question.
It is noteworthy that an early study knocking out the mouse enzymes for DNA methylation maintenance (i.e., Dnmt1) or de novo deposition (i.e., Dnmt3a/Dnmt3b double knockout) did not show decreased H3K9me3 at pericentromeres (Lehnertz et al., 2003). By contrast, knockdown of both Suv39h1 and Suv39h2 reduces pericentromeric DNA methylation and de-represses MajSat repeat expression (Lehnertz et al., 2003). Although still under active investigation, this early work suggests that DNA methylation at pericentromeres may occur downstream of H3K9me3. Together, these data suggest a working model of heterochromatin at pericentromeres similar to that of the non-repetitive genome. In this working model, HP1α associates with H3K9me3 and stabilizes, either directly or indirectly, local H3K9 HMT and DNMT occupancy, resulting in a self-reinforcing cycle of H3K9me3 and DNA methylation (Figure 4A).
It is worth pointing out that crosstalk between H3K9me2/3 and 5mC DNA methylation might also occur at pericentromeres independently from HP1α during S phase of the cell cycle via the recruitment of ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1). UHRF1 (previously known as NP95 in mice and ICBP90 in humans) co-localizes with 4′,6-diamidino-2-phenylindole (DAPI)-dense pericentromeric foci (Fang et al., 2016) and recognizes both H3K9me3 and hemi-5mC DNA generated during replication (Rothbart et al., 2012). Heterochromatin-bound UHRF1 can recruit DNMT1 to facilitate local DNA methylation through two distinct molecular models (Figure 4B). In one leading model, hemi-5mC DNA provides UHRF1 binding specificity, and H3K9me2/3 improves binding affinity (Rothbart et al., 2012). Upon stable binding to existing H3K9me2/3 and hemi-5mC DNA, UHRF1 can recruit DNMT1 via a direct interaction to further mediate DNA methylation maintenance (Achour et al., 2008) (Model A, Figure 4B). Alternatively, it has also been suggested that UHRF1 can ubiquitinate local H3 residues which may then directly recruit DNMT1 to guide DNA methylation maintenance (Model B, Figure 4B) (Ishiyama et al., 2017). In addition to DNMT1, the HMT SUV39H1 can directly associate with UHRF1 in human cells, and UHRF1 knockdown can reduce both SUV39H1 and
H3K9me3 signal at specific target genes (Babbio et al., 2012). Although the functional role for SUV39H1 recruitment by UHRF1 specifically at pericentromeres requires further exploration, these data raise a potential working model for HP1α-independent maintenance of pericentromeric heterochromatin during S phase via UHRF1.

Noncoding RNA-mediated H3K9me3 deposition at pericentromeric repeats
Although pericentromeric heterochromatin is targeted for constitutive repression, it is not transcriptionally inert. Some of its derived transcripts, such as MajSat non-coding RNA (ncRNA), can remain largely chromatin associated through RNA:DNA hybrids (Velazquez Camacho et al., 2017). Biochemical studies suggest that the N-terminal basic domain of mouse SUV39H2 contains a single-stranded RNA (ssRNA) recognition module that selectively binds MajSat ncRNA. N-terminal truncation of the ssRNA recognition domain reduces pericentromeric repeat association with SUV39H2 and H3K9me3, suggesting a critical role for MajSat ncRNA in pericentromere heterochromatinization (Velazquez Camacho et al., 2017). Biochemical studies have also shown that human SUV39H1 can interact with α-satellite ncRNA in vitro (Johnson et al., 2017). In contrast to mice, human SUV39H1 binds ssRNA through its chromodomain, as it does not possess an N-terminal RNA binding domain (Johnson et al., 2017). Non-specific RNase A digestion or mutation of the SUV39H1 ssRNA binding domain reduced SUV39H1 enrichment at pericentromeres in HeLa cells, suggesting these ncRNAs may also be critical for heterochromatinization in humans (Figure 4C). Together, these findings support the existence of satellite repeat ncRNA-facilitated heterochromatinization of pericentromeres. An exciting avenue for future inquiry is to dissect and further understand the seemingly discordant relationship between satellite repeat transcription and heterochromatin acquisition and maintenance at mammalian pericentromeres.

Local heterochromatin silencing of genes containing unstable STR tracts in a subset of repeat expansion disorders
Over 1 million STR tracts, also known as microsatellites, are distributed across the human genome in introns, exons, and intergenic regions (Hannan, 2018). STRs consist of 1–6 bp motifs arranged in tandem (Hannan, 2018). Recent genome-wide studies have revealed that thousands of STRs are polymorphic across individuals and serve as key regulators of gene expression (Gymrek et al., 2016). STR instability occurs at significantly higher rates than single nucleotide mutations, and small changes in tract lengths contribute to the diversity of healthy human phenotypic traits (Ellegren, 2000). Thus, STRs are abundant in the human genome and play important functional roles in development and phenotypic diversity.
By a process that is poorly understood, a small number of specific normal-length STRs undergo somatic or germline expansion and transition to long-normal, intermediate, pre-mutation, and mutation STR tract lengths. Unstable expansion of STRs was first reported simultaneously by four independent groups and serves as the mechanistic basis for a growing list of more than 40 inherited repeat expansion disorders, including fragile X syndrome (FXS), Huntington’s disease, amyotrophic lateral sclerosis, and Friedreich’s ataxia (La Spada et al., 1991; Oberlé et al., 1991; Verkerk et al., 1991; Yu et al., 1991). Patients with repeat expansion diseases typically suffer from a complex set of neuropsychiatric symptoms such as anxiety, hyperactivity, low IQ, social deficits, and seizures (Orr and Zoghbi, 2007). Understanding the mechanisms governing STR instability is of immense importance for human health and also an unsolved pathophysiological question.
Emerging evidence suggests that heterochromatin acquisition at unstable STRs might stabilize disease-related expansion events. In the prototypic example of FXS, a CGG STR tract in the 5′ UTR of FMR1 expands to a mutation length of 200 triplets or more to cause disease (Oberlé et al., 1991; Verkerk et al., 1991; Yu et al., 1991). STR expansion from pre-mutation to mutation length causes transcriptional inhibition of FMR1 and a consequent severe reduction in levels of the Fragile X Mental Retardation Protein (FMRP) it encodes. Upon STR expansion above 200 CGGs to full-mutation length, classic models assert that the unstable STR tract undergoes local DNA hypermethylation and heterochromatinization by H3K9me3. Heterochromatinization is thought to stabilize the STR tract, but it also spreads upstream to the FMR1 promoter to cause transcriptional silencing (Oberlé et al., 1991; Sutcliffe et al., 1992; Kumari and Usdin, 2010). The role for heterochromatin in stabilizing disease-expanding STRs and the downstream consequences in disease is an exciting area for future inquiry, with crosscutting impact across multiple repeat expansion disorders already linked to DNA methylation and H3K9me3 (reviewed in Dion and Wilson [2009]).
Finally, VNTRs, also known as minisatellites, are distinguished from STRs as their repeating monomer units are greater than six bps in length (Course et al., 2021). Although more than 55,000 VNTRs are present in the human genome, relatively little is known regarding their mechanisms of heterochromatinization and variation across patient cohorts. Limited knowledge of VNTRs is largely a consequence of their greater size relative to STRs, which has prevented their accurate mapping to human reference genomes and repeat length quantification due to limitations associated with short-read sequencing. The recent advent of long-read sequencing may address these limitations, as evidenced by the identification of VNTRs in the long-read generated T2T CHM13 genome (Hoyt et al., 2022). Long-read sequencing has recently shown promise in identifying disease-associated VNTRs (daVNTRs), such as the amyotrophic lateral sclerosis (ALS)-associated WDR7 daVNTR (Course et al., 2020).

A hexanucleotide STR tract facilitates the association between heterochromatin effectors and shelterin to maintain telomere length
Human telomeres contain an internal hexanucleotide 5′-TTAGGG-3′ STR tract that recruits the protein complex shelterin as a critical mechanism supporting telomere stability (de Lange, 2018). Telomeres are exceptionally unstable as their unprotected ends can stimulate DNA damage repair pathways and undergo progressive shortening with each cell division in non-cancerous, somatic cells (Shay and Wright, 2019). To protect telomere ends from initiating DNA damage response pathways, shelterin facilitates telomere loop (t-loop) formation in which the single-stranded 3′ DNA overhang at telomere ends invades the double-stranded telomere DNA to create a protective loop structure (de Lange, 2018). Shelterin also recruits telomerase in germline and embryonic tissues to maintain telomere length (Hockemeyer and Collins, 2015). Together, these studies highlight the
functional role for telomere hexanucleotide STR tracts in promoting telomere stability via the recruitment of the shelterin protein complex.
Telomeres and heterochromatin were first linked upon discovery of the telomere position effect (TPE), a phenomenon conserved from yeast to humans where reporter genes incorporated near telomeres become silenced due to the spread of nearby heterochromatin (Blasco, 2007). Follow-up studies have begun deciphering the specific mechanisms supporting heterochromatin acquisition, maintenance, and function at telomeres. For example, in cells derived from a Suv39h1 and Suv39h2 double-knockout mouse, telomeres are depleted for H3K9me2/3 and HP1 isoforms concomitant with aberrant telomere lengthening (García-Cao et al., 2004). Additionally, the long ncRNA termed “telomeric repeat-containing RNA” (TERRA) has been reported to promote telomere heterochromatinization by recruiting HP1α to shelterin in human colorectal cancer HCT116 cells (Figure 4D) (Deng et al., 2009). TERRA depletion leads to an increase in telomere-free ends, telomere doublets, and chromatid duplications, thus supporting a role for telomere heterochromatinization in promoting telomere stability (Deng et al., 2009). It is important to acknowledge that these studies, while exciting and thought provoking, still represent an area of active investigation, and conflicting observations have been reported (Barral and Déjardin, 2020). Future research into the interplay between telomeric heterochromatinization, transcription, and stability is needed to dissect the TERRA-dependent and -independent mechanisms of heterochromatinization of telomeres.

Long terminal repeat retrotransposons: SETDB1-dependent H3K9 methylation is essential for silencing of ERVs
One of three major subclasses of class I TEs (termed “RTEs”) is the long terminal repeat (LTR) ERVs, which comprise 8–9% of the human genome (Hoyt et al., 2022). ERVs were originally integrated into the germline via an exogenous provirus infection, after which they utilized a virus-like mechanism leveraging self-encoded Gag, Pol, and Env proteins driven by an internal LTR Pol II promoter to infect a new host (Chuong et al., 2017). This once primary role of ERVs has been lost due in part to Env truncations. Of the thirty or more ERV subfamilies known to be present within the human genome (HERVs), none are thought to contain all the necessary components to support their mobilization (Goodier, 2016).
ERVs are targeted by distinct molecular machinery to facilitate their heterochromatinization and repression. In early embryonic development, when global DNA demethylation promotes a transient burst in ERV expression (Fu et al., 2019; Reik and Surani, 2015), ERV silencing is primarily mediated by Krüppel-associated box-domain-containing zinc-finger proteins (KRAB-ZFPs). Mammalian KRAB-ZFPs bind to motifs internal to the ERV DNA sequence via their C2H2 zinc fingers (Figure 4E) (Geis and Goff, 2020). The KRAB domain facilitates direct interactions with KRAB-associated protein 1 (KAP1/TRIM28), which in turn recruits heterochromatin effectors including HP1α and the H3K9 HMT SETDB1 (Geis and Goff, 2020; Schultz et al., 2002) (Figure 4E). Thus, ERVs appear to be targeted by SETDB1 rather than SUV39H1 or SUV39H2 as the predominant HMT mediating their heterochromatinization.
The necessity for SETDB1-mediated ERV repression has been illustrated using primordial germ cells from Setdb1 conditional knockout mice. In this model system, primordial germ cells exhibit an ectopic increase in ERV expression as well as severe postnatal defects, suggesting that SETDB1-mediated ERV silencing is critical for proper germ cell development (Liu et al., 2014). The requirement of SETDB1-mediated KRAB-ZFP silencing of ERVs has also been demonstrated in somatic mouse cells and tissues, supporting a pervasive role for SETDB1-dependent heterochromatin in maintaining ERV repression beyond development (Ecco et al., 2016).

Non-LTR retrotransposons: DNA methylation and H3K9me3 cooperate to repress LINE-1 expression in early development
Non-LTR RTEs are the most prevalent repetitive DNA sequences present within the human genome (Hoyt et al., 2022) and primarily include LINEs and SINEs (Figure 3). Within the autonomous LINE subdivision of RTEs, only LINE-1s have retained their genomic mobility, with 100 active LINE-1s thought to currently reside in an individual human’s genome (Brouha et al., 2003). LINE-1 retrotransposition is facilitated by an internal 5′ UTR Pol II promoter that transcribes two open reading frames (i.e., ORF1 and ORF2) ending with a 3′ UTR polyadenylation signal. ORF1 encodes a protein possessing RNA binding and nucleic acid chaperone abilities, while ORF2 encodes a protein with endonuclease and reverse transcriptase properties that are fundamental to the RTE copy-and-paste mechanism (Beck et al., 2011).
Roughly 100 cases of sporadic human disease have been attributed to LINE-1 or SINE retrotransposition into gene bodies (Hancks and Kazazian, 2016). LINE-1 exon insertions, such as those first detected in two hemophilia A patients as early as 1988 (Kazazian et al., 1988), represent the most direct mechanism for disrupting protein function by altering the coding sequence of the target gene. Intron integration of non-LTR RTEs can also contribute to human disease by altering RNA splicing (Burns and Boeke, 2012; Payer et al., 2019), highlighting the diverse mechanisms by which gene body retrotransposition events can impair a downstream protein’s function.
LINE-1 elements are expressed shortly after human embryo fertilization and then are silenced by the 16-cell stage or after. Recent reports have linked LINE-1 expression to the establishment of chromatin accessibility during early embryogenesis (Guo et al., 2014; Jachowicz et al., 2017). LINE-1 expression coincides, but does not directly overlap, with the wave of global DNA demethylation which occurs in pre-implantation embryos (Seisenberger et al., 2012), suggesting that DNA methylation is a critical regulator of LINE-1 expression during early development. H3K9me3-mediated silencing is also crucial in the repression of LINE-1s, as gain of H3K9me3 occurs during the 16-cell stage in early embryogenesis in parallel with gained DNA methylation (Wang et al., 2018a). Together, these studies suggest that both H3K9me3 and DNA methylation regulate the heterochromatinization and silencing of LINE-1 elements.
The mechanisms governing LINE-1 heterochromatinization in mammalian systems are an active area of investigation. SETDB1 facilitates H3K9me2/3 deposition at LINE-1 elements in both early development and somatic cells via its association with the human silencing hub (HUSH) complex (Figure 4F) (Tchasovnikarova et al., 2015; Tunbak et al., 2020). The HUSH complex localizes to intronless DNA, a characteristic feature of evolutionarily young LINE-1s, where it interacts with actively transcribed LINE-1 RNA to promote H3K9me3 deposition (Figure 4F) (Seczynska et al., 2022). HUSH produces prematurely terminated transcripts which may then be locally degraded via the nuclear exosome targeting (NEXT) complex (Figure 4F) (Garland et al., 2022). Together, these studies highlight that DNA methylation as well as RNA-mediated deposition of H3K9me3 via the HUSH complex can silence LINE-1 elements in mammalian development and somatic cells.

Non-LTR retrotransposons: SUV39H1 represses SINE transcription via local H3K9me3
In contrast to LINEs, SINE RTEs are not known to encode proteins and thus rely on LINE-1 encoded proteins to facilitate their non-autonomous retrotransposition. In total, there are 850k full-length SINEs across three primary lineages (i.e., AluJ, AluS, and AluY) within the human genome. Although all full-length AluJ elements (i.e., 160k) are believed to be non-functional, a significant portion of full-length AluS (i.e., 550k) and nearly all full-length AluY (i.e., 130k) elements are suggested to have retained their retrotransposition abilities based on in vitro mobilization assays (Bennett et al., 2008). Thus, along with LINE-1s, SINEs pose a significant threat to genome stability.
SINE DNA is enriched for 5mC DNA methylation (Arand et al., 2012). However, multiple studies argue against DNA methylation as sufficient to repress SINE expression. Both 5-Azacytidine, a DNMT small-molecule inhibitor, and DNMT1 repression are unable to induce SINE transcriptional activation (Varshney et al., 2015). The heterochromatin effectors HP1α and SUV39H1 as well as H3K9me3 are enriched at SINEs even in the absence of DNA methylation (Varshney et al., 2015). Small molecule inhibition of SUV39H1 results in significant decreases in H3K9me3 abundance, along with increased RNA polymerase III occupancy leading to elevated SINE transcription (Varshney et al., 2015). These data suggest that SUV39H1-mediated H3K9me3 can at least in part govern SINE transcription repression (Figure 4G).

INTERPLAY AMONG 3D GENOME FOLDING, HETEROCHROMATIN, AND REPETITIVE DNA ELEMENTS

While the literature provides support for a link between heterochromatin and higher-order chromatin organization, and a role for heterochromatin in silencing the repetitive genome, the higher-order folding principles of the repetitive genome have only recently begun to be explored. We highlight some early case studies exploring the relationship among heterochromatin, higher-order chromatin folding patterns, and the repetitive genome.

Spatial positioning of genomic loci with double strand breaks outside of chromocenters promotes high-fidelity repair of pericentromeric repeats in mice
A notable example of how the higher-order folding of heterochromatin interplays with the repetitive genome comes from studies exploring the activation of DNA repair pathways within clustered pericentromeric repeats. The high concentration of repetitive satellite DNA sequences in pericentromeric chromatin poses a significant challenge for double strand break (DSB) repair by homologous recombination (HR). Although HR is typically an exceptionally high-fidelity DSB repair pathway (Verma and Greenberg, 2021), the presence of repetitive sequences increases the possibility for improper recombination in cis (i.e., between similar satellites present at distinct genomic loci on homologous chromosomes) or in trans (i.e., between similar satellites present on non-homologous chromosomes). In mouse model systems, the heterochromatinization and condensation of pericentromeric satellite repeat clusters into chromocenters can establish a barrier to prevent improper access of HR machinery (Mitrentsi et al., 2022). It has been suggested that this mechanism likely works cooperatively with the relocation of pericentromeric DSBs towards the periphery to improve the fidelity of HR DSB repair (Caridi et al., 2018; Chiolo et al., 2011; Tsouroula et al., 2016). Notably, in NIH 3T3 mouse fibroblasts, enrichment of H3K9me3 at CRISPR-induced pericentromeric DSBs remains unchanged following their relocation to the periphery, which may provide continued additional protection against nonspecific HR (Tsouroula et al., 2016). It is interesting that these mechanisms have been shown to be absent in human cell model systems (Mitrentsi et al., 2022), which is somewhat unexpected given the conservation of spatial mechanisms governing pericentromeric DSB repair from yeast to mice. Therefore, future investigations into how both spatial genome organization and heterochromatinization influence the fidelity of pericentromeric DSB repair in humans will be critical for addressing this intriguing discrepancy.

STR expansion disrupts TAD boundaries in fragile X syndrome
Although repeat expansion disorders share the commonality of unstable STR expansion, the repeat tracts themselves are quite diverse in their specific properties across diseases. For example, the repeat unit sequence, as well as the repeat unit length, can be highly variable. Each disorder has a different premutation and mutation threshold, and the tracts are evenly distributed across the genome in exons and introns and the 5′ UTR, 3′ UTR, and intergenic regions. We recently set out to ascertain if chromatin folding patterns could shed light on why some regions of the genome grow unstable in disease, whereas hundreds of thousands of sequence- and transcription-matched STR tracts remain stable. Using Hi-C maps across multiple cell types, Sun et al. reported that nearly all STRs which are known to grow unstable in disease co-localize with the boundaries of TADs/subTADs (Sun et al., 2018) (Figure 5A). On the basis of this observation, it was hypothesized that a subset of genetically encoded boundaries would serve as hotspots in the human genome for vulnerability to instability events. Using fibroblasts, B cells, and post-mortem
Figure 5. STR expansion and HERV-H transcription influence TAD boundary integrity
(A) A CGG trinucleotide short tandem repeat (STR) tract residing in the 5′ UTR of FMR1 is associated with fragile X syndrome.
(B) A study by Sun et al. (2018) reports that in healthy individuals, the CGG STR (i.e., 6–40 repeating monomer subunits) is present at a TAD boundary, where FMR1 is actively transcribed.
(C) As the CGG STR expands to full mutation length of 200 triplets or more, CTCF is displaced and local TAD/subTAD boundaries within 1 Mb around FMR1 are
disrupted.
(D) A study by Zhang et al. (2019) shows that transcription of HERV-H by the RNA Polymerase II complex presents a physical barrier that interferes with cohesin-
mediated loop extrusion, leading to the creation of a local CTCF-independent TAD boundary.
(E) De novo HERV-H insertion is sufficient for the formation of a new TAD boundary.
(F) Transcriptional repression of an endogenous HERV-H locus can dissolve a pre-existing TAD boundary.
brain tissue from male FXS patients compared to healthy individuals with normal-length repeats, Sun et al. observed severe boundary disruption around the FMR1 gene upon long mutation-length CGG expansion (Sun et al., 2018) (Figures 5B and 5C). Transcriptional silencing of FMR1 correlated with severe boundary disruption, suggesting a link between structural boundary integrity and proper gene expression. Together, these data suggest that STR instability events may be linked to the molecular and structural features at a subset of TAD boundaries. Although mutation-length STR expansion events coincide with boundary disruption, it is not yet known whether instability drives genome misfolding or vice versa. A recent bioRxiv preprint reports the deposition of megabase-scale H3K9me3 domains along with widespread genome misfolding and trans interactions between autosomes and the X chromosome in FXS (Zhou et al., 2021), suggesting an interplay between H3K9me3, genome folding, and repeat instability.

Transcription of HERV-H is sufficient to create a TAD/subTAD boundary
Transcriptionally active RTEs can actively shape chromatin architecture and genome function independent of whether they
mobilize or destroy architectural protein motifs (Zhang et al., 2019). Human ES cells are hypomethylated with respect to differentiated cell lines, which can lead to the de-repression of LTR and non-LTR RTEs. In human embryonic stem cells, the RTE HERV-H (Figure 5D) is transcribed, and its transcripts account for 2% of the total poly(A) RNA pool (Santoni et al., 2012). Random integration of an active HERV-H element into a new genomic locus is sufficient to form a de novo TAD boundary in a transcription-dependent manner (Figure 5E). Moreover, repression of HERV-H expression via differentiation of human ES cells to cardiomyocytes or dCas9-KRAB-mediated silencing attenuates the ability of the HERV-H sequence to serve as a boundary element (Figure 5F). In the development of the mouse 2-cell stage embryo, transcription of murine endogenous retrovirus element (MERV) family elements is also sufficient to form TAD boundaries (Kruse et al., 2019). Together, these studies highlight that specific subclasses of ERV repeat elements can form TAD/subTAD boundaries in a transcription-dependent manner.

SINEs and LINEs may shape human genome folding via redistribution of CTCF binding sites
Finally, it has been hypothesized that SINEs and LINEs shape the higher-order folding of the genome by harboring binding sites for the architectural protein CTCF within their own sequences (Choudhary et al., 2022; Diehl et al., 2020). Activation of retroelements during evolution can lead to species-specific expansion of CTCF binding sites. Over 95% of mammalian CTCF sites are derived from SINEs, LINEs, and LTRs (Choudhary et al., 2020), and SINEs in mice are enriched for CTCF-bound motifs (Schmidt et al., 2012). It is possible that the mobilization of SINE sequences during evolution could significantly alter genome folding patterns in a CTCF-dependent manner. To minimize this risk, the CTCF motifs in SINEs are targeted by the ChAHP complex, which competes with CTCF for motif binding to prevent novel SINE B2 integrations from disrupting the 3D genome (Kaaij et al., 2019). Additionally, SETDB1 facilitates the heterochromatinization of SINE B2s in mouse macrophages, thus preserving the regulation of lipopolysaccharide-inducible genes by preventing inappropriate CTCF binding and ectopic chromatin looping events (Gualdrini et al., 2022). Comparatively, in human lymphoblastoid cells, only a small fraction of CTCF-bound loop anchors are present in SINE elements, as LTR RTEs contained the greatest percentage of CTCF-bound loop anchors (Choudhary et al., 2020). Thus, CTCF binding sites in RTEs can form loops and mechanistically contribute to the formation of TAD boundaries.

CONCLUSIONS

Contrasting mechanisms of compartmentalization and loop extrusion shape the higher-order folding of eukaryotic genomes in ways that directly influence heterochromatin acquisition and maintenance. While the heterochromatic mechanisms associated with repetitive DNA element biology have been well studied in the context of the linear chromatin fiber, emerging evidence reveals critical crosstalk between higher-order folding patterns and heterochromatin acquisition. Looking ahead, technological innovations including high-resolution microscopy, spatial genomics, single-cell technologies, and systematic perturbative studies will be well positioned to catalyze a wave of fresh insights into how the genome’s structure-function relationship intricately governs stability of the repetitive genome across space and time in development and human disease.

ACKNOWLEDGMENTS

We regret that many important works of our colleagues are not cited due to the large scope of the topic and journal space limits. S.A.H. is a New York Stem Cell Foundation – Druckenmiller Fellow. This research was supported by the New York Stem Cell Foundation (to S.A.H.); NIH National Institute of Mental Health (1R01MH120269; 1DP1MH129957) (to J.E.P.-C.); NIH National Institute of Neurological Disorders and Stroke (1R01-NS114226) (to J.E.P.-C.); 4D Nucleome Common Fund grants (1U01DK127405, 1U01DA052715) (to J.E.P.-C.); NSF CAREER Award (CBE-1943945) (to J.E.P.-C.); and Chan Zuckerberg Initiative Neurodegenerative Disease Pairs Award (2020-221479-5022) (to J.E.P.-C.).

DECLARATION OF INTERESTS

The authors declare no competing interests.

REFERENCES

Achour, M., Jacq, X., Rondé, P., Alhosin, M., Charlot, C., Chataigneau, T., Jeanblanc, M., Macaluso, M., Giordano, A., Hughes, A.D., et al. (2008). The interaction of the SRA domain of ICBP90 with a novel domain of DNMT1 is involved in the regulation of VEGF gene expression. Oncogene 27, 2187–2197. https://doi.org/10.1038/sj.onc.1210855.
Allshire, R.C., Nimmo, E.R., Ekwall, K., Javerzat, J.P., and Cranston, G. (1995). Mutations derepressing silent centromeric domains in fission yeast disrupt chromosome segregation. Genes Dev. 9, 218–233. https://doi.org/10.1101/gad.9.2.218.
Amendola, M., and van Steensel, B. (2015). Nuclear lamins are not required for lamina-associated domain organization in mouse embryonic stem cells. EMBO Rep. 16, 610–617. https://doi.org/10.15252/embr.201439789.
Arand, J., Spieler, D., Karius, T., Branco, M.R., Meilinger, D., Meissner, A., Jenuwein, T., Xu, G., Leonhardt, H., Wolf, V., and Walter, J. (2012). In vivo control of CpG and non-CpG DNA methylation by DNA methyltransferases. PLoS Genet. 8, e1002750. https://doi.org/10.1371/journal.pgen.1002750.
Babbio, F., Pistore, C., Curti, L., Castiglioni, I., Kunderfranco, P., Brino, L., Oudet, P., Seiler, R., Thalman, G.N., Roggero, E., et al. (2012). The SRA protein UHRF1 promotes epigenetic crosstalks and is involved in prostate cancer progression. Oncogene 31, 4878–4887. https://doi.org/10.1038/onc.2011.641.
Banigan, E.J., and Mirny, L.A. (2020). Loop extrusion: theory meets single-molecule experiments. Curr. Opin. Cell Biol. 64, 124–138. https://doi.org/10.1016/j.ceb.2020.04.011.
Bannister, A.J., Zegerman, P., Partridge, J.F., Miska, E.A., Thomas, J.O., Allshire, R.C., and Kouzarides, T. (2001). Selective recognition of methylated lysine 9 on histone H3 by the HP1 chromo domain. Nature 410, 120–124. https://doi.org/10.1038/35065138.
Barr, M.L., and Bertram, E.G. (1949). A morphological distinction between neurones of the male and female, and the behaviour of the nucleolar satellite during accelerated nucleoprotein synthesis. Nature 163, 676–677. https://doi.org/10.1038/163676a0.
Barral, A., and Déjardin, J. (2020). Telomeric Chromatin and TERRA. J. Mol. Biol. 432, 4244–4256.
Beagan, J.A., Duong, M.T., Titus, K.R., Zhou, L., Cao, Z., Ma, J., Lachanski, C.V., Gillis, D.R., and Phillips-Cremins, J.E. (2017). YY1 and CTCF orchestrate a 3D chromatin looping switch during early neural lineage commitment. Genome Res. 27, 1139–1152. https://doi.org/10.1101/gr.215160.116.
Pickersgill, H., Kalverda, B., de Wit, E., Talhout, W., Fornerod, M., and van Steensel, B. (2006). Characterization of the Drosophila melanogaster genome at the nuclear lamina. Nat. Genet. 38, 1005–1014. https://doi.org/10.1038/ng1852.
Quinodoz, S.A., Ollikainen, N., Tabak, B., Palla, A., Schmidt, J.M., Detmar, E., Lai, M.M., Shishkin, A.A., Bhat, P., Takei, Y., et al. (2018). Higher-order inter-chromosomal hubs shape 3D genome organization in the nucleus. Cell 174, 744–757.e24. https://doi.org/10.1016/j.cell.2018.05.024.
Rao, S., Huntley, M., Durand, N., Stamenova, E., Bochkov, I., Robinson, J., Sanborn, A., Machol, I., Omer, A., Lander, E., and Aiden, E. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680. https://doi.org/10.1016/j.cell.2014.11.021.
Rao, S.S., Huang, S.C., Glenn St Hilaire, B., Engreitz, J.M., Perez, E.M., Kieffer-Kwon, K.R., Sanborn, A.L., Johnstone, S.E., Bascom, G.D., Bochkov, I.D., et al. (2017). Cohesin loss eliminates all loop domains. Cell 171, 305–320.e24. https://doi.org/10.1016/j.cell.2017.09.026.
Rea, S., Eisenhaber, F., O’Carroll, D., Strahl, B.D., Sun, Z.W., Schmid, M., Opravil, S., Mechtler, K., Ponting, C.P., Allis, C.D., and Jenuwein, T. (2000). Regulation of chromatin structure by site-specific histone H3 methyltransferases. Nature 406, 593–599. https://doi.org/10.1038/35020506.
Reddy, K.L., Zullo, J.M., Bertolino, E., and Singh, H. (2008). Transcriptional repression mediated by repositioning of genes to the nuclear lamina. Nature 452, 243–247. https://doi.org/10.1038/nature06727.
Reik, W., and Surani, M.A. (2015). Germline and pluripotent stem cells. Cold Spring Harb. Perspect. Biol. 7, a019422. https://doi.org/10.1101/cshperspect.a019422.
Reuter, G., and Spierer, P. (1992). Position effect variegation and chromatin proteins. Bioessays 14, 605–612. https://doi.org/10.1002/bies.950140907.
Schultz, D.C., Ayyanathan, K., Negorev, D., Maul, G.G., and Rauscher, F.J., 3rd. (2002). SETDB1: a novel KAP-1-associated histone H3, lysine 9-specific methyltransferase that contributes to HP1-mediated silencing of euchromatic genes by KRAB zinc-finger proteins. Genes Dev. 16, 919–932. https://doi.org/10.1101/gad.973302.
Schwarzer, W., Abdennur, N., Goloborodko, A., Pekowska, A., Fudenberg, G., Loe-Mie, Y., Fonseca, N.A., Huber, W., Haering, C.H., Mirny, L., and Spitz, F. (2017). Two independent modes of chromatin organization revealed by cohesin removal. Nature 551, 51–56. https://doi.org/10.1038/nature24281.
Seczynska, M., Bloor, S., Cuesta, S.M., and Lehner, P.J. (2022). Genome surveillance by HUSH-mediated silencing of intronless mobile elements. Nature 601, 440–445. https://doi.org/10.1038/s41586-021-04228-1.
See, K., Kiseleva, A.A., Smith, C.L., Liu, F., Li, J., Poleshko, A., and Epstein, J.A. (2020). Histone methyltransferase activity programs nuclear peripheral genome positioning. Dev. Biol. 466, 90–98.
Seisenberger, S., Andrews, S., Krueger, F., Arand, J., Walter, J., Santos, F., Popp, C., Thienpont, B., Dean, W., and Reik, W. (2012). The dynamics of genome-wide DNA methylation reprogramming in mouse primordial germ cells. Mol. Cell 48, 849–862. https://doi.org/10.1016/j.molcel.2012.11.001.
Sexton, T., Yaffe, E., Kenigsberg, E., Bantignies, F., Leblanc, B., Hoichman, M., Parrinello, H., Tanay, A., and Cavalli, G. (2012). Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472. https://doi.org/10.1016/j.cell.2012.01.010.
Shay, J.W., and Wright, W.E. (2019). Telomeres and telomerase: three decades of progress. Nat. Rev. Genet. 20, 299–309. https://doi.org/10.1038/s41576-019-0099-1.
Smallwood, A., Estève, P.-O., Pradhan, S., and Carey, M. (2007). Functional cooperation between HP1 and DNMT1 mediates gene silencing. Genes Dev. 21, 1169–1178.
Rice, J.C., Briggs, S.D., Ueberheide, B., Barber, C.M., Shabanowitz, J., Hunt, Smothers, J.F., and Henikoff, S. (2000). The HP1 chromo shadow domain
D.F., Shinkai, Y., and Allis, C.D. (2003). Histone methyltransferases direct binds a consensus peptide pentamer. Curr Biol 10, 27–30.
different degrees of methylation to define distinct chromatin domains. Mol Solovei, I., Wang, A., Thanisch, K., Schmidt, C., Krebs, S., Zwerger, M., Cohen,
Cell 12, 1591–1598. T., Devys, D., Foisner, R., Peichl, L., et al. (2013). LBR and lamin A/C sequen-
Rothbart, S.B., Krajewski, K., Nady, N., Tempel, W., Xue, S., Badeaux, A.I., tially tether peripheral heterochromatin and inversely regulate differentiation.
Barsyte-Lovejoy, D., Martinez, J.Y., Bedford, M.T., Fuchs, S.M., et al. (2012). Cell 152, 584–598. https://doi.org/10.1016/j.cell.2013.01.009.
Association of UHRF1 with methylated H3K9 directs the maintenance of Soufi, A., Donahue, G., and Zaret, K. (2012). Facilitators and impediments of
DNA methylation. Nat. Struct. Mol. Biol. 19, 1155–1160. https://doi.org/10. the pluripotency reprogramming factors’ initial engagement with the genome.
1038/nsmb.2391. Cell 151, 994–1004. https://doi.org/10.1016/j.cell.2012.09.045.
Rowley, M.J., Nichols, M.H., Lyu, X., Ando-Kuri, M., Rivera, I.S.M., Hermetz, Spracklin, G., Abdennur, N., Imakaev, M., Chowdhury, N., Pradhan, S., Mirny,
K., Wang, P., Ruan, Y., and Corces, V.G. (2017). Evolutionarily conserved prin- L., and Dekker, J. (2021). Heterochromatin diversity modulates genome
ciples predict 3D chromatin organization. Mol. Cell 67, 837–852.e7. https:// compartmentalization and loop extrusion barriers. Preprint at bioRxiv.
doi.org/10.1016/j.molcel.2017.07.022. https://doi.org/10.1101/2021.08.05.455340.
Sutcliffe, J.S., Nelson, D.L., Zhang, F., Pieretti, M., Caskey, C.T., Saxe, D., and Warren, S.T. (1992). DNA methylation represses FMR-1 transcription in fragile X syndrome. Hum. Mol. Genet. 1, 397–400. https://doi.org/10.1093/hmg/1.6.397.
Takei, Y., Yun, J., Zheng, S., Ollikainen, N., Pierson, N., White, J., Shah, S., Thomassie, J., Suo, S., Eng, C.-H.L., et al. (2021). Integrated spatial genomics reveals global architecture of single nuclei. Nature 590, 344–350.
Tchasovnikarova, I.A., Timms, R.T., Matheson, N.J., Wals, K., Antrobus, R., Göttgens, B., Dougan, G., Dawson, M.A., and Lehner, P.J. (2015). GENE SILENCING. Epigenetic silencing by the HUSH complex mediates position-effect variegation in human cells. Science 348, 1481–1485. https://doi.org/10.1126/science.aaa7227.
Thakur, J., Packiaraj, J., and Henikoff, S. (2021). Sequence, chromatin and evolution of satellite DNA. Int. J. Mol. Sci. 22, 4309. https://doi.org/10.3390/ijms22094309.
Towbin, B., González-Aguilera, C., Sack, R., Gaidatzis, D., Kalck, V., Meister, P., Askjaer, P., and Gasser, S. (2012). Step-wise methylation of histone H3K9 positions heterochromatin at the nuclear periphery. Cell 150, 934–947. https://doi.org/10.1016/j.cell.2012.06.051.
Tsouroula, K., Furst, A., Rogier, M., Heyer, V., Maglott-Roth, A., Ferrand, A., Reina-San-Martin, B., and Soutoglou, E. (2016). Temporal and spatial uncoupling of DNA double strand break repair pathways within mammalian heterochromatin. Mol. Cell 63, 293–305. https://doi.org/10.1016/j.molcel.2016.06.002.
Tunbak, H., Enriquez-Gasca, R., Tie, C.H.C., Gould, P.A., Mlcochova, P., Gupta, R.K., Fernandes, L., Holt, J., van der Veen, A.G., Giampazolias, E., et al. (2020). The HUSH complex is a gatekeeper of type I interferon through epigenetic regulation of LINE-1s. Nat. Commun. 11, 5387. https://doi.org/10.1038/s41467-020-19170-5.
Tyler-Smith, C., and Brown, W.R. (1987). Structure of the major block of alphoid satellite DNA on the human Y chromosome. J. Mol. Biol. 195, 457–470. https://doi.org/10.1016/0022-2836(87)90175-6.
van Koningsbruggen, S., Gierliński, M., Schofield, P., Martin, D., Barton, G.J., Ariyurek, Y., den Dunnen, J.T., and Lamond, A.I. (2010). High-resolution whole-genome sequencing reveals that specific chromatin domains from most human chromosomes associate with nucleoli. Mol. Biol. Cell 21, 3735–3748. https://doi.org/10.1091/mbc.e10-06-0508.
van Steensel, B., and Belmont, A.S. (2017). Lamina-associated domains: links with chromosome architecture, heterochromatin, and gene repression. Cell 169, 780–791. https://doi.org/10.1016/j.cell.2017.04.022.
Vertii, A., Ou, J., Yu, J., Yan, A., Pagès, H., Liu, H., Zhu, L.J., and Kaufman, P.D. (2019). Two contrasting classes of nucleolus-associated domains in mouse fibroblast heterochromatin. Genome Res. 29, 1235–1249. https://doi.org/10.1101/gr.247072.118.
Wallrath, L.L., and Elgin, S.C. (1995). Position effect variegation in Drosophila is associated with an altered chromatin structure. Genes Dev. 9, 1263–1277. https://doi.org/10.1101/gad.9.10.1263.
Wang, C., Liu, X., Gao, Y., Yang, L., Li, C., Liu, W., Chen, C., Kou, X., Zhao, Y., Chen, J., et al. (2018a). Reprogramming of H3K9me3-dependent heterochromatin during mammalian embryo development. Nat. Cell Biol. 20, 620–631. https://doi.org/10.1038/s41556-018-0093-4.
Wang, H., Xu, X., Nguyen, C.M., Liu, Y., Gao, Y., Lin, X., Daley, T., Kipniss, N.H., La Russa, M., and Qi, L.S. (2018b). CRISPR-mediated programmable 3D genome positioning and nuclear organization. Cell 175, 1405–1417.e14. https://doi.org/10.1016/j.cell.2018.09.013.
Wang, Y., Zhang, Y., Zhang, R., van Schaik, T., Zhang, L., Sasaki, T., Peric-Hupkes, D., Chen, Y., Gilbert, D.M., van Steensel, B., et al. (2021). SPIN reveals genome-wide landscape of nuclear compartmentalization. Genome Biol. 22, 36. https://doi.org/10.1186/s13059-020-02253-3.
Ye, Q., Callebaut, I., Pezhman, A., Courvalin, J.C., and Worman, H.J. (1997). Domain-specific interactions of human HP1-type chromodomain proteins and inner nuclear membrane protein LBR. J. Biol. Chem. 272, 14983–14989. https://doi.org/10.1074/jbc.272.23.14983.
Yu, J.R., Lee, C.H., Oksuz, O., Stafford, J.M., and Reinberg, D. (2019). PRC2 is high maintenance. Genes Dev. 33, 903–935. https://doi.org/10.1101/gad.325050.119.
Yu, S., Pritchard, M., Kremer, E., Lynch, M., Nancarrow, J., Baker, E., Holman, K., Mulley, J.C., Warren, S.T., Schlessinger, D., et al. (1991). Fragile X genotype characterized by an unstable region of DNA. Science 252, 1179–1181. https://doi.org/10.1126/science.252.5009.1179.
Zenk, F., Zhan, Y., Kos, P., Löser, E., Atinbayeva, N., Schächtle, M., Tiana, G., Giorgetti, L., and Iovino, N. (2021). HP1 drives de novo 3D genome reorganization in early Drosophila embryos. Nature 593, 289–293. https://doi.org/10.1038/s41586-021-03460-z.
Zhang, Y., Li, T., Preissl, S., Amaral, M.L., Grinstein, J.D., Farah, E.N., Destici, E., Qiu, Y., Hu, R., Lee, A.Y., et al. (2019). Transcriptionally active HERV-H retrotransposons demarcate topologically associating domains in human pluripotent stem cells. Nat. Genet. 51, 1380–1388. https://doi.org/10.1038/s41588-019-0479-7.
Zhou, L., Ge, C., Malachowski, T., Kim, J.H., Chandradoss, K.R., Su, C., Wu, H., Rojas, A., Wallace, O., Titus, K.R., et al. (2021). Spatially coordinated heterochromatinization of distal short tandem repeats in fragile X syndrome. bioRxiv. https://doi.org/10.1101/2021.04.23.441217.
Varshney, D., Vavrova-Anderson, J., Oler, A.J., Cowling, V.H., Cairns, B.R., and White, R.J. (2015). SINE transcription by RNA polymerase III is suppressed