Professional Documents
Culture Documents
10 1093@nar@gkz610
10 1093@nar@gkz610
doi: 10.1093/nar/gkz610
Received March 13, 2019; Revised June 26, 2019; Editorial Decision July 03, 2019; Accepted July 03, 2019
ABSTRACT INTRODUCTION
Guanine quadruplexes (G4s) are non-canonical nu- DNA G-quadruplexes (G4s) are non-canonical secondary
cleic acids structures common in important genomic structures formed by stacked G-tetrads and stabilized
regions. Parallel-stranded G4 folds are the most by central coordinating cation. As indicated by high-
abundant, but their folding mechanism is not fully un- throughput sequencing there are over 700 000 potential G4
derstood. Recent research highlighted that G4 DNA forming sites in the human genome (1). They are notably
concentrated in telomeric and centromeric regions as well
molecules fold via kinetic partitioning mechanism as in promoter regions of protein-coding genes, particu-
dominated by competition amongst diverse long- larly in those of proto-oncogenes (2,3). These findings agree
living G4 folds. The role of other intermediate species with experimental observations that G4 structures may con-
such as parallel G-triplexes and G-hairpins in the tribute to maintaining genome integrity of centromeric and
folding process has been a matter of debate. Here, telomeric DNA regions (4–10). A number of studies sug-
we use standard and enhanced-sampling molecular gested G4 formation at oncogene promoters; their role in
dynamics simulations (total length of ∼0.9 ms) to transcription regulation then highlighted G4s as promis-
study these potential folding intermediates. We sug- ing therapeutic targets in a broad range of human diseases
gest that parallel G-triplex per se is rather an unsta- including cancer (11–21). Considerable effort has been di-
ble species that is in local equilibrium with a broad rected towards the identification of small molecules en-
ensemble of triplex-like structures. The equilibrium abling modulation of transcription of (onco)genes via bind-
ing and stabilization of the G4 structures in a specific and
is shifted to well-structured G-triplex by stacked
controlled way (22–26).
aromatic ligand and to a lesser extent by flanking To understand and potentially modulate the biochemical
duplexes or nucleotides. Next, we study propeller roles of G4 molecules, it is essential to comprehend their
loop formation in GGGAGGGAGGG, GGGAGGG and folding properties (27–57). Analysis of available in vitro ex-
GGGTTAGGG sequences. We identify multiple fold- perimental and computational data indicates that DNA
ing pathways from different unfolded and misfolded G4-forming sequences have very complex free-energy (fold-
structures leading towards an ensemble of interme- ing) landscapes kinetically dominated by a set of deep free-
diates called cross-like structures (cross-hairpins), energy basins, i.e. substates formed by molecular structures
thus providing atomistic level of description of the with very long lifetimes (58). This phenomenon is gener-
single-molecule folding events. In summary, the par- ally known as kinetic partitioning of the folding landscape
allel G-triplex is a possible, but not mandatory (39,58–61), and the DNA G4 folding landscape has been
suggested to be an extreme case of kinetic partitioning (58).
short-living (transitory) intermediate in the folding of
The deep (long-living, metastable) free-energy basins likely
parallel-stranded G4. correspond to diverse G4 folds and typically act with re-
spect to each other as competing off-pathway folding in-
* To whom correspondence should be addressed. Tel: +420 541 517 133; Email: sponer@ncbr.muni.cz
C The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License
(http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work
is properly cited. For commercial re-use, please contact journals.permissions@oup.com
2 Nucleic Acids Research, 2019
often suggested to fold via antiparallel G-hairpins, from other molecules may affect the folding mechanisms. Thus,
which the process proceeds either through antiparallel G- in vivo folding pathways of G4-forming sequences may sub-
triplexes or via G-hairpins merge (27–32,35,38,45). It has stantially differ from those indicated by in vitro experiments
also been proposed that folding of hybrid G4s proceeds and may be tuneable. In addition, metastable species (pre-
through antiparallel G-hairpins and G-triplexes, with the folded structures, intermediates or alternative G4 folds) that
propeller loop being formed at the end, with some of the might be present under given experimental conditions, but
suggested pathways including an antiparallel chair-type hidden to a given experimental technique, might be uncon-
G4 structure (31,35,38,42,45,48,50,85). Other recent studies sciously considered as part of the unfolded ‘unstructured’
disputed significant participation of G-triplexes in G4 fold- ensemble. It can affect the measurements and their interpre-
to provide equivalent results for the studied systems and ing HBfix (see Supporting Data). Further, all enhanced-
the investigated properties. Hemin was described by GAFF sampling simulations except the GGGAGGGAGGG T-
parameters (100) with charges derived by Shahrokh et al. REMD run were performed with restraints keeping the Gs
(101). in the anti region, referred to as the anti-G restraints. The
All systems were equilibrated using a standard protocol aim was to prevent flipping of Gs to the syn orientation,
(see Supporting Data for full details). Simulations were car- and thus restricting the conformational space to target fold-
ried out using periodic boundary conditions, electrostatic ing of the parallel-stranded structures. The anti-G restraint
interactions were treated by the PME algorithm and the reduces kinetic partitioning by avoiding other types of the
non-bonded cutoff was set to 9 Å (102,103). The temper- G-hairpin/triplexes on the free-energy landscape. The re-
populated in triplex-like intermediates. Analogous 2-1-3-G- the family called 2-cross-3|51, G-tracts 1 and 3 are parallel
triplex was not sampled in our simulations. and Hoogsteen paired with G-tract 2 rotated, but G-tract 3
The family of structures intermediate between the G- is slipped by one base in the 5 direction relative to G-tract
triplex and the 1-3-2 G-triplex is named the symmetric 1. In addition, the last G of G-tract 3 is stacked below the
triplex. It has direct hydrogen bonding of all three G-tracts last G of G-tract 2. The most loosely defined cluster fam-
and may contain symmetric triads (86). The symmetric ily, the cross-cross, contains structures in which all three G-
triplex ensemble has a bit variable channel, which may ac- tracts are mutually rotated, and each G-tract is hydrogen
commodate cations, depending on the exact type of base bonded to the other two by either Hoogsteen or Watson–
pairing between the bases. The last significant cluster fam- Crick edges. There is no central channel.
ily closely related to the G-triplex is the G-triplex-1|51. It is Besides the above-described triplex-like structures, many
essentially the G-triplex with G-tract 1 slipped by one base other arrangements were observed in the simulations. They
in the 5 direction (G-tract 1 | 5 direction, 1 base). include intermediates of G-triplex unfolding in standard
All four above-described cluster families feature all three MD simulations, as well as intermediates in folding found
G-tracts in parallel orientation and utilize mostly Hoog- in the enhanced sampling simulations. The most promi-
steen hydrogen bonding. The remaining triplex cluster fami- nent ones are cross-hairpin structures, i.e. a cross-like ar-
lies have at least one G-tract in a different position, and they rangement of two G-tracts with the third G-tract unbound.
are thus called cross-like triplexes. The most common cross- Cross-hairpins may be formed by G-tracts 1 and 2 (des-
like arrangement is the 3-cross. In this structure, G-tracts 1 ignated as 5 -cross) or by G-tracts 2 and 3 (3 -cross) (Fig-
and 2 remain Hoogsteen paired like in the G-triplex, while ure 3; note the different meaning of 3 -cross [cross-hairpin]
G-tract 3 is rotated and uses Hoogsteen or Watson–Crick and 3-cross [cross-like triplex]). The two crossed G-tracts
edge of its Gs to form hydrogen bonds with the first two have at least two stacked Gs each that are mutually rotated
G-tracts. Gs in G-tract 3 are not strictly parallel, so one of and hydrogen-bonded through their WC or Hoogsteen edge
the Gs in G-tract 3 sometimes forms a native G-triad with so that the overall structure resembles a cross. Though the
Gs from G-tracts 1 and 2. This ensemble has poorly devel- structure with perpendicular orientation of the G-tracts is
oped central ion channel. Analogously, the 1-cross family common, their arrangement is flexible and they often adopt
has G-tracts 2 and 3 parallel, while G-tract 1 is rotated. In a more parallel-like orientation, resulting in a broad en-
Nucleic Acids Research, 2019 7
semble of structures rather than a single well-defined con- the absolute lifetime values indicated by our MD should be
formation. Upon complete folding of the propeller loop, taken with a great care.
the 5 -cross forms a parallel G-hairpin with G-tract 3 un-
bound, called the 5 -G-hairpin (Figure 3). Similarly, the 3 - Duplex-triplex junction and flanking nucleotides visibly sta-
cross folds into the 3 -G-hairpin. bilize the G-triplex ensemble
Other observed structures include the ss-helix and var-
Hairpin-triplex (HT) constructs (Figure 2), which emulate
ious non-G4-resembling arrangements, such as antiparal-
protrusion of G-rich ssDNA from dsDNA, are profoundly
lel hairpins or circular structures with two ends stacked to-
more stable (i.e. have considerably longer lifetime) than
gether (Figure 3). All clusters found in all the simulations
cation, mimicking O6 of a fourth G in full G-tetrad (Figure G-triplex folding is not observed in GGGAGGGAGGG
5). When hemin is not in the dominant conformation, i.e. it REST2 simulation, but parallel G-hairpin folding events are
is not hydrogen bonded to the G, it rotates while remaining common
stacked.
To study the folding of G-triplex, we ran a REST2 simu-
lation starting from the unfolded ss-helical structure. Full
Cation binding G-triplex folding was not reached, nor did we observe
the formation of any of the triplex clusters shown in
In the simulations, the G-triplex channel binds mostly one Figure 3A. Nevertheless, there was a clear tendency to
or two cations in the G4-like manner. When fluctuations of form G-hairpin arrangements within the context of the
the G-triplex are attenuated (e.g. by hemin), binding of two GGGAGGGAGGG sequence, namely, the cross-hairpin
cations dominates. Binding of two cations to other triplex- structures (Figures 3B and 6). A broad ensemble of the
like structures is scarce. The 3-cross and cross-cross either 3 -cross accounted for 22.4% population of the reference
bind one or no cation (Supplementary Table S12). More de- replica. The 5 -cross had a population of 2.7%. Importantly,
tails are given in Supporting Data.
Nucleic Acids Research, 2019 9
T and HT constructs using NMR and CD spectroscopy experimentally characterize potential triplex intermediates
(Supplementary Figures S9 and S10). However, with a sin- were unsuccessful, we report the results to highlight the ma-
gle exception represented by HT-TTA, our efforts were jor challenges in the experimental characterization of tran-
halted by apparent oligomerization of the studied con- sient intermediates that may participate in G4 folding. See
structs, as shown by control native PAGE experiments (Sup- Supporting Data for further information.
plementary Figure S11). The imino region of the 1D 1 H
NMR spectrum of HT-TTA (Supplementary Figure S9)
was marked by strong signal(s) from the GCGAAGC hair- DISCUSSION
pin and by rather diffuse, broad and overlapped multi- We have carried out a series of MD simulations (cumulative
ple signals of very low intensity in the region specific for time 929 s) to investigate properties of three-layer parallel-
imino protons involved in Hoogsteen base pairing span- stranded DNA G-triplexes and G-hairpins, which are ex-
ning from ∼10.5 to 12 ppm. The overall appearance of this pected to contribute to free-energy landscapes of three-
spectral region indicates co-existence of multiple, dynamic tetrad G4 systems. The standard simulations aim to probe
species stabilized by Hoogsteen type of hydrogen bonding. structural stability of the G-triplexes once already formed,
In all other cases, the structural interpretation of these ex- while the enhanced-sampling REST2 and T-REMD simu-
periments would only be highly speculative, so we refrain lations seek for the folding pathway of parallel G-hairpins
from drawing any conclusions. Despite that our attempts to and G-triplexes from the unfolded state. We have focused on
Nucleic Acids Research, 2019 11
parallel G-triplexes with two propeller loops since they rep- like conformations. We have identified eight such structural
resent a considerably more challenging system than antipar- families (Figure 3). The most important are the ‘native’ G-
allel G-triplexes. This assessment is based on previous MD triplex, which has its three G-tracts paired like in a G4, and
studies which revealed that parallel G-triplexes are struc- the 3-cross, which differs from the G-triplex by the posi-
turally less stable (have shorter lifetime) than antiparallel tion of the third G-tract, being rotated to the other two.
G-triplexes, while the formation of propeller loops in sim- We have also observed other less populated cluster fami-
ulations is considerably more difficult compared to lateral lies, closely related to either the G-triplex or other cross-like
and diagonal loops (58,86). triplex structures. Our simulations thus predict that the lo-
cal part of the free-energy landscape pertinent to parallel G-
triplexes corresponds to a broad range of quickly intercon-
The parallel G-triplex is represented by a broad ensemble of verting structures rather than to a well-structured native-
rapidly interconverting structures like G-triplex. Analysis of ion binding has shown that cross-
The simulations have shown that the triplex-forming se- like triplexes could be the folding state at which one albeit
quences can populate a broad ensemble of parallel-triplex-
12 Nucleic Acids Research, 2019
weakly interacting cation gets bound to the forming chan- ligands on the formation of G4 by stabilization of suppos-
nel. edly antiparallel G-hairpin and (2+1)hybrid G-triplex inter-
mediates has been suggested by DNA origami experiments
(76).
Adjacent DNA sequences and ligands can substantially sta-
bilize the parallel G-triplex
Parallel G-triplex could contribute to the DNA G4 folding
All simulations indicate that parallel G-triplexes and G-
landscape as transitory species, but is not an obligatory in-
hairpins would be only marginally populated when being
termediate
isolated (Figures 4 and 6; Supplementary Figure S1 and
assumed. In reality, the single-molecule G4 folding events 20 ms (82). Nevertheless, the triplexes could still contribute
may proceed through an overwhelmingly broad and multi- to the folding rate to the individual G4 basins, when con-
dimensional spectrum of micro-pathways and do not need tributing to the bottleneck of the process, and their role
to dominantly include any salient parallel structures with may be sensitive to the sequence context, ligand binding and
G4-like triads and tetrads. Balance between the individual other external factors; our simulations with hemin show
micro-pathways likely depends on sequence and environ- that lifetime of parallel G-triplexes can be increased per-
mental conditions. haps even by several orders of magnitude. In general, the
Our REST2 and T-REMD folding simulations were rel- ensemble of structures corresponding to the unfolded state
atively abundant in parallel G-hairpin folding events (Fig- of G4 molecules can be dramatically modulated by the en-
ures 6 and 7). This indicates that parallel G-hairpins can be vironment.
straightforwardly accessed from the unfolded state, though An intramolecular parallel G4 could fold via cross-
we have suppressed the competition of all structures with hairpin structures and then cross-like triplexes or the G-
different syn–anti G-tract patterns. However, the popula- triplex structure (Figure 8). A bimolecular parallel G4
tion of parallel G-hairpins was low, unlike the much more might prefer folding by a merge of two cross-hairpins into
abundant ensemble of cross-hairpin structures. When con- a cross-like G4 state. Nevertheless, the presence of a suit-
sidering the simulations and also their limitations (see be- able ligand could modulate the free-energy landscape also
low), we suggest the following interpretation of the data. for the bimolecular and tetramolecular G4s and promote
The G-triplex state would correspond to a broad spectrum folding via the parallel-stranded G-triplex. In conclusion,
of fast-interconverting conformations rather than to a fully this work suggests that the parallel G-triplex may transiently
paired G4-like G-triplex (Figure 3). When comparing with participate during the parallel G4 folding. However, it is not
simulations of complete cation-stabilized G4 folds (58), the an obligatory intermediate (or transitory) structure for the
parallel G-triplex has very short lifetimes and thus does not process (Figure 8), and its actual participation and lifetime
seem to be capable of forming major kinetic traps. There- may vary based on other circumstances.
fore, parallel three-layer G-triplex-like ensembles can only The past simulation studies have revealed that antipar-
act as transitory intermediates during folding into some allel G-triplexes and G-hairpins supporting lateral and di-
G4 basins from the unfolded state and during transitions agonal loops are structurally more stable and more easily
between different G4 folds (cf. Figure 1). This suggestion accessible than the parallel G-hairpins/triplexes supporting
(58,86) is consistent with the latest fluorescence-force exper- propeller loops (51,58,86,120). Thus, although this study
iments, which indicate that if there is the G-triplex present has been focused on parallel G-triplex, we suggest that also
during the folding, it does not correspond to a deep-enough antiparallel G-triplex ensembles may readily participate as
free-energy minimum, and that its lifetime is shorter than
14 Nucleic Acids Research, 2019
transitory species in folding pathways of other G4-folds and near-to-physiological conditions by CD and NMR spec-
are more populated than the parallel G-triplex ensembles. troscopy, FRET, DNA origami with AFM, and optical or
The simulation technique alone is not capable of pro- magnetic tweezers (27,29,32,34–41,43,45,47,76,84,85,134),
viding complete coverage of the G4 folding. Neverthe- these studies have been unable to provide unambiguous
less, the richness of structures and structural transitions prove for existence of G-triplex and/or G-hairpin due
already documented in MD simulations (38,47,49–53,55– to either inherently low-structural resolution (CD, FRET,
57,86,119,120,126–128) caution against interpreting the G4 AFM, tweezers), need for covalent tagging or binding that
folding mechanisms using too simple models based on a few can potentially influence G4 folding pathway (FRET, DNA
straightforward and structurally appealing intermediates. origami), or due to inability to observe low-populated
the unstructured part of the unfolded ensemble. The sim- 6. Juranek,S.A. and Paeschke,K. (2012) Cell cycle regulation of
ulations suggest that the parallel G-hairpin and G-triplex G-Quadruplex DNA structures at telomeres. Curr. Pharm. Des., 18,
1867–1872.
belong to a broader and dynamical ensemble of structures 7. Rizzo,A., Salvati,E., Porru,M., D’Angelo,C., Stevens,M.F.,
that also includes a wide spectrum of imperfectly paired D’Incalci,M., Leonetti,C., Gilson,E., Zupi,G. and Biroccio,A.
structures with not fully parallel or even perpendicular ar- (2009) Stabilization of quadruplex DNA perturbs telomere
rangement of the G-tracts, with a fast exchange of confor- replication leading to the activation of an ATR-dependent ATM
mations within the ensemble. In addition, parallel all-anti signaling pathway. Nucleic Acids Res., 37, 5353–5364.
8. Postberg,J., Tsytlonok,M., Sparvoli,D., Rhodes,D. and Lipps,H.J.
G-hairpins/triplexes appear in simulations less stable than (2012) A Telomerase-associated RecQ Protein-like helicase resolves
the corresponding antiparallel structures with alternating telomeric G-quadruplex structures during replication. Gene, 497,
27. Boncina,M., Lah,J., Prislan,I. and Vesnaver,G. (2012) Energetic 48. Mashimo,T., Yagi,H., Sannohe,Y., Rajendran,A. and Sugiyama,H.
basis of human telomeric DNA folding into G-quadruplex (2010) Folding pathways of human telomeric Type-1 and Type-2
structures. J. Am. Chem. Soc., 134, 9657–9663. G-quadruplex structures. J. Am. Chem. Soc., 132, 14910–14918.
28. Koirala,D., Mashimo,T., Sannohe,Y., Yu,Z.B., Mao,H.B. and 49. Stadlbauer,P., Krepl,M., Cheatham,T.E. 3rd, Koca,J. and Sponer,J.
Sugiyama,H. (2012) Intramolecular folding in three tandem guanine (2013) Structural dynamics of possible Late-Stage intermediates in
repeats of human telomeric DNA. Chem. Commun., 48, 2006–2008. folding of quadruplex DNA studied by molecular simulations.
29. Gray,R.D., Buscaglia,R. and Chaires,J.B. (2012) Populated Nucleic Acids Res., 41, 7128–7143.
intermediates in the thermal unfolding of the human telomeric 50. Bian,Y., Tan,C., Wang,J., Sheng,Y., Zhang,J. and Wang,W. (2014)
quadruplex. J. Am. Chem. Soc., 134, 16834–16844. Atomistic picture for the folding pathway of a Hybrid-1 type human
30. Buscaglia,R., Gray,R.D. and Chaires,J.B. (2013) Thermodynamic telomeric DNA G-quadruplex. PLoS Comput. Biol., 10, e1003562.
characterization of human telomere quadruplex unfolding. 51. Stadlbauer,P., Kuhrova,P., Banas,P., Koca,J., Bussi,G., Trantirek,L.,
68. Lim,K.W., Ng,V.C.M., Martin-Pintado,N., Heddi,B. and Phan,A.T. 89. Luu,K.N., Phan,A.T., Kuryavyi,V., Lacroix,L. and Patel,D.J. (2006)
(2013) Structure of the human telomere in Na+ Solution: An Structure of the human telomere in K+ Solution: An intramolecular
antiparallel (2+2) G-quadruplex scaffold reveals additional (3+1) G-Quadruplex scaffold. J. Am. Chem. Soc., 128, 9963–9970.
diversity. Nucleic Acids Res., 41, 10556–10562. 90. Jorgensen,W.L., Chandrasekhar,J., Madura,J.D., Impey,R.W. and
69. Dailey,M.M., Miller,M.C., Bates,P.J., Lane,A.N. and Trent,J.O. Klein,M.L. (1983) Comparison of simple potential functions for
(2010) Resolution and characterization of the structural simulating liquid water. J. Chem. Phys., 79, 926–935.
polymorphism of a single Quadruplex-Forming sequence. Nucleic 91. Berendsen,H.J.C., Grigera,J.R. and Straatsma,T.P. (1987) The
Acids Res., 38, 4877–4888. missing term in effective pair potentials. J. Phys. Chem., 91,
70. Brcic,J. and Plavec,J. (2017) ALS and FTD linked 6269–6271.
GGGGCC-Repeat containing DNA oligonucleotide folds into two 92. Joung,I.S. and Cheatham,T.E. (2008) Determination of Alkali and
distinct G-Quadruplexes. Biochim. Biophys. Acta Gen. Subj., 1861, Halide monovalent ion parameters for use in explicitly solvated
109. Sponer,J., Bussi,G., Krepl,M., Banas,P., Bottaro,S., Cunha,R.A., 124. Stefl,R., Cheatham,T.E., Spackova,N., Fadrna,E., Berger,I., Koca,J.
Gil-Ley,A., Pinamonti,G., Poblete,S., Jurecka,P. et al. (2018) RNA and Sponer,J. (2003) Formation pathways of a Guanine-Quadruplex
structural dynamics as captured by molecular Simulations: A DNA revealed by molecular dynamics and thermodynamic analysis
comprehensive overview. Chem. Rev., 118, 4177–4338. of the substates. Biophys. J., 85, 1787–1804.
110. Liu,P., Kim,B., Friesner,R.A. and Berne,B.J. (2005) Replica 125. Bergues-Pupo,A.E., Arias-Gonzalez,J.R., Moron,M.C.,
exchange with solute Tempering: A method for sampling biological Fiasconaro,A. and Falo,F. (2015) Role of the central cations in the
systems in explicit water. Proc. Natl. Acad. Sci. U.S.A., 102, mechanical unfolding of DNA and RNA G-quadruplexes. Nucleic
13749–13754. Acids Res., 43, 7638–7647.
111. Wang,L., Friesner,R.A. and Berne,B.J. (2011) Replica exchange with 126. Stadlbauer,P., Mazzanti,L., Cragnolini,T., Wales,D.J.,
solute Scaling: A more efficient version of replica exchange with Derreumaux,P., Pasquali,S. and Sponer,J. (2016) Coarse-Grained
solute tempering (REST2). J. Phys. Chem. B, 115, 9431–9438. simulations complemented by atomistic molecular dynamics provide