Professional Documents
Culture Documents
proteins
Pham Dang Lan, Maksim Kouza, Andrzej Kloczkowski, and Mai Suan Li
(Received 31 July 2018; accepted 11 October 2018; published online 2 November 2018)
We studied the refolding free energy landscape of 26 proteins using the Go-like model. The distance
between the denaturated state and the transition state, X F , was calculated using the Bell theory and the
nonlinear Dudko-Hummer-Szabo theory, and its relation to the geometrical properties of the native
state was considered in detail. We showed that none of the structural parameters, such as the contact
order, protein length, and radius of cross section, correlate with X F for all classes of proteins. To
overcome this problem, we have introduced the nematic order parameter P02 , which describes the
ordering of the structured elements of the native state. Due to its topologically global nature, P02 is
better than other structural parameters in describing the folding free energy landscape. In particular,
P02 displays a good correlation with X F extracted from the nonlinear theory for all three classes of
proteins. Therefore, this parameter can be used to predict X F for any protein, if its native structure is
known. Published by AIP Publishing. https://doi.org/10.1063/1.5050483
TABLE I. Relative contact order (RCO), absolute contact order (CO), radius of cross section (V ASA /S ASA )2 ,
protein size (L 2/3 ), and order parameter P02 of 26 proteins studied in this work. X F1 and X F2 in low and mean
force regimes were calculated in the Bell approximation. The last column refers to X F extracted from DHS theory.
A proteins
β proteins
the weak force regime, folding pathways are the same as in the level is observed for L 2/3 (R = 0.64 for X F1 and 0.68 for X F2 )
absence of force (temperature-driven regime) and the depen- and radius of gyration Rg (R = 0.72 for X F1 and 0.69 for X F2 ).
dence of the folding time on the external force is described (V ASA /S ASA )2 parameter, which describes the compactness of
by Eq. (2) with the distance X F1 . In the medium force regime, the protein, also correlates with the distance between DS and
which is probed by AFM experiments,19 the dependence of τ f NS, but the correlation level is slightly lower than that of ACO
on f is also given by Eq. (2) but with the distance X F2 , where and L 2/3 (R = 0.57 and 0.65 for X F1 and X F2, respectively).
X F1 < X F2 . To be more confident in the correlations between X F and the
In this work, we also observed two regimes separated structural parameters from linear regression, we calculated the
by a switch force fswitch (Figs. S1–S3 in the supplementary p-values that are shown in the corresponding plots. By adopt-
material), but for more proteins than our previous study.19 Here ing the standard criterion that a good fit should have p less than
the proteins are divided into three classes of α-, β-, and mixed 0.05, ACO and Rg display the best correlation with X F among
α/β-proteins. Using Eq. (2) and the data from Figs. S1–S3, we the others.
extracted X F1 and X F2 for two regimes (Table I). X F greatly
b. No correlation between XF and structure parameters for
varies among proteins. Some of them such as 1NTI (α-protein)
α-proteins and α/ β-proteins. As evident from Fig. S4, the
and 2RN2 (mixed α/β-protein) have very large X F , indicating
distance X F of α-proteins correlates with structural parame-
their high sensitivity to the quenched force and the acquisition
ters worse than β-proteins, since R remains below 0.27 and
of their native state is difficult.
p-values are larger than 0.56. This may be due to the fact that
a. Correlation between XF and structural parameters for proteins of this class are too sensitive to external force, leading
β-proteins. It is known that α-rich proteins fold more rapidly to a greater variation in X F compared to β-proteins. Another
than β-rich proteins15,40 and the folding rate correlates with the possible reason is that our set is not large enough to have good
parameters specifying NS such as contact order (CO) and its statistics.
various variants,15,41,42 the protein length (L),43 and the radius Correlation is better in the case of α/β-proteins (Fig. S5),
of the cross section [(V ASA /S ASA )2 ].32 Studies on mechanical but the fit remains poor, with the exception of L 2/3 and L for
unfolding showed a correlation between X u and CO.2,14 An which we obtained R = 0.65, p = 0.044 and R = 0.68, p = 0.029,
interesting question emerges: does the distance X F correlate respectively.
with these structural parameters.
For β-proteins, there is no fit with the relative contact c. Why XF of β-proteins correlates with structure parame-
order CO, but ACO has a correlation level R = 0.67 and 0.74 ters? In order to qualitatively understand why X F correlates
for X F1 and X F2 , respectively (Fig. 2). A similar correlation with the structural parameters of β-proteins but not with other
FIG. 2. Correlation between XF and relative contact order (RCO) and absolute contact order (ACO); L 2/3 and radius of cross section (V ASA /S ASA )2 ; and L and
radius of gyration Rg for β-proteins in the low (left panel) and medium (right panel) force regime.
175101-5 Lan et al. J. Chem. Phys. 149, 175101 (2018)
! 1−1/ν
classes of proteins, we present FEL as a function of the end- νf XF ∆G
1− 1+νf XF /∆G )]
1/ν
τ(f ) = τ0 1 + e kB T [ ( , (3)
to-end distance and the fraction of native contacts for 1LMB ∆G
(α-protein) and 1G1C (β-protein) (Fig. 3). For 1G1C, in the
where ∆G is the height of the barrier between DS and TS, and
both low (f = 3.5 pN) and medium force (f = 8.16 pN) regimes,
the parameter ν is related to the shape of the barrier: ν = 2/3
the most representative structures contain tertiary contacts
for a linear-cubic barrier, ν = 1/2 for a cusp-like barrier,
implying that the refolding is not so sensitive to external force.
and ν = 1 returns the Bell theory [Eq. (2)]. This theory
As a result, X F values are not very diverse among proteins lead-
was successfully used to explore FEL of proteins and RNA
ing to its correlation with structural parameters, as can be seen
hairpin.4,29,44
in our simulation.
Since X F extracted from Eq. (3) is almost the same for
The situation becomes different for the -protein 1LMB,
ν = 1/2 and 2/3, we present the results for ν = 1/2. Figure 4
where secondary structures were formed in the representative
illustrates the application of the DHS theory to protein 1KSR.
structures in the medium force regime (f = 6.8 pN), but ter-
Obviously the fit works for a wider range than the weak force
tiary contacts do not yet occur making it difficult to acquire
regime, but it fails to cover the entire medium regime. In gen-
the native structure. Thus, the effect of external force on the
eral, the region, where this fit works, depends on proteins. For
refolding of α-proteins is strong and leads to a large variation
some proteins, it is valid for the weak force regime, but for
of X F or, ultimately, to a poor fit between X F and structural
other proteins it covers both weak and medium force regimes
parameters.
(Figs. S1–S3). X F predicted by the DHS theory for all studied
2. Nonlinear Dudko-Hummer-Szabo (DHS) theory proteins is given in Table I.
Although Bell theory is widely used in experiments to a. The DHS theory works best for α/ β proteins. Contrary to
obtain information about FEL, X F is determined only in a the Bell theory, for β-proteins X F , predicted by the DHS the-
narrow force range and estimation of τ0 from Eq. (2) might ory does not correlate with any structural parameter (Fig. 5).
be inaccurate.37 This is partly because in the linear theory The best fit was obtained for Rg , but the corresponding corre-
the movement of TS under the action of an external force is lation level is rather low (R = 0.58, p = 0.103). For α-proteins,
neglected. On the other hand, as seen above, in the Bell approx- the DHS approximation works better than the Bell theory
imation, the correlation between X F and structural parameters (compare Fig. 5 and Fig. S4), but the correlation remains
is poor, in particular, for α-proteins and α/β-proteins. There- poor because the best correlation, obtained for ACO, is only
fore, we attempted to go beyond this approximation using DHS 0.5.
theory, which took into account the displacement of TS under For α/β proteins, the DHS theory works much better
an external force.13 Then X F can be defined over a broader than the Bell theory with two regimes (Fig. 5 and Fig. S5).
range of forces by the following equation: With the exception of ACO (R = 0.32, p = 0.37) and RCO
FIG. 4. Application of the linear (two force regimes) and nonlinear DHS
theory to protein 1KSR. Using Eq. (3) and ν = 1/2, we obtaind X F = 0.49 nm
(green curve). For ν = 2/3, we obtained the same results.
B. Order parameter P 02 FIG. 6. A cartoon representation of 1PGA protein that has 5 structured seg-
ments, identifined by STRIDE. Vectors connecting two Cα-ends of secondary
1. Definition of P02 structure elements are →
−
u1 (residues 2 → 8), →
−
u2 (residues 13 → 19), →
−
u3 (residues
23 → 37), → − (residues 42 → 46), and →
u4
− (residues 51 → 55). For 1PGA,
u 5
Because none of the structural parameters are capable P02 = 0.82.
to characterize X F for all classes of proteins, we introduce
a “nematic liquid crystal” order parameter P02 as follows: obtain secondary structures of native conformation retrieved
N from the Protein Data Bank (PDB).45 The vectors → −
ui were
1 X 3 →− →− 2 1 46
calculated by the cafit orientation module plugged in the
P02 = ui . d − . (4)
2N i=1 2 2 Pymol47 software. Note that parameter P02 was introduced48 to
describe the fibril structure of protein. In our case, this parame-
Here →−
ui (i = 1, . . ., N) are the unit vectors connecting the two ter measures the order of structured segments. P02 of all studied
Cα-ends of secondary structures including the helix and beta proteins is shown in Table I. Proteins with small P02 are struc-
(Fig. 6), N is the number of secondary structure elements, and turally less ordered than proteins with large P02 . Thus, it is
→
−
d is the unit eigenvector, which defines the collective direc- expected that proteins, which have small P02 , will be sensitive
tion of the vectors → −
ui . The STRIDE software was employed to to external force.
FIG. 5. Correlation between X F , extracted from DHS theory, and relative contact order (RCO), absolute contact order (ACO), L2/3 , radius of cross section
(V ASA /S ASA )2 , L, and radius of gyration Rg for three classes of proteins.
175101-7 Lan et al. J. Chem. Phys. 149, 175101 (2018)
and verify our main prediction that they are correlated with
P02.
IV. CONCLUSION
We have calculated X F for 26 Go proteins using the Bell
and DHS theory. The relationship between this distance and
various structural parameters has been analyzed in detail. It
was shown that none of the existing parameters (RCO, ACO,
radius of cross section, Rg , and protein length) is correlated
with X F obtained for all classes of proteins. To overcome this
problem, we have introduced parameter P02 which character-
izes the ordering of the β- and α-structures of NS. Due to this
topologically global nature, for all proteins, the new param-
eter correlates with X F extracted from DHS theory in such a
FIG. 9. Correlation between XF estimated by using DHS formula [Eq. (3)] way that the smaller the P02 , the larger the X F . Thus, contrary
and P02 for each class of proteins and for all proteins. to the distance between NS and TS, X U , which is determined
by the contact order,2,14 the relationship between X F and the
geometry of NS is more complicated.
the experimental data on X F and parameter P02 (Fig. 10) as Our simulations show that P02 is an useful parameter for
R = −0.43 and p = 0.337. This is probably because the data exploring FEL of protein folding. It can be used to predict
set is small, but more importantly because in the experiment X F of proteins, based solely on information about their NS. It
X F was estimated using the Bell theory, i.e., from the linear would be important to reaffirm our result on the relationship
dependence of the rupture force on the logarithm of the pulling between P02 and X F for a larger data set not only by simulation
speed. From this point of view, the experimental data agree but also by experiment.
with our result about the poor relationship between P02 and X F
estimated within the linear theory. It would be interesting to
SUPPLEMENTARY MATERIAL
extract X F from fitting experimental data, which are collected
in both weak and medium force regimes, with the DHS Eq. (3) See supplementary material for dependence of folding
time on quenched force for three classes of proteins with lin-
ear and DHS fitting (Figs. S1–S3); correlation between X F
TABLE II. Experimental values of X F for seven proteins.
extracted by Bell formula and relative contact order (RCO),
No. Protein Class P02 Experimental X F (nm) Reference
absolute contact order (ACO), L2/3 , L, radius of cross sec-
tion (V ASA /S ASA )2 , and radius of gyration (Rg ) for α-proteins
1 1UBQ α/β 0.283 0.8 6 (Fig. S4) and α/β-proteins (Fig. S5); correlation between
2 1SRL β 0.16 5.25 8 X F , extracted by Bell formula, and P02 for α-proteins and
3 1NTI α 0.720 6.7 9 α/β-proteins (Figs. S6 and S7); and correlation between X F ,
4 1CFC α 0.613 8.0 10
extracted by using the DHS equation, and relative contact order
5 1PGA α/β 0.819 2.1 11
(RCO), absolute contact order (ACO), L 2/3 , L, radius of cross
6 1BZ6 α 0.147 14.9 7
7 1TIT β 0.802 2.3 5
section (V ASA /S ASA )2 , and radius of gyration (Rg) for all 26
proteins.
ACKNOWLEDGMENTS
This work was supported by the Department of Sci-
ence and Technology at Ho Chi Minh city, Vietnam
(Grant No. 32/2017/HD-KHCNTT), and the Polish NCN
Grant No. 2015/19/B/ST4/02721, Poland. M.K. acknowl-
edges the Polish Ministry of Science and Higher Education
for financial support through “Mobilnosc Plus” Program No.
1287/MOB/IV/2015/0.
1 T. Hoffmann, K. M. Tych, M. L. Hughes, D. J. Brockwell, and L. Dougan,
Phys. Chem. Chem. Phys. 15, 15767 (2013).
2 S. Kumar and M. S. Li, Phys. Rep. 486, 1 (2010).
3 M. L. Hughes and L. Dougan, Rep. Prog. Phys 79, 076601 (2016).
4 Y. Chen, S. E. Radford, and D. J. Brockwell, Curr. Opin. Struct. Biol. 30,
89 (2015).
5 M. Carrion-Vazquez, A. F. Oberhauser, S. B. Fowler, P. E. Marszalek, S.
7 P. J. Elms, J. D. Chodera, C. Bustamante, and S. Marqusee, Proc. Natl. Acad. 28 T. Veitshans, D. Klimov, and D. Thirumalai, Folding Des. 2, 1 (1997).
Sci. U. S. A. 109, 3796 (2012). 29 M. S. Li, M. Kouza, and C.-K. Hu, Biophys. J. 92, 547 (2007).
8 B. Jagannathan, P. J. Elms, C. Bustamante, and S. Marqusee, Proc. Natl. 30 P. Rotkiewicz and J. Skolnick, J. Comput. Chem. 29, 1460 (2008).
Acad. Sci. U. S. A. 109, 17820 (2012). 31 G. G. Krivov, M. V. Shapovalov, and R. L. Dunbrack, Proteins: Struct.,
9 P. O. Heidarsson, I. Valpapuram, C. Camilloni, A. Imparato, G. Tiana, F. M. Funct., Bioinf. 77, 778 (2009).
Poulsen, B. B. Kragelund, and C. Cecconi, J. Am. Chem. Soc. 134, 17068 32 D. N. Ivankov, N. S. Bogatyreva, M. Y. Lobanov, and O. V. Galzitskaya,
Phys. 146, 135101 (2017). 42 H. Zhou and Y. Zhou, Biophys. J. 82, 458 (2002).
20 N. Go, Ann. Rev. Biophys. Bioeng. 12, 183 (1983). 43 A. V. Finkelstein, N. S. Bogatyreva, and S. O. Garbuzynskiy, FEBS Lett.
21 C. Clementi, H. Nymeyer, and J. N. Onuchic, J. Mol. Biol. 298, 937 (2000). 587, 1884 (2013).
22 M. Kouza, C.-K. Hu, and M. S. Li, J. Chem. Phys. 128, 045103 (2008). 44 O. K. Dudko, T. G. Graham, and R. B. Best, Phys. Rev. Lett. 107, 208301
23 J. I. Sułkowska and M. Cieplak, Biophys. J. 95, 3174 (2008). (2011).
24 P. C. Whitford, J. K. Noel, S. Gosavi, A. Schug, K. Y. Sanbonmatsu, and 45 H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig,
J. N. Onuchic, Proteins: Struct., Funct., Bioinf. 75, 430 (2009). I. N. Shindyalov, and P. E. Bourne, Nucleic Acids Res. 28, 235 (2000).
25 M. Cieplak, T. X. Hoang, and M. O. Robbins, Proteins: Struct., Funct., 46 See https://github.com/Pymol-Scripts/Pymol-script-repo/blob/master/
Bioinf. 49, 114 (2002). anglebetweenhelices.py for the method for calculation of vector →
−
ui using a
26 D. Klimov and D. Thirumalai, Proc. Natl. Acad. Sci. U. S. A. 97, 2544 python script.
(2000). 47 W. L. DeLano, http://pymol.org (2002).
27 E. P. O’Brien, G. Ziv, G. Haran, B. R. Brooks, and D. Thirumalai, Proc. 48 P. H. Nguyen, M. S. Li, G. Stock, J. E. Straub, and D. Thirumalai, Proc.
Natl. Acad. Sci. U. S. A. 105, 13403 (2008). Natl. Acad. Sci. U. S. A. 104, 111 (2007).