Chemosphere
journal homepage: www.elsevier.com/locate/chemosphere
ARTICLE INFO

Handling Editor: A. Gies

Keywords: Computational chemistry; Consensus prediction; Database; Nuclear receptors; Toxicology

ABSTRACT

According to Eurostat, the EU production of chemicals hazardous to health reached 211 million tonnes in 2019. Thus, the possibility that some of these chemical compounds interact negatively with the human endocrine system has received, especially in the last decade, considerable attention from the scientific community. It is obvious that given the large number of chemical compounds it is impossible to use in vitro/in vivo tests for identifying all the possible toxic interactions of these chemicals and their metabolites. In addition, the poor availability of highly curated databases from which to retrieve and download the chemical, structure, and regulative information about all food contact chemicals has delayed the application of in silico methods. To overcome these problems, in this study we use robust computational approaches, based on a combination of highly curated databases and molecular docking, in order to screen all food contact chemicals against the nuclear receptor family in a cost and time-effective manner.
1. Introduction

A research project starts with a question. The main question of this project is: how can we evaluate all the possible food contact chemicals against a protein family to discover potential endocrine disrupting activity? It is obvious that, given the large number of chemical compounds and their metabolites existing and developed every year, it is impossible to use in vitro (or in vivo) tests for identifying all possible toxic interactions. The solution is to use computational approaches to reduce the number of wet tests, seeking only the most probable interactors.
* Corresponding author.
E-mail addresses: pietro.cozzini@unipr.it (P. Cozzini), francesca.cavaliere@unipr.it (F. Cavaliere), giulia.spaggiari@unipr.it (G. Spaggiari), gianluca.morelli@unipr.it (G. Morelli), marco.riani@unipr.it (M. Riani).
https://doi.org/10.1016/j.chemosphere.2021.133422
Received 25 October 2021; Received in revised form 20 December 2021; Accepted 22 December 2021
Available online 28 December 2021
0045-6535/© 2021 Elsevier Ltd. All rights reserved.
P. Cozzini et al. Chemosphere 292 (2022) 133422
Endocrine disrupting chemicals (EDCs) are exogenous substances that can interfere with the synthesis, secretion, transport, binding, and elimination of natural hormones in the body that are responsible for the maintenance of homeostasis, reproduction, and behavior (Kavlock et al., 1996). Human exposure to EDCs occurs through oral consumption of food and water, contact with skin, inhalation, or the intravenous route (Kabir et al., 2015). These molecules are highly heterogeneous and include pesticides, plasticizers (e.g., phthalates, bisphenols), persistent organic pollutants (POPs) (e.g., dioxins, polychlorinated biphenyls), but also chemicals added to food to enhance some characteristics (e.g., flavourings, food additives), or naturally occurring compounds, such as mycotoxins. EDCs can act through different mechanisms: mimicking the action of a naturally produced hormone, blocking hormone receptors in cells, or interacting indirectly by influencing the biosynthesis or availability of normal hormones. Among them, the most privileged route is the interaction with nuclear receptors (NRs). Nuclear receptors are a superfamily of 48 ligand-activated transcription factors, including estrogen receptor (ER), androgen receptor (AR), mineralocorticoid receptor (MR), glucocorticoid receptor (GR), progesterone receptor (PR), and thyroid receptor (TR). NRs share a common structural organization composed of an N-terminal region (A/B domain), a conserved DNA-binding domain (DBD), and a ligand-binding domain (LBD) responsible for ligand recognition. The alteration of nuclear receptor pathways is correlated to many pathologies, such as breast cancer, prostate cancer, and testicular cancer, infertility, cardiovascular complications, disturbances in energy metabolism, immune responses, impairment of cognitive functions and the regulation of cell proliferation and differentiation, hypertension, obesity, and so on (Dall’Asta, 2016; De Coster and Van Larebeke, 2012; Desvergne et al., 2009; Fucic et al., 2012; Luccio-Camelo and Prins, 2011; Odermatt and Gumy, 2008; Petrakis et al., 2017; Safe, 2004; Schug et al., 2011; Gore et al., 2015).

In order to prevent human diseases, different regulatory and policy approaches have been adopted in the past decades, even though the identification and safety assessment of potential EDCs is complicated both by the observed low-dose effects and by the often long-term exposure or exposure during a critical window early in development. One of these is the REACH (Registration, Evaluation, Authorisation, and Restriction of Chemicals) legislation, which is committed to protecting human health and the environment from hazardous chemicals. Testing all the possible EDCs against all the potential targets is very important, but it is also an expensive, long and difficult task (e.g., the nuclear receptor family contains 48 members). In fact, these tests are still mainly based on biological and animal experimentation (toxicity tests), which is very time- and cost-intensive and causes the death of millions of animals every year. In this context, in silico methods, already well-established tools in drug discovery, can be good tools either in the identification of new EDCs or in pointing in the right direction when finding the mechanism of action for already known EDCs. Computational approaches produce predictive models that are more rapid and less costly than in vitro and in vivo tests, allowing a large amount of data concerning numerous chemical substances to be generated and analysed in a short time without the use of test animals (F. Cavaliere et al., 2020).

A key prerequisite for the successful application of computational modeling techniques is the quality of the input data. The availability of open access databases offers the capability to retrieve a huge amount of information from different data sources. The CAS Registry Number (RN) was chosen long ago as a unique and unambiguous numeric identifier for a specific chemical compound. It was developed by the American Chemical Society to help scientists retrieve and use information from different data sources. Since it is meant to be unique, validated, and internationally recognized, governmental agencies rely on CAS RNs for substance identification. However, CAS RNs are often used improperly by the scientific community and there is no check made by the American Chemical Society. Thus, it is really common to find errors, and this wrong information propagates easily across the Internet (Grulke et al., 2019). In fact, conflicts in the chemical identifier are not so rare in public resources and these errors propagate quickly and easily across the internet. These undermine the effort of in silico methods. So far, much attention has been paid to structure normalization to ensure the detection and the correction of three-dimensional errors, and a variety of public and commercial toolkits exist to address this problem. However, less attention is often given to the consistency of the association between chemical identifiers (CAS RN and name) and chemical structures. For example, the compound classified as flavouring having the CAS RN 563187-91-7 and the common name “l-Menthone-1,2-glycerol ketal” in the EFSA list is a typical example of a wrong CAS:Name association. In fact, this CAS RN actually corresponds to “DNA (mouse strain C57BL/6 J clone 5430425J12 EST (expressed sequence tag))” and the correct CAS RN of the compound “l-Menthone-1,2-glycerol ketal” is 67785-70-0. Moreover, although the CAS RN is commonly used as an identifier in the majority of databases, in several databases molecules are classified using different identifiers and thus there is often a lack of standardisation (Hersey et al., 2015). Although data quality is undoubtedly important for every database, they may have been developed with different aims and scope, and it is unreasonable to expect the same degree of curation. The increasing number of compounds released every year (500–1000 new molecules) that are in contact with food, along with the different sources of data, has made it difficult to check manually the reliability of data. In view of this, it is essential to design and implement a data curation pipeline as an automated procedure.

A wide number of computational applications (tools) specifically for the analysis of EDCs are available in the literature in order to determine the relationship between one compound and its toxic effect. In particular, the molecular docking technique is a well-established application to study protein-ligand interaction, which means analysing whether the ligand has the suitable physical-chemical characteristics, shape, and volume to fit properly into the binding cavity of the receptor. Molecular docking is composed of two main parts: an algorithm that is used to predict different binding poses of a molecule in the protein binding site, and a scoring function used to evaluate the strength of the ligand-protein interaction, i.e., to predict its binding affinity. Different algorithms and scoring functions exist, but answering the question of which algorithm or scoring function is the best one is a complicated task (Morris and Lim-Wilby, 2008). In fact, each docking software (that is, the sum of algorithm and scoring function) has been trained with different proteins and ligands. Thus, before starting a molecular docking analysis, it would be advisable to identify the most appropriate software based on the trained protein-ligand complexes that best fit with the proteins and ligands under investigation. However, in the present work, 31 different nuclear receptors with different binding pocket characteristics and a huge number of molecules that are heterogeneous from a chemical and structural point of view were considered. Thus, it is unthinkable to identify a single docking program that may have the same performance for all nuclear receptors and for all food contact molecules. For that reason, we used a robust consensus scoring approach using two different docking software and four different scoring functions. The combination of several scoring functions makes it possible to reduce the number of false positives and to obtain more reliable results by compensating for the deficiencies of each scoring function, leading to an improvement of the performance (Teramoto and Fukunishi, 2007) (Wang et al., 2003). As Bissantz and co-workers have highlighted, the use of three different scoring functions enhances the capability to reach hit rates from 10% up to 70% (Bissantz et al., 2000).

The goal of this work is to predict a possible endocrine disrupting activity of a huge set of molecules that can contact food, as a basis for further in vitro/in vivo tests, using computational methods that do not consider the intake dose. The following approach takes into consideration the interaction between a ligand (i.e. the endocrine disruptor compound) and the binding site of a receptor (i.e. the nuclear receptor) that is considered the molecular initiating event (MIE). This event is fundamental from a biological point of view because it is the first mechanism that, in most cases, initiates a biological effect based on the
The data have been organized into two different databases, MariaDB and Elasticsearch, implemented with SQL and Big Data technology (NoSQL – Not only SQL), respectively. We decided to implement two versions of the same database to answer two requirements: an SQL DB storing structural data of the selected molecules, more suitable for docking and molecular dynamics analysis, and a Big Data version able to store different kinds of information, not only structural information but also in vitro/in vivo tests, regulatory reports, etc. The specification of the structure/mapping used in the present work is explained in more detail in Table 1.

Sub-class         Number of substances
Dioxins           75
Acrylamide        1
Flavourings       2091
Food Additives    110
Furans            133
Mycotoxins        327
Pesticides        465
Phthalates        361
Bisphenols        51
PCBs              209
FCCDB             4268
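Because records in such a database are keyed by CAS RN, a basic format check can be run before any entry is stored. The following is a minimal sketch, not the authors' curation pipeline: it applies the standard CAS check-digit rule (the last digit equals the weighted sum of the preceding digits, counted from the right, modulo 10). The function names `normalize_cas` and `cas_check_digit_ok` are hypothetical.

```python
import re


def normalize_cas(raw: str) -> str:
    """Strip whitespace and replace en-dashes (common in PDF text) with hyphens."""
    return raw.strip().replace("\u2013", "-")


def cas_check_digit_ok(cas_rn: str) -> bool:
    """Return True if cas_rn is well-formed and its check digit is consistent."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas_rn)
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weights 1, 2, 3, ... are applied starting from the rightmost body digit.
    total = sum(w * int(d) for w, d in enumerate(reversed(body), start=1))
    return total % 10 == check


if __name__ == "__main__":
    for raw in ("67785\u201370-0", "563187-91-7", "67785-70-1"):
        cas = normalize_cas(raw)
        print(cas, cas_check_digit_ok(cas))
```

Both CAS RNs quoted in the Introduction (563187-91-7 and 67785-70-0) pass this test. The check digit therefore catches only typing errors and cannot detect a valid number attached to the wrong name or structure, which is why the CAS:Name:structure association still has to be verified separately.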
2.5. Protein preparation

The crystallographic structures of 31 nuclear receptors of Homo sapiens were downloaded from the Protein Data Bank (PDB) (www.rcsb.org). Among them, only 26 structures with high reliability and quality are available. For this reason, the three nuclear receptors with fragmented portions, i.e., constitutive androstane receptor (CAR), nuclear receptor-related 1 protein (NURR1), and estrogen-related receptor alpha (ERRα), were built and minimized for 1 ns with the NAMD 2.13 software package. In addition, the mutated amino acids present in the glucocorticoid receptor (GR) (F602S) and steroidogenic factor 1 (SF-1) (C247S and C412S) crystallographic structures were replaced. The receptor structures were processed using Sybyl software v8.1 (www.tripos.com). Water molecules and ligands were removed, and hydrogen atoms were added. Energy was minimized using the Powell algorithm with a convergence gradient of ≤0.5 kcal (mol Å)⁻¹ and a maximum of 1500 cycles. For the molecular docking with AutoDock (see below), the receptors were further processed: using AutoDockTools software, polar hydrogens were added to the proteins and the Gasteiger charges were calculated to assign the AD4 type to each atom.

2.6. Ligand preparation

Structural coordinates of the endogenous and putative ligands were retrieved from the NCBI PubChem compound database. The FLAP software was used to assign the correct protonation state to each ligand (pH = 7.4).

2.7. Molecular docking with GOLD software

The GOLD software v5.8.1 (CCDC; Cambridge, UK; www.ccd.cam.ac.uk) was applied in order to dock ligands into the binding site of the 31 nuclear receptors. For each compound and receptor, 30 binding poses were generated. The binding site centroid of each receptor was defined using the coordinates of the crystallographic complexes. The side chain flexibility was allowed for each receptor amino acid. For the genetic algorithm run, a maximum number of 100000 operations were performed on a population of 100 individuals with a selection pressure of 1.1. The number of islands and the niche size were set to 5 and 2, respectively. The default GoldScore fitness function was applied for performing the energetic evaluations. The distance for hydrogen bonding and the cut-off value for the van der Waals calculation were set to 2.5 Å and 4.0 Å, respectively. Flip pyramidal N, flip amide bonds, and flip ring corners were allowed for ligand flexibility options. After that, all the poses generated by GOLD software were rescored using the scoring functions ChemScore and HintScore (HINT, Hydropathic INTeraction).

2.8. Molecular docking with Autodock Vina software

Molecular docking experiments were performed with Autodock Vina 1.1.2 using default settings (Trott and Olson, 2009). The search space was included in a box of 24 × 24 × 24 Å, centred on the binding site of the ligands as mentioned before. The side chain flexibility was allowed for the same residues defined in the GOLD docking. The ligand amide and backbone flexibility were allowed.

3. Results and discussion

The foodchem DB has also been designed to accelerate computational applications since it stores not only regulative information but also chemical-physical properties and three-dimensional structures. Very careful attention has been paid to ensuring the correct association of the 3D structure with the CAS RN. Thus, it has been conceived for a different purpose compared to the FPF database, which does not contain all the chemical-physical information used in the foodchem DB and does not store the three-dimensional structure. Moreover, our database has been written in SQL and NoSQL language with the purpose of making it available to the scientific community through a website interface where the user can make searches and extract information. Using our database, the three-dimensional structures of 8091 substances, belonging to different sub-classes (Table 2), have been extracted and all these molecules have been screened using a molecular docking approach in order to identify the compounds having the capability to bind the thirty-one nuclear receptors. This method makes it possible to screen for the substances which have the most probable physical-chemical characteristics to act as endocrine disruptors.

Two different docking software and four different scoring functions have been used as in our previous papers (Francesca Cavaliere et al., 2020) (Spaggiari et al., 2021). Thus, for each receptor and for each food contact chemical, four values have been obtained. In humans, there are 48 nuclear receptors, but many of these remain “orphans” as their endogenous ligands are yet to be determined. For this reason, if the endogenous ligand is known, the relative binding affinity (RBA) of each molecule was calculated using it as a reference compound. On the other hand, all the endogenous and non-endogenous co-crystallized ligands were docked against the respective nuclear receptors to obtain a reference value. A cut-off value was selected for each of the four docking values: i) a cut-off of 50 for GoldScore; ii) a cut-off of 30 for ChemScore; iii) a cut-off of −7 for Autodock (affinity); and iv) a cut-off of 500 for HintScore.

To reach a consensus scoring prediction, a robust statistical method has been used and it is explained in more detail below.

As training dataset, the crystallographic structures available from
Fig. 1. Results obtained from the robust multivariate statistical procedure. The 31 NRs are on the x-axis, while the number of the molecules (%) is on the ordinate.
The molecules with a score smaller than 0.3 are highlighted in green (A), the molecules with a score between 0.3 and 0.8 are highlighted in yellow (B), the molecules
with a score greater than 0.8 are highlighted in red (C), while the outliers are highlighted in grey (D). (For interpretation of the references to colour in this figure
legend, the reader is referred to the Web version of this article.)
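The two thresholding steps mentioned in the text, the per-function cut-offs and the score bands of Fig. 1, can be illustrated with a short sketch. This is not the authors' robust multivariate procedure, which is not described in this excerpt. The direction of each cut-off (higher or lower is better), the inclusive comparisons, the assignment of a score of exactly 0.8 to the high band, and the use of `None` to mark an outlier are all assumptions made for illustration; the names `functions_passed`, `band` and `band_percentages` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional

# Cut-off values quoted in the text. Direction and inclusiveness are assumptions.
CUTOFFS = {
    "GoldScore": (50.0, "higher"),
    "ChemScore": (30.0, "higher"),
    "Vina": (-7.0, "lower"),  # Vina affinity in kcal/mol: more negative is better
    "HintScore": (500.0, "higher"),
}


def functions_passed(values: Dict[str, float]) -> List[str]:
    """List the scoring functions whose cut-off is met by one docked ligand."""
    passed = []
    for name, (cutoff, better) in CUTOFFS.items():
        value = values[name]
        ok = value >= cutoff if better == "higher" else value <= cutoff
        if ok:
            passed.append(name)
    return passed


def band(score: Optional[float]) -> str:
    """Map a consensus score to the bands of Fig. 1 (None marks an outlier)."""
    if score is None:
        return "outlier"
    if score < 0.3:
        return "low"
    if score < 0.8:
        return "medium"
    return "high"


def band_percentages(scores: List[Optional[float]]) -> Dict[str, float]:
    """Percentage of molecules in each band for a single nuclear receptor."""
    if not scores:
        raise ValueError("scores must not be empty")
    counts = Counter(band(s) for s in scores)
    return {b: 100.0 * counts[b] / len(scores)
            for b in ("low", "medium", "high", "outlier")}


if __name__ == "__main__":
    ligand = {"GoldScore": 62.1, "ChemScore": 25.0, "Vina": -8.3, "HintScore": 510.0}
    print(functions_passed(ligand))
    print(band_percentages([0.1, 0.5, 0.9, None]))
```

The example values are invented and serve only to exercise the functions.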
Fig. 2. The percentage of molecules able to bind more than 15 nuclear receptors with high (≥0.8), medium (0.3–0.8), and low binding affinity (<0.3) considering
each class of food contact chemicals.
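The quantity plotted in Fig. 2 can be sketched as a simple count per chemical class. The example below is a minimal illustration under stated assumptions, not the authors' code: consensus scores are assumed to be available as a nested dictionary (molecule, then receptor, then score), class membership as a second dictionary, and the band boundaries follow the Fig. 2 caption. The name `percent_broad_binders` and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional


def band(score: Optional[float]) -> str:
    """Bands from the Fig. 2 caption: high (>=0.8), medium (0.3-0.8), low (<0.3)."""
    if score is None:
        return "outlier"
    if score < 0.3:
        return "low"
    if score < 0.8:
        return "medium"
    return "high"


def percent_broad_binders(
    scores: Dict[str, Dict[str, Optional[float]]],
    classes: Dict[str, str],
    band_name: str,
    more_than: int = 15,
) -> Dict[str, float]:
    """Per class, the percentage of molecules that fall in band_name
    for more than `more_than` receptors (15 of 31 in the paper)."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for molecule, per_receptor in scores.items():
        cls = classes[molecule]
        totals[cls] += 1
        n_in_band = sum(1 for s in per_receptor.values() if band(s) == band_name)
        if n_in_band > more_than:
            hits[cls] += 1
    return {cls: 100.0 * hits[cls] / totals[cls] for cls in totals}


if __name__ == "__main__":
    # Toy data with three receptors, so the threshold is lowered to "more than 1".
    toy_scores = {
        "m1": {"ERa": 0.90, "AR": 0.85, "GR": 0.20},
        "m2": {"ERa": 0.10, "AR": 0.20, "GR": 0.90},
    }
    toy_classes = {"m1": "PCBs", "m2": "PCBs"}
    print(percent_broad_binders(toy_scores, toy_classes, "high", more_than=1))
```

With 31 receptors, "more than 15" is the same condition as "more than 50 percent of nuclear receptors" used in the text.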
the molecular docking scores were far away from the normal trend. Thus, considering that the volume of the ligand-binding pocket of ERRα is only about 80 Å³ (against the ~300 Å³ of most nuclear receptors, excluding the PPAR family), it may be plausible to find a higher number of outliers.

As the second step of our analysis, we turned our attention to which class of food contact chemicals has the greatest number of molecules able to interfere with the endocrine system. Thus, we counted the number of molecules belonging to each class that can interact with more than 50 percent of nuclear receptors with high, medium, and low binding affinity. As we can see in Fig. 2, almost all of the dioxin, furan, and PCB molecules can interact with more than 15 nuclear receptors with high binding affinity, followed by the pesticide and phthalate sub-classes.

This finding highlights the potential capability of these molecules to cause a very broad endocrine effect on the human body. Considering the medium interactors, a great number of flavourings, bisphenols, and FCCDB compounds fall in this group. The single compound in the acrylamide class is also able to interact with more than fifteen nuclear receptors with medium binding affinity. On the other hand, food additives and mycotoxins are more selective in their interaction with nuclear receptors, and only a few molecules can interact with high affinity with more than 50 percent of NRs.

4. Conclusion

flavouring compound, and it is also included in the Food Contact Chemical DB (FCCDB), has two different predicted activities for its capability to act as an agonist for the estrogen receptor α. In fact, in the Tox21 project (Richard et al., 2021), the quantitative high-throughput screening assay (qHTS) identifies 4′-Methoxyacetophenone both as active and inactive for its agonist activity on ERα. In light of this, we think that there is not an approach that can be judged as better than another, but all are equally valid and should be considered together. Thus, the present work should not be seen as an opposing method to classical in vitro and in vivo tests, but it should be considered as a useful and preliminary method to screen a huge number of molecules in a cost- and time-effective manner. In fact, using our robust computational method, we screened a large volume of molecules against the nuclear receptor family in a relatively short time when compared to the time needed for in vitro and in vivo experiments.

Author contribution statement

Pietro Cozzini: Conceptualization, Methodology, Project administration, Resources, Supervision, Writing – reviewing/editing. Giulia Spaggiari: Data curation, Formal analysis, Investigation, Methodology, Validation, Writing – original draft. Francesca Cavaliere: Data curation, Formal analysis, Investigation, Methodology, Software Development, Validation, Writing – original draft. Marco Riani & Gianluca Morelli: Statistical methods and software development.
De Coster, S., Van Larebeke, N., 2012. Endocrine-disrupting chemicals: associated disorders and mechanisms of action. J. Environ. Public Health 2012. https://doi.org/10.1155/2012/713696.
Desvergne, B., Feige, J.N., Casals-Casas, C., 2009. PPAR-mediated activity of phthalates: a link to the obesity epidemic? Mol. Cell. Endocrinol. 304, 43–48. https://doi.org/10.1016/j.mce.2009.02.017.
Fucic, A., Gamulin, M., Ferencic, Z., Katic, J., Krayer Von Krauss, M., Bartonova, A., Merlo, D.F., 2012. Environmental exposure to xenoestrogens and oestrogen related cancers: reproductive system, breast, lung, kidney, pancreas, and brain. Environ. Heal. A Glob. Access Sci. Source 11, 1–9. https://doi.org/10.1186/1476-069X-11-S1-S8.
Gore, A.C., Chappell, V.A., Fenton, S.E., Flaws, J.A., Nadal, A., Prins, G.S., Toppari, J., Zoeller, R.T., 2015. EDC-2: the endocrine society’s second scientific statement on endocrine-disrupting chemicals. Endocr. Rev. 36, E1–E150. https://doi.org/10.1210/er.2015-1010.
Grulke, C.M., Williams, A.J., Thillanadarajah, I., Richard, A.M., 2019. EPA’s DSSTox database: history of development of a curated chemistry resource supporting computational toxicology research. Comput. Toxicol. 12, 100096. https://doi.org/10.1016/j.comtox.2019.100096.
Hersey, A., Chambers, J., Bellis, L., Patrícia Bento, A., Gaulton, A., Overington, J.P., 2015. Chemical databases: curation or integration by user-defined equivalence? Drug Discov. Today Technol 14, 17–24. https://doi.org/10.1016/j.ddtec.2015.01.005.
Kabir, E.R., Rahman, M.S., Rahman, I., 2015. A review on endocrine disruptors and their possible impacts on human health. Environ. Toxicol. Pharmacol. 40, 241–258. https://doi.org/10.1016/j.etap.2015.06.009.
Kavlock, R.J., Daston, G.P., DeRosa, C., Fenner-Crisp, P., Gray, L.E., Kaattari, S., Lucier, G., Luster, M., Mac, M.J., Maczka, C., Miller, R., Moore, J., Rolland, R., Scott, G., Sheehan, D.M., Sinks, T., Tilson, H.A., 1996. Research needs for the risk assessment of health and environmental effects of endocrine disrupters: a report of the U.S. EPA-sponsored workshop. Environ. Health Perspect. 104, 715–740. https://doi.org/10.1289/ehp.96104s4715.
Luccio-Camelo, D.C., Prins, G.S., 2011. Disruption of androgen receptor signaling in males by environmental chemicals. J. Steroid Biochem. Mol. Biol. 127, 74–82. https://doi.org/10.1016/J.JSBMB.2011.04.004.
Morris, G.M., Lim-Wilby, M., 2008. Molecular docking. Methods Mol. Biol. 443, 365–382. https://doi.org/10.1007/978-1-59745-177-2_19.
Odermatt, A., Gumy, C., 2008. Glucocorticoid and mineralocorticoid action: why should we consider influences by environmental chemicals? Biochem. Pharmacol. 76, 1184–1193. https://doi.org/10.1016/j.bcp.2008.07.019.
Petrakis, D., Vassilopoulou, L., Mamoulakis, C., Psycharakis, C., Anifantaki, A., Sifakis, S., Docea, A.O., Tsiaoussis, J., Makrigiannakis, A., Tsatsakis, A.M., 2017. Endocrine disruptors leading to obesity and related diseases. Int. J. Environ. Res. Public Health 14, 1–18. https://doi.org/10.3390/ijerph14101282.
Richard, A.M., Huang, R., Waidyanatha, S., Shinn, P., Collins, B.J., Thillainadarajah, I., Grulke, C.M., Williams, A.J., Lougee, R.R., Judson, R.S., Houck, K.A., Shobair, M., Yang, C., Rathman, J.F., Yasgar, A., Fitzpatrick, S.C., Simeonov, A., Thomas, R.S., Crofton, K.M., Paules, R.S., Bucher, J.R., Austin, C.P., Kavlock, R.J., Tice, R.R., 2021. The Tox21 10K compound library: collaborative chemistry advancing toxicology. Chem. Res. Toxicol. 34, 189–216. https://doi.org/10.1021/acs.chemrestox.0c00264.
Safe, S., 2004. Endocrine disruptors and human health: is there a problem. Toxicology 205, 3–10. https://doi.org/10.1016/j.tox.2004.06.032.
Schug, T.T., Janesick, A., Blumberg, B., Heindel, J.J., 2011. Endocrine disrupting chemicals and disease susceptibility. J. Steroid Biochem. Mol. Biol. 127, 204–215. https://doi.org/10.1016/j.jsbmb.2011.08.007.
Spaggiari, G., Iovine, N., Cozzini, P., 2021. In silico prediction of the mechanism of action of pyriproxyfen and 4′-oh-pyriproxyfen against a. Mellifera and h. sapiens receptors. Int. J. Mol. Sci. 22. https://doi.org/10.3390/ijms22147751.
Teramoto, R., Fukunishi, H., 2007. Supervised consensus scoring for docking and virtual screening. J. Chem. Inf. Model. 47, 526–534. https://doi.org/10.1021/ci6004993.
Trott, O., Olson, A.J., 2009. Software news and update AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31, 455–461.
Wang, R., Lu, Y., Wang, S., 2003. Comparative evaluation of 11 scoring functions for molecular docking. J. Med. Chem. 46, 2287–2303. https://doi.org/10.1021/jm0203783.