Synthesis-Testing-Synthesis-Testing
• The cost of designing a new drug is about $300 million.
• It takes 10-15 years to launch a drug in the market.
• Resources like time, chemicals, etc. are consumed.
• Slower, frustrating, lower success, etc.
QSAR is not theoretical!
• Collection of experimental bioactivity like IC50, EC50,
Do you agree?
Activity = Lipophilicity + Steric + Electronic + Unknown Factors
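One common way to write this additive picture down is a Hansch-type linear free-energy relationship. The form below is only a sketch and is not stated on the slide: the choice of descriptors (log P for lipophilicity, Taft's E_s for steric effects, Hammett's σ for electronic effects) and the coefficient names a, b, ρ and c are assumptions made for illustration, and the residual ε stands in for the "unknown factors".

```latex
\log\frac{1}{C} = a\,\log P + b\,E_s + \rho\,\sigma + c + \varepsilon
```

Here C is the concentration needed for a fixed biological effect (for example an IC50), so a larger log(1/C) means higher activity. The coefficients are obtained by regression against experimental data.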
c) External validation
Real
[Figure: two pairs of compound structures; structures not recoverable]

Troponin I-interacting kinase (TNNI3K)
  Left structure: IC50 = 8000 nM*
  Right structure: AI pred: IC50 = 7800 nM; Experimental: IC50 = 80 nM*
*Lawhorn, B. G. et al., Identification of purines and 7-deazapurines as potent and
selective type I inhibitors of troponin I-interacting kinase (TNNI3K). J. Med. Chem.
2015, 58, 7431–7448.

Spleen tyrosine kinase (Syk)
  Left structure: IC50 = 8.8 nM*
  Right structure: AI pred: IC50 = 10 nM; Experimental: IC50 = 0.060 nM*
*Ellis, J. M. et al., Overcoming mutagenicity and ion channel activity: optimization of
selective spleen tyrosine kinase inhibitors. J. Med. Chem. 2015, 58, 1929–1939.
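The gap between the predicted and experimental IC50 values quoted for the TNNI3K and Syk examples is easier to judge on a log scale. The following is a minimal sketch using only the numbers given above. The helper names (`pic50_from_nm`, `prediction_error`) are hypothetical, and pIC50 is taken as the negative log10 of the IC50 in mol/L.

```python
import math


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nM to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def prediction_error(predicted_nm: float, experimental_nm: float):
    """Return (fold error, pIC50 error) of a prediction.

    A positive pIC50 error means the compound is more potent
    experimentally than the model predicted.
    """
    fold = predicted_nm / experimental_nm
    delta = pic50_from_nm(experimental_nm) - pic50_from_nm(predicted_nm)
    return fold, delta


# Values quoted in the two examples above (nM).
cases = {
    "TNNI3K": {"predicted_nm": 7800.0, "experimental_nm": 80.0},
    "Syk": {"predicted_nm": 10.0, "experimental_nm": 0.060},
}

for target, values in cases.items():
    fold, delta = prediction_error(**values)
    print(f"{target}: {fold:.1f}-fold error, {delta:.2f} pIC50 units")
```

In both cases the prediction is off by roughly two orders of magnitude, that is, about two pIC50 units.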
QSAR based virtual screening
• Molecular docking can rapidly identify large subsets of
molecules with desired activity from large screening
collections of compounds (10^5–10^6 compounds) using
automated methods.
• However, the hit rate ranges between only 0.01% and 0.1%!
• Most of the screened compounds are routinely reported as
false positives.
• On the other hand, typical hit rates for QSAR-based virtual
screening range between 1% and 40%!
Reference: Neves BJ, Braga RC, Melo-Filho CC, Moreira-Filho JT, Muratov EN and
Andrade CH (2018) QSAR-Based Virtual Screening: Advances and Applications in
Drug Discovery. Frontiers in Pharmacology 9. doi: 10.3389/fphar.2018.01275
QSAR based virtual screening:
Success Stories
• Zhang et al. (2013) used a data set of 3,133 compounds
reported as active or inactive against P. falciparum to
develop QSAR models.
• The QSAR models were applied for VS of the ChemBridge
database.
• After VS, 176 potential antimalarial compounds were
identified and submitted to experimental validation along
with 42 putative inactive compounds.
• Twenty-five compounds showed antimalarial activity against
P. falciparum.
• All 42 compounds predicted as inactive by the models were
confirmed experimentally to be inactive.
QSAR based virtual screening:
Success Stories
• Alves et al. (2020) used a data set of 113 compounds (40
actives and 73 inactives) for the SARS-CoV Mpro.
• The QSAR models were applied for VS of the DrugBank
database of FDA-approved drugs.
• After VS, 42 potential drugs were identified, but only 11 were
tested experimentally.
• Three compounds showed strong activity against the SARS-
CoV-2 Mpro.
1. Zhang, L. et al. (2013) Discovery of novel antimalarial compounds enabled by
QSAR-based virtual screening. J. Chem. Inf. Model. 53, 475–492. DOI:
10.1021/ci300421n
2. Alves et al. (2020) QSAR Modeling of SARS-CoV Mpro Inhibitors Identifies
Sufugolix, Cenicriviroc, Proglumetacin, and Other Drugs as Candidates for
Repurposing against SARS-CoV-2. Mol. Inf. DOI: 10.1002/minf.202000113
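The experimental outcomes reported in the two success stories can be turned into confirmation rates with simple arithmetic. The sketch below uses only the counts quoted above. The helper name `confirmation_rate` is hypothetical, and treating "confirmed / tested" as the hit rate is an assumption about how the figures would usually be summarized.

```python
def confirmation_rate(confirmed: int, tested: int) -> float:
    """Percentage of experimentally tested predictions that were confirmed."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    if not 0 <= confirmed <= tested:
        raise ValueError("confirmed must be between 0 and tested")
    return 100.0 * confirmed / tested


# Zhang et al. (2013): 176 predicted actives tested, 25 confirmed active;
# 42 predicted inactives tested, all 42 confirmed inactive.
zhang_active_rate = confirmation_rate(25, 176)
zhang_inactive_rate = confirmation_rate(42, 42)

# Alves et al. (2020): 11 of the 42 predicted drugs were tested,
# 3 of them showed strong activity.
alves_active_rate = confirmation_rate(3, 11)

print(f"Zhang et al., predicted actives confirmed:   {zhang_active_rate:.1f}%")
print(f"Zhang et al., predicted inactives confirmed: {zhang_inactive_rate:.1f}%")
print(f"Alves et al., tested predictions confirmed:  {alves_active_rate:.1f}%")
```

Both studies land inside the 1% to 40% range cited for QSAR-based virtual screening, at roughly 14% and 27% respectively.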
Disadvantages of QSAR
• False correlations may arise because biological data are
subject to considerable experimental error (noisy data).
• If the training dataset is not large enough, the data collected
may not reflect the complete property space.
Consequently, many QSAR results cannot be used to
confidently predict which compounds are most likely to
have the best activity.
• Features may not be reliable either. This is particularly
serious for 3D features because 3D structures of ligands
bound to the receptor may not be available. A common
approach is to use an energy-minimized structure, but that
may not represent reality well.
Free Software for QSAR
1. ACD/ChemSketch (www.acdlabs.com)
2. PyMOL
3. RDKit
4. ChemDraw
5. Avogadro software (https://avogadro.cc/)
6. OpenBabel (http://openbabel.org/wiki/Main_Page)
7. MMTK (http://dirac.cnrs-orleans.fr/MMTK.html)
8. PyDescriptor (available from Dr. V. H. Masand)
9. PaDEL (http://www.yapcwsoft.com/dd/padeldescriptor/)
10. BuildQSAR (https://profanderson.net/files/buildqsar.php)
11. Weka (https://www.cs.waikato.ac.nz/ml/weka/)
12. ‘R’ packages like GA-MLR, caret, etc.
Databases
1. ChEMBL Database - EMBL-EBI: ChEMBL is a
manually curated database of bioactive molecules with
drug-like properties. It brings together chemical,
bioactivity and genomic data
https://www.ebi.ac.uk/chembl/
2. Enzyme Database – BRENDA: A comprehensive
enzyme information system.
https://www.brenda-enzymes.org/