Methods in Pharmacology and Toxicology

Kunal Roy, Editor

Multi-Target Drug Design Using Chem-Bioinformatic Approaches
METHODS IN PHARMACOLOGY
AND TOXICOLOGY

Series Editor
Y. James Kang
Department of Pharmacology &
Toxicology, University of Louisville
Louisville, Kentucky, USA

For further volumes:
http://www.springer.com/series/7653
Methods in Pharmacology and Toxicology publishes cutting-edge methods and protocols
in all areas of pharmacological and toxicological research. Each book in the series offers
time-tested laboratory protocols and step-by-step methods for reproducible lab experiments to aid
toxicologists and pharmaceutical scientists in laboratory testing. With an emphasis on the
molecular biology of toxicological methods, Methods in Pharmacology and Toxicology
focuses on topics with wide-ranging implications for human health, such as Immunotoxicology,
Drug Metabolism, and Metabolomics, to provide investigators with highly useful
compendiums of key strategies and approaches to successful research in drug development.

More information about this series at http://www.springer.com/series/7653


Multi-Target Drug Design Using
Chem-Bioinformatic Approaches

Edited by

Kunal Roy
Department of Pharmaceutical Technology, Jadavpur University, Kolkata, West Bengal, India

Additional material to this book can be downloaded from http://extras.springer.com.

ISSN 1557-2153 ISSN 1940-6053 (electronic)


Methods in Pharmacology and Toxicology
ISBN 978-1-4939-8732-0 ISBN 978-1-4939-8733-7 (eBook)
https://doi.org/10.1007/978-1-4939-8733-7
Library of Congress Control Number: 2018961559

© Springer Science+Business Media, LLC, part of Springer Nature 2019


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction
on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation,
computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations
and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to
be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty,
express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana Press imprint is published by the registered company Springer Science+Business Media, LLC part of
Springer Nature.
The registered company address is: 233 Spring Street, New York, NY 10013, U.S.A.
Dedication

For Aatreyi, Arpit and Chaitali

Preface

Despite significant development of novel rational drug design strategies and
high-throughput screening methods, the cost of drug development has sharply increased, and
at the same time, the rate of failures in clinical trials has escalated [1]. The “one drug, one
target, one disease” approach has failed to appreciate the complexities of disease pathways
and the system-wide effects of drugs [2]. Diseases are often multifactorial, involving a
combination of constitutive and/or environmental factors, and they result from the
breakdown of robust physiological systems due to multiple genetic and/or environmental factors,
leading to the establishment of robust disease conditions. Thus, complex disorders are more
likely to be healed or alleviated through simultaneous modulation of multiple targets. To
date, there are still no fully effective drugs for treating complex, multifactorial diseases, such
as cancer, metabolic diseases, and neurological diseases [1]. Polypharmacology, which
addresses small-molecule interactions with multiple targets, has generated great interest
in drug discovery [3]. This approach allows for studies of off-target activities and the
facilitation of drug repositioning. Multi-target drugs expand the number of pharmacologically
relevant target molecules by introducing a set of indirect, network-dependent effects
[4]. Moreover, low-affinity binding of multi-target drugs eases the constraints of
druggability and significantly increases the size of the druggable proteome. Multi-target agents are
a promising strategy to face complex, multifactorial disorders and drug resistance issues.
Compared to combination therapies, they present several advantages, including more
predictable pharmacokinetics, lower probabilities of drug interactions, and higher patient
compliance [5]. Several existing efficient drugs, such as nonsteroidal
anti-inflammatory drugs, antidepressants, anti-neurodegenerative agents, and multi-target
kinase inhibitors, affect many targets simultaneously [4]. Hybridization of drugs is also a
powerful tool to develop better treatments for several human diseases, as this can provide
combination therapies in a single multifunctional agent in a more specific and powerful way
than conventional treatments [6].
In polypharmacology, one of the most important goals is to rationally design
compounds that act on multiple key targets driving the pathogenesis of a given disease.
Therefore, targeting multiple proteins simultaneously stands a good chance of increasing drug
efficacy and decreasing the possibility of drug resistance. To achieve these goals, it
is necessary to develop state-of-the-art computational techniques for data curation,
model development, and quantitative predictions [2]. Computational approaches are
capable of predicting the activity profile of ligands to a set of targets, anticipating potential
selectivity issues, and discovering desired multi-target activities early in the iterative design
and optimization steps typical of a preclinical drug discovery project. These approaches are
based on 2D or 3D shape and chemical similarity, pharmacophore mapping, target and
binding site similarity assessment, docking experiments, bioinformatics, graph theory and
modeling, machine learning algorithms, and chemogenomics [3]. These approaches can be
classified into statistical data analysis and bioinformatics, ligand-based, and structure-based
approaches, all of which are well-documented in the literature. The structure-based
methods include inverse docking, binding site similarity analysis, inverse pharmacophore
modeling, molecular dynamics simulation, etc., while the ligand-based methods include the similarity
ensemble approach, extended-connectivity fingerprints, fragment-based shape similarity
search, etc., which can be used in combination with a variety of machine learning methods
including deep learning [2]. Systems biology approaches and cellular networks help to
understand complex diseases and their mechanisms and offer many possibilities for
pinpointing key elements as potential drug targets and thus suggest new therapeutic treatment
strategies. Proteochemometric modeling (PCM) simultaneously considers the bioactivity of
multiple ligands against multiple targets and permits exploration of the selectivity and
promiscuity of ligands on biomolecular systems of different complexity [7].
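As a rough illustration of the ligand-based similarity idea mentioned above, the following Python sketch suggests a target profile for a query compound from the Tanimoto coefficient between fingerprints. It is only a minimal sketch under stated assumptions: fingerprints are represented as plain sets of "on" bit indices, and the target names, bit values, the helper names, and the 0.5 cutoff are all invented for illustration and are not taken from any chapter of this volume.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) coefficient of two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def predict_target_profile(query_fp, reference_ligands, threshold=0.5):
    """Map each target to the best similarity between the query and its known ligands.

    Only targets whose best similarity reaches the (hypothetical) threshold are kept.
    """
    profile = {}
    for target, ligand_fps in reference_ligands.items():
        best = max((tanimoto(query_fp, fp) for fp in ligand_fps), default=0.0)
        if best >= threshold:
            profile[target] = best
    return profile


# Hypothetical toy data: three targets, each with fingerprints of known ligands.
reference_ligands = {
    "target_A": [{1, 2, 3, 4}, {2, 3, 5}],
    "target_B": [{1, 2, 3, 8}],
    "target_C": [{10, 11, 12}],
}
query = {1, 2, 3, 9}

profile = predict_target_profile(query, reference_ligands)
for target, score in sorted(profile.items()):
    print(f"{target}: best Tanimoto = {score:.2f}")
```

With this toy data the query shares three of five distinct bits with a ligand of target_A and with the ligand of target_B, so both are retained with a similarity of 0.6, while target_C is dropped. Real workflows would use chemically meaningful fingerprints and validated similarity cutoffs.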
Computational modeling, including quantitative structure-activity relationship (QSAR),
pharmacophore mapping, docking, virtual screening, and other cheminformatics and
proteochemometric approaches, plays a vital role in finding and optimizing leads in any drug
discovery program. Computational modeling helps to understand the important molecular
features contributing to the binding interactions with the target proteins, thus facilitating
the design of new potential compounds and the prediction of the activity of designed compounds that
have not yet been tested. These approaches can save time, money, and, more importantly,
animal sacrifice in the complex, long, and costly drug discovery process.
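The idea of predicting the activity of compounds that have not yet been tested can be sketched with the simplest possible QSAR model: a one-descriptor least-squares line. This is a minimal sketch under stated assumptions, not a method from this volume. The descriptor values, the activity values, and the function names are hypothetical, and a realistic QSAR study would use many descriptors together with proper internal and external validation.

```python
def fit_linear_qsar(descriptor_values, activities):
    """Fit activity = slope * descriptor + intercept by ordinary least squares."""
    n = len(descriptor_values)
    mean_x = sum(descriptor_values) / n
    mean_y = sum(activities) / n
    sxx = sum((x - mean_x) ** 2 for x in descriptor_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(descriptor_values, activities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_activity(model, descriptor_value):
    """Predict the activity of an untested compound from its descriptor value."""
    slope, intercept = model
    return slope * descriptor_value + intercept


# Hypothetical training set: one descriptor per compound and a measured activity.
train_descriptor = [1.0, 2.0, 3.0, 4.0]
train_activity = [5.1, 5.9, 7.1, 7.9]

model = fit_linear_qsar(train_descriptor, train_activity)
predicted = predict_activity(model, 2.5)  # a designed, not yet tested compound
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, predicted={predicted:.2f}")
```

For a multi-target setting, one such model could be fitted per target, so that a designed compound receives a predicted activity profile rather than a single value.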
This volume (Multi-target Drug Design Using Chem-Bioinformatic Approaches) intends
to showcase the recent advances in computational design of multi-target drug candidates
involving various ligand- and structure-based strategies. Different chem-bioinformatic
modeling strategies that can be applied to the design of multi-target drugs are discussed
in this book. Apart from a few literature reviews on the application of chemometric and
cheminformatic modeling tools for multi-target drug design, several case studies are also
presented. Important databases and web servers in connection with multi-target drug
design are also discussed. There are a total of 21 chapters in this book.
The first chapter “Cheminformatics Approaches to Study Drug Polypharmacology”
provides a tutorial overview on selected cheminformatics methods useful for assembling,
curating/preparing a chemical database, and assessing its diversity and chemical space. This
chapter also discusses the methods for evaluating the structure-activity relationships and
polypharmacology.
The second chapter “Computational Predictions for Multi-target Drug Design”
highlights the current state-of-the-art methodologies used in multi-target identification for
therapeutic effects of known drugs or new drug candidates. This chapter emphasizes
experimental validation of model-derived predictions.
The third chapter “Computational Multi-target Drug Design” discusses multi-target or
polypharmacological drug discovery and several in silico methodologies like quantitative
structure-activity relationship (QSAR), pharmacophore modeling, and molecular docking
used in the process of discovery of multi-targeted drugs.
The fourth chapter “Multi-target Drug Design for Neurodegenerative Diseases”
presents an overview of multi-target computational methods as well as of their successful
applications to neurodegenerative diseases. This chapter recommends application of virtual
screening encompassing both structure-based and ligand-based techniques for effective
multi-target drug design.
The fifth chapter “Molecular Docking Studies in Multi-target Antitubercular Drug
Discovery” gives an overview of various targets for antitubercular drug development,
followed by a literature survey of the application of docking studies to the development of
multi-target compounds as promising new drug candidates against tuberculosis.
The sixth chapter “Advanced Chemometric Modeling Approaches for the Design of
Multi-target Drugs Against Neurodegenerative Diseases” discusses the recent advances in
chemometric techniques in multi-target anti-neurodegenerative drug design. This chapter
recommends the use of proteochemometric modeling for multi-target-directed ligand
design.
The seventh chapter “Computational Studies on Natural Products for the Development
of Multi-target Drugs” provides an overview of the currently used computational methods
in natural product research, with special reference to multi-target drug design. This chapter
argues that pan-assay interference compounds (PAINS) are for the most part not
extraordinarily promiscuous and should not be disregarded prematurely.
The eighth chapter “Computational Design of Multi-target Drugs Against Alzheimer’s
Disease” provides the basic background about the molecular targets implicated in the
pathogenesis of Alzheimer’s disease. Furthermore, the chapter reviews structure-activity
relationships (SAR), 2D and 3D quantitative structure-activity relationships (QSAR), as well as other
computational modeling studies performed on multi-target agents for Alzheimer’s disease.
The ninth chapter “Design of Multi-target-Directed Ligands as a Modern Approach for
the Development of Innovative Drug Candidates for Alzheimer’s Disease” reviews some
examples of the exploitation of the multi-target-directed ligand approach in the rational
design of novel drug candidate prototypes for the treatment of Alzheimer’s disease.
The tenth chapter “Virtual Screening for Dual Hsp90/B-Raf Inhibitors” describes a
computational strategy leading to the identification of the first dual inhibitors of heat shock
protein 90 (Hsp90) and protein kinase B-Raf, both being validated targets for anticancer
drug discovery.
The eleventh chapter “Strategies for Multi-target-Directed Ligands: Application in
Alzheimer’s Disease (AD) Therapeutics” presents several in silico strategies adopted for
the development of multi-target anti-Alzheimer compounds followed by a case study
leading to their in vitro validation.
The twelfth chapter “Computational Design of Multi-target Kinase Inhibitors”
summarizes two effective computational strategies to identify multi-target kinase inhibitors. The
first approach involved a combination of merged pharmacophore matching, database
screening, and molecular docking to reliably identify potential multi-target kinase inhibitors.
The second strategy employed ensemble pharmacophore-based screening (EPS) of a com-
pound database, post-EPS filtration (PEPSF) of the ligand hits, and multiple dockings.
The thirteenth chapter “Proteochemometrics for the Prediction of Peptide Binding to
Multiple HLA Class II Proteins” discusses “proteochemometrics” (PCM) as a method for
deriving QSAR. This chapter presents a protocol applied to a set of peptides binding to
seven polymorphic HLA class II proteins from locus DP.
The fourteenth chapter “Linked Open Data: Ligand-Transporter Interaction Profiling
and Beyond” presents a workflow for retrieving and curating information for multiple drug
targets from the open domain, provides insights into how the retrieved data can be
employed in ligand- and structure-based approaches, and discusses the hurdles to consider
with respect to data analysis.
The fifteenth chapter “Design of Novel Dual-Target Hits Against Malaria and
Tuberculosis Using Computational Docking” reviews different approaches (knowledge-based and
screening-based) for designing multi-target inhibitors. A step-by-step guide
(protocol) and different computational resources for designing
multi-target hits for malaria and tuberculosis are also discussed in detail.
The sixteenth chapter “Computational Design of Multi-target Drugs Against Breast
Cancer” presents protocols and computational practices for screening of multi-target drug
molecules for breast cancer receptors. However, the authors emphasize that the
screened molecules must be validated under in vitro and in vivo conditions.
The seventeenth chapter “Computational Methods for Multi-target Drug Designing
Against Mycobacterium tuberculosis” presents available strategies for computational
multi-target drug designing with their advantages and disadvantages. This chapter also discusses
an easy, fast, and accurate protocol for multi-target drug designing against
Mycobacterium tuberculosis.
The eighteenth chapter “Development of a Web Server for Identification of Common
Lead Molecules for Multiple Protein Targets” presents a computational protocol that
involves screening, docking, and scaffold-based optimization of hit molecules from a variety
of compound libraries against any two specified protein targets. The protocol is made
available via a web server named “Multi-target Ligand Design.”
The nineteenth chapter “Computational Method for Prediction of Targets for Breast
Cancer Using siRNA Approach” discusses the development and application of a web-based
database, BOSS, for selection of potential RNAi based on the sequences that have been used
and validated for RNAi-mediated suppression of breast oncogenes. This database includes
the latest information regarding used RNAi molecules that can be cost-effective and less
time-consuming.
The twentieth chapter “Historeceptomics: Integrating a Drug’s Multiple Targets
(Polypharmacology) with Their Expression Pattern in Human Tissues” presents
“historeceptomics” as a new, integrative informatics approach to describing the mechanism of action of
drugs in a holistic, in vivo context. The chapter suggests that this approach may give new
insights into drug mechanism of action, drug repurposing, and prediction of adverse effects,
including the design and development of multi-target drugs or drug combinations.
The twenty-first chapter “Networking of Smart Drugs: A Chem-Bioinformatic
Approach to Cancer Treatment” reviews the existing network of “smart drugs” by using a
chem-bioinformatic approach toward cancer treatment. According to the authors, the
application of computational tools in smart drug designing for cancer treatment will be
path-breaking in the future.
I am sure that this collection of 21 chapters will be useful to researchers working in
the field of drug discovery and development.

Kolkata, India Kunal Roy

References

1. Lu J-J, Pan W, Hu Y-J, Wang Y-T (2012) Multi-target drugs: the trend of drug research and
development. PLoS One 7(6):e40262. doi: 10.1371/journal.pone.0040262
2. Chaudhari R, Tan Z, Huang B, Zhang S (2017) Computational polypharmacology: a new paradigm
for drug discovery. Expert Opin Drug Discov 12(3):279–291. doi: 10.1080/17460441.2017.1280024
3. Rastelli G, Pinzi L (2015) Computational polypharmacology comes of age. Front Pharmacol 6:157.
doi: 10.3389/fphar.2015.00157
4. Korcsmáros T, Szalay MS, Böde C, Kovács IA, Csermely P (2007) How to design multi-target drugs.
Expert Opin Drug Discov 2:799–808. doi: 10.1517/17460441.2.6.799
5. Talevi A (2015) Multi-target pharmacology: possibilities and limitations of the “skeleton key
approach” from a medicinal chemist perspective. Front Pharmacol 6:205. doi: 10.3389/fphar.2015.00205
6. Bérubé G (2016) An overview of molecular hybrids in drug discovery. Expert Opin Drug Discov
11:281–305. doi: 10.1517/17460441.2016.1135125
7. Cortes-Ciriano I, van Westen GJP, Murrell DS, Lenselink EB, Bender A, Malliavin TE (2015)
Applications of proteochemometrics—from species extrapolation to cell line sensitivity modeling.
BMC Bioinform 16(Suppl 3):A4. doi: 10.1186/1471-2105-16-S3-A4
Contents

Preface
Contributors

PART I CHEM-BIOINFORMATIC TOOLS

Cheminformatics Approaches to Study Drug Polypharmacology
J. Jesús Naveja, Fernanda I. Saldívar-González, Norberto Sánchez-Cruz, and José L. Medina-Franco

Computational Predictions for Multi-Target Drug Design
Neelima Gupta, Prateek Pandya, and Seema Verma

Computational Multi-Target Drug Design
Azizeh Abdolmaleki, Fereshteh Shiri, and Jahan B. Ghasemi

PART II COMPUTATIONAL MULTI-TARGET DRUG DESIGN: LITERATURE REVIEWS

Multitarget Drug Design for Neurodegenerative Diseases
Marco Catto, Daniela Trisciuzzi, Domenico Alberga, Giuseppe Felice Mangiatordi, and Orazio Nicolotti

Molecular Docking Studies in Multitarget Antitubercular Drug Discovery
Jéssika de Oliveira Viana, Marcus T. Scotti, and Luciana Scotti

Advanced Chemometric Modeling Approaches for the Design of Multitarget Drugs Against Neurodegenerative Diseases
Amit Kumar Halder, Ana S. Moura, and M. Natália D. S. Cordeiro

Computational Studies on Natural Products for the Development of Multi-target Drugs
Veronika Temml and Daniela Schuster

Computational Design of Multitarget Drugs Against Alzheimer’s Disease
Sotirios Katsamakas and Dimitra Hadjipavlou-Litina

Design of Multi-target Directed Ligands as a Modern Approach for the Development of Innovative Drug Candidates for Alzheimer’s Disease
Cindy Juliet Cristancho Ortiz, Matheus de Freitas Silva, Vanessa Silva Gontijo, Flávia Pereira Dias Viegas, Kris Simone Tranches Dias, and Claudio Viegas Jr.

PART III CASE STUDIES

Virtual Screening for Dual Hsp90/B-Raf Inhibitors
Andrew Anighoro, Luca Pinzi, Giulio Rastelli, and Jürgen Bajorath

Strategies for Multi-Target Directed Ligands: Application in Alzheimer’s Disease (AD) Therapeutics
Sucharita Das and Soumalee Basu

Computational Design of Multi-target Kinase Inhibitors
Sinoy Sugunan and Rajanikant G. K.

Proteochemometrics for the Prediction of Peptide Binding to Multiple HLA Class II Proteins
Ivan Dimitrov, Ventsislav Yordanov, Darren R. Flower, and Irini Doytchinova

Linked Open Data: Ligand-Transporter Interaction Profiling and Beyond
Stefanie Kickinger, Eva Hellsberg, Sankalp Jain, and Gerhard F. Ecker

Design of Novel Dual-Target Hits Against Malaria and Tuberculosis Using Computational Docking
Manoj Kumar and Anuj Sharma

Computational Design of Multi-Target Drugs Against Breast Cancer
Shubhandra Tripathi, Gaurava Srivastava, and Ashok Sharma

Computational Methods for Multi-Target Drug Designing Against Mycobacterium tuberculosis
Gaurava Srivastava, Ashish Tiwari, and Ashok Sharma

PART IV DATABASES AND WEB SERVERS

Development of a Web-Server for Identification of Common Lead Molecules for Multiple Protein Targets
Abhilash Jayaraj, Ruchika Bhat, Amita Pathak, Manpreet Singh, and B. Jayaram

Computational Method for Prediction of Targets for Breast Cancer Using siRNA Approach
Atul Tyagi, Mukti N. Mishra, and Ashok Sharma

PART V SPECIAL TOPICS

Historeceptomics: Integrating a Drug’s Multiple Targets (Polypharmacology) with Their Expression Pattern in Human Tissues
Timothy Cardozo

Networking of Smart Drugs: A Chem-Bioinformatic Approach to Cancer Treatment
Kavindra Kumar Kesari, Qazi Mohammad Sajid Jamal, Mohd. Haris Siddiqui, and Jamal Mohammad Arif

Index
Contributors

AZIZEH ABDOLMALEKI  Department of Chemistry, Tuyserkan Branch, Islamic Azad
University, Tuyserkan, Iran
DOMENICO ALBERGA  Dipartimento di Farmacia-Scienze del Farmaco, Università degli
Studi di Bari “Aldo Moro”, Bari, Italy
ANDREW ANIGHORO  Department of Life Science Informatics, B-IT, LIMES Program Unit
Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität,
Bonn, Germany
JAMAL MOHAMMAD ARIF  Department of Bioscience, Integral University, Lucknow, Uttar
Pradesh, India
JÜRGEN BAJORATH  Department of Life Science Informatics, B-IT, LIMES Program Unit
Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität,
Bonn, Germany
SOUMALEE BASU  Department of Microbiology, University of Calcutta, Kolkata, West
Bengal, India
RUCHIKA BHAT  Department of Chemistry, Indian Institute of Technology Delhi, New Delhi,
India; Supercomputing Facility for Bioinformatics & Computational Biology, Indian
Institute of Technology Delhi, New Delhi, India
TIMOTHY CARDOZO  New York University School of Medicine, NYU Langone Health,
New York, NY, USA
MARCO CATTO  Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di
Bari “Aldo Moro”, Bari, Italy
M. NATÁLIA D. S. CORDEIRO  LAQV@REQUIMTE/Department of Chemistry and
Biochemistry, Faculty of Sciences, University of Porto, Porto, Portugal
SUCHARITA DAS  Department of Microbiology, University of Calcutta, Kolkata, West Bengal,
India
KRIS SIMONE TRANCHES DIAS  PeQuiM, Laboratory of Research in Medicinal Chemistry,
Institute of Chemistry, Federal University of Alfenas, Alfenas, Brazil
IVAN DIMITROV  Faculty of Pharmacy, Medical University of Sofia, Sofia, Bulgaria
IRINI DOYTCHINOVA  Faculty of Pharmacy, Medical University of Sofia, Sofia, Bulgaria
GERHARD F. ECKER  Department of Pharmaceutical Chemistry, University of Vienna,
Vienna, Austria
DARREN R. FLOWER  School of Life and Health Sciences, Aston University, Birmingham, UK
MATHEUS DE FREITAS SILVA  PeQuiM, Laboratory of Research in Medicinal Chemistry,
Institute of Chemistry, Federal University of Alfenas, Alfenas, Brazil; Programa de
Pos-Graduação em Química, Federal University of Alfenas, Alfenas, Brazil
RAJANIKANT G. K.  School of Biotechnology, National Institute of Technology Calicut,
Calicut, Kerala, India
JAHAN B. GHASEMI  Drug Design in Silico Lab, Chemistry Faculty, University of Tehran,
Tehran, Iran
VANESSA SILVA GONTIJO  PeQuiM, Laboratory of Research in Medicinal Chemistry, Institute
of Chemistry, Federal University of Alfenas, Alfenas, Brazil; Programa de Pos-Graduação
em Química, Federal University of Alfenas, Alfenas, Brazil
NEELIMA GUPTA  Centre of Advanced Study, Department of Chemistry, University of
Rajasthan, Jaipur, India
DIMITRA HADJIPAVLOU-LITINA  Department of Pharmaceutical Chemistry, School of
Pharmacy, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki,
Greece
AMIT KUMAR HALDER  LAQV@REQUIMTE/Department of Chemistry and Biochemistry,
Faculty of Sciences, University of Porto, Porto, Portugal
EVA HELLSBERG  Department of Pharmaceutical Chemistry, University of Vienna, Vienna,
Austria
SANKALP JAIN  Department of Pharmaceutical Chemistry, University of Vienna, Vienna,
Austria
QAZI MOHAMMAD SAJID JAMAL  Department of Health Information Management, College of
Applied Medical Sciences, Buraydah Colleges, Buraydah, Al Qassim, Saudi Arabia; Novel
Global Community Educational Foundation, Sydney, NSW, Australia
ABHILASH JAYARAJ  Department of Chemistry, Indian Institute of Technology Delhi, New
Delhi, India; Supercomputing Facility for Bioinformatics & Computational Biology,
Indian Institute of Technology Delhi, New Delhi, India
B. JAYARAM  Department of Chemistry, Indian Institute of Technology Delhi, New Delhi,
India; Supercomputing Facility for Bioinformatics & Computational Biology, Indian
Institute of Technology Delhi, New Delhi, India; Kusuma School of Biological Sciences,
Indian Institute of Technology Delhi, New Delhi, India
J. JESÚS NAVEJA  Department of Pharmacy, School of Chemistry, Universidad Nacional
Autonoma de México, Mexico City, Mexico; PECEM, School of Medicine, Universidad
Nacional Autonoma de México, Mexico City, Mexico
SOTIRIOS KATSAMAKAS  Department of Pharmaceutical Chemistry, School of Pharmacy,
Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
KAVINDRA KUMAR KESARI  Department of Applied Physics, Aalto University, Helsinki,
Finland; Department of Bioproduct and Biosystem, Aalto University, Helsinki, Finland
STEFANIE KICKINGER  Department of Pharmaceutical Chemistry, University of Vienna,
Vienna, Austria
MANOJ KUMAR  Department of Chemistry, Indian Institute of Technology Roorkee, Roorkee,
Uttarakhand, India; Department of Chemistry and Chemical Biology, McMaster
University, Hamilton, ON, Canada
GIUSEPPE FELICE MANGIATORDI  Dipartimento di Farmacia-Scienze del Farmaco,
Università degli Studi di Bari “Aldo Moro”, Bari, Italy
JOSÉ L. MEDINA-FRANCO  Department of Pharmacy, School of Chemistry, Universidad
Nacional Autonoma de México, Mexico City, Mexico
MUKTI N. MISHRA  Biotechnology Division, CSIR-Central Institute of Medicinal and
Aromatic Plants, Lucknow, Uttar Pradesh, India
ANA S. MOURA  LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty
of Sciences, University of Porto, Porto, Portugal
ORAZIO NICOLOTTI  Dipartimento di Farmacia-Scienze del Farmaco, Università degli
Studi di Bari “Aldo Moro”, Bari, Italy
JÉSSIKA DE OLIVEIRA VIANA  Federal University of Paraíba, Health Center, João Pessoa, PB,
Brazil
CINDY JULIET CRISTANCHO ORTIZ  PeQuiM, Laboratory of Research in Medicinal
Chemistry, Institute of Chemistry, Federal University of Alfenas, Alfenas, Brazil;
Programa de Pos-Graduação em Química, Federal University of Alfenas, Alfenas, Brazil
PRATEEK PANDYA  Amity Institute of Forensic Sciences, Amity University, Noida, India
AMITA PATHAK  Department of Chemistry, Indian Institute of Technology Delhi, New Delhi,
India; Supercomputing Facility for Bioinformatics & Computational Biology, Indian
Institute of Technology Delhi, New Delhi, India
LUCA PINZI  Department of Life Sciences, University of Modena and Reggio Emilia,
Modena, Italy
GIULIO RASTELLI  Department of Life Sciences, University of Modena and Reggio Emilia,
Modena, Italy
FERNANDA I. SALDÍVAR-GONZÁLEZ  Department of Pharmacy, School of Chemistry,
Universidad Nacional Autonoma de México, Mexico City, Mexico
DANIELA SCHUSTER  Institute of Pharmacy/Pharmacognosy and Center for Molecular
Biosciences Innsbruck, University of Innsbruck, Innsbruck, Austria; Department of
Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, Paracelsus Medical
University Salzburg, Salzburg, Austria
LUCIANA SCOTTI  Federal University of Paraíba, Health Center, Teaching and Research
Management—University Hospital, João Pessoa, PB, Brazil
MARCUS T. SCOTTI  Federal University of Paraíba, Health Center, João Pessoa, PB, Brazil
ANUJ SHARMA  Department of Chemistry, Indian Institute of Technology Roorkee, Roorkee,
Uttarakhand, India
ASHOK SHARMA  Biotechnology Division, CSIR-Central Institute of Medicinal and
Aromatic Plants, Lucknow, Uttar Pradesh, India
FERESHTEH SHIRI  Department of Chemistry, University of Zabol, Zabol, Iran
MOHD. HARIS SIDDIQUI  Department of Bioengineering, Integral University, Lucknow,
Uttar Pradesh, India
MANPREET SINGH  Supercomputing Facility for Bioinformatics & Computational Biology,
Indian Institute of Technology Delhi, New Delhi, India
GAURAVA SRIVASTAVA  Biotechnology Division, CSIR-Central Institute of Medicinal and
Aromatic Plants, Lucknow, Uttar Pradesh, India
SINOY SUGUNAN  School of Biotechnology, National Institute of Technology Calicut, Calicut,
Kerala, India
NORBERTO SÁNCHEZ-CRUZ  Department of Pharmacy, School of Chemistry, Universidad
Nacional Autonoma de México, Mexico City, Mexico
VERONIKA TEMML  Institute of Pharmacy/Pharmacognosy and Center for Molecular
Biosciences Innsbruck, University of Innsbruck, Innsbruck, Austria
ASHISH TIWARI  Biotechnology Division, CSIR-Central Institute of Medicinal and Aromatic
Plants, Lucknow, Uttar Pradesh, India
SHUBHANDRA TRIPATHI  Biotechnology Division, CSIR-Central Institute of Medicinal and
Aromatic Plants, Lucknow, Uttar Pradesh, India
DANIELA TRISCIUZZI  Dipartimento di Farmacia-Scienze del Farmaco, Università degli
Studi di Bari “Aldo Moro”, Bari, Italy
ATUL TYAGI  Biotechnology Division, CSIR-Central Institute of Medicinal and Aromatic
Plants, Lucknow, Uttar Pradesh, India
SEEMA VERMA  Centre of Advanced Study, Department of Chemistry, University of
Rajasthan, Jaipur, India
FLÁVIA PEREIRA DIAS VIEGAS  PeQuiM, Laboratory of Research in Medicinal Chemistry,
Institute of Chemistry, Federal University of Alfenas, Alfenas, Brazil
CLAUDIO VIEGAS JR.  PeQuiM, Laboratory of Research in Medicinal Chemistry, Institute of
Chemistry, Federal University of Alfenas, Alfenas, Brazil; Programa de Pós-Graduação em
Química, Federal University of Alfenas, Alfenas, Brazil
VENTSISLAV YORDANOV  Faculty of Pharmacy, Medical University of Sofia, Sofia, Bulgaria
Part I

Chem-Bioinformatic Tools
Methods in Pharmacology and Toxicology (2018): 3–25
DOI 10.1007/7653_2018_6
© Springer Science+Business Media New York 2018
Published online: 09 June 2018

Cheminformatics Approaches to Study Drug Polypharmacology
J. Jesús Naveja, Fernanda I. Saldívar-González, Norberto Sánchez-Cruz,
and José L. Medina-Franco

This work is dedicated to the loving memory of Nicolás Medina Sandoval.

Abstract
Herein we present a tutorial overview of selected chemoinformatics methods useful for assembling
and curating a chemical database and for assessing its diversity and chemical space. Methods for
evaluating structure–activity relationships (SAR) and polypharmacology are also included. The use of
open-source tools is emphasized, and step-by-step KNIME workflows illustrate the methods. The
methods described in this chapter are applied to a chemical database especially relevant for
epi-polypharmacology, an emerging area in drug discovery. However, the methods described herein
could be extended to other therapeutic areas and potentially to other areas of chemistry.

Keywords Chemoinformatics, ChemMaps, Chemical space, Data mining, Epigenetics, Epi-informatics,
KNIME, Molecular diversity, Open-access, Polypharmacology, Structure–activity relationships, SmARt

1 Introduction

The rapid growth of chemical information demands efficient and reliable computational
algorithms to analyze the accumulated data. Similarly, current trends in drug discovery
such as polypharmacology [1, 2] demand the organization and efficient mining of multiple
drug–target interactions and the study of structure–multiple activity relationships
(SmARt) [3]. Indeed, a plethora of methods and resources for exploiting SmARt and other
data relevant to polypharmacology have been published, and many of them are open
access [4]. This review includes methodological details for implementing scalable KNIME
cheminformatics workflows for:
a. Curating a chemical database;
b. Computing chemical descriptors;
c. Analyzing and comparing database diversity; and
d. Visualizing their chemical space.

Electronic supplementary material: The online version of this article
(https://doi.org/10.1007/7653_2018_6) contains supplementary material, which is available to
authorized users.

Of note, KNIME is an open-access initiative intended for gen-
erating data mining pipelines or workflows, which are capable of
integrating multiple tools [5].
Although sufficiently detailed, this review aims at being a quick
practical guide. More comprehensive tutorials in chemoinformatics
can be found elsewhere [6, 7]. Additionally, web applications for
cheminformatics methods that have been developed by our research
group are mentioned in the respective subsections. These applications
are part of the D-Tools initiative for generating open cheminformatics
resources (available at https://www.difacquim.com/d-tools/). The
D-Tools usage is further described elsewhere [4, 8–11], and these
are not the focus of this review.

2 Methods

2.1 Construction and Curation of a Compound Database

The amount of chemical information has grown to the point where the concept of big data
commonly applies [12], so the efficient management of information is a current challenge.
This is of particular importance in polypharmacology, where large compound datasets contain
screening information across several biological endpoints. In response to this need, building
compound and other databases is a convenient way to organize information according to the
data available and the specific objectives of the study.
In chemoinformatics, construction of databases is a fundamen-
tal practice to perform various computational studies like the design
of chemical libraries, characterization and comparison of the chem-
ical space, the study of the structure–activity relationships (SAR),
and virtual screening studies, among others.
Currently, the web pages of large public databases such as DrugBank [13], ChEMBL [14],
ZINC [15], and BindingDB [16] allow users to download their contents (completely or
partially), with information on approved drugs, drugs in the experimental phase,
commercially available compounds, molecular targets, etc. However, these databases are
not always up to date, so they can be enriched with new information published in books or
in scientific articles.
Also, research groups devoted to the synthesis, isolation from natural sources, and/or
evaluation of new chemical entities can construct completely new compound databases.
Such collections are usually referred to as in-house databases.
The process of building and annotating chemical databases is not trivial. Each organization
has its own rules, conventions, and procedures. However, the steps that are considered
essential are listed below:
1. Identify compounds and resources that contain information
required, e.g., journals and databases with chemical informa-
tion [4, 17].
2. In a spreadsheet, it is recommended that the user has the
following information for each compound:
a. Name of each compound. This can be searched in public
databases.
b. A number that identifies this compound in the database that
has been consulted, for example, the ChemSpider ID, the Substance
or Compound ID (SID, CID) in PubChem, the CAS registry number,
or an internal and consistent code if building an in-house collection.
c. Structure input. An example of this is the use of Canonical
SMILES notation used for encoding molecular structures
that can be imported to other molecular editing systems. It
is worth noting the relevance of creating a single computa-
tional representation. This can be achieved by using various
algorithms in a process known as canonicalization.
3. Once this information is collected in the spreadsheet, save the
database, preferably in .csv format (comma delimited). Other
database formats that hold chemical information and are compatible
with most computer programs, such as KNIME, are sdf (structure
data file), mol (molecular data file), and mol2 (Tripos mol2 file).
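As a minimal sketch of step 3, the records gathered in the spreadsheet can be written to a comma-delimited file with Python's standard library. The column names, the in-house identifiers, and the two example rows are illustrative assumptions and do not come from any published database.

```python
import csv
import io

# Illustrative records: name, identifier, and SMILES string for each compound.
# The identifiers are hypothetical in-house codes.
RECORDS = [
    {"name": "ethanol", "compound_id": "INH-0001", "smiles": "CCO"},
    {"name": "benzene", "compound_id": "INH-0002", "smiles": "c1ccccc1"},
]
FIELDS = ["name", "compound_id", "smiles"]


def records_to_csv(records, fields=FIELDS):
    """Serialize compound records as comma-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def csv_to_records(text):
    """Parse comma-delimited text back into a list of dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


csv_text = records_to_csv(RECORDS)
round_trip = csv_to_records(csv_text)
```

Writing to an in-memory buffer keeps the sketch free of file-system assumptions; passing an open file object to `csv.DictWriter` instead would produce the .csv file itself.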
For the management and analysis of databases, the KNIME
Example Server provides access to many explanatory workflows.
The example server is accessible via the KNIME Explorer panel
within the KNIME workbench and represents a great help when
starting a new workflow.
Two nodes that are useful to start working with files containing chemical
information are Molecule Type Cast and Marvin MolConverter. Molecule Type Cast is
useful after reading chemical data from a .csv file or database: it casts
any string as a chemical type (i.e., it tells KNIME “This is a
SMILES string”). Marvin MolConverter, a node provided by
Chemaxon/Infocom, translates seamlessly between types
(smiles ↔ sdf ↔ mrv).
An important aspect to consider when analyzing molecular
databases generated by other scientists is that these may contain
wrong information or unnecessary information for the intended
application or project. Therefore, cleaning or curating the informa-
tion is highly relevant to enhance the quality of the data and to
avoid erroneous results [18].
As in the construction of databases, there is no widely accepted
standard protocol for the preparation of small molecules. However,
the essential points in the preparation and curation of databases
are described below:
1. Normalize the chemical structures. In this step, each chemical
structure is checked for valid atom types and valences, and
functional groups such as nitro groups are converted to a
consistent representation. This is followed by a standardization
step in which chemical structures are converted to a canonical
tautomeric form, aromatic structures are kekulized, placement
of stereo bonds is standardized, and all implicit hydrogens are
converted to explicit hydrogens [19].
2. Remove duplicates. After the molecules have been properly
standardized, it is appropriate to detect duplicates. InChIKeys
are useful for recognizing different protonation states and
tautomers of the same molecule.
3. Discard inorganic and organometallic atoms or molecules if
these are not the object of study. It is worth mentioning that
the majority of the chemoinformatics programs currently avail-
able are developed to process small organic molecules.
4. Wash the compound database by applying to each molecule a
set of “cleaning” rules, such as the elimination of salts and the
adjustment of the protonation states. The purpose of this step
is to ensure that each chemical structure is in a form suitable for
the subsequent modeling.
5. Enumerate tautomers and stereoisomers. This step is important
in virtual screening studies, particularly when using search
methods such as docking or pharmacophore searching.
6. Optimize the geometry and minimize the energy if the data-
base will be used to evaluate the potential of each compound to
bind to a receptor or enzyme, or to calculate descriptors which
depend on the three-dimensional conformation of the mole-
cule. The specific method to optimize the geometry will largely
depend on the type, quantity of molecules to optimize, and,
most importantly, on the specific application.
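The duplicate-removal step above can be sketched with precomputed InChIKeys. This is only an illustration under stated assumptions: the keys are supplied as plain strings (in practice they would be generated by a cheminformatics toolkit after standardization), the record layout is hypothetical, and the option of ignoring the final character relies on the standard InChIKey layout, in which the last character flags the protonation state.

```python
# Hypothetical records; the InChIKeys are given as precomputed strings and
# should be regenerated with a toolkit before any real use.
RECORDS = [
    {"name": "acetic acid", "inchikey": "QTBSBXVTEAMEQO-UHFFFAOYSA-N"},
    {"name": "acetate", "inchikey": "QTBSBXVTEAMEQO-UHFFFAOYSA-M"},
    {"name": "ethanol", "inchikey": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"},
    {"name": "ethyl alcohol", "inchikey": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"},
]


def deduplicate(records, ignore_protonation=False):
    """Keep the first record seen for each InChIKey.

    If ignore_protonation is True, the final character of the key (the
    protonation flag of a standard InChIKey) is dropped before comparing,
    so differently protonated forms collapse into one entry.
    """
    seen = set()
    unique = []
    for record in records:
        key = record["inchikey"]
        if ignore_protonation:
            key = key[:-1]
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


strict = deduplicate(RECORDS)
merged = deduplicate(RECORDS, ignore_protonation=True)
```

Keeping the first record per key is an arbitrary choice; a real curation workflow might instead merge the annotations of the duplicated entries.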
In addition, if the quantity of compounds is too large to be
examined or tested with the resources available, different strategies
can be employed to reduce the number of compounds in a rational
and consistent manner. Such strategies include: filtering—essen-
tially imposing secondary search criteria to eliminate compounds,
clustering—taking a representative subset of a larger set, and human
inspection of the compound structures (with or without extra
data) [20].
Several articles have discussed the impact of duplicates and inconsistencies
in the molecular structures on prediction models [21]. For this reason, the
project CERAPP (Collaborative Estrogen Receptor Activity Prediction Project) has
developed a workflow to curate databases [22]. A similar workflow can be found at
https://github.com/zhu-lab/curation-workflow/blob/master/Structure%20Standardizer2.zip.
Gally et al. also report a workflow designed to prepare molecular
databases but focused on virtual screening studies [23]. In
addition to standardizing chemical structures, the workflow of
Gally et al. implements filters (based on molecular property
distributions) to characterize specific subsets of chemical libraries
such as drug-like, lead-like, or fragment-like subsets of compounds.
See Workflow 1 in the Supplementary Information for an
example in KNIME.
The following analyses use an epigenomics chemical database
that has already been curated and published [24].

2.2 Diversity Analysis

In drug discovery projects focused on a single target or on multiple
targets, it is highly relevant to quantify the structural diversity of
compound datasets. For instance, if the goal of a high-throughput
screening campaign is to identify hit compounds with a desirable
polypharmacological profile, it is advisable to screen a compound
collection with high diversity. This will increase the chances of
finding active molecules with the desired profile. If the goal of the
screening campaign is to further develop a focused library (e.g.,
increase the structure–activity information of a focused region in
chemical space [25]), it is advisable to screen a compound dataset
with high internal similarity (low diversity).
The diversity in a chemical library can be assessed in multiple
ways, mainly depending on the data under scrutiny. In addition to
the diversity metric, a key aspect of diversity analysis is molecular
representation [26, 27]. The most common ways to represent mol-
ecules in chemoinformatic applications are molecular descriptors
(including physicochemical properties and molecular fingerprints),
and chemical scaffolds [28]. Depending on the type of descriptor
and the level of accuracy desired (considering the time of computa-
tion and the number of compounds to analyze), the input structures
can be in two or three dimensions (the latter requires conformational
analysis). The choice of molecular representation depends on the
goals of the study.
A more detailed description on how to use molecular descrip-
tors and scaffolds as an input for diversity analysis follows in the
next paragraphs. See Workflow 2 in the Supplementary Informa-
tion for an exemplary diversity analysis in KNIME.

2.2.1 Molecular Descriptors

Molecular descriptors capture information of the whole molecule
and are usually straightforward to interpret. Also, whole-molecule
properties such as physicochemical properties of pharmaceutical
interest are usually part of empirical rules for drug likeness that
help guide drug discovery programs. KNIME includes RDKit,
CDK, and Indigo nodes, with which complexity descriptors (e.g.,
chiral carbons and fraction of sp3 carbon atoms) and physicochemical
properties of pharmaceutical interest (including molecular
weight, number of hydrogen bond donors and acceptors, number
of rotatable bonds, logarithm of the octanol–water partition coefficient,
and topological polar surface area) can be computed [28].
Starting with curated databases (discussed in Sect. 2.1), the
steps for quantifying diversity with molecular descriptors are:
1. Select the features to be evaluated (usually the six commonest
physicochemical properties of pharmaceutical relevance, vide
supra).
2. Scale the data using a Z-transformation. This transforms the
data to dimensionless units. The purpose is to improve the
comparability of the variables and give a similar weight to all
of them independently of the units with which they were
originally measured.
3. Compute pairwise Euclidean distances. For a database with
$n$ compounds, $n(n-1)/2$ pairwise comparisons are to be
computed. The Euclidean distance is calculated with the formula:

$$D(A, B) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2},$$

where $D(A, B)$ is the Euclidean distance between compounds A
and B, $a_i$ and $b_i$ are the values of the i-th descriptor for A and B,
and $n$ is, in this formula, the total number of descriptors [29].
$D(A, B)$ can take any non-negative real number as value.
4. Compute a central tendency statistic (e.g., mean or median) for
all the pairwise comparisons. The larger the mean or median,
the more diverse the dataset is [30].
5. Finally, for comparison, the statistic can be computed for other
reference databases or looked up in the literature if already
reported.
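The descriptor-based procedure can be sketched in a few lines of standard-library Python. The descriptor table is made up for illustration, and the use of the population standard deviation in the Z-transformation is an assumption (the sample standard deviation is an equally common choice).

```python
import math
import statistics
from itertools import combinations

# Hypothetical descriptor table: one row per compound; the columns stand for
# molecular weight, hydrogen bond donors, and topological polar surface area.
DESCRIPTORS = [
    [180.2, 1, 63.6],
    [151.2, 2, 49.3],
    [206.3, 1, 37.3],
    [194.2, 0, 58.4],
]


def z_scale(table):
    """Z-transform each descriptor column (mean 0, standard deviation 1)."""
    columns = list(zip(*table))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stdevs)]
        for row in table
    ]


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_distances(table):
    """All n(n-1)/2 pairwise distances computed on the scaled descriptors."""
    scaled = z_scale(table)
    return [euclidean(a, b) for a, b in combinations(scaled, 2)]


distances = pairwise_distances(DESCRIPTORS)
mean_distance = statistics.mean(distances)
median_distance = statistics.median(distances)
```

For large databases the quadratic number of pairs becomes the bottleneck, which is one reason a vectorized or node-based implementation such as the KNIME workflow is preferable in practice.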

2.2.2 Molecular Fingerprints

Many structural features escape the very general information
obtained with physicochemical and complexity descriptors. Molecular
fingerprints are vectors that encode a more comprehensive
set of features (usually more than a hundred) to compare molecules.
Every feature is encoded as a Boolean variable, where “0” represents
absence and “1” represents presence of the feature. Therefore,
repeated motifs are not generally acknowledged. For every molecule,
a Boolean vector of features is obtained, and these vectors are amenable
to standard set operations [31–33]. However, molecular fingerprints
do have limitations; for example, they can be more difficult to
interpret intuitively and therefore make it harder to extract
insights relevant for medicinal chemistry.

The steps for computing diversity based on fingerprints are:


1. Select a molecular fingerprint. Although the selection of the
“best” fingerprint could be different from case to case, it has
been consistently found that MACCS keys 166-bits [34] are
useful for quantifying database diversity. In turn, extended
connectivity fingerprints of diameter 4 (ECFP4) [32] as well
as other circular fingerprints are, overall, better suited for vir-
tual screening, activity landscape modeling, and SAR studies in
general.
2. Compute pairwise Tanimoto similarity [27, 35]. For a database
with $n$ compounds, $n(n-1)/2$ pairwise comparisons are to
be computed. Tanimoto similarity is calculated with the
expression:

$$T(A, B) = \frac{c}{a + b - c},$$

where $T(A, B)$ is the Tanimoto similarity, with possible values being
any real number between 0 and 1, $c$ is the number of features
for which both molecules A and B have a “1” value, $a$ is the
number of features for which molecule A has a “1” value, and
$b$ is the number of features for which molecule B has a “1”
value. Dissimilarity matrices implemented in KNIME are quite
efficient at these calculations. However, by default they compute
values as dissimilarities (the complement of similarities), i.e., as
distance matrices. Conversion from Tanimoto dissimilarity to
similarity is accomplished by just subtracting the value from 1
($T_s = 1 - T_d$, where $T_s$ is Tanimoto similarity and $T_d$ is
Tanimoto dissimilarity).
3. Compute a central tendency statistic (e.g., mean or median) for
all the pairwise comparisons. In contrast to Euclidean distance
(and any distance metric in general), the smaller the mean or
median similarity, the more diverse the dataset is [30].
4. Finally, for comparison, the statistic can be computed for other
reference databases or looked up in the literature if already
reported.
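The fingerprint-based procedure can be illustrated as follows, assuming each fingerprint is stored as the set of positions of its “1” bits. The three fingerprints are invented for the example, and returning 0.0 for two empty fingerprints is a convention chosen here, not something prescribed by the text.

```python
import statistics
from itertools import combinations

# Hypothetical fingerprints: each set holds the indices of the "on" bits.
FINGERPRINTS = {
    "cpd1": {1, 2, 3, 8},
    "cpd2": {2, 3, 4, 8},
    "cpd3": {5, 6, 7},
}


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity c / (a + b - c) for two sets of on-bits."""
    a, b = len(bits_a), len(bits_b)
    c = len(bits_a & bits_b)
    denominator = a + b - c
    # Convention assumed here: two empty fingerprints get similarity 0.0.
    return c / denominator if denominator else 0.0


def pairwise_similarities(fingerprints):
    """All n(n-1)/2 pairwise Tanimoto similarities."""
    return [tanimoto(x, y) for x, y in combinations(fingerprints.values(), 2)]


similarities = pairwise_similarities(FINGERPRINTS)
median_similarity = statistics.median(similarities)
# Dissimilarity is the complement of similarity (Td = 1 - Ts).
dissimilarities = [1.0 - s for s in similarities]
```

Storing on-bit indices as sets makes the counts a, b, and c direct set operations; packed bit vectors would be more memory-efficient for large libraries.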

2.2.3 Molecular Scaffolds

KNIME has nodes for finding Murcko scaffolds [36, 37]. By definition,
Murcko scaffolds contain all the cyclic systems in a molecule
as well as the linkers between them. All other decorations and
ramifications are omitted. The greatest benefit of working with
scaffold data is that, unlike molecular fingerprints, they are readily
interpreted by medicinal chemists. Nonetheless, the representation
is coarser and loses information from the side chains. Also, more
advanced methods must be applied to account for the structural
relations among the scaffolds.
It is logical and generally accepted that a dataset is more diverse
when it has a large number of different scaffolds and the compounds
are evenly distributed among them. The procedure for measuring
scaffold diversity is as follows:
1. Find Murcko scaffolds for every molecule in the dataset.
2. Compute a frequency table of the scaffolds.
3. From here, there are a number of different methods for assessing
the diversity [38]:
a. Order the scaffolds by their frequency of occurrence and
compute the median (i.e., the minimum number of scaffolds
in the database that contain at least 50% of the total entries).
Higher values in this statistic mean higher diversity, because
more scaffolds are needed to cover half of the entries.
b. Order the scaffolds by their frequency of occurrence. This
order would be an index from 1 to n, where n is the total
number of different scaffolds in the dataset. Divide all
indexes by n, such that the highest index value is 1. Using
scaffold indexes in the x-axis and their respective cumulative
proportions in the y-axis, compute the area under the curve
as a diversity statistic. This statistic admits as value any real
number in the domain [0.5, 1.0]. Lower values in this
statistic mean higher diversity.
c. Compute scaled Shannon entropy (SSE) with the formula:

$$\mathrm{SSE} = \frac{\mathrm{SE}}{\log_2 n},$$

where

$$\mathrm{SE} = -\sum_{i=1}^{n} p_i \log_2 p_i,$$

$p_i$ is the proportion in the dataset of the i-th scaffold
(calculated by dividing the occurrence of this i-th scaffold by
the total number of entries/molecules), SE is the Shannon
entropy, and $n$ is the total number of scaffolds in the dataset.
SSE takes as value a real number in the range [0, 1]. For this
statistic, higher values mean higher scaffold diversity.
4. Finally, the statistic can be computed for other reference data-
bases for comparison.
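The three scaffold statistics can be sketched from a list with one scaffold label per compound. The labels "A" to "D" are placeholders for Murcko scaffolds obtained elsewhere (for example, with the KNIME nodes), the trapezoidal rule starting at the origin is an assumed way of computing the area under the curve, and the value 0.0 returned for a single-scaffold dataset is a convention chosen here.

```python
import math
from collections import Counter

# Placeholder scaffold labels, one per compound in a hypothetical dataset.
SCAFFOLDS = ["A"] * 5 + ["B"] * 3 + ["C"] + ["D"]


def scaffold_diversity(scaffolds):
    """Return the 50% scaffold count, the AUC, and the scaled Shannon entropy."""
    counts = sorted(Counter(scaffolds).values(), reverse=True)
    total = sum(counts)
    n = len(counts)

    # Cumulative proportion of compounds covered by the i most frequent scaffolds.
    cumulative = []
    running = 0
    for count in counts:
        running += count
        cumulative.append(running / total)

    # (a) Minimum number of scaffolds containing at least 50% of the entries.
    n50 = next(i for i, y in enumerate(cumulative, start=1) if y >= 0.5)

    # (b) Area under the cumulative curve, x = i/n, trapezoids from (0, 0).
    auc = 0.0
    previous = 0.0
    for y in cumulative:
        auc += (previous + y) / (2 * n)
        previous = y

    # (c) Shannon entropy scaled by its maximum, log2(n).
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    sse = entropy / math.log2(n) if n > 1 else 0.0

    return {"n50": n50, "auc": auc, "sse": sse}


result = scaffold_diversity(SCAFFOLDS)
even = scaffold_diversity(["A", "A", "B", "B", "C", "C", "D", "D"])
```

In this example the evenly distributed dataset reaches the limiting values of the last two statistics (AUC of 0.5 and SSE of 1), whereas the skewed dataset gives an AUC of 0.675 and a lower SSE.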

2.2.4 Consensus Diversity Plots

Given the numerous variables that can be used to quantify
diversity, visual representations have been built to summarize
several of them simultaneously. These are the consensus
diversity plots (CDPs). A CDP, as defined by González-Medina
et al. [10], renders 2D diversity measured by scaffolds, fingerprints,
and physicochemical properties, together with the number of compounds
in the databases. It is also possible to integrate 3D data [24]; however,
3D data usage is not emphasized here. The steps required for
plotting a CDP from data are:
1. Curate databases; calculate diversity with physicochemical properties,
molecular fingerprints, and scaffolds (see above for details).
2. Plot the molecular fingerprints diversity in the x-axis, the scaffold
diversity in the y-axis, the physicochemical properties in a
color continuous scale, and the number of compounds in the
database as the data point size. Every data point represents a
database. (See Fig. 1 and Supplementary KNIME Workflow 3
for a few examples.)

Fig. 1 An exemplary consensus diversity plot (CDP). Each data point represents a compound database.
Molecular fingerprints diversity is plotted in the x-axis, the scaffold diversity in the y-axis, the physicochemical
properties diversity in a color continuous scale, and the relative number of compounds in the database as the
data point size. AUC area under the curve, PCP physicochemical properties

As an alternative, an online server was developed for generating
CDPs and is also available in D-Tools (see Sect. 1). A video tutorial
is available at https://youtu.be/lruo1ypKGbE, and detailed written
instructions about how to use it can be found at
http://132.248.103.152:3838/CDPlots/.

2.3 Structure–Activity Relationship Analysis

A common assumption in virtual screening is that similar molecules
are expected to have similar properties, e.g., comparable biological
activity. This assumption is called the similarity principle. Although
virtual screening is often useful for detecting active compounds, it is
reassuring to verify whether the similarity principle is valid for the
molecules under scrutiny. Such insights can be obtained through a
subtype of SAR analysis, activity landscape modeling. SAR analysis
of chemical libraries for which activity against a biological target is
known can also reveal substructures that are relevant for inhibiting
the target in question. The next paragraphs give details on some
useful methods for assessing the SAR of single and multiple libraries
simultaneously. Workflow 4 in the Supplementary Information illustrates
a KNIME implementation of the methods described below.

2.3.1 Structure–Activity Similarity Maps

Structure–activity similarity (SAS) maps are bidimensional activity
landscape representations that contrast structural similarity (e.g.,
measured with the Tanimoto coefficient of molecular fingerprints) and
activity similarity (for example, as pIC50 or pKi). Systematic pairwise
compound comparisons are included in the plot [39]. Each
point in a SAS map represents a pair of compounds and is colored
according to the most active compound of the pair. The sequence
of steps for generating and ultimately interpreting a SAS map is as
follows:
1. Given $n$ compounds in a library, compute the $n(n-1)/2$
pairwise chemical similarities as described in Sect. 2.2.2.
2. Similarly, for the same pairwise comparisons, calculate the absolute
difference in potency. All compounds should have potency
in pIC50 units (ideally, all compounds should have IC50 values
measured under the same protocol and assay conditions). pIC50
is calculated from IC50 measurements in nanomolar
concentration units with the formula:

$$\mathrm{pIC_{50}} = -\log_{10}\left(\mathrm{IC_{50}}\,[\mathrm{nM}] \times 10^{-9}\right) = 9 - \log_{10}\left(\mathrm{IC_{50}}\,[\mathrm{nM}]\right).$$

3. Plot the structural similarity in the x-axis and the potency
difference in the y-axis. The color of the data points can also
be set to render more information, for example, the maximum
potency value in the pair.
4. The resultant plot, illustrated in Fig. 2, can be divided into four
quadrants with thresholds defined a priori: (a) smooth (high
structural similarity and low activity difference), (b) activity
cliffs (high structural similarity but high activity difference),
(c) scaffold hops (low structural similarity but low activity
difference), and (d) uncertainty (low structural similarity and
high activity difference) [40–42]. Typical potency thresholds
are 2 for deep activity cliffs and 1 for shallow activity cliffs. In
the case of structural similarity, 1 or 2 standard deviations
above the mean could be used.
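The steps above can be sketched as follows, taking pIC50 as the negative decadic logarithm of the molar IC50 (so that 1000 nM corresponds to a pIC50 of 6). The IC50 values, the precomputed similarities, and the thresholds (0.7 for similarity, 2 log units for potency) are illustrative assumptions; in a real analysis the similarity threshold would be derived from the data as described in step 4.

```python
import math
from itertools import combinations

# Hypothetical IC50 values in nM and precomputed Tanimoto similarities.
IC50_NM = {"A": 10.0, "B": 10000.0, "C": 12.0}
SIMILARITY = {("A", "B"): 0.90, ("A", "C"): 0.85, ("B", "C"): 0.30}


def pic50_from_nm(ic50_nm):
    """pIC50 = -log10(IC50 in mol/L) = 9 - log10(IC50 in nM)."""
    return 9.0 - math.log10(ic50_nm)


def classify_pair(similarity, potency_difference, sim_threshold, potency_threshold):
    """Assign a compound pair to one of the four SAS map quadrants."""
    high_similarity = similarity >= sim_threshold
    high_difference = potency_difference >= potency_threshold
    if high_similarity and not high_difference:
        return "smooth"
    if high_similarity and high_difference:
        return "activity cliff"
    if not high_similarity and not high_difference:
        return "scaffold hop"
    return "uncertainty"


def sas_points(ic50_nm, similarity, sim_threshold=0.7, potency_threshold=2.0):
    """Build one SAS map point per compound pair."""
    points = []
    for a, b in combinations(sorted(ic50_nm), 2):
        pa, pb = pic50_from_nm(ic50_nm[a]), pic50_from_nm(ic50_nm[b])
        sim = similarity[(a, b)]
        diff = abs(pa - pb)
        points.append({
            "pair": (a, b),
            "similarity": sim,          # x-axis
            "potency_difference": diff,  # y-axis
            "max_potency": max(pa, pb),  # optional color scale
            "region": classify_pair(sim, diff, sim_threshold, potency_threshold),
        })
    return points


POINTS = sas_points(IC50_NM, SIMILARITY)
```

Each dictionary holds the coordinates and the quadrant of one data point, so the list can be passed directly to any plotting library.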
Alternatively, a web application for plotting SAS maps can be
found in D-Tools at https://unam-shiny-difacquim.shinyapps.io/ActLSmaps/.
A video tutorial is available at https://youtu.be/52jHCcg5mXU.

Fig. 2 Structure–activity similarity (SAS) maps. Each data point represents a pair of compounds. The x-axis
plots the structural similarity, while the y-axis plots the activity difference. Four quadrants are formed as
described in Sect. 2.3.1. A color scale might be added to represent density of points or the maximum activity
value in the pair. Tc Tanimoto coefficient

2.3.2 Scaffold Enrichment Factor

SAR can also be explored based on chemical scaffolds. For every
dataset with activity annotations against a particular biological
target, every scaffold can be considered as a cluster of molecules.
It is then interesting to find which clusters have a higher or
lower proportion of active molecules, pointing towards clusters of
highly related molecules that tend to be more or less active than the
average. This is the basis of the calculation of enrichment factors
(EF) for scaffolds, which are obtained as follows:
1. If activity is represented quantitatively in the dataset, a thresh-
old of activity should be set a priori. Often, a pIC50 of 5–6 or
more is useful for defining a compound as active.
2. Essentially, the EF is a ratio of proportions.
Specifically, the proportion of active compounds with a given
scaffold is divided by the proportion of active compounds in
the general dataset. More formally, for every scaffold λ, an EF
is calculated using the equation [43]:

$$EF(C_\lambda) = \frac{Act(C_\lambda)}{Act(C)},$$

where

$$Act(C_\lambda) = \frac{|C_\lambda^{+}|}{|C_\lambda|} \quad \text{and} \quad Act(C) = \frac{|C^{+}|}{|C|},$$

where, in turn, $|C|$ is the total number of compounds tested, $|C^{+}|$
the number of active compounds, $|C_\lambda|$ the total number of
tested compounds with scaffold λ, and $|C_\lambda^{+}|$ the number of
compounds with scaffold λ that are active against the target. Values
above 1 imply a positively enriched scaffold (i.e., a scaffold that
has a higher proportion of active compounds than the general
dataset), while values below 1 have the opposite meaning.
3. EFs are amenable to hypothesis testing. To find scaffolds that are
enriched with statistical significance, chi-squared tests can be run
using a 2 × 2 contingency table for the compounds, with the
variables being whether they have a given scaffold and whether
they are active. Since values in the cells are sometimes
smaller than 5, which interferes with the analytic calculation of
the chi-squared statistic, simulated values can be obtained instead.
4. After computing p-values for every scaffold, the false discovery
rate correction (or another method for correcting for multiple
hypothesis testing) should be applied.
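The enrichment factor and its significance test can be sketched as follows. The dataset is invented, the analytic chi-squared p-value (1 degree of freedom, no continuity correction) is used only for brevity and is unreliable for the small cell counts the text warns about, and the Benjamini-Hochberg procedure is assumed as the false discovery rate correction.

```python
import math
from collections import Counter

# Hypothetical dataset: (scaffold label, is_active) for each tested compound.
COMPOUNDS = (
    [("S1", True)] * 3 + [("S1", False)]
    + [("S2", True)] + [("S2", False)] * 5
)


def enrichment_factors(compounds):
    """Return {scaffold: EF}, with EF = Act(C_lambda) / Act(C)."""
    total = len(compounds)
    total_active = sum(1 for _, active in compounds if active)
    if total_active == 0:
        raise ValueError("EF is undefined when no compound is active")
    per_scaffold = Counter(scaffold for scaffold, _ in compounds)
    active_per_scaffold = Counter(s for s, active in compounds if active)
    baseline = total_active / total  # Act(C)
    return {s: (active_per_scaffold[s] / n) / baseline
            for s, n in per_scaffold.items()}


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic and analytic p-value for a 2 x 2 table.

    a: scaffold and active,   b: scaffold and inactive,
    c: other and active,      d: other and inactive.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0, 1.0
    statistic = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


EF = enrichment_factors(COMPOUNDS)
STATISTIC, P_VALUE = chi_squared_2x2(3, 1, 1, 5)  # table for scaffold S1
```

With only ten compounds the table for S1 has cells below 5, so in a real analysis the p-value would be obtained by simulation, as the text recommends, rather than from the analytic formula used here.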

2.3.3 Degree of Polypharmacology

The methods for SAR analysis mentioned above are useful for single-target
studies. However, sometimes inhibition data of multiple
targets are available for single compounds. These data could lead
to polypharmacology studies. Maggiora and Gokhale recently formalized
the notion of polypharmacology and polyspecificity [44].
In practical terms, the degree of polypharmacology of a molecule
equals the number of different targets against which the molecule is
active, while the analogous degree of polyspecificity of a target
equals the number of different molecules that are active against
the target.
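Both degrees reduce to counting over a compound-by-target activity table, as in the sketch below. The pIC50 values, the use of `None` for untested pairs, and the activity threshold of 6 are illustrative assumptions.

```python
# Hypothetical pIC50 values; None marks an untested compound-target pair.
ACTIVITY = {
    "cpd1": {"T1": 7.2, "T2": 6.1, "T3": 4.3},
    "cpd2": {"T1": 5.0, "T2": None, "T3": 4.9},
    "cpd3": {"T1": 6.8, "T2": 4.0, "T3": 6.5},
}
THRESHOLD = 6.0  # assumed pIC50 cutoff for calling a compound active


def is_active(value, threshold=THRESHOLD):
    """A tested value at or above the threshold counts as active."""
    return value is not None and value >= threshold


def degree_of_polypharmacology(activity, threshold=THRESHOLD):
    """Number of targets against which each compound is active."""
    return {
        compound: sum(is_active(v, threshold) for v in targets.values())
        for compound, targets in activity.items()
    }


def degree_of_polyspecificity(activity, threshold=THRESHOLD):
    """Number of compounds that are active against each target."""
    degrees = {}
    for targets in activity.values():
        for target, value in targets.items():
            degrees[target] = degrees.get(target, 0) + int(is_active(value, threshold))
    return degrees


polypharmacology = degree_of_polypharmacology(ACTIVITY)
polyspecificity = degree_of_polyspecificity(ACTIVITY)
```

Treating untested pairs as inactive underestimates both degrees when the activity table is sparse, so the coverage of the data should be reported alongside the counts.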

2.3.4 Multiple Structure–Activity Relationship Analysis

A review addressing SmARt analysis in epigenetics was recently published [3].
Two of the most useful SmARt tools are methodologically
explained in the following paragraphs: dual-activity difference
(DAD) maps and structure–promiscuity index difference (SPID).
As for other SAR analyses, Workflow 4 in the Supplementary
Information contains practical tools for computing them.

Dual-Activity Difference Maps

DAD maps are designed to compare at once the activity of compounds
against two biological endpoints, in contrast to SAS maps
[45]. However, DAD maps lose structural information, which is
accounted for with SAS maps. The procedure for generating a
DAD map is straightforward:
1. Select a library of compounds with the activity of each independently
measured against two different endpoints.
2. Plot in the x-axis one of the measurements and on the y-axis the
other. A general form of a DAD map is presented in Fig. 3.

Structure–Promiscuity Index Difference

Aiming towards a statistic for quantifying the relationship between
structural similarity and polypharmacology (or promiscuity), the
SPID was created [46]. It is computed with the formula:
identify now Hamilton of Bangour, young David Malloch, a William
Crawford, and a William Walkinshaw,—contained about sixty songs
of Ramsay’s own composition. Similarly, among several mock-
antiques by modern hands inserted into The Evergreen, were two
by Ramsay himself, entitled The Vision and The Eagle and Robin
Redbreast.
The time had come for Ramsay’s finest and most characteristic
performance. More than once, in his miscellanies hitherto, he had
tried the pastoral form in Scotch, whether from a natural tendency to
that form or induced by recent attempts in the English pastoral by
Ambrose Philips, Pope, and Gay. Besides his pastoral elegy on the
death of Addison, and another on the death of Prior, he had written a
pastoral dialogue of real Scottish life in 162 lines, entitled Patie and
Roger, introduced by this description:—
“Beneath the south side of a craigy bield,
Where a clear spring did halesome water yield,
Twa youthfu’ shepherds on the gowans lay,
Tending their flocks ae bonny morn of May:
Poor Roger graned till hollow echoes rang,
While merry Patie hummed himsel a sang.”

This piece, and two smaller pastoral pieces in the same vein, called
Patie and Peggy and Jenny and Meggie, had been so much liked
that Ramsay had been urged by his friends to do something more
extensive in the shape of a pastoral story or drama. He had been
meditating such a thing through the year 1724, while busy with his
two editorial compilations; and in June 1725 the result was given to
the public in The Gentle Shepherd: A Scots Pastoral Comedy.
Here the three pastoral sketches already written were inwoven into a
simply-constructed drama of rustic Scottish life as it might be
imagined among the Pentland Hills, near Edinburgh, at that time,
still within the recollection of very old people then alive, when the
Protectorates of Cromwell and his son had come to an end and Monk
had restored King Charles. The poem was received with enthusiastic
admiration. There had been nothing like it before in Scottish
literature, or in any other; nothing so good of any kind that could be
voted as even similar; and this was at once the critical verdict. It is a
long while ago, and there are many spots in Edinburgh which
compete with one another in the interest of their literary
associations; but one can stand now with particular pleasure for a
few minutes any afternoon opposite that decayed house in the High
Street, visible as one is crossing from the South Bridge to the North
Bridge, where Allan Ramsay once had his shop, and whence the first
copies of The Gentle Shepherd were handed out, some day in June
1725, to eager Edinburgh purchasers.
The tenancy of this house by Ramsay lasted but a year longer.
He had resolved to add to his general business of bookselling and
publishing that of a circulating library, the first institution of the
kind in Edinburgh. For this purpose he had taken new premises, still
in the High Street, but in a position even more central and
conspicuous than that of “The Mercury” opposite Niddry’s Wynd.
They were, in fact, in the easternmost house of the Luckenbooths, or
lower part of that obstructive stack of buildings, already mentioned,
which once ran up the High Street alongside of St. Giles’s Church,
dividing the traffic into two narrow and overcrowded channels. It is
many years since the Luckenbooths and the whole obstruction of
which they formed a part were swept away; but from old prints we
can see that the last house of the Luckenbooths to the east was a tall
tenement of five storeys, with its main face looking straight down the
lower slope of the High Street towards the Canongate. The strange
thing was that, though thus in the very heart of the bustle of the town
as congregated round the Cross, the house commanded from its
higher windows a view beyond the town altogether, away to Aberlady
Bay and the farther reaches of sea and land in that direction. It was
into this house that Ramsay removed in 1726, when he was exactly
forty years of age. The part occupied by him was the flat immediately
above the basement floor, but perhaps with that floor in addition.
The sign he adopted for the new premises was one exhibiting the
heads or effigies of Ben Jonson and Drummond of Hawthornden.
Having introduced Ramsay into this, the last of his Edinburgh
shops, we have reached the point where our present interest in him
all but ends. In 1728, when he had been two years in the new
premises, he published a second volume of his collected poems,
under the title of Poems by Allan Ramsay, Volume II., in a
handsome quarto matching the previous volume of 1721, and
containing all the pieces he had written since the appearance of that
volume; and in 1730 he published A Collection of Thirty Fables.
These were his last substantive publications, and with them his
literary career may be said to have come to a close. Begun in the last
years of the reign of Queen Anne, and continued through the whole
of the reign of George I., it had just touched the beginning of that of
George II., when it suddenly ceased. Twice or thrice afterwards at
long intervals he did scribble a copy of verses; but, in the main, from
his forty-fifth year onwards, he rested on his laurels. Thenceforward
he contented himself with his bookselling, the management of his
circulating library, and the superintendence of the numerous
editions of his Collected Poems, his Gentle Shepherd, and his Tea
Table Miscellany that were required by the public demand, and the
proceeds of which formed a good part of his income. It would be a
great mistake, however, to suppose that, when Allan Ramsay’s time
of literary production ended, the story of his life in Edinburgh also
came to a close, or ceased to be important. For eight-and-twenty
years longer, or almost till George II. gave place to George III.,
Ramsay continued to be a living celebrity in the Scottish capital,
known by figure and physiognomy to all his fellow-citizens, and
Ramsay’s bookshop at the end of the Luckenbooths, just above the
Cross, continued to be one of the chief resorts of the well-to-do
residents, and of chance visitors of distinction. Now and then,
indeed, through the twenty-eight years, there are glimpses of him
still in special connections with the literary, as well as with the social,
history of Edinburgh. When the English poet Gay, a summer or two
before his death in 1732, came to Edinburgh on a visit, in the
company of his noble patrons, the Duke and Duchess of
Queensberry, and resided with them in their mansion of
Queensberry House in the Canongate,—now the gloomiest and
ugliest-looking house in that quarter of the old town, but then
reckoned of palatial grandeur,—whither did he tend daily, in his
saunterings up the Canongate, but to Allan Ramsay’s shop? One
hears of him as standing there with Allan at the window to have the
city notabilities and oddities pointed out to him in the piazza below,
or as taking lessons from Allan in the Scottish words and idioms of
the Gentle Shepherd, that he might explain them better to Mr. Pope
when he went back to London.
Some years later, when Ramsay had reached the age of fifty, and
he and his wife were enjoying the comforts of his ample success, and
rejoicing in the hopes and prospects of their children,—three
daughters, “no ae wally-draggle among them, all fine girls,” as
Ramsay informs us, and one son, a young man of three-and-twenty,
completing his education in Italy for the profession of a painter,—
there came upon the family what threatened to be a ruinous disaster.
Never formally an anti-Presbyterian, and indeed regularly to be seen
on Sundays in his pew in St. Giles’s High Kirk, but always and
systematically opposed to the unnecessary social rigours of the old
Presbyterian system, and of late under a good deal of censure from
clerical and other strict critics on account of the dangerous nature of
much of the literature put in circulation from his library, Ramsay
had ventured at last on a new commercial enterprise, which could
not but be offensive on similar grounds to many worthy people,
though it seems to have been acceptable enough to the Edinburgh
community generally. Edinburgh having been hitherto deficient in
theatrical accommodation, and but fitfully supplied with dramatic
entertainments, he had, in 1736, started a new theatre in Carrubber’s
Close, near to his former High Street shop. He was looking for great
profits from the proprietorship of this theatre and his partnership in
its management. Hardly had he begun operations, however, when
there came the extraordinary statute of 10 George II. (1737),
regulating theatres for the future all over Great Britain. As by this
statute there could be no performance of stage-plays out of London
and Westminster, save when the King chanced to be residing in some
other town, Ramsay’s speculation collapsed, and all the money he
had invested in it was lost. It was a heavy blow; and he was moved by
it to some verses of complaint to his friend Lord President Forbes
and the other judges of the Court of Session. While telling the story
of his own hardship in the case, he suggests that an indignity had
been done by the new Act to the capital of Scotland:—
“Shall London have its houses twa
And we doomed to have nane ava’?
Is our metropolis, ance the place
Where langsyne dwelt the royal race
Of Fergus, this gate dwindled down
To a level with ilk clachan town,
While thus she suffers the subversion
Of her maist rational diversion?”
However severe the loss to Ramsay at the time, it was soon tided
over. Within six years he is found again quite at ease in his worldly
fortunes. His son, for some years back from Italy, was in rapidly
rising repute as a portrait-painter, alternating between London and
Edinburgh in the practice of his profession, and a man of mark in
Edinburgh society on his own account; and, whether by a junction of
the son’s means with the father’s, or by the father’s means alone, it
was now that there reared itself in Edinburgh the edifice which at the
present day most distinctly preserves for the inhabitants the memory
of the Ramsay family in their Edinburgh connections. The
probability is that, since Allan had entered on his business premises
at the end of the Luckenbooths, his dwelling-house had been
somewhere else in the town or suburbs; but in 1743 he built himself a
new dwelling-house on the very choicest site that the venerable old
town afforded. It was that quaint octagon-shaped villa, with an
attached slope of green and pleasure-ground, on the north side of the
Castle Hill, which, as well from its form as from its situation, attracts
the eye as one walks along Princes Street, and which still retains the
name of Ramsay Lodge. The wags of the day, making fun of its
quaint shape, likened the construction to a goose-pie; and something
of that fancied resemblance may be traced even now in its extended
and improved proportions. But envy may have had a good deal to do
with the comparison. It is still a neat and comfortable dwelling
internally, while it commands from its elevation an extent of scenery
unsurpassed anywhere in Europe. The view from it ranges from the
sea-mouth of the Firth of Forth on the east to the first glimpses of the
Stirlingshire Highlands on the west, and again due north across the
levels of the New Town, and the flashing waters of the Firth below
them, to the bounding outline of the Fifeshire hills. When, in 1743,
before there was as yet any New Town at all, Allan Ramsay took up
his abode in this villa, he must have been considered a fortunate and
happy man. His entry into it was saddened, indeed, by the death of
his wife, which occurred just about that time; but for fourteen years
of widowerhood, with two of his daughters for his companions, he
lived in it serenely and hospitably. During the first nine years of
those fourteen he still went daily to his shop in the Luckenbooths,
attending to his various occupations, and especially to his circulating
library, which is said to have contained by this time about 30,000
volumes; but for the last five or six years he had entirely relinquished
business. There are authentic accounts of his habits and demeanour
in his last days, and they concur in representing him as one of the
most charming old gentlemen possible, vivacious and sprightly in
conversation, full of benevolence and good humour, and especially
fond of children and kindly in his ways for their amusement. He died
on the 7th of January 1758, in the seventy-second year of his age, and
was buried in Greyfriars Churchyard.
Ramsay had outlived nearly all the literary celebrities who had
been his contemporaries during his own career of active authorship,
ended nearly thirty years before. Swift and Pope were gone, after
Gay, Steele, Arbuthnot, and others of the London band, who had
died earlier. Of several Scotsmen, his juniors, who had stepped into
the career of literature after he had shown the way, and had attained
to more or less of poetic eminence under his own observation, three,
—Robert Blair, James Thomson, and Hamilton of Bangour,—had
predeceased him. Their finished lives, with all the great radiance of
Thomson’s, are wholly included in the life of Allan Ramsay. David
Malloch, who had been an Edinburgh protégé of Ramsay’s, but had
gone to London and Anglicised himself into “Mallet,” was about the
oldest of his literary survivors into another generation; but in that
generation, as Scotsmen of various ages, from sixty downwards to
one-and-twenty, living, within Scotland or out of it, at the date of
Ramsay’s death, we count Lord Kames, Armstrong, Reid, Hume,
Lord Monboddo, Hugh Blair, George Campbell, Smollett, Wilkie,
Blacklock, Robertson, John Home, Adam Smith, Adam Ferguson,
Lord Hailes, Falconer, Meikle, and Beattie. Such of these as were
residents in Edinburgh had known Allan Ramsay personally; others
of them had felt his influence indirectly; and all must have noted his
death as an event of some consequence.
The time is long past for any exaggeration of Allan Ramsay’s
merits. But, call him only a slipshod little Horace of Auld Reekie,
who wrote odes, epistles, satires, and other miscellanies in Scotch
through twenty years of the earlier part of the eighteenth century,
and was also, by a happy chance, the author of a unique and
delightful Scottish pastoral, it remains true that he was the most
considerable personality in Scottish literary history in order of time
after Drummond of Hawthornden, or, if we think only of the
vernacular, after Sir David Lindsay, and that he did more than any
other man to stir afresh a popular enthusiasm for literature in
Scotland after the Union with England. All in all, therefore, it is with
no small interest that, in one’s walks along the most classic
thoroughfare of the present Edinburgh, one gazes at the white stone
statue of Allan Ramsay, from the chisel of Sir John Steell, which
stands in the Gardens just below the famous “goose-pie villa.” It
looks as if the poet had just stepped down thence in his evening
habiliments to see things thereabouts in their strangely changed
condition. By the tact of the sculptor, he wears, one observes, not a
wig, but the true poetic night-cap or turban.
LADY WARDLAW AND THE BARONESS NAIRNE[6]

In 1719 there was published in Edinburgh, in a tract of twelve
folio pages, a small poem, 27 stanzas or 216 lines long, entitled
Hardyknute, a Fragment. It was printed in old spelling, to look like
a piece of old Scottish poetry that had somehow been recovered; and
it seems to have been accepted as such by those into whose hands the
copy had come, and who were concerned in having it published.
Among these were Duncan Forbes of Culloden, afterwards Lord
President of the Court of Session, and Sir Gilbert Elliott of Minto,
afterwards Lord Justice-Clerk; but there is something like proof that
it had come into their hands indirectly from Sir John Hope Bruce of
Kinross, baronet, who died as late as 1766 at a great age, in the rank
of lieutenant-general, and who, some time before 1719, had sent a
manuscript copy of it to Lord Binning, with a fantastic story to the
effect that the original, in a much defaced vellum, had been found, a
few weeks before, in a vault at Dunfermline.
The little thing, having become popular in its first published
form, was reproduced in 1724 by Allan Ramsay in his Evergreen,
which professed to be “a collection of Scots Poems wrote by the
ingenious before 1600”; but it there appeared with corrections and
some additional stanzas. In 1740 it had the honour of a new
appearance in London, under anonymous editorship, and with the
title “Hardyknute, a Fragment; being the first Canto of an Epick
Poem: with general remarks and notes.” The anonymous editor, still
treating it as a genuine old poem, of not later than the sixteenth
century, praises it very highly. “There is a grandeur, a majesty of
sentiment,” he says, “diffused through the whole: a true sublime,
which nothing can surpass.” It was but natural that a piece of which
this could be said should be included by Percy in his Reliques of
Ancient English Poetry, published in 1765. It appeared, accordingly,
in the first edition of that famous book, still as an old poem and in
antique spelling; and it was reprinted in the subsequent editions
issued by Percy himself in 1767, 1775, and 1794, though then with
some added explanations and queries.
It was through Percy’s collection that the poem first became
generally known and popular. Even there, though in very rich
company, it was singled out by competent critics for special
admiration. But, indeed, good judges, who had known it in its earlier
forms, had already made it a favourite. The poet Gray admired it
much; and Thomas Warton spoke of it as “a noble poem,” and
introduced an enthusiastic reference to it into one of his odes. Above
all, it is celebrated now as having fired the boyish genius of Sir
Walter Scott. “I was taught Hardyknute by heart before I could read
the ballad myself,” he tells us, informing us further that the book out
of which he was taught the ballad was Allan Ramsay’s Evergreen of
1724, and adding, “It was the first poem I ever learnt, the last I shall
ever forget.” In another place he tells us more particularly that it was
taught him out of the book by one of his aunts during that visit to his
grandfather’s farmhouse of Sandyknowe in Roxburghshire on which
he had been sent when only in his third year for country air and
exercise on account of his delicate health and lameness, and which
he remembered always as the source of his earliest impressions and
the time of his first consciousness of existence. He was accustomed
to go about the farmhouse shouting out the verses of the ballad
incessantly, so that the Rev. Dr. Duncan, the minister of the parish,
in his calls for a sober chat with the elder inmates, would complain of
the interruption and say, “One may as well speak in the mouth of a
cannon as where that child is.” Hardyknute, we may then say, was
the first thing in literature that took hold of the soul and imagination
of Scott; and who knows how far it may have helped to determine the
cast and direction of his own genius through all the future?
Afterwards, through his life in Edinburgh, Ashestiel, and Abbotsford,
he was never tired of repeating snatches of the strong old thing he
had learnt at Sandyknowe; and the very year before his death (1831)
we find him, when abroad at Malta in the vain hope of recruiting his
shattered frame, lamenting greatly, in a conversation about ballad-
poetry, that he had not been able to persuade his friend Mr. John
Hookham Frere to think so highly of the merits of Hardyknute as he
did himself.
What is the piece of verse so celebrated? It must be familiar to
many; but we may look at it again. We shall take it in its later or
more complete form, as consisting of 42 stanzas or 336 lines; in
which form, though it is still only a fragment, the conception or story
is somewhat more complex, more filled out, than in the first
published form of 1719. The fragment opens thus:—
“Stately stept he east the wa’,
And stately stept he west;
Full seventy years he now had seen,
With scarce seven years of rest.
He lived when Britons’ breach of faith
Wrocht Scotland mickle wae;
And aye his sword tauld, to their cost,
He was their deadly fae.

High on a hill his castle stood,
With halls and towers a-hicht,
And guidly chambers fair to see,
Whare he lodged mony a knicht.
His dame, sae peerless ance and fair,
For chaste and beauty deemed,
Nae marrow had in a’ the land,
Save Eleanour the Queen.

Full thirteen sons to him she bare,
All men of valour stout;
In bluidy fecht with sword in hand
Nine lost their lives bot doubt:
Four yet remain; lang may they live
To stand by liege and land!
High was their fame, high was their micht,
And high was their command.

Great love they bare to Fairly fair,
Their sister saft and dear:
Her girdle shawed her middle jimp,
And gowden glist her hair.
What waefu’ wae her beauty bred,
Waefu’ to young and auld;
Waefu’, I trow, to kith and kin,
As story ever tauld!”
Here we see the old hero Hardyknute in peace in the midst of his
family, his fighting days supposed to be over, and his high castle on
the hill, where he and his lady dwell, with their four surviving sons
and their one daughter, Fairly Fair, one of the lordly boasts of a
smiling country. But suddenly there is an invasion. The King of
Norse, puffed up with power and might, lands in fair Scotland; and
the King of Scotland, hearing the tidings as he sits with his chiefs,
“drinking the blude-red wine,” sends out summonses in haste for all
his warriors to join him. Hardyknute receives a special message.
“Then red, red grew his dark-brown cheeks;
Sae did his dark-brown brow;
His looks grew keen, as they were wont
In dangers great to do.”

Old as he is, he will set out at once, taking his three eldest sons with
him, Robin, Thomas, and Malcolm, and telling his lady in his
farewell to her:—
“My youngest son sall here remain
To guaird these stately towers,
And shoot the silver bolt that keeps
Sae fast your painted bowers.”

And so we take leave of the high castle on the hill, with the lady,
her youngest son, and Fairly fair, in it, and follow the old lord and his
other three sons over the moors and through the glens as they ride to
the rendezvous. On their way they encounter a wounded knight,
lying on the ground and making a heavy moan:—
“‘Here maun I lie, here maun I die,
By treachery’s false guiles;
Witless I was that e’er gave faith
To wicked woman’s smiles.’”

Hardyknute, stopping, comforts him; says that, if he can but mount
his steed and manage to get to his castle on the hill, he will be tended
there by his lady and Fairly fair herself; and offers to detach some of
his men with him for convoy.
“With smileless look and visage wan
The wounded knicht replied:
‘Kind chieftain, your intent pursue,
For here I maun abide.

‘To me nae after day nor nicht
Can e’er be sweet or fair;
But soon, beneath some drapping tree,
Cauld death sall end my care.’”

Farther pleading by Hardyknute avails nothing; and, as time presses,
he has to depart, leaving the wounded knight, so far as we can see, on
the ground as he had found him, still making his moan. Then, after
farther riding over a great region, called vaguely Lord Chattan’s land,
we have the arrival of Hardyknute and his three sons in the King of
Scotland’s camp, minstrels marching before them playing pibrochs.
Hardly have they been welcomed when the battle with the Norse
King and his host is begun. It is described at considerable length, and
with much power, though confusedly, so that one hardly knows who
is speaking or who is wounded amid the whirr of arrows, the
shouting, and the clash of armour. One sees, however, Hardyknute
and two of his sons fighting grandly in the pell-mell. At last it is all
over, and we know that the Norse King and his host have been
routed, and that Scotland has been saved.
“In thraws of death, with wallert cheek,
All panting on the plain,
The fainting corps of warriors lay,
Ne’er to arise again:
Ne’er to return to native land;
Nae mair wi’ blythesome sounds
To boist the glories of the day
And shaw their shinand wounds.

On Norway’s coast the widowed dame
May wash the rock with tears,
May lang look ower the shipless seas,
Before her mate appears.
Cease, Emma, cease to hope in vain:
Thy lord lies in the clay;
The valiant Scots nae reivers thole
To carry life away.

There, on a lea where stands a cross
Set up for monument,
Thousands full fierce, that summer’s day,
Filled keen war’s black intent.
Let Scots, while Scots, praise Hardyknute;
Let Norse the name aye dread;
Aye how he foucht, aft how he spared,
Sall latest ages read.”

Here the story might seem to end, and here perhaps it was intended
at first that it should end; but in the completer copies there are three
more stanzas, taking us back to Hardyknute’s castle on the high hill.
We are to fancy Hardyknute and his sons returning joyfully thither
after the great victory:—
“Loud and chill blew the westlin wind,
Sair beat the heavy shower;
Mirk grew the nicht ere Hardyknute
Wan near his stately tower:
His tower, that used with torches’ bleeze
To shine sae far at nicht,
Seemed now as black as mourning weed:
Nae marvel sair he sich’d.

‘There’s nae licht in my lady’s bower;
There’s nae licht in my hall;
Nae blink shines round my Fairly fair,
Nor ward stands on my wall.
What bodes it? Robert, Thomas, say!’
Nae answer fits their dread.
‘Stand back, my sons! I’ll be your guide!’
But by they passed wi’ speed.

‘As fast I have sped ower Scotland’s faes.’
There ceased his brag of weir,
Sair shamed to mind oucht but his dame
And maiden Fairly fair.
Black fear he felt, but what to fear
He wist not yet with dread:
Sair shook his body, sair his limbs;
And all the warrior fled.”

And so the fragment really ends, making us aware of some dreadful
catastrophe, though what it is we know not. Something ghastly has
happened in the castle during Hardyknute’s absence, but it is left
untold. Only, by a kind of necessity of the imagination, we connect it
somehow with that wounded knight whom Hardyknute had met
lying on the ground as he was hurrying to the war, and whom he had
left making his moan. Was he a fiend, or what?
It is quite useless to call this a historical ballad. There was a
reference, perhaps, in the author’s mind, to the battle of Largs in
Ayrshire, fought by the Scots in 1263, in the reign of Alexander III.,
against the invading King Haco of Norway; and there is a Fairly
Castle on a hill near Largs which may have yielded a suggestion and a
name. But, in truth, any old Scottish reign, and any Norse invasion,
will do for time and basis, and the ballad is essentially of the
romantic kind, a story snatched from an ideal antique, and appealing
to the pure poetic imagination. A battle is flung in; but what rivets
our interest is the hero Hardyknute, a Scottish warrior with a Danish
name, and that stately castle of his, somewhere on the top of a hill, in
which he dwelt so splendidly with his lady, his four sons, and their
sister Fairly fair, till he was called once more to war, and in which
there was some ghastly desolation before his return. Such as it is, we
shall all agree, I think, with Gray, Warton, Scott, and the rest of the
best critics, in admiring the fragment. It has that something in it
which we call genius.
It seems strange now that any critic could ever have taken the
ballad for a really old one, to be dated from the sixteenth century or
earlier. Apart from the trick of old spelling, and affectation of the
antique in a word or two, the phraseology, the manner, the cadence,
the style of the Scotch employed, are all of about the date of the first
publication of the ballad, the first quarter of the eighteenth century.
The phrase “Let Scots, while Scots, praise Hardyknute,” and the
phrase “And all the warrior fled,” are decisive; and, while there might
be room for the supposition that some old legend suggested the
subject to the author, the general cast of the whole forbids the idea
that it is merely a version of some transmitted original.
Suspicions, indeed, of the modern authorship of Hardyknute
had arisen in various quarters long before any one person in
particular was publicly named as the author. That was first done by
Percy in 1767, in the second edition of his Reliques, when he gave his
reasons for thinking, from information transmitted to him from
Scotland by Sir David Dalrymple, Lord Hailes, that the ballad was
substantially the composition of a Scottish lady, who had died in
1727, eight years after it had first appeared in its less perfect form,
and three years after it had appeared with the improvements and the
additional stanzas. That lady was Elizabeth Halket, born in 1677, one
of the daughters of Sir Charles Halket of Pitfirran in Fifeshire,
baronet, but who had changed her name to Wardlaw in the year
1696, when she became the wife of Sir Henry Wardlaw of Pitreavie,
also a Fifeshire baronet. All subsequent evidence has confirmed the
belief that this Lady Wardlaw was the real author of Hardyknute,
though, to mystify people, it was first given out by her relatives as an
ancient fragment. This was the statement more especially of the
already-mentioned Sir John Hope Bruce of Kinross, who was one of
her brothers-in-law.
Of Lady Wardlaw herself we hear nothing more distinct than
that she was “a woman of elegant accomplishments, who wrote other
poems, and practised drawing and cutting paper with her scissors,
and who had much wit and humour, with great sweetness of
temper.” So we must be content to imagine her,—a bright-minded
and graceful lady, living in Fifeshire, or coming and going between
Fifeshire and Edinburgh, nearly two centuries ago, and who, while
attending to her family duties and the duties of her station, could
cherish in secret a poetic vein peculiarly her own, and produce at
least one fine ballad of an ideal Scottish antique. This in itself would
be much. For that was the age of Queen Anne and of the first of the
Georges, when poetry of an ideal or romantic kind was perhaps at its
lowest ebb throughout the British Islands, and the poetry most in
repute was that of the modern school of artificial wit and polish
represented by Addison and Pope.
But this is not all. In the year 1859 the late Mr. Robert Chambers
published a very ingenious and interesting essay entitled “The
Romantic Scottish Ballads: their Epoch and Authorship.” The ballads
to which he invited critical attention were the particular group which
includes Sir Patrick Spens, Gil Morrice, Edward Edward, The Jew’s
Daughter, Gilderoy, Young Waters, Edom o’ Gordon, Johnnie of
Braidislee, Mary Hamilton, The Gay Goss Hawk, Fause Foodrage,
The Lass of Lochryan, Young Huntin, The Douglas Tragedy, Clerk
Saunders, Sweet William’s Ghost, and several others. With but one
or two exceptions, these were first given to the world either in Percy’s
Reliques in 1765, or in the subsequent collections of Herd (1769),
Scott (1802), and Jamieson (1806); but, since they were published,
they have been favourites with all lovers of true poetry,—the “grand
ballad” of Sir Patrick Spens, as Coleridge called it, ranking perhaps
highest, on the whole, in general opinion. There is a certain common
character in all the ballads of the group, a character of genuine
ideality, of unconnectedness or but hazy connectedness with
particular time or place, of a tendency to the weirdly, and also of a
high-bred elegance and lightsome tact of expression, distinguishing
them from the properly historical Scottish Ballads, such as the Battle
of Otterbourne, or the Border Ballads proper, such as Kinmont
Willie, or the homely rustic ballads of local or family incident of
which so many have been collected. Hence the distinctive name of
“romantic,” usually applied to them.
Respecting these ballads the common theory was, and still is,
that they are very old indeed,—that they are the transmitted oral
versions of ballads that were in circulation among the Scottish
people before the Reformation. This theory Mr. Chambers
challenged, and by a great variety of arguments. Not only was it very
suspicious, he said, that there were no ancient manuscripts of them,
and that, save in one or two cases, they had never been heard of till
the eighteenth century; but the internal evidence, of conception,
sentiment, costume, and phraseology,—not in lines and passages
merely, where change from an original might be supposed, but
through and through, and back to the very core of any supposed
original,—all pointed, he maintained, to a date of composition not
farther back than the beginning of the century in which they first
came into print. He maintained farther that they all reveal the hand
of some person of superior breeding and refinement, with a
cultivated literary expertness and sense of the exquisite, and that,
just as the difference of age would be seen if one of them were placed
side by side with an authentic piece of old Scottish poetry of the
sixteenth century, so would this other difference of refined or
cultured execution be at once seen if one of them were placed side by
side with a genuine popular ballad of lowly origin, such as used to
please in sheets on street-stalls and in pedlars’ chap-books. Farther
still, in all or most of the ballads concerned, there are, he argued,
traces of feminine perception and feeling. And so, still pressing the
question, and noting the recurrence of phrases and ideas from ballad
to ballad of the group, not to be found in other ballads, but looking
like the acquired devices of one and the same writer’s fancy,—some
of the most remarkable of which recurring ideas and phrases he
chased up to the ballad of Hardyknute,—he arrived at the conclusion
that there was a “great likelihood” that all or most of the ballads he
was considering were either absolutely the inventions of Lady
Wardlaw of Pitreavie, or such complete recasts by her of traditional
fragments that she might be called the real author. He would not
advance the conclusion as more than a “great likelihood,” and he
allowed that it might be still controverted; but he cited in its favour
the fact that so high an authority as Mr. David Laing had previously
intimated his impression that Hardyknute and Sir Patrick Spens
were by the same hand.
Were Mr. Chambers’s conclusion to be verified, it would be a
sore wrench to the patriotic prejudices of many to have to abandon
the long-cherished fancy of the immemorial, or at least remote,
antiquity of so many fine Scottish favourites. But what a
compensation! For then that Lady Wardlaw whom we can already
station, for her Hardyknute alone, as undoubtedly one woman of
genius in the poverty-stricken Scotland of the beginning of the
eighteenth century, would shine out with greatly increased radiance
as the author of a whole cycle of the finest ballad-pieces in our
language, a figure of very high importance in Scottish literary
history, a precursor or sister of Burns and of Scott. For my own part,
I would willingly submit to the wrench for a compensation so
splendid. I am bound to report, however, that Mr. Chambers’s
speculation of 1859 was controverted strenuously at the time, has
been pronounced a heresy, and does not seem to have been
anywhere generally accepted. It was controverted especially, within a
year from its appearance, in a pamphlet of reply by Mr. Norval Clyne
of Aberdeen, entitled “The Scottish Romantic Ballads and the Lady
Wardlaw Heresy”; and I observe that Professor Child of America, in
his great Collection of English and Scottish Ballads, pays no respect
to it, treats it as exploded by Mr. Clyne’s reply, and expressly
dissociates Sir Patrick Spens and other ballads of the class from
Hardyknute. It may be enough, in these circumstances, merely to
intimate my opinion that the controversy is by no means closed.
There were shrewder and deeper suggestions, I think, in Mr.
Chambers’s paper of 1859 than Mr. Clyne was able to obviate; and,
having observed that most of the lore on the subject used by Mr.
Clyne in his reply, his adverse references and quotations included,
was derived from Mr. Chambers’s own Introduction and Notes to his
three-volume edition of Scottish Songs and Ballads in 1829, I cannot
but presume that Mr. Chambers had all that lore sufficiently in his
mind thirty years afterwards, and found nothing in it to impede or
disconcert him then in his new speculation. Apart, however, from the
special question of Lady Wardlaw’s concern in the matter, Mr.
Chambers seems to me to have moved a very proper and necessary
inquiry when he started his theory of the comparatively recent origin
of all or most of the Scottish Romantic Ballads. In what conception,
what kind of language, does the opposite theory couch itself? In the
conception that, besides the series of those literary products of past
Scottish generations, the work of learned or professional writers,
from the time of Barbour onwards, that have come down to us in
books, or in old manuscript collections like that of Bannatyne, there
was always a distinct literature of more lowly origin, consisting of
ballads and songs recited or sung in Scottish households in various
districts, and orally transmitted from age to age with no names
attached to them, and indeed requiring none, inasmuch as they were
nobody’s property in particular, but had “sprung from the heart of
the people.” Now, this phrase, “sprung from the heart of the people,”
I submit, is, if not nonsensical, at least hazy and misleading. Nothing
of fine literary quality ever came into existence, in any time or place,
except as the product of some individual person of genius and of
somewhat more than average culture. Instead of saying that such
things “spring from the heart of the people,” one ought rather to say
therefore that they “spring to the heart of the people.” They live after
their authors are forgotten, are repeated with local modifications,
and so become common property. It is, of course, not denied that
this process must have been at work in Scotland through many
centuries before the eighteenth. The proof exists in scraps of fine old
Scottish song still preserved, the earliest perhaps the famous verse
on the death of Alexander III., and in lists, such as that in The
Complaint of Scotland, of the titles of clusters of old Scottish songs
and tales that were popular throughout the country in the sixteenth
century, but have perished since. The very contention of Mr.
Chambers respecting Sir Patrick Spens and the other ballads in
question was that the fact that there is no mention of them in those
old lists is itself significant, and that they have a set of special
characteristics which came into fashion only with themselves.
If Lady Wardlaw was the author of those ballads, or of some of
them, we have lost much by her secretiveness. We have been put in a
perplexity where perplexity there ought to have been none. The
cause, on her part, was perhaps less a desire for mystification than
an amiable shrinking from publicity, dislike of being talked of as a
literary lady. This was a feeling which the ungenerous mankind of
the last century,—husbands, brothers, uncles, and brothers-in-law,—
thought it proper to foster in any feminine person of whose literary
accomplishments they were privately proud. It affected the careers of
not a few later Scottish women of genius in the same century, and
even through part of our own. Passing over several such, and among
them Lady Anne Barnard, the authoress of Auld Robin Gray, let me
come to an instance so recent that it can be touched by the memories
of many that are still living.
In the year 1766, seven years after the birth of Burns, and five
before that of Scott, there was born, in the old house of Gask in
Strathearn, Perthshire, a certain Carolina Oliphant, the third child of
Laurence Oliphant the younger, who, by the death of his father the
next year, became the Laird of Gask and the representative of the old
family of the Oliphants.
They were a Jacobite family to the core. The Laird and his father
had been out in the Rebellion of 1745; they had suffered much in
consequence and been long in exile; and not till a year or two before
the birth of this little girl had they been permitted to return and
settle on their shattered estates. They were true to their Jacobitism
even then, acknowledging no King but the one “over the water,”
praying for him, corresponding with him, and keeping up the
recollection of him in their household as almost a religion. Carolina
was named Carolina because, had she been a boy, she was to have
been named Charles, and she used to say that her parents had never
forgiven her for having been born a girl. But two boys were born at
last, and there were sisters both older and younger; and so, among
Oliphants, and Robertsons of Struan, and Murrays, and other
relatives, all Jacobite, and all of the Scottish Episcopal persuasion,
Carolina grew up in the old house of Gask, hearing Jacobite stories
and Highland legends from her infancy, and educated with some
care. The mother having died when this, her third, child was but
eight years of age, the Laird was left with six young ones. “A poor
valetudinary person,” as he describes himself, he seems, however, to
have been a man of fine character and accomplishments, and to have
taken great pains with his children. King George III., hearing
somehow of his unswerving Jacobitism and the whimsicalities in
which it showed itself, is said to have sent him this message by the
member for Perthshire: “Give my compliments, not the compliments
of the King of England, but those of the Elector of Hanover, to Mr.
Oliphant, and tell him how much I respect him for the steadiness of
his principles.”
Somewhat stately and melancholic himself, and keeping up the
ceremonious distance between him and his children then thought
proper, the Laird of Gask had those liberal and anti-morose views of
education which belonged especially to Scottish nonjuring or
Episcopalian families. A wide range of reading was permitted to the
boys and the girls; dancing, especially reel-dancing, was incessant
among them,—at home, in the houses of neighbouring lairds, or at
county-balls; in music, especially in Scottish song, they were all
expert, so that the rumour of a coming visit of Neil Gow and his
violin to Strathearn, with the prospect it brought them of a week
extraordinary of combined music and reel-dancing, would set them
all madly astir; but the most musical of the family by far was
Carolina. She lived in music, in mirth, legend, Highland scenery, and
the dance, a beautiful girl to boot, and called “the Flower of
Strathearn,” of tall and graceful mien, with fine eyes, and fine
sensitive features, slightly proud and aquiline. And so to 1792, when
her father, the valetudinary laird, died, some of his children already
out in the world, but this one, at the age of twenty-six, still
unmarried.
For fourteen years more we hear of her as still living in the old
house of Gask with her brother Laurence, the new Laird, and with
the wife he brought into it in 1795,—the even tenor of her existence
broken only by some such incident as a visit to the north of England.
During this time it is that we become aware also of the beginnings in
her mind of a deep new seriousness, a pious devoutness, which,
without interfering with her passionate fondness for song and music,
or her liking for mirth and humour and every form of art, continued
to be thenceforth the dominant feeling of her life, bringing her into
closer and closer affinity with the “fervid” or “evangelical” in religion
