

Combinatorial Chemistry & High Throughput Screening, 2022, 25, 1818-1837


REVIEW ARTICLE
ISSN: 1386-2073
eISSN: 1875-5402


Artificial Intelligence (AI) in Drugs and Pharmaceuticals



Adarsh Sahu1*, Jyotika Mishra1 and Namrata Kushwaha2



1Department of Pharmaceutical Sciences, Dr. Harisingh Gour Vishwavidyalaya, Sagar, MP, India; 2Sri Aurobindo Institute of Pharmacy, Indore, MP, India

Abstract: The advancement of computing and technology has reached every dimension of science. Artificial intelligence (AI) is a core branch of computer science that has percolated into all arenas of science and technology, from core engineering to medicine. Thus, AI has found application in the fields of medicinal chemistry and health care. In recent times, the conventional methods of drug design have been replaced by computer-aided drug design. AI is being used extensively to improve design techniques and to shorten the time required to develop drugs. Additionally, target proteins can be conveniently identified using AI, which enhances the success rate of the designed drug. AI technology is used in each step of the drug design procedure, which decreases the health hazards related to preclinical trials and also reduces the cost substantially. AI is an effective tool for data mining based on large pharmacological datasets and the machine learning process. Hence, AI has been used in de novo drug design, activity scoring, virtual screening and in silico evaluation of the properties (absorption, distribution, metabolism, excretion and toxicity) of a drug molecule. Various pharmaceutical companies have teamed up with AI companies for faster progress in the field of drug development, along with the healthcare system. The review covers various aspects of AI (machine learning, deep learning, artificial neural networks) in drug design. It also provides a brief overview of the recent progress made by pharmaceutical companies in drug discovery through association with different AI companies.

ARTICLE HISTORY
Received: August 02, 2021
Revised: October 11, 2021
Accepted: October 19, 2021

DOI: 10.2174/1386207325666211207153943

Keywords: Deep learning, machine learning, artificial neural network, drug design, drug discovery, pharmaceuticals.

*Address correspondence to this author at the Department of Pharmaceutical Sciences, Dr. Harisingh Gour Vishwavidyalaya, Sagar, MP, India; E-mail: adarshniper@gmail.com

Combinatorial Chemistry & High Throughput Screening, 2022, Vol. 25, No. 11
1875-5402/22 $65.00+.00 © 2022 Bentham Science Publishers

1. INTRODUCTION

The process of drug discovery is complex, expensive and strenuous, as a single molecule has to be identified from a range of 10^60 to 10^100 probable molecules. The selected molecule should comply with requirements such as a suitable drug metabolism and pharmacokinetic (DMPK) profile, bioactivity and feasible synthetic methods. The period required from the recognition of the drug to its implementation is around 3-5 years. Apart from the time involved, numerous compounds (100-1000) have to be synthesized and used for clinical trials [1]. The designed drug should also reach its target appropriately while moving through the body, which is another challenge during the design of a particular drug. Additionally, pharmacogenetic factors, such as the response to a drug based on genetic variation, and chemical toxicity arising from interaction of the drug with undesired proteins are other hindrances faced during the process of drug design.

The labour, challenges and time involved have been reduced by the development of de novo algorithms for designing a specific drug. The approaches used initially were structure-based, where the ligands are grown methodically to interact electrostatically and sterically with the target site. However, the compounds designed by this process exhibited limitations such as weak DMPK properties [2]. Another approach followed for de novo design is to build a virtual library of possible drug structures and then explore their practical feasibility through docking studies [3]. Nevertheless, the reaction conversion is predefined, which confines these methods within a certain sphere.

Artificial intelligence (AI) is a recent and promising approach for drug discovery and design processes. Machine learning is a part of AI that has been used extensively in pharmaceutical science. Scientists and researchers have become more competent and skilled owing to the availability and storage of data in machines. The progress and application of machine learning have led to the establishment of fundamental algorithms and their application. The science of devising intelligent machines was termed artificial intelligence by John McCarthy at the Dartmouth Conference in 1956 [4]. AI encompasses a multitude of disciplines such as mathematics, linguistics, computer science, psychology, neuroscience and others. The application of AI is diverse, with basic methods such as representation of knowledge, search for a solution, maintenance and acquisition of knowledge, and logical reasoning, along with machine

learning. This review encompasses the application of AI in drug discovery during the last 5 years.

1.1. Initiation of Artificial Intelligence in Pharmaceuticals

Artificial intelligence was initially introduced in the realm of molecular biology and chemistry for predicting the secondary structure of proteins [5]. In drug research, AI was first applied to understand structure-activity relationships based on Hammett's equation, which relates the equilibrium constants and reaction rates of benzene derivatives [6]. This was followed by work on the recognition and optimisation of different bioactive molecules by Hansch, regarded as the father of QSAR [7]. The chemical consequences for biological processes were analysed by medicinal chemists using a large number of AI techniques [8]. Similar patterns shared among various chemical structures were analysed and used as a basis for studying in vitro biochemical effects and analogous physicochemical properties [9], by a route termed the pattern recognition method [10]. Neural networks were also used in the pharmaceutical industry, as they were capable of identifying patterns and could be used as engines [11]. The first proper application of neural networks was in forecasting the mechanism of action of a cancer drug [12]. Complete molecular design methods based on machine learning and AI were capable of solving problems, adjusting to novel situations and learning from experience [13]. Thus, machine learning algorithms such as random forest (RF) [14] and support vector machine (SVM) [15] were put forth. These programmes were used to identify all the significant features of molecules for a precise prediction. Deep learning was developed for studying the multifaceted association between parameters such as toxicity, bioactivity and molecular structure [16]. The difficulties of predicting the toxicity of drugs (NIH Tox21) [17] and forecasting the activities of compounds (Kaggle challenge) [18] were overcome by the application of deep learning methods. Other commonly used deep learning modelling methods are the variational autoencoder (VAE) [19], recurrent neural network (RNN) [20] and generative adversarial network (GAN) [21].

1.2. Quest for Novel Drugs Using AI

The term 'chemical space' denotes an important concept for the execution of AI in the field of drug discovery. An exclusive, high-quality molecule can be computationally identified in chemical space from an assembly of numerous feasible molecules [22]. Thus, the number of compounds to be synthesised is narrowed down by the application of AI technology, which improves the economics of pharmaceutical research. A well-known protein related to a particular disease is selected, and AI technology is used to design a drug that can interact appropriately with the target protein to eliminate the symptoms of the disease. There are software programmes that match the biochemical properties and biophysical structures of around 150,000 proteins against a huge number of molecules to reveal the exact molecule that would interact with the exact protein. Apart from the identification of an adequate drug molecule, there are still other hurdles to overcome: the drug should endure the metabolic processes of the liver and possess excellent transport capability to reach the target site from the gut through the bloodstream. Other factors to be considered during drug design are that the molecule should be effectively eliminated from the body before it can become toxic and should avoid any interaction with other proteins that may create health issues. Despite accounting for all these factors, a major challenge remains. A molecule identified as interacting with a specific protein has the potential to interact with around 300 other similar proteins. Additionally, around 100 different proteins are responsible for complicated diseases such as Alzheimer's disease and cancer. Thus, targeting a specific protein with a specific drug molecule is not the proper solution for these maladies.

Thus, AI is being developed for the complex identification of drugs that can interact with around 20 different proteins while evading hundreds of other similar proteins. The genetic data of a multitude of individuals are gathered to recognise differences in the proteins. The AI software would analyse the data and specify a particular drug for the patient depending on the symptoms.

The identification of the target molecule is the initial step of drug discovery. According to Berg's approach, this information is gathered by AI from samples of human tissues, fluids and blood. Thus, the AI assembles data from diverse fields such as metabolomics, genomics, lipidomics, proteomics and other related physiological factors. The samples were collected from both healthy individuals and individuals suffering from the disease, and the live cells were exposed to various pathological conditions in the laboratory. Processes such as the rigidity of the membrane or the capability of the cell to generate energy produce different data, which are processed by deep-learning software. The software analyses the data and recognises the protein responsible for the specific disease. Additionally, the software could identify the genes responsible for the malady, an approach termed the precision-medicine approach. Patients could thus be assessed for the efficacy of the drug before its ingestion.

1.3. Impact of AI in Decreasing the Cost in Drug Design

Pharmaceutical companies decide the final cost of drugs based on market analysis [23]. AI can be applied to determine the market price through the analysis of the different factors that control the cost of a drug after its production [24]. The factors influencing the cost of a manufactured drug are the market share value of the drug, the cost regulatory schemes of the country, the total budget for manufacturing the drug, the cost of the reference product available in the market, and the pricing policies of the brands [25]. These factors are analysed using ML to forecast the final cost of the drug. Conventional drug discovery approaches require synthesizing a hundred derivatives and then analysing their medicinal properties, which is very time-consuming and costly. Using AI, compounds can now be screened and the best ones selected, which saves money and considerable effort [15, 16, 18]. Another AI technique is the In competitor, which is a competitive platform for estimating the data related to market prices by different brands [26].

2. AI AND DRUG DESIGN

Structure-based drug discovery mainly emphasises the 3D structure of the target protein [27]. The 3D environment of the ligand binding site is significant when designing for the target protein. De novo protein design and homology modelling are the conventional methods [28], which have been replaced by sophisticated AI technologies. Among the different tools, AlphaFold demonstrated high efficacy in accurately predicting the 3D structure of a target protein from its primary sequence. This tool depends on a DNN, which mainly forecasts protein properties such as the angles between neighbouring peptide bonds and the distances between amino acid pairs by studying the primary sequence. The 3D structure is analysed by combining both probabilities into score functions, which are used to investigate the target protein structure and finally predict the drug molecule.

The interaction between the protein and drug is analysed using either quantum mechanics or hybrid methods combining quantum mechanics and molecular mechanics [29]. Quantum effects are considered at the atomic level, thereby enhancing the accuracy of the process. However, methods based on quantum mechanics require more time than the molecular mechanics method [30]. The introduction of AI methods into quantum mechanical analysis decreased the time and cost [31]. The analysis and atomic simulation of the electrical properties of a molecule can also be done by AI. Moreover, the potential energies of small molecules are analysed using deep learning methods, thus substituting machine learning for quantum chemical analysis. The different AI models for drug discovery are shown in Table 1 [32-38].

Table 1. Different AI models used in drug discovery.

Process of Drug Discovery | Designing of Drugs | AI Models | Refs.
Identification and study of targets | Analysis of PPI | FD/DCA | [32]
 | Analysis of protein folding | CNN | [33]
 | Drug repurposing | Network pathology | [34]
Discovering a hit | Activity scoring | SVM, RF, 3D graph CNN | [33]
 | Virtual screening | SVM, AAE | [35]
From hit to lead | De novo drug design | Deep learning, VAE, AAE | [36]
 | QSAR | Machine learning methods, DNN | [37]
Optimisation of the lead molecule | Analysis of absorption, distribution, metabolism, excretion and toxicity | CNN, neural network | [38]

2.1. De Novo Design Process

The technique of designing new drug molecules without reference compounds is termed the de novo design process, which has been conducted using different software [39].

2.1.1. Recurrent Neural Network

The de novo process has successfully used recurrent neural networks (RNN), which operate on sequential input information [40-43]. The RNN produces chemical structures by using SMILES strings, which encode the chemical structures in the form of letters. The RNN is fed with chemical structures from existing databases such as ChEMBL to familiarise the software with the grammar of SMILES strings. Thus, RNNs were used for generating a multitude of SMILES strings [40] and new peptide structures [41]. The required properties were incorporated through reinforcement learning.

2.1.2. Transfer Learning Process

Drugs with the expected chemical structures were designed using the transfer learning process. The network is first trained on an extensive training set to learn the SMILES grammar, and then trained on molecules having the required activities. Thus, the desired drug can be screened for in a particular chemical space [20]. In one study, five molecules were prepared using this method, and four of them were found to be active against nuclear hormone receptors [38]. However, the size of the chemical space is yet to be explored.

2.1.3. Autoencoder

One artificial intelligence approach is the autoencoder, which is of two types: the adversarial autoencoder and the variational autoencoder (Figs. 1 and 2).

The variational autoencoder comprises two neural networks: an encoder network and a decoder network. Chemical structures represented by SMILES are transformed by the encoder network into a continuous real-valued vector in latent space. Vectors from the latent space are transformed back into chemical structures by the decoder. An in silico model was used to identify accurate solutions in the latent space, which were then converted back into real molecules using the decoder network. Back-translating a single molecule reproduces it with minor modifications in the structure. A favourable route to molecules with good target characteristics was obtained by the application of the latent space

     
   

Fig. (1). Variational autoencoder, a type of autoencoder [45]. (A higher resolution / colour version of this figure is available in the electronic
copy of the article).
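
As a rough illustration of the variational autoencoder in Fig. (1), the sketch below shows two ingredients that such models typically rely on: splitting a SMILES string into tokens for the encoder, and the reparameterisation step that samples a latent vector as z = mu + sigma * eps. It is a minimal sketch in plain Python and not the implementation used in the cited work; the function names, the simplified token rules (only Cl and Br are treated as two-letter atoms) and the example values are assumptions made for illustration.

```python
import math
import random

def tokenize_smiles(smiles):
    """Split a SMILES string into tokens (simplified: only Cl and Br are
    kept together as two-letter atoms; everything else is one character)."""
    tokens, i = [], 0
    while i < len(smiles):
        if smiles[i:i + 2] in ("Cl", "Br"):
            tokens.append(smiles[i:i + 2])
            i += 2
        else:
            tokens.append(smiles[i])
            i += 1
    return tokens

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, 1), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

# Hypothetical encoder output for one molecule (values chosen for illustration).
tokens = tokenize_smiles("CC(=O)Nc1ccc(Cl)cc1")
mu, log_var = [0.2, -1.0], [0.0, -2.0]
z = reparameterize(mu, log_var, random.Random(0))
print(tokens, z, kl_to_standard_normal(mu, log_var))
```

In a trained model, mu and log_var would be produced by the encoder network and z would be passed to the decoder; here they are fixed by hand so that the sampling step can be read in isolation.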

     
   

Fig. (2). Adversarial autoencoder, a type of autoencoder [45]. (A higher resolution / colour version of this figure is available in the electronic
copy of the article).
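
The adversarial screening step in Fig. (2) can be pictured with the usual binary cross-entropy objective: a discriminator is rewarded for scoring real samples near 1 and generated samples near 0, while the generator is rewarded when its samples are scored as real. The sketch below assumes this standard formulation; the source does not state which loss was used, and the function names and probabilities are hypothetical.

```python
import math

EPS = 1e-12  # guards against log(0)

def discriminator_loss(p_real, p_generated):
    """Mean binary cross-entropy with real samples labelled 1
    and generated samples labelled 0."""
    real_term = [-math.log(p + EPS) for p in p_real]
    fake_term = [-math.log(1.0 - p + EPS) for p in p_generated]
    return (sum(real_term) + sum(fake_term)) / (len(p_real) + len(p_generated))

def generator_loss(p_generated):
    """Non-saturating generator loss: low when the discriminator
    scores generated samples as real."""
    return sum(-math.log(p + EPS) for p in p_generated) / len(p_generated)

# Hypothetical discriminator outputs (probability that a structure is real).
p_real = [0.9, 0.8]
p_generated = [0.3, 0.1]
print(discriminator_loss(p_real, p_generated), generator_loss(p_generated))
```

Training alternates between lowering the first loss (a sharper discriminator) and lowering the second (more realistic generated structures).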

representation on a model designed using the synthetic accessibility score and the QED drug-likeness score [44]. New chemical structures are produced by the adversarial autoencoder and screened by the discriminative adversarial model, which distinguishes the real structures from the generated molecules. Thus, the authenticity of the structures generated by the adversarial autoencoder is high compared with that of the variational autoencoder. New structures active against the dopamine type 2 receptor were obtained by combining an in silico model with autoencoders. Furthermore, compounds possessing putative anticancer properties were also predicted using a generative adversarial network (GAN) [31].

2.2. Synthesis of Drugs using AI

The synthesis of a drug is very important in drug chemistry. Nevertheless, the available chemical space is limited by stringent synthetic routes. Retrosynthesis is a relevant method in drug synthesis, and the retrosynthetic pathway has been modified through the application of AI [45]. Forward synthesis has been predicted by the machine learning approach, which analyses the synthetic methods after analysis of the retrosynthetic pathways. The synthetic route of the drug is designed after its identification, and the major influencing factor is the time required for finding the appropriate path. The backward synthetic route is worked out in order to find a suitable precursor for the drug molecule through Monte Carlo tree search (MCTS) [46]. The earlier method of computer-assisted synthesis planning (CASP) did not gain popularity, as the algorithm depended on human knowledge for its inputs. Alternatively, empirical data are used by the machine learning process, which aids in the selection of a random step along with analysing the feasible positions for branching. Predefined transformation rules are adopted that relate the target molecule to particular precursors. Thus, the most probable retrosynthetic route can be predicted by the AI. Three different neural networks were combined with CASP to design the 3N-MCTS method, in which the networks were assigned three different tasks: an expansion node, a rollout node and an update node.

In the expansion node, the algorithm analyses the feasibility of altering the molecule using 12.4 million transformation methods [47]. The neural networks select the route with the most probable transformation and maximum yield. The formation of side products can be eliminated by finding the route with maximum selectivity. The frequently used transformation routes for a molecule are selected by the rollout node using a slow and methodical approach [48]. The analysis of a particular route is added

to the update node. The three approaches guide an individual to find the best retrosynthetic pathway for the design of a drug. The algorithm should also be able to perform a task within a stipulated time frame. The MCTS was capable of predicting retrosynthetic routes for drug molecules within 5 seconds, 20 times faster than the conventional Monte Carlo method.

Apart from designing the routes for a reaction, AI can foretell the products and yield of the reaction by analysing its molecular characteristics. The results of complex chemical reactions have been predicted by quantum chemical methods, namely semi-empirical methods, the Hartree-Fock method and density functional theory (DFT). The experiments could also be modelled in silico. Machine learning was used to analyse the yields of palladium-catalysed Buchwald-Hartwig coupling reactions, which form carbon-nitrogen bonds in the total synthesis of pharmaceutical drugs [49]. The calculated values of the dipole moments and vibrational frequencies were used as descriptors, and the yields were obtained from different experimental syntheses. The relation between the product yields and the descriptors was analysed by the random forest approach [50]. The algorithm was also used for analysing the yields of different products [49].

2.2.1. Software

The Chemputer software contains chemical codes for the synthesis of molecules and is operated by the Chempiler program [51]. The software has been successfully used for the synthesis of three pharmaceutical drugs, namely rufinamide, diphenhydramine hydrochloride and sildenafil. The Chemical Assembly (ChASM) is a scripting language that conveys the synthetic methods using codes and chemical descriptive languages for understanding and designing the synthetic process [51]. The Chempiler can operate the chemical synthesis using a markup language called GraphML [52]. The synthesis of the aforementioned drugs opens a new realm in drug synthesis, with greater safety and reproducibility. Unexplored reaction space has also been explored using chemical robots.

Reactions and their products have been predicted by a combination of machine learning, encoded manual rules and quantum chemical descriptors, which has been used to forecast reactions occurring in multiple steps [53]. Numerous reactions were obtained from Reaxys and fed into a deep neural network. Another set of data comprising failed chemical reactions was also used. The neural network analysed the possible reactions and ranked them accordingly. This process aided in the identification of the major reaction product. However, parameters such as catalysts and reaction conditions have to be considered in future development.

2.2.1.1. Pharmacoepidemiology

The study of the use of drugs by a large group of people and its effects is termed pharmacoepidemiology. The commonly used AI techniques in this field are the artificial neural network (ANN), random forest (RF) and support vector machine (SVM). Among these three techniques, the artificial neural network is the most predominantly used [54]. Although the random forest was introduced only in 2001, it has gained immense popularity, as clinical results can be predicted by scientists without in-depth knowledge of machine learning or statistics [55]. Subtle patterns present in complicated datasets are identified using the support vector machine [56]. The most common way of applying AI in pharmacoepidemiology is to use more than one approach, as the best algorithm is difficult to choose without prior knowledge. The maximum accuracy of an AI application depends on benchmarking multiple algorithms by the 'trial and error' method. Secondary data are used in selected cases, drawing on different data sources on the efficacy of the drug, utilisation of the drug and the standard treatment procedure [57]. This process is cheaper and easier than the use of primary data, which requires a lot of new data. Methodological studies in pharmacoepidemiology were examined among specifically selected articles.

2.3. Polypharmacology

The capability of certain drugs to target different proteins simultaneously has been widely exploited, and its study is termed polypharmacology. It is an emerging tool, as the designed drugs exhibit low resistance and improved safety profiles [58]. The pharmacokinetic profile of a single polypharmacological agent can be predicted efficiently, and negative synergistic effects or interactions between drugs can be reduced. Drug resistance can also be overcome with polypharmacology. A common example of a polypharmacological drug is sildenafil, commonly known as Viagra, which was initially developed for treating hypertension and ischemic heart disease but was found to be more potent for treating the symptoms of erectile dysfunction. Deep learning and AI approaches have been used in this area to identify patterns in data obtained from multi-omics and HTS [59]. DeepVS is a convolutional neural network that analyses the results of docking experiments and recognises the features required for the binding of a particular protein to a specific ligand [60]. The structural data describing the complex are analysed by the DeepVS technique. A comparison of this algorithm with 40 other algorithms indicated that DeepVS was far more efficient. The docking poses were analysed without manual parameters, which made the method suitable for large-scale screening. In this line, kinase inhibitors are mainly considered in the field of polypharmacology. Kinases are proteins that aid in the regulation of physiological processes and signalling routes that become deregulated in cancers. AI has been used for targeting molecules that are kinase inhibitors. The structure-based analysis is done by KinomeX, which has been categorised as kinome-wide pharmacology [61]. However, the AI methods are in some cases less accurate than the established methods.

2.4. Finding the Target Molecule or Properties

2.4.1. Repositioning of Drugs

The further development of an already approved drug to increase its efficacy and potency is termed drug repositioning [62]. The complete process is as follows: identification of the compound, acquisition of the compound, and its development along with post-market safety monitoring. The availability of the omics and bioinformatics data has

reduced the time required for the modification of a particular drug. The repositioning of drugs is commonly carried out with Biovista, which can correlate diseases, molecular targets, drugs and routes based on data on the gene expression response of different cell lines after the ingestion of the drug. Thus, the data for drug repositioning can be hypothesis-based [63], data-based or based completely on biological networks. The three types of approaches are the text mining approach, the network-based approach and the semantic approach. The propagation approach in network-based models works by gathering prior data from the various layers of the network. The clustering approach forms small groups within the huge network by finding interactive linkages between diseases and drugs. Most drugs are designed to have various targets, exhibiting different relationships between the diseases and the drug [64]. Metformin was approved as a drug for type 2 diabetes and was also found to increase lifespan [65]. The basic concept of the network-based model is that similar drugs have similar targets [66]. The elements to be considered for the repurposing of drugs are the genes causing the disease, the drug targets and the particular drug, which are analysed by nine different network analyses: protein-protein network, gene regulatory network, drug-target network, drug-disease network, disease-disease network, drug adverse effect network, drug-drug network, metabolic network and target-disease network. A single drug is repurposed by integrating different networks to form a heterogeneous network. The drug target is considered as a link between the drug and the disease [67]. The information from the multiple networks is compiled and analysed using a network diffusion algorithm and dimensionality reduction approach called DTINet to determine novel targets [30]. The cyclooxygenase inhibitory effects of chlorpropamide, telmisartan and alendronate were predicted using the algorithm, and these drugs were later proved experimentally to be competent anti-inflammatory drugs.

Another approach is the Similarity Ensemble Approach (SEA), which is based on the method of similarity. A comparative study was carried out for the assembly of ligands for each target. An arbitrary comparison is conducted to create a distribution of the similarities. A single drug molecule is chosen and its similarity is analysed with all the ligands targeting a similar protein.

2.4.2. Activity Scoring

The scoring function is used to evaluate the binding interactions of drug molecules with the target protein [68]. The scores created by machine learning methods are based on inputs of chemical features, geometric features and physical force features [69]. The scores derive from a non-linear mapping technique that extracts information directly from experimental data on the interaction between the protein and the ligand. This technique bypasses the complex physical functions associated with docking [70]. The SVM model, which used experimental binding affinity data from eHiTS, was found to be improved in both scoring power and screening power compared with the hypothesis of linearly additive energy parameters [71]. The RF was combined with scoring function of Auto-

were extracted by CNN, as CNN had good image processing ability [73]. The implementation of the 3D graph CNN model for analysing the protein-ligand interaction showed favourable correlation with the experimental data [74]. The deep learning method (DeepVS) can interpret abstract characteristics from basic qualities such as the charge on the atom, the type of atoms and the distance between the atoms [60]. Since the information is extracted by the CNN from the image of the binding interaction of protein and ligand, the accuracy is more defined.

2.4.3. Virtual Screening

The search for bioactive molecules from different sources, such as commercially available chemical libraries or in-house compound collections, through the application of software and algorithms is termed virtual screening. It is an approach to eliminate unrequired scaffolds while identifying the appropriate drug [75, 76]. The three types of virtual screening techniques for docking studies are machine learning methods [77], pharmacophore-based methods [78] and similarity searching methods [79]. Virtual screening for molecular docking can be both ligand-based and structure-based. However, the evaluation of parameters such as entropy and solvation effects is not accurate, which in turn hinders the accurate analysis of binding affinities using the docking scoring function [80]. Another complication in the analysis is the flexible structure of proteins [81]. Again, another significant parameter, residence time, is ignored, which makes the docking score ineffective for the analysis of drug potency, as it leads to false-positive results [82]. Ligand-based virtual screening is independent of the structural information of the proteins; instead, it analyses and relates molecular characteristics to bioactive properties, mainly by employing machine learning methods [75]. The output is high with a reduced rate of false hits, owing to its low error, strong capability of feature extraction and excellent classification properties [73]. The similarity between the simplified molecular-input line-entry system (SMILES) and natural language was exploited using a model based on a long short-term memory (LSTM) network in order to overcome the issue of virtual screening with a meagre distribution of active compounds [20]. Gradient boosting trees and machine learning methods (DNN) can filter the molecular libraries created by RNN.

2.4.4. In Silico Analysis

2.4.4.1. Pharmacophore Modelling and Docking

In silico drug discovery is a virtual screening method. The commonly used methods are the pharmacophore model and the docking model. The docking method analyses the fitting of the ligand in the binding pocket of the receptor to assess the activity [83]. The most difficult parts of docking studies were the treatment of solvents, the flexibility of the protein structure and the entropy of the system [83], even though the scoring function has been modified.

The chemical characteristics like hydrophobicity, presence of the number of functional groups that can be easily
ionised, presence of hydrogen bonding groups are considered
Dock for improving the analysis [72]. The characteristic fea-
and geometrically analysed as a pharmacophore group for
tures of the interactions between the protein and ligands
investigating the interaction between the drug molecule and
the receptor [84]. The pharmacophore model is obtained using either a receptor-ligand complex or a simple receptor. Earlier studies mainly used ligand-based pharmacophores as detailed information about the protein structure was unavailable. The shape of the binding site can be included by considering the exclusion volume [85]. The 3D model of the pharmacophore is produced, and drug molecules similar to the generated structure are searched for within a large chemical database. The database search creates a fit score that reflects the alignment between the centre of the model and the ligand structure. The scaffold hopping process is also studied through these models for searching chemotypes with the appropriate geometry and interactions [86].

Antimycobacterial agents were identified by the utilization of both docking and pharmacophore modelling studies. The results are standardised by the initial screening of pharmacophores, which was later confirmed by docking. Mtb 7,8-diaminopelargonic acid aminotransferase (BioA), an enzyme required for the biosynthesis of the lipid pathway, was studied by both the docking and pharmacophore studies for producing a receptor-based pharmacophore [87]. 25 different functionalities like 9 H bonds, 9 hydrophobic bonds and 7 acceptors of H-bonds were considered and used for identification of the drug from 4.5 million compounds using the Enamine REAL database. The suitable compounds were docked to the BioA protein using CDOCKER [88]. Thus 45 potent molecules were identified and 17 compounds were validated. The final selection was carried out for the drug which could penetrate the thick, waxy lipid layer of the mycobacterium. The resulting compound, (Z)-N-(2-isopropoxyphenyl)-2-oxo-2-[(3-(trifluoromethyl)cyclohexyl)amino]acetimidic acid, was a potent BioA inhibitor [89].

2.4.4.2. Physicochemical Properties

Failures in drug design can be prevented by the early identification of unfavourable chemical and physical properties. The solubility of the drug molecules was forecast by CNN-ANN through analysis of the molecular graph. The hydrophilic groups responsible for the solubility of a molecule were identified for accurate prediction [90]. Another method predicted the solubility of the molecules by using a convolutional embedding based on a molecular tensor, which analysed the characteristics at the atomic level using a molecular graph [91]. The pharmacokinetic properties of the drug were evaluated by relating the absorption of the oral drug to the Caco-2 permeability coefficient [92].

2.4.4.2.1. Absorption, Distribution, Metabolism, Excretion and Toxicology (ADME/Tox)

The process by which the drug is absorbed into the bloodstream after administration into the system is termed absorption. The pharmacokinetic factor which indicates the degree of absorption is called bioavailability. The absorption properties of a molecule can be standardised by analysing its bioavailability. Thus the bioavailability was analysed from the molecular properties and structural fingerprints of 1014 molecules using an MLR model [93]. The molecular properties were selected by using the method of genetic function approximation to obtain the approximate output. The ratio of the dose of the drug in vivo to its concentration in plasma at steady state is termed the volume of distribution at steady state (VDss). VDss expresses the capability of the drug to distribute, and the analysis of its value aids in the modification of the pharmacokinetic properties. PLS and Random Forest (RF) models were generated for 1096 molecules for analysing the VDss properties [94]. The administered drug moves through the metabolic system and might produce toxic metabolites. The metabolic stability of a molecule is a significant parameter for analysing the accurate site of metabolism.

2.4.4.3. In Silico ADME-PK Modelling

The drug is designed by considering experimental endpoints like permeability, metabolic stability, transporter-mediated efflux, CYP inhibition and solubility according to in silico ADME modelling. A model of in silico microsomal stability is available at Genentech, while another model representing in silico solubility is available at AstraZeneca. ADME/Tox has been predicted using 2D molecular descriptors and different modelling processes. Time-dependent CYP inhibition is a type of model endpoint which is challenging to predict [95]. Similarly, mitochondrial toxicity is another parameter that is difficult to predict, as the toxicity can be based on different mechanisms of cytotoxicity [96]. The application of AI in ADME/Tox can be used in the pharmaceutical industries for selecting the proper drug, which aids in streamlining the particular drug for in vivo and in vitro assays. ADME modelling can also be applied to nanomaterials. The main advantage of the DNN algorithm is that it can predict the chemical fingerprints more accurately as compared to that of ML, which was established by Merck and the University of Toronto [33, 37]. The DNN models also analysed the hERG inhibition and the solubility of the drug in water. The endpoints are correlated by using multitask models on similar compounds [97, 98]. An hERG model with fingerprint-based descriptors is shown in Fig. (3). The blue colour indicates the absence of inhibition, the orange colour indicates weak inhibition, while the red colour represents strong inhibition [99].

As previously mentioned, DNN is mainly based on SMILES instead of chemical descriptors and uses physicochemical properties like logD. Novel representations like the graph convolution conducted by DNN are more meaningful than the depictions by the fingerprinting method, which are very complex to interpret [98]. Complex in vivo ADME/Tox data can be interpreted and modelled by DNN. Noise causes erroneous results in deep learning, as in vivo endpoints are mostly noisy. Additionally, nephrotoxicity and hepatotoxicity can be identified with DNN.

Fig. (3). hERG inhibition model [99].
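As a hedged illustration of the steady-state distribution term defined in Section 2.4.4.2.1 (the ratio of the amount of drug in the body to its plasma concentration), the calculation can be sketched as follows. The function name, the units and the numerical values are assumptions chosen for illustration only; they are not taken from the cited studies.

```python
def volume_of_distribution_ss(amount_in_body_mg: float,
                              plasma_conc_mg_per_l: float) -> float:
    """Return VDss in litres: amount of drug in the body divided by its
    plasma concentration, both taken at steady state (illustrative helper)."""
    if plasma_conc_mg_per_l <= 0:
        raise ValueError("plasma concentration must be positive")
    return amount_in_body_mg / plasma_conc_mg_per_l


# Hypothetical values: 500 mg of drug in the body, 10 mg/L in plasma.
example_vdss = volume_of_distribution_ss(500.0, 10.0)
print(f"VDss = {example_vdss:.1f} L")
```

A large value relative to plasma volume would suggest extensive distribution into tissues, which is the kind of property the PLS and RF models mentioned above attempt to predict from structure.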

2.4.5. MMP and ADME Models

The structures of the compounds have to be prefinalised in ADME models. Thus the matched molecular pair (MMP) analysis was paired with ADME models [100]. The pair of molecules in MMP undergoes an endpoint variation if subjected to a simple structural alteration, which is represented as a particular chemical transform. A specific property can be modified by studying the chemical transformation of all the molecular pairs.

An example of MMP change is depicted in Fig. (4), where it has been shown that a single structural alteration can be depicted by various structural transforms.

Fig. (4). Alteration of half-life depending on the lipophilicity of the drugs by MMP [97].

The allometric principles were used for the analysis of the half-life of the drug with the number of doses [101]. The half-life of the C1/C2 pair was enhanced by 0.37 h by the introduction of a fluorine atom. Additionally, although the in vivo CLu exhibited a two-fold enhancement, there was no considerable alteration in the doses based on the increased half-life in C2. The transformation of C3/C4 showed a decreased half-life due to the addition of polar functional groups. Thus the significance of the half-life can be understood by a comparison between C2 and C4. Although the in vivo CLu of C4 is increased five-fold relative to that of C2, the doses cannot be predicted well based on their short half-life. The MMP database is also used for the identification of unexpected characteristics in common fragments. The heat map demonstrates the alteration in the parameters (passive permeability, HLM clearance and P-gp efflux) for all six-membered aromatic rings from phenyl in the Pfizer database. It was found that 4-pyridazyl was a better replacement considering HLM, but it leads to an enhancement of P-gp efflux. The basic point to be considered during the application of the MMP database is that for a particular alteration in the structure, the endpoint value will also be similar and is independent of the other part of the molecule, as in the case of log D.

The transformation of a specific functional group, for instance, a cyano group attached to an alkyl group, will be hugely different from that of a cyano group attached to a phenyl ring. This effect is called contextualisation. The MMP process recognises the specific effect of a particular molecule and applies it to a series of similar molecular structures, which may lead to errors. Thus the statistical parameters should be accounted for while analysing the MMP transforms [102]. Thus the statistical technique of drug design can be implemented accurately using MMP tools. The MMP method thus allows the development from model-based analysis to the formulation of ideas.

The MMP methods have also been combined with the QSPR model. The core molecule and the side chains are fragmented and organised to create a SAR table. The blank cells in the table represent all the virtual compounds obtained during the analysis. The structural transformations obtained from either QSPR techniques or MMP databases are used for forecasting the properties. The virtual compounds, which can be synthesised using the required routes, can be characterised by using several endpoints. The data can also be generated by the visual demonstration of the QSPR data.

2.5. Structure of Proteins

2.5.1. Analysis of the Folding of Protein from Sequence

The dysfunction of proteins causes various diseases. The strategies for structure-based drug design can be implemented by obtaining knowledge about the structure of the proteins. This contributes towards the identification of the small drug molecules which can target the desired proteins. Thus algorithms that can forecast the 3D structure of the protein are in demand. However, precise de novo prediction of the 3D structure of proteins is very complex. Deep learning methods have been applied to analyse the torsion angles of the backbone [103], the secondary structure of the protein and the contacts of the proteins [104]. The 1D structure is combined with 2D CNN for analysing the residue contacts of the protein [104]. The feature extraction process is used for understanding the relationship between the structure and sequence by using the architecture of deep learning.

2.5.2. Interaction Between the Protein Molecules

The reason behind many diseases can be analysed by understanding the protein-protein interactions (PPIs) [105]. The binding sites between two protein units comprise numerous residues, which are termed the PPI interface [106]. The STRING database, one of the PPI databases, comprises 1.4 billion PPIs obtained by both bioinformatics and experimental methods [107]. These drug targets differ from conventional drug targets like nuclear receptors, G-protein coupled receptors, kinases and ion channels. The Inhibitors of Protein-Protein Interaction Database (iPPI-DB) covers 18 families of PPIs and lists 1756 non-peptide inhibitors [108]. The development of small drug molecules will be stimulated by the extension of the target space by PPIs, which behave as a novel class of targets. The biological selectivity required for the regulatory effects can be enhanced by targeting PPIs, since the unfavourable effects can be decreased substantially. An example is the hindering of the transport of Cu ions within the cells by the compound DC_AC50. This compound interacts with the interfaces of the Cu transfer and thus restrains the proliferation of the tumour cells, while normal somatic cells remain unaffected [109]. The interface of the PPI should be understood in order to design drugs based on the structure formed by the protein-protein complex. There are many computational methods for analysing the interface of the PPI, since accurate PPI information is inadequate. Since the PPI interface is preserved in templates, the template-based methods are consistent. eFindSite is a website used for the prediction of PPI interfaces based on templates, sequence-based features and residue-based features for the construction of the NBC and SVM models [110]. Again, protein docking methods (SymmDock and ZDOCK) for two interacting proteins, which were created on the complementarity principle, are also used for the analysis of the PPI interface. However, forecasting the alteration in the conformation of two individual proteins as they form a single complex is the most difficult task. The significant features of the sequences can be obtained from deep learning techniques for the analysis of PPI interfaces, and these have been found to be better than machine learning methods like SVM. As the concealed area of the interface is very large, the identification of the appropriate sites for introducing the drugs is very important. The spots which release huge binding energy are considered hot spots [110]. The PPI sites which are favourable for introducing the drugs have been identified using fragment docking and direct coupling analysis (FD-DCA) by introducing a tool for the docking of fragments called iFitDock. This tool can be used for recognising the hot spots in the PPI interface. The tiny hot spots were accumulated to create a binding site for the drug molecule. The effective binding sites for the proteins were analysed using the scoring function depending on the conservative evolutionary level. Thus the hot spots in the PPI interfaces are effective drug targets.

2.6. Clinical Trial Design

The complexity of clinical trials, like the monitoring of patients and the detection of clinical endpoints, has been reduced by the introduction of AI in this arena. Different sensors can be used for recording the body functions, medicine doses and response to medicines of selected patients. Deep learning and machine learning methods can be used for the analysis of these data, which aids in the generation of disease diaries that are specific to the expression of the disease in a patient. The data points for the identification of endpoints are more reliable. The AI technique has also been used for endpoint detection based on images [111]. The smallest doses for reducing brain tumours were predicted, along with a decrease in the side effects associated with the chemotherapy doses, by the application of a deep reinforcement learning algorithm [112]. The system investigated the current treatment schedule and managed the dose systematically. The system finally selects an optimal plan where the frequency of the doses could decrease the size of the tumour with a minimum potency. Thus the doses were decreased to half of the original dose while the reduction in tumour size remained intact. The signs of nonadherence like incompatibility or side effects could be identified early and solved prior to dropout.

2.7. Personalised Medicine

The major challenge in the discovery of an appropriate drug is the accurate classification of the symptoms and the development of the remedy. A disease is classified based on the histology reports by pathologists for the analysis of molecular marker expression, like the receptors at the surface of the cell, at the mRNA or protein level. For instance, marker gene expression is used for the PAM50 classification of multiple subtypes of breast cancer [113]. Although heterogeneity remains prevalent within the subtypes, the numerous molecular data aid in recognising the subtypes of the diseases and in forecasting the response to a treatment procedure and also the symptoms of the disease. The Molecular Prognostic Score (mPS) is a tool used for the accurate prediction of breast cancer in patients [114]. A meta-analysis was carried out on 6000 breast cancer patients by using the epidemiology process. 184 genes without any background biological information were identified. A combination of a Neural Network and a Random Forest classifier was built for forecasting the survival rate with half of the METABRIC cohort [115]. Additionally, overtreatment could also be prevented by using the score analysis. Thus healthcare can be transformed into precision medicine by the application of AI.

2.8. AI in Treating Different Diseases

2.8.1. Kinase Activity

A drug molecule should fulfil several characteristics, like high selectivity towards the target and good ADME-Tox and physicochemical properties, which makes optimising a specific drug molecule a challenging task. Machine Learning (ML) technologies like Random Forest, Bayesian learning and support vector machines (SVM) have been extensively used for the identification of the drug properties. The properties can be forecasted by ML process
through the analysis of the large databases. The compound is standardised by ML models using the datasets for anti-targets as well as targets. The activities are forecasted against different kinases and the profile is selected from larger datasets of different kinases. 92 different kinases were selected from the data matrix of 130,000 compounds using binary Bayesian QSAR models. New compounds were analysed using these models to produce affinity fingerprints for analysing the biological activities against novel kinases having sparse data points. Thus the models are modified by uploading the new experimental data, and novel kinase inhibitors are identified by the iterative approach of machine learning.

Random Forest identified 200 kinases with different properties using datasets from in-house and public domains [116]. The results obtained by RF are superior among all machine learning methods. DNN was comparable with RF, with improved sensitivity but reduced specificity.

2.8.2. Cardiovascular Symptoms

Speech and images are identified by a type of machine learning algorithm known as deep learning, which uses an artificial neural network (ANN). ANNs replicate the neurons of the brain and attempt to learn from the large amount of data obtained through different experiments [117]. These algorithms and data mining processes acquire the capability to identify potent drugs or combinations of drugs. The asset of the deep learning process is that it provides flexibility to the neural network architecture [118]. It possesses an abundance of memory to work and can work appropriately with 3D-STE data and strain imaging. The deep learning procedure comprises deep neural networks, recurrent neural networks and convolutional neural networks, which have been especially used in cardiovascular medicine and cardiovascular imaging. The end-diastolic and end-systolic volumes in cardiac magnetic resonance were forecast by the application of a convolutional neural network. A recurrent neural network has been used for the prediction of HF prior to diagnosis by doctors and performed better than the algorithm obtained from machine learning [119]. Labovitz et al. [120] used AI for examining the adherence rate of drugs in patients suffering an ischemic stroke. They studied and evaluated the rate of adherence to three direct oral anticoagulants, namely rivaroxaban, dabigatran and apixaban, along with warfarin through plasma sampling. It was found that the rate of adherence to the drugs was lower than that reported earlier. The patients were administered drug doses recommended through AI simulations, and 67% of the patients exhibited high improvement due to optimal adherence to the drugs. Thus the laboratory tests were minimised for elderly patients, and the ingestion of the medicine was monitored using AI.

Although the presence of multiple layers in deep learning has delivered improved performance as compared to SVM, it is associated with different parameters during the non-linear analysis. This phenomenon leads to overfitting and a failure to predict, which can be remedied by reducing the concealed datasets and enhancing the size of the training data set.

2.8.3. Tuberculosis

The enhanced availability of chemical and biological data, the development in data storage capacities, the advancement of the software and technologies for drug screening and an increase in the accuracy of the identification of the targets have altered the existing scenario of antitubercular therapeutic drugs. Computer-aided drug design (CADD) has been applied extensively in pharmaceutical research involving tuberculosis drugs. Computational drug discovery has been categorised as ligand-based drug design (LBDD) and structure-based drug design (SBDD). Thus the data obtained from the interaction of different ligands are used for the generation of models based on LBDD, while the 3D structure and its binding pockets are analysed in SBDD [121]. The data mining and docking approaches, the tools used for studying SBDD and LBDD, are also applied to the available TB proteome and genome, which is analysed by virtual screening for identification of the potent candidate from the database.

FtsZ propels cell division and is an established Mtb target [122]. An in silico method comprising prediction of the binding site, 3D-QSAR, MD simulation and docking was applied to FtsZ. This protein was targeted using trisubstituted benzimidazoles, with homology modelling using S. aureus FtsZ as the template for acquiring the GDP-bound structure. The GDP binding site was recognised using ProFunc, and the selected drugs were docked to the Mtb FtsZ model using AutoDock. The orientation of the cyclohexyl group in the benzimidazole scaffold was analysed using MD simulation studies to select the most hydrophilic pocket, while the hydrophilic part was identified for the carbamate group present in FtsZ. Thus the stabilisation of the binding of the ligand and the inhibition of Mtb FtsZ were studied by the in silico analysis [122].

A combination of the ML algorithm and AI was used for the identification of the resistant or susceptible characteristics of single nucleotide variations (SNV) towards TB and for forecasting the resistance towards the mutations [123]. The mutations which lead to drug resistance in M. tb were recognised using four different ML algorithms: ANN, SVM, naive Bayes (NB) and k-nearest neighbour (kNN). The models were generated for genes associated with the first-line TB drugs like pyrazinamide, rifampicin, fluoroquinolones and isoniazid. The studies were directed to recognise the structure-based and sequence-based effects of these mutations for individual target genes. The susceptible or resistant feature of the mutation was spotted using a feature selection method. The alteration in the amino acid residues leading to a change in the properties of the mutant and wild-type proteins was used as input for the model. The models were 88% accurate, where the polarity and hydrophobic properties were significant parameters in the models. Thus the drug-associated mutations were studied.

2.8.4. Cancer

The field of anti-cancer drug development has been extensively influenced by AI methods. There is a relationship between the activity of the drug and the genomic variability of the cancer cells. The integration of the Machine Learning and screening data was conducted to form a Random Forest
model which analysed the performance of the anticancer drug depending on the state of mutation of the cancer cell genome [124]. Another model named elastic net regression was developed by using the ML process to study the sensitivity of the drug [125]. Thus the sensitivity of drugs was studied for patients with gastric cancer exposed to 5-FU, ovarian cancer exposed to tamoxifen and endometrial cancer exposed to paclitaxel (Fig. 5).

Fig. (5). Structures of sensitive anticancer drugs.

The prognosis of the patients with these drugs was low, proving the role of AI in forecasting the sensitivity of the drugs used against cancer. Additionally, the resistance towards a cancer drug was also predicted by AI [126]. The tolerance and management of chemotherapy drugs were also handled using AI. The optimum dose of enzalutamide and ZEN-3694 was determined using an AI platform called CURATE.AI, which contributed to the tolerance of the therapy [127]. The breast cancer cells with a deficiency in homologous recombination were treated using poly ADP-ribose polymerase inhibitors. The cancer cells having the defect of homologous recombination were detected with an accuracy of 74% by a screening system using deep learning models [128]. The effects of two drugs used during chemotherapy, namely gemcitabine and taxol, were identified by analysing the relation of the drug with the genes of the patient.

The algorithms of ML should be trained for screening the data to produce models that can analyse the effect of a drug combination or a new drug on the response of cancer cell lines [129]. A large amount of data is required for the synthesis of a new drug. These chemical data are processed by ML to give the results for the synthesis of the drug [130]. The drug application has also been improved by deep learning technologies.

2.8.4.1. Cancer Genomics

Between 1,000 and 100,000 genomic mutations were screened to study the genomic data for each tumour sample from different patients suffering from cancer [131]. Relating the mutations to clinical phenotypes is a challenge in genomic medicine. The medical literature is the basis for the clinical analysis of the genetic variants. Thus there should be a connection between the effective drugs, types of diseases and genomic mutations, which is a complex procedure to be handled manually. Hence the role of AI is significant, where the sequence is converted to a binary table. The binary table indicates the absence or presence of each of the four bases at each position. Deep learning allows multitask learning along with single-task learning through sharing of model parts. An accurate optimiser algorithm is used for analysing the difference between the actual values and the forecasted values. Additionally, various data can be integrated through multimodal learning. Thus AI combines all the multilayered data. The drug-susceptible genes are also recognised by relating the omics data to the particular anti-cancer drugs.

2.8.5. Neurological Diseases

The development of drugs for neurological disorders is a difficult task, especially for the identification of endpoints and adherence control [132]. Self-monitoring is complex for these patients because of their neurological problems, which make it difficult to control their nature and to manage the logging of the medication routine. For instance, an epileptic patient succumbing to an absence seizure is unable to record the incident. Similarly, an individual experiencing depression may also be unable to self-monitor the medications. Additionally, neurological diseases cannot be generalised. The symptoms are specific to each person, which makes treatment and diagnosis challenging. Thus AI has been used for designing sensors for real-time monitoring of the patients. Accuracy in this field of medicine was improved by the use of deep learning algorithms. Automated CAD systems have been used for epilepsy, Alzheimer’s disease and Parkinson’s disease. EEG signals help in the identification of seizures and the diagnosis of epilepsy. The backpropagation neural network and SVM are used as classifiers. The HOS and WT are coupled with energy and entropy characteristics for proper performance.

2.8.6. Dermatology

Dermatological problems and skin cancer have been analysed using the CNN approach with a specificity of 90% and a sensitivity of 87% [11, 133]. The ML approach has been used in teledermatology and dermatopathology. The scope of developing personalised medicine has increased with the application of ML technology. The side effects of the medicine are avoided by searching for precision drugs, which is done by using a multi-omics platform or psychosocial characteristics. Numerous data points can be analysed for identifying genetic biomarkers. The classification algorithm has been applied to melanoma and non-melanoma skin cancer. The application of AI is still in its infancy in other areas of dermatology. AI can be effectively used for psoriasis, skin cancer and psoriatic arthritis, while other areas have to be developed further.
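The sensitivity and specificity figures quoted for the CNN classifier in Section 2.8.6 are ratios derived from a confusion matrix. A minimal sketch of that calculation is given below; the function name and the counts are hypothetical and were chosen only so that the output reproduces the percentages quoted in the text. They are not data from the cited studies.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of diseased cases correctly detected
    specificity = TN / (TN + FP): fraction of healthy cases correctly cleared
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts: 100 malignant and 100 benign lesions.
sens, spec = sensitivity_specificity(tp=87, fn=13, tn=90, fp=10)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Reporting both values matters because a classifier can score highly on one while performing poorly on the other, for example by labelling every lesion as malignant.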

2.9. AI and Pharmaceutical Companies

Pharmaceutical companies have collaborated with AI companies for the successful application of AI in drug discovery since 2017. Thus biomarkers, diagnostics, the design of new target molecules and the recognition of drug targets have seen many improvements [134]. A commonly used tool in the discovery of drugs is BenevolentAI. Atomwise is a company dealing with AI in health care and has collaborated with different pharmaceutical groups, along with Stanford University and Harvard University, to search for innovative drugs for 27 targets. BenevolentAI analyses data through text mining, scanning biological and genetic information to gain insight into the relations within it. Thus graphs containing detailed dynamic maps and over a billion relationships are created, in which the complicated relationships generate information and reveal new connections that can provide the basis for new theories. Human analysis on certain occasions leads to biased interpretation, while AI uses scientific data and unbiased analysis. Phenotypic drug discovery is carried out by a company called Exscientia. Similarly, Numerate is another remarkable AI company, which deals in ligand chemistry. The designed drugs have also been tested for their clinical efficacy and the development of the health care system.

AI substitutes conventional human intelligence with automated algorithms in the arena of pharmaceutical science. The application of AI to complex diseases through the development of new drugs has been prevalent in both the biotech and pharmaceutical industries for the last 5 years. According to a survey conducted by Verdict AI on the opinion of business houses about investing in artificial intelligence in pharma companies, 70% of the groups were positive about the investment due to a bright future. Another report by Narrative Science revealed that 61% of the companies apply AI to identify any missing links in their novel tactics of drug development.

2.9.1. Companies Using AI

Novel drugs for glaucoma are being developed through the collaboration of an AI-based biopharmaceutical company (twoXAR) and an ophthalmology company (Santen). Potential ocular drug candidates would be discerned and screened using the drug discovery platform of twoXAR, which relies on AI, cloud computing and big data. Thus promising treatments for glaucoma are expected in the near future. Antidote is based on AI and applies natural language processing to simplify the intricacy of the inclusion and exclusion criteria of clinical trials. The detailed reports of the clinical trials are submitted by the pharmaceutical companies and analysed by Antidote. The main asset is that the eligibility characteristics of a patient can be entered and deciphered by Antidote, whereby it can predict suitable clinical trials. Cancer therapies are being developed through a partnership between a biotechnology company (Genentech) and a data analytics company (GNS Healthcare). GNS Healthcare would apply the ML approach and data simulation to the data provided by Genentech. The huge data from multiple patients are transformed into different computer models by Reverse Engineering and Forward Simulation (REFS), the platform of GNS Healthcare. Individual cancer treatments can be found using the models, which divulge diagnostic markers and novel targets. The clinical models can be engineered in a reverse manner, which aids in understanding the growth of the cancer cells. Thus the responses to different drugs can be studied to create a response marker for the patients.

IBM Watson made it feasible to compile the list of clinical trials faster than with the traditional methods [135]. The patients suitable for the trials were identified by taking into account the unstructured and structured data from the medical records, which helped to determine the significant characteristics for a specific diagnosis. The algorithm for deep natural language processing and reasoning in Watson gives physicians an insight into the symptoms and health status of the patients. The software can predict the required clinical trials for the related symptoms using the database while eliminating the undesired characteristics. Ali Health, a subsidiary of Alibaba from China, has partnered with AstraZeneca, a British drug manufacturer, for the development of proper drugs and health services using AI. Many elderly people in China with symptoms of diabetes and cancer are being treated with the help of the AI technology of AstraZeneca. The patients are analysed in the ambulance with the technology so that they can be admitted to the appropriate hospitals. The drugs of the company could also be easily accessed by the patients.

The adherence of patients to a drug is significant for a pharmaceutical company. Adherence data were conventionally submitted by the patients themselves; this problem has been resolved by AiCure, a mobile SaaS platform from New York, which uses an algorithm for the identification of images. AiCure monitors adherence with a cell phone right after the patient swallows the drug. For patients suffering from schizophrenia, the AI method subsequently raised adherence to 89.7%, as compared to 71.9% with modified directly observed therapy.

Another biotechnology company called Cyclica invents drugs through a combination of biophysics and AI in partnership with Bayer. They apply a cloud-based technology called Ligand Express, which determines the polypharmacological profiles of small drug molecules by screening against proteins whose structure has been identified. AI is used to verify the effect of the screened drug on the particular target protein and to analyse any associated side effects. The interaction of the protein and the drug is generated visually. Chronic thromboembolic pulmonary hypertension has symptoms similar to those of asthma and is not easily recognised. Thus Merck & Co, along with Bayer, was granted AI software for the analysis of this condition so that the specific patterns could be identified by the radiologist. Images of the pulmonary vessels and of cardiac and lung perfusion were analysed by AI to generate the specific patterns. ML technology has been extensively used by Novartis for generating digital images of cells, by grouping the cells that show similar effects. Research at Novartis utilises the images generated by the machine learning algorithms to examine compounds which have not been tested. Additionally, Novartis has been closely associated with the application of AI in the field of pharmaceuticals.

Deubiquitinase (DUB) inhibitors were developed by two companies, namely AbbVie (a pharmaceutical company) and Mission Therapeutics (a drug-making company), for treating Alzheimer’s and Parkinson’s diseases. Both diseases are caused by the presence of toxic proteins which hamper the nerve cells. The DUB prevents the degeneration of the proteins and maintains stability. Further modifications are underway using AI. Another company, Verge Genomics, has been developing drugs through automated data collection and analysis for treating diseases like Alzheimer’s and ALS. The numerous genes causing the diseases were targeted by a single drug. The application of AI by different pharmaceutical companies through partnerships with various AI companies is shown in Table 2.

Table 2. Encapsulation of the collaboration of pharmaceutical companies with different AI companies.

| Pharmaceutical Company | AI Company | Drug Development |
|---|---|---|
| Sanofi (www.sanofi.com) | Exscientia | Metabolic-disease therapies like diabetes. |
| Sanofi (www.sanofi.com) | BERG | Evaluate potential biomarkers of seasonal influenza vaccination outcomes in an unbiased and data-driven manner. |
| Sanofi (www.sanofi.com) | Researchably | Move through multiple research studies and identify the most significant study to pharma stakeholders. |
| Roche subsidiary Genentech (www.gene.com) | GNS Healthcare | Cancer treatments. |
| Berg (www.berghealth.com) | | Model to recognize unknown cancer mechanisms. Discovery of drug BPM31510. Searching for drug targets and therapies for diabetes and Parkinson’s disease. |
| BenevolentBio | BenevolentAI | Data obtained from sources like research papers, patents, clinical trials and patient records to form knowledge graphs. |
| Pfizer (www.pfizer.com) | IBM Watson | Drug discovery in immuno-oncology. |
| Pfizer (www.pfizer.com) | CytoReason | Create a cell-based model of the trial-specific immune response. |
| Pfizer (www.pfizer.com) | Massachusetts Institute of Technology | Pharmaceutical Discovery and Synthesis Consortium. |
| Pfizer (www.pfizer.com) | Concerto HealthAI | Study designs for therapeutics that are both pre and post-approved. |
| Pfizer (www.pfizer.com) | Catalia Health | Home robot for training patients on health and prescription drugs. Know about the clinical journey of the patient using artificial intelligence. |
| Roche (www.roche.com) | Exscientia, Owkin | Diabetic macular edema. |
| Novartis (www.novartis.com) | PathAI | Decode cancer pathology images through AI. |
| Novartis (www.novartis.com) | IBM Watson | Breast cancer clinical trial. |
| Novartis (www.novartis.com) | McKinsey’s QuantumBlack | 500 clinical trial operations with machine learning around the world in real time. |
| Novartis (www.novartis.com) | MIT | Pharmaceutical Discovery and Synthesis Consortium. |
| Novartis (www.novartis.com) | Intel | Application of DNN to accelerate high content screening, a key element of early drug discovery. |
| Novartis (www.novartis.com) | University of Oxford’s Big Data Institute (BDI) | Early forecasting of patient responses to treatments for inflammatory diseases, such as multiple sclerosis (MS) and psoriasis. |
| Novartis (www.novartis.com) | MELLODDY | Companies to develop appropriate models to predict promising compounds for the final stages of drug discovery and development. |
| Novartis (www.novartis.com) | Microsoft | Leverage data & AI to know about how medicines are discovered, developed and commercialized. |
| Janssen (www.janssen.com) | - | Application of AI to track the progress of skin over time for deep information about the actual requirement and health of the skin. |
| Johnson & Johnson (www.jnj.com) | Auris Health | Surgeons can detect small and hard-to-access lung nodules in premature conditions to diagnose and treat lung cancer. |
| Johnson & Johnson (www.jnj.com) | BenevolentAI | Select the number of novel clinical-stage drug candidates and their extensive related portfolio of patents. |
| Johnson & Johnson (www.jnj.com) | WinterLight Labs | Predicting neurodegeneration and dementia from voice samples obtained through Janssen clinical trials. |
| Merck (MSD) (www.msd.com) | Accenture in collaboration with Amazon Web Services (AWS) | A cloud-based informatics research platform to improve efficiency, productivity and innovation in the early stages of developing the drug. |
| AbbVie (www.abbvie.com) | AiCure | AI-based patient monitoring platform improved adherence in an AbbVie phase 2 schizophrenia trial. |
| GlaxoSmithKline (www.gsk.com) | Exscientia | Discover small molecules selectively for around 10 disease-related targets across undisclosed therapeutic area. |
| GlaxoSmithKline (www.gsk.com) | “In silico Drug Discovery Unit.” | AI in drug discovery. |
| GlaxoSmithKline (www.gsk.com) | Google | Design implantable devices that can improve electrical signals which transmit through nerves in the body. The irregular impulses associated with different conditions can also be detected. |
| GlaxoSmithKline (www.gsk.com) | Insilico Medicine | Recognition of new biological targets and pathways. |
| GlaxoSmithKline (www.gsk.com) | Alliance for AI in healthcare (AAIH) | Development and application of AI in healthcare to create more efficient healthcare systems. |
| GlaxoSmithKline (www.gsk.com) | MELLODDY | Train machine learning models on datasets from different partners. The privacy of each partner is maintained using federated learning. |
| GlaxoSmithKline (www.gsk.com) | Cloud Pharmaceuticals, Inc. | Design new small-molecule agents to GSK specified targets. |
| GlaxoSmithKline (www.gsk.com) | Exscientia | Design potent in vivo active lead molecule and targeting a novel pathway for treating chronic obstructive pulmonary disease (COPD). |
| GlaxoSmithKline (www.gsk.com) | Universities of Strathclyde and Nottingham | Use AI in synthetic chemistry. |
| Amgen (www.amgen.com) | GNS Healthcare | Use AI in precision medicine. |
| Amgen (www.amgen.com) | MIT | Pharmaceutical discovery. |
| Amgen (www.amgen.com) | Owkin | Pharmaceutical discovery. |
| Amgen (www.amgen.com) | MELLODDY | Train machine learning models on datasets from multiple partners. |
| Gilead Sciences | Insitro | Designing disease models for nonalcoholic steatohepatitis (NASH). Search for targets that affect the disease's progression and regression. |
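The federated learning listed for MELLODDY in Table 2 can be sketched as follows. This is a toy illustration under stated assumptions, not the MELLODDY implementation: each partner fits a one-parameter linear model y ≈ w·x on its own private data, and only the fitted weight, never the data, is returned and averaged in proportion to data set size (a scheme commonly called federated averaging). All function names, the synthetic data and the learning rate are hypothetical.

```python
import random
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one partner


def make_partner_data(n: int, seed: int, true_w: float = 3.0) -> Dataset:
    """Synthetic private data for one partner (assumption: y = 3x plus small noise)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, true_w * x + rng.gauss(0.0, 0.05)))
    return data


def local_update(w: float, data: Dataset, lr: float = 0.1, epochs: int = 20) -> float:
    """Gradient descent on mean squared error, run inside one partner's own site."""
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global: float, partners: List[Dataset]) -> float:
    """One round: every partner trains locally, then only the weights are averaged."""
    local_weights = [local_update(w_global, data) for data in partners]
    total = sum(len(data) for data in partners)
    return sum(w * len(data) for w, data in zip(local_weights, partners)) / total


partners = [make_partner_data(30, seed=1),
            make_partner_data(50, seed=2),
            make_partner_data(20, seed=3)]
w = 0.0
for _ in range(10):
    w = federated_round(w, partners)
print(f"shared weight after 10 rounds: {w:.3f}")
```

In this sketch the coordinating step only ever sees three numbers per round, which is the sense in which the privacy of each partner's data set is maintained.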

2.10. Validation of the AI in Medicine

AI has shown potential in predicting medicines, for which different algorithms have been approved. However, approvals of new algorithms are scarce, as newly developed algorithms must meet the clinical standards dictated by the professional and regulatory bodies. Testing and external validation are mandatory processes for advanced algorithms [136]. Apart from the TRIPOD checklist, the existing model for multivariable prediction methods in medicine [137], there are few other established methods. In 2018, the U.S. Food and Drug Administration (FDA) approved the WAVE Clinical Platform, which combines the real-time vital-sign data of patients. The algorithm detected the in-house hospital patients who were at risk of unstable vital signs. WAVE was the first predictive monitoring programme to be used with electronic health records (EHR) and is recognised as the first AI algorithm product. It was cleared by the FDA based on the likely evidence. Nevertheless, the FDA approval of the advanced algorithms
in the field of clinical practice is limited as compared to that of the analysis of diagnostic imaging. Innumerable variables are provided as inputs to the algorithms. Thus, unlike a device or drug, the algorithms are not static, since the predictive outcome of an algorithm may change when additional data are provided as inputs. Thus new regulatory agendas have been implemented by the FDA for fresh diagnostic procedures. The FDA Biomarker Qualification programme was formed for the authentication of biomarkers for the testing and development of drugs.

The 5 prominent vital signs (oxygen saturation, blood pressure, heart rate, temperature and respiration rate) are used for the algorithm in WAVE and are used by the different health systems. There are different complex ML algorithms for imaging parameters or EHR, which are very specific and thus cannot be generalised and used for other EHR. Additionally, differences in the user interface and multiple other limitations of operating in a different clinical setting hinder the capability of the clinical staff to respond to the result of a predictive algorithm. Thus the predictive outputs can be enhanced if the regulatory bodies identify the EHR inputs. The input variables developed for the algorithms should be highly specific in order to acquire reliable outputs across the organisation. The intellectual property and proprietary interests of the developers of the algorithms should be made transparent by the regulators. The data for the algorithms should be taken from a multi-ethnic population and should not be based on the data of a single institution [138]. Thus data from extensive populations are required for training the algorithms.

Predictive algorithms should be audited after FDA approval, just as approved drugs undergo scrutiny after marketing following clinical trials. The alteration in the predictive output can be accounted for through this technique, as deep learning algorithms will consider new variables with the progress of time. Systematic biases can be alleviated through continuous audits. The algorithmic analysis should be carried out for both anonymous and synthetic data. The post-marketing audits should be conducted by the regulatory boards, considering the intellectual property.

CONCLUSION AND FUTURE PERSPECTIVES

The progress of AI is in its third wave with the application of deep learning and machine learning approaches. The performance of AI is superior to that of humans in image identification, voice identification and natural language processing. AI has shown a huge advancement in the initial phases of drug discovery. It has been extensively used in studying pharmacokinetic properties, recognising drug targets, standardising synthetic methods, structuring clinical trials and forecasting the toxicity of a drug. The application of AI has been made popular by different pharmaceutical companies. A Kaggle competition was organised by Merck to accurately predict the molecular properties of drugs. The winners lacked specific knowledge of the domain and had no expertise in the medicinal field. However, they solved the problem using a multitask model.

The main hurdle in the application of AI in new drug discovery is the dearth of problem-specific, high-quality data. This stands as a hindrance to the AI-assisted drug discovery module. The AI application in the field of imaging is accurate as each of the data points is minute. However, the application of AI in discovering a novel drug is still not adequate. This is mainly because the experimental conditions for generating the data points are very complex and the biological systems affected by the drug are also very complicated. The results vary depending on the experimental conditions, and the data available for drug discovery are very limited. Thus both the quantity and quality of data need improvement for the proper application of AI in the quest for new drugs. A huge number of precise Kd values along with crystal structures would help to solve the docking and scoring problem and thus improve the health of individuals. Thus data sharing is a significant endeavour, which is lacking between the pharmaceutical industries. Small initiatives were put forth in this regard by an organisation called the Pistoia Alliance, which have to be embraced by all pharmaceutical companies to make the application of AI in drug discovery a big success.

Additionally, the format of the data differs between companies, which demands a unified format. Thus algorithms have to be developed for managing data in diverse formats. The property of a drug has been analysed using little data by an algorithm developed at Stanford University called one-shot learning. New classes of drugs are identified by the application of standard one-shot learning, and their characteristics are studied and predicted in the new experimental system. In the deep one-shot learning system, convolutional layers encode the images into continuous vectors. With this technique the nature of molecules can be studied when they are introduced in a new molecular scaffold, by providing new data points. Thus one-shot methods have to be further developed for the localisation of objects with a restricted amount of data.

Another model called transfer learning, which was developed by the Schneider group at ETH Zurich, can decipher sparse data. The challenges encountered with scarce data in the field of drug discovery were overcome by this branch of ML. The activity of a drug, along with its molecular properties, can be estimated by this method with small available data sets. The PPAR and PXR inhibitors were designed de novo by the transfer learning method. However, the experimental aspect of this field still remains to be explored. The output of the transfer learning process requires a standard metric for evaluation, instead of being judged by the decrease in errors or the enhancement of accuracy. The small data size is another challenge in the transfer learning process. The major hurdle is to quantify the similarity between two different drug activities, since this relation is more significant than the data size. Additionally, the design of the network structure is pivotal, as inaccurate selection methods can lead to negative transfer. There are different rules for fine-tuning, which are empirical. Additional layers are fixed to prevent overfitting in the case of small target data. Alternatively, all the layers are fine-tuned in the case of large target data.
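The empirical fine-tuning rule for transfer learning (fix layers when the target data set is small, tune every layer when it is large) can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the two-parameter model `head * (feature * x)`, the cut-off `SMALL_DATA_THRESHOLD` (the text gives no number) and the function names. It is not the method of the Schneider group.

```python
from typing import Dict, List, Tuple

SMALL_DATA_THRESHOLD = 50  # hypothetical cut-off between "small" and "large" target data


def choose_trainable(n_target: int) -> Dict[str, bool]:
    """Small target set: keep the pretrained feature layer fixed, train only the head.
    Large target set: fine-tune every layer."""
    small = n_target < SMALL_DATA_THRESHOLD
    return {"feature": not small, "head": True}


def fine_tune(params: Dict[str, float], data: List[Tuple[float, float]],
              lr: float = 0.1, steps: int = 200) -> Dict[str, float]:
    """Gradient descent on mean squared error for y ≈ head * (feature * x).
    Layers marked as not trainable keep their pretrained value."""
    p = dict(params)
    trainable = choose_trainable(len(data))
    for _ in range(steps):
        grad_feature = 0.0
        grad_head = 0.0
        for x, y in data:
            hidden = p["feature"] * x
            error = p["head"] * hidden - y
            grad_head += 2.0 * error * hidden / len(data)
            grad_feature += 2.0 * error * p["head"] * x / len(data)
        if trainable["head"]:
            p["head"] -= lr * grad_head
        if trainable["feature"]:
            p["feature"] -= lr * grad_feature
    return p


# Pretrained weights from a hypothetical source task; the target task is y = 3x.
pretrained = {"feature": 2.0, "head": 1.0}
small_target = [(i / 10, 3.0 * i / 10) for i in range(-5, 6)]  # 11 points: "small"
tuned = fine_tune(pretrained, small_target)
print(tuned)
```

With only 11 target points the feature weight stays at its pretrained value and only the head adapts, which limits how far the model can drift towards the few new examples; with a data set above the threshold both weights would be updated.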
It is expensive to bring a drug to market, accounting for all the background activities, and there is also a probability of failure. Thus huge investment is mandatory in the field of drug research. AI can alter the scenario by enhancing the success rate in drug discovery, which in turn will be beneficial in both financial and human terms.

CONSENT FOR PUBLICATION

Not applicable.

FUNDING

None.

CONFLICT OF INTEREST

The authors declare no conflict of interest, financial or otherwise.

ACKNOWLEDGEMENTS

Declared none.

REFERENCES

[1] Schneider, G.; Fechner, U. Computer-based de novo design of drug-like molecules. Nat. Rev. Drug Discov., 2005, 4(8), 649-663.
http://dx.doi.org/10.1038/nrd1799 PMID: 16056391
[2] Schneider, G.; Clark, D.E. Automated de novo drug design: Are we nearly there yet? Angew. Chem. Int. Ed. Engl., 2019, 58(32), 10792-10803.
http://dx.doi.org/10.1002/anie.201814681 PMID: 30730601
[3] Schneider, G.; Geppert, T.; Hartenfeller, M.; Reisen, F.; Klenner, A.; Reutlinger, M.; Hähnke, V.; Hiss, J.A.; Zettl, H.; Keppner, S.; Spänkuch, B.; Schneider, P. Reaction-driven de novo design, synthesis and testing of potential type II kinase inhibitors. Future Med. Chem., 2011, 3(4), 415-424.
http://dx.doi.org/10.4155/fmc.11.8 PMID: 21452978
[4] McCarthy, J.; Hayes, P.J. Some Philosophical Problems from Standpoint of Artificial Intelligence. In: Machine Intelligence; Edinburgh University Press: Edinburgh, 1969; pp. 463-502.
[5] Qian, N.; Sejnowski, T.J. Predicting the secondary structure of globular proteins using neural network models. J. Mol. Biol., 1988, 202(4), 865-884.
http://dx.doi.org/10.1016/0022-2836(88)90564-5 PMID: 3172241
[6] Hammett, L.P. The effect of structure upon the reactions of organic compounds. Temperature and solvent influences. J. Chem. Phys., 1936, 4, 613-617.
http://dx.doi.org/10.1063/1.1749914
[7] Hansch, C.; Fujita, T. ρ-σ-π Analysis. A method for the correlation of biological activity and chemical structure. J. Am. Chem. Soc., 1964, 86, 5710.
http://dx.doi.org/10.1021/ja01078a623
[8] Radchenko, E.V.; Dyabina, A.S.; Palyulin, V.A.; Zefirov, N.S. Prediction of human intestinal absorption of drug compounds. Russ. Chem. Bull., 2016, 65, 576-580.
http://dx.doi.org/10.1007/s11172-016-1340-0
[9] Jayaram, H.N.; Gharehbaghi, K.; Jayaram, N.H.; Rieser, J.; Krohn, K.; Paull, K.D. Cytotoxicity of a new IMP dehydrogenase inhibitor, benzamide riboside, to human myelogenous leukemia K562 cells. Biochem. Biophys. Res. Commun., 1992, 186(3), 1600-1606.
http://dx.doi.org/10.1016/S0006-291X(05)81591-8 PMID: 1354960
[10] Martin, Y.C.; Holland, J.B.; Jarboe, C.H.; Plotnikoff, N. Discriminant analysis of the relationship between physical properties and the inhibition of monoamine oxidase by aminotetralins and aminoindans. J. Med. Chem., 1974, 17(4), 409-413.
http://dx.doi.org/10.1021/jm00250a008 PMID: 4830537
[11] Kneller, D.G.; Cohen, F.E.; Langridge, R. Improvements in protein secondary structure prediction by an enhanced neural network. J. Mol. Biol., 1990, 214(1), 171-182.
http://dx.doi.org/10.1016/0022-2836(90)90154-E PMID: 2370661
[12] Weinstein, J.N.; Kohn, K.W.; Grever, M.R.; Viswanadhan, V.N.; Rubinstein, L.V.; Monks, A.P.; Scudiero, D.A.; Welch, L.; Koutsoukos, A.D.; Chiausa, A.J.; Paull, K.D. Neural computing in cancer drug development: Predicting mechanism of action. Science, 1992, 258, 447-451.
[13] Schneider, G. Generative models for artificially-intelligent molecular design. Mol. Inform., 2018, 37(1-2), 1880131.
http://dx.doi.org/10.1002/minf.201880131 PMID: 29442446
[14] Ho, T.K. Random decision forests. Proc. Int. Conf. Doc. Anal. Recognition, ICDAR, 1995, pp. 278-282.
[15] Guenther, N.M.S. Support Vector Machines (SVM). Gesture, 2001, 23, 349-361.
[16] Lohmann, R.; Schneider, G.; Wrede, P. Structure optimization of an artificial neural filter detecting membrane-spanning amino acid sequences. Biopolymers, 1996, 38(1), 13-29.
http://dx.doi.org/10.1002/(SICI)1097-0282(199601)38:1<13::AID-BIP2>3.0.CO;2-Z PMID: 8679941
[17] Mayr, A.; Klambauer, G.; Unterthiner, T.; Hochreiter, S. DeepTox: Toxicity prediction using deep learning. Front. Environ. Sci., 2016, 3, 80.
http://dx.doi.org/10.3389/fenvs.2015.00080
[18] Dahl, G.E.; Jaitly, N.; Salakhutdinov, R. Multi-Task Neural Networks for QSAR Predictions. arXiv:1406.1231v1, 2014.
[19] Gómez-Bombarelli, R.; Wei, J.N.; Duvenaud, D.; Hernández-Lobato, J.M.; Sánchez-Lengeling, B.; Sheberla, D.; Aguilera-Iparraguirre, J.; Hirzel, T.D.; Adams, R.P.; Aspuru-Guzik, A. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci., 2018, 4(2), 268-276.
http://dx.doi.org/10.1021/acscentsci.7b00572 PMID: 29532027
[20] Segler, M.H.S.; Kogej, T.; Tyrchan, C.; Waller, M.P. Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci., 2018, 4(1), 120-131.
http://dx.doi.org/10.1021/acscentsci.7b00512 PMID: 29392184
[21] Putin, E.; Asadulaev, A.; Ivanenkov, Y.; Aladinskiy, V.; Sanchez-Lengeling, B.; Aspuru-Guzik, A.; Zhavoronkov, A. Reinforced Adversarial Neural Computer for de novo Molecular Design. J. Chem. Inf. Model., 2018, 58(6), 1194-1204.
http://dx.doi.org/10.1021/acs.jcim.7b00690 PMID: 29762023
[22] Reymond, J.L.; Van Deursen, R.; Blum, L.C.; Ruddigkeit, L. Chemical space as a source for new drugs. MedChemComm, 2010, 1, 30-38.
http://dx.doi.org/10.1039/c0md00020e
[23] Paul, D.; Sanap, G.; Shenoy, S.; Kalyane, D.; Kalia, K.; Tekade, R.K. Artificial intelligence in drug discovery and development. Drug Discov. Today, 2021, 26(1), 80-93.
http://dx.doi.org/10.1016/j.drudis.2020.10.010 PMID: 33099022
[24] Duran, O.; Rodriguez, N.; Consalter, L.A. Neural networks for cost estimation of shell and tube heat exchangers. Expert Syst. Appl., 2009, 36, 7435-7440.
http://dx.doi.org/10.1016/j.eswa.2008.09.014
[25] Park, Y.; Goto, D.; Yang, K.F.; Downton, K.; Lecomte, P.; Olson, M.; Mullins, C.D. A literature review of factors affecting price and competition in the global pharmaceutical market. Value Health, 2016, 19, A265.
http://dx.doi.org/10.1016/j.jval.2016.03.816
[26] de Jesus, A. AI for Pricing – Comparing 5 Current Applications. EMERJ, 2019, Available from: https://emerj.com/ai-sector-overviews/ai-for-pricing-comparing-5-current-applications/
[27] Chan, H.C.S.; Li, Y.; Dahoun, T.; Vogel, H.; Yuan, S. New binding sites, new opportunities for GPCR drug discovery. Trends Biochem. Sci., 2019, 44(4), 312-330.
http://dx.doi.org/10.1016/j.tibs.2018.11.011 PMID: 30612897
[28] Cavasotto, C.N.; Phatak, S.S. Homology modeling in drug discovery: Current trends and applications. Drug Discov. Today, 2009, 14(13-14), 676-683.
http://dx.doi.org/10.1016/j.drudis.2009.04.006 PMID: 19422931
[29] Hayik, S.A.; Dunbrack, R., Jr; Merz, K.M., Jr. A mixed QM/MM scoring function to predict protein-ligand binding affinity. J. Chem. Theory Comput., 2010, 6(10), 3079-3091.
http://dx.doi.org/10.1021/ct100315g PMID: 21221417
[30] Smith, J.S.; Isayev, O.; Roitberg, A.E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. (Camb.), 2017, 8(4), 3192-3203.
http://dx.doi.org/10.1039/C6SC05720A PMID: 28507695
[31] Zhang, Y.J.; Khorshidi, A.; Kastlunger, G.; Peterson, A.A. The potential for machine learning in hybrid QM/MM calculations. J. Chem. Phys., 2018, 148(24), 241740.
http://dx.doi.org/10.1063/1.5029879 PMID: 29960374
[32] Bai, F.; Morcos, F.; Cheng, R.R.; Jiang, H.; Onuchic, J.N. Elucidating the druggable interface of protein-protein interactions using fragment docking and coevolutionary analysis. Proc. Natl. Acad. Sci. USA, 2016, 113(50), E8051-E8058.
http://dx.doi.org/10.1073/pnas.1615932113 PMID: 27911825
[33] Wang, S.; Sun, S.; Li, Z.; Zhang, R.; Xu, J. Accurate de novo prediction of protein contact map by ultra-deep learning model. PLOS Comput. Biol., 2017, 13(1), e1005324.
http://dx.doi.org/10.1371/journal.pcbi.1005324 PMID: 28056090
[34] Luo, Y.; Zhao, X.; Zhou, J.; Yang, J.; Zhang, Y.; Kuang, W.; Peng, J.; Chen, L.; Zeng, J. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat. Commun., 2017, 8(1), 573.
http://dx.doi.org/10.1038/s41467-017-00680-8 PMID: 28924171
[35] Kadurin, A.; Aliper, A.; Kazennov, A.; Mamoshina, P.; Vanhaelen, Q.; Khrabrov, K.; Zhavoronkov, A. The cornucopia of meaningful leads: Applying deep adversarial autoencoders for new molecule development in oncology. Oncotarget, 2017, 8(7), 10883-10890.
http://dx.doi.org/10.18632/oncotarget.14073 PMID: 28029644
[36] Kadurin, A.; Nikolenko, S.; Khrabrov, K.; Aliper, A.; Zhavoronkov, A. druGAN: An advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico. Mol. Pharm., 2017, 14(9), 3098-3104.
http://dx.doi.org/10.1021/acs.molpharmaceut.7b00346 PMID: 28703000
[37] Ma, J.; Sheridan, R.P.; Liaw, A.; Dahl, G.E.; Svetnik, V. Deep neural nets as a method for quantitative structure-activity relationships. J. Chem. Inf. Model., 2015, 55(2), 263-274.
http://dx.doi.org/10.1021/ci500747n PMID: 25635324
[38] Kearnes, S.; Goldman, B.; Pande, V. Modeling industrial ADMET data with multitask networks. arXiv:1606.08793v3, 2016.
[39] Schneider, P.; Schneider, G. De novo design at the edge of chaos. J. Med. Chem., 2016, 59(9), 4077-4086.
http://dx.doi.org/10.1021/acs.jmedchem.5b01849 PMID: 26881908
[40] Gupta, A.; Müller, A.T.; Huisman, B.J.H.; Fuchs, J.A.; Schneider, P.; Schneider, G. Generative recurrent networks for de novo drug design. Mol. Inform., 2018, 37, 1700111.
http://dx.doi.org/10.1002/minf.201700111
[41] Müller, A.T.; Hiss, J.A.; Schneider, G. Recurrent neural network model for constructive peptide design. J. Chem. Inf. Model., 2018, 58(2), 472-479.
http://dx.doi.org/10.1021/acs.jcim.7b00414 PMID: 29355319
[42] Merk, D.; Friedrich, L.; Grisoni, F.; Schneider, G. De novo design of bioactive small molecules by artificial intelligence. Mol. Inform., 2018, 37(1-2), 1700153.
http://dx.doi.org/10.1002/minf.201700153 PMID: 29319225
[43] Hessler, G.; Baringhaus, K.H. Artificial intelligence in drug design.
V.; Lanctot, M.; Dieleman, S.; Grewe, D.; Nham, J.; Kalchbrenner, N.; Sutskever, I.; Lillicrap, T.; Leach, M.; Kavukcuoglu, K.; Graepel, T.; Hassabis, D. Mastering the game of Go with deep neural networks and tree search. Nature, 2016, 529(7587), 484-489.
http://dx.doi.org/10.1038/nature16961 PMID: 26819042
[49] Chuang, K.V.; Keiser, M.J. Predicting reaction performance in C–N cross-coupling using machine learning. Science, 2018, 362, 186-190.
[50] Maryasin, B.; Marquetand, P.; Maulide, N. Machine learning for organic synthesis: Are robots replacing chemists? Angew. Chem. Int. Ed. Engl., 2018, 57(24), 6978-6980.
http://dx.doi.org/10.1002/anie.201803562 PMID: 29701305
[51] Steiner, S.; Wolf, J.; Glatzel, S.; Andreou, A.; Granda, J.M.; Keenan, G.; Hinkley, T.; Aragon-Camarasa, G.; Kitson, P.J.; Angelone, D.; Cronin, L. Organic synthesis in a modular robotic system driven by a chemical programming language. Science, 2019, 363, eaav2211.
[52] Fuhrman, J.A.; Schwalbach, M.S.; Stingl, U. Proteorhodopsins: An array of physiological roles? Nat. Rev. Microbiol., 2008, 6(6), 488-494.
http://dx.doi.org/10.1038/nrmicro1893 PMID: 18475306
[53] Fooshee, D.; Mood, A.; Gutman, E.; Tavakoli, M.; Urban, G.; Liu, F.; Huynh, N.; Van Vranken, D.; Baldi, P. Deep learning for chemical reaction prediction. Mol. Syst. Des. Eng., 2018, 3, 442-452.
http://dx.doi.org/10.1039/C7ME00107J
[54] Jones, L.D.; Golan, D.; Hanna, S.A.; Ramachandran, M. Artificial intelligence, machine learning and the evolution of healthcare: A bright future or cause for concern? Bone Joint Res., 2018, 7(3), 223-225.
http://dx.doi.org/10.1302/2046-3758.73.BJR-2017-0147.R1 PMID: 29922439
[55] Couronné, R.; Probst, P.; Boulesteix, A.L. Random forest versus logistic regression: A large-scale benchmark experiment. BMC Bioinformatics, 2018, 19(1), 270.
http://dx.doi.org/10.1186/s12859-018-2264-5 PMID: 30016950
[56] Huang, S.; Cai, N.; Pacheco, P.P.; Narrandes, S.; Wang, Y.; Xu, W. Applications of Support Vector Machine (SVM) learning in cancer genomics. Cancer Genomics Proteomics, 2018, 15(1), 41-51.
PMID: 29275361
[57] Hennessy, S. Use of health care databases in pharmacoepidemiology. Basic Clin. Pharmacol. Toxicol., 2006, 98(3), 311-313.
http://dx.doi.org/10.1111/j.1742-7843.2006.pto_368.x PMID: 16611207
[58] Anighoro, A.; Bajorath, J.; Rastelli, G. Polypharmacology: Challenges and opportunities in drug discovery. J. Med. Chem., 2014, 57(19), 7874-7887.
http://dx.doi.org/10.1021/jm5006463 PMID: 24946140
[59] Jasial, S.; Gilberg, E.; Blaschke, T.; Bajorath, J. Machine learning distinguishes with high accuracy between pan-assay interference compounds that are promiscuous or represent dark chemical matter. J. Med. Chem., 2018, 61(22), 10255-10264.
http://dx.doi.org/10.1021/acs.jmedchem.8b01404 PMID: 30422657
[60] Pereira, J.C.; Caffarena, E.R.; Dos Santos, C.N. Boosting docking-based virtual screening with deep learning. J. Chem. Inf. Model., 2016, 56(12), 2495-2506.
http://dx.doi.org/10.1021/acs.jcim.6b00355 PMID: 28024405
Molecules, 2018, 23(10), 2520. [61] Li, Z.; Li, X.; Liu, X.; Fu, Z.; Xiong, Z.; Wu, X.; Tan, X.; Zhao, J.;
http://dx.doi.org/10.3390/molecules23102520 PMID: 30279331 Zhong, F.; Wan, X.; Luo, X.; Chen, K.; Jiang, H.; Zheng, M.; Ki-
[44] Bickerton, G.R.; Paolini, G.V.; Besnard, J.; Muresan, S.; Hopkins, nome, X. KinomeX: A web application for predicting kinome-wide
A.L. Quantifying the chemical beauty of drugs. Nat. Chem., 2012, polypharmacology effect of small molecules. Bioinformatics, 2019,
4(2), 90-98. 35(24), 5354-5356.
http://dx.doi.org/10.1038/nchem.1243 PMID: 22270643 http://dx.doi.org/10.1093/bioinformatics/btz519 PMID: 31228181
[45] Klucznik, T. Efficient syntheses of diverse, medicinally relevant [62] Lotfi Shahreza, M.; Ghadiri, N.; Mousavi, S.R.; Varshosaz, J.;
targets planned by computer and executed in the laboratory. Chem., Green, J.R. A review of network-based approaches to drug reposi-
2018, 4, 522-532. tioning. Brief. Bioinform., 2018, 19(5), 878-892.
http://dx.doi.org/10.1016/j.chempr.2018.02.002 http://dx.doi.org/10.1093/bib/bbx017 PMID: 28334136
[46] Browne, C.B. A Survey of monte Carlo tree search methods. IEEE [63] Gönen, M. Predicting drug-target interactions from chemical and
T. Comp. Intel. Al, 2017, 4, 1-43. genomic kernels using Bayesian matrix factorization. Bioinformat-
[47] Segler, M.H.S.; Waller, M.P. Neural-symbolic machine learning ics, 2012, 28(18), 2304-2310.
for retrosynthesis and reaction prediction. Chemistry, 2017, 23(25), http://dx.doi.org/10.1093/bioinformatics/bts360 PMID: 22730431
5966-5971. [64] Klaeger, S.; Heinzlmeir, S.; Wilhelm, M.; Küster, B. The target
http://dx.doi.org/10.1002/chem.201605499 PMID: 28134452 landscape of clinical kinase inhibitors. Mol. Cell. Proteomics,
[48] Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; van den 2017, 16, S14.
Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam,
[65] Cabreiro, F.; Au, C.; Leung, K.Y.; Vergara-Irigaray, N.; Cochemé, targeting bromodomain-containing protein 4. J. Chem. Inf. Model.,
H.M.; Noori, T.; Weinkove, D.; Schuster, E.; Greene, N.D.E.; 2017, 57(7), 1677-1690.
Gems, D. Metformin retards aging in C. elegans by altering micro- http://dx.doi.org/10.1021/acs.jcim.7b00098 PMID: 28636361
bial folate and methionine metabolism. Cell, 2013, 153(1), 228- [83] Halperin, I.; Ma, B.; Wolfson, H.; Nussinov, R. Principles of dock-
239. ing: An overview of search algorithms and a guide to scoring func-
http://dx.doi.org/10.1016/j.cell.2013.02.035 PMID: 23540700 tions. Proteins, 2002, 47(4), 409-443.
[66] Yamanishi, Y.; Araki, M.; Gutteridge, A.; Honda, W.; Kanehisa, http://dx.doi.org/10.1002/prot.10115 PMID: 12001221
M. Prediction of drug-target interaction networks from the integra- [84] Leach, A.R.; Gillet, V.J.; Lewis, R.A.; Taylor, R. Three-
tion of chemical and genomic spaces. Bioinformatics, 2008, 24(13), dimensional pharmacophore methods in drug discovery. J. Med.
i232-i240. Chem., 2010, 53(2), 539-558.
http://dx.doi.org/10.1093/bioinformatics/btn162 PMID: 18586719 http://dx.doi.org/10.1021/jm900817u PMID: 19831387
[67] Wang, W.; Yang, S.; Zhang, X.; Li, J. Drug repositioning by inte- [85] Hein, M.; Zilian, D.; Sotriffer, C.A. Docking compared to 3D-
grating target information through a heterogeneous network model. pharmacophores: The scoring function challenge. Drug Discov.
Bioinformatics, 2014, 30(20), 2923-2930. Today. Technol., 2010, 7, e229-e236.
http://dx.doi.org/10.1093/bioinformatics/btu403 PMID: 24974205 http://dx.doi.org/10.1016/j.ddtec.2010.12.003
[68] Huang, S.Y.; Grinter, S.Z.; Zou, X. Scoring functions and their [86] Hessler, G.; Baringhaus, K.H. The scaffold hopping potential of
evaluation methods for protein-ligand docking: Recent advances pharmacophores. Drug Discov. Today. Technol., 2010, 7(4), e203-
and future directions. Phys. Chem. Chem. Phys., 2010, 12(40), e270.
12899-12908. http://dx.doi.org/10.1016/j.ddtec.2010.09.001 PMID: 24103802
http://dx.doi.org/10.1039/c0cp00151a PMID: 20730182 [87] Dassault Systèmes, BIOVIA Discovery Studio., Available from:
[69] Khamis, M.A.; Gomaa, W.; Ahmed, W.F. Machine learning in https://discover.3ds.com/discovery-studio-visualizer-download
computational docking. Artif. Intell. Med., 2015, 63(3), 135-152. [88] Wu, G.; Robertson, D.H.; Brooks, C.L., III; Vieth, M. Detailed
http://dx.doi.org/10.1016/j.artmed.2015.02.002 PMID: 25724101 analysis of grid-based molecular docking: A case study of
[70] Ain, Q.U.; Aleksandrova, A.; Roessler, F.D.; Ballester, P.J. Ma- CDOCKER-A CHARMm-based MD docking algorithm. J. Com-
chine-learning scoring functions to improve structure-based bind- put. Chem., 2003, 24(13), 1549-1562.
ing affinity prediction and virtual screening. Wiley Interdiscip. Rev. http://dx.doi.org/10.1002/jcc.10306 PMID: 12925999
Comput. Mol. Sci., 2015, 5(6), 405-424. [89] Billones, J.B.; Carrillo, M.C.O.; Organo, V.G.; Sy, J.B.A.; Clavio,
http://dx.doi.org/10.1002/wcms.1225 PMID: 27110292 N.A.B.; Macalino, S.J.Y.; Emnacen, I.A.; Lee, A.P.; Ko, P.K.L.;
[71] Kinnings, S.L.; Liu, N.; Tonge, P.J.; Jackson, R.M.; Xie, L.; Concepcion, G.P. In silico discovery and in vitro activity of inhibi-
Bourne, P.E. A machine learning-based method to improve dock- tors against Mycobacterium tuberculosis 7,8-diaminopelargonic ac-
ing scoring functions and its application to drug repurposing. J. id synthase (Mtb BioA). Drug Des. Devel. Ther., 2017, 11, 563-
Chem. Inf. Model., 2011, 51(2), 408-419. 574.
http://dx.doi.org/10.1021/ci100369f PMID: 21291174 http://dx.doi.org/10.2147/DDDT.S119930 PMID: 28280303
[72] Wang, C.; Zhang, Y. Improving scoring-docking-screening powers [90] Duvenaud, D.; Maclaurin, D.; Aguilera-Iparraguirre, J.; Gómez-
of protein-ligand scoring functions using random forest. J. Comput. Bombarelli, R.; Hirzel, T.; Aspuru-Guzik, A.; Adams, R.P. Convo-
Chem., 2017, 38(3), 169-177. lutional networks on graphs for learning molecular fingerprints.
http://dx.doi.org/10.1002/jcc.24667 PMID: 27859414 Adv. Neural Inf. Process. Syst., 2015, 2, 224-2232.
[73] LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature, 2015, [91] Coley, C.W.; Barzilay, R.; Green, W.H.; Jaakkola, T.S.; Jensen,
521(7553), 436-444. K.F. Convolutional Embedding of Attributed Molecular Graphs for
http://dx.doi.org/10.1038/nature14539 PMID: 26017442 Physical Property Prediction. J. Chem. Inf. Model., 2017, 57(8),
[74] Jiménez, J.; Škalič, M.; Martínez-Rosell, G.; De Fabritiis, G. KDEEP: 1757-1772.
Protein-ligand absolute binding affinity prediction via 3D- http://dx.doi.org/10.1021/acs.jcim.6b00601 PMID: 28696688
convolutional neural networks. J. Chem. Inf. Model., 2018, 58(2), [92] Hubatsch, I.; Ragnarsson, E.G.E.; Artursson, P. Determination of
287-296. drug permeability and prediction of drug absorption in Caco-2
http://dx.doi.org/10.1021/acs.jcim.7b00650 PMID: 29309725 monolayers. Nat. Protoc., 2007, 2(9), 2111-2119.
[75] McInnes, C. Virtual screening strategies in drug discovery. Curr. http://dx.doi.org/10.1038/nprot.2007.303 PMID: 17853866
Opin. Chem. Biol., 2007, 11(5), 494-502. [93] Tian, S.; Li, Y.; Wang, J.; Zhang, J.; Hou, T. ADME evaluation in
http://dx.doi.org/10.1016/j.cbpa.2007.08.033 PMID: 17936059 drug discovery. 9. Prediction of oral bioavailability in humans
[76] Lavecchia, A.; Di Giovanni, C. Virtual screening strategies in drug based on molecular properties and structural fingerprints. Mol.
discovery: a critical review. Curr. Med. Chem., 2013, 20(23), 2839- Pharm., 2011, 8(3), 841-851.
2860. http://dx.doi.org/10.1021/mp100444g PMID: 21548635
http://dx.doi.org/10.2174/09298673113209990001 PMID: [94] Lombardo, F.; Jing, Y. In silico prediction of volume of distribu-
23651302 tion in humans. Extensive data set and the exploration of linear and
[77] Leelananda, S.P.; Lindert, S. Computational methods in drug dis- nonlinear methods coupled with molecular interaction fields de-
covery. Beilstein J. Org. Chem., 2016, 12, 2694-2718. scriptors. J. Chem. Inf. Model., 2016, 56(10), 2042-2052.
http://dx.doi.org/10.3762/bjoc.12.267 PMID: 28144341 http://dx.doi.org/10.1021/acs.jcim.6b00044 PMID: 27602694
[78] Kim, K.H.; Kim, N.D.; Seong, B.L. Pharmacophore-based virtual [95] Zientek, M.; Stoner, C.; Ayscue, R.; Klug-McLeod, J.; Jiang, Y.;
screening: A review of recent applications. Expert Opin. Drug Dis- West, M.; Collins, C.; Ekins, S. Integrated in silico-in vitro strategy
cov., 2010, 5(3), 205-222. for addressing cytochrome P450 3A4 time-dependent inhibition.
http://dx.doi.org/10.1517/17460441003592072 PMID: 22823018 Chem. Res. Toxicol., 2010, 23(3), 664-676.
[79] Willett, P. Similarity-based virtual screening using 2D fingerprints. http://dx.doi.org/10.1021/tx900417f PMID: 20151638
Drug Discov. Today, 2006, 11(23-24), 1046-1053. [96] Zhang, H.; Chen, Q.Y.; Xiang, M.L.; Ma, C.Y.; Huang, Q.; Yang,
http://dx.doi.org/10.1016/j.drudis.2006.10.005 PMID: 17129822 S.Y. In silico prediction of mitochondrial toxicity by using GA-
[80] Huang, S.Y.; Zou, X. Inclusion of solvation and entropy in the CG-SVM approach. Toxicol. In Vitro, 2009, 23(1), 134-140.
knowledge-based scoring function for protein-ligand interactions. http://dx.doi.org/10.1016/j.tiv.2008.09.017 PMID: 18940245
J. Chem. Inf. Model., 2010, 50(2), 262-273. [97] Hop, P.; Allgood, B.; Yu, J. Geometric deep learning autonomous-
http://dx.doi.org/10.1021/ci9002987 PMID: 20088605 ly learns chemical features that outperform those engineered by
[81] Chen, Y.C. Beware of docking! Trends Pharmacol. Sci., 2015, domain experts. Mol. Pharm., 2018, 15(10), 4371-4377.
36(2), 78-95. http://dx.doi.org/10.1021/acs.molpharmaceut.7b01144 PMID:
http://dx.doi.org/10.1016/j.tips.2014.12.001 PMID: 25543280 29863875
[82] Xing, J.; Lu, W.; Liu, R.; Wang, Y.; Xie, Y.; Zhang, H.; Shi, Z.; [98] Kearnes, S.; McCloskey, K.; Berndl, M.; Pande, V.; Riley, P. Mo-
Jiang, H.; Liu, Y.C.; Chen, K.; Jiang, H.; Luo, C.; Zheng, M. Ma- lecular graph convolutions: Moving beyond fingerprints. J. Com-
chine-learning-assisted approach for discovering novel inhibitors put. Aided Mol. Des., 2016, 30(8), 595-608.
http://dx.doi.org/10.1007/s10822-016-9938-8 PMID: 27558503
[99] Lombardo, F.; Desai, P.V.; Arimoto, R.; Desino, K.E.; Fischer, H.; [112] Yauney, G.; Shah, P. Reinforcement learning with action-derived
Keefer, C.E.; Petersson, C.; Winiwarter, S.; Broccatelli, F. In silico rewards for chemotherapy and clinical trial dosing regimen selec-
Absorption, Distribution, Metabolism, Excretion, and Pharmacoki- tion. Proc. 3rd Mach. Learn. Healthc. Conf., 2018, pp. 161-226.
netics (ADME-PK): Utility and best practices. An Industry Per- [113] Ohnstad, H.O.; Borgen, E.; Falk, R.S.; Lien, T.G.; Aaserud, M.;
spective from the International Consortium for Innovation through Sveli, M.A.T.; Kyte, J.A.; Kristensen, V.N.; Geitvik, G.A.;
Quality in Pharmaceutical Development. J. Med. Chem., 2017, Schlichting, E.; Wist, E.A.; Sørlie, T.; Russnes, H.G.; Naume, B.
60(22), 9097-9113. Prognostic value of PAM50 and risk of recurrence score in patients
http://dx.doi.org/10.1021/acs.jmedchem.7b00487 PMID: 28609624 with early-stage breast cancer with long-term follow-up. Breast
[100] O’Boyle, N.M.; Boström, J.; Sayle, R.A.; Gill, A. Using matched Cancer Res., 2017, 19(1), 120.
molecular series as a predictive tool to optimize biological activity. http://dx.doi.org/10.1186/s13058-017-0911-9 PMID: 29137653
J. Med. Chem., 2014, 57(6), 2704-2713. [114] Shimizu, H.; Nakayama, K.I.A. A 23 gene-based molecular prog-
http://dx.doi.org/10.1021/jm500022q PMID: 24601597 nostic score precisely predicts overall survival of breast cancer pa-
[101] Gunaydin, H.; Altman, M.D.; Ellis, J.M.; Fuller, P.; Johnson, S.A.; tients. EBioMedicine, 2019, 46, 150-159.
Lahue, B.; Lapointe, B. Strategy for extending half-life in drug de- http://dx.doi.org/10.1016/j.ebiom.2019.07.046 PMID: 31358476
sign and its significance. ACS Med. Chem. Lett., 2018, 9(6), 528- [115] Curtis, C.; Shah, S.P.; Chin, S.F.; Turashvili, G.; Rueda, O.M.;
533. Dunning, M.J.; Speed, D.; Lynch, A.G.; Samarajiwa, S.; Yuan, Y.;
http://dx.doi.org/10.1021/acsmedchemlett.8b00018 PMID: Gräf, S.; Ha, G.; Haffari, G.; Bashashati, A.; Russell, R.; McKin-
29937977 ney, S.; Langerød, A.; Green, A.; Provenzano, E.; Wishart, G.;
[102] Kramer, C.; Fuchs, J.E.; Whitebread, S.; Gedeck, P.; Liedl, K.R. Pinder, S.; Watson, P.; Markowetz, F.; Murphy, L.; Ellis, I.;
Matched molecular pair analysis: Significance and the impact of Purushotham, A.; Børresen-Dale, A.L.; Brenton, J.D.; Tavaré, S.;
experimental uncertainty. J. Med. Chem., 2014, 57(9), 3786-3802. Caldas, C.; Aparicio, S.; Speers, C.; Watson, P.; Blamey, R.;
http://dx.doi.org/10.1021/jm500317a PMID: 24738976 Green, A.; MacMillan, D.; Rakha, E.; Gillett, C.; Grigoriadis, A.;
[103] Li, H.; Hou, J.; Adhikari, B.; Lyu, Q.; Cheng, J. Deep learning De Rinaldis, E.; Tutt, A.; Parisien, M.; Troup, S.; Chan, D.; Field-
methods for protein torsion angle prediction. BMC Bioinformatics, ing, C.; Maia, A.T.; McGuire, S.; Osborne, M.; Sayalero, S.M.;
2017, 18(1), 417. Spiteri, I.; Hadfield, J.; Bell, L.; Chow, K.; Gale, N.; Kovalik, M.;
http://dx.doi.org/10.1186/s12859-017-1834-2 PMID: 28923002 Ng, Y.; Prentice, L.; Tavaré, S.; Markowetz, F.; Langerød, A.;
[104] Sahu, A.; Agrawal, R.K.; Pandey, R. Synthesis and systemic toxici- Provenzano, E.; Purushotham, A.; Børresen-Dale, A.L.; Caldas, C.
ty assessment of quinine-triazole scaffold with antiprotozoal poten- The genomic and transcriptomic architecture of 2,000 breast tu-
cy. Bioorg. Chem., 2019, 88, 102939. mours reveals novel subgroups. Nature, 2012, 486(7403), 346-352.
http://dx.doi.org/10.1016/j.bioorg.2019.102939 PMID: 31028993 http://dx.doi.org/10.1038/nature10983 PMID: 22522925
[105] Scott, D.E.; Bayly, A.R.; Abell, C.; Skidmore, J. Small molecules, [116] Merget, B.; Turk, S.; Eid, S.; Rippmann, F.; Fulle, S. Profiling
big targets: Drug discovery faces the protein-protein interaction prediction of kinase inhibitors: Toward the virtual assay. J. Med.
challenge. Nat. Rev. Drug Discov., 2016, 15(8), 533-550. Chem., 2017, 60(1), 474-485.
http://dx.doi.org/10.1038/nrd.2016.29 PMID: 27050677 http://dx.doi.org/10.1021/acs.jmedchem.6b01611 PMID: 27966949
[106] Cukuroglu, E.; Engin, H.B.; Gursoy, A.; Keskin, O. Hot spots in [117] Grys, B.T.; Lo, D.S.; Sahin, N.; Kraus, O.Z.; Morris, Q.; Boone,
protein-protein interfaces: Towards drug discovery. Prog. Biophys. C.; Andrews, B.J. Machine learning and computer vision approach-
Mol. Biol., 2014, 116(2-3), 165-173. es for phenotypic profiling. J. Cell Biol., 2017, 216(1), 65-71.
http://dx.doi.org/10.1016/j.pbiomolbio.2014.06.003 PMID: http://dx.doi.org/10.1083/jcb.201610026 PMID: 27940887
24997383 [118] Chen, H.; Engkvist, O.; Wang, Y.; Olivecrona, M.; Blaschke, T.
[107] Szklarczyk, D.; Franceschini, A.; Wyder, S.; Forslund, K.; Heller, The rise of deep learning in drug discovery. Drug Discov. Today,
D.; Huerta-Cepas, J.; Simonovic, M.; Roth, A.; Santos, A.; Tsafou, 2018, 23(6), 1241-1250.
K.P.; Kuhn, M.; Bork, P.; Jensen, L.J.; von Mering, C. STRING http://dx.doi.org/10.1016/j.drudis.2018.01.039 PMID: 29366762
v10: protein-protein interaction networks, integrated over the tree [119] Choi, E.; Schuetz, A.; Stewart, W.F.; Sun, J. Using recurrent neural
of life. Nucleic Acids Res., 2015, 43(Database issue), D447-D452. network models for early detection of heart failure onset. J. Am.
http://dx.doi.org/10.1093/nar/gku1003 PMID: 25352553 Med. Inform. Assoc., 2017, 24(2), 361-370.
[108] Labbé, C.M.; Kuenemann, M.A.; Zarzycka, B.; Vriend, G.; Nico- http://dx.doi.org/10.1093/jamia/ocw112 PMID: 27521897
laes, G.A.F.; Lagorce, D.; Miteva, M.A.; Villoutreix, B.O.; Speran- [120] Labovitz, D.L.; Shafner, L.; Reyes Gil, M.; Virmani, D.; Hanina,
dio, O. iPPI-DB: an online database of modulators of protein- A. Using artificial intelligence to reduce the risk of nonadherence
protein interactions. Nucleic Acids Res., 2016, 44(D1), D542-D547. in patients on anticoagulation therapy. Stroke, 2017, 48(5), 1416-
http://dx.doi.org/10.1093/nar/gkv982 PMID: 26432833 1419.
[109] Wang, J.; Luo, C.; Shan, C.; You, Q.; Lu, J.; Elf, S.; Zhou, Y.; http://dx.doi.org/10.1161/STROKEAHA.116.016281 PMID:
Wen, Y.; Vinkenborg, J.L.; Fan, J.; Kang, H.; Lin, R.; Han, D.; 28386037
Xie, Y.; Karpus, J.; Chen, S.; Ouyang, S.; Luan, C.; Zhang, N.; [121] Macalino, S.J.Y.; Gosu, V.; Hong, S.; Choi, S. Role of computer-
Ding, H.; Merkx, M.; Liu, H.; Chen, J.; Jiang, H.; He, C. Inhibition aided drug design in modern drug discovery. Arch. Pharm. Res.,
of human copper trafficking by a small molecule significantly at- 2015, 38(9), 1686-1701.
tenuates cancer cell proliferation. Nat. Chem., 2015, 7(12), 968- http://dx.doi.org/10.1007/s12272-015-0640-5 PMID: 26208641
979. [122] Li, D.; Chi, B.; Wang, W.W.; Gao, J.M.; Wan, J. Exploring the
http://dx.doi.org/10.1038/nchem.2381 PMID: 26587712 possible binding mode of trisubstituted benzimidazoles analogues
[110] Maheshwari, S.; Brylinski, M. Template-based identification of in silico for novel drug designtargeting Mtb FtsZ Med. Chem. Res.,
protein-protein interfaces using eFindSitePPI. Methods, 2016, 93, 2017, 26, 153-169.
64-71. http://dx.doi.org/10.1007/s00044-016-1734-4
http://dx.doi.org/10.1016/j.ymeth.2015.07.017 PMID: 26235816 [123] Jamal, S.; Khubaib, M.; Gangwar, R.; Grover, S.; Grover, A.;
[111] De Fauw, J.; Ledsam, J.R.; Romera-Paredes, B.; Nikolov, S.; To- Hasnain, S.E. Artificial intelligence and machine learning based
masev, N.; Blackwell, S.; Askham, H.; Glorot, X.; O’Donoghue, prediction of resistant and susceptible mutations in Mycobacterium
B.; Visentin, D.; van den Driessche, G.; Lakshminarayanan, B.; tuberculosis. Sci. Rep., 2020, 10, 1-16.
Meyer, C.; Mackinder, F.; Bouton, S.; Ayoub, K.; Chopra, R.; [124] Lind, A.P.; Anderson, P.C. Predicting drug activity against cancer
King, D.; Karthikesalingam, A.; Hughes, C.O.; Raine, R.; Hughes, cells by random forest models based on minimal genomic infor-
J.; Sim, D.A.; Egan, C.; Tufail, A.; Montgomery, H.; Hassabis, D.; mation and chemical properties. PLoS One, 2019, 14(7), e0219774.
Rees, G.; Back, T.; Khaw, P.T.; Suleyman, M.; Cornebise, J.; http://dx.doi.org/10.1371/journal.pone.0219774 PMID: 31295321
Keane, P.A.; Ronneberger, O. Clinically applicable deep learning [125] Wang, Y.; Wang, Z.; Xu, J.; Li, J.; Li, S.; Zhang, M.; Yang, D.
for diagnosis and referral in retinal disease. Nat. Med., 2018, 24(9), Systematic identification of non-coding pharmacogenomic land-
1342-1350. scape in cancer. Nat. Commun., 2018, 9(1), 3192.
http://dx.doi.org/10.1038/s41591-018-0107-6 PMID: 30104768 http://dx.doi.org/10.1038/s41467-018-05495-9 PMID: 30093685
[126] Leventakos, K.; Helgeson, J.; Mansfield, A.S.; Deering, E.; [132] BIO industry analysis. Clinical Development Success Rates 2006-
Schwecke, A.; Adjei, A.; Molina, J.; Hocum, C.; Halfdanarson, T.; 2015. Bio Ind. Anal. Rep., 2016. Available from:
Marks, R.; Parikh, K.; Pomerleau, K.; Coverdill, S.; Rammage, M.; https://www.bio.org/sites/default/files/Clinical
Haddad, T. Implementation of Artificial Intelligence (AI) for Lung [133] Abramoff, M.D.; Lavin, P.T.; Birch, M.; Shah, N.; Folk, J.C. Piv-
Cancer Clinical Trial Matching in a Tertiary Cancer Center. Ann. otal trial of an autonomous AI-based diagnostic system for detec-
Oncol., 2019, 30, ii74. tion of diabetic retinopathy in primary care offices. NPJ Digital
http://dx.doi.org/10.1093/annonc/mdz065 Med., 2018, 1, 39.
[127] Pantuck, A.J.; Lee, D-K.; Kee, T.; Wang, P.; Lakhotia, S.; Silver- [134] Chan, B. The rise of artificial intelligence and the crisis of moral
man, M.H.; Mathis, C.; Drakaki, A.; Belldegrun, A.S.; Ho, C-M.; passivity. AI Soc., 2020, 35, 991-993.
Ho, D. Artificial intelligence: Modulating BET bromodomain in- http://dx.doi.org/10.1007/s00146-020-00953-9
hibitor ZEN-3694 and enzalutamide combination dosing in a meta- [135] Fleming, N. How artificial intelligence is changing drug discovery.
static prostate cancer patient using CURATE.AI, an artificial intel- Nature, 2018, 557(7707), S55-S57.
ligence platform (Adv. Therap. 6/2018). Adv. Ther., 2018, 1, http://dx.doi.org/10.1038/d41586-018-05267-x PMID: 29849160
1870020. [136] Yu, K.H.; Kohane, I.S. Framing the challenges of artificial intelli-
http://dx.doi.org/10.1002/adtp.201870020 gence in medicine. BMJ Qual. Saf., 2019, 28(3), 238-241.
[128] Gulhan, D.C.; Lee, J.J.K.; Melloni, G.E.M.; Cortés-Ciriano, I.; http://dx.doi.org/10.1136/bmjqs-2018-008551 PMID: 30291179
Park, P.J. Detecting the mutational signature of homologous re- [137] Collins, G.S.; Reitsma, J.B.; Altman, D.G.; Moons, K.G.M. Trans-
combination deficiency in clinical samples. Nat. Genet., 2019, parent reporting of a multivariable prediction model for individual
51(5), 912-919. prognosis or diagnosis (TRIPOD): The TRIPOD statement. BMJ,
http://dx.doi.org/10.1038/s41588-019-0390-2 PMID: 30988514 2015, 350, g7594.
[129] Goecks, J.; Jalili, V.; Heiser, L.M.; Gray, J.W. How Machine learn- http://dx.doi.org/10.1136/bmj.g7594 PMID: 25569120
ing will transform biomedicine. Cell, 2020, 181(1), 92-101. [138] Ting, D.S.W.; Cheung, C.Y.L.; Lim, G.; Tan, G.S.W.; Quang,
http://dx.doi.org/10.1016/j.cell.2020.03.022 PMID: 32243801 N.D.; Gan, A.; Hamzah, H.; Garcia-Franco, R.; San Yeo, I.Y.; Lee,
[130] Watson, O.P.; Cortes-Ciriano, I.; Taylor, A.R.; Watson, J.A. A S.Y.; Wong, E.Y.M.; Sabanayagam, C.; Baskaran, M.; Ibrahim, F.;
decision-theoretic approach to the evaluation of machine learning Tan, N.C.; Finkelstein, E.A.; Lamoureux, E.L.; Wong, I.Y.; Bress-
algorithms in computational drug discovery. Bioinformatics, 2019, ler, N.M.; Sivaprasad, S.; Varma, R.; Jonas, J.B.; He, M.G.; Cheng,
35(22), 4656-4663. C.Y.; Cheung, G.C.M.; Aung, T.; Hsu, W.; Lee, M.L.; Wong, T.Y.
http://dx.doi.org/10.1093/bioinformatics/btz293 PMID: 31070704 Development and validation of a deep learning system for diabetic
[131] Rutering, J.; Ilmer, M.; Recio, A.; Coleman, M.; Vykoukal, J.; Alt, retinopathy and related eye diseases using retinal images from mul-
E.; Orleans, N. Mutational landscape of metastatic cancer revealed tiethnic populations with diabetes. JAMA, 2017, 318(22), 2211-
from prospective clinical sequencing of 10,000 patients. Nat. Med., 2223.
2016, 5, 1-8. http://dx.doi.org/10.1001/jama.2017.18152 PMID: 29234807
