
IPSJ Transactions on Bioinformatics Vol.11 41–47 (Dec. 2018)
[DOI: 10.2197/ipsjtbio.11.41]

Original Paper

Predicting Strategies for Lead Optimization via Learning to Rank

Nobuaki Yasuo1,2,a) Keisuke Watanabe1 Hideto Hara3 Kentaro Rikimaru3 Masakazu Sekijima1,4,b)
Received: August 21, 2018, Accepted: September 18, 2018

Abstract: Lead optimization is an essential step in drug discovery in which the chemical structures of compounds are modified to improve characteristics such as binding affinity, target selectivity, physicochemical properties, and toxicity. We present a concept for a computational compound optimization system that outputs optimized compounds from hit compounds by using previous lead optimization data from a pharmaceutical company. In this study, to predict the drug-likeness of compounds in the evaluation function of this system, we evaluated and compared the ability to correctly predict lead optimization strategies through learning to rank methods.

Keywords: lead optimization, learning to rank, computer-aided drug design, machine learning

1 Department of Computer Science, Tokyo Institute of Technology, Yokohama, Kanagawa 226–8503, Japan
2 Research Fellow of Japan Society for the Promotion of Science DC1, Yokohama, Kanagawa 226–8503, Japan
3 Shonan Research Center, Takeda Pharmaceutical Company Limited, Fujisawa, Kanagawa 251–0012, Japan
4 Advanced Computational Drug Discovery Unit, Tokyo Institute of Technology, Yokohama, Kanagawa 226–8503, Japan
a) yasuo@cbi.c.titech.ac.jp
b) sekijima@c.titech.ac.jp

© 2018 Information Processing Society of Japan

1. Introduction

Enormous efforts are made during drug discovery to identify better drug candidates. The cost of drug discovery has increased drastically: the process now typically takes 12–14 years [1] and costs approximately 2.6 billion USD [2]. Drug discovery is sometimes likened to looking for a needle in a haystack; it is the process of finding suitable compounds in a vast “chemical space.” First, compounds are screened on the basis of their binding affinity to a target protein to obtain hit compounds. Then, in the hit-to-lead and lead optimization steps, these hits are optimized to obtain drug candidates. Subsequently, the optimized compounds are designated for preclinical and clinical testing. Compounds that pass these tests are finally approved as drugs. Lead optimization, in which the chemical structures of lead compounds are modified to obtain compounds with improved properties, is an essential step in drug discovery [3], [4]. Properties such as binding affinity, selectivity, and physicochemical and ADMET (absorption, distribution, metabolism, excretion, toxicity) properties are optimized in the hit-to-lead and lead optimization steps [5], [6] (Fig. 1).

Fig. 1 Scheme of lead optimization. The hit compound is optimized through step-wise exploration. In each step, new compounds are synthesized and evaluated. Compounds with unfavorable properties are pruned and will not be explored in further steps. In this figure, binding affinity towards the target protein is used as an example of the evaluation function.

In order to reduce the cost of these processes, diverse approaches have been developed. Combinatorial chemistry and high-throughput screening are the key technologies for accelerating drug discovery on the experimental biology side [7], [8], while computer-aided drug discovery (CADD), which has been utilized since the 1960s, is also leading current drug discovery. CADD methods can be combined with various biological data, including genomic sequences, protein tertiary structures, and chemical structures, and can be utilized in various steps of drug discovery: target identification, compound screening, and ADMET property prediction [9], [10], [11]. To this end, CADD methods such as virtual screening have been widely applied in drug discovery to reduce experimental costs [12], [13]. CADD is expected to reduce the cost of drug development by 50% [14].

Nearly all of the cost of lead optimization originates from the synthesis of many compounds in an effort to explore the entire chemical space, but this exploration typically yields only a few potential candidates, if any. A discovery strategy that minimizes the number of compounds synthesized would greatly improve the efficiency of candidate development, since 17% of the total drug discovery cost is invested in lead optimization [1]. However, research on lead optimization is limited because pharmaceutical companies have not published practical lead optimization data.

The ultimate research objective of this study is to develop an in silico compound optimization system that produces optimized compounds from hit compounds (Fig. 2). In this system, two modules are applied iteratively. The first module explores candidate compounds, and the second evaluates the identified candidates. The exploration module is based on virtual modification of compounds using matched molecular pairs (MMPs) or a chemical reaction-based method. An MMP is a pair of compounds that differ in only one part of their chemical structure [16], and MMPs have previously been used for ADME prediction [17] and compound optimization [18]. The chemical reaction-based method simulates virtual chemical reactions to generate new compounds, and has previously been used for compound optimization [15]. Because this method uses practical chemical reactions for exploration, the generated compounds are more synthesizable than those from MMP-based systems.

Fig. 2 Concept of an in silico compound optimization system. The system learns optimization strategies from previous lead optimization projects. When new hit compounds are input, two modules are applied iteratively: the exploration module and the evaluation module. In the exploration module, modified compounds are virtually explored from the input compounds using a virtual compound optimization system [15]. In the evaluation module, the input compounds are evaluated by the learned strategy.

In contrast, for evaluating compounds, quantitative structure-activity relationships (QSARs) [19], [20] and quantitative structure-property relationships (QSPRs) [21], [22] have been widely used. However, such methods permit the simultaneous comparison of only a limited number of properties. Various physicochemical-property-based metrics and substructure-based drug-likeness indices, such as solubility [23], Lipinski’s rule of five [24], and the quantitative estimate of drug-likeness (QED) [25], have been developed to assess the drug-likeness of compounds. Machine-learning-based approaches such as support vector machines (SVMs) and neural networks (NNs) have also been applied to distinguish drug-like from non-drug-like compounds [26], [27], [28]. However, these methods remain inadequate because they were originally developed only to distinguish drugs from non-drugs. Consequently, there is a high demand for new methods that can predict the drug-likeness of compounds that have been gradually optimized during lead optimization.

Learning to rank is a machine learning method that is well suited to addressing this issue. Figure 3 shows the idea of the learning to rank method. This method, which was developed in the field of information retrieval, predicts the order of a set of data [29] rather than the class or specific value of each data point. Learning to rank methods have previously been applied to virtual screening, where they outperformed simple SVM and support vector regression (SVR) in accuracy [30], [31]. Learning to rank methods can be categorized as pointwise, pairwise, and listwise methods. In pointwise methods, a value is assigned to each data point, and those values are sorted to determine the order. In pairwise methods, the ordering is first determined for pairs of data points, and the ordered pairs are then sorted to determine the final order, as in Fig. 3. In listwise methods, the order of a dataset is directly predicted. Among these, listwise methods are not suited to this task because they require large datasets for training.

Fig. 3 Idea of the pairwise learning to rank method. In the learning phase, pairs of data and their relationships are input as the training data. In the inference phase, test data are sorted using the learned relationship.

In this study, we propose a strategy for lead optimization based on compounds that have previously been synthesized at Takeda Pharmaceutical Company Limited. In this strategy, we predict the order of synthesis of compounds, because compounds synthesized later during optimization are more likely to be drug-like. All factors that are implicitly involved in the order of optimization are considered in this method. We compared six different learning to rank methods, including both pointwise and pairwise methods.

2. Materials and Methods

2.1 Dataset

The dataset used in this study consisted of in-house data from 31 projects corresponding to previous lead optimization studies at Takeda Pharmaceutical Company Limited. The data from each project contained information on compounds synthesized in a previous study. The total number of compounds was 15,097, and the number of compounds in each project ranged from 291 to 583, with a mean of 486. Fourteen of the 31 projects were used for hyperparameter tuning, and the other 17 projects were used for testing by cross validation.

The labels used for machine learning represented the order of synthesis, because compounds synthesized later are more likely to be drug-like than compounds synthesized earlier. All compounds were numbered in chronological order, and the assigned numbers were normalized before use. For the pairwise methods, only pairs from the same project were used for training.

2.2 Features

All compounds in the dataset were encoded as feature vectors. In this study, the feature vector consisted of the extended-connectivity fingerprint (ECFP) [32], which is a topological fingerprint for molecular characterization. The ECFP algorithm is described in Fig. 4, using 4-methyloxazole as an example. Substructures starting from each atom are iteratively extended to neighboring atoms until the diameters of the substructures reach a specified number. Each of the constructed substructures is mapped to one bit of the fixed-length feature vector by a hash function. Hash collisions, in which more than one substructure is assigned to one bit, are allowed.

In this study, the maximum diameter of the substructures and the bit length were 6 and 512, respectively. None of the feature bits had the same value for every compound, and there were no highly correlated feature pairs with correlation coefficients of > 0.95.

2.3 Machine Learning Methods

In this study, several pointwise and pairwise learning to rank methods, namely support vector machine (SVM) with a linear kernel, SVM with a radial basis function (RBF) kernel, random forest, rankSVM [33], [34], logistic classification, and lasso, were compared in terms of their prediction accuracy. The methods and their corresponding types and descriptions are summarized in Table 1. The details of hyperparameter tuning, namely the ranges of the tuned hyperparameters and their final values, are summarized in Table 2. The hyperparameters of these methods were optimized using 14 of the 31 projects in the dataset, which contain different proteins and compounds from the remaining projects, to avoid train-test overlap. Python 3.5.2 with scikit-learn version 0.18.1 [35] was used for implementation.
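The labeling scheme of Section 2.1 can be illustrated with a minimal Python sketch. It assumes that the compounds of one project are already sorted by synthesis date and that normalization means min-max scaling of the order index to [0, 1]. The source only states that the assigned numbers were normalized, so the scaling and the function names `normalized_order_labels` and `same_project_pairs` are illustrative assumptions.

```python
from itertools import combinations


def normalized_order_labels(n_compounds):
    """Map synthesis order 0..n-1 to [0, 1] (assumed min-max scaling)."""
    if n_compounds < 2:
        return [0.0] * n_compounds
    return [i / (n_compounds - 1) for i in range(n_compounds)]


def same_project_pairs(project_ids):
    """Yield index pairs (i, j) with i < j whose compounds share a project."""
    by_project = {}
    for index, project in enumerate(project_ids):
        by_project.setdefault(project, []).append(index)
    for members in by_project.values():
        # Pairs are formed only inside one project, never across projects.
        yield from combinations(members, 2)
```

With five compounds the labels would be 0.0, 0.25, 0.5, 0.75, and 1.0, and a compound from one project is never paired with a compound from another.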



Fig. 4 Generation algorithm of ECFP.
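The final hashing step of the algorithm in Fig. 4 can be sketched in Python as follows. This is not the ECFP implementation used in the study: the substructure identifiers are assumed to be precomputed strings (the fragment strings below are hypothetical), and CRC32 stands in for the fingerprint's hash function. The sketch only shows how substructures are folded into a fixed-length bit vector and why collisions can occur.

```python
import zlib

N_BITS = 512  # bit length reported in Section 2.2


def fold_substructures(substructure_ids, n_bits=N_BITS):
    """Fold substructure identifiers into a fixed-length bit vector."""
    bits = [0] * n_bits
    for identifier in substructure_ids:
        # CRC32 is a stand-in hash (assumption); two different
        # substructures may map to the same index (a hash collision).
        index = zlib.crc32(identifier.encode("utf-8")) % n_bits
        bits[index] = 1
    return bits


# Hypothetical fragment strings standing in for atom environments.
fragments = ["C", "N", "O", "CC", "C=N", "CO", "Cc1cocn1"]
fingerprint = fold_substructures(fragments)
```

Because a bit is only switched on and never counted, repeating a substructure or colliding with another one leaves the vector unchanged, so the number of "on" bits is at most the number of distinct substructures.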

Table 1 Details of the methods used in this study: each method, its type (pointwise or pairwise), and a description.

Method          Type        Description
SVM (linear)    pointwise   Support vector machine, linear kernel, regression
SVM (rbf)       pointwise   Support vector machine, RBF kernel, regression
random forest   pointwise   Random forest, regression
rankSVM         pairwise    SVM linear kernel, classification
logistic        pairwise    Logistic regression for classification
lasso           pairwise    Least squares regression for classification with L1 regularization
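For the pairwise rows of Table 1, ranking is recast as binary classification on feature differences, turning two labeled items into one sample (x_i − x_j, sign(y_i − y_j)). A minimal Python sketch of that transformation follows. The function names and toy inputs are illustrative assumptions; emitting both orderings of each pair and skipping pairs with equal labels are choices of this sketch, not details confirmed by the source.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def pairwise_transform(features, labels):
    """Turn ranked items into (x_i - x_j, sign(y_i - y_j)) samples."""
    samples = []
    for i, (x_i, y_i) in enumerate(zip(features, labels)):
        for j, (x_j, y_j) in enumerate(zip(features, labels)):
            if i == j or y_i == y_j:
                continue  # a tie carries no ordering information
            difference = [a - b for a, b in zip(x_i, x_j)]
            samples.append((difference, sign(y_i - y_j)))
    return samples


def linear_score(weights, x):
    """Score of one item under a linear model w . x."""
    return sum(w * v for w, v in zip(weights, x))
```

Because a linear model satisfies w·(x_i − x_j) = w·x_i − w·x_j, a classifier trained on the differences also yields a per-compound score that can simply be sorted.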

Table 2 Details of the hyperparameter tuning: each method, its hyperparameters, the ranges of the tuned hyperparameters, and the final hyperparameter values.

Method          Hyperparameter       Range                                                        Final value
SVM (linear)    C                    10^-9, 10^-8, ..., 10^10                                     10^-5
SVM (rbf)       C, γ                 C: 10^-3, 10^-2, ..., 10^3; γ: 10^-6, 10^-5, ..., 10^0       C: 1.0, γ: 0.1
random forest   Number of features   5, 10, ..., 85                                               25
rankSVM         C                    10^-3, 10^-2, ..., 10^3                                      1.0
logistic        α                    10^-3, 10^-2, ..., 10^3                                      10^2
lasso           α                    10^-3, 10^-2, ..., 10^3                                      10^-3
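Most of the search ranges in Table 2 are logarithmic grids. The following Python sketch builds such a grid and picks the best value. The selection criterion is not stated in the source, so the `score` callable (for example, a mean rank correlation over the 14 tuning projects) and the function names are assumptions, and `toy_score` is a placeholder rather than a real result.

```python
import math


def log_grid(min_exponent, max_exponent):
    """Return [10**min_exponent, ..., 10**max_exponent] in steps of 10x."""
    return [10.0 ** e for e in range(min_exponent, max_exponent + 1)]


def select_best(candidates, score):
    """Return the candidate with the highest score (higher is better)."""
    return max(candidates, key=score)


# Grid for the linear SVM's C in Table 2: 10^-9, 10^-8, ..., 10^10.
c_grid = log_grid(-9, 10)


def toy_score(c):
    """Placeholder score that peaks at C = 1; not taken from the paper."""
    return -abs(math.log10(c))


best_c = select_best(c_grid, toy_score)
```

In practice `toy_score` would be replaced by a function that trains a model with the candidate value and evaluates it on the tuning projects only, which keeps the 17 test projects untouched.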

We used a linear classifier for the pairwise methods for the following reason. In pairwise methods, the ranking task for two data points $(x_i, y_i)$ and $(x_j, y_j)$ can be transformed into the binary classification task $(x', y') = (x_i - x_j, \mathrm{sign}(y_i - y_j))$ if the classifier is linear, where $x_i \in \mathbb{R}^k$ denotes the feature vector of the i-th data point and $y_i \in \mathbb{R}$ denotes its label [33]. Here $k$ denotes the number of features. For the pairwise methods, minibatch training was used because of the amount of training data. The training data are randomly split into minibatches, and the weight of each feature is then iteratively updated using the gradient of the loss function by means of the stochastic gradient descent method [36] over each minibatch. This is needed because the number of training samples is on the order of the square of the number of data points, as pairs of data are used in the pairwise methods. Minibatch training was conducted for 10 epochs, until the training loss no longer decreased. The minibatch size was 500, which roughly corresponds to the number of compounds in each project.

2.4 Evaluation

The evaluation was conducted using project-wise leave-one-out cross validation: to reduce overfitting, each project was evaluated independently using the other 16 projects as training data. As the evaluation metric, Spearman’s correlation coefficient ρ was used to evaluate the prediction accuracy. It is defined as

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n \left( n^2 - 1 \right)} \qquad (1)$$

where $d_i$ denotes the difference between the predicted and true ranks of the i-th data point, and $n$ denotes the number of data points. The range of values is −1 ≤ ρ ≤ 1, and the expected ρ value for a random prediction is 0.

3. Results and Discussion

Table 3 shows the mean rank correlation coefficients among the 17 projects. All methods except the lasso had statistically significant positive correlations, indicating that their predictions correlated with the true order significantly better than random prediction.

Table 3 Prediction accuracies of all methods. The mean Spearman’s rank correlation coefficient ρ among the 17 projects was used for evaluation. *: p < 0.05.

Method          Type        Mean rank correlation coefficient ρ
SVM (linear)    pointwise   0.321*
SVM (rbf)       pointwise   0.344*
random forest   pointwise   0.301*
rankSVM         pairwise    0.238*
logistic        pairwise    0.262*
lasso           pairwise    −0.041

One possible reason for the relatively poor performance of the lasso method may be the characteristics of the feature vectors. The solutions obtained by the lasso method tend to be sparser than those of logistic regression or ridge regression. In this dataset, the feature vectors may have been too dense to obtain sparse solutions. Moreover, the accuracy of the pointwise methods was generally higher than that of the pairwise methods. This finding indicates that comparing compounds between different projects contributed to more successful prediction, because such comparisons were performed in the pointwise methods but not in the pairwise methods.

Figure 5 shows the correlation coefficients for all projects for the SVM (rbf), logistic classification, and lasso methods. SVM (rbf) and logistic classification were the best methods among the pointwise and pairwise methods, respectively. The lasso is also included for comparison because it did not achieve effective prediction. SVM (rbf) and logistic classification achieved high correlations for projects 1, 6, 9, 12, and 17, whereas all methods failed to predict projects 4, 11, and 13. The successfully predicted projects seem to share a similar pattern of optimization strategy that was recognized by the machine learning methods, whereas the other projects may have patterns dissimilar to those of the remaining projects.

Fig. 5 Prediction accuracies for all projects. The x-axis and y-axis correspond to the project number and the prediction accuracy, respectively. Spearman’s rank correlation coefficient ρ was used for evaluation. Each bar corresponds to one method. Blue: pointwise SVM (rbf), Red: pairwise logistic regression for classification, Green: pairwise lasso.

Figure 6 shows the similarity of the compounds in all projects using the Tanimoto coefficient of ECFP6. Dissimilar and similar compounds are shown as dark and bright cells, respectively. Projects 1–17 were used for cross validation and projects 18–31 for hyperparameter tuning. As shown in this figure, almost all pairs of compounds from different projects are dissimilar, except for projects 1 and 6, whereas compounds from the same project tend to be similar. The prediction accuracies for projects 1 and 6 were relatively higher than those for the others, but the order of optimization for the other projects could still be predicted with a certain accuracy. Furthermore, the weights of the linear models were examined to understand the optimization strategy. More than half of the features had positive weights in all models. This implies that compounds with more “on” bits were predicted to be synthesized later. The number of “on” bits in ECFP6 corresponds to the variety of substructures in the compound. Thus, the prediction models reflect the tendency of compounds to become more complex as the optimization proceeds.

Fig. 6 Heatmap of the similarities of the compounds in all projects on the basis of the Tanimoto coefficient of ECFP6. The ticks correspond to the boundaries of the projects. Compound pairs with higher similarity are shown in brighter colors.

4. Conclusion

In this study, we modeled the optimization strategy using six learning to rank models in order to predict the drug-likeness of compounds as the evaluation function of the in silico compound optimization system. All results indicated that the order of synthesis can be predicted with significant correlation using a dataset containing 17 optimization projects from an established pharmaceutical company. The results also suggested that compounds dissimilar to the test compounds contribute to the prediction and that the optimization trend is reflected in the trained model. Although lead optimization is a complex, challenging process, we strongly believe our system will create a smooth and efficient optimization process in drug discovery.

5. Competing interests

The authors declare the following competing financial interest(s): Hideto Hara and Kentaro Rikimaru are employed by Takeda Pharmaceutical Co. Ltd.

6. Acknowledgement

The authors would like to thank N. Arai for useful discussions. This work was partially supported by the Research Complex Program “Well-being Research Campus: Creating new values through technological and social innovation” from Japan Science and Technology Agency (JST), the Japanese Society for the Promotion of Science (JSPS) KAKENHI Grant Numbers 15H02776 (To M.S.) and 16J09021 (To N.Y.), and the Platform Project for Supporting Drug Discovery and Life Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research (BINDS)) from AMED under Grant Number JP18am0101112.

References

[1] Paul, S.M., Mytelka, D.S., Dunwiddie, C.T., Persinger, C.C., Munos, B.H., Lindborg, S.R. and Schacht, A.L.: How to improve R&D productivity: the pharmaceutical industry’s grand challenge, Nature Reviews Drug Discovery, Vol.9, No.3, pp.203–214 (2010).
[2] Mullard, A.: New drugs cost US $2.6 billion to develop, Nature Reviews Drug Discovery, Vol.13, No.12, p.877 (2014).
[3] Keserü, G.M. and Makara, G.M.: The influence of lead discovery strategies on the properties of drug candidates, Nature Reviews Drug Discovery, Vol.8, No.3, pp.203–212 (2009).
[4] Bleicher, K.H., Böhm, H.-J., Müller, K. and Alanine, A.I.: Hit and lead generation: Beyond high-throughput screening, Nature Reviews Drug Discovery, Vol.2, No.5, pp.369–378 (2003).
[5] Keserü, G.M. and Makara, G.M.: Hit discovery and hit-to-lead approaches, Drug Discovery Today, Vol.11, No.15, pp.741–748 (2006).
[6] Korfmacher, W.: Lead optimization strategies as part of a drug metabolism environment, Current Opinion in Drug Discovery & Development, Vol.6, No.4, pp.481–485 (2003).
[7] Haggarty, S.J., Mayer, T.U., Miyamoto, D.T., Fathi, R., King, R.W., Mitchison, T.J. and Schreiber, S.L.: Dissecting cellular processes using small molecules: Identification of colchicine-like, taxol-like and other small molecules that perturb mitosis, Chemistry & Biology, Vol.7, No.4, pp.275–286 (2000).
[8] Young, K., Lin, S., Sun, L., Lee, E., Modi, M., Hellings, S., Husbands, M., Ozenberger, B. and Franco, R.: Identification of a calcium channel modulator using a high throughput yeast two-hybrid screen, Nature Biotechnology, Vol.16, No.10, pp.946–950 (1998).
[9] Sliwoski, G., Kothiwale, S., Meiler, J. and Lowe, E.W.: Computational methods in drug discovery, Pharmacological Reviews, Vol.66, No.1, pp.334–395 (2014).
[10] Egan, W.J., Merz, K.M. and Baldwin, J.J.: Prediction of drug absorption using multivariate statistics, Journal of Medicinal Chemistry, Vol.43, No.21, pp.3867–3877 (2000).
[11] Jorgensen, W.L. and Duffy, E.M.: Prediction of drug solubility from structure, Advanced Drug Delivery Reviews, Vol.54, No.3, pp.355–366 (2002).
[12] Chiba, S., Ikeda, K., Ishida, T., Gromiha, M.M., Taguchi, Y., Iwadate, M., Umeyama, H., Hsin, K.-Y., Kitano, H., Yamamoto, K., et al.: Identification of potential inhibitors based on compound proposal contest: Tyrosine-protein kinase Yes as a target, Scientific Reports, Vol.5, 17209 (2015).
[13] Yoshino, R., Yasuo, N., Hagiwara, Y., Ishida, T., Inaoka, D.K., Amano, Y., Tateishi, Y., Ohno, K., Namatame, I., Niimi, T., et al.: In silico, in vitro, X-ray crystallography, and integrated strategies for discovering spermidine synthase inhibitors for Chagas disease, Scientific Reports, Vol.7, 6666 (2017).
[14] Tan, J.J., Cong, X.J., Hu, L.M., Wang, C.X., Jia, L. and Liang, X.-J.: Therapeutic strategies underpinning the development of novel techniques for the treatment of HIV infection, Drug Discovery Today, Vol.15, No.5-6, pp.186–197 (2010).
[15] Arai, N., Yoshikawa, S., Yasuo, N., Yoshino, R. and Sekijima, M.: Compound property enhancement by virtual compound synthesis, Journal of Bioinformatics and Computational Biology, Vol.16, No.3, 1840016 (2018).
[16] Tyrchan, C. and Evertsson, E.: Matched Molecular Pair Analysis in Short: Algorithms, Applications and Limitations, Computational and Structural Biotechnology Journal, Vol.15, pp.86–90 (2016).
[17] Ritchie, T.J. and Macdonald, S.J.: Heterocyclic replacements for benzene: maximising ADME benefits by considering individual ring isomers, European Journal of Medicinal Chemistry, Vol.124, pp.1057–1068 (2016).
[18] Weber, J., Achenbach, J., Moser, D. and Proschak, E.: VAMMPIRE: A matched molecular pairs database for structure-based drug design and optimization, Journal of Medicinal Chemistry, Vol.56, No.12, pp.5203–5207 (2013).
[19] Cherkasov, A., Muratov, E.N., Fourches, D., Varnek, A., Baskin, I.I., Cronin, M., Dearden, J., Gramatica, P., Martin, Y.C., Todeschini, R., et al.: QSAR modeling: Where have you been? Where are you going to?, Journal of Medicinal Chemistry, Vol.57, No.12, pp.4977–5010 (2014).
[20] Verma, J., Khedkar, V.M. and Coutinho, E.C.: 3D-QSAR in drug design-a review, Current Topics in Medicinal Chemistry, Vol.10, No.1, pp.95–115 (2010).
[21] Zheng, W. and Tropsha, A.: Novel variable selection quantitative structure-property relationship approach based on the k-nearest-neighbor principle, Journal of Chemical Information and Computer Sciences, Vol.40, No.1, pp.185–194 (2000).
[22] Karelson, M., Lobanov, V.S. and Katritzky, A.R.: Quantum-chemical descriptors in QSAR/QSPR studies, Chemical Reviews, Vol.96, No.3, pp.1027–1044 (1996).
[23] Wildman, S.A. and Crippen, G.M.: Prediction of physicochemical parameters by atomic contributions, Journal of Chemical Information and Computer Sciences, Vol.39, No.5, pp.868–873 (1999).
[24] Lipinski, C.A., Lombardo, F., Dominy, B.W. and Feeney, P.J.: Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings, Advanced Drug Delivery Reviews, Vol.23, No.1-3, pp.3–25 (1997).
[25] Bickerton, G.R., Paolini, G.V., Besnard, J., Muresan, S. and Hopkins, A.L.: Quantifying the chemical beauty of drugs, Nature Chemistry, Vol.4, No.2, pp.90–98 (2012).
[26] Walters, W.P. and Murcko, M.A.: Can we learn to distinguish between “drug-like” and “nondrug-like” molecules?, Journal of Medicinal Chemistry, Vol.41, No.18, pp.3314–3324 (1998).
[27] Müller, K.-R., Rätsch, G., Sonnenburg, S., Mika, S., Grimm, M. and Heinrich, N.: Classifying ‘drug-likeness’ with kernel-based learning methods, Journal of Chemical Information and Modeling, Vol.45, No.2, pp.249–253 (2005).
[28] Byvatov, E., Fechner, U., Sadowski, J. and Schneider, G.: Comparison of support vector machine and artificial neural network systems for drug/nondrug classification, Journal of Chemical Information and Computer Sciences, Vol.43, No.6, pp.1882–1889 (2003).
[29] Liu, T.-Y.: Learning to Rank for Information Retrieval, Foundations and Trends in Information Retrieval, Vol.3, No.3, pp.225–331 (online), DOI: 10.1561/1500000016 (2009).
[30] Agarwal, S., Dugar, D. and Sengupta, S.: Ranking chemical structures for drug discovery: A new machine learning approach, Journal of Chemical Information and Modeling, Vol.50, No.5, pp.716–731 (2010).
[31] Rathke, F., Hansen, K., Brefeld, U. and Müller, K.-R.: StructRank: A new approach for ligand-based virtual screening, Journal of Chemical Information and Modeling, Vol.51, No.1, pp.83–92 (2010).
[32] Rogers, D. and Hahn, M.: Extended-connectivity fingerprints, Journal of Chemical Information and Modeling, Vol.50, No.5, pp.742–754 (2010).
[33] Joachims, T.: Optimizing search engines using clickthrough data, Proc. 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.133–142, ACM (2002).
[34] Chapelle, O. and Keerthi, S.S.: Efficient algorithms for ranking with SVMs, Information Retrieval, Vol.13, No.3, pp.201–215 (2010).
[35] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M. and Duchesnay, E.: Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research, Vol.12, pp.2825–2830 (2011).
[36] Bottou, L.: Large-scale machine learning with stochastic gradient descent, Proc. COMPSTAT 2010, pp.177–186 (2010).

Nobuaki Yasuo received his M. Eng. degree from Tokyo Institute of Technology. He is currently a Ph.D. student at the Department of Computer Science, Tokyo Institute of Technology. He is also a research fellow of the Japan Society for the Promotion of Science (DC1). His research interests include computer-aided drug discovery, machine learning, and cheminformatics. He is a student member of IPSJ.

Keisuke Watanabe received his B. Eng. degree from Tokyo Institute of Technology. He is currently a graduate student at the Graduate School of Arts and Sciences, The University of Tokyo. His current research interests include computational game players.

Hideto Hara received his Ph.D. degree in Pharmaceutical Sciences from Kyoto University, Kyoto, Japan, in 2008. He is currently a principal scientist at Drug Safety Research and Evaluation, Research, Takeda Pharmaceutical Company Limited. His current research interests include computational toxicology, computational chemistry and cheminformatics.

Kentaro Rikimaru received his Master's degree (2004) and his Ph.D. (2012) in Pharmaceutical Sciences from the University of Tokyo. He joined Takeda Pharmaceutical Company in 2004, then joined Axcelead Drug Discovery Partners Ltd in 2017. He is currently an engineer at ExaWizards, Inc., and has served in this role since 2018. His research interests include medicinal chemistry, computational chemistry, and real world data analysis.

Masakazu Sekijima received his Ph.D. from the University of Tokyo in 2002. He has worked at the National Institute of Advanced Industrial Science and Technology (AIST) since 2002 as a Research Staff and since 2003 as a Research Scientist. He also worked at Waseda University from 2006 to 2010 as a visiting associate professor. Since 2008 he has worked at Tokyo Institute of Technology as an associate professor. His current research interests are Computational Drug Discovery, High Performance Computing, Bioinformatics and Protein Science. He is a member of IPSJ, IEEE, ACM, Protein Society and Biophysical Society.

(Communicated by Yoichi Takenaka)

