pubs.acs.org/jcim | Application Note

NEXTorch: A Design and Bayesian Optimization Toolkit for Chemical Sciences and Engineering

Yifan Wang,§ Tai-Ying Chen,§ and Dionisios G. Vlachos*

Cite This: https://doi.org/10.1021/acs.jcim.1c00637

ABSTRACT: Automation and optimization of chemical systems require well-informed decisions on what experiments to run to reduce time, materials, and/or computations. Data-driven active learning algorithms have emerged as valuable tools to solve such tasks. Bayesian optimization, a sequential global optimization approach, is a popular active-learning framework. Past studies have demonstrated its efficiency in solving chemistry and engineering problems. We introduce NEXTorch, a library in Python/PyTorch, to facilitate laboratory or computational design using Bayesian optimization. NEXTorch offers fast predictive modeling, flexible optimization loops, visualization capabilities, easy interfacing with legacy software, and multiple types of parameters and data type conversions. It provides GPU acceleration, parallelization, and state-of-the-art Bayesian optimization algorithms and supports both automated and human-in-the-loop optimization. The comprehensive online documentation introduces Bayesian optimization theory and several examples from catalyst synthesis, reaction condition optimization, parameter estimation, and reactor geometry optimization. NEXTorch is open-source and available on GitHub.

Received: June 4, 2021

■ INTRODUCTION
Data generation in chemical sciences can be expensive and time consuming when performing computations or experiments. Chemical systems are often complex, multidimensional, and include various interacting parameters. These traits make autonomous discovery,1 predictive modeling, and optimization challenging. Design of experiments (DOE) has been employed to determine the relationships between input parameters and output responses. However, traditional DOE is difficult to scale to high dimensional problems, where the cost of fine-tuning parameters and locating optima is prohibitive. Active learning refers to the idea of a machine learning algorithm "learning" from data, proposing the next experiments or calculations, and improving prediction accuracy with fewer training data or lower cost.2 Bayesian optimization (BO), an active learning framework often used to tune hyperparameters in machine learning models, has seen a rise in its applications to various chemical science fields, including parameter tuning for density functional theory (DFT) calculations,3 catalyst synthesis,4,5 high throughput reactions,6 and computational material discovery.7−9 Its close variant, kriging,10 originating in geostatistics, has also been widely applied in process engineering.11,12 Fundamentally, BO is a sequential global optimization approach consisting of two essential parts: (1) a surrogate model (often a Gaussian process13) to approximate the system response and (2) an acquisition function to suggest new experiments to run. The method is designed to balance the exploration of uncertainty and the exploitation of current knowledge in the parameter space. Previously, researchers have designed similar global optimization methods for chemistry applications, such as the stable noisy optimization by Branch and Fit (SNOBFIT),14,15 genetic algorithms (GA),16,17 and reinforcement learning.18 Some methods resemble GP-based BO with different surrogate models (e.g., neural networks) and customized acquisition functions.19,20 Here, we focus our discussion on GP-based BO because its simple architecture scales well for medium-sized systems with fewer than 20 dimensions,21 requires less data and training time, and supports native uncertainty quantification. GP-based BO often outperforms experts or other algorithms in locating optima and producing accurate surrogate models with minimal customization effort.6 There is no doubt that BO could supercharge the field toward faster adoption of automation.

The machine learning community has developed several BO software tools, with most of them interfaced in Python. We
© XXXX American Chemical Society
J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Figure 1. Overview of (a) the active learning framework, (b) high-level workflow for (1) automated optimization and (2) human-in-the-loop
optimization, and (c) main classes and functions in NEXTorch.

curate a list of open-source BO packages and provide further discussion in Table S1. Spearmint22 and GPyOpt23 are among the early works that make BO accessible to end users. Recently, some packages have been built on popular machine learning frameworks, such as PyTorch and TensorFlow, to benefit from fast matrix operations, parallelization, and GPU acceleration. Examples include BoTorch24 and GPflowOpt.25 BoTorch stands out since it naturally supports parallel optimization, Monte Carlo acquisition functions, and advanced multitask and multiobjective optimization. The PyTorch backend also makes it suitable for easy experimentation and fast prototyping. However, most tools are designed for AI researchers or software engineers, often requiring a steep learning curve. For example, the adaptive experimentation platform, Ax, a wrapper of BoTorch, has multiple and complex application programming interfaces (APIs).26 The workflow may also be less intuitive to end users for laboratory experiments. Occasionally, design choices are made for fully automated optimization, keeping humans out of the optimization loop.27,28 The above reasons make these tools difficult to extend to chemistry or engineering problems, where domain knowledge is essential. We have seen such an attempt by the authors of edbo6 (a Bayesian reaction optimization package). They performed extensive testing and benchmarking to showcase the effectiveness of the method.6 However, the software is still based on command-line scripts, clear documentation is lacking, and it offers neither hardware acceleration nor the latest state-of-the-art BO methods.

From a practical perspective, we believe a BO tool should be scalable, flexible, and accessible to the end users, i.e., chemists and engineers. Hence, we build NEXTorch (Next EXperiment toolkit in PyTorch), extending the capabilities of BoTorch, to democratize the use of BO in chemical sciences. NEXTorch is unique for several reasons. First, it benefits from the modern architecture and a variety of models and functions offered by BoTorch. Second, going beyond BoTorch, NEXTorch provides connections to real-world problems, including automatic parameter scaling, data type conversions, and visualization. Compared to Ax, NEXTorch offers more intuitive options for keeping researchers in the optimization

process and more choices for DOE and visualization functions. Both packages are wrappers of BoTorch and should converge to near-identical optima at similar rates (an example is shown in Figure S1). These features allow human-in-the-loop design, where decision making on generating future data is aided by domain knowledge. Utilizing these features, NEXTorch can assist not only chemical synthesis in laboratory experiments4 but also multiscale computational tasks, from molecular-scale design, such as heterogeneous catalyst calculations17 and homogeneous (ligand) catalyst discovery, to reactor-scale optimization, i.e., automatic reactor optimization with computational fluid dynamics (CFD).29 Third, NEXTorch is modular, making it easy to extend to other frameworks. It also serves as a library for learning the theory and implementing BO. We provide clear documentation and guided examples at https://nextorch.readthedocs.io/en/latest/. We believe its ease of use could serve the community, including experimentalists with little or no programming background.

In the first section of this paper, we review the theoretical foundations of NEXTorch. In the second, we describe the design of the software and the data structures. In the third, we demonstrate how NEXTorch can optimize plug flow reactor (PFR) performance.

■ THEORY
Overview. Our framework (Figure 1a) integrates DOE, BO, and surrogate modeling. The goal is to optimize the function of interest, i.e., the objective function f(X). X denotes the input variables (parameters). Here, X = x1, x2, x3, ..., xd; each xi is a vector xi = (x1i, x2i, ..., xni)^T; n is the number of sample points, and d is the dimensionality of the input. Each parameter is bounded, and the resulting multidimensional space is called the design space A, i.e., X ∈ A ⊂ ℝ^d. The objective function is evaluated using complex computer simulations or experiments. The responses, Y, are usually expensive, time consuming, or difficult to calculate or measure. A set of initial sampling data Xinit is generated using DOE. These sampling points are passed to evaluate f. One collects the data D = (X, Y) and uses it to train a cheap-to-evaluate surrogate model f̂ (e.g., a Gaussian process). Next, an acquisition function gives the new sampling points (i.e., infill points, Xnew) for achieving an optimization goal. At this stage, one could choose to visualize the response surfaces using the surrogate model or the infill point locations in the design space. A new data set is collected by evaluating f at Xnew, and the surrogate model f̂ is updated. This process is repeated until the accuracy of f̂ is satisfactory or the optimum location x* = argmax_{x∈A} f(x) is found.

Design of Experiments (DOE). Standard methods include full factorial design, completely randomized design, and Latin hypercube sampling (LHS), described in statistics books.30 Here, we use LHS heavily due to its near-random design and efficient space-filling abilities.31

Gaussian Process (GP). A GP model constructs a joint probability distribution over the variables, assuming a multivariate Gaussian distribution. A GP is determined from a mean function μ(X) and a covariance kernel function Σ(X, X′).13 A Matérn covariance function is typically used as the kernel function21

C_ν(d) = σ² [2^(1−ν)/Γ(ν)] (√(2ν) d/ρ)^ν K_ν(√(2ν) d/ρ)   (1)

where d represents the distance between two points, σ represents the standard deviation, Γ is the gamma function, K_ν is the modified Bessel function, and ν and ρ are non-negative parameters.

Acquisition Function. The acquisition function is applied to obtain the new sampling point Xnew. It measures the value of evaluating the objective function at Xnew, based on the current posterior distribution over f̂. The most commonly used acquisition function is the expected improvement (EI). The EI is the expectation, taken under the posterior distribution f̂, of the improvement of the new observation at X over the current best observation f*

EI(X) = E[max(f̂(X) − f*, 0)]   (2)

Aside from the EI, other acquisition functions for single-objective BO are available in NEXTorch, including the probability of improvement (PI), the upper confidence bound (UCB), and their Monte Carlo variants (qEI, qPI, qUCB).

Parameter Types. Parameters and the associated data are generally continuous, categorical, or ordinal. Continuous parameters are numerical and take any real value in a range. Categorical parameters are non-numeric and are denoted by words, text, or symbols. Ordinal parameters take ordered discrete values. Traditionally, BO is designed for systems with all continuous parameters. However, depending on the problem, the resulting design space can be discrete or mixed (continuous−discrete). If the number of discrete combinations is low, one possible solution is to enumerate the values and optimize the continuous parameters for each. Without loss of generality, we use a continuous relaxation, where the acquisition function is optimized and then rounded to the available values. For ordinal parameters, these values are the ordered discrete values. For categorical parameters, we encode the categories with integers from 0 to ncategory − 1 and then perform the continuous relaxation in the encoding space. Since a parameter can be approximated as continuous, given a high-order discretization, this approach usually works well for problems with many discrete combinations.

Multiobjective Optimization (MOO). MOO involves minimizing (maximizing) a multiobjective function. The optimum is not a single point but a set of solutions defining the best trade-off between competing objectives. The goodness of a solution is determined by dominance: X1 dominates X2 when X1 is not worse than X2 in all objectives and X1 is strictly better than X2 in at least one objective. A Pareto optimal set defines points not dominated by each other, whose boundary is the Pareto front.

The weighted-sum method is a classic MOO method that scalarizes a set of objectives {Y1, Y2, ..., Yi, ..., YM} into a new objective Ymod by multiplying each with user-defined weights {w1, w2, ..., wi, ..., wM}, where Ymod = ∑_{i=1}^{M} wi Yi. The MOO then becomes a single-objective optimization. An alternative is to use the expected hypervolume improvement (EHVI) as the acquisition function. A hypervolume indicator (HV) approximates the Pareto set, and EHVI evaluates its EI. In NEXTorch, one can use either the classic weighted-sum method or the state-of-the-art Monte Carlo EHVI (qEHVI) method as an acquisition function,32 which avoids scalarization. The methods are detailed in the Supporting Information (SI).
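To make eqs 1 and 2 concrete, the sketch below implements one BO iteration in plain NumPy. It is purely illustrative and independent of the NEXTorch API: it uses the ν = 5/2 case of the Matérn kernel (for which the Bessel function reduces to a closed form), conditions a zero-mean GP on four toy observations, and maximizes the closed-form EI on a grid to propose the next infill point.

```python
import numpy as np
from math import erf

def matern52(X1, X2, sigma=1.0, rho=0.5):
    # Eq 1 with nu = 5/2, where K_nu reduces to the closed form
    # C(d) = sigma^2 (1 + sqrt(5)d/rho + 5d^2/(3 rho^2)) exp(-sqrt(5)d/rho)
    d = np.abs(X1[:, None] - X2[None, :])
    a = np.sqrt(5.0) * d / rho
    return sigma**2 * (1.0 + a + a**2 / 3.0) * np.exp(-a)

def gp_posterior(X_train, y_train, X_test, noise=1e-8):
    # Standard zero-mean GP regression posterior mean and standard deviation
    K = matern52(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = matern52(X_train, X_test)
    mu = Ks.T @ np.linalg.solve(K, y_train)
    var = np.diag(matern52(X_test, X_test)) - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mu, sd, f_best):
    # Closed form of eq 2 under a Gaussian posterior (maximization)
    z = (mu - f_best) / sd
    Phi = 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))   # normal CDF
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)          # normal PDF
    return (mu - f_best) * Phi + sd * phi

# Toy 1-D objective; four initial points, then pick the next infill point
f = lambda x: -(x - 0.6) ** 2
X_train = np.array([0.0, 0.3, 0.7, 1.0])
y_train = f(X_train)
X_grid = np.linspace(0.0, 1.0, 201)
mu, sd = gp_posterior(X_train, y_train, X_grid)
ei = expected_improvement(mu, sd, y_train.max())
x_new = X_grid[np.argmax(ei)]   # infill point suggested by EI
```

In NEXTorch, the GP fit and the acquisition maximization are delegated to BoTorch (and run on a GPU when available); the grid search here is only for readability.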

Figure 2. Optimization of HMF yield. (a) Scheme of a PFR that converts fructose to HMF via acid-catalyzed dehydration. (b) Response surface
and (c) surface plot for the ground truth model at pH 0.7. (d) Sampling plan, (e) response surface, and (f) surface plot at pH 0.7 for the full
factorial approach. (g) Sampling plan, (h) response surface, and (i) surface plot at pH 0.7 for the LHS+BO approach. In (d) and (g), the sampling
points are visualized in three dimensions, and the two-dimensional plane at pH 0.7 is shown in green. In (g), the initial points from LHS are shown
in blue, and the infill points in red.
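The LHS sampling plan in Figure 2g can be reproduced in spirit with a minimal Latin hypercube sampler. This plain-Python sketch is for illustration only (it is not the implementation behind the library's DOE functions); it shows the stratification that gives LHS its space-filling property.

```python
import random

def latin_hypercube(n, d, seed=0):
    # One point per row. Each column is a random permutation of the n strata,
    # jittered within each stratum, so every 1/n slice of every dimension
    # contains exactly one sample (the space-filling property noted in the text).
    rng = random.Random(seed)
    cols = []
    for _ in range(d):
        strata = list(range(n))
        rng.shuffle(strata)
        cols.append([(s + rng.random()) / n for s in strata])
    return [list(point) for point in zip(*cols)]

X_init = latin_hypercube(n=10, d=3)   # 10 initial points in the unit cube
```

The unit-cube points would then be scaled to real parameter ranges (e.g., T, pH, and tf in the PFR example) before evaluating the objective.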

■ SOFTWARE DESIGN
The NEXTorch software package is structured in a similar way to the active learning framework. Figure 1b shows the high-level workflow. Initially, users identify the parameters and objectives and frame the optimization problem. The critical information includes the ranges and types (categorical, ordinal, continuous, or mixed) of each parameter. It is helpful to know the sensitivity of the parameters by performing exploratory data analysis. Depending on the availability of the objective function, NEXTorch supports two types of optimization: (1) automated optimization, where the analytical form of the objective function is known and provided to the software (in the form of a Python object). This is often the case in modeling and simulations; however, closed-form objective functions may also not be available in complex models, a situation that falls under the next type; (2) human-in-the-loop optimization, where the objective function is unknown, as happens in laboratory experiments. We call the action of generating data from the objective function an "experiment," which is also the name of the core class in NEXTorch, irrespective of whether this is done in the laboratory or on a computer. In (1), data are passed through the loop, and new experiments are conducted automatically as suggested by the acquisition function. In (2), visualization could help the users decide whether to carry on the experiments or adjust the experimental setup. The users are left to perform new trials, i.e., conduct further experiments and supply additional data. NEXTorch reads CSV or Excel files and exports the data in the same format.

NEXTorch is implemented in a modular fashion. Figure 1c summarizes its available classes, modules, and functions. The Parameter class stores the range and type of each parameter.
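The bookkeeping NEXTorch automates for parameters (unit scaling, integer encoding of categories, and rounding-based continuous relaxation) can be mocked in plain Python. The classes below are a hypothetical simplification for illustration only; the real Parameter and ParameterSpace classes have richer interfaces.

```python
from dataclasses import dataclass

@dataclass
class Parameter:
    # Mock of the bookkeeping described in the text: a name, a type, and
    # either a (low, high) range or the discrete levels/categories.
    name: str
    ptype: str      # 'continuous', 'ordinal', or 'categorical'
    values: tuple   # (low, high) for continuous; the levels otherwise

    def encode(self, x):
        """Map a real-unit value onto [0, 1] (index-based for discrete types)."""
        if self.ptype == 'continuous':
            lo, hi = self.values
            return (x - lo) / (hi - lo)
        # discrete levels/categories encoded as integers 0 ... n - 1
        return self.values.index(x) / (len(self.values) - 1)

    def decode(self, u):
        """Map a unit value back to real units. Discrete types use continuous
        relaxation: round to the nearest allowed level (index space here,
        a simplification of the scheme in the Theory section)."""
        if self.ptype == 'continuous':
            lo, hi = self.values
            return lo + u * (hi - lo)
        return self.values[round(u * (len(self.values) - 1))]

@dataclass
class ParameterSpace:
    parameters: list

    def decode_point(self, U):
        # Convert one unit-scaled candidate point to real units
        return [p.decode(u) for p, u in zip(self.parameters, U)]

# The PFR example's space: T (140-200 C), pH (0-1), tf (0.01-100 min)
space = ParameterSpace([
    Parameter('T', 'continuous', (140.0, 200.0)),
    Parameter('pH', 'continuous', (0.0, 1.0)),
    Parameter('tf', 'continuous', (0.01, 100.0)),
])
X_real = space.decode_point([0.5, 0.7, 0.0])   # -> [170.0, 0.7, 0.01]
```

decode_point maps a unit-scaled candidate (e.g., an acquisition maximizer) back to real units, mirroring the conversion NEXTorch performs when reporting values such as X_new_real.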

The ParameterSpace class collects all the Parameter objects. In the beginning, users can choose a DOE function to construct a sampling plan and get the initial responses. Next, the data are passed to an Experiment class. We integrate most acquisition functions and GP models from the upstream BoTorch with the Experiment classes. The Experiment classes are the core of NEXTorch, where data are preprocessed, GP models are trained and used for prediction, and optima are located. There are several variants of the Experiment class depending on the application. At a higher level, users interact with the Experiment, WeightedMOOExperiment, and EHVIMOOExperiment classes for sequential single-objective optimization, MOO with the weighted-sum method, and MOO with EHVI, respectively. Moreover, NEXTorch also offers various visualization functions for the sampling plans, response surfaces/heatmaps, acquisition functions, Pareto fronts, etc., up to three dimensions.

■ APPLICATIONS
To demonstrate the easy implementation and modularity of NEXTorch, we provide comprehensive examples on the documentation page (https://nextorch.readthedocs.io/en/latest/user/examples.html). The examples cover various types of experiments, designs, BO methods, and parameter types from reaction engineering and catalysis synthesis. A full list of examples and their descriptions is in Table S2.

Here, we briefly demonstrate the overall optimization pipeline for two specific cases with code snippets (shown in the SI) on single- and multi-objective optimization, respectively. The examples are built on prior kinetics studies of fructose conversion to 5-hydroxymethylfurfural (HMF) in a PFR.33,34 Fructose, produced from biomass, can be converted to valuable fuels and chemicals through a chain of reactions. HMF is an essential intermediate in this supply chain, derived from fructose through acid-catalyzed dehydration.35,36 Side reactions produce byproducts, such as humins, formic acid (FA), and levulinic acid (LA). A schematic of the PFR is shown in Figure 2a. The objective functions are derived from a kinetic model consisting of a set of ordinary differential equations (ODEs) that compute the concentrations of each species at a given time in a batch reactor or location in a PFR (as is the case here).1 The input parameters (X) include the reaction temperature (T), pH, and final residence time (tf). Their ranges are 140−200 °C, 0−1, and 0.01−100 min, respectively.

Case 1: Renewable Platform Chemical (HMF) Yield Optimization. In this case, the goal is to maximize a single objective, the HMF yield (Y), in the three-dimensional continuous space. The objective function (PFR_yield) is a Python function object. The first steps involve importing NEXTorch modules, defining the parameter space, and generating a DOE sampling plan to obtain an initial set of responses. Code example S1 illustrates these steps. Next, we initialize an Experiment object, input the initial data, and set the goal as maximization (Code example S2). Here, 54 additional BO trials using the default acquisition function (EI), where one infill point is generated in each iteration, are obtained (Code example S3). An explicit (human-in-the-loop) loop allows users to access the values of the parameters (X_new_real) and responses (Y_new_real) in real units. The final set of optimal values can also be extracted and exported to standard data storage files.

NEXTorch also offers a variety of visualization options (Figure 2). We compare the aforementioned LHS+BO approach with two other sampling plans with the same total number of 64 points: (1) a full factorial design with four levels in each dimension and (2) completely random sampling. In each approach, we construct an Experiment object and train a GP model based on all points. BO samples more in the low-pH and low-tf region, where the potential optimum is (Figure 2g). A two-dimensional plane at pH 0.7 is shown in the three-dimensional space. On this plane, the response surfaces (Figure 2b, e, h) and surface plots (Figure 2c, f, i) suggest that the surrogate model predictions from LHS+BO are generally closer to the ground truth. Three additional sampling methods for response surface design and low-discrepancy sequences, including the Box−Behnken design, the central composite design, and the Sobol sequence, are also compared. The sampling plans, response surfaces, and surface plots are shown in Figure S4. Similarly, the surface plots suggest that the surrogate model predictions of LHS+BO are generally closer to the ground truth. Next, we plot the best yield observed sequentially (Figure S5). The method also locates a higher optimal value compared to the others. This example showcases that LHS+BO efficiently converges to the optimum and produces accurate surrogate models at an affordable computational cost. Depending on the available time and resources, traditional DOEs, such as the factorial design, can also be used for the initial plan to sample across the entire space, and then BO can search for the global optimum. The traditional factorial design method samples the space less extensively and can be less efficient. The runtime of the core BO functions (GP training and acquisition function evaluation) completes in seconds per iteration on a laptop CPU, negligible compared to the cost of the objective function evaluation. Even though we use a simple PFR model here with a relatively small reaction network of overall reactions, the central idea and workflow remain the same for other, more complex systems, such as CFD simulations and/or parameter estimation of complex reaction networks. We recently demonstrated such an example of combining the proposed workflow and CFD simulations to efficiently optimize a microwave reactor in a high dimensional space by taking advantage of the modular design of NEXTorch and the effectiveness of BoTorch.29

Case 2: Simultaneous Two-Variable (HMF Yield and Selectivity) Optimization. While aiming to maximize the HMF yield, as in Case 1, it is crucial to maintain a high HMF selectivity to reduce the separation costs. In this regard, two objectives, yield and selectivity, need to be optimized. These two variables may not be maximized simultaneously: an increase in yield may lead to a decrease in selectivity and vice versa. Therefore, we perform multiobjective optimization to find a Pareto front.

As in Case 1, the optimization is performed in the three-dimensional continuous space, and the objective function (PFR_yield) is a Python function object, which returns the yield and selectivity of HMF. We initialize an EHVIMOOExperiment object, which uses the Monte Carlo EHVI (qEHVI) as the acquisition function, and set the reference point. The reference point defines values slightly worse than the lower bound of the objective values that are acceptable for each objective (Code example S4). Alternatively, we can initialize a WeightedMOOExperiment object, which handles multiple weight combinations to perform multiobjective optimizations using the weighted-sum method automatically (Code example S5). Then, we perform the multiobjective optimization using either method to obtain the Pareto optimal set (Code example

Figure 3. Pareto optimum of HMF yield and selectivity using the (a) qEHVI and (b) weighted-sum methods. The diagonal line indicates the locus
of 100% conversion conditions. As the points approach the diagonal, the conversion increases. The selectivity drops only slightly as the conversion
increases until high conversions are reached.
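The machinery behind Figure 3 (dominance filtering to obtain a Pareto front, weighted-sum scalarization, and the hypervolume that qEHVI seeks to improve) can be sketched in a few lines of plain Python. This toy version is for intuition only and is not the NEXTorch/BoTorch implementation.

```python
def dominates(a, b):
    # a dominates b: no worse in all objectives, strictly better in at least one
    # (maximization convention, as in the Theory section)
    return all(ai >= bi for ai, bi in zip(a, b)) and any(ai > bi for ai, bi in zip(a, b))

def pareto_front(points):
    # Keep only the points not dominated by any other point
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

def weighted_sum(points, w):
    # Scalarize the objectives into Ymod = sum_i w_i Y_i and take the maximizer
    return max(points, key=lambda p: sum(wi * yi for wi, yi in zip(w, p)))

def hypervolume_2d(front, ref):
    # Area dominated by a 2-D Pareto front relative to a reference point
    # (sorted descending in objective 1, sweep over objective 2)
    hv, y_prev = 0.0, ref[1]
    for y1, y2 in sorted(front, reverse=True):
        hv += (y1 - ref[0]) * (y2 - y_prev)
        y_prev = y2
    return hv

# Toy (yield, selectivity) trade-off points
pts = [(0.2, 0.9), (0.5, 0.8), (0.7, 0.6), (0.6, 0.5), (0.9, 0.2), (0.3, 0.3)]
front = pareto_front(pts)                    # drops the two dominated points
best_balanced = weighted_sum(front, (0.5, 0.5))
hv = hypervolume_2d(front, ref=(0.0, 0.0))
```

On these toy points, the weighted-sum pick depends entirely on the preassigned weights, whereas the Pareto filter and the hypervolume describe the whole front; this is the same trade-off between the two Experiment variants discussed in the text.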

S6). Utilizing the visualization tool of NEXTorch, we compare the Pareto fronts from the two methods: both sets of optima lie in the region where the HMF selectivity is greater than or equal to the HMF yield, and they share a similar Pareto front (Figure 3). Interestingly, the Pareto front obtained from the qEHVI method covers a larger range than the weighted-sum method. This is attributed to the limitation of the preassigned weights for each objective in the weighted-sum method, which are not guaranteed to cover the desired region of the design space. Moreover, the computational time of the weighted-sum method (12.9 min) is orders of magnitude higher than that of the qEHVI method (0.2 min) because the objective function is called many more times (n_trials × n_exp = 420 in Code examples S5 and S6). This example showcases that the qEHVI method efficiently finds the Pareto optimum and produces an accurate surrogate model at a lower cost.

■ CONCLUSIONS
Optimization and predictive modeling are ubiquitous in chemical sciences. Bayesian optimization-based active learning is a powerful approach for such tasks. The NEXTorch library was created to enable streamlined and efficient Bayesian optimization and surrogate modeling for chemical sciences and engineering applications. Its backend from BoTorch/PyTorch enables GPU acceleration, parallelization, and state-of-the-art Bayesian optimization algorithms. The modular and flexible design of NEXTorch expands its capabilities and connects with real-world problems. Specifically, it can deal with mixed types of parameters and data-type conversions, supports both automated and human-in-the-loop optimization, and offers various visualization options. It can be used in chemical synthesis, molecular modeling, reaction condition optimization, parameter estimation, and reactor geometry optimization, to mention a few examples. Such tasks can easily be performed without extensive programming effort so that the user can focus on domain-specific questions. Moreover, NEXTorch enables an interface with the commonly used simulation tools in the reaction engineering field, such as CFD and multiphysics simulations, for automatic optimization. Including surrogate models, such as random forest and multifidelity models, extending the acquisition function list, and adding a graphical user interface are important future directions. We believe the adoption of Bayesian optimization in daily laboratory or computing practices could advance the field toward a more efficient and automated future.

■ DATA AND SOFTWARE AVAILABILITY
NEXTorch is publicly available free of charge at https://github.com/VlachosGroup/nextorch.37 It can be installed using the standard package-management system in Python. The online documentation is available at https://nextorch.readthedocs.io/en/latest.

■ ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.1c00637.
Comparison of existing Python-based Bayesian optimization and kriging packages, details of multiobjective optimization methods, list of examples in the NEXTorch online documentation, comparison of the best yield value attained using different sampling methods, and code examples for cases 1 and 2 (PDF)

■ AUTHOR INFORMATION
Corresponding Author
Dionisios G. Vlachos − Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, Delaware 19716, United States; Catalysis Center for Energy Innovation, RAPID Manufacturing Institute, and Delaware Energy Institute (DEI), University of Delaware, Newark, Delaware 19716, United States; orcid.org/0000-0002-6795-8403; Email: vlachos@udel.edu

Authors
Yifan Wang − Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, Delaware 19716, United States; Catalysis Center for Energy Innovation, RAPID Manufacturing Institute, and Delaware Energy Institute (DEI), University of Delaware, Newark, Delaware 19716, United States
Tai-Ying Chen − Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, Delaware 19716, United States; Catalysis Center for Energy Innovation, RAPID Manufacturing Institute, and Delaware

Energy Institute (DEI), University of Delaware, Newark, Delaware 19716, United States

Complete contact information is available at:
https://pubs.acs.org/10.1021/acs.jcim.1c00637

Author Contributions
§Y. Wang and T.-Y. Chen contributed equally.
Notes
The authors declare no competing financial interest.

■ ACKNOWLEDGMENTS
Funding from the RAPID manufacturing institute, supported by the U.S. Department of Energy (DOE), Advanced Manufacturing Office (AMO), Award No. DE-EE0007888-9.5, is gratefully acknowledged. The Delaware Energy Institute gratefully acknowledges the support and partnership of the State of Delaware in furthering the essential scientific research being conducted through the RAPID projects. The authors thank Jaynell Keely for assistance with the graphics.

■ REFERENCES
(1) Coley, C. W.; Eyke, N. S.; Jensen, K. F. Autonomous Discovery in the Chemical Sciences Part II: Outlook. Angew. Chem., Int. Ed. 2020, 59, 23414.
(2) Settles, B. Active Learning Literature Survey; Computer Sciences Technical Report 1648; University of Wisconsin, 2010.
(3) Yu, M.; Yang, S.; Wu, C.; Marom, N. Machine Learning the Hubbard U Parameter in DFT+U Using Bayesian Optimization. npj Comput. Mater. 2020, 6, DOI: 10.1038/s41524-020-00446-9.
(4) Ebikade, E. O.; Wang, Y.; Samulewicz, N.; Hasa, B.; Vlachos, D. Active Learning-Driven Quantitative Synthesis−Structure−Property Relations for Improving Performance and Revealing Active Sites of Nitrogen-Doped Carbon for the Hydrogen Evolution Reaction. React. Chem. Eng. 2020, 5, 2134.
(5) Burger, B.; Maffettone, P. M.; Gusev, V. V.; Aitchison, C. M.; Bai, Y.; Wang, X.; Li, X.; Alston, B. M.; Li, B.; Clowes, R.; Rankin, N.; Harris, B.; Sprick, R. S.; Cooper, A. I. A Mobile Robotic Chemist. Nature 2020, 583, 237−241.
(6) Shields, B. J.; Stevens, J.; Li, J.; Parasram, M.; Damani, F.; Alvarado, J. I. M.; Janey, J. M.; Adams, R. P.; Doyle, A. G. Bayesian Reaction Optimization as a Tool for Chemical Synthesis. Nature 2021, 590, 89−96.
(7) Del Rosario, Z.; Rupp, M.; Kim, Y.; Antono, E.; Ling, J. Assessing the Frontier: Active Learning, Model Accuracy, and Multi-Objective Candidate Discovery and Optimization. J. Chem. Phys. 2020, 153, 024112.
(8) Tran, A.; Tranchida, J.; Wildey, T.; Thompson, A. P. Multi-Fidelity Machine-Learning with Uncertainty Quantification and Bayesian Optimization for Materials Design: Application to Ternary Random Alloys. J. Chem. Phys. 2020, 153, 074705.
(14) … Reconfigurable System for Automated Optimization of Diverse Chemical Reactions. Science (Washington, DC, U. S.) 2018, 361, 1220−1225.
(15) Mateos, C.; Nieves-Remacha, M. J.; Rincón, J. A. Automated Platforms for Reaction Self-Optimization in Flow. React. Chem. Eng. 2019, 4, 1536−1544.
(16) Berardo, E.; Turcani, L.; Miklitz, M.; Jelfs, K. E. An Evolutionary Algorithm for the Discovery of Porous Organic Cages. Chem. Sci. 2018, 9, 8513−8527.
(17) Wang, Y.; Su, Y.-Q.; Hensen, E. J. M.; Vlachos, D. G. Finite-Temperature Structures of Supported Subnanometer Catalysts Inferred via Statistical Learning and Genetic Algorithm-Based Optimization. ACS Nano 2020, 14, 13995−14007.
(18) Zhou, Z.; Li, X.; Zare, R. N. Optimizing Chemical Reactions with Deep Reinforcement Learning. ACS Cent. Sci. 2017, 3, 1337−1344.
(19) Häse, F.; Roch, L. M.; Kreisbeck, C.; Aspuru-Guzik, A. Phoenics: A Bayesian Optimizer for Chemistry. ACS Cent. Sci. 2018, 4, 1134−1145.
(20) Janet, J. P.; Ramesh, S.; Duan, C.; Kulik, H. J. Accurate Multiobjective Design in a Space of Millions of Transition Metal Complexes with Neural-Network-Driven Efficient Global Optimization. ACS Cent. Sci. 2020, 6, 513.
(21) Frazier, P. I. A Tutorial on Bayesian Optimization. arXiv Preprint, arXiv:1807.02811v1, 2018.
(22) Snoek, J.; Larochelle, H.; Adams, R. P. Practical Bayesian Optimization of Machine Learning Algorithms. In NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems, December 2012; Vol. 2, pp 2951−2959.
(23) GPyOpt: A Bayesian Optimization Framework in Python. 2016. https://github.com/SheffieldML/GPyOpt.
(24) Balandat, M.; Karrer, B.; Jiang, D. R.; Daulton, S.; Letham, B.; Wilson, A. G.; Bakshy, E. BoTorch: Programmable Bayesian Optimization in PyTorch. arXiv Preprint, arXiv:1910.06403v3, 2019.
(25) Knudde, N.; Van Der Herten, J.; Dhaene, T.; Couckuyt, I. GPflowOpt: A Bayesian Optimization Library Using TensorFlow. arXiv Preprint, arXiv:1711.03845v1, 2017.
(26) Facebook/Ax: Adaptive Experimentation Platform. https://github.com/facebook/Ax (accessed October 2021).
(27) Shahriari, B.; Swersky, K.; Wang, Z.; Adams, R. P.; De Freitas, N. Taking the Human out of the Loop: A Review of Bayesian Optimization. Proc. IEEE 2016, 104, 148−175.
(28) Kandasamy, K.; Vysyaraju, K. R.; Neiswanger, W.; Paria, B.; Collins, C. R.; Schneider, J.; Poczos, B.; Xing, E. P. Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly. arXiv Preprint, arXiv:1903.06694v2, 2019.
(29) Chen, T.-Y.; Baker-Fales, M.; Vlachos, D. G. Operation and Optimization of Microwave-Heated Continuous-Flow Microfluidics. Ind. Eng. Chem. Res. 2020, 59, 10418−10427.
(30) Ogunnaike, B. A. Random Phenomena: Fundamentals of Probability and Statistics for Engineers; CRC Press:
(9) Montoya, J. H.; Winther, K. T.; Flores, R. A.; Bligaard, T.; Boca Raton, FL, 2010.
Hummelshøj, J. S.; Aykol, M. Autonomous Intelligent Agents for (31) McKay, M. D.; Beckman, R. J.; Conover, W. J. A Comparison
Accelerated Materials Discovery. Chem. Sci. 2020, 11, 8517−8532. of Three Methods for Selecting Values of Input Variables in the
(10) Forrester, A. I. J.; Sóbester, A.; Keane, A. J. Engineering Design Analysis of Output from a Computer Code. Technometrics 1979, 21,
via Surrogate Modelling; Wiley, 2008. 239.
(11) Jones, D. R.; Schonlau, M.; Welch, W. J. Efficient Global (32) Daulton, S.; Balandat, M.; Bakshy, E.Differentiable Expected
Optimization of Expensive Black-Box Functions,. J. Glob. Optim. Hypervolume Improvement for Parallel Multi-Objective Bayesian
1998, 13 (4), 455−492. Optimization. arXiv Preprint, arXiv:2006.05078v3, 2020.
(12) Boukouvala, F.; Muzzio, F. J.; Ierapetritou, M. G. Dynamic (33) Swift, T. D.; Bagia, C.; Choudhary, V.; Peklaris, G.; Nikolakis,
Data-Driven Modeling of Pharmaceutical Processes. Ind. Eng. Chem. V.; Vlachos, D. G. Kinetics of Homogeneous Brønsted Acid Catalyzed
Res. 2011, 50, 6743−6754. Fructose Dehydration and 5-Hydroxymethyl Furfural Rehydration: A
(13) Rasmussen, C. E.; Williams, C. K. I. Gaussian Processes for Combined Experimental and Computational Study. ACS Catal. 2014,
Machine Learning; MIT Press, 2005; Vol. 7. DOI: 10.7551/mitpress/ 4, 259−267.
3206.001.0001 (34) Desir, P.; Saha, B.; Vlachos, D. G. Ultrafast Flow Chemistry for
(14) Bédard, A.; Adamo, A.; Aroh, K. C.; Russell, M. G.; Bedermann, the Acid-Catalyzed Conversion of Fructose. Energy Environ. Sci. 2019,
A. A.; Torosian, J.; Yue, B.; Jensen, K. F.; Jamison, T. F. 12, 2463−2475.
(35) Chen, T.-Y.; Desir, P.; Bracconi, M.; Saha, B.; Maestri, M.; Vlachos, D. G. Liquid-Liquid Microfluidic Flows for Ultrafast 5-Hydroxymethyl Furfural Extraction. Ind. Eng. Chem. Res. 2021, 60, 3723−3735.
(36) Chen, T.-Y.; Cheng, Z.; Desir, P.; Saha, B.; Vlachos, D. G. Fast Microflow Kinetics and Acid Catalyst Deactivation in Glucose Conversion to 5-Hydroxymethylfurfural. React. Chem. Eng. 2021, 6, 152−164.
(37) Wang, Y.; Chen, T.-Y.; Vlachos, D. G. NEXTorch: A Design and Bayesian Optimization Toolkit for Chemical Sciences and Engineering. ChemRxiv Preprint, 2021. DOI: 10.26434/chemrxiv.14727861.