
Review

Cite This: ACS Catal. 2020, 10, 2260−2297 pubs.acs.org/acscatalysis

Machine Learning for Catalysis Informatics: Recent Applications and Prospects
Takashi Toyao,†,‡ Zen Maeno,† Satoru Takakusagi,† Takashi Kamachi,‡,§ Ichigaku Takigawa,*,∥,⊥
and Ken-ichi Shimizu*,†,‡

†Institute for Catalysis, Hokkaido University, N-21, W-10, Sapporo 001-0021, Japan
‡Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto 615-8520, Japan
§Department of Life, Environment and Materials Science, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-ku, Fukuoka 811-0295, Japan
∥RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
⊥Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido 001-0021, Japan

ABSTRACT: The discovery and development of catalysts and catalytic
processes are essential components to maintaining an ecological balance in the
future. Recent revolutions made in data science could have a great impact on
traditional catalysis research in both industry and academia and could
accelerate the development of catalysts. Machine learning (ML), a subfield of
data science, can play a central role in this paradigm shift away from the use of
traditional approaches. In this review, we present a user’s guide for ML that we
believe will be helpful for scientists performing research in the field of catalysis
and summarize recent progress that has been made in utilizing ML to create
homogeneous and heterogeneous catalysts. The focus of the review is on the
design, synthesis, and characterization of catalytic materials/compounds as
well as their applications to catalyzed processes. The ML technique not only
enhances ways to discover catalysts but also serves as a powerful tool to
establish a deeper understanding of relationships between the properties of materials/compounds and their catalytic activities,
selectivities, and stabilities. This knowledge facilitates the establishment of principles employed to design catalysts and to
enhance their efficiencies. Despite such advantages of ML, it is noteworthy that the current ML-assisted development of real
catalysts remains in its infancy, mainly because of the complexity of catalysis associated with the fact that catalysis is a time-
dependent dynamic event. In this review, we discuss how seamless integration of experiment, theory, and data science can be
used to accelerate catalyst development and to guide future studies aimed at applications that will impact society’s need to
produce energy, materials, and chemicals. Moreover, the limitations and difficulties of ML in catalysis research originating from
the complex nature of catalysis are discussed in order to make the catalysis community aware of challenges that need to be
addressed for effective and practical use of ML in the field.
KEYWORDS: machine learning, catalysis informatics, high-throughput experiments/computations, data mining,
structure−activity relationships

Received: September 28, 2019
Revised: December 8, 2019
Published: December 16, 2019

1. INTRODUCTION

Catalysis is a complex, multidimensional, dynamic, and multiscale method for promoting chemical reactions.1 In particular, heterogeneous catalysis is still a largely empirical science because of the complexity of the surface chemistry involved.1−3 As depicted in Figure 1, wide length and time scale ranges need to be considered in order to fully understand the dynamic nature of catalytic reactions. Recent experimental and theoretical studies have yielded atomic-level insight into these processes.4−7 Nevertheless, the discovery of truly novel catalysts and catalytic reactions is still a formidable task, and as a result, many of the advances in this area have arisen from trial-and-error investigations. This empirical approach has uncovered numerous multicomponent catalysts that operate under a variety of conditions.8,9 Also, fundamental kinetic, characterization, and theoretical studies have been carried out to develop preparation−structure relationships or structure−performance relationships, which are useful in guiding hypothesis-driven designs of new catalysts.10−12 For example, to create structure−performance relationships, catalyst performance (vertical axis) is plotted as a function of key structural factors (horizontal axis) in order to uncover important features of catalysts that affect their activities, selectivities, and/or stabilities. Unfortunately, the enormous

© 2019 American Chemical Society 2260 DOI: 10.1021/acscatal.9b04186



Figure 1. Description of heterogeneous catalysis from the atomic level to large-scale reactors.

amount of experimental data gathered in the study of catalyst preparation−structure−performance relationships has not yet been integrated into readily searchable databases. Although recent advances in theoretical and computational chemistry have enabled descriptions of elementary steps in catalytic processes that have already been discovered, for the most part they have not focused on strategies to design new catalysts.13 Therefore, the enormous amount of experimental and theoretical data accumulated to date on various aspects of catalysis is not that useful in predicting new catalysts or catalytic reactions. Scaling the relationship between calculated reaction barriers and binding energies of surface intermediates is a promising method employed in data-intensive catalyst design.13−15 Several studies have demonstrated how this approach can contribute to the design of novel catalysts based on predicted volcano trends in catalytic activity versus binding energies of surface intermediates.14 However, the accuracy of these models in predicting catalytic performance is limited because of the large number of other factors involved. Consequently, more effective relationships (descriptors) that can be employed to analyze these processes must be devised in order to develop more effective strategies to design novel catalysts for important reactions.

Molecular/materials informatics has become a central paradigm in molecular and materials science.16−29 In the data-intensive materials design (so-called inverse design)23,24 protocol, the desired function of a new material is identified first, and then a candidate is extracted from a computational or experimental database using machine learning (ML) methods. This new methodology, which is an improved version of “quantitative structure−property relationship (QSPR) modeling of materials properties”,30 has been shown to be a powerful tool for predicting the properties of substances, including those of unknown molecules/materials. Recent success in utilizing this approach is exemplified by the ML-assisted prediction of physical properties of molecules/materials such as atomization31,32 and formation energies,33−35 densities of states,36 band gaps,37−39 vibrational free energies,40 and melting temperatures.41 Therefore, if a single physical property is dominant in governing the performance of a material, molecular/materials informatics serves as an ideal tool to identify a novel functional material. However, catalyst design does not fit into this category because theoretical models for catalysis, especially heterogeneous catalysis, are not available at this time. In contrast to those involving physical properties, ML-assisted predictions of rates and selectivities of catalytic reactions are still in their infancy.42 Despite having a 30-year history, the use of artificial intelligence (AI) for catalyst design43−46 has had only a small impact on the discovery of effective homogeneous or heterogeneous catalysts prepared according to insight from data-intensive methodology. The main barriers to employing data-intensive methods for catalyst design are the lack of universal data sets for catalytic activities and selectivities and the existence of only a few descriptors.

In this review, we summarize the current state of investigations aimed at the ML-guided development of homogeneous and heterogeneous catalysts. Recent excellent reviews42,47−52 have described research studies of ML in catalysis,49 pioneering investigations of ML-assisted identification of catalysts,46,53 predictions of catalytic properties by the use of density functional theory (DFT) calculations along with scaling relationships,13−15 and catalysis informatics.47,48 In contrast to others, this comprehensive review covers studies of ML-assisted theoretical and experimental investigations of computer-aided catalyst design that have been reported in the past decade, with a major focus on illustrating how ML, in combination with DFT and experimental data, can be used to identify ideal catalysts. The limitations and difficulties of ML in catalysis research originating from the complex nature of catalysis are also discussed. The review begins with a tutorial overview of applications of ML in catalysis science (section 2). This is followed by a discussion of representative studies in which high-throughput computation and ML have been utilized to design inorganic catalysts such as alloys, complex metal oxides, zeolites, and inorganic−organic hybrid materials (section 3). Representative studies focusing on high-throughput computation to uncover important catalyst properties, including d-band centers, oxygen vacancy formation energies ($E_{\mathrm{Ovac}}$), and surface adsorption energies of molecules and atoms, are covered in section 4. A review of ML approaches to predict these functions is given in section 5. These efforts are highly important because describing catalysis as a function of a physical parameter would be necessary for fast and efficient catalyst discovery. Illustrations of the application of ML techniques to the characterization of heterogeneous catalysts are provided in section 6. This would enable more accurate and rapid characterization of the catalysts, and eventually, automated control of these processes in catalysis research would be possible. The discussion in sections 7 and 8 focuses on ML treatments of experimental data aimed at designing new heterogeneous and homogeneous catalysts, respectively. Finally, a summary of the efforts

Figure 2. Schematic representation of ML-aided future catalysis research.

Figure 3. ML landscape in the context of a brief overview of machine learning tasks and algorithms.

discussed in the review and future perspectives in the area of ML-guided catalyst development (Figure 2) are given in section 9.

2. A GUIDE TO USING MACHINE LEARNING IN CATALYSIS RESEARCH

ML is an interdisciplinary field of study that concentrates on developing computer algorithms that enable learning from data without being explicitly programmed. ML emerged primarily from the fields of computer science and statistical science, but it now encompasses a broad range of disciplines. The landscape of ML is briefly given in Figure 3 in the form of typical ML problems along with some common methods of supervised and unsupervised learning. The details of each algorithm are covered in many excellent existing reviews.16,51 Thus, rather than just rephrasing these explanations, in this section we highlight from a user perspective the essentials, practices, and pitfalls of using ML in natural sciences and specifically for catalyst development.

Particular emphasis will be given to supervised learning, which is the most common setting used not only in catalysis studies but also in other ML applications such as image and speech recognition, machine translation, and medical diagnosis. In supervised learning, a function that maps an input to the desired output is built without explicit modeling by providing many examples of input−output pairs (called

Figure 4. Caution of “garbage in, garbage out” in ML studies (section 2.1). Because ML is data-driven, the quality and quantity of data are essential
prerequisites for every ML study. Establishing sufficient data quality and quantity should always be prioritized over what ML algorithm is used. If
input data are of low quality, flawed, or nonsense, any ML model produces nonsense output or “garbage”.
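The “garbage in, garbage out” caution in the Figure 4 caption can be illustrated with a minimal numerical sketch. Everything in it is an assumption made for illustration and is not taken from the review: the one-descriptor relationship y = 2x + 1, the noise level, the 40% corruption rate, and the helper names `make_data`, `fit_line`, and `mse`. The same least-squares algorithm is trained twice, once on clean data and once on data in which part of the outputs has been replaced by unrelated numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Hypothetical data set: y = 2x + 1 plus small measurement noise."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = 2.0 * x + 1.0 + rng.normal(0.0, 0.05, size=n)
    return x, y

def fit_line(x, y):
    """Ordinary least-squares fit of y = a*x + b; returns the array (a, b)."""
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def mse(coef, x, y):
    """Mean squared error of the fitted line on the data (x, y)."""
    return float(np.mean((coef[0] * x + coef[1] - y) ** 2))

x_train, y_train = make_data(200)
x_test, y_test = make_data(200)

# "Garbage" version: 40% of the training outputs are replaced by unrelated numbers.
y_garbage = y_train.copy()
bad = rng.choice(len(y_train), size=80, replace=False)
y_garbage[bad] = rng.uniform(-5.0, 5.0, size=80)

mse_clean = mse(fit_line(x_train, y_train), x_test, y_test)
mse_garbage = mse(fit_line(x_train, y_garbage), x_test, y_test)
print(f"test MSE, clean training data:     {mse_clean:.4f}")
print(f"test MSE, corrupted training data: {mse_garbage:.4f}")
```

The algorithm is identical in both runs, so any difference in test error comes from the quality of the training data alone.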

Figure 5. Theory-driven and data-driven approaches for scientific research. Theory-driven methods try to explicitly model the inner workings of a
target phenomenon (e.g., through first-principles simulations), while data-driven methods try to precisely approximate its outer behavior observable
as data (section 2.2).

training data), and the obtained mapping function is used to make a prediction (an output) for any possible input. One of the main interests of catalysis science is often to gain a better understanding of unclear relationships between inputs (e.g., the properties, compositions, and structures of catalysts) and outputs (e.g., their activities, selectivities, and stabilities). Unfortunately, it is not yet feasible to have an explicit theoretical model that can perfectly describe such complicated relationships. Instead, by the use of supervised learning, an inferred relationship between the inputs and outputs can be learned from given examples without explicitly knowing the underlying principles. As a result, ML can serve as an important tool to investigate and predict phenomena that are difficult to explicitly model mathematically. This can open up a new direction for scientific research on highly complex phenomena with many uncertain intertwined factors, where traditional trial-and-error styles are already reaching the limit. However, to maximize the capabilities of this new technology, we need to have a careful understanding of what ML-acquired mapping means, what grounds are utilized to obtain a prediction or insight, and what types of results can be misleading and force us to conclude nonexistent cause-and-effect relationships.

2.1. “Data-Driven”: The Quality and Quantity of Data Matter Most. The principle that underpins all of the ML algorithms listed in Figure 3 is that ML is data-driven, which means that ML models are defined only by feeding a given data set (training data) into them. ML prediction models are representative of such given training data. Thus, the quality and quantity of “data” are essential prerequisites to elevate the predictive capability of ML independent of whether a state-of-the-art or simple ML algorithm is utilized. Realizing that data is more important than algorithms is the first step for building any ML application, and the fact that accurate ML is

Figure 6. (top) Supervised learning (section 2.3). Supervised learning acquires a function mapping inputs to outputs that can be fitted to the
training data by using tuning parameters. (bottom) The bias−variance trade-off (section 2.5). Highly complex models often cause overfitting,
whereas models that are too simple can cause underfitting.
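The two ideas in the Figure 6 caption, fitting a tunable function to training data and the bias−variance trade-off, can be sketched with polynomial least squares. This is an illustrative sketch under stated assumptions rather than a procedure from the review: the sine-shaped target, the noise level, the sample sizes, and the polynomial degrees 1, 3, and 12 are all arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    """Hypothetical 'true' input-output relationship, unknown to the model."""
    return np.sin(2.0 * np.pi * x)

# Small noisy training set and a larger held-out test set.
x_train = rng.uniform(0.0, 1.0, size=30)
y_train = target(x_train) + rng.normal(0.0, 0.2, size=30)
x_test = rng.uniform(0.0, 1.0, size=500)
y_test = target(x_test) + rng.normal(0.0, 0.2, size=500)

def poly_features(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

results = {}
for degree in (1, 3, 12):
    # The tuning parameters theta are obtained by least squares on the training data.
    theta, *_ = np.linalg.lstsq(poly_features(x_train, degree), y_train, rcond=None)
    train_mse = float(np.mean((poly_features(x_train, degree) @ theta - y_train) ** 2))
    test_mse = float(np.mean((poly_features(x_test, degree) @ theta - y_test) ** 2))
    results[degree] = (train_mse, test_mse)
    print(f"degree {degree:2d}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

The training error can only decrease as the degree grows, because each model contains the previous one as a special case. The straight line underfits the sine-shaped target, and a very flexible model such as degree 12 will often, although not necessarily, show a larger gap between training and test error, which is the signature of overfitting.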

impossible with inaccurate data is often described as “garbage in, garbage out” (Figure 4).

The first questions in every ML-based study would be what types of and how much data ML requires. The most typical form of data is tabular data (e.g., a spreadsheet), where each row has numeric and non-numeric records of variables (columns) from experimental observations, theoretical calculations, and existing databases. However, as this paper showcases existing studies in subsequent sections, diverse types of data are now used in ML-based studies, including time series, images, sensor signals, texts, and structure data such as chemical formulas, steric configurations, three-dimensional (3D) point clouds, and atom mappings. These forms define the information contents that ML can use for predictions, and thus, any forms of data should be used if they characterize the targets under study well. The size of the data set is another major concern in ML-based studies, and it determines the confidence or reliability of ML predictions. Technically, ML can be applied to any small-sized data set, but the prediction made from such a small data set would be less confident and reliable in a statistical sense. Considering the use of any existing data sets can help when the available data sets are very limited.

2.2. Data-Driven and Theory-Driven Approaches. The data-driven principle should also be contrasted with theory-driven (or hypothesis-driven) methods (Figure 5), which are most often used in traditional sciences. Theory-driven and data-driven methods are based on very different principles. Theory-driven methods try to explicitly model the inner workings of a target phenomenon (e.g., through first-principles simulations), while data-driven methods try to precisely approximate its outer behavior observable as data. These two are quite complementary, and data-driven approaches can complement limitations of theory-driven ones such as high computing costs, target/condition/parameter considerations, scalability issues, complexity and uncertainty of real-world systems, and known and unknown imperfections of currently available theories.

The exact origins of scientific phenomena are often unknown. As a result, studies aimed at identifying cause-and-effect relationships conventionally start with a hypothesis or theory that arises from past observations and intuition. Then data from carefully designed experiments or computer simulations are collected to determine whether they are in accord with the proposed hypothesis or theory. The theory-driven approach has been employed in the field of catalysis to gain significant insight and an atomic-level understanding of how catalysts operate. In contrast, in the data-driven approach, an existing data set is subjected to an unfixed mathematical function (the ML model) that includes various tuning parameters targeted to produce the desired outcome. Data are used to train the ML model so that the model can produce the defined outcome. Hence, the data-driven approach does not encode domain-specific information, and as a result, it is not cognizant of the underlying phenomenon before ML training begins. Since data are always in accord with the model after the ML training, it is generally hard to confidently assess whether an ML model is valid.

The well-known statistician David Hand stated the following in his keynote talk at the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’18):54 “Theory-driven models can be wrong but data-driven models cannot be wrong or right. Data-driven models are not trying to describe an underlying reality but are merely intended to be useful. So, they could be poor or useless, but not wrong.” Fitting ML models to data is purely based on input−output correlations, which do not imply causation. Training data points are finite, and a limitless number of possible mappings having the same goodness of fit can potentially exist. It is even possible to deliberately choose a mapping that is consistent with an expected outcome but is merely subjective and lacks any rational basis. It requires considerable attention to avoid a hypothesizing after the results are known (“HARKing”) conclusion. As David Hand also stated, “If data can speak for themselves, they can also lie for themselves. So, it’s critically important to exercise caution, do not claim too much, understand the data and its quality.”

2.3. Supervised Learning and an Example Workflow. By analysis of training data, supervised learning identifies a target function that best maps input variables X to the desired

Figure 7. Differences in commonly used algorithms for supervised learning (section 2.3). Decision boundaries of each algorithm for classifying the
red and blue points in the leftmost plots are visualized. For exactly the same training sets (i.e., the three problems at the far left), each algorithm
forms very different decision boundaries (from different inductive biases) to interpolate the interspace for generalization beyond the given training
samples.
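The point of the Figure 7 caption, that different algorithms trained on exactly the same samples form different decision boundaries, can be checked numerically with scikit-learn. The sketch below is an illustration under assumptions, not a reproduction of the figure: the toy `make_moons` data set, the four classifiers, their settings, and the evaluation grid are arbitrary choices.

```python
from itertools import combinations

import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# One fixed two-dimensional binary classification problem shared by all models.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

models = {
    "logistic regression": LogisticRegression(),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf", gamma="scale"),
}

# Regular grid covering the input space; predictions on it trace each decision boundary.
xx, yy = np.meshgrid(np.linspace(-1.5, 2.5, 80), np.linspace(-1.0, 1.5, 80))
grid = np.column_stack([xx.ravel(), yy.ravel()])

train_acc = {}
grid_pred = {}
for name, model in models.items():
    model.fit(X, y)
    train_acc[name] = model.score(X, y)
    grid_pred[name] = model.predict(grid)
    print(f"{name:20s} training accuracy = {train_acc[name]:.2f}")

# Fraction of grid points on which two models disagree, i.e., where their boundaries differ.
disagreement = {}
for a, b in combinations(models, 2):
    disagreement[(a, b)] = float(np.mean(grid_pred[a] != grid_pred[b]))
    print(f"{a} vs {b}: disagree on {disagreement[(a, b)]:.1%} of the grid")
```

All four models fit the training points reasonably well, yet they disagree on part of the surrounding space. That disagreement reflects their different inductive biases, and it is largest in regions far from any training sample.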

output variable Y (Figure 6). The training data consist of many sample input−output pairs, and the trained ML model corresponds to a function that fits the data to reproduce the given input−output pairs. Because the number of input variables can be large, X can be high-dimensional, and thus, the fitting process can be technically nontrivial. For example, supervised learning can be used to create a computer program that recognizes handwritten characters without explicitly knowing any underlying principles. In contrast, it is exceedingly difficult to explicitly program the recognition process to use visual images of handwritten characters (the input) to determine what characters the images represent (the output). When a process that is hard to model explicitly is encountered but a good set of corresponding input−output examples can be collected, it is always worth considering using the supervised learning approach. However, it also should be emphasized that high prediction performance currently available for the handwritten character recognition does not directly mean that we could understand any underlying mechanism by which our brain recognizes characters. It should be noted that the supervised learning setup arises in many subfields of computer science and statistics, where many synonyms are used, including where an input variable is called a feature, descriptor, attribute, independent variable, predictor variable, explanatory variable, or covariate and an output variable is called a dependent, response, or outcome variable.

As presented in an excellent publication by Pedro Domingos,55 the general framework of ML can be understood in terms of “Learning = Representation + Evaluation + Optimization”. The wide variety of ML methods displayed in Figure 3 stem from differences in these three factors. The aspect of representation is about the choice of an ML model function that is tunable by using parameters, where “tunable” means that the shape of the function is changed in various forms by changing the values of the parameters. For example, if we assume a linear input−output relationship, we can use $Y = f(X_1, \dots, X_p) = \theta_0 + \theta_1 X_1 + \cdots + \theta_p X_p$ with p + 1 parameters ($\theta_0, \dots, \theta_p$) for a p-dimensional input of $X = (X_1, \dots, X_p)$. By changes in the values of the parameters $\theta_0, \dots, \theta_p$, the function f can represent any hyperplane in the p-dimensional space, but it cannot represent nonlinear hypersurfaces. Moreover, it requires a strong assumption that the input−output mapping is linear, meaning that Y needs to be a weighted sum of each input variable (no interaction effect is considered). As portrayed in Figure 6, a model with appropriate complexity depending on the given data needs to be chosen or designed. The differences among several ML algorithms in terms of decision boundaries for binary classification in two dimensions are displayed in Figure 7. An intuitive understanding of how each parameter affects the decision boundaries can be gained by viewing an interactive demo dealing with neural networks at https://playground.tensorflow.org/.

The aspect of evaluation is about the choice of criteria for a goodness-of-fit test of the chosen ML model function. For example, linear regression analysis usually minimizes the least-squares error between Y and f(X), but when Y is binary (e.g., −1 or +1), the use of the squared error is not appropriate, and logistic regression is used instead. Lastly, the aspect of optimization is about the choice of a tuning algorithm (the optimization algorithm) that optimizes the evaluation criteria for fitting the ML model function to the given data. For example, in the cases of least-squares regression or several kernel methods, optimal parameters can be obtained in a closed form by using a matrix calculation, but for general ML models such as neural networks, incremental parameter updating with random initialization is often used.

Although recent advances made in the development of open-source software such as scikit-learn and TensorFlow have provided ready access to numerous high-quality ML methods, nonexperts would still be uncertain about what method to choose from a long list of available algorithms. Below we provide an outline of a typical workflow, practices, and checkpoints from a user’s viewpoint. Another publication56 gives further details on best practices.

A. Problem and Variable Design. First, the problem under study needs to be defined in an input−output format. Thus, the first but crucial step is to identify or devise a set of relevant input variables that characterize the output variables well (or fully). Traditionally, a small number of dominant factors as input variables is preferred, but this is not a requirement when modern ML is used. Thousands or even millions of potentially related factors beyond human capability can be considered as long as they are observable. For example, each pixel value of an image can be employed as an input variable, and in this case,

10 000 input variables would be considered for an image of 100 × 100 pixels. This approach has not been a common practice in conventional statistics, and regression using many variables has not been recommended. It is often called “kitchen sink regression” because of its high risk of overfitting and capturing spurious correlations. At the current time, we can consider a large number of input variables by using many modern ML methods that can handle extremely high-dimensional data with appropriate control of the model complexity. This would bring a practical advantage because if any primary factor is missed, the prediction can suffer from so-called omitted variable bias and potentially yield a spurious correlation. However, it also should be noted that inclusion of unnecessarily many irrelevant factors can deteriorate the prediction capability even for state-of-the-art ML methods such as deep learning.

B. Data Collection, Sanity Check, and Exploratory Data Analysis. ML is purely data-driven, and consequently, the quality and quantity of data are the key determining factors for success. Because ML prediction models are representative of the given training examples, it is crucial that the training data be representative of actual situations we want to treat. However, it is extremely difficult in general to guarantee this natural requirement since truly desirable data are not yet obtained and are literally “unseen”. Collecting available data can often be plagued by unintentional “selection bias” because any collectable data can be easily and strongly biased toward targets that were successful in the past or that are merely easily testable. Furthermore, indiscriminate data collection can be plagued by errors, outliers, and noise. Thus, preliminary exploratory data analysis (EDA) and a sanity check of the training data are always required. EDA is often conducted by investigating data distributions and outliers via descriptive statistics and visualizations such as a histogram, correlation matrix or correlogram, box plot, and various projection methods to two or three dimensions.

C. Data Cleaning and Preprocessing. When a sufficiently large training data set is collected, cleaning and preprocessing are often required. Specifically, one needs to reformat individual data into a single tabular form, remove or correct erroneous or incomparable data points, impute missing values, and normalize and rescale the data in order to stabilize the numerical computation. The outcomes of several ML algorithms such as kernel methods and neural networks can differ according to whether any normalization, standardization, or scaling is performed. It should also be noted that not only input variables but also output variables can be normalized or scaled.

D. Feature Engineering and Selection. The selection of optimal input variables is a formidable step. ML is often used because the underlying principle is so unclear that it cannot be subjected to explicit programming, particularly in the case of scientific data. Also, in many cases it is not certain what the determining factors are. Thus, in practice it is necessary to utilize engineering steps in order to derive new input variables, which then need to be modified, replaced, and combined in a human-understandable manner. In particular, crosstalk between two variables A and B can be taken into account by adding the product, A × B, as an interaction variable (more generally, polynomial features). Further analysis of cases where ML fails to predict correctly can provide new insight into features that should be added. In more advanced cases, called stacking or stacked generalization, ML outputs can be reused as new input variables (meta features). For example, out-of-sample prediction values, principal component values, clustering results, and hierarchical features from deep learning models can be used.

E. Choosing, Applying, and Customizing ML Algorithms. After a training set is properly designed and set up, several ML algorithms are applied and evaluated.

• Performance Evaluation Measure (or “Loss Function”). First, an appropriate evaluation measure needs to be selected. Typical choices are the prediction accuracy, F1 score, or area under the receiver operating characteristic curve (ROC-AUC) score for classification tasks and the mean absolute error (MAE), mean square error (MSE), root-mean-square error (RMSE), and root-mean-square logarithmic error (RMSLE) for regression tasks. Special cases like those with imbalanced classes (e.g., 0.1% positive, 99.9% negative) can be properly handled by customizing the evaluation measure (e.g., class-weighted loss or focal loss). Another example is a case with outliers handled by a robust loss function (e.g., Huber loss).

• Preparation of Validation and Test Data: Also, establishing reliable validation strategies for estimating the prediction performance is quite important (but notoriously difficult in general). Standard options are to use cross-validation or to split the data into training, validation, and test sets in order to evaluate a prediction in an out-of-sample way. By definition, the training data cannot be used for model evaluation. Thus, the validation set is utilized to evaluate the hyperparameter tuning, while the test set is used to evaluate the prediction performance. When cross-validation is employed, each subpart of the data needs to be further split into training and validation sets (or nested cross-validation, often called double cross-validation, needs to be performed), being careful not to have data leakage from the training set into the validation or test set. We will discuss this point more in section 2.4.

• Constant Prediction: Next, a constant prediction model can be applied to obtain a worst-case baseline for the given evaluation measure. For example, scikit-learn contains DummyClassifier and DummyRegressor for this purpose. For regression tasks, the mean target value in the training data set is often used as the constant. $R^2$ scores are relative measures using the MSE against this basic reference. For classification tasks, the most frequent class label is commonly used.

• Linear Prediction: Subsequently, a linear prediction is applied. A common choice is logistic regression for classification tasks and linear regression for regression tasks. Linear prediction causes underfitting when the input−output relationship is not linear, but in many practical cases it can produce quite stable predictions with a sufficient level of performance. It should also be noted that even simple log scaling of the variables breaks linearity. Furthermore, when interaction variables are introduced into the model, linear prediction leads to (partially) polynomial classification or regression considering an interaction effect. When the number of variables is large (in particular, larger than that of training examples), linear regression can be noncomputable or unstable without any regularization

because of matrix noninvertibility. In this case, a regularized linear regression technique such as LASSO or ridge regression should be used instead. LASSO and ridge regression for regression tasks correspond to L1-penalized and L2-penalized linear regression, respectively. Similarly, logistic regression for classification tasks is also commonly used with L1 or L2 regularization (e.g., LogisticRegression in scikit-learn has these options). Other common options in chemoinformatics are partial least-squares (PLS) regression and principal component analysis (PCA) regression, which are linear regression with supervised or unsupervised dimensionality reduction by linear coordinate transformation, respectively.
• First-Choice Nonlinear Prediction: After performing constant and linear predictions, we can test nonlinear predictions. The recommended first choices for this purpose are random forest or extra trees (and possibly k-nearest neighbors for classification) because both are less hyperparameter-dependent and the critical parameter is often only the number of trees (or k for k-nearest neighbors). For relatively large data sets, the use of extra trees is preferred because it usually exhibits a comparable and sometimes better performance and it requires much less computation time.
• More Complex Nonlinear Prediction: When nonlinear predictions perform better than linear predictions, we can move on to more complicated models such as kernel methods, neural networks, and gradient boosted trees (GBDT/GBRT, including XGBoost and LightGBM). However, these methods are sensitive to hyperparameter tuning and can display significantly worse performance if the hyperparameters are incorrectly set. The best practice is to use systematic hyperparameter tuning based on a grid or a random or more advanced search method (e.g., model-based optimization such as Bayesian optimization, multiarmed bandits, genetic algorithms (GAs), and any AutoML pipelines). We will discuss this point in more detail in section 2.5.
• Deep Learning Applicability: In cases where large data sets of direct measurements such as images, videos, and sensor signals are available, deep neural network (DNN) (i.e., deep learning) models can often perform significantly better than other methods. They can also learn a series of nonlinear feature transformations to convert the inputs to representations that are easier to predict. In particular, for image observations (e.g., arising from microscopy), convolutional neural network (CNN) methods such as VGG, ResNet, and DenseNet are widely used. Nowadays, many CNN models that are pretrained on quite large data sets of natural images (e.g., the ImageNet database) are available. These pretrained CNN models can be used as handy feature extractors for any natural images (through a series of convolution filter banks). Furthermore, fine-tuning of these models by reusing the pretrained weights for initialization or remodeling can improve the prediction performance even when only a small number of image observations are available for a given task.
• Bespoke ML Models: For many practical cases, it is sufficient to select an off-the-shelf ML method from the available list. However, consideration needs to be given to specific constraints, including imbalanced/skewed data, missing data, real-time processing, domain adaptations, and dependencies arising from time series or structural variable correlations. When this is the case, a customized or newly developed method is required to meet each specific requirement.
F. Analysis of the Trained ML Models. The main use of trained ML models is to make predictions for the target problem, but they can also provide much additional information. The trained ML models represent possible approximations of the unknown input−output relationship under study. Thus, extracting information on how strongly each variable or each sample contributes to the prediction is quite helpful to gain a better understanding of the problem. Popular methods for this purpose include significance tests on regression coefficients, decision rules extraction, variable importance scores from PLS regression (known as VIP analysis in chemoinformatics), feature importance scores from tree ensemble methods such as random forests, partial dependence plots (such as those in the sklearn.inspection module of scikit-learn), (global) sensitivity analysis methods, and feature visualization methods for DNNs (such as saliency maps and gradient-based methods such as GradCAM, DeepLift, and SHAP). However, it is noteworthy that forcibly projecting multivariate trends into independent univariate contributions of individual features always entails information loss because of the correlation between features. These “interpretability” or “explainability” aspects of ML algorithms have been a recent hot topic among ML researchers. As another example, investigating hard-to-predict examples in detail would also give some insights into the problem design under study and the input variable design and engineering steps.
G. Evaluating and Maintaining Predictions. Following the attainment of ML predictions, a further engineering effort is sometimes needed to carefully check whether the prediction works in the intended manner. Unfortunately, this step is not simple because ML procedures depend not only on the algorithm but also on the data and possibly random numbers for model initialization. Traditional software engineering techniques are often useless for ML-based software development because of its data and parameter dependencies. Here we need not only code versioning but also proper data versioning. Moreover, in these cases debugging, testing, and requirement definition are often more troublesome, and some data treatments also require data infrastructure and management along with iterative system deployment and maintenance.
2.4. Data Leakage and Data Dredging. The only trustworthy way to estimate the prediction performance of the trained ML model is to test it using completely unseen external data derived from actual situations. In many cases, however, acquiring a completely new data set in a real-world context is costly in both time and money. Thus, some strategy is required for simulating “unseen” data to test the ML prediction by using available data at hand. A simple approach is to set aside a portion of the data set for model validation and testing by a three-way split of the data into training, validation, and test sets. When the data set is not large, the most standard practice is to use k-fold or leave-k-out cross-validation or repeated random subsampling.
However, since the test data are taken from a portion of the already obtained data, the estimated performance can sometimes be overly optimistic as a consequence of data leakage,57 which can occur when information on the test set leaks into

Figure 8. Example of the bias−variance trade-off (section 2.5), showing how the decision boundaries of an SVM vary as two main hyperparameters,
gamma and C, are changed, in the form of underfitting and overfitting cases.
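
The effect of gamma described in this caption can be sketched with a few lines of Python. This is a minimal illustration, not the SVM used to draw Figure 8: it assumes an RBF kernel, k(a, b) = exp(−gamma·‖a − b‖²), and swaps the SVM for a plain kernel-weighted vote over the training labels so that it needs only the standard library. The toy data and the helper names (`rbf_kernel`, `kernel_vote`, `train_accuracy`) are hypothetical, and the penalty parameter C is not modeled here.

```python
import math

def rbf_kernel(a, b, gamma):
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def kernel_vote(x, X_train, y_train, gamma):
    """Predict +1 or -1 from the sign of the kernel-weighted sum of labels.

    This is a simplified stand-in for an SVM decision function (assumption).
    """
    score = sum(y * rbf_kernel(x, xi, gamma) for xi, y in zip(X_train, y_train))
    return 1 if score >= 0 else -1

def train_accuracy(X, y, gamma):
    """Fraction of training points that the kernel vote labels correctly."""
    hits = sum(kernel_vote(xi, X, y, gamma) == yi for xi, yi in zip(X, y))
    return hits / len(X)

# Hypothetical toy data: three points of class +1 and two of class -1.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (3.0, 3.0)]
y = [1, 1, 1, -1, -1]

# Very small gamma: every kernel value is close to 1, so the vote collapses
# to the majority class everywhere (an underfitting, high-bias regime).
low_gamma_acc = train_accuracy(X, y, gamma=1e-9)

# Very large gamma: each point is influenced almost only by itself, so the
# training labels are memorized (an overfitting, high-variance regime).
high_gamma_acc = train_accuracy(X, y, gamma=50.0)

print(low_gamma_acc, high_gamma_acc)
```

In this sketch a perfect training score at large gamma says nothing about unseen points, which is why the choice of gamma should be judged on held-out data rather than on the training set.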

the training or validation sets in a complicated and unintended way. For example, hyperparameter tuning based on the performance for test data can easily cause data leakage. In practice, test data are finite, and as a result, we can even overfit the ML model to the given test data by tweaking the hyperparameters. In this case, the ML prediction exhibits “too good” performance that does not carry over to truly unseen data. This kind of misuse is often called data dredging. Data leakage can also arise by accident: normalizing or preprocessing the entire data set before splitting can cause a small amount of leakage. Because data leakage can occur unintentionally, it should always be kept in mind for assuring scientific reproducibility.
2.5. Bias−Variance Trade-Off and Hyperparameter Tuning. An ML model that perfectly predicts the given training data (with zero training error) is seemingly preferable. However, real-world data usually contain irreducible errors that arise from random noise factors or deterministic unpredictability due to the practical impossibility of perfectly sensing all relevant information. ML is not able to precisely predict trends arising from completely random noise. Thus, learning every little detail of the training data can often result in inaccurate prediction for unseen data.
In practice, a trade-off between two sources of error, termed the bias−variance trade-off, should be carefully considered for ML models to make generalizations beyond the training data. Bias is the difference between the expected prediction of an ML model, averaged over possible training sets, and the ground-truth value to be predicted. A high-bias ML model often cannot capture the underlying trends and causes underfitting because it has too few degrees of freedom. Variance is the expected variability of the model prediction. A high-variance ML model often has many degrees of freedom and, as a result, it captures every detail of the training data, which then causes overfitting arising from high sensitivity and variability. What is needed is a “just right” model that balances the trade-off between high-bias and high-variance models (cf. Figure 6). Complicated nonlinear ML models such as DNNs can sometimes have millions or even billions of parameters (i.e., a tremendously large number of degrees of freedom). As a result, a limitless number of parameter settings exist that have the same goodness of fit to the finite data. In these cases, independent of its size, the training data alone are not sufficient to properly train a given ML model.
To remedy this situation, the number of degrees of freedom of the model (i.e., the model complexity) must be restricted by using some inductive bias. Since “unseen” data can take arbitrary values without any additional assumptions, all supervised learning strategies rely on some inductive bias implicitly or explicitly. For example, a penalty can be placed on overly complex models, a perturbation can be applied to stabilize too-variable fitting, or the size of the model components can be explicitly limited, for example, by restricting the number of units and layers in neural networks or the maximum depth (or the number of leaf nodes) of decision trees. This process, termed regularization, is often controlled by using hyperparameter tuning. The hyperparameters, such as the strength of the penalty or perturbation and the size, shape, or structure of the model units, should be clearly distinguished from the model parameters that are autotuned in the training process. In contrast to model parameters, hyperparameters control inductive bias that is not determinable from the training data and need to be explicitly set before the model is trained. The plots in Figure 8 demonstrate the changes in the decision boundaries of support vector machines (SVMs) that occur when two main hyperparameters, the kernel coefficient (gamma) and the penalty parameter (C), are varied. For more practical examples, studies by our group provide detailed explanations about how to tune hyperparameters in a systematic way.58,59
Regularization is a crucial step when a risk of overfitting exists as a result of the use of a highly complicated nonlinear model, an insufficient number of training samples, or a large number of input variables. The best practice is to systematically test a predefined set of candidate values by using a grid or random search. In recent years, advanced methods such as Bayesian optimization, GA, and model-based optimization for hyperparameter tuning have been developed in the ML community. The recent concept of AutoML attempts to automate this demanding process, and several software packages are available for conducting typical tasks such as hyperparameter optimization (HPO) and neural architecture search (NAS).
2.6. Exploitation−Exploration Trade-Off and Extrapolation. The mechanism of prediction in supervised learning is based on interpolating given training data. When a prediction lies outside the region that the training data cover, it can be quite inaccurate and baseless. The notions of interpolation and extrapolation might be intuitively reasonable (Figure 6), but in practice they are not simple because of high dimensionality. In a high dimension, it is not clear whether a specific input point is interpolative or extrapolative. For example, tree ensemble classifiers such as random forests and extra trees seem to be quite restricted and nonsmooth in Figure 7 when they are compared with other nonlinear methods. However, it is widely known that the tree ensemble classifiers function well in a very high dimensional space (even up to 100 000−700 000 features).60,61 Also, decision boundaries of deep neural networks can be counterintuitive in an extremely high dimension (e.g., for images) and suffer from the widely known problem of adversarial examples that easily fool them.62 We can conclude that prediction would be quite uncertain when no training examples exist around a specified input point where the prediction is made. In the materials and molecular sciences, quantitative structure−activity relationships (QSARs) or QSPRs have been investigated extensively, and problems arising from the potentially huge gap between available and unseen data have long been discussed.63,64 In particular, the crucial importance of the “applicability domain” (AD) of an ML model has been recognized.
A major goal in catalysis research is the development of genuinely new catalysts that exhibit better performance than currently available catalysts. However, predicting high-performance catalysts would be extrapolative because by definition such catalysts would be dissimilar to any existing catalysts and thus outliers in the training data. Therefore, it would be difficult to predict a novel catalyst simply by using supervised learning that only models an average trend in the given data.
The exploitation−exploration trade-off is an established guiding principle to learn something novel. To gain essential knowledge and avoid overlooking possible discoveries, it is not sufficient to perform only “exploitation” of knowns by spotting trends seen in currently identified high-performance catalysts; it is also important to carry out “exploration” for unknowns by probing potentially novel candidates with almost no information that can be quite different from any existing catalysts. Algorithms for balancing this trade-off include a wide variety of model-based optimization algorithms such as sequential experimental design, active learning, Bayesian optimization, multiarmed bandits, black-box optimization, and reinforcement learning. Applying metaheuristic optimization algorithms, including evolutionary algorithms such as GA, is another popular option. Implementing these search algorithms by both utilizing and questioning current assumptions and biases would be a promising approach for carrying out automated chemical exploration and discovery using autonomous robotic systems.65 Many practical examples will be covered in subsequent sections and existing reviews, but in section 7.1 (Figure 24) we also introduce an example from our group that is based on balancing this exploitation−exploration trade-off.66
2.7. Bayesian Inference, Generative Models, and Inverse Design. Up to this point, we have discussed ML in a standard setting. However, for most of the algorithms listed in Figure 3, including neural networks and decision trees, Bayesian versions exist where probabilistic inference is made instead of performing function fitting by pinpointing single best parameters.23,67 In Bayesian inference, a probability distribution is set for parameters prior to data collection, and it is updated every time new data are obtained. The prior probability distribution, rewritten into the posterior distribution using Bayes’ theorem, represents the degree of certainty of the values of the parameter, which begins with an unbiased distribution and becomes more and more biased as the number of observations increases. When prediction is quite certain, the distribution would be peaked only at the specific parameter value, but otherwise it can be quite broadly distributed over possible values. Standard ML inference relies on optimization for function fitting, but Bayesian inference relies on integration required for normalizing any distribution into a probability distribution so that the integral over its domain is equal to 1. If a prior distribution is set arbitrarily, the corresponding posterior distribution generally cannot be obtained in closed form because of the integration in Bayes’ theorem. As a result of the recent advent of numerical computation methods such as accelerated Markov chain Monte Carlo (MCMC) and automatic variational inference, Bayesian inference now is a quite flexible and practical tool for statistical modeling.
In the materials sciences, Bayesian methods are often used for the inverse design approach.67,68 At first glance, when the aim is to predict possible inputs that give a specific desirable output, inverse design might seem achievable by simply interchanging inputs and outputs in supervised learning. However, this is not that simple because inputs and outputs usually have a many-to-one correspondence. For example, a limitless number of images (inputs) can correspond to the label “dog” (output). Similarly, an infinite number of inputs can potentially exist that give a catalyst with a specific level of performance. However, by the use of Bayesian methods, a probability distribution of candidates that might give a specific level of performance can be produced, and this distribution can be employed to generate

samples of these catalysts. Also, a hierarchical generative process can be specified in order to explicitly model some latent intermediate variables that cannot be observed directly but characterize the data and problem under study well. This type of statistical modeling, which is based on generative models, is now called probabilistic programming, for which many software packages such as Stan, PyMC3, and Edward are available.
2.8. Use and Misuse of ML in Catalyst Identification: Correlation and Causality. As implied by the statement “correlation does not imply causation”, it is generally difficult to identify any “causal” relationships on the sole basis of observational data. Any ML-generated result can have a risk of being an artifact arising from a spurious correlation and thus cannot alone be considered to be scientifically conclusive. Interventional studies are needed to more safely ensure that ML-derived trends are real, as suggested by the statement “to find out what happens to a system when you interfere with it you have to interfere with it (not just passively observe it)”.69 In general, many strong and unrealistic assumptions are required to guarantee that ML algorithms can produce causal information from observational data. For example, formal discussions of causal inference usually assume that all confounders have been identified and included in the model (i.e., that no unidentified but related factors exist). It is widely recognized that “causality” is notoriously difficult to even define, formalize, or quantify, but in practice there are useful guidelines such as the Bradford Hill criteria in epidemiology to carefully evaluate the potential gap between the observed correlation and causality.
Although ML is based only on correlation and thus might suffer from the potential risk of being spurious, correlation is clearly a sign of a potential causal connection. Therefore, the result can be quite informative and can be subjected to further investigation using good data to assess whether it is actually “causal”. For example, in chemoinformatics, QSAR/QSPR studies with many descriptors are widely known to have an enhanced risk of exhibiting chance correlations, and the significance of the input−output correlation is often further ensured by using computational methods such as Y-randomization and random-number pseudodescriptors.70 In Y-randomization, the prediction performance of the original ML model is compared to that of a model built from randomly shuffled Y values (outputs). Also, random-number pseudodescriptors can be used instead of or in addition to the original input variables. The performance score of the original model should be significantly better than the randomized cases when a potential causal connection exists between inputs and outputs. In particular, when a large collection of input variables (descriptors) is utilized in ML models, the risk of producing a spurious correlation should always be assessed.

3. ML PREDICTION OF NEW MATERIALS
The discovery of new heterogeneous catalysts begins with the discovery and structural identification of new materials. Current approaches used to identify novel materials primarily rely on trial-and-error synthesis combined with exhaustive phase diagram analysis. This protocol for materials discovery is far too slow and expensive to meet the demands for development of new heterogeneous catalysts. Instead, a rational and structured method that fully explores the wide chemical space is needed. In data-driven (inverse) design of catalysts of this type, it is necessary to establish a crystallographic database of known and hypothetical structures comprising a very large number of inorganic compounds.71 Several materials databases that contain the properties of experimentally observed and hypothetical materials are now available. Examples of such data sets include the Inorganic Crystal Structure Database (ICSD),72 AtomWork,73 the Materials Project,74 the Open Quantum Materials Database (OQMD),75 and the Automatic Flow of Materials Discovery Library (AFLOWLIB).76 Recent advances made in developing high-throughput DFT calculations and ML techniques have enabled the efficient utilization of these inorganic material databases for crystal structure predictions and the design of materials (Figure 9).77−82 The new methods even enable rapid in silico predictions of the structures of hypothetical materials. When the large set of DFT-driven crystallographic data is combined with ML, the search space of unknown compounds is greatly expanded. In this section, we describe typical examples of predictions of structures of catalytically important materials using both high-throughput DFT calculations and ML.

Figure 9. Schematic representation of ML-aided future materials design. The compound database is filtered to a smaller list of promising catalyst candidates through computational and experimental methods associated with ML approaches.

3.1. Alloys (Intermetallics). Intermetallic compounds (intermetallics) are a class of substances that have defined compositions and adopt structures distinct from those of the component elements. Despite their importance in many areas, including catalysis, rules for predicting intermetallics are not well-defined, and as a result, the design of novel intermetallics is challenging. Mar et al. applied various ML methods (e.g., SVM and PLS) to develop a crystal structure predictor for binary AB intermetallics.83,84 Models were trained and validated by classifying 706 AB compounds with seven structure types (NaCl, CsCl, CuAu, ZnS, TlI, NiAs, and β-FeB) extracted from a crystallographic database. The data were split into a training set and a validation set, and the best descriptors and the better model were selected. SVM was found to be the better model, giving a sensitivity of 94.2%, specificity of 92.7%, and accuracy of 93.2%. The ML method used to obtain quantitative predictions of hypothetical compounds led to the identification of the new compound RhCd, which was predicted by SVM to have a CsCl-type structure (0.918 probability). The authors synthesized RhCd and demonstrated that it has a CsCl-type structure. This finding suggests that the

Figure 10. (a) Schematic overview of text extraction and database construction. (b) A hierarchical neural network assigns labels (“MATERIAL”) to
words one at a time by converting them to embedding and heuristic vector representations and outputting them to a classifier. (c) A grammatical
parse of a sentence is used to resolve word-level labels (below the colored bars) into sequential word-chunk-level labels (above the colored bars),
followed by resolution into word-chunk relations (curved arcs). (d, e) Calcination and hydrothermal (d) temperature and (e) time kernel density
estimates for TiO2. From ref 86. CC BY 4.0.
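
Panels (d) and (e) summarize extracted synthesis conditions as kernel density estimates. As a minimal sketch of what such an estimate computes, the following standard-library Python assumes a Gaussian kernel with a fixed, hand-picked bandwidth; the temperature values are hypothetical placeholders and are not data from ref 86, and the name `gaussian_kde` is simply a local helper defined here.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of 1D samples.

    Each sample contributes a Gaussian bump of width `bandwidth`; the
    estimate is the average of these bumps.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical calcination temperatures in degrees Celsius (illustrative only).
temps = [400.0, 450.0, 500.0, 500.0, 550.0, 600.0, 700.0]
density = gaussian_kde(temps, bandwidth=40.0)

# Evaluate the estimate on a regular grid and locate its highest point.
step = 5.0
grid = [i * step for i in range(0, 241)]  # 0 to 1200 degrees Celsius
values = [density(t) for t in grid]
mode = grid[values.index(max(values))]

# A density should integrate to approximately 1 (simple Riemann sum).
area = sum(values) * step

print(mode, round(area, 3))
```

The bandwidth controls how smooth the curve is: a smaller value follows individual reports more closely, while a larger value blurs nearby temperatures into one broad peak.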

ML method can be an accurate predictor of a crystal structure of an intermetallic.
3.2. Metal Oxides. Ceder et al. explored the ternary oxide (AxByOz) chemical space in a search for previously uncharacterized compounds using ML methods along with high-throughput DFT calculations.85 In this effort, a probabilistic model was built from the ICSD and then used to identify new compositions that would likely form a ternary oxide. The most likely crystal structure of each composition was predicted using the same probabilistic model. Finally, the stabilities of these substances were determined using DFT. The high efficiency of this method is demonstrated by the observation that 355 new compounds were identified within ca. 55 days following initiation of computing using 400 Intel Xeon 5140 2.33 GHz cores.
The ability to identify what structure types result from synthetic routes would be an important tool in the materials sciences. However, finding this type of specific information is difficult because different synthetic methods can generate the same types of products. Olivetti and co-workers recently developed an ML-based method that overcomes this problem.86,87 In that approach, aggregated synthesis parameters were first computed using data extracted from over 640 000 publications by using natural language processing (NLP) and ML algorithms. Then a set of tabulated and collated synthesis parameters associated with 30 known oxide systems was input into the treatment. An overview of the approaches employed to transform human-readable publications into machine-readable synthesis planning resources and synthesis parameters is given in Figure 10. Notably, direct human intervention is not involved in implementing this method. Rather, an automated text processing method downloads articles, extracts key synthesis information, codifies this information into a database, and subsequently aggregates the data into material system types. Scientific validation of the scheme was carried out by briefly comparing the aggregated data in the provided data set to known parameters. For example, frequent use of temperatures close to the anatase-to-rutile phase transformation for TiO2 was observed (Figure 10). Additional patterns showed that

Figure 11. (a) Schematic representation of self-assembly of the {Mo120Ce6} wheel from basic building blocks in polyhedron mode. (b) Schematic
diagram of the three exploration methods used. (c) Explorations of crystallization space using the three methods. The exploration is computed as
the volume of the convex envelope of the experiments leading to crystals. (d) Average prediction accuracies between the classes of crystals and
noncrystals for the three methods, using a random forest classifier. From ref 89. CC BY 4.0.
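
The exploration measure in panel (c), the volume of the convex envelope of the crystal-forming experiments, can be sketched as follows. This is an illustrative stand-in rather than the procedure of ref 89: it assumes only two synthesis variables, so the "volume" reduces to an area, and it uses a hand-written monotone-chain hull with the shoelace formula to stay within the standard library. The coordinates are hypothetical. For more than two variables, a library routine such as scipy.spatial.ConvexHull (which exposes a volume attribute) would be the more usual choice.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b turns counterclockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

def hull_area(points):
    """Area enclosed by the convex hull of 2D points (shoelace formula)."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

# Hypothetical crystal-forming conditions in two normalized synthesis variables.
narrow_search = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
broad_search = narrow_search + [(2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]

narrow_area = hull_area(narrow_search)
broad_area = hull_area(broad_search)

print(narrow_area, broad_area)
```

Interior points do not change the result, so this measure rewards a search strategy for reaching new regions of the condition space rather than for repeating experiments inside a region that is already covered.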

hydrothermal reactions occur in the narrow temperature range between 100 and 200 °C, which confirmed that hydrothermal reactions are more likely to occur than calcination when long time periods are utilized.
Studies of polyoxometalates (POMs) are extremely important in catalysis. The group headed by Cronin, who is a leader in the field of automated chemical synthesis,88 constructed a workflow for crystallization of the new POM cluster Na6[Mo120Ce6O366H12(H2O)78]·200H2O that possesses a trigonal-ring-type architecture (Figure 11).89 Synthesis and crystallization of this substance were probed by an ML algorithm. The obtained results were compared with those arising from experimentation. The ML-based approach was able to cover approximately 9 times more crystallization space than a random search and approximately 6 times more than a search conducted by humans. This expansion increased the crystallization prediction accuracy from 77.1 ± 0.9% for a human search to 82.4 ± 0.7%. Cronin and co-workers suggested that this outcome might be a consequence of the fact that the ML-based method is untied to prior chemical knowledge and is more “adventurous” in that it performs “jumps” in chemical space directly into what are believed to be boundaries between crystals and noncrystals.
3.3. Zeolites. Zeolites are crystalline aluminosilicate materials having a network of linked pores and cavities. For decades, these materials have been used in a large number of industrial catalytic processes, and consequently, a large demand exists for the synthesis of new zeolites that display high catalytic performances. ML algorithms have been used to discover complex patterns embedded in the large amount of data accumulated for zeolites.90,91 For example, in pioneering studies of ML applications to zeolite research, Blaisten-Barojas et al.92−94 developed a knowledge-based model that is able to classify and predict topological types of zeolites with high accuracy using a nine-dimensional feature vector, including topological descriptors obtained by computations, together with some physical and chemical properties of zeolites.
Alkaline desilication of zeolites is a conventional method employed to prepare mesoporous zeolites, and the conditions utilized strongly influence important properties of zeolites, such as the micro- and mesopore volumes, Si/Al ratio, and acidity, which in turn influence the activities, selectivities, and durabilities of zeolite catalysts. However, it is difficult to define the optimal desilication conditions that should be used to produce a catalyst tailored for a specific application. Blay et al. showed that perturbation theory combined with ML (PTML) is a powerful approach for predicting the effect of desilication conditions on the properties of ZSM-5 zeolites.95 The PTML models enable prediction of the properties of a query material or compound starting with the properties for a reference and then adding PT operators to measure deviations from the reference. This method can be used to study large data sets with multiple inputs obtained from experiments. In the approach, a PTML model was first constructed using 1019 data sets for various ZSM-5 zeolites (H-ZSM-5, NH4-ZSM-5, Na-ZSM-5, Na,K-ZSM-5, etc.) collected from the literature. The data set contains information about the effect of preparation and

treatment conditions on eight properties, including the BET described a new ML approach to aid the computational design
surface area, mesopore volume, micropore volume, Si/Al molar of synthesizable OSDAs for use in the preparation of zeolite
ratio, etc. The model was found to have a high accuracy (R2 = beta (Figure 13).97 Models to predict stabilization energies
0.98) in an external validation series, demonstrating that the using a neural network were trained and validated using a data
PTML method is useful for the rational design of alkaline- set of 4781 putative BEA zeolite OSDAs. The stabilization
desilicated zeolites. In the model, the sizes of the mesopores in energies in BEA were produced through MD calculations. This
the alkaline-treated zeolites (z axis in Figure 12) are described effort led to the identification of 3062 promising OSDAs, 469
of which were predicted to stabilize the zeolite structure better
than known compounds. The use of ML in this case sped up
the computation by a factor of 350. Recently, Olivetti and co-
workers extended the automatic data extraction approach
developed for metal oxides (see above) to prediction of the
crystallization of zeolite structures.98 This method was used to
synthesize Ge-containing zeolites to investigate relationships
between the synthesis parameters and the resulting topology.
3.4. Inorganic−Organic Hybrid Materials. Inorganic−
organic hybrid materials, including organically templated metal
oxides and metal−organic frameworks (MOFs), have attracted
much attention because they possess characteristics that can be
controlled at the molecular level. Such unique features make
inorganic−organic hybrids suitable for use as catalysts.
Nevertheless, the formation of these materials is not perfectly
understood, and as a result, the development of new materials
currently relies primarily on the results of exploratory synthetic
investigations. However, the ML method developed by
Raccuglia et al. allows prediction of the success of reactions
between organic and inorganic building blocks by using data
from failed experiments (Figure 14).99 Using this approach,
the authors were able to predict reaction outcomes of
crystallization of templated vanadium selenites. The training
set came from information arising from “dark” reactions,
specifically from unsuccessful or failed hydrothermal syntheses.
When applied to hydrothermal synthesis with commercially
available and previously untested organic building blocks, the
ML model outperformed conventional human approaches and
successfully predicted experimental conditions for the for-
mation of a new organically templated inorganic compound
with a notable success rate (89%).

Figure 12. (a) Schematic workflow used to develop the PTML
model. (b) Predicted values of εk(mi, cj)pred = mesopore size for >1000
data points. Reproduced from ref 95. Copyright 2018 American
Chemical Society.

4. SCALING RELATIONSHIPS FOR REACTIONS ON SURFACES
Describing catalytic activity as a function of a physical
parameter without carrying out extensive trial-and-error
experimentation has been a general strategy for catalyst design.
by the micropore and mesopore volumes before treatment. Important contributions have been made by employing surface
This map will be helpful for designing rational methods to properties related to catalysis such as work function,100,101 d-
prepare new zeolites with desired mesopore sizes. band center,102 coordination number (CN),103 surface
The mechanical stabilities (elastic properties) play an energy,104 and strain105 as descriptors of catalyst properties.
essential role in the selection of feasible zeolite structures. In these efforts, the most widely explored examples involve
However, determination of the elastic properties of crystalline establishing scaling relationships between the adsorption
materials using DFT-based energy calculations requires a great energies of reactants and intermediates using the d-band
computational cost. Consequently, Evans and Coudert96 model and between activation and reaction energies, called
developed ML methods to efficiently predict the elastic Brønsted−Evans−Polanyi relationships (Figure
properties of pure silica zeolites employing a DFT-calculated 15).13−15,104,106−111 On the basis of the set of linear
accurate training set of elastic properties. The ML approach relationships, reactivity trends for a target catalytic reaction
based only on geometric parameters of the zeolites (structure, can be readily derived from adsorption energies of reactants
porosity, and local geometry) is able to predict bulk and shear without having to utilize tedious activation barrier calculations
moduli of zeolites with high accuracies. The methodology was and full analysis of the steps involved in the reaction. By using
employed to predict elastic responses of 590 448 hypothetical the scaling relationships strategy, Nørskov and co-workers
zeolites. The availability of this large database enabled an predicted and experimentally verified that bimetal catalysts
assessment of stability trends in zeolites. such as NiZn would be applicable for selective hydrogenation
Zeolite BEA is typically synthesized using a number of of acetylene112 and that NiGa would be applicable to CO2
organic structure-directing agents (OSDAs). Daeyaert et al. hydrogenation.113

Figure 13. (a) Scatter plots of MD- vs ML-predicted stabilization energies for the OSDAs in the validation set. (b) Top five OSDA substances
predicted by model 1b. The scores are ML-determined binding energies in kJ/(mol of Si). From ref 97. CC BY 4.0.

Figure 14. Schematic representation of the three hypotheses generated from the developed model together with representative structures for each
hypothesis. The experimental conditions required for the formation of single crystals depend on the properties of the amine. Reproduced with
permission from ref 99. Copyright 2016 Springer Nature.

Despite its importance in catalysis, the role played by molecules on metal oxide surfaces have shown that scaling
electronic structures in determining adsorption energies on relationships exist for adsorption energies despite the structural
semiconducting and insulating oxide surfaces is still a matter of and electronic complexities of metal oxide surfaces. Fernández
debate. Hence, in silico discovery of metal oxide catalysts is et al. performed DFT calculations on the adsorption of AHx
quite rare. In silico prediction of new catalysts using intermediates and bare A atoms (A = O, S, N) on various
computationally low-cost descriptors could be possible in the transition metal oxide, nitride, and sulfide surfaces.114 The
future if comprehensive data sets containing quantitative same linear relationship was found to exist between adsorption
descriptors for many reactants on many unknown surfaces energies of AHx and A on surfaces of both metal oxides and
were available. This section describes the results of recent pure metals. The trends in adsorption energies can be
studies aimed at determining relationships between electronic understood by using a modified version of the d-band
properties and adsorption energies on metal oxide surfaces. model115 and by “Fermi softness”,109 which is defined as the
4.1. Scaling Relationships between Adsorption sum of the density of states weighted by the derivative of the
Energies. Extensive studies of the adsorption of atoms and Fermi−Dirac distribution function at a certain nonzero

Figure 15. Schematic representation of volcano plots and scaling relationships for reactions on surfaces.
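The linear scaling and Brønsted−Evans−Polanyi (BEP) relationships depicted schematically in Figure 15 both amount to fitting a straight line and then reusing it for prediction. The following is a minimal sketch of that idea, not a reproduction of any cited study: all energies are invented for illustration, and the names `fit_line` and `predict_barrier` are hypothetical helpers.

```python
# Illustrative sketch only: every number below is made up, not taken from the review.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical adsorption energies (eV) of an atom A and of its hydride AHx
# on five surfaces; the AHx values were generated as 0.5 * E_A - 0.6.
e_A = [-5.0, -4.5, -4.0, -3.5, -3.0]
e_AHx = [-3.1, -2.85, -2.6, -2.35, -2.1]

# Adsorption-energy scaling relation: E_AHx = gamma * E_A + xi
gamma, xi = fit_line(e_A, e_AHx)

# Hypothetical BEP data: reaction energy (eV) versus activation energy (eV),
# generated as 0.8 * dE + 1.2.
reaction_energy = [-1.0, -0.5, 0.0, 0.5, 1.0]
activation_energy = [0.4, 0.8, 1.2, 1.6, 2.0]

# BEP relation: E_act = alpha * dE + beta
alpha, beta = fit_line(reaction_energy, activation_energy)


def predict_barrier(delta_e):
    """Estimate an activation energy from a reaction energy via the fitted BEP line."""
    return alpha * delta_e + beta


print(f"scaling: E_AHx = {gamma:.2f} * E_A + {xi:.2f}")
print(f"BEP barrier at dE = 0.25 eV: {predict_barrier(0.25):.2f} eV")
```

Once such lines have been fitted, a barrier for a new surface can be estimated from a single adsorption or reaction energy, which is why these relationships reduce the number of explicit transition-state calculations that are needed.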

temperature. Calle-Vallejo et al. reported that the adsorption metal oxide surfaces such as oxygen vacancies have dominant
energies of O, OH, and OOH are linearly related to the effects on the properties of heterogeneous catalysts.119,120 The
number of outer electrons at the adsorption sites of metal, energies required to form the oxygen defects in metal oxide
monoxide, and perovskite surfaces.116 Although this effort catalysts are typically used to rationalize and predict the
found that the number of outer electrons can be a simple and performance of these catalysts.108
efficient descriptor, van Santen et al. pointed out that Oxygen vacancy formation in ABO3 perovskites has been
experimental data on perovskite LaXO3 compounds do not extensively investigated using DFT computations because it
follow this trend but instead show double-peaked volcano-type controls the electrocatalytic activity.121−123 Using DFT
behavior for the third-row reducible metal oxides, which is calculations, Chen and Ciucci systematically investigated the
most likely to be due to the high spin-state stabilization of d5 dependence of the stability and electronic and ionic
electron systems.5 Recently, scaling relationships were reported conductivity on substitutions of A and B sites of BaFeO3.124
to exist between the adsorption energies of various molecules They found that EOvac is reduced by the presence of Na, K, Ca,
containing atoms bearing lone-pair electrons across a wide Sr, and Pb in the A site and Cu, Ni, Zn, and Ag in the B site.
variety of materials.117 Moreover, our group recently The value of EOvac was found to be linearly correlated with the
investigated the adsorption of a number of molecules on occupied O p-band center, which is related to the basicity of
TiO2 surfaces employing DFT calculations together with the oxygen atoms. Emery et al. used high-throughput DFT to
statistical approaches.118 We observed linear relationships search for novel perovskites that promote thermochemical
between the highest occupied molecular orbital (HOMO) water splitting (TWS).125 They screened 5329 calculated
levels of the molecules and the adsorption energies. ML-based perovskites on the basis of the stabilities and vacancy
statistical analysis indicates that the dipole moment of the formation energies and identified 139 potential candidates
molecule also plays a role in molecular adsorption. It should be for TWS.
noted that adsorption energies on metal oxide surfaces are Kumar et al. studied oxidative coupling reactions of CH4 on
significantly affected by the presence of surface vacancies, various metal oxide surfaces.108 These reactions begin with
dopants, and coadsorbed molecules. Therefore, further abstraction of the H atom from a C−H bond of CH4 by a
consideration is required in order to describe more precisely lattice oxygen atom, which leads to the formation of a CH3
the activities and selectivities of working metal oxide catalysts. radical. They reported that this process takes place more
4.2. Oxygen Vacancy Formation Energy as a readily for highly reducible oxides and that the activation
Descriptor for the Catalytic Behavior of Metal Oxides. energy to cleave the C−H bond is linearly correlated with
Defects play a significant role in governing the properties of EOvac. The CH3 radical intermediate is converted to the desired
catalysts. Various kinds of extended and point defects C2 hydrocarbons by radical coupling or to undesired methoxy
introduced into structures of catalysts largely govern their species by adsorption to surface oxygen leading to the eventual
physical and chemical properties. In particular, point defects in formation of CO or CO2. Thus, adsorption of CH3 radical

regulates the selectivity for C2 hydrocarbon production. The strated that energies of the s- and p-band centers are useful for
adsorption energy of methyl radical is also linearly correlated rapid screening of potential active sites of metal oxide catalysts.
with EOvac. Surfaces activating the C−H bond more easily bind Brønsted acid sites (BASs) in zeolite micropores are formed
methyl radical more strongly. The authors proposed that metal by protonation of oxygen atoms linking neighboring silicon
oxides like TbOx that can exist in multiple oxidation states and aluminum atoms for charge balancing. The presence of
could overcome the intrinsic competitive processes by BASs in the confined environment is a crucial factor governing
changing the oxygen atmosphere. the catalytic properties of many zeolite catalysts.132 Pidko and
Although theoretical studies of the formation of oxygen co-workers determined the acid strength of various zeolites
vacancies on metal oxides have been conducted recently, using the adsorption energies of probe molecules such as NH3,
research on surface defects is still rare, and the number of CO, pyridine, etc.133 The acid strength obtained by the
investigated surfaces remains limited despite the obvious adsorption energy calculations of NH3 was found to be
importance of oxygen vacancies for catalytic activity. There- correlated with the activation barrier of isobutene protonation
fore, studies that provide EOvac values and reveal the physical in FAU zeolites. Matsuoka et al. screened 933 613 structures in
factors determining EOvac are highly desirable. In this context, a database of hypothetical zeolites to identify chemically
our group recently determined EOvac values of various feasible members that have strong Brønsted acidity.134 Density,
semiconducting and insulating oxide surfaces using DFT ring counting, and average ring sizes were used to eliminate
calculations with comparable structure models at the same unsuitable candidates. To assess the Brønsted acid strength,
computational level.126 A further investigation aimed at they introduced a new structural descriptor called b, which is
providing a deep understanding of the formation of oxygen the distance between barycenters of the two triangles
vacancies found that the band gap, formation energy of the connecting O atoms linked to T atoms in the acid site. This
bulk, and electron affinity of a metal oxide are important parameter can be calculated for pure silica with low
properties that govern EOvac. The reason why these electronic computational cost, although the correlation between b and
properties determine EOvac is that electrons enter the defect the Brønsted acid strength is weak. They confirmed by using
state(s) after removal of oxygen, which could be in the DFT calculations that six out of 12 final candidates contain a
conduction band for most cases. It is worth mentioning here strongly acidic Brønsted site. This encouraging result suggests
that use of ML for statistical analysis to assess the physical that even nonquantitatively accurate structural descriptors
basis of this relationship would perhaps be more meritorious might be sufficient for screening purposes where speed is more
than predicting physical property values themselves.127 important than accuracy.
Although limitations associated with variety of the metal
oxide surfaces and computational accuracy exist, the 5. ML PREDICTION OF DFT-DERIVED FUNCTIONS
information would be helpful for catalysis researchers. FOR CATALYSIS
4.3. Understanding Acid−Base Properties of Metal As described above, the activity orders of different catalysts are
Oxides. Solid acids and bases have been employed to catalyze correlated with physical parameters such as adsorption
various reactions, and the role of the properties of these solids energies and values of d-band centers, which can be
in governing the catalytic performance has been investigated determined with less computational cost than activation
extensively.128 An early computational study of the acid−base energies for reactions. Recent efforts focusing on these
properties of the γ-Al2O3 surface found that the energies of descriptor-based approaches for analysis of trends across the
occupied and unoccupied orbitals can be employed to rank periodic table and reaction mechanisms as well as recent
surface Lewis acid and base sites, respectively.129 Jenness et al. developments in computer technology have developed
computed the adsorption energies of alcohols, diethyl ether, effective screening platforms for the discovery of novel
and water and the activation barriers for ethanol dehydration catalysts. However, the costs of these approaches, which are
and etherification promoted by several Lewis acid sites on γ- normally based on first-principles calculations, are still high. If
Al2O3 surfaces.130 The results showed that energies of DFT-derived values were to be predictable using ML with
unoccupied s-band centers can be used as descriptors to widely available descriptors, it would be possible to quickly
evaluate the Lewis acidities of the Al3+ sites. These descriptors predict trends in catalytic activity. The investigations discussed
correlate the adsorption energies with the barriers for ethanol in this section exemplify how fast calculations of catalytically
dehydration in a quantitative manner but describe the barrier important parameters (e.g., adsorption energies, d-band
for ethanol etherification only qualitatively because of the centers, bond strengths between metals and supports) can be
bimolecular nature of the process. Very recently, the occupied carried out using ML models.
p-band center of O and unoccupied s-band center of Al were 5.1. Adsorption Energies. Several recent reports have
employed to evaluate the reactivity for C−H bond activation of described ML predictions of adsorption energies of reactants
CH4 over γ-Al2O3 surfaces.131 and surface intermediates. In a pioneering work, Yildirim and
In a study of n-butane oxidation over vanadium phosphorus co-workers showed that neural network models can be
oxides, Wang et al. identified the active site for this reaction employed to predict the adsorption energies of O2 and CO
among 15 possible VO and PO sites in total.111 Both V on Au2 to Au10 clusters using the size and charge of the cluster,
O and PO sites abstract a H atom from n-butane in what the number of unpaired electrons, and the CN as
can be viewed as a proton-coupled electron transfer process in descriptors.135 ML of DFT-derived adsorption energies on
which V5+ and oxygens serve as electron and proton acceptors, metal (alloy) surfaces,136−144 low-symmetry Pt nanopar-
respectively. Thus, the activity of the PO sites is lowered by ticles,145 and metal cation sites in zeolites146 were also
increasing the distance between V5+ and these sites. The results determined using different descriptors and ML models. Virtual
showed that a linear relationship exists between the centers of screening by utilizing the artificial neural network model for
the energies of the PO lone-pair bands and the energies for the adsorption energy of CO on core−shell alloy surfaces
reaction at the PO sites. Accordingly, the study demon- (Cu3B-A@Cu) indentified new catalyst candidates for selective

Figure 16. Combinatorial challenge of identifying surfaces and active sites for bimetallic catalysts. (A) Ni/Ga intermetallic compounds synthesized
experimentally and identified as the lower hull by the Materials Project database. (B) 40 identified facets/terminations. (C) 583 adsorption
configurations identified as having unique average coordination of bonding metal atoms. (D) High-throughput methods developed to catalog and
evaluate necessary thermodynamic quantities. Reproduced from ref 149. Copyright 2017 American Chemical Society.

CO2 electrochemical reduction to form C2 species.137 The ECH2) were predicted to make selective catalysts that can
group of Asahi described how adsorption energies of N, O, and enable the efficient use of CH4.
NO on Rh1−xAux alloy nanoparticles can be predicted using a Ni2P is known to be an effective catalyst for the
Bayesian linear regression model and a multidimensional electrochemical hydrogen evolution reaction (HER). Rappe
structural descriptor.147,148 By combining this information with et al. carried out virtual screening of Ni2P(0001) surfaces
a kinetic model, those researchers were able to predict the rates doped at P sites with different concentrations of nonmetal
of NO decomposition by the Rh1−xAux alloy nanoparticles as a dopants (B, C, N, O, Si, S, As, Se, and Te).152 This process led
function of composition and particle size. Nørskov et al. to the development of a regularized random forest ML
described a systematic ML-aided discovery approach for algorithm to quantify the relative importance of DFT-derived
identifying NiGa bimetallic catalysts that are active for the descriptors of the surface structure on determining the free
electrochemical reduction of CO2 (Figure 16).149 Also, Tran energies of H atom adsorption (ΔGH) on the catalyst
and Ulissi demonstrated how ML-assisted theoretical screening candidates. The results showed that the Ni−Ni bond length
can be used to discover new intermetallic catalysts (Figure is the most important descriptor for the HER activity, with
17).150 By screening various alloys, they uncovered 258 shorter Ni−Ni bonds giving higher HER activities.
candidate surfaces across 102 alloys for inducing H2 evolution By using DFT, O’Connor et al. were able to calculate
and 131 surfaces across 54 alloys for promoting CO2 energies for adsorption of various early and late transition
reduction. For the purpose of experimental validation, the metals such as V, Mn, Fe, Co, Ni, Cu, Ru, Rh, Pd, Ag, Ir, Pt,
and Au on several reducible and irreducible oxide supports
top candidates were used as catalysts for the electrochemical
such as CeO2(111), MgO(100), CeO2(110), ZnO(100),
H2 evolution, CO2 reduction, and O2 evolution reactions.
TbO2(111), α-Al2O3(0001), and TiO2(011).153 DFT together
Our group also developed a simple ML model for efficient
with an ML method based on LASSO regression was
prediction of adsorption energies of CH4-related species,
employed to identify descriptors that predict interaction
including CH3, CH2, CH, C, and H, on Cu-based alloys strengths between oxide supports and single metal atoms.
(Figure 18).151 The ML model developed was demonstrated The results demonstrated that the adsorption energies are
to predict DFT-derived adsorption energies using 12 correlated with properties of both the metal and the support,
descriptors that are readily obtained for selected elements. It namely, the metal oxidation enthalpies, support reducibilities,
was also shown that the top three descriptors for doped metals, and enthalpies of metal−metal interactions.
including melting point, surface energy, and group in the 5.2. d-Band Centers. Our group used ML to predict d-
periodic table, can be employed to predict the adsorption band centers, which are widely employed as general and
energies with comparable accuracies. Moreover, differences versatile descriptors for a variety of reactions occurring over
between the adsorption energies of CH3 and CH2 (ECH3 − heterogeneous transition metal catalysts.58,59 The d-band

Figure 17. CO2 reduction activity map for bimetallics. A visualization of two-component intermetallics with surfaces that have low-coverage CO
adsorption energy (ΔECO) values within the range of −0.77 to −0.57 eV is shown. White shading indicates an absence of enumerated surfaces; gray
shading indicates that all of the ΔECO values are outside the range of −0.77 to −0.57 eV; colored shading indicates possible activity. The ΔECO
values used to create the upper half of this map were calculated using DFT, and the values used to create the bottom half were calculated using the
surrogate machine learning model. Reproduced with permission from ref 150. Copyright 2018 Springer Nature.
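The screening criterion behind Figure 17 is a simple window on the CO adsorption energy (−0.77 to −0.57 eV). A minimal sketch of such a filter is given below. Only the window limits come from the caption; the alloy labels, facets, and energies are hypothetical placeholders, and the `source` field merely mimics the DFT versus surrogate-model distinction.

```python
# Window on the low-coverage CO adsorption energy (eV), as quoted in the caption.
LOWER_EV, UPPER_EV = -0.77, -0.57


def in_activity_window(delta_e_co):
    """True if the CO adsorption energy lies inside the target window."""
    return LOWER_EV <= delta_e_co <= UPPER_EV


# Hypothetical enumerated surfaces: (alloy, facet, dE_CO in eV, origin of the value).
surfaces = [
    ("A3B", "(111)", -0.65, "DFT"),
    ("A3B", "(100)", -0.90, "ML"),
    ("AB", "(110)", -0.60, "ML"),
    ("AB", "(111)", -0.58, "DFT"),
    ("CD", "(100)", -0.30, "ML"),
]

# Keep only surfaces inside the window, then count surfaces and distinct alloys.
candidates = [s for s in surfaces if in_activity_window(s[2])]
n_surfaces = len(candidates)
n_alloys = len({alloy for alloy, _, _, _ in candidates})

print(f"{n_surfaces} candidate surfaces across {n_alloys} alloys")
```

Counting both surfaces and distinct alloys mirrors the way the screening results are reported in the text (candidate surfaces across a number of alloys).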

model assumes that chemisorption is dominantly governed by approximate description of the exchange−correlation func-
d electrons of transition metals.14 The approach involves linear tionals, the missing derivative discontinuity, and the self-
scaling between the adsorption energy of a given adsorbate and interaction error.154 Although use of the GW approach and
the energy of the d-band center relative to the Fermi level. The hybrid functionals usually gives reasonable results, these
higher the energies of the d states are relative to the Fermi methods are often computationally too expensive. It would
level, the emptier are the antibonding states and the larger is be desirable to obtain the band gap, which is a key property for
the adsorption energy of an adsorbed species on a surface. the treatment of (photo)catalysts and other relevant
Thus, we demonstrated that the d-band centers for applications such as solar absorbers and transparent con-
monometallic and bimetallic surfaces having two different ductors, with sufficiently high accuracy at a reasonable
structures (overlayers on clean metal surfaces and surface computational cost.
impurities) can be predicted by employing an ML method In this regard, Dey and co-workers predicted the band gaps
(gradient boosting regression) (Figure 19). of over 227 new chalcopyrite compounds using the band gaps
5.3. Band Gaps. For the most part, experimental databases of 44 chalcopyrites reported in the literature.37 Subsequently,
contain information about crystal structures of materials. It is the group headed by Tanaka determined prediction models of
much more difficult to access information about the electronic G0W0 band gaps for 270 inorganic compounds employing
structures of compounds. For example, band gaps in the Kohn−Sham band gaps, cohesive energies, crystalline volumes
Materials Project database have been calculated using the per atom, and other fundamental information about the
generalized gradient approximation (GGA) functional by constituent elements as descriptors.38 As determined using
Perdew, Burke, and Ernzerhof (PBE) and GGA PBE+U for support vector regression, the best model had an RMSE of
some of the compounds. Therefore, the band gaps of 0.24 eV, showing that the method is useful for screening a large
semiconductors are underestimated because of the only set of materials. Band gaps of double perovskites have also

Figure 18. (a) Representative adsorption model for CH3 on Cu-based alloys. (b) DFT-derived adsorption energies of CH3, CH2, CH, C, and H on
Cu-based alloys. (c) Average RMSEs to predict the adsorption energies of CH3 by 100 rounds of random leave-25%-out trials with various ML
models. (d) Correlation maps of the 12 descriptors. (e) Feature-importance scores of the descriptors for the extra tree regression prediction of the
adsorption energies of CH3. Reproduced from ref 151. Copyright 2018 American Chemical Society.
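The validation protocol mentioned in panel (c), average RMSE over 100 rounds of random leave-25%-out trials, can be sketched as follows. This is an assumption-laden illustration: a one-descriptor linear fit stands in for the ML models compared in the figure, the data are synthetic, and `average_holdout_rmse` is a hypothetical helper name.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares line; a simple stand-in for the regressors used in the study."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def average_holdout_rmse(xs, ys, fraction=0.25, rounds=100, seed=0):
    """Mean test RMSE over repeated random splits that hold out `fraction` of the data."""
    rng = random.Random(seed)
    n = len(xs)
    n_test = max(1, round(fraction * n))
    scores = []
    for _ in range(rounds):
        idx = list(range(n))
        rng.shuffle(idx)
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        preds = [slope * xs[i] + intercept for i in test_idx]
        scores.append(rmse([ys[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


# Synthetic data: one descriptor and an "adsorption energy" that depends on it linearly.
descriptor = [float(i) for i in range(20)]
exact_energy = [0.3 * x - 1.0 for x in descriptor]
noise = random.Random(1)
noisy_energy = [y + noise.gauss(0.0, 0.1) for y in exact_energy]

rmse_exact = average_holdout_rmse(descriptor, exact_energy)
rmse_noisy = average_holdout_rmse(descriptor, noisy_energy)
print(f"noise-free: {rmse_exact:.2e} eV, noisy: {rmse_noisy:.3f} eV")
```

Averaging over many random splits reduces the dependence of the reported error on any single partition, which matters for the small data sets typical of DFT-derived adsorption energies.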

Figure 19. ML prediction of the d-band center values for 11 metals (Au, Pt, Ir, Ag, Pd, Rh, Ru, Cu, Ni, Co, and Fe) and their pairwise bimetals for
two types of structures (overlayer-covered or 1% metal-doped metal surfaces). Reproduced with permission from ref 58. Copyright 2016 Royal
Society of Chemistry.
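For orientation, the d-band center that serves as the prediction target here is commonly taken as the first moment of the d-projected density of states (DOS), with energies measured relative to the Fermi level. The sketch below computes it by trapezoidal integration for a synthetic Gaussian d band. The DOS is invented, and integrating over the full band rather than only the occupied states is one convention among several, assumed here for simplicity.

```python
import math


def trapezoid(ys, xs):
    """Trapezoidal-rule integral of ys sampled at points xs."""
    total = 0.0
    for i in range(1, len(xs)):
        total += 0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1])
    return total


def d_band_center(energies, dos):
    """First moment of the d-projected DOS; energies in eV relative to the Fermi level."""
    weighted = [e * g for e, g in zip(energies, dos)]
    return trapezoid(weighted, energies) / trapezoid(dos, energies)


# Synthetic d band: a Gaussian of width 1 eV centered 2 eV below the Fermi level,
# sampled from -10 eV to +6 eV in 0.01 eV steps.
energies = [-10.0 + 0.01 * i for i in range(1601)]
dos = [math.exp(-((e + 2.0) ** 2) / 2.0) for e in energies]

center = d_band_center(energies, dos)
print(f"d-band center: {center:.3f} eV relative to the Fermi level")
```

An ML model such as the gradient boosting regression mentioned in the text replaces this DFT-dependent integral with a prediction from readily available elemental descriptors.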

been predicted using ML methods.39 In addition, ML for determining the band edges with GW level accuracy having
predictions have recently been made for the band edges of a minimum RMSE of 0.12 eV. They also found that
MXenes, which are 2D layered transition metal carbides and electronegativity differences between transition metals and
nitrides analogous to graphene and molybdenum disulfide. functional groups play a crucial role in tuning the electronic
Determination of accurate positions of the band edges of properties and positioning the band edges of MXenes.
these novel materials is extremely important because they have 5.4. Surface Phase Diagrams. Surface phase diagrams are
promise in catalysis applications, in particular as photocatalysts of great importance to understand the surface chemistry
and electrocatalysts. Mishra et al.155 developed an ML model occurring in (electro)catalysis, where various adsorbates and

Figure 20. (a) Workflow scheme of the ELSIE algorithm, which consists of two steps. In the first step, the absorption species is identified and used
to narrow the number of candidate computed reference spectra. In the second step, the spectral matching ensemble yields a list of computational
spectra rank-ordered by their similarity to the target spectrum. (b) Performance of the ELSIE algorithm on 19 test spectra. From ref 162. CC BY
4.0.
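The two-step workflow in Figure 20a (filter by absorbing species, then rank references with an ensemble of preprocessing/similarity pairs) can be sketched in a few lines. This is a heavily simplified illustration and not the ELSIE implementation: the three weak learners, the rank-sum aggregation, the dictionary layout, and all spectra are assumptions made for this example, whereas the published algorithm combines 33 weak learners.

```python
import math


def normalize_max(spectrum):
    """Scale a spectrum so that its maximum intensity is 1."""
    peak = max(spectrum)
    return [v / peak for v in spectrum]


def normalize_area(spectrum):
    """Scale a spectrum so that its intensities sum to 1."""
    total = sum(spectrum)
    return [v / total for v in spectrum]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def pearson(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def negative_euclidean(a, b):
    """Negated distance, so that larger values mean more similar."""
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Each weak learner is a (preprocessing, similarity) pair; these three are hypothetical.
WEAK_LEARNERS = [
    (normalize_max, cosine_similarity),
    (normalize_area, pearson),
    (normalize_max, negative_euclidean),
]


def rank_references(target, references, absorbing_element):
    # Step 1: keep only reference spectra with the same absorbing element.
    candidates = [r for r in references if r["element"] == absorbing_element]
    # Step 2: every weak learner ranks the candidates; rank positions are summed.
    totals = {r["name"]: 0 for r in candidates}
    for preprocess, similarity in WEAK_LEARNERS:
        t = preprocess(target)
        ordered = sorted(candidates,
                         key=lambda r: similarity(t, preprocess(r["spectrum"])),
                         reverse=True)
        for position, ref in enumerate(ordered):
            totals[ref["name"]] += position
    # The lowest summed rank is the best overall match.
    return sorted(totals, key=totals.get)


# Invented five-point "spectra" on a common energy grid.
target = [0.1, 0.5, 1.0, 0.6, 0.3]
references = [
    {"name": "Ni-oct", "element": "Ni", "spectrum": [0.1, 0.52, 0.98, 0.61, 0.3]},
    {"name": "Ni-tet", "element": "Ni", "spectrum": [0.4, 1.0, 0.5, 0.3, 0.2]},
    {"name": "Fe-oct", "element": "Fe", "spectrum": [0.1, 0.5, 1.0, 0.6, 0.3]},
]

ranking = rank_references(target, references, "Ni")
print(ranking)
```

The element filter removes the Fe reference even though its spectrum is numerically identical to the target, which illustrates why the first step narrows the candidate pool before any similarity is computed.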

coverages exist at varying applied potentials. These diagrams 6. ML PREDICTION FOR CATALYST
are typically made by employing intuition, an approach that CHARACTERIZATION
has the risk of missing complex configurations and coverages at Spectroscopic and microscopic analysis of catalysts requires
potentials of interest. Cluster expansion methods showing efforts by experienced experts. If an ML model can be
higher accuracy are often difficult to implement for treatment constructed for a large number of catalysts and surface
of new surfaces. The group of Nørskov employed an ML reactions from various types of spectroscopy and microscopy
approach to address these issues.156 The free energies of all data, rapid identification of active sites, intermediates, and
possible adsorbate surface coverages were predicted for a finite products would be possible.157 Although various ML-assisted
number of adsorption sites using a Gaussian process regression spectroscopic analysis methods have been described,158−160
(GPR) model. The results showed that a simple, rational, and the discussion in this section focuses on the current status of
systematic approach could be implemented to obtain free ML-assisted analysis of various spectroscopy and microscopy
data that are closely related to heterogeneous catalysis.
energy diagrams at relatively low computational costs. The
6.1. XAFS. X-ray absorption fine structure (XAFS) is a
Pourbaix diagram for the IrO2(110) surface having nine widely used technique for determining the local atomic
coverages from fully oxygenated to fully hydrogenated can be structures of catalytic materials. XAFS is divided into two
reconstructed using only 20 electronic structure relaxations, regimes: X-ray absorption near-edge structure (XANES) and
compared to ca. 90 required when typical search approaches extended X-ray absorption fine structure (EXAFS). XANES
are utilized. The authors also extended this method to the gives a fingerprint of coordination chemistry and the oxidation
MoS2 surface and obtained similar efficiencies. states of X-ray-absorbing atoms of a material, and EXAFS

provides information on the CN of neighboring atoms and their distances. Quantitative XANES analysis is typically difficult to carry out. Moreover, XANES analysis relies mainly on manual comparison of measured and reliable reference spectra. However, existing databases of XAFS spectra are limited in terms of the number of reference spectra, energy resolution, and signal-to-noise ratio.161 Zheng et al. recently constructed a large database of computed reference XANES (more than 800 000 K-edge spectra for more than 40 000 compounds) and developed an algorithm called Ensemble-Learned Spectra Identification (ELSIE) for the automatic identification of matching spectra for any experimental XANES spectrum.162,163 The algorithm combines 33 weak learners composed of a set of preprocessing steps (peak shifting/alignment, spectra normalization, feature transformation, etc.) and a similarity measure (Figure 20). The approach attained up to 84.2% accuracy in identifying the correct coordination environment and oxidation state of a test set of 19 K-edge XANES spectra covering a variety of materials. The XANES database and ELSIE algorithm have recently been integrated into the Materials Project database,74 which is available to all materials scientists.

Kiyohara et al. developed a data-driven approach based on the clustering and decision tree method to “predict” and “interpret” oxygen K-edge XANES/electron-loss near-edge structure (ELNES) spectra of oxides.164 The “prediction” proposes candidate spectra for a target oxide from a theoretically calculated spectral database on the basis of geometric and element input information such as bond length, bond angle, and valence state, while the “interpretation” suggests material information for the target spectrum. This methodology operated well in making an actual prediction and interpretation for the target SiO2 polymorph among 32 SiO2 polymorphs. Suzuki et al. showed how to estimate materials parameters (charge, symmetry, the crystal field parameter 10Dq, etc.) for manganese oxide from experimental Mn L2,3 XANES spectra.165 A spectral data set was first prepared by theoretical simulation using various parameters. Then the similarity values between the individual spectra and the reference spectrum were calculated to construct a statistical learning model for estimating the material’s parameters from its spectrum. A discrete value, for example the charge of Mn, is then estimated from the measured spectrum using dimensionality reduction, and its 10Dq (continuous value) is determined using the regression model obtained from the similarity of the spectra.

Timoshenko et al. described an approach for determination of the 3D geometry of Pt nanoparticle catalysts (1−3 nm size) supported on γ-Al2O3 that employs a supervised ML method and an artificial neural network.166 The artificial neural network was trained using theoretically calculated XANES data for Pt nanoparticles having different shapes and sizes, which are represented as the average CNs for the first four coordination shells {C1, C2, C3, C4}. The artificial neural network displayed high performance for predicting correct CN values {C1, C2, C3, C4} for experimental XANES spectra of the supported Pt nanoparticles, and thus, it is useful for obtaining their 3D structures. These researchers also utilized a modified artificial neural network method that included the effect of the bond distance R ({C1, C2, C3, C4, R}) to obtain 3D structures of subnanometer Cu clusters (Cu4, Cu12, Cu20) deposited on a ZnO or ZrO2 substrate under a reactive atmosphere.167

Akai et al. applied sparse modeling based on l1 regularization to EXAFS oscillations analysis of Cu foil. In this approach, simplified basis functions based on a single-scattering approximation for the EXAFS oscillation are used instead of plane-wave basis functions for the Fourier transform (FT).168 Consequently, the radial structure function takes a discrete spectrum and increases its intensity with increasing distance from the X-ray absorbing atom, suggesting the possibility of analysis of the CN at much higher shells. The Debye−Waller factor can also be obtained directly without prior assumptions about the atomic-scale local structure.

An ML-based approach that can classify the local coordination environments of the absorbing atom from simulated K-edge XANES spectra was demonstrated recently by Carbone and co-workers.169 The fidelity and robustness of the developed ML method were demonstrated by high average accuracy (86%) across the wide chemical space of oxides in eight 3d transition metal families. Lamberti et al. also applied ML techniques to simulate XANES spectra in order to investigate molecular adsorption on open metal sites (Ni2+) of a MOF material.170 It should also be mentioned that automatic oxidation threshold recognition was demonstrated by using ML and experimental XAFS spectra by the group of Takahashi.171

Although several successful examples are available for ML prediction of XAFS spectra, most of the studies used computationally derived (well-behaved) data sets as inputs. This is mainly because proper energy alignments are required for ML modeling of XAFS spectra, and thus, it is difficult to obtain a sufficiently large number of experimental data sets for this purpose. Improvements in the accuracy of theoretical simulations are strongly desired.

6.2. TEM/STEM. It has become increasingly common to record atomically resolved transmission electron microscopy (TEM) and scanning TEM (STEM) images of catalytic materials and to monitor their atomic dynamics during reactions. However, because the amount of TEM/STEM data available has significantly increased recently and their interpretation often requires a great deal of time for nonexperts, it is highly desired to have automated analysis tools that can be utilized to identify and classify the local structures such as atoms (atomic columns), defects, dopants, and phases.

Madsen et al. showed that deep convolutional neural networks can be used to directly identify atoms (atomic columns) of Au nanoparticles supported on CeO2 in high-resolution TEM (HRTEM) images.172 In this protocol, the network is trained using simulated TEM images of 500 Au nanoparticles. Because the network should be able to recognize both atomically flat and rough surfaces, the training data set includes both types of nanoparticles. Nanoparticles are initially cut from a regular crystal, keeping random numbers of layers in directions with low Miller indices (the ⟨100⟩, ⟨110⟩, and ⟨111⟩ directions), and then a random number of additional atoms are added to the particle to make it rougher. Subsequently, each particle is aligned into the ⟨110⟩ or ⟨111⟩ zone axis and rotated a random amount around the zone axis. It is finally tilted 0−5° away from the zone axis. Figure 21a shows TEM images of CeO2-supported Au nanoparticles in an oxygen atmosphere and the corresponding analysis using the neural network. As can be seen, the network can confidently identify the atoms in the nanoparticle, and in particular, the surface atoms at the corners of the nanoparticle appear and disappear again. The same CeO2-supported Au nanoparticle in vacuum
and in an oxygen atmosphere is shown in Figure 21b. In the oxygen atmosphere, many surface and corner atoms are present part of the time, indicating the occurrence of surface diffusion.

Figure 21. (a) Top row: Successive experimental TEM images of a Au particle on CeO2 in a 4.5 Pa O2 atmosphere. Middle row: Output images given by the neural network from the images above. Bottom row: Atoms identified by using the neural network are marked as black circles overlaid on the original image. (b) Surface dynamics of Au nanoparticles on CeO2 influenced by the gaseous atmosphere. The occurrence is the percentage of frames where the neural network identified an atomic column at a possible site. The event is the percentage of frames where a site was previously occupied but is unoccupied in the frame immediately after, or vice versa. Reproduced with permission from ref 172. Copyright 2018 Wiley-VCH Verlag GmbH & Co. KGaA.

Ziatdinov et al. used a DNN to locate atomic species and defects in atomically resolved STEM images.173 For this purpose, they developed a “weakly supervised” approach that employs coordinates of all atomic species and types of defects to identify various defects that are not part of the initial training set. This technique was used to interpret atomic and defect transformations, including switching between different coordinations of Si dopants in graphene and the formation of a peculiar Si dimer with mixed threefold and fourfold coordination.

Vasudevan et al. used the deep convolutional neural networks technique for automatic determination of the Bravais lattice symmetry seen in atomically resolved STEM and STM images.174 The deep convolutional neural network was trained to identify this type of lattice symmetry when a 2D fast FT of the input image is given and then applied to sequential STEM images recorded during electron-beam decomposition of Mo-doped WS2. The method allowed tracking and determination of the rate of growth of voids (defects). Belianinov et al. proposed an approach to separate mixed phases, symmetries, and defects that is based on multivariate statistical analysis of the coordination spheres of individual atoms, represented by an array of values that correspond to a variety of metrics (distances, angles, etc.) between an atom and its nearest or next-nearest neighbors.175 A k-means clustering analysis was performed for the two-phase (M1 and M2) mixed STEM image of Mo−V−Te−Ta oxide. The analysis enabled clear distinction of different areas of the image based on the similarity of the chemical neighborhoods of their constituent atoms and also predictions of the existence of a third phase in addition to the original two phases.

6.3. Nanoscale Elemental Analysis Using STEM-EDX and EELS. Advances in STEM have enabled comprehensive energy-dispersive X-ray (EDX)/electron energy-loss (EELS) spectral data sets from a specified region of interest (ROI) to be obtained. The spectrometers collect a set of spectra, each from a subnanometer area of the sample, by using the subnanometer electron probe scanning over the two-dimensional ROI. However, because of the presence of heterogeneous volumes in which spatial overlap of different phases exists within the beam path, the STEM-EDX signals from different depths in the beam path are mixed. Therefore, they need to be separated to determine the compositions of the individual phases. Rossouw et al. applied a blind source separation method based on PCA and independent component analysis (ICA) to analyze EDX spectral images (SIs) of core−shell nanoparticles in order to determine the number of phases in an analyzed volume (core, shell, and supporting film).176 Figure 22a,c−e contain HAADF STEM images of FePt@Fe3O4 core−shell nanoparticles supported on carbon and the respective EDX elemental maps. It is clear that the particles consist of Pt-rich cores surrounded by Fe oxide shells. However, it is difficult to tell whether the particles contain a monometallic Pt core with a surrounding Fe oxide shell or Fe makes an alloy with Pt to form a bimetallic core. The scree plot in Figure 22b suggests that the sample is composed of three principal phases. The spatial distribution of the independent components IC#0−IC#2 (Figure 22f−h) and the corresponding spectra (Figure 22i) show that IC#2, IC#1, and IC#0 represent the carbon support, Fe oxide shells, and bimetallic FePt cores, respectively. The small Cu peaks in all components likely originate from the Cu mesh support. More recently, Shiga et al. developed a new non-negative matrix factorization (NMF) method for automatic resolution and extraction of constituent chemical components in STEM-EDX/EELS SI data.177 The method uses an extended NMF model with two penalty terms, namely, a prior automatic relevance determination (ARD) that optimizes the number of components and a soft orthogonal constraint that resolves each spectral component. An assessment of the method employing both phantom and real STEM-EDX/EELS SI data sets demonstrated that the prior ARD successfully identified the correct number of physically meaningful components.

7. ML PREDICTION OF EXPERIMENTAL DATA FOR HETEROGENEOUS CATALYSTS

Experimentally produced data (conversion, selectivity, reaction rate) for reactions promoted using different catalyst compositions or different reaction conditions can be utilized to construct ML models for predicting reaction data. Although

Figure 22. ICA of a cluster of bimetallic FePt nanoparticle seeds coated by Fe3O4 shells. (a) HAADF-STEM image of the nanoparticles. (b) Scree
plot of the first 50 principal components displaying the first three components lying above the noise. (c−e) Element maps of (c) Pt, (d) Fe, and (e)
O. (f−h) The IC maps (f) IC#0, (g) IC#1, and (h) IC#2 and (i) the corresponding IC spectra. Reproduced from ref 176. Copyright 2015
American Chemical Society.

Figure 23. Proposed ML method considering elemental features as representative input instead of the catalyst compositions themselves. As
elemental descriptors, 11 physical properties that are readily available from chemical handbooks and the periodic table (atomic number, atomic
weight, period, group, melting point, atomic radius, electronegativity, boiling point, enthalpy of fusion, density, and ionization energy) were
employed for each element. Reproduced with permission from ref 66. Copyright 2019 Wiley-VCH Verlag GmbH & Co. KGaA.
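
The idea in Figure 23 can be illustrated with a minimal sketch that replaces a raw composition vector with composition-weighted averages of elemental properties. The small property table, the `featurize` helper, and the use of a weighted mean as the aggregation rule are assumptions made for illustration; ref 66 uses 11 properties per element and may aggregate them differently.

```python
# Illustrative table: (atomic number, atomic weight, period, group) per element.
ELEMENT_PROPS = {
    "Li": (3, 6.94, 2, 1),
    "Na": (11, 22.990, 3, 1),
    "Mg": (12, 24.305, 3, 2),
    "Mn": (25, 54.938, 4, 7),
    "W": (74, 183.84, 6, 6),
}
PROP_NAMES = ("atomic_number", "atomic_weight", "period", "group")


def featurize(composition):
    """Map {element: amount} to composition-weighted mean elemental properties."""
    total = sum(composition.values())
    if total <= 0:
        raise ValueError("composition must have a positive total amount")
    features = [0.0] * len(PROP_NAMES)
    for element, amount in composition.items():
        weight = amount / total
        for i, value in enumerate(ELEMENT_PROPS[element]):
            features[i] += weight * value
    return dict(zip(PROP_NAMES, features))


# Two hypothetical catalysts with no element in common share one feature space.
li_mg = featurize({"Li": 0.1, "Mg": 0.9})
mn_na_w = featurize({"Mn": 1, "Na": 2, "W": 1})
```

Because every catalyst is mapped onto the same fixed-length descriptor vector, a model trained on such features can, in principle, be queried for element combinations that never co-occur in the training data.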

Figure 24. (a) Comparison of 90%/10% training (blue)/test (red) error plots for three representation models of catalyst performance predictions
for OCM (C2 yield): conventional ML methods using (i) only catalyst compositions and (ii) catalyst compositions + experimental conditions and
(iii) proposed ML method using catalyst compositions + experimental conditions. (b) Top 20 descriptors to predict the C2 yield. (c) Top 20
promising candidate catalysts worth testing next. Reproduced with permission from ref 66. Copyright 2019 Wiley-VCH Verlag GmbH & Co.
KGaA.
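
The 90%/10% training/test comparison in Figure 24a can be sketched as follows. This is only an illustration of the hold-out protocol: ordinary least squares on a single synthetic descriptor stands in for the gradient-boosted model, and the data, seeds, and helper names are assumptions rather than anything taken from ref 66.

```python
import math
import random


def train_test_split(rows, test_fraction=0.1, seed=0):
    """Shuffle rows reproducibly and hold out test_fraction of them."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Closed-form least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(rows, model):
    """Root-mean-square error of the fitted line on the given rows."""
    slope, intercept = model
    return math.sqrt(sum((y - (slope * x + intercept)) ** 2 for x, y in rows) / len(rows))


# Synthetic (descriptor, yield) pairs; not data from the review.
rng = random.Random(42)
data = [(x / 10, 2.0 * (x / 10) + 1.0 + rng.gauss(0.0, 0.1)) for x in range(100)]

train, test = train_test_split(data, test_fraction=0.1)
model = fit_line(train)        # fit on the 90% only
train_rmse = rmse(train, model)
test_rmse = rmse(test, model)  # error on the unseen 10%
```

A test error much larger than the training error would indicate overfitting, which is the kind of gap that the training/test comparison is meant to expose.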

most of the studies mentioned above focused on building ML models from computationally derived (well-behaved) data sets, developing approaches to use ML with “real world” experimental catalysis data would be more useful because the computational costs to obtain accurate theoretical models for heterogeneous catalysis are currently prohibitively high. The dynamic nature of catalytically active sites/fields makes the modeling even more difficult. Even though excellent efforts have been made to create databases of surface reaction energetics, such as CatApp178 and Catalysis-Hub.org,179 the accuracies and varieties of methods developed to capture the complex phenomena of heterogeneous catalysis are not yet adequate. After a brief overview of previous studies, this section summarizes recent examples of efforts focusing on ML-assisted prediction of catalytic reaction data.

Hattori et al. described pioneering studies of artificial intelligence approaches to catalyst design that use an expert system or neural network.43−46 Subsequently, several groups180−186 reported approaches for investigating and optimizing catalyst compositions using neural network, GA, or SVM methods, which have been summarized in previous reviews.4,30,187,188 Because the concepts and a tutorial overview of this general approach are described in reviews by Rothenberg,4,188 we only summarize progress made in this area in the past decade.

7.1. CH4 Conversion. Because of the large amount of CH4 that is available from natural sources, it is strongly desired to develop processes for direct and indirect conversion of this substance into value-added products. Among the methods devised for direct conversion of CH4 to value-added chemicals, oxidative coupling of CH4 (OCM) to produce C2 products such as C2H6 and C2H4 is the most popular. Catalysts for OCM reactions have been analyzed and predicted by several groups using a database consisting of nearly 2000 data sets on catalyst compositions and their performances collected from diverse published data.189−193 Very recently, our group reported a statistical analysis and a proposal of novel heterogeneous catalysts for OCM using ML treatment of literature data.66 This effort led to the development of a novel ML method considering elemental features as input representations instead of inputting catalyst compositions directly (Figure 23). Effective analysis of literature data by ML methods has the capability to provide valuable information. Nevertheless, utilization of literature data has obstacles such as bias from prior published data, low sample counts for many elements, and lack of composition overlaps. The newly developed method has the potential to guide catalyst design and discovery, including for reactions where limited catalyst composition overlap exists in the given data. The prediction accuracy of this approach is higher than those of conventional methods using catalyst compositions as input (Figure 24). Among various state-of-the-art ML models developed, gradient boosting regression with XGBoost displayed the highest level of performance to predict catalytic performance. In order to determine quantitatively the most important input variables for prediction of the catalytic performance, a feature importance score was also obtained. Moreover, a catalyst optimization procedure that uses ML as “surrogate” models identified the top 20 promising catalyst candidates. This investigation provided fundamental knowl-

Figure 25. Performance of the training and test sets for the model describing the conversion of 5-ethoxymethylfurfural to the corresponding
unsaturated alcohol. Reproduced with permission from ref 201. Copyright 2012 Royal Society of Chemistry.

edge about catalytic processes. Moreover, it has the potential to be used to identify truly novel heterogeneous catalysts in cases where the reported data are limited, biased, and often noisy and inconsistent as a consequence of variations arising from the use of different instruments, procedures, and platforms.

Currently, CH4 is industrially converted through autothermal or catalytic steam reforming to form synthesis gas (a mixture of H2 and CO), which then is used for large-scale manufacture of methanol and higher-molecular-weight hydrocarbons. Although these processes are already industrially viable, they suffer from harsh reaction conditions. Thus, better catalysts are needed. The Yildirim group constructed 5508 experimental data points for steam reforming of CH4 (SRM) using 81 publications appearing between 2004 and 2014.194 The database was analyzed using decision trees to extract correlations, heuristics, and trends that are not identifiable using only the naked eye. Analysis of the performance variable showed that Ni, Ru, Rh, and Pt are the most frequently used active metals and that they are normally supported on Al2O3, CeO2, or ZrO2 employing impregnation methods. In addition, the analysis determined the ranges of catalyst preparation and operational conditions employed that lead to high CH4 conversion. More recently, the same group extracted knowledge for the dry reforming of CH4 (DRM) reaction from literature experimental data employing data mining tools.195 A database having 5521 data points was constructed from 101 papers reported between 2005 and 2014.

7.2. CO Oxidation. Günay and Yildirim performed a systematic analysis of experimental data for CO oxidation reactions reported between 2000 and 2012 using combinations of various data mining tools.196 The decision tree classification formulated some simple rules needed for high conversion. The same group reported neural network analysis of the results of experimental CO oxidation reactions over Cu- and Au-based catalysts.197,198 The original data set was collected from publications by different groups. The analysis was successful in the experimental region that contained many data points, whereas the conditions with smaller data sets were not easily generalized.

A database containing 4360 experimental data points on the Pt- or Au-catalyzed water gas shift reaction (WGS), which converts CO and H2O to CO2 and H2, was also constructed by the Yildirim group using data obtained from publications appearing between 2002 and 2012.199 The database was analyzed using three mining tools to extract key information. Specifically, (1) decision trees were used to determine the empirical rules and conditions that allow for high CO conversion, (2) artificial neural networks were used to provide the relative importance of a variety of experimental variables and their effects on the catalytic activity, and (3) SVMs were used to predict outcomes of reactions performed under unstudied conditions.

7.3. Friedel−Crafts Reaction. Omata devised a GPR-based method for predicting the activities of heteropolyacid catalysts supported on active carbon for a Friedel−Crafts reaction.200 Additives that enhance activity were predicted using the regression model and verified through experimentation. The influence of physicochemical properties on the activity was estimated by employing GPR. The most influential properties were found to be density, atomic weight, and ionic radius of the additive element. It is significant that this approach uncovered the excellent effect of Pt addition on this process.

7.4. Utilization of Biomass. Rothenberg et al. reported experimental data on the hydrogenation of 5-ethoxymethylfurfural, an important intermediate to obtain industrial chemicals from sugars, promoted by different Al2O3-supported metal catalysts under various conditions.201 The obtained data were compared with the results of studies of 48 bimetallic catalysts and rationalized by an effective model that applied catalyst descriptors based on Slater-type orbitals. In this approach, each metal is described using four parameters: the distance of the orbital peak from the metal atom center, the height of the peak, the peak skewness, and the peak width at half height. The authors then applied these descriptors to modeling of the reaction data using multivariate methods. In spite of the inherent complexity of the reaction pathway, the models developed in this way reasonably described the performances of the catalysts (Figure 25).
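
The decision-tree analyses cited in Sections 7.1 and 7.2 extract threshold rules for high conversion from literature data. As a minimal sketch of the underlying step, the snippet below finds the single best threshold on one variable by minimizing the weighted Gini impurity, which is the split criterion commonly used when growing a classification tree. The temperature values and labels are hypothetical, and the cited studies used full multivariable trees rather than a single split.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, weighted_impurity) of the best single-variable split."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical records: reaction temperature (K) and high (1) or low (0) conversion.
temperature = [350, 375, 400, 425, 450, 475, 500, 525]
high_conversion = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(temperature, high_conversion)
rule = f"high conversion if temperature > {threshold} K"
```

A full tree repeats this search recursively over all variables within each resulting subset, which is how human-readable rules of the kind described above are obtained.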

Figure 26. (a) Loading magnitudes for the 14 descriptors identified using factor analysis. (b) Relative importance of descriptor variables for the
penalized regression models (elastic nets, LASSO, LAR). Descriptors are sorted by median (gold bars). The bar color indicates the sign of the
coefficient: positive (light gray) and negative (dark gray). (c) Heat map of relative OER activity predictions for ABO3 perovskites using LAR and
data obtained from the Materials Project. Reproduced from ref 206. Copyright 2015 American Chemical Society.
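
Penalized regression of the kind summarized in Figure 26b ranks descriptors by shrinking the coefficients of unimportant ones toward zero. The following is a minimal sketch of LASSO solved by cyclic coordinate descent; the three orthogonal toy descriptors, the response, and the penalty value are assumptions chosen so that the result is easy to check by hand, and they are unrelated to the 14 descriptors of ref 206.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(columns, y, lam, sweeps=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    columns is a list of feature columns, each a list of n floats.
    """
    n = len(y)
    beta = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col_j in enumerate(columns):
            # Residual with the current contribution of feature j left out.
            partial = [
                y[i] - sum(beta[k] * columns[k][i] for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(col_j[i] * partial[i] for i in range(n)) / n
            scale = sum(v * v for v in col_j) / n
            beta[j] = soft_threshold(rho, lam) / scale
    return beta


# Toy, mutually orthogonal descriptors (hypothetical).
x1 = [1.0, 1.0, -1.0, -1.0]
x2 = [1.0, -1.0, 1.0, -1.0]
x3 = [1.0, -1.0, -1.0, 1.0]
y = [2.0 * a + 0.1 * c for a, c in zip(x1, x3)]  # x2 is irrelevant, x3 is weak

beta = lasso_coordinate_descent([x1, x2, x3], y, lam=0.5)
```

With this penalty the strong descriptor keeps a shrunken coefficient (2.0 reduced by 0.5), while the irrelevant and weak descriptors are set to zero, which is the behavior that makes the magnitude of the surviving coefficients usable as a relative importance measure.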

The Yildirim group conducted a statistical analysis of catalytic transesterification reactions by employing artificial neural network and decision tree techniques. The aims were to estimate important variables affecting conversion of fatty acid and suitable ranges of these variables.202 For this purpose, a database of 1324 data points was constructed from the experimental results reported in 31 publications. Importantly, artificial neural network analysis showed that the most important variable to obtain high conversion of the fatty acid was the reaction time, followed by the catalyst loading, alcohol/oil molar ratio, reaction temperature, and type of support, each of which has a similar relative importance of about 10%.

7.5. Oxidative Dehydrogenation of Butane. The oxidative dehydrogenation of butane to form butenes and butadiene can also be accurately modeled employing heuristic descriptor models. Rothenberg et al.203 synthesized and experimentally tested 15 bimetallic mixed oxides supported on Al2O3 in order to construct a descriptor model. The results obtained utilizing data from this study to assess 1711 mixed oxide catalysts in silico led to the prediction of six new bimetallic oxides, which were then experimentally synthesized and tested. The correlation between the predicted and experimental results was excellent, showing the power of these low-cost predictive models. The wide operational range of the model was demonstrated by its use in uncovering not only catalysts expected to have high performance but also those expected to have low activities. It is important to note that an understandable bias exists for including only “good” results in publications because articles often exclude descriptions of badly performing catalysts. However, models developed to predict the performance of catalysts should be based on a wide range of data, highlighting the importance of including both “good” and “bad” candidates in publications.

7.6. Reduction of CO2. Several new catalysts for CO2 hydrogenation to form higher-molecular-weight hydrocarbons (CO2-based Fischer−Tropsch synthesis) were devised by Baerns and co-workers, who applied an evolutionary strategy using a GA.204 Among those predicted are Fe-based catalysts modified using K and Zn or Cu, which promote high CO2 conversions combined with high selectivity toward C5−C15 hydrocarbons and low CH4 formation. Moreover, it was demonstrated that TiO2 is required as a support in order to obtain low CH4 selectivities.

Experimental data on the electrocatalytic reduction of CO2 (471 data points) were collected and used for statistical analysis by Günay and co-workers.205 The collected data were first examined by exploratory analysis using box-and-whisker

Figure 27. Chemoinformatics-guided optimization protocol. (A) Generation of a large in silico library of catalyst candidates. (B) Calculation of
robust chemical descriptors. (C) Selection of a universal training set (UTS). (D) Acquisition of experimental selectivity data. (E) Application of
ML to use moderate- to low-selectivity reactions to predict high-selectivity reactions. Reproduced with permission from ref 226. Copyright 2019
American Association for the Advancement of Science.
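
Selectivity models of the kind outlined in Figure 27 are commonly trained on the free energy difference between the competing transition states (ΔΔG⧧) rather than on the enantiomeric excess itself. Under the usual assumption of kinetic control, the enantiomeric ratio is er = exp(ΔΔG⧧/RT), so that ee = (er − 1)/(er + 1) = tanh(ΔΔG⧧/2RT). The sketch below implements this conversion in both directions; the function names and the default temperature of 298.15 K are illustrative choices.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ee_from_ddg(ddg_kcal, temperature_k=298.15):
    """Enantiomeric excess as a fraction (0..1) from ddG in kcal/mol."""
    return math.tanh(ddg_kcal / (2.0 * R_KCAL * temperature_k))


def ddg_from_ee(ee_fraction, temperature_k=298.15):
    """ddG in kcal/mol from enantiomeric excess as a fraction (|ee| < 1)."""
    return 2.0 * R_KCAL * temperature_k * math.atanh(ee_fraction)


ee_1kcal = ee_from_ddg(1.0)  # roughly 0.69, i.e. about 69% ee at 298 K
ddg_99 = ddg_from_ee(0.99)   # roughly 3.1 kcal/mol
```

The saturating shape of the tanh relation is one reason for working in ΔΔG⧧: differences among highly selective catalysts are compressed near 100% ee but remain well separated on the free energy scale.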

plots. Subsequently, decision tree analysis was carried out to identify the importance of the variables and conditions that lead to higher reduction efficiencies and rates as well as selectivities for product formation. It was found that Cu contents less than 71% are the main requirement to realize high Faradaic efficiency while Cu contents less than 13% are essential for high production rates. The topmost factor determining the selectivity was found to be the Sn content. The most important variables for obtaining high Faradaic efficiency were also identified to be the Cu content, applied potential, TGP-H-60-type diffusion layer, filter-press cell, Nafion 117-type membrane, and KHCO3 in the anolyte and catholyte. The CO2 flow rate and applied potential were identified to be the most highly significant input variables to maximize the production rate. Moreover, Cu contents less than 52% and Sn contents higher than 15% were found to be the most important factors for obtaining the highest rate and selectivity for formic acid production.

7.7. Oxygen Evolution Reaction. Hong and co-workers demonstrated how statistical learning approaches can be utilized to investigate ambiguities in the reported descriptors for oxygen evolution reaction (OER) catalytic performance.206 The authors found through aggregation of 101 observations of 51 perovskites reported in the literature and their own experimental data that 14 descriptors of metal−oxygen bond strengths can be grouped into five basic descriptor families, including electrostatics, metal−oxygen covalency, structure, transition metal electron occupancy, and exchange interactions (Figure 26). Electron occupancy and covalency have the strongest influence on the OER catalytic performance. The charge-transfer energy (covalency) and the number of d electrons were found to be the most important, while other factors, including the M−O−M bond angle, optimum eg occupancy, and tolerance factor were determined to be secondary descriptors. Figure 26c contains estimates of the relative OER activities made using the best-performing model (least angle regression, LAR) for ABO3 perovskites from the Materials Project. More recently, Palkovits and Palkovits also developed ML models using literature data to forecast OER catalysts.207 They used a data set consisting of more than 6000 catalysts with four features represented by composition to predict overpotentials at 10 mA/cm2.

7.8. Photocatalytic Water Splitting. ML techniques could serve as a powerful tool in the field of photocatalysis.208 The Yildirim group constructed and analyzed a database containing 540 cases from 151 published articles on photocatalytic water splitting reactions over perovskites using data mining.209 Important factors to achieve efficient H2 generation were revealed by association rule mining. Decision tree analysis also identified useful heuristics for future investigations. Moreover, they developed predictive models using random forest regression.

8. ML PREDICTION OF EXPERIMENTAL DATA FOR HOMOGENEOUS CATALYSTS

Homogeneous catalysts, generally organocatalysts and organometallic catalysts, play essential roles in industry, especially for promoting synthetically useful reactions in a highly selective manner. Most organocatalysts and organometallic catalysts have been discovered through serendipity or trial-and-error studies rather than by rational design, which is often too resource intensive and intellectually frustrating. As a result of

Figure 28. (a) General catalytic cycle for C−C cross-coupling reactions. (b) Reference volcano plot for the reaction. Region I corresponds to
reductive elimination, region II to transmetalation, and region III to oxidative addition. Good catalyst candidates should fall into the middle region.
(c) Database having 25 116 molecular transition metal catalyst candidates. Each complex comprises one out of six transition metals and a
combination of two out of 91 ligands. Each ligand was written in SMILES notation, and all possible L1−M−L2 combinations were constructed.
SMILES strings were then converted into Cartesian coordinates. (d) Histogram ranking of the five most identified ligands that appear on the
volcano plateau. The histogram is scaled relative to the Pd/oxazole ligand combination. (e) Estimated prices of the catalysts (for 1 mmol in U.S.
dollars) in the selected range of −32.1 to −23.0 kcal/mol. From ref 227. CC BY-NC 3.0.
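
The size of the candidate database in Figure 28c can be reproduced with a short enumeration. The sketch assumes that the two ligands form an unordered pair and that L1 = L2 is allowed, which is the counting convention that yields the reported 25 116 complexes; the metal labels are placeholders because the six metals are not named in the caption.

```python
from itertools import combinations_with_replacement

N_LIGANDS = 91
# Placeholder labels: the caption states six transition metals without naming them.
METALS = [f"M{i}" for i in range(1, 7)]

# Unordered ligand pairs, allowing L1 == L2 (assumed counting convention).
ligand_pairs = list(combinations_with_replacement(range(N_LIGANDS), 2))

# Every L1-M-L2 combination, stored as (ligand index, metal label, ligand index).
candidates = [(l1, metal, l2) for metal in METALS for (l1, l2) in ligand_pairs]
n_candidates = len(candidates)  # 6 * (91 * 92 / 2)
```

In a real workflow each ligand index would map to a SMILES string, and each tuple would be converted to a 3D structure before descriptors are computed.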

the drawbacks of these approaches, increasing attention has been given to the development of predictive strategies that can provide insights into the connection between structure and function. To predict reactivity trends (linear free energy relationships) of organic reactions, semiquantitative descriptors have been established that are based on steric and electronic properties of the substituents (i.e., Hammett and related substituent parameters).210 Because the outcomes of reactions catalyzed by transition metal complexes strongly depend on the nature of the ligands, well-known ligand descriptors such as cone angles and electronic parameters have been established.211 As discussed in excellent reviews by Fey and co-workers,212−214 modern organometallic chemistry has established DFT-derived ligand descriptors to identify how steric and electronic properties affect the properties of complexes. Because homogeneous catalysts are generally well-defined molecules composed of transition metal centers and ligands (organometallic complexes), DFT calculations of descriptors for homogeneous catalysis are simpler than those for heterogeneous catalysts. Accumulated experimental data derived from ligand screening of organometallic catalysts in combination with calculated ligand descriptors serve as databases for the learning phase of ML-assisted catalyst design. Sigman,215−219 Fey,212−214 Rothenberg,220−222 and others223,224 have extensively utilized QSAR to correlate input data (ligand descriptors and reaction conditions) and output data (conversion, yield, selectivity, etc.) using conventional algorithms such as PLS, multivariate linear regression (MLR), PCA, GA, SVM, and artificial neural networks. These authors demonstrated the success of predictive modeling of homogeneous catalysts. Very recent progress has been made in this area as a result of the increasing availability of descriptor-computation methods and modern ML algorithms. This section discusses several examples of efforts aimed at the design of homogeneous catalysts using ML models together with DFT or simple computationally derived descriptors.

8.1. Organocatalysis. Enantioselective organocatalysis is a useful method for the synthesis of enantiomerically pure or enriched chiral compounds. Enantioselectivity in these processes arises from a difference in free energy between competing transition states leading to different enantiomers (ΔΔG⧧). Various factors, including electronic and steric interactions between the catalyst and substrate, are important in governing ΔΔG⧧ and thus the level of enantioselectivity. However, the effect of these factors on ΔΔG⧧ is not yet quantitatively predictable or fully understood, making data-intensive discovery of new enantioselective organocatalysts challenging.

Sigman et al. developed MLR models for predicting the enantiomeric excess (% ee) for enantioselective cross-dehydrogenative C−N coupling reactions of various amide-group-containing substrates promoted by various chiral phosphoric acid derivatives as catalysts.225 Among the large number of steric and vibrational terms initially examined, four were statistically selected as molecular descriptors for which ΔΔG⧧ predictions were validated experimentally. The results demonstrated the success of a data-intensive method for deriving a mechanistic model for predicting enantioselectivities of organocatalyst-promoted reactions.

Zahrt et al. developed averaged steric occupancy (ASO) descriptors of catalyst structures, which include 3D information relevant to the induction of enantioselectivity (Figure 27).226 They demonstrated accurate predictive modeling of enantioselectivities (ΔΔG⧧) in N,S-acetal-forming processes with SVM and deep feed-forward neural networks using ASOs as readily calculable descriptors for large compound libraries. These examples serve to demonstrate the potential of data-driven approaches for optimization of chiral organocatalysts.
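
The models in Section 8.1 are fitted to ΔΔG⧧, whereas experiments report % ee, so it can help to see how the two quantities map onto each other. The following is a minimal sketch, assuming kinetic control and a simple two-state picture in which the enantiomeric ratio equals exp(ΔΔG⧧/RT). The function names and the default temperature of 298.15 K are illustrative assumptions and are not taken from the cited studies.

```python
import math

# Gas constant in kcal/(mol*K); energies below are in kcal/mol.
R_KCAL = 1.987204e-3


def ee_from_ddg(ddg_kcal, temp_k=298.15):
    """Return % ee for a given ddG (kcal/mol).

    Assumes er = exp(ddG / RT), so ee = (er - 1) / (er + 1) = tanh(ddG / 2RT).
    """
    return 100.0 * math.tanh(ddg_kcal / (2.0 * R_KCAL * temp_k))


def ddg_from_ee(ee_percent, temp_k=298.15):
    """Inverse mapping: ddG = RT * ln((100 + ee) / (100 - ee))."""
    if not -100.0 < ee_percent < 100.0:
        raise ValueError("ee must lie strictly between -100 and 100 %")
    er = (100.0 + ee_percent) / (100.0 - ee_percent)
    return R_KCAL * temp_k * math.log(er)


if __name__ == "__main__":
    # Hypothetical % ee values, converted to the ddG scale used for regression.
    for ee in (50.0, 90.0, 99.0):
        print(f"{ee:5.1f} % ee  ->  ddG = {ddg_from_ee(ee):.2f} kcal/mol")
```

Because the mapping is strongly nonlinear near 100% ee, regressing on ΔΔG⧧ rather than on % ee directly keeps the target on an energy scale that is more plausibly linear in steric and electronic descriptors.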

Figure 29. Tungsten-catalyzed epoxidation reactions of 1-octene (1) using previously unverified phosphonic acid derivatives as catalysts. (a)
Comparison of predicted and experimental yields. (b) Plot of experimental vs predicted yields. Reproduced with permission from ref 228.
Copyright 2018 The Chemical Society of Japan.

Figure 30. Flowchart for CatVS. User-defined substrates are merged into a reaction template together with ligands from a predefined library,
followed by an automated conformational search using TSFF parameters and Boltzmann averaging of the free energies calculated for the ensembles
of the diastereomeric transition states. FF, force field. Reproduced with permission from ref 230. Copyright 2018 Springer Nature.
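
The Boltzmann averaging step named in the Figure 30 caption can be illustrated numerically. The sketch below uses one common, generic formulation and is not the CatVS implementation: each diastereomeric transition-state ensemble is collapsed into an effective free energy, G_eff = −RT ln Σ_i exp(−G_i/RT), and the difference between the two ensembles gives ΔΔG⧧. The function names, the conformer energies, and the temperature are hypothetical.

```python
import math

# Gas constant in kcal/(mol*K); energies below are in kcal/mol.
R_KCAL = 1.987204e-3


def ensemble_free_energy(energies_kcal, temp_k=298.15):
    """Effective free energy of a conformer ensemble: -RT ln(sum exp(-G_i/RT))."""
    if not energies_kcal:
        raise ValueError("need at least one conformer energy")
    rt = R_KCAL * temp_k
    g_min = min(energies_kcal)
    # Shift by the lowest energy so the exponentials stay well conditioned.
    z = sum(math.exp(-(g - g_min) / rt) for g in energies_kcal)
    return g_min - rt * math.log(z)


def ddg_between_ensembles(ts_major, ts_minor, temp_k=298.15):
    """ddG = G_eff(minor pathway) - G_eff(major pathway)."""
    return ensemble_free_energy(ts_minor, temp_k) - ensemble_free_energy(ts_major, temp_k)


if __name__ == "__main__":
    # Hypothetical relative TS energies for the two diastereomeric pathways.
    major = [0.0, 0.4, 1.1]
    minor = [1.8, 2.0, 2.9]
    print(f"ddG = {ddg_between_ensembles(major, minor):.2f} kcal/mol")
```

Low-lying conformers dominate the sum, so an ensemble with several accessible conformers has a lower effective free energy than its single lowest conformer alone.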

8.2. Organometallic Catalysis. Rothenberg et al. devised a QSAR approach for selecting catalysts and reaction conditions for promoting Heck cross-coupling reactions.220 Correlations between catalyst precursors, ligands, substrates, and reaction conditions in a data set of 412 Heck cross-coupling reactions based on a set of defined and calculated steric and electronic descriptors were analyzed using various regression methods. Because of the complex nature of the Heck cross-coupling reaction, nonlinear methods such as classification trees and neural networks were found to be superior to simple linear regression methods in modeling and predicting the catalytic performance. The models were used to predict the catalytic activities of 60 000 combinations of virtual catalysts as well as reaction conditions.

Although volcano plots are frequently used to understand or predict trends in heterogeneous catalysts, their application to homogeneous catalysts is relatively rare. Corminboeuf et al. applied a volcano plot approach to ML-assisted prediction of homogeneous catalysts for C−C cross-coupling reactions (Figure 28).227 The computed energies for the oxidative addition step, ΔE(Rxn A), were machine-learned using simple molecular descriptors, including SMILES notation of ligands in vast libraries of Pd-, Pt-, Ni-, Cu-, Au-, and Ag-based organometallic catalysts combined with 91 ligands. Figure 28b shows the volcano relationship between the reaction free energy and ΔE(Rxn A), which indicates that good catalysts should fall into the region −32.1 kcal/mol < ΔE(Rxn A) < −23.0 kcal/mol. Out-of-sample ML predictions made using a total of 18 062 compounds led to the identification of 557 catalyst candidates falling into this ideal thermodynamic window. After applying an estimated price cutoff of <$10/mmol, the authors recommended 37 catalysts, including earth-abundant Cu and less common ligands as well as a large number of Pd−phosphine complexes.

Tungstate-catalyzed epoxidation reactions of alkenes with H2O2 are significantly accelerated by phosphonic acid cocatalysts. Yada et al. demonstrated an ML model (LASSO) for predicting the yields of epoxidation of 1-octene using 14 types of phosphonic acids (Figure 29).228 Vibrational and electronic descriptors of phosphonic acids (HOMO−LUMO
gaps, NBO charges, and frequencies and intensities of the IR modes, especially [P(O)(OH)2] vibrations) were calculated using DFT. The established model was used to predict yields of reactions promoted by previously unstudied phosphonic acids. The success of the model was verified by comparing predicted and experimental results for eight types of phosphonic acids.

The group headed by Doyle used two easily calculable ligand descriptors, percent buried volume and cone angle, to predict yields of Ni-catalyzed Suzuki coupling reactions of acetals with boronic acids to form benzylic ethers.229 The model suggested a new concept for ligand design in which remote steric hindrance is used to predict the effectiveness of a ligand.

Rosales et al. developed an automated method for rapid virtual screening of substrate and ligand libraries for asymmetric catalyzed processes (Figure 30).230 A quantum-guided molecular mechanics (Q2MM) method was used for automated calculation of transition state force fields, which were then used as descriptors for an ML model to predict the stereoselectivities of organic reactions. Predictive computational ligand selection was shown by using a virtual ligand screen of a library of diphosphine ligands for the Rh-catalyzed asymmetric hydrogenation of enamides. The high selectivities of predicted substrate−ligand combinations were verified experimentally. Predictive modeling was also applied to isomerization reactions of alkenes promoted by zipper Ru catalysts231 and catalytic regioselective difluorination reactions of alkenes.232

Recently, Amar et al. developed a new ML-based method for rational selection of solvents.233 Rational solvent selection has a large impact on the performance of catalytic processes and remains a formidable task in process development. The developed ML method enabled the rapid identification of promising solvents for asymmetric hydrogenation, outperforming those identified by human intuition in terms of diastereomeric excess and conversion. They found optimal operating conditions and successfully explored an idea that uses mixed solvents to attain a range of experimental space not accessible with pure solvents. More recently, Yamaguchi et al. found that a regression technique, molecular field analysis, is highly effective for designing molecules in asymmetric catalysis.234 The analysis using intermediate structures in an enantiodetermining step allowed for extraction and visualization of important 3D structural information to improve the enantioselectivity. In addition, they experimentally confirmed that the newly designed substrate obtained using their method exhibited improved enantioselectivity.

9. CONCLUSIONS AND FUTURE OUTLOOK
For the most part, discovering and optimizing catalysts/catalysis are still empirically driven. It is true that ML-assisted development of real catalysts remains in its infancy despite the fact that this approach has had success in molecular and materials science. Nevertheless, little doubt exists that ML will become a valuable addition to the toolkit for generating knowledge about and designing catalysts in the future. Not only first-principles-calculated data but also experimental data for specific catalytic reactions are required in order to use ML for practical discovery of new catalysts. This is especially true for heterogeneous catalysts, for which adequate theoretical models are currently unavailable. Thus, ML-predicted data cannot directly lead to the design of novel catalysts. Although “catalysis informatics” is highly related to materials informatics and chemoinformatics, it is distinguished by the fact that catalysis is a dynamic event controlled by the structures and chemical nature of catalytically active sites. Moreover, the time-dependent nature of catalysis makes data representation and analysis more challenging. It would be beneficial for future investigations in this area to aim at bridging the gap between simplified descriptor-based screening and complex real models in a manner that considers dynamics during catalytic events.

Although high-throughput or combinatorial methods, which have been applied successfully to relevant fields including materials synthesis, serve as powerful tools to discover novel catalysts and catalytic processes,53,136,235−243 at the current time these experiments have not been explored fully. As a consequence, there is a lack of good and refined data to guide ML. Although text- and data-mining approaches also demonstrate promise to accelerate research in this area,87,188 catalytic reactions are not only highly dependent on specified experimental conditions such as temperature, reactant concentration, and flow rate but also affected by unstated conditions such as the shape and volume of the reactor. Also, these approaches tend to be biased toward successful examples reported previously. Direct use of the ML technique could result in discovery of only a narrow range of finely tuned variations of previously studied catalysts because ML builds predictive models that are representative of the given training data. Therefore, published data might not serve as good training sets for ML predictions. In this sense, experimental data need to be generated under the same or at least comparable reaction conditions. Also, coupling data science with theoretical and experimental approaches could lead to new methods for discovering catalysts to ensure “closing the loop”.

Another aim of catalysis research is to establish more accurate and lower-computational-cost calculations of transition states for catalyzed reactions. Although calculations of this type are generally performed using first-principles electronic structure methods and as a result are computationally expensive, use of ML can significantly reduce costs.244−251 In addition, the artificial-force-induced reaction (AFIR) methodology in the global reaction route mapping strategy can be employed for automatic and unbiased searches of complex reaction pathways.252,253 The AFIR approach locates local minima and transition states of reaction paths without guessing. Therefore, it can find both unanticipated and anticipated reaction pathways. Ultimately, integration of these methods with existing chemical and physical models should greatly accelerate the automated design, discovery, and optimization of catalysts as well as catalytic processes and lead to the establishment of “catalysis informatics”.

■ AUTHOR INFORMATION
Corresponding Authors
*E-mail: ichigaku.takigawa@riken.jp (I.T.).
*E-mail: kshimizu@cat.hokudai.ac.jp (K.S.).
ORCID
Takashi Toyao: 0000-0002-6062-5622
Takashi Kamachi: 0000-0001-9281-0454
Ichigaku Takigawa: 0000-0001-5633-995X
Ken-ichi Shimizu: 0000-0003-0501-0294
Notes
The authors declare no competing financial interest.

■ ACKNOWLEDGMENTS
The preparation of this review and studies described that were carried out by the authors were supported by the Japanese Ministry of Education, Culture, Sports, Science, and Technology (MEXT) within the projects Elements Strategy Initiative To Form Core Research Center and Integrated Research Consortium on Chemical Sciences (IRCCS) and by JST-CREST Project JPMJCR17J3.

■ REFERENCES
(1) Schlögl, R. Heterogeneous Catalysis. Angew. Chem., Int. Ed. 2015, 54, 3465−3520.
(2) Nørskov, J. K.; Bligaard, T.; Rossmeisl, J.; Christensen, C. H. Towards the Computational Design of Solid Catalysts. Nat. Chem. 2009, 1, 37−46.
(3) Matera, S.; Schneider, W. F.; Heyden, A.; Savara, A. Progress in Accurate Chemical Kinetic Modeling, Simulations, and Parameter Estimation for Heterogeneous Catalysis. ACS Catal. 2019, 9, 6624−6647.
(4) Ras, E. J.; Rothenberg, G. Heterogeneous Catalyst Discovery Using 21st Century Tools: A Tutorial. RSC Adv. 2014, 4, 5963−5974.
(5) van Santen, R. A.; Tranca, I.; Hensen, E. J. M. Theory of Surface Chemistry and Reactivity of Reducible Oxides. Catal. Today 2015, 244, 63−84.
(6) Liu, L.; Corma, A. Metal Catalysts for Heterogeneous Catalysis: From Single Atoms to Nanoclusters and Nanoparticles. Chem. Rev. 2018, 118, 4981−5079.
(7) Bruix, A.; Margraf, J. T.; Andersen, M.; Reuter, K. First-Principles-Based Multiscale Modelling of Heterogeneous Catalysis. Nat. Catal. 2019, 2, 659−670.
(8) Ishikawa, S.; Zhang, Z.; Ueda, W. Unit Synthesis Approach for Creating High Dimensionally Structured Complex Metal Oxides as Catalysts for Selective Oxidations. ACS Catal. 2018, 8, 2935−2943.
(9) Sehested, J. Industrial and Scientific Directions of Methanol Catalyst Development. J. Catal. 2019, 371, 368−375.
(10) Pelletier, J. D. A.; Basset, J. M. Catalysis by Design: Well-Defined Single-Site Heterogeneous Catalysts. Acc. Chem. Res. 2016, 49, 664−677.
(11) Toyao, T.; Hakim Siddiki, S. M. A.; Kon, K.; Shimizu, K. The Catalytic Reduction of Carboxylic Acid Derivatives and CO2 by Metal Nanoparticles on Lewis-Acidic Supports. Chem. Rec. 2018, 18, 1374−1393.
(12) Grajciar, L.; Heard, C. J.; Bondarenko, A. A.; Polynski, M. V.; Meeprasert, J.; Pidko, E. A.; Nachtigall, P. Towards Operando Computational Modeling in Heterogeneous Catalysis. Chem. Soc. Rev. 2018, 47, 8307−8348.
(13) van Santen, R. A.; Neurock, M.; Shetty, S. G. Reactivity Theory of Transition-Metal Surfaces: A Brønsted−Evans−Polanyi Linear Activation Energy−Free-Energy Analysis. Chem. Rev. 2010, 110, 2005−2048.
(14) Nørskov, J. K.; Abild-Pedersen, F.; Studt, F.; Bligaard, T. Density Functional Theory in Surface Chemistry and Catalysis. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 937−943.
(15) Hammer, B.; Nørskov, J. Theoretical Surface Science and Catalysis. Adv. Catal. 2000, 45, 71.
(16) Butler, K. T.; Davies, D. W.; Cartwright, H.; Isayev, O.; Walsh, A. Machine Learning for Molecular and Materials Science. Nature 2018, 559, 547−555.
(17) Butler, K. T.; Frost, J. M.; Skelton, J. M.; Svane, K. L.; Walsh, A. Computational Materials Design of Crystalline Solids. Chem. Soc. Rev. 2016, 45, 6138−6146.
(18) Fujinami, M.; Seino, J.; Nukazawa, T.; Ishida, S.; Iwamoto, T.; Nakai, H. Virtual Reaction Condition Optimization Based on Machine Learning for a Small Number of Experiments in High-Dimensional Continuous and Discrete Variables. Chem. Lett. 2019, 48, 961−964.
(19) Szymkuć, S.; Gajewska, E. P.; Klucznik, T.; Molga, K.; Dittwald, P.; Startek, M.; Bajczyk, M.; Grzybowski, B. A. Computer-Assisted Synthetic Planning: The End of the Beginning. Angew. Chem., Int. Ed. 2016, 55, 5904−5937.
(20) Segler, M. H. S.; Preuss, M.; Waller, M. P. Planning Chemical Syntheses with Deep Neural Networks and Symbolic AI. Nature 2018, 555, 604−610.
(21) Coley, C. W.; Thomas, D. A.; Lummiss, J. A. M.; Jaworski, J. N.; Breen, C. P.; Schultz, V.; Hart, T.; Fishman, J. S.; Rogers, L.; Gao, H.; Hicklin, R. W.; Plehiers, P. P.; Byington, J.; Piotti, J. S.; Green, W. H.; Hart, A. J.; Jamison, T. F.; Jensen, K. F. A Robotic Platform for Flow Synthesis of Organic Compounds Informed by AI Planning. Science 2019, 365, eaax1566.
(22) Tabor, D. P.; Roch, L. M.; Saikin, S. K.; Kreisbeck, C.; Sheberla, D.; Montoya, J. H.; Dwaraknath, S.; Aykol, M.; Ortiz, C.; Tribukait, H.; Amador-Bedolla, C.; Brabec, C. J.; Maruyama, B.; Persson, K. A.; Aspuru-Guzik, A. Accelerating the Discovery of Materials for Clean Energy in the Era of Smart Automation. Nat. Rev. Mater. 2018, 3, 5−20.
(23) Sanchez-Lengeling, B.; Aspuru-Guzik, A. Inverse Molecular Design Using Machine Learning: Generative Models for Matter Engineering. Science 2018, 361, 360−36.
(24) Zunger, A. Inverse Design in Search of Materials with Target Functionalities. Nat. Rev. Chem. 2018, 2, 0121.
(25) Ward, L.; Agrawal, A.; Choudhary, A.; Wolverton, C. A General-Purpose Machine Learning Framework for Predicting Properties of Inorganic Materials. npj Comput. Mater. 2016, 2, 16028.
(26) Ramprasad, R.; Batra, R.; Pilania, G.; Mannodi-Kanakkithodi, A.; Kim, C. Machine Learning in Materials Informatics: Recent Applications and Prospects. npj Comput. Mater. 2017, 3, 54.
(27) Rajan, K. Materials Informatics: The Materials “Gene” and Big Data. Annu. Rev. Mater. Res. 2015, 45, 153−169.
(28) Nandy, A.; Duan, C.; Janet, J. P.; Gugler, S.; Kulik, H. J. Strategies and Software for Machine Learning Accelerated Discovery in Transition Metal Chemistry. Ind. Eng. Chem. Res. 2018, 57, 13973−13986.
(29) Duan, C.; Janet, J. P.; Liu, F.; Nandy, A.; Kulik, H. J. Learning from Failure: Predicting Electronic Structure Calculation Outcomes with Machine Learning Models. J. Chem. Theory Comput. 2019, 15, 2331−2345.
(30) Le, T.; Epa, V. C.; Burden, F. R.; Winkler, D. A. Quantitative Structure−Property Relationship Modeling of Diverse Materials Properties. Chem. Rev. 2012, 112, 2889−2919.
(31) Rupp, M.; Tkatchenko, A.; Müller, K.-R.; von Lilienfeld, O. A. Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning. Phys. Rev. Lett. 2012, 108, 058301.
(32) Pilania, G.; Wang, C.; Jiang, X.; Rajasekaran, S.; Ramprasad, R. Accelerating Materials Property Predictions Using Machine Learning. Sci. Rep. 2013, 3, 2810.
(33) Meredig, B.; Agrawal, A.; Kirklin, S.; Saal, J. E.; Doak, J. W.; Thompson, A.; Zhang, K.; Choudhary, A.; Wolverton, C. Combinatorial Screening for New Materials in Unconstrained Composition Space with Machine Learning. Phys. Rev. B: Condens. Matter Mater. Phys. 2014, 89, 094104.
(34) Faber, F.; Lindmaa, A.; Von Lilienfeld, O. A.; Armiento, R. Crystal Structure Representations for Machine Learning Models of Formation Energies. Int. J. Quantum Chem. 2015, 115, 1094−1101.
(35) Ubaru, S.; Miȩdlar, A.; Saad, Y.; Chelikowsky, J. R. Formation Enthalpies for Transition Metal Alloys Using Machine Learning. Phys. Rev. B: Condens. Matter Mater. Phys. 2017, 95, 214102.
(36) Schütt, K. T.; Glawe, H.; Brockherde, F.; Sanna, A.; Müller, K. R.; Gross, E. K. U. How to Represent Crystal Structures for Machine Learning: Towards Fast Prediction of Electronic Properties. Phys. Rev. B: Condens. Matter Mater. Phys. 2014, 89, 205118.
(37) Dey, P.; Bible, J.; Datta, S.; Broderick, S.; Jasinski, J.; Sunkara, M.; Menon, M.; Rajan, K. Informatics-Aided Bandgap Engineering for Solar Materials. Comput. Mater. Sci. 2014, 83, 185−195.
(38) Lee, J.; Seko, A.; Shitara, K.; Nakayama, K.; Tanaka, I. Prediction Model of Band Gap for Inorganic Compounds by Combination of Density Functional Theory Calculations and
Machine Learning Techniques. Phys. Rev. B: Condens. Matter Mater. Phys. 2016, 93, 115104.
(39) Pilania, G.; Mannodi-Kanakkithodi, A.; Uberuaga, B. P.; Ramprasad, R.; Gubernatis, J. E.; Lookman, T. Machine Learning Bandgaps of Double Perovskites. Sci. Rep. 2016, 6, 19375.
(40) Legrain, F.; Carrete, J.; van Roekeghem, A.; Curtarolo, S.; Mingo, N. How the Chemical Composition Alone Can Predict Vibrational Free Energies and Entropies of Solids. Chem. Mater. 2017, 29, 6220−6227.
(41) Seko, A.; Maekawa, T.; Tsuda, K.; Tanaka, I. Machine Learning with Systematic Density-Functional Theory Calculations: Application to Melting Temperatures of Single- and Binary-Component Solids. Phys. Rev. B: Condens. Matter Mater. Phys. 2014, 89, 054303.
(42) Goldsmith, B. R.; Esterhuizen, J.; Bartel, C. J.; Sutton, C.; Liu, J.-X. Machine Learning for Heterogeneous Catalyst Design and Discovery. AIChE J. 2018, 64, 2311−2323.
(43) Hattori, T.; Kito, S.; Murakami, Y. Integration of Catalyst Activity Pattern (INCAP) Artificial Intelligence Approach in Catalyst Design. Chem. Lett. 1988, 17, 1269−1272.
(44) Hattori, T.; Kito, S. Artificial Intelligence Approach to Catalyst Design. Catal. Today 1991, 10, 213−222.
(45) Kito, S.; Hattori, T.; Murakami, Y. Estimation of the Acid Strength of Mixed Oxides by a Neural Network. Ind. Eng. Chem. Res. 1992, 31, 979−981.
(46) Hattori, T.; Kito, S. Neural Network as a Tool for Catalyst Development. Catal. Today 1995, 23, 347−355.
(47) Medford, A. J.; Kunz, M. R.; Ewing, S. M.; Borders, T.; Fushimi, R. Extracting Knowledge from Data through Catalysis Informatics. ACS Catal. 2018, 8, 7403−7429.
(48) Takahashi, K.; Takahashi, L.; Miyazato, I.; Fujima, J.; Tanaka, Y.; Uno, T.; Satoh, H.; Ohno, K.; Nishida, M.; Hirai, K.; Ohyama, J.; Nguyen, T. N.; Nishimura, S.; Taniike, T. The Rise of Catalyst Informatics: Towards Catalyst Genomics. ChemCatChem 2019, 11, 1146−1152.
(49) Kitchin, J. R. Machine Learning in Catalysis. Nat. Catal. 2018, 1, 230−232.
(50) Li, Z.; Wang, S.; Xin, H. Toward Artificial Intelligence in Catalysis. Nat. Catal. 2018, 1, 641−642.
(51) Schlexer Lamoureux, P.; Winther, K.; Garrido Torres, J. A.; Streibel, V.; Zhao, M.; Bajdich, M.; Abild-Pedersen, F.; Bligaard, T. Machine Learning for Computational Heterogeneous Catalysis. ChemCatChem 2019, 11, 3581−3601.
(52) Freeze, J. G.; Kelly, H. R.; Batista, V. S. Search for Catalysts by Inverse Design: Artificial Intelligence, Mountain Climbers, and Alchemists. Chem. Rev. 2019, 119, 6595−6612.
(53) Caruthers, J. M.; Lauterbach, J. A.; Thomson, K. T.; Venkatasubramanian, V.; Snively, C. M.; Bhan, A.; Katare, S.; Oskarsdottir, G. Catalyst Design: Knowledge Extraction from High-Throughput Experimentation. J. Catal. 2003, 216, 98−109.
(54) Hand, D. J. Data Science for Financial Applications. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’18); ACM: New York, 2018.
(55) Domingos, P. A Few Useful Things To Know about Machine Learning. Commun. ACM 2012, 55, 78−87.
(56) Wujek, B.; Hall, P.; Güneş, F. Best Practices for Machine Learning Applications; SAS Paper SAS2360-2016; SAS Institute: Cary, NC, 2016. https://support.sas.com/resources/papers/proceedings16/SAS2360-2016.pdf.
(57) Kaufman, S.; Rosset, S.; Perlich, C. Leakage in Data Mining: Formulation, Detection, and Avoidance. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11); ACM: New York, 2011; pp 556−563.
(58) Takigawa, I.; Shimizu, K.; Tsuda, K.; Takakusagi, S. Machine-Learning Prediction of the d-Band Center for Metals and Bimetals. RSC Adv. 2016, 6, 52587−52595.
(59) Takigawa, I.; Shimizu, K.; Tsuda, K.; Takakusagi, S. Machine Learning Predictions of Factors Affecting the Activity of Heterogeneous Metal Catalysts. Nanoinformatics 2018, 45−64.
(60) Caruana, R.; Karampatziakis, N.; Yessenalina, A. An Empirical Evaluation of Supervised Learning in High Dimensions. In Proceedings of the 25th International Conference on Machine Learning (ICML ’08); ACM: New York, 2008; pp 96−103.
(61) Fernández-Delgado, M.; Cernadas, E.; Barro, S.; Amorim, D. Do We Need Hundreds of Classifiers to Solve Real World Classification Problems? J. Mach. Learn. Res. 2014, 15, 3133−3181.
(62) Nguyen, A.; Yosinski, J.; Clune, J. Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images. In 2015 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR); IEEE, 2015.
(63) Cherkasov, A.; Muratov, E. N.; Fourches, D.; Varnek, A.; Baskin, I. I.; Cronin, M.; Dearden, J.; Gramatica, P.; Martin, Y. C.; Todeschini, R.; Consonni, V.; Kuz’min, V. E.; Cramer, R.; Benigni, R.; Yang, C.; Rathman, J.; Terfloth, L.; Gasteiger, J.; Richard, A.; Tropsha, A. QSAR Modeling: Where Have You Been? Where Are You Going To? J. Med. Chem. 2014, 57, 4977−5010.
(64) Tropsha, A. Best Practices for QSAR Model Development, Validation, and Exploitation. Mol. Inf. 2010, 29, 476−488.
(65) Gromski, P. S.; Henson, A. B.; Granda, J. M.; Cronin, L. How to Explore Chemical Space Using Algorithms and Automation. Nat. Rev. Chem. 2019, 3, 119−128.
(66) Suzuki, K.; Toyao, T.; Maeno, Z.; Takakusagi, S.; Shimizu, K.; Takigawa, I. Statistical Analysis and Discovery of Heterogeneous Catalysts Based on Machine Learning from Diverse Published Data. ChemCatChem 2019, 11, 4537−4547.
(67) Ghahramani, Z. Probabilistic Machine Learning and Artificial Intelligence. Nature 2015, 521, 452−459.
(68) Noh, J.; Kim, J.; Stein, H. S.; Sanchez-Lengeling, B.; Gregoire, J. M.; Aspuru-Guzik, A.; Jung, Y. Inverse Design of Solid-State Materials via a Continuous Representation. Matter 2019, 1, 1370−1384.
(69) Box, G. E. P. Use and Abuse of Regression. Technometrics 1966, 8, 625−629.
(70) Rücker, C.; Rücker, G.; Meringer, M. Y-Randomization and Its Variants in QSPR/QSAR. J. Chem. Inf. Model. 2007, 47, 2345−2357.
(71) Hill, J.; Mulholland, G.; Persson, K.; Seshadri, R.; Wolverton, C.; Meredig, B. Materials Science with Large-Scale Data and Informatics: Unlocking New Opportunities. MRS Bull. 2016, 41, 399−409.
(72) Belsky, A.; Hellenbrandt, M.; Karen, V. L.; Luksch, P. New Developments in the Inorganic Crystal Structure Database (ICSD): Accessibility in Support of Materials Research and Design. Acta Crystallogr., Sect. B: Struct. Sci. 2002, 58, 364−369.
(73) Xu, Y.; Yamazaki, M.; Villars, P. Inorganic Materials Database for Exploring the Nature of Material. Jpn. J. Appl. Phys. 2011, 50, 11RH02.
(74) Jain, A.; Ong, S. P.; Hautier, G.; Chen, W.; Richards, W. D.; Dacek, S.; Cholia, S.; Gunter, D.; Skinner, D.; Ceder, G.; Persson, K. A. Commentary: The Materials Project: A Materials Genome Approach to Accelerating Materials Innovation. APL Mater. 2013, 1, 011002.
(75) Saal, J. E.; Kirklin, S.; Aykol, M.; Meredig, B.; Wolverton, C. Materials Design and Discovery with High-Throughput Density Functional Theory: The Open Quantum Materials Database (OQMD). JOM 2013, 65, 1501−1509.
(76) Curtarolo, S.; Setyawan, W.; Wang, S.; Xue, J.; Yang, K.; Taylor, R. H.; Nelson, L. J.; Hart, G. L. W.; Sanvito, S.; Buongiorno-Nardelli, M.; Mingo, N.; Levy, O. AFLOWLIB.ORG: A Distributed Materials Properties Repository from High-Throughput Ab Initio Calculations. Comput. Mater. Sci. 2012, 58, 227−235.
(77) Wicker, J. G. P.; Cooper, R. I. Will It Crystallise? Predicting Crystallinity of Molecular Materials. CrystEngComm 2015, 17, 1927−1934.
(78) Bassman, L.; Rajak, P.; Kalia, R. K.; Nakano, A.; Sha, F.; Sun, J.; Singh, D. J.; Aykol, M.; Huck, P.; Persson, K.; Vashishta, P. Active Learning for Accelerated Design of Layered Materials. npj Comput. Mater. 2018, 4, 74.


(79) Graser, J.; Kauwe, S. K.; Sparks, T. D. Machine Learning and Approach to Zeolite Synthesis Enabled by Automatic Literature Data
Energy Minimization Approaches for Crystal Structure Predictions: A Extraction. ACS Cent. Sci. 2019, 5, 892−899.
Review and New Horizons. Chem. Mater. 2018, 30, 3601−3612. (99) Raccuglia, P.; Elbert, K. C.; Adler, P. D. F.; Falk, C.; Wenny, M.
(80) Jha, D.; Ward, L.; Paul, A.; Liao, W.-k.; Choudhary, A.; B.; Mollo, A.; Zeller, M.; Friedler, S. A.; Schrier, J.; Norquist, A. J.
Wolverton, C.; Agrawal, A. ElemNet: Deep Learning the Chemistry of Machine-Learning-Assisted Materials Discovery Using Failed Experi-
Materials From Only Elemental Composition. Sci. Rep. 2018, 8, ments. Nature 2016, 533, 73−76.
17593. (100) Vayenas, C. G.; Bebelis, S.; Ladas, S. Dependence of Catalytic
(81) Aykol, M.; Hegde, V. I.; Hung, L.; Suram, S.; Herring, P.; Rates on Catalyst Work Function. Nature 1990, 343, 625−627.
Wolverton, C.; Hummelshøj, J. S. Network Analysis of Synthesizable (101) Shen, X.; Pan, Y.; Liu, B.; Yang, J.; Zeng, J.; Peng, Z. More
Materials Discovery. Nat. Commun. 2019, 10, 2018. Accurate Depiction of Adsorption Energy on Transition Metals Using
(82) Braham, E. J.; Cho, J.; Forlano, K. M.; Watson, D. F.; Arròyave, Work Function as One Additional Descriptor. Phys. Chem. Chem.
R.; Banerjee, S. Machine Learning-Directed Navigation of Synthetic Phys. 2017, 19, 12628−12632.
Design Space: A Statistical Learning Approach to Controlling the (102) Ruban, A.; Hammer, B.; Stoltze, P.; Skriver, H. L.; Nørskov, J.
Synthesis of Perovskite Halide Nanoplatelets in the Quantum- K. Surface Electronic Structure and Reactivity of Transition and
Confined Regime. Chem. Mater. 2019, 31, 3281−3292. Noble Metals. J. Mol. Catal. A: Chem. 1997, 115, 421−429.
(83) Oliynyk, A. O.; Mar, A. Discovery of Intermetallic Compounds (103) Calle-Vallejo, F.; Tymoczko, J.; Colic, V.; Vu, Q. H.; Pohl, M.
from Traditional to Machine-Learning Approaches. Acc. Chem. Res. D.; Morgenstern, K.; Loffreda, D.; Sautet, P.; Schuhmann, W.;
2018, 51, 59−68. Bandarenka, A. S. Finding Optimal Surface Sites on Heterogeneous
(84) Oliynyk, A. O.; Adutwum, L. A.; Harynuk, J. J.; Mar, A. Catalysts by Counting Nearest Neighbors. Science 2015, 350, 185−
Classifying Crystal Structures of Binary Compounds AB through 189.
Cluster Resolution Feature Selection and Support Vector Machine Analysis. Chem. Mater. 2016, 28, 6672−6681.
(85) Hautier, G.; Fischer, C. C.; Jain, A.; Mueller, T.; Ceder, G. Finding Nature’s Missing Ternary Oxide Compounds Using Machine Learning and Density Functional Theory. Chem. Mater. 2010, 22, 3762−3767.
(86) Kim, E.; Huang, K.; Tomala, A.; Matthews, S.; Strubell, E.; Saunders, A.; McCallum, A.; Olivetti, E. Machine-Learned and Codified Synthesis Parameters of Oxide Materials. Sci. Data 2017, 4, 170127.
(87) Kim, E.; Huang, K.; Saunders, A.; McCallum, A.; Ceder, G.; Olivetti, E. Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning. Chem. Mater. 2017, 29, 9436−9444.
(88) Caramelli, D.; Salley, D.; Henson, A.; Camarasa, G. A.; Sharabi, S.; Keenan, G.; Cronin, L. Networking Chemical Robots for Reaction Multitasking. Nat. Commun. 2018, 9, 3406.
(89) Duros, V.; Grizou, J.; Xuan, W.; Hosni, Z.; Long, D. L.; Miras, H. N.; Cronin, L. Human versus Robots in the Discovery and Crystallization of Gigantic Polyoxometalates. Angew. Chem., Int. Ed. 2017, 56, 10815−10820.
(90) Moliner, M.; Román-Leshkov, Y.; Corma, A. Machine Learning Applied to Zeolite Synthesis: The Missing Link for Realizing High-Throughput Discovery. Acc. Chem. Res. 2019, 52, 2971−2980.
(91) Muraoka, K.; Sada, Y.; Miyazaki, D.; Chaikittisilp, W.; Okubo, T. Linking Synthesis and Structure Descriptors from a Large Collection of Synthetic Records of Zeolite Materials. Nat. Commun. 2019, 10, 4459.
(92) Yang, S.; Lach-hab, M.; Vaisman, I. I.; Blaisten-Barojas, E. Identifying Zeolite Frameworks with a Machine Learning Approach. J. Phys. Chem. C 2009, 113, 21721−21725.
(93) Carr, D. A.; Lach-hab, M.; Yang, S.; Vaisman, I. I.; Blaisten-Barojas, E. Machine Learning Approach for Structure-Based Zeolite Classification. Microporous Mesoporous Mater. 2009, 117, 339−349.
(94) Lach-hab, M.; Yang, S.; Vaisman, I. I.; Blaisten-Barojas, E. Novel Approach for Clustering Zeolite Crystal Structures. Mol. Inf. 2010, 29, 297−301.
(95) Blay, V.; Yokoi, T.; González-Díaz, H. Perturbation Theory−Machine Learning Study of Zeolite Materials Desilication. J. Chem. Inf. Model. 2018, 58, 2414−2419.
(96) Evans, J. D.; Coudert, F. X. Predicting the Mechanical Properties of Zeolite Frameworks by Machine Learning. Chem. Mater. 2017, 29, 7833−7839.
(97) Daeyaert, F.; Ye, F.; Deem, M. W. Machine-Learning Approach to the Design of OSDAs for Zeolite Beta. Proc. Natl. Acad. Sci. U. S. A. 2019, 116, 3413−3418.
(98) Jensen, Z.; Kim, E.; Kwon, S.; Gani, T. Z. H.; Román-Leshkov, Y.; Moliner, M.; Corma, A.; Olivetti, E. A Machine Learning
(104) Zhuang, H.; Tkalych, A. J.; Carter, E. A. Surface Energy as a Descriptor of Catalytic Activity. J. Phys. Chem. C 2016, 120, 23698−23706.
(105) Kitchin, J. R.; Nørskov, J. K.; Barteau, M. A.; Chen, J. G. Role of Strain and Ligand Effects in the Modification of the Electronic and Chemical Properties of Bimetallic Surfaces. Phys. Rev. Lett. 2004, 93, 156801.
(106) Greeley, J. Theoretical Heterogeneous Catalysis: Scaling Relationships and Computational Catalyst Design. Annu. Rev. Chem. Biomol. Eng. 2016, 7, 605−635.
(107) Calle-Vallejo, F.; Loffreda, D.; Koper, M. T. M.; Sautet, P. Introducing Structural Sensitivity into Adsorption−Energy Scaling Relations by Means of Coordination Numbers. Nat. Chem. 2015, 7, 403−410.
(108) Kumar, G.; Lau, S. L. J.; Krcha, M. D.; Janik, M. J. Correlation of Methane Activation and Oxide Catalyst Reducibility and Its Implications for Oxidative Coupling. ACS Catal. 2016, 6, 1812−1821.
(109) Huang, B.; Xiao, L.; Lu, J.; Zhuang, L. Spatially Resolved Quantification of the Surface Reactivity of Solid Catalysts. Angew. Chem., Int. Ed. 2016, 55, 6239−6243.
(110) Capdevila-Cortada, M.; Vilé, G.; Teschner, D.; Pérez-Ramírez, J.; López, N. Reactivity Descriptors for Ceria in Catalysis. Appl. Catal., B 2016, 197, 299−312.
(111) Wang, P.; Fu, G.; Wan, H. How High Valence Transition Metal Spreads Its Activity over Nonmetal Oxoes: A Proof-of-Concept Study. ACS Catal. 2017, 7, 5544−5548.
(112) Studt, F.; Abild-Pedersen, F.; Bligaard, T.; Sørensen, R. Z.; Christensen, C. H.; Nørskov, J. K. Identification of Non-Precious Metal Alloy Catalysts for Selective Hydrogenation of Acetylene. Science 2008, 320, 1320−1322.
(113) Studt, F.; Sharafutdinov, I.; Abild-Pedersen, F.; Elkjær, C. F.; Hummelshøj, J. S.; Dahl, S.; Chorkendorff, I.; Nørskov, J. K. Discovery of a Ni-Ga Catalyst for Carbon Dioxide Reduction to Methanol. Nat. Chem. 2014, 6, 320−324.
(114) Fernández, E. M.; Moses, P. G.; Toftelund, A.; Hansen, H. A.; Martínez, J. I.; Abild-Pedersen, F.; Kleis, J.; Hinnemann, B.; Rossmeisl, J.; Bligaard, T.; Nørskov, J. K. Scaling Relationships for Adsorption Energies on Transition Metal Oxide, Sulfide, and Nitride Surfaces. Angew. Chem., Int. Ed. 2008, 47, 4683−4686.
(115) Vojvodic, A.; Hellman, A.; Ruberto, C.; Lundqvist, B. I. From Electronic Structure to Catalytic Activity: A Single Descriptor for Adsorption and Reactivity on Transition-Metal Carbides. Phys. Rev. Lett. 2009, 103, 146103.
(116) Calle-Vallejo, F.; Inoglu, N. G.; Su, H. Y.; Martínez, J. I.; Man, I. C.; Koper, M. T. M.; Kitchin, J. R.; Rossmeisl, J. Number of Outer Electrons as Descriptor for Adsorption Processes on Transition Metals and Their Oxides. Chem. Sci. 2013, 4, 1245−1249.
(117) Kakekhani, A.; Roling, L. T.; Kulkarni, A.; Latimer, A. A.; Abroshan, H.; Schumann, J.; Aljama, H.; Siahrostami, S.; Ismail-Beigi,


S.; Abild-Pedersen, F.; Nørskov, J. K. Nature of Lone-Pair-Surface Bonds and Their Scaling Relations. Inorg. Chem. 2018, 57, 7222−7238.
(118) Kamachi, T.; Tatsumi, T.; Toyao, T.; Hinuma, Y.; Maeno, Z.; Takakusagi, S.; Furukawa, S.; Takigawa, I.; Shimizu, K. Linear Correlations between Adsorption Energies and HOMO Levels for the Adsorption of Small Molecules on TiO2 Surfaces. J. Phys. Chem. C 2019, 123, 20988−20997.
(119) Ruiz Puigdollers, A.; Schlexer, P.; Tosoni, S.; Pacchioni, G. Increasing Oxide Reducibility: The Role of Metal/Oxide Interfaces in the Formation of Oxygen Vacancies. ACS Catal. 2017, 7, 6493−6513.
(120) Jia, J.; Qian, C.; Dong, Y.; Li, Y. F.; Wang, H.; Ghoussoub, M.; Butler, K. T.; Walsh, A.; Ozin, G. A. Heterogeneous Catalytic Hydrogenation of CO2 by Metal Oxides: Defect Engineering − Perfecting Imperfection. Chem. Soc. Rev. 2017, 46, 4631−4644.
(121) Lee, Y.-L.; Kleis, J.; Rossmeisl, J.; Yang, S.-H.; Morgan, D. Prediction of Solid Oxide Fuel Cell Cathode Activity with First-Principles Descriptors. Energy Environ. Sci. 2011, 4, 3966−3970.
(122) Curnan, M. T.; Kitchin, J. R. Effects of Concentration, Crystal Structure, Magnetism, and Electronic Structure Method on First-Principles Oxygen Vacancy Formation Energy Trends in Perovskites. J. Phys. Chem. C 2014, 118, 28776−28790.
(123) Staykov, A.; Téllez, H.; Akbay, T.; Druce, J.; Ishihara, T.; Kilner, J. Oxygen Activation and Dissociation on Transition Metal Free Perovskite Surfaces. Chem. Mater. 2015, 27, 8273−8281.
(124) Chen, C.; Ciucci, F. Designing Fe-Based Oxygen Catalysts by Density Functional Theory Calculations. Chem. Mater. 2016, 28, 7058−7065.
(125) Emery, A. A.; Saal, J. E.; Kirklin, S.; Hegde, V. I.; Wolverton, C. High-Throughput Computational Screening of Perovskites for Thermochemical Water Splitting Applications. Chem. Mater. 2016, 28, 5621−5634.
(126) Hinuma, Y.; Toyao, T.; Kamachi, T.; Maeno, Z.; Takakusagi, S.; Furukawa, S.; Takigawa, I.; Shimizu, K. Density Functional Theory Calculations of Oxygen Vacancy Formation and Subsequent Molecular Adsorption on Oxide Surfaces. J. Phys. Chem. C 2018, 122, 29435−29444.
(127) Pankajakshan, P.; Sanyal, S.; De Noord, O. E.; Bhattacharya, I.; Bhattacharyya, A.; Waghmare, U. Machine Learning and Statistical Analysis for Materials Science: Stability and Transferability of Fingerprint Descriptors and Chemical Insights. Chem. Mater. 2017, 29, 4190−4201.
(128) Tanabe, K.; Hölderich, W. F. Industrial Application of Solid Acid-Base Catalysts. Appl. Catal., A 1999, 181, 399−434.
(129) Digne, M.; Sautet, P.; Raybaud, P.; Euzen, P.; Toulhoat, H. Use of DFT to Achieve a Rational Understanding of Acid-Basic Properties of γ-Alumina Surfaces. J. Catal. 2004, 226, 54−68.
(130) Jenness, G. R.; Christiansen, M. A.; Caratzoulas, S.; Vlachos, D. G.; Gorte, R. J. Site-Dependent Lewis Acidity of γ-Al2O3 and Its Impact on Ethanol Dehydration and Etherification. J. Phys. Chem. C 2014, 118, 12899−12907.
(131) Cholewinski, M. C.; Dixit, M.; Mpourmpakis, G. Computational Study of Methane Activation on γ-Al2O3. ACS Omega 2018, 3, 18242−18250.
(132) Bhan, A.; Iglesia, E. A Link between Reactivity and Local Structure in Acid Catalysis on Zeolites. Acc. Chem. Res. 2008, 41, 559−567.
(133) Liu, C.; Tranca, I.; van Santen, R. A.; Hensen, E. J. M.; Pidko, E. A. Scaling Relations for Acidity and Reactivity of Zeolites. J. Phys. Chem. C 2017, 121, 23520−23530.
(134) Matsuoka, T.; Baumes, L.; Katada, N.; Chatterjee, A.; Sastre, G. Selecting Strong Brønsted Acid Zeolites through Screening from a Database of Hypothetical Frameworks. Phys. Chem. Chem. Phys. 2017, 19, 14702−14707.
(135) Davran-Candan, T.; Günay, M. E.; Yildirim, R. Structure and Activity Relationship for CO and O2 Adsorption over Gold Nanoparticles Using Density Functional Theory and Artificial Neural Networks. J. Chem. Phys. 2010, 132, 174113.
(136) Li, Z.; Wang, S.; Chin, W. S.; Achenie, L. E.; Xin, H. High-Throughput Screening of Bimetallic Catalysts Enabled by Machine Learning. J. Mater. Chem. A 2017, 5, 24131−24138.
(137) Ma, X.; Li, Z.; Achenie, L. E. K.; Xin, H. Machine-Learning-Augmented Chemisorption Model for CO2 Electroreduction Catalyst Screening. J. Phys. Chem. Lett. 2015, 6, 3528−3533.
(138) Noh, J.; Back, S.; Kim, J.; Jung, Y. Active Learning with Non-Ab Initio Input Features toward Efficient CO2 Reduction Catalysts. Chem. Sci. 2018, 9, 5152−5159.
(139) Andersen, M.; Levchenko, S. V.; Scheffler, M.; Reuter, K. Beyond Scaling Relations for the Description of Catalytic Materials. ACS Catal. 2019, 9, 2752−2759.
(140) Jäger, M. O. J.; Morooka, E. V.; Federici Canova, F.; Himanen, L.; Foster, A. S. Machine Learning Hydrogen Adsorption on Nanoclusters through Structural Descriptors. npj Comput. Mater. 2018, 4, 37.
(141) Back, S.; Yoon, J.; Tian, N.; Zhong, W.; Tran, K.; Ulissi, Z. W. Convolutional Neural Network of Atomic Surface Structures to Predict Binding Energies for High-Throughput Screening of Catalysts. J. Phys. Chem. Lett. 2019, 10, 4401−4408.
(142) Chowdhury, A. J.; Yang, W.; Walker, E.; Mamun, O.; Heyden, A.; Terejanu, G. A. Prediction of Adsorption Energies for Chemical Species on Metal Catalyst Surfaces Using Machine Learning. J. Phys. Chem. C 2018, 122, 28142−28150.
(143) Hoyt, R. A.; Montemore, M. M.; Fampiou, I.; Chen, W.; Tritsaris, G.; Kaxiras, E. Machine Learning Prediction of H Adsorption Energies on Ag Alloys. J. Chem. Inf. Model. 2019, 59, 1357−1365.
(144) García-Muelas, R.; López, N. Statistical Learning Goes beyond the D-Band Model Providing the Thermochemistry of Adsorbates on Transition Metals. Nat. Commun. 2019, 10, 4687.
(145) Gasper, R.; Shi, H.; Ramasubramaniam, A. Adsorption of CO on Low-Energy, Low-Symmetry Pt Nanoparticles: Energy Decomposition Analysis and Prediction via Machine-Learning Models. J. Phys. Chem. C 2017, 121, 5612−5619.
(146) Göltl, F.; Müller, P.; Uchupalanun, P.; Sautet, P.; Hermans, I. Developing a Descriptor-Based Approach for CO and NO Adsorption Strength to Transition Metal Sites in Zeolites. Chem. Mater. 2017, 29, 6434−6444.
(147) Jinnouchi, R.; Asahi, R. Predicting Catalytic Activity of Nanoparticles by a DFT-Aided Machine-Learning Algorithm. J. Phys. Chem. Lett. 2017, 8, 4279−4283.
(148) Jinnouchi, R.; Hirata, H.; Asahi, R. Extrapolating Energetics on Clusters and Single-Crystal Surfaces to Nanoparticles by Machine-Learning Scheme. J. Phys. Chem. C 2017, 121, 26397−26405.
(149) Ulissi, Z. W.; Tang, M. T.; Xiao, J.; Liu, X.; Torelli, D. A.; Karamad, M.; Cummins, K.; Hahn, C.; Lewis, N. S.; Jaramillo, T. F.; Chan, K.; Nørskov, J. K. Machine-Learning Methods Enable Exhaustive Searches for Active Bimetallic Facets and Reveal Active Site Motifs for CO2 Reduction. ACS Catal. 2017, 7, 6600−6608.
(150) Tran, K.; Ulissi, Z. W. Active Learning across Intermetallics To Guide Discovery of Electrocatalysts for CO2 Reduction and H2 Evolution. Nat. Catal. 2018, 1, 696−703.
(151) Toyao, T.; Suzuki, K.; Kikuchi, S.; Takakusagi, S.; Shimizu, K.; Takigawa, I. Toward Effective Utilization of Methane: Machine Learning Prediction of Adsorption Energies on Metal Alloys. J. Phys. Chem. C 2018, 122, 8315−8326.
(152) Wexler, R. B.; Martirez, J. M. P.; Rappe, A. M. Chemical Pressure-Driven Enhancement of the Hydrogen Evolving Activity of Ni2P from Nonmetal Surface Doping Interpreted via Machine Learning. J. Am. Chem. Soc. 2018, 140, 4678−4683.
(153) O’Connor, N. J.; Jonayat, A. S. M.; Janik, M. J.; Senftle, T. P. Interaction Trends between Single Metal Atoms and Oxide Supports Identified with Density Functional Theory and Statistical Learning. Nat. Catal. 2018, 1, 531−539.
(154) Castelli, I. E.; Hüser, F.; Pandey, M.; Li, H.; Thygesen, K. S.; Seger, B.; Jain, A.; Persson, K. A.; Ceder, G.; Jacobsen, K. W. New Light-Harvesting Materials Using Accurate and Efficient Bandgap Calculations. Adv. Energy Mater. 2015, 5, 1400915.


(155) Mishra, A.; Satsangi, S.; Rajan, A. C.; Mizuseki, H.; Lee, K. R.; Singh, A. K. Accelerated Data-Driven Accurate Positioning of the Band Edges of MXenes. J. Phys. Chem. Lett. 2019, 10, 780−785.
(156) Ulissi, Z. W.; Singh, A. R.; Tsai, C.; Nørskov, J. K. Automated Discovery and Construction of Surface Phase Diagrams Using Machine Learning. J. Phys. Chem. Lett. 2016, 7, 3931−3935.
(157) Timoshenko, J.; Frenkel, A. I. “Inverting” X-ray Absorption Spectra of Catalysts by Machine Learning in Search for Activity Descriptors. ACS Catal. 2019, 9, 10192−10211.
(158) Liu, J.; Osadchy, M.; Ashton, L.; Foster, M.; Solomon, C. J.; Gibson, S. J. Deep Convolutional Neural Networks for Raman Spectrum Recognition: A Unified Solution. Analyst 2017, 142, 4067−4074.
(159) Ito, K.; Obuchi, Y.; Chikayama, E.; Date, Y.; Kikuchi, J. Exploratory Machine-Learned Theoretical Chemical Shifts Can Closely Predict Metabolic Mixture Signals. Chem. Sci. 2018, 9, 8213−8220.
(160) Ghosh, K.; Stuke, A.; Todorović, M.; Jørgensen, P. B.; Schmidt, M. N.; Vehtari, A.; Rinke, P. Deep Learning Spectroscopy: Neural Networks for Molecular Excitation Spectra. Adv. Sci. 2019, 6, 1801367.
(161) Asakura, K.; Abe, H.; Kimura, M. The Challenge of Constructing an International XAFS Database. J. Synchrotron Radiat. 2018, 25, 967−971.
(162) Zheng, C.; Mathew, K.; Chen, C.; Chen, Y.; Tang, H.; Dozier, A.; Kas, J. J.; Vila, F. D.; Rehr, J. J.; Piper, L. F. J.; Persson, K. A.; Ong, S. P. Automated Generation and Ensemble-Learned Matching of X-ray Absorption Spectra. npj Comput. Mater. 2018, 4, 12.
(163) Mathew, K.; Zheng, C.; Winston, D.; Chen, C.; Dozier, A.; Rehr, J. J.; Ong, S. P.; Persson, K. A. Data Descriptor: High-Throughput Computational X-ray Absorption Spectroscopy. Sci. Data 2018, 5, 180151.
(164) Kiyohara, S.; Miyata, T.; Tsuda, K.; Mizoguchi, T. Data-Driven Approach for the Prediction and Interpretation of Core-Electron Loss Spectroscopy. Sci. Rep. 2018, 8, 13548.
(165) Suzuki, Y.; Hino, H.; Kotsugi, M.; Ono, K. Automated Estimation of Materials Parameter from X-ray Absorption and Electron Energy-Loss Spectra with Similarity Measures. npj Comput. Mater. 2019, 5, 39.
(166) Timoshenko, J.; Lu, D.; Lin, Y.; Frenkel, A. I. Supervised Machine-Learning-Based Determination of Three-Dimensional Structure of Metallic Nanoparticles. J. Phys. Chem. Lett. 2017, 8, 5091−5098.
(167) Timoshenko, J.; Halder, A.; Yang, B.; Seifert, S.; Pellin, M. J.; Vajda, S.; Frenkel, A. I. Subnanometer Substructures in Nanoassemblies Formed from Clusters under a Reactive Atmosphere Revealed Using Machine Learning. J. Phys. Chem. C 2018, 122, 21686−21693.
(168) Akai, I.; Iwamitsu, K.; Igarashi, Y.; Okada, M.; Setoyama, H.; Okajima, T.; Hirai, Y. Sparse Modeling of an Extended X-ray Absorption Fine-Structure Spectrum Based on a Single-Scattering Formalism. J. Phys. Soc. Jpn. 2018, 87, 074003.
(169) Carbone, M. R.; Yoo, S.; Topsakal, M.; Lu, D. Classification of Local Chemical Environments from X-ray Absorption Spectra Using Supervised Machine Learning. Phys. Rev. Mater. 2019, 3, 033604.
(170) Guda, A. A.; Guda, S. A.; Lomachenko, K. A.; Soldatov, M. A.; Pankin, I. A.; Soldatov, A. V.; Braglia, L.; Bugaev, A. L.; Martini, A.; Signorile, M.; Groppo, E.; Piovano, A.; Borfecchia, E.; Lamberti, C. Quantitative Structural Determination of Active Sites from In Situ and Operando XANES Spectra: From Standard Ab Initio Simulations to Chemometric and Machine Learning Approaches. Catal. Today 2019, 336, 3−21.
(171) Miyazato, I.; Takahashi, L.; Takahashi, K. Automatic Oxidation Threshold Recognition of XAFS Data Using Supervised Machine Learning. Mol. Syst. Des. Eng. 2019, 4, 1014−1018.
(172) Madsen, J.; Liu, P.; Kling, J.; Wagner, J. B.; Hansen, T. W.; Winther, O.; Schiøtz, J. A Deep Learning Approach to Identify Local Structures in Atomic-Resolution Transmission Electron Microscopy Images. Adv. Theory Simul. 2018, 1, 1800037.
(173) Ziatdinov, M.; Dyck, O.; Maksov, A.; Li, X.; Sang, X.; Xiao, K.; Unocic, R. R.; Vasudevan, R.; Jesse, S.; Kalinin, S. V. Deep Learning of Atomically Resolved Scanning Transmission Electron Microscopy Images: Chemical Identification and Tracking Local Transformations. ACS Nano 2017, 11, 12742−12752.
(174) Vasudevan, R. K.; Laanait, N.; Ferragut, E. M.; Wang, K.; Geohegan, D. B.; Xiao, K.; Ziatdinov, M.; Jesse, S.; Dyck, O.; Kalinin, S. V. Mapping Mesoscopic Phase Evolution during E-Beam Induced Transformations via Deep Learning of Atomically Resolved Images. npj Comput. Mater. 2018, 4, 30.
(175) Belianinov, A.; He, Q.; Kravchenko, M.; Jesse, S.; Borisevich, A.; Kalinin, S. V. Identification of Phases, Symmetries and Defects through Local Crystallography. Nat. Commun. 2015, 6, 7801.
(176) Rossouw, D.; Burdet, P.; De La Peña, F.; Ducati, C.; Knappett, B. R.; Wheatley, A. E. H.; Midgley, P. A. Multicomponent Signal Unmixing from Nanoheterostructures: Overcoming the Traditional Challenges of Nanoscale X-ray Analysis via Machine Learning. Nano Lett. 2015, 15, 2716−2720.
(177) Shiga, M.; Tatsumi, K.; Muto, S.; Tsuda, K.; Yamamoto, Y.; Mori, T.; Tanji, T. Sparse Modeling of EELS and EDX Spectral Imaging Data by Nonnegative Matrix Factorization. Ultramicroscopy 2016, 170, 43−59.
(178) Hummelshøj, J. S.; Abild-Pedersen, F.; Studt, F.; Bligaard, T.; Nørskov, J. K. CatApp: A Web Application for Surface Chemistry and Heterogeneous Catalysis. Angew. Chem., Int. Ed. 2012, 51, 272−274.
(179) Winther, K. T.; Hoffmann, M. J.; Boes, J. R.; Mamun, O.; Bajdich, M.; Bligaard, T. Catalysis-Hub.Org, an Open Electronic Structure Database for Surface Reactions. Sci. Data 2019, 6, 75.
(180) Holeňa, M.; Baerns, M. Feedforward Neural Networks in Catalysis: A Tool for the Approximation of the Dependency of Yield on Catalyst Composition, and for Knowledge Extraction. Catal. Today 2003, 81, 485−494.
(181) Serra, J. M.; Corma, A.; Chica, A.; Argente, E.; Botti, V. Can Artificial Neural Networks Help the Experimentation in Catalysis? Catal. Today 2003, 81, 393−403.
(182) Huang, K.; Zhan, X. L.; Chen, F. Q.; Lü, D. W. Catalyst Design for Methane Oxidative Coupling by Using Artificial Neural Network and Hybrid Genetic Algorithm. Chem. Eng. Sci. 2003, 58, 81−87.
(183) Umegaki, T.; Watanabe, Y.; Nukui, N.; Omata, K.; Yamada, M. Optimization of Catalyst for Methanol Synthesis by a Combinatorial Approach Using a Parallel Activity Test and Genetic Algorithm Assisted by a Neural Network. Energy Fuels 2003, 17, 850−856.
(184) Omata, K.; Watanabe, Y.; Hashimoto, M.; Umegaki, T.; Yamada, M. Simultaneous Optimization of Preparation Conditions and Composition of the Methanol Synthesis Catalyst by an All-Encompassing Calculation on an Artificial Neural Network. Ind. Eng. Chem. Res. 2004, 43, 3282−3288.
(185) Corma, A.; Serra, J. M.; Serna, P.; Valero, S.; Argente, E.; Botti, V. Optimisation of Olefin Epoxidation Catalysts with the Application of High-Throughput and Genetic Algorithms Assisted by Artificial Neural Networks (Softcomputing Techniques). J. Catal. 2005, 229, 513−524.
(186) Baumes, L. A.; Serra, J. M.; Serna, P.; Corma, A. Support Vector Machines for Predictive Modeling in Heterogeneous Catalysis: A Comprehensive Introduction and Overfitting Investigation Based on Two Real Applications. J. Comb. Chem. 2006, 8, 583−596.
(187) Le, T. C.; Winkler, D. A. Discovery and Optimization of Materials Using Evolutionary Approaches. Chem. Rev. 2016, 116, 6107−6132.
(188) Rothenberg, G. Data Mining in Catalysis: Separating Knowledge from Garbage. Catal. Today 2008, 137, 2−10.
(189) Zavyalova, U.; Holena, M.; Schlögl, R.; Baerns, M. Statistical Analysis of Past Catalytic Data on Oxidative Methane Coupling for New Insights into the Composition of High-Performance Catalysts. ChemCatChem 2011, 3, 1935−1947.
(190) Kondratenko, E. V.; Schlüter, M.; Baerns, M.; Linke, D.; Holena, M. Developing Catalytic Materials for the Oxidative


Coupling of Methane through Statistical Analysis of Literature Data. Catal. Sci. Technol. 2015, 5, 1668−1677.
(191) Takahashi, K.; Miyazato, I.; Nishimura, S.; Ohyama, J. Unveiling Hidden Catalysts for the Oxidative Coupling of Methane Based on Combining Machine Learning with Literature Data. ChemCatChem 2018, 10, 3223−3228.
(192) Schmack, R.; Friedrich, A.; Kondratenko, E. V.; Polte, J.; Werwatz, A.; Kraehnert, R. A Meta-Analysis of Catalytic Literature Data Reveals Property-Performance Correlations for the OCM Reaction. Nat. Commun. 2019, 10, 441.
(193) Ohyama, J.; Nishimura, S.; Takahashi, K. Data Driven Determination of Reaction Conditions in Oxidative Coupling of Methane via Machine Learning. ChemCatChem 2019, 11, 4307−4313.
(194) Baysal, M.; Günay, M. E.; Yıldırım, R. Decision Tree Analysis of Past Publications on Catalytic Steam Reforming to Develop Heuristics for High Performance: A Statistical Review. Int. J. Hydrogen Energy 2017, 42, 243−254.
(195) Şener, A. N.; Günay, M. E.; Leba, A.; Yıldırım, R. Statistical Review of Dry Reforming of Methane Literature Using Decision Tree and Artificial Neural Network Analysis. Catal. Today 2018, 299, 289−302.
(196) Günay, M. E.; Yildirim, R. Knowledge Extraction from Catalysis of the Past: A Case of Selective CO Oxidation over Noble Metal Catalysts between 2000 and 2012. ChemCatChem 2013, 5, 1395−1406.
(197) Günay, M. E.; Yildirim, R. Neural Network Analysis of Selective CO Oxidation over Copper-Based Catalysts for Knowledge Extraction from Published Data in the Literature. Ind. Eng. Chem. Res. 2011, 50, 12488−12500.
(198) Günay, M. E.; Yildirim, R. Developing Global Reaction Rate Model for CO Oxidation over Au Catalysts from Past Data in Literature Using Artificial Neural Networks. Appl. Catal., A 2013, 468, 395−402.
(199) Odabaşi, Ç.; Günay, M. E.; Yildirim, R. Knowledge Extraction for Water Gas Shift Reaction over Noble Metal Catalysts from Publications in the Literature between 2002 and 2012. Int. J. Hydrogen Energy 2014, 39, 5733−5746.
(200) Omata, K. Screening of New Additives of Active-Carbon-Supported Heteropoly Acid Catalyst for Friedel−Crafts Reaction by Gaussian Process Regression. Ind. Eng. Chem. Res. 2011, 50, 10948−10954.
(201) Ras, E. J.; Louwerse, M. J.; Rothenberg, G. New Tricks by Very Old Dogs: Predicting the Catalytic Hydrogenation of HMF Derivatives Using Slater-Type Orbitals. Catal. Sci. Technol. 2012, 2, 2456−2464.
(202) Alper Tapan, N.; Yıldırım, R.; Günay, M. E. Analysis of Past Experimental Data in Literature to Determine Conditions for High Performance in Biodiesel Production. Biofuels, Bioprod. Biorefin. 2016, 10, 422−434.
(203) Madaan, N.; Shiju, N. R.; Rothenberg, G. Predicting the Performance of Oxidation Catalysts Using Descriptor Models. Catal. Sci. Technol. 2016, 6, 125−133.
(204) Rodemerck, U.; Holeňa, M.; Wagner, E.; Smejkal, Q.; Barkschat, A.; Baerns, M. Catalyst Development for CO2 Hydrogenation to Fuels. ChemCatChem 2013, 5, 1948−1955.
(205) Günay, M. E.; Türker, L.; Tapan, N. A. Decision Tree Analysis for Efficient CO2 Utilization in Electrochemical Systems. J. CO2 Util. 2018, 28, 83−95.
(206) Hong, W. T.; Welsch, R. E.; Shao-Horn, Y. Descriptors of Oxygen-Evolution Activity for Oxides: A Statistical Evaluation. J. Phys. Chem. C 2016, 120, 78−86.
(207) Palkovits, R.; Palkovits, S. Using Artificial Intelligence To Forecast Water Oxidation Catalysts. ACS Catal. 2019, 9, 8383−8387.
(208) Masood, H.; Toe, C. Y.; Teoh, W. Y.; Sethu, V.; Amal, R. Machine Learning for Accelerated Discovery of Solar Photocatalysts. ACS Catal. 2019, 9, 11774−11787.
(209) Can, E.; Yildirim, R. Data Mining in Photocatalytic Water Splitting over Perovskites Literature for Higher Hydrogen Production. Appl. Catal., B 2019, 242, 267−283.
(210) Hansch, C.; Leo, A.; Taft, R. W. A Survey of Hammett Substituent Constants and Resonance and Field Parameters. Chem. Rev. 1991, 91, 165−195.
(211) Tolman, C. A. Steric Effects of Phosphorus Ligands in Organometallic Chemistry and Homogeneous Catalysis. Chem. Rev. 1977, 77, 313−348.
(212) Fey, N.; Orpen, A. G.; Harvey, J. N. Building Ligand Knowledge Bases for Organometallic Chemistry: Computational Description of Phosphorus(III)-Donor Ligands and the Metal-Phosphorus Bond. Coord. Chem. Rev. 2009, 253, 704−722.
(213) Fey, N. The Contribution of Computational Studies to Organometallic Catalysis: Descriptors, Mechanisms and Models. Dalton Trans. 2010, 39, 296−310.
(214) Jover, J.; Fey, N. The Computational Road to Better Catalysts. Chem. - Asian J. 2014, 9, 1714−1723.
(215) Milo, A.; Neel, A. J.; Toste, F. D.; Sigman, M. S. A Data-Intensive Approach to Mechanistic Elucidation Applied to Chiral Anion Catalysis. Science 2015, 347, 737−743.
(216) Sigman, M. S.; Harper, K. C.; Bess, E. N.; Milo, A. The Development of Multidimensional Analysis Tools for Asymmetric Catalysis and Beyond. Acc. Chem. Res. 2016, 49, 1292−1301.
(217) Reid, J. P.; Sigman, M. S. Comparing Quantitative Prediction Methods for the Discovery of Small-Molecule Chiral Catalysts. Nat. Rev. Chem. 2018, 2, 290−305.
(218) Santiago, C. B.; Guo, J. Y.; Sigman, M. S. Predictive and Mechanistic Multivariate Linear Regression Models for Reaction Development. Chem. Sci. 2018, 9, 2398−2412.
(219) Bess, E. N.; Bischoff, A. J.; Sigman, M. S. Designer Substrate Library for Quantitative, Predictive Modeling of Reaction Performance. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, 14698−14703.
(220) Burello, E.; Farrusseng, D.; Rothenberg, G. Combinatorial Explosion in Homogeneous Catalysis: Screening 60,000 Cross-Coupling Reactions. Adv. Synth. Catal. 2004, 346, 1844−1853.
(221) Hageman, J. A.; Westerhuis, J. A.; Frühauf, H. W.; Rothenberg, G. Design and Assembly of Virtual Homogeneous Catalyst Libraries - Towards In Silico Catalyst Optimisation. Adv. Synth. Catal. 2006, 348, 361−369.
(222) Maldonado, A. G.; Rothenberg, G. Predictive Modeling in Homogeneous Catalysis: A Tutorial. Chem. Soc. Rev. 2010, 39, 1891−1902.
(223) Kreutz, J. E.; Shukhaev, A.; Du, W.; Druskin, S.; Daugulis, O.; Ismagilov, R. F. Evolution of Catalysts Directed by Genetic Algorithms in a Plug-Based Microfluidic Device Tested with Oxidation of Methane by Oxygen. J. Am. Chem. Soc. 2010, 132, 3128−3132.
(224) Occhipinti, G.; Bjørsvik, H. R.; Jensen, V. R. Quantitative Structure-Activity Relationships of Ruthenium Catalysts for Olefin Metathesis. J. Am. Chem. Soc. 2006, 128, 6952−6964.
(225) Milo, A.; Neel, A. J.; Toste, F. D.; Sigman, M. S. A Data-Intensive Approach to Mechanistic Elucidation Applied to Chiral Anion Catalysis. Science 2015, 347, 737−743.
(226) Zahrt, A. F.; Henle, J. J.; Rose, B. T.; Wang, Y.; Darrow, W. T.; Denmark, S. E. Prediction of Higher-Selectivity Catalysts by Computer-Driven Workflow and Machine Learning. Science 2019, 363, eaau5631.
(227) Meyer, B.; Sawatlon, B.; Heinen, S.; Von Lilienfeld, O. A.; Corminboeuf, C. Machine Learning Meets Volcano Plots: Computational Discovery of Cross-Coupling Catalysts. Chem. Sci. 2018, 9, 7069−7077.
(228) Yada, A.; Nagata, K.; Ando, Y.; Matsumura, T.; Ichinoseki, S.; Sato, K. Machine Learning Approach for Prediction of Reaction Yield with Simulated Catalyst Parameters. Chem. Lett. 2018, 47, 284−287.
(229) Wu, K.; Doyle, A. G. Parameterization of Phosphine Ligands Demonstrates Enhancement of Nickel Catalysis via Remote Steric Effects. Nat. Chem. 2017, 9, 779−784.
(230) Rosales, A. R.; Wahlers, J.; Limé, E.; Meadows, R. E.; Leslie, K. W.; Savin, R.; Bell, F.; Hansen, E.; Helquist, P.; Munday, R. H.; Wiest, O.; Norrby, P. O. Rapid Virtual Screening of Enantioselective Catalysts Using CatVS. Nat. Catal. 2019, 2, 41−45.


(231) Landman, I. R.; Paulson, E. R.; Rheingold, A. L.; Grotjahn, D. B.; Rothenberg, G. Designing Bifunctional Alkene Isomerization Catalysts Using Predictive Modelling. Catal. Sci. Technol. 2017, 7, 4842−4851.
(232) Banerjee, S.; Sreenithya, A.; Sunoj, R. B. Machine Learning for Predicting Product Distributions in Catalytic Regioselective Reactions. Phys. Chem. Chem. Phys. 2018, 20, 18311−18318.
(233) Amar, Y.; Schweidtmann, A. M.; Deutsch, P.; Cao, L.; Lapkin, A. Machine Learning and Molecular Descriptors Enable Rational Solvent Selection in Asymmetric Catalysis. Chem. Sci. 2019, 10, 6697−6706.
(234) Yamaguchi, S.; Sodeoka, M. Molecular Field Analysis Using Intermediates in Enantio-Determining Steps Can Extract Information for Data-Driven Molecular Design in Asymmetric Catalysis. Bull. Chem. Soc. Jpn. 2019, 92, 1701−1706.
(235) Sasmaz, E.; Mingle, K.; Lauterbach, J. High-Throughput Screening Using Fourier-Transform Infrared Imaging. Engineering 2015, 1, 234−242.
(236) Katare, S.; Caruthers, J. M.; Delgass, W. N.; Venkatasubramanian, V. An Intelligent System for Reaction Kinetic Modeling and Catalyst Design. Ind. Eng. Chem. Res. 2004, 43, 3484−3512.
(237) Yamada, Y.; Kobayashi, T. Utilization of Combinatorial Method and High Throughput Experimentation for Development of Heterogeneous Catalysts. J. Jpn. Pet. Inst. 2006, 49, 157−167.
(238) Diamond, G. M.; Hall, K. A.; Lapointe, A. M.; Leclerc, M. K.; Longmire, J.; Shoemaker, J. A. W.; Sun, P. High-Throughput Discovery and Optimization of Hafnium Heteroaryl-Amido Catalysts for the Isospecific Polymerization of Propylene. ACS Catal. 2011, 1, 887−900.
(239) Kondratyuk, P.; Gumuslu, G.; Shukla, S.; Miller, J. B.; Morreale, B. D.; Gellman, A. J. A Microreactor Array for Spatially Resolved Measurement of Catalytic Activity for High-Throughput Catalysis Science. J. Catal. 2013, 300, 55−62.
(240) Gumuslu, G.; Kondratyuk, P.; Boes, J. R.; Morreale, B.; Miller, J. B.; Kitchin, J. R.; Gellman, A. J. Correlation of Electronic Structure with Catalytic Activity: H2−D2 Exchange across CuxPd1−x Composition Space. ACS Catal. 2015, 5, 3137−3147.
(241) Kitchin, J. R.; Gellman, A. J. High-Throughput Methods Using Composition and Structure Spread Libraries. AIChE J. 2016, 62, 3826−3835.
(242) Renom-Carrasco, M.; Lefort, L. Ligand Libraries for High Throughput Screening of Homogeneous Catalysts. Chem. Soc. Rev. 2018, 47, 5038−5060.
(243) Isbrandt, E. S.; Sullivan, R. J.; Newman, S. G. High Throughput Strategies for the Discovery and Optimization of Catalytic Reactions. Angew. Chem., Int. Ed. 2019, 58, 7180−7191.
(244) Wellendorff, J.; Lundgaard, K. T.; Møgelhøj, A.; Petzold, V.; Landis, D. D.; Nørskov, J. K.; Bligaard, T.; Jacobsen, K. W. Density Functionals for Surface Science: Exchange-Correlation Model Development with Bayesian Error Estimation. Phys. Rev. B: Condens. Matter Mater. Phys. 2012, 85, 235149.
(245) Snyder, J. C.; Rupp, M.; Hansen, K.; Müller, K. R.; Burke, K. Finding Density Functionals with Machine Learning. Phys. Rev. Lett. 2012, 108, 253002.
(246) Pozun, Z. D.; Hansen, K.; Sheppard, D.; Rupp, M.; Müller, K. R.; Henkelman, G. Optimizing Transition States via Kernel-Based Machine Learning. J. Chem. Phys. 2012, 136, 174101.
(247) Peterson, A. A. Acceleration of Saddle-Point Searches with Machine Learning. J. Chem. Phys. 2016, 145, 074106.
(248) Ulissi, Z. W.; Medford, A. J.; Bligaard, T.; Nørskov, J. K. To Address Surface Reaction Network Complexity Using Scaling Relations Machine Learning and DFT Calculations. Nat. Commun. 2017, 8, 14621.
(249) Koistinen, O. P.; Dagbjartsdóttir, F. B.; Ásgeirsson, V.; Vehtari, A.; Jónsson, H. Nudged Elastic Band Calculations Accelerated with Gaussian Process Regression. J. Chem. Phys. 2017, 147, 152720.
(250) Takahashi, K.; Miyazato, I. Rapid Estimation of Activation Energy in Heterogeneous Catalytic Reactions via Machine Learning. J. Comput. Chem. 2018, 39, 2405−2408.
(251) Garrido Torres, J. A.; Jennings, P. C.; Hansen, M. H.; Boes, J. R.; Bligaard, T. Low-Scaling Algorithm for Nudged Elastic Band Calculations Using a Surrogate Machine Learning Model. Phys. Rev. Lett. 2019, 122, 156001.
(252) Sameera, W. M. C.; Kumar Sharma, A.; Maeda, S.; Morokuma, K. Artificial Force Induced Reaction Method for Systematic Determination of Complex Reaction Mechanisms. Chem. Rec. 2016, 16, 2349−2363.
(253) Sugiyama, K.; Sumiya, Y.; Takagi, M.; Saita, K.; Maeda, S. Understanding CO Oxidation on the Pt(111) Surface Based on a Reaction Route Network. Phys. Chem. Chem. Phys. 2019, 21, 14366−14375.
