Available online at www.sciencedirect.com
ScienceDirect

Machine learning for multiscale modeling in computational molecular design
Abdulelah S Alshehri1,2 and Fengqi You1

The chemical industry is facing ever-increasing challenges for developing novel products and processes capable of reducing environmental impacts and curbing resource depletion. Yet, the interplay between molecular phenomena and the design of products and processes is often oversimplified. Machine learning stands uniquely positioned to disentangle the complexity of multiscale modeling by leveraging data to navigate the design spaces of multifaceted molecular systems. Herein, we limit our survey of machine learning applications in computational molecular design (CMD) to four elements: property estimation, catalysis, synthesis planning, and design methods. Through this perspective, we aim to offer a roadmap for future work on multiscale modeling that better explores the interplay between nanoscale features and macroscale decisions in product and process design.

Addresses
1 Robert Frederick Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY 14853, USA
2 Department of Chemical Engineering, College of Engineering, King Saud University, P.O. Box 800, Riyadh 11421, Saudi Arabia

Corresponding author: You, Fengqi (fengqi.you@cornell.edu)

Current Opinion in Chemical Engineering 2022, 36:100752
This review comes from a themed issue on Frontiers in chemical engineering: chemical product design, edited by Rafiqul Gani, Lei Zhang and Chrysanthos Gounaris
For a complete overview of the section, please refer to the article collection, Frontiers in Chemical Engineering: Chemical Product Design II
Available online 25th October 2021
https://doi.org/10.1016/j.coche.2021.100752
2211-3398/© 2021 Elsevier Ltd. All rights reserved.

Introduction
The manipulation of microscopic properties to enhance the functional performance of molecules has often been the primary basis for scientific and technological advances. Yet, many major chemical inventions have been both accidental and conceived without optimizing their applicable products and processes [1••,2••]. The functionality and efficiency of chemical products and processes are typically considered in later design stages due to the high sensitivity between molecular descriptors and product/process variables [3,4•]. For that reason, and because the evaluation of all product/process alternatives is not practically feasible, optimal products and processes may be excluded [5,6]. Therefore, there is an urgent need for frameworks integrating multiple facets of molecular design along with uncertainty reduction in multiscale CMD problems. Such frameworks have the potential to catalyze the pace of innovation towards solving many pressing issues, such as climate change and sustainable energy [7,8••].

Machine learning has been transforming many services and industries with unremitting improvements fueled by the exponential growth in computing power and big data. Remarkably, the subfield of deep learning has outperformed traditional machine learning and expert-crafted methods on many learning tasks [8••,9]. For molecular systems, deep learning has recently made gigantic leaps in synthesis planning and protein folding, outperforming decades of efforts based on both theory and experimentation [10•]. Moreover, the applications of deep learning to molecular design elements have achieved state-of-the-art performance in molecular/material property prediction [11] and synthesis planning [12]. Given the rapid proliferation of deep learning methods, the adoption of deep learning
to transform multiscale molecular design appears more promising for bypassing common challenges than ever before [13].

In this short review, we survey the different roles of machine learning in CMD for various tasks at different scales. More elaborate reviews are present in the literature [14••,15] with details on suitable methods [16] and specific examples on the roles of machine learning within CMD [17]. However, our review aims to shed light on state-of-the-art advances in deep learning and highlight their transformative implications to CMD with a focus on the multiscale problems of product and process design. Although there are numerous opportunities for the application of machine learning to CMD, we limit our survey and discussion to four areas: property estimation, catalysis, synthesis planning, and design methods. We next offer our perspectives for the next generation of deep learning-based CMD applications and tools. The article is concluded with a summarizing outlook, highlighting key needs and open avenues towards unifying product and process design frameworks. A summary of the function, input, output, and the major pros and cons for the roles of machine learning within CMD and perspectives is given in Table 1.

Roles of machine learning in CMD
Property prediction
Property prediction models are constructed to learn the underlying thermodynamic behavior of molecules.

Table 1. A summarizing table of the different machine learning applications in CMD, listing their function, input and output, and major pros and cons. Only the following entries are legible in the source:
Synthesis planning: providing viable synthesis ...; synthesis route/precursor selection/feasibility; reliance on large-scale data for specific molecular classes
Generative ...: simultaneous learning of desired properties ...; longer training time and lack of uncertainty quantification
CMD platforms: standardized evaluation; evaluations and rankings; high collaborative efforts

Within CMD, these models are used for guiding the design of molecules and verifying the simulations of products and processes [18]. Even though experimental observations are the best means to obtain molecular properties, the time and cost related to experimental exploration rapidly become prohibitive when searching for a set of properties. A popular alternative to experimental screening is constructing quantitative structure-property relationship (QSPR) functions based on thermodynamics and experimental data. Recently, there has been a surge of interest in applying machine learning to QSPR modeling for a diverse array of pure component properties.
These applications have demonstrated high accuracies comparable to DFT calculations and experimental accuracy at a fraction of the computational cost [19]. Further, although limited, machine learning has also been applied to estimate mixture properties [20,21].

For CMD applications, Group Contribution (GC)-based models have been the most popular type of property prediction models due to their low computational cost and easy incorporation into optimization models [3]. Yet, these models require enforcing additional constraints in the form of application ranges and valid structural combinations [1••]. Notably, a differentiable and uncertainty-calibrated library of 25 properties central to CMD has been proposed to accurately predict 87% and 91% of over 24,000 molecules within the 1% and 5% relative error thresholds, respectively [22]. Different GC-based models have shown promise in predicting the properties of solvents [23], ionic liquids [24], and fragrance molecules [25]. Other property estimation models based on descriptors [26], character embedding [27], and graphs [28] have also demonstrated the clear benefit of using more sophisticated descriptions of molecules. Yet, it is worth noting that the incorporation of such QSPR models into CMD methods is questionable due to the use of noninvertible representations or the complexity of machine learning models [1••].

Given the small size of property data relative to the size of the chemical space, a critical component to the efficacy of CMD methods lies in quantifying uncertainty within estimations. The importance of uncertainty quantification is also underlined by the fact that machine learning models are highly reliant on the quality and volume of data.
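
The distinction between data-related (aleatoric) and model-related (epistemic) uncertainty can be made concrete with a small ensemble. The following is a minimal sketch and not the method of any cited work: it assumes a linear group-contribution-style model (property = group counts times contributions), fits it on bootstrap resamples, and reports the mean residual variance as a rough aleatoric estimate and the spread of the ensemble predictions as a rough epistemic estimate. The function names, the toy group counts, and the contribution values are all hypothetical.

```python
import numpy as np

def fit_gc_ensemble(counts, y, n_models=50, seed=0):
    """Fit bootstrapped linear models y ~ counts @ c (GC-style, illustrative)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    coefs, noise_vars = [], []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)              # bootstrap resample
        c, *_ = np.linalg.lstsq(counts[idx], y[idx], rcond=None)
        resid = y[idx] - counts[idx] @ c
        coefs.append(c)
        noise_vars.append(resid.var())                # per-model noise estimate
    return np.array(coefs), np.array(noise_vars)

def predict_with_uncertainty(coefs, noise_vars, x):
    """Return mean prediction, aleatoric variance, and epistemic variance."""
    preds = coefs @ x                                 # one prediction per member
    aleatoric = noise_vars.mean()                     # average data noise
    epistemic = preds.var()                           # disagreement between members
    return preds.mean(), aleatoric, epistemic

# Toy data: three hypothetical groups with assumed contributions.
rng = np.random.default_rng(1)
true_c = np.array([2.0, -1.0, 0.5])
counts = rng.integers(0, 5, size=(200, 3)).astype(float)
y = counts @ true_c + rng.normal(0.0, 0.1, size=200)  # noise std 0.1

coefs, noise_vars = fit_gc_ensemble(counts, y)
mean, alea, epis = predict_with_uncertainty(
    coefs, noise_vars, np.array([1.0, 2.0, 3.0])
)
print(f"prediction={mean:.3f} aleatoric={alea:.4f} epistemic={epis:.6f}")
```

With plentiful data the epistemic term shrinks while the aleatoric term stays near the noise level of the measurements, which is one reason the two are worth reporting separately.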
Analysis and experiments for understanding the roles and impact of aleatoric (data-related) and epistemic (model-related) uncertainties on property estimation have been performed [29]. Another analysis was also performed on five public datasets using 22 different uncertainty quantification methods, showing no unequivocal superiority of any specific approach [30].

Catalysis
Although catalysis is not an element of general-purpose CMD but rather an application, shortcomings in deep learning-based CMD emphasize the central role of catalysts, revealing them as a major bottleneck to developing novel molecular solutions to many CMD applications. That is, novel molecules require establishing new reactivity through catalysts [31]. Thus, in this subsection, we shed light on catalysis for its essential role in accessing novel molecules. Given the complexity of catalysis, no single general-purpose computational strategy has emerged for computational catalyst design [32]. A broad category of computational catalyst design, however, is focused on the prediction of catalytic and morphological properties to screen large swaths of catalysts [32]. Here, we provide an overview of machine learning applications within catalysis for two specific roles: predicting catalytic and morphological properties, and guiding first-principles calculation and screening for different types of catalysts.
It should be noted that extensive reviews on catalysis are available in the literature [33-35].

The sheer complexity in the surface chemistry and multiscale dynamic nature of catalytic reactions renders the problem computationally intractable. Despite the relatively long application of artificial intelligence techniques to this field, only minor advances have occurred in the discovery of novel homogeneous or heterogeneous catalysts. Still, even with the complexity of catalytic processes and the lack of large datasets, emerging approaches have made important strides in demonstrating the potential for machine learning to improve catalyst design. Recent advances have been established across different morphological and catalytic properties such as d-band centre and s... [33,34], which are essential for describing reactions over heterogeneous transition metals. Also, as an important class in homogeneous catalysis, transition metal complexes have an intricate electronic structure. In this direction, several methods have proposed novel molecular descriptors for their electronic structure [36] and redox properties [37], with high accuracy. Furthermore, the microscopic and spectroscopic descriptions of catalysts are essential to probe the morphology of the catalyst. For predicting such properties, several machine learning-based models have been developed based on established techniques including X-ray absorption near-edge and fine structure [38,39], and transmission electron microscopy [40].

Effective catalysts are typically discovered with first-principles calculations such as density functional theory (DFT) to determine atom-level potentials and the interactions of their catalytic and morphological properties. The computational costs associated with DFT calculations, however, become prohibitively expensive for the practical development of novel catalysts [33].
While there is literature demonstrating the capability for machine learning to predict the potentials for elemental systems [41,42], only a few have shown reliable performance for multicomponent systems. Such issues have always arisen in heterogeneous catalysts given the computational complexity of generating large DFT datasets and optimizing for ... The other significant direction focuses on predicting catalytic and morphological properties using different descriptors. The catalytic and morphological properties are utilized for computational screening of large sets of catalysts followed by qualitatively analyzing the best candidates [33]. Still, the scope of these approaches considers large swaths of specific catalyst classes such as porphyrins [44], bimetallic facets [45], and perovskites [46], among others. Remarkable electrocatalysts were obtained by coupling machine learning and optimization to better guide DFT calculations for finding best-performing candidates [47]. Since both first-principles calculations and experimental validation are needed for catalyst discovery, data-driven approaches integrating active learning with experimental design, and faster calculations and screening are viewed as major enablers for ground-breaking discoveries.

Synthesis planning
The procedure of finding optimal or feasible sequences of chemical reactions to yield a target molecule is generally referred to as synthesis planning or retrosynthesis. The planning problem can also be reversed by combinatorial selection (forward synthesis) of available precursors to produce a final structure. The use of machine learning in the forward and backward problems has brought ground-breaking leaps, outperforming six decades of work in estimating synthetic accessibility scores and constructing expert-crafted rules [10•]. The importance of integrating synthesis planning into CMD stems from the observation that design methods do not account for the synthetic feasibility of proposed molecules [48].
Hence, embedding synthetic knowledge into molecular design frameworks is essential to avoid infeasibilities in experimental stages by limiting the search to synthetically accessible molecules.

To mimic chemists' decision-making, data-driven retrosynthesis or forward synthesis models leverage chemical reaction databases to construct models for identifying promising routes, estimating reaction mechanisms, and avoiding reactivity conflicts. These actions are attainable through decomposing the complexity into simpler steps such as forward enumeration of reactants [49,50], feasibility classification [51], and reaction template ranking [52]. A notable work that achieved significant improvements over other methods applies conditional graph logic networks to retrosynthesis, which offers interpretations of synthesis actions through probabilistic deep learning [53]. However, for the integration of synthesis planning into molecular design, it appears to be more practical to develop novel synthetic accessibility scores for different classes of chemicals. Efforts in this direction include the general RAscore [54] and the ExtractionScore for liquid-liquid extraction design [55].

Design methods
As design methods integrate multiple facets at different scales and levels, CMD has long relied on mathematical optimization and metaheuristics with fixed molecular representations to search for candidate molecules [1••,56]. Hybrid and data-driven optimization approaches have also emerged as a result of machine learning applications in CMD.
Such approaches offer more expressive molecular representations, highly accurate property models, and uncertainty quantification. Yet, a comparative study has demonstrated that genetic algorithms consistently perform as well as or better than many data-driven models at a lower computational cost [57]. As such, hybrid approaches appear to hold the most potential in solving problems of significance in the short term [1••].

In hybrid methods, data-driven models are integrated within deterministic optimization or other iterative or sequential knowledge-based procedures under a fixed molecular representation. Many applications involve the use of machine learning as a screening strategy to find the most promising molecules from a pool of candidates. Such applications typically involve more complex structures such as catalysts [45,46], solvents [58], and ultrafiltration membranes [59]. Alternatively, decomposition approaches that iteratively or sequentially involve machine learning models to verify best-performing candidates have been applied for the design of fragrance molecules [25], solvents [23], and surfactants [60]. As these applications stand, screening and decomposition approaches render the evaluation process suboptimal, excluding potentially superior molecular structures [1••].

Arguably, however, superior results can be obtained by integrating machine learning models into optimization methods given their ability to incorporate uncertainty and reveal insights. In many data-driven design methods, the molecular representation is concurrently learned with molecular properties from raw molecules. Raw molecular representations, such as SMILES [61] or 3D graphs, are generally transformed into numerical representations using two multilayer neural networks. The two networks are pitted against each other to generate novel molecules or decode a learned numerical representation [14••].
Although these models are in their infancy and guided by the ... nature of machine learning, they are viewed as holding the promise to transform CMD. Such a view is supported by data-driven applications on complex molecular classes, such as solid-state materials [62] and metal-organic frameworks [63], leading to novel discoveries of superior molecular structures.

Perspectives
Methods for multiscale modeling
The development of multifaceted systems involving all aspects of product/process design requires synergetic integrations of the different elements underlying the system. To the best of our knowledge, there is no literature that simultaneously optimizes molecular design, synthetic accessibility, product/process design, and uncertainty estimation in molecular properties. Instead, given the large design space, different stages are sequentially optimized without guarding against the exclusion of possible optimal solutions from the design space [1••]. Such sequential strategies highlight the critical need for more efficient data-driven strategies that constrain the search by uncertainties present in molecular data. In the following two paragraphs, we explore the applicability and efficacy of two other alternatives: generative modeling and reinforcement learning (RL).

Generative models are a subclass of deep learning models that aims to capture the underlying probability distribution of molecular representations and their properties. These models exploit the knowledge of structures and properties to model their nonlinear joint distribution, transforming a molecular representation into one that maximizes its expressivity relative to given properties. The utility of such models can be realized from the notion of molecular design, where a novel structure is generated given a set of properties [14••]. Furthermore, generative modeling offers the advantage of converting discrete molecular representations into continuous ones, which can be directly used in gradient-based optimization algorithms.
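
To illustrate why a continuous representation matters, the sketch below runs plain gradient ascent on a latent vector. It is a toy under stated assumptions: in practice the property predictor and the encoder/decoder would be trained neural networks, whereas here `surrogate_property` is a hypothetical quadratic stand-in with an analytic gradient, and no decoding back to a molecule is attempted.

```python
import numpy as np

def surrogate_property(z, target):
    """Hypothetical differentiable property predictor over latent vectors.

    Peaks at `target`; stands in for a trained neural network.
    """
    return -np.sum((z - target) ** 2)

def surrogate_gradient(z, target):
    """Analytic gradient of surrogate_property with respect to z."""
    return -2.0 * (z - target)

def optimize_latent(z0, target, step=0.1, n_steps=100):
    """Gradient ascent in the continuous latent space."""
    z = z0.copy()
    for _ in range(n_steps):
        z = z + step * surrogate_gradient(z, target)
    return z

z_start = np.zeros(4)                          # latent code of a starting molecule (assumed)
z_target = np.array([1.0, -0.5, 0.25, 2.0])    # optimum of the toy property (assumed)
z_opt = optimize_latent(z_start, z_target)

print(surrogate_property(z_start, z_target), surrogate_property(z_opt, z_target))
```

The same loop is not available on a discrete representation such as a SMILES string, where a small change in characters has no gradient; a decoder would then be needed to map `z_opt` back to a candidate structure.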
Advances in probabilistic generative modeling have resulted in promising molecular design applications, leading to discoveries competitive against best-performing molecules ever reported in metal-organic frameworks/zeolites [63] and inorganic solid-state func...

Figure 1. A description of promising areas for accelerating progress in computational molecular design for chemical products and processes.

On the other hand, RL is more capable of navigating more complex design spaces involving multiple complex facets (Figure 1a). This unifying framework attempts to find optimal molecular candidates by learning the optimal policy between possible selections and actions in the chemical space using a trial-and-error search [49]. Thereby, RL extracts patterns learned from its actions to find a balance between diversifying and intensifying the pursuit for molecular structures that satisfy or maximize ... [8••]. Indeed, RL has proven to be an ideal substitute to exact optimization methods for navigating the large chemical space. This is demonstrated by applying RL methods to narrow down the chemical space to synthetically accessible molecules [48]. Further, CMD RL applications have shown an increased capacity for complexity and refinement in optimizing 3D structures and incorporating quantum-chemical calculations [64]. Yet, given the limited number of applications of RL in CMD, the field may benefit from RL comparative studies on problems with similar mathematical structures, such as combinatorial optimization.
In this domain, RL algorithms have shown the capability to outperform commercial solvers on small to medium-sized instances of hard problems [65]. Specifically, the REINFORCE [66] and the Actor-Critic [67] family of models have shown superior results compared to other methods. Looking ahead, advances in applying RL and generative models to single-molecule design problems can also be extended for integrating product and process considerations under more complex factors such as uncertainty [68] and physics-based descriptions [69].

Active learning for optimal simulation and experimental design
Given the exceedingly complex chemical space, machine learning can be utilized to minimize the costs associated with experiments and first-principles calculations. The subfield of machine learning that optimizes experimental design, active learning, offers a systematic framework to pinpoint the best next experiment or DFT calculation to realize user-defined design objectives (Figure 1b) [70]. Such an iterative approach between experiments/simulations and machine learning has demonstrated astounding successes not only in catalyst design [47], but also in materials design [70]. Yet, it should be highlighted that sampling functions are at the core of active learning frameworks and their selection is highly dependent on the given task and available data. For example, in imbalanced datasets, ensemble-based methods [71] offer better performance than other popular methods, such as Bayesian-based [72] and density-based [73] methods. Nonetheless, in molecular systems applications, Bayesian-based methods are more dominant as a natural strategy [74].
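
The iterative loop between a model and an expensive evaluation can be sketched in a few lines. This is an illustrative example only, using an ensemble-based sampling function (query the candidate on which bootstrapped models disagree most); it is not the procedure of any cited work. The `oracle` function is a hypothetical stand-in for an experiment or DFT calculation, and the ridge-regularized polynomial model, pool, and budget are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def oracle(x):
    """Stand-in for an expensive experiment or DFT calculation (hypothetical)."""
    return np.sin(3.0 * x)

def features(x, degree=3):
    """Polynomial features 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def fit_ridge(x, y, lam=1e-3):
    """Ridge regression; the penalty keeps the solve well-posed on tiny samples."""
    phi = features(x)
    return np.linalg.solve(phi.T @ phi + lam * np.eye(phi.shape[1]), phi.T @ y)

def active_learning(x_pool, n_init=4, n_queries=10, n_models=20, seed=0):
    """Query the pool point with the largest ensemble disagreement each round."""
    rng = np.random.default_rng(seed)
    labeled = [int(i) for i in rng.choice(len(x_pool), size=n_init, replace=False)]
    y_labeled = [oracle(x_pool[i]) for i in labeled]
    for _ in range(n_queries):
        x_l, y_l = x_pool[labeled], np.array(y_labeled)
        preds = []
        for _ in range(n_models):
            idx = rng.integers(0, len(labeled), size=len(labeled))  # bootstrap
            w = fit_ridge(x_l[idx], y_l[idx])
            preds.append(features(x_pool) @ w)
        variance = np.var(preds, axis=0)      # disagreement per candidate
        variance[labeled] = -np.inf           # never re-query a labeled point
        nxt = int(np.argmax(variance))
        labeled.append(nxt)
        y_labeled.append(oracle(x_pool[nxt])) # run the "experiment"
    return labeled

pool = np.linspace(0.0, 1.0, 50)
queried = active_learning(pool)
print(sorted(queried))
```

Swapping the variance criterion for a Bayesian acquisition function or a density-based score changes only the line that ranks candidates, which is why the choice of sampling function can be tuned to the task and data at hand.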
Because active learning has been proven to reduce annotation costs by orders of magnitude [75] and to be an effective strategy in materials design [76], we envisage that active learning methods can accelerate the multiscale design of new products by directing experimental/simulation efforts at the microscale and macroscale levels.

Benchmarking platforms for accelerating progress in CMD
To accelerate progress in molecular, product, and process design, it is essential to develop benchmarking platforms with standard case studies and general evaluation metrics. These case studies and metrics are needed to independently assess the characteristics of developed molecular design models. In generative modeling, a few platforms have been introduced to quantify multiple benchmarks, such as the validity and novelty of generated molecules [57]. Despite the large number of case studies in CMD, parallel evaluation metrics are yet to be introduced in the literature [1••]. Thus, collaborative work is urgently needed to build a library of open-source datasets, independently tested models, metrics, and product/process case studies for the different scales of molecular design (Figure 1c).

Conclusions
In this paper, we reviewed the current progress of applying machine learning to elements of molecular design problems across different scales. As seen, however, the current CMD methods and tools do not leverage the full capabilities and novelties that machine learning offers. In the perspectives section, we highlighted how deep learning can better manage the complexity of integrating microscale properties to their macroscale effects and decisions to obtain superior products and processes. Furthermore, we emphasized the importance of applying
active learning to optimize physical experimental and simulation systems. Although the potential payoff of applying machine learning to CMD is higher than ever, practical challenges in benchmarking and dataset availability need to be addressed to catalyze quantifiable and benchmarked improvements. Looking ahead, many emerging methods for uncertainty and interpretability in deep learning can also alleviate the opaqueness of data-driven methods by providing error descriptions and generating explanations of decisions in the chemical product/process design space.

Conflict of interest statement
Nothing declared.

Acknowledgement
Created with BioRender.com.

References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest

1.•• Alshehri AS, Gani R, You F: Deep learning and knowledge-based methods for computer-aided molecular design - toward a unified approach: state-of-the-art and future directions. Comput Chem Eng 2020, 141:107005.
The current perspectives and problems of knowledge-based and data-driven CMD are discussed in depth in this paper, which also suggests several areas for future research directions.

2.•• Uhlemann J, Costa R, Charpentier J-C: Product design and engineering - past, present, future trends in teaching, ...
This article examines Product Design & Engineering's current state and limitations, as well as academic and industry requirements, unfilled needs, and product manufacturing views in the context of Industry 4.0.

3. Austin ND, Sahinidis NV, Trahan DW: Computer-aided molecular design: an introduction and review of tools, applications, and solution techniques. Chem Eng Res Des 2016, 116:2-26.

4.• Zhang L, Mao H, Liu Q, Gani R: Chemical product design - recent advances and perspectives.
Curr Opin Chem Eng 2020, 27:22-34.
This paper presents and discusses methodologies for systematic chemical product design and application, computer-aided design methods and tools, as well as obstacles and opportunities.

5. Zhang L, Babi DK, Gani R: New vistas in chemical product and process design. Annu Rev Chem Biomol Eng 2016, 7:557-582.

6. Taifouris M, Martín M, Martinez A, Esquejo N: Challenges in the design of formulated products: multiscale process and product design. Curr Opin Chem Eng 2020, 27:1-9.
This paper offers a holistic overview of the present trend in formulated product design as a multidisciplinary field, in which tradeoffs between product performance, environmental impact, and cost are examined from a global viewpoint.

7. Garcia D, You F: Systems engineering opportunities for agricultural and organic waste management in the food-water-energy nexus. Curr Opin Chem Eng 2017, 18:23-31.

8.•• Alshehri AS, You F: Paradigm shift: the promise of deep learning in molecular systems engineering and design. Front Chem Eng 2021, 3:26 http://dx.doi.org/10.3389/fceng.2021.700717.
This recent work highlights current advancements and promising directions for several deep learning architectures, algorithms, and optimization platforms for CMD, and summarizes the progress across several key aspects of molecular systems.

9. Venkatasubramanian V: The promise of artificial intelligence in chemical engineering: is it here, finally? AIChE J 2019, 65:466-478.

10.• Coley CW, Green WH, Jensen KF: Machine learning in computer-aided synthesis planning. Acc Chem Res 2018, 51:1281-1289.
This paper examines two major domains and applications within organic synthesis where machine learning has been applied: retrosynthesis and forward synthesis.

11. Rong Y, Bian Y, Xu T, Xie W, Wei Y, Huang W, Huang J: Self-supervised graph transformer on large-scale molecular data. Adv Neural Inf Process Syst 2020, 33.

12. Tetko IV, Karpov P, Van Deursen R, Godin G: State-of-the-art augmented NLP transformer models for direct and single-step
retrosynthesis. Nat Commun 2020, 11:1-11.

13. Pistikopoulos EN, Barbosa-Povoa A, Lee JH, Misener R, Mitsos A, Reklaitis GV, Venkatasubramanian V, You F, Gani R: Process systems engineering - the generation next? Comput Chem Eng 2021, 147:107252.

14.•• Sanchez-Lengeling B, Aspuru-Guzik A: Inverse molecular design using machine learning: generative models for matter engineering. Science 2018, 361:360-365.
This paper discusses deep generative approaches for achieving inverse design, which attempts to find customized materials starting from a specifically intended functionality.

15. Butler KT, Davies DW, Cartwright H, Isayev O, Walsh A: Machine learning for molecular and materials science. Nature 2018, 559:547-555.

16. Raghu M, Schmidt E: A survey of deep learning for scientific discovery. arXiv preprint 2020, arXiv:2003.11755.

17. Ramprasad R, Batra R, Pilania G, Mannodi-Kanakkithodi A, Kim C: Machine learning in materials informatics: recent applications and prospects. npj Comput Mater 2017, 3:54.

18. Gani R: Group contribution-based property estimation methods: advances and perspectives. Curr Opin Chem Eng 2019, 23:184-196.

19. Jha D, Choudhary K, Tavazza F, Liao W-k, Choudhary A, Campbell C, Agrawal A: Enhancing materials property prediction by leveraging computational and experimental data using deep transfer learning. Nat Commun 2019, 10:5316.

20. Chen G, Song Z, Qi Z, Sundmacher K: Neural recommender system for the activity coefficient prediction and UNIFAC model extension of ionic liquid-solute systems. AIChE J 2021, 67:e17171.

21. Zhang T, Li Y, Li Y, Sun S, Gao X: A self-adaptive deep learning algorithm for accelerating multi-component flash calculation. Comput Methods Appl Mech Eng 2020, 369:113207.

22. Alshehri AS, Tula AK, Wang L, Zhang L, Gani R, You F: Next generation pure component property estimation models: with and without machine learning techniques.
AIChE J 2021:e17469 http://dx.doi.org/10.1002/aic.17469 (in press).

23. Liu Q, Zhang L, Tang K, Liu L, Du J, Meng Q, Gani R: Machine learning-based atom contribution method for the prediction of surface charge density profiles and solvent design. AIChE J 2021, 67:e17110.

24. Song Z, Shi H, Zhang X, Zhou T: Prediction of CO2 solubility in ionic liquids using machine learning methods. Chem Eng Sci 2020, 223:115752.

25. Zhang L, Mao H, Liu L, Du J, Gani R: A machine learning based computer-aided molecular design/screening methodology for fragrance molecules. Comput Chem Eng 2018, 115:295-308.

26. Yalamanchi KK, van Oudenhoven VCO, Tutino F, Monge-Palacios M, Alshehri A, Gao X, Sarathy SM: Machine learning to predict standard enthalpy of formation of hydrocarbons. J Phys Chem A 2019, 123:8305-8313.

27. Su Y, Wang Z, Jin S, Shen W, Ren J, Eden MR: An architecture of deep learning in QSPR modeling for the prediction of critical properties using molecular signatures. AIChE J 2019, 65:e16678.

28. Schweidtmann AM, Rittig JG, König A, Grohe M, Mitsos A, Dahmen M: Graph neural networks for prediction of fuel ignition quality. Energy Fuels 2020, 34:11395-11407.

29. Scalia G, Grambow CA, Pernici B, Li Y-P, Green WH: Evaluating scalable uncertainty estimation methods for deep learning-based molecular property prediction. J Chem Inf Model 2020, 60:2697-2717.

30. Hirschfeld L, Swanson K, Yang K, Barzilay R, Coley CW: Uncertainty quantification using neural networks for molecular property prediction. J Chem Inf Model 2020, 60:3770-3780.

31. Cronin L, Mehr SHM, Granda JM: Catalyst: the metaphysics of chemical reactivity.
Chem 2018, 4:1759-1761.
32. Ess D, Gagliardi L, Hammes-Schiffer S: Introduction: computational design of catalysts from molecules to materials. Chem Rev 2019, 119:6507-6508.
33. Kitchin JR: Machine learning in catalysis. Nat Catal 2018, 1:230-232.
34. Toyao T, Maeno Z, Takakusagi S, Kamachi T, Takigawa I, Shimizu K-I: Machine learning for catalysis informatics: recent applications and prospects. ACS Catal 2020, 10:2260-2297.
35. Nandy A, Duan C, Taylor MG, Liu F, Steeves AH, Kulik HJ: Computational discovery of transition-metal complexes: from high-throughput screening to machine learning. Chem Rev 2021.
36. Janet JP, Kulik HJ: Predicting electronic structure properties of transition metal complexes with neural networks. Chem Sci 2017, 8:5137-5152.
37. Chang AM, Freeze JG, Batista VS: Hammett neural networks: prediction of frontier orbital energies of tungsten-benzylidyne photoredox complexes. Chem Sci 2019, 10:6844-6854.
38. Guda AA, Guda SA, Lomachenko KA, Soldatov MA, Pankin IA, Soldatov AV, Braglia L, Bugaev AL, Martini A, Signorile M, et al.: Quantitative structural determination of active sites from in situ and operando XANES spectra: from standard ab initio simulations to chemometric and machine learning approaches. Catal Today 2019, 336:3-21.
39. Miyazato I, Takahashi L, Takahashi K: Automatic oxidation threshold recognition of XAFS data using supervised machine learning. Mol Syst Des Eng 2019, 4:1014-1018.
40. Vasudevan RK, Laanait N, Ferragut EM, Wang K, Geohegan DB, [...] images. npj Comput Mater 2018, 4:30.
41. Zong H, Pilania G, Ding X, Ackland GJ, Lookman T: Developing an interatomic potential for martensitic phase transformations in zirconium by machine learning. npj Comput Mater 2018, 4:48.
42. Pun GPP, Batra R, Ramprasad R, Mishin Y: Physically informed artificial neural networks for atomistic modeling of materials. Nat Commun 2019, 10:2339.
43. Andolina CM, Williamson P, Saidi WA: Optimization and validation of a deep learning CuZr atomistic potential: robust applications for crystalline and amorphous phases with near-DFT accuracy.
J Chem Phys 2020, 152:154701.
44. Li Z, Omidvar N, Chin WS, Robb E, Morris A, Achenie L, Xin H: Machine-learning energy gaps of porphyrins with molecular graph representations. J Phys Chem A 2018, 122:4571-4578.
45. Ulissi ZW, Tang MT, Xiao J, Liu X, Torelli DA, Karamad M, Cummins K, Hahn C, Lewis NS, Jaramillo TF, et al.: Machine-learning methods enable exhaustive searches for active bimetallic facets and reveal active site motifs for CO2 reduction. ACS Catal 2017, 7:6600-6608.
46. Li Z, Achenie LEK, Xin H: An adaptive machine learning strategy for accelerating discovery of perovskite electrocatalysts. ACS Catal 2020, 10:4377-4384.
47. Tran K, Ulissi ZW: Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution. Nat Catal 2018, 1:696-703.
48. Gottipati SK, Sattarov B, Niu S, Pathak Y, Wei H, Liu S, Liu S, [...]. In Proceedings of the 37th International Conference on Machine Learning, vol 119. Edited by Hal D III, Aarti S. 2020:3668-3679.
49. Coley CW, Barzilay R, Jaakkola TS, Green WH, Jensen KF: Prediction of organic reaction outcomes using machine learning. ACS Cent Sci 2017, 3:434-443.
50. Mann V, Venkatasubramanian V: Predicting chemical reaction outcomes: a grammar ontology-based transformer framework. AIChE J 2021, 67:e17190.
51. Segler MHS, Preuss M, Waller MP: Planning chemical syntheses with deep neural networks and symbolic AI. Nature 2018, 555:604-610.
52. Segler MHS, Waller MP: Neural-symbolic machine learning for retrosynthesis and reaction prediction. Chem Eur J 2017, 23:5966-5971.
53. Dai H, Li C, Coley CW, Dai B, Song L: Retrosynthesis prediction with conditional graph logic network. arXiv 2020.
54. Thakkar A, Chadimová V, Bjerrum EJ, Engkvist O, Reymond J-L: Retrosynthetic accessibility score (RAscore) – rapid machine learned synthesizability classification from AI driven retrosynthetic planning. Chem Sci 2021, 12:3339-3349.
55. Kuznetsov A, Sahinidis NV: ExtractionScore: a quantitative framework for evaluating synthetic routes on predicted liquid-liquid extraction performance. J Chem Inf Model 2021, 61:2274-2282.
56. Gani R, Bałdyga J, Biscans B, Brunazzi E, Charpentier J-C, Drioli E, Feise H, Furlong A,
Van Geem KM, de Hemptinne J-C, et al.: A multi-layered view of chemical and biochemical engineering. Chem Eng Res Des 2020, 155:A133-A145.
57. Brown N, Fiscato M, Segler MHS, Vaucher AC: GuacaMol: benchmarking models for de novo molecular design. J Chem Inf Model 2019, 59:1096-1108.
58. Amar Y, Schweidtmann AM, Deutsch P, Cao L, Lapkin A: Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis. Chem Sci 2019, 10:6697-6706.
59. Fetanat M, Keshtiara M, Low Z-X, Keyikoglu R, Khataee A, Orooji Y, Chen V, Leslie G, Razmjou A: Machine learning for advanced design of nanocomposite ultrafiltration membranes. Ind Eng Chem Res 2021, 60:5236-5250.
60. Alshehri AS, Tula AK, Zhang L, Gani R, You F: A platform of machine learning-based next-generation property estimation methods for CAMD. In Computer Aided Chemical Engineering. Elsevier; 2021.
61. Weininger D: SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Model 1988, 28:31-36.
62. Noh J, Kim J, Stein HS, Sanchez-Lengeling B, Gregoire JM, Aspuru-Guzik A, Jung Y: Inverse design of solid-state materials via a continuous representation. Matter 2019, 1:1370-1384.
63. Yao Z, Sanchez-Lengeling B, Bobbitt NS, Bucior BJ, Kumar SGH, Collins SP, Burns T, Woo TK, Farha OK, Snurr RQ, et al.: Inverse design of nanoporous crystalline reticular materials with deep generative models. Nat Mach Intell 2021, 3:76-86.
64. Simm G, Pinsler R, Hernández-Lobato JM: Reinforcement learning for molecular design guided by quantum mechanics. In Proceedings of the 37th International Conference on Machine Learning, vol 119. Edited by Hal D III, Aarti S. Proceedings of Machine Learning Research: PMLR; 2020:8959-8969.
65. Mazyavkina N, Sviridov S, Ivanov S, Burnaev E: Reinforcement learning for combinatorial optimization: a survey. Comput Oper Res 2021:105400.
66. Williams RJ: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach Learn 1992, 8:229-256.
67. Mnih V, Badia AP, Mirza M, Graves A, Lillicrap T, Harley T, Silver D, Kavukcuoglu K: Asynchronous methods for deep reinforcement learning. In International Conference on Machine Learning: PMLR; 2016:1928-1937.
68. Lütjens B, Everett M, How JP: Safe reinforcement learning with model uncertainty estimates. In 2019 International Conference on Robotics and Automation (ICRA); 20-24 May 2019. 2019:8662-8668.
69. Wang Y, Lamim Ribeiro JM, Tiwary P: Machine learning approaches for analyzing and enhancing molecular dynamics simulations. Curr Opin Struct Biol 2020, 61:139-145.
70. Kusne AG, Yu H, Wu C, Zhang H, Hattrick-Simpers J, DeCost B, Sarker S, Oses C, Toher C, Curtarolo S, et al.: On-the-fly closed-loop materials discovery via Bayesian active learning. Nat Commun 2020, 11:5966.
71. Beluch WH, Genewein T, Nürnberger A, Köhler JM: The power of ensembles for active learning in image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018:9368-9377.
72. Ostapuk N, Yang J, Cudré-Mauroux P: ActiveLink: deep active learning for link prediction in knowledge graphs. In The World Wide Web Conference. 2019:1398-1408.
73. Sener O, Savarese S: Active learning for convolutional neural networks: a core-set approach. arXiv preprint 2017, arXiv:1708.00489.
74. Lookman T, Balachandran PV, Xue D, Yuan R: Active learning in materials science with emphasis on adaptive sampling using uncertainties for targeted design. npj Comput Mater 2019, 5:21.
75. Ren P, Xiao Y, Chang X, Huang P-Y,
Li Z, Chen X, Wang X: A survey of deep active learning. arXiv preprint 2020, arXiv:2009.00236.
76. MacLeod BP, Parlane FGL, Morrissey TD, Häse F, Roch LM, Dettelbach KE, Yunker LPE, Rooney MB, Deeth JR, et al.: Self-driving laboratory for accelerated discovery of thin-film materials. Sci Adv 2020, 6:eaaz8867.
