
TAC: Towered Actor Critic for Handling Multiple Action Types in Reinforcement Learning for Drug Discovery

Sai Krishna Gottipati, Yashaswi Pathak, Boris Sattarov, Sahir, Rohan Nuttall, Mohammad Amini, Mathew E. Taylor, Sarath Chandar

Introduction
● Reinforcement Learning (RL) algorithms typically deal with only a single type of action (discrete or continuous).
● However, in several pivotal real-world applications like drug discovery, the agent needs to reason over different types of actions.
● To accommodate these needs, we propose a novel framework: TAC (towered actor critic), an architecture that aims to handle multiple action types.

Contributions
● We introduce a novel mechanism, the towered actor critic, to train agents on MDPs with multiple action types (a toy example of such a mixed action space is sketched after this list).
● TAC is applied to the task of reaction-based de novo drug design (where there are different action types at every time step) and shows significant improvement over the existing state-of-the-art results.
● TAC is also applied to standard RL tasks that do not have an inherent pyramidal/hierarchical structure of actions. Performance is comparable, or improved, relative to TD3 on several continuous-action OpenAI Gym environments.
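As a toy illustration (not from the poster), an MDP with multiple action types can expose an action space that combines a discrete choice (e.g., a reaction template) with a continuous vector (e.g., second-reactant features). The sketch below uses gym.spaces with hypothetical sizes:

    # Toy sketch of a mixed action space (hypothetical sizes, not the poster's setup).
    import numpy as np
    from gym import spaces

    NUM_TEMPLATES = 64   # hypothetical number of reaction templates (discrete action type)
    FEATURE_DIM = 32     # hypothetical second-reactant feature dimension (continuous action type)

    mixed_action_space = spaces.Dict({
        "template": spaces.Discrete(NUM_TEMPLATES),
        "reactant_features": spaces.Box(low=-1.0, high=1.0,
                                        shape=(FEATURE_DIM,), dtype=np.float32),
    })

    action = mixed_action_space.sample()  # e.g. {"template": 17, "reactant_features": array([...])}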

Method
The TAC formulation is developed on top of our previous work for de novo drug design, Policy Gradient for Forward Synthesis (PGFS). PGFS is an off-policy algorithm with an actor module that consists of f and 𝜋 networks. The f network takes the current state s (reactant R(1)) as input and outputs a reaction template T. The 𝜋 network uses R(1) and T to compute the action a (a feature representation of the second reactant). Inside the environment, a k-NN uses a to compute the k closest valid second reactants, among which the maximum-rewarding reactant is selected as R(2). R(1), R(2), and T are used to simulate a reaction and transition to the next state. PGFS uses TD3 to update the actor (f, 𝜋) and critic (Q) networks; in PGFS, the parameters of the actor and critic networks are updated with the standard TD3 losses.
● PGFS uses two types of reaction templates (actions): uni-molecular and bi-molecular.
● Uni-molecular reactions do not require a second reactant, but the parameters of the 𝜋 network are still updated.
● We thus propose a novel critic architecture to independently handle both action types (uni-molecular and bi-molecular).
● We first predict the product of the chemical reaction (i.e., the next state) and then compute the value function for the predicted product, as sketched after this list.
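A minimal sketch of this towered-critic idea (the module names, layer sizes, and the product-prediction network below are placeholders, not the poster's exact architecture): the critic predicts a representation of the reaction product and evaluates it with a separate tower for uni-molecular and bi-molecular actions.

    import torch
    import torch.nn as nn

    STATE_DIM, ACTION_DIM, HIDDEN = 128, 32, 256   # hypothetical sizes

    class ToweredCritic(nn.Module):
        # One Q tower per action type: uni-molecular vs. bi-molecular.
        def __init__(self):
            super().__init__()
            # Predicts a representation of the product (next state) from the current
            # reactant R(1) and the action a (template embedding omitted for brevity).
            self.product_predictor = nn.Sequential(
                nn.Linear(STATE_DIM + ACTION_DIM, HIDDEN), nn.ReLU(),
                nn.Linear(HIDDEN, STATE_DIM),
            )
            # Uni-molecular tower: no second reactant, so it scores the predicted product alone.
            self.q_uni = nn.Sequential(
                nn.Linear(STATE_DIM, HIDDEN), nn.ReLU(), nn.Linear(HIDDEN, 1),
            )
            # Bi-molecular tower: scores the predicted product together with the
            # continuous action (second-reactant features).
            self.q_bi = nn.Sequential(
                nn.Linear(STATE_DIM + ACTION_DIM, HIDDEN), nn.ReLU(), nn.Linear(HIDDEN, 1),
            )

        def forward(self, state, action, bi_mask):
            # state: (B, STATE_DIM); action: (B, ACTION_DIM);
            # bi_mask: (B, 1), 1.0 where the chosen template is bi-molecular.
            product = self.product_predictor(torch.cat([state, action], dim=-1))
            q_uni = self.q_uni(product)
            q_bi = self.q_bi(torch.cat([product, action], dim=-1))
            # Route each transition to the tower matching its action type.
            return bi_mask * q_bi + (1.0 - bi_mask) * q_uni

The mask routes each transition to the tower for its action type, so uni-molecular and bi-molecular actions are valued independently, as described above.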

Experimental Results
TAC shows significant improvement over the state of the art on reaction-based de novo drug design, and comparable or better performance than TD3 on several continuous-action OpenAI Gym environments.

Conclusion
We propose a novel framework for handling multiple action types and demonstrate significant improvements over the state of the art in de novo drug design. We further extend the framework to standard OpenAI Gym environments and demonstrate better or comparable performance to TD3.
