Welcome to Scribd!

s21 Rec1 Gym

Uploaded by

0% found this document useful (0 votes)

8 views14 pages

OpenAI Gym is a toolkit that provides various environments for testing reinforcement learning algorithms. It has a standard API to access different environments, making it safe and easy to get started with RL. Common aspects of environments include making the environment instance, understanding the action and state spaces, resetting the environment to start an episode, having the agent select an action, and stepping through the environment to get the new state, reward, and done signal. OpenAI Gym is widely used for RL research and practice developing agents.

Original Description:

Original Title

s21_rec1_gym

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

8 views14 pages

s21 Rec1 Gym

Uploaded by

KSD

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 14

Search inside document

Introduction to OpenAI Gym

05-Feb-21
Basic RL Framework
Open AI Gym
What is OpenAI Gym?

● A toolkit for testing RL algorithms

● Provides you with different environments (https://gym.openai.com/envs/)
● Up to you to create RL agents for these environments
● Has a standard API to access these different environments
Why do we want to use the OpenAI gym?

● Safe and easy to get started

● Its open source
● Intuitive API
● Widely used in a lot of RL research
● Great place to practice development of RL agents
Common Aspects of OpenAI Gym Environments

● Making the environment

● Action space, state space
● Reset function
● Step function
Action and State/Observation Spaces

● Environments come with the variables state_space and observation_space (contain shape information)
● Important to understand the state and action space before getting started
○ What kind of information does the environment give the agent? (state information)
○ What are the actions that the agent needs to choose from?

State Space Action Space

Stepping Through The Environment
Make the Environment

env = gym.make(“CartPole-v0”)

● Instantiating the environment

● Use the ‘env’ variable to access the instance
Before this point we had:

env

Reset Function

state = env.reset()

● Resets the environment to its starting state

● Call this before beginning each episode
● Returns the state information of the starting state
Before this point we had:

env state

Choosing The Action

action = YourAgent(state)

● Your agent will generally contain a tensorﬂow model that accepts the state information as an input
● action should be a vector that matches the dimensions of the action space
Before this point we had:

env state action

Stepping Through The Environment

new_state, reward, done, info = env.step(action)

● New State : state information of the state after executing the action in the environment
● Reward : numerical reward received from executing the action
● Done : boolean value representing whether this episode has terminated or not
● Info : Additional information about the environment
At This Point We Have Values For:

env state action new_state reward done info

Example

Link to Colab Notebook :

https://colab.research.google.com/drive/1PDdfwG1cZB6YXYsqkask6iDw3_XoYHTR
Resources

https://gym.openai.com/envs/

https://gym.openai.com/docs/

https://github.com/openai/gym/wiki

Computability Theory: An Introduction to Recursion Theory, Students Solutions Manual (e-only)
From Everand
Computability Theory: An Introduction to Recursion Theory, Students Solutions Manual (e-only)
Herbert B. Enderton
No ratings yet
Work Flow
Document77 pages
Work Flow
apraju403
No ratings yet
Corrosion Predictions and Risk Assessment in Oilfield Production Systems
Document37 pages
Corrosion Predictions and Risk Assessment in Oilfield Production Systems
Muhammad Awais
No ratings yet
Computer SC Specimen QP Class Xi
Document7 pages
Computer SC Specimen QP Class Xi
Ashutosh Raj
0% (1)
Cooling
Document26 pages
Cooling
Shahrukh Ansari
No ratings yet
Cloning and Refreshing An Oracle Database
Document11 pages
Cloning and Refreshing An Oracle Database
chandramani.tiwari
0% (1)
Dryden Inference Chaining
Document10 pages
Dryden Inference Chaining
matalacur
No ratings yet
CS451/CS551/EE565 Artificial Intelligence: Agent Types 8-29-2008 Prof. Janice T. Searleman, Jetsza
Document30 pages
CS451/CS551/EE565 Artificial Intelligence: Agent Types 8-29-2008 Prof. Janice T. Searleman, Jetsza
s.ranjith
No ratings yet
Ai 2
Document35 pages
Ai 2
bandook
No ratings yet
AI Agent CU CSE
Document5 pages
AI Agent CU CSE
abhi_69
No ratings yet
CS361 Lec 03
Document26 pages
CS361 Lec 03
Omar Ahmed
No ratings yet
Android Interview Questions
Document7 pages
Android Interview Questions
أسماء جمال
No ratings yet
Brief Overview of Workflow Step
Document32 pages
Brief Overview of Workflow Step
Ashok Srinivasan
No ratings yet
SOS Midterm
Document8 pages
SOS Midterm
Jay Vora
No ratings yet
Agent Types
Document20 pages
Agent Types
Madara
No ratings yet
Adit Day 3
Document3 pages
Adit Day 3
Adit Ghadi
No ratings yet
Reinforcement Learning - Ipynb - Colaboratory
Document7 pages
Reinforcement Learning - Ipynb - Colaboratory
zb lai
No ratings yet
10-703 Deep RL and Controls Openai Gym Recitation: Devin Schwab
Document43 pages
10-703 Deep RL and Controls Openai Gym Recitation: Devin Schwab
Juan Daniel Aguilar
No ratings yet
Flutter Mobile Programming Navigators: 1: What Is Gesture-Detector? Ans
Document9 pages
Flutter Mobile Programming Navigators: 1: What Is Gesture-Detector? Ans
Ahmed Abduqadir Bare
No ratings yet
Day 1 - Control Flow
Document53 pages
Day 1 - Control Flow
Nii Okai Quaye
No ratings yet
PEGA
Document14 pages
PEGA
Super Surya
No ratings yet
Search The Community Search SAP: Login Register
Document48 pages
Search The Community Search SAP: Login Register
Anil Kumar
No ratings yet
Reinforcement Learning: Russell and Norvig: CH 21
Document16 pages
Reinforcement Learning: Russell and Norvig: CH 21
Zuzar
No ratings yet
AI (FSM, Behavior Tree, GOAP, Utility AI) - Nez Framework Documentation
Document3 pages
AI (FSM, Behavior Tree, GOAP, Utility AI) - Nez Framework Documentation
Anna J Bischoff
No ratings yet
Intelligent Agents
Document62 pages
Intelligent Agents
Raunak Das
No ratings yet
Visual Studio 2008: Windows Workflow Foundation
Document29 pages
Visual Studio 2008: Windows Workflow Foundation
api-3700231
No ratings yet
Intelliget Agent 2 How To Design
Document24 pages
Intelliget Agent 2 How To Design
thejaka aloka
No ratings yet
"Quo Teinp Ut" "Color Outpu T" "Col Orin Put" "Fontf Amilyi Nput" "Feed Backo Utput" "Quo Teou Tput" "Font Sizei Nput"
Document3 pages
"Quo Teinp Ut" "Color Outpu T" "Col Orin Put" "Fontf Amilyi Nput" "Feed Backo Utput" "Quo Teou Tput" "Font Sizei Nput"
colin leach
No ratings yet
Software Testing Summary
Document3 pages
Software Testing Summary
iMos 1/7
No ratings yet
Artificial Intelligence
Document52 pages
Artificial Intelligence
Spencer Bharath
No ratings yet
AI Lec2-1
Document24 pages
AI Lec2-1
Omar Saber
No ratings yet
Neural Networks
Document39 pages
Neural Networks
Raja
No ratings yet
CSL0777 L19
Document23 pages
CSL0777 L19
Konkobo Ulrich Arthur
No ratings yet
AI As The Design of Agents
Document26 pages
AI As The Design of Agents
Nenad Nedeljkov
No ratings yet
Ibm BPM
Document9 pages
Ibm BPM
naresh514
No ratings yet
Is-Lecture 2 (Intelligent Agents)
Document51 pages
Is-Lecture 2 (Intelligent Agents)
Arlene Abillonar
No ratings yet
Spring Mass Damper
Document5 pages
Spring Mass Damper
Amit Sharma
No ratings yet
Unit 2 Problem Solving
Document9 pages
Unit 2 Problem Solving
jskri3399
No ratings yet
Computer Programming Languages: Lecture - 1
Document38 pages
Computer Programming Languages: Lecture - 1
Gul khan
No ratings yet
21BAI10450 MATLABReport
Document11 pages
21BAI10450 MATLABReport
meet.joysar2021
No ratings yet
Java
Document98 pages
Java
swethaTMR
No ratings yet
864 NGRX Cluj
Document50 pages
864 NGRX Cluj
Mnr Camille
No ratings yet
The Application Life Cycle Has The Following Stages
Document16 pages
The Application Life Cycle Has The Following Stages
Raghu
No ratings yet
To Complete This Project, Follow These Instructions:: Step 1: Download The Simulator
Document3 pages
To Complete This Project, Follow These Instructions:: Step 1: Download The Simulator
jose
No ratings yet
ROS-I Basic Developers Training - Session 2 PDF
Document45 pages
ROS-I Basic Developers Training - Session 2 PDF
blah
No ratings yet
AI As The Design of Agents
Document27 pages
AI As The Design of Agents
a0920936560
No ratings yet
ML Unit-Iv
Document18 pages
ML Unit-Iv
SB
No ratings yet
A Gent P R O G R A M S: Agent Programs
Document7 pages
A Gent P R O G R A M S: Agent Programs
S.Srinidhi sandoshkumar
No ratings yet
MIC ch4
Document51 pages
MIC ch4
Diksha Katakdhond
No ratings yet
Commonly Asked Pega Interview Questions
Document14 pages
Commonly Asked Pega Interview Questions
Riya Paul
No ratings yet
Twain Unit For Delphi Tacquireimage Ver 1.4 (Oct/2000)
Document13 pages
Twain Unit For Delphi Tacquireimage Ver 1.4 (Oct/2000)
Pedro Rodrigo Santos
No ratings yet
Data Challenge - NC Soft
Document4 pages
Data Challenge - NC Soft
DhivyaRajprasad
No ratings yet
CS361 AI Week2 Lecture1
Document21 pages
CS361 AI Week2 Lecture1
Mohamed Hany
No ratings yet
Android Programming CSE - IT Engineering Lecture Notes
Document36 pages
Android Programming CSE - IT Engineering Lecture Notes
Lokesh Rao
No ratings yet
m2 Agents
Document42 pages
m2 Agents
Tariq Iqbal
No ratings yet
M2 - Mad
Document40 pages
M2 - Mad
Sh Du
No ratings yet
ROS-I Basic Developers Training (ROS2) - Session 2
Document43 pages
ROS-I Basic Developers Training (ROS2) - Session 2
nhatduy.ctt1
No ratings yet
Silk Test
Document32 pages
Silk Test
nirkum
No ratings yet
Understanding State Management: Information Needed For Our Application View
Document22 pages
Understanding State Management: Information Needed For Our Application View
Yesmine Makkes
No ratings yet
Unit 2 AI
Document22 pages
Unit 2 AI
nityamprakasharya123
No ratings yet
Manapega Interview Questions
Document110 pages
Manapega Interview Questions
mahesh gani
No ratings yet
Summary of Previous Lecture
Document52 pages
Summary of Previous Lecture
faisal
No ratings yet
SentiBotics Trial Documentation
Document10 pages
SentiBotics Trial Documentation
Taufiq Ariefianto
No ratings yet
21BAI10063 MonteCarloLab
Document18 pages
21BAI10063 MonteCarloLab
meet.joysar2021
No ratings yet
Basdat II - Pertemuan III
Document27 pages
Basdat II - Pertemuan III
Kharisma widya Mutiara
No ratings yet
Lab Manual DBT 210 PDF
Document19 pages
Lab Manual DBT 210 PDF
BELAL ALSUBARI
No ratings yet
Manual Eng
Document103 pages
Manual Eng
Goran Klemčić
No ratings yet
05 Relevant Costing With Linear Programming
Document9 pages
05 Relevant Costing With Linear Programming
randomlungs121223
No ratings yet
Instruction and Maintenance Manual Electronic Lung Ventilator Model 190
Document12 pages
Instruction and Maintenance Manual Electronic Lung Ventilator Model 190
moh12109
No ratings yet
12th Round MTech Weblist
Document4 pages
12th Round MTech Weblist
sm69xd
No ratings yet
Umis 1268 1658060121
Document8 pages
Umis 1268 1658060121
shakir
No ratings yet
R7220402 Control Systems
Document1 page
R7220402 Control Systems
sivabharathamurthy
No ratings yet
IJSTR Template
Document4 pages
IJSTR Template
Giridhar Parameswaran
No ratings yet
Power Supply Cp-E 24-2.5a
Document4 pages
Power Supply Cp-E 24-2.5a
Fredy Solano
No ratings yet
Computer Graphics and Animation Journal 1
Document51 pages
Computer Graphics and Animation Journal 1
templategamingff
No ratings yet
Gayong-Gayong Sur Integrated School Second Periodic Examination in Science 10
Document2 pages
Gayong-Gayong Sur Integrated School Second Periodic Examination in Science 10
Laira Joy Salvador - Viernes
No ratings yet
It 1352-Cryptography and Network Security Question Paper
Document2 pages
It 1352-Cryptography and Network Security Question Paper
11dreamer
No ratings yet
Sediment Mixtures and Bed Stratigraphy - Jagers - 2012
Document33 pages
Sediment Mixtures and Bed Stratigraphy - Jagers - 2012
javier
No ratings yet
Physical Geography MCQ
Document8 pages
Physical Geography MCQ
MRINMOY SAHA
No ratings yet
Supermax Series
Document4 pages
Supermax Series
yury1102
No ratings yet
9828circle, GM and Conc 1
Document2 pages
9828circle, GM and Conc 1
Anshul Garg
No ratings yet
EPM Manual Model 42i
Document334 pages
EPM Manual Model 42i
Manuk Elfaruk
No ratings yet
Shell Flowmeter Eng1neer1ng Handbook: Edition
Document3 pages
Shell Flowmeter Eng1neer1ng Handbook: Edition
Dinesh
No ratings yet
MICROCONTROLLERS
Document35 pages
MICROCONTROLLERS
Ezequiel Posadas Bocacao
No ratings yet
04 Exp # 4,5,6,7 Free & Forced Convection
Document13 pages
04 Exp # 4,5,6,7 Free & Forced Convection
Abc
No ratings yet
High Pda and Corona Probe Readings: Confirmation of
Document53 pages
High Pda and Corona Probe Readings: Confirmation of
enmelmart
No ratings yet
UNIT III Directed Paths and Connectedness
Document2 pages
UNIT III Directed Paths and Connectedness
yashnaik7664
No ratings yet
643 Data Analysis Plan
Document8 pages
643 Data Analysis Plan
api-329054769
No ratings yet
Artificial Intelligence (Professional Elective - I) Ech. III Year II Sem. LTP C Course Code: CS613PE 3 0 0 3
Document2 pages
Artificial Intelligence (Professional Elective - I) Ech. III Year II Sem. LTP C Course Code: CS613PE 3 0 0 3
Swathi Tudicherla
No ratings yet
Data Mining Exercises - Solutions
Document5 pages
Data Mining Exercises - Solutions
Mehmet Zirek
No ratings yet