Rafael Rangga - IF-7

The document contains code to implement Q-learning in Python to solve a 5x6 gridworld environment. It imports NumPy, initializes the environment matrix and Q-table, sets hyperparameters, then runs the Q-learning algorithm over 1000 epochs to learn the optimal policy. It updates the Q-table based on rewards received from actions in each state and checks if the goal state is reached.
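The Q-table update mentioned above is the standard tabular Q-learning rule. A rough sketch of a single update step follows, assuming the hyperparameter values set in the listing (learning rate 0.1, discount factor 0.9). The helper name `q_update` is hypothetical and does not appear in the original code.

```python
def q_update(q_sa, reward, max_q_next, learning_rate=0.1, discount_factor=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the TD target."""
    # TD target: immediate reward plus discounted best value of the next state
    td_target = reward + discount_factor * max_q_next
    return q_sa + learning_rate * (td_target - q_sa)


# Starting from an all-zero table, a step onto a free cell (reward -1):
print(q_update(0.0, -1, 0.0))   # -0.1
# Starting from an all-zero table, a step onto the goal cell (reward 10):
print(q_update(0.0, 10, 0.0))   # 1.0
```

With a learning rate of 0.1, each update moves the stored value only a tenth of the way toward the target, which is why many epochs are needed before the values settle.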

Original Title

10123265_Rafael Rangga_IF-7

import numpy as np

# Environment matrix
# 0: path, 1: obstacle, 2: goal
environment = np.array([
    [0, 0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 2],
    [0, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 0, 0]
])
n_rows, n_cols = environment.shape

# Initialize the Q-table: one value per (row, column, action)
q_table = np.zeros((n_rows, n_cols, 4))

# Hyperparameters
learning_rate = 0.1
discount_factor = 0.9
exploration_rate = 0.1
epochs = 1000

# Q-learning algorithm
for epoch in range(epochs):
    state = (0, 0)  # The agent starts from the initial position

    while True:
        # Choose an action (0: row + 1, 1: row - 1, 2: column + 1, 3: column - 1)
        if np.random.uniform(0, 1) < exploration_rate:
            action = np.random.randint(0, 4)  # Pick a random action
        else:
            action = int(np.argmax(q_table[state]))  # Pick the best action according to the Q-table

        # Take the action and receive the reward.
        # Moves that would leave the grid are clamped to its edge.
        row = state[0] + (action == 0) - (action == 1)
        col = state[1] + (action == 2) - (action == 3)
        next_state = (min(max(row, 0), n_rows - 1), min(max(col, 0), n_cols - 1))
        cell = environment[next_state]
        reward = -1 if cell == 0 else (10 if cell == 2 else -100)

        # Update the Q-table based on the reward
        q_table[state][action] += learning_rate * (
            reward + discount_factor * np.max(q_table[next_state]) - q_table[state][action]
        )

        state = next_state

        # Check whether the agent has reached the goal or the final epoch has been reached
        if environment[state] == 2 or epoch == epochs - 1:
            break

# Print the learned Q-table
print("Q-table:")
print(q_table)
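The printed Q-table is hard to read as raw numbers. One way to inspect what was learned is to turn it into a grid of greedy actions. The sketch below assumes the Q-table holds one value per action for each cell, i.e. shape (rows, columns, 4), and uses the action encoding implied by the next-state formula in the listing (0: row + 1, 1: row - 1, 2: column + 1, 3: column - 1). The names `greedy_policy` and `ACTION_SYMBOLS` and the small demo arrays are hypothetical and are not part of the original code.

```python
import numpy as np

# Symbols for actions 0..3: down, up, right, left (encoding assumed from the listing)
ACTION_SYMBOLS = ["v", "^", ">", "<"]


def greedy_policy(q_table, environment):
    """Return a grid of symbols: the highest-valued action for each free cell."""
    best = np.argmax(q_table, axis=2)  # best action index per cell
    grid = []
    for r in range(environment.shape[0]):
        line = []
        for c in range(environment.shape[1]):
            if environment[r, c] == 1:
                line.append("#")  # obstacle
            elif environment[r, c] == 2:
                line.append("G")  # goal
            else:
                line.append(ACTION_SYMBOLS[best[r, c]])
        grid.append(line)
    return grid


# Tiny hand-made example: a 1x3 corridor with the goal at the right end
demo_env = np.array([[0, 0, 2]])
demo_q = np.zeros((1, 3, 4))
demo_q[0, 0, 2] = 1.0  # "right" is best in the first cell
demo_q[0, 1, 2] = 2.0  # "right" is best in the second cell

for line in greedy_policy(demo_q, demo_env):
    print(" ".join(line))  # > > G
```

Calling `greedy_policy(q_table, environment)` on the trained table should show the route the agent prefers from each cell. Cells that were never visited keep all-zero values and default to action 0.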
