Class Exercise MAB Demo, Thompson Sampling, RL - Sell Like A Wolf, Q-Learning

This document summarizes several reinforcement learning algorithms and applications discussed in a class exercise. It describes using multi-armed bandits to identify the slot machine with the highest reward. It also discusses using Thompson sampling for online advertising to identify the banner with the highest click-through rate. Finally, it summarizes using Thompson sampling to choose the best slot machine, an algorithm to evaluate different sales methods and identify the best one, and using reinforcement learning to help warehouse robots find the optimal route between locations.

Class Exercise

MAB demo, Thompson sampling, RL_sell like a wolf, Q-learning

Soubhagya Dash
PGP/25/116
Section A

MAB:
Here we apply reinforcement learning to the multi-armed bandit (MAB) problem to find the
slot machine that gives the maximum reward. We simulated a set of slot machines with a
function that lets us select any machine from 1 to 3; the function's return value is the
prize we receive for playing that machine. As we play repeatedly, the computer receives
more data, and through the reinforcement learning procedure the model's reward estimates
are refined, so the predicted reward depends on the number of rounds played.
In the exploration phase, the more rounds we played, the more accurate the result was.
Hence, the model was able to identify the luckiest machine accurately.
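A minimal sketch of such a simulation is given below. The win probabilities, function names, and the choice of playing machines uniformly at random are assumptions made for illustration, not the code used in class.

```python
import random

# Hypothetical win probabilities for machines 1..3 (assumed, not from the exercise).
WIN_PROBS = {1: 0.3, 2: 0.5, 3: 0.7}


def pull(machine, rng):
    """Return a reward of 1 with the machine's win probability, otherwise 0."""
    return 1 if rng.random() < WIN_PROBS[machine] else 0


def estimate_rewards(rounds, seed=0):
    """Play randomly chosen machines and return the average reward seen per machine."""
    rng = random.Random(seed)
    plays = {m: 0 for m in WIN_PROBS}
    wins = {m: 0 for m in WIN_PROBS}
    for _ in range(rounds):
        machine = rng.choice(list(WIN_PROBS))
        plays[machine] += 1
        wins[machine] += pull(machine, rng)
    # Machines that were never played get an estimate of 0.0.
    return {m: wins[m] / plays[m] if plays[m] else 0.0 for m in WIN_PROBS}


def luckiest_machine(rounds, seed=0):
    """Return the machine with the highest estimated reward after `rounds` plays."""
    estimates = estimate_rewards(rounds, seed)
    return max(estimates, key=estimates.get)
```

Calling `estimate_rewards` with a small and then a large number of rounds shows the estimates moving closer to the hidden probabilities as more rounds are played.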

Online Advertising:
Here, we focused on a firm that wants to market a product on many websites and needs to
identify the banner with the highest click-through rate (CTR).
First, we used A/B testing to find the online banner that yields the highest reward. The
test identified banner D as the most rewarding, but it was inefficient, because it could
not fully exploit the data it collected.
Next, we considered the epsilon-greedy algorithm. Its distinguishing feature is that it
chooses randomly between exploration and exploitation on each round (exploring with
probability epsilon), which keeps a balance between the two.
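The epsilon-greedy rule can be sketched as follows. The banner CTRs passed in, the default `epsilon=0.1`, and the function name are assumptions for illustration; the class exercise may have used different values.

```python
import random


def epsilon_greedy(ctrs, rounds, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy banner selection.

    `ctrs` holds hypothetical true click-through rates, one per banner.
    Returns (shows, clicks): how often each banner was shown and clicked.
    """
    rng = random.Random(seed)
    n = len(ctrs)
    shows = [0] * n
    clicks = [0] * n
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: show a banner chosen uniformly at random.
            arm = rng.randrange(n)
        else:
            # Exploit: show the banner with the best observed CTR so far
            # (banners not yet shown count as 0.0).
            rates = [clicks[i] / shows[i] if shows[i] else 0.0 for i in range(n)]
            arm = max(range(n), key=rates.__getitem__)
        shows[arm] += 1
        if rng.random() < ctrs[arm]:
            clicks[arm] += 1
    return shows, clicks
```

A larger `epsilon` spends more rounds exploring; a smaller one commits to the current best banner sooner.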
Third, we used the Upper Confidence Bound (UCB) algorithm, which accounts for the potential
of options that have been explored little. The uncertainty of each action's estimate is
calculated to define its upper confidence bound, and the potential of every action is then
the sum of its estimated value and a measure of how uncertain that estimate is.
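As an illustration of "estimate plus uncertainty", here is a sketch of the common UCB1 variant, where the uncertainty term is sqrt(2 ln t / n). Whether the class demo used exactly this formula is an assumption, as are the CTR values and function names.

```python
import math
import random


def ucb_score(mean, pulls, t):
    """UCB1 score: estimated value plus an uncertainty bonus that shrinks with pulls."""
    return mean + math.sqrt(2 * math.log(t) / pulls)


def ucb1(ctrs, rounds, seed=0):
    """Simulate UCB1 banner selection on hypothetical true CTRs.

    Returns (shows, clicks) per banner.
    """
    rng = random.Random(seed)
    n = len(ctrs)
    shows = [0] * n
    clicks = [0] * n
    for t in range(1, rounds + 1):
        if t <= n:
            # Show every banner once so each has an initial estimate.
            arm = t - 1
        else:
            scores = [ucb_score(clicks[i] / shows[i], shows[i], t) for i in range(n)]
            arm = max(range(n), key=scores.__getitem__)
        shows[arm] += 1
        if rng.random() < ctrs[arm]:
            clicks[arm] += 1
    return shows, clicks
```

For the same estimated value, a banner that has been shown fewer times gets a higher score, which is how rarely tried banners keep getting a chance.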

Thompson Sampling:
In this exercise, I learned to use the beta distribution to choose the best slot machine
based on how many times the user has won and how many times the user has lost on each one.
The results indicate that the fourth machine, which has the highest conversion rate, was
chosen. Therefore, it is a good slot machine for placing our coins.
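A minimal sketch of Thompson sampling with a Beta(wins + 1, losses + 1) belief per machine follows. The win probabilities in the usage note are hypothetical, and the uniform Beta(1, 1) starting belief is an assumption.

```python
import random


def thompson_sampling(win_probs, rounds, seed=0):
    """Simulate Thompson sampling on slot machines with hypothetical win probabilities.

    Returns (wins, losses) per machine.
    """
    rng = random.Random(seed)
    n = len(win_probs)
    wins = [0] * n
    losses = [0] * n
    for _ in range(rounds):
        # Draw one plausible win rate per machine from its beta distribution.
        samples = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(n)]
        # Play the machine whose sampled win rate is highest.
        arm = max(range(n), key=samples.__getitem__)
        if rng.random() < win_probs[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses
```

For example, `thompson_sampling([0.1, 0.2, 0.15, 0.5, 0.3], 5000)` uses made-up probabilities in which the fourth machine is the best; the machine with the largest `wins + losses` is the one the algorithm settled on.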

Sell like a wolf:
This algorithm is used to evaluate several sales methods and choose the best one. We used
random selection as a baseline strategy alongside the Thompson sampling model, and recorded
the quantity and total value of the rewards each one collected. Then, we computed the
relative return, which came to 98%. From the histogram, we can see that method "6" was
selected most often, indicating that it is the best.
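The comparison can be sketched as below. The definition of relative return used here, (model reward - baseline reward) / baseline reward, is an assumption about how the figure was computed, and the conversion rates in the usage note are invented.

```python
import random


def relative_return(reward_model, reward_baseline):
    """Percentage gain of the model over the baseline (assumed definition)."""
    if reward_baseline == 0:
        raise ValueError("baseline reward must be non-zero")
    return (reward_model - reward_baseline) / reward_baseline * 100


def compare_strategies(conversion_rates, rounds, seed=0):
    """Run random selection and Thompson sampling on the same simulated customers.

    Returns (total_ts, total_random, picks), where picks[i] counts how often
    Thompson sampling chose method i (index 0 is method "1").
    """
    rng = random.Random(seed)
    n = len(conversion_rates)
    wins = [0] * n
    losses = [0] * n
    picks = [0] * n
    total_ts = 0
    total_random = 0
    for _ in range(rounds):
        # Whether each sales method would have converted this customer.
        outcomes = [1 if rng.random() < p else 0 for p in conversion_rates]
        # Baseline: pick a method uniformly at random.
        total_random += outcomes[rng.randrange(n)]
        # Thompson sampling: pick the method with the highest sampled rate.
        samples = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(n)]
        arm = max(range(n), key=samples.__getitem__)
        picks[arm] += 1
        if outcomes[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
        total_ts += outcomes[arm]
    return total_ts, total_random, picks
```

With rates such as `[0.05, 0.10, 0.08, 0.12, 0.06, 0.40, 0.09]`, `picks` plays the role of the histogram: the tallest bar marks the method selected most often.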

Robots warehouse demo:

A methodology based on artificial intelligence that helps warehouses manage their logistics
by finding the optimal (quickest) route from source to destination.
First, we defined the three fundamental elements of reinforcement learning: states,
actions, and rewards. For each action performed by the robot, we give the model a reward;
if it selects the correct path/action, it is rewarded. Next, we determine how to travel
from each state to the destination. Finally, we create the AI model, or function,
responsible for determining the quickest path.
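A small Q-learning sketch of this idea is shown below. The five-location layout, the reward of 100 for reaching the destination, and the values of `gamma` and `alpha` are all assumptions for illustration and are not taken from the class demo.

```python
import random

# Hypothetical warehouse layout: each location (state) maps to the
# neighbouring locations the robot can move to (actions).
NEIGHBORS = {
    "A": ["B", "D"],
    "B": ["A", "C"],
    "C": ["B", "E"],
    "D": ["A", "E"],
    "E": ["C", "D"],
}


def train_q_table(goal, iterations=5000, gamma=0.75, alpha=0.9, seed=0):
    """Learn Q-values for moving between locations towards `goal`."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in nbrs} for s, nbrs in NEIGHBORS.items()}
    states = [s for s in NEIGHBORS if s != goal]
    for _ in range(iterations):
        state = rng.choice(states)
        nxt = rng.choice(NEIGHBORS[state])  # action = move to a neighbour
        if nxt == goal:
            # Reaching the destination ends the episode with a reward of 100.
            target = 100.0
        else:
            # No immediate reward; discount the best value available next.
            target = gamma * max(q[nxt].values())
        # Temporal-difference update towards the target.
        q[state][nxt] += alpha * (target - q[state][nxt])
    return q


def best_route(start, goal, **kwargs):
    """Follow the highest Q-value from `start` until `goal` is reached."""
    q = train_q_table(goal, **kwargs)
    route = [start]
    # The length guard stops the walk if the Q-table has not converged.
    while route[-1] != goal and len(route) <= len(NEIGHBORS):
        here = route[-1]
        route.append(max(q[here], key=q[here].get))
    return route
```

Because the discount factor makes rewards that are further away worth less, the learned Q-values favour the route with the fewest moves; `best_route("A", "E")` follows those values from A to E.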
