
Matilde Santos Faculty of Informatics

A First Approach of a New Learning Strategy: Learning by Confirmation


Alejandro Carpio, Matilde Santos, José A. Martín-H
Computer Architecture and Systems Engineering, Facultad de Informática, Universidad Complutense de Madrid, 28040-Madrid, Spain
acarsan2@gmail.com; msantos@dacya.ucm.es; jamartinh@fdi.ucm.es


LEARNING BY CONFIRMATION
Reinforcement Learning
The instructor does NOT give the answers
Exploration by the agent

Learning by Examples
The instructor gives answers to the agent's questions
The agent does NOT explore

Learning by Confirmation


VALIDATION OF THE NEW LEARNING STRATEGY ON A REAL-TIME DEVICE


[Figure: test scenario, labeled with 30 cm and 30 cm dimensions, the target, and the start position]

Agent: Lego Mindstorms NXT 2.0 robot in an inverted pendulum configuration


Exploratory Episode: the Q-Learning algorithm is applied


The robot explores the space, choosing the next action based on its policy (which depends on the values of the reward function)
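An exploratory step of this kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: the epsilon-greedy action selection, the action names, the table layout, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions made for the example.

```python
import random

# Hypothetical action set for a robot moving on a flat surface.
ACTIONS = ["up", "down", "left", "right"]

def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice (an assumed policy): explore with
    probability epsilon, otherwise take the best-valued action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))

def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard one-step Q-learning update on a dict-based table."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

# One step: moving right from cell (0, 0) to cell (0, 1) earns reward 1.0.
q = {}
new_value = q_update(q, (0, 0), "right", 1.0, (0, 1))
greedy = choose_action(q, (0, 0), epsilon=0.0, rng=random.Random(0))
```

With `epsilon=0.0` the choice is purely greedy, so after this single update the robot would prefer the action that was just rewarded.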

Training Episode: the robot asks the instructor for the right action (the instructor's opinion)
When an example of a correct action is given, the value of the learning function for that action is slightly increased, whereas the remaining possible actions are penalized
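The confirmation step can be sketched as a small adjustment to a table of action values. The function name `confirm`, the action names, and the `bonus` and `penalty` sizes are hypothetical; the slides say only that the confirmed action is slightly increased and the others are penalized.

```python
# Hypothetical action set, matching a simple grid-like scenario.
ACTIONS = ["up", "down", "left", "right"]

def confirm(q, state, correct_action, bonus=0.1, penalty=0.05):
    """Slightly raise the value of the action the instructor confirmed
    and lower the value of every other action in the same state."""
    for a in ACTIONS:
        old = q.get((state, a), 0.0)
        if a == correct_action:
            q[(state, a)] = old + bonus
        else:
            q[(state, a)] = old - penalty
    return q

# The instructor says "left" is the right action in state "s0".
q = confirm({}, "s0", "left")
preferred = max(ACTIONS, key=lambda a: q[("s0", a)])
```

Because the adjustment is small, values learned later during exploration can still outweigh a single piece of advice, which is one way an agent could recover from a misleading instructor.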


Scenario
Number of steps to find the optimal path

Rewards associated with each state

Number of training episodes


CONCLUSIONS AND FUTURE WORK


It is possible to accelerate the learning process of an agent by introducing some examples as extra knowledge (Learning by Confirmation)
These examples let us introduce knowledge about the problem into the reinforcement learning algorithm

We have applied this new strategy to a real-time system: a Lego NXT robot in an inverted pendulum configuration
In addition, the agent is able to find the route it should follow even if there are several optimal paths or if it has been deceived by the instructor
