A First Approach of A New Learning Strategy: Learning by Confirmation
LEARNING BY CONFIRMATION
Reinforcement Learning
The instructor does NOT give the answers
The agent explores
Learning by Examples
The instructor gives answers to the agent's questions
The agent does NOT explore
Learning by Confirmation
Start position
Training Episode: The instructor is asked by the robot for the right action (opinion)
When the instructor gives an example of a correct action, the learning function value for that action is slightly increased, whereas the remaining possible actions are penalized
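The update described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the table layout, the function names `confirm` and `best_action`, and the step sizes `reward` and `penalty` are all assumptions introduced here.

```python
from collections import defaultdict

def confirm(prefs, state, confirmed_action, actions, reward=0.1, penalty=0.05):
    """Apply one confirmation from the instructor (hypothetical update rule).

    The confirmed action's value in `state` is slightly increased and
    every other possible action is penalized. The step sizes are
    assumed values, not taken from the source.
    """
    for action in actions:
        if action == confirmed_action:
            prefs[(state, action)] += reward
        else:
            prefs[(state, action)] -= penalty
    return prefs

def best_action(prefs, state, actions):
    """Return the action with the highest learned value in `state`."""
    return max(actions, key=lambda action: prefs[(state, action)])

# Usage: a single state with two possible actions.
actions = ["left", "right"]
prefs = defaultdict(float)          # unseen (state, action) pairs start at 0.0
confirm(prefs, "start", "right", actions)
chosen = best_action(prefs, "start", actions)
```

Because each confirmation changes the values only slightly, a single misleading answer can be outweighed by later correct ones under this assumed rule.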
Scenario
Number of steps to find the optimal path
We have applied this new strategy to a real-time system: a Lego NXT robot in an inverted pendulum configuration.
In addition, the agent is able to find the route it should follow even if there are several optimal paths or if it has been deceived by the instructor.