
CS1AC16 MACHINE LEARNING

How can an expert system be made to learn from experience, as human experts do?
How can it start with a few rules and deduce others?

Learning – a cognitive process of acquiring skills or knowledge – a process that allows an agent to adapt
its behavior. One of the key aspects of intelligence is the ability to learn.

Why is learning important?


When faced with a situation your set of rules doesn't cater for, you need to be able to adapt to that
new situation – otherwise you behave like a pre-programmed robot that acts only on what it sees at
the moment, with no memory of what it encountered before.

Example: Insects behave strictly according to instinct, with their behavior governed by a fixed set of
rules – when they encounter a situation these rules don't cater for, they appear stupid, unable to learn,
remember or adapt – e.g. a fly repeatedly hitting a window. Some insects can learn, but only simple
things (for example, associating a particular color of flower with the best nectar) – they don't have the
brain capacity for more.

Primates and dolphins are seen as more intelligent as they can learn abstract concepts.

In general the ability to adapt and learn from mistakes is associated with intelligence.

Types of learning
 Associative – like Pavlov's dogs
 Trial and error – learning a skill through direct interaction with the environment – you do
something, evaluate it, and adapt your behavior on the basis of that evaluation – e.g. learning to
ride a bike
o In some cases there may be some previous knowledge or assistance involved
 Unsupervised – without feedback – e.g. learning to recognize the difference between cats and
dogs without being told they're different
 Supervised – with feedback – e.g. learning something with an instructor
 Reinforcement – with delayed feedback

Elements of a learning system


A learning system needs a method of transforming a set of input data into some useful output – the
function that maps the input to the desired output is the target function.
How to represent the target function?
How to evaluate performance during learning?
How to know when learning is complete?
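
As a rough illustration (not from the notes), one of the simplest representations of a target function is a lookup table from inputs to desired outputs, with accuracy on labeled examples as the performance measure. The toy task and all names below are hypothetical:

# A minimal sketch: the target function represented as a lookup table.
# Hypothetical toy task: map a sensor reading to a desired action label.
target_table = {
    "object_ahead": "reverse",
    "object_left": "turn_right",
    "object_right": "turn_left",
    "no_object": "go_forward",
}

def target_function(state):
    """Approximation of the target function as a table lookup."""
    return target_table[state]

# Evaluate performance: fraction of labeled examples predicted correctly.
examples = [("object_ahead", "reverse"), ("no_object", "go_forward")]
accuracy = sum(target_function(s) == a for s, a in examples) / len(examples)
print(f"accuracy = {accuracy:.2f}")  # learning is "complete" when this stops improving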
Simple reinforcement
The aim: get the robot to explore its environment and avoid obstacles.
The robot has states: object ahead, object to the right, object to the left, no object seen,
and actions associated with states: go forward, reverse, turn left, turn right.
In each state the robot has a probability of performing each action – at the start all actions have
equal probabilities in all states.
When an action is taken, the resulting situation is examined and evaluated as good or bad.
If the action led to a good result, its probability in the given situation increases; if it was bad, the
probability decreases – the robot learns.
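
A minimal Python sketch of this scheme, assuming the states and actions listed above; the learning rate, the evaluation rule and the function names are illustrative, not part of the notes:

import random

STATES = ["object_ahead", "object_right", "object_left", "no_object"]
ACTIONS = ["go_forward", "reverse", "turn_left", "turn_right"]
LEARNING_RATE = 0.1  # illustrative step size

# At the start all actions have equal probability in every state.
probs = {s: {a: 1.0 / len(ACTIONS) for a in ACTIONS} for s in STATES}

def choose_action(state):
    """Sample an action according to the current probabilities for this state."""
    actions, weights = zip(*probs[state].items())
    return random.choices(actions, weights=weights)[0]

def update(state, action, good):
    """Raise (good) or lower (bad) the chosen action's probability, then renormalize."""
    delta = LEARNING_RATE if good else -LEARNING_RATE
    probs[state][action] = max(0.01, probs[state][action] + delta)
    total = sum(probs[state].values())
    for a in ACTIONS:
        probs[state][a] /= total

# Example step: robot sees an obstacle ahead, tries an action, gets feedback.
state = "object_ahead"
action = choose_action(state)
update(state, action, good=(action != "go_forward"))  # hypothetical evaluation

Over many such steps the probabilities drift toward the actions that keep being evaluated as good, which is the learning effect described above.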

The problem is that the robot needs a way of assessing whether an action was good or bad – this is not
obvious at the start. The robot may have to choose its next step without knowing, and only learns
later whether the action was good – e.g. in a maze:

There may be a number of steps before the agent finds out whether they were correct – how should the
rewards be allocated in such a situation?
Temporal difference algorithm – allocate a share of the reward at intermediate points – the robot first
moves randomly, but on later attempts adjusts its path and finds a better one.
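
A minimal sketch of the temporal difference idea (TD(0) value learning) on a toy corridor standing in for the maze; the number of states, learning rate and discount factor are all illustrative assumptions:

# Toy corridor "maze": states 0..4, reward of 1 only on reaching the final
# state. The TD update V(s) += alpha * (r + gamma * V(s') - V(s)) passes a
# share of the eventual reward back to intermediate states over episodes.
N_STATES = 5
ALPHA, GAMMA = 0.5, 0.9
V = [0.0] * N_STATES  # estimated value of each position in the corridor

for episode in range(50):
    for s in range(N_STATES - 1):
        s_next = s + 1
        reward = 1.0 if s_next == N_STATES - 1 else 0.0
        # TD update: nudge V(s) toward the reward plus the discounted next value.
        V[s] += ALPHA * (reward + GAMMA * V[s_next] - V[s])

print([round(v, 2) for v in V])  # earlier states now carry a share of the reward

After a few episodes the earlier states acquire nonzero values – the delayed reward has been allocated to the intermediate points, which is exactly what lets the agent prefer better paths on later attempts.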
