Professional Documents
Culture Documents
Dr.Chandrika.J
Professor
CSE
Course Faculty
Effects of Programs that Learn
Application areas
Learning from medical records which treatments are most
effective for new diseases
Houses learning from experience to optimize energy costs
based on usage patterns of their occupants
Personal software assistants learning the evolving interests of
users in order to highlight especially relevant stories from the
online morning newspaper
2
Effective Applications of Learning
A successful understanding of how to make computers learn
would open up many new uses of computers and new levels
of competence and customization
Speech recognition
outperform all other approaches that have been attempted to date
Pattern detection
Learning algorithms being used to discover valuable knowledge from
large commercial databases
detect fraudulent use of credit cards
Play Games
Play backgammon at levels approaching the performance of human
world champions
3
Well Posed Learning Problems
A computer program is said to learn from experiences E with
respect to some class of tasks T and performance P, if its
performance at tasks in T, as measured by P, improves with
experience E
To have a well-defined learning problem, three features
needs to be identified:
The class of tasks
The measure of performance to be improved
The source of experience
Examples
A checkers learning problem:
Task T: playing checkers
Performance measure P: percent of games won against opponents
Training experience E: playing practice games against itself
Handwriting recognition learning problem:
Task T: recognizing and classifying handwritten words within images
Performance measure P: percent of words correctly classified
Training experience E: a database of handwritten words with given classifications
4
Designing a Learning System
Consider designing a program to learn to play checkers,
with the goal of entering it in the world checkers
tournament
Requires the following steps
Choosing Training Experience
Choosing the Target Function
Choosing the Representation of the Target Function
Choosing a Function Approximation Algorithm
Estimating training values
Adjusting the weights
The Final Design
5
Choosing the Training Experience (1)
The first design choice is to choose the type of training experience from
which the system will learn.
The type of training experience available can have a significant impact on
success or failure of the learner.
There are three attributes which impact on success or failure of the
learner
Indirect Feedback: Move sequences and final outcomes of various games played
Credit assignment problem: Value of early states must be inferred from the
outcome
6
Choosing the Training Experience (2)
2. Degree to which the learner controls the sequence of
training examples
7
Choosing the Training Experience (3)
3. How well the training experience represents the distribution
of examples over which the final system performance P will be
measured
If training the checkers program consists only of experiences played
against itself, it may never encounter crucial board states that are
likely to be played by the human checkers champion
Note - there is a danger that this training experience might not be
fully representative of the distribution of situations over which it will
later be tested.
Most theory of machine learning rests on the assumption that the
distribution of training examples is identical to the distribution of test
examples
8
Partial Design of Checkers Learning Program
A checkers learning problem:
Task T: playing checkers
Performance measure P: percent of games won in the world
tournament
Training experience E: games played against itself
Remaining choices
Choosing the Target Function
Choosing the Representation of the Target Function
Choosing a Function Approximation Algorithm
Estimating training values
Adjusting the weights
The Final Design
9
Choosing the Target Function (1)
The next design choice is to determine exactly what type
of knowledge will be learned and how this will be used
by the performance program.
For Checkers game, the most obvious choice for the type
of information to be learned is a program, or function,
that chooses the best move for any given board state.
10
Choosing the Target Function (2)
Alternative target function
An evaluation function that assigns a numerical score to any
given board state
V : B → R ( where R is the set of real numbers)
V(b) for an arbitrary board state b in B
if b is a final board state that is won, then V(b) = 100
if b is a final board state that is lost, then V(b) = -100
if b is a final board state that is drawn, then V(b) = 0
if b is not a final state, then V(b) = V(b '), where b' is the
best final board state that can be achieved starting from b
and playing optimally until the end of the game
11
Choosing the Target Function (3)
V(b) gives a recursive definition for board state b
Not usable because not efficient to compute except in first
three trivial cases
nonoperational definition
Goal of learning is to discover an operational description
of V
Learning the target function is often called function
approximation
Referred to as Vˆ
12
Choosing a Representation for the Target Function
Choice of representations involve trade offs
Pick a very expressive representation to allow close approximation to the
ideal target function V
More expressive, more training data required to choose among alternative
hypotheses
Use linear combination of the following board features:
x5: the number of black pieces threatened by red (i.e. which can be captured
13 Vˆ (b) w0 w1 x1 w2 x2 w3 x3 w4 x4 w5 x5 w6 x6
Partial Design of Checkers Learning Program
tournament
Training experience E: games played against itself
Vˆ (b) w0 w1 x1 w2 x2 w3 x3 w4 x4 w5 x5 w6 x6
14
Choosing a Function Approximation Algorithm
Derive training examples from the indirect training experience
available to the learner
Adjusts the weights wi to best fit these training examples
To learn we require a set of training examples describing the board b
and the training value Vtrain(b)
Each training example is an Ordered pair b , V train b
x1 3, x2 0, x3 1, x4 0, x5 0, x6 0 ,100
16
Adjusting the Weights
Choose the weights wi to best fit the set of training examples
Minimize the squared error E between the train values and the
values predicted by the hypothesis
V b V b 2
E ˆ
train
b ,V train b training examples
17
LMS Weight Update Rule
LMS weight update rule :-
For each training example (b, Vtrain(b))
Use the current weights to calculate V̂ (b)
For each weight wi, update it as
wi ← wi + ƞ (Vtrain (b) -V̂ (b)) xi
Here ƞ is a small constant (e.g., 0.1) that moderates the size of the
weight update.
Working of weight update rule
• When the error (Vtrain(b)- V̂ (b)) is zero, no weights are changed.
• When (Vtrain(b) - V̂ (b)) is positive (i.e., when V̂ (b) is too low),
then each w is increased in proportion to the value of its
corresponding feature. This will raise the value of V̂ (b), reducing
the error.
• If the value of some feature xi is zero, then its weight is not altered
regardless of the error, so that the only weights updated are those
18 whose features actually occur on the training example board.
Final Design
Experiment
New problem Hypothesis
(initial game board)
Generator
Vˆ
Performance Generalizer
System
19
The Performance System is the module that must solve the given
performance task by using the learned target function(s). It takes an instance
of a new problem (new game) as input and produces a trace of its solution
(game history) as output.
The Critic takes as input the history or trace of the game and produces as
output a set of training examples of the target function
Games against
Experts
Games against itself Table of
correct Moves …
Determine
Target Function
…
Board → move Board → value
Determine Representation
Polynomial
Artificial neural
of Learned Function
network
Linear function of six features …
Determine
Learning Algorithm
Gradient descent Linear Programming …
21
Complete Design
Training Classification Problems
Many learning problems involve classifying inputs into a
discrete set of possible categories.
Learning is only possible if there is a relationship
between the data and the classifications.
Training involves providing the system with data which
has been manually classified.
Learning systems use the training data to learn to classify
unseen data.
22