
Experiments in Learning

Edward Tolman
1886 - 1959

Cognitive maps - the experimental study of animal cognition
Edward Tolman: Cognitive Maps
● According to early behaviorists, behavior was a matter of responding to
stimuli – it consisted of stimulus–response (S-R) connections.
● Learning was a matter of forming new stimulus–response relations.

● Tolman insisted that we must consider what is going on inside the animal.

● Father of the experimental study of animal cognition.


He believed that we have to consider internal variables
such as motives and expectations.
These can intervene between the stimulus situation and
whatever response occurs, so he called them
intervening variables (perceptions, expectations, etc.).

According to him, experiments can be designed to study
these variables, or internal cognitive operations.
Learning to make a series of responses

● Consider a rat that has mastered a maze


● Placed in the start box, it now runs promptly to the goal box and
collects its food reward.

● What has it learned?


● To make a series of responses in the correct order?

● This would have been Thorndike’s view.


Learning where things are

● Or has it learned where things are?


Does it represent in its head the spatial layout of the apparatus,
forming what Tolman called a cognitive map of the situation and the
objects, pathways and rewards that were in it?

● The way to find out is to pit the two possibilities against each other. An
early experiment by Tolman began with the apparatus described below.
First step of experiment

● A rat would be placed in the start box at point A,
and it would learn to run across the circular
arena to another alley at C, then to D, E, F and
finally G, where it would find a bit of food.
● A more direct route from the arena to the food
box would have saved the rat time and energy,
but no such shortcut was available.

● The rats were given 5 trials in this convoluted
apparatus.
Second step of experiment

● In this step, the animals were tested in a
modified apparatus.
● The original route was blocked, but instead
there was a whole series of potential
pathways radiating out from the central arena.

● The question was, would the rats choose the
alley that pointed to the original goal?
Why the second step?

● This second step separates the two possibilities.
○ Were the rats learning to make responses?
○ Or were they learning where things were?

● If the reinforcement had simply strengthened the responses that had
led to it, then the rats should choose one of the alleys adjacent to the
original one; this was the closest possible response to the one that
had been reinforced.
● On the other hand, if the rats had learned where the food box was – if,
in other words, they were able to represent the spatial layout as a
cognitive map in their heads – then they should take the route that led
directly to where the food was, even though they had never been
reinforced for doing this.
○ And this is what happened.
○ These and other experiments made a strong case for a cognitive
mapping capability in laboratory rats.
Brain function in cognitive mapping

● The radial maze was used as a tool for the study of memory.

● In a radial maze there is a central starting place, from which
a number of alleys radiate outwards.
● In this experiment, each of the alleys is baited with a bit of
food at the end – the animal cannot see the food from the
starting point.
● From the central starting place, the rat can run out to the end
of any arm and collect its bit of food.
● It must then run back to the central place, and then it is free to
choose another arm to run to.
● The original arm no longer has any food in it, for the rat has
eaten it all.
● After some exposure to the situation, the rat will run down one alley,
pick up the bit of food and eat it, return to the central place and then
choose a different alley.

● An experienced rat, confronted with as many as eight arms of the
maze, will visit all of them before returning to one it has already
visited and which now has no food in it.
WOOOOTTT?

● In short, the rat can remember where it has been and not go back there.

● Efficient food gathering in this situation requires the use of short-term
memory: the rat must remember which arms it has already visited on
this particular session.

● Rats with damage in the hippocampal area of the brain, a structure in
the forebrain below the cerebral cortex, are not so good at this.
○ They seem to remember the task itself; placed at the start, they will
run promptly down one of the alleys and eat the food.
○ But on successive trials, they are far more likely than normal rats to
revisit an arm where they have already eaten all the food there is.

● Thus, this kind of experiment demonstrates the role of the brain in
internal mapmaking, navigation and memory.
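
The task's logic is simple enough to state as a rule, which makes the memory
demand easy to see. Below is a minimal Python sketch (all names and parameters
are illustrative assumptions, not from any published model) contrasting a
"rat" that remembers visited arms with one that cannot, a crude stand-in for
the hippocampal deficit described above:

```python
import random

def radial_maze_session(n_arms=8, has_memory=True, max_choices=30):
    """Simulate one session: count choices needed to empty every baited arm.

    A 'rat' with working memory never revisits an arm; one without
    memory chooses at random each time (illustrative assumption).
    """
    baited = set(range(n_arms))   # every arm starts with food
    visited = set()
    choices = 0
    while baited and choices < max_choices:
        if has_memory:
            arm = random.choice([a for a in range(n_arms) if a not in visited])
        else:
            arm = random.randrange(n_arms)  # may revisit an already-empty arm
        visited.add(arm)
        baited.discard(arm)
        choices += 1
    return choices

# An intact 'rat' needs exactly 8 choices; a memoryless one needs many more.
print(radial_maze_session(has_memory=True))   # -> 8
print(radial_maze_session(has_memory=False))  # -> typically far more than 8
```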
B. F. Skinner
1904 - 1990

Operant Conditioning
Skinner: Operant Conditioning

● According to Skinner:
○ Psychology is about behavior,
not about the mind, and not
about the nervous system. It
deals only with variables that
can be directly observed.

● He, like Thorndike, quickly became
convinced of the great power that
reward and reinforcement can
exert on behavior.
Skinner Box

● He invented the Skinner box (he called it
an operant chamber).
● It was a small chamber in which there was a
response to be made (a lever for a rat to
press, or a disk mounted on the wall for a
pigeon to peck), and a means of delivering a
reinforcement (a bit of food, a sip of water
or anything else for which his animal would
work).
● The reward having been delivered, the rat or
pigeon was free to respond again.
Recording Responses

● He also invented a mechanical device, the cumulative recorder, for
automatically recording fine differences in the rate of response.

● Indeed, he was one of the pioneers of automation in behavioral
research: responses could be detected, recorded and followed up with
reinforcements, all by automatic apparatus.
Operant Conditioning

● Rather than focusing upon the
things that happen before a
response occurs, as with Pavlov’s
conditioned reflexes, Skinner
found (as had Thorndike) that the
events following a response had
a great influence on its
subsequent rate of occurrence.

● If food was presented to a
hungry rat after it had pressed a
lever, the rate of lever pressing
would increase. This he called
Operant Conditioning.
Operant Conditioning

● Operant Conditioning: if a response (the operant) is followed by a
reinforcing stimulus, the response strength is increased.

● Lever pressing is an operant.
● Food for a hungry rat is a reinforcing stimulus.
● If an animal is reinforced for lever pressing only if a light is on, and is never
reinforced if it is off, then the animal will come to press at a much higher rate
when the light is on than when it is off. This is discrimination.
● Stimulus Discrimination: a learned response to a specific stimulus but not to
other, similar stimuli.
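
A toy simulation makes the discrimination contingency concrete. This is only
an illustrative sketch (the update rule and the numbers are our own
assumptions, not any standard model of operant strength): pressing is
reinforced only when the light is on, so response strength grows under that
stimulus alone:

```python
import random

# Response strength under each stimulus condition (arbitrary starting values).
strength = {"light_on": 0.1, "light_off": 0.1}

for trial in range(1000):
    state = random.choice(["light_on", "light_off"])
    pressed = random.random() < strength[state]     # press more when stronger
    reinforced = pressed and state == "light_on"    # the discrimination rule
    if reinforced:                                  # reinforcement strengthens
        strength[state] = min(1.0, strength[state] + 0.05)

print(strength)  # pressing becomes far more probable when the light is on
```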
● POSITIVE REINFORCEMENT: presenting something the organism likes →
behaviour strengthened.
● NEGATIVE REINFORCEMENT: removing something the organism doesn’t like →
behaviour strengthened.
● PUNISHMENT: presenting something the organism doesn’t like →
behaviour weakened.
Schedules of Reinforcement

● A schedule of reinforcement is one in which reinforcement is made
available to the subject or participant only some of the time,
according to certain rules; these rules define the schedule. And it
turns out that different schedules give rise to characteristically
different patterns of operant behavior.

1. Continuous Reinforcement Schedule
2. Fixed Ratio (intermittent)
3. Variable Ratio (intermittent)
4. Fixed Interval (intermittent)
5. Variable Interval (intermittent)
Intermittent / Continuous Reinforcement

● Intermittent reinforcement schedule: one may reinforce certain
instances of a given behavior while allowing other instances to go
unreinforced.
○ Any rule specifying a procedure for occasionally reinforcing a
behavior.

● Continuous reinforcement schedule: every response the animal or
subject makes is reinforced.
○ It is the simplest schedule of reinforcement.
Fixed Ratio (FR)

● Every nth response is reinforced.
● For example, if every 10th response the animal makes is reinforced, the
schedule is FR 10.

● On such a schedule, the animal will typically pause after each
reinforcement and then run off the next series of responses at a high
rate.
● The length of the post-reinforcement pause depends on the value of
the FR - the higher the value, the longer the pause.
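
Stated as a rule, FR is just a response counter. A minimal sketch in Python
(the function names are our own, for illustration only):

```python
def make_fixed_ratio(n):
    """Return a schedule function: reinforce every nth response (FR n)."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True    # reinforcement delivered
        return False
    return respond

fr10 = make_fixed_ratio(10)
outcomes = [fr10() for _ in range(30)]
print([i + 1 for i, r in enumerate(outcomes) if r])  # -> [10, 20, 30]
```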
Variable Ratio (VR)

● The number of responses required to produce reinforcement
changes unpredictably from one reinforcement to the next.

● For example, the first reinforcement might come after 5 responses,
the second after 2 responses, the third after 8 responses, and so on.
The schedule is named for the average requirement: a VR 5 schedule
requires 5 responses on average.

● A variable ratio schedule produces a high, steady rate of responses
and little or no post-reinforcement pause.
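
The same counter gives a VR schedule if the requirement is redrawn at random
after each reinforcement; the uniform draw below is an arbitrary illustrative
choice that merely averages to the named value:

```python
import random

def make_variable_ratio(mean_n):
    """Reinforce after a randomly varying number of responses (VR mean_n)."""
    required = random.randint(1, 2 * mean_n - 1)   # averages mean_n
    count = 0
    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = random.randint(1, 2 * mean_n - 1)  # new requirement
            return True
        return False
    return respond

vr5 = make_variable_ratio(5)
outcomes = [vr5() for _ in range(50)]
print(sum(outcomes), "reinforcements in 50 responses")  # roughly 10, varying
```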
hmmmm
● Gambling and lottery games are good examples of a reward based on a variable ratio
schedule.
● The sequence of outcomes in some forms of gambling (e.g., slot machines) is quite
similar to a partial reinforcement schedule (Knapp, 1976; Skinner, 1953, 1969).
Winning, for example, represents a positive reinforcement. With partial reinforcement,
rewards occur with some bets, but not all. Gamblers are uncertain about which bets
will produce rewards because the number of responses is always changing. This is
called a variable ratio schedule of reinforcement.
● Consider a door-to-door salesperson who averages one sale for every 10 houses
called upon. This does not mean that the salesperson made a sale at exactly
every 10th house.
● Sometimes a sale might have been made after calling upon five houses, sometimes a
sale might have been made at two houses in a row and sometimes the salesperson
might have called on a large number of houses before making a sale.
Fixed Interval (FI)

● A reinforcement becomes available after a fixed period of time
following the previous reinforcement.

● With FI 10, after a reinforcement, no further reinforcement is
available until 10 minutes have passed.
● On an FI schedule, the animal will come to make only a few
responses following reinforcement, but then will respond at a
gradually increasing rate until the next reinforcement occurs.
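
FI depends on elapsed time rather than a response count. A small sketch with
the clock in arbitrary time units (names are illustrative assumptions):

```python
def make_fixed_interval(interval):
    """FI schedule: the first response after `interval` time units is reinforced."""
    available_at = interval
    def respond(now):
        nonlocal available_at
        if now >= available_at:
            available_at = now + interval   # next interval starts now
            return True
        return False
    return respond

fi10 = make_fixed_interval(10)
print(fi10(3))    # False: too early, the response goes unreinforced
print(fi10(11))   # True: first response after 10 units is reinforced
print(fi10(15))   # False: nothing is available again until t = 21
```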
Fixed Interval

● Suppose that two young children play together each morning.
Approximately two hours after breakfast, their mother has a
midmorning snack prepared for them, and approximately 2 hours after
that, lunch is prepared. Thus, the behavior of arriving at the kitchen is
reinforced on an FI 2-hour schedule. Within each 2-hour period, as the time
draws to a close, the children begin making more and more frequent
trips to the kitchen, each time asking ‘Is our food ready?’ After eating,
they run out and play, and there is a fairly lengthy pause before they
start once again to make trips to the kitchen.
Variable Interval

● In this schedule, reinforcement is made available at variable
intervals following a reinforcement.
● This means that reinforcement could become available at any time.

● On a VI schedule, there is a steady, moderate rate of response
throughout the session.

● Example: checking your e-mail or Facebook.
● Going fishing: you might catch a fish after 10 minutes, then have
to wait an hour, then have to wait 18 minutes.
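
VI is the same clock-based rule with the interval redrawn at random after each
reinforcement, which is why there is never a "safe" time to stop responding.
A sketch with an arbitrary uniform draw (illustrative, not a standard
implementation):

```python
import random

def make_variable_interval(mean_t):
    """VI schedule: reinforcement becomes available after a varying interval."""
    available_at = random.uniform(0, 2 * mean_t)   # averages mean_t
    def respond(now):
        nonlocal available_at
        if now >= available_at:
            available_at = now + random.uniform(0, 2 * mean_t)
            return True
        return False
    return respond

vi = make_variable_interval(10)          # like fishing: a catch may come anytime
print([vi(t) for t in range(0, 60, 5)])  # scattered, unpredictable reinforcements
```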
Summarized

● Continuous: every response is reinforced.
● Fixed Ratio: every nth response; high rate with a pause after each reinforcement.
● Variable Ratio: requirement varies around a mean; high, steady rate, little or no pause.
● Fixed Interval: first response after a fixed time; responding speeds up as the interval ends.
● Variable Interval: first response after a varying time; steady, moderate rate.
Albert Bandura
1925 - 2021

Social learning theory
Albert Bandura: Imitation and Social learning

● Social Learning Theory
○ A fusion of behaviorism and more
recent cognitive approaches.
○ It recognizes the contributions to
knowledge made by the objective
study of learning as pioneered by
Pavlov, Skinner and Thorndike.
However, it emphasizes a cognitive
interpretation of these processes, as
emphasized by Kohler and especially
by Tolman.
Early experiments

● Bandura showed that a child’s aggressive behavior could be increased
by letting the child watch another person behave aggressively.

● In an early experiment:
○ Small children watched a film showing two men, Rocky and
Johnny, playing with toys. Johnny refused to share his toys, so
Rocky took them away by force and marched off with the toys,
while Johnny sat dejectedly in a corner.

● Then each child was left alone for 20 minutes, but
watched through a one-way mirror.
○ Children who had watched the film were much
more aggressive in their play than children in the
control condition who had not watched the film.
Another experiment

● Nursery school children were allowed to
watch an adult punch, kick, hit and otherwise
mistreat a Bobo doll.

● After watching the violent mistreatment of the
Bobo doll, the children were taken into
another room that contained a few toys,
including a Bobo doll.
Results

● Compared to children who had not seen this aggressive behavior
modeled, the children who had were enthusiastic in doing their own
punching and kicking and otherwise mistreating the doll.

● The children imitated the very acts that they had observed and used
the very words they had heard the adult model use.
Follow up

● Follow-up experiments showed that children are selective about which
behaviors they imitate; they do not imitate just anything they see.
○ Imitation is more likely if the model is rewarded for the
behavior rather than punished.
○ More likely if the model has high status.
○ And more likely if the model is similar to the child.
Processes operating in Observational Learning

1. Attentional Processes: the model must be attended to.
2. Retention Processes: what the model did, and the consequences of this, must be
remembered.
3. Skills for performing the observed behavior: the child must have the necessary skills
and so be able to reproduce the activities in question.
4. Reinforcement/Rewards: if there are rewards or some reinforcement for the observed
behavior, it is more likely to be performed.
An experiment demonstrating the role of retention

● Children watched a model under three different conditions:
○ First condition: the children described aloud the actions that were
performed by the model.
○ Second Condition: the children were instructed simply to observe
carefully.
○ Third condition: the children were required to count rapidly while
watching what the model did; this was designed to interfere with
the children’s processing of the information the model’s action
provided.
● Results showed that the children who had described the model’s
actions in words learned best, followed by those who simply watched.

● Those who watched under the interference condition learned the least.
Application in clinical settings

● Bandura has applied these findings to clinical settings, in attempting to
help people solve problems of living.
● For example, irrational fears or phobias may be treated by modeling
procedures. A person who has a severe fear of (say) cats may watch
another person carry out increasingly bold approaches to the feared
object. The model may encourage the client to follow along with the
modeled responses step by step, so that the imitation of the model’s
actions occurs in small steps. Reinforcement is given at each step for
dealing with the feared object in a less fearful way.