
Experiments in Learning

Edward Tolman
1886 - 1959

Cognitive maps - the experimental study of animal cognition
Edward Tolman: Cognitive Maps
● According to early behaviorists, behavior was a matter of responding to
stimuli – it consisted of stimulus–response (S-R) connections.
● Learning was a matter of forming new stimulus–response relations.

● Tolman insisted that we must consider what is going on inside the animal.

● Father of the experimental study of animal cognition.


He believed that we have to consider internal variables
such as motives and expectations.
These can intervene between the stimulus situation and
whatever response occurs, so he called them
intervening variables (perceptions, expectations, etc.).

According to him, experiments can be designed to study
these variables, or internal cognitive operations.
Learning to make a series of responses

● Consider a rat that has mastered a maze


● Placed in the start box, it now runs promptly to the goal box and
collects its food reward.

● What has it learned?


● To make a series of responses in the correct order?

● This would have been Thorndike’s view.


Learning where things are

● Or has it learned where things are?


Does it represent in its head the spatial layout of the apparatus,
forming what Tolman called a cognitive map of the situation and the
objects, pathways and rewards that were in it?

● The way to find out is to pit the two possibilities against each other. An
early experiment by Tolman began with the apparatus described below.
First step of experiment

● A rat would be placed in the start box at point A,
and it would learn to run across the circular
arena to another alley at C, then to D, E, F and
finally G, where it would find a bit of food.
● A more direct route from the arena to the food
box would have saved the rat time and energy,
but no such shortcut was available.

● The rats were given 5 trials in this convoluted
apparatus.
Second step of experiment

● In this step, the animals were tested in a
modified apparatus.
● The original route was blocked, but instead
there was a whole series of potential
pathways radiating out from the central arena.

● The question was, would the rats choose the
alley that pointed to the original goal?
Why the second step?

● This second step separates the two possibilities.
○ Were the rats learning to make responses?
○ Or were they learning where things were?

● If the reinforcement had simply strengthened the responses that had
led to it, then the rats should choose one of the alleys adjacent to the
original one; this was the closest possible response to the one that
had been reinforced.
● On the other hand, if the rats had learned where the food box was – if,
in other words, they were able to represent the spatial layout as a
cognitive map in their heads – then they should take the route that led
directly to where the food was, even though they had never been
reinforced for doing this.
○ And this is what happened.
○ These and other experiments made a strong case for a cognitive
mapping capability in laboratory rats.
Brain function in cognitive mapping

● The radial maze was used as a tool for the study of memory.

● In a radial maze there is a central starting place, from which
a number of alleys radiate outwards.
● In this experiment, each of the alleys is baited with a bit of
food at the end – the animal cannot see the food from the
starting point.
● From the central starting place, the rat can run out to the end
of any arm and collect its bit of food.
● It must then run back to the central place, and then it is free to
choose another arm to run to.
● The original arm no longer has any food in it, for the rat has
eaten it all.
● After some exposure to the situation, the rat will run down one alley,
pick up the bit of food and eat it, return to the central place and then
choose a different alley.

● An experienced rat, confronted with as many as eight arms of the
maze, will visit all of them before returning to one it has already
visited and which now has no food in it.
WOOOOTTT?

● In short, the rat can remember where it has been and not go back there.

● Efficient food gathering in this situation requires the use of short-term
memory: the rat must remember which arms it has already visited on
this particular session.

● Rats with damage in the hippocampal area of the brain, a structure in
the forebrain below the cerebral cortex, are not so good at this.
○ They seem to remember the task itself; placed at the start, they will
run promptly down one of the alleys and eat the food.
○ But on successive trials, they are far more likely than normal rats to
revisit an arm where they have already eaten all the food there is.

● Thus, this kind of experiment demonstrates the role of the brain in
internal mapmaking, navigation and memory.
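
The task's logic is simple enough to state as a rule, which makes the memory
demand easy to see. Below is a minimal Python sketch (all names and parameters
are illustrative assumptions, not from any published model) contrasting a
"rat" that remembers visited arms with one that cannot, a crude stand-in for
the hippocampal deficit described above:

```python
import random

def radial_maze_session(n_arms=8, has_memory=True, max_choices=30):
    """Simulate one session: count choices needed to empty every baited arm.

    A 'rat' with working memory never revisits an arm; one without
    memory chooses at random each time (illustrative assumption).
    """
    baited = set(range(n_arms))   # every arm starts with food
    visited = set()
    choices = 0
    while baited and choices < max_choices:
        if has_memory:
            arm = random.choice([a for a in range(n_arms) if a not in visited])
        else:
            arm = random.randrange(n_arms)  # may revisit an already-empty arm
        visited.add(arm)
        baited.discard(arm)
        choices += 1
    return choices

# An intact 'rat' needs exactly 8 choices; a memoryless one needs many more.
print(radial_maze_session(has_memory=True))   # -> 8
print(radial_maze_session(has_memory=False))  # -> typically far more than 8
```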
B. F. Skinner
1904 - 1990

Operant Conditioning
Skinner: Operant Conditioning

● According to Skinner:
○ Psychology is about behavior,
not about the mind, and not
about the nervous system. It
deals only with variables that
can be directly observed.

● He, like Thorndike, quickly became
convinced of the great power that
reward and reinforcement can
exert on behavior.
Skinner Box

● He invented the Skinner box (he called it
an operant chamber).
● It was a small chamber in which there was a
response to be made (a lever for a rat to
press, or a disk mounted on the wall for a
pigeon to peck), and a means of delivering a
reinforcement (a bit of food, a sip of water
or anything else for which his animal would
work).
● The reward having been delivered, the rat or
pigeon was free to respond again.
Recording Responses

● He also invented a mechanical device, the cumulative recorder, for
automatically recording fine differences in the rate of response.

● Indeed, he was one of the pioneers of automation in behavioral
research: responses could be detected, recorded and followed up with
reinforcements, all by automatic apparatus.
Operant Conditioning

● Rather than focusing upon the
things that happen before a
response occurs, as with Pavlov’s
conditioned reflexes, Skinner
found (as had Thorndike) that the
events following a response had
a great influence on its
subsequent rate of occurrence.

● If food was presented to a
hungry rat after it had pressed a
lever, the rate of lever pressing
would increase. This he called
Operant Conditioning.
Operant Conditioning

● Operant Conditioning: if a response (the operant) is followed by a
reinforcing stimulus, the response strength is increased.

● Lever pressing is an operant.
● Food for a hungry rat is a reinforcing stimulus.
● If an animal is reinforced for lever pressing only if a light is on, and is never
reinforced if it is off, then the animal will come to press at a much higher rate
when the light is on than when it is off. This is discrimination.
● Stimulus Discrimination: a learned response to a specific stimulus but not to
other, similar stimuli.
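
A toy simulation makes the discrimination contingency concrete. This is only
an illustrative sketch (the update rule and the numbers are our own
assumptions, not any standard model of operant strength): pressing is
reinforced only when the light is on, so response strength grows under that
stimulus alone:

```python
import random

# Response strength under each stimulus condition (arbitrary starting values).
strength = {"light_on": 0.1, "light_off": 0.1}

for trial in range(1000):
    state = random.choice(["light_on", "light_off"])
    pressed = random.random() < strength[state]     # press more when stronger
    reinforced = pressed and state == "light_on"    # the discrimination rule
    if reinforced:                                  # reinforcement strengthens
        strength[state] = min(1.0, strength[state] + 0.05)

print(strength)  # pressing becomes far more probable when the light is on
```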
● POSITIVE REINFORCEMENT: presenting something the organism likes →
behaviour strengthened.
● NEGATIVE REINFORCEMENT: removing something the organism doesn’t like →
behaviour strengthened.
● PUNISHMENT: presenting something the organism doesn’t like →
behaviour weakened.
Schedules of Reinforcement

● A schedule of reinforcement is one in which reinforcement is made
available to the subject or participant only some of the time,
according to certain rules; these rules define the schedule. And it
turns out that different schedules give rise to characteristically
different patterns of operant behavior.

1. Continuous Reinforcement Schedule
2. Fixed Ratio (intermittent)
3. Variable Ratio (intermittent)
4. Fixed Interval (intermittent)
5. Variable Interval (intermittent)
Intermittent / Continuous Reinforcement

● Intermittent reinforcement schedule: one may reinforce certain
instances of a given behavior while allowing other instances to go
unreinforced.
○ Any rule specifying a procedure for occasionally reinforcing a
behavior.

● Continuous reinforcement schedule: every response the animal or
subject makes is reinforced.
○ It is the simplest schedule of reinforcement.
Fixed Ratio (FR)

● Every nth response is reinforced.
● For example, if every 10th response the animal makes is reinforced, the
schedule is FR 10.

● On such a schedule, the animal will typically pause after each
reinforcement and then run off the next series of responses at a high
rate.
● The length of the post-reinforcement pause depends on the value of
the FR - the higher the value, the longer the pause.
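
Stated as a rule, FR is just a response counter. A minimal sketch in Python
(the function names are our own, for illustration only):

```python
def make_fixed_ratio(n):
    """Return a schedule function: reinforce every nth response (FR n)."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True    # reinforcement delivered
        return False
    return respond

fr10 = make_fixed_ratio(10)
outcomes = [fr10() for _ in range(30)]
print([i + 1 for i, r in enumerate(outcomes) if r])  # -> [10, 20, 30]
```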
Variable Ratio (VR)

● The number of responses required to produce reinforcement
changes unpredictably from one reinforcement to the next.

● For example, the first reinforcement might come after 5 responses,
the second after 2 responses, the third after 8 responses, and so on.
The schedule is named for the average requirement: a VR 5 schedule
requires 5 responses on average.

● A variable ratio schedule produces a high, steady rate of responses
and little or no post-reinforcement pause.
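
The same counter gives a VR schedule if the requirement is redrawn at random
after each reinforcement; the uniform draw below is an arbitrary illustrative
choice that merely averages to the named value:

```python
import random

def make_variable_ratio(mean_n):
    """Reinforce after a randomly varying number of responses (VR mean_n)."""
    required = random.randint(1, 2 * mean_n - 1)   # averages mean_n
    count = 0
    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = random.randint(1, 2 * mean_n - 1)  # new requirement
            return True
        return False
    return respond

vr5 = make_variable_ratio(5)
outcomes = [vr5() for _ in range(50)]
print(sum(outcomes), "reinforcements in 50 responses")  # roughly 10, varying
```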
hmmmm
● Gambling and lottery games are good examples of a reward based on a variable ratio
schedule.
● The sequence of outcomes in some forms of gambling (e.g., slot machines) is quite
similar to a partial reinforcement schedule (Knapp, 1976; Skinner, 1953, 1969).
Winning, for example, represents a positive reinforcement. With partial reinforcement,
rewards occur with some bets, but not all. Gamblers are uncertain about which bets
will produce rewards because the number of responses is always changing. This is
called a variable ratio schedule of reinforcement.
● Consider a door-to-door salesperson who averages one sale for every 10 houses
called upon. This does not mean that the salesperson made a sale at exactly
every 10th house.
● Sometimes a sale might have been made after calling upon five houses, sometimes a
sale might have been made at two houses in a row and sometimes the salesperson
might have called on a large number of houses before making a sale.
Fixed Interval (FI)

● A reinforcement becomes available after a fixed period of time
following the previous reinforcement.

● With FI 10, after a reinforcement, no further reinforcement is
available until 10 minutes have passed.
● On an FI schedule, the animal will come to make only a few
responses following reinforcement, but then will respond at a
gradually increasing rate until the next reinforcement occurs.
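
FI depends on elapsed time rather than a response count. A small sketch with
the clock in arbitrary time units (names are illustrative assumptions):

```python
def make_fixed_interval(interval):
    """FI schedule: the first response after `interval` time units is reinforced."""
    available_at = interval
    def respond(now):
        nonlocal available_at
        if now >= available_at:
            available_at = now + interval   # next interval starts now
            return True
        return False
    return respond

fi10 = make_fixed_interval(10)
print(fi10(3))    # False: too early, the response goes unreinforced
print(fi10(11))   # True: first response after 10 units is reinforced
print(fi10(15))   # False: nothing is available again until t = 21
```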
Fixed Interval

● Suppose that two young children play together each morning.
Approximately two hours after breakfast, their mother has a
midmorning snack prepared for them, and approximately 2 hours after
that, lunch is prepared. Thus, the behavior of arriving at the kitchen is
reinforced on an FI 2-hour schedule. Within each 2-hour period, as the time
draws to a close, the children begin making more and more frequent
trips to the kitchen, each time asking ‘Is our food ready?’ After eating,
they run out and play, and there is a fairly lengthy pause before they
start once again to make trips to the kitchen.
Variable Interval

● In this schedule, reinforcement is made available at variable
intervals following a reinforcement.
● This means that reinforcement could become available at any time.

● On a VI schedule, there is a steady, moderate rate of response
throughout the session.

● Example: checking your e-mail or Facebook.
● Going fishing: you might catch a fish after 10 minutes, then have
to wait an hour, then have to wait 18 minutes.
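
VI is the same clock-based rule with the interval redrawn at random after each
reinforcement, which is why there is never a "safe" time to stop responding.
A sketch with an arbitrary uniform draw (illustrative, not a standard
implementation):

```python
import random

def make_variable_interval(mean_t):
    """VI schedule: reinforcement becomes available after a varying interval."""
    available_at = random.uniform(0, 2 * mean_t)   # averages mean_t
    def respond(now):
        nonlocal available_at
        if now >= available_at:
            available_at = now + random.uniform(0, 2 * mean_t)
            return True
        return False
    return respond

vi = make_variable_interval(10)          # like fishing: a catch may come anytime
print([vi(t) for t in range(0, 60, 5)])  # scattered, unpredictable reinforcements
```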
Summarized

● Continuous: every response is reinforced.
● Fixed Ratio: every nth response; high rate with a pause after each reinforcement.
● Variable Ratio: requirement varies around a mean; high, steady rate, little or no pause.
● Fixed Interval: first response after a fixed time; responding speeds up as the interval ends.
● Variable Interval: first response after a varying time; steady, moderate rate.
Albert Bandura
1925 - 2021

Social learning theory
Albert Bandura: Imitation and Social learning

● Social Learning Theory
○ A fusion of behaviorism and more
recent cognitive approaches.
○ It recognizes the contributions to
knowledge made by the objective
study of learning as pioneered by
Pavlov, Skinner and Thorndike.
However, it emphasizes a cognitive
interpretation of these processes, as
emphasized by Kohler and especially
by Tolman.
Early experiments

● Bandura showed that a child’s aggressive behavior could be increased
by letting the child watch another person behave aggressively.

● In an early experiment:
○ Small children watched a film showing two men, Rocky and
Johnny, playing with toys. Johnny refused to share his toys, so
Rocky took them away by force and marched off with the toys,
while Johnny sat dejectedly in a corner.

● Then each child was left alone for 20 minutes, but
watched through a one-way mirror.
○ Children who had watched the film were much
more aggressive in their play than children in the
control condition who had not watched the film.
Another experiment

● Nursery school children were allowed to
watch an adult punch, kick, hit and otherwise
mistreat a Bobo doll.

● After watching the violent mistreatment of the
Bobo doll, the children were taken into
another room that contained a few toys,
including a Bobo doll.
Results

● Compared to children who had not seen this aggressive behavior
modeled, the children who had were enthusiastic in doing their own
punching and kicking and otherwise mistreating the doll.

● The children imitated the very acts that they had observed and used
the very words they had heard the adult model use.
Follow up

● Follow-up experiments showed that children are selective about which
behaviors they imitate; they do not imitate just anything they see.
○ Imitation is more likely if the model is rewarded for the
behavior rather than punished.
○ More likely if the model has high status.
○ And more likely if the model is similar to the child.
Processes operating in Observational Learning

1. Attentional Processes: the model must be attended to.
2. Retention Processes: what the model did, and the consequences of this, must be
remembered.
3. Skills for performing the observed behavior: the child must have the necessary skills
and so be able to reproduce the activities in question.
4. Reinforcement/Rewards: if there are rewards or some reinforcement for the observed
behavior, it is more likely to be performed.
An experiment demonstrating the role of retention

● Children watched a model under three different conditions:
○ First condition: the children described aloud the actions that were
performed by the model.
○ Second Condition: the children were instructed simply to observe
carefully.
○ Third condition: the children were required to count rapidly while
watching what the model did; this was designed to interfere with
the children’s processing of the information the model’s action
provided.
● Results showed that the children who had described the model’s
actions in words learned best, followed by those who simply watched.

● Those who watched under the interference condition learned the least.
Application in clinical settings

● Bandura has applied these findings to clinical settings, in attempting to
help people solve problems of living.
● For example, irrational fears or phobias may be treated by modeling
procedures. A person who has a severe fear of (say) cats may watch
another person carry out increasingly bold approaches to the feared
object. The model may encourage the client to follow along with the
modeled responses step by step, so that the imitation of the model’s
actions occurs in small steps. Reinforcement is given at each step for
dealing with the feared object in a less fearful way.