EXTINCTION AS A FUNCTION OF PARTIAL
REINFORCEMENT AND DISTRIBUTION
OF PRACTICE ¹

BY VIRGINIA F. SHEFFIELD
Yale University

¹ This article reports part of a dissertation submitted to the faculty of the Department of Psychology of Yale University in partial fulfillment of the requirements for the Ph.D. degree. The writer is indebted to Dr. Neal E. Miller, under whose direction the research was conducted, and to Dr. Clark L. Hull and Dr. Carl I. Hovland, who served on the advisory committee.

PURPOSE

The purpose of this study was to determine whether distribution of practice influences the effect of partial reinforcement on resistance to extinction. The problem has theoretical significance because of its bearing on a general hypothesis concerning extinction which may explain the finding that partial reinforcement results in greater resistance to extinction than reinforcement on every trial.

For some time it has been recognized that reinforcement on every trial is not essential either for the establishment or for the maintenance of a conditioned response. Pavlov (18) describes an experiment performed in his laboratory in which conditioned salivation in a dog was established with food reinforcement on every other trial and on every third trial. Others (2, 3), while not specifically studying partial reinforcement, have also obtained successful acquisition using partial reinforcement instead of reinforcement on every trial. Also it has been shown (1) that a conditioned response, established with reinforcement on every trial, may be maintained at almost its original level with relatively infrequent reinforcement.

The most striking finding with partial reinforcement, however, is that not only are conditioned responses readily acquired with partial reinforcement, but also they are considerably more resistant to extinction than when reinforcement is given on every trial. For example, Skinner (20), using bar pressing with rats, interspersed reinforcements at regular time intervals ('periodic reconditioning') or after given numbers of responses ('reinforcement at a fixed ratio'). He found that many more responses were required to produce extinction than when reinforcement had been given for each response.

Attention was particularly focused on the phenomenon of increased resistance to extinction following partial reinforcement in an experiment by Humphreys (11) using the conditioned eyelid response in human subjects. Of two groups of subjects, one trained with reinforcement on every trial and one trained with reinforcement 'randomly' on only half the trials, the latter group responded at a significantly higher level throughout extinction. Humphreys (13) later obtained the same result using the conditioned galvanic response in humans.

It has been claimed (8, 11) that the increased resistance to extinction with partial reinforcement challenges the adequacy of stimulus-response learning theory. Humphreys (11) proposed the principle of expectancy as an alternative theory. This 'principle' is not rigorously stated and depends on commonsense concepts in its explanation of the results. The 'expectancy' explanation may be paraphrased as follows. Conditioned responses are the consequence of the subject's expectation that reinforcement will appear. In extinction after reinforcement on every trial, the response rapidly disappears because the sudden shift from uniform reinforcement to uniform nonreinforcement makes it easy to
change to an expectation of uniform nonreinforcement. But in extinction after partial reinforcement, the subject continues to expect that reinforcement will be periodic as it was during training; extinction is prolonged by his expectation that reinforcement will be reintroduced.

This interpretation is contrasted with stimulus-response learning theories, which usually assume, in one form or another, that reinforcement strengthens a conditioned response and nonreinforcement extinguishes it. This suggests that the infrequent reinforcements and the interspersed nonreinforcements which characterize partial reinforcement would produce a weaker response, which would extinguish more rapidly than would be the case with uniform reinforcement.

However, it is believed that ordinary stimulus-response learning concepts, such as those used by Guthrie (7) and Hull (10), can explain the increased resistance to extinction. The present study was designed to test a theoretical interpretation which utilizes only stimulus-response concepts. The interpretation hinges on a hypothesis about a general factor in extinction which is particularly applicable in the case of partial reinforcement.

THEORY

The basic hypothesis is that extinction necessarily involves different cues from those used during training. Omission of reinforcement alters the context and makes extinction a case of 'transfer of training' in which a certain amount of generalization decrement is expected because of the change in cues. This hypothesis would be applied in explaining the effect of partial reinforcement on extinction as follows. Occurrence of reinforcement on a given trial produces effects which, in varying degree, provide characteristic stimuli at the start of the following trial, these stimuli becoming part of the total stimulus pattern acting at the start of the next trial. When reinforcement is given on every trial, the after-effects of the reinforcement will be part of the conditioned stimulus pattern on every trial after the first. In the present experiment, where rats were trained to run in an alley to food, the after-effects of reinforcement would perhaps include continued food taste and even particles of food in the mouth, temporary relaxation, etc. When extinction is begun, the stimulus pattern is changed not only by the absence of the after-effects of reinforcement, but also by the presence of whatever new stimulation results from the absence of reinforcement, particularly from the subject's reaction to that absence. In the present experiment, the after-effects of nonreinforcement would include such new reactions as searching, conflict, frustration, slowing up, etc. This change in the conditioned stimulus pattern should result in a weakening of the conditioned response, because the cues present during extinction are at a point on the generalization gradient at which responding is expected to be weaker than to the cues present during training.

However, when training with partial reinforcement is given, the subject is exposed, on reinforced training trials that follow nonreinforced trials, to cues which are normally present only during extinction. Examples of such cues for animals reinforced with food would be lack of food particles in the mouth, a state of frustration, etc. In other situations, examples would be the absence of irritation from recent air puff to the eye or shock to the wrist, cues from any relaxation resulting from freedom from the air puff or shock, implicit verbalizations relating to nonreinforcement, etc. With these nonreinforcement cues as part of the current stimulus pattern, reinforcement is
reintroduced, and the subject therefore learns to perform the response in the presence of such nonreinforcement cues. Since the response has become conditioned during training to the cues characteristic of extinction, there is less loss through a change in the conditioned stimulus pattern when reinforcement is withdrawn completely than is found after training by reinforcement on every trial.

Thus, the initiation of extinction trials produces a relatively large change in the conditioned stimulus pattern when it follows training with reinforcement on every trial, but much less change when it follows training with partial reinforcement. Because the response evoked by a generalized stimulus is weaker than that evoked by the reinforced stimulus, the conditioned response during extinction will be weaker in the former case than in the latter. The possibility of the operation of this mechanism has been recognized by other authors (4, 9, 16, 17).

METHOD

The proposed test of this interpretation consisted of controlling some of the after-effects of reinforcement and nonreinforcement by spacing of trials. The assumption was that if animal subjects were used with trials widely spaced, most of the after-effects of reinforcement or nonreinforcement would have dissipated by the start of the next trial, making the conditioned stimulus pattern much the same whether reinforcement had or had not been received on the preceding trial. The group receiving reinforcement on every trial and the group receiving partial reinforcement would thus be trained with more similar conditioned stimulus patterns than when trials were massed. Similarly the change in conditions when extinction was started would be more nearly the same for the two groups; as a result, the difference in resistance to extinction should be reduced.

Specifically, rats were trained to go down an alley with food at the end of the alley as reinforcement. Such after-effects of reinforcement as food taste and such after-effects of nonreinforcement as frustration may reasonably be assumed to dissipate with the passage of time and the performance of other behavior. Therefore, it follows that the extent to which differentiable reinforcement and nonreinforcement cues enter into the total stimulus pattern on a given trial will be maximized if trials are spaced close together and will be minimized if trials are spaced far apart. Massing of training trials, then, should give the maximum advantage to partial reinforcement in preventing extinction as compared with reinforcement on every trial, whereas relative spacing of trials should diminish or destroy this advantage. It is this deduction which the present experiment was designed to test.

Seventy-two rats were used in the experiment. Half the animals were trained with 100 percent reinforcement for 30 trials, half with reinforcement randomly on 50 percent of the 30 training trials. Of each group, half were trained with the trials massed (15-sec. interval), half with the trials spaced (15-min. interval). The question to be answered was: would partial reinforcement have less advantage in resistance to extinction in the latter group?

It seemed probable that the spacing of trials, both during training and during extinction, would have the direct effect of increasing resistance to extinction as well as its indirect effects through interaction with partial reinforcement. At the same time, it seemed probable that a change from massed to spaced or from spaced to massed when extinction begins would have the effect of hastening extinction. In order to balance completely for these factors, each of the four training groups was divided for extinction, half receiving massed extinction trials and half spaced extinction trials. This procedure not only served as a balancing procedure in the main problem to be investigated, but also had intrinsic value in supplying data on the problem of the relative resistance to extinction when extinction trials are massed compared with when they are spaced. The results of this aspect of the experiment will be presented in a separate article (19). Table I shows the design in schematic form; each cell of the table represents a separate subgroup of the experiment.

DETAILS OF APPARATUS AND PROCEDURE

The apparatus, shown in Fig. 1, consisted of a four-foot alley connecting a starting box and a goal box. Walls throughout were nine inches high. The interior of the starting box was painted white, and the interiors of the alley and goal box were painted black. The light in the room was from a ceiling fixture directly over the center of the apparatus. The two doors were
operated vertically by strings and pulleys. Small pellets of wet mash, made from powdered Purina dog chow, flour, and water, were presented in a white food cup in one corner of the goal box. On all nonreinforced trials, both during partial reinforcement and during extinction, the white food cup was removed from the goal box. A black curtain placed diagonally at the entrance to the goal box cut off sight of the food cup from the alley, so that an animal could not tell whether the cup was present or absent until he was actually inside the goal box.

TABLE I
DESIGN OF THE EXPERIMENT

                    Massed Training               Spaced Training
                Massed Ext.   Spaced Ext.     Massed Ext.   Spaced Ext.

100% Reinf.       100% MM       100% MS         100% SM       100% SS
                     9             9               9             9
50% Reinf.        50% MM        50% MS          50% SM        50% SS
                     9             9               9             9

The rats were always placed in the starting box facing the closed door. After approximately two sec. the door of the starting box was raised. When the rat left the starting box, the door was lowered; when he stepped onto the floor of the goal box, the door near the goal box was lowered to prevent retracing. Micro-switches mechanically operated by the animal's weight on hinged floor sections in the two end boxes activated Springfield timers which recorded starting time (time from opportunity to respond until departure from the starting box) and running time (time from leaving the starting box to entering the goal box). The starting box was covered with a glass lid to prevent animals from prematurely operating the timer switches by jumping into the air before they actually left the box. The alley and goal box had no covering. A holding relay prevented re-operation of the switches by retracing. The goal box door was set back a short distance in the alley so that once the animal's forepaws were inside the goal box and the second timer had stopped, the door could be closed even if the animal did not immediately enter the box completely.

Seventy-two male albino rats from the albino farms at Redbank, N. J. were used, their ages ranging from 83 to 107 days (average 93 days) at the time they were started in the experiment. They had never been used in any previous experimentation.

Before and during the experiment the animals lived in the experimental room. During the two weeks prior to training, the animals were given daily taming sessions, consisting of handling during feeding periods until all would eat pellets of food from the experimenter's hand. They were also given practice at eating in unfamiliar surroundings: cages unlike their living cage, a table top, and a large cardboard carton. During these two weeks, they were fed a restricted diet of wet mash given about noon. Two days prior to the start of the experiment, the feeding time was shifted to the time at which their training would be started, a shift ranging from zero to four hours.

FIG. 1. Apparatus. (Diagram of the alley, with labels for the starting box, the two doors, the curtain, the food cup, and the goal box, and a one-foot scale mark.)
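The factorial layout of Table I can be enumerated mechanically. The following is only a minimal sketch, not part of the original procedure; the names `REINFORCEMENT`, `TRAINING`, `EXTINCTION`, `RATS_PER_CELL`, and `design_cells` are hypothetical and were chosen for illustration.

```python
from itertools import product

# Factor levels as given in Table I (the identifiers themselves are assumptions).
REINFORCEMENT = ("100%", "50%")   # proportion of reinforced training trials
TRAINING = ("M", "S")             # massed or spaced training trials
EXTINCTION = ("M", "S")           # massed or spaced extinction trials
RATS_PER_CELL = 9                 # entry shown in each cell of Table I


def design_cells():
    """Return the eight subgroup labels in the order of Table I, e.g. '100% MM'."""
    return [f"{r} {t}{e}" for r, t, e in product(REINFORCEMENT, TRAINING, EXTINCTION)]


cells = design_cells()
total_rats = RATS_PER_CELL * len(cells)

for label in cells:
    print(label, RATS_PER_CELL)
print("total animals:", total_rats)
```

Eight cells of nine animals account for the 72 rats; the first letter of each label refers to training and the second to extinction.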

On the first day of the experiment proper, all animals were given 10 pre-training trials in the apparatus under constant conditions of massed trials and 100 percent reinforcement. This pre-training provided a criterion for eliminating, in advance of the application of the experimental variables, those rats too timid to explore the alley (and probably also helped insure that the rats scheduled to receive partial reinforcement would not give up on the first few training trials). If during the course of the 10 pre-training trials an animal took more than five min. to find the food or to eat it, he was discarded. On this basis, 14 rats were eliminated. Two rats were discarded after pre-training despite acceptable performance in the alley, one because he violently resisted being picked up and because he consistently bit the experimenter, the other because the apparatus broke down during experimental training. All discards were replaced to make the final group of 72 experimental animals.

The 10 pre-training trials were given on the first day; 30 training trials were given on the second day; and 30 extinction trials were given on the third day. The trials were started at the same time on each of the three successive days. The 50-percent reinforcement groups, having the same number of training trials as the 100-percent reinforcement groups, received only half as many reinforcements during training. At the end of their training trials, when the animals were given the balance of their ration for the day, this difference in amount eaten between 50-percent and 100-percent reinforcement animals was made up.

At the end of a run, animals were left in the goal box for 10 sec., or until they found and ate the food, longer times being necessary on early training trials. Ten sec. was an adequate time to eat the pellet of wet mash once the animal had learned, and this time was used whether food was present or not, during pre-training, training, and extinction. The animals were then removed to their individual living cages to await the next trial. Animals receiving massed trials were left in their living cages for 15 sec. between trials. This amount of time was required to record the time measures and reset the apparatus for the next trial. Due to variable running time, the time from the start of one trial to the start of the next was thus variable, but it very rarely ran over two min. and was usually close to ½ min. except on the first few training trials and during extinction. For spaced trials, the time from the start of one trial to the start of the next was 15 min. During extinction the animals were removed from the apparatus if either the starting time or the running time ran to two min., but all animals were given the full 30 extinction trials.

Since the training session extended over a much longer time for spaced groups than for massed groups, it was not possible, with the experimental setup used, to obtain strictly comparable hunger drive from group to group. However, the hunger drive was kept at a high enough level for all rats that it was felt that any variability in absolute level would be unimportant. In any case, whatever drive difference may have been present between certain of the groups did not enter into the basic comparisons of 50 percent and 100 percent reinforcement, for which drive was constant in groups compared. A restricted diet of seven gm. (dry weight) of the dog chow and flour mash was used before and during the experiment. Thirty pellets, the most any rat received during training, weighed about 2⅓ gm. (dry weight), which was only one-third of the restricted diet and an even smaller fraction of what any rat would eat if allowed an unlimited supply of food. Rats given massed training trials had a 23-hour hunger drive at the start of extinction. Spaced training was spread through a period of about 7½ hours, at the end of which time the animals were given the balance of their food for that day. Extinction for space-trained animals was started 24 hours after the start of training, i.e., 16½ hours after feeding. In an attempt to make the hunger drive of space-trained animals more comparable at the start of extinction with that of mass-trained animals, a total food ration of only six gm. was given at the end of training instead of the usual seven gm.

In preparing the orders of reinforced and nonreinforced trials for presenting 15 rewards distributed over 30 training trials, certain restrictions were used. These restrictions were selected to produce orders which contained single nonreinforced trials, and successions of two in a row, three in a row, and four in a row, which closely approximated, within each series of 30 trials, the distribution expected by chance in an infinite series. In other words, chance frequencies were stratified within orders. Nine orders were prepared, each order being used once in each subgroup. In each of the orders prepared:

1. Four successive nonreinforcements and three successive nonreinforcements each appeared once.
2. Three successive reinforcements appeared twice.
3. Two successive reinforcements and two successive nonreinforcements each appeared twice.
4. Isolated single reinforcements appeared
five times, and isolated single nonreinforcements appeared four times.

In order to prevent partial-reinforcement animals from refusing to run in the early part of training and in order to end their training with the response at high strength, the following additional restrictions were used:

5. The first training trial was always rewarded.
6. The set of four successive nonreinforcements never appeared in the first half of training.
7. The last training trial was always an isolated single reinforcement.

Within these restrictions the distributions used were prepared from Tippett's (21) random numbers. The nine orders used were assigned randomly to the nine animals in each subgroup.

STATISTICAL ANALYSIS

The statistical analysis of the results treated the data as nine replications of the experiment, with eight degrees of freedom for each comparison made. This treatment was partly dictated by a balancing procedure used in assigning rats to the eight conditions.

The rats used were obtained in three different shipments, each shipment upon arrival being broken down into sets of eight cage-mates. Differences in ease of taming were apparent between the shipments and between the sets of eight cage-mates within each shipment (perhaps due to the more docile being removed from the shipping crate first). These differences were balanced out by dividing the rats from each shipment equally among the eight experimental conditions and by assigning one each of the eight rats in a cage to the eight experimental conditions. The assignment of the eight rats to the eight experimental groups was done randomly, using Tippett's tables of random sampling numbers (21).

As a result of this balancing procedure, the eight rats within a replication were more homogeneous than the rats within a given experimental group. The greater precision in the experiment obtained by this procedure was taken into account in the statistical analysis by basing comparisons on the eight degrees of freedom provided by the nine replications. Thus in testing for the significances of the mean differences between two major groups, the appropriate difference was found among rats within each replication, and the mean difference was tested from the distribution of nine differences so obtained. This procedure removed from the estimated variance of mean differences any correlation due to balancing of cage differences.

RESULTS

Although two time measures were recorded (starting time and running time), comparison of the learning and extinction curves for the eight experimental groups revealed that the pattern of results was the same for the two measures. Therefore, the two measures were added together to make a single measure of 'response time' in order to increase the stability of the measurements and simplify the statistical analysis.

Acquisition

In Fig. 2 are shown learning curves for the four training groups. The

FIG. 2. Acquisition curves for the four training groups, showing median response time (averaged for groups of five trials) during the 10 pre-training trials and the 30 training trials. (Panels: massed training and spaced training, each divided into pre-training and training; curves: 100% reinforcement and 50% reinforcement; horizontal axis: trials in groups of five.)
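The orders of reinforced and nonreinforced trials given to the 50-percent groups over the 30 training trials plotted in Fig. 2 were built under the seven numbered restrictions of the procedure. The sketch below shows one way such an order could be drawn and checked. It is an illustration under stated assumptions, not the original method: Python's pseudo-random generator stands in for Tippett's tables, restriction 6 is read as "the run of four nonreinforcements begins after trial 15," and the names `R_RUNS`, `N_RUNS`, `make_order`, and `run_lengths` are hypothetical.

```python
import random
from itertools import groupby

# Run lengths fixed by restrictions 1 to 4 (15 reinforced + 15 nonreinforced trials).
R_RUNS = [3, 3, 2, 2, 1, 1, 1, 1, 1]   # two triples, two pairs, five isolated singles
N_RUNS = [4, 3, 2, 2, 1, 1, 1, 1]      # one run of four, one of three, two pairs, four singles


def make_order(rng):
    """Draw one 30-trial order ('R' reinforced, 'N' nonreinforced) by rejection sampling."""
    while True:
        r_runs = rng.sample(R_RUNS, len(R_RUNS))   # shuffled copies of the run lengths
        n_runs = rng.sample(N_RUNS, len(N_RUNS))
        if r_runs[-1] != 1:
            continue                               # restriction 7: last trial is an isolated R
        pieces = []
        for i, r_len in enumerate(r_runs):         # starting with an R run gives restriction 5
            pieces.append("R" * r_len)
            if i < len(n_runs):
                pieces.append("N" * n_runs[i])
        order = "".join(pieces)
        if order.index("NNNN") >= 15:              # restriction 6 (assumed reading)
            return order


def run_lengths(order):
    """Return (symbol, length) for each maximal run in the order."""
    return [(symbol, len(list(group))) for symbol, group in groupby(order)]


order = make_order(random.Random(0))   # the seed is arbitrary
runs = run_lengths(order)
print(order)
```

Because nine reinforced runs alternate with eight nonreinforced runs, every order so drawn begins and ends with a reinforced trial and contains exactly 15 of each kind.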

FIG. 3. Extinction curves for 100-percent and 50-percent reinforcement groups after massed training and after spaced training, showing percentage of responses in sets of five trials which were at or below 16.3 sec. (the overall median response time). The reference points at the beginning are the values for the last five training trials. (Panels: after massed training and after spaced training; horizontal axis: end of training, then extinction trials in sets of five.)
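The frequency measure plotted in Fig. 3 counts the responses at or below a cut-off time within each set of five trials. A minimal sketch of that scoring follows. The response times in the example are invented purely for illustration and are not data from the experiment; the function names and the use of `None` to mark a trial on which the animal was removed at the two-minute limit are likewise assumptions.

```python
def criterion_score(response_times, cutoff):
    """Count trials whose response time (sec.) is at or below the cut-off.

    A value of None marks a trial on which the animal was removed from the
    apparatus at the time limit; such a trial never meets the criterion.
    """
    return sum(1 for t in response_times if t is not None and t <= cutoff)


def percent_by_block(response_times, cutoff, block=5):
    """Percentage of criterion responses in successive sets of `block` trials."""
    blocks = [response_times[i:i + block] for i in range(0, len(response_times), block)]
    return [100.0 * criterion_score(b, cutoff) / len(b) for b in blocks]


# Hypothetical response times for ten trials (illustration only).
example_times = [5.0, 16.3, 20.0, None, 12.1, 30.0, None, None, 16.4, 9.9]
OVERALL_MEDIAN = 16.3   # sec., the cut-off named in the caption of Fig. 3

print(criterion_score(example_times, OVERALL_MEDIAN))
print(percent_by_block(example_times, OVERALL_MEDIAN))
```

Counting responses against a fixed cut-off keeps the measure defined even when more than half the trials in a set ran to the time limit, which is the case in which a median response time cannot be found.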

massed and spaced extinction groups have been combined for each training condition, since up to the end of training there was no difference in their treatment. The measure used is median response time, which was obtained for each trial and averaged for groups of five trials. Medians were used as more representative of the data because means would be unduly affected by occasional trials on which an animal had an unusually long response time. Pre-training and training are both shown.

Tests of the differences in the level of performance reached on the last half of the training trials show that there were no significant differences among the four groups. Groups receiving 100 percent reinforcement were very slightly superior to those receiving 50 percent reinforcement, but the probability that a difference this large and in this direction could arise by chance is .20 for massed training, .12 for spaced training, and .15 overall.² Groups receiving spaced training were very slightly superior to those receiving massed training, but the probabilities are .13 for 100 percent reinforcement, .28 for 50 percent reinforcement, and .15 overall.

² The measure used in computing significances of differences was the time, for each animal, on his median trial out of the last 15 trials. The measure used in the curves was the time, for each trial, of the median animal (groups of five trials being averaged). This difference between the methods of obtaining the two measures probably accounts for the fact that the slight superiority of 100 percent reinforcement during massed training indicated by the significance test is not apparent in the curves.

Extinction

Fig. 3 shows extinction curves for the 100-percent and 50-percent reinforcement groups that were mass trained compared with the 100-percent and 50-percent reinforcement groups that were space trained (massed and spaced extinction being combined for each training group). Median response time could not be used in
plotting extinction data since on some trials more than half the animals were removed from the apparatus because of exceeding the two-minute time limit, making the median indeterminate. Median response time over an individual animal's trials also could not be used as a measure of individual animals' performance in the analysis of extinction data because some animals exceeded the time limit on more than half of their extinction trials. A frequency measure was indicated, and that chosen was the number (or percentage) of trials with response times at or below the combined median response time for whatever groups were being compared. The median was chosen as the cut-off point partly because of its arbitrary nature and partly because it places the mean of the scores close to the middle of the range, which should give maximum sensitivity in comparing groups and at the same time avoid skewed distributions. The median time for all 72 animals on all 30 extinction trials was 16.3 sec., and it is the mean frequency of responses at or below this value for each set of five trials that is plotted in Fig. 3. The first point on the curves is the corresponding percentage value for the last five training trials, which is shown as a reference point.

Extinction after massed training. As can be seen in Fig. 3, the results after massed training confirm the findings of previous investigators that training with 50 percent reinforcement results in greater resistance to extinction than training with 100 percent reinforcement. The results of a significance test are shown in Table II. All groups reached approximately the same level of performance at the end of training, and performance was also similar on the early extinction trials, but the two groups diverged after the first ten extinction trials. The significance test was made separately for all extinction trials and for the last half of the trials. The means in Table II for all 30 trials used the median of 14.2 sec., obtained for the 30 extinction trials of the 50-percent and the 100-percent mass-trained groups combined, as the cut-off point in getting each animal's score. The means for the last 15 trials used the median of these 15 trials, 24.4 sec., as the cut-off point. It should be pointed out that with the scoring procedure used, the potential range of scores is the same as the number of trials involved,

TABLE II
MEAN NUMBER OF RESPONSES MEETING CRITERION *
DURING EXTINCTION AFTER MASSED TRAINING

                            Mean Score
                      50% Reinf.  100% Reinf.   Mean Diff.   σM diff.    t     df    P**

All 30 ext. trials       17.1        13.0          4.1         1.87     2.2    8     .03
(Md. = 14.2 sec.)
Last 15 ext. trials       9.0         6.0          3.0         0.72     4.2    8    <.01
(Md. = 24.4 sec.)

* I.e., mean number of responses at or below median response time for the particular groups and number of trials being considered.
** For one tail of distribution.
TABLE III
MEAN NUMBER OF RESPONSES MEETING CRITERION *
DURING EXTINCTION AFTER SPACED TRAINING

                            Mean Score
                      50% Reinf.  100% Reinf.   Mean Diff.   σM diff.    t     df    P**

All 30 ext. trials       14.7        15.4         -0.7         2.71     0.3    8     .77
(Md. = 19.1 sec.)
Last 15 ext. trials       6.9         8.0         -1.1         1.49     0.7    8     .50
(Md. = 27.0 sec.)

* I.e., mean number of responses at or below median response time for the particular groups and number of trials being considered.
** For both tails of distribution.
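The t values in Tables II and III are the ratio of the mean difference to its standard error, referred to eight degrees of freedom. The sketch below recomputes that ratio from the tabled summary values and obtains tail probabilities by numerical integration of the Student density. The integration is a stand-in chosen here for convenience and is not the author's method, which presumably relied on printed tables; the function names are hypothetical. Recomputed probabilities may differ slightly from the printed ones because the tabled entries are rounded.

```python
import math


def t_ratio(mean_diff, se_diff):
    """t as the mean difference divided by its standard error."""
    return mean_diff / se_diff


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


DF = 8  # nine replications give eight degrees of freedom

# (label, mean difference, standard error, number of tails) taken from Tables II and III.
ROWS = [
    ("massed, all 30 trials", 4.1, 1.87, 1),
    ("massed, last 15 trials", 3.0, 0.72, 1),
    ("spaced, all 30 trials", -0.7, 2.71, 2),
    ("spaced, last 15 trials", -1.1, 1.49, 2),
]

for label, diff, se, tails in ROWS:
    t = t_ratio(diff, se)
    p = tails * upper_tail(abs(t), DF)
    print(f"{label}: t = {t:.2f}, P = {p:.3f} ({tails}-tailed)")
```

The one-tailed probabilities apply to massed training, where the theory predicts the direction of the difference, and the two-tailed probabilities to spaced training, where it does not.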

so the means shown for all extinction trials and for the last half of the trials cannot be directly compared.

Table II shows that animals trained with partial reinforcement made a significantly larger number of responses during extinction which were below the median response time than did animals trained with 100 percent reinforcement, their superiority being more marked during the latter half of extinction than over all extinction trials.

Extinction after spaced training. When training trials were spaced 15 min. apart, greater resistance to extinction following 50 percent reinforcement as compared with 100 percent reinforcement was absent. This result can be seen in Fig. 3. A significance test is shown in Table III. As in Table II, results are shown for all extinction trials and for the last half of the extinction trials. The scores on which Table III is based used the medians of the 50-percent and 100-percent space-trained groups combined for the relevant number of trials. The theory says nothing about the direction of this difference; as trials are more and more widely spaced, the difference should become smaller and smaller, but at what interval it would become zero and whether it would become negative are questions that would have to be answered empirically. For this reason, the P's shown are for a difference this large in either direction.

Table III shows that with training trials spaced, animals trained with partial reinforcement did not differ significantly during extinction from animals trained with 100 percent reinforcement. The slight difference obtained was in the reverse direction from that obtained with massed training, 100 percent reinforcement being superior.

Effect of distribution of practice on extinction. The results confirm the expectation from the stimulus-response analysis of partial reinforcement that the difference in resistance to extinction between groups trained with 100 percent reinforcement and groups trained with 50 percent reinforcement would be greater if the training trials were massed than if they were spaced. Tables II and III show that with massed training partial reinforcement significantly retarded extinction, whereas with spaced training it did not. They do not show, however, whether this obtained difference in the effect of partial reinforcement was significant. A significance test on this point is shown in Table
IV. Here the overall median response time of 16.3 sec. for all groups combined was used to obtain the scores over the 30 trials, and the median performance of 25.3 sec. for all groups combined on the last 15 trials was used in obtaining scores for this portion of extinction. The t-test is based on the distribution of the second-order differences indicated in the table, over the nine replications of the experiment.

Table IV shows that the difference in resistance to extinction between 100 percent and partial reinforcement is significantly greater with massed training than with spaced training. Results in the same direction were obtained when the analysis shown in Table IV was performed separately for those groups whose extinction trials were massed and for those groups whose extinction trials were spaced, but the differences were not nearly as reliable as in the overall analysis. For massed extinction analyzed separately, the probabilities were .07 for all extinction trials and .12 for the last half of the extinction trials that the difference obtained would arise by chance. For spaced extinction the corresponding probabilities were .20 and .11.

DISCUSSION

As stated in greater detail in the introduction, the present experiment was based on a hypothesis which provides a stimulus-response explanation for the greater resistance to extinction that has been found after training with partial reinforcement as compared with 100 percent reinforcement. According to this explanation, the partial reinforcement technique has the effect of conditioning the response to cues normally present only during extinction. After one or more nonreinforced trials, the conditioned stimulus pattern at the start of a reinforced trial includes after-effects of the preceding nonreinforced trials, which make it much like a trial during extinction. When performance of the response in the presence of these nonreinforcement or extinction cues is now reinforced in the trial being considered, the partial reinforcement group learns to give the response to cues for extinction. When the actual extinction trials are started, the omission of reinforcement does not introduce unfamiliar cues; the response has been conditioned to nonreinforcement cues during training and continues at a relatively high

TABLE IV
COMPARISON OF THE DIFFERENCES * BETWEEN 50% AND 100% REINFORCEMENT GROUPS
DURING EXTINCTION AFTER MASSED TRAINING AND AFTER SPACED TRAINING

                       Massed Training         Spaced Training
                       Mean Score    Mean      Mean Score    Mean     D1 (massed)
                                     Diff.                   Diff.    minus         σ(D1-D2)   t     df   P**
                       50%    100%             50%    100%            D2 (spaced)

All 30 ext. trials     17.8   14.2    3.6      13.6   14.4   -0.8     4.4           1.95       2.3   8    .03
(Md. = 16.3 sec.)
Last 15 ext. trials     9.1    6.3    2.8       6.8    7.8   -1.0     3.8           1.09       3.5   8    <.01
(Md. = 25.3 sec.)

* The measure used was mean number of responses at or below the overall median response time.
** For one tail of distribution.
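
The arithmetic behind Table IV can be retraced with a short sketch. It is offered only as an illustration under stated assumptions, not as the original computation: the values are those read from the table, the figures 1.95 and 1.09 are assumed to be the standard error of D1 minus D2 (the column label is not fully legible in this copy), and the names `rows` and `second_order_t` are hypothetical.

```python
# Illustrative check of the Table IV arithmetic.
# Assumption: "se" is the standard error of (D1 - D2); the column label in the
# scanned table is garbled, so this reading is an inference, not a quoted label.
rows = {
    "all_30_trials": {"massed": (17.8, 14.2), "spaced": (13.6, 14.4), "se": 1.95},
    "last_15_trials": {"massed": (9.1, 6.3), "spaced": (6.8, 7.8), "se": 1.09},
}


def second_order_t(row):
    """Return (D1, D2, D1 - D2, t) for one row of the table.

    D1 and D2 are the 50% minus 100% mean scores after massed and after
    spaced training; t is the second-order difference divided by "se".
    """
    d1 = row["massed"][0] - row["massed"][1]
    d2 = row["spaced"][0] - row["spaced"][1]
    diff = d1 - d2
    return d1, d2, diff, diff / row["se"]


for label, row in rows.items():
    d1, d2, diff, t = second_order_t(row)
    print(f"{label}: D1={d1:.1f}  D2={d2:.1f}  D1-D2={diff:.1f}  t={t:.1f}")
```

Under these assumptions the ratios round to the tabled t values of 2.3 and 3.5, each with 8 degrees of freedom (nine replications minus one).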
level. After training with 100 percent reinforcement, on the other hand, the start of extinction introduces nonreinforcement cues for the first time. The conditioned response to the changed stimulus pattern is therefore weakened because it is at a different point on the generalization gradient from the reinforced stimulus pattern.

From this theoretical analysis it was predicted that spacing of training trials with animal subjects should reduce or destroy the advantage of partial reinforcement during extinction, because with long inter-trial intervals the after-effects of reinforcement or nonreinforcement on the preceding trials would have dissipated to some extent. The conditioned stimulus pattern at the start of a new trial would thus be more similar for the partial and 100-percent reinforcement groups during spaced training, and their performance during extinction should be more comparable. The results of the experiment bear out the prediction. Whereas after massed training, significantly greater resistance to extinction was found for 50 percent than for 100 percent reinforcement, after spaced training, there was no advantage for 50 percent reinforcement; the slight difference obtained was in the reverse direction.

The reversal obtained with spacing in the present results is not a necessary implication of the hypothesis, which implies only that the difference will be reduced. By changing the spaced intervals used, the degree of overlearning (i.e., amount of training with nonreinforcement followed by reinforcement), and other details of the procedure, it may be possible to vary the size of the difference from a small change, with 50 percent reinforcement still showing considerably greater resistance to extinction, to a significant reversal, with 50 percent reinforcement extinguishing more rapidly. Another factor affecting the size of the difference, and one which is probably little influenced by spacing of trials, is reinstatement of cues for continued responding. There are undoubtedly stimuli from the animal's behavior which are reinstated each time the animal is placed in the apparatus and which become cues for continued responding in spite of nonreinforcement. For example, in addition to cues from the frustration which carries over from a nonreinforced trial to the start of the following trial with massed training, there may be cues from conditioned frustration aroused by apparatus cues on each trial with both massed and spaced training. If the latter type of cue is a big part of the stimulus pattern, spacing of trials would not reduce by much the difference between extinction following 50 percent and 100 percent reinforcement.

As stated above, the hypothesis is applied specifically to a comparison of extinction after partial and 100 percent reinforcement. It may be applied more generally as a hypothesis to explain part of the decrement that characterizes all extinction. According to this hypothesis, a response becomes conditioned during training to the after-effects of reinforcement as part of the conditioned stimulus pattern. Omission of reinforcement during extinction thus changes the conditioned stimulus pattern, not only by the removal of reinforcement cues, but also by introducing whatever new cues result from the omission, such as those that are produced by the subject's reaction to the omission. Part of the decrement in the response during extinction would thus be interpreted as a weakening of the
response through generalization to the new stimulus pattern, combined with the weakening effects of any incompatible responses that might be produced by the new stimulus pattern or the omission of reinforcement.

It might be asked whether this hypothesis (as applied to partial reinforcement) can explain cases in which experimenters failed to obtain greater resistance to extinction with partial reinforcement. Humphreys (14), in a bar-pressing experiment with rats, found an advantage for partial reinforcement only if number of reinforcements was held constant. Four groups received reinforcements on (1) 18 out of 52 trials, (2) 18 out of 18 trials, (3) 7 out of 18 trials, and (4) 7 out of 7 trials. Comparison of number of responses during fixed extinction periods showed an advantage for (1) over (2) and an advantage for (3) over (4), in each case the number of reinforcements being the same and the number of trials variable. But there was no difference between (2) and (3), where number of reinforcements is variable and number of trials is constant.

The explanation of the effects of partial reinforcement proposed in the present report assumes that there must be a large enough number of occurrences of nonreinforcement followed by reinforcement during training for a stable response to be conditioned to the extinction cues. In (3) above, only seven reinforcements were given. If it is assumed that the first trial was always reinforced, there would be a maximum of six reinforced trials following nonreinforced trials. It is unlikely that a very strong response could be conditioned to the extinction cues in so few trials. If there had been a control group for (1), with 52 trials all reinforced, it is possible that a comparison of this group with (1) might have revealed a difference in resistance to extinction.

Another experiment which failed to demonstrate any difference in extinction between partial and 100 percent reinforcement is one by Denny (4), involving rats in a T-maze. However, this failure actually provides confirmation for the results of the present study, for the absence of an effect was under circumstances in which the present findings indicate there should be no effect, namely following training with widely spaced trials. Partial reinforcement groups were given one reinforced and one nonreinforced trial per day, and 100-percent reinforcement groups were given two reinforced trials per day (plus, in each case, two trials to the incorrect side, the last two trials of the day being forced to achieve this combination). But the interval between trials given on the same day was 20-30 min., and 24 hours intervened between every two correct choices. In the present study, it was shown that a 15-min. interval was sufficient to reduce to zero the difference in extinction of a running response following partial and 100 percent reinforcement.

Finger (5, 6), on the basis of his studies, concluded that partial reinforcement per se did not result in greater resistance to extinction, but his results are shown to be inconclusive in a study by Lawrence and Miller (15) because of the effects of certain aspects of Finger's technique.

The question arises as to whether other theories that have been proposed to explain the results of partial reinforcement are able to account for the results of the present study. Foremost among other explanations is the 'expectancy' interpretation proposed by Humphreys (11) and later amplified by Hilgard (8). A difficulty in this case is that the interpretation is not stated rigorously enough for its application in a new situation to be clear. In fact, its application even to Humphreys' findings is not unambiguous. The relatively higher level of responding during extinction after partial reinforcement is explained merely as due to the difficulty of shifting from an expectation of intermittent reinforcement to an expectation of complete nonreinforcement, compared with the ease (after 100 percent reinforcement) of shifting from an expectation of uniform reinforcement to an expectation of uniform nonreinforcement. One might as readily argue, at the same level of discourse, just the reverse: that after being habituated to infrequent reinforcements, the subject finds it easy to get used to the idea that there will be none at all, but after being habituated to uniform reinforcement, the subject finds it hard to believe that there will be no more reinforcements.

The point is that Humphreys' explanation did not follow from an expectancy principle; it was an ex post facto conjecture as to what the course of expectancies must have been if expectancies are to be considered the causes of the behavior. The fact that in a subsequent study (12) Humphreys showed that the course of verbal expectancies did roughly correspond to that of conditioned responses is treated by Hilgard (8) as confirmation of this conjecture and as evidence in favor of an expectancy principle. However, Humphreys' subsequent results show only that verbal responses are affected by partial reinforcement in much the same way as other responses. And the critical issue is that neither of these was deduced from the expectancy hypothesis; the course of expectancies was determined empirically, and the course of the other conditioned responses was explained as being due to the course of expectancies. One of the advantages of the present hypothesis is that it can predict the course of both kinds of responses without recourse to any empirical analysis.

Humphreys' (11, 13) initial rise in the extinction curve after partial reinforcement is attributed to high expectation of reinforcement after two extinction trials because there never were more than two successive nonreinforcements during training. This must mean that the subject is keeping track of (i.e., responding differentially to the after-effects of) the successions of reinforcements and nonreinforcements. To say that, after a series of events which has always been followed by reinforcement, the subject has a high expectation of reinforcement contributes nothing beyond the stimulus-response interpretation that to a differential cue that has always been followed by reinforcement, there is maximal probability of eliciting the conditioned response.

If expectancy is viewed as the basic principle for explaining all learning, then there would appear to be no reason for predicting anything different for resistance to extinction following spaced training from that following massed training. The fact that the animals in this study learned as well, if not better, with spaced as compared with massed training would indicate that the expectancies used by the expectancy principle can be set up and maintained through intervals as long as those used. Therefore, it is difficult to see why spacing of training should remove the advantage in resistance to extinction usually found with partial reinforcement. Presumably expectancies are aroused when a previously-experienced situation recurs, regardless of the time interval separating the two exposures. Distributed practice should facilitate the acquisition of an expectancy as much as it does the learning of any response, and the advantage of partial reinforcement should be increased, if anything, by the spaced practice in forming expectancies.

However, if expectancies are regarded as secondary phenomena sometimes involved in learning, then such expectancies, particularly in the form of implicit verbal responses with human subjects, would be included here as one of the after-effects of reinforcement or nonreinforcement as described in the present hypothesis. For human subjects, such verbal responses would undoubtedly be an important part of the reinforcement or extinction cues to which the response becomes conditioned. For example, human subjects who tended to verbalize what went on in Humphreys' experiments (11, 13) might, after a certain amount of practice, think to themselves the equivalents of: "Two shocks in a row, none next time," or "No shock and then shock, maybe shock again this time," or "No shock for two trials, shock for sure next time." Such verbalizations would function as after-effects and would
provide distinctive cues exactly correlated with preceding patterns of reinforcement (to the extent that they correctly described the actual events of the experiment). They would also very probably be one of the cues reinstated at the start of each trial. Furthermore, they would be differential cues that had been reinforced, respectively, 0 percent, 50 percent, and 100 percent of the time in Humphreys' procedure. It is not surprising, therefore, that strength of actual responding correlated to some extent with cues so differentially reinforced. A separate expectancy principle is not required to explain why these expectancies help produce the greater resistance to extinction found by Humphreys. The subject who has always been reinforced while thinking "No shock for two times" is more likely to give the conditioned response when he thinks "No shock for three or four times" than the subject who not only has never been reinforced while thinking this but also never thought anything like it during acquisition. Such verbalizations function exactly as the presence or absence of food taste in the mouth and the state of frustration in the case of rats.

But applying the expectancy principle to animals as if they had implicit verbal cues would very probably lead to errors of prediction. Humphreys (14) himself does not attempt to apply the expectancy principle in his study of partial reinforcement with rats. In the case of human subjects, spacing of training trials would probably have much less effect of reducing the difference between partial and 100 percent reinforcement in resistance to extinction, because verbal cues could doubtless still be re-aroused after long inter-trial intervals. Still, some effect similar to the present findings should show up since the advantage of partial reinforcement would partly depend, in human subjects, on their ability to remember, at any point during learning, what had happened on the last few trials, and since some forgetting would be expected with long intervals between trials.

Another proposal for explaining the effects of partial reinforcement is one by J. S. Brown, which has been tested by Mowrer and Jones (17). They have designated it the 'response-unit' hypothesis. Speaking of their bar-pressing response, they suggest that what is reinforced in the partial reinforcement situation is not the single bar-depression that immediately precedes food but the sequence of bar-depressions leading up to the reward. Defining the 'response' as a sequence or pattern of behavior rather than a single act, they propose that the behavior temporally more remote from the reward is reinforced by it to a diminishing extent according to the principle of the gradient of reinforcement. Accordingly, animals rewarded during training for several depressions of the bar would be expected during extinction to press the bar more times but not necessarily to make more 'responses' than animals whose 'response' during training was a single bar-depression.

From this hypothesis it would probably be predicted that spacing of training trials would make it difficult for a sequence of behavior to be reinforced as a 'unit' and would therefore cause a breakdown in the usual advantage of partial reinforcement in extinction. However, even with massed training, it would appear to be much easier to conceive of several bar-depressions as a response-unit than two or three runs down an alley separated by 15-second intervals in the home cage. In the latter case, the length of time involved and the possibility for variable intervening activity make the idea of a response-unit untenable.

Denny (4) has obtained evidence supporting a secondary reinforcement hypothesis for explaining the fact that acquisition is almost as good with partial reinforcement as with 100 percent reinforcement. He does not attempt to apply it to differences in extinction under the two conditions, which in fact his study does not show. Secondary reinforcement from the empty goal box unquestionably operated in the 50 percent groups in the present study, but there is
no reason to expect it to operate differently when training trials are spaced from when they are massed. Secondary reinforcement not only cannot explain the differential effect on resistance to extinction obtained with massed and spaced training trials, but also it cannot account for the basic difference in extinction between partial and 100 percent reinforcement with massed training trials. In order for the secondary reinforcement principle to explain the latter, it would have to be assumed that secondary reinforcement is stronger than primary reinforcement.

SUMMARY

1. Seventy-two rats were trained to run down an alley for food. Half received reinforcements on all training trials, and half randomly on 50 percent of the trials. Half of each group were trained with a 15-sec. interval between trials and half with a 15-min. interval. Each of the four training groups was divided for extinction, half being extinguished with the 15-sec. interval and half with the 15-min. interval.
2. There were no significant differences in level of performance on the last half of the acquisition trials, either between 100 percent and 50 percent reinforcement or between massed and spaced training.
3. After massed training, resistance to extinction was significantly greater for 50-percent reinforcement groups than for 100-percent reinforcement groups.
4. After spaced training, the difference in resistance to extinction between 100-percent and 50-percent reinforcement groups was not significant; it was, in fact, slightly reversed.
5. The differential effect of partial reinforcement depending on whether training was massed or spaced was found to be significant.
6. These results verify a prediction from a hypothesis utilizing stimulus-response learning concepts for explaining the effect of partial reinforcement on extinction.

(Manuscript received November 5, 1948)

REFERENCES

1. BROGDEN, W. J. The effect of frequency of reinforcement upon the level of conditioning. J. exp. Psychol., 1939, 24, 419-431.
2. BRUNSWIK, E. Probability as a determiner of rat behavior. J. exp. Psychol., 1939, 25, 175-197.
3. COLE, L. E. A comparison of the factors of practice and knowledge of experimental procedure in conditioning the eyelid response of human subjects. J. gen. Psychol., 1939, 20, 349-373.
4. DENNY, M. R. The role of secondary reinforcement in a partial reinforcement learning situation. J. exp. Psychol., 1946, 36, 373-389.
5. FINGER, F. W. The effect of varying conditions of reinforcement upon a simple running response. J. exp. Psychol., 1942, 30, 53-68.
6. FINGER, F. W. Retention and subsequent extinction of a simple running response following varying conditions of reinforcement. J. exp. Psychol., 1942, 31, 120-133.
7. GUTHRIE, E. R. Psychology of learning. New York: Harper, 1935.
8. HILGARD, E. R. Theories of learning. New York: Appleton-Century-Crofts, 1948.
9. HULL, C. L. Notes on the Humphreys extinction paradox. Unpublished symposium notes, Midwestern Psychol. Assn., 1941.
10. HULL, C. L. Principles of behavior. New York: Appleton-Century, 1943.
11. HUMPHREYS, L. G. The effect of random alternation of reinforcement on the acquisition and extinction of conditioned eyelid reactions. J. exp. Psychol., 1939, 25, 141-158.
12. HUMPHREYS, L. G. Acquisition and extinction of verbal expectations in a situation analogous to conditioning. J. exp. Psychol., 1939, 25, 294-301.
13. HUMPHREYS, L. G. Extinction of conditioned psychogalvanic responses following two conditions of reinforcement. J. exp. Psychol., 1940, 27, 71-75.
14. HUMPHREYS, L. G. The strength of a Thorndikian response as a function of the number of practice trials. J. comp. Psychol., 1943, 35, 101-110.
15. LAWRENCE, D. H., & MILLER, N. E. A positive relationship between reinforcement and resistance to extinction produced by removing a source of confusion from a technique that had produced opposite results. J. exp. Psychol., 1947, 37, 494-509.
16. MILLER, N. E., & DOLLARD, J. Social learning and imitation. New Haven: Yale Univ. Press, 1941.
17. MOWRER, O. H., & JONES, HELEN. Habit strength as a function of the pattern of reinforcement. J. exp. Psychol., 1945, 35, 293-311.
18. PAVLOV, I. P. Conditioned reflexes. (Trans. by G. V. Anrep) London: Oxford Univ. Press, 1927.
19. SHEFFIELD, VIRGINIA F. Resistance to extinction as a function of the distribution of extinction trials. J. exp. Psychol., 1950, 40. (In press)
20. SKINNER, B. F. The behavior of organisms. New York: Appleton-Century, 1938.
21. TIPPETT, L. H. C. Random sampling numbers. London: Cambridge Univ. Press, 1927.
