
JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 1982, 37, 171-181 NUMBER 2 (MARCH)

FAILURE TO PRODUCE RESPONSE VARIABILITY WITH REINFORCEMENT

BARRY SCHWARTZ
SWARTHMORE COLLEGE

Two experiments attempted to train pigeons to produce variable response sequences. In the first, naive pigeons were exposed to a procedure requiring four pecks on each of two keys in any order, with a reinforcer delivered only if a given sequence was different from the preceding one. In the second experiment, the same pigeons were exposed to this procedure after having been trained successfully to alternate between two specific response sequences. In neither case did any pigeon produce more than a few different sequences or obtain more than 50% of the possible reinforcers. Stereotyped sequences developed even though stereotypy was not reinforced. It is suggested that reinforcers have both hedonic and informative properties and that the hedonic properties are responsible for stereotyped repetition of reinforced responses, even when stereotypy is negatively related to reinforcer delivery.

Key words: complex operants, differential reinforcement, stereotypy, variability, pigeons

In the course of being trained to emit operant responses for reinforcers, animals typically develop response topographies that are both stereotyped and efficient. Although the movements that compose the rat's lever press or the pigeon's key peck may be variable and uneconomical early in training, as training proceeds, extraneous movement tends to drop out, and variability tends to diminish. By the time training is complete, the operant is usually highly stereotyped with regard to properties such as force, duration, and location (e.g., Herrnstein, 1961; Notterman & Mintz, 1965; Schwartz, 1977).

Recently, the study of the development of stereotyped response topographies has been extended to situations in which the required response is more complex than a single key peck or lever press (Schwartz, 1980, 1981a, 1981b; Vogel & Annau, 1973). In these experiments, the required operant was a sequence of responses. Pigeons were exposed to trials that began with two response keys and the top left light of a 5 by 5 matrix of lights illuminated. When they pecked one of the keys, the illuminated matrix light moved down one row, and when they pecked the other key, it moved across one column. When they had pecked each key four times, so that the bottom right matrix light was illuminated, a reinforcer was delivered. Thus, to obtain reinforcers pigeons had to peck each of the keys four times, in any order. If they pecked either key a fifth time, the trial ended without reinforcement. In all, there were 70 different sequences of responses that would result in reinforcement.

The consistent finding in these studies was that despite the wide range of sequence variability permitted by the reinforcement contingency, stereotyped sequences developed. For each pigeon, one sequence came to dominate all others, occurring on 50 to 90% of trials. Though the dominant sequence varied from pigeon to pigeon, for each pigeon it was highly stereotyped from trial to trial and session to session.

This reliable finding led Schwartz (1980, Experiment 4) to ask whether sequence stereotypy could be prevented if the reinforcement contingency required variability. Thus, pigeons were exposed to the sequence task just described, with the reinforcement contingency modified so that a given sequence was only reinforced if it differed from the sequence that had occurred in the preceding trial. Pigeons were exposed to this procedure after a lengthy period of training with the regular sequence task, which had resulted in the development of stereotyped sequences. Schwartz found that in 40 sessions of exposure to this contingency requiring sequence variability, only one of eight pigeons showed an appreciable increase in sequence variability. In the initial sessions of the procedure, the pigeons obtained reinforcers in about 32% of trials; at the end of the procedure, they obtained reinforcers in about 36% of trials. In contrast, when the sequence task contained no requirement of sequence variability, pigeons obtained reinforcers on 70 to 95% of trials.

This finding led Schwartz to speculate that perhaps reinforcement was not an effective procedure for producing variations in response topography. Perhaps reinforcement led inexorably to stereotyped responses. Schwartz showed in another experiment (Schwartz, 1980, Experiment 5) that response sequences were not completely unmodifiable. If sequences were required to begin with two left key pecks, or with a left-right alternation, pigeons modified their sequences accordingly. However, when they were simply required to vary sequences, behavior was not modified effectively.

However, it would be premature to suggest that reinforcement cannot produce sequence variability. There were two features of Schwartz's procedure that might have worked against the variability requirement. First, the pigeons were highly trained and were emitting stereotyped sequences when the procedure began. Perhaps with naive pigeons that had not yet developed stereotyped sequences, the variability contingency could effectively control behavior. Second, in the Schwartz experiment, the interval between trials was 10 sec. Thus, to master the contingency, pigeons had to bridge a 10-sec delay between what they had just done and what they would do next. Although one can imagine strategies that would effectively bridge the delay (e.g., the pigeon could spend the entire intertrial period next to the key on which the next sequence should begin), such a long intertrial period certainly does not optimize the chances for successful performance.

The present experiments were designed to explore further whether pigeons could learn to vary response sequences. Experiment 1 removed both of the obstacles that were present in the initial experiment: the pigeons were naive at the start of training, and the intertrial interval was reduced to .5 sec. Experiment 2 attempted to produce sequence variability by first explicitly establishing a repertoire that contained two distinct sequences that occurred with appreciable frequency.

(This research was supported by NSF grant BNS 78-15461 and by Swarthmore College Faculty Research Grants to the author. Reprint requests should be sent to Barry Schwartz, Department of Psychology, Swarthmore College, Swarthmore, Pennsylvania 19081.)

EXPERIMENT 1

METHOD

Subjects
Six experimentally naive White Carneaux pigeons were maintained at 80% of their free-feeding weights.

Apparatus
Four Gerbrands pigeon chambers (G7313) contained three-key pigeon intelligence panels. The keys were Gerbrands normally closed keys, requiring a force of .1 N to operate. They were spaced 7.5 cm apart, center-to-center, and were located 21 cm above the grid floor. A grain hopper was directly below the center key, 5.5 cm above the grid floor, and a pair of houselights was located in the ceiling of the chamber. The houselights were illuminated throughout experimental sessions, except during 4-sec feeder operations, when a light in the feeder was illuminated.

On the left side wall of each chamber was mounted a 5 by 5 matrix of red lights, spaced 2 cm apart. The lights were .84 cm in diameter and .04 amp (Dialco No. 507-3917-1471-60D). The top row of lights was 20 cm from the grid floor, and the right column (closest to the intelligence panel) was 4 cm from the panel.

Scheduling of experimental events, data collection, and data analysis were accomplished with a Digital Equipment Corporation PDP 8/E digital computer using interfacing and software provided by State Systems Incorporated, Kalamazoo, Michigan.

Procedure
Pretraining. The pigeons were trained to eat from the food magazine, after which they were exposed to a modified autoshaping procedure (Brown & Jenkins, 1968). Each session consisted of 50 6-sec trials, separated by a variable intertrial interval (X̄ = 40 sec). Each of three trial types was equiprobable: either the left key was illuminated with white light, or the right key was illuminated with white light, or both keys were. These three types of trials occurred in random order. After 6 sec, the keylight(s) was extinguished and the feeder operated. Key pecks were recorded but had no programmed consequence. Each pigeon was exposed to the autoshaping procedure for five full sessions after the one in which pecking began. At the end of pretraining, all pigeons were reliably pecking both keys (when illuminated).

Sequence Training Procedure. Daily sessions consisted of 50 trials, separated by an intertrial interval (ITI) of .5 sec. At the beginning of each trial, the two side keys were illuminated with white light and the top left matrix light was lit. Each peck on the left key extinguished the currently illuminated matrix light and lit the one to its right; each peck on the right key also extinguished the currently illuminated matrix light and lit the one beneath it. Four left key pecks were required to move the matrix light from extreme left to extreme right, and four right key pecks were required to move the matrix light from extreme top to extreme bottom. To obtain a reinforcer it was necessary to move the matrix light from the top left to the bottom right, that is, to peck each key four times. A fifth peck on either key terminated a trial immediately, without reinforcement. In all, there were 70 different sequences of left and right key pecks that could satisfy the reinforcement contingency.

All pigeons were placed on the sequence procedure immediately after the autoshaping pretraining described above. Generally, this pretraining was sufficient to ensure that pigeons would peck both keys on most sequence trials. If they were pecking both keys, they tended to obtain enough reinforcers in early sessions to keep them pecking until they mastered the contingency. However, one pigeon (A10) tended to peck exclusively on one key and thus obtained no reinforcers at all. For this pigeon, a special procedure was instituted. To decrease the tendency to perseverate on one key, it was exposed to a single session in which either keylight was extinguished when it had been pecked four times. After one such session, it was returned to the regular sequence procedure.

Variability Requirement
Aside from having to peck each key four times, the pigeons were required to vary their response sequences. To produce reinforcement, a correct sequence had to be different from the sequence that had occurred on the immediately preceding trial. Thus, repetition of a single sequence was never reinforced. After 100 sessions of this procedure, the variability requirement was removed for 30 sessions. It was then reintroduced for 50 sessions and removed again for 30 sessions. Thus, the present experiment differed from the one reported in Schwartz (1980) in two respects: the ITI was .5 rather than 10 sec, and the variability requirement was imposed before stereotyped sequences could develop.

RESULTS AND DISCUSSION

In previous experiments of this type, there have been three dependent variables of primary interest: the number of reinforcers obtained, the number of different sequences emitted, and the frequency of the sequence that became dominant. Figure 1 shows the number of reinforcers obtained per session by each pigeon, averaged across the first and last five sessions of each procedure. The procedure requiring sequence variability is labeled "DIF" (for different sequences) in the figure, and the standard sequence procedure is labeled "SEQ."

Fig. 1. Reinforcements per session for each pigeon averaged across the first and last five sessions of each procedure in Experiment 1. The first bar in each pair is from the first five sessions and the second is from the last five. The procedures are labeled on the X-axis. The procedure requiring variability is labeled DIF and the standard procedure is labeled SEQ.

One hundred sessions of exposure to the variability contingency did not result in mastery of the contingency. Only Pigeon A9 was obtaining more than 20 reinforcers per session by the end of training. That this represents ineffective performance in contrast to what one observes on the standard sequence task is clear from the second pair of bars for each pigeon. Within five sessions of exposure to the standard procedure, five of the pigeons were obtaining 40 or more reinforcers a session. By the end of 30 sessions, all pigeons were performing effectively.
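The trial logic of the sequence task and the "DIF" contingency described above reduces to a few lines of simulation. This is a sketch for exposition, not the experiment's control program, and the function names are invented; the rules, however, follow the Method: a trial is a run of left (L) and right (R) pecks, a fifth peck on either key ends the trial without reinforcement, and a correct sequence is reinforced only if it differs from the preceding trial's sequence.

```python
from math import comb

def trial_outcome(pecks):
    """Return (sequence emitted, whether the trial ended correctly).

    A trial ends when the matrix light reaches the bottom right
    (four 'L' and four 'R' pecks) or when either key receives a
    fifth peck.
    """
    seq = ""
    for key in pecks:
        seq += key
        if seq.count("L") == 5 or seq.count("R") == 5:
            return seq, False   # fifth peck: trial ends without reinforcement
        if seq.count("L") == 4 and seq.count("R") == 4:
            return seq, True    # light at bottom right: correct sequence
    return seq, False           # trial never completed (for completeness)

def session_reinforcers(trials):
    """Reinforcers earned in one session under the DIF contingency:
    a correct sequence is reinforced only if it differs from the
    sequence emitted on the immediately preceding trial."""
    total, prev = 0, None
    for pecks in trials:
        seq, correct = trial_outcome(pecks)
        if correct and seq != prev:
            total += 1
        prev = seq
    return total

# There are C(8, 4) = 70 orders of four L and four R pecks.
assert comb(8, 4) == 70

# Repeating one correct sequence on all 50 trials earns a single
# reinforcer (only the first trial has no predecessor to match).
assert session_reinforcers(["LLLLRRRR"] * 50) == 1

# Strictly alternating two correct sequences is reinforced every trial.
assert session_reinforcers(["LLLLRRRR", "RRRRLLLL"] * 25) == 50

# Alternating a correct sequence with an incorrect one (e.g., LLLLL)
# still earns a reinforcer on every other trial.
assert session_reinforcers(["LLLLRRRR", "LLLLL"] * 25) == 25
```

The last two assertions make concrete why even crude alternation strategies would have outperformed the stereotyped repetition the pigeons actually showed.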
When the variability contingency was reintroduced, reinforcers obtained immediately declined and increased only modestly over the course of 50 sessions. Only Pigeon B6 improved beyond the level it had reached in its first exposure to the variability contingency. When the variability requirement was again removed (last pair of bars), the pigeons again rapidly began obtaining more than 40 reinforcers per session. Thus, whether pigeons were naive (first pair of bars) or experienced (third pair of bars), the variability contingency had only modest success in increasing variability. This finding replicates and extends Schwartz (1980).

Figure 2 presents for each pigeon the number of different sequences per session. The data are taken from the same sessions as in Figure 1, and include both correct (four pecks on each key) and incorrect sequences. For all but Pigeon A10, sequence variability decreased substantially over the course of exposure to the variability requirement. By the end of that procedure, the other five pigeons were emitting 4 to 8 different sequences per session. Once this level of variability was reached, it remained, virtually unchanged, for the remainder of the experiment. For Pigeon A10, there was only a modest decline in sequence variability during prolonged exposure to the variability contingency. This decline continued over the course of the next two procedures, reaching a level comparable to that of the other pigeons. Thus, in every case, sequence variability decreased, although the reinforcement contingency required sequence variability.

Fig. 2. Number of different sequences per session for each pigeon averaged across the first and last five sessions of each procedure in Experiment 1. The first bar in each pair is from the first five sessions and the second is from the last five. The procedures are labeled on the X-axis. The procedure requiring variability is labeled DIF and the standard procedure is labeled SEQ.

Figure 3 presents the frequency of the sequence that became dominant for the same sessions as depicted in Figures 1 and 2. Each panel also indicates what each pigeon's dominant sequence was. In initial exposure to the variability contingency, each pigeon's dominant sequence occurred fewer than ten times per session. Pigeon B8 was an exception; but its dominant sequence at the start of training was RRRRR (see asterisk in the figure), indicating a strong tendency to perseverate on one key early in training. For the other pigeons, despite the fact that repetition of their correct, dominant sequence was never reinforced, its frequency increased over training. By the end of the variability procedure, dominant sequences were occurring nearly 30 times per session, averaged across all birds. That the variability contingency had some effect is clear from the next pair of bars in Figure 3. When exposed to the standard sequence procedure, the frequency of the dominant sequence increased for all pigeons but B8.

Fig. 3. Frequency of the dominant sequence per session for each pigeon averaged across the first and last five sessions of each procedure in Experiment 1. The first bar in each pair is from the first five sessions and the second is from the last five. The procedures are labeled on the X-axis. The procedure requiring variability is labeled DIF and the standard procedure is labeled SEQ. Each pigeon's dominant sequence is indicated in the appropriate panel of the figure. In addition, for Pigeons A9, B6, and B7, a second sequence is identified with an asterisk. This sequence occurred with substantial frequency in the procedure marked by the asterisk in the appropriate panel. Pigeon B8 also has a second sequence identified with an asterisk. This was its dominant sequence in the first five sessions of training, after which the other indicated sequence replaced it.
By the end of this phase of the experiment, the group mean frequency was 40. When returned to the variability contingency, the frequency of the dominant sequence decreased substantially for three of the pigeons. These three pigeons (A9, B6, and B7) satisfied the variability contingency by roughly alternating between the dominant sequence and another one. The asterisks in the sixth bars of the panels for these three pigeons indicate that the sequence identified by the asterisk increased during this procedure. Thus, Pigeon A9 intermixed LLLRRRRL sequences with the dominant LLLLRRRR sequence. Perhaps more interesting, Pigeons B6 and B7 mixed incorrect sequences (either RRRRR or LLLLL) with their dominant one. Note that this is a relatively effective strategy. If a pigeon strictly alternated LLLLRRRR with LLLLL, it would obtain 25 reinforcers per session. In contrast, if it emitted LLLLRRRR on all 50 trials, it would obtain only a single reinforcer per session.

To summarize, these data suggest that reinforcement produces sequence stereotypy in the face of a contingency that should discourage it. This effect occurs in animals that have never experienced consistent reinforcement for stereotyped sequences, and it occurs when the time between trials is short enough to minimize memory problems, to the extent that they can be minimized.

There is evidence that a contingency requiring variability has some effect on stereotypy, since there was less stereotypy when that contingency was in force than when it was not. However, it should be noted that there is greater reinforcement intermittency in the variability procedures than in the standard sequence procedures. This intermittency of reinforcement might contribute to sequence variability as much as the variability contingency does. The question raised by these data is whether any procedure can be found that will effectively prevent or eliminate sequence stereotypy.

EXPERIMENT 2

Experiment 1 demonstrated that a contingency that required variable response sequences failed to prevent stereotyped sequences from developing, even in naive pigeons. This experiment employed a different strategy, suggested by the behavior of Pigeons A9, B6, and B7 in Experiment 1. If one could establish a repertoire that included more than one dominant response sequence, perhaps pigeons would learn to alternate among the sequences in their repertoire when the variability contingency was imposed. Thus, a repertoire of two dominant sequences was shaped in the pigeons that served in Experiment 1, after which they were again exposed to the variability contingency.

METHOD

Subjects and Apparatus
The subjects and apparatus were the same as in Experiment 1. However, Pigeon B8 became ill shortly after the experiment began and was dropped from the experiment. Experiment 2 began immediately after Experiment 1 terminated.

Procedure
Most aspects of the procedure were the same as in Experiment 1. There were 50 trials per session, separated by a .5-sec ITI, reinforcement was 4-sec access to grain, trials began with the illumination of the top left matrix light, and so on. The sequence task differed from that in Experiment 1 in that an explicit attempt was made to train two distinct sequences. Each pigeon was trained to emit the sequences LLLLRRRR and RRRRLLLL.

Training involved five stages. In the first stage, a trial began with only the left or right key light illuminated, left if an LLLLRRRR sequence was required and right if an RRRRLLLL sequence was required. After the pigeon pecked the lit key four times (moving the matrix light to the appropriate place), the keylight was extinguished and the other key was illuminated. Four pecks on this other key (again moving the matrix lights appropriately) produced a reinforcer. Pecks on either key when it was not illuminated had no consequence. Thus, as long as the pigeons pecked at all, they could not help but obtain a reinforcer on every trial.

The RRRRLLLL and LLLLRRRR (hereafter RL and LR, respectively) requirements alternated irregularly from trial to trial. Which sequence was required was signaled both by the illumination of the appropriate side key and by the illumination of the center key. The center key was red on LR trials and green on RL trials. Pecks at it had no consequence. The pigeons were exposed to this procedure for 20 sessions.

In the next stage of training, trials of the type just described alternated with trials in which both side keys were lit from the beginning, and the required sequence was cued by the color of the center key. If the center key was red, an LR sequence was required, and if it was green, an RL sequence was required. No other sequence was reinforced. On these types of trials, all the sequences that could occur on the standard sequence procedure were possible but not reinforced. This stage of training lasted for 20 sessions.

The third stage of training consisted of the same two types of trials as the second stage, but only one of every four trials guided the pigeons through the required sequence by having only the appropriate side key lit at any given time. In all other trials, both keys were available and the center key signaled the required sequence.

After 20 sessions, the fourth stage of training was instituted. Here, all trials were of the type in which both side keys were lit. Thus the side-key lights were no longer used to guide pigeons through the required sequence, but the color of the center key was still used as a signal. This stage was in force for 20 sessions.

In the final stage of training, signaling of required sequences by the center-key color was discontinued. All trials were of the standard sequence type, with both side keys lit from the beginning. What was required was alternation between LR and RL sequences. If an LR sequence was reinforced on a given trial, the next reinforcement depended upon the occurrence of an RL sequence. All other sequences terminated trials without reinforcement. This requirement remained in effect until an RL sequence occurred, after which the next reinforcement depended upon an LR sequence. By strictly alternating between the two sequences, a pigeon could obtain a reinforcer on every trial. This procedure was in effect for 50 sessions.

After this sequence of procedures requiring two specific sequences, the pigeons were exposed again to the variability contingency of Experiment 1 for 50 sessions. It should be noted that neither procedure allowed reinforcement for repetition. They differed in that the variability contingency simply required different sequences to occur from trial to trial, whereas the alternation contingency required specific sequences on each trial.

RESULTS AND DISCUSSION

Since the focus of this experiment was whether, if pigeons had a two-sequence repertoire, it would allow them to master the variability contingency, no data will be presented on the acquisition of LR and RL sequences through the various stages of the training process. Instead, Figure 4 presents data from the final alternation procedure in which no exteroceptive cues to the required sequence were available. Figure 4 presents, for each subject, the mean reinforcements per session, LR sequences per session, and RL sequences per session, in the first and last five sessions of the procedure.

Fig. 4. Reinforcements per session, LLLLRRRR (LR) sequences per session and RRRRLLLL (RL) sequences per session for each pigeon and for the group in the first and last five sessions of the alternation procedure.

By the end of training, the pigeons were obtaining 26 to 36 reinforcers per session (the group mean was 31). At the start of this stage, LR sequences were more frequent than RL sequences for Pigeons A9 and A10, and the reverse was true for Pigeons B6 and B7. By the end of 50 sessions, both sequences were occurring with substantial frequency for each pigeon. About 80% of all sequences were either LR or RL. For no pigeon did either sequence occur less than 15 times per session. Thus, it seems reasonable to conclude that the pigeons met the requirements of this task with substantial success.
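The alternation contingency of the final training stage can be restated as a small state machine. This is a sketch with invented names, not the experiment's control code, and how the first trial's requirement is seeded is not specified in the text; here either sequence may earn the first reinforcer, after which the requirement flips only when the required sequence occurs.

```python
LR, RL = "LLLLRRRR", "RRRRLLLL"

def run_alternation_session(sequences):
    """Count reinforcers in one session of the alternation contingency.

    Only LR or RL is ever reinforced. After a reinforced LR, the next
    reinforcer requires RL, and vice versa; the requirement remains in
    effect until the required sequence occurs.
    """
    required = None          # assumption: first reinforcer may be earned by either sequence
    reinforcers = 0
    for seq in sequences:
        if seq in (LR, RL) and (required is None or seq == required):
            reinforcers += 1
            required = RL if seq == LR else LR   # requirement flips only on success
        # any other sequence ends the trial without reinforcement
    return reinforcers

# Strict alternation earns a reinforcer on every one of the 50 trials.
assert run_alternation_session([LR, RL] * 25) == 50

# Repeating one sequence earns only the first reinforcer.
assert run_alternation_session([LR] * 50) == 1

# Incorrect sequences neither produce food nor flip the requirement.
assert run_alternation_session([LR, "LLLLL", RL, RL]) == 2
```

The contrast with the DIF contingency is just the condition tested: DIF compares the current sequence with whatever occurred on the previous trial, whereas the alternation contingency compares it with a specific required sequence.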
That they did so indicates that failure on the task requiring variability could not be attributed to difficulties pigeons might have remembering what they did on the previous trials. For on this alternation task, there were no exteroceptive cues available at the time of a trial to indicate what sequence was required, and the pigeons performed accurately. Clearly some aspect of what had occurred on the preceding trial must have been available to the pigeon as it began its next sequence.

Having established a two-sequence repertoire with the alternation task, the question of interest was whether this repertoire would facilitate performance under the variability contingency. The relevant data are presented in Figure 5. Figure 5 presents, for each pigeon, the mean reinforcements, and LR and RL sequences per session over the first and last five sessions of exposure to the variability contingency.

Fig. 5. Reinforcements per session, LLLLRRRR (LR) sequences per session and RRRRLLLL (RL) sequences per session for each pigeon and for the group in the first and last five sessions of the variability procedure.

Over the course of training, reinforcements per session decreased for every pigeon, quite substantially for Pigeons A10 and B5. Though performance was better at the end of this procedure than in past attempts to train variability (see Figure 1), at the end of training the pigeons were obtaining less than 50% of the possible reinforcers. What makes this decrease in accuracy surprising is that the variability contingency would seem to represent no significant change in procedure from the alternation contingency. Repetitions of a sequence would not be reinforced in either case, and alternations between the two sequences would be reinforced in both cases. Nevertheless, over the course of training, one of the sequences increased in frequency and the other decreased for every pigeon but B5 and B6. For every pigeon, performance at the end of the variability procedure was nearly identical to performance at the beginning of the alternation procedure (compare the fourth and sixth bar of each panel in Figure 5 to the third and fifth bar of each panel in Figure 4). Thus, in the absence of an explicit contingency on alternation, sequences tended to drift back to the patterns observed before the alternation contingency was introduced. Establishing a repertoire that included two dominant sequences did not substantially facilitate performance under the variability contingency.

GENERAL DISCUSSION

The results of these experiments confirm Schwartz's (1980) earlier conclusion that reinforcement of variable response sequences in pigeons does not succeed. Sequence variability is only slightly greater under the variability contingency than in its absence, and variability decreases over exposure to the variability contingency. In the 1980 experiment, a number of possible explanations for the failure were proposed. One explanation was that once a stereotyped sequence has developed, it is resistant to modification, so that a variability contingency must overcome previous training. The present experiments ruled out this possibility by showing that stereotypy developed in the face of required variability even in naive pigeons. A second possible explanation appealed to the pigeon's memory limitations. To meet the variability contingency, the pigeon must remember at least a fragment of what occurred in the preceding trial. The present experiments ruled out this possibility in two ways. First, pigeons failed to master the contingency even when a very short (.5-sec) ITI was in effect. Second, with the very same ITI, pigeons mastered the alternation task, which would seem to make the same demands on memory as the variability task.

A third possible explanation, not ruled out by the present data, is that the variability task is simply too difficult for pigeons. But this cannot constitute an explanation without some indication of what makes it difficult. Perhaps the task is difficult because it requires that pigeons have a concept of "different." This may be too abstract a concept for pigeons. This is implausible for a few reasons. First, pigeons are able to perform effectively on tasks, like oddity discrimination, that require discriminative responding based upon the concept "different stimulus." Second, even if "different" is too difficult for pigeons, they could perform effectively by just alternating between two specific sequences. Experiment 2 built these sequences into their repertoires. Nevertheless, over the course of exposure to the variability contingency, alternation deteriorated, with one of the two sequences becoming more frequent than the other.

Although in the present experiment a variability requirement failed to promote variable response sequences, there is evidence in the literature that trained variability is at least sometimes possible. First, for at least some species (rats), in at least some situations (spatial learning), behavioral variability is the norm, and stereotypy is difficult even to train (e.g., Olton, 1979; Olton & Samuelson, 1976). Second, even for pigeons pecking keys, there may be some properties of responding for which variability can be trained. Blough (1966), for example, was able to produce variability in interresponse times. Although we have evidence that, on the sequence task, interresponse times are highly stereotyped (Schwartz, 1981, Note 1), it is possible that if reinforcement were made to depend upon variable interresponse times, such variability would occur. Third, Pryor, Haag, and O'Reilly (1969) successfully trained variable response patterns in porpoises. However, even if reinforcement induces stereotypy only in some circumstances, it is appropriate to ask why stereotypy should occur at all. What does stereotypy gain the animal? One possibility is that stereotypy gains efficiency. The animal comes to produce responses that maximize reinforcement and minimize effort. Thus, the rat may begin training with highly variable and inefficient lever presses and end with highly stereotyped, smooth, and efficient ones. Similarly, pigeons on the sequence task may come to develop sequences that minimize effort. Support for this possibility is that for most pigeons, the sequence that becomes dominant is one that involves only one switch

putatively automatized) simultaneously (e.g., LaBerge & Samuels, 1974). If performance of the nonautomatized task is not affected by concurrent performance of the automatized one, one concludes that automatization spares limited processing resources. Schwartz (Note 2) has done an analogous experiment with pigeons in the sequence situation. He found that pigeons could perform the sequence task simultaneously with either a matching-to-sample or a conditional-discrimination task. For example, in the matching task, on each sequence trial both response keys were either red or green. After the pigeon performed a correct sequence, the keys were darkened for a moment, after which one was red and the other green. To obtain the reinforcer, the pigeon had to peck the key whose color matched the color of both keys during the sequence portion of the trial. Thus, the pigeons had to process key color while performing the sequence task. They were able to perform both sequence and matching tasks extremely accurately. However, when delays were introduced in the concurrent task (for example, three seconds with the keys dark intervened between the end of the sequence and the illumination of the keys with red and green for the matching choice), sequence performance deteriorated. This deterioration did not occur when correct sequences produced delayed reinforcement with no concurrent required task. Thus, even though response sequences were stereotyped, they still required cognitive "effort" for successful execution. Thus, it is not obvious that efficiency, whether behavioral or cognitive, fully explains stereotypy.

Some insight into the source of stereotypy may be gained by carefully examining what effects reinforcement is supposed to have. According to the law of effect, reinforcement in-
between keys, presumably the least effortful creases the future probability of responses that
sequence. However, an argument against effi- have been followed by reinforcers. As Skinner
ciency is that the stereotyped sequences de- (1935) pointed out, reinforcement strengthens
velop even in the face of the variability contin- (increases in probability) an operant, a class
gency, where they are decidedly inefficient. of responses. What defines that class must be
One could make a somewhat different kind some property on which reinforcement de-
of efficiency argument, based upon cognitive pends (see Schick, 1971). Suppose we wanted to
rather than behavioral economy. It is possible use reinforcement to increase behavioral vari-
that stereotyped sequences are automatized so ability. What would define the operant class
that they do not require active attention for on which reinforcement depends? What objec-
execution. Tests of this idea with human sub- tive property of responses would unite them
jects have often assessed automaticity by hav- into a class? It is clear that there is no such
ing people engage in two tasks (one of them property. We could create such a property by,
RESPONSE VARIABILITY AND REINFORCEMENT 179
for example, requiring variation among a fixed set of responses (as in the alternation task in Experiment 2), but then what we would obtain is strengthening of just those responses that had the specified property. Thus, while it is clear that reinforcement might create a varied, but definite and circumscribed, response repertoire, it is not clear that it could increase variability per se. Indeed, if reinforcement were demonstrably successful in increasing variability, it would be hard to reconcile this with the technical meaning of the law of effect. This is not pedantry. The significance of understanding reinforcement as affecting response classes is that it solves the teleological problem of seeming to have cause (reinforcement) follow effect (behavior). If reinforcement did not strengthen a class of responses, it is unclear how it could intelligibly be said to have any effect at all.

If it does not make logical sense to suggest that reinforcement can increase variability, what is to be said about occasional demonstrations that reinforcement can increase novel behavior (e.g., Goetz & Baer, 1973; Pryor, Haag, & O'Reilly, 1969)? For example, when social reinforcement was made contingent on novel block designs in young children, the frequency of novel designs increased (Goetz & Baer, 1973). These effects may be understood by distinguishing two properties that the consequences of responding called "reinforcers" may have. The first property is hedonic: organisms seem to want reinforcers. The second property is informational: reinforcer delivery tells an organism that it has just done something properly.

From the definition of a reinforcer as a consequence of responding that strengthens a class of behavior, it may follow that reinforcers must inevitably be both informative and hedonic. Since reinforcers are by definition response-dependent, they must provide feedback that responding was correct. And since, by definition, they increase the probability of responses that produce them, they must be hedonic. A consequence of responding that did not produce this increase would be judged not to be a reinforcer at all.

However, from the fact that reinforcers are by definition both hedonic and informative, it does not follow that all consequences of responding that control behavior are both hedonic and informative. When one is taking a golf or piano lesson and hears "good" from the instructor, the "good" may control behavior through the information it provides and not because it is hedonic. The instructor's "good," as a purely informative consequence of a golf swing, may influence the form that a swing takes if it occurs, but it may not have any influence on its probability of occurrence. If the probability of playing golf increases, it may result from the fact that playing golf is more fun if one is playing well. On this analysis, the instructor's "good" controls responding by influencing response form. But it is not a reinforcer, because it does not, by itself, increase response probability. One might want to suggest that "good" is a reinforcer, increasing the probability of effective golf swings. On this view, the reinforcer would be functioning both to redefine and to strengthen the response class. However, this analysis is implausible, for if "good" were a reinforcer we would expect people to favor instructors who said "good" at a high indiscriminate rate, no matter what the swing looked like. But presumably if one encountered such an instructor, one would change instructors rather than increase the probability of playing golf. The suggestion here is not that playing golf is not reinforced, or that successful golf shots are not reinforcing, for they may be. It is only that "good" from an instructor may control behavior without reinforcing it.

If purely informative consequences of responding need not increase response probability, it is not hard to see how they could increase response variability. They would serve, as instructions do, to define the appropriate range of activities, that is, to establish a goal. Then whether or not that goal was actually achieved (appropriate responses increased in frequency) would presumably depend upon other factors. Since informative response consequences do not strengthen, they do not depend for logical coherence on the existence of an objectively defined response class.

It might seem that if one is trying to increase behavioral variability, there must be some objectively defined physical properties of behavior on which one must depend before delivering the informative consequences. At the very least, some range of variability must be specifiable. However, this is not the case. One could set out to produce "creative" block designs without having any idea in advance what a creative design will look like. All one must be able to do is recognize, after the fact, that a given design differs from previous ones. The range of variability will likely differ in extent and detail from subject to subject. Obviously, if one is attempting to increase variability in an automated experiment, say with animal subjects, criteria for reinforcement must be established in advance, as in the present experiments. But it should be recognized that the need for a priori criteria is a feature of automated methods and not of the shaping process itself. For example, in the Pryor, Haag, and O'Reilly (1969) experiment that increased novel responses in porpoises, the experimenters had virtually no a priori criteria. The animal's behavior established the experimenters' criteria.

This distinction between hedonic and informative properties of response consequences is not one that has been made systematically in the literature, though it has appeared occasionally. For example, Goetz and Baer (1973) pointed out that their social "reinforcement" of novel block designs might have depended on either or both of these properties. In a different context, the literature on second-order schedules of brief-stimulus presentation is concerned in part with distinguishing the discriminative and conditioned-reinforcing effects of these stimuli (see Gollub, 1977, pp. 299-308, for a review). It is a distinction that deserves attention, for it may be of practical as well as theoretical importance. Suppose an educator is interested in using reinforcement to promote novel or creative behavior. It might be that some response consequences will be effective at promoting novelty and others will not, depending upon whether they do or do not have hedonic (reinforcing) properties. A good deal of the controversy that surrounds the use of reinforcers in education could be resolved by means of this distinction. Consider, for example, the so-called "overjustification effect," the demonstration that reinforcement may decrease the baseline rate of occurrence of activities that are "intrinsically motivated," i.e., have high baseline rates (e.g., Deci, 1975; Feingold & Mahoney, 1975; Lepper & Greene, 1978; Lepper, Greene, & Nisbett, 1973). It is possible that reinforcers, since they produce stereotypy, undermine one of the characteristics of an activity (that it permits novelty) that makes it reinforcing. On the other hand, informative but nonreinforcing response consequences might be used effectively to refine or shape the activity without encouraging stereotyped repetition. And if they did not encourage repetition, they would presumably not result in lowered baseline rates of occurrence.

The general significance of this distinction between hedonic and informative aspects of operant consequences comes clear in Horton's (1967) discussion of the similarities and differences between traditional African and scientific Western patterns of thinking. Horton suggests that the hallmark of Western scientific thought is that the practical and theoretical are kept distinct, whereas in traditional African thought they are inextricably connected. As a result, in science one can construct or attack hypotheses, and attempt to confirm or falsify them, without regard for any consequence but truth. In contrast, tests of hypotheses in African thought always have practical consequences. If, for example, an agronomist stumbled onto a new method of cultivation and improved crop yield as a result, the agronomist would attempt to establish a causal relation by withholding the new method and looking for decreased yield. For the African farmer, withholding the new method might mean starvation. Thus, the farmer might continue to repeat the new method that has worked (been reinforced) in the past, and perhaps never discover whether the relation between the new method and crop yield was spurious or not (see Schwartz, in press).

Consequences that have hedonic properties conflate the practical and the theoretical. If one wants the reinforcer, one repeats what has worked in the past. Stereotypy is the result. A consequence that is just informative need not conflate the theoretical and the practical. What it does is provide information about what is correct (true). Whether or not one continues to engage in the behavior that produced the informative consequence will presumably depend on other (hedonic, practical?) factors that are separable from the informative consequence per se.

If this distinction between informative and hedonic properties of consequences makes it possible to speak with logical coherence of consequences producing novel behavior, and if it helps establish procedures for the judicious use of consequences (of both types) in education, it does so at a price. For it is not clear
that the idea of informative but nonreinforcing consequences has much to do with the central ideas of behavior analysis over its history. To the extent that this distinction is important, it may indicate significant incompleteness in the traditional concepts of behavior analysis.

REFERENCE NOTES

1. Schwartz, B. Interval and ratio reinforcement of a complex, sequential operant in pigeons. Manuscript submitted for publication.
2. Schwartz, B. Stereotypy without automaticity in pigeons. Manuscript in preparation.

REFERENCES

Blough, D. S. The reinforcement of least-frequent interresponse times. Journal of the Experimental Analysis of Behavior, 1966, 9, 581-591.
Brown, P. L., & Jenkins, H. M. Auto-shaping of the pigeon's key-peck. Journal of the Experimental Analysis of Behavior, 1968, 11, 1-8.
Deci, E. L. Intrinsic motivation. New York: Plenum, 1975.
Feingold, B. D., & Mahoney, M. J. Reinforcement effects on intrinsic interest: Undermining the overjustification hypothesis. Behavior Therapy, 1975, 6, 367-377.
Goetz, E. M., & Baer, D. M. Social control of form diversity and the emergence of new forms in children's blockbuilding. Journal of Applied Behavior Analysis, 1973, 6, 209-217.
Gollub, L. Conditioned reinforcement: Schedule effects. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior. Englewood Cliffs, N.J.: Prentice-Hall, 1977.
Herrnstein, R. J. Stereotypy and intermittent reinforcement. Science, 1961, 133, 2067-2069.
Horton, R. African traditional thought and Western science. Africa, 1967, 37, 50-71 & 155-187.
LaBerge, D., & Samuels, S. J. Toward a theory of automatic information processing in reading. Cognitive Psychology, 1974, 6, 293-323.
Lepper, M. R., & Greene, D. (Eds.). The hidden costs of reward. Hillsdale, N.J.: Erlbaum, 1978.
Lepper, M. R., Greene, D., & Nisbett, R. E. Undermining children's intrinsic interest with extrinsic reward: A test of the "overjustification" hypothesis. Journal of Personality and Social Psychology, 1973, 28, 129-137.
Notterman, J. M., & Mintz, D. E. Dynamics of response. New York: Wiley, 1965.
Olton, D. S. Mazes, maps and memory. American Psychologist, 1979, 34, 583-596.
Olton, D., & Samuelson, R. J. Remembrance of places passed: Spatial memory in rats. Journal of Experimental Psychology: Animal Behavior Processes, 1976, 2, 97-116.
Pryor, K. W., Haag, R., & O'Reilly, J. The creative porpoise: Training for novel behavior. Journal of the Experimental Analysis of Behavior, 1969, 12, 653-661.
Schick, K. Operants. Journal of the Experimental Analysis of Behavior, 1971, 15, 413-423.
Schwartz, B. Studies of operant and reflexive key pecks in the pigeon. Journal of the Experimental Analysis of Behavior, 1977, 27, 301-313.
Schwartz, B. Development of complex, stereotyped behavior in pigeons. Journal of the Experimental Analysis of Behavior, 1980, 33, 153-166.
Schwartz, B. Reinforcement creates behavioral units. Behaviour Analysis Letters, 1981, 1, 33-41. (a)
Schwartz, B. Control of complex, sequential operants by systematic visual information in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 1981, 7, 31-44. (b)
Schwartz, B. Reinforcement-induced behavioral stereotypy: How not to teach people to discover rules. Journal of Experimental Psychology: General, 1982, 111, in press.
Skinner, B. F. The generic nature of the concepts of stimulus and response. Journal of General Psychology, 1935, 12, 40-65.
Vogel, R., & Annau, Z. An operant discrimination task allowing variability of reinforced response patterning. Journal of the Experimental Analysis of Behavior, 1973, 20, 1-6.

Received June 19, 1981
Final acceptance October 15, 1981
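The variability contingency at issue in these experiments can be stated precisely. The following is a minimal sketch (in Python; the function names and the L/R coding of pecks are illustrative assumptions, not part of the original apparatus) of the Experiment 1 rule: a trial's eight pecks are reinforced only if each key was pecked exactly four times and the sequence differs from the immediately preceding one.

```python
from itertools import product

def eligible(seq):
    # Each of the two keys (coded "L" and "R" here) must be
    # pecked exactly four times in the eight-peck trial.
    return len(seq) == 8 and seq.count("L") == 4 and seq.count("R") == 4

def reinforced(seq, prev_seq):
    # The variability contingency: an eligible sequence is
    # reinforced only if it differs from the preceding sequence.
    return eligible(seq) and seq != prev_seq

# The 70 eligible sequences are the 8!/(4!4!) orderings of
# four left and four right pecks.
all_eligible = [s for s in product("LR", repeat=8) if eligible(s)]
print(len(all_eligible))

# A stereotyped bird repeating one sequence forfeits every
# reinforcer after the first:
seq = tuple("LLLLRRRR")
print(reinforced(seq, prev_seq=None))  # first occurrence
print(reinforced(seq, prev_seq=seq))   # stereotyped repetition
```

Under this rule, perfect alternation between any two eligible sequences (the repertoire trained before Experiment 2) would collect every available reinforcer, which is why the observed drift toward a single dominant sequence is so striking.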
