Journal of the Experimental Analysis of Behavior 2019, 1–20

Resurgence as Choice in Context: Treatment duration and on/off alternative reinforcement
Timothy A. Shahan, Kaitlyn O. Browning, and Rusty W. Nall
Utah State University

Resurgence as Choice (RaC) is a quantitative theory suggesting that an increase in an extinguished target behavior with subsequent extinction of an alternative behavior (i.e., resurgence) is governed by the same processes as choice more generally. We present data from an experiment with rats examining a range of treatment durations with alternative reinforcement plus extinction and demonstrate that increases in treatment duration produce small but reliable decreases in resurgence. Although RaC predicted the relation between target responding and treatment duration, the model failed in other respects. First, contrary to predictions, the present experiment also replicated previous findings that exposure to cycling on/off alternative reinforcement reduces resurgence. Second, RaC did a poor job simultaneously accounting for target and alternative behaviors across conditions. We present a revised model incorporating a role for more local signaling effects of reinforcer deliveries or their absence on response allocation. Such signaling effects are suggested to impact response allocation above and beyond the values of the target and alternative behaviors as longer-term repositories of experience. The new model provides an excellent account of the data and can be viewed as an integration of RaC and a quantitative approximation of some aspects of Context Theory.
Key words: resurgence, matching law, discrimination, operant behavior, lever pressing, rats

This work was funded by grant R01HD093734 (TAS) from the Eunice K. Shriver National Institute of Child Health and Human Development. The authors thank Wayne Fisher and Brian Greer for previous discussions and Paul Cunningham and Tony Nist for helping to conduct the experiment.
Address correspondence to: Timothy A. Shahan, 2810 Old Main Hill, Utah State University, Logan, UT 84321-2810. Email: tim.shahan@usu.edu
doi: 10.1002/jeab.563
© 2019 Society for the Experimental Analysis of Behavior

Resurgence is an increase in a previously suppressed behavior resulting from a relative worsening in conditions for a more recently reinforced behavior (Epstein, 1985; Lattal & Wacker, 2015; Shahan & Craig, 2017). Theoretically, resurgence is important because it provides a means to explicitly study the processes governing how historically and more recently effective behaviors are allocated in an environment characterized by shifting contingencies over time. As such, a better understanding of the phenomenon might also provide important insights into problem solving and creativity (e.g., Epstein, 1985; Shahan & Chase, 2002). Clinically, resurgence is important because such recurrence with changing consequences appears to be a source of relapse to previously treated undesirable behavior. For example, in applications of differential-reinforcement-of-alternative-behavior (i.e., DRA), an undesirable target behavior is typically placed on extinction while an appropriate alternative behavior is reinforced (see Petscher, Rey, & Bailey, 2009; Tiger, Hanley, & Bruzek, 2008). Although such treatments are highly effective, elimination or reduction of reinforcement for the alternative behavior can produce increases in the previously suppressed problem behavior (see Briggs, Fisher, Greer, & Kimball, 2018; Volkert, Lerman, Call, & Trosclair-Lasserre, 2009; Wacker et al., 2011). Similarly, alternative-reinforcement-based interventions for substance abuse are highly effective (see Higgins, Heil, & Lussier, 2004; Prendergast, Podus, Finney, Greenwell, & Roll, 2006), but relapse to drug seeking is common when treatment ends and alternative reinforcers are suspended (e.g., Silverman et al., 1998; Silverman, Chutuape, Bigelow, & Stitzer, 1999; see Podlesnik, Jiminez-Gomez, & Shahan, 2006; Quick, Pyszczynski, Colston, & Shahan, 2011).

In laboratory examinations of resurgence, a three-phase procedure is typically used. In Phase 1, a target behavior (e.g., left lever press) is reinforced on some schedule of reinforcement. In Phase 2, the target behavior is typically placed on extinction and an alternative behavior (e.g., right lever press) is reinforced. In Phase 3, the alternative behavior is also placed on extinction, and as
a result, the rate of the target behavior increases (i.e., resurgence occurs).

Resurgence as Choice (RaC) theory suggests that resurgence results from the same basic processes governing choice more generally (Shahan & Craig, 2017). RaC is an extension of the concatenated matching law (Baum & Rachlin, 1969) and suggests that the rate of target behavior is a function of the relative values of the histories of consequences produced by the target and alternative behaviors over time. Specifically, RaC suggests that the absolute response rate of target behavior is given by,

\[ B_T = \frac{k V_T}{V_T + \frac{V_{Alt}}{b} + \frac{1}{A}}, \qquad A = a\,(V_T + V_{Alt}) \tag{1} \]

where $B_T$ is the rate of the target behavior, $V_T$ and $V_{Alt}$ are the current values of the histories of consequences provided by the target and alternative behaviors, $k$ is a parameter reflecting the asymptotic rate of $B_T$, the parameter $b$ reflects any unaccounted for bias related to other factors (e.g., topographically different responses, differences in effort) for one option or the other ($b > 1$ = bias for the target; $b < 1$ = bias for the alternative), and $A$ represents the invigorating (i.e., arousing) effects of the current values of the two options (the parameter $a$ represents how much of an impact the current values of the options have on invigoration and is likely reflective of current motivational state). Simply re-expressing Equation 1 in terms of the relative value of the alternative behavior provides a similar equation for absolute response rate of the alternative behavior (i.e., $B_{Alt}$),

\[ B_{Alt} = \frac{k \frac{V_{Alt}}{b}}{V_T + \frac{V_{Alt}}{b} + \frac{1}{A}} \tag{2} \]

where all terms are as in Equation 1.

RaC supplements the concatenated matching law by providing a formal means to calculate the values of the histories of changing consequences produced by the target and alternative behaviors over time (i.e., $V_T$ and $V_{Alt}$), even when those histories include extinction of one or both options. Specifically, RaC uses a modified temporal weighting rule (e.g., Devenport & Devenport, 1994) to calculate $V_T$ and $V_{Alt}$. This weighting rule provides a series of weightings to be applied to experiences (e.g., reinforcement rates) in the past,

\[ w_x = \frac{1/t_x^{c}}{\sum_{i=1}^{n} 1/t_i^{c}} \tag{3} \]

where $w_x$ is the weighting for a particular session in the past. The numerator reflects the recency of an experience, where $t_x$ is the time (number of sessions plus 1) between that experience and the session of interest. The denominator is the sum of the recencies for all sessions prior to and including the session of interest. The exponent $c$ reflects how quickly the impact of experiences in past sessions decreases, with lower values of $c$ resulting in less relative weight for more recent sessions. RaC assumes that organisms should be influenced more by recent experiences when the reinforcement rate is high and to be influenced by experiences over a longer period of time when reinforcement rate is low (Killeen, 1981), formalized as,

\[ c = \lambda r + 1 \tag{4} \]

where $r$ is the running average rate of reinforcement provided by a particular response across all sessions under consideration, and $\lambda$ is a free parameter representing sensitivity to running reinforcement rate. Equation 3 generates weighting functions that are hyperbolic as a function of session, with more recent sessions receiving higher weights than more temporally distant sessions (see Shahan & Craig, 2017 for discussion).

In order to calculate values of the target (i.e., $V_T$) and alternative (i.e., $V_{Alt}$) options across sessions to be used in Equations 1 and 2, the reinforcement rates (i.e., $R_x$) experienced across all sessions for each option are multiplied by the weightings for those sessions provided by Equation 3, and those weighted reinforcement rates are summed across the sequence of sessions under consideration. Formally, that is:

\[ V_T = \sum_x w_x R_{xT}, \qquad V_{Alt} = \sum_x w_x R_{xAlt} \tag{5} \]

where $R_{xT}$ and $R_{xAlt}$ are the reinforcement rates (in reinforcers/hr) for the target and
alternative behaviors experienced across the sequence of sessions. When reinforcement rates are constant and nonzero across time, Equations 3-5 generate value functions for the target and alternative options that correspond to the veridical reinforcement rates. However, when an option is placed on extinction and its true reinforcement rate is zero, the hyperbolic weighting function results in a value function that decreases from the previously arranged reinforcement rate quickly at first, and more slowly as time in extinction progresses. As a result, in the typical three-phase resurgence procedure a target behavior that has been on extinction for some time reaches a low, but relatively constant value. When the alternative behavior is also placed on extinction, the initial precipitous drop in its value results in a relative (although not absolute) increase in the value of the target behavior in Equation 1 (i.e., the denominator decreases with the large drop in $V_{Alt}$), and as a result $B_T$ increases (see Shahan & Craig, 2017 for examples). This increase in the target behavior is what has been called resurgence, and according to RaC it is the result of the shifting relative values of the target and alternative options over time. That is, resurgence is simply a natural outcome of the Matching Law.

RaC as formalized in Equations 1-5 has provided a reasonably good quantitative account of a wide range of findings in the resurgence literature, including a number of findings where the only other quantitative theory of resurgence (i.e., Behavioral Momentum Theory; Shahan & Sweeney, 2011) has failed (see Craig & Shahan, 2016; Shahan & Craig, 2017; Nevin et al., 2017, for reviews). Further, RaC makes novel predictions relevant for both basic theoretical and clinical domains (see Greer & Shahan, 2019), and it provides a way to integrate the phenomenon of resurgence into the wider conceptual and quantitative framework of matching theory.

The purpose of this paper is threefold. First, RaC makes specific quantitative predictions about the effects of the duration of exposure to extinction+alternative reinforcement (i.e., treatment duration) in Phase 2 on subsequent resurgence. The data available on this issue are either complicated by interpretive difficulties or examine a small number of treatment durations. Thus, novel data from an experiment with rats examining a wide range of treatment durations on subsequent resurgence will be presented. Second, RaC appears to struggle to account for the effects of alternating exposures to sessions of alternative reinforcement and extinction (hereafter “on/off” alternative reinforcement) during Phase 2. Although exposure to such on/off alternative reinforcement seems to reduce resurgence, there are some inconsistencies in the literature, as detailed below. Thus, the experiment also included such a condition allowing comparison with conditions testing for resurgence after a wide range of constantly “on” alternative reinforcement. Third, formal fitting of RaC to the obtained data reveals the nature of some of the limitations of the model with respect to treatment duration, but especially its failure with respect to on/off alternative reinforcement. Thus, in order to remedy these inadequacies, we present an augmented version of the model that might be considered a hybrid of RaC and a quantitative approximation to Bouton’s Context Theory (e.g., Bouton, Winterbauer, & Todd, 2012; Trask, Schepers, & Bouton, 2015; Winterbauer & Bouton, 2010). In what follows, we will address these points in order.

Treatment Duration and RaC

The effects of treatment duration on resurgence could be of particular clinical importance if longer exposure to treatment might reduce subsequent resurgence of problem behavior. Theoretically, the effects of treatment duration are a fundamental aspect of resurgence, and any viable theory must be able to account for this variable. The two studies examining the widest range of treatment durations (both with rats) have generated somewhat mixed results. Leitenberg, Rawson, and Mulick (1975) found that 27 sessions of treatment in Phase 2 reduced resurgence compared to 3 and 9 sessions. However, Winterbauer, Lucke, and Bouton (2013) found that 4, 12, and 36 sessions led to statistically equivalent magnitudes of resurgence, although the four-session group showed somewhat numerically higher resurgence. Unfortunately, it appears that both studies confounded rates of alternative reinforcement with duration of Phase 2 by using ratio schedules of reinforcement for the alternative behavior (i.e., alternative response rates, and thus reinforcement rates, increased with training duration). Nevertheless, two additional experiments from our laboratory without this confound examined target responding
previously maintained by alcohol or cocaine reinforcement and have similarly obtained no statistical difference in resurgence with 5 and 20 sessions of Phase 2 (Nall, Craig, Browning, & Shahan, 2018). As in Winterbauer et al. (2013), rats in the alcohol experiment (but not in the cocaine experiment) showed numerically, but not statistically, greater resurgence in the 5-session compared to the 20-session group. Thus, as a whole, the existing data with rats suggest that longer treatment duration has relatively little impact on resurgence, although there may be some small effect that is difficult to detect statistically.

Consistent with the data described above, RaC predicts that longer treatment durations have only relatively small effects on resurgence. Figure 1 shows the quantitative predictions of RaC across treatment durations of 3, 7, 15, 23, and 31 sessions (i.e., testing for resurgence on days 4, 8, 16, 24, and 32). The simulation assumes that the target behavior was reinforced on a variable-interval (VI) 30-s schedule in Phase 1 for 30 sessions. In Phase 2, target behavior was placed on extinction and the alternative behavior was reinforced on a VI 10-s schedule for the appropriate number of sessions. In Phase 3, both target and alternative behaviors were extinguished. The simulation suggests that RaC predicts that response rates in the Phase-3 test decrease somewhat with increases in treatment duration. However, this predicted decrease might be too small to detect statistically, or to be of clinical significance.

Although longer treatment durations are predicted to have relatively little effect on resurgence, RaC predicts the specific form of the function relating response rates in Phase 3 to treatment duration. This predicted function is shown in Figure 2. Note that both axes are logarithmic. The simulation depicts predicted target response rates during the first session of resurgence testing occurring in sessions 4, 8, 16, 24, and 32. Target response rates do indeed decrease, but the decreases across the range are quite small. The fact that the function is linear in these log–log coordinates means that RaC predicts that resurgence in the first session of Phase 3 decreases as a negative power function of treatment duration (i.e., $y = ax^{-b}$). Given both the potential clinical and theoretical importance of the effects of treatment duration on resurgence, the experiment described below was designed in part to test the prediction in Figure 2.

On/Off Alternative Reinforcement

Two additional experiments have often been cited to suggest that increases in treatment duration produce more meaningful reductions in resurgence (Sweeney & Shahan, 2013; Wacker et al., 2011). The Wacker et al. (2011) experiment examined the effects of increasing durations of exposure to DRA on the problem behavior of children with

[Figure 1. Panel title: RaC Predictions. x-axis: Sessions; y-axis: Target Rate (resp/min); series: Day4, Day8, Day16, Day24, Day32; parameters: λ = 0.006, k = 45, a = 0.0005.]
Fig. 1. Predicted target response rates across a range of Phase 2 treatment durations during which the target behavior (previously reinforced on a VI 30-s schedule for 30 days in Phase 1) is extinguished and the alternative behavior is reinforced on a VI 10-s schedule. The alternative behavior is also then placed on extinction on the 4th, 8th, 16th, 24th, or 32nd session. The data points at zero on the x-axis represent Phase 1 target response rates.

[Figure 2. x-axis: Test Session (log scale); y-axis: Target Rate (resp/min, log scale); parameters: λ = 0.006, k = 45, a = 0.0005.]
Fig. 2. Predicted first-session Phase 3 target response rates as a function of the session in which alternative reinforcement is removed for the alternative behavior (4th, 8th, 16th, 24th, or 32nd session). Note that both axes are logarithmic.
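
The simulation summarized in Figures 1 and 2 can be approximated with a minimal Python sketch of Equations 1-5. Several choices here are assumptions not stated in the text: scheduled rather than obtained reinforcement rates are used (VI 30 s taken as 120 reinforcers/hr, VI 10 s as 360 reinforcers/hr), the alternative option's history is taken to begin at session 1 with a rate of zero throughout Phase 1, bias enters as V_Alt / b with b = 1, and all function names are hypothetical. Parameter values follow the figure legends, with 0.006 taken to be λ.

```python
# Hypothetical sketch of RaC (Equations 1-5); not the authors' implementation.

def weights(n_sessions, c):
    """Equation 3: weights for sessions listed oldest first.

    The session of interest is last and has t = 1; the session before it
    has t = 2, and so on back to the oldest session with t = n_sessions.
    """
    recencies = [1.0 / (n_sessions - i) ** c for i in range(n_sessions)]
    total = sum(recencies)
    return [x / total for x in recencies]


def value(rates, lam):
    """Equations 3-5: value of one option from its per-session
    reinforcement rates (reinforcers/hr), oldest session first."""
    n = len(rates)
    r = sum(rates) / n          # running average reinforcement rate
    c = lam * r + 1.0           # Equation 4
    return sum(w * R for w, R in zip(weights(n, c), rates))


def response_rates(v_t, v_alt, k, a, b=1.0):
    """Equations 1 and 2: (B_T, B_Alt) from the current values."""
    arousal = a * (v_t + v_alt)
    if arousal == 0:            # no reinforcement history at all
        return 0.0, 0.0
    denom = v_t + v_alt / b + 1.0 / arousal
    return k * v_t / denom, k * (v_alt / b) / denom


def simulate(target_rates, alt_rates, lam, k, a, b=1.0):
    """Predicted (B_T, B_Alt) for every session in the sequence."""
    out = []
    for s in range(1, len(target_rates) + 1):
        v_t = value(target_rates[:s], lam)
        v_alt = value(alt_rates[:s], lam)
        out.append(response_rates(v_t, v_alt, k, a, b))
    return out


def constant_treatment(duration, baseline=30, test_sessions=1):
    """Scheduled reinforcement rates for Phase 1, Phase 2, and the test."""
    target = [120.0] * baseline + [0.0] * (duration + test_sessions)
    alt = [0.0] * baseline + [360.0] * duration + [0.0] * test_sessions
    return target, alt


for d in (3, 7, 15, 23, 31):
    t_rates, a_rates = constant_treatment(d)
    pred = simulate(t_rates, a_rates, lam=0.006, k=45.0, a=0.0005)
    print(f"Day{d + 1}: last Phase 2 B_T = {pred[-2][0]:.2f}, "
          f"first Phase 3 B_T = {pred[-1][0]:.2f}")
```

Because the rate lists are arbitrary, other Phase 2 arrangements can be passed to `simulate` in the same way. The exact values depend on the assumptions listed above, so the output should be read as illustrative rather than as a reproduction of the published curves.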
intellectual and developmental disabilities (IDD). However, the study included cycles of DRA and extinction across months of treatment. Removal of DRA for a session generated an increase in the rate of the problem behavior (i.e., resurgence), and across time with these on/off cycles of DRA, these increases in problem behavior became smaller, leading to the interpretation that longer treatment reduces resurgence. However, it is possible that the decreases in resurgence resulted from the alternating cycles of DRA and extinction, rather than the increases in treatment duration per se. Similarly, Sweeney and Shahan (2013) found with pigeons that alternating sessions of alternative reinforcement and extinction resulted in decreases in resurgence in those sessions in which alternative reinforcement was removed (i.e., the off sessions). But again, it is not clear if the robust decreases in resurgence during “off” days were a result of increasing time in treatment, or if they were due to the history of exposure to alternating sessions of on/off alternative reinforcement. Data from a control group in Sweeney and Shahan were consistent with the possibility that time in treatment alone might be responsible for the decreases across the repeated “off” resurgence tests in the on/off group. The control group experienced the same number of sessions in Phase 2, but with alternative reinforcement constantly available in every session. In the final resurgence test, this “constant on” alternative reinforcement group did not differ from the “on/off” group in terms of the degree of resurgence.¹ This result suggests that the decreases in resurgence in the repeated “off” sessions were not due to exposure to the alternating sequence of on/off alternative reinforcement, and that time in treatment was itself responsible for the decreases.

¹ Although Sweeney & Shahan (2013, Exp. 2) did not conduct a direct statistical comparison of resurgence in the two conditions as measured by the last session of Phase 2 and the first session of Phase 3, a 2 (on/off vs. constant condition) x 2 (last Phase 2 vs. Phase 3 test session) repeated measures ANOVA based on the single-subject data available in their Figure 7 reveals only a significant main effect of session, F(1,11) = 11.718, p = .006, ηp² = .516. Neither the main effect of condition, nor the condition x session interaction was significant. Thus, resurgence occurred during the test but did not differ for the groups. Additional paired t-tests verify that both the on/off [t(11) = 2.363, p = .0376, d = .68] and the constant [t(11) = 2.980, p = .0125, d = .86] groups did indeed show resurgence when considered individually.

However, two additional experiments with rats (Schepers & Bouton, 2015; Trask, Keim, & Bouton, 2018) have produced a different outcome. Like Wacker et al. (2011) and Sweeney and Shahan (2013), both Schepers and Bouton (2015) and Trask et al. (2018) have shown that exposure to alternating sessions of on/off alternative reinforcement gradually reduces the amount of resurgence observed across successive “off” sessions. However, in both Schepers and Bouton and in Trask et al., control groups exposed to the same number of Phase 2 sessions except with alternative reinforcement available constantly in each session showed significantly more resurgence than groups exposed to on/off alternative reinforcement. Importantly, Trask et al. examined the effects of on/off alternative reinforcement across two different overall treatment durations (i.e., 5 vs. 25 sessions). As with the treatment duration experiments discussed above, the 5- and 25-session constant alternative reinforcement groups both showed resurgence, and the groups did not differ from one another statistically. However, for the alternating on/off alternative reinforcement groups, neither the 5-session nor the 25-session group showed significant resurgence (although mean target response rates for both groups were higher during the test). In addition, for the 25-session alternating on/off group, resurgence was significantly reduced as compared to the 25-session constant alternative reinforcement group. A similar, although only marginally significant, effect was observed for the 5-session on/off and constant groups. These findings suggest that in contrast to Sweeney and Shahan, it does appear that experience with on/off reinforcement across sessions reduces resurgence, and thus, the effect of treatment duration in Wacker et al. with children with IDD could be due to this aspect of their study.

As noted by Shahan and Craig (2017), RaC predicts that exposure to on/off alternative reinforcement produces essentially the same amount of resurgence as similar durations of exposure to constant alternative reinforcement. Figure 3 shows the same simulation as in Figure 1 with the inclusion of a condition in which on/off alternative reinforcement is presented. Note that there is no meaningful
difference between the on/off and constant alternative reinforcement conditions. Obviously, the data of Schepers and Bouton (2015) and Trask et al. (2018) with rats appear to suggest that this prediction is incorrect. Thus, in addition to examining a range of constant alternative reinforcement durations, the experiment described below also included a group of rats exposed to on/off alternative reinforcement and tested all the conditions depicted in Figure 3. RaC was fitted to the resulting data using Equations 1-5 and its adequacy was assessed.

[Figure 3. Panel title: RaC Predictions. x-axis: Sessions; y-axis: Target Rate (resp/min); series: Day4, Day8, Day16, Day24, Day32, On/Off; parameters: λ = 0.006, k = 45, a = 0.0005.]
Fig. 3. Predicted target response rates presented as in Figure 1, but with the inclusion of the predictions for a condition in which on/off alternative reinforcement is presented across sessions.

Method

Subjects

Sixty male Long-Evans rats (Charles River, Portage, MI), approximately 71-90 days old upon arrival, served as subjects. Rats were individually housed in a humidity- and temperature-controlled colony room with a 12:12 hr light/dark cycle. Rats had ad libitum access to water in the home cages and were maintained at 80% of their free feeding weights. Animal housing and care and all procedures reported below were conducted in accordance with Utah State University’s Institutional Animal Care and Use Committee.

Apparatus

Ten identical Med Associates (St. Albans, VT) operant chambers were used. Chambers measured 30 cm x 24 cm x 21 cm and were housed in sound- and light-attenuating cubicles.

Each chamber was constructed of work panels on the front and back walls, and a clear Plexiglas ceiling, door, and wall opposite the door. Two retractable levers on the front wall, with stimulus lights above them, were positioned on either side of a food receptacle that was illuminated with the delivery of 45-mg grain-based food pellets (Bio Serv, Flemington, NJ). A house light positioned above the food receptacle on the front panel was used for general chamber illumination. All experimental events and data collection were controlled by Med-PC software run on a computer in an adjacent control room.

Procedure

Sessions were conducted 7 days per week at approximately the same time each day. All sessions were 30 min excluding time for reinforcement delivery during which all experimental timers were paused for 3 s. During reinforcement delivery, the pellet dispenser dropped a single food pellet into the illuminated food receptacle and the stimulus and house lights were darkened.

Training. Rats were first trained to consume food pellets from the food aperture. Food pellets were delivered response-independently according to a variable-time (VT) 60-s schedule of reinforcement for four sessions. The VT schedule and all variable-interval (VI) schedules described below were constructed of 10 intervals derived from the Flesher and Hoffman (1962) constant-probability distribution. Levers remained retracted and lever lights and house lights were darkened throughout magazine training. The day following magazine training, a single acquisition session was conducted to train rats to press the target lever (left–right, counterbalanced across subjects). This acquisition session began with illumination of the house and target-lever lights and insertion of the target lever. To facilitate target lever pressing, the first response on the target lever produced a food pellet and thereafter food pellets were delivered for lever pressing according to a VI 10-s schedule of reinforcement.

Phase 1: Baseline. Sessions during baseline began as described for the lever-response acquisition session. Responses to the target
lever produced food pellets according to a VI 30-s schedule of reinforcement. This phase lasted 30 sessions. Following baseline, rats were divided into a total of six groups. Rats were assigned to groups such that average responses per minute across the last three sessions of baseline were comparable between groups (overall M = 42.20, range: 41.19-43.36) and did not differ statistically F(5, 52) = .001, p = .999.

Phase 2: Treatment. For five constant alternative reinforcement groups, alternative reinforcement was available in every session of this phase. The five groups differed in terms of the length of this treatment phase, and group names reflect the day on which Phase 3 started and alternative reinforcement was removed (i.e., Day4, Day8, Day16, Day24, and Day32 groups). For example, the Day8 group was exposed to constant alternative reinforcement for seven sessions and alternative reinforcement was suspended on session 8. A sixth group received alternating sessions of alternative reinforcement availability and extinction of alternative-lever pressing (i.e., On/Off group). That is, alternative responding produced alternative reinforcement only on odd-numbered days and extinction was in effect on even-numbered days. This alternating treatment continued for a total of 31 daily sessions. This arrangement allowed comparison of an “off” day for the On/Off group with the first “off” day of Phase 3 for each of the constant alternative reinforcement groups after a comparable number of Phase 2 sessions. There were 10 rats in each group, with the exception of the Day16 and Day24 groups which had only nine rats each due to an experimental scheduling error.

Sessions during this phase began as in baseline with the addition of insertion of the alternative lever (right–left, counterbalanced across subjects) and illumination of the alternative-lever stimulus light. During the first session of this phase, the first response on the alternative lever produced a food pellet, after which food was delivered according to a VI 10-s schedule for the remainder of the first session and in all sessions in which alternative reinforcement was available for each group as described above.

Phase 3: Resurgence test. Beginning on the session following the completion of Phase 2 treatment and corresponding to the session designated by group names for the constant alternative reinforcement groups (and “on” session 32 for the On/Off group), alternative responding was also placed on extinction for all groups. Stimulus conditions remained as in Phase 2, but neither the target nor the alternative lever produced reinforcer deliveries. This phase lasted for 10 sessions for all groups.

Results and Discussion

Treatment Duration after Constant Alternative Reinforcement

We begin with a focus on the effects of treatment duration on resurgence. Figure 4 shows target response rates in the first session of the Phase 3 resurgence test as a function of the session in which resurgence testing began for all groups exposed to constant alternative reinforcement in Phase 2. The small data points reflect target response rates of individual rats in each treatment duration group. In general, target response rates decreased with increases in treatment duration prior to the resurgence test. Linear regression of all subject data conducted on logarithmic transformed data on both dimensions (corresponding to the double log axes depicted in the figure) revealed that target response rates decreased significantly (i.e., a significant nonzero slope) with longer Phase 2 durations F(1,46) = 11.53,

[Figure 4. x-axis: Test Session (log scale); y-axis: Target Rate (resp/min, log scale).]
Fig. 4. Obtained target response rates in the first session of the Phase 3 resurgence test as a function of the session in which resurgence testing began (4th, 8th, 16th, 24th, or 32nd session) for all groups exposed to constant alternative reinforcement in Phase 2. The small data points represent individual rats in each treatment duration group. The line is a linear regression using all individual-subject data. The large data points represent group geometric means. Note the log–log axes.
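
The kind of analysis plotted in Fig. 4, a straight line fitted to log-transformed response rates against log-transformed test session, together with group geometric means, can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the example values are made up from an exact power function rather than taken from the obtained data, and the sketch assumes every response rate is positive so that its logarithm exists.

```python
import math


def geometric_mean(values):
    """Geometric mean: the arithmetic mean taken in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def fit_power_function(x, y):
    """Ordinary least squares on log10(x), log10(y).

    Returns (a, b) such that y is approximated by a * x ** (-b),
    i.e., a straight line with slope -b on log-log axes.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** intercept, -slope


# Made-up example: rates generated from y = 20 * x ** -0.5 at the test sessions.
sessions = [4, 8, 16, 24, 32]
rates = [20.0 * s ** -0.5 for s in sessions]
a_hat, b_hat = fit_power_function(sessions, rates)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

Fitting in log space is what makes the geometric mean the natural summary for each group, since the regression line passes through means of log-transformed values rather than means of raw rates.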
8 TIMOTHY A. SHAHAN et al.

p = .0014. The larger data points depict geometric means (appropriate for the regression in log space) for each group and are provided only as a visual aid. The significant linear relation in double logarithmic space in the figure is consistent with the prediction of RaC that target responding should decrease as a power function of treatment duration. Although this is a positive outcome for RaC, the story becomes increasingly less positive with further inspection of the full data set.

[Figure 5. Axes: Sessions (x), Target Rate (resp/min) (y). Labels: Alt SR, Ext; Constant, On/Off.]
Fig. 5. Mean target response rates during the last session with alternative reinforcement in Phase 2 versus the first resurgence test session in Phase 3 for all the constant alternative reinforcement groups and similar data for the corresponding sessions for the On/Off alternative reinforcement group. Error bars represent 1 SEM.

Constant Versus On/Off Alternative Reinforcement
Next, consider the effects of constant versus on/off alternative reinforcement. Figure 5 shows mean target response rates during the last session with alternative reinforcement in Phase 2 versus the first resurgence test session in Phase 3 for the constant alternative reinforcement groups and similar data for the corresponding sessions for the On/Off alternative reinforcement group. Planned comparisons employing 2 (phase; Alt SR present versus absent) x 2 (condition; constant versus on/off) mixed ANOVAs (see Table 1 for details) reveal significant main effects of phase for all treatment durations, significant main effects of condition for all test days (i.e., treatment durations) except the Day 4 comparison, and significant phase x condition interactions for all test days except for the Day 4 comparison (although the interaction for Day 4 was marginal, p = .052). Follow-up t-tests (see Table 2 for details) comparing the last session when alternative reinforcement was present and the first resurgence test session for the constant and on/off conditions individually at each test day reveal that responding significantly increased for all comparisons. Thus, in summary, removal of alternative reinforcement generated significant resurgence of target behavior

Table 1
Results from planned 2 x 2 (Phase: Alt SR present vs absent x Condition: Constant vs On/Off) mixed-model ANOVA conducted on target response rates during the last session with alternative reinforcement in Phase 2 versus the first resurgence test session in Phase 3 for all the constant alternative reinforcement groups and similar data for the corresponding sessions for the On/Off alternative reinforcement group

Comparison  Effect             F      df (Effect)  df (Error)  p       ηp²
Day 4       Phase              23.77  1            18          < .001  .57
            Condition           2.31  1            18          .146    .11
            Phase x Condition   4.34  1            18          .052    .19
Day 8       Phase              25.40  1            18          < .001  .59
            Condition          11.81  1            18          .003    .40
            Phase x Condition   9.41  1            18          .007    .34
Day 16      Phase              70.77  1            17          < .001  .81
            Condition          32.91  1            17          < .001  .66
            Phase x Condition  32.97  1            17          < .001  .66
Day 24      Phase              25.98  1            17          < .001  .60
            Condition          16.88  1            17          < .001  .50
            Phase x Condition  41.62  1            17          .001    .50
Day 32      Phase              23.52  1            18          < .001  .57
            Condition          11.46  1            18          .003    .39
            Phase x Condition  10.74  1            18          .004    .37
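
As a reading aid for the effect-size column of Table 1, the following minimal sketch uses the standard identity that relates partial eta squared to an F ratio and its degrees of freedom. The function name is hypothetical and the sketch is not the authors' analysis code; it only illustrates the arithmetic, using the Day 4 Phase row as an example input.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom.

    Standard identity: eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Day 4, Phase effect from Table 1: F(1, 18) = 23.77
day4_phase = partial_eta_squared(23.77, 1, 18)
print(round(day4_phase, 2))
```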

Table 2
Results from paired-samples t-tests conducted on target response rates from the last session when alternative reinforcement was present and the first resurgence test session for the constant groups and the corresponding sessions for the On/Off group individually at each test day

Condition  Duration  t     df  p*      d
Constant   Day 4     3.98  9   .003    1.26
           Day 8     4.10  9   .003    1.29
           Day 16    7.18  8   < .001  2.39
           Day 24    4.38  8   .002    1.46
           Day 32    4.15  9   .002    1.31
On/Off     Day 4     2.87  9   .018    0.90
           Day 8     6.27  9   < .001  1.98
           Day 16    3.88  9   .004    1.22
           Day 24    4.99  9   < .001  1.58
           Day 32    3.78  9   .004    1.19
* Bonferroni correction: α = .005.

for both constant and On/Off alternative reinforcement groups across a range of treatment durations. As compared to when reinforcement was constantly available during Phase 2, alternating sessions of on/off alternative reinforcement significantly reduced, but did not eliminate, resurgence across the entire range of treatment durations, including at the longest treatment duration. The fact that on/off alternative reinforcement reduced resurgence is consistent with both Schepers and Bouton (2015) and Trask et al. (2018) and inconsistent with Sweeney and Shahan (2013). However, unlike in the current experiment, both Schepers and Bouton and Trask et al. reported that although all of their on/off alternative reinforcement groups showed numerical increases in target rates in the final extinction test session (i.e., session 8 for Schepers & Bouton and sessions 6 and 26 for Trask et al.), none of those increases were statistically significant. In contrast, although the significant increases in target behavior observed across all treatment durations in the on/off condition in the present experiment were small, they were extremely reliable. Indeed, 10 of 10 rats in the on/off condition showed increases in responding at every timepoint depicted in Figure 5 except for Day 4 and Day 16, in which 8 of 10 and 9 of 10 rats, respectively, showed increases in target response rate. As an illustrative example, Figure 6 shows that even with 16 cycles of on/off alternation (i.e., at the Day 32 timepoint), responding increased for every rat when alternative reinforcement was removed in session 32.

[Figure 6. Axes: session (Alt On, Day 31; Alt Off, Day 32) (x), Target Rate (resp/min) (y). Panel label: On/Off Alt SR.]
Fig. 6. Target response rates for individual rats in the On/Off alternative reinforcement group for the final transition between alternative reinforcement on (day 31) versus off (day 32).

Quantitative Assessment of RaC
In order to better evaluate the nature of RaC’s shortcomings with the present data, the model was fitted to data across conditions for both the target and alternative behaviors. Figure 7 shows response rates for the target (top panel) and alternative (bottom panel) behaviors across all Phase 2 and 3 sessions for all groups (solid lines). In addition, the figure shows the best fitting functions generated by RaC using least-squares regression (Microsoft Excel Solver). Response rates for the target and alternative behaviors were fitted simultaneously using Equations 1 and 2, respectively, with the relevant values of V_T and V_Alt across sessions provided by Equations 3-5. The fit was conducted on logarithmic transformed data in order to provide more proportional weighting of the many-fold lower response rates associated with the target as compared to alternative behavior across sessions.² The fit involves 346 data points and three free parameters (λ, k, and a) and is quite poor with r² = .56. The bias parameter (i.e., b) is omitted because identical levers served as the target and alternative behaviors, and there is no reason to expect any meaningful bias.

Fig. 7. Response rates for target (top panel) and alternative (bottom panel) behavior across sessions for all groups. Data for the different groups are represented by differently colored solid lines. Mean target behavior response rates in the final three sessions of Phase 1 are presented above zero on the x axis. The predictions of RaC for each group are shown as dotted lines of the same color as for the data. All data were fitted simultaneously. Note the logarithmic y axes.

² Fitting in nonlogarithmic space results in the model basically ignoring the target behavior because the residuals for the much higher alternative response rates are much greater.
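
The reason given for fitting in logarithmic space can be illustrated with a small sketch. The rates below are hypothetical (they are not data from the experiment), and the function names are illustrative only. Both responses are mispredicted by the same proportion, yet in raw space nearly all of the summed squared error comes from the higher-rate alternative response, whereas in log space the two responses contribute equally.

```python
import math


def squared_residuals(observed, predicted, log_space):
    """Per-point squared residuals, optionally on log10-transformed rates."""
    if log_space:
        return [(math.log10(o) - math.log10(p)) ** 2
                for o, p in zip(observed, predicted)]
    return [(o - p) ** 2 for o, p in zip(observed, predicted)]


# Hypothetical rates (resp/min): a low target rate and a much higher
# alternative rate, each overpredicted by the same 20%.
observed = {"target": [1.0], "alternative": [50.0]}
predicted = {"target": [1.2], "alternative": [60.0]}


def share_of_error(log_space):
    """Fraction of the summed squared error contributed by each response."""
    sse = {name: sum(squared_residuals(observed[name], predicted[name], log_space))
           for name in observed}
    total = sum(sse.values())
    return {name: value / total for name, value in sse.items()}


raw_share = share_of_error(log_space=False)  # dominated by the alternative response
log_share = share_of_error(log_space=True)   # split evenly between the two responses
print(raw_share, log_share)
```

A least-squares routine minimizing the raw sum would therefore be almost indifferent to target-response error, which is the concern raised in Footnote 2.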

A couple of issues with the fit of the model are apparent in Figure 7. First, as expected based on the simulations above, the model predicts no differences in target response rates when comparing resurgence for the constant and on/off alternative reinforcement conditions. Although the model predicts the sawtooth patterns of responding for both target and alternative behaviors across cycles of on/off alternative reinforcement, the location and size of the functions are inaccurate. Second, the model generally overpredicts response rates for the target behavior (except for the later resurgence tests after constant alternative reinforcement where there is a tendency to underpredict) and generally underpredicts response rates for the alternative behavior. If the bias parameter (i.e., b) is included and allowed to vary freely (an unprincipled and unreasonable assumption, and the fit is not depicted), target response rates decrease somewhat (all fitted functions in Fig. 7 shift down), alternative response rates increase somewhat (the functions shift up), and r² increases to .73. But, overall the pattern of mispredictions described above remains largely unchanged. Thus, we conclude that RaC as formalized solely in Equations 1-5 is inadequate to account for the effects of on/off alternative reinforcement, and the model also does a poor job accounting for target and alternative response rates simultaneously.

Context Theory
Schepers and Bouton (2015) and Trask et al. (2018) have suggested that the effects of on/off alternative reinforcement are consistent with the qualitative account of resurgence provided by Context Theory. According to Context Theory (e.g., Bouton et al., 2012; Trask et al., 2015; Winterbauer & Bouton, 2010), resurgence is an example of the ABC renewal of extinguished behavior. In ABC renewal, behavior that is trained in the presence of one set of stimuli (i.e., Context A) and then extinguished in the presence of a different set of stimuli (i.e., Context B), increases with a switch to a set of stimuli that is different from both the training and extinction contexts (i.e., Context C). Context Theory suggests ABC renewal occurs because of a failure of new learning during extinction in Context B (i.e., learning not to respond or to inhibit the response) to generalize to Context C, and thus responding increases. As applied to resurgence, Context Theory suggests that the presence and absence of reinforcer deliveries provide the relevant stimulus contexts. Specifically, reinforcement of target behavior in Phase 1 serves as the stimulus for Context A. In Phase 2, reinforcement of alternative behavior serves as the stimulus for Context B. Finally, in Phase 3, the absence of reinforcement serves as the stimulus for Context C, and target responding increases as a result of the failure of the extinction learning during Phase 2 to generalize.

When specifically applied to the effects of on/off alternative reinforcement, Context Theory suggests that “off” days provide experience with what will be the Phase 3 testing conditions (i.e., extinction of both target and alternative behavior), thus facilitating generalization of extinction learning to the resurgence test. Thus, the account essentially asserts that on/off alternative reinforcement gives the organism the opportunity to learn that reinforcement is not coming for the target behavior, even when the alternative reinforcer is absent.

Resurgence as Choice in Context (RaC2)
RaC in its most general sense conceptualizes resurgence as resulting from a choice between the target and alternative behaviors, and it focuses on how shifting relative values of the options over time produce shifts in the allocation of behavior. Context Theory as applied to resurgence focuses on the role of stimulus control by more local aspects of reinforcer deliveries or their absence. Clearly, these two options are not mutually exclusive. RaC can be thought of as providing a longer-term accounting of what is traditionally conceptualized as a reinforcement history for the two options. But in order for exposure to reinforcement histories (including extinction) to impact an organism, it must discriminate reinforcer deliveries (or their absence). Indeed, from our perspective, choice, relative value, and the matching law involve nothing more than an organism discriminating the signaling effects of rates and patterns of reinforcer deliveries across time and response options, and then behaving accordingly based on an innate strategy known as “matching” (Davison &
Baum, 2006; Gallistel et al., 2007; Gallistel, Mark, King, & Latham, 2001; see Shahan, 2017 for discussion). In RaC, such discriminations are assimilated over time and carried forward by the shorthand concept of “value” which updates with experience according to the temporal weighting rule. The long tails of the hyperbolic weighting functions generated by the temporal weighting rule mean that although value does update with recent experience, it also nevertheless serves as a relatively stable repository of longer-term histories. It is this longer-term repository of more distant experience that allows RaC to account for why a behavior that has not been reinforced for quite some time might nevertheless suddenly increase in relative frequency when conditions worsen for a more recently reinforced behavior. Although Context Theory provides an informal qualitative account of how more local stimulus effects of reinforcer deliveries might contribute to resurgence, it provides no account of how a longer-term history is constructed, or why the organism is choosing to engage in these behaviors at all. Our goal in what follows is to provide a first attempt at quantitatively integrating the roles of more global histories of experiences and shorter-term discriminating in an account of resurgence. The account that emerges might be thought of as a mashup of RaC and a quantitative approximation to aspects of Context Theory. We call this mashup Resurgence as Choice in Context (RaC2).

As noted above, Context Theory suggests that on/off alternative reinforcement reduces resurgence because it gives the organism an opportunity to learn two discriminations: 1) that during “alternative reinforcement on” sessions, reinforcer deliveries for the alternative behavior (i.e., Context B) come to signal that the target will not be reinforced, and 2) that during “alternative reinforcement off” sessions, the absence of reinforcement for alternative behavior (i.e., Context C) comes to signal that reinforcement is not coming for either option. In order to integrate these more local discriminations into RaC, one must somehow quantify their effects and incorporate them into RaC’s equations. It is critical to note that the discriminations suggested by Context Theory are not occurring only during conditions involving on/off alternative reinforcement, but also under the usual circumstances arranged by Phases 2 and 3 in the typical resurgence preparation, in which constant alternative reinforcement is available in Phase 2 and then removed in Phase 3. In short, the discriminations described above must be applied uniformly to all sessions in which alternative reinforcement is present versus absent.

Thus, in order to incorporate these discriminations into RaC we take inspiration from matching-law based models of stimulus control/detection (e.g., Davison & Nevin, 1999; Davison & Tustin, 1978) where the modulating effects of stimuli are characterized as a source of bias. Thus, we propose the following for target response rates in sessions where alternative reinforcement is present and target behavior is being extinguished (i.e., normal Phase 2 sessions in the typical procedure or “on” sessions for on/off alternative reinforcement):

\[ B_T(\text{on}) = \frac{k V_T}{V_T + d_1 (V_{Alt}) + \frac{1}{A}}, \qquad A = a (V_T + V_{Alt}) \tag{6} \]

where all terms are as in Equation 1 above and the added term d_1 reflects the biasing effects of discriminating that the presence of alternative reinforcement signals the local absence of reinforcement for the target behavior.³ Calculations of V_T and V_Alt remain unchanged and are as described in Equations 3-5 above. Similarly, the equation for the alternative behavior under these same conditions is:

\[ B_{Alt}(\text{on}) = \frac{k d_1 (V_{Alt})}{V_T + d_1 (V_{Alt}) + \frac{1}{A}} \tag{7} \]

where all terms are as above. The d_1 term is the same in both equations and reflects the biasing effect of the discrimination, because as it grows more behavior is allocated toward the alternative option (and away from the target) above and beyond what would be expected based on V_T and V_Alt alone as repositories of longer-term histories. When d_1 = 1, there is no additional bias and rates of both responses are governed by their relative values. But, as d_1 grows to >1, discrimination produces a bias toward the alternative behavior and away from the target. Thus, B_T decreases and B_Alt increases. In the terms of Context Theory, one could think of such a bias away from the target behavior as the organism learning not to engage in the target behavior as much.

³ The bias term b used to accommodate topographically different responses or differences in effort in Equations 1 and 2 above is omitted here for simplicity because it is not relevant for what follows.

In sessions where alternative reinforcement is absent and target behavior is being extinguished (i.e., Phase 3 sessions in the typical procedure or “off” sessions for on/off alternative reinforcement):

\[ B_T(\text{off}) = \frac{k V_T / d_0}{V_T / d_0 + V_{Alt} / d_0 + \frac{1}{A}} \tag{8} \]

for the target behavior and,

\[ B_{Alt}(\text{off}) = \frac{k V_{Alt} / d_0}{V_T / d_0 + V_{Alt} / d_0 + \frac{1}{A}} \tag{9} \]

for the alternative behavior, where all terms for both equations are as above, and d_0 reflects the biasing effects of learning to discriminate that reinforcement is not available for either option (i.e., the absence of alternative reinforcement signals that target reinforcement is also unavailable). Both V_T and V_Alt are divided by d_0 in Equations 8 and 9, thus representing a bias away from both options (i.e., a bias toward not responding). When d_0 = 1 there is no additional bias and rates of both responses are governed by their relative values. As d_0 grows to >1, discrimination produces more bias away from both options and both B_T and B_Alt decrease.⁴

⁴ Alternative but equivalent versions of Equations 8 and 9 are:
\[ B_T(\text{off}) = \frac{k V_T}{V_T + V_{Alt} + d_0 \left( \frac{1}{A} \right)} \quad \text{and} \quad B_{Alt}(\text{off}) = \frac{k V_{Alt}}{V_T + V_{Alt} + d_0 \left( \frac{1}{A} \right)} \]

Equations 6-9 describe how the biasing terms related to discriminating based on the presence or absence of alternative reinforcement might impact target and alternative response rates above and beyond the longer-term values of the options. But, those equations are not particularly useful until numbers are actually provided for the d_1 and d_0 biasing terms across sessions. Context Theory suggests that the organism is learning the discriminations with exposure to the relevant stimuli across sessions. Thus, we assume that both d terms increase across sessions according to a simplified version of the learning curve suggested by Gallistel, Fairhurst, and Balsam (2004).⁵ Thus,

\[ d_1 = d_m \left( 1 - e^{-x_{on}} \right) \tag{10} \]

and

\[ d_0 = d_m \left( 1 - e^{-x_{off}} \right) \tag{11} \]

where d_m is a free parameter representing a shared asymptotic value of d_1 and d_0, and x_on and x_off correspond to sessions of exposure to “on” and “off” alternative reinforcement, respectively. It is important to note that both d_1 and d_0 are bias terms that are unitless ratios, just like other bias terms in the Matching Law. Such bias terms represent the ratio of responding to the two options that would result from the source of bias if the values of the two options were equal. Thus, for example d_1 = 3 represents a 3/1 bias toward the Alt, if all else were equal. Accordingly, the right-hand sides of all the response rate equations (i.e., Eqs. 6-9) are dimensionally consistent and deliver responses/min, as they should. But, to be more specific, these unitless bias terms represent the biasing effects of the proposed learned discriminations, and Equations 10 and 11 describe how such bias grows with experience with the things being discriminated. d_m (also a unitless ratio) is the asymptotic value of these biasing terms. Although it is certainly not impossible that the asymptotic values for the two functions could be different, we have yoked them at the same value in the interest of simplicity. According to Equations 10 and 11, the biasing effects of discriminating the relevant conditions during on and off sessions increase relatively quickly toward an asymptote with increasing sessions of exposure to those conditions (i.e., alternative reinforcement either on or off).

⁵ Gallistel et al. (2004) actually propose a Weibull function as the learning curve, which if presented in the terms of Equations 10 or 11 is \( d = d_m (1 - e^{-(x/L)^S}) \), where L is the value at which d reaches 63% of d_m, and S is a parameter representing the speed of the onset of acquisition. Given the lack of information about the nature of the acquisition of the relevant discriminations under consideration, we have omitted both L and S for simplicity and, in effect, are employing a simple increasing exponential decay function (see also Hutsell & Jacobs, 2013, for a related simplification applied to other issues in stimulus control). Thus, in our equations the onset of learning begins with the first session of exposure to the relevant on or off sessions, and 63% of d_m is reached after the first session. As will be apparent below, the model does an excellent job, even with these simplifying assumptions about the quantitative nature of the inferred discrimination learning. However, it is possible that the omitted features of the learning curve could be required for other potential discriminations arising in future applications of the model to other circumstances.

Figure 8 shows a simultaneous fit of RaC2 to target and alternative response rates from the present experiment. Again, the fit was conducted on logarithmic transformed data. Equations 6 and 7 were used to generate response rates for all sessions in which alternative reinforcement was present (i.e., all sessions of Phase 2 for the constant alternative reinforcement groups and all “on” sessions for the on/off group). Equations 8 and 9 were used to generate response rates for all sessions when alternative reinforcement was absent (i.e., all sessions of Phase 3 for the constant alternative reinforcement groups and all “off” sessions for the on/off group). The two discrimination-based biasing terms d_1 and d_0 were applied to each session with their current-session values determined by Equations 10 and 11, in which x_on and x_off were incremented by sessions of exposure to on and off conditions, respectively. The fit includes 346 data points and a total of five parameters: λ, k, a, and one d_m parameter for the constant alternative reinforcement groups (i.e., d_m) and one for the On/Off group (i.e., d_m on/off). Separate d_m parameters were permitted for the constant and On/Off alternative reinforcement groups based on the assumption that the on/off conditions might provide training that results in better discrimination of the relevant dimensions (i.e., essentially an assumption of Context Theory).⁶ The model simultaneously provides a good description of both target and alternative behaviors across all conditions with r² = .92. The model does a much better job than the original version of RaC, and remedies its problems with respect to the levels of target and alternative response rates, and the difference between the constant and on/off conditions of alternative reinforcement. Thus, we conclude that RaC2 as formalized in Equations 6-11 provides a promising approach for combining aspects of RaC and Context Theory.

[Figure 8. Two panels: Target (top) and Alternative (bottom). Axes: Sessions (x), Responses/min (y, logarithmic). Inset: r² = .92, λ = 0.0028, k = 68.59, a = 0.0002, d_m = 8.61, d_m on/off = 22.25.]
Fig. 8. Response rates for target (top panel) and alternative (bottom panel) behavior across sessions for all groups. Data for the different groups are represented by differently colored solid lines. Mean target behavior response rates in the final three sessions of Phase 1 are presented above zero on the x axis. The predictions of RaC2 for each group are shown as dotted lines of the same color as for the data. All data were fitted simultaneously. Note the logarithmic y axes.

⁶ A model comparison of a fit including only a single d_m parameter for constant and On/Off groups and the fit with separate parameters presented in the main body of the text strongly supported the two-d_m parameter model with ΔAIC = 145.68 and ΔBIC = 141.91.

One potential criticism of the fit in Figure 8 is that it involves five free parameters. Although five free parameters might seem like a lot, one must keep in mind that this is not a fit typical of that which occurs in much of the quantitative analysis of behavior, where a quantitative model is fitted to relatively few data points. For example, while the generalized matching law has only two free parameters, it is usually fitted to 10 or fewer data points representing only averages across sessions of steady-state performance. Thus, the fits of more well-accepted models often involve data-to-parameter ratios that are more than tenfold worse than the fit in Figure 8. The fit in Figure 8 involves 346 data points representing absolute response rates for behavior in transition in six different conditions and for separate responses simultaneously.

By way of comparison to the fit of RaC2 in Figure 8, consider the only other existing quantitative model of resurgence. The Behavioral Momentum Theory of resurgence (Shahan & Sweeney, 2011) suggests

\[ B_t = B_0 \cdot 10^{\frac{-t (p R_a + c + d r)}{(r + R_a)^b}} \tag{12} \]

where B_t is the absolute rate of the target behavior at time t in extinction and B_0 is the baseline rate of the target response before extinction, c is a parameter reflecting disruption associated with breaking the target-response reinforcer contingency, d is a parameter scaling disruption associated with elimination of reinforcers from the situation (i.e., generalization decrement), r is a variable reflecting the rate of reinforcement for the target during baseline, and b is a parameter reflecting sensitivity to reinforcement rate. The variable R_a is the rate of reinforcement for alternative behavior during extinction, and the parameter p scales the additional disruptive impact of alternative reinforcement on the target behavior during extinction. Thus, like RaC2 this model has five free parameters (c, d, b, p, and B_0). Unlike RaC2, this model has no means to account for alternative behavior. In fact, it is not even clear how to apply the general conceptual approach of disruption by extinction and alternative reinforcement to the alternative behavior in Phase 2 at all. Thus, before beginning, Behavioral Momentum Theory has failed to provide any account for half of the data. Nevertheless, Equation 12 was fitted to the data for the target response in the present experiment, and the result is shown in Figure 9. Despite having exactly the same number of free parameters and only attempting to account for half of the data (i.e., 173 rather than 346 data points), Equation 12 fails marvelously, as is apparent visually and based on the poor quality of the fit (i.e., r² = .48). Thus, with the same number of parameters and twice the data, RaC2 does a considerably better job than its only quantitative competitor.
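
Equations 6-11 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the fitting procedure used for Figure 8: the function names are hypothetical, the values of V_T and V_Alt are arbitrary placeholders (in the model they come from Equations 3-5, which are not reproduced here), and k, a, and d_m are borrowed from the values reported in Figure 8 purely as example inputs.

```python
import math


def bias(d_m, sessions):
    """Eqs. 10-11: bias term after `sessions` sessions of exposure (asymptote d_m)."""
    return d_m * (1.0 - math.exp(-sessions))


def arousal(a, v_t, v_alt):
    """A = a(V_T + V_Alt), as given alongside Eq. 6."""
    return a * (v_t + v_alt)


def rates_on(k, a, v_t, v_alt, d_1):
    """Eqs. 6-7: (target, alternative) rates with alternative reinforcement present."""
    denom = v_t + d_1 * v_alt + 1.0 / arousal(a, v_t, v_alt)
    return k * v_t / denom, k * d_1 * v_alt / denom


def rates_off(k, a, v_t, v_alt, d_0):
    """Eqs. 8-9: (target, alternative) rates with alternative reinforcement absent."""
    denom = v_t / d_0 + v_alt / d_0 + 1.0 / arousal(a, v_t, v_alt)
    return k * (v_t / d_0) / denom, k * (v_alt / d_0) / denom


def rates_off_footnote4(k, a, v_t, v_alt, d_0):
    """Footnote 4: equivalent form of Eqs. 8-9 with d_0 multiplying 1/A."""
    denom = v_t + v_alt + d_0 / arousal(a, v_t, v_alt)
    return k * v_t / denom, k * v_alt / denom


# Example inputs: k, a, d_m as reported in Fig. 8; V_T and V_Alt are
# hypothetical placeholder values chosen only for illustration.
k, a, d_m = 68.59, 0.0002, 8.61
v_t, v_alt = 5.0, 100.0

d_1 = bias(d_m, sessions=3)  # third "on" session
d_0 = bias(d_m, sessions=1)  # first "off" session
print(rates_on(k, a, v_t, v_alt, d_1))
print(rates_off(k, a, v_t, v_alt, d_0))
```

Note that Equations 10 and 11 give a bias of zero at zero sessions of exposure, so the sketch is only meaningful from the first session onward, consistent with Footnote 5.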

[Figure 9. Panel title: Behavioral Momentum Theory (Target). Axes: Sessions (x), Responses/min (y, logarithmic). Inset: r² = .48, p = 0.006, c = 1.003, d = 0.004, b = 0.517, B_0 = 45.00.]
Fig. 9. Response rates for target behavior across sessions for all groups. Data for the different groups are represented
by differently colored solid lines. Mean target behavior response rates in the final three sessions of Phase 1 are presented
above zero on the x axis. The predictions of Behavioral Momentum Theory for each group are shown as dotted lines of
the same color as for the data. Note the logarithmic y axes.
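
Equation 12, whose fit is summarized in Fig. 9, can be sketched as follows. The function name is hypothetical, and the conversion of the VI 30 s and VI 10 s schedules to reinforcers per hour (120 and 360) is an assumption made only for this illustration; the parameter values are those listed in Fig. 9. The sketch is not the authors' fitting code.

```python
def bmt_target_rate(t, b0, r, r_a, c, d, p, b):
    """Eq. 12: target response rate at time t in extinction.

    b0  : baseline target rate before extinction
    r   : baseline reinforcement rate for the target
    r_a : alternative reinforcement rate during extinction
    c, d, p, b : model parameters as defined in the text
    """
    disruption = t * (p * r_a + c + d * r)
    return b0 * 10.0 ** (-disruption / (r + r_a) ** b)


# Parameter values as listed in Fig. 9.
params = dict(b0=45.00, c=1.003, d=0.004, p=0.006, b=0.517)

# Assumed reinforcement rates in reinforcers per hour (an assumption of this
# sketch): VI 30 s -> 120/hr for the target, VI 10 s -> 360/hr for the alternative.
with_alt = bmt_target_rate(t=5, r=120.0, r_a=360.0, **params)
without_alt = bmt_target_rate(t=5, r=120.0, r_a=0.0, **params)
print(with_alt, without_alt)
```

Under these inputs the predicted target rate is higher when R_a is set to zero than when alternative reinforcement is present at the same t, which is how the equation generates resurgence.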

General Discussion
The data from the present experiment suggest that increases in the duration of treatment with extinction plus alternative reinforcement significantly reduce resurgence. The present study is the first to examine a range of treatment durations in the absence of other confounds, and it is the first to clearly demonstrate that decreases in target behavior with longer treatment durations are rather small, but reliable. Although RaC did a reasonably good job predicting the relation between target response rates in the first session of the resurgence test and treatment duration, the model failed in other respects. First, the present experiment replicated the previous findings of Schepers and Bouton (2015) and Trask et al. (2018) that exposure to alternating on/off alternative reinforcement reduces resurgence. RaC predicted that there should be no meaningful difference between constant and on/off alternative reinforcement. Second, fits of RaC’s equations to both target and alternative behavior across sessions and conditions revealed that the model did a poor job simultaneously accounting for both behaviors.

As a result of RaC’s failures with these data, we have proposed an expanded version of the model. Inspired by Context Theory, the new version (i.e., RaC2) quantitatively incorporates a role for more local discriminations of the relevant conditions of reinforcement for target and alternative behaviors, presumably based on the signaling effects of reinforcer deliveries or their absence. The model suggests that the biasing effects of such discriminations on response allocation are above and beyond the values of target and alternative options as longer-term repositories of experience (as provided by the temporal weighting rule). Thus, the new model represents an integration of RaC and a quantitative approximation of at least some aspects of Context Theory. This new model provides an excellent fit for the large dataset generated in the present experiment and does a good job simultaneously accounting for target and alternative behavior across different treatment durations and for both constant and on/off alternative reinforcement in Phase 2.

Although on/off alternative reinforcement reduced resurgence as compared to constant
alternative reinforcement, it did not eliminate it completely. Even after 16 cycles of on/off alternative reinforcement across sessions, target behavior still increased when alternative reinforcement was removed in session 32. The increase was small, but it was reliable both statistically and at the individual subject level. We suggest that these increases reflect the relatively stable influence of the value of the target option over extended exposures to extinction. Although the improved discrimination of local reinforcement availabilities encouraged by on/off alternative reinforcement reduces resurgence as compared to constant alternative reinforcement, the effects of the more extended history as represented in value seem to contribute to resurgence for a rather long time.

Clinically, the effects of on/off alternative reinforcement could be important because they might provide a means to reduce resurgence as compared to standard implementations of DRA with constant alternative reinforcement. However, the small increases in target behavior with the omission of alternative reinforcement after even extended exposure to on/off alternative reinforcement suggest that the intervention could result in some tendency for problem behavior to return. Even a small increase in the tendency for problem behavior to return could result in reinforcement of that behavior via errors of commission and likely further increase problem behavior (see Greer & Shahan, 2019, for discussion). Thus, although on/off alternative reinforcement appears to hold promise as a means to reduce resurgence, additional research should consider means to improve the efficacy of the procedure by identifying ways to further reduce the impact of the longer-term history of reinforcement for the target behavior.

The need for additional research on the effects of on/off alternative reinforcement is also apparent in the continued discrepancy between the findings of Sweeney and Shahan (2013) and those of the present study, Schepers and Bouton (2015), and Trask et al. (2018). As noted above (see Footnote 1), Sweeney and Shahan found that both on/off and constant alternative reinforcement generated resurgence, and that the magnitude of resurgence did not differ between the conditions. There are many differences between the studies, including the use of pigeons and a within-subjects design in which all pigeons experienced multiple exposures to extinction and resurgence tests. However, it is not apparent why such differences would be expected to eliminate the difference in resurgence between the on/off and constant reinforcement conditions. Perhaps more promising is the fact that Sweeney and Shahan used leaner schedules of reinforcement in Phase 1 (i.e., VI 60 s) and in Phase 2 (VI 30 s) than did the present study, Schepers and Bouton, and Trask et al., all of which used a VI 30 s in Phase 1 and a VI 10 s in Phase 2. Given the robust effects of Phase 2 rate of alternative reinforcement and the relatively smaller effects of Phase 1 target reinforcement rate on resurgence (see Craig & Shahan, 2016), we suspect that the difference in results could be due to the difference in Phase 2 alternative reinforcement rate. Indeed, if the effects of on/off alternative reinforcement are due to the learning of discriminations related to the presence versus absence of alternative reinforcement, then it stands to reason that the lower rate of Phase 2 alternative reinforcement might be less effective at training such discriminations. Future research should systematically examine this possibility.

Ghosts of Models Past and Future
The original version of RaC was developed as a result of the accumulation of serious failings (see Craig & Shahan, 2016; Nevin et al., 2017, for reviews) of its predecessor, the Behavioral Momentum Theory of resurgence (Shahan & Sweeney, 2011). Fits of Behavioral Momentum Theory to the data from the present experiment reveal yet more serious problems for the model. At this point, it appears to us that the current version of the Behavioral Momentum Theory of resurgence need not be seriously considered further. Nevertheless, we note that a more recent formulation of the general framework of behavioral momentum (Killeen & Nevin, 2018) based on Killeen’s (1994) Mathematical Principles of Reinforcement is couched in terms of response competition, and as those authors note, shares some conceptual similarities to the choice-based approach of RaC. As of yet, this new approach has not been extended to resurgence or other relapse phenomena. Perhaps such an
extension could revive a momentum-based approach to resurgence and provide a reasonable quantitative competitor to the amended version of RaC developed here.

We also note that the model developed here shares many conceptual similarities with a discrimination-based approach to choice in which behavioral allocation is said to be governed by discrimination of reinforcers distributed in time and space (see Cowie & Davison, 2016; Cowie, Davison, & Elliffe, 2016, for reviews). The quantitative formulation of this model is based on the Matching Law, and more specifically, a refined version of the more general Davison and Nevin (1999) model of stimulus control. A version of this model (Bai, Cowie, & Podlesnik, 2017) has been extended to account for a resurgence-like effect in the free-operant psychophysical procedure typically used to study timing, but it is not clear how that version of the model could be applied to resurgence data from more typical arrangements. Nevertheless, should such an approach be extended to resurgence more generally, it might also provide a reasonable quantitative competitor to the model developed here.

Finally, the present model quantifies some aspects of the purely descriptive, narrative account of resurgence provided by Context Theory. Our previous concerns about Context Theory (e.g., Greer & Shahan, 2019; Shahan & Craig, 2017) have never been rooted in a denial that reinforcer deliveries might serve as signaling stimuli (in fact we would assert that is all they ever do; Shahan, 2017), but rather that the narrative nature of Context Theory is not sufficiently precise to allow rigorous empirical assessment. There is no doubt that the unparalleled scope of Context Theory in accounting for relapse phenomena in general has been admirable, but this scope has come at the extreme expense of precision. It is our hope that the quantitative integration of some aspects of Context Theory into the choice-based framework of RaC will lead to a formal quantitative account of relapse in general with the scope of Context Theory. For example, the biasing effects of discriminations of reinforcer presence or absence might be extended to reinstatement, or replaced with the biasing effects of explicit contextual stimuli signaling differential reinforcement conditions in an extension to renewal. We postpone until a subsequent paper further exposition and evaluation of such a more general quantitative approach to relapse as choice in context.

References

Bai, J. Y. H., Cowie, S., & Podlesnik, C. A. (2017). Quantitative analysis of local-level resurgence. Learning and Behavior, 45(1), 76–88. https://doi.org/10.3758/s13420-016-0242-1
Baum, W. M., & Rachlin, H. C. (1969). Choice as time allocation. Journal of the Experimental Analysis of Behavior, 12(6), 861–874. https://doi.org/10.1901/jeab.1969.12-861
Bouton, M. E., Winterbauer, N. E., & Todd, T. P. (2012). Relapse processes after the extinction of instrumental learning: renewal, resurgence, and reacquisition. Behavioural Processes, 90(1), 130–141. https://doi.org/10.1016/j.beproc.2012.03.004
Briggs, A. M., Fisher, W. W., Greer, B. D., & Kimball, R. T. (2018). Prevalence of resurgence of destructive behavior when thinning reinforcement schedules during functional communication training. Journal of Applied Behavior Analysis, 51(3), 620–633. https://doi.org/10.1002/jaba.472
Cowie, S., & Davison, M. (2016). Control by reinforcers across time and space: A review of recent choice research. Journal of the Experimental Analysis of Behavior, 105(2), 246–269. https://doi.org/10.1002/jeab.200
Cowie, S., Davison, M., & Elliffe, D. (2016). A model for discriminating reinforcers in time and space. Behavioural Processes, 127, 62–73. https://doi.org/10.1016/j.beproc.2016.03.010
Craig, A. R., & Shahan, T. A. (2016). Behavioral momentum theory fails to account for the effects of reinforcement rate on resurgence. Journal of the Experimental Analysis of Behavior, 105(3), 375–392. https://doi.org/10.1002/jeab.207
Davison, M., & Baum, W. M. (2006). Do conditional reinforcers count? Journal of the Experimental Analysis of Behavior, 86(3), 269–283. https://doi.org/10.1901/jeab.2006.56-05
Davison, M., & Nevin, J. (1999). Stimuli, reinforcers, and behavior: An integration. Journal of the Experimental Analysis of Behavior, 71(3), 439–482. https://doi.org/10.1901/jeab.1999.71-439
Davison, M. C., & Tustin, R. D. (1978). The relation between the generalized matching law and signal-detection theory. Journal of the Experimental Analysis of Behavior, 29(2), 331–336. https://doi.org/10.1901/jeab.1978.29-331
Devenport, L. D., & Devenport, J. A. (1994). Time-dependent averaging of foraging information in least chipmunks and golden-mantled ground squirrels. Animal Behaviour, 47(4), 787–802. https://doi.org/10.1006/anbe.1994.1111
Epstein, R. (1985). Extinction-induced resurgence: Preliminary investigations and possible applications. The Psychological Record, 35(2), 143–153. https://doi.org/10.1007/BF03394918
Fleshler, M., & Hoffman, H. S. (1962). A progression for generating variable-interval schedules. Journal of the Experimental Analysis of Behavior, 5(4), 529–530. https://doi.org/10.1901/jeab.1962.5-529
Gallistel, C. R., Fairhurst, S., & Balsam, P. (2004). The learning curve: Implications of a quantitative analysis. Proceedings of the National Academy of Sciences of the United States of America, 101(36), 13124–13131. https://doi.org/10.1073/pnas.0404965101
Gallistel, C. R., King, A. P., Gottlieb, D., Balci, F., Papachristos, E. B., Szalecki, M., & Carbone, K. S. (2007). Is matching innate? Journal of the Experimental Analysis of Behavior, 87(2), 161–199. https://doi.org/10.1901/jeab.2007.92-05
Gallistel, C. R., Mark, T. A., King, A. P., & Latham, P. E. (2001). The rat approximates an ideal detector of changes in rates of reward: implications for the law of effect. Journal of Experimental Psychology: Animal Behavior Processes, 27(4), 354–372. https://doi.org/10.1037/0097-7403.27.4.354
Greer, B. D., & Shahan, T. A. (2019). Resurgence as choice: Implications for promoting durable behavior change. Journal of Applied Behavior Analysis, 52(3), 816–846. https://doi.org/10.1002/jaba.573
Higgins, S. T., Heil, S. H., & Lussier, J. P. (2004). Clinical implications of reinforcement as a determinant of substance use disorders. Annual Review of Psychology, 55, 431–461. https://doi.org/10.1146/annurev.psych.55.090902.142033
Hutsell, B. A., & Jacobs, E. A. (2013). Attention and psychophysics in the development of stimulus control. Journal of the Experimental Analysis of Behavior, 100, 282–300. https://doi.org/10.1002/jeab.54
Killeen, P. R. (1981). Incentive theory. In Nebraska Symposium on Motivation (Vol. 29, pp. 169–216).
Killeen, P. R. (1994). Mathematical principles of reinforcement. Behavioral and Brain Sciences, 17(1), 105–135. https://doi.org/10.1017/S0140525X00033628
Killeen, P. R., & Nevin, J. A. (2018). The basis of behavioral momentum in the nonlinearity of strength. Journal of the Experimental Analysis of Behavior, 109(1), 4–32. https://doi.org/10.1002/jeab.304
Lattal, K. A., & Wacker, D. (2015). Some dimensions of recurrent operant behavior. Mexican Journal of Behavior Analysis, 41(2), 1–13. https://doi.org/10.5514/rmac.v41.i2.63716
Leitenberg, H., Rawson, R. A., & Mulick, J. A. (1975). Extinction and reinforcement of alternative behavior. Journal of Comparative and Physiological Psychology, 88(2), 640–652. https://doi.org/10.1037/h0076418
Nall, R. W., Craig, A. R., Browning, K. O., & Shahan, T. A. (2018). Longer treatment with alternative non-drug reinforcement fails to reduce resurgence of cocaine or alcohol seeking in rats. Behavioural Brain Research, 341, 54–62. https://doi.org/10.1016/J.BBR.2017.12.020
Nevin, J. A., Craig, A. R., Cunningham, P. J., Podlesnik, C. A., Shahan, T. A., & Sweeney, M. M. (2017). Quantitative models of persistence and relapse from the perspective of behavioral momentum theory: Fits and misfits. Behavioural Processes, 141(Pt 1), 92–99. https://doi.org/10.1016/j.beproc.2017.04.016
Petscher, E. S., Rey, C., & Bailey, J. S. (2009). A review of empirical support for differential reinforcement of alternative behavior. Research in Developmental Disabilities, 30(3), 409–425. https://doi.org/10.1016/j.ridd.2008.08.008
Podlesnik, C. A., Jimenez-Gomez, C., & Shahan, T. A. (2006). Resurgence of alcohol seeking produced by discontinuing non-drug reinforcement as an animal model of drug relapse. Behavioural Pharmacology, 17(4), 369–374. https://doi.org/10.1097/01.fbp.0000224385.09486.ba
Prendergast, M., Podus, D., Finney, J., Greenwell, L., & Roll, J. (2006). Contingency management for treatment of substance use disorders: A meta-analysis. Addiction, 101(11), 1546–1560. https://doi.org/10.1111/j.1360-0443.2006.01581.x
Quick, S. L., Pyszczynski, A. D., Colston, K. A., & Shahan, T. A. (2011). Loss of alternative non-drug reinforcement induces relapse of cocaine-seeking in rats: role of dopamine D(1) receptors. Neuropsychopharmacology, 36(5), 1015–1020. https://doi.org/10.1038/npp.2010.239
Schepers, S. T., & Bouton, M. E. (2015). Effects of reinforcer distribution during response elimination on resurgence of an instrumental behavior. Journal of Experimental Psychology: Animal Learning and Cognition, 41(2), 179–192. https://doi.org/10.1037/xan0000061
Shahan, T. A. (2017). Moving beyond reinforcement and response strength. The Behavior Analyst, 40(1), 107–121. https://doi.org/10.1007/s40614-017-0092-y
Shahan, T. A., & Chase, P. N. (2002). Novelty, stimulus control, and operant variability. The Behavior Analyst, 25(2), 175–190. https://doi.org/10.1007/BF03392056
Shahan, T. A., & Craig, A. R. (2017). Resurgence as choice. Behavioural Processes, 141(Part 1), 100–127. https://doi.org/10.1016/j.beproc.2016.10.006
Shahan, T. A., & Sweeney, M. M. (2011). A model of resurgence based on behavioral momentum theory. Journal of the Experimental Analysis of Behavior, 95(1), 91–108. https://doi.org/10.1901/jeab.2011.95-91
Silverman, K., Chutuape, M. A., Bigelow, G. E., & Stitzer, M. L. (1999). Voucher-based reinforcement of cocaine abstinence in treatment-resistant methadone patients: Effects of reinforcement magnitude. Psychopharmacology, 146(2), 128–138. https://doi.org/10.1007/s002130051098
Silverman, K., Wong, C. J., Umbricht-Schneiter, A., Montoya, I. D., Schuster, C. R., & Preston, K. L. (1998). Broad beneficial effects of cocaine abstinence reinforcement among methadone patients. Journal of Consulting and Clinical Psychology, 66(5), 811–824. https://doi.org/10.1037//0022-006x.66.5.811
Sweeney, M. M., & Shahan, T. A. (2013). Behavioral momentum and resurgence: Effects of time in extinction and repeated resurgence tests. Learning & Behavior, 414–424. https://doi.org/10.3758/s13420-013-0116-8
Tiger, J. H., Hanley, G. P., & Bruzek, J. (2008). Functional communication training: A review and practical guide. Behavior Analysis in Practice, 1(1), 16–23. https://doi.org/10.1007/BF03391716
Trask, S., Keim, C. L., & Bouton, M. E. (2018). Factors that encourage generalization from extinction to test reduce resurgence of an extinguished operant response. Journal of the Experimental Analysis of Behavior, 110(1), 11–23. https://doi.org/10.1002/jeab.446
Trask, S., Schepers, S. T., & Bouton, M. E. (2015). Context change explains resurgence after the extinction of operant behavior. Mexican Journal of Behavior Analysis, 41(2), 187–210. https://doi.org/10.5514/rmac.v41.i2.63772
Volkert, V. M., Lerman, D. C., Call, N. A., & Trosclair-Lasserre, N. (2009). An evaluation of resurgence
during treatment with functional communication training. Journal of Applied Behavior Analysis, 42(1), 145–160. https://doi.org/10.1901/jaba.2009.42-145
Wacker, D. P., Harding, J. W., Berg, W. K., Lee, J. F., Schieltz, K. M., Padilla, Y. C., … Shahan, T. A. (2011). An evaluation of persistence of treatment effects during long-term treatment of destructive behavior. Journal of the Experimental Analysis of Behavior, 96(2), 261–282. https://doi.org/10.1901/jeab.2011.96-261
Winterbauer, N. E., & Bouton, M. E. (2010). Mechanisms of resurgence of an extinguished instrumental behavior. Journal of Experimental Psychology: Animal Behavior Processes, 36(3), 343–353. https://doi.org/10.1037/a0017365
Winterbauer, N. E., Lucke, S., & Bouton, M. E. (2013). Some factors modulating the strength of resurgence after extinction of an instrumental behavior. Learning and Motivation, 44(1), 60–71. https://doi.org/10.1016/j.lmot.2012.03.003

Received: September 15, 2019
Final Acceptance: November 13, 2019
Editor in Chief: Mark Galizio
Associate Editor: Christopher Podlesnik
