Behavioural Processes xxx (2005) xxx–xxx

Windows

François Tonneau ∗
Centro de Estudios e Investigaciones en Comportamiento, Universidad de Guadalajara, Guadalajara, Jalisco 45030, Mexico

Abstract

Some models of performance assume that behavior depends on environmental quantities (for example, rates of reinforcement) that are defined over intervals of fixed duration. Although such window models may serve as useful approximations, they are incompatible with well-known properties of behavior (for instance, sensitivity to delay). Window models with variable window length, however, are more difficult to refute. This article examines some implications of the assumption of random window length. Variable windows are shown to produce continuous forgetting and temporal discounting functions, to display properties analogous to parallel aggregation, and to make reasonable predictions about steady-state relations between reinforcement and responding. Issues of interpretation nonetheless suggest that alternatives to window models should be developed.

© 2005 Published by Elsevier B.V.

Keywords: Window; Forgetting; Discounting; Delay; Rate; Aggregation

∗ Tel.: +52 3 121 1158; fax: +52 3 121 1158. E-mail address: ftonneau@cencar.udg.mx.
0376-6357/$ – see front matter © 2005 Published by Elsevier B.V.
doi:10.1016/j.beproc.2005.02.007
BEPROC 1459 1–11

1. Introduction

A persistent problem of psychology has been to understand how behavior depends on events distributed over time. The dependence of behavior on past events has been the research focus not only of memory psychologists, but also of behavior analysts studying reinforcement histories (Marr, 1984). The concept of window has played a recurrent role in this endeavor. Sometimes the ‘window’ is merely an aspect of the experimental procedure, as is the case when the experimenter delivers food to an animal at a rate determined by the subject’s behavior during the preceding W seconds (e.g., Vaughan, 1981). But windows can also be used to model the subject’s behavior itself (e.g., Valone, 1992). The psychological models that rely on windows encompass a range of possibilities, depending on what events the hypothesized windows are supposed to contain, and on how their length is defined.

Imagine, for example, a classical conditioning experiment consisting of different trials. A window model of length N might assume that conditional responding on any trial depends on what happened on the last N trials encountered. In this case, the window would be defined over trials, not real time. Similarly, free-operant performance could depend on what happened to each of the last N responses emitted, in which case the window would be defined over responses and the accompanying reinforcers. In the models of Wearden and Clark (1988, 1989), for instance, the inter-response time the animal emits at any moment depends on a window containing N reinforced inter-response times; the assumption of a
fixed window size is useful for developing computer simulations of the reinforcement process (e.g., Peele et al., 1984).

In an important class of window models, however, the window is defined in terms of time. These models assume that the length of the window is a temporal interval W, and current performance is supposed to depend on the content of the window over the last W seconds (or hours, weeks, etc.). Time-based windows have been discussed in relation with perception and short-term memory (Turvey, 1977), and have figured in behavior analysis as areas of temporal integration (Rachlin et al., 1981; p. 387) or as memory windows (Rachlin, 1982). Time-based windows are necessarily part of any theory that expresses behavior as a function of reinforcement or stimulus frequency (e.g., Baum, 1973; Herrnstein, 1970, 1982; Prelec and Herrnstein, 1978), because a frequency must be defined over some time interval W that is, by definition, a window. Thus, in theories which appeal to rates of reinforcement, predictions about reinforced performance may depend on the length of the windows over which the rates are defined, with different lengths giving different answers (cf. Staddon, 1988). To the extent that principles of reinforcement involve stimulus rates (Williams, 1988), specifying the length of the relevant windows becomes an important issue in behavior analysis, especially in the context of dynamic models.

Time-based windows also circumvent the issue of action at a temporal distance, which arises when explanations rely on discrete events such as individual reinforcers or stimuli. Imagine, for example, that behavior at time t is caused by reinforcement rate over the window [t − W, t]. The individual reinforcers present in this window are separated from current behavior by various delays. The rate of reinforcers in the window, however, being a property of composition of the environment over [t − W, t], is spread over the entire interval and contacts behavior at time t. The concept of action at a temporal distance is unnecessary because there is no gap between the intervals [t − W, t] over which the cause is defined, and the endpoint t at which behavioral effects occur.

In spite of their importance, little work has evaluated the formal properties of window models. If windows cannot account for elementary aspects of behavior, there is little point in developing window models, except as convenient approximations to more exact theories (e.g., Rachlin, 1982; p. 161). If, however, the assumptions behind window models are compatible with well-known properties of behavior, then testing such models becomes worthwhile. In this paper, I shall examine how time-based windows perform in situations of memory and reinforcement (in particular, delayed reinforcement). In memory, one or more past events are separated from current behavior by various delays t_k; in delayed reinforcement, responding is followed by one or more reinforcers at various delays t_k, and the response–reinforcer array affects later behavior. The presence of delays raises similar issues in both cases (Killeen and Smith, 1984; Logue, 1988), issues that window models should be able to address.

For a given window length W, window models assume that performance depends on some measure of the density of the reinforcing or memorable events that are present in the window. Here, I shall focus on the most tractable window models, those in which behavior is caused by the number of target events present in a temporal window. Depending on the experimental setting, these events could consist of stimuli, responses, or both; the formal properties of window models are the same in all cases.

The simplest situation to consider involves a single event (Section 2). In this situation, window models should display adequate sensitivity to delay, which can be achieved if the length of the window is supposed to be variable instead of fixed (Section 2.1); specifying a probability density for window length allows predictions about the shape of forgetting and discounting functions (Section 2.2). Similar probabilistic analyses apply to situations that involve multiple events (Section 3). From there, it is only a simple step to extend window models to the acquisition of behavior under repeated Pavlovian or operant trials (Section 4.1) followed by extinction (Section 4.2). Section 4.3 shows that variable-window models are compatible with some standard properties of behavior on variable-interval schedules. In a final section, I examine what limitations window models retain even under the assumption of variable window size. My purpose is simply to explore the general capacities and limitations of such models, and no attempt is made to fit specific data except for illustrative purposes.
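
The dependence of a computed rate on window length, noted earlier in this section, can be illustrated with a minimal Python sketch. The function name `window_rate`, the event times, and the window lengths are hypothetical choices made for illustration; none of them comes from the article.

```python
from bisect import bisect_left, bisect_right


def window_rate(event_times, t, w):
    """Rate of events (events per second) over the closed window [t - w, t].

    event_times must be sorted in increasing order; w must be positive.
    """
    if w <= 0:
        raise ValueError("window length must be positive")
    lo = bisect_left(event_times, t - w)   # index of first event at or after t - w
    hi = bisect_right(event_times, t)      # one past the last event at or before t
    return (hi - lo) / w


# Hypothetical reinforcer times (in seconds), for illustration only.
reinforcers = [2.0, 5.0, 6.0, 14.0, 29.0, 30.0]

# Same history and same moment t, but different window lengths W.
rates = {w: window_rate(reinforcers, t=30.0, w=w) for w in (5.0, 15.0, 30.0)}
for w, rate in rates.items():
    print(f"W = {w:>4} s -> {rate:.3f} reinforcers/s")
```

With these hypothetical event times, the three window lengths give three different rates at the same moment t, which is why a rate-based account has to say which window it assumes.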


2. One event

2.1. Forgetting and discounting functions

Some well-known properties of memory or delayed reinforcement with only one event (for example, a single discriminative stimulus, or a single reinforcer) already raise problems for window models. A first class of problems concerns the shape of forgetting functions and delay-of-reinforcement functions when one event is presented at a delay t from a target response (Killeen, 1994; p. 106; Mazur and Herrnstein, 1988). A window model with window length W predicts an impact V(t) on behavior that is positive (V(t) = a > 0) as long as the event remains in the window (for t ≤ W) and that is zero thereafter (for t > W). This hypothesis implies that forgetting functions and, analogously, the discounting or delay-of-reinforcement functions that express the ability of delayed reinforcers to maintain behavior, should be step functions. However, in most circumstances the data show that the impact V(t) of an event separated t seconds from responding is a continuously decreasing function of t (e.g., Laming and Scheiwiller, 1985; Mazur, 2001; Rubin and Wenzel, 1996; Shull and Spear, 1987).

Although the shape of empirical forgetting and delay-of-reinforcement curves contradicts the hypothesis of a fixed window length W, the dependent variable that figures in such curves is most often an average over trials. A continuous decrease in the averaged dependent variable is consistent with a window model in which the length of the window fluctuates over time (say, from test trial to test trial in the case of a memory experiment). Assume that window size W is a random variable with probability density f over [0, +∞), and for any delay t ≥ 0 define g(t) as:

$$g(t) = P(t \le W) = \int_t^{+\infty} f(w)\,dw. \tag{1}$$

The g(t) function is equivalent to a survival or reliability function for window size W (e.g., Mendenhall and Sincich, 1995). It is decreasing over [0, +∞), with g(0) = 1 and g(t) → 0 when t → +∞. As in a model with fixed window length, the impact of the content of the window on behavior is a (>0) when t ≤ W and 0 whenever W < t. Hence, at delay t the expected value of the behavioral impact V(t) is:

$$E[V(t)] = a \cdot P(t \le W) + 0 \cdot P(t > W) = a \cdot g(t). \tag{2}$$

Eq. (2) implies that E[V(t)] is decreasing over [0, +∞), with E[V(0)] = a > 0 and E[V(t)] → 0 when t → +∞.

One assumption behind Eq. (2) is that the effect on behavior of a property of the environment can be measured by a real number a. Hence, the forgetting or discounting functions described by Eq. (2) are necessarily bounded; this approach should be modified when dealing with unbounded dependent variables (for example, log odds; Anderson and Schooler, 1991). Because observed behavior reflects multiple influences, the parameter a, which characterizes a causal relation between environment and responding, may not be identifiable with an empirical variable such as response proportion or latency. In principle, therefore, different mappings can be applied to a to produce different dependent variables, bounded or otherwise (cf. Wickens, 1998; Wixted, 1990). Other things being equal, though, simpler mappings should be preferred. The simplest mapping assumption involves a linear relation between the parameter a and some bounded measure of performance (for example, a response proportion); in this case, it is only necessary to specify a probability density for window length to predict behavior.

2.2. Densities for window length

A first possibility is to consider a uniform density from 0 to w_max, where w_max (>0) is the maximal length of the window. This hypothesis implies forgetting and discounting functions (Eq. (2)) that decrease linearly from a (at t = 0) to 0 (at t = w_max) and equal 0 thereafter. Forgetting functions with these properties have been entertained by Bower (1967) and Treisman and Williams (1984).

A second possibility is the exponential density:

$$f(w) = c\,e^{-cw} \tag{3}$$

and the corresponding forgetting (or discounting) function:

$$E[V(t)] = a\,e^{-ct}, \tag{4}$$

where the parameter c (>0) is usually measured in s⁻¹. Eq. (4) is especially simple and convenient, and exponential functions (or closely related variants) have been amply used in memory research (e.g., Massaro, 1970; Norman and Waugh, 1968; White, 1985, 1991; Wickelgren, 1967).

Nevertheless, Eq. (4) seems to be at odds with many empirical forgetting curves, in which the performance measure decreases sharply at first, but then more slowly than what exponential functions predict (e.g., Anderson and Schooler, 1991; Wickelgren, 1972, 1974, 1975; Wickelgren and Berian, 1971). In a study by Wixted and Ebbesen (1991; Exp. 1), for example, people examined words for 1 or 5 s and attempted to recall them after a delay that varied from 2.5 to 40 s. The proportion of words correctly recalled in the 1-s condition is plotted as a function of delay in Fig. 1 (filled circles). The best-fitting exponential with a = 1 (dotted line) deviates systematically from the data, underpredicting them at long delays. A similar issue arises with respect to delay discounting (e.g., Logue, 1988).

Window models may account for such data by assuming a Weibull density for window length (cf. Wickens, 1998) over (0, +∞),

$$f(w) = c\,\alpha\,(cw)^{\alpha-1}\,e^{-(cw)^{\alpha}}, \tag{5}$$

which entails the forgetting function:

$$E[V(t)] = a\,e^{-(ct)^{\alpha}} \tag{6}$$

where c (>0) is measured in s⁻¹ and α (>0) is dimensionless. Eq. (6) can closely approximate the hyperbola V(t) = a/(1 + kt), which has been used to model forgetting (McCarthy and White, 1987; Staddon, 1983; p. 381; Rubin and Wenzel, 1996) and delay discounting as well (Mazur, 1984, 2001). The solid line in Fig. 1, for example, shows that Eq. (6) can reproduce the type of forgetting curve observed by Wixted and Ebbesen (1991). In fact, Eq. (6) is consistent with the exponential-power function proposed by Wickelgren (1974, 1975) to model both short-term and long-term memory data. Finally, when α = 1, Eq. (6) reverts to the simpler exponential, Eq. (4).

Fig. 1. Example of forgetting data. The data are from Wixted and Ebbesen (1991; Exp. 1, 1-s condition) and appear as filled circles. The dotted line indicates the best-fitting predictions of an exponential density for window size, whereas the solid line shows the predictions of a Weibull density with c = 0.018 and α = 0.18. In both cases, the functions were constrained so as to equal 1 at the origin. Data used with permission.

Whereas the hypothesis of a Weibull density lends flexibility to window models, the exponential function may still be preferred for tractability. Independently of the probability density chosen for W, it is clear that variable windows can predict continuously decreasing forgetting and delay-discounting curves, as long as the dependent variable is averaged over trials or over time. In all cases, a time-based window model with variable length expresses the level of performance at a delay t as the product of a quantity a equal to E[V(0)] and of a decreasing function of delay g(t). Whereas Bogartz (1990) had to assume a multiplicative rule to evaluate the independence of forgetting from the level of performance at time t = 0 (also see White, 1985, 1991), Eq. (2) follows directly from a probabilistic conception of window length, as in discrete state models of memory (e.g., Bernbach, 1967; Kintsch, 1967).

3. More than one event

3.1. Serial versus parallel aggregation

When a response is preceded or followed by more than one event of the same type (Fig. 2), for example, n reinforcers at different delays, window models must face not only the issue of temporal discounting but also that of stimulus aggregation (e.g., Commons et al., 1982). For example, Mazur and Vaughan (1987) distinguished models of reinforcement in which multiple stimuli “act in parallel” from those in which the stimuli “concatenate in a serial manner” (p. 260). Window models imply that all of the stimuli present in the window function as a single aggregate (see Fig. 2; arrow a), for example, a single rate or number of events;
thus, window models imply a serial aggregation of target stimuli in the sense of Mazur and Vaughan (1987; p. 260). In contrast, some models of behavior with multiple, delayed reinforcers assume that the contribution of each event k is discounted by its own delay t_k before being summed or averaged with the others (as in Fig. 2; arrow b). These models therefore involve a parallel aggregation of multiple events (Mazur, 1984, 2001; McDiarmid and Rilling, 1965).

Fig. 2. Serial aggregation (arrow a) vs. parallel aggregation (arrow b). Vertical bars represent stimuli separated by delays t_1, t_2, ..., t_n from current behavior, which occurs at time t_0 = 0. For convenience, the n delays are measured from right to left, whereas time flows from left to right. Figure deals with memory, but could be applied to delayed reinforcement by inverting the time axis.

The main advantage of parallel models is that they display sensitivity to individual delays; for the same global rate or number of events, different delay distributions give rise to different behaviors (see Mazur et al., 1985). Among possible combination rules for the contributions of individual events, an additive rule is the simplest (Shull and Spear, 1987):

$$V(\{t_k\}) = a \sum_{k=1}^{n} g(t_k), \tag{7}$$

where V({t_k}) is the aggregated impact of the series of reinforcers, a a parameter specific to the reinforcer employed and g is a continuous, decreasing function of delay such that g(0) = 1 and g(t) → 0 when t → +∞.

Eq. (7) or variants of it also arise in relation to memory. In the models of Anderson (1982) and Anderson and Schooler (1991), for example, repetitions of the same event add independent increments to a single trace, the strength of which is a sum of contributions that decay over time. Exemplar models of memory (e.g., Hintzman, 1986; Hintzman and Ludlam, 1980; Logan, 1988), in which each repetition of the target event creates its own trace, allow parallel aggregation (Fig. 2; arrow b) and can display properties similar to those of Eq. (7) (see Treisman and Williams, 1984).

3.2. Aggregation in variable windows

Window models with random window length predict Eq. (7) as well, but for quite different reasons. Again, assume that the length of the window has a probability density f over [0, +∞), and for any delay t ≥ 0 define g(t) as in Eq. (1). Also assume that n events have been presented at delays t_1, t_2, ..., t_n and that memory for them is tested at time t_0 = 0 (see Fig. 2). The number N of events present in the window at time t_0 is random, and may equal 0, 1, ..., or n, depending on window length; notice that P(N = k) = g(t_k) − g(t_{k+1}) for 0 ≤ k < n and that P(N = n) = g(t_n). The impact V of window content is also random, and may equal v_0, v_1, ..., or v_n, depending on whether the window contains 0, 1, ..., or n of the target events. If N = 0, the window is empty and therefore v_0 = 0; otherwise v_k > 0 (for 1 ≤ k ≤ n). The expected value E[V] of the behavioral impact is:

$$E[V] = \sum_{k=0}^{n} v_k\,P(N = k) = \sum_{k=1}^{n-1} v_k\,[g(t_k) - g(t_{k+1})] + v_n\,g(t_n). \tag{8}$$

Rearranging:

$$E[V] = g(t_1)\,v_1 + \sum_{k=2}^{n} g(t_k)\,[v_k - v_{k-1}] = \sum_{k=1}^{n} g(t_k)\,[v_k - v_{k-1}] = \sum_{k=1}^{n} g(t_k)\,D(k), \tag{9}$$

where D(k) = v_k − v_{k−1} for k = 1, 2, ..., n.

To obtain predictions from Eq. (9) one needs to specify some functional relation, or O-rule (Baum, 1973), between the number of events in the window and the corresponding impact. The hypothesis that v_k is proportional to k (that is, v_k = a · k with a > 0) entails D(k) = a and

$$E[V] = a \sum_{k=1}^{n} g(t_k), \tag{10}$$

a result which also follows directly from the tail–sum formula for E[aN] = aE[N] (see, for example, Stirzaker, 1999). Eq. (10) mimics an additive model of parallel aggregation (Eq. (7)). Notice that a = D(1) = v_1 and can be interpreted as the effect of the presence of a single event in the window.

Instead of a proportionality relation between v_k and k, one could assume that the v_k function is bounded over the domain of natural numbers. In a maximal number model, for example, the v_k values increase linearly until the number of events in the window reaches n_max, beyond which additional events have no effect. Eq. (10) remains valid in a maximal number model, with the proviso of replacing n by n_max whenever n > n_max. Alternatively, one could assume that the v_k function is bounded, increasing and negatively accelerated, which implies that the D(k) term is decreasing and tends toward 0 when k → ∞. In these conditions, Eq. (9) becomes:

$$E[V] = a \sum_{k=1}^{n} g(t_k)\,I(k), \tag{11}$$

with a = D(1) > 0 and I(k) = D(k)/D(1) for k = 1, 2, ..., n.

Eq. (11) emulates an additive model of parallel aggregation in which the impact of each event is discounted by g(t_k), times an additional reduction that depends on the ordinal position k of the event in the series of target stimuli (see Fig. 2). Accordingly, I(k) behaves like a measure of retroactive interference in which the impact of the kth event on responding deteriorates due to the intercalation of k − 1 subsequent stimuli (e.g., Catania et al., 1988; Keppel, 1968). If, for example, v_k = M(1 − e^{−dk}), then:

$$E[V] = a \sum_{k=1}^{n} g(t_k)\,e^{-d(k-1)}, \tag{12}$$

where a = M(1 − e^{−d}) and M, d > 0. In terms of fit to the data, a relation such as Eq. (12) may be difficult to distinguish from a simple additive model (Eq. (10)). Independently of the chosen v_k function, Eqs. (9)–(12) show that window models with random length can display properties analogous to parallel aggregation.

4. Acquisition, extinction and rate sensitivity

4.1. Acquisition

An important feature of window models is that they provide a ready mechanism for acquisition and extinction (cf. McDowell et al., 1992). Eq. (9) can be applied to acquisition data under the simplifying assumption that acquisition trials are spaced by a constant inter-trial interval T and that the duration of each trial is negligible. In classical and instrumental conditioning, the animal’s performance on trial n is typically measured before the delivery of the unconditional stimulus or reinforcer. Accordingly, behavior on trial n depends only on the previous n − 1 trials, spaced from the current trial by delays of T, 2T, ..., (n − 1)T seconds. Eq. (9) implies that the average behavioral impact B(n) on trial n is:

$$B(n) = \sum_{k=1}^{n-1} g(kT)\,D(k). \tag{13}$$

The growth of the behavioral impact from one trial to another, B(n + 1) − B(n), equals g(nT)D(n), which is a decreasing function of trial number (n) under standard assumptions about v_k values. Thus, Eq. (13) predicts negatively accelerated acquisition curves (e.g., Mazur and Hastie, 1978).

Eq. (13) needs to be slightly modified if behavior on each trial is recorded after the presentation of an unconditional stimulus or incentive, as when a pigeon’s general activity is measured after n periodic deliveries of food (Killeen et al., 1978). In this case, behavior on trial n depends on the current unconditional stimulus as well as on the n − 1 preceding ones, and Eq. (13) becomes:

$$B(n) = \sum_{k=1}^{n} g[(k-1)T]\,D(k). \tag{14}$$

An important special case of Eq. (14) involves an exponential density for window size (Eq. (3)) and a proportional relation between the content of the window and the associated impact (v_k = a · k with a > 0).
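
As a rough numerical check of the aggregation and acquisition formulas above, the following Python sketch compares Eq. (8) with its rearranged form, Eq. (9), and evaluates the acquisition sum of Eq. (14). It assumes an exponential density for window length, so that g(t) = e^{−ct}, and a proportional impact v_k = a · k. The function names and all parameter values (c, a, T, the delays) are hypothetical and chosen only for illustration.

```python
import math


def g(t, c):
    """Survival function of an exponential window length: P(t <= W) = exp(-c t)."""
    return math.exp(-c * t)


def expected_impact_eq8(delays, v, c):
    """E[V] as a sum over the number N of events in the window (Eq. (8)).

    delays: increasing delays t_1 < ... < t_n; v: impacts v_0, ..., v_n.
    """
    n = len(delays)
    total = 0.0
    for k in range(n + 1):
        g_k = 1.0 if k == 0 else g(delays[k - 1], c)   # g(t_0) = g(0) = 1
        g_next = 0.0 if k == n else g(delays[k], c)    # so that P(N = n) = g(t_n)
        total += v[k] * (g_k - g_next)
    return total


def expected_impact_eq9(delays, v, c):
    """E[V] as a sum of g(t_k) * D(k), with D(k) = v_k - v_(k-1) (Eq. (9))."""
    return sum(g(t_k, c) * (v[k] - v[k - 1])
               for k, t_k in enumerate(delays, start=1))


def acquisition_eq14(n, T, v, c):
    """B(n) = sum over k = 1..n of g((k - 1) T) * D(k) (Eq. (14))."""
    return sum(g((k - 1) * T, c) * (v[k] - v[k - 1]) for k in range(1, n + 1))


# Hypothetical parameter values, for illustration only.
c, a, T = 0.1, 2.0, 5.0
delays = [1.0, 4.0, 9.0, 20.0]
v = [a * k for k in range(len(delays) + 1)]        # proportional rule v_k = a k

ev8 = expected_impact_eq8(delays, v, c)
ev9 = expected_impact_eq9(delays, v, c)
ev10 = a * sum(g(t_k, c) for t_k in delays)        # additive form, Eq. (10)
print(ev8, ev9, ev10)

v_trials = [a * k for k in range(21)]
curve = [acquisition_eq14(n, T, v_trials, c) for n in range(1, 21)]
increments = [b2 - b1 for b1, b2 in zip(curve, curve[1:])]
print([round(b, 3) for b in curve[:5]])
```

With these values the three expressions for E[V] agree to floating-point precision, and the successive increments of B(n) shrink from trial to trial, the negatively accelerated pattern described for Eq. (13). The sketch uses the special case just introduced, an exponential density with v_k = a · k.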



In these conditions:

$$B(n) = a \sum_{k=1}^{n} e^{-c(k-1)T} = a\,\frac{1 - e^{-cnT}}{1 - e^{-cT}}. \tag{15}$$

Eq. (15) is identical to Eq. (8) by Killeen et al. (1978), which constitutes the kernel of their model for acquisition curves under periodic feedings. In this case, the predictions of a window model coincide with those of a theory based on the cumulation of arousal (Killeen et al., 1978) or an exponentially weighted moving average (e.g., Killeen, 1982). This formal equivalence would be impossible under the assumption of fixed window length.

4.2. Extinction

Another prediction of window models is that performance should decrease in extinction. Assume that extinction takes place after n conditioning trials (Fig. 2). After t seconds of extinction, Eq. (9) implies that the average behavioral impact B(n, t) will be:

$$B(n, t) = \sum_{k=1}^{n} g(t_k + t)\,D(k), \tag{16}$$

where each delay t_k corresponds to a different conditioning trial. B(n, t) is decreasing in t and tends toward 0 when t → +∞.

The predictions about extinction derived from an exponential density for window length are particularly straightforward. After t s of extinction:

$$B(n, t) = \sum_{k=1}^{n} e^{-c(t_k + t)}\,D(k) = e^{-ct} \sum_{k=1}^{n} e^{-ct_k}\,D(k) = e^{-ct}\,B(n, 0), \tag{17}$$

where B(n, 0) denotes the expected impact at the start of extinction (for example, as given by Eq. (14)). The exponential curves predicted by Eq. (17) are sometimes observed (e.g., Clark, 1959), but in other circumstances extinction curves tend to be ogival (Killeen, 1982). The latter cannot arise from an exponential density for window length (or a Weibull density with α ≤ 1), unless an additional transformation is imposed on B(n, t). This transformation might involve inertia due to the ongoing response rate (Staddon, 1975) or other sources of non-linearity.

4.3. Asymptotic predictions

Following the approach of Killeen et al. (1978), the equations for acquisition and extinction can be combined to predict steady-state performance under interval schedules of reinforcement. Assume for simplicity that the reinforcers are delivered at regular intervals, each interval being equal to 1/r (where r is the reinforcement rate). After n reinforcers, the expected behavioral impact B(n) = B(n, 0) is given by Eq. (14) (with T = 1/r), and Eq. (16) applies to B(n, t) during the 1/r seconds of the next inter-reinforcement interval. Hence, the time average of the expected impact B(n, t) in this interval, with t in [0, 1/r], is:

$$\bar{B}(n) = r \int_0^{1/r} B(n, t)\,dt, \tag{18}$$

and the asymptotic impact:

$$b = \lim_{n \to \infty} \bar{B}(n) = r \cdot \lim_{n \to \infty} \int_0^{1/r} B(n, t)\,dt, \tag{19}$$

assuming that the limit exists. If the probability density for window length is exponential, Eq. (17) implies that B(n, t) = e^{−ct} B(n, 0), and Eq. (19) becomes:

$$b = r \cdot \lim_{n \to \infty} \int_0^{1/r} e^{-ct}\,B(n, 0)\,dt = \frac{r}{c}\,(1 - e^{-c/r}) \lim_{n \to \infty} B(n, 0). \tag{20}$$

The B(n, 0) term in Eq. (20) is identical to B(n) in Eq. (14) with T = 1/r, and different relations between window content and behavior will give rise to different predictions about asymptotic response rates. If the relation between window content and behavioral impact is a proportional one (v_k = a · k), for example, then B(n, 0) is given by Eq. (15) and

$$b = \frac{r}{c}\,(1 - e^{-c/r}) \lim_{n \to \infty} a\,\frac{1 - e^{-cn/r}}{1 - e^{-c/r}} = \frac{a}{c}\,r. \tag{21}$$

If response rate is proportional to b, Eq. (21) predicts a linear relation between asymptotic reinforcement and response rate. Eq. (21), previously derived by Killeen et al. (1978) from their model of cumulative arousal, was
491 found adequate for some adjunctive behaviors, but de- (1970) hyperbola account for an average of 88 and 89% 513

492 fective with respect to operant responses such as lever of the data variance, respectively (both fare poorly on 514

493 pressing (see Killeen et al., 1978; p. 579). On variable- the data of pigeons 121 and 129). 515

494 interval schedules, for example, the relation between Finally, data such as those of Fig. 3 can also arise 516

495 reinforcer and response rates typically is negatively ac- from a negatively accelerated relation between win- 517

496 celerated (Catania and Reynolds, 1968). dow content and impact. Assume for example that 518

497 Although Eq. (21) predicts a linear relation between vk = M(1 − e−dk ). Then Eq. (20) becomes: 519

498 reinforcer and response rates, negatively accelerated n


r 
499 curves follow directly from a maximal number model b = (1 − e−c/r ) lim a e−c(k−1)/r e−d(k−1) 520
500 of window impact (vk = a · k for k ≤ nmax and vk = c n→∞
k=1
501 a · nmax for k > nmax ). In a maximal number model:
r 1
r 1 − e−cnmax /r = (1 − e−c/r ) a , (23) 521

502 b = (1 − e−c/r ) a = k1 r(1 − e−k2 /r ), c 1 − e−(c/r + d)


c 1 − e−c/r
with a = M(1 − e−d ), as in Eq. (12). Replacing each
F

522
503 (22)
term of the form 1 − e−x by x, Eq. (23) can be simpli- 523

504 with k1 = a/c and k2 = c·nmax Eq. (22) defines a neg- fied: 524
OO

505 atively accelerated function of reinforcer rate such Md Mr


506 that b → 0 when r → 0 and b → k1 k2 (=a·nmax ) when ∼
b= = . (24) 525
(c/r) + d r + (c/d)
507 r → ∞. As an example, Fig. 3 shows the fit of Eq. (22)
508 to the data of Catania and Reynolds (1968) on variable- Eq. (24) gives a close approximation to Eq. (23) over 526

509 interval responding in pigeons (the response was key the range of reinforcer rates shown in Fig. 3, as long 527

pecking, and the reinforcer, access to grain). The best- as d remains small (for example, d < 0.10). Eq. (24)
PR

510 528

511 fitting values of k1 and k2 appear in parentheses under is formally equivalent to Herrnstein’s (1970) hyper- 529

512 each curve in the order k1 , k2 . Eq. (22) and Herrnstein’s bola. With c = 1, this equation was previously derived 530
D
TE
EC
RR
CO

Fig. 3. Steady-state relation between reinforcement and response rates. The data are from Catania and Reynolds (1968; Exp. 1). Each panel
shows the data of an individual pigeon (filled circles) and the best-fitting predictions of Eq. (22) (solid lines). The corresponding values of k1
and k2 appear in parentheses, in this order, under each curve. Data used with permission.


by Killeen (1982; p. 177) from his model of stimulus averaging. The parameter $d$ in Eq. (24) is similar to Killeen's $D$ and should be sensitive to the same independent variables (for example, the duration and size of the reinforcer), consistent with the model of retroactive interference adopted in Eq. (12). Of course, deriving a negatively accelerated relation with respect to reinforcement rate (Eq. (24)) from a similar assumption about reinforcement number may seem circular. The important point, though, is that window models can predict asymptotic relations about reinforcer rates while displaying sensitivity to delay (Eq. (2)) and retaining the ability to emulate parallel aggregation (which is neither circular nor trivial).

5. Conclusion

Window models with variable window length are more powerful than their fixed-length counterparts with respect to a number of empirical issues. The hypothesis of random window length allows for continuous forgetting and discounting functions, and simulates additive rules of parallel aggregation. The further assumption of an exponential density for window length predicts gradual acquisition curves identical to those reported by Killeen et al. (1978), and leads to steady-state relations between reinforcement and responding that can be similar or equivalent to Herrnstein's (1970) hyperbola.

In spite of these positive features, some characteristics of window models remain problematic. Firstly, window models produce the results discussed above only under a probabilistic interpretation for window length. This interpretation is reasonable when the dependent variable consists of a rate or proportion, but less so in other cases (e.g., Mazur, 1984). Even when dependent variables are averaged, an analysis in terms of probability density, although formally feasible, may seem contrived. This would be the case, for example, if the probability density necessary to fit the data did not consist of an easily interpretable or well-understood form (such as an exponential or Weibull density).

Secondly, although variable windows can provide good fits to some data, whether these models can accommodate a broad range of findings with consistent values of the free parameters remains to be seen (cf. Gallistel, 1993; p. 417). One way to justify systematic variations in parameter values from one data set to another would be to assume different types of windows for different types of events, but this assumption may seem ad hoc.

Finally, window models may be incompatible with some empirical findings even under the assumption of variable window length. Acquisition data appear to be crucial in this respect. According to window models, for example, the same $g(t_k)$ function is involved in both acquisition and forgetting; hence, rapid acquisition should be correlated with slow forgetting. It is not clear that this prediction holds. A related issue concerns spacing effects in acquisition (e.g., Hintzman, 1976). That events more widely spread in time can be better remembered (e.g., Anderson and Schooler, 1991) challenges window models and suggests that the local conditions of presentation of individual stimuli should be taken into account.

If these difficulties are confirmed, molar approaches to behavior and direct memory (Marr, 1984; Watkins, 1981; White, 2001) will need to devise sound theoretical alternatives to rates, counts and windows. As the present analysis shows, any such alternative should display sensitivity to delay as well as properties analogous to parallel aggregation.

Acknowledgements

Part of this analysis was presented as a poster at the Annual Meeting of the Society for the Quantitative Analyses of Behavior (Boston, MA, May 2004). I thank Randy Grace, Peter Killeen and Anthony McLean for their useful comments.

References

Anderson, J.R., 1982. Acquisition of cognitive skill. Psychol. Rev. 89, 369–406.
Anderson, J.R., Schooler, L.J., 1991. Reflections of the environment in memory. Psychol. Sci. 2, 396–408.
Baum, W.M., 1973. The correlation-based law of effect. J. Exp. Anal. Behav. 20, 137–153.
Bernbach, H.A., 1967. Decision processes in memory. Psychol. Rev. 74, 462–480.
Bogartz, R.S., 1990. Evaluating forgetting curves psychologically. J. Exp. Psychol. Learn. Mem. Cog. 16, 138–148.
Bower, G.H., 1967. A multicomponent theory of the memory trace. In: Spence, K.W., Spence, J.T. (Eds.), The Psychology of Learning and Motivation, vol. 1. Academic Press, New York, pp. 229–325.
Catania, A.C., Reynolds, G.S., 1968. A quantitative analysis of the responding maintained by interval schedules of reinforcement. J. Exp. Anal. Behav. 11, 327–383.
Catania, A.C., Sagvolden, T., Keller, K.J., 1988. Reinforcement schedules: Retroactive and proactive effects of reinforcers inserted into fixed-interval performances. J. Exp. Anal. Behav. 49, 49–73.
Clark, F.C., 1959. Some quantitative properties of operant extinction data. Psychol. Rep. 5, 131–139.
Commons, M.L., Woodford, M., Ducheny, J.R., 1982. How reinforcers are aggregated in reinforcement-density discrimination and preference experiments. In: Commons, M.L., Herrnstein, R.J., Rachlin, H. (Eds.), Quantitative Analyses of Behavior. Matching and Maximizing Accounts, vol. 2. Ballinger, Cambridge, MA, pp. 25–78.
Gallistel, C.R., 1993. The Organization of Learning. MIT Press, Cambridge, MA.
Herrnstein, R.J., 1970. On the law of effect. J. Exp. Anal. Behav. 13, 243–266.
Herrnstein, R.J., 1982. Melioration as behavioral dynamism. In: Commons, M.L., Herrnstein, R.J., Rachlin, H. (Eds.), Quantitative Analyses of Behavior. Matching and Maximizing Accounts, vol. 2. Ballinger, Cambridge, MA, pp. 433–458.
Hintzman, D.L., 1976. Repetition and memory. In: Bower, G.H. (Ed.), The Psychology of Learning and Motivation, vol. 10. Academic Press, New York, pp. 47–91.
Hintzman, D.L., 1986. "Schema abstraction" in a multiple-trace memory model. Psychol. Rev. 93, 411–428.
Hintzman, D.L., Ludlam, G., 1980. Differential forgetting of prototypes and old instances: simulation by an exemplar-based classification model. Mem. Cog. 8, 378–382.
Keppel, G., 1968. Retroactive and proactive inhibition. In: Dixon, T.R., Horton, D.L. (Eds.), Verbal Behavior and General Behavior Theory. Prentice-Hall, Englewood Cliffs, NJ, pp. 172–213.
Killeen, P.R., 1982. Incentive theory. In: Bernstein, D.J. (Ed.), Nebraska Symposium on Motivation, 1981: Response Structure and Organization. University of Nebraska Press, Lincoln, NE, pp. 169–216.
Killeen, P.R., 1994. Mathematical principles of reinforcement. Behav. Brain Sci. 17, 105–172.
Killeen, P.R., Hanson, S.J., Osborne, S.R., 1978. Arousal: its genesis and manifestation as response rate. Psychol. Rev. 85, 571–581.
Killeen, P.R., Smith, J.P., 1984. Perception of contingency in conditioning: scalar timing, response bias, and erasure of memory by reinforcement. J. Exp. Psychol. Anim. Behav. Process. 10, 333–345.
Kintsch, W., 1967. Memory and decision aspects of recognition learning. Psychol. Rev. 74, 496–504.
Laming, D., Scheiwiller, P., 1985. Retention in perceptual memory: a review of models and data. Percept. Psychophys. 37, 189–197.
Logan, G.D., 1988. Toward an instance theory of automatization. Psychol. Rev. 95, 492–527.
Logue, A.W., 1988. Research on self-control: an integrating framework. Behav. Brain Sci. 11, 665–709.
Marr, M.J., 1984. Conceptual approaches and issues. J. Exp. Anal. Behav. 42, 353–362.
Massaro, D.W., 1970. Perceptual processes and forgetting in memory tasks. Psychol. Rev. 77, 557–567.
Mazur, J.E., 1984. Tests of an equivalence rule for fixed and variable reinforcer delays. J. Exp. Psychol. Anim. Behav. Process. 10, 426–436.
Mazur, J.E., 2001. Hyperbolic value addition and general models of animal choice. Psychol. Rev. 108, 96–112.
Mazur, J.E., Hastie, R., 1978. Learning as accumulation: a reexamination of the learning curve. Psychol. Bull. 85, 1256–1274.
Mazur, J.E., Herrnstein, R.J., 1988. On the functions relating delay, reinforcer value, and behavior. Behav. Brain Sci. 11, 690–691.
Mazur, J.E., Snyderman, M., Coe, D., 1985. Influences of delay and rate of reinforcement on discrete-trial choice. J. Exp. Psychol. Anim. Behav. Process. 11, 565–575.
Mazur, J.E., Vaughan Jr., W., 1987. Molar optimization versus delayed reinforcement as explanations of choice between fixed-ratio and progressive-ratio schedules. J. Exp. Anal. Behav. 48, 251–261.
McCarthy, D., White, K.G., 1987. Behavioral models of delayed detection and their application to the study of memory. In: Commons, M.L., Mazur, J.E., Nevin, J.A., Rachlin, H. (Eds.), Quantitative Analyses of Behavior. The Effect of Delay and of Intervening Events on Reinforcement Value, vol. 5. Erlbaum, Hillsdale, NJ, pp. 29–54.
McDiarmid, C.G., Rilling, M.E., 1965. Reinforcement delay and reinforcement rate as determinants of schedule preference. Psychon. Sci. 2, 195–196.
McDowell, J.J., Bass, R., Kessel, R., 1992. Applying linear systems analysis to dynamic behavior. J. Exp. Anal. Behav. 57, 377–391.
Mendenhall, W., Sincich, T., 1995. Statistics for Engineering and the Sciences. Prentice-Hall, Englewood Cliffs, NJ.
Norman, D.A., Waugh, N.C., 1968. Stimulus and response interference in recognition–memory experiments. J. Exp. Psychol. 78, 551–559.
Peele, D.B., Casey, J., Silberberg, A., 1984. Primacy of interresponse-time reinforcement in accounting for rate differences under variable-ratio and variable-interval schedules. J. Exp. Psychol. Anim. Behav. Process. 10, 149–167.
Prelec, D., Herrnstein, R.J., 1978. Feedback functions for reinforcement: a paradigmatic experiment. Anim. Learn. Behav. 6, 181–186.
Rachlin, H., 1982. Absolute and relative consumption space. In: Bernstein, D.J. (Ed.), Nebraska Symposium on Motivation, 1981: Response Structure and Organization. University of Nebraska Press, Lincoln, NE, pp. 129–167.
Rachlin, H., Battalio, R., Kagel, J., Green, L., 1981. Maximization theory in behavioral psychology. Behav. Brain Sci. 4, 371–417.
Rubin, D.C., Wenzel, A.E., 1996. One hundred years of forgetting: a quantitative description of retention. Psychol. Rev. 103, 734–760.
Shull, R.L., Spear, D.J., 1987. Detention time after reinforcement: effects due to delay of reinforcement? In: Commons, M.L., Mazur, J.E., Nevin, J.A., Rachlin, H. (Eds.), Quantitative Analyses of Behavior. The Effect of Delay and of Intervening Events on Reinforcement Value, vol. 5. Erlbaum, Hillsdale, NJ, pp. 187–204.


Staddon, J.E.R., 1975. Learning as adaptation. In: Estes, W.K. (Ed.), Handbook of Learning and Cognitive Processes, vol. 2. Erlbaum, Hillsdale, NJ, pp. 37–98.
Staddon, J.E.R., 1983. Adaptive Behavior and Learning. Cambridge University Press, New York.
Staddon, J.E.R., 1988. Quasi-dynamic choice models: melioration and ratio invariance. J. Exp. Anal. Behav. 49, 303–320.
Stirzaker, D., 1999. Probability and Random Variables: a Beginner's Guide. Cambridge University Press, New York.
Treisman, M., Williams, T.C., 1984. A theory of criterion setting with an application to sequential dependencies. Psychol. Rev. 91, 68–111.
Turvey, M.T., 1977. Contrasting orientations to the theory of visual information processing. Psychol. Rev. 84, 67–88.
Valone, T.J., 1992. Patch estimation via memory windows and the effect of travel time. J. Theor. Biol. 157, 243–251.
Vaughan Jr., W., 1981. Melioration, matching, and maximization. J. Exp. Anal. Behav. 36, 141–149.
Watkins, M.J., 1981. Human memory and the information-processing metaphor. Cognition 10, 331–336.
Wearden, J.H., Clark, R.B., 1988. Interresponse-time reinforcement and behavior under aperiodic reinforcement schedules: a case study using computer modeling. J. Exp. Psychol. Anim. Behav. Process. 14, 200–211.
Wearden, J.H., Clark, R.B., 1989. Constraints on the process of interresponse-time reinforcement as the explanation of variable-interval performance. Behav. Process. 20, 151–175.
White, K.G., 1985. Characteristics of forgetting functions in delayed matching to sample. J. Exp. Anal. Behav. 44, 15–34.
White, K.G., 1991. Psychophysics of direct remembering. In: Commons, M.L., Nevin, J.A., Davison, M.C. (Eds.), Signal Detection: Mechanisms, Models, and Applications. Erlbaum, Hillsdale, NJ, pp. 221–237.
White, K.G., 2001. Forgetting functions. Anim. Learn. Behav. 29, 193–207.
Wickelgren, W.A., 1967. Exponential decay and independence from irrelevant associations in short-term recognition memory for serial order. J. Exp. Psychol. 73, 165–171.
Wickelgren, W.A., 1972. Trace resistance and the decay of long-term memory. J. Math. Psychol. 9, 418–455.
Wickelgren, W.A., 1974. Single-trace fragility theory of memory dynamics. Mem. Cog. 2, 775–780.
Wickelgren, W.A., 1975. Alcoholic intoxication and memory storage dynamics. Mem. Cog. 3, 385–389.
Wickelgren, W.A., Berian, K.M., 1971. Dual trace theory and the consolidation of long-term memory. J. Math. Psychol. 8, 404–417.
Wickens, T.D., 1998. On the form of the retention function: comment on Rubin and Wenzel (1996). A quantitative description of retention. Psychol. Rev. 105, 379–386.
Williams, B.A., 1988. Reinforcement, choice, and response strength. In: Atkinson, R.C., Herrnstein, R.J., Lindzey, G., Luce, R.D. (Eds.), Stevens' Handbook of Experimental Psychology. Learning and Cognition, vol. 2, second ed. Wiley, New York, pp. 167–244.
Wixted, J.T., 1990. Analyzing the empirical course of forgetting. J. Exp. Psychol. Learn. Mem. Cog. 16, 927–935.
Wixted, J.T., Ebbesen, E.B., 1991. On the form of forgetting. Psychol. Sci. 2, 409–415.

