
This article appeared in a journal published by Elsevier.

Author's personal copy

Available online at www.sciencedirect.com

Behavioural Processes 78 (2008) 302–309

Short report

A simultaneous procedure facilitates acquisition under an optimal interstimulus interval in artificial neural networks and rats
José E. Burgos ∗, Carlos Flores, Óscar García, Carlos Díaz, Yuria Cruz
University of Guadalajara-CEIC, Francisco de Quevedo 180, Col. Arcos de Vallarta, Guadalajara, Jalisco 41130, Mexico
Received 7 September 2007; accepted 28 February 2008

Abstract
In a computer simulation, a neural network first received a simultaneous procedure, where the interstimulus interval (ISI) was 0 time-steps (ts). Output activations were near zero under this procedure. The network then received a forward-delay procedure where the ISI was 8 ts. Output activations increased to the near-maximum level faster than those of a control network that first received an explicitly unpaired procedure. Comparable results were obtained with rats that first received trials where a retractable lever was presented for 3 s concurrently with access to water. Little lever pressing was observed under this procedure. The rats then received trials where the lever was followed 15 s later by water. Lever pressing appeared faster than in a control group that received the 15-s ISI after an explicitly unpaired procedure. The model used in the simulation explains these results as connection-weight increments that promote little output activation in a simultaneous procedure, but facilitate acquisition under an optimal ISI.
© 2008 Elsevier B.V. All rights reserved.

Keywords: Simultaneous conditioning; Neural networks; Rats; ISI; Learning; Performance; Pavlovian conditioning

∗ Corresponding author. Tel.: +52 33 38180735.
E-mail addresses: jburgos@cucba.udg.mx (J.E. Burgos), carlos.flores@cucba.udg.mx (C. Flores), oscargl@cencar.udg.mx (Ó. García).
0376-6357/$ – see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.beproc.2008.02.018

1. Introduction

This paper is a study of simultaneous Pavlovian conditioning from the perspective of a neural-network model that draws on knowledge from neuroscience. In a simultaneous procedure, the onset of the conditioned stimulus (CS) coincides with the onset of the unconditioned stimulus (US). More precisely, the interval between the CS onset and the US onset, or interstimulus interval (ISI), is zero or close to zero. The evidence shows that this procedure promotes negligible or no conditioned responding (CR) (e.g., Bevins and Ayres, 1995; Gibbon et al., 1977; Hawkins et al., 1986; McAllister, 1953; Schneiderman, 1966; Schneiderman and Gormezano, 1964; Smith et al., 1969; Yeo, 1974).

Two sorts of accounts of this phenomenon have been offered. On one sort of account, performance is assumed to be largely isomorphic to learning (except perhaps for conditioned inhibition, which raises special issues we need not address here). Negligible CR under a simultaneous procedure is thus explained in terms of negligible learning. This sort of account is found in the model of Rescorla and Wagner (1972), and has influenced neural-network modeling of Pavlovian conditioning.

Sutton and Barto (1981), for instance, proposed a real-time extension of the Rescorla–Wagner model. The learning rule of the Sutton–Barto model, used to change the connection weights of the neural element, includes factors that represent stimulus traces or after-discharges that are left on an element by activations that represent the CS and the US. Such traces are neural activations that persist after the stimuli have ended, and are directly proportional to the stimulus' duration. Typically, learning in this element occurs when the CS and US traces overlap in a certain temporal relationship that obtains only under a forward procedure. The authors show a simulation where a simultaneous training protocol, defined as an ISI of 0 time-steps (ts), resulted in a near zero asymptotic associative strength (connection weight). On this model, then, a simultaneous procedure causes a severe learning deficit, expressed in an equally severe performance (output-activation) deficit.

On the second sort of account, performance is assumed to be non-isomorphic to learning. Certain performance deficits are thus explained as failures in the expression of learning into observable behavior, rather than learning failures. This idea was anticipated by classical learning theorists (e.g., Guthrie, 1935), and has received renewed attention more recently (e.g., Donahoe and Vegas, 2004; Rescorla, 1980). The idea is described mathematically in the so-called “comparator hypothesis” (e.g., Gallistel and Gibbon, 2000; Gibbon and Balsam, 1981; Miller and Matzel, 1988; Miller and Schachtman, 1985; Stout and Miller, 2007), in terms of a non-linear response rule that involves complex transformations of associative strength into performance measures (e.g., reinforced trials to acquisition).
Recent studies of simultaneous conditioning provide evidence that supports this sort of account. The evidence from these studies shows that a CS that has been previously presented with a US according to a simultaneous procedure can be successfully used as a conditioned reinforcer in a second-order conditioning test (e.g., Barnet et al., 1991; Matzel et al., 1988). The implication is that simultaneous conditioning promotes learning that is substantial enough to be expressed as a positive transfer in a succeeding task. This sort of account explains the low CR observed under a simultaneous procedure as a performance deficit.

The present model shares the second account's central idea that learning is not isomorphic to performance. However, there is a key conceptual difference, namely: The learning-performance distinction in the present model is made in a connectionist manner, in terms of the distinction between activations and connection weights. This distinction is fundamental to all neural-network models. An activation is the state of a neural element at a moment in time, as determined by an activation rule or function (see Appendix A for the activation function of the present model). A connection weight, in contrast, is the strength of the connection between two neural elements, which determines the efficacy with which one element can activate another.

In neural-network models, learning is conceived as a change in one or more connection weights, according to a learning function (see Appendix A for the learning rule of the present model). Learning can thus be said to be distributed throughout a network's connections. Performance, in contrast, is conceived as the activation of output elements, as functionally related to the activation of input elements. This connectionist interpretation of the learning-performance distinction can be extended to the notions of a learning and performance deficit. A learning deficit can be viewed as a negligible change in connection weights. A performance deficit can be conceived as an output activation below some response criterion (0.5 has been used in previous simulations).

2. Neural-network simulation

We need not rehearse the model's neuroscientific rationale, the behavioral phenomena it can simulate, or how it differs from other models, for they have been discussed at length elsewhere (Burgos, 1997, 2003, 2005, 2007; Burgos and Donahoe, 2000; Burgos and Murillo-Rodríguez, 2007; Donahoe et al., 1993; Donahoe and Palmer, 1994; Donahoe and Burgos, 1999, 2000; Donahoe et al., 1997a,b). Here we just summarize some of the model's main features.

Two instances of the architecture shown in Fig. 1 were used. The circles represent neural elements, the arrows connections. The small circles represent input elements, whose activations represent exteroceptive stimuli. The network has one input element for the CS (open small circle labeled as “CS”), one for the context (open small circle labeled as “CTX”), and one for the US (closed small circle labeled as “US”). The model imposes no principled restriction on the number of activated inputs that can represent a stimulus. The use of only one input element per stimulus in this architecture, then, is just a convenient simplification, not a theoretical claim. The input elements are activated according to a prespecified training protocol that represents a Pavlovian procedure. The rest of the elements are activated according to the activation function.

Fig. 1. Neural-network architecture used in the simulation. Neural elements (represented as circles) are organized from left to right into input, hidden, and output layers. Activations propagate only in that direction. The architecture has three input elements (small circles), one for the CS (small open circle labeled as “CS”), one for the context (small open circle labeled as “CTX”), and one for the US (small closed circle labeled as “US”). The hidden units are sa (sensory association), ca1 (Cornu Ammonis 1), ma (motor association), and vta (ventral-tegmental area). The output element is labeled as CR/UR. Connections are represented as solid arrows. Thin solid arrows represent modifiable connections that are initially weak. Thick solid arrows represent unmodifiable connections that are initially maximally strong. See text for more details.

The layers labeled as “sa” (for “sensory association”) and “ma” (for “motor association”) represent hidden layers, which mediate between the input and the output layers. The element labeled as “CR/UR” (for “conditioned response/unconditioned response”) represents the output element. The reason for this label is that this element can be activated either directly by the US element or indirectly by the CS element via the sa and ma elements. The first form of activation represents an unconditioned response in that it requires no learning for it to occur at a substantial level, because it is mediated by a maximally strong, unmodifiable connection (thick arrow from US to CR/UR). The second form of activation represents a CR in that it requires learning (a change in connection weights) for it to be substantial, because it is mediated by initially weak, modifiable connections (thin solid arrows). Modifiable connections change according to a learning function that includes synaptic competition and a diffuse signal that depends upon the activations of the elements labeled as “ca1” (for “Cornu Ammonis 1”) and “vta” (for “ventral-tegmental area”), which are intended to represent hippocampal and dopaminergic systems, respectively.

The networks were naïve in that all their initial connection weights were set to 0.01, the value used in most other simulations with this model (cf. Burgos, 2003, 2005). One network
(S-FD) received in Phase 1 a protocol that simulated a simultaneous procedure (S). The network received 200 trials where the CS and US elements were concurrently and maximally activated (activation level of 1.0) for the last 3 ts of a 60-ts cycle. During the entire cycle, the CTX unit was maximally activated to simulate a context, for a fixed intertrial interval (ITI) of 57 ts. S-FD was then given 10 CS-alone test trials where the learning function was disabled, in order to preserve the final connection weights from Phase 1 and assess the effect of the procedure without any learning. In Phase 2, S-FD received a training protocol that simulated a forward-delay procedure (FD) with an optimal ISI. The network was given 200 CS–US pairings where the CS element was maximally activated for the last 9 ts and the US element was maximally activated at the last ts of the 60-ts cycle, for an interstimulus interval (ISI) of 8 ts. The context was again simulated by activating CTX maximally throughout the entire cycle of 60 ts, for an ITI of 51 ts.

Another network with the same architecture (EU-FD) received in Phase 1 a training protocol that simulated the single alternation variant of the explicitly unpaired procedure (EU). The US element was activated once in the middle of the 60-ts cycle (at ts 30). The CS and US were thus separated by a fixed interval of 29 ts. In Phase 2, EU-FD received the same forward-delay procedure that was given to S-FD in Phase 2. All free parameters of the activation and learning functions were the same as in previous simulations (see Appendix A).

Fig. 2 shows the results. The upper left panel depicts the CR/UR activation of S-FD at ts 59 for the 10 test trials after Phase 1 (the simultaneous procedure). It can be observed that the CR/UR activation was close to zero, which represents a severe performance deficit. The upper right panel depicts the CR/UR activation of S-FD during Phase 2 (the forward-delay procedure with an optimal ISI). It can be observed that the number of reinforced trials before the CR/UR activation reached the maximum level was well below 25. This number contrasts with the 150 trials that it took the CR/UR element of EU-FD to reach the maximum activation in Phase 2, as shown in the lower right panel of the figure. Initial exposure to the simultaneous procedure thus facilitated CR acquisition under an optimal ISI.

Fig. 2. CR/UR activations at ts 59 for networks S-FD (upper panels) and EU-FD (lower panels). The upper left panel shows the activations during the test trials given to S-FD immediately after Phase 1 (simultaneous procedure) and before Phase 2 (forward-delay procedure). The lower left panel shows the activations during the test trials given to EU-FD immediately after Phase 1 (explicitly unpaired procedure) and before Phase 2 (forward-delay procedure). The right panels show the output activations during Phase 2.

The results are consistent with those reported in experiments with animals where second-order conditioning tests have been used (e.g., Barnet et al., 1991; Matzel et al., 1988). However, our simulation used an acquisition test, so it is procedurally more comparable to animal studies like those reported by Ross and Scavio (1983), and Salafia et al. (1980). However, neither study

reported a positive transfer from a short to an optimal ISI. This result is in disagreement not only with the present results, but also with the evidence from experiments where second-order conditioning tests have been used. One possible explanation of this discrepancy is that the preparations used in the experiments are different. Second-order conditioning tests have been typically used with conditioned suppression in rats. Ross and Scavio, as well as Salafia et al., in contrast, used the preparation of the nictitating-membrane response (NMR) in rabbits. It remains to be seen whether this discrepancy remains with a second-order conditioning test using the NMR preparation.

The observed positive transfer in the networks was due to an increment in the CS-connection weights under the simultaneous procedure for S-FD. Fig. 3 shows the mean initial and final weights for certain connections in Phases 1 and 2 for both networks. At the end of the simultaneous procedure, S-FD's CS–sa weights increased to near-maximum levels (second bar from left to right). This increment is in sharp contrast to the near-zero CS–sa weights observed at the end of the explicitly unpaired procedure for EU-FD (middle bar). The procedure thus induced a weight loss in this connection, due to the presentation of the CS alone (the same mechanism accounts for latent inhibition in this model, as reported in Burgos, 2003). Exposure to the simultaneous procedure thus gave S-FD a sensory learning advantage over EU-FD.

Fig. 3. Mean initial (leftmost bar) and final weights for CS–sa and sa–ma connections of S-FD and EU-FD at the end of Phases 1 and 2. The phase numbers are indicated in parentheses. CS–sa: connections from the CS input to the sa element. sa–ma: connections from the sa to the ma elements (see Fig. 1). The error bars represent standard errors.

Such advantage, however, was insufficient for S-FD to respond under the simultaneous procedure, because the procedure did not promote a sufficient increment in the sa–ma connection weights. There was some increment in these weights at the end of the simultaneous procedure for S-FD (fourth bar; a comparable increment was observed in EU-FD at the end of the explicitly unpaired procedure). However, this increment was not sufficient for S-FD to respond substantially under the simultaneous procedure. More substantial responding required the increment observed in these weights at the end of the forward-delay procedure (Phase 2) for S-FD (fifth bar). A paired t-test on the last two bars of the figure showed that the difference was significant (t(8) = −12.78, p < .0001).

3. Animal experiment

In this section, we describe an animal experiment that is more comparable to the preceding simulation in that it used an acquisition instead of a second-order conditioning transfer test. This feature will allow us to determine the generality of evidence for simultaneous conditioning reported with second-order conditioning tests. Also, as a preparation we used autoshaping of the lever-press response in rats, which allowed us to determine the generality of the evidence obtained with other preparations, such as conditioned suppression of the lever-press and lick responses.

3.1. Method

3.1.1. Subjects
Sixteen Wistar female rats (Rattus norvegicus) of 3 months of age with no prior history of responding on any experimental procedure served as subjects. Between sessions, rats were individually housed with free access to food in a temperature-controlled colony under a 12:12 h light/dark cycle. They were maintained on a regime of pre-session water deprivation of 23.5 h and 30 min of post-session access to water.

3.1.2. Apparatus
Experimental sessions were conducted in five identical MED Associates modular test chambers (305 mm long, 241 mm wide, and 210 mm high), each enclosed in a sound- and light-attenuating box equipped with a ventilating fan. The front, rear walls, and ceiling of the chambers were made of clear plastic. The front wall was hinged and served as a door to the chamber. The two side panels were made of aluminum, and the floor consisted of a stainless steel grid positioned above a stainless steel waste pan. One of the panels featured two retractable response levers flanking a liquid dipper. Only the lever to the right of the dipper was used as a CS and operandum. The other lever remained retracted throughout the entire experiment. The lever protruded 19 mm and was located at 70 mm above the grid floor. The lever had a time cycle of 700 ms and required a minimal tension of 25 grams to be activated. The dipper was centered horizontally on the panel and placed at 25 mm above the grid floor. The dipper featured a receptacle opening (51 mm wide, 51 mm high) through which a motor driven dipper arm could be raised to deliver 0.01 cm³ of water. A houselight was mounted 12 mm from the ceiling on the sidewall opposite the intelligence panel, and was on during the entire experiment. The ventilation fan mounted on the rear wall of the sound-attenuating chamber provided masking noise of 60 dB. Experimental events were arranged using a Med interface connected to a PC controlled by Med-PC IV software.

3.1.3. Procedure
The rats were randomly assigned to two groups (S-FD and EU-FD) of eight subjects each, and were directly exposed to the
experimental conditions, without any adaptation to the chamber, or pretraining with the liquid dispenser or the lever. The groups were exposed to procedures that were analogous to those used in the simulation; hence the same labels used for the networks in the simulations were used here for the animal groups.

In Phase 1, S-FD was given seven sessions of 60 trials where the lever protruded into the chamber for 3 s (CS), during which there was a 3-s access to the cup of the liquid dispenser (US). This procedure was strictly simultaneous (S) in that the CS and US occurred concurrently. In Phase 2, S-FD was shifted to a forward-delay (FD) procedure with an ISI of 15 s for six more sessions of 60 trials. Group EU-FD, in contrast, was first exposed to seven sessions of 60 trials of the single alternation variant of the explicitly unpaired (EU) procedure, where the CS and US occurred alternately, separated by a fixed interval of 30 s. In Phase 2, EU-FD received the forward-delay procedure for six more sessions of 60 trials each. The ITI was held constant at 60 s for both groups in both phases. Lever presses had no scheduled consequences.

3.1.4. Results
The results are shown in Fig. 4, which depicts the median percentage of trials with at least one lever press for each session. The left panels show the individual data for each group in Phase 1. For S-FD (upper left panel), the percentage tended to be below 25% throughout the phase, with the exception of rats R18 (31.67%) and R17 (35%) in Sessions 1 and 2, respectively. Similar results can be seen for EU-FD, under the explicitly unpaired procedure (lower left panel).

The right panels depict the median percentages for each group and session in Phase 2 (solid thick lines with large closed circles). The light dashed lines with various small symbols represent individual data. The median percentage increased to 69.17 for S-FD in the first session under the forward-delay procedure with an ISI of 15 s (upper right panel). For EU-FD, the median percentage in the same session and procedure was 5.83. A Mann–Whitney test revealed a significant difference between the two medians (U = 12.5, z = −2.06, p < .05, two-tailed). The differences in the remaining sessions were not significant.

The number of trials to acquisition was also measured. Acquisition was defined as the occurrence of at least one response in three out of four consecutive trials (Gibbon and Balsam, 1981). The median numbers of trials to acquisition were 10.5 for S-FD and 75.5 for EU-FD, and the difference was significant (U = 13.0, z = −2.0, p < .05, two-tailed).

Fig. 4. Percentage of trials with at least one response, for each session and phase of the experiment. (Left panels) Phase 1, where Group S-FD received the simultaneous
procedure (S, upper left panel) and Group EU-FD received the explicitly unpaired procedure (EU, lower left panel). The lines represent individual data. (Right panels)
Phase 2, where both groups received the forward-delay procedure (FD) with an ISI of 15 s. The solid lines with large closed circles represent the percentage medians
per group per session. The light dashed lines with various small symbols represent individual data.
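The dependent measure in Fig. 4 is simple arithmetic, sketched below under stated assumptions. With 60 trials per session, 19 and 21 trials with a response give 31.67% and 35%, which is consistent with the individual values reported for R18 and R17, although the text does not give the underlying counts. A group median that is not a multiple of 1/60, such as 69.17, is what one would expect when the median of an even-sized group (eight rats) averages the two middle values. The function name and the per-rat counts are hypothetical and chosen only for illustration; they are not data from the experiment.

```python
from statistics import median


def percent_trials_with_response(n_with_response: int, n_trials: int = 60) -> float:
    """Percentage of trials in a session with at least one response."""
    return 100.0 * n_with_response / n_trials


# 19 and 21 responding trials out of 60 reproduce the reported percentages.
print(round(percent_trials_with_response(19), 2))  # 31.67
print(round(percent_trials_with_response(21), 2))  # 35.0

# HYPOTHETICAL counts for eight rats, chosen only to show how an even-sized
# group can yield a median such as 69.17 (mean of the two middle rats).
hypothetical_counts = [10, 20, 30, 41, 42, 50, 55, 58]
group_median = median(percent_trials_with_response(c) for c in hypothetical_counts)
print(round(group_median, 2))  # 69.17
```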
4. General discussion

The simulation results show that the present neural-network model predicts learning under a simultaneous Pavlovian procedure. This prediction was supported by an experiment using an acquisition transfer test with autoshaping of the lever-press response in rats. The results of this experiment lend further generality to experiments that have used a second-order conditioning test with conditioned suppression of the lever-press and lick responses. The model explains the prediction as an increment in connection weights that was insufficient to allow for substantial output activation under the simultaneous procedure, but allowed for faster acquisition under an optimal ISI.

Some caveats of the model have been mentioned elsewhere (Burgos, 2005; Burgos and Murillo-Rodríguez, 2007). Here, we mention one more caveat. The caveat has to do with the reason why we used an acquisition instead of the second-order conditioning test in the simulation. Architectures like the one shown in Fig. 1, with multiple CS input elements, cannot simulate second-order conditioning. This limitation is due to the architecture's full connectivity, where all the elements in a layer connect to all the elements in the adjacent layer, and its lack of inhibitory connections. The learning function in the present model includes factors that represent synaptic competition, where connections that impinge on the same neural element compete for a limited amount of connection weight (see Appendix A). Due to this competition, fully connected architectures without inhibitory connections cannot learn to respond to a CS after having learned to respond to another. In order to simulate second-order conditioning in this model, then, a sparsely connected architecture is required (unpublished simulations show that this is possible). It remains to be seen whether the present results also obtain with networks that can simulate second-order conditioning. Also, no simulations with this model have been conducted using networks that have inhibitory connections, so the role of such connections in the present and other phenomena remains undetermined.

Regarding the animal experiment, three caveats deserve mention. First, the results were incompatible with those reported by Ross and Scavio, as well as Salafia et al. with the NMR preparation. Again, the difference in the preparations used might account for the discrepancy. Second, three of the eight rats of EU-FD showed relatively high percentages in Session 1 of Phase 2. These percentages indicate that the preexposure to the explicitly unpaired procedure somewhat facilitated acquisition in these rats. This result is unexpected because studies have shown that an explicitly unpaired procedure makes the CS a conditioned inhibitor and thus produces acquisition retardation when the CS is explicitly paired with the US (e.g., Baker, 1977; Droungas and LoLordo, 1995; Friedman et al., 1998; Kleiman and Fowler, 1984; Wasserman and Molina, 1975).

Finally, of the six rats of S-FD that showed an increased CR percentage on Session 1 of Phase 2, three (R11, R13, R15) maintained the CR throughout the phase at relatively high levels. The other three rats (R12, R16, R18) showed a substantial reduction of the CR percentage throughout the phase, after the first session, with different rates of reduction and recovery (two of the three rats showed indications of recovery towards the last session). Wasserman and Molina (1975) reported a similar phenomenon, which they called “reduction of postacquisition” in autoshaped key peck in pigeons, after an explicitly unpaired procedure. It remains to be seen what the neural substrate of this phenomenon could be.

Appendix A

Fig. A1 shows a generic neural processing unit in this model. It consists of a number of inputs whose activations represent either environmental stimuli or activations from other units. An input activation can be excitatory ($a^{+}_{i,t}$) or inhibitory ($a^{-}_{i,t}$), although in the present study only excitatory activations were used. The subindex t represents a discrete moment in time, or time-step (ts). Each input activation affects a summing junction j through a connection that has a certain strength represented by a weight ($w^{+}_{i,j,t}$ for excitatory connections, $w^{-}_{i,j,t}$ for inhibitory connections; only excitatory weights were used). The junction computes the inner products of activation and weight vectors separately for excitatory ($a^{+}_{j,t} w^{+}_{j,t}$) and inhibitory ($a^{-}_{j,t} w^{-}_{j,t}$) activation and weight vectors, and then passes the results to separate logistic functions (see equations below). The results of the logistic functions are passed through the activation function, which returns an activation state for the unit. All activations range from 0.0 to 1.0.

Fig. A1. Generic neural processing element. The element consists of a number of inputs that are connected and send excitatory ($a^{+}_{i,t}$) or inhibitory ($a^{-}_{i,t}$) activations to a summing junction j. Each connection has a strength represented by a weight ($w^{+}_{i,j,t}$ for excitatory connections, $w^{-}_{i,j,t}$ for inhibitory connections). The junction computes inner products between an activation vector (a) and a weight vector (w) separately for excitatory and inhibitory inputs. Each product ($exc_{j,t}$ for excitatory elements, $inh_{j,t}$ for inhibitory elements) is passed as an argument to a separate logistic function L. The results of both logistic functions are used to compute the element's activation state, according to the activation function (see Appendix A). Subindex t represents a time-step (ts).

The distinction between excitatory and inhibitory
activations is not made in terms of positive versus negative acti- remains to be done. However, it seems reasonable to expect
vations. Rather, it is made in terms of different types of units that such manipulations will make important differences in the
that have differential effects on the activation of those units to model’s capability to simulate certain phenomena.
which they connect:
The activation a of a hidden or output unit j at t is given by

⎨ L(excj,t ) + τj L(excj,t−1 )[1 − L(excj,t )] − L(inhj,t ) if L(excj,t ) > L(inhj,t ) and L(excj,t ) ≥ θj

$$a_{j,t} = a_{j,t-1} - \kappa_j\, a_{j,t-1}\,(1 - a_{j,t-1}) - L(inh_{j,t}) \quad \text{if } L(exc_{j,t}) > L(inh_{j,t}) \text{ and } L(exc_{j,t}) < \theta_j$$

$$a_{j,t} = 0 \quad \text{if } L(exc_{j,t}) \le L(inh_{j,t})$$

where

$$L(x) = \frac{1}{1 + e^{-(x-\mu)/\sigma}}; \qquad x = \sum_{i=1}^{s} a_{i,t}\, w_{i,j,t}$$

$\theta_j$ is a random threshold generated according to a Gaussian distribution with a mean of 0.2 and a standard deviation of 0.15; $\tau_j = 0.1$ is a temporal summation parameter; $\kappa_j = 0.1$ is a decay parameter; $\mu = 0.5$ and $\sigma = 0.1$ are the mean and standard deviation (temperature) of the logistic function $L$. All activations range from 0.0 to 1.0. Input units are not activated by this function, but their activations are manually assigned, according to some training protocol. In the case of vta and CR/UR units, the above function is used only in the absence of a US or primary reinforcer (whenever the activation of $I_4$ is 0.0). Otherwise, the unconditional connections (thick arrows in Fig. 1) are assumed to take precedence, in which case the activations are equal to the US magnitude (the level of activation of $I_4$).
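As an illustration only, the logistic function and the two activation cases printed above (decay below threshold, and zero activation when inhibition matches or exceeds excitation) can be sketched in Python. The function names are hypothetical and not part of the model's published code. The clamp to the 0.0–1.0 range is an assumption, motivated by the statement that all activations range from 0.0 to 1.0. The case in which $L(exc_{j,t})$ reaches the threshold is deliberately left out of this sketch.

```python
import math

MU, SIGMA = 0.5, 0.1   # mean and standard deviation of the logistic function (from the text)
KAPPA = 0.1            # decay parameter (from the text)


def logistic(x, mu=MU, sigma=SIGMA):
    """L(x) = 1 / (1 + exp(-(x - mu) / sigma))."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / sigma))


def weighted_sum(activations, weights):
    """x = sum over i of a_{i,t} * w_{i,j,t} for one kind of input (exc or inh)."""
    return sum(a * w for a, w in zip(activations, weights))


def subthreshold_activation(a_prev, exc, inh, theta, kappa=KAPPA):
    """Return a_{j,t} for the two cases sketched here.

    Returns None when L(exc) > L(inh) and L(exc) >= theta, because that
    case is not covered by this sketch.
    """
    l_exc, l_inh = logistic(exc), logistic(inh)
    if l_exc <= l_inh:
        return 0.0
    if l_exc < theta:
        a = a_prev - kappa * a_prev * (1.0 - a_prev) - l_inh
        # Assumption: keep the result inside the stated 0.0-1.0 range.
        return min(1.0, max(0.0, a))
    return None
```

A threshold for one unit could be drawn with `random.gauss(0.2, 0.15)`; whether it is redrawn at every time-step is not stated in this passage, so the sketch takes `theta` as an argument.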
Learning is defined as a change in one or more connection weights, according to the following learning function:

$$\Delta w_{i,j,t} = \begin{cases} \alpha_j\, a_{j,t}\, d_t\, p_{i,t}\, r_{j,t} & \text{if } d_t \ge 0.001 \\ -\beta_j\, w_{i,j,t-1}\, a_{i,t}\, a_{j,t} & \text{otherwise} \end{cases}$$

where $\alpha_j = 0.5$ and $\beta_j = 0.1$ are free parameters; $a_{i,t}$ denotes the presynaptic activation; $a_{j,t}$ denotes the postsynaptic activation; $d_t = d_{s,t} = \varphi_t + \upsilon_t(1 - \upsilon_t)$ if $j$ is an sa or ca1 unit; $d_t = d_{m,t} = \upsilon_t$ if $j$ is an ma, vta, or output unit; $\upsilon_t = \mathrm{Mean}(v_t - v_{t-1})$; $\varphi_t = |\mathrm{Mean}(h_t - h_{t-1})|$;

$$p_{i,t} = \frac{a_{i,t}\, w_{i,j,t-1}}{N}; \qquad r_{j,t} = 1 - \sum_{i=1}^{s} w_{i,j,t};$$

$N = exc_{j,t}$ or $N = inh_{j,t}$.
All weights are between 0.0 and 1.0. The term $d_t$ designates a discrepancy signal that determines whether weights are increased or decreased (0.001 is an arbitrary criterion for making this decision) and, if increased, by how much. The signal $d_{s,t}$ influences weight changes in the input–sa and sa–ca1 connections (see Fig. 1). The signal $d_{m,t}$ influences weight changes in the sa–ma, ma–vta, and ma–output connections. $h$ designates a vector of ca1 activations and $v$ a vector of vta activations. The architecture used in the present study had one instance of each type of unit, so each vector consisted of one value. Factors $p$ and $r$ make the function a competitive one, meaning that connections impinging on an NPE compete for a limited amount of weight (1.0). The function also implements a mechanism in which the larger the weight, the larger the change in that weight, everything else being equal.
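To make the competitive character of the learning function concrete, the following Python sketch computes the weight changes for all connections converging on a single unit. It is a minimal illustration under stated assumptions, not the authors' implementation. The function names are hypothetical. $N$ is computed as the weighted input sum using the weights from the previous time-step, $r$ is also computed from the previous weights, and all inputs passed in are assumed to be of the same kind (all excitatory or all inhibitory).

```python
ALPHA, BETA = 0.5, 0.1   # free parameters (from the text)
CRITERION = 0.001        # arbitrary criterion for increasing vs. decreasing weights


def mean_change(current, previous):
    """Mean(x_t - x_{t-1}) over a vector of unit activations."""
    return sum(c - p for c, p in zip(current, previous)) / len(current)


def discrepancy_s(phi, upsilon):
    """d_{s,t} as printed in the text: phi + upsilon * (1 - upsilon)."""
    return phi + upsilon * (1.0 - upsilon)


def weight_changes(pre, weights, post, d, alpha=ALPHA, beta=BETA):
    """Return the change in each weight converging on one unit j.

    pre     -- presynaptic activations a_{i,t}
    weights -- weights w_{i,j,t-1} before the update
    post    -- postsynaptic activation a_{j,t}
    d       -- discrepancy signal d_t (d_s or d_m, depending on the unit type)
    """
    if d >= CRITERION:
        # Assumption: N is the weighted input sum under the previous weights.
        n = sum(a * w for a, w in zip(pre, weights))
        if n == 0.0:
            return [0.0] * len(weights)
        # Assumption: the remaining weight r is computed from the previous weights.
        r = 1.0 - sum(weights)
        return [alpha * post * d * (a * w / n) * r for a, w in zip(pre, weights)]
    return [-beta * w * a * post for a, w in zip(pre, weights)]
```

With `pre = [1.0, 0.5]`, `weights = [0.2, 0.1]`, `post = 0.8`, and `d = 0.5`, the first connection carries the larger share `p` of the input and therefore receives the larger increment, and both increments shrink as the summed weight approaches 1.0. With one unit of each type, as in the present architecture, $\upsilon_t$ could be obtained as `mean_change([v_t], [v_prev])` and $\varphi_t$ as `abs(mean_change([h_t], [h_prev]))`.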
Thus far, simulations with this model have not relied on parametric manipulations, and a parametric analysis of the model

Guthrie, E.R., 1935. The Psychology of Learning. Harper, New York. Ross, R.T., Scavio Jr., M.J., 1983. Perseveration of associative strength in rab-
Hawkins, R.D., Carew, T.J., Kandel, E.R., 1986. Effects of interstimulus interval bit nictitating membrane response conditioning following ISI shifts. Anim.
and contingency on classical conditioning of the Aplysia siphon withdrawal Learn. Behav. 11, 435–438.
reflex. Journal of Neuroscience 6, 1695–1701. Salafia, W.R., Host, K.C., Lambert, R.W., Chiaia, N.L., Ramirez, J.J., 1980. Rab-
Kleiman, M.C., Fowler, H., 1984. Variations in explicitly unpaired training are bit nictitating membrane conditioning: lower limit of the effect interstimulus
differentially effective in producing conditioned inhibition. Learn. Motiv. interval. Anim. Learn. Behav. 8, 85–91.
15, 127–155. Schneiderman, N., 1966. Interstimulus interval function of the nictitating mem-
Matzel, L.D., Held, E.E., Miller, R.R., 1988. Information and expression of brane response of the rabbit under delay versus trace conditioning. J. Comp.
simultaneous and backward associations: Implications for contiguity theory. Phys. Psychol. 62, 397–402.
Learn. Motiv. 19, 317–344. Schneiderman, N., Gormezano, I., 1964. Conditioning of the nictitating mem-
McAllister, W.R., 1953. Eyelid conditioning as a function of the CS–US interval. brane of the rabbit as a function of the CS–US interval. J. Comp. Phys.
J. Exp. Psychol. 45, 417–422. Psychol. 57, 188–195.
Miller, R.R., Matzel, L.D., 1988. The comparator hypothesis: A response rule Smith, M.C., Coleman, S.R., Gormezano, I., 1969. Classical conditioning of
for the expression of associations. In: Bower, G.H. (Ed.), The Psychology the rabbit’s nictitating membrane response at backward, simultaneous, and
of Learning and Motivation, vol. 22. Academic Press, San Diego, CA, pp. forward CS–US intervals. J. Comp. Phys. Psychol. 69, 226–231.
51–92. Stout, S.C., Miller, R.R., 2007. Sometimes-competing retrieval (SOCR): a
Miller, R.R., Schachtman, T.R., 1985. The several roles of context at the time formalization of the comparator hypothesis. Psychol. Rev. 114, 759–
of retrieval. In: Balsam, P.D., Tomie, A. (Eds.), Context and Learning. 783.
Lawrence Erlbaum, Hillsday, NJ, pp. 167–194. Sutton, R.S., Barto, A.G., 1981. Toward a modern theory of adaptive networks:
Rescorla, R.A., 1980. Pavlovian Second-Order Conditioning: Studies in Asso- expectation and prediction. Psychol. Rev. 88, 135–170.
ciative Learning. Lawrence Erlbaum, Hillsday, NJ. Wasserman, E.A., Molina, E.J., 1975. Explicitly unpaired key light and food pre-
Rescorla, R.A., Wagner, A.R., 1972. A theory of Pavlovian conditioning: varia- sentations: interference with subsequent auto-shaped key pecking in pigeons.
tions in the effectiveness of reinforcement and nonreinforcement. In: Black, J. Exp. Psychol.: Anim. Behav. Process. 1, 30–38.
A.H., Prokasy, W.F. (Eds.), Classical Conditioning II: Current Research and Yeo, A.G., 1974. The acquisition of conditioned suppression as a function of
Theory. Appleton-Century-Crofts, New York, pp. 64–99. interstimulus interval duration. Quar. J. Exp. Psychol. 26, 405–416.
