You are on page 1of 22

Holding Multiple Items in Short Term Memory: A attractor networks.

Instead, we focus on another


Neural Mechanism fundamental aspect and investigate the biophysical
mechanisms that establish such capacity limits in discrete
Edmund T. Rolls, 1, * Laura Damper-Marco, 1 , 2 and Gustavo attractor networks, and how this short term memory capacity
Deco 3 , 2 can be enhanced. Indeed, the issue arises of the extent to
which discrete (as contrasted with continuous [28]) attractor
Sidney Arthur Simon, Editor
networks with distinct subsets of neurons for each memory
Associated Data are able to maintain more than one memory simultaneously
active. One memory at a time is typical, and it was shown
Supplementary Materials that with overlapping memory patterns, more than 1–2
simultaneously active memories are difficult due to
Abstract
interference between the patterns even when the patterns
Introduction are graded [29]. Most investigations have involved non-
overlapping patterns as we do here, and it has been found
Short term memory can be implemented in for example the difficult to maintain more than approximately 4 memories
prefrontal cortex by the continuing firing of neurons during simultaneously active [30], [31]. Non-overlapping patterns
the short term memory period [1]–[3]. Here, we show how refers to patterns in which each population of neurons
synaptic facilitation in the recurrent synaptic connections can representing a memory is separate, and this has advantages
increase very considerably the number of short term in the mean field analysis and the simulations. However,
memories that can be maintained simultaneously active in a interference and cross-talk between the different pools was a
standard model of this continuing firing, a cortical attractor feature of the simulations described here, and was
network [4]–[7], and can make the short term memory implemented through w− as shown below. Nonetheless, by
system much more robust, i.e. less sensitive to the selection using sparse representations with a = 0.01 (where the
of model parameters such as the amount of inhibition. The sparseness a is the proportion of neurons that is active for
findings are of importance for understanding how several any one memory pattern), Amit et al. were able to maintain
different items can be maintained simultaneously in working six memory patterns simultaneously active in short term
memory [8], the impairments of cognitive function including memory [32], and mean field analysis showed that having
working memory and attention in schizophrenia and in aging reasonable number of patterns simultaneously active could
that can occur when the operation of cortical systems for be stable with sparse representations [30], [32]. Here, we
short term memory and attention are impaired [7], [9]–[12], show how the addition of synaptic facilitation to this
and how cortical areas involved in language processing can approach can increase very considerably the number of short
keep active several items, such as the subjects of a term memories that can be maintained simultaneously
sentence [13]. active, and can make the short term memory system much
more robust.
George Miller published a paper in 1956 entitled “The magic
number 7, plus or minus two: some limits on our capacity for Methods
processing information” [14]. A key issue was that the
capacity of short term memory (for example our memory for The attractor network architecture for short term memory
a list of telephone numbers) is approximately 7±2 items. For that was simulated is shown in Fig. 1. The network
visual short term memory, the capacity may be closer to 4 investigated had 10 excitatory pools of neurons, S1–S10, and
items [8], [15], [16]. one inhibitory pool of neurons, with the synaptic weights
shown in Fig. 1. The global inhibition in the model reflects the
Kohonen and other pioneers made neuronal network models evidence that in a local area of the cerebral cortex, inhibitory
that could maintain their activity during a short term memory neurons are connected to excitatory neurons within that local
period [17]–[19]. A key to the operation of such area [33]. In the network, the proportion of neurons active for
autoassociation networks was the strong recurrent synaptic any one representation, that is the sparseness with binary
connectivity within each population of excitatory neurons, neurons, was 0.1, and this value was chosen as this is in the
and feedback inhibition to all the excitatory neurons to order of the sparseness of the representation in the cerebral
maintain the activity within limits, and just some populations cortex [34]. A full description of the integrate-and-fire
firing actively. Hopfield introduced the methods of statistical attractor single network is provided below in subsections
mechanics from theoretical physics to allow for example the ‘Network’ and ‘Spiking Dynamics’. The network has a mean
calculation of the capacity of what became known as field equivalent allowing further quantitative analysis of the
attractor networks [4], [20], that was extended to more stability conditions and capacity, and the way in which we
biologically plausible networks with diluted connectivity, used this in the context of synaptic facilitation is described in
sparse representations, graded firing rate representations, subsection ‘Mean field analysis’.
and the speed of operation of dynamically realistic integrate-
and-fire neuronal attractor networks [6], [21]–[24].

To provide background information, we note that there are


currently a number of theories regarding the underlying
mechanisms that yield limits to the capacity of short term
memory. The main two competing models are “fixed capacity
models” (or slot models) [25], and dynamic allocation models
(or resource models) [26]. In fixed capacity models, all items
are recalled with equal precision up to the limit (3–4 items)
with no further information being stored beyond this limit,
whereas in dynamic allocation models, the limited resources
are shared out between items but not necessarily equally.
This is an issue that has recently received substantial
attention (e.g. Wei et al. [27]), and our work does not directly
tackle this question, although we suggest that with respect to
formal network models, this issue arises with continuous
connections between all of the excitatory neurons was
modulated by the utilization parameter u (the fraction of
resources used) reflecting the calcium level. When a spike
reaches the presynaptic terminal, calcium influx in the
presynaptic terminal causes an increase of u which increases
the release probability of transmitter and thus the strength of
that synapse. The time constant of the decay of the synaptic
facilitation is regulated by a parameter τ F which
experimentally is around 1–2 s [38], [43]. The value for the
baseline utilization factor U (0.15) and for τF (1.5 s) used here
are similar to values reported experimentally and used
elsewhere [38]–[40], [43]. In more detail, the strength of
each recurrent excitatory synapse j is multiplied by the
presynaptic utilization factor uj (t), which is described by the
following dynamics:

Figure 1

The attractor network model.


(1)
The single network is fully connected. The excitatory neurons
are divided into N selective pools or neuronal populations S1–
SN of which three are shown, S1, S2 and SN. There were where   is the time of the corresponding presynaptic spike k.
typically N = 10 short term memory populations of neurons in The first term shows how the synaptic utilization
the integrate-and-fire networks simulated, and analyzed with factor uj decays to the baseline utilization factor U = 0.15 with
mean field analyses. Each of these excitatory pools time constant τF = 1500 ms, and the second term shows
represents one short term memory by maintaining its activity how uj is increased by each presynaptic action potential k to
during a delay period after a cue has been applied. We show reach a maximum value of 1 when the neuron is firing fast.
that if the excitatory connections show synaptic facilitation, The modulation by the presynaptic utilization factor u is
then any number in the range 0–9 of the pools will maintain implemented by multiplying the synaptic weight by u to
its activity in the delay period depending on which set of produce an effective synaptic weight w eff. This models the
pools was activated by a cue λ1, λ2, … λ10. The synaptic underlying synaptic processes [38].
connections have strengths that are consistent with Network
associative learning. In particular, there are strong intra-pool
connection strengths w +, and weaker between-pool synaptic The integrate-and-fire attractor network, developed from an
connection strengths of w −. The excitatory neurons receive earlier model [7], [35]–[37] but with synaptic facilitation
inputs from the inhibitory neurons with synaptic connection added [38], contains NE = 800 pyramidal cells (excitatory)
strength w inh. The other connection strengths are 1. w + was and NI = 200 interneurons (inhibitory). There are ten
typically 2.3, w − 0.87, and w inh 0.945 as determined using a populations or pools of excitatory neurons each with
modified mean field analysis. The integrate-and-fire spiking 0.1 NE neurons (i.e. 80 neurons). The network is fully
network typically contained 1000 neurons, with 80 in each of connected. Neurons within the selective population are
the 10 non-overlapping short term memory (STM) excitatory coupled, by a factor w + = 2.3 (unless otherwise stated) above
pools, and 200 in the inhibitory pool. Each neuron in the the baseline connection synaptic weight w = 1. Connections
network also receives external Poisson inputs λext from 800 to inhibitory cells are set to the baseline level, w = 1, and
external neurons at a typical rate of 3.05 Hz/synapse to from the inhibitory neurons w inh = 0.945 (unless otherwise
simulate the effect of inputs coming from other brain areas. stated).

To the standard integrate-and-fire network [7], [35]–[37], we To model spontaneous background activity, every neuron in
added synaptic facilitation, which has been incorporated into the network is coupled through N ext = 800 synaptic
such networks, for example to account for the low firing rates connections to an external source of Poisson-distributed,
in the delay periods of some short term memory tasks [38], independent spike trains of rate 3.05 Hz per synapse, so that
in decision tasks with responses delayed over a period in each neuron received 2440 spikes/s. The presence of cue
which the firing rate may be low [39], and decision-making stimuli to initiate the short term memory was modeled by an
with sequentially presented stimuli [40]. In contrast to increase of λ to 3.3125 spikes/synapse. This value of λ was
Mongillo et al.’s work [38], we investigate the applied to any of the pools S1 to S10 (via λ 1 to λ 10 as
neurodynamical origin of short term memory capacity limits, illustrated in Fig. 1 of the paper) during the cue delivery
and assume that sustained neural activation (as opposed to period, which was from 500–1500 ms in the simulations. For
alternative mechanisms such as neural oscillations and/or pools not being cued on, and for the inhibitory
patterns of synaptic strengths already reviewed by the neurons, λ remained at λ ext during the cue delivery period
authors elsewhere [31]) is the mechanism underlying the and throughout the trial. After the cue had been delivered to
encoding and maintenance of multiple items in short-term the pools selected for short term memory for that
memory. trial, λ returned to the default value of λ ext = 3.05 Hz/synapse
for the remainder of the trial (1500–4500 ms).
Synaptic facilitation is common in higher cortical areas such
as the prefrontal cortex [41]–[43] implicated in working Spiking Dynamics
memory and attention [2], [8]. Synaptic facilitation is caused
for example by the increased accumulation of residual The model is based on integrate-and-fire (IF) neurons. The
calcium at the presynaptic terminals, which increases the subthreshold membrane potential V of a neuron evolves
probability of neurotransmitter release [41]. Short term according to
synaptic facilitation was implemented using a
phenomenological model of calcium-mediated
transmission [38]. The synaptic efficacy of the recurrent
(2) inhibitory components (mediated by GABA). External cells
contribute to the current only through AMPA receptors. The
where Cm is the membrane capacitance (see numerical total current is given by
values in Table 1), gm is the membrane leak
conductance, VL is the resting potential, and I syn is the
synaptic current.
(3)
Table 1
with the different currents defined by
Parameters used in the integrate-and-fire simulations.

Cm (excitatory) 0.5 nF

Cm (inhibitory) 0.2 nF
(4)
gm (excitatory) 25 nS

gm (inhibitory) 20 nS

VL −70
mV (5)

V thr −50
mV

V reset −55
mV (6)

V E 0 mV

V I −70
mV
(7)

gAMPA,ext (excitato 2.08 nS
where   are the synaptic weights,   is the fraction of open
ry)
channels for each receptor, and   is the synaptic
conductance for receptor x = AMPA, NMDA, GABA. Synaptic
gAMPA,rec (excitato 0.104
facilitation is implemented through the utilization
ry) nS
factor uj which modulates the recurrent excitatory synapses,
specifically the synaptic gating variables s as can be seen
gNMDA (excitatory 0.327
from Equations 5 and 6. The values for the synaptic
) nS
conductances and the reversal potentials VE and VI are given
in Table 1. NMDA currents are voltage dependent and
gGABA (excitatory) 1.25 nS
controlled by the intracellular magnesium concentration
([Mg2+] = 1 mM), with parameters γ = [Mg2+]/(3.57 mM) = 0.280
gAMPA,ext (inhibitor 1.62 nS
and β = 0.062 (mV) −1.
y)
The fraction of open channels in cell j, for all receptors, is
gAMPA,rec (inhibitor 0.081 described by the following differential equations, where   is
y) nS the derivative of s in time:

gNMDA (inhibitory) 0.258
nS

gGABA (inhibitory) 0.973 (8)


nS

τNMDA, decay 100 ms

(9)
τNMDA,rise 2 ms

τAMPA 2 ms
(10)
τGABA 10 ms

α 0.5
ms−1
(11)
Open in a separate window

The synaptic current includes glutamatergic excitatory


components (mediated by AMPA and NMDA receptors) and
(12) w inh} with the aim to reproduce a short term memory regime
in which only the stimulated (i.e. cued) subpopulations
where the rise time constant for NMDA currents is τNMDA,rise = 2 remain active during the delay period.
ms, and α = 0.5 ms−1; rise time constants for AMPA and GABA
currents are neglected. Decay time constants for AMPA,
NMDA, and GABA synapses are τAMPA = 2 ms, τNMDA,decay = 100
ms, and τGABA = 10 ms. The sums over k represent a sum over

spikes emitted by pre-synaptic neuron j at time  .

Mean Field Analysis

To complement the integrate-and-fire simulations described


in the paper, we performed mean field analyses of the
network when it is operating with synaptic facilitation. The
mean field analysis provides a simplification of the integrate-
and-fire dynamics considered in the spiking simulations,
which are computationally expensive and therefore not
suitable for extensive parameter explorations, though they
are important for establishing how the system operates with
stochastic dynamics when it is of finite size and has noise,
that is randomness, introduced by the almost random spiking
times of the neurons for a given mean rate [7]. The mean
field analyses allow the different operating regimes of the
network to be established analytically [7], [44], [45],
including in our case proving the stability of the system when
multiple short term memories are simultaneously active. The
mean field analyses also allow exhaustive parameter
explorations to define the values of the parameters within
which different operating regimes of the network occur. The
mean field analysis is noiseless, that is there is no
randomness introduced by the almost random spiking times
of the neurons for a given mean firing rate, and is in this
respect equivalent to an infinitely large integrate-and-fire
spiking simulation. The network that we simulated has,
without the synaptic dynamics, a mean field equivalent that
was developed by Brunel and Wang [35] (see Text S1 for a
full account of the original method). Here, we consider that
mean field approach but extend and apply it to the case
where synaptic facilitation is present in the network. In the
following we describe how the conventional mean field
approach has been extended.

Synaptic facilitation in the mean field approach

We extended and modified the standard mean field


approach [7], [35] to incorporate the effects of synaptic
facilitation. As can be seen from Fig. 2c, the synaptic
facilitation u(t) increases particularly for each excitatory
Figure 2
subpopulation that has received external stimulation
produced by the cues, and reaches a value close to 1 during Short term memory with 7 simultaneously active
the delay period. Only small increases occur in the pools that memories.
have not received this external stimulation by a memory cue
(and those increases occur because of effects produced After a 0.5 s period of spontaneous activity (with λext at the
through the w − connections between the excitatory pools baseline level of 3.05 Hz/synapse), cues λ1−λ7 were applied to
shown in Fig. 1). Since u(t) converges in both cases to a excitatory neuron pools S1 to S7 during the period 500–1500
given value, we can then define a new effective synaptic ms. (λ1−λ7 were applied by increasing λext to 3.3125
strength (w j)eff which operates during the steady state, and Hz/synapse for just these 7 pools during the cue period.) As
this opens the possibility to perform a mean field analysis of shown on the rastergram (a) and peristimulus time histogram
the network endowed with STP. The effective synaptic of the firing rate (b) this produced a high firing rate of
strengths are defined as: approximately 70 spikes/s in each of the pools S1–S7 (b, solid
lines), and the synaptic utilization factor uj increased in this
period to values close to 1 (c, solid line). In the rastergram
(a) each vertical line is a spike from a single neuron, and
(13) each row shows the spikes for a single neuron. 10 neurons
where uj ∞ is the asymptotic synaptic facilitation value that is chosen at random are shown for each pool of neurons. The
estimated from the average uj(t) observed in the last 500 ms input to pools λ8−λ10 remained at the baseline level of 3.05
of the delay period of the spiking network simulations. The Hz/synapse throughout the trial, and therefore their firing
protocol used in such simulations is described in detail in the rates did not increase in the period 500–1500 ms (b, dashed
next section. A conventional mean field analysis with the lines), and correspondingly the utilization factor u for these
effective synaptic strengths can then be performed and can pools remained low (c, dashed lines). At the end of the cue
be used to systematically scan the parameter space {w+, period, λ1−λ7 returned to the baseline level of 3.05
Hz/synapse, but in the short term memory period from 1500–
4500 ms the neurons in pools 1–7 continued to fire at a high of longer than approximately 2 s (depending on λext), the
rate of approximately 40 spikes/s (b, solid lines), well above synaptic facilitation had dissipated so much that selective
the baseline in the spontaneous period of 3 spikes/s. recall of the short term memories became very poor. This
Moreover, the synaptic facilitation uj remained high for pools helps to show the importance of the synaptic facilitation in
1–7 during the short term memory period from 1500–4500 the maintenance and even restoration of in this case multiple
ms (c, solid lines). short term memory representations.

Results

Integrate-and-fire Simulations

Results of the integrate-and-fire simulations of short term


memory are shown in Fig. 2. After a 500 ms period of
spontaneous activity, cues λ1−λ7 were applied to excitatory
neuron pools S1 to S7 during the period 500–1500 ms. This
produced a high firing rate of approximately 70 spikes/s in
pools S1–S7 (Fig. 2b), and the synaptic utilization
factor uj increased in this period to values close to 1 (Fig. 2c).
The inputs λ8−λ10 to pools S8–S10 remained at the baseline
level of 3.05 Hz/synapse throughout the trial, and therefore
their firing rates did not increase in the period 500–1500 ms,
and correspondingly the utilization factor u for these pools
remained low. At the end of the cue period, λ1−λ7 returned to
the baseline level of 3.05 Hz/synapse, but the neurons in
pools 1–7 continued to fire at a high rate of approximately 40
spikes/s, well above the baseline rate of 3 spikes/s in the
spontaneous period. The continuing relatively high firing rate
in pools S1–S7 was sufficient to keep via the recurrent
collateral connections the synaptic utilization factor within
such pools relatively high (Fig. 2c), and that in turn was what
kept each of the selective pools S1–S7 continuing to fire fast.
That high firing rate in pools S1–S7 was the implementation
of the short term memory. In contrast, the firing in the
uncued pools S8–S10 remained low, showing that the short
term memory was selective for just whichever pools of
neurons had been cued on earlier (Fig. 2a,b). The inhibition
prevented the effects of interference between the different
neuronal pools implemented through w −producing high firing
rates in the uncued neuronal pools. Open in a separate window

In this scenario, the synaptic facilitation in the recurrent Figure 3


connections is regenerated by the continuing firing of the
originally cued pools. Thus two factors are responsible for Multiple item short term memory can be selectively
enabling the firing to continue for long periods of many restored after a delay showing the importance of the
seconds after the cues have been removed. The first is the synaptic facilitation in the short term memory
recurrent collateral activity itself implemented in the process.
architecture of the network. Second, it is the regenerating
After the cue period from 500–1500 ms, there was a delay
synaptic facilitation just for the pools that were cued and had
period with λext for the excitatory neurons set to 0 for 500 ms.
high firing as a result, which gives the advantage in the
When λext was restored to all excitatory pools (in this case
competition implemented by the inhibitory neurons to the
with a value of 3.125 Hz/synapse) at time t = 2000 ms, there
previously cued pools, relative to the previously uncued
was sufficient synaptic facilitation remaining (b) to produce
pools.
firing selectively in the previously cued pools (a), and thus to
Further evidence for the importance of the synaptic restore the short term memory. Conventions as in Fig. 2.
facilitation in the process is that if there was a short delay Without synaptic facilitation, the network failed to recover
after the end of the cue period in the neuronal firing the previously cued memories.
(produced in the simulations by a decrease of λext to 0), then
In these simulations, with the parameters shown, we were
the firing in the previously cued pools could be restored by
able to keep any number between 0 and 9 of the pools of
restoring λext within approximately 1 s, before the synaptic
memory that represent each short term memory
facilitation had decayed too far. However, if λext was delayed
simultaneously active, depending on which pools were
for longer (e.g. 3 s) so that the synaptic facilitation had
activated by the cue. The system described thus has the
decayed, then the firing of the previously cued pools could no
strength that for a fixed set of parameters, it can flexibly
longer be selectively restored by restoring λext at its baseline
keep active any number of memories from 0–9. Without
or at any other value. An example is shown in Fig. 3, in which synaptic facilitation, we were able to maintain only 6 pools
after the cue period from 500–1500 ms, there was a delay
simultaneously active (w + = 2.3, w inh = 0.98) in the same
period with λext for the excitatory neurons set to 0 for 500 ms. network (Fig. 4b). Thus synaptic facilitation considerably
When λext was restored to all excitatory pools (in this case increased the number of short term memories that could be
with a value of 3.125 Hz/synapse) at time t = 2000 ms, there maintained simultaneously from 6 to 9 with the sparseness a 
was sufficient synaptic facilitation remaining (Fig. 3b) to = 0.1. We note that the condition where the number of cued
produce firing selectively in the previously cued pools (Fig. short term memory pools cued was 0 is an important
3a), and thus to restore the short term memory. With delays
condition that was satisfied in the results described, and 3 short term memories to be simultaneously active, and the
shows that the network remained stably in its low firing rate lower value shows the lowest value of inhibition at which 0
spontaneous state when no pools were cued. The mean field memories can be active, that is, when the spontaneous firing
analyses also confirmed this. state is stable without any cues being applied. Lower values
of w inh than this resulted in the spontaneous state with no
cues applied jumping into a high firing rate attractor state
with rates typical of those found in the cerebral cortex [34].
The simulations were run with the standard value of w + = 2.3,
and all the other parameters at their standard values
described elsewhere in the paper. The upper limit shown
in Fig. 4a defines the level of inhibition above which the
inhibition is too great for that number of short term
memories to be simultaneously active. Clearly if more
memories must be kept simultaneously active, the level of
inhibition must not be too great, as otherwise some of the
active attractors will be quenched. What Fig. 4a shows is that
the upper limit as expected decreases as the number of
memories required to be active increases, and that very
interestingly the upper limit reaches the lower limit at
approximately 9 memories simultaneously active in this short
term memory system. This leads to the concept that the
number of short term memory representations that can be
kept simultaneously active may in practice be limited by
(among other possible factors) the precision with which a
biological system must tune w inh. With any number from 0 to
9 memories simultaneously active, the tolerance with
which w inh must be set is very fine. This leads us to suggest
that at least in this model, 7 plus or minus 2 short term
memories simultaneously active in a single attractor network
may be a limit that is set at least in part by how finely in
practice the inhibition needs to be tuned as the number of
simultaneously active short term memories reaches 7 and
above, and also by the sparseness of the representation,
which in the cerebral cortex is not very sparse [34].

In order to investigate our hypothesis that synaptic


facilitation can not only increase the number of memories
Open in a separate window
that can be maintained simultaneously active, but may also
Figure 4 make the system more robust with respect to its sensitivity
to small changes in the parameters, we show the result
(a) The upper and lower values for the inhibitory in Fig. 4b of simulating the same short term memory
synaptic weights in the network as a function of the network, but without synaptic facilitation. It is clear that not
number of memories that are simultaneously active. only is the capacity less without the effects of the synaptic
facilitation used for Fig. 4a, but also for a given number of
For example, the data plotted for the value 3 short term short term memories simultaneously active, the system
memories simultaneously active are the upper value that is without synaptic facilitation is more sensitive to small
possible in the network for 3 short term memories to be parameter changes, such as of w inh as shown in Fig. 4b. In
simultaneously active, and the lower value shows the lowest fact, it can be observed that Δw = (w  ) - (w  )  is larger
inh high inh low
value of inhibition at which 0 memories can be active, that is, for the same number of memories (greater than 2) when the
when the spontaneous firing state is stable without any cues system is endowed with synaptic facilitation. Thus the
being applied. The integrate-and-fire simulations were run synaptic facilitation mechanism has the advantage that it
with the standard value of w + = 2.3, and all the other also makes the system more robust, and thus more
parameters at their standard values described elsewhere in biologically plausible, as well as increasing the short term
the paper. The results are from the integrate-and-fire spiking memory capacity. We emphasize that this type of
simulations, and from the modified mean field analyses. (b) robustness, relative insensitivity to the exact values of the
The same as (a), but without synaptic facilitation. The axes parameters, is likely to be important in biological systems, in
are to the same scale as in (a), but a slightly larger value which specification of the properties of neurons and networks
of w inh is needed to maintain stability of the spontaneous is not expected to be set with high precision.
state as the effective w + is now exactly 2.3, with no
modulation by u. Mean Field Approach

We investigated what parameters may be important in We complemented these integrate-and-fire simulations with
setting the limit on the number of short term memories that mean field analyses to define the areas of the parameter
can be maintained active simultaneously. A key parameter space within which these multiple item short term memory
was found to be the inhibitory synaptic weights in the model effects were found, and to confirm by analytic methods the
shown in Fig. 1. Fig. 4a shows the upper and lower values for stability of the multiple short term memory system that we
the inhibitory synaptic weights in the network as a function of describe. The system analyzed with the mean field approach
the number of memories that are simultaneously active. The was equivalent to that described for the integrate-and-fire
data are from the integrate-and-fire simulations confirmed simulations [7], [35], that is it had 10 specific pools each with
with the modified mean field analyses. For example, the data sparseness a = 0.1, one inhibitory pool, and, unless otherwise
plotted for the value 3 short term memories simultaneously specified, the same synaptic weights as the integrate-and-
active are the upper value that is possible in the network for fire simulations described (see Fig. 1, and the Methods). A
novel aspect of the mean field implementation used was that and the modified mean field analysis was then performed on
we estimated the effective synaptic weights that resulted the resulting network. Table 2 shows the effective w + for the
from the effects of the synaptic facilitation using the cued pools, and for the non-cued pools. These values for
integrate-and-fire spiking simulations, and used those values (w +)eff were obtained from the spiking simulations by
for the effective synaptic weights in the modified mean field multiplying the value of w + by the value of u ∞ that was
analyses described in detail in the methods. A hypothesis in obtained, and these values of (w +)eff were used in the mean
which we were especially interested was that the parameter field analyses. w inh was 0.945. The other values were the
space for multiple short term memory became smaller, and same as those used in the conventional mean field approach
therefore harder to prescribe when the network was built and in the spiking simulations shown in Fig. 2.
biologically, as the number 7±2 was exceeded. Of particular
interest was the parameter w inh, which in previous work with Table 2
this type of integrate-and-fire network in which one active
Mean field analyses.
pool was being investigated had the value 1.0 [46], [47]. In
order to allow multiple memory representations to be Number of (w +)eff cue (w +)eff noncu
active, w inh was reduced to a lower value, typically 0.945, for pools d ed
the simulations described here. This reduction of inhibition
was important in reducing the competition between the 0 0.97 0.97
multiple active short term memory pools. If it was reduced
too much, then the spontaneous state became unstable. If it
1 2.21 1.2
was increased too much towards 1.0, then it was difficult to
maintain active more than one pool of cued neurons. With
3 2.21 0.97
the mean field analyses, we were able to show that the range
of values of w inh, in which 3 or 7 multiple pools could be kept
7 2.16 0.67
simultaneously active was smaller than when the
requirement was just to maintain 1 pool active, as shown
in Fig. 4a and and5.5. Moreover, with the modified mean field 9 2.14 0.62
analyses it was possible to find a value of w inh that allowed 9
The values for the effective synaptic weights (w +)eff for the
out of the ten pools to remain simultaneously active (Fig.
cued and the non-cued pools, for different numbers of cued
4a and and5).5). The mean field approach thus allowed us to
pools as calculated from Equation 13.
show analytically that the system we describe with synaptic
facilitation could maintain up to 9/10 separate pools of Fig. 5 shows the results of the modified mean field analyses.
neurons simultaneously active. The values for the parameter w inh within which different
operating regimes occur are shown for different numbers of
cued short term memory pools. The first regime is
with w inh <0.94, in which case there is insufficent inhibition in
the network, and some of the pools start firing even when no
pools have been cued. This regime of unstable firing of the
uncued state when the neurons should be firing at the
spontaneous rate of approximately 3 Hz is labeled ‘unstable
spontaneous activity’ in Fig. 5.

The second regime labeled ‘multiple simultaneous STM


regime’ in Fig. 5 is the regime of interest in this paper. The
region lies above the dashed line in Fig. 5 of unstable
spontaneous activity and below the solid line. In this area
in Fig. 5, any number of cued memories can be maintained
stably. It is clear from Fig. 5 that for just one cued memory,
the value for w inh can be anywhere between 0.94 and 1.07.
For any 3 of the 10 possible STMs to be stable when cued on,
Figure 5 the value for w inh can be anywhere between 0.94 and 1.02.
For any 7 of the 10 possible STMs to be stable when cued on,
Mean field analysis showing the values of the the value for w inh can be anywhere between 0.94 and 0.96.
inhibitory synaptic weights w inh (between the lower Thus w inh must be set much more accurately when the
and upper limits) within which different numbers of required capacity is for any 7 of the 10 possible short term
simultaneously active short term memories can be memory attractor states to be maintained stably. For any 9 of
maintained stably. the 10 possible STMs to be stable when cued on, the value
for w inh must be 0.94 and nothing else will do. The mean field
The different regimes are described in the text. There were
analysis thus confirms that it is possible to maintain 9/10
10 selective pools and the pattern of effective synaptic
possible short term memory states active when the sparse of
strengths (w+)eff and (w −)eff was determined from the
the representation is 0.1 with binary neurons. The useful
corresponding spiking simulations. The coding level of the
operating region for multiple simultaneously active short
network was set to a = 0.1. The inset graph shows the firing
term memories is thus between the low limit and high limit
rate of the subpopulations which have been stimulated as
boundaries shown in Fig. 5 that are established by the
derived from the mean field analysis in the multiple active
modified mean field analyses. Previous mean field analyses
short term memory regime for different short term memory
have already demonstrated that without synaptic facilitation
set sizes. For all the set sizes the firing rates obtained are
it is difficult to find a value for which more than 5 short term
physiologically plausible. The results displayed in this figure
memories can be maintained simultaneously
correspond to the results for the integrate-and-fire network
active [30], [31]. We have also replicated such findings
shown in Fig. 4.
(results not shown) in this study although, as discussed by
In more detail, for each w inh value, a set of effective synaptic Dempere-Marco et al. [31], when the traditional mean field
weights for (w +)eff was obtained from the spiking simulations, approach is used, the role of the dynamics during the
stimulation period must be carefully considered. The mean
field analysis thus confirms that synaptic facilitation is an 0.05. We found that with synaptic facilitation incorporated as
important mechanism by which the number of short term described here, it was possible to maintain 20 different short
memories that can be maintained simultaneously active can term memories active simultaneously (Fig. 6a, which shows
be increased. 14 cued and perfect multiple short term memory) (w + = 
3.5; w inh = 0.945). However, without synaptic facilitation, it
The third regime shown in Fig. 5 and labeled ‘only some cued was possible to maintain only seven of the 20 memories
memories are stable simultaneously’ refers to the case in simultaneously active (Fig. 6b, which shows 8 cued, but only
which w inh is above the solid line, and is so large that only a 6 maintained in short term memory, with two of the cued
subset of the cued short term memory pools can maintain pools falling out of their high firing rate state) (w + = 2.5; w inh 
their activity. That is, the high firing rates expected of all = 0.975). These results again indicate that the use of
cued pools cannot be maintained stably. Thus the multiple synaptic facilitation can greatly increase the number of short
short term memory capability fails when w inh increases to a term memories that can be maintained simultaneously
value above the solid line in Fig. 5. The fourth regime shown active. These results also show that there is no ‘magic’ limit
in Fig. 5 is above the upper dashed line, when w inh is so large on the number of memories that can be maintained
that no cued pools maintain their high firing rates stably. simultaneously active in short term memory. The number is
These effects of the synaptic facilitation on the performance set in part by the sparseness of the representations, with
of the short term memory network can be understood as sparse representations allowing more short term memories
follows. The synaptic facilitation has an effect similar to to be simultaneously active; and by the use of synaptic
increasing the synaptic connection weights within each facilitation, which can increase the number of
neural population that was activated by a cue on a particular representations that can be kept simultaneously active, by
trial, relative to the non-cued pools. Thus the effective effectively increasing the synaptic connection weights within
synaptic weights of just the cued pools are increased for just each neural population that was activated by a cue on a
that trial, and are very different from the synaptic weights of particular trial, relative to the non-cued pools.
the uncued pools. In more detail, the effective synaptic
weights within a pool that result from the synaptic facilitation
are shown in Table 2, and indicate that when the neurons are
firing fast, u approaches 1.0, and the effective synaptic
weights for a cued pool become close to the value of 2.3 that
was usually specified for w +. That value for w + is sufficient
to maintain a pool of neurons firing stably. On the other
hand, for uncued pools, the value of u remains low, and the
effect of this is to reduce (by the multiplicative effect on the
synaptic weight) the effective synaptic weights to values that
are below the value of approximately 2.1 needed to maintain
a pool firing stably with a high rate in an attractor state. It is
this that enables the network to remain stable using the
mechanism of synaptic facilitation, with very many
simultaneously active pools, in the face of the noise
(stochastic fluctuations caused by the close to random firing
times of the neurons and of the external inputs λ) in the
system. In a network without synaptic facilitation, all the
internal recurrent collateral synaptic weights of the pools
(one pool for each stimulus) are of the same strength, so that
any noise in the system may cause a jump from an active to
an inactive pool as the weights and energy
landscape [7], [48] for the different pools are rather similar.
The synaptic facilitation makes the energy landscape have
very deep basins just for the cued pools.

The Short Term Memory Capacity with More Sparse


Representations

The sparseness of the representation, which for binary


neurons is the proportion of the excitatory neurons active for
any one stimulus, is another factor that we have found to be
important in setting the capacity for the number of items that
can be maintained simultaneously active in short term
memory. So far in this paper, we have considered a network
with a sparseness, a, of the representation of 0.1. The reason
for choosing a representation that is not very sparse is that
representations in the cerebral cortex are not very sparse,
and 0.1 is a biologically realistic value for binary neurons to
investigate, as shown by recordings from neurons in different Open in a separate window
cortical areas [34]. In fact, as representations in the cortex Figure 6
are graded, the measure of sparseness we have defined for
the graded case indicates even less sparse representations Simulations of short term memory with multiple
than this [34]. We hypothesized that more sparse simultaneously active pools with a sparseness of the
representations would enable more memories to be representation a = 0.05.
maintained active simultaneously, until the limit of all the
excitatory neurons being active was approached. We tested There were 20 pools in the integrate-and-fire simulations. (a)
the hypothesis by performing further simulations with 20 With synaptic facilitation, it was possible to cue on (500–1500
specific pools in the network each with the sparseness a =  ms) up to 20 pools, and for all cued pools to remain stably
active after the removal of the cues at 1500 ms. w + =  can, in combination with the recurrent collateral connections,
3.5, w inh = 0.95. In the example shown, 14 pools were cued, enable the process including the synaptic facilitation to be
and all 14 remained firing in the short term memory period regenerated so that the short-term memory of just the cued
after the cues. The firing rates of all 20 pools are shown, with pools can be maintained for many seconds.
those within the cue set shown with solid lines, and those not
cued with dotted lines. (b) Without synaptic facilitation, it To be clear, we note that the issue of how many memories
was possible to cue on (500–1500 ms) up to 7 pools, and for can be maintained simultaneously active in a short-term
these to remain stably active after the removal of the cues at memory is different from the issue of how many memories
1500 ms. w + = 2.5, w inh = 0.975. In the example shown, 8 can be stored in an attractor network and later correctly
pools were cued, and only 6 remained firing in the short term retrieved one at a time, which is a much larger number that
memory period after the cues, with 2 of the pools not scales with the number of synapses on each neuron, can be
maintaining their firing rates. The firing rates of all 20 pools increased by a reduction in the sparseness of the
are shown, with those within the cue set shown with solid representation, is facilitated by diluted connectivity, and is in
lines, and those not cued with dotted lines. There were 4000 the order of thousands in a network with 10,000 recurrent
neurons in these simulations, and λext was 3.05 Hz per collateral synapses onto every
synapse on each of the 800 external synapses. Similar neuron [4], [6], [21], [50], [51].
results to those in (b) could also be obtained with w + =  As shown in Figs. 4 and and5,5, the precision with which the
6, w inh = 1.1. inhibition must be set increases as the number of memories
Discussion required to be simultaneously active increases. In the
simulation shown with 10 specific pools (a = 0.1), the limit
In this work, we have investigated what parameters appear was reached with 9 pools simultaneously active. In this case,
to be important in setting the limit on the number of short- 90% of the excitatory neurons were active. Under these
term memories that can be maintained active simultaneously circumstances, the constraint is to find a value of w inh that is
in a discrete attractor network. A key parameter was found to sufficiently small that all these cued pools can be active
be the inhibitory synaptic weights, which has led us to without quenching each other through the inhibitory neurons;
suggest that, at least in part, there may be a capacity limit and at the same time for w inh to be sufficiently large that
set by how finely in practice the inhibition needs to be tuned when no pools are cued, the spontaneous state is stable, a
in the network, and also by the sparseness of the requirement for a short-term memory system.
representation. We have further shown that synaptic
facilitation of the type found in the prefrontal cortex boosts This led us to hypothesize that if we reduced the sparseness
such capacity limit by effectively increasing the synaptic of the representations, this would enable more
strengths just for those pools to which a cue is applied, and representations to be maintained active simultaneously. We
then maintaining the synaptic facilitation by the continuing tested the hypotheses by performing further simulations with
neuronal firing in just these pools when the cue is removed. the sparseness a = 0.05, with 20 specific pools in the network.
We found that with synaptic facilitation incorporated as
The neural mechanism described here for enabling a single described here, it was possible to maintain 20 different short-
attractor network to maintain multiple, in our case up to 9 term memories active simultaneously (with an example of 14
with a sparseness a = 0.1, memories simultaneously active is simultaneously active illustrated in Fig. 6a). However, without
very biologically plausible, in that recurrent collateral synaptic facilitation, it was possible to maintain only seven of
connections to nearby neurons with associative synaptic the 20 memories simultaneously active (Fig. 6b), and the
plasticity are a feature of the architecture of the increase in the number as the representations are made
neocortex [6], [33], as is synaptic facilitation. Indeed, more sparse is consistent with mean field analyses [30].
synaptic facilitation is common in higher cortical areas such These results show that there is no ‘magic’ limit on the
as the prefrontal cortex [41]–[43], in contrast with early number of memories that can be maintained simultaneously
sensory areas where depression is more usual [41]. This is active in short-term memory. The number is set in part by
very consistent with the evidence that the prefrontal cortex the sparseness of the representations, with sparse
is involved in short-term memory [1], [6], [49], whereas it is representations allowing more short-term memories to be
important that in early sensory cortical areas the neurons simultaneously active. However, the results show that there
faithfully represent the inputs, rather than reflecting short- is a very real gain in the number that can be kept
term memory [6]. simultaneously active if synaptic facilitation is used as part of
the mechanism.These points lead us to the following
The use of synaptic facilitation in the multiple active short- hypothesis. The number of memories that can be maintained
term memory mechanism described is important to the simultaneously active is in practice in a single network in the
success of the system, for without the synaptic facilitation it cortex in the order of 7, because the sparseness of the
is difficult to maintain more than a few representations representation is unlikely to be more sparse than a = 0.1.
simultaneously active [29]–[31]. In the attractor network we Representations that are somewhat distributed in this way
used, we took into account the sparseness of the are important to allow completion of incomplete
representation as shown here. In particular, with the representations in memory systems, and robustness against
sparseness a = 0.1 it was possible to maintain only 6 damage to synapses or neurons [6], [34]. In fact, as
memories simultaneously active without synaptic facilitation, representations in the cortex are graded, the measure of
and 9 with synaptic facilitation in the same network. With a =  sparseness we have defined for the graded case indicates
0.05 it was possible to maintain only 7 memories even less sparse representations than the value of
simultaneously active without synaptic facilitation, and 20 0.1 [6], [34]. With that level of sparseness, a large number of
with synaptic facilitation. Thus synaptic facilitation greatly neurons will be simultaneously active, and this will make
increases the number of memories that can be maintained setting the inhibition difficult, as just described. We are led
simultaneously active, and this is one of the new findings therefore to the suggestion that the number of memories
described in this research. The use of synaptic facilitation is that can be kept simultaneously active in short term memory
of conceptual interest in the mechanism described, for it is a in a single cortical network is limited by the sparseness of the
non-associative process, which is what enables just the cued representation, which is not very sparse, and by the difficulty
pools to remain active when they are cued without any of setting the inhibition when many neurons are
further associative learning. Moreover, as shown here, the simultaneously active. Moreover, as we show here, synaptic
effect of the synaptic facilitation is sufficiently large that it
facilitation can significantly increase the number that can be to attractor networks with discrete representations where the
maintained simultaneously active, by effectively altering the term short term memory capacity, i.e. the number of
energy landscape on each trial for just the pools of neurons separate memories that can be stored, actively maintained
that have been cued. and correctly recalled, does not aim to reflect such precision.
The reason for this is that, whereas continuous attractor
We note that these findings were obtained with non- networks account for precision by considering the dynamics
overlapping pools of neurons, and that the restriction on the of the width of the bumps that are actively maintained during
number that can be maintained simultaneously active will if the delay period [27], in discrete attractors such widths can
anything be increased when there is overlap between the not be defined. In a sense, discrete attractor networks can be
pools, due to interference between the different pools due to considered a limit case of continuous attractor networks in
their overlap [29]. The system that we have simulated does which the items that can be encoded differ sufficiently from
in fact implement some functional overlap between the each other to engage clearly distinct populations of selective
pools, through the effect of w − (see Fig. 1), and this does neurons. Discrete attractor networks are highly relevant
simulate effects of interference and cross-talk between the when the items being stored are not part of a continuum, but
different excitatory short term memory populations of are separate and different items [6], [59]. Thus, although
neurons. most experimental (and theoretical) paradigms addressing
the question of how neural resources are allocated to
Another feature of the mechanism described is that it does
different items in short term memory [25]–[27] consider the
not rely on oscillations, precise timing in the system, or a
accuracy dimension to discriminate between two main
special mechanism to read out which short term memories
competing families of models (i.e. fixed capacity
have been cued: the firing rate of the cued neurons is
models vs dynamic allocation models), the use of saliency, an
available, and is the usual way that information is read out
experimental variable that can be easily manipulated, has
from memory [6], [34]. This is in contrast to another
also demonstrated its value to contribute remarkable
mechanism that has been proposed that is based on
predictions. In particular, the predictions of a discrete
oscillations to implement multiple short term
attractor network similar to the one proposed in this work
memories [52], [53]. Although there is some evidence that
(although without synaptic facilitation) that was presented by
oscillations may play a role in short term memory function,
the authors [31] lie somewhere between pure slot and pure
there is controversy, with some studies indicating that there
shared resources models. This in agreement with the recent
is an increase of power in the alpha band (9–12 Hz) [54] with
results by Wei et al. [27] obtained by making use of a
short memory load whereas more recent studies indicate
continuous attractor network.
that an increase occurs in the lower range-theta (2–6 Hz) and
gamma bands (28–40 Hz) with a power decrease in the It is also worth noting that in the context of short term
alpha/beta band (10–18 Hz) (as discussed in Lundqvist et memory, the property investigated here is how many such
al. [55]) or in the theta band (4–12 Hz) [56]. Furthermore, the discrete short term memory states can be maintained active
locus (or loci) of short term memory function leading to simultaneously. While the number of memories that can be
capacity limits have not been fully established. Thus, stored and correctly recalled is high (in the order of the
although the proposed model could be extended to account number of synapses per neuron divided by the sparseness of
for oscillations by, for instance, changing the relative the representation [6], [60]), the number that can be
contribution of the slow NMDA and the fast AMPA receptors maintained simultaneously active is much smaller, with some
to the total synaptic currents [57], [58], or by introducing a of the relevant factors considered here.
mechanism based on the interplay of the different time
constants of the synaptic facilitation and neuronal adaptation In addition, we note that some types of short term memory
processes as in Mongillo et al. [38], we have preferred not to encode the order of the items [52], [61], but that order
incorporate such mechanisms, and instead propose a information may not be a property of all types of short term
minimal model that is investigated in depth in order to gain a memory, including for example that involved in remembering
fundamental understanding of how synaptic facilitation can all the subjects in a sentence referred to above.
boost short term memory capacity.
We further comment that the mechanism described here
The mechanism described here may play an important part utilizes synaptic facilitation, and that this mechanism would
in language, by enabling a single cortical network of 1–2 enable items to remain active relative to other items even if
mm2 to keep active multiple items simultaneously, at the same time there is some overall synaptic depression.
representing for example individuals that are the subjects of Indeed, the important new issue that we address here is how
a sentence [13], and thus reducing the load on the syntactic synaptic facilitation can provide a mechanism for increasing
mechanisms that implement language in the brain. By syntax the number of items that can be maintained simultaneously
we mean in this context in computational neuroscience the active in short term memory. We also note that long-term
ability to represent the fact that some sets of firing neurons associative synaptic modification can occur rapidly, as shown
might represent the subjects of a sentence, and other sets of by studies of long-term potentiation [6], so that the new
firing neurons the objects in a sentence, so that some form of attractor states required for new items to be stored in a short
binding mechanism is needed to indicate which firing term memory could be set up rapidly.
neurons are the subjects, and which are the objects. A
further way in which the multiple short term memory Finally, we point out that factors that reduce synaptic
mechanism described here may be useful in the brain’s facilitation [62] could cause a deterioration in short term
implementation of language is that if there is a need to memory by for example reducing effective synaptic weights,
reactivate assemblies in a sentence that is being produced which in stochastic neurodynamical systems can decrease
(for example to determine when forming a verb whether the the stability of the high firing rate attractor states that
subject was singular or plural), then non-specific activation implement short term memory [7], [63], and this opens a
working in conjunction with the remaining synaptic new avenue for helping to minimize the deterioration of short
facilitation might enable those assemblies to be reactivated, term memory that occurs in normal aging. We also note that
as illustrated in Fig. 3. given the potential role of the prefrontal cortex in the
cognitive symptoms of schizophrenia [10], [12], factors that
Some approaches to short term memory consider continuous enhance excitatory synaptic connectivity such as synaptic
attractor networks in which the concept of the precision of facilitation, may provide useful approaches to explore for
the memory is relevant [27], but that concept does not apply treatment [12], [64], [65].
A new discovery: How our memories stabilize while different mechanism in older animals compared to younger
we sleep ones. Further, in older mice the synaptic changes linked to
new memories were much harder to modify than the changes
Scientists at the Center for Interdisciplinary Research in seen in younger mice.The basic biological processes for
Biology (CNRS/Collège de France/INSERM) have shown that laying down memories is shared by mammals, so it is likely
delta waves emitted while we sleep are not generalized that memory formation in humans follows the same
periods of silence during which the cortex rests, as has been processes discovered in mice.Lead researcher Professor Karl
described for decades in the scientific literature. Instead, Peter Giese, from the Institute of Psychiatry, Psychology &
they isolate assemblies of neurons that play an essential role Neuroscience at King's, said: 'Our results give a fundamental
in long-term memory formation. These results were insight into how memory processes change with age. We
published on 18 October 2019 in Science.While we sleep, the found that, unlike in the younger mice, memories in the
hippocampus reactivates itself spontaneously by generating older mice were not modified when recalled. This 'fixed'
activity similar to that while we are awake. It sends nature of memories formed in old age was directly linked to
information to the cortex, which reacts in turn. This exchange the alternative way the memories were laid down, which our
is often followed by a period of silence called a 'delta wave' research revealed.''Until now it was thought that older people
then by rhythmic activity called a 'sleep spindle.' This is when should be able to form memories in just the same way as
the cortical circuits reorganize to form stable memories. younger people, so overcoming memory problems would
However, the role of delta waves in the formation of new simply involve restoring this ability,' added Professor Giese.
memories is still a puzzle: why does a period of silence 'However, our results suggest this is not true, and that there
interrupt the sequence of information exchanges between is an important biological difference in how memories are
the hippocampus and the cortex, and the functional stored in old age compared to young adulthood.'The results
reorganisation of the cortex? The authors here looked more may have implications for conditions where memory recall is
closely at what happens during delta waves themselves. a problem, such as post-traumatic stress disorder (PTSD).
They discovered, surprisingly, that the cortex is not entirely Professor Giese suggests that ageing should be taken into
silent but that a few neurons remain active and form consideration when treating patients with PTSD, since
assemblies, i.e. small, coactive sets that code information. confronting and modifying traumatic memories is a core
This unexpected observation suggests that the small number feature of some psychological treatments such as trauma-
of neurons that activate when all the others stay quiet can focused cognitive behavioural therapy.The study is published
carry out important calculations while protected from in the journal Current Biology and is funded by the UK
possible disturbances. And the discoveries from this work go Biotechnology and Biological Sciences Research Council.
even further. Spontaneous reactivations of the hippocampus
determine which cortical neurons remain active during the
delta waves and reveal transmission of information between
the two cerebral structures. In addition, the assemblies Visual and auditory digit-span performance in native
activated during the delta waves are formed of neurons that and nonnative speakers — Nomi M. Olsthoorn
have participated in learning a spatial memory task during Amsterdam Center for Language and Communication,
the day. Together these elements suggest that these University of Amsterdam, The Netherlands; Department of
processes are involved in memory consolidation. To Psychology, Lancaster University, UK Sible Andringa and Jan
demonstrate it in rats, the scientists caused artificial delta H. Hulstijn Amsterdam Center for Language and
waves to isolate either neurons associated with reactivations Communication, University of Amsterdam, The Netherlands
in the hippocampus, or random neurons. Result: when the Abstract We compared 121 native and 114 non-native
right neurons were isolated, the rats managed to stabilise speakers of Dutch (with 35 different first languages) on four
their memories and succeed at the spatial test the next day. digit-span tasks, varying modality (visual/auditory) and
These results substantially change how we understand the direction (forward/ backward). An interaction was observed
cortex. Delta waves are therefore a means of selectively between nativeness and modality, such that, while natives
isolating assemblies of chosen neurons, which send crucial performed better than the non-natives on the auditory tasks
information between the periods of hippocampo-cortical (which were performed in the non-natives’ second language),
dialog and the reorganisation of cortical circuits, to form performance on the visual tasks (which was performed in
long-term memories. participants’ dominant language) did not significantly differ
between natives and non-natives. The interaction between
Study reveals fundamental insight into how memory nativeness and modality disappeared when the data were
changes with age corrected for Dutch proficiency. Correction for Dutch
proficiency elevated non-native speakers’ scores on the
New research from King's College London and The Open auditory tasks, without altering the non-natives’ digit-span
University could help explain why memory in old age is much rank order. Despite considerable differences in mean length
less flexible than in young adulthood.Through experiments in of the digit names zero to nine in the non-natives’ first
mice the researchers discovered that there were dramatic languages, these differences were not significantly correlated
differences in how memories were stored in old age, with their visual digit-span scores. While further research is
compared to young adulthood. These differences, at needed on the sources of variation in digit-span performance,
the cellular level, meant that it was much harder to modify we recommend the use of the visual digit-span task (forward
the memories made in old age.Memories are stored in the or backward) for cross-linguistic research and advise
brain by strengthening the connections between nerve cells, researchers to be aware of the association between language
called synapses. Recalling a memory can alter these proficiency and verbal workingmemory performance.
connections, allowing memories to be updated to adapt to a Keywords Bilingual digit span, visual digit span, auditory digit
new situation. Until now researchers did not know whether span, forward digit span, backward digit span, verbal working
this memory updating process was affected by age.The memory Corresponding author: Jan H. Hulstijn, Amsterdam
researchers trained young adult and aged mice in a memory Center for Language and Communication, University of
task, finding that the animals' age did not affect their overall Amsterdam, Spuistraat 120, 1012VT Amsterdam, The
ability to make new memories. However, when analysing the Netherlands. Email: j.h.hulstijn@uva.nl
synapses before and after the memory task, the researchers 466314IJB0010.1177/1367006912466314International
found fundamental differences between older and younger Journal of BilingualismOlsthoorn et al. 2012 Article
mice.New memories were laid down via a completely Downloaded from ijb.sagepub.com at Universiteit van
Amsterdam on March 27, 2015 664 International Journal of ijb.sagepub.com at Universiteit van Amsterdam on March 27,
Bilingualism 18(6) Introduction Linguistic research into 2015 Olsthoorn et al. 665 respond with a YES (they are the
individual differences in language proficiency faces the same) or NO (they are not the same), said aloud or indicated
challenge of finding the factors responsible for inter- via a button press. The use of non-words as stimuli in a
individual variations. One of the factors often mentioned as a verbal WM test is also problematic. Even though nonwords
covariate is working-memory (WM) capacity, more are by definition not real words, and therefore unfamiliar to
specifically verbal WM capacity. However, many measures of both native speakers and nonnative speakers, non-words in
verbal WM tend to be affected by long-term memory factors one language may be (part-) words in another. This even
such as vocabulary knowledge and familiarity with the applies to artificially generated simple consonant–vowel–
speech sounds of a language. This is particularly an issue consonant (CVC) structures. It is therefore not surprising that
when participant samples are heterogeneous. This paper familiarity and neighbourhood effects for non-word recall
addresses the question of whether the visual digit span could have been widely reported (e.g. Hulme, Maughan, & Brown,
be a simple verbal WM measure that can be used reliably 1991; Hulme, Roodenrys, Brown, & Mercer, 1995; Roodenrys,
with native and non-native speakers alike, testing verbal WM Hulme, & Brown, 1993). Even if the items are not evoking
but not at the same time linguistic proficiency. First any existing words, non-word recall is also affected by
introduced in 1974 by Baddeley and Hitch, verbal WM is just familiarity with the phonotactic properties and frequencies of
one of three modality-based components subserving the the language from which the non-words are generated
Central Executive (for an overview, see Baddeley, 2003). Also (Gathercole, 1995; Gathercole, Frankish, Pickering, & Peaker,
called the phonological loop, the verbal WM component 1999; Gathercole et al., 2001; Gathercole, Willis, Emslie, &
allows us to store and retrieve a limited amount of Baddeley, 1991; Kovács & Racsmány, 2008; Roodenrys &
phonological information, for a brief period of time. This time Hinton, 2002; van Bon & van der Pijl, 1997), giving
period can be extended by the use of sub-vocal rehearsal. participants whose L1 is more similar to the non-word source
Verbal WM memory capacity has shown itself to be a good an advantage. Daneman and Merikle (1996) found that
predictor of successful foreign-language learning (e.g. Atkins numerical stimuli could also be used reliably as items in a
& Baddeley, 1998; Gathercole, Service, Hitch, Adams, & measure of verbal WM. This is the basis of digit-span tasks
Martin, 1999; Kormos & Safar, 2008; Service, 1992; Service & (e.g. Salthouse, Mitchell, Skovronek, & Babcock, 1989).
Kohonen, 1995), as well as linguistic aptness in the first Again, participants are presented with series of stimuli
language (e.g. Daneman & Carpenter, 1980; Just & (digits) of increasing length, which they have to recall. Each
Carpenter, 1992; Waters & Caplan, 1996, 2005; although see length is presented two times, and a participant’s span is
also MacDonald & Christiansen, 2002, for an alternative view defined as the last length at which one or both lists from a
on this matter). Verbal WM is also assumed to play a role in series of a particular length have been recalled correctly.
linguistic proficiency, both for speakers of a first language Variations include visual and auditory presentation of the
(L1) and speakers of a second language (L2); however, stimuli, and recall being required in forward or backward
measuring verbal WM in a multilingual setting has proven order with reference to the original list. The forward span
difficult. The problem lies with selecting a task that measures task is thought to reflect mainly storage, and the backward
verbal WM, but not simultaneously language proficiency. span reflects processing aspects of WM as well. Research on
Unfortunately, most if not all measures of verbal WM by digit-span performance in bilinguals has shown that
definition utilise stimuli that are linguistic in some way or familiarity with the language plays an important role in digit-
other, be they actual words (as in word or sentence spans), span performance, with participants generally achieving
non-words or digits. Consequently, they tap into long-term higher span scores in their mother tongue than in their non-
memory for linguistic knowledge, in addition to WM per se. dominant language(s) (Chincotta & Hoosain, 1995; Chincotta
This results in an obvious advantage for those familiar with & Underwood, 1997b; Da Costa Pinto, 1991; Thorn &
the language in question. Below we will briefly discuss the Gathercole, 2001). This is in spite of the fact that digits can
current measures and the disadvantages of each when used be considered to be relatively overlearnt stimuli in any
in a multilingual setting. Sentence-span tests (Daneman & language and are usually among the very first words taught
Carpenter, 1980), like most WM tasks, typically are presented in foreign-language education. Digit span has been studied in
in series: two sentences of average length, followed by a relation to linguistic diversity in several ways. Cross-linguistic
series of three such sentences, followed by four, etc. The comparisons show that digit-span size differs across
participant listens to the sentences or reads them aloud and languages, depending on between-language variation in the
after each series has to recall the final word of each length of digit names and thus in the average duration to
sentence. After having completed three series of a particular overtly or covertly articulate digit names. For example, digit
length, the participant then moves on to the next length, and names in Chinese consist of fewer phonemes (average of 2.5
so on. Sentence span is determined as the last length of phonemes) than digit names in English (3.2), while digit
which a participant has correctly recalled all words from at names in English are shorter than those in Finnish (5.7) (e.g.
least two out of three series. Word-span tests work on a Chincotta & Underwood, 1997a; Hoosain, 1979, 1982, 1984;
similar principle but use lists of words, rather than full Naveh-Benjamin & Ayres, 1986). Research typically involves
sentences as the initial input. The problem with both the comparing participants’ span sizes with and without
sentence-span and the word-span task is that they present articulatory suppression. Differences in the span size of
an advantage for native speakers that know the language bilinguals’ L1 and L2 are eliminated when the digit-span task
over non-native speakers who might not be familiar with is performed under articulatory suppression (Baddeley,
some of the words or the predictability of sentential or Thomson, & Buchanan, 1975, Chincotta & Underwood,
morphophonological structures. A solution to this is using 1997a). Additional research conducted by Chincotta, Hyöna,
non-words instead of existing words. The non-word task has and Underwood (1997) showed that, if digit names are longer
two variants: non-word repetition and non-word recognition in subjects’ L1 than in their L2, which is the case in Finnish–
(Gathercole, Pickering, Hall, & Peaker, 2001). Both make use Swedish bilinguals, the advantage of performing a digit-span
of non-words as stimuli, in lists of increasing length that are task in L1 over performing it in L2 is smaller than if digit
presented auditorily and subsequently have to be either names are shorter in L1 than in L2, which is the case in
recalled from memory (for the repetition task), or compared Swedish–Finnish bilinguals. This finding suggests that not
to a second list in which the same items are either in only actual word length of the digits and subjects’ speech
identical order as in the first list, or with two items of the list rate are the source of variation, as was previously postulated
swapped in position (for the recognition task). In the list- (Ellis & Hennely, 1980), but also subjects’ ‘relative level of
comparison task, participants Downloaded from competence between the languages for numerals’ (Chincotta
et al., 1997, p. 257). Downloaded from ijb.sagepub.com at puberty learners of Dutch as a foreign language and their
Universiteit van Amsterdam on March 27, 2015 666 proficiency level was equivalent to or higher than B1 in terms
International Journal of Bilingualism 18(6) In principle, the of the Common European Downloaded from ijb.sagepub.com
research just mentioned poses two potential problems for at Universiteit van Amsterdam on March 27, 2015 Olsthoorn
comparative research on the contributing factors of linguistic et al. 667 Framework of Reference for Languages (Council of
skills. Firstly, the between-language articulation-speed effect Europe, 2001). They were native speakers of 35 different
suggests that it might not be possible to compare speakers languages, including Russian (9 speakers), German (9
of different languages on their digit-span performance, speakers), Bahasa Indonesia (9 speakers) and Spanish (8
because these languages may vary in word length and in speakers). All participants were selected from the same age
average articulation rate. This challenges the possibility (1) range (19– 40), but the native group was slightly younger (M
of using a heterogeneous group of L2 learners in linguistic- = 25; SD = 5) than the non-native group (M = 29; SD =5).
proficiency research and (2) of comparing a group of L2 This difference was not significant, however. Participants of
speakers (of either homogeneous or heterogeneous language both high (higher vocational and university) and low (all
background) of a given language with native speakers of that other vocational) educational backgrounds were recruited
language. Secondly, comparing native and non-native (for the natives, 61 high and 60 low; for the non-natives, 66
speakers is also complicated by the finding that digit-span high and 47 low). Despite this, the groups differed on the
performance is generally better in one’s L1 than in one’s L2. complex matrices component from the Wechsler Adult
Despite the potential problems just mentioned, the present Intelligence Scale III (WAISIII), a measure of non-verbal IQ
study explores the possibility of using the visual and auditory (Wechsler, 1997), with native speakers scoring slightly higher
digit-span task for comparing verbal WM span between (M = 11.8; SD = 2.5) than the non-native speakers (M =
speakers of different language backgrounds. Participants 10.9; SD = 3.0); this difference was significant (F (1, 233) =
were 114 non-native speakers of Dutch (native speakers of a 6.588; p < 0.05). Materials Auditory forward and backward
large variety of languages) and 121 native speakers of digit-span tasks. Dutch language digits one to nine were
Dutch. The study reported here is part of a research project synthesised by using the MBROLA speech synthesiser
aiming at explaining individual differences between and (Dutoit, Pagel, Pierret, Bataille, & Van der Vrecken, 1996)
within native and non-native speakers of Dutch in performing with the NL3 diphone database, and all were subsequently
a listening comprehension task in Dutch. The explanation of edited in Praat (Boersma & Weenink, 2009) to be exactly 1
individual differences in listening comprehension is sought in second in duration. All digits were then judged by three
participant characteristics such as age, level of education, independent native speakers on their comprehensibility, and
non-verbal intelligence and WM capacity, and in performance adapted accordingly if deemed necessary until all three
on a range of verbal tasks (one of which was a Dutch judges were in agreement, with the provision that all
vocabulary-size task) (Andringa, Olsthoorn, van Beuningen, resulting sound files would remain 1 second in duration. The
Schoonen, & Hulstijn, 2012). Thus, one important question in procedures for the auditory forward and backward digit-span
the project was which type of WM task we should best use to tasks followed the recommendations of the WAIS-III Digit
predict individual differences in listening comprehension. To Span subtasks (Wechsler, 1997) and were implemented using
our knowledge, the present study is the first one the e-prime experimental software package (Schneider,
investigating differences between visual and auditory digit- Eschman, & Zuccolotto, 2002). The only change was that
span performance in non-native speakers in comparison to participants were required to type in their responses, rather
native speakers, examining the possible mediating effect of than say them out loud.1 Participants were instructed that
language proficiency (alternatively called language they would hear series of numbers and that their task was to
competence or language familiarity). In addition, the study enter these numbers in the correct order into the computer.
allowed us to replicate an effect of between-language Participants were seated at about 60 cm from a computer
differences in the average length of digit names on visual screen and were wearing headphones. To familiarise the
digit span. Visual digit-span tasks are widely used as a participants with the digitised numbers, these were played in
measure of the phonological loop. The phonological similarity order from one to nine, while at the same time their visual
effect for visual stimuli (Baddeley, 1968) and the finding that counterpart was displayed on the monitor in large font for 1
irrelevant speech sounds tend to disrupt memory even for second. After having been familiarised with the auditory
visually presented digits (Colle & Welch, 1976; Salame & digits, participants then received three practice trials
Baddeley, 1982) demonstrate that sub-vocal rehearsal is (consisting of two trials of length 2 and one trial of length 3).
involved in memorising visually presented items. While the In the experiment, participants heard series of digits of
auditory digit-span task forces participants to encode the increasing length, played back at a rate of one digit per
information in the language of the stimuli, the visual digit- second, and were required to enter these when prompted
span task allows participants to encode the sequences of after each series, in the order in which they were first
digits in their preferred language. This may make the visual presented for the forward task and in reverse order for the
task more suitable for measuring verbal WM capacity in a backward task. The minimum series length was two digits,
multilingual setting, that is, with participants with different increasing with one digit every two trials until the maximum
language backgrounds, even though between-language length of nine digits (for forward series, eight for backward
differences in the length of digit names may affect series) was reached, or until participants failed to respond
performance. We also wanted to explore whether the correctly to both trials of a particular length. Depending on
advantage of native speakers over non-native speakers in this variable exit point, this part of the experiment took
the auditory task might decrease when scores are corrected about 5–10 minutes to complete. Visual forward and
for language proficiency. The expectation was that correction backward digit span. The procedure for the visual digit-span
for language familiarity (i.e. L2 proficiency in the case of the tasks was exactly the same as for the auditory tasks, but
non-native speakers) will not affect the native–non-native instead of audio files of word names that were being played,
comparison when the digit-span task is performed in the the numbers (numerals) were presented in 75-point Arial font
visual mode, because participants will perform the task in in blue on the centre of a white computer screen for a
their dominant language (L1). In contrast, correction for duration of 1 second each, with no blank screen in between
language familiarity might compensate for non-native the items of a list. Visual span sessions also omitted the
speakers’ disadvantage of performing the task in the familiarisation block preceding the practice trials and again
auditory mode. Method Participants Participants were 121 took between 5 and 10 minutes to complete. Participants
native speakers of Dutch (84 female) and 114 non-native were instructed to type in the series of numbers Downloaded
speakers (76 female). All non-native speakers were post- from ijb.sagepub.com at Universiteit van Amsterdam on
March 27, 2015 668 International Journal of Bilingualism Pearson r values between .96 and 1.00. Thus, while
18(6) (in the order required). It was suggested that they use correction for language proficiency raised non-natives’ scores
their L1 to perform the task: ‘You may do the task in your in the auditory tasks (see Table 1), subjects’ rank order
first language’ (literal translation of the original Dutch remained virtually the same. The associations between the
instruction).2 In agreement with the WAIS-III instructions for four original digit-span measures were moderate in both the
digit-span tests, WM capacity was determined as the native and the non-native groups, with Pearson r values
maximum length at which a participant had responded at ranging between .45 and .51 (natives) and between .34 and .
least one out of two trials of the same length correctly, 61 (non-natives), with all coefficients significant at the .000
before having made an incorrect response on both level. The estimated mean length (in phonemes) of the 10
consecutive trials of the same length. Thus, WM capacity digit names (from zero to nine) in participants’ L1 varied from
could vary from 2 to 9 for the forward tasks, and from 2 to 8 2.5 (Mandarin) to 6.0 (Kinyarwanda). The mean length
for the backward tasks. Vocabulary test. Vocabulary calculated over all 114 non-native participants was 3.9 (SD =
knowledge was tested with a computer-administered 0.6). The mean phoneme length of the 10 digit names in
multiple-choice test. For each item, participants had five Dutch is 3.1. Importantly, mean phoneme length and visual
alternative paraphrases of the meaning of the target word to digit span were not significantly associated in the non-natives
choose from, the last one always being: ‘I really don’t know’. (n = 114); Pearson r values between mean phoneme length
The paraphrases were always couched in simple, common and forward and backward digit span was –.06 and –.16 (ns).
language. Several considerations guided item selection and Discussion The main aim of the study was to find out which
construction. Firstly, words should not be too domain- type of digit-span task lends itself best for use in studies
specific. Care was taken not to introduce any regularity in the involving both native and non-native speakers of a given
length of the alternatives and the way meanings were language. Adult native and non-native speakers of Dutch (n
paraphrased. The final test consisted of 60 items selected = 121 and 114, respectively) performed four digit-span tasks
from a total of 140 items that were piloted amongst 48 (auditory forward, auditory backward, visual forward and
native and non-native speakers. The target words were visual backward). As expected, a significant main effect of
selected on the basis of frequency information from the direction was obtained. Furthermore, a significant nativeness
CELEX database (Baayen, Piepenbrock, & Gulikers, 1995). × modality effect was observed: while natives obtained
CELEX consists of 124,000 lemmas of Dutch, with frequency significantly higher scores than non-natives in the auditory
information based on a corpus of written Dutch of 50-million mode, this was not the case in the visual mode. This
word forms (see also http://celex.mpi.nl/). Results All tests significant interaction (i.e. the advantage of the natives in
reported below are two-tailed. Scores on the vocabulary test the auditory mode) disappeared when performance was
ranged from 26 to 60 for native speakers (M = 41; SD =6) corrected for language proficiency (the Dutch vocabulary
and from 4 to 55 for non-native speakers (M = 28; SD = 9) test). Betweenlanguage differences in the average length of
(Welch’s F (1, 194.69) = 145.802; p < 0.001). An analysis of digit names in non-native participants’ L1 (ranging between
variance (ANOVA) of the digit-span data, with direction (2) 2.5 and 6.0 phonemes) did not appear to affect their
and modality (2) as within-subject factors, and nativeness (2) performance in the two visual digit-span tasks. The results
as a between-subject factor, revealed a betweensubjects confirm a well-documented effect of direction, with forward
main effect of nativeness, with native speakers tasks generally resulting in longer spans than backward tasks
outperforming the non-native speakers on average by .348 (see Ramsay & Reynolds, 1995, for a review). We also
items (F (1, 233) = 7.09; p < 0.05). A within-subjects main replicated the effect of proficiency, that is, that native
effect of direction (F (1, 233) = 165.668; p < 0.001; η2 p = . speakers tend to outperform non-native speakers in digit-
416) also obtained. This effect was due to the forward spans span tasks (Chincotta & Underwood, 1997b; Da Costa Pinto,
being on average 0.829 items longer than the backward 1991; Thorn & Gathercole, 2001). In our study, the
spans. In addition to this, there was an interaction effect advantage only held for the auditory version of the task
between modality and nativeness (F (1, 233) = 10.275; p < because that task had to be performed in Dutch, while the
0.01; η2 p = .042). No other main or interaction effects were visual task could be performed in participants’ native
significant. Additional ANOVAs showed that the modality × language. On average, digit span in the visual modality was
nativeness interaction effect was due to the native speakers equal for the native and non-native participants, despite the
and non-native speakers differing on the auditory digit spans, fact that the mean digit-name length of the native Table 1.
both in the forward direction (F (1, 233) = 12.76; p < 0.0001; Average span scores (SD in brackets) for native and non-
η2 p = 0.52) and backwards (F (1, 233) = 11.332; p < 0.01; native speakers and least-square means for the groups
η2 p = 0.46) but not on the visual span tasks. Also, native corrected for vocabulary score with IQ partialled out. ns nns
speakers performed better on the auditory tasks (F (1, 120) ns least-square means nns least-square means
= 17.77; p < 0. 0001; η2 p = 0.129), but native and non-
Forward auditory 6.45 (1.118) 5.91 (1.209) 6.415 (.120)
native speakers performed the same on the auditory and the
5.954 (.124) Forward visual 6.07 (1.243) 5.89 (1.431) 6.032
visual tasks. Since the native and non-native groups showed
(.138) 5.922 (.143) Backward auditory 5.55 (1.204) 4.98
an a priori difference in IQ, and IQ and vocabulary were found
(1.395) 5.465 (.133) 5.076 (.138) Backward visual 5.28
to be correlated (r = .342; p < 0.001), the IQ component of
(1.260) 5.18 (1.473) 5.331 (.141) 5.131 (.146) Downloaded
vocabulary was partialled out by means of linear regression
from ijb.sagepub.com at Universiteit van Amsterdam on
analysis, and the residuals were saved. To check for effects
March 27, 2015 670 International Journal of Bilingualism
of familiarity with the language, another factorial analysis
18(6) languages of the non-native participants was higher
was conducted with the vocabulary residuals as a covariate
(3.9) than the mean digit-name length in the Dutch language
(see Table 1 for both original means and least-square
(3.1). The fact that natives obtained higher scores in the
means). This resulted in a main effect of direction (F (1, 232)
auditory than in the visual tasks is reflective of an auditory
= 165.189; p < 0.01; η2 p = 0.416), but no other main or
superiority effect (Penney, 1989). The auditory superiority
interaction effects approached significance. In all eight
effect for serial recall items is generally thought to reflect the
measures (combinations of three binary distinctions:
fact that auditory presentation leads more reliably to the
natives/non-natives, auditory/visual, and forward/backward),
creation of a vivid phonological code than does visual
the correlations between Downloaded from ijb.sagepub.com
presentation of numerals, because subjects have to access
at Universiteit van Amsterdam on March 27, 2015 Olsthoorn
and retrieve the word names in the visual task, while the
et al. 669 the raw scores and the scores corrected for
word names are presented in the auditory task (Baddeley,
language proficiency (Dutch vocabulary) were in the range of
1986; Gathercole & Baddeley, 1990). The auditory superiority
effect was not obtained with the non-native speakers. component of WM. 2. Unfortunately we failed to ask the non-
Apparently, the advantage of being presented with the word native participants afterwards whether they had performed
names in the auditory task was cancelled because the task the visual digit-span tasks in their dominant language.
was performed in their L2. The fact that all differences
between native and non-native speakers disappeared when
linguistic proficiency was taken into account is telling. It
Anterior hippocampus: the anatomy of perception,
suggests that even with such overlearnt and abstract stimuli
imagination and episodic memory
as digits, the phonological-loop capacity, measured with a
digit-span task, reflects, at least partly, familiarity with Peter Zeidman and Eleanor A. Maguire*
(proficiency in) the language. As such, our results are
corroborative of MacDonald and Christiansen’s (2002) claim Abstract The hippocampus is critical for learning, memory
that verbal WM equals one’s amount of experience with and cognition. It is one of the most studied brain structures in
comprehension processes in a particular language. The neuroscience and efforts to understand its functions continue
following important question then arises. Is it empirically apace. In particular there is considerable interest in
possible to tease apart, in digit-span performance in a group functional differences between neural populations located at
of native speakers, (i) the variance produced by individual different points along its anterior–posterior axis 1–5. A recent
differences in articulation speed and (ii) the variance literature review3 showed that, in addition to the gradients of
produced by individual differences in familiarity with the gene expression and connectivity that vary along the length
language? In other words, to what extent is a person’s of the hippocampus, the anterior portion of the hippocampus
articulation speed an index of his or her language familiarity? (also known as the head, ventral or temporal region) can be
On the basis of the literature and on the basis of our findings, distinguished from the intermediate and posterior sections
what advice can we give to researchers of bilingualism with (also known as the tail, dorsal or septal regions) by sharp
respect to measuring verbal WM capacity in a cross-linguistic changes in these functional characteristics.
setting? In the introduction, we listed some important
The anterior hippocampus (and particularly that of humans)
disadvantages of using sentence-span tasks and non-word-
has proved to be difficult to study, as a result of its complex
span tasks. In general, depending on the study’s research
anatomy6. It has an intricate structure with unique cellular
questions of course, the auditory and visual digit-span tasks
morphology and is positioned at the junction between the
appear to lend themselves better to the purpose. The
parahippocampal gyrus, the amygdala and posterior
advantage of the auditory digit-span task is that all
hippocampus. It has widespread connectivity and damage
participants conduct the task in the same language.
that includes the anterior hippocampus has a deleterious
Researchers might consider correcting the raw digit-span
effect on learning, memory and navigation7–9. However, we
scores with a measure of language proficiency (such as
are only beginning to understand its precise anatomy in
vocabulary). Importantly, in our study this did not affect
humans10 and consequently little is known about the specific
participants’ digit-span rank order. The advantage of the
structures within the anterior hippocampus that contribute to
visual digit-span task (with numerals as stimuli) is that
these important cognitive functions.
participants perform the task in their strongest language, and
thus the score comes closer to the success with which they In this article, we focus on the human anterior hippocampus.
can store and process phonological information in WM. There We summarise current knowledge of its anatomy and
is strong evidence that this type of WM capacity is positively highlight a striking consistency in the functional magnetic
associated to learning a foreign language (see the references resonance imaging (fMRI) literature, which suggests that
in the introduction). The potential drawback of the fact that specific substructures within the anterior hippocampus make
languages differ in the length of numeral names can be dealt critical contributions to episodic memory, imagination and
with by estimating the mean length of digit names in visual scene perception. Claims that the hippocampus (and
participants’ first languages and computing the association the medial temporal lobes (MTL) more generally) performs
between that length and participants’ visual digit-span score. functions beyond memory (such as visual perception) are
Depending on the study’s cross-linguistic participant sample, heavily contested11, particularly because it is difficult to
this may or may not form an obstacle. Surprisingly, in our design experiments that unequivocally control for memory
study, the visual digit span of the non-natives was not encoding and retrieval. By emphasising the importance of
significantly different from that of the natives, although the anatomical detail, and by reference to known connectivity
average digit-name length in their first languages was longer across species, we propose a model of anterior hippocampal
than that in Dutch. These findings may encourage function that may help to clarify recent discussions of
researchers to use the visual digit-span tasks despite functional differences in the long axis of the hippocampus 1–
between-language differences in digit-name length. The 4
 as well as the relationship between visual perception and
decision to use the forward or backward version of the task memory.
will depend on the researcher’s wish to increase (backward)
or not (forward) the WM component of information Anterior hippocampus anatomy
manipulation. Acknowledgements The authors wish to thank
Netta Meijer and Tineke van der Linde for their help running In humans, the hippocampus is cradled by the
the experiments and Padraic Monaghan for his help in parahippocampal gyrus which is positioned beneath it. The
generating the auditory digits. Downloaded from anterior part of the parahippocampal gyrus bends over and
ijb.sagepub.com at Universiteit van Amsterdam on March 27, rests upon itself, forming the uncus. The various gyri that are
2015 Olsthoorn et al. 671 Funding This work was supported visible on the brain’s surface (Box 1) cover both the
by the Dutch Organisation for Scientific Research (NWO; amygdala and the anterior portion of the hippocampus.
grant number 360-70-230). Notes 1. As one anonymous Beneath this surface, the hippocampus is divided into
reviewer of this paper noted, having the participants type cytoarchitectonically defined subfields. These are commonly
their responses is a nonstandard implementation of the thought of as being arranged in a canonical circuit, which can
verbal digit-span task, because of the recoding of verbal be visualised by slicing the main body of the hippocampus in
information into motor and spatial codes necessary for the cross section (Fig 1). In this circuit, the entorhinal cortex
response. Also, due to the requirement to give typed (EC), which is part of the neighbouring parahippocampal
responses, the participants may have been prone to encode gyrus, projects to the dentate gyrus (DG), which in turn
verbal stimuli into a series of spatial positions before storage drives subfield CA3 then CA2, CA1 and the subiculum, before
and thus make use of the Visual-Spatial Sketchpad projecting back to EC. Also visible in this plane are regions
related to, but distinct from the subiculum: the prosubiculum,
presubiculum and parasubiculum. The prosubiculum is
situated between CA1 and the subiculum. However, despite
many differences between its connectivity and that of the
subiculum12, these two regions are not normally separated in
human neuroimaging due to the lack of borders visible with
MRI. The presubiculum and parasubiculum are situated
between the subiculum and the EC. These regions can be
delineated with high resolution MRI10 which offers new
opportunities to study their functions in humans.

Box 1

A tour of the anterior hippocampus

The anterior hippocampus may be understood in terms of its


cytoarchitecture (see main text) or in terms of the features
that are visible on the brain’s surface, which are summarised Figure 1
here based on previous observations10,13,94. (See the figure
part a for context. Dashed lines labelled b-e correspond to Anatomy of the anterior hippocampus.
the coronal slices shown in Fig 1).
a| A schematic showing a dorsal view of the hippocampus
Starting in parahippocampal gyrus and moving anteriorly with the dentate gyrus (DG) visible inside. The main
(towards the left of the figure) the gyrus becomes the hippocampus body as well as the uncus are indicated. b-
entorhinal cortex. Its approximate position is indicated for e| Coronal slices showing the hippocampal subfields in the
clarity, although its borders are not visible on the brain’s anterior hippocampus. Red lines indicate subfields that are
surface (in coronal slices the entorhinal cortex appears found within the uncus and which have distinct
approximately at the level of lateral geniculate nucleus 101). cytoarchitectural properties relative to the main body of the
Parahippocampal gyrus then bends upwards, separated by hippocampus. The slices in parts b and c are at the level of
the narrow intrarhinal sulcus from the gyrus ambiens above. the uncinate gyrus, whereas slice d is at the level of the
The gyrus ambiens extends laterally to the sulcus intralimbic gyrus. The arrows in part e indicate the flow of
semiannularis, which marks the lateral extent of the information in the canonical hippocampal circuit in the body
entorhinal cortex. Further lateral to this is the semilunar of the hippocampus. Pro, prosubiculum; PrS/PaS,
gyrus, which covers the medial portion of the amygdala and presubiculum and parasubiculum; EC, entorhinal cortex;
the most anterior part of the hippocampus. HATA=hippocampal-amygdaloid transition area. Part a
adapted from: REF94. Parts b-e adapted from REF10.
The uncus of the hippocampus (marked on the figure by the
orange shading) is part of the anterior parahippocampal Understanding the arrangement of the subfields in the
gyrus that bends over and rests on itself, separated from anterior hippocampus requires awareness of their three-
entorhinal cortex below by the uncal sulcus. Moving dimensional shape. The main body of the hippocampus
posteriorly along the uncus (towards the right of the figure), bends medially in its anterior portion (Fig 1a) to form the
the gyrus ambiens becomes the uncinate gyrus, which extraventricular part of anterior hippocampus, which is
contains the hippocampal-amygdaloid transition area positioned within the uncus. As a result, CA1 and the
(HATA; Fig 1b,c). Next, the uncus becomes the band of subiculum must bend around a wider radius than the other
Giacomini and finally the intralimbic gyrus, getting subfields13. Thus, the most anterior portion of the
progressively smaller towards the posterior. The point at hippocampus is dominated by CA1 and the subiculum (Fig
which the intralimbic gyrus is no longer visible in coronal 1b) whereas the more posterior part of the uncus contains
section is a landmark for the posterior-most slice of the only the DG and CA3-2 (Fig 1d). After the medial turn within
anterior hippocampus2. Thus, the anterior part of the uncus is the uncus, the subfields bend upwards, ascending vertically
formed of the uncinate gyrus, whereas the posterior consists towards the amygdala (Fig 1b,c), which changes their
of the band of Giacomini and the intralimbic gyrus. The main orientation from the coronal to the axial plane.
hippocampal body and fimbria are positioned lateral to the
Subfields within the uncus have cellular ‘peculiarities’ 14: for
uncus. Part a adapted, with permission, from REF102.
example CA1 in the uncus - referred to as CA1’ - has smaller
Photograph of a dissected brain adapted, with permission,
and more densely packed neurons than CA1 in the body of
from REF103.
the hippocampus13 (note that CA1’ is labelled uncal
subiculum in an alternative nomenclature 15). The vertical part
of the uncus has particularly differentiated cytoarchitecture -
for instance, vertical CA1’ contains more densely packed
pyramidal cell bodies than horizontal CA1’ or CA1 of the
hippocampal body10. The functional implications of these
cellular differences are unknown.

Although the subfields are usually studied in slices


perpendicular to the hippocampal long axis (the coronal
plane in humans, Figure 1b-e), connections also extend
through the length of the hippocampus16,17. Within the
coronal plane and with a slight anterior inclination, mossy
fibres project from the DG to CA3, and upon reaching distal
CA3 they change direction to extend 3-5 mm in the anterior
direction17. By contrast, associational and local connections
within the DG — which may have excitatory and inhibitory
roles, respectively16 — extend bidirectionally along much of
the length of the hippocampus. The DG in the uncus is
connected to the anterior DG in the main body of the gradient in level of detail, from coarse representations in
hippocampus through these associational projections 16. The anterior hippocampus to fine detail in the posterior. An
connection from CA3 to CA1 (the Schaffer collaterals) and alternative explanation comes from a recent finding32 that an
connections within CA3 also have longitudinal projections animal’s precise location can be decoded from the activity of
through most of the hippocampus17. Once again, the uncus is cell populations in anterior hippocampus, despite each
distinct, with the targets of projections of the uncal CA3 individual cell only representing a larger area of the
(CA3’) being largely confined to CA3’ and CA1’. The human environment. Computer simulations demonstrated that this
hippocampus can therefore be divided into subfields in the distributed representation in anterior hippocampus (which is
main body of the hippocampus that are connected in cross- no less precise than that of the posterior hippocampus) make
section and longitudinally, as well as modified subfields that it better suited to generalising across environments (pattern
comprise the uncus and have more limited connectivity with completion), whereas the smaller place fields in posterior
anterior hippocampus. hippocampus would be better at resisting interference from
similar locations (pattern separation)33. Although this finding
Importantly, there are differences in the hippocampal is still to be replicated, the general consensus is that having
anatomy of different species. The uncus is particularly highly a spatially large-scale or generalisable representation of the
developed in primates, and the degree to which it is environment depends upon the anterior hippocampus 2,32.
homologous to its counterpart in rodents is uncertain 10,13.
There are also differences in connectivity. Whereas rodents Hippocampal cell populations involved in spatial processing
have commissural connections along the length of the represent both the current state of the animal, and also
hippocampus, in monkeys these connections are largely imagined future locations34–36 and enable ‘replay’ of
restricted to anterior regions, particularly the uncus18. Cells in remembered past locations37. However, recent evidence
the DG of the uncus project via the white matter of the suggests that a purely spatial account of hippocampal
fimbria or fornix, before crossing hemispheres via the ventral function is insufficient. For example, some aspects of
hippocampal commissure19, returning along the same allocentric spatial navigation are preserved in patients with
pathway in the opposite hemisphere to terminate in bilateral hippocampal damage9,38, whereas these individuals
contralateral uncus. Other regions associated with the exhibit a clear deficit in processing scenes 39,40.
hippocampus – presubiculum, EC and posterior
parahippocampal cortex - connect across hemispheres via Recalling and imagining scenes
the dorsal hippocampal commissure. These findings point to
Scenes are coherent object-containing spaces within which
a striking role for the anterior hippocampus in connecting the
we can potentially operate. If scenes develop over time they
hippocampi of each hemisphere. It has been suggested that
may be referred to as events or episodes (however, a
the ventral hippocampal commissure may not exist in
temporal dimension is not required in order to involve the
humans20; however, there is a notable lack of research
hippocampus39,41). Tasks that require scenes or events to be
relating human hippocampal connectivity to that of non-
imagined or recalled41–50, including those which involve
human primates.
navigation28, engage the hippocampus, with performance
Representing the environment impaired or abolished following bilateral hippocampal
lesions7–9.
We can vividly re-experience past events, simulate future
events and imagine fictitious scenarios, in addition to It has been hypothesised that the hippocampus contributes
experiencing the environment we currently inhabit. To to these tasks by providing a spatial representation (or
achieve this, we must be able to construct internal model) of the scene being processed 51,52. If so, common
representations of environments based on incoming sensory subregions within the hippocampus should be engaged by
information and/or prior experience. As we will argue, there is any task that involves the construction of scene
evidence that structures within the medial part of anterior representations. Indeed, as described below, there is a
hippocampus have a role in constructing these striking consistency in the fMRI results across many studies
representations. in the literature, which differed in the tasks they employed
but all involved naturalistic scenes. These studies reported
Decades of research, carried out primarily in rodents, has activation of anterior hippocampus, and close examination of
revealed several key cell types in the hippocampus and their findings reveals a common region of activation in the
neighbouring structures that contribute to spatial processing. medial bank of the anterior hippocampus, which we refer to
Place cells in CA1 represent an animal’s location 21, whereas as amHipp (Figure 2a-j). Here we use a working definition of
grid cells in the EC, presubiculum and parasubiculum may anterior hippocampus as the region having a MNI y-
provide a metric-like framework for spatial representation 22,23. coordinate of less than or equal to -22 (ensuring complete
Head direction cells, which represent the animal’s heading coverage of the uncus) and we define amHipp functionally, as
relative to fixed landmarks, have been discovered in a clusters of activated locations (voxels) that peak in the
number of interconnected regions including presubiculum, anterior hippocampus and exhibit a clear bias towards the
postsubiculum, anterodorsal thalamus, EC, retrosplenial medial half of the structure. Using this definition, it can be
cortex, mammillary bodies and thalamus23–25. More recently, seen that a number of cognitive tasks engage amHipp (see
boundary vector (border) cells, which represent the position below). Although several of these tasks, particularly those
of environmental boundaries, have been found in subiculum, involving visual stimuli, also engage other parts of the
presubiculum, parasubiculum and EC26. In humans, the hippocampus, amHipp appears to be the most consistently
hippocampus9,27,28, entorhinal29 and retrosplenial engaged subregion across studies and often shows the
cortices30 have been implicated in fMRI and greatest effect size. Quantifying the proportion of studies
neuropsychological studies of spatial navigation. which have found this specific region to be engaged for recall
and imagination would require a formal meta-analysis, which
The properties of place cells are modulated by their anterior- would be complicated by the limited scanning resolution of
posterior position within the hippocampus. In rodents, some studies. However, our observation is that this finding is
anterior (ventral) place cells have firing fields that cover a the norm rather than the exception. As we discuss below,
larger area of space than posterior (dorsal) place fields31. The high resolution MRI is starting to relate this fMRI activity to
notion that the anterior hippocampus represents less specific the precise underlying anatomy.
information contributed to the proposal2 that anterior–
posterior differences in hippocampus can be understood as a
For example, when subjects were cued to imagine static
atemporal scenes based on short descriptions, amHipp was
the only part of the hippocampus to be significantly engaged,
relative to a control condition in which participants imagined
isolated static objects41 (Figure 2a). Specific increases in
amHipp activity were found in another study in which
subjects constructed and elaborated upon past events and
imagined future events42 (Figure 2b). When participants
recalled episodic memories and imagined fictitious events set
in the past or future (based on recombined elements from
episodic memories) the amHipp was again the only part of
the hippocampus significantly engaged for imagination 43 and
was part of a larger region activated during both imagination
and recall (Figure 2c) whereas activation of posterior
hippocampus was found specifically for vividly recalling real
memories. Only amHipp was found to respond more strongly
to imagining specific past or future events rather than
general events44 (Figure 2d), whereas the anterior lateral
hippocampus distinguished past from future episodes.
Autobiographical memory retrieval was also found to engage
only amHipp45 (Figure 2e) although other subregions of the
hippocampus were responsive to whether retrieval was cued
using a direct association with the cue or a strategy of
searching through memories. A subsequent study sought to
distinguish the initial construction stage of autobiographical
memory recall from elaboration46, and within the
hippocampus found solely amHipp engagement for
construction (Figure 2f). Posterior hippocampus was engaged
Open in a separate window for elaboration, which connectivity analyses showed was also
linked to areas processing visual stimuli. These studies
Figure 2 suggest that despite different research questions, tasks and
Activation of the anterior medial hippocampus during some points of divergence – including the extent of
fMRI. differences between recalling the past and imagining the
future55 and the relationship between imagination, novelty
a-i| The images show the results of a selection of fMRI and encoding (Box 3) - one common feature across studies
studies in which amHipp (arrows) was engaged by was the engagement of amHipp, in addition to regions of the
imagination and recall (a-f) and visual perception (g-i) of wider core network42,53,54,56.
scenes and events. Coloured regions are those in which there
was an increased haemodynamic response during each task Box 2
(in part g, colours represent % of subjects (n=34) with A core network for recall and imagination
activation, range 50-80%). The tasks involved constructing
static atemporal scenes41(a), constructing and elaborating A recent study47 showed that recalling and imagining scenes
upon imagined events42 (b, white circle indicates the activates the anterior medial hippocampus (amHipp) as part
hippocampus), recalling past events and imagining events of a well-established ‘core network’ of brain regions56. There
set in the past and future43 (c), imagining specific rather than are various suggestions as to what common function this
general future events44(d), autobiographical memory network may serve56,104,105. It is similar to a set of regions
retrieval45,46 (e and f), viewing novel scenes relative to found to be most active during rest periods, termed the
scrambled images66 (g), viewing scenes in which an object ‘default mode network’ (DMN)106. The figure shows resting
had been moved relative to the background67 (h, white circle state fMRI activity (based on 1000 subjects at rest107,108,
indicates hippocampus) or correct versus incorrect scene accessed via the Neurosynth meta-analysis tool109 – see Further
oddity judgements69 (i). j| The overlap in activity between the Information
) illustrating regions that had activity that correlated
perception and imagination of scenes. Hippocampal with right amHipp. The colour scale indicates the strength of
activation for scene perception relative to object perception the correlations (Pearson correlations (r)) between the BOLD
is shown in red, activity for imagining scenes versus timeseries in each voxel and a seed voxel in the right amHipp
imagining objects is shown in blue and the overlap is shown (seed MNI co-ordinates 22,-20,-18; r > 0.2). This
in turquoise (also shown in sagittal and coronal slices)47. Part demonstrates the association between the DMN and the
a modified, with permission from REF 41. Part b modified, with amHipp. The DMN brain regions shown are the anterior
permission, from REF 42. Part c modified, with permission superior temporal sulcus (STS), inferior parietal cortex (IPC),
from REF 43. Part d modified, with permission, from REF 44. parahippocampal cortex (PHC), retrosplenial cortex (RSC),
Part e modified, with permission, from REF 45. Part f modified, ventromedial prefrontal cortex (vmPFC) and the septal nuclei.
with permission from REF 46. Part g modified, with permission In agreement with other authors110, we suggest that scene-
from REF 66. Part h modified, with permission from REF 67. Part related tasks such as recalling the past engage a similar set
i modified, with permission from REF 69. Part j modified, with of brain regions as the DMN because they depend on
permission from REF 47. common cognitive processes. When at rest, people often
construct spatially coherent scenes, which may be generated
Episodic memory and imagination in their imagination or recalled from their past. This gives rise
FMRI studies of episodic (autobiographical) memory, which to several testable predictions regarding the brain at rest. As
require participants to vividly recall specific events from their patients with 39,40
hippocampal lesions cannot imagine fictitious
past, generally find that a ‘core network’ of brain regions — scenarios , their mind-wandering behaviour should be
including anterior hippocampus — is engaged 53,54
 (Box 2). limited to events in the present. Moreover, there should be
Activation within the hippocampus may be limited to amHipp. changes within the DMN compared with healthy control
subjects. Indeed the latter has been reported in amnesic Summary
patients with bilateral medial temporal lobe damage111.
Here, we build upon the proposal that the hippocampus
constructs representations of scenes52. This will necessarily
occur when a stimulus is novel or not recently recalled, and
may integrate non-spatial elements including emotional
valence.

The hippocampus may contribute to these tasks by linking


elements of scenes in a coherent spatial representation, as
suggested by studies in humans with bilateral hippocampal
lesions. Self-reports from these patients demonstrate that
they can fully comprehend the elements that should appear
in an imagined scene, but cannot arrange them into a
spatially coherent representation39,40. In one study, patients
and control participants were asked to describe what they
would see beyond the edges of a scene photograph 40. The
patients described relevant objects they would have
expected to see, demonstrating preserved associational
processing, but were specifically unable to describe how the
extended scene would fit together and reported being unable
to visualise it in their imagination. In another study57,
Box 3 patients with bilateral hippocampal damage were tested on
their ability to reflect on “what might have been”. The
Other functions of anterior hippocampus patients could deconstruct a given narrative and then add,
recombine and re-order narrative elements into a
Evidence points to the hippocampus being a site of counterfactual alternative reality. However, detailed
integration between spatial and non-spatial information. questioning showed that they had specific impairment in the
There are several key themes in the literature implicating spatial coherence of the mental representations needed to
anterior hippocampus. perform some aspects of the task. Although these patients
rarely, if ever, have lesions limited to the anterior
Anxiety and stress
hippocampus, these findings have been instrumental in
The anterior (ventral) hippocampus is associated with anxiety demonstrating a spatial scene-related role for the human
across species 112. Contextual fear conditioning, in which a hippocampus.
spatial location is associated with an aversive stimulus, is
Visual perception and posterior hippocampus
associated with activity in the anterior hippocampus in
113,114 115
humans  and rodents (reviewed in REF ). The anterior If the hippocampus is required for the internal representation
dentate gyrus (DG) is the source of adult-born stem cells in of spatially coherent scenes or events, then it may also
the hippocampus116 and stress-related decreases in support perception by representing the environment
neurogenesis in primates are accompanied by increases in currently being experienced. This contrasts with a long-
depressive symptoms117. However, neuroimaging studies in standing theory of the hippocampus58, which holds that
humans which have tested for changes in hippocampal human MTL (including the hippocampus) is involved only in
volume with depression suggest that there are more often memory and that perception involves a separate system.
changes in the posterior than the anterior volume118,119, Over the last two decades, this proposed separation has
potentially because neurogenesis effects are more difficult to been challenged by converging functional and anatomical
measure in the anterior hippocampus119. results across species (for reviews see REFs11,59,60).
Novelty and encoding Investigations into hippocampal involvement in scene
perception have focussed on visual discrimination studies, in
The anterior hippocampus responds to novelty, particularly
which subjects decide whether visually presented stimuli are
for spatial stimuli. For instance, there is a stronger response
identical or subtly different to each other or to a sample
to novel scene stimuli than to those that are more
120,121 stimulus. Patients with bilateral hippocampal lesions cannot
familiar  and the anterior hippocampus also responds to
67 match morphed pictures of scenes to a sample, but are not
novel spatial configurations . Moreover, it is engaged by
impaired when other stimuli are used, such as faces and
associative novelty, in which novel stimuli are paired with
objects61. Responding to criticisms that this effect may be
familiar stimuli to form new associations or violate
driven by learning over trials, a subsequent study62 used trial-
expectations122. It has also been suggested that the anterior
unique stimuli in an odd-one-out task. The patients had
hippocampus is involved with encoding memories123 and has
difficulty identifying non-matching scenes; however, this
been found to respond more strongly to imagined scenarios
impairment was only apparent when the scene pictures were
that are subsequently remembered than to those that are
124 taken from different viewpoints. By contrast, they were able
forgotten . However, successful encoding is not a
69 to match pictures of faces regardless of viewpoint. This is
prerequisite for anterior hippocampus engagement  or for its
important because matching scenes from different
interaction with the core network124 (Box 2) in response to
viewpoints requires an internal global scene model, which we
scene or event stimuli.
suggest is represented in the hippocampus. However, these
Decision-making results were not replicated in a separate group of subjects 63,
and another study involving four patients found only two to
Although beyond the scope of this article, the anterior have impaired perception of the topology of virtual reality
hippocampus is associated with decision-making and scenes64. Differences in the nature and extent of intra-
representation of value, functions that might relate to its hippocampal damage may help to explain these disparate
anatomical connections with prefrontal cortex. For instance, functional outcomes65 and emphasise the need to better
anterior hippocampus and amygdala were found to represent understand intra-hippocampal anatomy and its mapping to
value and task-specific goals in spatial decision-making125,126. function.
In healthy participants, the hippocampus consistently The studies we have discussed, tapping into apparently
responds to visually presented scenes. Making an distinct cognitive functions, share a common requirement to
indoor/outdoor decision about scene photographs was form internal representations of spatially coherent scenes.
sufficient to engage much of the length of the hippocampus, This, we suggest involves amHipp. The existence of amHipp
with the most consistent result across subjects found to be in as a distinct functional region was noted in a recent fMRI
amHipp66 (Figure 2g). In another study, when the spatial study47 and lately by two studies which parcellated the
configuration of items within visually presented scenes was hippocampus based on a meta-analysis of fMRI data 76 and
altered, amHipp was activated (Figure 2h)67 together with a functional connectivity77. Posterior hippocampus responds
more anterior lateral region of hippocampus. This study did particularly strongly to visual scene perception, which may
not identify posterior hippocampus activation. However, the relate to its direct anatomical connections with
posterior hippocampus was engaged during a scene parahippocampal and retrosplenial cortices78. How
discrimination task68 and subregions of the hippocampus involvement of the posterior hippocampus with scene
responded to viewing scenes, regardless of whether or not perception relates to its known role in navigation 79 is yet to
they were subsequently recalled 69 (Figure 2i), suggesting this be investigated. Interestingly, posterior hippocampal volume
does not relate to memory encoding. To probe which aspects has been found to be reduced in blind people80,81 with
of scenes the hippocampus responds to, a task was concomitant volume increases in anterior
developed70,71 in which scene discrimination based on global hippocampus81,82 (although there are various possible
layout (“strength-based perception”) was dissociated from explanations for this, such as differences in blind people’s
discrimination based on local visual features (“state-based navigation performance82).
perception”). The posterior hippocampus was involved
specifically with strength-based perception, reinforcing the In suggesting functional differences between anterior and
notion that it represents the configuration of the scene as a posterior hippocampus, it is important to note that these
whole. Beyond discrimination studies, a recent experiment functions are not completely segregated. For instance,
had participants passively view scenes and single isolated although the posterior hippocampus responds more strongly
objects as they underwent fMRI scanning47. Activation of both to visually perceiving scenes than imagining scenes 47, there
amHipp and posterior hippocampus was observed in is evidence that it is engaged by imagination and recall 48,49.
response to perceiving scenes, relative to perceiving isolated All of these functions involve extrinsic connections that link
objects. When the same participants constructed scenes in the hippocampus to parahippocampal and perirhinal
their imagination, there was significant engagement of cortices83,84. There is also variability in the locus of
amHipp without evidence for posterior hippocampus activity. hippocampal activation between studies, which may be
Thus, the posterior hippocampus appeared to be particularly explained by multiple factors including the age of recalled
responsive to visual perception, whereas amHipp was stimuli48,50, differences in task demands, differences in the
engaged by scenes regardless of whether they were definition of anterior versus posterior hippocampus and
generated internally or externally (Figure 2j). This study also varying statistical thresholds. This variability, we suggest,
showed that amHipp was engaged by novel scenes, whereas makes the striking overlap in amHipp across studies all the
posterior hippocampus was engaged regardless of novelty - more interesting.
supporting the hypothesis that amHipp is involved with
From scenes to subfields
constructing an initial representation of the scene, whereas
posterior hippocampus has a more specific visuospatial role. To investigate which specific structures underlie the
activation patterns observed in amHipp in neuroimaging
Also of relevance to scene perception is Boundary
72
studies, an experiment was recently conducted 50 in which
extension  (BE), a cognitive phenomenon whereby people
subjects constructed novel naturalistic scenes in their
remember seeing more of a scene than was actually present
imagination and recalled scene photographs from a week
in the original stimulus. This effect is attenuated in patients
earlier while undergoing high resolution structural and
with hippocampal lesions40 (see also REFs65,73). Of note,
functional MRI. The anterior presubiculum and parasubiculum
during an fMRI study in which healthy participants viewed
were both activated by tasks involving specifically the
scenes and experienced BE on half the trials, there was
construction and recall of scenes. There was also activation
greater engagement of the posterior hippocampus during the
74
of the uncus when subjects recalled scenes first viewed a
viewing of scenes that induced BE . The BE effect was not
week earlier, and of the anterior subiculum when subjects
linked with amHipp activity during neuroimaging, potentially
constructed novel scenes. Effect sizes in anterior lateral
because scene perception (which was present in all
hippocampus and posterior hippocampus were far smaller.
conditions) always engages amHipp, regardless of whether
These results suggest that engagement of amHipp in the
BE is experienced.
studies using scene or event stimuli described above is likely
The evidence that the hippocampus is engaged during scene to have been driven by activity in the presubiculum and
perception is challenged by studies of patients with parasubiculum, potentially in conjunction with anterior
hippocampal lesions who can lucidly produce narratives to subiculum and the uncus.
accompany visually presented cartoon images or
Anterior presubiculum and parasubiculum
photographs40,75. However, impairments become apparent
when patients are asked to extend the scene beyond the The presubiculum and parasubiculum were engaged by both
edges of the picture into the imagination40. This suggests that scene construction and scene recall50. These findings
a hippocampus-based model of the scene, constructed during complement those of a recent study using high resolution
visual perception, facilitates prediction beyond the edges of fMRI in humans, which identified the peak activity for
the view as well as beyond the sensory domain into perceiving novel scenes to be in the presubiculum85. In rats,
imagination/recall. We predict that asking patients with the presubiculum and parasubiculum contain grid cells and
hippocampal lesions challenging questions about visually border cells23, pointing to a spatial role for both regions. In
presented scenes that could only be answered by possessing monkeys, presubiculum at the level of the uncus projects to
a coherent internal scene model might reveal impairments medial entorhinal cortex18, which is commonly associated
that are not evident in more general narrative description with spatial function and where grid cells were first
tasks. discovered in rodents22,86. Based on these findings, we
hypothesise that the presubiculum and parasubiculum
contribute to the spatial basis of scene representations in Based on the available anatomical and functional data, we
humans. propose that the hippocampus (specifically amHipp) supports
modelling of scenes. It can be driven ‘offline’ during
Uncus imagination and recall in order to construct a spatially
coherent scene representation. In addition, it continually
As described above50, the uncus is engaged when recalling
constructs and refines a representation of the scene being
scenes from a week prior to scanning. Although subfields
experienced ‘online’, extending into the perceptual domain
within the uncus were not distinguished due to the limited
the notion of scene construction 51,52. The presubiculum and
spatial resolution of MRI, one striking property of this region
parasubiculum provide a spatial basis for the elements of the
is its inter-hemispheric connectivity. In non-human primates,
scene, complemented by the uncus and anterior subiculum,
the uncus of hippocampus sends and receives commissural
which may mediate communication with other regions that
connections18, which complement the even stronger
represent the elements of the scene and generate vivid
commissural connections arising from presubiculum and
imagery. When visual stimulation is used to update one’s
terminating in contralateral medial EC. Little is known about
internal model, or when visual representations are vividly
the precise differences in hippocampal function between
reinstated during recall, the posterior hippocampus is
hemispheres beyond suggestions of general verbal or
additionally engaged to process the scene’s visuospatial
visuospatial distinctions87 and the homology of inter-
properties.
hemispheric connections between humans and primates is
uncertain20. However, we speculate that recall of This model makes several predictions. First, there should be
consolidated representations from memory may involve the a novelty response to scenes in anterior hippocampus,
integration of information from across regions located in both reflecting the construction of the novel representation. This is
hemispheres and thus may be supported by connections supported by many findings2 (Box 3) and helps to explain
terminating in the uncus. why scene novelty particularly engages amHipp47,67. Second,
amHipp should be engaged when a spatially coherent
The uncus is also well placed to mediate communication
representation of a scene needs to be constructed or used for
between the prefrontal cortex (PFC) and the rest of the
simulating events. This is borne out by recent findings,
hippocampus. In non-human primates, subfield CA1’ projects
83,88 including studies using simple visual scene perception
directly to PFC, particularly medial areas BA25  and
88 tasks47,66 as well as scene discrimination tasks, where
BA14 , with lighter projections to orbital PFC. There is an
patients with hippocampal lesions could distinguish scenes
indirect pathway between CA1’ and PFC via the amygdala –
from the same angle but not different angles 62. Third, amHipp
both prosubiculum and CA1’ project to the basal
will be placed under greater demand for recalling
nucleus89 which in turn projects widely through medial PFC
consolidated memories than very recent memories, because
(BA 24, 25 and 32), lateral PFC (BA 12 and 45) and orbital
90 in the former case the scene must be constructed from a
PFC (BA 7a, 13b and 14) . Functional neuroimaging studies
distributed representation across cortex. A recent study
in humans have also found that the PFC, particularly
found greater amHipp response to recalling scenes encoded
ventromedial PFC, is co-activated with anterior hippocampus
a week before scanning, compared to those encoded 30
(Box 2). Of particular relevance are studies implicating
48,91,92 minutes before scanning50. Consolidation findings do,
ventromedial PFC in memory consolidation . Thus, the
however, differ across studies, which have used varying
uncus and ventromedial PFC may be jointly involved in
stimulus content, analysis techniques and durations of
retrieving the elements of memories which have been
consolidation48,50,92,96–98. We argue that understanding the role
consolidated, to be reconstructed as spatially coherent
of the uncus in recall will be key to unpacking these
scenes by amHipp.
differences. Finally, the model predicts that patients with
Anterior subiculum bilateral hippocampal lesions will have visual perceptual
deficits, but only when probed in such a way as to demand a
The subiculum is a main output structure of the hippocampus coherent internal model of their surroundings. More detailed
(although there are outgoing connections from various other neuropsychological studies should be able to evaluate the
parts of the hippocampus93). This region includes border limits of visual scene perception in the presence of
cells26, pointing to a role in modelling the environment. The dysfunctional hippocampi.
anterior subiculum is engaged when subjects imagine novel
scenes50. In human fMRI, the subiculum and prosubiculum are By reformulating the current ‘perception versus memory’
not generally distinguishable, but differ in their connectivity. debate with reference to the detailed anatomy of anterior
The subiculum projects to the mammillary bodies and hippocampus, we believe that increased explanatory power
retrosplenial cortex12, two regions closely involved with the and more fruitful lines of inquiry can be forthcoming.
representation of head direction. The prosubiculum, however, Moreover, this proposal may help to clarify the discussion
has reciprocal connections with targets including over functional differences in the long axis of the
ventromedial PFC12. hippocampus, as well as providing anatomical detail to
computational models of hippocampal function. For instance,
Methodological issues one computational model of spatial memory and
imagery99 proposed that CA3 combines spatial information
Anterior hippocampus is particularly challenging to study
from boundary vector cells with object identity information in
using fMRI. A consensus has not yet been reached on the
perirhinal cortex to facilitate pattern completion. The findings
definition of the subfields at the resolution attainable with
summarised here suggest that this point of integration
MRI although work is underway6. Weak contrast-to-noise ratio
between spatial and non-spatial information may be the
is a well-known problem in the anterior temporal lobes and
presubiculum and/or parasubiculum in humans.
may cause false negative results. Furthermore, the uncus is
94
flanked by significant vasculature . The influence of In our opinion, visual perception, imagination and episodic
vasculature on fMRI interpretation has been questioned 95, recall depend upon a common process – the creation of
however this issue is not specific to the hippocampus. These models of the world and the use of those models to plan
concerns emphasise the importance of having well-matched scenarios and re-play memories. This, we suggest, offers a
control conditions and the need for precise anatomical detail better way to understand the function of the hippocampus
as well as further research into neuro-vascular coupling. and reconcile apparently disparate findings in the literature.
Nevertheless, there are numerous unanswered questions.
A model and future directions
What form does the spatial representation in the field's perception that the complexity and diversity of cellular
presubiculum and parasubiculum take? What visually-driven elements in vivo remains unmatched by today's organoids,
operations are performed by the posterior hippocampus and current cultures are already isomorphic to sentient brain
how do these relate to navigation? Most of all, we call for structure and activity in critical domains and so may be
further studies capitalising on advances in high resolution capable of supporting sentient activity and behaviour." The
fMRI in humans to understand the specific functional Green Neuroscience Laboratory is run by Elan Ohayon and
anatomy of the anterior hippocampus, as well as its Ann Lam, two neuroscientists who have outlined a "Roadmap
functional connectivity with neighbouring structures. The to a New Neuroscience": a set of core ethical principles for
uncus is particularly poorly understood, but the results their research, designed to exclude "toxic methodologies",
reviewed here suggest it may play an important role in scene animal experimentation, and methods that otherwise infringe
recall. High resolution neuroimaging experiments and an individual's rights, privacy, and autonomy. From their
analysis of gene expression100 in humans could complement viewpoint, the state of sophistication in current mini-brain
neuropsychological studies of patients and in vivo animal research means we should be affording the same kinds of
models, to improve our knowledge of this critical brain protections to primitive organoids that might be just complex
region. enough to have thoughts and sensations. "If there's even a
possibility of the organoid being sentient, we could be
We Are 'Perilously Close' to Creating Sentient Mini- crossing that line," Ohayon told The Guardian. "We don't
Brains in a Dish, Experts Warn want people doing research where there is potential for
something to suffer." The Green team aren't the only
PETER DOCKRILL
scientists with such qualms. In a study published this month,
22 OCT 2019 neuroscientists from the University of Pennsylvania argued
why the field needs guidelines that don't currently exist
The scientific community is in danger of overstepping (or – especially in the context of experiments where lab-grown
may have already breached) its ethical responsibilities in a organoids are transplanted into animal host bodies.
rush to study and understand the mysteries of the brain via
experimentation with artificially grown substitutes, "The field is developing quickly, and as we continue down
researchers warn. Mini-brains, also known as organoids, have this path, researchers need to contribute to the creation of
in recent years become a hugely important resource in ethical guidelines grounded in scientific principles that define
neuroscience and related fields. But while these lab-grown how to approach their use before and after transplantation in
analogues grown from stem cells aren't technically animals," says neurosurgeon Isaac Chen.
considered human or animal organs, they're becoming
"While today's brain organoids and brain organoid hosts do
functionally close enough to warrant serious ethical concerns
not come close to reaching any level of self-awareness, there
– if not an outright ban on their use, according to some
is wisdom in understanding the relevant ethical
neuroscientists.
considerations in order to avoid potential pitfalls that may
In a presentation this week at the world's largest gathering of arise as this technology advances."
neuroscientists, a team led by researchers from the Green
The research was presented at Neuroscience 2019, the
Neuroscience Laboratory in San Diego made the case for why
annual meeting of the Society for Neuroscience, held in
there is an "urgent need" for scientists to develop a
Chicago this week.
framework of criteria that stipulates what 'sentience' is, so
that future research using mini-brains and stem cell cultures
can be bound by a developed set of ethical rules. "The
compositional and causal features in these cultures are – by
design – often very similar to naturally occurring neural
substrates," the team explains in their abstract. "Recent
developments in organoid research also entail that the
anatomical substrates are now approaching local network
organisation and larger structures found in sentient animals."
There's a lot of evidence to support this. In recent years,
scientists have promoted mini-brains as an economical and
practical alternative to animal testing, and advancements in
nurturing stem cells are helping scientists figure out how to
mimic the complex neural subtypes of human brain tissue.
Mini-brains grown in dishes have enabled researchers
to probe the differences between humans and chimpanzees,
and the rapid pace with which the field is evolving is almost
scary. In March, scientists grew a mini-brain – said to be
roughly analogous in complexity to a human foetal brain at
12 to 13 weeks – and, in the context of their model
experiment, it spontaneously connected itself to a nearby
spinal cord and muscle tissue. A few months later, in a
separate experiment, researchers detected electrical activity
exhibited by organoids that looked startlingly similar to
human brain waves. While the scientific teams behind these
incredible accomplishments are usually quick to observe that
the organoids we're capable of developing today are far
removed from showing the neural sophistication of human
and animal brains, Ohayon and his team's computational
models suggest we're getting awfully close to growing
sentient brains in a dish. "Current organoid research is
perilously close to crossing this ethical Rubicon and may
have already done so," the researchers explain. "Despite the

You might also like