Increasing Delay Period in Short-Term Memory Tasks Leads to More Stable Information Encoding
Amelia Simonoff 1,∗ , Matt Rosen 1 and David J. Freedman 1
1Freedman Laboratory, Department of Neurobiology, University of Chicago,
Chicago, IL, USA
Correspondence*:
Amelia Simonoff, 5812 S. Ellis Avenue, Chicago, IL., 60615, USA
asimonoff@uchicago.edu

ABSTRACT

Working memory (WM), which encompasses the ability to maintain and manipulate information in short-term memory, requires a complex interplay between several brain areas ranging from sensorimotor to executive function. The primary neural correlate of WM is persistent neural activity in the lateral interparietal cortex (LIP), prefrontal cortex (PFC), and dorsolateral PFC (dlPFC) while animals hold information in memory. The function of this persistent activity and the mechanisms through which it occurs remain key areas of inquiry. Here, as part of an investigation of the mechanism of persistent activity started by Masse et al. (2019), we conducted a theoretical investigation using recurrent neural networks (RNNs) to understand how persistent activity may occur and what it reflects. Average network task accuracies were greater than 97%, but the accuracy of decoding with support-vector machines (SVMs) varied from 12% to 99%. The networks robustly demonstrated that information encoding occurs in short-term synaptic plasticity (STSP). This was observed especially when the delay duration was significantly shorter than the time constant of STSP, as decoding accuracy decreased in those circumstances. It was also demonstrated by the stark changes in the principal component analyses (PCAs) as delay duration increased: the first two principal components changed from overlapping to spatially distinct as soon as the delay duration exceeded the neurotransmitter release duration. Finally, it was confirmed by the cross-temporal analyses, where the same behavior was observed when decoding the sample during the delay period. As such, we conclude that information must be maintained in silent processing, i.e. STSP, during those times when information cannot be decoded from the persistent activity of the networks but the networks still demonstrate high accuracy.

Keywords: short-term memory, synaptic plasticity, information encoding, network, computation

1 INTRODUCTION

Working memory (WM) is important for day-to-day functions such as reasoning, decision-making, and behavior (Miller et al., 2018). WM is a correlate of short-term memory (STM), which encompasses information maintenance as opposed to information manipulation. Previous research has demonstrated that activity occurs during information maintenance, during the instruction steps (Stokes et al., 2013), and during

Simonoff et al. Increasing Delay Period in STM

information manipulation (Constantinidis et al., 2018). However, the underlying neural mechanisms are not well understood. We trained recurrent neural networks (RNNs) on a visual STM task to investigate how these mechanisms work, which may have important implications for the treatment of diseases involving WM or STM degradation, such as vascular dementia.

STM is critical for intelligent behavior. It allows information to be held ’in mind’, where it can be quickly accessed, updated, and manipulated (Foster et al., 2019). Visual STM has been studied in non-human primates (NHPs) using delayed-response tasks, such as delayed match-to-sample (DMS) tasks (Masse et al., 2019). To understand STM at the level of cellular populations, extracellular recordings have been conducted in the middle temporal area (MT), lateral interparietal cortex (LIP), and prefrontal cortex (PFC).

Previous research from the Freedman laboratory has observed that, during WM and STM, neural activity in sensorimotor and executive regions (LIP, PFC, and dorsolateral PFC (dlPFC)) persistently encodes remembered information (Funahashi et al., 1989; Miller et al., 1996; Freedman and Assad, 2006; Constantinidis et al., 2018). In addition, neural activity during the delay period in frontal and parietal areas is critical for visual STM (Lundqvist et al., 2018). Behavioral errors during STM are linked to diminished delay activity (Funahashi et al., 1989), and task information held in PFC affects the final outcome (Stokes et al., 2013). Furthermore, the magnitude of delay activity tracks the STM load, especially in the posterior parietal cortex (PPC) (Vogel and Machizawa, 2004).

However, this delay-period neural activity is highly variable along many axes, such as across experiments (Mendoza-Halliday et al., 2014; Sarma et al., 2016), across time (Miller et al., 1996; Zaksas and Pasternak, 2006; Inagaki et al., 2019), and across features. The origins of this variability are hotly contested. Many hypotheses have been proposed to explain it, including interactions with long-term memory, Miller’s bursting model, and activity-silent WM (Masse et al., 2019; Barbosa et al., 2019).

Unfortunately, existing measurements limit our ability to elucidate the exact causes of the variability in frontal and parietal STM encoding. Historically, measurements have been taken from single brain regions, or from a handful of neurons at a time, in animals performing one task at a time. However, STM is distributed across the activity of many neural populations located in multiple interacting brain areas. As such, simultaneous recordings from multiple neural populations are necessary. STM also serves different behavioral demands, so recordings are needed from the same animals performing multiple tasks, ideally from the same neurons, to fully understand the variability due to the tasks. Furthermore, cortical circuit activity is strongly history-dependent, so recordings must be made across a range of temporal demands.

We hypothesized that the format of STM encoding in the PPC and PFC depends on memory duration and information manipulation. Our main goal was to understand how task duration influences the format of memory encoding in the PPC and PFC. We predicted that increasing the delay duration would call for more information maintenance, which demands more stable and rapid encoding. This would lead to stronger changes in the PFC and in posterior sensory areas (Adam et al., 2021; Zhao et al., 2022). As such, after training on these tasks with variable delay durations, neural activity should encode memories in a stable and easily accessible manner for easy retrieval.

2 MATERIALS AND METHODS


We set up RNNs using the Python machine learning framework TensorFlow, based on Masse et al. (2019) and Zhou et al. (2021). Each network had 24 motion-direction-tuned input neurons projecting onto a

Frontiers 2
recurrent network of 100 neurons. A network of 100 neurons can demonstrate appropriate dynamics and responses, and this size has been used for similar experiments (Masse et al., 2019). We separated the neurons into excitatory and inhibitory populations, in accordance with Dale’s law: there were 80 excitatory neurons and 20 inhibitory neurons. The connection weights between all recurrently connected neurons were dynamically modulated by STSP (Fig. 1). Connection weights from half of the neurons were depressing, such that presynaptic activity decreased synaptic efficacy, and those from the other half were facilitating, such that presynaptic activity increased synaptic efficacy. See Table 1 of Masse et al. (2019) for the exact values of the parameters used.
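
The excitatory/inhibitory split described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `build_recurrent_weights`, the gamma-distributed initial magnitudes, and the removal of self-connections are hypothetical choices made for this example and are not taken from the authors' TensorFlow code. Only the sizes (24 inputs, 100 recurrent neurons, 80 excitatory and 20 inhibitory) come from the text.

```python
import numpy as np

N_INPUT, N_REC, N_EXC = 24, 100, 80  # sizes stated in the text


def build_recurrent_weights(rng, n_rec=N_REC, n_exc=N_EXC):
    """Return (w_eff, sign); column j of w_eff holds the outgoing weights of neuron j."""
    # Non-negative magnitudes; the gamma initialisation is an assumption.
    magnitude = rng.gamma(shape=0.1, scale=1.0, size=(n_rec, n_rec))
    np.fill_diagonal(magnitude, 0.0)  # no self-connections (assumption)

    # Dale's law: each presynaptic neuron is either excitatory (+1) or inhibitory (-1).
    sign = np.ones(n_rec)
    sign[n_exc:] = -1.0

    # Rectify the magnitudes, then apply one fixed sign per presynaptic column.
    w_eff = np.maximum(magnitude, 0.0) * sign[np.newaxis, :]
    return w_eff, sign


rng = np.random.default_rng(0)
w_rec, sign = build_recurrent_weights(rng)
print(w_rec.shape, int((sign > 0).sum()), int((sign < 0).sum()))
```

Keeping the sign vector separate from the magnitudes means that a gradient step can change how strong a connection is but never whether its source neuron is excitatory or inhibitory.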
The activity of the recurrent neurons was modeled by the following dynamical equation:

$$\tau \frac{dr}{dt} = -r + f\left(W^{rec} r + W^{in} u + b^{rec} + \sqrt{2\tau}\,\sigma_{rec}\,\zeta\right) \qquad (1)$$
Here, τ is the neuron’s time constant, f is the activation function, r is the activity of the recurrent neurons, u is the activity of the input neurons, W^rec is the matrix of synaptic weights between recurrent neurons, W^in is the matrix of synaptic weights from input neurons to recurrent neurons, b^rec is a bias term, ζ is independent Gaussian white noise with zero mean and unit variance applied to all recurrent neurons, and σ_rec is the strength of the noise. The rectified linear function was used as the activation function to ensure that the neurons’ firing rates were non-negative and non-saturating: f(x) = max(0, x).
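
For simulation, Equation 1 has to be discretized in time. The sketch below uses a first-order Euler scheme with α = Δt/τ, in which the white-noise term becomes a Gaussian sample scaled by sqrt(2/α)·σ_rec. The step size, time constant, noise strength, and random weights are illustrative assumptions, and the function name `euler_step` is hypothetical; the source does not state which discretization was used.

```python
import numpy as np


def euler_step(r, u, w_rec, w_in, b_rec, rng, dt=10.0, tau=100.0, sigma_rec=0.5):
    """One Euler step of Equation 1 with a ReLU activation (dt and tau in ms)."""
    alpha = dt / tau
    # Discretized white noise: sqrt(2*tau)*sigma*zeta becomes sqrt(2/alpha)*sigma*N(0, 1).
    noise = np.sqrt(2.0 / alpha) * sigma_rec * rng.standard_normal(r.shape)
    drive = w_rec @ r + w_in @ u + b_rec + noise
    return (1.0 - alpha) * r + alpha * np.maximum(0.0, drive)


# Illustrative run with small random weights (assumed values, not trained ones).
rng = np.random.default_rng(0)
n_in, n_rec = 24, 100
w_rec = 0.01 * rng.standard_normal((n_rec, n_rec))
w_in = 0.1 * rng.random((n_rec, n_in))
b_rec = np.zeros(n_rec)
u = rng.random(n_in)

r = np.zeros(n_rec)
for _ in range(50):
    r = euler_step(r, u, w_rec, w_in, b_rec, rng)
print(r.shape, float(r.min()))
```

Because each update is a convex combination of the previous rates and a rectified drive, the firing rates stay non-negative whenever the initial rates are non-negative.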
These networks were trained to perform the DMS task using backpropagation through time (Fig. 2) (Song et al., 2016). The networks had to indicate whether sequentially presented sample and test stimuli (500 ms presentation; 1000 ms delay) were an exact match. Initial neuronal firing rates, biases, and the weights of input, output, and recurrent connections could all be optimized to minimize behavioral error. The networks also faced metabolic constraints on their connectivity and activity, inspired by the costs faced by biological systems.

They were trained using the Adam optimizer for stochastic gradient descent over 2000 iterations. The average accuracy of the networks was 97.75%.

These RNNs were also trained with short-term synaptic plasticity (STSP) processes (Zucker and Regehr, 2002; Mongillo et al., 2008). STSP is governed by two interdependent quantities: the amount of available neurotransmitter resources and the neurotransmitter release probability. A higher release probability means that more resources are used up by activity, leaving fewer resources available for future activity.

We modeled two synapse types: depression-dominated and facilitation-dominated synapses. The former have a high initial release probability, slow vesicle recovery, and fast release probability decay. The latter have a low initial release probability, fast vesicle recovery, and slow release probability decay.
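
The two synapse types can be illustrated with the standard two-variable description of STSP (available resources x and utilization u, in the style of Mongillo et al., 2008). The sketch below is a hedged example: the function name `simulate_stsp`, the parameter values, the 10 Hz presynaptic rate, and the Euler step are assumptions chosen for illustration; the values actually used are listed in Table 1 of Masse et al. (2019).

```python
import numpy as np


def simulate_stsp(rate_hz, U, tau_u, tau_x, dt=0.01):
    """Euler-integrate resources x and utilization u; return efficacy u * x per step.

    rate_hz: presynaptic firing rate at each step (Hz); time constants and dt in seconds.
    """
    x, u = 1.0, U  # fully recovered resources, baseline release probability
    efficacy = np.empty(len(rate_hz))
    for t, r in enumerate(rate_hz):
        dx = (1.0 - x) / tau_x - u * x * r        # recovery minus use by activity
        du = (U - u) / tau_u + U * (1.0 - u) * r  # decay to baseline plus facilitation
        x = min(max(x + dt * dx, 0.0), 1.0)
        u = min(max(u + dt * du, 0.0), 1.0)
        efficacy[t] = u * x
    return efficacy


# 0.5 s of presynaptic activity at 10 Hz followed by 1 s of silence (assumed protocol).
rate = np.concatenate([np.full(50, 10.0), np.zeros(100)])

# Assumed illustrative parameters: high U and slow recovery versus low U and slow decay.
depressing = simulate_stsp(rate, U=0.45, tau_u=0.2, tau_x=1.5)
facilitating = simulate_stsp(rate, U=0.15, tau_u=1.5, tau_x=0.2)

print(round(float(depressing[49]), 3), round(float(facilitating[49]), 3))
```

With these assumed values, the efficacy of the depressing synapse falls below its baseline (U × 1) by the end of the stimulation, while that of the facilitating synapse rises above its baseline. The slow variables then relax over hundreds of milliseconds, which is the window in which information could be held without spiking.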

3 RESULTS
Persistent activity is a hallmark of working memory. However, it varies across tasks, stimulus configurations, experimental conditions, etc., even when the animal is behaviorally performing STM tasks. We wanted to investigate and understand this variability. Building on previous work from the Freedman Lab, we found that other candidate mechanisms exist for maintaining information outside of spiking activity. We simulated processes with physical time constants (those of STSP) to understand how they, along with changes in the duration of STM, change the encoding of information (Fig. 1). We expected to observe different levels of memory encoding as a function of the STSP time constant, and the RNNs demonstrated this. The models showed that the networks rely on short-lived information encoding in STSP when the delay duration of the STM task is shorter than the time constant, and that this reliance decreases as the delay duration increases past the time constant.

RNNs were trained with different STSP time constants (τ_slow), delay periods, and spike costs. All networks had high task accuracies (mean = 0.9775), but the decoding accuracy (whether the identity of the sample can be determined from the activity of the network) dropped sharply when the delay duration of the STM task was shorter than the time constant of STSP (Fig. 3). We measured decoding accuracy using static support vector machines (SVMs) (Fig. 4), as per Stokes et al. (2013). We also visualized each network’s activity state space via dimensionality reduction with principal component analysis (PCA), as per Tipping and Bishop (2006) (Fig. 5), and conducted cross-temporal SVM decoding (Fig. 6). The STSP time constants ranged from 1000 ms to 2000 ms, which is physiologically motivated (Mongillo et al., 2008). The delay periods of the STM tasks ranged from 500 ms to 2500 ms. The spike costs were 0, 2 × 10^-6, and 0.02 (see Masse et al. (2019) for details). An increase in spike cost led to lower task accuracy (Fig. 3), while task accuracy increased with delay period.

We focused our analyses on the networks with the highest spike cost (cost = 0.02), as they showed the greatest drop-off in decoding at short delay lengths but were still able to complete all tasks with greater than 90% accuracy.

3.1 Static Decoding

We conducted static decoding for each network to understand whether the network encodes information in the activity of neurons (i.e., outside of STSP) (Fig. 3, which shows 4-fold cross-validated decoding accuracy). We observed that decoding accuracy decreased when the time constant of STSP was significantly larger than the delay period of the network. This signifies that information is being stored silently in the network rather than actively in spiking activity. At lower spike costs, networks demonstrated enough spiking activity that some information could be maintained in active spiking at short delay periods. This was not observed at the higher spike costs. Network accuracy was high (greater than 90%) in all cases, so the information retained by the network must be in STSP, since it is not in the active spiking.
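
The logic of the static decoding analysis can be sketched on synthetic data. To keep the example dependency-free, a nearest-centroid classifier stands in for the SVM used in the study; this swap, the function name `kfold_centroid_accuracy`, and the synthetic "active" and "silent" populations are all assumptions made for illustration. Only the 4-fold cross-validation comes from the text.

```python
import numpy as np


def kfold_centroid_accuracy(X, y, n_folds=4, seed=0):
    """k-fold cross-validated accuracy of a nearest-centroid decoder (SVM stand-in)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    classes = np.unique(y)
    correct = 0
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # One centroid per sample direction, estimated from the training trials only.
        centroids = np.stack(
            [X[train][y[train] == c].mean(axis=0) for c in classes]
        )
        dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
        correct += int(np.sum(classes[np.argmin(dists, axis=1)] == y[test]))
    return correct / len(y)


# Synthetic end-of-delay activity: 8 sample directions, 40 trials each, 100 neurons.
rng = np.random.default_rng(1)
n_dirs, n_trials, n_neurons = 8, 40, 100
y = np.repeat(np.arange(n_dirs), n_trials)
tuning = rng.standard_normal((n_dirs, n_neurons))
noise = rng.standard_normal((n_dirs * n_trials, n_neurons))

X_active = tuning[y] + noise  # sample identity present in neuronal activity
X_silent = noise              # sample identity absent from neuronal activity

acc_active = kfold_centroid_accuracy(X_active, y)
acc_silent = kfold_centroid_accuracy(X_silent, y)
print(acc_active, acc_silent)
```

In this toy setting, decoding from the "silent" population stays near the chance level of 1/8. That is the signature the analysis looks for: task accuracy remains high while the sample cannot be read out from activity.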

3.2 Principal Component Analyses (PCAs)

We hypothesized that when the delay period of the STM task is greater than the time constant of STSP, the networks need to use spiking to maintain information in STM. Upon conducting PCA, we found that encoding of information in the first and second principal components was stronger at longer delay periods (Fig. 5). As such, at long delay periods the networks were not able to rely on the silent substrate for information maintenance and manipulation.
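
A projection onto the first two principal components, as used for this analysis, can be computed directly from a singular value decomposition. The sketch below is an assumption-laden illustration: the function name `pca_project` and the synthetic ring-shaped population code are hypothetical, and the study may have used a library implementation instead.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project trials (rows of X) onto the leading principal components via SVD."""
    Xc = X - X.mean(axis=0)  # center each neuron's activity
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)  # fraction of variance per component
    return Xc @ Vt[:n_components].T, explained[:n_components]


# Synthetic delay activity: 8 directions on a ring embedded in 100 neurons.
rng = np.random.default_rng(0)
n_dirs, n_trials, n_neurons = 8, 30, 100
y = np.repeat(np.arange(n_dirs), n_trials)
angles = 2.0 * np.pi * y / n_dirs
basis = rng.standard_normal((2, n_neurons))
signal = np.cos(angles)[:, None] * basis[0] + np.sin(angles)[:, None] * basis[1]
X = 3.0 * signal + rng.standard_normal((n_dirs * n_trials, n_neurons))

scores, explained = pca_project(X)
print(scores.shape, explained)
```

Plotting `scores` colored by `y` would give the kind of state-space picture described for Fig. 5, where well-separated clusters indicate that sample identity is carried by neuronal activity.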

3.3 Cross-Temporal Decoding

Finally, we conducted cross-temporal decoding for each network to understand whether information is maintained in a stably decodable format when the delay is much longer than the time constant of STSP (Fig. 6). We observed that decoding accuracy increased when the training duration and the testing duration were similar, and decreased as those durations increased. This also implies that information is stored in a progressively more stable format when the delay is much longer than the time constant of STSP. Longer decoding times would have led to even greater decoding accuracies.

After the conclusion of our simulations, we switched to NHP training to test the results.
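
Cross-temporal decoding trains a decoder on activity from one time point and tests it on activity from another, yielding a train-time by test-time accuracy matrix. The sketch below illustrates the idea on synthetic data. As a stated assumption, a nearest-centroid classifier again stands in for the SVM, and the function names, the even/odd trial split, and the synthetic code that switches format halfway through the delay are all hypothetical.

```python
import numpy as np


def fit_centroids(X, y, classes):
    """Mean activity pattern per class."""
    return np.stack([X[y == c].mean(axis=0) for c in classes])


def predict(X, centroids, classes):
    """Assign each trial to the class with the nearest centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(d, axis=1)]


def cross_temporal_accuracy(X, y, train_idx, test_idx):
    """X has shape (trials, time, neurons); returns acc[t_train, t_test]."""
    classes = np.unique(y)
    n_time = X.shape[1]
    acc = np.zeros((n_time, n_time))
    for t_train in range(n_time):
        centroids = fit_centroids(X[train_idx, t_train], y[train_idx], classes)
        for t_test in range(n_time):
            pred = predict(X[test_idx, t_test], centroids, classes)
            acc[t_train, t_test] = np.mean(pred == y[test_idx])
    return acc


# Synthetic data: the population code changes format halfway through the delay.
rng = np.random.default_rng(0)
n_dirs, n_trials, n_time, n_neurons = 8, 20, 6, 50
y = np.repeat(np.arange(n_dirs), n_trials)
tuning_early, tuning_late = rng.standard_normal((2, n_dirs, n_neurons))
X = rng.standard_normal((len(y), n_time, n_neurons))
X[:, :3] += tuning_early[y][:, None, :]
X[:, 3:] += tuning_late[y][:, None, :]

train_idx = np.arange(0, len(y), 2)  # even trials for training
test_idx = np.arange(1, len(y), 2)   # odd trials for testing
acc = cross_temporal_accuracy(X, y, train_idx, test_idx)
print(np.round(acc, 2))
```

High accuracy away from the diagonal (for example, training at step 3 and testing at step 5) indicates a stable code, whereas a decoder trained before the format change generalizes poorly to time points after it.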

4 DISCUSSION
We wanted to know how memory encoding changes depending on how long an object needs to be attended to in STM. During the experiments, multiple RNNs were trained on a variety of tasks to get them to understand the concept of matching and to get them ready for a color DMS task. We observed varying success with the NHPs.

The networks robustly demonstrate that information encoding occurs in STSP, especially when the delay duration is significantly shorter than the time constant of STSP, as decoding accuracy decreases in those circumstances. This is also demonstrated by the stark changes in the PCAs as delay duration increases: the first two principal components change from overlapping to spatially distinct as soon as the delay duration exceeds the neurotransmitter release duration. Finally, this is confirmed by the cross-temporal analyses, where the same behavior is observed when decoding the sample during the delay period. As such, we conclude that information must be maintained in silent processing, i.e. STSP, during those times.

While STM tasks in humans often use delay periods longer than 2500 ms, often reaching double-digit seconds, the focus of this work was to elucidate what happens in monkey PFC. As such, we decided that a maximal delay period of 2500 ms was adequate. This is supported by the near-100% decoding accuracy throughout the cross-temporal decoding (Fig. 6). However, a longer delay period would most likely cause more clustering in the first two principal components of the PCAs (Fig. 5).

In these experiments, we presented the sample and test stimuli to the same receptive fields (RFs) in the NHPs, and the RNNs were trained on the same presentation patterns. We assumed that both stimuli would interact with the same synapses if they were presented in the same RFs, which could be reactivated and take part in STSP. However, it is unclear how these mechanisms interact when the stimuli are presented in different RFs, and whether the accuracy of RNNs trained in that way would remain equally high. We will investigate this paradigm in subsequent work.

We will investigate this further in vivo using semi-chronic electrode array recordings from surface cortical areas (such as the dorsolateral prefrontal cortex, dlPFC) and acute recordings from sulcal areas (such as the lateral interparietal cortex, LIP).

4.0.1 Permission to Reuse and Copyright

Figures, tables, and images will be published under a Creative Commons CC-BY licence and permission must be obtained for use of copyrighted material from other sources (including re-published/adapted/modified/partial figures and images from the internet). It is the responsibility of the authors to acquire the licenses, to follow any citation instructions requested by third-party rights holders, and cover any supplementary charges.

CONFLICT OF INTEREST STATEMENT


The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

AUTHOR CONTRIBUTIONS
AS, MR, and DF contributed to the conception and design of the study. MR created the code. AS conducted the analyses and wrote the manuscript. DF read and approved the submitted version.

FUNDING
This study was funded by the Department of Defense. Grant number: N00014-19-1-2001.

ACKNOWLEDGMENTS
The authors would like to acknowledge the contributions of Alessandra Silva for her help and patience with the crafting of this document.

DATA AVAILABILITY STATEMENT


The datasets generated and analyzed for this study, as well as the analysis code, can be found in the Increasing Delay Period in Working Memory Tasks Leads to More Stable Information Encoding GitHub repository.

REFERENCES
Adam, K. C., Rademaker, R. L., and Serences, J. T. (2021). Evidence for, and challenges to, sensory recruitment models of visual working memory. Visual Memory, 5–25
Barbosa, J., Stein, H., Martinez, R., Galan, A., Adam, K., Li, S., et al. (2019). Interplay between persistent activity and activity-silent dynamics in prefrontal cortex during working memory. bioRxiv, 763938
Constantinidis, C., Funahashi, S., Lee, D., Murray, J. D., Qi, X.-L., Wang, M., et al. (2018). Persistent spiking activity underlies working memory. Journal of Neuroscience 38, 7020–7028
Foster, J. J., Vogel, E. K., and Awh, E. (2019). Working memory as persistent neural activity. Oxford Handbook of Human Memory, in press
Freedman, D. J. and Assad, J. A. (2006). Experience-dependent representation of visual categories in parietal cortex. Nature 443, 85–88. doi:10.1038/nature05078
Funahashi, S., Bruce, C. J., and Goldman-Rakic, P. S. (1989). Mnemonic coding of visual space in the monkey’s dorsolateral prefrontal cortex. Journal of Neurophysiology 61
Inagaki, H. K., Fontolan, L., Romani, S., and Svoboda, K. (2019). Discrete attractor dynamics underlies persistent activity in the frontal cortex. Nature 566. doi:10.1038/s41586-019-0919-7
Lundqvist, M., Herman, P., and Miller, E. K. (2018). Working memory: delay activity, yes! persistent activity? maybe not. Journal of Neuroscience 38, 7013–7019
Masse, N. Y., Yang, G. R., Song, H. F., Wang, X.-J., and Freedman, D. J. (2019). Circuit mechanisms for the maintenance and manipulation of information in working memory. Nature Neuroscience 22, 1159–1167. doi:10.1038/s41593-019-0414-3
Mendoza-Halliday, D., Torres, S., and Martinez-Trujillo, J. C. (2014). Sharp emergence of feature-selective sustained activity along the dorsal visual pathway. Nature Neuroscience 17. doi:10.1038/nn.3785
Miller, E. K., Erickson, C. A., and Desimone, R. (1996). Neural mechanisms of visual working memory in prefrontal cortex of the macaque. Journal of Neuroscience 16
Miller, E. K., Lundqvist, M., and Bastos, A. M. (2018). Working memory 2.0. Neuron 100, 463–475
Mongillo, G., Barak, O., and Tsodyks, M. (2008). Synaptic theory of working memory. Science 319, 1543–1546
Sarma, A., Masse, N. Y., Wang, X.-J., and Freedman, D. J. (2016). Task-specific versus generalized mnemonic representations in parietal and prefrontal cortices. Nature Neuroscience 19. doi:10.1038/nn.4168
Song, H. F., Yang, G. R., and Wang, X.-J. (2016). Training excitatory-inhibitory recurrent neural networks for cognitive tasks: a simple and flexible framework. PLoS Computational Biology 12, e1004792
Stokes, M. G., Kusunoki, M., Sigala, N., Nili, H., Gaffan, D., and Duncan, J. (2013). Dynamic coding for cognitive control in prefrontal cortex. Neuron 78
Tipping, M. E. and Bishop, C. M. (2006). Mixtures of probabilistic principal component analysers. Neural Computation 2
Vogel, E. K. and Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature 428, 748–751
Zaksas, D. and Pasternak, T. (2006). Directional signals in the prefrontal cortex and in area MT during a working memory for visual motion task. Journal of Neuroscience 45. doi:10.1523/JNEUROSCI.3420-06.2006
Zhao, Y.-J., Kay, K. N., Tian, Y., and Ku, Y. (2022). Sensory recruitment revisited: Ipsilateral V1 involved in visual working memory. Cerebral Cortex 32, 1470–1479
Zhou, Y., Rosen, M. C., Swaminathan, S. K., Masse, N. Y., Zhu, O., and Freedman, D. J. (2021). Distributed functions of prefrontal and parietal cortices during sequential categorical decisions. eLife 10, e58782
Zucker, R. S. and Regehr, W. G. (2002). Short-term synaptic plasticity. Annual Review of Physiology 64, 355–405

FIGURES AND FIGURE CAPTIONS

Figure 1. (A) The core rate-based model consisted of 24 motion-direction-tuned neurons projecting onto 80 excitatory and 20 inhibitory recurrently connected neurons. The 80 excitatory neurons projected onto 3 decision neurons. (B) For synapses that exhibited short-term synaptic depression (left), presynaptic activity (top) weakly increases neurotransmitter utilization (red trace, middle) and strongly decreases the available neurotransmitter (blue trace), decreasing synaptic efficacy (bottom). For synapses that exhibited short-term synaptic facilitation (right), presynaptic activity strongly increases neurotransmitter utilization and weakly decreases available neurotransmitter, increasing synaptic efficacy. Adapted with permission from Masse et al. (2019).

Figure 2. (A) A 500 ms fixation period was followed by a 500 ms sample motion direction stimulus,
followed by a 1000 ms delay period and finally a 500 ms test stimulus. (B) Sample decoding accuracy,
calculated using neuronal activity (green curves) and synaptic efficacy (magenta curves) for n = 20 networks.
The dashed vertical lines, from left to right, indicate the sample onset, offset, and end of the delay period.
(C) Scatter plot showing the neuronal decoding accuracy measured at the end of the delay (x-axis) versus
the task accuracy (y-axis) for all 20 networks (blue circles), the task accuracy for the same 20 networks
after neuronal activity was shuffled right before test onset (red circles) or synaptic efficacies were shuffled
right before test onset (cyan circles). The dashed vertical line indicates chance level decoding. Adapted
with permission from Masse et al. (2019).

Figure 3. Increasing spike costs decreases decoding accuracy in RNNs. RNNs with longer slow time
constants (τslow ) also have decreased decoding accuracy. Task accuracies were not significantly affected.

Figure 4. Static decoding of networks. Increasing delay period on the x-axis, τslow on the y-axis. The
traces are average sample decoding accuracy while the dots are the actual values of each network. An
increase in static decoding accuracy means that the identity of the sample can be better identified based on
the activity of the network, meaning that the information is encoded in neuronal activity and not STSP. A
decrease in accuracy represents information being maintained in STSP.

Figure 5. The first two components of the PCA of networks with the median spike cost and τslow = 1500 ms. This time constant was chosen due to its low decoding accuracy at shorter delay periods and high decoding accuracy at longer delay periods. Encoding becomes more consistent as the delay period increases, which is observed as the grouping by sample direction becomes more localized. From left to right, top to bottom, the delay periods are 500, 722, 944, 1166, 1388, 1611, 1833, 2055, 2277, and 2500 ms. More clustered components represent sample identities being maintained separately in the first two components of the PCA.

Figure 6. A proportional increase in training and test durations leads to the most stable decoding of information. When the delay period is greater than the time constant, decoding accuracy decreases. These networks have a median spike cost and τslow = 1500 ms. From left to right, top to bottom, the delay periods are 500, 722, 944, 1166, 1388, 1611, 1833, 2055, 2277, and 2500 ms. An increase in color brightness represents an increase in sample decoding accuracy.
