

Oscillations, neural computations and learning during wake and sleep

Hector Penagos, Carmen Varela and Matthew A Wilson

Learning and memory theories consider sleep and the reactivation of waking hippocampal neural patterns to be crucial for the long-term consolidation of memories. Here we propose that precisely coordinated representations across brain regions allow the inference and evaluation of causal relationships to train an internal generative model of the world. This training starts during wakefulness and strongly benefits from sleep because its recurring nested oscillations may reflect compositional operations that facilitate a hierarchical processing of information, potentially including behavioral policy evaluations. This suggests that an important function of sleep activity is to provide conditions conducive to general inference, prediction and insight, which contribute to a more robust internal model that underlies generalization and adaptive behavior.

Address
Center for Brains, Minds and Machines, Picower Institute for Learning and Memory, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA

Corresponding author: Wilson, Matthew A (mwilson@mit.edu)

Current Opinion in Neurobiology 2017, 44:193–201
This review comes from a themed issue on Neurobiology of sleep
Edited by Yang Dan and Thomas Kilduff
For a complete overview see the Issue and the Editorial
http://dx.doi.org/10.1016/j.conb.2017.05.009
0959-4388/© 2017 Elsevier Ltd. All rights reserved.

Introduction
The challenge faced by the brain in perceiving and interpreting the external world is to map a high-dimensional input into neural representations in the form of distributed spiking activity across brain regions, and then to infer the causal relationships behind this sensory-driven code. A consequence of this inference is the generation of actions that allow an optimal interaction with the environment in different contexts. Computationally, the complexity of this challenge renders solutions that rely on discriminative methods and lookup tables rigid and inefficient. Instead, solutions based on probabilistic generative models aim to learn the underlying rules behind observations of the external world, allowing more general applicability [1,2]. For instance, in the case of spatial exploration, subjects using a lookup table constructed from past navigational experiences would rely on trial-and-error exploration to discover different courses when familiar routes are not accessible. By contrast, using a map as a model, subjects could gain in generalization and flexibility because they would be able to devise, evaluate and plan alternative routes without previously having experienced them, as predicted by the cognitive map hypothesis [3,4]. Importantly, knowledge of the spatial layout alone can have limited value when planning and executing a plan, because a map does not capture the conditions and rules that may govern navigation in a given environment. Answering questions such as what paths are available and why they are or are not accessible is necessary for an individual to decide how to execute a plan to reach a goal. Hence, incorporating information that answers these what, why and how questions can lead to a more robust model that generates appropriate actions under varying requirements and contexts. Training such a generative model relies on extracting meaningful structure from its inputs to learn statistical representations that can account for the broad set of conditions associated with them [5]. For neural systems this training could start with the encoding of external information during awake behavior and continue during periods of sleep, resulting in more robust representations that increase behavioral flexibility [6]. Indeed, several studies demonstrate that sleep, in addition to promoting memory consolidation, enables the discovery of implicit rules and insights, which are essential elements for generalization and learning [7].

What computational principles are at work during sleep to facilitate the consolidation of memories while also promoting generalization as well as the inference of causal relationships? While a dialogue between the neocortex and hippocampus has been thought to mediate the systems consolidation of memory and the slow incorporation of statistical regularities into general cortical schemas [8,9,10], it is unclear whether or how it could also contribute to the training of a generative model that infers causal relationships. In the following sections, we explore the idea that, in the case of spatial navigation, coordinated neural representations in the hippocampus, neocortex and thalamus during sleep train a generative model that infers contextual and spatial contingencies and that can be used during navigation to flexibly select actions to meet contextual conditions.

Predictive coding and neural network representations
Machine learning methods can provide helpful theoretical frameworks for the implementation and training of flexible generative models in the brain. Although deep


architectures have enjoyed a recent wave of success [11,12], they largely require explicit external feedback during training, which contrasts with how the brain learns despite the lack of such feedback. Generative networks, by contrast, while typically not structurally deep, can extract statistical patterns in an unsupervised manner [13]. These types of models cast the brain as an inference device that compares predictions generated by an internal model of the world against the ongoing spike-train code. Perception, for example, is then a constructive process by which the brain continuously tries to account for its sensations in terms of internally generated expectations or, when a mismatch occurs, updates the model that generates the predictions [14,15]. Stochastic recurrent neural networks offer a formal implementation that is consistent with this type of predictive coding because they allow the sequential estimation of probabilistic relationships between time-dependent random variables through generative models. The temporal restricted Boltzmann machine (TRBM) [16,17], a forerunner of modern recurrent neural networks, offers an intuition of how this process unfolds. The TRBM is composed of a sequence of individual RBMs [18] (Box 1), each of which contains two sets of stochastic binary units. A layer of visible units receives the input to the RBM and is linked to a set of hidden units through connection weights in such a way that the state of the units and the strength of their connections can, through training, extract and encode the statistical regularities of the inputs. Using this computational design as a conceptual framework, even without a specific correspondence with detailed brain anatomy, can provide insights into how neocortical, hippocampal and thalamic networks act together to implement and train a generative model of the world that could be used for flexible behavior. For example, during goal-directed navigation, a subject would benefit from knowing the upcoming spatial layout of the environment and the rules affecting potential choices. In rodents, the spatial component is given by the anticipatory firing of CA1 place cells within cycles of the ongoing theta rhythm (8–12 Hz), in which place cells with partially overlapping receptive fields fire in a temporal sequence that reflects their relative position on the maze [19,20]. These theta sequences represent the upcoming position of the animal and, at decision points, reveal sweeps in the direction of all available options, consistent with an active, constructive evaluation of potential choices [21]. Neocortical areas, including prefrontal and retrosplenial cortices, could represent the rules and actions that ultimately impact the subject's decision [22–24]. Consistent with this cooperative interaction, prefrontal neurons are selectively phase-locked to the hippocampal theta rhythm as subjects approach decision points [25,26]. An additional structure that could assist in coordinating consistent expectations and representations across brain areas is the thalamus, given its role in the expression of intended navigational trajectories in prefrontal cortex and hippocampus [27], its projections to multiple neocortical regions and its potential role in modulating the hippocampal theta oscillation [28,29]. How is this complex network, involving multiple brain regions, trained to support predictive coding?

Training a generative model during awake behavior
Predictive coding relies on correcting errors resulting from comparisons between internal predictions and actual observations. This error, estimated through a process

Box 1 Restricted Boltzmann Machine as a computational model for learning across brain states

The stochastic nature of RBMs allows the estimation of the statistical structure implicit in their inputs in an unsupervised way, mimicking the challenge
faced by the brain in interpreting the outside world. (a) In the RBM formulation, a set of binary visible units interacts directly with incoming stimuli
and is linked to a set of binary hidden units; the state of the units and the strength of their connections, represented in a connection matrix, can
encode statistical regularities of the input [49]. Computationally, this requires a two-step optimization heuristic known as contrastive divergence: in
the first (encoding) step, the visible units are fixed to represent a sample of the external input and the state of the hidden units is obtained and
expressed as a hidden vector; in the second (prediction) step, the hidden units are clamped to the recently obtained hidden vector to generate a
prediction quantified by casting the value of the visible units into a visible vector. The difference between the joint state of visible and hidden units
during the encoding and predictive steps provides an update to the connection matrix so that the internal model better approximates the training
sample [29]. Intuitively, the weight between two coactive units will increase in the encoding step, similar to a Hebbian rule, to encourage the hidden
units to model the incoming stimulus, whereas the weight between coactive units in the prediction step will decrease to minimize nonspecific
correlations. The temporal RBM (TRBM, (b) top) is an extension of the RBM that is useful in representing sequential data. At each time step, an
RBM corrects its prediction based on the ongoing sample of the external stimulus, and it provides the initial conditions for the hidden units in the
immediately following time step. Physiologically, this could correspond to the encoding and retrieval phases associated with the hippocampal theta
oscillation; in a given theta cycle, the state of the environment would be communicated through the feedforward connections from entorhinal cortex
(EC), allowing for a contrastive divergence-like operation by the recurrent CA3 network, while the expectation of the model, in the retrieval phase,
would be reflected by CA1 spiking activity. As an outcome, this process would set up the next expectation of the internal model as navigation
unfolds.
The recurrent TRBM (RTRBM, (b) bottom) provides a mathematical advantage over the TRBM because, at any given time step, the state of the
hidden units is obtained by a deterministic operation based on the state of an intermediary hidden layer (H’) at the previous time step along with the
current state of the visible layer. Although mapping the exact equivalence between each component of the RTRBM to physiological elements is
difficult, the temporal progression of the RTRBM suggests an analogy to the coordinated activity of the hippocampus and neocortex during sleep.
Specifically, temporally coincident reactivation events in both areas, analogous to the states of the intermediary hidden and visible layers at each time step, could be used to infer statistical regularities in their representations that are incorporated into a generative model


represented by the hidden layer of the RTRBM. As explored in the main text, the interconnected anatomical relationships between the thalamus,
neocortex and hippocampus, along with the nested temporal structure of sleep could support linking successive time steps in the training of the
RTRBM.
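The two-step contrastive-divergence procedure described above can be sketched in a few lines of NumPy. This is a minimal toy, not the published formulation: biases are omitted, a single repeated 4-bit pattern stands in for the training data, and all dimensions, learning rates and iteration counts are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, lr=0.1):
    """One contrastive-divergence (CD-1) update for a bias-free binary RBM."""
    # encoding step: clamp the visible units to the data and sample the hiddens
    ph0 = sigmoid(v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # prediction step: clamp the hiddens to that sample, reconstruct the
    # visibles, then recompute the hidden probabilities
    pv1 = sigmoid(h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W)
    # Hebbian-like increase on the data statistics, decrease on the
    # reconstruction statistics
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    return W

# train on a single repeated 4-bit pattern (4 visible x 3 hidden units)
W = 0.01 * rng.standard_normal((4, 3))
pattern = np.array([1.0, 0.0, 1.0, 0.0])
for _ in range(500):
    W = cd1_step(pattern, W)

# mean-field reconstruction; its error should shrink with training
recon = sigmoid(sigmoid(pattern @ W) @ W.T)
```

With biases and mini-batches this becomes the standard CD-1 recipe; the point here is only the two-phase structure, a positive (encoding) phase and a negative (prediction) phase whose difference drives the weight update.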

(a) General RBM training algorithm: Step 1, fix the visible units to the input data and update the hidden units; Step 2, fix the hidden units and sample the visible units to obtain a prediction. (b) RBM variants for sequential data: the TRBM chains RBMs across time steps (hidden layers H0–H5 over visible layers V0–V5); the RTRBM inserts an intermediary hidden layer (H’0–H’4) that links successive time steps (hidden layers H0–H4 over visible layers V0–V4).
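The temporal chaining in Box 1(b) can be made concrete with a short rollout sketch. The snippet below is an illustration under assumed, arbitrary dimensions, not the published RTRBM: it shows only the defining forward pass, in which the carried-over hidden state is a deterministic (mean-field) function of the previous hidden state and the current visible vector, rather than a stochastic sample as in the TRBM.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)

n_v, n_h, T = 4, 3, 6                                 # arbitrary toy sizes
W = 0.5 * rng.standard_normal((n_v, n_h))             # visible-to-hidden weights
U = 0.5 * rng.standard_normal((n_h, n_h))             # hidden-to-hidden recurrence
v = rng.integers(0, 2, size=(T, n_v)).astype(float)   # a binary input sequence

# RTRBM-style forward rollout: each step's hidden state depends
# deterministically on the previous hidden state and the current visible
# vector, which is what lets the chain carry information across steps
h = np.zeros(n_h)                                     # "init" in the diagram
hidden_states = []
for t in range(T):
    h = sigmoid(v[t] @ W + h @ U)
    hidden_states.append(h.copy())
hidden_states = np.array(hidden_states)
```

In a full implementation these mean-field states also serve as the conditioning biases of the per-step RBMs during training; the rollout above only illustrates the sequential dependency structure.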

known as contrastive divergence [30] in the RBM formulation, drives an update in the connection weights between the visible and hidden layers (Box 1). In models of the neocortex, this is postulated to occur through the interaction of lower and higher cortical layers, but training could also start at earlier processing stages if the neocortex sent its expectations to the thalamus [15]. The error signal could then be generated through the inhibitory effect of the thalamic reticular nucleus, which receives inputs from both the neocortex and the dorsal thalamus, on thalamocortical cells. In the hippocampus, the error estimate can be computed by comparing spiking activities during the encoding and retrieval phases of the theta rhythm [26,31]. The outcome of this operation can lead to model updates, as evidenced by the finding that, in response to changes in reward amounts at spatial locations, hippocampal place cell activity changes from signaling the animal's current position to depicting, in reverse order, the trajectory that led the subject to its current rewarded location [32,33]. Coupling this reverse trajectory replay with bursting dopaminergic activity [34] would produce a location-value gradient (highest at the reward location) that could serve two purposes: first, in the hippocampus, to establish location-based reward expectations and, second, in neocortical sites, to attribute value to the actions that led to reward. Combining these new associations would effectively yield a more abstract state-action representation that combines relevant


information such as space, reward and additional contingencies of a particular experience that can alter the subject's behavioral policy to maximize reward collection. Note that this reverse representation of spatial trajectories is accompanied by fast oscillations in the hippocampal local field potential (LFP), known as sharp wave ripples (100–300 Hz), which are otherwise predominantly found during non-rapid eye movement (NREM) sleep. Because neocortical activity during this state is temporally related to hippocampal ripples, sleep could be an ideal state for model training that generalizes these state-action representations to include additional scenarios and environments.

Training a generative model during sleep
If training through error correction occurs as behavior takes place, what is the role of sleep? If no additional modifications occurred during sleep, each internal activity pattern produced during wakefulness would be almost exclusively tied to the set of conditions directly experienced by an individual. Although for a few navigational experiences this might seem like a pragmatic solution, the brain would rapidly face an impossible challenge: to uniquely encode an exponentially large number of states with a finite number of internal spiking patterns. This would hinder the ability of the brain to generalize and infer causal relationships beyond those experienced by the subject. Because the external input that normally drives the update of the model is absent during sleep, we propose that sleep provides an opportunity for the model to evaluate novel representations that could not otherwise be considered, given the strong constraints imposed by external input during wakefulness.

In the following sections we consider how the temporal organization of sleep and the computations across brain regions can contribute to this extended training.

Nested oscillations during NREM sleep
The recurring organization of NREM sleep provides an appealing temporal setting for model training that could favor a hierarchical and compositional processing of information, where increasing levels of abstraction are identified at correspondingly increasing timescales and incorporated into the model. Specifically, neuronal activity during NREM sleep is characterized by alternations of suppressed (down) and elevated (up) spiking that correspond to the cortical slow oscillation (SO, 0.1–1 Hz), where a large-amplitude biphasic wave known as the K-complex marks the down state. Because the SO is present throughout NREM sleep and because it groups additional sleep rhythms [35], it could represent a functional unit that serves two functions: first, to trigger the recursive execution of elementary operations across brain areas within each up state and, second, to link the outcomes of these operations in a sequential fashion over an extended period of time, leading to a temporally nested structure that promotes the integration of information in a hierarchical manner. In its first proposed role, this functional unit mediates specific computations depending on its internal temporal microstructure, which is supported by neocortical cells preferentially spiking at different phases of the up state [36]. In addition, spindles (7–15 Hz) may be present within the SO, typically following the K-complex, and their peak amplitude can occur at different phases of the up state. A related empirical observation is that spindles peaking in the late phase of the SO are more predictive of subsequent learning [37]. In addition to spindles, delta waves (1–4 Hz) may also be present at various phases of the SO [38,39]. Non-neocortical areas such as the thalamus and hippocampus contribute to the microstructure of the SO [36,40]. Thalamic activity is involved in spindle pacemaking and could also be involved in the initiation of an up state [41,42]. Hippocampal ripples (50–100 ms) preferentially occur before and after K-complexes, delta waves and spindles [43]; when they happen in multi-ripple bursts (including up to eight ripples) [44], they can also be phase-locked to individual cycles of a spindle [45]. Overall, the nested nature of these oscillations suggests a means by which complex computations can be decomposed into elementary operations across brain areas. These operations can be used recursively, in a sequential manner, to extract fundamental associations between neural representations, potentially revealing causal relationships between them.

Artificial recurrent networks and sleep dynamics
Current learning models posit that statistical regularities of the environment are slowly consolidated over time through cortico-hippocampal interactions during sleep, resulting in more robust internal representations that can facilitate subsequent learning [8,10]. We propose that the extraction of the statistics of the world during sleep is one of a more extensive set of computations whose ultimate goal is to infer causal dependencies between the rules, actions and states that produce favorable outcomes during behavior. Therefore, as statistical regularities emerge, they are sampled and combined probabilistically with rule and action representations to reveal alternative means of solving previously encountered problems. In the case of navigation, the process of sampling and combining representations of spatial paths with rules and outcomes could result in the partition of behavioral trajectories into shorter paths that reveal intermediate positions as key nodes, or sub-goals, in reaching a desired goal location. Paths that link several of these nodes can then be recombined as modules that produce novel routes to be evaluated as potential navigational policies in subsequent behavior [46,47].

Consistent with this proposal, a recent experiment in which rats learned odor-place associations highlights the co-occurrence of these processes during sleep: rats learned the rules and spatial mapping of the task over


several days and, once performance was asymptotic, they were able to extrapolate the rules to new odor-location pairs in a single training session. Critically, the ability to perform the task successfully with the newly paired items was eliminated if the hippocampus was lesioned before rats had the opportunity to sleep following training. In other words, when the task required the extrapolation of rules beyond purely spatial changes to the layout of the environment, an intact hippocampus was required for sleep to have a positive impact on behavior [48].

Intuitively, establishing causal relationships requires that several neural representations interact over time to infer their correct temporal dependencies. Several recurrent networks can successfully learn and generate sequential data, and they suggest potential implementations of model training during sleep. What these networks have in common is that operations between visible and hidden layers are performed iteratively over time and the outcome at one time step influences the immediately following computation, consistent with the recurring structure of sleep. They differ, however, in how they incorporate longer-range influences of one computation on upcoming representations and in whether their operations can be broadly mapped onto physiological events. For example, the long short-term memory formulation adds an internal state, or memory cell, to the hidden layer at each time step [49]. The content of the memory cell remains unchanged for, and can influence, an arbitrary number of computation time steps. While effective in enabling long-range interactions across representations, it is unclear what the physiological equivalent of this memory cell would be. By contrast, the recurrent TRBM (RTRBM, Box 1) does not explicitly include long-range relationships, instead emphasizing dependencies between adjacent computations [16]. Despite this shortcoming, the RTRBM is reminiscent of the cortico-hippocampal temporal dynamics observed within and across individual SOs (Figure 1), suggesting a broad correspondence to how the neocortex and hippocampus influence each other's representations within individual periods of the SO, in line with recent electrophysiological findings [50].

An interesting consideration is that sleep representations could violate the temporal order in which they were produced during wake. This notion is captured in the bidirectional network formulation, which adds a second hidden layer so that, at each computation step, one hidden layer receives information from previous states and the other hidden layer receives information from future states [51]. This configuration emphasizes the notion that the role of one representation may change depending on its specific position within a sequence of events. In contrast to bidirectional networks, models using dilated causal convolutions, although not recurrent, can effectively link past computations over long intervals while preserving the temporal order with which the network models the data [52]. The observation that neurons in the medial entorhinal cortex can sustain elevated firing activity across several neocortical up states [53] suggests an effective means by which the brain could implement a dilated-causal-convolution-like operation to link temporal relationships across cycles of the SO.

In summary, while several artificial network architectures can discover causal relationships across representations, the RTRBM provides guiding principles that are consistent with the temporal structure of sleep oscillations.

Learning through simulations during sleep
An important difference between wake and sleep is that, during the latter, the generative model is exclusively producing and comparing internal predictions across layers. Physiologically, this means that the neocortex and hippocampus are, in principle, not constrained to encode representations that match identical navigational experiences. Instead, each area can produce variations of those representations that can be combined in a compositional manner to produce simulations that yield increasing levels of abstraction in the internal network. At the most basic level, neocortical and hippocampal patterns that are not linked to identical spatial experiences can help reveal equivalent features of an environment, leading to the reconstruction of a map that can be used for navigation. Next, including non-spatial sensory dimensions in these internal simulations can bring about the rules associated with this map. Lastly, combining these inferred rules and locations with representations of actions could assess their value in different contexts and improve the predictive power of the model. Ultimately, these hierarchical processes can lead to richer and more flexible representations because they can answer the what, why and how questions that allow for generalization and appropriate behavioral policies within and across environments.

These tiered processes can correspond to the multiple scales of temporal dynamics of the SO reflected by its nested oscillations (Figure 1A). For example, recall that multi-ripple bursts can be phase-locked to spindles, resulting in a repeating pattern of alternating cortico-hippocampal activity. Because representations of extended trajectories in a multi-ripple burst arise from the sequential expression of shorter segments within single ripples [44], individual spindle cycles provide an opportunity for cortical influence on the content of immediately upcoming ripple-encoded spatial snippets. A potential consequence is the expression of novel trajectories, like shortcuts [54], that link segments of the environment over a multi-ripple burst. Likewise, individual ripples could trigger the neocortical expression of additional dimensions that allow the inference of rules and actions relevant to the environment. Importantly, ripples can also be tied to K-complexes and delta waves;


Figure 1
(a) Schematic of three consecutive cycles of the neocortical slow oscillation, showing nested delta waves, spindles and K-complexes in neocortex and multi-ripple bursts in hippocampus. (b) RTRBM-style diagram with visible layers V0–V5, intermediary hidden layers H’0–H’5 and hidden layers H0–H5.

Nested sleep oscillations under the RTRBM framework.


Sleep is characterized by thalamocortical and hippocampal oscillations whose temporal evolution is reminiscent of the sequential processing in
the RTRBM. (a) Diagram showing three schematic consecutive cycles of neocortical slow oscillation (SO) and the nested accompanying rhythms
in neocortex and hippocampus. The neocortical slow oscillation is a recurring pattern that groups the major thalamocortical sleep rhythms
including K-complexes, spindles and delta waves. Hippocampal ripples are also grouped by the SO and tend to occur in temporal proximity to
neocortical sleep waves. Each up state in the SO may serve as a functional unit to process information and extract patterns that train different
aspects of an internal generative model. (b) RTRBM-like interactions may happen within individual up states in which different neocortical regions
assume the role of the visible layer, Vt, and the hippocampus that of the intermediate hidden layer, H’t. The network update realized within a given
SO functional unit depends on its internal temporal structure, the neocortical areas engaged dominantly with the hippocampus and the state of
the network in the previous block of computations. For example, during a delta-dominated SO, interactions (red lines) between neocortical (V) and
hippocampal (H’) representations could result in the incorporation of rules to the internal generative model (H). This, in turn, could bias (green
lines) cortical and hippocampal representations to be evaluated iteratively in individual cycles (dotted box) of an upcoming spindle-multi-ripple-
burst event. The result can provide a means to infer spatial layouts or to simulate actions (e.g., a novel trajectory) in these inferred virtual
environments. Ultimately, these sequential processing chains can lead to more abstract representations that support generalization and causal
inference.
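The caption's idea of simulating a novel trajectory by recombining previously experienced segments can be illustrated with a toy composition of path modules. Everything here is hypothetical (node names, stored segments, the `compose` helper); the sketch only shows how sub-goal nodes let separately experienced segments be stitched into a route never traversed as a whole:

```python
# path segments between sub-goal nodes, each harvested from a separately
# experienced route; all names are invented for illustration
segments = {
    ("home", "junction"): ["home", "h1", "h2", "junction"],
    ("junction", "food"): ["junction", "f1", "food"],
    ("junction", "water"): ["junction", "w1", "water"],
}

def compose(subgoals):
    """Stitch stored segments into a full route through the sub-goals."""
    route = [subgoals[0]]
    for a, b in zip(subgoals, subgoals[1:]):
        route += segments[(a, b)][1:]   # drop the shared sub-goal node
    return route

# a novel route: segments from different experiences are recombined even
# though home -> water was never traversed as a whole trajectory
novel_route = compose(["home", "junction", "water"])
```

In this framing, evaluating such composed routes against inferred rules and rewards is what the proposed sleep simulations would add on top of simple recombination.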

the finding that elevated prefrontal-hippocampal spike correlations were observed in the absence of spindles within up states [43] suggests further computations linked to these additional neocortical oscillations. In particular, spiking patterns encoding rules in prefrontal cortex are subsequently replayed toward the end of neocortical up states, coinciding with ripples in the hippocampus as well as delta oscillations in prefrontal cortex [24]. Together, these observations suggest two possibilities: the extraction of rules could take place either during the late, post-spindle, phase of an SO or within SOs dominated by delta oscillations alone. Besides inferring relations between rules and space, sleep may afford the opportunity to improve behavioral policies through the evaluation of actions in simulated contexts. The retrosplenial cortex lies at the intersection of allocentric and


egocentric representation streams and its spiking activity of sleep could enable the identification of modules that
also encodes action-related variables [55,56]. Hence, its can be flexibly combined in a compositional fashion to
interactions with the hippocampus and ventral striatum yield novel behavioral policy representations that can be
[57] could carry out this final state-action evaluation for evaluated through sleep simulations [47,60]. To this end,
network training during SOs. In this stage, the coordi- the RBM and its recurrent variants provide a general
nated activities of retrosplenial cortex and hippocampus framework of computational principles across different
could simulate multiple route options in order to assess brain states that lead to the establishment of robust
their value for efficient navigation, with multi-ripple generative models of the world. The observation that
bursts potentially mediating this simulation, given their humans are able to imagine and anticipate scenarios, even
representation of extended behavioral sequences, and in the absence of external cues, provides intuitive proof of
ventral striatum providing reward associations. the existence of this type of models in the brain, which
can identify and establish causal relationships between
Importantly, the recurrent operations of the RTRBM do contexts, actions and outcomes, and not merely recall past
not change across time steps. Instead the representations events. In addition to providing a useful framework to
of the intermediate and visible layers dictate what aspects interpret physiological observations, the RBM formula-
of the generative model can be learned (Figure 1B). tion raises important questions that require further theo-
Physiologically, this implies that multiple brain structures retical and experimental work. For instance, experiments
can differentially interact with the hippocampus to fulfill in which generalization within and across environments
different training needs. For example, sensory cortices can be assessed before and after sleep at the electrophysi-
would preferentially contribute to extraction of map ological and behavioral level, can examine the contribution
features, prefrontal cortex would enable rule identifica- of sleep to our proposed hypotheses of model updating
tion and retrosplenial cortex would be favored for policy during this state. In addition, manipulations that target the
and action evaluation. The thalamus, with its extensive temporal relationships across areas by altering phase rela-
projections and gating mechanisms, could bias cortical tionships between specific oscillatory events and/or the
activation and representation patterns that engage with representations associated with them, can shed light into
the hippocampus for different aspects of model training. the actual computations that take place in neural circuits.
For instance, it could modulate the duration, structure For example, experiments directed at breaking multi-
and content of ripple-spindle computations and it could ripple bursts into individual ripples, changing the duration
provide a link across SOs to implement an effective of spindles, or altering the phase relationships with one
strategy that prioritizes meaningful representations to another, could provide opportunities to examine neural
update the generative model while avoiding exhaustive representations in these altered states. Moreover, directing
comparisons that may not add value to it. these manipulations to retrosplenial, prefrontal or hippo-
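The time-invariance of the RTRBM's recurrent operations can be made concrete with a short sketch. The following is not the authors' code, only a minimal NumPy illustration of the weight sharing in the RTRBM formulation [16]; the layer sizes and variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

n_vis, n_hid, n_steps = 6, 4, 5  # illustrative layer sizes

# One parameter set, reused at every time step: sharing W, W_rec and the
# bias across steps is what "recurrent operations do not change" means.
W = rng.normal(scale=0.1, size=(n_vis, n_hid))      # visible-hidden weights
W_rec = rng.normal(scale=0.1, size=(n_hid, n_hid))  # carries h(t-1) into step t
b_hid = np.zeros(n_hid)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rtrbm_hidden_sequence(v_seq):
    """Mean-field hidden activations for a visible sequence (T x n_vis).

    Every step applies the same W, W_rec and b_hid; only the visible and
    hidden activations differ, so it is the representations, not the
    operations, that determine what the model can learn at each step."""
    h_prev = np.zeros(n_hid)  # initial recurrent state
    hs = []
    for v in v_seq:
        h = sigmoid(v @ W + h_prev @ W_rec + b_hid)
        hs.append(h)
        h_prev = h
    return np.array(hs)

v_seq = rng.integers(0, 2, size=(n_steps, n_vis)).astype(float)
hs = rtrbm_hidden_sequence(v_seq)
print(hs.shape)  # (5, 4): one hidden vector per step, same weights throughout
```

In the physiological analogy, swapping which structure supplies the visible-layer input changes `v_seq`, and hence what is learned, without changing the shared operations.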
campal areas could provide further assessments into their
In summary, sleep network training can be thought of as a contributions to inferring maps, rules and actions, and lead
simulation involving coordinated activity between tha- to improved theoretical architectures that better account
lamic, cortical and hippocampal areas. During this state, for the structure and function of the brain.
representations of past experiences are combined to infer
the spatial structure and valid rules in an environment. Conflict of interest statement
Together with the evaluation of simulated actions, these Nothing declared.
inferences effectively answer the what, why and how
questions that lead to a flexible model of the external Acknowledgements
world. In the end this model may enable the brain to We thank members of the Wilson lab for comments on an earlier version of
compose unexplored scenarios, something that could not this manuscript. This material is based upon work supported by the Center
for Brains, Minds and Machines (CBMM), funded by National Science
otherwise happen during awake behavior given the strong Foundation Science and Technology Center award CCF-1231216 and by a
constrains imposed by external input. NARSAD Young Investigator Award from the Brain & Behavior Research
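As a cartoon of such offline network training, a Restricted Boltzmann Machine can be updated by contrastive divergence [30] on "replayed" patterns, with the one-step reconstruction playing the role of a model-generated simulation. This is an illustrative sketch under toy assumptions (random binary stand-ins for stored experiences), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_vis, n_hid = 8, 3  # illustrative sizes

W = rng.normal(scale=0.1, size=(n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, lr=0.05):
    """One contrastive-divergence (CD-1) step on a reactivated pattern v0."""
    global W, b_vis, b_hid
    h0 = sigmoid(v0 @ W + b_hid)                     # infer hidden causes
    h_samp = (rng.random(n_hid) < h0).astype(float)  # stochastic hidden state
    v1 = sigmoid(h_samp @ W.T + b_vis)               # model's own reconstruction
    h1 = sigmoid(v1 @ W + b_hid)
    W += lr * (np.outer(v0, h0) - np.outer(v1, h1))  # data term minus model term
    b_vis += lr * (v0 - v1)
    b_hid += lr * (h0 - h1)
    return float(np.mean((v0 - v1) ** 2))            # reconstruction error

# "Replay" a small set of stored binary patterns repeatedly, offline.
patterns = rng.integers(0, 2, size=(4, n_vis)).astype(float)
errors = [np.mean([cd1_update(v) for v in patterns]) for _ in range(200)]
print(errors[0] > errors[-1])  # reconstruction error shrinks as training proceeds
```

The positive phase clamps a stored experience, while the negative phase is generated by the model itself, loosely analogous to the internally generated simulations discussed in the text.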
Foundation to C.V.
Conclusions
Previous work has emphasized the contribution of sleep References and recommended reading
Papers of particular interest, published within the period of review,
to the formation of episodic and semantic knowledge and have been highlighted as:
to the identification of similar contexts in different envir-
 of special interest
onments [8,10]. However, the discovery of insights or  of outstanding interest
implicit rules for problem solving during sleep has been
largely unaccounted for [7,58,59]. Here, we explored 1. Tenenbaum JB, Kemp C, Griffiths TL, Goodman ND: How to grow
computational principles during sleep that endow the a mind: statistics, structure, and abstraction. Science 2011,
331:1279-1285.
brain with the ability to find new solutions to familiar
2. Chater N, Tenenbaum JB, Yuille A: Probabilistic models of
problems or to extrapolate previous strategies to new cognition: conceptual foundations. Trends Cogn. Sci. 2006,
scenarios. We discussed how the temporal architecture 10:287-291.



200 Neurobiology of sleep

3. O’Keefe J: A computational theory of the hippocampal cognitive map. Prog. Brain Res. 1990, 83:301-312.

4. Pfeiffer BE, Foster DJ: Hippocampal place-cell sequences depict future paths to remembered goals. Nature 2013, 497:74-79.

5. • Hinton G: Where do features come from? Cogn. Sci. 2014, 38:1078-1101.
This paper offers a brief and informative review of the evolution of two major efforts in machine learning: deep networks and the Restricted Boltzmann Machine. Using intuitive descriptions, the author contrasts the differences between these types of networks. The author also outlines how stacking RBMs can lead to deep architectures that can learn statistical features in an unsupervised fashion and can then be fine-tuned with external feedback. This paper is relevant because the Recurrent Temporal Restricted Boltzmann Machine is composed of sequential layers of RBMs that are stacked in time.

6. Hinton GE, Dayan P, Frey BJ, Neal RM: The ‘wake-sleep’ algorithm for unsupervised neural networks. Science 1995, 268:1158-1161.

7. Wagner U, Gais S, Haider H, Verleger R, Born J: Sleep inspires insight. Nature 2004, 427:352-355.

8. • Kumaran D, Hassabis D, McClelland JL: What learning systems do intelligent agents need? Complementary learning systems theory updated. Trends Cogn. Sci. 2016, 20:512-534.
This paper reviews and updates the complementary learning systems (CLS) theory to better account for behavioral and theoretical results. In its original form, the CLS theory proposed that the hippocampus can quickly store detailed experiences, while neocortex updates synaptic weights slowly to eventually store knowledge representations that are less detailed but reflect environment statistics. Two updates to the CLS theory are discussed: first, results from machine learning in multilayer networks suggest that the neocortex can also learn fast, provided that the new information is consistent with the model already stored in neocortical networks; and second, the hippocampus can ensure that certain experiences are remembered in detail by tagging salient experiences during their hippocampal reactivation.

9. Marr D: Simple memory: a theory for archicortex. Philos. Trans. R. Soc. Lond. B Biol. Sci. 1971, 262:23-81.

10. McClelland JL, McNaughton BL, O’Reilly RC: Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 1995, 102:419-457.

11. Mnih V et al.: Human-level control through deep reinforcement learning. Nature 2015, 518:529-533.

12. Silver D et al.: Mastering the game of Go with deep neural networks and tree search. Nature 2016, 529:484-489.

13. Freund Y, Haussler D: In Advances in Neural Information Processing Systems 4. Edited by Moody JE, Hanson SJ, Lippmann RP. Morgan-Kaufmann; 1992:912-919.

14. • Clark A: Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav. Brain Sci. 2013, 36:181-204.
The author reviews evidence in support of a unified hypothesis of perception, cognition and action that presents predictive coding, implemented through probabilistic generative models, as a fundamental computational tool in neocortical networks. Under this hypothesis, sensory, cognitive and motor systems rely on estimated probability-density distributions of external world variables to generate top-down predictions that can be compared with ongoing sensory input. The resulting error signal is sent forward along the cortical hierarchy to drive perception and allow further tuning of the internal model. In humans, the top-down predictions and feedforward error signals are proposed to also incorporate the statistics of sociocultural constraints, and to account for dysfunctional states such as hallucinations and delusions.

15. • Rao RPN, Sejnowski TJ: Probabilistic Models of the Brain: Perception and Neural Function. The MIT Press; 2002:297-315.
The authors discuss and test the hypothesis that recurrent connectivity in neocortex implements temporal predictive coding through a generative model that can be trained using the rules of spike-timing dependent synaptic plasticity (STDP). They test this hypothesis using a compartmental model of interconnected neurons with synaptic weights that change according to STDP when presented with sequential stimuli, and conclude that such recurrent networks can be trained to implement generative models that can be used for temporal prediction, e.g., moving stimuli, in neocortical physiological networks.

16. • Sutskever I, Hinton GE, Taylor GW: In Advances in Neural Information Processing Systems 21. Edited by Koller D, Schuurmans D, Bengio Y, Bottou L. Curran Associates, Inc.; 2009:1601-1608.
This paper introduces the mathematical formulation for the Recurrent Temporal Restricted Boltzmann Machine (RTRBM) and it shows the mathematical equivalence between deep architectures and the RTRBM.

17. Sutskever I, Hinton GE: Learning multilevel distributed representations for high-dimensional sequences. AISTATS 2007, 2:548-555.

18. Colin McDonnell: Lecture 12.3 — Restricted Boltzmann Machines [Neural Networks for Machine Learning].

19. Skaggs WE, McNaughton BL, Wilson MA, Barnes CA: Theta phase precession in hippocampal neuronal populations and the compression of temporal sequences. Hippocampus 1996, 6:149-172.

20. Foster DJ, Wilson MA: Hippocampal theta sequences. Hippocampus 2007, 17:1093-1099.

21. Redish AD: Vicarious trial and error. Nat. Rev. Neurosci. 2016, 17:147-159.

22. Smith DM, Barredo J, Mizumori SJY: Complimentary roles of the hippocampus and retrosplenial cortex in behavioral context discrimination. Hippocampus 2012, 22:1121-1133.

23. Vedder LC, Miller AMP, Harrison MB, Smith DM: Retrosplenial cortical neurons encode navigational cues, trajectories and reward locations during goal directed navigation. Cereb. Cortex 2016 http://dx.doi.org/10.1093/cercor/bhw192. [Epub ahead of print].

24. Peyrache A, Khamassi M, Benchenane K, Wiener SI, Battaglia FP: Replay of rule-learning related neural patterns in the prefrontal cortex during sleep. Nat. Neurosci. 2009, 12:919-926.

25. Jones MW, Wilson MA: Theta rhythms coordinate hippocampal–prefrontal interactions in a spatial memory task. PLoS Biol. 2005, 3(12):2187-2199.

26. Hyman JM, Hasselmo ME, Seamans JK: What is the functional relevance of prefrontal cortex entrainment to hippocampal theta rhythms? Front. Neurosci. 2011, 5:24.

27. Ito HT, Zhang S-J, Witter MP, Moser EI, Moser M-B: A prefrontal-thalamo-hippocampal circuit for goal-directed spatial navigation. Nature 2015, 522:50-55.

28. Deschênes M, Veinante P, Zhang ZW: The organization of corticothalamic projections: reciprocity versus parity. Brain Res. Brain Res. Rev. 1998, 28:286-308.

29. Vertes RP, Hoover WB, Viana Di Prisco G: Theta rhythm of the hippocampus: subcortical control and functional significance. Behav. Cogn. Neurosci. Rev. 2004, 3:173-200.

30. Hinton GE: Training products of experts by minimizing contrastive divergence. Neural Comput. 2002, 14:1771-1800.

31. Ketz N, Morkonda SG, O’Reilly RC: Theta coordinated error-driven learning in the hippocampus. PLoS Comput. Biol. 2013, 9:e1003067.

32. Foster DJ, Wilson MA: Reverse replay of behavioural sequences in hippocampal place cells during the awake state. Nature 2006, 440:680-683.

33. Ambrose RE, Pfeiffer BE, Foster DJ: Reverse replay of hippocampal place cells is uniquely modulated by changing reward. Neuron 2016, 91:1124-1136.

34. Gomperts SN, Kloosterman F, Wilson MA: VTA neurons coordinate with the hippocampal reactivation of spatial experience. eLife 2015, 4:e05360.

35. Steriade M: Grouping of brain rhythms in corticothalamic systems. Neuroscience 2006, 137:1087-1106.




36. Neske GT: The slow oscillation in cortical and thalamic networks: mechanisms and functions. Front. Neural Circuits 2015, 9:88.

37. Demanuele C et al.: Coordination of slow waves with sleep spindles predicts sleep-dependent memory consolidation in schizophrenia. Sleep 2016.

38. Steriade M, Dossi RC, Nuñez A: Network modulation of a slow intrinsic oscillation of cat thalamocortical neurons implicated in sleep delta waves: cortically induced synchronization and brainstem cholinergic suppression. J. Neurosci. 1991, 11:3200-3217.

39. Nuñez A, Curró Dossi R, Contreras D, Steriade M: Intracellular evidence for incompatibility between spindle and delta oscillations in thalamocortical neurons of cat. Neuroscience 1992, 48:75-85.

40. Gardner RJ, Hughes SW, Jones MW: Differential spike timing and phase dynamics of reticular thalamic and prefrontal cortical neuronal populations during sleep spindles. J. Neurosci. 2013, 33:18469-18480.

41. Steriade M, Domich L, Oakson G, Deschênes M: The deafferented reticular thalamic nucleus generates spindle rhythmicity. J. Neurophysiol. 1987, 57:260-273.

42. MacLean JN, Watson BO, Aaron GB, Yuste R: Internal dynamics determine the cortical response to thalamic stimulation. Neuron 2005, 48:811-823.

43. Peyrache A, Battaglia FP, Destexhe A: Inhibition recruitment in prefrontal cortex during sleep spindles and gating of hippocampal inputs. Proc. Natl. Acad. Sci. U. S. A. 2011, 108:17207-17212.

44. Davidson TJ, Kloosterman F, Wilson MA: Hippocampal replay of extended experience. Neuron 2009, 63:497-507.

45. • Staresina BP et al.: Hierarchical nesting of slow oscillations, spindles and ripples in the human hippocampus during sleep. Nat. Neurosci. 2015, 18:1679-1686.
This study explores the relationship between the slow oscillation, spindles and ripples in the human hippocampus. The authors obtained intracranial electroencephalogram (iEEG) recordings from 12 subjects and assessed the phase-amplitude relationship between sleep oscillations. They report a nested structure in which hippocampal ripples occur at the troughs of spindles, which, in turn, take place during the negative phase of the hippocampal slow oscillation. Importantly, they find that ripples occur in bursts with a repetition rate of ∼14.5 Hz, reinforcing the observation that multi-ripple bursts can also have a tight timing relationship with spindles. These observations are significant because they provide a framework like that of the neocortical slow oscillation through which the hippocampus and neocortex can communicate with precise timing during sleep.

46. Botvinick M, Toussaint M: Planning as inference. Trends Cogn. Sci. 2012, 16:485-488.

47. Donnarumma F, Maisto D, Pezzulo G: Problem solving as probabilistic inference with subgoaling: explaining human successes and pitfalls in the Tower of Hanoi. PLoS Comput. Biol. 2016, 12:e1004864.

48. Tse D et al.: Schemas and memory consolidation. Science 2007, 316:76-82.

49. Graves A: Generating Sequences With Recurrent Neural Networks. ArXiv13080850 Cs; 2013.

50. Rothschild G, Eban E, Frank LM: A cortical-hippocampal-cortical loop of information processing during memory consolidation. Nat. Neurosci. 2017, 20:251-259.

51. Schuster M, Paliwal KK: Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45:2673-2681.

52. van den Oord A et al.: WaveNet: A Generative Model for Raw Audio. ArXiv160903499 Cs; 2016.

53. Hahn TTG, McFarland JM, Berberich S, Sakmann B, Mehta MR: Spontaneous persistent activity in entorhinal cortex modulates cortico-hippocampal interaction in vivo. Nat. Neurosci. 2012, 15:1531-1538.

54. Gupta AS, van der Meer MAA, Touretzky DS, Redish AD: Hippocampal replay is not a simple function of experience. Neuron 2010, 65:695-705.

55. Alexander AS, Nitz DA: Retrosplenial cortex maps the conjunction of internal and external spaces. Nat. Neurosci. 2015, 18:1143-1151.

56. Jacob P-Y et al.: An independent, landmark-dominated head-direction signal in dysgranular retrosplenial cortex. Nat. Neurosci. 2017, 20:173-175 http://dx.doi.org/10.1038/nn.4465.

57. Lansink CS, Goltstein PM, Lankelma JV, McNaughton BL, Pennartz CMA: Hippocampus leads ventral striatum in replay of place-reward information. PLoS Biol. 2009, 7:e1000173.

58. Mirković J, Gaskell MG: Does sleep improve your grammar? Preferential consolidation of arbitrary components of new linguistic knowledge. PLoS One 2016, 11.

59. Monaghan P et al.: Sleep promotes analogical transfer in problem solving. Cognition 2015, 143:25-30.

60. Arnold AEGF, Iaria G, Ekstrom AD: Mental simulation of routes during navigation involves adaptive temporal compression. Cognition 2016, 157:14-23.

